Background
Organizations of all sizes and in all industries typically spread their corporate knowledge across a variety of heterogeneous database applications, from customer relationship systems, staff directories, content management systems (CMS) and electronic document and records management systems (EDRMS) to library catalogues.
Mainstream search engines are about finding any information: "a list of all documents containing a specific word or phrase". Because of this, search engines paradoxically return both too much information (i.e., long lists of links) and too little information (i.e., links to content, not the content itself). IB, by contrast, is about exploiting document structure, both explicit (XML and other markup) and implicit (visual groupings such as paragraphs), to zero in on relevant sections of documents, not just links to documents.
IB provides search and retrieval services for text, typed data (a large number of standard types such as numbers, ranges and dates), images, video and audio, geographic information, network objects and databases. In each case it uses the structure of the content to zero in on and retrieve the relevant information itself.
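To make the idea of searching typed data alongside text concrete, here is a minimal Python sketch. It is not IB's API: the record layout, the field names (`title`, `published`, `price`) and the `search` helper are all assumptions invented for this illustration. The point is only that dates and numbers are compared as dates and numbers, not as strings.

```python
from datetime import date

# Hypothetical records; the field names are illustrative only.
RECORDS = [
    {"title": "Annual report", "published": date(2019, 3, 1), "price": 0.0},
    {"title": "Library catalogue guide", "published": date(2021, 7, 15), "price": 12.5},
    {"title": "Records management report", "published": date(2022, 1, 10), "price": 30.0},
]

def search(records, word=None, published_between=None, max_price=None):
    """Return the titles of records that satisfy every condition supplied."""
    hits = []
    for rec in records:
        # Textual condition: whole-word, case-insensitive match on the title.
        if word is not None and word.lower() not in rec["title"].lower().split():
            continue
        # Date condition: an inclusive range compared on real date values.
        if published_between is not None:
            start, end = published_between
            if not (start <= rec["published"] <= end):
                continue
        # Numeric condition: an upper bound compared on real numbers.
        if max_price is not None and rec["price"] > max_price:
            continue
        hits.append(rec["title"])
    return hits

result = search(
    RECORDS,
    word="report",
    published_between=(date(2020, 1, 1), date(2022, 12, 31)),
)
print(result)
```

A real engine would answer such queries from type-specific indexes instead of scanning every record; the linear scan here only keeps the example short.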
IB Key Features and Benefits:
- Cost-effective access to a heterogeneous mix of XML and other data of any shape and size
- Sophisticated, extensible type system allowing for numerical, date, geospatial and other search strategies alongside textual methods: "Universal Indexing"
- Distributed and highly scalable information retrieval and information discovery solution
- No advance setup or preprocessing
- Easy maintenance
- Rapid creation of a scalable (XML) warehouse
- APIs for Java, Python and C, among others
- Full ability to search specific structure/context in information without knowing its details (such as tag or field names)
- IB lets you simply request the structural/contextual elements you need, and they are returned directly
- User defined "search time" unit of retrieval: the structure of documents is exploited to identify which document elements (such as the appropriate chapter or page) to retrieve.
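The "search time" unit of retrieval in the last item can be sketched in a few lines of Python. This is an illustration under stated assumptions, not IB's implementation or API: the sample document, its tag names and the `retrieve` helper are hypothetical. The same text hit is returned either as a paragraph or as its enclosing chapter, depending on what the caller asks for at query time.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative only.
DOC = """
<book>
  <chapter id="c1">
    <title>Indexing</title>
    <para>Words and structure are indexed together.</para>
  </chapter>
  <chapter id="c2">
    <title>Retrieval</title>
    <para>The unit of retrieval is chosen at search time.</para>
    <para>Nothing is fixed when the index is built.</para>
  </chapter>
</book>
"""

def retrieve(root, word, unit):
    """Return the enclosing `unit` element for every text hit on `word`."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    results = []
    for elem in root.iter():
        words = (elem.text or "").lower().split()
        if word.lower() not in words:
            continue
        # Climb from the hit until an element with the requested tag is found.
        node = elem
        while node is not None and node.tag != unit:
            node = parent.get(node)
        if node is not None and node not in results:
            results.append(node)
    return results

root = ET.fromstring(DOC)
paras = retrieve(root, "retrieval", "para")
chapters = retrieve(root, "retrieval", "chapter")
print([p.text for p in paras])
print([c.get("id") for c in chapters])
```

Nothing about the stored document changes between the two calls; only the requested unit differs. A hit inside a `title` has no enclosing `para`, so it contributes to the chapter result only.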
The default mode is to index all the words and all the structure of documents. This provides powerful and fast search without prior knowledge of the content, yet enables arbitrarily complex questions across all the content and from different perspectives. Because IB is not bound by the constraints of "records" as the unit of information, one can immediately derive value from content, with the flexibility to enhance the content and the application incrementally over time without "breaking anything".
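As a rough illustration of indexing every word together with its structural context, the following Python sketch builds a small inverted index over two XML fragments. It is an assumption-laden toy, not IB's index format: the file names, tag names and `build_index` helper are hypothetical. It shows how a query can discover which elements contain a word without the caller knowing any tag or field names in advance.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical documents from two different systems; names are illustrative only.
DOCS = {
    "staff.xml": "<staff><person><name>Ada Smith</name><dept>Library</dept></person></staff>",
    "cms.xml": "<page><heading>Library opening hours</heading><body>Open daily.</body></page>",
}

def build_index(docs):
    """Map each lowercased word to the set of (document, element path) pairs containing it."""
    index = defaultdict(set)

    def walk(elem, path, name):
        here = f"{path}/{elem.tag}"
        # Only the leading text of each element is indexed; tail text is
        # ignored to keep the sketch short.
        for word in (elem.text or "").lower().split():
            index[word.strip(".,;:")].add((name, here))
        for child in elem:
            walk(child, here, name)

    for name, xml_text in docs.items():
        walk(ET.fromstring(xml_text), "", name)
    return index

index = build_index(DOCS)
contexts = sorted(index["library"])
print(contexts)
```

The query for "library" returns a department field in one source and a heading in the other, which is the kind of cross-content, structure-aware answer described above.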
IB was designed from the ground up to address three key goals: universal hierarchical/context search over SGML/XML and other document formats; distributed objects (transparent, integrated views onto other sources of information such as relational databases, search services and object brokers); and optimal support for current and future features of the ISO 23950 (ANSI/NISO Z39.50) Information Retrieval Protocol standard, to allow for standard, interoperable interfaces.