Natural Language Processing (NLP) and Machine Learning power the advanced features of Sinequa's Intelligent search platform. After more than 25 years of NLP research, we are experts at making sense of each piece of text, whatever the native language. In addition, the platform embeds state-of-the-art Deep Learning frameworks to close the gap between the experience of classical enterprise search and today's web search engines. The resulting proprietary index is optimized to cope with huge volumes and intensive usage.
Natural language processing is the science behind machine comprehension. If you’re new to the concept or looking for an overview of what it is and how it’s used, then this guide is for you.
Sinequa's NLP semantically enriches content in any language and powers an intelligent employee experience for search and analytics. At indexing time, NLP applies to:
Sinequa uses advanced information retrieval techniques to provide relevant and contextualized results. Sinequa embeds a sophisticated variation of the well-known TF-IDF and BM25 algorithms, enhanced by multiple factors including:
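As background on the baseline named above, the following is a minimal sketch of textbook Okapi BM25 scoring. It is not Sinequa's proprietary variation and includes none of its enhancement factors. The function name `bm25_scores`, the sample documents, and the parameter defaults `k1=1.5` and `b=0.75` are illustrative assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with textbook Okapi BM25.

    Hypothetical helper for illustration only; not a Sinequa API.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1 and is normalized by length via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Made-up sample corpus, tokenized by whitespace.
docs = [
    "enterprise search platform".split(),
    "search engine for web search".split(),
    "machine learning models".split(),
]
scores = bm25_scores(["search"], docs)
```

In this toy corpus the second document mentions the query term twice and scores highest, while the third never mentions it and scores zero.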
Intelligently identifies and extracts entities for document classification and tagging.
Sinequa provides advanced capabilities to detect patterns in text, specifically for entity extraction, including:
Advanced capabilities are included with the Sinequa platform to significantly simplify creating complex extraction rules, enhancing native capabilities with help from a dedicated descriptive language.
Easily classify documents using our embedded machine learning models and semantic techniques, without being a data scientist. When content cannot be easily organized based on its location, existing metadata or associated properties, dynamic classification can help surface structure out of the apparent chaos. Two technologies are combined to make this happen:
Sinequa relies on its comprehensive and efficient index structure to deliver superior relevance from even the most extensive content and datasets without compromising performance.
If enterprise search were all about matching a keyword, a single index would suffice. While this may be sufficient for narrow applications of highly classified and structured data, it would fail if applied to unstructured data. Since the vast majority of information is captured in everyday language, no single index can serve as an optimal measure of the information contained in a corpus. Therefore, there is no one “ideal” index for every potential information query. The best results are achieved when multiple indexes are combined, each providing a different perspective or emphasis and a comprehensive view of the information available – thus deriving the best possible understanding of the meaning it carries.
When indexing unstructured data, Sinequa automatically generates a variety of indexes to provide the most comprehensive assessment of the text content. Sinequa also provides the ability to tailor how the different indexes are used in a search (by changing their weightings), allowing search results to be fine-tuned for the best results in highly specialized contexts. Therefore, the Sinequa index is a dynamic combination of the indexes: full text, structured, semantic.
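To illustrate the idea of tuning results by changing index weightings, here is a minimal sketch of a weighted score combination. It assumes each index already returns a relevance score per document. The function `combine_scores`, the index names, and all scores and weights are hypothetical and do not reflect how Sinequa combines its indexes internally.

```python
def combine_scores(per_index_scores, weights):
    """Blend per-index relevance scores into one ranked list of (doc_id, score).

    Hypothetical helper for illustration only; not a Sinequa API.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = {}
    for index_name, doc_scores in per_index_scores.items():
        # Normalize so the weights act as relative emphasis between indexes.
        w = weights.get(index_name, 0.0) / total_weight
        for doc_id, score in doc_scores.items():
            combined[doc_id] = combined.get(doc_id, 0.0) + w * score
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores from three kinds of index for two documents.
per_index = {
    "full_text": {"doc_a": 0.9, "doc_b": 0.4},
    "structured": {"doc_b": 1.0},
    "semantic": {"doc_a": 0.2, "doc_b": 0.7},
}

balanced = combine_scores(
    per_index, {"full_text": 0.5, "structured": 0.2, "semantic": 0.3}
)
keyword_heavy = combine_scores(
    per_index, {"full_text": 0.8, "structured": 0.1, "semantic": 0.1}
)
```

With the balanced weights `doc_b` ranks first. Shifting the emphasis toward the full-text index moves `doc_a` to the top, which is the kind of fine-tuning the paragraph above describes.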
Sinequa can query any combinations of indexes with different schemas, searching through all structured and unstructured data at once to offer the best data discovery modes.
Sinequa does not rely on any data structure derived from Apache Lucene. Multiple NLP layers process the raw text and optimally enrich it at the lowest possible level, ensuring rich functionality does not impact search performance because of external and supplemental packages or libraries.
Sinequa indexes comprise full-text parts, typed columns to store and retrieve associated metadata or entities extracted at indexing time, columns dedicated to security aspects, etc.
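As a rough mental model of such an entry, the sketch below places a full-text part, typed metadata, extracted entities, and a security column side by side in one record. The class `IndexedDocument`, its field names, and the `search` helper are assumptions made for illustration; the actual Sinequa index layout is proprietary and not described here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class IndexedDocument:
    """Hypothetical index entry; field names are illustrative assumptions."""
    doc_id: str
    text: str                      # full-text part
    modified: date                 # typed metadata column
    entities: dict[str, list[str]] = field(default_factory=dict)  # extracted at indexing time
    allowed_principals: set[str] = field(default_factory=set)     # security column


def search(docs, keyword, user_groups):
    """Return ids of documents that contain the keyword and that the user may see."""
    keyword = keyword.lower()
    return [
        d.doc_id
        for d in docs
        if keyword in d.text.lower() and d.allowed_principals & user_groups
    ]


# Made-up sample entries.
docs = [
    IndexedDocument(
        doc_id="report-1",
        text="Clinical trial results for the new compound",
        modified=date(2023, 5, 4),
        entities={"topic": ["clinical trial"]},
        allowed_principals={"research"},
    ),
    IndexedDocument(
        doc_id="memo-7",
        text="Budget memo mentioning the clinical program",
        modified=date(2023, 6, 1),
        allowed_principals={"finance"},
    ),
]

visible = search(docs, "clinical", {"research"})
```

Both sample documents match the keyword, but only the one whose security column intersects the user's groups is returned.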
The index data structure is strongly optimized to ensure elasticity. It mitigates competition between simultaneous updates and searches. It is also secured with safe transactions, redundancy, and internal reorganization capabilities.
"Sinequa is simply great technology. We immediately saw its benefit watching it perform something we didn’t know was possible. It makes an exponential difference for our organization. We were also impressed with the number of smart connectors available out-of-the-box and Sinequa’s unique ability to develop new ones."
Oliver Thoennessen, Senior Manager Global IT Drug Development
The Documentation Portal is the reference guide to all features and components of the Sinequa platform. In addition, the portal contains a number of tutorials and how-tos, as well as extensive reference manuals for our development frameworks, toolkits and APIs.
The Download Center is where you can download the Sinequa platform, from the latest stable version to the most experimental one. Browse the release notes, search for older versions your environment might require, and securely upload files when you need to exchange documents with the customer support team.
The Sinequa support portal is accessible to customers and partners who have been trained and certified by Sinequa. It lets you submit cases and track their progress with our Customer Engineering team. The support portal is also a good entry point for submitting feature requests or questions that require quick answers.