Keeping Secrets Secret: How to Industrialize Information Privacy

The Customer

A leading European bank and one of the world’s top 20 banks based on assets, working across retail banking, corporate and investment banking, wealth and asset management. This financial institution operates in 67 countries.

The Challenge

The bank wanted to identify confidential data so that it could protect client privacy more consistently and efficiently. This data is difficult to find because most of it is unstructured text from which the sensitive information must be mined. The content is spread across the bank's systems, held in multiple formats, and constantly changing. In addition, much of it is subject to regulation, especially Non-Public Information (NPI) and Personally Identifiable Information (PII).

In order to mitigate the risks, the bank’s compliance team defined multiple confidentiality categories to apply to all content across the bank. Each employee creating or modifying a document was responsible for evaluating its confidentiality category and tagging it correctly. Only a small percentage of the staff knew and fully understood these classification guidelines.

The resulting cognitive burden on employees came at a high price in lost time, and it distracted private banking staff from building customer relationships. That price might have been worthwhile had the process worked, but compliance with it was unsystematic and inconsistent.

Critically, the bank also wanted to comply with regulatory mandates proactively while increasing security and reducing end-user friction. This meant protecting both NPI (such as investment holdings) and PII (such as customer account numbers and Social Security numbers). The bank was keenly aware of the cost of breaching regulations such as the General Data Protection Regulation (GDPR), which exposes companies to fines of up to 4 percent of global annual revenue. A recent infamous example was Equifax’s data breach, which cost the company $439 million by the end of 2018, with total expected costs of over $600 million.

The Solution

The initial project was in the bank’s Private Banking business.

As a first step, using Sinequa’s AI-powered Search platform, the bank was able to analyze the full text of documents, regardless of data source, format, or language. Content was ingested from both structured sources (such as CRM, portfolio management, and performance measurement) and unstructured sources (such as email, Microsoft Office, scanned documents, and PDFs).

To understand the content of a document in multiple languages (English, French, German…), Sinequa applied its Natural Language Processing (NLP) capabilities, including:

  • Part-of-speech tagging and lemmatization analysis to represent written language as a set of linguistic tokens for machine processing.
  • Extraction of each document’s main concepts, such as asset classes (equities and bonds) and client needs (such as return goals and risk tolerance), and review of the documents for potential NPI and PII.
  • Text mining agents to analyze the documents and apply rules that identify complex patterns in the text, such as co-occurrences and word sequences.

Using NLP enabled the bank to tag documents and identify confidential information. However, additional context is required to understand the full meaning of a document. For example, an investment in a publicly listed equity would be tagged by NLP irrespective of the client’s objective or identity, yet the document’s confidentiality category will vary depending on whether the investment is held in a managed account or in the portfolio of a private wealth client. This is where AI and machine learning come into play: Sinequa’s machine learning capabilities can learn to tell the difference.

To predict document confidentiality more precisely, Sinequa worked with the Risk & Compliance experts at the bank to build training sets and train machine learning models. These machine learning models, which continually learn and improve over time, were applied to millions of documents across multiple, global business units to predict confidentiality categories automatically and more accurately.
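The source does not say which learning algorithm was used. As a stand-in, the sketch below trains a small multinomial naive Bayes text classifier in pure Python on a labeled training set and then predicts a category for a new document. The category names, the four training documents, and the `train` and `predict` functions are hypothetical; a production system would use far larger expert-labeled sets and richer features.

```python
import math
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase whitespace tokenizer (a simplification of real NLP preprocessing)."""
    return text.lower().split()


def train(labeled_docs):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of documents
    vocab = set()
    for text, category in labeled_docs:
        doc_counts[category] += 1
        for tok in tokens(text):
            word_counts[category][tok] += 1
            vocab.add(tok)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the category with the highest log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        score = math.log(doc_counts[category] / total_docs)  # class prior
        total_words = sum(word_counts[category].values())
        for tok in tokens(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[category][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical training set, standing in for one built with compliance experts.
TRAINING_SET = [
    ("client portfolio holdings statement", "confidential"),
    ("private client account transfer instructions", "confidential"),
    ("public market commentary newsletter", "public"),
    ("press release on annual results", "public"),
]

model = train(TRAINING_SET)
```

With this toy model, `predict(model, "client account statement")` returns `"confidential"` and `predict(model, "market newsletter")` returns `"public"`. Retraining on an enlarged training set is how such a model would improve over time.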

The Outcomes

Investing in the Sinequa solution helped our banking customer in three ways.

  • Improved confidentiality protection and client privacy in the private bank. This reduces the risk of data leaks and regulatory breaches around client privacy. Having an intelligent, automated process preventing data leakage reassures ultra-high-net-worth clients. It also deepens competitive differentiation in a growing, high-margin business.
  • Increased productivity. The pre-solution process placed a heavy cognitive burden on employees and distracted them from building customer relationships. To illustrate the value of the initial deployment in private banking, the bank estimated the cost of the manual process at approximately USD 44 million a year in lost employee time. This was based on the 2,800 private banking staff worldwide each spending 45 minutes a day on the manual process.
  • Platform for the whole bank. The initial project in Private Banking, which focused on control of sensitive data, lays a foundation for related projects across the entire bank because the Sinequa platform supports a wide range of use cases. This creates the opportunity for much larger financial benefits across the other lines of business, including retail banking, wealth management, and investment banking.
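The productivity estimate above can be checked with a short calculation. The staff count, minutes per day, and annual cost come from the case study; the number of working days per year is not stated, so the 230 used here is an assumption, and the hourly cost is simply what the other figures imply under that assumption.

```python
# Figures stated in the case study.
STAFF = 2_800
MINUTES_PER_DAY = 45
ANNUAL_COST_USD = 44_000_000

# Assumption (not from the source): working days per employee per year.
WORKING_DAYS_PER_YEAR = 230


def annual_hours(staff, minutes_per_day, working_days):
    """Total hours per year spent on manual classification across all staff."""
    return staff * minutes_per_day / 60 * working_days


def implied_hourly_cost(annual_cost, hours):
    """Cost per employee hour implied by the annual estimate."""
    return annual_cost / hours


hours = annual_hours(STAFF, MINUTES_PER_DAY, WORKING_DAYS_PER_YEAR)
hourly = implied_hourly_cost(ANNUAL_COST_USD, hours)
print(f"{hours:,.0f} hours per year, about USD {hourly:.0f} per hour")
```

Under the 230-day assumption, the manual process consumes 483,000 staff hours a year, which implies a loaded cost of roughly USD 91 per hour. A different working-day assumption shifts the implied hourly figure proportionally.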