IDC Case Study: How AstraZeneca Built an AI-Powered R&D App Store with Sinequa

The Challenge AstraZeneca Faced
AstraZeneca is a global biopharmaceutical company investing over $4 billion annually in R&D, with more than 15,000 research professionals across 8 countries on 3 continents working daily to bring new medicines to patients across cardiovascular, gastrointestinal, neuroscience, respiratory and inflammation, infection, and oncology. The quality and speed of its science depend directly on its researchers’ ability to find, connect, and act on the right information at the right time.
The reality was that this capability did not exist. Researchers had to navigate multiple disconnected systems — SharePoint, Office 365, EMC Documentum, eRoom, AstraZeneca’s own R&D wiki, cloud repositories including Box and Veeva, and numerous external scientific data sources — with no unified way to search across all of them simultaneously. According to IDC’s own Knowledge Worker Survey cited in the report, 57% of all researchers stated they needed access to four or more systems just to perform their daily work, and only 45% rated their current process of finding information as being even somewhat beneficial.
Rob Hernandez, AstraZeneca’s Data Analytics Lead, described the ambition behind the transformation: “From the beginning, we wanted to be more than just traditional IT staff and not just deliver traditional search but actually be able to build applications that were making use of the underlying information. We wanted to bring under the same kind of roof every single document, and every single piece of information, from structured data to unstructured text that a scientist needs.”
The goal was not to build a better search box. It was to build an entirely new information discovery infrastructure — one that would fundamentally change how AstraZeneca’s scientists and business teams worked.
The Solution: Sinequa’s Cognitive Search Platform — Deployed as an App Store
AstraZeneca selected Sinequa’s cognitive search and analytics platform as the foundation for its next-generation information discovery architecture. Sinequa’s platform extracts content from both structured and unstructured data using Natural Language Processing, statistical and semantic analysis, and a set of scalable machine learning libraries — continuously improving the relevance of information delivered to users based on real search behaviour and context.
The most distinctive element of AstraZeneca’s implementation was its delivery model: rather than deploying a single monolithic search interface, AstraZeneca built an internal app store — a business-centric application marketplace where role-specific, task-specific search applications could be rapidly developed, published, and adopted by employees across the organization. The Sinequa platform provided the underlying cognitive search engine; the app store gave each department a targeted application built on top of it, tuned to its specific workflow and information needs.
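The pattern described here, many narrow applications sharing one search engine, can be sketched in a few lines. The following is only an illustrative sketch and not Sinequa's API or AstraZeneca's implementation: the class names `SharedSearchEngine` and `SearchApp`, the app names, the sources, and the sample documents are all hypothetical, and a simple substring match stands in for cognitive search.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str      # hypothetical repository name
    doc_type: str    # hypothetical content type
    text: str


class SharedSearchEngine:
    """One index shared by every app (substring match stands in for real search)."""

    def __init__(self, documents):
        self._documents = list(documents)

    def search(self, query, sources=None, doc_types=None):
        needle = query.lower()
        hits = []
        for doc in self._documents:
            if sources is not None and doc.source not in sources:
                continue
            if doc_types is not None and doc.doc_type not in doc_types:
                continue
            if needle in doc.text.lower():
                hits.append(doc.doc_id)
        return hits


@dataclass
class SearchApp:
    """A role-specific app: a name plus filters applied on top of the shared engine."""

    name: str
    engine: SharedSearchEngine
    sources: Optional[frozenset] = None
    doc_types: Optional[frozenset] = None

    def search(self, query):
        return self.engine.search(query, self.sources, self.doc_types)


# Hypothetical sample content, invented purely for illustration.
DOCS = [
    Document("c1", "chem_registry", "compound", "Kinase inhibitor compound series A"),
    Document("p1", "hr_directory", "profile", "Jane Doe, kinase biology lead"),
    Document("r1", "sharepoint", "report", "Quarterly kinase programme report"),
]

engine = SharedSearchEngine(DOCS)
chem_app = SearchApp("chem_search", engine, doc_types=frozenset({"compound"}))
people_app = SearchApp("people_finder", engine, sources=frozenset({"hr_directory"}))

print(chem_app.search("kinase"))    # compound records only
print(people_app.search("kinase"))  # people profiles only
```

In this sketch, adding a new application means declaring a new set of filters rather than building a new index, which is the property the app store model relies on.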
Results: What AstraZeneca Achieved
The IDC case study documents a series of concrete outcomes:
Implemented in under 12 months. AstraZeneca deployed Sinequa’s real-time search engine for its global R&D infrastructure — covering all scientific information and core internal repositories — within 12 months of project initiation.
10+ applications built in the first 3 months. Once the core platform was in place, AstraZeneca rapidly built out its app store. In just the first three months, more than 10 role-based search applications were deployed across R&D and business functions. Named applications from the IDC report include:
- R&D ChemSearch — compound and chemical structure search across R&D repositories
- R&D Intelligence — aggregated research intelligence for scientific decision-making
- Find Scientific Partners — expert identification across the global research organization
- Find People at AstraZeneca — people-finding across the full 60,000-person organization
- R&D News Alerts — proactive, role-relevant alerts for new developments in target research areas
- In-Video Search — search within scientific video content, not just document metadata
- Mobile productivity apps — including approvals workflows, available on mobile
180 million documents searchable in real time. Today, AstraZeneca has over 180 million documents fully indexed and searchable in real time across all repositories — with key scientific vocabularies automatically tagged using SciBite’s biomedical ontology layer integrated directly into the Sinequa platform.
60,000 employees with access. The platform expanded from its R&D origins to become AstraZeneca’s global enterprise search infrastructure, accessible to more than 60,000 employees across the full organization.
All cloud repositories indexed. Coverage extends to AstraZeneca’s cloud infrastructure, including Box, SharePoint, and Veeva, so that no relevant research data remains outside the searchable environment.
What the IDC Report Concludes
The case study’s essential guidance for other pharmaceutical and enterprise organizations includes three key lessons drawn from AstraZeneca’s experience. First, role-based and task-based information discovery solutions — designed to address 80% of employee needs rather than attempting to solve every use case at once — are significantly more successful than monolithic enterprise search deployments. Second, the Sinequa platform’s ability to allow developers to build new applications quickly, using cognitive search as a shared infrastructure layer, enables organizations to continuously expand their capability without rebuilding from scratch. Third, the adoption of agile information discovery methods — treating information access as an evolving set of targeted business applications rather than a single IT system — delivers measurably better ROI, faster time to value, and greater cross-departmental acceptance than traditional enterprise search projects.
Who Should Download This Document
This IDC Buyer Case Study is essential reading for pharmaceutical, life sciences, and enterprise technology leaders evaluating cognitive search and AI-powered knowledge management platforms — including Chief Scientific Officers, VP R&D Operations, Heads of Drug Discovery, R&D IT and digital transformation leaders, enterprise search platform owners, and technology strategy executives responsible for knowledge worker productivity at scale.
If your organization is managing tens or hundreds of millions of documents across disconnected repositories, with researchers spending hours navigating multiple systems to find information that should take minutes, the AstraZeneca story is the closest available analogue to what transformation looks like in practice, validated by an independent analyst.
From Agile Information Discovery to Agentic AI
AstraZeneca’s implementation of Sinequa — documented in this 2016 IDC case study — established the cognitive search infrastructure that continues to underpin its AI and data science strategy today. The 180-million-document, 60,000-employee platform built on Sinequa became the foundation for AstraZeneca’s subsequent investments in knowledge graphs, machine learning-driven drug target identification, and AI-assisted clinical trial management.
This trajectory reflects a pattern Sinequa now sees consistently across its pharmaceutical customer base: organizations that invest in a robust, enterprise-scale cognitive search layer first gain a significant structural advantage when deploying the next generation of AI capabilities. Generative AI, agentic AI, and RAG-based systems all depend on the quality of the knowledge they retrieve. An organization with 180 million documents indexed, enriched with scientific ontologies, and permission-correctly accessible is in a fundamentally different position to deploy trustworthy AI in drug discovery than one still navigating fragmented, unindexed data silos.
The AstraZeneca case study is not just a historical reference. It is a blueprint for how pharmaceutical organizations build the knowledge infrastructure required to make AI in R&D genuinely reliable at scale.
Frequently Asked Questions
This is an independent Buyer Case Study published by IDC (International Data Corporation), examining how AstraZeneca deployed Sinequa’s cognitive search platform to transform information discovery across its global R&D organization, which spans 8 countries. The report documents AstraZeneca’s implementation of a role-based enterprise search app store, resulting in over 180 million documents searchable in real time for more than 60,000 employees, with the core platform implemented within 12 months.
Agile information discovery is an approach to enterprise knowledge management that replaces a single monolithic search system with a portfolio of role-specific, task-specific search applications — each built on a shared cognitive search infrastructure. Rather than one generic search interface for all employees, each team or function gets a targeted application tuned to its specific workflows and data needs. AstraZeneca’s app store model — built on Sinequa — is the case study IDC uses to define best practice for this approach in the pharmaceutical sector.
According to IDC’s own Knowledge Worker Survey cited in the report, 57% of researchers needed access to four or more systems just to do their daily work, and fewer than half rated their current information-finding process as even somewhat beneficial. AstraZeneca’s researchers were navigating SharePoint, Office 365, EMC Documentum, eRoom, AstraZeneca’s R&D wiki, Box, Veeva, and multiple external scientific databases with no unified search capability across them — resulting in significant time loss and missed connections between related research.
AstraZeneca built over 10 role-based applications in the first three months, including R&D ChemSearch (compound and chemical search), R&D Intelligence (research aggregation), Find Scientific Partners (internal expert identification), Find People at AstraZeneca (global people search), R&D News Alerts (proactive research monitoring), In-Video Search (search within video content), and mobile productivity apps for workflow approvals. The app store model allowed new applications to be developed and deployed rapidly as new business needs emerged.
SciBite provides biomedical ontologies and scientific vocabulary tagging that enables search systems to understand the specialized language of pharmaceutical and life sciences research. In AstraZeneca’s implementation, SciBite’s scientific vocabularies are automatically applied to indexed documents within the Sinequa platform — meaning that searches for a compound, gene, disease, or mechanism of action surface all relevant documents regardless of the specific terminology used in each document. This semantic layer is what makes cognitive search meaningfully different from keyword search in a scientific research environment.
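To make the idea of vocabulary tagging concrete, here is a minimal sketch of how mapping different surface terms to one concept lets a search find documents regardless of wording. It is an illustration under stated assumptions, not SciBite's or Sinequa's actual interface: the `VOCABULARY` table, the concept IDs, the function names, and the sample documents are all hypothetical, and a real biomedical ontology is vastly larger.

```python
import re
from collections import defaultdict

# Hypothetical mini-vocabulary: surface term -> canonical concept ID.
VOCABULARY = {
    "aspirin": "DRUG:ASPIRIN",
    "acetylsalicylic acid": "DRUG:ASPIRIN",
    "heart attack": "DISEASE:MI",
    "myocardial infarction": "DISEASE:MI",
}


def tag_concepts(text):
    """Return the concept IDs whose surface terms occur as whole words in text."""
    found = set()
    for term, concept in VOCABULARY.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(concept)
    return found


def build_index(documents):
    """Map each concept ID to the set of document IDs tagged with it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept in tag_concepts(text):
            index[concept].add(doc_id)
    return index


def search(index, query):
    """Tag the query with the same vocabulary, then look up matching documents."""
    results = set()
    for concept in tag_concepts(query):
        results |= index.get(concept, set())
    return results


# Hypothetical sample documents, invented purely for illustration.
DOCS = {
    "doc1": "Aspirin reduced the rate of myocardial infarction in the trial.",
    "doc2": "Acetylsalicylic acid dosing notes for the assay.",
    "doc3": "Unrelated report on respiratory inflammation.",
}
INDEX = build_index(DOCS)

print(sorted(search(INDEX, "aspirin")))       # matches doc1 and doc2
print(sorted(search(INDEX, "heart attack")))  # matches doc1 via the synonym
```

A plain keyword search for "aspirin" would miss the second document, which only uses the chemical name; tagging both the documents and the query with the same concept IDs is what closes that gap in this sketch.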
AstraZeneca implemented Sinequa’s real-time search engine for its entire global R&D infrastructure — covering all scientific information and core internal repositories — in under 12 months. The subsequent app store expansion, including more than 10 role-based applications, was built within the first three months following the core platform deployment.
