Elsevier, founded in 1880, is a publishing and analytics company specialising in scientific, technical, and medical content.
Elsevier wanted to build an AI data source that lets researchers around the world explore multiple chemistry data sources alongside their own organisation’s data, with guaranteed security, to discover new relationships, interactions and cures. The challenge was to integrate disparate data sources whose data structures use entirely different schemas and naming conventions. Data integration had to be easy to carry out, comparisons had to be possible both manually and automatically (via AI), and results had to be meaningful and clear in a single new industry-standard schema, all through a single interface on any device.
Security was paramount, as the biotech, engineering and pharma industries that would use this solution require absolute confidentiality in their work at all times.
With FAIR (Findable, Accessible, Interoperable and Reusable) principles at the forefront, we compared a number of the major STEM industry-standard database schemas to identify a single standard that would allow these disparate data sources to be ingested, standardised and interrogated by users in a meaningful and valuable manner.
We interviewed students and members of the scientific community in the UK and Europe to assess their day-to-day work and needs.
We architected a cloud-based data platform designed to help life sciences companies overcome the challenges of modern R&D by enriching and harmonising proprietary and external data, delivering it in an AI-ready environment.
The solution was designed to cover everything from ingesting huge volumes of existing data stored in individual Electronic Lab Notebooks to finding the desired piece of information in scientific literature, in a frictionless manner.
We defined a best-practice ingestion data model. We researched, prototyped and tested an extensible interface that returns intelligent, useful search results.
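To illustrate what an ingestion data model of this kind does, here is a minimal sketch in Python. It is not the platform's actual model: the `CompoundRecord` class, the source names `eln_a` and `vendor_b`, and all field names are hypothetical, chosen only to show how records with different naming conventions can be mapped onto one common structure.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class CompoundRecord:
    """Hypothetical common record; not the platform's real schema."""
    source: str
    compound_name: str
    molecular_weight: Optional[float]


# Hypothetical per-source mappings from local field names to common names.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "eln_a": {"cmpd": "compound_name", "mw": "molecular_weight"},
    "vendor_b": {"Name": "compound_name", "MolWeight": "molecular_weight"},
}


def ingest(source: str, raw: Dict[str, Any]) -> CompoundRecord:
    """Rename known fields to the common schema and drop unmapped ones."""
    mapping = FIELD_MAPS[source]
    common = {mapping[key]: value for key, value in raw.items() if key in mapping}
    weight = common.get("molecular_weight")
    return CompoundRecord(
        source=source,
        compound_name=str(common["compound_name"]).strip(),
        molecular_weight=float(weight) if weight is not None else None,
    )


# Two sources describing the same compound with different conventions.
a = ingest("eln_a", {"cmpd": " Aspirin ", "mw": "180.16"})
b = ingest("vendor_b", {"Name": "Aspirin", "MolWeight": 180.16, "Batch": "X1"})
```

After ingestion, both records expose the same field names and types, so they can be compared or searched together regardless of where they came from.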
The result is a scalable and customisable knowledge environment that supports exploratory and predictive analytics applications and provides a full development stack for data scientists, so they can work on solving problems rather than manipulating data.
With nearly a fifth of pharmaceutical spending going to R&D and the average cost of bringing a new drug to market estimated at US$4 billion, life sciences companies urgently need more efficient ways to analyse data.
This new platform saves time and costs by de-siloing, contextualising and connecting drug, target, and disease data to deliver normalised, discoverable and model-ready information.
The solution radically simplified activities such as text mining, data normalisation, the application of ontologies and the mapping of ontologies onto multiple data sets.
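As a rough illustration of mapping an ontology onto a data set, the sketch below normalises free-text terms and resolves them to canonical identifiers. The ontology contents, the identifiers (`DRUG:0001`, `DISEASE:0001`) and the function names are all assumptions made for this example; a real deployment would rely on established ontologies and far richer matching.

```python
from typing import Dict, List, Optional

# Hypothetical mini-ontology: canonical ID -> known synonyms (lower case).
ONTOLOGY: Dict[str, List[str]] = {
    "DRUG:0001": ["aspirin", "acetylsalicylic acid", "asa"],
    "DISEASE:0001": ["myocardial infarction", "heart attack"],
}

# Reverse index so each synonym resolves to its canonical ID.
SYNONYM_INDEX: Dict[str, str] = {
    synonym: concept_id
    for concept_id, synonyms in ONTOLOGY.items()
    for synonym in synonyms
}


def normalise(term: str) -> str:
    """Lower-case the term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def map_term(term: str) -> Optional[str]:
    """Return the canonical ID for a term, or None if it is unknown."""
    return SYNONYM_INDEX.get(normalise(term))


def annotate(records: List[dict], field: str) -> List[dict]:
    """Copy each record, adding '<field>_id' with the mapped concept ID."""
    return [{**record, f"{field}_id": map_term(record[field])} for record in records]


annotated = annotate([{"drug": "ASA"}, {"drug": "  Acetylsalicylic   ACID "}], "drug")
```

Both input records resolve to the same identifier even though they were written differently, which is what makes data from separate sources connectable.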