Cloud AI Data Integration & ML Research Platform

Challenge

Elsevier, founded in 1880, is a publishing and analytics company specialising in scientific, technical, and medical content.

Elsevier wanted to build an AI data source that lets researchers around the world explore multiple chemistry data sources alongside their own organisation’s data, with guaranteed security, to discover new relationships, interactions and cures. The challenge was to integrate disparate data sources whose data structures use entirely different schemas and naming conventions. Data integration had to be easy to carry out, comparisons had to be possible both manually and automatically (via AI), and results had to be meaningful and clear in a single new industry-standard schema, all through a single interface on any device.

Security was paramount, as the biotech, engineering and pharma industries that would use this solution require absolute confidentiality in their work at all times.

Our approach

With FAIR (Findable, Accessible, Interoperable and Reusable) principles at the forefront, we compared a number of the major STEM industry-standard database schemas to identify a single standard that would allow these disparate data sources to be ingested, standardised and interrogated by users in a meaningful and valuable manner.

We interviewed students and members of the scientific community in the UK and Europe to assess their day-to-day work and needs.

Outcomes

We architected a cloud-based data platform designed to help life sciences companies overcome the challenges of modern R&D by enriching and harmonising proprietary and external data, delivering it in an AI-ready environment.

The solution was designed to handle everything from ingesting huge volumes of existing data stored in individual Electronic Lab Notebooks to finding the desired piece of information in scientific literature, all in a frictionless manner.

We defined a best-practice ingestion data model. We researched, prototyped and tested an extensible interface that returns intelligent, useful search results.
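As an illustration only, the idea of an ingestion data model can be sketched as a mapping from each source's own field names onto one common record type. Everything in this sketch is hypothetical: the `ReactionRecord` schema, the source names `eln_a` and `vendor_b`, and all field names are assumptions made for the example and are not taken from the delivered platform.

```python
from dataclasses import dataclass


# Hypothetical common schema; the fields are illustrative assumptions.
@dataclass(frozen=True)
class ReactionRecord:
    source: str
    reactant: str
    product: str
    yield_percent: float


# Hypothetical per-source maps: common field name -> that source's field name.
FIELD_MAPS = {
    "eln_a": {"reactant": "start_material", "product": "result", "yield_percent": "yield_pct"},
    "vendor_b": {"reactant": "Reactant", "product": "Product", "yield_percent": "Yield"},
}


def harmonise(source: str, raw: dict) -> ReactionRecord:
    """Translate one raw record from a named source into the common schema."""
    mapping = FIELD_MAPS[source]
    return ReactionRecord(
        source=source,
        # Normalise text fields by trimming stray whitespace.
        reactant=str(raw[mapping["reactant"]]).strip(),
        product=str(raw[mapping["product"]]).strip(),
        # Yields may arrive as strings or numbers; store them as floats.
        yield_percent=float(raw[mapping["yield_percent"]]),
    )


records = [
    harmonise("eln_a", {"start_material": "benzene", "result": "nitrobenzene", "yield_pct": "85"}),
    harmonise("vendor_b", {"Reactant": " toluene ", "Product": "benzoic acid", "Yield": 72.5}),
]
```

Keeping the per-source mapping as data rather than code means that, in a design like this, adding a new source only requires a new entry in the mapping table.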

The result is a scalable and customisable knowledge environment that supports exploratory and predictive analytics applications and provides a full data science development stack, so data scientists can work on solving problems rather than manipulating data.

Benefits

With nearly a fifth of pharmaceutical spending going to R&D and the average cost of bringing a new drug to market estimated at US$4 billion, life sciences companies urgently need more efficient ways to analyse data.

This new platform saves time and costs by de-siloing, contextualising and connecting drug, target, and disease data to deliver normalised, discoverable and model-ready information.

The solution radically simplifies activities such as text mining, data normalisation, and the application and mapping of ontologies onto multiple data sets. In particular, it:

  • Delivers connected and AI-ready data by linking and enriching disparate content against established life science taxonomies
  • Reduces the need for labour-intensive data ingestion and harmonisation, so more time can be spent on high-value predictive algorithms
  • Streamlines the deployment process to ensure that work is not restricted to a small group of people
  • Updates models easily as new data arrives
  • Assesses results from different approaches (e.g., algorithm-generated results and rule-based approaches) and combines the findings to generate better outcomes
  • Ensures that discovery chemistry output is based on solid data foundations
  • Leverages internal failed reaction data to identify patterns for more effective routes for complex molecules
  • Creates or evaluates atom mapping algorithms
  • Constructs reaction classifications for more effective similarity searches or reaction clustering
  • Predicts metabolites, or analyses metabolite networks of a given organism
  • Supports retrosynthesis, estimating chemical accessibility for library design
  • Integrates proprietary reaction data and/or other third-party data alongside Reaxys reaction and USPTO data
  • Removes the need to harmonise and cleanse the data before performing the data science
  • Provides one-click deployment of models built on solid data foundations to bench chemists
  • Deploys models seamlessly into end-user applications of choice for better end-point outcomes
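To make the point about combining algorithm-generated and rule-based results more concrete, here is a minimal sketch of one possible approach: a weighted blend of a rule-based yes/no signal and a model score. This is not the platform's actual method; the function name `combine_predictions`, the `rule_weight` parameter and the candidate names are assumptions made for the example.

```python
def combine_predictions(
    rule_hits: dict[str, bool],
    model_scores: dict[str, float],
    rule_weight: float = 0.4,
) -> list[tuple[str, float]]:
    """Blend rule-based hits with model scores and rank the candidates.

    A rule hit counts as 1.0 and a miss as 0.0. Candidates missing from
    either input are treated as a miss (rules) or as score 0.0 (model).
    """
    combined = {}
    for candidate in set(rule_hits) | set(model_scores):
        rule_score = 1.0 if rule_hits.get(candidate, False) else 0.0
        model_score = model_scores.get(candidate, 0.0)
        combined[candidate] = rule_weight * rule_score + (1 - rule_weight) * model_score
    # Highest combined score first; ties are broken alphabetically by name.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical candidates scored by both approaches.
ranked = combine_predictions(
    rule_hits={"route_a": True, "route_b": False},
    model_scores={"route_a": 0.5, "route_b": 0.9, "route_c": 0.3},
)
```

With these illustrative inputs, `route_a` ranks first because the rule-based signal lifts it above `route_b`, even though the model alone scored `route_b` higher.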

Contact Us

Email: info@uxsan.com

S&N
Berkeley Square,
London
W1J 5AP
Copyright S&N 2022