AI in Pharma: Transforming Data into Drugs

Jane Z. Reed of Linguamatics discusses the potential benefits and obstacles of leveraging data through AI

By: Kristin Brooks

Managing Editor, Contract Pharma

Given the time and R&D costs associated with bringing drugs to market, greater efficiency and better processes and analysis are needed for innovation in the pharma and biopharma industry. Artificial intelligence offers tremendous potential across the drug development continuum; however, there are obstacles to overcome in order to benefit from numerous advances in science and technology. Leveraging the massive amounts of data unearthed in drug discovery, development and clinical trials may be key to selecting and advancing the right drug candidates.

Several AI collaborations among top pharma companies are currently underway. For example, Johnson & Johnson partnered with IBM’s Watson Health for its AI cloud-based system capable of processing high volumes of data and offering evidence-based answers. Pfizer is leveraging the Watson for Drug Discovery cloud-based platform in an effort to help discover new drug targets and alternative drug indications, and Novartis partnered with IBM Watson Health to develop a cognitive solution using real-time data to gain better insights on the expected outcomes of breast cancer treatment options. Furthermore, Genentech, a member of the Roche group, is collaborating with GNS Healthcare to use machine learning to convert high volumes of cancer patient data into computer models that can be used to identify novel targets.

Based on its recent analysis of the global market for natural language processing (NLP) in life sciences artificial intelligence (AI), Frost & Sullivan recognized Linguamatics for its I2E NLP text mining platform. This solution generates insights from a wide range of unstructured and semi-structured data, helping clients to efficiently integrate AI into their operations.

In its analysis, Frost & Sullivan noted the flexibility of the I2E platform, which can be applied across a variety of applications such as gene-disease mapping and target identification, biomarker discovery, regulatory compliance management, drug safety, clinical trial analysis, drug patent landscape reporting and analysis, real-world data analysis, and identification of clinical care gaps. The solution is currently being used by 18 of the top 20 pharmaceutical companies and numerous oncology research organizations.

Jane Z. Reed, Ph.D., head of life science strategy at Linguamatics, discusses the potential benefits and obstacles of leveraging data through AI. –KB

Contract Pharma: What aspects of the drug development continuum have the potential to benefit from AI?

Jane Z. Reed: There’s real potential for AI to impact the whole bench-to-bedside continuum. AI in early research isn’t new. Algorithms for sequence manipulation (BLAST, Clustal), methods in computational chemistry and QSAR, or Bayesian clustering for lead compound or lead series design – these have all been used for years. But one of the core prerequisites for good AI is good data, and previously other areas in drug development did not have suitable data types or data volumes to provide the substrate for AI technologies. That’s all changing: we have tsunamis of data available – genomic, genetic, real world data from wearables, literature, patents, clinical and more. We now have more data than the human brain could interpret in a lifetime. The move to digital transformation, changing processes to be data-driven rather than document-driven, means that algorithms can be applied to search for patterns and for significance across many applications.

That said, one of the hot spots for pharma is utilizing real world data. Data from healthcare records can be accessed better using AI text mining, such as natural language processing, to extract features for machine learning models for risk management, safety alerts, or patient monitoring during clinical trials. These data can be combined with information from wearables, individual genetic data, and more, to better understand patient responses and/or cohort selection, and hence improve clinical trial efficiencies or reduce attrition.

CP: What are some of the current obstacles with AI in the pharmaceutical industry? How can they be overcome?

JR: There is a huge amount of hype currently around AI, and this can put pressure on companies or teams to set up an AI project without fully understanding the problem space. Obstacles to a successful project include: not defining a clear problem statement; not having suitable data (too small a data set, too dirty, or locked in data silos); or not having good communication between the business user and the IT/IS/data science teams.

Where we have seen successful implementation of natural language processing (NLP), for example, is where a team can identify a clear and specific problem that is causing a block or challenge within the business (obviously around unstructured text, for NLP applications); define and scope a pilot project to test this; work closely with the business users to understand their needs; and in this way address a practical use case and demonstrate value. Experimenting rapidly and then scaling up successes is a great way to test any new approach. This way, you can gain understanding of, and trust in, the AI technology, and then build from there.

CP: Where do you see the most opportunities for AI?

JR: As I said above, making sense of all the structured and unstructured real world data (e.g. electronic health records) and combining this with genetic and genomic data is really going to impact how we can design, run and monitor clinical trials. There’s also significant potential value for AI earlier in drug development, in early discovery. Here, combining information from literature, genomics and genetics, biological assay databases and more can deepen our understanding of the basic biology of targets, pathways and diseases, and thus improve target selection, target prioritization and biomarker discovery. Currently nine out of ten candidate drugs fail between Phase I clinical trials and regulatory approval, and lack of efficacy is a major concern, often coming down to a failure to understand the fundamental natural history of the disease.

Several big pharma companies have announced projects with AI or Machine Learning (ML) technologies to address these challenges (e.g. Pfizer is using IBM Watson to improve the search for immuno-oncology agents), and it will be interesting to see how these initiatives progress. We are probably still some years away from the first approved drug discovered by artificial intelligence; but the combination of AI technologies and the human brain (“augmented intelligence”) is already showing signs of accelerating many parts of the drug discovery and development pipeline.

CP: What does leveraging data through AI entail?

JR: Data can tell a story, but only if you can listen. In order to leverage big data, you need to find it, have access to it, and ensure the data can map to suitable identifiers or standards for integration with other data sets; only then can you apply algorithms to start analysing the data. Data silos within pharma organizations can prevent data integration and hence data understanding, and there are many initiatives currently to make data more FAIR (Findable, Accessible, Interoperable, and Reusable)* and hence ensure value can be obtained. In addition, as with any analytical technology, the quality of the underlying data is critical. Flawed input data will produce flawed outputs (“garbage in, garbage out”). So, in order to leverage data effectively for AI and ML, it has to be clean.

Among both our healthcare and pharma customers, Linguamatics NLP software is used to extract and clean data features from literature, clinical trial reports and electronic health records in order to feed machine learning models. Use cases include using NLP in machine learning models to predict the risk of opioid medication abuse, identify gout flares to provide better patient control, or predict success or failure of novel therapeutics in clinical trials.
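To make the idea of turning free text into model-ready features concrete, here is a minimal Python sketch. It is a toy illustration only and does not represent Linguamatics' I2E software or any method described in the interview: the synonym table, the concept labels and the function names are all hypothetical, and a real pipeline would map terms to a standard vocabulary and handle negation, context and misspellings.

```python
import re

# Hypothetical synonym table: surface phrases mapped to one canonical concept.
# A real project would map to a standard vocabulary rather than ad hoc labels.
SYNONYMS = {
    "gout flare": "gout_flare",
    "gouty attack": "gout_flare",
    "oxycodone": "opioid",
    "hydrocodone": "opioid",
}


def extract_features(note: str) -> dict[str, int]:
    """Count canonical concepts mentioned in one free-text note."""
    # Basic cleaning: lowercase and collapse runs of whitespace.
    text = re.sub(r"\s+", " ", note.lower()).strip()
    counts: dict[str, int] = {}
    for phrase, concept in SYNONYMS.items():
        hits = len(re.findall(r"\b" + re.escape(phrase) + r"\b", text))
        if hits:
            counts[concept] = counts.get(concept, 0) + hits
    return counts


def to_matrix(notes: list[str]) -> tuple[list[str], list[list[int]]]:
    """Turn notes into a fixed-column feature matrix for a downstream model."""
    columns = sorted(set(SYNONYMS.values()))
    rows = []
    for note in notes:
        feats = extract_features(note)
        rows.append([feats.get(c, 0) for c in columns])
    return columns, rows


# Invented example notes, not real patient data.
notes = [
    "Patient reports a  Gout flare; prior gouty attack in March.",
    "Prescribed oxycodone, later switched to hydrocodone.",
    "No complaints.",
]
columns, rows = to_matrix(notes)
print(columns)
print(rows)
```

Each row of the resulting matrix could then be passed to a machine learning model as one record. Mapping synonyms to a single concept before counting is the small-scale equivalent of mapping data to shared identifiers so that it can be integrated with other data sets.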
 
* The FAIR principles were introduced in a 2016 publication: https://www.nature.com/articles/sdata201618
 



 
Jane Z. Reed, Ph.D., is Linguamatics’ head of life science strategy and is responsible for developing the strategic vision for Linguamatics’ growing product portfolio and business development in the life science market.
