Adding FDA data to Open Targets with Scala and Spark

JarrodBaker · 6 May 2021 07:40

Last year, the Open Targets team integrated the United States Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) into the Open Targets data ecosystem.

FAERS is a database of adverse events and medication error reports submitted to the FDA, and the publicly available API provides nearly 12 million records of adverse drug reactions. Medical professionals and consumers can voluntarily file reports of adverse medical events which are suspected to be associated with a drug. Since reporting is voluntary, the FDA has a number of disclaimers, including that the existence of a report does not establish causation, and that the information in the reports has not been verified.

The Platform team prepared a pipeline using Apache Spark and Scala, making it possible to analyse and extract insights from this information in minutes using Google Cloud’s Dataproc service. We limited the results to reports which did not result in patient death and were reported by a medical professional. We also ignored specific events related to treatment, technology or human action, rather than the action of the drug itself. This left us with approximately 55 000 unique drug-reaction pairs, covering 465 biological compounds.
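To make the filtering steps concrete, here is a minimal sketch in plain Python rather than the Scala and Spark used by the actual pipeline. The flat report shape, the field names (`patient_died`, `reported_by_professional`, `drugs`, `reactions`) and the contents of the exclusion list are all assumptions made for illustration; the real FAERS records are nested JSON and the Platform's own exclusion list is not reproduced here.

```python
# Illustrative only: hypothetical exclusion list of events that relate to
# treatment, technology or human action rather than the drug itself.
EXCLUDED_EVENTS = {"drug administration error", "device malfunction"}


def drug_reaction_pairs(reports, excluded_events=EXCLUDED_EVENTS):
    """Return the set of unique (drug, reaction) pairs that survive filtering.

    Each report is assumed to be a dict with the hypothetical keys
    'patient_died', 'reported_by_professional', 'drugs' and 'reactions'.
    """
    pairs = set()
    for report in reports:
        # Keep only reports that did not result in patient death.
        if report["patient_died"]:
            continue
        # Keep only reports filed by a medical professional.
        if not report["reported_by_professional"]:
            continue
        for drug in report["drugs"]:
            for reaction in report["reactions"]:
                event = reaction.lower()
                # Skip events that are not about the action of the drug.
                if event not in excluded_events:
                    pairs.add((drug.lower(), event))
    return pairs


example_reports = [
    {"patient_died": False, "reported_by_professional": True,
     "drugs": ["DrugA"], "reactions": ["Nausea", "Device malfunction"]},
    {"patient_died": True, "reported_by_professional": True,
     "drugs": ["DrugA"], "reactions": ["Headache"]},
    {"patient_died": False, "reported_by_professional": False,
     "drugs": ["DrugB"], "reactions": ["Rash"]},
]

example_pairs = drug_reaction_pairs(example_reports)
```

In Spark the same logic would typically be expressed as DataFrame filters followed by a distinct over the drug and reaction columns, which is what lets it scale to millions of reports.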

We implemented a likelihood ratio test that scores each drug-reaction pair against how often the drug and the event each appear in the data set overall, so that we can test the relevance of each adverse event associated with a drug. The result is a guide to which adverse events are most strongly associated with specific drugs, which helps researchers identify potential new links between targets, drugs and their effects.
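As a rough illustration of the idea, the sketch below computes a log-likelihood ratio for one drug-event pair from a 2x2 table of report counts, again in plain Python rather than Scala. This is a generic form of the statistic and an assumption on my part, not necessarily the exact formula used in the Platform pipeline; the function and parameter names are hypothetical.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating the x == 0 case as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_likelihood_ratio(a, b, c, d):
    """Log-likelihood ratio for a single drug-event pair.

    a: reports mentioning the drug and the event
    b: reports mentioning the drug but not the event
    c: reports mentioning the event but not the drug
    d: reports mentioning neither
    """
    n = a + b + c + d
    if a + b == 0 or c + d == 0 or a + c == 0:
        return 0.0
    p = a / (a + b)    # event rate among reports for this drug
    q = c / (c + d)    # event rate among all other reports
    p0 = (a + c) / n   # pooled event rate under the null hypothesis
    # One-sided: only score pairs where the event is over-represented.
    if p <= q:
        return 0.0
    return _xlogy(a, p) + _xlogy(c, q) - _xlogy(a + c, p0)


# Event seen in 10% of this drug's reports versus about 1% elsewhere.
score = log_likelihood_ratio(10, 90, 10, 890)
# Same event rate with and without the drug: no signal.
no_signal = log_likelihood_ratio(10, 90, 10, 90)
```

A larger score means the event is reported with the drug more often than the overall event rate would suggest. A critical value is still needed to decide which scores count as significant; that step is not shown here.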

This post is based on an Open Targets Blog post, where you can find further information on the process of integrating this data into the Platform, including details on why Spark was a particularly useful technology in this case, and how you can run your own analyses.
