Differences between linkedDiseases and approvedIndications across versions

christosasi · 18 January 2024 17:35

Hi team,
I have a query about differences in the ‘molecule’ and ‘indication’ datasets available on OpenTargets.
My current understanding of these datasets is based on the schema of the OpenTargets datasets available at https://api.platform.opentargets.org/api/v4/graphql/schema

linkedDiseases: “Therapeutic indications for drug based on clinical trial data or post-marketed drugs, when mechanism of action is known”
approvedIndications: “Indications for which there is a phase IV clinical trial”
indications: “Investigational and approved indications curated from clinical trial records and post-marketing package inserts”

When I compared the number of ChEMBL IDs in phase IV (molecules with maximumClinicalTrialPhase = 4) from the ‘molecule’ dataset with the number of ChEMBL IDs with at least one entry in approvedIndications in the ‘indication’ dataset, I found a significant difference in the total counts. My comparison covers versions 21.04 to 23.12 and is illustrated in the attached bar plot.
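For reference, the comparison described above can be sketched roughly as follows. This is a minimal illustration, not the exact script I used: it assumes each dataset has been loaded as a list of plain dicts (for example from the JSON downloads), that both use an `id` field for the ChEMBL ID, and that `approvedIndications` is a list. The function name, the toy IDs and the toy values are all made up for the example.

```python
# Toy rows standing in for the 'molecule' and 'indication' datasets.
# The IDs and values are invented for illustration only.
molecule_rows = [
    {"id": "CHEMBL_TOY_1", "maximumClinicalTrialPhase": 4},
    {"id": "CHEMBL_TOY_2", "maximumClinicalTrialPhase": 4},
    {"id": "CHEMBL_TOY_3", "maximumClinicalTrialPhase": 2},
    {"id": "CHEMBL_TOY_4", "maximumClinicalTrialPhase": 4},
]
indication_rows = [
    {"id": "CHEMBL_TOY_1", "approvedIndications": ["EFO_TOY_1"]},
    {"id": "CHEMBL_TOY_3", "approvedIndications": []},
    {"id": "CHEMBL_TOY_4", "approvedIndications": ["EFO_TOY_2", "EFO_TOY_3"]},
]


def compare_phase4_to_approved(molecule_rows, indication_rows):
    """Compare phase IV molecules with molecules that have approved indications.

    Hypothetical helper: field names are assumptions about the dataset layout.
    """
    # IDs whose maximum clinical trial phase is IV in the 'molecule' dataset.
    phase4_ids = {
        row["id"]
        for row in molecule_rows
        if row.get("maximumClinicalTrialPhase") == 4
    }
    # IDs with at least one approved indication in the 'indication' dataset.
    # A missing or empty list counts as "no approved indication".
    approved_ids = {
        row["id"]
        for row in indication_rows
        if row.get("approvedIndications")
    }
    return {
        "phase4_count": len(phase4_ids),
        "approved_count": len(approved_ids),
        "phase4_only": sorted(phase4_ids - approved_ids),
        "approved_only": sorted(approved_ids - phase4_ids),
    }


summary = compare_phase4_to_approved(molecule_rows, indication_rows)
print(summary)
```

Running the same function once per release gives the per-version counts, and the `phase4_only` and `approved_only` lists show which IDs account for the gap in each version.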

In manual checks, I noticed that some of the ChEMBL IDs in phase IV from the ‘molecule’ dataset have no linkedDiseases values. However, the same IDs have values in the ‘approvedIndications’ column in the ‘indication’ dataset. I have also listed these differences in the following table.

I would appreciate your help in understanding this difference in annotation between the two datasets, and in finding out whether we can fill in the gaps with the data you have published across versions. As the difference has decreased across versions, I suspect there is something already present in the dataset that we could use to fill in the gap. There is also a significant addition of drugs from version 22.04, and I could not find an explanation for this in the release notes. Was a new data source used to add these drugs?

Please let me know if I can provide you any more information to help you answer this query.
