Downloadable drug indication data - 'Known drug' vs. 'Drug - indications'

sigven · 31 March 2025 08:42

Hi,

I’ve been looking at the parquet files (25.03 release) for the ‘Known drug’ dataset as well as the ‘Drug - indications’ dataset, and I am a bit confused about which one to use for drug indication data. From what I can see, the ‘Known drug’ dataset also includes drug indication data, i.e. the columns ‘diseaseId’, ‘phase’, ‘status’ and ‘urls’, yet these data do not seem to match exactly what is listed in the ‘Drug - indications’ dataset. Would it be possible to clarify what the differences are, and the underlying idea for providing both of them?

kind regards,
Sigve

irene · 31 March 2025 12:32

Hi @sigven,

Thank you for your question! You can find an explanation of the differences between the datasets in this other thread: "Clinical precedence not capturing entire data" (post #2 by irene).

This is indeed a confusing topic. If you are interested in drug/indication pairs, I’d rely on the Drug - indications dataset. We are currently working towards merging these 2 sections into one, to avoid cases where ChEMBL has curated a drug/indication pair, but the clinical precedence doesn’t reflect it.
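To see which drug/indication pairs differ between the two downloads, a set comparison is usually enough. The sketch below is only an illustration: it assumes each dataset has already been loaded and flattened into rows with a `drugId` and a `diseaseId` key. `diseaseId` is the ‘Known drug’ column mentioned above; `drugId`, the flattened shape of the ‘Drug - indications’ rows, the function name `pair_differences` and the toy identifiers are all assumptions made for the example, not the actual schema.

```python
def pair_differences(known_drug_rows, indication_rows):
    """Compare (drugId, diseaseId) pairs between two collections of rows.

    Both arguments are iterables of dicts that carry 'drugId' and
    'diseaseId' keys (assumed column names, see the note above).
    Repeated rows for the same pair collapse into one set entry.
    """
    known_pairs = {(r["drugId"], r["diseaseId"]) for r in known_drug_rows}
    indication_pairs = {(r["drugId"], r["diseaseId"]) for r in indication_rows}
    return {
        "only_in_known_drug": known_pairs - indication_pairs,
        "only_in_indications": indication_pairs - known_pairs,
        "shared": known_pairs & indication_pairs,
    }


# Toy rows with made-up identifiers, standing in for the parquet contents.
known_drug_rows = [
    {"drugId": "DRUG_A", "diseaseId": "DISEASE_1", "phase": 4},
    {"drugId": "DRUG_A", "diseaseId": "DISEASE_1", "phase": 4},  # same pair again
]
indication_rows = [
    {"drugId": "DRUG_A", "diseaseId": "DISEASE_1"},
    {"drugId": "DRUG_B", "diseaseId": "DISEASE_2"},  # absent from known drugs
]

diff = pair_differences(known_drug_rows, indication_rows)
for label, pairs in diff.items():
    print(label, sorted(pairs))
```

Pairs that land in `only_in_indications` would correspond to the situation described above, where a curated drug/indication pair is not reflected in the clinical precedence data.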

Hope this is helpful!
Irene

sigven · 31 March 2025 12:44

Thanks! I probably should have gone through the other threads more carefully.
