Clarification on Number of Protein-Coding Drug Targets in OpenTargets Database

Dear OpenTargets Team,

I hope this message finds you well. I am currently working on a project focused on genes for which there is evidence, from either a publication or an assay, of an interaction between a drug and the gene product.

In reviewing the OpenTargets database, I noticed that while the platform boasts over 63,000 targets across all biotypes, filtering the drug table to keep only drugs with a linked protein-coding target (and removing empty ‘linkedTargets’ fields) leaves just 1,554 of the 20,094 protein-coding genes found in your database. This is a substantial decrease, and I wanted to confirm whether this figure is accurate or whether I may be missing something in my filtering approach.
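
The filtering described above can be sketched roughly as follows. This is only an illustration on invented toy records, not the actual pipeline or data: the field names `id`, `biotype` and `linkedTargets`, the `protein_coding` label, and the helper `count_drugged_protein_coding` are assumptions, and the real tables may nest the linked target IDs differently.

```python
# Toy stand-ins for the target and drug tables. All records are invented.
targets = [
    {"id": "ENSG_A", "biotype": "protein_coding"},
    {"id": "ENSG_B", "biotype": "protein_coding"},
    {"id": "ENSG_C", "biotype": "lncRNA"},
    {"id": "ENSG_D", "biotype": "protein_coding"},
]
drugs = [
    {"id": "DRUG_1", "linkedTargets": ["ENSG_A", "ENSG_C"]},
    {"id": "DRUG_2", "linkedTargets": ["ENSG_A"]},
    {"id": "DRUG_3", "linkedTargets": []},    # empty field, contributes nothing
    {"id": "DRUG_4", "linkedTargets": None},  # missing field, contributes nothing
]


def count_drugged_protein_coding(drugs, targets):
    """Return (protein-coding targets linked to any drug, all protein-coding targets)."""
    # Assumed biotype label; adjust if the table uses a different value.
    protein_coding = {t["id"] for t in targets if t["biotype"] == "protein_coding"}

    # Collect every target ID linked to at least one drug, skipping empty fields.
    linked = set()
    for drug in drugs:
        linked.update(drug.get("linkedTargets") or [])

    # Keep only linked targets that are protein-coding, counting each gene once.
    drugged = linked & protein_coding
    return len(drugged), len(protein_coding)


n_drugged, n_protein_coding = count_drugged_protein_coding(drugs, targets)
print(f"{n_drugged} of {n_protein_coding} protein-coding targets are linked to a drug")
```

Counting unique target IDs rather than drug rows matters here, because many drugs can share the same target.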

Could you please provide clarification on this discrepancy? Additionally, if there are specific criteria or considerations when interpreting the drug-target interactions in the database, I would greatly appreciate any insights.

Thank you for your time and support.

Kind Regards,
Joy

Hi @joy2001 and welcome to our Community!

Your analysis is correct. Only a small fraction of protein-coding genes have their products directly targeted by drugs. You might find some gaps in the curation of these mechanisms of action, but in essence, this reflects the current state of drug development.

If you’re interested in expanding your universe of drug/target interactions beyond the drug mechanism of action, I’d recommend:

  • using our Pharmacogenetics data to pull targets that are involved in the response to a drug;
  • using the bioactivity data available in ChEMBL.

Thank you for your question!
Irene