Details on ChEMBL gene target links

Hello,

I am a project lead on an OTAR study at Sanger. We have been exploring the drug-target data that links bioactive molecules (ChEMBL IDs) to candidate target genes.

A couple of questions:

  1. What is the difference between the “linkedTargets” column in the “molecule” table (rsync -rpltvz --delete rsync.ebi.ac.uk::pub/databases/opentargets/platform/21.06/output/etl/json/molecule .) compared to the “targets” column of the “mechanismOfAction” table (rsync -rpltvz --delete rsync.ebi.ac.uk::pub/databases/opentargets/platform/21.04/output/etl/json/mechanismOfAction .)?

  2. Is there documentation for how these gene targets are determined?

I thought I would post my questions here as others may have similar questions in the future.

Many thanks for your time and help.

Best,
Leland

Hi @tlr! :wave:

Welcome to the Open Targets Community! :tada:

I spoke with one of our data scientists, @irene, and here’s what she told me:

  1. There is no difference between the linkedTargets and mechanismOfAction data. In our pipelines, we generate linkedTargets from the mechanismOfAction data provided by ChEMBL. We use the linkedTargets data on the Platform search results page when a user searches for a drug. For example, see the “Drug targets” section on the metformin search results page. Although the underlying data are the same, I recommend using the mechanismOfAction dataset, because it also includes the mechanism(s) of action curated by ChEMBL, along with reference links.

  2. The drug targets for a given compound or drug are curated by ChEMBL. For more information, please check out the ChEMBL documentation or email their help desk at chembl-help@ebi.ac.uk.
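
For anyone who wants to check point 1 against the downloaded files, here is a minimal sketch rather than an official recipe. It assumes each dataset directory contains JSON-lines part files ending in `.json`, that molecule records carry `id` and `linkedTargets.rows`, and that mechanismOfAction records carry `chemblIds` and `targets`. Those field names, and the helper names `read_json_lines` and `find_mismatches`, are assumptions for illustration, so please confirm them against the schema of the release you downloaded.

```python
import json
from collections import defaultdict
from pathlib import Path


def read_json_lines(directory):
    """Yield one dict per non-empty line from every *.json part file.

    Assumption: the ETL output directory holds JSON-lines part files.
    """
    for path in sorted(Path(directory).glob("*.json")):
        with path.open() as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)


def targets_from_mechanisms(mechanism_rows):
    """Map each ChEMBL ID to the set of targets listed across its mechanisms.

    Assumed fields: 'chemblIds' (list) and 'targets' (list).
    """
    targets = defaultdict(set)
    for row in mechanism_rows:
        for chembl_id in row.get("chemblIds") or []:
            targets[chembl_id].update(row.get("targets") or [])
    return targets


def targets_from_molecules(molecule_rows):
    """Map each molecule ID to its set of linked targets.

    Assumed fields: 'id' and 'linkedTargets' -> {'rows': [...]}.
    """
    targets = {}
    for row in molecule_rows:
        linked = row.get("linkedTargets") or {}
        targets[row["id"]] = set(linked.get("rows") or [])
    return targets


def find_mismatches(molecule_rows, mechanism_rows):
    """Return {molecule_id: (linked_targets, mechanism_targets)} where they differ."""
    from_moa = targets_from_mechanisms(mechanism_rows)
    from_mol = targets_from_molecules(molecule_rows)
    return {
        chembl_id: (linked, from_moa.get(chembl_id, set()))
        for chembl_id, linked in from_mol.items()
        if linked != from_moa.get(chembl_id, set())
    }
```

With the two directories from the rsync commands in the question, this would be called as `find_mismatches(read_json_lines("molecule"), read_json_lines("mechanismOfAction"))`. An empty result would be consistent with the two columns carrying the same targets. If the two directories come from different releases, some differences may simply reflect the release gap.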

Great - thanks @ahercules!

Looping back with the message from ChEMBL. This helped me and may be useful to others.

As a summary, drug mechanisms are manually curated using drug leaflets and/or literature and all references are provided in ChEMBL.

We also extract data from core medicinal chemistry literature and deposited data sets and map compounds to all tested targets. We include negative data in ChEMBL and therefore compounds may be active or inactive against targets. Please note, our targets include whole organisms, cell lines etc. in addition to single proteins.

There is also some background information in our recent webinars that may be useful:

  1. “ChEMBL: Quick tour” (EMBL-EBI Training)
  2. “A guide to exploring drug-like compounds and their biological targets using ChEMBL” (EMBL-EBI Training)

Thanks for sharing @tlr! :slight_smile: