Getting all the LoF associations to disease for a gene

Hi! I’m trying to write an API query that returns the direction-of-effect data for all diseases associated with a target, from ClinVar and Orphanet.

Grateful for any help.

Hi @Aner_Ottolenghi and welcome to the Open Targets Community! :tada:

You can use the following query on the GraphQL API playground to retrieve the direction-of-effect data for the diseases associated with a target, from ClinVar (datasource ID "eva") and Orphanet:

query associatedDiseasesExample {
  target(ensemblId: "ENSG00000157764") {
    id
    approvedSymbol
    associatedDiseases {
      count
      rows {
        disease {
          id
          name
          evidences(
            ensemblIds: ["ENSG00000157764"]
            datasourceIds: ["eva", "orphanet"]
            size: 10000
          ) {
            count
            rows {
              datasourceId
              variantId
              variantRsId
              directionOnTrait
              variantEffect
            }
          }
        }
      }
    }
  }
}

Kindly note that the API returns at most 10000 rows per response. Beyond that, especially for ClinVar, you will have to page through the results by specifying a cursor along with size, for example cursor: "WzAuOTcsIjAwYTJjZGJiMTM1NTI2ODljZGYzYmI0ZmQ5N2Q1NWYxZTQ3OGU1MTUiXQ==".
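
A minimal Python sketch of the cursor paging described above, using only the standard library. It assumes the public endpoint https://api.platform.opentargets.org/api/v4/graphql, that evidences can be queried under disease(efoId: ...) with the same arguments plus cursor, and that each response carries a cursor field pointing at the next page; please check these against the schema in the playground. The names post_graphql and fetch_all_evidence are hypothetical helpers, not part of any Open Targets client.

```python
import json
import urllib.request

# Assumed public Open Targets Platform GraphQL endpoint.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Evidence for one target-disease pair. The `cursor` field in the response is
# assumed to be the token to send back to get the next page.
QUERY = """
query evidencePage($ensemblId: String!, $efoId: String!, $size: Int!, $cursor: String) {
  disease(efoId: $efoId) {
    evidences(
      ensemblIds: [$ensemblId]
      datasourceIds: ["eva", "orphanet"]
      size: $size
      cursor: $cursor
    ) {
      count
      cursor
      rows {
        datasourceId
        variantId
        variantRsId
        directionOnTrait
        variantEffect
      }
    }
  }
}
"""


def post_graphql(query, variables):
    """POST a GraphQL request and return the decoded JSON body."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def fetch_all_evidence(ensembl_id, efo_id, page_size=1000, post=post_graphql):
    """Collect every evidence row by following the cursor page by page."""
    rows, cursor = [], None
    while True:
        variables = {
            "ensemblId": ensembl_id,
            "efoId": efo_id,
            "size": page_size,
            "cursor": cursor,
        }
        page = post(QUERY, variables)["data"]["disease"]["evidences"]
        rows.extend(page["rows"])
        cursor = page.get("cursor")
        # Stop when the API returns no further cursor or an empty page.
        if not cursor or not page["rows"]:
            return rows
```

Calling fetch_all_evidence("ENSG00000157764", disease_id), with disease_id taken from the associatedDiseases rows, would then return all rows for that target-disease pair. The post argument exists only so the paging logic can be tried without network access.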

Hope this helps!

Best,
Prashant

That’s great! thank you very much!

@Prashant_Uniyal
How do you limit directionOnTrait to “protect”? I can’t find where directionOnTrait is in the schema.

-Pankaj

Hi Pankaj,

directionOnTrait is not something that you can filter on using the API.

You can either filter the query responses once you have them, or use our data downloads or BigQuery datasets (more information here: Data access | Open Targets Platform Documentation).
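
Filtering the responses after the fact could look like the sketch below. It assumes the response has the same nesting as the associatedDiseasesExample query earlier in this thread and that the protective label is stored as the string "protect"; filter_by_direction is a made-up name for illustration.

```python
def filter_by_direction(response, direction="protect"):
    """Flatten the nested response and keep rows with the given directionOnTrait."""
    kept = []
    for association in response["data"]["target"]["associatedDiseases"]["rows"]:
        disease = association["disease"]
        for row in disease["evidences"]["rows"]:
            if row.get("directionOnTrait") == direction:
                # Carry the disease along so each kept row is self-describing.
                kept.append(
                    {"diseaseId": disease["id"], "diseaseName": disease["name"], **row}
                )
    return kept
```

Here response is the decoded JSON returned by the API for that query; rows without a directionOnTrait value are simply skipped.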

@hcornu Is there a list of the data downloads that have this field? -Pankaj

Hi @Pankaj_Agarwal, you can get the directionOnTrait data from the Target - Disease evidence dataset on the downloads page.

@Prashant_Uniyal That seems to be just the literature evidence, as the Target-Disease evidence link points to http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/24.09/output/etl/json/literature/evidence. I thought directionOnTrait was also available for other data types.

Also, this path ftp://ftp.ebi.ac.uk/pub/databases/opentargets/platform/24.09/output/etl/json/literature/evidence no longer exists. Under /pub/databases/opentargets/platform/latest/output/etl/json/literature there is no “evidence” folder.

It would be useful to make DirectionOnTrait data available in a more intuitive way as I think it can be quite useful.

-Pankaj

Hi @Pankaj_Agarwal , the correct URL in the downloads page to get the DirectionOnTrait data is this.

Thank you for your feedback. We will look into creating a separate dataset for the DirectionOnTrait data but it might be difficult as currently it is part of multiple evidence sources.

Thanks @Prashant_Uniyal , do you at least have a list of all the evidence sources that include DirectionOnTrait?

Hi @Pankaj_Agarwal , the direction of effect analysis was done for eight sources of target-disease association evidence. These are:

  • Open Targets Genetics
  • Gene Burden
  • ClinVar
  • ClinVar somatic
  • Gene2Phenotype
  • Orphanet
  • IMPC
  • ChEMBL

The assessment is tailored to each dataset; this is detailed in our target-disease evidence documentation.

@Prashant_Uniyal I looked through the data in the directory and there seems to be an issue. Here are, for each evidence source, the number of entries with a directionOnTrait value and the number of those flagged as “protect”.

The ot_genetics_portal and gene_burden numbers look good. ChEMBL, as you explain, is always flagged as “protect”. Otherwise, these numbers seem curious.

Are we sure there is ZERO “protect” data in cancer_gene_census, gene2phenotype, impc, intogen, and orphanet? Are there no gain-of-function mutations in any of these sources? And only 1 in eva_somatic and 30 in eva?

| Datasource (24.09/output/etl/json/evidence/) | directionOnTrait count | “protect” count |
|---|---:|---:|
| cancer_gene_census | 82,754 | - |
| chembl | 652,859 | 652,859 |
| eva | 440,837 | 30 |
| eva_somatic | 10,244 | 1 |
| gene2phenotype | 3,644 | - |
| gene_burden | 36,414 | 14,868 |
| impc | 1,156,920 | - |
| intogen | 4,359 | - |
| orphanet | 6,271 | - |
| ot_genetics_portal | 595,415 | 287,652 |

For ChEMBL as well, one could argue that the direction should reflect the relationship between the target and the disease, so agonists and antagonists should be flagged differently.
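
Counts like the ones in the table above can be tallied with a short script once the evidence directory has been downloaded. This is a sketch under assumptions: the files are taken to be JSON lines (one evidence record per line) carrying datasourceId and directionOnTrait fields, and count_directions is a hypothetical helper name.

```python
import json
from collections import Counter
from pathlib import Path


def count_directions(evidence_dir):
    """Return {datasourceId: Counter of directionOnTrait values}."""
    counts = {}
    # Walk every .json file below the directory, whatever the subfolder layout.
    for path in sorted(Path(evidence_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                if not line.strip():
                    continue  # skip blank lines
                record = json.loads(line)
                direction = record.get("directionOnTrait")
                if direction is None:
                    continue  # record has no direction-of-effect annotation
                counts.setdefault(record["datasourceId"], Counter())[direction] += 1
    return counts
```

For each datasource, sum(counter.values()) gives the number of entries with a directionOnTrait value and counter["protect"] the number flagged as protective.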

Hi @Pankaj_Agarwal,

Thanks for your questions and ideas about the direction of effect (DoE).

I have looked into the data:

  1. All evidence from ChEMBL is flagged as protective because we assume that whenever a drug is used, the aim is to protect from the disease. We do flag the mechanism of action of the drug (agonists as GoF, antagonists as LoF), but this is in the “variantEffect” column. I may need to discuss with the team whether to rename the “variantEffect” column to something more generic, like directionOnTarget, to avoid confusion.

  2. For the datasources cancer_gene_census, gene2phenotype, impc, intogen, and orphanet, we assume that all associations confer risk. For EVA germline and somatic, we use the “clinical significance” annotation to inform protection/risk, and I confirm that there are 30 and 1 pieces of evidence giving protection, respectively.

  3. Could you please check the “variantEffect” column and tell us whether you see anything strange in the gain-of-function or loss-of-function numbers?

In case it is helpful, you can find more information in the documentation, where we explain how DoE is evaluated for each datasource: Target - disease evidence | Open Targets Platform Documentation
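
One way to run the check asked for in point 3 is to cross-tabulate variantEffect against directionOnTrait per datasource. The sketch below assumes evidence records are available as Python dicts (for example parsed from the downloaded JSON files) with datasourceId, variantEffect, and directionOnTrait keys; cross_tabulate is a hypothetical helper, and the value strings "LoF", "GoF", "protect", and "risk" are assumptions about how the labels are stored.

```python
from collections import Counter


def cross_tabulate(records):
    """Count records per (datasourceId, variantEffect, directionOnTrait) combination."""
    table = Counter()
    for record in records:
        key = (
            record.get("datasourceId"),
            record.get("variantEffect"),      # None when the field is absent
            record.get("directionOnTrait"),   # None when the field is absent
        )
        table[key] += 1
    return table
```

Printing the entries sorted by key, for instance with `for key, n in sorted(table.items(), key=str): print(key, n)`, makes it easy to spot a datasource where every record has the same variantEffect or where the field is missing entirely.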

Thanks for your feedback!! :slight_smile:

Best,

Juan.
