Accessing L2G scores through API

marynias · 10 June 2021 15:01

Hello, I am struggling to formulate a GraphQL query that would let me fetch L2G scores for a given SNP-study association, such as those displayed under "Gene prioritisation using locus-to-gene pipeline" in Open Targets Genetics. If those are not available, I would at least like the highest-scoring gene for a given SNP-study association. I can only find the ManhattanAssociation type, which should somehow be able to produce bestLocus2Genes, but I am unable to formulate a successful query for it.

ahercules · 11 June 2021 11:28

Hello @marynias!

Welcome, and thank you for joining the Open Targets Community!

How do I access Locus2Gene data for a specific study and locus?

To access the Locus2Gene data for a specific study and locus (represented by the lead variant ID) using the Genetics Portal GraphQL API, you will need to use the studyLocus2GeneTable endpoint and pass the studyId and variantId as parameters.

For example, to access the Locus2Gene data for Crohn’s disease (GCST003044) and the locus around 1_67215986_T_G, you would use the following query:

query getL2GScoresForAStudyVariantPair {
  studyLocus2GeneTable(studyId:"GCST003044", variantId:"1_67215986_T_G") {
    rows {
      gene {
        symbol
        id
      }
      yProbaModel
      yProbaDistance
      yProbaInteraction
      yProbaMolecularQTL
      yProbaPathogenicity
      hasColoc
      distanceToLocus
    }
  }
}

Link to run GraphQL API query
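
If you would rather run the same query from a script, here is a minimal Python sketch using only the standard library. The endpoint URL and the `String!` variable types are assumptions on my part (please check them against the API documentation and schema), and `build_payload`, `fetch_l2g_rows` and `top_gene` are hypothetical helper names, not part of any Open Targets client.

```python
import json
import urllib.request

# Assumption: public Genetics Portal GraphQL endpoint; it may have changed.
API_URL = "https://api.genetics.opentargets.org/graphql"

# Same fields as the query above, with the IDs passed as GraphQL variables.
# Assumption: both arguments are declared as String! in the schema.
L2G_QUERY = """
query getL2GScoresForAStudyVariantPair($studyId: String!, $variantId: String!) {
  studyLocus2GeneTable(studyId: $studyId, variantId: $variantId) {
    rows {
      gene {
        symbol
        id
      }
      yProbaModel
      yProbaDistance
      yProbaInteraction
      yProbaMolecularQTL
      yProbaPathogenicity
      hasColoc
      distanceToLocus
    }
  }
}
"""


def build_payload(study_id, variant_id):
    """Build the JSON body for a GraphQL POST request."""
    return {
        "query": L2G_QUERY,
        "variables": {"studyId": study_id, "variantId": variant_id},
    }


def fetch_l2g_rows(study_id, variant_id, url=API_URL):
    """POST the query and return the list of L2G rows (needs network access)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(study_id, variant_id)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    if body.get("errors"):
        raise RuntimeError(body["errors"])
    return body["data"]["studyLocus2GeneTable"]["rows"]


def top_gene(rows):
    """Return the row with the highest overall L2G score (yProbaModel)."""
    return max(rows, key=lambda row: row["yProbaModel"])
```

Calling `top_gene(fetch_l2g_rows("GCST003044", "1_67215986_T_G"))` should then give the highest-scoring gene for that study-locus pair, which covers the fallback asked about in the original question.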

Within the API response, the following fields map to the columns in the Locus2Gene data table seen in the screenshot below:

Column label	API response field
Gene	`rows.gene.symbol`
Overall L2G score	`rows.yProbaModel`
Variant Pathogenicity	`rows.yProbaPathogenicity`
Distance	`rows.yProbaDistance`
QTL Coloc	`rows.yProbaMolecularQTL`
Chromatin interaction	`rows.yProbaInteraction`
Distance to locus (bp)	`rows.distanceToLocus`
Evidence of colocalisation	`rows.hasColoc`
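
To make the mapping concrete, here is a small Python sketch that relabels one API row with the column names from the table above. The helper name `row_to_table_record` is hypothetical, and the sketch assumes each row is a plain dict shaped like the query in this post.

```python
# Column label -> path of keys inside one element of `rows`.
COLUMN_TO_FIELD = {
    "Gene": ("gene", "symbol"),
    "Overall L2G score": ("yProbaModel",),
    "Variant Pathogenicity": ("yProbaPathogenicity",),
    "Distance": ("yProbaDistance",),
    "QTL Coloc": ("yProbaMolecularQTL",),
    "Chromatin interaction": ("yProbaInteraction",),
    "Distance to locus (bp)": ("distanceToLocus",),
    "Evidence of colocalisation": ("hasColoc",),
}


def row_to_table_record(row):
    """Convert one API row into a dict keyed by the portal's column labels."""
    record = {}
    for label, path in COLUMN_TO_FIELD.items():
        value = row
        for key in path:  # walk nested keys, e.g. row["gene"]["symbol"]
            value = value[key]
        record[label] = value
    return record
```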

Are there other ways to access Locus2Gene data for more systematic queries?

The GraphQL API approach is a good way to find Locus2Gene data for a single study-locus pair.

However, for more systematic analyses involving multiple studies or loci, I would recommend that you use our BigQuery instance, open-targets-genetics.

For example, the same GraphQL API query for Crohn’s disease (GCST003044) and the locus around 1_67215986_T_G in BigQuery would be:

SELECT 
    study_id,
    gene_id,
    y_proba_full_model,
    y_proba_logi_distance,
    y_proba_logi_interaction,
    y_proba_logi_molecularQTL,
    y_proba_logi_pathogenicity
FROM `open-targets-genetics.200201.locus2gene` 
WHERE 
    study_id='GCST003044' 
    AND pos=67215986 
    AND chrom='1'
    AND ref='T'
    AND alt='G' 
ORDER BY y_proba_full_model desc 
# LIMIT 100

Link to run the BigQuery script
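
For scripted access, a rough Python sketch of the same BigQuery lookup follows. It assumes the `google-cloud-bigquery` package is installed and that Google Cloud credentials and a default project are configured. `parse_variant_id` and `run_l2g_query` are hypothetical helper names. The table and column names are copied from the SQL above.

```python
# Parameterised version of the SQL above; @name placeholders are filled
# in through BigQuery query parameters rather than string formatting.
L2G_SQL = """
SELECT
    study_id,
    gene_id,
    y_proba_full_model,
    y_proba_logi_distance,
    y_proba_logi_interaction,
    y_proba_logi_molecularQTL,
    y_proba_logi_pathogenicity
FROM `open-targets-genetics.200201.locus2gene`
WHERE
    study_id = @study_id
    AND chrom = @chrom
    AND pos = @pos
    AND ref = @ref
    AND alt = @alt
ORDER BY y_proba_full_model DESC
"""


def parse_variant_id(variant_id):
    """Split an ID such as '1_67215986_T_G' into (chrom, pos, ref, alt)."""
    parts = variant_id.split("_")
    if len(parts) != 4:
        raise ValueError(f"Unexpected variant ID format: {variant_id!r}")
    chrom, pos, ref, alt = parts
    return chrom, int(pos), ref, alt


def run_l2g_query(study_id, variant_id):
    """Run the query and return the result rows as dicts (needs credentials)."""
    # Imported here so the helpers above work without the package installed.
    from google.cloud import bigquery

    chrom, pos, ref, alt = parse_variant_id(variant_id)
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("study_id", "STRING", study_id),
            bigquery.ScalarQueryParameter("chrom", "STRING", chrom),
            bigquery.ScalarQueryParameter("pos", "INT64", pos),
            bigquery.ScalarQueryParameter("ref", "STRING", ref),
            bigquery.ScalarQueryParameter("alt", "STRING", alt),
        ]
    )
    client = bigquery.Client()  # uses your default project and credentials
    return [dict(row) for row in client.query(L2G_SQL, job_config=job_config).result()]
```

Looping `run_l2g_query` over a list of study-variant pairs is the simplest way to scale this up. For large batches, a single query that joins against your own table of pairs will likely be cheaper.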

Results of the query can be exported in CSV or JSON format, or imported into your own BigQuery instance or Google Sheets.

Alternatively, you can also download all of the Locus2Gene data in Parquet format using our FTP service.

I hope this helps answer your question! Please feel free to comment below if you have further questions about accessing the data!

marynias · 11 June 2021 14:13

Thank you @ahercules for the clear and prompt recipe! I would have never figured that one out on my own :).

frahimov · 26 May 2022 18:27

Hello,

I would like to do the exact same thing, but access the “Overall V2G” scores instead. What table do you suggest? To be more specific, I want to access the columns in this table through the API.

Thank you!

Xiangyu · 17 June 2022 13:14

Hi!

For the overall V2G score for a given variant (1_67215986_T_G in this case), you can use the following query in the GraphQL API playground:

query getV2GScoresForAStudyVariantPair {
  genesForVariant(variantId: "1_67215986_T_G") {
    gene {
      symbol
      id
    }
    overallScore
  }
}
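
Once you have the JSON response, ranking the genes by score takes only a few lines of Python. This is a sketch that assumes the usual GraphQL envelope (`{"data": {"genesForVariant": [...]}}`). The function name `rank_genes_by_v2g` is hypothetical.

```python
def rank_genes_by_v2g(response, top_n=5):
    """Return (gene symbol, overallScore) pairs, highest score first."""
    rows = response["data"]["genesForVariant"]
    ranked = sorted(rows, key=lambda row: row["overallScore"], reverse=True)
    return [(row["gene"]["symbol"], row["overallScore"]) for row in ranked[:top_n]]
```

Pass it the parsed JSON body of the response, for example the output of `json.loads` on the text returned by the playground or by your HTTP client.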

Thanks,
Xiangyu
