API request for associations with gene

simbaum · 21 October 2022 07:29

Dear team and community,
I would like to retrieve the genetic associations and scores from Open Targets via the API.
Some time ago I had this script:

import requests
import json
import pandas as pd


def associated_studies_l2g_query(gene_id, verbose = False):
    """ example associated_studies_l2g_query(ENSG00000102908)
    """
    #construct query string and declare variables that will be sent in query
    api_query = """
    query GenePageQuery($geneId: String!) {
      geneInfo(geneId: $geneId) {
        id
        symbol
      }
      studiesAndLeadVariantsForGeneByL2G(geneId: $geneId) {
        pval
        yProbaModel
        study {
          studyId
          traitReported
          pubAuthor
          pubDate
          pmid
          nInitial
          nReplication
          hasSumsstats
        }
        variant {
          rsId
          id
        }
        odds{
          oddsCI
          oddsCILower
          oddsCIUpper
        }
        beta{
          betaCI
          betaCILower
          betaCIUpper
          direction
        }
      }
    }
    """

    #set base_url for Open Targets Genetics Portal API
    base_url = "https://api.genetics.opentargets.org/graphql [api.genetics.opentargets.org]" # "http://genetics-api.opentargets.io/graphql"

    #set variables object
    variables = {"geneId": gene_id}

    #perform API call using query string and variables object
    r = requests.post(base_url, json={"query": api_query, "variables": variables})

    #check status code of GraphQL API response and print error message if code == 400
    if str(r.status_code) == "400":
        print(f"{gene_id} query status code: {r.status_code}")
    else:
        pass

    #transform API response into JSON
    api_response_as_json = json.loads(r.text)

    #print first element of JSON response data
    if verbose:
        print(api_response_as_json["data"]["studiesAndLeadVariantsForGeneByL2G"][0])
    #return entire JSON response data
    #return api_response_as_json
    
    # return pandas df with data
    for i in range(0,len(api_response_as_json["data"]["studiesAndLeadVariantsForGeneByL2G"])):
        df = pd.json_normalize(api_response_as_json["data"]["studiesAndLeadVariantsForGeneByL2G"][i])
        if i == 0 :
            res = df
        else:
            res = res.append(df)
    return res

But it doesn't work anymore. It gives this error:

---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
/tmp/ipykernel_162977/3388234050.py in <module>
----> 1 associated_studies_l2g_query("ENSG00000102908")

/tmp/ipykernel_162977/2206194942.py in associated_studies_l2g_query(gene_id, verbose)
     62 
     63     #transform API response into JSON
---> 64     api_response_as_json = json.loads(r.text)
     65 
     66     #print first element of JSON response data

/opt/local/python/conda/lib/python3.9/json/__init__.py in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    344             parse_int is None and parse_float is None and
    345             parse_constant is None and object_pairs_hook is None and not kw):
--> 346         return _default_decoder.decode(s)
    347     if cls is None:
    348         cls = JSONDecoder

/opt/local/python/conda/lib/python3.9/json/decoder.py in decode(self, s, _w)
    335 
    336         """
--> 337         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338         end = _w(s, end).end()
    339         if end != len(s):

/opt/local/python/conda/lib/python3.9/json/decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Any ideas how I could get the gene-trait associations with the variance to locus score?

The aim is to request, for a given gene, which genetic trait associations exist and how confident one can be that the measured associations are true.

Very much looking forward to your feedback, and best wishes,
Simon

dsuveges · 21 October 2022 10:30

Hi, Welcome to the OpenTargets Community Portal!

There are two small issues with your script:

The base URL should be: base_url = "https://api.genetics.opentargets.org/graphql"
The hasSumsstats field has a typo; it should be hasSumstats.

This link shows the correct query.
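With both fixes applied, the script from the first post could look roughly like the sketch below. It keeps the original query, assumes the endpoint above is still being served, and builds the table with a single pd.json_normalize call, since DataFrame.append was removed in pandas 2.0. The helper name rows_to_frame and the timeout value are illustrative choices, not part of the original script.

```python
import pandas as pd
import requests

# Endpoint as given in the reply above, without the trailing bracketed text.
BASE_URL = "https://api.genetics.opentargets.org/graphql"

# Same query as in the first post, with hasSumsstats corrected to hasSumstats.
API_QUERY = """
query GenePageQuery($geneId: String!) {
  geneInfo(geneId: $geneId) {
    id
    symbol
  }
  studiesAndLeadVariantsForGeneByL2G(geneId: $geneId) {
    pval
    yProbaModel
    study {
      studyId
      traitReported
      pubAuthor
      pubDate
      pmid
      nInitial
      nReplication
      hasSumstats
    }
    variant {
      rsId
      id
    }
    odds {
      oddsCI
      oddsCILower
      oddsCIUpper
    }
    beta {
      betaCI
      betaCILower
      betaCIUpper
      direction
    }
  }
}
"""


def rows_to_frame(payload):
    """Flatten the association list of a decoded GraphQL response into a DataFrame."""
    # GraphQL servers report query problems (such as an unknown field) under "errors".
    if payload.get("errors"):
        raise ValueError(f"GraphQL errors: {payload['errors']}")
    rows = (payload.get("data") or {}).get("studiesAndLeadVariantsForGeneByL2G") or []
    # Nested objects become dotted columns, e.g. "study.traitReported".
    return pd.json_normalize(rows)


def associated_studies_l2g_query(gene_id, timeout=30):
    """Example: associated_studies_l2g_query("ENSG00000102908")"""
    r = requests.post(
        BASE_URL,
        json={"query": API_QUERY, "variables": {"geneId": gene_id}},
        timeout=timeout,
    )
    # Raise on any 4xx/5xx status instead of trying to decode a non-JSON body.
    r.raise_for_status()
    return rows_to_frame(r.json())
```

Calling raise_for_status() before decoding means a wrong URL or a rejected query surfaces as an HTTP error rather than as the JSONDecodeError shown in the traceback.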

Please don’t hesitate to contact us if you have further questions.

simbaum · 21 October 2022 12:03

Amazing!
This works, thanks a lot!

One more question. Is it possible to get the L2G score out of the query as well?

Thanks and best wishes,
Simon

dsuveges · 21 October 2022 13:52

It is not only possible, but the above query is already returning the L2G score! Its name is yProbaModel, which I understand is not the most intuitive label. However, we are in the process of a large-scale refactoring of our pipelines, and such inconsistencies will be addressed.
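As a small illustration, the sketch below reads yProbaModel as the L2G score and ranks a gene's associations by it. The function name top_l2g_hits, the 0.5 cutoff, and the example rows are all made up for the example; only the field names come from the query above.

```python
def top_l2g_hits(rows, min_score=0.5):
    """Return (trait, study_id, l2g_score) tuples with score >= min_score, best first."""
    hits = []
    for row in rows:
        # yProbaModel is the field that carries the L2G score.
        score = row.get("yProbaModel")
        if score is None or score < min_score:
            continue
        study = row.get("study") or {}
        hits.append((study.get("traitReported"), study.get("studyId"), score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Made-up rows shaped like entries of studiesAndLeadVariantsForGeneByL2G.
example_rows = [
    {"yProbaModel": 0.31, "study": {"studyId": "STUDY_A", "traitReported": "Trait A"}},
    {"yProbaModel": 0.87, "study": {"studyId": "STUDY_B", "traitReported": "Trait B"}},
    {"yProbaModel": 0.62, "study": {"studyId": "STUDY_C", "traitReported": "Trait C"}},
]

for trait, study_id, l2g_score in top_l2g_hits(example_rows):
    print(f"{trait}\t{study_id}\t{l2g_score:.2f}")
```

The rows argument is the list found under data → studiesAndLeadVariantsForGeneByL2G in the decoded response, so the function can be applied to the API output directly.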

simbaum · 26 October 2022 12:57

Great! Thanks so much for your help : )
