A user recently got in touch with the helpdesk to find out how they could obtain locus-to-gene (L2G) and colocalisation data for a specific gene in the Open Targets Genetics Portal.
Here we present two ways of obtaining the data displayed on the gene profile page, using NFAT5 as an example.
Using the GraphQL API
To access the data, you will need to send a query to our GraphQL API endpoint, built from the fields and parameters documented in our schema.
The Genetics Portal GraphQL playground is a good place to try out different queries. For a step-by-step walkthrough of the GraphQL API playground, check out @JarrodBaker’s post on the Open Targets blog: Accessing Open Targets Genetics using GraphQL.
To obtain the data in the “Associated studies: locus-to-gene pipeline” table, you will need to query the ‘studiesAndLeadVariantsForGeneByL2G’ field and sub-fields. Below is a sample query for NFAT5, which you can also try directly in the playground by pressing the play button.
query GenePageQuery {
  geneInfo(geneId: "ENSG00000102908") {
    id
    symbol
  }
  studiesAndLeadVariantsForGeneByL2G(geneId: "ENSG00000102908") {
    pval
    yProbaModel
    study {
      studyId
      traitReported
      pubAuthor
      pubDate
      pmid
      nInitial
      nReplication
      hasSumsStats
    }
    variant {
      rsId
      id
    }
    odds {
      oddsCI
      oddsCILower
      oddsCIUpper
    }
    beta {
      betaCI
      betaCILower
      betaCIUpper
      direction
    }
  }
}
Similarly, to access the data in the “Associated studies: Colocalisation analysis” table, you will need to query the ‘colocalisationsForGene’ field and sub-fields. Below is a sample query using NFAT5, which you can also try out directly in the playground.
query GenePageQuery {
  geneInfo(geneId: "ENSG00000102908") {
    id
    symbol
  }
  colocalisationsForGene(geneId: "ENSG00000102908") {
    leftVariant {
      id
      rsId
    }
    study {
      studyId
      traitReported
      pubAuthor
      pubDate
      pmid
      hasSumsStats
    }
    qtlStudyId
    phenotypeId
    tissue {
      id
      name
    }
    h3
    h4
    log2h4h3
  }
}
Accessing the same data using Python
The same queries can be run from Python. Below are sample scripts that construct the query string, pass the NFAT5 Ensembl gene ID as a GraphQL variable, execute the query using ‘requests’, and print the first element of the response data.
Sample script for querying L2G data:
# import libraries
import requests


def associated_studies_l2g_query(gene_id):
    # construct query string; the gene ID is passed in as a GraphQL variable
    api_query = """
    query GenePageQuery($geneId: String!) {
      geneInfo(geneId: $geneId) {
        id
        symbol
      }
      studiesAndLeadVariantsForGeneByL2G(geneId: $geneId) {
        pval
        yProbaModel
        study {
          studyId
          traitReported
          pubAuthor
          pubDate
          pmid
          nInitial
          nReplication
          hasSumsStats
        }
        variant {
          rsId
          id
        }
        odds {
          oddsCI
          oddsCILower
          oddsCIUpper
        }
        beta {
          betaCI
          betaCILower
          betaCIUpper
          direction
        }
      }
    }
    """

    # set base_url for Open Targets Genetics Portal API
    base_url = "http://genetics-api.opentargets.io/graphql"

    # set variables object
    variables = {"geneId": gene_id}

    # perform API call using query string and variables object
    r = requests.post(base_url, json={"query": api_query, "variables": variables})

    # check status code of GraphQL API response; print it and stop if the call failed
    if r.status_code != 200:
        print(f"{gene_id} query status code: {r.status_code}")
        return

    # transform API response into JSON
    api_response_as_json = r.json()

    # print first element of JSON response data
    print(api_response_as_json["data"]["studiesAndLeadVariantsForGeneByL2G"][0])

    # return entire JSON response data
    # return api_response_as_json


# execute function with sample gene - NFAT5 (ENSG00000102908)
associated_studies_l2g_query("ENSG00000102908")
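Once the response has been parsed, you will often want the strongest associations rather than just the first row. The sketch below assumes the response has the shape returned by the L2G query above and that `yProbaModel` holds the L2G score. The helper name `top_l2g_hits` is hypothetical, and the sample values are invented placeholders, not real portal data.

```python
def top_l2g_hits(response_json, n=5):
    """Return the n rows with the highest yProbaModel value."""
    rows = response_json["data"]["studiesAndLeadVariantsForGeneByL2G"]
    # Skip rows where the score is missing, then sort in descending order
    scored = [row for row in rows if row.get("yProbaModel") is not None]
    return sorted(scored, key=lambda row: row["yProbaModel"], reverse=True)[:n]


# Placeholder response with invented values, only to show the expected shape
example_response = {
    "data": {
        "studiesAndLeadVariantsForGeneByL2G": [
            {"yProbaModel": 0.12, "study": {"studyId": "STUDY_A"}, "variant": {"id": "1_100_A_G"}},
            {"yProbaModel": 0.87, "study": {"studyId": "STUDY_B"}, "variant": {"id": "1_200_C_T"}},
            {"yProbaModel": None, "study": {"studyId": "STUDY_C"}, "variant": {"id": "1_300_G_A"}},
        ]
    }
}

for hit in top_l2g_hits(example_response, n=2):
    print(hit["study"]["studyId"], hit["variant"]["id"], hit["yProbaModel"])
```

To use this with live data, the query function would need to return the parsed response (the commented-out `return` line) so that it can be passed to the helper.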
Sample script for querying colocalisation data:
# import libraries
import requests


def associated_studies_coloc_query(gene_id):
    # construct query string; the gene ID is passed in as a GraphQL variable
    api_query = """
    query GenePageQuery($geneId: String!) {
      geneInfo(geneId: $geneId) {
        id
        symbol
      }
      colocalisationsForGene(geneId: $geneId) {
        leftVariant {
          id
          rsId
        }
        study {
          studyId
          traitReported
          pubAuthor
          pubDate
          pmid
          hasSumsStats
        }
        qtlStudyId
        phenotypeId
        tissue {
          id
          name
        }
        h3
        h4
        log2h4h3
      }
    }
    """

    # set base_url for Open Targets Genetics Portal API
    base_url = "http://genetics-api.opentargets.io/graphql"

    # set variables object
    variables = {"geneId": gene_id}

    # perform API call using query string and variables object
    r = requests.post(base_url, json={"query": api_query, "variables": variables})

    # check status code of GraphQL API response; print it and stop if the call failed
    if r.status_code != 200:
        print(f"{gene_id} query status code: {r.status_code}")
        return

    # transform API response into JSON
    api_response_as_json = r.json()

    # print first element of JSON response data
    print(api_response_as_json["data"]["colocalisationsForGene"][0])

    # return entire JSON response data
    # return api_response_as_json


# execute function with sample gene - NFAT5 (ENSG00000102908)
associated_studies_coloc_query("ENSG00000102908")
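If you want the colocalisation rows as a table rather than nested JSON, one option is to flatten each record and write it out as CSV. This is a minimal sketch using only the standard library. The helper names `flatten` and `rows_to_csv` are hypothetical, the dotted column names are a choice made here for illustration, and the sample record contains invented placeholder values rather than real portal data.

```python
import csv
import io


def flatten(record, prefix=""):
    """Flatten nested dictionaries into a single level, e.g. study.studyId."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def rows_to_csv(rows):
    """Serialise a list of nested records as CSV text."""
    flat_rows = [flatten(row) for row in rows]
    if not flat_rows:
        return ""
    # Collect columns in first-seen order so rows with missing keys still fit
    fieldnames = list(dict.fromkeys(key for row in flat_rows for key in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(flat_rows)
    return buffer.getvalue()


# Placeholder record with invented values, shaped like one colocalisationsForGene row
example_rows = [
    {
        "leftVariant": {"id": "1_100_A_G", "rsId": "rs0000001"},
        "study": {"studyId": "STUDY_A", "traitReported": "Example trait"},
        "qtlStudyId": "QTL_STUDY_A",
        "phenotypeId": "PHENOTYPE_A",
        "tissue": {"id": "TISSUE_A", "name": "Example tissue"},
        "h3": 0.1,
        "h4": 0.8,
        "log2h4h3": 3.0,
    }
]

print(rows_to_csv(example_rows))
```

With live data, you would pass the list stored under `["data"]["colocalisationsForGene"]` in the parsed response to `rows_to_csv`.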
Sample scripts courtesy of @ahercules.