Hi Open Targets team, thank you for this resource.
I am looking at colocalisation data and can find summary statistics for the GWAS studies, but the same information doesn’t appear to be as readily available for the QTL source, if at all.
For example, if I look at PCSK9’s first colocalisation entry and click on gene prioritisation, I can see the study p-value and the effect size for GCST90038690 and variant 1_55025188_T_C. The closest information I can find for the QTL summary statistics is the gene prioritisation using colocalisation analysis table which shows a QTL beta. Are the rest of the summary statistics stored within the database and accessible? Or is there a file through FTP which contains this information?
I am using the GraphQL API and would appreciate an example of how to access this information through the API as well.
Kind regards,
Geri
Hi Geri.
We don’t provide full summary statistics, since these are available independently from the eQTL catalogue.
However, you can access QTL credible sets that we’ve computed if you know the study ID, tissue, and lead variant to query for. Here is an example GraphQL query.
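As a rough sketch of how such a query could be sent from Python: the argument names (studyId, variantId, phenotypeId, bioFeature) and the tagVariant field are the ones used later in this thread, while the endpoint URL, the `String!` variable types, and the helper names `build_payload` and `run_query` are assumptions made for illustration and should be checked against the live schema.

```python
import json
import urllib.request

# Assumed endpoint for the Open Targets Genetics GraphQL API; verify before use.
API_URL = "https://api.genetics.opentargets.org/graphql"

# Variable types (String!) are an assumption about the schema.
QTL_CREDSET_QUERY = """
query qtlCredSet($studyId: String!, $variantId: String!,
                 $phenotypeId: String!, $bioFeature: String!) {
  qtlCredibleSet(studyId: $studyId, variantId: $variantId,
                 phenotypeId: $phenotypeId, bioFeature: $bioFeature) {
    tagVariant {
      id
    }
  }
}
"""


def build_payload(study_id, variant_id, phenotype_id, bio_feature):
    """Build the JSON body for a qtlCredibleSet request (hypothetical helper)."""
    return {
        "query": QTL_CREDSET_QUERY,
        "variables": {
            "studyId": study_id,
            "variantId": variant_id,
            "phenotypeId": phenotype_id,
            "bioFeature": bio_feature,
        },
    }


def run_query(payload, url=API_URL):
    """POST the payload as JSON and return the decoded response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `run_query(build_payload("Braineac2", "1_55053079_C_T", "ENSG00000169174", "SUBSTANTIA_NIGRA"))` would send the request. Any further fields would need to be added to the selection set after checking which ones the schema exposes.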
Perhaps a better way to query the data is to download the full credible set table from the FTP. As you can see, the data are quite large. Alternatively, you can use BigQuery to select just the data you want from the table. You need a Google account, and a limited amount of data can be queried for free before charges apply. Here is a link to our BigQuery tables, where you want the variant_disease_credset table.
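For the FTP route, one way to cope with the size is to stream the downloaded files and keep only matching rows. The sketch below assumes the table has been downloaded as gzipped JSON-lines files; that file format, the function name `iter_matching_rows`, and any column names passed to it are assumptions, so check the actual files before relying on it.

```python
import gzip
import json


def iter_matching_rows(path, **criteria):
    """Yield rows from a gzipped JSON-lines file whose fields equal all criteria.

    The file is read line by line, so it never has to fit in memory.
    Column names in `criteria` are supplied by the caller.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip blank lines
            row = json.loads(line)
            if all(row.get(key) == value for key, value in criteria.items()):
                yield row
```

For example, `list(iter_matching_rows("part-00000.json.gz", study_id="HipSci"))` would collect the rows for one study, where both the file name and the `study_id` column are hypothetical.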
A final note is that if you just want QTL credible sets unrelated to the genetics portal, currently I would recommend getting these directly from eQTL catalogue, since they have done in-sample fine-mapping using SuSiE, which is a superior method to what we’ve used. Here is a link to eQTL catalogue credible set data.
Best regards,
Jeremy
The eQTL catalogue uses a totally different eQTL identification method from GTEx. I am wondering whether the Open Targets platform uses the GTEx v8 data from the GTEx project website or from the eQTL catalogue?
Thanks.
Shicheng
Hi Shicheng.
We currently get the GTEx-v8 data from eQTL catalogue. As you say, it used different methods for eQTL identification, so there will be cases where an eQTL for a gene/tissue is reported in GTEx and not in eQTL catalogue, and vice versa. On average, I saw that eQTL catalogue reported slightly lower significance for many eQTLs, which may be due to including fewer genotype PCs as covariates. I believe they will be looking further into maximising power in a future update.
Thanks Jeremy, I have been able to use the example query. I am now wondering why some queries work and others do not?
This first search produces results:
query qtl_query1 {
  qtlCredibleSet(studyId: "Braineac2", variantId: "1_55053079_C_T", phenotypeId: "ENSG00000169174", bioFeature: "SUBSTANTIA_NIGRA") {
    tagVariant {
      id
    }
  }
}
RESULTS
{
  "data": {
    "qtlCredibleSet": [
      {
        "tagVariant": {
          "id": "1_55052188_C_G"
        }
      },
However, this second example produces empty results:
query qtl_query2 {
  qtlCredibleSet(studyId: "HipSci", variantId: "1_55021673_C_G", phenotypeId: "ENSG00000169174", bioFeature: "IPSC") {
    tagVariant {
      id
    }
  }
}
RESULTS
{
  "data": {
    "qtlCredibleSet": []
  }
}
The fields should be correct as I retrieved them using the colocalisationsForGene query that provides the qtlStudyId, variantId, and tissueId:
query colocs {
  colocalisationsForGene(geneId: "ENSG00000169174") {
    leftVariant {
      id
    }
    phenotypeId
    study {
      pubAuthor
    }
    qtlStudyId
    tissue {
      id
    }
  }
}
RESULTS
"leftVariant": {
  "id": "1_55021673_C_G"
},
"phenotypeId": "ENSG00000169174",
"study": {
  "pubAuthor": "UKB Neale v2"
},
"qtlStudyId": "HipSci",
"tissue": {
  "id": "IPSC"
I also know the data is available somewhere as the gene prioritisation page for the second query shows a beta of 0.546. Any help on how to access it would be appreciated.
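To make the mapping I am using explicit, here is a minimal sketch of how each colocalisationsForGene record is turned into qtlCredibleSet arguments, and how the lookups that come back empty can be listed. The function names and the `fetch` callable (anything that takes the argument dict and returns the decoded JSON response) are hypothetical, and the mapping itself is my assumption about how the two queries relate, which may be exactly where the problem lies.

```python
def coloc_to_credset_args(coloc):
    """Map one colocalisationsForGene record to qtlCredibleSet arguments."""
    return {
        "studyId": coloc["qtlStudyId"],
        "variantId": coloc["leftVariant"]["id"],
        "phenotypeId": coloc["phenotypeId"],
        "bioFeature": coloc["tissue"]["id"],
    }


def empty_lookups(colocs, fetch):
    """Return the argument sets for which fetch(args) gives no credible set."""
    missing = []
    for coloc in colocs:
        args = coloc_to_credset_args(coloc)
        response = fetch(args)
        data = response.get("data") or {}
        if not data.get("qtlCredibleSet"):
            missing.append(args)
    return missing
```

Running `empty_lookups` over all colocalisation records for a gene would show whether the empty results are limited to particular studies or tissues.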
Cheers,
Geri
Hi Jeremy, would you mind having a look at this for me? We would appreciate this information being incorporated into our analyses.
Cheers,
Geri