BigQuery for a variant

I am familiar with SQL and just starting with Open Targets to query for variants.

Could you provide example Google Big Query SQL statement for this example variant - 1_154453788_C_T - which would ideally output all the info displayed on the website, mainly I am interested on data on “Assigned genes” section?

If you are familiar with SQL you will find easy to extract information from Google BigQuery. The “Assigned genes” information is stored in the variant_gene table:

FROM `open-targets-genetics.genetics.variant_gene` 
WHERE chr_id = "1" AND 
    position = 154453788 AND 
    ref_allele = "C" AND 
    alt_allele = "T"
1 Like

Thanks, yes, I got there in the end.

I needed to match on rsid, so used open-targets-genetics.genetics.variants to get the pos/ref/alt, then merge with variant_gene table.

Follow up question, some variants do not match, for example -ncbi rs371707439 Why is this variant is not in variants table?

Which matches only on PheWAS - opentargets - X_70920058_G_A

Doesn’t match on rsid variant:

  rs_id = 'rs371707439'

Doesn’t match on variant_gene:

  chr_id = "X"
  AND position = 70920058
  AND ref_allele = "G"
  AND alt_allele = "A"

But matches on PheWAS:

  chrom = "X"
  AND pos = 70920058
  AND ref = "G"
  AND alt = "A"

Side question, why sa_gwas has different column names chrom instead of chr_id, pos instead of position, etc?

The reason for the variant not being found in the variants table can be found in the documentation:

The PheWAS variants come directly from the GWAS studies, so they are at the moment not required to be present in the variant index.

We are working on redefining the variant index and it’s likely that this and other variants with lower MAF will be available soon in the variant index. Important changes in the variant index · Issue #2074 · opentargets/issues · GitHub

Regarding the side question. I share your pain. We’ll work on this but not currently a priority.

1 Like