Genetics data - combine variant id, target id, disease id

Hello,

I am looking for downloadable genetics data that combines variant IDs, target IDs and disease IDs.

From what I see on the downloads page, in the genetics part:

variant files: there is variant id, target id but not disease id

gwas files: gene id, disease id, but no variant id

credible set files: variant id, but not target id and disease id

Is there any other option that I didn't see and that would serve my purpose better?

Thank you in advance!

Hello! Besides GWAS studies, a number of sources report gene/disease relationships based on some underlying genetic variation. These include ClinVar, curated coding variants from UniProt, and pharmacogenetics data from PharmGKB.

To collate all this data you’ll need:

  • evidence data, which has variant/target/disease for eva, eva_somatic (evidence from ClinVar) and uniprot_variants.
  • gwas_credible_sets evidence, which links target and disease to the credible_set dataset via studyLocusId. The credible_set dataset has variantId (here you also need to consider the tagging variants inside the locus object).
  • the pharmacogenomics dataset, which contains disease (phenotypeFromSourceId), target (targetFromSourceId) and variant (variantId).
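
To make the three steps a bit more concrete, here is a rough sketch in plain Python on toy rows. It is not an official recipe: all IDs are made up, and the column names datasourceId, targetId and diseaseId, as well as the shape of the locus object (a list of entries each carrying a variantId), are assumptions you should check against the schema on the downloads page. With the real files you would load the data with your tool of choice rather than typing rows by hand.

```python
# Toy rows standing in for the real datasets. Every ID is invented, and
# datasourceId / targetId / diseaseId are ASSUMED column names.
evidence = [
    {"datasourceId": "eva", "targetId": "ENSG_A",
     "diseaseId": "EFO_1", "variantId": "1_100_A_G"},
    {"datasourceId": "uniprot_variants", "targetId": "ENSG_A",
     "diseaseId": "EFO_9", "variantId": "1_150_C_T"},
    {"datasourceId": "gwas_credible_sets", "targetId": "ENSG_B",
     "diseaseId": "EFO_2", "studyLocusId": "sl1"},
]
credible_sets = [
    # ASSUMPTION: locus is a list of tagging variants, each with a variantId.
    {"studyLocusId": "sl1", "variantId": "2_200_C_T",
     "locus": [{"variantId": "2_200_C_T"}, {"variantId": "2_250_G_A"}]},
]
pharmacogenomics = [
    {"variantId": "3_300_T_C", "targetFromSourceId": "ENSG_C",
     "phenotypeFromSourceId": "EFO_3"},
]

# Evidence sources that already carry variant, target and disease together.
DIRECT_SOURCES = {"eva", "eva_somatic", "uniprot_variants"}


def collate(evidence, credible_sets, pharmacogenomics):
    """Return sorted (variant, target, disease, source) tuples."""
    rows = set()

    # Step 1: evidence rows that hold all three IDs directly.
    for ev in evidence:
        if ev["datasourceId"] in DIRECT_SOURCES:
            rows.add((ev["variantId"], ev["targetId"],
                      ev["diseaseId"], ev["datasourceId"]))

    # Step 2: GWAS evidence, joined to credible sets on studyLocusId.
    by_locus = {cs["studyLocusId"]: cs for cs in credible_sets}
    for ev in evidence:
        if ev["datasourceId"] != "gwas_credible_sets":
            continue
        cs = by_locus.get(ev["studyLocusId"])
        if cs is None:
            continue  # no matching credible set for this evidence row
        # Lead variant plus the tagging variants inside the locus object.
        variants = {cs["variantId"]}
        variants.update(v["variantId"] for v in cs.get("locus") or [])
        for variant in variants:
            rows.add((variant, ev["targetId"],
                      ev["diseaseId"], ev["datasourceId"]))

    # Step 3: pharmacogenomics rows, skipping any with a missing ID.
    for pgx in pharmacogenomics:
        triple = (pgx.get("variantId"), pgx.get("targetFromSourceId"),
                  pgx.get("phenotypeFromSourceId"))
        if all(triple):
            rows.add(triple + ("pharmacogenomics",))

    return sorted(rows)


triples = collate(evidence, credible_sets, pharmacogenomics)
for row in triples:
    print(row)
```

Whether you keep every tagging variant from the locus object or only the lead variant of each credible set depends on your use case; the sketch keeps both.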

It’s a bit of a journey, but you can find everything you need. For more information on the datasets, column names and contents, please take a look at our downloads page.

Good luck! :slight_smile:
