BigQuery data genetics credset table identifiers

Estefania_Rojas · 27 July 2025 22:17

Hi there,

I have a question about the Google BigQuery dataset bigquery-public-data.open_targets_genetics. In the variant_disease_credset table, I am not sure how to identify each credible set, as there is no unique identifier for them. I understand that a credible set is a group of variants that is likely to contain the causal variant for the phenotype studied. How can I identify the set of variants that belong to a specific credible set?

Also, I am not sure what a tag variant refers to, or why there is no p-value for the lead variant. Should I look for it in another table? I am interested in finding the marginal and conditional p-values for the lead variant.

Does postprob_cumsum represent the combined PIP for a credible set?

Thank you very much for your help!

dsuveges · 28 July 2025 12:34

Hi,

I would recommend not using the data from bigquery-public-data.open_targets_genetics, as that resource has been superseded by the new Platform, where the genetics data is integrated with other Platform resources (and as such, the Genetics Portal has already been shut down). Please use bigquery-public-data.open_targets_platform, the most up-to-date Open Targets dataset, which is also publicly available on BigQuery.

Colocalisation data can be found in two tables reflecting the colocalisation method used: colocalisation_coloc and colocalisation_ecaviar. These tables contain study locus identifiers, so you’ll be able to look up the corresponding credible sets in the credible_set table for more detailed statistics. For more information on the contents of these tables, see the column and dataset descriptions on the Platform downloads page.
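As a rough sketch of how that lookup might be written, the Python snippet below builds a parameterised BigQuery SQL string that lists the variants of one credible set. Only the dataset and table names come from this thread. The function name build_credible_set_query is hypothetical, and every column name (studyLocusId, variantId, pValueMantissa, pValueExponent, locus, posteriorProbability) as well as the way the locus array is nested are assumptions that should be checked against the published schema before running anything.

```python
# Table name comes from the reply above; every column name below is an
# ASSUMPTION and should be checked against the published schema.
CREDIBLE_SET_TABLE = "bigquery-public-data.open_targets_platform.credible_set"


def build_credible_set_query(table: str = CREDIBLE_SET_TABLE) -> str:
    """Return SQL listing the variants of one credible set, one row per variant.

    The query expects a named parameter @study_locus_id, to be supplied by
    whichever BigQuery client runs it.
    """
    return f"""
SELECT
  cs.studyLocusId,                      -- assumed credible set identifier
  cs.variantId AS lead_variant_id,      -- assumed lead variant column
  cs.pValueMantissa,                    -- assumed lead p-value (mantissa)
  cs.pValueExponent,                    -- assumed lead p-value (exponent)
  member.variantId AS member_variant_id,
  member.posteriorProbability
FROM `{table}` AS cs,
  UNNEST(cs.locus) AS member            -- assumed array of member variants
WHERE cs.studyLocusId = @study_locus_id
ORDER BY member.posteriorProbability DESC
""".strip()


if __name__ == "__main__":
    # Print the SQL so it can be pasted into the BigQuery console.
    print(build_credible_set_query())
```

The study locus identifier is passed as a query parameter rather than formatted into the string, so the same SQL can be reused for any identifier taken from the colocalisation tables.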

Please let us know if there’s anything else we could help with!


Estefania_Rojas · 30 July 2025 21:04

Hi Daniel,

Thank you so much for your reply, it is very helpful. I will explore that data, it seems much more complete than the one I was using. I’ll let you know if I have any other questions.

Have a nice day

Best wishes,

Estefania
