Hello,
I was wondering where I can find the JSON files of the colocalisation analysis of OT Genetics.
Thank you in advance
Hi,
The colocalization datasets can be downloaded in Parquet format from here: ftp://ftp.ebi.ac.uk/pub/databases/opentargets/genetics/latest/v2d_coloc
The schema of the dataset looks like this:
root
|-- coloc_n_vars: integer (nullable = true)
|-- coloc_h0: double (nullable = true)
|-- coloc_h1: double (nullable = true)
|-- coloc_h2: double (nullable = true)
|-- coloc_h3: double (nullable = true)
|-- coloc_h4: double (nullable = true)
|-- left_type: string (nullable = true)
|-- left_study: string (nullable = true)
|-- left_chrom: string (nullable = true)
|-- left_pos: integer (nullable = true)
|-- left_ref: string (nullable = true)
|-- left_alt: string (nullable = true)
|-- right_type: string (nullable = true)
|-- right_study: string (nullable = true)
|-- right_bio_feature: string (nullable = true)
|-- right_phenotype: string (nullable = true)
|-- right_chrom: string (nullable = true)
|-- right_pos: integer (nullable = true)
|-- right_ref: string (nullable = true)
|-- right_alt: string (nullable = true)
|-- coloc_h4_h3: double (nullable = true)
|-- coloc_log2_h4_h3: double (nullable = true)
|-- is_flipped: boolean (nullable = true)
|-- right_gene_id: string (nullable = true)
|-- left_var_right_study_beta: double (nullable = true)
|-- left_var_right_study_se: double (nullable = true)
|-- left_var_right_study_pval: double (nullable = true)
|-- left_var_right_isCC: boolean (nullable = true)
As the dataset is specific to colocalization, you might need to join the table with the association- and study-level information stored under https://ftp.ebi.ac.uk/pub/databases/opentargets/genetics/latest/v2d/
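As a rough illustration of such a join, here is a minimal pandas sketch on toy in-memory tables. The `studies` table and its `study_id` and `trait_reported` columns are assumptions made for the example, not a confirmed schema of the v2d files; only `left_study`, `right_study` and `coloc_h4` come from the schema above. With real data, each table would be loaded from a local copy of the Parquet files (for example with `pd.read_parquet`).

```python
import pandas as pd

# Toy stand-in for the v2d_coloc table (column names taken from the schema above).
coloc = pd.DataFrame({
    "left_study": ["STUDY_A", "STUDY_B"],
    "right_study": ["STUDY_B", "STUDY_A"],
    "coloc_h4": [0.92, 0.92],
})

# Hypothetical study-level table; these column names are assumptions.
studies = pd.DataFrame({
    "study_id": ["STUDY_A", "STUDY_B"],
    "trait_reported": ["Trait A", "Trait B"],
})

# Rename the study columns so they line up with the left-hand side of the
# coloc table, then attach the trait to every coloc row.
left_info = studies.rename(
    columns={"study_id": "left_study", "trait_reported": "left_trait"}
)
annotated = coloc.merge(left_info, on="left_study", how="left")

print(annotated)
```

The same pattern, with the columns renamed to `right_study`, would annotate the right-hand side.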
Please let us know if there's anything else we can help with!
Best,
Daniel
Hi Daniel,
This is a great table. Would love to know what "is_flipped" means? And can I assume that "left_var_right_study_beta" is the beta for the left-hand variant, but for the "right_study", as found in the summary stats table? Or is there some flipping of alleles going on there?
Thanks,
Clare.
Huh, this is a bit of black magic. To make the colocalisation process efficient, each comparison is only done once (A vs B is done, but not B vs A). This process yields only partial data, however, containing just half of the matrix of colocalising peaks. To complete the matrix, there's a "flipping" step. The code for the process can be found here.
There is a rather obscure resource with some explanation on the columns here. It says:
If there's anything to clear up, let us know.
Hi Daniel,
Thanks very much for this, that's helpful!
Presumably as the colocalisation is a pairwise procedure, the "reflection" always yields the same colocalisation result (i.e. the coloc_h4 is the same, regardless of whether it was A vs. B or B vs. A)? But I understand then that the "left_var" could be a different variant as that depends on which is the "left_study".
Am I also correct in assuming that the variant for the "left_study" corresponds to a "lead_variant_id" in the credible set table?
Cheers
Clare.
Presumably as the colocalisation is a pairwise procedure, the "reflection" always yields the same colocalisation result (i.e. the coloc_h4 is the same, regardless of whether it was A vs. B or B vs. A)? But I understand then that the "left_var" could be a different variant as that depends on which is the "left_study".
Yes, this is all correct! The comparison is pairwise, and the flipping procedure is as simple as changing column names from left to right and vice versa. The colocalisation statistics remain unchanged. The colocalization is also a study-aware process, as study/locus pairs are compared, so when flipping, the studies are flipped as well.
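To make the renaming concrete, here is a simplified Python sketch of mirroring a single row. It is not the actual pipeline code: the helper `flip_row` is hypothetical, it only swaps the locus fields that exist on both sides in the schema above, and setting `is_flipped` to True on the mirrored row is an assumption about how that flag is used. The `coloc_*` values are copied over unchanged, as described above.

```python
# Fields that exist with both a left_ and a right_ prefix in the schema.
SHARED_FIELDS = ("type", "study", "chrom", "pos", "ref", "alt")


def flip_row(row: dict) -> dict:
    """Return a mirrored copy of a coloc row (B vs A from A vs B)."""
    flipped = dict(row)  # coloc_* statistics are carried over as they are
    for field in SHARED_FIELDS:
        flipped["left_" + field] = row["right_" + field]
        flipped["right_" + field] = row["left_" + field]
    # Assumption: is_flipped marks rows produced by mirroring.
    flipped["is_flipped"] = True
    return flipped


# Toy row with made-up values.
row = {
    "left_type": "gwas", "left_study": "STUDY_A", "left_chrom": "1",
    "left_pos": 1000, "left_ref": "A", "left_alt": "G",
    "right_type": "eqtl", "right_study": "STUDY_B", "right_chrom": "1",
    "right_pos": 2000, "right_ref": "C", "right_alt": "T",
    "coloc_h4": 0.92, "is_flipped": False,
}

mirrored = flip_row(row)
print(mirrored["left_study"], mirrored["right_study"], mirrored["coloc_h4"])
```

Columns that are specific to one side (for example right_bio_feature or the left_var_right_study_* statistics) are deliberately left untouched in this sketch.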
Am I also correct in assuming that the variant for the "left_study" corresponds to a "lead_variant_id" in the credible set table?
Yes, that's also correct. I hope it all makes sense.
Great, thank you. Yes, that all makes sense (and it is a fantastic resource!).
Clare.