Fine-mapping using the GCTA-COJO software

deniseo · 20 June 2023 17:38

Hi, and thank you for developing this platform. I would like to run the COJO pipeline on my own data, as Open Targets does here. I am aware of the documentation page and the GitHub repository, but it is still unclear to me exactly how the pipeline was applied. As I understand from these sources, --cojo-slct is run on the summary statistics for the region surrounding each top SNP. The resulting jma.cojo output is then used to rerun GCTA with --cojo-cond, conditioning on the variants identified by --cojo-slct, and the credible sets are ultimately constructed from those results.

I have three questions:

First, is the conditioning performed even at loci where --cojo-slct identifies a single independent signal, in order to check whether that credible set or “signal” contains multiple causal variants?
Second, if the answer to the above is no, how do we construct credible sets at loci where --cojo-slct identifies a single independent SNP, given that in fine-mapping a signal may consist of a single causal variant (CCV) or more than one?
Third, when conditioning with --cojo-cond, how do we construct separate credible sets within a locus? E.g. if a locus has 10 independent signals from --cojo-slct, how do we construct 10 credible sets if we only run --cojo-cond once, conditioning the summary statistics on all 10 SNPs?

I really appreciate any input. Thank you.

Xiangyu · 22 June 2023 13:20

Welcome to the OTG community and thank you for your question! In response to your questions:

1. No, the conditioning step is only performed when there are multiple independent signals within a window (2 Mb in our implementation).
2. If there is a single independent, significant SNP at a locus, and it is not in LD with any other SNPs in the region, then the credible set consists of just that one SNP, with a posterior probability of 1.
3. Separate credible sets are computed for each set of conditionally independent summary statistics. In your example, there would be 10 credible sets, each computed after conditioning on the other 9 independent signals.
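
To make the "condition on the other 9" idea concrete, here is a minimal Python sketch that prepares one leave-one-out --cojo-cond run per independent SNP. It is an illustration rather than the actual Open Targets code: the function name, the file names and the `gcta64` binary name are assumptions, and the commands are only built, not executed. The GCTA flags used (--bfile, --cojo-file, --cojo-cond, --out) are the standard COJO ones.

```python
import tempfile
from pathlib import Path


def leave_one_out_cond_jobs(independent_snps, bfile, cojo_file, workdir, gcta="gcta64"):
    """Build one GCTA --cojo-cond command per independent SNP.

    For each SNP, the conditioning list contains all the *other* independent
    SNPs, so the conditioned summary statistics isolate that SNP's signal.
    Commands are returned as argument lists and are NOT executed here.
    """
    # With a single independent signal there is nothing to condition on.
    if len(independent_snps) < 2:
        return {}

    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)

    jobs = {}
    for snp in independent_snps:
        others = [s for s in independent_snps if s != snp]
        # --cojo-cond expects a file with one SNP ID per line.
        cond_path = workdir / f"{snp}.cond.snplist"
        cond_path.write_text("\n".join(others) + "\n")
        jobs[snp] = [
            gcta,
            "--bfile", str(bfile),          # LD reference panel (hypothetical path)
            "--cojo-file", str(cojo_file),  # GWAS summary statistics (hypothetical path)
            "--cojo-cond", str(cond_path),
            "--out", str(workdir / f"{snp}.cond"),
        ]
    return jobs


# Toy example with three independent SNPs (IDs are made up).
demo_jobs = leave_one_out_cond_jobs(
    ["rs1", "rs2", "rs3"], "ref_panel", "sumstats.ma", tempfile.mkdtemp()
)
for snp, cmd in demo_jobs.items():
    print(snp, " ".join(cmd))
```

Each run would then give its own set of conditioned summary statistics (the .cma.cojo output), and one credible set is built from each of them. Each command could be launched with, for example, `subprocess.run(cmd, check=True)`.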

Best wishes,
Xiangyu

deniseo · 22 June 2023 13:43

Thank you for your reply, I really appreciate it.

As I understand it, --cojo-slct is run first and the resulting independent variant IDs are then passed to --cojo-cond. With regard to your last answer, if I condition each signal on the remaining signals in a locus, isn’t it possible that this would result in overlaps between credible sets?

Something else is still unclear to me. For loci with multiple independent signals, I understand that the ABF can be used to construct each of the credible sets. However, where --cojo-slct identifies only a single independent variant at a locus, how do I proceed with constructing the credible set for that signal? You mentioned LD, but how do I obtain this information? Could you elaborate on the steps that follow the identification of a single independent variant by --cojo-slct?

Thank you, and apologies for the confusion. It seems I am missing some of the steps of the pipeline.
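
For reference, the ABF step discussed in this thread can be sketched as follows. This is a generic Wakefield approximate Bayes factor calculation in Python, not a confirmed copy of the Open Targets implementation. The prior standard deviation of 0.15, the 95% coverage and the idea of feeding in the unconditioned betas and standard errors for a single-signal locus (and the conditioned ones otherwise) are assumptions made for illustration. The function names and toy numbers are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Wakefield log approximate Bayes factor in favour of association.

    prior_sd is the assumed prior standard deviation of the effect size.
    """
    v = se ** 2            # variance of the effect estimate
    w = prior_sd ** 2      # prior variance of the true effect
    r = w / (v + w)
    z = beta / se
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def credible_set(variants, coverage=0.95, prior_sd=0.15):
    """variants: list of (snp_id, beta, se) for one signal in one region.

    Returns [(snp_id, posterior_probability), ...], highest first, truncated
    once the cumulative posterior probability reaches `coverage`.
    Assumes exactly one causal variant for the signal.
    """
    labfs = [log_abf(b, s, prior_sd) for _, b, s in variants]
    top = max(labfs)
    # Subtract the maximum before exponentiating to avoid overflow.
    weights = [math.exp(l - top) for l in labfs]
    total = sum(weights)
    ranked = sorted(
        ((snp, w / total) for (snp, _, _), w in zip(variants, weights)),
        key=lambda item: item[1],
        reverse=True,
    )
    selected, cumulative = [], 0.0
    for snp, pp in ranked:
        selected.append((snp, pp))
        cumulative += pp
        if cumulative >= coverage:
            break
    return selected


# Toy data: one strong lead SNP and two weak neighbours (made-up numbers).
lone_signal = credible_set([("rs1", 0.30, 0.03), ("rs2", 0.10, 0.03), ("rs3", 0.02, 0.03)])
# Toy data: two SNPs with identical statistics, as with perfect LD.
tied_signal = credible_set([("rsA", 0.30, 0.03), ("rsB", 0.30, 0.03)])
print(lone_signal)
print(tied_signal)
```

In this sketch no separate LD lookup is needed: variants in LD with the lead SNP have similar summary statistics, so they share the posterior probability and enter the credible set, while a lead SNP with no such neighbours takes nearly all of it.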
