Dear OTG Team and community,
I was looking at the reported variants from Sliz et al 2021 (GCST90027161) associated with atopic dermatitis. But, the reported results make no sense to me…
The study, as well as the GWAS Catalog, reports 30 variants, whereas OTG reports 26. However, OTG reports variants that are not found in the study, and p-values that are not shown in the study. Some variants that seem to match between the GWAS Catalog and OTG have different odds ratios, and the position of some variants differs by one base, which means that the SNP ID reported is not the same between the GWAS Catalog and OTG, etc.
The number of errors and mismatches between the GWAS Catalog and OTG for this study is so large that I cannot even list them all. So can I rely on the information reported by OTG?
Is there an explanation for this? Could someone explain why what I see reported for this study does not match the information from the GWAS Catalog (which is identical to what is reported in the original article)? Is there any kind of processing of results that could explain all these discrepancies?
If the information from this study is completely wrong, how can I rely on anything else reported by OTG?
I would much appreciate an answer to this. I have been using OTG for a while and I really like it, but after this my confidence in the platform and its utility has plummeted.
Looking forward to an answer,
Welcome to the OpenTargets community! We are glad to hear that you have been enjoying your usage of the OTG resource, and thank you for bringing your confusion to our attention!
In an effort to interpret GWAS studies in the context of drug target identification, the Genetics Portal aims to provide an interpretation of the GWAS signals reported in different sources, including the GWAS Catalogue. As part of this process, several quality control and harmonisation steps are required to ensure information is captured in comparable ways.
Regarding your concerns about the discrepancy between OTG and the GWAS Catalogue, let's take rs17371133 from GCST90027161 as an example. As stated in our documentation, we harmonise all effects with respect to the alternative allele. Because the effect allele in this case is the alt allele, the beta (-0.0651) would be flipped. When the odds ratio (OR) is then computed using a positive beta (0.0651), you arrive at OR = 1.067. Feel free to verify the reverse yourself!
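For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the usual relation OR = exp(beta) for a log-odds effect size; the function name is only illustrative and is not taken from the OTG pipeline.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a log-odds effect size (beta) into an odds ratio."""
    return math.exp(beta)


beta_reported = -0.0651        # beta quoted above for rs17371133
beta_flipped = -beta_reported  # same effect, expressed for the other allele

print(round(odds_ratio(beta_flipped), 3))   # OR from the positive beta
print(round(odds_ratio(beta_reported), 3))  # OR from the negative beta
```

The two odds ratios are reciprocals of each other, so an OR above 1 in one resource and below 1 in the other can describe the same effect measured against opposite alleles.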
The differences in p-values are a result of different rounding practices: the GWAS Catalogue rounds to 1 significant figure, while OTG rounds to 2 s.f.
Regarding the missing loci, it is possible that they were dropped because they do not exist in the gnomAD variant index. In addition, they may not be independent from the other loci reported in the study.
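As a small illustration of how rounding alone changes the displayed value, the sketch below formats one p-value at both precisions. The p-value is a made-up example rather than a number from the study, and the helper name is hypothetical.

```python
def to_sig_figs(p: float, sig_figs: int) -> str:
    """Format p in scientific notation with the given number of significant figures."""
    return f"{p:.{sig_figs - 1}e}"


p_example = 3.44e-8  # hypothetical p-value, for illustration only

print(to_sig_figs(p_example, 1))  # 1 significant figure
print(to_sig_figs(p_example, 2))  # 2 significant figures
```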
There could also be some inconsistencies due to the different criteria Ensembl and gnomAD use to represent indel locations (e.g. 0-1 numbering). We are actively looking into this issue.
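To show how a one-base offset can arise for the same indel, here is a rough sketch. It assumes one representation is VCF-style, where ref and alt share a leading anchor base and the position points at that anchor, while the other reports only the changed bases. Which convention each resource follows for a given variant is not asserted here; the function and the example variant are hypothetical.

```python
def trim_anchor(pos: int, ref: str, alt: str) -> tuple[int, str, str]:
    """Drop the shared leading base of an anchored indel and shift the position by one.

    SNPs and indels without a shared leading base are returned unchanged.
    An empty allele after trimming is written as "-".
    """
    is_indel = len(ref) != len(alt)
    if is_indel and ref and alt and ref[0] == alt[0]:
        return pos + 1, ref[1:] or "-", alt[1:] or "-"
    return pos, ref, alt


# Hypothetical deletion of a single T, anchored on the preceding A.
print(trim_anchor(100, "AT", "A"))  # the position moves from 100 to 101
print(trim_anchor(100, "A", "G"))   # a SNP keeps its position
```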
We would be really interested to know whether this cleared up some of your confusion. We are trying our best to document every step of the process, but there will be steps that are not transparent to every user. We will happily resolve these with the invaluable feedback of the user community.
I have noticed at least one other study with a similar issue (missing loci).
The study is listed in Open Targets Genetics, and I compared it with its entry in the GWAS Catalog.
Of the 12 loci under p<5e-8 in the GWAS Catalog, only 8 are reported in Open Targets.
The missing loci (rs80138802-C, rs2279343-G, rs76015112-G, rs1778155-T) should probably be independent from the other statistically significant loci, given that all but one are on a different chromosome.
Are those missing because they are not in the gnomAD variant index?
Interestingly, at least some of those missing loci are present in the L2G dataset (obtained from /pub/databases/opentargets/genetics/22.09/l2g/). I have been looking into missing loci because I noticed I could not match some L2G entries to variants in either the v2d or variant_index datasets.
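For reference, the kind of check I mean is roughly the following. This is only a sketch: it assumes each dataset has already been reduced to (chromosome, position, ref, alt) tuples, and the rows below are invented placeholders, not real entries from the l2g or variant_index downloads.

```python
Variant = tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def unmatched(entries: list[Variant], index: set[Variant]) -> list[Variant]:
    """Return the entries whose variant key is absent from the index, keeping input order."""
    return [v for v in entries if v not in index]


# Invented placeholder rows, for illustration only.
l2g_variants: list[Variant] = [
    ("1", 1000, "A", "G"),
    ("2", 2000, "C", "T"),
    ("3", 3000, "G", "A"),
]
variant_index: set[Variant] = {
    ("1", 1000, "A", "G"),
    ("3", 3000, "G", "A"),
}

print(unmatched(l2g_variants, variant_index))  # entries with no match in the index
```

Comparing on the full (chromosome, position, ref, alt) key rather than on rsID avoids false mismatches when the same variant carries different identifiers in different resources.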