Remove all category-based analysis of Genebass from OpenTargets platform

Dear Team,

I would like to propose removing all category-based analysis of Genebass from the OpenTargets platform. On closer examination, I have concerns about the reliability and accuracy of these results, as many appear to be misleading. One specific example: the positive association between IL33 pLoF and asthma seems questionable and potentially inaccurate. Findings like this raise doubts about the credibility of the analysis and suggest that a thorough review of the methodology and data sources is warranted.

While I understand the value of providing comprehensive information, presenting potentially unreliable or erroneous data could undermine the platform’s integrity and credibility. Therefore, I recommend temporarily removing the category-based analysis of Genebass until we can thoroughly validate the results and ensure their accuracy.

Best regards,


Thanks for the example. I think it would be good to clean this up.

The problem here seems to be that the Asthma-IL33 association is derived from the Genebass category “Blood clot, DVT, bronchitis, emphysema, asthma, rhinitis, eczema, allergy diagnosed by doctor”.

Judging by the betas and the number of cases, this result is quite unlikely to be a good representation of asthma, and it also confuses the direction-of-effect assessment.

The solution proposed by @Shicheng_Guo is to temporarily remove all category-based assessments from Genebass. If my understanding is correct, we would be removing all associations with the flag “Categorical”. For IL33, we would exclude the “Asthma” category, but not the J45 Asthma ICD10 code.

We can start by looking at the impact of the suggested solution. cc @irene @Juan_Roldan
A more complex option could be to try to identify the problematic categories, for example by flagging those that map to many EFO terms, comparing sample sizes, or excluding self-reported traits.
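
As a rough illustration of the "mapping to many EFO terms" heuristic, here is a minimal sketch. The records, the field names (`category`, `efo`) and the threshold `max_terms` are all made up for the example and are not the actual Genebass or platform schema.

```python
from collections import defaultdict

# Toy records for illustration only; field names and values are assumptions,
# not the real Genebass or Open Targets schema.
records = [
    {"category": "Blood clot, DVT, bronchitis, emphysema, asthma, rhinitis, "
                 "eczema, allergy diagnosed by doctor", "efo": "asthma"},
    {"category": "Blood clot, DVT, bronchitis, emphysema, asthma, rhinitis, "
                 "eczema, allergy diagnosed by doctor", "efo": "eczema"},
    {"category": "Blood clot, DVT, bronchitis, emphysema, asthma, rhinitis, "
                 "eczema, allergy diagnosed by doctor", "efo": "deep vein thrombosis"},
    {"category": "J45 Asthma", "efo": "asthma"},
]


def categories_with_many_efo_terms(records, max_terms=1):
    """Return categories that map to more than `max_terms` distinct EFO terms."""
    terms_by_category = defaultdict(set)
    for record in records:
        terms_by_category[record["category"]].add(record["efo"])
    return {
        category: sorted(terms)
        for category, terms in terms_by_category.items()
        if len(terms) > max_terms
    }


suspicious = categories_with_many_efo_terms(records)
for category, terms in suspicious.items():
    print(f"{category} -> {len(terms)} EFO terms: {terms}")
```

With these toy records only the composite category is flagged, since it maps to three different terms while "J45 Asthma" maps to one. The same grouping could be extended with sample-size or self-report checks if those fields are available.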

Yes. Keep the continuous data and the ICD10 data, but remove all categorical results, which include a huge number of misleading results.
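
To get a feel for the impact, something along these lines could be run over the associations. This is only a sketch: the `trait_type` field, its values and the example rows are assumptions made for illustration, not the real evidence schema.

```python
def split_by_trait_type(associations, drop_types=("categorical",)):
    """Split associations into (kept, removed) based on a trait-type flag."""
    kept = [a for a in associations if a["trait_type"] not in drop_types]
    removed = [a for a in associations if a["trait_type"] in drop_types]
    return kept, removed


# Illustrative rows only; GENE_X and the quantitative trait are placeholders.
associations = [
    {"gene": "IL33", "trait": "Asthma", "trait_type": "categorical"},
    {"gene": "IL33", "trait": "J45 Asthma", "trait_type": "icd10"},
    {"gene": "GENE_X", "trait": "Some quantitative trait", "trait_type": "continuous"},
]

kept, removed = split_by_trait_type(associations)
print(f"{len(removed)} of {len(associations)} associations would be removed")
```

In this toy case the categorical "Asthma" row is dropped while the J45 Asthma ICD10 row and the continuous trait are kept.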