Dear Open Targets Community,
I downloaded OT Genetics data preprocessed by the OT Platform to analyze it locally. However, the data includes some strange disease descriptions (diseaseFromSource):
While the EFO code means diet measurement, the description seems unlikely to have been derived from the ontology. Could anyone please explain where those descriptions came from, or how the diseases were preprocessed by OT?
The EFO mappings for GWAS studies are imported from the GWAS Catalog, where curators assign the most granular EFO terms available based on the trait description provided by the authors.
Sometimes the ontology won’t provide sufficient granularity to capture the studied phenotype in its entire complexity. In these cases, to make sure users have the complete trait description, this information is captured under
Thank you for your prompt response, @dsuveges! The data seems to come from a UK Biobank-based analysis (NEALE) instead of the GWAS Catalog.
Do the importing procedures differ?
Yes, you’re right. In that case, the curation was done in-house at Open Targets by the Genetics Team, but the process is generally the same. The label diseaseFromSource comes from the source and is as complete as possible, while diseaseFromSourceMappedId is an ontology representation, which might not be as complete.
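If it helps when working with the downloaded data, here is a minimal sketch of how the two fields relate: several detailed source labels can collapse onto one broader ontology term. The records, study IDs, labels, and mapped IDs below are all invented for illustration and are not taken from any Open Targets release; only the field names diseaseFromSource and diseaseFromSourceMappedId come from this thread.

```python
from collections import defaultdict

# Hypothetical records: every value here is made up for illustration.
# Only the two field names are taken from the discussion above.
records = [
    {"studyId": "STUDY_A", "diseaseFromSource": "Cheese intake",
     "diseaseFromSourceMappedId": "EFO_EXAMPLE_1"},
    {"studyId": "STUDY_B", "diseaseFromSource": "Bread intake",
     "diseaseFromSourceMappedId": "EFO_EXAMPLE_1"},
    {"studyId": "STUDY_C", "diseaseFromSource": "Standing height",
     "diseaseFromSourceMappedId": "EFO_EXAMPLE_2"},
]


def labels_by_mapped_id(rows):
    """Group the free-text source labels under the ontology ID they map to."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["diseaseFromSourceMappedId"]].add(row["diseaseFromSource"])
    # Sort the labels so the output is stable between runs.
    return {mapped_id: sorted(labels) for mapped_id, labels in grouped.items()}


grouped = labels_by_mapped_id(records)
for mapped_id, labels in grouped.items():
    print(mapped_id, "<-", labels)
```

Grouping this way makes it easy to spot mapped IDs that cover many distinct source descriptions, which is where the ontology term is likely to be less specific than the original trait.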
Many thanks! That’s clearer now.