Due to a problem in the ontology, a number of rare diseases are duplicated in the 22.06 release. EFO is currently undergoing a cleanup of the Orphanet terms to assign them to their corresponding MONDO ID when appropriate. As part of this process, a large number of terms have been duplicated in the ontology.
Hi team! I wanted to thank you for the v22.09 update; it’s really great to see the additions and the clean-up of these duplicate terms!
However, it seems that while the duplicate entries have been removed from the diseases dataset, the entries with retired IDs in the diseaseToPhenotype dataset have been removed entirely rather than remapped to the new term.
For example, “Familial dilated cardiomyopathy” was one of those rare diseases with duplicate entries: Orphanet_217607 and MONDO_0016333. Now only the MONDO ID is contained in the diseases data. However, diseaseToPhenotype previously had over 60 phenotypes associated with the Orphanet ID, and now there are no phenotypes associated with either the MONDO ID or the (retired) Orphanet ID.
The related rare disease of “Familial dilated cardiomyopathy with conduction defect due to LMNA mutation” only ever had a single Orphanet ID (Orphanet_300751). This still exists in both the diseases and the diseaseToPhenotype datasets.
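In case it helps with reproducing this, here is a minimal sketch of the kind of check I mean: it compares the diseaseToPhenotype rows of two releases and reports diseases whose phenotypes are missing under both the old ID and its replacement. The function names, the `remap` dictionary, and the `disease` / `phenotype` field names are my assumptions for illustration, and the phenotype IDs are placeholders rather than real data.

```python
from collections import defaultdict


def phenotypes_by_disease(rows):
    """Group phenotype IDs by disease ID.

    Assumes each row is a dict with 'disease' and 'phenotype' keys
    (hypothetical field names; adjust to the actual dataset schema).
    """
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["disease"]].add(row["phenotype"])
    return grouped


def lost_annotations(old_rows, new_rows, remap):
    """Find phenotypes present in the old release but absent in the new one.

    `remap` maps retired disease IDs to their replacement IDs. A phenotype
    counts as lost only if it appears under neither the old ID nor the
    replacement ID in the new release.
    """
    old = phenotypes_by_disease(old_rows)
    new = phenotypes_by_disease(new_rows)
    lost = {}
    for old_id, phenotypes in old.items():
        new_id = remap.get(old_id, old_id)
        surviving = new.get(old_id, set()) | new.get(new_id, set())
        missing = phenotypes - surviving
        if missing:
            lost[old_id] = (new_id, sorted(missing))
    return lost


# Toy rows with placeholder phenotype IDs (not real annotations).
old_rows = [
    {"disease": "Orphanet_217607", "phenotype": "HP_PLACEHOLDER_1"},
    {"disease": "Orphanet_217607", "phenotype": "HP_PLACEHOLDER_2"},
    {"disease": "Orphanet_300751", "phenotype": "HP_PLACEHOLDER_1"},
]
new_rows = [
    {"disease": "Orphanet_300751", "phenotype": "HP_PLACEHOLDER_1"},
]
remap = {"Orphanet_217607": "MONDO_0016333"}

lost = lost_annotations(old_rows, new_rows, remap)
for old_id, (new_id, missing) in lost.items():
    print(f"{old_id} -> {new_id}: {len(missing)} phenotype(s) missing")
```

With the real data, the rows would come from the two releases' diseaseToPhenotype files and `remap` from the list of retired Orphanet IDs and their MONDO replacements.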
Please let me know if you want any more information about this issue!