To analyze the Open Targets dataset located at ‘Open Targets Platform’, I would like to know the description of each feature in the dataset. For example, the ‘Disease/Phenotype’ dataset has features with names like ‘aspect’ and ‘bioCuration’, but I cannot infer from the names alone what exactly these features represent. If such descriptions exist, I would appreciate it if you could tell me where to find them.
Thank you in advance.
Hi,
Welcome to the Open Targets community! The fields you were asking about are located in the diseaseToPhenotype dataset, where each record looks like this:
{
  "disease": "EFO_0000538",
  "phenotype": "HP_0001631",
  "evidence": [
    {
      "aspect": "P",
      "bioCuration": "HPO:probinson[2022-07-11]",
      "diseaseFromSourceId": "OMIM:612098",
      "diseaseFromSource": "Cardiomyopathy, familial hypertrophic, 11",
      "diseaseName": "hypertrophic cardiomyopathy",
      "evidenceType": "PCS",
      "frequency": "1/9",
      "qualifierNot": false,
      "references": [
        "PMID:10966831"
      ],
      "resource": "HPO"
    }
  ]
}
As the resource field indicates, this piece of data is sourced from HPO, where you can find further explanation of the context of each feature.
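To make coded fields such as aspect and evidenceType easier to read, a record like the one above can be flattened into one row per evidence entry. The following is only a minimal sketch: the label tables reflect my reading of the HPO annotation format and should be checked against the HPO documentation, and the helper names (parse_frequency, flatten) are made up for illustration and are not part of any Open Targets tooling.

```python
import json
from fractions import Fraction

# Example record copied from the diseaseToPhenotype dataset shown above.
RECORD = """
{
  "disease": "EFO_0000538",
  "phenotype": "HP_0001631",
  "evidence": [
    {
      "aspect": "P",
      "bioCuration": "HPO:probinson[2022-07-11]",
      "diseaseFromSourceId": "OMIM:612098",
      "diseaseFromSource": "Cardiomyopathy, familial hypertrophic, 11",
      "diseaseName": "hypertrophic cardiomyopathy",
      "evidenceType": "PCS",
      "frequency": "1/9",
      "qualifierNot": false,
      "references": ["PMID:10966831"],
      "resource": "HPO"
    }
  ]
}
"""

# Assumed meanings of the single-letter aspect codes (verify against HPO docs).
ASPECT_LABELS = {
    "P": "phenotypic abnormality",
    "I": "inheritance",
    "C": "onset and clinical course",
    "M": "clinical modifier",
}

# Assumed meanings of the evidenceType codes (verify against HPO docs).
EVIDENCE_LABELS = {
    "IEA": "inferred from electronic annotation",
    "PCS": "published clinical study",
    "TAS": "traceable author statement",
}


def parse_frequency(value):
    """Turn an 'n/m' string into a Fraction; return anything else unchanged."""
    if isinstance(value, str) and "/" in value:
        num, den = (part.strip() for part in value.split("/", 1))
        if num.isdigit() and den.isdigit() and int(den) != 0:
            return Fraction(int(num), int(den))
    return value


def flatten(record):
    """Return one flat dict per evidence entry of a disease/phenotype record."""
    rows = []
    for ev in record.get("evidence", []):
        aspect = ev.get("aspect")
        ev_type = ev.get("evidenceType")
        rows.append({
            "disease": record.get("disease"),
            "phenotype": record.get("phenotype"),
            # Fall back to the raw code when no label is known.
            "aspect": ASPECT_LABELS.get(aspect, aspect),
            "evidenceType": EVIDENCE_LABELS.get(ev_type, ev_type),
            "frequency": parse_frequency(ev.get("frequency")),
            "qualifierNot": ev.get("qualifierNot"),
            "references": ev.get("references", []),
            "resource": ev.get("resource"),
        })
    return rows


rows = flatten(json.loads(RECORD))
for row in rows:
    print(row)
```

Unknown codes are passed through unchanged, so the sketch does not hide values that the label tables do not cover.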
We are looking for ways to document our datasets better and/or provide schemas to help our users interpret our data. However, considering that the number of columns is in the hundreds, this is not a straightforward problem (especially if you consider how to make sure these descriptions are always up to date). In the meantime, if you cannot find answers in our documentation, you can always ask here.
Best,
Daniel
Thank you for your informative response!
I understand that it is hard work to provide detailed descriptions of all datasets on the OTG website. If I have any more questions about the details of the datasets, I will ask here.
Best regards.