Problem
The Open Targets helpdesk recently received the following question:
I have used the older version of the Open Targets Platform (21.02), and the previous API technique, to get targets associated with disease and their overall scores.
But now after updating (to version 21.04) I can’t do it with the new GraphQL API technique. Could you help me with this? How can I get associated targets with disease and their overall scores using the new GraphQL API?
To make it easier, I need to get the first two columns of the table on, for example, targets associated with COVID, but without downloading it, just getting info from API inside Python script.
Using the GraphQL API
To get the targets associated with a specific disease, start your query with the disease ID and then use the associatedTargets field to access the relevant target data. For example, to get the first 10 targets associated with COVID (MONDO_0100096), you would run the following query:
query targetsAssociatedWithCOVID {
  disease(efoId: "MONDO_0100096") {
    id
    name
    associatedTargets(page: { index: 0, size: 10 }) {
      count
      rows {
        target {
          id
          approvedSymbol
        }
        score
      }
    }
  }
}
You can try this query in our GraphQL API playground by pressing the triangle play button.
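Since the question asks for a Python script, here is a minimal sketch of sending the same query from Python using only the standard library. The endpoint URL and the helper names (build_payload, post_graphql, extract_rows) are assumptions for illustration and are not taken from this post; please check the Platform documentation for the current API address. The variable types (String!, Int!) are likewise assumed from the query shape above.

```python
import json
import urllib.request

# Assumed public endpoint; confirm against the Platform documentation.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Same query as above, with the disease ID and page passed in as variables.
QUERY = """
query targetsAssociatedWithDisease($efoId: String!, $index: Int!, $size: Int!) {
  disease(efoId: $efoId) {
    id
    name
    associatedTargets(page: { index: $index, size: $size }) {
      count
      rows {
        target {
          id
          approvedSymbol
        }
        score
      }
    }
  }
}
"""


def build_payload(efo_id, index=0, size=10):
    """Build the JSON body for a GraphQL POST request."""
    return {
        "query": QUERY,
        "variables": {"efoId": efo_id, "index": index, "size": size},
    }


def post_graphql(payload, url=API_URL):
    """POST the payload to the API and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_rows(response_json):
    """Return (approved symbol, overall score) pairs from a response."""
    rows = response_json["data"]["disease"]["associatedTargets"]["rows"]
    return [(row["target"]["approvedSymbol"], row["score"]) for row in rows]
```

With network access, extract_rows(post_graphql(build_payload("MONDO_0100096"))) should give a list of (symbol, score) tuples corresponding to the first two columns of the associations table.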
Please note that the GraphQL API returns results in pages, so your script will need to adjust the index and size parameters to access all of the data. By default, the API returns only 25 entries: index: 0, size: 25 will return the first 25 entries, index: 1, size: 25 will return the next 25 entries, and so on.
Using BigQuery
To support more complex and systematic queries, we recommend that you use our BigQuery instance, open-targets-prod. To access the associations data using BigQuery, you can use the following SQL query against the associationByOverallDirect table.
SELECT
  associations.diseaseId AS disease_id,
  diseases.name AS disease_name,
  associations.targetId AS target_id,
  targets.approvedSymbol AS target_approved_symbol,
  associations.score AS overall_association_score,
  evidenceCount AS number_of_evidence_strings
FROM
  `open-targets-prod.platform.associationByOverallDirect` AS associations
JOIN
  `open-targets-prod.platform.diseases` AS diseases
ON
  associations.diseaseId = diseases.id
JOIN
  `open-targets-prod.platform.targets` AS targets
ON
  associations.targetId = targets.id
WHERE
  associations.diseaseId='MONDO_0100096'
ORDER BY
  associations.score DESC
You can access the query here and download the data in JSON or CSV format.
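If you would rather run the BigQuery query from a Python script than from the console, the following is a rough sketch using the google-cloud-bigquery client library. It assumes the library is installed and that you have a Google Cloud project and credentials configured, none of which is covered in this post. The query is a trimmed version of the one above with the disease ID passed as a query parameter, and run_association_query is a hypothetical helper name.

```python
# Trimmed version of the query above; @disease_id is a named query parameter.
SQL = """
SELECT
  associations.targetId AS target_id,
  targets.approvedSymbol AS target_approved_symbol,
  associations.score AS overall_association_score
FROM
  `open-targets-prod.platform.associationByOverallDirect` AS associations
JOIN
  `open-targets-prod.platform.targets` AS targets
ON
  associations.targetId = targets.id
WHERE
  associations.diseaseId = @disease_id
ORDER BY
  associations.score DESC
"""


def run_association_query(disease_id, client=None):
    """Run the query and return (approved symbol, overall score) pairs.

    Assumes google-cloud-bigquery is installed and credentials are set up.
    """
    # Imported here so the module loads even without the client library.
    from google.cloud import bigquery

    client = client or bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("disease_id", "STRING", disease_id)
        ]
    )
    rows = client.query(SQL, job_config=job_config).result()
    return [
        (row["target_approved_symbol"], row["overall_association_score"])
        for row in rows
    ]
```

Passing the disease ID as a parameter rather than formatting it into the SQL string lets you reuse the same query for other diseases, for example run_association_query("MONDO_0100096").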
Please note that the association scores returned by the GraphQL API are known to differ from those available in BigQuery (or via our data downloads) because of a modified algorithm and harmonic sum strategy. This will be fixed in our upcoming release, scheduled for the end of June; see issue #1508 in our GitHub issue tracker for more information.
Other examples
For further information and sample scripts, take a look at the Platform documentation. We also have a few other example scripts from @irene and @ahercules:
- How to access drug warning and pharmacovigilance data using the API
- How to find known drugs for a given disease using BigQuery or data downloads
- How to get marketed drugs for a set of targets using BigQuery