Hi,
I’m encountering a discrepancy between the evidence retrieved via the Open Targets GraphQL API and the BigQuery dataset when querying evidence for a specific disease-target pair with enableIndirect: true.
- GraphQL API:
Using the following query with enableIndirect: true, I retrieve evidence for 28 papers associated with the disease (EFO_0005917) and the target (ENSG00000164588).
query EuropePMCQuery(
  $ensemblId: String!
  $efoId: String!
  $size: Int!
  $cursor: String
) {
  disease(efoId: $efoId) {
    id
    europePmc: evidences(
      ensemblIds: [$ensemblId]
      enableIndirect: true
      size: $size
      datasourceIds: ["europepmc"]
      cursor: $cursor
    ) {
      count
      cursor
      rows {
        disease {
          name
          id
        }
        target {
          approvedSymbol
          id
        }
        literature
        resourceScore
      }
    }
  }
}
This query provides evidence for both direct and indirect associations (through disease ontology descendants).
- BigQuery:
I attempted to replicate this query in BigQuery using the open_targets_platform dataset. The following query was used:
SELECT
targetId,
diseaseId,
ARRAY_AGG(literature_element) AS aggregated_literature,
COUNT(*) AS evidence_count
FROM
`bigquery-public-data.open_targets_platform.evidence`,
UNNEST(literature.list) AS literature_element
WHERE
targetId = "ENSG00000164588"
AND diseaseId = "EFO_0005917"
GROUP BY
targetId, diseaseId
This query returns only 7 pieces of evidence, despite accounting for all literature in the evidence.literature.list field.
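To make my working assumption explicit: I suspect that enableIndirect: true expands the requested disease to its ontology descendants before filtering evidence, while my BigQuery query matches only the exact diseaseId. The Python sketch below illustrates that guess on toy data. All IDs, the descendants mapping, and both function names are hypothetical placeholders, and I have not confirmed that this is how the API actually behaves.

```python
from typing import Dict, List, Set

def expand_disease_ids(disease_id: str, descendants: Dict[str, List[str]]) -> Set[str]:
    """Return the disease ID together with its listed descendants (assumed mapping)."""
    return {disease_id, *descendants.get(disease_id, [])}

def indirect_evidence(
    evidence_rows: List[dict],
    target_id: str,
    disease_id: str,
    descendants: Dict[str, List[str]],
) -> List[dict]:
    """Keep rows for the target whose disease is the query disease or a descendant."""
    wanted = expand_disease_ids(disease_id, descendants)
    return [
        row
        for row in evidence_rows
        if row["targetId"] == target_id and row["diseaseId"] in wanted
    ]

# Toy data only: these IDs are placeholders, not real Open Targets records.
descendants = {"EFO_PARENT": ["EFO_CHILD_A", "EFO_CHILD_B"]}
evidence = [
    {"targetId": "ENSG_X", "diseaseId": "EFO_PARENT"},
    {"targetId": "ENSG_X", "diseaseId": "EFO_CHILD_A"},
    {"targetId": "ENSG_X", "diseaseId": "EFO_OTHER"},
    {"targetId": "ENSG_Y", "diseaseId": "EFO_CHILD_B"},
]

# Direct match only, analogous to my BigQuery WHERE clause.
direct = [
    row
    for row in evidence
    if row["targetId"] == "ENSG_X" and row["diseaseId"] == "EFO_PARENT"
]
# Direct plus descendant matches, analogous to what I assume enableIndirect does.
indirect = indirect_evidence(evidence, "ENSG_X", "EFO_PARENT", descendants)

print(len(direct), len(indirect))
```

If this assumption holds, the extra evidence in the API response would come from rows annotated to descendant diseases of EFO_0005917, which an exact diseaseId match cannot see.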
Key Observations
- The GraphQL API retrieves 28 pieces of evidence with enableIndirect: true.
- BigQuery results are limited to 7 pieces of evidence, even when aggregating all available literature for the disease-target pair.
Questions
- How does the GraphQL API handle enableIndirect: true? Does it include evidence from indirect associations (e.g., descendant diseases) not captured in the BigQuery dataset?
- Is there a way to replicate the behavior of enableIndirect: true in BigQuery?
- Are there known differences between the GraphQL API and BigQuery datasets that might explain this discrepancy?
Thank you for your help in clarifying this issue.
Muhamed