Discrepancy in Evidence Counts Between GraphQL API and BigQuery for enableIndirect: true

Hi, I’m afraid your query is not right. If you look at the exploded diseases:

WITH DiseaseIds AS (
  SELECT
    id AS parentDiseaseId,
    descendant.element AS diseaseId
  FROM
    `bigquery-public-data.open_targets_platform.diseases`,
    UNNEST(descendants.list) AS descendant
  WHERE
    id = 'EFO_0000729'
)

select * from DiseaseIds

You get a table with two columns and two rows:

[{
  "parentDiseaseId": "EFO_0000729",
  "diseaseId": "EFO_0005623"
}, {
  "parentDiseaseId": "EFO_0000729",
  "diseaseId": "EFO_0005626"
}]

If you want all direct and indirect evidence for the parent disease, it is not enough to join by the child disease identifiers: JOIN DiseaseIds d ON e.diseaseId = d.diseaseId, you also need to take the parent id into account!

ON (e.diseaseId = d.diseaseId) OR (e.diseaseId = d.parentDiseaseId)

The modified query returns 46 rows, which is consistent with the UI. Please give it a go!

1 Like