Can we get all Bibliography data per gene using the API?

When a gene is searched, most tabs have this option for the API:
image
But for the bibliography it’s not there …


It would be really handy to be able to get this data through the API.
PS: I do realise you can download all the bibliography data through FTP, but there is quite a lot of it!
Thanks

Charlie

Hi Charlie,

Yes, indeed, the Bibliography widget is one of the exceptions where these functionalities are not available. The above table is generated by the following API query:

query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}

With the following parameters:

{
  "ids": [],
  "startYear": null,
  "startMonth": null,
  "endYear": null,
  "endMonth": null,
  "threshold": 0.5,
  "size": 15,
  "entityNames": [
    "disease",
    "drug",
    "target"
  ],
  "cursor": null,
  "id": "ENSG00000144029"
}
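
Since the original question was about getting all bibliography data for a gene, note that `literatureOcurrences` returns a `cursor` alongside `rows`. A minimal Python sketch of paging through the results follows. It assumes that passing the returned cursor back as `$cursor` yields the next page and that an empty cursor means there are no more pages; neither is confirmed in this thread. The helper names `iter_literature_rows` and `make_fetcher` are hypothetical.

```python
from typing import Callable, Dict, Iterator, Optional

API_URL = "https://api.platform.opentargets.org/api/v4/graphql"


def iter_literature_rows(
    fetch_page: Callable[[Optional[str]], Dict],
    max_pages: int = 100,
) -> Iterator[Dict]:
    """Yield rows from successive literatureOcurrences pages.

    fetch_page(cursor) must return the `literatureOcurrences` object
    (a dict with `rows` and `cursor`) for the given cursor.
    Assumption: an empty or null cursor in the response means the last page.
    """
    cursor = None
    for _ in range(max_pages):  # hard cap so a bad cursor cannot loop forever
        page = fetch_page(cursor)
        yield from page.get("rows") or []
        cursor = page.get("cursor")
        if not cursor:
            break


def make_fetcher(query_string: str, ensembl_id: str, url: str = API_URL):
    """Build a fetch_page function that posts the query shown above."""
    import requests  # imported lazily so the paging helper has no hard dependency

    def fetch_page(cursor: Optional[str]) -> Dict:
        # Only `id` is required; the other variables fall back to the
        # defaults declared in the query.
        variables = {"id": ensembl_id, "cursor": cursor}
        r = requests.post(
            url, json={"query": query_string, "variables": variables}, timeout=30
        )
        r.raise_for_status()
        return r.json()["data"]["target"]["literatureOcurrences"]

    return fetch_page
```

With the query text stored in `query_string`, something like `rows = list(iter_literature_rows(make_fetcher(query_string, "ENSG00000144029")))` would collect the PMIDs page by page.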

You can see from the query and the returned data that only the bare minimum is returned; we then use EuroPMC’s API to retrieve the title and all publication-related metadata. However, if you explore our API further, you can see there are options to extract matches.
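
As a rough illustration of that second step, the sketch below builds a Europe PMC REST search for a list of PMIDs and pulls titles out of the JSON response. It is not necessarily the call the platform itself makes. The helper names are hypothetical, and the response layout (`resultList.result[]` with `id` and `title`) is an assumption based on the public Europe PMC search service.

```python
from typing import Dict, List

# Public Europe PMC REST search endpoint.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_europepmc_params(pmids: List[str], page_size: int = 25) -> Dict[str, str]:
    """Build query parameters that look up several PMIDs in one request."""
    query = " OR ".join(f"EXT_ID:{pmid}" for pmid in pmids)
    return {
        "query": f"({query}) AND SRC:MED",  # SRC:MED restricts to PubMed records
        "format": "json",
        "resultType": "lite",
        "pageSize": str(page_size),
    }


def extract_titles(response_json: Dict) -> Dict[str, str]:
    """Map record id -> title, assuming the resultList.result[] layout."""
    results = response_json.get("resultList", {}).get("result", [])
    return {r["id"]: r.get("title", "") for r in results if "id" in r}


# Usage (needs the third-party `requests` package and network access):
#   import requests
#   pmids = [row["pmid"] for row in rows]  # rows from literatureOcurrences
#   r = requests.get(EUROPE_PMC_SEARCH, params=build_europepmc_params(pmids))
#   titles = extract_titles(r.json())
```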

In the meantime, I have opened a ticket to explore the possibilities to get this feature included.


Hi, thanks very much for this. I’m probably being dim, but I’m running your code through Python as shown below and getting this error:
400
{'data': None, 'errors': [{'message': 'Variable '$startYear' expected value of type 'Int' but got: "null"

import json
import requests

variables = {
  "ids": [],
  "startYear": "null",
  "startMonth": "null",
  "endYear": "null",
  "endMonth": "null",
  "threshold": 0.5,
  "size": 15,
  "entityNames": [
    "disease",
    "drug",
    "target"
  ],
  "cursor": "null",
  "id": "ENSG00000144029"
}

query_string = """
query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}
"""

base_url = "https://api.platform.opentargets.org/api/v4/graphql"

# Perform POST request and check status code of response
r = requests.post(base_url, json={"query": query_string, "variables": variables})
print(r.status_code)

# Transform API response from JSON into Python dictionary and print in console
api_response = json.loads(r.text)
print(api_response)

OK, I’ve solved it myself, but I’m leaving this here in case others are also stumped:

Use None in Python, not the string "null", and it works.
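
For anyone hitting the same error, here is a small sketch of why this works: Python's `None` is serialised to JSON `null`, whereas the string `"null"` is sent as a string and fails the `Int` type check. The variable values are the ones from the post above.

```python
import json

# Same variables as above, with Python None instead of the string "null".
variables = {
    "ids": [],
    "startYear": None,
    "startMonth": None,
    "endYear": None,
    "endMonth": None,
    "threshold": 0.5,
    "size": 15,
    "entityNames": ["disease", "drug", "target"],
    "cursor": None,
    "id": "ENSG00000144029",
}

# json.dumps turns None into a bare JSON null, which GraphQL accepts for
# nullable variables such as `$startYear: Int = null`.
payload = json.dumps({"variables": variables})
print(payload)
```

Passing the dictionary through the `json=` argument of `requests.post` performs the same serialisation.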

One thing I’m a bit confused about is this cooccurrences FTP download:

http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/21.04/output/literature/cooccurrences/

It seems Open Targets has done all this work itself, but you are talking about accessing data from EuroPMC’s API. So how exactly is the FTP site used in Open Targets?

Hi,

I am happy to hear the issue with the query is sorted. For how the literature data is ingested into our website and how we further process that dataset, please take a look at this blog post:

Best,
Daniel

Here is the R solution:

library(httr)
library(jsonlite)

# Define the query string
query_string <- '
query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}
'

# Define the variables
variables <- list(
  id = "ENSG00000157764",  # Example Ensembl ID
  size = 15,
  threshold = 0.5
)

# Define the URL for the API endpoint
base_url <- "https://api.platform.opentargets.org/api/v4/graphql"

# Perform the POST request
response <- POST(
  url = base_url,
  body = list(query = query_string, variables = variables),
  encode = "json"
)

# Check the status code of the response
if (status_code(response) == 200) {
  # Parse the response from JSON
  api_response <- content(response, as = "parsed", type = "application/json")
  
  # Print the response in a readable format
  print(toJSON(api_response, pretty = TRUE))
} else {
  # Print the error message if the request failed
  print(paste("Error:", status_code(response)))
}