Can we get all Bibliography data per gene using the api?

cjeynes · 6 July 2023 07:41

When a gene is searched, most tabs have this option for the API.

But for the Bibliography tab it's not there…

It would be really handy to be able to get this data through the API.
PS: I do realise you can download all the bibliography data through FTP, but there is quite a lot of it!
Thanks,

Charlie

dsuveges · 14 July 2023 10:08

Hi Charlie,

Yes, indeed, the Bibliography widget is one of the exceptions where these functionalities are not available. The table in that widget is generated by the following API query:

query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}

With the following parameters:

{
  "ids": [],
  "startYear": null,
  "startMonth": null,
  "endYear": null,
  "endMonth": null,
  "threshold": 0.5,
  "size": 15,
  "entityNames": [
    "disease",
    "drug",
    "target"
  ],
  "cursor": null,
  "id": "ENSG00000144029"
}

You can see from the query and the returned data that only the bare minimum is returned; we then use Europe PMC's API to retrieve the title and all publication-related metadata. However, if you explore our API further, you can see there are options to extract matches.
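
As a rough illustration of that second step, here is a minimal Python sketch of looking up titles for the returned PMIDs through Europe PMC's public REST search endpoint. The endpoint URL, the EXT_ID/SRC query syntax, and the response field names (resultList, result, title, journalTitle, pubYear) are assumptions based on Europe PMC's public documentation, and the helper names are made up for this example. It is not necessarily the call the Platform itself makes.

```python
import json
import urllib.parse
import urllib.request

# Assumed public Europe PMC REST search endpoint.
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(pmids):
    """Build a search URL that asks for the given PubMed IDs in one request."""
    id_clause = " OR ".join(f"EXT_ID:{pmid}" for pmid in pmids)
    params = {
        "query": f"({id_clause}) AND SRC:MED",  # SRC:MED restricts to PubMed records
        "format": "json",
        "resultType": "lite",
        "pageSize": len(pmids),
    }
    return EUROPE_PMC_SEARCH + "?" + urllib.parse.urlencode(params)


def extract_metadata(response):
    """Map each PMID in a decoded search response to a few metadata fields."""
    results = response.get("resultList", {}).get("result", [])
    return {
        record["pmid"]: {
            "title": record.get("title"),
            "journal": record.get("journalTitle"),
            "year": record.get("pubYear"),
        }
        for record in results
        if "pmid" in record
    }


def fetch_metadata(pmids):
    """Hypothetical helper: query Europe PMC and return metadata keyed by PMID."""
    with urllib.request.urlopen(build_search_url(pmids)) as response:
        return extract_metadata(json.load(response))
```

Calling fetch_metadata with the pmid values from the rows of literatureOcurrences should then give one small dictionary per publication, provided the assumptions above hold.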

In the meantime, I have opened a ticket to explore the possibilities to get this feature included.

cjeynes · 19 July 2023 13:52

Hi, thanks very much for this. I may be being dim here, but I'm running your code through Python like this and getting the error:
400
{'data': None, 'errors': [{'message': 'Variable '$startYear' expected value of type 'Int' but got: "null"

import json
import requests

variables = {
  "ids": [],
  "startYear": "null",
  "startMonth": "null",
  "endYear": "null",
  "endMonth": "null",
  "threshold": 0.5,
  "size": 15,
  "entityNames": [
    "disease",
    "drug",
    "target"
  ],
  "cursor": "null",
  "id": "ENSG00000144029"
}

query_string = """
query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}
"""

base_url = "https://api.platform.opentargets.org/api/v4/graphql"

# Perform POST request and check status code of response
r = requests.post(base_url, json={"query": query_string, "variables": variables})
print(r.status_code)

# Transform API response from JSON into Python dictionary and print in console
api_response = json.loads(r.text)
print(api_response)

cjeynes · 19 July 2023 14:04

OK, I've solved it myself, but I'm leaving this here in case others are also stumped:

Use None in Python, not "null", and it works.
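
To spell out why this works: when the variables dictionary is sent with json=, it is serialised to JSON, where Python's None becomes a real null, while the string "null" stays a string and fails the Int type check. A minimal sketch using only the standard library, with the same variable values as in the posts above:

```python
import json

# Variables for the query, with Python None where the API expects null.
variables = {
    "ids": [],
    "startYear": None,
    "startMonth": None,
    "endYear": None,
    "endMonth": None,
    "threshold": 0.5,
    "size": 15,
    "entityNames": ["disease", "drug", "target"],
    "cursor": None,
    "id": "ENSG00000144029",
}

print(json.dumps(None))    # null   (a JSON null, accepted for an Int variable)
print(json.dumps("null"))  # "null" (a JSON string, rejected for an Int variable)

# This is the shape of the variables part of the request body.
payload = json.dumps({"variables": variables})
print(payload)
```

Since all of these variables already default to null in the query, leaving the None entries out of the dictionary altogether should work as well.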

cjeynes · 19 July 2023 14:07

One thing I'm a bit confused about is this cooccurrences FTP download:

http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/21.04/output/literature/cooccurrences/

It seems Open Targets has done all this work itself, but you also mention accessing data from Europe PMC's API. So how exactly is the FTP data used in Open Targets?

dsuveges · 19 July 2023 14:22

Hi,

I am happy to hear the issue with the query is sorted. Regarding how the literature data is ingested into our website and how we further process that dataset, please take a look at this blog post:

Best,
Daniel

Shicheng_Guo · 16 July 2024 16:40

Here is the R solution:

library(httr)
library(jsonlite)

# Define the query string
query_string <- '
query SimilarEntitiesQuery(
  $id: String!
  $ids: [String!] = []
  $startYear: Int = null
  $startMonth: Int = null
  $endYear: Int = null
  $endMonth: Int = null
  $threshold: Float = 0.5
  $size: Int! = 15
  $entityNames: [String!] = []
  $cursor: String = null
) {
  target(ensemblId: $id) {
    id
    approvedName
    similarEntities(
      additionalIds: $ids
      threshold: $threshold
      size: $size
      entityNames: $entityNames
    ) {
      id
      score
      object {
        ... on Target {
          id
          approvedSymbol
        }
        ... on Drug {
          id
          name
        }
        ... on Disease {
          id
          name
        }
      }
    }
    literatureOcurrences(
      additionalIds: $ids
      cursor: $cursor
      startYear: $startYear
      startMonth: $startMonth
      endYear: $endYear
      endMonth: $endMonth
    ) {
      count
      earliestPubYear
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}
'

# Define the variables
variables <- list(
  id = "ENSG00000157764",  # Example Ensembl ID
  size = 15,
  threshold = 0.5
)

# Define the URL for the API endpoint
base_url <- "https://api.platform.opentargets.org/api/v4/graphql"

# Perform the POST request
response <- POST(
  url = base_url,
  body = list(query = query_string, variables = variables),
  encode = "json"
)

# Check the status code of the response
if (status_code(response) == 200) {
  # Parse the response from JSON
  api_response <- content(response, as = "parsed", type = "application/json")
  
  # Print the response in a readable format
  print(toJSON(api_response, pretty = TRUE))
} else {
  # Print the error message if the request failed
  print(paste("Error:", status_code(response)))
}
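
For the original question of collecting all bibliography rows for a gene, one possible approach is to page through literatureOcurrences by feeding the returned cursor back into the next request. The Python sketch below assumes that an empty or null cursor marks the last page and that the query can be trimmed to just the literature block. The names iter_literature_rows and post_graphql are hypothetical helpers for this example, not part of any Open Targets client.

```python
import json
import urllib.request

API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Trimmed version of the query from this thread: literature rows only.
QUERY = """
query LiteratureQuery($id: String!, $ids: [String!] = [], $cursor: String = null) {
  target(ensemblId: $id) {
    literatureOcurrences(additionalIds: $ids, cursor: $cursor) {
      count
      cursor
      rows {
        pmid
        pmcid
        publicationDate
      }
    }
  }
}
"""


def post_graphql(query, variables):
    """Send one GraphQL request and return the decoded JSON response."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_literature_rows(ensembl_id, fetch=post_graphql, max_pages=None):
    """Yield literature rows page by page until no cursor is returned."""
    cursor = None
    pages = 0
    while True:
        result = fetch(QUERY, {"id": ensembl_id, "ids": [], "cursor": cursor})
        target = (result.get("data") or {}).get("target")
        if not target:
            raise ValueError(f"No target data returned for {ensembl_id}")
        block = target["literatureOcurrences"]
        yield from block["rows"]
        cursor = block["cursor"]
        pages += 1
        # Stop on the last page, or early if the caller set a page limit.
        if not cursor or (max_pages is not None and pages >= max_pages):
            break
```

For example, list(iter_literature_rows("ENSG00000144029", max_pages=2)) would fetch at most two pages. The fetch argument is there so the loop can be tried with a stub before hitting the live API, and max_pages keeps a first run small, since some genes have a very large number of publications.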
