R script for GraphQL query: query targetDiseaseEvidence

Hi,

I’m struggling to extract the association scores (i.e. overall association score, genetic associations, drugs, etc…) for a chosen disease of interest, and genes of interest.

How could I build a query for Alzheimer’s disease on a set of user defined genes?

I think the query targetDiseaseEvidence is relevant here, but I’m struggling to get my query to work and to translate it into a POST request that runs in R.

Any help would be greatly appreciated!

Thanks
Eléonore

Hi @eleonoreschneegans and welcome to the Open Targets Community!

The overall association scores for a target or a disease can be found in the object that collects the associations with the given target or disease, that is associatedDiseases or associatedTargets respectively.

query associatedDiseases {
  target(ensemblId: "ENSG00000127318") {
    id
    approvedSymbol
    associatedDiseases {
      rows {
        disease {
          id
          name
        }
        score
      }
    }
  }
}


This example query returns all the phenotypes or diseases that the Platform associates with IL22, along with their overall scores. The approach is very similar when you want to query for a given disease.
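For the disease-centred direction, a minimal R sketch could look like the following. It assumes the `httr` and `jsonlite` packages and the public endpoint `https://api.platform.opentargets.org/api/v4/graphql`. The helper names `build_disease_request` and `get_disease_associations` are made up for illustration, and you have to supply the EFO ID of your disease yourself.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: builds the request body for a disease-centred query.
build_disease_request <- function(efo_id) {
  list(
    operationName = "associatedTargets",
    variables = list(efoId = efo_id),
    query = "
      query associatedTargets($efoId: String!) {
        disease(efoId: $efoId) {
          id
          name
          associatedTargets {
            rows {
              target {
                id
                approvedSymbol
              }
              score
            }
          }
        }
      }"
  )
}

# Hypothetical helper: sends the request and returns the rows as a data frame.
get_disease_associations <- function(efo_id) {
  response <- POST(
    "https://api.platform.opentargets.org/api/v4/graphql",
    body = build_disease_request(efo_id),
    encode = "json"
  )
  stop_for_status(response)  # fail early on HTTP errors
  parsed <- fromJSON(content(response, as = "text", encoding = "UTF-8"))
  parsed$data$disease$associatedTargets$rows
}
```

Calling `get_disease_associations()` with an EFO ID should return one row per associated target. Without a `page` argument, only the first page of results comes back.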

However, if you are interested in many diseases or targets and want to query them more systematically, I would suggest using our data dumps. Our Documentation has more information on how to do this in R.

Best,
Irene

Thanks @irene !
Now my issue is getting more than 25 rows returned by the query. How can we extend the following to return all possible targets in the database?

Query for targets associated with a disease

otp_qry$query('simple_query', 'query simpleQuery($efoId: String!){
  disease(efoId: $efoId){
    name
    associatedTargets{
      rows{
        target{
          id
          approvedName
        }
        datatypeScores{
          id
          score
        }
      }
    }
  }
}'
)

And what would be a way to get the gene ID for the target (not Ensembl)?

Hi Eléonore,

And what would be a way to get the gene ID for the target (not Ensembl)?

When querying the GraphQL API, the only way to retrieve associations is to provide the Ensembl gene ID or the EFO disease ID. These are the identifier types our infrastructure uses for genes and diseases. So if you have gene names or symbols, you need to map them first. For the mapping you can use either our downloadable flat files or Ensembl’s REST API, which lets you map gene names and symbols to Ensembl IDs.
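As a rough sketch of the Ensembl route, the snippet below looks up a gene symbol through the `xrefs/symbol` endpoint of Ensembl’s REST API. The helper names (`build_xref_url`, `parse_xref_response`, `symbol_to_ensembl`) are invented for this example, and it assumes the response is a JSON array of objects with `id` and `type` fields.

```r
library(httr)
library(jsonlite)

# Hypothetical helper: URL of the Ensembl xrefs/symbol endpoint for one symbol.
build_xref_url <- function(symbol, species = "homo_sapiens") {
  paste0("https://rest.ensembl.org/xrefs/symbol/", species, "/",
         URLencode(symbol, reserved = TRUE))
}

# Hypothetical helper: keeps the gene-level IDs from the JSON response text.
parse_xref_response <- function(json_text) {
  hits <- fromJSON(json_text)
  if (length(hits) == 0) return(character(0))  # symbol not found
  hits$id[hits$type == "gene"]
}

# Maps one gene symbol to zero or more Ensembl gene IDs.
symbol_to_ensembl <- function(symbol) {
  response <- GET(build_xref_url(symbol), accept_json())
  stop_for_status(response)
  parse_xref_response(content(response, as = "text", encoding = "UTF-8"))
}
```

A symbol can map to more than one ID, so it is worth checking the result before passing it on to the Platform API, for example with `sapply(my_symbols, symbol_to_ensembl)` on your own vector of symbols.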

When you build your GraphQL request, you can specify which fields you are interested in, so you can add the disease name, or even the disease ancestors. Having the disease names can help you find your disease of interest if you are querying with a fixed target. The documentation of the GraphQL schema can be found here.

An example R implementation for retrieving associated diseases for a given target looks as follows:

library('jsonlite')
library('httr')

get_associations = function(target_id){
    query_url = 'https://api.platform.opentargets.org/api/v4/graphql'

    # Building query:
    request_body = list(
        operationName= 'TargetAssociationsQuery',
        variables = list(
            ensemblId= target_id,
            index= 0,
            size= 10000,
            sortBy= 'score',
            filter= '',
            aggregationFilters = list()

        ),
        query = '
            query TargetAssociationsQuery($ensemblId: String!, $index: Int!, $size: Int!, $filter: String, $sortBy: String!, $aggregationFilters: [AggregationFilter!]) {
                target(ensemblId: $ensemblId) {
                    id
                    approvedSymbol
                    approvedName
                    associatedDiseases(page: {index: $index, size: $size}, orderByScore: $sortBy, BFilter: $filter, aggregationFilters: $aggregationFilters) {
                        count
                        rows {
                            disease {
                                id
                                name
                            }
                            score
                            datatypeScores {
                                componentId: id
                                score
                            }
                        }
                    }
                }
            }
        '
    )

    # Retrieve data:
    response = POST(query_url, body=request_body, encode='json')

    # Parse data:
    char = rawToChar(response$content)
    data = jsonlite::fromJSON(char)

    # Extracting associations:
    associations = data$data$target$associatedDiseases$rows

    # Adding target and disease columns to dataframe:
    associations$targetSymbol = data$data$target$approvedSymbol
    associations$targetId = data$data$target$id
    associations$diseaseId = associations$disease$id
    associations$diseaseName = associations$disease$name

    # Dropping unused columns and return:
    return (associations[, c('targetId', 'targetSymbol', 'diseaseId', 'diseaseName', 'score', 'datatypeScores')])
}


target_id = 'ENSG00000065361'

get_associations(target_id)

Extracting the datatype scores for each association requires a bit of work with the dataframe, because they come back as a nested column, so you might want to use tidyverse.
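One possible way to do this with `tidyr` is sketched below. It assumes the data frame has the shape returned by `get_associations()` above, that is a `datatypeScores` list column whose elements are data frames with `componentId` and `score` columns. The function name `spread_datatype_scores` is made up for this example.

```r
library(tidyr)

# Hypothetical helper: one row per association, one column per data type.
spread_datatype_scores <- function(associations) {
  # Expand the nested data frames; names_sep avoids a clash with the
  # overall `score` column by prefixing the inner column names.
  long <- unnest(associations, datatypeScores, names_sep = "_")
  # Turn each data type into its own column, filled with its score.
  pivot_wider(
    long,
    names_from = datatypeScores_componentId,
    values_from = datatypeScores_score
  )
}
```

Data types with no evidence for a given association end up as `NA` in the wide table. Usage would be along the lines of `spread_datatype_scores(get_associations(target_id))`.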