What kind of score is returned when using the "search" endpoint?

CPerscheid · 13 September 2021 18:36

Hi there,

I would like to query OpenTargets via GraphQL based on general search terms, e.g. “breast cancer” (I do not have a suitable EFO ID at hand for all search terms). As such, I am using the search endpoint. However, I noticed that the score that is returned does not have a range between 0 and 1 as stated in the documentation (and which is the case when using the more specific endpoints). Can you please explain what this score is actually about?

This is my query:

query simpleQuery {
  search(queryString: "glioma", entityNames: ["target"], page: {index: 0, size: 10}) {
    hits {
      id
      name
      entity
      score
    }
  }
}

and these are some of my results:

{
  "data": {
    "search": {
      "hits": [
        {
          "id": "ENSG00000108231",
          "name": "LGI1",
          "entity": "target",
          "score": 873.07935
        },
        {
          "id": "ENSG00000025293",
          "name": "PHF20",
          "entity": "target",
          "score": 353.00244
        },
        ...
      ]
    }
  }
}

I would be happy if you could help me with that.

Best,
Cindy Perscheid

ahercules · 15 September 2021 12:37

Hello @CPerscheid!

Welcome to the Open Targets Community!

When using the search endpoint, the score is a measure of how well your search term matched the returned result. It is not related to the target-disease association scores. To access the target-disease association scores, you would need to use the target or disease endpoint and pass the relevant ID.
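A minimal sketch of that two-step approach in Python, using only the standard library: resolve a search term to an ID via the search endpoint, then pass that ID to the disease endpoint to get association scores. The endpoint URL, the `disease(efoId: ...)` query, and the `associatedTargets` and `approvedSymbol` fields are assumptions based on the public Platform schema and should be checked in the API playground; the function names are hypothetical. The search score is used only to rank hits against each other, not as an absolute measure of match quality.

```python
import json
import urllib.request

# Assumed public endpoint of the Open Targets Platform GraphQL API.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Step 1: resolve a free-text term to disease IDs via the search endpoint.
SEARCH_QUERY = """
query resolveDisease($term: String!) {
  search(queryString: $term, entityNames: ["disease"], page: {index: 0, size: 5}) {
    hits { id name entity score }
  }
}
"""

# Step 2: fetch association scores for one disease ID (field names assumed).
ASSOCIATION_QUERY = """
query diseaseAssociations($efoId: String!) {
  disease(efoId: $efoId) {
    id
    name
    associatedTargets(page: {index: 0, size: 10}) {
      rows {
        target { id approvedSymbol }
        score
      }
    }
  }
}
"""


def run_query(query, variables):
    """POST a GraphQL query and return the decoded JSON response."""
    payload = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def best_hit(search_response):
    """Return the hit with the highest search score, or None if there are no hits."""
    hits = search_response["data"]["search"]["hits"]
    return max(hits, key=lambda hit: hit["score"]) if hits else None


def associations_for_term(term):
    """Resolve a term to its top-ranked disease ID, then return its association rows."""
    hit = best_hit(run_query(SEARCH_QUERY, {"term": term}))
    if hit is None:
        return []
    response = run_query(ASSOCIATION_QUERY, {"efoId": hit["id"]})
    return response["data"]["disease"]["associatedTargets"]["rows"]
```

Calling `associations_for_term("glioma")` would make two requests to the API. Taking the top hit blindly can pick the wrong disease for ambiguous terms, so it is worth inspecting the returned name before trusting the ID.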

I hope this helps answer your question — and feel free to respond below if you have any further questions about the search endpoint for our team.

Thank you,

Andrew

CPerscheid · 15 September 2021 13:07

Dear @ahercules,

thanks for letting me know. I used the function get_associations_for_disease from the Python OpenTargetsClient before, which automatically resolved the corresponding IDs for a disease search term. I suppose I will have to do this task now on my own via the general search query first to get a valid EFO ID for my search term and then use it for the actual query?
If so: what would be the maximum score for an exact match, and above what score could I consider a returned result to be a good match? In other words, how is the score calculated? Is it some string distance measure?

Best,

Cindy

irene · 15 September 2021 14:59

Hi @CPerscheid!

Getting the right EFO ID for a set of disease labels is a problem we face every day at Open Targets (and it is not easy!).

Given your use case, I would suggest using OnToma, a tool that we have developed in-house and that we use for exactly that purpose.
Both the Python client and the CLI are easy to integrate into your code or to run as an independent step of your data processing. Here is the documentation: OnToma documentation

Unlike the API, OnToma only returns results of high confidence and quality. It uses a more complex algorithm than the one implemented in the search endpoint, and it saves you the overhead of making many API queries.

If you are interested in the association scores for target/disease pairs, I would suggest that you:

- generate a table with all your disease labels plus the EFO IDs output by OnToma;
- join that table with the associations dataset from our Data Downloads page.

This can be rapidly done using Pandas.
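As a rough sketch of that join, assuming the associations dataset exposes `diseaseId`, `targetId` and `score` columns (check the schema of the files you download). The two tables below are toy stand-ins: the disease IDs are placeholders, the scores are made up, and `join_associations` is a hypothetical helper name. With real data, the associations table would be read from the downloaded files instead.

```python
import pandas as pd

# Illustrative label-to-ID table, as a mapping step such as OnToma might produce.
# The disease IDs are placeholders, not real EFO identifiers.
mapped = pd.DataFrame({
    "label": ["disease A", "disease B"],
    "diseaseId": ["EFO_PLACEHOLDER_A", "EFO_PLACEHOLDER_B"],
})

# Toy stand-in for the associations dataset; scores are invented for illustration.
associations = pd.DataFrame({
    "diseaseId": ["EFO_PLACEHOLDER_A", "EFO_PLACEHOLDER_A", "EFO_PLACEHOLDER_C"],
    "targetId": ["ENSG00000108231", "ENSG00000025293", "ENSG00000108231"],
    "score": [0.62, 0.15, 0.40],
})


def join_associations(mapped, associations):
    """Left-join labels to associations so labels without any association are kept."""
    merged = mapped.merge(associations, on="diseaseId", how="left")
    # Highest-scoring targets first within each label.
    return merged.sort_values(
        ["label", "score"], ascending=[True, False]
    ).reset_index(drop=True)


result = join_associations(mapped, associations)
print(result)
```

The left join is a deliberate choice here: a label whose ID has no associations shows up with empty `targetId` and `score` values instead of silently disappearing, which makes mapping problems easier to spot.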
I hope we have been of help. Please reach out if you have further questions!

Best,
Irene

CPerscheid · 16 September 2021 08:13

Hi @irene,

OnToma sounds great, thanks for pointing it out! I will definitely try it out.

Best,
Cindy
