Differences between associations in the UI and associations in the data downloads (21.04)

sigven · 3 May 2021 13:22

Hi,
I am struggling a bit to understand the (new) association scores, and to what extent they are comparable. I just looked at two of the strongest and best-known associations in cancer: EGFR in lung cancer, and BRAF in melanoma. In your web interface, the association of BRAF with cutaneous melanoma has a score of 0.82 (mouse-over in the Associated diseases/Table pane), while the association of EGFR with lung adenocarcinoma has a score of 23.3 (same mouse-over). Are they comparable at all, in the sense that the evidence for an association between EGFR and lung cancer is many orders of magnitude stronger than for the association between BRAF and melanoma? And is there a global scale for the scores, or is the scale “local” to each target and its respective associations?

kind regards,
Sigve

ochoa · 4 May 2021 08:29

Dear Sigve,

Unfortunately, this is a bug we have identified that affects our first release of the new Platform.

It will surely be fixed within the next couple of days. We’ll keep you posted.

Best,

David

sigven · 4 May 2021 08:42

Dear David,
Thanks for your swift response. Looking forward to the fix:-)

kind regards,
Sigve

ahercules · 24 May 2021 15:56

Hi @sigven! I’ve responded in our GitHub issue tracker but also wanted to post here for other Community members.

Our data and technical teams have investigated the differences between the associations displayed in the user interface and those available in our dataset downloads. Both the data in the user interface and the downloadable associations datasets are correct and valid. The difference arises because the two are computed with slightly different algorithms, specifically in the normalisation and harmonic sum strategy. We expect the ranking of associations in the user interface and in the datasets to be broadly similar, but there will be some differences because of this.
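As a rough illustration of why a harmonic sum and its normalisation can change the numbers while leaving the ranking mostly intact, here is a minimal Python sketch. It assumes the commonly described form of a harmonic sum (evidence scores sorted in descending order, each divided by the square of its rank) and an assumed normalisation by the theoretical maximum π²/6. The function name and the example scores are hypothetical, and this is not a reconstruction of either the UI or the dataset pipeline.

```python
import math


def harmonic_sum(scores, normalise=True):
    """Harmonic sum of evidence scores (illustrative sketch only).

    Scores are sorted in descending order and the i-th score (1-based)
    is weighted by 1 / i**2, so the strongest evidence dominates.
    """
    ranked = sorted(scores, reverse=True)
    raw = sum(s / (i ** 2) for i, s in enumerate(ranked, start=1))
    if not normalise:
        return raw
    # Assumed normalisation: divide by the limit of sum(1 / i**2),
    # which is pi**2 / 6, the value reached if every score were 1.0.
    return raw / (math.pi ** 2 / 6)


# Hypothetical evidence scores for two target-disease pairs.
pair_a = [1.0, 0.9, 0.8, 0.4]
pair_b = [0.6, 0.5, 0.2]

for name, scores in (("pair_a", pair_a), ("pair_b", pair_b)):
    print(name, harmonic_sum(scores, normalise=False), harmonic_sum(scores))
```

With this sketch the raw sum can exceed 1 while the normalised value stays between 0 and 1, and the two pairs keep the same order under both variants. Differences in how the inputs are capped or normalised would shift the values further.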

We will be harmonising our approach in our next release, 21.06, scheduled for the end of June. From then on, the user interface and the datasets will provide the same data.

For more information, please see issue #1508 in the opentargets/platform repository on GitHub, “Investigate scores available in associations data files versus API”.

ahercules · 30 June 2021 15:48

Hi @sigven!

Just wanted to let you know that with our recent Platform 21.06 release we have harmonised our approach, so the scores in our datasets and user interface should now be the same. We have also simplified the associations datasets, which now include only the targetId, diseaseId, score, and evidenceCount fields, so please note that the 21.06 files will be smaller than those from our 21.04 release.

Thank you!

~ Andrew

sigven · 30 June 2021 17:14

Hi @ahercules!

Great! Looking forward to downloading and using the latest release.

kind regards,
Sigve

ahercules · 28 July 2021 09:52

Hi @sigven,

Thank you for being so patient while we fixed the mismatch in the association scores. Last week, our back-end team applied a fix and regenerated the files, which now match what is available in the UI.

You can find the files on our FTP site in both JSON and Parquet formats. Alternatively, you can access the data through our BigQuery instance, open-targets-prod.
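For anyone who wants to check the downloaded scores against the UI, here is a minimal Python sketch for ranking a target's associations from the JSON files. It assumes the files are newline-delimited JSON with the targetId, diseaseId, score, and evidenceCount fields; the sample records, IDs, and function names below are hypothetical placeholders, not real data.

```python
import json

# Hypothetical sample in the assumed newline-delimited JSON layout.
SAMPLE = """\
{"targetId": "TARGET_A", "diseaseId": "DISEASE_X", "score": 0.82, "evidenceCount": 120}
{"targetId": "TARGET_A", "diseaseId": "DISEASE_Y", "score": 0.41, "evidenceCount": 15}
{"targetId": "TARGET_B", "diseaseId": "DISEASE_X", "score": 0.77, "evidenceCount": 98}
"""


def load_associations(lines):
    """Parse one JSON record per non-empty line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


def top_associations(records, target_id, n=5):
    """Return up to n associations for a target, highest score first."""
    hits = [r for r in records if r["targetId"] == target_id]
    return sorted(hits, key=lambda r: r["score"], reverse=True)[:n]


records = load_associations(SAMPLE.splitlines())
for rec in top_associations(records, "TARGET_A"):
    print(rec["diseaseId"], rec["score"], rec["evidenceCount"])
```

To use it on a real download, pass an open file handle for one of the JSON part files to load_associations instead of the sample string.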

Cheers,

Andrew
