23.02 Platform release now live!

hcornu · 22 February 2023 09:56

We have just released the latest update to the Open Targets Platform — 23.02.

Key highlights for this release:

This release integrates 14,611,717 evidence strings to build 6,960,486 target-disease associations between 22,274 diseases or phenotypes and 62,678 targets from the following 22 public resources:

2,177,595 genetic evidence from European Variation Archive (EVA)
782,147 genetic evidence from Open Targets Genetics
3,031 genetic evidence from Gene2Phenotype
31,995 genetic evidence from the Genomics England PanelApp
1,971 genetic evidence from ClinGen
6,254 genetic evidence from Orphanet
27,373 genetic evidence from Gene burden
4,151 genetic evidence from UniProt Literature
16,965 somatic evidence from European Variation Archive (EVA)
3,299 somatic evidence from intOGen
76,292 somatic evidence from the Cancer Gene Census
26,383 somatic evidence from UniProt
612,079 drug evidence from ChEMBL
230,903 expression evidence from Expression Atlas
10,413 affected pathway evidence from Reactome
72,294 affected pathway evidence from SLAPenrich
378 affected pathway evidence from PROGENy
390 systems biology evidence from SysBio
1,298 somatic evidence from the Cancer Genome Interpreter
1,838 CRISPR-Cas9 (Cancer Cell Lines) evidence from Behan et al. 2019
1,047,024 mouse model evidence from IMPC
5,300,042 scientific literature evidence from co-occurrence mining in Europe PMC

Additionally, the Platform now allows users to explore data on 12,854 drugs or compounds.

For more details, read the 23.02 blog post.

Pankaj_Agarwal · 22 February 2023 14:12

The evidence from Europe PMC seems to be half that from the previous release, in spite of adding patents. What caused this change? It would be useful to put a percentage change for evidence count in each of these release notes.

hcornu · 22 February 2023 14:42

Hi @Pankaj_Agarwal!

The drop in evidence from Europe PMC is due to a known bug in the pipeline. Because of this bug, we are not processing all the publications that we should. We are actively working with Europe PMC to resolve this.

However, the drop in associations is less drastic, which suggests that we are not losing crucial evidence.

Thank you for your feedback about including percentage changes. However, we provide these metrics to give a sense of the amount of data and its distribution, but we do not want to place too much emphasis on the numbers, since we value the quality of the data over its quantity. The amount of evidence will fluctuate over time based on our data sources or the way we process the data, particularly with a data source like this one. In fact, we are currently introducing some changes to the pipeline that will cause the number of evidences to drop.

Out of curiosity, would you be willing to share how you use these metrics? Thank you!

Pankaj_Agarwal · 22 February 2023 14:49

Thanks, @hcornu. Can you provide a little bit more detail in terms of the number of publications not being processed? Should we continue to use the previous release for the epmc data until this bug is fixed, or a union of the two releases?

I agree with your comment about quality, not quantity, but I have found that quantity is important for QC reasons. I use it to check that, when I post-process the data, nothing has changed that would make me lose a significant amount of it.
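A QC check of this kind could be sketched roughly as follows. This is only an illustration, assuming per-source evidence counts are already available as plain dictionaries; the function name `flag_count_changes`, the 20% threshold, and the counts in the example are all hypothetical placeholders, not real release figures.

```python
def flag_count_changes(previous, current, threshold_pct=20.0):
    """Return {source: percent change} for sources whose evidence count
    moved by more than threshold_pct between two releases."""
    flagged = {}
    for source in sorted(set(previous) | set(current)):
        old = previous.get(source, 0)
        new = current.get(source, 0)
        if old == 0:
            # No baseline to compare against, so no percentage is defined;
            # report a source that newly appears as an infinite increase.
            if new > 0:
                flagged[source] = float("inf")
            continue
        pct = 100.0 * (new - old) / old
        if abs(pct) > threshold_pct:
            flagged[source] = pct
    return flagged


# Placeholder counts for illustration only (NOT real release numbers).
previous = {"source_a": 1000, "source_b": 200, "source_c": 50}
current = {"source_a": 480, "source_b": 210, "source_d": 30}
print(flag_count_changes(previous, current))
```

With these placeholder inputs, `source_a` (halved), `source_c` (missing from the newer release), and `source_d` (new) would be flagged, while the small change in `source_b` would pass.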

thondeboer · 25 February 2023 18:18

I noticed that one of the input files for evidence in the 23.02 release is bzip2-compressed (evidence-files/atlas.json.bz2), while all the other files are gzipped. The platform-etl-backend scripts and reference.conf do not specify that this file is bzip2, so the pipeline fails to process it: I think all the evidence files are just “globbed up”, and the pipeline tries to load this one as a text file…

Thon
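For anyone reading these input files outside the pipeline, one way to cope with mixed compression is to pick the codec from the file's leading bytes rather than from its name or configuration. The sketch below is a standalone Python illustration, not the ETL's own code; the helper names `detect_compression` and `open_evidence_file` are hypothetical. It relies only on the standard gzip magic number (`1f 8b`) and the bzip2 signature (`BZh`).

```python
import bz2
import gzip


def detect_compression(head: bytes) -> str:
    """Classify a file from its first bytes as 'gzip', 'bzip2' or 'plain'."""
    if head[:2] == b"\x1f\x8b":  # gzip magic number
        return "gzip"
    if head[:3] == b"BZh":  # bzip2 stream signature
        return "bzip2"
    return "plain"


def open_evidence_file(path: str):
    """Open a possibly compressed text file for line-by-line reading."""
    with open(path, "rb") as fh:
        kind = detect_compression(fh.read(3))
    if kind == "gzip":
        return gzip.open(path, "rt", encoding="utf-8")
    if kind == "bzip2":
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```

Checking the content instead of the extension means a file would still be read correctly if it were renamed or listed without its codec in a configuration file.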
