Filtration of the data ingested into the Open Targets Platform from Open Targets Genetics

Hi Open Targets Community,
I downloaded the evidence data from the FTP (22.11/output/etl/json/evidence/sourceId=ot_genetics_portal) to find variants associated with diseases of interest to us. However, the studies shown on the website and those in the downloaded data seem inconsistent.

For instance, the website shows an association between this study and COL1A1.


However, in the downloaded data, the variant 17_49868692_C_T linked to ‘Liver enzyme levels (alkaline phosphatase)’ is only associated with the nearest gene, TAC4.

Could anyone help explain the difference? Any questions or recommendations are also welcome.

My current hypothesis is that a prefiltering step (an L2G score threshold of 0.05) explains the difference, as 0.05 is also the minimum score in the local database. Any other comments are also welcome.

Hi @tzukuo! Welcome to the Open Targets Community :tada:

From my understanding, this is correct. As you saw in the documentation, the Open Targets Platform only includes associations where the L2G score is above a certain threshold.
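
If you would like to confirm the threshold in your own download, here is a minimal sketch. It assumes the evidence files are newline-delimited JSON and that the L2G score is stored in a field called `resourceScore`, with `variantId` and `targetId` alongside it. Those field names are my assumption, so please check them against the schema of your release. The helper names `iter_evidence_lines` and `summarize_l2g` are just for illustration.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Iterator, Optional, Tuple


def iter_evidence_lines(directory: str) -> Iterator[str]:
    """Yield raw lines from every *.json part file in a directory."""
    for part in sorted(Path(directory).glob("*.json")):
        with part.open(encoding="utf-8") as handle:
            yield from handle


def summarize_l2g(
    lines: Iterable[str],
    variant_id: str,
    score_field: str = "resourceScore",  # assumed name of the L2G score field
) -> Tuple[Optional[float], Dict[str, float]]:
    """Return the minimum score over all records and, for one variant,
    the highest score seen per target."""
    min_score: Optional[float] = None
    targets: Dict[str, float] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        score = record.get(score_field)
        if score is None:
            continue  # record carries no score, ignore it
        if min_score is None or score < min_score:
            min_score = score
        if record.get("variantId") == variant_id:
            target = record.get("targetId")
            targets[target] = max(score, targets.get(target, score))
    return min_score, targets


# Example usage (the path is the directory from the original post):
# lines = iter_evidence_lines("22.11/output/etl/json/evidence/sourceId=ot_genetics_portal")
# min_score, targets = summarize_l2g(lines, "17_49868692_C_T")
# print(min_score, targets)
```

If the minimum score comes out at the threshold and COL1A1 is missing from the targets for that variant, that would be consistent with the gene having been filtered out before ingestion.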

Let us know if you have any further questions!

Thank you for your kind feedback, @hcornu

Hi @tzukuo! Someone else had the same question, and you might find this response helpful:
