Questions about GWAS association score aggregation

mujisaka · 27 May 2026 13:45

Dear Open Targets team,

I am currently working with the GWAS associations and L2G datasets from the Open Targets Platform and have a few questions regarding score aggregation.

1. How is the final GWAS association score calculated from multiple L2G predictions across different credible sets or GWAS studies?
   - Is it based on the maximum L2G score, summation, weighted aggregation, or another method?
   - [Screenshot 2026-05-27 at 15.44.54]
2. How are GWAS studies annotated with multiple diseases/traits handled?
   - For example, if one GWAS study is linked to several EFO terms, how are the GWAS associations assigned to each disease?
3. I also noticed that the L2G score shown in the main table sometimes differs slightly from the score displayed in the detailed SHAP contribution widget (e.g., 0.919 vs 0.913). Could you clarify the reason for this difference?

[Screenshot 2026-05-27 at 15.40.23]

Thank you for your help!

dsuveges · 27 May 2026 13:56

Hi,

Evidence scores for each data source are aggregated into a data-source-specific association score. We use a normalised harmonic sum for this aggregation. You can find more details on the process in the documentation here.

We do this for every data source, and each data source has its own way of defining its evidence score. For evidence derived from GWAS credible sets, the score is the locus-to-gene (L2G) score, as that value captures our confidence in the link between the genetic signal and the gene (docs are here). In this context, the strength of the association and the effect size, however important they are, don't really matter.
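As a rough illustration of the aggregation described above, here is a minimal Python sketch of a normalised harmonic sum over L2G scores. The function name is made up for this example, and dividing by π²/6 (the limit of 1 + 1/4 + 1/9 + ...) is an assumption about the normalisation; the Platform documentation is the authority on the exact constant and on any cap on the number of evidence items.

```python
import math


def normalised_harmonic_sum(scores):
    """Aggregate evidence scores with a normalised harmonic sum.

    Scores are ranked from highest to lowest and the i-th ranked score
    is divided by i**2, so the strongest evidence dominates and extra
    weak evidence adds little.
    """
    ranked = sorted(scores, reverse=True)
    raw = sum(score / (rank ** 2) for rank, score in enumerate(ranked, start=1))
    # Assumption: normalise by the theoretical maximum of the series when
    # every score is 1 and there are infinitely many of them (pi^2 / 6).
    return raw / (math.pi ** 2 / 6)


# Hypothetical L2G scores for one gene-disease pair across credible sets.
l2g_scores = [0.3, 0.9, 0.6]
print(round(normalised_harmonic_sum(l2g_scores), 3))
```

Under this scheme the result is neither the maximum nor a plain sum: the top-ranked L2G score contributes in full, and each further score is down-weighted by the square of its rank.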

Please let us know if you have further questions.

mujisaka · 28 May 2026 13:43

Hi,

Thanks for the quick response, really helpful!

Best,

Mujisaka

dsuveges · 28 May 2026 13:45

It was a very quick but very incomplete reply! Sorry, I missed the last two questions:

Sometimes the experimental design of a GWAS study implies multiple assigned diseases, e.g. when the genomes of patients with multiple types of cancer are compared with healthy genomes. You can see this reflected in the study schema. Associated loci from these studies are exploded, providing evidence for all assigned diseases.
We are looking into the L2G score discrepancy. I believe those numbers should be the same.
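To make the handling of multi-disease studies concrete, here is a small Python sketch of what "exploding" could look like. All field names and identifiers below are hypothetical placeholders rather than the actual study or evidence schema; the only point is that one credible set from a study with several assigned diseases yields one evidence record per disease, each carrying the same L2G score.

```python
def explode_by_disease(credible_sets):
    """Emit one evidence record per (credible set, assigned disease) pair."""
    evidence = []
    for credible_set in credible_sets:
        for disease_id in credible_set["disease_ids"]:
            evidence.append(
                {
                    "study_id": credible_set["study_id"],
                    "gene_id": credible_set["gene_id"],
                    "disease_id": disease_id,
                    # The same L2G score is reused for every assigned disease.
                    "score": credible_set["l2g_score"],
                }
            )
    return evidence


# Placeholder records: the first study is annotated with two diseases.
example = [
    {"study_id": "STUDY_1", "gene_id": "GENE_1", "l2g_score": 0.9,
     "disease_ids": ["DISEASE_A", "DISEASE_B"]},
    {"study_id": "STUDY_2", "gene_id": "GENE_1", "l2g_score": 0.4,
     "disease_ids": ["DISEASE_A"]},
]
evidence = explode_by_disease(example)
for record in evidence:
    print(record)
```

In this toy input, GENE_1 ends up with two evidence records for DISEASE_A (one from each study) and one for DISEASE_B, and each disease's records would then be aggregated separately.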

mujisaka · 28 May 2026 13:50

Thanks. I have checked the source data: the number shown in the SHAP widget matches the one in the Locus-to-Gene (L2G) prediction file (Open Targets Platform). Not sure if I did something wrong, but thanks for checking it!
