Understanding the data available online

interleukin · 24 February 2025 17:05

Dear community,

I am trying to understand the data available online (specifically v2g_scored) and extract useful information from it. While working on this, a few questions came up:

1. I can infer that source_list and source_score_list are interconnected. However, what exactly do the numbers in source_score_list represent?

row	overall_score	source_list	source_score_list
0	0.046479	[canonical_tss]	[0.7]
1	0.086519	[vep, canonical_tss]	[0.1, 1.0]

2. A genetic locus can appear several times, with each row capturing different information. For example, the variant at position 958339 (ref_allele = G, alt_allele = A) appears twice with source_id = canonical_tss, but the values in the other columns differ between the two rows. Why does this discrepancy occur? Please check the cells highlighted in blue.

3. Assume the following portion of the data:

position	ref_allele	alt_allele	d	type_id	source_id	source_list	source_score_list	feature
958339	G	A	315525.0	distance	canonical_tss	[jung2019, javierre2016, canonical_tss]	[0.0, 0.6, 0.4]	unspecified
958339	G	A	8143.0	distance	canonical_tss	[eqtl, canonical_tss]	[0.9, 1.0]	unspecified
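To make the duplication from question 2 concrete, here is a minimal sketch that groups rows by variant and lists the distances found for each one. The two rows are copied from the table above; the function name `group_by_variant` is only illustrative. If the full table has a gene identifier column (not shown in this excerpt), adding it to the key would show whether the repeated rows refer to different genes.

```python
from collections import defaultdict

# Two example rows copied from the excerpt above (only the columns used here).
rows = [
    {"position": 958339, "ref_allele": "G", "alt_allele": "A",
     "d": 315525.0, "source_id": "canonical_tss"},
    {"position": 958339, "ref_allele": "G", "alt_allele": "A",
     "d": 8143.0, "source_id": "canonical_tss"},
]


def group_by_variant(rows):
    """Group rows that share the same (position, ref_allele, alt_allele)."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["position"], row["ref_allele"], row["alt_allele"])
        groups[key].append(row)
    return dict(groups)


groups = group_by_variant(rows)
for key, members in groups.items():
    # Report how often each variant occurs and which distances it carries.
    distances = sorted(r["d"] for r in members)
    print(key, len(members), distances)
```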

Could you help me by confirming whether the following JSON format captures every information correctly?

[
  {
    "distance": 315525.0,
    "source_scores": {
      "javierre2016": 0.0,
      "jung2019": 0.6,
      "canonical_tss": 0.4
    }
  }
]
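For comparison, here is a minimal sketch of how such JSON could be generated instead of written by hand. It assumes that the i-th entry of source_score_list belongs to the i-th entry of source_list; that positional pairing is my assumption and not something I have seen confirmed. The helper name `row_to_record` is hypothetical.

```python
import json


def row_to_record(distance, source_list, source_score_list):
    """Pair each source with the score at the same position (assumed alignment)."""
    if len(source_list) != len(source_score_list):
        raise ValueError("source_list and source_score_list differ in length")
    return {
        "distance": distance,
        "source_scores": dict(zip(source_list, source_score_list)),
    }


# Values copied from the two rows in the table above.
records = [
    row_to_record(315525.0,
                  ["jung2019", "javierre2016", "canonical_tss"],
                  [0.0, 0.6, 0.4]),
    row_to_record(8143.0, ["eqtl", "canonical_tss"], [0.9, 1.0]),
]

print(json.dumps(records, indent=2))
```

Under that assumption, the first row gives jung2019 = 0.0 and javierre2016 = 0.6, so the order of the two lists matters when the keys are filled in by hand.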

Thank you very much in advance!

Aglaia

Xiangyu · 25 February 2025 11:45

Hi Aglaia,

Is this the full dataset? Are these rows for the same gene or different genes?

Best wishes,
Xiangyu
