Variant table and VEP annotations

Ehsan_Khajouei · 1 August 2025 13:53

I want to prioritize the VEP annotations in the variant table, with the final goal of keeping only one associated gene/transcript combination for each variant. The definitions of the table columns are not very clear. Is there any way for me to work out which gene/transcript pair is the most important combination according to OT's criteria?

When you run the VEP command in a terminal, you can add flags (e.g. --pick --pick_order tsl,appris,rank) to prioritize the annotated gene/transcript. However, I cannot find which flags were originally used to build OT's variant table, and the column definitions are very concise. There is a column called “transcriptConsequences”, and within it there is “transcriptIndex”, which I initially took to be a priority index, but the definition is ambiguous.

I would appreciate your input and help with this, thanks!

irene · 1 August 2025 14:32

Hi @Ehsan_Khajouei, and welcome to our Community!

Here is the VEP query we use to populate our variant index. The transcript information found in transcriptConsequences is sorted based on two factors: the predicted functional impact (from most to least severe) and the distance from the gene’s footprint. This means the first item in the list corresponds to the closest gene with the most severe predicted consequence. If you’re looking to extract one gene per variant and this criterion fits your needs, you can use this first index to filter the object.
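As a minimal sketch of that filtering step, assuming each variant record has been loaded as a Python dict whose transcriptConsequences list is still in the order described above: the helper name top_transcript, the toy record, and the identifiers in it are made up for illustration, and the field names targetId and transcriptId are assumptions about the schema.

```python
from typing import Optional


def top_transcript(variant: dict) -> Optional[dict]:
    """Return the first transcript consequence of a variant, or None.

    Assumes the list is already sorted as described in the post
    (most severe consequence first, then closest gene footprint).
    """
    consequences = variant.get("transcriptConsequences") or []
    return consequences[0] if consequences else None


# Toy record with made-up identifiers; the field names are assumptions.
variant = {
    "variantId": "1_1000_A_G",
    "transcriptConsequences": [
        {"targetId": "ENSG_A", "transcriptId": "ENST_A1", "transcriptIndex": 1},
        {"targetId": "ENSG_B", "transcriptId": "ENST_B1", "transcriptIndex": 2},
    ],
}

best = top_transcript(variant)
print(best["targetId"], best["transcriptId"])
```

Taking the first list element, rather than filtering on a particular transcriptIndex value, avoids having to know whether the index starts at 0 or 1.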

Thank you for your feedback! We’ll update the column description to be more informative. I hope this is helpful!

irene · 1 August 2025 14:36

FYI the consequence scores are manually defined by us in Gentropy: see gentropy/src/gentropy/config.py in the opentargets/gentropy repository on GitHub (commit 161159fe4aa087d9b8f9b03169aa979462fde7d9).

Ehsan_Khajouei · 1 August 2025 16:29

Hi @irene and thanks very much for your reply!

Just to make sure: if I use the first index within the expanded column “transcriptconsequences_transcriptIndex”, do I get the transcript with the most severe consequence that is also closest to the gene’s footprint?

Thanks for sharing the links as well.

irene · 1 August 2025 17:21

The exact behaviour is that transcripts are ordered by consequence score first; when multiple transcripts have the same score, the tie is broken by the distance to the gene’s footprint. The approach is, therefore, to prioritise functional impact over proximity.
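A rough sketch of that two-level ordering, for anyone who wants to re-rank transcripts themselves. This is not the Gentropy implementation: the function name order_transcripts and the toy values are made up, the field names consequenceScore and distanceFromFootprint are assumptions about the schema, and the code assumes that a higher score means a more severe consequence.

```python
def order_transcripts(consequences: list) -> list:
    """Sort by consequence score (descending), then by footprint distance (ascending).

    Assumption: a higher consequenceScore means a more severe consequence.
    """
    return sorted(
        consequences,
        key=lambda tc: (-tc["consequenceScore"], tc["distanceFromFootprint"]),
    )


# Toy values, not real annotations.
consequences = [
    {"transcriptId": "ENST_far_severe", "consequenceScore": 0.66, "distanceFromFootprint": 500},
    {"transcriptId": "ENST_near_mild", "consequenceScore": 0.10, "distanceFromFootprint": 0},
    {"transcriptId": "ENST_near_severe", "consequenceScore": 0.66, "distanceFromFootprint": 0},
]

ranked = [tc["transcriptId"] for tc in order_transcripts(consequences)]
print(ranked)
```

In this toy input the two transcripts with the higher score come first, and the distance only decides the order between them, so a nearby transcript with a mild consequence still ranks last.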
