Downloading data for V2G pipeline

Dear Open Targets team,

First of all, I want to thank you for all your valuable work.

I am writing on behalf of the Bioinformatics of Genome Diversity research group at the Autonomous University of Barcelona, in Spain. For one of our projects, we are interested in assigning genes to specific variants.

We have seen that Open Targets Genetics offers a pipeline called Variant-to-Gene (V2G), whose output is exactly this assignment. We tried to get the data for this pipeline from the links under Data Download in the Open Targets Genetics documentation. Unfortunately, the FTP link does not take us to a page with data. We also tried the Google BigQuery link, where we can see the data, but we are not able to download it.

In Google BigQuery, we have seen that only three of the datasets would be needed for our goals: variants, variant_gene and genes.

Would it be possible to access these datasets in any other way?

Thank you!

Bioinformatics of Genome Diversity group

Dear Bioinformatics of Genome Diversity group,

Welcome to the Open Targets community and thank you for your question! It is interesting that the FTP link does not take you to a page with data. I would suggest downloading the V2G dataset directly using wget:

wget 'ftp://ftp.ebi.ac.uk/pub/databases/opentargets/genetics/latest/v2g/*'

This will download all the partitioned .parquet files into your working directory.
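
Once the files are downloaded, one way to inspect them is from Python. The following is only a minimal sketch, assuming pandas and a parquet engine such as pyarrow are installed and that the part files sit in a single directory. The helper names find_parquet_parts and load_parts are hypothetical and not part of any Open Targets tooling.

```python
from pathlib import Path


def find_parquet_parts(directory):
    """Return the .parquet part files in `directory`, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".parquet"
    )


def load_parts(directory, limit=None):
    """Read up to `limit` part files into one DataFrame (all parts if None)."""
    # Imported here so that listing files works even without pandas installed.
    import pandas as pd

    parts = find_parquet_parts(directory)[:limit]
    if not parts:
        raise FileNotFoundError(f"no .parquet files found in {directory}")
    # Each part file holds a slice of the rows; concatenate them row-wise.
    return pd.concat((pd.read_parquet(p) for p in parts), ignore_index=True)
```

For example, load_parts(".", limit=1) reads a single part from the working directory, which is a cheap way to check the column names before loading everything. The full dataset may not fit comfortably in memory, so it is worth starting with a small limit.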

Please let me know if additional information is required!

Best wishes,
Xiangyu