Availability of mapping files in BigQuery

For certain analyses and data integrations with Open Targets, we map genes/targets to Ensembl IDs. Some of the data download files on the FTP/GCP locations appear to drive the Platform application’s ability to resolve synonyms/terms to a target. It would be fantastic if this could be mirrored in a BigQuery table.

Is this something the team might be able to investigate?

This request was sent to the Open Targets helpdesk and has been posted here so answers can benefit the whole Community of users.

There is a BigQuery table in genetics, “genes”, which contains the Ensembl IDs (gene_id), HGNC gene names (gene_name) and other associated information for all genes in the Genetics Portal. If you need Ensembl IDs, you can join your own list of genes/targets against this table.
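As a rough illustration of the join described above, here is a minimal Python sketch that builds the SQL. The column names gene_id and gene_name come from the description of the “genes” table; the function name, the @symbols query parameter and the table path "my-project.genetics.genes" are placeholders I have made up, so substitute the real project and dataset. The case-insensitive match is also just an assumption about how you might want to compare names.

```python
def build_gene_lookup_sql(genes_table: str) -> str:
    """Build a query that maps user-supplied gene names to Ensembl IDs.

    genes_table: fully qualified `project.dataset.genes` path (placeholder here).
    The gene names are expected as an array query parameter named @symbols.
    """
    return (
        "SELECT s AS input_name, g.gene_id, g.gene_name\n"
        "FROM UNNEST(@symbols) AS s\n"
        # LEFT JOIN keeps inputs that have no match (gene_id will be NULL).
        f"LEFT JOIN `{genes_table}` AS g\n"
        "  ON UPPER(g.gene_name) = UPPER(s)"
    )


# Hypothetical table path: replace with the real project and dataset.
sql = build_gene_lookup_sql("my-project.genetics.genes")
print(sql)
```

The resulting string can be run from any BigQuery client, passing your gene names as the @symbols array parameter. Note that this only matches on gene_name, so it will not resolve synonyms or other identifiers.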

I was thinking of something more like the searchTargets files on the FTP for the Platform output. I believe these are used for the lookahead search on the Platform website, which resolves several identifiers/synonyms. We import these into our own BQ dataset, but it would be nice to have them in the public EU region BQ project (there are also searchDrugs and searchDiseases tables).