Hello!
In the Genetics Portal FAQs, you say the following about mapping conflicts:
Many multi-allelic sites can be assigned a single rsID, and some rsIDs can point to different positions in the genome.
This means that rsIDs are not unique to a single variant. We have mapped all rsIDs from GWAS Catalog to unique variants.
A small minority of rsIDs will map to multiple variant IDs (approximately 0.6% of lead variants). When this occurs, variants will be duplicated in the portal.
Where does the mapping from rsID to unique variants happen? I’ve looked at the variant annotation on GitHub, but I can’t see an explicit disambiguation process there.
To be clear, my question is: when a given rsID maps to more than one locus, how do you pick the single locus that this rsID is assigned to in the variant-index dataset?
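To make the case concrete, here is a minimal sketch of the situation I mean. The rsIDs and variant IDs are made up for illustration (they are not taken from the portal), the variant IDs only assume a chromosome_position_ref_alt pattern, and the helper name `ambiguous_rsids` is my own. It is not meant to represent how your pipeline works, only how I would spot the conflicting rsIDs in a table of mappings.

```python
from collections import defaultdict

# Hypothetical (rsID, variant ID) pairs; every identifier here is invented.
# Variant IDs are assumed to look like chromosome_position_ref_alt.
rows = [
    ("rsA", "1_1000_A_G"),
    ("rsB", "2_2000_C_T"),
    ("rsB", "2_2000_C_G"),  # multi-allelic site: same position, different alt allele
    ("rsC", "3_3000_G_A"),
    ("rsC", "7_7000_G_A"),  # same rsID pointing at a different position
]


def ambiguous_rsids(pairs):
    """Return {rsID: sorted variant IDs} for rsIDs linked to more than one variant ID."""
    by_rsid = defaultdict(set)
    for rsid, variant_id in pairs:
        by_rsid[rsid].add(variant_id)
    return {rsid: sorted(ids) for rsid, ids in by_rsid.items() if len(ids) > 1}


conflicts = ambiguous_rsids(rows)
for rsid, variant_ids in conflicts.items():
    print(rsid, "->", ", ".join(variant_ids))
```

In this toy table, rsB and rsC are the kind of rsIDs I am asking about: each is linked to two variant IDs, and I would like to know which rule decides what ends up in the variant index.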
This question was sent to the Open Targets helpdesk and has been anonymised.