How to get ICD codes for diseases?

Hi! Where in the data downloads can I find the ICD 9/10 codes for the known diseases?
(I can see the ICD codes on the website, but I can't find where they are in the different data downloads. I want to link them downstream to the UK Biobank to get the prevalence of the diseases, but I need the medical codes first.)

Hi dofer, to get the codes you’re looking for:


Great, thanks very much!
(Now to get it working with the parquets/pandas).

This should get you most of the way there:

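# python 3.10.6
# pandas 1.4.4
import pandas as pd

path = 'some path to files'
# Match cross-references that start with ICD9 or ICD10.
# Use a group here: a character class like [9,10] matches any single one of '9', ',', '1', '0'.
ref_filter = r"^ICD(?:9|10).+"
# Load only the disease ID and its list of cross-references.
raw_df: pd.DataFrame = pd.read_parquet(path, columns=['id', 'dbXRefs'])
# One row per (id, cross-reference) pair.
id_and_xref_df: pd.DataFrame = raw_df.explode('dbXRefs').astype('string')
# Keep only the ICD rows; na=False treats missing cross-references as non-matches.
id_and_icd_df: pd.DataFrame = id_and_xref_df[
    id_and_xref_df["dbXRefs"].str.match(ref_filter, na=False)]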
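
If it helps to see what the explode-and-filter step does before pointing it at the real files, here is a small sketch on a toy table. The disease IDs are made up, and it assumes the cross-references look like `PREFIX:code` (for example `ICD10:E11`), which is worth checking against the actual data. The `system` and `code` columns are names introduced for this example only.

```python
import pandas as pd

# Toy stand-in for the diseases dataset; IDs and cross-references are illustrative only.
toy_df = pd.DataFrame({
    "id": ["EFO_A", "EFO_B", "EFO_C"],
    "dbXRefs": [
        ["ICD10:E11", "MESH:D003924", "ICD9:250"],
        ["UMLS:C0000001"],
        None,  # a disease with no cross-references at all
    ],
})

# One row per (id, cross-reference) pair, with both columns as strings.
exploded = toy_df.explode("dbXRefs").astype("string")

# Keep rows whose cross-reference starts with ICD9 or ICD10; missing values count as non-matches.
is_icd = exploded["dbXRefs"].str.match(r"^ICD(?:9|10)", na=False)
icd_df = exploded[is_icd].reset_index(drop=True)

# Assumption: the value is "PREFIX:code", so split once on the first colon.
parts = icd_df["dbXRefs"].str.split(":", n=1, expand=True)
icd_df["system"] = parts[0]
icd_df["code"] = parts[1]

print(icd_df)
```

Having the coding system and the code in separate columns should make a later join against another table of ICD codes easier, but how the codes are formatted on the other side is something to verify there.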

The FTP link also needs to be updated to: http://ftp.ebi.ac.uk/pub/databases/opentargets/platform/22.11/output/etl/parquet/diseases/


Oh, thanks! That just saved me even more time :)