Target class and "unclassified protein": missing protein class?

Hi,
Again, maybe a trivial question, sorry about that :grinning: (I do have so many…)
After doing a search using a disease, you get a certain number of targets (let’s say 10000, for example).
If you unfold the “Target class” menu (see below), the totals do not add up. After exporting and looking at a couple of examples, i.e. targets that are reported but appear in no target class category, I see they are indeed “non-classified” proteins with a kind of “unknown MOA”. But in that case, why aren’t they in the ‘unclassified protein’ target class? (Basically, why is it not adding up? :slight_smile:)
PS: the same question could be asked for “Pathway”; however, if the pathway is unknown, I understand the numbers not adding up, as there is no “unknown pathway” category.

Best!
Nicolas

Hi Nicolas! :wave:

Not to worry — please ask any and all questions. It helps us build a knowledge-base within the Community! :nerd_face: :open_book:

In the Target Classes filter on the associations page, we source that data directly from ChEMBL. We use the target classification data that they display on their gene profile page (e.g. EGFR). However, ChEMBL does not have all of the drug targets that the Open Targets Platform includes in our target index. And so some targets are returned on the associations page, but are not visible if you select any or all of the filters.

For example, ATP7B is associated with Wilson disease, but the target ATP7B does not appear in the ChEMBL database. If you select all of the options in the Target Classes drop-down menu, ATP7B will not be returned in the results. And so that leads to the mismatch between the overall number of targets returned when you first visit the Wilson disease associations page (365) and the number of targets visible when you select all available target classes filters (191).

Similarly, the Pathway Types filter uses data sourced directly from Reactome, which has curated gene pathways from literature. We integrate their pathway data and present it on the target profile page (e.g. ESR1).

Using the same example — targets associated with Wilson disease — if you select all available options in the Pathway Types drop-down menu, you will see a list of 312 targets. The other 53 targets (e.g. TTC7A, TGFB1, and C9orf72) do not have Reactome pathway data. And so they will not appear in the results, even when you select all of the options.
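To make the counting concrete, here is a minimal Python sketch of the filter behaviour described above. It is only an illustration, not how the Platform is implemented: the dictionary, the `filter_by_class` helper, the `TARGET_A` to `TARGET_C` names and their class labels are all hypothetical. It also assumes that "Unclassified protein" is itself a label that comes from ChEMBL, so a target with no ChEMBL entry never receives it. The only values taken from this thread are ATP7B having no class and the Wilson disease counts (365, 191, 312).

```python
# Toy data: target -> set of target-class labels.
# TARGET_A..C and their labels are hypothetical, for illustration only.
toy_target_classes = {
    "TARGET_A": {"Enzyme"},
    "TARGET_B": {"Enzyme", "Membrane receptor"},
    "TARGET_C": {"Unclassified protein"},  # classified upstream as "unclassified"
    "ATP7B": set(),  # per the thread: not in ChEMBL, so no class label at all
}


def filter_by_class(targets, selected):
    """Keep targets that carry at least one of the selected class labels."""
    return [name for name, classes in targets.items() if classes & selected]


# Selecting every available class still drops targets with no label.
all_classes = set().union(*toy_target_classes.values())
visible = filter_by_class(toy_target_classes, all_classes)
hidden = sorted(set(toy_target_classes) - set(visible))

# Counts quoted in the thread for Wilson disease.
total_targets = 365
with_target_class = 191
with_pathway = 312

without_target_class = total_targets - with_target_class  # 174
without_pathway = total_targets - with_pathway            # 53

print("visible with all classes selected:", visible)
print("hidden (no class label):", hidden)
print("Wilson disease targets without a class:", without_target_class)
print("Wilson disease targets without a pathway:", without_pathway)
```

In this sketch the "missing" targets are simply the difference between the full association list and the union of everything the filter can match, which is the same gap as the 365 vs. 191 and 365 vs. 312 counts above.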

Hopefully this answers your question — and as I said, please continue to post more questions as you explore our data! :grin:

Hi Andrew,
Crystal clear explanation as usual :slight_smile:
thanks
Nicolas