How to assess how valid data on protein expression is?

Roman · 24 July 2023 13:50

Hi everyone,

After setting up our API, we found several values concerning expression patterns that we can’t understand. I didn’t find any explanation in the documentation, so I wanted to ask here: how are the RNA and protein levels calculated, and can we use them to rank targets by likelihood of expression?

'rna': {'zscore': -1, 'value': 0, 'unit': '', 'level': -1},
'protein': {'reliability': True,
'level': 2,

If I missed it in the documentation, I would greatly appreciate a link to the relevant data parameters.

Kind regards,
Roman

Kirill_Tsukanov · 25 July 2023 08:45

Hi Roman,

First of all, here is the general explanation as to how the existing baseline expression dataset was produced: Baseline expression - Open Targets Platform Documentation

Now about the specific fields you see in the API. For RNA expression:

value is a normalised TPM (transcripts per million) count for all transcripts of a given gene in a given tissue
level is a bin number (1 to 10), which is mentioned in the documentation above as “Binned value of expression”. If the value is -1, it means expression was lower than a threshold, and it was discarded
zscore is the tissue specificity score, which is mentioned in the documentation above as “Tissue specificity”
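
To make the RNA fields concrete, here is a minimal Python sketch that reads one record shaped like the snippet in the question. The function name `describe_rna` is hypothetical and only for illustration; the field meanings are the ones listed above.

```python
def describe_rna(rna: dict) -> str:
    """Summarise an RNA expression record using the field meanings listed above."""
    level = rna["level"]
    if level == -1:
        # A level of -1 means expression was below the threshold and was discarded.
        return "below threshold (discarded)"
    return (
        f"bin {level} of 10, normalised TPM {rna['value']}, "
        f"tissue specificity z-score {rna['zscore']}"
    )


# The RNA record from the question.
record = {"zscore": -1, "value": 0, "unit": "", "level": -1}
print(describe_rna(record))
```

Checking for the -1 sentinel first avoids treating a discarded measurement as if it were the lowest bin.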

For protein expression:

level is a categorical variable: 0 - Not detected/below threshold, 1 - Low expression, 2 - Medium, 3 - High
reliability is a technical flag passed on from the HPA data which reflects whether the value in the level field is reliable enough. You can discard values with `reliability = False`

So, in conclusion: yes, you could use the “level” field to rank targets by expression in a given tissue, but just keep in mind that this field has different ranges for RNA and protein expression.
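
As a rough illustration of such a ranking for protein expression, here is a minimal Python sketch. It assumes you have already collected one record per target for a single tissue, shaped like the snippet in the question; the target names and the helper `rank_by_protein_level` are hypothetical.

```python
def rank_by_protein_level(records: dict) -> list:
    """Return (target, level) pairs sorted from highest to lowest protein level.

    Entries whose reliability flag is not True are dropped before ranking.
    """
    reliable = [
        (target, expr["protein"]["level"])
        for target, expr in records.items()
        if expr["protein"].get("reliability") is True
    ]
    return sorted(reliable, key=lambda pair: pair[1], reverse=True)


# Hypothetical records for one tissue (protein levels: 0 = not detected ... 3 = high).
example = {
    "TARGET_A": {"protein": {"reliability": True, "level": 2}},
    "TARGET_B": {"protein": {"reliability": False, "level": 3}},
    "TARGET_C": {"protein": {"reliability": True, "level": 3}},
    "TARGET_D": {"protein": {"reliability": True, "level": 0}},
}

ranking = rank_by_protein_level(example)
for target, level in ranking:
    print(target, level)
```

The same idea would apply to RNA, except that the level there runs from 1 to 10 and -1 marks a discarded value, so the two rankings should not be compared on the same scale.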

Finally, in case you are interested in the fine technical details, here is the source code of the module which produces these datasets: GitHub link

Roman · 25 July 2023 08:57

Thank you very much!
