Binned value of expression - difficulty replicating distribution

If you download the binned gene expression data from Open Targets, the values are distributed across the bins, with the majority in bins 1-4, some in bin 5, and a small number in the higher bins. If I try to replicate this binning strategy with other data from Expression Atlas, I find that nearly all the values are in bin 1.

Reading the docs on Computational pipelines and datasets (RNA expression meta-analysis), I can see that the normalisation used might be different (I don’t think iRAP uses RUV), but I would not expect that to change the distribution so dramatically, particularly since I’m comparing Expression Atlas TPMs to these TPMs.

The only way I can make the distribution comparable is to log transform (log2) the expression values. I can see no mention of such a transformation in the docs - am I missing this somewhere? Do you log transform the expression data?
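
To illustrate the effect, here is a minimal sketch on synthetic data. It is not the Open Targets pipeline: the equal-width 10-bin scheme, the `log2(x + 1)` pseudocount, the log-normal "TPM-like" values, and the helper names `equal_width_bins` and `bin_counts` are all assumptions made for illustration only.

```python
import math
import random


def equal_width_bins(values, n_bins=10):
    """Assign each value a 1-based bin index using equal-width bins over min..max.

    This binning scheme is an assumption for illustration, not the documented
    Open Targets strategy.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [1] * len(values)
    # The maximum value would land in bin n_bins + 1, so clamp it to n_bins.
    return [min(int((v - lo) / width) + 1, n_bins) for v in values]


def bin_counts(bins, n_bins=10):
    """Count how many values fall into each bin (index 0 holds bin 1)."""
    counts = [0] * n_bins
    for b in bins:
        counts[b - 1] += 1
    return counts


random.seed(0)
# Synthetic, right-skewed values standing in for TPMs (not real expression data).
tpm = [random.lognormvariate(1.0, 2.0) for _ in range(10_000)]

raw_counts = bin_counts(equal_width_bins(tpm))
log_counts = bin_counts(equal_width_bins([math.log2(v + 1) for v in tpm]))

print("raw TPM bins :", raw_counts)
print("log2 TPM bins:", log_counts)
```

With heavily right-skewed values like these, a few large values stretch the range so that almost everything lands in bin 1 on the raw scale, while the log2-transformed values spread over several bins. That matches the pattern I am seeing, but it does not show which transformation, if any, the pipeline actually applies.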

Any input would be much appreciated & apologies if I have missed something in the documentation!

Hi Grace,

You’re correct in identifying that this process isn’t particularly clear. Our baseline expression pipeline has remained untouched in many ways for several years now and we are in the process of revamping it to update the data, code and documentation (amongst other things!) so that the replication process is clearer to follow. I would expect this to be live later this year but cannot offer any promises.

Until that time we cannot offer further assistance here unfortunately - apologies.

Kind regards,
Tobi

Hi Tobi,

Excited to see the updates later in the year :slight_smile:

Is there no way you can confirm or deny whether the expression data is log-transformed at any stage in the current process?

Many thanks,
Grace

Unfortunately not, I don’t have access to the code which generated the data. This is a large motivator for the refactoring. By the end of the year the whole process should be open source and clearly documented.

Sorry that I cannot help further in this regard.

Best,
Tobi

Ok thanks Tobi, I appreciate the candor.

Best of luck with the refactor!

BW,
Grace