Antibody tractability pipeline and results issues

Dear all,

Digging deeper into the tractability pipeline and its output file, I was surprised by the number of hits in the “Antibody pipeline buckets” (it seems rather large, and not in line with what has been reported).

Unfortunately, a quick look at the list raised a red flag for me. Let me explain:

One example (among many others) is PDE9A, a classic cytoplasmic phosphodiesterase.

There are actually three “hits” reported for it across the antibody tractability buckets: Bucket #4, #5 and #9.

If I’m not mistaken, #4 is “Uniprot Loc”. UniProt describes PDE9A as cytoplasmic and associated with membrane ruffles (i.e. fully intracellular; UniProt is correct on this one). That should not earn a point in bucket #4.

The next point concerns bucket #9, i.e. HPA:

Clearly enough, HPA reports “Located in Nucleoplasm, Plasma membrane, Cytosol (Single cell variability)”, hum hum... everywhere. OK, on this one HPA is simply wrong: plasma membrane, yes indeed, but not extracellular. (That is the downside of HPA. Given its considerable bias, I don’t think it should even have its own bucket in the Abs target assessment, but that is another matter for debate :blush: )

I could not check Bucket #5, as it is “(GO CC) Targets with GO CC terms indicative for plasma membrane, extracellular region/matrix, or secretion - high confidence”. I don’t know how to access such data directly, so I cannot comment on it.

Another example, if you would like: PRKCA. The same buckets are flagged (4-5-9), with the same problem in bucket 9, but bucket 4 is ‘worse’: UniProt reports it as plasma membrane associated, but again there is no distinction between extracellular and intracellular (and the answer is: intra...). In terms of Abs drug development/tractability, that’s a huge difference.

The last example is HDAC8 (Histone deacetylase 8): this one gets only one point, in bucket #5. I would love to understand why (as it’s obviously wrong), if you could point me in the right direction to understand bucket 5, please :blush:

Thanks a lot for the discussion!
Best,
Nicolas

Hi @Nicolas! :wave:

Thank you for your feedback on the tractability antibody workflow. We are aware that there are some limitations of the underlying data for the high and medium/low confidence buckets. To support further research into the suitability of a given target for antibody development, we make the data file generated by the tractability pipeline available for download — see tractability_buckets-2021-06-03.tsv

I would also recommend reviewing the antibody workflow pipeline code as it details how the bucket 5 assessments are made. You can also clone the repository and adjust the filters to run the pipeline and generate your own assessments.
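As a rough sketch of how the downloaded file could be inspected for specific targets, something like the following may help. It is only an illustration: the `symbol` and `Bucket_9_ab` column names are assumptions (only `Bucket_4_ab` and `Bucket_5_ab` are named in this thread), the helper `flagged_ab_buckets` is hypothetical, and the inline sample simply mirrors the flags reported above rather than actual file contents. The real file may also encode flags differently (for example as `1.0`), so please check the header and values first.

```python
import csv
import io

# Hypothetical excerpt standing in for tractability_buckets-2021-06-03.tsv.
# "symbol" and "Bucket_9_ab" are assumed column names; the 0/1 values only
# mirror the bucket hits described in this thread.
SAMPLE_TSV = (
    "symbol\tBucket_4_ab\tBucket_5_ab\tBucket_9_ab\n"
    "PDE9A\t1\t1\t1\n"
    "PRKCA\t1\t1\t1\n"
    "HDAC8\t0\t1\t0\n"
)


def flagged_ab_buckets(handle, symbols):
    """Return {symbol: [antibody bucket columns whose value is "1"]}."""
    wanted = set(symbols)
    result = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        if row["symbol"] in wanted:
            result[row["symbol"]] = [
                col
                for col, val in row.items()
                if col.startswith("Bucket_") and col.endswith("_ab") and val == "1"
            ]
    return result


hits = flagged_ab_buckets(io.StringIO(SAMPLE_TSV), ["PDE9A", "HDAC8"])
for symbol, buckets in hits.items():
    print(symbol, buckets)
```

To run this against the real download, the `io.StringIO(SAMPLE_TSV)` argument could be swapped for `open("tractability_buckets-2021-06-03.tsv", newline="")`, after adjusting the column names to match the file header.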

Cheers,

Andrew :slight_smile:

Hi Andrew,

Many thanks for your answer.
By the way, one thing I would like to emphasize is that my comments are truly meant to help (one of my mentors, when asked why we are called doctors of philosophy, simply answered that philosophy is a search for truth :blush: ), and my ‘challenge’ is by no means meant to undermine the super-great work the team is doing, but to actually contribute.

That said, yes, we will indeed look at the pipeline and see whether we can (or cannot :blush: ) circumvent the false positives. In any case, at least it’s “inclusive”; it will ‘just’ need some curation work after running the filter!

Best,

Nicolas

Hi @Nicolas,

thank you very much for your comments. If I may add something to Andrew’s reply: you can also see the subcellular location annotations from GO on the Uniprot page.

For example, for PDE9A, if you go to the Subcellular location section, you can see two tabs: one with Uniprot annotation, while the other one contains the GO one you were looking for.

Thanks again for your notes, we will bear them in mind!

Best,
Irene

Hi Everyone,
I have a question about the Antibody (AB) workflow too.
If AB bucket 4 (Uniprot) or 5 (GO) is the highest assigned bucket, then the Predicted_Tractable_ab_High_confidence category score is calculated as 0.7 * [‘Bucket_4_ab’] + 0.3 * [‘Bucket_5_ab’].
My question is why does OT assign 0.7 to Uniprot and 0.3 to GO? Is Uniprot more reliable than GO in this sense?
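To make sure I am reading the formula correctly, here is a minimal sketch of how I understand it. The function name `high_confidence_ab_score` is my own (hypothetical), and I am assuming both bucket values are 0/1 flags; it does not model the "highest assigned bucket" condition, only the weighted sum itself.

```python
def high_confidence_ab_score(bucket_4_ab, bucket_5_ab):
    """Weighted sum as quoted in this post: 0.7 for the Uniprot
    bucket (4) plus 0.3 for the GO bucket (5).

    Assumes both inputs are 0/1 flags; the function name is illustrative
    and is not taken from the pipeline code.
    """
    return 0.7 * bucket_4_ab + 0.3 * bucket_5_ab


# The three possible non-zero cases under the 0/1 assumption.
uniprot_only = high_confidence_ab_score(1, 0)
go_only = high_confidence_ab_score(0, 1)
both = high_confidence_ab_score(1, 1)
print(uniprot_only, go_only, both)
```

With these weights, a target supported only by GO (as HDAC8 appears to be in the thread above) would score lower than one supported only by Uniprot, which is the asymmetry I am asking about.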

Kind regards,
Csaba

This is the logic the team who developed the tractability pipeline used. I would agree with this formula, based on how the 2 resources are defined.

Gene Ontology would not produce an assessment of the most likely subcellular location of a protein. It just keeps adding annotations based on the availability of evidence.

For BRCA1, for example, you can find Gene Ontology evidence pointing to the cytoplasm as a cellular component (an unlikely location).

Uniprot would instead provide a curated assessment, through manual inspection of the evidence, that would discard the membrane as a likely location for BRCA1.

More info in the BRCA1 profile page