Hi, I run the DACC for the Impact of Genomic Variation on Function (IGVF) consortium.
It has been suggested that the IGVF extend the Open Targets software rather than build our own giant genomics db from scratch.
To do this, we would need to create a full clone of your entire software stack, ideally on AWS.
As far as I know, EBI only provides support to members of the consortium, but our company, The Hyve, is more than happy to help you with this. We have set up Open Targets for multiple clients. See https://www.thehyve.nl/services/open-targets for more information.
As @Sjoerd mentioned, we don’t provide support for particular instances. However, we want to make it as easy as possible for others to spin up their own versions, and we adapt our stack to make this possible. In the open source spirit, we also welcome any contributions in the form of code or documentation.
If you want to share your particular use case, I’m sure the community could help you to make some progress and you can decide whether you need additional services.
Indeed, I was able to implement the OT Platform internally here, to combine our own evidence data with the data provided by OT. The GitHub repos contain most of the information you need, although if you want to install OTP locally and not use Google Cloud as OT does, you are going to have to do some wrangling, since there are a lot of assumptions that the deployment is on Google…
It can definitely be done (as @Sjoerd and his company The Hyve show), but it is not an easy lift… Let me know if you want to chat about what it will take and I can show you…
Maybe start by laying out which GitHub repositories are important for which functions; possibly this is buried in a help doc somewhere, but I found the overall organization hard to follow.
Our use case is essentially a massive extension of what is in Open Targets Genetics. Specifically, we would add LD blocks for various ancestry groups and, most importantly, definitions of “cCREs” (candidate cis-regulatory elements), defined either computationally or via assays such as ATAC-seq + histone ChIP-seq etc. These elements, genes, and variants are then all linked by other sets of experiments, including full gene regulatory networks.
Open Targets Genetics gives us a useful baseline (variants, genes, GWAS), but we would extend it to specific regulatory interactions and use it to house the experimental results and computational predictions from our IGVF consortium.
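To make the shape of that extension concrete, here is a minimal sketch in Python of elements linked to genes and looked up by variant position. Every class, field, and identifier below is hypothetical; this is not the Open Targets Genetics schema or an IGVF schema, and LD blocks are left out to keep it short.

```python
from collections import defaultdict
from dataclasses import dataclass


# All names below are illustrative assumptions, not an existing schema.
@dataclass(frozen=True)
class Variant:
    variant_id: str
    chrom: str
    pos: int


@dataclass(frozen=True)
class CCRE:
    ccre_id: str
    chrom: str
    start: int   # inclusive
    end: int     # inclusive
    source: str  # e.g. "ATAC-seq" or "computational"


@dataclass(frozen=True)
class RegulatoryLink:
    ccre_id: str
    gene_id: str
    experiment: str  # assay or prediction method supporting the link
    score: float


class RegulatoryGraph:
    """In-memory store of cCREs and the cCRE-to-gene links between them."""

    def __init__(self):
        self.ccres = {}
        self.links_by_ccre = defaultdict(list)

    def add_ccre(self, ccre):
        self.ccres[ccre.ccre_id] = ccre

    def add_link(self, link):
        self.links_by_ccre[link.ccre_id].append(link)

    def ccres_overlapping(self, variant):
        # Linear scan; a real store would use an interval index.
        return [
            c for c in self.ccres.values()
            if c.chrom == variant.chrom and c.start <= variant.pos <= c.end
        ]

    def genes_for_variant(self, variant):
        # Collect every gene linked to any cCRE the variant falls inside.
        gene_ids = set()
        for ccre in self.ccres_overlapping(variant):
            for link in self.links_by_ccre.get(ccre.ccre_id, []):
                gene_ids.add(link.gene_id)
        return gene_ids
```

In a real deployment these would be tables or indices alongside the existing variant and gene data rather than Python objects; the sketch only shows the variant to element to gene traversal.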
By any chance, do you remember how much RAM you needed to run the OT tractability utility? It would be kind of you to share your suggestion for the memory needed to deploy the OT Platform on a personal computer.
I have scaled my input from 2 Ensembl IDs to 64 IDs, and the problem is the same. This suggests the issue is not the input; rather, the pipeline needs a little tweaking in my case.
Thanks for noting you are finding it hard to reproduce the tractability pipeline. I have passed this comment to our colleagues to see what is going on.
I was able to run the tractability pipeline, but only with much more RAM.
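For anyone who wants to put a number on the memory requirement, one option is to wrap the pipeline command and read the peak resident set size afterwards. The sketch below uses only the Python standard library and works on Unix-like systems; the function name is made up for illustration, and the command you pass in is whatever you normally use to launch the pipeline.

```python
import resource
import subprocess
import sys


def run_and_report_peak_rss(cmd):
    """Run cmd and return (exit_code, peak_rss_mib) for child processes.

    The peak is the largest resident set size among child processes
    that have been waited for so far in this Python session.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__":
    # Stand-in workload; replace with the actual pipeline command.
    code, peak = run_and_report_peak_rss(
        [sys.executable, "-c", "x = bytearray(50_000_000)"]
    )
    print(f"exit code {code}, peak RSS about {peak:.0f} MiB")
```

Running it on a small input and then a larger one gives a rough idea of how memory grows before committing to a bigger machine.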
I am trying to build Open Targets from the command line using the scripts kept on GitHub for each pipeline.
I do not want to use a Google Storage bucket and would like to use a local server for development. The pipelines of interest to me are tractability, safety, Chemical probes & TEPs, Baseline expression, and Molecular interactions, as I wish to customize the tool for our current use case.
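One generic piece of that wrangling is pointing configuration values at local paths instead of `gs://` URIs. The following is a minimal sketch under stated assumptions: it assumes the configuration has already been loaded into plain Python dicts and lists, the bucket name and local root are invented, and it does not reflect how any particular Open Targets pipeline reads its config. Code that calls the Google Cloud client directly would still need separate changes, and the data itself has to be copied locally first.

```python
import posixpath

GS_PREFIX = "gs://"


def gs_to_local(value, local_root):
    """Map gs://bucket/key to <local_root>/bucket/key; leave other strings alone."""
    if not value.startswith(GS_PREFIX):
        return value
    return posixpath.join(local_root, value[len(GS_PREFIX):])


def localize_config(config, local_root):
    """Recursively rewrite every gs:// string in a nested dict/list config."""
    if isinstance(config, dict):
        return {k: localize_config(v, local_root) for k, v in config.items()}
    if isinstance(config, list):
        return [localize_config(v, local_root) for v in config]
    if isinstance(config, str):
        return gs_to_local(config, local_root)
    return config  # numbers, booleans, None are returned unchanged


if __name__ == "__main__":
    # Hypothetical config; the bucket and keys are made up.
    example = {
        "input": "gs://example-bucket/tractability/input.tsv",
        "outputs": ["gs://example-bucket/out/", "report.html"],
        "threads": 4,
    }
    print(localize_config(example, "/data/ot"))
```

Keeping the bucket name as the first directory under the local root makes it easy to mirror several buckets side by side without key collisions.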