What is the backend database system that is used for open targets data?

wenliang · 29 November 2021 01:44

Hello, Open Target Community,

I am pretty new to Open Targets. I am fascinated by how Open Targets can host such big data, growing in both size and variety, while still providing fast queries. I am very interested in knowing more about the platform infrastructure. I found some information on the "Platform infrastructure" page of the Open Targets Platform Documentation, but it looks somewhat abstract to me. Is the backend a mixed implementation of Elasticsearch and ClickHouse? And how do GraphQL and Google BigQuery interact with the backend? I have some familiarity with Elasticsearch, but none with ClickHouse. Is there a good introductory tutorial with more details on how the system works and how the infrastructure is set up? Thanks a lot!

ahercules · 9 December 2021 16:45

Hello @wenliang!

Welcome to the Open Targets Community!

My apologies for the delay in responding to your post.

Yes, we use both ElasticSearch and ClickHouse to power the Open Targets Platform. ElasticSearch contains data related to our main entities – targets, diseases/phenotypes, and drugs – along with our target-disease evidence. ClickHouse contains our target-disease associations data and this allows us to explore opportunities to provide on-the-fly scoring for target-disease associations.

Our GraphQL API exposes various endpoints that are used by our front-end web interface.
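To make that a bit more concrete, here is a minimal sketch of calling a GraphQL API from Python with only the standard library. The endpoint URL, the `target` query, and its field names are assumptions based on the public schema and are not stated in this thread; `build_payload` and `fetch_target` are hypothetical helper names. Please check the API documentation before relying on any of them.

```python
import json
import urllib.request

# Assumption: public GraphQL endpoint; not stated in this thread.
API_URL = "https://api.platform.opentargets.org/api/v4/graphql"

# Assumption: query and field names follow the public schema and may change.
TARGET_QUERY = """
query target($ensemblId: String!) {
  target(ensemblId: $ensemblId) {
    id
    approvedSymbol
  }
}
"""


def build_payload(ensembl_id: str) -> dict:
    """Build the JSON body for a GraphQL POST request."""
    return {"query": TARGET_QUERY, "variables": {"ensemblId": ensembl_id}}


def fetch_target(ensembl_id: str, url: str = API_URL) -> dict:
    """POST the query to the endpoint and return the decoded JSON response."""
    body = json.dumps(build_payload(ensembl_id)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_target("ENSG00000157764")` would send a single request for one target. The web interface issues the same kind of request, just with larger queries.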

We also host our data in Google BigQuery to support users that want to use SQL to answer more complex and systematic queries.
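As a rough illustration of the SQL route, the sketch below builds a parameterised query for the top-scoring targets of one disease and runs it with the `google-cloud-bigquery` client. The table path and the column names (`targetId`, `diseaseId`, `score`) are assumptions and may not match the release you are using; `build_association_query` and `run_association_query` are hypothetical helper names.

```python
# Assumption: placeholder table path; check the dataset for your release.
ASSOCIATIONS_TABLE = "open-targets-prod.platform.associationByOverallDirect"


def build_association_query(table: str = ASSOCIATIONS_TABLE, limit: int = 10) -> str:
    """Return SQL selecting the highest-scoring targets for one disease."""
    return (
        "SELECT targetId, diseaseId, score\n"
        f"FROM `{table}`\n"
        "WHERE diseaseId = @disease_id\n"
        "ORDER BY score DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_association_query(disease_id: str, limit: int = 10) -> list:
    """Run the query in BigQuery; needs google-cloud-bigquery and credentials."""
    # Imported here so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("disease_id", "STRING", disease_id)
        ]
    )
    sql = build_association_query(limit=limit)
    rows = client.query(sql, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

Passing the disease identifier as a query parameter rather than formatting it into the SQL string avoids quoting problems when looping over many diseases.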

You can learn more about the technical aspects of the Platform in our infrastructure documentation which contains relevant links to our GitHub repositories.

Feel free to comment below if you have any further questions.

Cheers,

~ Andrew
