Questions regarding the accuracy of data in the Drugs and Clinical Candidates section

chen_lv · 18 May 2026 04:11

First of all, I am very grateful for the Open Targets tool, which has made my research much more convenient. While searching for drugs related to ulcerative colitis (UC), I found a problem: aspirin (acetylsalicylic acid) is listed for UC. Aspirin is a classic nonsteroidal anti-inflammatory drug (NSAID). After entering the human body, it may damage the intestinal mucosal barrier, and it is not a drug routinely used to treat UC. The ClinicalTrials.gov record linked from this entry does not mention any use of aspirin in the trial, so I think there may be a problem with the data collection here (as shown in the picture below).

I would be very grateful if you could answer my question.

Best regards,

Lvchen

Study Details | NCT00269438 | New Tablet Formulation and Dosing Regimen of Balsalazide Disodium in Mildly to Moderately Active Ulcerative Colitis | ClinicalTrials.gov

irene · 18 May 2026 11:39

Hi @chen_lv,

I am very happy that you find our data useful! I think you have encountered a curation issue: the intervention we extract from this clinical trial is `5 ASA, enemas, suppositories, corticosteroids` (you can see this when you hover over the ASPIRIN tooltip). As this is free text, we have to assign a ChEMBL ID to display it in the Platform. We follow a semi-automatic method to achieve this: we first try to ground the labels by looking them up in a dictionary of drug names/synonyms and, when this is not successful, we pull the clinical trial curation from ChEMBL, if available.
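For readers following along, the two-step fallback described above could be sketched roughly as follows. This is only an illustration, not the Platform's actual code: the function name `ground_label`, both lookup tables, and their contents are hypothetical, and the curated entry simply mirrors the mapping discussed in this thread.

```python
from typing import Dict, Optional, Tuple

# Hypothetical synonym dictionary: normalised drug label -> ChEMBL ID.
# CHEMBL25 is the ChEMBL ID for aspirin; the table itself is illustrative.
SYNONYMS: Dict[str, str] = {
    "aspirin": "CHEMBL25",
    "acetylsalicylic acid": "CHEMBL25",
}

# Hypothetical stand-in for trial-level curation: (trial ID, label) -> ChEMBL ID.
# This single entry mirrors the questionable mapping discussed in this thread.
CURATED: Dict[Tuple[str, str], str] = {
    ("NCT00269438", "5 asa, enemas, suppositories, corticosteroids"): "CHEMBL25",
}


def ground_label(
    label: str,
    trial_id: str,
    synonyms: Dict[str, str] = SYNONYMS,
    curated: Dict[Tuple[str, str], str] = CURATED,
) -> Tuple[Optional[str], str]:
    """Return (ChEMBL ID or None, name of the step that resolved the label)."""
    key = label.strip().lower()

    # Step 1: automatic grounding against the synonym dictionary.
    if key in synonyms:
        return synonyms[key], "dictionary"

    # Step 2: fall back to existing curation for this trial, if any.
    hit = curated.get((trial_id, key))
    if hit is not None:
        return hit, "curation"

    # Nothing matched: leave the label ungrounded.
    return None, "unresolved"


if __name__ == "__main__":
    print(ground_label("Aspirin", "NCT00000000"))
    print(ground_label("5 ASA, enemas, suppositories, corticosteroids", "NCT00269438"))
```

In this sketch the free-text label fails the dictionary step because it is a comma-separated list rather than a single drug name, so the result depends entirely on the quality of the fallback curation.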

In this case, the automatic grounding of the intervention label wasn’t successful (as 5 ASA refers to a group of molecules). So we relied on the curation available in ChEMBL, which has already extracted information from this trial linking 5 ASA to aspirin, as you can check here:

Clinical trial extraction is complicated as the data is, very often, messy. Looking forward, we are testing new approaches which you can follow on this ticket (LLM-based extraction of drug/indication pairs from clinical trials · Issue #4339 · opentargets/issues · GitHub), where we use LLMs with a provided model to do the entity extraction for us. This is what the pipeline (in development) extracts for this particular trial:

{
  "id": "nct00269438",
  "drug_intent": "therapeutic",
  "drug_intent_confidence": 0.95,
  "primary_indications": [
    {
      "name": "ulcerative colitis",
      "evidence_quote": "achieving clinical improvement in subjects with mildly to moderately active ulcerative colitis after 8 weeks of therapy"
    }
  ],
  "investigated_drugs": [
    {
      "drug": "balsalazide disodium",
      "evidence_quote": "a new tablet formulation and dosing regimen of balsalazide disodium dosed twice daily"
    }
  ]
}

As you can see, this gives a much better representation of the drugs and indications than the one provided in the raw data, which will make the grounding task much easier. Please stay tuned, the enhancements we are preparing look very promising!
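To illustrate why this structure helps, here is a minimal sketch of how a record shaped like the one above could be turned into clean drug/indication pairs ready for grounding. The helper name `drug_indication_pairs` is hypothetical, and the field names are taken only from the example record, so the final pipeline output may differ.

```python
import json
from typing import List, Tuple

# The example record from this thread, with the evidence quotes omitted for brevity.
RECORD_JSON = """
{
  "id": "nct00269438",
  "drug_intent": "therapeutic",
  "drug_intent_confidence": 0.95,
  "primary_indications": [{"name": "ulcerative colitis"}],
  "investigated_drugs": [{"drug": "balsalazide disodium"}]
}
"""


def drug_indication_pairs(record: dict) -> List[Tuple[str, str]]:
    """Pair every investigated drug with every primary indication."""
    drugs = [d["drug"] for d in record.get("investigated_drugs", [])]
    indications = [i["name"] for i in record.get("primary_indications", [])]
    return [(drug, indication) for drug in drugs for indication in indications]


if __name__ == "__main__":
    record = json.loads(RECORD_JSON)
    for drug, indication in drug_indication_pairs(record):
        print(f"{record['id']}: {drug} -> {indication}")
```

Each drug name is a single, clean label here, which is much easier to look up in a synonym dictionary than a free-text list of interventions.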

Best,

Irene

chen_lv · 18 May 2026 13:05

Hi irene,

Thank you very much for your reply. It has been very helpful for my understanding of the data in Open Targets. As you mentioned, Open Targets is still largely dependent on ChEMBL at this stage. Using LLM-based approaches to extract drug–disease relationships is indeed a very promising idea, and I look forward to seeing this tool integrated into the Open Targets Platform soon.

I also have a small technical question I would like to ask you. The ultimate goal of my research is to obtain disease–drug–target relationships. Based on my current understanding, the Clinical Targets section under the Download module in Open Targets provides disease–drug–target information, although it is not complete – I suspect this is mainly due to the limitations of drug–target information captured in ChEMBL. In addition, I have collected some additional disease–drug–target associations from the Therapeutic Target Database (TTD).

In your opinion, are there any other ways to obtain more complete disease–drug–target relationship information? For example, from other databases or through literature mining? Thank you very much for your time.

Best,

Chen Lv

irene · 18 May 2026 14:26

Glad that was helpful!

On your drug/disease/target trios, I recently did an analysis of the drug/target relationships that are present in TTD and whether we should integrate them in OT. Please have a look at the conclusions I describe here: Integrate TTD as a source for a drug's mechanism of action · Issue #4337 · opentargets/issues · GitHub

I think the TTD annotation is very valuable, but it falls outside our current scope, where a drug’s mechanism of action is meant to describe the therapeutic target. If your scope is wider, for example if you want to know all the targets a drug interacts with, TTD is a good resource. I would also look at the annotation in Probes&Drugs, ChEMBL’s bioactivity data, or DrugCentral, for example. This is not a comprehensive list; I am sure I am missing other valuable resources. Feel free to share if you find anything you’d like to see in our Platform!

Best,

Irene

chen_lv · 19 May 2026 01:37

Thanks a lot for the reply—this is super helpful! I’ll definitely stay tuned for updates from Open Targets; it’s such a fantastic database. Have a wonderful day ahead!

Best,

Chen Lv
