As machine learning and artificial intelligence techniques become more widely available, researchers are exploring their use for gathering clinical trial data.
Xu J, Yu C, Xu J et al. PubMed knowledge graph 2.0: Connecting papers, patents, and clinical trials in biomedical science. Sci Data. 2025;12:1018.
Linkages between PubMed and ClinicalTrials.gov
The PubMed Knowledge Graph is available at https://pubmedkg.github.io. BERN2, an advanced neural biomedical named entity recognition and normalization tool, is available at https://github.com/dmis-lab/BERN2.
Vora B, Kuruvilla D, Kim C, Wu M, Shemesh CS, Roth GA. Applying Natural Language Processing to ClinicalTrials.gov: mRNA cancer vaccine case study. Clin Transl Sci. 2023;16:2417-2420.
"A use case where NLP and text mining methodologies were used to extract clinical trial data from ClinicalTrials.gov for mRNA cancer vaccines".
The authors used "the Linguamatics I2E NLP platform which provides text analytics solutions based on keyword searches."
Du J, Wang Q, Wang J et al. COVID-19 Trial Graph: A Linked Graph for COVID-19 Clinical Trials. J Am Med Inform Assoc. 2021. DOI: 10.1093/jamia/ocab078.
The authors built a "COVID-19 Trial Graph, a graph-based clinical trial data repository, to link structured and unstructured (ie, eligibility criteria) information for existing registered COVID-19 clinical trials. The COVID-19 Trial Graph supports diverse search queries with a particular focus on eligibility criteria and provides a graph-based visualization of COVID-19 clinical trials."
The project repository is at https://github.com/UT-Tao-group/clinical_trial_graph
Elghafari A, Finkelstein J. Automated Identification of Common Disease-Specific Outcomes for Comparative Effectiveness Research Using ClinicalTrials.gov: Algorithm Development and Validation Study. JMIR Med Inform. 2021;9(2):e18298. DOI: 10.2196/18298.
The authors built a query pipeline with ClinicalTrials.gov to obtain lists of outcome measures used in trials of a specific condition.
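A query of this general kind can be sketched in Python. This is not the authors' pipeline. The endpoint, the query.cond and pageSize parameters, and the JSON field names (protocolSection, outcomesModule, primaryOutcomes, secondaryOutcomes, measure) are assumptions based on version 2 of the ClinicalTrials.gov REST API and should be checked against the current API documentation. The function names fetch_studies and count_outcome_measures are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

# Assumed ClinicalTrials.gov API v2 endpoint; verify before relying on it.
API_URL = "https://clinicaltrials.gov/api/v2/studies"


def fetch_studies(condition, page_size=50):
    """Fetch a single page of study records for a condition (network call)."""
    params = urllib.parse.urlencode(
        {"query.cond": condition, "pageSize": page_size}
    )
    with urllib.request.urlopen(f"{API_URL}?{params}", timeout=30) as resp:
        return json.load(resp).get("studies", [])


def count_outcome_measures(studies):
    """Count how often each outcome measure title appears across studies."""
    counts = Counter()
    for study in studies:
        outcomes = study.get("protocolSection", {}).get("outcomesModule", {})
        for key in ("primaryOutcomes", "secondaryOutcomes"):
            for outcome in outcomes.get(key, []):
                # Light normalization only: trim whitespace and lowercase.
                measure = outcome.get("measure", "").strip().lower()
                if measure:
                    counts[measure] += 1
    return counts
```

Calling count_outcome_measures(fetch_studies("asthma")).most_common(10) would list the ten most frequent measure titles on the first page of results. A fuller pipeline would also page through all results and merge synonymous measure titles, which this sketch does not attempt.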
Fanshawe TR, Perera R. Automatic extraction of quantitative data from ClinicalTrials.gov to conduct meta-analyses. BMJ Evid Based Med. 2020;25(3):113-114. DOI: 10.1136/bmjebm-2019-111206.
The authors describe EXACT, a Python-based software application that automatically extracts the data required for meta-analysis from the ClinicalTrials.gov database into a spreadsheet format.
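The general idea of turning posted trial results into spreadsheet rows can be illustrated with a short Python sketch. It is not the EXACT code. It assumes a study record shaped like the JSON returned by version 2 of the ClinicalTrials.gov API (resultsSection, outcomeMeasuresModule, outcomeMeasures, groups, classes, categories, measurements); those field names, the column layout in FIELDS, and the function names are assumptions made for illustration.

```python
import csv
import io

# Hypothetical column layout for the output spreadsheet.
FIELDS = ["nct_id", "outcome_title", "group", "value", "spread"]


def flatten_outcome_results(study):
    """Return one row (dict) per reported measurement in a study record."""
    nct_id = (study.get("protocolSection", {})
              .get("identificationModule", {})
              .get("nctId", ""))
    measures = (study.get("resultsSection", {})
                .get("outcomeMeasuresModule", {})
                .get("outcomeMeasures", []))
    rows = []
    for om in measures:
        # Map group ids (assumed form: "OG000") to readable arm titles.
        titles = {g.get("id"): g.get("title", "") for g in om.get("groups", [])}
        for cls in om.get("classes", []):
            for cat in cls.get("categories", []):
                for m in cat.get("measurements", []):
                    group_id = m.get("groupId", "")
                    rows.append({
                        "nct_id": nct_id,
                        "outcome_title": om.get("title", ""),
                        "group": titles.get(group_id, group_id),
                        "value": m.get("value", ""),
                        "spread": m.get("spread", ""),
                    })
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet program can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Values are kept as strings exactly as reported, so any numeric conversion and checking of units is left to the person conducting the meta-analysis.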
Updated 14 October 2025