University of Colorado Denver
Realizing the Translator vision requires literature mining. Our high-level strategic plan to address the problems identified in the summary vision begins with standing up an open, extensible, FAIR-TLC-compatible framework for text-mining full-text biomedical journal articles. The framework will be seeded with state-of-the-art structural and concept recognition systems whose output is normalized to prominent Open Biomedical Ontologies (OBOs). We then plan to align that system with the BioLink model and the needs of the Translator community, including a mechanism for effective and efficient community feedback on errors. We describe further plans to increase the scope and utility of text-mined knowledge and to expand the range of documents mined. Additionally, we propose an innovative technical and governance framework for benchmarking and integrating new text-mining approaches developed by others during the course of the project. Building on our demonstration of the BioStacks text-mining and CRAFT-ST benchmarking frameworks developed in Segment 1, we propose a plan to achieve the goals described in our vision statement.
The plan has three broad components: (1) Align our framework, tools and gold standards with the BioLink model and the needs of the Translator community. (2) Improve the quality of existing text-mining tools, in terms of recall, precision and computational performance. (3) Expand the set of documents that form the source of the text-mined associations, to include full-text biomedical journal articles, patents and regulatory filings.