The large amount of available digital biomedical information has great potential to impact both clinical practice and biomedical research. Exploiting this potential requires efficient and effective mechanisms to access this information. This project proposal addresses the problem of retrieving relevant information from medical scientific literature sources to support case-based decision and analysis processes. This aim requires handling and exploiting the multimodal nature of scientific biomedical literature through data fusion and machine learning techniques. In addition, the large size of the literature collections available today enables a big data approach to data-driven discovery. Such an approach requires not only developing state-of-the-art algorithms for multimodal data analysis and machine learning on large datasets, but also devising the appropriate tools to guarantee the capacity and scalability of the technologies involved. In this context, we understand "effectiveness" as empowering researchers to carry out their experimental life cycles in an agile and focused manner, both when building their information retrieval systems and when using such systems for a specific purpose.
Our foremost long-term goal is to build a system for medical case-based information retrieval on large collections of scientific biomedical literature. This goal encompasses two major research problems: (1) devising an effective multimodal representation strategy that captures the rich visual and textual content of medical cases and scientific papers; (2) designing and implementing efficient algorithms and processing strategies that can cope with the ever-growing collections of scientific papers and biomedical data. The second problem includes addressing the scalability of both the devised strategies and algorithms and the underlying technological substrate that supports them.
Research and Application Areas
The proposal covers the following research topics included in the call for proposals:
The main application areas of the proposed system are biomedical research and healthcare. Providing efficient and effective access to the very large volume of biomedical data generated today is an open and challenging research problem with a potentially important impact on both areas. Techniques for content-based retrieval of textual and visual information are an important component of any solution to this problem.
Medical images are used to visualize biological structures that are not otherwise observable, such as microscopy images for tissue composition and x-rays for internal organs. These images serve as evidence of the patient's health status and allow physicians to make decisions about potential diagnoses and treatments. In addition, the patient's health record contains semi-structured textual information describing basic attributes such as gender and age, as well as expert opinions, recommendations, and test results.
When a physician is treating a patient, both data modalities, medical images and textual records, may be used to automatically identify related information from scientific sources. An effective search system can process the multimodal query and retrieve relevant documents from a collection of up-to-date scholarly articles. These results may support medical decision making by helping physicians identify images in papers that illustrate similar medical cases, as well as reports of the latest procedures for treating patients.
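As an illustration only, the multimodal query scenario described above can be sketched as a simple late-fusion ranking, in which a text similarity and an image similarity are computed separately and then combined with a weight. This is a minimal sketch under stated assumptions, not the representation or fusion strategy that the project will devise: the function names, the `text_weight` parameter, the toy feature vectors, and the paper identifiers are all hypothetical, and cosine similarity is used only as a convenient stand-in.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def late_fusion_rank(query, collection, text_weight=0.5):
    """Rank documents by a weighted sum of text and image similarity.

    `query` and every document are dicts with hypothetical "text" and
    "image" feature vectors. `text_weight` (an assumed parameter) sets
    the balance between the two modalities.
    """
    scored = []
    for doc_id, doc in collection.items():
        text_score = cosine(query["text"], doc["text"])
        image_score = cosine(query["image"], doc["image"])
        fused = text_weight * text_score + (1.0 - text_weight) * image_score
        scored.append((fused, doc_id))
    # Highest fused score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Toy data: two-dimensional features stand in for real text/image descriptors.
collection = {
    "paper_a": {"text": [1.0, 0.0], "image": [0.0, 1.0]},
    "paper_b": {"text": [0.0, 1.0], "image": [0.0, 1.0]},
    "paper_c": {"text": [1.0, 0.0], "image": [1.0, 0.0]},
}
query = {"text": [1.0, 0.0], "image": [0.0, 1.0]}

print(late_fusion_rank(query, collection))
print(late_fusion_rank(query, collection, text_weight=0.9))
```

In this toy setting, the paper that matches the query in both modalities is ranked first, and changing `text_weight` changes the order of the papers that match in only one modality. A real system would replace the toy vectors with learned multimodal representations and would need indexing strategies that scale to large collections.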
The multimodal search system can be a valuable tool for Evidence-Based Medicine (EBM), which aims to use the best available evidence when making decisions about the care of individual patients. In that sense, the ability to use the actual health record to find useful information in scientific archives will help support the underlying medical reasoning. This process should lead to a better understanding of the patient's situation and, therefore, to more informed decisions that affect the patient's quality of life.