During my PhD at Paul Sabatier University, I developed the following prototypes:
• BioSIR (Biomedical Semantic Information Retrieval platform): BioSIR is our prototype system for
indexing and searching biomedical information. Our biomedical IR platform is able to:
extract biomedical concepts that are defined in one or more biomedical terminologies
(e.g., MeSH, SNOMED, ICD-10, GO) from biomedical text;
index documents with biomedical topics that represent the context of biomedical documents;
search for biomedical information in response to queries submitted by various users
(e.g., clinicians, physicians, practitioners).
*** Perspectives: BioSIR is not limited to a single use case. Within the biomedical domain, it could for
example improve the search performance of the PubMed portal, which comprises more than 24 million
citations for biomedical literature from MEDLINE, life science journals, and online books. For more
information, please refer to my PhD thesis. Beyond the biomedical domain, BioSIR could also be adapted to
the legal domain for semantic indexing and semantic search of legal documents using the multilingual
EuroVoc thesaurus.
• CXTRACTOR (Concept eXtractor) is a generic concept extractor that integrates state-of-the-art
term/concept extraction methods. CXTRACTOR is one of the components of the BioSIR platform.
It extracts concepts (terms and codes) predefined in several terminologies
such as SNOMED-CT, ICD-9, and MeSH (463 downloads from 53 countries on SourceForge).
• IRToolkit is an attempt to build a generic search engine that integrates state-of-the-art
Information Retrieval (IR) models. It also makes it possible to compare the performance (in terms of
precision, recall, index size, search response time, and so on) of several open source IR applications
(513 downloads from 36 countries on SourceForge).
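To make the comparison criteria mentioned for IRToolkit concrete, here is a minimal Python sketch of set-based precision and recall for a single query. It is not taken from IRToolkit itself: the function name `precision_recall`, the engine names, and the document identifiers are all hypothetical and only serve as an illustration.

```python
from typing import Iterable, Set, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query, using set-based definitions."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical relevance judgments and result lists for a single query.
relevant_docs = {"d1", "d2", "d3", "d4"}
runs = {
    "engine_a": ["d1", "d2", "d7", "d8"],
    "engine_b": ["d1", "d2", "d3", "d9", "d10"],
}

# Score every engine against the same judgments so the results are comparable.
scores = {name: precision_recall(docs, relevant_docs) for name, docs in runs.items()}
for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

In practice such scores would be averaged over many queries, and criteria such as index size or response time would be measured separately.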
During my postdoc at the Luxembourg Institute of Science and Technology, I mainly participated in
the research and development of the following prototypes:
• DyKOSMap is a general framework for coping with the mapping maintenance problem between
dynamic Knowledge Organization Systems (KOS), like those of the biomedical domain. More specifically,
we defined and implemented original methods for adapting mappings impacted by KOS evolution
without re-computing the whole set of mappings each time a (target and/or source) KOS evolves.
• GECAMed-search is a prototype dedicated to searching for information in electronic health
records. The research and development were undertaken in collaboration with cardiologists at
a Luxembourgish hospital. We evaluated our search strategy using a clinical corpus provided
by the TREC organisation in 2014. The results are encouraging: our system outperforms the
median run of the Clinical Decision Support Track.
• Elasticsearch: Elastic provides a growing platform of open source projects and commercial products designed to search, analyze, and visualize your data, allowing you to get actionable insight in real time.