Semantic Indexing and Information Retrieval for Health

from heterogeneous content types and languages

Held in conjunction with the 42nd European Conference on Information Retrieval (ECIR2020)


COVID-19 Open Research Dataset (CORD-19) by Kyle Lo and Lucy Lu Wang

CoronaTracker by Cher Han Lau

BioASQ by George Paliouras

SIIRH2020 Workshop Preliminary Program - Program.pdf
SIIRH2020 Workshop Preliminary Proceedings.pdf

Important Dates

- Open Session

  • Submission
    • Full papers: February 3, 2020 (11:59pm GMT), extended from January 27, 2020
    • Short Papers and Abstracts: March 9, 2020
  • Notifications
    • Full papers: March 2, 2020, moved from February 28, 2020
    • Short Papers and Abstracts: March 18, 2020 (11:59 GMT), moved from March 16, 2020
  • Camera Ready Contributions
    • 30 March 2020 (11:59pm GMT)

- MESINESP Session deadlines: check

- Workshop: 14 April 2020 in Lisbon, Portugal


There is increasing interest in exploiting the vast and rapidly growing body of health-related content by means of information retrieval and deep learning strategies. Health-related content is particularly challenging because of its highly specialized domain language and the implicit differences in language characteristics across content types (patient-generated content such as discussion forums, blogs and other Internet sources; healthcare documentation and clinical records; professional or scientific publications; clinical practice guidelines; clinical trials documentation; etc.). Moreover, it is critical to provide search solutions for non-English content as well as cross-language or multilingual IR solutions.

This workshop will be a forum where the community can present and discuss current and future directions for the area, based on the experience and results obtained in the BioASQ task. In addition to the MESINESP session, the workshop will include an Open Session covering IR technologies for heterogeneous health-related content. The Open Session is open to multiple languages, with a particular interest in the exploitation of structured controlled vocabularies and entity linking. Among the proposed topics for this Open Session are:

  1. multilingual and non-English health-related IR, concept indexing and text categorization strategies;
  2. generation of evaluation resources for biomedical document IR strategies;
  3. scalability, robustness and reproducibility of health and biomedical IR and text mining resources;
  4. use of specialized machine translation and advanced deep learning approaches for improving health-related search results;
  5. medical question answering search tools;
  6. retrieval of multilingual health-related web content;
  7. and other related topics.

Submission Details for Open Session

All submissions must be written in English, follow the Springer LNCS author guidelines, and be submitted as PDF files to EasyChair. At least one author per paper must register for and attend the workshop to present the work.

  • Full papers: maximum 6 pages including references
  • Short papers: maximum 3 pages including references
  • Abstracts: maximum 1 page including references

Springer LNCS:


All accepted papers and abstracts will be published in an issue of CEUR-WS.

Open Session Accepted Contributions:

  • First Steps Towards Patient-Friendly Presentation of Dutch Radiology Reports by Koen Dercksen and Arjen P. de Vries
  • Enriching Consumer Health Vocabulary Using Enhanced GloVe Word Embedding by Mohammed Ibrahim, Susan Gauch, Omar Salman and Mohammed Alqahatani
  • SmokPro: Towards Tobacco Product Identification in Social Media Text by Himakar Yv, Kartikey Pant and Radhika Mamidi
  • Twitter goes to the Doctor: Detecting Medical Tweets using Machine Learning and BERT by Kevin Roitero, Cristian Bozzato, Vincenzo Della Mea, Stefano Mizzaro and Giuseppe Serra [Short Paper]
  • Biomedical Question Answering using Extreme Multi-Label Classification and Ontologies in the Multilingual Panorama by André Neves, André Lamúrias and Francisco Couto [Short Paper]
  • Towards a multilingual corpus for Named Entity Linking evaluation in the clinical domain by Pedro Ruas, André Lamúrias and Francisco M. Couto [Short Paper]

Journal Special Issue

The planned workshop serves as a venue where the different types of contributors, mainly task providers and solution providers, can meet and exchange their experiences.

We expect that investigation of the topics of the task will continue after the workshop, informed by new insights from the discussions held there.

To compile the results of this follow-up investigation, a journal special issue will be organized, to be published a few months after the workshop. The specific journal will be announced after negotiation with publishers.


  • Martin Krallinger (Text Mining unit at the Barcelona Supercomputing Center (BSC))
  • Francisco M. Couto (University of Lisbon, Portugal)

Programme Committee:

  • Alberto Lavelli - FBK, Trento, Italy
  • Alfonso Valencia - Barcelona Supercomputing Center, Spain
  • Analia Lourenco - Universidade de Vigo, Spain
  • Anastasios Nentidis - National Center for Scientific Research Demokritos, Greece
  • André Lamurias - LASIGE, Portugal
  • Anne-Lyse Minard - University of Orleans, France
  • Aron Henriksson - Stockholm University, Sweden
  • Bruno Martins - INESC-ID, Portugal
  • Carsten Eickhoff - Brown University, USA
  • Chih-Hsuan Wei - NCBI/NIH, National Library of Medicine, USA
  • Cyril Grouin - LIMSI, CNRS, Université Paris-Saclay, Orsay, France
  • Diana Sousa - LASIGE, Portugal
  • Dimitrios Kokkinakis - University of Gothenburg, Sweden
  • Eben Holderness - McLean Hospital, Harvard Medical School & Brandeis University, USA
  • Ellen Voorhees - National Institute of Standards and Technology (NIST), USA
  • Fabio Rinaldi - IDSIA, University of Zurich, Switzerland & FBK, Trento, Italy
  • Fleur Mougin - University of Bordeaux, France
  • Georgeta Bordea - Université de Bordeaux, France
  • Georgios Paliouras - National Center for Scientific Research Demokritos, Greece
  • Goran Nenadic - University of Manchester, UK
  • Graciela Gonzalez-Hernandez - University of Pennsylvania, USA
  • Hanna Suominen - CSIRO, Australia
  • Henning Muller - University of Applied Sciences Western Switzerland, Switzerland
  • Hercules Dalianis - Stockholm University, Sweden
  • Hyeju Jang - University of British Columbia, Canada
  • James Pustejovsky - Brandeis University, USA
  • Jin-Dong Kim - Research Organization of Information and Systems, Japan
  • Jong C. Park - KAIST Computer Science, Korea
  • Kevin Bretonnel Cohen - University of Colorado School of Medicine, Aurora, Colorado, USA
  • Maria Skeppstedt - Institute for Language and Folklore, Sweden
  • Marcia Barros - LASIGE, Portugal
  • Mariana Lara-Neves - German Federal Institute for Risk Assessment, Germany
  • Marta Villegas - BSC, Spain
  • Pedro Ruas - LASIGE, Portugal
  • Rafael Berlanga Llavori - Universitat Jaume I, Spain
  • Rezarta Islamaj-Dogan - NIH/NLM/NCBI, USA
  • Sérgio Matos - University of Aveiro, Portugal
  • Shyamasree Saha - Europe PubMed Central, European Bioinformatics Institute – EMBL-EBI, UK
  • Suzanne Tamang - Stanford University School of Medicine, USA
  • Thierry Hamon - LIMSI, CNRS, Université Paris-Saclay, Orsay & Université Paris 13, Villetaneuse, France
  • Thomas Brox Røst - Norwegian University of Science and Technology, Norway
  • Yifan Peng - NCBI/NIH, National Library of Medicine, USA
  • Yonghui Wu - University of Florida, USA
  • Yoshinobu Kano - Shizuoka University, Japan
  • Zhiyong Lu - NCBI/NIH, National Library of Medicine, USA
  • Zita Marinho - Priberam, Portugal


All questions about submissions should be emailed to siirh2020 (at)