Research interests

My current projects focus on both fundamental NLP and its many real-life applications (ranging from text mining to machine translation and dialogue systems).

Some current areas of interest include:

  • lexical and knowledge acquisition
  • computational semantics and discourse
  • domain, task and language adaptation / transfer
  • machine learning / deep learning for NLP
  • multilingual and low-resource NLP
  • multimodal NLP
  • language grounding and cognitive modeling of language
  • real-world applications of NLP (e.g. text mining and search, conversational AI, machine translation)
  • NLP for biomedical and cognitive sciences
  • NLP for social and global good

Current projects

  • Automatic Induction and Adaptation of Syntactic Structures for Improved Cross-Lingual NLP. Google Faculty Award (2019-2020).
  • Building Multilingual Multi-Domain Dialogue Systems (2019-2023).
  • LEXICAL - Lexical Acquisition across Languages. Funded by ERC (2015-2020).
  • EF Education First Research Lab for Applied Language Learning. Funded by Education First (2015-2020).

Past projects

  • LION - Literature-based discovery for cancer biology. Funded by MRC (2015-2018).
  • PheneBank - automatic extraction and validation of a database of human phenotype-disease associations from the scientific literature. Funded by MRC (2015-2018).
  • ENRICH - Enriched phrasal representations for improved language understanding. Google Faculty Award (2015-2016).
  • The Education First-Cambridge Learner Corpus of English - a data-driven approach to second language learning. Funded by EF and Isaac Newton Trust (2010-2015).
  • Developing Lexical Resources for Natural Language Processing Applications. University Research Fellowship. Funded by the Royal Society (2005-2014).
  • PANACEA - Platform for Automatic, Normalized Annotation and Cost-Effective Acquisition of Language Resources for Human Language Technologies. Funded by EU FP7 (2010-2012).
  • Lexical Acquisition for the Biomedical Domain. Funded by EPSRC (2009-2012).
  • Developing Multilingual Technologies for Automatic Lexical Acquisition. Funded by Isaac Newton Trust (2010-2012).
  • CRAB - Using Text Mining to Aid Cancer Risk Assessment. Funded by MRC, EU, FSA, and FORMAS (Sweden) (2008).
  • COMPLEX - Computational Natural Language Processing and the Neuro-Cognition of Language. Co-funded by EPSRC, ESRC and MRC (2008-2011).
  • Developing Multilingual Technologies for Automatic Lexical Acquisition. Funded by British Council (2008-2009).
  • ACLEX - Accurate and Comprehensive Lexical Classification for Natural Language Processing Applications. Funded by EPSRC (2005-2008).
  • Using Automatic Verb Classification to Aid Event Extraction. JSPS Postdoctoral Fellowship. Funded by the Japan Society for the Promotion of Science (2004-2005).