Interests

@nigelhcollier  Follow on Twitter


Research Interests

Computational linguistics, Information extraction and knowledge discovery for health, Lexical semantics and ontologies, including Grounding of natural language, Machine learning approaches for NLP

Roles and responsibilities

Welcome! I am Director of Research in Computational Linguistics (Research Professor) at the Department of Theoretical and Applied Linguistics (DTAL) in the University of Cambridge and co-founder of the Language Technology Laboratory. I am also a Turing Fellow at the Alan Turing Institute for data science, a member of the EPSRC Peer Review College, an elected member of the faculty board in Modern and Medieval Languages and a study steering committee member on the NIHR DEPEND project. I am an Associate Editor of BMC Bioinformatics and a Faculty Member of F1000. I am a member of the Cambridge Centre for Science and Policy (CSaP), Cambridge Big Data, Cambridge Health, Medicine and Society as well as Cambridge Language Sciences. Please see Activities for video lectures and a list of my roles in recent seminars, conferences, workshops, etc.


News:
  • 10/2017 Prospective PhD students: Funding from the Alan Turing Institute - deadline 30th November (make contact early - please see below).
  • 10/2017 Slides for my talk at the International Society of Pharmacovigilance (ISoP)
  • 9/2017 Prospective PhD students and MPhil students please see below.
Current projects

EPSRC SIPHS (EP/M005089/1): I am funded by a 1.2 million, 5-year EPSRC fellowship for the Semantic Interpretation of Personal Health messages on the Web (SIPHS) project. This is an international collaborative effort to leverage social media data for digital disease applications such as detecting infectious disease outbreaks and adverse drug reactions.

MRC PheneBank (MR/M025160/1): I am PI on the PheneBank project. This project seeks to develop a new method for the identification and harmonisation of human phenotypes from the scientific literature as well as their associations to entities of interest such as diseases, genes and other phenotypes.

Recent publications

  1. Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N. (2017), "Vancouver Welcomes You! Minimalist Location Metonymy Resolution", in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, August, pp. 1248-1259. Download pdf.
  2. Pilehvar, M. T., Camacho-Collados, J., Navigli, R. and Collier, N. (2017), "Towards a Seamless Integration of Word Senses into Downstream NLP Applications", in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2017), Vancouver, Canada, August, pp. 1857-1869. Download pdf.
  3. Camacho-Collados, J., Pilehvar, M. T., Collier, N. and Navigli, R. (2017), "SemEval-2017 Task 2: Multilingual and cross-lingual semantic word similarity", in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), Vancouver, Canada, August, pp. 6-17. Download pdf.
  4. Gritta, M., Pilehvar, M. T., Limsopatham, N. and Collier, N., "What's missing in geographical parsing?", Language Resources and Evaluation, pp. 1-21. Download pdf.
  5. Pilehvar, M. T. and Collier, N. (2016), "De-Conflated Semantic Representations", in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, USA, November 1-5, pp. 1680-1690. Also available as arXiv preprint arXiv:1608.01961. Download pdf.
  6. Limsopatham, N. and Collier, N. (2016), "Normalising medical concepts in social media texts by learning semantic representation", in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2016), Berlin, Germany, August 1-7. Download pdf.
  7. Le, H.Q., Tran, M.V., Dang, T.H., Ha, Q.T. and Collier, N. (2016), "Sieve-based coreference resolution enhances semi-supervised learning model for chemical-induced disease relation extraction", in Database, Oxford University Press, vol. 2016: article ID baw102; DOI: 10.1093/database/baw102. Download pdf.
  8. Pilehvar, M. T. and Collier, N. (2016), “Improved Semantic Representation for Domain-Specific Entities”, in Proc. BioNLP 2016 at the 2016 Annual Meeting of the Association for Computational Linguistics (ACL 2016), Berlin, Germany, August 12-13.  Download pdf.
  9. Limsopatham, N. and Collier, N. (2016), “Modelling the combination of generic and target domain embeddings in a convolution neural network for sentence classification”, in Proc. BioNLP 2016 at the 2016 Annual Meeting of the Association for Computational Linguistics (ACL 2016), Berlin, Germany, August 12-13. Download pdf.
For a snapshot of my publications, please follow the link to Google Scholar. They are also listed on the LTL publications page.

Teaching

I have been actively involved in teaching throughout my career and have taught a range of computational linguistics units both at the Department of Informatics in Sokendai and in the University of Cambridge where I have given frequent guest lectures. I currently teach on the Biomedical Information Processing course (R214) at the Computer Lab during the Lent term.

Prospective PhD students

I am delighted to consider PhD project proposals from students with a strong background in computing, linguistics or AI. I do, however, receive a steady stream of such enquiries, so to save time I ask that your initial message (a) provide a brief overview of your project idea and, importantly, how it relates to my research interests, and (b) include an up-to-date CV with overall course grades. If you wish to apply for the PhD course starting in October 2018, please contact me by October/November 2017. You might find it useful to see some project ideas here as starting points.

MPhil students

MPhil students on the ACS course please contact me about project proposals on Biomedical Information Processing. You can find two project ideas here.

Activities - see here

Other background

Prior to joining the University of Cambridge I was an FP7 Marie Curie fellow on the PhenoMiner project at EMBL-EBI (2012-2014) and Associate Professor at the National Institute of Informatics in Tokyo, where I led the Natural Language Processing laboratory. From 2007 to 2012 I served as a technology advisor on the international Global Health Security Action Group technical working group on Risk Management and Communication. I obtained my PhD in computational linguistics in 1996 at UMIST (now the University of Manchester) for my research into the application of neural networks for machine translation.

I am a senior member of the Association for Computing Machinery (1996 - present) and a member of the Association for Computational Linguistics (1996 - present).

Contact

Nigel Collier
Director of Research in Computational Linguistics, and
Co-Director of the Language Technology Lab
Department of Theoretical and Applied Linguistics
University of Cambridge
9 West Road, Cambridge CB3 9DB
United Kingdom

Tel: +44 (0)1223-760373
Email: nhc30 AT cam dot ac dot uk

Office: Room TR-23, English Faculty Building

ORCID ID: 0000-0002-7230-4164 [Search Europe PMC]

Follow on Twitter | Slideshare | LinkedIn