Projects

Canonical reference: Limsopatham, N. and Collier, N. (2016), “Normalising medical concepts in social media texts by learning semantic representation”, in Proceedings of the Association for Computational Linguistics Annual Meeting (ACL 2016), Berlin, Germany, August 1st to 7th [pdf].

Funded by EPSRC (EP/M005089/1)


  • Panda Alert (2015 - 2018) # Panda Alert: understanding the significance of adverse health event reports using distributed semantic models

Canonical reference: Gritta, M., Pilehvar, M. T., Limsopatham, N., & Collier, N. (2017), "What's missing in geographical parsing?", Language Resources and Evaluation, 1-21. [pdf]

Funded by NERC DREAM CDT (1649558)


  • PheneBank (2015 - 2018) # PheneBank: automatic extraction and validation of a database of human phenotype-disease associations from the scientific literature [web site]

Funded by MRC (MR/M025160/1)


Canonical reference: Collier, N., Groza, T., Smedley, D., Robinson, P., Oellrich, A. and Rebholz-Schuhmann, D. (2015), "PhenoMiner: from text to 4,000 phenotypes associated with OMIM diseases", Database, Oxford University Press, vol. 2015, bav104. DOI: 10.1093/database/bav104. [pdf]

Funded by EC Marie Curie (301806)


  • BioCaster (2006 - 2012) # Infectious disease detection and tracking from newswire [web site]

Canonical reference: Collier, N., Doan, S., Kawazoe, A., Matsuda Goodwin, R., Conway, M., Tateno, Y., Ngo, Q., Dien, D., Kawtrakul, A., Takeuchi, K., Shigematsu, M. and Taniguchi, K. (2008), “BioCaster: detecting public health rumors with a Web-based text mining system”, Bioinformatics, 24(24):2940-2941, Oxford University Press, DOI: 10.1093/bioinformatics/btn534. [html] [pubmed]

Funded under multiple grants including the JST PRESTO programme and the JSPS Young Researcher programme.


  • ZAISA (2003 - 2006) # Zone Analysis in Scientific Articles [web site]

Reference: Mizuta, Y., Korhonen, A., Mullen, T. and Collier, N. (2006), “Zone analysis in biology articles as a basis for information extraction”, International Journal of Medical Informatics, Elsevier, 75(6): 468-487. [pubmed]


  • PASBio (2002 - 2005) # Predicate Argument Structures for Biology [web site]

Reference: Wattarujeekrit, T., Shah, P. and Collier, N. (2004), “PASBio: predicate-argument structures for event extraction in molecular biology”, BMC Bioinformatics, 5:155, DOI: 10.1186/1471-2105-5-155. [pdf] [pubmed] [PASBio data]


  • PIA (2000 - 2004) # Portable Information Access project [web site]

Reference: Collier, N. (2001), “Machine learning for information extraction from XML marked-up text on the Semantic Web”, Proc. Semantic Web Workshop at the Tenth International Conference on the World Wide Web (WWW’10), Hong Kong, May 1-5, pp. 29-36. [pdf]