The BioCaster project has been running since 2006.
About
BioCaster is a research project aimed at providing advanced search
and analysis of Internet news and research literature for public health
workers, clinicians and researchers interested in communicable
diseases. The portal is currently under development at the National
Institute of Informatics by Dr. Nigel Collier with the cooperation of
colleagues at the National Institute of Infectious Diseases, National
Institute of Genetics, Okayama University, Vietnamese National
University at Ho Chi Minh City and Kasetsart University. Using text
mining technology, we aim to provide intelligent tools that help users
obtain a clearer picture of actual and potential disease outbreaks
in a timely manner.
Detecting and tracking infectious disease
outbreaks involves having access to information from a variety of
sources. Increasingly this means monitoring many hundreds of Internet
news feeds simultaneously. However, three difficulties arise when
searching for this information with traditional methods. Firstly, the
massive volume of dynamically changing, unstructured news data on the
Internet makes it extremely difficult for governments and public health
workers to obtain a clear picture of an outbreak. Secondly, the
initial reports of an outbreak are contained in only a few news
articles, which are usually overlooked by simple keyword indexing
methods. Thirdly, the initial reports of an outbreak usually appear
in local non-English news media. To capture outbreak information
as early as possible, it is therefore crucial for computer systems
to be able to process several languages.
The BioCaster system has two major
components: a web/database server and a backend cluster computer
equipped with text mining technology which continuously scans hundreds
of RSS newsfeeds from local and national news providers. Because the text
mining system has detailed knowledge of important concepts such as
diseases, pathogens, symptoms, people, places and drugs, we can
semantically index the relevant parts of news articles, giving users
quicker and more precise access to
information. The knowledge we use comes from annotated text
collections, gazetteer lists of nomenclature and the BioCaster
ontology, all of which are currently under development. We are making
the BioCaster ontology available for public access and feedback in the
hope that it will be useful to those interested in the field. Software
resources are also expected to be released as the project progresses.
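
The idea of semantic indexing described above can be illustrated with a minimal Python sketch. It is not the BioCaster implementation: the tiny gazetteer, the concept class names and the function names (tag_concepts, build_index) are all hypothetical, and a real system would rely on the ontology and trained text mining models rather than plain string matching. The sketch only shows how a term list can map article text to concept classes so that articles are retrievable by concept rather than by raw keyword.

```python
import re
from collections import defaultdict

# Hypothetical miniature gazetteer: surface term -> concept class.
# The real BioCaster resources are far larger and multilingual.
GAZETTEER = {
    "avian influenza": "DISEASE",
    "cholera": "DISEASE",
    "h5n1": "PATHOGEN",
    "fever": "SYMPTOM",
    "hanoi": "LOCATION",
    "oseltamivir": "DRUG",
}

# Longest terms first so multi-word names win over shorter overlaps.
_TERMS = sorted(GAZETTEER, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in _TERMS) + r")\b",
    re.IGNORECASE,
)


def tag_concepts(text):
    """Return (matched text, concept class, start offset) per gazetteer hit."""
    return [
        (m.group(0), GAZETTEER[m.group(0).lower()], m.start())
        for m in _PATTERN.finditer(text)
    ]


def build_index(articles):
    """Map each concept class to {lower-cased term: set of article ids}."""
    index = defaultdict(lambda: defaultdict(set))
    for article_id, text in articles.items():
        for surface, concept, _ in tag_concepts(text):
            index[concept][surface.lower()].add(article_id)
    return index


# Made-up example articles, for illustration only.
articles = {
    "a1": "Officials in Hanoi confirmed two cases of avian influenza caused by H5N1.",
    "a2": "Patients reported fever; cholera was ruled out.",
}
index = build_index(articles)
```

With an index of this shape, a query such as "all articles mentioning any DISEASE" becomes a lookup on index["DISEASE"] instead of a search over every possible disease name.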
We gratefully acknowledge grant-in-aid support for parts of the
BioCaster project from the Japan Science and Technology Agency's PRESTO
program, the Transdisciplinary Research Integration Center fund at the
Research Organization for Information Systems (ROIS), and the Japan
Society for the Promotion of Science.
Publications
- Collier, N., Doan, S., Matsuda Goodwin, R., McCrae, J.,
Conway, M., Shigematsu, M. and Kawazoe, A. (2010), “Navigating the Information
Storm: Web-based Global Health Surveillance in BioCaster”, invited contribution
under preparation for “BioSurveillance: A Health Protection Priority”, Kass-Hout,
T. and Zhang, X. (eds) (in press).
- Doan, S., Conway, M. and Collier, N. "An Empirical Study of
Sections in Classifying Disease Outbreak Reports", invited chapter in
Annals of Information Systems, Special Issue "Web-based Applications in
Health Care & Biomedicine", Springer, 2010 (in press).
- Conway, M., Kawazoe, A., Chanlekha, H.
and Collier, N. (2010), “Developing a disease outbreak corpus”, under review
for the Journal of Medical Internet
Research.
- Collier, N. (2010), “What’s unusual in
online disease outbreak news?”, under review for the Journal of Biomedical Semantics.
- Hartley,
D., Nelson N., Walters R., Arthur R., Yangarber R., Madoff L., Linge J.,
Mawudeku A., Collier N., Brownstein J., Thinus, G. and Lightfoot N. (2010), “The
landscape of international event-based biosurveillance”, Emerging Health Threats Journal, 3:e3.[html]
- Conway, M., Doan, S., Kawazoe, A. and
Collier, N. (2009), “Classifying disease outbreak reports using n-grams and
semantic features”, International Journal
of Medical Informatics (in press): DOI 10.1016/j.ijmedinfo.2009.03.0101. [pubmed]
- Doan, S., Kawazoe, A., Conway, M. and
Collier, N. (2009), “Towards role-based filtering of disease outbreak reports”,
Journal of Biomedical Informatics,
Elsevier, DOI: 10.1016/j.jbi.2008.12.009. [html][pubmed]
- Conway, M., Doan, S., Kawazoe, A. and
Collier, N. (2009), “Using hedges to enhance a disease outbreak report text
mining system”, Proc. BioNLP 2009, pp. 142-143. [pdf]
- Collier, N., Doan, S., Kawazoe, A., Matsuda Goodwin, R.,
Conway, M., Tateno, Y., Ngo, Q., Dien, D., Kawtrakul, A., Takeuchi, K.,
Shigematsu, M. and Taniguchi, K. (2008), “BioCaster: detecting public health
rumors with a Web-based text mining system”, Bioinformatics, 24(24):2940-2941, Oxford University Press, DOI:
10.1093/bioinformatics/btn534. [html] [pubmed]
- Kawazoe, A., Jin, L., Shigematsu, M., Bekki, D., Barrero,
R., Taniguchi, K. and Collier, N. (2008), “The development of a schema for the
annotation of terms in the BioCaster disease detection/tracking system”, Journal of Applied Ontology, IOS Press. [html]
- Conway, M., Doan, S., Kawazoe, A. and Collier, N. (2008),
"Classifying disease outbreak reports using n-grams and semantic
features", Proc. 3rd International Symposium on Semantic Mining
in Biomedicine (SMBM 2008), Turku, Finland, September 2-3, pp. 29-36. [pdf]
- Kawazoe, A., Chanlekha, H., Shigematsu, M. and Collier, N.
(2008), “Structuring an event ontology for disease outbreak detection”, BMC Bioinformatics, 9 (Suppl 3): S8,
DOI: 10.1186/1471-2105-9-S3-S8. [pdf][html][pubmed]
- Doan, S., Hung-Ngo, Q.,
Kawazoe, A. and Collier, N. (2008), "Global Health Monitor - a Web-based
system for detecting and mapping infectious diseases", Proc. International
Joint Conference on Natural Language Processing (IJCNLP), Companion Volume,
Hyderabad, India, January 7-12, pp. 951-956
- Collier, N., Kawazoe, A., Son, D., Shigematsu, M.,
Taniguchi, K., Jin, L., McCrae, J., Chanlekha, H., Dien, D., Hung, Q., Nam, V.,
Takeuchi, K. and Kawtrakul, A. (2007), “Detecting Web rumours with a multilingual
ontology-supported text classification system”, Advances in Disease Surveillance, 4: 242, ISDS.
- Collier, N., Kawazoe, A., Jin, L., Shigematsu, M., Dien, D.,
Barrero, R., Takeuchi, K. and Kawtrakul, A. (2006), “A multilingual ontology
for infectious disease surveillance: rationale, design and challenges”, Language Resources and Evaluation, 40(3-4): 405-413, Springer Netherlands, DOI: 10.1007/s10579-007-9019-7. [html]
- Kawazoe, A., Jin, L.,
Shigematsu, M., Barrero, R., Taniguchi, K. and Collier, N. (2006), "The
development of a schema for the annotation of terms in the BioCaster disease
detection/tracking system", Olivier Bodenreider (ed.), Proc. International
Workshop on Biomedical Ontology in Action (KR-MED 2006), Baltimore, Maryland,
USA, November 8th, pp. 77-85. [pdf]
- Collier, N.,
Kawazoe, A., Shigematsu, M., Taniguchi, K., Jin, L., McCrae, J., Dien, D., Hung,
Q., Takeuchi, K., Kawtrakul, A. (2007), "Ontology-driven influenza
surveillance from Web rumours", Proc. Options for the Control of Influenza
VI (Options 2007), Toronto, Ontario, Canada, June 17-23.
Members
- Son Doan (NII, now at Vanderbilt University Medical Center)
- Ai Kawazoe (NII, now at Tsuda College)
- Reiko Matsuda Goodwin (Fordham University)
- Mike Conway (NII, now at the University of Pittsburgh)
- Quoc Hung-Ngo (VNU)
- Mika Shigematsu (NIID)
- Kiyosu Taniguchi (NIID)
- Dinh Dien (VNU)
- Asanee Kawtrakul (Kasetsart University and NECTEC)
- Koichi Takeuchi (Okayama University)
- Nigel Collier (NII and JST)
Funding
BioCaster has been partly funded by various grants-in-aid. The core text mining system and the bio-geographic interface were supported by grants from the Japan Society for the Promotion of Science (JSPS); the first stage of the BioCaster Ontology for infectious disease detection was supported by a grant-in-aid from the Transdisciplinary Research Integration Center at the Research Organization for Information Systems (ROIS). Work on event alerting has been supported by the Japan Science and Technology Agency (JST).