Selected Publications

Conference Papers

Wu, Stephen Tze-Inn, Adel Diyaf, and Reem Abusanina. Rapid growth of research output amidst political instability: A study of Libya's last 20 years. In Proceedings of the 20th International Conference on Scientometrics and Informetrics 2025. Yerevan, Armenia. June 2025.
Wu, Stephen Tze-Inn, Dan Demetriou, and Rudwan Husain. Honor Ethics: The Challenge of Globalizing Value Alignment in AI. Proceedings of FAccT '23: 2023 ACM Conference on Fairness, Accountability, and Transparency. Chicago, USA. June 2023.
Mersha, Amanuel and Stephen Wu. Morphology-rich Alphasyllabary Embeddings. Proceedings of the 12th Edition of the Language Resources and Evaluation Conference (LREC). Marseille, France. 2020.
Wu, Stephen, Andrew Wen, Yanshan Wang, Sijia Liu, and Hongfang Liu. Aligned-layer text search in clinical notes. Proceedings of the 16th World Congress on Medical and Health Informatics (MedInfo 2017). Hangzhou, China. 2017.
Liu, Sijia, Yanshan Wang, Na Hong, Feichen Shen, Stephen Wu, William Hersh, Hongfang Liu. On Mapping Textual Queries to a Common Data Model. 2017 IEEE International Conference on Healthcare Informatics (ICHI). Park City, Utah. 2017.
Wu, Stephen T., Tamara Timmons, Amy Yates, Meikun Wang, Steven Bedrick, William Hersh, and Hongfang Liu. On Developing Resources for Patient-level Information Retrieval. Proceedings of the 10th Edition of the Language Resources and Evaluation Conference. Portorož, Slovenia. 2016.
Wu, Stephen T., Chung-Il Wi, Sunghwan Sohn, Hongfang Liu, and Young J. Juhn. Staggered NLP-assisted refinement for Clinical Annotations of Chronic Disease Events. Proceedings of the 10th Edition of the Language Resources and Evaluation Conference. Portorož, Slovenia. 2016. [poster]
Zhu, Dongqing, Stephen Wu*, Ben Carterette, Hongfang Liu. Using Discharge Summaries to Improve Information Retrieval in the Clinical Domain. Proceedings of the CLEF eHealth Evaluation Lab. Valencia, Spain. 2013. *Equal Contribution
Wu, Stephen, Dongqing Zhu, William Hersh, and Hongfang Liu. Clinical Information Retrieval with Split-layer Language Models. In Proc ACM SIGIR Workshop on Health Search and Discovery (HSD). 2013.
Wu, Stephen, Dongqing Zhu, Ben Carterette, Hongfang Liu. MayoClinicNLP-CORE: Semantic representations for textual similarity. Joint Conference on Lexical and Computational Semantics. Atlanta, GA. 2013.
Wagholikar KB, Boardman LA, Chaudhry R, Greenes RA, Wu Tsung-teh, Buehler SA, Larson DW, Sohn S, Wu ST, Kaggal VC, Liu H. Workflow-based Data Reconciliation for Clinical Decision Support: Case of Colorectal Cancer Screening and Surveillance. AMIA CRI. 2013.
Wu, Stephen, James Masanz, Ravikumar K.E., Hongfang Liu. Three Questions about Clinical Information Retrieval. Text Retrieval Conference - Medical Records Track. 2012.
Wagholikar, Kavishwar, Sunghwan Sohn, Stephen Wu, Vinod Kaggal, Sheila Buehler, Robert Greenes, Tsung-Teh Wu, David Larson, Hongfang Liu, Rajeev Chaudhry, Lisa Boardman. Clinical Decision Support for Colonoscopy Surveillance Using Natural Language Processing. IEEE HISB. La Jolla, CA. 2012.
Sohn, Sunghwan and Stephen Wu. Dependency Parser-based Negation Detection in Clinical Narratives. AMIA CRI. San Francisco, CA. 2012.
Liu, Hongfang, Kavishwar Wagholikar, and Stephen Tze-Inn Wu. Using SNOMED CT to encode summary level data - a corpus analysis. AMIA CRI. San Francisco, CA. 2012.
Wu, Stephen, Kavishwar Wagholikar, Sunghwan Sohn, Vinod Kaggal, Hongfang Liu. Empirical Ontologies for Cohort Identification. Text REtrieval Conference - Medical Records Track. 2011. [poster]
Wu, Stephen, and Hongfang Liu. Semantic Characteristics of NLP-extracted Concepts in Clinical Notes vs. Biomedical Literature. Proceedings of the Annual AMIA Fall Symposium. Washington DC. 2011. [slides]
Wu, Stephen T, Vinod C Kaggal, Guergana K Savova, Hongfang Liu, Dmitriy Dligach, Jiaping Zheng, Wendy W Chapman, and Christopher G Chute. Generality and Reuse in a Common Type System for Clinical Natural Language Processing. Proceedings of the First International Workshop on Managing Interoperability and compleXity in Health Systems. Glasgow, Scotland. 2011. [slides]
Schwartz, Lane, Chris Callison-Burch, William Schuler, and Stephen Wu. Incremental Syntactic Language Models for Phrase-based Translation. Proceedings of the Association for Computational Linguistics. Portland, Oregon. 2011. [errata]
Wu, Stephen, and William Schuler. Structured Composition of Semantic Vectors. Proceedings of the International Conference on Computational Semantics. Oxford, UK. 2011. [slides]
Wu, Stephen, Asaf Bachrach, Carlos Cardenas, and William Schuler. Complexity Metrics in an Incremental Right-corner Parser. Proceedings of the Association for Computational Linguistics. Uppsala, Sweden. 2010. [slides]
Wu, Stephen, Lane Schwartz, and William Schuler. Referential Semantic Language Modeling for Data-Poor Domains. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, Nevada. 2008.
Wu, Stephen, Lane Schwartz, William Schuler. Exploiting Referential Context in Spoken Language Interfaces for Data-Poor Domains. In Proceedings of the 2008 International Conference on Intelligent User Interfaces, Canary Islands, Spain. 2008.
Schuler, William, Tim Miller, Andrew Exley, and Stephen Wu. Dynamic Evidence Models in a DBN Phone Recognizer. In Proceedings of the 9th International Conference on Spoken Language Processing, Pittsburgh, Pennsylvania. 2006.

Journal Articles

Wu, Stephen, Kirk Roberts, Surabhi Datta, Jingcheng Du, Zongcheng Ji, Yuqi Si, Sarvesh Soni, Qiong Wang, Qiang Wei, Yang Xiang, Bo Zhao, Hua Xu. Deep Learning in Clinical Natural Language Processing: A Methodical Review. J Am Med Inform Assoc. Volume 27, Issue 3, March 2020, Pages 457–470. [Supplemental Material]
Wu, Stephen, Sijia Liu, Sunghwan Sohn, Sungrim Moon, Chung-il Wi, Young Juhn, Hongfang Liu. Modeling Asynchronous Event Sequences with RNNs. J Biomed Inform, 83:167-77. Jul 2018. [Author version]
Sohn S, Wi CI, Wu ST, Liu H, Ryu E, Krusemark E, Seabright A, Voge GA, Juhn YJ. Ascertainment of asthma prognosis using natural language processing from electronic medical records. J Allergy Clin Immunol, 141(6):2292-2294.e3. doi: 10.1016/j.jaci.2017.12.1003. Jun 2018.
Wu, Stephen T, Sijia Liu, Yanshan Wang, Tamara Timmons, Harsha Uppili, Steven Bedrick, William Hersh, and Hongfang Liu. Intra-institutional EHR Collections for Patient-Level Information Retrieval. Journal of the American Society for Information Science and Technology. doi:10.1002/asi.23884. September 2017.
Wang, Yanshan, Stephen Wu, Dingcheng Li, Saeed Mehrabi, Hongfang Liu. A Part-Of-Speech term weighting scheme for biomedical information retrieval. J Biomed Inform, 63:379-389. Oct 2016.
Wu, Stephen T, Young J Juhn, Sunghwan Sohn, Hongfang Liu. Patient-level Temporal Aggregation for Text-based Asthma Ascertainment. J Am Med Inform Assoc. 21(5):876-884, 2014. Errata: there are publisher-introduced typos for p(t) and a(t), which should be written with a conditional probability P(... | f(0)=1)
Wu, Stephen T, Timothy Miller, James Masanz, Matt Coarr, Scott Halgrim, David Carrell, Cheryl Clark. Negation's not solved: Generalizability versus Optimizability in clinical natural language processing. PLoS One. Nov 2014. [NLP Best Paper, 2015 IMIA Yearbook]
Pathak, Jyotishman, Ashutosh Jadhav, ... Stephen Wu. Comparative Analysis of Online Health Information Search Behavior. J Med Internet Res. 16(7):e160. Jul 04, 2014.
Zhu, Dongqing*, Stephen T Wu*, Ben Carterette, Hongfang Liu. Using large clinical corpora for query expansion in text-based cohort identification. J Biomed Inform. 49:275–81, June 2014. *Equal Contribution
Pathak, Jyotishman, ...... Stephen Wu, ... Christopher G Chute. Normalization and standardization of electronic health records for high-throughput phenotyping: the SHARPn consortium. J Am Med Inform Assoc. 20:341-348. 2013.
Wu, Stephen T, Sunghwan Sohn, Ravikumar K.E., Kavishwar Wagholikar, Siddhartha R. Jonnalagadda, Hongfang Liu, Young J Juhn. Automated Chart Review for Asthma Cohort Identification using Natural Language Processing: An Exploratory Study. Ann Allergy Asthma Immunol. 111(5):364-369, November, 2013.
Wu, Stephen T. Computational Semantics in Clinical Text (editorial). Biomedical Informatics Insights. 2013:Suppl. 1 3-5.
Jonnalagadda S, Cohen T, Wu S, Liu H, Gonzalez G. Using empirically constructed lexical resources for named entity recognition. Biomed Inform Insights. 6(Suppl 1):17-27. 2013.
Sohn S, Clark C, Halgrim SR, Murphy SP, Jonnalagadda SR, Wagholikar KB, Wu ST, Chute CG, Liu H. Analysis of cross-institutional medication description patterns in clinical narratives. Biomed Inform Insights. 6(Suppl 1):7-16. Epub 2013 Jun 24, 2013.
Wu, Stephen T, Vinod C Kaggal, Dmitriy Dligach, James J Masanz, Pei Chen, Lee Becker, Wendy W Chapman, Guergana K Savova, Hongfang Liu, Christopher G Chute. A common type system for clinical natural language processing. J Biomed Sem. 4:1. 2013.
Sohn S, Torii M, Li D, Wagholikar K, Wu S, Liu H. A Hybrid Approach to Sentiment Sentence Classification in Suicide Notes. Biomed Inform Insights. 5(Suppl. 1):43-50; 2012.
Jonnalagadda SR, Li D, Sohn S, Wu ST, Wagholikar K, Torii M, Liu H. Coreference analysis in clinical notes: a multi-pass sieve with alternate anaphora resolution modules. J Am Med Inform Assoc. 19(5):867-74. 2012.
Wu, Stephen, Hongfang Liu, Dingcheng Li, Cui Tao, Mark Musen, Christopher Chute, and Nigam Shah. UMLS Term Occurrences in Clinical Notes: A Large-scale Corpus Analysis. J Am Med Inform Assoc. Published online first: 4 April, 2012. doi:10.1136/amiajnl-2011-000744 [slides]
Jonnalagadda, Siddhartha, Trevor Cohen, Stephen Wu, Graciela Gonzalez. Enhancing clinical concept extraction with distributional semantics. Journal of Biomedical Informatics. Published online first: 7 November, 2011. 45(1):129-40, Elsevier, 2012.
Schuler, William, Stephen Wu, and Lane Schwartz. A Framework for Fast Incremental Interpretation during Speech Decoding. Computational Linguistics, 35(3):313–343, MIT Press, 2009.

Posters/Abstracts/Workshops

Mersha, Amanuel and Stephen Wu. DistillEmb: Distilling word embeddings via contrastive learning. Widening NLP 2022. Abu Dhabi, UAE. 2022.
Wu, Stephen T, Young J Juhn, Sunghwan Sohn, Ravikumar Komandur-Elayavilli, Kavishwar Wagholikar, Siddhartha Jonnalagadda, Hongfang Liu. Automated Chart Review for Asthma Cohort Identification using Natural Language Processing. Proceedings of the Annual AMIA Fall Symposium. Chicago, IL. 2012.
Wu, Stephen, Hongfang Liu, Dingcheng Li, Cui Tao, Mark Musen, Christopher Chute, and Nigam Shah. UMLS Term Occurrences in Clinical Notes: A Large-scale Corpus Analysis. AMIA CRI, journal-eligible. San Francisco, CA. 2012. [slides - see JAMIA article]
Wu, Stephen, Sunghwan Sohn, Kavishwar Wagholikar, Sheila Buehler, Lisa Boardman. Systematizing Colonoscopy Follow-up Recommendations. AMIA CRI, San Francisco, CA. 2012. [abstract|poster]
Liu, Hongfang, Manabu Torii, Stephen Wu, Vinod Kaggal, Christopher Chute. Modeling UIMA Type System Using Web Ontology Language - Towards Interoperability among UIMA NLP Tools. AMIA CRI, San Francisco, CA. 2012.

Other

Mark L. Wieland, Stephen T. Wu, Vinod C. Kaggal, and Barbara P. Yawn. Tracking Health Disparities Through Natural-Language Processing. American Journal of Public Health. Vol. 103, No. 3, pp. 448-449. March 2013.

Thesis

Wu, Stephen. Vectorial Representations of Meaning for a Computational Model of Language Comprehension. University of Minnesota, 2010.
