Publications
100+ papers at top AI/NLP/Speech/CV conferences. Below is a list of my selected and recent publications. For a complete list, see my Google Scholar or ResearchGate profile.
2024
Hong, Y., Zhang, K., Gu, J., Bi, S., Zhou, Y., Liu, D., Liu, F., Sunkavalli, K., Bui, T., Tan, H. LRM: Large Reconstruction Model for Single Image to 3D. ICLR 2024 (oral, 1.2% acceptance rate of 7262 submissions)
Xing, L., Tran, Q., Caba, F., Dernoncourt, F., Yoon, S., Wang, Z., Bui, T., Carenini, G. Multi-Modal Video Topic Segmentation with Dual-Contrastive Domain Adaptation. Multimedia Modeling 2024.
Argaw, D., Yoon, S., Heilbron, F., Deilamsalehy, H., Bui, T., Wang, Z., Dernoncourt, F., Chung, J. Scaling Up Video Summarization Pretraining with Large Language Models. CVPR 2024
Kim, H., Yoon, S., Bui, T., Zhao, H., Tran, Q., Dernoncourt, F., Kang, J. Fine-tuning CLIP Text Encoders with Two-step Paraphrasing. EACL Findings 2024.
Pham, T., Chen, P., Nguyen, T., Yoon, S., Bui, T., Nguyen, A. PEEB: Part-based Image Classifiers with an Explainable and Editable Language Bottleneck. NAACL Findings 2024
2023
Lai, V., Ngo, N., Veyseh, A., Man, H., Dernoncourt, F., Nguyen, T. ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning. EMNLP 2023
Kim, Y., Hwang, Y., Yun, H., Yoon, S., Bui, T., Jung, K. PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning. EMNLP 2023
Chen, C., Nguyen, C., Hoffswell, J., Healey, J., Bui, T., Weibel, N. PaperToPlace: Transforming Instruction Documents into Spatialized and Context-Aware Mixed Reality Experiences. UIST 2023
Hong, Y., Zhou, Z., Zhang, R., Dernoncourt, F., Bui, T., Gould, S., Tan, H. Learning Navigational Visual Representations with Semantic Map Supervision. ICCV 2023
Wu, Q., Liu, Y., Zhao, H., Bui, T., Lin, Z., Zhang, Y., Chang, S. Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis. ICCV 2023
Croitoru, I., Bogolin, S., Albanie, S., Liu, Y., Wang, Z., Yoon, S., Dernoncourt, F., Jin, H., Bui, T. Moment Detection in Long Tutorial Videos. ICCV 2023
Lai, V., Salinas, A., Tan, H., Bui, T., Tran, Q., Yoon, S., Deilamsalehy, H., Dernoncourt, F., Nguyen, T. Boosting Punctuation Restoration with Data Generation and Reinforcement Learning. Interspeech 2023.
Prasad, A., Bui, T., Yoon, S., Deilamsalehy, H., Dernoncourt, F., Bansal, M. MeetingQA: Extractive Question-Answering on Meeting Transcripts. ACL 2023
Qiu, J., Zhu, J., Xu, M., Dernoncourt, F., Bui, T., Wang, Z., Li, B., Zhao, D., Jin, H. SCCS: Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment. ACL 2023
Wu, Q., Liu, Y., Zhao, H., Kale, A., Bui, T., Yu, T., Lin, Z., Zhang, Y., Chang, S. Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models. CVPR 2023
He, B., Wang, J., Qiu, J., Shrivastava, A., Bui, T., Wang, A. Align and Attend: Multimodal Summarization with Dual Contrastive Losses. CVPR 2023
Pham, T., Yoon, S., Bui, T., Nguyen, A. PiC: A Phrase-in-Context Dataset for Phrase Understanding and Semantic Search. EACL 2023
Phan, N., Ta, H., Duong, S., Hoang, N., Tran, S., Dao, H., Nguyen, C., Bui, T., Truong, S. LoGoViT: Local-Global Vision Transformer for Object Re-Identification. ICASSP 2023
Nguyen, H., Nguyen, N., Bui, T., Dao, H., Truong, S., Hoang, V. An efficient approach for real-time abnormal human behavior recognition on surveillance cameras. FG 2023
Tang, H., Duong, S., Nguyen, C., Huynh, T., Vo, D., Phan, C., Le, H., Bui, T., Truong, S. Wavelet Radiomic Features from Multiphase CT Images for Screening Hepatocellular Carcinoma: Analysis and Comparison. Nature, Scientific Reports 13, Article number: 19559, 2023.
2022
Chen, P., Li, Q., Biaz, S., Bui, T., Nguyen, A. gScoreCAM: What objects is CLIP looking at? ACCV 2022 (oral)
Seonwoo, Y., Yoon, S., Dernoncourt, F., Bui, T., Oh, A. Virtual Knowledge Graph Construction for Zero-Shot Domain-Specific Document Retrieval. COLING 2022 (long paper)
Mrini, K., Singh, H., Dernoncourt, F., Yoon, S., Bui, T., Chang, W., Farcas, E., Nakashole, N. Medical Question Understanding and Answering with Knowledge Grounding and Semantic Self-Supervision. COLING 2022 (long paper)
Salaam, C., Dernoncourt, F., Bui, T., Yoon, S. Offensive Content Detection Via Synthetic Code-Switched Text. COLING 2022 (short paper)
Veyseh, A., Tran, Q., Yoon, S., Manjunatha, V., Deilamsalehy, H., Jain, R., Bui, T., Chang, W., Dernoncourt, F., Nguyen, T. Keyphrase Prediction from Video Transcripts: New Dataset and Directions. COLING 2022 (short paper)
Maharana, A., Tran, Q., Yoon, S., Dernoncourt, F., Bui, T., Chang, W., Bansal, M. Multimodal Intent Discovery from Livestream Videos. NAACL Findings 2022
Cho, J., Yoon, S., Kale, A., Dernoncourt, F., Bui, T., Bansal, M. Fine-grained Image Captioning with CLIP Reward. NAACL Findings 2022
Tran, T., Chu, T., Hoang, V., Bui, T., Truong, H. An Efficient and High Fidelity Vietnamese Streaming End-to-End Speech Synthesis. Interspeech 2022
Kim, H., Kim, D., Yoon, S., Dernoncourt, F., Bui, T., Bansal, M. CAISE: Conversational Agent for Image Search and Editing. AAAI 2022
Lai, T., Bui, T., Kim, D. End-to-End Neural Coreference Resolution Revisited: A Simple Yet Effective Baseline. ICASSP 2022
Phan, N., Tran, S., Ta, H., Duong, S., Nguyen, C., Bui, T., Truong, S. Adaptive Proxy Anchor Loss for Deep Metric Learning. ICIP 2022
Nguyen, M., Tran, V., Hoang, V., Ta, H., Bui, T., Truong, S. ViHealthBERT: Pre-trained Language Models for Vietnamese in Health Text Mining. LREC 2022
Pham, D., Nguyen, T., Nguyen, N., Nguyen, K., Nguyen, C., Bui, T., Truong, S. SegTransVAE: Hybrid CNN-Transformer with Regularization for Medical Image Segmentation. ISBI 2022
Nguyen, H., Hoang, V., Bui, T., Truong, S., Huynh, T., Nguyen, D., Nguyen, T., Cong, C. An Efficient Approach for Tuberculosis Diagnosis on Chest X-Ray. ISBI 2022
Huynh, T., Nguyen, C., Nguyen, K., Bui, T., Truong, S. CapNeXt: Unifying Capsule And Resnext For Medical Image Segmentation. ISBI 2022
Ta, H., Hoang, H., Nguyen, C., Duong, S., Bui, T., Truong, S. Adversarial Contrastive Fourier Domain Adaptation for Polyp Segmentation. ISBI 2022
2021
Cho, S., Dernoncourt, F., Ganter, T., Bui, T., Lipka, N., Chang, W., Jin, H., Brandt, J., Foroosh, H., Liu, F. StreamHover: Livestream Transcript Summarization and Annotation. EMNLP 2021
Zhang, J., Bui, T., Yoon, S., Chen, X., Liu, Z., Xia, C., Tran, Q., Chang, W., Yu, P. Few-Shot Intent Detection via Contrastive Pre-Training and Fine-Tuning. EMNLP 2021
Lee, H., Yoon, S., Dernoncourt, F., Bui, T., Jung, K. UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning. ACL 2021
Mrini, K., Dernoncourt, F., Yoon, S., Bui, T., Chang, W., Farcas, E., Nakashole, N. A Gradually Soft Multi-Task and Data-Augmented Approach to Medical Question Understanding. ACL 2021
Pham, T., Bui, T., Mai, L., Nguyen, A. Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks? ACL Findings 2021
Shi, J., Xu, N., Xu, Y., Bui, T., Dernoncourt, F., Xu, C. Learning by Planning: Language-Guided Global Image Editing. CVPR 2021
Lai, T., Ji, H., Bui, T., Tran, Q., Dernoncourt, F., Chang, W. A Context-Dependent Gated Module for Incorporating Symbolic Semantics into Event Coreference Resolution. NAACL 2021
M'hamdi, M., Kim, D., Dernoncourt, F., Bui, T., Ren, X., May, J. X-METRA-ADA: Cross-lingual Meta-Transfer learning Adaptation to Natural Language Understanding and Question Answering. NAACL 2021
Lee, H., Yoon, S., Dernoncourt, F., Kim, D., Bui, T., Shin, J., Jung, K. KPQA: A Metric for Generative Question Answering Using Keyphrase Weights. NAACL 2021
Nguyen, H., Hoang, V., Nguyen, T., Bui, T. Automatic Radiology Report Editing Through Voice. Interspeech 2021: Show & Tell
Tran, Q., Huynh, T., Ta, H., Nguyen, C., Nguyen, A., Phan, V., Nguyen, N., Tran, T., Vu, D., Bui, G., Bui, T., Truong, S. XPGAN: X-ray Projected Generative Adversarial Network for Improving COVID-19 Image Classification. ISBI 2021
Improve Quora Question Pair Dataset for Question Similarity Task. RIVF 2021
Mrini, K., Dernoncourt, F., Yoon, S., Bui, T., Chang, W., Farcas, E., Nakashole, N. UCSD-Adobe at MEDIQA 2021: Transfer Learning and Answer Sentence Selection for Medical Summarization. BioNLP, NAACL 2021
Xiao, J., Wang, L., Dernoncourt, F., Bui, T., Sun, T., Han, J. Open-Domain Question Answering with Pre-Constructed Question Spaces. SRW, NAACL 2021
Tran, T., Tran, N., Duong, S., Ta, H., Nguyen, C., Bui, T., Truong, S. ReSORT: an ID-recovery multi-face tracking method for surveillance cameras. FG 2021
2020
Yoo, K., Lee, H., Dernoncourt, F., Bui, T., Chang, W., Lee, S. Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation. EMNLP 2020.
Mrini, K., Dernoncourt, F., Tran, Q., Bui, T., Chang, W., Nakashole, N. Rethinking Self-Attention: Towards Interpretability in Neural Parsing. EMNLP Findings 2020
He, X., Tran, Q., Haffari, G., Chang, W., Lin, Z., Bui, T., Dernoncourt, F., Dam, N. Scene Graph Modification Based on Natural Language Commands. EMNLP Findings 2020
Agarwal, S., Bui, T., Lee, J-Y., Konstas, I., Rieser, V. History for Visual Dialog: Do we really need it? ACL 2020
Wu, C., Lin, Z., Cohen, S., Bui, T., Maji, S. PhraseCut: Language-based Image Segmentation in the Wild. CVPR 2020
Lai, T., Bui, T., Kim, D., Tran, Q. A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents. COLING 2020
Shi, J., Xu, N., Bui, T., Dernoncourt, F., Wen, Z., Xu, C. A Benchmark and Baseline for Language-Driven Image Editing. ACCV 2020
Lai, T., Tran, Q., Bui, T., Kihara, D. A simple but effective BERT model for dialog state tracking on resource-limited systems. ICASSP 2020
Lai, T., Bui, T., Lipka, N. ISA: An Intelligent Shopping Assistant. AACL-IJNLP 2020 (Demonstration Session)
Le, N., Lai, T., Bui, T., Kim, D. AutoNLU: An On-demand Cloud-based Natural Language Understanding System for Enterprises. AACL-IJNLP 2020 (Demonstration Session)
Huynh, T., Nguyen, C., Ta, H., Hoang, H., Bui, T., Truong, S. Diffeomorphism Matching for Fast Unsupervised Pretraining on Radiographs. BMVC 2021
Lin, T., Rudnicky, A., Bui, T., Kim, D., Oh, J. Adjusting Image Attributes of Localized Regions with Low-level Dialogue. LREC 2020
Yoon, S., Dernoncourt, F., Kim, D., Bui, T., Jung, K. Propagate-Selector: Detecting Supporting Sentences for Question Answering via Graph Neural Networks. LREC 2020
Lee, H., Yoon, S., Dernoncourt, F., Kim, D., Bui, T., Jung, K. ViLBERTScore: Evaluating Image Caption Using Vision-and-Language BERT. Eval4NLP, EMNLP 2020
Colas, A., Bui, T., Dernoncourt, F., Sinha, M., Kim, D. Efficient Deployment of Conversational Natural Language Interfaces over Databases. NLI, ACL 2020
Lee, H., Yoon, S., Dernoncourt, F., Kim, D., Bui, T., Jung, K. DSTC8-AVSD: Multimodal Semantic Transformer Network with Retrieval Style Word Generator. DSTC8, AAAI 2020
2019
Tan, H., Dernoncourt, F., Lin, Z., Bui, T., Bansal, M. Expressing Visual Relationships in Language. ACL 2019 (Oral)
Lai, T., Tran, Q., Bui, T., Kihara, D. A Gated Self-attention Memory Network for Answer Selection. EMNLP 2019
Yoon, S., Dernoncourt, F., Kim, D., Bui, T., Jung, K. A Compare-Aggregate Model with Latent Clustering for Answer Selection. CIKM 2019
Dey, S., Motlicek, P., Bui, T., Dernoncourt, F. Exploiting semi-supervised training through a dropout regularization in end-to-end speech recognition. Interspeech 2019
Zhou, Y., Wang, Z., Fang, C., Bui, T., Berg, T. Dance Dance Generation: Motion Transfer for Internet Videos. HBU Workshop at ICCV 2019 (Oral)
Kale, A., Faieta, B., Lin, Z., Bui, T., King, T. Multi-modal Correlated Tag Propagation For Boosting Image Search Relevance. DCCL Workshop at KDD 2019
Wang, L., Dernoncourt, F., Bui, T. Bayesian Optimization for Selecting Efficient Machine Learning Models. MoST-Rec Workshop at CIKM 2019
2018
Lin, T., Bui, T., Kim, D., Oh, J. A Multimodal Dialogue System for Conversational Image Editing. NIPS 2018 - 2nd Conversational AI Workshop
Lai, T., Bui, T., Lipka, N., Li, S. Supervised Transfer Learning for Product Information Question Answering. ICMLA 2018
Dernoncourt, F., Bui, T., Chang, W. A Framework for Speech Recognition Benchmarking. Interspeech 2018 (Show and Tell Demonstrations)
Chen, K., Zhang, C., Fang, C., Wang, Z., Bui, T., Nevatia, R. Visually Indicated Sound Generation by Perceptually Optimized Classification. 1st Multimodal Learning and Applications Workshop, ECCV 2018 (Best Paper Award)
Lai, T., Bui, T., Li, S. A Review on Deep Learning Techniques Applied to Answer Selection. COLING 2018
Manuvinakurike, R., Brixey, J., Bui, T., Chang, W., Artstein, R., Georgila, K. DialEdit: Annotations for spoken conversational image editing. Joint ACL-ISO Workshop on Interoperable Semantic Annotation, COLING 2018.
Lai, T., Bui, T., Li, S., Lipka, N. A Simple End-to-End Question Answering Model for Product Information. ACL 2018 Workshop ECONLP
Manuvinakurike, R., Bui, T., Chang, W., Georgila, K. Conversational Image Editing: Incremental Intent Identification in a New Dialogue Task. SIGDIAL 2018 (Best Paper Award)
Zhou, Y., Wang, Z., Fang, C., Bui, T., Berg, T. Visual to Sound: Generating Natural Sound for Videos in the Wild. CVPR 2018
Tran, Q., Lai, T., Zuckerman, I., Haffari, G., Bui, T., Bui, H. The Context-dependent Additive Recurrent Neural Net. NAACL 2018
Cohan, A., Dernoncourt, F., Kim, D., Bui, T., Kim, S., Chang, W., Goharian, N. A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents. NAACL 2018
Manuvinakurike, R., Brixey, J., Bui, T., Chang, W., Kim, D., Artstein, R., Georgila, K. Edit me: A Corpus and a Framework for Understanding Natural Language Image Editing. LREC 2018
Dulceanu, A., Le, T., Chang, W., Bui, T., Kim, D., Vu, C., Kim, S. PhotoshopQuiA: A Corpus of Non-Factoid Questions and Answers for Why-Question Answering. LREC 2018
2017
Chen, K., Bui, T., Fang, C., Wang, Z., Nevatia, R. AMC: Attention guided Multi-modal Correlation Learning for Image Search. CVPR 2017
Dernoncourt, F., Lee, J., Bui, T., Bui, H. Robust Dialog State Tracking for Large Ontologies. Dialogues with Social Robots, pp. 475-485
2016
Sedhain, S., Bui, H., Kawale, J., Vlassis, N., Kveton, B., Menon, A., Bui, T., Sanner, S. Practical linear models for large-scale one-class collaborative filtering. IJCAI 2016
Dernoncourt, F., Lee, J., Bui, T., Bui, H. Robust Dialog State Tracking for Large Ontologies. International Workshop on Spoken Dialog Systems (IWSDS 2016). DSTC4 Winning Entry for the main task. See the detailed results of team 3 here.
Dernoncourt, F., Lee, J., Bui, T., Bui, H. Adobe-MIT Submission to the DSTC 4 Spoken Language Understanding Pilot Task. IWSDS 2016. See the detailed results of team 3 here.
Bakhshandeh, O., Bui, T., Lin, Z., Chang, W. Proposing Plausible Answers for Open-ended Visual Question Answering. arXiv, 2016
Le, T., Phan, T., Bui, T., Vu, C. A service-oriented framework for big data-driven knowledge management systems. International Conference on Exploring Services Science, 2016
Le, T., Phan, T., Bui, T. Towards an architecture for big data-driven knowledge management systems. AMCIS 2016.
2014 and earlier
Peters, S., Bui, T. A Speech-driven E-book Technology Prototype Demo. IEEE Spoken Language Technology Workshop (SLT 2014)
Bui, T. A Multi-core Fitted Q Iteration Algorithm for Customer Lifetime Value Optimization. ICML 2014, Workshop on Customers Value Optimization in Digital Marketing
Bui, T., Peters, S. Decision Detection using Hierarchical Graphical Models. ACL 2010
Bui, T., Zwiers, J., Poel, M., Nijholt, A. Affective Dialogue Management Using Factored POMDPs. Interactive Collaborative Information Systems, SCI 281, 2010
Bui, T., Frampton, M., Dowding, J., Peters, S. Extracting decisions from multi-party dialogue using directed graphical models and semantic similarity. SIGDIAL 2009
Bui, T., Poel, M., Nijholt, A., Zwiers, J. A tractable hybrid DDN-POMDP approach to affective dialogue modeling for probabilistic frame-based dialogue systems. Natural Language Engineering, 15(2), 2009
Frampton, M., Huang, J., Bui, T., Peters, S. Real-time decision detection in multi-party dialogue. EMNLP 2009
Bui, T. Toward Affective Dialogue Management using Partially Observable Markov Decision Processes. DOI: 10.3990/1.9789036527149, 2008
Bui, T., van Schooten, B., Hofs, D. Practical Dialogue Manager Development using POMDPs. SIGDIAL 2007
Bui, T., Zwiers, J., Nijholt, A., Poel, M. Generic Dialogue Modeling for Multi-application Dialogue Systems. Machine Learning for Multimodal Interaction, LNCS 3866, 2006
Bui, T. Multimodal Dialogue Management - State of the art. CTIT Technical Report Series, no. 06-01, Centre for Telematics and Information Technology (CTIT), Enschede, 2006
Lisowska, A., Rajman, M., Bui, T. ARCHIVUS: A System for Accessing the Content of Recorded Multimodal Meetings. Machine Learning for Multimodal Interaction, LNCS 3361, 2005
Bui, T., Rajman, M., Melichar, M. Rapid Dialogue Prototyping Methodology. Text, Speech and Dialogue, LNCS 3206, 2004.
Bui, T., Rajman, M. Rapid Dialogue Prototyping Methodology. EPFL Technical Report, 2004.
Rajman, M., Bui, T., Rajman, A., Seydoux, F., Trutnev, A., Quarteroni, S. Assessing the Usability of a Dialogue Management System Designed in the Framework of a Rapid Dialogue Prototyping Methodology. Acta Acustica United with Acustica-Stuttgart, Vol. 90, Issue 6, pages 1096-111, 2004