Jiaheng Wei
Assistant Professor
Data Science and Analytics (DSA) Thrust
Hong Kong University of Science and Technology (Guangzhou)
Email: jiahengwei@hkust-gz.edu.cn Office: E3-305
About Me
I am a tenure-track Assistant Professor in the DSA Thrust at the Hong Kong University of Science and Technology (Guangzhou). Previously, I was an Advanced AI Research Scientist (Senior Manager) at Accenture. I received my Ph.D. in Computer Science from the University of California, Santa Cruz, where I was advised by Prof. Yang Liu and was the recipient of the 2023-2024 Jack Baskin and Peggy Downes-Baskin Fellowship. Before that, I received my Master of Science degree (Data Science) from Brown University and my B.S. degree in Honors Math & Honors Youth (Gifted Young) from Xi’an Jiaotong University.
My research interests lie in trustworthy machine learning and large language models, including robust learning under real-world constraints (label errors in human-generated data, class-imbalanced learning, group distributional robustness), data alignment in large language models, incentive design for data collection, and generative modeling.
[2025. 05] On the Generalization Ability of Machine-Generated Text Detectors was accepted by KDD 2025 -- Dataset & Benchmark Track.
[2025. 05] Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-tuning was accepted by ICML 2025.
[2025. 01] LLM unlearn loss adjustment with only the forget data & Improving data efficiency via curating LLM-driven rating systems were accepted by ICLR 2025.
[2025. 01] The exploration of human-machine perceptual differences was accepted by AAAI 2025 (Oral).
[2024. 12] I joined the Data Science and Analytics Thrust of the Information Hub at Hong Kong University of Science and Technology (HKUST) – Guangzhou Campus, as an assistant professor.
[2024. 07] I officially passed my Ph.D. dissertation defense.
[2024. 06] Our team (at Accenture) published a paper on the Fortune Analytics Language Model (FALM), which powers fortune.com/analytics.
[2024. 04] I joined Accenture as an Advanced AI Research Scientist (Senior Manager).
[2023. 08] We delivered a hands-on tutorial on learning with noisy labels at IJCAI 2023.
[2023. 05] docta.ai is online. It is a library that helps you understand and curate your data.
[2023. 05] One first-author paper was accepted to KDD 2023.
[2023. 05] I am honored to be selected as the sole recipient of the 2023-24 Jack Baskin and Peggy Downes-Baskin Fellowship.
[2023. 05] Invited talk at AI-Time.
[2023. 04] Invited talk at the TMLR Young Scientist Seminar.
[2023. 03] Oral presentation at WSDM 2023 (Crowd Science Workshop).
[2023. 02] I will join ByteDance AI Lab as a Machine Learning (Research) Intern this summer.
[2023. 01] One first-author paper was accepted to ICLR 2023 (work done at Google Brain).
[2022. 12] Joined CROSS as a Research Fellow.
[2022. 10] Invited talk hosted by the Domain Adaptation Team at the University of Toronto.
[2022. 08] Invited talk at AI-Time.
[2022. 07] 20-minute long oral presentation at ICML 2022 (Deep Learning: Robustness).
[2022. 07] One first-author paper was accepted to ECCV 2022.
[2022. 06] Invited talk at AI-Time.
[2022. 05] One first-author paper was accepted to ICML 2022 (Long Presentation, 2.1%).