Hongyu Zhang
Primary Email Address: hongyujohn@gmail.com
I am a Professor and Dean of the School of Big Data and Software Engineering, Chongqing University, China. I have also worked at The University of Newcastle, Australia, Microsoft Research, and Tsinghua University. I received my PhD degree in Computer Science from the School of Computing, National University of Singapore, in 2003.
My research is in the area of software engineering, in particular, intelligent software engineering, software analytics, software fault management, software maintenance, and software reuse. The main theme of my research is to improve software quality and productivity by mining and analyzing software data.
I am a Senior Member of IEEE, a Distinguished Member of ACM, a Distinguished Member of CCF, and a Fellow of Engineers Australia (FIEAust). I am a recipient of the David Lorge Parnas Fellowship.
I am always open to collaboration!
Research Area
My research area is software engineering, in particular:
software analytics, data-driven software engineering
intelligent software and service engineering
software testing, debugging, fault diagnosis
software maintenance and reuse, software product line
The main theme of my research is to improve software quality and productivity by utilizing knowledge mined from software data. Over the years, a software organization could accumulate a large amount of data including source code, bug reports, execution logs, changes, metrics, documents, and so on. Data mining, machine learning, and information retrieval techniques can be applied to extract knowledge from the software data and solve software engineering problems. Together with my students and collaborators, I have published more than 250 research papers in reputable international journals and conferences.
According to two independent studies on Bibliometric Assessment of Software Engineering Scholars (2010-2017, 2013-2020), I am among the top 20 most active experienced researchers in software engineering. I was recognized in The Australian’s Top Researchers special edition (September 2020) as the leading researcher in the field of Software Systems. I am among the World’s Top 2% Scientists (2020, career-long). According to a recent study, I also rank No. 1 worldwide in publications at top software engineering conferences.
Publications
Some of my publications can be found here.
The complete list of publications is available at: DBLP and Google Scholar.
I received the following ACM SIGSOFT Distinguished Paper/Artifact Awards and Best Paper Awards:
ACM Distinguished Paper Award: When to Stop: Towards Efficient Code Generation in LLMs with Excess Token Prevention, Proc. ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024), Sep 2024.
ACM Distinguished Artifact Award: BARO: Robust Root Cause Analysis for Microservices via Multivariate Bayesian Online Change Point Detection. Proc. International Conference on the Foundations of Software Engineering (FSE 2024), July 2024.
ACM Distinguished Paper Award: Modularizing while Training: a New Paradigm for Modularizing DNN Models, the 46th IEEE/ACM International Conference on Software Engineering (ICSE 2024), April 2024.
ACM Distinguished Paper Award: SPINE: a scalable log parser with feedback guidance. Proc. the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022), November 2022, pp. 1198–1208.
ACM Distinguished Paper Award: An Empirical Study on Program Failures of Deep Learning Jobs, Proc. the 42nd International Conference on Software Engineering (ICSE 2020), June-July 2020, Seoul, South Korea, pp. 1159–1170.
ACM Distinguished Paper Award: History-Guided Configuration Diversification for Compiler Test-Program Generation, Proc. The 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019), San Diego, California, USA, November 2019, pp. 305-316.
ACM Distinguished Paper Award: SymCrash: Selective Recording for Reproducing Crashes, Proc. The 29th IEEE/ACM International Conference on Automated Software Engineering (ASE 2014), Västerås, Sweden, September 2014, pp. 791-802.
ACM Distinguished Paper Award: CrashLocator: Locating Crashing Faults based on Crash Stacks, Proc. International Symposium on Software Testing and Analysis (ISSTA 2014), San Jose, CA, July 2014, pp. 204-214.
Best Paper Award: How Long Will it Take to Mitigate this Incident for Online Service Systems?, Proc. 32nd International Symposium on Software Reliability Engineering (ISSRE 2021), Wuhan, China, Oct 2021, pp. 36-46.
Research Grants
Australian Research Council (ARC) Discovery Project, Intelligent Incident Management for Software-Intensive Systems, Grant No. DP220103044, 2022-2024. (Lead CI)
Australian Research Council (ARC) Discovery Project, Data-driven Approach to Resilient Online Service Systems, Grant No. DP200102940, 2020-2022. (Lead CI)
NSF China, Project “Software Crash Analysis”, Grant No. 61272089 (PI)
NSF China, Project “Software Defect Prediction Models and Applications”, Grant No. 61073006, 2011 – 2013. (PI)
NSF China, Project "Software Customization Techniques", Grant No. 60703060, 2008-2011. (PI)
NSF China, Project "Software Defect and Failure Prediction Techniques", Grant No. 90718022, 2008-2011. (PI)
National High-tech 863 Project No. 2007AA01Z122, 2008-2010. (Co-PI)
National High-tech 863 Project No. 2007AA01Z480, 2008-2010. (Co-PI)
The 3rd Tsinghua University Zi-Zhu Research Program, "Software Quality Measurement and Prediction", Project ID: 2010THZ0, 2011-2013. (PI)
The 6th Key Researcher Support Program, Tsinghua University, 2007-2009. (PI)
Tool Development
I am involved in the Microsoft project Developer Assistant, which puts millions of code snippets at your fingertips while you are coding in Visual Studio. News: Microsoft Blog, Visual Studio Blog 2, Visual C++ Team Blog
BugLocator: locating buggy source code files based on bug reports.
ReLink: recovering missing links between fixed bugs and committed changes.
XVCL: an XML-based variant configuration language.
Log Intelligence: Data-driven, AI-enabled log analytics tools.
Program Committee
The International Conference on Software Engineering ICSE 2021, ICSE 2024, ICSE 2025 (Technical Track)
IEEE/ACM International Conference on Automated Software Engineering (ASE 2015, ASE 2018, ASE 2019, ASE 2020, ASE 2021, ASE 2022, ASE 2023).
The ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024, ISSTA 2025)
The Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering ESEC/FSE 2015 (Industrial Track), FSE 2016, ESEC/FSE 2017, FSE 2018 (artifacts), FSE 2022, FSE 2024 (research track)
International Conference on Machine Learning (ICML 2020 - 2024)
International Conference on Learning Representations (ICLR 2020 - 2023)
AAAI Conference on Artificial Intelligence (AAAI 2020 - 2024)
Annual Conference on Neural Information Processing Systems (NeurIPS 2021 - 2023)
The 34th/35th/36th International Conference on Software Engineering ICSE 2012, ICSE 2013 (Formal Demonstration Track), ICSE 2014 (Tutorials and Technical Briefings), ICSE 2020 (artifacts)
The IEEE International Conference on Software Maintenance and Evolution (ICSM 2013, ICSME 2014, ICSME 2016, ICSME 2017, ICSME 2018, ICSME 2019).
The International Conference on Software Analysis, Evolution and Reengineering (SANER 2015, SANER 2016, SANER 2017, SANER 2018 (industrial), SANER 2020 (industrial), SANER 2021 (industrial), SANER 2023)
Working Conference on Mining Software Repositories (MSR 2013, MSR 2014, MSR 2015, MSR 2017, MSR 2020, MSR 2021, MSR 2022)
International Symposium on Empirical Software Engineering and Measurement (ESEM 2016 - 2024)
The 15th/16th/17th/18th European Conference on Software Maintenance and Reengineering (CSMR 2011, CSMR 2012, CSMR 2013, CSMR-18/WCRE-21).
The 5th/6th/7th/8th International Conference on Predictive Models in Software Engineering (PROMISE 2010, PROMISE 2011, PROMISE 2012, PROMISE 2013, PROMISE 2014, PROMISE 2015, PROMISE 2016)
The 24th/25th/26th International Conference on Software Engineering and Knowledge Engineering (SEKE 2012, SEKE 2013, SEKE 2014)
The joint 10th International Workshop on Principles of Software Evolution and the 5th ERCIM Workshop on Software Evolution (IWPSE/EVOL'09), 24-25 August (co-located with ESEC/FSE 2009), Amsterdam.
The 3rd/6th/7th/8th IEEE International Symposium on Theoretical Aspects of Software Engineering (TASE 2009, TASE 2012, TASE 2013, TASE 2014)
The 15th/16th/17th/18th/19th/20th/21st/22nd/23rd/24th Asia-Pacific Software Engineering Conference (APSEC 2022, APSEC 2017, APSEC 2016, APSEC 2015, APSEC 2014, APSEC 2013, APSEC 2012, APSEC 2011, APSEC 2010, APSEC 2009, APSEC 2008)
The 17th/18th/19th/20th/22nd/23rd/24th Australian Software Engineering Conference (ASWEC 2018, ASWEC 2015, ASWEC 2014, ASWEC 2013, ASWEC 2010, ASWEC 2009, ASWEC 2008, ASWEC 2007, ASWEC 2006)
The 2nd/3rd SEMAT Workshop on a General Theory of Software Engineering (GTSE 2013, GTSE 2014)
The International Conference on Software Quality, Reliability and Security (formerly International Conference on Quality Software) (QSIC 2013-2014, QRS 2015-2022)
The 38th Annual International Computers, Software & Applications Conference (COMPSAC 2014)
The 9th International Conference on Global Software Engineering (ICGSE 2014)
The 24th International Conference on Program Comprehension (ICPC 2016)
The International Workshop on Software Engineering Research and Industrial Practice (SER&IP 2016, SER&IP 2017)
The Asian Conference on Pattern Languages of Programs (AsianPLoP 2014, 2015, 2016, 2017)
The IEEE Working Conference on Software Visualization (VISSOFT 2017)
International Workshop on Blockchain Oriented Software Engineering (IWBOSE 2018 - 2022)
The Asia-Pacific Symposium on Internetware (Internetware 2016, 2017, 2018)
The Annual Conference on Software Analysis, Testing and Evolution (SATE 2018, SATE 2019)
The 13th/14th/15th Workshop on Testing: Academia-Industry Collaboration, Practice and Research Techniques (TAIC-PART 2018, TAIC-PART 2019, TAIC-PART 2020)
The 18th International Conference on Software and Systems Reuse (ICSR 2019)
The 24th International Systems and Software Product Line Conference (SPLC 2020)
Program Organization
General co-chair: The 31st Asia-Pacific Software Engineering Conference (APSEC 2024), Chongqing, China, Dec 2024.
Co-chair: The 45th International Conference on Software Engineering (ICSE 2023), Technical Briefings Track, May 2023.
General co-chair, The 36th International Conference on Software Maintenance and Evolution (ICSME 2020)
Program co-chair, The 18th IEEE International Conference on Software Quality, Reliability, and Security (QRS 2018)
Program co-chair, The 25th Asia-Pacific Software Engineering Conference (APSEC 2018)
Tool Demonstration co-chair: The International Symposium on Software Testing and Analysis (ISSTA 2019)
Short Paper chair: 2018 Australian Software Engineering Conference (ASWEC 2018)
Co-organizer: Dagstuhl Seminar 17502 on "Testing and Verification of Compilers", Dec 2017, Germany.
Program co-chair, Early Research Achievements (ERA) track, ICSME’16
Program co-chair, The 12th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE’16)
The International Conference on Predictive Models in Software Engineering (PROMISE), 2014-2017. (Steering Committee Member)
The Second International Workshop on Software Mining (SoftMine-2013, co-located with ASE'13), Silicon Valley, CA, November 2013. (co-organizer)
The 8th International Workshop on Advanced Modularization Techniques (AOAsia/Pacific 2013), a workshop at AOSD 2013, March 2013. (co-organizer)
The First International Workshop on Software Mining (SoftMine-2012, co-located with KDD'12), Beijing, China, May 2012. (co-organizer)
The 12th International Conference on Quality Software (QSIC 2012), August 2012, Xi'an, China. (industry track co-chair)
The 26th European Conference on Object-Oriented Programming (ECOOP 2012), June 2012, Beijing, China. (local organisation co-chair)
Co-organizer: Workshop on Emerging Trends in Software Metrics (WETSoM @ ICSE 2014, WETSoM @ ICSE 2012, WETSoM @ ICSE 2011, WETSoM @ ICSE 2010, ETSM 2009)
15th Asia-Pacific Software Engineering Conference (APSEC 2008), Beijing, China, Dec 2008 (publicity chair).
Journal Services
I am an associate editor of ACM Computing Surveys, Automated Software Engineering, and Journal of Systems and Software.
I am a frequent reviewer for the following international journals: IEEE Transactions on Software Engineering, ACM Transactions on Software Engineering and Methodology, IEEE Software, IEEE Transactions on Knowledge and Data Engineering, Science of Computer Programming, Software Quality Journal, Software Practice & Experience, Journal of Software Maintenance and Evolution, Empirical Software Engineering.
I have been invited to review proposals for the Natural Science Foundation of China (NSFC), the Natural Sciences and Engineering Research Council of Canada (NSERC), the European Research Council (ERC), the Singaporean National Satellite of Excellence, the Hong Kong Research Grants Council (RGC), and the Australian Research Council (ARC).
Invited Talks/Keynotes
Keynote: Intelligent Software Engineering – Progress and Challenges, The 30th Asia-Pacific Software Engineering Conference (APSEC 2023), Seoul, Korea, Dec 2023.
Invited: Automated Fault Detection for Software Systems through Log Intelligence, The ICSE'23 Workshop on Cloud Intelligence / AIOps, Melbourne, Australia, May 2023.
Invited: An Empirical Study on Program Failures of Deep Learning Jobs, China Computer Federation (CCF) Software Frontier Forum, May 2020.
Keynote: Prediction Models in Software Engineering, The 2nd Forum on Mining Software Repository, Hangzhou, China, Nov 2019.
Keynote: Intelligent Fault Diagnosis and Prediction through Data Analytics, The 6th International Workshop on Quantitative Approaches to Software Quality (QuASoQ 2018).
Invited: Towards Effective Code Reuse by Searching, Workshop on Machine Learning and Software Engineering, National Institute of Informatics (NII), Tokyo, Dec 3, 2018.
Invited: AI-Enabled Software and Service Engineering, The 2018 Computing in the 21st Century Conference & Asia Faculty Summit, Microsoft Research Asia, Nov 2018.
Invited: Log-based Fault Diagnosis for Large-Scale Software Systems, Asian-Pacific Workshop of Advanced Software Engineering, Gold Coast, Australia, Nov 2018.
Invited: Towards Intelligent Software Development, The First Yanqi Meeting on Automatic Software Engineering, Beijing, China, Oct 2018.
Invited: Towards Intelligent Code Reuse, 2017 China Software Engineering Research and Industry Summit, Sep 2017, Shanghai, China.
Keynote: Software Analytics: Data-Driven Software Engineering, The Fourth International Workshop on Software Mining, Nov 2015, Lincoln, Nebraska, USA (co-located with ASE 2015)
Invited: Code Search: Research and Practice, The 3rd Chinese forum of Software Engineering Research and Practice (SERP 2016), July 20, 2016, Beijing, China
Invited: Towards a Theory of Software Engineering, The 5th International Workshop on Theory-Oriented Software Engineering, May 15, 2016, Austin, Texas, USA (co-located with ICSE 2016)
Invited: Effective Bug Management via Software Analytics, 4th International Symposium on High Confidence Software (ISHCS 2015), Jan 2015, Beijing, China.
Invited: Monte Verita Symposium on Developer Support, Switzerland, March 2012.
Invited: MSR (Mining Software Repository) Vision 2020, Canada, August 2012.
Invited: Symposium on Advanced Software Engineering Techniques, Shanghai Jiaotong University, 2012.
Invited: Symposium on Software Quality and Analysis, Nanjing University, 2012.
Seminar: at University of Texas at Dallas, Feb 2017.
Seminar: at University of Science and Technology Beijing, April 2016.
Seminar: at Chinese Academy of Sciences, April 2016.
Seminar: at Tsinghua University, May 2014.
Visiting Positions
I was a visiting professor/scholar at the following organizations:
University of Cagliari, Italy (1/2011 – 3/2011)
Microsoft Research Asia (7/2012 – 8/2012, 12/2017-1/2018, 12/2018-1/2019)
Swinburne University of Technology, Australia (8/2012 – 9/2012)
The Hong Kong University of Science and Technology (10/2012 – 3/2013)
University of Toronto/University of Waterloo (5/2001 - 9/2001)
Teaching
I taught the following courses to postgraduate and undergraduate students:
Software Verification and Validation
Software Measurement
Software Quality Engineering (This course was rated in the top 15% of all postgraduate courses offered at Tsinghua University in 2011)
Software Reuse
Software Engineering Projects
Object-Oriented Programming
Students
I am grateful that I have the privilege to advise the following brilliant students/interns:
Liya Chakma, Rongxin Wu (now at Xiamen University), Jian Zhou (now at Baidu), Liang Gong (now at Facebook), Jianxun Yang, Shuijin Lu, Jue Wang (now at Postal Bank), Shuai Chen (now at Facebook), Wei Li (now at Google), Jiangtao Gong (now at Tsinghua), Ke Ma, Bei Shi (now at CUHK), Lu Zhang (now at Virginia Tech), Zeqi Shen, Yu Cao, Bo Zhang, Van-Hoang Le...
Fei Lv (now at Alibaba), Galina Meyer (now at Stanford), Qing Ren (now at UCLA), Pinjia He, Sheng Tian, Wenhao Song, Senlan Yao (now at Google), Bonan Dong (now at Cornell), Xutong Chen, Wangsheng Hu, Hong Wu (now at Morgan Stanley), Jinbo Pan, Xiaodong Gu (now at Shanghai Jiaotong University), Wenxiang Hu (now at Microsoft), Chengxun Shu (now at 4Paradigm), Xingzhao Yue (now at Huawei), Chen Xia (now at UCLA)...
Note: If I have missed any of you accidentally, please do email me (and forgive me). Please also let me know your latest status.
My Erdős number is 4: Hongyu Zhang - Stanislaw Jarzabek - Tomasz Krawczyk - William T. Trotter, Jr. - Paul Erdős
(Last updated: Jan 2024)
Psalm 67:1-3: May God be gracious to us and bless us, and make his face shine on us, so that your ways may be known on earth, your salvation among all nations.