Dr Hongyu Zhang

 


Hongyu Zhang


Personal Email: hongyujohn@gmail.com
Skype ID: hongyujohn
GTalk ID: hongyujohn


I am now a researcher at Microsoft Research in Beijing, China. Before joining Microsoft in 2014, I was an associate professor at Tsinghua University, China (2006-2014); a lecturer at RMIT University, Australia (2003-2006); a research fellow at the National University of Singapore (2000-2002); and a software engineer at IBM Singapore (1999-2000). I received my PhD degree in Computer Science from the School of Computing, National University of Singapore, in 2003.


My research is in the area of software engineering, in particular software analytics, software quality, software maintenance, and software reuse. The main theme of my research is to improve software quality and productivity by utilizing knowledge mined from software repositories. Over the years, a software organization can accumulate a large amount of data, including source code, bug reports, execution logs, changes, metrics, and documents. Data mining, machine learning, and information retrieval techniques can be applied to extract knowledge from this software data and solve software engineering problems. Together with my students and collaborators, I have published more than 80 research papers in international journals and conferences. More details about the papers can be found on my Google Scholar page.

Outside work, I like reading, hiking, spending time with friends, and playing with my two kids.

Research Area

My research area is software engineering, in particular:

  • software analytics, mining software repositories
  • software measurement and empirical software engineering
  • software quality assurance, testing, debugging
  • software reuse (generative programming and software product lines)
  • software maintenance
My publications are listed on DBLP (a few of the entries there are not mine) and Google Scholar.

Research Grants:
  • NSF China, Project “Software Crash Analysis”, Grant No. 61272089, 2013 – 2016. (PI)
  • NSF China, Project “Software Defect Prediction Models and Applications”, Grant No. 61073006, 2011 – 2013. (PI)
  • NSF China, Project "Software Customization Techniques", Grant No. 60703060, 2008-2011. (PI)
  • NSF China, Project "Software Defect and Failure Prediction Techniques", Grant No. 90718022, 2008-2011. (PI)
  • National High-tech 863 Project No. 2007AA01Z122, 2008-2010. (Co-PI)
  • National High-tech 863 Project No. 2007AA01Z480, 2008-2010. (Co-PI)
  • The 6th Key Researcher Support Program, Tsinghua University, 2007-2009. (PI)

Recent Publications by year:
  • Rongxin Wu, Hongyu Zhang, Shing-Chi Cheung and Sunghun Kim, CrashLocator: Locating Crashing Faults based on Crash Stacks, to appear, Proc. International Symposium on Software Testing and Analysis (ISSTA'14), San Jose, CA, July 2014.
  • Hucheng Zhou, Jian-Guang Lou, Hongyu Zhang, Haoxiang Lin, and Tingting Qin, Common Causes and Mitigations of Service Quality Issues in Big Data Computing, technical report, MSR-TR-2014-34, March 2014
  • Sun Ding, Hongyu Zhang, H. B. K. Tan, Detecting Infeasible Branches based on Code Patterns, in Proc. Conference on Software Maintenance, Reengineering and Reverse Engineering (CSMR-WCRE’14), February 2014, Antwerp, Belgium, pp. 74-83.
  • Hongyu Zhang, Liang Gong, Steve Versteeg, Predicting Bug-Fixing Time: An Empirical Study of Commercial Software Projects, in Proc. 35th International Conference on Software Engineering (ICSE'13), May 2013, San Francisco, CA, USA, pp. 1042-1051. (full industry track paper, 20.4% acceptance rate).
  • Hongyu Zhang and S. C. Cheung, A Cost-Effectiveness Criterion for Applying Software Defect Prediction Models, in Proc. ESEC/FSE 2013, Saint Petersburg, Russia, Aug 2013.
  • Jiangtao Gong and Hongyu Zhang, BugMap: A Topographic Map of Bugs, in Proc. ESEC/FSE 2013, Saint Petersburg, Russia, Aug 2013.
  • K. Liu, H. B. K. Tan and H. Zhang, Has this Bug Been Reported?, in Proc. WCRE 2013, Koblenz, Germany, October 2013.
  • Dan Hao, Tian Lan, Hongyu Zhang, Chao Guo, Lu Zhang, Is This a Bug or an Obsolete Test?, in Proc. The European Conference on Object-Oriented Programming (ECOOP 2013), Montpellier, France, July 2013.
  • Jue Wang, Yingnong Dang, Hongyu Zhang, Kai Chen, Tao Xie and Dongmei Zhang, Mining Succinct and High-Coverage API Usage Patterns from Source Code, in Proc. MSR 2013, May 2013, San Francisco, CA, USA.
  • Giulio Concas, Maria Ilaria Lunesu, Michele Marchesi, Hongyu Zhang, Simulation of Software Maintenance Process, with and without a Work-In-Process Limit, Journal of Software: Evolution and Process, 25(12): 1225-1248, 2013.
  • Fayola Peters, Tim Menzies, Liang Gong, Hongyu Zhang, Balancing Privacy and Utility in Cross-Company Defect Prediction, IEEE Trans. on Software Eng., 39(8), 2013, 1054-1068.
  • Jian Zhou, Hongyu Zhang, Learning to Rank Duplicate Bug Reports, in Proc. 21st ACM Conference on Information and Knowledge Management (CIKM 2012), Maui, Hawaii, Oct 2012. (13.4% acceptance rate)
  • Ming Li, Hongyu Zhang, Rongxin Wu, and Zhi-Hua Zhou, Sample-based Software Defect Prediction with Active and Semi-supervised Learning, Journal of Automated Software Engineering, Springer, Jan 2012, pp.1-30.
  • Jian Zhou, Hongyu Zhang, and David Lo, Where Should the Bugs be Fixed? in Proc. 34th IEEE/ACM International Conference on Software Engineering (ICSE'12), Zurich, Switzerland, June 2012. (full research track paper, 21% acceptance rate).
  • Yingnong Dang, Rongxin Wu, Hongyu Zhang, Dongmei Zhang, and Peter Nobel, ReBucket – A Method for Clustering Duplicate Crash Reports based on Call Stack Similarity, in Proc. 34th IEEE/ACM International Conference on Software Engineering (ICSE'12), Zurich, Switzerland, June 2012. (full industry track paper, 18% acceptance rate).
  • Jue Wang and Hongyu Zhang, Predicting Defect Numbers based on Defect State Transition Models, Proc. 6th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2012), Lund, Sweden, Sep 2012.
  • Liang Gong, David Lo, Lingxiao Jiang and Hongyu Zhang: Diversity Maximization Speedup for Fault Localization. Automated Software Engineering (ASE 2012), Essen, Germany, Sep 2012. (full paper, 15% acceptance rate)
  • Liang Gong, David Lo, Lingxiao Jiang and Hongyu Zhang: Interactive Fault Localization Leveraging Simple User Feedback. International Conference on Software Maintenance (ICSM 2012), Riva del Garda, Trento, Italy, Sep 2012. (full paper, 25% acceptance rate)
  • Rongxin Wu, Hongyu Zhang, Sunghun Kim, and S. C. Cheung, ReLink: Recovering Links between Bugs and Changes, in Proc. The joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE'11), Szeged, Hungary, Sep 5-9, 2011. (full paper, 17% acceptance rate).
  • Sunghun Kim, Hongyu Zhang, Rongxin Wu and Liang Gong, Dealing with Noise in Defect Prediction, in Proc. of 33rd IEEE/ACM International Conference on Software Engineering (ICSE'11), Honolulu, Hawaii, USA, May 21-28 2011. (full paper, 14% acceptance rate).
  • Hongyu Zhang, Hee Beng Kuan Tan, Lu Zhang, Xi Lin, Xiaoyin Wang, Chun Zhang and Hong Mei, Checking Enforcement of Integrity Constraints in Database Applications Based on Code Patterns, Journal of Systems and Software, 2011.
  • Hongyu Zhang and Sunghun Kim, Monitoring the Evolution of Software Quality with Respect to Defect, IEEE Software, July/August, 2010.
  • Hongyu Zhang and Rongxin Wu, Sampling Program Quality, Proc. 26th IEEE International Conference on Software Maintenance (ICSM 2010), Timisoara, Romania, September 2010.
  • Hongyu Zhang, Adam Nelson, Tim Menzies, On the Value of Learning From Defect Dense Components for Software Defect Prediction, Proc. International Conference on Predictor Models in Software Engineering (PROMISE10), Timisoara, Romania, Sep 12-13, 2010.
  • Hongyu Zhang, An Investigation of the Relationships between Lines of Code and Defects, Proc. 25th IEEE International Conference on Software Maintenance (ICSM 2009), Edmonton, Canada, September 2009.
  • Hongyu Zhang, “On the Distribution of Software Faults”, IEEE Transactions on Software Engineering, vol. 34(2), March/April 2008. IEEE Press.
  • Hongyu Zhang, Yuan-Fang Li and Hee Beng Kuan Tan, Measuring Design Complexity of Semantic Web Ontologies, Journal of Systems and Software, 83(5), 2009.
  • Lin Liu, Hongyu Zhang, Fei Peng, Wenting Ma, et al, Understanding Chinese Characteristics of Requirements Engineering, Proc. of 17th International Requirements Engineering Conference (RE’09), August 2009, Atlanta, USA.
  • Hongyu Zhang, Discovering Power Laws in Computer Programs, Information Processing & Management, 45(4): 477-483, Elsevier, 2009.
  • Stan Jarzabek, Hongyu Zhang, Youpeng Lee, Yinxing Xue and Naveed Shaikh, Increasing Usability of Preprocessing for Feature Management in Product Lines with Queries, 31st International Conference on Software Engineering (ICSE 2009), Vancouver, Canada, May 2009. (New Ideas and Emerging Results track, pp. 215-218)
  • Hongyu Zhang, Exploring Regularity in Source Code: Software Science and Zipf's Law, Proc. 15th Working Conference on Reverse Engineering (WCRE 2008), Antwerp, Belgium, October 2008, IEEE Press.
  • Tan, H B K, Yuan Zhao, and Hongyu Zhang, Conceptual Data Model Based Software Size Estimation for Information System, ACM Transactions on Software Engineering and Methodology (TOSEM), 2008.
  • Hongyu Zhang and Xiuzhen Zhang, Comments on “Data Mining Static Code Attributes to Learn Defect Predictors”, IEEE Transactions on Software Engineering, IEEE Press, vol. 33(9), Sep 2007.
  • Hai Wang, Yuan Fang Li, Jing Sun, Hongyu Zhang and Jeff Pan, Verifying Feature Models using OWL, Journal of Web Semantics, Vol 5(2), June 2007, Elsevier, pp. 117-129.
  • Hongyu Zhang and Hee Beng Kuan Tan, An Empirical Study of Class Sizes for Large Java Systems, Proc. of 14th Asia-Pacific Software Engineering Conference (APSEC 2007), Nagoya, Japan, December 2007. IEEE Press, pp. 230-237.
  • Tan, H B K, Yuan Zhao, and Hongyu Zhang, Estimating LOC for information systems from their conceptual models, Proc. of International Conference on Software Engineering (ICSE 2006), May 2006, Shanghai, China, pp. 321-333. (full research paper track)
  • Hongyu Zhang and Stan Jarzabek, A Bayesian Network Approach to Rational Architectural Design, International Journal of Software Engineering and Knowledge Engineering, vol. 15 (4), World Scientific, August 2005, pp. 695-717.
  • Hongyu Zhang and Stan Jarzabek, XVCL: A Mechanism for Handling Variants in Software Product Lines, Science of Computer Programming, vol. 53 (3), Elsevier, Dec 2004, pp. 381-407.
  • Stan Jarzabek, Wai Chun Ong and Hongyu Zhang, Handling Variant Requirements in Domain Modeling, Journal of Systems and Software, vol. 68 (3), Elsevier, Dec 2003, pp. 171-182.
  • Jing Sun, Hongyu Zhang, Yuan Fang Li and Hai Wang. Formal Semantics and Verification for Feature Modeling. Proc. of 10th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS'05), IEEE Press, Shanghai, China, June 2005. pp. 303-312.
  • Hongyu Zhang, Stan Jarzabek and Bo Yang, Quality Prediction and Assessment for Product Lines, Proc. of the 15th International Conference On Advanced Information Systems Engineering (CAiSE'03), Klagenfurt/Velden, Austria, June 2003. Lecture Notes in Computer Science (LNCS) 2681, Springer-Verlag, pp. 681-695.
  • Stan Jarzabek, Paul Bassett, Hongyu Zhang, and Weishan Zhang, XVCL: XML-based Variant Configuration Language. Proc. of 25th International Conference on Software Engineering (ICSE 2003), pp. 810-811. 
  • Soe Myat Swe, Hongyu Zhang and Stan Jarzabek, XVCL: A Tutorial, Proc. of 14th International Conference on Software Engineering and Knowledge Engineering (SEKE’02), Ischia, Italy, 2002. ACM Press, pp. 341-349.
  • Stan Jarzabek and Hongyu Zhang, XML-based Method and Tool for Handling Variant Requirements in Domain Models, Proc. Fifth IEEE International Symposium on Requirements Engineering (RE’01), Toronto, Canada, August 2001. IEEE Press, pp. 166-173.
Research Program Committee:
  • The 36th International Conference on Software Engineering (ICSE 2014), Hyderabad, India, June 2014. (Tutorials and Technical Briefings)
  • The 18th European Conference on Software Maintenance and Reengineering and The 21st Working Conference on Reverse Engineering (CSMR-18/WCRE-21), Antwerp, Belgium, February 2014.
  • The 29th/30th IEEE International Conference on Software Maintenance (ICSM 2013, ICSME 2014).
  • The 10th/11th Working Conference on Mining Software Repositories (MSR 2013, MSR 2014), May 18-19, 2013, San Francisco, CA, USA
  • The 35th International Conference on Software Engineering (ICSE 2013), San Francisco, CA, May 2013. (Formal Demonstration Track).
  • The 34th International Conference on Software Engineering (ICSE 2012), Zurich, Switzerland, June 2012. (Formal Demonstration Track).
  • The 15th/16th/17th European Conference on Software Maintenance and Reengineering (CSMR 2011, CSMR 2012, CSMR 2013).
  • The 5th/6th/7th/8th International Conference on Predictive Models in Software Engineering (PROMISE 2010, PROMISE 2011, PROMISE 2012, PROMISE 2013, PROMISE 2014)
  • The 24th/25th/26th International Conference on Software Engineering and Knowledge Engineering (SEKE 2012, SEKE 2013, SEKE 2014)
  • The joint 10th International Workshop on Principles of Software Evolution and the 5th ERCIM Workshop on Software Evolution (IWPSE/EVOL'09), 24-25 August (co-located with ESEC/FSE 2009), Amsterdam.
  • The 10th International Conference on Agile Processes and eXtreme Programming in Software Engineering (XP 2009), May 26-30, 2009, Sardinia, Italy
  • The 3rd/6th/7th/8th IEEE International Symposium on Theoretical Aspects of Software Engineering (TASE 2009, TASE 2012, TASE 2013, TASE 2014)
  • The 15th/16th/17th/18th/19th/20th/21st Asia-Pacific Software Engineering Conference (APSEC 2014, APSEC 2013, APSEC 2012, APSEC 2011, APSEC 2010, APSEC 2009, APSEC 2008)
  • The 17th/18th/19th/20th/22nd Australian Software Engineering Conference (ASWEC 2013, ASWEC 2010, ASWEC 2009, ASWEC 2008, ASWEC 2007, ASWEC 2006)
  • The 2nd SEMAT Workshop on a General Theory of Software Engineering (GTSE 2013, an ICSE 2013 workshop)
  • The 13th/14th International Conference on Quality Software (QSIC 2013, QSIC 2014)
  • The 9th International Conference on Global Software Engineering (ICGSE 2014)
Program organizations:
  • The Second International Workshop on Software Mining (SoftMine-2013, co-located with ASE'13), Silicon Valley, CA, November 2013. (co-organizers)
  • The 8th International Workshop on Advanced Modularization Techniques (AOAsia/Pacific 2013), a workshop at AOSD 2013, March 2013.
  • The First International Workshop on Software Mining (SoftMine-2012, co-located with KDD'12), Beijing, China, May 2012. (co-organizers)
  • The 12th International Conference on Quality Software (QSIC 2012), August 2012, Xi'an, China. (industry track co-chairs)
  • The 26th European Conference on Object-Oriented Programming (ECOOP 2012), June 2012, Beijing, China. (local organisation co-chairs)
  • ICSE 2014 Workshop on Emerging Trends in Software Metrics (WETSoM @ ICSE 2014), India, June 2014. (co-organizers)
  • ICSE 2012 Workshop on Emerging Trends in Software Metrics (WETSoM @ ICSE 2012), Zurich, Switzerland, June 2012. (co-organizers)
  • ICSE 2011 Workshop on Emerging Trends in Software Metrics (WETSoM @ ICSE 2011), May, 2011, Honolulu, Hawaii, USA. (co-organizers)
  • ICSE 2010 Workshop on Emerging Trends in Software Metrics (WETSoM @ ICSE 2010), May 4, 2010, Cape Town, South Africa. (co-organizers)
  • The 1st International Symposium on Emerging Trends in Software Metrics (ETSM 2009), 26 May, 2009, Pula, Sardinia, Italy. (co-organizers)
  • 15th Asia-Pacific Software Engineering Conference (APSEC 2008), Beijing, China, Dec 2008 (publicity chair). 
  • The 2nd IEEE Asia-Pacific Workshop on Software Architectures and Component Technologies (SACT 2007), in conjunction with COMPSAC 2007, Beijing, China.

I am also a frequent reviewer for international journals, including: IEEE Transactions on Software Engineering, IEEE Software, IEEE Transactions on Knowledge and Data Engineering, Data and Knowledge Engineering, Journal of Systems and Software, Empirical Software Engineering, International Journal of Software Engineering and Knowledge Engineering, Science of Computer Programming, Software Quality Journal, Software Practice & Experience, and Journal of Software Maintenance and Evolution.

I am a member of IEEE and ACM.


(Last updated: April 2014)

Psalm 67:1-3: May God be gracious to us and bless us, and make his face shine on us, so that your ways may be known on earth, your salvation among all nations.

