AI Quality Assurance

This website provides supplementary materials for the AI Quality Assurance tutorial.

Contact

Zhijie Wang: zhijie.wang@ualberta.ca

Yuheng Huang: yuheng18@ualberta.ca

Lei Ma: ma.lei@acm.org

Houssem Ben Braiek: houssem.ben-braiek@polymtl.ca

Foutse Khomh: foutse.khomh@polymtl.ca

Abstract

Data-driven AI (e.g., deep learning) has become a driving force and has been applied in many applications across diverse domains. The human-competitive performance makes them stand as core components in complicated software systems for tasks, e.g., computer vision (CV) and natural language processing (NLP). Corresponding to the increasing popularity of deploying more powerful and complicated DL models, there is also a pressing need to ensure the quality and reliability of these AI systems. However, the data-driven paradigm and black-box nature make such AI software fundamentally different from classical software. To this end, new software quality assurance techniques for AI-driven systems are thus challenging and needed. In this tutorial, we introduce the recent progress in AI Quality Assurance, especially for testing techniques for DNNs and provide hands-on experience. We will first give the details and discuss the difference between testing for traditional software and AI software. Then, we will provide hands-on tutorials on testing techniques for feed-forward neural networks (FNNs) with a CV use case and recurrent neural networks (RNNs) with an NLP use case. Finally, we will discuss with the audience the success and failures in achieving the full potential of testing AI software as well as possible improvements and research directions.

Link to tutorial notebooks

Part 1: Difference between traditional software and AI model

https://colab.research.google.com/drive/11DaChi00L7dJIyZfbFiTQMgnACfosz9-

Part 2: Feedforward Neural Network Testing:

https://colab.research.google.com/drive/1J9_dB4wlqTPiqKCWGiwNaHu_1MOf2bTX

Part 3: Stateful Neural Network Analysis and Testing:

https://colab.research.google.com/drive/1-QI2qQKl1leSEXsRjCBlrDhY3Wj6YpI6

Link to related papers

DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems

DeepHunter: A Coverage-Guided Fuzz Testing Framework for Deep Neural Networks

Page updated

Report abuse