Automatic Web Testing using Curiosity-Driven Reinforcement Learning


Abstract

Web testing has long been recognized as a notoriously difficult task. Even today, web testing still relies mainly on manual effort in many cases, while automated web testing remains far from human-level performance. Key challenges include dynamic content updates and deep bugs hidden behind complicated user interactions and specific input values, which can only be triggered by certain action sequences in the huge space of all possible sequences. In this paper, we propose WebExplor, an automatic end-to-end web testing framework, to achieve adaptive exploration of web applications. WebExplor adopts curiosity-driven reinforcement learning to generate high-quality action sequences (test cases) with temporal logical relations. In addition, WebExplor incrementally builds an automaton during the online testing process, which acts as high-level guidance to further improve testing efficiency. We have conducted comprehensive evaluations on six real-world projects and a commercial SaaS web application, and performed an in-the-wild study of the top 50 web applications in the world. The results demonstrate that in most cases WebExplor achieves a significantly higher failure detection rate, code coverage, and efficiency than existing state-of-the-art web testing techniques. WebExplor also detected 12 previously unknown failures in the commercial web application, which have been confirmed and fixed by the developers. Furthermore, our in-the-wild study uncovered 3,466 exceptions and errors.

Overview of WebExplor

Figure 1: The workflow of WebExplor

Pre-processing

This component maps an HTML page to an abstract state. Its main purpose is to avoid the state explosion caused by dynamic updates in a web page, so that a good policy can be learned effectively.
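
As a rough illustration of such a mapping (a minimal sketch, not the WebExplor implementation), a page can be reduced to its sequence of tag names, and two pages can be treated as the same abstract state when their tag sequences are sufficiently similar. The class names, the use of Python's difflib for similarity, and the 0.9 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names, ignoring text content and attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_sequence(html):
    """Reduce an HTML document to a tuple of its opening tag names."""
    parser = TagCollector()
    parser.feed(html)
    return tuple(parser.tags)


class StateAbstraction:
    """Assigns pages with similar tag structure to the same state id."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # hypothetical similarity cut-off
        self.states = []            # representative tag sequences; index = state id

    def state_of(self, html):
        seq = tag_sequence(html)
        for state_id, known in enumerate(self.states):
            ratio = SequenceMatcher(None, known, seq, autojunk=False).ratio()
            if ratio >= self.threshold:
                return state_id     # close enough to an existing state
        self.states.append(seq)     # otherwise register a new abstract state
        return len(self.states) - 1
```

With this sketch, two pages that differ only in their text or attribute values (for example, a changed timestamp or link target) fall into one state, while a page with a different tag structure gets a new state id.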

Curiosity-driven RL

This component uses a curiosity-driven reward function, which provides low-level guidance for RL exploration so that the learned policy explores more behaviors of the web application.
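
One common way to express curiosity is a count-based bonus: a transition that has rarely been observed earns a larger reward than one seen many times. The sketch below illustrates that idea with a tabular Q-learning update. It is an illustration under stated assumptions, not the WebExplor implementation: the inverse-square-root form, the names CuriosityReward and q_update, and the alpha and gamma values are all chosen for this example.

```python
import math
from collections import defaultdict


class CuriosityReward:
    """Count-based curiosity: reward shrinks as a transition is repeated."""

    def __init__(self):
        self.visits = defaultdict(int)  # (state, action, next_state) -> count

    def __call__(self, state, action, next_state):
        key = (state, action, next_state)
        self.visits[key] += 1
        # Assumed form: 1 / sqrt(number of times this transition was seen).
        return 1.0 / math.sqrt(self.visits[key])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; q is a defaultdict(float)."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
```

Because the reward decays with repetition, an agent trained this way is nudged away from transitions it has already exercised and toward parts of the application it has not yet seen.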

Deterministic Finite Automaton (DFA)

This component uses an exploration strategy guided by a deterministic finite automaton (DFA), which provides high-level guidance for reinforcement learning to explore the web application efficiently.
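
To make the idea of automaton-based guidance concrete, the following sketch records observed transitions between abstract states and, when exploration stalls, picks a rarely visited state and computes an action sequence that leads to it. This is a simplified illustration and not the WebExplor algorithm: the class name ExplorationAutomaton, the least-visited selection rule, and the breadth-first path search are assumptions, and a later observation simply overwrites an earlier one for the same (state, action) pair.

```python
from collections import defaultdict, deque


class ExplorationAutomaton:
    """Incrementally built automaton over abstract states."""

    def __init__(self, initial_state):
        self.transitions = {}           # (state, action) -> next state
        self.visits = defaultdict(int)  # state -> number of visits
        self.visits[initial_state] = 1

    def record(self, state, action, next_state):
        """Store one observed transition and count the visit to next_state."""
        self.transitions[(state, action)] = next_state
        self.visits[next_state] += 1

    def least_visited(self):
        """Hypothetical guidance rule: target the state visited least often."""
        return min(self.visits, key=self.visits.get)

    def path_to(self, start, target):
        """Breadth-first search for a shortest action sequence, or None."""
        queue = deque([(start, [])])
        seen = {start}
        while queue:
            state, path = queue.popleft()
            if state == target:
                return path
            for (src, action), dst in self.transitions.items():
                if src == state and dst not in seen:
                    seen.add(dst)
                    queue.append((dst, path + [action]))
        return None
```

In a testing loop, the returned action sequence could be replayed to move the browser to the chosen state before handing control back to the learned policy.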

Web Applications for Evaluation

  • Research Benchmark: Six popular GitHub projects (each with more than 50 stars) taken from prior work. These projects use six of the most popular JavaScript frameworks: dimeshift (Backbone.js), pagekit (Vue.js), Splittypie (Ember.js), phoenix-trello (Phoenix/React), Retroboard (React), and PetClinic (AngularJS).

  • Real-World Websites: We select the top 50 web applications in the world according to the Alexa rank list. To investigate scalability, we apply WebExplor directly for end-to-end testing of these applications, without fine-tuning.

  • Industrial Website: A complex Software as a Service (SaaS) system is used for further case studies. We omit the system name to preserve anonymity during review.

(six real websites)

(top 50 websites around the world)