Quality assurance of mobile application (app) GUIs is increasingly important because the GUI is the direct intermediary between apps and users. Automated GUI testing approaches built on various strategies have advanced considerably, yet a large gap remains between these approaches and app business logic: they do not take the completion of specific testing scenarios as the exploration target, so critical app functionalities are often missed. Manual testing, by contrast, takes testing scenarios grounded in app business logic as its basic unit. Drawing on this process, in this paper we use large language models (LLMs) to understand the semantics presented in the app GUI and how they map to the testing context of specific testing scenarios. Scenario-based GUI tests are then generated under the guidance of multi-agent LLM collaboration. Specifically, we propose ScenGen, a novel LLM-guided scenario-based GUI testing approach involving five LLM agents, the Observer, the Decider, the Executor, the Supervisor, and the Recorder, each responsible for a different phase of the manual testing process. The Observer perceives the app GUI state by extracting GUI widgets and forming GUI layouts, thereby understanding the semantics the GUI expresses. This GUI information is sent to the Decider, which selects target widgets according to the target testing scenario; its decision-making takes the completion of that scenario as the exploration target. The Executor then performs the required operations on the app. The Supervisor checks whether the execution results are consistent with the completion target of the testing scenario, ensuring the traceability of test generation and execution. Furthermore, the Recorder records the corresponding GUI test operations in the context memory as an important basis for subsequent decision-making, while also monitoring for runtime bugs. We evaluate the effectiveness of ScenGen from several aspects, and the results show that ScenGen can effectively generate scenario-based GUI tests guided by LLMs.
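The five-agent workflow described above can be illustrated with a minimal sketch. This is not ScenGen's implementation: every name here (`Widget`, `Action`, `Memory`, `FakeApp`, `run_scenario`, and the agent functions) is hypothetical, the app is a toy in-memory stand-in, and the Decider and Supervisor use simple rules where the approach described in this paper would consult an LLM. The sketch only shows how the observe, decide, execute, supervise, and record phases could be chained in a loop that ends once the scenario is complete.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Widget:
    """A GUI widget as perceived by the Observer (hypothetical structure)."""
    widget_id: str
    text: str


@dataclass
class Action:
    """One GUI test operation chosen by the Decider (hypothetical structure)."""
    kind: str          # "tap" or "input"
    target: str        # widget_id of the target widget
    value: str = ""


@dataclass
class Memory:
    """Context memory maintained by the Recorder."""
    steps: List[Action] = field(default_factory=list)
    bugs: List[str] = field(default_factory=list)


class FakeApp:
    """Toy stand-in for a real app: a login screen followed by a home screen."""

    def __init__(self) -> None:
        self.screen = "login"
        self.username = ""

    def widgets(self) -> List[Widget]:
        if self.screen == "login":
            return [Widget("user_field", self.username or "Username"),
                    Widget("login_btn", "Log in")]
        return [Widget("welcome", "Welcome")]

    def perform(self, action: Action) -> None:
        if action.kind == "input" and action.target == "user_field":
            self.username = action.value
        elif action.kind == "tap" and action.target == "login_btn" and self.username:
            self.screen = "home"


def observe(app: FakeApp) -> List[Widget]:
    """Observer: perceive the current GUI state."""
    return app.widgets()


def decide(scenario: str, widgets: List[Widget], memory: Memory) -> Optional[Action]:
    """Decider: rule-based placeholder for an LLM decision.

    Returns None when the scenario is judged complete. The `scenario`
    argument is unused by these toy rules; an LLM prompt would include it.
    """
    if any(w.widget_id == "welcome" for w in widgets):
        return None
    done = {(s.kind, s.target) for s in memory.steps}
    if ("input", "user_field") not in done:
        return Action("input", "user_field", "alice")
    return Action("tap", "login_btn")


def supervise(before: List[Widget], after: List[Widget]) -> bool:
    """Supervisor: placeholder check that the operation changed the GUI state."""
    return before != after


def record(memory: Memory, action: Action, ok: bool, error: Optional[str]) -> None:
    """Recorder: store verified operations and any runtime errors."""
    if error is not None:
        memory.bugs.append(error)
    elif ok:
        memory.steps.append(action)


def run_scenario(app: FakeApp, scenario: str, max_steps: int = 10):
    """Chain the five phases until the scenario completes or steps run out."""
    memory = Memory()
    for _ in range(max_steps):
        before = observe(app)
        action = decide(scenario, before, memory)
        if action is None:
            return True, memory
        error = None
        try:
            app.perform(action)          # Executor: carry out the operation
        except Exception as exc:         # runtime failure treated as a bug
            error = repr(exc)
        after = observe(app)
        ok = error is None and supervise(before, after)
        record(memory, action, ok, error)
    return False, memory


if __name__ == "__main__":
    completed, memory = run_scenario(FakeApp(), "log in")
    print(completed, [(s.kind, s.target) for s in memory.steps])
```

In this sketch, only operations that pass the Supervisor's check enter the context memory, so the Decider retries an operation that had no visible effect instead of assuming it succeeded.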
We propose a novel scenario-based GUI testing framework that starts from GUI understanding and is built on multi-agent collaboration. The framework understands the testing scenarios and generates scenario-based GUI tests accordingly.
We propose a set of GUI understanding and widget localization methods that combine computer vision techniques with visual LLMs to comprehensively understand the app GUI and assist GUI test generation.
We construct an app dataset containing 10 common testing scenarios, and the experimental results show that the proposed approach performs well.
RQ1: How effective is the logical decision-making?
RQ2: How effective is the widget localization?
RQ3: How effective is the scenario-based test generation?
RQ4: How efficient is the scenario-based test generation?
RQ5: How effective is the bug detection of ScenGen?