In this section, we discuss the challenges in our experiments and opportunities for future work.
Based on the findings of our survey, we recognize that a versatile environment, e.g., a benchmark, is crucial for the development of AI-enabled robotics systems. Practitioners seek an intuitive pipeline and plug-and-play solutions to streamline their efforts and avoid getting entangled in inconsequential steps. In response to this demand, we construct our benchmark with an emphasis on ease of use, i.e., developers can deploy their methods and techniques on our benchmark without extensive low-level adjustments. Moreover, our evaluation provides insights into the challenges and potential opportunities involved in developing AI software controllers with Isaac Sim. Notably, existing DRL algorithms may neglect the effect of running many learning environments in parallel, as Isaac Sim does. Further research is needed to design and develop AI methods that effectively account for and exploit this characteristic.
Although AI controllers demonstrate good performance in manipulation tasks, our falsification tests show that their reliability cannot be guaranteed. In the field of robotics, various traditional controllers, e.g., computed torque control and model predictive control, have been proposed to solve manipulation tasks. Unlike AI controllers, they often provide guaranteed performance and safety assurance. However, these controllers are typically task-specific and rely on precise object information for motion planning, which limits their generalizability and applicability in changing environments. Therefore, a recent trend is to integrate AI approaches with traditional control-theoretical concepts, resulting in controllers that are flexible and adaptable while offering high levels of performance, safety, and reliability. This idea can also be applied to AI-CPSs, opening up new possibilities for developing trustworthy systems.
We also notice that the performance of different optimization techniques can be influenced by the properties of the objective function, which is highly dependent on the task being tested. Therefore, taking into account the characteristics of the task is important for performing effective testing, and different methods may be necessary for different tasks.
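The dependence on the objective function can be illustrated with a minimal sketch. The two robustness landscapes and the two search routines below are hypothetical stand-ins chosen for illustration, not the tasks or optimizers used in our experiments: `smooth_narrow` has a slope to follow but a very small falsifying region, while `plateau` is flat except for a wider falsifying hole. Both searches use a comparable evaluation budget.

```python
def smooth_narrow(x):
    """Robustness with a clear slope but a very narrow falsifying region near 0.7."""
    return abs(x - 0.7) - 0.001


def plateau(x):
    """Robustness that is flat (no slope to follow) except for a falsifying hole."""
    return -1.0 if 0.85 <= x <= 0.95 else 1.0


def grid_search(f, n=20):
    """Global sampling: evaluate n evenly spaced points in [0, 1], return the lowest value."""
    return min(f(i / (n - 1)) for i in range(n))


def pattern_search(f, x=0.1, step=0.1, budget=20):
    """Local search: move to a better neighbor, halve the step when neither improves."""
    best = f(x)
    evals = 1
    while evals + 2 <= budget:
        candidates = [min(1.0, x + step), max(0.0, x - step)]
        values = [f(c) for c in candidates]
        evals += 2
        i = 0 if values[0] <= values[1] else 1
        if values[i] < best:
            x, best = candidates[i], values[i]
        else:
            step /= 2
    return best


if __name__ == "__main__":
    for name, f in [("smooth_narrow", smooth_narrow), ("plateau", plateau)]:
        print(name, "grid:", grid_search(f), "pattern:", pattern_search(f))
```

In this toy setting, the local search follows the slope of `smooth_narrow` to a negative robustness value while the grid steps over the narrow region. On `plateau` the roles reverse: the local search has no slope to follow and stalls, whereas the grid lands inside the hole.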
While Isaac Sim is known for its fast training speed, its ability to accurately reflect real-world behaviors requires further investigation. For applying AI controllers in real-world industrial applications, an analysis of the simulation-to-reality gap is critical. Therefore, one of our future research directions is to investigate whether Isaac Sim can help overcome this gap. We have built an experimental platform that includes two Franka Emika robotic manipulators for this purpose (see the following figure).
Another possible direction is to incorporate more state-of-the-art optimization methods, such as those available in Breach or S-Taliro libraries, into our falsification framework. This would increase the versatility of our framework, making it useful not only for robotics manipulation tasks but also for other types of AI-CPSs.
Benchmark of CPS and AI-CPS
CPSs are highly integrated systems that combine diverse disciplines, e.g., mechanical, electrical, and particularly software engineering, to tackle challenges in real-world applications. Ernst et al. [1] provided a benchmark of traditional CPSs and compared the performance of various testing tools on these systems. However, this benchmark focuses only on traditional CPSs rather than AI-CPSs. Song et al. [2] proposed a first benchmark for AI-CPSs with nine tasks from different domains. However, this benchmark is built on MATLAB and relies on accurate mathematical models to describe the system behaviors. Although such model-based simulation is suitable for design and prototyping, we believe that a physical simulator can better reflect the complexity of industrial-level operations. In addition, other existing literature [3,4,5] includes AI-CPSs, but most of them are simplified systems or game scenarios, such as Cart-Pole and Inverted Pendulum, that are not appropriate for industrial-level applications.
Benchmark of robotics
In the field of robotics, multiple benchmarks have been proposed for general robotics applications, e.g., [6,7,8]. However, these benchmarks often have limitations: they either focus on providing different robot, object, or sensor models [9,10] or are designed only for specific topics of robotics control, such as manipulating deformable objects [11] or real-time robotics [12]. Moreover, these benchmarks often ignore important components of the software development lifecycle, e.g., testing. Considering these limitations, a unified benchmark that covers a broad range of tasks and effectively supports the software development lifecycle is desirable. By proposing a benchmark based on Isaac Sim, we aim to address this need and take a first step towards building a development platform for AI-enabled robotics applications.
AI-CPS testing
Testing is a non-trivial topic in CPSs, as it provides the quality assurance needed to deploy reliable and robust CPSs in safety-critical applications. As a result, a growing body of research is devoted to this direction [13,14,15,16]. Zolfagharian et al. [16] proposed a search-based testing approach that leverages a genetic algorithm to generate test cases for DRL agents. Zhang et al. [15] leveraged the temporal behaviors of DNN controllers and introduced a falsification framework for AI-CPSs. In [17], a Python-based falsification toolbox for CPSs is presented. However, like other MATLAB falsification tools, it still requires an accurate system model defined as, e.g., ordinary differential equations. Moreover, the effectiveness of existing testing approaches on modern physical simulators remains unclear. Our work develops an extensible Python-based falsification framework with various optimization methods for physical simulators as well as OpenAI Gym environments. We believe that our framework can greatly enhance the flexibility of testing for AI-CPS practitioners and motivate further research in this direction.
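To make the idea of falsification concrete, the following is a minimal sketch under stated assumptions, not the implementation of our framework or of any cited tool. The toy closed-loop system, the requirement "the position never exceeds 2.0", and all function names (`simulate`, `robustness`, `falsify`) are hypothetical. The search samples a constant disturbance at random and stops once the robustness of the requirement becomes negative, i.e., once a counterexample is found.

```python
import random


def simulate(disturbance, steps=200, dt=0.05):
    """Toy closed loop: a P controller drives position x toward 1.0 under a constant disturbance."""
    kp, damping = 4.0, 1.0
    x, v = 0.0, 0.0
    trace = []
    for _ in range(steps):
        u = kp * (1.0 - x) + disturbance  # control input plus external disturbance
        v += dt * (u - damping * v)       # semi-implicit Euler integration
        x += dt * v
        trace.append(x)
    return trace


def robustness(trace, limit=2.0):
    """Margin of the requirement 'always x <= limit'; a negative value means a violation."""
    return min(limit - x for x in trace)


def falsify(budget=200, seed=0, low=0.0, high=2.0):
    """Random search for a disturbance that drives the robustness below zero."""
    rng = random.Random(seed)
    best_input, best_rob = None, float("inf")
    for _ in range(budget):
        d = rng.uniform(low, high)
        rob = robustness(simulate(d))
        if rob < best_rob:
            best_input, best_rob = d, rob
        if best_rob < 0:  # counterexample found, stop early
            break
    return best_input, best_rob


if __name__ == "__main__":
    d, rob = falsify()
    print(f"disturbance={d:.3f}, robustness={rob:.3f}")
```

In this sketch the random sampler could be swapped for any other optimizer that minimizes `robustness`, and `simulate` could be replaced by a rollout in a physical simulator or a Gym-style environment, which is the kind of flexibility a simulator-agnostic framework aims for.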
AI-enabled Control Systems in CPS
Researchers have proposed many algorithms and approaches to study the performance of AI-based control methods under diverse learning paradigms, such as supervised learning, active learning, and transfer learning. However, limited literature considers applying these methods to CPSs with modern robotic manipulation.
Robotic manipulation tasks contain conflicting operation goals, where an equilibrium solution is needed to satisfy each requirement as much as possible [18]. Many traditional control methods and AI approaches have been proposed to address such multi-requirement applications in various domains. Ji et al. [19] studied a new multi-objective control strategy for inverter-interfaced distributed generation, which utilizes scenario classification and reference determination to distribute the control authority among different controllers. Chen et al. [20] proposed a deep Q-network (DQN)-based agent to achieve multi-objective control for energy management in hybrid electric vehicles. The DQN agent is in charge of controlling the motor speed, the CVT gear ratio, and the engine power to maintain the optimal slip ratio, the engine speed, and the fuel consumption, respectively. The authors reported that a single DRL-based strategy suffers from problems such as high computational cost and complexity.
[1] Gidon Ernst, Paolo Arcaini, Ismail Bennani, Alexandre Donze, Georgios Fainekos, Goran Frehse, Logan Mathesen, Claudio Menghi, Giulia Pedrielli, Marc Pouzet, et al. 2020. Arch-comp 2020 category report: Falsification. EPiC Series in Computing (2020).
[2] Jiayang Song, Deyun Lyu, Zhenya Zhang, Zhijie Wang, Tianyi Zhang, and Lei Ma. 2022. When cyber-physical systems meet AI: a benchmark, an evaluation, and a way forward. In Proceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice. 343–352.
[3] David Braganza, Darren M Dawson, Ian D Walker, and Nitendra Nath. 2007. A neural network controller for continuum robots. IEEE Transactions on Robotics 23, 6 (2007), 1270–1277.
[4] Yan Duan, Xi Chen, Rein Houthooft, John Schulman, and Pieter Abbeel. 2016. Benchmarking deep reinforcement learning for continuous control. In International Conference on Machine Learning. PMLR, 1329–1338.
[5] Taylor T Johnson, Diego Manzanas Lopez, Patrick Musau, Hoang-Dung Tran, Elena Botoeva, Francesco Leofante, Amir Maleki, Chelsea Sidrane, Jiameng Fan, and Chao Huang. 2020. ARCH-COMP20 category report: artificial intelligence and neural network control systems (AINNCS) for continuous and hybrid systems plants. EPiC Series in Computing 74 (2020).
[6] Michael Ahn, Henry Zhu, Kristian Hartikainen, Hugo Ponte, Abhishek Gupta, Sergey Levine, and Vikash Kumar. 2020. Robel: Robotics benchmarks for learning with low-cost robots. In Conference on Robot Learning. PMLR, 1300–1313.
[7] Berk Calli, Aaron Walsman, Arjun Singh, Siddhartha Srinivasa, Pieter Abbeel, and Aaron M Dollar. 2015. Benchmarking in manipulation research: Using the Yale-CMU-Berkeley object and model set. IEEE Robotics & Automation Magazine 22, 3 (2015), 36–52.
[8] Linxi Fan, Yuke Zhu, Jiren Zhu, Zihua Liu, Orien Zeng, Anchit Gupta, Joan Creus-Costa, Silvio Savarese, and Li Fei-Fei. 2018. Surreal: Open-source reinforcement learning framework and robot manipulation benchmark. In Conference on Robot Learning. PMLR, 767–782.
[9] Mayank Mittal, Calvin Yu, Qinxi Yu, Jingzhou Liu, Nikita Rudin, David Hoeller, Jia Lin Yuan, Ritvik Singh, Yunrong Guo, Hammad Mazhar, et al. 2023. Orbit: A unified simulation framework for interactive robot learning environments. IEEE Robotics and Automation Letters (2023).
[10] Yuke Zhu, Josiah Wong, Ajay Mandlekar, Roberto Martín-Martín, Abhishek Joshi, Soroush Nasiriany, and Yifeng Zhu. 2020. robosuite: A modular simulation framework and benchmark for robot learning. arXiv preprint arXiv:2009.12293 (2020).
[11] Konstantinos Chatzilygeroudis, Bernardo Fichera, Ilaria Lauzana, Fanjun Bu, Kunpeng Yao, Farshad Khadivar, and Aude Billard. 2020. Benchmark for bimanual robotic manipulation of semi-deformable objects. IEEE Robotics and Automation Letters 5, 2 (2020), 2443–2450.
[12] Mohammad Bakhshalipour, Maxim Likhachev, and Phillip B Gibbons. 2022. Rtrbench: A benchmark suite for real-time robotics. In 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 175–186.
[13] Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. 2017. Safety verification of deep neural networks. In Computer Aided Verification: 29th International Conference, CAV 2017, Heidelberg, Germany, July 24-28, 2017, Proceedings, Part I 30. Springer, 3–29.
[14] Weiming Xiang, Hoang-Dung Tran, and Taylor T Johnson. 2018. Output reachable set estimation and verification for multilayer neural networks. IEEE Transactions on Neural Networks and Learning Systems 29, 11 (2018), 5777–5783.
[15] Zhenya Zhang, Deyun Lyu, Paolo Arcaini, Lei Ma, Ichiro Hasuo, and Jianjun Zhao. 2022. FalsifAI: Falsification of AI-Enabled Hybrid Control Systems Guided by Time-Aware Coverage Criteria. IEEE Transactions on Software Engineering 01 (2022), 1–17.
[16] Amirhossein Zolfagharian, Manel Abdellatif, Lionel C Briand, Mojtaba Bagherzadeh, and S Ramesh. 2023. A Search-Based Testing Approach for Deep Reinforcement Learning Agents. IEEE Transactions on Software Engineering (2023).
[17] Quinn Thibeault, Jacob Anderson, Aniruddh Chandratre, Giulia Pedrielli, and Georgios Fainekos. 2021. Psy-taliro: A python toolbox for search-based test generation for cyber-physical systems. In Formal Methods for Industrial Critical Systems: 26th International Conference, FMICS 2021, Paris, France, August 24–26, 2021, Proceedings 26. Springer, 223–231.
[18] S. F. da Silva, J. J. Eckert, F. L. Silva, L. C. Silva, and F. G. Dedini. 2021. Multi-objective optimization design and control of plug-in hybrid electric vehicle powertrain for minimization of energy consumption, exhaust emissions and battery degradation. Energy Conversion and Management 234 (2021), 113909.
[19] L. Ji, J. Shi, Q. Hong, Y. Fu, X. Chang, Z. Cao, ..., and C. Booth. 2020. A multi-objective control strategy for three phase grid-connected inverter during unbalanced voltage sag. IEEE Transactions on Power Delivery 36, 4 (2020), 2490–2500.
[20] J. Chen, H. Shu, X. Tang, T. Liu, and W. Wang. 2022. Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment. Energy 239 (2022), 122123.