Objective:
Propose and build an integrated model-based design methodology and environment for heterogeneous processing systems. The environment handles cloud processing, edge server and edge device processing, and CPU, GPU, and FPGA processing in an integrated manner. It supports appropriate function partitioning and performance simulation, and automatically generates implementation code for each processing system, including FPGAs.
Contents:
In model-based design and development, there is a gap between the model and the implementation (the abstraction level gap). This research builds on the FPGA component technology (*1), the principal investigator's original technology, which treats FPGAs as reusable and interoperable components (parts) so that they can be used as parts in models. In other words, the entire system configuration can be examined in the upstream design phase using the FPGA's "functional model" and "performance model", and design productivity is expected to improve greatly for systems with demanding performance and power requirements, such as intelligent robots.
The proposed method is shown in Figure 1. The upper half of the figure shows an abstract model of processing and communication, in which an intelligent robot system consists of many processing components that communicate with each other to provide the overall functionality. For a complex robot system to operate safely and reliably, it is necessary to verify that the timing constraints of the real world in which the robot is placed are satisfied between the image/sensor inputs and the motor outputs. For example, the maximum allowable delay is on the order of microseconds for spinal-reflex-like control (e.g., PID control) and on the order of milliseconds for advanced intelligent processing, including AI-like inference. These constraints are specified at the abstract model stage, and concrete models and implementations that satisfy them are generated.
Furthermore, if the computing power of the edge device alone is insufficient, the relevant processing can be placed on the edge cloud server, or the performance and power can be predicted for the case where acceleration hardware such as a DSA, GPU, or FPGA is installed on the edge device or edge cloud server. By including hardware processing such as FPGAs in model-driven development, it becomes possible to explore the vast design space and to optimize function partitioning and processing placement based on a holistic view of the heterogeneous processing system, which is difficult to do in conventional system design and development.
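The placement exploration described above can be illustrated with a minimal sketch. It is not the project's tool: the component names, the latency figures (in milliseconds), and the power figures (in milliwatts) are hypothetical placeholders, communication cost between targets is ignored, and an exhaustive search stands in for whatever optimization the actual environment uses.

```python
from itertools import product

# Hypothetical cost table: component -> target -> (latency_ms, power_mw).
# All figures are illustrative placeholders, not measured values.
COSTS = {
    "image_filter": {"cpu": (8, 2000), "fpga": (1, 500)},
    "object_detect": {"cpu": (40, 5000), "fpga": (6, 1500), "cloud": (15, 200)},
    "motion_plan": {"cpu": (3, 1000), "cloud": (12, 100)},
}


def explore(costs, deadline_ms):
    """Enumerate every placement of a sequential pipeline and return the
    lowest-power one whose summed latency meets the deadline."""
    names = list(costs)
    best = None
    # Iterating a dict yields its keys, so each option list is a list of targets.
    for targets in product(*(costs[n] for n in names)):
        latency = sum(costs[n][t][0] for n, t in zip(names, targets))
        power = sum(costs[n][t][1] for n, t in zip(names, targets))
        if latency <= deadline_ms and (best is None or power < best[2]):
            best = (dict(zip(names, targets)), latency, power)
    return best  # None when no placement satisfies the deadline


if __name__ == "__main__":
    print(explore(COSTS, deadline_ms=20))
```

Exhaustive enumeration is only practical for a handful of components; it is used here because it makes the constraint (the deadline) and the objective (power) explicit.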
Figure 1
As intelligent robot software becomes more sophisticated and complex, system development technology is required to integrate high-speed, advanced edge processing in the robot itself and higher-order cognitive decision making in the cloud. To realize high-speed and advanced edge processing, hardware using field programmable gate arrays (FPGAs) is expected to achieve higher processing speeds and lower power consumption. On the other hand, to mitigate the complexity of robot software development, automatic system generation technology from highly abstract processing models including edge and cloud computing is needed.
POINT: This research enables the automatic generation of hardware and software processing systems for high-performance robotic systems from processing models by using a unique FPGA component technology to componentize high-performance FPGA circuits. In turn, this research will contribute to the advancement of intelligent robotics by realizing an environment for model-driven development of intelligent robot systems.
This study addresses the following three technical issues, proposing and evaluating a method for each.
(A) Methods to appropriately model many heterogeneous processing environments including hardware and software
(B) A method to simulate performance with highly abstract models in upstream design
(C) Automated design methods that can generate implementations from models
(A) Modeling. (A-1) Functional requirements: the interface type and content of the input and output data are defined for the processing model of each robot software component. (A-2) Non-functional requirements: parameters related to performance and power consumption are defined. (A-3) Performance information: the performance of each component when processed in different processing environments (CPU, FPGA, etc.) is defined, so that the optimal processing arrangement can be derived.
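As a rough illustration of what such a component model might contain, the sketch below records typed input/output ports (A-1), a deadline as a non-functional requirement (A-2), and a per-environment performance profile (A-3). The class names, field names, and all numbers are assumptions made for this example and do not reflect the project's actual modeling language.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    """(A-1) One typed input or output of a component."""
    name: str
    msg_type: str


@dataclass(frozen=True)
class Profile:
    """(A-3) Performance of a component in one processing environment."""
    latency_us: int
    power_mw: int


@dataclass
class Component:
    name: str
    inputs: list
    outputs: list
    deadline_us: int  # (A-2) non-functional requirement
    profiles: dict = field(default_factory=dict)  # environment name -> Profile

    def feasible_targets(self):
        """Environments whose profiled latency meets the deadline."""
        return sorted(
            target
            for target, profile in self.profiles.items()
            if profile.latency_us <= self.deadline_us
        )


# Hypothetical PID controller component; the figures are placeholders.
pid = Component(
    name="pid_controller",
    inputs=[Port("error", "std_msgs/Float32")],
    outputs=[Port("command", "std_msgs/Float32")],
    deadline_us=100,
    profiles={"cpu": Profile(250, 1500), "fpga": Profile(5, 300)},
)

if __name__ == "__main__":
    print(pid.feasible_targets())
```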
(B) Performance simulation: a simulator for the model in (A) will be developed to evaluate the functionality and performance of the proposed model. Because there is generally a trade-off between accuracy and simulation time, (B-1) coarse-grained performance simulation will be addressed first. (B-2) Fine-grained functional and performance simulation based on the implementation code should then be possible in the same way as in the coarse-grained case.
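A coarse-grained simulation in the sense of (B-1) could be as simple as propagating per-component latencies through the dataflow graph. The sketch below is an assumption-laden illustration rather than the project's simulator: each component is assumed to start as soon as all of its inputs are available, components run with unlimited parallelism, communication delay is ignored, and the graph and latency values (in microseconds) are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def finish_times(latency_us, predecessors):
    """Return the finish time of every component.

    latency_us:   component name -> processing latency
    predecessors: component name -> set of components it waits for
    """
    finish = {}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(node, ())), default=0)
        finish[node] = start + latency_us[node]
    return finish


# Hypothetical robot pipeline: camera -> detect -> plan -> motor, imu -> plan.
LATENCY_US = {"camera": 500, "imu": 50, "detect": 6000, "plan": 3000, "motor": 100}
PREDECESSORS = {"detect": {"camera"}, "plan": {"detect", "imu"}, "motor": {"plan"}}

if __name__ == "__main__":
    times = finish_times(LATENCY_US, PREDECESSORS)
    print("sensor-to-motor latency [us]:", times["motor"])
```

The finish time of the last component is the length of the critical path, which is the quantity to compare against the sensor-to-motor deadline.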
(C) Automated design method: first, a system will be developed that generates implementations from models as (C-1) ordinary software. Next, the system will be extended to generate (C-2) FPGA circuits from the model. This will build on the automatic generation tools we have developed for ROS-compliant FPGA components; the primary approach under consideration is to generate C/C++ implementations that can be input to Vivado HLS, Xilinx's High-Level Synthesis (HLS) tool. Finally, (C-3) the entire heterogeneous system will be generated. In addition to generating the software and FPGA units, the integration of the entire system will be automated by generating the message communication between the software and the FPGA.
In the evaluation of each R&D item, in addition to evaluating the functionality and performance of the proposed method or tool, a subjective evaluation will be conducted through experiments with test subjects, to confirm that the technology will actually be used in practice.
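To make the idea of model-to-code generation concrete, the following sketch emits a plain C/C++ function skeleton from a component's input and output descriptions. It is a hypothetical, template-based illustration and not the project's generation tool: the function name, the argument lists, and the skeleton layout are assumptions, and HLS interface pragmas and ROS message handling are deliberately omitted.

```python
from string import Template

# Skeleton of the generated C/C++ function (illustrative layout only).
SKELETON = Template("""\
// Generated from model component '$name' (illustrative skeleton only)
void $name($params) {
    // TODO: processing body supplied by the component developer
}
""")


def generate_stub(name, inputs, outputs):
    """Build a C/C++ function skeleton.

    inputs, outputs: lists of (c_type, argument_name) pairs.
    Inputs are passed by value and outputs through pointers.
    """
    params = [f"{ctype} {arg}" for ctype, arg in inputs]
    params += [f"{ctype} *{arg}" for ctype, arg in outputs]
    return SKELETON.substitute(name=name, params=", ".join(params))


if __name__ == "__main__":
    print(generate_stub("pid_step", [("float", "error")], [("float", "command")]))
```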
Principal Investigator (Dr. Okawa) comments
With the spread of AI (Artificial Intelligence), software for intelligent robots that connect to the cloud and perform sophisticated processing as a whole is becoming increasingly complex. Model-based design methods are essential for design clarity, ease of debugging and verification, and component reuse.
Against this background, robot software development frameworks such as ROS (Robot Operating System) and ROS2, the next-generation version of ROS, have been widely adopted in recent years. The essence of ROS/ROS2 is to improve the portability and reusability of software components by enabling users to obtain robot software, including AI software, via the Internet and run it immediately in their own PC environment. In other words, ROS has improved the reusability of each component of the system by adopting the Publish/Subscribe communication model, which loosely couples software components through communication, thereby reducing the complexity of robot software system development. This is considered essential progress for software systems.
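The loose coupling provided by the Publish/Subscribe model can be sketched in a few lines. The broker below is a minimal in-process illustration written for this page; it is not the ROS or ROS2 API, and the class name, method names, and topic names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe broker (not the ROS API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher only names a topic; it never references a subscriber.
        for callback in self._subscribers.get(topic, ()):
            callback(message)


if __name__ == "__main__":
    broker = Broker()
    received = []
    broker.subscribe("/camera/image", received.append)
    broker.publish("/camera/image", "frame0")
    broker.publish("/unused/topic", "dropped")  # no subscriber, nothing happens
    print(received)
```

Because publishers and subscribers share only a topic name and a message type, a software subscriber could in principle be replaced by a differently implemented node without changing the publisher.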
On the other hand, to improve the energy efficiency of processing in various applications, it is necessary to utilize, in addition to general-purpose microprocessors, GPUs (Graphics Processing Units), DSAs (Domain Specific Architectures), FPGAs (Field Programmable Gate Arrays), and other dedicated hardware that accelerates neural network processing. Performance and power requirements are especially severe in embedded processing environments, where intelligent robots are expected to be realized.
To satisfy these requirements, we have established "FPGA component technology", in which FPGAs act as loosely coupled processing nodes that use Publish/Subscribe communication, the basis of ROS. The underlying theme is an optimal design method for systems in which FPGAs cooperate with software.
In the future, as noted at the beginning, intelligent robot system designers will need to view these heterogeneous processing environments, including cloud processing, at the level of the entire system, and to determine the system configuration and processing placement in the upstream design stage after optimally partitioning functions according to the application requirements.
LINK: Model-driven FPGA Design Environment for Intelligent Robot System (P.I. Takeshi OHKAWA)
REFERENCE:
[1] Takeshi Ohkawa, Takashi Yokota, Kanemitsu Ootsu, “A prototyping system for hardware distributed objects with diversity of programming languages design and preliminary evaluation,” 2013 International Conference on Field-Programmable Technology (FPT), pp.474-477, 2013.
[2] Kazushi Yamashina, Takeshi Ohkawa, Kanemitsu Ootsu, Takashi Yokota, “Proposal of ROS-compliant FPGA Component for Low-Power Robotic Systems,” FPGA for Software Programmer (FSP) 2015 c/w FPL2015, CoRR abs/1508.07123, 2015.
[3] Yuhei Sugata, Takeshi Ohkawa, Kanemitsu Ootsu, Takashi Yokota, “Acceleration of Publish/Subscribe Messaging in ROS-compliant FPGA Component,” Proc. of the 8th Intl. Symp. on Highly Efficient Accelerators and Reconfigurable Technologies (HEART 2017), pp.1-6, 2017.
[4] Takeshi Ohkawa, Kazushi Yamashina, Hitomi Kimura, Kanemitsu Ootsu, Takashi Yokota, “FPGA Component Technology for Easy Integration of FPGA into Robot Systems,” IEICE Transactions on Information and Systems, Vol.E101-D, No.2, pp.363-375, Feb. 2018.
[5] Takeshi Ohkawa, Kazushi Yamashina, Takuya Matsumoto, Kanemitsu Ootsu, Takashi Yokota, “Automatic Generation Tool of FPGA Components for Robots,” IEICE Transactions on Information and Systems, Vol.E102-D, No. 5, pp.1012-1019, 2019.
[6] Takeshi Ohkawa, Yuhei Sugata, Harumi Watanabe, Nobuhiko Ogura, Kanemitsu Ootsu, Takashi Yokota, “High level synthesis of ROS protocol interpretation and communication circuit for FPGA,” Proceedings of the 2nd International Workshop on Robotics Software Engineering, RoSE@ICSE 2019, Montreal, QC, Canada, May 27, 2019, pp.33-36, 2019.
[7] Takeshi Ohkawa, “Component-based FPGA Development for Intelligent Robotics,” 2019 International Workshop on Smart Info-Media Systems in Asia (SISA2019), pp. 11-18, 2019.
[8] Takeshi Ohkawa, Ikuta Tanigawa, Mikiko Sato, Kenji Hisazumi, Nobuhiko Ogura, Harumi Watanabe, “Prototype of FPGA Dynamic Reconfiguration based-on Context-Oriented Programming,” 13th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, MCSoC 2019, Singapore, Oct. 1-4, 2019, pp.116-122, 2019.
[9] Daniel Pinheiro Leal, Midori Sugaya, Hideharu Amano, Takeshi Ohkawa, “FPGA Acceleration of ROS2-Based Reinforcement Learning Agents,” 8th International Workshop on Computer Systems and Architectures (CSA'20), held in CANDAR’20, online, Nov. 24-27, 2020 (accepted as regular presentation), 2020.