Spiking Neural Processor
The number of sensors in electronic devices is growing, and an increasing number of applications run on battery-powered devices. Sensing applications are typically always-on, which demands high power efficiency across the sense-process-act chain. Today, however, the processors available for handling sensors and processing sensor data are characterized by high power consumption per inference. Much of the inefficiency lies in how data is acquired from sensors and how information is relayed between processing sub-systems. The architectural enhancements needed to improve efficiency cannot be achieved without hardware-software co-design.
In this tutorial, we formulate requirements for a hierarchical, modular neuromorphic framework that enables concurrent hardware-software co-design in a smart sensing System-on-Chip. We exploit the synergy of hardware and software to examine omnidirectional dependencies across the entire design stack (from the application, neural network algorithm, and mapper levels down to the system-on-chip, sub-system, and technology-option levels) with the goal of optimizing and/or satisfying smart sensing design constraints such as energy-efficiency, performance, cost, and time-to-market. In particular, we highlight the advantages of concurrent design, and emphasize the synergy between i) a scalable, reconfigurable, segmented architecture that enables real-time always-on inference of sensor data, essential for most pervasive sensing tasks, and ii) a software development kit that enables the user to build and run an end-to-end application pipeline comprising multiple processing stages, with spiking neural network accelerators being one of them.
The tutorial aims to delve into the contemporary trends of neuromorphic computing, explore its capabilities and challenges, and contemplate its future directions and broader impact within the AI community, industry, and society. The target audience includes research students, early-stage researchers, and practitioners with a background in AI.
2 hours
(30-50 participants expected)
Research students, early-stage researchers, and practitioners with a background in AI.
Attendees should be familiar with the machine learning pipeline and have prior experience with supervised learning and classification algorithms. Knowledge of neuromorphic computing (e.g. sensor processing with neuromorphic neural networks, executing neuromorphic algorithms on dedicated neuromorphic hardware, neuromorphic computation models, adaptation and learning in neuromorphic neural networks) is a plus, but not a strict requirement.
Amir Zjajo received the M.Sc. and DIC degrees from Imperial College London, London, U.K. in 2000, and the PhD degree from Eindhoven University of Technology, Eindhoven, The Netherlands in 2010, all in electrical engineering. In 2000, he joined Philips Research Laboratories as a member of the research staff in the Mixed-Signal Circuits and Systems Group. From 2006 until 2009, he was with Corporate Research of NXP Semiconductors as a Senior Research Scientist. He joined Delft University of Technology in 2009 as a senior lecturer, and was responsible for leading research into intelligent systems within a range of EU-funded research projects. In 2018, he co-founded Innatera Nanosystems B.V. to commercialize neuromorphic signal processing technology.
Dr. Zjajo has published more than 90 papers in refereed journals and conference proceedings, and holds more than 20 US patents granted or pending. He is the author of the books Brain-Machine Interface: Circuits and Systems (Springer, 2016; Chinese translation, China Machine Press, 2020), Low-Voltage High-Resolution A/D Converters: Design, Test and Calibration (Springer, 2011; Chinese translation, China Machine Press, 2015), and Stochastic Process Variations in Deep-Submicron CMOS: Circuits and Algorithms (Springer, 2014), and he is the editor of Real-Time Multi-Chip Neural Network for Cognitive Systems (River Publishers, 2019). He has served as a member of the Technical Program Committees of the IEEE International Symposium on Quality Electronic Design, the IEEE Design, Automation and Test in Europe Conference, the IEEE Symposium on VLSI, the IEEE International Symposium on Circuits and Systems, and the IEEE Biomedical Circuits and Systems Conference, among others.
His research interests include energy-efficient mixed-signal circuit and system design for neuromorphic processors, on-chip machine learning and inference, and bionic electronic circuits for autonomous cognitive systems. Dr. Zjajo received best/excellence paper awards at BioDevices’15, LifeTech’19, and AICAS’23. He is a Senior Member of the IEEE.
LinkedIn: https://nl.linkedin.com/in/amir-zjajo-phd-smieee-8ba25268
The tutorial addresses the technological challenges of smart sensor design and high levels of integration.
Smart sensor design and high level of integration: The quest to accommodate higher data rates, greater resolution, a higher level of control and programmability, lower latency, and higher inference accuracy, all within a single sensor system, is driving the need for tighter integration of sensors alongside processors. We examine how a neuromorphic fabric can be extended towards vertical integration in a concurrent, application-driven design stack, and towards horizontal integration along the signal (pre-)processing path (i.e. signal conditioning and conversion), not only by effectively performing signal processing functions, but also by enabling specific functionalities (e.g. feature extraction) with competitive energy-efficiency, both of which facilitate an unprecedented bill-of-materials reduction in sensor systems.
The quest to minimize energy per inference, or per specific task, requires i) co-design across the compute stack (enabling the algorithms and applications to influence the underlying hardware design and, in the opposite direction, validating whether the underlying hardware implementation fits a particular application’s needs or constraints), ii) a large degree of hardware-software co-optimization (e.g. minimizing network size for a given accuracy, or maximizing the number of operations within an allowed power envelope; a back-of-the-envelope sketch of this budget arithmetic follows below), and iii) a fully modular and flexible hardware system to unify the design process.
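To illustrate point ii), the sketch below relates an assumed always-on power envelope to sustainable synaptic-operation and inference rates. All figures (a 1 mW budget, 1 pJ per synaptic operation, 50,000 synaptic operations per inference) are hypothetical and serve only to show the nature of the trade-off, not to characterize any particular device.

# Illustrative power-envelope arithmetic; all figures are assumptions,
# not measurements of any particular device.
POWER_ENVELOPE_W = 1e-3        # assumed always-on budget: 1 mW
ENERGY_PER_SYNOP_J = 1e-12     # assumed energy per synaptic operation: 1 pJ

# Upper bound on sustained synaptic operations per second within the budget.
max_synops_per_s = POWER_ENVELOPE_W / ENERGY_PER_SYNOP_J        # 1e9 synop/s

# If a network needs N synaptic operations per inference, the achievable
# always-on inference rate under the same budget follows directly.
SYNOPS_PER_INFERENCE = 50_000  # assumed network cost per inference
max_inferences_per_s = max_synops_per_s / SYNOPS_PER_INFERENCE  # 20,000 /s

print(f"max synaptic ops/s: {max_synops_per_s:.2e}")
print(f"max inferences/s:   {max_inferences_per_s:.0f}")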
In this tutorial, we formulate requirements for a hierarchical, modular neuromorphic framework that enables improved hardware-software co-design in a next-generation smart sensing System-on-Chip. We exploit the synergy of hardware and software to examine omnidirectional dependencies across the entire design stack, with the goal of optimizing and/or satisfying smart sensing design constraints such as energy-efficiency, performance, cost, and time-to-market. In particular, we emphasize:
The need for advanced sensor data-handling engines paired with modular NN accelerators for a wide spectrum of sensing tasks: we emphasize the need for a processor that is capable of operating in always-on scenarios within a stringently narrow power envelope, and we define a full set of requirements for a system-on-chip at the sensor edge, effectively formalizing a dedicated sensor-handling engine.
The trade-offs between biological and artificial circuits, focusing on continuous-time processing efficiency: we examine electro-chemically accurate, multi-compartment, neuro-synaptic computational elements, abstract their fundamental functions by extracting the underlying dynamics, and analyze their complexity, accuracy, and flexibility in the signal processing of time-varying tasks (a minimal sketch of one such abstraction follows this list).
Strategies to maximize the energy-efficiency and scalability of NN accelerators while minimizing latency: we follow several design principles, among others distributed computation, specialization of neural designs at all scales, minimization of wiring costs, continuous-time processing to enable high information rates, and compartmentalization and complexing/extending of synaptic capabilities for reliable and efficient (resource-aware) signal processing.
Insights into software tools that aid both the design and the deployment of NN accelerators, and their ease of use: sophisticated software tools enable fast exploration of new challenges, e.g. potential benefits in terms of competitive performance or extended product capabilities, in terms of usability, in terms of optimal power performance for an application, or in terms of improved efficiency or testability. Hence, we postulate the need for i) a full toolchain to compile SNN binaries from a Python description, ii) a highly advanced mapper for optimal mapping of compositional SoC designs, iii) a powerful software framework for detailed architectural exploration of spiking neural networks that accounts for physical effects, iv) tools for high-performance circuit design optimization for advanced process geometries and sensitive applications, and v) automated characterization and validation (a hypothetical toolchain sketch follows this list).
A quantitative validation of NN accelerators’ competitive performance across various applications: NN accelerators hold a critical position in the investigation of novel architectures for accomplishing scalable, energy-efficient, and real-time embedded computation. In the evaluation, we include both spiking neural network (SNN) and non-SNN approaches by utilizing general, task-level benchmarking and hierarchical metric definitions, which capture the key performance indicators of interest, and we utilize common open-source benchmark tools, which facilitate actionable benchmark implementations.
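As referenced in the second item above, detailed neuro-synaptic elements are abstracted by extracting their underlying dynamics. The sketch below shows one of the simplest such abstractions, a leaky integrate-and-fire neuron; the parameter values are illustrative assumptions, and the model is deliberately far simpler than the electro-chemically accurate, multi-compartment elements examined in the tutorial.

import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron: a common abstraction of the
# membrane dynamics of more detailed multi-compartment models. All parameter
# values below are illustrative assumptions.
def lif_response(input_current, dt=1e-3, tau=20e-3, r_m=1e7,
                 v_rest=-70e-3, v_thresh=-50e-3, v_reset=-70e-3):
    """Integrate dV/dt = (-(V - v_rest) + r_m * I) / tau and emit spikes."""
    v = v_rest
    spikes, trace = [], []
    for i_t in input_current:
        v += dt * (-(v - v_rest) + r_m * i_t) / tau   # leaky integration
        if v >= v_thresh:                             # threshold crossing
            spikes.append(1)
            v = v_reset                               # reset after a spike
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

# Constant 2.5 nA drive for 100 ms: with these assumed parameters the
# steady-state depolarization (r_m * I = 25 mV above rest) exceeds the 20 mV
# gap to threshold, so the neuron fires periodically.
current = np.full(100, 2.5e-9)
spikes, v_trace = lif_response(current)
print("spike count:", int(spikes.sum()))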
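To make the toolchain stages in the fourth item concrete, the sketch below mocks up a Python-level SNN description and a greedy mapper with a toy energy estimate. Every class, method name, and number is a hypothetical illustration of the flow, not the actual SDK, mapper, or compiler discussed in the tutorial.

from dataclasses import dataclass, field

# Hypothetical sketch of two toolchain stages: a Python-level SNN description
# and a mapper that assigns layers to neuro-synaptic cores. Names and figures
# are illustrative only.

@dataclass
class SNNDescription:
    layers: list = field(default_factory=list)   # (n_neurons, connectivity)

    def add_layer(self, n_neurons, connectivity="dense"):
        self.layers.append((n_neurons, connectivity))
        return self

@dataclass
class MappingResult:
    placements: dict            # layer index -> core id
    estimated_energy_nj: float  # toy per-inference energy estimate

def map_to_cores(net, neurons_per_core=256, energy_per_neuron_nj=0.05):
    """Greedy placement: fill cores in order, then estimate a toy energy figure."""
    placements, core, used = {}, 0, 0
    for idx, (n, _) in enumerate(net.layers):
        if used + n > neurons_per_core:
            core, used = core + 1, 0              # spill to the next core
        placements[idx] = core
        used += n
    energy = sum(n for n, _ in net.layers) * energy_per_neuron_nj
    return MappingResult(placements, energy)

# Example: a three-layer network described in Python, then mapped.
net = SNNDescription().add_layer(64).add_layer(200).add_layer(10)
mapping = map_to_cores(net)
print(mapping.placements, f"{mapping.estimated_energy_nj:.1f} nJ")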
Introductory reading:
J.-F. Zhang, Z. Zhang, “Machine learning hardware design for efficiency, flexibility, and scalability,” IEEE Circuits and Systems Magazine, pp. 35-52, Dec. 2023.
J. Lin, et al. “Tiny machine learning: progress and futures,” IEEE Circuits and Systems Magazine, pp. 8-34, Dec. 2023.
Further reading:
J. Bi, et al. “Efficient and fast high-performance library generation for deep learning accelerators,” IEEE Transactions on Computers, pp. 1-14, 2024, early access.
F. Schneider, et al. “Energy and bandwidth efficient sparse programmable dataflow accelerator,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 71, no. 9, pp. 4092-4105, Sep. 2024.
A. Moitra, et al. “When in-memory computing meets spiking neural networks – A perspective on device-circuit-system-and-algorithm co-design,” Applied Physics Reviews, vol. 11, 031325, Sep. 2024.
Y. Hu, et al. “Toward large-scale spiking neural networks: A comprehensive survey and future directions,” arXiv:2409.02111, Aug. 2024.
D. Kim, et al. “An overview of processing-in-memory circuits for artificial intelligence and machine learning,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 12, pp. 338-352, June 2022.
K. Yamazaki, et al. “Spiking neural networks and their applications: a review,” Brain Sciences, 12, 863, June 2022.
A. Basu, et al. “Spiking neural network integrated circuits: a review of trends and future directions,” IEEE Custom Integrated Circuits Conference, pp. 1-8, 2022.
Contact amir.zjajo@innatera.com for more information on the tutorial.