Neuromorphic computing is a computing paradigm that draws inspiration directly from the neurological structure and processing methods of the human brain. At the heart of this technology are Spiking Neural Networks (SNNs), which mimic the brain's neuronal activity.
The journey from brain-inspired concepts to actual neuromorphic chips begins with understanding the basic unit of the brain - the biological neuron. This understanding is then translated into a digital neuron or core. Each digital core is designed to simulate the functionalities of a group of biological neurons, including receiving, processing, and transmitting information. By integrating multiple such cores, a digital neurosynaptic chip is created. This chip is capable of mimicking the parallel and distributed processing of the human brain, leading to efficient and powerful computing capabilities.
One of the fundamental models used in SNN-based neuromorphic computing is the Leaky Integrate-and-Fire (LIF) model, a simplified representation of how neurons in the brain operate. In the LIF model, a neuron's membrane potential increases as it receives incoming spikes (electrical impulses) from other neurons. Once this potential reaches a certain threshold, the neuron 'fires', sending a spike to other neurons, and then resets its potential.
The 'leaky' part of the model refers to the gradual decay of the membrane potential over time, similar to how a real neuron loses its potential unless continually stimulated. This leaky aspect ensures that the neuron does not remain perpetually active and mimics the temporal dynamics of biological neurons.
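The integrate, leak, fire, and reset steps described above can be sketched as a discrete-time update. This is a minimal illustration rather than the model used on any particular chip: the names `lif_step` and `run_lif`, the subtractive leak, the integer values, and the reset-to-zero rule are all assumptions chosen for clarity.

```python
def lif_step(potential, weighted_input, leak=1, threshold=10, reset=0):
    """Advance one LIF neuron by one time step.

    All parameter names and default values are illustrative assumptions.
    Returns (new_potential, fired).
    """
    # Integrate: add the summed weight of the spikes that arrived this step.
    potential += weighted_input
    # Leak: decay toward zero, never below it.
    potential = max(potential - leak, 0)
    # Fire and reset once the threshold is reached.
    if potential >= threshold:
        return reset, True
    return potential, False


def run_lif(inputs, **params):
    """Feed a sequence of per-step inputs to one neuron; return the spike train."""
    potential, spikes = 0, []
    for x in inputs:
        potential, fired = lif_step(potential, x, **params)
        spikes.append(fired)
    return spikes
```

With these defaults, a steady input of 4 per step makes the neuron fire on the fourth step, while a single input of 4 followed by silence leaks away without producing a spike.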
A prime example of neuromorphic computing in action is the TrueNorth chip, designed by IBM. Its basic building block is the neurosynaptic core, which represents a network of 256 neurons connected to 256 axons. Each core on the TrueNorth chip is capable of parallel processing, mimicking the way neurons in the brain process information simultaneously.
The TrueNorth chip is a significant milestone in neuromorphic computing as it demonstrates how a large number of neurons and synapses can be efficiently integrated into a single chip. This integration allows for complex computations and decision-making processes that are inspired by the human brain's capabilities, opening new frontiers in artificial intelligence and computing.
The IBM TrueNorth architecture features a two-dimensional network on a chip, comprising several neurosynaptic cores. Each of these cores is an assembly of various functional blocks, each playing a vital role in the overall processing capability of the network.
In the SNN, information is transmitted through 'spike packets'. A spike packet sent to a core is a 30-bit word whose fields are laid out as follows, from the most significant bit (MSB) to the least significant bit (LSB):
9 bits for dx (horizontal coordinate)
9 bits for dy (vertical coordinate)
8 bits for the axon destination
4 bits for the tick instance (typically not used in the current context)
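The field layout above can be made concrete with a small pack/unpack sketch. The function names are hypothetical, and the fields are treated as raw unsigned values; whether dx and dy are interpreted as signed offsets in hardware is not stated here, so that is left out.

```python
DX_BITS, DY_BITS, AXON_BITS, TICK_BITS = 9, 9, 8, 4
PACKET_BITS = DX_BITS + DY_BITS + AXON_BITS + TICK_BITS  # 30


def pack_spike(dx, dy, axon, tick):
    """Pack raw field values into a 30-bit integer, dx in the top bits."""
    for value, bits in ((dx, DX_BITS), (dy, DY_BITS), (axon, AXON_BITS), (tick, TICK_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{value} does not fit in {bits} bits")
    packet = dx
    packet = (packet << DY_BITS) | dy
    packet = (packet << AXON_BITS) | axon
    packet = (packet << TICK_BITS) | tick
    return packet


def unpack_spike(packet):
    """Split a 30-bit packet back into (dx, dy, axon, tick)."""
    tick = packet & ((1 << TICK_BITS) - 1)
    packet >>= TICK_BITS
    axon = packet & ((1 << AXON_BITS) - 1)
    packet >>= AXON_BITS
    dy = packet & ((1 << DY_BITS) - 1)
    dx = packet >> DY_BITS
    return dx, dy, axon, tick
```

Packing and then unpacking returns the original fields, and a packet with dx = dy = 0 occupies only the low 12 bits.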
Each core in the SNN architecture is composed of five primary blocks:
Router
Scheduler
Core SRAM
Token Controller
Neuron Block
The Router block is crucial for internal communication between cores. It plays the following roles:
It identifies the destination core of incoming packets based on the dx and dy fields.
When a packet reaches its destination core (indicated by dx=dy=0), the Router discards the first 18 bits containing the packet's coordinates. The remaining 12 bits are then forwarded to the Scheduler.
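A sketch of the Router's decision for a single packet, under the bit layout given earlier, might look as follows. The function name `route` and the string tags are hypothetical, and the way dx and dy are updated from hop to hop is not described in this text, so non-local packets are simply passed on unchanged.

```python
COORD_BITS = 18      # 9-bit dx followed by 9-bit dy, at the top of the packet
PAYLOAD_BITS = 12    # 8-bit axon destination followed by 4-bit tick


def route(packet):
    """Return ("local", payload) for a packet at its destination, else ("forward", packet)."""
    coords = packet >> PAYLOAD_BITS
    if coords == 0:  # dx = dy = 0: this core is the destination
        # Drop the 18 coordinate bits and keep the 12-bit payload.
        payload = packet & ((1 << PAYLOAD_BITS) - 1)
        return "local", payload
    # Hop-by-hop coordinate updates are outside the scope of this sketch.
    return "forward", packet
```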
The Scheduler block is responsible for determining the precise location and timing of spike delivery. It uses:
The 8-bit axon destination to determine where the spike should be delivered.
The 4-bit tick instance to indicate when the spike should be delivered.
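To illustrate how the two fields could be used together, the sketch below keeps one flag per axon for each possible tick value. The table shape (16 slots by 256 axons) follows from the field widths, but the class name, the use of the tick field directly as a slot index, and the `pop` interface are assumptions made for this example.

```python
class Scheduler:
    """Toy scheduler: 16 tick slots, each holding one flag per axon (assumed layout)."""

    NUM_TICKS = 1 << 4    # 4-bit tick field
    NUM_AXONS = 1 << 8    # 8-bit axon field

    def __init__(self):
        self.slots = [[False] * self.NUM_AXONS for _ in range(self.NUM_TICKS)]

    def schedule(self, payload):
        """Record the spike carried by a 12-bit payload (axon in the top 8 bits)."""
        axon = (payload >> 4) & 0xFF
        tick = payload & 0xF
        self.slots[tick][axon] = True
        return axon, tick

    def pop(self, tick):
        """Return the axon flags due at `tick` and clear that slot for reuse."""
        due = self.slots[tick]
        self.slots[tick] = [False] * self.NUM_AXONS
        return due
```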
The Token Controller functions as a state machine, orchestrating the overall operation of the network. It sends control signals to other blocks, effectively managing the flow of information and processing within the core.
The Core Static Random-Access Memory (SRAM) is essential for storing the parameter set for all 256 neurons in the core. This memory block contains vital information that dictates the behavior of each neuron within the core.
The Neuron Block is the computational heart of the core. It contains the elements needed to calculate the potential of each neuron, mimicking the neuronal computation found in biological brains. The operation and interaction within the Neuron Block are regulated by the Token Controller, ensuring precise and timely computations.
The IBM TrueNorth core architecture operates by iterating through the 256 axons of one neuron before moving on to the next, repeating this process for all 256 neurons within the core. This serial approach, while functional, is not optimized for speed.
The shift to a parallel processing model marks a significant leap in performance. In the optimized neurosynaptic core, all 256 neurons operate simultaneously. This architectural transformation is visualized in the provided diagram, showing the transition from the traditional model to the redesigned one.
A pivotal change is the introduction of the Neuron Grid, which takes over the interaction between the Core Static Random-Access Memory (CSRAM) and the Neuron Block. This grid ensures the efficient distribution of spike packets to the Router, mitigating the risk of buffer overflow.
To accommodate the parallel operation of neurons, the Token Controller's Finite State Machine (FSM) is redesigned. The number of states in the FSM is reduced and optimized to only four, streamlining the control flow within the core.
The Neuron Grid module comprises 256 individual neuron instances. Each neuron instance has a corresponding spike buffer that temporarily stores spikes before they are released. The NG Spike component is responsible for distributing these packets to the Router. At the end of each processing tick, the spike buffer is cleared to make room for the next cycle of spikes.
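The per-tick behaviour of the Neuron Grid can be modelled in software roughly as follows. This is a simplified sketch, not the hardware design: the class and attribute names are hypothetical, weights are assumed to be integers, the leak is omitted, and the Python loop stands in for neuron instances that would all update at the same time in hardware.

```python
class NeuronGrid:
    """Toy model of a grid of neurons that all update in the same tick."""

    def __init__(self, weights, threshold):
        # weights[n][a] is the (assumed integer) weight from axon a to neuron n.
        self.weights = weights
        self.threshold = threshold
        self.potentials = [0] * len(weights)
        self.spike_buffer = [False] * len(weights)

    def tick(self, axon_spikes):
        """Update every neuron for one tick and return the indices that fired."""
        for n, row in enumerate(self.weights):
            # In hardware each neuron instance performs this step concurrently.
            self.potentials[n] += sum(w for w, s in zip(row, axon_spikes) if s)
            if self.potentials[n] >= self.threshold:
                self.spike_buffer[n] = True
                self.potentials[n] = 0
        # Drain the buffer toward the router, then clear it for the next tick.
        fired = [n for n, flag in enumerate(self.spike_buffer) if flag]
        self.spike_buffer = [False] * len(self.weights)
        return fired
```

The list returned by `tick` plays the role of the spikes handed to the Router, and the buffer is empty again before the next tick begins.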
Latency: 769*2 cycles for the parallel architecture and 66050*2 cycles for the serial architecture (about 0.015 ms and 1.3 ms at 100 MHz).
At the same frequency (100 MHz), the proposed parallel SNN processes data about 86 times faster than the serial one.
At the same processing speed (~1500 images/s), the power consumption of the parallel architecture is 22 times lower.
At a frequency of 25 MHz, both speed and dynamic power are improved.
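The latency and speed-up figures follow directly from the cycle counts. The short calculation below converts the quoted cycle counts to time at 100 MHz and derives the ratio between the two architectures; the variable and function names are illustrative only.

```python
def latency_ms(cycles, clock_hz):
    """Convert a cycle count to milliseconds at the given clock frequency."""
    return cycles / clock_hz * 1e3


CLOCK_HZ = 100e6
PARALLEL_CYCLES = 769 * 2      # 1538 cycles
SERIAL_CYCLES = 66050 * 2      # 132100 cycles

parallel_ms = latency_ms(PARALLEL_CYCLES, CLOCK_HZ)
serial_ms = latency_ms(SERIAL_CYCLES, CLOCK_HZ)
speedup = SERIAL_CYCLES / PARALLEL_CYCLES

print(f"parallel: {parallel_ms:.4f} ms, serial: {serial_ms:.3f} ms, speedup: {speedup:.1f}x")
# prints "parallel: 0.0154 ms, serial: 1.321 ms, speedup: 85.9x"
```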
The trade-off for these gains in speed and efficiency is an increase in the utilization of FPGA resources, such as Look-Up Tables (LUTs) and flip-flops. This trade-off is a common consideration in hardware design, where resource usage often correlates with performance enhancements.
Our GitHub repo: https://github.com/edabk-hust/edabk_brain_soc/
The RISC-V SoC Caravel framework provides a robust platform for integrating specialized processing units, such as our optimized Spiking Neural Network (SNN) core. We chose this framework for our implementation because of its flexible design and open-source nature.
Our SNN core has been adapted to work within the parameters of the RISC-V SoC Caravel framework, which includes the following features:
PicoRV32 Processor: This processor is based on the RV32IMC instruction set, with a 2-cycle operation, offering a balance between performance and complexity.
32-bit Wishbone Bus: Caravel uses a 32-bit Wishbone bus for data transfer within the system.
User Project Area: The framework provides a 2.8 mm x 2.8 mm user project area, which houses our optimized SNN core.
For the purpose of simplification and demonstration, our design implementation focuses on a single physical neuron core containing 256 axons and 32 neurons, effectively creating a 256x32 configuration. This condensed version of our optimized SNN core is a proof-of-concept that demonstrates the capabilities of the design within the constraints of the user project area.
In the pursuit of efficiency and scalability, we have embraced a hardware/software co-design approach for our SNNs within the RISC-V SoC framework. The salient features of this approach include:
Single Physical Core with Software Control: Instead of fabricating multiple physical cores, we have designed a system where one physical core emulates the operation of multiple cores under software control. The control software is written in C, and the approach is possible because all cores are structurally identical and differ only in their parameter sets.
Dynamic Parameter Sets: The parameter sets, which include the synapse_matrix parameters and neuron parameters, can be dynamically changed through software. This means we can create one physical core and use software to swap between 'virtual' cores by saving and loading different parameter sets as needed.
Software-Managed Packet Scheduling and Core Switching: The software layer manages which 'virtual' core is currently active, schedules the packets to the core, and orchestrates the core switching process. It determines the destination for outgoing packets and controls the timeline of operations for the SNN core.
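The idea of swapping parameter sets on one physical core can be sketched as follows. The project's control software is written in C; this Python model is only an illustration of the save/load pattern, and every class, method, and dictionary key in it (apart from the `synapse_matrix` name taken from the text) is a hypothetical stand-in.

```python
class PhysicalCore:
    """Stand-in for the single hardware core; holds one parameter set at a time."""

    def __init__(self):
        self.synapse_matrix = None
        self.neuron_params = None

    def load(self, params):
        self.synapse_matrix = params["synapse_matrix"]
        self.neuron_params = params["neuron_params"]

    def save(self):
        return {"synapse_matrix": self.synapse_matrix, "neuron_params": self.neuron_params}


class VirtualCoreManager:
    """Time-multiplexes several 'virtual' cores onto one physical core."""

    def __init__(self, core, parameter_sets):
        self.core = core
        self.parameter_sets = dict(parameter_sets)  # virtual core id -> parameters
        self.active = None

    def switch_to(self, core_id):
        """Save the active core's parameters, then load those of `core_id`."""
        if self.active is not None:
            self.parameter_sets[self.active] = self.core.save()
        self.core.load(self.parameter_sets[core_id])
        self.active = core_id
```

Because the state of the outgoing virtual core is saved before the next one is loaded, any parameters that changed while it was active are restored the next time it is selected.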
By integrating our optimized SNN core into the Caravel framework using a hardware/software co-design approach, we are able to simulate the presence of multiple cores while only utilizing the resources for one. This not only conserves silicon area but also allows for greater flexibility in the use of the core for various applications.