Lecturers
Neuromorphic computing demands enormous computation rates, so efficient computing platforms are needed to deliver that throughput at feasible power levels. This session surveys the wide range of platforms used to build neuromorphic systems: CPUs, GPUs, FPGAs, digital ASICs, and analog circuits. We will study the design and programming of these platforms and evaluate them on numerical precision, performance, and power consumption.
(To Be Updated)
(To Be Updated)
This lecture covers design-optimization methods for neural networks, from the algorithm level down to hardware design. As an example of algorithm-level optimization, we will study object detection algorithms including R-CNN, Fast(er) R-CNN, SSD, R-FCN, and more recent ones such as YOLO v2.
Convolution, the key operation in convolutional neural networks, is usually implemented as matrix multiplication on GPUs. We will study this conversion (called convolution lowering) and the issues that arise when executing lowered convolutions on a GPU. We will then address three important neural-network design optimizations: low-rank approximation, quantization, and pruning.
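To make the lowering step concrete, here is a minimal NumPy sketch of the standard im2col transformation (stride 1, no padding assumed). The function names `im2col` and `conv2d_lowered` are illustrative, not from the lecture; real GPU libraries implement the same idea with highly tuned GEMM kernels.

```python
import numpy as np

def im2col(x, kh, kw):
    """Unroll kh x kw patches of a (C, H, W) input into columns.

    Each column holds one receptive field, so convolution becomes
    a single matrix multiplication (stride 1, no padding)."""
    c, h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, oh * ow))
    idx = 0
    for i in range(oh):
        for j in range(ow):
            cols[:, idx] = x[:, i:i + kh, j:j + kw].ravel()
            idx += 1
    return cols

def conv2d_lowered(x, weights):
    """weights: (F, C, kh, kw) -> output (F, oh, ow) via one GEMM."""
    f, c, kh, kw = weights.shape
    cols = im2col(x, kh, kw)          # (C*kh*kw, oh*ow)
    w_mat = weights.reshape(f, -1)    # (F, C*kh*kw)
    out = w_mat @ cols                # the GEMM a GPU actually runs
    oh, ow = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    return out.reshape(f, oh, ow)
```

Note that `im2col` duplicates overlapping input pixels across columns; this extra memory traffic is one of the GPU-execution issues the lecture discusses.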
Low-rank approximation transforms an expensive matrix multiplication into a cheaper one by decomposing a large matrix into a product of smaller matrices. Quantization aims to reduce the bit width of data while preserving the output quality of the network.
Pruning forces small-magnitude weights to zero, which lets us skip the corresponding multiplications, improving performance as well as energy efficiency.
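A minimal sketch of magnitude pruning in NumPy, assuming a one-shot global threshold (in practice pruning is usually interleaved with retraining, which this sketch omits):

```python
import numpy as np

def prune(w, sparsity):
    """Magnitude pruning: zero out the `sparsity` fraction of
    weights with the smallest absolute values (0 < sparsity < 1)."""
    k = int(w.size * sparsity)              # number of weights to zero
    order = np.argsort(np.abs(w).ravel())   # indices, smallest first
    mask = np.ones(w.size, dtype=bool)
    mask[order[:k]] = False                 # drop the k smallest magnitudes
    return (w.ravel() * mask).reshape(w.shape)
```

The surviving weights are unchanged; the benefit comes entirely from never issuing the multiplications for the zeroed entries.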
As the last topic, we will cover hardware accelerators. In particular, we will focus on zero-aware accelerators, which exploit zero values for better performance and energy efficiency in neural-network execution.
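The principle behind zero-aware execution can be mimicked in software: store only the nonzero weights of each row with their indices (as compressed sparse formats do) and issue multiply-accumulates only for those. This is a hypothetical illustration of the idea, not a model of any specific accelerator from the lecture.

```python
import numpy as np

def compress_row(row):
    """Keep only a row's nonzero weights and their column indices,
    mimicking the compressed formats zero-aware accelerators use."""
    idx = np.nonzero(row)[0]
    return row[idx], idx

def zero_skip_matvec(w, x):
    """Matrix-vector product that visits only nonzero weights.
    Returns the result and the number of MACs actually issued."""
    y = np.zeros(w.shape[0])
    macs = 0
    for r in range(w.shape[0]):
        vals, idx = compress_row(w[r])
        y[r] = vals @ x[idx]     # MACs only for nonzero weights
        macs += len(vals)
    return y, macs
```

On a heavily pruned matrix the issued-MAC count drops roughly in proportion to the sparsity, which is the performance and energy win the hardware targets.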
(To Be Updated)
(To Be Updated)