ARYABHAT-1

(Analog Reconfigurable Technology and Bias-scalable Hardware for AI Tasks)

Team Members

Project Video

ARYABHAT 1 Chip 720p.mp4

Project Background

ARYABHAT-1 is a next-generation analog computing chipset designed to target Artificial-Intelligence (AI) and Machine-Learning (ML) applications at the edge. Presently, such computations are handled by application-specific digital accelerators that use spatial arrays of parallel processing elements to significantly improve performance and energy efficiency over general-purpose platforms. This work focuses on building a first-of-its-kind, technology-scalable, reconfigurable analog processor that can be scaled down to advanced nanometer-scale process nodes.

This post summarizes the contributions and developments to date, along with the upcoming work in the pipeline.

Project Motivation

We started this research back in 2019, intrigued by how powerful and energy-efficient the human brain is. It has roughly 86 billion processing units (neurons) and consumes only about 25 watts of power. Even the most powerful supercomputer in the world falls short of matching the raw computational power and energy efficiency of the human brain. Thus, there is still an orders-of-magnitude gap between what we have built today and what nature offers us. While replicating the human brain was not the ideal path forward (at least not in the near future), it was pretty clear that existing digital hardware could be augmented with analog computing to move in a similar direction.


Furthermore, we wanted to take a route distinct from where the industry was focusing, because it will soon become infeasible to keep increasing power and area to squeeze more performance per watt out of digital deep-learning accelerators.


Current AI Challenges & Industry Approach

It was important to understand the current challenges in designing an analog AI accelerator, along with the alternative solutions the industry is pursuing to tackle them. Below I highlight a few of the most important ones.

Today, with Moore's law reaching its end and Dennard scaling having already hit the wall, the digital accelerators (GPUs, TPUs, and IPUs) that the industry is currently pursuing are not enough to execute demanding workloads efficiently. The overall effect of a saturating Moore's law and an already-saturated Dennard scaling is that we can no longer pack more computation into a given square millimeter of silicon in the hope of squeezing out more performance per watt. For instance, if I could do a million operations in a 1 mm² area in 180 nm technology, I could perhaps do 20 million operations in the same 1 mm² area in 7 nm FinFET, and with improved power efficiency too. But this trend has now stopped.

To tackle the scaling problem described above, the industry has moved to low-precision calculations, such as 16-bit or even 8-bit fixed point, in the hope of squeezing out even more performance per watt. However, it will soon become infeasible to pack more operations onto a chip or to reduce bit precision further; there is an inherent limit to implementing machine learning on digital electronics. On the one hand, we cannot reduce the bit precision further because computational accuracy starts dropping; on the other hand, we cannot scale the technology down further because of physical limitations. So we are, in a sense, stuck.
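The precision trade-off above can be sketched with a toy example. This is a hypothetical symmetric fixed-point scheme written for illustration only (the function names and values are mine, not the accelerator's arithmetic):

```python
# Hypothetical sketch: symmetric fixed-point quantization of a dot product,
# illustrating the accuracy-vs-precision trade-off described above.
# All names and values here are illustrative, not from any real accelerator.

def quantize(xs, bits=8):
    """Map floats onto signed fixed-point integers with a shared scale."""
    qmax = 2 ** (bits - 1) - 1              # e.g. 127 for 8-bit
    scale = max(abs(x) for x in xs) / qmax or 1.0
    return [round(x / scale) for x in xs], scale

def quantized_dot(a, b, bits=8):
    """Dot product computed with integer multiply-accumulates."""
    qa, sa = quantize(a, bits)
    qb, sb = quantize(b, bits)
    # Integer MAC loop, rescaled back to a float only at the end.
    return sum(x * y for x, y in zip(qa, qb)) * sa * sb

a = [0.12, -0.53, 0.91, 0.07]
b = [0.44, 0.25, -0.66, 0.81]
exact = sum(x * y for x, y in zip(a, b))
for bits in (16, 8, 4):
    err = abs(quantized_dot(a, b, bits) - exact)
    print(f"{bits:2d}-bit MAC error: {err:.6f}")
```

Shrinking the word length shrinks the integer grid that values are rounded onto, so the accumulated result drifts further from the floating-point answer. That widening error is exactly the accuracy cliff that prevents reducing precision indefinitely.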

Machine Learning (ML) models are growing exponentially in size. In 2015, ResNet, an artificial neural network that surpassed human-level accuracy on image classification, had over 100 layers of neurons and performed millions of operations. Since then, model sizes have increased drastically, and this is a problem for two reasons: computations must now happen in the billions, which we cannot sustain because of the fundamental physical limits above, and energy consumption grows along with them.

Analog designs are unmatched by their digital counterparts in power efficiency and area, but porting a design from one node to a more advanced one is challenging and generally requires an architectural redesign. For example, a design that works at the 65 nm node, where most industry analog designs sit, will not work in FinFET.

Analog designs are generally non-modular. In analog, there is no concept of fundamental building blocks like the standard cells of digital design, which can be reused recursively irrespective of the implementation technology node. Therefore, each ML architecture implemented in analog requires a complete rework, consuming massive time and manpower.

Analog designs are stable only within their designed bias conditions. They tend to lose performance and functionality if operated beyond their specified operating regimes, which makes it difficult to tune an architecture to the needs of an application.

Analog designs lack the reconfigurability of their digital counterparts. This is also why we always hear about FPGAs (Field-Programmable Gate Arrays) but rarely about FPAAs (Field-Programmable Analog Arrays).

In any analog ML design, it is generally difficult to control the computational (bit) precision at the hardware level, which limits how much area and power-performance can be traded off to match the needs of the application.
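For context, the precision an analog signal path can support is ultimately set by its signal-to-noise ratio. The snippet below uses the standard ADC effective-number-of-bits relation, ENOB = (SNR_dB − 1.76) / 6.02; it is a generic rule of thumb, not a figure taken from the ARYABHAT-1 design:

```python
# Standard ADC/analog rule of thumb: ENOB = (SNR_dB - 1.76) / 6.02.
# Illustrative only -- not a measured property of ARYABHAT-1.

def enob(snr_db):
    """Effective number of bits supported by a given SNR (in dB)."""
    return (snr_db - 1.76) / 6.02

# Every ~6 dB of additional SNR buys roughly one more bit of precision,
# which is why analog precision cannot simply be dialed up at will.
for snr in (25, 50, 75):
    print(f"{snr} dB SNR -> ~{enob(snr):.1f} effective bits")
```

Because each extra bit demands about 6 dB more SNR (i.e., quadratically lower noise power), precision in analog is paid for in bias current and device area rather than in word length, which is why it is hard to expose as a simple hardware knob.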

Proposed Solution: Analog, "A Road Less Travelled"

ARYABHAT-1 Design Stages & Timelines

(Aug-2019 to Aug-2020)

(Jan-2020 to Dec-2020)

(Aug-2020 to Aug-2021)

(Dec-2020 to Sep-2021)

(Jan-2021 to Aug-2021)

(Mar-2021 to Aug-2021)

(May-2021 to Dec-2021)

(Jan-2022 to Dec-2022)

ARYABHAT Snapshots

ARYABHAT 1

A snapshot of the first-generation analog AI accelerator chip, called ARYABHAT

ARYABHAT 1

A snapshot of ARYABHAT with its embedded testbed

Test-In Progress

Test-In Progress

ARYABHAT in the News

ARYABHAT-1 Architecture

Related Publications

Patents