Research

Hardware Accelerators Based on In-Memory Computing

“In-Memory Computing” (IMC) is a non-von Neumann architectural paradigm where data processing is performed within the memory boundaries. IMC could greatly reduce both energy consumption and performance overheads associated with data transfer between memory and processing units. IMC architectures can either be specific to a computing task or serve a broad range of applications. For the former, memory elements such as crossbars and ternary content-addressable memories (TCAMs) can be used as computing kernels to efficiently perform operations such as matrix/vector multiplications, lookups, and nearest-neighbor searches. For the latter, general-purpose computing-in-memory (GPCiM) circuits are employed to perform bitwise logic and arithmetic operations at the memory periphery. GPCiM circuits can reduce performance and energy overheads associated with data transfers in a wide range of applications.
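To make the TCAM operations mentioned above concrete, here is a minimal software sketch of a ternary lookup and a nearest-neighbor search. It only models the logical behavior. A hardware TCAM compares all stored entries in parallel, whereas this sketch loops over them. The function names (`tcam_match`, `tcam_lookup`, `mismatch_count`, `nearest_neighbor`), the string encoding of entries, and the use of `X` as the "don't care" symbol are illustrative assumptions, not taken from any of the designs described here.

```python
from typing import List


def tcam_match(entry: str, key: str) -> bool:
    """Ternary match: a stored 'X' (don't care) matches either key bit."""
    return len(entry) == len(key) and all(
        e in ("X", k) for e, k in zip(entry, key)
    )


def tcam_lookup(table: List[str], key: str) -> List[int]:
    """Return the indices of all stored entries that match the key."""
    return [i for i, entry in enumerate(table) if tcam_match(entry, key)]


def mismatch_count(entry: str, key: str) -> int:
    """Count mismatching positions, ignoring don't-care cells."""
    return sum(1 for e, k in zip(entry, key) if e != "X" and e != k)


def nearest_neighbor(table: List[str], key: str) -> int:
    """Return the index of the entry with the fewest mismatches.

    Ties are resolved in favor of the lowest index.
    """
    return min(range(len(table)), key=lambda i: mismatch_count(table[i], key))


# Hypothetical 4-bit table used only for illustration.
table = ["1010", "10XX", "0X01", "1111"]
exact_hits = tcam_lookup(table, "1011")    # only "10XX" matches
closest = nearest_neighbor(table, "0101")  # "0X01" has zero mismatches
```

In this toy model the mismatch count plays the role of a Hamming distance restricted to the cells that are not "don't care".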

My research exploits the high density and virtually zero leakage power of ferroelectric field-effect transistors (FeFETs) to design both application-specific and GPCiM architectures in support of data-centric applications. Working with collaborators, I helped design dense, 2-FeFET TCAMs that can (i) enable data search within dense memory structures, and (ii) support key data analytics operations such as lookups and nearest-neighbor searches. Regarding GPCiM designs, I proposed a FeFET-based GPCiM architecture that achieves speedups and energy reductions for memory accesses when compared to a “not in-memory” CMOS approach across 12 benchmark tasks (most of them from the machine learning application domain). For this work, I received the Best Paper Award at the 2018 International Symposium on Low Power Electronics and Design (ISLPED), a premier forum for the presentation of innovative research in power-efficient electronics.

Representative papers:

[1] D. Reis, M. Niemier, and X. Sharon Hu. ISLPED, 2018. [link]

[2] D. Reis, A. Laguna, M. Niemier, and X. Sharon Hu. DATE, 2020. [link]

[3] X. Yin, K. Ni, D. Reis, S. Datta, M. Niemier, X. Sharon Hu. IEEE TCAS-II, 2019. [link]

Benchmarking of Architectures based on Emerging Technologies

Emerging technologies (e.g., Spin-Transfer Torque Magnetic Random-Access Memory (STT-MRAM), Resistive Random-Access Memory (RRAM), Phase-Change Memory (PCM), and FeFETs) may represent a pathway for continuing high-density integration over the next decades. Numerous research efforts have been carried out to (i) refine materials and fabrication processes, and (ii) exploit the unique characteristics of emerging devices to design hardware accelerators that could ultimately improve the performance and energy efficiency of architectures. Given the variety of designs, picking the option that provides the best performance and energy savings at the system level can be a complex task. For this reason, benchmarking methodologies have become critical assessment tools for researchers to understand the benefits and tradeoffs associated with different technologies and architectures.

Together with collaborators at Arizona State University and Zhejiang University, I developed a uniform benchmarking framework for evaluating IMC accelerators based on CMOS and emerging technologies. Notably, our work proposes Eva-CiM as part of this framework. Eva-CiM models the performance of different IMC designs from the device level to the system level, which enables researchers to assess whether a program is CiM-favorable (i.e., can benefit from an IMC architecture) and to weigh the pros and cons of increased memory size for GPCiM. Eva-CiM also facilitates the exploration of different technologies for IMC accelerators.

Representative papers:

[1] D. Gao, D. Reis, X. Sharon Hu, and C. Zhuo. IEEE TCAD, 2019. [link]

[2] S. Angizi, Z. He, D. Reis, X. Sharon Hu, W. Tsai, S. J. Lin, and D. Fan. ISVLSI, 2019. [link]

[3] D. Reis, D. Gao, S. Angizi, X. Yin, D. Fan, M. Niemier, C. Zhuo, and X. Sharon Hu. GLSVLSI, 2020. [link]

Design of Hardware Security Primitives and Accelerators for Cybersecurity

Alongside speed and energy consumption, the security and privacy of computer systems have become critical research topics. On one hand, hardware security primitives based on CMOS and emerging technologies help ensure that intellectual property (IP) is protected as more specialized computing kernels emerge to meet the performance demands of a given application. On the other hand, the rise of “Big Data” and computing in the cloud has led to growing concerns regarding the security and privacy of clients’ information that is stored or processed on third-party servers. My research on secure computing seeks to (i) protect intellectual property through the design of hardware security primitives based on emerging technologies, and (ii) enable secure and private computation in the cloud with specialized IMC-based accelerators for homomorphic encryption (HE) as well as block ciphers such as the Advanced Encryption Standard (AES).

Regarding (i), my collaborators from Prof. Joerg Appenzeller’s group at Purdue and I have, for the first time, experimentally demonstrated NAND/NOR polymorphic gates and simulated XOR/XNOR polymorphic gates based on high-performance 2D black phosphorus field-effect transistors (BP-FETs) with reconfigurable polarities, low-voltage operation (up to 0.2 V), and high tolerance against power supply variations. Concerning (ii), I designed an IMC-based accelerator for HE based on the B/FV scheme that can obtain a speedup of 784.0X for homomorphic multiplications. I also proposed a programmable IMC-based accelerator for AES encryption/decryption (IMCRYPTO). The IMCRYPTO design uses a hybrid CMOS-based RAM/CAM memory architecture that improves the throughput per area of AES-128 encryption when compared to previous hardware accelerators.
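As background for the homomorphic multiplications mentioned above, lattice-based HE schemes operate on polynomials in a ring of the form Z_q[x]/(x^n + 1), so ring multiplication is one of their basic building blocks. The sketch below shows a schoolbook version of that single operation only. It is not the accelerator's algorithm and not a full homomorphic multiplication, which involves further steps such as scaling and relinearization. The function name `negacyclic_mul` and the toy parameters are assumptions chosen for illustration.

```python
from typing import List


def negacyclic_mul(a: List[int], b: List[int], q: int) -> List[int]:
    """Multiply two polynomials in Z_q[x]/(x^n + 1), schoolbook style.

    Coefficients are listed from the constant term upward. Because
    x^n = -1 in this ring, any product term of degree >= n wraps
    around to degree (k - n) with its sign flipped.
    """
    n = len(a)
    if len(b) != n:
        raise ValueError("both polynomials must have n coefficients")
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % q
            else:
                out[k - n] = (out[k - n] - ai * bj) % q
    return out


# Toy parameters (n = 2, q = 17), far smaller than anything practical.
square = negacyclic_mul([1, 1], [1, 1], 17)  # (1 + x)^2 = 2x  since x^2 = -1
wrap = negacyclic_mul([0, 1], [0, 1], 17)    # x * x = -1 = 16 (mod 17)
```

The double loop costs O(n^2) coefficient operations. Practical implementations typically replace it with faster transforms, which this sketch does not attempt.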

Representative papers:

[7] P. Wu, D. Reis, X. S. Hu, J. Appenzeller. Nature Electronics, 2021. [link]

[10] J. Takeshita, D. Reis, T. Gong, M. Niemier, X. Sharon Hu, and T. Jung. SAC, 2021. [link]

[11] D. Reis, H. Geng, M. Niemier, and X. Sharon Hu. IEEE TVLSI, 2022. [link]