SKLP Focuses on PDA Method and Tools and Proposes Two Energy-efficient In-memory-computing Accelerator Architectures: SARA and InfoX

Date: May 26, 2023
As part of its recent focus on PDA methods and tools, the State Key Laboratory of Computer Architecture (SKLP for short) proposed two energy-efficient in-memory-computing (IMC) accelerator architectures: SARA and InfoX. The related research is outlined in three papers entitled “Towards State-Aware Computation in ReRAM Neural Networks,” “Saving Energy of RRAM-based Neural Accelerator through State-Aware Computing,” and “InfoX: An Energy-Efficient ReRAM Accelerator Design with Information-Lossless Low-Bit ADCs,” which were accepted, respectively, by DAC 2020, TCAD, and DAC 2022, all CCF-A tier conferences or journals.

SARA targets efficient ReRAM-based IMC, one of the most promising architectural approaches to energy-efficient neural network inference. Resistive RAM (ReRAM) is a capable device technology for implementing IMC-based neural network accelerators, which are particularly suitable for power-constrained IoT systems. Because ReRAM has low leakage and computes in situ, the dynamic power consumption of dot-product operations in ReRAM crossbars dominates chip power, especially for low-precision neural networks.
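
To make the link between crossbar dot products and dynamic power concrete, the following is a minimal sketch of an idealized crossbar read. It assumes bit lines held at virtual ground and no wire resistance. The function name `crossbar_read` and the conductance and voltage values are illustrative assumptions, not taken from the papers.

```python
def crossbar_read(conductances, voltages):
    """Idealized analog read of a ReRAM crossbar (illustrative model only).

    conductances[i][j]: conductance of the cell at row i, column j (siemens)
    voltages[i]:        read voltage applied to row i (volts)
    Returns (column_currents, total_power).
    """
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    power = 0.0
    for g_row, v in zip(conductances, voltages):
        for j, g in enumerate(g_row):
            currents[j] += v * g   # Ohm's law; currents sum along each column
            power += v * v * g     # power dissipated in this cell
    return currents, power


# Hypothetical 2x2 crossbar: 1e-4 S ~ low-resistance cell, 1e-6 S ~ high-resistance cell
G = [[1e-4, 1e-6],
     [1e-6, 1e-4]]
currents, power = crossbar_read(G, [0.2, 0.0])
```

In this model each column current is a dot product of the input voltages and the stored conductances, and the power of a read grows with the conductance of every cell on a driven row.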

This work investigated the correlation between cell resistance state and crossbar operation power, and proposed a State-Aware ReRAM Accelerator (SARA) architecture for energy-efficient, low-precision neural networks. Relying on the proposed state-aware network training and mapping strategy, crossbars in the ReRAM accelerator can operate in a lower-power state. The architecture can also be leveraged to reduce the power consumption of high-precision network inference with either single-level or multi-level ReRAM. Evaluation showed that for binary neural networks, the design saves 40.53% of ReRAM computing energy on average over the baseline. For high-precision neural networks, the proposed method reduces computing energy by 11.67% on average without any accuracy loss.
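
The idea that cell resistance state affects crossbar power can be illustrated with a toy example for binary weights. This is a sketch under stated assumptions, not the training and mapping strategy from the SARA papers: the conductance constants, `read_power`, and `low_power_mapping` are hypothetical, and the digital correction needed after inverting the bit-to-state mapping is omitted.

```python
G_LRS = 1e-4  # assumed conductance of the low-resistance state (siemens)
G_HRS = 1e-6  # assumed conductance of the high-resistance state (siemens)


def read_power(bits, inputs, g_of_bit, v_read=0.2):
    """Power of one read: V^2 * G summed over all cells on driven rows."""
    power = 0.0
    for row, x in zip(bits, inputs):
        if x:  # only rows with an active input draw read current
            power += sum(v_read ** 2 * g_of_bit[b] for b in row)
    return power


def low_power_mapping(bits):
    """Map the majority bit value to the high-resistance (low-power) state."""
    ones = sum(b for row in bits for b in row)
    total = sum(len(row) for row in bits)
    if ones * 2 > total:
        return {1: G_HRS, 0: G_LRS}  # mostly ones: store 1 as high resistance
    return {1: G_LRS, 0: G_HRS}      # otherwise keep the conventional mapping


bits = [[1, 1, 1, 0],
        [1, 1, 0, 1]]
inputs = [1, 1]
naive = {1: G_LRS, 0: G_HRS}
aware = low_power_mapping(bits)
print(read_power(bits, inputs, naive), read_power(bits, inputs, aware))
```

With six of the eight weights equal to 1, the state-aware mapping places most cells in the high-resistance state, so the same read draws less power in this model.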

The other energy-efficient innovation, InfoX, targets low-power analog-to-digital conversion for energy-efficient IMC architectures. ReRAM-based accelerators have great potential in neural network acceleration via in-memory analog computing, but the high-precision analog-to-digital converters (ADCs) that ReRAM crossbars need for high-accuracy model inference play an essential role in accelerator energy efficiency.

To address this issue, SKLP proposed InfoX, an information-aware ReRAM-based accelerator design, and the accompanying crossbar-wise (XB-wise) ADC precision assignment method. The proposed architecture introduces the concepts of XB output information, distribution range, and information entropy to measure each XB’s ADC requirement. Using an information-lossless ADC design with configurable precision, the ADC output precision for each XB is set according to the expected output distribution for the WBs mapped onto that XB. With XB-wise ADC precision assignment, low ADC energy consumption and high inference accuracy are achieved simultaneously. In experiments, the proposed information-lossless ReRAM accelerator InfoX consumes only 8.97% of the ADC energy of the state-of-the-art (SOTA) baseline, with no accuracy degradation.
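
The two measures named above, distribution range and information entropy, can be sketched for a single crossbar as follows. The source does not say how InfoX combines them, so this example only computes both and, as an assumption, sizes the ADC from the range so that every observed output level stays distinguishable. The function names and the profiled samples are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Shannon entropy (bits) of the observed crossbar output levels."""
    n = len(samples)
    counts = Counter(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def range_bits(samples):
    """Bits needed to encode every integer level between min and max."""
    levels = max(samples) - min(samples) + 1
    return math.ceil(math.log2(levels)) if levels > 1 else 0


def assign_adc_bits(samples, max_bits=8):
    """Assumed policy: cover the observed range, capped at the ADC maximum."""
    return min(max_bits, range_bits(samples))


# Hypothetical profiled column sums for one crossbar
samples = [3, 3, 4, 4, 5, 5, 6, 6]
print(entropy_bits(samples), range_bits(samples), assign_adc_bits(samples))
```

For these samples the outputs span four levels, so two bits suffice after subtracting the minimum as an offset, whereas an ADC sized for the full theoretical output range of the crossbar would need more. The entropy gives the average information per output, which never exceeds the base-2 logarithm of the number of distinct levels.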