Crossbar-constrained technology mapping for ReRAM based in-memory computing
Main Authors: | , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | 2021 |
Online Access: | https://hdl.handle.net/10356/154461 |
Institution: | Nanyang Technological University |
Summary: | In-memory computing has gained significant attention due to its potential for dramatic improvements in speed and energy. Redox-based resistive RAMs (ReRAMs), capable of non-volatile storage and logic operations simultaneously, have been used for logic-in-memory computing approaches. To this effect, we propose the ReRAM-based VLIW Architecture for in-Memory comPuting (ReVAMP), supported by a detailed device-accurate simulation setup with peripheral circuitry. We present theoretical bounds on the minimum area required for in-memory computation of arbitrary Boolean functions specified using structural representations (And-Inverter Graph and Majority-Inverter Graph) and a two-level representation (Exclusive-Sum-of-Product). To support the ReVAMP architecture, we present two technology mapping flows that fully exploit the bit-level parallelism offered by executing logic on a ReRAM crossbar array. The area-constrained mapping (ArC) generates feasible mappings for a variety of crossbar dimensions, while the delay-constrained mapping (DeC) focuses primarily on minimizing the latency of the mapping. We evaluate the proposed mappings against two state-of-the-art in-memory computing architectures, PLiM and MAGIC, along with their automation flows (SIMPLE and COMPACT). ArC and DeC outperform the state-of-the-art PLiM architecture in latency by 1.46× and 4.3× on average, respectively. ArC offers significantly lower area (on average 25.27× and 6.57×), while improving the area-delay product by 1.37× and 1.12×, against the two mapping approaches for MAGIC, respectively. In contrast, DeC achieves average area (1.45× and 3.06×) and area-delay product (1.12× and 6.36×) improvements over the two mapping approaches for the MAGIC architecture, respectively. The proposed mapping techniques allow a variety of runtime efficiency trade-offs. |
---|