Modern computing engines--CPUs, GPUs, and NPUs--require extensive SRAM for cache designs, driven by the increasing demand for higher density, performance, and energy efficiency. This book delves into two primary areas within ultra-scaled technology nodes: (1) advancing SRAM bitcell scaling and (2) exploring innovative subarray designs to enhance power-performance-area (PPA) metrics across technology nodes.
The first part of the book utilizes a bottom-up design-technology co-optimization (DTCO) approach, employing a dedicated PPA simulation framework to evaluate and identify the most promising strategies for SRAM bitcell scaling. It offers a comprehensive examination of SRAM bitcell scaling beyond 1 nm node, outlining a structured research cycle that includes identifying scaling bottlenecks, developing cutting-edge architectures with complementary field-effect transistor (CFET) technology, and addressing challenges such as process integration and routing complexities. Additionally, this book introduces a novel write margin methodology to better address the risks of write failures in resistance-dominated nodes. This methodology accounts for time-dependent parasitic bitline effects and incorporates timing setup of write-assist techniques to prevent underestimating the yield loss.
In the second part, the focus shifts to a top-down DTCO approach due to the diminishing returns of bitcell scaling beyond 5 Å node at the macro level. As technology scales, increasing resistance and capacitance (RC) lead designers to adopt smaller subarray sizes to reduce effective RC and enhance subarray-level PPA. However, this approach can result in increased inter-subarray interconnect overhead, potentially offsetting macro-level improvements. This book examines the effects of various subarray sizes on macro-level PPA and finds that larger subarrays can significantly reduce interconnect overhead and improve the energy-delay-area product (EDAP) of SRAM macro. The introduction of the active interconnect (AIC) concept enables the use of larger subarray sizes, while integrating carbon nanotube FET as back-end-of-line compatible devices results in macro-level EDAP improvements of up to 65% when transitioning from standard subarrays to AIC divided subarrays. These findings highlight the future trajectory of SRAM subarray design in deeply scaled nodes.