Review

Design of High-Speed, Low-Power Sensing Circuits for Nano-Scale Embedded Memory

by Sangheon Lee 1, Gwanwoo Park 1 and Hanwool Jeong 1,2,*
1 Department of Electronic Engineering, Kwangwoon University, Seoul 01897, Republic of Korea
2 Articron Inc., Ansan-si 15588, Republic of Korea
* Author to whom correspondence should be addressed.
Sensors 2024, 24(1), 16; https://doi.org/10.3390/s24010016
Submission received: 8 November 2023 / Revised: 14 December 2023 / Accepted: 17 December 2023 / Published: 19 December 2023
(This article belongs to the Special Issue Electronics for Sensors, Volume 3)

Abstract

This paper comparatively reviews sensing circuit designs for the most widely used embedded memory, static random-access memory (SRAM). Many sensing circuits for SRAM have been proposed to improve power efficiency and speed, because sensing operations in SRAM largely determine the overall speed and power consumption of the system-on-chip. This effect is more pronounced in the nanoscale era, where SRAM bit-cells implemented with near minimum-sized transistors are strongly affected by variation. Under this condition, stable sensing requires the control signal that accesses the selected bit-cell (word-line, WL) to be asserted for a long time, which increases both power dissipation and delay. Sensing circuits that reduce the required WL pulse width can therefore improve sensing power and speed simultaneously. Throughout this paper, the strengths and weaknesses of many SRAM sensing circuits are discussed in terms of speed, area, power, and other aspects.

1. Introduction

System-on-chip design encounters considerable challenges related to power consumption and latency, and static random-access memory (SRAM) is a major contributor to both [1,2,3,4]. Thus, efficiently managing SRAM power consumption and enhancing SRAM access speed become highly important. Although reducing the supply voltage (VDD) is effective in reducing power consumption, it introduces performance and stability trade-offs. In particular, the SRAM bit-cell, the circuit component that stores binary data, is typically constructed with near minimum-sized transistors to achieve high-density integration, resulting in significant performance variability due to process variation [5,6,7,8]. Furthermore, to address read stability issues, read assist circuits are employed to suppress the word-line voltage, which can further degrade performance. Consequently, optimizing SRAM circuits to minimize both power consumption and delay becomes crucial.
By analyzing the read operation, we can identify a method to simultaneously reduce power consumption and delay in SRAM. During the read operation, the bit-cell generates a voltage difference across the bit-line pair. Then, a sensing circuit measures this voltage difference and subsequently delivers the results to the external system. Importantly, the bit-line pair, which plays a fundamental role, has a significant capacitance, enough to make it the dominant contributor to both delay and power consumption during the read operation. Consequently, when a substantial voltage swing in the bit-line is necessitated for the read operation, it inevitably results in increased delays and power consumption. Thus, reducing the bit-line swing during the read operation can effectively decrease the power consumption and delay at the same time [9,10,11].
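To make this scaling concrete, the following minimal sketch uses a first-order model in which the bit-line is a lumped capacitance discharged by a constant bit-cell current. The function name, the parameters `c_bl` and `i_cell`, and all numeric values are illustrative assumptions, not figures from the reviewed designs.

```python
def bitline_read_cost(c_bl, i_cell, vdd, delta_vbl):
    """First-order delay and energy for developing a bit-line swing.

    Assumes a lumped bit-line capacitance c_bl (F) discharged by a constant
    bit-cell current i_cell (A); both are illustrative simplifications.
    """
    # Time for the cell current to discharge the bit-line by delta_vbl.
    delay = c_bl * delta_vbl / i_cell
    # Energy drawn from VDD to restore the discharged bit-line at pre-charge.
    energy = c_bl * vdd * delta_vbl
    return delay, energy


# Hypothetical values: 50 fF bit-line, 20 uA cell current, 0.8 V supply.
for swing in (0.20, 0.10, 0.05):
    delay, energy = bitline_read_cost(50e-15, 20e-6, 0.8, swing)
    print(f"dVBL = {swing * 1e3:.0f} mV: delay = {delay * 1e12:.0f} ps, "
          f"energy = {energy * 1e15:.1f} fJ")
```

In this model both the delay and the pre-charge energy are proportional to ΔVBL, which is why halving the required swing improves both at once.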
However, it is highly challenging to reduce bit-line voltage swing. This is because sensing circuits, especially the sense amplifier (SA) responsible for detecting bit-line swing, necessitate a sufficiently large bit-line voltage difference (ΔVBL) for precise operation. This need arises due to transistor mismatch within the SA, causing asymmetry in its characteristics. The minimum input voltage difference (in this case, ΔVBL) required for stable SA operation is known as the SA offset voltage (VOS). To reduce the ΔVBL, it becomes essential to lower the VOS.
Additionally, the SA is crucially utilized not only in SRAM but also in novel components, improving the efficiency of data processing [12,13,14,15,16,17,18,19,20,21]. SAs are used as row ADCs in [12,13,14], binary activation functions in [15,16,17], multilevel sense amplifiers in [18], four-bit flash ADCs in [19], and sensing circuits in [20,21]. Therefore, research on low VOS for high accuracy, low power consumption, fast speed, and high integration for efficient performance is crucial for SAs.
Consequently, numerous prior works have proposed ways to reduce the VOS, the most important performance metric of SAs. The simplest method is to use wider transistors in the SA, which reduces the mismatch between paired transistors. However, this approach incurs area and power overhead. To reduce the VOS while minimizing the area and power overhead, various offset-reducing circuit techniques have been proposed [22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47]. This paper aims to conduct a comparative analysis of these circuits, explaining their effectiveness in reducing the VOS and achieving power and performance benefits.
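The cost of the sizing approach can be illustrated with the widely used Pelgrom mismatch model, in which the standard deviation of the threshold-voltage difference of a matched pair scales as A_VT / sqrt(W·L). The sketch below is only an illustration: the matching coefficient and the device dimensions are hypothetical values, not data from the cited works.

```python
import math


def sigma_delta_vth(a_vt, width, length):
    """Pelgrom-model standard deviation of the Vth difference of a matched pair.

    a_vt is the technology matching coefficient (V*m); width and length are in m.
    """
    return a_vt / math.sqrt(width * length)


# Hypothetical coefficient of 2 mV*um, expressed in V*m.
A_VT = 2e-3 * 1e-6
base = sigma_delta_vth(A_VT, 100e-9, 30e-9)
wide = sigma_delta_vth(A_VT, 400e-9, 30e-9)  # 4x the width, hence 4x the gate area
print(f"sigma(dVth) ratio after 4x width: {wide / base:.2f}")
```

Under this model, halving the mismatch requires roughly four times the gate area, which is the overhead that the offset-reducing circuit techniques try to avoid.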
The rest of this paper is organized as follows: Section 2 provides essential background information on SRAM read operations and conventional SRAM sensing circuits, including an examination of their limitations. This foundation is crucial for understanding the subsequent content. Section 3 delves into comprehensive introductions of various previously researched SRAM sensing circuits designed to reduce the VOS, ultimately enhancing speed and power efficiency. Section 4 details a comparative analysis and discussion of the SRAM sensing circuits introduced in Section 3 from various perspectives.

2. Backgrounds on SRAM Read Operation and Conventional Sensing Circuits

Figure 1 presents the simplified circuits in the conventional SRAM for the read operation. In the following, we provide brief explanations for the structure and operation of each circuit shown in Figure 1.
At the top of Figure 1, the bit-cell is composed of six transistors. In this 6T bit-cell, two cross-coupled inverters are formed of M1, M2, M3, and M4 for storing and latching the binary data at two storage nodes, QT and QC. The two access transistors, M5 and M6, serve as control elements that regulate connections between the bit-line pair (BLT and BLC) and storage nodes (QT and QC). When the WL activates (i.e., WL = 1), access transistors are turned on to connect bit-lines to storage nodes.
Next, the bit-line pre-charge circuit is shown, which is formed of MPCT, MPCC, and MEQ. These transistors are controlled by the low-enable pre-charge trigger signal, PCB, with their gates connected. When PCB = 0, MPCT and MPCC are turned on to pre-charge BLT and BLC up to VDD, while MEQ ensures that BLT and BLC are pre-charged to equal voltages.
The column multiplexer (MUX) implemented with MC1, MC2, …, MC8 selects one bit-line pair from multiple pairs (four pairs in Figure 1) and connects it to the SA input pair SLT and SLC. The specific bit-line pair to be connected is determined by the column address signal, COLB[0:3], with only one of these signals set to low.
The SA plays a key role in the SRAM read operation. It amplifies the voltage difference between SLT and SLC, converting it into a full-logic swing voltage. This amplified signal is then made available at the SA’s differential outputs, SOT and SOC. Two commonly used conventional SA structures are the voltage-type latch SA (VLSA) and the current-type latch SA (CLSA), which are shown in Figure 2a,b, respectively [48]. Unlike VLSAs, CLSAs receive the SA input voltages, SLC and SLT, at the gates of the access transistors, MS1 and MS2. Therefore, the SA inputs see a high impedance, and the CLSA is less sensitive to timing mismatch. However, CLSAs require additional transistors for the sensing operation and therefore have lower speed, higher energy consumption, and a larger area than VLSAs. The SA enable signal (SAE), connected to MS5–MS7 of the VLSA and MS7–MS9 of the CLSA, triggers the amplifying operation of the SA.
Figure 3 provides operational waveforms of relevant signals during the conventional SRAM read operation, divided into three phases: the pre-charge phase, the access phase, and the evaluation phase. In the pre-charge phase, the PCB becomes low, which pre-charges the bit-lines (BLT and BLC) and SA inputs (SLT and SLC) to VDD through the bit-line pre-charge circuit and the SA input pre-charge circuit. Then, the access phase starts by making PCB = 1 to turn off the pre-charge circuits, while the WL for the selected bit-cell is asserted to reflect the data at QT and QC onto the bit-line pair of BLT and BLC. Figure 3 shows an example of bit-cell storing datum “1” (QT = 1 and QC = 0). In this example, BLT remains high while BLC falls due to the bit-cell current through M6, creating a voltage difference between BLT and BLC. By lowering the COLB[i] in the selected column, the column MUX transistors transfer only the selected bit-line pair voltage to the SA inputs, SLT and SLC.
During the subsequent evaluation phase, the SA enable signal (SAE) becomes high to trigger the positive feedback configuration in the SA. In this manner, a small voltage difference between SLT and SLC, ΔVIN,SA (see Figure 3), is amplified into the digital voltage difference at SA output nodes SOT and SOC. For example, the sensing operation of a VLSA in Figure 2a is shown in Figure 4.
When the sensing datum is “1”, the SLT remains at VDD while the SLC decreases due to the bit-cell, reaching VDD − ΔVIN,SA, as shown on the left side of Figure 4. The voltages at the SA outputs, SOT and SOC, are equal to those at SLT and SLC, respectively, through the pass transistors MS5 and MS6. During the subsequent evaluation phase, the SAE rises, and current flows through paired nFETs.
The currents through the paired nFETs in the SA, MS1 and MS2, are depicted as IS1 and IS2 in the middle of Figure 4. At the beginning of the evaluation phase, the VGS of MS2 (SOT = VDD) is greater than that of MS1 (SOC = VDD − ΔVIN,SA). Consequently, IS2 > IS1, which makes SOC fall faster than SOT. This triggers the positive feedback formed by MS1–MS2–MS3–MS4. As a result, SOT and SOC eventually reach VDD and 0 V, respectively, as shown on the right side of Figure 4, indicating a successful “1” datum sensing process.
However, it is not always guaranteed that the SA operation is stably performed. In Figure 5, there is a scenario where sensing failure occurs. The access phase is the same as the previous normal sensing operation (the left side of Figure 5). However, when the evaluation starts by triggering the SA, as shown in the middle of Figure 5, problems can arise. It should be noted that, although the VGS of MS2 (SOT = VDD) is greater than the VGS of MS1 (SOC = VDD − ΔVIN,SA), IS2 < IS1. This can occur because there is a mismatch between the MS1–MS2 pair, specifically since the Vth of MS1 is lower than the Vth of MS2 [22]. Consequently, the SOT (initially VDD) falls more quickly than the SOC (initially VDD − ΔVIN,SA). Therefore, SOT and SOC end up with 0 V and VDD, respectively, meaning that sensing fails in attempting to sense datum “1”.
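The successful and failing cases can be reproduced with a simple behavioral sketch. It assumes a square-law saturation current with a grounded source for MS1 and MS2 at the start of evaluation, and it looks only at the initial current comparison; the function names and all voltages are hypothetical values chosen for illustration.

```python
def pull_down_current(v_gate, v_th, k=1.0):
    """Square-law saturation current of an nFET with grounded source (assumed model)."""
    v_ov = max(v_gate - v_th, 0.0)
    return k * v_ov ** 2


def vlsa_senses_one(vdd, dv_in, vth_ms1, vth_ms2):
    """Return True if datum "1" is resolved correctly at the start of evaluation.

    For datum "1", SOT = VDD drives the gate of MS2 and SOC = VDD - dv_in
    drives the gate of MS1. Sensing succeeds when IS2 > IS1, so that SOC
    falls faster than SOT and the latch regenerates to SOT = VDD, SOC = 0 V.
    """
    is1 = pull_down_current(vdd - dv_in, vth_ms1)
    is2 = pull_down_current(vdd, vth_ms2)
    return is2 > is1


# Matched pair: the input difference decides the outcome.
print(vlsa_senses_one(vdd=0.8, dv_in=0.05, vth_ms1=0.30, vth_ms2=0.30))
# Vth of MS1 lower than Vth of MS2 by more than dv_in: IS1 wins and sensing fails.
print(vlsa_senses_one(vdd=0.8, dv_in=0.05, vth_ms1=0.25, vth_ms2=0.33))
```

In this simplified model the success condition reduces to ΔVIN,SA > Vth,MS2 − Vth,MS1, so the input difference has to exceed the threshold mismatch of the pair.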
Here, the key point is that the mismatch between the paired transistors is responsible for the sensing failure. To prevent this failure, ΔVIN,SA should be large enough to compensate for the effects of the transistor mismatch. The minimum ΔVIN,SA required for stable sensing is the offset voltage of the SA, referred to as VOS; stable sensing therefore requires ΔVIN,SA > VOS. This VOS problem becomes more severe in low-VDD regions and is significantly affected by temperature [49,50]. To meet this condition, the WL pulse width is extended to achieve a sufficiently large ΔVBL, which, in turn, results in a large ΔVIN,SA. However, this increased ΔVBL requirement not only increases delay but also raises power consumption, since more power is needed to pre-charge the large capacitance of the BL pair, which stems from the combined effects of the long BL wire capacitance and the parasitic capacitance of the bit-cells.
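The link between offset statistics and the required input difference can be sketched as follows. The sketch assumes that the SA offset is a zero-mean Gaussian random variable and counts a failure whenever the offset exceeds ΔVIN,SA; the 15 mV standard deviation and the n-sigma design target are hypothetical choices, not values reported in the reviewed papers.

```python
from statistics import NormalDist


def sensing_failure_probability(dv_in, sigma_os):
    """Probability that a zero-mean Gaussian SA offset exceeds the input difference."""
    return 1.0 - NormalDist(mu=0.0, sigma=sigma_os).cdf(dv_in)


def required_input_difference(sigma_os, n_sigma):
    """Input difference needed to cover n_sigma standard deviations of offset."""
    return n_sigma * sigma_os


SIGMA_OS = 0.015  # hypothetical 15 mV offset standard deviation
for dv_in in (0.030, 0.060, 0.090):
    p_fail = sensing_failure_probability(dv_in, SIGMA_OS)
    print(f"dVIN,SA = {dv_in * 1e3:.0f} mV -> P(fail) = {p_fail:.2e}")
```

Under these assumptions, any technique that lowers the offset spread lowers the required ΔVIN,SA in direct proportion, and with it the WL pulse width needed to develop that difference.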
Although employing large-sized transistors in sensing schemes can mitigate the mismatch problem, it incurs power, speed, and area overhead in the sensing stage [18]. In addition, various replica bit-line delay and self-timed SAE generation techniques have been proposed to minimize the WL pulse width [51,52,53,54,55,56,57,58], but their effect is limited because they cannot track local variations. The speed and power issue due to the ΔVBL requirement in SRAM becomes more severe in today’s advanced nanoscale technology nodes, because WL-suppressing assist circuits are widely used, which necessitates longer WL pulses to meet the ΔVBL requirement [59,60,61,62].
Therefore, it would be highly beneficial to reduce the VOS, as it would alleviate the demand for a large ΔVBL. In the following section, we describe SRAM sensing circuits designed to reduce the VOS for the purpose of improving speed and power efficiency. We will explore these circuits in terms of their structure, operation, and key performance characteristics.

3. SRAM Sensing Circuits for Offset Reduction

3.1. Schmitt Trigger Sense Amplifiers

Schmitt triggers are often used to improve the robustness of a standard inverter by modifying the switching threshold. Utilizing this feature, the authors in [24,25,26] proposed the Schmitt trigger-based SA (STSA) to reduce VOS, where one example structure is shown in Figure 6a. This structure intends to weaken the pull-down network of the inverter holding high voltages relative to that of the low-voltage inverter.
For example, when SLT is VDD while SLC is VDD − ΔVIN,SA for datum “1” sensing, SOT and SOC become VDD and VDD − ΔVIN,SA, respectively, at the end of the access phase. When the evaluation phase starts with SAE rising, MS5 is turned on more strongly than MS6 because SOT > SOC. Thus, the ZT node (the source of MS3) is pulled up more strongly than ZC (the source of MS4). Because the STSA thereby controls not only the gate voltages but also the source voltages of MS3 and MS4 according to SOT and SOC, the VGS of MS3 is greatly suppressed. That is, the VGS difference between the paired nFETs (MS3–MS4) in the STSA is larger than that between MS1–MS2 in the VLSA, which makes the STSA more tolerant of mismatch. In this manner, the STSA attempts to provide a reduced VOS compared to the VLSA.
However, the STSA has a limited ability to reduce the VOS, because it contains additional transistor pairs that introduce their own mismatch. In particular, the mismatch between MS5 and MS6 and the mismatch between MS1 and MS2, neither of which is present in the VLSA, increase the asymmetry in the SA and thus the VOS. Nevertheless, the circuit technique implemented in the STSA by MS1, MS2, MS5, and MS6 mitigates mismatch effects strongly enough to outweigh the increase caused by the additional transistor pairs. As a result, the final VOS is reduced compared to the VLSA. The sensing delay, however, is increased compared to the VLSA due to the stacked nFET structure [26].
To mitigate the speed problem of STSAs, the voltage-boosted STSAs (VBSTSAs) are proposed [27], as shown in Figure 6b. In VBSTSAs, the negative voltage generator (NVG) used for the negative bit-line write-assist circuit is reutilized to accelerate the operation of STSAs. In the NVG, as the NVG operation starts, the BSTEN increases and the BSTENb decreases. Through the decreased BSTENb, MS13, which was holding OUT to VSS, is turned off, allowing OUT to reach a floating state. Subsequently, after MS13 is completely turned off, BSTENd, delayed through inverters, decreases and OUT is lowered to a negative voltage through a coupling capacitor, C. Note that BSTENd should decrease after the MS13 is fully turned off. Therefore, sufficient delay should be provided by the inverter in the NVG. Specifically, the ground voltage for the SA is pulled down to the negative voltage at the rising edge of the SAE, or 0 V otherwise. This is realized by making the switch, which is turned on only when the SAE is high, delivering the negative voltage generated by the NVG. Although sensing speed can be enhanced in this manner, it incurs a significant amount of power overhead. In addition, NVGs are not always used for write-assist circuits; other types of write-assist circuit, such as cell voltage collapse write assist, do not use NVGs.

3.2. Hybrid Latch-Type Sense Amplifiers

Some previously proposed SAs combine the features of VLSAs and CLSAs to reduce the VOS, which can be referred to as hybrid latch-type SAs (HYSA) [28,29,30,31,32,33]. Figure 7a shows one example of an HYSA proposed in [32], the variation-tolerant SA (VTSA). For consistency in explanation with other structures, the polarity in this VTSA example is reversed from the original structure. The VTSA is primarily based on the CLSA structure but also incorporates features of a VLSA. Specifically, the SA outputs, SOT and SOC, are pre-charged to the SA inputs, SLT and SLC, using pass transistors MS7 and MS8.
When comparing VTSAs with VLSAs, a notable difference is observed in the pull-down networks of the positive feedback configurations in the SA. In the VTSA, these networks, consisting of MS3 and MS4, are not directly connected to the CM node as in the VLSA. Instead, they are connected to ZT and ZC nodes, as shown in Figure 7a. These nodes are pulled down by MS1 and MS2, respectively, with their gates controlled by SLC and SLT. This configuration effectively adjusts the VGS of MS3 and MS4 for proper sensing.
The detailed operation of the VTSA is as follows: During the access phase, when SAE = 0 and datum “1” is being sensed, the SLT is at VDD and the SLC is at VDD − ΔVIN,SA, so SOT and SOC are pre-charged to VDD and VDD − ΔVIN,SA, respectively, through MS7 and MS8, similar to the VLSA. Additionally, the gate voltages of MS1 and MS2, VG,MS1 and VG,MS2, become VDD − ΔVIN,SA and VDD, respectively. When the evaluation phase begins with SAE = 1, ZT and ZC are pulled down by MS1 and MS2, respectively. In this configuration, since SLT > SLC, MS2 can drive more current than MS1, resulting in ZC being pulled down more strongly than ZT (i.e., ZT > ZC). As a result, compared to the VLSA, the difference between VGS,MS3 and VGS,MS4 is larger in the VTSA, indicating that the amplification is more stable and, thus, that the VOS can be reduced. This is due to adjustments made not only in the gate voltage conditions of MS3 and MS4 (VG,MS3 < VG,MS4), but also in their source voltage conditions (VS,MS3 > VS,MS4).
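The gate and source conditions above can be combined into a single expression. The following is a sketch derived only from the node voltages stated in the text, assuming that the gates of MS3 and MS4 sit at SOC and SOT and that their sources are ZT and ZC; the symbols V_ZT and V_ZC, denoting the voltages at ZT and ZC shortly after SAE rises, are notation introduced here for illustration.

```latex
\begin{aligned}
V_{GS,MS4} - V_{GS,MS3}
  &= \left(V_{G,MS4} - V_{S,MS4}\right) - \left(V_{G,MS3} - V_{S,MS3}\right) \\
  &= \left(V_{DD} - V_{ZC}\right) - \left(V_{DD} - \Delta V_{IN,SA} - V_{ZT}\right) \\
  &= \Delta V_{IN,SA} + \left(V_{ZT} - V_{ZC}\right)
\end{aligned}
```

In a VLSA the two sources share the common node, so the second term vanishes and the VGS difference is only ΔVIN,SA. In the VTSA, ZT > ZC adds a positive term on top of it.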
However, the VTSA has an additional pair of nFET transistors compared to the VLSA—MS1 and MS2—involved in the initial amplification of signals. This additional pair not only incurs area overhead but also potentially increases the mismatch effects. That is, the mismatch between MS1 and MS2, which does not need to be considered in VLSAs, can result in unintentional changes in ZT and ZC and degrade the sensing stability. In addition, stacked nFETs degrade the sensing delay and power consumption, like STSAs.
Figure 7b shows another example of an HYSA, the HYSA-QZ, which is proposed in [33]. This structure more aggressively pre-charges the internal nodes of the SA than the VTSA. The notation of QZ here means that not only output nodes (Q), but the internal nodes between the MS1–MS2 pair and MS3–MS4 pair (Z) are also pre-charged to SA inputs in a direction for precise sensing. As shown in Figure 7b, not only SOT and SOC are pre-charged to SLT and SLC, but also ZT and ZC are pre-charged to SLT and SLC, respectively. In this manner, the bias condition of the SA becomes more favorable for accurate sensing than the VTSA.

3.3. Capacitor-Based Offset-Compensated SAs

Several previously proposed SAs have addressed transistor mismatches by employing capacitors [34,35,36,37,38,39,40]. These capacitors capture the mismatches between paired transistors, and the stored mismatch information is subsequently utilized to bias the internal nodes of the SA for compensation. Figure 8a illustrates the configuration of a capacitor-based threshold-matching SA (TMSA), as presented in [38].
As demonstrated in Figure 8b,c, the TMSA comprises two main components: a VLSA part and a capacitor-based threshold-matching part. The primary goal of the TMSA is to compensate for the mismatch between the MS1–MS2 pair, which is the most critical pair in a VLSA. This correction is accomplished by first sampling the Vth of MS1 and MS2 (Vth,MS1 and Vth,MS2) during the pre-charge phase. The sampled Vth,MS1 and Vth,MS2 are then stored at the source nodes of MS1 and MS2. This ensures that the currents through MS1 and MS2 during the amplification operation (IS1 and IS2) are independent of their Vth mismatch.
The detailed operation that achieves this objective is illustrated in Figure 9a–d, in the example of sensing datum “1”, with a comprehensive explanation provided as follows.
(1)
Pre-charge phase (Figure 9a): During this phase, the input and output nodes of the SA (SLT, SLC, SOT, and SOC) are pre-charged to VDD. Then, the top-plate nodes of C0 and C1 (CTT and CTC) are pre-charged to VDD − Vth,MS1 and VDD − Vth,MS2, respectively, at which point MS1 and MS2 turn off. This pre-charge assumes that CTT and CTC are at 0 V before pre-charging (the latching phase below ensures this). In addition, the common bottom-plate node of C0 and C1, NRSC, is pre-charged to VDD by MS8, which is turned on by PCB = 0.
(2)
Access phase (Figure 9b): In this phase, SLC is lowered and becomes VDD − ΔVIN,SA by the bit-cell, causing the SOC to also be VDD − ΔVIN,SA. In addition, the PCB becomes high, so the common bottom-plate node of C0 and C1, NRSC, becomes float-high.
(3)
Evaluation phase (Figure 9c): This phase starts with the SAE rising, which turns on MS7 and pulls NRSC down. This results in negative capacitive coupling from NRSC to CTT and CTC through C0 and C1, respectively. Thus, CTT and CTC decrease by ΔV to VDD − Vth,MS1 − ΔV and VDD − Vth,MS2 − ΔV, respectively. This turns on MS1 and MS2, whose overdrive voltages (VOV = VGS − Vth), VOV,MS1 and VOV,MS2, become as follows:
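\[
\begin{aligned}
V_{OV,MS1} &= V_{GS,MS1} - V_{th,MS1} = V(\mathrm{SOC}) - V(\mathrm{CTT}) - V_{th,MS1} \\
           &= \left(V_{DD} - \Delta V_{IN,SA}\right) - \left(V_{DD} - V_{th,MS1} - \Delta V\right) - V_{th,MS1} = \Delta V - \Delta V_{IN,SA} \\
V_{OV,MS2} &= V_{GS,MS2} - V_{th,MS2} = V(\mathrm{SOT}) - V(\mathrm{CTC}) - V_{th,MS2} \\
           &= V_{DD} - \left(V_{DD} - V_{th,MS2} - \Delta V\right) - V_{th,MS2} = \Delta V
\end{aligned}
\]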
The key point is that VOV,MS1 and VOV,MS2, which determine IS1 and IS2, are independent of Vth,MS1 and Vth,MS2, respectively. Thus, even in the presence of a mismatch between Vth,MS1 and Vth,MS2, IS1 and IS2 can be stably generated (e.g., IS1 < IS2 for datum “1” sensing as in Figure 9c) at the beginning of the evaluation phase. This makes the TMSA notably more robust than the conventional VLSA, leading to a reduced VOS.
(4)
Latching phase (Figure 9d): After the NRSC becomes low in the evaluation phase, this change in NRSC propagates to make LAT = VDD through a delay buffer, which starts the latching phase. In this phase, CTT and CTC become 0 V, so SOT and SOC can latch the sensing results at the full digital level. This state is kept until the next pre-charge phase. Here, one can see that CTT and CTC are 0 V, and they are to be charged up to VDD − Vth,MS1 and VDD − Vth,MS2, respectively, in the next pre-charge phase.
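The threshold cancellation in the four phases above can be checked numerically. The sketch below follows the node voltages given in the text for datum “1” and assumes that both top plates are coupled down by the same ΔV; the function name and the voltage values are hypothetical and serve only as an illustration.

```python
def tmsa_overdrives(vdd, dv_in, dv_couple, vth_ms1, vth_ms2):
    """Overdrive voltages of MS1 and MS2 at the start of evaluation (datum "1")."""
    # Pre-charge leaves each capacitor top plate one threshold below VDD.
    ctt = vdd - vth_ms1
    ctc = vdd - vth_ms2
    # Pulling NRSC low couples both top plates down by dv_couple (assumed equal).
    ctt -= dv_couple
    ctc -= dv_couple
    # Gate of MS1 is SOC = VDD - dv_in; gate of MS2 is SOT = VDD.
    vov_ms1 = (vdd - dv_in) - ctt - vth_ms1
    vov_ms2 = vdd - ctc - vth_ms2
    return vov_ms1, vov_ms2


# Matched and strongly mismatched thresholds give the same overdrives.
print(tmsa_overdrives(0.8, 0.05, 0.2, vth_ms1=0.30, vth_ms2=0.30))
print(tmsa_overdrives(0.8, 0.05, 0.2, vth_ms1=0.25, vth_ms2=0.35))
```

Both calls return ΔV − ΔVIN,SA for MS1 and ΔV for MS2 up to floating-point rounding, so the current ordering IS1 < IS2 no longer depends on the threshold mismatch.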
Although the TMSA effectively reduces the VOS by compensating for the mismatch between MS1 and MS2, there are several shortcomings in this structure. First, the structure is still affected by the mismatch between the capacitors C0 and C1, although this mismatch is typically much smaller than the transistor Vth mismatch. Second, the capacitors and delay buffers in the TMSA significantly increase power consumption and area. In particular, because a sufficiently large ΔV is necessary to turn on MS1 and MS2 early in the amplification stage, large capacitors are unavoidable for C0 and C1. Consequently, a significant amount of power is required to charge the NRSC from 0 V to VDD in the pre-charge phase. The area overhead, however, can be avoided by placing metal–oxide–metal (MOM) capacitors on top of the circuit layout [39].
As an alternative approach, the variation-tolerant small-signal SA (VTS-SA) is proposed in [39], specifically addressing mismatches between the two inverters in the SA. This is achieved through the utilization of capacitors at the input acceptance part. The structure of the VTS-SA is shown in Figure 10 below.
The VTS-SA is based on a VLSA composed of MS1–MS2–MS3–MS4, while the SA input nodes, SLT and SLC, are accepted through coupling capacitors CC1 and CC2, respectively. By utilizing capacitors, the VTS-SA can capture and store the trip points of the two inverters in the SA, INV1 (MS1 and MS3) and INV2 (MS2 and MS4), shown in Figure 10. Biased at their respective trip points, the two inverters become highly sensitive to small input voltage variations. That is, even small input voltage changes can push the inverters to switch their output states. This enhanced voltage gain of the inverters contributes to the improved speed of the SA. Furthermore, trip-point biasing in the VTS-SA serves another crucial purpose: it allows the SA to adapt to process variations within the inverters. By individually setting the trip points, the VTS-SA makes each inverter operate primarily in response to input changes, minimizing its dependence on process variations as much as possible.
The detailed operations of the VTS-SA are illustrated in Figure 11a–c, where there are three main operation phases: (1) the trip-point bias phase, (2) the access phase, and (3) the evaluation phase.
(1)
Trip-point bias phase (Figure 11a): In this phase, the input and output are shorted in INV1 and INV2 of the SA. As a result, the input and output of INV1 and INV2 are set to their respective trip points—Vbias,INV1 and Vbias,INV2. This is accomplished by turning on the MS7 and MS8 transistors through PRE = 1, while also turning on the header and footer switches MS11 and MS12 with EN = 1. In addition, SAE = 0 in this phase, to make the bottom plate of the coupling capacitors, SLIT and SLIC, also be equal to the trip points of the inverters.
(2)
Access phase (Figure 11b): In this phase, the input–output connections are disconnected, and the two trip-point-biased inverters are ready to accept changes in SLT and SLC through capacitive couplings. Specifically, when sensing datum “1”, as demonstrated in Figure 11b, SLC is decreased by ΔVIN,SA. Then, SLIC is decreased by ΔVcoup through capacitive coupling via CC1. Due to the trip-point bias, this input change of INV2 leads to a significant change in the output of INV2, SOT. As a result, an amplified voltage difference of K × ΔVIN,SA, where K > 1, is observed between SOT and SOC. It is important to note that, as previously mentioned, because the inverters are biased at their respective trip points, the output change is determined almost entirely by the input change and is largely independent of process variations.
(3)
Evaluation phase (Figure 11c): In this phase, the SAE becomes high; thus, the two inverters are connected in a cross-coupled fashion, by turning on MS10 and MS9. At the same time, the two cross-coupled inverters are isolated from the input by turning off MS5 and MS6. Through the positive feedback of the cross-coupled inverters, the final data are latched onto SOT and SOC at the full digital level, similar to the operation of other SAs.
Although the VTS-SA tries to reduce the VOS by capturing the mismatch between INV1 and INV2 through trip-point biasing, there are several limitations to this structure. First, mismatches between MS5–MS6, MS7–MS8, and MS9–MS10 are newly introduced in this structure, which limits the VOS reduction. Second, similar to the TMSA, the VTS-SA is still affected by the mismatch between CC1 and CC2, although this is less influential than the transistor mismatch. Third, because the input voltage is transferred through capacitive coupling, not all of the ΔVIN,SA is delivered to the SA. This inefficiency contributes to an increase in the effective VOS. Fourth, the trip-point biasing process should be completed before the ΔVIN,SA appears between SLT and SLC. This requirement potentially increases the circuit complexity. In addition, a short-circuit current from VDD to VSS is inevitable during trip-point biasing, resulting in high power consumption.
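The coupling loss named in the third limitation can be estimated with a capacitive-divider sketch. It assumes that the floating inverter input sees the coupling capacitor in series with a lumped parasitic capacitance to ground; the parameter `c_parasitic` and all capacitance values are hypothetical and are not taken from [39].

```python
def coupled_input_step(dv_in, c_couple, c_parasitic):
    """Voltage step seen at the inverter input through a coupling capacitor.

    Treats the floating inverter input as a capacitive divider between the
    coupling capacitor and a lumped parasitic capacitance to ground (assumed).
    """
    return dv_in * c_couple / (c_couple + c_parasitic)


# Hypothetical 50 mV input difference and 2.5 fF input parasitic.
for c_couple in (2.5e-15, 10e-15, 40e-15):
    dv_coup = coupled_input_step(0.05, c_couple, 2.5e-15)
    print(f"CC = {c_couple * 1e15:.1f} fF -> dVcoup = {dv_coup * 1e3:.1f} mV")
```

In this model the delivered step approaches ΔVIN,SA only when the coupling capacitor is much larger than the parasitic, which trades the coupling loss against capacitor area.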
The current-mode SA with a capacitive offset correction (CSACOC) structure proposed in [40] utilizes a single capacitor for storing the trip points of inverters, so it is free from capacitor mismatch effects. The schematic of the CSACOC is shown in Figure 12a, and the operation waveforms of its three main control clock signals—the trip-point storage enable, ΦTrs; the trip-point bias enable, ΦTrb; and the sense enable, SAE—are illustrated in Figure 12b.
The key concept of the CSACOC is to store the difference in the trip point voltages of the two inverters, INV1 and INV2, in Figure 12a. The difference in the trip point voltages of the two inverters, ΔVTr = VTr1 − VTr2, is stored across the single capacitor, C0. Then, the two inverters are biased to compensate the trip-point difference, effectively correcting for the mismatch. The operation of CSACOC unfolds in three phases, as illustrated in Figure 13a–c, with explanations for each provided as follows.
(1)
Trip-point storage phase (ΦTrs = 1, Figure 13a): In this phase, SLT and SLC are pre-charged to VDD, and the input and output of each inverter, INV1 and INV2, are shorted. In this manner, the trip points of INV1 and INV2, VTr1 and VTr2, are captured and stored at the input and output nodes of the respective inverters, as shown in Figure 13a. It is accomplished by turning on MS7, MS8, MS9, and MS10 while turning off T1, T2, MS5, and MS6. The difference between two inverter trip points, ΔVTr, is stored across the capacitor, C0.
(2)
Trip-point bias phase (ΦTrb = 1, Figure 13b): During this phase, the input and output of INV1 and INV2 are disconnected by turning off MS7 and MS10. Subsequently, by utilizing the ΔVTr stored in C0 in the previous phase, the input of each inverter is held at its respective trip point, while INV1 and INV2 are configured in the cross-coupled connection. For example, the input of INV1 is kept at VTr1 while it is connected to the output of INV2 (= SOC), and vice versa. This is achieved by turning on MS5 and MS7 while turning off MS11. Then, the bit-cell develops a voltage difference between SLT and SLC, which produces a differential current through MS3 and MS4.
(3)
Evaluation phase (SAE = 1, Figure 13c): In this phase, the two cross-coupled inverters are disconnected from C0 by turning off MS8 and MS9. Simultaneously, the positive feedback of the cross-coupled inverters is initiated by turning on MS11, T1, and T2. As a result, the full digital voltage level appears at two differential outputs of the SA, SOT and SOC.
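The role of the single capacitor in the first two phases can be illustrated with an idealized sketch. It assumes that C0 acts as an ideal level shifter of ΔVTr in the cross-coupled path and that each inverter output still sits at its own trip point when the bias phase starts; the actual switch network is the one in Figure 12a, and the function name and trip-point values are hypothetical.

```python
def csacoc_bias_inputs(v_tr1, v_tr2):
    """Inverter input voltages during the trip-point bias phase (idealized)."""
    # Storage phase: each inverter is shorted input-to-output, so its nodes
    # settle at its own trip point and C0 holds their difference.
    dv_tr = v_tr1 - v_tr2
    # Bias phase (assumed idealization): outputs still sit at the trip points,
    # and C0 shifts the level between one output and the other input.
    out_inv1, out_inv2 = v_tr1, v_tr2
    in_inv1 = out_inv2 + dv_tr
    in_inv2 = out_inv1 - dv_tr
    return in_inv1, in_inv2


# Hypothetical mismatched trip points of 420 mV and 380 mV.
print(csacoc_bias_inputs(0.42, 0.38))
```

Under these assumptions each inverter input returns to its own trip point even though only one voltage difference was stored, which is why no second, possibly mismatched, capacitor is needed.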
The CSACOC is immune to capacitor mismatch because it uses a single capacitor, unlike the TMSA and VTS-SA. However, in the previous SAs the voltage between SLT and SLC is transferred to SOT and SOC through fully turned-on pFETs during the access phase, whereas in the CSACOC the voltage difference between SOT and SOC follows that of SLT and SLC through partially turned-on pFETs (current-based). This leads to voltage loss, effectively increasing the VOS. In addition, numerous switches and control signal generation logic are required, which increases the circuit design complexity and adds power and area overhead.

3.4. Offset-Compensated Pre-Amplifiers

Another approach in offset compensation is the use of pre-amplifiers that amplify the bit-line signal preceding the SA stage, as seen in [41,42,43,44]. Instead of directly modifying the SA structure, these additional offset-compensating pre-amplifiers are employed in front of the SA. This allows for the required offset compensation while maintaining the original SA structure. One such example is the bit-line pre-charge and pre-amplifying switching pFET circuit (BP2SP), with its structure and key operational waveforms depicted in Figure 14a,b.
As shown in Figure 14b, BP2SP is operated in three phases, as explained below.
(1)
Pre-charge phase (PCB = 0): In this phase, MS13 and MS14 in BP2SP are turned on to pre-charge BLC and BLT, respectively. This pre-charges BLC and BLT to VDD − Vth,MS15 and VDD − Vth,MS16, respectively, through a diode connection. It ensures that MS15 and MS16 have VGS = Vth, allowing them to turn on immediately, regardless of Vth variations, when BLC or BLT is discharged in the subsequent phase. This compensates for the Vth mismatch between MS15 and MS16. On the SA side, SLT and SLC are pre-discharged to 0 V through MS8 and MS9.
(2)
Access phase (PCB = 1, WL = 1): During this phase, the datum stored in the selected bit-cell is reflected onto BLT and BLC. In the example shown in Figure 14b, datum “1” is sensed, so BLT remains close to its pre-charge level, VDD − Vth,MS16, while BLC decreases from VDD − Vth,MS15. Because BLC is pre-charged to VDD − Vth,MS15, MS15 turns on as soon as BLC decreases. This causes BLXT to increase rapidly. Simultaneously, COLB is lowered to enable the column MUX, resulting in SLT increasing and SLC remaining at 0 V. As shown in Figure 14b, this phase effectively pre-amplifies the voltage difference between BLT and BLC into the voltage difference between SLT and SLC.
(3)
Evaluation phase (SAE = 1): In this phase, the SAE is raised, meaning /SAE is lowered. Consequently, the VLSA is enabled to store the final sensing data in the form of a full digital voltage at the SOT and SOC nodes. In addition, during this phase, the bit-line equalization circuit—transmission gate T1—is activated to equalize BLT and BLC. This ensures that the subsequent pre-charge operation of BLT and BLC can start with both bit-lines having the same low voltage level as the initial condition. This equalization step is important for maintaining consistency in the subsequent memory operation.
The operating principle of BP2SP is to use the same pFETs both to pre-charge the bit-lines and to pre-amplify the bit-line voltages. Specifically, because the bit-line pre-charge level captures the Vth variation of the pre-amplifying pFETs, these pFETs can turn on instantly in response to the bit-line pair voltage development. This allows the amplified voltage to be observed at SLT and SLC, reducing the ΔVBL required for stable sensing and thereby improving speed and power efficiency. However, to pre-charge the bit-line pair to VDD − Vth, the bit-line voltages must be sufficiently lower than VDD − Vth before the pre-charge starts. This requirement increases the circuit complexity, especially when the memory is awakened from power-down mode or standby mode. In addition, after the bit-line pair is pre-charged to VDD − Vth, the bit-lines become floating, making them susceptible to noise. Moreover, the initial VGS condition of the pre-amplifier pFETs can vary significantly with the pre-charge period, which means that the overall speed of the read operation is highly affected by the pre-charge time.
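As a rough illustration of why the diode-connected pre-charge removes the Vth dependence, the following Python sketch compares the bit-line drop needed to turn on each pre-amplifying pFET under two pre-charge schemes. It assumes a simplified model that is not taken from [42]: each pFET has its source at VDD and its gate on the bit-line, so it conducts once the bit-line falls below VDD − |Vth|. The function name and the |Vth| values are hypothetical.

```python
def required_bitline_drop(precharge_level, vdd, vth_abs):
    """Bit-line drop (V) needed before the pFET starts to conduct.

    Simplified assumption: source at VDD, gate on the bit-line, so the
    device turns on once the bit-line falls below VDD - |Vth|.
    """
    turn_on_level = vdd - vth_abs
    return max(0.0, precharge_level - turn_on_level)


VDD = 1.0
# Hypothetical |Vth| values for a mismatched pFET pair (e.g., MS15 and MS16).
VTH_PAIR = (0.32, 0.28)

# Scheme A: both bit-lines pre-charged to VDD.
drop_vdd = [required_bitline_drop(VDD, VDD, v) for v in VTH_PAIR]

# Scheme B: each bit-line pre-charged to VDD - |Vth| of its own pFET
# (the level reached through a diode connection).
drop_diode = [required_bitline_drop(VDD - v, VDD, v) for v in VTH_PAIR]

print("VDD pre-charge, required drops (V):  ", drop_vdd)
print("Diode pre-charge, required drops (V):", drop_diode)
```

In this model, the VDD pre-charge requires a drop equal to each device's |Vth|, so the Vth mismatch appears directly as a difference in the turn-on points, whereas the diode-connected pre-charge requires no drop for either device.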
In [43], another pre-amplifier circuit for SRAM, the cross-coupled nFET pre-amplifier and pre-charge circuit (CCN-PP), is presented. The structure and operational waveforms of the CCN-PP are shown in Figure 15a,b. As depicted in Figure 15b, the CCN-PP operates in four phases.
(1)
Pre-charge phase (PBE = 0, PCB = 0): During this phase, the pre-charging boost enable signal (PBE) and PCB are low, so the SA input pre-charge circuit (MS3–MS4–MS5) and MS6 are turned on. This maintains VDDSA at VDD, while SLXT and SLXC are pre-charged to VDD. It should be noted that, unlike the conventional pre-charge operation, all the column MUX transistors and bit-line equalization circuits (T1) are turned on. As a result, SLT, SLC, BLT, and BLC are pre-charged through the CCN-PP. Because the CCN-PP is composed of nFETs, there is a threshold voltage drop in the pre-charge voltages. That is, BLT and BLC are pre-charged to VDD − min(Vth,MS1, Vth,MS2).
(2)
Access phase 1 (PBE = 1): During this phase, the unselected column MUX transistors are turned off and the PBE is raised. As a result, MS6 is turned off and then the PBEd rises, boosting VDDSA to VDD + ΔVC through C0 coupling. Thus, the SA inputs, SLXT and SLXC, are also pre-charged to VDD + ΔVC. Accordingly, BLT and BLC can be slightly raised. In this phase, the WL is activated, so BLT and BLC start to develop according to the bit-cell data.
(3)
Access phase 2 (PBE = 0, PCB = 1): With PCB rising, SLXT and SLXC are affected by the change in BLT and BLC through the CCN-PP. For example, when accessing the datum “1”, as shown in Figure 15b, BLC and SLC decrease, leading MS2 to be turned on while MS1 is kept turned off. The turned-on MS2 makes SLXC fall while SLXT is kept high, close to VDD + ΔVC. Due to the positive feedback nature of cross-coupled nFETs, the voltage difference between SLXT and SLXC is larger than that of BLT and BLC, meaning that the bit-line voltage is pre-amplified.
(4)
Evaluation phase (SAE = 1): A high SAE activates the SA to latch the data at the SA outputs, SOT and SOC. In addition, similar to BP2SP, the bit-line equalization circuit is activated to provide proper bit-line initial conditions for the subsequent pre-charge phase.
Unlike in BP2SP, the initial VGS of the pre-amplifier transistors in the CCN-PP is determined by access phase 1. Thus, the performance is less dependent on the pre-charge period, so the CCN-PP can provide a stable speed. However, as in BP2SP, the CCN-PP still suffers from floating BLT and BLC during the pre-charge phase. In addition, the CCN-PP cannot compensate for the mismatch between MS1 and MS2, which makes it inferior to BP2SP in this respect. Furthermore, the VDDSA boosting circuit can incur a significant amount of power and area overhead.
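The boosting step in access phase 1 can be approximated to first order as a capacitive divider. The sketch below is an assumption for illustration rather than the analysis in [43]: it treats VDDSA as a floating node with a lumped parasitic capacitance (the hypothetical `c_par`) that is coupled to the rising PBEd through C0, so that ΔVC = ΔV(PBEd) · C0 / (C0 + Cpar). All component values are hypothetical.

```python
def boosted_vddsa(vdd, pbed_swing, c0, c_par):
    """Return (boosted VDDSA level, dVC) for a floating VDDSA node.

    First-order charge-conservation model (an assumption, not from the
    source): dVC = pbed_swing * C0 / (C0 + C_par).
    Capacitances may be in any unit as long as both use the same one.
    """
    if c0 <= 0 or c_par < 0:
        raise ValueError("capacitances must be physical (c0 > 0, c_par >= 0)")
    delta_vc = pbed_swing * c0 / (c0 + c_par)
    return vdd + delta_vc, delta_vc


VDD = 1.0
C_PAR_FF = 80.0  # hypothetical parasitic load on VDDSA, in fF

# Sweep hypothetical coupling-capacitor sizes to see the boost they provide.
for c0_ff in (10.0, 20.0, 40.0):
    level, dvc = boosted_vddsa(VDD, pbed_swing=VDD, c0=c0_ff, c_par=C_PAR_FF)
    print(f"C0 = {c0_ff:4.0f} fF: dVC = {dvc * 1e3:6.1f} mV, VDDSA = {level:.3f} V")
```

Under this model, a larger boost requires a larger C0 relative to the parasitic load on VDDSA, which is one way to see where the area overhead of the boosting circuit comes from.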
In [44], the offset-cancelled current SA (OCCSA) is proposed. As shown in Figure 16, the OCCSA uses nFET MUX transistors instead of pFET MUX transistors. Here, the nFET MUX (PSA) operates as a common-gate amplifier, so it effectively pre-amplifies the BL. To bias these PSAs properly with offset-compensating features, BLT and BLC should be pre-charged lower than VDD − Vth,MS1 and VDD − Vth,MS2, respectively. To realize this, a separate supply voltage, Vprebl, is required. However, the incorporation of this new voltage source is highly costly due to its substantial power and area overheads, making the circuit impractical for actual implementation.

3.5. Other Structures

In [45], an SA with inherent offset cancellation (SAOC) is proposed, with its structure shown in Figure 17a. The SAOC utilizes pFETs (MS10 and MS11 in Figure 17a) for input reception, connecting SLT and SLC to the gate nodes of these pFETs. Before sensing, by driving SLT and SLC low and toggling the PRE from low to high, the |Vthp| of MS10 and MS11 is captured at the output nodes of the SA, SOT and SOC, respectively. Subsequently, BLT and BLC are transferred to SLT and SLC by the turned-on MUX transistors, while MS10 and MS11 are turned on by the low PRE. This results in the charging of SOT and SOC by MS10 and MS11. In this manner, the SAOC achieves sensing operations while compensating for the mismatch between MS10 and MS11. However, it should be noted that the mismatch between the nFET MUX pair (MS6 and MS7) is not compensated, and pulling up SLT and SLC with nFETs based on BLT and BLC incurs losses in transferring the BL voltage difference to ΔVIN,SA.
In [46], the body-biasing technique is applied to the critical sensing transistors to provide auto-offset mitigation. A differential-input body-biased sense amplifier with floating output nodes (DIBBSA-FL) and a differential-input body-biased sense amplifier with pre-discharged output nodes (DIBBSA-PD) are shown in Figure 17b,c, respectively. The difference between the two is that the DIBBSA-PD has additional transistors, MS8 and MS9, to pre-discharge SOT and SOC, while the DIBBSA-FL only equalizes SOT and SOC. The operations of the DIBBSA-FL and DIBBSA-PD are as follows. During the sensing operation, the SAEB decreases and MS3 and MS4 turn on. Simultaneously, when BLT is higher than BLC, through the body-bias effect on MS1, MS2, MS3, and MS4, MS1 and MS3 become forward body-biased and MS2 and MS4 become reverse body-biased. Therefore, SOT is pulled up much faster than SOC. However, 3D FETs such as the FinFET and GAA FET have recently become common. In these technologies, the body effect is nearly negligible, so the body-bias technique is not suitable for them.
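This limitation can be illustrated with the textbook long-channel body-effect expression, ΔVth = γ(√(2φF + VSB) − √(2φF)), written here in terms of magnitudes. The sketch below is not taken from [46]; the body-effect coefficients γ, the surface-potential term 2φF, and the bias values are hypothetical numbers chosen only to contrast a bulk-like device with a device whose body effect is nearly absent.

```python
import math


def vth_shift(gamma, v_sb, two_phi_f=0.8):
    """Threshold-voltage shift (V) from the textbook body-effect formula.

    gamma     : body-effect coefficient in V**0.5 (hypothetical values below)
    v_sb      : source-to-body bias in V; negative means forward body bias
    two_phi_f : surface-potential term 2*phi_F in V (hypothetical default)
    """
    if two_phi_f + v_sb <= 0:
        raise ValueError("formula is only valid for two_phi_f + v_sb > 0")
    return gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


# Hypothetical coefficients: a bulk-like device and a FinFET-like device.
DEVICES = {"bulk-like": 0.4, "FinFET-like": 0.01}
BIAS = 0.1  # assumed bias magnitude applied to the body, in V

for name, gamma in DEVICES.items():
    forward = vth_shift(gamma, -BIAS)   # forward body bias lowers |Vth|
    reverse = vth_shift(gamma, +BIAS)   # reverse body bias raises |Vth|
    print(f"{name:12s}: forward {forward * 1e3:+7.2f} mV, "
          f"reverse {reverse * 1e3:+7.2f} mV")
```

With a small γ, the same bias on the body produces only a very small Vth difference between the two sides, so the sensing assistance that the body bias is meant to provide largely disappears.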
Figure 17d shows the cancellation based on delay and offset relation (CDOR) structure [47]. Before the sensing operation, the mismatch in the SA is captured by the sensing operation, with SLT and SLC equally set to VDD. Because of the mismatch in the SA, SOT and SOC become (1, 0) or (0, 1), connected to the gate of MS15 and MS14, respectively. When SOT and SOC are (1, 0), this means that the pull-up strength on the SOT side is higher than that on the SOC side. Simultaneously, Q and QB become VDD and VSS, turning off MS6 and MS7. In the case of (SOT, SOC) = (1, 0), MS14 turns on and MS15 turns off, lowering the SLT. Due to the decreased SLT, the pull-up strength of the SOC side becomes stronger, which operates as offset mitigation. However, the process of adjusting the voltage is highly challenging. This is because the voltage variance is highly dependent on the offset mitigation activation time and the sizes of the MS6 and MS7 transistors.

4. Comparison

Table 1 summarizes the comparison among the SRAM sensing circuit designs covered in Section 3.
Unlike the conventional SAs (VLSA and CLSA), the STSA, VTSA, and HYSA-QZ drive or pre-charge the internal nodes of the SA in favor of accurate sensing. In this manner, without using additional control signals or employing additional operation phases, the offset voltage can be efficiently reduced. In terms of reducing the VOS, the VTSA and HYSA-QZ, which directly pre-charge the internal nodes using pass gates connected to SLT and SLC, outperform the STSA. This is because the mismatch effects in the gated FETs controlling the SLT and SLC in the STSA are larger than the mismatch effects in the transmission gates used by the VTSA or HYSA-QZ to transfer SLT and SLC. Compared with the VTSA, the HYSA-QZ can achieve a smaller VOS because it pre-charges more internal nodes. However, the SA delay is increased in the STSA, VTSA, and HYSA-QZ compared to the VLSA because of the larger number of stacked transistors.
The TMSA, VTS-SA, and CSACOC directly capture mismatches in SAs, utilizing a capacitor(s). In this manner, the VOS can be further reduced compared to the STSA, VTSA, and HYSA-QZ. However, this improvement comes at a cost: introducing additional phases or control signals, biasing through short circuit currents, and using large capacitors increase the SA delay and energy consumption significantly. The trade-off between BL delay/energy and SA delay/energy becomes evident in this context. More precise compensation of SA mismatches can result in a smaller VOS and reduced BL delay and energy. However, achieving this delicacy requires additional circuit components, which can lead to increased SA delay and energy consumption.
The BL pre-charging pre-amplifier circuits, BP2SP and CCN-PP, offer an alternative approach that captures the transistor Vth values and reduces the required BL voltage development. They can be implemented more simply than the SA mismatch compensation structures because pre-amplifiers have a simpler structure than SAs. However, controlling the BL pre-charge levels can be challenging in practice, especially because the bit-lines are left floating when diode-connected transistors are used for pre-charging.
In addition to the sensing circuits covered in Section 3, there are several other approaches for reducing VBL requirements [44,45,46,47], as shown in the last four rows in Table 1. However, it is worth noting that these methods have specific characteristics that may affect their applicability. One of these structures, the SAOC, addresses the mismatch between the two input pFETs at the beginning of the read access to reduce the VOS. However, the mismatches in other transistor pairs, which are also critical for the VOS, cannot be compensated. Thus, its VOS may even be larger than that of the conventional SAs. In addition, short-circuit current paths are inevitably formed, which limits its practical applicability.
The OCCSA utilizes the MUX transistors as the common gate amplifier to pre-amplify the VBL. Although it is powerful, to operate the MUX as an amplifier, an additional high-voltage source is required for bit-line pre-charge (Vprebl), which significantly incurs power and area overheads. In addition, to compensate the mismatch between the MUX transistor pair, a significant amount of time is required for the separate bit-line pre-charge phase before the access phase, which substantially degrades the cycle time.
The DIBBSA-FL and DIBBSA-PD are proposed in [46]. In these structures, differential bit-line inputs are transferred to differential output nodes through pull-up pFETs, while the bodies of the output pull-up pFETs are biased with the bit-lines to enhance sensing accuracy. However, a critical limitation of these approaches arises from the fact that most recent SRAMs utilize multiple-gate FETs, such as FinFETs and gate-all-around FETs, which exhibit minimal body effects. Consequently, the current and threshold voltage remain nearly independent of body voltage changes, rendering these structures inapplicable.
The CDOR-based offset compensating sensing circuit is introduced. This structure captures the mismatch in SAs during the pre-charge phase of the SRAM. This is achieved by enabling the SA (SAE = 1) with the condition of SLT = SLC = VDD. In this manner, the mismatch information is stored at the differential output nodes, SOT and SOC. For example, if the mismatch favors the SA to make the SOT become low, this mismatch capturing process makes SOT become 0, while SOC becomes high during the pre-charge phase. Then, utilizing this stored information, when the sensing phase starts, SLT and SLC are calibrated to compensate the mismatch. Although the compensation technique is innovative, the accuracy of this compensation technique is highly dependent on factors such as the width of the calibration timing and the sizing of the calibration transistor. This dependency can potentially result in an increase in the effective VOS of the SA, which may render the structure less practical.
Figure 18 shows the minimum operating voltage of SAs according to technology scalability. The minimum operating voltage represents the minimum voltage that satisfies the 6σ sensing yield at the operating frequency of 1 GHz in the 7 nm, 14 nm, and 28 nm processes.
A quantitative comparison among the different SAs covered in Section 3 is shown in Table 2. The results are simulated in TSMC 28 nm technology with a four-to-one MUX, VDD = 1.0 V, and 256 bit-cells per column. The distribution of VOS in the SAs is estimated as follows [63]: First, we assume that VOS follows a Gaussian distribution. Thus, PFailSA, the probability of sensing failure, can be expressed as follows:
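$$P_{\mathrm{FailSA}} = P\left(V_{\mathrm{OS}} > \Delta V_{\mathrm{IN,SA}}\right) = P\left(\frac{V_{\mathrm{OS}} - \mu_{\mathrm{OS}}}{\sigma_{\mathrm{OS}}} > \frac{\Delta V_{\mathrm{IN,SA}} - \mu_{\mathrm{OS}}}{\sigma_{\mathrm{OS}}}\right) = P\left(Z > \frac{\Delta V_{\mathrm{IN,SA}} - \mu_{\mathrm{OS}}}{\sigma_{\mathrm{OS}}}\right) \tag{1}$$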
In (1), ΔVIN,SA is the SA input voltage difference, µOS is the mean VOS, σOS is the standard deviation of the VOS, and Z is the standard Gaussian random variable. Second, representing the standard Gaussian cumulative distribution function (CDF) as Φ(z), Equation (1) can be rewritten as follows:
$$P_{\mathrm{FailSA}} = 1 - \Phi\left(\frac{\Delta V_{\mathrm{IN,SA}} - \mu_{\mathrm{OS}}}{\sigma_{\mathrm{OS}}}\right) \tag{2}$$
Third, through the inverse function, (2) can be expressed as follows:
$$\mu_{\mathrm{OS}} + \sigma_{\mathrm{OS}}\,\Phi^{-1}\left(1 - P_{\mathrm{FailSA}}\right) = \Delta V_{\mathrm{IN,SA}} \tag{3}$$
In (3), both PFailSA and ΔVIN,SA are obtainable through simulation. With these values specified, only µOS and σOS remain unknown in (3), so two instances of (3) suffice to derive them. Therefore, through 1000-sample Monte Carlo simulations at VINtest1 (ΔVIN,SA = 10 mV) and VINtest2 (ΔVIN,SA = −10 mV), PFailSA1 and PFailSA2 can be determined, and substituting them into (3) gives the following two equations:
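$$\mu_{\mathrm{OS}} + \sigma_{\mathrm{OS}}\,\Phi^{-1}\left(1 - P_{\mathrm{FailSA1}}\right) = V_{\mathrm{INtest1}} \tag{4}$$
$$\mu_{\mathrm{OS}} + \sigma_{\mathrm{OS}}\,\Phi^{-1}\left(1 - P_{\mathrm{FailSA2}}\right) = V_{\mathrm{INtest2}} \tag{5}$$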
Finally, by solving (4) and (5) simultaneously, μOS and σOS can be expressed as follows:
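$$\mu_{\mathrm{OS}} = \frac{\Phi^{-1}\left(1 - P_{\mathrm{FailSA2}}\right) V_{\mathrm{INtest1}} - \Phi^{-1}\left(1 - P_{\mathrm{FailSA1}}\right) V_{\mathrm{INtest2}}}{\Phi^{-1}\left(1 - P_{\mathrm{FailSA2}}\right) - \Phi^{-1}\left(1 - P_{\mathrm{FailSA1}}\right)} \tag{6}$$
$$\sigma_{\mathrm{OS}} = \frac{V_{\mathrm{INtest1}} - V_{\mathrm{INtest2}}}{\Phi^{-1}\left(1 - P_{\mathrm{FailSA1}}\right) - \Phi^{-1}\left(1 - P_{\mathrm{FailSA2}}\right)} \tag{7}$$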
In (6) and (7), because VINtest1, VINtest2, PFailSA1, and PFailSA2 are determined through simulation, μOS and σOS can be estimated. Additionally, the energy consumption is measured by integrating the sum of all currents flowing during one cycle. The energy consumptions shown correspond to those consumed at the four columns of the BL. As mentioned earlier, the reduction in VOS can be observed to enhance the performance of BL delay/energy and SA delay/energy.
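The estimation procedure in (3)–(7) can be sketched in Python as follows. This is a minimal illustration, not the authors' simulation flow: the function names are hypothetical, and the failure probabilities, which would normally come from circuit-level Monte Carlo runs, are emulated here by drawing VOS from a Gaussian with assumed parameters so that the recovered values can be compared against them. `statistics.NormalDist` from the Python standard library provides Φ and Φ⁻¹.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian: cdf() is Phi, inv_cdf() is Phi^-1


def estimate_offset(p_fail_1, p_fail_2, v_in_test_1, v_in_test_2):
    """Solve Equations (4) and (5) for (mu_OS, sigma_OS) using (6) and (7).

    Both probabilities must lie strictly between 0 and 1, otherwise
    inv_cdf() raises statistics.StatisticsError.
    """
    a1 = _STD_NORMAL.inv_cdf(1.0 - p_fail_1)
    a2 = _STD_NORMAL.inv_cdf(1.0 - p_fail_2)
    mu_os = (a2 * v_in_test_1 - a1 * v_in_test_2) / (a2 - a1)   # Equation (6)
    sigma_os = (v_in_test_1 - v_in_test_2) / (a1 - a2)          # Equation (7)
    return mu_os, sigma_os


def fail_probability(mu_os, sigma_os, dv_in):
    """Equation (2): P(V_OS > dV_IN,SA) for a Gaussian V_OS."""
    return 1.0 - NormalDist(mu_os, sigma_os).cdf(dv_in)


def monte_carlo_fail_rate(mu_os, sigma_os, dv_in, n=1000, seed=0):
    """Emulate an n-sample Monte Carlo run by sampling V_OS directly."""
    rng = random.Random(seed)
    fails = sum(rng.gauss(mu_os, sigma_os) > dv_in for _ in range(n))
    return fails / n


def required_input_swing(mu_os, sigma_os, n_sigma=6.0):
    """Equation (3) with Phi^-1(1 - P_FailSA) = n_sigma."""
    return mu_os + n_sigma * sigma_os


if __name__ == "__main__":
    TRUE_MU, TRUE_SIGMA = 0.002, 0.012   # assumed offset statistics, in V
    V1, V2 = 0.010, -0.010               # test inputs of +10 mV and -10 mV

    p1 = monte_carlo_fail_rate(TRUE_MU, TRUE_SIGMA, V1, seed=1)
    p2 = monte_carlo_fail_rate(TRUE_MU, TRUE_SIGMA, V2, seed=2)
    mu_hat, sigma_hat = estimate_offset(p1, p2, V1, V2)

    print(f"P_FailSA1 = {p1:.3f}, P_FailSA2 = {p2:.3f}")
    print(f"estimated mu_OS = {mu_hat * 1e3:.2f} mV, "
          f"sigma_OS = {sigma_hat * 1e3:.2f} mV")
    print(f"input swing for 6-sigma yield = "
          f"{required_input_swing(mu_hat, sigma_hat) * 1e3:.1f} mV")
```

The two test inputs must be chosen so that some, but not all, samples fail at each point. If every sample passes or every sample fails, the estimated probability is 0 or 1 and Φ⁻¹ is undefined, so (6) and (7) cannot be evaluated.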

Funding

This research was supported by the Core Research Institute Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education (No. 2018R1A6A1A03025242), MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2023-RS-2022-00156225), supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT). (No. 2023-11-0830, Development of memory module and memory compiler for non-volatile PIM optimized for data characteristics and data access characteristics of AI processor).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

Author Hanwool Jeong is a founder of the company Articron Inc. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Zhu, H.; Kursun, V. A comprehensive comparison of data stability enhancement techniques with novel nanoscale SRAM cells under parameter fluctuations. IEEE Trans. Circuits Syst. I Regul. Pap. 2011, 61, 1062–1070.
  2. Indumathi, G.; Aarthi alias Ananthakirupa, V.P.M.B. Energy optimization techniques on SRAM: A survey. In Proceedings of the 2014 International Conference on Communication and Network Technologies, Sivakasi, India, 18–19 December 2014; pp. 216–221.
  3. Saleh, R.; Lim, G.; Kadowaki, T.; Uchiyama, K. Trends in low power digital system-on-chip designs. In Proceedings of the International Symposium on Quality Electronic Design, San Jose, CA, USA, 18–21 March 2002; pp. 373–378.
  4. Lin, S.; Kim, Y.-B.; Lombardi, F. A 32nm SRAM design for low power and high stability. In Proceedings of the 2008 51st Midwest Symposium on Circuits and Systems, Knoxville, TN, USA, 10–13 August 2008; pp. 422–425.
  5. Pelgrom, M.; Duinmaijer, A.C.J.; Welbers, A.P.G. Matching properties of MOS transistors. IEEE J. Solid-State Circuits 1989, 24, 1433–1439.
  6. Cho, K.; Park, J.; Oh, T.W.; Jung, S.-O. One-Sided Schmitt-Trigger-Based 9T SRAM Cell for Near-Threshold Operation. IEEE Trans. Circuits Syst. I Regul. Pap. 2020, 67, 1551–1561.
  7. Nabavi, M.; Sachdev, M. A 290-mV 3.34-MHz 6T SRAM with pMOS access transistors and boosted wordline in 65-nm CMOS technology. IEEE J. Solid-State Circuits 2018, 53, 656–667.
  8. Abbasian, E. A Highly Stable Low-Energy 10T SRAM for Near-Threshold Operation. IEEE Trans. Circuits Syst. I Regul. Pap. 2022, 69, 5195–5205.
  9. Khayatzadeh, M.; Lian, Y. Average-8 T differential-sensing Subthreshold SRAM with bit Interleaving and 1kbits per Bitline. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2014, 22, 971–982.
  10. Teman, A.; Pergament, L.; Cohen, O.; Fish, A. A 250 mV 8 kb 40 nm ultra-low power 9 T supply feedback SRAM (SF-SRAM). IEEE J. Solid-State Circuits 2011, 46, 2713–2726.
  11. Weste, N.; Harris, D. CMOS VLSI Design: A Circuits and Systems Perspective; Addison-Wesley: Boston, MA, USA, 2005.
  12. Mu, J.; Kim, H.; Kim, B. SRAM-Based In-Memory Computing Macro Featuring Voltage-Mode Accumulator and Row-by-Row ADC for Processing Neural Networks. IEEE Trans. Circuits Syst. I Regul. Pap. 2022, 69, 2412–2422.
  13. Yu, C.; Yoo, T.; Kim, H.; Kim, T.T.-H.; Chuan, K.C.T.; Kim, B. A Logic-Compatible eDRAM Compute-In-Memory With Embedded ADCs for Processing Neural Networks. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 667–679.
  14. Yu, C.; Yoo, T.; Kim, T.T.-H.; Chuan, K.C.T.; Kim, B. A 16K Current-Based 8T SRAM Compute-In-Memory Macro with Decoupled Read/Write and 1-5bit Column ADC. In Proceedings of the 2020 IEEE Custom Integrated Circuits Conference (CICC), Boston, MA, USA, 22–25 March 2020; pp. 1–4.
  15. Kim, H.; Kim, Y.; Ryu, S.; Kim, J.-J. Algorithm/Hardware Co-Design for In-Memory Neural Network Computing with Minimal Peripheral Circuit Overhead. In Proceedings of the 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20–24 July 2020; pp. 1–6.
  16. Sun, X.; Peng, X.; Chen, P.-Y.; Liu, R.; Seo, J.-S.; Yu, S. Fully parallel RRAM synaptic array for implementing binary neural network with (+1, −1) weights and (+1, 0) neurons. In Proceedings of the 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jeju, Republic of Korea, 22–25 January 2018; pp. 574–579.
  17. Lee, S.-T.; Woo, S.Y.; Lee, J.-H. Low-Power Binary Neuron Circuit with Adjustable Threshold for Binary Neural Networks Using NAND Flash Memory. IEEE Access 2020, 8, 153334–153340.
  18. Liu, R.; Peng, X.; Sun, X.; Khwa, W.S.; Si, X.; Chen, J.J.; Li, J.F.; Chang, M.F.; Yu, S. Parallelizing SRAM Arrays with Customized Bit-Cell for Binary Neural Networks. In Proceedings of the 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 24–28 June 2018; pp. 1–6.
  19. Dong, Q.; Sinangil, M.E.; Erbagci, B.; Sun, D.; Khwa, W.-S.; Liao, H.-J.; Wang, Y.; Chang, J. 15.3 A 351TOPS/W and 372.4GOPS Compute-in-Memory SRAM Macro in 7nm FinFET CMOS for Machine-Learning Applications. In Proceedings of the 2020 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 16–20 February 2020; pp. 242–244.
  20. Sun, J.; Wang, Y.; Liu, P.; Wen, S.; Wang, Y. Memristor-Based Neural Network Circuit With Multimode Generalization and Differentiation on Pavlov Associative Memory. IEEE Trans. Cybern. 2023, 53, 3351–3362.
  21. Lai, Q.; Wan, Z.; Kuate, P.D.K. Generating Grid Multi-Scroll Attractors in Memristive Neural Networks. IEEE Trans. Circuits Syst. I Regul. Pap. 2023, 70, 1324–1336.
  22. Lovett, S.J.; Gibbs, G.A.; Pancholy, A. Yield and matching implications for static RAM memory array sense amplifier design. IEEE J. Solid-State Circuits 2000, 35, 1200–1204.
  23. Zhang, K.; Hose, K.; De, V.; Senyk, B. The scaling of data sensing schemes for high speed cache design in sub-0.18 μm technologies. In Proceedings of the 2000 Symposium on VLSI Circuits, Honolulu, HI, USA, 15–17 June 2000; pp. 226–227.
  24. Boley, J.; Calhoun, B. Stack based sense amplifier designs for reducing input-referred offset. In Proceedings of the Sixteenth International Symposium on Quality Electronic Design, Santa Clara, CA, USA, 2–4 March 2015; pp. 1–4.
  25. Reniwal, B.S.; Singh, P.; Vijayvargiya, V.; Vishvakarma, S.K. A new sense amplifier design with improved input referred offset characteristics for energy-efficient sram. In Proceedings of the 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems (VLSID), Hyderabad, India, 7–11 January 2017; pp. 335–340.
  26. Patil, V.; Grover, A.; Parashar, A. Design of sense amplifier for wide voltage range operation of split supply memories in 22nm HKMG CMOS technology. In Proceedings of the 2020 33rd International Conference on VLSI Design and 2020 19th International Conference on Embedded Systems (VLSID), Bangalore, India, 4–8 January 2020; pp. 37–42.
  27. Saraswat, G.; Parashar, A. Voltage Boosted Schmitt Trigger Sense Amplifier (VBSTSA) with Improved Offset and Reaction Time For High Speed SRAMs. In Proceedings of the 2023 36th International Conference on VLSI Design and 2023 22nd International Conference on Embedded Systems (VLSID), Hyderabad, India, 8–12 January 2023.
  28. Liu, B.; Cai, J.; Yuan, J.; Hei, Y. A low-voltage SRAM sense amplifier with offset cancelling using digitized multiple body biasing. IEEE Trans. Circuits Syst. II Express Briefs 2017, 64, 442–446.
  29. Dhong, S.; Takahashi, O.; White, M.; Asano, T.; Nakazato, T.; Silberman, J.; Kawasumi, A.; Yoshihara, H. A 4.8GHz fully pipelined embedded SRAM in the streaming processor of a CELL processor. In Proceedings of the ISSCC. 2005 IEEE International Digest of Technical Papers. Solid-State Circuits Conference, 2005, San Francisco, CA, USA, 10 February 2005; Volume 1, pp. 486–612.
  30. Chen, N.; Chaba, R. Dual Sensing Current Latched Sense Amplifier. U.S. Patent 12 731 623, 3 March 2010.
  31. Tsai, M.-F.; Tsai, J.-H.; Fan, M.-L.; Su, P.; Chuang, C.-T. Variation tolerant CLSAs for nanoscale bulk-CMOS and FinFET SRAM. In Proceedings of the 2012 IEEE Asia Pacific Conference on Circuits and Systems, Kaohsiung, Taiwan, 2–5 December 2012; pp. 471–474.
  32. Sarfraz, K.; He, J.; Chan, M. A 140-mV variation-tolerant deep sub-threshold SRAM in 65-nm CMOS. IEEE J. Solid-State Circuits 2017, 52, 2215–2220.
  33. Patel, D.; Neale, A.; Wright, D.; Sachdev, M. Hybrid latch-type offset tolerant sense amplifier for low-voltage SRAMs. IEEE Trans. Circuits Syst. I Regul. Pap. 2019, 66, 2519–2532.
  34. Kawasumi, A.; Takeyama, Y.; Hirabayashi, O.; Kushida, K.; Tachibana, F.; Niki, Y.; Sasaki, S.; Yabe, T. Energy efficiency deterioration by variability in SRAM and circuit techniques for energy saving without voltage reduction. In Proceedings of the 2012 IEEE International Conference on IC Design & Technology (ICICDT), Austin, TX, USA, 30 May–1 June 2012; pp. 1–4.
  35. Verma, N.; Chandrakasan, A.P. A High-Density 45 nm SRAM Using Small-Signal Non-Strobed Regenerative Sensing. IEEE J. Solid-State Circuits 2008, 44, 163–173.
  36. Qazi, M.; Stawiasz, K.; Chang, L.; Chandrakasan, A. A 512kb 8T SRAM Macro Operating Down to 0.57 V with an AC-Coupled Sense Amplifier and Embedded Data-Retention-Voltage Sensor in 45nm SOI CMOS. IEEE J. Solid-State Circuits 2010, 46, 85–96.
  37. Fragasse, R.; Dupaix, B.; Tantawy, R.; James, T.; Khalil, W. Sense amplifier offset cancellation and replica timing calibration for high-speed SRAMs. In Proceedings of the 2018 IEEE 9th Latin American Symposium on Circuits & Systems (LASCAS), Puerto Vallarta, Mexico, 25–28 February 2018; pp. 1–5.
  38. Sinangil, M.E.; Poulton, J.W.; Fojtik, M.R.; Greer, T.H.; Tell, S.G.; Gotterba, A.J.; Wang, J.; Golbus, J.; Zimmer, B.; Dally, W.J.; et al. A 28 nm 2 Mbit 6 T SRAM with highly configurable low-voltage write-ability assist implementation and capacitor-based sense-amplifier input offset compensation. IEEE J. Solid-State Circuits 2015, 51, 557–567.
  39. Giridhar, B.; Pinckney, N.; Sylvester, D.; Blaauw, D. 13.7 A reconfigurable sense amplifier with auto-zero calibration and pre-amplification in 28nm CMOS. In Proceedings of the 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), San Francisco, CA, USA, 9–13 February 2014.
  40. Fragasse, R.; Tantawy, R.; Dupaix, B.; Dean, T.; Disabato, D.; Belz, M.R.; Smith, D.; Mccue, J.; Khalil, W. Analysis of SRAM enhancements through sense amplifier capacitive offset correction and replica self-timing. IEEE Trans. Circuits Syst. I Regul. Pap. 2019, 66, 2037–2050.
  41. Schinkel, D.; Mensink, E.; Klumperink, E.; van Tuijl, E.; Nauta, B. A Double-Tail Latch-Type Voltage Sense Amplifier with 18ps Setup+Hold Time. In Proceedings of the 2007 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, San Francisco, CA, USA, 11–15 February 2007; pp. 314–315.
  42. Jeong, H.; Park, J.; Oh, T.W.; Rim, W.; Song, T.; Kim, G.; Won, H.-S.; Jung, S.-O. Bitline precharging and preamplifying switching pMOS for high-speed low-power SRAM. IEEE Trans. Circuits Syst. II Express Briefs 2016, 63, 1059–1063.
  43. Lee, S.; Park, J.; Jeong, H. Cross-Coupled nFET Preamplifier for Low Voltage SRAM. IEEE Trans. Circuits Syst. II Express Briefs 2023, 70, 3604–3608.
  44. Sharifkhani, M.; Rahiminejad, E.; Jahinuzzaman, S.M.; Sachdev, M. A compact hybrid current/voltage sense amplifier with offset cancellation for high-speed SRAMs. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2011, 19, 883–894.
  45. Shah, J.S.; Nairn, D.; Sachdev, M. An energy-efficient offset cancelling sense amplifier. IEEE Trans. Circuits Syst. II Express Briefs 2013, 60, 477–481.
  46. Patel, D.; Neale, A.; Wright, D.; Sachdev, M. Body Biased Sense Amplifier With Auto-Offset Mitigation for Low-Voltage SRAMs. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 3265–3278.
  47. Zhao, Y.; Wang, J.; Tong, Z.; Wu, X.; Peng, C.; Lu, W.; Zhao, Q.; Lin, Z. An offset cancellation technique for SRAM sense amplifier based on relation of the delay and offset. Microelectron. J. 2022, 128, 105578.
  48. Mohammad, B.; Dadabhoy, P.; Lin, K.; Bassett, P. Comparative study of current mode and voltage mode sense amplifier used for 28nm SRAM. In Proceedings of the 2012 24th International Conference on Microelectronics (ICM), Algiers, Algeria, 16–20 December 2012; pp. 1–6.
  49. Pu, Y.; Zhang, X.; Huang, J.; Muramatsu, A.; Nomura, M.; Hirairi, K.; Takata, H.; Sakurabayashi, T.; Miyano, S.; Takamiya, M.; et al. Misleading energy and performance claims in sub/near threshold digital systems. In Proceedings of the 2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, USA, 7–11 November 2010; pp. 625–631.
  50. Moritz, G.; Giraud, B.; Noel, J.; Turgis, D.; Grover, A. Optimization of a voltage sense amplifier operating in ultra wide voltage range with back bias design techniques in 28nm utbb fd-soi technology. In Proceedings of the 2013 International Conference on IC Design Technology (ICICDT), Pavia, Italy, 29–31 May 2013; pp. 53–56.
  51. Niki, Y.; Kawasumi, A.; Suzuki, A.; Takeyama, Y.; Hirabayashi, O.; Kushida, K.; Tachibana, F.; Fujimura, Y.; Yabe, T. A digitized replica bitline delay technique for random-variation-tolerant timing generation of SRAM sense amplifiers. IEEE J. Solid-State Circuits 2011, 46, 2545–2551.
  52. Arslan, U.; McCartney, M.P.; Bhargava, M.; Li, X.; Mai, K.; Pileggi, L.T. Variation-tolerant SRAM sense-amplifier timing using configurable replica bitlines. In Proceedings of the IEEE Custom Integrated Circuits Conference (CICC), San Jose, CA, USA, 21–24 September 2008; pp. 415–418.
  53. Wang, P.; Zhou, K.; Zhang, H.; Gong, D. Design of replica bit line control circuit to optimize power for SRAM. J. Semicond. 2016, 37, 125002.
  54. Lin, Z.; Wu, X.; Li, Z.; Guan, L.; Peng, C.; Liu, C.; Chen, J. A pipeline replica bitline technique for suppressing timing variation of SRAM sense amplifiers in a 28-nm CMOS process. IEEE J. Solid-State Circuits 2016, 52, 669–677.
  55. Komatsu, S.; Yamaoka, M.; Morimoto, M.; Maeda, N.; Shimazaki, Y.; Osada, K. A 40-nm low-power SRAM with multi-stage replica-bitline technique for reducing timing variation. In Proceedings of the IEEE Custom Integrated Circuits Conference, San Jose, CA, USA, 13–16 September 2009; pp. 701–704.
  56. Amrutur, B.S.; Horowitz, M.A. Fast low-power decoders for RAMs. IEEE J. Solid-State Circuits 2001, 36, 1506–1515.
  57. Chang, M.-F.; Yang, S.-M.; Chen, K.-T.; Liao, H.-J.; Lee, R. Improving the speed and power of compilable SRAM using dual-mode selftimed technique. In Proceedings of the 2007 IEEE International Workshop on Memory Technology, Design and Testing, Taipei, Taiwan, 3–5 December 2007; pp. 57–60.
  58. Kim, T.-H.; Liu, J.; Keane, J.; Kim, C.H. A 0.2 V, 480 kb subthreshold SRAM with 1 k cells per bitline for ultra-low-voltage computing. IEEE J. Solid-State Circuits 2008, 43, 518–529.
  59. Wang, D.; Liao, H.; Yamauchi, H.; Chen, Y.; Lin, Y.; Lin, S.; Liu, D.C.; Chang, H.; Hwang, W. A 45nm dual-port SRAM with write and read capability enhancement at low voltage. In Proceedings of the 2007 IEEE International SOC Conference, Hsinchu, Taiwan, 26–29 September 2007; pp. 211–214. [Google Scholar]
  60. Karl, E.; Wang, Y.; Ng, Y.-G.; Guo, Z.; Hamzaoglu, F.; Bhattacharya, U.; Zhang, K.; Mistry, K.; Bohr, M. A 4.6 GHz 162 Mb SRAM design in 22 nm trigate CMOS technology with integrated active VMIN-enhancing assist circuitry. In Proceedings of the 2012 IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 19–23 February 2012. [Google Scholar]
  61. Chang, J.; Chen, Y.H.; Cheng, H.; Chan, W.M.; Liao, H.J.; Li, Q.; Chang, S.; Natarajan, S.; Lee, R.; Wang, P.W.; et al. A 20 nm 112Mb SRAM in highmetal-gate with assist circuitry for low-leakage and low-VMIN applications. In Proceedings of the 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers, San Francisco, CA, USA, 17–21 February 2013; pp. 316–333. [Google Scholar]
  62. Song, T.; Rim, W.; Park, S.; Kim, Y.; Yang, G.; Kim, H.; Baek, S.; Jung, J.; Kwon, B.; Cho, S.; et al. A 10 nm FinFET 128 Mb SRAM with assist adjustment system for power, performance, and area optimization. IEEE J. Solid State Circuits 2017, 52, 240–249. [Google Scholar] [CrossRef]
  63. Baek, G.; Jeong, H. High-Density SRAM Read Access Yield Estimation Methodology. IEEE Access 2021, 9, 128288–128301. [Google Scholar] [CrossRef]
Figure 1. Simplified schematic of the conventional SRAM for the read operation.
Figure 2. Schematic of two commonly used SAs in SRAM: (a) voltage-type latch SA (VLSA) and (b) current-type latch SA.
Figure 3. Operational waveforms of the read-related signals in the conventional SRAM.
Figure 4. Description of VLSA operation for sensing datum “1”.
Figure 5. Description of sensing failure in VLSA for sensing datum “1”.
Figure 6. Schematics of (a) Schmitt trigger-based SA (STSA) and (b) the voltage-boosted STSA (VBSTSA).
Figure 7. Schematics of two representative hybrid latch-type SAs: (a) variation-tolerant SA (VTSA) in [32] and (b) hybrid latch-type SA-QZ (HYSA-QZ) in [33].
Figure 8. (a) Schematic of capacitor-based threshold-matching SA (TMSA), (b) VLSA part in TMSA, and (c) capacitor-based threshold-matching circuit part.
Figure 9. Four-step operation of TMSA: (a) pre-charge phase, (b) access phase, (c) evaluation phase, and (d) latching phase.
Figure 10. Structure of VTS-SA.
Figure 11. Three operation phases of VTS-SA: (a) trip-point bias, (b) access phase, and (c) evaluation phase.
Figure 12. (a) Schematic of CSACOC and (b) operation waveforms of three control clock signals.
Figure 13. Three operation phases of CSACOC: (a) trip-point storage phase (ΦTrs = 1), (b) trip-point bias phase (ΦTrb = 1), and (c) evaluation phase (SAE = 1).
Figure 14. (a) Schematic of BP2SP and (b) its operational waveforms.
Figure 15. (a) Schematic of CCN-PP and (b) its operational waveforms.
Figure 16. Structure of OCCSA.
Figure 17. Structure of (a) SAOC, (b) DIBBSA-FL, (c) DIBBSA-PD, and (d) CDOR.
Figure 18. Minimum operating voltage of SAs as the technology scales.
Table 1. Comparison of SRAM sensing circuit designs.

| Structure | Figure/Ref. | Offset Reduction Technique | Components | Control Signals | Limitations |
|---|---|---|---|---|---|
| VLSA | Figure 2a | - | 7 TR | PCB, SAE | Large VOS |
| CLSA | Figure 2b | - | 9 TR | PCB, SAE | Increased VOS due to additional TR pair |
| STSA | Figure 6a | Driving internal nodes of VLSA (ZT and ZC) | 11 TR | PCB, SAE | Speed degradation due to stack |
| VBSTSA | Figure 6b | STSA + negative boosting VSS | 14 TR + NVG (share) | PCB, SAE, BSTEN | Necessitating for NVG (power/area cost) |
| VTSA | Figure 7a | Pre-charging SOT and SOC to SLT and SLC in CLSA | 9 TR | PCB, SAE | Speed degradation due to stack |
| HYSA-QZ | Figure 7b | Pre-charging output nodes and internal nodes of CLSA | 11 TR | PCB, SAE | Speed degradation due to stack |
| TMSA | Figure 8a | Capturing Vth of pull-down nFETs through paired cap | 11 TR + INV + Buffer + 2 C | PCB, SAE | Capacitor mismatch, cap power/area overhead |
| VTS-SA | Figure 10 | Capturing trip points of cross-coupled INVs with input acceptation via coupling cap pair | 12 TR + 2 C | EN, PCB, PRE, SAE | Capacitor mismatch, power/area overhead |
| CSACOC | Figure 12a | Capturing trip points of cross-coupled inverters via single capacitor | 16 TR + 1 C + 2 OR (shared) | PCB, SAE, ΦTrs, ΦTrb | Many switches, control signal circuit |
| BP2SP | Figure 14a | Capturing Vth of pre-amplifying pFET pair at BL pre-charge | 6 TR + SA | PCB, SAE | Bit-line floating, unstable pre-charge level, power/area overhead |
| CCN-PP | Figure 15a | Pre-amplifying BL via cross-coupled nFET pair, while capturing Vth with boosted VDD | 4 TR + 2 C + Buffer + 1 TR + SA | PCB, SAE, PBE | Bit-line floating, power/area overhead |
| OCCSA | [44] | Capturing Vth of MUX nFETs at BL pre-charge | 7 TR | PCB, SAE | Additional Vprebl voltage generator, different MUX signal |
| SAOC | [45] | Capturing Vth of input pFETs at SA pre-charge | 11 TR | PCB, SAE, OCEN | N1, N2 mismatch, control signal circuit |
| DIBBSA-FL, DIBBSA-PD | [46] | Body biasing | 7 TR, 9 TR + body contact | PCB, SAE | Inapplicable to the recent technology whose body effect is minimal |
| CDOR | [47] | Lowering input voltage according to SA mismatch | 15 TR | PCB, SAE, Q | Control signal circuit for added Q and different PCB, SAE operation |
Table 2. Quantitative comparison of SRAM SAs at VDD = 1.0 V in 28 nm technology.

| SA | Standard Dev. of VOS (mV) | BL Delay (ps) | SA Delay (ps) | Energy Consumption for Four BLs (fJ) | SA Energy Consumption (fJ) | Area (µm²) |
|---|---|---|---|---|---|---|
| VLSA | 16.46 | 203.86 | 15.25 | 93.86 | 2.94 | 6.48 |
| CLSA | 27.77 | 323.32 | 27.69 | 110.35 | 3.99 | 7.88 |
| STSA | 12.24 | 159.57 | 19.41 | 86.90 | 3.67 | 8.49 |
| VTSA | 11.54 | 152.21 | 17.67 | 87.47 | 3.37 | 6.79 |
| HYSA-QZ | 10.39 | 140.25 | 16.83 | 79.21 | 3.65 | 7.09 |
| TMSA | 9.96 | 138.46 | 13.47 | 95.43 | 25.23 | 8.63 |
| VTS-SA | 5.75 | 91.84 | 15.25 | 76.60 | 16.38 | 7.11 |
| CSACOC | 9.76 | 133.67 | 27.69 | 89.27 | 18.26 | 9.55 |
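The trade-offs in Table 2 are easier to read as changes relative to the conventional VLSA. The following is a minimal sketch, not part of the original analysis: the dictionary holds the Table 2 values as transcribed here, while the metric keys and the `relative_change` helper are hypothetical names introduced only for illustration, and the choice of VLSA as the baseline is an assumption.

```python
# Values transcribed from Table 2 (VDD = 1.0 V, 28 nm technology).
# Column order: std. dev. of VOS [mV], BL delay [ps], SA delay [ps],
# energy for four BLs [fJ], SA energy [fJ], area [um^2].
TABLE2 = {
    "VLSA":    (16.46, 203.86, 15.25,  93.86,  2.94, 6.48),
    "CLSA":    (27.77, 323.32, 27.69, 110.35,  3.99, 7.88),
    "STSA":    (12.24, 159.57, 19.41,  86.90,  3.67, 8.49),
    "VTSA":    (11.54, 152.21, 17.67,  87.47,  3.37, 6.79),
    "HYSA-QZ": (10.39, 140.25, 16.83,  79.21,  3.65, 7.09),
    "TMSA":    ( 9.96, 138.46, 13.47,  95.43, 25.23, 8.63),
    "VTS-SA":  ( 5.75,  91.84, 15.25,  76.60, 16.38, 7.11),
    "CSACOC":  ( 9.76, 133.67, 27.69,  89.27, 18.26, 9.55),
}

# Hypothetical short names for the six columns (not from the paper).
METRICS = ("sigma_vos_mV", "bl_delay_ps", "sa_delay_ps",
           "e_4bl_fJ", "e_sa_fJ", "area_um2")


def relative_change(name, baseline="VLSA"):
    """Percent change of each metric versus the baseline SA.

    Negative values mean the metric is lower than the baseline.
    """
    base = TABLE2[baseline]
    return {
        metric: 100.0 * (value - ref) / ref
        for metric, value, ref in zip(METRICS, TABLE2[name], base)
    }


if __name__ == "__main__":
    for sa in TABLE2:
        change = relative_change(sa)
        row = ", ".join(f"{m}: {change[m]:+.1f}%" for m in METRICS)
        print(f"{sa:8s} {row}")
```

Read this way, the table shows that a smaller offset spread shortens the BL delay and lowers the four-BL energy, while the capacitor-based designs (TMSA, VTS-SA, CSACOC) pay for it with a much higher SA energy.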
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.