# 200-MHz Single-Ended 6T 1-kb SRAM With 0.2313 pJ Energy/Access Using 40-nm CMOS Logic Process

Chua-Chin Wang, Senior Member, IEEE, and Chien-Ping Kuo

Abstract-A high-speed, low-energy SRAM design based on single-ended cells is demonstrated in this work. To resolve the poor SNM (static noise margin) of prior single-ended memory cells, the proposed SRAM cell is equipped with a pull-up PMOS and a high-Vthn NMOS foot switch such that the cell state is not disturbed by noise when the supply voltage is lowered. Moreover, a PFOS (Positive Feedback Op-Amp Sensing) circuit is added between the bitlines ($\overline{BL}$, BL) to reduce the read delay and generate a full-swing output. Last but not least, a voltage mode select (VMS) circuit is added to each column to reduce the static power of unselected cells such that the idle power is drastically reduced. The reason is that a lower voltage, still able to keep the state of the bits, is applied to those unselected cells. A 1-kb SRAM prototype based on the proposed cells with a BIST (built-in self-test) circuit is physically fabricated using a typical 40-nm CMOS logic process. The maximum operating clock rate is 200 MHz. The energy/access and energy/bit are measured on silicon to be 0.2313 pJ and 0.00723 pJ, respectively.

*Index Terms*—SRAM, single-ended cell, voltage mode select, SNM, PFOS.

## I. INTRODUCTION

ACCORDING to a recent ITRS report, memory devices in an SOC (system on chip) will occupy over 90% of the entire chip area very soon. SRAM has been widely used as cache in various processors, e.g., CPU, AP, etc., so reducing SRAM power certainly propels the advance of these processors. To meet the power and energy saving demand of SRAMs, three major design approaches have been proposed.

- 1) Current mode sense amplification [1]: During a read operation, the SA (sense amplifier) pre-determines the output result by sensing the differential current on two bitlines such that low power and high speed are feasible. Notably, the output delay is independent of the bitline capacitances in such a scenario.
- 2) Current compensation circuit [2]: When the SRAM begins to work, the current compensation circuit detects the leakage current of each bitline and then injects a proper current into the corresponding bitline. Although this approach cannot reduce the leakage current, it improves the access speed of the SRAM. It does not, however, show any energy-saving advantage, particularly for the standby cells.
- 3) Secondary supply [3]: By using another, higher supply voltage, the access of the SRAM cell is sped up. However, the penalty is that extra energy is needed. The standby power or energy of the un-accessed cells is also ignored.

Manuscript received May 26, 2021; accepted June 18, 2021. Date of publication June 24, 2021; date of current version August 30, 2021. This work was supported in part by Taiwan MOST under Grant 110-2218-E-110-008, Grant 110-2623-E-110-001, and Grant 109-2224-E-110-001. This brief was recommended by Associate Editor W. Shan. (*Corresponding author: Chua-Chin Wang.*)

Chua-Chin Wang is with the Department of Electrical Engineering and the Institute of Undersea Technology, National Sun Yat-sen University, Kaohsiung 804, Taiwan (e-mail: ccwang@ee.nsysu.edu.tw).

Chien-Ping Kuo is with the Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSII.2021.3091973.

Digital Object Identifier 10.1109/TCSII.2021.3091973

Though many SRAMs have been proposed to achieve low power operation, the generic loadless SRAM cell was considered to have the edge in small area and low power. However, the loadless cell is troubled by a potentially weak "0" state such that it is always under the threat of read/write disturbance and instability, particularly when it is used in a single-ended bitline structure. This issue also results in SNM (static noise margin) degradation. Many auxiliary circuits were reported to resolve these problems, including a readout assist circuit [4], a write assist circuit [5], etc. Only a few SRAM designs, however, were meant to carry out leakage detection and compensation [6]. In brief, none of the mentioned reports showed a comprehensive result that reduces the power, rejects the coupled noise, speeds up access, and increases the SNM at the same time.

To resolve all these issues in single-ended SRAM designs, several circuit features are proposed in this investigation to reduce power and enhance performance simultaneously from different aspects, including cell circuitry, bitline architecture, and power supply mode selection. More specifically, the proposed SRAM cell is equipped with a pull-up PMOS and a high-Vthn NMOS foot switch to assist the R/W if selected and to ground the stored state, respectively. Consequently, the SNM is drastically increased. Regarding the read speed, a PFOS (Positive Feedback Op-Amp Sensing) circuit is added between the bitlines ($\overline{BL}$, BL) to reduce the delay. To reduce power dissipation, a voltage mode select (VMS) circuit is added to every column of the SRAM, where the supply voltage of an unselected column is reduced to save idle power.

1549-7747 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.



Fig. 1. Proposed low energy-consuming SRAM system diagram.

## II. LOW-ENERGY SINGLE-ENDED 6T SRAM

With reference to Fig. 1, the architecture of the proposed 1-kb 6T SRAM is disclosed, where an SRAM array, a Controller circuit, a voltage mode select (VMS) circuit, an AVD (adaptive voltage detector), a PVB (pass-transistor gate voltage boosting) circuit, a PFOS, and a BIST (built-in self-test) circuit are included. Since the AVD and PVB circuits follow our previous circuitry reported in [6], their details are not covered in the following text. By contrast, the new features of the proposed SRAM are fully described.

### A. 6T SRAM Cell With Pull-Up Assist

As addressed earlier, prior single-ended SRAM designs suffered from poor SNM, though they could reduce the area and the power by saving one bitline. The major reasons are that the "0" state of the storage node is easily flipped for lack of a grounding path, and that the P-latch becomes a resistive network when the storage node is to be written with a new state. Therefore, two extra devices are added as shown in Fig. 2. The area of the proposed SRAM cell is $1.12 \times 0.91 \ \mu m^2$.

- 1) MP<sub>402</sub>: This is a high-Vthp PMOS driven by WLO. When the cell is written with "1", it is cut off to break the current path from VDDC to the P-latch such that Qb can be easily flipped. When the cell is written with "0", it is turned on to supply a current to the P-latch from VDDC. That is, it forms a pull-up assist loop.
- 2) MN<sub>401</sub>: This is a high-Vthn NMOS driven by node Qb. If the state at Qb is low, it is off to keep the charge at node Q high. By contrast, if Qb is high, MN<sub>401</sub> is on to tie Q firmly to ground.

Thanks to the pull-up assist added by $MP_{402}$ and the grounding path added by $MN_{401}$, the SNM of the cell is enlarged significantly.
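As a reading aid, the gating of the two added devices can be summarized in a small behavioral sketch. The function name `assist_device_states` and the Boolean encoding are illustrative assumptions, not part of the original design. Only the gating polarity comes from the description above: the PMOS conducts when WLO is low, and the NMOS conducts when Qb is high.

```python
def assist_device_states(wlo: int, qb: int) -> dict:
    """Return the on/off state of the two devices added to the cell.

    MP402 is a PMOS gated by WLO, so it conducts when WLO is low.
    MN401 is an NMOS gated by Qb, so it conducts when Qb is high.
    This is a purely logical sketch; no analog behavior is modeled.
    """
    if wlo not in (0, 1) or qb not in (0, 1):
        raise ValueError("wlo and qb must be 0 or 1")
    return {
        "MP402_on": wlo == 0,  # pull-up assist loop active
        "MN401_on": qb == 1,   # node Q tied to ground
    }


if __name__ == "__main__":
    for wlo in (0, 1):
        for qb in (0, 1):
            print(f"WLO={wlo} Qb={qb} -> {assist_device_states(wlo, qb)}")
```

With WLO low and Qb high, both devices conduct, which corresponds to the stored "0" case in which the pull-up assist holds Qb while MN<sub>401</sub> grounds Q.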

### B. Positive Feedback Op-Amp Sensing (PFOS)

Another hazard in prior single-ended SRAM designs is that the output was never ensured to be full swing, particularly when reading "0". A PFOS circuit is therefore added between $\overline{BL}$ and BL, as shown in Fig. 2, and its schematic is given in Fig. 3. The function of the proposed PFOS is as follows.



Fig. 2. Proposed 6T SRAM cell (a) schematic; (b) cell layout.



Fig. 3. Positive feedback op-amp sensing (PFOS) circuit.



Fig. 4. Voltage mode select circuit.

- (1) SA\_EN = 1: $MN_{603}$ is turned on to ground BL, while $MP_{602}$ is cut off to block any current from being injected into the differential amplifier.
- (2) SA\_EN = 0: MN<sub>603</sub> is off. MP<sub>602</sub> is turned on to drive MN<sub>601</sub> from the drain of MP<sub>602</sub> and the source of MP<sub>601</sub>, constituting a positive feedback path. Then, the output of the differential amplifier is pulled high to drive MP<sub>604</sub> via an inverter so that BL becomes high.

### C. Voltage Mode Select (VMS)

To further reduce the standby power when the cell is not accessed, a power-gating mechanism is employed as shown in Fig. 4, where three low-Vthp PMOS transistors are added to every column of memory cells.

- (1) If any cell in the same column is accessed, WL is high to cut off MP<sub>706</sub>. MP<sub>704</sub> is turned on by WLB (low) such that the entire column is driven by the normal VDD.
- (2) If no cell in the column is accessed, WL is low and WLB is high. A reduced supply voltage, VDD - Vthp, is coupled to the entire column, namely VDDC in the memory cell. Thus, the supply voltage is dropped by Vthp to save power in such an idle state.

Fig. 5. (a) Read cycle timing; (b) Write cycle timing.
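The column supply selection described above can be sketched behaviorally as follows. The function name `column_supply` and the threshold magnitude of 0.3 V are assumptions made only for illustration, since the actual Vthp of the low-Vthp PMOS devices is not given in the text. The nominal 0.9 V supply is the value reported for the prototype.

```python
def column_supply(word_lines, vdd=0.9, vthp=0.3):
    """Return the VDDC voltage (in volts) applied to one column.

    word_lines: iterable of 0/1 WL values seen by the column.
    vdd:        nominal supply voltage in volts.
    vthp:       PMOS threshold magnitude in volts (ASSUMED value,
                not taken from the paper).
    """
    accessed = any(word_lines)
    # Accessed column: full VDD. Idle column: supply dropped by Vthp.
    return vdd if accessed else vdd - vthp


if __name__ == "__main__":
    print("accessed column:", column_supply([0, 1, 0, 0]), "V")
    print("idle column:    ", round(column_supply([0, 0, 0, 0]), 3), "V")
```

The sketch only captures the selection logic. The power actually saved depends on how the cell leakage scales with the reduced VDDC, which is not modeled here.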

### D. R/W Cycle Timing

Referring to Figs. 2 and 3, the read cycle operations of the proposed SRAM cell are shown in Fig. 5(a), where the three major steps are R1 (pre-discharge), R2 (WL active), and R3 (output assert).

• Read "1": (R1) Pre-discharge is asserted before WL=1 to ground $\overline{BL}$. (R2) WL=WA=1 to turn on MN<sub>402</sub> and MN<sub>403</sub>, respectively. WLO is set high to shut off MP<sub>402</sub>. (R3) Thus, "0" at Qb is coupled to the PFOS of $\overline{BL}$ via MN<sub>402</sub> and MN<sub>403</sub> to raise BL high.

• Read "0": (R1) Pre-discharge is asserted. (R2) WL=1, WA=1, WLO=0 to turn on $MN_{402}$, $MN_{403}$, $MP_{402}$, respectively. (R3) Owing to the pull-up assist loop formed by turning on $MP_{402}$, "1" at Qb is stable and is coupled to the PFOS of $\overline{BL}$ via $MN_{402}$ and $MN_{403}$ to pull down BL.

• Standby: The SRAM enters the standby mode after Pre-discharge is deasserted.

Similarly, the write cycle operations are given in Fig. 5(b), where the three major steps are W1 (pre-discharge), W2 (WL active), and W3 (state assert).

• Write "1": (W1) Pre-discharge is asserted high to ground $\overline{\text{BL}}$. (W2) WL=1, WA=1 to turn on MN<sub>403</sub> and MN<sub>402</sub>, respectively. WLO is set high to shut off MP<sub>402</sub>. Qb is then low to turn on MP<sub>401</sub> and cut off MN<sub>401</sub>. (W3) Node Q then goes high.

• Write "0" : (W1) Pre-discharge is asserted. (W2) WL=0 to shut off  $MN_{403}$ . WA is high to turn on  $MN_{402}$ , and WLO is low to activate the pull-up assist loop such that Qb becomes "1". (W3) Then,  $MN_{401}$  is turned on to ensure Q=0.
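The control-signal settings of the four operations above can be collected into a lookup table. This is only a bookkeeping sketch: the dictionary name `CONTROL`, the operation keys, and the helper `pull_up_assist_active` are hypothetical names introduced here, while the signal values are transcribed from the read and write descriptions.

```python
# WL / WA / WLO values in step R2 or W2 of each operation,
# transcribed from the text (the key names are illustrative).
CONTROL = {
    "read1":  {"WL": 1, "WA": 1, "WLO": 1},
    "read0":  {"WL": 1, "WA": 1, "WLO": 0},
    "write1": {"WL": 1, "WA": 1, "WLO": 1},
    "write0": {"WL": 0, "WA": 1, "WLO": 0},
}


def pull_up_assist_active(op: str) -> bool:
    """MP402 is a PMOS gated by WLO, so the assist loop is on when WLO is 0."""
    return CONTROL[op]["WLO"] == 0


if __name__ == "__main__":
    for op, signals in CONTROL.items():
        print(f"{op:7s} {signals} assist={pull_up_assist_active(op)}")
```

The table makes one pattern explicit: the pull-up assist loop is active exactly in the two operations that involve Qb = "1", i.e., read "0" and write "0".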

## III. SIMULATION AND MEASUREMENT

The proposed design is realized in the TSMC 45 nm CMOS LOGIC (40G) process. The die photo and the core layout are shown in Fig. 6, where the chip area is $595 \times 595 \ \mu m^2$ and the core area is $200 \times 200 \ \mu m^2$. Notably, since the foundry requires a minimum metal density, the top of the chip is covered by metal layers such that the details of individual blocks are not visible in the die photo.

### A. All-PVT-Corner Simulations

To validate the advantages resulting from the features of the proposed SRAM and its cells, all-PVT-corner simulations (5 process corners, 3 VDD voltages, 5 temperatures) are carried out. Fig. 7 shows all the R/W functions, where the worst-case R/W delay is 1.43 ns. As for the noise margins, Fig. 8 and Fig. 9 show the SNM and DNM (dynamic noise margin) of the proposed 6T SRAM cells, respectively. Notably, Fig. 8(a) is the SNM of traditional 6T SRAM cells (225/254 mV), and Fig. 8(b) is that of the proposed cell, where it is enlarged to 706/377 mV, which is the best by far. Another feature is that the cell resists any noise under 100 ps@0.3 V given VDD = 0.9 V and a 200 MHz system clock.

Fig. 6. Die photo of the proposed 1-kb SRAM.

Fig. 7. All-PVT-corner post-layout simulation.

Fig. 8. SNM plot (a) Traditional 6T SRAM cell; (b) Proposed SRAM cell.

Fig. 9. DNM plot.
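The size of the simulation sweep and the timing margin at 200 MHz can be checked with a few lines. The corner names, the three supply values, and the five temperatures below are assumptions chosen only to make the enumeration concrete; the text states only their counts (5, 3, and 5), the nominal 0.9 V supply, the 200 MHz clock, and the 1.43 ns worst-case delay.

```python
from itertools import product

# ASSUMED sweep values; only the counts (5 x 3 x 5) come from the text.
PROCESS = ["TT", "FF", "SS", "FS", "SF"]
VDD_V = [0.81, 0.90, 0.99]          # assumed 0.9 V +/- 10 %
TEMP_C = [-40, 0, 25, 75, 125]      # assumed temperature points

corners = list(product(PROCESS, VDD_V, TEMP_C))
print("number of PVT corners:", len(corners))

# Timing margin: clock period versus the reported worst-case R/W delay.
clock_hz = 200e6
period_ns = 1e9 / clock_hz
worst_delay_ns = 1.43
print("clock period (ns):", period_ns)
print("delay fits in one period:", worst_delay_ns < period_ns)
```

The worst-case delay occupies well under half of the 5 ns clock period, which is consistent with the reported 200 MHz maximum clock rate.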

### B. On-Silicon Measurement

On-silicon measurement and testing are carried out at the Tainan Branch of TSRI (Taiwan Semiconductor Research Institute), where Fig. 10 shows the setup of the measurement site. Figs. 11(a) and (b) show the timing waveforms without and with pull-up assist, respectively. The delay without pull-up assist is 1.0 ns, while that with pull-up assist is reduced to 0.988 ns. The standby (idle) current, shown in the upper right corner of Fig. 10, is 3.197 nA, so the standby power is 2.8773 nW. A total of 6 prototype dies are measured over 30 times to confirm that the energy/access and energy/bit are 0.2313 pJ and 0.00723 pJ, respectively, which are even lower than the corresponding simulation numbers.

TABLE I

Performance Comparison With Prior CMOS SRAMs

|                       | [7]    | [8]     | [9]     | [10]   | ours    |
|-----------------------|--------|---------|---------|--------|---------|
| Year                  | 2019   | 2019    | 2020    | 2021   | 2021    |
| Pub.                  | TCAS2  | TCAS1   | TCAS2   | TCAS1  |         |
| Cell type             | 6T     | 14T     | 8T      | 6T     | 6T      |
| Cell area $(\mu m^2)$ | 1.03   | 5.70    | 1.56    | 1.83   | 1.02    |
| VDD (V)               | 0.65   | 1.2     | 0.36    | 1.2    | 0.9     |
| SNM (mV)              | 135    | N/A     | 190     | N/A    | 377     |
| PDP (fJ)              | 131.58 | N/A     | 4454.4  | 233.38 | 47.382  |
| Clock (MHz)           | 20     | 10      | 0.25    | 935    | 200     |
| Capacity (kb)         | 1      | 1       | 32      | 4      | 1       |
| Word length (bits)    | 32     | 128     | 128     | 32     | 32      |
| Energy/access (pJ)    | 0.256  | 24.9    | 0.3     | 1.04   | 0.2313  |
| Energy/bit (pJ)       | 0.008  | 0.19453 | 0.00234 | 0.0325 | 0.00723 |

Fig. 10. Measurement setup.

Fig. 11. Measurement (a) without; and (b) with pull-up assist at 200 MHz.
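The standby power and the delay improvement quoted in the measurement results can be reproduced from the reported numbers. The sketch assumes that the standby current was measured at the 0.9 V supply listed in Table I, which is consistent with the reported 2.8773 nW; the variable names are illustrative.

```python
VDD = 0.9              # V, supply listed in Table I (assumed for this measurement)
I_STANDBY = 3.197e-9   # A, measured standby current

# Standby power P = VDD * I, expressed in nanowatts.
p_standby_nw = VDD * I_STANDBY * 1e9
print(f"standby power: {p_standby_nw:.4f} nW")

# Relative delay reduction obtained with the pull-up assist.
delay_without_ns = 1.0
delay_with_ns = 0.988
reduction_pct = (delay_without_ns - delay_with_ns) / delay_without_ns * 100
print(f"delay reduction: {reduction_pct:.1f} %")
```

The measured delay improvement from the pull-up assist is small, about 1.2 %, so its main benefit in this design is the enlarged noise margin rather than speed.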

Several prior SRAM designs using a 40-nm CMOS process and ours are tabulated in Table I. Although our energy/bit is the second best, next to [9], the clock rate of [9] is the lowest among all. Besides the second-best energy/bit, the proposed 6T single-ended SRAM demonstrates the lowest energy/access, the second-highest clock rate, and the largest SNM. This justifies that the added pull-up assist loop, VMS, and grounding foot switch indeed resolve the poor SNM and power dissipation problems.
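The energy/bit row of Table I is consistent with dividing the energy/access by the word length, and the ranking claimed above can be checked directly. The dictionary below copies the two relevant rows of Table I; the variable names are illustrative.

```python
# (energy/access in pJ, word length in bits), copied from Table I
TABLE = {
    "[7]":  (0.256, 32),
    "[8]":  (24.9, 128),
    "[9]":  (0.3, 128),
    "[10]": (1.04, 32),
    "ours": (0.2313, 32),
}

# Energy per bit = energy per access / word length.
energy_per_bit = {name: e / w for name, (e, w) in TABLE.items()}

# Designs ordered from lowest to highest energy per bit.
ranking = sorted(energy_per_bit, key=energy_per_bit.get)
for name in ranking:
    print(f"{name:>5}: {energy_per_bit[name]:.5f} pJ/bit")

# Design with the lowest energy per access.
lowest_access = min(TABLE, key=lambda name: TABLE[name][0])
print("lowest energy/access:", lowest_access)
```

The ordering places [9] first and the proposed design second in energy/bit, while the proposed design has the lowest energy/access, in agreement with the comparison in the text.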

## IV. CONCLUSION

A low-energy SRAM design is demonstrated in this investigation, where several features are added to resolve the SNM and power problems of prior SRAMs. The added pull-up assist loop and the grounding switch not only enlarge the SNM, but also enhance the noise rejection capability. The proposed PFOS circuitry successfully prevents the poor voltage swing when reading "0", which was a potential threat to the correctness of prior single-ended SRAMs. The VMS circuit lowers the idle power of every unselected column to reduce the overall power dissipation.

## ACKNOWLEDGMENT

The authors would like to express their appreciation to TSRI (Taiwan Semiconductor Research Institute) in NARL (National Applied Research Laboratories), Taiwan, for the assistance with EDA tool support, fabrication service, and the measurement setup.

## REFERENCES

- [1] A.-T. Do, Z.-H. Kong, K.-S. Yeo, and J. Y. S. Low, "Design and sensitivity analysis of a new current-mode sense amplifier for low-power SRAM," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 2, pp. 196–204, Feb. 2011.
- [2] K. Agawa, H. Hara, T. Takayanagi, and T. Kuroda, "A bitline leakage compensation scheme for low-voltage SRAMs," *IEEE J. Solid-State Circuits*, vol. 36, no. 5, pp. 726–734, May 2001.
- [3] D. Kim, G. Chen, M. Fojtik, M. Seok, D. Blaauw, and D. Sylvester, "A 1.85fW/bit ultra low leakage 10T SRAM with speed compensation scheme," in *Proc. IEEE Int. Symp. Circuits Syst.*, Rio de Janeiro, Brazil, May 2011, pp. 69–72.
- [4] V. Sharma, S. Cosemans, M. Ashouei, J. Huisken, F. Catthoor, and W. Dehaene, "A 4.4 pJ/access 80 MHz, 128 kbit variability resilient SRAM with multi-sized sense amplifier redundancy," *IEEE J. Solid-State Circuits*, vol. 46, no. 10, pp. 2416–2430, Oct. 2011.
- [5] M.-H. Tu, J.-Y. Lin, M.-C. Tsai, S.-J. Jou, and C.-T. Chuang, "Single-ended subthreshold SRAM with asymmetrical write/read-assist," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 12, pp. 3039–3047, Dec. 2010.
- [6] C.-C. Wang, C.-H. Liao, and S.-Y. Chen, "A single-ended disturb-free 5T loadless SRAM with leakage sensor and read delay compensation using 40 nm CMOS process," in *Proc. IEEE Int. Symp. Circuits Syst.*, Melbourne, VIC, Australia, Jun. 2014, pp. 1126–1129.
- [7] N. Surana and J. Mekie, "Energy efficient single-ended 6-T SRAM for multimedia applications," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 66, no. 6, pp. 1023–1027, Jun. 2019.
- [8] W.-G. Ho, K.-S. Chong, T. T.-H. Kim, and B.-H. Gwee, "A secure data-toggling SRAM for confidential data protection," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 66, no. 11, pp. 4186–4199, Nov. 2019.
- [9] A. T. Do, S. M. A. Zeinolabedin, and T. T.-H. Kim, "Energy-efficient data-aware SRAM design utilizing column-based data encoding," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 67, no. 10, pp. 2154–2158, Oct. 2020.
- [10] J. Chen, W. Zhao, Y. Wang, and Y. Ha, "Analysis and optimization strategies toward reliable and high-speed 6T Compute SRAM," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 68, no. 4, pp. 1520–1531, Apr. 2021.
