# GA-Optimized 6.0-Gbps DDR5 SDRAM I/O Buffer Design For 16-nm FinFET CMOS Process

Jhih-Ying Ke\*<sup>1</sup>, Lean Karlo Santos Tolentino\*<sup>†‡1</sup>, Cheng-Yao Lo\*, Tzung-Je Lee\*, and Chua-Chin Wang\*<sup>§</sup>

\*Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan

<sup>†</sup>Department of Electronics Engineering, Technological University of the Philippines, Manila, Philippines

<sup>‡</sup>Center for Artificial Intelligence and Nanoelectronics, Integrated Research and Training Center, Technological University of the Philippines, Manila, Philippines

<sup>§</sup>Department of Electronics & Communications Engineering, Vel Tech University, Chennai, India

Email: ccwang@ee.nsysu.edu.tw

Abstract-DDR5 SDRAMs impose many requirements on duty cycle, system voltage, slew rate, etc., so a specialized design approach for FinFET-based I/O buffers is needed. A genetic algorithm (GA) was used to model process, voltage, and temperature (PVT) variations and to determine how temperature and voltage affect the I/O buffer's characteristics. Interestingly, the study found that the temperature detector circuit was unnecessary, saving power and area. However, voltage variations significantly affected the slew rate (SR). A new Voltage Detector circuit using ultralow threshold voltage (ULVT) transistors was therefore introduced. The Voltage Level Converter, Pre-Driver, and Digital Logic Control circuits improved the slew rate and throughput while stabilizing the Output Buffer Stage. The I/O buffer was implemented in TSMC's 16-nm FinFET CMOS technology with a core area of 0.19339×0.056957 mm<sup>2</sup>. The device operated reliably at 6.0 Gbps, with a slew rate of 8.93 V/ns (0.8 V VDDIO) and 14.7 V/ns (1.2 V VDDIO maximum), and a duty cycle of 51.8% (0.8 V VDDIO) and 48.4% (1.2 V VDDIO maximum). By auto-tuning the driving current, the high and low voltage modes attained SR improvements of 19% and 30%, respectively.

Index Terms-I/O buffer, DDR5, FinFET, genetic algorithm, slew rate

## I. INTRODUCTION

High-speed technologies, especially DDR5 SDRAMs, place strong demands on the input/output (I/O) driver interface frequency, and output signal quality standards have been raised accordingly. For example, DDR5 SDRAMs follow JEDEC's DDR5DB01 (DDR5 Data Buffer Specification) [1]. SDRAMs must meet operational requirements such as a 1.1-V VDDIO, a 0.4-0.9 pF I/O pad load capacitance (C<sub>L</sub>), and a 50±5% duty cycle ratio (DR). Due to process, voltage, and temperature (PVT) variations, integrating legacy ICs with emerging technologies like FinFETs is difficult. Previous efforts have been made to keep the I/O buffer slew rate (SR) within an acceptable range [2]. Previously proposed output buffers meet SR specifications and operate at 1.6 and 2.5 GHz [3]-[6]. However, these designs fail to meet DDR5 SR, I/O voltage, and duty cycle ratio requirements. Recently developed 2.6-GHz buffers for DDR5 use clamper circuitry as VDDIO detectors [7]. Additionally, non-overlapping circuits were used to detect and regulate DR and voltage SR according to specified parameters. That design also included an input buffer stage with ultra-low threshold voltage (ULVT) transistors to reduce the transistor count relative to standard V<sub>th</sub> (SVT) and low V<sub>th</sub> (LVT) implementations. This technique eliminated the need for a keeper in the input inverter. However, it still fell short of DDR5 specifications.

<sup>1</sup>Equally credited authors.

\*TSRI's EDA tool was used by the researchers. This project received funding from the NSTC of Taiwan under grants 112-2221-E-110-063-MY3 and 110-2221-E-110-063-MY2.

This work proposes a 6.0-Gbps digital I/O buffer in FinFET CMOS that meets DDR5 buffer specifications. We used a genetic algorithm (GA) to simulate and model the I/O buffer's timing parameters across PVT corners and to determine how temperature fluctuations affect it. Interestingly, the investigation revealed that the temperature detection circuit was unnecessary, which reduces power consumption and occupied area. Further investigation revealed that voltage fluctuations significantly affect the slew rate. A new Voltage Detector circuit using ULVT transistors was therefore introduced. The Voltage Level Converter circuit, which uses advanced techniques to improve the Pre-Driver and Digital Logic Control circuits, stabilizes the Output Buffer Stage transistors.

## II. GENETIC ALGORITHM (GA) FOR BUFFER DESIGN

To obtain SR values under various PVT conditions, MP1a, MN1a, MP2, and MN2 in Fig. 1 are activated, while MP1b, MN1b, MP1c, MN1c, MP1d, and MN1d are deactivated. The datasets resulting from HSPICE simulations consist of the N process corner (P<sub>N</sub>) and the P process corner (P<sub>P</sub>), encoded as -1 for slow (S), 0 for typical (T), and 1 for fast (F); the temperature T, ranging from 0 to 70°C; the output buffer's supply voltage V, which is 0.7-0.9 V in the 16-nm FinFET process; and the SR in V/ns. The dataset is split into 80% for training and 20% for testing.
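The corner encoding and the 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the sample temperature and voltage points, and the shuffle seed are assumptions, and the SR column is omitted because it would come from the HSPICE runs.

```python
import itertools
import random

# Encoding from the text: slow = -1, typical = 0, fast = 1.
CORNER_CODE = {"S": -1, "T": 0, "F": 1}

def encode_corner(corner):
    """Map a two-letter corner such as 'FS' (NMOS first, PMOS second) to (P_N, P_P)."""
    n_letter, p_letter = corner
    return CORNER_CODE[n_letter], CORNER_CODE[p_letter]

def build_grid(corners, temps_c, voltages):
    """Enumerate every (corner, T, V) combination as one input row."""
    rows = []
    for corner, temp, volt in itertools.product(corners, temps_c, voltages):
        pn, pp = encode_corner(corner)
        rows.append({"PN": pn, "PP": pp, "T": temp, "V": volt})
    return rows

def split_80_20(rows, seed=0):
    """Shuffle reproducibly, then keep 80% for training and 20% for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample points inside the ranges quoted in the text.
grid = build_grid(["TT", "FF", "FS", "SF", "SS"], [0, 35, 70], [0.7, 0.8, 0.9])
train_rows, test_rows = split_80_20(grid)
print(len(train_rows), len(test_rows))
```

Shuffling before the cut keeps every corner represented in both subsets on average, which matters when the grid is enumerated corner by corner.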

A general regression neural network (GRNN) was first implemented, but the error of its SR model is about  $\pm 20\%$  [8], [9]. The proposed design approach using GA is shown in Fig. 2. First, an equation that accurately predicts the slew rate from the given inputs is determined. The genetic programming model undergoes training and validation using a substantial dataset of P<sub>N</sub>, P<sub>P</sub>, V, T, and SR. The terminal set (variables like P<sub>N</sub>, P<sub>P</sub>, V, T) and the function set (mathematical operations like +, -,  $\times$ ,  $\div$ ) from which the genetic programming algorithm forms candidate equations are established simultaneously. Executing the algorithm on the dataset evolves equations across generations, optimizing the fitness function [10]–[12]. Then, a validation dataset is used to check the most refined equation for accuracy and reliability. This evolved equation captures the complex relationship between the input parameters and SR. Next, the equation is rigorously tested with new data to prove its predictive power. The algorithm can be rerun with new datasets or with revised terminal and function sets to further improve precision and predictive power.



Fig. 1. Proposed I/O buffer for DDR5 SDRAMs.



Fig. 2. Proposed design approach using GA.
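To make the flow of Fig. 2 concrete, the sketch below evolves candidate equations from a terminal set and a function set and scores them by RMSE. It is a generic, minimal genetic-programming loop, not the authors' tool: the population size, probabilities, depth limit, selection scheme (elitism with subtree crossover and mutation), and all function names are assumptions, and the data rows are a made-up stand-in for the HSPICE dataset.

```python
import math
import operator
import random

TERMINALS = ["PN", "PP", "V", "T"]   # terminal set named in the text
MAX_DEPTH = 6                        # hypothetical limit against tree bloat

def protected_div(a, b):
    """Divide, but return 1.0 when the divisor is (almost) zero."""
    return a / b if abs(b) > 1e-9 else 1.0

FUNCTIONS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": protected_div}   # function set

def random_tree(rng, depth):
    """Grow a random expression: a variable, a constant, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(TERMINALS)
        return round(rng.uniform(-5.0, 5.0), 2)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def tree_depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(tree_depth(tree[1]), tree_depth(tree[2]))

def evaluate(tree, row):
    """Evaluate an expression tree on one data row (a dict of inputs)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, row), evaluate(right, row))
    return row[tree] if isinstance(tree, str) else tree

def rmse(tree, rows):
    """Fitness: root-mean-square error against the SR column (lower is better)."""
    try:
        total = sum((evaluate(tree, r) - r["SR"]) ** 2 for r in rows)
        value = math.sqrt(total / len(rows))
    except OverflowError:
        return float("inf")
    return value if math.isfinite(value) else float("inf")

def random_subtree(rng, tree):
    """Walk down from the root and return a randomly chosen subtree."""
    while isinstance(tree, tuple) and rng.random() < 0.7:
        tree = tree[rng.choice((1, 2))]
    return tree

def replace_random(rng, tree, new):
    """Return a copy of tree with one randomly chosen subtree replaced by new."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return new
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, replace_random(rng, left, new), right)
    return (op, left, replace_random(rng, right, new))

def evolve(rows, generations=20, pop_size=60, seed=0):
    """Evolve expressions with elitism, subtree crossover, and subtree mutation."""
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: rmse(t, rows))
        elite = pop[: pop_size // 5]   # the best 20% survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(elite), rng.choice(elite)
            child = replace_random(rng, a, random_subtree(rng, b))       # crossover
            if rng.random() < 0.3:
                child = replace_random(rng, child, random_tree(rng, 2))  # mutation
            children.append(child if tree_depth(child) <= MAX_DEPTH else a)
        pop = elite + children
    return min(pop, key=lambda t: rmse(t, rows))

# Made-up stand-in data; the paper's dataset comes from HSPICE simulations.
rows = [{"PN": pn, "PP": pp, "V": v, "T": t, "SR": 10.0 * v + 0.5 * pn}
        for pn in (-1, 0, 1) for pp in (-1, 0, 1)
        for v in (0.7, 0.8, 0.9) for t in (0, 35, 70)]
best = evolve(rows)
print(best, rmse(best, rows))
```

Because the elite is carried over unchanged, the best RMSE can never get worse from one generation to the next; a held-out validation set, as described in the text, would still be needed to judge the final equation.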

The SR model generated by the GA is given in Eqn. (1). The resulting  $R^2$  is 0.98705 and the RMSE is 0.21455. Because the temperature coefficient in Eqn. (1) is small (0.00758) while the voltage coefficient is large (13.9), the model indicates that a good voltage sensor/detector is what SR adjustment across PVT corners requires. A temperature sensor/detector is unnecessary, so the I/O buffer saves power and area.

$$SR = 0.758P_N + 0.758P_P - 0.00758T + 13.9V + 4.23 \tanh V + 2.07P_P^2V^2 + 0.944P_P^2 - 0.944P_NP_PV - 5.04 \tag{1}$$
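As an illustration of why Eqn. (1) points to voltage rather than temperature, the sketch below evaluates the model at the typical corner and compares the SR change over the full 0-70°C range with the change over the 0.7-0.9 V range. The function name and the chosen evaluation points (0.8 V, 25°C) are assumptions made for this example.

```python
import math

def sr_model(pn, pp, temp_c, volt):
    """Slew rate in V/ns predicted by Eqn. (1); pn and pp are -1, 0, or 1."""
    return (0.758 * pn + 0.758 * pp - 0.00758 * temp_c + 13.9 * volt
            + 4.23 * math.tanh(volt) + 2.07 * pp**2 * volt**2
            + 0.944 * pp**2 - 0.944 * pn * pp * volt - 5.04)

# Typical corner (P_N = P_P = 0); evaluation points are illustrative choices.
temp_span = sr_model(0, 0, 0, 0.8) - sr_model(0, 0, 70, 0.8)
volt_span = sr_model(0, 0, 25, 0.9) - sr_model(0, 0, 25, 0.7)

print(f"SR change over 0-70 degC at 0.8 V: {temp_span:.3f} V/ns")
print(f"SR change over 0.7-0.9 V at 25 degC: {volt_span:.3f} V/ns")
```

The temperature term contributes 0.00758 × 70 ≈ 0.53 V/ns across the whole range, whereas the voltage terms shift the SR by several V/ns, which is consistent with dropping the temperature detector.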

## III. SYSTEM ARCHITECTURE

Based on the GA analysis, an I/O buffer compliant with DDR5 standards and featuring slew-rate self-adjustment is shown in Fig. 1. The following blocks can be realized by existing circuits, as stated below [7], [13], [14]:

- 1) Floating N-well circuits are used to bypass the leakage current path that arises during mode switching at 1.1 V. To prevent immediate overstress on the Output Buffer Stage, pre-charging control is implemented using the SVT transistors MP3 and MN3.
- 2) A VDDIO detector supplies an appropriate bias voltage, VD, which prevents the transistors from experiencing overvoltage (1.1 V), since VDDIO can be chosen as one of the two voltage modes [7].
- 3) Non-Overlap circuits generate two distinct signals that do not overlap in time. This prevents transistors MP1a, MP1b, MP1c, MN1a, MN1b, and MN1c from turning on simultaneously during transitions.
- 4) A Leakage Reduction circuit mitigates the adverse effects of leakage currents arising from over-voltage hazards when VPAD is equal to VDDIO.
- 5) The Process Detector analyzes process variations to determine the number of compensating transistors needed. Five process corners (TT, FF, FS, SF, SS) must be considered for the P-type and N-type devices [14].
- 6) All transistors in the Output Buffer Stage are ULVT devices. To prevent transistor over-voltage at VDDIO, which is set at 1.1 V, transistors MP1a, MP1b, MP1c, and MP1d are connected in parallel and stacked on MP2. When a corner is detected, the Process and Voltage (PV) Detector in Fig. 1 activates these compensating transistors. By continuously activating these transistors, we reduce output current fluctuations and increase the driving current and slew rate. In parallel, MN1a, MN1b, MN1c, and MN1d divert an equal amount of driving current, increasing the falling-edge slew rate.

The subsequent subsections will address the remaining novel blocks.

### A. Voltage Detector

As shown in Fig. 3, the Voltage Detector has 16 uniformly sized, diode-connected ULVT PMOS devices. The chain is divided into three sections: VDD to V1, V1 to V2, and V2 to GND. All PMOS bulks are tied to their sources to reduce the body effect. This arrangement allows V1 and V2 to accurately track the supply voltage (VDD-10%×VDD, VDD, VDD+10%×VDD) despite manufacturing process variations. V1 and V2 are compared to VB1 and VB2, respectively. The digital code of any supply voltage is represented by the output signals  $V_{code}[2]$  and  $V_{code}[1]$ . The Voltage Detector thus determines whether the corresponding compensating transistor should be activated in response to a voltage variation and sends the codes to the Digital Logic Control circuit for encoding.



Fig. 3. Voltage Detector circuit.
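A behavioral sketch of the detector may help picture how two taps and two comparators yield a 2-bit code for the three supply levels. It is not the circuit of Fig. 3: the tap positions (12 and 8 devices above ground), the reference voltages VB1 and VB2, the code assignment, and the 0.8-V nominal VDD are all hypothetical values chosen for illustration, and the divider is idealized so that each of the 16 equal devices drops the same voltage.

```python
NUM_DEVICES = 16  # from the text: 16 uniformly sized diode-connected devices

def tap_voltage(vdd, devices_below):
    """Idealized divider: each identical device drops VDD / 16."""
    return vdd * devices_below / NUM_DEVICES

def voltage_code(vdd, v1_tap=12, v2_tap=8, vb1=0.63, vb2=0.38):
    """Return (Vcode[2], Vcode[1]) from two comparators.

    Tap positions and references are hypothetical, not taken from Fig. 3.
    """
    v1 = tap_voltage(vdd, v1_tap)
    v2 = tap_voltage(vdd, v2_tap)
    return int(v1 > vb1), int(v2 > vb2)

nominal_vdd = 0.8  # assumed nominal core supply
for vdd in (0.9 * nominal_vdd, nominal_vdd, 1.1 * nominal_vdd):
    print(f"VDD = {vdd:.2f} V -> Vcode[2:1] = {voltage_code(vdd)}")
```

With these assumed values the low, nominal, and high supplies map to three distinct codes, which is the property the Digital Logic Control circuit relies on.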

### B. Voltage Level Converter

As VDDIO can range from 0.8 V to 1.2 V, a Voltage Level Converter (shown in Fig. 4) is needed to prevent excessive voltage at the output stage [13]. The VD value (0 or 0.4 V) determines whether the DH and DL output voltages need to be raised. The data signal is coupled to the 4-inverter chain to produce a stable output DL. The inverter chains consist of ULVT devices, MP1a~d, and deep n-well SVT devices, MN1a~d.

### C. Digital Logic Control and Pre-Driver Circuits

The Digital Logic Control circuit of Fig. 1 is detailed in Fig. 5. Pcode[2:1] and Ncode[2:1] are encoded from the outputs of the PV Detector in Fig. 1. The circuit's outputs, Vgp[3:1] and Vgn[3:1], are then generated and fed through the Pre-Driver circuits in Fig. 5 to control the compensating transistors in the Output Stage. The Output Stage in Fig. 1 is deactivated by OE when the Input Stage is activated. The signals VPL and VNL from the Leakage Reduction circuit turn off the transistors when VPAD = VDDIO.

## IV. POST-LAYOUT SIMULATIONS

The I/O buffer was designed using TSMC 16-nm technology, as shown in Fig. 6. The core and chip areas are  $0.19339 \times 0.056957 \text{ mm}^2$  and  $0.731 \times 0.187957 \text{ mm}^2$ , respectively. The MIM-type C<sub>VD</sub> overlaps the core because it occupies higher metal layers, which reduces the chip area.

Fig. 4. Voltage Level Converter circuit.



Fig. 5. Digital Logic Control and Pre-driver circuits.



Fig. 6. Layout and floorplan of the proposed I/O buffer.

TABLE I

A Comparison of the Proposed Approach to Previous Methods

|  | MWSCAS [15] | APCCAS [4] | ISNE [3] | ISCAS [16] | APCCAS [17] | APCCAS [6] | APCCAS [7] | This work |
|---|---|---|---|---|---|---|---|---|
| Year | 2018 | 2018 | 2019 | 2021 | 2021 | 2021 | 2023 | 2023 |
| Process (nm) | 180 | 40 | 16 | 65 | 22 | 16 | 16 | 16 |
| Verification | Simulation | Simulation | Simulation | Simulation | Simulation | Simulation | Simulation | Simulation |
| VDD (V) | 1.8 | 0.9 | 0.8 | 1.8 | 1.8 | 0.8 | 0.8 | 0.8 |
| VDDIO (V) | 1.8 | 1.8/0.9 | 1.6/0.8 | 3.3/1.8/2.5 | 3.3/1.8 | 1.2/0.8 | 1.2/1.1/0.8 | 1.2/1.1/0.8 |
| Max. Data Rate (Gbps) | 0.50 | 5.0/3.2 | 5.0 | 0.4 | 1.0 | 5.0 | 5.2 | 6.0 |
| $\Delta$SR (V/ns) | 0.75/1.41 | 6.91/7.85 | 18/19.1 | N/A | N/A | 8.7/6.4 | 5.77/8.89 | 14.7/8.93 |
| Improvement in $\Delta$SR (%) | 50 | 37 | 23.5/15.8 | N/A | N/A | 26/21 | 28/10.7 | 19/30 |
| Duty Cycle (%) | N/A | N/A | N/A | N/A | N/A | 49.2/48.3 | 48/48.2 | 48.4/51.8 |
| Dynamic Power (mW) | N/A | 33.71 (@500 MHz) | 28 (@500 MHz) | 29.8 (@200 MHz) | N/A | 153 (@2.5 GHz) | 143.5 (@2.6 GHz) | 121.16 (@3.0 GHz) |



Fig. 7. PAD output waveform at 0.8-V VDDIO (a) without PV compensation (worst SR = 12.8 V/ns; best SR = 8.72 V/ns; worst DR = 47.1%; best DR = 49.1%); (b) with PV compensation (worst SR = 8.93 V/ns; best SR = 7.39 V/ns; worst DR = 51.8%; best DR = 50.4%).

Our buffer supports two I/O voltages, VDDIO = 1.2 V (1.1 V + 10%) and 0.8 V. C<sub>L</sub> is 20 pF for simulations. The all-PVT-corner post-layout simulation results for VDDIO = 0.8 V with and without PV compensation at 3.0 GHz are shown in Fig. 7, and the corresponding eye diagrams are shown in Fig. 8. Figs. 9 and 10 display the results for the maximum VDDIO = 1.2 V with and without PV compensation and the corresponding eye diagrams, respectively. Eqn. (2) is used to calculate the improvement in  $\Delta$SR.

$$\Delta SR \ (\%) = \frac{\Delta SR_{\text{NoPV}} - \Delta SR_{\text{WithPV}}}{\Delta SR_{\text{NoPV}}} \times 100\% \tag{2}$$

In Eqn. (2),  $\Delta SR_{\text{NoPV}}$  and  $\Delta SR_{\text{WithPV}}$  denote the worst-case slew rates without and with PV compensation, respectively. The  $\Delta$SR improvement for VDDIO at 1.2 V and 0.8 V is 19% and 30%, respectively. A comparison of the proposed I/O buffer with previous works is presented in Table I. Figs. 8 and 10 depict the eye diagrams for 0.8-V and 1.2-V VDDIO, respectively.
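As a worked instance of Eqn. (2), the sketch below plugs in the worst-case slew rates quoted in the Fig. 7 caption for the 0.8-V mode. Treating those two caption values as the NoPV and WithPV terms is an interpretation made for this example, and the function name is hypothetical.

```python
def sr_improvement_percent(sr_no_pv, sr_with_pv):
    """Eqn. (2): relative reduction of the worst-case SR, expressed in percent."""
    return (sr_no_pv - sr_with_pv) / sr_no_pv * 100.0

# Worst-case SR values from the Fig. 7 caption (0.8-V VDDIO), in V/ns.
improvement_0v8 = sr_improvement_percent(12.8, 8.93)
print(f"0.8-V mode: {improvement_0v8:.1f}% improvement")
```

Under that reading the result is about 30%, in line with the figure reported for the low-voltage mode.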

## V. CONCLUSION

This study used a GA to optimize the SR with respect to PVT variations in FinFET-based I/O buffers for DDR5 SDRAMs. The SR model showed that the temperature detector circuit was unnecessary, saving power and area. However, voltage fluctuations severely affected the SR. The proposed I/O buffer operates reliably at 6.0 Gbps with slew rates and a duty cycle ratio that comply with DDR5 requirements.

Fig. 8. 0.8-V VDDIO's eye diagram (a) without PV compensation (height = 0.684 V; width = 116 ps); (b) with PV compensation (height = 0.668 V; width = 115 ps).



Fig. 9. PAD output waveform at 1.2-V VDDIO (a) without PV compensation (worst SR = 17.5 V/ns; best SR = 12 V/ns; worst DR = 48.5%; best DR = 49.7%); (b) with PV compensation (worst SR = 14.7 V/ns; best SR = 10 V/ns; worst DR = 48.4%; best DR = 50.0%).



Fig. 10. A 1.2-V VDDIO's eye diagram (a) without PV compensation (height = 1.00 V; width = 117 ps); (b) with PV compensation (height = 1.00 V; width = 117 ps).

## REFERENCES

- [1] JEDEC, DDR5 Data Buffer Definition (DDR5DB01), December 2021. Accessed on: June 12, 2022. [Online]. Available: https://www.jedec.org/standards-documents/docs/jesd82-521
- [2] C.-C. Wang, L. K. S. Tolentino, S.-W. Lu, O. L. J. A. Jose, R. G. B. Sangalang, T.-J. Lee, P.-Y. Lou, and W.-C. Chang, "A 2xVDD digital output buffer with gate driving stability and non-overlapping signaling control for slew-rate auto-adjustment using 16-nm FinFET CMOS process," *Integration*, vol. 90, pp. 245-260, May 2023.
- [3] C.-C. Wang and S.-W. Lu, "2.5 GHz data rate 2 × VDD digital output buffer design realized by 16-nm FinFET CMOS," in *Proc. 2019 8th International Symposium on Next Generation Electronics (ISNE)*, pp. 1-3, Oct. 2019.
- [4] C.-C. Wang, Z.-Y. Hou and S.-W. Huang, "40-nm 2×VDD digital output buffer design with DDR4-compliant slew rate," in *Proc. 2018 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS)*, pp. 279-282, Oct. 2018.
- [5] C.-C. Wang, L. K. S. Tolentino, T.-J. Lee and W.-J. Su, "Mixed-voltage output buffer," TW Patent I772240B, July 21, 2022.
- [6] T.-J. Lee, W.-J. Su, L. K. S. Tolentino and C.-C. Wang, "A 2.5-GHz 2×VDD 16-nm FinFET digital output buffer with slew rate and duty cycle self-adjustment," in *Proc. 2021 IEEE Asia Pacific Conference on Circuit and Systems (APCCAS)*, pp. 153-156, Nov. 2021.
- [7] J.-Y. Ke, L. K. S. Tolentino, C.-Y. Lo, T.-J. Lee, and C.-C. Wang, "A 2.6-GHz I/O buffer for DDR4 & DDR5 SDRAMs in 16-nm FinFET CMOS process," in *Proc. 2023 IEEE Asia Pacific Conference on Circuit and Systems (APCCAS)*, pp. 1-5, Nov. 2023.
- [8] R. O. Serfa Juan and J. Kim, "Implementation of generalized regression neural network (GRNN) for solar panel power estimation," in *Proc. 2020 International Conference on Information and Communication Technology Convergence (ICTC)*, pp. 294-299, Oct. 2020.
- [9] L. K. S. Tolentino, J.-Y. Ke, C.-Y. Lo and C.-C. Wang, "Using machine learning techniques to determine DDR5 SDRAM I/O buffer's slew rate at different PVT variations," in *Proc. 2023 International Conference on Integrated Intelligence and Communication Systems (ICIICS)*, pp. 1-6, Nov. 2023.

- [10] R. S. Concepcion, L. C. Ilagan, and I. C. Valenzuela, "Optimization of nonlinear temperature gradient on eigenfrequency using genetic algorithm for reinforced concrete bridge structural health," in *Proc. World Congress on Engineering and Technology; Innovation and its Sustainability*, pp. 141-151, Aug. 2019.
- [11] R. Concepcion et al., "Towards the integration of computer vision and applied artificial intelligence in postharvest storage systems: Non-invasive harvested crop monitoring," in *Proc. 2021 IEEE 13th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM)*, pp. 1-6, Nov. 2021.
- [12] I. C. Valenzuela, "Application of computational intelligence in plant growth modelling," Doctoral dissertation, 2019.
- [13] C.-C. Wang, "Tutorial: design of high-speed nano-scale CMOS mixed-voltage digital I/O buffer with high reliability to PVTL variations," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 68, no. 2, pp. 562-567, Feb. 2021.
- [14] T.-Y. Tsai, Y.-L. Teng and C.-C. Wang, "A nano-scale 2×VDD I/O buffer with encoded PV compensation technique," in *Proc. 2016 IEEE International Symposium on Circuits and Systems (ISCAS)*, pp. 598-601, May 2016.
- [15] X. Gui, K. Li, X. Wang and L. Geng, "A dual-path open-loop CMOS slew-rate controlled output driver with low PVT variation," in *Proc. 2018 IEEE 61st International Midwest Symposium on Circuits and Systems (MWSCAS)*, pp. 274-277, Aug. 2018.
- [16] P. Kalra, D. Rajagopal, S. Seth and A. Sirohi, "Multi voltage high performance bidirectional buffer in a low voltage CMOS process," in *Proc. 2021 IEEE International Symposium on Circuits and Systems (ISCAS)*, pp. 1-5, May 2021.
- [17] D. Nedalgi and S. V. Siddamal, "Novel gate tracking and n-well control circuit for 2×VDD tolerant I/O buffer," in *Proc. 2021 IEEE Asia Pacific Conference on Circuit and Systems (APCCAS)*, pp. 93-96, Nov. 2021.