# Energy-Efficient Double-Edge Triggered Flip-Flop Design

Chua-Chin Wang, Senior Member, IEEE, Gang-Neng Sung, Student Member, IEEE, Ming-Kai Chang, and Ying-Yu Shen

> Dept. of Electrical Engineering National Sun Yat-Sen University Kaohsiung, Taiwan 80424 Email: ccwang@ee.nsysu.edu.tw

Abstract-This paper presents a novel design for double-edge triggered flip-flops (DETFF). The transistors used in the DETFF are analyzed in detail to identify those on the critical path. The proposed DETFF then employs low- $V_{th}$  transistors only on the critical paths, so that the power-delay product is reduced while the large area consumption caused by low- $V_{th}$  transistors is limited. The proposed DETFF thus fully utilizes the multi- $V_{th}$  scheme provided by advanced CMOS processes without paying the price of large area, slow clocking frequency, and poor noise immunity. The proposed design is implemented using the TSMC 0.18 µm 1P6M CMOS process. Post-layout simulation results reveal that the proposed DETFF saves at least 33% power and 39% power-delay product (i.e., the dissipated energy) at all PVT (process, supply voltage, temperature) corners.

Keywords : double-edge triggered, flip-flop, low power, multiple  $V_{th}$ , clocking

#### I. INTRODUCTION

A double-edge triggered flip-flop (DETFF) latches digital data on both the rising and the falling edges of the clock. DETFFs are therefore welcome in many applications, particularly register- and latch-intensive SOC (system-on-chip) applications. Double-edge triggering or latching is also widely used in pipelined designs, e.g., [6], [7], to reduce the sequencing overhead thereof. Many DETFF-related studies have been reported, including [1], [2], [3], [4], [5], [8]. The states of the cross-coupled pairs in the two latches are not easy to flip, which in turn slows down the operating frequency. The solution in [1] focuses mainly on speed rather than on power. [2] successfully integrated a PTL (pass transistor logic)-based XOR gate with a pair of back-to-back inverter pairs such that the number of transistors is reduced. The DETFF in [8] was designed specifically for TSPC (true-single-phase-clocking) logic.

With the development of CMOS technologies, multiple- $V_{th}$  (threshold voltage) transistors can now be fabricated on a single die. This evolution has attracted more attention to further improving the performance of DETFF designs by using deep-submicron CMOS processes. For instance, [3] proposed a low-swing scheme to resolve the power dissipation problem. Low- $V_{th}$  NMOS transistors are driven by the clock and the inverted clock. However, a total of three back-to-back inverter pairs are needed. Sung et al. in [5] presented a full-low- $V_{th}$  DETFF, which in fact replaces all of the transistors in the DETFF shown in [2] with low- $V_{th}$  transistors. The power consumption is indeed reduced. However, those methods utilizing multi- $V_{th}$  ignored the fact that low- $V_{th}$  transistors consume a larger area than normal transistors. Besides, several side effects are introduced. First, the subthreshold current is increased, which is undesirable in a standby mode. Second, the noise immunity is worsened by the reduction of the threshold voltage. Therefore, we need to find out what the critical path in a DETFF is and which transistors should be low- $V_{th}$  MOS.

#### II. LOW-ENERGY DETFF DESIGN

The evolution of CMOS technology makes multiple threshold voltage transistors available nowadays. In this paper, dual threshold voltage transistors provided by TSMC 0.18  $\mu$ m 1P6M CMOS process are used to construct a DETFF.

### A. Current analysis of dual- $V_{th}$  transistors

The drain current in the saturation region of a MOSFET transistor is :

$$I_D = \frac{k_p}{2} \frac{W_{eff}}{L_{eff}} (V_{GS} - V_{th})^2 \tag{1}$$

where  $k_p$  is a process parameter, and  $W_{eff}$  and  $L_{eff}$  are the effective width and length of the transistor, respectively. According to Eqn. (1), a lower threshold voltage produces a larger drain current. Meanwhile, if we take  $\frac{W_{eff}}{L_{eff}}$  as a constant, Eqn. (1) reduces to  $I_D \propto (V_{GS} - V_{th})^2$.

In this work, TSMC 0.18  $\mu$ m 1P6M CMOS process is adopted to realize dual threshold voltage transistors. The threshold voltages of Normal (high) NMOS/PMOS and Medium (low) NMOS/PMOS are tabulated in Table I. Take the NMOS as an example, where  $V_{GS}$  = VDD = 1.8 V, high threshold voltage  $V_{th-n}$  = 0.4805 V, and low threshold voltage  $V_{th-m}$  = 0.2340 V. Thus, we can compute a ratio of  $\frac{I_{DH}}{I_{DL}}$  as :

$$\frac{I_{DH}}{I_{DL}} = \frac{(V_{GS} - V_{th-n})^2}{(V_{GS} - V_{th-m})^2} = \frac{(1.8 - 0.4805)^2}{(1.8 - 0.2340)^2} \tag{2}$$

where  $I_{DH}$  and  $I_{DL}$  are the drain currents of the high threshold voltage transistor and the low threshold voltage transistor, respectively. Hence, the low- $V_{th}$  transistor delivers about 40% more drain current than the high- $V_{th}$  one.
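As a quick numerical check of Eqn. (2), the following minimal sketch evaluates the ratio with the NMOS threshold voltages quoted from Table I. The function and variable names are illustrative only and are not part of the original work.

```python
# Values quoted in the text: V_GS = VDD = 1.8 V and the NMOS thresholds of Table I.
V_GS = 1.8          # V
V_TH_HIGH = 0.4805  # V, Normal (high-Vth) NMOS
V_TH_LOW = 0.2340   # V, Medium (low-Vth) NMOS


def overdrive_squared(v_gs: float, v_th: float) -> float:
    """Return (V_GS - V_th)^2, the V_th-dependent factor of Eqn. (1)."""
    return (v_gs - v_th) ** 2


# Eqn. (2): ratio of the high-Vth drain current to the low-Vth drain current.
ratio_high_to_low = overdrive_squared(V_GS, V_TH_HIGH) / overdrive_squared(V_GS, V_TH_LOW)

# Relative current increase obtained by switching to the low-Vth device.
current_increase = 1.0 / ratio_high_to_low - 1.0

print(f"I_DH / I_DL = {ratio_high_to_low:.3f}")
print(f"low-Vth current increase = {current_increase:.1%}")
```

The ratio comes out near 0.71, which corresponds to the roughly 40% current increase stated in the text.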

On the other hand, as the transistor operating voltage decreases, the threshold voltage decreases as well. The subthreshold current is computed as :

$$I_{DSUB} = \frac{W_{eff}}{W_o} \cdot I_o \cdot 10^{(V_{GS} - V_{th})/S}, \tag{3}$$

where  $W_o$  and  $I_o$  are the reference gate width and drain current, respectively.  $S$  is the subthreshold swing parameter, which can be calculated as :

$$S \approx 2.3 V_T \left[ 1 + \frac{C_d}{C_{ox}} \right] \tag{4}$$

1-4244-0387-1/06/\$20.00 © 2006 IEEE


where  $V_T$  is the thermal voltage,  $C_d$  is the junction capacitance between source and drain, and  $C_{ox}$  is the gate-oxide capacitance. The leakage current is obtained by setting  $V_{GS}$  to 0 in Eqn. (3), that is,

$$I_{leak} = \frac{W_{eff}}{W_o} I_o 10^{-V_{th}/S} \tag{5}$$

If  $W_o$  remains unchanged, both  $I_{DSUB}$  and  $I_{leak}$  increase when  $I_o$  increases. The larger subthreshold current is thus an advantage when driving wires. In short, a transistor with low  $V_{th}$  is more appropriate for driving wires than for storing data.
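To give a feel for how strongly Eqn. (5) depends on  $V_{th}$, the sketch below estimates the leakage ratio between the low- and high- $V_{th}$  NMOS devices of Table I. The temperature (300 K) and the  $C_d/C_{ox}$  ratio (0.5) are assumed values chosen only for illustration, and  $W_{eff}$, $W_o$, and  $I_o$  are assumed identical for both devices so that they cancel. None of these assumptions comes from the paper.

```python
BOLTZMANN = 1.380649e-23            # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temperature_k: float) -> float:
    """Thermal voltage V_T = kT/q in volts."""
    return BOLTZMANN * temperature_k / ELEMENTARY_CHARGE


def subthreshold_swing(temperature_k: float, cd_over_cox: float) -> float:
    """Eqn. (4): S ~ 2.3 * V_T * (1 + C_d / C_ox), in volts per decade."""
    return 2.3 * thermal_voltage(temperature_k) * (1.0 + cd_over_cox)


def leakage_ratio(v_th_high: float, v_th_low: float, swing: float) -> float:
    """Ratio I_leak(low Vth) / I_leak(high Vth) from Eqn. (5), equal prefactors assumed."""
    return 10.0 ** ((v_th_high - v_th_low) / swing)


TEMPERATURE_K = 300.0  # assumption, not from the paper
CD_OVER_COX = 0.5      # assumption, not from the paper

swing_v = subthreshold_swing(TEMPERATURE_K, CD_OVER_COX)
leak_ratio = leakage_ratio(0.4805, 0.2340, swing_v)

print(f"S = {swing_v * 1e3:.1f} mV/decade")
print(f"I_leak(low Vth) / I_leak(high Vth) = {leak_ratio:.0f}")
```

Under these assumptions the low- $V_{th}$  device leaks several hundred times more than the normal one, which illustrates why it is better suited to driving than to holding data.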

### B. Side effects of low $V_{th}$

Besides the increased leakage current when low- $V_{th}$  transistors are used, other side effects will be introduced at the same time.

**temperature coefficient :** The reduction of the threshold voltage increases the temperature coefficient according to the following equation.

$$TCV_{th} = \frac{1}{V_{th}} \cdot \frac{\partial V_{th}}{\partial T}, \tag{6}$$

where  $TCV_{th}$  is the temperature coefficient, and T is the temperature.
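A small sketch of Eqn. (6) follows. It assumes, purely for illustration, that both NMOS devices share the same slope  $\partial V_{th}/\partial T$ ; the value of -1 mV/K is hypothetical and does not come from the paper. Under that assumption the ratio of the two coefficients depends only on the threshold voltages of Table I.

```python
def temperature_coefficient(v_th: float, dvth_dt: float) -> float:
    """Eqn. (6): TCV_th = (1 / V_th) * dV_th/dT, in 1/K."""
    return dvth_dt / v_th


DVTH_DT = -1.0e-3  # V/K, hypothetical slope assumed equal for both devices

tc_high = temperature_coefficient(0.4805, DVTH_DT)  # Normal NMOS
tc_low = temperature_coefficient(0.2340, DVTH_DT)   # Medium NMOS

# With equal slopes the ratio reduces to V_th-n / V_th-m.
tc_ratio = tc_low / tc_high

print(f"TCV_th (high Vth) = {tc_high:.2e} 1/K")
print(f"TCV_th (low Vth)  = {tc_low:.2e} 1/K")
print(f"ratio low/high    = {tc_ratio:.2f}")
```

With equal slopes, the low- $V_{th}$  device has a temperature coefficient about twice as large in magnitude.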

**output impedance of drain :** The impedance looking into the drain of a MOS transistor is described as follows.

$$r_o^{-1} = \frac{\partial i_D}{\partial v_{DS}} \propto (V_{GS} - V_{th})^2 \cdot \lambda, \tag{7}$$

where  $r_o$  is the output impedance,  $i_D$  and  $v_{DS}$  are the AC portions of the drain current and the voltage drop across the channel, and  $\lambda$  is the channel-length modulation factor. By Eqn. (7), reducing the threshold voltage increases  $r_o^{-1}$, i.e., it lowers the output impedance at the drain, so the transistor holds its drain node more strongly. In other words, the state at the drain becomes hard to flip.

### C. Circuit of the proposed DETFF

Fig. 1 shows the single-latch DETFF in [2], while Fig. 2 shows the modified version of the single-latch DETFF proposed in [5]. The only difference is that the latter DETFF replaces all of the transistors with low- $V_{th}$  transistors except the two inverters on the right-hand side. Although the simulation results given in [5] showed an improvement in power dissipation, the area penalty is very high. According to the characteristics in Table I, the area overhead will be roughly 55.26%, which is hardly acceptable in practice. Meanwhile, the change in output impedance at those output nodes, as described by Eqn. (7), makes it difficult for the back-to-back inverter pairs to flip states when necessary. Consequently, the operating speed will be reduced.
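The area penalties can be reproduced from the cell dimensions in Table I. The sketch below recomputes the per-device penalties and their mean. Reading the 55.26% figure as the plain average of the NMOS and PMOS penalties is an inference from the numbers, not something the text states explicitly, and the dictionary keys are illustrative names.

```python
# Cell dimensions (in micrometres) taken from Table I.
DIMENSIONS_UM = {
    "normal_n": (0.92, 1.5),
    "medium_n": (0.92, 1.78),
    "normal_p": (1.08, 2.0),
    "medium_p": (1.6, 2.59),
}


def area_um2(name: str) -> float:
    """Area of one transistor footprint in square micrometres."""
    width, height = DIMENSIONS_UM[name]
    return width * height


def area_penalty(low_vth: str, normal: str) -> float:
    """Relative area increase of the low-Vth device over the normal device."""
    return area_um2(low_vth) / area_um2(normal) - 1.0


n_penalty = area_penalty("medium_n", "normal_n")
p_penalty = area_penalty("medium_p", "normal_p")

# Assumed interpretation: the overall overhead is the mean of the two penalties.
average_penalty = (n_penalty + p_penalty) / 2.0

print(f"NMOS penalty: {n_penalty:.2%}")
print(f"PMOS penalty: {p_penalty:.2%}")
print(f"average:      {average_penalty:.2%}")
```

The per-device results match the +18.67% and +91.85% entries of Table I, and their mean matches the 55.26% overhead quoted above.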

A simple way to improve the single-latch DETFF is shown in Fig. 3. M13 to M16 form an XOR gate that detects the edge transitions generated in the series of inverters, I13 to I16. It therefore acts as an edge detector and pulse generator, which is not critical to the state transitions at D and Q. We see no reason to pay the price of large area and high leakage current by using low- $V_{th}$  transistors to construct such an XOR gate.
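The role of the XOR-based edge detector can be illustrated with a purely behavioral, sampled-time sketch. It is not a model of the transistor-level circuit in Fig. 3: the delay of the inverter chain is abstracted into a fixed number of samples, the pulse polarity is taken as active-high for simplicity, and all function names are hypothetical.

```python
from typing import List


def delay_line(signal: List[int], steps: int) -> List[int]:
    """Delay a sampled 0/1 signal by `steps` samples, holding its first value."""
    padded = [signal[0]] * steps + list(signal)
    return padded[: len(signal)]


def edge_pulses(clk: List[int], delay_steps: int) -> List[int]:
    """XOR the clock with its delayed copy: high for a short time after every edge."""
    delayed = delay_line(clk, delay_steps)
    return [c ^ d for c, d in zip(clk, delayed)]


def sample_on_pulses(data: List[int], pulses: List[int], initial: int = 0) -> List[int]:
    """Update the stored value only while a pulse is present, otherwise hold it."""
    q = initial
    output = []
    for d, p in zip(data, pulses):
        if p:
            q = d
        output.append(q)
    return output


clk = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
data = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

pulses = edge_pulses(clk, delay_steps=1)
q = sample_on_pulses(data, pulses)

print("pulses:", pulses)
print("q:     ", q)
```

In this toy trace a pulse appears at both the rising and the falling clock transitions, so the stored value is refreshed twice per clock period, which is the defining behavior of a double-edge triggered flip-flop.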

**Selection of M18 :** Notably, the input, D, is present at the source of PMOS M18, and we expect the state of node B to follow D when the XOR result at the gate of M18 is pulled low. This forms a common-gate amplifier whose gain is proportional to  $gm_{M18}$, the transconductance of M18, where  $gm_{M18} \propto (V_{GS} - V_{th})$. Therefore, a low- $V_{th}$  transistor is the better choice, gaining speed and power at the expense of area.
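The benefit for M18 can be estimated from the proportionality  $gm \propto (V_{GS} - V_{th})$  and the PMOS thresholds of Table I. The sketch below works with magnitudes and assumes a full gate drive of  $|V_{GS}|$ = VDD = 1.8 V, i.e., D at VDD with the gate pulled to ground. That bias point is an assumption made for illustration.

```python
VDD = 1.8                    # V, assumed source-to-gate drive of M18
VTH_P_NORMAL_ABS = 0.4897    # V, |V_th| of the Normal PMOS (Table I)
VTH_P_MEDIUM_ABS = 0.2791    # V, |V_th| of the Medium PMOS (Table I)


def overdrive(v_sg: float, v_th_abs: float) -> float:
    """Overdrive voltage |V_GS| - |V_th|, to which gm is proportional."""
    return v_sg - v_th_abs


# Ratio of transconductances, low-Vth over normal, for equal sizing and bias.
gm_gain = overdrive(VDD, VTH_P_MEDIUM_ABS) / overdrive(VDD, VTH_P_NORMAL_ABS)

print(f"gm(low Vth) / gm(normal) = {gm_gain:.3f}")
```

Under this assumption the low- $V_{th}$  PMOS offers roughly 16% more transconductance than the normal device.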

**Selection of M17 :** The output, Q, resides at the drain of M17. To make it easy for I17 to flip the state of Q, the load thereof should be small. Therefore, using a low- $V_{th}$  transistor would be a bad idea based upon the conclusion given by Eqn. (7). We propose to utilize a normal NMOS rather than a low- $V_{th}$  one.

**Selection of M19 and M20 :** Basically, M19 and M20 constitute an inverter to drive I17. However, the input of this inverter, i.e., node B, is the output of another inverter composed of M17 and M18. Notably, the inverter composed of M17 and M18 is a "floating" inverter, with no path to either VDD or GND. Thus, node B is a "weak" output driving the inverter of M19 and M20, where M19 is the current source and M20 is the current sink. If M19 were a high- $V_{th}$  transistor, it would have difficulty supplying a large current to drive I17, and in addition the "weak" node B could not easily switch it on. The consequence is that the slew rate (SR =  $\frac{dV}{dt} = \frac{I}{C_L}$ ) at  $\overline{Q}$  deteriorates. Hence, M19 should be a low- $V_{th}$  transistor.

In short, the proposed DETFF only utilizes two low- $V_{th}$  transistors at M18 and M19. The overall area penalty is then reduced to merely 11%.

#### III. SIMULATION AND IMPLEMENTATION

The TSMC (Taiwan Semiconductor Manufacturing Company) 0.18  $\mu$ m 1P6M CMOS process is adopted to carry out the proposed DETFF design. The layout of the proposed DETFF is shown in Fig. 4, where the cell area is  $4.1 \times 18.255 \ \mu\text{m}^2$ . The prototypical chip shown in Fig. 5 occupies  $823 \times 888 \ \mu\text{m}^2$  including pads, two 8-bit registers composed of the proposed DETFF, a built-in 800 MHz VCO, a loadable up/down counter, and a MUX. Fig. 6 shows the post-layout simulation results, which verify that the DETFF-based registers operate correctly. A performance comparison of the proposed design with several prior DETFFs is summarized in Table II. Though the proposed design pays the price of an 11% increase in chip area, it dissipates the least energy as well as the least power. Our design saves at least 33% of power and 39% of energy at the 800 MHz clock rate.
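The quoted savings can be checked against the power and power-delay product (P×D) entries of Table II. The following minimal sketch computes the saving of the proposed design relative to each prior work and takes the smallest one. The data structure and names are illustrative.

```python
# Power (uW) and power-delay product (fJ) as listed in Table II.
PRIOR_WORKS = {
    "[2]": {"power_uw": 66.3, "pdp_fj": 50.5},
    "[8]": {"power_uw": 90.9, "pdp_fj": 39.0},
    "[5]": {"power_uw": 47.8, "pdp_fj": 19.6},
}
OURS = {"power_uw": 31.7, "pdp_fj": 11.9}


def saving(ours: float, prior: float) -> float:
    """Fractional reduction of `ours` relative to `prior`."""
    return 1.0 - ours / prior


def minimum_saving(metric: str) -> float:
    """Smallest saving over all prior works, i.e. the 'at least' figure."""
    return min(saving(OURS[metric], row[metric]) for row in PRIOR_WORKS.values())


min_power_saving = minimum_saving("power_uw")
min_pdp_saving = minimum_saving("pdp_fj")

print(f"minimum power saving: {min_power_saving:.1%}")
print(f"minimum P x D saving: {min_pdp_saving:.1%}")
```

Both minima occur against the design in [5], giving savings slightly above 33% in power and 39% in energy, in line with the figures stated above.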

#### IV. CONCLUSION

We have proposed an energy-efficient DETFF design that attains low power consumption while maintaining high-speed operation capable of meeting the requirement of DDR2 specifications (800 MHz). Detailed circuit analysis resolves the puzzle of which transistors should be low- $V_{th}$  and which should be normal ones. The simulation results justify our analysis.

#### ACKNOWLEDGMENT

This research was partially supported by the National Science Council under grants NHRI-EX93-9319EI and NSC 92-2218-E-110-001. The authors would like to thank CIC of the National Science Council (NSC), Taiwan, for their thoughtful help in the chip fabrication of the proposed work. The authors would also like to thank the "Aim for Top University Plan" project of NSYSU and the Ministry of Education, Taiwan, for partially supporting the research.

|                  | Normal N          | Medium N           | Normal P        | Medium P          |
|------------------|-------------------|--------------------|-----------------|-------------------|
| $V_{th}$ (V)     | 0.4805            | 0.2340             | -0.4897         | -0.2791           |
| W/L (nm/nm)      | 220/180           | 220/300            | 220/180         | 220/250           |
| area $(\mu m^2)$ | $0.92 \times 1.5$ | $0.92 \times 1.78$ | $1.08 \times 2$ | $1.6 \times 2.59$ |
| area penalty     | 0                 | +18.67%            | 0               | +91.85%           |

#### TABLE I

Characteristics of MOS transistors in 0.18  $\mu$ m CMOS process (Note : W/L denotes the feature size of the MOS transistors.)

|                 | [2]  | [8]  | [5]  | Ours |
|-----------------|------|------|------|------|
| rise delay (ps) | 503  | 381  | 410  | 375  |
| fall delay (ps) | 765  | 430  | 420  | 345  |
| power $(\mu W)$ | 66.3 | 90.9 | 47.8 | 31.7 |
| P×D (fJ)        | 50.5 | 39.0 | 19.6 | 11.9 |

#### TABLE II

Performance comparison with prior works

#### REFERENCES

- [1] M. Afghahi, and J. Yuan, "Double edge-triggered D-flip-flop for high-speed CMOS circuits," *IEEE J. of Solid-State Circuits*, vol. 26, no. 8, pp. 1168-1170, Aug. 1991.
- [2] T. A. Johnson, and I. S. Kourtev, "A single latch, high speed double-edge triggered flip-flop (DETFF)," 2001 IEEE Inter. Conf. on Electronics, Circuits, and Systems, vol. 1, pp. 189-192, 2001.
- [3] C. Kim, and S.-M. Kang, "A low-swing clock double-edge triggered flip-flop," *IEEE J. of Solid-State Circuits*, vol. 37, no. 5, pp. 648-652, May 2002.
- [4] S. L. Lu, and M. Ercegovac, "A novel CMOS implementation of double-edge triggered flip-flops," *IEEE J. of Solid-State Circuits*, vol. 25, no. 4, pp. 1008-1010, Apr. 1990.
- [5] Y. Y. Sung, and R. C. Chang, "A novel CMOS double-edge triggered flip-flop for low-power applications," 2004 IEEE Inter. Symp. on Circuits & Systems (ISCAS'2004), pp. 665-668, May 2004.
- [6] C.-C. Wang, C.-J. Huang, and K.-C. Tsai, "A 1.0 GHz 0.6-μm 8-bit carry lookahead adder using PLA-styled all-N-transistor logic," *IEEE Trans. of Circuits and Systems, Part II : Analog and Digital Signal Processing*, vol. 47, no. 2, pp. 133-135, Feb. 2000.
- [7] C.-C. Wang, C.-F. Wu, and K.-C. Tsai, "A 1.0 GHz 64-bit high-speed comparator using ANT dynamic logic with two-phase clocking," *IEE Proceedings - Computers and Digital Techniques*, vol. 145, no. 6, pp. 433-436, Nov. 1998.
- [8] J.-S. Wang, "A new true-single-phase-clocked double-edge-triggered flip flop for low-power VLSI design," *1997 IEEE Inter. Symp. on Circuits & Systems (ISCAS'97)*, pp. 1986-1989, June 1997.



Fig. 1. Single-latch DETFF in [2]



Fig. 2. Single-latch DETFF in [5]



Fig. 3. Proposed energy-efficient DETFF



Fig. 4. Layout of a proposed DETFF cell



Fig. 5. Layout of the prototypical chip using the proposed DETFF

Fig. 6. Post-layout simulation results of the prototypical chip