# A Low-Energy 8-bit CLA Realized by Single-Phase ANT Logic

Durga Srikanth Kamarajugadda<sup>1</sup>, Oliver Lexter July A. Jose<sup>1</sup>, Lung-Jieh Yang<sup>2</sup>, Balasubramanian Esakki<sup>3</sup>, Sivaperumal Sampath<sup>4</sup>, and Chua-Chin Wang<sup>1</sup>

<sup>1</sup>Dept. of Electrical Eng., National Sun Yat-Sen University, Kaohsiung, Taiwan 80424

<sup>2</sup>Dept. of Mech. & Electromech. Eng., Tamkang University, Taipei, Taiwan 10650

<sup>3</sup>Dept. of Mechanical Eng., Vel Tech University, Chennai, India 600062

<sup>4</sup>Dept. of Electronics & Comm. Eng., Presidency University, Bengaluru, India 560064

Corresponding Author: ccwang@ee.nsysu.edu.tw

*Abstract*—The low-power, high-speed carry look-ahead adder (CLA) is one of the most in-demand digital computation units. This paper demonstrates a CLA based on single-phase ANT logic that achieves both low power and high speed. The design reduces the load capacitance and removes the internal loop, which enhances the speed and reduces the switching activity at the same time. Post-layout simulations in 40 nm CMOS technology show that the proposed design works at a clock frequency of 20 GHz with a load of 60 pF, where the normalized power dissipation is 0.071 mW and the normalized PDP is 0.08 pJ.

Index Terms—CLA, effective path, high-speed, low-power, no internal loop.

#### I. INTRODUCTION

Because of the popularity of and demand for portable gadgets, designers aspire to smaller silicon area, faster speed, longer battery life, and higher dependability [1]. Adders are among the essential computation building blocks of the CPUs in these applications, so many researchers have focused on this topic. There are two major logic styles for designing full adders, namely dynamic and static. The static style usually results in high power consumption, large area, and high complexity in integrated circuits. To improve these parameters, the dynamic style seems to be a better solution. It has a faster switching speed and a lower transistor count, which leads to higher density than the conventional static logic style. However, the major concern of dynamic circuits is the excessive power dissipation due to higher switching activity.

The total power dissipation in digital circuits is mainly classified into static and dynamic power consumption. Static power is dominated by leakage. Dynamic power is dissipated when the circuit is in an active mode. The N-block of the prior All-N-Transistor (ANT) logic contains a stack of series NMOS devices [2], which slows down the speed [3]. In the all-N-logic (ANL) circuit, the charging and discharging of node voltages are affected by the larger gate capacitance, leading to the glitch problem [4]. This paper demonstrates a single-phase ANT logic design without an internal loop, where the power and delay of the circuit are addressed together to achieve a low power-delay product (PDP) solution.

Fig. 1. Schematic of prior ANT logic [2].

#### II. THEORY OF SINGLE-PHASE ANT

## A. Prior design ANT logic

Referring to Fig. 1, the prior ANT block with an internal feedback loop was reported in [2]. The following factors may affect the performance of this logic circuit.

- Besides the N-block, the logic occupies a large area because 7 transistors are needed in the block.
- The P3-N3 feedback loop may cause additional delay and hysteresis in the logic evaluation.
- Since the CLK signal drives 3 transistors in each block, the clock loading is high enough to cause setup-time or hold-time issues.

To improve the performance of the prior ANT design, the following measures are proposed.

- The voltage at node A can be lowered to further decrease the energy consumption.
- The internal loop may be removed to reduce the transistor count of auxiliary circuits in the block. This also removes the possible hysteresis and speeds up the circuit evaluation.
- The output stage of the block can be switched off along with the N-block to further reduce the energy consumption.

Fig. 2. (a) Schematic of proposed single-phase ANT logic; (b) Output waveforms.

Prof. Chua-Chin Wang is the corresponding author. He is also with the Inst. of Undersea Technology, National Sun Yat-Sen University, Taiwan.

O. L. J. A. Jose is also with the Dept. of Electronics Engineering, Batangas State University, The National Engineering University, Philippines.

## B. Operations of single-phase ANT logic

Referring to Fig. 2 (a), the proposed single-phase ANT logic removes the internal loop to speed up the charge-discharge operation of the output inverter composed of MP2 and MN3. In addition, MN2 is inserted on top of the N-block to act like a resistor, so that an RC delay is generated to suppress possible glitches at the gate drives of MP2 and MN3. The node voltage $V_a$ reaches only $VDD - V_{th}$, which keeps a low voltage swing at the input of the inverter and reduces the power dissipation.

Fig. 2 (b) shows the working waveforms of the proposed single-phase ANT cell.

1) When CLK = 0, the cell is in the precharge phase. MP1 is on and MN4 is off. Since the gate drive of MN2 is VDD, this transistor is always on. The node voltage $V_a$ will be $VDD - V_{th}$. Thus, MP2 is off and MN3 is on. The output $V_Y$ keeps its previous state.

2) When the CLK input is high, the cell enters the evaluation phase. The operation consists of 4 cases depending on the previous state of the output $V_Y$ and the on/off state of the N-block. The cases are discussed as follows.

**Case 1:** When the N-block is turned on, $V_a$ is discharged to GND through the N-block. Thus, MP2 is turned on and MN3 is off. The output $V_Y$ is then charged to VDD.

**Case 2:** If the N-block is turned off and $V_a = VDD - V_{th}$, MP2 is off and MN3 is on, driving $V_Y$ to 0. By contrast, if the N-block is turned on, the node voltage $V_a$ is discharged to GND through the N-block and MN4. MP2 is then turned on and MN3 is off, pulling the output $V_Y$ high.

**Case 3:** When the N-block is turned off and the node voltage $V_a$ is $VDD - V_{th}$, MP2 is turned off and MN3 is on, causing $V_Y$ to be discharged from VDD to 0.

**Case 4:** When the N-block is turned off, the voltage $V_a$ turns off MP2 and turns on MN3, which keeps $V_Y$ pulled down at 0.

Fig. 3. Block diagram of 8-bit CLA.

Table I summarizes the transistor states and the outcome of the proposed single-phase ANT logic in the above scenarios.

TABLE I STATE OF TRANSISTORS IN EACH CASE

| Phase             | Case   | Transistor state                                         | Outcome                |
|-------------------|--------|----------------------------------------------------------|------------------------|
| Precharge, CLK=0  |        | MP1 on, MN2 on, MN4 off                                  | $V_Y$ = previous state |
| Evaluation, CLK=1 | Case 1 | N-block on, MP2 on, MN3 off, MN4 on                      | $V_Y = VDD$            |
|                   | Case 2 | N-block off to on, MP2 off to on, MN3 on to off, MN4 on  | $V_Y = 0$ to VDD       |
|                   | Case 3 | N-block off, MP2 off, MN3 on, MN4 on                     | $V_Y = VDD$ to 0       |
|                   | Case 4 | N-block off, MP2 off, MN3 on, MN4 on                     | $V_Y = 0$              |
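The precharge and evaluation behavior summarized in Table I can be sketched as a logic-level model. This is only an illustrative abstraction, not the authors' circuit netlist: voltages are reduced to 0/1, the function name `ant_cell` and its parameters are hypothetical, and analog effects such as the $VDD - V_{th}$ swing and the RC delay from MN2 are ignored.

```python
def ant_cell(clk: int, n_block_on: bool, prev_y: int) -> int:
    """Logic-level sketch of one single-phase ANT cell (hypothetical helper).

    clk        : 0 = precharge phase, 1 = evaluation phase
    n_block_on : True when the N-block conducts
    prev_y     : previous logic value of the output V_Y (0 or 1)
    """
    if clk == 0:
        # Precharge: the output keeps its previous state (Table I, first row).
        return prev_y
    # Evaluation: a conducting N-block discharges V_a, so MP2 drives Y high;
    # otherwise MP2 is off and MN3 drives Y low.
    return 1 if n_block_on else 0


# The four evaluation cases, with the previous output state assumed per case.
cases = {
    "case 1": ant_cell(1, True, prev_y=1),   # output stays at VDD
    "case 2": ant_cell(1, True, prev_y=0),   # output goes 0 -> VDD
    "case 3": ant_cell(1, False, prev_y=1),  # output goes VDD -> 0
    "case 4": ant_cell(1, False, prev_y=0),  # output stays at 0
}
for name, y in cases.items():
    print(name, "-> V_Y =", "VDD" if y else "0")
```

In this abstraction the evaluated output depends only on whether the N-block conducts; the previous state matters only for distinguishing which transition (or hold) is observed.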

## C. 8-bit CLA using single-phase ANT logic

The block diagram of the 8-bit CLA taking advantage of the proposed single-phase ANT logic is shown in Fig. 3. The schematics of the generation (G<sub>i</sub>) and propagation (P<sub>i</sub>) blocks are shown in Fig. 4 and Fig. 5, respectively. The equations for G<sub>i</sub> and P<sub>i</sub>,  $i = 0 \sim 7$, are given by Eqn. (1).

$$P_i = A_i \oplus B_i, \quad G_i = A_i B_i \tag{1}$$
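As a small illustration of Eqn. (1), the propagate and generate signals for all bit positions can be computed with bitwise operations on the operand words. The function name `propagate_generate` and the `width` parameter are assumptions introduced here for the sketch; the operands are pattern X1 from Table II.

```python
def propagate_generate(a: int, b: int, width: int = 8) -> tuple[int, int]:
    """Return (P, G) as bit vectors, where bit i holds P_i and G_i of Eqn. (1)."""
    mask = (1 << width) - 1
    p = (a ^ b) & mask  # P_i = A_i XOR B_i
    g = (a & b) & mask  # G_i = A_i AND B_i
    return p, g


# Pattern X1 from Table II: A = 00011010, B = 10110010
p, g = propagate_generate(0b00011010, 0b10110010)
print(f"P = {p:08b}, G = {g:08b}")  # prints "P = 10101000, G = 00010010"
```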

Fig. 4. Schematic of $G_i$ generation circuit.

Fig. 5. Schematic of $P_i$ propagation circuit.

The proposed carry and sum generation circuits using the single-phase ANT logic are shown in Fig. 6 and Fig. 7, respectively. The equations for the carry ($C_i$) and sum ($S_i$) signals are presented in Eqn. (2) and (3), respectively.

$$C_i = G_i + P_i G_{i-1} + \ldots + P_i P_{i-1} \cdots P_0 C_{in} \tag{2}$$

$$S_i = P_i \oplus C_{i-1} \tag{3}$$
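Eqns. (1) to (3) can be checked with a short bit-level sketch. It evaluates the flattened look-ahead expression of Eqn. (2) for every carry rather than rippling, and it assumes $C_{-1} = C_{in}$ when forming $S_0$. The function name `cla_add` and its parameters are hypothetical and say nothing about the transistor-level implementation.

```python
def cla_add(a: int, b: int, c_in: int = 0, width: int = 8) -> tuple[int, int]:
    """Bit-level carry look-ahead addition; returns (Cout, S)."""
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]  # Eqn. (1)
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]  # Eqn. (1)

    carries = []
    for i in range(width):
        # Eqn. (2): C_i = G_i + P_i G_{i-1} + ... + P_i P_{i-1} ... P_0 C_in
        c = g[i]
        prod = p[i]  # running product P_i P_{i-1} ... P_{j+1}
        for j in range(i - 1, -1, -1):
            c |= prod & g[j]
            prod &= p[j]
        c |= prod & c_in
        carries.append(c)

    s = 0
    for i in range(width):
        c_prev = c_in if i == 0 else carries[i - 1]  # assumes C_{-1} = C_in
        s |= (p[i] ^ c_prev) << i                     # Eqn. (3)
    return carries[-1], s


cout, s = cla_add(0b01000011, 0b11100001)  # pattern X3 from Table II
print(cout, f"{s:08b}")  # prints "1 00100100"
```

Each carry here is computed directly from the $P$ and $G$ signals and $C_{in}$, which is the property that lets a CLA avoid waiting on a rippled carry chain.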

The output from sum generation is coupled through 3 stages of tapered buffers to drive the capacitive load of 60 pF.

#### **III. IMPLEMENTATION AND SIMULATION**

The 8-bit single-phase ANT CLA is realized using TSMC 40-nm CMOS technology. Fig. 8 shows the layout, where the core area is 154.776  $\mu$m × 179.165  $\mu$m, and the chip area is 797.565  $\mu$m × 804.395  $\mu$m. In all-PVT-corner post-layout simulations, the worst delay is observed in the SS corner with VDD = 0.82 V at 0°C, at a clock frequency of 20 GHz with a 60 pF load, where the worst-case power and delay are 69.7 mW and 1.52 ns, respectively.
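As a quick unit check, the layout dimensions quoted above can be multiplied out and converted to mm². The sketch assumes the two numbers in each pair are side lengths in micrometers; the helper name `area_mm2` is introduced here for illustration.

```python
UM2_PER_MM2 = 1_000_000  # 1 mm^2 = 10^6 um^2


def area_mm2(width_um: float, height_um: float) -> float:
    """Area of a width x height rectangle, converted from um^2 to mm^2."""
    return width_um * height_um / UM2_PER_MM2


core = area_mm2(154.776, 179.165)
chip = area_mm2(797.565, 804.395)
print(f"core = {core:.4f} mm^2, chip = {chip:.4f} mm^2")
```

Under that assumption the core comes out at roughly 0.0277 mm², in line with the 0.027 mm² core area listed for this work in Table III.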

Selected simulation input patterns and the corresponding expected outputs used to test the functionality of the 8-bit CLA are shown in Table II. Fig. 9 shows the worst-case post-layout simulation results, where the inputs are those given in Table II. Notably, the output waveforms match the expected outputs, which verifies the functionality of the 8-bit CLA using the proposed single-phase ANT logic. Table III shows the comparison of several prior works with our design. The proposed design attains the best normalized power and PDP, with values of 0.071 mW and 0.08 pJ, respectively.

Fig. 6. Schematic of 8-bit carry generation circuit.

Fig. 7. Schematic of 1-bit sum generation circuit.

#### **IV. CONCLUSION**

This paper showcased an 8-bit CLA using single-phase ANT logic simulated at a clock frequency of 20 GHz with a load of 60 pF. The proposed circuit incorporates effective capacitance reduction that improves signal transduction and is implemented in 40 nm CMOS technology. Referring to Table III, the proposed adder outperforms the prior works in terms of normalized power and PDP.

TABLE II SIMULATION INPUT PATTERN AND EXPECTED OUTPUT

| $X_i$ | Inputs $A7 \sim A0$ | Inputs $B7 \sim B0$ | Outputs $Cout, S7 \sim S0$ |
|-------|---------------------|---------------------|----------------------------|
| X1    | 00011010            | 10110010            | 0, 11001100                |
| X2    | 11010010            | 00101101            | 0, 11111111                |
| X3    | 01000011            | 11100001            | 1, 00100100                |
| X4    | 10101000            | 10001010            | 1, 00110010                |
| X5    | 11000011            | 01100100            | 1, 00100111                |
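The expected outputs in Table II can be reproduced with plain integer addition, which gives a reference that is independent of any adder implementation. The dictionary `patterns` simply transcribes the table, and `expected_output` is a hypothetical helper that assumes $C_{in} = 0$.

```python
# (A7~A0, B7~B0, Cout, S7~S0) transcribed from Table II
patterns = {
    "X1": ("00011010", "10110010", 0, "11001100"),
    "X2": ("11010010", "00101101", 0, "11111111"),
    "X3": ("01000011", "11100001", 1, "00100100"),
    "X4": ("10101000", "10001010", 1, "00110010"),
    "X5": ("11000011", "01100100", 1, "00100111"),
}


def expected_output(a_bits: str, b_bits: str) -> tuple[int, str]:
    """Reference 8-bit sum by integer addition, assuming C_in = 0."""
    total = int(a_bits, 2) + int(b_bits, 2)
    return total >> 8, format(total & 0xFF, "08b")


for name, (a_bits, b_bits, cout, s_bits) in patterns.items():
    ok = expected_output(a_bits, b_bits) == (cout, s_bits)
    print(name, "matches" if ok else "MISMATCH")
```

All five patterns agree with the tabulated carry-out and sum bits.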

TABLE III COMPARISON OF SEVERAL PREVIOUS WORKS

|                              | [5]                   | [6]                   | [7]              | [8]              | [2]              | This work        |
|------------------------------|-----------------------|-----------------------|------------------|------------------|------------------|------------------|
| Year                         | 2013                  | 2018                  | 2019             | 2020             | 2021             | 2022             |
| Publications                 | ISOCC                 | TVLSI                 | SEC              | TN               | APCCAS           |                  |
| Technology                   | 16 nm CN-MOSFET       | 65 nm CMOS            | 28 nm CMOS       | 45 nm CNFET      | 16 nm FinFET     | 40 nm CMOS       |
| Verification                 | Post-layout sim.      | Post-layout sim.      | Post-layout sim. | Post-layout sim. | Post-layout sim. | Post-layout sim. |
| Supply voltage, VDD (V)      | 0.7                   | 1.2                   | 0.9              | 1.0              | 0.8              | 0.9              |
| Max. Freq. (GHz)             | 1.5                   | 1                     | 0.5              | 0.1              | 20               | 20               |
| Length (bits)                | 32                    | 1                     | 8                | 1                | 8                | 8                |
| Delay (ns)                   | 0.299                 | 0.0518                | 0.001117         | 0.027            | 0.931            | 1.52             |
| Power consumption (mW)       | 0.01982               | 0.00444               | 0.008865         | 0.0024           | 23.28            | 69.7             |
| Load capacitance (pF)        | 0.00025               | 0.01                  | 0.01             | 0.001            | 20               | 60               |
| Core area (mm<sup>2</sup>)   | $1.47 \times 10^{-6}$ | $6.84 \times 10^{-7}$ | N/A              | N/A              | 0.0313           | 0.027            |
| <sup>a</sup>Nor. Power (mW)  | 107.86                | 0.308                 | 2.189            | 24               | 0.091            | 0.071            |
| <sup>b</sup>Nor. PDP (pJ)    | 0.17                  | 2.58                  | 3.85             | 0.32             | 0.51             | 0.08             |

Note: <sup>a</sup>Nor. Power $= \frac{Power}{Freq \cdot C_{Load} \cdot VDD^2}$.

<sup>b</sup>Nor. PDP $= \frac{Nor.\ Power \times Delay}{(Process)^2 \times (VDD)^2}$.
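The normalized power figures in Table III appear to follow from dividing the power consumption by $Freq \cdot C_{Load} \cdot VDD^2$, taking the table units (mW, GHz, pF, V) at face value. The sketch below is only a consistency check under that reading; the function name `normalized_power` and its parameter names are introduced here for illustration.

```python
def normalized_power(power_mw: float, freq_ghz: float,
                     c_load_pf: float, vdd_v: float) -> float:
    """Power / (Freq * C_Load * VDD^2), using the units listed in Table III."""
    return power_mw / (freq_ghz * c_load_pf * vdd_v ** 2)


this_work = normalized_power(69.7, 20, 60, 0.9)  # "This work" column
ref2 = normalized_power(23.28, 20, 20, 0.8)      # column [2]
print(round(this_work, 4), round(ref2, 4))       # prints "0.0717 0.0909"
```

These values agree with the 0.071 and 0.091 entries listed in the table for this work and [2], respectively.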



Fig. 8. Layout of proposed design.

[Waveform plot: traces A0~A7, B0~B7, S0 to S7, and Cout for input patterns X1 to X5, with annotated delay = 1.52 ns.]

Fig. 9. Post-layout simulation waveforms in the SS corner, VDD = 0.82 V and 0°C.

#### ACKNOWLEDGMENT

The Ministry of Science and Technology (MOST), Taiwan, has provided partial funding for this study under grant numbers MOST 109-2221-E-032-001-MY3 and MOST 110-2224-E110-004. Furthermore, the researchers would like to express their heartfelt gratitude to the Taiwan Semiconductor Research Institute (TSRI) of National Applied Research Laboratories (NARL) for their support with the EDA tool.

#### REFERENCES

- [1] S. Goel, A. Kumar, and M. A. Bayoumi, "Design of robust, energy-efficient full adders for deep-submicrometer design using hybrid-CMOS logic style," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 14, no. 12, pp. 1309–1321, Dec. 2006.
- [2] T.-J. Lee, W.-S. Yang, and C.-C. Wang, "A 20 GHz 8-bit all-n-transistor logic CLA using 16-nm FinFET technology," in *2021 IEEE Asia Pacific Conference on Circuit and Systems (APCCAS)*. IEEE, Feb. 2021, pp. 33–36.
- [3] M. Afghahi, "A robust single phase clocking for low power, high-speed VLSI applications," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 2, pp. 247–254, Feb. 1996.
- [4] M. Kargar and M. B. Ghaznavi-Ghoushchi, "A high performance, race eliminated, two phase nonoverlapping clocked all-n-logic for both strong and subthreshold designs," in *The 16th CSI International Symposium on Computer Architecture and Digital Systems (CADS 2012)*. IEEE, May 2012, pp. 87–92.
- [5] Y. Sun and V. Kursun, "A comparison of high-frequency 32-bit dynamic adders with conventional silicon and novel carbon nanotube transistor technologies," in 2013 International SoC Design Conference (ISOCC). IEEE, Jul. 2013, pp. 039–042.
- [6] H. Naseri and S. Timarchi, "Low-power and fast full adder by exploring new XOR and XNOR gates," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, vol. 26, no. 8, pp. 039–042, Jul. 2018.
- [7] W. Al-Akel, K. Abugharbieh, A. Hasan, and H. W. Marar, "A power efficient 500MHz adder," in *2019 SoutheastCon*. IEEE, Mar. 2019, pp. 1–6.
- [8] S. Vidhyadharan and S. S. Dan, "An efficient ultra-low-power and superior performance design of ternary half adder using CNFET and gate-overlap TFET devices," *IEEE Transactions on Nanotechnology*, vol. 20, pp. 1–12, Jan. 2021.