# Matrix Phase Shift Based DPWM Technique To Achieve 90% Duty Cycle

Venkata Naveen Kolakaluri, Oliver Lexter July A. Jose, and Chua-Chin Wang Dept. of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424 Corresponding Author: ccwang@ee.nsysu.edu.tw

Abstract—For high-resolution, high-accuracy applications, recent digital PWM (DPWM) designs use long DFF arrays, resulting in clock skew that may compromise DPWM performance. This study proposes a DPWM based on a matrix phase shifter and a clock gating technique that selects only a specific row of DFFs during operation, depending on the required level of $V_o$. The proposed design minimizes the clock skew caused by redundant DFF array clock activity. In addition, it uses a single clock and does not need any extra signal for synchronization. It has a dead-time generator that prevents shoot-through. The proposed DPWM has been implemented in UMC 180-nm CMOS technology with an overall chip area of 1285 x 1285 $\mu$m<sup>2</sup> and a core area of 725.1 x 282.8 $\mu$m<sup>2</sup>. The all-PVT-corner post-layout simulation confirms the functionality of the design with a maximum duty cycle of 90.6% at $C_{load} = 60 \text{ pF}$ and $f_{clk} = 100$ MHz.

*Index Terms*—DPWM, DFF array, clock gating, matrix phase shifter

#### I. INTRODUCTION

The Power Management Integrated Circuit (PMIC) is the fundamental power management element for various types of ICs, such as memories and processors [1]. With the introduction of PMICs in low-power applications, the need for digital Pulse Width Modulation (DPWM) controllers has increased. DPWM is preferred to analog PWM (APWM) because it can work at low $V_{supply}$ with low quiescent current, making it better suited for low-power applications [2]. Though APWM produces an accurate signal, it cannot work in low-voltage applications due to analog circuit headroom issues [3] [4]. The maximum duty cycle that the DPWM generator circuit can produce is one factor that determines its efficiency.

A novel high-resolution DPWM circuit based on FPGA was reported, consisting of a D/A converter and RC circuit [5]. The DPWM signal generator produced a better resolution compared to the conventional method. However, this DPWM based on FPGA is costly, consumes a large area, and doesn't have a dead time.

A digital buck converter implemented in a 40-nm CMOS process took advantage of a DPWM controller for low-power and low-voltage wireless sensor network systems [6]. This DPWM controller used three clock inputs to vary the signal and to determine the duty-varying frequency. Using three clocks requires a specialized circuit for which good stability is difficult to attain.

A study proposed a DPWM with a clock-gating shift register for a low-power PWM buck converter [3]. The DPWM controller of this converter used two clock inputs to generate the PWM signal, and a slow bidirectional shift register and a fast shift register to adjust its duty ratio. However, the architectures mentioned above have a low maximum duty cycle of 50% compared to conventional APWMs, which makes them inefficient. In addition, both designs use a long chain of D flip-flops for high resolution, which may result in severe clock skew.

This study presents a single-clock DPWM circuit employing a matrix array consisting of DFF-based chains. The DFF-based chain is split into several rows and columns that can be optimized, resulting in minimal clock skew. A 4-bit resolution is used in the proposed DPWM to explain the design's functionality, and the design can be scaled to a higher resolution. In addition, it only generates DPWM signals and does not include the buck topology. The proposed method also uses a dual-clock-edge clocking technique that reduces glitches during the operation of the shift registers. Furthermore, it has a dead-time circuit to prevent shoot-through, with a maximum output duty cycle of 90.6%.

#### II. DPWM BASED ON MATRIX PHASE SHIFTER AND CLOCK GATING

The block diagram of the proposed DPWM is shown in Fig. 1. The Sequence decoder converts clock pulses into binary numbers coupled to the matrix phase shifter. The matrix phase shifter consists of 16 DFF chains arranged into a 4 x 4 matrix that reduces the clock skew. The Transition controller generates the fundamental DPWM signals, $V_{tcQ}$ and $V_{tcQb}$, which are fed to the Dead time generator to avoid shoot-through. The synchronizer ensures that the $T_{PWM}$ of $V_{oH}$ and $V_{oL}$ is equal to $2^N \cdot T_{clk}$. Fig. 2 shows the timing diagram for the proposed DPWM design. $V_S$ is a one-clock-cycle pulse that changes state at the positive clock edge and provides a coarse timing signal for start-of-sequence (SOE) generation. It repeats every $T_{PWM}$. $V_{D0}$ to $V_{D(2^N-2)}$ are generated from the $V_S$ signal and represent all the SOE positions. The $V_D$ signals are updated at the negative clock edge to create a delay of half a clock cycle with respect to $V_S$ for stability. Based on the "sel" inputs, one

Prof. Chua-Chin Wang is the corresponding author. He is also with Ins. of Undersea Technology, National Sun Yat-Sen University, Taiwan. He is also adjunct professor of Vel Tech. U., India.

O. L. J. A. Jose is also connected with Dept. of Electronics Engineering, Batangas State University, The National Engineering University, Philippines.



Fig. 1. Block diagram of the proposed DPWM



Fig. 2. Proposed DPWM timing diagram

SOE is selected while $V_{tcQ}$ of the Transition controller turns on at the positive level of the selected SOE. To avoid shoot-through during the transition, $V_{oH}$ is delayed by $t_{dt}$ with respect to $V_{tcQ}$. Because $V_{tcQ}$ is a positive edge-modulated PWM, the SOE position is varied to modulate the duty cycle, while the end-of-sequence (EOS) is fixed at the positive level of $V_{sync}$. The maximum output voltage of the proposed DPWM is given by Eqn. (1).

$$V_{out\_max} = \left(\frac{(2^N - 1) \cdot T_{clk} - t_{dt}}{2^N \cdot T_{clk}} \cdot 100\%\right) \cdot V_{in} \quad (1)$$

where N is the transition controller resolution, and  $t_{dt}$  is the dead time.  $V_{sync}$  ensures that  $T_{PWM}$  is equal to  $2^N$  clock periods. For example, given "sel" = 1100,  $V_{tcQ}$  turns on at the positive level of  $V_{D3}$  and turns off with the  $V_{sync}$  pulse.  $V_{oH}$  and  $V_{oL}$  signals are both zero for a period of  $t_{dt}$  for every  $T_{PWM}$ .  $V_{oH}$  and  $V_{tcQ}$  are both turned off at the same negative clock edge.
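As a quick numerical check of Eqn. (1), the following sketch evaluates the maximum duty cycle for the 4-bit, 100 MHz case considered in this paper, taking $t_{dt}$ as half a clock period as described for the Dead time generator. The function and variable names are illustrative only and are not part of the design.

```python
def max_duty_cycle(n_bits: int, t_clk: float, t_dt: float) -> float:
    """Maximum duty cycle (as a fraction) from Eqn. (1), without the V_in factor."""
    t_pwm = (2 ** n_bits) * t_clk
    return ((2 ** n_bits - 1) * t_clk - t_dt) / t_pwm


T_CLK = 10e-9      # clock period for f_clk = 100 MHz, in seconds
T_DT = T_CLK / 2   # assumed dead time: half a clock period

duty = max_duty_cycle(4, T_CLK, T_DT)
print(f"max duty cycle = {duty * 100:.1f} %")  # prints "max duty cycle = 90.6 %"
```

Multiplying the returned fraction by $V_{in}$ gives $V_{out\_max}$ as in Eqn. (1).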

#### A. Sequence decoder

Fig. 3 shows the positive-edge-triggered Sequence decoder, which consists of a sequence generator and a decoder that provide the timing control for the DPWM output signals. The sequence generator converts the clock pulses to binary numbers depending on the DPWM resolution and changes the state



Fig. 3. Sequence decoder block diagram

at each positive edge of the clock. The sequence generator's pulse-pipelining architecture converts each binary count into individual pulses that indicate a distinct clk signal position, $Q_0$ to $Q_N$. The number of decoder output states varies depending on the dimensions of the DFF array in the matrix phase shifter.

#### B. Matrix phase shifter

Most DPWM architectures use long DFF arrays for high-resolution or high-accuracy applications, resulting in severe clock skew that may affect the functionality. The matrix phase shifter shown in Fig. 4 reorganizes the DFF arrays into a matrix architecture to minimize the clock skew by varying the DFF array's number of rows and columns. The DFF array in the matrix phase shifter can be organized using Eqn. (2).

$$C \cdot R = 2^N \tag{2}$$

where C and R are the numbers of columns and rows, respectively. Though the matrix size can vary, the square matrix topology provides the optimal trade-off between clock skew and the number of Sequence decoder outputs.
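To make the design space of Eqn. (2) concrete, the sketch below lists every (C, R) pair for a given resolution. It is only an enumeration aid; the helper name `matrix_shapes` is hypothetical, and the code does not model clock skew or decoder complexity.

```python
def matrix_shapes(n_bits: int) -> list[tuple[int, int]]:
    """All (columns, rows) pairs with columns * rows == 2**n_bits, per Eqn. (2)."""
    total = 2 ** n_bits
    return [(2 ** k, total // 2 ** k) for k in range(n_bits + 1)]


for cols, rows in matrix_shapes(4):
    print(f"C = {cols:2d}, R = {rows:2d}")
```

For N = 4 this yields 1 x 16, 2 x 8, 4 x 4, 8 x 2, and 16 x 1 arrangements, of which the square 4 x 4 case is the one adopted in this work.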

The Matrix phase shifter used in this research has 16 DFFs arranged in a 4 x 4 matrix. It consists of 15 SOEs positioned in $R_0$, $R_1$, $R_2$, and $R_3$, and one EOS. Since the number of columns in the matrix phase shifter may change, the number of Sequence decoder outputs will also vary. The DFF array and the Sequence decoder are updated at the negative and positive edges of the clock, respectively, creating a delay of 1/2 clk period that eliminates false triggering.

A clock gating technique is implemented in the Matrix phase shifter to reduce the power consumption caused by redundant DFF array clock activity [7]. The "sel" input determines which row of the DFF array is activated by the clock signal, depending on the dimensions of the Matrix phase shifter. For example, if "sel" is 00XX, 01XX, 10XX, or 11XX, row $R_3$, $R_2$, $R_1$, or $R_0$ will be triggered, respectively.
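The row selection in this example can be sketched behaviorally as follows. This is a software model under stated assumptions, not the gate-level circuit: the number of rows is assumed to be a power of two, the most significant bits of "sel" are assumed to select the row, and the function names are hypothetical.

```python
def active_row(sel: int, n_bits: int = 4, rows: int = 4) -> int:
    """Index i of the row R_i whose clock is enabled for a given "sel" code.

    Follows the 4 x 4 example mapping:
    00XX -> R3, 01XX -> R2, 10XX -> R1, 11XX -> R0.
    """
    if not 0 <= sel < 2 ** n_bits:
        raise ValueError("sel out of range")
    row_bits = (rows - 1).bit_length()   # 2 bits for 4 rows (power of two assumed)
    upper = sel >> (n_bits - row_bits)   # most significant bits of sel
    return (rows - 1) - upper


def clock_enables(sel: int, n_bits: int = 4, rows: int = 4) -> list[bool]:
    """Per-row clock-enable flags; exactly one row is clocked."""
    target = active_row(sel, n_bits, rows)
    return [r == target for r in range(rows)]


print(active_row(0b1100))      # 0, i.e. row R0
print(clock_enables(0b0010))   # [False, False, False, True], i.e. only R3 is clocked
```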

#### C. Transition controller

The Transition controller shown in Fig. 5 generates the fundamental DPWM signals $V_{tcQ}$ and $V_{tcQb}$ without the implementation of dead time. It employs a $2^N$-input MUX and an SR latch to create a positive-edge PWM signal with a modulated rising edge and a fixed falling edge. The SOE and EOS signals set and reset the latch, respectively. The MUX's selection lines are coupled to the "sel" input, which selects the position of SOE relative to EOS and generates the modulated pulse width. The "sel" input is updated after one $T_{PWM}$. During the transition



Fig. 4. Matrix phase shifter circuit



Fig. 5. Transition controller circuit

of selection inputs, the MUX is updated at the positive edge of EOS, while the next SOE is generated after the EOS signal ends. Because the MUX output transition is independent of the current input state, false triggering is eliminated. The SOE signals from the phase shifter are connected in inverted sequence because the output is a leading-edge-modulated PWM.

The effective pulse width or the turn-on time of  $V_{oH}$  without dead time generated by a specific SOE pulse can be calculated using Eqn. (3).

$$T_{eff} = (2^N - 1) - D_N \tag{3}$$

where $D_N$ is the position of the DFF that produces the SOE, and $T_{eff}$ is expressed in clock periods. Since the SR latch provides an invalid output when both inputs are high, $D_N$ must not be equal to $2^N - 1$. The $I_0$ input of the MUX is connected to ground to avoid the invalid SR latch case, resulting in the desired output of $V_{oH}$ = low for "sel" = 0000.
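Eqn. (3) can be tabulated with the short sketch below. It assumes that the inverted SOE connection maps "sel" to $D_N = (2^N - 1) - sel$, which agrees with the "sel" = 1100 example selecting $V_{D3}$ but is otherwise an inference from the text; the function names are hypothetical.

```python
def soe_position(sel: int, n_bits: int = 4) -> int:
    """DFF index D_N producing the selected SOE (assumed inverted mapping)."""
    if not 0 < sel < 2 ** n_bits:
        raise ValueError("sel = 0 selects the grounded I_0 input, not an SOE")
    return (2 ** n_bits - 1) - sel


def effective_width(sel: int, n_bits: int = 4) -> int:
    """T_eff from Eqn. (3), in clock periods, before dead time is applied."""
    if sel == 0:
        return 0  # I_0 is grounded, so the output stays low
    return (2 ** n_bits - 1) - soe_position(sel, n_bits)


for sel in (0b0010, 0b1100, 0b1111):
    print(f"sel={sel:04b}  D_N={soe_position(sel):2d}  T_eff={effective_width(sel):2d}")
```

Under this assumed mapping, $T_{eff}$ in clock periods equals the decimal value of "sel", so "sel" = 1100 gives 12 clock periods and "sel" = 1111 gives the maximum of 15.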

#### D. Synchronizer

For the proposed DPWM to be accurate, the total period ($T_{PWM}$) of the output signals $V_{oH}$ and $V_{oL}$ must equal $2^N$ clk periods. In every complete cycle, the synchronizer circuit shown in Fig. 1 holds the period of $V_{oH}$ and $V_{oL}$ at $2^N$ clk periods. The primary function of the synchronizer circuit is to reset, or end, the pulse sequence of the Transition controller every $T_{PWM}$ to ensure its output stability. When the outputs of the sequence generator, $Q_0$ to $Q_N$, within the sequence decoder are all "1", a logic-1 $V_{sync}$ pulse equal to one clk cycle



Fig. 6. Dead time generator circuit



Fig. 7. Proposed DPWM Layout

is generated. After every  $T_{PWM}$ , a synchronizer pulse is produced to reset  $V_{oH}$  and  $V_{oL}$ .
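The period enforced by the synchronizer can be checked numerically. The sketch below computes $T_{PWM} = 2^N \cdot T_{clk}$ and the resulting switching frequency for the 4-bit, 100 MHz case; the function name is illustrative.

```python
def pwm_period(n_bits: int, f_clk_hz: float) -> float:
    """T_PWM = 2**N clock periods, in seconds."""
    return (2 ** n_bits) / f_clk_hz


t_pwm = pwm_period(4, 100e6)
print(f"T_PWM = {t_pwm * 1e9:.0f} ns, f_sw = {1e-6 / t_pwm:.2f} MHz")
# prints "T_PWM = 160 ns, f_sw = 6.25 MHz"
```

The 6.25 MHz result matches the switching frequency listed for this work in Table I.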

#### E. Dead time generator

The Dead time generator prevents shoot-through by generating a delay ($t_{dt}$) between $V_{oH}$ and $V_{oL}$. Fig. 6 shows the Dead time generator circuit, which uses an auto dual-edge-triggering flip-flop (Auto-DETFF). Referring to Fig. 2, the Dead time generator turns on during the positive edge of the $V_{tcQ}$.clk signals, producing a $t_{dt}$ of 1/2 clk period. After this, the Dead time generator shifts to the negative edge triggered by $V_{tcQ}$. Overall, the Dead time generator produces one clock cycle of dead time out of the $2^N$ total cycles.

#### III. IMPLEMENTATION AND SIMULATION

Fig. 7 shows the layout of the proposed DPWM circuit implemented in a 180-nm CMOS process, with an overall chip area of 1285 x 1285 $\mu$m<sup>2</sup> and a core area of 725.1 x 282.8 $\mu$m<sup>2</sup>. The layout pin count is based on the package availability from Taiwan Semiconductor Research Institute (TSRI), while the extra pins are used to provide redundancy in power and ground connections. Fig. 8 shows the output waveforms of the DPWM design in all-PVT-corner post-layout simulations. The design is analyzed at five process corners (SS, SF, TT, FS, and FF), $V_{DD}$ = 1.62, 1.8, and 1.98 V, and temperatures of 0, 25, and 75°C, for "sel" = 0001 to 1111. A load capacitance of $C_{load}$ = 60 pF (oscilloscope probe) is used during the simulation at $f_{clk} = 100$ MHz. The DPWM design has an average power consumption of $P_{ave}$ = 5.05 mW at the maximum duty cycle of 90.6%. Moreover, a linear relationship between duty cycle and time is verified, making the output less affected by clock skew.

Fig. 9 shows the detailed operation of the proposed DPWM for "sel" = 1100 and 0010 in the all-PVT-corner post-layout simulation. $V_S$ changes the state at the positive edge of the clock



Fig. 8. All-PVT corners post-layout simulations of the proposed DPWM



Fig. 9. Proposed DPWM all-PVT post-layout simulations in detail

for one clock period and every $T_{PWM}$. The upward arrow represents the specific SOE for the DFFs in $R_0$ ($V_{D0}$ to $V_{D3}$) of the Matrix phase shifter circuit. A delay of 1/2 clk period is generated between $V_S$ and $V_{D0}$ to ensure the DPWM stability. Notably, the clock gating technique guarantees that no SOE is generated for the other rows ($R_1$ to $R_3$) for "sel" = 1100. During the positive level of $V_{D3}$, the transition controller is triggered by the SOE signal and reset by $V_{sync}$. The $V_{sync}$ pulse is generated every 16 clock pulses and maintains $T_{PWM}$ equal to $2^N$ clk periods. $V_{tcQ}$ is generated during the positive level of $V_{DN}$, while $V_{oH}$ is shifted by 1/2 clk period to avoid shoot-through, and both are turned off at the same negative edge of the 16th clock pulse. $V_{oL}$ follows the same sequence as $V_{oH}$ with inverted output voltage. The same process can be observed for "sel" = 0010. $R_3$ is triggered in this case, generating SOEs from $D_{12}$ to $D_{14}$, while none are generated for $R_0$ to $R_2$ ($D_0$ to $D_{11}$).

Table I summarizes the performance comparison with prior DPWM architectures. The proposed DPWM employs only one clock, reducing design complexity and making it scalable to any frequency. The synchronization implemented in the proposed design increases DPWM stability. The matrix phase shifter reduces the DFF array length, minimizes the effect of clock skew, and provides a linear relationship between the duty cycle and time. Furthermore, unlike the other PWMs in Table I, the maximum duty cycle of the proposed design is not limited to 50%. The design can be scaled to higher resolutions, which improves the line and load regulation.

TABLE I DPWM PERFORMANCE COMPARISON

|                              | [6]           | [5]            | [3]       | this<br>work        |
|------------------------------|---------------|----------------|-----------|---------------------|
| Year                         | 2014          | 2018           | 2019      | 2022                |
| Publication                  | JSSC          | ECCE           | ICEIC     | -                   |
| Technology                   | 40-nm         | FPGA           | 65-nm     | 180-nm              |
| Verification                 | Meas.         | Meas.          | Meas.     | Post-layout<br>sim. |
| $V_{DD}$ (V)                 | 0.6 - 1.1     | 1              | 0.6       | 1.8                 |
| $V_{out}$ (V)                | 0.3 - 0.55    | N/A            | 0.1 - 0.5 | 0.05 - 1.6          |
| Switching<br>Frequency (MHz) | 0.1           | 1              | 1         | 6.25                |
| Resolution (bits)            | 6             | 18             | 6         | 4                   |
| Load                         | 220 μH        | 100 μF & 19 μH | 4.7 μH    | 60 pF               |
| Max. duty ratio              | 50 %          | N/A            | 50 %      | 90.6 %              |
| Core area (mm<sup>2</sup>)   | 31.43         | N/A            | 31.73     | 0.205               |
| Power diss. (mW)             | N/A           | N/A            | N/A       | 5.05                |

#### IV. CONCLUSION

This study demonstrates a DPWM based on a matrix phase shifter and clock gating, implemented in UMC 180-nm CMOS technology. The design reduces clock skew caused by redundant DFF array clock activity. The additional Dead time generator circuit prevents shoot-through during operation. The all-PVT-corner post-layout simulation validated the design's functionality with a maximum duty cycle of 90.6% at a $C_{load}$ of 60 pF and a clock frequency of 100 MHz.

#### ACKNOWLEDGMENT

The National Science and Technology Council (NSTC) provided partial funding for this research under grant numbers NSTC 110-2221-E-110-063-MY2, NSTC 110-2224-E-110-004-, and NSTC 110-2623-E-110-001-.

#### REFERENCES

- [1] Y. Liang, L. Tao, and C. Xing, "A new solution of power management integrated circuit one time programable test," in *Proc. 2021 China Semicond. Technol. Int. Conf. (CSTIC)*, 2021, pp. 1–3.
- [2] P.-H. Chen, C.-S. Wu, and K.-C. Lin, "A 50 nW-to-10 mW output power tri-mode digital buck converter with self-tracking zero current detection for photovoltaic energy harvesting," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 2, pp. 523–532, February 2016.
- [3] T.-H. Kim, D.-J. Kim, H.-S. Shin, S.-H. Lee, J.-W. Suh, and B.-D. Yang, "Low power digital PWM buck converter with a clock-gating shiftregister," in *Proc. 2019 Int. Conf. Electron., Inf., Commun. (ICEIC)*, 2019, pp. 1–3.
- [4] C.-C. Wang, O. L. J. A. Jose, P.-Y. Lou, C.-J. Hsu, L. K. S. Tolentino, and R. G. B. Sangalang, "Single-chip dc-dc buck converter design based on PWM with high-efficiency in light load," *International Journal of Electronics Letters*, vol. 0, no. 0, pp. 1–12, May 2022.
- [5] Y. Furukawa, H. Nakamura, H. Eto, and F. Kurokawa, "A novel high resolution DPWM circuit for high frequency digitally controlled dc-dc converter," in *Proc. 2018 IEEE Energy Conversion Congr. Expo. (ECCE)*, 2018, pp. 1396–1400.
- [6] X. Zhang, P.-H. Chen, Y. Okuma, K. Ishida, Y. Ryu, K. Watanabe, T. Sakurai, and M. Takamiya, "A 0.6 V input CCM/DCM operating digital buck converter in 40 nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 49, no. 11, pp. 2377–2386, November 2014.
- [7] G. S. R. Srivatsava, P. Singh, S. Gaggar, and S. K. Vishvakarma, "Dynamic power reduction through clock gating technique for low power memory applications," in *Proc. 2015 IEEE Int. Conf. Elect., Comput. Commun. Technol. (ICECCT)*, 2015, pp. 1–6.