

Received 30 December 2024, accepted 1 February 2025, date of publication 7 February 2025, date of current version 13 February 2025. Digital Object Identifier 10.1109/ACCESS.2025.3539760

## **RESEARCH ARTICLE**

# A Power-Efficient 0.5668 TOPS/W Digital Logic Accelerator Implemented Using 40-nm CMOS Process for Underwater Object Recognition Usage

CHUA-CHIN WANG<sup>1,2,3</sup>, (Senior Member, IEEE), SHIH-HENG LUO<sup>1</sup>, HSIN-CHE WU<sup>1</sup>, RALPH GERARD B. SANGALANG<sup>1,4,5</sup>, (Senior Member, IEEE), CHEWN-PU JOU<sup>6</sup>, HARRY HSIA<sup>6</sup>, AND LAN-CHOU CHO<sup>6</sup>

<sup>1</sup>Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung 80424, Taiwan

<sup>2</sup>Institute of Undersea Technology, National Sun Yat-sen University, Kaohsiung 80424, Taiwan

<sup>3</sup>Vel Tech, Chennai, Tamil Nadu 600062, India

<sup>4</sup>Department of Electronics Engineering, Batangas State University, The National Engineering University, Batangas 4200, Philippines

<sup>5</sup>Electronic Systems Research Center, Batangas State University, The National Engineering University, Batangas 4200, Philippines

<sup>6</sup>Taiwan Semiconductor Manufacturing Company (TSMC), Hsinchu 30078, Taiwan

Corresponding author: Chua-Chin Wang (ccwang@ee.nsysu.edu.tw)


This work was supported by Taiwan Semiconductor Manufacturing Corporation (TSMC).

**ABSTRACT** Deep neural networks (DNNs) and convolutional neural networks (CNNs) have been widely used in real-time artificial intelligence (AI) applications, particularly image and video recognition, because they have been proven in practice on many occasions. However, most prior AI hardware works suffered from either a high on-silicon area cost or low utilization thereof. This investigation presents a power-efficient, high-performance implementation of a digital logic accelerator (DLA) for real-time underwater object recognition. The proposed DLA features a 2-dimensional processing element (PE) array that increases the processing throughput through enhanced parallelism. The DLA design was realized and fabricated using the TSMC 40-nm CMOS process. In addition to the post-layout simulation results, the on-silicon measurement outcome and the system validation in water are demonstrated to prove the functional correctness and the performance. The area efficiency is 4.562 GOPS/mm<sup>2</sup> and the power efficiency is 0.5668 TOPS/W on silicon, both of which are the best to date.

**INDEX TERMS** Deep neural networks (DNN), digital logic accelerator (DLA), high degree of parallelism, underwater object recognition, low power.

### **I. INTRODUCTION**

Artificial intelligence (AI) has been booming in the past years owing to its capacity to discover unnoticed patterns and links within data sets, which paves the way for data fusion and decision making. On certain occasions it is considered superior to human capabilities, especially in medical diagnosis, anomaly detection, financial fraud detection, real-time sensor data analysis, and predictive maintenance. AI based on neural networks realized in electronics has become a field application that greatly benefits sound processing, video processing, communication systems, pattern analysis, etc. Notably, the core of neural network applications is either a CNN or a DNN, which is inspired by extracting features in a small region of a specific specimen to attain patterns in other regions thereof. Most prior reports focused on developing better algorithms to achieve detection and prediction with high accuracy and fast convergence. The overhead is the high complexity of realizing these algorithms. Thus, the electronic platforms for the developed complicated algorithms are composed of CPUs, GPUs, and possibly FPGAs, which are all power-hungry devices.

The associate editor coordinating the review of this manuscript and approving it for publication was Mario Donato Marino.

As pointed out earlier, one of the biggest challenges in AI-related applications is to equip AI hardware in applications where the power/energy source is limited, such as battery-powered autonomous underwater vehicles (AUVs), as shown in Fig. 1 [1]. It is impossible to supply enough power directly and constantly to such underwater systems to carry out all the required missions. That is, the overall power in these underwater applications is constrained by the battery size of the vehicle systems. Several works focused on the control architectures of an AUV to gain efficient computing capability and thereby also save energy [2], [3], [4]. Wang et al. [5] used artificial lateral line systems (ALSS) to detect moving objects, while Tian et al. [6] focused on energy-efficient underwater acoustic networks. Huang et al. [7], on the other hand, proposed a finite-time distributed control scheme for multiple AUVs. Besides the development of light-weight training algorithms, another approach is to seek more power-efficient hardware solutions to overcome the mentioned power limitation of these battery-driven AI systems.

FIGURE 1. Underwater vehicles developed at NSYSU [1].

Many research teams have already put tremendous effort into meeting the demand for power-efficient AI hardware. For instance, Reck et al. presented an interesting design for chip-integrated optical neural networks [8]. They proposed to use optical devices, which are considered to consume less power, to overcome the power-hungry problems encountered by electronic platforms. However, the overhead caused by the interfacing circuits that carry out optical-electrical data conversion was ignored. Another similar approach was reported, combining the capabilities of electronic and optical neural networks [9], where data transfers between layers and modules can be done at the speed of light using high-speed fiber optic cables. Again, the overhead of signal conversion between the electrical domain and the optical domain in such an approach was not seriously considered or analyzed.

Many researchers still paid attention to existing CMOS-based electronic solutions, such that many AI hardware accelerators have been reported in past years. The systolic architecture has reconfigurable designs for different convolution kernels [10], [11], [12], [13], [14], [15], [16], [17], etc. However, a large area overhead is the price paid for the complexity of the control structure [10]. The systolic data flow is only allowed to run in the PE rows rather than the entire 2D plane to achieve data reuse of filters along with convolution kernel reuse, which results in throughput degradation [11]. Another hardware design is the spatial architecture reported in [12], which is an improved version of [13]. It mainly took advantage of the clustering of sparse CNNs to achieve higher throughput, but it required a larger area. A streaming architecture with filter and input reuse was reported to be energy efficient, but it has low hardware utilization [14]. A potential optical solution to realize DNNs was reported by Sangalang et al. [18]. However, that report was simply a proof-of-concept study and was not realized for underwater applications, either. The only DNN/CNN research report dedicated to underwater object recognition was given in [19], where sophisticated algorithms were introduced in detail. However, the proposed methods were never implemented in hardware to prove their feasibility. More importantly, none of the above works focused on power-limited underwater object recognition applications.

### A. CONTRIBUTIONS OF THIS WORK

Considering the above difficulties in realizing AI applications in underwater edge devices, particularly battery-operated edge devices such as AUVs, a novel DLA featuring a high degree of hardware parallelism is disclosed in this investigation. Besides adopting the light-weight YOLOv3-tiny algorithm, the design ensures that the throughput is not degraded by the limited power and the simplification of the DNN/CNN architectures.

### **II. LOW-POWER AND HIGH-THROUGHPUT DLA DESIGN**

Fig. 2 shows the interface required for the proposed DLA to co-work with auxiliary circuits. Besides the AXI wrapper, the control and data paths of the accelerator, DMA, and Inter-Controller are also defined to drive the DLA. More specifically, the circuitry in the green dashed area serves as the wrapper I/Os, where many channels are needed to carry out the AXI protocol. The blue dashed area consists mainly of controller circuitry in charge of the state transitions of a finite state machine (FSM), which generates all the commands for data flow control, instruction control, and timing control. Last but not least, the yellow area is the core of the DLA, where all the convolution operations, MAC computation and management, and parallelism are realized.

### A. DLA HARDWARE ARCHITECTURE

Referring to Fig. 2 again, the proposed DLA is mainly composed of a DNN Accelerator, an Inter-controller, and an AXI Wrapper Direct Memory Access (DMA). Most importantly, the DNN Accelerator consists of an $8 \times 32$ PE array to increase the degree of parallelism, an intra-controller, SRAMs, reshape modules, and line buffers. The core of a DNN or CNN is the convolution operation, namely MAC operations. To elevate the throughput without serious power overhead, the convolution operation is realized by three parallel computing steps: 1) Input feature map (IM), 2) Input Channel Parallel (ICP), and 3) Output Channels (OCH). These computing steps are shown in Fig. 3, where H is the frame length, W is the frame width, and K is the size of the Filters.

FIGURE 2. Interface of the proposed DLA with auxiliary units.

FIGURE 3. Core of convolution calculations.

Notably, the size of the IM can be derived from Fig. 3 as $(H+K-1) \times (W+K-1)$. The number of parallel IM input channels is $N$. The number of output channels is $M$, since the output feature map (OM) is generated by $M$ $K \times K$ Filters processing the IM. Accordingly, the size of the output feature map is $M \times H \times W$.
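The dimension bookkeeping above can be checked with a short sketch. It assumes a stride of 1 and no padding, which is the case where an $(H+K-1) \times (W+K-1)$ input yields an $H \times W$ output, and it assumes each of the $M$ Filters spans all $N$ input channels. The function names and the sample sizes are illustrative and are not taken from the design.

```python
def feature_map_shapes(h, w, k, n, m):
    """Return (IM, filter, OM) shapes for a stride-1, unpadded convolution.

    h, w: output frame length and width; k: filter size;
    n: input channels; m: output channels (number of K x K Filters).
    """
    im_shape = (n, h + k - 1, w + k - 1)  # input feature map
    filter_shape = (m, n, k, k)           # ASSUMPTION: each filter covers all N channels
    om_shape = (m, h, w)                  # output feature map
    return im_shape, filter_shape, om_shape


def mac_count(h, w, k, n, m):
    """Total multiply-accumulate operations for one layer of this shape."""
    return m * n * h * w * k * k


# Hypothetical layer: 16 x 16 output, 3 x 3 filters, 8 input and 32 output channels.
im, flt, om = feature_map_shapes(h=16, w=16, k=3, n=8, m=32)
print(im, flt, om)                    # (8, 18, 18) (32, 8, 3, 3) (32, 16, 16)
print(mac_count(16, 16, 3, 8, 32))    # 589824
```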

However, the realization of the above convolution is not straightforward at all owing to the limitations imposed by hardware availability. For instance, it is impossible to have a buffer of unlimited size to store the computation outcome of all the IMs and Kernels on the left-hand side of Fig. 3. Thus, we propose to divide the IMs into plural sections, as indicated by the red boxes in Fig. 3, which are then processed, tiled, and padded by the Reshape module described later in the text. $T_m$ stands for the memory size needed in an internal batch computation. Numerous processing elements (PEs) are needed to carry out the computation of all the tiles in one single cycle, as shown by Eqn. (1). If the number of PEs is limited, on the other hand, the computation takes a long time, i.e., many cycles, which decreases the throughput. The number of PEs can be estimated from the following facts.

- 1). $T_m$ is the number of OCH in one cycle
- 2). $T_n$ is the number of ICH in one cycle
- 3). $T_r$ and $T_c$ are the length and width of the output window, respectively

FIGURE 4. Illustrative diagram of ICP.

FIGURE 5. Illustrative diagram of OCP.

Thus, the number of PEs to carry out the batch computation in one cycle is summarized as follows.

$$PE_{all} = T_m \times T_n \times T_r \times T_c \times K^2 \tag{1}$$
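As a rough illustration of Eqn. (1) and of the cycle penalty when fewer PEs are available, the sketch below evaluates the PE demand for one batch and divides it by the number of PEs actually present. The tile parameters are hypothetical; only the $8 \times 32$ array size comes from the text, and the cycle estimate is idealized (full utilization, no memory stalls).

```python
import math


def pe_all(tm, tn, tr, tc, k):
    """Eqn. (1): PEs needed to finish one batch computation in a single cycle."""
    return tm * tn * tr * tc * k * k


def cycles_per_batch(tm, tn, tr, tc, k, available_pes):
    """Idealized cycles for one batch when only `available_pes` MACs run per cycle."""
    return math.ceil(pe_all(tm, tn, tr, tc, k) / available_pes)


# Hypothetical tiling: 32 output channels, 1 input channel, 8 x 1 output window, 3 x 3 kernel.
needed = pe_all(tm=32, tn=1, tr=8, tc=1, k=3)
cycles = cycles_per_batch(32, 1, 8, 1, 3, available_pes=8 * 32)
print(needed, cycles)  # 2304 9
```

With these assumed numbers, the $K^2$ factor is what the 256-PE array serializes over time: nine cycles, one per kernel position.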

A total of three possible parallel hardware arrangements to carry out the above convolution computation are highlighted as follows.

• Input Channel Parallel (ICP): As shown in Fig. 4, the superscript of P stands for the channel index, while the subscript is the coordinate of the selected pixel. For instance, ICP=4 in Fig. 4 indicates that there are 4 channels, $P_{IM}^0$ to $P_{IM}^3$, where each channel contributes one pixel. They are delivered to a 4-channel kernel to carry out MAC operations in parallel, and the outcome is written into the corresponding one-channel one-pixel output feature map. The minimum hardware required to carry out this ICP operation is 4 input/kernel buffers.

• Output Channel Parallel (OCP): Similarly, OCP=4 in Fig. 5 denotes that there are 4 channels, $P_{OM}^0$ to $P_{OM}^3$, needed to generate the output feature map (OM). The minimum hardware required to carry out the MAC operations is 4 output/kernel buffers.

• Window Parallel (WP): The major difference between WP and ICP/OCP is that all the MAC operations are carried out on multiple pixels of one channel, unlike the one pixel of multiple channels in ICP and OCP. WP operations can be divided into two types, namely Output Window Parallel (OWP) and Kernel Window Parallel (KWP), depending on what the data source is. As shown in Fig. 6(a), OWP selects 4 pixels in the IM, $P_{OM(1,1)}^0$ to $P_{OM(1,4)}^3$, to multiply with $P_{OM(1,1)}^0$ of the kernel. The products are then written into 4 pixels of a single channel in the OM. As for the KWP operation shown in Fig. 6(b), 3 pixels in one channel of the IM, $P_{OM(1,1)}^0$ to $P_{OM(1,3)}^3$, are multiplied with the 3 pixels in the kernel, $P_{OM(1,1)}^0$ to $P_{OM(1,3)}^3$. Then, the MAC result is written into one pixel of one channel in the OM. Since all the WP operations are executed in one channel, the hardware demand is significantly reduced at the expense of a slightly longer delay.

FIGURE 6. Illustrative diagram of WP (a) OWP; (b) KWP.
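The three arrangements differ mainly in which operands are shared and whether the products are reduced or kept separate. The following minimal sketch expresses each one as a plain function on lists; the function names and the sample values are illustrative only, and the code models arithmetic, not the buffers or timing of the hardware.

```python
def icp(pixels, weights):
    """Input Channel Parallel: one pixel per input channel, one kernel weight
    per channel, reduced into a single output pixel."""
    return sum(p * w for p, w in zip(pixels, weights))


def ocp(pixel, weights):
    """Output Channel Parallel: one input pixel times one weight per output
    channel, giving one partial sum per output channel."""
    return [pixel * w for w in weights]


def owp(pixels, weight):
    """Output Window Parallel: several pixels of one channel share a single
    kernel weight, giving one partial sum per output pixel."""
    return [p * weight for p in pixels]


def kwp(pixels, weights):
    """Kernel Window Parallel: several pixels of one channel are multiplied by
    the matching kernel pixels and reduced into a single output pixel."""
    return sum(p * w for p, w in zip(pixels, weights))


print(icp([1, 2, 3, 4], [1, 1, 1, 1]))  # 10
print(ocp(2, [1, 2, 3, 4]))             # [2, 4, 6, 8]
print(owp([1, 2, 3, 4], 3))             # [3, 6, 9, 12]
print(kwp([1, 2, 3], [4, 5, 6]))        # 32
```

ICP and KWP perform the same reduction arithmetically; in this sketch they differ only in what the operands index (channels versus positions inside one channel).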

### **B.** CONTROLLER

The DLA controller relies on the Inter-Controller to generate timing signals that drive the DMA and all the other hardware modules. In other words, the controller is the main driver of the DLA: it generates the control signals that determine whether each module is activated, covering the finite-state machine (FSM) state transitions, circuit switching, and power consumption. The Inter-Controller has five states: 1. Idle (S0) state, 2. Load (S1) state, 3. Calculate & Load (S2) state, 4. Calculate (S3) state, and 5. Store (S4) state, as shown in Fig. 7. Each state carries out a different mode by generating the corresponding signals.
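A five-state controller of this kind can be sketched as a small transition table. The state names S0 to S4 follow the text, but the transition order and the event names below are assumptions made purely for illustration; the actual transitions are those of Fig. 7.

```python
from enum import Enum


class State(Enum):
    IDLE = "S0"
    LOAD = "S1"
    CALC_AND_LOAD = "S2"
    CALC = "S3"
    STORE = "S4"


# ASSUMPTION: a simple linear sequence with hypothetical event names.
TRANSITIONS = {
    (State.IDLE, "start"): State.LOAD,
    (State.LOAD, "first_tile_loaded"): State.CALC_AND_LOAD,
    (State.CALC_AND_LOAD, "last_tile_loaded"): State.CALC,
    (State.CALC, "calc_done"): State.STORE,
    (State.STORE, "store_done"): State.IDLE,
}


def step(state, event):
    """Return the next state; remain in the current state on an unknown event."""
    return TRANSITIONS.get((state, event), state)


state = State.IDLE
for event in ["start", "first_tile_loaded", "last_tile_loaded", "calc_done", "store_done"]:
    state = step(state, event)
    print(event, "->", state.value)
```

Overlapping calculation with loading in a dedicated state is what lets the next tile be fetched while the current one is processed.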

### C. PROCESSING ELEMENT (PE)

Most prior processing elements (PEs) were based on a multi-resolution architecture, e.g., [25], which supported different weight resolutions in MAC operations. Though this attained the advantage of diversity, the trade-off is a long delay caused by the cascaded stages. A critical issue, on top of the delay, is that overflow or underflow cannot be detected or fixed if it appears during the MAC operations. Namely, the error will be accumulated and passed downstream if it happens.



FIGURE 7. State transition diagram of the DLA controller.



FIGURE 8. Proposed processing element (PE) with overflow and underflow detector.

To resolve the long delay and the overflow and underflow problems, a new processing element is presented in Fig. 8, where the proposed PE uses only 16-b integer resolution without loss of robustness for underwater object recognition. It is composed of four $8 \times 8$ multipliers, one 16-b logical shifter, two 8-b logical shifters, three 16-b adders, and an underflow/overflow detector. The number of shifting steps is drastically reduced by this architecture, because the element no longer needs to generate trailing bits for the inputs of the PE. Moreover, fewer stages are needed compared to the prior designs, thus reducing the delay. The underflow and overflow detector co-works with the intra-controller and sends notification signals to the quantization module of the DLA. More specifically, overflow is detected when the output result is over 16'h7FFF, and underflow when it is 16'h8000.
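One plausible reading of this component list is the textbook decomposition of a 16-b product into four 8-b partial products, combined with one 16-b shift, two 8-b shifts, and three additions. The sketch below follows that reading; it is an assumption, not the circuit of Fig. 8. It also assumes that the detector saturates the result and that any value below the signed 16-b minimum counts as underflow, neither of which is stated in the text.

```python
INT16_MAX = 0x7FFF    # 16'h7FFF
INT16_MIN = -0x8000   # 16'h8000 interpreted as a signed value


def mul16_from_8x8(a, b):
    """Multiply two signed 16-b integers using four 8 x 8 partial products."""
    a_hi, a_lo = a >> 8, a & 0xFF   # signed high byte, unsigned low byte
    b_hi, b_lo = b >> 8, b & 0xFF
    hh = (a_hi * b_hi) << 16        # the 16-b shift
    hl = (a_hi * b_lo) << 8         # first 8-b shift
    lh = (a_lo * b_hi) << 8         # second 8-b shift
    ll = a_lo * b_lo
    return hh + hl + lh + ll        # three additions


def detect_and_saturate(acc):
    """Flag results outside the signed 16-b range and clamp them (assumed policy)."""
    if acc > INT16_MAX:
        return INT16_MAX, "overflow"
    if acc < INT16_MIN:
        return INT16_MIN, "underflow"
    return acc, "ok"


print(mul16_from_8x8(300, -7))        # -2100
print(detect_and_saturate(40000))     # (32767, 'overflow')
print(detect_and_saturate(-40000))    # (-32768, 'underflow')
```

Flagging the event at the PE keeps an out-of-range value from silently propagating into later accumulations, which is the failure mode described for the prior designs.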

### D. PE ARRAY

Fig. 9 shows the PE array architecture composed of 8 rows of output window parallel (OWP) and 32 columns of output channel parallel (OCP). The outputs are generated and passed to the left, and a set of kernel values is shared by the PEs in the same column. To reduce the area and power consumption of the array, the excitation functions, quantization, and batch normalization of the pixels are performed outside the array prior to the output SRAM.

FIGURE 9. PE array architecture.

TABLE 1. Comparison of bit length vs. hardware/timing cost.

| Input × Kernel bit length            | 2×2   | 4×4   | 8×8  | 16×16 |
|--------------------------------------|-------|-------|------|-------|
| #PE                                  | 64    | 16    | 4    | 1     |
| MUL area (16×16, $\mu$m<sup>2</sup>) | 42340 | 10479 | 6268 | 3615  |
| Delay (ms)                           | 10    | 7.08  | 7.05 | 5.7   |

One important design consideration is how to determine the trade-off between data bit length, accuracy, delay, and possible timing violation. The bottom line is that when the bit length is made large to attain high accuracy, the penalty is the increase of sign-extended bits and the timing violation caused by the generation of massive clock tree routes. Table 1 summarizes the multiplier (MUL) designs for various selections of bit lengths of input vectors and kernels. The $4 \times 4$ design with 16 PEs is the better option when considering the chip area, complexity, and throughput at the same time.
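The #PE row of Table 1 appears consistent with covering one 16×16 product by $(16/n)^2$ units of size $n \times n$. This reading is an inference from the table values rather than a statement in the text, and the helper name is hypothetical.

```python
def sub_multipliers(operand_bits, unit_bits):
    """Number of unit_bits x unit_bits multipliers covering one
    operand_bits x operand_bits product."""
    per_operand = operand_bits // unit_bits
    return per_operand * per_operand


counts = {n: sub_multipliers(16, n) for n in (2, 4, 8, 16)}
print(counts)  # {2: 64, 4: 16, 8: 4, 16: 1}
```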

### E. MEMORY ARRANGEMENT FOR INPUT, KERNEL, AND OUTPUT

The input and kernel SRAMs of the proposed DLA were each realized with 8 banks of 2-port 8-b 1R1W SRAMs. Double buffers are employed at their outputs to reduce the access time and increase the slew rate. The double buffer design has another advantage: the resource can be allocated to other operations when it is not in use. For instance, if only half of the bank is used, the other half is allocated to load the required data. By contrast, the output SRAM of the DLA uses 4 banks of 128-b dual-port SRAMs (2R, 2W, or 1R1W). The SRAMs for Input, Kernel, and Output are shown in Fig. 10(a), (b), and (c), respectively.
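The double-buffer idea, where one half is read while the other is loaded, can be sketched as a small ping-pong structure. The class and method names are hypothetical, and the sketch models only the role swap, not the banked SRAM timing or port arrangement.

```python
class DoubleBuffer:
    """Two halves: one is read by the compute side while the other is loaded."""

    def __init__(self):
        self._halves = [[], []]
        self._read_idx = 0  # half currently visible to the compute side

    def load(self, data):
        """Fill the half that is not being read."""
        self._halves[1 - self._read_idx] = list(data)

    def read(self):
        """Return the contents of the half currently being read."""
        return self._halves[self._read_idx]

    def swap(self):
        """Exchange the roles of the two halves once loading has finished."""
        self._read_idx = 1 - self._read_idx


buf = DoubleBuffer()
buf.load([1, 2, 3])   # fill the idle half
buf.swap()            # it becomes the read half
buf.load([4, 5, 6])   # load the next data while [1, 2, 3] is being consumed
print(buf.read())     # [1, 2, 3]
buf.swap()
print(buf.read())     # [4, 5, 6]
```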

### F. RESHAPE AND LINE BUFFER MODULES

It is impossible to have unlimited hardware resources, particularly on-chip memory buffers, to accommodate all the pictures or video frames at the same time for any processing. Thus, tile-based computation, as shown in Fig. 11, is needed, where the entire picture frame is divided into plural tiles. As the example in Fig. 11 highlights, a $128 \times 128$ frame is divided into sixteen $16 \times 16$ tiles after stride-2 operations. Notably, four $32 \times 32$ tiles will be re-assembled by a Reshape module, which will be addressed later in the text.

FIGURE 10. SRAM structures of (a) Input, (b) Kernel, and (c) Output.

FIGURE 11. Examples of padding given different stride steps and tiles.

FIGURE 12. Operation flow of re-organization of tile and pad.

FIGURE 13. Die photo and layout of the DLA.

FIGURE 15. Post-layout simulations of the DLA (clock = 150 MHz, worst corner).

A Reshape module is used for the design to support tile-based calculations as explained earlier. The Reshape module re-organizes the tile-based data to support burst transmission such that the transmission of multi-tile data from one module to another is feasible, as shown in Fig. 12. Notably, a padding step is needed that re-organizes the data into 1D or 2D representation prior to the transmission.
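Tiling and the subsequent re-organization into a 1D stream can be illustrated with a minimal sketch. It assumes a square frame whose side is a multiple of the tile size, uses a 64 × 64 frame (the size a 128 × 128 frame would have after one stride-2 step, which is an interpretation of the example), and omits padding. The function names are hypothetical.

```python
def split_into_tiles(frame, tile):
    """Split a square 2D frame (list of rows) into tile x tile blocks, row-major."""
    size = len(frame)
    tiles = []
    for r0 in range(0, size, tile):
        for c0 in range(0, size, tile):
            tiles.append([row[c0:c0 + tile] for row in frame[r0:r0 + tile]])
    return tiles


def flatten_tile(tile_rows):
    """Serialize one tile into a 1D list, e.g. as the payload of a burst transfer."""
    return [value for row in tile_rows for value in row]


# Pixel value encodes its position so the tiling is easy to inspect.
frame = [[r * 64 + c for c in range(64)] for r in range(64)]
tiles = split_into_tiles(frame, 16)
print(len(tiles))                     # 16
print(flatten_tile(tiles[0])[:3])     # [0, 1, 2]
print(len(flatten_tile(tiles[0])))    # 256
```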

Lastly, the line buffer is composed of 16-b registers with 128-b SRAM (1R1W), which is used to push 8 groups of 16-b data during the convolution operation. The line buffers are also used to hold the weights and data values, which are repeatedly used during convolution operations.

# III. IMPLEMENTATION, CHIP MEASUREMENT, AND EXPERIMENT IN WATER

The proposed digital logic accelerator is implemented using TSMC 40-nm CMOS technology (TSMC 45 nm CMOS LOGIC General Purpose Superb (40G) ELK Cu1P10M 0.9/2.5V). Fig. 13 shows the die photo and the layout of the DLA chip, where the on-silicon size is  $3460 \times 3460 \ \mu m^2$  including 320 I/O and power pads. Notably, two scan chains and BIST (using March algorithm) are included to enhance the DLA's reliability and testability, where the test coverage was up to 98.35%.

### A. DESIGN VERIFICATION BY PRE-LAYOUT VS. POST-LAYOUT SIMULATIONS

Functional verification is required before any physical implementation. The signals and functions of the proposed DLA are summarized as follows.

[Figure: simulation waveforms of the DLA with a clock period of T = 6.67 ns, showing clk, rst, func[3:0], Axi_s_start, Irq, In_buf_en, s_addr_inst[31:0], and further AXI signals across the initial, load, and load & cal phases.]
                 | s_addr_inst[31:0]         | 'h xooooox     | (01000000                              |                      |           | S.    | _addr_inst[31:0]       | 'h 01000000 | 01000000         |                   |               |
| s_inst_ye_(13:0)         h xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                 | Axi_s_error[31:0]         | 'h xoooooox    | 00000000                               |                      |           | A     | xi_s_error[31:0]       | 'h 00000000 | 00000000         |                   |               |
| s. Inst. ye. (131:0)         h second         isoscol         s. inst. ye. (131:0)         h second         second         second         s. inst. ye. (131:0)         h second         second <t< td=""><td>s_inst_lye_H[31:0]</td><td>'h xxxxxxxxxxx</td><td>() 11842008</td><td></td><td></td><td>5_</td><td>inst_lye_H[31:0]</td><td>'h 11842008</td><td>11842008</td><td></td><td></td></t<>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | s_inst_lye_H[31:0]        | 'h xxxxxxxxxxx | () 11842008                            |                      |           | 5_    | inst_lye_H[31:0]       | 'h 11842008 | 11842008         |                   |               |
| s_inst.lye_M[31:0]         h soccost         (2025ease         s_inst.lye_M[31:0]         h soccost           Ait_s_inst_valid         h soccost         (2005ease         is_inst.lye_M[31:0]         h soccost           Axi_s_inst_valid         h soccost         (2005ease         (2005ease         (2005ease         (2005ease           Axi_s_inst_valid         h soccost         (2005ease         (2005ease         (2005ease         (2005ease           Axi_m_araddr[31:0]         h soccost         (2005cost         (2150cost)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | s_inst_lye_L[31:0]        | 'h 200000000   | 00000000                               |                      |           | S     | inst_lye_L[31:0]       | 'h 00000000 | 00000000         |                   |               |
| s_inst number[31:0]         h scocord         improved         a.i.m.mardet[31:0]         h scocord         improved         improved <t< td=""><td>s_inst_lye_M[31:0]</td><td>'h sococcos</td><td>2032E4A8</td><td></td><td></td><td>5_</td><td>inst_lye_M[31:0]</td><td>'h 2032E4A8</td><td>2032E4A8</td><td></td><td></td></t<>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       | s_inst_lye_M[31:0]        | 'h sococcos    | 2032E4A8                               |                      |           | 5_    | inst_lye_M[31:0]       | 'h 2032E4A8 | 2032E4A8         |                   |               |
| Axi, s_inst, valid       0       Axi, s_inst, valid       0         Axi, s_inst, valid       0       Axi, s_inst, valid       0         Axi, s_inst, valid       0       Axi, s_inst, valid       0         Axi, m_araddr[31:0]       1       0       Axi, s_inst, valid       0         Axi, m_araddr[31:0]       1       0       Axi, m_araddr[31:0]       1       0         Axi, m_araddr[31:0]       1       0       Axi, m_araddr[31:0]       1       0       1         Axi, m_araddr[31:0]       1       0       Axi, m_araddr[31:0]       1       0       1       0       1         Axim_awardd(31:0)       1       0       0       0       0       0       0         Axim_awardd       1       0       0       0       0       0       0       0         Axim_awardd       1       0       0       0       0       0       0       0       0       0         Axim_awardd       1       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0       0 <td>s_inst number[31:0]</td> <td>'h xoooooox</td> <td>0000001F</td> <td></td> <td></td> <td>s_i</td> <td>nst number[31:0]</td> <td>'h 0000001F</td> <td>0000001F</td> <td></td> <td></td>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                 | s_inst number[31:0]       | 'h xoooooox    | 0000001F                               |                      |           | s_i   | nst number[31:0]       | 'h 0000001F | 0000001F         |                   |               |
| Axis_slash(31:0)         h xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                 | Axi_s_inst_valid          | ×              | 1                                      |                      |           | A     | xi_s_inst_valid        | 0           |                  |                   |               |
| Axi, m_araddr[31:0]         h scores         Constraint         Axi, m_araddr[31:0]         h scores         Constraint         Constraint <thconstraint< th="">         Constraint</thconstraint<>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | Axi_s_leaky[31:0]         | 'h xoooooox    | 00000000                               |                      |           | A     | ki_s_leaky[31:0]       | 'h 00000000 | 00000000         |                   |               |
| Axi, m_arten(7:0)       h or       is or <td>Axi_m_araddr[31:0]</td> <td>'h xoooooox</td> <td>01000000</td> <td>01000000</td> <td></td> <td>Ax</td> <td>i_m_araddr[31:0]</td> <td>'h 00000000</td> <td>012ABA00</td> <td>016DF160</td> <td>01000800</td>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | Axi_m_araddr[31:0]        | 'h xoooooox    | 01000000                               | 01000000             |           | Ax    | i_m_araddr[31:0]       | 'h 00000000 | 012ABA00         | 016DF160          | 01000800      |
| Axi_m_arready<br>Axi_m_arready<br>Axi_m_arready         1         Axi_m_arready<br>Axi_m_arready         1           Axim_awaddr[31:0]         1         Axim_awaddr[31:0]         1         0           Axim_awaddr[31:0]         1         0         Axim_awaddr[31:0]         1         0           Axim_awaddr[31:0]         1         0         Axim_awaddr[31:0]         1         0         0           Axim_awaddr[31:0]         1         0         Axim_awaddr[31:0]         1         0         0           Axim_awaddr[31:0]         1         0         Axim_awaddr[31:0]         1         0         0           Axim_awald         0         Axim_awaddr[31:0]         1         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0 <td>Axi_m_arlen[7:0]</td> <td>'h xx</td> <td>OF</td> <td>7F</td> <td></td> <td>A</td> <td>xi_m_arlen[7:0]</td> <td>'h 00</td> <td>11</td> <td>01</td> <td>7F</td>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                 | Axi_m_arlen[7:0]          | 'h xx          | OF                                     | 7F                   |           | A     | xi_m_arlen[7:0]        | 'h 00       | 11               | 01                | 7F            |
| Axim_availd                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | Axi_m_arready             | ×              |                                        |                      |           |       | Axi_m_arready          | 1           |                  |                   |               |
| Axim_awadd(13:0)         h comcool           Axim_meady         a           Axim_mready                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | Axi_m_arvalid             | ×              | h n                                    | 1                    |           |       | Axi_m_arvalid          | 0           |                  |                   | П             |
| Axim_awent/?0]         h or         io         Axim_awent/?0]         h or         Axim_ament/?0]         h or         Axim_awent/?0]         Axim_awent/?0]         h or         Axim_awent/?0]         h or         Axim_awent/?0]         h or         Axim_awent/?0]         h or         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]         Axim_awent/?0]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | Axim_awaddr[31:0]         | 'h sococcox    | 00000000                               |                      |           | Ax    | im_awaddr[31:0]        | 'h 00000000 | 00000000         |                   |               |
| Axim_averady<br>Axim_averady         1           Axim_averady         1           Axim_averady         1           Axim_prespl1:0)         1x ×           0         loading           Axim_mission         Axim_mission           Axim_mrespl1:0)         1x ×           1         1           Axim_mrespl1:0)         1x ×           Axim_mvalid         0           Axim_mvalid         0 </td <td>Axim_awlen[7:0]</td> <td>'h xx</td> <td>00</td> <td></td> <td></td> <td>A</td> <td>xim_awlen[7:0]</td> <td>'h 00</td> <td>00</td> <td></td> <td></td>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | Axim_awlen[7:0]           | 'h xx          | 00                                     |                      |           | A     | xim_awlen[7:0]         | 'h 00       | 00               |                   |               |
| Axim_awvalid         Axim_sevalid         Axim_sevalid<                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | Axim_awready              | ×              | -                                      |                      |           | -     | Axim_awready           | 1           |                  |                   |               |
| Axi_m_bresp[1:0]         h x         0         Axi_m_bresp[1:0]         h x         0         Axi_m_bresp[1:0]         h x         0         Still loading           Axi_m_tratat[127:0]         h x         0         Axi_m_tratat[127:0]         h x         0         Axi_m_tratat[127:0]         h x         0         Axi_m_tratat[127:0]         h x         0         Still loading           Axi_m_ready         0         Axi_m_rready         0         Axi_m_rready         0         Axi_m_rready         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0         0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                 | Axim_awvalid              | ×              | i .                                    |                      |           |       | Axim_awvalid           | 0           |                  |                   |               |
| Axi_m_tratat         Image: Constraint of the second s | Axi_m_bresp[1:0]          | 'h ×           | (0                                     |                      |           | A     | ki_m_bresp[1:0]        | 'h o        | 0                |                   |               |
| Axi, m_rdati(127:0)       h scores       and m_rdati(127:0)       h scores       and m_rdati(127:0)       h scores         Axi, m_rdati       Axi, m_rdati(127:0)       h scores       and m_rdati(127:0)       h scores       and m_rdati(127:0)       h scores         Axi, m_rdati       0       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores         Axi, m_ready       0       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores         Axi, m_ready       0       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores         Axi, m_ready       1       Scores       Axi, m_rdati(127:0)       h scores       Axi, m_rdati(127:0)       h scores         Axi, m_wdati(127:0)       1       Scores       Scores       Axi, m_rdati(127:0)       h scores         Axi, m_wdati(127:0)       1       Scores       Scores       Scores       Scores       Scores         Axi, m_wdati(127:0)       1       Scores       Scores       Scores       Scores       Scores         Axi, m_wdati(127:0)       1       Scores       Scores       Scores       Scores                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                 | Axi_m_bvalid              | ×              |                                        |                      | loading   |       | Axi_m_bvalid           | 0           |                  |                   | Still loading |
| Azi_m_rast         Azi_m_resp(:a)         In x         Image: Constraint of the constrai                   | Axi_m_rdata[127:0]        | 'h sococoot    | 2008 • 11111111                        |                      |           | Ax    | i_m_rdata[127:0]       | 'h x0000000 |                  | 01760             | FC66+         |

FIGURE 16. Post-layout simulations of the DLA (Func[0], [1], [2]).

[Waveform screenshot, two panels. Left: func[3:0] changes from 2 (load & cal) to 3 (cal), annotated "Stop loading", "Keep PE cal", and "no writing". Right: func[3:0] changes from 3 (cal) to 4 (store), annotated "Stop PE cal" and "Start writing".]

FIGURE 17. Post-layout simulations of the DLA (Func[2], [3], [4]).

- Func[0], [1], [2], [3], [4] : different phases of the controller
- PE\_cen : activation signal of PE array
- irq : termination flag of the current computation

Func[0]: idle stage, in which the system is initialized

Func[1]: load stage, in which the data in DRAM are moved to on-chip buffers

Func[2]: load and calculate stage in which certain tiles are delivered to PE array, while other tiles are loaded at the same time

Func[3]: calculate stage, in which no more data need to be fetched from DRAM, so no load takes place and only the calculation is carried out

Func[4]: store stage in which the processed outcome is output
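The phase order listed above can be sketched as a small behavioral model. This is only an illustration in Python, not the authors' RTL: the function `run_controller`, the tile counters, and the assumption that each phase step handles exactly one tile are all hypothetical.

```python
from enum import IntEnum


class Func(IntEnum):
    IDLE = 0       # Func[0]: initialization
    LOAD = 1       # Func[1]: DRAM -> on-chip buffers
    LOAD_CALC = 2  # Func[2]: compute loaded tiles while loading the next ones
    CALC = 3       # Func[3]: compute only, nothing left to load
    STORE = 4      # Func[4]: write the processed outcome out


def run_controller(num_tiles):
    """Walk the phases for one frame split into num_tiles tiles.

    Assumption (not from the paper): every step loads and/or computes
    exactly one tile, and irq is raised once the store phase completes.
    """
    if num_tiles < 1:
        raise ValueError("need at least one tile")
    phase = Func.IDLE
    loaded = 0    # tiles already fetched from DRAM
    computed = 0  # tiles already processed by the PE array
    trace = [phase]
    irq = False
    while True:
        if phase == Func.IDLE:
            phase = Func.LOAD
        elif phase == Func.LOAD:
            loaded += 1  # first tile arrives, nothing to compute yet
            phase = Func.LOAD_CALC if loaded < num_tiles else Func.CALC
        elif phase == Func.LOAD_CALC:
            computed += 1  # PE array works on the previously loaded tile
            loaded += 1    # while the next tile is fetched
            if loaded == num_tiles:
                phase = Func.CALC
        elif phase == Func.CALC:
            computed += 1  # last tile, no further load
            if computed == num_tiles:
                phase = Func.STORE
        else:  # Func.STORE
            irq = True  # termination flag of the current computation
            break
        trace.append(phase)
    return trace, irq
```

With three tiles, the model visits Func[2] twice before dropping to Func[3]; with a single tile, Func[2] is skipped altogether because nothing remains to be loaded while the PE array computes.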

Fig. 14 and Fig. 15 show the pre-layout simulation and the post-layout simulation at the worst corner (namely SS), respectively. The two sets of waveforms were verified to match when the same test bench was used to examine functional correctness.
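A match check of this kind can be sketched as a sample-by-sample comparison of two traces. The Python sketch below is hypothetical: it assumes both simulations have already been exported as lists of per-clock-edge dictionaries, and the function name `first_mismatch` and this data layout are illustrative only, not the tool flow used by the authors.

```python
def first_mismatch(pre_layout, post_layout, signals):
    """Return (sample_index, signal, pre_value, post_value) for the first
    difference between two traces, or None if they agree on every signal.

    Each trace is a list of dicts, one dict per sampled clock edge
    (an assumed export format, not a real simulator output format).
    """
    if len(pre_layout) != len(post_layout):
        raise ValueError("traces must cover the same number of samples")
    for index, (pre, post) in enumerate(zip(pre_layout, post_layout)):
        for name in signals:
            if pre[name] != post[name]:
                return (index, name, pre[name], post[name])
    return None


# Usage with made-up samples: the traces differ in PE_cen at sample 1.
pre = [{"func": 1, "PE_cen": 0}, {"func": 2, "PE_cen": 1}]
post = [{"func": 1, "PE_cen": 0}, {"func": 2, "PE_cen": 0}]
result = first_mismatch(pre, post, ["func", "PE_cen"])
```

Sampling both traces on the same clock edges keeps the comparison functional, so the extra gate and wire delay of the post-layout netlist does not register as a mismatch.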

To further highlight the simulation details, Fig. 16 and Fig. 17 show the magnified post-layout timing waveforms. On the left side of Fig. 16, after Reset is asserted, the DLA system is activated and Axi\_s\_start is triggered to command the DLA to initialize and stay in standby mode. As soon as Func[0] changes to Func[1], the data in DRAM are copied to the on-chip buffer. PE\_cen and wdata are not asserted at this point, since there is neither calculation nor output.



FIGURE 18. DLA chip measurement setup.



FIGURE 19. Input vectors of measurements.

Referring to the right side of Fig. 16, PE\_cen is pulled high when the state changes to Func[2], so that the PE array starts the computation. Notably, since the proposed DLA is designed with dual buffers, the next batch of tiles is read while the PE array computation is in progress.
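The overlap enabled by the dual buffers can be illustrated with a simple ping-pong schedule. The Python sketch below is an assumption-laden illustration: the buffer indices 0 and 1, the function `ping_pong_schedule`, and the one-tile-per-step granularity are hypothetical and are not taken from the chip design.

```python
def ping_pong_schedule(num_tiles):
    """Return one tuple per step:
    (step, load_buffer, load_tile, calc_buffer, calc_tile).

    Tile i is loaded into buffer i % 2 at step i and computed from that
    same buffer at step i + 1, so a load never touches the buffer that
    the PE array is reading. None marks an inactive side.
    """
    schedule = []
    for step in range(num_tiles + 1):
        load_tile = step if step < num_tiles else None
        calc_tile = step - 1 if step >= 1 else None
        load_buf = load_tile % 2 if load_tile is not None else None
        calc_buf = calc_tile % 2 if calc_tile is not None else None
        schedule.append((step, load_buf, load_tile, calc_buf, calc_tile))
    return schedule


schedule = ping_pong_schedule(3)
```

In this model the first step only loads (as in Func[1]), the middle steps load and compute at once (as in Func[2]), and the last step only computes (as in Func[3]).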

The details of Func[2], [3], [4] are given in Fig. 17. The left side of the figure shows the transition from Func[2] to Func[3], where araddr[31:0] becomes 0 and arlen[7:0] (the length of the data transfer) is also 0. At this moment, there are no new data to be fetched, so the only operation is the PE array's computation. Meanwhile, wdata remains unknown.
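As a side note on the length field, the following Python sketch shows how a burst length value translates into bytes, assuming standard AXI4 semantics (the field encodes the number of beats minus one) and a 128-bit data bus. Both assumptions, the helper name `axi_burst_bytes`, and the example value are illustrative; the paper does not state the AXI revision.

```python
def axi_burst_bytes(axlen, data_width_bits=128):
    """Bytes moved by one burst under assumed AXI4 semantics.

    AXI4 encodes the burst length as (beats - 1) in an 8-bit field,
    so axlen = 0 still describes a single beat. The 128-bit default
    width is an assumption, not a figure quoted from the paper.
    """
    if not 0 <= axlen <= 255:
        raise ValueError("axlen must fit in 8 bits")
    if data_width_bits % 8 != 0:
        raise ValueError("data width must be a whole number of bytes")
    beats = axlen + 1
    return beats * (data_width_bits // 8)


example_bytes = axi_burst_bytes(0x11)  # 18 beats of 16 bytes each
idle_value_bytes = axi_burst_bytes(0)  # a single 16-byte beat
```

Under these semantics a zero length alone does not mean that nothing is transferred, so the absence of a read in this phase would be indicated by the address valid flag staying low together with the zeroed address and length fields.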

By contrast, the right side of Fig. 17 shows the transition from Func[3] to Func[4]. Besides stopping the PE array operation by pulling down PE\_cen, awvalid (signal to output channels), awlen (length of output data), wvalid (valid flag of data), awaddr (address to write data), and wdata (data to be output) are all activated. As soon as all data are written into the Output SRAM, irq is asserted to signal that the entire process is done.
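The signal behavior described for the five phases can be summarized as a lookup table and used to check sampled waveform points. The Python sketch below is a hypothetical aid: `reading` and `writing` are summary flags introduced here to stand for activity on the read and write channels, and the table reflects only what the text states about each phase.

```python
# Expected activity per phase, as described in the text.
# "reading" and "writing" are illustrative summary flags, not real ports.
EXPECTED = {
    0: {"pe_cen": 0, "reading": 0, "writing": 0},  # Func[0]: idle
    1: {"pe_cen": 0, "reading": 1, "writing": 0},  # Func[1]: load
    2: {"pe_cen": 1, "reading": 1, "writing": 0},  # Func[2]: load and calculate
    3: {"pe_cen": 1, "reading": 0, "writing": 0},  # Func[3]: calculate
    4: {"pe_cen": 0, "reading": 0, "writing": 1},  # Func[4]: store
}


def check_sample(func, pe_cen, reading, writing):
    """Return the names of the signals that deviate from the expected
    pattern of the given phase (an empty list means the sample is fine)."""
    expected = EXPECTED[func]
    observed = {"pe_cen": pe_cen, "reading": reading, "writing": writing}
    return [name for name in expected if observed[name] != expected[name]]


ok = check_sample(2, pe_cen=1, reading=1, writing=0)
bad = check_sample(4, pe_cen=1, reading=0, writing=1)
```

Here `bad` flags `pe_cen`, since the PE array is expected to be stopped once the store phase begins.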

### **B. CHIP MEASUREMENT**

The performance measurement of the proposed DLA chip was conducted in the SOC Lab. of TSRI in Hsinchu, Taiwan, using an ADVANTEST 93000, as shown in Fig. 18. Fig. 19 is a screenshot of the input vector waveform. Fig. 20 shows the measured waveform that demonstrates the state transition from Func[0] to Func[2], while Fig. 21 shows that for the transition from Func[2] to Func[4]. These two measured figures match the predictions of the post-layout simulations, including the initialization, load, compute, and store operations.

Lastly, Fig. 22 shows the end of the entire processing cycle of one picture frame. Notably, the pulse highlighted on the right side of the figure indicates the end signal. Fig. 23 is the

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|----|-------------|-----------------------------|--------------------------|-------|----------|
| Coules III<br>Fort: DiscRat                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | - ot    | 0  | 0           | Costes III<br>Pert: Dim/Tet |                          | 0     | 0        |
| 1,53019000000<br>Desc(pr) p1 1 free 113                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | - 01    | 0  | 0           | 1.5009965<br>DevGyr (***    |                          | 0     | 0        |
| Time Delta (and<br>000.27 fame(1)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | < pt    | 0  | 0           | Tone Decision<br>1992       |                          | 0     | 1        |
| Carter C<br>Narles C<br>Carter C                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | - ot    | 0  | 1           | C Parker<br>Sect 2          |                          | 1     | 0        |
| Test                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | · 8     |    |             | Test LT.S.                  |                          |       |          |
| (Bat) (Free)<br>(Elise) (Stat) scattarealist                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |         |    |             | (her) (Pri<br>(114) (14     |                          |       | 1        |
| Josef 2 Par 2 Bassian (mill)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | •       |    |             | 2+++(2) 70 +(               |                          |       |          |
| Sornar (V) 3.00 Munitian. Log 30                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |         |    |             | Series III A                |                          |       |          |
| Carser 0<br>Carser 0<br>Nation 0<br>Carsel 10                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |         |    |             | Corner<br>C Parker          |                          |       |          |
| 17/141<br>18/141<br>19/141<br>19/141<br>19/141<br>19/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<b
r>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141<br>10/141 | -       |    |             | Print 1                     |                          |       |          |

FIGURE 20. Measured waveforms for the transition from Func[0] to Func[2].

r>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(sing)<br>(s |         |      |         | Relation (C.D.)                               |       |             |
| Level Delta [3] Certes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 397. 39 |      |         | Level Delta [1]<br>0.00                       | 161   | 164 6       |

FIGURE 21. Measured waveforms for transition Func[2] to Func[4].



**FIGURE 22.** Measured waveform for transition Func[4] to the end of the current processing cycle.



FIGURE 23. Shmoo plot of the proposed DLA.

Shmoo plot to show that when VDD = 0.9 V, the operation clock = 133.33 MHz.

### C. REAL-TIME SYSTEM FUNCTION VERIFICATION

To verify the DLA functionality in real time, we compare two experiments: 1) CPU-based software simulation (using float32 computation, taken as the golden model), and 2) DLA+FPGA hardware testing. An algorithm based on YOLOv3-tiny was implemented using both approaches, which




FIGURE 24. Comparison of CPU-based and DLA-based experiments for underwater object recognition.


FIGURE 25. Recognition outcomes of SW-based FPGA vs. our DLA.

are compared to estimate the absolute error caused by our DLA.

The comparison experiment is shown in Fig. 24; the two approaches give identical recognition results and differ only in delay and frame rate. Fig. 25 shows an apples-to-apples comparison between the software-based solution and our DLA. The absolute error was found to be less than 1.4%, as shown in Fig. 26.
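The error comparison described above can be sketched as follows. This is a minimal illustration only: it assumes the absolute error is taken on per-detection confidence scores in the range [0, 1] and expressed in percentage points, which the text does not state explicitly. The function name and the sample scores are hypothetical and are not measured data.

```python
def absolute_error_percent(golden, dla):
    """Per-detection absolute error, in percentage points, between the
    float32 golden-model scores and the DLA scores (both in [0, 1]).

    Assumption: the error metric is the absolute score difference.
    """
    if len(golden) != len(dla):
        raise ValueError("score lists must have the same length")
    return [abs(g - d) * 100.0 for g, d in zip(golden, dla)]


# Hypothetical confidence scores for the same detections (not measured data).
golden_scores = [0.950, 0.870, 0.660]
dla_scores = [0.945, 0.880, 0.655]

errors = absolute_error_percent(golden_scores, dla_scores)
worst_case = max(errors)  # compare against the 1.4% bound reported in Fig. 26
```

With these illustrative inputs the worst-case error is about 1.0 percentage point, which would fall under the reported bound.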

### D. OPEN WATER TEST DEMONSTRATION

To physically test the proposed DLA in a real underwater environment, we then conducted open water experiments. An AI box containing the proposed DLA, camera, battery-based power supply, FPGA board, and other auxiliary circuits was installed on the front of an AUV, as shown in Fig. 27. Before



**FIGURE 26.** Absolute error of the DLA approach vs. the golden model ( $\leq$  1.4%).




FIGURE 27. AUV equipped with an AI box containing the proposed DLA and auxiliary circuits.

the open water experiments, the AI box had been trained offline to recognize more than 20 different underwater objects, including lion fish, clown fish, shark, tire, glass bottle, turtle, diver, etc. The model reached an mAP of over 90% before the experiment. All the weights and kernels were written into the FPGA so that the edge inference using the proposed DLA can be carried out independently in water. Fig. 28 shows the AUV driven 7 meters below the water surface, about 1 mile from the coastline of Little Ryukyu, an island close to Taiwan.

Besides the mentioned object recognition, the proposed DLA installed in the AI box is used to carry out an object-following mission in water. In our experiment, the diver is selected as the object to be followed by the AUV, since the other objects in the data bank are either uncontrollable, such as sharks and fish, or stationary, such as tires and glass bottles. Figs. 29 and 30 show screenshots of the diver-following mission. In particular, Fig. 30 shows the scenario where the AUV not only follows the diver but also detects

### TABLE 2. Comparison with previous state-of-the-art works.

|                                                | [13]   | [20]   | [21]   | [22]   | [23]   | [24]   | [25]   | [26]          | Ours   |
|------------------------------------------------|--------|--------|--------|--------|--------|--------|--------|---------------|--------|
| Year                                           | 2017   | 2018   | 2020   | 2020   | 2020   | 2021   | 2022   | 2023          | 2024   |
| Publication                                    | JSSC   | TCAS-I | JETCAS | JSSC   | GCCE   | HCS    | TCAS-I | TCAS-I        |        |
| Process (nm)                                   | 65     | 65     | 40     | 65     | 90     | 180    | 180    | Nexys A7-100T (FPGA) | 40     |
| Verification                                   | Meas.  | Simu.  | Simu.  | Meas.  | Simu.  | Simu.  | Meas.  | Meas.         | Meas.  |
| Supply (V)                                     | 1.0    | 1.2    | 0.9    | 0.6    | 1.2    | 1.8    | 1.8    | 3.3           | 0.9    |
| Area (mm<sup>2</sup>)                          | 16     | 10.6   | 200    | 2.56   | 14     | 3      | 53.63  | 50.2k LUT, 240 DSP | 11.97  |
| Max. Freq. (MHz)                               | 200    | 200    | 200    | 0.25   | 100    | 107    | 100    | 100           | 133.33 |
| On-chip buffer (kb)                            | 181.5  | 139.6  | 118    | 16     | 322    | 16     | 150    | 58.1k FFs     | 150    |
| No. of MACs                                    | 168    | 64     | 128    | 512    | 72     | 182    | 256    | 82.53% (utilization) | 256    |
| Activation bit-width                           | 16     | 16     | 16     | 8      | 16     | 32     | 16     | 8             | 16     |
| Kernel bit-width                               | 16     | 16     | 16     | 8      | 16     | 32     | 16     | 8             | 16     |
| Performance (GOPS)                             | 42     | 23.4   | 51.2   | 0.471  | 14.4   | 35.05  | 40.96  | 95.08         | 54.61  |
| Power (mW)                                     | 278    | 93.4   | 153.94 | 0.0106 | 164.7  | 649    | 196.8  | 2203          | 96.35  |
| Area eff. (GOPS/mm <sup>2</sup> )              | 2.625  | 2.21   | 0.26   | 0.184  | 1.029  | 11.684 | 0.7638 | N/A           | 4.562  |
| Power eff. (TOPS/W)                            | 0.1511 | 0.253  | 0.3326 | 44.434 | 0.087  | 0.054  | 0.2081 | 0.04316       | 0.5668 |
| <sup>1</sup> CO<sub>2</sub> equivalent (kg)    | 1.05   | 0.35   | 0.58   | 0.01   | 0.624  | 2.5    | 0.75   | 8.32          | 0.365  |
| <sup>2</sup> FOM1                              | 30.21  | 60.1   | 59.86  | 6.40   | 10.491 | 10.402 | 37.46  | 47.00         | 68.01  |
| <sup>3</sup> FOM2                              | 1.89   | 5.67   | 0.30   | 2.34   | 0.75   | 3.47   | 0.70   | N/A           | 5.68   |

<sup>1</sup>Based on U.S. EPA greenhouse gas equivalency [27]. Computed based on continuous operation for 1 year.

${}^{2}FOM1 = \frac{Frequency(MHz) \times GOPS}{Normalized\ Power(mW)}$

${}^{3}FOM2 = \frac{FOM1}{Area(mm^{2})}$



FIGURE 28. AUV with the AI box was tested in the open sea of Little Ryukyu, an island close to Taiwan.



FIGURE 29. AUV with the AI box executing the diver-following mission.

an unknown large rock in the water and cruises away to avoid a possible collision. These experiments demonstrate that the proposed design is not only measurable on silicon but also usable in field applications.

### E. PERFORMANCE COMPARISON AND ANALYSIS

Table 2 compares the proposed design with recent CNN/DNN hardware accelerators reported in leading journals. Notably, our DLA operates from a 0.9 V supply at



FIGURE 30. AUV with the AI box executing diver following and object avoidance.

133.33 MHz. The on-silicon measurement results of our DLA show a performance of 54.61 GOPS at a power consumption of 96.35 mW. The corresponding power efficiency is 0.5668 TOPS/W and the area efficiency is 4.562 GOPS/mm<sup>2</sup>, both of which are the best to date when normalized to the CMOS technology node and the operating clock frequency. Table 2 also shows that [22] has the highest TOPS/W; this is because its design uses only 8-bit kernels, which significantly decreases the overall area and hence the power of the chip.
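As a quick arithmetic check, the two efficiency figures quoted above follow directly from the measured performance, power, and area listed in Table 2. The short sketch below reproduces them; the function names are illustrative only.

```python
def power_efficiency_tops_per_w(gops, power_mw):
    """Power efficiency in TOPS/W.

    GOPS / mW = (1e9 ops/s) / (1e-3 W) = 1e12 ops/s per W = TOPS/W,
    so no extra scaling factor is needed.
    """
    return gops / power_mw


def area_efficiency_gops_per_mm2(gops, area_mm2):
    """Area efficiency in GOPS/mm^2."""
    return gops / area_mm2


# Measured values of the proposed DLA, taken from Table 2.
GOPS, POWER_MW, AREA_MM2 = 54.61, 96.35, 11.97

tops_per_w = power_efficiency_tops_per_w(GOPS, POWER_MW)
gops_per_mm2 = area_efficiency_gops_per_mm2(GOPS, AREA_MM2)

print(f"{tops_per_w:.4f} TOPS/W, {gops_per_mm2:.3f} GOPS/mm^2")
```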

Besides the comparison of recent DLAs on single dies, we also made a comparison with a recent work implementing YOLOv3-tiny on a Nexys A7-100T FPGA [26]. Not surprisingly, although the FPGA-based solution gives better performance in terms of GOPS, it pays a very high price in

terms of power consumption, such that the overall power efficiency is poor. Two figures of merit (FOMs) are used to compare the different designs: the first combines the frequency, performance, and power, while the second adds the effect of the chip area. In short, the proposed design shows an FOM1 of 68.01 and an FOM2 of 5.68, which are the best among all DLA works in Table 2. It also shows one of the lowest carbon dioxide (CO<sub>2</sub>) equivalent emissions (0.365 kg) when used continuously for an entire year [27]. Lastly, the proposed DLA is the only one specially developed for underwater AI applications.
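The two FOMs and the annual CO<sub>2</sub> figure can be sketched numerically as below. Two points are assumptions rather than statements from the text: "normalized power" is taken here as power divided by the supply voltage, which reproduces the Table 2 entries of the proposed design, and the emission factor is a hypothetical value back-computed from the Table 2 row of the proposed design, not a number quoted from [27]. A year is taken as 8760 hours.

```python
HOURS_PER_YEAR = 8760  # assumption: 365 days of continuous operation


def fom1(freq_mhz, gops, power_mw, supply_v):
    """FOM1 = Frequency (MHz) x GOPS / normalized power (mW).

    Assumption: normalized power = power / supply voltage.
    """
    normalized_power_mw = power_mw / supply_v
    return freq_mhz * gops / normalized_power_mw


def fom2(fom1_value, area_mm2):
    """FOM2 = FOM1 / Area (mm^2)."""
    return fom1_value / area_mm2


def annual_energy_kwh(power_mw):
    """Energy drawn over one year of continuous operation, in kWh."""
    return power_mw * 1e-3 * HOURS_PER_YEAR / 1e3


# Proposed DLA, values from Table 2.
ours_fom1 = fom1(freq_mhz=133.33, gops=54.61, power_mw=96.35, supply_v=0.9)
ours_fom2 = fom2(ours_fom1, area_mm2=11.97)

# Hypothetical emission factor (kg CO2 per kWh), back-computed from Table 2.
ASSUMED_KG_CO2_PER_KWH = 0.4325
ours_energy = annual_energy_kwh(96.35)
ours_co2_kg = ours_energy * ASSUMED_KG_CO2_PER_KWH

print(f"FOM1 = {ours_fom1:.2f}, FOM2 = {ours_fom2:.2f}")
print(f"{ours_energy:.3f} kWh/year, about {ours_co2_kg:.3f} kg CO2")
```

Keeping the supply voltage and emission factor as explicit parameters makes it easy to repeat the calculation under a different normalization or a different grid emission rate.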

### **IV. CONCLUSIONS AND FUTURE WORKS**

A low-power and high-performance DLA using a 40-nm CMOS process is presented in this investigation. A new parallel architecture based on processing elements with underflow and overflow detection is proposed to increase processing speed and reduce computational error. Not only are the normalized area and power efficiencies of our design better than those of prior DLAs, but the FOMs also show that our design is the best so far even when the clock frequency is taken into account.

Future work will improve the processing elements to perform faster computations. An improved version of the machine learning model is also under development to reduce the computational requirements of the DLA and thereby further improve the power efficiency of the design.

### ACKNOWLEDGMENT

This investigation was supported by Taiwan Semiconductor Manufacturing Corporation Limited, Taiwan under TSMC Contract no. 202201-100011.

### REFERENCES

- [1] NSYSU Underwater Mechatronics Lab. Tab: Research Instruments. Accessed: Sep. 19, 2022. [Online]. Available: https://uml.iut.nsysu.edu.tw/research\_Instruments.html
- [2] B. Chen, J. Hu, Y. Zhao, and B. K. Ghosh, "Finite-time observer based tracking control of uncertain heterogeneous underwater vehicles using adaptive sliding mode approach," *Neurocomputing*, vol. 481, pp. 322–332, Apr. 2022.
- [3] R. G. B. Sangalang, D. J. S. Masangcay, C. M. R. Torino, and D. J. C. Gutierrez, "Design of a control architecture for an underwater remotely operated vehicle (ROV) used for search and rescue operations," *Kybernetika*, vol. 58, no. 2, pp. 237–253, Jul. 2022.
- [4] B. Huang, H. Peng, C. Zhang, and C. K. Ahn, "Distributed optimal coordinated control for unmanned surface vehicles with interleaved periodic event-based mechanism," *IEEE Trans. Veh. Technol.*, vol. 73, no. 12, pp. 18073–18086, Dec. 2024.
- [5] Z. Wang, S. Wang, X. Wang, and X. Luo, "Underwater moving object detection using superficial electromagnetic flow velometer array-based artificial lateral line system," *IEEE Sensors J.*, vol. 24, no. 8, pp. 12104–12121, Apr. 2024.
- [6] W. Tian, Y. Zhao, R. Hou, M. Dong, K. Ota, D. Zeng, and J. Zhang, "A centralized control-based clustering scheme for energy efficiency in underwater acoustic sensor networks," *IEEE Trans. Green Commun. Netw.*, vol. 7, no. 2, pp. 668–679, Jun. 2023.
- [7] B. Huang, S. Song, C. Zhu, J. Li, and B. Zhou, "Finite-time distributed formation control for multiple unmanned surface vehicles with input saturation," *Ocean Eng.*, vol. 233, Aug. 2021, Art. no. 109158.
- [8] M. Reck, A. Zeilinger, H. J. Bernstein, and P. Bertani, "Experimental realization of any discrete unitary operator," *Phys. Rev. Lett.*, vol. 73, no. 1, pp. 58–61, Jul. 1994.

- [9] J. Chang, V. Sitzmann, X. Dun, W. Heidrich, and G. Wetzstein, "Hybrid optical-electronic convolutional neural networks with optimized diffractive optics for image classification," *Sci. Rep.*, vol. 8, no. 1, pp. 1–10, Aug. 2018.
- [10] F. Tu, S. Yin, P. Ouyang, S. Tang, L. Liu, and S. Wei, "Deep convolutional neural network architecture with reconfigurable computation patterns," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 25, no. 8, pp. 2220–2233, Aug. 2017.
- [11] Y. Huan, J. Xu, L. Zheng, H. Tenhunen, and Z. Zou, "A 3D tiled low power accelerator for convolutional neural network," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2018, pp. 1–5.
- [12] Y.-H. Chen, T.-J. Yang, J. Emer, and V. Sze, "Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices," *IEEE J. Emerg. Sel. Topics Circuits Syst.*, vol. 9, no. 2, pp. 292–308, Jun. 2019.
- [13] Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze, "Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks," *IEEE J. Solid-State Circuits*, vol. 52, no. 1, pp. 127–138, Jan. 2017.
- [14] L. Du, Y. Du, Y. Li, J. Su, Y.-C. Kuan, C.-C. Liu, and M. F. Chang, "A reconfigurable streaming deep convolutional neural network accelerator for Internet of Things," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 1, pp. 198–208, Jan. 2018.
- [15] J. Lee, C. Kim, S. Kang, D. Shin, S. Kim, and H.-J. Yoo, "UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 218–220.
- [16] J. Lee, C. Kim, S. Kang, D. Shin, S. Kim, and H.-J. Yoo, "UNPU: An energy-efficient deep neural network accelerator with fully variable weight bit precision," *IEEE J. Solid-State Circuits*, vol. 54, no. 1, pp. 173–185, Jan. 2019.
- [17] H. Sharma, J. Park, N. Suda, L. Lai, B. Chau, V. Chandra, H. Esmaeilzadeh, and J. K. Kim, "Bit fusion: Bit-level dynamically composable architecture for accelerating deep neural network," in *Proc. ACM/IEEE 45th Annu. Int. Symp. Comput. Archit. (ISCA)*, Jun. 2018, pp. 764–775.
- [18] R. G. B. Sangalang, S.-H. Luo, H.-C. Wu, B.-Q. He, S.-F. Hsiao, C.-C. Wang, C. Jou, H. Hsia, and D. C.-H. Yu, "A power effective DLA for PBs in opto-electrical neural network architecture," in *Proc. IEEE Asia–Pacific Conf. Circuits Syst. (APCCAS)*, Nov. 2022, pp. 46–49.
- [19] C.-H. Yeh, C.-H. Lin, L.-W. Kang, C.-H. Huang, M.-H. Lin, C.-Y. Chang, and C.-C. Wang, "Lightweight deep neural network for joint learning of underwater object detection and color conversion," *IEEE Trans. Neural Netw. Learn. Syst.*, vol. 33, no. 11, pp. 6129–6143, Nov. 2022.
- [20] J. Jo, S. Kim, and I.-C. Park, "Energy-efficient convolution architecture based on rescheduled dataflow," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 12, pp. 4196–4207, Dec. 2018.
- [21] S.-F. Hsiao, K.-C. Chen, C.-C. Lin, H.-J. Chang, and B.-C. Tsai, "Design of a sparsity-aware reconfigurable deep learning accelerator supporting various types of operations," *IEEE J. Emerg. Sel. Topics Circuits Syst.*, vol. 10, no. 3, pp. 376–387, Sep. 2020.
- [22] J. S. P. Giraldo, S. Lauwereins, K. Badami, and M. Verhelst, "Vocell: A 65-nm speech-triggered wake-up SoC for 10-μW keyword spotting and speaker verification," *IEEE J. Solid-State Circuits*, vol. 55, no. 4, pp. 868–878, Apr. 2020.
- [23] C.-B. Wu, C.-S. Wang, and Y.-K. Hsiao, "Reconfigurable hardware architecture design and implementation for AI deep learning accelerator," in *Proc. IEEE 9th Global Conf. Consum. Electron. (GCCE)*, Oct. 2020, pp. 154–155.
- [24] K.-D. Nguyen, D. T. Kiet, T.-T. Hoang, N. Q. N. Quynh, and C.-K. Pham, "A CORDIC-based trigonometric hardware accelerator with custom instruction in 32-bit RISC-V system-on-chip," in *Proc. IEEE Hot Chips 33 Symp. (HCS)*, Aug. 2021, pp. 1–13.
- [25] C.-C. Wang, R. G. B. Sangalang, C.-P. Kuo, H.-C. Wu, Y. Hsu, S.-F. Hsiao, and C.-H. Yeh, "A 40.96-GOPS 196.8-mW digital logic accelerator used in DNN for underwater object recognition," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 69, no. 12, pp. 4860–4871, Dec. 2022.
- [26] M. Kim, K. Oh, Y. Cho, H. Seo, X. T. Nguyen, and H.-J. Lee, "A low-latency FPGA accelerator for YOLOv3-tiny with flexible layerwise mapping and dataflow," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 71, no. 3, pp. 1158–1171, Mar. 2024.
- [27] United States Environ. Protection Agency. (Mar. 2022). Greenhouse Gas Equivalencies Calculator. [Online]. Available: https://www.epa.gov/energy/



**CHUA-CHIN WANG** (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from Stony Brook University, The State University of New York (SUNY), Stony Brook, NY, USA, in 1992.

Then, he joined the Department of Electrical Engineering, National Sun Yat-sen University (NSYSU), Taiwan. He was elevated to be Distinguished Professor at NSYSU, in 2010. In 2018, he was assigned as the Director General of the Underwater Vehicle Research and Development Center. He is currently the Vice President of the Office of Research and Development, NSYSU. His research interests include AI-related memory and logic circuit design, communication circuit design, and interfacing I/O circuits.

Dr. Wang became an IET Fellow, in 2012. He was nominated as the ASE Chair Professor, in 2013, and elected to be the Dean of the Engineering College, in 2014. He was named as a Distinguished Lecturer of the IEEE Circuits and Systems Society (CASS), from 2019 to 2021. He chaired the IEEE CASS Nanoelectronics and Giga-scale Systems (NG) Technical Committee, from 2008 to 2009. He was an Associate Editor of IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS and IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, from 2010 to 2011. He was the General Chair of the 2015 Symposium on Engineering Medicine and Biology Application (2015 SEMBA), the 2012 IEEE Asia-Pacific Conference on Circuits and Systems (2012 APCCAS), the 2011 IEEE International Conference on IC Design and Technology (2011 ICICDT), and the 2007 VLSI/CAD Symposium; and the General Co-Chair of the 2010 IEEE International Symposium on Next-generation Electronics (2010 ISNE).



**SHIH-HENG LUO** received the B.S. and M.S. degrees in electrical engineering from National Sun Yat-sen University (NSYSU), Kaohsiung, Taiwan, in 2020 and 2024, respectively. He is currently with MediaTek as a Design Engineer. His research interests include temperature detector circuit design, mixed-signal IC design, and high-performance AI accelerator design.



**HSIN-CHE WU** received the B.S. and M.S. degrees in electrical engineering from National Sun Yat-sen University (NSYSU), Kaohsiung, Taiwan, in 2019 and 2022, respectively. He is currently pursuing the Ph.D. degree with Iowa State University, Ames, IA, USA. His recent research interests include negative charge pump circuit design, mixed-signal IC design, and high-performance AI accelerator design.



**RALPH GERARD B. SANGALANG** (Senior Member, IEEE) received the B.S. degree in electronics and communications engineering and the M.S. degree in electronics engineering from Batangas State University, Philippines, the first Ph.D. degree in electrical engineering from National Sun Yat-sen University (NSYSU), Taiwan, in 2023, and the second Ph.D. degree in electronics engineering from Batangas State University, The National Engineering University, Philippines, in 2024. He is currently an Assistant Professor and the Head of the Electronic Systems Research Center and the Program Chair of the Electronics Engineering Graduate programs, where he is in charge of the M.Sc., the M.Eng., and the Ph.D. studies in the field of electronics engineering at Batangas State University, The National Engineering University. His research interests include memory design, AI circuits, digital systems, control systems, computational modeling, fractional circuits, research security, fractional systems, and engineering education. He was awarded the Yeh Kung-Chie Memorial Scholarship Award from NSYSU, in 2023. He was the Program Chair of BS Electronics Engineering, from 2017 to 2021, and the Interim Program Chair of BS Biomedical Engineering. He was also formerly the Student Outcome Committee Chair of the College of Engineering, Architecture and Fine Arts, from 2014 to 2021. He is the Governor of the Institute of Electronics Engineers of the Philippines–Batangas Chapter, in 2025. He has served in different positions in the organization, since 2014. He is also the Vice-President of Technical of the Mechatronics and Robotics Society of the Philippines-Batangas Chapter. He has served as a Reviewer for ISCAS, AICAS, ISCAIE, ISBI, CSSP, IJE, Kybernetika, and IJCDS.



**CHEWN-PU JOU** received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, in 1982 and 1984, respectively, and the Ph.D. degree from The State University of New York, in 1991. From 1991 to 1998, he was the RF Program Manager at ITRI, designing LTCC components and modeling RF CMOS. From 1998 to 2001, he was a Manager at UMC, developing RFCMOS technology. From 2001 to 2006, he was the V.P. of Uwave Technology, developing RFCMOS SoC products. Since 2006, he has been with Taiwan Semiconductor Manufacturing Company (TSMC) building up RF foundry necessities, including reference design flow and RF process development, monitoring, and debug structures. After 2016, he joined the Pioneer Team of TSMC Silicon-Photonics Technology Platform Development. He was a TPC member of ISSCC. He was a recipient of the 1998 Taiwan MOEA Best Project Award.

**HARRY HSIA** was born in Taipei, Taiwan. He received the B.S. degree in electrical engineering from National Taiwan University, Taipei, in 1991, the M.S. degree in electrical engineering from the University of Southern California, Los Angeles, CA, USA, in 1995, and the Ph.D. degree in electrical engineering from the University of Illinois Urbana-Champaign, in 1999. He was with Philips-Lumileds Lighting Company, USA, and Infinera Corporation, Sunnyvale, CA, USA, working on high-brightness light-emitting diodes, and InP photonic integrated circuits, respectively. He joined Taiwan Semiconductor Manufacturing Company, Hsinchu, as the Technical Manager, in May 2005, where he worked on 45-nm-technology front-end process integration with emphasis on silicon germanium and then on 22-nm-technology front-end device development with emphasis on high-k gate dielectrics. He is also leading the silicon photonics device team for pathfinding in system integration with an emphasis on Compact Universal Photonics Engines (COUPE) integration. His research interests include device, process development, and integration in compound semiconductor optoelectronics.



**LAN-CHOU CHO** was born in Taipei, Taiwan, in 1978. He received the B.S., M.S., and Ph.D. degrees in electrical engineering from National Taiwan University, in 2001, 2003, and 2008, respectively. From 2009 to 2014, he was with MediaTek Inc., Hsinchu, Taiwan. Currently, he is employed at Taiwan Semiconductor Manufacturing Company (TSMC), Hsinchu. His research interests include phase-locked loops, high-speed CMOS data-communication circuits, and silicon photonic circuits and systems.