# A 2.5-GHz 2×VDD 16-nm FinFET Digital Output Buffer with Slew Rate and Duty Cycle Self-Adjustment

Tzung-Je Lee Department of Electrical Engineering National Sun Yat-Sen University Kaohsiung, Taiwan 80424 tjlee@ee.nsysu.edu.tw Wen-Jian Su Department of Electrical Engineering National Sun Yat-Sen University Kaohsiung, Taiwan 80424 leo6934070@vlsi.ee.nsysu.edu.tw

Chua-Chin Wang<sup>2</sup> Department of Electrical Engineering National Sun Yat-Sen University Kaohsiung, Taiwan 80424 ccwang@ee.nsysu.edu.tw Lean Karlo S. Tolentino<sup>1</sup> Department of Electrical Engineering National Sun Yat-Sen University Kaohsiung, Taiwan 80424 leankarlo.tolentino@g-mail.nsysu.edu.tw

Abstract—This paper presents a 2×VDD output buffer that is insensitive to process, voltage, and temperature (PVT) variations and self-adjusts its slew rate and duty cycle. It complies with the slew rate, system voltage, and duty cycle requirements for DDR4 SDRAMs. Low-Vth transistors that are always turned on are selected as drivers in the output stage to prevent output current fluctuations and increase the driving current. The gates of these transistors are stabilized by both driving currents and a capacitor that rejects interference from noise coupled from GND. The output buffer is realized using the TSMC 16-nm FinFET CMOS process. The core area is 0.1412×0.0794 mm<sup>2</sup>. At 2.5 GHz, it maintains worst-case slew rates of 6.4 and 8.7 V/ns and a duty cycle of 48.3% to 49.2% at a maximum load capacitance of 30 pF. In both normal voltage mode (VDD) and high voltage mode (VDDIO), the slew rate improvement ($\Delta$SR improvement) after driving current auto-tuning is at least approximately 20%.

Index Terms—DDR4, duty cycle, FinFET, output buffer, slew rate self-adjustment.

## I. INTRODUCTION

The speed of transmission interface circuits is steadily increasing, and the requirements on output signal quality are rising accordingly. For example, the I/O interfaces of SDRAMs follow the DDR4 (Double Data Rate 4) Data Buffer Definition (DDR4DB02) issued by the JEDEC Solid State Technology Association [1]. The Double Data Rate 4 Synchronous Dynamic Random-Access Memory (DDR4 SDRAM) requires a slew rate (SR) within the range of 4 to 9 V/ns, an external voltage (VDDIO) of 1.2 V, a silicon pad I/O capacitive load of 0.7 to 1.1 pF, and a duty cycle of $50\pm 2\%$ [2].

Slew rate is an important factor affecting output signal quality in any interface. An excessive slew rate in large-sized output buffers causes serious simultaneous switching noise (SSN), also known as L×di/dt noise [3], [4]. As shown in Fig. 1, when a current (i) flows through a power line, the line's parasitic inductance (L) and parasitic resistance (R) produce voltage drops of L×di/dt and $i \times R$, respectively. Conversely, a very low slew rate results in insufficient timing margin and possible timing errors. In addition, the slew rate is very sensitive to process, voltage, and temperature variations [5]. Therefore, to keep the slew rate of the input/output buffer of the transmission interface within a certain range under different environments and loads, PVT drift compensation must be considered [6]. Besides PVT drift, leakage current should not be ignored, especially in 16-nm and more advanced processes, because it may cause additional power consumption and reduce the slew rate [7]. Moreover, the size of the circuit load also affects the slew rate. If the variations of process, voltage, temperature, leakage current, and load can be accurately detected and the corresponding compensation is applied, the slew rate can be greatly improved.
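To give a feel for the magnitudes involved, the two supply-line terms described above can be evaluated numerically. The following Python sketch uses hypothetical values (1 nH parasitic inductance, 0.1 Ω parasitic resistance, a 20 mA current step, and 100 ps versus 200 ps edges); these are illustrative assumptions and are not taken from this design. The function name `supply_line_drops` is likewise introduced only for this example.

```python
def supply_line_drops(l_par, r_par, i_peak, t_edge):
    """Return (L*di/dt, i*R) in volts for a linear current ramp.

    Assumes the current rises linearly from 0 to i_peak within t_edge,
    so di/dt is approximated by i_peak / t_edge.
    """
    di_dt = i_peak / t_edge
    return l_par * di_dt, i_peak * r_par


# Hypothetical values, chosen only for illustration.
L_PAR = 1e-9     # parasitic inductance in henries
R_PAR = 0.1      # parasitic resistance in ohms
I_PEAK = 20e-3   # switched current in amperes

for t_edge in (100e-12, 200e-12):
    inductive, resistive = supply_line_drops(L_PAR, R_PAR, I_PEAK, t_edge)
    print(f"edge {t_edge * 1e12:.0f} ps: "
          f"L*di/dt = {inductive:.3f} V, i*R = {resistive:.4f} V")
```

Under these assumptions, doubling the edge time halves the inductive term while the resistive term stays the same, which illustrates why the slew rate has to be bounded from above as well as from below.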

Prior works were developed to keep the slew rate of I/O buffers within a suitable range [4], [5], [9]–[11]. One of them, the output buffer in [5], operates at a higher data rate of 2.5 GHz. However, like the rest of the prior works, it cannot meet the DDR4 requirements for slew rate, system voltage, and duty cycle. Therefore, this paper proposes a digital output buffer in FinFET CMOS that complies with the DDR4 requirements. Additional blocks, namely the Feedback Detector and the Duty Cycle Corrector, which detect and correct the voltage slew rate and duty cycle, respectively, based on the required specifications, were added to the proposed design.

<sup>1</sup>L.K.S. Tolentino is also with the Department of Electronics Engineering, Technological University of the Philippines, Manila 1000, Philippines.

<sup>2</sup>Prof. C.-C. Wang is the contact author. He is also with the Institute of Undersea Technology, National Sun Yat-Sen University, Kaohsiung 80424, Taiwan.

<sup>\*</sup>EDA tool support for this study was provided by the Taiwan Semiconductor Research Institute (TSRI). This project was supported by the Ministry of Science and Technology (MOST), Taiwan, under grant MOST 110-2218-E-110-008-.



Fig. 1. Simultaneous switching noise.

## II. SYSTEM ARCHITECTURE

Fig. 2 shows the system block diagram of the output buffer with self-adjustment of slew rate and duty cycle that meets DDR4 specifications. It consists of an Output stage, Floating N-well circuit, Pre-driver, VDDIO Detector, Voltage Level Converter (VLC), Non-overlap circuit, Duty Cycle Corrector (DCC), PVT Detector, Digital Logic Circuit, and Feedback Detector. The Floating N-well circuit avoids the leakage current path caused by mode switching at $1.5 \times VDD$ [6]. The Feedback Detector circuit detects the voltage slew rate of the present output signal. If it is too high or too low, the corresponding compensation transistor is turned off or on, respectively, so that the output stays within the DDR4 specification. The other blocks are discussed in the following subsections.



Fig. 2. Proposed multi-voltage output buffer with automatic adjustment of slew rate and duty cycle.
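The Feedback Detector behavior described above can be pictured as a simple bounded up/down adjustment. The sketch below is a behavioral illustration only, not the circuit's actual logic: it assumes that the decision thresholds coincide with the 4 to 9 V/ns DDR4 window quoted in Section I and that one compensation transistor is switched per step, neither of which is stated in the text. The name `feedback_step` is hypothetical.

```python
SR_MIN, SR_MAX = 4.0, 9.0   # V/ns; assumed thresholds (DDR4 window from Sec. I)
N_COMP = 4                  # compensation transistors per side (MP1b..MP1e)


def feedback_step(enabled, slew_rate):
    """Return the new number of enabled compensation transistors.

    One transistor is switched off when the slew rate is too high and one is
    switched on when it is too low; otherwise the setting is kept.
    """
    if slew_rate > SR_MAX:
        return max(enabled - 1, 0)
    if slew_rate < SR_MIN:
        return min(enabled + 1, N_COMP)
    return enabled


print(feedback_step(2, 9.5))   # slew rate too high: one fewer transistor
print(feedback_step(2, 3.2))   # slew rate too low: one more transistor
print(feedback_step(2, 6.4))   # within the window: unchanged
```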

#### A. Output Stage

In the Output stage shown in Fig. 2, MP1a, MP1b, MP1c, MP1d, and MP1e are connected in parallel and stacked on top of MP2 to avoid over-voltage of the transistors (at $1.5 \times VDD$). When a corner is detected by the PVT Detector, the corresponding compensation transistor is turned on. MP1b, MP1c, MP1d, and MP1e are low-Vth compensation transistors that need to be always switched on. Keeping them switched on prevents the worst output current fluctuations, so that the driving current and the rising edge's slew rate increase. Meanwhile, MN1b, MN1c, MN1d, and MN1e are also compensation transistors, but they sink driving current, thereby increasing the slew rate of the corresponding edge.

#### B. VDDIO Detector

The VDDIO Detector of Fig. 2 is shown in detail in Fig. 3 [7]. It provides an appropriate bias voltage VD to protect the transistors from overvoltage (at $1.5 \times VDD$), since VDDIO can be set to either of the two voltage modes. In addition to the Output stage, VD must also be provided to the Voltage Level Converter, Duty Cycle Corrector, Digital Logic Circuit, and other circuits that need to raise the voltage level. For this reason, an additional capacitor C<sub>VD</sub>, shown in Fig. 2, is added to stabilize the driving voltage at the gate of MP2 and to reject noise coupled from the VDDIO Detector.



Fig. 3. VDDIO detector circuit diagram [7].

#### C. Voltage Level Converter

The Voltage Level Converter is needed to prevent overvoltage in the Output stage, since VDDIO may be $1 \times VDD$ or $1.5 \times VDD$ [6]. Depending on the value of VD, it determines whether the voltage level needs to be elevated.

#### D. Non-overlap Circuit

To prevent over-voltage and short-circuit current during the switching transitions of the power transistors, two non-overlapping signals must be generated. The Non-overlap circuit prevents the Output stage transistors MP1a, MP1b, MP1c, MP1d, MP1e and MN1a, MN1b, MN1c, MN1d, MN1e from being turned on at the same time during transitions. The VDDIO Non-overlap circuit is composed of transistors instead of traditional logic gates, which can increase the data rate [5]. Since VDDIO equals VDD or $1.5 \times$ VDD, two sets of non-overlap circuits of the same size are required. The highest and lowest potentials of DataP2 and DataN2 are the same as the voltage levels of DataP1 and DataN1 in Fig. 2, respectively.
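The idea of non-overlapping drive signals can be illustrated with a small discrete-time model. This is a behavioral sketch under stated assumptions, not the transistor-level circuit of this work: the input is a sampled bit stream, and a pull-up or pull-down enable is asserted only after the input has held its value for more than `dead` samples. The function name `non_overlap` and the `dead` parameter are hypothetical.

```python
def non_overlap(data, dead=1):
    """Derive pull-up and pull-down enables with a dead time at each transition.

    data: iterable of 0/1 samples of the input signal.
    dead: number of samples after a transition during which both enables are off.
    """
    pull_up, pull_down = [], []
    run, prev = 0, None
    for bit in data:
        # Count how long the input has held its present value.
        run = run + 1 if bit == prev else 1
        prev = bit
        settled = run > dead
        pull_up.append(bool(bit) and settled)
        pull_down.append((not bit) and settled)
    return pull_up, pull_down


up, down = non_overlap([0, 0, 0, 1, 1, 1, 0, 0], dead=1)
print(up)
print(down)
```

In this model both enables are low for one sample around every input transition, so the pull-up and pull-down paths never conduct at the same time.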

#### E. PVT Detector

The PVT Detector detects the corner scenario and sends the data to the Digital Logic Circuit for encoding, which selects the number of compensation transistors to be turned on according to the manufacturing process, voltage, and temperature. Since process drift has five corners (TT, FF, SS, FS, and SF), the PVT detectors are divided into two types: a P-type PVT detector and an N-type PVT detector (Fig. 4) [1]. Both are composed of a set of low-skew inverters with a MOS capacitor ($C_P$), and two sets of voltage comparators, inverters, and flip-flops. $C_P$ is charged by a low-skew inverter at a rate that corresponds to a particular corner.



Fig. 4. N-type PVT detector.
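One way to picture how a charging rate becomes a 2-bit corner code is the following behavioral sketch. It assumes an ideal constant-current ramp on the capacitor and a thermometer-style readout from two comparators sampled at a fixed time. The reference voltages, currents, capacitance, and sampling time are hypothetical values chosen only for illustration, and the actual encoding used by the detector in Fig. 4 may differ.

```python
def pvt_code(charge_current, cap, t_sample, v_ref_low, v_ref_high):
    """Two-comparator readout of a capacitor ramp, returned as (bit2, bit1).

    Assumes V = I * t / C, i.e. an ideal constant-current charge of the capacitor.
    """
    v_cap = charge_current * t_sample / cap
    return int(v_cap > v_ref_high), int(v_cap > v_ref_low)


# Hypothetical operating points: a slow, a typical, and a fast corner.
CAP = 100e-15        # 100 fF
T_SAMPLE = 1e-9      # 1 ns
V_LOW, V_HIGH = 0.3, 0.5

for label, current in (("slow", 20e-6), ("typical", 40e-6), ("fast", 60e-6)):
    print(label, pvt_code(current, CAP, T_SAMPLE, V_LOW, V_HIGH))
```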

#### F. Duty Cycle Corrector

Fig. 5 shows the Duty Cycle Corrector [8]. It detects the duty cycle of the input signal and corrects the duty cycle error caused by the Non-overlap circuit by providing the corresponding control voltage V12 and increasing the voltage V11 through the inverting transistors MP35 and MN35.



Fig. 5. Duty cycle corrector connected to VDDIO [8].

#### G. Digital Logic Control

Fig. 6 shows the Digital Logic Control circuit. As shown in Fig. 2, it encodes the PVT Detector outputs, Pcode[2:1] and Ncode[2:1], and the Feedback Detector outputs, Ctr1 and Ctr2, and then outputs the control signals Vgp[5:1] and Vgn[5:1]. It selects the compensation transistors (MP1b, MP1c, MP1d, MP1e, MN1b, MN1c, MN1d, MN1e) to be turned on and decides, through the Reset signal, whether the circuit will be re-compensated.

#### H. Pre-driver

Since MP2, MP1a, MP1b, MP1c, MP1d, MP1e, MN2, MN1a, MN1b, MN1c, MN1d, and MN1e are very large, a Pre-driver, mainly composed of a chain of tapered inverters, is added to provide their driving current. The aspect ratio of each inverter stage is about 3 times that of the previous stage. There are six inverters in total, so that enough current is provided to the Output stage.
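The tapering described above can be made concrete with a few lines of Python. The sketch assumes an exact taper factor of 3 and a first stage normalized to 1, whereas the text only says "about 3 times". The function name `tapered_sizes` is introduced for this example.

```python
def tapered_sizes(stages=6, taper=3.0, first=1.0):
    """Relative aspect ratio of each inverter in a geometrically tapered chain."""
    return [first * taper ** k for k in range(stages)]


sizes = tapered_sizes()
print(sizes)                  # relative size of stages 1..6
print(sizes[-1] / sizes[0])   # overall ratio between last and first stage
```

With six stages and a factor of 3, the last inverter is 3<sup>5</sup> = 243 times the size of the first, which shows how a small input stage can end up driving the very large output transistors.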



Fig. 6. Digital Logic Control circuit diagram.

## III. ALL-PVT-CORNER SIMULATION

The digital output buffer was implemented using the TSMC 16-nm FinFET CMOS process. Fig. 7 shows the layout view of the whole chip. The core area is $0.1412 \times 0.0794 \text{ mm}^2$. The chip area including the pads is $0.60736 \times 0.60576 \text{ mm}^2$.



Fig. 7. Layout of the proposed output buffer.

This design has two transmission voltages, namely VDDIO = 1.2 V and 0.8 V. The output stage load ($C_L$) consists of a silicon pad equivalent capacitance of 1 pF and a probe equivalent capacitance of 30 pF. Using the HSPICE simulation tool at a data rate of 2.5 GHz, Fig. 8(a) and Fig. 8(b) show the all-PVT-corner simulation output before and after corner compensation when VDDIO is 0.8 V, while Fig. 9(a) and Fig. 9(b) show the all-PVT-corner simulation output before and after corner compensation when VDDIO is 1.2 V. The $\Delta$SR Improvement can be calculated using Eqn. (1). The $\Delta$SR Improvement for VDDIO of 0.8 V and 1.2 V is 17% (rising)/26% (falling) and 16% (rising)/29% (falling), respectively. Table I summarizes the comparison between the proposed output buffer and several previous works. It can be seen in Table I that this work has the largest FOM among all output buffers with multiple voltage modes.

TABLE I

PERFORMANCE COMPARISON OF THE PROPOSED WORK WITH THE PREVIOUS WORKS

|                                    | This work                   | ISNE [5]                    | TCAS-II [9]                   | ICICDT [10]                 | ESSCIRC [11]             | TCAS-I [4]                  |
|------------------------------------|-----------------------------|-----------------------------|-------------------------------|-----------------------------|--------------------------|-----------------------------|
| Year                               | 2021                        | 2019                        | 2017                          | 2016                        | 2013                     | 2013                        |
| Process (nm)                       | 16                          | 16                          | 40                            | 28                          | 28                       | 90                          |
| Result                             | Simulation                  | Simulation                  | Measurement                   | Simulation                  | Measurement              | Simulation                  |
| VDD (V)                            | 0.8                         | 0.8                         | 0.9                           | 1.05                        | 1.8                      | 1.2                         |
| VDDIO (V)                          | 1.2/0.8                     | 1.6/0.8                     | 1.8/0.9                       | 1.8/1.05                    | 3.3/1.8                  | 2.5                         |
| Max. Data Rate (GHz)               | 2.5                         | 2.5                         | 0.5                           | 0.8                         | 0.2                      | 0.125                       |
| $\Delta$SR (V/ns)                  | 8.7/6.4<br>(@$C_L$ = 30 pF) | 18/19.1<br>(@$C_L$ = 20 pF) | N.A./1.54<br>(@$C_L$ = 20 pF) | 3.9/4.9<br>(@$C_L$ = 20 pF) | N.A.<br>(@$C_L$ = 10 pF) | 2.2/3.4<br>(@$C_L$ = 15 pF) |
| $\Delta$SR Improvement (%)         | 29/26                       | 23.3/15.8                   | 11/67                         | N.A.                        | N.A.                     | 40.5                        |
| Duty Cycle (%)                     | 49.2/48.3                   | N.A.                        | N.A.                          | N.A.                        | N.A.                     | N.A.                        |
| Dynamic Power (mW)                 | 153<br>(@2500 MHz)          | 28<br>(@500 MHz)            | 27<br>(@500 MHz)              | N.A.                        | 0.09<br>(@200 MHz)       | N.A.                        |
| Figure of Merit (FOM)<sup>1</sup>  | 1200                        | 800                         | 400                           | 448                         | 56                       | 168.75                      |

<sup>1</sup>FOM = Process (nm) × Max. Data Rate (GHz) × $C_L$ (pF)

Fig. 8. Output waveform at all corners for a 0.8-V VDDIO: (a) before corner compensation (worst SR(rise) = 5.26 V/ns, worst SR(fall) = 4.3 V/ns); (b) after corner compensation (worst SR(rise) = 6.4 V/ns, worst SR(fall) = 5.8 V/ns).

Fig. 9. Output waveform at all corners for a 1.2-V VDDIO: (a) before corner compensation (worst SR(rise) = 7.6 V/ns, worst SR(fall) = 5.18 V/ns); (b) after corner compensation (worst SR(rise) = 8.7 V/ns, worst SR(fall) = 7.35 V/ns).
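The FOM row of Table I follows directly from the formula in the table footnote (process × maximum data rate × $C_L$). As a quick cross-check, the short Python sketch below recomputes it from the process, data rate, and load values listed in the table. The dictionary layout and the name `fom` are choices made for this example.

```python
# (process in nm, max. data rate in GHz, load C_L in pF), copied from Table I
TABLE_I = {
    "This work":    (16, 2.5, 30),
    "ISNE [5]":     (16, 2.5, 20),
    "TCAS-II [9]":  (40, 0.5, 20),
    "ICICDT [10]":  (28, 0.8, 20),
    "ESSCIRC [11]": (28, 0.2, 10),
    "TCAS-I [4]":   (90, 0.125, 15),
}


def fom(process_nm, rate_ghz, load_pf):
    """Figure of merit as defined in the Table I footnote."""
    return process_nm * rate_ghz * load_pf


for name, row in TABLE_I.items():
    print(f"{name:13s} FOM = {fom(*row):.2f}")

best = max(TABLE_I, key=lambda name: fom(*TABLE_I[name]))
print("largest FOM:", best)
```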

$$\Delta SR\ \text{Improvement}\,(\%) = \frac{\Delta SR_{\text{Before}} - \Delta SR_{\text{After}}}{\Delta SR_{\text{Before}}} \tag{1}$$

where $\Delta SR_{\text{Before}}$ and $\Delta SR_{\text{After}}$ are the differences between the slew rates at worst-case conditions before and after compensation, respectively.
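Eqn. (1), expressed as a percentage, amounts to a one-line function. The sketch below evaluates it for a hypothetical pair of slew rate spreads (2.0 V/ns before and 1.5 V/ns after compensation). These two numbers are illustrative assumptions, since the individual $\Delta SR$ values are not listed in the text.

```python
def delta_sr_improvement(delta_sr_before, delta_sr_after):
    """Relative reduction of the slew rate spread, in percent, as in Eqn. (1)."""
    return 100.0 * (delta_sr_before - delta_sr_after) / delta_sr_before


# Hypothetical spreads in V/ns, used only to show the arithmetic.
print(delta_sr_improvement(2.0, 1.5))
```

A positive result means that the spread between worst-case slew rates has narrowed after compensation, and a value of zero means no change.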

## IV. CONCLUSION

In this paper, a digital output buffer that complies with the DDR4 requirements is proposed. Its minimum and maximum worst-case slew rates are 6.4 and 8.7 V/ns at a data rate of 2.5 GHz, which are within the 4 to 9 V/ns slew rate range required by the DDR4 standard. Lastly, a duty cycle corrector is integrated in the output buffer, which keeps the duty cycle at 48.3% to 49.2%.
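The compliance statement above can be checked mechanically against the limits quoted in Section I (4 to 9 V/ns and $50\pm 2\%$). The following Python sketch does so for the worst-case slew rates after compensation given in the captions of Fig. 8 and Fig. 9 and for the reported duty cycles. The helper `within` and the variable names are introduced for this example only.

```python
SR_RANGE_V_PER_NS = (4.0, 9.0)      # DDR4 slew rate window quoted in Sec. I
DUTY_RANGE_PERCENT = (48.0, 52.0)   # 50 +/- 2 %


def within(value, bounds):
    """True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high


# Worst-case slew rates after compensation (V/ns), from Fig. 8(b) and Fig. 9(b).
worst_case_sr = {
    "0.8 V rise": 6.4, "0.8 V fall": 5.8,
    "1.2 V rise": 8.7, "1.2 V fall": 7.35,
}
duty_cycles = [48.3, 49.2]          # percent, as reported

sr_ok = all(within(sr, SR_RANGE_V_PER_NS) for sr in worst_case_sr.values())
duty_ok = all(within(d, DUTY_RANGE_PERCENT) for d in duty_cycles)
print("slew rate within range:", sr_ok)
print("duty cycle within range:", duty_ok)
```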

## REFERENCES

- [1] T.-Y. Tsai, Y.-L. Teng and C.-C. Wang, "A nano-scale 2×VDD I/O buffer with encoded PV compensation technique," in *Proc. 2016 IEEE International Symposium on Circuits and Systems (ISCAS)*, pp. 598-601, May 2016.
- [2] JEDEC, DDR4 Data Buffer Definition (DDR4DB02), JEDEC Solid State Technology Association, Arlington, VA, USA, August 2019. Accessed on: May 15, 2021. [Online]. Available: https://www.jedec.org/standardsdocuments/docs/jesd82-32
- [3] J.-G. Yook, V. Chandramouli, L.P.B. Katehi, K.A. Sakallah, T.R. Arabi, and T.A. Schreyer, "Computation of switching noise in printed circuit boards," *IEEE Transactions on Components, Packaging, and Manufacturing Technology: Part A*, vol. 20, no. 1, pp. 64-75, Mar. 1997.
- [4] M.-D. Ker and P.-Y. Chiu, "Design of 2×VDD-tolerant I/O buffer with PVT compensation realized by only 1×VDD thin-oxide devices," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 10, pp. 2549-2560, Oct. 2013.
- [5] C.-C. Wang and S.-W. Lu, "2.5 GHz data rate 2 × VDD digital output buffer design realized by 16-nm FinFET CMOS," in *Proc. 2019 8th International Symposium on Next Generation Electronics (ISNE)*, pp. 1-3, Oct. 2019.
- [6] T.-J. Lee, S.-W. Huang, and C.-C. Wang, "A slew rate enhanced 2×VDD I/O buffer with precharge timing technique," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 11, pp. 2707-2711, Nov. 2020.
- [7] T.-J. Lee, K.-W. Ruan, and C.-C. Wang, "32% slew rate and 27% data rate improved 2×VDD output buffer using PVTL compensation," in *Proc. 2014 IEEE International Conference on IC Design & Technology*, pp. 1-4, May 2014.
- [8] A. De Marcellis, M. Faccio, and E. Palange, "A 0.35μm CMOS 200kHz–2GHz fully-analogue closed-loop circuit for continuous-time clock duty-cycle correction in integrated digital systems," in *Proc. 2018 IEEE International Symposium on Circuits and Systems (ISCAS)*, pp. 1-5, May 2018.
- [9] C.-C. Wang, Z.-Y. Hou and K.-W. Ruan, "2×VDD 40-nm CMOS output buffer with slew rate self-adjustment using leakage compensation," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 64, no. 7, pp. 812-816, Jul. 2017.
- [10] T.-Y. Tsai, Y.-Y. Chou and C.-C. Wang, "A method of leakage reduction and slew-rate adjustment in 2×VDD output buffer for 28 nm CMOS technology and above," in *Proc. 2016 International Conference on IC Design and Technology (ICICDT)*, pp. 1-4, Jun. 2016.
- [11] V. Kumar and M. Rizvi, "Power sequence free 400Mbps 90µW 6000µm<sup>2</sup> 1.8V-3.3V stress tolerant I/O buffer in 28nm CMOS," in *Proc. 2013 Proceedings of the ESSCIRC (ESSCIRC)*, pp. 37-40, Jul. 2013.