

# A Lock Detector Loop for Low-power PLL-Based Clock and Data Recovery Circuits

Chua-Chin Wang<sup>1</sup> · Zong-You Hou<sup>1</sup> · Chih-Lin Chen<sup>1</sup> · Doron Shmilovitz<sup>2</sup>

Received: 7 April 2016 / Revised: 26 July 2017 / Accepted: 27 July 2017 / Published online: 5 August 2017 © Springer Science+Business Media, LLC 2017

**Abstract** This work presents a phase-locked loop (PLL)-based clock and data recovery (CDR) circuit with a lock detector loop that reduces the ripple on the control voltage of the voltage-controlled oscillator (VCO). A tunable charge pump adjusts its charge current according to the state of the lock detector loop, which is determined using seven clocks with equal phase difference. An experimental prototype is implemented in a typical 0.18-$\mu$m CMOS process to verify the performance. The measurement results show that the lock detector loop reduces the voltage amplitude of Vctrl, the control voltage of the VCO, by 75%, from 1 V to 250 mV.

Keywords CDR · Lock detector loop · PLL · Ripple reduction · Low power

### 1 Introduction

Chua-Chin Wang ccwang@ee.nsysu.edu.tw

Zong-You Hou blazesky@vlsi.ee.nsysu.edu.tw

Chih-Lin Chen clchen@vlsi.ee.nsysu.edu.tw

Doron Shmilovitz shmilo@eng.tau.ac.il

<sup>1</sup> Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan

<sup>2</sup> Department of Electrical Engineering, Tel-Aviv University, 69978 Tel-Aviv, Israel

For communication systems without a master clock source, e.g., FlexRay, controller area network (CAN), and local interconnect network (LIN), devices in these systems usually have no synchronous, standard clock to follow when transmitting and receiving data [9]. For example, according to the FlexRay specification [7], FlexRay systems need a high-frequency PLL circuit (8 times the data rate) to carry out over-sampling and data recovery. In the next-generation FlexRay specification [13], the data rate might be raised drastically to serve more audio/video equipment in a vehicle, e.g., mobile TV receivers, GPS, video players, and video games. For instance, if FlexRay systems operate at a high data rate (e.g., 100 Mbps), an 800-MHz PLL is required for oversampling. A reliable 800-MHz PLL is, however, not easy to design and integrate in a system-on-chip (SOC), which is a problem for in-vehicle networks. Besides, transceivers are the interfaces between any two devices over the in-car network, while the CDR is the critical circuit that synchronizes and samples the digital signals transmitted between these devices.

A CDR circuit is mainly used to generate a clock, synchronize received data, and reduce jitter. In a receiver design, a reference-less CDR circuit needs a PLL to synchronize the local clock with the received data so that the CDR operates at the appropriate frequency. If the incoming data are corrupted by noise, the CDR in the receiver should reject the noise to reduce jitter. In prior reports, CDR designs usually traded off settling time against clock jitter for different applications. To shorten settling time and reduce jitter, a lock detector was reported to adjust the loop bandwidth in a receiver system [5,8,11,12,14,15]. In [12], the PLL utilized a digital frequency difference detector (DFDD) [8] to adjust the resistors of the LPF, thereby adjusting the system bandwidth and shortening the settling time. In prior CDR designs [14,15], a lock detector was proposed to adjust the system bandwidth for fast locking. Another kind of CDR design, shown in [5,11], used a lock detector to detect the clock transition with respect to a reference clock. If the clock transition occurs before the reference clock, a counter counts up. Otherwise, the counter counts down. The counter selects whether the CDR circuit operates in a frequency detecting loop or a phase detecting loop.

A 100-Mbps CDR circuit with a lock detector loop that adaptively monitors the lock status of a PLL is proposed in this study. Figure 1 shows the proposed CDR circuit, including a PLL-based CDR circuit and a lock detector loop. The proposed CDR circuit is able to recover data bits at a 100-Mbps data rate. The lock detector loop detects whether the CDR circuit is locked in order to reduce the clock jitter. In Sect. 2, the lock detector loop is disclosed, and its functionality and operation states are illustrated. In Sect. 3, the architecture of the PLL-based CDR with the lock detector loop is introduced. The MATLAB simulation results of the CDR circuit and lock detector loop, as well as measurement results on silicon, are given in Sect. 4. A brief conclusion is given in Sect. 5, and future work is discussed in Sect. 6. Notably, this paper is expanded from the conference paper [4], which provides only simulation results to prove the function of the proposed CDR. In this article, measurement results have been added, which further prove the functional correctness and performance of the proposed CDR. Besides, the descriptions of the lock detector and lock detector loop have been rewritten more clearly.

### 2 Lock Detector Loop Theory

When a CDR circuit is locked, the phase difference between the data and the main clock should remain constant. Assume that there are two more reference clocks, clock\_lead and clock\_lag, which lead and lag the main clock in phase, respectively. The positive and negative edges of the data should then fall between clock\_lead and clock\_lag, which can be used to distinguish whether the CDR circuit is locked. However, this condition can also occur momentarily when the CDR circuit is not locked, causing a wrong detection. To avoid such a wrong detection, the data edge must be detected between clock\_lead and clock\_lag for a pre-defined time. Therefore, an 8-bit counter is used to count this duration, which indicates how long the CDR circuit has been locked. Notably, the required clock\_lead and clock\_lag signals must be generated by a reliable clock source, e.g., a VCO. The details of this VCO are given in Sect. 3.

#### 2.1 Lock Detector Loop

In the proposed design, Fout[3] is set to be the central (main) clock of VCO to be synchronized with the data. Fout[0], Fout[1], and Fout[2] are considered as clock\_lead, while Fout[4], Fout[5], and Fout[6] are deemed as clock\_lag, as shown in Fig. 2.

These signals will be generated by the VCO in the PLL. Lock detector loop in Fig. 1 is composed of a lock detector, a MUX, and an 8-bit counter, as shown in Fig. 3.

The lock detector compares the phase difference between the data and Fout[3] to deliver a counter\_up signal to the counter, since Fout[3] is the central clock of the seven clock signals, Fout[0]–Fout[6]. The 8-bit counter is composed of a few D-flip-flops (DFFs) and logic circuits, and it counts the duration required for the CDR circuit to lock. If counter\_up is pulled low, the 8-bit counter is reset, indicating that the CDR circuit is not locked. By contrast, when the 8-bit counter counts up to 256, counter\_up is pulled high to notify that the CDR circuit is locked. Referring to Fig. 3, Lock[0] and Lock[1] are the selection signals of the MUX, which select the clocks fed into the lock detector as listed in Table 1.

Fig. 1 The proposed CDR circuit with a lock detector loop

Fig. 2 The clocks with equal phase shift

Fig. 3 Block diagram of lock detector loop

Table 1 The functions of MUX

| Lock[0] | Lock[1] | 01      | 02      |
|---------|---------|---------|---------|
| 0       | 0       | Fout[0] | Fout[6] |
| 1       | 0       | Fout[1] | Fout[5] |
| 1       | 1       | Fout[2] | Fout[4] |

Notably, Fout[0], Fout[1], and Fout[2] have leading phases with respect to the central Fout[3], while Fout[4], Fout[5], and Fout[6] have lagging phases. Because jitter might cause a wrong detection, this work employs a greedy search method that gradually narrows the phase difference range to detect whether the CDR circuit is locked. The details of this search method are as follows.

- In the beginning, Lock[0:1] is equal to '00' such that the MUX feeds Fout[0] and Fout[6] into the lock detector, where Fout[0] and Fout[6] provide the maximum phase difference with respect to Fout[3]. If the lock detector is in lock, the 8-bit counter keeps counting up. As soon as the counter hits 256, Lock[0] is pulled high to indicate that the CDR circuit is locked.
- Secondly, Lock[0:1] is set to '10.' The MUX feeds Fout[1] and Fout[5] into the lock detector, which gives the second widest range of phase difference. If the counter counts up to 256, Lock[1] is pulled high to show that the CDR circuit is locked.
- Finally, Lock[0:1] is equal to '11.' The MUX feeds Fout[2] and Fout[4] into the lock detector, which gives the narrowest range of phase difference. When the counter counts up to 256, Lock[2] is pulled high to ensure that the CDR circuit is locked.



Fig. 4 Lock detector schematic





Lock[0:2] is used to adjust the charge pump in the CDR circuit. The current in the charge pump is therefore reduced step by step, so that the clock jitter is reduced accordingly.
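The three-step search described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions, not the authors' implementation: the function name `run_lock_search`, the representation of each data edge as a phase offset from Fout[3] in degrees, and the window half-widths of 67.5°, 45°, and 22.5° (derived from the 22.5° clock spacing given in Sect. 3.4) are all hypothetical. What happens to the Lock bits if lock is lost later is not described in the text and is not modeled.

```python
# Behavioral sketch of the greedy lock search (illustrative names, not the paper's RTL).
PHASE_STEP_DEG = 22.5   # spacing between adjacent Fout[k] clocks (Sect. 3.4)
COUNT_TARGET = 256      # count the text says the counter must reach

# Assumed window half-widths around Fout[3] for each search stage:
# stage 0 uses Fout[0]/Fout[6], stage 1 uses Fout[1]/Fout[5], stage 2 uses Fout[2]/Fout[4].
WINDOWS_DEG = [3 * PHASE_STEP_DEG, 2 * PHASE_STEP_DEG, 1 * PHASE_STEP_DEG]


def run_lock_search(edge_offsets_deg):
    """Return [Lock[0], Lock[1], Lock[2]] after processing a sequence of data-edge offsets."""
    lock = [0, 0, 0]
    stage = 0
    count = 0
    for offset in edge_offsets_deg:
        if stage == len(WINDOWS_DEG):
            break  # all three Lock bits are already set
        if abs(offset) < WINDOWS_DEG[stage]:
            count += 1   # edge fell between the selected lead and lag clocks
        else:
            count = 0    # a miss resets the counter
        if count == COUNT_TARGET:
            lock[stage] = 1  # this stage is declared locked
            stage += 1       # the MUX moves to the next, narrower clock pair
            count = 0
    return lock


# Edges that stay within 10 degrees pass all three stages.
print(run_lock_search([10.0] * 768))
# Edges at 30 degrees pass the two wider windows but not the narrowest one.
print(run_lock_search([30.0] * 1000))
```

The model makes the role of the counter visible: a single out-of-window edge restarts the count, so a stage is only passed after 256 consecutive in-window edges.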

#### 2.2 Lock Detector

Figure 4 shows the lock detector, which uses three DFFs and an XNOR gate. The lock detector detects whether the positive or negative edges of the incoming data are synchronized with the main clock.

Referring to Fig. 4, DFF1 is used to detect whether the data transition leads clock\_lag. DFF2 is in charge of detecting whether the data transition lags clock\_lead. DFF3 depends on the results of DFF1 and DFF2 to detect whether the data transition occurs between clock\_lag and clock\_lead. There are three different scenarios for checking whether the CDR circuit is locked, as shown in Fig. 5.

- Late state: When data transition leads clock\_lag and clock\_lead, sample\_A and sample\_B are pulled high. Therefore, sample\_C and counter\_up are pulled low.
- Early state: When data transition lags clock\_lag and clock\_lead, sample\_A and sample\_B are pulled low. Therefore, sample\_C and counter\_up are pulled low.
- Lock state: When data transition leads clock\_lag and lags clock\_lead, sample\_A is pulled high and sample\_B is pulled low. Therefore, sample\_C and counter\_up are pulled high to indicate that it is the "lock" status.

Fig. 6 PLL-based CDR circuit
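The late, early, and lock scenarios of the lock detector listed above can be summarized as a truth table. The sketch below is a behavioral summary only: it follows the signal levels stated in the text rather than the gate-level schematic of Fig. 4, the function name `lock_detector` is hypothetical, and the remaining combination (sample\_A low, sample\_B high) is not described in the text, so treating it as "not locked" is an assumption.

```python
def lock_detector(sample_a: bool, sample_b: bool) -> dict:
    """Map the two sampled values to the state described in the text."""
    if sample_a and sample_b:
        state = "late"       # data transition leads both clock_lag and clock_lead
    elif not sample_a and not sample_b:
        state = "early"      # data transition lags both clock_lag and clock_lead
    elif sample_a and not sample_b:
        state = "lock"       # data transition lies between clock_lead and clock_lag
    else:
        state = "undefined"  # combination not covered by the text (assumed not locked)
    locked = state == "lock"
    # Per the text, sample_C and counter_up are high only in the lock state.
    return {"state": state, "sample_C": locked, "counter_up": locked}


for a in (False, True):
    for b in (False, True):
        print(a, b, lock_detector(a, b))
```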

### 3 PLL-Based Clock and Data Recovery (CDR) Circuit with Lock Detector Loop

The proposed PLL-based CDR circuit includes a phase detector (PD), a frequency detector (FD), two charge pumps for the PD and FD (CP\_PD and CP\_FD, respectively), a second-order LPF composed of R1, C1, and C2, and a VCO, as shown in Fig. 6. The function of each block is described in the following text.

#### 3.1 Phase Detector (PD)

The Alexander binary phase detector in [2] is used as the PD, as shown in Fig. 7. If the data transition occurs after the falling edge of Fout[3], where Fout[3] is the main (central) clock, PD\_up generates logic '1' to increase the clock frequency of the VCO. Otherwise, if the data transition occurs before the falling edge of Fout[0], PD\_up generates logic '0' to decrease the clock frequency of the VCO. Because the PD does not have a linear frequency response, it causes a constant jitter in the CDR system.



Fig. 9 Charge pump for PD (CP\_PD)

#### 3.2 Frequency Detector (FD)

Figure 8 shows the frequency detector in [6], which detects whether the data stream has two rising edges during each half period of Fout[3]\_quarter, where Fout[3]\_quarter is generated by the Divider and has a period 4 times that of the VCO output clock. If the FD detects two rising edges, FD\_up generates logic '1.' In other words, the clock frequency of the VCO is less than 2 times the data rate, so the clock frequency of the VCO must be increased.

#### 3.3 Charge Pump for PD and FD (CP\_PD and CP\_FD)

Figure 9 shows the schematic of the CP\_PD.  $R_{PD}$  is used to generate a bias current,  $I_{RPD}$ . M1~M9 serve as current mirrors to generate 1~8 times  $I_{RPD}$ , where the W/L ratios of M1~M9 are shown in Fig. 9. Notably,  $I_P$  is determined by  $I_{RPD}$ , Lock[0], Lock[1], and Lock[2]. For example, when Lock[0:2] is equal to '010,'  $I_P$  is 3 times  $I_{RPD}$ . The maximum charge current,  $I_P$ , is 8 times  $I_{RPD}$ . If PD\_up is logic '1,'  $I_P$  flows into Vctrl from VDD. If PD\_down is logic '1,'  $I_P$  flows into GND from Vctrl. The CP\_FD is identical to the CP\_PD. When the CDR circuit is locked, Lock[0]~Lock[2] are pulled high to tune the current of CP\_PD for reducing clock jitter. In short, depending on Lock[0:2], the current of CP\_PD is reduced step by step, so that the clock jitter is reduced accordingly.

Fig. 10 The architecture of VCO and the schematic of VCO cells

#### 3.4 Voltage-Controlled Oscillator (VCO)

Figure 10 shows the VCO block diagram, where the schematic of the VCO\_cell is included. A differential 8-stage voltage-controlled ring oscillator is used to generate clocks with equal phase shift. When RESET is activated, the VCO starts oscillating. Vctrl is used to adjust the clock frequency of the VCO. The phase delays of the VCO\_cells accumulate along the ring. In other words, each VCO\_cell generates a 22.5° phase delay, i.e., the phase difference between Fout[0] and Fout[1] is 22.5°, which is 625 ps for a 100-MHz clock. Therefore, the VCO generates a bank of clocks with seven different phases, i.e., Fout[0] to Fout[6].
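The conversion from the 22.5° per-stage phase delay to 625 ps at 100 MHz can be checked with a few lines of arithmetic. The helper name `phase_delay_ps` and the list of offsets relative to Fout[3] are illustrative additions, assuming each tap Fout[k] sits one 22.5° step after Fout[k-1].

```python
def phase_delay_ps(phase_deg: float, clock_hz: float) -> float:
    """Convert a phase shift in degrees to a time delay in picoseconds."""
    period_ps = 1e12 / clock_hz           # one clock period in ps
    return phase_deg / 360.0 * period_ps  # fraction of the period taken by the phase shift


STEP_DEG = 22.5    # per-stage phase delay stated in the text
CLOCK_HZ = 100e6   # 100-MHz clock used in the text's example

print(phase_delay_ps(STEP_DEG, CLOCK_HZ))  # delay between Fout[0] and Fout[1]

# Assumed phase offsets of Fout[0]..Fout[6] relative to the central clock Fout[3]
# (negative values lead, positive values lag).
offsets_deg = [(k - 3) * STEP_DEG for k in range(7)]
print(offsets_deg)
```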

### 4 Measurement and Comparison

For the sake of stability, the CDR system must not oscillate, so its phase margin must be larger than  $0^{\circ}$ . Equation (1) is the transfer function of our CDR shown in Fig. 6, which is simulated using MATLAB to derive all of the system parameters given in Table 2. According to the MATLAB simulation results, the phase margin is 71° and the bandwidth is 13.7 MHz.

$$H(s) = \frac{I_{\text{RPD}} \times \text{VCO gain} \times \frac{1}{s} \times \left(R1 + \frac{1}{s \times C1}\right)}{s \times C2 \times \left(R1 + \frac{1}{s \times C1} + \frac{1}{s \times C2}\right)} \tag{1}$$
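As a rough numerical check of Eq. (1), the sketch below evaluates $H(j\omega)$ with the Table 2 values and searches for the unity-gain crossing. It assumes the VCO gain is entered as 88e6 per volt exactly as tabulated (no $2\pi$ conversion) and that the frequency variable is angular; neither choice is stated in the text, and the function names are illustrative. Under those assumptions the phase margin comes out close to the reported 71°. How the reported 13.7 MHz bandwidth is defined is not specified in the text, so no claim is made about it here.

```python
import cmath
import math

# Parameter values from Table 2 (units handled as described in the lead-in).
I_RPD = 36e-6   # A
KVCO = 88e6     # taken as tabulated, Hz/V, with no 2*pi conversion (assumption)
R1 = 4.5e3      # ohm
C1 = 100e-12    # F
C2 = 3e-12      # F


def loop_gain(w: float) -> complex:
    """Evaluate Eq. (1) at s = j*w."""
    s = 1j * w
    z_rc = R1 + 1 / (s * C1)  # series R1-C1 branch of the loop filter
    return I_RPD * KVCO * (1 / s) * z_rc / (s * C2 * (z_rc + 1 / (s * C2)))


def unity_gain_crossing(lo: float = 1e5, hi: float = 1e9) -> float:
    """Bisect on a log scale for |H| = 1; |H| decreases monotonically over this range."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(loop_gain(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


w_c = unity_gain_crossing()
phase_margin_deg = 180.0 + math.degrees(cmath.phase(loop_gain(w_c)))
print(f"unity-gain crossing: {w_c:.3e} (angular units)")
print(f"phase margin: {phase_margin_deg:.1f} deg")
```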

Birkhäuser

Table 2 The parameters in Eq. (1)

| Parameter        | Value    |
|------------------|----------|
| I<sub>RPD</sub>  | 36 µA    |
| VCO gain         | 88 MHz/V |
| R1               | 4.5 kΩ   |
| C1               | 100 pF   |
| C2               | 3 pF     |

Fig. 11 Die photo of the proposed design



Fig. 12 Measurement results of lock detector loop

The proposed design is then implemented using a typical 0.18-µm CMOS process to verify the performance. Figure 11 shows the die photo of the proposed design. The chip area is  $1.027 \times 1.027 \text{ mm}^2$ , while the core area is  $0.4 \times 0.4 \text{ mm}^2$ . Figure 12 shows the measurement results of the lock detector loop. After Lock[0:2] is activated, the amplitude of Vctrl decreases from 1 V to 250 mV. The jitter is therefore also reduced because of the smaller amplitude of Vctrl. Figure 13 shows the jitter performance of the proposed CDR circuit. The rms jitter value is 47.75 ps.

Table 3 shows the comparison between the proposed design and several prior CDR designs for Mbps data rate. Notably, the proposed design shows the best FOM, where the FOM is calculated using the equation at the bottom of Table 3 and the process is used to normalize the area. Besides, because the jitter decreases as the data rate increases [3, 16], the data rate is used to normalize the jitter in the FOM.

Table 3 Comparison between the proposed design and prior works

|                                                               | Ours                | [6]                 | [16]               | [10]                |
|---------------------------------------------------------------|---------------------|---------------------|--------------------|---------------------|
| Year                                                          | 2015                | 2012                | 2013               | 2014                |
| Process (µm)                                                  | 0.18                | 0.18                | 0.18               | 0.18                |
| Operating speed (Mbps)                                        | 100                 | 5000                | 155.52-3125        | 3200                |
| Clock jitter (rms)                                            | 47.75 ps @ 100 Mbps | 30.4 ps @ 5000 Mbps | 80.4 ps @ 155 Mbps | 95.6 ps @ 3200 Mbps |
| Supply voltage (V)                                            | 1.8                 | 1.8                 | 1.8                | 1.8                 |
| Core area (mm<sup>2</sup>)                                    | 0.16                | 4.37                | 0.88               | 0.26                |
| FOM $\left(\frac{\mu m^2}{ps \times mm^2 \times Mbps}\right)$ | 42.41               | 4.88                | 2.94               | 40.73               |

$$FOM = \frac{\text{Process}^2\ (\mu m^2)}{\text{Jitter (ps)} \times \text{Area (mm}^2) \times \text{Data rate (Mbps)}} \times 10^6$$
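The FOM definition at the bottom of Table 3 can be evaluated directly. The short sketch below applies it to the "Ours" column as a worked example; the function name `fom` and its argument names are illustrative, and the numerator is taken as the square of the process feature size, which is what the µm² unit in the table implies.

```python
def fom(process_um: float, jitter_ps: float, area_mm2: float, rate_mbps: float) -> float:
    """FOM = Process^2 / (Jitter x Area x Data rate) x 1e6, per the Table 3 definition."""
    return process_um ** 2 / (jitter_ps * area_mm2 * rate_mbps) * 1e6


# Values from the "Ours" column of Table 3.
fom_ours = fom(process_um=0.18, jitter_ps=47.75, area_mm2=0.16, rate_mbps=100)
print(round(fom_ours, 2))
```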

### 5 Conclusion

A novel lock detector loop is proposed to reduce the ripple amplitude of Vctrl, which is coupled with the gate drive of the VCO. Depending on the phase difference among the clocks and data, the lock detector loop generates appropriate current control signals, Lock[0:2], to adjust the charge pump currents. The jitter of the CDR circuit, which is proportional to the ripple amplitude of Vctrl, is reduced as well. The measurement results of the lock detector loop in Fig. 12 show that the ripple amplitude of Vctrl is effectively reduced by 75%, from 1 V to 250 mV.

### 6 Future Work

If the 100-Mbps CDR were to be used in FlexRay Tx/Rx for in-car networking [7], it might be implemented using a high-voltage (HV) CMOS process because of the HV requirement. An HV buffer would need to be added at the input of the CDR to protect the MOS transistors from over-voltage problems. On the other hand, the PLL can be replaced by the finite impulse response (FIR) PLL in [1] and [17] to enhance robustness. Referring to [17], the horizon size is taken as 4 to optimally deal with flicker noise and phase noise. It is expected that the jitter of the proposed CDR will be significantly reduced and that impressive reliability will be achieved.

Acknowledgements This investigation was partially supported by National Science Council and Metal Industries Research Development Center (MIRDC) under Grant NSC102-2221-E-110-081-MY3, NSC102-2221-E-110-083-MY3, MOST104-2622-E-006-040-CC2, MOST104-ET-E-110-002-ET, MOST105-2221-E-110-058-, and MOST105-2218-E-110-006-. The authors would like to express their deepest gratitude to Chip Implementation Center of National Applied Research Laboratories, Taiwan, for their thoughtful chip fabrication service and EDA tool support.

### References

1. C.-K. Ahn, P. Shi, M.V. Basin, Deadbeat dissipative FIR filtering. IEEE Trans. Circuits Syst. I Regul. Pap. 63(8), 1210–1221 (2016)
2. J.D.H. Alexander, Clock recovery from random binary signals. Electron. Lett. 11(22), 541–542 (1975)
3. D. Belot, L. Dugoujon, S. Dedieu, A 3.3 V power adaptive 1244/622/155 Mbit/s transceiver for ATM, SONET/SDH. IEEE J. Solid State Circuits 33(6), 1047–1058 (1998)
4. C.-L. Chen, C.-C. Wang, C.-Y. Juan, A fast-locking clock and data recovery circuit with a lock detector loop, in *Proceedings of IEEE International Symposium on Integrated Circuits*, 2011, pp. 332–335
5. F.-T. Chen, J.-M. Wu, An extended phase detector 2.56/3.2Gb/s clock and data recovery design with digitally assisted lock detector, in *Proceedings of IEEE International Symposium on Circuits and Systems*, 2009, pp. 1831–1834
6. D. Dalton, K. Chai, E. Evans, M. Ferriss, D. Hitchcox, P. Murray, S. Selvanayagam, P. Shepherd, L. DeVito, A 12.5-Mb/s to 2.7-Gb/s continuous-rate CDR with automatic frequency acquisition and data-rate readback. IEEE J. Solid State Circuits 40(12), 2713–2725 (2005)
7. FlexRay Communications System - Protocol Specification V3.0.1, http://www.flexray.com, (2010), pp. 78–79
8. I. Hwang, S. Lee, S. Kim, A digitally controlled phase-locked loop with fast locking scheme for clock synthesis application, in *Proceedings of IEEE International Solid-State Circuits Conference Digest Technical Papers*, 2000, pp. 168–169
9. I. Jung, D. Shin, T. Kim, C. Kim, A 140 Mb/s to 1.96 Gb/s referenceless transceiver with 7.2 μs frequency acquisition time. IEEE Trans. Very Large Scale Integr. Syst. 19(7), 1310–1315 (2011)
10. Y.-L. Lee, S.-J. Chang, R.-S. Chu, Y.-C. Chen, An area- and power-efficient half-rate clock and data recovery circuit, in *IEEE International Symposium on Circuits and Systems*, 2014, pp. 2129–2132
11. T.-S. Tan, K.-S. Yeo, C.-C. Boon, M.-A. Do, A dual-loop clock and data recovery circuit with compact quarter-rate CMOS linear phase detector. IEEE Trans. Circuits Syst. I Regul. Pap. 59(6), 1156–1167 (2012)
12. Y. Tang, M. Ismail, S. Bibyk, A new fast-settling gearshift adaptive PLL to extend loop bandwidth enhancement in frequency synthesizers, in *Proceedings of IEEE International Symposium on Circuits and Systems*, vol. 4, 2002, pp. 787–790
13. C.-C. Wang, C.-L. Chen, T.-H. Yeh, Y. Hu, G.-N. Sung, A high speed transceiver front-end design with fault detection for FlexRay-based automotive communication systems, in *Proceedings of IEEE International Symposium on Circuits and Systems*, 2011, pp. 434–437
14. J.-K. Woo, D.-K. Jeong, S. Kim, Fast-locking CDR circuit with autonomously reconfigurable mechanism. Electron. Lett. 43(11), 624–626 (2007)
15. J.-K. Woo, H. Lee, W.-Y. Shin, H. Song, D.-K. Jeong, S. Kim, A fast-locking CDR circuit with an autonomously reconfigurable charge pump and loop filter, in *Proceedings of IEEE Asian Solid-State Circuits Conference*, 2006, pp. 411–414
16. R.-J. Yang, K.-H. Chao, S.-C. Hwu, C.-K. Liang, S.-I. Liu, A 155.52 Mbps-3.125 Gbps continuous-rate clock and data recovery circuit. IEEE J. Solid State Circuits 41(6), 1380–1390 (2006)
17. S.-H. You, J.-M. Pak, C.-K. Ahn, P. Shi, M.-T. Lim, Unbiased finite-memory digital phase-locked loop. IEEE Trans. Circuits Syst. II 63(8), 798–802 (2016)