# Wide Lock-in Range CDR with Modified DQFD and Coarse-fine Tuning Technique

Tzung-Je Lee<sup>†</sup>, Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424. Email: tjlee@ee.nsysu.edu.tw

Bo-Hao Liao, Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424. Email: m60815mo@gmail.com

Chua-Chin Wang, Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424. Email: ccwang@ee.nsysu.edu.tw

*Abstract*—This paper presents a CDR circuit with a wide lock-in range and low jitter. The lock-in range is extended by a Frequency Increase/Decrease Control circuit and a Modified DQFD (Digital Quadricorrelator Frequency Detector), which also avoid the state-loss problem that arises when detecting over a wide frequency range. The Coarse-fine Tuning VCO provides two control inputs so that separate loop filters can be used in the two loops, which further reduces noise and jitter. The proposed design is implemented in a typical 40 nm CMOS process. The simulated lock-in range is 1-6.5 GHz and the simulated RMS jitter is 5.79 ps.

Index Terms—CDR, DQFD, lock-in range, jitter, Coarse-fine Tuning

# I. INTRODUCTION

A CDR (Clock and Data Recovery) circuit plays an important role in acquiring the clock from the data stream in a communication system [1]-[3]. To support different data rates, a CDR circuit must have a wide lock-in range. Low jitter is also needed for high-quality communication. Dual-loop control is widely used in CDR circuits because it is simple to implement [1]-[2]. However, the two control signals from the dual loops are merged into a single VCO control signal by sharing the same loop filter. Interference and unwanted glitches are then easily induced on the control signal in the locked state, increasing the jitter and the lock-in time [1]-[2]. The DQFD (Digital Quadricorrelator Frequency Detector) is one solution for dual-loop CDR circuits because it generates no control pulse in the locked state [3]-[6]. However, its operating state can be lost when the frequency difference is large, which limits the lock-in range [7].

To solve these problems, this paper proposes a Frequency Increase/Decrease Control circuit and a Modified Digital Quadricorrelator Frequency Detector to achieve a wide lock-in range. In addition, a Coarse-Fine Tuning VCO is used to avoid interference between the dual loops and to reduce the jitter. The simulated lock-in range is 1-6.5 Gb/s, with P2P jitter and RMS jitter of 17.1 ps and 5.79 ps, respectively.



Fig. 1. Block diagram of the proposed CDR circuit.

# II. WIDE RANGE AND LOW JITTER CDR

Fig. 1 shows the block diagram of the proposed wide lock-in range and low jitter CDR circuit, which includes a Phase Detector (PD) [8], a Modified DQFD (Modified Digital Quadricorrelator Frequency Detector), a Frequency Increase/Decrease Control (Freq. Inc/Dec Control) circuit, a Hysteresis Lock Detector [9], two Charge Pump circuits (CP1 and CP2), two off-chip Loop Filters (LF1 and LF2), a Coarse-Fine Tuning VCO, and a Buffer.

The proposed referenceless CDR circuit consists of a frequency acquisition loop and a phase lock loop. The VCO generates the digital quadrature signals, CLKI and CLKQ, for the dual-loop detection. When CLKI and the input DATA initially have a large frequency difference, the output of the Hysteresis Lock Detector, LOCK, is logic 0, so the frequency acquisition loop is activated and the phase lock loop is disabled. The Frequency Increase/Decrease Control circuit then detects the frequency difference between CLKI and DATA and generates the output signal, UP\_DN, which tells the VCO to change its frequency. In this state, the control signals, FD\_UP and FD\_DN, come directly from the comparison of CLKI and DATA.

When the VCO frequency is close to the frequency of DATA, the LOCK signal becomes logic 1 and the CDR operates with both the Modified DQFD and the PD. In this state, the Modified DQFD and CP2 generate the coarse control signal, Vcoarse, for the VCO, while the PD and CP1 generate the fine control signal, Vfine. The VCO receives the separate coarse and fine tuning control signals to obtain the precise frequency and phase for the in-phase output, CLKI,

<sup>†</sup> Prof. T.-J. Lee is the contact author. (e-mail: tjlee@ee.nsysu.edu.tw)



Fig. 2. Schematic of the Frequency Increase/Decrease control circuit.



Fig. 3. Illustrated waveforms of the Frequency Increase/Decrease Control circuit.

and the quadrature output, CLKQ. Finally, CLKI is coupled to the output Buffer, which increases the driving ability for a large load of 20 pF.

## A. Frequency Increase/Decrease Control Circuit

Fig. 2 shows the schematic of the Frequency Increase/Decrease Control circuit. Note that the signal CLKI/2 is CLKI divided by 2 in frequency. CLKI and DATA are used as the clock sources of the two 5-bit counters, respectively. A frequency difference between CLKI and DATA results in different counter outputs. Thus, the output of Counter1, Q3, and the output of Counter2, A3, can be used to detect which signal is faster. When DATA is faster than CLKI, A3 leads Q3 and UP\_DN equals logic 1, which increases the output frequency of the VCO, as shown in Fig. 3 (a). When CLKI is faster than DATA, Q3 leads A3, so UP\_DN becomes logic 0 and the output frequency of the VCO is reduced, as shown in Fig. 3 (b). To avoid the problem caused by a large initial frequency difference between CLKI and DATA, Q4 is also included in the comparison. DFF2 generates the reset signal, UPDN\_resetb, for the two counters.
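As an illustration of the lead/lag decision described above, the following Python sketch models the two counters behaviorally. It is a simplification under stated assumptions and not the circuit of Fig. 2: both inputs are treated as ideal periodic edge streams that start counting from reset at t = 0, the CLKI/2 divider, the Q4 comparison, and the DFF2 reset path are omitted, and the function name `up_dn_decision` is hypothetical.

```python
def up_dn_decision(f_clk_hz, f_data_hz, bit=3):
    """Behavioral stand-in for the UP_DN decision (assumed model).

    Each counter increments once per edge of its input, so output bit
    `bit` first goes high after 2**bit edges. The counter whose bit
    rises first belongs to the faster input.
    Returns 1 (raise the VCO frequency), 0 (lower it), or None on a tie.
    """
    edges_needed = 1 << bit            # 8 edges before Q3 / A3 goes high
    t_q = edges_needed / f_clk_hz      # time at which Q3 (CLKI counter) rises
    t_a = edges_needed / f_data_hz     # time at which A3 (DATA counter) rises
    if t_a < t_q:
        return 1                       # A3 leads Q3: DATA is faster
    if t_q < t_a:
        return 0                       # Q3 leads A3: CLKI is faster
    return None                        # idealized tie, outside the paper's description


print(up_dn_decision(2e9, 3e9))  # DATA faster than CLKI -> 1
print(up_dn_decision(3e9, 2e9))  # CLKI faster than DATA -> 0
```

Real DATA is a random pattern rather than a periodic signal, so the edge rate seen by Counter2 depends on the transition density; the sketch ignores this and only shows why the earlier-rising counter bit identifies the faster input.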

## B. Modified Digital Quadricorrelator Frequency Detector

Fig. 4 shows the schematic of the Modified DQFD. The bottom of Fig. 4 is a traditional DQFD [6]. The edge of DATA is first detected with the XOR gate and the delay cell. The falling edge of DATA can then be compared with the in-phase clock signal, CLKI, and the quadrature clock signal, CLKQ. The DQFD has four operating states, I, II, III, and IV, which correspond to (CLKI, CLKQ) values of 00, 01, 11, and 10, respectively. When



Fig. 4. Schematic of the Modified DQFD.



Fig. 5. Illustrated waveforms of the Modified DQFD.

the frequency of DATA is higher than the frequency of the VCO, the state changes in the order $I \rightarrow II \rightarrow III \rightarrow IV$, as shown in the left waveforms of Fig. 5. When the frequency of DATA is lower than that of the VCO, the state changes in the order $IV \rightarrow III \rightarrow II \rightarrow I$, as shown in the right waveforms of Fig. 5. Thus, the outputs of the shift register, A, B, C, and D, and their complements can be used to generate the control signals, UP1 and DN1. However, the operating state might be lost when the frequency difference is large. The Assistant Circuit is used to solve this problem. LOCK = 0 indicates a large frequency difference between DATA and CLKI (CLKQ), so the control signal is taken directly from DIFF, which represents the frequency difference between CLKI and DATA. When LOCK becomes logic 1, the frequency difference is small and the state-loss problem does not occur. Thus, UP1 and DN1 can be selected as the output control signals, FD\_UP and FD\_DN, respectively.
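The state rotation and the LOCK-controlled selection can be sketched behaviorally in Python. This is a minimal model under assumptions and not the gate-level logic of Fig. 4: the shift register outputs A-D are replaced by a generic rotation-direction check on successive (CLKI, CLKQ) samples, the polarity assumed for DIFF (1 means "speed up") is not stated in the text, and the names `rotation_votes` and `select_control` are hypothetical.

```python
# States I, II, III, IV as (CLKI, CLKQ) samples taken at DATA edges.
STATE_INDEX = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def rotation_votes(samples):
    """Count forward (UP) and backward (DN) single-step state transitions."""
    up = dn = 0
    for prev, cur in zip(samples, samples[1:]):
        step = (STATE_INDEX[cur] - STATE_INDEX[prev]) % 4
        if step == 1:
            up += 1      # I -> II -> III -> IV: DATA faster than the VCO
        elif step == 3:
            dn += 1      # IV -> III -> II -> I: DATA slower than the VCO
        # step == 0: no change. step == 2: a state was skipped, so the
        # direction is ambiguous. This is the state-loss case.
    return up, dn


def select_control(lock, diff, up1, dn1):
    """Return (FD_UP, FD_DN): DIFF-based when LOCK = 0, DQFD-based when LOCK = 1."""
    if lock:
        return up1, dn1
    return diff, 1 - diff    # assumed polarity: DIFF = 1 requests a higher frequency


forward = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(rotation_votes(forward))            # (3, 0)
print(rotation_votes(forward[::-1]))      # (0, 3)
print(rotation_votes([(0, 0), (1, 1)]))   # (0, 0): skipped state gives no vote
```

The last call shows why a large frequency difference is a problem for the plain DQFD: when a state is skipped, neither direction can be inferred, which is the situation in which the selection falls back to DIFF while LOCK = 0.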



Fig. 6. Schematic of the current-mode master-slave DFF.

Because the operating frequency is higher than several GHz,



Fig. 7. Schematic of the Coarse-Fine Tuning VCO.



Fig. 8. Schematic of the Phase Bias circuit for the proposed VCO.

the current-mode positive-edge-triggered master-slave DFF is used, as shown in Fig. 6. It includes the current-mode master and slave latches and the bias circuit. The current-mode master latch includes the differential pair, MN1-MN2, and the cross-coupled pair, MN3-MN4. When CLK is logic 0, the bias current, I<sub>SS</sub>, is mirrored to drive the cross-coupled pairs, MN3-MN4 and MN7-MN8. At the same time, the differential pairs, MN1-MN2 and MN5-MN6, are disabled. Thus, the internal data, Y and Yb, are latched by the cross-coupled pair, MN3-MN4, and the output data, Q and Qb, are held at their previous values by the cross-coupled pair, MN7-MN8. When CLK becomes logic 1, the bias current, I<sub>SS</sub>, is mirrored to the tail current of the two differential pairs, MN1-MN2 and MN5-MN6, and the cross-coupled pairs are disabled. Thus, the input data, D and Db, are sampled and the values of Y, Yb, Q, and Qb are updated. The MOS transistors, Mbias1 and Mbias2, improve the matching of the current mirror by keeping the drain-source voltages of MN9-12, MN10b, and MN12b at the same level. The bias voltage, VB, is provided by the internal bias circuit.

## C. Coarse-Fine Tuning VCO

The Coarse-Fine Tuning VCO is composed of four stages of DIDO (Differential Input Differential Output) delay cells, as shown in Fig. 7. Two DtoS (Differential to Single-ended) Converters generate the quadrature clock signals, CLKI and CLKQ, respectively. Each DIDO delay cell is driven by MN5, which provides the basic tail current that ensures the free-running frequency. Coarse tuning of the oscillation frequency is achieved by adjusting the gate voltage of MN7, and fine tuning is controlled by MN6. To match the current through MP1-MP4, the variation of $I_{D,MN6}$ is complementary to the variation of $I_{D,MN7}$. Thus, the Phase Bias circuit is used to generate the fine tuning voltage, VFN, which varies inversely with Vfine, as shown in Fig. 8.
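To give a feel for the tuning range the delay cells must cover, the sketch below applies the standard first-order ring-oscillator relation f = 1/(2·N·t<sub>d</sub>) for an N-stage differential ring. This relation is a textbook approximation and is not taken from the paper; the function names are hypothetical, and the actual delay of the DIDO cell depends on the tail currents set by MN5, MN6, and MN7.

```python
def stage_delay_s(f_osc_hz, n_stages=4):
    """Per-stage delay for an idealized N-stage differential ring oscillator."""
    return 1.0 / (2 * n_stages * f_osc_hz)


def stage_phase_deg(n_stages=4):
    """Phase shift contributed by each stage in the same idealized model."""
    return 180.0 / n_stages


for f_hz in (1e9, 2e9, 6.5e9):
    print(f"{f_hz / 1e9:.1f} GHz -> {stage_delay_s(f_hz) * 1e12:.2f} ps per stage")
print(stage_phase_deg())  # 45.0 degrees per stage for four stages
```

Under this idealized model, each of the four stages contributes 45 degrees, so taps two stages apart would be in quadrature, and covering 1-6.5 GHz requires the per-stage delay to span roughly 125 ps down to about 19 ps.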



Fig. 9. Layout of the proposed design.



Fig. 10. Simulated waveforms at various PVT corners.

# III. IMPLEMENTATION AND SIMULATION RESULTS

The proposed design is implemented in a typical 40 nm CMOS process. Fig. 9 shows the layout of the proposed CDR circuit, whose area is 537.585 $\times$ 537.42 $\mu$m<sup>2</sup>. Post-layout simulation is performed with HSPICE at all PVT corners, with a PRBS7 pseudo-random binary sequence as the input pattern. Across all corners, the frequency is locked at 2 GHz in 180 ns, as shown in Fig. 10. The VCO control signals, Vcoarse and Vfine, converge to 0.27-0.55 V and 0.4-0.52 V, respectively, at all corners. Fig. 11 shows the simulated waveforms locked at 6.5 GHz. When LOCK changes to logic 1, the frequency is locked by the CDR circuit and Vcoarse becomes stable. The phase lock loop then starts to work, and the control signals, PD\_UP, PD\_DN, and Vfine, are used to adjust the phase. Fig. 12 shows the simulated eye diagram of the retimed data at 6.5 GHz. The simulated P2P jitter and RMS jitter are 17.1 ps and 5.79 ps, respectively. Table I compares the performance with several prior works. Among the listed designs, the proposed design has the widest lock-in range (5.5 Gb/s) and the highest FOM<sub>p2p</sub> (49.48) while keeping the jitter low.

| Parameter                       | This work        | [2]              | [10]           | [11]          | [12]          |
|---------------------------------|------------------|------------------|----------------|---------------|---------------|
| Year                            | 2022             | 2020             | 2016           | 2017          | 2016          |
| Publication                     | N/A              | TCASII           | NEWCAS         | TCASII        | ISCAS         |
| Result                          | sim.             | meas.            | meas.          | sim.          | meas.         |
| Process (nm)                    | 40               | 180              | 40             | 180           | 180           |
| Supply voltage (V)              | 0.9              | 1.8              | 0.9            | 1.8           | 1.6-2         |
| Data rate (Gb/s)                | 1-6.5            | 0.42-3.45        | 2.6-6.4        | 0.3-4         | 0.4-2.4       |
| Lock-in range (Gb/s)            | 5.5              | 3.03             | 3.8            | 3.7           | 2             |
| Architecture                    | Full-rate        | Full-rate        | Half-rate      | Half-rate     | Full-rate     |
| CDR type                        | Dual-loop        | Dual-loop        | MDLL           | PD, DQFD      | PLL-based     |
| P2P jitter (ps)                 | 17.1 @ 6.5 Gb/s  | 29.8 @ 3.45 Gb/s | 13.3 @ 6 Gb/s  | 83.6 @ 4 Gb/s | 92 @ 2 Gb/s   |
| RMS jitter (ps)                 | 5.79 @ 6.5 Gb/s  | 4.33 @ 3.45 Gb/s | 1.3 @ 6 Gb/s   | 12.4 @ 4 Gb/s | 5.4 @ 2 Gb/s  |
| Area (mm<sup>2</sup>)           | 0.289            | 0.44             | N/A            | 0.16          | N/A           |
| Power (mW)                      | 54.51 @ 6.5 Gb/s | 20.3 @ 3.45 Gb/s | 1.8 @ 6.4 Gb/s | 71 @ 4 Gb/s   | 18.4 @ 2 Gb/s |
| FOM<sub>p2p</sub><sup>†</sup>   | 49.48            | 29.47            | 47.62          | 11.06         | 10.87         |

TABLE I COMPARISON WITH SEVERAL PRIOR WORKS

Note: <sup>†</sup> FOM<sub>p2p</sub> = lock-in range (Gb/s) / P2P jitter (UI), where jitter (UI) = jitter (s) / 1 UI (s).
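The figure of merit defined in the note can be reproduced from the other rows of Table I. The short Python sketch below assumes that 1 UI is the bit period at the data rate quoted next to each P2P jitter value; the function name `fom_p2p` is hypothetical.

```python
def fom_p2p(lock_in_range_gbps, p2p_jitter_ps, data_rate_gbps):
    """FOM = lock-in range (Gb/s) / P2P jitter (UI)."""
    ui_ps = 1000.0 / data_rate_gbps      # one unit interval in ps
    jitter_ui = p2p_jitter_ps / ui_ps    # jitter as a fraction of the UI
    return lock_in_range_gbps / jitter_ui


# (lock-in range in Gb/s, P2P jitter in ps, data rate in Gb/s) from Table I
entries = {
    "This work": (5.5, 17.1, 6.5),
    "[2]": (3.03, 29.8, 3.45),
    "[10]": (3.8, 13.3, 6.0),
    "[11]": (3.7, 83.6, 4.0),
    "[12]": (2.0, 92.0, 2.0),
}
for name, args in entries.items():
    print(name, round(fom_p2p(*args), 2))
```

For this work, 17.1 ps at 6.5 Gb/s corresponds to about 0.111 UI, which gives 5.5 / 0.111 ≈ 49.48, in agreement with the table.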



Fig. 11. Simulated waveforms locked at the frequency of 6.5 GHz.



Fig. 12. Simulated eye diagram of the retimed data at 6.5 GHz.

# IV. CONCLUSION

This paper proposes a CDR circuit with a lock-in range of 1-6.5 GHz and an RMS jitter of 5.79 ps. The dual-loop control combines the Phase Detector, the Frequency Increase/Decrease Control circuit, the Modified DQFD, and the Coarse-Fine Tuning VCO to achieve a wide lock-in range and low jitter.

# ACKNOWLEDGMENT

This research was partially supported by the Ministry of Science and Technology under grant no. MOST 110-2224-E-110-004-, MOST 109-2218-E-110-007-, MOST 109-2221-E-110-079- and MOST 110-2218-E-110-008-. Moreover, the authors would like to express their deepest appreciation to TSRI (Taiwan Semiconductor Research Institute) of NARL (National Applied Research Laboratories), Taiwan, for the EDA tool assistance.

# REFERENCES

- [1] C. Gimeno, D. Flandre and D. Bol, "Low-power half-rate dual-loop clock-recovery system in 28-nm FDSOI," 2018 IEEE 9th Latin American Symposium on Circuits & Systems (LASCAS), pp. 1-4, 2018.
- [2] K. -S. Son, T. -J. An, Y. -H. Moon and J. -K. Kang, "A 0.42-3.45 Gb/s referenceless clock and data recovery circuit with counter-based unrestricted frequency acquisition," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 67, no. 6, pp. 974-978, June 2020.
- [3] P. M. Ha, N. Huu Tho, N. Thanh and Q. Nguyen-The, "An improved wideband referenceless cdr with up pulse selector for frequency acquisition," 2020 International Conference on Advanced Technologies for Communications (ATC), pp. 56-60, 2020.
- [4] K.-J. Hsiao, M.-H. Lee and T.-C. Lee, "A clock and data recovery circuit with wide linear range frequency detector," 2008 IEEE International Symposium on VLSI Design, Automation and Test (VLSI-DAT), pp. 121-124, 2008.
- [5] P. M. Ha, N. H. Tho, H. H. Hanh and Q. Nguyen-The, "A wide-band reference-less bidirectional continuous-rate frequency detector," 2019 3rd International Conference on Recent Advances in Signal Processing, Telecommunications & Computing (SigTelCom), pp. 25-29, 2019.
- [6] B. Stilling, "Bit rate and protocol independent clock and data recovery," *Electronics Letters*, vol. 36, no. 9, pp. 824-825, Apr. 2000.
- [7] K. Lee and J. Sim, "A 0.8-to-6.5 Gb/s continuous-rate reference-less digital cdr with half-rate common-mode clock-embedded signaling," in *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 4, pp. 482-493, April 2016.
- [8] B. Razavi, "Challenges in the design of high-speed clock and data recovery circuits," in *IEEE Communications Magazine*, vol. 40, no. 8, pp. 94-101, Aug. 2002.
- [9] Y. S. Tan, K. S. Yeo, C. C. Boon and M. A. Do, "Design of a hysteresis lock detector for dual-loops clock and data recovery circuit," 2011 IEEE International Conference of Electron Devices and Solid-State Circuits, pp. 1-2, 2011.
- [10] K. Gharibdoust, A. Tajalli and Y. Leblebici, "A wideband MDLL with jitter reduction scheme for forwarded clock serial links in 40 nm CMOS," 2016 14th IEEE International New Circuits and Systems Conference (NEWCAS), pp. 1-4, 2016.
- [11] Y. -L. Lee, S. -J. Chang, Y. -C. Chen and Y. -P. Cheng, "An unbounded frequency detection mechanism for continuous-rate cdr circuits," in *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 64, no. 5, pp. 500-504, May 2017.
- [12] H. Liu, C. Su, C. Cheng and W. Liu, "Design and modeling of PLL-based clock and data recovery circuits with periodically embedded clock encoding for intra-panel interfaces," 2016 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 2234-2237, 2016.