# LOW-COST NTSC DIGITAL VIDEO DECODER USING 4θ-BASED DDFS<sup>§</sup>

Chua-Chin Wang<sup>†</sup>, Yih-Long Tseng, Chun-Chih Chen, and Chiuan-Shian Chen

Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424, email: ccwang@ee.nsysu.edu.tw

#### ABSTRACT

A low-cost digital video decoder for NTSC signals is presented in this paper. The fully digital design employs a DDFS (direct digital frequency synthesizer) based upon the trigonometric quadruple-angle formula, and an adaptive digital PLL to track and lock the demodulation clocks. The complexity of the digital video decoder is thereby drastically reduced. The overall area of the proposed design is 6.0 mm² (39 K gates). The maximum power dissipation is 86 mW at the highest clock rate, 21.48 MHz.

Index terms: video decoder, NTSC, color burst, DDFS, line delay, comb filter

## 1. INTRODUCTION

The video decoder (VD) plays a very important role in the design of TVs, the most popular consumer electronics product, and particularly of NTSC-based TVs [1], [2], [3], [5]. The tasks of the VD include Y/C separation, sync separation, and color demodulation. Many prior works have pursued digital versions of the NTSC decoder, e.g., [4], [7], [8], which have the edge in compatibility and cost over traditional analog NTSC solutions, e.g., [6]. However, all of these digital solutions require sophisticated adaptive comb filters to cope with poor signal quality so that Y/C separation is feasible. The price of these methods is area as well as power dissipation, since a larger gate count is needed to implement a 2-line or even a 3-line comb filter. The difficulty in recovering the color information results from the poor quality of the received signal, which in turn introduces serious jitter in the color burst clock as well as in H-sync and V-sync. We thus propose a novel method that recovers the clock with a DPLL (digital PLL) and a DDFS clock generator based upon a trigonometric quadruple-angle $(4\theta)$ formula. The recovery of the color burst clock and the sync signals is drastically simplified, so the hardware cost is reduced.

## 2. DIGITAL VIDEO DECODER DESIGN

A color bar sample of an NTSC line is shown in Fig. 1. An ADC converts the NTSC analog signal into 8-bit sampled data, which are the input to the proposed video decoder. An overview of the NTSC video decoder system is shown in Fig. 2. An NTSC signal is given in Fig. 3.



Figure 1: A color bar NTSC signal

<sup>§</sup> This research was partially supported by National Science Council under grant NSC 91-2218-E-110-001 and 91-2622-E-110-004.

<sup>†</sup>the contact author



Figure 2: Overview of a digital NTSC video decoder



Figure 3: NTSC signal in time domain and frequency domain

### 2.1. Y/C Separation

The most important task of the video decoder is to extract the Y (luminance) and C (chrominance) signals to restore images. A widely used solution is to employ so-called comb filters. Prior works proposed many complicated comb filter designs in pursuit of decoded image quality. What is worse, a high-resolution ADC might be required. The root of this problem is the line-length variation of the received NTSC signal, which makes it difficult to synchronize to the H-sync edges. We adopt two methods to relax this problem:

**Weighting Window:** The system clock is set above 20 MHz, so that 6 points are sampled at each edge of the signal. The correct edge is detected by convolving the window shown in Fig. 4 with the NTSC signal. If the convolution result is close to 0, the edge is detected, as shown in Fig. 5. Once the edges are correctly detected, the timing signals, including H-sync, V-sync, color burst, and odd/even field, are extracted by using counters.
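The text does not give the weights of the window in Fig. 4, so the following is only a rough sketch of zero-crossing edge detection by windowed convolution. The uniform 6-tap window, the slice level, and the sample values of the falling H-sync edge are all hypothetical and were chosen purely for illustration.

```python
def detect_edge(samples, window, slice_level):
    """Slide the window over (sample - slice_level) and report where the
    weighted sum first falls from positive to zero or below.

    Returns the index of the sample just right of the window centre,
    or None if no such crossing is found.
    """
    n = len(window)
    prev = None
    for i in range(len(samples) - n + 1):
        acc = sum(w * (samples[i + k] - slice_level)
                  for k, w in enumerate(window))
        if prev is not None and prev > 0 >= acc:
            return i + n // 2
        prev = acc
    return None


# Hypothetical 8-bit samples: blanking level 60 ramping down to sync tip 4.
line = [60] * 10 + [52, 44, 36, 28, 20, 12] + [4] * 10
window = [1, 1, 1, 1, 1, 1]   # assumed weights, not those of Fig. 4
edge = detect_edge(line, window, slice_level=32)
```

With these assumed values the weighted sum reaches zero when the window is centred on the ramp, so `edge` points at the first sample below the slice level. A counter started at that index could then time the remaining events of the line.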

Figure 4: Convolution of an NTSC signal with a weighting window

Figure 5: Zero-crossing edge detection

**Digital PLL:** Jitter in the burst clock makes the burst clock difficult to lock. The received NTSC signal has neither a constant swing nor a constant amplitude. A sophisticated DDFS is used to replace the common VCO in Fig. 6. Notably, the large and slow cos and sin ROMs can be removed by using a modified $4\theta$-based DDFS, which is described later in the text.

The PFD (phase-frequency detector) in Fig. 7 detects the phase difference between the clocks generated by the DDFS, $\cos(wt + \theta_o)$ and $\sin(wt + \theta_o)$, and the input burst reference clock, $\cos(wt + \theta_i)$. $\theta_e$ is the output phase difference.

$$\begin{aligned}
&[\sin(wt + \theta_o) \cdot \cos(wt + \theta_i)] \cdot [\operatorname{sgn}(\cos(wt + \theta_o)) \cdot \cos(wt + \theta_i)] \\
&\quad = \pm \sin \theta_e \cdot \operatorname{sgn}(\pm \cos \theta_e) \approx \theta_e \quad \text{if } \theta_e \text{ is small}
\end{aligned} \tag{1}$$

The sampling frequency is selected to be an integer multiple of the sub-carrier frequency to minimize the phase error. Otherwise, a significant variation of the phase error will be produced. For instance, since there are 18 samples in a cycle, the integrator in Fig. 7 will accumulate the error of the non-fixed periods and generate the large fluctuations shown in Fig. 8. With the same phase errors, 7% and 10%, the PFD cannot lock the clock given a 20 MHz sampling frequency. By contrast, if the sampling frequency is identical to the system clock, which is 21.48 MHz, these phase errors are clearly detected.
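The effect of the sampling ratio can be illustrated numerically. The sketch below takes one reading of Eq. (1): the two products are averaged over a 6-sample window (a stand-in for the integrator of Fig. 7), and the sign is taken from the averaged cos-cos product. The window length, the 0.1 rad phase error, and all function names are assumptions made for illustration, not details taken from the design.

```python
import math


def pfd_output(fs_hz, fsc_hz, theta_o, theta_i, start, n_samples=6):
    """Phase-detector output for one window of samples starting at `start`."""
    wt = [2 * math.pi * fsc_hz * (start + n) / fs_hz for n in range(n_samples)]
    ref = [math.cos(x + theta_i) for x in wt]
    # Averaged sin*cos product: 0.5*sin(theta_e) plus a 2wt residue.
    q = sum(math.sin(x + theta_o) * r for x, r in zip(wt, ref)) / n_samples
    # Averaged cos*cos product: only its sign is used.
    i = sum(math.cos(x + theta_o) * r for x, r in zip(wt, ref)) / n_samples
    return 2 * q * math.copysign(1.0, i)


def spread(fs_hz, fsc_hz, theta_e, windows=50, n_samples=6):
    """Max-min of the detector output over consecutive windows."""
    outs = [pfd_output(fs_hz, fsc_hz, theta_e, 0.0, m * n_samples, n_samples)
            for m in range(windows)]
    return max(outs) - min(outs)


THETA_E = 0.1  # hypothetical phase error in radians
locked_out = pfd_output(21.48e6, 3.58e6, THETA_E, 0.0, 0)
spread_integer = spread(21.48e6, 3.58e6, THETA_E)   # 6 samples per cycle
spread_fractional = spread(20.0e6, 3.58e6, THETA_E)  # about 5.59 per cycle
```

At 21.48 MHz each window spans exactly one sub-carrier cycle, so the double-frequency term cancels and every window returns the same value, close to $\sin\theta_e$. At 20 MHz the cancellation is incomplete and the output varies from window to window, which is in line with the fluctuations described above.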



Figure 6: DDFS-based VCO block diagram



Figure 7: PFD block diagram

**Loop Filter:** A digital version of the loop filter (LF), which is a first-order IIR filter, is shown in Fig. 9. $C_1$ and $C_2$ are constants determined by the following equations.

$$w_{n} = \frac{1}{T} \sqrt{\frac{4C_{2}K_{o}K_{d}}{4 - (2C_{1} + C_{2})K_{o}K_{d}}}$$

$$\zeta = \frac{C_{1}}{2 \cdot C_{2}} \sqrt{\frac{4C_{2}K_{o}K_{d}}{4 - (2C_{1} + C_{2})K_{o}K_{d}}}$$

$$T_{n} = \frac{2\pi}{w_{n}}, \qquad (2)$$

where T is the sampling period, $K_d$ and $K_o$ denote the gains of the PFD and the VCO, respectively, $w_n$ is the natural frequency, $\zeta$ is the damping factor, and $T_n$ is the lock time.
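As a quick aid for choosing $C_1$ and $C_2$, Eq. (2) can be evaluated directly. The following is a minimal sketch; the coefficient values and the unity loop gain $K_oK_d$ are hypothetical and are not the values used in the chip.

```python
import math


def loop_parameters(c1, c2, ko_kd, t_sample):
    """Evaluate Eq. (2): returns (w_n, zeta, T_n) for the digital loop filter."""
    denom = 4 - (2 * c1 + c2) * ko_kd
    if denom <= 0 or c2 * ko_kd <= 0:
        raise ValueError("coefficients give no real natural frequency")
    root = math.sqrt(4 * c2 * ko_kd / denom)
    w_n = root / t_sample          # natural frequency in rad/s
    zeta = c1 / (2 * c2) * root    # damping factor
    t_n = 2 * math.pi / w_n        # lock time in seconds
    return w_n, zeta, t_n


# Hypothetical coefficients at the 21.48 MHz sampling rate.
T_SAMPLE = 1 / 21.48e6
w_n, zeta, t_n = loop_parameters(c1=0.1, c2=0.01, ko_kd=1.0, t_sample=T_SAMPLE)
```

With these assumed values the damping factor comes out slightly above 0.5. Raising $C_1$ relative to $C_2$ increases $\zeta$, while $w_n$ depends mainly on $C_2$.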



Figure 9: Digital loop filter

**ROM-less DDFS:** A modified $4\theta$-based DDFS [9] is employed to carry out the function of the required VCO such that the slow and large ROMs in Fig. 6 can be removed. The central frequency is set to 3.58 MHz. Referring to Fig. 6, the value p fed into the phase accumulator determines the central frequency,

$$\frac{F_{SC}}{F_S} = \frac{p}{2^k - 1},\tag{3}$$

where k is the wordlength of the phase accumulator, $F_{SC}$ is the sub-carrier frequency (3.58 MHz), and $F_S$ is the sampling frequency (21.48 MHz). p is derived to be 44739242. The sin and cos values are pre-computed by the method of [9] and hard-wired in combinational logic, which eliminates the slow embedded ROMs.
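The quoted value of p can be checked against Eq. (3). The wordlength k is not stated in the text; the sketch below assumes k = 28, which is the value that reproduces p = 44739242 when the result is rounded down. The function names are illustrative.

```python
def tuning_word(f_sc_hz, f_s_hz, k):
    """Solve Eq. (3) for p, rounding down to an integer."""
    return (f_sc_hz * (2**k - 1)) // f_s_hz


def synthesized_frequency(p, f_s_hz, k):
    """Centre frequency produced by a given p, again from Eq. (3)."""
    return p * f_s_hz / (2**k - 1)


K = 28  # assumed accumulator wordlength, not given in the text
p = tuning_word(3_580_000, 21_480_000, K)
f_out = synthesized_frequency(p, 21_480_000, K)
```

Because 21.48 MHz is exactly six times 3.58 MHz, p is simply $(2^{28} - 1)/6$ rounded down, and the resulting centre frequency differs from 3.58 MHz by well under 1 Hz.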

### 2.2. Chrominance Demodulator

The chrominance demodulator receives the digital PLL sub-carrier output as well as the C generated by the Y/C Separation block as shown in Fig. 10. The C is derived to be [3]:

$$C = (C_b - 128) \cdot 0.504 \cdot \sin wt + (C_r - 128) \cdot 0.711 \cdot \cos wt \tag{4}$$

$C_b$ and $C_r$ are obtained by low-pass filtering the products of C and the outputs of the digital PLL.

$$\begin{aligned}
&2 \cdot 1.406 \cdot \cos wt \cdot C \\
&\quad = (C_r - 128) + \underbrace{(C_r - 128) \cdot \cos 2wt}_{\text{filtered}} + \underbrace{(C_b - 128) \cdot 0.504 \cdot 1.406 \cdot \sin 2wt}_{\text{filtered}}
\end{aligned} \tag{5}$$

$$\begin{aligned}
&2 \cdot 1.984 \cdot \sin wt \cdot C \\
&\quad = (C_b - 128) - \underbrace{(C_b - 128) \cdot \cos 2wt}_{\text{filtered}} + \underbrace{(C_r - 128) \cdot 0.711 \cdot 1.984 \cdot \sin 2wt}_{\text{filtered}}
\end{aligned} \tag{6}$$

The LPF that performs the filtering operation, a 20-tap transposed FIR filter, is also shown in Fig. 10. Notably, the 128 in Eqn. (5) and (6) is a DC offset of the 8-bit ADC. Hence, the correct values of $C_b$ and $C_r$ are obtained by adding the 128 DC offset to the filtered outputs of Eqn. (5) and (6).
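Eqn. (4) to (6) can be exercised end to end with a small numerical sketch. Here the 20-tap FIR filter is replaced by a plain average over whole sub-carrier cycles, which is an assumption made only to keep the example short; at 6 samples per cycle such an average removes the $2wt$ terms. The function names and the test values of $C_b$ and $C_r$ are hypothetical.

```python
import math

N = 6  # samples per sub-carrier cycle (21.48 MHz / 3.58 MHz)


def modulate(cb, cr, n):
    """Chrominance sample n according to Eq. (4)."""
    wt = 2 * math.pi * n / N
    return ((cb - 128) * 0.504 * math.sin(wt)
            + (cr - 128) * 0.711 * math.cos(wt))


def demodulate(chroma):
    """Recover (Cb, Cr) from chroma samples covering whole cycles.

    The mean over whole cycles stands in for the low-pass filter.
    """
    if len(chroma) % N != 0:
        raise ValueError("need a whole number of sub-carrier cycles")
    count = len(chroma)
    cr_mean = sum(2 * 1.406 * math.cos(2 * math.pi * n / N) * c
                  for n, c in enumerate(chroma)) / count   # Eq. (5)
    cb_mean = sum(2 * 1.984 * math.sin(2 * math.pi * n / N) * c
                  for n, c in enumerate(chroma)) / count   # Eq. (6)
    return cb_mean + 128, cr_mean + 128   # restore the DC offset


# Hypothetical colour: Cb = 90, Cr = 200, four sub-carrier cycles.
chroma = [modulate(90, 200, n) for n in range(4 * N)]
cb_hat, cr_hat = demodulate(chroma)
```

The recovered values land within a small fraction of one code of the originals; the residual comes from $1.406 \cdot 0.711$ and $1.984 \cdot 0.504$ being only approximately 1.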

## 3. SIMULATION AND IMPLEMENTATION

The Artisan 0.35 $\mu$m 1P4M CMOS cell library is adopted to implement the proposed design. Fig. 11 shows the die photo and the layout of the design. Pre-layout RTL and post-layout simulation results of the proposed video decoder are shown in Fig. 12 and Fig. 13, respectively. The overall characteristics of the proposed video decoder and a comparison with prior works are summarized in Table 1.



Figure 8: Phase error examples



Figure 10: Chrominance demodulator

## 4. CONCLUSION

We have proposed a novel NTSC video decoder design in this paper. The traditional VCO is replaced with a ROM-less $4\theta$-based DDFS to ease the phase tracking problem, so that the requirement for a complicated comb filter design is relaxed. Both the chip area and the power dissipation are reduced.

## 5. REFERENCES

- [1] EIA Standard, "Line 21 data services," EIA-608-A, Electronics Industries Alliance, 1999.
- [2] B. Grob and C. E. Herndon, *Basic Television and Video Systems*, McGraw-Hill, 1999.
- [3] K. Jack, *Video Demystified*, LLH Technology Publishing, 2001.
- [4] C.-C. Kuo and Y.-T. Chen, "New method for the implementation of an NTSC digital video decoder," *IEEE Trans. on Consumer Electronics*, vol. 48, no. 2, pp. 265-274, May 2002.
- [5] Y. C. Faroudja, "NTSC and beyond," *IEEE Trans. on Consumer Electronics*, vol. 34, no. 1, Feb. 1988.
- [6] M. Ohta, K. Kohiyama, N. Tahara, K. Sugihara, F. Asami, O. Kobayashi, Y. Hino, and T. Akiba, "A single-chip CMOS analog/digital mixed NTSC decoder," *IEEE J. of Solid-State Circuits*, vol. 25, no. 6, pp. 1464-1469, Dec. 1990.
- [7] Philips, "PAL/NTSC/SECAM Video Decoder with Adaptive PAL/NTSC Comb Filter, VBI-Data Slicer and High Performance Scaler," Data Sheet: SAA7114H, Mar. 15, 2000.
- [8] Techwell, "Enhanced NTSC/PAL/SECAM Video Decoder," Data Sheet: TW99, Mar. 30, 2000.
- [9] C.-C. Wang, H.-C. She, and R. Hu, "A ROMless direct digital frequency synthesizer by using trigonometric quadruple angle formula," *9th IEEE International Conference on Electronics, Circuits and Systems (ICECS 2002)*, pp. 65-68, Sep. 2002.



Figure 11: Die photo and layout of the proposed video decoder

|            | [4]† | [6]                               | ours                               |
|------------|------|-----------------------------------|------------------------------------|
| Area       | N/A  | $67.76~\mathrm{mm^2}$ ‡           | $6.02~\mathrm{mm^2}$               |
| Power      | N/A  | 980 mW @ 15 MHz                   | 86 mW @ 21.48 MHz                  |
| Gate count | N/A  | N/A                               | 39 K                               |
| Process    | N/A  | $1.2~\mu\mathrm{m}$ 2P2M          | $0.35~\mu\mathrm{m}$ 1P4M          |

†: functional simulation by SystemView S/W only

‡: estimated from the die photo

Table 1: Performance comparison



Figure 12: Pre-layout simulation result



Figure 13: Post-layout simulation result