# This document is downloaded from DR-NTU (https://dr.ntu.edu.sg) Nanyang Technological University, Singapore.

# A micropower low-distortion digital class-D amplifier based on an algorithmic pulsewidth modulator

Gwee, Bah Hwee; Chang, Joseph Sylvester; Victor, Adrian

2005

Gwee, B. H., Chang, J. S., & Victor, A. (2005). A micropower low-distortion digital class-D amplifier based on an algorithmic pulsewidth modulator. IEEE Transactions on Circuits and Systems-I: Regular Papers, 52(10).

https://hdl.handle.net/10356/96769

https://doi.org/10.1109/TCSI.2005.852920

© 2005 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.


# A Micropower Low-Distortion Digital Class-D Amplifier Based on an Algorithmic Pulsewidth Modulator

Bah-Hwee Gwee, Senior Member, IEEE, Joseph S. Chang, and Victor Adrian

Abstract—A digital Class-D amplifier comprises a pulsewidth modulator (PWM) and an output stage. In this paper, we simplify the time-domain expression for the algorithmic PWM linear interpolation (LI) sampling process and analytically derive its double Fourier series expression. By means of our derivation, we show that the nonlinearities of the LI process are very low, especially given its modest computation complexity and low sampling frequency. In particular, the total-harmonic distortion (THD)  $\approx$  0.02% and foldback distortion is -98.4 dB (averaged from modulation indexes M=0.1 to 0.9) for the 4-kHz voiceband bandwidth @1-kHz input, 48-kHz sampling. We also describe a simple hardware for realizing the LI process. We propose a frequency doubler (with small overheads) for the pulse generator for the PWM, thereby reducing the counter clock rate by 2, leading to a substantial  $\sim 47\%$  power dissipation reduction for the Class-D amplifier. By means of computer simulations and on the basis of experimental measurements, we verify our double Fourier series derivation and show the attractive attributes of a Class-D amplifier embodying our simplified LI sampling expression and reduced clock rate pulse generator. We show that our Class-D amplifier design is micropower ( $\sim$ 60  $\mu$ W @1.1 V and 48-kHz sampling rate, and THD  $\approx$ 0.03%) and is suitable for practical power-critical portable audio devices, including digital hearing aids.

Index Terms—Class-D amplifier, digital hearing aids, linear interpolation, pulse generator, pulsewidth modulation (PWM), sampling process.

# NOMENCLATURE

| Symbol | Description |
|--------|-------------|
| $B$ | Pulse duration in the absence of modulation. |
| $f_i$ | Modulating signal frequency. |
| $f_s$ | Sampling (carrier) frequency. |
| $F(\theta, \phi)$ | PWM pulse amplitude function. |
| $H$ | Amplitude of the PWM pulse. |
| $J_n(x)$ | Bessel function of the first kind. |
| $k$ | Ratio of zero-input duty cycle of the input signal to the ideal 50% zero-input duty cycle. |
| $M$ | Modulation index $(0 \le M \le 1)$. |
| $m$ | Carrier harmonic index. |
| $n$ | Modulating signal harmonic index. |
| $p$ | Ratio of the carrier frequency to the input modulating signal frequency $(f_s/f_i)$. |
| $Q$ | Amplitude of the sinusoidal input modulating signal. |
| $S_1$ | Past Uniform PWM sampled point. |
| $S_2$ | Present Uniform PWM sampled point. |
| $S_{\rm NS}$ | Current Natural PWM sampled point. |
| $t_p$ | PWM pulsewidth. |
| $t$ | Time. |
| $T$ | Sampling period. |
| $V_s$ | Amplitude of the carrier signal. |
| $\omega_i$ | Modulating signal angular frequency. |
| $\omega_s$ | Carrier angular frequency. |
| $\varepsilon$ | Variable sampling factor for the Linear Interpolation sampling process algorithm. |
| $\theta$ | Phase shift of the carrier signal. |
| $\phi$ | Phase shift of the input modulating signal. |
| $\Omega(\theta, \phi)$ | Pulsewidth function. |
| $\Omega_T(\theta, \Phi)$ | Transformed pulsewidth function. |

Manuscript received January 2, 2004; revised September 1, 2004. This paper was presented in part at the IEEE International Symposium on Circuits and Systems, Bangkok, Thailand, May 25–28, 2003. This paper was recommended by Associate Editor I. M. Filanovsky.

The authors are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: ebhgwee@ntu.edu.sg).

Digital Object Identifier 10.1109/TCSI.2005.852920

# I. INTRODUCTION

Digital Class-D amplifiers (amps), also known as digital pulse modulation amps, are increasingly prevalent as power amps in audio applications, in particular portable audio devices whose critical parameters include low-voltage (1.1–1.4 V) and low-power ($\sim$1 mW) operation, and small integrated circuit (IC) area. An example of such an application is the digital hearing instrument (hearing aid), whose total quiescent current budget is approximately 1000 $\mu$A @1.1 V. As the small battery cell used has an energy capacity of the order of 100 mAh @1.1–1.4 V, the lifespan of the cell is $\sim$100 h. It is of interest to appreciate that because of this low current budget, most of the power should be allocated to complex signal processing (such as noise reduction [1]) as opposed to signal conditioning. The Class-D amp is particularly advantageous in this application because when properly designed [2], its output stage features high power efficiency (of the order of 90%) over a large modulation index range (signal swing) or where the crest factor is high (e.g., 15 dB). The digital Class-D amp is also advantageous when interfaced to a digital processor (for example, in a digital hearing instrument) because the need for a digital-to-analog converter is eliminated, which yields immediate power savings and reduced hardware.

To appreciate the significance of power dissipation as one of the primary figures of merit for the Class-D amp, note that the typical audio power output of a hearing instrument for the prevalent mild-to-moderate hearing impaired is of the order of 100 $\mu$W @1.1–1.4 V; also see Section IV. In view of this, the total quiescent power dissipation of the Class-D amp should be of the order of 50–60 $\mu$W.

Fig. 1. Digital Class-D amp with the PWM realized by a sampling process and a pulse generator.

In general, the Class-D amp, as depicted in Fig. 1, comprises a pulsewidth modulator (PWM) and an output stage. The output stage drives a load usually consisting of a low-pass filter and an output transducer. We have earlier described a methodology [2] to optimize the design of the output stage of the Class-D amp for maximum power efficiency (and that results in a small IC area). It is well established [3] that an ideal PWM of the Class-D amp outputs a PWM output signal with zero harmonic distortion, that is the total harmonic distortion (THD) = 0.

The methods for generating the PWM signal for a Class-D amp may be classified into three general methods: 1) algorithmic PWM; 2) oversampled Delta-Sigma ( $\Delta\Sigma$ ) pulse-code modulation (PCM); and 3) click modulation. We will now briefly review these in turn.

The algorithmic PWMs essentially involve a signal sampling process to digitally emulate the natural sampling (NS) process, followed by a pulse generator. This signal sampling process is sometimes termed the crosspoint deriver and as its name implies, the process simply involves estimating the crosspoint or intersection of the modulating signal and the carrier signal—the NS process, see Section II later.

One of the impetuses for the wealth of reported algorithmic PWM sampling processes is the desire for a low-distortion PWM output (<0.5%) with a low sampling rate (for example, 48 kHz) and with modest computation complexity (for example, two additions/subtractions and one division operation per sample). These attributes are highly desirable in view of micropower operation for power-critical portable applications such as hearing instruments. This is because a high sampling frequency, the increased computation rate arising from more samples per unit time, and the corresponding higher clock rate in the pulse generator all translate to undesirable higher power dissipation. Further, a low computation complexity translates to simpler hardware, hence lower cost and usually higher reliability.

The reported sampling processes for the algorithmic PWM methods include the linear interpolation (LI) [4], [5], pseudonatural PWM [5], static-filter PWM [8], weighted PWM and its variants [9], derivative PWM [10], parabolic correction PWM [23], prediction correction PWM [28], [29] and more recently, our earlier proposed Delta compensation ($\delta$C) PWM [11] sampling processes. At the outset, we remark that the LI process offers one of the lowest nonlinearities with very modest computation complexity and with low sampling rate. The other processes may offer lower nonlinearities but at the high cost of substantially more complex computation and in some cases, requiring a higher sampling rate. We will qualify and quantify these parameters and briefly describe these different algorithmic sampling processes in our review in Section II later.

For completeness, we remark that the mechanisms of the nonlinearities of low-voltage analog Class-D amps, based on NS but with a quasi-linear carrier, for power-critical analog hearing instruments are now well understood from our recent work [13].

The pulse generators for the algorithmic PWM include the clock-counter [14], tapped-delay-line [15], a hybrid combination of clock-counter cum tapped-delay-line [16], and the clock-counter cum noise-shaper approach [17], [18], abbreviated as the CNS pulse generator in this paper. Of these designs, the CNS pulse generator is the preferred design because of its robustness (in the sense that its parameters are virtually independent of fabrication process variations) and because all its building blocks are compatible with standard CMOS fabrication processes. As in the case of the sampling processes, it is highly desirable that the clock rate of the counter embodied in the CNS pulse generator is low for low power dissipation. This is because, as we will later show, the pulse generator dominates the power dissipation in the Class-D amp, and its clock counter is the functional block that dissipates the largest power.
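To make the link between counter resolution and clock rate concrete, the following is a minimal sketch of a plain clock-counter pulse generator, assuming a trailing-edge pulse and a counter that steps through $2^{N}$ ticks per sampling period. The function names and the value of `bits` are hypothetical and chosen only for illustration; the noise shaper of the CNS approach is not modeled.

```python
def counter_clock_hz(sampling_hz, bits):
    """Clock rate a plain counter needs to time 2**bits steps per sampling period."""
    return sampling_hz * (1 << bits)


def counter_pwm_period(duty_word, bits):
    """One trailing-edge PWM period: output is high for duty_word ticks, then low."""
    ticks = 1 << bits
    if not 0 <= duty_word <= ticks:
        raise ValueError("duty_word must lie between 0 and 2**bits")
    return [1 if tick < duty_word else 0 for tick in range(ticks)]


if __name__ == "__main__":
    bits = 8  # hypothetical counter resolution, not a figure taken from the paper
    print(counter_clock_hz(48_000, bits))     # counter clock in Hz
    print(sum(counter_pwm_period(64, bits)))  # number of high ticks in the period
```

Each extra bit of counter resolution doubles the required clock, which is why the counter clock rate is a natural target for power reduction.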

The oversampled $\Delta\Sigma$ PCM-PWM method [19] is essentially a PCM-to-PWM converter where the PCM signal is the original sampled signal, the uniform sampled data. This conversion process is usually complex and comprises, in chronological order, oversampling by interpolation, $\Delta\Sigma$ modulation, and a pulse-density-modulation (PDM)-to-PWM conversion [19]. The oversampling effectively reduces the wordlength of the input samples in the interpolation process but at the cost of a higher clock frequency (typically $2^7$–$2^8$ times, thereby reducing the wordlength of the input samples by 3–4 LSBs) and increased computation, including the need for digital filtering. The subsequent $\Delta\Sigma$ modulation is also usually relatively complex and involves a delta-sigma modulator (typically fourth order or higher), and a PDM output is obtained. Bit-flipping techniques [20], [21] are sometimes used to reduce the high frequency of the PDM output. However, these techniques result in some errors and, as a result, may compromise the low-nonlinearity attribute of the oversampled $\Delta\Sigma$ PCM-PWM method and may possibly lead to instability. Finally, a PWM output is obtained via a PDM-to-PWM converter, and the PWM output is usually low resolution ($\sim$5 bits) but timed to a medium-speed clock ($\sim$10s of MHz). The analog output can be obtained by low-pass filtering the high-frequency PDM signal or the lower frequency PWM signal. In this final conversion, a table look-up may be used instead of direct computation to reduce the intensity of the computation. In short, it is instructive to appreciate that the computation of the oversampled $\Delta\Sigma$ PCM-PWM method is substantially more intensive than that of the algorithmic PWM method. However, the primary advantage of the oversampled $\Delta\Sigma$ PCM-PWM method hitherto is its low THD, typically 0.08% (compared to $\sim$0.2% in algorithmic PWMs), negligible intermodulation distortion over the entire audio bandwidth, and high signal-to-noise ratio (SNR). The reduced THD, however, is obtained at considerable hardware (including a larger IC area) and power dissipation costs, and these costs may be prohibitive for power-critical applications.

Click modulation PWM [6], [7] involves the application of Hilbert transform to convert the audio signal into a complex signal. It further involves an analytic exponential modulation to generate a binary signal having a separated baseband and finally the PWM signal is generated. In view of the computation complexity of click modulation PWM, it does not appear, in practice, to offer any advantage in terms of nonlinearities. This is because both the algorithmic PWM and  $\Delta\Sigma$  PCM-PWM methods can offer comparable low nonlinearities but with lesser computation demands.

In this paper, we briefly review reported algorithmic sampling processes, with emphasis on the classical NS and US sampling processes and the practical algorithmic sampling processes, namely the $\delta$C and LI. We qualify practical as that requiring modest computation and low sampling rates, and with low nonlinearities. In view of these parameters for power-critical audio voiceband applications, we provide a simplified expression for the LI process, thereby reducing the computation complexity of the LI process compared to that reported in the literature. It is interesting to note that in the reported work on the LI process [4], [5], [22], the mechanisms for the nonlinearities of the LI process, in particular the harmonic distortion and foldback distortion, are not well understood; the reported double Fourier series expression [4] for the LI process is imprecise. In view of this, we analytically derive the double Fourier series expression for the LI process [12] and by means of our derivation, we show that the nonlinearities of the LI process are the lowest among reported practical algorithmic sampling processes. In particular, for 1-kHz input and at sampling frequency $f_s=48$ kHz, the average THD $\approx$ 0.02% and the average foldback distortion is -98.4 dB for the 4-kHz bandwidth. The average THD and average foldback distortion are the arithmetic averages for the range of modulation indexes from M = 0.1 to 0.9 and we will henceforth use this definition for "average;" we do not include the condition for M = 1.0 to avoid the possibility of pulse dropout. This analysis is useful not only because it provides an analytical means to accurately predict the harmonic and foldback distortion nonlinearities, but also because it provides insight into the mechanisms of the nonlinearities and how the different parameters of the LI sampling process may be traded or compromised for a given design. We also provide a hardware design [12] to realize the LI process based on our simplified LI expression, and can show that this hardware is substantially simpler than that of the oversampled $\Delta\Sigma$ PCM-PWM method or the click modulation method with comparable nonlinearities.

In the same spirit of micropower operation, we reduce the demands of the clock rate of the counter in the pulse generator. We do this by proposing a frequency doubler with low hardware overheads [12]. This clock frequency reduction is significant because the counter dominates the power dissipation in the pulse generator that in turn dominates the power in the Class-D amp. This frequency doubler reduces the clock rate by a factor of 2 and this translates to a significant $\sim 50\%$ and $\sim 47\%$ power dissipation reduction of the CNS pulse generator and of the Class-D amp, respectively.
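As a rough consistency check on these two percentages, suppose (a simplifying assumption made only for this illustration) that the pulse generator accounts for a fraction $x$ of the total Class-D amp power and that the power of the remaining blocks is unaffected by the frequency doubler. A $\sim 50\%$ saving in the pulse generator then maps to the overall saving as

$$0.50\,x \approx 0.47 \quad\Longrightarrow\quad x \approx 0.94,$$

which is in line with the statement that the pulse generator dominates the power dissipation of the Class-D amp. The symbol $x$ is introduced here for illustration and is not a measured quantity.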

We verify our double Fourier series derivation and show the attractiveness of a digital Class-D amp embodying the LI sampling process and reduced clock rate CNS pulse generator by computer simulations and on the basis of experimental measurements using a Complex Programmable Logic Device (CPLD) and a prototype IC embodying a Class-D output stage [12]. We show that our Class-D amp design is micropower ( $\sim$ 60  $\mu$ W @1.1 V and 48 kHz sampling rate) and features low distortion (average THD  $\approx$ 0.02% and average foldback distortion = -98.4 dB @1 kHz input for the 4-kHz voiceband bandwidth), rendering it appropriate for practical portable power-critical audio devices, including digital hearing instruments. From a layout point of view, the required IC area for the digital PWM is also small, requiring only 18 342 transistors.

# II. ALGORITHMIC PWM SAMPLING PROCESS

# A. Digital PWM Sampling Processes

In this section, we will briefly review reported algorithmic PWM processes in the perspective of their applicability for power-critical applications.

We will first briefly review the different reported sampling processes and thereafter, compare the NS, US,  $\delta$ C, and LI sampling processes. As depicted in Fig. 2, NS requires an analog sampling process for precise sampling and is hence impractical to realize. This is because the amplitude of the input modulating signal at the intersection with the carrier waveform must be known. Put differently, for a true digital NS emulation, the complete contour of the input modulating signal needs to be known, that is the input modulating signal must be sampled at an infinite rate. The US process, on the other hand, is a highly simplistic algorithmic PWM and is depicted in Fig. 2 where  $S_1$  and  $S_2$  are the resultant US sampled points. As the US pulsewidth arising from the sampled points differs considerably from the ideal NS pulsewidth at low sampling rates, intolerably high THD (typically 2% @48 kHz sampling rate) results. At higher sampling rates, the THD of the US process improves and the same is observed for other algorithmic sampling processes.

The LI sampling is relatively simple and as depicted in Fig. 2, simply involves the computation of the intersection point of the carrier signal and a piece-wise linear approximation of the input modulating signal (formed by connecting two uniform sampled points). LI is arguably first reported in [4] but inadvertently for $\varepsilon=0.65$ and $\varepsilon=0.35$ in [22] instead of $\varepsilon=0$; we can show that $\varepsilon=0.65$ and $\varepsilon=0.35$ are poorly optimized, resulting in high nonlinearities. LI was later reformulated in [5] and reclassified as a first-order pseudonatural PWM (PNPWM). The time-domain pulsewidth expressions reported in [4] [see (2a)] and in [5] [see (2b)] are somewhat different due to the different normalizations (equivalent to the bounded range) assumed for the carrier waveform. We will later provide a simplified expression for the LI sampling process, hence simplified hardware and reduced power dissipation.

Fig. 2. Natural, uniform, $\delta$C, and LI sampling processes.

As described earlier, the PWM algorithmic sampling process may be generally interpreted as an algorithm to determine the intersection point between the carrier signal and the input modulating signal, that is, to estimate the naturally sampled point. Higher order PNPWMs such as third- and fifth-order PNPWMs [5] attempt to improve this over the LI process by increasing the number of iterations in the computation, involving more than two uniform sampled points for each sample. These iterations include established methods such as the Newton–Raphson method or equivalent methods. Consequently, the computation of the higher order PNPWMs is substantially more complex than the LI process and requires multiple arithmetic operations for each output sample, including several division operations and more than one iteration.

The static filter PWM (SFPWM) [8] is based on the concept that static filters can be employed to reduce the nonlinearities by canceling higher order terms of the signal components during the modulation process. In practice, the degree of distortion reduced is constrained by the complexity of the required hardware. It was shown in [9] that for a distortion performance approximately equal to the LI sampling process, a third-order SFPWM is required, comprising 6 additions/subtractions, 5 multiplications, and 5 delays per sample. Not only is this substantially more complex than the LI process, there is a further need to determine the coefficients of the filter.

The weighted-PWM (WPWM) and the WPWM with error estimation reported in [9] are relatively complex algorithms. The WPWM involves a parameter that is required to be empirically determined and that, in part, depends on the absolute sampling period. Further, for this algorithm to obtain the same THD performance as the LI process, ≥2 iterations are required to determine each intersection point and each iteration requires five additions/subtractions and four multiplications per sample. In the case of the WPWM with error estimation, the computation may be even more formidable as it further requires a constant obtained by trial and error.

The derivative PWM (DPWM) [10] requires an empirically determined constant to determine the number of differentiation processes. The arithmetic computation of DPWM is highly complex as it requires a substantial number of derivative terms during the differentiation processes. For example, Algorithm I DPWM requires 16 multiplications and 19 additions for each output sample. Similarly, Algorithm II DPWM requires a complex 13 multiplications and 19 additions for each output sample.

The parabolic correction PWM (PaCPWM) [23], [27] involves a linear interpolation of consecutive samples to determine an initial estimation of the natural sampled point, and subsequently adds a parabolic correction factor weighted by the sum of differences between the initial estimated sample and a projection of consecutive samples. The PaCPWM, like the WPWM, requires a constant that needs to be selected empirically and this constant depends on the absolute sampling period. In other words, the computation requirement for PaCPWM is relatively complex. For the same THD, the simplest PaCPWM, the Algorithm A [23], is substantially more complex than the LI process.

The prediction correction PWM [28], [29] is essentially an algorithm that first estimates an initial time point, calculates the signal value at this point using an interpolation formula, and finally adds a correction step to obtain a corrected output that is closer to the ideal naturally sampled point. To improve the accuracy, this algorithm may use the corrected output as a new time point and the entire process is reiterated. As this algorithm requires three mathematically complex steps, the computation and memory requirements are substantial.

We had previously proposed the $\delta$C process [11], and this process is by far the simplest, requiring three additions/subtractions, one two's complement addition, and one multiplication. However, as we will show later, the LI process offers lower nonlinearities but with a small power dissipation penalty.

In summary, from this review, the practical algorithmic sampling processes are the LI and  $\delta C$  processes, especially in the context of power critical applications with low nonlinearities and a low sampling rate. We will now review these processes in greater detail.

Fig. 2 depicts how the pulsewidths of the NS, US, LI, and  $\delta$ C sampling processes (all being trailing-edge modulations) are obtained. In this diagram, point A along the abscissa time axis is referenced as time t=0, and the magnitude of the carrier signal along the ordinate is normalized to 1 unit with point A referenced as the initial zero point. The pulsewidths of the NS [3], US [3],  $\delta$ C [11], and LI processes are, respectively

$$t_p(\mathrm{NS}) = T \cdot S_{\mathrm{NS}} \tag{1a}$$

$$t_p(\mathrm{US}) = T \cdot S_1 \tag{1b}$$

$$t_p(\delta \mathrm{C}) = T \cdot \left( S_1 + (S_2 - S_1) \frac{(S_1 + S_2)}{2} \right) \tag{1c}$$

$$t_p(\mathrm{LI}) = T \cdot \left(\frac{S_1}{1 + S_1 - S_2}\right). \tag{1d}$$
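For reference, (1d) can be recovered by a short intersection calculation under the normalization described for Fig. 2, assuming that the carrier rises linearly from 0 at $t=0$ to 1 at $t=T$ and that $S_1$ and $S_2$ are the uniform samples at $t=0$ and $t=T$, respectively. The intermediate symbols $c(t)$ and $s(t)$ are introduced here only for illustration:

$$c(t) = \frac{t}{T}, \qquad s(t) = S_1 + (S_2 - S_1)\frac{t}{T}$$

$$c(t_p) = s(t_p) \;\Longrightarrow\; \frac{t_p}{T}\left(1 + S_1 - S_2\right) = S_1 \;\Longrightarrow\; t_p = T \cdot \frac{S_1}{1 + S_1 - S_2}.$$

Because $S_2 \le 1$, the denominator is never smaller than the numerator, so the resulting pulsewidth stays within one sampling period.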

Note that our LI expression in (1d) is somewhat different from that derived by Mellor *et al.* [4], [22] and by Goldberg *et al.* [5]. The expression by Mellor *et al.* [4] is reproduced here

$$t'_p(\text{LI}) = T \cdot \left(\frac{1 + S_1}{2 + (1 - \varepsilon)(S_1 - S_2)}\right). \tag{2a}$$

The difference arises because the carrier signal in (2a) is normalized from -1 to 1, and it can be appreciated that this unnecessarily complicates the expression. Further, as previously discussed, $\varepsilon$ should be 0. It is in fact straightforward by trigonometry to infer that the pulsewidth of the LI process is closer to the pulsewidth of the NS process if $\varepsilon=0$, and this results in lower nonlinearities. Goldberg *et al.* [5] subsequently reexpressed the LI pulsewidth with the normalization of the carrier signal set from -0.5 to +0.5 and assumed that $\varepsilon=0$ for the first-order PNPWM. The expression for the LI process by Goldberg *et al.* is

$$t_p''(\text{LI}) = T \cdot \left(\frac{0.5(S_1 + S_2)}{1 - S_2 + S_1}\right). \tag{2b}$$

As in the case of Mellor et al.'s expression, Goldberg et al.'s expression is also unnecessarily complex. When compared to our simplified equation in (1d), (2b) requires an additional addition operation and an additional shift right operation. Specifically, in our expression in (1d), the computation for the LI process simply involves 1 subtraction (with a virtual addition, see later) and one division. By means of a restoring division process in the divider based on the parallel divider architecture, we can show that this divider is approximately equivalent to 1.5 times the hardware complexity of an array multiplier and depending on the input vectors, 2–3 times the power dissipation. Put simply, the equivalent computation complexity for the LI is 1 subtraction and  $\sim$ 2 multiplications per sample. We will describe our simple hardware design to realize the simplified LI process expression later. For completeness, it is perhaps worthwhile to note that the division that we employ is exact in the sense that no arithmetic approximation is made. In some reported sampling processes [27] described in the review earlier, an arithmetic approximation to the division is used. Although this approximation may reduce the overall complexity in some instances, the inaccuracies may however compromise the advantages gained.
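The restoring principle mentioned above can be illustrated with a bit-serial software model. This is a minimal sketch under stated assumptions, not the parallel divider architecture itself: the samples are taken as unsigned fixed-point words with `bits` fractional bits, and the names `restoring_divide` and `li_pulsewidth_fixed` are hypothetical.

```python
def restoring_divide(dividend, divisor, width):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")
    quotient, remainder = 0, 0
    for i in range(width - 1, -1, -1):
        # Shift in the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor            # trial subtraction
        if remainder < 0:
            remainder += divisor        # restore: this quotient bit is 0
        else:
            quotient |= 1 << i          # subtraction stands: quotient bit is 1
    return quotient, remainder


def li_pulsewidth_fixed(s1, s2, bits):
    """Evaluate (1d) in fixed point: floor(2**bits * S1 / (1 + S1 - S2))."""
    one = 1 << bits                     # fixed-point representation of 1.0
    divisor = one + s1 - s2             # the single subtraction (plus the constant 1)
    quotient, _ = restoring_divide(s1 << bits, divisor, 2 * bits + 1)
    return quotient


if __name__ == "__main__":
    bits = 8                            # hypothetical wordlength for illustration
    s1, s2 = 128, 160                   # 0.5 and 0.625 in this fixed-point format
    print(li_pulsewidth_fixed(s1, s2, bits) / (1 << bits))
```

The quotient is exact to the chosen wordlength, in the sense that it equals the truncated true quotient; no approximation of the division itself is involved.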

We make two comments on (1a)–(1d). First, it is apparent that the PWM output of the LI and  $\delta C$  processes is obtained after one clock delay while there is no delay in the NS and US processes. This one clock delay for the LI and  $\delta C$  processes is of no consequence in audio applications. Second, it is well established and apparent from Fig. 2 and from (1a) to (1d) that there is some time variation between the pulsewidths of the NS and the other processes, namely US, LI, and  $\delta C$ , and hence, some harmonic nonlinearities will be present in these other processes.
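The time variation between the pulsewidths can be examined numerically. The sketch below evaluates (1b)–(1d) against a naturally sampled reference found by bisection, assuming a sinusoidal input scaled into the 0-to-1 carrier range and the normalization $x = t/T$. The test signal, the function names, and the RMS-over-phase figure of merit are assumptions made for illustration; pulsewidth error is not the same quantity as THD.

```python
import math


def ns_width(signal, tol=1e-12):
    """Normalized NS pulsewidth: root of signal(x) = x on [0, 1], by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if signal(mid) > mid:   # carrier still below the signal: crossing is later
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def us_width(s1, s2):
    return s1                                   # (1b)


def delta_c_width(s1, s2):
    return s1 + (s2 - s1) * (s1 + s2) / 2.0     # (1c)


def li_width(s1, s2):
    return s1 / (1.0 + s1 - s2)                 # (1d)


def rms_width_errors(m_index=0.9, p=48.0, phases=360):
    """RMS of (t_p / T minus the NS value) over evenly spaced input phases."""
    sums = {"US": 0.0, "dC": 0.0, "LI": 0.0}
    for i in range(phases):
        phase = 2.0 * math.pi * i / phases

        def signal(x, phase=phase):
            # x = t / T; hypothetical sinusoid scaled into the 0..1 carrier range
            return 0.5 + 0.5 * m_index * math.sin(2.0 * math.pi * x / p + phase)

        s1, s2 = signal(0.0), signal(1.0)
        ns = ns_width(signal)
        sums["US"] += (us_width(s1, s2) - ns) ** 2
        sums["dC"] += (delta_c_width(s1, s2) - ns) ** 2
        sums["LI"] += (li_width(s1, s2) - ns) ** 2
    return {name: math.sqrt(total / phases) for name, total in sums.items()}


if __name__ == "__main__":
    for name, err in rms_width_errors().items():
        print(f"{name}: rms pulsewidth error = {err:.2e} (fraction of T)")
```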

An alternative viewpoint to compare these processes is to view them spectrally in the frequency domain. We do this by reviewing their double Fourier series expressions for NS [3], US [3], and  $\delta$ C [11] processes and they are, respectively, given in (3a)–(3c)

$$\begin{aligned}
F_{\text{NS}}(t) &= K + \frac{M}{2}\cos(\omega_{i}t) + \sum_{m=1}^{\infty} \left[ \frac{\sin(m\omega_{s}t)}{m\pi} - \frac{J_{0}(m\pi M)}{m\pi} \sin(m\omega_{s}t - 2m\pi k) \right] \\
&\quad - \sum_{m=1}^{\infty} \sum_{n=\pm 1}^{\pm \infty} \frac{J_{n}(m\pi M)}{m\pi} \sin\left(m\omega_{s}t + n\omega_{i}t - 2m\pi k - \frac{n\pi}{2}\right)
\end{aligned} \tag{3a}$$

$$\begin{aligned}
F_{\text{US}}(t) &= K - \sum_{n=1}^{\infty} \frac{J_n \left(\frac{n\pi M\omega_i}{\omega_s}\right)}{\frac{n\pi\omega_i}{\omega_s}} \sin\left(n\omega_i t - \frac{2n\pi k\omega_i}{\omega_s} - \frac{n\pi}{2}\right) \\
&\quad + \sum_{m=1}^{\infty} \frac{\sin(m\omega_s t) - J_0(m\pi M) \sin(m\omega_s t - 2m\pi k)}{m\pi} \\
&\quad - \sum_{m=1}^{\infty} \sum_{n=\pm 1}^{\pm \infty} \frac{J_n \left[\left(m\omega_s + n\omega_i\right)\frac{\pi M}{\omega_s}\right]}{\left(m\omega_s + n\omega_i\right)\frac{\pi}{\omega_s}} \sin\left[\left(m\omega_s + n\omega_i\right)\left(t - \frac{2\pi k}{\omega_s}\right) - \frac{n\pi}{2}\right]
\end{aligned} \tag{3b}$$

$$\begin{aligned}
F_{\delta \text{C}}(t) &= K - \sum_{n=1}^{\infty} \frac{\omega_s I_{0n}}{2\omega_i n\pi^2} \sin\left(n\omega_i t - \frac{2n\pi k\omega_i}{\omega_s}\right) \\
&\quad + \sum_{m=1}^{\infty} \left[\frac{1}{m\pi} \sin(m\omega_s t) - \frac{I_{m0}}{2m\pi^2} \sin(m\omega_s t - 2m\pi k)\right] \\
&\quad - \sum_{m=1}^{\infty} \sum_{n=\pm 1}^{\pm \infty} \frac{I_{mn}}{2\left(m + \frac{n\omega_i}{\omega_s}\right)\pi^2} \sin\left[m\omega_s t + n\omega_i t - 2m\pi k - \frac{2n\pi k\omega_i}{\omega_s}\right]
\end{aligned} \tag{3c}$$

where

$$\begin{aligned}
I_{0n} &= \int_{0}^{2\pi} e^{-jn\Phi}\, e^{-j\frac{nQ}{p}\cos\Phi}\, e^{j\frac{nBQ}{2p\pi}\sin\frac{2\pi}{p}\sin\Phi} \\
&\quad \times e^{j\frac{nBQ}{p\pi}\sin^{2}\frac{\pi}{p}\cos\Phi}\, e^{j\frac{nQ^{2}}{8p\pi}\sin\frac{4\pi}{p}\sin 2\Phi}\, e^{j\frac{nQ^{2}}{4p\pi}\sin^{2}\frac{2\pi}{p}\cos 2\Phi}\, d\Phi \\
I_{m0} &= \int_{0}^{2\pi} e^{-jmQ\cos\Phi}\, e^{j\frac{mBQ}{2\pi}\sin\frac{2\pi}{p}\sin\Phi} \\
&\quad \times e^{j\frac{mBQ}{4\pi}\sin^{2}\frac{\pi}{p}\cos\Phi}\, e^{j\frac{mQ^{2}}{8\pi}\sin\frac{4\pi}{p}\sin 2\Phi}\, e^{j\frac{mQ^{2}}{4\pi}\sin^{2}\frac{2\pi}{p}\cos 2\Phi}\, d\Phi \\
I_{mn} &= \int_{0}^{2\pi} e^{-jn\Phi}\, e^{-j\left(m+\frac{n}{p}\right)Q\cos\Phi} \\
&\quad \times e^{j\left(m+\frac{n}{p}\right)\frac{BQ}{2\pi}\sin\frac{2\pi}{p}\sin\Phi}\, e^{j\left(m+\frac{n}{p}\right)\frac{BQ}{4\pi}\sin^{2}\frac{\pi}{p}\cos\Phi} \\
&\quad \times e^{j\left(m+\frac{n}{p}\right)\frac{Q^{2}}{8\pi}\sin\frac{4\pi}{p}\sin 2\Phi}\, e^{j\left(m+\frac{n}{p}\right)\frac{Q^{2}}{4\pi}\sin^{2}\frac{2\pi}{p}\cos 2\Phi}\, d\Phi.
\end{aligned}$$

Note that we have not included the double Fourier series expression for the LI given in reference [4] due to its imprecision. In view of this, we will derive the double Fourier series expression for the LI process in the next section. At this juncture, it is worthwhile to note that these double Fourier series expressions are useful as they may be used to analytically determine the nonlinearities and provide insight to the mechanisms of the nonlinearities and how the different parameters may be traded/compromised for a given design.

These double Fourier series expressions may be interpreted as follows. For the NS process in (3a), the first term K is the dc component of the resultant PWM output and is of no consequence as it is easily accommodated. The second term represents the input modulating signal. It is well established [3] that the second term shows that, theoretically, there are no signal harmonics within the audio band for the NS process, that is, the THD = 0; the carrier harmonics are beyond the audio band (see third term below). Despite this advantage and as described earlier, the major shortcoming of this process is that in practice, the input modulating signal needs to be sampled at an inordinately high rate, ideally at an infinite rate. The third term corresponds to the carrier and its associated harmonics, and is effectively attenuated by the low-pass filter in Fig. 1. Note that as the carrier frequency is typically above the audio band (e.g., 48 kHz), the components of the third term are hence beyond the audio band.

The fourth term represents the modulating signal and its harmonics intermodulated with the carrier and its harmonics, and is usually negligible. In this term, the foldback distortion is obtained for $n=-1$ to $-\infty$. By careful examination of this fourth term, it can be appreciated that the degree of foldback distortion depends on the ratio of the sampling frequency $f_s$ to the input modulating frequency $f_i$, that is, the ratio $p=f_s/f_i$, on the modulation index $M$, and on the bandwidth of interest. In the perspective of a hearing instrument and many other voiceband applications where the bandwidth of interest is usually $\sim 4$ kHz, for $f_i=1$ kHz and $f_s=48$ kHz, that is, ratio $p=48/1$, the average foldback distortion is negligible at -979 dB. For $f_i\approx 4$ kHz (simulated frequency $=3.8$ kHz to be precise) and $f_s=48$ kHz, $p\approx 48/4$, the average foldback distortion remains negligible at -159 dB.

It is instructive to note that foldback distortion can be significant if the conditions are inappropriate; specifically, the foldback distortion is a function of $M$, $p$ and bandwidth. In the next section, we describe how the foldback distortion can be reduced by increasing $p$, and its implications on power dissipation. As a case in point, for 8-kHz bandwidth, $p\approx48/8$ and $M=0.9$, the foldback distortion $\approx -38$ dB (1.3%). Although the magnitude of this distortion is generally unacceptable, this specific condition is unlikely in real-life voiceband applications because of the low speech energy beyond 4 kHz (typically -15 dB or lower relative to the first formant of speech) and because the signals are often bandlimited. For the same conditions but at $M=0.18$ (-15 dB), the foldback distortion is negligible at -109 dB. In prevalent voiceband applications where the bandwidth is 4 kHz, the foldback distortion is negligible at -142 dB @ $M = 0.9$, $f_i = 3.8$ kHz, and $p \approx 48/4$, and -295 dB @ $M = 0.18$, $f_i = 3.8$ kHz, and $p \approx 48/4$. In summary, the foldback distortion is not generally a problem if the conditions therein are appropriate and this comment generally applies to other sampling processes as well; refer to the next section for the LI process.

For the US and  $\delta C$  sampling processes in (3b) and (3c), respectively, the first, third, and fourth terms are similar to those in (3a) and as described earlier, these terms are usually of little consequence in practice. The second term corresponds to the input modulating signal and its harmonics, and the latter is the source of the harmonic distortion. For completeness, we can show that for the  $\delta C$  process, for p=48/1 within the 4-kHz bandwidth, the average foldback distortion is negligible at -98 dB. For  $p\approx48/4$ , the average foldback distortion remains negligible at -92 dB.



Fig. 3. Spectral analysis of the LI sampling process by double Fourier series.

# B. Spectral Analysis for LI Sampling Process

We will now derive the double Fourier series expression for the LI process and thereafter, we will interpret our derived expression. We will subsequently compare the harmonic distortions of the NS, US,  $\delta$ C, and the LI processes, and comment on the foldback distortion of the LI process.

We apply the double Fourier series analysis [3], [24] on the LI PWM output signal for a single-sided trailing-edge PWM of a cosine input modulating waveform. The three-dimensional (3-D) geometrical representation [24] of the PWM signal can be simplified to a 2-D representation [3] depicted in Fig. 3. In Fig. 3, the ordinate represents the phase shift of the input modulating signal, $\phi = \omega_i t$, in the period of $2\pi$. The abscissa, on the other hand, represents the phase shift of the carrier frequency $\theta = \omega_s t$ and line AA' corresponds to the path of the PWM sampling contour passing through the origin with gradient $1/p = \omega_i/\omega_s$. The pulse amplitude function $F(\theta,\phi)$ is defined by

$$F(\theta, \phi) = \begin{cases} H, & 0 \le (\theta - |\theta|_{2\pi}) \le \Omega(\theta, \phi) \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

where $|\theta|_{2\pi}$ denotes the nearest multiple of $2\pi$ that is $\leq \theta$, $\Omega(\theta, \phi)$ is determined by the input signal and the type of PWM sampling, and $H$ is the amplitude of the PWM pulse and is normalized to unity.

 $F(\theta,\phi)$ takes the form of an infinite series of parallel walls placed at periodic intervals of  $2\pi$  along the  $\theta$ -axis. In Fig. 3 for the LI sampling, the points  $\theta_1$  and  $\theta_2$ , respectively, correspond to the sampled points  $S_1$  and  $S_2$  in Fig. 2, and can be represented as

$$\theta_1 = B + Q\cos\phi_1 \text{ and } \theta_2 = B + Q\cos\phi_2 \tag{5}$$

where  $B=2\pi k$ , and  $Q=M\pi$ . Following the double Fourier series analysis given in the Appendix, we derive the resultant double Fourier series expression for the LI single-sided trailing-edge PWM sampling process (see Fig. 4)

$$\begin{aligned}
F_{LI}(t) = {} & K - \sum_{n=1}^{\infty} \frac{\omega_s I_{0n}}{2n\omega_i \pi^2} \sin\left(n\omega_i t - 2n\pi k \frac{\omega_i}{\omega_s}\right) \\
& + \sum_{m=1}^{\infty} \left(\frac{1}{m\pi} \sin m\omega_s t - \frac{I_{m0}}{2m\pi^2} \sin(m\omega_s t - 2m\pi k)\right) \\
& - \sum_{m=1}^{\infty} \sum_{n=\pm 1}^{\pm \infty} \frac{I_{mn}}{2(m + \frac{n\omega_i}{\omega_s})\pi^2} \cdot \sin\left(m\omega_s t + n\omega_i t - 2m\pi k - 2n\pi k \frac{\omega_i}{\omega_s}\right)
\end{aligned} \tag{6}$$

where the coefficients $K$, $I_{0n}$, $I_{m0}$, and $I_{mn}$ are given by

$$K = \frac{k}{2\pi} \int_{0}^{2\pi} \left( \frac{Q \cos \Phi - \frac{BQ}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)}{1 + \frac{Q}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)} \right) d\Phi$$

$$I_{0n} = \int_{0}^{2\pi} \exp \left( -j \left( n\Phi + \frac{\frac{n}{p} \left(Q \cos \Phi - \frac{BQ}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)\right)}{1 + \frac{Q}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)} \right) \right) d\Phi$$

$$I_{m0} = \int_{0}^{2\pi} \exp \left( -jm \frac{\left(Q \cos \Phi - \frac{BQ}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)\right)}{1 + \frac{Q}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)} \right) d\Phi$$

$$I_{mn} = \int_{0}^{2\pi} \exp \left( -j \left( n\Phi + \frac{\left(m + \frac{n}{p}\right) \left(Q \cos \Phi - \frac{BQ}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)\right)}{1 + \frac{Q}{\pi} \sin \frac{\pi}{p} \sin \left(\Phi + \frac{\pi}{p}\right)} \right) \right) d\Phi$$



Fig. 4. Detailed illustration of the LI single-sided trailing-edge PWM sampling.

As in the previous double Fourier series expressions, we interpret (6) as follows. The first, third, and fourth terms are similar to those given in (3a)–(3c) and as explained earlier, these terms are of little consequence in practice. For an input modulating frequency of 1 kHz and $f_s=48$ kHz, that is, $p=48/1$, the average foldback distortion for 4- and 8-kHz bandwidth is negligible at -98.4 and -97.7 dB, respectively. For $p\approx48/4$, the foldback distortion at $M=0.9$ is also negligible at -101.5 dB. As in the other processes, note that the foldback distortion is a function of $M$, $p$ and the bandwidth.

The second term in (6) is of particular interest as it comprises the input modulating signal and its harmonics, the latter being the source of the harmonic distortion. This term clearly depicts the different parameters that contribute to the mechanisms of harmonic distortion. It is apparent here that the magnitude of the signal harmonics in the second term in (6) and of the foldback distortion in the fourth term in (6) can be reduced by increasing the sampling frequency, which is equivalent to increasing the ratio $p$. Note that increasing $p$ should be used judiciously as a higher ratio leads to a higher clock rate and consequently higher power dissipation; see Sections III and IV. It can also be appreciated, on careful examination, that the magnitude of the signal harmonics is greatly reduced as $n$ increases.

To evaluate the extent of this harmonic distortion, we note that the integral term in the second term of (6) is a complex term comprising real and imaginary components. As this term appears to be mathematically intractable, we employ the well-known numerical integration method to determine the magnitude of the individual signal harmonics, and eventually determine the THD. To view the harmonic distortion of the LI process in perspective, we will now compare the THD of the different sampling processes, namely NS, US,  $\delta$ C, and LI. We can do this by two methods. In the first method, we determine the THD of the NS, US,  $\delta$ C, and LI sampling processes using the second term of their respective double Fourier series expressions given in (3a)–(3c) and (6).
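The numerical evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes $k = 0.5$ (so $B = \pi$), truncates the harmonic sum at a hypothetical `n_max`, takes the magnitude of the complex coefficient $I_{0n}$ from (6) as the harmonic amplitude, and uses the common definition of THD as the root-sum-square of harmonics 2 and above relative to the fundamental. The function names and default values are illustrative only.

```python
import numpy as np

def li_baseband_harmonics(M, p, k=0.5, n_max=10, grid=4096):
    """Magnitudes of the second-term harmonics of (6), n = 1..n_max.

    Assumptions (not from the paper): k = 0.5, n_max = 10, and a uniform
    grid for the periodic integral over Phi in [0, 2*pi).
    """
    Q = M * np.pi
    B = 2.0 * np.pi * k
    phi = np.linspace(0.0, 2.0 * np.pi, grid, endpoint=False)
    s = np.sin(np.pi / p) * np.sin(phi + np.pi / p)
    # Common fraction appearing in the exponents of I_0n, I_m0 and I_mn.
    shift = (Q * np.cos(phi) - (B * Q / np.pi) * s) / (1.0 + (Q / np.pi) * s)
    amps = []
    for n in range(1, n_max + 1):
        integrand = np.exp(-1j * (n * phi + (n / p) * shift))
        # Mean over one period times 2*pi approximates the integral I_0n.
        I_0n = 2.0 * np.pi * integrand.mean()
        # Amplitude factor omega_s / (2 n omega_i pi^2) = p / (2 n pi^2).
        amps.append(p * abs(I_0n) / (2.0 * n * np.pi ** 2))
    return np.array(amps)

def thd(amps):
    """Root-sum-square of harmonics 2.. relative to the fundamental."""
    return np.sqrt(np.sum(amps[1:] ** 2)) / amps[0]
```

For example, `thd(li_baseband_harmonics(0.9, 48.0))` gives the THD estimate for $M=0.9$ and $p=48/1$ under these assumptions; sweeping `M` from 0.1 to 0.9 would produce a curve of the kind compared across sampling processes.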

In Fig. 5, we present the THD of the different sampling processes by using this first method. Note that we have used a logarithmic scale (as opposed to a linear scale) for the % THD ordinate to better depict the salient features. The THD of the NS process is zero and is, hence, not plotted in Fig. 5. From Fig. 5, we remark that of the practical algorithmic sampling processes (and US process), the LI process features the lowest THD. As a case in point, for $p=48/1$ and at near maximum signal swing at modulation index $M=0.9$, the THD of the LI process is, respectively, $\sim\!67$ times (36.5 dB) and $\sim\!3$ times (9.4 dB) better than that of the US and $\delta$C processes; we will later show in Section IV that we obtain this advantage with very little penalty. These results are somewhat intuitive if we compare the pulsewidth expressions or the double Fourier series expressions aforementioned.

In the second method, we can determine the THD of the different processes in the time domain, that is, based on their pulsewidths given in (1a)–(1d) and by using the fast Fourier transform (FFT) in MATLAB. We can show that the THDs obtained from both methods agree well except for the NS process, where the noise floor (due to finite 16-bit data wordlength precision) masks the THD results (instead of obtaining an ideal THD = 0); we will elaborate on this in Section IV. We remark that as the THDs from both methods agree well, our derived double Fourier series expression for the LI process is, hence, verified.

### C. Hardware Design for the LI Sampling Process

In view of the low THD advantage of the LI process for power-critical applications, we will now describe our simple hardware to realize the LI process. We depict in Fig. 6 our block circuit diagram design to realize the LI process expressed in (1d). We will now describe its operation. The 16-bit Register R2 stores the current input sample $S_2$ while the 16-bit Register R1 stores the previous input sample $S_1$ from R2. The Subtractor subtracts $S_2$ from $S_1$, that is, $S_1 - S_2$, a 17-bit output. We use the borrow bit from the Subtractor and invert this bit. Collectively, this inverted borrow bit (Bit 16) and the other 16 bits (Bits 0–15) form the Subtractor output $1 + (S_1 - S_2)$, the denominator of (1d). Note that the addition process at the output of the Subtractor is virtual, in the sense that we eliminate the need for a physical adder, hence, some hardware saving.

Fig. 5. Comparison between the THDs of US, $\delta$C and LI sampling processes at different modulation indexes $M$. + Average THD is the arithmetic average of the THDs for $M=0.1$ to 0.9.

Fig. 6. Hardware implementation for the LI sampling process based on simplified expression [see (1d)].

To complete the operation for (1d), the output of the Subtractor, $1 + (S_1 - S_2)$, is input to the Parallel Divider as the divisor. The 16-bit output $(S_1)$ from Register R1 is padded with 16 least significant zero bits in the converter and this forms the 32-bit dividend of the Divider. The 32-bit Parallel Divider employs full-subtractors to subtract the 17-bit divisor $(1+(S_1-S_2))$ from the 32-bit dividend $(S_1)$ in 16 stages of subtraction. This Parallel Divider obtains the 16-bit quotient output in one clock period, and the complete mathematical operation in (1d) is obtained. We remark that as the 16 least significant bits of the 32-bit dividend are zeros (a known value), the hardware of this Divider can be designed to be $\sim 23\%$ smaller than the standard 32-bit divider [25]. This is because in most of the 16 subtraction stages (of a parallel structure divider), there is no need for subtraction in the lower significant bits of the stages. To be specific, the Divider here performs a 17-bit subtraction in the first stage, an 18-bit subtraction in the second stage and eventually a 32-bit subtraction in the 16th stage instead of a consistent 32-bit subtraction in all 16 stages; note that depending on the output of each subtraction, each subtraction step may require a multiplexer to restore the dividend. In specific hardware terms, this divider is 1.5 times the complexity of a $16 \times 16$-bit multiplier and dissipates 2–3 times more power. In short, the hardware of the LI process is simple, hence its low power dissipation attribute (see later). In Section IV, we will report the simulations and practical measurements of the LI sampling process based on this circuit design.
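The datapath described above (subtractor, zero-padded dividend, 16-stage divider) can be modeled behaviorally as in the sketch below. It is an illustration only, not the authors' circuit: it assumes that $S_1$ and $S_2$ are unsigned 16-bit fractions, so that the value 1 corresponds to $2^{16}$, and it uses a textbook restoring-division loop rather than the stage widths of the actual Parallel Divider. The function name `li_divide_q16` is hypothetical.

```python
def li_divide_q16(s1, s2):
    """Behavioral sketch of S1 / (1 + (S1 - S2)) in 16-bit fixed point.

    Assumption: s1 and s2 are unsigned 16-bit fractions (1.0 == 2**16).
    Returns floor(s1 * 2**16 / (2**16 + s1 - s2)) as a 16-bit quotient.
    """
    if not (0 <= s1 < (1 << 16) and 0 <= s2 < (1 << 16)):
        raise ValueError("samples must be unsigned 16-bit values")
    divisor = (1 << 16) + s1 - s2   # 17-bit value representing 1 + (S1 - S2)
    remainder = s1                  # upper 16 bits of the zero-padded dividend
    quotient = 0
    for _ in range(16):             # one subtraction stage per quotient bit
        remainder <<= 1             # shift in the next (zero) dividend bit
        quotient <<= 1
        if remainder >= divisor:    # trial subtraction succeeds
            remainder -= divisor
            quotient |= 1
        # otherwise the partial remainder is kept (the "restore" path)
    return quotient
```

Because $S_2 < 1$, the initial remainder is always smaller than the divisor, so the quotient fits in 16 bits; when $S_1 = S_2$ the divisor is exactly 1 and the quotient equals $S_1$.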

# III. CLOCK-COUNTER CUM NOISE-SHAPER PWM PULSE GENERATOR

As aforementioned, the preferred design for the pulse generator for the algorithmic PWMs is the CNS pulse generator and this is an established design methodology [17]. The other reported methods, namely the clock-counter method [14], the tapped-delay-line method [15], and the combined clock-counter and tapped-delay-line pulse generator method, are unattractive for the following reasons. The clock-counter method requires its clock frequency to be $2^N \times$ the sampling frequency $(f_s)$, where $N$ is the number of bits. For example, if $N=16$ bits and $f_s=48$ kHz are specified, an inordinately high clock frequency of $\sim 3.15~\mathrm{GHz}$ would be required. The tapped-delay-line technique and the combined clock-counter and tapped-delay-line technique, on the other hand, depend, to some degree, on the fabrication parameters that may have unacceptable tolerances and this may lead to some nonlinearity.

Fig. 7. Block diagram of the 10-bit clock-counter and the 1-bit frequency doubler.

We depict in Fig. 7 part of our design of the CNS pulse generator embodying the 16-bit to 11-bit noise shaper, 10-bit clock counter and a proposed frequency doubler. We employ the noise shaper [18] to downconvert the 16-bit wordlength input to an 11-bit wordlength output to reduce the required clock rate of the clock counter. The noise shaper forces the noise arising from the quantization from the downconversion to fall outside the band of interest. We do not consider the Integral Noise Shaper [26] because of the considerably more complex arithmetic computations operating at high oversampling rate and because the nonlinearities of the noise shaper are already negligible, see Section IV.

Despite the 5-bit wordlength reduction from the noise shaper, a relatively high clock frequency of $\sim\!100$ MHz ($2^{11}\times48$ kHz) is still required for the 11-bit fast counter. To reduce this clock frequency, we employ our proposed frequency doubler depicted in Fig. 7 to reduce the number of bits by a further 1 bit (i.e., the counter is now 10 bits); hence, an $\sim\!50$ MHz ($2^{10}\times48$ kHz) clock rate, a frequency that is easily accommodated at 1.1 V operation for a typical 0.35-$\mu$m CMOS process. In Fig. 7, the most significant 10 bits of the 11-bit data obtained from the noise shaper are compared against the 10-bit output generated from the up-counter to provide the timing for the initial pulsewidth. Depending on the least significant bit of the 11-bit data, we may extend the width of the PWM pulse using the frequency doubler by one half clock period of the $\sim\!50$ MHz clock. In this fashion, we effectively reduce the clock frequency of the clock-counter to half its initial frequency and with little hardware overhead. This clock rate reduction translates to $\sim\!50\%$ of power dissipation reduction of the pulse generator and is equivalent to a significant $\sim\!47\%$ reduction of the overall Class-D amp (ambient power dissipation).
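The clock rates quoted above, and the way the half-period extension recovers 11-bit timing resolution from a 10-bit counter, can be checked with a short sketch. The function names are hypothetical and the model is purely arithmetic (exact rational time, no circuit delays); it is not a description of the actual hardware in Fig. 7.

```python
from fractions import Fraction

def counter_clock_hz(bits, fs_hz):
    """Clock rate needed by a plain clock-counter: 2**bits times fs."""
    return (1 << bits) * fs_hz

def pulse_width_s(word11, fs_hz):
    """Pulse width for an 11-bit word using a 10-bit counter plus doubler.

    The upper 10 bits count whole periods of the 2**10 * fs clock; the
    least significant bit adds one half clock period.
    """
    clk_period = Fraction(1, counter_clock_hz(10, fs_hz))
    coarse = (word11 >> 1) * clk_period
    fine = (word11 & 1) * clk_period / 2
    return coarse + fine
```

With `fs_hz = 48000`, `counter_clock_hz` returns 3 145 728 000 Hz for 16 bits, 98 304 000 Hz for 11 bits, and 49 152 000 Hz for 10 bits, matching the approximate 3.15 GHz, 100 MHz, and 50 MHz figures in the text. The width returned by `pulse_width_s` equals `word11 / (2**11 * fs_hz)`, that is, the same resolution an 11-bit counter would give.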

It is worthwhile to note that the clock-counter in our CNS pulse generator design also serves as a frequency divider. This novelty of sharing effectively reduces the power dissipation of the entire CNS pulse generator by ${\sim}45\%$ (compared to a design where a separate frequency divider is required). In the next section, we will show that our CNS pulse generator dissipates very low power (${\sim}53~\mu\mathrm{W}$) for a pulse generator with a 16-bit input signal. We will also report on the THDs of pulse generators realized by our abovementioned design and of that realized directly by a 16-bit clock-counter (without a noise shaper).

### IV. SIMULATION AND EXPERIMENTAL RESULTS

In Section II, we have shown by simulations that of the practical sampling processes, the LI sampling process exhibits the lowest THD. In this section, we will present simulations and practical measurements for different Class-D amps, each embodying a different sampling process but the same CNS pulse generator and the same load (Fig. 1). To obtain the THDs from simulations, we employ (1a)–(1d) and simulate the pulse generator by using $48 \times 2^{11}$-point FFTs in MATLAB. Note that as described in Section III, the CNS pulse generator comprises a 16-bit to 11-bit noise-shaper, a 10-bit clock-counter, and a 1-bit frequency doubler. We use 1-kHz 16-bit digital audio data as the input modulating signal, sampled at 48 kHz, and a clock frequency of $2^{10}\times48~\mathrm{kHz}\approx50~\mathrm{MHz}$ for the CNS pulse generator.

Fig. 8. Microphotograph of the Class-D output stage.

To make a fair comparison for the practical measurements, we realize the different 16-bit Class-D amps using the same CPLD, and with the same output stage realized in a prototype IC depicted in Fig. 8. Note that as the NS process is difficult to realize practically, we omit the NS process for the practical measurement comparison. In all practical measurements, we use the Brüel and Kjaer PULSE 3560C Multi-Analyzer System with LabShop version 7.

We summarize in Fig. 9 the THD of the different Class-D amp realizations. As before, note that we have used a logarithmic scale for the ordinate % THD. From Fig. 9, we remark that for all Class-D amp realizations, the THD obtained by simulations agrees well with that obtained experimentally. This observation again serves to verify our derived double Fourier series expression for the LI process; we earlier remarked the same in Section II-B. We note that of the practical Class-D amp realizations, the embodiment with the LI process features the lowest THD. In this realization, the average THD is $\sim 0.03\%$, and this THD is slightly better than that of the reported substantially more complex oversampled $\Delta\Sigma$ PCM-PWM realizations as discussed in Section I. For completeness, we remark that our simulations of a Class-D amp embodying the LI sampling process and a 16-bit clock-counter pulse generator (without a noise shaper) yield approximately the same THD and foldback distortion as a Class-D amp embodying the LI sampling process and the CNS pulse generator described herein. Put simply, despite the substantially reduced clock rate of the counter by means of the noise shaper, the 16-to-11-bit noise shaper contributes negligible THD nonlinearities.

For completeness, we have included the THD of the Class-D embodying the NS and CNS pulse generator in Fig. 9. In this case, note that this plot is actually the noise floor as the noise floor masks the THD and the THD is below the noise floor; THD is ideally zero. The noise floor is largely due to the finite precision of the 16-bit wordlength data and in a practical situation, the noise floor is further due to digital dithering; refer to Fig. 10.

We observe that the THDs of the Class-D amps embodying the $\delta C$ and LI sampling processes, plotted against modulation index, exhibit a somewhat U shape. We attribute the higher THD at low modulation index to the reduced effective signal representation at low signal levels. The higher THD observed at higher modulation indexes, on the other hand, is due to the larger distance between samples in the sampling processes, and this translates to a higher interpolation error or nonlinearity. For other types of practical amps, a higher THD is also usually observed at larger signal swings although the mechanisms are different.

For completeness, we plot in Fig. 10 the simulated frequency spectrum of the PWM output of a Class-D amp embodying the LI process and CNS pulse generator for $M=0.9$. Note that to emulate a realistic design, we assume that the input signal is (digitally) dithered by two least-significant bits. In plot (a), the frequency spectrum at the output of the output stage shows a noise floor that increases with frequency. This can be attributed to the spectrum shaping arising from the noise shaper. In plot (b), as a result of the 4-kHz low-pass filter, the frequency spectrum is now flattened as expected. In both plots, we remark that the harmonic distortion is very low and the SNR is relatively high ($\sim 90~\mathrm{dB}$) at the low-pass filter output.

In view of the low nonlinearities exhibited by the Class-D amp embodying the LI process and CNS pulse generator, it is now instructive to compare its power dissipation against the different Class-D amp realizations. We summarize in Table I a comparison of these realizations, each embodying a different sampling process but the same CNS pulse generator. We will omit the Class-D amp embodying the NS process as this process is merely of academic interest, and its power dissipation is largely dependent on the clock frequency.

From Table I, it is apparent that the power dissipation of the PWMs is dominated by the power dissipation of the pulse generator. In the case of the US process, as there is no need to compute the sampling process, there is no power dissipation for the sampling process. We remark that of the practical algorithmic processes, our earlier proposed  $\delta C$  process dissipates the least power. This is, as explained earlier, because its computation does not require a division (but requires a simpler multiplication instead) [11]. However, its shortcoming is its higher distortion, an average THD of 0.05%. We remark that the magnitude of this THD is, however, acceptable for many applications. With a slight 8% increase in overall power (5  $\mu$ W), the Class-D amp embodying the LI process can be realized. We reiterate that the primary advantage for using the LI process is the reduced THD, an average improvement of  $\sim 6$  dB over the  $\delta C$  process. We are of the opinion this THD/power tradeoff is worthwhile. It is probably of interest to note that if the wordlength is shortened, say 12-bits instead of 16-bits, the improvement of the distortion of the LI over the  $\delta C$  process becomes even more apparent. This would be useful for very stringent power-critical applications.

To view the power dissipation of the pulse generator in perspective, the distribution of the 53  $\mu$ W power dissipation is as follows: the  $\sim \!\! 50$  MHz frequency divider/clock-counter dissipates 44  $\mu$ W and the remaining circuits dissipate 9  $\mu$ W. If our proposed frequency doubler is not applied, the frequency divider/clock-counter would instead dissipate  $\sim \!\! 106~\mu$ W, increasing the overall power dissipation of the pulse generator and the Class-D amp by  $\sim \!\! 50\%$  and  $\sim \!\! 47\%$ , respectively.



Fig. 9. Comparison of the THDs of different Class-D amp realizations (embodying different sampling processes and the same pulse generator) at various modulation indexes based on computer simulations and on experimental measurements. \* Note that the THD for the NS Class-D amp is theoretically zero, and the harmonic components are masked by the noise floor level at the harmonic frequencies. In other words, this plot is the noise floor.



Fig. 10. Frequency spectrum of the PWM output of the Class-D amp embodying the LI and CNS pulse generator at modulation index, M=0.9 and with 2 LSBs dithering for the input. (a) Output of the output stage in Fig. 1. (b) Output of the low-pass filter in Fig. 1.

TABLE I

COMPARISON OF POWER DISSIPATION FOR DIFFERENT 16-BIT SAMPLING PROCESSES BUT WITH SAME HYBRID PULSE GENERATOR @ $V_{\rm DD}=1.1$ V, 0.35-$\mu$m CMOS PROCESS, $f_S=48$ kHz AND MODULATION INDEX $M=0.5$

| 16-bit Digital PWM | Sampling Process (μW) | Pulse Generator (μW) | Total (μW) |
|--------------------|-----------------------|----------------------|------------|
| US Sampling        | 0                     | 53                   | 53         |
| δC Sampling        | 2                     | 53                   | 55         |
| LI Sampling        | 7                     | 53                   | 60         |

TABLE II

COMPARISON OF NUMBER OF TRANSISTORS REQUIRED TO REALIZE CLASS D AMPS EMBODYING DIFFERENT SAMPLING PROCESS AND SAME 16-BIT PULSE GENERATOR BASED ON 0.35-$\mu$m CMOS PROCESS

| 16-bit Digital PWM | Number of CMOS Transistors | Normalized Area |
|--------------------|----------------------------|-----------------|
| US Sampling        | 2212                       | 0.14            |
| δC Sampling        | 15494                      | 1               |
| LI Sampling        | 18342                      | 1.18            |

We summarize in Table II the number of transistors required for the realization of the different Class-D amps. The number of transistors is an indication of the IC area required and for easy comparison, we normalize the IC areas with respect to the $\delta C$ process. From Table II and as expected, the US process requires the smallest area because there is no need to compute the sampling process. Of the two algorithmic processes, our previously proposed $\delta C$ process requires the smallest area for reasons already discussed; to realize the LI process, the IC area is $\sim\!18\%$ larger. This increase is somewhat smaller than expected because we have simplified the design of the divider as described in Section II-C.
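The normalized areas in Table II follow directly from the transistor counts, as the short check below illustrates. It assumes, as the text suggests only loosely, that IC area scales in proportion to transistor count; the dictionary keys and function name are hypothetical.

```python
# Transistor counts taken from Table II.
TRANSISTORS = {"US": 2212, "deltaC": 15494, "LI": 18342}

def normalized_area(name, ref="deltaC"):
    """Area relative to the reference process, assuming area ~ transistor count."""
    return TRANSISTORS[name] / TRANSISTORS[ref]
```

Rounded to two decimals, `normalized_area("US")` and `normalized_area("LI")` reproduce the 0.14 and 1.18 entries, the latter corresponding to the roughly 18% larger area quoted for the LI process.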

An important practical consideration in a digital Class-D amp realization is the noise floor that in turn determines the signal-to-noise ratio (SNR). We summarize in Table III a comparison of the different Class-D amp realizations based on experimental measurements. It should be appreciated that the noise floor (at zero-input) is largely dependent on the noise of the circuit (in this case, the CPLD) used and is virtually independent of the algorithm because the noise floor is measured at zero-input. We remark that the same CPLD used for all the sampling processes is somewhat noisy at zero-input and we expect a lower noise floor for an actual IC realization; an SNR of > 90 dB is not uncommon in custom designs (e.g., [19]) and we had earlier shown that the noise floor is of this order based on simulations depicted in Fig. 10. As we have used the same CPLD and tested the different Class-D amp realizations under identical conditions, we remark from Table III that the noise floors (and SNRs) of all realizations are approximately equal and that the LI process has little effect on the noise floor.

In summary, the Class-D amp embodying the LI sampling process and CNS pulse generator features low nonlinearities, simple hardware (and small IC area) and low power dissipation, rendering it suitable for power critical portable applications including hearing instruments.

# V. CONCLUSION

We have presented a Class-D amp design that embodied our simplified LI sampling process expression and the CNS pulse generator that included our proposed frequency doubler design. We have analytically derived the double Fourier expression for the LI process to analytically determine the nonlinearities and that modeled the mechanisms of the nonlinearities. We have shown that the LI sampling process featured very low nonlinearities despite the modest computation required. The Class-D amp featured low voltage micropower operation, very low nonlinearities, and a small IC area, hence, appropriate for power-critical micropower portable applications. Our design and analytical derivations have been verified by computer simulations and on the basis of experimental measurements.

## APPENDIX

# DERIVATION OF THE FOURIER SERIES COEFFICIENTS FOR THE LI SAMPLING PROCESS

With reference to (4) and (5), we apply the double Fourier series analysis to derive the Fourier series coefficients for the LI sampling process. First, we substitute  $\theta$  as a function of  $\phi$  so that the  $\phi$  points correspond to the  $\theta_1$  and  $\theta_2$  points, i.e.,

$$\phi_1 = \left\lfloor \frac{\theta}{2\pi} \right\rfloor \frac{2\pi}{p}; \quad \phi_2 = \left( \left\lfloor \frac{\theta}{2\pi} \right\rfloor + 1 \right) \frac{2\pi}{p} \tag{A1}$$

where  $\lfloor (\theta/2\pi) \rfloor$  denotes the nearest integer less than or equal to  $(\theta)/(2\pi)$ .

We depict in Fig. 4 the detailed illustration of the LI single-sided trailing-edge PWM sampling corresponding to Fig. 3. Analogous to our simplified LI pulsewidth (1d) in Section II-A, we note that as VW is $\theta_1$ and YZ is $2\pi - \theta_2$, we use a pair of similar triangles VWX and XYZ to form the proportionality

$$\frac{\frac{\theta_{\text{LI}}}{p}}{\frac{2\pi}{p} - \frac{\theta_{\text{LI}}}{p}} = \frac{\text{VW}}{\text{YZ}} \Rightarrow \frac{\frac{\theta_{\text{LI}}}{p}}{\frac{2\pi}{p} - \frac{\theta_{\text{LI}}}{p}} = \frac{\theta_1}{2\pi - \theta_2}. \tag{A2}$$

Using (A2), we can easily derive an expression for the sampling point

$$\theta_{\rm LI} = \frac{2\pi\theta_1}{2\pi - \theta_2 + \theta_1}.\tag{A3}$$

Note that (A3) is derived based on the example in Fig. 4. This example pertains to a part of the input modulating signal that has positive gradient, i.e.,  $\theta_2 > \theta_1$ . We remark that the same equation [i.e., (A3)] can be derived when the input modulating signal has a negative gradient, i.e.,  $\theta_1 > \theta_2$ .
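As a quick numerical check of (A3), the sketch below treats the sampling point as the intersection of the straight line joining the two samples with a unit-slope ramp from 0 to $2\pi$. This geometric reading of Fig. 4 is our interpretation for illustration, and the function names are hypothetical.

```python
import math

def li_crossing(theta1, theta2):
    """Sampling point theta_LI from (A3)."""
    return 2.0 * math.pi * theta1 / (2.0 * math.pi - theta2 + theta1)

def chord_minus_ramp(x, theta1, theta2):
    """Height of the chord through (0, theta1) and (2*pi, theta2) minus the ramp y = x."""
    return theta1 + (theta2 - theta1) * x / (2.0 * math.pi) - x
```

Evaluating `chord_minus_ramp(li_crossing(t1, t2), t1, t2)` returns a value at rounding-error level for both rising ($\theta_2 > \theta_1$) and falling ($\theta_1 > \theta_2$) segments, which is consistent with the remark that the same expression holds for either gradient.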

### TABLE III

COMPARISON OF MEASURED NOISE FLOOR AND SNR (4 kHz BANDWIDTH) FOR DIGITAL CLASS D AMP REALIZATIONS EMBODYING DIFFERENT SAMPLING PROCESSES AND SAME PULSE GENERATOR. MEASUREMENTS ARE OBTAINED FROM THE ALTERA EXCALIBUR NIOS DEVELOPMENT BOARD VERSION 2.0 WITH AN APEX 20K200EFC484 DEVICE

| 16-bit Class D Amp | Integrated Noise Floor (μV<sub>rms</sub>) | SNR (dB) |
|--------------------|------------------------|----------|
| US Sampling        | 86.9                   | 80.4     |
| δC Sampling        | 88.1                   | 80.3     |
| LI Sampling        | 86.9                   | 80.4     |

By substituting (5) and (A1) into (A3), we can now show that the pulsewidth function of the LI sampling contour  $\Omega(\theta, \phi)$  is

$$\begin{aligned}
\Omega(\theta, \phi) &= \frac{2\pi (B + Q\cos\phi_1)}{2\pi - (B + Q\cos\phi_2) + (B + Q\cos\phi_1)} \\
&= \frac{2B\pi + 2Q\pi\cos\left(\left\lfloor\frac{\theta}{2\pi}\right\rfloor\frac{2\pi}{p}\right)}{2\pi + 2Q\sin\frac{\pi}{p}\sin\left(\left(\left\lfloor\frac{\theta}{2\pi}\right\rfloor\frac{2\pi}{p}\right) + \frac{\pi}{p}\right)}.
\end{aligned} \tag{A4}$$
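The simplification of the denominator in (A4) rests on a standard sum-to-product identity. As a brief sketch, using only (5) and (A1), and noting that $\phi_2 - \phi_1 = 2\pi/p$:

$$\cos\phi_1 - \cos\phi_2 = 2\sin\frac{\phi_1+\phi_2}{2}\,\sin\frac{\phi_2-\phi_1}{2} = 2\sin\frac{\pi}{p}\,\sin\left(\phi_1 + \frac{\pi}{p}\right)$$

$$2\pi - \theta_2 + \theta_1 = 2\pi + Q\left(\cos\phi_1 - \cos\phi_2\right) = 2\pi + 2Q\sin\frac{\pi}{p}\,\sin\left(\phi_1 + \frac{\pi}{p}\right)$$

Substituting $\phi_1 = \lfloor \theta/2\pi \rfloor\, 2\pi/p$ then gives the second form of the denominator.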

The PWM spectrum cannot be obtained directly by evaluating the double Fourier series along the sampling contour because the LI sampling contour is discontinuous in the $(\theta, \phi)$ coordinate space due to the discrete term $\lfloor (\theta)/(2\pi) \rfloor$. We overcome this difficulty by transforming (A4) into a continuous function with a new term

$$u = \phi - \left\lfloor \frac{\theta}{2\pi} \right\rfloor \frac{2\pi}{p} + \frac{\theta}{p}. \tag{A5}$$

By straightforward manipulation of (A5), we obtain

$$\phi = u - \frac{\theta}{p}, \qquad \text{for } 0 \le \theta < 2\pi. \tag{A6}$$

We now rewrite the transformed pulsewidth function  $\Omega_T(\theta,u)$  as

$$\Omega_T(\theta, u) = \frac{B + Q\cos\left(u - \frac{\theta}{p}\right)}{1 + \frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(u - \frac{\theta}{p} + \frac{\pi}{p}\right)}. \tag{A7}$$

In order to compute the Fourier coefficient  $K_{mn}$ , we apply a linear shearing transform that maps the parallelogram bounded by  $\theta = 0$ ,  $\theta = 2\pi$ ,  $u = \theta/p$ , and  $u = 2\pi + \theta/p$  in Fig. 3 into a square by introducing a new term  $\Phi$ 

$$\Phi = u - \frac{\theta}{p}, \qquad 0 \le \theta, u \le 2\pi. \tag{A8}$$

We note that  $d\Phi = du$  as  $\theta$  and  $p$  are constants for a given line. In other words, the LI sampling contour is now placed along the  $\theta$  axis and the parallelogram is transformed into a square bounded by  $\theta=0$ ,  $\theta=2\pi$ ,  $\Phi=0$ , and  $\Phi=2\pi$ . This contour is

$$\Omega_T(\theta, \Phi) = \frac{B + Q\cos\Phi}{1 + \frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi + \frac{\pi}{p}\right)}. \tag{A9}$$

By substituting (4) and (A9) into the Fourier coefficient [3], we can express the Fourier coefficient  $K_{mn}$  in the  $(\theta, \Phi)$  domain as shown in (A10)

$$\begin{aligned}
K_{mn} &= \frac{1}{4\pi^2} \int_{\Phi} \int_{\theta} F(\theta, \Phi)\, e^{-j\left(m\theta + n\left(\Phi + \frac{\theta}{p}\right)\right)}\, d\theta\, d\Phi \\
&= \frac{1}{4\pi^2} \int_{0}^{2\pi} e^{-jn\Phi} \int_{0}^{\Omega_T(\theta, \Phi)} H e^{-j\left(m + \frac{n}{p}\right)\theta}\, d\theta\, d\Phi \\
&= \frac{1}{4\pi^2} \int_{0}^{2\pi} e^{-jn\Phi} \int_{0}^{\frac{B+Q\cos\Phi}{1+\frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi + \frac{\pi}{p}\right)}} H e^{-j\left(m + \frac{n}{p}\right)\theta}\, d\theta\, d\Phi \\
&= \frac{jH}{4\left(m + \frac{n}{p}\right)\pi^2} \int_{0}^{2\pi} e^{-jn\Phi} \left[ e^{-j\left(m + \frac{n}{p}\right)\left(\frac{B+Q\cos\Phi}{1+\frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi + \frac{\pi}{p}\right)}\right)} - 1 \right] d\Phi \\
&= \frac{jH}{4\left(m + \frac{n}{p}\right)\pi^2} \left[ \int_{0}^{2\pi} e^{-jn\Phi} e^{-j\left(m + \frac{n}{p}\right)\left(\frac{B+Q\cos\Phi}{1+\frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi + \frac{\pi}{p}\right)}\right)} d\Phi - \int_{0}^{2\pi} e^{-jn\Phi}\, d\Phi \right]
\end{aligned} \tag{A10}$$

where

$$\int_{0}^{2\pi} e^{-jn\Phi}\, d\Phi = \begin{cases} 0, & n \neq 0 \\ 2\pi, & n = 0, \end{cases}$$

and

$$K_{mn} = \frac{jH}{4\left(m + \frac{n}{p}\right)\pi^{2}} \left[ \int_{0}^{2\pi} e^{-jn\Phi} e^{-j\left(m + \frac{n}{p}\right)\left(\frac{B + Q\cos\Phi}{1 + \frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi + \frac{\pi}{p}\right)} \right)} d\Phi \right] \tag{A11}$$

for  $n \neq 0$ , and

$$K_{m0} = \frac{jH}{4\left(m + \frac{n}{p}\right)\pi^{2}} \left[\int_{0}^{2\pi} e^{-jn\Phi} e^{-jm\left(\frac{B+Q\cos\Phi}{1+\frac{Q}{\pi}\sin\frac{\pi}{p}\sin\left(\Phi+\frac{\pi}{p}\right)}\right)} d\Phi - 2\pi\right] \tag{A12}$$

for  $n = 0$ .

In summary, we have derived the double Fourier series coefficients in (A11) and (A12) for the LI single-sided trailing-edge PWM sampling process.
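The coefficients in (A11) and (A12) have no closed form here, but they can be evaluated numerically. The following is a minimal sketch rather than the procedure used for the results in this paper: it approximates the integral over $\Phi$ with a uniform rectangle rule, which is accurate for a smooth $2\pi$-periodic integrand and integer $n$. The function names, the step count, and the values $B = \pi$, $Q = M\pi$ with $M = 0.5$, $p = 48$, and $H = 1$ are illustrative assumptions; in particular, (5) is not reproduced in this appendix, and $p$ is assumed to be the carrier-to-signal frequency ratio.

```python
import cmath
import math


def pulsewidth(phi_cap, B, Q, p):
    """Transformed LI pulsewidth function Omega_T(theta, Phi) of (A9)."""
    return (B + Q * math.cos(phi_cap)) / (
        1.0 + (Q / math.pi) * math.sin(math.pi / p) * math.sin(phi_cap + math.pi / p)
    )


def k_mn(m, n, B, Q, p, H=1.0, steps=4096):
    """Numerically evaluate (A11) for n != 0 and (A12) for n == 0 (n must be an integer)."""
    w = m + n / p
    if w == 0:
        raise ValueError("m + n/p must be nonzero; the formulas do not cover this term")
    d_phi = 2.0 * math.pi / steps
    total = 0j
    # Rectangle rule over one full period of Phi.
    for k in range(steps):
        phi_cap = k * d_phi
        total += cmath.exp(-1j * (n * phi_cap + w * pulsewidth(phi_cap, B, Q, p)))
    integral = total * d_phi
    if n == 0:
        # The second integral in (A10) contributes 2*pi only when n == 0.
        integral -= 2.0 * math.pi
    return 1j * H * integral / (4.0 * w * math.pi ** 2)


if __name__ == "__main__":
    M = 0.5  # illustrative modulation index (assumption)
    B, Q, p = math.pi, M * math.pi, 48  # illustrative values (assumptions)
    for m, n in [(0, 1), (0, 2), (1, 0), (1, 1)]:
        print(f"|K_{m},{n}| = {abs(k_mn(m, n, B, Q, p)):.6e}")
```

A useful sanity check is the unmodulated case $Q = 0$ with $B = \pi$: the pulsewidth is then constant, every coefficient with $n \neq 0$ vanishes, and $K_{10}$ reduces to $-jH/\pi$, the first harmonic of a 50% duty-cycle pulse train.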

# ACKNOWLEDGMENT

The authors would like to thank Dr. M.-T. Tan for designing the Class-D output stage IC.

# REFERENCES

- [1] B. L. Sim, Y. C. Tong, J. S. Chang, and C. T. Tan, "A parametric formulation of the generalized spectral subtraction method," *IEEE Trans. Speech Audio Process.*, vol. 6, no. 4, pp. 328–337, Jul. 1998.
- [2] J. S. Chang, M. T. Tan, Z. H. Cheng, and Y. C. Tong, "Analysis and design of power efficient Class-D amplifier output stages," *IEEE Trans. Circuits Syst. I, Fundam. Theory Appl.*, vol. 47, no. 6, pp. 897–902, Jun. 2000.
- [3] H. S. Black, Modulation Theory. New York: Van Nostrand, 1953, pp. 263–281.
- [4] P. H. Mellor, S. P. Leigh, and B. M. G. Cheetham, "Reduction of spectral distortion in class D amplifiers by an enhanced pulsewidth modulation sampling process," *Proc. Inst. Elect. Eng.-G*, vol. 138, no. 4, pp. 441–448, Aug. 1991.
- [5] J. M. Goldberg and M. B. Sandler, "New high accuracy pulsewidth modulation based digital-to-analogue convertor/power amplifier," *Proc. Inst. Elect. Eng.-Circuits, Devices Syst.*, vol. 141, no. 4, pp. 315–324, Aug. 1994.
- [6] M. Streitenberger, F. Felgenhauer, and H. Bresch, "Zero position coding (ZePoC)—A generalized concept of pulse-length modulated signals and its application to class-D audio power amplifiers," in *Proc. 110th AES Convention*, May 2001. preprint 5365.
- [7] B. F. Logan Jr., "Click modulation," AT&T Bell Lab Tech. J., vol. 63, no. 3, pp. 401–423, Apr. 1984.
- [8] L. Risbo and T. Mørch, "Performance of an all-digital power amplification system," in *Proc. 104th AES Convention*, May 1998. preprint no. 4695.
- [9] M. Johansen and K. Nielsen, "A review and comparison of digital PWM methods for digital pulse modulation amplifier (PMA) systems," in *Proc. 107th AES Convention*, Sep. 1999. preprint no. 5039.
- [10] Z. Song, "Digital pulse width modulation: analysis, algorithms, and applications," Ph.D. dissertation, Univ. Illinois at Urbana-Champaign, Urbana, 2001.
- [11] B. H. Gwee, J. S. Chang, and H. Li, "A micropower low-distortion digital pulsewidth modulator for a digital Class-D amplifier," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 49, no. 1, pp. 1–13, Jan. 2002.
- [12] B. H. Gwee, J. S. Chang, V. Adrian, and H. Amir, "A novel sampling process and pulse generator for a low distortion digital pulse-width modulator for digital Class-D amplifiers," *Proc. IEEE Int. Symp. Circuits Syst.*, vol. IV, pp. 504–507, May 2003.
- [13] M. T. Tan, J. S. Chang, H. C. Chua, and B. H. Gwee, "An investigation into the parameters affecting THD in low-voltage low-power Class-D amplifiers," *IEEE Trans. Circuits Syst. I, Fundam. Theory Appl.*, vol. 50, no. 10, pp. 1304–1315, Oct. 2003.
- [14] G. Y. Wei and M. Horowitz, "A low power switching power supply for self-clocked systems," *Int. Symp. Low Power Electron. Design*, pp. 313–318, Aug. 1996.
- [15] A. Dancy and A. P. Chandrakasan, "Ultra low power control circuits for PWM converters," in *IEEE Power Electron. Specialist Conf.*, 1997, pp. 21–27.
- [16] —, "A reconfigurable dual output low power digital PWM power converter," in *Proc. Int. Symp. Low Power Electronics and Design*, 1998, pp. 191–196.
- [17] R. E. Hiorns, J. M. Goldberg, and M. B. Sandler, "Realizing an all digital power amplifier," in *Proc. AES 89th Convention*, Sep. 1990. preprint no. 2960.
- [18] S. K. Tewksbury and R. W. Hallock, "Oversampled, linear predictive and noise-shaping coders of order N > 1," *IEEE Trans. Circuits Syst.*, vol. CAS-25, no. 7, pp. 436–447, Jul. 1978.

- [19] J. L. Melanson, "Delta Sigma PWM DAC to reduce switching," U.S. Patent 5 815 102, Sep. 29, 1998. (incorporated in Cirrus Logic CS44210).
- [20] A. J. Magrath and M. B. Sandler, "Digital power amplification using sigma-delta modulation and bit flipping," J. AES, vol. 45, no. 6, Jun. 1997.
- [21] R. Esslinger, G. Gruhler, and R. W. Stewart, "Sigma-delta modulation in digital class-D power amplifiers: Methods for reducing the effective pulse transition rate," in *Proc. AES 112th Convention*, May 2002. preprint no. 5634.
- [22] P. H. Mellor, S. P. Leigh, and B. M. G. Cheetham, "Improved sampling process for a digital, pulse-width modulated Class-D power amplifier," *Proc. Inst. Elect. Eng. Colloq. Digit. Audio Signal Process.*, pp. 3/1–3/5, May 1991.
- [23] C. Pascual and B. Roeckner, "Computationally efficient conversion from pulse-code modulation to naturally sampled pulse-width modulation," in *Proc. 109th AES Convention*, Sep. 2000. preprint 5198.
- [24] W. R. Bennett, "New results in the calculation of modulation products," Bell Syst. Tech. J., no. 12, pp. 228–243, 1933.
- [25] R. F. Tinder, Engineering Digital Design, 2nd ed. San Diego, CA: Academic, 2000, pp. 353–357.
- [26] P. Midya, M. Miller, and M. Sandler, "Integral noise shaping for quantization of pulse width modulation," in *Proc. 109th AES Convention*, Sep. 2000. preprint no. 5193.
- [27] C. Pascual, "All-digital audio amplifier," Ph.D. dissertation, Univ. Illinois at Urbana-Champaign, Urbana, 2001.
- [28] W. J. Roeckner, P. Midya, P. A. Wagh, and W. J. Rinderknecht, "Method and apparatus for generating a pulse width modulated signal," U.S. Patent 6 606 044, Aug. 2003.
- [29] P. Midya, B. Roeckner, P. Rakers, and P. Wagh, "Prediction correction algorithm for natural pulsewidth modulation," in *Proc. 109th AES Convention*, Sep. 2000. preprint 5194.



**Bah-Hwee Gwee** (S'93–M'97–SM'03) received the B.Eng. degree in electrical and electronic engineering from the University of Aberdeen, U.K., in 1990, and the M.Eng. and Ph.D. degrees from Nanyang Technological University (NTU), Singapore, in 1992 and 1998, respectively.

He worked on a National Science and Technology Board-funded project with NTU in collaboration with SEIKO Instruments R&D Lab.—Human Interface Engineering, Singapore, from 1990 to 1993. From 1995 to 1998, he was a Lecturer with the School of Electronic Engineering, Temasek Polytechnic, Singapore. He has been an Assistant Professor with the School of Electrical and Electronic Engineering, NTU, since 1999. He is the principal investigator of several research grants including the ASEAN-EU University Network Programme (AUNP) project. His total research grant amounts to US\$ 450 000. He has published more than 30 research papers and filed several patents in circuit design. His research interests include low-power asynchronous microprocessor and digital signal processor design, Class-D amplifiers, and soft-computing.

Dr. Gwee is currently the chairperson of the IEEE Singapore-Circuits and Systems Chapter and the publication chair of the IEEE APCCAS-2006.



**Joseph S. Chang** received the B.Eng. degree in electrical and computer systems engineering from Monash University, Melbourne, Australia, in 1983 and the Ph.D. degree in otolaryngology from the University of Melbourne, Melbourne, in 1990.

He worked for CSIRO, Melbourne, Australia, and Texas Instruments, Singapore, from 1983 to 1985. From 1989 to 1991, he was a Senior Research Scientist/Engineer at the Human Communication Research Centre, University of Melbourne. He is presently an Associate Professor with Nanyang Technological University, Singapore. His research interests include analog and digital signal processing, very large-scale integration design, speech perception, biomedical engineering, and hearing instrument (hearing aid) research. He holds several patents and has several pending patents in circuit design.

Dr. Chang received the Commendation for the Best Presentation of a Paper Award in 1989 for a paper presented at the Microelectronics Conference, Australia. He served as the Chairperson of the International Symposium on Integrated Circuits, Devices and Systems (ISIC-2004).



**Victor Adrian** received the B.Eng. degree in electrical and electronic engineering, and the M.Phil. degree from Nanyang Technological University (NTU), Singapore, in 2003 and 2004, respectively.

He is currently working on his research projects with NTU. His research interests are digital Class-D amplifiers, low-voltage low-power IC design, and real-time implementation of acoustic noise reduction algorithms.