# Multicarrier demodulator for digital satellite communication systems

Prof. E. Del Re, R. Fantacci, PhD

Indexing terms: Demodulation, Digital communication systems, Satellite links and space communication

Abstract: A multicarrier demodulator (MCD) suitable for advanced digital satellite communications is presented. This system permits the direct interfacing of FDMA and TDM communication links by using digital signal processing techniques. Two main functions are implemented by an MCD: demultiplexing and demodulation. We focus here only on a digital implementation of an MCD with a view to achieving flexibility, better performance and suitability for VLSI.

The demultiplexer is implemented according to a per-channel structure based on an analytic signal method that allows a highly modular and flexible implementation to be achieved. This approach permits a certain degree of integration of the demultiplexer and demodulator functions. In the proposed MCD scheme the receiver pulse-shaping filter can be integrated in the demultiplexer structure, thus lowering the overall implementation complexity. Coherent demodulation is used to reduce the signal-to-noise ratio required to achieve a specified bit error rate and is carried out using the maximum likelihood (ML) estimation method. A maximum a posteriori probability (MAP) method is used to jointly estimate the carrier phase and bit timing of the received signal. The digital architecture of the proposed MCD can be adapted to different digital modulation techniques. However, we focus here on the application for QPSK signals, since this modulation scheme is of interest in digital satellite communications. A theoretical analysis and computer simulations are performed in order to evaluate the performance degradation of the proposed MCD, including the finite arithmetic implementation.

## 1 Introduction

Digital communication systems will play a key role in the development and establishment of the new and value-added services of future advanced communication networks. Digital transmission will be employed in areas that traditionally have been, and still are, the domain of analogue transmission, such as radio-relay links and satellite communication systems. Efficient and cost-effective solutions require new approaches and implementations for both the transmission part (e.g. new coding/modulation schemes) and the communication system architecture. For example, in the past, satellites have operated using analogue modulation of the carrier, and access to the satellite was achieved with frequency-division multiple access (FDMA). The satellite simply translated the carrier frequency and retransmitted the signal in a wide beam covering a large geographical area. Today, new systems employ time-division multiple access (TDMA), new efficient modulation techniques, multiple-beam antennas, and onboard processing for higher system efficiency.

Paper 66781 (E8, E9), first received 26th May 1987 and in revised form 16th January 1989

The authors are with the Dipartimento di Ingegneria Elettronica, Università di Firenze, Via S. Marta 3, 50139 Firenze, Italy

Onboard signal processing offers advantages for satellite communication systems. A typical and interesting feature is the separation of uplinks and downlinks, thus allowing their separate and independent optimisation. Regenerative satellites allow different modulation and multiple-access schemes to be employed in the uplinks and the downlinks; for example, uplink random access and downlink TDMA techniques can be envisaged [1]. Alternatively, in many applications, such as mobile or fixed communications services, the use of uplink FDMA techniques (with the inherently low-cost earth stations) and downlink TDMA techniques (that can fully exploit the satellite transponder output power without intermodulation distortion) is an attractive solution. The feasibility of this approach, however, depends on efficient means for translating between the two multiple-access formats onboard the satellite. The onboard system implementation complexity (including the VLSI design) and power consumption are, of course, of primary concern. The onboard processing system receives an input FDMA signal and supplies an output to interface the TDMA or TDM links; therefore it must separate each individual radio channel, demodulate it and switch it correctly to the appropriate downlink channel. An appropriate name for an onboard processing system performing the first two operations is 'multicarrier demodulator' (MCD). Two main functions are implemented by an MCD: demultiplexing and demodulation.

The focus here is only on a digital implementation of an MCD because it offers several advantages, such as flexibility, VLSI integrability and better efficiency. The operation of the demultiplexer is to separate the individual input FDMA channels and to supply each of them to a demodulator input for the appropriate down-conversion to baseband. Therefore, in principle its operation corresponds to a bank of bandpass filters followed by a down-converter. The down-conversion can be implemented digitally by a sampling-frequency reduction (i.e. a decimation operation). However, direct implementation of a bank of digital filters is not the most convenient solution. This paper describes an efficient approach to the digital implementation of a demultiplexer, based on the analytic signal method [2]. This method fully exploits the properties of the analytic signal and employs the tools offered by digital signal processing techniques to implement a demultiplexer that is modular, efficient, flexible, of relatively low complexity and suitable for VLSI integration.

Coherent demodulation is usually employed in satellite communication to achieve the required bit error rate (i.e. $10^{-6}$ to $10^{-9}$) with an acceptable signal-to-noise ratio. The performance of a coherent demodulator depends rather critically on the design of the synchronisation circuit employed to estimate the received-carrier phase and bit synchronisation reference from the received signal. Carrier recovery can be achieved in different ways, e.g. using the Mth-power method or with the Costas loop and decision-directed feedback circuit. With M = 2, the Mth-power method is known as a squaring loop [3]. Clock recovery is usually achieved by performing a nonlinear operation on the received signal, because the signal does not contain discrete spectral lines at the clock frequency [3]. Clock recovery can occur subsequent to or coincident with carrier recovery. In the former case, the recovery circuits operate on the demodulated (not necessarily detected) baseband waveform, whereas in the latter case the circuits operate directly on the modulated carrier signal. In this paper a maximum a posteriori probability (MAP) method is used to jointly estimate the parameters that require synchronisation. In particular, it will be shown that, by a suitable choice of the architecture of the digital coherent receiver, the ML demodulator can be easily integrated with the joint carrier and clock recovery circuit. The digital architecture of the receiver can be adapted to different digital modulation techniques. However, we focus here only on the application for QPSK signals, since this modulation scheme is of interest in satellite digital communications.
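As a brief illustration of the Mth-power method mentioned above, the following sketch estimates the carrier phase of a noise-free QPSK signal with M = 4. It is a generic textbook estimator, not the MAP estimator adopted later in this paper; the symbol alphabet (phases $\pi/4 + k\pi/2$), the test phase and the function name are assumptions made for the example.

```python
import numpy as np

def fourth_power_phase(r):
    """Carrier phase estimate (modulo pi/2) for QPSK with phases pi/4 + k*pi/2."""
    # Every symbol s satisfies s**4 = exp(j*pi) = -1, so the fourth power
    # removes the modulation: r**4 = -exp(j*4*theta).
    return np.angle(-np.sum(r ** 4)) / 4.0

rng = np.random.default_rng(0)
symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=200)))
theta = 0.2                                   # assumed carrier phase offset, rad
received = symbols * np.exp(1j * theta)       # noise-free for clarity
estimate = fourth_power_phase(received)
print(f"estimated phase: {estimate:.4f} rad")
```

The estimate is only defined modulo $\pi/2$, which is the usual four-fold phase ambiguity of this family of estimators.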

The MCD system described in this paper represents a complete solution for a processing system interfacing FDMA and TDM links. In particular, its design has been carried out with a view to its possible implementation by means of custom-VLSI digital circuits.

## 2 Demultiplexer\*

Demultiplexing of an FDM signal can be performed following two basic approaches: block methods and nonblock methods. We focus here only on nonblock methods and, in particular, on the analytic signal approach [2]. This approach is a per-channel method that avoids using a digital product modulator or a block processor. It has the specific feature of relaxing the filter specifications, thus achieving a lower implementation complexity with respect to other per-channel approaches. Further, the analytic signal approach leads directly to a per-channel and highly modular structure; this structure is directly matched to the per-channel implementation of the demodulators. Another advantage of the analytic signal approach is its high flexibility: where a specific application would benefit from unequal channel bandwidths, the analytic signal structure, in contrast to other methods, can vary on demand the bandwidth assigned to each channel, simply by switching to a suitable new set of demultiplexer parameters.

The principle of operation of the analytic signal method is illustrated in Reference 2 and is briefly recalled in the following. The structure of a demultiplexer according to the analytic signal method is shown in Fig. 1 [2, 4], where only the ith channel is considered; the implementation is the same for all channels. The FDMA input signal, after appropriate analogue down-conversion to a low-frequency range, is sampled according to the sampling theorem [5] at the high-rate frequency $f_u = 1/T_u$ and processed to obtain $N_c$ TDM digital signals, each sampled at the low-rate frequency $f_d = 1/T_d$, $N_c$ being the number of multiplexed channels. In Fig. 1, $H_i(fT_u)$ and $H_i'(fT_u)$ represent the conjugate symmetric and antisymmetric parts, respectively, of the high-rate complex bandpass filter $\underline{H}_i(fT_u)$ (underlining denotes a complex quantity). This complex filter can be regarded as a frequency-translated version of a lowpass prototype $H(fT_u)$ such that [2]

$$\underline{H}_i(fT_u) = H_i(fT_u) + jH_i'(fT_u) = H[2\pi(f - iW - W/2)T_u] \tag{1}$$

where W is the channel spacing. In the same figure, $G_i(fT_d)$ and $G_i'(fT_d)$ represent the conjugate symmetric and antisymmetric parts, respectively, of the complex low-rate filter $\underline{G}_i(fT_d)$, which can be defined as [2]

$$\underline{G}_i(fT_d) = G_i(fT_d) + jG_i'(fT_d) = G\{[f - (-1)^i W/2]T_d\} \tag{2}$$

Thus, each filter  $G_i(fT_d)$  is related, according to eqn. 2, to a lowpass prototype. It can be noted from eqn. 2 that the number of different filters  $G_i(fT_d)$  is actually two: one for the odd channels and the other for the even channels.
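The remark that only two distinct low-rate filters exist can be checked numerically. The sketch below builds the complex low-rate filters by shifting a real lowpass prototype by $(-1)^i W/2$, taking $WT_d = 1/2$ as implied by the channel layout of this section. The Hamming-windowed prototype and the function names are assumptions for illustration only; the paper designs its prototypes by the equiripple method.

```python
import numpy as np

def lowpass(num_taps, cutoff):
    """Hamming-windowed sinc lowpass (illustrative prototype); cutoff in cycles/sample."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    return 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)

def low_rate_filter(i, prototype):
    """Complex low-rate filter of channel i: prototype shifted by (-1)^i * W/2.

    With W*T_d = 1/2 the shift is +-1/4 cycle per low-rate sample.
    """
    n = np.arange(prototype.size) - (prototype.size - 1) / 2
    return prototype * np.exp(2j * np.pi * (-1) ** i * 0.25 * n)

g = lowpass(23, 0.25)
filters = [low_rate_filter(i, g) for i in range(10)]

# The real and imaginary parts are the two real filters of the per-channel structure.
g_even, g_even_prime = filters[0].real, filters[0].imag
print(np.allclose(filters[1], np.conj(filters[0])))
```

All even-indexed channels share one coefficient set and all odd-indexed channels share its complex conjugate, so a single stored prototype serves every channel.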

The principle of operation of the analytic signal method is shown in Fig. 2. The input FDM signal (Fig. 2a) is filtered by $\underline{H}_i(fT_u)$ (where $i = 0, 1, ..., N_c - 1$ is the channel index) which is ideally defined as

$$\underline{H}_i(fT_u) = \begin{cases} 1, & if_d/2 \leq f \leq (i+1)f_d/2\\ \text{undefined}, & (i-1)f_d/2 < f < if_d/2\\ \text{undefined}, & (i+1)f_d/2 < f < (i+2)f_d/2\\ 0, & \text{elsewhere} \end{cases} \tag{3}$$

in the frequency band  $[0 \text{ to } f_u/2]$  and is periodic with a frequency period  $f_u$ . The frequency response is sketched in Figs. 2b and 2c for the odd and even channels, respectively. The filter output is a sampled analytic signal  $s_i(nT_u)$  at the sampling rate  $f_u$  which can be expressed in the frequency domain as

$$S_i(fT_u) = S(fT_u)H_i(fT_u) \tag{4}$$

The spectrum $S_i(fT_u)$ is shown in Figs. 2d and 2e. The sampling frequency of the signal $s_i(nT_u)$ is reduced by the factor $N_c$ to produce the complex lowpass signal $u_i(nT_d)$ (for n an integer and sampled at the frequency $f_d = f_u/N_c$) given by

$$u_i(nT_d) = s_i(nN_cT_u)$$

$$U_i(fT_d) = (1/N_c) \sum_{k=0}^{N_c-1} S_i[(fT_d - k)/N_c] = (1/N_c)\{S_i[(fT_d - k_1)/N_c] + S_i[(fT_d - k_2)/N_c]\} \tag{5}$$

with

$$k_1 = i/2 + [1 - (-1)^i]/4, \quad k_2 = k_1 + (-1)^i$$

Fig. 2 Frequency demultiplexing by the analytic signal method. (a) FDM input signal; (b), (c) frequency response of the high-rate channel filter; (d), (e) spectra of the filtered FDM signal; (f), (g) spectra of the complex signal obtained by decimation over $N_c$; (h), (i) frequency response of the lowpass prototype; (j) frequency response of related lowpass prototype filter; (k), (l) spectra of the complex demultiplexed signal; (m), (n) recovered baseband spectra

<sup>\*</sup> In accordance with usual signal theory, in the following the arguments of frequency domain quantities are considered as the exponents of complex exponentials, e.g. $S(fT_u)$ means $S(\exp [j2\pi fT_u])$.

The frequency spectrum of this complex lowpass signal is sketched in Figs. 2f and 2g for the odd and even channels, respectively. Its baseband (i.e. the range for which the frequency magnitude is not greater than half the sampling frequency) extends to  $f_d/2$ .

The complex signal $\underline{y}_i(nT_d)$, whose real part is related to the desired demultiplexed digital signal, is given in the frequency domain by

$$\underline{Y}_i(fT_d) = U_i(fT_d)\underline{G}_i(fT_d) = (1/N_c)S_i[(fT_d - k_1)/N_c]\underline{G}_i(fT_d) \tag{6}$$

The ideal frequency response for the filters $\underline{G}_i(fT_d)$ is sketched in Figs. 2h and 2i for i odd and even, respectively. It is now clear that they are related, according to eqn. 2, to a lowpass prototype filter $G(fT_d)$ with a frequency response shown in Fig. 2j. Finally, the real digital signal translated to baseband can be expressed as

$$y_i(nT_d) = \text{Re}[\underline{y}_i(nT_d)]$$

$$Y_i(fT_d) = \tfrac{1}{2}[\underline{Y}_i(fT_d) + \underline{Y}_i^*(-fT_d)] \tag{7}$$

where \* denotes the complex conjugation operator.

The output from the demultiplexer, taking into account the spectral inversion for the odd channels, is

$$\begin{aligned} X_i(fT_d) &= Y_i(fT_d + i/2)\\ &= (1/N_c)S(fT_d + i/2)\,[H_i(fT_d + i/2)G_i(fT_d + i/2) - H_i'(fT_d + i/2)G_i'(fT_d + i/2)] \end{aligned} \tag{8}$$

Eqn. 8 represents the signal of the *i*th channel correctly translated to the baseband and sampled at the low sampling frequency, as shown in Figs. 2m and 2n.
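The chain of operations described in this section (complex bandpass filtering, decimation by $N_c$, complex low-rate filtering, real part, and correction of the spectral inversion of odd channels) can be sketched for a single channel as follows. This is a minimal illustration under stated assumptions: frequencies are normalised to $f_u = 1$, so that $WT_u = 1/(2N_c)$; the Hamming-windowed prototypes, the filter lengths and the function names are hypothetical and are not the filters designed in the paper.

```python
import numpy as np

def lowpass(num_taps, cutoff):
    """Hamming-windowed sinc lowpass with unit DC gain; cutoff in cycles/sample."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)
    return h / h.sum()

def demultiplex_channel(x, i, n_c, taps_h=201, taps_g=41):
    """Return channel i of the real FDM signal x as a real signal at rate f_u / n_c."""
    w = 0.5 / n_c                                   # W * T_u
    # High-rate complex bandpass filter: prototype moved to the channel centre iW + W/2.
    n = np.arange(taps_h)
    h_i = lowpass(taps_h, w) * np.exp(2j * np.pi * (i + 0.5) * w * n)
    s_i = np.convolve(x, h_i)                       # analytic signal of channel i
    u_i = s_i[::n_c]                                # sampling-rate reduction by N_c
    # Low-rate complex filter: prototype moved by (-1)^i W/2 = +-1/4 cycle/sample.
    m = np.arange(taps_g)
    g_i = lowpass(taps_g, 0.25) * np.exp(2j * np.pi * (-1) ** i * 0.25 * m)
    y = np.convolve(u_i, g_i).real                  # real part of the complex output
    if i % 2:                                       # odd channels are spectrally inverted:
        y = y * (-1.0) ** np.arange(y.size)         # shift by half the sampling rate
    return y

def peak_frequency(y):
    """Frequency (cycles per low-rate sample) of the strongest spectral line."""
    spectrum = np.abs(np.fft.rfft(y * np.hanning(y.size)))
    return np.argmax(spectrum) / y.size

n_c = 10
t = np.arange(4000)
for ch in (3, 4):
    # Tone placed 0.1 W above the centre of channel ch.
    tone = np.cos(2 * np.pi * ((ch + 0.5) * 0.05 + 0.005) * t)
    out = demultiplex_channel(tone, ch, n_c)[60:-60]   # drop filter transients
    print(ch, peak_frequency(out))
```

In both the odd and the even case the tone is expected near 0.25 + 0.05 = 0.30 cycles per low-rate sample, i.e. at the same offset above the centre of the output band, which is the behaviour that the half-rate shift of eqn. 8 is meant to guarantee.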

In order to justify the choice of the analytic signal approach, it can be pointed out that this method permits a certain degree of integration of the demultiplexer and demodulator functions. In particular, the pulse-shaping filter, which is generally used to reduce the effects of noise at the receiver and to avoid intersymbol interference (ISI) at the detection instant, can be implemented by the cascade of the two digital filters $H_i(fT_u)$ and $G_i(fT_d)$. The high-rate filter $H_i(fT_u)$ is essentially a bandpass filter, and thus the desired pulse-shaping function can be implemented by the low-rate filter $G_i(fT_d)$. For example, it will be shown later that a pulse-shaping filter with a 40% cosine rolloff factor [3], equally shared between the transmitter and receiver, can be easily integrated in the demultiplexer, thus lowering the overall implementation complexity. However, in the following we mainly consider the case in which the low-rate filter is a lowpass filter without pulse shaping.

An interesting feature of the implementation structure shown in Fig. 1 is that only processing of real quantities is required. Moreover, the illustration of the frequency dechannelisation performed by the analytic signal method (Fig. 2) has been outlined on the basis of ideal filtering masks. In real applications there are nonzero transition bands for the filters $G_i(fT_d)$ and transition bands wider than W for the filters $H_i(fT_u)$ [2, 4]. This gives rise to more relaxed filter specifications and thus reduces the overall system complexity. The overall number of multiplications required per input channel and per second can be estimated as a function of the channel spacing W, the number of channels $N_c$ and the filtering bandwidth B as [4]

$$M = \frac{KW^{2}[W(N_{c} + 4) - 2B(N_{c} + 2)]}{(W - B)(W - 2B)} \tag{9}$$

where K is given by

$$K = -2 \log_{10}[5\delta_1 \delta_2]/3 \tag{10}$$

The terms  $\delta_1$  and  $\delta_2$  denote the overall acceptable inband and out-of-band ripples, respectively, derived according to given system specifications. A detailed description of the filter design procedure is reported in Reference 4. It follows from eqn. 9 that for specified values of B and  $N_c$  an optimum value for the channel spacing  $W_0$  can be found in order to achieve the lowest M. However, taking into account that, for the subsequent demodulation operation an integral number of samples per symbol is convenient, the suboptimum value of W closest to  $W_0$  is generally used.
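The choice of a suboptimum channel spacing can be sketched numerically. The code below evaluates eqns. 9 and 10 and searches the spacings that give an integral number of samples per symbol, using $f_d = 2W$ as implied by the channel layout of eqn. 3. The ripple values, the bandwidth B = 0.5 (in units of the symbol rate) and the function names are illustrative assumptions, not values taken from the paper; the restriction W > 2B is also an assumption, made so that the denominator of eqn. 9 stays positive.

```python
import math

def ripple_constant(delta1, delta2):
    """K of eqn. 10 from the in-band and out-of-band ripples."""
    return -2.0 * math.log10(5.0 * delta1 * delta2) / 3.0

def mult_rate(w, b, n_c, k):
    """M of eqn. 9 (multiplications per channel per second), assuming W > 2B."""
    if w <= 2.0 * b:
        raise ValueError("this sketch assumes W > 2B")
    return k * w**2 * (w * (n_c + 4) - 2.0 * b * (n_c + 2)) / ((w - b) * (w - 2.0 * b))

def best_integer_spacing(b, n_c, k, symbol_rate, max_samples=8):
    """Pick W = m * symbol_rate / 2, i.e. m = f_d / symbol_rate samples per symbol."""
    candidates = [(m, m * symbol_rate / 2.0) for m in range(1, max_samples + 1)]
    feasible = [(mult_rate(w, b, n_c, k), m, w) for m, w in candidates if w > 2.0 * b]
    _, m_best, w_best = min(feasible)
    return m_best, w_best

# Hypothetical inputs: ripples and B are illustrative only.
K = ripple_constant(0.01, 0.001)
m_best, w_best = best_integer_spacing(b=0.5, n_c=10, k=K, symbol_rate=1.0)
print(f"K = {K:.4f}, samples/symbol = {m_best}, W = {w_best} x symbol rate")
```

With these assumed inputs the search settles on three samples per symbol, i.e. W equal to 1.5 times the symbol rate; a different B would in general move the result.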

## 3 Effects of finite arithmetic implementation

The implementation of a digital signal processing system necessarily requires a finite arithmetic. Although it is possible to conceive and actually to implement floating-point arithmetic for digital signal processing systems, it is considered that fixed-point arithmetic implementation will still represent the most convenient solution in the foreseeable future. Thus, we consider here only the effects of a fixed-point finite arithmetic implementation. The error sources due to the finite length of the digital registers are: (a) quantisation of the input signal; (b) quantisation of the filter coefficients; and (c) rounding of the multiplication operations. For the first source of error, the sampled input signal is quantised in amplitude in order to be represented by a set of numbers in binary form. We assume that the input signal can be modelled as a random Gaussian signal. This assumption arises from the consideration that the input FDMA signal is the sum of several independent signals. With this hypothesis the signal-to-quantisation noise ratio $SNR_q$ can be expressed in decibels as [5]

$$(SNR_q)_{dB} = 6.02b_q - 7.27 \text{ dB} \tag{11}$$

where $b_q$ is the number of bits employed for the quantisation of the input signal. We suppose that automatic gain control (AGC) is used to constrain the input signal of the analogue-to-digital (A/D) converter within the range $\pm 1$. Further, we shall assume that the output signal from any filter is in the range $\pm 1$. This can be guaranteed by a suitable scaling of the digital filter coefficients (included in the filter design and implementation). For the second source of error, the minimum word length of the filter coefficients is determined by computer rounding to guarantee that they still satisfy the required filtering specifications. For the third source of error it is assumed that a FIR implementation is the most suitable one for the digital filters of Fig. 1. A FIR filter implemented by P multiplications, each rounded to $b_m$ bits, produces an output noise error with a mean power equal to that introduced by an output quantisation to $b_a$ bits, according to

$$P\,2^{-2b_m}/3 = 2^{-2b_a}/3 \tag{12}$$

Let us suppose that we have determined (through analytic or simulation tools) the number of bits  $b_a$  required for the output signal quantisation to achieve some specified performance, then the number of bits  $b_m$  for the multiplication roundings inside the filter is determined [2, 4, 5] as

$$b_m = b_a + \langle (\log_2 P)/2 \rangle \tag{13}$$

where $\langle x \rangle$ denotes the minimum integer greater than or equal to x. The block diagram of the demultiplexer according to the analytic signal approach, including the multiplication rounding model previously described, is shown in Fig. 3.

Fig. 3 Finite precision implementation of demultiplexer

In Fig. 3, $S_i$ denotes the power of the input FDM signal, assumed uniformly distributed among the $N_c$ channels, $N_i$ is the mean noise power introduced in the uplink and $N_q$ is the quantisation noise power due to the input A/D conversion, both supposed white, Gaussian and uniformly distributed among the $N_c$ channels. The term $S_t/2$ represents the power of the signals at the output of the filters $H_i(fT_u)$, $H_i'(fT_u)$ and also at the output of the filters $G_i(fT_d)$, $G_i'(fT_d)$ under the assumption that they are of the all-pass type. In the same Figure, $S_t$ is the power of the signal at the ith output of the demultiplexer, $N_t$ is the overall noise power for each demultiplexer output, which will be defined in the following, and $N_{a1}$ denotes the noise power due to the finite arithmetic implementation of the filters $H_i(fT_u)$, $H_i'(fT_u)$ and to the quantisation of their outputs at $b_{a1}$ bits. In the same way, $N_{a2}$ represents the power of the noise introduced by the finite arithmetic implementation of the filters $G_i(fT_d)$, $G_i'(fT_d)$ and quantisation of their outputs at $b_{a2}$ bits. To evaluate the signal-to-noise ratio $S_t/N_t$ at each demultiplexer output, in addition to the contributions previously considered, the effects of the decimation process must also be included. The decimation process gives rise to a noise contribution at each demultiplexer output, independent of the other disturbances, with mean power given by [4]

$$N_d = S_i \delta_2^2 \tag{14}$$

where  $\delta_2$  is the maximum acceptable out-of-band ripple. It can be noted that eqn. 14 is derived according to a worst-case analysis and assuming  $N_c \gg 1$ , (out-of-band ripple constant in the filtering bandwidth and equal to its maximum value  $\delta_2$ ) [4].

Now, under the hypothesis that the filters $G_i(fT_d)$, $G_i'(fT_d)$ are of the all-pass type, by setting $N_a = N_{a1} + N_{a2}$ and assuming $N_{a1} = N_{a2}$ (equal quantisation bits at the output of the high-rate and low-rate digital filters), the overall noise power at each demultiplexer output is found to be given by

$$N_t = N_i/N_c + N_q/N_c + S_i\delta_2^2 + 2N_a \tag{15}$$

Thus, the signal-to-noise ratio at each demultiplexer output is [4]

$$(S_t/N_t) = [(S_i/N_i)^{-1} + (S_i/N_q)^{-1} + 2(S_t/N_a)^{-1} + N_c\delta_2^2]^{-1} \tag{16}$$

where we have assumed $S_t = S_i/N_c$. Thus, the finite arithmetic wordlengths at each point of the demultiplexer structure can be determined in order to introduce an overall degradation with respect to the input signal-to-noise ratio smaller than a specified value.
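The relations of this section lend themselves to a short numerical sketch. The code below evaluates the quantisation signal-to-noise ratio of eqn. 11, the multiplier wordlength of eqn. 13, and the output signal-to-noise ratio, taking eqn. 16 as the inverse of the sum of $(S_i/N_i)^{-1}$, $(S_i/N_q)^{-1}$, $2(S_t/N_a)^{-1}$ and $N_c\delta_2^2$. The function names and the value used for $S_t/N_a$ are assumptions; treating the 45 dB stopband attenuation of Section 5 as $\delta_2$ is likewise only an illustrative choice.

```python
import math

def snr_q_db(b_q):
    """Signal-to-quantisation-noise ratio of eqn. 11, in dB."""
    return 6.02 * b_q - 7.27

def mult_wordlength(b_a, p):
    """Multiplier wordlength b_m of eqn. 13 for a FIR filter with P multiplications."""
    return b_a + math.ceil(math.log2(p) / 2.0)

def output_snr_db(snr_in_db, snr_quant_db, snr_arith_db, n_c, delta2):
    """Output SNR in dB from the four noise contributions of eqn. 16."""
    inv = lambda db: 10.0 ** (-db / 10.0)
    total = inv(snr_in_db) + inv(snr_quant_db) + 2.0 * inv(snr_arith_db) + n_c * delta2**2
    return -10.0 * math.log10(total)

b_m_high = mult_wordlength(8, 47)      # 47-coefficient high-rate filter, 8-bit output
b_m_low = mult_wordlength(8, 23)       # 23-coefficient low-rate filter, 8-bit output
delta2 = 10.0 ** (-45.0 / 20.0)        # assumption: 45 dB attenuation used as delta_2
snr_out = output_snr_db(15.5, snr_q_db(8), 50.0, 10, delta2)   # 50 dB is hypothetical
print(b_m_high, b_m_low, round(15.5 - snr_out, 3))
```

For 8-bit filter outputs and filters of 47 and 23 coefficients, eqn. 13 yields 11 bits in both cases, in agreement with the $b_m$ entries of Table 1.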

## 4 MAP synchronisation and ML demodulation

The use of digital signal processing for the implementation of a QPSK coherent demodulator is now considered. The proposed digital receiver integrates the operations of the carrier and clock recovery with the coherent demodulation. A maximum a posteriori probability (MAP) criterion [6, 7] is used to simultaneously estimate the parameters necessary for synchronisation. A suitable approach to a digital implementation of a joint carrier and clock recovery circuit is described in Reference 8. In particular, it is shown that, by a suitable choice of the architecture of the digital receiver, the coherent demodulator can be easily integrated in the joint carrier and clock recovery circuit. Here we specifically base the MCD demodulator on the method and results described in Reference 8. The overall numbers of multiplications and additions required are

$$M_D = 2(L+1)(M+4) \quad \text{multiplications/symbol}$$

$$S_D = (L+1)(2M+1) + 3L + 1 \quad \text{additions/symbol} \tag{17}$$

where L + 1 is the number of symbols used to perform synchronisation and M is the number of samples per symbol. An important result is that the performance of the joint carrier and clock recovery circuit with integrated coherent demodulation depends basically only on the carrier phase error. Indeed, by considering a rectangular pulse shape for the QPSK signal, it is evident that the coherent demodulation of the received symbol depends only on the correct selection of groups of samples that belong to the same symbol and does not depend on the position of these samples relative to the symbol interval. In other words, the clock timing recovery operation consists in this case in the correct partitioning of the received samples into sets of M samples, each set belonging to a single symbol.
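The statement that, for a rectangular pulse shape, clock recovery reduces to partitioning the samples into sets of M can be illustrated with a small noise-free sketch. It is not the MAP estimator of Reference 8: the energy-based choice of the partition, the symbol alphabet (phases $\pi/4 + k\pi/2$) and all function names are assumptions made for the example, and the carrier phase is taken as already known.

```python
import numpy as np

QPSK = np.exp(1j * (np.pi / 4 + np.pi / 2 * np.arange(4)))   # assumed alphabet

def rect_qpsk(symbol_idx, m):
    """Rectangular-pulse QPSK: each symbol is held for m samples."""
    return np.repeat(QPSK[np.asarray(symbol_idx)], m)

def integrate_and_dump(samples, m, offset, phase=0.0):
    """Derotate by the carrier phase, then sum each set of m consecutive samples."""
    z = samples[offset:] * np.exp(-1j * phase)
    n_sym = z.size // m
    return z[: n_sym * m].reshape(n_sym, m).sum(axis=1)

def decide(stat):
    """Index (0..3) of the quadrant containing each decision statistic."""
    return np.floor((np.angle(stat) % (2 * np.pi)) / (np.pi / 2)).astype(int) % 4

def timing_metric(samples, m, offset):
    """Total energy of the group sums; largest when groups align with symbols."""
    return float(np.sum(np.abs(integrate_and_dump(samples, m, offset)) ** 2))

idx = [0, 1, 2, 3, 1, 3, 0, 2, 2, 0]
m, true_offset, theta = 3, 2, 0.3
rx = np.concatenate([np.zeros(true_offset), rect_qpsk(idx, m)]) * np.exp(1j * theta)
best_offset = max(range(m), key=lambda k: timing_metric(rx, m, k))
decisions = decide(integrate_and_dump(rx, m, best_offset, theta))
print(best_offset, decisions.tolist())
```

Once the samples are grouped correctly, the decision statistic does not depend on where the M samples fall within the symbol interval, which is why the residual degradation is governed by the carrier phase error.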

Thus the overall degradation of the bit error rate can be derived through the following equation [4, 9]:

$$\text{loss (dB)} = \frac{4.34}{\alpha}(1 + 2\Gamma)\left(1 + \frac{1 + 2\Gamma}{2\alpha}\right) \tag{18}$$

where $\Gamma$ is equal to the energy-per-bit to one-sided noise power density ratio $(E/N_0)$ and $\alpha$, for high values of the signal-to-noise ratio, is defined as

$$\alpha = 1/\sigma_{\theta}^2 \tag{19}$$

where $\sigma_{\theta}^2$ is the variance of the phase error. The phase error consists of two contributions that can be considered independent of each other, i.e. the error due to the algorithm (at infinite precision) and the additional error introduced by the finite precision implementation. The first contribution can be derived from the results reported in Reference 8. The errors introduced by the finite arithmetic implementation can be considered as independent, identically distributed (IID) random variables, with zero mean and variance $2^{-2b_e}/3$, where $b_e$ is the number of bits (including sign) used for the finite arithmetic implementation of the digital receiver.

As shown in the Appendix the error  $e_{\theta}$  introduced into the carrier phase estimate can be considered to be a random variable with zero mean and a variance which is overbounded by

$$\sigma_{e_{\theta}}^{2} = K_{\theta}^{2}\,8(L+1)M\,2^{-2b_e}/3 \tag{20}$$

From eqns. 18 and 20 and from Fig. 4 of Reference 8, the finite arithmetic word length  $b_e$  can be determined in order to achieve an overall degradation less than or equal to a specified value [9, 10].
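The procedure just described can be sketched as a small calculation: the bound of eqn. 20 gives the finite-precision phase variance, the two independent contributions add, and eqn. 18 converts the total into a loss. The value of $K_\theta$ (defined in the Appendix), the algorithmic variance (to be read from Reference 8) and $\Gamma$ are placeholders chosen only for illustration, and the function names are hypothetical.

```python
def phase_variance_bound(k_theta, l, m, b_e):
    """Upper bound of eqn. 20 on the finite-precision phase-error variance."""
    return k_theta**2 * 8 * (l + 1) * m * 2.0 ** (-2 * b_e) / 3.0

def loss_db(alpha, gamma):
    """Degradation of eqn. 18 for alpha = 1 / phase-error variance."""
    return 4.34 / alpha * (1 + 2 * gamma) * (1 + (1 + 2 * gamma) / (2 * alpha))

def total_loss_db(var_algorithm, var_finite, gamma):
    """Independent phase-error contributions add in variance (eqn. 19)."""
    return loss_db(1.0 / (var_algorithm + var_finite), gamma)

def minimum_wordlength(target_db, var_algorithm, k_theta, l, m, gamma, max_bits=24):
    """Smallest b_e whose total loss does not exceed target_db, or None."""
    for b_e in range(2, max_bits + 1):
        var_finite = phase_variance_bound(k_theta, l, m, b_e)
        if total_loss_db(var_algorithm, var_finite, gamma) <= target_db:
            return b_e
    return None

# Placeholder inputs: K_theta = 1, algorithmic variance 1e-5 and Gamma = 10 are hypothetical.
b_e = minimum_wordlength(0.05, 1e-5, k_theta=1.0, l=2, m=3, gamma=10.0)
print(b_e)
```

The search returns the shortest wordlength that meets the target for the assumed inputs; with the actual $K_\theta$ and the curves of Reference 8 the same loop would reproduce the design step described above.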

## 5 System design and performance

As a particular application, the case of a 10-channel FDMA/SCPC system is considered. A QPSK modulation scheme with a data-rate of R = 2048 kbit/s is used for data transmission. Independent carrier phase and symbol timing have been assumed for each channel. The channel spacing W has been selected in order to achieve the lowest implementation complexity of the MCD system and to guarantee an integral number of samples per symbol. Starting from the previous considerations, W has been selected equal to 3R/4, so that $f_d$ corresponds to 3 samples/symbol. Therefore, $f_u$ and $f_d$ are equal to 15R and 1.5R, respectively.

The demultiplexer design is first presented. The high-rate and low-rate lowpass prototypes have been designed as FIR linear-phase filters by the equiripple method [11]. These filters have been designed to have a stopband attenuation of at least 45 dB and an inband ripple not greater than 0.02 dB. These requirements result in the number of filter coefficients being equal to 47 and 23 for the high-rate lowpass prototype and low-rate lowpass prototype, respectively. FIR linear-phase filters have been chosen to avoid phase distortion and because they are suitable for the implementation of the sampling-frequency reduction process [2]. If a pulse-shaping filter is required it can be integrated, as outlined in Section 2, in the low-rate filters $G_i(fT_d)$, thereby decreasing the overall implementation complexity of the MCD. As an example we shall consider the case of a low-rate lowpass prototype designed to integrate a pulse-shaping filter with a 40% rolloff factor [3] shared equally between the transmitter and the receiver. This filter has been designed as a FIR linear-phase filter by using a modified version of the Parks-McClellan program [11]. The required number of filter coefficients is equal to 31 (instead of 23).

The overall number of multiplications required per channel and per second can be derived using eqn. 9; it amounts to 167.42 Mmultiplications/s/channel with the pulse-shaping filter included in the demultiplexer low-rate filters $G_i(fT_d)$. Without integrating the pulse-shaping filter in the low-rate stage of the demultiplexer, the figure is 142.85 Mmultiplications/s/channel. In that case, however, if the same pulse-shaping filter is used, it must be implemented in the demodulator and an additional 95.23 Mmultiplications/s/channel are required.
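The saving obtained by integrating the pulse-shaping filter can be made explicit with a few lines of arithmetic. The paper states that the figures above follow from eqn. 9; as a cross-check, and purely as an inference from the numbers rather than a formula given in the paper, they coincide with $f_d(N_H/2 + N_G)$ for the demultiplexer and $f_d N_G$ for a separate 31-coefficient pulse-shaping filter, where $N_H$ and $N_G$ are the prototype lengths. The function and variable names are hypothetical.

```python
R = 2048e3                 # data rate, bit/s
F_D = 1.5 * R              # low-rate sampling frequency f_d
N_H, N_G, N_G_SHAPED = 47, 23, 31   # prototype lengths quoted in the text

def demux_rate(n_h, n_g, f_d):
    """Inferred multiplication rate: f_d * (N_H / 2 + N_G), per channel."""
    return f_d * (n_h / 2.0 + n_g)

plain = demux_rate(N_H, N_G, F_D) / 1e6               # no pulse shaping in G
integrated = demux_rate(N_H, N_G_SHAPED, F_D) / 1e6   # pulse shaping inside G
separate_filter = N_G_SHAPED * F_D / 1e6              # pulse shaping in the demodulator

print(f"integrated: {integrated:.2f} M/s")
print(f"separate:   {plain + separate_filter:.2f} M/s")
print(f"saving:     {plain + separate_filter - integrated:.2f} M/s")
```

On these figures the integrated solution needs about 70 Mmultiplications/s/channel fewer than a demultiplexer followed by a separate pulse-shaping filter.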

The finite precision design of the demultiplexer for the application considered can be carried out following the procedure outlined in Section 3. With $b_q = 8$ bits at the A/D converter, we obtain a degradation of 0.042 dB at the input signal-to-noise ratio $S_i/N_i$ (= 15.5 dB) that guarantees a bit-error-rate of $10^{-9}$. This value has been derived from eqn. 16 by considering only the terms $S_i/N_i$ and $S_i/N_q$, i.e. the signal-to-quantisation noise ratio given by eqn. 11. The number of bits needed for the finite arithmetic implementation of the filter coefficients can be derived by computer rounding to meet the filtering specifications.

The other finite arithmetic wordlengths, derived according to eqn. 16, are given in Table 1; they actually introduce a degradation of 0.033 dB with respect to the signal-to-noise ratio at the A/D converter output.

Table 1: Finite arithmetic MCD design (analytic signal method)

| Input quantisation $b_q$ | $H_i(fT_u)$ $b_c$ | $H_i(fT_u)$ $b_m$ | $H_i(fT_u)$ $b_a$ | $G_i(fT_d)$ $b_c$ | $G_i(fT_d)$ $b_m$ | $G_i(fT_d)$ $b_a$ | Demodulator $b_e$ |
|--------------------------|-------------------|-------------------|-------------------|-------------------|-------------------|-------------------|-------------------|
| 8                        | 11                | 11                | 8                 | 11                | 11                | 8                 | 8                 |

$b_q$ = input signal wordlength

$b_c$ = filter coefficient wordlength

$b_m$ = filter arithmetic wordlength

$b_a$ = filter output wordlength

$b_e$ = joint carrier and clock recovery wordlength

In Fig. 4 the degradation in decibels in the output signal-to-noise ratio introduced by a finite arithmetic implementation with respect to the infinite precision design is shown as a function of the parameter $E/N_0$. It can be seen that there is a good agreement between the results derived by theoretical analysis (analytical results) and those obtained by computer simulation (simulation results). It should be borne in mind that the analytical results have been based on a worst-case analysis. A detailed description of the simulation algorithm employed is given in Reference 4.

Fig. 4 Performance degradation due to a finite precision implementation of the demultiplexer ($N_c = 10$, $R = 2048$ kbit/s)

The combined carrier and clock recovery implementation structure is given in Reference 8. The overall numbers of multiplications and additions per symbol depend on the number of symbols used in the estimator and on the number of samples per symbol, here M = 3. We have chosen L = 2 as a good tradeoff between implementation complexity and estimation accuracy. The resulting implementation complexity is given in Table 2.

Table 2: Overall MCD implementation complexity

|               | Mmultiplications/s/channel | Madditions/s/channel |
|---------------|----------------------------|----------------------|
| Demultiplexer | 142.85                     | 420.86               |
| Demodulator   | 43.01                      | 28.67                |
| Overall       | 185.86                     | 449.53               |
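The demodulator row of Table 2 can be reproduced from the expressions for $M_D$ and $S_D$ given in Section 4 with L = 2 and M = 3, using the QPSK symbol rate R/2. The following sketch shows the arithmetic; the function and variable names are hypothetical.

```python
def demod_ops_per_symbol(l, m):
    """Multiplications and additions per symbol (M_D and S_D of Section 4)."""
    mults = 2 * (l + 1) * (m + 4)
    adds = (l + 1) * (2 * m + 1) + 3 * l + 1
    return mults, adds

SYMBOL_RATE = 2048e3 / 2          # QPSK carries 2 bits per symbol
mults, adds = demod_ops_per_symbol(l=2, m=3)
demod_mults = mults * SYMBOL_RATE / 1e6    # Mmultiplications/s/channel
demod_adds = adds * SYMBOL_RATE / 1e6      # Madditions/s/channel

print(mults, adds)
print(f"{demod_mults:.2f} {demod_adds:.2f}")
print(f"{142.85 + demod_mults:.2f} {420.86 + demod_adds:.2f}")
```

The per-symbol counts are 42 multiplications and 28 additions, and the resulting rates agree with the demodulator and overall rows of Table 2 to the two decimals quoted there.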

The finite precision design of a MAP carrier and clock recovery circuit with integrated ML demodulation can be carried out following the procedure reported in Section 4. From eqn. 20 a finite arithmetic wordlength $b_e$ equal to 8 bits can be used to implement the combined carrier and clock recovery circuit and the integrated ML demodulator so as to introduce an overall degradation at the specified bit-error-rate $(10^{-9})$ of 0.03 dB.

The overall degradations due to the MCD implementation are given in Table 3, which also shows the resulting degradations at different bit-error-rates. From Tables 2 and 3 it can be seen that the proposed MCD system achieves a good performance together with an acceptable implementation complexity and is well suited to digital implementation onboard a satellite.

#### 6 Conclusions

In this paper a digital MCD system suitable for advanced satellite communications has been presented. The proposed MCD is formed from two parts: the demultiplexer and the coherent demodulator.

Table 3: Overall degradation of the bit-error-rate due to MCD implementation

| Bit-error-rate         | $10^{-4}$ | $10^{-6}$ | $10^{-9}$ |
|------------------------|-------|-------|------------------|
| Demultiplexer loss, dB | 0.016 | 0.026 | 0.042            |
| Demodulator loss, dB   | 0.011 | 0.018 | 0.029            |
| Overall loss, dB       | 0.027 | 0.044 | 0.071            |

The demultiplexer has been implemented according to the analytic signal approach, which leads to a per-channel structure that avoids the use of a block processor. It has the specific advantage of relaxing the necessary filter specifications, thus allowing a lower implementation complexity to be achieved with respect to other per-channel approaches. Further, the analytic signal approach is directly matched to the per-channel implementation of the demodulators. Therefore a certain degree of integration of the demultiplexer and demodulator functions can be obtained. Indeed, it has been shown that the required pulse-shaping filter can be integrated in the demultiplexer filters, thus lowering the overall system complexity.

Another advantage of the analytic signal approach is its high flexibility. Where a specific application would benefit from unequal channel bandwidths, the analytic signal structure, in contrast to other methods, can vary the bandwidth assigned to each channel on demand simply by switching to a suitable new set of demultiplexer parameters.

Coherent demodulation is used to reduce the $E/N_0$ value that guarantees the specified bit-error-rate. A MAP method is employed to estimate the carrier phase and bit timing jointly. An interesting feature is the integration of ML demodulation in the combined carrier and clock recovery circuit. The MCD design, including the finite precision implementation, has been carried out by considering the specific application of a QPSK modulation scheme with a data rate of 2048 kbit/s. However, the proposed MCD system can be adapted to different modulation schemes, for example MSK modulation.

In conclusion, the digital MCD system described in this paper represents an appropriate solution for interfacing FDM and TDM links in advanced digital communication systems and, in particular, it has been designed with a view to possible implementation by custom or semicustom VLSI digital circuits.

#### 7 Acknowledgments

The work reported in this paper was developed under European Space Agency Research Contract ESTEC 6096/84/NL/GM(SC).

The authors wish to acknowledge important discussions with Dr. G. Pennoni and Dr. W. Greiner of the European Space Agency throughout the contract. Special thanks are due also to Dr. P.L. Emiliani of IROE-CNR, Florence, for his valuable co-operation. Finally, ITAL-SPAZIO, Rome, is gratefully acknowledged for its support in the form of a fellowship given to one of the authors.

#### 8 References

1 BENELLI, G., DEL RE, E., FANTACCI, R., and MANDELLI, F.: 'Performance of uplink random-access and downlink TDMA techniques for packet satellite networks', Proc. IEEE, 1984, 72, (1), pp.

2 DEL RE, E., and EMILIANI, P.L.: 'An analytic signal approach to transmultiplexers: theory and design', IEEE Trans., 1982, COM-30, (7), pp. 1623-1628

3 BHARGAVA, V.K., HACCOUN, D., MATYAS, R., and NUSPL, P.P.: 'Digital communication by satellite' (Wiley, New York, 1984)

4 DEL RE, E., EMILIANI, P.L., FANTACCI, R., and PILONI, V.: 'Multicarrier demodulator design'. ESTEC Contract 6096/84/NL/GM(SC) Final Report, December 1986

5 BELLANGER, M.: 'Digital processing of signals: theory and practice' (Wiley, London, 1984)

6 VAN TREES, H.L.: 'Detection, estimation and modulation theory' (Wiley, New York, 1968)

7 BOOTH, R.W.: 'An illustration of the MAP estimation method for deriving closed-loop phase tracking topologies: the MSK signal structure', IEEE Trans., 1980, COM-28, pp. 1137-1142

8 DEL RE, E., and FANTACCI, R.: 'Joint carrier and clock recovery for QPSK and MSK digital communications', IEE Proc. I. Commun., Sound & Vision, 1989, 136, pp. 208-212

9 MATYAS, R.: 'Effect of noisy phase references on coherent detection of FFSK signals', IEEE Trans., 1978, COM-26, pp. 807-815

10 GARDNER, F.M.: 'Carrier and clock synchronisation for TDMA digital communications'. European Space Agency Report ESA TM-169 (ESTEC), December 1976

11 ASSP Digital Signal Processing Committee: 'Programs for digital signal processing' (IEEE Press, New York, 1979)

#### 9 Appendix

In this Appendix we derive eqn. 20 of Section 4. We denote by $b_e$ the finite precision wordlength (including the sign) used to implement the combined carrier and clock recovery circuit. The carrier phase-error signal is derived in Reference 8. Taking into account the rounding of multiplications to $b_e$ bits, it can be rewritten as

$$\frac{\partial \ln f(\mathbf{r} \mid \theta, \varepsilon)}{\partial \theta} = -\sum_{i=0}^{L} \tanh \{x_{i,1} + e_{i,1}\} \{x_{i,2} + e_{i,2}\} + \sum_{i=0}^{L} \tanh \{x_{i,3} + e_{i,3}\} \{x_{i,4} + e_{i,4}\} \tag{21}$$

where the terms  $x_{i,j}$ , j = 1, 2, 3, 4, denote the correct values of the respective quantities (see Reference 8). The terms  $e_{i,j}$ , j = 1, 2, 3, 4, represent the error introduced by rounding the multiplications to  $b_e$  bits. These terms can be considered as independent, identically distributed (IID) random variables with zero mean and variance  $\sigma^2$ 

$$\sigma^2 = 8M\,\frac{2^{-2b_e}}{3} \tag{22}$$
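For reference, the usual uniform rounding model can be checked numerically. The sketch below assumes a fixed-point value in $[-1, 1)$ represented with `bits` bits including the sign, so that the quantisation step is $2^{-(\text{bits}-1)}$ and a single rounding has variance $2^{-2\,\text{bits}}/3$. It covers one rounding only; the accumulation over the circuit of Reference 8 that eqn. 22 accounts for is not modelled, and the function names are illustrative.

```python
import random


def round_to_bits(x, bits):
    """Round x to a grid of step 2^-(bits-1), i.e. `bits` bits including sign.

    No saturation is applied, so x is assumed to lie well inside [-1, 1).
    """
    step = 2.0 ** -(bits - 1)
    return round(x / step) * step


def model_variance(bits):
    """Uniform rounding model: step^2 / 12 = 2^(-2*bits) / 3."""
    return 2.0 ** (-2 * bits) / 3.0


def rounding_error_stats(bits, trials=200_000, seed=1):
    """Estimate the mean and variance of the rounding error by simulation."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        x = rng.uniform(-0.9, 0.9)
        err = round_to_bits(x, bits) - x
        total += err
        total_sq += err * err
    mean = total / trials
    return mean, total_sq / trials - mean * mean


if __name__ == "__main__":
    mean, var = rounding_error_stats(8)
    print("simulated mean:", mean)
    print("simulated variance / model variance:", var / model_variance(8))
```

The simulated error is close to zero mean and its variance agrees with the model to within the Monte Carlo scatter, which supports treating the terms $e_{i,j}$ as zero-mean random variables.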

Then, through simple mathematical considerations, we can write

$$\tanh \{x_{i,1} + e_{i,1}\} \approx \tanh (x_{i,1}) + e_{i,1}/\cosh^{2} (x_{i,1}) \quad \text{for all } i \tag{23}$$

$$\tanh \{x_{i,3} + e_{i,3}\} \approx \tanh (x_{i,3}) + e_{i,3}/\cosh^{2} (x_{i,3}) \quad \text{for all } i \tag{24}$$

where we have assumed $|e_{i,j}| \ll 1$ for all i and j. Substituting eqns. 23 and 24 into eqn. 21 and retaining only the terms of first order in $e_{i,j}$, we obtain the error in the estimated carrier phase due to a finite precision implementation as

$$\begin{aligned} e_{\theta} = K_{\theta} \sum_{i=0}^{L} \{ & -\tanh(x_{i,1})e_{i,2} - [x_{i,2}e_{i,1}/\cosh^{2}(x_{i,1})] \\ & + \tanh(x_{i,3})e_{i,4} + [x_{i,4}e_{i,3}/\cosh^{2}(x_{i,3})] \} \end{aligned} \tag{25}$$
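The linearisation leading to eqn. 25 can be checked numerically for a single symbol. The sketch below compares the exact change in the summand of eqn. 21 caused by small errors with the first-order expression, using the fact that the derivative of $\tanh(x)$ is $1/\cosh^{2}(x)$. The gain $K_{\theta}$ is omitted, and the function names and test values are illustrative assumptions rather than quantities taken from Reference 8.

```python
import math


def phase_detector(x, e=(0.0, 0.0, 0.0, 0.0)):
    """Summand of eqn. 21 for one symbol, with optional rounding errors e."""
    x1, x2, x3, x4 = x
    e1, e2, e3, e4 = e
    return (-math.tanh(x1 + e1) * (x2 + e2)
            + math.tanh(x3 + e3) * (x4 + e4))


def first_order_error(x, e):
    """Summand of eqn. 25 (K_theta omitted): terms linear in the errors."""
    x1, x2, x3, x4 = x
    e1, e2, e3, e4 = e
    return (-math.tanh(x1) * e2 - x2 * e1 / math.cosh(x1) ** 2
            + math.tanh(x3) * e4 + x4 * e3 / math.cosh(x3) ** 2)


if __name__ == "__main__":
    x = (0.7, -0.4, -1.2, 0.9)       # arbitrary illustrative values
    e = (1e-3, -2e-3, 1.5e-3, 5e-4)  # small errors, |e| << 1
    exact = phase_detector(x, e) - phase_detector(x)
    print("exact error      :", exact)
    print("first-order error:", first_order_error(x, e))
```

The two values differ only by terms of second order in the errors, so the discrepancy shrinks roughly fourfold when all errors are halved.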

Thus the random variable $e_{\theta}$ has zero mean and a variance $\sigma_{e_{\theta}}^{2}$ given by

$$\sigma_{e_{\theta}}^{2} = K_{\theta}^{2} \sigma^{2} \sum_{i=0}^{L} \{ \tanh^{2}(x_{i,1}) + [x_{i,2}^{2}/\cosh^{4}(x_{i,1})] + \tanh^{2}(x_{i,3}) + [x_{i,4}^{2}/\cosh^{4}(x_{i,3})] \} \tag{26}$$

Thus, under a worst-case hypothesis, eqn. 26 can be upper-bounded by eqn. 20.
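The variance implied by eqn. 25 under the IID assumption can also be checked by a short Monte Carlo experiment. Since the errors are independent with variance $\sigma^2$, the variance of the sum equals $\sigma^2$ times the sum of the squared coefficients of eqn. 25, so the coefficients $x/\cosh^{2}$ contribute $x^{2}/\cosh^{4}$. The sketch below omits $K_{\theta}$, models the errors as uniform (an assumption), and uses arbitrary illustrative values for three symbols, i.e. i = 0, ..., L with L = 2.

```python
import math
import random


def coefficients(x):
    """Weights multiplying (e1, e2, e3, e4) in the summand of eqn. 25."""
    x1, x2, x3, x4 = x
    return (-x2 / math.cosh(x1) ** 2, -math.tanh(x1),
            x4 / math.cosh(x3) ** 2, math.tanh(x3))


def predicted_variance(xs, sigma2):
    """Variance of the summed first-order error for IID zero-mean errors."""
    return sigma2 * sum(c * c for x in xs for c in coefficients(x))


def simulated_variance(xs, sigma2, trials=50_000, seed=7):
    """Monte Carlo estimate with uniform errors of variance sigma2."""
    rng = random.Random(seed)
    half_width = math.sqrt(3.0 * sigma2)  # uniform on [-h, h]: variance h^2/3
    samples = []
    for _ in range(trials):
        total = 0.0
        for x in xs:
            for c in coefficients(x):
                total += c * rng.uniform(-half_width, half_width)
        samples.append(total)
    mean = sum(samples) / trials
    return sum((s - mean) ** 2 for s in samples) / trials


if __name__ == "__main__":
    xs = [(0.7, -0.4, -1.2, 0.9), (0.2, 1.1, 0.5, -0.3), (-0.9, 0.6, 1.4, 0.8)]
    sigma2 = 2.0 ** -16 / 3.0  # illustrative error variance
    ratio = simulated_variance(xs, sigma2) / predicted_variance(xs, sigma2)
    print("simulated / predicted variance:", ratio)
```

A ratio close to one indicates that the closed-form sum captures the error variance; a worst-case bound follows by replacing each bracketed term with its maximum over the admissible range of the $x_{i,j}$.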