Lossy Joint Source-Channel Coding in the Finite Blocklength Regime
Summary
I. INTRODUCTION
- In the limit of infinite blocklengths, the optimal achievable coding rates in channel coding and lossy data compression are characterized by the channel capacity C and the source rate-distortion function R(d), respectively [3].
- While computable formulas for these asymptotic fundamental limits are available for wide classes of sources and channels, they do not indicate how closely the limits can be approached at a given finite blocklength; this motivates non-asymptotic achievability and converse bounds.
- Such bounds were shown in [8] for the channel coding problem and in [9] for the source coding problem.
- The error exponent approximation and the Gaussian approximation to the non-asymptotic fundamental limit are tight in different operational regimes.
- Section II summarizes basic definitions and notation.
II. DEFINITIONS
- An (M, ε, α) code (i.e. a code with M codewords, average error probability ε, and cost α).
- The dispersion, which quantifies the rate penalty incurred by the best JSCC code at finite blocklengths, is defined as follows.
Definition 4. Fix α and d ≥ d_min. The rate-dispersion function of joint source-channel coding (source samples squared per channel use) is defined as

$$ V(d, \alpha) = \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{n \left( \frac{C(\alpha)}{R(d)} - R(n, d, \epsilon, \alpha) \right)^2}{2 \log_e \frac{1}{\epsilon}} \qquad (11) $$

where C(α) and R(d) are the channel capacity-cost and source rate-distortion functions, respectively.
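Rearranging this definition exhibits the approximation it encodes. Since $(Q^{-1}(\epsilon))^2 \sim 2 \log_e \frac{1}{\epsilon}$ as ε → 0, where $Q^{-1}$ is the inverse of the Gaussian complementary cdf, the definition corresponds to the second-order sketch

$$ R(n, d, \epsilon, \alpha) \approx \frac{C(\alpha)}{R(d)} - \sqrt{\frac{V(d, \alpha)}{n}}\, Q^{-1}(\epsilon), $$

which is the form made precise by the Gaussian approximation of Theorem 10 referenced below.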
- The following properties of d-tilted information, proven in [19], are used in the sequel.
- The authors use the same notation ı S;Z for that more general function.
- All results in those sections generalize to the case of a maximal cost constraint by considering X whose distribution is supported on the subset of allowable channel inputs, $\{ x \in \mathcal{X} \colon \mathrm{b}(x) \le \alpha \}$ (where b(·) denotes the channel input cost function), rather than on the entire channel input alphabet $\mathcal{X}$.
Theorem 1 (Converse).
- The authors write summations over alphabets for simplicity.
- Unless stated otherwise, all their results hold for abstract probability spaces.
- To obtain a code-independent converse, the authors simply choose the $P_{X|S}$ that gives the weakest bound, and (23) follows.
B. Converses via hypothesis testing and list decoding
- While traditionally list decoding has only been considered in the context of finite alphabet sources, the authors generalize the setting to sources with abstract alphabets.
- Even though the authors keep the standard "list" terminology, the decoder output need not be a finite or countably infinite set.
Definition 7 (List code).
- Any converse for list decoding implies a converse for conventional decoding.
- The hypothesis testing converse for channel coding [8, Theorem 27] can be generalized to joint source-channel coding with list decoding as follows.
- Note that this is a hypothetical test, which has access to both the source outcome and the decoder output.
- The Neyman-Pearson lemma (e.g. [20]) implies that the outcome of the optimum binary hypothesis test between P and Q depends on the observation only through the likelihood ratio dP/dQ.
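For finite alphabets, the optimum test can be computed explicitly by sorting outcomes according to the likelihood ratio and randomizing at the boundary. The sketch below (the helper name and interface are ours, not the paper's) computes the minimal type-II error β_α(P, Q) that underlies hypothesis-testing converses:

```python
import numpy as np

def beta_alpha(P, Q, alpha):
    """Neyman-Pearson: minimal Q-probability of deciding 'P' over all
    (randomized) tests whose P-probability of deciding 'P' is >= alpha.
    P, Q: 1-D arrays of probabilities on a common finite alphabet."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    # Sort outcomes by decreasing likelihood ratio P/Q (infinite where Q = 0).
    ratio = np.where(Q > 0, P / np.maximum(Q, 1e-300), np.inf)
    order = np.argsort(-ratio)
    p_acc = q_acc = 0.0
    for i in order:
        if p_acc + P[i] < alpha:
            p_acc += P[i]   # include this outcome in the 'decide P' region
            q_acc += Q[i]
        else:
            # Randomize on the boundary outcome so P[decide P] hits alpha.
            frac = (alpha - p_acc) / P[i] if P[i] > 0 else 0.0
            return q_acc + frac * Q[i]
    return q_acc

# Example on a binary alphabet:
print(beta_alpha([0.9, 0.1], [0.5, 0.5], alpha=0.99))  # -> 0.95
```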
- In the case of finite channel input and output alphabets, the channel symmetry assumption of Theorem 5 holds, in particular, if the rows of the channel transition probability matrix are permutations of each other and $P_{\bar{Y}^n}$ is the equiprobable distribution on the (n-dimensional) channel output alphabet, which, coincidentally, is also the capacity-achieving output distribution.
IV. ACHIEVABILITY
- If both the source code and the channel code are chosen optimally for their given sizes, the separation principle guarantees that, under certain quite general conditions (which encompass the memoryless setting, see [21]), the asymptotic fundamental limit of joint source-channel coding is achievable.
- In the finite blocklength regime, however, such a separate source-channel coding (SSCC) construction is, in general, suboptimal.
- The dispersion achieved by the conventional SSCC approach is in fact suboptimal.
- At finite n, the output of the optimum source encoder need not be nearly equiprobable, so there is no reason to expect that a separated scheme employing a maximum-likelihood channel decoder, which does not exploit unequal message probabilities, would achieve near-optimal non-asymptotic performance.
- The following achievability result, obtained using independent random source codes and random channel codes within the paradigm of Definition 8, capitalizes on this intuition.
Theorem 7 (Achievability). There exists a (d, ε) source-channel code satisfying a bound whose explicit form is not reproduced in this summary.
- The authors will construct a code with separate encoders for source and channel and separate decoders for source and channel as in Definition 8.
- The authors now proceed to analyze the performance of the code described above.
- The authors now average (90) over the source and channel codebooks.
- The code size M that leads to tight achievability bounds following from Theorem 7 is in general much larger than the size that achieves the minimum in (81).
Theorem 8 (Achievability).
- Theorem 9 (Achievability, almost-lossless JSCC [17]).
- The technical condition (iv) ensures applicability of the Gaussian approximation in the following result.
VI. LOSSY TRANSMISSION OF A BMS OVER A BSC
- The rate-distortion function of the source and the channel capacity are given by, respectively, R(d) = h(p) − h(d) for 0 ≤ d ≤ min(p, 1 − p), and C = log 2 − h(δ), where h(·) is the binary entropy function, p is the source bias, and δ is the channel crossover probability.
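A quick numerical evaluation of these two formulas, in bits (the helper names are ours):

```python
import math

def h(x):
    """Binary entropy function in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def rate_distortion_bms(p, d):
    """R(d) = h(p) - h(d) for a BMS with bias p under bit-error-rate
    distortion, valid for 0 <= d <= min(p, 1 - p)."""
    return h(p) - h(d)

def capacity_bsc(delta):
    """C = 1 - h(delta) bits per channel use for a BSC(delta)."""
    return 1.0 - h(delta)

# Illustrative numbers only: a biased source over a noisy channel.
print(rate_distortion_bms(p=0.11, d=0.05), capacity_bsc(delta=0.11))
```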
- For convenience, the authors define a discrete random variable $U_{\alpha,\beta}$ and a binomial-sum notation; the defining equations are not reproduced in this summary. A straightforward particularization of the d-tilted information converse in Theorem 2 leads to the following result.
- The hypothesis-testing converse in Theorem 4 particularizes to the following result: Theorem 12 (Converse, BMS-BSC).
- If the source is equiprobable, the bound in Theorem 12 becomes particularly simple, as the following result details.
Theorem 14 (Achievability, BMS-BSC).
- A source of fair coin flips has zero dispersion (a short sketch of why follows this group of bullets), and, as anticipated in Remark 8, JSCC does not afford much gain in the finite blocklength regime (Fig. 5).
- Moreover, in that case the JSCC achievability bound in Theorem 8 is worse than the SSCC achievability bound.
- The situation is different if the source is biased, with JSCC showing significant gain over SSCC.
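A short sketch of the zero-dispersion claim for the fair coin, using the BMS d-tilted information formula $\jmath_S(s, d) = \imath_S(s) - h(d)$: for the equiprobable source, $\imath_S(s) = \log 2$ for every outcome s, so $\jmath_S(S, d)$ is the constant $\log 2 - h(d) = R(d)$ and

$$ \mathrm{Var}\left[\jmath_S(S, d)\right] = \mathrm{Var}\left[\imath_S(S)\right] = 0 . $$

The source therefore contributes nothing to the dispersion, and the finite-blocklength penalty is governed by the channel alone.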
VIII. TO CODE OR NOT TO CODE
- The authors' goal in this section is to compare the excess-distortion performance of the optimal code of rate 1 at channel blocklength n with that of the optimal symbol-by-symbol code, evaluated after n channel uses, leveraging the bounds in Sections III and IV and the approximation in Section V.
- The authors show certain examples in which symbol-by-symbol coding is, in fact, either optimal or very close to being optimal.
- A general conclusion drawn from this section is that even when uncoded transmission is asymptotically suboptimal, it can be a very attractive choice at short blocklengths [2].
Definition 10. The distortion-dispersion function of symbol-by-symbol joint source-channel coding is defined as

$$ W(\alpha) = \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{n \left( D(n, n, \epsilon, \alpha) - D(C(\alpha)) \right)^2}{2 \log_e \frac{1}{\epsilon}} $$

where D(·) is the distortion-rate function of the source.
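As with the rate-dispersion function, rearranging this definition (again using $(Q^{-1}(\epsilon))^2 \sim 2 \log_e \frac{1}{\epsilon}$ as ε → 0) gives the distortion-domain sketch it encodes for rate-1 codes:

$$ D(n, n, \epsilon, \alpha) \approx D(C(\alpha)) + \sqrt{\frac{W(\alpha)}{n}}\, Q^{-1}(\epsilon). $$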
- Condition (v) ensures that symbol-by-symbol transmission attains the minimum average (over source realizations) distortion achievable among all codes of any blocklength.
- The following results pertain to the full distribution of the distortion incurred at the receiver output and not just its mean.
Theorem 20 (Achievability, symbol-by-symbol code). Under the stated restrictions:
- If (v) holds, then there exist a symbol-by-symbol encoder and decoder such that the conditional distribution of the decoder output given the source outcome coincides with the distribution $P_{Z^\star|S}$, so the excess-distortion probability of this symbol-by-symbol code is given by the left side of (189).
- (The explicit expressions of the theorem, and their simpler form in the absence of a power constraint, are not reproduced in this summary; θ(n) is the remainder term of Theorem 10.)
- In other words, not only do such symbol-by-symbol codes attain the minimum average distortion but also the variance of distortions at the decoder's output is the minimum achievable among all codes operating at that average distortion.
B. Uncoded transmission of a BMS over a BSC
- If the encoder and the decoder are both identity mappings (uncoded transmission), the resulting joint distribution satisfies condition (v).
- As is well known, regardless of the blocklength, the uncoded symbol-by-symbol scheme achieves the minimum bit error rate (averaged over source and channel).
- Here, the authors are interested instead in examining the excess distortion probability criterion.
- Consider an application where, if the fraction of erroneously received bits exceeds a certain threshold, then the entire output packet is useless.
- Moreover, uncoded transmission attains the minimum bit error rate threshold D(n, n, ε) achievable among all codes operating at blocklength n, regardless of the allowed ε, as the following result demonstrates.
It achieves, at blocklength n and excess-distortion probability ε, the minimum distortion threshold D(n, n, ε).
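Under uncoded transmission of a binary source over a BSC(δ), each of the n bits is flipped independently with probability δ, so the excess-distortion probability at threshold d is simply a binomial tail. A small sketch (helper name ours):

```python
from math import comb

def uncoded_excess_distortion(n, delta, d):
    """P[fraction of flipped bits > d] for uncoded transmission of a
    binary source over a BSC(delta): a Binomial(n, delta) upper tail."""
    t = int(n * d)  # largest acceptable number of flipped bits
    return 1.0 - sum(comb(n, j) * delta**j * (1.0 - delta)**(n - j)
                     for j in range(t + 1))

# Example: n = 100 channel uses, crossover 0.11, distortion threshold 0.15.
print(uncoded_excess_distortion(n=100, delta=0.11, d=0.15))
```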
- For the transmission of the fair binary source over a BSC, Fig. 8 shows the distortion achieved by the uncoded scheme, the separated scheme, and the JSCC scheme of Theorem 14 versus n for a fixed excess-distortion probability ε = 0.01.
- The no coding / converse curve in Fig. 8 depicts one of those singular cases where the non-asymptotic fundamental limit can be computed precisely.
- As the blocklength increases, the performance of the separated scheme approaches that of the no-coding scheme, but according to Theorem 23 it can never outperform it.
- Had the authors allowed the excess distortion probability to vanish sufficiently slowly, the JSCC curve would have approached the Shannon limit as n → ∞.
- Nevertheless, uncoded transmission performs remarkably well in the displayed range of blocklengths, achieving the converse almost exactly at blocklengths less than 100, and outperforming the JSCC achievability result in Theorem 14 at blocklengths as long as 700.
C. Symbol-by-symbol coding for lossy transmission of a GMS over an AWGN channel
- The next result characterizes the distribution of the distortion incurred by the symbol-by-symbol scheme that attains the minimum average distortion.
- On the other hand, using (130), the authors compute the corresponding quantity in closed form (the expression is not reproduced in this summary).
- Indeed, in the range of blocklengths displayed in Fig. 11, the symbol-by-symbol code even outperforms the converse for codes operating under a maximal power constraint.
E. Symbol-by-symbol transmission of a DMS over a DEC under logarithmic loss
- Curiously, for any 0 ≤ d ≤ H(S), the rate-distortion function and the d-tilted information are given respectively by (213) and (214), even if the source is not equiprobable.
- In fact, the rate-distortion function is achieved by an explicit conditional distribution (not reproduced in this summary), and the channel matched to the equiprobable source under logarithmic loss is exactly the DEC in (215).
- Finally, it is easy to verify that the distortion-dispersion function of symbol-by-symbol coding under logarithmic loss is the same as that under erasure distortion and is given by (216).
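A brief reasoning sketch (a standard argument, not reproduced from the paper) of why R(d) = H(S) − d under logarithmic loss, where a reproduction z is a probability distribution on the source alphabet and d(s, z) = log(1/z(s)): for any $P_{Z|S}$ with E[d(S, Z)] ≤ d,

$$ I(S; Z) = H(S) - H(S|Z) \ge H(S) - \mathbb{E}\left[ \log \frac{1}{Z(S)} \right] \ge H(S) - d, $$

with equality when the reproduction equals the posterior distribution of S given Z, so R(d) = H(S) − d.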
IX. CONCLUSION
- The paper presented new achievability and converse bounds, which hold in wide generality and are tight enough to determine the dispersion of joint source-channel coding for the transmission of an abstract memoryless source over either a DMC or a Gaussian channel, under an arbitrary fidelity measure.
- The major results and conclusions are the following.
- 6) For the transmission of a stationary memoryless source over a stationary memoryless channel, the Gaussian approximation in Theorem 10 (neglecting the remainder θ(n)) provides a simple estimate of the maximal non-asymptotically achievable joint source-channel coding rate (a numerical sketch follows this list).
- 8) Even in the absence of a probabilistic match between the source and the channel, symbol-by-symbol transmission, though asymptotically suboptimal, might outperform separate source-channel coding and joint source-channel random coding in the finite blocklength regime.
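As promised in item 6), a numerical sketch of that Gaussian-approximation estimate; the function name, the use of SciPy, and the example numbers are ours, not the paper's:

```python
from scipy.stats import norm

def jscc_rate_estimate(n, C, Rd, V, eps):
    """Gaussian-approximation estimate of the maximal JSCC rate
    (source samples per channel use), neglecting the remainder theta(n):
    R(n, d, eps) ~ C/R(d) - sqrt(V/n) * Qinv(eps),
    with V the rate-dispersion function of Definition 4."""
    return C / Rd - (V / n) ** 0.5 * norm.isf(eps)  # norm.isf = Q^{-1}

# Illustrative numbers only: C = 0.5 bit/ch. use, R(d) = 0.25 bit/sample,
# V = 0.3, excess-distortion probability eps = 0.01.
for n in (100, 500, 2000):
    print(n, round(jscc_rate_estimate(n, C=0.5, Rd=0.25, V=0.3, eps=0.01), 4))
```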
- The authors are grateful to Dr. Oliver Kosut for offering numerous comments, and, in particular, suggesting the simplification of the achievability bound in [1] with the tighter version in Theorem 8.
Frequently Asked Questions
Q2. What is the reason for the lower achievable dispersion in this case?
The reason for the possibly lower achievable dispersion in this case is that the encoder is free to map the unlikely source realizations that lead to a high probability of failure to the codewords with maximum variance, so as to increase the probability that the channel output escapes the decoding failure region.
Q3. What is the rate-dispersion function of a joint source-channel coding?
The rate-dispersion function of joint source-channel coding (source samples squared per channel use) is defined as

$$ V(d, \alpha) = \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{n \left( \frac{C(\alpha)}{R(d)} - R(n, d, \epsilon, \alpha) \right)^2}{2 \log_e \frac{1}{\epsilon}} \qquad (11) $$

where C(α) and R(d) are the channel capacity-cost and source rate-distortion functions, respectively.
Q4. What is the distortion-dispersion function of a joint source-channel coding?
The distortion-dispersion function of joint source-channel coding is defined as

$$ W(R, \alpha) = \lim_{\epsilon \to 0} \limsup_{n \to \infty} \frac{n \left( D\!\left( \frac{C(\alpha)}{R} \right) - D(nR, n, \epsilon, \alpha) \right)^2}{2 \log_e \frac{1}{\epsilon}} \qquad (12) $$

where D(·) is the distortion-rate function of the source.
Q5. What is the possible bound for a given encoder?
Optimizing over γ, T, and the distributions of the auxiliary random variables Ȳ and W, the authors obtain the best possible bound for a given encoder $P_{X|S}$.
Q6. What is the maximum error probability of a source-channel code?
The output of the optimum source encoder is, for large k, approximately equiprobable over a set of roughly exp(kR(d)) elements. As the average error probability cannot exceed the maximal (over source outputs) error probability, the maximal error probability achievability bounds of [8] apply to bound ε⋆(M).
Q7. What is the erasure distortion measure for a discrete source?
For a discrete source, the single-letter erasure distortion measure is defined as the mapping $d \colon \mathcal{S} \times (\mathcal{S} \cup \{e\}) \to [0, \infty]$:

$$ d(s, z) = \begin{cases} 0, & z = s \\ H(S), & z = e \\ \infty, & \text{otherwise} \end{cases} \qquad (211) $$

For any 0 ≤ d ≤ H(S), the rate-distortion function of the equiprobable source is achieved by

$$ P_{Z^\star|S=s}(z) = \begin{cases} 1 - \frac{d}{H(S)}, & z = s \\ \frac{d}{H(S)}, & z = e \end{cases} \qquad (212) $$

The rate-distortion function and the d-tilted information for the equiprobable source with the erasure distortion measure are given by, respectively,

$$ R(d) = H(S) - d \qquad (213) $$

$$ \jmath_S(s, d) = \imath_S(s) - d \qquad (214) $$

Note that, trivially, $\jmath_S(S, d) = R(d) = \log |\mathcal{S}| - d$ a.s.
Q8. What is the achievability bound for a single code?
Since the distortion cdf of any single code does not majorize the cdfs of all possible codes, the converse bound on the average distortion obtained through this approach, although asymptotically tight, may be loose at short blocklengths.
Q9. What is the effect of the converse result in Theorem 5?
3) As evidenced by their numerical results, the converse result in Theorem 5, which applies to those channels satisfying a certain symmetry condition and which is a consequence of the hypothesis testing converse in Theorem 4, can outperform the d-tilted information converse in Theorem 3.
Q10. What is the probability of a channel code being distinguished?
From the channel coding theorem the authors know that there exists a channel code that is capable of distinguishing, with high probability, M = exp (kR(d)) < exp (nC) messages when equipped with the maximum likelihood decoder.
Q11. What is the definition of a lossy source-channel code?
In the absence of an input cost constraint, the authors simplify the terminology and refer to the code as a (d, ε) lossy source-channel code.
Q12. What is the error probability with a list decoder?
The error probability with this type of list decoding is the probability that the source outcome S does not belong to the decoder output list for Y:

$$ 1 - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \sum_{\tilde{s} \in \mathcal{M}(L)} \sum_{s \in \tilde{s}} P_{\tilde{S}|Y}(\tilde{s}\,|\,y)\, P_{Y|X}(y\,|\,x)\, P_{X|S}(x\,|\,s)\, P_S(s) \qquad (54) $$

where $\mathcal{M}(L)$ is the set of all $Q_S$-measurable subsets of $\mathcal{M}$ with $Q_S$-measure not exceeding L.
Q13. what is the optimum binary hypothesis test?
In particular, the optimum binary hypothesis test $W^\star$ for deciding between $P_S P_{X|S} P_{Y|X}$ and $Q_S P_{X|S} P_{\bar{Y}}$ depends on the observation only through the pair $(S, \imath_{X;\bar{Y}}(X; Y))$, i.e., it satisfies the Markov chain $W^\star - (S, \imath_{X;\bar{Y}}(X; Y)) - (S, X, Y)$.