# Papers in *IEEE Transactions on Information Theory*, 1974


6,667 citations


TL;DR: The general problem of estimating the a posteriori probabilities of the states and transitions of a Markov source observed through a discrete memoryless channel is considered and an optimal decoding algorithm is derived.

Abstract: The general problem of estimating the a posteriori probabilities of the states and transitions of a Markov source observed through a discrete memoryless channel is considered. The decoding of linear block and convolutional codes to minimize symbol error probability is shown to be a special case of this problem. An optimal decoding algorithm is derived.
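The optimal algorithm derived in this paper is the forward-backward recursion now known as the BCJR algorithm. A minimal sketch of that recursion, assuming an invented two-state Markov source observed through a binary symmetric channel (all numbers below are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative setup: a two-state Markov source observed through a
# binary symmetric channel (a discrete memoryless channel).
A = np.array([[0.9, 0.1],        # A[i, j] = P(next state j | state i)
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])        # initial state distribution
eps = 0.1                        # channel crossover probability
# B[s, y] = P(observe y | source emits symbol s); here state == symbol.
B = np.array([[1 - eps, eps],
              [eps, 1 - eps]])

def state_posteriors(obs):
    """Forward-backward recursion for P(state_t | all observations)."""
    T, S = len(obs), len(pi)
    alpha = np.zeros((T, S))     # forward:  P(y_1..y_t, state_t)
    beta = np.zeros((T, S))      # backward: P(y_{t+1}..y_T | state_t)
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta         # proportional to the state posteriors
    return gamma / gamma.sum(axis=1, keepdims=True)

post = state_posteriors([0, 0, 1, 1, 1])
print(post)                      # each row sums to 1
```

Each row of the returned array is the a posteriori distribution over states at one time instant; transition posteriors follow from the same alpha/beta quantities.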

4,830 citations


TL;DR: Estimation of the parameters of a single-frequency complex tone from a finite number of noisy discrete-time observations is discussed, and the appropriate Cramér-Rao bounds and maximum-likelihood estimation algorithms are derived.

Abstract: Estimation of the parameters of a single-frequency complex tone from a finite number of noisy discrete-time observations is discussed. The appropriate Cramér-Rao bounds and maximum-likelihood (ML) estimation algorithms are derived. Some properties of the ML estimators are proved. The relationship of ML estimation to the discrete Fourier transform is exploited to obtain practical algorithms. The threshold effect of one algorithm is analyzed and compared to simulation results. Other simulation results verify other aspects of the analysis.
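The DFT connection exploited in the paper can be sketched numerically: above threshold, the coarse maximum-likelihood frequency estimate is the location of the peak of the zero-padded periodogram. The frequency, SNR, and padding factor below are arbitrary choices for the illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, f_true = 256, 0.1328          # samples; true frequency in cycles/sample
snr_lin = 10.0                   # illustrative per-sample SNR (linear)
n = np.arange(N)
tone = np.exp(2j * np.pi * f_true * n)
noise = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2 * snr_lin)
x = tone + noise

# Coarse ML estimate: frequency maximizing |DFT|^2 over a zero-padded grid.
pad = 8 * N                      # zero-padding refines the search grid
spec = np.abs(np.fft.fft(x, pad))
f_hat = np.argmax(spec) / pad    # estimate in cycles/sample

print(abs(f_hat - f_true))       # small at this SNR: grid error plus noise
```

A finer estimate would interpolate around the peak; the paper's threshold effect appears when the SNR drops low enough that a noise bin occasionally beats the signal bin.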

1,878 citations


TL;DR: This correspondence establishes lower bounds on how small the cross correlation and autocorrelation can simultaneously be.

Abstract: Some communication systems require sets of signals with impulse-like autocorrelation functions and small cross correlation. There is considerable literature on signals with impulse-like autocorrelation functions but little on sets of signals with small cross correlation. A possible reason is that designers put too severe a restriction on cross correlation magnitudes. This correspondence establishes lower bounds on how small the cross correlation and autocorrelation can simultaneously be.
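These simultaneous bounds quantify a trade-off: driving the out-of-phase autocorrelation toward zero forces the peak cross correlation up. The sketch below evaluates one such bound; the specific inequality used, theta_c^2/N + [(N-1)/(N(K-1))] * theta_a^2/N >= 1 for K sequences of period N, is the form commonly credited to this correspondence, but treat its exact constants as an assumption of the example.

```python
import numpy as np

def min_crosscorr(N, K, theta_a):
    """Lower bound on the peak cross-correlation theta_c implied by
        theta_c^2/N + (N-1)/(N(K-1)) * theta_a^2/N >= 1
    for K sequences of period N with peak out-of-phase autocorrelation
    theta_a (returns 0 if the bound is vacuous)."""
    rhs = 1.0 - (N - 1) / (N * (K - 1)) * theta_a**2 / N
    return np.sqrt(max(rhs, 0.0) * N)

# Trade-off: perfectly impulse-like autocorrelation (theta_a = 0) forces
# theta_c >= sqrt(N); allowing larger theta_a relaxes the requirement.
for theta_a in (0.0, 4.0, 8.0):
    print(theta_a, round(min_crosscorr(63, 10, theta_a), 2))
```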

1,451 citations


TL;DR: This investigation considered a class of distortion measures for which it is possible to simulate the optimum (in a rate-distortion sense) encoding, and found that one distortion measure was fairly consistently rated as yielding the most satisfactory-appearing encoded images.

Abstract: Shannon's rate-distortion function provides a potentially useful lower bound against which to compare the rate-versus-distortion performance of practical encoding-transmission systems. However, this bound is not applicable unless one can arrive at a numerically-valued measure of distortion which is in reasonable correspondence with the subjective evaluation of the observer or interpreter. We have attempted to investigate this choice of distortion measure for monochrome still images. This investigation has considered a class of distortion measures for which it is possible to simulate the optimum (in a rate-distortion sense) encoding. Such simulation was performed at a fixed rate for various measures in the class and the results compared subjectively by observers. For several choices of transmission rate and original images, one distortion measure was fairly consistently rated as yielding the most satisfactory-appearing encoded images.

1,188 citations


TL;DR: Developments in the theory of linear least-squares estimation in the last thirty years or so are outlined, with particular attention paid to early mathematical work in the field and to more modern developments showing some of the many connections between least-squares filtering and other fields.

Abstract: Developments in the theory of linear least-squares estimation in the last thirty years or so are outlined. Particular attention is paid to early mathematical work in the field and to more modern developments showing some of the many connections between least-squares filtering and other fields.

696 citations


TL;DR: The first part of this paper consists of short summaries of recent work in five rather traditional areas of the Shannon theory, including source and channel coding theorems for new situations and calculation of source rate and channel capacity.

Abstract: The first part of this paper consists of short summaries of recent work in five rather traditional areas of the Shannon theory, namely: 1) source and channel coding theorems for new situations; 2) calculation of source rate and channel capacity; 3) channel coding with feedback; 4) source coding; 5) universal coding. The second part of the paper consists of a relatively detailed discussion of some aspects of the area that the author considers to be the most dynamic and exciting in the Shannon theory: multiple-user communication. The discussion here includes "multiple-access channels," "broadcast channels," and various source coding problems with multiple-user constraints.

662 citations


TL;DR: A simple converse is established showing the optimality of sets of achievable rates for the additive white Gaussian noise broadcast channel, found by Cover and generalized by Bergmans.

Abstract: Sets of achievable rates for the additive white Gaussian noise broadcast channel have been found by Cover [1] for channels with two outputs and generalized by Bergmans [2] to channels with any number of outputs. In this correspondence, we establish a simple converse showing the optimality of these sets of achievable rates. The proof is made simple by use of special properties of the Gaussian channel.

505 citations


TL;DR: The testing of binary hypotheses is developed from an information-theoretic point of view, and the asymptotic performance of optimum hypothesis testers is developed in exact analogy to the asymptotic performance of optimum channel codes.

Abstract: The testing of binary hypotheses is developed from an information-theoretic point of view, and the asymptotic performance of optimum hypothesis testers is developed in exact analogy to the asymptotic performance of optimum channel codes. The discrimination, introduced by Kullback, is developed in a role analogous to that of mutual information in channel coding theory. Based on the discrimination, an error-exponent function e(r) is defined. This function is found to describe the behavior of optimum hypothesis testers asymptotically with block length. Next, mutual information is introduced as a minimum of a set of discriminations. This approach has later coding significance. The channel reliability-rate function E(R) is defined in terms of discrimination, and a number of its mathematical properties developed. Sphere-packing-like bounds are developed in a relatively straightforward and intuitive manner by relating e(r) and E (R) . This ties together the aforementioned developments and gives a lower bound in terms of a hypothesis testing model. The result is valid for discrete or continuous probability distributions. The discrimination function is also used to define a source code reliability-rate function. This function allows a simpler proof of the source coding theorem and also bounds the code performance as a function of block length, thereby providing the source coding analog of E (R) .

358 citations


TL;DR: Articles, books, and technical reports on the theoretical and experimental estimation of probability of misclassification are listed for the case of correctly labeled or preclassified training data.

Abstract: Articles, books, and technical reports on the theoretical and experimental estimation of probability of misclassification are listed for the case of correctly labeled or preclassified training data. By way of introduction, the problem of estimating the probability of misclassification is discussed in order to characterize the contributions of the literature.

325 citations


TL;DR: A feedback decision scheme is proposed in which a punctured codeword is initially transmitted and if an uncorrectable error is detected, the receiver signals the transmitter to send another increment of redundancy.

Abstract: A feedback decision scheme is proposed in which a punctured codeword is initially transmitted. If an uncorrectable error is detected, the receiver signals the transmitter to send another increment of redundancy. This procedure is continued if the aggregated word is still uncorrectable.


TL;DR: This paper selectively surveys contributions to major topics in pattern recognition since 1968, including contributions to error estimation and the experimental design of pattern classifiers.

Abstract: This paper selectively surveys contributions to major topics in pattern recognition since 1968. Representative books and surveys of pattern recognition published during this period are listed. Theoretical models for automatic pattern recognition are contrasted with practical design methodology. Research contributions to statistical and structural pattern recognition are selectively discussed, including contributions to error estimation and the experimental design of pattern classifiers. The survey concludes with a representative set of applications of pattern recognition technology.


TL;DR: Lower bounds on the out-of-phase autocorrelation and on the cross correlation of sequences of given length and alphabet size are derived, and a method of constructing families of sequences that uniformly realize these bounds is presented.

Abstract: The unnormalized Hamming correlation between two sequences of equal length is the number of positions in which these sequences have identical symbols. In this paper, lower bounds on the out-of-phase autocorrelation and on the cross correlation of sequences of given length and alphabet size are derived. A method of constructing families of sequences that uniformly realize these bounds is presented.
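The unnormalized Hamming correlation defined above is straightforward to compute directly. A small sketch with arbitrary example sequences over a three-letter alphabet (not a construction from the paper):

```python
def hamming_corr(x, y, shift):
    """Unnormalized Hamming correlation: number of positions where x
    and the cyclically shifted y carry identical symbols."""
    n = len(x)
    return sum(x[i] == y[(i + shift) % n] for i in range(n))

x = [0, 1, 2, 1, 0, 2, 2]    # illustrative period-7 hopping sequences
y = [2, 0, 1, 1, 2, 0, 1]

# Out-of-phase autocorrelation peak and cross-correlation peak.
peak_auto = max(hamming_corr(x, x, s) for s in range(1, len(x)))
peak_cross = max(hamming_corr(x, y, s) for s in range(len(x)))
print(peak_auto, peak_cross)  # peak_auto=3, peak_cross=4 for these sequences
```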


TL;DR: A new tracking filter is developed that incorporates, in an a posteriori statistical fashion, all data available from sensor reports located in the vicinity of the track, and that provides both optimal performance and reliable estimates of this performance when operating in dense environments.

Abstract: When tracking targets in dense environments, sensor reports originating from sources other than the target being tracked (i.e., from clutter, thermal false alarms, other targets) are occasionally incorrectly used in track updating. As a result tracking performance degrades, and the error covariance matrix calculated on-line by the usual types of tracking filters becomes extremely unreliable for estimating actual accuracies. This paper makes three contributions in this area. First, a new tracking filter is developed that incorporates, in an a posteriori statistical fashion, all data available from sensor reports located in the vicinity of the track, and that provides both optimal performance and reliable estimates of this performance when operating in dense environments. The optimality of and the performance equations for this filter are verified by analytical and simulation results. Second, several computationally efficient classes of suboptimal tracking filters based on the optimal filter developed in this paper and on an optimal filter of another class that appeared previously in the literature are developed. Third, using an extensive Monte Carlo simulation, the various optimal and suboptimal filters as well as the Kalman filter are compared, with regard to the differences between the on-line calculated and experimental covariances of each filter, and with regard to relative accuracies, computational requirements, and numbers of divergences or lost tracks each produces.


TL;DR: This paper shows that several transmitters operating in an additive white Gaussian noise environment can send at rates strictly dominating time-multiplex and frequency-multiplex rates by use of a superposition scheme that pools the time, bandwidth, and power allocations of the transmitters.

Abstract: This paper shows that several transmitters operating in an additive white Gaussian noise environment can send at rates strictly dominating time-multiplex and frequency-multiplex rates by use of a superposition scheme that pools the time, bandwidth, and power allocations of the transmitters. This pooling can be achieved without cooperative action, except for agreement on the actual rate of transmission each transmitter will allow itself. The superposition scheme involves subtraction from the received signal of the estimated signals sent by the other transmitters, followed by decoding of the intended signal. This scheme has been shown to be optimal. We conclude that present methods of allocating different frequency bands to different transmitters are necessarily suboptimal.
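The rate advantage of superposition with successive cancellation over time-multiplexing can be checked numerically. A sketch for two users in Gaussian noise; the power values below are arbitrary assumptions for the example:

```python
import numpy as np

def C(snr):
    """Gaussian channel capacity in bits per (real) channel use."""
    return 0.5 * np.log2(1 + snr)

# Illustrative average powers and unit noise (not values from the paper).
P1, P2, N0 = 10.0, 5.0, 1.0

# Superposition with successive cancellation: decode user 2 treating
# user 1 as noise, subtract its estimated signal, then decode user 1.
R1_sp = C(P1 / N0)
R2_sp = C(P2 / (P1 + N0))

# Time-multiplex at the same average powers: each user gets half the
# time and can spend twice its average power in its own slot.
R1_tdm = 0.5 * C(2 * P1 / N0)
R2_tdm = 0.5 * C(2 * P2 / N0)

print(R1_sp + R2_sp, R1_tdm + R2_tdm)  # superposition pools the resources
```

The superposition sum rate equals C((P1 + P2)/N0); by concavity of the log, the time-multiplex sum rate falls strictly below it whenever P1 differs from P2.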


TL;DR: For discrete memoryless sources with a single-letter fidelity criterion, the probability of the event that the distortion exceeds a level d is studied when, for large block length, the best code of given rate R > R(d) is used.

Abstract: For discrete memoryless sources with a single-letter fidelity criterion, we study the probability of the event that the distortion exceeds a level d when, for large block length, the best code of given rate R > R(d) is used. Lower and upper exponential bounds are obtained, giving the asymptotically exact exponent, except possibly for a countable set of R values.


Bell Labs

TL;DR: These results demonstrate that the decision-feedback equalizer has a lower error probability than the linear zero-forcing equalizer when there is both a high S/N ratio and a fast roll-off of the feedback tap gains.

Abstract: An upper bound on the error probability of a decision-feedback equalizer which takes into account the effect of error propagation is derived. The bound, which assumes independent data symbols and noise samples, is readily evaluated numerically for arbitrary tap gains and is valid for multilevel and nonequally likely data. One specific result for equally likely binary symbols is that if the worst-case intersymbol interference when the first J feedback taps are set to zero is less than the original signal voltage, then the error probability is multiplied by at most a factor of 2^J relative to the error probability in the absence of decision errors at high S/N ratios. Numerical results are given for the special case of exponentially decreasing tap gains. These results demonstrate that the decision-feedback equalizer has a lower error probability than the linear zero-forcing equalizer when there is both a high S/N ratio and a fast roll-off of the feedback tap gains.


Osaka University

TL;DR: It is shown that there exist arbitrarily long quasi-cyclic (2k,k) binary codes that meet a bound slightly weaker than the Gilbert-Varshamov bound.

Abstract: It is shown that there exist arbitrarily long quasi-cyclic (2k,k) binary codes that meet a bound slightly weaker than the Gilbert-Varshamov bound. This is a refinement of the result of Chen, Peterson, and Weldon [1].


TL;DR: An information-theoretic view of computational complexity leads to a definition of the degree of randomness of individual binary strings and to an information-theoretic version of Gödel's theorem on the limitations of the axiomatic method.

Abstract: This paper attempts to describe, in nontechnical language, some of the concepts and methods of one school of thought regarding computational complexity. It applies the viewpoint of information theory to computers. This will first lead us to a definition of the degree of randomness of individual binary strings, and then to an information-theoretic version of Gödel's theorem on the limitations of the axiomatic method. Finally, we will examine in the light of these ideas the scientific method and von Neumann's views on the basic conceptual problems of biology.


TL;DR: A bound on the average per-letter distortion achievable by a trellis source code of fixed constraint length is derived; for any fixed code rate greater than R(D^{\ast}), the bound decreases toward D^{\ast} exponentially with constraint length.

Abstract: For memoryless discrete-time sources and bounded single-letter distortion measures, we derive a bound on the average per-letter distortion achievable by a trellis source code of fixed constraint length. For any fixed code rate greater than R(D^{\ast}), the rate-distortion function at D^{\ast}, this bound decreases toward D^{\ast} exponentially with constraint length.


TL;DR: An ancillary result, used in proving the lower bound on free distance for time-varying nonsystematic codes, furnishes a generalization of two earlier bounds on the definite decoding minimum distance of convolutional codes.

Abstract: The best asymptotic bounds presently known on free distance for convolutional codes are presented from a unified point of view. Upper and lower bounds for both time-varying and fixed codes are obtained. A comparison is made between bounds for nonsystematic and systematic codes which shows that more free distance is available with nonsystematic codes. This result is important when selecting codes for use with sequential or maximum-likelihood (Viterbi) decoding since the probability of decoding error is closely related to the free distance of the code. An ancillary result, used in proving the lower bound on free distance for time-varying nonsystematic codes, furnishes a generalization of two earlier bounds on the definite decoding minimum distance of convolutional codes.


TL;DR: A search algorithm is described to decode long binary block codes of any rate for the memoryless binary-input J-ary-output channel; it can be used directly to perform maximum-likelihood decoding or in a constrained version that gives considerably fewer searches at a small sacrifice in performance.

Abstract: A search algorithm is described to decode long binary block codes of any rate for the memoryless binary-input J-ary-output channel. It can be used directly to perform maximum-likelihood decoding or in a constrained version that gives considerably fewer searches at a small sacrifice in performance. Simulation results are given for a rate-1/2 code of length 128 and minimum Hamming distance 22 on the quantized Gaussian channel.


TL;DR: A search procedure is developed to find good short binary (N,N-1) convolutional codes, using simple rules to discard from the complete ensemble of codes a large fraction whose free distance d_{free} either cannot achieve the maximum value or is equal to the d_{free} of some code in the remaining set.

Abstract: A search procedure is developed to find good short binary (N,N-1) convolutional codes. It uses simple rules to discard from the complete ensemble of codes a large fraction whose free distance d_{free} either cannot achieve the maximum value or is equal to the d_{free} of some code in the remaining set. Further, the search among the remaining codes is started in a subset in which we expect the possibility of finding codes with large values of d_{free} to be good. A number of short, optimum (in the sense of maximizing d_{free}), rate-2/3 and 3/4 codes found by the search procedure are listed.


TL;DR: It is shown that the sequence of distributions used in that algorithm has a limit yielding a point on the R(d) curve if the reproducing alphabet is finite, and a similar but weaker result for countable reproducing alphabets is obtained.

Abstract: In a recent paper [1], Blahut suggested an efficient algorithm for computing rate-distortion functions. In this correspondence we show that the sequence of distributions used in that algorithm has a limit yielding a point on the R(d) curve if the reproducing alphabet is finite, and we obtain a similar but weaker result for countable reproducing alphabets.
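Blahut's algorithm referenced here iterates an update of the reproducing (output) distribution for a fixed slope parameter. A minimal sketch for a finite reproducing alphabet, checked against the binary source with Hamming distortion, whose R(D) is known in closed form; the function name, slope choice, and iteration count are choices of this example:

```python
import numpy as np

def blahut_rd_point(p, d, s, iters=500):
    """Blahut-style iteration for one point on the rate-distortion
    curve of a discrete memoryless source.
    p: source distribution; d[i, j]: distortion matrix; s >= 0: slope
    parameter selecting the point. Returns (D, R) with R in bits."""
    A = np.exp(-s * d)                         # A[i, j] = exp(-s * d_ij)
    q = np.full(d.shape[1], 1.0 / d.shape[1])  # reproducing distribution
    for _ in range(iters):
        denom = A @ q                          # sum_j q_j A_ij, per source i
        q = q * ((p / denom) @ A)              # Blahut update of q
    Q = A * q / (A @ q)[:, None]               # test channel Q[i, j] = P(j|i)
    D = float(np.sum(p[:, None] * Q * d))      # average distortion
    R = float(np.sum(p[:, None] * Q * np.log2(Q / q)))  # mutual information
    return D, R

# Uniform binary source, Hamming distortion: R(D) = 1 - H2(D) is known.
p = np.array([0.5, 0.5])
d = np.array([[0.0, 1.0], [1.0, 0.0]])
D, R = blahut_rd_point(p, d, s=np.log(9))  # s = ln((1-D)/D) targets D = 0.1
print(round(D, 3), round(R, 3))            # close to (0.1, 1 - H2(0.1))
```

Sweeping the slope parameter `s` over nonnegative values traces out the whole R(d) curve, which is the setting whose convergence this correspondence analyzes.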


TL;DR: The discrete Fourier transform is applied as a coarse estimator of the frequency of a sine wave in Gaussian noise to estimate signal energy-to-noise density ratio E/N_0.

Abstract: The discrete Fourier transform (DFT) is applied as a coarse estimator of the frequency of a sine wave in Gaussian noise. Probability of anomaly and the variance of the estimation error are determined by computer simulation for several DFT block sizes as a function of signal energy-to-noise density ratio \mathcal{E}/N_0 . Several data windows are considered, but uniform weighting gives the best performance.


TL;DR: It is shown that the quantum-mechanical Cramér-Rao inequalities derived from right logarithmic derivatives and symmetrized logarithmic derivatives of the density operator give superior lower bounds on the error variances of individual unbiased estimates of arrival time and carrier frequency of a coherent signal.

Abstract: Basing decisions and estimates on simultaneous approximate measurements of noncommuting observables in a quantum receiver is shown to be equivalent to measuring commuting projection operators on a larger Hilbert space than that of the receiver itself. The quantum-mechanical Cramér-Rao inequalities derived from right logarithmic derivatives and symmetrized logarithmic derivatives of the density operator are compared, and it is shown that the latter give superior lower bounds on the error variances of individual unbiased estimates of arrival time and carrier frequency of a coherent signal. For a suitably weighted sum of the error variances of simultaneous estimates of these, the former yield the superior lower bound under some conditions.


TL;DR: Source coding theorems are proved for discrete-time stationary processes subject to a fidelity criterion, and potential applications to universal source coding with a fidelity criterion are discussed.

Abstract: Source coding theorems are proved for discrete-time stationary processes subject to a fidelity criterion. The alphabet of the process is assumed to be a separable metric space, but the process is not assumed to be ergodic. When the process is not ergodic, the minimum average distortion for a fixed-rate code is not given by the distortion-rate function of the source as usually defined. It is given instead by a weighted average of the distortion-rate functions of ergodic subsources comprising the ergodic decomposition of the source. Potential applications to universal source coding with a fidelity criterion are discussed.


TL;DR: Three classes of 4-D codes are presented, and an algorithm is given which yields good 4-D codes of any length; their performance is shown to exceed that of amplitude-and-phase modulation in two independent two-dimensional channels.

Abstract: This paper examines codes for four-dimensional (4-D) modulation and their performance for digital transmission. The signals are defined by M points inside a sphere in four-dimensional Euclidean space. Three classes of 4-D codes are presented, and an algorithm is given which yields good 4-D codes of any length. Bounds on symbol error probability are plotted versus symbol-energy-to-noise-density ratio. The performance is shown to exceed that of amplitude-and-phase modulation in two independent two-dimensional channels.


TL;DR: The sharp lower bound f(x) on the per-symbol output entropy for a given per-symbol input entropy x is determined for stationary discrete memoryless channels; it is the lower convex envelope of the bound g(x) for a single channel use.

Abstract: The sharp lower bound f(x) on the per-symbol output entropy for a given per-symbol input entropy x is determined for stationary discrete memoryless channels; it is the lower convex envelope of the bound g(x) for a single channel use. The bounds agree for all noiseless channels and all binary channels. However, for nonbinary channels, g is not generally convex so that the bounds differ. Such is the case for the Hamming channels that generalize the binary symmetric channels. The bounds are of interest in connection with multiple-user communication, as exemplified by Wyner's applications of "Mrs. Gerber's lemma" (the bound for binary symmetric channels first obtained by Wyner and Ziv). These applications extend from the binary symmetric case to the Hamming case. Doubly stochastic channels are characterized by the property of never decreasing entropy.


TL;DR: Estimation-theoretic and information-theoretic interpretations are developed and applied to prove existence theorems for universal source codes, both noiseless and with a fidelity criterion.

Abstract: The ergodic decomposition is discussed, and a version focusing on the structure of individual sample functions of stationary processes is proved for the special case of discrete-time random processes with discrete alphabets. The result is stronger in this case than the usual theorem, and the proof is both intuitive and simple. Estimation-theoretic and information-theoretic interpretations are developed and applied to prove existence theorems for universal source codes, both noiseless and with a fidelity criterion.