
Showing papers by "Aggelos K. Katsaggelos published in 2001"


Journal ArticleDOI
TL;DR: An iterative algorithm for enhancing the resolution of monochrome and color image sequences is proposed, and experiments compare three different data fusion approaches for merging the individual motion fields obtained with the same motion estimator.
Abstract: We propose an iterative algorithm for enhancing the resolution of monochrome and color image sequences. Various approaches toward motion estimation are investigated and compared. Improving the spatial resolution of an image sequence critically depends upon the accuracy of the motion estimator. The problem is complicated by the fact that the motion field is prone to significant errors since the original high-resolution images are not available. Improved motion estimates may be obtained by using a more robust and accurate motion estimator, such as a pel-recursive scheme instead of block matching. In processing color image sequences, there is the added advantage of having more flexibility in how the final motion estimates are obtained, and further improvement in the accuracy of the motion field is therefore possible. This is because there are three different intensity fields (channels) conveying the same motion information. In this paper, the choice of which motion estimator to use versus how the final estimates are obtained is weighed to see which issue is more critical in improving the estimated high-resolution sequences. Toward this end, an iterative algorithm is proposed, and two sets of experiments are presented. First, several different experiments using the same motion estimator but three different data fusion approaches to merge the individual motion fields were performed. Second, estimated high-resolution images using the block matching estimator were compared to those obtained by employing a pel-recursive scheme. Experiments were performed on a real color image sequence, and performance was measured by the peak signal-to-noise ratio (PSNR).

120 citations
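The merging of per-channel motion estimates can be pictured with the short sketch below. It is a minimal illustration, not necessarily one of the three data fusion approaches compared in the paper: for every block, the candidate vector (one per color channel) with the smallest displaced-frame difference summed over all channels is kept. The function names, block size, and data layout are assumptions made for the example; the PSNR function matches the performance measure cited in the abstract.

```python
import numpy as np

def fuse_channel_motion(prev, curr, candidates, block=8):
    """Merge per-channel motion fields into a single field.

    prev, curr : (H, W, 3) consecutive color frames.
    candidates : list of three motion fields, one per channel, each of shape
                 (H // block, W // block, 2) holding integer (dy, dx) vectors.
    For every block, the candidate whose displaced-frame difference summed
    over all three channels is smallest is kept (one plausible fusion rule)."""
    h, w, _ = curr.shape
    fused = np.zeros_like(candidates[0])
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            cur = curr[by:by + block, bx:bx + block].astype(float)
            best_err, best_v = np.inf, (0, 0)
            for field in candidates:
                dy, dx = field[by // block, bx // block]
                y0, x0 = by + int(dy), bx + int(dx)
                if y0 < 0 or x0 < 0 or y0 + block > h or x0 + block > w:
                    continue  # skip vectors pointing outside the frame
                ref = prev[y0:y0 + block, x0:x0 + block].astype(float)
                err = float(np.sum((cur - ref) ** 2))
                if err < best_err:
                    best_err, best_v = err, (dy, dx)
            fused[by // block, bx // block] = best_v
    return fused

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio (dB) used to measure performance."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```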


Journal ArticleDOI
TL;DR: A time-domain algorithm based on the coding of line segments used to approximate the signal is presented; it guarantees the smallest possible distortion among all methods applying linear interpolation, given an upper bound on the available number of bits.
Abstract: Signal compression is an important problem encountered in many applications. Various techniques have been proposed over the years for addressing the problem. Here, the authors present a time domain algorithm based on the coding of line segments which are used to approximate the signal. These segments are fitted in a way that is optimal in the rate-distortion sense. Although the approach is applicable to any type of signal, the authors focus, in this paper, on the compression of electrocardiogram (ECG) signals. ECG signal compression has traditionally been tackled by heuristic approaches. However, it has been demonstrated (R. Nygaard and D. Haugland, Proc. Eur. Signal Processing Conf. (EUSIPCO), Island of Rhodes, Greece, p. 2473-6, 1998) that exact optimization algorithms outperform these heuristic approaches by a wide margin with respect to reconstruction error. By formulating the compression problem as a graph theory problem, known optimization theory can be applied in order to yield optimal compression. In this paper, the authors present an algorithm that guarantees the smallest possible distortion among all methods applying linear interpolation, given an upper bound on the available number of bits. Using a varied signal test set, extensive coding experiments are presented. The authors compare the results from their coding method to traditional time domain ECG compression methods, as well as to more recently developed frequency domain methods. Evaluation is based both on the percentage root-mean-square difference (PRD) performance measure and on visual inspection of the reconstructed signals. The results demonstrate that the exact optimization methods have superior performance compared to both traditional ECG compression methods and the frequency domain methods.

66 citations
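A simplified dynamic-programming sketch of optimal piecewise-linear sample selection is given below. It assumes the bit budget translates into a fixed number of retained segments; the paper's graph-theoretic formulation works directly with bits and guarantees minimum distortion for the given budget. All names and the O(K·n²) structure are illustrative, not the authors' implementation.

```python
import numpy as np

def segment_error(x, i, j):
    """Squared error of reconstructing x[i..j] by linear interpolation
    between the retained samples x[i] and x[j]."""
    t = np.arange(i, j + 1)
    recon = x[i] + (x[j] - x[i]) * (t - i) / (j - i)
    return float(np.sum((x[i:j + 1] - recon) ** 2))

def optimal_breakpoints(x, n_segments):
    """Minimum total reconstruction error using exactly n_segments linear
    pieces, with the first and last samples always retained."""
    n = len(x)
    cost = np.full((n_segments + 1, n), np.inf)
    back = np.zeros((n_segments + 1, n), dtype=int)
    cost[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n):
            for i in range(k - 1, j):
                c = cost[k - 1, i] + segment_error(x, i, j)
                if c < cost[k, j]:
                    cost[k, j], back[k, j] = c, i
    # Backtrack the retained sample indices.
    idx, j = [n - 1], n - 1
    for k in range(n_segments, 0, -1):
        j = int(back[k, j])
        idx.append(j)
    return idx[::-1], float(cost[n_segments, n - 1])

# Example on a toy "ECG-like" trace.
x = np.sin(np.linspace(0, 4 * np.pi, 200)) + 0.05 * np.random.randn(200)
samples, err = optimal_breakpoints(x, n_segments=20)
```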


Proceedings ArticleDOI
07 May 2001
TL;DR: This work presents a deterministic dynamic programming approach to find the optimal tradeoff for both the constant bit rate (CBR) and renegotiated constant bit rate (RCBR) service classes.
Abstract: We study the design tradeoffs involved in video streaming in networks with QoS guarantees. We approach this problem by using a utility function to quantify the benefit a user derives from the received video sequence. This benefit is expressed as a function of the total distortion. In addition, we also consider the cost, in network resources, of a video streaming system. The goal of the network user is then to obtain the most benefit for the smallest cost. We formulate this utility maximization problem as a joint constrained optimization problem. The difference between the utility and the network cost is maximized subject to the constraint that the decoder buffer does not underflow. We present a deterministic dynamic programming approach to find the optimal tradeoff for both the constant bit rate (CBR) and renegotiated constant bit rate (RCBR) service classes. Experimental results demonstrate the benefits and the performance of the proposed approach.

42 citations
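The decoder-buffer constraint named in the formulation can be illustrated with a toy feasibility check for the CBR case. The timing model below, with a fixed start-up delay of `startup_frames` frame intervals and constant delivery per frame interval, is an assumption of this sketch rather than the paper's service model.

```python
def cbr_no_underflow(frame_bits, channel_rate, fps, startup_frames=1):
    """Decoder-buffer underflow check for a CBR channel: frame k is decoded at
    time (startup_frames + k) / fps, by which point the channel must have
    delivered at least the bits of frames 0..k."""
    delivered_per_frame = channel_rate / fps
    needed = 0.0
    for k, bits in enumerate(frame_bits):
        needed += bits
        delivered = (startup_frames + k) * delivered_per_frame
        if needed > delivered:
            return False
    return True
```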


Proceedings ArticleDOI
07 Oct 2001
TL;DR: A method for simultaneously estimating the high-resolution frames and the corresponding motion field from a compressed low-resolution video sequence is presented; simulations illustrate an improvement in the peak signal-to-noise ratio when compared with traditional interpolation techniques.
Abstract: A method for simultaneously estimating the high-resolution frames and the corresponding motion field from a compressed low-resolution video sequence is presented. The algorithm incorporates knowledge of the spatio-temporal correlation between low and high-resolution images to estimate the original high-resolution sequence from the degraded low-resolution observation. Information from the encoder is also exploited, including the transmitted motion vectors, quantization tables, coding modes and quantizer scale factors. Simulations illustrate an improvement in the peak signal-to-noise ratio when compared with traditional interpolation techniques and are corroborated with visual results.

32 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: A joint source-channel coding scheme for scalable video is developed and the resulting optimization algorithm utilizes universal rate-distortion characteristic plots, showing the contribution of each layer to the total distortion as a function of the source rate of the layer and the residual bit error rate.
Abstract: A joint source-channel coding scheme for scalable video is developed in this paper. An SNR scalable video coder is used and unequal error protection (UEP) is allowed for each scalable layer. Our problem is to allocate the available bit rate across scalable layers and, within each layer, between source and channel coding, while minimizing the end-to-end distortion of the received video sequence. The resulting optimization algorithm we propose utilizes universal rate-distortion characteristic plots. These plots show the contribution of each layer to the total distortion as a function of the source rate of the layer and the residual bit error rate (the error rate that remains after the use of channel coding). Models for these plots are proposed in order to reduce the computational complexity of the solution. Experimental results demonstrate the effectiveness of the proposed approach.

27 citations
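A brute-force sketch of the per-layer rate allocation is shown below. The distortion model, the candidate source and channel rates, and the residual bit error rates attached to each channel-coding choice are illustrative placeholders standing in for the paper's universal rate-distortion characteristic plots.

```python
import itertools

# Hypothetical operating points per layer: (source rate kbit/s,
# channel-coding rate kbit/s, residual bit error rate after channel coding).
POINTS = [(rs, rc, p)
          for rs in (64, 128, 256)
          for rc, p in ((16, 1e-3), (32, 1e-4), (64, 1e-5))]

def layer_distortion(rs, p, a=1.0e3, b=5.0e4):
    """Toy model: distortion falls with source rate and rises with the
    residual bit error rate (not the paper's measured curves)."""
    return a / rs + b * p

def allocate(total_rate, n_layers=2):
    """Exhaustively pick one operating point per layer so that the total rate
    stays within budget and the summed distortion is smallest."""
    best = (float("inf"), None)
    for combo in itertools.product(POINTS, repeat=n_layers):
        rate = sum(rs + rc for rs, rc, _ in combo)
        if rate > total_rate:
            continue
        dist = sum(layer_distortion(rs, p) for rs, _, p in combo)
        if dist < best[0]:
            best = (dist, combo)
    return best

print(allocate(total_rate=400))
```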


Proceedings ArticleDOI
01 Jan 2001
TL;DR: The optimization algorithm that is proposed utilizes universal rate-distortion characteristic curves that show the contribution of each layer to the total distortion as a function of the source rate of the layer and the residual bit error rate (the error rate after channel coding).
Abstract: We extend our previous work on joint source-channel coding to scalable video transmission over wireless direct-sequence code-division-multiple-access (DS-CDMA) multipath fading channels. An SNR scalable video coder is used and unequal error protection (UEP) is allowed for each scalable layer. At the receiver end, an adaptive antenna array auxiliary-vector (AV) filter is utilized that provides space-time RAKE-type processing and multiple-access interference suppression. The choice of the AV receiver is dictated by realistic channel fading rates that limit the data record available for receiver adaptation and redesign. Our problem is to allocate the available bit rate of the user of interest between source and channel coding and across scalable layers, while minimizing the end-to-end distortion of the received video sequence. The optimization algorithm that we propose utilizes universal rate-distortion characteristic curves that show the contribution of each layer to the total distortion as a function of the source rate of the layer and the residual bit error rate (the error rate after channel coding). These plots can be approximated using appropriate functions to reduce the computational complexity of the solution.

25 citations


Proceedings ArticleDOI
01 Jan 2001
TL;DR: This work casts the pre-processing problem in the operational rate-distortion framework, proposing a method that couples the choice of the quantization scale to the response of the prefilter and addresses coding errors by penalizing significant differences between coded blocks.
Abstract: Pre-processing algorithms improve the quality of a compression system by removing unimportant data before encoding. This enhances both the visual quality and coding efficiency of the system. We cast the pre-processing problem in the operational rate-distortion framework. Filtering the displaced frame difference is the focus, and the proposed method couples the choice of the quantization scale to the response of the prefilter. Coding errors are then addressed by penalizing significant differences between coded blocks. Finally, experimental results illustrate the efficacy of the method within the context of an MPEG-2 coding scenario.

19 citations


Patent
21 Dec 2001
TL;DR: A scalability type selection method and structure for hybrid SNR-temporal scalability that employs a decision mechanism capable of selecting between SNR and temporal scalability based upon desired criteria is disclosed.
Abstract: A scalability type selection method and structure for hybrid SNR-temporal scalability that employs a decision mechanism capable of selecting between SNR and temporal scalability based upon desired criteria is disclosed. This method and structure utilizes models of the desired criteria such as the extent of motion between two encoded frames, the temporal distance between two encoded frames, the gain in visual quality achieved by using SNR scalability over temporal scalability, and the bandwidth available to the scalable layer in deciding which form of scalability to use. Furthermore, the method and structure allows for control over the extent to which a certain type of scalability is used. By the selection of parameters used in the models, a certain type of scalability can be emphasized while another type of scalability can be given less preference. This invention thus allows not only the type of scalability to be selected but also the degree to which that scalability will be used.

17 citations
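One way to picture the decision mechanism is as a scoring rule over the criteria listed in the abstract. The weights, threshold, functional form, and names below are purely illustrative assumptions and are not the patented models.

```python
def choose_scalability(motion_extent, temporal_distance, snr_gain,
                       layer_bandwidth, min_bw_for_snr=64.0,
                       w_motion=1.0, w_distance=1.0, bias=0.0):
    """Illustrative decision rule (not the patented models).

    Large motion or a large temporal gap between the two encoded frames favors
    inserting a frame (temporal scalability); otherwise the modeled visual-
    quality gain of SNR scalability decides, provided the scalable layer has
    enough bandwidth.  `bias` lets one type of scalability be emphasized."""
    temporal_benefit = w_motion * motion_extent + w_distance * temporal_distance
    snr_benefit = snr_gain if layer_bandwidth >= min_bw_for_snr else 0.0
    return "SNR" if snr_benefit + bias > temporal_benefit else "temporal"
```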


Proceedings ArticleDOI
01 Jan 2001
TL;DR: This work presents a vertex-based shape coding method that is optimal in the operational rate-distortion sense and takes into account the texture information of the video frames by utilizing a variable-width tolerance band whose width is proportional to the degree of trust in the accuracy of the shape information at each location.
Abstract: A major problem in object oriented video coding and MPEG-4 is the encoding of object boundaries. Traditionally, and within MPEG-4, the encoding of shape and texture information are separate steps (the extraction of shape is not considered by the standards). We present a vertex-based shape coding method which is optimal in the operational rate-distortion sense and takes into account the texture information of the video frames. This is accomplished by utilizing a variable-width tolerance band which is proportional to the degree of trust in the accuracy of the shape information at that location. Thus, in areas where the confidence in the estimation of the boundary is not high and/or coding errors in the boundary will not affect the application (object oriented coding, MPEG-4, etc.) significantly, a larger boundary approximation error is allowed. We present experimental results which demonstrate the effectiveness of the proposed algorithm.

16 citations


Proceedings ArticleDOI
01 Jan 2001
TL;DR: This work formulates an optimization problem that corresponds to minimizing the transmission energy required to achieve an acceptable level of distortion subject to a delay constraint, and presents results illustrating the advantages of jointly considering these two variables.
Abstract: Transmitter energy is a valuable resource in wireless networks. Transmitter power management can have an impact on battery life for mobile users, link level QoS and network capacity. We consider efficient use of transmitter energy in a streaming application. We formulate an optimization problem that corresponds to minimizing the transmission energy required to achieve an acceptable level of distortion subject to a delay constraint. By considering jointly the selection of coding parameters and transmitter power, we can formulate an optimal policy. We present results illustrating the advantages of jointly considering these two variables.

16 citations


Journal ArticleDOI
TL;DR: A new methodology for signal-to-noise ratio (SNR) video scalability based on the partitioning of the DCT coefficients is introduced; the optimal partitioning is found using Lagrangian relaxation and dynamic programming.
Abstract: We introduce a new methodology for signal-to-noise ratio (SNR) video scalability based on the partitioning of the DCT coefficients. The DCT coefficients of the displaced frame difference (DFD) for inter-blocks or the intensity for intra-blocks are partitioned into a base layer and one or more enhancement layers, thus, producing an embedded bitstream. Subsets of this bitstream can be transmitted with increasing video quality as measured by the SNR. Given a bit budget for the base and enhancement layers the partitioning of the DCT coefficients is done in a way that is optimal in the operational rate-distortion sense. The optimization is performed using Lagrangian relaxation and dynamic programming (DP). Experimental results are presented and conclusions are drawn.
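The per-block partitioning can be sketched with a Lagrangian breakpoint search: coefficients before the breakpoint (in scan order) go to the base layer, the rest to an enhancement layer. The bit costs and the single-block scope are simplifications; the paper additionally uses dynamic programming and adjusts the multiplier so that the layer bit budgets are met.

```python
import numpy as np

def best_breakpoint(coeffs, bits_per_coeff, lam):
    """Choose the breakpoint k minimizing D(k) + lam * R(k) for one block.

    coeffs         : DCT coefficients of the block in scan order
    bits_per_coeff : bits needed to code each coefficient in the base layer
    D(k) is the energy of the coefficients left out of the base layer;
    R(k) is the base-layer bit cost."""
    best_k, best_cost = 0, float("inf")
    for k in range(len(coeffs) + 1):
        dist = float(np.sum(coeffs[k:] ** 2))
        rate = float(np.sum(bits_per_coeff[:k]))
        cost = dist + lam * rate
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost
```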

Proceedings ArticleDOI
01 Jan 2001
TL;DR: This work considers a situation where a video sequence is to be compressed and transmitted over a wireless channel, and formulates an optimization problem that corresponds to minimizing the energy required to transmit a video frame with an acceptable level of distortion.
Abstract: A key constraint in mobile communications is the reliance on a battery with a limited energy supply. Efficiently utilizing the available energy is therefore an important design consideration. We consider a situation where a video sequence is to be compressed and transmitted over a wireless channel. The goal is to limit the amount of distortion in the received video sequence while using the minimum required transmission energy. To accomplish this goal, we consider error resilience and concealment techniques, at the source coding level, as well as the dynamic allocation of physical layer communication resources. We consider these approaches jointly in a novel framework. We formulate an optimization problem that corresponds to minimizing the energy required to transmit a video frame with an acceptable level of distortion. We present methods for solving this problem and other extensions.
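The joint selection of source coding parameters and transmitter power can be pictured with a small search over discrete operating points. The distortion model, link rate, and candidate values below are assumptions made for illustration only, not the paper's models.

```python
import itertools

LINK_RATE = 1.0e6  # physical-layer bit rate in bit/s (illustrative)

def distortion(bits, power_mw):
    """Toy model: more source bits and more transmit power (fewer channel
    errors) both lower the received distortion; not the paper's model."""
    return 1.0e6 / bits + 5.0e3 / power_mw

def min_energy(max_distortion):
    """Least-energy (source bits per frame, transmit power) pair that still
    meets the distortion target.  Energy = power x time on air."""
    best = (float("inf"), None)
    for bits, power_mw in itertools.product((2e4, 5e4, 1e5), (50, 100, 200)):
        if distortion(bits, power_mw) > max_distortion:
            continue
        energy_j = (power_mw * 1e-3) * (bits / LINK_RATE)
        if energy_j < best[0]:
            best = (energy_j, (bits, power_mw))
    return best
```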

Proceedings ArticleDOI
07 May 2001
TL;DR: This work uses a compound Gauss Markov random field as the prior model for SPECT reconstruction and proposes a new iterative method that is stochastic for the line process and deterministic for the reconstruction.
Abstract: SPECT (single photon emission computed tomography) is used in nuclear medicine to determine the distribution of a radioactive isotope within a patient from tomographic views or projection data. These images are severely degraded due to the presence of noise and several physical factors like attenuation and scattering. We use, within the Bayesian framework, a compound Gauss Markov random field (CGMRF) as prior model to reconstruct such images. In order to find the maximum a posteriori (MAP) estimate we propose a new iterative method, which is stochastic for the line process and deterministic for the reconstruction. The proposed method is tested and compared with other reconstruction methods on both synthetic and real SPECT images.
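A heavily simplified MAP iteration is sketched below: a Gaussian likelihood with a quadratic smoothness prior stands in for the compound Gauss-Markov prior and the stochastic line-process step of the actual method. The projection operator A, the step size, and the 1-D geometry are assumptions of this sketch.

```python
import numpy as np

def map_reconstruct(A, y, beta=0.1, step=1e-3, iters=500):
    """Gradient-descent MAP estimate for y ~ N(Ax, I) with a quadratic
    smoothness prior on neighboring elements of x (a toy stand-in for the
    compound Gauss-Markov random field prior)."""
    n = A.shape[1]
    x = np.zeros(n)
    D = np.diff(np.eye(n), axis=0)               # first-difference operator
    for _ in range(iters):
        grad = A.T @ (A @ x - y) + beta * (D.T @ (D @ x))
        x = np.clip(x - step * grad, 0.0, None)  # activity is nonnegative
    return x
```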

Proceedings ArticleDOI
08 Jun 2001
TL;DR: Results from objective and subjective analyses show that an HMM correlating model can significantly decrease audio-visual synchronization errors and increase speech understanding.
Abstract: Emerging broadband communication systems promise a future of multimedia telephony. The addition of visual information during telephone conversations, for example, would be most beneficial to people with impaired hearing, since it supports speech reading while building on the existing narrowband communication systems used for the speech signal. A Hidden Markov Model (HMM)-based visual speech synthesizer is designed to improve speech understanding. The key elements in the application of HMMs to this problem are: a) the decomposition of the overall modeling task into key stages; and, b) the judicious determination of the components of the observation vector for each stage. The main contribution of this paper is the development of a novel correlation HMM model that is able to integrate independently trained acoustic and visual HMMs for speech-to-visual synthesis. This model allows increased flexibility in choosing model topologies for the acoustic and visual HMMs. It also reduces the amount of required training data compared to early integration modeling techniques. Results from objective and subjective analyses show that an HMM correlating model can significantly decrease audio-visual synchronization errors and increase speech understanding.

Proceedings ArticleDOI
07 Oct 2001
TL;DR: A new shape-coding approach is presented, which decouples the shape information into two independent data sets, the skeleton and the distance of the boundary from the skeleton, which provides the possibility of better performance in the operational rate-distortion (ORD) optimal sense than other reported techniques.
Abstract: We present a new shape-coding approach, which decouples the shape information into two independent data sets, the skeleton and the distance of the boundary from the skeleton. The major benefit of this approach is that it allows a more flexible trade-off between accuracy of the approximation and bit-allocation cost, and thus, provides the possibility of better performance in the operational rate-distortion (ORD) optimal sense than other reported techniques. The characteristics of these data sets are studied and various approximation approaches are applied on each of them to reach an ORD optimal result. We apply, for example, polygonal approximation on both the skeleton and distance data. We demonstrate that the resulting approach outperforms existing ORD optimal approaches.

Proceedings ArticleDOI
07 May 2001
TL;DR: A novel fidelity constraint to the image enhancement problem is presented, which exploits the motion vectors of a compressed video bit-stream, establishing a correspondence between image pixels across frames, and guarantees that processing the decoded sequence does not violate this correspondence.
Abstract: A novel fidelity constraint to the image enhancement problem is presented. With this constraint, we exploit the motion vectors of a compressed video bit-stream. These vectors establish a correspondence between image pixels across a series of frames, and we guarantee that processing the decoded sequence does not violate this correspondence. We develop the constraint within the context of MPEG-2 and incorporate the constraint into a regularized enhancement algorithm. Simulations are then performed. Quantitative and qualitative results illustrate an improvement in visual quality.


Proceedings ArticleDOI
26 Sep 2001
TL;DR: A time-domain signal compression algorithm based on the coding of line segments used to approximate the signal is presented and generalized by using second-order polynomial interpolation to reconstruct the signal from the extracted signal samples.
Abstract: We present a time domain signal compression algorithm based on the coding of line segments which are used to approximate the signal. These segments are fitted in a way that is optimal in the rate distortion sense. The approach is applicable to many types of signals, but in this paper we focus on the compression of electrocardiogram (ECG) signals. As opposed to traditional time-domain algorithms, where heuristics are used to extract representative signal samples from the original signal, an optimization algorithm is formulated for sample selection using graph theory, with linear interpolation applied to the reconstruction of the signal. In this paper the algorithm is generalized by using second order polynomial interpolation for the reconstruction of the signal from the extracted signal samples. The polynomials are fitted in a way that guarantees minimum reconstruction error given an upper bound on the number of bits. The method achieves good performance compared both to the case where linear interpolation is used in reconstruction of the signal and to other state-of-the-art ECG coders.
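Relative to the piecewise-linear sketch given for the companion journal paper above, only the per-segment reconstruction changes: each segment is now reconstructed with a second-order polynomial. The sketch below uses a least-squares quadratic fit for simplicity, whereas the paper reconstructs from the retained samples themselves; the name is illustrative. Plugging this function in place of `segment_error` in the earlier dynamic-programming sketch gives the generalized coder's behaviour in that simplified setting.

```python
import numpy as np

def quad_segment_error(x, i, j):
    """Squared error of reconstructing x[i..j] with a second-order polynomial.
    A least-squares fit is used here for simplicity."""
    t = np.arange(i, j + 1)
    coeffs = np.polyfit(t, x[i:j + 1], deg=min(2, j - i))
    return float(np.sum((x[i:j + 1] - np.polyval(coeffs, t)) ** 2))
```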

Proceedings ArticleDOI
01 Jan 2001
TL;DR: A novel fidelity constraint is presented by exploiting the motion vectors of a compressed video bit-stream to establish a correspondence between image pixels across a series of frames, and this constraint is posed within the context of a sum-of-squared errors criterion for matching.
Abstract: We present a novel fidelity constraint for the image enhancement problem by exploiting the motion vectors of a compressed video bit-stream. These vectors establish a correspondence between image pixels across a series of frames, and our goal is to maintain this relationship during processing. In our past work, we considered algorithms that relied on the sum-of-absolute differences as the match criteria. As we show in this paper, this metric is problematic for the enhancement problem. We then pose the constraint within the context of a sum-of-squared errors criterion for matching. This allows for a more rigorous treatment of the fidelity constraint. Finally, experimental results illustrate the performance of the new constraint.
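The sum-of-squared-errors matching criterion along the transmitted motion vectors can be written compactly as below; an enhancement step would then be constrained to keep this quantity small. The block size, the assumption of integer-pel vectors that keep every reference block inside the frame, and the function name are illustrative choices, not the paper's implementation.

```python
import numpy as np

def motion_sse(prev, curr, mvs, block=16):
    """Sum of squared differences between each block of `curr` and the block
    of `prev` displaced by its transmitted motion vector (dy, dx).  `mvs` has
    shape (H // block, W // block, 2) and is assumed to hold integer vectors
    that keep every reference block inside the frame."""
    h, w = curr.shape
    total = 0.0
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            dy, dx = (int(v) for v in mvs[by // block, bx // block])
            ref = prev[by + dy: by + dy + block, bx + dx: bx + dx + block]
            cur = curr[by: by + block, bx: bx + block]
            total += float(np.sum((cur.astype(float) - ref.astype(float)) ** 2))
    return total
```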