
Showing papers in "Signal Processing-image Communication in 1995"


Journal ArticleDOI
TL;DR: A new, adaptive algorithm for change detection is derived where the decision thresholds vary depending on context, thus improving detection performance substantially.
Abstract: In many conventional methods for change detection, the detections are carried out by comparing a test statistic, which is computed locally for each location on the image grid, with a global threshold. These ‘nonadaptive’ methods for change detection suffer from the dilemma of either causing many false alarms or missing considerable parts of non-stationary areas. This contribution presents a way out of this dilemma by viewing change detection as an inverse, ill-posed problem. As such, the problem can be solved using prior knowledge about typical properties of change masks. This reasoning leads to a Bayesian formulation of change detection, where the prior knowledge is brought to bear by appropriately specified a priori probabilities. Based on this approach, a new, adaptive algorithm for change detection is derived where the decision thresholds vary depending on context, thus improving detection performance substantially. The algorithm requires only a single raster scan per picture and increases the computational load only slightly in comparison to non-adaptive techniques.
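The context-adaptive thresholding idea can be sketched in a few lines. This is a hypothetical illustration, not the paper's Bayesian formulation: the decision threshold for each pixel is simply lowered when already-visited neighbours were marked as changed, mimicking an a priori preference for compact change masks, and only one raster scan is needed.

```python
# Hypothetical sketch of context-adaptive change detection (not the
# paper's exact algorithm): the threshold is relaxed when causal
# neighbours were already labelled 'changed'.

def detect_changes(diff, base_thresh=10.0, bonus=4.0):
    """diff: 2-D list of absolute frame differences.
    Returns a binary change mask after one raster scan."""
    h, w = len(diff), len(diff[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Count causal neighbours (left and above) already marked
            # as changed; each one lowers the decision threshold.
            n = 0
            if x > 0 and mask[y][x - 1]:
                n += 1
            if y > 0 and mask[y - 1][x]:
                n += 1
            thresh = base_thresh - bonus * n
            mask[y][x] = 1 if diff[y][x] > thresh else 0
    return mask
```

With a global threshold of 10, the weaker differences below would be missed; the adaptive rule recovers them once a strong change has been detected nearby.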

192 citations


Journal ArticleDOI
TL;DR: A totally automatic, low-complexity algorithm is proposed, which robustly performs face detection and tracking and is applicable to any video coding scheme that allows for fine-grain quantizer selection, and can maintain full decoder compatibility.
Abstract: We present a novel and practical way to integrate techniques from computer vision to low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces of persons in typical head-and-shoulders video sequences, and to exploit the face location information in a ‘classical’ video coding/decoding system. The motivation is to enable the system to selectively encode various image areas and to produce psychologically pleasing coded images where faces are sharper. We refer to this approach as model-assisted coding. We propose a totally automatic, low-complexity algorithm, which robustly performs face detection and tracking. A priori assumptions regarding sequence content are minimal and the algorithm operates accurately even in cases of partial occlusion by moving objects. Face location information is exploited by a low bit-rate 3D subband-based video coder which uses both a novel model-assisted pixel-based motion compensation scheme, as well as model-assisted dynamic bit allocation with object-selective quantization. By transferring a small fraction of the total available bit-rate from the non-facial to the facial area, the coder produces images with better-rendered facial features. The improvement was found to be perceptually significant on video sequences coded at 96 kbps for an input luminance signal in CIF format. The technique is applicable to any video coding scheme that allows for fine-grain quantizer selection (e.g. MPEG, H.261), and can maintain full decoder compatibility.

149 citations


Journal ArticleDOI
TL;DR: A technique for video compression based on a mosaic image representation obtained by aligning all frames of a video sequence, giving a panoramic view of the scene.
Abstract: We describe a technique for video compression based on a mosaic image representation obtained by aligning all frames of a video sequence, giving a panoramic view of the scene. We describe two types of mosaics, static and dynamic, which are suited for storage and transmission applications, respectively. In each case, the mosaic construction process aligns the images using a global parametric motion transformation, usually canceling the effect of camera motion on the dominant portion of the scene. The residual motions that are not compensated by the parametric motion are then analyzed for their significance and coded. The mosaic representation exploits large scale spatial and temporal correlations in image sequences. In many applications where there is significant camera motion (e.g., remote surveillance), it performs substantially better than traditional interframe compression methods and offers the potential for very low bit-rate transmission. In storage applications, such as digital libraries and video editing environments, it has the additional benefit of enabling direct access and retrieval of single frames at a time.
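The alignment-and-compositing step can be illustrated with a deliberately reduced sketch. Everything here is an assumption for illustration: one scan line per frame and integer global translations stand in for the full parametric (affine or projective) warps a real mosaicing system would use.

```python
# Toy static-mosaic construction: each frame is placed into the
# mosaic at its estimated global offset; already-filled pixels keep
# their first contribution.

def build_mosaic(frames, offsets, width):
    """frames: list of 1-D pixel rows; offsets: global x-shift of
    each frame relative to the mosaic; width: mosaic width."""
    mosaic = [None] * width
    for frame, off in zip(frames, offsets):
        for i, p in enumerate(frame):
            if mosaic[off + i] is None:
                mosaic[off + i] = p
    return mosaic
```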

117 citations


Journal ArticleDOI
TL;DR: The ITU near-term standard for very low bitrate video coding, H.263 (ITU-T SG 15/1 Rapporteurs Group for Very Low Bitrate Visual Telephony, 1995), is described, and a long-term activity is planned by the ITU for the development of a new video coding algorithm with considerably better picture quality than H.263.
Abstract: The ITU near-term standard for very low bitrate video coding, H.263 (ITU-T SG 15/1 Rapporteurs Group for Very Low Bitrate Visual Telephony, 1995), is described. Both QCIF and a sub-QCIF format (128 × 96) are mandatory picture formats for the decoder; the CIF picture format is optional. The H.263 algorithm consists of a mandatory core algorithm and four negotiable options. With H.263 a significantly better picture quality than with H.261 can be achieved, depending on the content of the video scene and the coding parameters. Also, the cost of the H.263 video codec can be kept low if only the minimum required is implemented. The negotiable options of H.263 increase the complexity of the video codec, but also significantly improve the picture quality. H.263 is part of a set of recommendations for a very low bitrate audio-visual terminal that was frozen in January 1995 and is based on existing technology. A long-term activity is planned by the ITU for the development of a new video coding algorithm (H.263/L) with considerably better picture quality than H.263. This standard will be developed in joint co-operation with MPEG-4.

107 citations


Journal ArticleDOI
TL;DR: Results for INTRA coding of images indicate that the proposed shape-adaptive DCT transform algorithm allows efficient coding over a wide range of coding parameters — thus providing means for generic coding of segmented video between very high and very low bit rates.
Abstract: A low complexity shape-adaptive DCT transform algorithm for coding pels in arbitrarily shaped image segments is presented. The proposed algorithm is compared to the well-established generalized shape-adaptive transform method introduced by Gilge et al. in terms of transform efficiency and computational complexity. Results obtained under both theoretical and experimental conditions show that the new algorithm achieves a transform efficiency close to that of the Gilge method with considerably reduced computational complexity. The proposed shape-adaptive DCT algorithm was implemented in a standard MPEG-1 coder to provide object- or segment-based coding of images and video with additional content-based functionality. The extended MPEG-1 object-based coding scheme can handle generic input sequences and can readily provide MPEG-1 backward compatibility if no contour data is transmitted for a given video sequence. Results for INTRA coding of images indicate that the algorithm allows efficient coding over a wide range of coding parameters — thus providing means for generic coding of segmented video between very high and very low bit rates. It is further shown that some of the content-based functionalities currently discussed in MPEG-4 can be provided efficiently using the proposed object-based coding scheme.

107 citations


Journal ArticleDOI
TL;DR: A hierarchical scheme is developed to extract the semantic features in the head-and-shoulders scene, such as silhouette, face, eyes and mouth, using a knowledge-based selection mechanism.
Abstract: A method for the adaptation of a generic 3-D face model to an actual face in a head-and-shoulders scene is discussed, with application to video-telephony. The adaptation is carried out both on a global scale to reposition and resize the wire-frame, as well as on a local scale to mimic individual physiognomy. To this effect a hierarchical scheme is developed to extract the semantic features in the head-and-shoulders scene, such as silhouette, face, eyes and mouth, using a knowledge-based selection mechanism. These algorithms, which are to be an integral part of a general model-based image coder, are tested on typical videophone sequences.

104 citations


Journal ArticleDOI
TL;DR: A new approach to bit-rate control for inter-frame encoders such as MPEG encoders, using concepts from control theory, is described; its central feature is a surprisingly simple but effective model of the encoder, consisting of a gain element, a delay element and additive noise.
Abstract: Bit-rate control is a central problem in designing image sequence compression systems. In this paper we describe a new approach to bit-rate control for inter-frame encoders such as MPEG encoders. This approach uses concepts from control theory. Its central feature is a surprisingly simple but effective model for the encoder, which consists of a gain element, a delay element and additive noise. In our system we control the bit-rate with a PI-controller which is set to achieve two objectives: (1) we want the picture quality to be as uniform as possible, and (2) we want to use as closely as possible the available amount of bits. It is demonstrated in the paper that these two objectives, when considered separately, lead to contradictory settings of the controller. This dilemma can be solved by using Bit Usage Profiles that indicate how the bits have to be spread over the pictures. The effectiveness of the approach is demonstrated by designing a bit-rate control for an MPEG encoder that has a nearly constant bit-rate per group of pictures (GOP). Such a bit-rate control is of high value for applications like magnetic recording, where a constant bit-rate per GOP is required in order to realize playback trick modes, e.g. the fast forward mode.
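The PI-controller idea can be sketched as follows. This is a minimal illustration under assumed conditions: the gains and the toy encoder model (bits inversely proportional to the quantiser scale) are not from the paper, which additionally uses Bit Usage Profiles to shape the per-picture targets.

```python
# Minimal PI bit-rate controller sketch. The proportional term reacts
# to the current bit error, the integral term removes the steady-state
# error; constants are illustrative only.

class PIBitrateController:
    def __init__(self, target_bits, kp=0.0005, ki=0.0001):
        self.target = target_bits
        self.kp, self.ki = kp, ki
        self.integral = 0.0

    def update(self, produced_bits):
        """Return a quantiser-scale correction from the bit error."""
        error = produced_bits - self.target
        self.integral += error
        return self.kp * error + self.ki * self.integral
```

Driving a toy encoder model `bits = 100000 / qscale` with this controller settles the output rate at the target.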

69 citations


Journal ArticleDOI
TL;DR: A method of object-selective quantizer control in a standard coding system based on motion-compensated discrete cosine transform based onCCITT's recommendation H.261 is described, based on two novel algorithms, namely buffer rate modulation and buffer size modulation.
Abstract: We present a novel and practical way to integrate techniques from computer vision into low bit-rate coding systems for video teleconferencing applications. Our focus is to locate and track the faces and selected facial features of persons in typical head-and-shoulders video sequences, and to exploit the location information in a ‘classical’ video coding/decoding system. The motivation is to enable the system to encode selectively various image areas and to produce perceptually pleasing coded images where faces are sharper. We refer to this approach, a mix of classical waveform coding and model-based coding, as model-assisted coding. We propose two totally automatic algorithms which, respectively, perform the detection of a head outline and identify an ‘eyes-nose-mouth’ region, both from downsampled binary thresholded edge images. The algorithms operate accurately and robustly, even in cases of significant head rotation or partial occlusion by moving objects. We show how the information about face and facial feature location can be advantageously exploited by low bit-rate waveform-based video coders. In particular, we describe a method of object-selective quantizer control in a standard coding system based on the motion-compensated discrete cosine transform (CCITT Recommendation H.261). The approach is based on two novel algorithms, namely buffer rate modulation and buffer size modulation. By forcing the rate control algorithm to transfer a fraction of the total available bit-rate from the coding of the non-facial area to that of the facial area, the coder produces images with better-rendered facial features, i.e. coding artefacts in the facial area are less pronounced and eye contact is preserved. The improvement was found to be perceptually significant on video sequences coded at the ISDN rate of 64 kbps, with 48 kbps for the input (color) video signal in QCIF format.

66 citations


Journal ArticleDOI
TL;DR: A new approach to adaptive Huffman coding of 2-D DCT coefficients for image sequence compression based on the popular motion-compensated interframe coding, which employs self-switching multiple Huffman codebooks for entropy coding of quantized transform coefficients.
Abstract: This paper presents a new approach to adaptive Huffman coding of 2-D DCT coefficients for image sequence compression. Based on the popular motion-compensated interframe coding, the proposed method employs self-switching multiple Huffman codebooks for entropy coding of quantized transform coefficients. Unlike the existing multiple codebook approaches where the type of block (intra/inter or luminance/chrominance) selects a codebook, the proposed method jointly utilizes the type of block, the quantizer step size, and the zigzag scan position for the purpose of codebook selection. In addition, as another utilization of the quantizer step size and the scan position, the proposed method uses a variable-length “Escape” sequence for encoding rare symbols. Experimental results show that the proposed method with two codebooks provides a 0.1–0.4 dB improvement over the single-codebook scheme, and this margin turns out to be substantially larger than the one the MPEG-2 two-codebook approach has over the single-codebook approach.
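The joint selection rule can be illustrated with a hypothetical two-codebook sketch. The switching thresholds, the tables and their codewords below are invented for illustration; only the principle, that the codebook index depends jointly on block type, quantiser step size and scan position rather than on block type alone, follows the paper.

```python
# Illustrative self-switching codebook selection (all values are
# assumptions, not the paper's tables).

CODEBOOKS = {
    0: {(0, 1): "10", (1, 1): "110", (0, 2): "1110"},  # low-activity table
    1: {(0, 1): "0", (1, 1): "10", (0, 2): "110"},     # high-activity table
}

def select_codebook(is_intra, qstep, scan_pos):
    # Intra blocks with fine quantisation early in the zigzag scan
    # tend to carry large coefficients -> high-activity table.
    if is_intra and qstep <= 8 and scan_pos < 6:
        return 1
    return 0

def encode_symbol(run, level, is_intra, qstep, scan_pos):
    book = CODEBOOKS[select_codebook(is_intra, qstep, scan_pos)]
    return book.get((run, abs(level)))  # None would trigger an Escape
```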

55 citations


Journal ArticleDOI
TL;DR: A detailed analysis of the differential approach for motion estimation in video image sequences and some key elements for an effective implementation of a complete motion estimation scheme are drawn.
Abstract: This paper presents a detailed analysis of the differential approach for motion estimation in video image sequences. The models considered are defined either by a parametric approach, or by a physical approach (in terms of parameters of the pick-up equipment, movement and object structures). The relationships between the 2D and the 3D approaches are examined. The ambiguities inherent in a physical interpretation of a set of descriptors identified from image sequences are underlined. The critical points when using differential estimators are discussed, in particular, we study several classical image processing tools which improve the convergence rate of these estimators (hierarchical analysis, multiresolution, spatial interpolation of the luminance…). Definition and tuning of gains, initialization stage, cross-dependence between image segmentation and identification of the parameters associated to each region (as well as the duality between top-down and bottom-up approaches), which partly condition the behavior of the algorithms, are studied too. Results in terms of motion and segmentation maps, images predicted from one or several previous images by motion compensation, convergence curves of some of the proposed iterative algorithms illustrate this paper. Finally, we draw from these theoretical developments and the associated simulations some key elements for an effective implementation of a complete motion estimation scheme. We conclude by some perspectives for future work.

44 citations


Journal ArticleDOI
TL;DR: A segmentation algorithm based on split and merge, which combines merge, elimination of small regions and control of the number of regions for deep segmentation in very low bit-rate video coding.
Abstract: Very low bit-rate video coding has recently become one of the most important areas of image communication and a large variety of applications have already been identified. Since conventional approaches are reaching a saturation point, in terms of coding efficiency, a new generation of video coding techniques, aiming at a deeper “understanding” of the image, is being studied. In this context, image analysis, particularly the identification of objects or regions in images (segmentation), is a very important step. This paper describes a segmentation algorithm based on split and merge. Images are first simplified using mathematical morphology operators, which eliminate perceptually less relevant details. The simplified image is then split according to a quad tree structure and the resulting regions are finally merged in three steps: merge, elimination of small regions and control of the number of regions.
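The split stage can be sketched as a recursive quad-tree subdivision. The homogeneity test used here (max-min intensity range) is an assumption for illustration; the paper additionally simplifies the image with morphological operators first and merges the resulting regions in three steps.

```python
# Simplified quad-tree split stage of a split-and-merge segmentation:
# a block is kept as one region if its intensity range is small,
# otherwise it is divided into four quadrants and tested recursively.

def quadtree_split(img, x, y, size, thresh, out):
    block = [img[y + j][x + i] for j in range(size) for i in range(size)]
    if max(block) - min(block) <= thresh or size == 1:
        out.append((x, y, size))          # homogeneous leaf region
        return
    h = size // 2
    for dx, dy in ((0, 0), (h, 0), (0, h), (h, h)):
        quadtree_split(img, x + dx, y + dy, h, thresh, out)
```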

Journal ArticleDOI
TL;DR: It is shown that with motion-compensated hybrid coding, object-based analysis-synthesis coding, knowledge-based coding and semantic coding, there is a consistent development of source models so that these coding techniques can be combined in a layered coding system.
Abstract: Known coding techniques for transmitting moving images at very low bit rates are explained by the source models on which these coding techniques are based. It is shown that with motion-compensated hybrid coding, object-based analysis-synthesis coding, knowledge-based coding and semantic coding, there is a consistent development of source models. In consequence, these coding techniques can be combined in a layered coding system. From experimental results obtained for object-based analysis-synthesis coding, estimates for the coding efficiency of such a layered coding system are derived using head-and-shoulders video telephone test sequences. It is shown that an additional compression factor of about 3 can be expected with such a complex layered coding system, when compared to block-based hybrid coding.

Journal ArticleDOI
TL;DR: An algorithm for accurate motion estimation that takes into account both the motion of the camera (global motion) and the motion of the imaged objects (local motion) through the use of a multistage structure is described.
Abstract: In this paper we describe an algorithm for accurate motion estimation that takes into account both the motion of the camera (global motion) and the motion of the imaged objects (local motion) through the use of a multistage structure. First, global motion parameters are accurately estimated and the images are compensated for the camera motion. The local motion field is then estimated on the basis of these ‘compensated’ images. To improve the performance of this estimation, both temporal and spatial congruence constraints have been introduced. To test the performance of the proposed algorithm a motion compensated image interpolation that uses the estimated motion field has been carried out. The experimental results show the effectiveness of the proposed multistage motion estimation procedure.

Journal ArticleDOI
TL;DR: A hybrid coder using block-based coding as a fall-back mode in cases where 3-D motion estimation fails, is proposed and experimental results demonstrate the performance of the proposed coding scheme.
Abstract: A method that performs 3-D motion estimation for stereoscopic image sequences is presented. The 2-D motion of each object observed in one of the two views is modelled using a 3-D motion model involving a translation and a rotation. The estimation of model parameters is performed in two steps: a linear step involving 2-D vectors that are initially estimated using block matching techniques followed by a non-linear step involving displaced frame difference minimization. The regions where the 3-D model is applied are identified using a motion-based split and merge technique. Furthermore, an extension of the 3-D motion estimation method that uses a single 3-D motion model to describe the apparent 2-D motion in both channels is examined. These 3-D motion estimation methods are then integrated in a stereoscopic interframe coding scheme. A hybrid coder using block-based coding as a fall-back mode in cases where 3-D motion estimation fails, is proposed. Experimental results demonstrate the performance of the proposed coding scheme.


Journal ArticleDOI
TL;DR: Using fast transformation and interpolation algorithms, it is shown that while the compression efficiency of the presented method is far superior to that of the conventional full search block-matching motion estimation, its computational complexity is still affordable.
Abstract: A combination of spatial transformation and image segmentation is used to compensate for non-uniform intensity changes in moving scenes. The method efficiently tracks movements such that the motion vectors alone can be employed to represent a moving object with complex motion. Using fast transformation and interpolation algorithms, it is shown that while the compression efficiency of the presented method is far superior to that of the conventional full search block-matching motion estimation, its computational complexity is still affordable.

Journal ArticleDOI
TL;DR: The use of the geodesic skeleton as a morphological tool for contour coding of segmented image sequences and a new technique is presented for the entropy coding of the coordinates of the skeleton points exploiting their special spatial distribution.
Abstract: Region-based coding schemes are among the most promising compression techniques for very low bit-rate applications. They consist of image segmentation, contour and texture coding. This paper deals with the use of the geodesic skeleton as a morphological tool for contour coding of segmented image sequences. In the geodesic case, already coded and known regions are taken into account for the coding of contours of unknown regions. A new technique is presented for the entropy coding of the coordinates of the skeleton points exploiting their special spatial distribution. Furthermore, a fast algorithm for the reconstruction of the skeleton points is given based on hierarchical queues. In the case of numerous isolated contour arcs (for example error coding in a motion prediction loop), the geodesic skeleton proves more efficient than traditional methods. Results at very low bit-rates are presented and compared to standard methods confirming the validity of the chosen approach.

Journal ArticleDOI
TL;DR: The source model of moving rigid 3D objects of an object-based analysis-synthesis coder (OBASC) is extended from diffuse to non-diffuse illumination introducing the explicit illumination model of a distant point light source and ambient diffuse light.
Abstract: In this paper, the source model of moving rigid 3D objects of an object-based analysis-synthesis coder (OBASC) is extended from diffuse to non-diffuse illumination, introducing an explicit illumination model of a distant point light source and ambient diffuse light. For each image of a real image sequence containing moving objects, first, shape and 3D motion parameters describing the objects are estimated assuming an ellipsoid-like smooth shape. Then, the illumination parameters are estimated by a fast iterative maximum-likelihood Gauss-Newton estimation method. Typically, the illumination parameters converge close to the true values after very few images. The accuracy depends on the amount of object rotation and the correctness of the shape assumptions. For a real image sequence showing a textured ball covering 20% of the image area, rotating about 10° per frame, and illuminated by spot and ambient light, the extension of the source model reduces the model failures from 9.9% of the image area to 6.7%. In the areas of model failure, the image synthesized from the source model parameters differs significantly from the real image. In this early experiment, source model parameters are coded losslessly. Since model failures are expensive in terms of bit-rate, a significant reduction of bit-rate can be expected.
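The extended illumination model, a distant point light source plus ambient diffuse light, amounts to Lambertian shading. The sketch below uses common graphics notation (albedo, unit normal, unit light direction), which is an assumption rather than the paper's own symbols.

```python
# Lambertian shading under ambient + distant point light: the point
# term is the clipped dot product of surface normal and light
# direction, scaled by the light intensity.

def shade(albedo, normal, light_dir, ambient, point_intensity):
    """normal and light_dir are assumed to be unit 3-vectors."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return albedo * (ambient + point_intensity * max(0.0, dot))
```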

Journal ArticleDOI
TL;DR: It is shown that an amplitude scalable compression scheme for 10-bit video can be developed using the MPEG-2 syntax and tools and that it is possible to quantitatively analyze the multi-generation characteristics of the non-scalable approach using the theory of generalized projections.
Abstract: We address the problem of compressing 10-bits per pixel video using the tools of the emerging MPEG-2 standard, which is primarily targeted to 8-bits per pixel video. We show that an amplitude scalable compression scheme for 10-bit video can be developed using the MPEG-2 syntax and tools. We experimentally evaluate the performance of the scalable approach and compare it with the straightforward non-scalable approach where the 10-bit input is rounded to 8 bits and usual 8-bit MPEG-2 compression is applied. In addition to general performance evaluation of scalable and non-scalable approaches, we also evaluate their multi-generation characteristics where the input video undergoes successive compression-decompression cycles. We show that it is possible to quantitatively analyze the multi-generation characteristics of the non-scalable approach using the theory of generalized projections.
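The amplitude-scalable idea can be made concrete with a simple bit-split sketch (an illustration of the principle, not the MPEG-2 syntax): a 10-bit sample is divided into an 8-bit base layer, which a standard 8-bit decoder can use on its own, and a 2-bit enhancement layer that restores full precision.

```python
# Split a 10-bit sample into an 8-bit base layer and a 2-bit
# enhancement layer, and reconstruct with or without the enhancement.

def split_layers(sample10):
    base8 = sample10 >> 2      # 8 most significant bits
    enh2 = sample10 & 0b11     # 2-bit refinement
    return base8, enh2

def reconstruct(base8, enh2=None):
    if enh2 is None:           # base-layer-only decoding
        return base8 << 2
    return (base8 << 2) | enh2
```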

Journal ArticleDOI
TL;DR: Gibbs-Markov models are proposed and linked together by the maximum a posteriori probability (MAP) criterion, and the resulting objective function is minimized using multiresolution deterministic relaxation.
Abstract: This paper makes two contributions to the area of motion-compensated processing of image sequences. The first contribution is the development of a framework for the modeling and estimation of dense 2-D motion trajectories with acceleration. To this end, Gibbs-Markov models are proposed and linked together by the maximum a posteriori probability (MAP) criterion, and the resulting objective function is minimized using multiresolution deterministic relaxation. The accuracy of the method is demonstrated by measuring the mean-squared error of estimated motion parameters for images with synthetic motion. The second contribution is the demonstration of a significant gain resulting from the use of trajectories with acceleration in motion-compensated temporal interpolation of videoconferencing/videophone images. An even higher gain is demonstrated when the accelerated motion trajectory model is augmented with occlusion and motion discontinuity models. The very good performance of the method suggests a potential application of the proposed framework in the next generation of video coding algorithms.
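A trajectory with acceleration is simply a second-order extrapolation of position over time. The sketch below shows that model on its own; the paper's actual contribution is estimating such trajectories densely via MAP/Gibbs-Markov modeling, which is not reproduced here.

```python
# Constant-acceleration motion trajectory: per-coordinate
# x(t) = x0 + v*t + 0.5*a*t**2.

def trajectory_position(x0, v, a, t):
    return tuple(x + vi * t + 0.5 * ai * t * t
                 for x, vi, ai in zip(x0, v, a))
```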

Journal ArticleDOI
TL;DR: It is shown that for simple video telephony scenes a reduction of more than 30% in the energy of the prediction error can be achieved with an unchanged number of transmitted motion vectors and with only a modest increase in computational complexity.
Abstract: A new method for motion-compensated temporal prediction of image sequences is proposed. Motion vector fields in natural scenes should possess two basic properties. First, the field should be smoothly varying within moving objects to compensate for nonrigid or rotational motion, and scaling of objects. Second, the field should be discontinuous along the boundaries of the objects. In the proposed method the motion vector field is modelled using finite element methods and interpolated using adaptive interpolators to satisfy the above-stated requirements. This is particularly important when only very sparse estimates of motion vector fields are available in the decoder due to bit-rate constraints limiting the amount of overhead information that can be transmitted. The proposed prediction method can be applied for low-bit-rate video coding in conventional codecs based on motion-compensated prediction and transform coding, as well as in model-based codecs. The performance of the proposed method is compared with standard motion-compensated prediction based on block matching. It is shown that for simple video telephony scenes a reduction of more than 30% in the energy of the prediction error can be achieved with an unchanged number of transmitted motion vectors and with only a modest increase in computational complexity. When implemented in an H.261 codec the new prediction method can improve the peak SNR 1–2 dB producing a significant visual improvement.
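Interpolating a dense motion field from sparse node vectors can be sketched with a bilinear element over one cell. This is only the smooth part of the model; the paper's adaptive interpolators additionally switch behaviour at object boundaries to preserve discontinuities, which this sketch omits.

```python
# Bilinear interpolation of a motion vector inside a unit cell whose
# four corners carry known node vectors.

def interp_vector(nodes, x, y):
    """nodes: {(0,0): v00, (1,0): v10, (0,1): v01, (1,1): v11};
    (x, y) in [0,1]^2. Returns the interpolated 2-D vector."""
    def lerp(a, b, t):
        return tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))
    top = lerp(nodes[(0, 0)], nodes[(1, 0)], x)
    bot = lerp(nodes[(0, 1)], nodes[(1, 1)], x)
    return lerp(top, bot, y)
```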

Journal ArticleDOI
TL;DR: It is shown that the analysis of moving image sequences for 3D modelling can be performed in a relatively straightforward manner if the scene is captured in stereo, and the known problems of ambiguity suffered by monocular-source model-based coders are alleviated.
Abstract: It is shown that the analysis of moving image sequences for 3D modelling can be performed in a relatively straightforward manner if the scene is captured in stereo. Output from a stereo disparity estimation process using calibrated cameras gives absolute 3D surface coordinates from a single stereo pair. When combined with monocular motion cues, the true 3D motion parameters of moving objects can be accurately calculated. Further analysis enables segmentation of body elements according to motion while the 3D surface feature structure, although available from the start, can be integrated and checked for anomalies over the sequence. These results are expected to alleviate the known problems of ambiguity suffered by monocular-source model-based coders.

Journal ArticleDOI
TL;DR: The incorporation of the proposed global-motion estimation technique in an image-sequence coder was found to bring about a substantial reduction in bit-rate without degrading the perceived quality or the PSNR.
Abstract: A technique for global-motion estimation and compensation in image sequences of 3-D scenes is described in this paper. Each frame is segmented into regions whose motion can be described by a single set of parameters and a set of motion parameters is estimated for each segment. This is done using an iterative block-based image segmentation combined with the estimation of the parameters describing the global motion of each segment. The segmentation is done using a Gibbs-Markov model-based iterative technique for finding a local optimum solution to a maximum a posteriori probability (MAP) segmentation problem. The initial condition for this process is obtained by applying a Hough transform to the motion vectors of each block in the frame obtained by block matching. In each iteration, given a segmentation, the motion parameters are estimated using the least-squares (LS) technique. To obtain the final segmentation and the more appropriate higher-order motion model for each segment, a final stage of splitting/merging of segments is needed. This step is performed on the basis of maximum-likelihood decisions combined with the determination of the higher-order model parameters by LS. The incorporation of the proposed global-motion estimation technique in an image-sequence coder was found to bring about a substantial reduction in bit-rate without degrading the perceived quality or the PSNR.
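The least-squares step can be illustrated with a deliberately simple motion model. The translation-plus-zoom model below (v = t + s*p) is an assumed stand-in for the paper's higher-order parametric models, and the Hough-transform initialisation is omitted; only the LS fit of model parameters to block motion vectors is shown.

```python
# Least-squares fit of a translation-plus-zoom global-motion model
# v = t + s*p to block motion vectors, via the closed-form normal
# equations for this 3-parameter model.

def fit_global_motion(points, vectors):
    n = len(points)
    mpx = sum(p[0] for p in points) / n
    mpy = sum(p[1] for p in points) / n
    mvx = sum(v[0] for v in vectors) / n
    mvy = sum(v[1] for v in vectors) / n
    num = sum((p[0] - mpx) * (v[0] - mvx) + (p[1] - mpy) * (v[1] - mvy)
              for p, v in zip(points, vectors))
    den = sum((p[0] - mpx) ** 2 + (p[1] - mpy) ** 2 for p in points)
    s = num / den
    return (mvx - s * mpx, mvy - s * mpy), s   # translation t, zoom s
```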

Journal ArticleDOI
TL;DR: The framework which guided the development of the generic video coding standard MPEG-2 Video in MPEG is addressed, resulting in the establishment of ‘Profiles and Levels’ which define subsets of the whole standard to facilitate interworking among different applications as well as assisting practical implementation of the standard.
Abstract: This paper addresses the framework which guided the development of the generic video coding standard MPEG-2 Video (ISO/IEC 13818-2) in MPEG (ISO/IEC JTC1/SC29/WG11). The authors were deeply involved in the standard development process, and for that purpose we first collected various requirements from a number of applications, typically distribution, storage and retrieval, and communication services, and extracted the essential items. This requirements information led the MPEG members to collaboratively develop the necessary algorithmic elements through the use of an evolving ‘Test Model’. The second stage of the work was to investigate how the generic standard should be structured, resulting in the establishment of ‘Profiles and Levels’, which define subsets of the whole standard to facilitate interworking among different applications as well as assisting practical implementation of the standard. The final stage was to make a plan for verifying the draft standard through testing before its approval.

Journal ArticleDOI
TL;DR: The performance trends shown by the analysis indicate that the complexity of model-based coding algorithms, when combined with their reliance on coding picture differences and content-dependent algorithm execution times, interact to make it very difficult to achieve significant speed-up of sequential algorithms.
Abstract: Model-based and object-oriented coding algorithms are generally more computationally complex than current block-based image coding standards such as H.261, due primarily to the complexity of the image analysis they require. In this paper, simulations of H.261 and two model-based coding algorithms are analysed in terms of their computational complexity, and mapped onto a generalised image coder parallel-pipeline model. Example implementations of the H.261 coder and an object-oriented coder using general-purpose parallel processor systems are then presented to confirm the validity of the performance trend analyses; these achieve maximum speedups of about 11 and 1.7, respectively, using up to 16 processors. The performance trends shown by the analysis indicate that the complexity of model-based coding algorithms, combined with their reliance on coding picture differences and their content-dependent execution times, makes it very difficult to achieve significant speed-up of sequential algorithms. Furthermore, the algorithm complexity and abstract data structures will make direct hardware implementations increasingly difficult. Overcoming these problems to achieve real-time model-based coders may require significant algorithmic compromises to be made.
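The gap between the two reported speedups (about 11 versus about 1.7 on 16 processors) can be given a rough interpretation via Amdahl's law: inverting the law yields the effective serial fraction that would explain each measured speedup. This is an illustrative back-of-the-envelope calculation, not an analysis from the paper.

```python
def amdahl_speedup(serial_fraction, n):
    """Amdahl's law: speedup on n processors for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)

def implied_serial_fraction(speedup, n):
    """Invert Amdahl's law: the serial fraction that would explain a
    measured speedup on n processors (an idealised estimate)."""
    return (n / speedup - 1.0) / (n - 1.0)

# Illustrative numbers from the abstract: speedups of ~11 and ~1.7 on 16 CPUs.
f_h261 = implied_serial_fraction(11.0, 16)  # roughly 3% effectively serial
f_oo = implied_serial_fraction(1.7, 16)     # over half effectively serial
```

Under this idealised reading, the object-oriented coder behaves as if more than half of its work were sequential, consistent with the abstract's point about content-dependent execution times limiting parallel speed-up.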

Journal ArticleDOI
TL;DR: A new approach for very low bit-rate sequence coding (below 64 kbit/s) is proposed in which spatial information and temporal changes are encoded using similar tools, while maintaining good subjective quality.
Abstract: We propose a new approach for very low bit-rate sequence coding (below 64 kbit/s) in which spatial information and temporal changes are encoded using similar tools. In this spatio-temporal integrated approach adaptive multigrids are used for both predicted images (the inter-images) and for the intra-images. The inter-images are mainly encoded by a reliable motion field analyzed and transcribed on an adaptive sampling grid (with variable sampling mesh). The intra-images are encoded using tree segmentation leading to a similar transcription. Universal entropy coding and image reconstruction based on overlapping waveforms are used for both intra-images and inter-images. This coding system is well suited for the cases where regular intra-image refreshes are demanded. Present results suggest that with the proposed method, the range of very low bit-rate can be achieved, maintaining a good subjective quality.
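The tree segmentation used for the intra-images can be sketched as a variance-driven quadtree: a block is split into four quadrants until it is homogeneous enough or a minimum size is reached. The split criterion, thresholds, and function below are assumptions for illustration; the paper's actual adaptive-multigrid construction is more elaborate.

```python
import numpy as np

def quadtree_split(img, x, y, size, thresh, min_size, leaves):
    """Recursively split a square block until its variance falls below
    `thresh` or the minimum block size is reached. The variance
    criterion and thresholds are illustrative, not from the paper."""
    block = img[y:y + size, x:x + size]
    if size <= min_size or block.var() <= thresh:
        leaves.append((x, y, size))
        return
    h = size // 2
    for dy in (0, h):
        for dx in (0, h):
            quadtree_split(img, x + dx, y + dy, h, thresh, min_size, leaves)

# A flat image needs no splitting: one leaf covers everything.
flat = np.zeros((16, 16))
leaves = []
quadtree_split(flat, 0, 0, 16, thresh=1.0, min_size=4, leaves=leaves)

# An image with one bright quadrant splits once into four homogeneous leaves.
mixed = np.zeros((16, 16))
mixed[:8, :8] = 100.0
leaves2 = []
quadtree_split(mixed, 0, 0, 16, thresh=1.0, min_size=4, leaves=leaves2)
```

The resulting leaf list plays the role of the variable-mesh sampling grid: fine where the image is active, coarse where it is smooth.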

Journal ArticleDOI
TL;DR: A new adaptive method based on a two-dimensional finite-state vector quantization (2-D FSVQ) and a frame adaptive technique using codebook and address replenishments that can achieve nearly constant quality over the entire image sequence.
Abstract: This paper presents a new adaptive method for the compression of image sequences. The method is based on a two-dimensional finite-state vector quantization (2-D FSVQ) and a frame-adaptive technique using codebook and address replenishments. Each frame is described by an FSVQ super codebook and an address map. The super codebook is the collection of state codebooks, and each state codebook consists of K region codebooks. The super codebook and the address map are efficiently updated on a frame basis so as to more closely match the local frame statistics. The method is computationally efficient and achieves nearly constant quality over the entire image sequence.
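The codebook-replenishment idea can be sketched in a much-simplified form: encode each frame's blocks by nearest-neighbour VQ, then overwrite the codewords whose best match was still poor so the codebook tracks the local frame statistics. The threshold rule and function names are assumptions; the paper's FSVQ state machinery is omitted.

```python
import numpy as np

def encode_frame(blocks, codebook):
    """Nearest-neighbour VQ: return the address map (one codeword index
    per block) and the per-block squared-error distortion."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    addr = d.argmin(axis=1)
    return addr, d[np.arange(len(blocks)), addr]

def replenish(blocks, codebook, addr, dist, thresh):
    """Frame-adaptive update: blocks whose best match is still poor
    overwrite their codeword (a simplified stand-in for the paper's
    codebook-replenishment scheme)."""
    cb = codebook.copy()
    for i in np.flatnonzero(dist > thresh):
        cb[addr[i]] = blocks[i]
    return cb

# Tiny example: the second block is badly represented, so its codeword
# is replenished and the next pass encodes both blocks exactly.
blocks = np.array([[0.0, 0.0], [10.0, 10.0]])
cb = np.array([[0.0, 0.0], [4.0, 4.0]])
addr, dist = encode_frame(blocks, cb)
cb2 = replenish(blocks, cb, addr, dist, thresh=1.0)
addr2, dist2 = encode_frame(blocks, cb2)
```

Only the replaced codewords and the address map need to be transmitted per frame, which is what keeps the quality nearly constant at modest rate cost.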

Journal ArticleDOI
TL;DR: The proposed coder sequentially employs selective motion estimation on the wavelet transform domain, motion-compensated prediction of wavelet coefficients, and selective entropy-constrained vector quantization of the resultant MCP errors to considerably reduce the computational burden.
Abstract: This paper proposes a new motion-compensated wavelet transform video coder for very low bit-rate visual telephony. The proposed coder sequentially employs: (1) selective motion estimation in the wavelet transform domain, (2) motion-compensated prediction (MCP) of wavelet coefficients, and (3) selective entropy-constrained vector quantization (ECVQ) of the resultant MCP errors. The selective schemes in motion estimation and in quantization, which efficiently exploit the characteristics of image sequences in visual telephony, considerably reduce the computational burden. The coder also employs a tree-structure encoding to represent efficiently which blocks were encoded. In addition, in order to reduce the number of ECVQ codebooks and the image dependency of their performance, we introduce a preprocessing of signals which normalizes the input vectors of ECVQ. Simulation results show that our video coder provides good PSNR (peak-to-peak signal-to-noise ratio) performance and efficient rate control.
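The 'selective motion estimation' idea can be sketched as block matching that is simply skipped for nearly static blocks, which is where most of the savings come from in head-and-shoulders material. The SAD criterion, the activity threshold, and the pixel-domain setting are illustrative assumptions; the paper operates on wavelet coefficients.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return np.abs(a - b).sum()

def selective_block_match(cur, ref, bsize, search, activity_thresh):
    """Full-search block matching, skipped for blocks whose plain frame
    difference is already small (a simplified 'selective' scheme;
    thresholds are assumptions)."""
    h, w = cur.shape
    vectors = {}
    for y in range(0, h - bsize + 1, bsize):
        for x in range(0, w - bsize + 1, bsize):
            blk = cur[y:y + bsize, x:x + bsize]
            # Nearly static block: assign the zero vector, no search.
            if sad(blk, ref[y:y + bsize, x:x + bsize]) < activity_thresh:
                vectors[(y, x)] = (0, 0)
                continue
            best, best_v = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - bsize and 0 <= xx <= w - bsize:
                        cost = sad(blk, ref[yy:yy + bsize, xx:xx + bsize])
                        if cost < best:
                            best, best_v = cost, (dy, dx)
            vectors[(y, x)] = best_v
    return vectors

# A bright patch shifted right by 2 pixels: its block gets vector (0, -2)
# back into the reference; the static background blocks are skipped.
ref = np.zeros((8, 8)); ref[0:4, 2:6] = 1.0
cur = np.zeros((8, 8)); cur[0:4, 4:8] = 1.0
mv = selective_block_match(cur, ref, bsize=4, search=2, activity_thresh=1.0)
```

In the paper, vectors found this way drive the MCP of wavelet coefficients, and only blocks flagged in the tree structure are entropy-coded.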

Journal ArticleDOI
TL;DR: The experimental results obtained with image vector quantization show that the proposed method learns more rapidly and yields better quality of the coded images than a conventional competitive learning method with a scalar learning rate.
Abstract: In this paper, we present a new competitive learning algorithm with classified learning rates, and apply it to vector quantization of images. The basic idea is to assign a distinct learning rate to each reference vector. Each reference vector is updated independently of all the other reference vectors using its own learning rate. Each learning rate is changed only when its corresponding reference vector wins the competition; the learning rates of the losing reference vectors are not changed. The experimental results obtained with image vector quantization show that the proposed method learns more rapidly and yields better quality of the coded images than the conventional competitive learning method with a scalar learning rate.
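The classified-learning-rate rule above can be sketched compactly: each codeword carries its own win count and learning rate, and only the winner's rate changes. The specific schedule below (rate proportional to 1/wins, which makes each codeword the running mean of the samples it has won) is an assumed choice for illustration, not necessarily the paper's schedule.

```python
import numpy as np

def classified_competitive_learning(data, codebook, rate0=1.0):
    """Competitive learning where each reference vector keeps its own
    learning rate, updated only when that vector wins the competition.
    The 1/wins schedule is an illustrative assumption."""
    cb = codebook.astype(float).copy()
    wins = np.zeros(len(cb))
    for x in data:
        i = ((cb - x) ** 2).sum(axis=1).argmin()  # winner index
        wins[i] += 1
        rate = rate0 / wins[i]  # losers' rates are left untouched
        cb[i] += rate * (x - cb[i])
    return cb

# Two well-separated 1-D clusters: each codeword converges to the mean
# of the samples it won, independently of the other codeword.
data = np.array([[0.0], [0.2], [10.0], [10.2]])
cb = classified_competitive_learning(data, np.array([[0.0], [9.0]]))
```

Because each rate decays with its own win count, a frequently winning codeword settles quickly while a rarely winning one stays plastic, which is the intuition behind the faster learning reported above.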

Journal ArticleDOI
TL;DR: The purpose of the method proposed here is to achieve exact decomposition and reconstruction using ideal band-pass filters implemented in the DFT domain without any excess data.
Abstract: This work extends previously reported work on image multi-subband decomposition/reconstruction using DFT. The purpose of the method proposed here is to achieve exact decomposition and reconstruction using ideal band-pass filters implemented in the DFT domain without any excess data.
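The exact-reconstruction property of ideal band-pass filtering in the DFT domain can be shown in one dimension: masking disjoint sets of DFT bins and inverse-transforming each set yields subbands that sum back to the original signal exactly, because the masks partition the bins. The bin partition below is by raw index (a practical scheme would pair positive and negative frequencies), and the whole 1-D setting is a simplification of the paper's image case.

```python
import numpy as np

def dft_subbands(x, n_bands):
    """Split a 1-D signal into subbands using ideal ('brick-wall')
    filters implemented by masking DFT bins. Since the masks partition
    the bins, the subbands sum back to the signal exactly."""
    X = np.fft.fft(x)
    n = len(x)
    edges = np.linspace(0, n, n_bands + 1).astype(int)
    bands = []
    for b in range(n_bands):
        mask = np.zeros(n)
        mask[edges[b]:edges[b + 1]] = 1.0
        bands.append(np.fft.ifft(X * mask))
    return bands

x = np.random.default_rng(0).standard_normal(64)
bands = dft_subbands(x, 4)
recon = np.sum(bands, axis=0).real  # exact up to floating-point error
```

Since each subband occupies only n/n_bands DFT bins, it can be stored critically sampled, which corresponds to the 'without any excess data' property claimed above.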