Proceedings ArticleDOI

Adaptive Split-and-Merge for Image Analysis and Coding

01 May 1986 - Vol. 0594, pp 2-9
TL;DR: In this paper, an approximation algorithm for two-dimensional (2-D) signals, e.g. images, is presented: the original signal is partitioned into adjacent regions, and each region is approximated in the least-squares sense by a 2-D analytical function.
Abstract: An approximation algorithm for two-dimensional (2-D) signals, e.g. images, is presented. The approximation is obtained by partitioning the original signal into adjacent regions, with each region approximated in the least-squares sense by a 2-D analytical function. The segmentation procedure is controlled iteratively to ensure at each step the best possible fidelity between the original image and the segmented one. The segmentation is based on two successive steps: splitting the original picture into adjacent squares of different sizes, then merging them in an optimal way into the final region configuration. Results are presented for the case where the approximation is performed by polynomial functions.
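To make the two-step procedure concrete, here is a minimal sketch of the split-and-merge idea in Python. It is not the authors' implementation: the plane model, the SSE threshold, and the restriction of merging to quadtree siblings are simplifying assumptions (the paper merges squares into general region configurations and supports higher-degree polynomials).

```python
# Minimal split-and-merge sketch (illustrative, not the paper's algorithm).
import numpy as np

def fit_plane(block):
    """Least-squares fit of f(x, y) = a + b*x + c*y; returns (coeffs, SSE)."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([np.ones(h * w), xs.ravel(), ys.ravel()])
    coeffs, *_ = np.linalg.lstsq(A, block.ravel(), rcond=None)
    sse = float(np.sum((A @ coeffs - block.ravel()) ** 2))
    return coeffs, sse

def split(img, x, y, size, tol, leaves):
    """Recursively split a square while its least-squares error is too high."""
    _, sse = fit_plane(img[y:y + size, x:x + size])
    if sse <= tol or size <= 2:            # region accepted as a leaf
        leaves.append((x, y, size))
        return
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            split(img, x + dx, y + dy, half, tol, leaves)

def merge_pass(img, leaves, tol):
    """One merge pass: if all four quadtree siblings of a parent square are
    leaves, re-fit the parent and keep the merge when the error allows."""
    by_pos = {leaf: i for i, leaf in enumerate(leaves)}
    merged, used = [], set()
    for i, (x, y, s) in enumerate(leaves):
        if i in used:
            continue
        px, py = (x // (2 * s)) * 2 * s, (y // (2 * s)) * 2 * s  # parent origin
        sibs = [by_pos.get((px + dx, py + dy, s))
                for dx in (0, s) for dy in (0, s)]
        if all(j is not None and j not in used for j in sibs):
            _, sse = fit_plane(img[py:py + 2 * s, px:px + 2 * s])
            if sse <= tol:
                merged.append((px, py, 2 * s))
                used.update(sibs)
                continue
        merged.append((x, y, s))
        used.add(i)
    return merged

img = np.add.outer(np.arange(64.0), np.arange(64.0))  # smooth ramp test image
leaves = []
split(img, 0, 0, 64, tol=1.0, leaves=leaves)
leaves = merge_pass(img, leaves, tol=1.0)
print(len(leaves), "regions")
```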


Citations
Journal ArticleDOI
TL;DR: This work presents a new lattice-based perfect reconstruction and critically sampled anisotropic M-DIR WT, which provides an efficient tool for nonlinear approximation of images, achieving the approximation power O(N^-1.55), which, while slower than the optimal rate O(N^-2), is much better than the O(N^-1) achieved with wavelets, but at similar complexity.
Abstract: In spite of the success of the standard wavelet transform (WT) in image processing in recent years, the efficiency of its representation is limited by the spatial isotropy of its basis functions built in the horizontal and vertical directions. One-dimensional (1-D) discontinuities in images (edges and contours), which are very important elements in visual perception, intersect too many wavelet basis functions and lead to a nonsparse representation. To efficiently capture these anisotropic geometrical structures characterized by many more than the horizontal and vertical directions, a more complex multidirectional (M-DIR) and anisotropic transform is required. We present a new lattice-based perfect reconstruction and critically sampled anisotropic M-DIR WT. The transform retains the separable filtering and subsampling and the simplicity of computations and filter design from the standard two-dimensional WT, unlike some other directional transform constructions (e.g., curvelets, contourlets, or edgelets). The corresponding anisotropic basis functions (directionlets) have directional vanishing moments along any two directions with rational slopes. Furthermore, we show that this novel transform provides an efficient tool for nonlinear approximation of images, achieving the approximation power O(N^-1.55), which, while slower than the optimal rate O(N^-2), is much better than the O(N^-1) achieved with wavelets, but at similar complexity.

320 citations
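For a sense of what the quoted exponents mean, the sketch below evaluates the best N-term approximation error C * N^-alpha for each rate; the constant is set to 1, and only the exponents come from the abstract.

```python
# Illustration of the nonlinear approximation rates quoted above.
for N in (100, 1_000, 10_000):
    wavelets      = N ** -1.0    # standard 2-D wavelets
    directionlets = N ** -1.55   # the transform proposed in the paper
    optimal       = N ** -2.0    # optimal rate for the image class considered
    print(f"N={N:>6}: wavelets {wavelets:.1e}  "
          f"directionlets {directionlets:.1e}  optimal {optimal:.1e}")
```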

Journal ArticleDOI
TL;DR: High-quality variable-rate image compression is achieved by segmenting an image into regions of different sizes, classifying each region into one of several perceptually distinct categories, and using a distinct coding procedure for each category.
Abstract: High-quality variable-rate image compression is achieved by segmenting an image into regions of different sizes, classifying each region into one of several perceptually distinct categories, and using a distinct coding procedure for each category. Segmentation is performed with a quadtree data structure by isolating the perceptually more important areas of the image into small regions and separately identifying larger random-texture blocks. Since the important regions have been isolated, the remaining parts of the image can be coded at a lower rate than would otherwise be possible. High-quality coding results are achieved at rates between 0.35 and 0.7 b/p, depending on the nature of the original image, and satisfactory results have been obtained at 0.25 b/p.

253 citations
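A hedged sketch of the control flow described above: blocks are split while perceptually important detail remains, and each leaf is routed to a category-specific coder. The category names, variance thresholds, and per-category bit costs are illustrative stand-ins, not the paper's definitions.

```python
# Toy quadtree segmentation with per-category coding (illustrative only).
import numpy as np

def classify(block, flat_tol=25.0, texture_tol=400.0):
    """Toy perceptual classification by local variance."""
    v = float(block.var())
    if v < flat_tol:
        return "uniform"       # cheap to code: mean value suffices
    if v > texture_tol:
        return "texture"       # large random-texture block, coded coarsely
    return "detail"            # perceptually important, coded accurately

def segment(img, x, y, size, min_size, out):
    block = img[y:y + size, x:x + size]
    cat = classify(block)
    if cat == "detail" and size > min_size:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                segment(img, x + dx, y + dy, half, min_size, out)
    else:
        out.append((x, y, size, cat))

CODERS = {  # stub bit costs per pixel, standing in for real coders
    "uniform": lambda b: 0.05,
    "texture": lambda b: 0.30,
    "detail":  lambda b: 1.00,
}

rng = np.random.default_rng(0)
img = rng.normal(128, 20, (64, 64))
leaves = []
segment(img, 0, 0, 64, min_size=4, out=leaves)
bits = sum(CODERS[c](None) * s * s for _, _, s, c in leaves)
print(f"{len(leaves)} regions, {bits / img.size:.2f} b/p estimated")
```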

Patent
04 Oct 1988
TL;DR: In this article, a full motion color digital video signal is compressed, formatted for transmission, recorded on compact disc media and decoded at conventional video frame rates, where regions of a frame are individually analyzed to select optimum fill coding methods specific to each region.
Abstract: A full motion color digital video signal is compressed, formatted for transmission, recorded on compact disc media and decoded at conventional video frame rates. During compression, regions of a frame are individually analyzed to select optimum fill coding methods specific to each region. Region decoding time estimates are made to optimize compression thresholds. Region descriptive codes conveying the size and locations of the regions are grouped together in a first segment of a data stream. Region fill codes conveying pixel amplitude indications for the regions are grouped together according to fill code type and placed in other segments of the data stream. The data stream segments are individually variable length coded according to their respective statistical distributions and formatted to form data frames. The number of bytes per frame is dithered by the addition of auxiliary data determined by a reverse frame sequence analysis to provide an average number selected to minimize pauses of the compact disc during playback thereby avoiding unpredictable seek mode latency periods characteristic of compact discs. A decoder includes a variable length decoder responsive to statistical information in the code stream for separately variable length decoding individual segments of the data stream. Region location data is derived from region descriptive data and applied with region fill codes to a plurality of region specific decoders selected by detection of the fill code type (e.g., relative, absolute, dyad and DPCM) and decoded region pixels are stored in a bit map for subsequent display.

184 citations
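The stream layout lends itself to a short sketch: region descriptors go into a first segment, fill codes are grouped by fill-code type into further segments, and each segment would then be variable-length coded against its own statistics. The field values below are toy data, and empirical entropy stands in for the per-segment VLC.

```python
# Sketch of the patent's segmented data stream (toy values, entropy proxy).
import math
from collections import Counter

regions = [  # (size_code, location_code, fill_type, fill_codes)
    (3, 17, "relative", [1, 1, 2, 1]),
    (2,  5, "absolute", [200, 198, 201]),
    (3, 40, "dyad",     [7, 7, 6, 7, 7]),
    (1, 60, "DPCM",     [0, 1, -1, 0, 0, 1]),
]

segments = {"descriptors": []}
for size, loc, ftype, codes in regions:
    segments["descriptors"] += [size, loc]          # first segment of stream
    segments.setdefault(ftype, []).extend(codes)    # one segment per type

def entropy_bits(symbols):
    """Empirical entropy: lower bound for a VLC matched to this segment."""
    n, counts = len(symbols), Counter(symbols)
    return sum(-c * math.log2(c / n) for c in counts.values())

for name, syms in segments.items():
    print(f"{name:11s}: {len(syms):2d} symbols, ~{entropy_bits(syms):.1f} bits")
```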

Patent
05 Oct 1987
TL;DR: In this article, a full motion color digital video signal is compressed, formatted for transmission, recorded on compact disc media and decoded at conventional video frame rates, where regions of a frame are individually analyzed to select optimum fill coding methods specific to each region.
Abstract: A full motion color digital video signal is compressed, formatted for transmission, recorded on compact disc media and decoded at conventional video frame rates. During compression, regions of a frame are individually analyzed to select optimum fill coding methods specific to each region. Region decoding time estimates are made to optimize compression thresholds. Region descriptive codes conveying the size and locations of the regions are grouped together in a first segment of a data stream. Region fill codes conveying pixel amplitude indications for the regions are grouped together according to fill code type and placed in other segments of the data stream. The data stream segments are individually variable length coded according to their respective statistical distributions and formatted to form data frames. The number of bytes per frame is dithered by the addition of auxiliary data determined by a reverse frame sequence analysis to provide an average number selected to minimize pauses of the compact disc during playback thereby avoiding unpredictable seek mode latency periods characteristic of compact discs. A decoder includes a variable length decoder responsive to statistical information in the code stream for separately variable length decoding individual segments of the data stream. Region location data is derived from region descriptive data and applied with region fill codes to a plurality of region specific decoders selected by detection of the fill code type (e.g., relative, absolute, dyad and DPCM) and decoded region pixels are stored in a bit map for subsequent display.

165 citations

Journal ArticleDOI
TL;DR: Novel coding algorithms based on tree-structured segmentation achieve the correct asymptotic rate-distortion (R-D) behavior for a simple class of signals, known as piecewise polynomials, by using an R-D based prune and join scheme.
Abstract: This paper presents novel coding algorithms based on tree-structured segmentation, which achieve the correct asymptotic rate-distortion (R-D) behavior for a simple class of signals, known as piecewise polynomials, by using an R-D based prune and join scheme. For the one-dimensional case, our scheme is based on binary-tree segmentation of the signal. This scheme approximates the signal segments using polynomial models and utilizes an R-D optimal bit allocation strategy among the different signal segments. The scheme further encodes similar neighbors jointly to achieve the correct exponentially decaying R-D behavior, D(R) ~ c0 * 2^(-c1*R), thus improving over classic wavelet schemes. We also prove that the computational complexity of the scheme is O(N log N). We then show the extension of this scheme to the two-dimensional case using a quadtree. This quadtree-coding scheme also achieves an exponentially decaying R-D behavior, for the polygonal image model composed of a white polygon-shaped object against a uniform black background, with low computational cost of O(N log N). Again, the key is an R-D optimized prune and join strategy. Finally, we conclude with numerical results, which show that the proposed quadtree-coding scheme outperforms JPEG2000 by about 1 dB for real images, like cameraman, at low rates of around 0.15 bpp.

163 citations
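The prune step can be sketched compactly: a node keeps its two children only when their total Lagrangian cost D + lambda*R beats fitting the whole segment with one polynomial. This is an illustrative reduction, assuming a fixed bit cost per leaf and omitting the join step and the bit allocation the paper actually uses.

```python
# R-D pruned binary-tree fit of a 1-D piecewise polynomial (illustrative).
import numpy as np

DEG, LEAF_BITS = 2, 32 * (2 + 1)   # polynomial degree; crude rate per leaf

def cost(seg, lam):
    """Lagrangian cost of coding one segment with a single polynomial."""
    x = np.arange(len(seg))
    p = np.polyfit(x, seg, DEG)
    d = float(np.sum((np.polyval(p, x) - seg) ** 2))
    return d + lam * LEAF_BITS

def prune(sig, lo, hi, lam):
    """Return (cost, leaf segments) of the R-D optimal pruned binary tree."""
    parent = cost(sig[lo:hi], lam)
    if hi - lo <= 8:                       # minimum segment length
        return parent, [(lo, hi)]
    mid = (lo + hi) // 2
    cl, ll = prune(sig, lo, mid, lam)
    cr, lr = prune(sig, mid, hi, lam)
    if cl + cr < parent:                   # children win: keep the split
        return cl + cr, ll + lr
    return parent, [(lo, hi)]              # parent wins: prune the subtree

x = np.arange(256.0)
sig = np.where(x < 100, 0.01 * x ** 2, 3.0 - 0.02 * x)  # piecewise polynomial
_, leaves = prune(sig, 0, len(sig), lam=0.001)
print("segments:", leaves)
```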

References
Journal ArticleDOI
TL;DR: The theory of edge detection explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇²G filters acting on the image forms the basis for a physiological model of simple cells.
Abstract: A theory of edge detection is presented. The analysis proceeds in two parts. (1) Intensity changes, which occur in a natural image over a wide range of scales, are detected separately at different scales. An appropriate filter for this purpose at a given scale is found to be the second derivative of a Gaussian, and it is shown that, provided some simple conditions are satisfied, these primary filters need not be orientation-dependent. Thus, intensity changes at a given scale are best detected by finding the zero values of ∇²G(x,y) ∗ I(x,y) for image I, where G(x,y) is a two-dimensional Gaussian distribution and ∇² is the Laplacian. The intensity changes thus discovered in each of the channels are then represented by oriented primitives called zero-crossing segments, and evidence is given that this representation is complete. (2) Intensity changes in images arise from surface discontinuities or from reflectance or illumination boundaries, and these all have the property that they are spatially localized. Because of this, the zero-crossing segments from the different channels are not independent, and rules are deduced for combining them into a description of the image. This description is called the raw primal sketch. The theory explains several basic psychophysical findings, and the operation of forming oriented zero-crossing segments from the output of centre-surround ∇²G filters acting on the image forms the basis for a physiological model of simple cells (see Marr & Ullman 1979).

6,893 citations
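A minimal sketch of the operation described above, using SciPy's Laplacian-of-Gaussian filter: filter the image with ∇²G at scale sigma and mark sign changes of the result. The neighbour-based zero-crossing test is one common illustrative choice, not the paper's exact procedure.

```python
# Zero crossings of a Laplacian-of-Gaussian filtered image (sketch).
import numpy as np
from scipy.ndimage import gaussian_laplace

def zero_crossings(img, sigma):
    """Zero crossings of (LoG * I) at scale sigma, as a boolean mask."""
    log = gaussian_laplace(img.astype(float), sigma)
    zc = np.zeros_like(log, dtype=bool)
    # a pixel is marked when the filtered value changes sign against
    # its right or lower neighbour
    zc[:, :-1] |= np.signbit(log[:, :-1]) != np.signbit(log[:, 1:])
    zc[:-1, :] |= np.signbit(log[:-1, :]) != np.signbit(log[1:, :])
    return zc

img = np.zeros((64, 64))
img[:, 32:] = 255.0                      # vertical step edge
edges = zero_crossings(img, sigma=2.0)
print("edge pixels:", int(edges.sum()))
```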

Book
13 Dec 1977

1,119 citations

Journal ArticleDOI
01 Apr 1985
TL;DR: A new class of coding methods, called second generation, is capable of achieving compression ratios as high as 70:1; two groups can be formed in this class: methods using local operators and combining their output in a suitable way, and methods using contour-texture descriptions.
Abstract: The digital representation of an image requires a very large number of bits. The goal of image coding is to reduce this number as much as possible and to reconstruct a faithful duplicate of the original picture. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio, starting at 1 with the first digital picture in the early 1960s, reached a saturation level around 10:1 a couple of years ago. This certainly does not mean that the upper bound given by the entropy of the source has also been reached. First, this entropy is not known and depends heavily on the model used for the source, i.e., the digital image. Second, information theory does not take into account what the human eye sees and how it sees. Recent progress in the study of the brain mechanism of vision has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway, combined with the separate processing of contours and textures, has led to a new class of coding methods capable of achieving compression ratios as high as 70:1. Image quality, of course, remains an important problem to be investigated. This class of methods, which we call second generation, is the subject of this paper. Two groups can be formed in this class: methods using local operators and combining their output in a suitable way, and methods using contour-texture descriptions. Four methods, two in each group, are described in detail. They are applied to the same set of original pictures to allow a fair comparison of the quality of the decoded pictures. If more effort is devoted to this subject, a compression ratio of 100:1 is within reach.

753 citations

Journal ArticleDOI
TL;DR: This paper explores a number of hierarchical image representations as applied to binary images, of which quadtrees are a single exemplar, and discusses quadtrees, binary trees, and an adaptive hierarchical method.
Abstract: Quadtrees are a compact hierarchical method of representation of images. In this paper, we explore a number of hierarchical image representations as applied to binary images, of which quadtrees are a single exemplar. We discuss quadtrees, binary trees, and an adaptive hierarchical method. Extending these methods into the third dimension of time results in several other methods. All of these methods are discussed in terms of time complexity, worst case and average compression of random images, and compression results on binary images derived from natural scenes. The results indicate that quadtrees are the most effective for two-dimensional images, but the adaptive algorithms are more effective for dynamic image sequences.

67 citations
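A small sketch of the quadtree representation being compared: a uniform square becomes a single leaf, and a mixed square splits into four quadrants. The nested-tuple encoding and node count are illustrative choices.

```python
# Quadtree of a binary image (illustrative encoding).
import numpy as np

def quadtree(img):
    """Return a nested-tuple quadtree: 0/1 for uniform leaves."""
    if img.min() == img.max():
        return int(img.flat[0])            # uniform leaf
    h, w = img.shape
    return tuple(quadtree(q) for q in      # NW, NE, SW, SE quadrants
                 (img[:h // 2, :w // 2], img[:h // 2, w // 2:],
                  img[h // 2:, :w // 2], img[h // 2:, w // 2:]))

def count_nodes(t):
    return 1 if isinstance(t, int) else 1 + sum(count_nodes(c) for c in t)

img = np.zeros((32, 32), dtype=int)
img[8:24, 8:24] = 1                        # white square on black background
tree = quadtree(img)
print("nodes:", count_nodes(tree), "vs", img.size, "pixels")
```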

DOI
01 Jan 1983
Thèse, École polytechnique fédérale de Lausanne (EPFL), no. 476 (1983). Institut de génie électrique et électronique, Laboratoire de traitement des signaux. doi: 10.5075/epfl-thesis-476

8 citations