
Showing papers in "Signal Processing-image Communication in 2000"


Journal ArticleDOI
TL;DR: Experimental results on a database of about 6,000 images, covering exact matching under various transformations and similarity-based retrieval, show that the proposed shape descriptor is very effective in representing shapes.
Abstract: In order to retrieve an image from a large image database, the descriptor should be invariant to scale and rotation. It must also have enough discriminating power and immunity to noise for retrieval from a large image database. The Zernike moment descriptor has many desirable properties such as rotation invariance, robustness to noise, expression efficiency, fast computation and multi-level representation for describing the shapes of patterns. In this paper, we show that the Zernike moment can be used as an effective descriptor of the global shape of an image in a large image database. Experimental results on a database of about 6,000 images, covering exact matching under various transformations and similarity-based retrieval, show that the proposed shape descriptor is very effective in representing shapes.
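As a rough illustration of this kind of descriptor (a sketch only, not the authors' implementation; the centroid/radius normalisation and the area normalisation are assumed choices), the following computes rotation-invariant Zernike moment magnitudes of a binary shape with NumPy:

```python
# Sketch: rotation-invariant Zernike moment magnitudes of a binary shape (NumPy only).
import numpy as np
from math import factorial

def radial_poly(rho, n, m):
    """Zernike radial polynomial R_nm(rho), for |m| <= n and n - |m| even."""
    m = abs(m)
    R = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + m) // 2 - s) * factorial((n - m) // 2 - s)))
        R += c * rho ** (n - 2 * s)
    return R

def zernike_magnitudes(shape_img, max_order=8):
    """Return |Z_nm| for all valid (n, m) with n <= max_order."""
    ys, xs = np.nonzero(shape_img)
    cy, cx = ys.mean(), xs.mean()                            # centroid -> translation invariance
    r = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    rho, theta = r / r.max(), np.arctan2(ys - cy, xs - cx)   # map onto unit disk -> scale invariance
    feats = []
    for n in range(max_order + 1):
        for m in range(n + 1):
            if (n - m) % 2:
                continue
            V = radial_poly(rho, n, m) * np.exp(-1j * m * theta)
            Z = (n + 1) / np.pi * V.sum() / len(xs)          # area-normalised (assumed choice)
            feats.append(abs(Z))                             # magnitude -> rotation invariance
    return np.array(feats)
```

Because only the magnitudes |Z_nm| are kept, rotating the shape leaves the feature vector unchanged, which is the property the retrieval application relies on.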

390 citations


Journal ArticleDOI
TL;DR: An efficient way to represent the coarse shape, scale and composition properties of an object is described, which is invariant to resolution, translation and rotation, and may be used for both two- and three-dimensional objects.
Abstract: The description of the spatial characteristics of two- and three-dimensional objects, in the framework of MPEG-7, is considered. The shape of an object is one of its fundamental properties, and this paper describes an efficient way to represent the coarse shape, scale and composition properties of an object. This representation is invariant to resolution, translation and rotation, and may be used for both two-dimensional (2-D) and three-dimensional (3-D) objects. This coarse shape descriptor will be included in the eXperimentation Model (XM) of MPEG-7. Applications of such a description to searching object databases, in particular the CAESAR anthropometric database, are discussed.

338 citations


Journal ArticleDOI
TL;DR: An overview of some of the synthetic visual objects supported by MPEG-4 version-1, namely animated faces and animated arbitrary 2D uniform and Delaunay meshes and integration of the face animation tool with the text-to-speech interface (TTSI), so that face animation can be driven by text input.
Abstract: This paper presents an overview of some of the synthetic visual objects supported by MPEG-4 version-1, namely animated faces and animated arbitrary 2D uniform and Delaunay meshes. We discuss both specification and compression of face animation and 2D-mesh animation in MPEG-4. Face animation makes it possible to animate either a proprietary face model or a face model downloaded to the decoder. We also address integration of the face animation tool with the text-to-speech interface (TTSI), so that face animation can be driven by text input.

224 citations


Journal ArticleDOI
TL;DR: A texture descriptor based on a multiresolution decomposition using Gabor wavelets is proposed that is quite robust to illumination variations and compares favorably with other texture descriptors for similarity retrieval.
Abstract: Image texture is useful in image browsing, search and retrieval. A texture descriptor based on a multiresolution decomposition using Gabor wavelets is proposed. The descriptor consists of two parts: a perceptual browsing component (PBC) and a similarity retrieval component (SRC). The extraction methods of both PBC and SRC are based on a multiresolution decomposition using Gabor wavelets. PBC provides a quantitative characterization of the texture’s structuredness and directionality for browsing application, and the SRC characterizes the distribution of texture energy in different subbands, and supports similarity retrieval. This representation is quite robust to illumination variations and compares favorably with other texture descriptors for similarity retrieval. Experimental results are provided. © 2000 Elsevier Science B.V. All rights reserved.
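A minimal sketch of a Gabor-wavelet texture feature in the same spirit (assuming scikit-image and SciPy; the filter-bank frequencies and orientation count are illustrative, not those of the proposed descriptor):

```python
# Sketch: texture features from the mean and standard deviation of Gabor subband energies.
import numpy as np
from scipy.signal import fftconvolve
from skimage.filters import gabor_kernel

def gabor_features(gray, frequencies=(0.1, 0.2, 0.3, 0.4), n_orientations=6):
    gray = gray.astype(float)
    feats = []
    for f in frequencies:
        for k in range(n_orientations):
            kernel = gabor_kernel(frequency=f, theta=k * np.pi / n_orientations)
            response = fftconvolve(gray, kernel, mode='same')   # complex subband response
            magnitude = np.abs(response)
            feats.extend([magnitude.mean(), magnitude.std()])   # energy statistics per subband
    return np.array(feats)
```

Similarity retrieval can then rank images by a (possibly normalised) L1 or L2 distance between such feature vectors.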

222 citations


Journal ArticleDOI
TL;DR: A region-based shape descriptor invariant to rotation, scale and translation is presented and experimental results conforming to the MPEG-7 shape descriptor core experiments are presented.
Abstract: A region-based shape descriptor invariant to rotation, scale and translation is presented in this paper. For a given binary shape, positions of pixels belonging to the shape are regarded as observed vectors of a 2-D random vector and two eigenvectors are obtained from the covariance matrix of the vector population. The shape is divided into four sub-regions by two principal axes corresponding to the two eigenvectors at the center of mass of the shape. Each sub-region is subdivided into four sub-regions in the same way. The sub-division process is repeated for a predetermined number of times. A quadtree representation with its nodes corresponding to regions of the shape is derived from the above process. Four parameters invariant to translation, rotation and scale are calculated for the corresponding region of each node while two parameters are extracted for the root node. The shape descriptor is represented as a vector of all the parameters and the similarity distance between two shapes is calculated by summing up the absolute differences of each element of descriptor vectors. Experimental results conforming to the MPEG-7 shape descriptor core experiments are presented.
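A hypothetical sketch of the principal-axis subdivision described above; the per-node features used here (relative area and a normalised spread) are placeholders for the paper's invariant parameters, which are not reproduced:

```python
# Hypothetical sketch: recursive subdivision of a binary shape along its principal axes.
import numpy as np

def _describe(points, total, depth):
    if depth == 0:
        return []
    if len(points) == 0:
        # keep the descriptor length fixed by padding empty sub-trees with zeros
        n_nodes = (4 ** depth - 1) // 3
        return [0.0] * (2 * n_nodes)
    centred = points - points.mean(axis=0)
    cov = np.cov(centred.T) if len(points) > 1 else np.eye(2)
    _, vecs = np.linalg.eigh(cov)            # columns: the two principal axes
    proj = centred @ vecs                    # coordinates in the principal-axis frame
    feats = [len(points) / total,            # relative area (translation/rotation/scale invariant)
             np.sqrt(proj.var(axis=0)).sum() / np.sqrt(total)]  # placeholder spread measure
    for sx in (proj[:, 0] >= 0, proj[:, 0] < 0):
        for sy in (proj[:, 1] >= 0, proj[:, 1] < 0):
            feats += _describe(points[sx & sy], total, depth - 1)
    return feats

def quadtree_shape_descriptor(binary_img, depth=3):
    pts = np.argwhere(binary_img > 0).astype(float)
    return np.array(_describe(pts, len(pts), depth))
```

Two shapes can then be compared by summing the absolute differences of their descriptor vectors, as the abstract describes.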

162 citations


Journal ArticleDOI
TL;DR: The MPEG-4 visual standard was developed to provide users with a new level of interaction with visual content: it provides technologies to view, access and manipulate objects rather than pixels, with great error robustness over a large range of bit-rates.
Abstract: This paper describes the MPEG-4 standard, as defined in ISO/IEC 14496-2. The MPEG-4 visual standard was developed to provide users with a new level of interaction with visual content. It provides technologies to view, access and manipulate objects rather than pixels, with great error robustness over a large range of bit-rates. Application areas range from digital television and streaming video to mobile multimedia and games. The MPEG-4 natural video standard consists of a collection of tools that support these application areas. The standard provides tools for shape coding, motion estimation and compensation, texture coding, error resilience, sprite coding and scalability. Conformance points in the form of object types, profiles and levels provide the basis for interoperability. Shape coding can be performed in binary mode, where the shape of each object is described by a binary mask, or in gray-scale mode, where the shape is described in a form similar to an alpha channel, allowing transparency and reducing aliasing. Motion compensation is block based, with appropriate modifications for object boundaries. The block size can be 16×16 or 8×8, with half-pixel resolution. MPEG-4 also provides a mode for overlapped motion compensation. Texture coding is based on the 8×8 DCT, with appropriate modifications for object boundary blocks. Coefficient prediction is possible to improve coding efficiency. Static textures can be encoded using a wavelet transform. Error resilience is provided by resynchronization markers, data partitioning, header extension codes, and reversible variable length codes. Scalability is provided for both spatial and temporal resolution enhancement. MPEG-4 provides scalability on an object basis, with the restriction that the object shape has to be rectangular. MPEG-4 conformance points are defined at the Simple Profile, the Core Profile, and the Main Profile. Simple Profile and Core Profile address typical scene sizes of QCIF and CIF, with bit-rates of 64, 128 and 384 kbit/s and 2 Mbit/s. Main Profile addresses typical scene sizes of CIF, ITU-R 601 and HD, with bit-rates of 2, 15 and 38.4 Mbit/s.

141 citations


Journal ArticleDOI
TL;DR: This paper provides an introduction to the use and internal mechanisms of these functions of the new MPEG-4 standard: streaming multimedia content, good compression, and user interactivity.
Abstract: The new MPEG-4 standard provides a suite of functionalities under one standard: streaming multimedia content, good compression, and user interactivity. This paper provides an introduction to the use and internal mechanisms of these functions.

77 citations


Journal ArticleDOI
TL;DR: The natural coding part within MPEG-4 audio describes traditional type speech and high-quality audio coding algorithms and their combination to enable new functionalities like scalability across the boundaries of coding algorithms.
Abstract: MPEG-4 audio represents a new kind of audio coding standard. Unlike its predecessors, MPEG-1 and MPEG-2 high-quality audio coding, and unlike the speech coding standards which have been completed by the ITU-T, it describes not a single or small set of highly efficient compression schemes but a complete toolbox to do everything from low bit-rate speech coding to high-quality audio coding or music synthesis. The natural coding part within MPEG-4 audio describes traditional type speech and high-quality audio coding algorithms and their combination to enable new functionalities like scalability (hierarchical coding) across the boundaries of coding algorithms. This paper gives an overview of the basic algorithms and how they can be combined.

62 citations


Journal ArticleDOI
TL;DR: This paper demonstrates that the fractal image coding algorithm is compatible with other image coding methods and proposes a new mapping in the image space called partial fractal mapping, which provides much flexibility for real implementations.
Abstract: The fractal image compression technique models a natural image using a contractive mapping called fractal mapping in the image space. In this paper, we demonstrate that the fractal image coding algorithm is compatible with other image coding methods. In other words, we can encode only part of the image using fractal technique and model the remaining part using other algorithms. According to such an idea, a new mapping in the image space called partial fractal mapping is proposed. Furthermore, a general framework of fractal-based hybrid image coding encoding/decoding systems is presented. The framework provides us with much flexibility for real implementations. Many different hybrid image coding schemes can be derived from it. Finally, a new hybrid image coding scheme is proposed where non-fractal coded regions are used to help the encoding of fractal coded regions. Experiments show that the proposed system performs better than the quadtree-based fractal image coding algorithm and the JPEG image compression standard at high compression ratios larger than 30.
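For readers unfamiliar with the baseline, the sketch below shows a minimal fixed-block fractal coder, i.e. the contractive-mapping search that the paper builds on; it is not the proposed partial fractal mapping or hybrid scheme, and the block sizes and search stride are arbitrary choices:

```python
# Sketch: minimal fixed-block fractal coder (baseline only, not the paper's hybrid scheme).
import numpy as np

def fractal_encode(img, rsize=8, stride=8):
    """For every range block, find the best domain block and affine map s*D + o."""
    img = img.astype(float)
    H, W = img.shape
    # Domain pool: 2x-larger blocks averaged down to the range-block size.
    domains = []
    for y in range(0, H - 2 * rsize + 1, stride):
        for x in range(0, W - 2 * rsize + 1, stride):
            d = img[y:y + 2 * rsize, x:x + 2 * rsize]
            d = d.reshape(rsize, 2, rsize, 2).mean(axis=(1, 3))   # 2x2 averaging
            domains.append(((y, x), d - d.mean(), d.mean()))
    code = []
    for y in range(0, H - rsize + 1, rsize):
        for x in range(0, W - rsize + 1, rsize):
            r = img[y:y + rsize, x:x + rsize]
            rc, rmean = r - r.mean(), r.mean()
            best = None
            for pos, dc, dmean in domains:
                s = np.clip((rc * dc).sum() / ((dc * dc).sum() + 1e-12), -1.0, 1.0)
                err = ((rc - s * dc) ** 2).sum()
                if best is None or err < best[0]:
                    best = (err, pos, s, rmean - s * dmean)       # offset o, so R ~= s*D + o
            code.append(((y, x),) + best[1:])
    return code
```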

54 citations


Journal ArticleDOI
TL;DR: A new algorithm for image compression, named predictive vector quantization (PVQ), is developed based on competitive neural networks and optimal linear predictors, and the performance of the algorithm is discussed.
Abstract: In this paper a new algorithm for image compression, named predictive vector quantization (PVQ), is developed based on competitive neural networks and optimal linear predictors. The semi-closed-loop PVQ methodology is studied. The experimental results are presented and the performance of the algorithm is discussed.
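A rough sketch of the predictive-VQ idea (plain k-means stands in for the competitive neural network, and the fixed (left+up)/2 predictor is an assumption, not the paper's optimal linear predictor):

```python
# Sketch: predictive VQ with a fixed (left+up)/2 predictor and a k-means residual codebook.
import numpy as np

def train_codebook(residual_vectors, k=64, iters=20, seed=0):
    """Plain k-means as a stand-in for competitive-learning codebook design."""
    v = np.asarray(residual_vectors, float)
    rng = np.random.default_rng(seed)
    codebook = v[rng.choice(len(v), k, replace=False)].copy()
    for _ in range(iters):
        labels = ((v[:, None, :] - codebook[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = v[labels == j].mean(0)
    return codebook

def pvq_encode(img, codebook, bs=2):
    """Closed-loop encoding: predict each block from already-reconstructed neighbours."""
    img = img.astype(float)
    H, W = img.shape
    indices, recon = [], np.zeros_like(img)
    for y in range(0, H - bs + 1, bs):
        for x in range(0, W - bs + 1, bs):
            left = recon[y:y + bs, x - bs:x].mean() if x >= bs else 0.0
            up = recon[y - bs:y, x:x + bs].mean() if y >= bs else 0.0
            pred = 0.5 * (left + up)
            res = (img[y:y + bs, x:x + bs] - pred).ravel()
            idx = int(((codebook - res) ** 2).sum(1).argmin())
            indices.append(idx)
            recon[y:y + bs, x:x + bs] = pred + codebook[idx].reshape(bs, bs)
    return indices, recon
```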

49 citations


Journal ArticleDOI
TL;DR: The recently introduced incomplete 3D technique first extracts the texture of the visible surface of a video object acquired with multiple cameras and then performs disparity-compensated projection from that surface onto a view plane.
Abstract: Multi-viewpoint synthesis of video data is a key technology for the integration of video and 3D graphics, as necessary for telepresence and augmented-reality applications. This paper describes a number of important techniques which can be employed to accomplish that goal. The techniques presented are based on the analysis of 2D images acquired by two or more cameras. To determine depth information of single objects present in the scene, it is necessary to perform segmentation and disparity estimation. It is shown how these analysis tools can benefit from each other. For viewpoint synthesis, techniques with different levels of tradeoff between complexity and degrees of freedom are presented. The first approach is disparity-controlled view interpolation, which is capable of generating intermediate views along the interocular axis between two adjacent cameras. The second is the recently introduced incomplete 3D technique, which in a first step extracts the texture of the visible surface of a video object acquired with multiple cameras, and then performs disparity-compensated projection from the surface onto a view plane. In the third and most complex approach, a 3D model of the object is generated, which can be represented by a 3D wire grid. For synthesis, this model can be rotated to arbitrary orientations, and original texture is mapped onto the surface to obtain an arbitrary view of the processed object. The result of this rendering procedure is a virtual image with very natural appearance.

Journal ArticleDOI
Limin Wang
TL;DR: A new rate control approach which addresses the problems associated with degradation in picture quality at scene cuts and nonuniform picture quality due to buffer-dependent variations of the quantization parameter is presented.
Abstract: ISO/IEC MPEG-2 Test Model 5 (TM5) describes a rate control method which consists of three steps: bit allocation, rate control and modulation (ISO/MPEG II, Test Model 5, April 1993). There are, however, two problems associated with TM5 rate control: degradation in picture quality at scene cuts and nonuniform picture quality due to buffer-dependent variations of the quantization parameter. This paper presents a new rate control approach which addresses these issues. To eliminate the impact of scene cuts on the picture quality, the first scheduled P picture in a new scene is coded as an I picture, and the extra I picture is further balanced by coding the next scheduled I picture as a P picture. To achieve a relatively uniform picture quality, the same global quantization parameter is applied to all the macroblocks in a picture. The global quantization parameter is determined by using either an iterative or a binary search algorithm. The simulation results demonstrate that a significant improvement in performance is obtained using the proposed rate control approach.
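The binary search for a single global quantization parameter can be illustrated as follows; the hyperbolic rate model (bits roughly proportional to X/Q) used here is an assumption for the sketch, not the model of the paper:

```python
# Sketch: binary search for one global quantization parameter per picture.
def pick_global_q(bit_budget, mb_complexities, q_min=1, q_max=31, tol=0.02):
    """Smallest quantizer (no coarser than q_max) whose predicted picture bits fit the budget."""
    def predicted_bits(q):
        # Assumed hyperbolic rate model: bits for a macroblock of complexity X ~ X / q.
        return sum(x / q for x in mb_complexities)

    lo, hi = float(q_min), float(q_max)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predicted_bits(mid) > bit_budget:
            lo = mid            # too many bits -> quantize more coarsely
        else:
            hi = mid            # fits -> try a finer quantizer
    return round(hi)

# Example: 396 macroblocks (one CIF picture), equal complexity, 150 kbit picture budget.
q = pick_global_q(150_000, [8_000.0] * 396)   # -> about 21
```

Because the same quantizer is applied to every macroblock, the picture quality stays uniform across the frame, which is the goal stated above.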

Journal ArticleDOI
TL;DR: An overview of Part 1 of ISO/IEC 14496 (MPEG-4 systems) is given, starting from the general architecture up to the description of the individual MPEG-4 Systems tools.
Abstract: This paper gives an overview of Part 1 of ISO/IEC 14496 (MPEG-4 Systems). It first presents the objectives of the MPEG-4 activity. In the MPEG-1 and MPEG-2 standards, “Systems” referred only to overall architecture, multiplexing, and synchronization. In MPEG-4, in addition to these issues, the Systems part encompasses scene description, interactivity, content description, and programmability. The description of the MPEG-4 specification follows, starting from the general architecture up to the description of the individual MPEG-4 Systems tools. Finally, a conclusion describes the future extensions of the specification, as well as a comparison between the solutions provided by MPEG-4 Systems and some alternative technologies.

Journal ArticleDOI
TL;DR: This paper describes description schemes (DSs) for image, video, multimedia, home media, and archive content proposed to the MPEG-7 standard and demonstrates the feasibility and the efficiency of the description schemes by presenting applications that already use the proposed structures or will greatly benefit from their use.
Abstract: In this paper, we describe description schemes (DSs) for image, video, multimedia, home media, and archive content proposed to the MPEG-7 standard. MPEG-7 aims to create a multimedia content description standard in order to facilitate various multimedia searching and filtering applications. During the design process, special care was taken to provide simple but powerful structures that represent generic multimedia data. We use the extensible markup language (XML) to illustrate and exemplify the proposed DSs because of its interoperability and flexibility advantages. The main components of the image, video, and multimedia description schemes are object, feature classification, object hierarchy, entity-relation graph, code downloading, multi-abstraction levels, and modality transcoding. The home media description instantiates the former DSs proposing the 6-W semantic features for objects, and 1-P physical and 6-W semantic object hierarchies. The archive description scheme aims to describe collections of multimedia documents, whereas the former DSs only aim at individual multimedia documents. In the archive description scheme, the content of an archive is represented using multiple hierarchies of clusters, which may be related by entity-relation graphs. The hierarchy is a specific case of entity-relation graph using a containment relation. We explicitly include the hierarchy structure in our DSs because it is a natural way of defining composite objects, a more efficient structure for retrieval, and the representation structure used in MPEG-4. We demonstrate the feasibility and the efficiency of our description schemes by presenting applications that already use the proposed structures or will greatly benefit from their use. These applications are the visual apprentice, the AMOS-search system, a multimedia broadcast news browser, a storytelling system, and an image meta-search engine, MetaSEEk.

Journal ArticleDOI
TL;DR: An algorithm based on Kalman filtering is suggested here to dynamically estimate the background reference image and faces the severe problems of parameter tuning and modeling approximations.
Abstract: A change detection scheme used to detect objects in a complex real-life scene must be able to deal with illumination changes, shadows and structural variations of the environment. Several approaches are based on subtracting a reference image, representing the background, from the current input image. The most used methods estimate the background image by applying some low-pass filter on the input image sequence. Many of them require an accurate calibration phase and rely on a careful selection of critical parameters. An algorithm based on Kalman filtering is suggested here to dynamically estimate the background reference image. The approach extends former works and faces the severe problems of parameter tuning and modeling approximations. An experimental analysis on the behavior of the proposed algorithm in presence of different illumination changes is performed using noisy synthetic data. The results are used to address the choice of values for the filter parameters. The effectiveness and robustness of the algorithm are evaluated on several tests that were carried out on real-life sequences.
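A simplified per-pixel recursive update in the spirit of this approach (the innovation-dependent gain scheduling and the thresholds are illustrative assumptions, not the paper's tuned filter):

```python
# Sketch: per-pixel recursive background update B_t = B_{t-1} + K * (I_t - B_{t-1}).
import numpy as np

def update_background(background, frame, fg_threshold=25.0, gain_bg=0.1, gain_fg=0.01):
    frame = frame.astype(float)
    innovation = frame - background
    foreground = np.abs(innovation) > fg_threshold
    # Small gain where the innovation is large (likely a moving object), so that
    # foreground pixels leak into the background estimate only slowly.
    gain = np.where(foreground, gain_fg, gain_bg)
    return background + gain * innovation, foreground

# Typical use: bg = first_frame.astype(float), then per frame: bg, mask = update_background(bg, frame)
```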

Journal ArticleDOI
TL;DR: The paper describes the elementary stream management (ESM) facilities provided by MPEG-4 Systems and describes the synchronization functionality as well as the system decoder model that defines the timing behavior and buffer resource management of MPEG- 4 receivers.
Abstract: We describe the elementary stream management (ESM) facilities provided by MPEG-4 Systems. Within the extensive set of tools defined by MPEG-4, the ESM tools play a critical role in joining several building blocks together. ESM provides a dual to the scene description language (BIFS) in that it links the streaming resources of a presentation to the scene. We also describe the synchronization functionality as well as the system decoder model that defines the timing behavior and buffer resource management of MPEG-4 receivers. The paper concludes with considerations on data packaging in underlying delivery layer protocols and a description of the MPEG-4 content access procedure.

Journal ArticleDOI
TL;DR: This paper intends to give an overview of the MPEG-4 motivations, objectives, achievements, process and workplan, providing a stimulating starting point for more detailed reading.
Abstract: The MPEG-4 Version 1 standard has recently been finalized. Since MPEG-4 adopted an object-based audiovisual representation model with hyperlinking and interaction capabilities and supports both natural and synthetic content, it is expected that this standard will become the information coding playground for future multimedia applications. This paper intends to give an overview of the MPEG-4 motivations, objectives, achievements, process and workplan, providing a stimulating starting point for more detailed reading.

Journal ArticleDOI
TL;DR: Empirical descriptors for basic visual information features are developed so that invariance against common transformations of visual material is achieved and so that they fit human perception properties.
Abstract: This paper reports about descriptors for basic visual information features, which have been developed in the context of the forthcoming MPEG-7 standard. The four basic features supported are color, texture, shape and motion. A search engine system has been developed which supports combinations of basic feature descriptors in a low-level description scheme for similarity-based retrieval of visual (image and video) data. All basic descriptors have been developed so that invariance against common transformations of visual material, e.g. filtering, contrast/color manipulation, resizing, etc., is achieved, and that they are fitted to human perception properties. Furthermore, descriptors have been designed to allow fast, coarse-to-fine search procedures. Elements described in this contribution have been proposed for the MPEG-7 standard. They are currently either included in the MPEG-7 Experimentation Model (XM) or investigated within core experiments, which are performed during the standard's development.

Journal ArticleDOI
TL;DR: With this algorithm, a compression ratio higher than that of the Lossless JPEG method for 512×512 images can be obtained and the newly proposed algorithm provides a good means for lossless image compression.
Abstract: A novel lossless image-compression scheme is proposed in this paper. A two-stage structure is embedded in this scheme. A linear predictor is used to decorrelate the raw image data in the first stage. Then in the second stage, an effective scheme based on the Huffman coding method is developed to encode the residual image. This newly proposed scheme could reduce the cost for the Huffman coding table while achieving high compression ratio. With this algorithm, a compression ratio higher than that of the Lossless JPEG method for 512×512 images can be obtained. In other words, the newly proposed algorithm provides a good means for lossless image compression.
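The two-stage structure can be sketched as follows (the (left+up)/2 predictor and the omission of the code-table cost are simplifying assumptions, not the paper's design):

```python
# Sketch: (left+up)/2 prediction followed by Huffman coding of the residual symbols.
import heapq
from collections import Counter
import numpy as np

def residuals(img):
    img = img.astype(int)
    pred = np.zeros_like(img)
    pred[1:, 1:] = (img[1:, :-1] + img[:-1, 1:]) // 2   # (left + up) / 2
    pred[0, 1:] = img[0, :-1]                            # first row: left neighbour
    pred[1:, 0] = img[:-1, 0]                            # first column: upper neighbour
    return (img - pred).ravel()

def huffman_code_lengths(symbols):
    """Code length per symbol from a Huffman tree built with a heap."""
    heap = [(freq, i, [sym]) for i, (sym, freq) in enumerate(Counter(symbols).items())]
    heapq.heapify(heap)
    lengths = {syms[0]: 0 for _, _, syms in heap}
    if len(heap) == 1:
        return {next(iter(lengths)): 1}
    tie = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1                              # every merge deepens the subtree by one
        heapq.heappush(heap, (f1 + f2, tie, s1 + s2))
        tie += 1
    return lengths

def estimated_bits(img):
    """Coded size of the residual image (the cost of the code table itself is ignored here)."""
    res = residuals(img)
    freqs, lengths = Counter(res), huffman_code_lengths(res)
    return sum(freqs[s] * lengths[s] for s in freqs)
```

Decorrelating the image first concentrates the residual histogram around zero, which is what makes the Huffman stage effective.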

Journal ArticleDOI
TL;DR: This paper presents a set of description schemes (DS) dealing with video programs, users and devices designed to support personalization, efficient management of AV information and the expected variability in the capabilities ofAV information access devices.
Abstract: This paper presents a set of description schemes (DS) dealing with video programs, users and devices. Following MPEG-7 terminology, a description of an AV document includes descriptors (termed Ds), which specify the syntax and semantics of a representation entity for a feature of the AV data, and description schemes (termed DSs) which specify the structure and semantics of a set of Ds and DSs. The Program DS is used to describe the physical structure as well as the semantic content of a video program. It focuses on the visual information only. The physical structure is described by the temporal organization of the sequence (segments), the spatial organization of images (regions) as well as the spatio-temporal structure of the video (regions with motion). The semantic description is built around objects and events. Finally, the physical and semantic descriptions are related by a set of links defining where or when instances of specific semantic notions can be found. The User DS is used to describe the personal preferences and usage patterns of a user. It facilitates a smart personalizable device that records and presents to the user audio and video information based upon the user's preferences, prior viewing and listening habits, as well as personal characteristics. Finally, the Device DS keeps a record of the users of the device, available programs, and a description of device capabilities. It allows a device to prepare itself based on the existing users, profiles and available programs. These three types of DSs and the common set of descriptors that they share are designed to support personalization, efficient management of AV information and the expected variability in the capabilities of AV information access devices.

Journal ArticleDOI
TL;DR: The paper concentrates on the motivations and objectives behind MPEG-7, giving some applications, outlining the process and work plan, and explains the relation with the other MPEG standards, notably MPEG-4.
Abstract: The value of information often depends on how easily it can be found, retrieved, accessed, filtered and managed. An incommensurable amount of audiovisual information is becoming available in digital form, in digital archives, on the World Wide Web, in broadcast data streams and in personal and professional databases, and this amount is only growing. In spite of the fact that users have increasing access to these resources, identifying and managing them efficiently is becoming more difficult, because of the growing volume. The question of identifying content is not just restricted to database retrieval applications such as digital libraries, but extends to areas like broadcast channel selection, multimedia editing, and multimedia directory services. In 1996, MPEG recognised the need to identify multimedia content and started a work item formally called "Multimedia Content Description Interface", better known as MPEG-7. The new MPEG-7 standard will provide a rich set of standardised tools to describe multimedia content. The people active in defining MPEG-7 represent broadcasters, equipment and chip manufacturers, digital content creators and managers, telecommunication service providers, publishers and intellectual property rights managers, as well as university researchers. Both human users and automatic systems that process audiovisual information are within the scope of MPEG-7. This paper presents an overview of the MPEG-7 standardisation project. It concentrates on the motivations and objectives behind MPEG-7, giving some applications and outlining the process and work plan. It also explains the relation with the other MPEG standards, notably MPEG-4.

Journal ArticleDOI
TL;DR: This work considers motion estimation between images of a video sequence in the presence of illumination variations in the scene and proposes a pel-recursive motion estimator adapted to this new motion model in order to estimate both motion and illumination variation fields.
Abstract: We consider motion estimation between images of a video sequence in the presence of illumination variations in the scene. The standard model of motion between consecutive images is extended to a prediction model with multiplicative prediction coefficient. This additional coefficient is interpreted as an illumination variation parameter. A pel-recursive motion estimator is adapted to this new motion model in order to estimate both motion and illumination variation fields. We present experiments on real images containing localised illumination variations and show that the proposed approach allows the prediction error to be largely reduced in comparison with the standard pel-recursive motion estimation algorithm.
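A block-based illustration of the extended prediction model I_t(p) ≈ α·I_{t−1}(p − d): for each candidate displacement, the multiplicative coefficient α has a closed-form least-squares solution. This is only a sketch of the model; the paper uses a pel-recursive estimator rather than block matching:

```python
# Sketch: joint block-matching estimate of displacement (dy, dx) and illumination gain alpha.
import numpy as np

def estimate_block(cur, ref, y, x, bs=16, search=7):
    """Minimise ||B - alpha * R_shifted||^2 over integer displacements and alpha."""
    B = cur[y:y + bs, x:x + bs].astype(float)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + bs > ref.shape[0] or xx + bs > ref.shape[1]:
                continue
            R = ref[yy:yy + bs, xx:xx + bs].astype(float)
            alpha = (B * R).sum() / ((R * R).sum() + 1e-12)   # closed-form least-squares gain
            err = ((B - alpha * R) ** 2).sum()
            if best is None or err < best[0]:
                best = (err, dy, dx, alpha)
    return best[1:]        # (dy, dx, alpha)
```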

Journal ArticleDOI
TL;DR: This paper presents a contour-based approach to efficiently code binary shape information in the context of object-based video coding which meets some of the most important requirements identified for the MPEG-4 standard, notably efficient coding and low delay.
Abstract: This paper presents a contour-based approach to efficiently code binary shape information in the context of object-based video coding. This approach meets some of the most important requirements identified for the MPEG-4 standard, notably efficient coding and low delay. The proposed methods support both object-based lossless and quasi-lossless coding modes. For the cases where low delay is a primary requirement, a macroblock-based coding mode is proposed which can take advantage of inter-frame coding to improve the coding efficiency. The approach presented here relies on a grid different from that used for the pixels to represent the shape – the hexagonal grid – which simplifies the task of contour coding. Using this grid, an approach based on a differential chain code (DCC) is proposed for the lossless mode while, for the quasi-lossless case, an approach based on the multiple grid chain code (MGCC) principle is proposed. The MGCC combines both contour simplification and contour prediction to reduce the number of bits needed to code the shapes. Results for alpha plane coding of MPEG-4 video test sequences are presented in order to illustrate the performance of the several modes of operation, and a comparison is made with the shape-coding tool chosen by MPEG-4.

Journal ArticleDOI
TL;DR: The MPEG-4 profiles and levels as discussed by the authors serve two main purposes: (1) ensuring interoperability between MPEG-4 implementations, and (2) allowing conformance to the standard to be tested.
Abstract: Profiles and levels in MPEG-4 are standardised in order to give users a number of well-defined and well-chosen conformance points. They serve two main purposes: (1) ensuring interoperability between MPEG-4 implementations, and (2) allowing conformance to the standard to be tested. Profiles exist not only for the Audio and Visual parts of the standard (audio profiles and visual profiles), but also for the Systems part of the standard, in the form of graphics profiles, scene graph profiles, and an object descriptor profile. Different profiles are created for different application environments. The policy for defining profiles is that they should enable as many applications as possible while keeping the number of different profiles low. MPEG has defined a first set of profiles for MPEG-4, but more are expected. MPEG will be restrictive in defining any new profiles, listening carefully to what its users have to say.

Journal ArticleDOI
TL;DR: This work presents a new technique for reducing the encoding complexity of fractal image compression that is lossless, i.e., it does not sacrifice any image reconstruction quality for the sake of speedup, and outperforms other currently known lossless acceleration methods.
Abstract: In fractal image compression the encoding step is computationally expensive. We present a new technique for reducing the encoding complexity. It is lossless, i.e., it does not sacrifice any image reconstruction quality for the sake of speedup. It is based on a codebook coherence characteristic of fractal image compression and leads to a novel application of the fast Fourier transform-based cross correlation. The proposed method is particularly well suited for use with highly irregular image partitions for which most traditional (lossy) acceleration schemes lose a large part of their efficiency. For large ranges our approach outperforms other currently known lossless acceleration methods.
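The core trick, computing all range/domain inner products at once with an FFT-based cross-correlation, can be sketched as follows (this shows the cross-correlation step only, not the codebook-coherence bookkeeping of the method):

```python
# Sketch: all range/domain inner products in one pass via FFT-based cross-correlation.
import numpy as np
from scipy.signal import fftconvolve

def range_domain_correlations(domain_img, range_block):
    """Entry (i, j) equals sum(domain_img[i:i+B, j:j+B] * range_block)."""
    # Correlation = convolution with the block flipped in both directions.
    return fftconvolve(domain_img.astype(float),
                       range_block[::-1, ::-1].astype(float), mode='valid')
```

These inner products are exactly what the least-squares scale factor of each candidate domain block requires, which is why the search can be accelerated without changing its outcome.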

Journal ArticleDOI
TL;DR: This paper presents two motion descriptors which were recommended by MPEG to become part of the first visual reference model (XM 1.0) of the evolving MPEG-7 standard in development and are important elements in capturing the dynamic content of video sequences in a compact form.
Abstract: This paper presents two motion descriptors which were recommended by MPEG to become part of the first visual reference model (XM 1.0) of the evolving MPEG-7 standard in development. These motion descriptors are: (i) the camera motion descriptor which describes the global motion of the camera or of the observer in a natural 3-D scene, and (ii) the object motion trajectory descriptor which describes how an object moves in 3-D space or in the 2-D image plane. These two descriptors are important elements in capturing the dynamic content of video sequences in a compact form. They are used to index video sequences according to their dynamic content. Applications that use these descriptors include TV program classification, video editing for broadcast TV and movies, broadcast sports, and video surveillance.

Journal ArticleDOI
TL;DR: This paper presents a videotext description scheme and automatic methods for detection and representation of text in video segments, one based on edge characterization and the other based on region analysis.
Abstract: Superimposed text and scene text in video, i.e. videotext, brings important semantic clues into content analysis. In this paper we present a videotext description scheme and automatic methods for detection and representation of text in video segments. One of the methods is based on edge characterization while the other is based on region analysis. Applications of the videotext description scheme are numerous ranging from video indexing and annotation, ticker-tape analysis, commercial detection, transcript analysis to cross-modal querying using text and face information.
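A rough edge-density detector in the spirit of the edge-based method (window size and thresholds are illustrative assumptions, not the paper's parameters):

```python
# Sketch: edge-density candidate mask for superimposed text regions.
import numpy as np
from scipy import ndimage

def text_candidate_mask(gray, win=15, edge_thresh=60.0, density_thresh=0.2):
    gray = gray.astype(float)
    gx, gy = ndimage.sobel(gray, axis=1), ndimage.sobel(gray, axis=0)
    edges = (np.hypot(gx, gy) > edge_thresh).astype(float)      # strong-edge map
    density = ndimage.uniform_filter(edges, size=win)           # local edge density
    mask = density > density_thresh                             # text tends to be edge-dense
    # Horizontal closing so that characters of one text line merge into a single region.
    return ndimage.binary_closing(mask, structure=np.ones((3, 15)))
```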

Journal ArticleDOI
TL;DR: A compression method that can be described as an intermediate stage between traditional transform coding and vector quantization; it can be used for compressing natural images but is best suited for real multidimensional data.
Abstract: Discrete cosine transform (DCT) is used in practically all commercial image and video compression systems. It has been found to be the best choice among all unitary transforms in compressing natural images. We are also quite convinced that it is difficult to find any clear alternative for DCT in typical transform coders. Instead, we decided to change the problem. If another single transform cannot outperform DCT, how about using a set of transforms? This paper tries to give an overview about this idea. It was tested in several experiments, which show that the use of several transforms can improve compression ratio despite the side information needed. The result is a compression method that can be described as an intermediate stage between traditional transform coding and vector quantization. It can be used for compressing natural images, but it would be suited best for real multidimensional data.
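A toy version of per-block transform selection, here choosing between the DCT and an (assumed) Walsh-Hadamard alternative by counting significant quantized coefficients; the paper's actual transform set and cost measure are not reproduced:

```python
# Sketch: per-block choice between the DCT and a Walsh-Hadamard transform.
import numpy as np
from scipy.fft import dctn
from scipy.linalg import hadamard

H8 = hadamard(8) / np.sqrt(8)                 # orthonormal 8x8 Walsh-Hadamard matrix

def best_transform(block, q=16):
    """Pick the transform that leaves fewer significant coefficients after quantization."""
    candidates = {'dct': dctn(block, norm='ortho'),
                  'wht': H8 @ block @ H8.T}
    costs = {name: int(np.count_nonzero(np.round(c / q))) for name, c in candidates.items()}
    choice = min(costs, key=costs.get)
    return choice, costs[choice] + 1          # +1 accounts for the per-block side information

def choose_transforms(img, q=16):
    h, w = (s - s % 8 for s in img.shape)
    return [[best_transform(img[y:y + 8, x:x + 8].astype(float), q)[0]
             for x in range(0, w, 8)] for y in range(0, h, 8)]
```

The side-information term makes the trade-off explicit: a second transform only pays off when it reduces the coefficient cost by more than the bits needed to signal the choice.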

Journal ArticleDOI
TL;DR: This paper introduces a new binary shape coding technique called generalized predictive shape coding (GPSC) to encode the boundary of a visual object compactly by using a vertex-based approach that retains the advantages of existing polygon-based algorithms for visual content description while furnishing better geometric compression.
Abstract: This paper introduces a new binary shape coding technique called generalized predictive shape coding (GPSC) to encode the boundary of a visual object compactly by using a vertex-based approach. GPSC consists of a contour pixel matching algorithm and a motion-compliant contour coding algorithm. The contour pixel matching algorithm utilizes the knowledge of previously decoded contours by using a uniform translational model for silhouette motion, and generalizes polygon approximation for lossless and lossy motion estimation by adjusting a tolerance parameter d_max. To represent motion-compliant regions with minimum information in the transmitted bitstream, we develop a reference index-based coding scheme to represent the 2D positions of the matched segments using 1D reference contour indices. Finally, we encode the mismatched segments by sending residual polygons until the distortion is less than d_max. While GPSC realizes polygon approximation exactly at every encoding stage, we can guarantee quality of service because the peak distortion is no greater than d_max, and we improve coding efficiency as long as a silhouette complies with the model. The tolerance parameter d_max can be assigned to each contour to smooth the transmitted data rate, which is especially useful for constant bandwidth channels. Compared with non-predictive approaches, simulation using MPEG-4 sequences demonstrates that GPSC not only improves objective gain but also enhances visual quality based on MPEG-4 subjective tests. The significance of GPSC is that it provides a generic framework for seamlessly extending conventional vertex coding schemes into the temporal domain yet it retains the advantages of existing polygon-based algorithms for visual content description while furnishing better geometric compression.

Journal ArticleDOI
TL;DR: An efficient method for the lossy encoding of object shapes which are given as 8-connect chain codes, using a mathematical model that approximates a boundary by a second-order B-spline curve and considers the problem of finding the curve with the lowest bit-rate for a given distortion.
Abstract: A major problem in object-oriented video coding is the efficient encoding of the shape information of arbitrarily shaped objects. Efficient shape coding schemes are also needed for encoding the shape information of video objects (VOs) in the upcoming MPEG-4 standard. Furthermore, there are many applications where only the shape needs to be encoded, such as CAD, 3D modeling and signature encoding. In this paper, we present an efficient method for the lossy encoding of object shapes which are given as 8-connect chain codes using a mathematical model. We approximate a boundary by a second-order B-spline curve and consider the problem of finding the curve with the lowest bit-rate for a given distortion. The presented scheme is optimal, efficient and offers complete control over the trade-off between bit-rate and distortion. It is an extension of our previous research where we used polygons to approximate a boundary. The main reason for using curves rather than polygons is that curves have a more natural appearance than polygons and can give better coding efficiencies. We present results of the proposed scheme using object boundaries of different shapes and sizes as well as an MPEG-4 test sequence. © 2000 Elsevier Science B.V. All rights reserved.
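The modelling idea can be illustrated by evaluating a closed uniform second-order (quadratic) B-spline through a reduced set of control points and measuring the peak deviation from the original boundary; the rate-distortion-optimal selection of control points in the paper is not reproduced here:

```python
# Sketch: closed uniform quadratic B-spline through control points, plus peak deviation.
import numpy as np

def quadratic_bspline(control_points, samples_per_span=20):
    """Evaluate a closed uniform second-order B-spline defined by the control points."""
    P = np.asarray(control_points, float)
    n = len(P)
    t = np.linspace(0.0, 1.0, samples_per_span, endpoint=False)
    b0, b1, b2 = 0.5 * (1 - t) ** 2, 0.5 + t * (1 - t), 0.5 * t ** 2   # quadratic B-spline basis
    spans = [b0[:, None] * P[i] + b1[:, None] * P[(i + 1) % n] + b2[:, None] * P[(i + 2) % n]
             for i in range(n)]                                        # wrap around: closed curve
    return np.concatenate(spans)

def peak_deviation(boundary_pixels, control_points):
    """Maximum distance from any original boundary pixel to the approximating curve."""
    curve = quadratic_bspline(control_points)
    d = np.linalg.norm(np.asarray(boundary_pixels, float)[:, None, :] - curve[None, :, :], axis=2)
    return d.min(axis=1).max()
```

Fewer control points mean a lower bit-rate but a larger peak deviation, which is exactly the bit-rate/distortion trade-off the abstract describes.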