Author

Guido M. Schuster

Bio: Guido M. Schuster is an academic researcher from University of Applied Sciences of Eastern Switzerland. The author has contributed to research in topics: Lossy compression & Shape coding. The author has an h-index of 16 and has co-authored 48 publications receiving 748 citations. Previous affiliations of Guido M. Schuster include Northwestern University & USRobotics.

Papers
Journal Article
TL;DR: A prototype compressive video camera is presented that encodes scene movement using a translated binary photomask in the optical path, and the use of a printed binary mask allows reconstruction at higher spatial resolutions than has been previously demonstrated.
Abstract: We present a prototype compressive video camera that encodes scene movement using a translated binary photomask in the optical path. The encoded recording can then be used to reconstruct multiple output frames from each captured image, effectively synthesizing high speed video. The use of a printed binary mask allows reconstruction at higher spatial resolutions than has been previously demonstrated. In addition, we improve upon previous work by investigating tradeoffs in mask design and reconstruction algorithm selection. We identify a mask design that consistently provides the best performance across multiple reconstruction strategies in simulation, and verify it with our prototype hardware. Finally, we compare reconstruction algorithms and identify the best choice in terms of balancing reconstruction quality and speed.

74 citations

Journal Article
TL;DR: The fundamental problem of optimally splitting a video sequence into two sources of information, the displaced frame difference (DFD) and the displacement vector field (DVF) is addressed, and a general dynamic programming (DP) formulation which results in an optimal tradeoff between the DVF and the DFD is derived.
Abstract: We address the fundamental problem of optimally splitting a video sequence into two sources of information, the displaced frame difference (DFD) and the displacement vector field (DVF). We first consider the case of a lossless motion-compensated video coder (MCVC), and derive a general dynamic programming (DP) formulation which results in an optimal tradeoff between the DVF and the DFD. We then consider the more important case of a lossy MCVC, and present an algorithm which solves the tradeoff between the rate and the distortion. This algorithm is based on the Lagrange multiplier method and the DP approach introduced for the lossless MCVC. We then present an H.263-based MCVC which uses the proposed optimal bit allocation, and compare its results to H.263. As expected, the proposed coder is superior in the rate-distortion sense. In addition to this, it offers many advantages for a rate control scheme. The presented theory can be applied to build new optimal coders, and to analyze the heuristics employed in existing coders. In fact, whenever one changes an existing coder, the proposed theory can be used to evaluate how the change affects its performance.

72 citations

Proceedings Article
27 Feb 1996
TL;DR: A fast and efficient method for selecting the encoding modes and the quantizers for the ITU H.263 standard based on Lagrangian relaxation and dynamic programming (DP), which employs a fast evaluation of the operational rate distortion curve in the DCT domain and a fast iterative search which is based on a Bezier function.
Abstract: In this paper, a fast and efficient method for selecting the encoding modes and the quantizers for the ITU H.263 standard is presented. H.263 is a very low bit rate video coder which produces satisfactory results at bit rates around 24 kbits/second for low motion quarter common intermediate format (QCIF) color sequences such as 'mother and daughter.' Two major target applications for H.263 are video telephony using public switched telephone network lines and video telephony over wireless channels. In both cases, the channel bandwidth is very small, hence the efficiency of the video coder needs to be as high as possible. The presented algorithm addresses this problem by finding the smallest frame distortion for a given frame bit budget. The presented scheme is based on Lagrangian relaxation and dynamic programming (DP). It employs a fast evaluation of the operational rate distortion curve in the DCT domain and a fast iterative search which is based on a Bezier function. © (1996) COPYRIGHT SPIE--The International Society for Optical Engineering. Downloading of the abstract is permitted for personal use only.

67 citations

Patent
26 Oct 1995
TL;DR: In this paper, the authors present a method (200, 400) and device (500) for, within a variable or fixed block size video compression scheme, providing optimal bit allocation among at least three critical types of data: segmentation, motion vectors and prediction error, or DFD.
Abstract: The present invention provides a method (200, 400) and device (500) for, within a variable or fixed block size video compression scheme, providing optimal bit allocation among at least three critical types of data: segmentation, motion vectors and prediction error, or DFD. Since the amount of information represented by one bit for a particular type of data is not equivalent to the information represented by one bit for some other data type, this consideration is taken into account to efficiently encode the video sequence. Thus, a computationally efficient method is provided for optimally encoding a given frame of a video sequence wherein, for a given bit budget the proposed encoding scheme leads to the smallest possible distortion and vice versa, for a given distortion, the proposed encoding scheme leads to the smallest possible rate.

64 citations

Journal Article
TL;DR: This article addresses the issue of operationally optimal shape encoding, which is a step in the direction of globally optimal resource allocation in object-oriented video and introduces the directed acyclic graph (DAG) formulation of the problem, which results in a fast solution approach.
Abstract: In this article, we address the issue of operationally optimal shape encoding, which is a step in the direction of globally optimal resource allocation in object-oriented video. After an overview of shape-based coding and algorithms, we define the problem mathematically, introduce the necessary notation, and then present the basic idea behind the proposed algorithms. We then discuss the constraints imposed on the code used to encode the approximation. We then introduce a definition of distortion that fits into the proposed framework and introduce the directed acyclic graph (DAG) formulation of the problem, which results in a fast solution approach. We also show how the DAG algorithm can be used to find the approximation with the minimum-maximum segment distortion for a given rate as well as to find the approximation with the smallest total distortion for a given rate. We then present experimental results and point out directions for future research.

52 citations


Cited by
Journal Article

2,415 citations

Proceedings Article
01 Jan 1994
TL;DR: The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations which are closer to the human interpretation of images.
Abstract: MUCKE aims to mine a large volume of images, to structure them conceptually and to use this conceptual structuring in order to improve large-scale image retrieval. The last decade witnessed important progress concerning low-level image representations. However, there are a number of problems which need to be solved in order to unleash the full potential of image mining in applications. The central problem with low-level representations is the mismatch between them and the human interpretation of image content. This problem can be instantiated, for instance, by the inability of existing descriptors to capture spatial relationships between the concepts represented or by their inability to convey an explanation of why two images are similar in a content-based image retrieval framework. We start by assessing existing local descriptors for image classification and by proposing to use co-occurrence matrices to better capture spatial relationships in images. The main focus in MUCKE is on cleaning large-scale Web image corpora and on proposing image representations which are closer to the human interpretation of images. Consequently, we introduce methods which tackle these two problems and compare results to state-of-the-art methods. Note: some aspects of this deliverable are withheld at this time as they are pending review. Please contact the authors for a preview.

2,134 citations

Journal Article
TL;DR: Based on the well-known hybrid video coding structure, Lagrangian optimization techniques are presented that try to answer the question: what part of the video signal should be coded using what method and parameter settings?
Abstract: The rate-distortion efficiency of video compression schemes is based on a sophisticated interaction between various motion representation possibilities, waveform coding of differences, and waveform coding of various refreshed regions. Hence, a key problem in high-compression video coding is the operational control of the encoder. This problem is compounded by the widely varying content and motion found in typical video sequences, necessitating the selection between different representation possibilities with varying rate-distortion efficiency. This article addresses the problem of video encoder optimization and discusses its consequences on the compression architecture of the overall coding system. Based on the well-known hybrid video coding structure, Lagrangian optimization techniques are presented that try to answer the question: what part of the video signal should be coded using what method and parameter settings?

1,954 citations

Journal Article
TL;DR: The article provides arguments in favor of an alternative approach that uses splines, which is equally justifiable on a theoretical basis, and which offers many practical advantages, and brings out the connection with the multiresolution theory of the wavelet transform.
Abstract: The article provides arguments in favor of an alternative approach that uses splines, which is equally justifiable on a theoretical basis, and which offers many practical advantages. To reassure the reader who may be afraid to enter new territory, it is emphasized that one is not losing anything because the traditional theory is retained as a particular case (i.e., a spline of infinite degree). The basic computational tools are also familiar to a signal processing audience (filters and recursive algorithms), even though their use in the present context is less conventional. The article also brings out the connection with the multiresolution theory of the wavelet transform. This article attempts to fulfil three goals. The first is to provide a tutorial on splines that is geared to a signal processing audience. The second is to gather all their important properties and provide an overview of the mathematical and computational tools available; i.e., a road map for the practitioner with references to the appropriate literature. The third goal is to give a review of the primary applications of splines in signal and image processing.

1,732 citations