
Showing papers in "Electronic Imaging" (2003)


Proceedings ArticleDOI
TL;DR: A new dataset, UCID (pronounced "use it") - an Uncompressed Colour Image Dataset which tries to bridge the gap between standardised image databases and objective evaluation of image retrieval algorithms that operate in the compressed domain.
Abstract: Standardised image databases, or rather the lack of them, are one of the main weaknesses in the field of content based image retrieval (CBIR). Authors often use their own images or do not specify the source of their datasets. Naturally this makes comparison of results somewhat difficult. While a first approach towards a common colour image set has been taken by the MPEG-7 committee, their database does not cater for all strands of research in the CBIR community. In particular, as the MPEG-7 images only exist in compressed form, it does not allow for an objective evaluation of image retrieval algorithms that operate in the compressed domain, or for judging the influence image compression has on the performance of CBIR algorithms. In this paper we introduce a new dataset, UCID (pronounced "use it") - an Uncompressed Colour Image Dataset which tries to bridge this gap. The UCID dataset currently consists of 1338 uncompressed images together with a ground truth of a series of query images with corresponding models that an ideal CBIR algorithm would retrieve. While its initial intention was to provide a dataset for the evaluation of compressed domain algorithms, the UCID database also represents a good benchmark set for the evaluation of any kind of CBIR method as well as an image set that can be used to evaluate image compression and colour quantisation algorithms.

1,117 citations


Proceedings ArticleDOI
TL;DR: In this article, it is shown that these embedding methods are equivalent to a lowpass filtering of histograms that is quantified by a decrease in the HCF center of mass (COM), which is exploited in known scheme detection to classify unaltered and spread spectrum images using a bivariate classifier.
Abstract: The process of information hiding is modeled in the context of additive noise. Under an independence assumption, the histogram of the stegomessage is a convolution of the noise probability mass function (PMF) and the original histogram. In the frequency domain this convolution is viewed as a multiplication of the histogram characteristic function (HCF) and the noise characteristic function. Least significant bit, spread spectrum, and DCT hiding methods for images are analyzed in this framework. It is shown that these embedding methods are equivalent to a lowpass filtering of histograms that is quantified by a decrease in the HCF center of mass (COM). These decreases are exploited in known-scheme detection to classify unaltered and spread spectrum images using a bivariate classifier. Finally, a blind detection scheme is built that uses only statistics from unaltered images. By calculating the Mahalanobis distance from a test COM to the training distribution, a threshold is used to identify steganographic images. At an embedding rate of 1 b.p.p., greater than 95% of the stegoimages are detected with a false alarm rate of 5%.
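As a rough illustration of the detection statistic described above, the following sketch (Python; the function name and the exact indexing conventions are illustrative, not taken from the paper) computes the center of mass of the histogram characteristic function for a grayscale image:

```python
import numpy as np

def hcf_center_of_mass(image, bins=256):
    """Center of mass (COM) of the histogram characteristic function (HCF),
    i.e. the DFT of the image histogram. Additive-noise embedding lowpass
    filters the histogram, which shows up as a decrease of this COM."""
    hist, _ = np.histogram(image.ravel(), bins=bins, range=(0, bins))
    hcf = np.fft.fft(hist)                  # histogram characteristic function
    k = np.arange(1, bins // 2)             # positive frequencies, DC excluded
    mags = np.abs(hcf[1:bins // 2])
    return float(np.sum(k * mags) / np.sum(mags))
```

A detector along the lines of the paper would collect such COM features from unaltered training images and flag a test image whose Mahalanobis distance to that training distribution exceeds a threshold.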

444 citations


Proceedings ArticleDOI
TL;DR: An innovative image annotation tool for classifying image regions in one of seven classes - sky, skin, vegetation, snow, water, ground, and buildings - or as unknown is described.
Abstract: The paper describes an innovative image annotation tool for classifying image regions in one of seven classes - sky, skin, vegetation, snow, water, ground, and buildings - or as unknown. This tool could be productively applied in the management of large image and video databases, where a considerable volume of images/frames must be automatically indexed. The annotation is performed by a classification system based on a multi-class Support Vector Machine. Experimental results on a test set of 200 images are reported and discussed.
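A minimal sketch of the kind of multi-class SVM classifier the abstract describes, using scikit-learn; the per-region features, the handling of the "unknown" label and all parameter values are assumptions for illustration, not the paper's actual configuration:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

CLASSES = ["sky", "skin", "vegetation", "snow", "water", "ground", "buildings"]

# Placeholder training data: one feature vector per labeled image region
# (e.g., color and texture statistics) and its class index.
X_train = np.random.rand(200, 16)
y_train = np.random.randint(0, len(CLASSES), size=200)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)

region_features = np.random.rand(1, 16)
print(CLASSES[int(clf.predict(region_features)[0])])
```

An "unknown" outcome could be produced by rejecting predictions whose decision margin falls below a threshold.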

296 citations


Proceedings ArticleDOI
TL;DR: A novel objective segmentation evaluation method based on information theory that uses entropy as the basis for measuring the uniformity of pixel characteristics (luminance is used in this paper) within a segmentation region.
Abstract: Accurate image segmentation is important for many image, video and computer vision applications. Over the last few decades, many image segmentation methods have been proposed. However, the results of these segmentation methods are usually evaluated only visually, qualitatively, or indirectly by the effectiveness of the segmentation on the subsequent processing steps. Such methods are either subjective or tied to particular applications. They do not judge the performance of a segmentation method objectively, and cannot be used as a means to compare the performance of different segmentation techniques. A few quantitative evaluation methods have been proposed, but these early methods have been based entirely on empirical analysis and have no theoretical grounding. In this paper, we propose a novel objective segmentation evaluation method based on information theory. The new method uses entropy as the basis for measuring the uniformity of pixel characteristics (luminance is used in this paper) within a segmentation region. The evaluation method provides a relative quality score that can be used to compare different segmentations of the same image. This method can be used to compare both various parameterizations of one particular segmentation method as well as fundamentally different segmentation techniques. The results from this preliminary study indicate that the proposed evaluation method is superior to the prior quantitative segmentation evaluation techniques, and identify areas for future research in objective segmentation evaluation.
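A small sketch of the region-uniformity idea: the entropy of luminance values inside each region, weighted by region size (lower is more uniform). This shows only the uniformity term; the paper's full measure also has to penalize over-segmentation, which is not modeled here.

```python
import numpy as np

def region_luminance_entropy(luminance, labels):
    """Size-weighted average entropy of luminance within each region.
    `luminance`: 2-D array of integer gray levels; `labels`: same-shaped
    array of region ids produced by a segmentation algorithm."""
    total = luminance.size
    score = 0.0
    for region in np.unique(labels):
        values = luminance[labels == region].astype(np.int64)
        counts = np.bincount(values)
        p = counts[counts > 0] / values.size
        score += (values.size / total) * -np.sum(p * np.log2(p))
    return score
```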

194 citations


Proceedings ArticleDOI
TL;DR: The effects of orientation angle and surround luminance are treated: the orientation angle is incorporated in the contrast sensitivity formula, and a correction factor is given for the dependence on surround luminance.
Abstract: For design criteria of displayed images and for the judgment of image quality, it is very important to have a reliable formula for the contrast sensitivity of the human eye. The contrast sensitivity function or CSF depends on a number of conditions. Most important are the luminance and the viewing angle of the object, but surround illumination can also play a role. In this paper a practical formula is given for a standard observer. This formula is derived from a more general physical formula for the contrast sensitivity. The effects of orientation angle and surround luminance are also treated: the orientation angle is incorporated in the formula, and a correction factor is given for the dependence on surround luminance.
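For concreteness, the widely quoted approximation of this contrast sensitivity formula, attributed to Barten, is sketched below. The constants are quoted from memory and should be treated as assumptions; the paper's standard-observer formula additionally handles orientation angle and surround luminance, which are not modeled here.

```python
import numpy as np

def csf(u, L=100.0, w=12.0):
    """Approximate contrast sensitivity S(u) for spatial frequency u in
    cycles/degree, object luminance L in cd/m^2 and angular object size w
    in degrees (Barten-style approximation; constants are assumptions)."""
    a = 540.0 * (1.0 + 0.7 / L) ** -0.2 / (1.0 + 12.0 / (w * (1.0 + u / 3.0) ** 2))
    b = 0.3 * (1.0 + 100.0 / L) ** 0.15
    c = 0.06
    return a * u * np.exp(-b * u) * np.sqrt(1.0 + c * np.exp(b * u))
```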

187 citations


Proceedings ArticleDOI
TL;DR: An automatic camera calibration algorithm for court sports that locates the court in the image without any user assistance or a-priori knowledge about the most probable position and optimizes the parameters using a gradient-descent algorithm.
Abstract: We propose an automatic camera calibration algorithm for court sports. The obtained camera calibration parameters are required for applications that need to convert positions in the video frame to real-world coordinates or vice versa. Our algorithm uses a model of the arrangement of court lines for calibration. Since the court model can be specified by the user, the algorithm can be applied to a variety of different sports. The algorithm starts with a model initialization step which locates the court in the image without any user assistance or a-priori knowledge about the most probable position. Image pixels are classified as court line pixels if they pass several tests including color and local texture constraints. A Hough transform is applied to extract line elements, forming a set of court line candidates. The subsequent combinatorial search establishes correspondences between lines in the input image and lines from the court model. For the succeeding input frames, an abbreviated calibration algorithm is used, which predicts the camera parameters for the new image and optimizes the parameters using a gradient-descent algorithm. We have conducted experiments on a variety of sport videos (tennis, volleyball, and goal area sequences of soccer games). Video scenes with considerable difficulties were selected to test the robustness of the algorithm. Results show that the algorithm is very robust to occlusions, partial court views, bad lighting conditions, or shadows.
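The first two stages (court-line pixel test and Hough line extraction) could look roughly like the OpenCV sketch below; the thresholds, the simplified brightness test standing in for the paper's color/texture constraints, and the omission of the model correspondence search and gradient-descent refinement are all assumptions of this illustration.

```python
import cv2
import numpy as np

def court_line_candidates(frame_bgr, bright_thresh=180, darker_margin=20):
    """Classify pixels as court-line candidates (bright pixels on a darker
    local background) and extract straight-line candidates with a Hough
    transform. Returns an array of (rho, theta) pairs, or None."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    local_mean = cv2.GaussianBlur(gray, (15, 15), 0)
    mask = ((gray > bright_thresh) &
            (gray.astype(np.int16) - local_mean > darker_margin))
    mask = mask.astype(np.uint8) * 255
    return cv2.HoughLines(mask, rho=1, theta=np.pi / 180, threshold=200)
```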

161 citations


Proceedings ArticleDOI
TL;DR: Some first results on the capacity of reversible watermarking and data-hiding schemes have been derived and these are indicative of more practical schemes that will be presented in subsequent papers.
Abstract: An undesirable side effect of many watermarking and data-hiding schemes is that the host signal into which auxiliary data is embedded is distorted. Finding an optimal balance between the amount of information embedded and the induced distortion is therefore an active field of research. In recent years, with the rediscovery of Costa's seminal paper Writing on Dirty Paper, there has been considerable progress in understanding the fundamental limits of the capacity versus distortion of watermarking and data-hiding schemes. For some applications, however, no distortion resulting from auxiliary data, however small, is allowed. In these cases the use of reversible data-hiding methods provides a way out. A reversible data-hiding scheme is defined as a scheme that allows complete and blind restoration (i.e. without additional signaling) of the original host data. Practical reversible data-hiding schemes have been proposed by Fridrich et al., but little attention has been paid to the theoretical limits. Some first results on the capacity of reversible watermarking schemes have been derived. The reversible schemes considered in most previous papers have a highly fragile nature: in those schemes, changing a single bit in the watermarked data would prohibit recovery of both the original host signal as well as the embedded auxiliary data. It is the purpose of this paper to repair this situation and to provide some first results on the limits of robust reversible data-hiding. Admittedly, the examples provided in this paper are toy examples, but they are indicative of more practical schemes that will be presented in subsequent papers.

141 citations


Proceedings ArticleDOI
TL;DR: The Image Systems Evaluation Toolkit (ISET) as mentioned in this paper is an integrated suite of software routines that simulate the capture and processing of visual scenes and includes a graphical user interface (GUI) for users to control the physical characteristics of the scene and many parameters of the optics, sensor electronics and image processing pipeline.
Abstract: The Image Systems Evaluation Toolkit (ISET) is an integrated suite of software routines that simulate the capture and processing of visual scenes. ISET includes a graphical user interface (GUI) for users to control the physical characteristics of the scene and many parameters of the optics, sensor electronics and image processing pipeline. ISET also includes color tools and metrics based on international standards (chromaticity coordinates, CIELAB and others) that assist the engineer in evaluating the color accuracy and quality of the rendered image.

135 citations


Proceedings ArticleDOI
TL;DR: A new higher-order steganalytic method called Pairs Analysis for detection of secret messages embedded in digital images, which enables more reliable and accurate message detection than previous methods.
Abstract: In this paper, we describe a new higher-order steganalytic method called Pairs Analysis for detection of secret messages embedded in digital images. Although the approach is in principle applicable to many different steganographic methods as well as image formats, it is ideally suited to 8-bit images, such as GIF images, where message bits are embedded in LSBs of indices to an ordered palette. The EzStego algorithm with random message spread and optimized palette order is used as an embedding archetype on which we demonstrate Pairs Analysis and compare its performance with the chi-square attacks and our previously proposed RS steganalysis. Pairs Analysis enables more reliable and accurate message detection than previous methods. The method was tested on databases of GIF images of natural scenes, cartoons, and computer-generated images. The experiments indicate that the relative steganographic capacity of the EzStego algorithm with random message spread is less than 10% of the total image capacity (0.1 bits per pixel).

129 citations


Proceedings ArticleDOI
TL;DR: In this article, the authors proposed a new steganographic paradigm for digital images in raster formats, where message bits are embedded in the cover image by adding a weak noise signal with a specified but arbitrary probabilistic distribution.
Abstract: In this paper, we present a new steganographic paradigm for digital images in raster formats. Message bits are embedded in the cover image by adding a weak noise signal with a specified but arbitrary probabilistic distribution. This embedding mechanism provides the user with the flexibility to mask the embedding distortion as noise generated by a particular image acquisition device. This type of embedding will lead to more secure schemes because now the attacker must distinguish statistical anomalies that might be created by the embedding process from those introduced during the image acquisition itself. Unlike previously proposed schemes, this new approach, that we call stochastic modulation, achieves oblivious data transfer without using noise extraction algorithms or error correction. This leads to higher capacity (up to 0.8 bits per pixel) and a convenient and simple implementation with low embedding and extraction complexity. But most importantly, because the embedding noise can have arbitrary properties that approximate a given device noise, the new method offers better security than existing methods. At the end of this paper, we extend stochastic modulation to a content-dependent device noise and we also discuss possible attacks on this scheme based on the most recent advances in steganalysis.
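A toy sketch of the general idea (emphatically not the paper's exact embedding rule): a key-driven noise sample of prescribed distribution is added or subtracted so that a shared parity function of the stego pixel carries the message bit, and pixels with a zero noise sample are skipped. The particular parity function and the neglect of boundary clipping are illustrative assumptions.

```python
import numpy as np

def embed_stochastic(cover, message_bits, key, scale=2.0):
    """Toy stochastic-modulation-style embedder: Gaussian noise samples are
    generated from a shared key; the sign of each nonzero sample is chosen so
    that p(x, s) = floor(x / (2s)) mod 2 of the stego pixel equals the next
    message bit. The receiver regenerates the same noise from the key and
    evaluates p() to read the bits. Clipping at 0/255 is ignored here."""
    rng = np.random.default_rng(key)
    noise = np.rint(rng.normal(0.0, scale, size=cover.size)).astype(np.int64)
    stego = cover.astype(np.int64).ravel()
    bits = iter(message_bits)
    for i, s in enumerate(np.abs(noise)):
        if s == 0:
            continue                     # zero noise samples carry no bit
        try:
            bit = next(bits)
        except StopIteration:
            break
        plus, minus = stego[i] + s, stego[i] - s
        # plus and minus differ by 2s, so exactly one of them has the parity
        # floor(x / (2s)) mod 2 equal to the message bit
        stego[i] = plus if (plus // (2 * s)) % 2 == bit else minus
    return stego.reshape(cover.shape)
```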

124 citations


Proceedings ArticleDOI
TL;DR: This paper discusses color image quality metrics and presents no-reference artifact metrics for blockiness, blurriness, and colorfulness, showing that these metrics are highly correlated with experimental data collected through subjective experiments.
Abstract: Color image quality depends on many factors, such as the initial capture system and its color image processing, compression, transmission, the output device, media and associated viewing conditions. In this paper, we are primarily concerned with color image quality in relation to compression and transmission. We review the typical visual artifacts that occur due to high compression ratios and/or transmission errors. We discuss color image quality metrics and present no-reference artifact metrics for blockiness, blurriness, and colorfulness. We show that these metrics are highly correlated with experimental data collected through subjective experiments. We use them for no-reference video quality assessment in different compression and transmission scenarios and again obtain very good results. We conclude by discussing the important effects viewing conditions can have on image quality.
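As an example of the colorfulness attribute mentioned above, the sketch below computes an opponent-color colorfulness statistic of the kind often associated with these authors; the exact metric and the 0.3 weight used in the paper may differ, so treat them as assumptions.

```python
import numpy as np

def colorfulness(rgb):
    """No-reference colorfulness from simple opponent-channel statistics.
    `rgb`: H x W x 3 float array with R, G, B channels."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g                      # red-green opponent channel
    yb = 0.5 * (r + g) - b          # yellow-blue opponent channel
    sigma = np.hypot(rg.std(), yb.std())
    mu = np.hypot(rg.mean(), yb.mean())
    return sigma + 0.3 * mu
```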

Proceedings ArticleDOI
TL;DR: Experimental results show that the proposed technique can be used to detect the presence of hidden messages in digital audio data.
Abstract: Classification of audio documents as bearing hidden information or not is a security issue addressed in the context of steganalysis. A cover audio object can be converted into a stego-audio object via steganographic methods. In this study we present a statistical method to detect the presence of hidden messages in audio signals. The basic idea is that the distributions of various statistical distance measures, calculated on cover audio signals and on stego-audio signals vis-a-vis their denoised versions, are statistically different. The design of the audio steganalyzer relies on the choice of these audio quality measures and the construction of a two-class classifier. Experimental results show that the proposed technique can be used to detect the presence of hidden messages in digital audio data.
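A minimal sketch of that pipeline: denoise the signal, compute distance measures between the signal and its denoised version, and feed the resulting feature vectors from known cover and stego clips to a two-class classifier. The Wiener filter and the two measures below are placeholders, not the specific audio quality measures used in the paper.

```python
import numpy as np
from scipy.signal import wiener

def quality_features(signal):
    """Distance measures between a 1-D float audio signal and its denoised
    version; these stand in for the paper's audio quality measures."""
    denoised = wiener(signal, mysize=29)
    noise = signal - denoised
    snr = 10.0 * np.log10(np.sum(signal ** 2) / (np.sum(noise ** 2) + 1e-12))
    spec = np.log10(np.abs(np.fft.rfft(signal)) + 1e-12)
    spec_dn = np.log10(np.abs(np.fft.rfft(denoised)) + 1e-12)
    log_spectral_dist = np.sqrt(np.mean((spec - spec_dn) ** 2))
    return np.array([snr, log_spectral_dist])

# Feature vectors from labeled cover and stego clips would then train a
# two-class classifier (e.g., an SVM-based detector).
```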

Proceedings ArticleDOI
TL;DR: In this paper, a method for synthesizing enhanced depth of field digital still camera pictures using multiple differently focused images is presented, which exploits only spatial image gradients in the initial decision process.
Abstract: A method for synthesizing enhanced depth of field digital still camera pictures using multiple differently focused images is presented. This technique exploits only spatial image gradients in the initial decision process. The image gradient as a focus measure has been shown to be experimentally valid and theoretically sound under weak assumptions with respect to unimodality and monotonicity. Subsequent majority filtering corroborates decisions with those of neighboring pixels, while the use of soft decisions enables smooth transitions across region boundaries. Furthermore, these last two steps add algorithmic robustness for coping with both sensor noise and optics-related effects, such as misregistration or optical flow, and minor intensity fluctuations. The dependence of these optical effects on several optical parameters is analyzed, and potential remedies that can allay their impact with regard to the technique's limitations are discussed. Several examples of image synthesis using the algorithm are presented. Finally, leveraging the increasing functionality and emerging processing capabilities of digital still cameras, the method is shown to entail modest hardware requirements and is implementable using a parallel or general purpose processor.
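The core selection rule lends itself to a short sketch: a gradient-magnitude focus measure per pixel, an arg-max over the registered source images, and a majority filter on the decision map. The soft-decision blending and the handling of misregistration described above are omitted, and the 5x5 filter size is an assumption.

```python
import numpy as np
from scipy import ndimage

def fuse_focus_stack(images):
    """Fuse registered, differently focused grayscale images: per pixel,
    pick the source with the largest gradient magnitude, then smooth the
    decision map with a modal (majority) filter."""
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    focus = np.stack([np.hypot(*np.gradient(im)) for im in stack])
    decision = np.argmax(focus, axis=0)          # index of sharpest source

    def majority(window):
        values, counts = np.unique(window, return_counts=True)
        return values[np.argmax(counts)]

    decision = ndimage.generic_filter(decision, majority, size=5)
    rows, cols = np.indices(decision.shape)
    return stack[decision, rows, cols]
```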

Proceedings ArticleDOI
TL;DR: An overview of 3-D digitizing techniques is presented, with an emphasis on the numerous commercial techniques and systems currently available and a focus on commercial systems that are good representations of the key technologies that have survived the test of time.
Abstract: We review 20 years of development in the field of 3-D laser imaging. An overview of 3-D digitizing techniques is presented with an emphasis on commercial techniques and systems currently available. It covers some of the most important methods that have been developed, both at the National Research Council of Canada (NRC) and elsewhere, with a focus on commercial systems that are considered good representations of the key technologies that have survived the test of years.

Proceedings ArticleDOI
TL;DR: This paper identifies two limitations of the proposed approach and shows how they can be overcome to obtain accurate detection in every case and outlines a condition that must be satisfied by all secure high-capacity steganographic algorithms for JPEGs.
Abstract: In this paper, we present general methodology for developing attacks on steganographic systems for the JPEG image format. The detection first starts by decompressing the JPEG stego image, geometrically distorting it (e.g., by cropping), and recompressing. Because the geometrical distortion breaks the quantized structure of DCT coefficients during recompression, the distorted/recompressed image will have many macroscopic statistics approximately equal to those of the cover image. We choose a macroscopic statistic S that also changes predictably with the embedded message length. By doing so, we estimate the unknown message length by comparing the values of S for the stego image and the cropped/recompressed stego image. The details of this detection methodology are explained on the F5 algorithm and OutGuess. The accuracy of the message length estimate is demonstrated on test images for both algorithms. Finally, we identify two limitations of the proposed approach and show how they can be overcome to obtain accurate detection in every case. The paper closes by outlining a condition that must be satisfied by all secure high-capacity steganographic algorithms for JPEGs.
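The calibration step is easy to sketch: decompress the stego image, crop a few pixels, recompress, and compare a macroscopic statistic between the two versions. The blockwise-DCT statistic below and the fixed quality factor are placeholders; the paper chooses the statistic S per algorithm (F5, OutGuess) and recompresses with the stego image's own quantization table.

```python
import io
import numpy as np
from PIL import Image
from scipy.fft import dctn

def blockwise_dct_stat(gray):
    """A simple macroscopic statistic: mean absolute 8x8 blockwise DCT
    AC coefficient (a stand-in for the paper's statistic S)."""
    h, w = (gray.shape[0] // 8) * 8, (gray.shape[1] // 8) * 8
    blocks = gray[:h, :w].reshape(h // 8, 8, w // 8, 8).swapaxes(1, 2)
    coefs = dctn(blocks, axes=(-2, -1), norm="ortho")
    coefs[..., 0, 0] = 0.0                       # drop the DC terms
    return float(np.mean(np.abs(coefs)))

def calibrated_difference(stego_path, quality=75, crop=4):
    """Difference of S between the stego image and its cropped-and-
    recompressed calibration, which approximates the cover statistics."""
    img = Image.open(stego_path).convert("L")
    stego = np.asarray(img, dtype=np.float64)
    buf = io.BytesIO()
    img.crop((crop, crop, img.width, img.height)).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    calib = np.asarray(Image.open(buf).convert("L"), dtype=np.float64)
    return blockwise_dct_stat(stego) - blockwise_dct_stat(calib)
```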

Proceedings ArticleDOI
TL;DR: In this paper, a method for generalizing the observation model to incorporate spatially varying point spread functions and general motion fields is presented, which utilizes results from image resampling theory which is shown to have equivalences with the multi-frame image observation model used in super-resolution restoration.
Abstract: Multi-frame super-resolution restoration algorithms commonly utilize a linear observation model relating the recorded images to the unknown restored image estimates. Working within this framework, we demonstrate a method for generalizing the observation model to incorporate spatially varying point spread functions and general motion fields. The method utilizes results from image resampling theory which is shown to have equivalences with the multi-frame image observation model used in super-resolution restoration. An algorithm for computing the coefficients of the spatially varying observation filter is developed. Examples of the application of the proposed method are presented.
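A toy version of one frame of such an observation model, with a global sub-pixel shift standing in for the general motion field and a single Gaussian standing in for the spatially varying PSF developed in the paper; since each step is linear, the chain can in principle be collected into one sparse observation matrix per frame.

```python
import numpy as np
from scipy import ndimage

def observe(hr_image, shift=(0.3, -0.7), sigma=1.2, factor=4):
    """Warp -> blur -> decimate: a simplified linear observation model
    mapping a high-resolution estimate to one low-resolution frame."""
    warped = ndimage.shift(hr_image, shift, order=3, mode="nearest")
    blurred = ndimage.gaussian_filter(warped, sigma=sigma)
    return blurred[::factor, ::factor]
```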

Proceedings ArticleDOI
TL;DR: A framework to handle semantic scene classification, where a natural scene may contain multiple objects such that the scene can be described by multiple class labels, is presented and appears to generalize to other classification problems of the same nature.
Abstract: In classic pattern recognition problems, classes are mutually exclusive by definition. Classification errors occur when the classes overlap in the feature space. We examine a different situation, occurring when the classes are, by definition, not mutually exclusive. Such problems arise in scene and document classification and in medical diagnosis. We present a framework to handle such problems and apply it to the problem of semantic scene classification, where a natural scene may contain multiple objects such that the scene can be described by multiple class labels (e.g., a field scene with a mountain in the background). Such a problem poses challenges to the classic pattern recognition paradigm and demands a different treatment. We discuss approaches for training and testing in this scenario and introduce new metrics for evaluating individual examples, class recall and precision, and overall accuracy. Experiments show that our methods are suitable for scene classification; furthermore, our work appears to generalize to other classification problems of the same nature.
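A sketch of the kind of multi-label evaluation the abstract mentions, using common definitions of per-class recall/precision and a per-example accuracy; these may not be the paper's exact formulations.

```python
import numpy as np

def multilabel_scores(y_true, y_pred):
    """y_true, y_pred: binary (n_examples x n_classes) matrices in which an
    example may carry several labels. Returns per-class recall, per-class
    precision, and a per-example Jaccard-style overall accuracy."""
    tp = np.logical_and(y_true, y_pred).sum(axis=0)
    recall = tp / np.maximum(y_true.sum(axis=0), 1)
    precision = tp / np.maximum(y_pred.sum(axis=0), 1)
    inter = np.logical_and(y_true, y_pred).sum(axis=1)
    union = np.logical_or(y_true, y_pred).sum(axis=1)
    accuracy = np.mean(inter / np.maximum(union, 1))
    return recall, precision, accuracy
```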

Proceedings ArticleDOI
TL;DR: This paper provides an overview of Project RESCUE, which aims to enhance the mitigation capabilities of first responders in the event of a crisis by dramatically transforming their ability to collect, store, analyze, interpret, share and disseminate data.
Abstract: This paper provides an overview of Project RESCUE, which aims to enhance the mitigation capabilities of first responders in the event of a crisis by dramatically transforming their ability to collect, store, analyze, interpret, share and disseminate data. The multidisciplinary research agenda incorporates a variety of information technologies: networks; distributed systems; databases; image and video processing; and machine learning, together with subjective information obtained through social science. While the IT challenges focus on systems and algorithms to get the right information to the right person at the right time, social science provides the right context. Besides providing an overview of the nature of RESCUE research activities the paper highlights challenges of particular interest to the internet imaging community.

Proceedings ArticleDOI
TL;DR: A subjective evaluation test of visual comfort and sense of presence using 48 different stereoscopic HDTV pictures, compared against the parallax distributions measured by the proposed method, showed that the range of parallax distribution and the average parallax distribution significantly affect visual comfort when viewing stereoscopic HDTV images.
Abstract: The relationship between visual comfort and parallax distribution for stereoscopic HDTV has been studied. In this study, we first examined a method for measuring this parallax distribution. As it is important to understand the characteristics of the distribution in a frame or temporal changes of the characteristics, rather than having detailed information on the parallax at every point, we propose a method to measure the parallax based on phase correlation. It includes a way of reducing the measurement error of the phase correlation method. The method was used to measure stereoscopic HDTV images with good results. Secondly, we conducted a subjective evaluation test of visual comfort and sense of presence using 48 different stereoscopic HDTV pictures, and compared the results with the parallax distributions in these pictures measured by the proposed method. The comparison showed that the range of parallax distribution and the average parallax distribution significantly affect visual comfort when viewing stereoscopic HDTV images. It is also suggested that the range of parallax distribution in many of the images that were judged comfortable to view lies within approximately 0.3 diopter.
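The basic phase-correlation step is sketched below for two same-sized grayscale views; the paper's refinements for reducing the measurement error, any block-wise processing, and the conversion of pixel disparity to diopters for a given display geometry are not reproduced.

```python
import numpy as np

def dominant_parallax(left, right):
    """Dominant horizontal shift (in pixels) between two views, from the
    peak of the inverse FFT of the normalized cross-power spectrum."""
    cross = np.fft.fft2(left) * np.conj(np.fft.fft2(right))
    corr = np.fft.ifft2(cross / (np.abs(cross) + 1e-12)).real
    peak_row, peak_col = np.unravel_index(np.argmax(corr), corr.shape)
    width = left.shape[1]
    return peak_col if peak_col <= width // 2 else peak_col - width
```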

Proceedings ArticleDOI
TL;DR: This standard describes two novel perceptual methods, the triplet comparison technique and the quality ruler, that yield results calibrated in just noticeable differences (JNDs).
Abstract: ISO 20462, a three-part standard entitled “Psychophysical experimental methods to estimate image quality,” is being developed by WG18 (Electronic Still Picture Imaging) of TC42 (Photography). As of late 2003, all three parts were in the Draft International Standard (DIS) ballot stage, with publication likely during 2004. This standard describes two novel perceptual methods, the triplet comparison technique and the quality ruler, that yield results calibrated in just noticeable differences (JNDs). Part 1, “Overview of psychophysical elements,” discusses specifications regarding observers, test stimuli, instructions, viewing conditions, data analysis, and reporting of results. Part 2, “Triplet comparison method,” describes a technique involving simultaneous five-point scaling of sets of three stimuli at a time, arranged so that all possible pairs of stimuli are compared exactly once. Part 3, “Quality ruler method,” describes a real-time technique optimized for obtaining assessments over a wider range of image quality. A single ruler is a series of ordered reference stimuli depicting a common scene but differing in a single perceptual attribute. Methods for generating quality ruler stimuli of known JND separation through modulation transfer function (MTF) variation are provided. Part 3 also defines a unique absolute Standard Quality Scale (SQS) of quality with one unit equal to one JND. Standard Reference Stimuli (SRS) prints calibrated against this new scale will be made available through the International Imaging Industry Association.

Proceedings ArticleDOI
Nathan Moroney
TL;DR: This web-based study uses a distributed design to collect a small number of color names from a large number of observers to identify focal colors, and results in CIELAB hues and lightnesses for the basic colors that agree with previous investigations about as well as those investigations agree with each other.
Abstract: This paper describes an ongoing web-based approach to collecting color names or color categories. Previous studies have tended to require a large number of observations from a small number of observers. These studies have also tended to limit responses to one-word or monolexical replies. Many studies have also focused on response time or levels of intra-observer agreement in order to identify focal colors. This web-based study uses a distributed design to collect a small number of names from a large number of observers. The responses are neither limited to nor restricted from being monolexical. The focal color analysis is then based on statistical analysis of monolexically named colors. This paper presents the methodology and infrastructure, as well as considerations for data analysis. Finally, preliminary results of the experiment are considered. The data from over 700 participants result in CIELAB hues and lightnesses for the basic colors that agree with previous investigations about as well as those investigations agree with each other.

Proceedings ArticleDOI
TL;DR: Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper that can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision.
Abstract: Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper. We first use a text-based image meta-search engine to retrieve images from the Web based on the text information on the image host pages to provide an initial image set. Because of the high-speed and low-cost nature of the text-based approach, we can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision. An image content based ordering is then performed on the initial image set. All the images are clustered into different folders based on the image content features. In addition, the images can be re-ranked by the content features according to the user feedback. Such a design makes it truly practical to use both text and image content for image retrieval over the Internet. Experimental results confirm the efficiency of the system.

Proceedings ArticleDOI
TL;DR: This work describes a forensic watermarking approach that is based on the inherent robustness and imperceptibility of very low spatiotemporal frequency watermark carriers, and on a watermark placement technique that renders jamming attacks too costly in picture quality, even if the attacker has complete knowledge of the embedding algorithm.
Abstract: Forensic digital watermarking is a promising tool in the fight against piracy of copyrighted motion imagery content, but to be effective it must be (1) imperceptibly embedded in high-definition motion picture source, (2) reliably retrieved, even from degraded copies as might result from camcorder capture and subsequent very-low-bitrate compression and distribution on the Internet, and (3) secure against unauthorized removal. No existing watermarking technology has yet met these three simultaneous requirements of fidelity, robustness, and security. We describe here a forensic watermarking approach that meets all three requirements. It is based on the inherent robustness and imperceptibility of very low spatiotemporal frequency watermark carriers, and on a watermark placement technique that renders jamming attacks too costly in picture quality, even if the attacker has complete knowledge of the embedding algorithm. The algorithm has been tested on HD Cinemascope source material exhibited in a digital cinema viewing room. The watermark is imperceptible, yet recoverable after exhibition capture with camcorders, and after the introduction of other distortions such as low-pass filtering, noise addition, geometric shifts, and the manipulation of brightness and contrast.

Proceedings ArticleDOI
TL;DR: A new image quality measure, based on the Singular Value Decomposition, that can be used as a multidimensional or a scalar measure to predict the distortion introduced by a wide range of noise sources.
Abstract: The important criteria used in subjective evaluation of distorted images include the amount of distortion, the type of distortion, and the distribution of error. An ideal image quality measure should therefore be able to mimic the human observer. We present a new image quality measure that can be used as a multidimensional or a scalar measure to predict the distortion introduced by a wide range of noise sources. Based on the Singular Value Decomposition, it reliably measures the distortion not only within a distortion type at different distortion levels but also across different distortion types. The measure was applied to Lena using six types of distortion (JPEG, JPEG 2000, Gaussian blur, Gaussian noise, sharpening and DC-shifting), each with five distortion levels.
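A sketch of an SVD-based measure in this spirit: per-block distances between the singular-value vectors of the original and distorted images give the multidimensional (graphical) measure, and a scalar is derived from those distances. The 8x8 block size and the particular scalar summary (mean absolute deviation from the median block distance) are assumptions of this sketch.

```python
import numpy as np

def svd_quality(original, distorted, block=8):
    """Returns (distance map, scalar score) comparing singular values of
    corresponding blocks of two grayscale float images of equal size."""
    h = (original.shape[0] // block) * block
    w = (original.shape[1] // block) * block
    dists = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            s_orig = np.linalg.svd(original[r:r + block, c:c + block], compute_uv=False)
            s_dist = np.linalg.svd(distorted[r:r + block, c:c + block], compute_uv=False)
            dists.append(np.linalg.norm(s_orig - s_dist))
    dists = np.asarray(dists)
    scalar = float(np.mean(np.abs(dists - np.median(dists))))
    return dists.reshape(h // block, w // block), scalar
```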

Proceedings ArticleDOI
TL;DR: This paper proposes new definitions for security and capacity in the presence of a steganalyst in steganography, and the intuition and mathematical notions supporting these definitions are described.
Abstract: Two fundamental questions in steganography are addressed in this paper, namely, (a) the definition of steganographic security and (b) the definition of steganographic capacity. Since the main goal of steganography is covert communications, we argue that these definitions must be dependent on the type of steganalysis detector employed to break the embedding algorithm. We propose new definitions for security and capacity in the presence of a steganalyst. The intuition and mathematical notions supporting these definitions are described. Some numerical examples are also presented to illustrate the need for this investigation.

Proceedings ArticleDOI
TL;DR: A holographic 3D display system using a 4 channel Active Tiling modulator with a new replay optics system has demonstrated directly viewable 3-D images and animations from 100 Mega-pixel CGH data.
Abstract: Giga-pixel scale displays or spatial light modulators are required in order to form directly viewable 3-D images of 0.5m in size using the principles of computer generated holography (CGH). This has been a key bottleneck preventing commercial development of electro-holography. Active Tiling is a modular spatial light modulator system developed by the authors to provide a route to replay images from giga-pixel scale CGHs. This paper will present the latest development of a multi-channel Active Tiling unit and results from this system for the first time. A holographic 3D display system using a 4 channel Active Tiling modulator with a new replay optics system has demonstrated directly viewable 3-D images and animations from 100 Mega-pixel CGH data. This provided viewing of both horizontal parallax only (HPO) and full parallax 3-D images up to 140mm in size. 25 Mega-pixels of CGH data is written by each channel onto a liquid crystal optically addressed spatial light modulator at high resolution. The modular design of Active Tiling permits CGH data to be written seamlessly across multiple channels which can be updated at rates up to 30 Hz. A Fourier Transform optical replay system was developed and integrated with the 4-channel Active Tiling system to form the CGH images.

Proceedings ArticleDOI
TL;DR: In this paper, it was shown that the maximal number of detections that can be performed in a geometrical search is bounded by the maximum false positive detection probability required by the watermark application.
Abstract: One way of recovering watermarks in geometrically distorted images is by performing a geometrical search. In addition to the computational cost required for this method, this paper considers the more important problem of false positives. The maximal number of detections that can be performed in a geometrical search is bounded by the maximum false positive detection probability required by the watermark application. We show that image and key dependency in the watermark detector leads to different false positive detection probabilities for geometrical searches for different images and keys. Furthermore, the image and key dependency of the tested watermark detector increases the random-image-random-key false positive detection probability, compared to the Bernoulli experiment that was used as a model.
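The bound the abstract refers to can be made concrete with the Bernoulli model: with per-detection false positive probability p and N detection attempts in the geometrical search, the overall false positive probability under independence is 1 - (1 - p)^N, which is at most N*p, and this caps the number of allowed detections. The numbers below are purely illustrative.

```python
p_single = 1e-9        # false positive probability of one detection (illustrative)
n_detections = 1000    # positions/scales tried in the geometrical search (illustrative)
overall = 1 - (1 - p_single) ** n_detections
print(overall, n_detections * p_single)   # both approximately 1e-6
```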

Proceedings ArticleDOI
TL;DR: This study extensively tested this new algorithm with a variety of settings using audio items with different characteristics and showed that for 16-bit PCM audio, capacities close to 1 bit per sample can be achieved, while perceptual degradation of the watermarked signal remained acceptable.
Abstract: A digital watermark can be seen as an information channel, which is hidden in a cover signal. It is usually designed to be imperceptible to human observers. Although imperceptibility is often achieved, the inherent modification of the cover signal may be viewed as a potential disadvantage. In this paper, we present a reversible watermarking technique for digital audio signals. In our context reversibility refers to the ability to restore the original input signal in the watermark detector. In summary, the approach works as follows. In the encoder, the dynamic range of the input signal is limited (i.e. it is compressed), and part of the unused bits is deployed for encoding the watermark bits. Another part of these bits is used to convey information for the bit-exact reconstruction of the cover signal. It is the purpose of the watermark detector to extract the watermark and reconstruct the input signal by restoring the original dynamic range. In this study we extensively tested this new algorithm with a variety of settings using audio items with different characteristics. These experiments showed that for 16-bit PCM audio, capacities close to 1 bit per sample can be achieved, while perceptual degradation of the watermarked signal remained acceptable.

Proceedings ArticleDOI
TL;DR: The Maximum Entropy statistical model was applied and extended to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text, which included various features such as motion, face, music/speech types, prosody, and high-level text segmentation information.
Abstract: In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/speech types, prosody, and high-level text segmentation information. The statistical fusion model is used to automatically discover relevant features contributing to the detection of story boundaries. One novel aspect of our method is the use of a feature wrapper to address different types of features -- asynchronous, discrete, continuous and delta ones. We also developed several novel features related to prosody. Using the large news video set from the TRECVID 2003 benchmark, we demonstrate satisfactory performance (F1 measures up to 0.76 in ABC news and 0.73 in CNN news), present how these multi-level multi-modal features construct the probabilistic framework, and more importantly observe an interesting opportunity for further improvement.
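Since a maximum entropy model over binary/real-valued features is equivalent to (regularized) logistic regression, the fusion step can be sketched as below; the features, the feature wrapper, and all data here are placeholders rather than the paper's actual setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder data: one row per candidate story-boundary point; columns are
# fused features from different modalities (motion, face, music/speech flag,
# prosody deltas, text-segmentation score); y = 1 for true boundaries.
X = np.random.rand(500, 5)
y = np.random.randint(0, 2, size=500)

# Maximum entropy classification over these features is equivalent to
# logistic regression; L2 regularization stands in for smoothing.
maxent = LogisticRegression(max_iter=1000).fit(X, y)
print(maxent.predict_proba(X[:3])[:, 1])   # boundary probabilities
```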

Proceedings ArticleDOI
TL;DR: Investigations on static volume displays with yttrium-lithium-fluoride (YLiF4) crystals, which are still very small but offer bright voxels with less laser power than is necessary in CaF2 crystals; potential applications are, for example, in medical imaging, entertainment and computer-aided design.
Abstract: The two basic classes of volumetric displays are swept volume techniques and static volume techniques. During several years of investigations on swept volume displays within the FELIX 3D Project we learned about some significant disadvantages of rotating screens, one of them being the presence of hidden zones, and therefore started investigations on static volume displays two years ago with a new group of high school students. Systems which are able to create a space-filling imagery without any moving parts are classified as static volume displays. A static setup, e.g. a transparent crystal, describes the complete volume of the display and is doped with optically active ions of rare earths. These ions are excited in two steps by two intersecting IR laser beams with different wavelengths (two-frequency, two-step upconversion) and afterwards emit visible photons. Suitable host materials are crystals, various special glasses and in the future even polymers. The advantage of this approach is that there are only very few hidden zones, which leads to a larger field of view and a larger viewing zone; the main disadvantage is the small size of the currently used fluoride crystals. Recently we started working with yttrium-lithium-fluoride (YLiF4) crystals, which are still very small but offer bright voxels with less laser power than is necessary in CaF2 crystals. Potential applications are for example in medical imaging, entertainment and computer aided design.