
Showing papers on "Discrete cosine transform published in 2007"


Journal ArticleDOI
TL;DR: A novel approach to image filtering based on the shape-adaptive discrete cosine transform is presented, addressing in particular image denoising and the deblocking and deringing of block-DCT compressed images; a special structural constraint in luminance-chrominance space is also proposed to enable accurate filtering of color images.
Abstract: The shape-adaptive discrete cosine transform (SA-DCT) can be computed on a support of arbitrary shape, but retains a computational complexity comparable to that of the usual separable block-DCT (B-DCT). Despite the near-optimal decorrelation and energy compaction properties, application of the SA-DCT has been rather limited, targeted nearly exclusively to video compression. In this paper, we present a novel approach to image filtering based on the SA-DCT. We use the SA-DCT in conjunction with the Anisotropic Local Polynomial Approximation-Intersection of Confidence Intervals technique, which defines the shape of the transform's support in a pointwise adaptive manner. The thresholded or attenuated SA-DCT coefficients are used to reconstruct a local estimate of the signal within the adaptive-shape support. Since supports corresponding to different points are in general overlapping, the local estimates are averaged together using adaptive weights that depend on the region's statistics. This approach can be used for various image-processing tasks. In this paper, we consider, in particular, image denoising and image deblocking and deringing from block-DCT compression. A special structural constraint in luminance-chrominance space is also proposed to enable an accurate filtering of color images. Simulation experiments show a state-of-the-art quality of the final estimate, both in terms of objective criteria and visual appearance. Thanks to the adaptive support, reconstructed edges are clean, and no unpleasant ringing artifacts are introduced by the fitted transform.
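To make the mechanics concrete, here is a minimal Python sketch of transform-domain filtering in the same spirit: overlapping blocks are DCT'd, hard-thresholded, inverted, and the overlapping local estimates are averaged. It uses fixed square 8x8 supports and uniform aggregation weights instead of the adaptive-shape supports and adaptive weights of the SA-DCT filter; all function names and parameters are illustrative.

```python
# Simplified illustration of overlapping transform-domain denoising
# (fixed 8x8 supports, uniform weights; not the SA-DCT filter itself).
import numpy as np
from scipy.fft import dctn, idctn

def overlapping_dct_denoise(noisy, block=8, step=4, thr=3.0, sigma=0.05):
    h, w = noisy.shape
    acc = np.zeros_like(noisy)
    wgt = np.zeros_like(noisy)
    for i in range(0, h - block + 1, step):
        for j in range(0, w - block + 1, step):
            patch = noisy[i:i + block, j:j + block]
            coef = dctn(patch, norm='ortho')
            coef[np.abs(coef) < thr * sigma] = 0.0      # hard thresholding
            est = idctn(coef, norm='ortho')              # local estimate
            acc[i:i + block, j:j + block] += est
            wgt[i:i + block, j:j + block] += 1.0
    return acc / np.maximum(wgt, 1e-12)                  # average overlapping estimates

rng = np.random.default_rng(0)
clean = np.kron(rng.random((8, 8)), np.ones((16, 16)))   # piecewise-constant test image
noisy = clean + 0.05 * rng.standard_normal(clean.shape)
denoised = overlapping_dct_denoise(noisy)
```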

721 citations


Journal ArticleDOI
TL;DR: A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst-case theoretical equal error rate (EER) is predicted to be as low as 2.59 x 10^-4 on the available data sets.
Abstract: This paper presents a novel iris coding method based on differences of discrete cosine transform (DCT) coefficients of overlapped angular patches from normalized iris images. The feature extraction capabilities of the DCT are optimized on the two largest publicly available iris image data sets, 2,156 images of 308 eyes from the CASIA database and 2,955 images of 150 eyes from the Bath database. On this data, we achieve 100 percent correct recognition rate (CRR) and perfect receiver-operating characteristic (ROC) curves with no registered false accepts or rejects. Individual feature bit and patch position parameters are optimized for matching through a product-of-sum approach to Hamming distance calculation. For verification, a variable threshold is applied to the distance metric and the false acceptance rate (FAR) and false rejection rate (FRR) are recorded. A new worst-case metric is proposed for predicting practical system performance in the absence of matching failures, and the worst-case theoretical equal error rate (EER) is predicted to be as low as 2.59 x 10^-4 on the available data sets.
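A rough sketch of the general coding idea (not the authors' implementation) is given below: overlapped patches are cut from the normalized iris image, reduced to 1-D profiles, DCT'd, and the signs of coefficient differences between adjacent patches form the binary code compared by Hamming distance. Patch geometry, windowing, and the product-of-sum matching described above are simplified away; all names and parameters are illustrative.

```python
# Sketch of DCT-difference iris coding on a normalized (unwrapped) iris image.
import numpy as np
from scipy.fft import dct

def iris_code(norm_iris, patch_w=12, overlap=6, n_coef=8):
    h, w = norm_iris.shape
    profiles = []
    for j in range(0, w - patch_w + 1, patch_w - overlap):
        patch = norm_iris[:, j:j + patch_w]
        profiles.append(dct(patch.mean(axis=0), norm='ortho')[:n_coef])
    profiles = np.array(profiles)
    diffs = np.diff(profiles, axis=0)          # differences between adjacent patches
    return (diffs > 0).astype(np.uint8)        # one bit per coefficient difference

def hamming_distance(code_a, code_b):
    return np.mean(code_a != code_b)

rng = np.random.default_rng(1)
code1 = iris_code(rng.random((64, 256)))
code2 = iris_code(rng.random((64, 256)))
print(hamming_distance(code1, code2))          # ~0.5 for unrelated patterns
```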

503 citations


Proceedings ArticleDOI
15 Feb 2007
TL;DR: In this article, a support vector machine (SVM) was used to construct a new multi-class JPEG steganalyzer with markedly improved performance by extending the 23 DCT feature set and applying calibration to the Markov features.
Abstract: Blind steganalysis based on classifying feature vectors derived from images is becoming increasingly more powerful. For steganalysis of JPEG images, features derived directly in the embedding domain from DCT coefficients appear to achieve the best performance (e.g., the DCT features [10] and Markov features [21]). The goal of this paper is to construct a new multi-class JPEG steganalyzer with markedly improved performance. We do so first by extending the 23 DCT feature set [10], then applying calibration to the Markov features described in [21] and reducing their dimension. The resulting feature sets are merged, producing a 274-dimensional feature vector. The new feature set is then used to construct a Support Vector Machine multi-classifier capable of assigning stego images to six popular steganographic algorithms: F5 [22], OutGuess [18], Model Based Steganography without [19] and with [20] deblocking, JP Hide&Seek [1], and Steghide [14]. Compared to our previous work on multi-classification [11, 12], the new feature set provides significantly more reliable results.
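The calibration step mentioned above can be illustrated as follows: a reference image is formed by decompressing, cropping a few pixels to desynchronize the 8x8 grid, and recompressing at the same quality; calibrated features are the difference between the features of the image under test and of this reference. The simple block-DCT histogram below is a stand-in for the DCT/Markov feature sets of the paper, and the quality factor and crop size are illustrative.

```python
# Sketch of feature calibration via a cropped-and-recompressed reference image.
import io
import numpy as np
from PIL import Image
from scipy.fft import dctn

def block_dct_histogram(img, bins=np.arange(-20, 21)):
    a = np.asarray(img.convert('L'), dtype=float)
    h, w = (a.shape[0] // 8) * 8, (a.shape[1] // 8) * 8
    coefs = []
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            coefs.append(dctn(a[i:i + 8, j:j + 8], norm='ortho').ravel())
    hist, _ = np.histogram(np.concatenate(coefs), bins=bins, density=True)
    return hist

def calibrated_feature(jpeg_path, quality=75, crop=4):
    img = Image.open(jpeg_path)
    ref = img.crop((crop, crop, img.width, img.height))   # shift the 8x8 grid
    buf = io.BytesIO()
    ref.save(buf, format='JPEG', quality=quality)          # recompress the reference
    buf.seek(0)
    ref = Image.open(buf)
    return block_dct_histogram(img) - block_dct_histogram(ref)
```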

451 citations


Journal ArticleDOI
TL;DR: A simple recursive modification of the split-radix algorithm is presented that computes the DFT with asymptotically about 6% fewer operations than Yavne, matching the count achieved by Van Buskirk's program-generation framework.
Abstract: Recent results by Van Buskirk have broken the record set by Yavne in 1968 for the lowest exact count of real additions and multiplications to compute a power-of-two discrete Fourier transform (DFT). Here, we present a simple recursive modification of the split-radix algorithm that computes the DFT with asymptotically about 6% fewer operations than Yavne, matching the count achieved by Van Buskirk's program-generation framework. We also discuss the application of our algorithm to real-data and real-symmetric (discrete cosine) transforms, where we are again able to achieve lower arithmetic counts than previously published algorithms.
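For context (these operation counts are standard in the FFT literature rather than quoted in the abstract), the totals of real additions and multiplications for a length-$N = 2^m$ complex DFT are, to leading order,

$$4N\log_2 N - 6N + 8 \quad \text{(Yavne, 1968 / standard split-radix)}, \qquad \tfrac{34}{9}N\log_2 N + O(N) \quad \text{(modified split-radix)},$$

so the asymptotic saving is $1 - \tfrac{34}{9}\cdot\tfrac{1}{4} = \tfrac{1}{18} \approx 5.6\%$, the "about 6%" referred to above.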

381 citations


Journal ArticleDOI
TL;DR: An imperceptible and robust combined DWT-DCT digital image watermarking algorithm is described that watermarks a given digital image using a combination of the discrete wavelet transform (DWT) and the discrete cosine transform (DCT).
Abstract: The proliferation of digitized media due to the rapid growth of networked multimedia systems has created an urgent need for copyright enforcement technologies that can protect copyright ownership of multimedia objects. Digital image watermarking is one such technology that has been developed to protect digital images from illegal manipulations. In particular, digital image watermarking algorithms which are based on the discrete wavelet transform have been widely recognized to be more prevalent than others. This is due to the wavelets' excellent spatial localization, frequency spread, and multi-resolution characteristics, which are similar to the theoretical models of the human visual system. In this paper, we describe an imperceptible and robust combined DWT-DCT digital image watermarking algorithm. The algorithm watermarks a given digital image using a combination of the Discrete Wavelet Transform (DWT) and the Discrete Cosine Transform (DCT). Performance evaluation results show that combining the two transforms improved the performance of watermarking algorithms that are based solely on the DWT.
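As an illustration of how the two transforms can be combined (a sketch under assumed design choices, not the paper's exact embedding rule), the snippet below applies a one-level DWT, block-DCTs one detail sub-band, and adds a key-driven pseudo-random value to a mid-band coefficient of each block. It assumes the pywt package is available; the chosen sub-band, block size, and gain alpha are illustrative.

```python
# Illustrative DWT-then-DCT embedding of a pseudo-random watermark.
import numpy as np
import pywt
from scipy.fft import dctn, idctn

def embed_dwt_dct(image, key=42, alpha=2.0, block=4):
    cA, (cH, cV, cD) = pywt.dwt2(np.asarray(image, dtype=float), 'haar')
    rng = np.random.default_rng(key)
    h, w = (cV.shape[0] // block) * block, (cV.shape[1] // block) * block
    for i in range(0, h, block):
        for j in range(0, w, block):
            c = dctn(cV[i:i + block, j:j + block], norm='ortho')
            c[1, 1] += alpha * rng.choice([-1.0, 1.0])   # key-driven mark on a mid-band coefficient
            cV[i:i + block, j:j + block] = idctn(c, norm='ortho')
    return pywt.idwt2((cA, (cH, cV, cD)), 'haar')

rng = np.random.default_rng(0)
watermarked = embed_dwt_dct(rng.random((128, 128)) * 255)
```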

319 citations


Journal ArticleDOI
TL;DR: A lossless and reversible steganography scheme is proposed for hiding secret data in each block of quantized discrete cosine transform (DCT) coefficients in JPEG images; it provides acceptable stego-image quality and successfully achieves reversibility.

314 citations


Book
13 Apr 2007

308 citations


Journal ArticleDOI
TL;DR: It is shown that the scheme based on the proposed low-complexity KLT significantly outperforms previous schemes in terms of rate-distortion performance, and an evaluation framework based on both reconstruction fidelity and impact on image exploitation is introduced.
Abstract: Transform-based lossy compression has a huge potential for hyperspectral data reduction. Hyperspectral data are 3-D, and the nature of their correlation is different in each dimension. This calls for a careful design of the 3-D transform to be used for compression. In this paper, we investigate the transform design and rate allocation stage for lossy compression of hyperspectral data. First, we select a set of 3-D transforms, obtained by combining in various ways wavelets, wavelet packets, the discrete cosine transform, and the Karhunen-Loève transform (KLT), and evaluate the coding efficiency of these combinations. Second, we propose a low-complexity version of the KLT, in which complexity and performance can be balanced in a scalable way, allowing one to design the transform that better matches a specific application. Third, we integrate this, as well as other existing transforms, in the framework of Part 2 of the Joint Photographic Experts Group (JPEG) 2000 standard, taking advantage of the high coding efficiency of JPEG 2000, and exploiting the interoperability of an international standard. We introduce an evaluation framework based on both reconstruction fidelity and impact on image exploitation, and evaluate the proposed algorithm by applying this framework to AVIRIS scenes. It is shown that the scheme based on the proposed low-complexity KLT significantly outperforms previous schemes in terms of rate-distortion performance. As for impact on exploitation, we consider multiclass hard classification, spectral unmixing, binary classification, and anomaly detection as benchmark applications.
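For readers unfamiliar with the spectral KLT, the following is a minimal sketch of the full transform (not the paper's low-complexity variant): the cube is reshaped to bands x pixels, the band covariance is eigendecomposed, and the spectra are projected onto the eigenvectors; the decorrelated eigen-bands would then be handed to a 2-D coder such as JPEG 2000. Array shapes and names are illustrative.

```python
# Minimal spectral KLT along the band dimension of a hyperspectral cube.
import numpy as np

def spectral_klt(cube):                    # cube: (bands, rows, cols)
    bands, rows, cols = cube.shape
    X = cube.reshape(bands, -1).astype(float)
    mean = X.mean(axis=1, keepdims=True)
    cov = np.cov(X - mean)                 # band-by-band covariance
    eigval, eigvec = np.linalg.eigh(cov)
    order = np.argsort(eigval)[::-1]       # sort by decreasing variance
    eigvec = eigvec[:, order]
    Y = eigvec.T @ (X - mean)              # decorrelated "eigen-bands"
    return Y.reshape(bands, rows, cols), eigvec, mean

rng = np.random.default_rng(0)
cube = rng.random((32, 64, 64))
transformed, basis, mean = spectral_klt(cube)
```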

292 citations


Proceedings ArticleDOI
27 Feb 2007
TL;DR: A novel statistical model based on Benford's law for the probability distributions of the first digits of the block-DCT and quantized JPEG coefficients is presented, and a parametric logarithmic law, i.e., the generalized Benford's law, is formulated.
Abstract: In this paper, a novel statistical model based on Benford's law for the probability distributions of the first digits of the block-DCT and quantized JPEG coefficients is presented. A parametric logarithmic law, i.e., the generalized Benford's law, is formulated. Furthermore, some potential applications of this model in image forensics are discussed in this paper, which include the detection of JPEG compression for images in bitmap format, the estimation of the JPEG compression Q-factor for JPEG compressed bitmap images, and the detection of double compressed JPEG images. The results of our extensive experiments demonstrate the effectiveness of the proposed statistical model.
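The sketch below computes the empirical first-digit distribution of block-DCT coefficient magnitudes and evaluates a parametric law of the generalized-Benford form p(d) = N * log10(1 + 1/(s + d^q)); the parameter values are placeholders rather than fitted values from the paper, and the feature is computed from pixel-domain DCTs rather than from a decoded JPEG stream.

```python
# First-digit statistics of block-DCT magnitudes vs. a generalized Benford law.
import numpy as np
from scipy.fft import dctn

def first_digit_distribution(image):
    a = np.asarray(image, dtype=float)
    h, w = (a.shape[0] // 8) * 8, (a.shape[1] // 8) * 8
    mags = []
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            c = dctn(a[i:i + 8, j:j + 8], norm='ortho')
            mags.append(np.abs(c.ravel()[1:]))     # skip the DC coefficient
    mags = np.concatenate(mags)
    mags = mags[mags >= 1]                         # need a leading digit
    first = (mags / 10 ** np.floor(np.log10(mags))).astype(int)
    return np.array([(first == d).mean() for d in range(1, 10)])

def generalized_benford(N=1.5, s=0.3, q=1.6):      # placeholder parameters
    d = np.arange(1, 10)
    return N * np.log10(1 + 1.0 / (s + d ** q))

rng = np.random.default_rng(0)
print(first_digit_distribution(rng.random((64, 64)) * 255))
print(generalized_benford())
```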

287 citations


Journal ArticleDOI
TL;DR: Investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets.
Abstract: A thorough investigation of the application of support vector regression (SVR) to the superresolution problem is conducted through various frameworks. Prior to the study, the SVR problem is enhanced by finding the optimal kernel. This is done by formulating the kernel learning problem in SVR form as a convex optimization problem, specifically a semi-definite programming (SDP) problem. An additional constraint is added to reduce the SDP to a quadratically constrained quadratic programming (QCQP) problem. After this optimization, investigation of the relevancy of SVR to superresolution proceeds with the possibility of using a single and general support vector regression for all image content, and the results are impressive for small training sets. This idea is improved upon by observing structural properties in the discrete cosine transform (DCT) domain to aid in learning the regression. Further improvement involves a combination of classification and SVR-based techniques, extending works in resolution synthesis. This method, termed kernel resolution synthesis, uses specific regressors for isolated image content to describe the domain through a partitioned look of the vector space, thereby yielding good results.

275 citations


Journal ArticleDOI
TL;DR: This scheme embeds the watermark without exposing the video content's confidentiality, provides a solution for signal processing in the encrypted domain, and increases operational efficiency, since the encrypted video can be watermarked without decryption.
Abstract: A scheme is proposed to implement commutative video encryption and watermarking during the advanced video coding process. In H.264/AVC compression, the intra-prediction mode, motion vector difference and discrete cosine transform (DCT) coefficients' signs are encrypted, while DCT coefficients' amplitudes are watermarked adaptively. To prevent the watermarking operation from affecting decryption, a traditional watermarking algorithm is modified. The encryption and watermarking operations are commutative. Thus, the watermark can be extracted from the encrypted videos, and the encrypted videos can be re-watermarked. This scheme embeds the watermark without exposing the video content's confidentiality, and provides a solution for signal processing in the encrypted domain. Additionally, it increases operational efficiency, since the encrypted video can be watermarked without decryption. These properties make the scheme a good choice for secure media transmission or distribution.
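A toy illustration of why the two operations commute: encryption flips coefficient signs with a key-driven stream while the watermark modulates only magnitudes, so the order of application does not matter. This is a drastic simplification of the H.264-domain scheme described above; the parity-based embedding and all names below are illustrative.

```python
# Sign encryption and magnitude watermarking commute because they touch
# disjoint parts of each coefficient.
import numpy as np

def encrypt_signs(coefs, key):
    rng = np.random.default_rng(key)
    flips = rng.integers(0, 2, size=coefs.shape) * 2 - 1   # +1/-1 keystream
    return coefs * flips                                    # self-inverse with the same key

def watermark_magnitudes(coefs, bits):
    out = coefs.copy()
    mags = np.abs(out).astype(int)
    for k, bit in enumerate(bits):                          # force magnitude parity = bit
        if mags[k] % 2 != bit:
            mags[k] += 1
        out[k] = np.sign(out[k]) * mags[k] if out[k] != 0 else mags[k]
    return out

coefs = np.array([7, -3, 5, -2, 9, -6, 4, -1])
a = watermark_magnitudes(encrypt_signs(coefs, key=7), bits=[1, 0, 1, 1])
b = encrypt_signs(watermark_magnitudes(coefs, bits=[1, 0, 1, 1]), key=7)
print(np.array_equal(a, b))                                 # True: order does not matter
```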

Journal ArticleDOI
TL;DR: A novel steganographic method, based on JPEG and Particle Swarm Optimization algorithm (PSO), that has larger message capacity and better image quality than Chang et al.'s and has a high security level is proposed.

Journal ArticleDOI
TL;DR: Viewing postprocessing as an inverse problem, this work proposes to solve it by the maximum a posteriori criterion and shows that the proposed method achieves higher PSNR gain than other methods and the processed images possess good visual quality.
Abstract: Transform coding using the discrete cosine transform (DCT) has been widely used in image and video coding standards, but at low bit rates, the coded images suffer from severe visual distortions which prevent further bit reduction. Postprocessing can reduce these distortions and alleviate the conflict between bit rate reduction and quality preservation. Viewing postprocessing as an inverse problem, we propose to solve it by the maximum a posteriori criterion. The distortion caused by coding is modeled as additive, spatially correlated Gaussian noise, while the original image is modeled as a high order Markov random field based on the fields of experts framework. Experimental results show that the proposed method, in most cases, achieves higher PSNR gain than other methods and the processed images possess good visual quality. In addition, we examine the noise model used and its parameter setting. The noise model assumes that the DCT coefficients and their quantization errors are independent. This assumption is no longer valid when the coefficients are truncated. We explain how this problem can be rectified using the current parameter setting.
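In equation form, the MAP estimate described above can be written (a standard fields-of-experts formulation, paraphrased here rather than quoted from the paper) as

$$\hat{x} \;=\; \arg\max_x \, p(y \mid x)\, p(x) \;=\; \arg\min_x \; \tfrac{1}{2}\,(y - x)^\top C^{-1} (y - x) \;-\; \sum_k \sum_i \log \phi_i\!\bigl(J_i^\top x_{(k)}\bigr),$$

where $y$ is the decoded image, $C$ the covariance of the spatially correlated Gaussian coding noise, $x_{(k)}$ the image patch at clique $k$, and $\phi_i, J_i$ the expert functions and linear filters of the fields-of-experts prior.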

Journal ArticleDOI
TL;DR: A robust watermarking algorithm for H.264 is proposed that detects the watermark from the decoded video sequence in order to make the algorithm robust to intraprediction mode changes and builds a theoretical framework for watermark detection based on a likelihood ratio test.
Abstract: As H.264 digital video becomes more prevalent, the need for copyright protection and authentication methods that are appropriate for this standard will emerge. This paper proposes a robust watermarking algorithm for H.264. We employ a human visual model adapted for a 4 × 4 discrete cosine transform block to increase the payload and robustness while limiting visual distortion. A key-dependent algorithm is used to select a subset of the coefficients that have visual watermarking capacity. Furthermore, the watermark is spread over frequencies and within blocks to avoid error pooling. This increases the payload and robustness without noticeably changing the perceptual quality. We embed the watermark in the coded residuals to avoid decompressing the video; however, we detect the watermark from the decoded video sequence in order to make the algorithm robust to intraprediction mode changes. We build a theoretical framework for watermark detection based on a likelihood ratio test. This framework is used to obtain optimal video watermark detection with controllable detection performance. Our simulation results show that we achieve the desired detection performance in Monte Carlo trials. We demonstrate the robustness of our proposed algorithm to several different attacks.

Journal ArticleDOI
Sangkeun Lee
TL;DR: The main advantage of the proposed algorithm is that it enhances the details in the dark and the bright areas with low computational cost, without boosting noise or affecting the compressibility of the original image, since it operates on images in the compressed domain.
Abstract: The object of this paper is to present a simple and efficient algorithm for dynamic range compression and contrast enhancement of digital images in noisy environments in the compressed domain. First, an image is separated into illumination and reflectance components. Next, the illumination component is manipulated adaptively for image dynamics by using a new content measure. Then, the reflectance component based on the measure of the spectral contents of the image is manipulated for image contrast. The spectral content measure is computed from the energy distribution across different spectral bands in a discrete cosine transform (DCT) block. The proposed approach also introduces a simple scheme for estimating and reducing noise information directly in the DCT domain. The main advantage of the proposed algorithm is that it enhances the details in the dark and the bright areas with low computational cost, without boosting noise or affecting the compressibility of the original image, since it operates on images in the compressed domain. In order to evaluate the proposed scheme, several baseline approaches are described and compared using enhancement quality measures.
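The following single-block sketch conveys the flavor of compressed-domain enhancement: the DC coefficient stands in for illumination and is compressed with a power law, and the AC coefficients are scaled by a gain derived from the block's AC energy. The mapping functions and constants are illustrative, not the paper's content measures, and the noise handling is omitted.

```python
# Toy compressed-domain enhancement of one 8x8 block.
import numpy as np
from scipy.fft import dctn, idctn

def enhance_block(block, gamma=0.7, max_gain=1.5):
    c = dctn(np.asarray(block, dtype=float), norm='ortho')
    dc = c[0, 0] / 8.0                              # block mean (orthonormal 8x8 DCT)
    c[0, 0] = 8.0 * 255.0 * (dc / 255.0) ** gamma   # dynamic-range compression of illumination
    ac_energy = np.sum(c ** 2) - c[0, 0] ** 2
    gain = 1.0 + (max_gain - 1.0) / (1.0 + ac_energy / 1e4)   # boost low-contrast blocks more
    c[1:, :] *= gain                                # scale all AC coefficients
    c[0, 1:] *= gain
    return np.clip(idctn(c, norm='ortho'), 0, 255)

rng = np.random.default_rng(0)
print(enhance_block(rng.integers(0, 60, size=(8, 8))))   # a dark block gets brightened
```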

Book ChapterDOI
22 Aug 2007
TL;DR: This paper proposes a lossless data hiding technique for JPEG images based on histogram pairs that embeds data into the JPEG quantized 8x8 block DCT coefficients and can obtain a higher payload than prior art.
Abstract: This paper proposes a lossless data hiding technique for JPEG images based on histogram pairs. It embeds data into the JPEG quantized 8x8 block DCT coefficients and can achieve good performance in terms of PSNR versus payload through manipulating histogram pairs with an optimum threshold and an optimum region of the JPEG DCT coefficients. It can obtain a higher payload than prior art. In addition, the increase of the JPEG file size after data embedding remains unnoticeable. These claims have been verified by our extensive experiments.
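The snippet below illustrates generic histogram shifting on quantized coefficients, which conveys why such embedding is reversible; the paper's histogram-pair scheme with optimized threshold and region selection is more elaborate. The peak value and bit ordering are illustrative.

```python
# Generic histogram-shifting embedding and its exact inverse.
import numpy as np

def embed(coefs, bits, peak=1):
    out = coefs.copy()
    out[out > peak] += 1                         # shift to open an empty bin at peak+1
    idx = np.flatnonzero(out == peak)[:len(bits)]
    out[idx] += np.asarray(bits)                 # peak encodes bit 0, peak+1 encodes bit 1
    return out

def extract(marked, n_bits, peak=1):
    idx = np.flatnonzero((marked == peak) | (marked == peak + 1))[:n_bits]
    bits = (marked[idx] == peak + 1).astype(int)
    restored = marked.copy()
    restored[idx] = peak                         # undo the embedding
    restored[restored > peak + 1] -= 1           # undo the shift
    return bits, restored

coefs = np.array([0, 1, 2, 1, 0, 3, 1, 1, 2, 0])
marked = embed(coefs, bits=[1, 0, 1])
bits, restored = extract(marked, n_bits=3)
print(bits, np.array_equal(restored, coefs))     # [1 0 1] True: fully reversible
```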

Journal ArticleDOI
TL;DR: This work derives the relationship between the discrete-frequency linear autocorrelation of a spectrum and the temporal envelope of a signal, and models the temporal envelope of the residual of regular AR modeling to efficiently capture signal structure in the most appropriate domain.
Abstract: Autoregressive (AR) models are commonly obtained from the linear autocorrelation of a discrete-time signal to obtain an all-pole estimate of the signal's power spectrum. We are concerned with the dual, frequency-domain problem. We derive the relationship between the discrete-frequency linear autocorrelation of a spectrum and the temporal envelope of a signal. In particular, we focus on the real spectrum obtained by a type-I odd-length discrete cosine transform (DCT-Io) which leads to the all-pole envelope of the corresponding symmetric squared Hilbert temporal envelope. A compact linear algebra notation for the familiar concepts of AR modeling clearly reveals the dual symmetries between modeling in time and frequency domains. By using AR models in both domains in cascade, we can jointly estimate the temporal and spectral envelopes of a signal. We model the temporal envelope of the residual of regular AR modeling to efficiently capture signal structure in the most appropriate domain.
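A compact sketch of the duality: applying ordinary Yule-Walker AR fitting to the DCT-I spectrum of a frame yields an all-pole model of the temporal envelope, just as fitting it to the time samples yields the spectral envelope. The model order, frame length, and helper function below are illustrative, and the symmetric-squared-Hilbert-envelope details of the paper are omitted.

```python
# AR modeling applied in the time domain (spectral envelope) and in the
# DCT-I frequency domain (temporal envelope).
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz

def ar_from_sequence(x, order):
    r = np.correlate(x, x, mode='full')[len(x) - 1:]        # autocorrelation lags 0..N-1
    a = solve_toeplitz(r[:order], r[1:order + 1])            # Yule-Walker equations
    return np.concatenate(([1.0], -a))                       # AR polynomial coefficients

n = np.arange(257)                                           # odd-length frame
signal = np.exp(-0.02 * n) * np.random.default_rng(0).standard_normal(257)
spectrum = dct(signal, type=1, norm='ortho')                 # DCT-Io of the frame
temporal_envelope_ar = ar_from_sequence(spectrum, order=10)  # all-pole temporal envelope
spectral_envelope_ar = ar_from_sequence(signal, order=10)    # ordinary time-domain LPC
```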

Journal ArticleDOI
TL;DR: A visual measure is proposed for the purpose of video compression that combines the motion attention model, an unconstrained eye-movement incorporated spatiovelocity visual sensitivity model, and a visual masking model, and it exhibits effectiveness in improving coding performance without picture quality degradation.
Abstract: Human visual sensitivity varies with not only spatial frequencies, but moving velocities of image patterns. Moreover, the loss of visual sensitivity due to object motions might be compensated by eye movement. Removing the psychovisual redundancies in both the spatial and temporal frequency domains facilitates an efficient coder without perceptual degradation. Motivated by this, a visual measure is proposed for the purpose of video compression. The novelty of this analysis relies on combining three visual factors altogether: the motion attention model, the unconstrained eye-movement incorporated spatiovelocity visual sensitivity model, and the visual masking model. For each motion-unattended macroblock, the retinal velocity is evaluated so that discrete cosine transform coefficients to which the human visual system has low sensitivity are picked up with the aid of the eye-movement incorporated spatiovelocity visual model. Based on masking thresholds of those low-sensitivity coefficients, a spatiotemporal distortion masking measure is determined. Accordingly, quantization parameters at the macroblock level for video coding are adjusted on the basis of this measure. Experiments conducted with H.264 demonstrate the effectiveness of the proposed scheme in improving coding performance without picture quality degradation.

01 Jan 2007
TL;DR: Compression schemes oriented to the task of remote transmission are becoming increasingly of interest in hyperspectral applications, because in many applications involving the communication of images, progressive transmission is desired in that successive reconstructions of the image are possible.
Abstract: Since hyperspectral imagery is generated by collecting hundreds of contiguous bands, uncompressed hyperspectral imagery can be very large, with a single image potentially occupying hundreds of megabytes. For instance, the Airborne Visible InfraRed Imaging Spectrometer (AVIRIS) sensor is capable of collecting several gigabytes of data per day. Compression is thus necessary to facilitate both the storage and the transmission of hyperspectral images. Since hyperspectral imagery is typically collected on remote acquisition platforms, such as satellites, the transmission of such data to central, often terrestrial, reception sites can be a critical issue. Thus, compression schemes oriented to the task of remote transmission are becoming increasingly of interest in hyperspectral applications. Although there have been a number of approaches to the compression of hyperspectral imagery proposed in recent years—prominent techniques would include vector quantization (VQ) (e.g., [1, 2]) or principal component analysis (PCA) (e.g., [3, 4]) applied to spectral pixel vectors, as well as 3D extensions of common image-compression methods such as the discrete cosine transform (DCT) (e.g., [5])—most of the approaches as proposed are not particularly well-suited to the image-transmission task. That is, in many applications involving the communication of images, progressive transmission is desired in that successive reconstructions of the image are possible. In such a scenario, the receiver can produce a low-quality representation of the image after having received only a small portion of the transmitted bitstream, and this "preview" representation can be successively refined as more of the bitstream is received.


Journal ArticleDOI
TL;DR: Three highly efficient thermal simulation algorithms for calculating the on-chip temperature distribution in a multilayered substrate structure are presented; all three are based on the concept of the Green function and utilize the technique of the discrete cosine transform.
Abstract: Due to technology scaling trends, the accurate and efficient calculations of the temperature distribution corresponding to a specific circuit layout and power density distribution will become indispensable in the design of high-performance very large scale integrated circuits. In this paper, we present three highly efficient thermal simulation algorithms for calculating the on-chip temperature distribution in a multilayered substrate structure. All three algorithms are based on the concept of the Green function and utilize the technique of discrete cosine transform. However, the application areas of the algorithms are different. The first algorithm is suitable for localized analysis in thermal problems, whereas the second algorithm targets full-chip temperature profiling. The third algorithm, which combines the advantages of the first two algorithms, can be used to perform thermal simulations where the accuracy requirement differs from place to place over the same chip. Experimental results show that all three algorithms can achieve relative errors of around 1% compared with that of a commercial computational fluid dynamic software package for thermal analysis, whereas their efficiencies are orders of magnitude higher than that of the direct application of the Green function method.
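The single-layer sketch below shows why the DCT arises naturally in such solvers: with adiabatic (Neumann) boundaries the discrete Laplacian diagonalizes in the cosine basis, so a steady-state temperature rise follows from one forward DCT of the power map, a pointwise division, and one inverse DCT. This is not the paper's multilayer Green-function model; the conductivity, grid size, and cell pitch are illustrative.

```python
# DCT-domain solution of a 2-D steady-state heat equation with adiabatic edges.
import numpy as np
from scipy.fft import dctn, idctn

def temperature_rise(power, k=100.0, pitch=1e-4):              # power in W per cell
    ny, nx = power.shape
    P = dctn(power / pitch**2, norm='ortho')                    # power-density spectrum
    lam_y = (2 - 2 * np.cos(np.pi * np.arange(ny) / ny)) / pitch**2
    lam_x = (2 - 2 * np.cos(np.pi * np.arange(nx) / nx)) / pitch**2
    denom = k * (lam_y[:, None] + lam_x[None, :])               # Laplacian eigenvalues
    denom[0, 0] = np.inf                                        # drop the undefined mean mode
    return idctn(P / denom, norm='ortho')                       # temperature rise about the mean

power = np.zeros((64, 64))
power[20:24, 30:34] = 0.5                                       # a small hot block
dT = temperature_rise(power)
```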

Patent
07 Aug 2007
TL;DR: In this article, the authors proposed a scheme for communicating channel quality measures that on the one hand allows for an accurate reconstruction of the channel quality at the receiver and on the other hand requires an acceptable transmission overhead.
Abstract: The invention relates to a method for transmitting and a method for reconstructing channel quality information in a communication system. Further, the invention also provides a transmitter and receiver performing these methods, respectively. The invention suggests a scheme for communicating channel quality measures that on the one hand allows for an accurate reconstruction of the channel quality measures at the receiver and on the other hand requires an acceptable transmission overhead. This is achieved by partitioning the channel quality measures into at least two partitions and compressing the values partition-wise, for example by means of a discrete cosine transform and the transmission of only a subset of the resulting coefficients.
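A toy version of the partition-wise compression described in the claim: the channel quality vector is split into partitions, each partition is DCT'd, and only the first few coefficients per partition are kept. The partition count and the number of retained coefficients are illustrative.

```python
# Partition-wise DCT compression and reconstruction of a CQI vector.
import numpy as np
from scipy.fft import dct, idct

def compress_cqi(cqi, n_partitions=2, keep=4):
    parts = np.array_split(np.asarray(cqi, dtype=float), n_partitions)
    return [dct(p, norm='ortho')[:keep] for p in parts], [len(p) for p in parts]

def reconstruct_cqi(coeff_sets, lengths):
    out = []
    for coefs, n in zip(coeff_sets, lengths):
        full = np.zeros(n)
        full[:len(coefs)] = coefs                  # zero-fill the discarded coefficients
        out.append(idct(full, norm='ortho'))
    return np.concatenate(out)

cqi = 10 + 3 * np.sin(np.linspace(0, 3, 24)) + np.random.default_rng(0).normal(0, 0.3, 24)
coeff_sets, lengths = compress_cqi(cqi)
print(np.round(reconstruct_cqi(coeff_sets, lengths), 2))
```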

Proceedings ArticleDOI
28 Jan 2007
TL;DR: It is shown by experimental results that the super-macroblock coding scheme can achieve a higher coding gain, and an adaptive scheme is proposed for the selection of the best coding mode and transform size.
Abstract: A high definition video coding technique using super-macroblocks is investigated in this work. Our research is motivated by the observation that the macroblock-based partition in H.264/AVC may not be efficient for high definition video since the maximum macroblock size of 16 x 16 is relatively small against the whole image size. In the proposed super-macroblock based video coding scheme, the original block size MxN in H.264 is scaled to 2Mx2N. Along with the super-macroblock prediction framework, a low-complexity 16 x 16 discrete cosine transform (DCT) is proposed. As compared with the 1-D 8-point DCT, only 16 additions are added for the 1-D 16-point transform used in the 16 x 16 DCT. Furthermore, an adaptive scheme is proposed for the selection of the best coding mode and the best transform size. It is shown by experimental results that the super-macroblock coding scheme can achieve a higher coding gain.

Journal ArticleDOI
TL;DR: The compression algorithm is based on a hybrid technique implementing a four-dimensional transform combining the discrete wavelet transform and the discrete cosine transform that outperforms the baseline JPEG compression scheme applied to II and a previous compression method developed for II based on MPEG II.
Abstract: Integral imaging (II) is a promising three-dimensional (3-D) imaging technique that uses an array of diffractive or refractive optical elements to record the 3-D information on a conventional digital sensor. With II, the object information is recorded in the form of an array of subimages, each representing a slightly different perspective of the object. In order to obtain high-quality 3-D images, digital sensors with a large number of pixels are required. Consequently, high-quality II involves recording and processing large amounts of data. In this paper, we present a compression method developed for the particular characteristics of the digitally recorded integral image. The compression algorithm is based on a hybrid technique implementing a four-dimensional transform combining the discrete wavelet transform and the discrete cosine transform. The proposed algorithm outperforms the baseline JPEG compression scheme applied to II and a previous compression method developed for II based on MPEG II.

Proceedings ArticleDOI
16 Apr 2007
TL;DR: A novel DCT architecture is presented that allows aggressive voltage scaling by exploiting the fact that not all intermediate computations are equally important in a DCT system for obtaining "good" image quality with peak signal to noise ratio (PSNR) > 30 dB.
Abstract: The 2-D discrete cosine transform (DCT) is widely used as the core of digital image and video compression. In this paper, the authors present a novel DCT architecture that allows aggressive voltage scaling by exploiting the fact that not all intermediate computations are equally important in a DCT system to obtain "good" image quality with peak signal to noise ratio (PSNR) > 30 dB. This observation has led us to propose a DCT architecture where the signal paths that are less contributive to PSNR improvement are designed to be longer than the paths that are more contributive to PSNR improvement. It should also be noted that robustness with respect to parameter variations and low power operation typically impose contradictory requirements in terms of architecture design. However, the proposed architecture lends itself to aggressive voltage scaling for low-power dissipation even under process parameter variations. Under a scaled supply voltage and/or variations in process parameters, any possible delay errors would only appear from the long paths that are less contributive towards PSNR improvement, providing large improvement in power dissipation with small PSNR degradation. Results show that even under large process variation and supply voltage scaling (0.8V), there is a gradual degradation of image quality with considerable power savings (62.8%) for the proposed architecture when compared to existing implementations in 70 nm process technology.

Journal ArticleDOI
TL;DR: The experimental results show that the performance of the proposed directional DCT-like transform can dramatically outperform the conventional DCT up to 2 dB even without modifying entropy coding.
Abstract: The traditional 2-D discrete cosine transform (DCT), implemented by separable 1-D transforms in the horizontal and vertical directions, does not take image orientation features in a local window into account. To improve it, we propose to introduce directional primary operations to the lifting-based DCT and thereby derive a new directional DCT-like transform, whose transform matrix depends on the directional angle and the interpolation used there. Furthermore, the proposed transform is compared with the straightforward alternative of first rotating and then transforming. A JPEG-wise image coding scheme is also proposed to evaluate the performance of the proposed directional DCT-like transform. The first 1-D transform is performed according to image orientation features, and the second 1-D transform is still performed in the horizontal or vertical direction. At the same time, an approach is proposed to optimally select the transform direction of each block because selected directions of neighboring blocks will influence each other. The experimental results show that the performance of the proposed directional DCT-like transform can dramatically outperform the conventional DCT by up to 2 dB even without modifying entropy coding.

Journal ArticleDOI
TL;DR: A unified framework for text detection, localization, and tracking in compressed videos using the discrete cosine transform (DCT) coefficients is proposed, and the final experimental results show the effectiveness of the proposed methods.
Abstract: Video text information plays an important role in semantic-based video analysis, indexing and retrieval. Video texts are closely related to the content of a video. Usually, the fundamental steps of text-based video analysis, browsing and retrieval consist of video text detection, localization, tracking, segmentation and recognition. Video sequences are commonly stored in compressed formats where MPEG coding techniques are often adopted. In this paper, a unified framework for text detection, localization, and tracking in compressed videos using the discrete cosine transform (DCT) coefficients is proposed. A coarse-to-fine text detection method is used to find text blocks in terms of the block DCT texture intensity information. The DCT texture intensity of an 8x8 block of an intra-frame is approximately represented by seven AC coefficients. The candidate text block regions are further verified and refined. The text block region localization and tracking are carried out by virtue of the horizontal and vertical block texture intensity projection profiles. The appearing and disappearing frames of each text line are determined by the text tracking. The final experimental results show the effectiveness of the proposed methods.
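The coarse detection step can be sketched as follows: each 8x8 block's texture intensity is approximated by the summed magnitude of a few low-order AC coefficients (the paper uses seven AC coefficients of intra-coded blocks), blocks above a threshold become text candidates, and projection profiles of the candidate mask localize text lines. The threshold and the specific coefficients are illustrative, and the DCT is computed from pixels here rather than read from the compressed stream.

```python
# Block-level text candidate detection from DCT texture intensity.
import numpy as np
from scipy.fft import dctn

def text_candidate_mask(gray, thr=60.0):
    h, w = (gray.shape[0] // 8) * 8, (gray.shape[1] // 8) * 8
    mask = np.zeros((h // 8, w // 8), dtype=bool)
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            c = dctn(gray[i:i + 8, j:j + 8].astype(float), norm='ortho')
            intensity = np.abs(c[0, 1:4]).sum() + np.abs(c[1:4, 0]).sum() + abs(c[1, 1])
            mask[i // 8, j // 8] = intensity > thr
    return mask

rng = np.random.default_rng(0)
frame = np.tile(np.linspace(0, 40, 128), (96, 1))           # smooth background
frame[40:56, 32:96] = rng.integers(0, 255, size=(16, 64))   # a high-contrast "text" strip
mask = text_candidate_mask(frame)
print(mask.sum(axis=1))    # horizontal projection profile: peaks at the text rows
```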

Journal ArticleDOI
TL;DR: In this paper, two novel methods have been employed to analyze facial features in coarse and fine scales successively: a mixture of factor analyzers to learn Gabor filter outputs on a coarse scale and a template matching of block-based Discrete Cosine Transform (DCT) features.
Abstract: Finding landmark positions on facial images is an important step in face registration and normalization, for both 2D and 3D face recognition. In this paper, we inspect shortcomings of existing approaches in the literature and compare several methods for performing automatic landmarking on near-frontal faces in different scales. Two novel methods have been employed to analyze facial features in coarse and fine scales successively. The first method uses a mixture of factor analyzers to learn Gabor filter outputs on a coarse scale. The second method is a template matching of block-based Discrete Cosine Transform (DCT) features. In addition, a structural analysis subsystem is proposed that can determine false matches, and correct their positions.

Journal ArticleDOI
TL;DR: The proposed Cordic-based Loeffler DCT is very suitable for low-power and high-quality encoder/decoders (codecs) used in battery-based systems and retains the good transformation quality of the original Loeffler DCT.
Abstract: A computationally efficient and high-quality preserving discrete cosine transform (DCT) architecture is presented. It is obtained by optimising the Loeffler DCT based on the coordinate rotation digital computer (Cordic) algorithm. The computational complexity is reduced significantly from 11 multiply and 29 add operations (Loeffler DCT) to 38 add and 16 shift operations (i.e. similar to the complexity of the binDCT) without losing quality. After synthesising with the TSMC 0.13-μm technology library, Synopsys PrimePower was used to estimate the power consumption at gate-level. The experimental results show that the proposed 8-point one-dimensional DCT architecture only consumes 19% of the area and about 16% of the power of the original Loeffler DCT. Moreover, it also retains the good transformation quality of the original Loeffler DCT. In this regard, the proposed Cordic-based Loeffler DCT is very suitable for low-power and high-quality encoder/decoders (codecs) used in battery-based systems.

Proceedings ArticleDOI
TL;DR: Variants of both the digital curvelet transform and the digital wave atom transform that handle the image boundaries by mirror extension are presented, extending the range of applicability of curvelets and wave atoms to situations where periodization at the boundaries is uncalled for.
Abstract: We present variants of both the digital curvelet transform, and the digital wave atom transform, which handle the image boundaries by mirror extension. Previous versions of these transforms treated image boundaries by periodization. The main ideas of the modifications are 1) to tile the discrete cosine domain instead of the discrete Fourier domain, and 2) to adequately reorganize the in-tile data. In their shift-invariant versions, the new constructions come with no penalty on the redundancy or computational complexity. For shift-variant wave atoms, the penalty is a factor 2 instead of the naive factor 4. These various modifications have been included in the CurveLab and WaveAtom toolboxes, and extend the range of applicability of curvelets (good for edges and bandlimited wavefronts) and wave atoms (good for oscillatory patterns and textures) to situations where periodization at the boundaries is uncalled for. The new variants are dubbed ME-curvelets and ME-wave atoms, where ME stands for mirror-extended.