
Showing papers by "Renato J. Cintra published in 2012"


Journal ArticleDOI
TL;DR: In this article, a low-complexity 8-point orthogonal approximate discrete cosine transform (DCT) is introduced. The proposed transform requires no multiplications or bit-shift operations.
Abstract: A low-complexity 8-point orthogonal approximate discrete cosine transform (DCT) is introduced. The proposed transform requires no multiplications or bit-shift operations. The derived fast algorithm requires only 14 additions, fewer than any existing DCT approximation. Moreover, in several image compression scenarios, the proposed transform can outperform the well-known signed DCT, as well as state-of-the-art algorithms.
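The 14-addition matrix itself is not reproduced in this listing, so as an illustrative stand-in here is a short sketch of the multiplier-free idea using the signed DCT named in the abstract: keeping only the signs of the exact DCT-II entries yields a matrix with entries in {−1, +1}, so applying it needs additions and subtractions only (no multiplications or shifts).

```python
import math

N = 8

def dct_matrix(n):
    """Exact orthonormal DCT-II matrix."""
    c = []
    for k in range(n):
        a = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                  for j in range(n)])
    return c

def sign(v):
    return (v > 0) - (v < 0)

# Signed DCT: keep only the signs of the exact DCT entries.  The result is a
# {-1, +1} matrix, so the transform is computed with adds/subtracts alone.
C = dct_matrix(N)
S = [[sign(round(e, 12)) for e in row] for row in C]

def apply(T, x):
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

x = [3, 1, 4, 1, 5, 9, 2, 6]
y = apply(S, x)
```

Note the signed DCT is not exactly orthogonal, which is why orthogonal approximations such as the one proposed here are of interest.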

91 citations


Journal ArticleDOI
TL;DR: A novel DCT approximation having zero multiplicative complexity is shown to be better for multi-beamforming AAs when compared with BAS-2008 and CB-2011, implying the fastest DCT approximations using reconfigurable logic devices in the literature.
Abstract: Multi-beamforming is an important requirement for broadband space imaging applications based on dense aperture arrays (AAs). Usually, the discrete Fourier transform is the transform of choice for AA electromagnetic imaging. Here, the discrete cosine transform (DCT) is proposed as an alternative, enabling the use of emerging fast algorithms that offer greatly reduced complexity in digital arithmetic circuits. We propose two novel high-speed digital architectures for recently proposed fast algorithms (Bouguezel, Ahmad and Swamy 2008 Electron. Lett. 44 1249–50) (BAS-2008) and (Cintra and Bayer 2011 IEEE Signal Process. Lett. 18 579–82) (CB-2011) that provide good approximations to the DCT at zero multiplicative complexity. Further, we propose a novel DCT approximation having zero multiplicative complexity that is shown to be better for multi-beamforming AAs when compared with BAS-2008 and CB-2011. The far-field array patterns of the ideal DCT, BAS-2008, CB-2011 and the proposed approximation are investigated with error analysis. Extensive hardware realizations, implementation details and performance metrics are provided for synchronous field-programmable gate array (FPGA) technology from Xilinx. The resource consumption and speed metrics of BAS-2008, CB-2011 and the proposed approximation are investigated as functions of system word size. The 8-bit versions are mapped to emerging asynchronous FPGAs, leading to significantly increased real-time throughput with clock rates of up to 925.6 MHz, implying the fastest DCT approximations using reconfigurable logic devices in the literature.
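The beamforming role of the transform rows can be sketched conceptually: each row of an N-point transform acts as a set of element weights for a uniform linear array, and sweeping the look angle traces that beam's far-field pattern. This minimal sketch (half-wavelength spacing and the exact DCT rather than the paper's approximation; both are illustrative assumptions) shows the idea.

```python
import math, cmath

def dct_row(k, n):
    """Row k of the orthonormal n-point DCT-II matrix, used as array weights."""
    a = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [a * math.cos(math.pi * k * (2 * j + 1) / (2 * n)) for j in range(n)]

def array_factor(weights, theta, spacing=0.5):
    """Far-field array factor of a uniform linear array (spacing in wavelengths):
    |AF(theta)| = |sum_n w[n] exp(j*2*pi*spacing*n*sin(theta))|."""
    return abs(sum(w * cmath.exp(2j * math.pi * spacing * n * math.sin(theta))
                   for n, w in enumerate(weights)))

# Each DCT row steers one beam; sweep theta over [-90, 90] degrees for row 2.
w = dct_row(2, 8)
pattern = [array_factor(w, -math.pi / 2 + math.pi * t / 180.0) for t in range(181)]
```

Replacing `dct_row` with rows of a {0, ±1} approximation is what lets the beamformer run without multipliers.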

41 citations


Journal ArticleDOI
TL;DR: The proposed transform outperforms the well-known Walsh–Hadamard transform and the current state-of-the-art 16-point approximation, and is experimentally validated using hardware implementations that are physically realized and verified on FPGA at a maximum clock rate of 342 MHz.
Abstract: The discrete cosine transform (DCT) is the key step in many image and video coding standards. The eight-point DCT is an important special case, possessing several widely investigated low-complexity approximations. However, the 16-point DCT has energy compaction advantages. In this sense, this paper presents a new 16-point DCT approximation with null multiplicative complexity. The proposed transform matrix is orthogonal and contains only zeros and ones. The proposed transform outperforms the well-known Walsh–Hadamard transform and the current state-of-the-art 16-point approximation. A fast algorithm for the proposed transform is also introduced. This fast algorithm is experimentally validated using hardware implementations that are physically realized and verified on a 40 nm CMOS Xilinx Virtex-6 XC6VLX240T FPGA chip at a maximum clock rate of 342 MHz. Rapid prototypes on FPGA for an 8-bit input word size show significant improvement in compressed image quality, by up to 1–2 dB, at the cost of only eight adders compared with the state-of-the-art 16-point DCT approximation algorithm in the literature (Bouguezel et al 2010 Proc. 53rd IEEE Int. Midwest Symp. on Circuits and Systems).
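The Walsh–Hadamard transform used as the comparison baseline is itself a multiplier-free ±1 transform; a minimal sketch of the 16-point WHT via the standard Sylvester construction:

```python
def hadamard(n):
    """Sylvester construction of the n x n Hadamard matrix (n a power of two):
    H_{2m} = [[H_m, H_m], [H_m, -H_m]]."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-e for e in row] for row in H]
    return H

H16 = hadamard(16)

def wht(x):
    # Entries are +/-1, so the transform uses additions/subtractions only.
    return [sum(h * xi for h, xi in zip(row, x)) for row in H16]

x = list(range(16))
y = wht(x)
```

The rows are mutually orthogonal with squared norm 16, which is the property the proposed zeros-and-ones matrix shares while compacting energy better.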

33 citations


Journal ArticleDOI
TL;DR: In this paper, a five-parameter distribution called the McDonald normal distribution is defined and studied, which contains several important distributions discussed in the literature, such as the normal, skew-normal, exponentiated normal, beta normal and Kumaraswamy normal distributions, among others.
Abstract: A five-parameter distribution called the McDonald normal distribution is defined and studied. The new distribution contains, as special cases, several important distributions discussed in the literature, such as the normal, skew-normal, exponentiated normal, beta normal and Kumaraswamy normal distributions, among others. We obtain its ordinary moments, moment generating function and mean deviations. We also derive the ordinary moments of the order statistics. We use the method of maximum likelihood to fit the new distribution and illustrate its potential with three applications to real data.
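The McDonald generalized family is commonly written as f(x) = c/B(a,b) · g(x) · G(x)^(ac−1) · (1 − G(x)^c)^(b−1) for a baseline density g with CDF G; the following numerical sketch uses the normal baseline under that convention (parameter values are illustrative, and the exact parameterization should be checked against the paper):

```python
import math

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def beta_fn(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def mc_normal_pdf(x, a, b, c):
    """McDonald normal density in the usual Mc-G form:
    f(x) = c/B(a,b) * phi(x) * Phi(x)^(a*c-1) * (1 - Phi(x)^c)^(b-1)."""
    G = Phi(x)
    return c / beta_fn(a, b) * phi(x) * G ** (a * c - 1.0) * (1.0 - G ** c) ** (b - 1.0)

def integrate(f, lo=-12.0, hi=12.0, n=4000):
    """Midpoint-rule integral, enough for a sanity check."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

# Sanity checks: the density integrates to one; a = b = c = 1 recovers the normal.
total = integrate(lambda x: mc_normal_pdf(x, 2.0, 3.0, 1.5))
```

Setting c = 1 gives the beta normal and b = 1 the Kumaraswamy-type case, matching the special cases listed in the abstract.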

32 citations


Journal ArticleDOI
TL;DR: An algebraic integer (AI)-based time-multiplexed row-parallel architecture and two final reconstruction step (FRS) algorithms are proposed for the implementation of bivariate AI encoded 2-D discrete cosine transform (DCT) to enable low-noise high-dynamic range applications in digital video processing.
Abstract: An algebraic integer (AI)-based time-multiplexed row-parallel architecture and two final reconstruction step (FRS) algorithms are proposed for the implementation of the bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between the row-column transforms, leading to an 8 × 8 2-D DCT that is entirely free of quantization errors in the AI basis. The user-selectable accuracy in the FRS allows each of the 64 coefficients to have its precision set independently of the others, avoiding the leakage of quantization noise between channels, as occurs in published DCT designs. The proposed FRS uses two approaches based on: 1) optimized Dempster-Macleod multipliers, and 2) expansion factor scaling. This architecture enables low-noise high-dynamic-range applications in digital video processing that require full control of the finite-precision computation of the 2-D DCT. The proposed architectures and FRS techniques are experimentally verified and validated using hardware implementations that are physically realized and verified on a field-programmable gate array (FPGA) chip. Six designs, for 4-bit and 8-bit input word sizes, using the two proposed FRS schemes, have been designed, simulated, physically implemented, and measured. The maximum clock rate and block rate achieved among the 8-bit input designs are 307.787 MHz and 38.47 MHz, respectively, implying a pixel rate of 8 × 307.787 ≈ 2.462 GHz if eventually embedded in a real-time video-processing system. The equivalent frame rate is about 1187.35 Hz for an image size of 1920 × 1080. All implementations are functional on a Xilinx Virtex-6 XC6VLX240T FPGA device.
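The row-column decomposition that the architecture parallelizes is the standard separable identity Y = C·X·Cᵀ: a 1-D DCT on every row, then a 1-D DCT on every column of the result. A plain floating-point sketch follows (the AI version instead carries these values as exact integer tuples until the single FRS):

```python
import math

N = 8

def dct_matrix(n):
    """Orthonormal DCT-II matrix."""
    c = []
    for k in range(n):
        a = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                  for j in range(n)])
    return c

C = dct_matrix(N)

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def dct2(X):
    """Row-column 2-D DCT: Y = C X C^T."""
    return matmul(matmul(C, X), transpose(C))

X = [[(i * N + j) % 7 for j in range(N)] for i in range(N)]
Y = dct2(X)
```

Because C is orthonormal, the transform preserves energy exactly, and Y[0][0] is the block mean scaled by N, which the test below checks.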

29 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide a better foundation for some properties and an analytical study of its bimodality, and derive explicit expressions for moments, generating function, mean deviations using a power series expansion for the quantile function, and Shannon entropy.
Abstract: The beta normal distribution is a generalization of both the normal distribution and the normal order statistics. Some of its mathematical properties and a few applications have been studied in the literature. We provide a better foundation for some properties and an analytical study of its bimodality. The hazard rate function and the limiting behavior are examined. We derive explicit expressions for moments, generating function, mean deviations using a power series expansion for the quantile function, and Shannon entropy.
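The beta normal density takes the well-known form f(x) = φ(x)·Φ(x)^(a−1)·(1−Φ(x))^(b−1)/B(a,b). A numerical sketch illustrating the bimodality the paper studies analytically: for small a = b the density dips at the center instead of peaking there (the specific values a = b = 0.1 are illustrative).

```python
import math

def phi(x):   # standard normal pdf
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):   # standard normal cdf
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def beta_fn(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_normal_pdf(x, a, b):
    """Beta normal density: f(x) = phi(x) * Phi(x)^(a-1) * (1-Phi(x))^(b-1) / B(a,b)."""
    G = Phi(x)
    return phi(x) * G ** (a - 1.0) * (1.0 - G) ** (b - 1.0) / beta_fn(a, b)

# For small shape parameters the symmetric case dips at zero (bimodality).
center = beta_normal_pdf(0.0, 0.1, 0.1)
shoulder = beta_normal_pdf(2.0, 0.1, 0.1)
```

At a = b = 1 the density reduces to the standard normal, consistent with the abstract's claim that the family generalizes the normal law.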

25 citations


Proceedings ArticleDOI
11 Dec 2012
TL;DR: Closed-form relationships between the 16×16 transform and arbitrary smaller-sized transforms are presented, enabling this architecture to compute transforms of size 4·2^p × 4·2^q, where 0 ≤ p, q ≤ 2.
Abstract: The discrete cosine transform (DCT) is widely employed in image and video coding applications due to its high energy compaction. In addition to the 4×4 and 8×8 transforms utilized in earlier video coding standards, the proposed HEVC standard suggests the use of larger transform sizes, including 16×16 and 32×32 transforms, in order to obtain higher coding gains. Further, it also proposes the use of non-square transform sizes as well as the use of the discrete sine transform (DST) in certain intra-prediction modes. The decision on the type of transform used in a given prediction scenario is made dynamically, to obtain the required compression rates. This motivated the proposed digital VLSI architecture for a multi-transform engine capable of computing the 16×16 approximate 2-D DCT/DST, with null multiplicative complexity. The relationship between DCT-II and DST-II is employed to compute both transforms using the same digital core, leading to reductions in both area and power. Closed-form relationships between the 16×16 transform and arbitrary smaller-sized transforms are presented, enabling this architecture to compute transforms of size 4·2^p × 4·2^q, where 0 ≤ p, q ≤ 2.
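The DCT-II/DST-II relationship that lets one core compute both transforms can be stated concretely: sign-alternating the input, taking the DCT-II, and reversing the output order yields the DST-II exactly. A numerical sketch verifying this identity for the 16-point case (the exact transforms are used here; the paper applies the same identity to its approximation):

```python
import math

N = 16

def dct2_mat(n):
    """Orthonormal DCT-II matrix."""
    return [[(math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
             * math.cos(math.pi * k * (2 * j + 1) / (2 * n)) for j in range(n)]
            for k in range(n)]

def dst2_mat(n):
    """Orthonormal DST-II matrix."""
    return [[(math.sqrt(1.0 / n) if k == n - 1 else math.sqrt(2.0 / n))
             * math.sin(math.pi * (k + 1) * (2 * j + 1) / (2 * n)) for j in range(n)]
            for k in range(n)]

C, S = dct2_mat(N), dst2_mat(N)

def dst_via_dct(x):
    """DST-II computed with a DCT-II core: sign-alternate the input samples,
    apply the DCT-II, then reverse the output order."""
    x_alt = [((-1) ** n) * v for n, v in enumerate(x)]
    y = [sum(c * v for c, v in zip(row, x_alt)) for row in C]
    return y[::-1]

x = [math.sin(0.3 * n) + 0.1 * n for n in range(N)]
direct = [sum(s * v for s, v in zip(row, x)) for row in S]
via_dct = dst_via_dct(x)
```

Since sign alternation and output reversal are free in hardware (wiring only), the DST costs nothing beyond the shared DCT core, which is the area/power saving the abstract describes.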

24 citations


Journal ArticleDOI
TL;DR: In this paper, the authors derive and compare eight stochastic distances and assess the performance of hypothesis tests that employ them together with maximum likelihood estimation, concluding that tests based on the triangular distance have the empirical size closest to the theoretical one.
Abstract: Images obtained with coherent illumination, as is the case of sonar, ultrasound-B, laser and synthetic aperture radar (SAR) images, are affected by speckle noise, which reduces the ability to extract information from the data. Specialized techniques are required to deal with such imagery, which has been modeled by the G0 distribution, under which regions with different degrees of roughness and mean brightness can be characterized by two parameters; a third parameter, the number of looks, is related to the overall signal-to-noise ratio. Assessing distances between samples is an important step in image analysis; they provide grounds for evaluating separability and, therefore, the performance of classification procedures. This work derives and compares eight stochastic distances and assesses the performance of hypothesis tests that employ them together with maximum likelihood estimation. We conclude that tests based on the triangular distance have the empirical size closest to the theoretical one, while those based on the arithmetic-geometric distances have the best power. Since the power of tests based on the triangular distance is close to optimal, we conclude that the safest choice is to use this distance for hypothesis testing, even when compared with classical distances such as the Kullback–Leibler and Bhattacharyya distances.
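The triangular distance favored by the paper is the triangular discrimination d_T(p, q) = ∫ (p − q)² / (p + q). A numerical sketch computing it between two G0 intensity densities is below; the G0 parameterization used here (α < 0 for roughness, γ > 0 for scale, L looks) is one common convention and the parameter values are illustrative, so both should be checked against the paper.

```python
import math

def g0_pdf(z, alpha, gamma, L):
    """Intensity G0 density in a common parameterization (alpha < 0, gamma > 0)."""
    if z <= 0:
        return 0.0
    num = L ** L * math.gamma(L - alpha) * z ** (L - 1)
    den = (gamma ** alpha * math.gamma(L) * math.gamma(-alpha)
           * (gamma + L * z) ** (L - alpha))
    return num / den

def triangular_distance(p, q, lo=1e-6, hi=60.0, n=20000):
    """Triangular discrimination d_T(p, q) = integral of (p - q)^2 / (p + q)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        pz, qz = p(z), q(z)
        if pz + qz > 0.0:
            total += (pz - qz) ** 2 / (pz + qz)
    return h * total

# Two textures with the same unit mean: smoother (alpha = -8) vs rougher (alpha = -2).
p = lambda z: g0_pdf(z, -8.0, 7.0, 4)
q = lambda z: g0_pdf(z, -2.0, 1.0, 4)
d = triangular_distance(p, q)
```

The distance is symmetric and vanishes only when the two laws coincide, which is what makes it usable as a test statistic between estimated models.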

17 citations


Journal ArticleDOI
TL;DR: A low complexity digital VLSI architecture for the computation of an algebraic integer (AI) based 8-point Arai DCT algorithm based on a new final reconstruction step (FRS) having lower complexity and higher accuracy compared to the state-of-the-art.
Abstract: A low-complexity digital VLSI architecture for the computation of an algebraic integer (AI) based 8-point Arai DCT algorithm is proposed. AI encoding schemes for exact representation of the Arai DCT based on a particularly sparse 2-D AI representation are reviewed, leading to the proposed novel architecture based on a new final reconstruction step (FRS) having lower complexity and higher accuracy compared with the state-of-the-art. This FRS is based on an optimization derived from expansion factors that leads to small integer constant-coefficient multiplications, which are realized with common sub-expression elimination (CSE) and Booth encoding. The reference circuit [1] as well as the proposed architectures for two expansion factors α† = 4.5958 and α′ = 167.2309 are implemented. The proposed circuits show 150% and 300% improvements in the number of DCT coefficients having error ≤ 0.1% compared with [1]. The three designs were realized on 40 nm CMOS Xilinx Virtex-6 FPGAs and synthesized using 65 nm CMOS general-purpose standard cells from TSMC. Post-synthesis timing analysis of the 65 nm CMOS realizations at 900 mV for all three designs of the 8-point DCT core for 8-bit inputs shows potential real-time operation at a 2.083 GHz clock frequency, leading to a combined throughput of 2.083 billion 8-point Arai DCTs per second. The expansion-factor designs show a 43% reduction in area (A) and a 29% reduction in dynamic power (PD) for the FPGA realizations. An 11% reduction in area is observed for the ASIC design for α† = 4.5958, with an 8% reduction in total power (PT). Our second ASIC design, having α′ = 167.2309, shows marginal improvements in area and power compared with our reference design, but at significantly better accuracy.
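The small-integer constant multiplications in the FRS are realized without multipliers by recoding each constant into signed digits and summing shifted copies of the input; this is the mechanism Booth encoding and CSE optimize. A behavioral sketch of the idea (canonical signed-digit recoding; the actual constants and sharing in the paper's FRS are not reproduced here):

```python
def to_csd(c):
    """Canonical signed-digit recoding of a positive integer: digits in {-1, 0, 1}
    with no two adjacent nonzeros, the property Booth-style recoders exploit."""
    digits = []
    while c:
        if c & 1:
            d = 2 - (c & 3)          # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits                     # least-significant digit first

def multiply_shift_add(x, c):
    """Constant multiplication c*x realized as shifts and adds/subtracts only."""
    acc = 0
    for i, d in enumerate(to_csd(c)):
        if d == 1:
            acc += x << i
        elif d == -1:
            acc -= x << i
    return acc
```

For example, 7·x becomes (x << 3) − x: one subtraction instead of a multiplier, which is where the area and power reductions come from.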

14 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed several divergence measures specifically tailored for G0 distributed data, and devised and assessed tests based on such measures, and their performances were quantified according to their test sizes and powers.
Abstract: Synthetic aperture radar (SAR) has a pivotal role as a remote imaging method. Obtained by means of coherent illumination, SAR images are contaminated with speckle noise. The statistical modeling of such contamination is well described by the multiplicative model and its implied G0 distribution. The understanding of SAR imagery and scene element identification is an important objective in the field; in particular, reliable image contrast tools are sought. Aiming at new tools for evaluating SAR image contrast, we investigated methods based on stochastic divergence. We propose several divergence measures specifically tailored for G0-distributed data. We also introduce a nonparametric approach based on the Kolmogorov–Smirnov distance for G0 data. We devised and assessed tests based on such measures, and their performances were quantified according to their test sizes and powers. Using Monte Carlo simulation, we present a robustness analysis of the test statistics and of the maximum likelihood estimators for several degrees of innovative contamination. It was identified that the proposed tests based on the triangular and arithmetic-geometric measures outperformed the Kolmogorov–Smirnov methodology.
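The nonparametric baseline the proposed tests are compared against is the two-sample Kolmogorov–Smirnov statistic: the largest vertical gap between the empirical CDFs of the two samples. A minimal sketch:

```python
def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute gap
    between the two empirical CDFs, handling tied values."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(s1[i], s2[j])
        while i < n1 and s1[i] == x:   # step past all copies of x in sample 1
            i += 1
        while j < n2 and s2[j] == x:   # ... and in sample 2
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    return d
```

Being distribution-free, this statistic needs no G0 parameter estimates, which is exactly why it serves as the reference against the tailored divergence-based tests.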

5 citations


Proceedings ArticleDOI
20 May 2012
TL;DR: A multi-encoding approach using wavelet-based subband coding is proposed to accomplish error-free calculations from an exact representation of the Daubechies 4-tap wavelet filter coefficients using the algebraic integer (AI) representation.

Abstract: In this paper, a multi-encoding approach using wavelet-based subband coding is proposed to accomplish error-free calculations from an exact representation of the Daubechies 4-tap wavelet filter coefficients using the algebraic integer (AI) representation. By mapping the irrational coefficients to a convenient AI basis, the proposed architecture is designed employing a parallel channel model having two data paths carrying integer sequences. The computations in the AI architecture are exact and are carried out entirely in a multiplier-free circuit. The design is implemented on a Xilinx Virtex-6 device at 172 MHz and hardware co-simulated with an ML605 board at 100 MHz. AI mapping facilitates simplicity and error-free calculations. The proposed architecture has a single final reconstruction step (FRS). Booth encoding has been adopted in the FRS to minimize the error that can be incurred at this point. This paper provides a hardware and power analysis for different bit lengths. An example output image sequence for the mandrill image is also provided.
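The exactness claim can be made concrete. The Daubechies 4-tap lowpass coefficients are (1+√3, 3+√3, 3−√3, 1−√3)/(4√2), so over the basis {1, √3} each coefficient is an exact integer pair, and the common factor 1/(4√2) can be deferred to a single final reconstruction step. The two-channel model below is an illustrative sketch of this AI idea, not the paper's exact architecture:

```python
import math

# AI encoding of the Daubechies-4 lowpass filter over the basis {1, sqrt(3)}:
# each entry (a, b) means a + b*sqrt(3); the 1/(4*sqrt(2)) scale is deferred.
H_AI = [(1, 1), (3, 1), (3, -1), (1, -1)]

def filter_ai(x):
    """Error-free convolution with the AI-encoded filter: two parallel integer
    channels, no rounding anywhere before the FRS."""
    out = []
    for n in range(len(x) - len(H_AI) + 1):
        a = sum(H_AI[k][0] * x[n + k] for k in range(4))
        b = sum(H_AI[k][1] * x[n + k] for k in range(4))
        out.append((a, b))             # exact value: (a + b*sqrt(3)) / (4*sqrt(2))
    return out

def reconstruct(pair):
    """Final reconstruction step (FRS): the only place irrationals appear."""
    a, b = pair
    return (a + b * math.sqrt(3.0)) / (4.0 * math.sqrt(2.0))

x = [5, 1, 4, 2, 8, 3]
exact_pairs = filter_ai(x)
y = [reconstruct(p) for p in exact_pairs]
```

All rounding error is confined to the single FRS, which is why the paper concentrates its Booth-encoding effort there.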

Journal ArticleDOI
TL;DR: In this article, the authors derived analytic expressions for the Shannon, Rényi, and restricted Tsallis entropies under the scaled complex Wishart distribution for polarimetric synthetic aperture radar (PolSAR) images.

Abstract: Images obtained from coherent illumination processes are contaminated with speckle noise, with polarimetric synthetic aperture radar (PolSAR) imagery as a prominent example. With an adequacy widely attested in the literature, the scaled complex Wishart distribution is an acceptable model for PolSAR data. In this perspective, we derive analytic expressions for the Shannon, Rényi, and restricted Tsallis entropies under this model. Relationships between the derived measures and the parameters of the scaled Wishart law (i.e., the equivalent number of looks and the covariance matrix) are discussed. In addition, we obtain the asymptotic variances of the Shannon and Rényi entropies when replacing distribution parameters by maximum likelihood estimators. As a consequence, confidence intervals based on these two entropies are also derived and proposed as new ways of capturing contrast. New hypothesis tests are additionally proposed using these results, and their performance is assessed using simulated and real data. In general terms, the test based on the Shannon entropy outperforms those based on Rényi's.
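The two entropies compared here are related by a limit: the Rényi entropy H_α = log(∫ f^α)/(1 − α) tends to the Shannon entropy −∫ f log f as α → 1. A numerical sketch using an exponential density as a stand-in (the Wishart-based closed forms of the paper are not reproduced; the exponential's known formulas serve as the check):

```python
import math

def renyi_entropy(pdf, alpha, lo=0.0, hi=80.0, n=40000):
    """Renyi entropy H_alpha = log(integral of f^alpha) / (1 - alpha), alpha != 1."""
    h = (hi - lo) / n
    s = h * sum(pdf(lo + (i + 0.5) * h) ** alpha for i in range(n))
    return math.log(s) / (1.0 - alpha)

def shannon_entropy(pdf, lo=0.0, hi=80.0, n=40000):
    """Shannon entropy H = -integral of f*log(f) (the alpha -> 1 Renyi limit)."""
    h = (hi - lo) / n
    s = 0.0
    for i in range(n):
        f = pdf(lo + (i + 0.5) * h)
        if f > 0.0:
            s -= f * math.log(f)
    return h * s

# Exponential(lam): Shannon = 1 - log(lam); Renyi = log(alpha)/(alpha-1) - log(lam).
lam = 0.5
pdf = lambda x: lam * math.exp(-lam * x)
H = shannon_entropy(pdf)
H2 = renyi_entropy(pdf, 2.0)
```

Plugging maximum likelihood estimates into such expressions and propagating their asymptotic variance is what yields the entropy-based confidence intervals and contrast tests of the paper.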

Journal ArticleDOI
TL;DR: A block-parallel systolic-array architecture is proposed for watermarking based on the 2-D special Hartley NTT (HNTT) hardware cores, each using digital arithmetic over GF(3), and processes 4 × 4 blocks of pixels in parallel every clock cycle.
Abstract: Number-theoretic transforms (NTTs) have been applied in the fragile watermarking of digital images. A block-parallel systolic-array architecture is proposed for watermarking based on the 2-D special Hartley NTT (HNTT). The proposed core employs two 2-D special HNTT hardware cores, each using digital arithmetic over GF(3), and processes 4 × 4 blocks of pixels in parallel every clock cycle. Prototypes are operational on a Xilinx Sx35-10ff668 FPGA device. The maximum estimated throughput of the FPGA circuit is 100 million 4 × 4 HNTT fragile watermarked blocks per second, when clocked at 100 MHz. Potential applications exist in high-traffic back-end servers dealing with large amounts of protected digital images requiring authentication, in remote-sensing for high-security surveillance applications, in real-time video processing of information of a sensitive nature or matters of national security, in video/photographic content management of corporate clients, in authenticating multimedia for the entertainment industry, in the authentication of electronic evidence material, and in real-time news streaming.