Showing papers by "Renato J. Cintra" published in 2015


Journal ArticleDOI
TL;DR: In this article, the authors propose an 8-point DCT approximation that requires only 14 addition operations and no multiplications, and compare it to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio.
Abstract: The multimedia market's demand for low-energy video processing systems such as HEVC has led to extensive development of fast algorithms for the efficient approximation of 2-D DCT transforms. The DCT is employed in a multitude of compression standards due to its remarkable energy compaction properties. Multiplier-free approximate DCT transforms have been proposed that offer superior compression performance at very low circuit complexity. Such approximations can be realized in digital VLSI hardware using additions and subtractions only, leading to significant reductions in chip area and power consumption compared to conventional DCTs and integer transforms. In this paper, we introduce a novel 8-point DCT approximation that requires only 14 addition operations and no multiplications. The proposed transform possesses low computational complexity and is compared to state-of-the-art DCT approximations in terms of both algorithm complexity and peak signal-to-noise ratio. The proposed DCT approximation is a candidate for reconfigurable video standards such as HEVC. The proposed transform and several other DCT approximations are mapped to systolic-array digital architectures and physically realized as digital prototype circuits using FPGA technology and mapped to 45 nm CMOS technology.

107 citations
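
To make the idea of a multiplier-free DCT approximation concrete, here is a minimal sketch. It does not reproduce the 14-addition transform from the abstract above, whose matrix is not given in this listing. Instead it builds the exact orthonormal 8-point DCT-II matrix and replaces every entry by its sign, which is the construction usually called the signed DCT. The names `dct_matrix`, `C`, `T` and the sample vector `x` are illustrative assumptions.

```python
import numpy as np

def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal n-point DCT-II matrix (row k, column i)."""
    k = np.arange(n).reshape(-1, 1)          # frequency index
    i = np.arange(n).reshape(1, -1)          # sample index
    c = np.cos((2 * i + 1) * k * np.pi / (2 * n))
    scale = np.full((n, 1), np.sqrt(2.0 / n))
    scale[0, 0] = np.sqrt(1.0 / n)           # DC row uses a different scale
    return scale * c

C = dct_matrix(8)

# Illustrative approximation only: keep the sign of each exact DCT entry.
# Every entry of T is +1 or -1, so T @ x needs no multiplications.
T = np.sign(C).astype(int)

x = np.array([52, 55, 61, 66, 70, 61, 64, 73])  # hypothetical sample row
exact = C @ x      # floating-point DCT coefficients
approx = T @ x     # integer coefficients from additions/subtractions only
```

A direct product with an 8 × 8 matrix of ±1 entries costs 8 · 7 = 56 additions; lower counts such as the one quoted in the abstract depend on a specific low-complexity matrix and a fast factorization, neither of which is attempted in this sketch.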


Journal ArticleDOI
TL;DR: The methods discussed in the paper can be used in the design of emerging low-power digital systems having lowest complexity at the cost of a loss in accuracy, that is, the optimal trade-off of computational accuracy for lowest possible complexity and power.
Abstract: The DCT and the DWT are used in a number of emerging DSP applications, such as HD video compression, biomedical imaging, and smart antenna beamformers for wireless communications and radar. Of late, there has been much interest in fast algorithms for the computation of the above transforms using multiplier-free approximations because they result in low-power and low-complexity systems. Approximate methods rely on the trade-off of accuracy for lower power and/or circuit complexity/chip area. This paper provides a detailed review of VLSI architectures and CAS implementations for both DCT/DWTs, which can be designed either for higher accuracy or for low power consumption. This article covers recent theoretical advancements on discrete transforms as well as an overview of existing VLSI architectures. The paper also discusses error-free VLSI architectures that provide high-accuracy systems and approximate architectures that offer high computational gain, making them highly attractive for real-world applications that are subject to constraints in both chip area and power. The methods discussed in the paper can be used in the design of emerging low-power digital systems having lowest complexity at the cost of a loss in accuracy, that is, the optimal trade-off of computational accuracy for lowest possible complexity and power. A complete synopsis of available techniques, algorithms and FPGA/VLSI realizations is discussed in the paper.

54 citations


Journal ArticleDOI
TL;DR: By solving a comprehensive multicriteria optimization problem, several new DCT approximations are identified that could outperform various existing methods archived in the literature and could be suitable for image compression.

39 citations


Journal ArticleDOI
TL;DR: The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations, and numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding.
Abstract: In this letter, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based hardware realization shows 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.

26 citations
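
For readers unfamiliar with the exact transform being approximated here, the following sketch constructs an orthonormal discrete Tchebichef basis numerically. It relies on the fact that the discrete Tchebichef polynomials are the orthogonal polynomials for a uniform weight on N equally spaced points, so orthonormalizing the monomials 1, x, x², ... by QR factorization recovers them up to sign. The function name `dtt_matrix` and the sign convention (positive leading coefficient) are assumptions; the low-complexity approximation proposed in the letter is not reproduced.

```python
import numpy as np

def dtt_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal discrete Tchebichef basis; row k is the degree-k polynomial."""
    # Centre the sample points to improve conditioning; the polynomial
    # spaces spanned are the same as for the points 0, 1, ..., n-1.
    points = np.arange(n, dtype=float) - (n - 1) / 2.0
    vander = np.vander(points, n, increasing=True)   # columns 1, x, x^2, ...
    q, r = np.linalg.qr(vander)                      # Gram-Schmidt via QR
    q = q * np.sign(np.diag(r))                      # fix signs (assumed convention)
    return q.T

T = dtt_matrix(8)

x = np.arange(8, dtype=float) ** 2   # hypothetical input: a quadratic ramp
coeffs = T @ x                        # forward transform
recovered = T.T @ coeffs              # inverse transform (T is orthogonal)
```

Because the hypothetical input is a degree-2 polynomial, only the first three coefficients are non-negligible, which illustrates the energy compaction that makes the transform useful for coding.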


Journal ArticleDOI
TL;DR: In this paper, a low-complexity approximation for the discrete Tchebichef transform (DTT) is proposed, which is multiplication-free and requires a reduced number of additions and bit-shifting operations.
Abstract: In this paper, we introduce a low-complexity approximation for the discrete Tchebichef transform (DTT). The proposed forward and inverse transforms are multiplication-free and require a reduced number of additions and bit-shifting operations. Numerical compression simulations demonstrate the efficiency of the proposed transform for image and video coding. Furthermore, Xilinx Virtex-6 FPGA based hardware realization shows 44.9% reduction in dynamic power consumption and 64.7% lower area when compared to the literature.

23 citations


Proceedings ArticleDOI
17 May 2015
TL;DR: A current-mode CMOS design is proposed for realizing receive mode multi-beams in the analog domain using a novel DFT approximation to efficiently achieve spatial discrete Fourier transform operation across a ULA to obtain multiple simultaneous RF beams.
Abstract: A current-mode CMOS design is proposed for realizing receive-mode multi-beams in the analog domain using a novel DFT approximation. High-bandwidth CMOS RF transistors are employed in low-voltage current mirrors to achieve bandwidths exceeding 4 GHz with good beam fidelity. Current mirrors realize the coefficients of the considered DFT approximation, which take simple values in {0,±1,±2} only. This allows high-bandwidth realizations using simple circuitry without needing phase-shifters or delays. The proposed design is used as a method to efficiently achieve spatial discrete Fourier transform operation across a ULA to obtain multiple simultaneous RF beams. An example using a 1.2 V current-mode approximate DFT on 65 nm CMOS, with BSIM4 models from the RF kit, shows potential operation up to 4 GHz with eight independent aperture beams.

17 citations
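
The spatial DFT idea in the entry above can be illustrated with the exact DFT rather than the approximate one, whose coefficients are not listed here. The sketch assumes a narrowband plane wave, an 8-element uniform linear array with half-wavelength spacing, and one common sign convention for the inter-element phase; all variable names are illustrative.

```python
import numpy as np

N = 8                      # number of antenna elements (and beams)
k0 = 1                     # hypothetical beam index the source is aligned with
sin_theta = 2 * k0 / N     # arrival direction that lands exactly on beam k0

# Array snapshot: with half-wavelength spacing (assumed), element n sees a
# phase of pi * n * sin(theta) relative to element 0.
n = np.arange(N)
snapshot = np.exp(1j * np.pi * n * sin_theta)

# An N-point DFT across the elements forms N simultaneous beams.
beams = np.fft.fft(snapshot)
power = np.abs(beams) ** 2
strongest = int(np.argmax(power))
```

With the source aligned to a beam direction, all of the received energy appears in a single DFT bin, which is what makes the DFT usable as a multi-beam former.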


Journal ArticleDOI
TL;DR: Novel ultralow-complexity differential-form depth-selective 4-D IIR filter algorithms and their corresponding architectures are proposed for processing 4-D light fields and practical results are presented for real-world video sequences.
Abstract: We propose the application of light field cameras and depth-selective 4-D IIR filtering to enable video surveillance, leveraging the post-capture depth-selective filtering enabled by computational photography. Novel ultralow-complexity differential-form depth-selective 4-D IIR filter algorithms and their corresponding architectures are proposed for processing 4-D light fields. Practical results are presented for real-world video sequences, and a CMOS VLSI implementation of the arithmetic processing elements is synthesized. The architecture shows $86.66\%$ and $78.94\%$ reductions in multipliers and adders, respectively, compared to the direct-form structure and delivers 26 frames/s for light fields of size $16\times 16\times 128\times 128$.

16 citations


Proceedings ArticleDOI
16 Apr 2015
TL;DR: Two multiplierless pruned 8-point discrete cosine transform (DCT) approximations are presented; both transforms present lower arithmetic complexity than state-of-the-art methods.
Abstract: Two multiplierless pruned 8-point discrete cosine transform (DCT) approximations are presented. Both transforms present lower arithmetic complexity than state-of-the-art methods. The performance of these new methods was assessed in the image compression context. A JPEG-like simulation was performed, demonstrating the adequacy and competitiveness of the introduced methods. Digital VLSI implementation in CMOS technology was also considered. Both presented methods were realized in the Berkeley Emulation Engine (BEE3).

13 citations


Journal ArticleDOI
TL;DR: A new family of wavelets and a multiresolution analysis that exploits the relationship between analyzing filters and Floquet's solution of Mathieu differential equations has potential application in the fields of optics and electromagnetism.
Abstract: This note introduces a new family of wavelets and a multiresolution analysis, which exploits the relationship between analysing filters and Floquet's solution of Mathieu differential equations. The transfer function of both the detail and the smoothing filter is related to the solution of a Mathieu equation of odd characteristic exponent. The number of notches of these filters can be easily designed. Wavelets derived by this method have potential application in the fields of Optics and Electromagnetism.

12 citations


Journal ArticleDOI
TL;DR: Two approaches are applied to reduce the computation time of the residual complexity similarity metric employed in image registration applications aimed at hardware-based implementations with low-complexity transforms, finding that block-wise processing alone reduces computational cost.
Abstract: The authors apply two approaches to reduce the computation time of the residual complexity similarity metric employed in image registration applications aimed at hardware-based implementations with low-complexity transforms. First, the similarity metric is computed in image sub-blocks, which are subsequently combined into a global metric value. Second, the discrete cosine transform (DCT) needed in the computation of the similarity measure is replaced with multiplier-free low-complexity approximate transforms. The authors propose a new low-complexity transform requiring only 18 additions in an 8 × 8 block and compare it to the round DCT, the signed DCT, the Hadamard transform and the Walsh-Hadamard transform. Detailed computational complexity analysis reveals that block-wise processing alone reduces computational cost by a factor of 8-9 for the original DCT composed of multiplications and additions, and by up to ≃4.90 when the proposed DCT is utilised, with the computation performed using additions only. Results obtained from computer-simulated and realistic X-ray images demonstrate that block-wise processing and approximate transforms result in successful image registration, making the residual complexity similarity measure available to hardware-accelerated fast image registration applications.

10 citations
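
Of the comparison transforms named in the abstract above, the Walsh-Hadamard transform is the easiest to show as an additions-only computation. The sketch below is a textbook fast Walsh-Hadamard transform in natural (Sylvester) ordering with an explicit addition counter; it is not the 18-addition transform proposed by the authors, and the names `fwht`, `H8` and the sample vector are illustrative.

```python
import numpy as np

def fwht(x):
    """Fast Walsh-Hadamard transform (natural order) using only add/subtract.

    Returns the transformed vector and the number of additions performed.
    """
    a = np.array(x, dtype=np.int64)
    n = a.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    additions = 0
    h = 1
    while h < n:                       # log2(n) butterfly stages
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
                additions += 2         # one sum and one difference
        h *= 2
    return a, additions

# Reference: the 8x8 Sylvester-Hadamard matrix built from Kronecker products.
H2 = np.array([[1, 1], [1, -1]])
H8 = np.kron(H2, np.kron(H2, H2))

x = np.array([3, 1, 4, 1, 5, 9, 2, 6])   # hypothetical input block
y, count = fwht(x)
```

For an 8-point input the butterfly structure uses 8 · log2(8) = 24 additions, compared with 56 for a direct matrix-vector product with `H8`.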


Proceedings ArticleDOI
10 May 2015
TL;DR: A low-complexity multiplierless approximation for the 8-point DFT is presented for RF multi-beamforming using only 26 hardware adders and provides eight simultaneous aperture beams that closely resemble the antenna array patterns of an FFT-based beamformer.
Abstract: A low-complexity multiplierless approximation for the 8-point DFT is presented for RF multi-beamforming using only 26 hardware adders. The algorithm provides eight simultaneous aperture beams that closely resemble the antenna array patterns of an FFT-based beamformer. The multiplicative complexity is used as a benchmark for comparing the performance-complexity-power trade-offs between the traditional FFT and the proposed approximate DFT algorithms. Metrics based on maximum throughput, chip area, and power consumption are used for the comparison. The paper discusses the theory behind the proposed new algorithm, and the proposed 8-point DFT is presented in the form of an 8 × 8 matrix. Furthermore, simulation examples are provided for both 1-D and 2-D antenna array patterns along with synthesized results for 45 nm CMOS technology at 1.1 V supply voltage. Cadence designs show reductions of 30.6%, 33.2%, 29.0%, 26.1% and 52.0% in chip area, dynamic power consumption, critical path delay, gate count and area-time, respectively, and an increase of 45.5% in maximum clock frequency (throughput) for the proposed 8-point DFT approximation in comparison with a traditional radix-2 FFT algorithm, where both algorithms assumed 16-bit input signals.

Posted Content
TL;DR: In this article, a wavelet multiresolution analysis based on the Fourier and Hartley transform kernels is proposed, which is called Fourier-like and Hartley-like wavelet analysis.
Abstract: In continuous-time wavelet analysis, most wavelets present some kind of symmetry. Based on the Fourier and Hartley transform kernels, a new wavelet multiresolution analysis is proposed. This approach is based on a pair of orthogonal wavelet functions and is named the Fourier-Like and Hartley-Like wavelet analysis. A Hilbert transform analysis on the wavelet theory is also included.

Posted Content
TL;DR: The theoretical lower bound on the multiplicative complexity for the DFT/DHT is achieved and some fast algorithms are derived based on the factorization of DHT matrices.
Abstract: Discrete transforms such as the discrete Fourier transform (DFT) and the discrete Hartley transform (DHT) are important tools in numerical analysis. The successful application of transform techniques relies on the existence of efficient fast transforms. In this paper some fast algorithms are derived. The theoretical lower bound on the multiplicative complexity for the DFT/DHT is achieved. The approach is based on the factorization of DHT matrices. Algorithms for short blocklengths such as $N \in \{3, 5, 6, 12, 24 \}$ are presented.
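
Since several entries in this listing treat the DFT and DHT together, a short sketch of how the two are related may help. It uses the standard definition of the DHT with the kernel cas(t) = cos(t) + sin(t) and the identity that, for real input, the DHT equals the real part of the DFT minus its imaginary part. It does not implement the minimal-multiplication factorizations derived in the paper; the function names and the random test vector are illustrative.

```python
import numpy as np

def dht_direct(x):
    """DHT from its definition: H[k] = sum_n x[n] * cas(2*pi*k*n/N)."""
    x = np.asarray(x, dtype=float)
    n = np.arange(x.size)
    angles = 2 * np.pi * np.outer(n, n) / x.size
    return (np.cos(angles) + np.sin(angles)) @ x

def dht_via_fft(x):
    """DHT of a real sequence obtained from its DFT: H = Re(X) - Im(X)."""
    X = np.fft.fft(np.asarray(x, dtype=float))
    return X.real - X.imag

N = 12                                   # one of the blocklengths listed above
rng = np.random.default_rng(0)
x = rng.standard_normal(N)               # hypothetical real input

h = dht_via_fft(x)
round_trip = dht_via_fft(h) / N          # the DHT is its own inverse up to 1/N
```

The self-inverse property is one reason the DHT is attractive in hardware: the same datapath serves for both the forward and the inverse transform.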

Posted Content
TL;DR: Some fast algorithms are derived which meet the lower bound on the multiplicative complexity of a DFT/DHT based on a decomposition of the DHT into layers of Hadamard-Walsh transforms.
Abstract: Discrete transforms such as the Discrete Fourier Transform (DFT) or the Discrete Hartley Transform (DHT) furnish an indispensable tool in Signal Processing. The successful application of transform techniques relies on the existence of the so-called fast transforms. In this paper some fast algorithms are derived which meet the lower bound on the multiplicative complexity of a DFT/DHT. The approach is based on a decomposition of the DHT into layers of Hadamard-Walsh transforms. In particular, schemes named Turbo Fourier Transforms for short block lengths such as N=4, 8, 12 and 24 are presented.

Posted Content
TL;DR: In this paper, a relationship between the Riemann zeta function and a density on integer sets is explored, and several properties of the examined density are derived.
Abstract: A relationship between the Riemann zeta function and a density on integer sets is explored. Several properties of the examined density are derived.

Proceedings ArticleDOI
07 Apr 2015
TL;DR: This paper introduces a new class of multiplierless hardware algorithm consisting only of arithmetic adder circuits that closely approximates the 2-D version of the 8-point DFT.
Abstract: The two-dimensional (2-D) discrete Fourier transform (DFT) is widely used in digital signal processing (DSP) and computing applications. Fast Fourier transforms (FFTs) are widely used as low-complexity algorithms for the computation of the DFT as they reduce the required computation operations from $O(N^2)$ to $O(N \log_2 N)$. The multiplicative complexity is used as a benchmark in comparing different algorithms as it affects the circuit complexity, chip area and power. This paper introduces a new class of multiplierless hardware algorithm consisting only of arithmetic adder circuits that closely approximates the 2-D version of the 8-point DFT. The paper discusses the theory behind the proposed new algorithm, with the DFT presented in the form of an 8 × 8 matrix. Furthermore, it provides a multi-beam RF aperture application example where the 2-D DFT approximation has been used to closely obtain the antenna array patterns.
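
The phrase "the DFT presented in the form of an 8 × 8 matrix" can be made concrete with the exact transform. The sketch below builds the 8-point DFT matrix and applies it along rows and columns to obtain the 2-D DFT, then checks the result against NumPy's FFT. The multiplierless approximation from the paper would replace the matrix `F` with an adder-only matrix that is not given in this listing; the names `F`, `X` and `Y` are illustrative.

```python
import numpy as np

N = 8
n = np.arange(N)

# Exact 8-point DFT matrix: F[k, m] = exp(-2j*pi*k*m/N).
F = np.exp(-2j * np.pi * np.outer(n, n) / N)

rng = np.random.default_rng(0)
X = rng.standard_normal((N, N))     # hypothetical 8x8 input block

# Row-column decomposition: transform the columns (F @ X), then the rows (@ F.T).
Y = F @ X @ F.T

# Reference computed with an FFT, which needs O(N log2 N) operations per
# 1-D transform instead of the O(N^2) of a direct matrix-vector product.
Y_ref = np.fft.fft2(X)
```

Because the 2-D transform separates into 1-D transforms, any low-complexity 1-D approximation of `F` immediately gives a 2-D approximation of the same form.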

Journal ArticleDOI
TL;DR: A low-complexity multiplierless approximation for the 8-point FFT is presented for RF beamforming, using only 26 additions, and provides eight beams that closely resemble the antenna array patterns of the traditional FFT-based beamformer albeit without using multipliers.
Abstract: Multiple independent radio frequency (RF) beams find applications in communications, radio astronomy, radar, and microwave imaging. An $N$-point FFT applied spatially across an array of receiver antennas provides $N$-independent RF beams at $\frac{N}{2}\log_2N$ multiplier complexity. Here, a low-complexity multiplierless approximation for the 8-point FFT is presented for RF beamforming, using only 26 additions. The algorithm provides eight beams that closely resemble the antenna array patterns of the traditional FFT-based beamformer albeit without using multipliers. The proposed FFT-like algorithm is useful for low-power RF multi-beam receivers; being synthesized in 45 nm CMOS technology at 1.1 V supply, and verified on-chip using a Xilinx Virtex-6 Lx240T FPGA device. The CMOS simulation and FPGA implementation indicate bandwidths of 588 MHz and 369 MHz, respectively, for each of the independent receive-mode RF beams.

Journal ArticleDOI
TL;DR: A hardware architecture for the computation of the null mean ACT is proposed, followed by novel architectures that extend the ACT for non-null mean signals.
Abstract: The discrete cosine transform (DCT) is a widely used and important signal processing tool employed in a plethora of applications. Typical fast algorithms for nearly exact computation of the DCT require floating-point arithmetic, are multiplier intensive, and accumulate round-off errors. A recently proposed fast algorithm, the arithmetic cosine transform (ACT), calculates the DCT exactly using only additions and integer constant multiplications, with very low area complexity, for null mean input sequences. The ACT can also be computed non-exactly for any input sequence, with low area complexity and low power consumption, utilizing the novel architecture described. However, as a trade-off, the ACT algorithm requires 10 non-uniformly sampled data points to calculate the eight-point DCT. This requirement can easily be satisfied for applications dealing with spatial signals such as image sensors and biomedical sensor arrays, by placing sensor elements in a non-uniform grid. In this work, a hardware architecture for the computation of the null mean ACT is proposed, followed by novel architectures that extend the ACT for non-null mean signals. All circuits are physically implemented and tested using the Xilinx XC6VLX240T FPGA device and synthesized for the 45 nm TSMC standard-cell library for performance assessment.

Posted Content
TL;DR: The proposed fast algorithms are based on successive decompositions of the finite field Hartley transform by means of Hadamard-Walsh transforms (HWT), which meet the lower bound on the multiplicative complexity for all the cases investigated.
Abstract: A new transform over finite fields, the finite field Hartley transform (FFHT), was recently introduced and a number of promising applications on the design of efficient multiple access systems and multilevel spread spectrum sequences were proposed. The FFHT exhibits interesting symmetries, which are exploited to derive tailored fast transform algorithms. The proposed fast algorithms are based on successive decompositions of the FFHT by means of Hadamard-Walsh transforms (HWT). The introduced decompositions meet the lower bound on the multiplicative complexity for all the cases investigated. The complexity of the new algorithms is compared with that of traditional algorithms.

Journal ArticleDOI
TL;DR: The proposed algorithms for the arithmetic Fourier transform are surveyed and a new arithmetic transform for computing the discrete Hartley transform is introduced: the Arithmetic Hartley transform.
Abstract: Arithmetic complexity has a main role in the performance of algorithms for spectrum evaluation. Arithmetic transform theory offers a method for computing trigonometrical transforms with minimal number of multiplications. In this paper, the proposed algorithms for the arithmetic Fourier transform are surveyed. A new arithmetic transform for computing the discrete Hartley transform is introduced: the Arithmetic Hartley transform. The interpolation process is shown to be the key element of the arithmetic transform theory.

Posted Content
TL;DR: New elliptic cylindrical wavelets are introduced, which exploit the relationship between analysing filters and Floquet's solution of Mathieu differential equations.
Abstract: New elliptic cylindrical wavelets are introduced, which exploit the relationship between analysing filters and Floquet's solution of Mathieu differential equations. It is shown that the transfer function of both multiresolution filters is related to the solution of a Mathieu equation of odd characteristic exponent. The number of notches of these analysing filters can be easily designed. Wavelets derived by this method have potential application in the fields of optics, microwaves and electromagnetism.

Journal ArticleDOI
TL;DR: In this paper, an extension of the Dirichlet density to sets of Gaussian integers is proposed and some of its properties are investigated.
Abstract: Several measures for the density of sets of integers have been proposed, such as the asymptotic density, the Schnirelmann density, and the Dirichlet density. There has been some work in the literature on extending some of these concepts of density to higher dimensional sets of integers. In this work, we propose an extension of the Dirichlet density for sets of Gaussian integers and investigate some of its properties.
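
For reference, the three densities named in the abstract above have the following standard definitions for a set $A$ of positive integers. These are the textbook one-dimensional forms, stated here as background; the extension to Gaussian integers proposed in the paper is not given in this listing and is not reproduced.

```latex
\[
d(A) = \lim_{n\to\infty} \frac{|A \cap \{1,\dots,n\}|}{n},
\qquad
\sigma(A) = \inf_{n \ge 1} \frac{|A \cap \{1,\dots,n\}|}{n},
\qquad
\delta(A) = \lim_{s \to 1^{+}} \frac{\sum_{a \in A} a^{-s}}{\zeta(s)} .
\]
% Worked example: A = even positive integers.
\[
d(A) = \tfrac12,
\qquad
\sigma(A) = 0 \quad (\text{since } 1 \notin A),
\qquad
\delta(A) = \lim_{s \to 1^{+}} \frac{\sum_{m \ge 1} (2m)^{-s}}{\zeta(s)}
          = \lim_{s \to 1^{+}} 2^{-s} = \tfrac12 .
\]
```

The example shows that the asymptotic and Dirichlet densities agree on the even numbers, while the Schnirelmann density is sensitive to the smallest elements of the set.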