Showing papers on "Residue number system" published in 2003


Journal ArticleDOI
TL;DR: The new Chinese remainder theorem introduced recently has been employed to exploit the special properties of the proposed moduli set where modulo corrections are done without resorting to the costly and time consuming modulo operations.
Abstract: The inherent properties of carry-free operations, parallelism and fault-tolerance have made the residue number system a promising candidate for high-speed arithmetic and specialized high-precision digital signal-processing applications. However, the reverse conversion from the residues to the weighted binary number has long been the performance bottleneck, particularly when the number of moduli in the set increases beyond 3. In this paper, we present an elegant residue-to-binary conversion algorithm for a new 4-moduli set {2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1}. The new Chinese remainder theorem introduced recently has been employed to exploit the special properties of the proposed moduli set where modulo corrections are done without resorting to the costly and time-consuming modulo operations. The resulting architecture is notably simple and can be realized in hardware with only bit reorientation and one multioperand modular adder. The new reverse converter has superior area-time complexity in comparison with the reverse converters for several other 4-moduli sets.

110 citations
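
As a point of reference for the conversion problem this abstract describes, the following is a minimal Python sketch of forward conversion and textbook CRT reverse conversion for the moduli set {2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1}. It is not the New CRT-based converter from the paper; the function names (`moduli_set`, `to_rns`, `from_rns_crt`) and the choice n = 4 are assumptions made purely for illustration.

```python
from math import gcd, prod

def moduli_set(n):
    """Return the 4-moduli set {2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1}."""
    return [2**n - 1, 2**n, 2**n + 1, 2**(2 * n) + 1]

def to_rns(x, moduli):
    """Forward conversion: one residue per modulus."""
    return [x % m for m in moduli]

def from_rns_crt(residues, moduli):
    """Reverse conversion with the textbook CRT (needs a wide modulo-M reduction)."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m              # product of all the other moduli
        inv = pow(Mi, -1, m)     # multiplicative inverse of Mi modulo m
        total += r * Mi * inv
    return total % M

moduli = moduli_set(4)           # [15, 16, 17, 257]
# The moduli are pairwise coprime, so each x in [0, M) has a unique residue vector.
assert all(gcd(a, b) == 1 for i, a in enumerate(moduli) for b in moduli[i + 1:])
x = 123456
assert from_rns_crt(to_rns(x, moduli), moduli) == x
```

The final reduction modulo M in `from_rns_crt` is the kind of wide modulo operation that, according to the abstract, the proposed converter avoids.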


Journal ArticleDOI
TL;DR: In order to represent 8-, 16-, 32-, and 64-bit binary numbers, the moduli set {2^n, 2^n + 1, 2^n - 1} provides the fastest R/B converter and requires the smallest area.
Abstract: In this paper, a detailed study on the four three-moduli sets reported in the literature is carried out from the point of view of the hardware complexity and speed of their residue-to-binary (R/B) converters. First, a new formulation of the Chinese remainder theorem is proposed that reduces the size of the modulo operation. Then, the proposed formulation is applied to derive, in a simple and unified manner, R/B conversion algorithms for three of the sets. Further, using this formulation, a new algorithm along with two corresponding R/B converters for the fourth set is proposed; one of these converters is area efficient while the other is speed efficient. Next, the best R/B converter(s) for each of the sets is chosen based on the hardware complexity and/or speed. These converters are implemented for 8, 16, 32, and 64-bit dynamic ranges, using CMOS VLSI technology. Based on a post-layout performance evaluation for the VLSI implementations of the chosen converters, it is shown that in order to represent 8-, 16-, 32-, and 64-bit binary numbers, the moduli set {2^n, 2^n + 1, 2^n - 1} provides the fastest R/B converter and requires the smallest area.

97 citations


Proceedings ArticleDOI
Ricardo Chaves1, Leonel Sousa1
01 Sep 2003
TL;DR: Experimental results show that not only a significant reduction in circuit area and power consumption but also a speedup may be achieved with RNS when compared with a binary DSP.
Abstract: This paper focuses on the design of low-power, programmable, fast digital signal processors (DSPs) based on a configurable 5-stage RISC core architecture and on residue number systems (RNS). Several innovative aspects are introduced at the control and datapath architecture levels, which support both the binary system and the RNS. A new moduli set {2^n - 1, 2^(2n), 2^n + 1} is also proposed for balancing the processing time in the different RNS channels. Experimental results, obtained through RDSP implementation on FPGA and ASIC, show that not only a significant reduction in circuit area and power consumption but also a speedup may be achieved with RNS when compared with a binary DSP.

69 citations
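
To make the idea of independent RNS channels concrete, here is a small Python sketch of channel-wise addition and multiplication over the moduli set {2^n - 1, 2^(2n), 2^n + 1} named in the abstract. It models only the arithmetic, not the RISC datapath; the helper names and the value n = 4 are assumptions for illustration.

```python
def channel_moduli(n):
    """Moduli set {2^n - 1, 2^(2n), 2^n + 1}."""
    return (2**n - 1, 2**(2 * n), 2**n + 1)

def to_rns(x, moduli):
    """Forward conversion: one residue per channel."""
    return tuple(x % m for m in moduli)

def rns_add(a, b, moduli):
    """Channel-wise addition: no carry crosses from one channel to another."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))

def rns_mul(a, b, moduli):
    """Channel-wise multiplication, again independent per channel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))

moduli = channel_moduli(4)                 # (15, 256, 17)
M = moduli[0] * moduli[1] * moduli[2]      # dynamic range: 65280
a, b = 201, 173
assert a * b + a < M                       # the result must stay inside the dynamic range
ra, rb = to_rns(a, moduli), to_rns(b, moduli)
acc = rns_add(rns_mul(ra, rb, moduli), ra, moduli)   # multiply-accumulate a*b + a
assert acc == to_rns(a * b + a, moduli)
```

Each channel works on a short word and never exchanges carries with its neighbours, which is why the relative widths of the three moduli determine how evenly the processing time is balanced.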


Patent
18 Sep 2003
TL;DR: In this paper, a method and a circuit for masking digital data handled by an algorithm and factorized by a residue number system based on a finite base of numbers or polynomials prime to one another are presented.
Abstract: A method and a circuit for masking digital data handled by an algorithm and factorized by a residue number system based on a finite base of numbers or polynomials prime to one another, comprising making the factorization base variable.

44 citations


Journal ArticleDOI
01 Jul 2003
TL;DR: The innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks is reported and shown to be well suited to field-programmable logic (FPL) as well as cell-based integrated circuit (CBIC) technologies.
Abstract: The design of high performance, high precision, real-time digital signal processing (DSP) systems, such as those associated with wavelet signal processing, is a challenging problem. This paper reports on the innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks. The disclosed system uses an enhanced index-transformation defined over Galois fields to efficiently support different wavelet filter instantiations without adding any extra cost or additional look-up tables (LUT). The selection of a small-wordwidth modulus set is key to attaining low complexity and high throughput. An exhaustive comparison against existing two's complement (2C) designs for different custom IC technologies was carried out. Results reveal a performance improvement of up to 100% for high-precision RNS-based systems. These structures proved to be well suited to field-programmable logic (FPL) as well as cell-based integrated circuit (CBIC) technologies.

30 citations


Journal ArticleDOI
TL;DR: A residue to binary converter design is proposed which requires less time and hardware than all published converters for the other two moduli sets.

28 citations


Journal ArticleDOI
TL;DR: A new implementation of an 8×8 two-dimensional discrete cosine transform (2D-DCT) processor based on the residue number system (RNS) is presented, which leads to a 129% throughput improvement over the equivalent binary system.
Abstract: A new implementation of an 8×8 two-dimensional discrete cosine transform (2D-DCT) processor based on the residue number system (RNS) is presented. This architecture makes use of a fast cosine transform algorithm. It is shown that the RNS implementation of the 2D-DCT over field-programmable logic devices leads to a 129% throughput improvement over the equivalent binary system.

28 citations


Journal ArticleDOI
TL;DR: This paper presents a new scaling approach, which allows faster and more efficient schemes, because the scaling uses only RNS operations within the small word length channels.
Abstract: Previous scaling schemes are based on the conversion of the unpositional residue number system (RNS) digits into a positional number system via Chinese remainder theorem (CRT) or mixed-radix-conversion (MRC) and the back conversion into RNS with an associated size and speed penalty in cell-based integrated circuit (CBIC) designs. This paper presents a new scaling approach, which allows faster and more efficient schemes, because the scaling uses only RNS operations within the small word length channels.

28 citations


Journal ArticleDOI
28 Feb 2003
TL;DR: In this paper, a five-moduli set, in which each modulus has a specific form, is presented, and the multiplicative inverse of each modulus is expressed in closed form, which facilitates the conversion process.
Abstract: Residue number system to binary number system conversion is a very basic operation in any interface between the two systems. A five-moduli set, in which each modulus has a specific form, is presented. The moduli set is defined as (2^k, 2^k - 1, 2^k + 1, 2^k - 2^((k+1)/2) + 1, 2^k + 2^((k+1)/2) + 1), where k is an odd positive integer. The multiplicative inverse of each modulus is expressed in closed form, which facilitates the conversion process. The realisation of the introduced converter is very attractive for many residue-based applications. The converter requires only a multi-operand adder. The delay and area of the new converter are considerably less than those previously reported in the literature.

24 citations


Journal ArticleDOI
TL;DR: A fault-tolerant technique based on the modulus replication residue number system (MRRNS) which allows for modular arithmetic computations over identical channels and takes advantage of some elementary polynomial properties to obtain the same level of fault tolerance.
Abstract: This paper presents a fault-tolerant technique based on the modulus replication residue number system (MRRNS) which allows for modular arithmetic computations over identical channels. In this system, fault tolerance is provided by adding extra computational channels that can be used to redundantly compute the mapped output. An algebraic technique is used to determine the error position in the mapped outputs and provide corrections. We also show that by taking advantage of some elementary polynomial properties we obtain the same level of fault tolerance with about a 30% decrease in the number of channels. This new system is referred to as the symmetric MRRNS (SMRRNS).

23 citations


Journal ArticleDOI
TL;DR: This paper introduces the generalized CRT (GCRT) with parallel algorithms for the conversion, and proposes algorithms that compute independent fragments of the GCRT computation concurrently for greater efficiency.

Journal ArticleDOI
TL;DR: The quotient function (QF) of the residue number system (RNS) is presented, which provides the integer value of the argument scaled by a modulus of the RNS, and an application of the QF to residue-to-binary conversion is proposed.
Abstract: This paper presents the quotient function (QF) of the residue number system (RNS), which provides the integer value of the argument scaled by a modulus of the RNS. An application of the QF to residue-to-binary conversion is proposed, which outperforms the traditional approaches based on the Chinese remainder theorem and the mixed radix conversion, if a modulus of the kind 2^k, with k an integer, is included in the set of RNS moduli.

01 Jan 2003
TL;DR: The new Chinese remainder theorem introduced recently has been employed to exploit the special properties of the proposed moduli set where modulo corrections are done without resorting to the costly and time consuming modulo operations.
Abstract: The inherent properties of carry-free operations, parallelism and fault-tolerance have made the residue number system a promising candidate for high-speed arithmetic and specialized high-precision digital signal-processing applications. However, the reverse conversion from the residues to the weighted binary number has long been the performance bottleneck, particularly when the number of moduli in the set increases beyond 3. In this paper, we present an elegant residue-to-binary conversion algorithm for a new 4-moduli set {2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1}. The new Chinese remainder theorem introduced recently has been employed to exploit the special properties of the proposed moduli set where modulo corrections are done without resorting to the costly and time-consuming modulo operations. The resulting architecture is notably simple and can be realized in hardware with only bit reorientation and one multioperand modular adder. The new reverse converter has superior area-time complexity in comparison with the reverse converters for several other 4-moduli sets. Index Terms: New Chinese remainder theorem (CRT), residue arithmetic, residue number system (RNS), residue-to-binary converter.

Book ChapterDOI
01 Sep 2003
TL;DR: RNS-FPL merged filters demonstrated their superiority when compared to 2C (two’s complement) filters, being about 65% faster and requiring fewer logic elements for most study cases.
Abstract: This paper presents the residue number system (RNS) implementation of reduced complexity and high performance adaptive FIR filters on Altera APEX20K field-programmable logic (FPL) devices. Index arithmetic over Galois fields, along with the selection of a small-wordwidth modulus set, is key to attaining low complexity and high throughput. The replacement of a classical modulo adder tree by a binary adder with extended precision followed by a single modulo reduction stage improved area requirements by 10% for a 32-tap FIR filter. A block LMS (BLMS) implementation was preferred for the update of the adaptive FIR filter coefficients. RNS-FPL merged filters demonstrated their superiority when compared to 2C (two’s complement) filters, being about 65% faster and requiring fewer logic elements for most study cases.
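
The adder-tree replacement mentioned in this abstract can be illustrated with a short Python sketch that compares reducing after every addition with accumulating in extended precision and reducing once. This is only an arithmetic model, not the FPL design; the modulus 61, the tap values, and the function names are assumptions chosen for illustration.

```python
def tap_sum_modular_tree(products, m):
    """Reduce modulo m after every addition (classical modulo adder tree)."""
    acc = 0
    for p in products:
        acc = (acc + p) % m
    return acc

def tap_sum_single_reduction(products, m):
    """Accumulate in extended precision, then reduce modulo m once."""
    return sum(products) % m

m = 61                                        # illustrative small-wordwidth modulus
coeffs = range(1, 33)                         # 32 taps, made-up coefficients
samples = range(40, 72)                       # made-up input samples
products = [(c * x) % m for c, x in zip(coeffs, samples)]
assert tap_sum_modular_tree(products, m) == tap_sum_single_reduction(products, m)
# Width of the extended-precision accumulator for 32 residues below m.
assert (32 * (m - 1)).bit_length() == 11
```

Both routes give the same residue; the second trades a few extra accumulator bits for a single reduction stage.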

Journal Article
TL;DR: In this paper, the residue number system (RNS) implementation of reduced complexity and high performance adaptive FIR filters on Altera APEX20K field-programmable logic (FPL) devices is presented.
Abstract: This paper presents the residue number system (RNS) implementation of reduced complexity and high performance adaptive FIR filters on Altera APEX20K field-programmable logic (FPL) devices. Index arithmetic over Galois fields, along with the selection of a small-wordwidth modulus set, is key to attaining low complexity and high throughput. The replacement of a classical modulo adder tree by a binary adder with extended precision followed by a single modulo reduction stage improved area requirements by 10% for a 32-tap FIR filter. A block LMS (BLMS) implementation was preferred for the update of the adaptive FIR filter coefficients. RNS-FPL merged filters demonstrated their superiority when compared to 2C (two's complement) filters, being about 65% faster and requiring fewer logic elements for most study cases.

Proceedings ArticleDOI
18 Mar 2003
TL;DR: This technique is used to encode two digital image signals, or one text and one image signal, which are quantized with 8 bits using the residue number system (RNS) technique and multiplexed together; de-multiplexing to separate the original image signals is performed successfully.
Abstract: Data hiding is a very attractive field. Data hiding in imagery is a relatively young field that is growing at an exponential rate. It is highly multidisciplinary, combining image and signal processing with cryptography, communication and visual perception theory. In this paper, a new data hiding technique based on the residue number system is introduced for digital imagery systems. The technique is used to encode two digital image signals, or one text and one image signal, which are quantized with 8 bits using the residue number system (RNS) technique and multiplexed together. The de-multiplexing technique to separate the original image signals is also performed successfully. Simulation results are given.

Proceedings ArticleDOI
25 May 2003
TL;DR: Results on architectures such as FIR filters show that the techniques used to reduce the switching capacitance lead not only to more power-efficient circuits, but also to better performance.
Abstract: In this paper we present some tradeoffs between delay and power consumption in the design of digital processors based on the Residue Number System (RNS). We focus on reducing the switching capacitance, and therefore the power, in modular adders and isomorph multipliers. Results on architectures such as FIR filters show that the techniques used to reduce the switching capacitance lead not only to more power-efficient circuits, but also to better performance.

Proceedings ArticleDOI
20 Jun 2003
TL;DR: The area and time complexity of the proposed high-radix multipliers is compared to published results, revealing that the proposed architecture can be several times more efficient in terms of the area-time product.
Abstract: In this paper a novel class of Residue Number System (RNS) arithmetic circuits is introduced. The novel circuits feature the radix-r representation, r = 2i - 1, (i > 1) of the involved residues, and perform operations modulo r^n - 1, r^n, or r^n + 1. Redundancy is used in the representation of the radix-r digits to reduce the complexity of the digit adder cells. The area and time complexity of the proposed high-radix multipliers is compared to published results; the comparisons reveal that the proposed architecture can be several times more efficient in terms of the area-time product.

Journal ArticleDOI
TL;DR: A new 5-modulus set is presented, that expands the dynamic range in comparison with the popular 3-moduli sets, and expresses the multiplicative inverse of each modulus in a compact form that eases the conversion.

Journal ArticleDOI
TL;DR: A fast residue checker for the error detection of arithmetic circuits, based on radix-two signed-digit (SD) number arithmetic, with which error detection can be performed in real time for a large product-sum circuit.
Abstract: This paper presents a fast residue checker for the error detection of arithmetic circuits. The residue checker consists of a number of residue arithmetic circuits such as adders, multipliers and binary-to-residue converters based on radix-two signed-digit (SD) number arithmetic. The proposed modulo m (m = 2^p ± 1) adder is designed with a p-digit SD adder, so that the modulo m addition time is independent of the word length of operands. The modulo m multiplier and binary-to-residue number converter are constructed with a binary tree structure of the modulo m SD adders. Thus, the modulo m multiplication is performed in a time proportional to log2(p) and an n-bit binary number is converted into a p-digit SD residue number, n ≫ p, in a time proportional to log2(n/p). By using the presented residue arithmetic circuits, the error detection can be performed in real time for a large product-sum circuit.
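
The principle behind residue checking of a product-sum can be sketched in a few lines of Python. The sketch uses ordinary integers rather than signed-digit arithmetic, so it illustrates only the checking idea and not the circuits described above; `residue_check`, its parameters, and the operand values are hypothetical.

```python
def residue_check(operand_pairs, claimed_result, p, sign=+1):
    """Check a product-sum result against its residue modulo m = 2^p + sign."""
    m = 2**p + sign
    check = 0
    for a, b in operand_pairs:
        # Work only with residues: (a*b) mod m == ((a mod m) * (b mod m)) mod m.
        check = (check + (a % m) * (b % m)) % m
    return check == claimed_result % m

pairs = [(1234, 5678), (91011, 1213), (1415, 161718)]
correct = sum(a * b for a, b in pairs)
assert residue_check(pairs, correct, p=4, sign=-1)       # m = 15
# A single flipped bit changes the result by a power of two, which an odd
# modulus m > 1 never divides, so the residues disagree.
faulty = correct ^ (1 << 5)
assert not residue_check(pairs, faulty, p=4, sign=-1)
```

The check runs on short residues only, which is what makes it cheap compared with recomputing the full-width product-sum.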

Patent
25 Sep 2003
TL;DR: In this paper, the problem of providing an arithmetic unit and an encryption/decoding arithmetic unit for making common a part of a plurality of arithmetic processing including matrix operations, and for performing the partial matrix operations in parallel to realize a high speed operation is addressed.
Abstract: PROBLEM TO BE SOLVED: To provide an arithmetic unit and an encryption/decoding arithmetic unit for making common a part of a plurality of arithmetic processing including matrix operations, and for performing the partial matrix operations in parallel to realize a high speed operation. SOLUTION: This arithmetic unit for performing the arithmetic processing of both first arithmetic processing including a first matrix operation and second arithmetic processing including a second matrix operation is provided with a first arithmetic part 41 for performing the second matrix operation, at least one or more other arithmetic parts 42 for performing matrix operations in parallel with the first arithmetic part for performing the first matrix operation and a logic circuit 46 for performing the logical operation of each of arithmetic results between the first arithmetic part and the other arithmetic parts. Then, when the arithmetic result of the first matrix operation is requested, the arithmetic unit obtains it from the logic circuit 46. COPYRIGHT: (C)2005,JPO&NCIPI

Proceedings ArticleDOI
09 Nov 2003
TL;DR: Algorithms for exact scaling and Montgomery reduction in the RNS with pairs of conjugate moduli are presented, suggesting this RNS is suitable for the implementation of public key cryptosystems.
Abstract: The residue number system (RNS) with pairs of conjugate moduli uses a modulus set containing pairs of moduli of the form {2^k - 1, 2^k + 1}. This RNS provides a good trade-off between large dynamic range and channel width. It also supports efficient channel-width multiplication and addition as well as efficient conversions between weighted and RNS representations. This paper presents algorithms for exact scaling and Montgomery reduction in the RNS with pairs of conjugate moduli. The ability to perform these operations on very large integers suggests this RNS is suitable for the implementation of public key cryptosystems.
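
One reason moduli of the form 2^k - 1 and 2^k + 1 are convenient is that reduction needs only shifts, masks and additions. The Python sketch below shows these standard folding reductions; it is not the scaling or Montgomery algorithm from the paper, and the function names and the test value are assumptions for illustration.

```python
def mod_2k_minus_1(x, k):
    """Reduce x >= 0 modulo 2^k - 1 by adding k-bit chunks (2^k ≡ 1)."""
    m = (1 << k) - 1
    while x > m:
        x = (x & m) + (x >> k)     # fold the high part onto the low part
    return 0 if x == m else x      # m itself is congruent to 0

def mod_2k_plus_1(x, k):
    """Reduce x >= 0 modulo 2^k + 1 by alternately adding and subtracting k-bit chunks (2^k ≡ -1)."""
    m = (1 << k) + 1
    mask = (1 << k) - 1
    acc, sign = 0, 1
    while x:
        acc += sign * (x & mask)
        sign = -sign
        x >>= k
    return acc % m                 # acc is small; this is only a final correction

k = 8
x = 0xDEADBEEF
assert mod_2k_minus_1(x, k) == x % 255
assert mod_2k_plus_1(x, k) == x % 257
# A conjugate pair multiplies to 2^(2k) - 1.
assert (2**k - 1) * (2**k + 1) == 2**(2 * k) - 1
```

Because each pair contributes a factor of 2^(2k) - 1 to the dynamic range while its channels stay about k bits wide, adding pairs widens the range without widening the channels.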

Book ChapterDOI
07 Sep 2003
TL;DR: A novel method for the parallelization of the modular multiplication algorithm in the Residue Number System (RNS) is presented, which requires only L moduli, half the number needed in the previous algorithm.
Abstract: This paper presents a novel method for the parallelization of the modular multiplication algorithm in the Residue Number System (RNS). The proposed algorithm executes modular reductions using a new lookup table along with the Mixed Radix number System (MRS) and RNS. MRS is used because algebraic comparison is difficult in RNS, which has a non-weighted number representation. Compared with the previous algorithm, the proposed algorithm requires only L moduli, which is half the number needed in the previous algorithm. Furthermore, the proposed algorithm reduces the number of MUL operations by 25%.

Proceedings ArticleDOI
25 May 2003
TL;DR: An elegant residue-to-binary algorithm for a new 4-moduli set is presented; the performance of the new reverse converter surpasses previously reported reverse converters for several celebrated four-moduli sets.
Abstract: This paper presents an elegant residue-to-binary algorithm for a new 4-moduli set (2^n - 1, 2^n, 2^n + 1, 2^(2n) + 1) Residue Number System. Our reverse conversion algorithm takes advantage of the special number properties of the proposed moduli set. The recently introduced New CRT theorem has been exploited to simplify the costly and time-consuming modular corrections. The resulting architecture is notably simple and can be realized in hardware with only a bit reorientation operation and a Multi-Operand Modular Adder. In terms of area-time complexity, the performance of the new reverse converter surpasses previously reported reverse converters for several celebrated four-moduli sets.

Proceedings ArticleDOI
25 May 2003
TL;DR: This paper introduces a new number representation, the Quantized Polynomial Representation (QPR), which is used for building FIR filters where some quantization errors can be tolerated; considerable savings are possible if polynomial quantization can be tolerated.
Abstract: This paper introduces a new number representation, the Quantized Polynomial Representation (QPR), which is used for building FIR filters where some quantization errors can be tolerated. The technique is based on the previously published Modulus Replication Residue Number System (MRRNS) but considerable savings are possible if polynomial quantization can be tolerated. The QPR can be used as a vehicle for Quadratic Residue Number System (QRNS) mapping of complex data, and the main computational architecture can be built with independent finite ring computational channels. We demonstrate this new technique on an asymmetrical Gigabit wireless LAN where the adaptive filter computes with complex arithmetic. We demonstrate area savings of up to 28% and power savings of up to 50%.

Patent
19 Sep 2003
TL;DR: In this paper, the digital data is factorized by a residue number system based on a finite base of numbers or polynomials prime to one another, and an independent claim is also included for a circuit for algorithmic processing of data factorized by a residue number system.
Abstract: The digital data is factorized by a residue number system based on a finite base of numbers or polynomials prime to one another. An independent claim is also included for a circuit for algorithmic processing of data factorized by a residue number system.

Journal ArticleDOI
TL;DR: This paper assesses the arithmetic benefits provided by the Residue Number System (RNS) for building Digital Signal Processing (DSP) systems with Field-Programmable Logic (FPL) technology and studies the quantifiable benefits of this approach in the context of a new Fast Cosine Transform (FCT) architecture enhanced by using the QRNS.
Abstract: This paper assesses the arithmetic benefits provided by the Residue Number System (RNS) for building Digital Signal Processing (DSP) systems with Field-Programmable Logic (FPL) technology. The quantifiable benefits of this approach are studied in the context of a new Fast Cosine Transform (FCT) architecture enhanced by using the Quadratic Residue Number System (QRNS). The system reduces the number of adders and multipliers required for the N-point Discrete Cosine Transform (DCT) and provides high throughput. For an FPL-based implementation, the proposed design achieves significant improvements over an equivalent 2C structure. By using up to 6-bit moduli, an overall increase in the system performance of about 140% is achieved. If this speed increase is considered along with the penalty in device resources, the presented QRNS-based FCT system provides an improvement in the area-delay figure factor of about 20%. Finally, the conversion overhead was carefully studied and it was found that the quantifiable benefits of the proposed design are not affected when converters are included.
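
For readers unfamiliar with the QRNS, the sketch below shows the standard mapping that turns a complex multiplication into two independent modular multiplications. It is not the FCT architecture from the paper; the prime 61 (a 6-bit modulus with p ≡ 1 mod 4), the operand values, and all function names are assumptions chosen for illustration.

```python
def find_j(p):
    """Find j with j*j ≡ -1 (mod p); such a j exists for primes p ≡ 1 (mod 4)."""
    for j in range(2, p):
        if (j * j) % p == p - 1:
            return j
    raise ValueError("no square root of -1 modulo p")

def to_qrns(a, b, p, j):
    """Map a + ib to the pair (a + j*b, a - j*b) modulo p."""
    return ((a + j * b) % p, (a - j * b) % p)

def from_qrns(z, zc, p, j):
    """Inverse map back to (real, imaginary) residues modulo p."""
    inv2 = pow(2, -1, p)
    inv2j = pow(2 * j, -1, p)
    return (((z + zc) * inv2) % p, ((z - zc) * inv2j) % p)

def qrns_mul(u, v, p):
    """Complex product as two independent modular multiplications."""
    return ((u[0] * v[0]) % p, (u[1] * v[1]) % p)

p = 61                       # illustrative 6-bit prime with p % 4 == 1
j = find_j(p)
a, b, c, d = 3, 4, 5, 6      # (3 + 4i) * (5 + 6i) = -9 + 38i
product = qrns_mul(to_qrns(a, b, p, j), to_qrns(c, d, p, j), p)
assert from_qrns(*product, p, j) == ((a * c - b * d) % p, (a * d + b * c) % p)
```

A direct complex multiplication needs four real multiplications and two additions, whereas the mapped form needs two multiplications with no cross terms, which is the kind of saving in multipliers and adders the abstract refers to.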