Showing papers on "Residue number system" published in 2001


Book ChapterDOI
14 May 2001
TL;DR: An implementation of RSA cryptosystem using the RNS Montgomery multiplication is described, and an implementation method using the Chinese Remainder Theorem (CRT) is presented.
Abstract: We proposed a fast parallel algorithm of Montgomery multiplication based on Residue Number Systems (RNS). An implementation of RSA cryptosystem using the RNS Montgomery multiplication is described in this paper. We discuss how to choose the base size of RNS and the number of parallel processing units. An implementation method using the Chinese Remainder Theorem (CRT) is also presented. An LSI prototype adopting the proposed Cox-Rower Architecture achieves 1024-bit RSA transactions in 4.2 msec without CRT and 2.4 msec with CRT, when the operating frequency is 80 MHz and the total number of logic gates is 333 KG for 11 parallel processing units.

128 citations
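
The CRT speed-up mentioned in the entry above can be illustrated with a minimal Python sketch of textbook RSA decryption. It is not the Cox-Rower hardware method; the function name `rsa_crt_decrypt` and the toy primes in the usage note are assumptions chosen for illustration only.

```python
def rsa_crt_decrypt(c, p, q, d):
    """Textbook RSA-CRT decryption (Garner recombination).

    Illustrative sketch only: p and q are the secret primes, d the private
    exponent. Two half-size exponentiations replace one full-size one.
    """
    dp = d % (p - 1)            # exponent reduced for the mod-p channel
    dq = d % (q - 1)            # exponent reduced for the mod-q channel
    q_inv = pow(q, -1, p)       # q^{-1} mod p (Python 3.8+)
    m1 = pow(c, dp, p)          # message residue mod p
    m2 = pow(c, dq, q)          # message residue mod q
    h = (q_inv * (m1 - m2)) % p
    return m2 + h * q           # unique value in [0, p*q)
```

With the toy parameters p = 61, q = 53, e = 17 and d = 2753, decrypting `pow(65, 17, 61 * 53)` with this function recovers the message 65.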


Proceedings ArticleDOI
11 Jun 2001
TL;DR: A new RNS modular multiplication for very large operands is presented, based on Montgomery's (1985) method adapted to residue arithmetic, which achieves an effect corresponding to a redundant high-radix implementation by choosing the moduli of the RNS system reasonably large.
Abstract: We present a new RNS modular multiplication for very large operands. The algorithm is based on Montgomery's (1985) method adapted to residue arithmetic. By choosing the moduli of the RNS system reasonably large, an effect corresponding to a redundant high-radix implementation is achieved, due to the carry-free nature of residue arithmetic. The actual computation in the multiplication takes place in constant time, where the unit of time is a few simple residue operations. However, values must be converted twice from one residue system into another, and each conversion takes O(n) time on O(n) processors, where n is the number of moduli in the RNS systems. Thus these conversions are the bottlenecks of the method, and any future improvements in RNS base conversions, or the use of particular residue systems, can immediately be applied.

116 citations
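
The structure described in the entry above (channel-wise work plus two base conversions) can be sketched in Python as follows. This is a simplified illustration, not the paper's algorithm: both base conversions are done here by exact CRT reconstruction on big integers, which a real design would avoid, and the names `to_rns`, `from_rns` and `rns_montgomery_mul` are hypothetical. It assumes x, y < N < M, 2N < M', and that N, M and M' are pairwise coprime, where M and M' are the products of the two bases.

```python
from math import prod

def to_rns(x, base):
    """Residues of x with respect to each modulus in base."""
    return [x % m for m in base]

def from_rns(residues, base):
    """Exact CRT reconstruction (stand-in for a real base conversion)."""
    M = prod(base)
    total = 0
    for r, m in zip(residues, base):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M

def rns_montgomery_mul(x, y, N, base_a, base_b):
    """Return r with r * M == x * y (mod N) and r < 2N, where M = prod(base_a)."""
    M = prod(base_a)
    xa, ya = to_rns(x, base_a), to_rns(y, base_a)
    xb, yb = to_rns(x, base_b), to_rns(y, base_b)
    # Step 1 (base A, channel-wise): q = -x*y*N^{-1} mod M
    q_a = [(-xi * yi * pow(N, -1, m)) % m for xi, yi, m in zip(xa, ya, base_a)]
    # First base conversion: A -> B
    q_b = to_rns(from_rns(q_a, base_a), base_b)
    # Step 2 (base B, channel-wise): r = (x*y + q*N) * M^{-1}
    r_b = [((xi * yi + qi * N) * pow(M, -1, m)) % m
           for xi, yi, qi, m in zip(xb, yb, q_b, base_b)]
    # Second base conversion: B -> integer (or back to base A in a full design)
    return from_rns(r_b, base_b)
```

Only the two conversions need all channels to cooperate; every other step touches each residue channel independently, which is the carry-free property the abstract refers to.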


Journal ArticleDOI
TL;DR: In this paper, the authors discuss employing alternative number systems to reduce power dissipation in portable devices and high-performance systems, focusing on two alternative number representations that are quite different from the conventional linear number representations, namely the logarithmic number system (LNS) and the residue number system.
Abstract: The authors discuss employing alternative number systems to reduce power dissipation in portable devices and high-performance systems. They focus on two alternative number systems that are quite different from the conventional linear number representations, namely the logarithmic number system (LNS) and the residue number system (RNS). Both have recently attracted the interest of researchers for their low-power properties. The authors address aspects of the conventional arithmetic representations, the impact of logarithmic arithmetic on power dissipation, and discuss the low-power aspects of residue arithmetic.

88 citations


Patent
19 Sep 2001
TL;DR: A pre-computation and dual-pass modular operation approach to implementing encryption protocols efficiently in electronic integrated circuits is described. Two passes of Montgomery's method are used for a modular operation associated with the encryption protocol, along with pre-computation of a constant based on a modulus.
Abstract: A pre-computation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with pre-computation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.

50 citations


Proceedings ArticleDOI
06 May 2001
TL;DR: The resulting implementations show that the RNS filters are smaller and consume less power than the corresponding ones in TCS, when the number of taps is larger than sixteen.
Abstract: In this work, a study on the implementation of FIR filters in the Residue Number System (RNS) is carried out. For different configurations, RNS filters are compared with filters realized in the traditional two's complement system (TCS) in terms of delay, area and power dissipation. The resulting implementations show that the RNS filters are smaller and consume less power than the corresponding ones in TCS, when the number of taps is larger than sixteen.

42 citations


Proceedings ArticleDOI
01 Jan 2001
TL;DR: A theory of RNS representations with redundant residues is developed and it is shown how representation parameters affect the speed and complexity of various arithmetic operations.
Abstract: Residue number system (RNS) representations that contain redundant moduli have been extensively studied with regard to their error checking properties. A second form of redundancy in RNS, that of redundant residues, has been applied in certain application contexts to gain speed and cost benefits. Such applications are developed in an ad hoc manner and there does not exist a theoretical framework for the latter variety of redundant RNS. We develop a theory of RNS representations with redundant residues and show how representation parameters affect the speed and complexity of various arithmetic operations. The theory parallels that of redundant signed-digit representations in that at issue are choice of 'residue sets' (akin to digit sets), encoding of residue sets, conversions between residue sets with different degrees of redundancy, including none, and algorithms for arithmetic on redundant-residue operands.

27 citations


Journal ArticleDOI
TL;DR: A novel very large scale integration architecture and the corresponding design methodology for a combinatorial adder-based residue number system (RNS) multiplier are presented, finding that the introduced architecture is more efficient in the area×time product sense.
Abstract: A novel very large scale integration architecture and the corresponding design methodology for a combinatorial adder-based residue number system (RNS) multiplier are presented in this paper. The proposed approach to residue multiplier design exploits the nonoccurring combinations of input bits to reduce the number of 1-bit full adders (FAs) required to compose an RNS multiplier. In particular, input bits which cannot be simultaneously asserted for any input residue value are organized into couples or triplets, which can be processed by OR gates instead of 1-bit adders, therefore reducing the RNS multiplier complexity. By comparing the performance and hardware complexity of the proposed residue multiplier to previously reported designs, it is found that the introduced architecture is more efficient in the area×time product sense. In fact, it is shown that a performance improvement in excess of 80% can be achieved in certain cases.

26 citations
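
The observation behind the entry above, that some bits of a residue can never be asserted together, can be checked with a short Python sketch. It only enumerates mutually exclusive bit pairs of a single residue modulo m; how the paper groups such bits inside the multiplier is not reproduced here, and the function name `exclusive_bit_pairs` is hypothetical.

```python
from itertools import combinations

def exclusive_bit_pairs(m):
    """Bit-position pairs (i, j) never both set in any residue 0..m-1.

    For such a pair, bit_i + bit_j equals bit_i OR bit_j, so an OR gate
    can stand in for a 1-bit adder on those two inputs.
    """
    nbits = (m - 1).bit_length()
    pairs = []
    for i, j in combinations(range(nbits), 2):
        if not any((r >> i) & 1 and (r >> j) & 1 for r in range(m)):
            pairs.append((i, j))
    return pairs
```

For m = 5 the residues are 000 to 100 in binary, so bit 2 never coincides with bit 0 or bit 1 and the function returns `[(0, 2), (1, 2)]`; for a power-of-two modulus such as 8 every combination occurs and the result is empty.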


Proceedings ArticleDOI
27 Mar 2001
TL;DR: A secure image coding scheme using the residue number system (RNS) is presented and tested and can be used to enhance the signal to noise ratio for received images corrupted with AWGN.
Abstract: A secure image coding scheme using the residue number system (RNS) is presented and tested. The proposed scheme can also be used as the base for a full security multiple access image communication system. Using RNS with multiple look-up tables for different moduli increases the security level of the system. This conversion technique can be used to enhance the signal to noise ratio for received images corrupted with AWGN.

19 citations


Proceedings ArticleDOI
06 May 2001
TL;DR: This paper presents a general conversion procedure based on an N-moduli set which can process both unsigned and signed numbers, and an architecture which efficiently implements the output conversion is illustrated.
Abstract: The use of the Residue Number System (RNS) in modern telecommunication and multimedia applications is becoming more and more important because it allows interesting advantages in terms of precision, power consumption and speed. Generally, the output conversion from residue to binary is the crucial point in effective realizations of application specific architectures based on residue arithmetic. This paper presents a general conversion procedure based on an N-moduli set. The algorithm can process both unsigned and signed numbers. Based on this algorithm an architecture which efficiently implements the output conversion is illustrated. The architecture has been mapped on an FPGA.

15 citations
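
Residue-to-binary conversion for unsigned and signed values, as discussed in the entry above, can be sketched in Python with the generic Chinese Remainder Theorem. This is not the paper's conversion procedure; the function names and the convention that the upper half of the dynamic range encodes negative numbers are assumptions made for illustration.

```python
from math import prod

def crt_to_integer(residues, moduli):
    """Unsigned CRT reconstruction into [0, M), M = product of the moduli."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(..., -1, m): Python 3.8+
    return total % M

def crt_to_signed(residues, moduli):
    """Signed interpretation: the upper half of [0, M) maps to negatives."""
    M = prod(moduli)
    x = crt_to_integer(residues, moduli)
    return x - M if x >= (M + 1) // 2 else x
```

With the pairwise coprime moduli 7, 8 and 9 (M = 504), the residues of -100 reconstruct to 404 as an unsigned value and to -100 under the signed interpretation.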


Proceedings ArticleDOI
01 Jan 2001
TL;DR: Hardware implementation of FIR digital filters can be achieved in the Xilinx Virtex FPGA by using residue number system (RNS) arithmetic techniques and the result is a highly efficient hardware realization of the desired FIR filter.
Abstract: Efficient implementation of FIR digital filters can be achieved in the Xilinx Virtex FPGA by using residue number system (RNS) arithmetic techniques. The hardware implementation of the RNS filter can be done using lookup tables (LUT) in either the block or distributed RAM of the Xilinx Virtex FPGA. The RAMs generated by the core generator are used for this purpose. The result is a highly efficient hardware realization of the desired FIR filter.

13 citations


Patent
18 Sep 2001
TL;DR: In this article, a precomputation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is described, which allows for high throughput for large encryption keys with a relatively small number of logical gates.
Abstract: A pre-computation and dual-pass modular operation approach to implement encryption protocols efficiently in electronic integrated circuits is disclosed. An encrypted electronic message is received and another electronic message generated based on the encryption protocol. Two passes of Montgomery's method are used for a modular operation that is associated with the encryption protocol along with pre-computation of a constant based on a modulus. The modular operation may be a modular multiplication or a modular exponentiation. Modular arithmetic may be performed using the residue number system (RNS) and two RNS bases with conversions between the two RNS bases. A minimal number of register files are used for the computations along with an array of multiplier circuits and an array of modular reduction circuits. The approach described allows for high throughput for large encryption keys with a relatively small number of logical gates.

Proceedings ArticleDOI
26 Sep 2001
TL;DR: The innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks is reported, using an enhanced index-transformation defined over Galois fields to efficiently support different wavelet filter instantiations without adding any extra cost or additional lookup tables (LUT).
Abstract: The design of high-performance, high-precision, real-time digital signal processing (DSP) systems, such as those associated with wavelet signal processing, is a challenging problem. This paper reports on the innovative use of the residue number system (RNS) for implementing high-end wavelet filter banks. The disclosed system uses an enhanced index-transformation defined over Galois fields to efficiently support different wavelet filter instantiations without adding any extra cost or additional lookup tables (LUT). An exhaustive comparison against existing two's complement (2C) designs for different custom IC technologies was carried out. These structures have been demonstrated to be well suited for field programmable logic (FPL) assimilation as well as for CBIC (cell-based integrated circuit) technologies.

Proceedings ArticleDOI
14 Aug 2001
TL;DR: Two algorithms are proposed to reduce the size of the residue-to-binary converter, which is the crucial part of the system, and a lookup table (LUT) partition technique is presented such that the most frequently accessed locations are stored in a smaller memory.
Abstract: In this paper, several low power techniques are proposed for the FPGA implementation of a distributed arithmetic and residue number system-based FIR filter. Two algorithms are proposed to reduce the size of the residue-to-binary converter, which is the crucial part of the system. The area, speed and power consumption of the filter are improved accordingly. Furthermore, a lookup table (LUT) partition technique is presented such that the most frequently accessed locations are stored in a smaller memory. The power consumption of the LUTs is reduced because accesses to smaller LUTs dissipate less power. The implementation results show a 20% power reduction by using the proposed methods.

Book
30 Jun 2001
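TL;DR: This book covers DSP implementation approaches, including programmable DSP, hardware multiplier and adder, distributed arithmetic, multiplier-less and residue number system based implementations, together with multiplication-free linear transforms and a framework for algorithmic and architectural transformations.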
Abstract: List of Figures. List of Tables. Foreword. Acknowledgments. Preface. 1. Introduction. 2. Programmable DSP Based Implementation. 3. Implementation Using Hardware Multiplier(s) and Adder(s). 4. Distributed Arithmetic Based Implementation. 5. Multiplier-Less Implementation. 6. Implementation of Multiplication-Free Linear Transforms. 7. Residue Number System Based Implementation. 8. A Framework for Algorithmic and Architectural Transformations. 9. Summary. References. Topic Index. About the Authors. Index.

Proceedings ArticleDOI
02 Sep 2001
TL;DR: Fast residue arithmetic multipliers based on a radix-2 signed-digit (SD) number are presented; 16-digit residue arithmetic multiplier circuits have been designed using VHDL, and the results show that high speed residue arithmetic circuits can be implemented.
Abstract: Fast residue arithmetic multipliers based on a radix-2 signed-digit (SD) number are presented. For a given modulus m, $2^p - 1 \le m \le 2^p + 2^{p-1} - 1$, in a residue number system (RNS), the modulo m addition is performed by using one or two p-digit SD adders, and the modulo m addition time is independent of the word length of operands. We propose two kinds of modulo m multipliers, which are constructed using a modulo m SD adder and a binary tree of the adders, and the modulo m multiplication is performed in a time proportional to $p$ and $\log_2 p$, respectively. 16-digit residue arithmetic multiplier circuits have been designed using VHDL, and the results show that high speed residue arithmetic circuits can be implemented.

Proceedings ArticleDOI
07 May 2001
TL;DR: An exhaustive comparison of the advantages of RNS-DA over the traditional two's complement design, 2C-DA, is carried out for field-programmable logic (FPL) and cell-based ASIC technologies, and shows that the reported RNS-DA methodology enjoys a significant performance advantage that increases with precision.
Abstract: The need for both speed and increased precision in modern digital signal processing (DSP) applications represents a serious implementation obstacle. The paper explores the arithmetic benefits provided by the residue number system (RNS) for the design of such systems. Specifically, the fusion of the RNS with the popular distributed arithmetic (DA) is considered for the implementation of a discrete wavelet transform (DWT) filter bank. An exhaustive comparison of the advantages of RNS-DA over the traditional two's complement design, 2C-DA, is carried out for field-programmable logic (FPL), and cell-based ASIC technologies. The results show that the reported RNS-DA methodology, compared to a traditional 2C-DA design, enjoys a significant performance advantage that increases with precision.
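
The combination of distributed arithmetic with a residue channel, as named in the abstract above, can be sketched in Python for one modulus. This is a generic bit-serial DA inner product reduced modulo m, not the paper's filter bank design; it assumes unsigned samples of `nbits` bits, and the function names are hypothetical.

```python
def da_lut(coeffs, m):
    """Lookup table: entry addr holds the sum of coeffs[k] for set bits k, mod m."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1) % m
            for addr in range(1 << n)]

def da_inner_product_mod(coeffs, samples, nbits, m):
    """Compute sum(coeffs[k] * samples[k]) mod m one bit plane at a time."""
    lut = da_lut(coeffs, m)
    acc = 0
    for b in range(nbits):
        # Gather bit b of every sample into a LUT address
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        # Weight the table entry by 2^b and accumulate modulo m
        acc = (acc + (lut[addr] << b)) % m
    return acc
```

In an RNS-DA filter one such table-and-accumulator pair would run per modulus, and because each table stores values already reduced modulo m, its word width depends on the modulus rather than on the full precision.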

Journal ArticleDOI
N. Burgess1
01 Jan 2001
TL;DR: A new Chinese remainder theorem (CRT)-based technique for the conversion of numbers in residue number system (RNS) format to binary representation is proposed that employs a high-radix SRT division-like architecture.
Abstract: A new Chinese remainder theorem (CRT)-based technique for the conversion of numbers in residue number system (RNS) format to binary representation is proposed that employs a high-radix SRT division-like architecture. The major benefit of the new technique is that it permits the efficient conversion of residue numbers with many moduli. A k-modulus RNS converter returning a w-bit result employs a $([\log_2 k]+1)$-bit carry-propagate adder, a ROM with $[\log_2 k]+3$ address bits, a $(w+[\log_2 k])$-bit borrow-save subtracter, and an M-bit carry-propagate adder. This comprises less hardware than any other reported general modulus CRT-based converter.

Proceedings ArticleDOI
30 Sep 2001
TL;DR: An advanced architecture for a residue number system (RNS) based CDMA system for high-rate data transmission by combining RNS representation, PSK/QAM modulation and orthogonal modulation is presented.
Abstract: This paper presents an advanced architecture for a residue number system (RNS) based CDMA system for high-rate data transmission by combining RNS representation, PSK/QAM modulation and orthogonal modulation. The proposed system uses a smaller spreading factor than RNS based CDMA, and the performance is comparable. The modified system is simulated extensively for different channel conditions, and it is found that it can be used for faster data transmission without affecting the system performance.

Proceedings ArticleDOI
14 Aug 2001
TL;DR: Highly efficient implementations of FIR digital filters in Xilinx Virtex FPGA's are possible by using scaling, order augmentation and optimized CSD techniques for fixed coefficient multipliers.
Abstract: Highly efficient implementations of FIR digital filters in Xilinx Virtex FPGA's are possible by using scaling, order augmentation and optimized CSD techniques for fixed coefficient multipliers. Addition of Residue Number System (RNS) arithmetic techniques to this approach results in further reduction in FPGA resources particularly when large input and output word lengths are required. RNS is particularly attractive when key operations can be carried out with Look-Up-Table (LUT) techniques using either the block or distributed RAM's in FPGA's or the small LUT's available in each CLB of the Xilinx Virtex FPGA's.

Proceedings ArticleDOI
02 Sep 2001
TL;DR: In this article, a high level description of an RNS division algorithm is proposed, and a general hardware architecture of the algorithm for division by a constant as well as its application to fractal image coding are also presented.
Abstract: Division, sign detection and number comparison are the more difficult operations in residue number systems (RNS). These shortcomings limited most RNS implementations to additions, subtractions and multiplications. In this paper, a high level description of an RNS division algorithm is proposed. A general hardware architecture of the algorithm for division by a constant as well as its application to fractal image coding are also presented.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: An algorithm is described in which long wordlength integers are represented by short-wordlength Montgomery (1985) residues, and architectures are proposed that take advantage of the parallelism afforded by this scheme.
Abstract: This paper considers the evaluation of long wordlength modular products. An algorithm is described in which long wordlength (e.g. 1024-bit) integers are represented by short-wordlength (e.g. 32-bit) Montgomery (1985) residues. Long integer modular multiplication is performed using only short-wordlength Montgomery operations. Architectures are proposed that take advantage of the parallelism afforded by this scheme.

Proceedings ArticleDOI
08 Nov 2001
TL;DR: A new concept of using residue arithmetic, which is recognized in optical parallel processing for its inherent parallelism, in an optical flip-flop or memory element is proposed.
Abstract: Residue arithmetic is a well-recognized mathematical approach in optical parallel processing because of its inherent parallelism. In this paper we propose a new concept of using residue arithmetic in an optical flip-flop or memory element.

Patent
29 Jan 2001
TL;DR: In this paper, the authors proposed a modulation scheme using the residue number system to modulate binary information for the spread signal using a modulator of the transmitting section of a communication system.
Abstract: A communication system (30) having a transmitting section (31) and a receiving section (32) to transmit and to receive, respectively, a spread signal. The communication system (30) applies a modulation scheme using the residue number system to modulate binary information for the spread signal using a modulator (38) of the transmitting section (31). Residue sets are derived from a plurality of bits provided by the binary information. Each residue set corresponds to a residue symbol and has at least one non-redundant residue. A fixed number of least significant binary bits of each residue set are then modulated to form a modulated symbol. An orthogonal code sequence, associated with remaining bits of the binary bits of each residue set, is then selected to spread the modulated symbol on a residue channel for transmission. At the receiving section (32), a demodulator (39) demodulates the spread signal to obtain the binary information.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: Using the MRRNS mapping technique, computation is carried out over identical channels using modular arithmetic, and an extension to the complex MRRNS which allows the detection and correction of faults in the mapped channels is discussed.
Abstract: This paper discusses a fault-tolerant procedure for computing with complex signals based on the modulus replication residue number system (MRRNS). Using the MRRNS mapping technique, we are able to compute over identical channels using modular arithmetic. We discuss an extension to the complex MRRNS which allows the detection and correction of faults in the mapped channels. The technique provides redundancy by adding channels to the existing parallel structure. The paper discusses the overall fault tolerant technique and provides an application example of a complex adaptive filter for a wireless LAN system.

Proceedings ArticleDOI
01 Mar 2001
TL;DR: The architecture and VLSI implementation of a new architecture for a multiply-accumulate unit based on Residue Number System (RNS) is discussed, and the analysis indicates that the design is generally quite competitive.
Abstract: This paper discusses the VLSI implementation of a new architecture for a multiply-accumulate unit based on Residue Number System (RNS). The architecture and VLSI implementation of an arbitrary-moduli RNS MAC are given. The cost and performance are analyzed with respect to other designs, and the analysis indicates that the design is generally quite competitive.

01 Jan 2001
TL;DR: A possible RNS realization of the filter following the analog-to-RNS conversion stage is studied; the filter is designed for high throughput and low power dissipation, and its timing parameters should set the design constraints in terms of architecture, technology and speed for the converter.
Abstract: The analog to digital front-end of high speed acquisition systems requires fast A/D conversion and fast filtering. These performances are difficult to obtain if the acquisition calls for large wordlength. Great design and implementation efforts are needed in order to match the required throughput. The Residue Number System (RNS) appears to be very attractive, but it requires an expensive conversion from binary to RNS representation. A possible solution is the direct conversion from analog to RNS. In this paper a Residue Number System approach to the realization of a high speed front-end for large wordlength is proposed. We study a possible RNS realization of the filter following the analog to RNS conversion stage. The filter is designed to have high-throughput and low power dissipation, and its timing parameters should set the design constraints in terms of architecture, technology and speed for the converter.

Proceedings ArticleDOI
02 Sep 2001
TL;DR: Montgomery's algorithm has been broken into two concurrent non-interleaved multiplication operations, and the architectures derived from this algorithm are systolic and need near communication links only, making them very well suited for VLSI implementation.
Abstract: Algorithms and architectures for performing modular multiplication operations are important in cryptography and Residue Number Systems. In this paper Montgomery's algorithm has been broken into two concurrent non-interleaved multiplication operations. The architectures derived from this algorithm are systolic and need near communication links only. Thus, they are very well suited for VLSI implementation. The presented architectures offer great flexibility in finding the best trade-off between hardware cost and throughput rate by changing the digit size.

Journal ArticleDOI
TL;DR: The Montgomery residue number system for long word-length arithmetic is introduced, which represents a long integer as a set of smaller Montgomery residues; long integer arithmetic can then be performed using hardware-efficient Montgomery operations applied independently to each of the residues.
Abstract: The Montgomery residue number system (MRNS) for long word-length arithmetic is introduced. MRNS, a modification of the residue number system (RNS), represents a long integer as a set of smaller Montgomery residues. Long integer addition, subtraction and multiplication can then be performed using hardware-efficient Montgomery operations applied independently to each of the residues. An MRNS hardware architecture suitable for incorporation on a microprocessor data path is also proposed.
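
Channel-wise Montgomery arithmetic of the kind the abstract above describes can be sketched in Python as follows. The layout is an assumption made for illustration: each channel stores the residue of x modulo an odd modulus m in Montgomery form (x·R mod m, with R = 2^8), and products use the standard REDC reduction. The constants `K` and `R` and all function names are hypothetical, and the paper's actual MRNS definition may differ.

```python
K = 8
R = 1 << K   # Montgomery radix; every modulus must be odd and smaller than R

def neg_inverse(m):
    """-m^{-1} mod R, the per-channel constant used by REDC."""
    return (-pow(m, -1, R)) % R

def redc(t, m):
    """Montgomery reduction: returns t * R^{-1} mod m for 0 <= t < m * R."""
    u = ((t & (R - 1)) * neg_inverse(m)) & (R - 1)
    r = (t + u * m) >> K          # exact division by R
    return r - m if r >= m else r

def to_mrns(x, moduli):
    """Residues of x in Montgomery form, one per channel."""
    return [(x % m) * R % m for m in moduli]

def mrns_mul(a, b, moduli):
    """Independent Montgomery product in every channel."""
    return [redc(ai * bi, m) for ai, bi, m in zip(a, b, moduli)]

def mrns_to_residues(a, moduli):
    """Leave Montgomery form, giving ordinary residues."""
    return [redc(ai, m) for ai, m in zip(a, moduli)]
```

Each channel needs only short shifts, masks and multiplications, with no division by the modulus, which is the hardware appeal of working with Montgomery residues.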

Proceedings ArticleDOI
02 Sep 2001
TL;DR: It is shown that the RNS-enabled 1D-DCT provides a throughput improvement over the equivalent binary system of up to 62% when 8-bit moduli are used.
Abstract: This paper shows the implementation of the one dimensional discrete cosine transform (1D-DCT) based on the residue number system (RNS). The 1D-DCT has been derived by the application of a previously developed scaled fast cosine transform (FCT) algorithm that requires a reduced number of multiplications. The processor has been modeled at structural level using VHDL and implemented in Altera FLEX10K devices. This paper shows that the RNS-enabled 1D-DCT provides a throughput improvement over the equivalent binary system of up to 62% when 8-bit moduli are used. This is achieved due to the synergy between RNS and modern field programmable logic (FPL) device families.