
Showing papers on "Residue number system" published in 2012


Journal ArticleDOI
TL;DR: An experimental analysis of the scalability, based on OpenCL descriptions of the proposed algorithms, suggests that further advantage can be obtained from the proposed RNS approach for GPUs and EC curves supported by underlying finite fields of smaller size, with respect to implementations on general-purpose multi-cores.
Abstract: Acceleration of cryptographic applications on massively parallel computing platforms, such as Graphics Processing Units (GPUs), is a real challenge for practical implementations. In this paper, we propose a parallel algorithm for Elliptic Curve (EC) point multiplication in order to compute EC cryptography on these platforms. The proposed approach relies on the Residue Number System (RNS) to extract parallelism from high-precision integer arithmetic. Results suggest a maximum throughput of 9827 EC multiplications per second and a minimum latency of 29.2 ms for a 224-bit underlying field on a commercial Nvidia 285 GTX GPU. Improvements of up to an order of magnitude in latency and 122% in throughput are achieved over other approaches reported in the related art. An experimental analysis of the scalability, based on OpenCL descriptions of the proposed algorithms, suggests that further advantage can be obtained from the proposed RNS approach for GPUs and EC curves supported by underlying finite fields of smaller size, with respect to implementations on general-purpose multi-cores.
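The way RNS exposes parallelism in wide integer arithmetic can be illustrated with a minimal Python sketch. It is not the paper's GPU algorithm: the four-modulus base `MODULI` and the helper names are assumptions chosen for readability, and a real 224-bit implementation would need many word-sized moduli. Each channel is computed independently of the others, which is what a GPU would map onto separate threads. Python 3.8 or later is assumed for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod

# Hypothetical small base of pairwise co-prime moduli (illustration only).
MODULI = (251, 253, 255, 256)


def to_rns(x, moduli=MODULI):
    """Forward conversion: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Channel-wise addition; no carry crosses between channels."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Channel-wise multiplication; every channel is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Reverse conversion with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m) is the inverse mod m
    return total % M


a, b = 12345, 6789
product = from_rns(rns_mul(to_rns(a), to_rns(b)))
total = from_rns(rns_add(to_rns(a), to_rns(b)))
```

The results are only meaningful while the true value stays below the product of the moduli (the dynamic range), which is why the base has to be sized to the operand width.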

59 citations


Journal Article
TL;DR: A high-speed pairing coprocessor using Residue Number System (RNS) which is intrinsically suitable for parallel computation and which outperforms all reported hardware and software designs.
Abstract: In this paper, we present a high-speed pairing coprocessor using Residue Number System (RNS) which is intrinsically suitable for parallel computation. This work improves the design of Cheung et al. [11] using a carefully selected RNS base and an optimized pipeline design of the modular multiplier. As a result, the cycle count for a modular reduction has been halved. When combined with lazy reduction, Karatsuba-like formulas and optimal pipeline scheduling, a 128-bit optimal ate pairing computation can be completed in less than 100,000 cycles. We prototype the design on a Xilinx Virtex-6 FPGA using 5237 slices and 64 DSPs; a 128-bit pairing is computed in 0.358 ms running at 230 MHz. To the best of our knowledge, this implementation outperforms all reported hardware and software designs.

28 citations


Book ChapterDOI
16 May 2012
TL;DR: This work presents a high-speed pairing coprocessor using Residue Number System (RNS), which is intrinsically suitable for parallel computation; it improves the design of Cheung et al. using a carefully selected RNS base and an optimized pipeline design of the modular multiplier.
Abstract: In this paper, we present a high-speed pairing coprocessor using Residue Number System (RNS) which is intrinsically suitable for parallel computation. This work improves the design of Cheung et al. [11] using a carefully selected RNS base and an optimized pipeline design of the modular multiplier. As a result, the cycle count for a modular reduction has been halved. When combined with lazy reduction, Karatsuba-like formulas and optimal pipeline scheduling, a 128-bit optimal ate pairing computation can be completed in less than 100,000 cycles. We prototype the design on a Xilinx Virtex-6 FPGA using 5237 slices and 64 DSPs; a 128-bit pairing is computed in 0.358 ms running at 230 MHz. To the best of our knowledge, this implementation outperforms all reported hardware and software designs.

28 citations


Proceedings ArticleDOI
03 Jul 2012
TL;DR: A technique based on the residue number system (RNS) with a diminished-1 encoded channel has been used to implement a finite impulse response (FIR) digital filter.
Abstract: A technique based on the residue number system (RNS) with a diminished-1 encoded channel has been used to implement a finite impulse response (FIR) digital filter. The proposed RNS architecture of the filter consists of three main blocks: a forward converter, a reverse converter and an arithmetic processor for each channel. An architecture for a residue-to-binary (reverse) converter with a diminished-1 encoded channel is proposed. In addition, a systolic design is used in all RNS channels for the efficient realization of the FIR filter. A numerical example illustrates the principles of diminished-1 residue arithmetic, signal processing, and decoding for FIR filters.
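As a rough illustration of diminished-1 arithmetic in a modulo 2^n + 1 channel, the sketch below models the encoding and a single addition in Python. It follows the textbook diminished-1 convention (a nonzero value A is stored as A - 1 in n bits, with zero flagged separately) and is not taken from the paper; the function names and the `(is_zero, digits)` tuple layout are assumptions.

```python
def dim1_encode(a):
    """Encode a in [0, 2^n] as (is_zero, a - 1); zero gets its own flag."""
    return (True, 0) if a == 0 else (False, a - 1)


def dim1_decode(enc):
    """Inverse of dim1_encode."""
    is_zero, d = enc
    return 0 if is_zero else d + 1


def dim1_add(x, y, n):
    """Add two diminished-1 operands modulo 2^n + 1.

    The n-bit sum of the stored digits is corrected by the *inverted*
    carry-out (the inverted end-around carry used in hardware adders).
    """
    zx, dx = x
    zy, dy = y
    if zx:
        return y
    if zy:
        return x
    mask = (1 << n) - 1
    s = dx + dy
    carry = s >> n
    if carry == 0 and s == mask:
        return (True, 0)  # A + B equals 2^n + 1, so the result is zero
    return (False, (s + (1 - carry)) & mask)


n = 3  # channel modulus 2^3 + 1 = 9
example = dim1_decode(dim1_add(dim1_encode(7), dim1_encode(5), n))
```

The benefit of the encoding is that nonzero operands fit in n bits instead of n + 1, at the cost of handling zero as a special case.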

26 citations


Journal ArticleDOI
TL;DR: It is shown that the D-RNS-NPM guarantees the reconstruction of any number from a subset of at least n-z fragments, which appears to be relevant in wireless sensor networks, and an application in this area is envisioned.
Abstract: We consider a problem where a physical quantity is repeatedly measured by replicated devices, yielding a stream of numerical data. Data are stored within the measuring devices and sporadically retrieved by a user. To avoid data losses due to large data streams with insufficient memory, the data are split into fragments, each of which is a compressed encoding of a number in the stream, and different fragments are stored in different, replicated devices. The devices are not allowed to communicate with each other, and they produce the local streams of fragments from independent measurements. Given the independence of measurements, the fragments are corrupted by independent errors, which are likely to be small integers, although errors of unbounded magnitude may also occur due to failures or to interferences. As devices may fail, or communication may be unreliable, the user may be unable to download fragments from some of the replicated devices, leading to fragment erasures. Our approach to the problem is to encode the data in a Residue Number System with Nonpairwise-Prime Moduli, named D-RNS-NPM. With n moduli and n residue digits, every replicated device is tied to a different modulus, with which it produces and stores a residue digit (i.e., a fragment) from the local measurement. Assuming an upper bound z, with z < n, to the number of erasures, we show that the D-RNS-NPM guarantees the reconstruction of any number from a subset of at least n-z fragments. If fragments bear errors, whose magnitude is unrestricted for at most one error and upper bounded by a small δ for the others, reconstruction is within an approximation of ±δ, and this property is retained when errors cannot be detected due to the unbounded error multiplicity. The time complexity of the decoding algorithm is polynomial. This problem appears to be relevant in wireless sensor networks, and an application in this area is envisioned.

21 citations


Proceedings ArticleDOI
01 Dec 2012
TL;DR: A detailed study on time and hardware requirements of residue-to-binary converters (RC) and residue arithmetic units (RAU) based on each of the studied sets, categorized by the dynamic range (DR) they provide.
Abstract: This paper presents a comparative study on different moduli sets that are used in the Residue Number System (RNS). Choosing a proper moduli set is one of the most important issues in RNS. It greatly affects the performance of the whole system. Many moduli sets have been introduced recently, each of which has its own advantages and disadvantages. Therefore, the need for a survey and a comparison of these sets is obvious. This paper presents a detailed study on time and hardware requirements of residue-to-binary converters (RC) and residue arithmetic units (RAU), based on each of the studied sets, categorized by the dynamic range (DR) they provide. Then, the most efficient moduli set for each DR is suggested. The effect of the number of moduli on the system's performance and complexity is also discussed. Our research is aimed for designs whose main goal and strategy is balanced

20 citations


Journal ArticleDOI
01 Jun 2012
TL;DR: Experiments based on the implementation of Finite Impulse Response filters show that the use of RNS together with suitable moduli sets optimally fits the 6-input LUTs in the latest-generation FPGA architectures.
Abstract: In this paper, optimized Residue Number System (RNS) arithmetic blocks that better exploit some of the architectural characteristics of the latest-generation FPGAs are presented. The implementations of modulo m adders, modulo m constant and general multipliers, and input and output converters are presented. These architectures are based on moduli sets chosen in order to optimally use the 6-input Look-Up Tables (LUTs) available in the Configurable Logic Blocks (CLBs) of the new-generation FPGAs. Experiments based on the implementation of Finite Impulse Response (FIR) filters characterized by different numbers of taps and wordlengths show that the use of RNS together with suitable moduli sets optimally fits the 6-input LUTs in the latest-generation FPGA architectures.

19 citations


Journal ArticleDOI
TL;DR: The proposed scaling algorithm breaks the inter-modulus dependency and produces a parallel architecture incurring no more than two logarithmic shifters, one-stage of carry-save adder and a modulo adder in any modulus channel.
Abstract: Variable scaling by a power-of-two factor is the backbone operation of floating point arithmetic and is also commonly used in fixed-point digital signal processing (DSP) systems for overflow prevention. While this operation can be readily performed in the binary number system, it is extremely difficult to implement in the residue number system (RNS). In the absence of an efficient solution to scale an integer directly in the residue domain by a programmable power-of-two factor, an improvised architecture cascading fixed RNS scaling-by-two blocks has been previously presented. However, its area complexity and time complexity are worse than those of a hybrid solution that leverages binary shifting through efficient residue-to-binary and binary-to-residue conversions. This paper presents a new algorithm for scaling in {2^n - 1, 2^n, 2^n + 1} RNS by a programmable power-of-two factor. The proposed scaling algorithm breaks the inter-modulus dependency and produces a parallel architecture incurring no more than two logarithmic shifters, one stage of carry-save adder and a modulo adder in any modulus channel. Compared with the only available and most efficient hybrid programmable power-of-two scaler for the same moduli set, our proposed design has not only significantly reduced the critical path delay by 52.2%, 52.8%, 53.1%, and 53.2% for n = 5, 6, 7, and 8, respectively, but also cut down the area by 14.1% on average based on a CMOS 0.18 μm standard-cell implementation. In addition, our proposed design has effectively reduced the total power consumption by 43.8% and the leakage power by 20.6% on average.
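To make the target behaviour concrete, the Python sketch below models the hybrid baseline the abstract mentions (residue-to-binary conversion, a binary shift, then binary-to-residue conversion) for the moduli set read as {2^n - 1, 2^n, 2^n + 1}. It is a functional reference only, not the paper's direct residue-domain algorithm. Unsigned operands and truncation toward zero are assumptions, and all function names are hypothetical. Python 3.8 or later is assumed.

```python
from math import prod


def moduli_set(n):
    """The three-moduli set {2^n - 1, 2^n, 2^n + 1} (pairwise co-prime)."""
    return (2**n - 1, 2**n, 2**n + 1)


def to_rns(x, n):
    return tuple(x % m for m in moduli_set(n))


def from_rns(residues, n):
    """Reverse conversion via the Chinese Remainder Theorem."""
    moduli = moduli_set(n)
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)
    return total % M


def scale_pow2_hybrid(residues, s, n):
    """Scale by 2^s by going through binary: decode, shift right, re-encode."""
    x = from_rns(residues, n)
    return to_rns(x >> s, n)


n = 5                                  # moduli 31, 32, 33
scaled = scale_pow2_hybrid(to_rns(20000, n), 3, n)
value = from_rns(scaled, n)            # floor(20000 / 8)
```

A direct residue-domain scaler has to produce the same residues as `scale_pow2_hybrid` without the two full conversions, which is where the inter-modulus dependency described in the abstract comes from.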

17 citations


Proceedings ArticleDOI
05 Sep 2012
TL;DR: Improved units for addition, subtraction, and multiplication in RNS for modulo {2^n ± k} are proposed, encouraging the development of moduli sets with channels other than the traditional {2^n}, {2^n ± 1} modulo.
Abstract: Recently, new Residue Number System (RNS) moduli sets have been proposed in order to increase the dynamic range and reduce the width of channels, thereby reducing the processing time and further exploiting the carry-free characteristic of modular arithmetic. In this paper we propose improved units for addition, subtraction, and multiplication in RNS for modulo {2^n ± k}. With this work, the somewhat disregarded field of RNS unit design is covered, encouraging the development of moduli sets with channels other than the traditional {2^n}, {2^n ± 1} modulo. In order to evaluate the performance of the proposed structures, they are compared with the well-known units for modulo {2^n}, {2^n ± 1}, and {2^n ± 3}. These structures make it possible to implement generic units for modulo {2^n ± k}; in the case of modular multiplication, the same critical path delay is achieved with an increase of merely 4% in area resources when compared with the dedicated state-of-the-art structure for modulo {2^n ± 3}.

14 citations


Posted Content
TL;DR: In this article, a new moduli set selection technique is proposed to improve bit efficiency, which can be used to construct a residue number system for a digital signal processing environment, and the novelty of the architecture is shown by comparison with the different schemes reported in the literature.
Abstract: Residue Number System (RNS), which originates from the Chinese Remainder Theorem, offers a promising future in VLSI because of its carry-free operations in addition, subtraction and multiplication. This property of RNS is very helpful in reducing the complexity of calculation in many applications. A residue number system represents a large integer using a set of smaller integers, called residues. However, the area overhead, cost and speed depend not only on the word length but also on the selection of moduli, which is a crucial step for a residue system. This parameter determines bit efficiency, area, frequency, etc. In this paper a new moduli set selection technique is proposed to improve bit efficiency, which can be used to construct a residue system for a digital signal processing environment. Subsequently, it is theoretically proved and illustrated using examples that the proposed solution gives better results than the schemes reported in the literature. The novelty of the architecture is shown by comparison with the different schemes reported in the literature. Using the novel moduli set, a guideline for a Reconfigurable Processor is presented that can process some predefined functions. As RNS minimizes carry propagation, the scheme can be implemented in Real Time Signal Processing and other fields where high-speed computations are required.

12 citations


Posted Content
TL;DR: In this article, the use of Residue Number System (RNS) arithmetic to accelerate modular operations was explored for solving the discrete logarithm problem on groups of size 100 to 1000 bits.
Abstract: In the context of cryptanalysis, computing discrete logarithms in large cyclic groups using index-calculus-based methods, such as the number field sieve or the function field sieve, requires solving large sparse systems of linear equations modulo the group order. Most of the fast algorithms used to solve such systems --- e.g., the conjugate gradient or the Lanczos and Wiedemann algorithms --- iterate a product of the corresponding sparse matrix with a vector (SpMV). This central operation can be accelerated on GPUs using specific computing models and addressing patterns, which increase the arithmetic intensity while reducing irregular memory accesses. In this work, we investigate the implementation of SpMV kernels on NVIDIA GPUs, for several representations of the sparse matrix in memory. We explore the use of Residue Number System (RNS) arithmetic to accelerate modular operations. We target linear systems arising when attacking the discrete logarithm problem on groups of size 100 to 1000 bits, which includes the relevant range for current cryptanalytic computations. The proposed SpMV implementation contributed to solving the discrete logarithm problem in GF($2^{619}$) and GF($2^{809}$) using the FFS algorithm.

Journal ArticleDOI
TL;DR: Results indicate that the proposed general SUT-RNS multiplier for the moduli set {2^n - 1, 2^n, 2^n + 1} is a fast fault-tolerant multiplier which outperforms existing RRNS multipliers in area, power and energy/operation.
Abstract: A residue number system (RNS) which utilises redundant encoding for the residues is called a redundant residue number system (RRNS). It can accelerate multiplication, which is a high-latency operation. Using stored-unibit-transfer (SUT) redundant encoding in RRNS, called SUT-RNS, has been shown to be an efficient number system for arithmetic operations. Radix-2^h SUT-RNS multiplication has been proposed in previous studies for modulo 2^n - 1, but it has not been generalised for all moduli lengths (n) and radices (r = 2^h). Also, SUT-RNS multiplication for modulo 2^n + 1 has not been discussed. In this study the authors remove these weaknesses by proposing general radix-2^h SUT-RNS multiplication for the moduli set {2^n - 1, 2^n, 2^n + 1}. Moreover, the authors demonstrate that their approach enables a unified design for the moduli set multipliers, which results in fault-tolerant SUT-RNS multipliers with low hardware redundancy. Results indicate that the proposed general SUT-RNS multiplier for the moduli set {2^n - 1, 2^n, 2^n + 1} is a fast fault-tolerant multiplier which outperforms existing RRNS multipliers in area, power and energy/operation.

Proceedings ArticleDOI
02 May 2012
TL;DR: The preliminary results show the capability of the proposed OHR-based system to increase operation speed, decrease power consumption, simplify the designed hardware and finally decrease chip production for image processing.
Abstract: In this paper, the use of the one hot residue (OHR) number system for digital image processing and its application to designing fast, low-area image processors are studied. Since digital image filtering in the space domain requires many algebraic computations, we propose a system with high computing speed based on OHR. Using the proposed system can significantly enhance the speed of the computation operations and hence of the system. In the proposed image coding scheme the implementation delay is equal to the delay of a transistor, which is a good improvement in comparison with conventional methods such as the direct method. Other advantages of using the one hot residue (OHR) number system are its simplicity of implementation and minimum power dissipation. The design of adders and multipliers commonly used for filtering with the selected moduli set {2n−1+1, 2n−1, 2n} for one-hot coding is presented here. MATLAB was used for simulation studies while VLSI tools were employed for design analysis. The preliminary results show the capability of the proposed method to increase operation speed, decrease power consumption, simplify the designed hardware and finally decrease chip production for image processing.

Journal ArticleDOI
31 Oct 2012
TL;DR: A new moduli set selection technique is proposed to improve bit efficiency which can be used to construct a residue system for digital signal processing environment and it is theoretically proved and illustrated using examples, that the proposed solution gives better results than the schemes reported in the literature.
Abstract: Residue Number System (RNS), which originates from the Chinese Remainder Theorem, offers a promising future in VLSI because of its carry-free operations in addition, subtraction and multiplication. This property of RNS is very helpful in reducing the complexity of calculation in many applications. A residue number system represents a large integer using a set of smaller integers, called residues. However, the area overhead, cost and speed depend not only on the word length but also on the selection of moduli, which is a crucial step for a residue system. This parameter determines bit efficiency, area, frequency, etc. In this paper a new moduli set selection technique is proposed to improve bit efficiency, which can be used to construct a residue system for a digital signal processing environment. Subsequently, it is theoretically proved and illustrated using examples that the proposed solution gives better results than the schemes reported in the literature. The novelty of the architecture is shown by comparison with the different schemes reported in the literature. Using the novel moduli set, a guideline for a Reconfigurable Processor is presented that can process some predefined functions. As RNS minimizes carry propagation, the scheme can be implemented in Real Time Signal Processing and other fields where high-speed computations are required.

Proceedings ArticleDOI
12 Mar 2012
TL;DR: Knowledge of the a priori probabilities of residue generation is utilized to implement a probability-based Distance-Aware Direct Mapping scheme for M-ary modulation, which further improves the error performance of the RRNS-STBC coding scheme.
Abstract: In this paper, we propose a novel application of Redundant Residue Number System (RRNS) codes to Space-Time Block Code (STBC) design. Based on the so-called “Direct-Mapping” scheme, the link between residues and complex signal constellations is optimized. We derive upper bounds on the codeword error probability of RRNS-STBC and characterize its achievable diversity gain assuming maximum likelihood decoding (MLD). Knowledge of the a priori probabilities of residue generation is utilized to implement a probability-based Distance-Aware Direct Mapping scheme for M-ary modulation, which further improves the error performance of the RRNS-STBC coding scheme.


Journal ArticleDOI
TL;DR: Two novel architectures are proposed for multi-modulus adders that support the most common moduli cases in RNS channels, that is, modulo 2^n - 1, 2^n and 2^n + 1.

Proceedings ArticleDOI
20 May 2012
TL;DR: Residue Number System advantages are quantitatively illustrated by considering a timing model with two non-Gaussian distributions, and it is shown that for bases where all moduli channels are candidates to contain the critical path of the RNS circuit, the delay variation is significantly reduced.
Abstract: In this paper the utilization of Residue Number System (RNS) is investigated as a tool for variation-tolerant design. In particular, circuits using various RNS bases are compared to the equivalent binary structures in terms of their sensitivity to the variation of process parameters. Furthermore, RNS advantages are quantitatively illustrated by considering a timing model with two non-Gaussian distributions. It is shown that for bases where all moduli channels are candidates to contain the critical path of the RNS circuit, the delay variation is significantly reduced when compared to the equivalent binary structures.

Journal ArticleDOI
01 Jun 2012
TL;DR: Two methodologies for designing single constant multipliers in Residue Number System for moduli of the 2^n - 1, 2^n and 2^n + 1 forms are proposed, and both result in circuits that are shown to be efficient in terms of area and delay.
Abstract: Architectures for designing single constant multipliers in Residue Number System (RNS) for moduli of the 2^n - 1, 2^n and 2^n + 1 forms are introduced, with the constant operand being recoded in Signed-Digit representation. Two methodologies are proposed. In the first one a straightforward implementation of the shift-and-add algorithm is adopted, while in the second one a graph-based approach is used. Both methodologies result in circuits that are shown to be efficient in terms of area and delay.

Proceedings ArticleDOI
05 Sep 2012
TL;DR: New efficient modulo 2^n+1 residue generators are proposed, which find applicability as forward converters from the binary to the residue number system and in the design of self-checking digital systems.
Abstract: In this work new efficient modulo 2^n+1 residue generators are proposed. The input operands are divided into n-bit vectors which are added by an inverted end-around-carry carry-save adder tree and a final-stage diminished-1 modulo 2^n+1 adder. The conversion of the proposed residue generators to configurable modulo 2^n±1 ones is also discussed. Modulo 2^n±1 residue generators find applicability as forward converters from the binary to the residue number system and in the design of self-checking digital systems.
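The arithmetic identity behind such a residue generator is that 2^n is congruent to -1 modulo 2^n + 1, so the n-bit vectors of the input can be combined with alternating signs. The Python sketch below models only that identity; it does not model the inverted end-around-carry adder tree or the diminished-1 final adder, and the function name is an assumption.

```python
def residue_mod_2n_plus_1(x, n):
    """Return x mod (2^n + 1) by folding n-bit vectors with alternating signs.

    Since 2^n = -1 (mod 2^n + 1), the i-th n-bit vector of x carries
    weight (-1)^i.
    """
    modulus = (1 << n) + 1
    mask = (1 << n) - 1
    acc = 0
    sign = 1
    while x:
        acc += sign * (x & mask)   # add even-indexed vectors, subtract odd ones
        sign = -sign
        x >>= n
    return acc % modulus           # Python's % always returns a non-negative value


r = residue_mod_2n_plus_1(0xBEEF, 4)   # 0xBEEF mod 17
```

In hardware the subtractions are typically replaced by additions of complemented vectors plus a correction constant, which is where the inverted end-around carry comes in.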

Proceedings ArticleDOI
15 Mar 2012
TL;DR: Novel 3-2, 4-2 and 5-2 compressors are presented for efficient design and are used as the basic building blocks for the proposed binary to residue converter designs.
Abstract: In this paper, a binary to residue number system architecture based on the 2^k - 1 moduli set is presented. For the integer modulo operation (X mod m), (p, 2) compressors are used, where m is restricted to the values 2^k - 1 for any value of k > 1 and X is a 16-bit number. Novel 3-2, 4-2 and 5-2 compressors are presented for efficient design and are used as the basic building blocks for the proposed binary to residue converter designs. The 3-2, 4-2 and 5-2 compressors are used in place of half adders and full adders to reduce the delay, power consumption and area of the circuit. The 4-2 and 5-2 compressor cells can operate reliably in any tree-structured parallel multiplier at very low supply voltages. The proposed converter can be implemented with a fast and simple architecture and requires less hardware.
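The reduction that the compressor tree performs can be modelled in a few lines of Python. Because 2^k is congruent to 1 modulo 2^k - 1, the k-bit fields of X can simply be summed and any overflow folded back in (an end-around carry). The sketch below shows only this arithmetic, reading the modulus as 2^k - 1; it does not model the compressor cells, and the function name is an assumption.

```python
def residue_mod_2k_minus_1(x, k):
    """Return x mod (2^k - 1) by repeatedly folding k-bit fields.

    Each fold adds the high part back into the low k bits, which is the
    software analogue of an end-around carry.
    """
    mask = (1 << k) - 1
    while x > mask:
        x = (x & mask) + (x >> k)
    # The all-ones pattern is the second representation of zero.
    return 0 if x == mask else x


r = residue_mod_2k_minus_1(0xFFFE, 3)   # a 16-bit input reduced modulo 7
```

The final check matters because the folded sum can settle on 2^k - 1 itself, which represents the same residue as zero.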

Journal Article
TL;DR: Simulation results suggest that the performance of the proposed system is superior to many existing systems, and the significance of cross-correlation factor in alleviating multi-access interference is also discussed.
Abstract: The successful use of CDMA technology is based on the construction of large families of encoding sequences with good correlation properties. This paper discusses PN sequence generation based on Residue Arithmetic in an effort to improve the performance of existing interference-limited CDMA technology for mobile cellular systems. Spreading codes based on the residue number system proposed earlier did not consider external interference, multipath propagation, the Doppler effect, etc. In the literature, the use of residue arithmetic in DS-CDMA was restricted to encoding an already spread sequence, where the spreading is done by some existing technique. The novelty of this paper is the use of the residue number system in generating the PN sequences that are used to spread the message signal. The significance of the cross-correlation factor in alleviating multi-access interference is also discussed. The RNS-based PN sequence outperforms most of the existing codes that are widely used in DS-CDMA applications. Simulation results suggest that the performance of the proposed system is superior to many existing systems. Keywords: Direct-Sequence Code Division Multiple Access (DS-CDMA), Multiple-Access Interference (MAI), PN Sequence, Residue Number System (RNS).


Proceedings Article
07 May 2012
TL;DR: A dedicated processor for image coding and transfer in wireless networks was developed on the basis of the residue number system, and the selected co-prime moduli provide conversion of 24-bit images.
Abstract: A dedicated processor for image coding and transfer in wireless networks was developed on the basis of the residue number system. Received residues are transferred over parallel channels using multipath routing. The selected co-prime moduli provide conversion of 24-bit images. The transformation of one image pixel is carried out once per cycle using parallel-serial multi-bit adders and incomplete encoders.

Journal ArticleDOI
TL;DR: This new moduli set is completely free from modulo-(2^k + 1)-type moduli, which results in high-speed modulo arithmetic channels for RNS, and it provides fast arithmetic operation with a faster reverse converter compared with other five moduli sets found in the literature.
Abstract: In this paper, we propose efficient designs of residue number system (RNS) to binary converters for the balanced moduli set {2^n, 2^(n+1) - 1, 2^n - 1, 2^(n-1) - 1}, where n is even. This new moduli set is completely free from modulo-(2^k + 1)-type moduli, which results in high-speed modulo arithmetic channels for RNS. Also, the mixed-radix conversion (MRC) algorithm is used to achieve both arithmetic-based and reduced-complexity two-level RNS to binary converter architectures. The proposed moduli set provides fast arithmetic operation with a faster reverse converter compared with other five moduli sets found in the literature.
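Mixed-radix conversion itself can be sketched generically in Python. The code below is the textbook MRC recurrence applied to the moduli set read as {2^n, 2^(n+1) - 1, 2^n - 1, 2^(n-1) - 1} with n = 4; it is not the paper's two-level converter architecture, and the function names are assumptions. Python 3.8 or later is assumed for the modular inverse form of `pow`.

```python
def mixed_radix_digits(residues, moduli):
    """Compute digits a_i such that
    X = a_0 + a_1*m_0 + a_2*m_0*m_1 + ...   with 0 <= a_i < m_i."""
    r = list(residues)
    digits = []
    for i, mi in enumerate(moduli):
        ai = r[i] % mi
        digits.append(ai)
        # Remove the digit just found from every later channel,
        # then divide that channel by m_i (multiply by its inverse).
        for j in range(i + 1, len(moduli)):
            mj = moduli[j]
            r[j] = ((r[j] - ai) * pow(mi, -1, mj)) % mj
    return digits


def from_mixed_radix(digits, moduli):
    """Weighted sum of the mixed-radix digits."""
    x, weight = 0, 1
    for a, m in zip(digits, moduli):
        x += a * weight
        weight *= m
    return x


n = 4
moduli = (2**n, 2**(n + 1) - 1, 2**n - 1, 2**(n - 1) - 1)   # (16, 31, 15, 7)
x = 12345
residues = tuple(x % m for m in moduli)
recovered = from_mixed_radix(mixed_radix_digits(residues, moduli), moduli)
```

Unlike a CRT-based converter, MRC never needs a reduction modulo the full dynamic range; the price is the sequential dependency between digits, which is what converter designs try to shorten.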

Proceedings ArticleDOI
28 Jun 2012
TL;DR: A new 4-moduli Residue Number System (RNS) and its respective reverse converter are proposed; the converter capitalizes on the mixed-radix conversion technique, augmented to obtain a compact architecture and to reduce critical path delay by careful decomposition of the computations into critical and non-critical chunks.
Abstract: In this paper we propose a new 4-moduli Residue Number System (RNS) {2^k, 2^n - 1, 2^(n-1) - 1, 2^(n+1) - 1} and its respective reverse converter. The new moduli set is characterized by a very efficient way of utilizing the underlying representation's capacity, which stems from two aspects. First, by not including conjugate moduli, our set covers the dynamic range of the representation much better than the systems that do include conjugate moduli. Second, independent controlling parameters of the even and odd moduli allow for fine-grained adjustment of the dynamic range to the needs of a specific application with single-bit resolution. Our converter capitalizes on the mixed-radix conversion technique, augmented to obtain a compact architecture and to reduce critical path delay by careful decomposition of the computations into critical and non-critical chunks. The synthesis experiment conducted on a 16-tap programmable FIR filter example using the STMicro 65nm low-power library shows up to 11.5% improvement in power dissipation against a competitive flexible moduli set.

Proceedings ArticleDOI
03 May 2012
TL;DR: The synthesis of the resulting design over the ST Microelectronics 65nm LP library demonstrates that its delay, area, and power characteristics improve on the performance and power consumption of the existing complementary 5-moduli set.
Abstract: In this paper, we present a new residue number system (RNS) {2^n - 1, 2^n, 2^n + 1, 2^(n+1) + 1, 2^(n-1) + 1} of five well-balanced moduli that are co-prime for odd n. This new RNS complements the 5-moduli RNS proposed before for even n, {2^n - 1, 2^n, 2^n + 1, 2^(n+1) - 1, 2^(n-1) - 1}. With the new set, we also present a novel approach to designing multi-moduli reverse converters that focuses strongly on critical path analysis and aims at moving a significant amount of computation off the critical path. The synthesis of the resulting design over the ST Microelectronics 65nm LP library demonstrates that its delay, area, and power characteristics improve on the performance and power consumption of the existing complementary 5-moduli set.
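The co-primality condition on n can be checked numerically. The Python sketch below reads the two sets as {2^n - 1, 2^n, 2^n + 1, 2^(n+1) + 1, 2^(n-1) + 1} (new, odd n) and {2^n - 1, 2^n, 2^n + 1, 2^(n+1) - 1, 2^(n-1) - 1} (earlier, even n); that reading of the flattened exponents and the function names are assumptions. It tests a few small n only and is no substitute for the paper's proof.

```python
from itertools import combinations
from math import gcd


def odd_n_set(n):
    """Five-moduli set proposed for odd n (assumed reading of the abstract)."""
    return (2**n - 1, 2**n, 2**n + 1, 2**(n + 1) + 1, 2**(n - 1) + 1)


def even_n_set(n):
    """Complementary five-moduli set for even n (assumed reading)."""
    return (2**n - 1, 2**n, 2**n + 1, 2**(n + 1) - 1, 2**(n - 1) - 1)


def pairwise_coprime(moduli):
    """True when every pair of moduli has greatest common divisor 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


odd_ok = [pairwise_coprime(odd_n_set(n)) for n in (3, 5, 7)]
odd_with_even_n = pairwise_coprime(odd_n_set(4))    # 15 and 33 share the factor 3
even_ok = [pairwise_coprime(even_n_set(n)) for n in (4, 6)]
even_with_odd_n = pairwise_coprime(even_n_set(3))   # 9 and 15 share the factor 3
```

For the sampled values, each set is pairwise co-prime only for the parity of n it was designed for, which is why the two sets are described as complementary.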

Journal ArticleDOI
Wei Guo, Yaling Liu, Songhui Bai, Jizeng Wei, Da-Zhi Sun
TL;DR: A parallel architecture for efficient hardware implementation of Rivest Shamir Adleman (RSA) cryptography is proposed and a simple and fast base transformation is used to achieve RNS Montgomery modular multiplication algorithm, which facilitates hardware implementation.
Abstract: A parallel architecture for efficient hardware implementation of Rivest Shamir Adleman (RSA) cryptography is proposed. The residue number system (RNS) is introduced to realize high parallelism: all the elements under the same base are independent of each other and can be computed in parallel. Moreover, a simple and fast base transformation is used to realize the RNS Montgomery modular multiplication algorithm, which facilitates hardware implementation. The proposed architecture is designed on the basis of a transport triggered architecture (TTA) to evaluate the performance and feasibility of the algorithm. With these optimizations, a decryption rate of 106 kbps can be achieved for 1024-bit RSA at a frequency of 100 MHz.

Journal ArticleDOI
TL;DR: The proposed architecture is compared with a conventional RNS-DA FIR filter, and the results show that it can implement an FIR filter with high speed.
Abstract: In this paper a high-speed Residue Number System (RNS) based FIR filter using Distributed Arithmetic (DA) is proposed. The proposed architecture uses a moduli set whose values are as small as possible. When Distributed Arithmetic is used in an FIR filter, the size of the LUTs increases exponentially with the number of filter taps. Here care has been taken so that the sizes of the LUTs do not increase. The proposed architecture is designed using Verilog HDL, a popular hardware description language [9]. The design is synthesized with ISE 10.1 and implemented on Xilinx's Virtex-4. The proposed architecture is also compared with a conventional RNS-DA FIR filter. The results show that the proposed architecture can implement an FIR filter with high speed.

Journal ArticleDOI
30 Jan 2012
TL;DR: In this paper, a general scalable implementation of the Strassen algorithm in the DNA computing paradigm is presented and can be generalized to the application of all fast matrix multiplication algorithms on a DNA computer.
Abstract: On distributed memory electronic computers, the implementation and association of fast parallel matrix multiplication algorithms has yielded astounding results and insights. In this discourse, we use the tools of molecular biology to demonstrate the theoretical encoding of Strassen’s fast matrix multiplication algorithm with DNA based on an n-moduli set in the residue number system, thereby demonstrating the viability of computational mathematics with DNA. As a result, a general scalable implementation of this model in the DNA computing paradigm is presented and can be generalized to the application of all fast matrix multiplication algorithms on a DNA computer. We also discuss the practical capabilities and issues of this scalable implementation. Fast methods of matrix computations with DNA are important because they also allow for the efficient implementation of other algorithms (that is inversion, computing determinants, and graph theory) with DNA. Key words: DNA computing, residue number system, logic and arithmetic operations, Strassen algorithm.