Book ChapterDOI

A Survey of Hardware Implementations of RSA

20 Aug 1989-pp 368-370
TL;DR: The purpose of this paper is to briefly describe some of the different computational algorithms that have been used in the chip designs and to provide a list of all of the currently available chips.
Abstract: Today, a dozen years after the discovery of the RSA encryption algorithm [12], there are many chips available for performing RSA encryption [1] [3] [4] [5] [8] [9] [13] [15]. The purpose of this paper is to briefly describe some of the different computational algorithms that have been used in the chip designs and to provide a list of all of the currently available chips. In this abstract, we will simply mention some of these computational algorithms and give references. The full paper will contain more details of these algorithms and will appear in a book of survey articles in Cryptology which is being edited by Gus Simmons and will be published by IEEE in 1990.


Citations
Book ChapterDOI
12 Aug 1999
TL;DR: Three new types of power analysis attacks against smartcard implementations of modular exponentiation algorithms are described; experiments show that the attacks are successful, and potential countermeasures are suggested.
Abstract: Three new types of power analysis attacks against smartcard implementations of modular exponentiation algorithms are described. The first attack requires an adversary to exponentiate many random messages with a known and a secret exponent. The second attack assumes that the adversary can make the smartcard exponentiate using exponents of his own choosing. The last attack assumes the adversary knows the modulus and the exponentiation algorithm being used in the hardware. Experiments show that these attacks are successful. Potential countermeasures are suggested.

412 citations

Journal ArticleDOI
TL;DR: This work exhibits a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware.
Abstract: Programmable active memories (PAM) are a novel form of universal reconfigurable hardware coprocessor. Based on field-programmable gate array (FPGA) technology, a PAM is a virtual machine, controlled by a standard microprocessor, which can be dynamically and indefinitely reconfigured into a large number of application-specific circuits. PAM's offer a new mixture of hardware performance and software versatility. We review the important architectural features of PAM's, through the example of DECPeRLe-1, an experimental device built in 1992. PAM programming is presented, in contrast to classical gate-array and full custom circuit design. Our emphasis is on large, code-generated synchronous systems descriptions; no compromise is made with regard to the performance of the target circuits. We exhibit a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware. The fields covered include computer arithmetic, cryptography, error correction, image analysis, stereo vision, video compression, sound synthesis, neural networks, high-energy physics, thermodynamics, biology and astronomy. At comparable cost, the computing power virtually available in a PAM exceeds that of conventional processors by a factor 10 to 1000, depending on the specific application, in 1992. A technology shrink increases the performance gap between conventional processors and PAM's. By Noyce's law, we predict by how much the performance gap will widen with time.

359 citations

Journal ArticleDOI
TL;DR: This work presents the first implementation of RSA in the residue number system (RNS) which does not require any conversion, either from radix to RNS beforehand or RNS to radix afterward, based on an optimized RNS version of Montgomery multiplication.
Abstract: We present the first implementation of RSA in the residue number system (RNS) which does not require any conversion, either from radix to RNS beforehand or RNS to radix afterward. Our solution is based on an optimized RNS version of Montgomery multiplication. Thanks to the RNS, the proposed algorithms are highly parallelizable and seem then well suited to hardware implementations. We give the computational procedure both parties must follow in order to recover the correct result at the end of the transaction (encryption or signature).

258 citations


Cites background from "A Survey of Hardware Implementation..."

  • ...DURING the last decade, fast hardware implementations of public-key cryptosystems have been widely studied [3], [5], [15] while confidentiality and security requirements were becoming more and more important....


Proceedings ArticleDOI
29 Jun 1993
TL;DR: The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division, carry-save representation, quotient pipelining, and asynchronous carry completion adders.
Abstract: The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division (also known as Montgomery modular reduction), carry-save representation, quotient pipelining, and asynchronous carry completion adders. A fully operational PAM (programmable active memory) implementation of RSA that combines all of the techniques presented here delivers an RSA secret decryption rate over 600-kb/s for 512-b keys, and 165-kb/s for 1-kb keys. This is an order of magnitude faster than any previously reported running implementation. While the implementation makes full use of the PAM's reconfigurability, it is possible to derive from the (multiple PAM designs) implementation a (single) gate-array specification with estimated size under 100 K gates and speed over 1 Mb/s for RSA 512-b keys. Matching gains in software performance are also analyzed.
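The "Chinese remainders" technique named in the abstract above can be illustrated in software. The following is a minimal Python sketch of textbook CRT-based RSA decryption (Garner's recombination), not the authors' hardware design; the function name `crt_decrypt` and the toy parameters are assumptions chosen for illustration only.

```python
def crt_decrypt(c, p, q, d):
    """Decrypt c with secret exponent d using the secret primes p and q.

    Hypothetical helper: two half-size exponentiations replace one
    full-size exponentiation modulo n = p * q.
    """
    dp = d % (p - 1)            # reduced exponent modulo p - 1
    dq = d % (q - 1)            # reduced exponent modulo q - 1
    q_inv = pow(q, -1, p)       # q^(-1) mod p (Python 3.8+)

    mp = pow(c % p, dp, p)      # message modulo p
    mq = pow(c % q, dq, q)      # message modulo q

    # Garner recombination: the unique value below p * q that is
    # congruent to mp modulo p and to mq modulo q.
    h = (q_inv * (mp - mq)) % p
    return mq + h * q


# Toy parameters, far too small for real use.
p, q, e, d = 61, 53, 17, 2753
n = p * q
message = 1234
assert crt_decrypt(pow(message, e, n), p, q, d) == message
```

Each of the two exponentiations works on operands of half the bit length, which is where the speedup of this technique comes from.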

245 citations

Journal ArticleDOI
TL;DR: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985), showing that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures.
Abstract: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985). Comparison with previous techniques shows that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures. The gain in speed arises from the faster clock that results from simpler combinational logic.

238 citations

References
Book
01 Jan 1968
TL;DR: The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid.
Abstract: A fuel pin hold-down and spacing apparatus for use in nuclear reactors is disclosed. Fuel pins forming a hexagonal array are spaced apart from each other and held-down at their lower end, securely attached at two places along their length to one of a plurality of vertically disposed parallel plates arranged in horizontally spaced rows. These plates are in turn spaced apart from each other and held together by a combination of spacing and fastening means. The arrangement of this invention provides a strong vibration free hold-down mechanism while avoiding a large pressure drop to the flow of coolant fluid. This apparatus is particularly useful in connection with liquid cooled reactors such as liquid metal cooled fast breeder reactors.

17,939 citations

Journal ArticleDOI
TL;DR: An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key.
Abstract: An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key. This has two important consequences: (1) Couriers or other secure means are not needed to transmit keys, since a message can be enciphered using an encryption key publicly revealed by the intended recipient. Only he can decipher the message, since only he knows the corresponding decryption key. (2) A message can be “signed” using a privately held decryption key. Anyone can verify this signature using the corresponding publicly revealed encryption key. Signatures cannot be forged, and a signer cannot later deny the validity of his signature. This has obvious applications in “electronic mail” and “electronic funds transfer” systems. A message is encrypted by representing it as a number M, raising M to a publicly specified power e, and then taking the remainder when the result is divided by the publicly specified product, n, of two large secret prime numbers p and q. Decryption is similar; only a different, secret, power d is used, where e * d ≡ 1 (mod (p - 1) * (q - 1)). The security of the system rests in part on the difficulty of factoring the published divisor, n.
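The arithmetic described in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the helper names (`make_keypair`, `encrypt`, `decrypt`) are hypothetical, the primes are tiny, and no padding is applied, so it shows only the relation between M, e, d and n and is not a usable cryptosystem.

```python
from math import gcd


def make_keypair(p, q, e):
    """Return (n, e, d) for secret primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to (p - 1) * (q - 1)")
    d = pow(e, -1, phi)  # e * d ≡ 1 (mod (p - 1) * (q - 1)), Python 3.8+
    return n, e, d


def encrypt(m, e, n):
    """Raise M to the public power e and take the remainder modulo n."""
    return pow(m, e, n)


def decrypt(c, d, n):
    """Same operation as encrypt, with the secret power d."""
    return pow(c, d, n)


# Toy parameters, far too small for real use.
n, e, d = make_keypair(61, 53, 17)
ciphertext = encrypt(65, e, n)
assert decrypt(ciphertext, d, n) == 65
```

Signing, as described in consequence (2), uses the same two operations in the opposite order: `decrypt` produces the signature and `encrypt` verifies it.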

14,659 citations


"A Survey of Hardware Implementation..." refers methods in this paper

  • ...Today, a dozen years after the discovery of the RSA encryption algorithm [12], there are many chips available for performing RSA encryption [1] [3] [4] [5] [8] [9] [13] [15]....


Journal ArticleDOI
TL;DR: A method for multiplying two integers modulo N while avoiding division by N, a representation of residue classes so as to speed modular multiplication without affecting the modular addition and subtraction algorithms.
Abstract: Let N > 1. We present a method for multiplying two integers (called N-residues) modulo N while avoiding division by N. N-residues are represented in a nonstandard way, so this method is useful only if several computations are done modulo one N. The addition and subtraction algorithms are unchanged. 1. Description. Some algorithms (1), (2), (4), (5) require extensive modular arithmetic. We propose a representation of residue classes so as to speed modular multiplication without affecting the modular addition and subtraction algorithms. Other recent algorithms for modular arithmetic appear in (3), (6). Fix N > 1. Define an N-residue to be a residue class modulo N. Select a radix R coprime to N (possibly the machine word size or a power thereof) such that R > N and such that computations modulo R are inexpensive to process. Let R^{-1} and N' be integers satisfying 0 < [...] if t ≥ N then return t - N else return t. To validate REDC, observe mN ≡ TN'N ≡ -T mod R, so t is an integer. Also, tR ≡ T mod N, so t ≡ TR^{-1} mod N. Thirdly, 0 < T + mN < RN + RN, so 0 < t < 2N. If R and N are large, then T + mN may exceed the largest double-precision value. One can circumvent this by adjusting m so -R < m < 0. Given two numbers x and y between 0 and N - 1 inclusive, let z = REDC(xy). Then z ≡ (xy)R^{-1} mod N, so (xR^{-1})(yR^{-1}) ≡ zR^{-1} mod N. Also, 0 < z < N, so z is the product of x and y in this representation. Other algorithms for operating on N-residues in this representation can be derived from the algorithms normally used. The addition algorithm is unchanged, since xR^{-1} + yR^{-1} ≡ zR^{-1} mod N if and only if x + y ≡ z mod N. Also unchanged are

2,647 citations
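The REDC function discussed in the abstract above can be sketched in Python. Because the excerpt truncates the algorithm's first steps, the sketch uses the commonly stated form of Montgomery reduction, m = (T mod R) * N' mod R and t = (T + m * N) / R, with N' chosen so that N * N' ≡ -1 (mod R); these steps and the helper names `montgomery_setup`, `redc` and `mont_mul` are assumptions for illustration, not text recovered from the listing.

```python
def montgomery_setup(n, r):
    """Return N' such that n * N' ≡ -1 (mod r); r must be coprime to n, r > n."""
    return (-pow(n, -1, r)) % r  # modular inverse via pow, Python 3.8+


def redc(t, n, r, n_prime):
    """Return t * r^(-1) mod n for 0 <= t < r * n, without dividing by n."""
    m = ((t % r) * n_prime) % r   # chosen so that t + m * n is a multiple of r
    u = (t + m * n) // r          # exact division; cheap when r is a power of 2
    return u - n if u >= n else u  # u < 2n, so one subtraction suffices


def mont_mul(a, b, n, r, n_prime):
    """Multiply a and b modulo n by way of the Montgomery representation."""
    a_bar = (a * r) % n           # representative of a
    b_bar = (b * r) % n           # representative of b
    prod_bar = redc(a_bar * b_bar, n, r, n_prime)  # representative of a * b
    return redc(prod_bar, n, r, n_prime)           # convert back


n, r = 97, 128                    # toy modulus and a power-of-two radix
n_prime = montgomery_setup(n, r)
assert mont_mul(45, 76, n, r, n_prime) == (45 * 76) % n
```

In `mont_mul` the conversions into and out of the representation dominate the cost, which matches the abstract's remark that the method pays off only when several computations are done modulo the same N, as in modular exponentiation.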

Book
01 Jan 1980

533 citations

Book
01 Oct 1990
TL;DR: The concept of PAM, Programmable Active Memory is introduced and results obtained with the Perle-0 prototype board are presented, featuring a software silicon foundry for a 50K gate array, with a 50 milliseconds turn-around time.
Abstract: We introduce the concept of PAM, Programmable Active Memory and present results obtained with our Perle-0 prototype board, featuring: A software silicon foundry for a 50K gate array, with a 50 milliseconds turn-around time. A 3000 one bit processors universal machine with an arbitrary interconnect structure specified by 400K bits of nano-code. A programmable hardware co-processor with an initial library including: a long multiplier, an image convolver, a data compressor, etc. Each of these hardware designs speeds up the corresponding software application by at least an order of magnitude.

184 citations