A Survey of Hardware Implementations of RSA

doi:10.1007/0-387-34805-0_34

Book Chapter•DOI•

Ernest F. Brickell¹•Institutions (1)

Sandia National Laboratories¹

20 Aug 1989-pp 368-370

TL;DR: The purpose of this paper is to briefly describe some of the different computational algorithms that have been used in the chip designs and to provide a list of all of the currently available chips.

Abstract: Today, a dozen years after the discovery of the RSA encryption algorithm [12], there are many chips available for performing RSA encryption [1] [3] [4] [5] [8] [9] [13] [15]. The purpose of this paper is to briefly describe some of the different computational algorithms that have been used in the chip designs and to provide a list of all of the currently available chips. In this abstract, we will simply mention some of these computational algorithms and give references. The full paper will contain more details of these algorithms and will appear in a book of survey articles in cryptology which is being edited by Gus Simmons and will be published by IEEE in 1990.

Citations

Book Chapter•DOI•

Power Analysis Attacks of Modular Exponentiation in Smartcards

Thomas S. Messerges¹, Ezzy A. Dabbish¹, Robert H. Sloan²•Institutions (2)

Motorola¹, University of Illinois at Chicago²

12 Aug 1999

TL;DR: Three new types of power analysis attacks against smartcard implementations of modular exponentiation algorithms are described; the first requires an adversary to exponentiate many random messages with a known and a secret exponent.

Abstract: Three new types of power analysis attacks against smartcard implementations of modular exponentiation algorithms are described. The first attack requires an adversary to exponentiate many random messages with a known and a secret exponent. The second attack assumes that the adversary can make the smartcard exponentiate using exponents of his own choosing. The last attack assumes the adversary knows the modulus and the exponentiation algorithm being used in the hardware. Experiments show that these attacks are successful. Potential countermeasures are suggested.

412 citations

Journal Article•DOI•

Programmable active memories: reconfigurable systems come of age

Jean Vuillemin, P. Bertin¹, Didier Roncin, M. Shand, H.H. Touati, Philippe Boucard, +2 more•Institutions (1)

French Institute for Research in Computer Science and Automation¹

01 Mar 1996-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: This work exhibits a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware.

Abstract: Programmable active memories (PAM) are a novel form of universal reconfigurable hardware coprocessor. Based on field-programmable gate array (FPGA) technology, a PAM is a virtual machine, controlled by a standard microprocessor, which can be dynamically and indefinitely reconfigured into a large number of application-specific circuits. PAM's offer a new mixture of hardware performance and software versatility. We review the important architectural features of PAM's, through the example of DECPeRLe-1, an experimental device built in 1992. PAM programming is presented, in contrast to classical gate-array and full custom circuit design. Our emphasis is on large, code-generated synchronous systems descriptions; no compromise is made with regard to the performance of the target circuits. We exhibit a dozen applications where PAM technology proves superior, both in performance and cost, to every other existing technology, including supercomputers, massively parallel machines, and conventional custom hardware. The fields covered include computer arithmetic, cryptography, error correction, image analysis, stereo vision, video compression, sound synthesis, neural networks, high-energy physics, thermodynamics, biology and astronomy. At comparable cost, the computing power virtually available in a PAM exceeds that of conventional processors by a factor 10 to 1000, depending on the specific application, in 1992. A technology shrink increases the performance gap between conventional processors and PAM's. By Noyce's law, we predict by how much the performance gap will widen with time.

359 citations

Journal Article•DOI•

A full RNS implementation of RSA

Jean-Claude Bajard, Laurent Imbert

01 Jun 2004-IEEE Transactions on Computers

TL;DR: This work presents the first implementation of RSA in the residue number system (RNS) which does not require any conversion, either from radix to RNS beforehand or RNS to radix afterward, based on an optimized RNS version of Montgomery multiplication.

Abstract: We present the first implementation of RSA in the residue number system (RNS) which does not require any conversion, either from radix to RNS beforehand or RNS to radix afterward. Our solution is based on an optimized RNS version of Montgomery multiplication. Thanks to the RNS, the proposed algorithms are highly parallelizable and seem then well suited to hardware implementations. We give the computational procedure both parties must follow in order to recover the correct result at the end of the transaction (encryption or signature).
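As a rough illustration of why RNS arithmetic parallelizes well, the sketch below represents integers by their residues modulo a set of pairwise coprime moduli and multiplies them channel by channel. It is not the authors' Montgomery-based algorithm, which avoids the conversions shown here; the moduli and the names `MODULI`, `to_rns`, `rns_mul`, and `from_rns` are assumptions chosen for the example.

```python
from math import prod

# Assumed toy moduli: pairwise coprime, so every integer below their
# product has a unique residue tuple.
MODULI = (13, 17, 19, 23)
DYNAMIC_RANGE = prod(MODULI)

def to_rns(x: int) -> tuple:
    """Convert an integer to its residue tuple."""
    return tuple(x % m for m in MODULI)

def rns_add(a: tuple, b: tuple) -> tuple:
    """Add channel by channel; no carries cross between channels."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))

def rns_mul(a: tuple, b: tuple) -> tuple:
    """Multiply channel by channel; each channel is independent."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r: tuple) -> int:
    """Convert back with the Chinese remainder theorem."""
    total = 0
    for ri, m in zip(r, MODULI):
        Mi = DYNAMIC_RANGE // m
        total += ri * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % DYNAMIC_RANGE

x, y = 123, 456
product = from_rns(rns_mul(to_rns(x), to_rns(y)))
```

The result is exact only while the true product stays below `DYNAMIC_RANGE`; a hardware design would assign each channel to its own small multiplier.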

258 citations

Cites background from "A Survey of Hardware Implementation..."

...During the last decade, fast hardware implementations of public-key cryptosystems have been widely studied [3], [5], [15] while confidentiality and security requirements were becoming more and more important....

Proceedings Article•DOI•

Fast implementations of RSA cryptography

M. Shand, Jean Vuillemin

29 Jun 1993

TL;DR: The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division, carry-save representation, quotient pipelining, and asynchronous carry completion adders.

Abstract: The authors detail and analyze the critical techniques that may be combined in the design of fast hardware for RSA cryptography: Chinese remainders, star chains, Hensel's odd division (also known as Montgomery modular reduction), carry-save representation, quotient pipelining, and asynchronous carry completion adders. A fully operational PAM (programmable active memory) implementation of RSA that combines all of the techniques presented here delivers an RSA secret decryption rate over 600 kb/s for 512-b keys, and 165 kb/s for 1-kb keys. This is an order of magnitude faster than any previously reported running implementation. While the implementation makes full use of the PAM's reconfigurability, it is possible to derive from the (multiple PAM designs) implementation a (single) gate-array specification with estimated size under 100 K gates and speed over 1 Mb/s for RSA 512-b keys. Matching gains in software performance are also analyzed.
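Of the techniques listed, Chinese remainders is the easiest to show in a few lines. The sketch below is a generic textbook version of RSA decryption by Chinese remainders (Garner's recombination), not the authors' hardware design; the tiny primes and the names `dp`, `dq`, `q_inv`, and `decrypt_crt` are assumptions for illustration only.

```python
# Assumed toy key; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))   # secret exponent, Python 3.8+

# Precomputed once per key.
dp = d % (p - 1)                    # exponent for the mod-p half
dq = d % (q - 1)                    # exponent for the mod-q half
q_inv = pow(q, -1, p)               # q^{-1} mod p

def decrypt_crt(c: int) -> int:
    """Decrypt with two half-size exponentiations, then recombine."""
    mp = pow(c % p, dp, p)
    mq = pow(c % q, dq, q)
    h = (q_inv * (mp - mq)) % p     # Garner's recombination step
    return mq + h * q

message = 1234
ciphertext = pow(message, e, n)
recovered = decrypt_crt(ciphertext)
```

Each half works with operands and exponents of about half the length, which is where the speedup over a single full-size exponentiation comes from.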

245 citations

Journal Article•DOI•

Hardware implementation of Montgomery's modular multiplication algorithm

Stephen Eldridge¹, Colin D. Walter¹•Institutions (1)

University of Manchester¹

01 Jun 1993-IEEE Transactions on Computers

TL;DR: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985), showing that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures.

Abstract: Hardware is described for implementing the fast modular multiplication algorithm developed by P.L. Montgomery (1985). Comparison with previous techniques shows that this algorithm is up to twice as fast as the best currently available and is more suitable for alternative architectures. The gain in speed arises from the faster clock that results from simpler combinational logic.

238 citations

References

Book•

The Art of Computer Programming

Donald Ervin Knuth

01 Jan 1968

17,939 citations

Journal Article•DOI•

A method for obtaining digital signatures and public-key cryptosystems

Ronald L. Rivest¹, Adi Shamir¹, Leonard M. Adleman¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Feb 1978-Communications of The ACM

TL;DR: An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key.

Abstract: An encryption method is presented with the novel property that publicly revealing an encryption key does not thereby reveal the corresponding decryption key. This has two important consequences: (1) Couriers or other secure means are not needed to transmit keys, since a message can be enciphered using an encryption key publicly revealed by the intended recipient. Only he can decipher the message, since only he knows the corresponding decryption key. (2) A message can be “signed” using a privately held decryption key. Anyone can verify this signature using the corresponding publicly revealed encryption key. Signatures cannot be forged, and a signer cannot later deny the validity of his signature. This has obvious applications in “electronic mail” and “electronic funds transfer” systems. A message is encrypted by representing it as a number M, raising M to a publicly specified power e, and then taking the remainder when the result is divided by the publicly specified product, n, of two large secret prime numbers p and q. Decryption is similar; only a different, secret, power d is used, where e * d ≡ 1 (mod (p - 1) * (q - 1)). The security of the system rests in part on the difficulty of factoring the published divisor, n.
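The encryption and signing steps described in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions: the primes 61 and 53, the exponent 17, and the function names are chosen for the example, and the scheme is shown without the padding that any practical use would require.

```python
# Assumed toy parameters; far too small to be secure.
p, q = 61, 53                 # the two secret primes
n = p * q                     # published product
phi = (p - 1) * (q - 1)
e = 17                        # published power, coprime to phi
d = pow(e, -1, phi)           # secret power: e * d ≡ 1 (mod phi), Python 3.8+

def encrypt(m: int) -> int:
    """Raise M to the power e and take the remainder modulo n."""
    return pow(m, e, n)

def decrypt(c: int) -> int:
    """Same operation with the secret power d."""
    return pow(c, d, n)

def sign(m: int) -> int:
    """Sign with the privately held decryption key."""
    return pow(m, d, n)

def verify(m: int, s: int) -> bool:
    """Anyone can check a signature with the public key."""
    return pow(s, e, n) == m

M = 65
C = encrypt(M)
S = sign(M)
```

Here `decrypt(C)` returns `M`, and `verify(M, S)` returns `True`, because raising to the power `e * d` is the identity modulo `n`.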

14,659 citations

"A Survey of Hardware Implementation..." refers methods in this paper

...Today, a dozen years after the discovery of the RSA encryption algorithm [12], there are many chips available for performing RSA encryption [1] [3] [4] [5] [8] [9] [13] [15]....

Journal Article•DOI•

Modular multiplication without trial division

Peter L. Montgomery

01 Apr 1985-Mathematics of Computation

TL;DR: A method is presented for multiplying two integers modulo N while avoiding division by N, using a representation of residue classes that speeds modular multiplication without affecting the modular addition and subtraction algorithms.

Abstract: Let $N > 1$. We present a method for multiplying two integers (called N-residues) modulo $N$ while avoiding division by $N$. N-residues are represented in a nonstandard way, so this method is useful only if several computations are done modulo one $N$. The addition and subtraction algorithms are unchanged. 1. Description. Some algorithms (1), (2), (4), (5) require extensive modular arithmetic. We propose a representation of residue classes so as to speed modular multiplication without affecting the modular addition and subtraction algorithms. Other recent algorithms for modular arithmetic appear in (3), (6). Fix $N > 1$. Define an N-residue to be a residue class modulo $N$. Select a radix $R$ coprime to $N$ (possibly the machine word size or a power thereof) such that $R > N$ and such that computations modulo $R$ are inexpensive to process. Let $R^{-1}$ and $N'$ be integers satisfying $0 < R^{-1} < N$, $0 < N' < R$, and $RR^{-1} - NN' = 1$. [...] function REDC($T$): $m \leftarrow (T \bmod R)\,N' \bmod R$; $t \leftarrow (T + mN)/R$; if $t \ge N$ then return $t - N$ else return $t$. To validate REDC, observe $mN \equiv TN'N \equiv -T \pmod{R}$, so $t$ is an integer. Also, $tR \equiv T \pmod{N}$, so $t \equiv TR^{-1} \pmod{N}$. Thirdly, $0 \le T + mN < RN + RN$, so $0 \le t < 2N$. If $R$ and $N$ are large, then $T + mN$ may exceed the largest double-precision value. One can circumvent this by adjusting $m$ so $-R < m \le 0$. Given two numbers $x$ and $y$ between $0$ and $N - 1$ inclusive, let $z = \mathrm{REDC}(xy)$. Then $z \equiv (xy)R^{-1} \pmod{N}$, so $(xR^{-1})(yR^{-1}) \equiv zR^{-1} \pmod{N}$. Also, $0 \le z < N$, so $z$ is the product of $x$ and $y$ in this representation. Other algorithms for operating on N-residues in this representation can be derived from the algorithms normally used. The addition algorithm is unchanged, since $xR^{-1} + yR^{-1} \equiv zR^{-1} \pmod{N}$ if and only if $x + y \equiv z \pmod{N}$. Also unchanged are
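The reduction step this abstract validates can be sketched as follows. This is a minimal illustration, assuming the standard definitions m = (T mod R) N' mod R and t = (T + mN)/R with NN' ≡ -1 (mod R); the small modulus, the power-of-two radix, and the helper names `montgomery_setup`, `to_mont`, and `mont_mul` are assumptions for the example rather than anything taken from the paper.

```python
def montgomery_setup(N: int, R: int) -> int:
    """Return N' in [0, R) with N * N' ≡ -1 (mod R). Requires gcd(N, R) = 1."""
    return (-pow(N, -1, R)) % R       # modular inverse, Python 3.8+

def redc(T: int, N: int, R: int, N_prime: int) -> int:
    """Return T * R^{-1} mod N for 0 <= T < R * N, without dividing by N."""
    m = ((T % R) * N_prime) % R       # cheap when R is a power of two
    t = (T + m * N) // R              # exact: T + m*N is a multiple of R
    return t - N if t >= N else t

# Assumed toy parameters: odd modulus, radix a power of two larger than N.
N, R = 97, 128
N_PRIME = montgomery_setup(N, R)

def to_mont(x: int) -> int:
    """Store x as x * R mod N."""
    return (x * R) % N

def mont_mul(a: int, b: int) -> int:
    """Multiply two stored values; the result is again in stored form."""
    return redc(a * b, N, R, N_PRIME)

x, y = 42, 73
z = mont_mul(to_mont(x), to_mont(y))  # equals x*y*R mod N
result = redc(z, N, R, N_PRIME)       # leave stored form: x*y mod N
```

The conversions into and out of stored form cost extra work, which is why the method pays off only when many multiplications share one modulus, as in modular exponentiation.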

2,647 citations

Book•

Computer Arithmetic

Earl E. Swartzlander

01 Jan 1980

533 citations

Book•

Introduction to programmable active memories

P. Bertin, Didier Roncin, Jean Vuillemin

01 Oct 1990

TL;DR: The concept of PAM (Programmable Active Memory) is introduced and results obtained with the Perle-0 prototype board are presented, featuring a software silicon foundry for a 50K gate array with a 50-millisecond turn-around time.

Abstract: We introduce the concept of PAM (Programmable Active Memory) and present results obtained with our Perle-0 prototype board, featuring: a software silicon foundry for a 50K gate array, with a 50-millisecond turn-around time; a universal machine of 3000 one-bit processors with an arbitrary interconnect structure specified by 400K bits of nano-code; and a programmable hardware co-processor with an initial library including a long multiplier, an image convolver, a data compressor, etc. Each of these hardware designs speeds up the corresponding software application by at least an order of magnitude.

184 citations