Author

Georgi Todorov

Bio: Georgi Todorov is an academic researcher from Oregon State University. The author has contributed to research on the topics of Operand and Manufacturing engineering. The author has an h-index of 2 and has co-authored 3 publications receiving 126 citations.

Papers
Book Chapter
14 May 2001
TL;DR: This paper describes an algorithm and architecture that extend a scalable radix-2 architecture proposed in previous work; the algorithm is proven correct and the hardware design is discussed in detail.
Abstract: This paper describes an algorithm and architecture based on an extension of a scalable radix-2 architecture proposed in a previous work. The algorithm is proven to be correct and the hardware design is discussed in detail. Experimental results compare a radix-8 implementation with a radix-2 design. The scalable Montgomery multiplier can be adjusted to constrained areas while still working on operands of any given precision. Similar to some systolic implementations, this design avoids the high load on signals that broadcast to several components, making the delay independent of the operands' precision.

121 citations

Patent
25 Apr 2002
TL;DR: Methods and apparatus for Montgomery multiplication in which a multiplicand operand and a modulus are processed word by word, and then additional bits of the multiplier operand are selected for processing.
Abstract: Methods and apparatus for Montgomery multiplication process a multiplier operand in k-bit radix-digits, wherein k corresponds to a radix r = 2^k. A multiplicand operand and a modulus are processed word by word, and then additional bits of the multiplier operand are selected for processing. In a radix r = 8 example, the multiplier operand is processed in 3-bit radix-8 digits. A processing kernel is configured to preprocess the modulus and/or the multiplier operand so that at least some values can be obtained from lookup tables.

9 citations
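
The digit-wise processing described in the patent abstract above can be illustrated with a minimal software sketch. It is not the patented apparatus: the function name `mont_mul_radix2k`, its parameters, and the `q_table` lookup table are assumptions made for illustration, and the multiplicand and modulus are handled as whole Python integers rather than word by word. The sketch assumes an odd modulus `m`, `x < 2**n_bits`, `y < m`, and Python 3.8 or later (for `pow(m, -1, r)`).

```python
def mont_mul_radix2k(x, y, m, n_bits, k=3):
    """Return x * y * R^(-1) mod m, where R = 2^(k*d) and d = ceil(n_bits / k).

    Illustrative sketch only: the multiplier x is scanned in k-bit digits
    (radix r = 2^k); k = 3 gives the radix-8 case.
    """
    if m % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    r = 1 << k
    d = -(-n_bits // k)  # number of radix-2^k digits in the multiplier

    # Hypothetical lookup table: for every possible low digit s0 of the
    # running sum, store the q that makes (s0 + q * m) divisible by r.
    neg_m_inv = (-pow(m, -1, r)) % r
    q_table = [(s0 * neg_m_inv) % r for s0 in range(r)]

    s = 0
    for i in range(d):
        x_i = (x >> (k * i)) & (r - 1)  # next k-bit digit of the multiplier
        s += x_i * y                    # add the partial product
        q = q_table[s & (r - 1)]        # quotient digit from the table
        s = (s + q * m) >> k            # exact division by the radix

    # Under the stated assumptions s < 2*m here, so one subtraction suffices.
    return s - m if s >= m else s
```

For example, `mont_mul_radix2k(123, 45, 239, n_bits=8, k=3)` uses three radix-8 digits, so the result equals `123 * 45 * 512^(-1) mod 239`. Choosing `k=1` reduces the same loop to the bit-serial radix-2 case.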

Journal Article
TL;DR: In this paper, the influence of the printing temperature on linear wear, wear intensity, wear resistance, roughness and microhardness under conditions of reverse sliding friction in tribosystems with two different types of counterbodies was investigated.
Abstract: Purpose: The purpose of this study is to investigate the influence of the printing temperature on several tribological parameters. Design/methodology/approach: Polylactic acid (PLA) samples are produced at different printing temperatures. Results were obtained for the influence of the printing temperature on linear wear, wear intensity, wear resistance, roughness and microhardness under conditions of reverse sliding friction in tribosystems with two different types of counterbodies. Findings: The experiments and the analysis of the data obtained indicate that the printing temperature of PLA parts is directly related to their wear resistance: the higher the printing temperature, the greater the wear resistance. Accordingly, PLA machine elements that work under conditions of friction and wear (e.g. gears and plain bearings) are best printed at a higher temperature. Originality/value: To the best of the authors’ knowledge, the topic of this study is original and essential for future developments.

2 citations

01 Jan 2001
TL;DR: This paper describes an algorithm and architecture that extend a scalable radix-2 architecture proposed in previous work; the algorithm is proven correct and the hardware design is discussed in detail.
Abstract: This paper describes an algorithm and architecture based on an extension of a scalable radix-2 architecture proposed in a previous work. The algorithm is proven to be correct and the hardware design is discussed in detail. Experimental results compare a radix-8 implementation with a radix-2 design. The scalable Montgomery multiplier can be adjusted to constrained areas while still working on operands of any given precision. Similar to some systolic implementations, this design avoids the high load on signals that broadcast to several components, making the delay independent of the operands' precision.

1 citation


Cited by
Journal Article
TL;DR: A word-based version of MM is presented and used to explain the main concepts in the hardware design and gives enough freedom to select the word size and the degree of parallelism to be used, according to the available area and/or desired performance.
Abstract: This paper presents a scalable architecture for the computation of modular multiplication, based on the Montgomery multiplication (MM) algorithm. A word-based version of MM is presented and used to explain the main concepts in the hardware design. The proposed multiplier is able to work with any precision of the input operands, limited only by memory or control constraints. Its architecture gives enough freedom to select the word size and the degree of parallelism to be used, according to the available area and/or desired performance. Design trade-offs are analyzed in order to identify adequate hardware configurations for a given area or bandwidth requirement.

242 citations
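
The word-based idea in the abstract above, where the multiplier is scanned bit by bit while the multiplicand and modulus are consumed one word at a time, can be sketched in software as follows. This is a simplified illustration and not the paper's architecture: the names `to_words`, `from_words` and `mont_mul_wordwise` are hypothetical, and the one-bit right shift is done in a separate pass instead of being interleaved with the additions. It assumes an odd modulus `m < 2**n_bits`, `x < 2**n_bits` and `y < m`.

```python
def to_words(value, w, e):
    """Split an integer into e little-endian words of w bits each."""
    mask = (1 << w) - 1
    return [(value >> (w * j)) & mask for j in range(e)]


def from_words(words, w):
    """Rebuild an integer from little-endian w-bit words."""
    return sum(word << (w * j) for j, word in enumerate(words))


def mont_mul_wordwise(x, y, m, n_bits, w=8):
    """Return x * y * 2^(-n_bits) mod m using only w-bit word operations."""
    if m % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus")
    e = -(-(n_bits + 1) // w)  # words needed to hold values below 2*m
    mask = (1 << w) - 1
    y_words, m_words = to_words(y, w, e), to_words(m, w, e)
    s = [0] * e

    for i in range(n_bits):
        x_i = (x >> i) & 1
        # q = 1 when the modulus must be added to make the sum even.
        q = (s[0] + x_i * y_words[0]) & 1
        carry = 0
        for j in range(e):  # word-by-word addition with carry propagation
            t = s[j] + x_i * y_words[j] + q * m_words[j] + carry
            s[j] = t & mask
            carry = t >> w
        # Shift the multi-word sum right by one bit (exact, since it is even).
        for j in range(e - 1):
            s[j] = (s[j] >> 1) | ((s[j + 1] & 1) << (w - 1))
        s[e - 1] = (s[e - 1] >> 1) | (carry << (w - 1))

    result = from_words(s, w)
    return result - m if result >= m else result
```

Because the inner loop only ever touches one `w`-bit word at a time, the same routine works for any operand precision by changing `n_bits`, and the word size `w` can be chosen independently, which mirrors the trade-off the abstract describes.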

Journal Article
TL;DR: This paper presents an overview of hardware implementations for the two commonly used types of public key cryptography, i.e. RSA and elliptic curve cryptography, both based on modular arithmetic.

120 citations

Proceedings Article
24 Jun 2003
TL;DR: A hardware implementation of an arithmetic processor is described that is efficient for bit-lengths suitable for both commonly used types of public key cryptography (PKC), i.e., elliptic curve (EC) and RSA cryptosystems.
Abstract: We describe a hardware implementation of an arithmetic processor which is efficient for bit-lengths suitable for both commonly used types of public key cryptography (PKC), i.e., elliptic curve (EC) and RSA cryptosystems. Montgomery modular multiplication in a systolic array architecture is used for modular multiplication. The processor consists of special operational blocks for Montgomery modular multiplication, modular addition/subtraction, EC point doubling/addition, modular multiplicative inversion, EC point multiplier, projective to affine coordinates conversion and Montgomery to normal representation conversion.

120 citations

Journal Article
TL;DR: Two new hardware architectures that are able to perform the same operation in approximately n clock cycles with almost the same clock period are proposed, based on precomputing partial results using two possible assumptions regarding the most significant bit of the previous word.
Abstract: Montgomery modular multiplication is one of the fundamental operations used in cryptographic algorithms, such as RSA and Elliptic Curve Cryptosystems. At CHES 1999, Tenca and Koc proposed the Multiple-Word Radix-2 Montgomery Multiplication (MWR2MM) algorithm and introduced a now-classic architecture for implementing Montgomery multiplication in hardware. With parameters optimized for minimum latency, this architecture performs a single Montgomery multiplication in approximately 2n clock cycles, where n is the size of operands in bits. In this paper, we propose two new hardware architectures that are able to perform the same operation in approximately n clock cycles with almost the same clock period. These two architectures are based on precomputing partial results using two possible assumptions regarding the most significant bit of the previous word. These two architectures outperform the original architecture of Tenca and Koc in terms of the product of latency and area by 23 and 50 percent, respectively, for several of the most common operand sizes used in cryptography. The architecture in radix-2 can be extended to the case of radix-4, while preserving a factor of two speedup over the corresponding radix-4 design by Tenca, Todorov, and Koc from CHES 2001. Our optimization has been verified by modeling it in Verilog-HDL, implementing it on a Xilinx Virtex-II 6000 FPGA, and experimentally testing it on an SRC-6 reconfigurable computer.

100 citations

Book Chapter
10 Sep 2007
TL;DR: A circuit architecture is proposed that handles multiple data lengths using the same circuits and improves the Montgomery multiplication algorithm to maximize the performance of the multiplication unit in an FPGA.
Abstract: This paper describes a modular exponentiation processing method and circuit architecture that can exhibit the maximum performance of FPGA resources. The proposed modular exponentiation architecture comprises three main techniques. The first technique is to improve the Montgomery multiplication algorithm in order to maximize the performance of the multiplication unit in the FPGA. The second technique is to improve and balance the circuit delay. The third technique is to ensure and make fast the scalability of the effective FPGA resource. We propose a circuit architecture that can handle multiple data lengths using the same circuits. In addition, our architecture can perform fast operations using small-scale resources; in particular, it can complete 512-bit modular exponentiation in 0.26 ms on an XC4VF12-10SF363, which has the minimum logic resources in the Virtex-4 Series FPGAs. The design uses approximately 4000 SLICEs, making it very compact. Moreover, 1024-, 1536- and 2048-bit modular exponentiations can be processed in the same circuit thanks to this scalability.

78 citations