Author

Christoph Puttmann

Bio: Christoph Puttmann is an academic researcher from the University of Paderborn. The author has contributed to research in the topics Network on a chip and Elliptic curve cryptography. The author has an h-index of 7 and has co-authored 18 publications receiving 156 citations.

Papers
Proceedings ArticleDOI
29 Aug 2007
TL;DR: The GigaNoC, a hierarchical Network-on-Chip especially suitable for scalable chip multiprocessor architectures, is presented; it features a packet-switched, wormhole-routed on-chip network that forms the backbone of the authors' multiprocessor architecture.
Abstract: Due to the technological progress in the semiconductor industry, more and more components can be integrated on a single die forming a complex System-on-Chip. For enabling an efficient interaction between the various building blocks of today's SoCs, efficient communication structures become more and more essential. In this paper, we present the GigaNoC, a hierarchical Network-on-Chip that is especially suitable for scalable Chip-Multiprocessor architectures. The GigaNoC approach features a packet-switched wormhole routing on-chip network that provides the backbone of our multiprocessor architecture. In order to meet bandwidth requirements of different application domains, our Network-on-Chip is easily scalable and parameterizable in various aspects. This work highlights the communication protocol and shows a performance evaluation for different congestion scenarios. Furthermore, we present an FPGA-based prototypical realization and introduce a debugging and verification environment. Finally, implementation results for a standard cell technology are discussed.

33 citations
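
To make the routing scheme named in the abstract concrete, the following is a minimal sketch of wormhole switching: the header flit of a packet reserves an output port at each hop, the body flits follow it, and the tail flit releases the reservation. This illustrates the general technique only, not the GigaNoC implementation; the class names, port names, and the dimension-ordered (XY) routing choice are assumptions for illustration.

from dataclasses import dataclass

@dataclass
class Flit:
    kind: str            # "head", "body", or "tail"
    dest: tuple          # (x, y) destination, carried by the head flit
    payload: int = 0

class WormholeRouter:
    # Illustrative wormhole switch: a head flit locks an output port,
    # body flits reuse the lock, and the tail flit releases it.
    def __init__(self, pos):
        self.pos = pos       # (x, y) coordinates of this router
        self.locks = {}      # input port -> reserved output port

    def route(self, flit, in_port):
        if flit.kind == "head":
            # Dimension-ordered (XY) routing, assumed here for simplicity.
            self.locks[in_port] = self._xy_route(flit.dest)
        out = self.locks[in_port]
        if flit.kind == "tail":
            del self.locks[in_port]   # free the channel for the next packet
        return out

    def _xy_route(self, dest):
        x, y = self.pos
        dx, dy = dest[0] - x, dest[1] - y
        if dx:
            return "east" if dx > 0 else "west"
        if dy:
            return "north" if dy > 0 else "south"
        return "local"

# Example: a three-flit packet entering one router at (0, 0) on its way to (2, 1).
router = WormholeRouter((0, 0))
packet = [Flit("head", (2, 1)), Flit("body", (2, 1), 42), Flit("tail", (2, 1))]
print([router.route(f, in_port="local") for f in packet])   # ['east', 'east', 'east']

Because body and tail flits simply follow the reservation made by the head flit, per-hop buffers only need to hold a few flits rather than whole packets, which is part of what makes wormhole routing attractive for on-chip networks.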

Proceedings Article
16 Aug 2007
TL;DR: In this article, a fast-Fourier-transform-based method was proposed to reduce the number of \(\mathbb{F}_{3^{m}}\)-multiplications for multiplication in \(\mathbb{F}_{3^{6m}}\) from 18 in recent implementations to 15.
Abstract: Efficient computation of the Tate pairing is an important part of pairing-based cryptography. Recently with the introduction of the Duursma-Lee method special attention has been given to the fields of characteristic 3. Especially multiplication in \(\mathbb{F}_{3^{6m}}\), where m is prime, is an important operation in the above method. In this paper we propose a new method to reduce the number of \(\mathbb{F}_{3^{m}}\)-multiplications for multiplication in \(\mathbb{F}_{3^{6m}}\) from 18 in recent implementations to 15. The method is based on the fast Fourier transform and its explicit formulas are given. The execution times of our software implementations for \(\mathbb{F}_{3^{6m}}\) show the efficiency of our results.

31 citations

Book ChapterDOI
16 Aug 2007
TL;DR: A new method, based on the fast Fourier transform, is proposed to reduce the number of \(\mathbb{F}_{3^{m}}\)-multiplications for multiplication in \(\mathbb{F}_{3^{6m}}\) from 18 in recent implementations to 15.
Abstract: Efficient computation of the Tate pairing is an important part of pairing-based cryptography. Recently with the introduction of the Duursma-Lee method special attention has been given to the fields of characteristic 3. Especially multiplication in \(\mathbb{F}_{3^{6m}}\), where m is prime, is an important operation in the above method. In this paper we propose a new method to reduce the number of \(\mathbb{F}_{3^{m}}\)-multiplications for multiplication in \(\mathbb{F}_{3^{6m}}\) from 18 in recent implementations to 15. The method is based on the fast Fourier transform and its explicit formulas are given. The execution times of our software implementations for \(\mathbb{F}_{3^{6m}}\) show the efficiency of our results.

17 citations
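
The paper above reduces the number of \(\mathbb{F}_{3^{m}}\)-multiplications needed for one multiplication in the degree-6 extension \(\mathbb{F}_{3^{6m}}\) from 18 to 15 via an FFT-style evaluation/interpolation. Its explicit 15-multiplication formulas are not reproduced here; the sketch below only illustrates the underlying principle on the smallest possible case, where evaluation/interpolation (Karatsuba) multiplies two linear polynomials with 3 base multiplications instead of the schoolbook 4. The instrumented multiplication counter is purely illustrative.

# Count "expensive" base-field multiplications when multiplying two linear
# polynomials a0 + a1*X and b0 + b1*X.
MUL_COUNT = 0

def mul(a, b):
    # A base-field multiplication, instrumented so uses can be counted.
    global MUL_COUNT
    MUL_COUNT += 1
    return a * b

def schoolbook(a0, a1, b0, b1):
    # (a0 + a1 X)(b0 + b1 X) = a0*b0 + (a0*b1 + a1*b0) X + a1*b1 X^2: 4 products.
    return mul(a0, b0), mul(a0, b1) + mul(a1, b0), mul(a1, b1)

def karatsuba(a0, a1, b0, b1):
    # Evaluate at 0, 1 and infinity; three products suffice, and the middle
    # coefficient is recovered by interpolation.
    p0, p_inf = mul(a0, b0), mul(a1, b1)
    p1 = mul(a0 + a1, b0 + b1)
    return p0, p1 - p0 - p_inf, p_inf

MUL_COUNT = 0; print(schoolbook(2, 3, 5, 7), MUL_COUNT)   # (10, 29, 21) 4
MUL_COUNT = 0; print(karatsuba(2, 3, 5, 7), MUL_COUNT)    # (10, 29, 21) 3

The paper applies a similar evaluate-multiply-interpolate idea, via the fast Fourier transform, one level up in the extension \(\mathbb{F}_{3^{6m}}/\mathbb{F}_{3^{m}}\), which per the abstract brings the subfield multiplication count down from 18 to 15.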

Journal ArticleDOI
TL;DR: This article uses an FPGA-based rapid prototyping system to verify the functionality of the scalable GigaNetIC chip multiprocessor architecture before fabricating the ASIC in a modern CMOS standard cell technology.

16 citations

Proceedings ArticleDOI
10 Mar 2008
TL;DR: A low-overhead reconfigurable multiprocessor that provides both parallelism and flexibility is proposed; by reconfiguring the architecture to suit the application, power savings of about 24% were noted in UMC's 90 nm standard cell technology.
Abstract: Reconfigurable architectures are being increasingly used for their flexibility and extensive parallelism to achieve accelerations for computationally intensive applications. Although these architectures provide easy adaptability, this comes with an overhead in terms of area, power and timing compared to non-reconfigurable ASICs. Here, we propose a low-overhead reconfigurable multiprocessor which provides both parallelism and flexibility. The architecture has been evaluated for its energy efficiency on a computationally intensive algorithm used in elliptic curve cryptography (ECC). Typically, algorithms in ECC exhibit task-level parallelism and demand a large amount of computational resources for custom implementations to achieve a significant speedup. A finite field multiplication in GF(2^233) was chosen as a sample application to evaluate the performance of the QuadroCore reconfigurable multiprocessor architecture. A three-fold performance improvement compared to a single-processor implementation was observed. Further, via reconfiguration to suit the application, power savings of about 24% were noted in UMC's 90 nm standard cell technology.

13 citations
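
For reference, the benchmark mentioned in the abstract, a finite field multiplication in GF(2^233), reduces in software to a carry-less polynomial multiplication followed by reduction modulo an irreducible polynomial. The sketch below shows that computation in plain form; the reduction polynomial x^233 + x^74 + 1 is an assumption (it is the usual NIST choice for this field size, which the abstract does not name), and nothing here reflects how the work is partitioned across the QuadroCore processors.

# Bit i of an integer represents the coefficient of x^i of a polynomial over GF(2).
M = 233
R = (1 << 233) | (1 << 74) | 1   # x^233 + x^74 + 1 (assumed reduction polynomial)

def clmul(a, b):
    # Carry-less (XOR-accumulate) multiplication of two GF(2) polynomials.
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a <<= 1
        b >>= 1
    return acc

def gf2m_mul(a, b):
    # Multiply in GF(2^233) = GF(2)[x] / (x^233 + x^74 + 1).
    p = clmul(a, b)
    while p.bit_length() > M:            # degree of p is still >= 233
        shift = p.bit_length() - 1 - M
        p ^= R << shift                  # cancel the leading term
    return p

# Sanity check: x * x^232 = x^233, which reduces to x^74 + 1.
assert gf2m_mul(1 << 1, 1 << 232) == (1 << 74) | 1
print(hex(gf2m_mul(0x1B2D, 0xC0FFEE)))

In a multiprocessor mapping, such a multiplication is typically split into word-level partial products that can be computed in parallel and combined with XORs, which is the kind of task-level parallelism the abstract refers to.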


Cited by
Journal ArticleDOI
TL;DR: This highly successful textbook, widely regarded as the “bible of computer algebra”, gives a thorough introduction to the algorithmic basis of the mathematical engine in computer algebra systems.
Abstract: Computer algebra systems are now ubiquitous in all areas of science and engineering. This highly successful textbook, widely regarded as the “bible of computer algebra”, gives a thorough introduction to the algorithmic basis of the mathematical engine in computer algebra systems. Designed to accompany one- or two-semester courses for advanced undergraduate or graduate students in computer science or mathematics, its comprehensiveness and reliability have also made it an essential reference for professionals in the area. Special features include: detailed study of algorithms including time analysis; implementation reports on several topics; complete proofs of the mathematical underpinnings; and a wide variety of applications (among others, in chemistry, coding theory, cryptography, computational logic, and the design of calendars and musical scales). A great deal of historical information and illustration enlivens the text. In this third edition, errors have been corrected and much of the Fast Euclidean Algorithm chapter has been renovated.

937 citations

Journal ArticleDOI
TL;DR: This new textbook by R. E. Blahut contains perhaps the most comprehensive coverage of fast algorithms to date, with an emphasis on implementing the two canonical signal processing operations of convolution and discrete Fourier transformation.
Abstract: This new textbook by R. E. Blahut, which deals with the theory and design of efficient algorithms for digital signal processing, contains perhaps the most comprehensive coverage of fast algorithms to date. A large collection of algorithms is treated, with an emphasis on implementing the two canonical signal processing operations of convolution and discrete Fourier transformation. In recent years, there has been much work done on fast algorithms, and Blahut does a fine job of blending material from diverse sources to form a coherent and self-contained approach to his subject. The mathematical level of this book is high, reflecting the rather abstract nature of the theoretical underpinnings of fast computational techniques. Although electrical engineers are for the most part mathematically sophisticated, they tend to lack training in abstract algebra and number theory, both of which are essential to any thorough discussion of fast algorithms. Thus this audience should find the tutorial chapters which the text provides on these topics to be quite helpful. An additional feature of the text, which the nonspecialist should find useful, is that each new algorithm is described through three different formats: a simple example, a flowchart, and a set of matrix equations. This use of repetition assists the reader in grasping subject matter which for the most part is nonintuitive. Operation counts (as measured by the number of multiplications and the number of additions) for each algorithm are tabulated for a variety of blocklengths (i.e., lengths of data segments), making performance comparisons easy. As the author points out, run-time comparisons may be quite different. Each chapter concludes with a set of problems of varying difficulty. These problems are well integrated with the text and serve to supplement the many examples worked out in the text.

The book is devoted to how one rapidly computes various mathematical operators such as transforms and convolutions. For a deeper understanding of the meaning of these operators, one must consult other sources in which their use is discussed. The text emphasizes algorithms which employ a reduced, or minimum, number of multiplications, although addition counts are also taken into consideration. However, an algorithm which is the “fastest” as measured in arithmetic operation counts may not be the fastest in execution time, particularly if dedicated hardware is employed. Indeed, in practice other considerations frequently propel one away from the computationally “optimal” algorithm. Much work has been done on the theory and application of signal processing algorithms which are “efficient” in terms other than multiply/add counts, such as roundoff noise, limit cycles, coefficient quantization, memory access, hardware costs, etc. It is clearly necessary to limit the scope of any treatise, and the exclusion of differing performance measures is certainly appropriate.

A description of the contents of the book will now be given, followed by some concluding remarks of a more general nature. Chapter 2 is a tutorial on abstract algebra. It is quite readable and is liberally laced with examples. In addition to the standard modern algebra fare (groups, rings, fields, vector spaces, matrices), the ubiquitous Chinese remainder theorem is discussed in detail. Chapters 3 and 4, and their extensions in Chapters 7 and 8, form the core of the text. The third chapter addresses fast algorithms for short convolutions. The Cook-Toom convolution algorithm is discussed, followed by the Winograd convolution algorithm. A proof of the optimality of the Winograd algorithm, with respect to multiplications, for performing cyclic convolutions is presented at the close of the chapter. The fourth chapter addresses fast algorithms for computing the discrete Fourier transform. The Cooley-Tukey algorithm is considered first. The approach taken is to view this algorithm as a means of mapping a one-dimensional Fourier transform into a multidimensional transform. Variations of the algorithm are discussed, including the Rader-Brenner transform. Next, the Good-Thomas algorithm is discussed. This algorithm is again presented as a means of mapping a one-dimensional transform into a higher-dimensional transform, this time based on the Chinese remainder theorem. Rader's algorithm for computing prime-length Fourier transforms by use of convolution is presented next. Extensions of the algorithm to blocklengths which are the power of an odd prime are considered. The chapter closes with the Winograd Fourier transform, which builds upon the Rader prime algorithm. Certain short blocklengths are considered in detail, and the corresponding algorithms are compiled into an appendix. Chapter 5 is a mathematical interlude, tutorially covering items from number theory and algebraic field theory which are needed in later chapters. Topics include the totient function, Euler's theorem, Fermat's theorem, minimal polynomials, and cyclotomic polynomials. Chapter 6 is devoted to number-theoretic transforms. These transforms proceed by representing the data values themselves in the field of integers modulo a prime. Convolution in integer fields is also covered. Chapters 7 and 8 extend the convolution and transform methods of Chapters 3 and 4 to higher dimensions. Multidimensional transforms (convolutions) are used both to efficiently compute one-dimensional transforms (convolutions) and to process data which are inherently higher dimensional. Both applications are treated in these chapters. Topics include the Agarwal-Cooley convolution algorithm, polynomial transforms, the family of Johnson-Burrus transforms, and the Nussbaumer-Quandalle FFT. Chapter 9 discusses architectures for transforms and digital filters and includes treatments of FFT butterfly networks and overlap-add convolution. The remaining three chapters are mostly independent from the rest of the book. Chapter 10 covers fast algorithms based on doubling strategies. Computational tasks for which such fast algorithms are derived include sorting, matrix transposition, matrix multiplication, polynomial division, computation of trigonometric functions, and coordinate rotation. Many of these operations arise as steps in the solution of one or more signal processing problems. Fast algorithms for solving Toeplitz systems are the theme of Chapter 11. There is a variety of fast algorithms discussed, the proper choice of which depends on the specific structure of the Toeplitz system at hand (such as whether or not the system is symmetric and whether or not the right-hand vector is arbitrary). The final chapter addresses fast algorithms for trellis and tree search and includes the Viterbi, Stack, and Fano algorithms.

175 citations
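
The review above describes the Cooley-Tukey algorithm as a way of mapping a one-dimensional Fourier transform into smaller transforms. As a concrete (and standard, not book-specific) illustration of that decomposition, here is a minimal recursive radix-2 version for power-of-two lengths:

import cmath

def fft(x):
    # Radix-2 Cooley-Tukey FFT: a length-n DFT (n a power of two) is split into
    # two length-n/2 DFTs over the even- and odd-indexed samples.
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

# An impulse has a flat spectrum; the recursion costs O(n log n) instead of O(n^2).
print([round(abs(v), 6) for v in fft([1, 0, 0, 0, 0, 0, 0, 0])])

The Good-Thomas and Rader algorithms mentioned in the review obtain their smaller subproblems differently, via the Chinese remainder theorem and via a reduction to convolution, respectively.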

Journal Article
TL;DR: In this paper, the authors introduce a set of five custom instructions to accelerate arithmetic operations in the finite fields GF(p) and GF(2^m); the instructions can be easily integrated into a standard RISC architecture like MIPS32 and require only little extra hardware.
Abstract: Instruction set extensions are a small number of custom instructions specifically designed to accelerate the processing of a given kind of workload such as multimedia or cryptography. Enhancing a general-purpose RISC processor with a few application-specific instructions to facilitate the inner-loop operations of public-key cryptosystems can result in a significant performance gain. In this paper we introduce a set of five custom instructions to accelerate arithmetic operations in the finite fields GF(p) and GF(2^m). The custom instructions can be easily integrated into a standard RISC architecture like MIPS32 and require only little extra hardware. Our experimental results show that an extended MIPS32 core is able to perform an elliptic curve scalar multiplication over a 192-bit prime field in 36 msec, assuming a clock speed of 33 MHz. An elliptic curve scalar multiplication over the binary field GF(2^191) takes only 21 msec, which is approximately six times faster than a software implementation on a standard MIPS32 processor.

70 citations
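
To give a sense of what such an instruction replaces: a classic candidate for instruction-set support in GF(2^m) arithmetic is a word-level carry-less multiply, which a software-only implementation has to emulate with a shift-and-XOR loop on every word pair of the operands. The sketch below is that emulation; the function name, word size and (low, high) return convention are illustrative assumptions, not the paper's actual instruction definitions.

WORD = 32   # assumed word size of the base RISC architecture

def mulgf_lo_hi(a, b):
    # Software emulation of a 32 x 32 -> 64-bit carry-less multiply, the kind of
    # inner-loop primitive a custom GF(2^m) instruction can collapse into one operation.
    acc = 0
    for i in range(WORD):
        if (b >> i) & 1:
            acc ^= a << i
    return acc & 0xFFFFFFFF, acc >> WORD   # (low word, high word)

lo, hi = mulgf_lo_hi(0x80000001, 0x00000003)
print(hex(hi), hex(lo))   # 0x1 0x80000003

Replacing loops like this (and the analogous multiply-step for GF(p)) with single instructions is the kind of change that produces the speedups the abstract reports for elliptic curve scalar multiplication.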

Journal ArticleDOI
TL;DR: In this article, a hardware accelerator based on a unified arithmetic operator, able to perform the operations required by a given algorithm, is proposed for the field \(\mathbb{F}_{3^{97}}\) given by \(\mathbb{F}_3[x]/(x^{97}+x^{12}+2)\).
Abstract: Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. With software implementations being rather slow, the study of hardware architectures became an active research area. In this paper, we discuss several algorithms to compute the \(\eta_T\) pairing in characteristic three and suggest further improvements. These algorithms involve addition, multiplication, cubing, inversion, and sometimes cube root extraction over \(\mathbb{F}_{3^{m}}\). We propose a hardware accelerator based on a unified arithmetic operator able to perform the operations required by a given algorithm. We describe the implementation of a compact coprocessor for the field \(\mathbb{F}_{3^{97}}\) given by \(\mathbb{F}_3[x]/(x^{97}+x^{12}+2)\), which compares favorably with other solutions described in the open literature.

52 citations
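
To close, here is a small software model of the field the last abstract names, \(\mathbb{F}_{3^{97}} = \mathbb{F}_3[x]/(x^{97}+x^{12}+2)\), illustrating why cubing is one of the cheap operations such a coprocessor supports: in characteristic three the cube of a polynomial is obtained by simply spreading out its coefficients (the Frobenius map), followed by one reduction. The coefficient-list representation is an illustrative choice only; a real accelerator would use a packed encoding and a very different reduction circuit.

import random

M = 97   # field: GF(3)[x] / (x^97 + x^12 + 2), as given in the abstract

def reduce_poly(c):
    # Reduce a GF(3) coefficient list modulo x^97 + x^12 + 2,
    # using x^97 = 2*x^12 + 1 (mod 3).
    for i in range(len(c) - 1, M - 1, -1):
        if c[i]:
            t, c[i] = c[i], 0
            c[i - M + 12] = (c[i - M + 12] + 2 * t) % 3
            c[i - M] = (c[i - M] + t) % 3
    return c[:M]

def field_mul(a, b):
    # Schoolbook polynomial multiplication over GF(3), then reduction.
    c = [0] * (2 * M - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                c[i + j] = (c[i + j] + ai * bj) % 3
    return reduce_poly(c)

def field_cube(a):
    # Frobenius in characteristic 3: (sum a_i x^i)^3 = sum a_i x^(3i).
    c = [0] * (3 * M - 2)
    for i, ai in enumerate(a):
        c[3 * i] = ai
    return reduce_poly(c)

a = [random.randrange(3) for _ in range(M)]
assert field_cube(a) == field_mul(field_mul(a, a), a)
print("cubing = coefficient spreading + one reduction")

Multiplication, by contrast, costs a full double loop (or a dedicated multiplier), which is why pairing algorithms in characteristic three are typically compared by how many multiplications, cubings, inversions and cube roots they need per iteration, the operation mix the abstract lists.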