scispace - formally typeset

Showing papers on "Binary number published in 2004"


Journal ArticleDOI
TL;DR: A new method, the binary approach, is introduced; it computes the elementary modes as binary patterns of participating reactions, from which the respective stoichiometric coefficients can be computed in a post-processing step, giving the most efficient method available for computing elementary modes to date.
Abstract: Metabolic pathway analysis has been recognized as a central approach to the structural analysis of metabolic networks. The concept of elementary (flux) modes provides a rigorous formalism to describe and assess pathways and has proven to be valuable for many applications. However, computing elementary modes is a hard computational task. In recent years, a multitude of algorithms dedicated to this task have appeared, calling for a unifying point of view and continued improvement of the current methods. We show that computing the set of elementary modes is equivalent to computing the set of extreme rays of a convex cone. This standard mathematical representation provides a unified framework that encompasses the most prominent algorithmic methods that compute elementary modes and allows a clear comparison between them. Taking lessons from this benchmark, we introduce a new method, the binary approach, which computes the elementary modes as binary patterns of participating reactions from which the respective stoichiometric coefficients can be computed in a post-processing step. We implemented the binary approach in FluxAnalyzer 5.1, a software that is free for academics. The binary approach decreases the memory demand by up to 96% without loss of speed, making it the most efficient method available for computing elementary modes to date. The equivalence between elementary modes and extreme ray computations offers opportunities for employing tools from polyhedral computation for metabolic pathway analysis. The new binary approach introduced herein was derived from this general theoretical framework and facilitates the computation of elementary modes in considerably larger networks.
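The memory advantage of the binary approach can be illustrated with a toy sketch. Each elementary mode is stored only as a bitmask over the reactions that carry flux; the stoichiometric coefficients are recovered later in post-processing. Names and data below are illustrative, not taken from the FluxAnalyzer implementation.

```python
def to_pattern(mode, tol=1e-9):
    """Pack a flux vector into an integer bitmask of participating reactions
    (bit i set = reaction i carries nonzero flux in this mode)."""
    pattern = 0
    for i, v in enumerate(mode):
        if abs(v) > tol:
            pattern |= 1 << i
    return pattern

# Two illustrative elementary modes over four reactions.
modes = [
    [1.0, 2.0, 0.0, 1.0],   # reactions 0, 1, 3 active
    [0.0, 1.0, 3.0, 0.0],   # reactions 1, 2 active
]
patterns = [to_pattern(m) for m in modes]
print([bin(p) for p in patterns])  # ['0b1011', '0b110']
```

Each pattern needs one bit per reaction instead of a floating-point coefficient, which is where the large memory saving comes from.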

287 citations


Proceedings ArticleDOI
18 Oct 2004
TL;DR: This paper presents a novel abstraction, called accrual failure detectors, that emphasizes flexibility and expressiveness and can serve as a basic building block for implementing failure detectors in distributed systems.
Abstract: The detection of failures is a fundamental issue for fault-tolerance in distributed systems. Recently, many people have come to realize that failure detection ought to be provided as some form of generic service, similar to IP address lookup or time synchronization. However, this has not been successful so far; one of the reasons is that classical failure detectors were not designed to satisfy several application requirements simultaneously. We present a novel abstraction, called accrual failure detectors, that emphasizes flexibility and expressiveness and can serve as a basic building block for implementing failure detectors in distributed systems. Instead of providing information of a binary nature (trust vs. suspect), accrual failure detectors output a suspicion level on a continuous scale. The principal merit of this approach is that it favors a nearly complete decoupling between application requirements and the monitoring of the environment. In this paper, we describe an implementation of such an accrual failure detector, which we call the φ failure detector. The particularity of the φ failure detector is that it dynamically adjusts the scale on which the suspicion level is expressed to current network conditions. We analyzed the behavior of our φ failure detector over an intercontinental communication link over the course of a week. Our experimental results show that it performs as well as other known adaptive failure detection mechanisms, with improved flexibility.

125 citations


Book ChapterDOI
15 Aug 2004
TL;DR: The most common methods for computing exponentiation of random elements in Abelian groups are sliding window schemes, which enhance the efficiency of the binary method at the expense of some precomputation.
Abstract: The most common methods for computing exponentiation of random elements in Abelian groups are sliding window schemes, which enhance the efficiency of the binary method at the expense of some precomputation. In groups where inversion is easy (e.g., elliptic curves), signed representations of the exponent are meaningful because they decrease the amount of required precomputation. The asymptotically best signed method is wNAF, because it minimizes the precomputation effort while its non-zero density is nearly optimal. Unfortunately, wNAF can be computed only from the least significant bit, i.e., right-to-left. However, for memory-constrained devices, left-to-right recoding schemes are far more valuable.
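The sliding window technique can be sketched as follows. This is the textbook left-to-right scheme with precomputed odd powers, written here for modular integer arithmetic; the function name and parameters are illustrative.

```python
def sliding_window_pow(g, e, mod, w=4):
    """Left-to-right sliding-window exponentiation: precompute the odd powers
    g^1, g^3, ..., g^(2^w - 1), then scan the exponent bits, absorbing
    windows of up to w bits that end in a 1."""
    # Precompute odd powers g^1, g^3, ..., g^(2^w - 1).
    g2 = (g * g) % mod
    odd = {1: g % mod}
    for k in range(3, 1 << w, 2):
        odd[k] = (odd[k - 2] * g2) % mod

    bits = bin(e)[2:]
    result, i = 1, 0
    while i < len(bits):
        if bits[i] == '0':
            result = (result * result) % mod
            i += 1
        else:
            # Take the longest window (<= w bits) that ends in a 1,
            # so its value is odd and found in the precomputed table.
            j = min(i + w, len(bits))
            while bits[j - 1] == '0':
                j -= 1
            for _ in range(j - i):
                result = (result * result) % mod
            result = (result * odd[int(bits[i:j], 2)]) % mod
            i = j
    return result

print(sliding_window_pow(7, 1000003, 2**61 - 1) == pow(7, 1000003, 2**61 - 1))  # True
```

Larger windows trade more precomputation for fewer multiplications during the scan, which is exactly the trade-off the abstract describes.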

95 citations


Journal ArticleDOI
TL;DR: In this article, the authors carried out a deep, 3.6 cm radio continuum survey of young outflow sources using the Very Large Array in its A configuration providing subarcsecond resolution.
Abstract: We have carried out a deep, 3.6 cm radio continuum survey of young outflow sources using the Very Large Array in its A configuration, providing subarcsecond resolution. The eight regions observed are Haro 6-10 and L1527 IRS in Taurus, Haro 5a/6a in OMC 2/3, NGC 2023 MMS, NGC 2264 IRS1, HH 108 IRAS/MMS in Serpens, L1228, and L1251A. In combination with our similar and previously published maps of eight other star-forming regions, we find only one region with a single source, while the other 15 regions have on average 3.9 nearby sources. This supports the view that isolated star formation is rare. We have selected 21 objects, all of them young, mostly Class I sources, and find a binary frequency of 33% in the separation range from 0.5'' to 12''. This is, within the uncertainties, comparable to the observed binary frequency among T Tauri stars in a similar separation range. Seven of the 21 sources drive giant Herbig-Haro flows. Four of these seven are known to have companions (three are triple systems), corresponding to 57%. We discuss these results in relation to the hypothesis that giant Herbig-Haro flows are driven by disintegrating multiple systems.

86 citations


Book
01 Jan 2004
TL;DR: This book surveys computer number systems and the algorithms and hardware structures for binary arithmetic, covering addition, multiplication, division, floating-point, residue, logarithmic, and signed-digit operations.
Abstract: Preface. List of Figures. List of Tables. About the Author.
1. Computer Number Systems. 1.1 Conventional Radix Number System. 1.2 Conversion of Radix Numbers. 1.3 Representation of Signed Numbers. 1.3.1 Sign-Magnitude. 1.3.2 Diminished Radix Complement. 1.3.3 Radix Complement. 1.4 Signed-Digit Number System. 1.5 Floating-Point Number Representation. 1.5.1 Normalization. 1.5.2 Bias. 1.6 Residue Number System. 1.7 Logarithmic Number System. References. Problems.
2. Addition and Subtraction. 2.1 Single-Bit Adders. 2.1.1 Logical Devices. 2.1.2 Single-Bit Half-Adder and Full-Adders. 2.2 Negation. 2.2.1 Negation in One's Complement System. 2.2.2 Negation in Two's Complement System. 2.3 Subtraction through Addition. 2.4 Overflow. 2.5 Ripple Carry Adders. 2.5.1 Two's Complement Addition. 2.5.2 One's Complement Addition. 2.5.3 Sign-Magnitude Addition. References. Problems.
3. High-Speed Adder. 3.1 Conditional-Sum Addition. 3.2 Carry-Completion Sensing Addition. 3.3 Carry-Lookahead Addition (CLA). 3.3.1 Carry-Lookahead Adder. 3.3.2 Block Carry Lookahead Adder. 3.4 Carry-Save Adders (CSA). 3.5 Bit-Partitioned Multiple Addition. References. Problems.
4. Sequential Multiplication. 4.1 Add-and-Shift Approach. 4.2 Indirect Multiplication Schemes. 4.2.1 Unsigned Number Multiplication. 4.2.2 Sign-Magnitude Number Multiplication. 4.2.3 One's Complement Number Multiplication. 4.2.4 Two's Complement Number Multiplication. 4.3 Robertson's Signed Number Multiplication. 4.4 Recoding Technique. 4.4.1 Non-overlapped Multiple Bit Scanning. 4.4.2 Overlapped Multiple Bit Scanning. 4.4.3 Booth's Algorithm. 4.4.4 Canonical Multiplier Recoding. References. Problems.
5. Parallel Multiplication. 5.1 Wallace Trees. 5.2 Unsigned Array Multiplier. 5.3 Two's Complement Array Multiplier. 5.3.1 Baugh-Wooley Two's Complement Multiplier. 5.3.2 Pezaris Two's Complement Multipliers. 5.4 Modular Structure of Large Multiplier. 5.4.1 Modular Structure. 5.4.2 Additive Multiply Modules. 5.4.3 Programmable Multiply Modules. References. Problems.
6. Sequential Division. 6.1 Subtract-and-Shift Approach. 6.2 Binary Restoring Division. 6.3 Binary Non-Restoring Division. 6.4 High-Radix Division. 6.4.1 High-Radix Non-Restoring Division. 6.4.2 SRT Division. 6.4.3 Modified SRT Division. 6.4.4 Robertson's High-Radix Division. 6.5 Convergence Division. 6.5.1 Convergence Division Methodologies. 6.5.2 Divider Implementing Convergence Division Algorithm. 6.6 Division by Divisor Reciprocation. References. Problems.
7. Fast Array Dividers. 7.1 Restoring Cellular Array Divider. 7.2 Non-Restoring Cellular Array Divider. 7.3 Carry-Lookahead Cellular Array Divider. References. Problems.
8. Floating Point Operations. 8.1 Floating Point Addition/Subtraction. 8.2 Floating Point Multiplication. 8.3 Floating Point Division. 8.4 Rounding. 8.5 Extra Bits. References. Problems.
9. Residue Number Operations. 9.1 RNS Addition, Subtraction and Multiplication. 9.2 Number Comparison and Overflow Detection. 9.2.1 Unsigned Number Comparison. 9.2.2 Overflow Detection. 9.2.3 Signed Numbers and Their Properties. 9.2.4 Multiplicative Inverse and the Parity Table. 9.3 Division Algorithm. 9.3.1 Unsigned Number Division. 9.3.2 Signed Number Division. 9.3.3 Multiplicative Division Algorithm. References. Problems.
10. Operations through Logarithms. 10.1 Multiplication and Addition in Logarithmic Systems. 10.2 Addition and Subtraction in Logarithmic Systems. 10.3 Realizing the Approximation. References. Problems.
11. Signed-Digit Number Operations. 11.1 Characteristics of SD Numbers. 11.2 Totally Parallel Addition/Subtraction. 11.3 Required and Allowed Values. 11.4 Multiplication and Division. References. Problems.
Index.
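As a taste of the high-speed adder material, carry-lookahead addition computes every carry directly from generate (g_i = a_i AND b_i) and propagate (p_i = a_i XOR b_i) signals instead of letting carries ripple. A minimal software sketch of the flattened carry equations:

```python
def cla_add(a_bits, b_bits):
    """Carry-lookahead addition sketch (LSB-first lists of 0/1 bits).
    g_i = a_i & b_i (generate), p_i = a_i ^ b_i (propagate); the recurrence
    c_{i+1} = g_i | (p_i & c_i) is expanded so every carry is a direct
    function of the inputs -- the 'lookahead' that removes carry rippling."""
    n = len(a_bits)
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    c = [0] * (n + 1)                       # c[0] = carry-in = 0
    for i in range(n):
        # Flattened: c_{i+1} = g_i | p_i g_{i-1} | ... | p_i ... p_1 g_0
        carry = g[i]
        for j in range(i):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            carry |= term
        c[i + 1] = carry
    s = [p[i] ^ c[i] for i in range(n)]
    return s, c[n]                          # sum bits and carry-out

print(cla_add([1, 0, 1, 0], [1, 1, 0, 0]))  # 5 + 3 = 8: ([0, 0, 0, 1], 0)
```

In hardware each flattened carry term is one gate level, so all carries are available after a constant number of gate delays regardless of word length.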

76 citations


Journal ArticleDOI
TL;DR: This review describes impulse response techniques with a curve-fitting method to measure thermodynamic properties, such as binary diffusion coefficient, retention factor, and partial molar volume, under supercritical conditions.

70 citations


Journal ArticleDOI
TL;DR: An all-optical system for the addition of binary numbers is proposed in which input binary digits are encoded by appropriate cells in two different planes and output binary digits are expressed as the presence or the absence of a light signal.
Abstract: An all-optical system for the addition of binary numbers is proposed in which input binary digits are encoded by appropriate cells in two different planes and output binary digits are expressed as the presence (=1) or the absence (=0) of a light signal. The intensity-based optical XOR and AND logic operations are used here as basic building blocks. Nonlinear materials, appropriate cells (pixels), and other conventional optics are utilized in this system.
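The XOR and AND building blocks suffice for full multi-bit addition, because the two carry products of a full adder can never both be 1, so their OR is equivalent to XOR. A sketch of that logic in ordinary software, standing in for the optical gates:

```python
def full_adder(a, b, cin):
    """One bit-slice built only from XOR and AND: SUM = a ^ b ^ cin;
    the carry products a&b and cin&(a^b) are mutually exclusive,
    so combining them with XOR is equivalent to OR."""
    p = a ^ b
    s = p ^ cin
    cout = (a & b) ^ (cin & p)
    return s, cout

def add_binary(a, b, n=8):
    """Ripple n full adders to add two n-bit numbers."""
    carry, out = 0, 0
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        out |= s << i
    return out

print(add_binary(0b1011, 0b0110))  # 17
```

The same two-gate cell, repeated per bit position, is all that an adder of any width needs.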

65 citations


Journal ArticleDOI
TL;DR: This letter modifies the algorithm for constructing signatures, removing three of these exceptions, and derives tighter bounds for the fourth.
Abstract: Tightness of the Karystinos-Pados bounds was originally proved with four exceptions. In this letter, we modify the algorithm for constructing signatures, removing three of these exceptions. For the fourth, tighter bounds are derived.

61 citations


Journal ArticleDOI
TL;DR: An all-optical module that generates simultaneously four Boolean operations at 10 Gb/s is reported, which employs two cascaded ultrafast nonlinear interferometers and requires only two signals as inputs.
Abstract: In this letter, we report an all-optical module that generates simultaneously four Boolean operations at 10 Gb/s. The circuit employs two cascaded ultrafast nonlinear interferometers and requires only two signals as inputs. The first gate is configured as a 2 × 2 exchange-bypass switch and provides OR and AND logical operations. The second gate generates XOR (SUM bit) and AND (CARRY bit) Boolean operations and constitutes a binary half-adder. Successful operation of the system is demonstrated with 10-Gb/s return-to-zero pseudorandom data patterns.
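The module's four outputs can be checked on bit patterns. The sketch below models the gates as bitwise operations on integers; it is purely illustrative and not a model of the interferometers themselves.

```python
def boolean_module(a, b, nbits=8):
    """Four outputs of the two-gate module for two input bit streams:
    first stage OR and AND; second stage XOR (SUM) and AND (CARRY),
    i.e., a binary half-adder per bit position."""
    mask = (1 << nbits) - 1
    return {
        "OR": (a | b) & mask,
        "AND": (a & b) & mask,
        "SUM (XOR)": (a ^ b) & mask,
        "CARRY (AND)": (a & b) & mask,
    }

out = boolean_module(0b1100, 0b1010)
print(out)  # OR = 0b1110, AND = 0b1000, SUM = 0b0110, CARRY = 0b1000
```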

61 citations


Journal ArticleDOI
TL;DR: A simple one-dimensional unstable system whose observations are sent to a state estimator over a noisy binary communication link is considered, and a simple but efficient estimator for the binary symmetric channel (BSC) is constructed.
Abstract: We consider a simple one-dimensional system whose observations are sent to a state estimator over a noisy binary communication link. The interesting thing about the system is that it is unstable. The problem is to design an encoding scheme and a decoder such that the estimation error is stable. We explicitly construct a simple but efficient estimator for the binary symmetric channel (BSC). We are not aware of any such previous "codes" for the BSC. We compare our results to the nonconstructive bounds of Sahai.

59 citations


Journal ArticleDOI
TL;DR: In this article, a new description of the binary fluid problem via the lattice Boltzmann method is presented which highlights the use of the moments in constructing two equilibrium distribution functions.
Abstract: A new description of the binary fluid problem via the lattice Boltzmann method is presented which highlights the use of the moments in constructing two equilibrium distribution functions. This offers a number of benefits, including better isotropy, and a more natural route to the inclusion of multiple relaxation times for the binary fluid problem. In addition, the implementation of solid colloidal particles suspended in the binary mixture is addressed, which extends the solid-fluid boundary conditions for mass and momentum to include a single conserved compositional order parameter. A number of simple benchmark problems involving a single particle at or near a fluid-fluid interface are undertaken and show good agreement with available theoretical or numerical results.

Journal ArticleDOI
TL;DR: In this article, bounds on the density of 1's in the binary expansions of real algebraic numbers are established.
Abstract: Employing concepts from additive number theory, together with results on binary evaluations and partial series, we establish bounds on the density of 1's in the binary expansions of real algebraic numbers. A central result is that if a real y has algebraic degree D > 1, then the number #(|y|,N) of 1-bits in the expansion of |y| through bit position N satisfies

Journal ArticleDOI
TL;DR: A general sliding window scheme is described that extends minimal binary sliding window conversion to arbitrary radix and to encompass signed digit sets and expresses a number of known recoding techniques as special cases.
Abstract: We consider the problem of recoding a number to minimize the number of nonzero digits in its representation, that is, to minimize the weight of the representation. A general sliding window scheme is described that extends minimal binary sliding window conversion to arbitrary radix and to encompass signed digit sets. This new conversion expresses a number of known recoding techniques as special cases. Proof that this scheme achieves minimal weight for a given digit set is provided and results concerning the theoretical average and worst-case weight are derived.
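The simplest instance of such a signed-digit recoding is the non-adjacent form (NAF): radix 2 with digit set {-1, 0, 1}, which reduces the average density of nonzero digits from about 1/2 to about 1/3. A sketch of the standard right-to-left conversion:

```python
def naf(n):
    """Right-to-left non-adjacent form (NAF) of a positive integer:
    digits in {-1, 0, 1} with no two adjacent nonzeros -- the radix-2,
    width-2 special case of the general sliding-window recoding."""
    digits = []
    while n > 0:
        if n & 1:
            d = 2 - (n & 3)        # +1 if n ≡ 1 (mod 4), -1 if n ≡ 3 (mod 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits                   # least significant digit first

def weight(digs):
    """Number of nonzero digits in a representation."""
    return sum(1 for d in digs if d)

n = 375
print(weight([int(c) for c in bin(n)[2:]]), weight(naf(n)))  # 7 4
```

The weight drops from 7 nonzero digits in plain binary to 4 in NAF; in exponentiation, each nonzero digit costs a multiplication, so lower weight means fewer group operations.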

Journal ArticleDOI
TL;DR: In this article, the authors examine the application of spectroastrometry to binary point sources which are spatially unresolved due to the observational point spread function convolution, and present the relation to the ratio of the fluxes of the two components of the binary.
Abstract: Spectroastrometry is a technique which has the potential to resolve flux distributions on scales of milliarcseconds. In this study, we examine the application of spectroastrometry to binary point sources which are spatially unresolved due to the observational point spread function convolution. The technique uses measurements with sub-pixel accuracy of the position centroid of high signal-to-noise long-slit spectrum observations. With the objects in the binary contributing fractionally more or less at different wavelengths (particularly across spectral lines), the variation of the position centroid with wavelength provides some information on the spatial distribution of the flux. We examine the width of the flux distribution in the spatial direction, and present its relation to the ratio of the fluxes of the two components of the binary. Measurement of three observables (total flux, position centroid and flux distribution width) at each wavelength allows a unique separation of the total flux into its component parts even though the angular separation of the binary is smaller than the observations' point-spread function. This is because we have three relevant observables for three unknowns (the two fluxes, and the angular separation of the binary), which therefore generates a closed problem. This is a wholly different technique than conventional deconvolution methods, which produce information on angular sizes of the sampling scale. Spectroastrometry can produce information on smaller scales than conventional deconvolution, and is successful in separating fluxes in a binary object with a separation of less than one pixel. We present an analysis of the errors involved in making binary object spectroastrometric measurements and the separation method, and highlight necessary observing methodology.
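The centroid signal at the heart of the technique is simply a flux-weighted mean position. The toy computation below (all numbers illustrative) shows how a sub-pixel binary separation produces a measurable centroid shift across an emission line:

```python
import math

# Two point sources separated by 0.3 pixel, far below the PSF width.
# Component 2 brightens across an emission line near 6563 A, so the
# flux-weighted position centroid shifts with wavelength.
x1, x2 = 0.0, 0.3                      # source positions (pixels)
centroids = []
for wl in range(6550, 6576):           # wavelength in angstroms
    f1 = 1.0                           # component 1: flat continuum
    f2 = 0.5 + 1.5 * math.exp(-0.5 * ((wl - 6563.0) / 2.0) ** 2)  # line profile
    centroids.append((f1 * x1 + f2 * x2) / (f1 + f2))

print(round(min(centroids), 3), round(max(centroids), 3))  # 0.1 0.2
```

The centroid swings by 0.1 pixel across the line even though the pair is spatially unresolved, which is the information spectroastrometry exploits.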

Journal ArticleDOI
TL;DR: Under the proposed definition, multi-state systems are divided into two groups without reference to component relevancy conditions: dominant systems, and nondominant systems.
Abstract: In this paper, we propose a definition of the dominant multi-state system. Under the proposed definition, multi-state systems are divided into two groups without reference to component relevancy conditions: dominant systems, and nondominant systems. Dominant systems can be further divided into two groups: with binary image, and without binary image. A multi-state system with binary image implies that its structure function can be expressed in terms of binary structure functions such that it can be treated as a binary system structure, and existing algorithms for reliability evaluation of binary systems can be applied for system performance evaluation. A technique is provided for establishing the bounds of performance distribution of dominant systems without binary image. The properties of dominant systems are studied. Examples are given to illustrate applications of the proposed definitions and methods.

Journal ArticleDOI
TL;DR: An optimal method for the computation of linear combinations of elements of Abelian groups using signed digit expansions, with applications in elliptic curve cryptography, is discussed, and a central limit theorem is proved.

Journal ArticleDOI
TL;DR: A binary addition and subtraction scheme with proper accommodation of optical nonlinear materials is proposed and it is shown that this circuit can perform these operations with the inherent parallelism of optics.
Abstract: Arithmetic operation is an essential task in any type of computing system. In high-speed all-optical computations, arithmetic operations are used to exploit the highest capability of optical performance. We propose a binary addition and subtraction scheme with proper accommodation of optical nonlinear materials. It is also shown that this circuit can perform these operations with the inherent parallelism of optics.
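One standard way an addition-only datapath also subtracts is two's complement: a - b = a + NOT(b) + 1. Whether this matches the optical scheme of the paper is not stated, so the sketch below is just the conventional electronic analogue:

```python
def subtract_via_addition(a, b, nbits=8):
    """Binary subtraction realized as addition: a - b = a + (~b) + 1
    (two's complement), truncated to the word size.  This is how a
    datapath containing only adders can also subtract."""
    mask = (1 << nbits) - 1
    return (a + ((~b) & mask) + 1) & mask

print(subtract_via_addition(0b1010, 0b0011))  # 10 - 3 = 7
```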

Journal ArticleDOI
TL;DR: This paper shows a DNA representation of n binary numbers of m bits, and proposes a procedure to assign the bits of that representation to DNA molecules.
Abstract: In this paper, we consider procedures for logic and arithmetic operations with DNA molecules. We first show a DNA representation of n binary numbers of m bits, and propose a procedure to assign the...

Proceedings Article
01 Jan 2004
TL;DR: This article shows an approach based on utilizing formal concept analysis to compute nonhierarchical BFA, which helps to speed up the BFA computation.
Abstract: Binary factor analysis (BFA, also known as Boolean factor analysis) is a nonhierarchical analysis of binary data, based on reduction of the binary space dimension. It allows us to find hidden relationships in binary data, which can be used for data compression, data mining, or intelligent data comparison for information retrieval. Unfortunately, classical (i.e., non-binary) factor analysis methods cannot be used effectively on binary data. In this article we show an approach based on utilizing formal concept analysis to compute nonhierarchical BFA. Although computing a concept lattice is itself a computationally expensive task, it helps us speed up the BFA computation.

Journal ArticleDOI
TL;DR: It is shown that there exist binary circular 5/2+ power free words of every length.
Abstract: We show that there exist binary circular 5/2+ power free words of every length.

Proceedings ArticleDOI
23 Mar 2004
TL;DR: An efficient run-length encoding of binary sources with unknown statistics, using adaptive Golomb-Rice coders, is described.
Abstract: This paper describes an efficient run-length encoding of binary sources with unknown statistics. Binary entropy coders, many of which use adaptive Golomb-Rice codes, appear in numerous multimedia codec standards. Using a maximum-likelihood approach, the excess rate of the Golomb-like coder compared to an adaptive Rice coder is shown to be at most 4.2% with respect to the source entropy for binary sources with unknown statistics.
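Run-length coding with Rice codes can be sketched as follows. The parameter k is fixed here for clarity; adaptive coders estimate it from the run statistics.

```python
def rice_encode(n, k):
    """Rice code of a nonnegative integer n with parameter k >= 1:
    the quotient n >> k in unary (q ones and a terminating zero),
    followed by the k low-order remainder bits."""
    q, r = n >> k, n & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")

def run_lengths(bits):
    """Lengths of the 0-runs preceding each 1 (simple run-length model)."""
    runs, count = [], 0
    for b in bits:
        if b == 0:
            count += 1
        else:
            runs.append(count)
            count = 0
    return runs

bits = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1]
runs = run_lengths(bits)
print(runs, [rice_encode(n, 1) for n in runs])  # [3, 1, 5] ['101', '01', '1101']
```

For geometrically distributed run lengths, a well-chosen k makes the Rice code nearly optimal, which is why it is the workhorse of binary run-length coders.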

Patent
22 Jan 2004
TL;DR: In this article, the analysis of binaries, components, configurations, and their footprints for component design and optimization has been presented, and complete and meaningful binary, component, configuration, and footprint information allows formal methods for component analysis and configuration optimization.
Abstract: The present invention facilitates the analysis of binaries, components, configurations, and their footprints for component design and optimization. Complete and meaningful binary, component, configuration, and footprint information allows formal methods for component analysis and configuration optimization. A binary dependency database persists and stores binary dependency information. The binary dependency database provides detailed dependency information among binaries.

Journal ArticleDOI
TL;DR: The binary vector-output correlation-immune functions are studied in this paper, along with the nonlinearity of the newly constructed vector-output correlation-immune functions.

Journal ArticleDOI
TL;DR: An analytical method for approximate performance evaluation of binary linear block codes using an additive white Gaussian noise channel model with binary phase-shift keying modulation with Gram-Charlier series expansion is presented.
Abstract: An analytical method for approximate performance evaluation of binary linear block codes using an additive white Gaussian noise channel model with binary phase-shift keying modulation is presented. We focus on the probability density function of the bit log-likelihood ratio (LLR), which is expressed in terms of the Gram-Charlier series expansion. This expansion requires knowledge of the statistical moments of the bit LLR. We introduce an analytical method for calculating these moments. This is based on some recursive calculations involving certain weight enumerating functions of the code. It is proved that the approximation can be as accurate as desired, if we use enough terms in the Gram-Charlier series expansion. Numerical results are provided for some examples, which demonstrate close agreement with simulation results.
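The Gram-Charlier A series has the standard truncated form shown below; this is the generic expansion around the normal density, not the paper's exact formula, which retains more terms computed from the moments of the bit LLR.

```latex
% Gram-Charlier A series around the normal density \phi(x), with
% standardized skewness \gamma_1 and excess kurtosis \gamma_2:
f(x) \;\approx\; \phi(x)\!\left[\,1
    + \frac{\gamma_1}{3!}\,\mathrm{He}_3(x)
    + \frac{\gamma_2}{4!}\,\mathrm{He}_4(x) + \cdots \right],
\qquad \mathrm{He}_3(x) = x^3 - 3x,
\quad \mathrm{He}_4(x) = x^4 - 6x^2 + 3.
```

Each additional term corrects the Gaussian approximation using one higher-order moment, which is why the paper's accuracy improves with the number of terms retained.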

Book ChapterDOI
TL;DR: This work presents a simple architectural enhancement to a general-purpose processor core which facilitates arithmetic operations in binary finite fields GF(2^m); it was integrated into a SPARC V8 core and served to compare the merits of the enhancement for two different ECC implementations.
Abstract: Mobile and wireless devices like cell phones and network-enhanced PDAs have become increasingly popular in recent years. The security of data transmitted via these devices is a topic of growing importance, and methods of public-key cryptography are able to satisfy this need. Elliptic curve cryptography (ECC) is especially attractive for devices which have restrictions in terms of computing power and energy supply. The efficiency of ECC implementations is highly dependent on the performance of arithmetic operations in the underlying finite field. This work presents a simple architectural enhancement to a general-purpose processor core which facilitates arithmetic operations in binary finite fields GF(2^m). A custom instruction for a multiply step for binary polynomials has been integrated into a SPARC V8 core, which subsequently served to compare the merits of the enhancement for two different ECC implementations. One was tailored to the use of GF(2^191) with a fixed reduction polynomial. The tailored implementation was sped up by 90% and its code size was reduced. The second implementation worked for arbitrary binary fields with a range of reduction polynomials. The flexible implementation was accelerated by a factor of nearly 10.
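The multiply step accelerates carry-less (polynomial) multiplication over GF(2). A software sketch of the operation and the subsequent reduction, demonstrated on the small field GF(2^8) with the AES polynomial for checkability (the paper targets fields such as GF(2^191)):

```python
def clmul(a, b):
    """Carry-less (polynomial) multiply over GF(2): partial products are
    combined with XOR instead of addition -- the operation a hardware
    multiply-step instruction performs one bit at a time."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf_mul(a, b, poly, m):
    """Multiply in GF(2^m): carry-less multiply, then reduce the degree
    2m-2 product modulo the degree-m reduction polynomial `poly`."""
    r = clmul(a, b)
    for i in range(2 * m - 2, m - 1, -1):
        if r & (1 << i):
            r ^= poly << (i - m)
    return r

# GF(2^8) with the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B):
# {57} * {83} = {c1} is the worked example from the AES specification.
print(hex(gf_mul(0x57, 0x83, 0x11B, 8)))  # 0xc1
```

In pure software, the inner loop of `clmul` is executed m times per field multiplication, which is precisely the cost the custom instruction removes.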

Patent
04 May 2004
TL;DR: In this paper, a binary adder coupled to a binary counter output and to a selected binary offset value is used for gray-code counting in an integrated circuit such as a programmable logic device, where a binary sum is converted to a gray code value by a binary-to-gray converter.
Abstract: A system for gray-code counting in an integrated circuit such as a programmable logic device uses a binary adder coupled to a binary counter output and to a selected binary offset value. The binary adder provides a binary sum that is converted to a gray code value by a binary-to-gray converter. The gray code value represents the binary sum output.
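The binary-to-gray conversion at the output is the classic g = b XOR (b >> 1). A sketch of the arrangement described above; the offset and width values are illustrative, not taken from the patent.

```python
def binary_to_gray(b):
    """Binary-to-Gray conversion: g = b ^ (b >> 1).  Successive Gray values
    differ in exactly one bit, which is what makes the counter glitch-safe."""
    return b ^ (b >> 1)

def gray_counter(offset, n, width=4):
    """Sketch of the claimed arrangement: a binary counter plus a binary
    offset feeds an adder, and the truncated sum is Gray-converted."""
    mask = (1 << width) - 1
    return [binary_to_gray((i + offset) & mask) for i in range(n)]

print(gray_counter(0, 4))  # [0, 1, 3, 2]
```

Adding the offset before conversion lets the counter start its Gray sequence at an arbitrary point without breaking the one-bit-change property.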

Patent
09 Jun 2004
TL;DR: In this article, a method for generating a typical image including at least three values per pixel, based on a set of binary images all showing the same element, was proposed, including summing up the binary values of the pixels of same coordinates of the binary images.
Abstract: A method for generating a typical image including at least three values per pixel, based on a set of binary images all showing a same element, including: summing up the binary values of the pixels of same coordinates in the set of binary images; generating a first state in the typical image if the sum of the pixels of same coordinates of the binary images provides a value smaller than a first threshold; and generating a second state in the typical image if the sum of the pixels of same coordinates of the binary images provides a value greater than a second threshold.
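The claimed method reduces to a pixel-wise sum with two thresholds. A plain-Python sketch with illustrative threshold values:

```python
def typical_image(binary_images, t_low, t_high):
    """Pixel-wise sum of a stack of binary images, mapped to three states:
    0 if the sum is below t_low, 1 if above t_high, and a third
    'undecided' state (2 here) otherwise."""
    n_rows, n_cols = len(binary_images[0]), len(binary_images[0][0])
    out = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            s = sum(img[r][c] for img in binary_images)
            out[r][c] = 0 if s < t_low else (1 if s > t_high else 2)
    return out

stack = [
    [[1, 0], [0, 1]],
    [[1, 0], [0, 1]],
    [[1, 1], [0, 1]],
]
print(typical_image(stack, t_low=1, t_high=2))  # [[1, 2], [0, 1]]
```

Pixels that are consistently 1 or consistently 0 across the stack become definite states, while pixels that flip between images land in the middle state.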

Patent
17 Aug 2004
TL;DR: In this article, a method and apparatus for generating true random numbers by way of a quantum optics process using a light source to produce a beam which illuminates a detector array is described.
Abstract: A method and apparatus for generating true random numbers by way of a quantum optics process uses a light source to produce a beam which illuminates a detector array. The detectors of the array are associated with random number values. Detection of a photon by one of the detectors yields a number whose value is equal to that associated with the detector. This procedure is repeated to produce sequences of true random numbers. The randomness of the numbers stems from the transverse spatial distribution of the detection probability of the photons in the beam. If the array is made up of two detectors, the true random numbers produced are binary numbers. The process can be sped up using an array comprising pairs of detectors. Using an array with more than two detectors also allows generating true random numbers of dimension higher than two. The primary object of the invention is to allow generating true random numbers by way of a quantum optics process.

Proceedings ArticleDOI
27 Sep 2004
TL;DR: This paper validates new arithmetic instructions for binary field arithmetic for processors with wider wordsizes and multiple-issue (e.g. superscalar) execution, and suggests a low-cost method, which is called multi-word result execution, to realize some of the benefits of wordsize scaling in existing processors with fixed wordsizes.
Abstract: Binary finite fields GF(2^n) are very commonly used in cryptography, particularly in public-key algorithms such as elliptic curve cryptography (ECC). On word-oriented programmable processors, field elements are generally represented as polynomials with coefficients from {0, 1}. Key arithmetic operations on these polynomials, such as squaring and multiplication, are not supported by integer-oriented processor architectures. Instead, they are implemented in software, causing a very large fraction of the cryptography execution time to be dominated by a few elementary operations. For example, more than 90% of the execution time of 163-bit ECC may be consumed by two simple field operations: squaring and multiplication. A few processor architectures have been proposed recently that include instructions for binary field arithmetic. However, these have only considered processors with small wordsizes and in-order, single-issue execution. The first contribution of this paper is to validate these new arithmetic instructions for processors with wider wordsizes and multiple-issue (e.g., superscalar) execution. We also consider the effects of varying the number of functional units and load/store pipes. We demonstrate that the combination of microarchitecture and new instructions provides speedups of up to 22.4x for ECC point multiplication. Second, we show that if a bit-level reverse instruction is included in the instruction set, the size of the multiplier can be reduced by half without significant performance degradation. Third, we compare the benefits of superscalar execution with wordsize scaling. The latter has been used in recent processor architectures such as PLX and PAX as a new way to extract parallelism. We show that 2x wordsize scaling provides 70% better performance than 2-way superscalar execution. Finally, we suggest a low-cost method, which we call multi-word result execution, to realize some of the benefits of wordsize scaling in existing processors with fixed wordsizes.
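Squaring in GF(2^n), one of the two dominant operations named above, reduces to spreading the operand's bits apart with zeros, since all cross terms of the polynomial square appear twice and cancel modulo 2. This is why bit-permutation support (such as the bit-level reverse instruction discussed) pays off. A sketch:

```python
def gf2_square_spread(a):
    """Squaring a binary polynomial interleaves zeros between its bits:
    (sum a_i x^i)^2 = sum a_i x^(2i) over GF(2), because every cross term
    occurs twice and cancels.  (Reduction modulo the field polynomial
    would follow; it is omitted here for clarity.)"""
    r, i = 0, 0
    while a:
        if a & 1:
            r |= 1 << (2 * i)
        a >>= 1
        i += 1
    return r

print(bin(gf2_square_spread(0b1011)))  # 0b1000101
```

With a hardware bit-spread (or reverse) primitive, this whole loop collapses into a couple of instructions, turning squaring from a loop into near-constant-time work.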