
Showing papers on "Multiplier (economics)" published in 1988


Book ChapterDOI
TL;DR: A method for designing a parallel multiplier for GF(2^m) that is both speed and area efficient and well suited for VLSI is described.
Abstract: The finite fields GF(2^m) play a central role in the implementation of BCH/Reed-Solomon coders and decoders. Also, these fields are attractive in some data encryption systems. In this paper we describe a method for designing a parallel multiplier for GF(2^m) that is both speed and area efficient. The multiplier proposed is based on the conventional (or polynomial) base representation. From our multiplier we can derive the one introduced by Bartee and Schneider [9]. Their multiplier has been considered unsuitable for VLSI because of lack of modularity. Our approach shows that this multiplier is indeed modular and can also exhibit a high degree of regularity. It is thus well suited for VLSI. Compared to the best parallel design available today, our design requires, roughly, only half the number of gates and still achieves a high operational speed. The speed, size and regularity of our design depend on the irreducible polynomial used to generate the field. In the paper we derive two simple selection criteria for choosing the irreducible polynomial in order to obtain a good design. Also, we present a list of best polynomials for m ≤ 16.

162 citations
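
As a minimal software sketch of polynomial-basis multiplication in GF(2^m) (not the parallel gate-level design the abstract describes), field elements can be held as integer bit masks and the product reduced modulo the irreducible polynomial as it is accumulated. The function name `gf2m_mul` and the example field GF(2^4) with x^4 + x + 1 are illustrative assumptions, not taken from the paper.

```python
def gf2m_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m) using the polynomial basis.

    a, b  : field elements as bit masks (bit i is the coefficient of x^i)
    poly  : irreducible polynomial including its x^m term,
            e.g. 0b10011 for x^4 + x + 1 (assumed example)
    """
    result = 0
    for _ in range(m):
        if b & 1:            # add (XOR) the current multiple of a
            result ^= a
        b >>= 1
        a <<= 1              # multiply a by x
        if a & (1 << m):     # degree reached m: reduce modulo poly
            a ^= poly
    return result


# x * x^3 = x^4, which reduces to x + 1 in GF(2^4) with x^4 + x + 1
print(bin(gf2m_mul(0b0010, 0b1000, 0b10011, 4)))
```

The choice of `poly` changes how many XORs the reduction step needs, which is the software analogue of the abstract's remark that the irreducible polynomial affects the size and regularity of the hardware.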


Journal ArticleDOI
TL;DR: Three different finite-field multipliers are presented: a dual-basis multiplier due to E.R. Berlekamp (1982), the Massey-Omura normal basis multiplier, and the Scott-Tavares-Peppard standard basis multiplier.
Abstract: Three different finite-field multipliers are presented: (1) a dual-basis multiplier due to E.R. Berlekamp (1982); (2) the Massey-Omura normal basis multiplier; and (3) the Scott-Tavares-Peppard standard basis multiplier. These algorithms are chosen because each has its own distinct features that apply most suitably in particular areas. They are implemented on silicon chips with NMOS technology so that the multiplier most desirable for VLSI implementation can readily be ascertained.

139 citations


Journal ArticleDOI
TL;DR: A 32×32-bit multiplier using multiple-valued current-mode circuits has been fabricated in 2-μm CMOS technology; its effective size is about half that of the corresponding binary CMOS multiplier.
Abstract: A 32×32-bit multiplier using multiple-valued current-mode circuits has been fabricated in 2-μm CMOS technology. For the multiplier based on the radix-4 signed-digit number system, 32×32-bit two's complement multiplication can be performed with only three-stage signed-digit full adders using a binary-tree addition scheme. The chip contains about 23600 transistors and the effective multiplier size is about 3.2 × 5.2 mm², which is half that of the corresponding binary CMOS multiplier. The multiply time is less than 59 ns. The performance is considered comparable to that of the fastest binary multiplier reported.

110 citations


Journal ArticleDOI
TL;DR: The effectiveness of the 32×32-bit signed digit multiplier implemented with multiple-valued, bidirectional, current-mode circuits and based on 2-μm complementary metal-oxide-semiconductor technology is established.
Abstract: A description is given of a 32×32-bit signed digit multiplier implemented with multiple-valued, bidirectional, current-mode circuits and based on 2-μm complementary metal-oxide-semiconductor technology. The multiplier can perform 32-bit two's-complement multiplication with three-stage SD full adders using a binary-tree addition scheme. The effective multiplier size in the chip and the power dissipation are almost half that of the corresponding binary CMOS multiplier. The multiply time is comparable to that of the fastest binary multiplier. These results establish the effectiveness of the technology for future very large scale integration.

85 citations


Patent
23 Aug 1988
TL;DR: A pipelined error correction circuit iteratively determines syndromes, error locator and evaluator equations, and error locations and associated error values for received Reed-Solomon code words.
Abstract: A pipelined error correction circuit iteratively determines syndromes, error locator and evaluator equations, and error locations and associated error values for received Reed-Solomon code words. The circuit includes a plurality of Galois Field multiplying circuits which use a minimum number of circuit elements to generate first and second products. Each Galois Field multiplying circuit includes a first GF multiplier which multiplies one of two input signals in each of two time intervals by a first value to produce a first product. The circuit includes a second GF multiplier which further multiplies one of the first products by a second value to generate a second product. The first and second products are then applied to the first GF multiplier as next input signals. The multiplying circuit minimizes the elements required to generate two products by using a first, relatively complex multiplier for both the first and second products and then a second relatively simple multiplier to generate the second product. This simplifies the multiplying circuit which would otherwise require two relatively complex multipliers. The error correction circuit determines, for each received code word, an error locator equation by iteratively updating a preliminary error locator equation. The circuit determines for a given iteration whether or not to update the preliminary error locator equation by comparing a particular variable with zero.

70 citations


Patent
16 Sep 1988
TL;DR: A pipeline-type serial multiplier with a cellular structure is described, in which each cell comprises an adder that operates on three one-bit data x, y, c and determines the result v modulo 2 and the carry co of the addition of x, y, and c. The output rate is F/n, where F is the clock frequency.
Abstract: A pipeline-type serial multiplier having a cellular structure, each cell comprising an adder which operates on 3 one-bit data x, y, c and which determines the result v modulo 2 and the carry co of the addition of x, y, and c. Each adder simultaneously determines a data c1 which is the modulo 2 result of the addition of x, y, co. This enables the exact final result of a multiplication of a data A of n bits by a data B of p bits to be obtained in two successive segments: a segment L which is formed by the p bits of lowest digital weight and a segment H which is formed by the n bits of the highest weight. The output rate is F/n, where F is the clock frequency. The multiplier circuits can be cascaded under the control of an external signal. They can also be connected in parallel in order to add the results of two multiplications.

51 citations


Patent
09 Dec 1988
TL;DR: A synapse cell for use in providing a weighted connection strength is disclosed; the cell employs a four-quadrant multiplier and a pair of floating gate devices, and various charge levels are programmed onto the floating gate devices, establishing weight and reference levels.
Abstract: A synapse cell for use in providing a weighted connection strength is disclosed. The cell employs a four-quadrant multiplier and a pair of floating gate devices. Various charge levels are programmed onto the floating gate devices, establishing weight and reference levels. These levels affect the current flowing through the multiplier. The output of the cell thus becomes a multiple of the input and the programmed charge difference.

34 citations


Patent
14 Mar 1988
TL;DR: In this paper, an intermediate latch with its own clock is provided at the output of the multiplier half-array in the intermediate stage to feed back data for a second pass for double-precision numbers.
Abstract: The present invention optimizes the number and ratio of cycles required among the divide/square root unit, multiplier unit and ALU. An intermediate latch with its own clock is provided at the output of the multiplier half-array in the intermediate stage to feed back data for a second pass for double-precision numbers. The multiplier can then be adjusted for either two-cycle latency mode (for optimizing double-precision multiplies) or three-cycle latency mode (for optimizing single-precision multiplies). A separate divide clock is used for the divide/square root unit, and is synchronized with the multiplier cycle clock on input and output. This allows the divide time to be optimized so that it requires fewer clock cycles when a longer multiplier clock cycle time is used.

31 citations


Proceedings ArticleDOI
Santoro, Horowitz
01 Jan 1988
TL;DR: An iterating array multiplier, SPIM (Stanford Pipelined Iterative Multiplier), which can provide the performance of a full array at a fraction of the silicon area and is pipelined and clocked at a high frequency.
Abstract: The demand for high-performance floating-point coprocessors has created a need for high-speed, small-area multipliers. Array multipliers achieve the highest performance but have a large silicon cost, while shift and add multipliers require very little hardware but have lower performance. This paper will present an iterating array multiplier, SPIM (Stanford Pipelined Iterative Multiplier), which can provide the performance of a full array at a fraction of the silicon area. A prototype 64 by 64 multiplier has been built in a 1.6 μm CMOS technology. It produces a carry-save output in 82 ns, and in pipeline mode can start a new multiply every 47 ns. In some sense, building a complete multiplication array is only useful if you are going to pipeline it; otherwise, most of the hardware is idle a majority of the time. The data acts like a wave that flows through the array; the logic is doing useful work only when the crest of the wave is coincident with that row of the array. To provide a better utilization of the hardware, this multiplier contains only a fraction of a full array and then iteratively uses this small array. To make the iteration overhead small, the partial array is pipelined and clocked at a high frequency.

29 citations


Journal ArticleDOI
TL;DR: The PSI as discussed by the authors is a programmable digital signal processor that includes the full complex computation mode, including a complex parallel multiplier that is derived from a real multiplier by adding pipeline stages.
Abstract: The PSI, a programmable digital signal processor that includes the full complex computation mode, is presented. A complex parallel multiplier is derived from a real multiplier by adding pipeline stages. The pipeline introduced in the arithmetic unit fits the parallel nature of complex multiplications that require four real multiplications simultaneously. The internal architecture is optimized for a set of kernel algorithms currently used in signal-processing applications. Several algorithms are presented that exemplify the high performance of the PSI for complex signal processing. It is shown that the external architecture of the PSI allows one to realize efficient multiprocessor applications.

25 citations
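
The four real multiplications that a complex product requires can be seen in a short sketch. This only illustrates the arithmetic identity (a + jb)(c + jd) = (ac - bd) + j(ad + bc); the function name `complex_mul` is a hypothetical helper and says nothing about how the PSI schedules these operations in its pipeline.

```python
def complex_mul(a_re, a_im, b_re, b_im):
    """Multiply (a_re + j*a_im) by (b_re + j*b_im) with four real multiplies."""
    re = a_re * b_re - a_im * b_im   # two real multiplies, one subtract
    im = a_re * b_im + a_im * b_re   # two real multiplies, one add
    return re, im


print(complex_mul(1, 2, 3, 4))  # (1 + 2j) * (3 + 4j)
```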


Patent
28 Jul 1988
TL;DR: A circuit arrangement for finding a sum of electrical power outputs for use in a multi-phase electricity meter is disclosed. The circuit arrangement comprises a plurality of multiplier circuits arranged in sequence, and the sequence has two poles which form the output poles of the circuit arrangement.
Abstract: A circuit arrangement for finding a sum of electrical power outputs for use in a multi-phase electricity meter is disclosed. The circuit arrangement comprises a plurality of multiplier circuits arranged in sequence. The sequence of multiplier circuits has two poles which form the output poles of the circuit arrangement. Each multiplier circuit comprises a Hall element, an amplifier, and a polarity reversing switch.

Proceedings ArticleDOI
25 May 1988
TL;DR: In this paper, a broadband frequency doubler that uses distributed amplifier techniques has been designed to operate over several octaves of bandwidth, and the circuit design suppresses the fundamental frequency energy present at the output port while maximizing the second harmonic signal.
Abstract: A broadband frequency doubler that uses distributed amplifier techniques has been designed to operate over several octaves of bandwidth. The circuit design suppresses the fundamental frequency energy present at the output port while maximizing the second harmonic signal. The design can be realized using monolithic or conventional microwave circuit techniques for use in local oscillator chains. To demonstrate the multiplier concept, a two-cell monolithic circuit was designed. The doubler is composed of four FETs with gate widths of 176 μm. The completed frequency doubler chip was used in the design of a 5-9-GHz (10-18-GHz output frequency) multiplier chain. The chain used several stages of post- and preamplification to set input drive levels and to provide increased local oscillator power. A 9-dB variation in input power produces only 3 dB of output power variation. With the chain fully driven, the total power output variation is less than ±1 dB. The multipliers exhibit excellent power output characteristics, with fundamental frequency suppression, and require no tuning.

Journal ArticleDOI
TL;DR: The comparison of the five-counter cell design and the full adder cell design reveals that the proposed design is most useful with pass gate logic and results in high-speed multiplication with a moderate increase in hardware complexity.
Abstract: A parallel multiplier design based on the five-counter cell is discussed. A design optimization for the performance in speed is proposed at the logic design level which is developed into an MOS circuit design. The comparison of the five-counter cell design and the full adder cell design reveals that the proposed design is most useful with pass gate logic and results in high-speed multiplication (approximately twice as fast as that of the full adder design) with a moderate increase in hardware complexity. With the five-counter design, an improvement in the hardware complexity of a squarer can be expected.

Journal ArticleDOI
TL;DR: The design of a 4×4-bit multiplier using the modified Booth's algorithm in 2-μm NMOS technology is discussed and a novel adder-cum-subtractor circuit was designed to realize the arithmetic processing part.
Abstract: The design of a 4×4-bit multiplier using the modified Booth's algorithm in 2-μm NMOS technology is discussed. The main features of this chip are its 62.5-MHz operating frequency and 31.5-mW power dissipation. The chip occupies an area of 1.37 mm². A novel adder-cum-subtractor circuit was designed to realize the arithmetic processing part.
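
For readers unfamiliar with the modified (radix-4) Booth algorithm, the recoding step can be sketched in software. This is a generic textbook formulation under the assumption of an even-width two's-complement multiplier; the names `booth_radix4_digits` and `booth_multiply` are hypothetical and the sketch does not model the chip's adder-cum-subtractor circuit.

```python
def booth_radix4_digits(y: int, bits: int):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first.
    `bits` must be even; y is interpreted as a `bits`-wide value.
    """
    assert bits % 2 == 0
    y &= (1 << bits) - 1
    digits = []
    prev = 0                       # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x: int, y: int, bits: int = 4) -> int:
    """Multiply x by a `bits`-wide two's-complement y via Booth digits."""
    digits = booth_radix4_digits(y, bits)
    # each digit selects 0, +/-x or +/-2x, weighted by a power of 4
    return sum(d * x * 4 ** i for i, d in enumerate(digits))


print(booth_radix4_digits(7, 4), booth_multiply(5, -3))
```

Each digit selects an add or a subtract of x or 2x, which is why a combined adder/subtractor is a natural building block for this algorithm; a 4-bit multiplier needs only two such partial products.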

Journal ArticleDOI
TL;DR: In this article, the optimal multiplier for multi-item inventory systems with one restriction is investigated, a recursive process which rapidly converges to the optimal multiplicative value is discussed, and a comparative analysis of the new bounds in relation to existing bounds is presented.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: A modified linear interpolation that could increase the precision by about four bits without using a multiplier is proposed and it is shown that an improvement can be made in Taylor's technique by choosing a different constant.
Abstract: Logarithmic number systems are an attractive method of implementing high-speed digital signal-processing systems with a word size of about 14 bits. Larger word sizes pose problems because of the address limitations of the ROMs needed for logarithmic addition. M.G. Arnold (1982) gave several interpolation techniques that can double the precision, but these techniques require the use of a fixed-point multiplier. F.J. Taylor (1983) proposed a modified linear interpolation that could increase the precision by about four bits without using a multiplier. It is shown that an improvement can be made in Taylor's technique by choosing a different constant. Also, by combining one of Arnold's techniques with a variant of Taylor's modification, precision similar to that of Taylor's design can be obtained using a simpler circuit.
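
The logarithmic addition that the ROMs serve can be sketched as follows. In a logarithmic number system, multiplication is an addition of logarithms, but addition needs the correction term log2(1 + 2^(-d)), which hardware typically reads from a table indexed by d. The sketch below is an assumption-laden illustration: it computes the correction with floating-point `math.log2` rather than a ROM, and it does not implement Arnold's or Taylor's interpolation schemes. The name `lns_add` is hypothetical.

```python
import math


def lns_add(la: float, lb: float) -> float:
    """Return log2(a + b) given la = log2(a) and lb = log2(b), with a, b > 0."""
    hi, lo = max(la, lb), min(la, lb)
    d = hi - lo                                 # non-negative table index
    correction = math.log2(1.0 + 2.0 ** (-d))   # the value a ROM would store
    return hi + correction


# 3 + 5 computed entirely on logarithms
print(2 ** lns_add(math.log2(3), math.log2(5)))
```

Because the table is indexed by d, its size grows with the word length, which is the address limitation the abstract refers to.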

Patent
30 Mar 1988
TL;DR: A multi-parameter control circuit supplies a number of control signals to a multiple-input system to regulate process parameters. A cost function, which depends on all of the parameters, is measured, and a time derivative signal is produced which represents the time differential of the cost function.
Abstract: A multi-parameter control circuit supplies a number of control signals to a multiple-input system, and these regulate process parameters. A cost function, which depends on all of the parameters, is measured, and a time derivative signal is produced which represents the time differential of the cost function. There are a number of control signal circuits responsive to the cost function derivative which each include a multiplier that receives the time derivative signal, and a second input. An integrator integrates the output of the multiplier and provides a time integral to one input of an adder, a second input of which is provided with a rapidly varying incoherent signal, which can be white noise. The adder produces a parameter control output signal which is used to control the process parameter. A differentiator has an input connected to the adder output and an output that provides a time derivative of the parameter control signal to the second input of the multiplier. Alternatively, a single digital circuit may be used in a multiplexing mode to represent in succession all control signal circuits.

Patent
21 Apr 1988
TL;DR: In this paper, a truncated product partial canonical signed digit (PCSD) multiplier is disclosed for use in a finite impulse response (FIR) digital filter, where each multiplier quantity is coded as two non-zero signed digits in an 8-bit word.
Abstract: A truncated product partial canonical signed digit (PCSD) multiplier is disclosed for use in a finite impulse response (FIR) digital filter. Each multiplier quantity is coded as two non-zero signed digits in an 8-bit word. Each non-zero signed digit is recoded into a four bit nibble for application to the multiplier. Each partial product output of the multiplier is truncated from 16 to 11 bits. The multiplier operations are performed in the sequence shift right, truncate, one's complement, add partial products and, according to the output of a logic control circuit, add one into an appropriate order.
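
Canonical signed digit (CSD) coding, on which the coefficient format above is based, rewrites a binary number with digits in {-1, 0, 1} so that no two adjacent digits are non-zero. The following is a generic sketch of full CSD recoding for non-negative integers; the function name `csd_digits` is hypothetical, and the sketch does not reproduce the patent's nibble encoding or truncation logic.

```python
def csd_digits(n: int):
    """Return the CSD digits of a non-negative integer, least significant first."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)    # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d             # leaves n divisible by 4, forcing a 0 next
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


# 96 = 128 - 32 needs only two non-zero digits
print(csd_digits(96))
```

A coefficient such as 96 then costs two shift-and-add (or subtract) operations instead of a general multiply, which is the motivation for restricting each coefficient to two non-zero digits.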

Journal ArticleDOI
TL;DR: Error detection can be accomplished by applying arithmetic codes to the multiplier hardware in different ways, and low-cost residue codes are applied to three different error detection architectures for both serial-parallel and fully bit-serial processing elements.
Abstract: Special-purpose architectures have been proposed to provide high processing rates for signal processing applications. These architectures use highly concurrent structures on VLSI circuits to achieve billions of multiply/add operations per second. Both serial-parallel and fully bit-serial multiplier elements have been proposed for highly concurrent signal processing arrays. Error detection can be accomplished by applying arithmetic codes to the multiplier hardware in different ways. Here, low-cost residue codes are applied to three different error detection architectures for both serial-parallel and fully bit-serial processing elements. The error performance of these different implementations is studied through computer simulation. The cost of using these codes in terms of silicon area and circuit complexity is also investigated.

Journal ArticleDOI
TL;DR: In this paper, a method is described to realize 2-D all-pass digital filters using a minimum number of multipliers, where the multiplier values are real and are obtained as simple functions of the coefficients of the transfer function.
Abstract: A method is described to realize 2-D all-pass digital filters using a minimum number of multipliers. The multiplier values are real and are obtained as simple functions of the coefficients of the transfer function. It is shown that using this method one can realize a first-order 2-D all-pass filter with only three real-gain multipliers and three delays.

Journal ArticleDOI
01 Dec 1988
TL;DR: An 8 × 8 bit time-optimal multiplier using the Dadda scheme implemented as a 7-stage linear pipeline and a new pipelined carry look-ahead adder is used for the final summation.
Abstract: Parallel multiplication schemes for VLSI have traditionally been chosen for their regular layout. Unfortunately, this has meant using algorithms which are not time-optimal. In the paper, we present an 8 × 8 bit time-optimal multiplier using the Dadda scheme implemented as a 7-stage linear pipeline. The design uses automated layout techniques to avoid the problems associated with the irregularity of the scheme, and a 3 μm n-well CMOS process with two layers of metal. The use of multiple levels of metal reduces the delay associated with the interconnection between cells and also permits the over-routing of active circuitry. A new pipelined carry look-ahead adder is used for the final summation, and this provides a significant contribution to the performance of the multiplier. A set of cells was designed for the multiplier and some aspects of their design are discussed. In particular, a previously unreported Vdd overshoot problem in an existing exclusive-OR gate circuit is described and explained. The multiplier is expected to operate at a maximum clock frequency of at least 50 MHz.
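
The Dadda scheme reduces the partial-product matrix in stages whose maximum column heights follow the sequence 2, 3, 4, 6, 9, 13, ... (each term is the floor of 1.5 times the previous one). The sketch below computes the target heights for an n × n multiplier; the function name `dadda_heights` is hypothetical, and the result describes reduction stages only, not how the paper partitions its 7-stage pipeline.

```python
def dadda_heights(n: int):
    """Target column heights, largest first, for reducing n rows down to 2."""
    heights = []
    d = 2
    while d < n:          # keep every sequence term strictly below n
        heights.append(d)
        d = d * 3 // 2    # next term: floor(1.5 * d)
    return heights[::-1]


# an 8 x 8 multiplier: reduce 8 rows to 6, then 4, then 3, then 2
print(dadda_heights(8))
```

The two rows left after the last stage go to a carry-propagate adder, which is the role the carry look-ahead adder plays in the abstract.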

Patent
01 Nov 1988
TL;DR: A cell module which is particularly employable in bit-serial silicon compilation methods permits the fabrication and layout of bit-serial multipliers having variable word sizes. Such a multiplier is capable of a number of different functions, including the production of high-order and low-order output product bit streams, which may be selected from so as to provide output results in a variety of different formats associated with binary fractional multiplication.
Abstract: A cell module which is particularly employable in bit-serial silicon compilation methods permits the fabrication and layout of bit-serial multipliers having variable word sizes. In particular, the cell module permits the fabrication of a bit-serial multiplier which is capable of a number of different functions including the production of high-order (major) and low-order (minor) output product bit streams which may be selected from so as to provide output results in a variety of different formats associated with binary fractional multiplication.

Proceedings ArticleDOI
Bailey, Snyder
01 Jun 1988

Patent
29 Jan 1988
TL;DR: In this paper, a floating point processor (10) is provided having a multiplier (48) and an ALU (54) for performing arithmetic calculations simultaneously, and the output of the multiplier and ALU are stored in a product register (64) and a sum register (66), respectively.
Abstract: A floating point processor (10) is provided having a multiplier (48) and an ALU (54) for performing arithmetic calculations simultaneously. The output of the multiplier (48) and ALU (54) are stored in a product register (64) and a sum register (66), respectively. Multiplexers (40,42,44,46) are provided at the inputs to the multiplier (48) and the ALU (54). The multiplexers choose between data in input registers (32,34), product and sum registers (64,66), and an output register (76). Since the multiplier (48) and ALU (54) operate simultaneously, and since the outputs of the multiplier (48) and ALU (54) are available to the multiplexers (40-46), product of sums calculations and sum of products calculations may be performed rapidly. An input stage (12) uses a temporary register (18) to store data from a data bus on the first clock edge, and configuration logic (28) for directing data from the data bus and the temporary register (18) to the input registers (32,34) on a second clock edge.

Proceedings ArticleDOI
S.J. Hong
27 Jun 1988
TL;DR: To provide 100% controllability of summands, the summand-generator is modified using one extra input, and a parallel multiplier can be designed testable with 3n+60 vectors using only one extra pin.
Abstract: The n-by-n parallel multiplier can usually be broken into two blocks, the summand-generator and the summand-counter. The summand-generator generates n² summands and the summand-counter adds them up to produce the final 2n-bit product. The summand-generator is easy to test because all inputs are directly controllable and all faults propagate through summand-counter to primary outputs. However, the summand-counter is difficult to test due to poor controllability. To provide 100% controllability of summands, the summand-generator is modified using one extra input. This new summand-generator can be tested with 19 vectors. With this summand-generator, the summand-counter can be constructed testable using the minimum number of adder cells but no extra device or pin. This summand-counter can be tested with 3n+41 vectors. Thus, a parallel multiplier can be designed testable with 3n+60 vectors using only one extra pin.
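
The two-block decomposition can be mirrored in a few lines of software: one function forms the n² one-bit summands a_i AND b_j, the other adds them with their weights 2^(i+j). This is a behavioral sketch for unsigned operands only; the names `generate_summands` and `count_summands` are hypothetical, and the extra test input described in the abstract is not modeled.

```python
def generate_summands(a: int, b: int, n: int):
    """Summand-generator: n*n one-bit products s[i][j] = a_i AND b_j."""
    return [[((a >> i) & 1) & ((b >> j) & 1) for j in range(n)]
            for i in range(n)]


def count_summands(s):
    """Summand-counter: add every summand at its weight 2^(i + j)."""
    return sum(bit << (i + j)
               for i, row in enumerate(s)
               for j, bit in enumerate(row))


s = generate_summands(13, 11, 4)
print(sum(len(row) for row in s), count_summands(s))  # 16 summands, product
```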

Patent
Barry L. Jason
12 Sep 1988
TL;DR: In this paper, a variable gain transconductance amplifier is used in active filters to provide adjustment of the bandwidth without affecting the center frequency, where complementary input followers are used in conjunction with the multiplier to provide a high input impedance.
Abstract: A variable gain transconductance amplifier wherein complementary input followers are used in conjunction with the multiplier to provide a high input impedance. A pair of multiplier input devices are embedded in the input followers to reduce the supply voltage used by the multiplier and to improve signal handling capability while reducing the supply voltage required. Further, these variable gain transconductance amplifiers are used in active filters to provide adjustment of the bandwidth without affecting the center frequency.

Journal ArticleDOI
TL;DR: The performance and cost of a modulo-3 residue code checker that has been attached to a pipelined serial multiplier to provide a concurrent self-test capability are considered and analytical results are derived for error detection coverage and minimum error latency.
Abstract: The performance and cost of a modulo-3 residue code checker that has been attached to a pipelined serial multiplier to provide a concurrent self-test capability are considered. Analytical results are derived for error detection coverage and minimum error latency; these quantities are observed to be in agreement with simulation results obtained by using ISPS, a register-transfer language. The residue checker error detection coverage and minimum error latency are observed to be dependent on the statistical properties of the multiplier input operands. The checker and serial multiplier were implemented in 4-μm NMOS, using a standard cell design. The residue code checker required approximately half of the total silicon area.
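
The principle behind a modulo-3 residue check is that the residue of a product equals the product of the residues, so a small side computation can flag a faulty result. The sketch below illustrates that check in software for unsigned integers; the function name `residue_check` is hypothetical, and the sketch does not model the serial, pipelined checker or its latency.

```python
def residue_check(a: int, b: int, product: int, modulus: int = 3) -> bool:
    """Return True when `product` is consistent with a * b modulo `modulus`."""
    expected = ((a % modulus) * (b % modulus)) % modulus
    return product % modulus == expected


good = 13 * 11
print(residue_check(13, 11, good))        # correct product passes
print(residue_check(13, 11, good ^ 1))    # single-bit error is caught
print(residue_check(13, 11, good + 3))    # an error that is a multiple of 3 escapes
```

A single flipped bit changes the result by a power of two, which is never a multiple of 3, so every single-bit error is detected; errors whose magnitude is a multiple of 3 are not, which is one reason coverage depends on the error and operand statistics.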

Patent
16 May 1988
TL;DR: In this article, a shifting operation is performed in a multiplier unit by entering the number to be shifted in the multiplicand register of the multiplier unit while entering appropriate control signals in the multiplier register.
Abstract: In floating point operations, it is necessary to align the fractions of the floating point operands before addition or subtraction operations can be executed. This fraction alignment is performed by a shifting operation, typically using dedicated apparatus such as a barrel shifter. While the dedicated apparatus provides high performance in the execution of the shifting operation, this performance is accomplished by reserving a portion of the substrate area for apparatus implementation. To avoid the use of dedicated apparatus, the shifting operation is performed in a multiplier unit, according to the present invention, by entering the number to be shifted in the multiplicand register of the multiplier unit while entering appropriate control signals in the multiplier register. In this manner, a shifting operation can be performed without dedicated apparatus and with minor impact on performance.
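
One way to see how a multiplier can stand in for a shifter: multiplying an n-bit value by 2^(n-k) and keeping the upper n bits of the 2n-bit product yields the value shifted right by k places. The sketch below assumes this power-of-two interpretation of the "appropriate control signals"; the patent abstract does not spell out the encoding, and the name `shift_right_via_multiply` and the default width of 24 bits are hypothetical.

```python
def shift_right_via_multiply(x: int, k: int, n: int = 24) -> int:
    """Shift the n-bit unsigned value x right by k places using a multiply."""
    assert 0 <= k <= n and 0 <= x < (1 << n)
    product = x * (1 << (n - k))   # 2n-bit product, as from the multiplier array
    return product >> n            # keep the upper n bits


print(bin(shift_right_via_multiply(0b101100, 2, n=6)))
```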

Journal ArticleDOI
TL;DR: A family of bipolar macrocell array LSIs has been developed which has a basic delay of 43 ps/CML and a toggle frequency of 5.2 GHz/flip-flop, using a highly advanced super self-aligned process technology with a selectively ion-implanted collector based on SST-1A.
Abstract: A family of bipolar macrocell array LSIs has been developed which has a basic delay of 43 ps/CML and a toggle frequency of 5.2 GHz/flip-flop. This family uses a cascaded-differential and single-ended CML circuit and a highly advanced super self-aligned process technology (SST-1B) which uses a selectively ion-implanted collector technology based on SST-1A. Using the macrocell array LSIs, performances of 1.2 ns/1.2 W for a 6-bit multiplier, and 4.3 ns/3 W for a 16-bit multiplier have been achieved.

Patent
23 Mar 1988
TL;DR: In this paper, a multistage digital filter for producing an output data sequence in response to elements of an input data sequence includes a multiplier for multiplying elements of the input data sequences by selected coefficients to produce product terms.
Abstract: A multistage digital filter for producing an output data sequence in response to elements of an input data sequence includes a multiplier for multiplying elements of the input data sequence by selected coefficients to produce product terms. An accumulator sums product terms to produce elements of a first filter stage output sequence and elements of the first filter stage output sequence are sequentially fed back to the multiplier. The multiplier and accumulator produce accumulated sums in response to the first stage output sequence to provide a second filter stage output sequence. In a similar fashion, each filter stage output sequence is fed back to the multiplier as the input sequence to the next filter stage until the output sequence of the last filter stage is produced.
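
The feedback arrangement can be mimicked in software by reusing a single multiply-accumulate routine for every stage and feeding each stage's output sequence back in as the next stage's input. The sketch assumes each stage is a causal FIR filter with zero initial state, which the abstract does not state; `mac_stage` and `multistage_filter` are hypothetical names.

```python
def mac_stage(samples, coeffs):
    """One filter stage: multiply inputs by coefficients and accumulate."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:                 # samples before time 0 are zero
                acc += c * samples[n - k]  # the shared multiplier + accumulator
        out.append(acc)
    return out


def multistage_filter(samples, stage_coeffs):
    """Feed each stage's output back as the input of the next stage."""
    seq = list(samples)
    for coeffs in stage_coeffs:
        seq = mac_stage(seq, coeffs)
    return seq


# impulse through two identical two-tap stages
print(multistage_filter([1, 0, 0, 0], [[1, 1], [1, 1]]))
```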