Showing papers on "Adder" published in 1997

Open Access

Journal Article•DOI•

Low-power logic styles: CMOS versus pass-transistor logic

R. Zimmermann¹, Wolfgang Fichtner¹•Institutions (1)

01 Jul 1997-IEEE Journal of Solid-state Circuits

TL;DR: This paper shows that complementary CMOS is the logic style of choice for the implementation of arbitrary combinational circuits if low voltage, low power, and small power-delay products are of concern.

Abstract: Recently reported logic style comparisons based on full-adder circuits claimed complementary pass-transistor logic (CPL) to be much more power-efficient than complementary CMOS. However, new comparisons performed on more efficient CMOS circuit realizations and a wider range of different logic cells, as well as the use of realistic circuit arrangements demonstrate CMOS to be superior to CPL in most cases with respect to speed, area, power dissipation, and power-delay products. An implemented 32-b adder using complementary CMOS has a power-delay product of less than half that of the CPL version. Robustness with respect to voltage scaling and transistor sizing, as well as generality and ease-of-use, are additional advantages of CMOS logic gates, especially when cell-based design and logic synthesis are targeted. This paper shows that complementary CMOS is the logic style of choice for the implementation of arbitrary combinational circuits if low voltage, low power, and small power-delay products are of concern.

911 citations

Dissertation•DOI•

Binary adder architectures for cell-based VLSI and their synthesis

R. Zimmermann

01 Jan 1997

TL;DR: It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design.

Abstract: The addition of two binary numbers is the fundamental and most often used arithmetic operation on microprocessors, digital signal processors (DSP), and data-processing application-specific integrated circuits (ASIC). Therefore, binary adders are crucial building blocks in very large-scale integrated (VLSI) circuits. Their efficient implementation is not trivial because a costly carry-propagation operation involving all operand bits has to be performed. Many different circuit architectures for binary addition have been proposed over the last decades, covering a wide range of performance characteristics. Also, their realization at the transistor level for full-custom circuit implementations has been addressed intensively. However, the suitability of adder architectures for cell-based design and hardware synthesis (both prerequisites for the ever increasing productivity in ASIC design) was hardly investigated. Based on the various speed-up schemes for binary addition, a comprehensive overview and a qualitative evaluation of the different existing adder architectures are given in this thesis. In addition, a new multilevel carry-increment adder architecture is proposed. It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design. These three adder architectures, which together cover the entire range of possible area vs. delay trade-offs, are comprised in the more general prefix adder architecture reported in the literature. It is shown that this universal and flexible prefix adder structure also allows the realization of various customized adders and of adders fulfilling arbitrary timing and area constraints. A non-heuristic algorithm for the synthesis and optimization of prefix adders is proposed. It allows the runtime-efficient generation of area-optimal adders for given timing constraints.
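
The idea that ripple-carry and faster adders are instances of one prefix structure can be illustrated with a small bit-level model. The sketch below is not taken from the thesis; the function name `prefix_add`, the `parallel` flag, and the choice of a Kogge-Stone-style network for the fast case are assumptions made for illustration only.

```python
def prefix_add(a: int, b: int, width: int = 8, parallel: bool = True) -> int:
    """Add two unsigned integers using generate/propagate prefix computation."""
    # Bitwise generate (g) and propagate (p) signals, least significant bit first.
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]
    if parallel:
        # Kogge-Stone-style network: about log2(width) levels, span doubles per level.
        d = 1
        while d < width:
            G_new, P_new = G[:], P[:]
            for i in range(d, width):
                G_new[i] = G[i] | (P[i] & G[i - d])
                P_new[i] = P[i] & P[i - d]
            G, P = G_new, P_new
            d *= 2
    else:
        # Serial prefix: each bit waits for its neighbour, like a ripple-carry chain.
        for i in range(1, width):
            G[i] = G[i] | (P[i] & G[i - 1])
            P[i] = P[i] & P[i - 1]
    # The carry into bit i is the group generate of bits 0..i-1 (carry-in is 0).
    carries = [0] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    # The group generate over all bits is the carry-out.
    return total | (G[width - 1] << width)
```

Both settings compute the same sums; they differ only in the depth of the prefix network, which is the area vs. delay trade-off the abstract refers to.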

268 citations

Journal Article•DOI•

The impact of intra-die device parameter variations on path delays and on the design for yield of low voltage digital circuits

M. Eisele, J. Berthold¹, D. Schmitt-Landsiedel², R. Mahnkopf¹•Institutions (2)

Siemens¹, Technische Universität München²

01 Dec 1997-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: It is found that circuits with a large number of critical paths and with a low logic depth are most sensitive to uncorrelated gate delay variations, and scenarios for future technologies show the increased impact of uncorrelated delay variations on digital design.

Abstract: The yield of low voltage digital circuits is found to be sensitive to local gate delay variations due to uncorrelated intra-die parameter deviations. Caused by statistical deviations of the doping concentration, they lead to more pronounced delay variations for minimum transistor sizes. Their influence on path delays in digital circuits is verified using a carry select adder test circuit fabricated in 0.5 and 0.35 µm complementary metal-oxide-semiconductor (CMOS) technologies with two different threshold voltages. The increase of the path delay variations for smaller device dimensions and reduced supply voltages as well as the dependence on the path length is shown. It is found that circuits with a large number of critical paths and with a low logic depth are most sensitive to uncorrelated gate delay variations. Scenarios for future technologies show the increased impact of uncorrelated delay variations on digital design. A reduction of the maximal clock frequency of 10% is found, for example, for highly pipelined systems realized in a 0.18-µm CMOS technology.

177 citations

Patent•

Data processor and data processing system

Fumio Arakawa¹, Norio Nakagawa¹, Tetsuya Yamada¹, Yonetaro Totsuka¹•Institutions (1)

Hitachi¹

15 Oct 1997

TL;DR: A data processor includes an arithmetic portion incorporated in a floating point unit, in which a plurality of multipliers are each supplied with the mantissa parts of floating point numbers from a different data input signal line group and multiply the supplied mantissa parts together.

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.

172 citations

Patent•

Determining an extremum value and its index in an array using a dual-accumulation processor

Mazhar M. Alidina¹, Sivanand Simanapalli¹•Institutions (1)

Alcatel-Lucent¹

13 Nov 1997

TL;DR: Compare-select features are implemented in an adder as well as in an arithmetic logic unit (ALU), which operate in parallel to reduce the number of processing cycles needed to determine extrema in a data processor.

Abstract: A data processor determines an overall extremum value of an input set of array data, with the input set of array data partitionable into a first set of array data and a second set of array data. The data processor includes a pair of compare-select circuits implemented in an adder as well as in an arithmetic-logic unit (ALU), respectively, which operate in parallel for respectively processing the first set and the second set, and for respectively determining first and second extremum values of the first set and the second set, respectively. A first compare-select circuit of the pair of compare-select circuits determines the overall extremum value of the input set of array data from the first and second extremum values. The first compare-select circuit also determines the location of the overall extremum value in the input set of array data. The computational complexity in determining extrema is reduced by implementing compare-select features in an adder in addition to an ALU to operate in parallel to reduce the number of processing cycles.
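
The two-way search described above can be modelled in software. This is a minimal sketch and not the patented circuit: the function names, the even split of the array into two halves, and the rule that ties keep the lower index are all assumptions introduced here.

```python
def compare_select(a, b):
    """Return the larger of two (value, index) pairs; ties keep the lower index."""
    (va, ia), (vb, ib) = a, b
    if vb > va or (vb == va and ib < ia):
        return b
    return a


def dual_max_with_index(data):
    """Find the maximum and its index using two compare-select units side by side."""
    if not data:
        raise ValueError("data must not be empty")
    half = (len(data) + 1) // 2
    best_first = (data[0], 0)   # running extremum of the first set
    best_second = None          # running extremum of the second set
    for k in range(half):       # one loop iteration models one processing cycle
        best_first = compare_select(best_first, (data[k], k))
        j = half + k
        if j < len(data):
            candidate = (data[j], j)
            if best_second is None:
                best_second = candidate
            else:
                best_second = compare_select(best_second, candidate)
    # A final compare-select merges the two partial results.
    if best_second is None:
        return best_first
    return compare_select(best_first, best_second)
```

Because both halves are scanned in the same loop iteration, the model needs roughly half as many iterations as a single-unit scan, plus one final comparison.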

161 citations

Book•

Arithmetic Built-In Self-Test for Embedded Systems

Janusz Rajski¹, Jerzy Tyszer²•Institutions (2)

Mentor Graphics¹, Poznań University of Technology²

01 Oct 1997

TL;DR: This book covers arithmetic built-in self-test (BIST) for embedded systems, including generation of test vectors, test-response compaction, fault diagnosis, BIST of data-path kernels, fault grading, and high-level synthesis.

Abstract: 1. Built-In Self-Test. Introduction. Design for Testability. Generation of Test Vectors. Compaction of Test Responses. BIST Schemes for Random Logic. BIST for Memory Arrays. 2. Generation of Test Vectors. Additive Generators of Exhaustive Patterns. Other Generation Schemes. Two-Dimensional Generators. 3. Test-Response Compaction. Binary Adders. 1's Complement Adders. Rotate-Carry Adders. Cascaded Compaction Scheme. 4. Fault Diagnosis. Analytical Model. Experimental Validation. The Quality of Diagnostic Resolution. Fault Diagnosis in Scan-Based Designs. 5. BIST of Data-Path Kernel. Testing of ALU. Testing of the MAC Unit. Testing of the Microcontroller. 6. Fault Grading. Fault Simulation Framework. Functional Fault Simulation. Experimental Results. 7. High-Level Synthesis. Implementation-Dependent Fault Grading. Synthesis Steps. Simulation Results. 8. ABIST at Work. Testing of Random Logic. Memory Testing. Digital Integrators. Leaking Integrators. 9. Epilog. Bibliography. A. Tables of Generators. B. Assembly Language. Index.

129 citations

Proceedings Article•DOI•

Implementation of single precision floating point square root on FPGAs

Yamin Li¹, Wanming Chu¹•Institutions (1)

University of Aizu¹

16 Apr 1997

TL;DR: A non-restoring square root algorithm and two very simple single precision floating point square root implementations on FPGAs based on it: a low-cost iterative implementation that uses a traditional adder/subtracter and a high-throughput pipelined implementation that uses multiple adder/subtracters.

Abstract: The square root operation is hard to implement on FPGAs because of the complexity of the algorithms. In this paper, we present a non-restoring square root algorithm and two very simple single precision floating point square root implementations based on the algorithm on FPGAs. One is a low-cost iterative implementation that uses a traditional adder/subtracter. The operation latency is 25 clock cycles and the issue rate is 24 clock cycles. The other is a high-throughput pipelined implementation that uses multiple adder/subtracters. The operation latency is 15 clock cycles and the issue rate is one clock cycle. This means that the pipelined implementation is capable of accepting a square root instruction on every clock cycle.
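
A textbook integer form of non-restoring square root shows why one adder/subtracter per result bit is enough. The sketch below is an assumption-laden illustration, not the authors' FPGA design: it handles only an unsigned integer radicand (no floating point exponent handling or rounding), and the name `nonrestoring_isqrt` is hypothetical.

```python
def nonrestoring_isqrt(d: int, bits: int = 32):
    """Return (q, r) with q = floor(sqrt(d)) and r = d - q*q for 0 <= d < 2**bits."""
    q, r = 0, 0
    for i in range(bits // 2 - 1, -1, -1):
        pair = (d >> (2 * i)) & 3          # next two radicand bits
        if r >= 0:
            r = r * 4 + pair - (q * 4 + 1)  # subtract when the remainder is non-negative
        else:
            r = r * 4 + pair + (q * 4 + 3)  # add when it is negative (no restore step)
        q = q * 2 + (1 if r >= 0 else 0)    # the new sign decides the new result bit
    if r < 0:
        r += q * 2 + 1                      # single final correction of the remainder
    return q, r
```

Each iteration performs exactly one addition or subtraction, so an iterative design can reuse one adder/subtracter per cycle while a pipelined design can place one in every stage.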

109 citations

Proceedings Article•DOI•

Compilation tools for run-time reconfigurable designs

Wayne Luk¹, Nabeel Shirazi¹, Peter Y. K. Cheung¹•Institutions (1)

Imperial College London¹

16 Apr 1997

TL;DR: A framework and tools for automating the production of designs which can be partially reconfigured at run time, and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells.

Abstract: This paper describes a framework and tools for automating the production of designs which can be partially reconfigured at run time. The tools include: a partial evaluator, which produces configuration files for a given design, where the number of configurations can be minimised by a process known as compile-time sequencing; an incremental configuration calculator, which takes the output of the partial evaluator and generates an initial configuration file and incremental configuration files that partially update preceding configurations; and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells. While many of our techniques are independent of the design language and device used, our tools currently target Xilinx 6200 devices. Simultaneous configuration, for example, can be used to reduce the time for reconfiguring an adder to a subtractor from time linear with respect to its size to constant time at best and logarithmic time at worst.

103 citations

Proceedings Article•DOI•

Speculative completion for the design of high-performance asynchronous dynamic adders

Steven M. Nowick¹, K.Y. Yun², Peter A. Beerel³, A.E. Dooply²•Institutions (3)

Columbia University¹, University of California, San Diego², University of Southern California³

07 Apr 1997

TL;DR: This paper presents an in-depth case study in high-performance asynchronous adder design that uses single-rail bundled datapaths but also allows early completion, and introduces five new dynamic designs for Brent-Kung and Carry-Bypass adders.

Abstract: This paper presents an in-depth case study in high-performance asynchronous adder design. A recent method, called "speculative completion", is used. This method uses single-rail bundled datapaths but also allows early completion. Five new dynamic designs are presented for Brent-Kung and Carry-Bypass adders. Furthermore, two new architectures are introduced, which target (i) small number addition, and (ii) hybrid operation. Initial SPICE simulation and statistical analysis show performance improvements up to 19% on random inputs and 14% on actual programs for 32-bit adders, and up to 29% on random inputs for 64-bit adders, over comparable synchronous designs.

101 citations

Patent•

Arithmetic cell for field programmable devices

Frederick A. Perner¹•Institutions (1)

Hewlett-Packard¹

12 Sep 1997

TL;DR: An arithmetic cell to be used in field programmable devices is defined, which allows efficient implementations of multipliers, multipliers/accumulators and adders (addition, compare, and subtraction) in one compact cell that is a collection of circuits common to field programmable devices.

Abstract: An arithmetic cell to be used in field programmable devices is defined in this invention. This cell will allow efficient implementations of multipliers, multipliers/accumulators and adders (addition, compare, and subtraction) in one compact cell that is a collection of circuits common to field programmable devices. This cell may be used in a flexible manner that allows full multipliers of any dimension (n*m products), adders of any length (n+m sums, compare, differences), accumulators, and registers (to hold complete results or partial products). Key elements in this invention are an application controlled multiplexer, signal routing to provide a shift function for multiplication, and a minimum collection of configuration bits and circuit elements to perform the basic arithmetic functions.

97 citations

Patent•

Signal transmitting method, transmitter, receiver, and spread-spectrum code synchronizing method for mobile communication system

Kenichi Higuchi, Mamoru Sawahashi, Fumiyuki Adachi, Koji Ohno, Akihiro Higashi

04 Mar 1997

TL;DR: The spread-spectrum code synchronizing speed of a down control channel is improved by spreading every channel with a different short code and additionally spreading only the control channel information signal with the complex conjugate of a common long code before all channels are added and spread by the long code.

Abstract: The spread-spectrum code synchronizing speed of a down control channel is improved. The spectra of a control channel information signal and each communication channel information signal are spread by using different first spread-spectrum codes having the repetitive periods of information symbol periods from each first spread-spectrum code (short code) generating section (11). Then only the spectrum of the control channel information signal is spread by using third spread-spectrum codes which are the complex conjugate of a common long code (second spread-spectrum code) to be spread from a third spread-spectrum code (complex conjugate code of a long code mask section) generating section (12). Thereafter, the signals of all channels are added at an adequate timing by an adder (13), the spectrum of the output of the adder (13) is spread by using second spread-spectrum codes from a second spread-spectrum code generating section (14), and the spread-spectrum signal is outputted as a spread modulated signal.

Proceedings Article•DOI•

Symmetric bipartite tables for accurate function approximation

Michael J. Schulte¹, James E. Stine¹•Institutions (1)

Lehigh University¹

06 Mar 1997

TL;DR: The method for designing bipartite tables, called the Symmetric Bipartite Table Method, utilizes symmetry in the table entries to reduce the overall memory requirements and requires smaller table lookups to achieve a given accuracy.

Abstract: The paper presents a methodology for designing bipartite tables for accurate function approximation. Bipartite tables use two parallel table lookups to obtain a carry-save (borrow-save) function approximation. A carry propagate adder can then convert this approximation to a two's complement number or the approximation can be directly Booth encoded. Our method for designing bipartite tables, called the Symmetric Bipartite Table Method, utilizes symmetry in the table entries to reduce the overall memory requirements. It has several advantages over previous bipartite table methods in that it: (1) provides a closed form solution for the table entries; (2) has tight bounds on the maximum absolute error; (3) requires smaller table lookups to achieve a given accuracy; and (4) can be applied to a wide range of functions. Compared to conventional table lookups, the symmetric bipartite tables presented are 15.0 to 41.7 times smaller when the operand size is 16 bits and 99.1 to 273.9 times smaller when the operand size is 24 bits.

Patent•

Digital signal processor architecture optimized for performing fast fourier transforms

Mohit K. Prasad¹, Hosahalli R. Srinivas¹•Institutions (1)

Alcatel-Lucent¹

30 Jun 1997

TL;DR: A digital signal processor architecture adapted for performing fast Fourier transform (FFT) algorithms efficiently, built from dual, parallel multiply and accumulate units whose multiplier outputs are cross-coupled to the adder units.

Abstract: A digital signal processor architecture particularly adapted for performing fast Fourier Transform algorithms efficiently. The architecture comprises dual, parallel multiply and accumulate units in which the output of the multiplier circuit portion of each MAC is cross-coupled to an input of the adder unit of the other MAC as well as to an input of the adder unit of the same MAC to which the multiplier belongs.

Journal Article•DOI•

Fast adders using enhanced multiple-output domino logic

Zhongde Wang¹, Graham A. Jullien¹, W.C. Miller¹, Jinghong Wang², S.S. Bizzan³•Institutions (3)

University of Windsor¹, Nortel², ATI Technologies³

01 Feb 1997-IEEE Journal of Solid-state Circuits

TL;DR: Using an enhanced multiple output domino logic (EMODL) implementation of a carry lookahead adder (CLA), sums of several consecutive bits can be built in one nFET tree with a single carry-in.

Abstract: Using an enhanced multiple output domino logic (EMODL) implementation of a carry lookahead adder (CLA), sums of several consecutive bits can be built in one nFET tree with a single carry-in. Based on this result, a new sparse carry chain architecture is proposed for the CLA. We demonstrate the design approach using a 32-b adder, and show that only four carries are sufficient for generating all sums, with a consequent reduction in the number of stage delays. Using a 1.2-µm CMOS technology, we verify our simulation procedures by fabrication and measurement of a 2.7 ns critical path.

Patent•

Modulo address generating circuit and method with reduced area and delay using low speed adders

Min-joong Rim¹•Institutions (1)

Samsung¹

05 Aug 1997

TL;DR: A modulo address generator includes a first adder for adding a current address to an address increment to generate an incremented address, an inverter for producing a complement of a maximum address, a second adder for generating a circular correction value by adding the complement of the maximum address to a minimum address, and an adder/subtracter for generating a corrected next address by adding or subtracting the circular correction value to or from the incremented address according to a sign value of the address increment.

Abstract: A modulo address generating apparatus and method are disclosed which obtain high speed performance with reduced integrated circuit area. A modulo address generator according to the present invention includes a first adder for adding a current address to an address increment to generate an incremented address, an inverter for producing a complement of a maximum address, a second adder for generating a circular correction value by adding the complement of the maximum address to a minimum address, an adder/subtracter for generating a corrected next address by adding or subtracting the circular correction value to or from the incremented address according to a sign value of the address increment, a comparator for checking whether the incremented address is within an address range defined by the maximum and minimum addresses, and a multiplexor controlled by the comparator which selects the incremented address for output as a next address when the incremented address is within the address range and selects the corrected address for output as the next address when the incremented address is outside the address range.
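
The data flow in this abstract maps directly onto a few integer operations. The following is a behavioural sketch only, with assumptions stated plainly: the function name is hypothetical, Python's unbounded signed integers stand in for fixed-width hardware words, and the address increment is assumed to be no larger in magnitude than the buffer length.

```python
def next_modulo_address(current: int, increment: int, min_addr: int, max_addr: int) -> int:
    """Return the next address inside the circular range [min_addr, max_addr]."""
    # First adder: current address plus increment.
    incremented = current + increment
    # Inverter: one's complement of the maximum address (~x == -x - 1 in Python).
    max_complement = ~max_addr
    # Second adder: circular correction value = min - max - 1 = -(buffer length).
    correction = max_complement + min_addr
    # Adder/subtracter: the direction depends on the sign of the increment.
    if increment >= 0:
        corrected = incremented + correction
    else:
        corrected = incremented - correction
    # Comparator and multiplexor: keep the incremented address when it is in range.
    in_range = min_addr <= incremented <= max_addr
    return incremented if in_range else corrected
```

For a buffer spanning addresses 100 to 107, stepping by 3 from address 106 wraps to 101, and stepping by -3 from address 101 wraps to 106.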

Journal Article•

Single-Electron Majority Logic Circuits

Hiroki Iwamura, Masamichi Akazawa, Yoshihito Amemiya

26 Sep 1997-IEICE technical report. Electron devices

Proceedings Article•DOI•

Properties of the input pattern fault model

R.D. Blanton¹, John P. Hayes¹•Institutions (1)

Carnegie Mellon University¹

12 Oct 1997

TL;DR: The IP fault model is described and a method for analyzing IP faults using standard SSL-based fault simulators and test generation tools is provided, used to generate test sets that target the IP faults of the ISCAS85 benchmark circuits and a carry-lookahead adder.

Abstract: Recent work in IC failure analysis strongly indicates the need for fault models that directly analyze the function of circuit primitives. The input pattern (IP) fault model is a functional fault model that allows for both complete and partial functional verification of every circuit module, independent of the design level. We describe the IP fault model and provide a method for analyzing IP faults using standard SSL-based fault simulators and test generation tools. The method is used to generate test sets that target the IP faults of the ISCAS85 benchmark circuits and a carry-lookahead adder. Improved IP fault coverage for the benchmarks and the adder is obtained by adding a small number of test patterns to tests that target only SSL faults. We also conducted fault simulation experiments that show IP test patterns are effective in detecting non-targeted faults such as bridging and transistor stuck-on faults. Finally, we discuss the notion of IP redundancy and show how large amounts of this redundancy exist in the benchmarks and in SSL-irredundant adder circuits.

Patent•

Multiplier array processing system with enhanced utilization at lower precision for group multiply and sum instruction

Craig Hansen¹, Henry Massalin•Institutions (1)

MicroUnity¹

16 May 1997

TL;DR: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described, and new instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction.

Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.

Patent•

Demodulator for CDMA spread spectrum communication using multiple PN codes

Changming Zhou, Guoliang Shou, Xuping Zhou, Makoto Yamamoto, Kenzo Urabe, Sunao Takatori

19 Feb 1997

TL;DR: The demodulator has a plurality of matched filters in parallel, each with a different binary PN code, a plurality of sample holders, a plurality of multipliers, an adder, and a controller.

Abstract: The demodulator has a plurality of matched filters in parallel. Each matched filter has a different binary PN code, a plurality of sample holders, a plurality of multipliers, an adder, and a controller. Each sample holder has a common input, a switch, a first capacitor, a first inverse amplifier with an output and an input connected to the common input through the switch and the capacitor, and a first feedback capacitor for feeding the output of the first inverse amplifier back to the input. Each multiplier has first and second sub-multiplexers, one sub-multiplexer selecting the corresponding sample holder output and the other sub-multiplexer selecting a reference voltage.

Proceedings Article•DOI•

A structured approach for designing low power adders

A.M. Shams¹, Magdy Bayoumi•Institutions (1)

University of Louisiana at Lafayette¹

02 Nov 1997

TL;DR: The adder cell is anatomized into smaller modules using the proposed structured approach to construct 24 different 1-bit full adder cells that exhibit different power consumption, speed, area, and driving capability figures.

Abstract: A performance analysis of a general 1-bit full adder cell is presented. The adder cell is anatomized into smaller modules using the proposed structured approach. The modules are studied extensively and several designs of each of them are shown. By connecting combinations of designs of these modules together, we construct 24 different 1-bit full adder cells (some of them are novel circuits). Each of these cells exhibits different power consumption, speed, area, and driving capability figures. Some of the new cells outperform existing standard designs of the full adder cell.

Journal Article•DOI•

Arithmetic built-in self-test for DSP cores

K. Radecka¹, J. Rajski, J. Tyszer•Institutions (1)

Bell Labs¹

01 Nov 1997-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: A new built-in self-test (BIST) methodology is presented in which all generation and compaction functions are executed by basic building blocks such as adders, ALU's, and multipliers, performing regular arithmetic functions in digital signal processing (DSP) cores.

Abstract: A new built-in self-test (BIST) methodology is presented in which all generation and compaction functions are executed by basic building blocks such as adders, ALU's, and multipliers, performing regular arithmetic functions in digital signal processing (DSP) cores. It is demonstrated how these components are themselves tested, and subsequently used to perform more complex testing functions. The need for extra hardware is either entirely eliminated or drastically reduced, test vectors can be easily distributed to different modules of the system, test responses can be collected in parallel, and there is virtually no performance degradation. As an integral part of the proposed BIST environment, arithmetic two-dimensional (2-D) generators of pseudorandom test vectors are also introduced to further integrate the scheme with parallel scan and boundary scan designs used to test peripheral devices of the core.

Journal Article•DOI•

VLSI array algorithms and architectures for RSA modular multiplication

Yong-Jin Jeong¹, Wayne Burleson²•Institutions (2)

Samsung¹, University of Massachusetts Amherst²

01 Jun 1997-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: Two novel iterative algorithms and their array structures for integer modular multiplication are presented; they are designed for Rivest-Shamir-Adleman (RSA) cryptography and are based on the familiar iterative Horner's rule, but use precalculated complements of the modulus.

Abstract: We present two novel iterative algorithms and their array structures for integer modular multiplication. The algorithms are designed for Rivest-Shamir-Adleman (RSA) cryptography and are based on the familiar iterative Horner's rule, but use precalculated complements of the modulus. The problem of deciding which multiples of the modulus to subtract in intermediate iteration stages has been simplified using simple look-up of precalculated complement numbers, thus allowing a finer-grain pipeline. Both algorithms use a carry save adder scheme with modulo reduction performed on each intermediate partial product, which results in an output in carry-save format. Regularity and local connections make both algorithms suitable for high-performance array implementation in FPGA's or deep submicron VLSI. The processing nodes consist of just one or two full adders and a simple multiplexor. The stored complement numbers need to be precalculated only when the modulus is changed, thus not affecting the performance of the main computation. In both cases, there exists a bit-level systolic schedule, which means the array can be fully pipelined for high performance and can also easily be mapped to linear arrays for various space/time tradeoffs.

Patent•

Computer-implemented multiplication with shifting of pattern-product partials

Kenneth Alan Dockser¹•Institutions (1)

VLSI Technology¹

24 Jan 1997

TL;DR: A constant multiplication device is designed for multiplying a received binary multiplicand by a constant multiplier which, when expressed in binary or signed-digit notation, includes a repeated pattern with three or more non-zero values.

Abstract: A constant multiplication device is designed for multiplying a received binary multiplicand by a constant multiplier which, when expressed in binary or signed-digit notation, includes a repeated pattern with three or more non-zero values. The device includes a pattern-product term generator that receives the multiplicand and generates terms corresponding to each of the non-zero values of the pattern. If, when all instances of the pattern are subtracted from the multiplier there are non-zero values in the difference, the pattern-product term generator can also generate remainder-product terms. The pattern-product terms, but not the remainder-product terms, are input to a pattern compressor that yields pattern-product partials; the compressor can be a carry-save adder and the partials can be in the form of a pseudo sum and a pseudo carry. A replica generator generates shifted replicas of each pattern-product partial. The replicas are input to a replica compressor, as are any remainder-product terms. The replica compressor converts these inputs to final-product partials. The replica compressor can be a carry-save adder and the final-product partials can be a pseudo sum and a pseudo carry. These are input to a product ripple accumulator, which can be a carry-propagate adder, to yield the product of the multiplicand and the multiplier. Since there is only one ripple stage, the device provides for relatively high-speed multiplication for multipliers with repeated patterns.
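
The stages named in this abstract can be traced with a small integer model. The sketch below is illustrative rather than the patented device: the example multiplier `0b1_1011_1011` (the pattern `1011` at shifts 0 and 4, plus one remainder bit at shift 8), all constant and function names, and the use of unbounded non-negative Python integers are assumptions chosen for the example.

```python
def carry_save_add(x: int, y: int, z: int):
    """3:2 compressor: return (pseudo sum, pseudo carry) with sum + carry == x + y + z."""
    pseudo_sum = x ^ y ^ z
    pseudo_carry = ((x & y) | (x & z) | (y & z)) << 1
    return pseudo_sum, pseudo_carry


def compress(terms):
    """Reduce any number of terms to two partials using carry-save adders only."""
    terms = list(terms)
    while len(terms) > 2:
        x, y, z = terms[:3]
        terms = terms[3:] + list(carry_save_add(x, y, z))
    while len(terms) < 2:
        terms.append(0)
    return terms[0], terms[1]


MULTIPLIER = 0b1_1011_1011     # 443, hypothetical constant for this example
PATTERN_SHIFTS = (0, 1, 3)     # non-zero positions of the pattern 0b1011
REPLICA_SHIFTS = (0, 4)        # where the pattern occurs in the multiplier
REMAINDER_SHIFTS = (8,)        # multiplier bits left after removing the patterns


def multiply_by_constant(x: int) -> int:
    """Multiply a non-negative integer by MULTIPLIER with one carry-propagate step."""
    # Pattern-product terms, compressed to pattern-product partials.
    pattern_sum, pattern_carry = compress(x << s for s in PATTERN_SHIFTS)
    # Replica generator: shifted copies of each partial, one per pattern instance.
    replicas = [p << s for s in REPLICA_SHIFTS for p in (pattern_sum, pattern_carry)]
    # Remainder-product terms bypass the pattern compressor.
    remainder_terms = [x << s for s in REMAINDER_SHIFTS]
    final_sum, final_carry = compress(replicas + remainder_terms)
    # The only carry-propagate (ripple) addition in the whole computation.
    return final_sum + final_carry
```

The pattern product is computed once and reused for every instance of the pattern, so the number of terms entering the compressors grows with the number of pattern instances rather than with the number of non-zero multiplier bits.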

Patent•

Symbol-matched filter having a low silicon and power requirement

Jimmy Cuong Tran, Sorin Davidovici

20 Aug 1997

TL;DR: A spread spectrum matched filter for use with a spread spectrum receiver comprises a first plurality of shift registers (131), a second plurality of shift registers (132), a control processor (138), a multiplexer (133), a plurality of data shift registers (134), a plurality of exclusive-OR (XOR) gates (135), an adder tree (136), a memory (137), and an adder (139).

Abstract: A spread spectrum matched filter for use with a spread spectrum receiver comprises a first plurality of shift registers (131), a second plurality of shift registers (132), a control processor (138), a multiplexer (133), a plurality of data shift registers (134), a plurality of exclusive-OR (XOR) gates (135), an adder tree (136), a memory (137), and an adder (139). The first plurality of shift registers (131) stores a first portion of a reference-chip-sequence signal and the second plurality of shift registers (132) stores a second portion of the reference-chip-sequence signal. The multiplexer (133), responsive to the control processor (138), outputs the first portion during a first clock cycle and the second portion during a second clock cycle. The plurality of XOR gates (135) multiply the first portion, during the first clock cycle, by a plurality of input-data-samples shifted through the data shift registers (134) to generate a first plurality of product-output signals. The plurality of XOR gates (135) multiply the second portion by a plurality of input-data-samples shifted through the data shift registers (134) to generate a second plurality of product-output signals. The adder tree (136) adds the first plurality of product-output signals as a first sum, which is stored in the memory (137). The adder tree (136) adds the second plurality of product-output signals as a second sum. The adder (139) adds the first and second sums.
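
The two-cycle correlation described in this abstract can be modelled loosely in Python as follows. Several details are assumptions rather than facts from the patent: chips and data samples are taken to be single bits, an XOR result of 0 is read as a product of +1 and 1 as -1, and each half of the reference sequence is paired with the corresponding half of the data window. All function names are hypothetical.

```python
def adder_tree(values):
    """Sum a list by pairwise addition, level by level."""
    values = list(values)
    if not values:
        return 0
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)              # pad odd levels with a zero input
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
    return values[0]


def xor_products(ref_bits, data_bits):
    """XOR each chip with a data bit; 0 (match) -> +1, 1 (mismatch) -> -1."""
    return [1 - 2 * (r ^ d) for r, d in zip(ref_bits, data_bits)]


def matched_filter_output(ref_bits, data_bits):
    """Correlate in two passes that share one set of XOR gates and one adder tree."""
    half = len(ref_bits) // 2
    # First clock cycle: first portion of the reference; sum held in memory.
    first_sum = adder_tree(xor_products(ref_bits[:half], data_bits[:half]))
    # Second clock cycle: second portion of the reference.
    second_sum = adder_tree(xor_products(ref_bits[half:], data_bits[half:]))
    return first_sum + second_sum         # final adder combines both sums
```

Reusing the same XOR gates and adder tree for both halves is what halves the hardware in this model, at the cost of a second clock cycle per output.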

Patent•

Booth multiplier with low power, high performance input circuitry

Tam-Anh Chu¹•Institutions (1)

Cirrus Logic¹

11 Apr 1997

TL;DR: A Booth multiplier for multiplying a first number with a second number to produce a product has an array of adder cells arranged in a plurality of rows and is provided with input circuitry that reduces the power consumption of the multiplier.

Abstract: A Booth multiplier for multiplying a first number with a second number to produce a product has an array of adder cells arranged in a plurality of rows of adder cells and is provided with input circuitry that reduces the power consumption of the multiplier. This input circuitry includes a plurality of Booth recoding logic cells that provide the control signals to multiplexers in the adder cells in the array. The Booth recoding logic cells receive different subsets of bits of the second number and generate the Booth recoded control signals as a function of the received subset of bits. Each Booth recoding logic cell includes balanced logic circuitry for generating all of the Booth recoded control signals from that Booth recoding logic cell at the same time. The balanced logic circuitry minimizes temporary short-circuit paths in the multiplexers in the adder cells. The input circuitry also includes a split bus that provides the first number to the array. The split bus has a first branch that provides the first number unbuffered to the top row of the array, and a second branch having a buffer circuit and providing the first number buffered to the other rows of the array. The buffer circuit has low-power, low-speed buffers since the top row is able to receive the first number unbuffered, and the remaining rows in the array do not need to receive the first number until after the top row of adder cells completes its addition.
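
To make the role of the Booth recoding cells concrete, here is a small Python sketch of radix-4 Booth recoding, in which each cell looks at an overlapping group of three multiplier bits and emits a digit in {-2, -1, 0, 1, 2}. The abstract does not state the radix, so radix-4 is an assumption, and the sketch models only the arithmetic, not the balanced logic or the split bus. The function names are hypothetical.

```python
def booth_radix4_digits(y, nbits):
    """Recode an nbits-wide two's-complement multiplier (nbits even) into
    radix-4 Booth digits, least significant digit first."""
    y &= (1 << nbits) - 1                 # view y as an nbits-wide bit pattern
    digits = []
    prev = 0                              # implicit bit to the right of bit 0
    for i in range(0, nbits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # one recoding cell: bits i+1, i, i-1
        prev = b1
    return digits


def booth_multiply(x, y, nbits):
    """Multiply x by y using the recoded digits as row selectors."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, nbits)))
```

Each digit selects 0, ±x or ±2x for one row of the adder array, so an n-bit multiplier needs only n/2 rows in this scheme.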

Patent•

Interface for performing parallel arithmetic and round operations

Brent R. Boswell¹, Karol F. Menezes¹•Institutions (1)

Intel¹

29 Dec 1997

TL;DR: An interface circuit performs the last step of an arithmetic operation and a round operation in parallel, using two adder circuits connected in parallel and a selection circuit connected to their outputs.

Abstract: An interface circuit performs a last step of an arithmetic operation and a round operation in parallel. The interface circuit includes a first adder circuit that receives as an input a true result of an arithmetic operation in an intermediate format. The first adder circuit outputs both the true result in a final format and a first representable number approximating the true result. A second adder circuit is connected in parallel to the first adder circuit. The second adder circuit receives the true result in the intermediate format and a 1 as inputs. The second adder circuit outputs a second representable number approximating the true result. The interface circuit also includes a selection circuit connected to the outputs of the first and second adder circuits. The selection circuit outputs either the first or second representable numbers as a rounded result of the arithmetic operation.
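
The compute-both-then-select idea in this abstract can be sketched in Python as below. The sketch makes assumptions the abstract does not state: the intermediate result is modelled as a non-negative integer with `guard_bits` extra low-order bits, and the selection rule is round-to-nearest with ties to even. The name `round_parallel` is hypothetical.

```python
def round_parallel(value, guard_bits):
    """Round a non-negative fixed-point value by dropping guard_bits (>= 1) low bits.

    Both candidate results are formed independently, then one is selected.
    """
    lower = value >> guard_bits           # first adder path: truncated result
    upper = lower + 1                     # second adder path: same input plus 1
    remainder = value & ((1 << guard_bits) - 1)
    half = 1 << (guard_bits - 1)
    # Selection circuit: round to nearest, ties to even (an assumed rounding mode).
    round_up = remainder > half or (remainder == half and lower & 1 == 1)
    return upper if round_up else lower
```

Since `lower` and `upper` do not depend on each other, hardware can compute them side by side and leave only the final selection on the critical path.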

Journal Article•DOI•

Parallel optical negabinary arithmetic based on logic operations.

Guoqiang Li¹, Liren Liu¹, Lan Shao¹, Yaozu Yin¹, Jiawen Hua¹•Institutions (1)

Academia Sinica¹

10 Feb 1997-Applied Optics

TL;DR: The proposed algorithm and optical system are simple, reliable, and practicable, and they have the property of parallel processing of two-dimensional data.

Abstract: On the basis of signed-digit negabinary representation, parallel two-step addition and one-step subtraction can be performed for arbitrary-length negabinary operands. The arithmetic is realized by signed logic operations and optically implemented by spatial encoding and decoding techniques. The proposed algorithm and optical system are simple, reliable, and practicable, and they have the property of parallel processing of two-dimensional data. This leads to an efficient design for the optical arithmetic and logic unit.
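
For readers unfamiliar with negabinary (base -2) numbers, the following Python sketch converts integers to and from ordinary negabinary digits. It is background only: it uses digits 0 and 1, not the signed-digit form used in the paper, and it does not model the two-step addition or the optical implementation. The function names are hypothetical.

```python
def to_negabinary(n):
    """Return the base -2 digits of integer n, least significant digit first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, r = divmod(n, -2)              # Python gives r in {0, -1} here
        if r < 0:                         # fold a negative remainder into the quotient
            n += 1
            r += 2
        digits.append(r)
    return digits


def from_negabinary(digits):
    """Evaluate base -2 digits given least significant digit first."""
    return sum(d * (-2) ** i for i, d in enumerate(digits))
```

Because the base is negative, both positive and negative integers are representable without a separate sign bit, which is what makes the representation attractive for uniform parallel processing.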

Patent•

Address generation with systems having programmable modules

Martin Vorbach, Robert Muench

04 Feb 1997

TL;DR: Programmable modules form a cluster (0109) in which connections determine the X and Y locations; a register (0105) stores the number of elements in the X direction and combines with an adder (0106) using connections (0101).

Abstract: In one version of a data processing system, programmable modules form a cluster (0109). Connections (0101, 0102) determine the X and Y locations. A register (0105) stores the number of elements in the X direction and combines with an adder (0106) using connections (0101). A register (0103) and an adder (0104) perform the same operation for the Y axis.

Patent•

Interference canceller for CDMA

Shousei Yoshida¹, Akihisa Ushirokawa¹•Institutions (1)

NEC¹

10 Mar 1997

TL;DR: A code-orthogonalizing filter performs inverse spread using an orthogonalizing coefficient, which is obtained through a constraint condition process on a code-multiplexed received signal as an input with a desired spread code waveform and independent of transmission line variations.

Abstract: A code-orthogonalizing filter 101 performs inverse spread using an orthogonalizing coefficient, which is obtained through a constraint condition process on a code-multiplexed received signal as an input with a desired spread code waveform and independent of transmission line variations, thus detecting a desired wave at a constant power level while suppressing interference waves. A carrier tracking circuit 102 effects carrier phase synchronization of the detected desired wave. A symbol decision unit 103 decides the output of the carrier tracking circuit 102 to be the most certain symbol. An adder 104 extracts a symbol decision error signal from the outputs of the symbol decision unit 103 and the carrier tracking circuit 102. A tap coefficient control means 105 adaptively updates the tap coefficient according to the input to the code-orthogonalizing filter 101, a reproduced carrier outputted from the carrier tracking circuit 102, a symbol decision error signal outputted from the adder 104 and the desired wave spread code waveform.

Patent•

Apparatus, systems and method for improving memory bandwidth utilization in vector processing systems

Gary Gostin¹, Matthew F. Barr¹, Ruth A. McGuffey¹, Russell L. Roan¹•Institutions (1)

Hewlett-Packard¹

17 Jan 1997

TL;DR: Vector register circuitry includes a vector register file comprising at least one vector register having a plurality of elements, the vector register file further having at least one data port and at least one address port for accessing selected elements of the vector register.

Abstract: Vector register circuitry is provided which includes a vector register file comprising at least one vector register having a plurality of elements, the vector register file further having at least one data port and at least one address port for accessing selected ones of the elements of the vector register. Address generation circuitry is provided coupled to the address port and includes an adder having an output coupled to the address port, a first element register having an output coupled to a first input of the adder and an element counter having an output coupled to a second input of the adder.
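
The address generation path in this abstract (a first element register and an element counter feeding an adder whose output drives the address port) can be modelled in a few lines of Python. The wrap-around at the end of the register is an assumption added to keep the sketch self-contained; the abstract does not say how out-of-range addresses are handled. The name `element_addresses` is hypothetical.

```python
def element_addresses(first_element, count, num_elements):
    """Yield the element address for each step of the element counter."""
    for counter in range(count):
        # Adder: first element register + element counter.
        # The modulo wrap is an assumption, not stated in the abstract.
        yield (first_element + counter) % num_elements
```

For instance, `list(element_addresses(5, 4, 8))` walks four consecutive elements of an eight-element vector register starting at element 5.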
