Showing papers on "Adder" published in 1997


Journal ArticleDOI
TL;DR: This paper shows that complementary CMOS is the logic style of choice for the implementation of arbitrary combinational circuits if low voltage, low power, and small power-delay products are of concern.
Abstract: Recently reported logic style comparisons based on full-adder circuits claimed complementary pass-transistor logic (CPL) to be much more power-efficient than complementary CMOS. However, new comparisons performed on more efficient CMOS circuit realizations and a wider range of different logic cells, as well as the use of realistic circuit arrangements demonstrate CMOS to be superior to CPL in most cases with respect to speed, area, power dissipation, and power-delay products. An implemented 32-b adder using complementary CMOS has a power-delay product of less than half that of the CPL version. Robustness with respect to voltage scaling and transistor sizing, as well as generality and ease-of-use, are additional advantages of CMOS logic gates, especially when cell-based design and logic synthesis are targeted. This paper shows that complementary CMOS is the logic style of choice for the implementation of arbitrary combinational circuits if low voltage, low power, and small power-delay products are of concern.

911 citations


DissertationDOI
01 Jan 1997
TL;DR: It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design.
Abstract: The addition of two binary numbers is the fundamental and most often used arithmetic operation on microprocessors, digital signal processors (DSP), and data-processing application-specific integrated circuits (ASIC). Therefore, binary adders are crucial building blocks in very large-scale integrated (VLSI) circuits. Their efficient implementation is not trivial because a costly carry-propagation operation involving all operand bits has to be performed. Many different circuit architectures for binary addition have been proposed over the last decades, covering a wide range of performance characteristics. Also, their realization at the transistor level for full-custom circuit implementations has been addressed intensively. However, the suitability of adder architectures for cell-based design and hardware synthesis, both prerequisites for the ever-increasing productivity in ASIC design, was hardly investigated. Based on the various speed-up schemes for binary addition, a comprehensive overview and a qualitative evaluation of the different existing adder architectures are given in this thesis. In addition, a new multilevel carry-increment adder architecture is proposed. It is found that the ripple-carry, the carry-lookahead, and the proposed carry-increment adders show the best overall performance characteristics for cell-based design. These three adder architectures, which together cover the entire range of possible area vs. delay trade-offs, are comprised in the more general prefix adder architecture reported in the literature. It is shown that this universal and flexible prefix adder structure also allows the realization of various customized adders and of adders fulfilling arbitrary timing and area constraints. A non-heuristic algorithm for the synthesis and optimization of prefix adders is proposed. It allows the runtime-efficient generation of area-optimal adders for given timing constraints.
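
The prefix formulation of addition mentioned in this abstract can be illustrated with a small software model. The sketch below is a generic Kogge-Stone-style prefix network over generate/propagate signals, not the synthesis algorithm proposed in the thesis; the function name `prefix_add` and the default width are assumptions made for illustration.

```python
def prefix_add(a, b, width=8):
    """Add two unsigned integers using a Kogge-Stone-style prefix carry network."""
    # Bitwise generate and propagate signals, least significant bit first.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    half_sum = p[:]  # original propagate bits are needed for the final sum

    # Prefix levels: after the last level, g[i] is the carry out of bit i.
    dist = 1
    while dist < width:
        new_g, new_p = g[:], p[:]
        for i in range(dist, width):
            new_g[i] = g[i] | (p[i] & g[i - dist])
            new_p[i] = p[i] & p[i - dist]
        g, p = new_g, new_p
        dist *= 2

    # Sum bit i is the half sum XOR the carry into bit i (carry-in is 0).
    result = 0
    for i in range(width):
        carry_in = g[i - 1] if i > 0 else 0
        result |= (half_sum[i] ^ carry_in) << i
    result |= g[width - 1] << width  # the carry out becomes the top bit
    return result
```

Replacing the inner loop with a serial chain (each bit combining only with its direct neighbour) would model a ripple-carry adder, which is one way to see why the prefix structure covers a range of area vs. delay trade-offs.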

268 citations


Journal ArticleDOI
TL;DR: It is found that circuits with a large number of critical paths and with a low logic depth are most sensitive to uncorrelated gate delay variations, and scenarios for future technologies show the increased impact of uncorrelated delay variations on digital design.
Abstract: The yield of low voltage digital circuits is found to be sensitive to local gate delay variations due to uncorrelated intra-die parameter deviations. Caused by statistical deviations of the doping concentration, they lead to more pronounced delay variations for minimum transistor sizes. Their influence on path delays in digital circuits is verified using a carry select adder test circuit fabricated in 0.5 and 0.35 μm complementary metal-oxide-semiconductor (CMOS) technologies with two different threshold voltages. The increase of the path delay variations for smaller device dimensions and reduced supply voltages as well as the dependence on the path length is shown. It is found that circuits with a large number of critical paths and with a low logic depth are most sensitive to uncorrelated gate delay variations. Scenarios for future technologies show the increased impact of uncorrelated delay variations on digital design. A reduction of the maximal clock frequency of 10% is found, for example, for highly pipelined systems realized in a 0.18-μm CMOS technology.

177 citations


Patent
15 Oct 1997
TL;DR: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different data input signal line group and multiplying the supplied mantissa parts together, as discussed by the authors.
Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit. The arithmetic portion includes a plurality of multipliers, each supplied with the mantissa parts of floating point numbers from a different data input signal line group and multiplying the supplied mantissa parts together; an aligner receiving the outputs of the respective multipliers and performing an alignment shift; an exponent processing portion for generating the alignment shift amount of the aligner and an exponent before normalization on the basis of the exponent parts of the floating point numbers; and a multi-input adder. The arrangement reduces the scale of the circuit and performs inner product operations and the like on floating point numbers at high speed and with high accuracy.

172 citations


Patent
13 Nov 1997
TL;DR: In this paper, compare-select features are implemented in an adder and an arithmetic logic unit (ALU) to reduce the computational complexity in determining extrema in a data processor.
Abstract: A data processor determines an overall extremum value of an input set of array data, with the input set of array data partitionable into a first set of array data and a second set of array data. The data processor includes a pair of compare-select circuits implemented in an adder as well as in an arithmetic-logic unit (ALU), respectively, which operate in parallel for respectively processing the first set and the second set, and for respectively determining first and second extremum values of the first set and the second set, respectively. A first compare-select circuit of the pair of compare-select circuits determines the overall extremum value of the input set of array data from the first and second extremum values. The first compare-select circuit also determines the location of the overall extremum value in the input set of array data. The computational complexity in determining extrema is reduced by implementing compare-select features in an adder in addition to an ALU to operate in parallel to reduce the number of processing cycles.
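
The partition-and-combine scheme described in this abstract can be modelled in a few lines. This is only a sequential software sketch under assumptions: the two `compare_select` calls stand in for the two hardware circuits that the patent runs in parallel, the extremum is taken to be a maximum, and all function names are hypothetical.

```python
def compare_select(values, offset=0):
    """Return (maximum value, its index in the full array) for one partition."""
    best_val, best_idx = values[0], offset
    for i, v in enumerate(values[1:], start=1):
        if v > best_val:  # compare, then select the larger candidate
            best_val, best_idx = v, offset + i
    return best_val, best_idx


def overall_max(data):
    """Split the input in two, find each partition's maximum, then combine."""
    assert len(data) >= 2
    mid = len(data) // 2
    v1, i1 = compare_select(data[:mid], 0)    # first set of array data
    v2, i2 = compare_select(data[mid:], mid)  # second set of array data
    # A final compare-select picks the overall extremum and its location.
    return (v1, i1) if v1 >= v2 else (v2, i2)
```

With ties resolved in favour of the first partition, `overall_max` reports the earliest location of the maximum.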

161 citations


Book
01 Oct 1997
TL;DR: This chapter discusses Built-In Self-Test, High-Level Synthesis, and Implementation-Dependent Fault Grading, which aims to improve the quality of Diagnostic Resolution in Scan-Based Designs.
Abstract: 1. Built-In Self-Test. Introduction. Design for Testability. Generation of Test Vectors. Compaction of Test Responses. BIST Schemes for Random Logic. BIST for Memory Arrays. 2. Generation of Test Vectors. Additive Generators of Exhaustive Patterns. Other Generation Schemes. Two-Dimensional Generators. 3. Test-Response Compaction. Binary Adders. 1's Complement Adders. Rotate-Carry Adders. Cascaded Compaction Scheme. 4. Fault Diagnosis. Analytical Model. Experimental Validation. The Quality of Diagnostic Resolution. Fault Diagnosis in Scan-Based Designs. 5. BIST of Data-Path Kernel. Testing of ALU. Testing of the MAC Unit. Testing of the Microcontroller. 6. Fault Grading. Fault Simulation Framework. Functional Fault Simulation. Experimental Results. 7. High-Level Synthesis. Implementation-Dependent Fault Grading. Synthesis Steps. Simulation Results. 8. ABIST at Work. Testing of Random Logic. Memory Testing. Digital Integrators. Leaking Integrators. 9. Epilog. Bibliography. A. Tables of Generators. B. Assembly Language. Index.

129 citations


Proceedings ArticleDOI
16 Apr 1997
TL;DR: A non-restoring square root algorithm and two very simple single precision floating point square root implementations on FPGAs based on it: a low-cost iterative implementation that uses a traditional adder/subtracter and a high-throughput pipelined implementation.
Abstract: The square root operation is hard to implement on FPGAs because of the complexity of the algorithms. In this paper, we present a non-restoring square root algorithm and two very simple single precision floating point square root implementations based on the algorithm on FPGAs. One is a low-cost iterative implementation that uses a traditional adder/subtracter. The operation latency is 25 clock cycles and the issue rate is 24 clock cycles. The other is a high-throughput pipelined implementation that uses multiple adder/subtracters. The operation latency is 15 clock cycles and the issue rate is one clock cycle. It means that the pipelined implementation is capable of accepting a square root instruction on every clock cycle.
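
To show what "non-restoring" means here, the following sketch implements a textbook integer non-restoring square root, where each iteration performs one addition or one subtraction and never undoes a step. It is not taken from the paper: it works on unsigned integers rather than floating point mantissas, and the function name and default width are assumptions.

```python
def nonrestoring_isqrt(d, bits=32):
    """Return (q, r) with q = floor(sqrt(d)) and r = d - q*q, for 0 <= d < 2**bits."""
    assert bits % 2 == 0 and 0 <= d < (1 << bits)
    q = 0  # partial root, one bit appended per iteration
    r = 0  # partial remainder, allowed to become negative
    for i in reversed(range(bits // 2)):
        pair = (d >> (2 * i)) & 3  # next two radicand bits
        if r >= 0:
            r = r * 4 + pair - (q * 4 + 1)  # subtract the trial term
        else:
            r = r * 4 + pair + (q * 4 + 3)  # add instead of restoring
        q = q * 2 + (1 if r >= 0 else 0)    # root bit follows the remainder sign
    if r < 0:
        r += q * 2 + 1  # one final correction so the remainder is non-negative
    return q, r
```

Each loop pass maps naturally onto a single adder/subtracter, which is consistent with the iterative and pipelined organisations the abstract describes.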

109 citations


Proceedings ArticleDOI
16 Apr 1997
TL;DR: A framework and tools for automating the production of designs which can be partially reconfigured at run time, and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells.
Abstract: This paper describes a framework and tools for automating the production of designs which can be partially reconfigured at run time. The tools include: a partial evaluator, which produces configuration files for a given design, where the number of configurations can be minimised by a process, known as compile-time sequencing; an incremental configuration calculator, which takes the output of the partial evaluator and generates an initial configuration file and incremental configuration files that partially update preceding configurations; and a tool which further optimises designs for FPGAs supporting simultaneous configuration of multiple cells. While many of our techniques are independent of the design language and device used, our tools currently target Xilinx 6200 devices. Simultaneous configuration, for example, can be used to reduce the time for reconfiguring an adder to a subtractor from time linear with respect to its size to constant time at best and logarithmic time at worst.

103 citations


Proceedings ArticleDOI
07 Apr 1997
TL;DR: This paper presents an in-depth case study in high-performance asynchronous adder design that uses single-rail bundled datapaths but also allows early completion, and introduces five new dynamic designs for Brent-Kung and Carry-Bypass adders.
Abstract: This paper presents an in-depth case study in high-performance asynchronous adder design. A recent method, called "speculative completion", is used. This method uses single-rail bundled datapaths but also allows early completion. Five new dynamic designs are presented for Brent-Kung and Carry-Bypass adders. Furthermore, two new architectures are introduced, which target (i) small number addition, and (ii) hybrid operation. Initial SPICE simulation and statistical analysis show performance improvements up to 19% on random inputs and 14% on actual programs for 32-bit adders, and up to 29% on random inputs for 64-bit adders, over comparable synchronous designs.

101 citations


Patent
12 Sep 1997
TL;DR: An arithmetic cell to be used in field programmable devices is defined in this article, which allows efficient implementations of multipliers, multipliers/accumulators and adders (addition, compare, and subtraction) in one compact cell that is a collection of circuits common to FPGA devices.
Abstract: An arithmetic cell to be used in field programmable devices is defined in this invention. This cell will allow efficient implementations of multipliers, multipliers/accumulators and adders (addition, compare, and subtraction) in one compact cell that is a collection of circuits common to field programmable devices. This cell may be used in a flexible manner that allows full multipliers of any dimension (n*m products), adders of any length (n+m sums, compare, differences), accumulators, and registers (to hold complete results or partial products). Key elements in this invention are an application controlled multiplexer, signal routing to provide a shift function for multiplication, and a minimum collection of configuration bits and circuit elements to perform the basic arithmetic functions.

97 citations


Patent
04 Mar 1997
TL;DR: In this article, the spread-spectrum code synchronizing speed of a down control channel is improved by using different first spread-spectrum codes having the repetitive periods of information symbol periods from each first spread-spectrum code (short code) generating section.
Abstract: The spread-spectrum code synchronizing speed of a down control channel is improved. The spectra of a control channel information signal and each communication channel information signal are spread by using different first spread-spectrum codes having the repetitive periods of information symbol periods from each first spread-spectrum code (short code) generating section (11). Then only the spectrum of the control channel information signal is spread by using third spread-spectrum codes, which are the complex conjugate of a common long code (second spread-spectrum code), from a third spread-spectrum code (complex conjugate code of a long code mask section) generating section (12). Thereafter, the signals of all channels are added at an adequate timing by an adder (13), the spectrum of the output of the adder (13) is spread by using second spread-spectrum codes from a second spread-spectrum code generating section (14), and the spread-spectrum signal is outputted as a spread modulated signal.

Proceedings ArticleDOI
06 Mar 1997
TL;DR: The method for designing bipartite tables, called the Symmetric Bipartite Table Method, utilizes symmetry in the table entries to reduce the overall memory requirements and requires smaller table lookups to achieve a given accuracy.
Abstract: The paper presents a methodology for designing bipartite tables for accurate function approximation. Bipartite tables use two parallel table lookups to obtain a carry-save (borrow-save) function approximation. A carry propagate adder can then convert this approximation to a two's complement number or the approximation can be directly Booth encoded. Our method for designing bipartite tables, called the Symmetric Bipartite Table Method, utilizes symmetry in the table entries to reduce the overall memory requirements. It has several advantages over previous bipartite table methods in that it: (1) provides a closed form solution for the table entries; (2) has tight bounds on the maximum absolute error; (3) requires smaller table lookups to achieve a given accuracy; and (4) can be applied to a wide range of functions. Compared to conventional table lookups, the symmetric bipartite tables presented are 15.0 to 41.7 times smaller when the operand size is 16 bits and 99.1 to 273.9 times smaller when the operand size is 24 bits.

Patent
30 Jun 1997
TL;DR: In this article, a digital signal processor architecture particularly adapted for performing fast Fourier transform (FFT) algorithms efficiently is presented, comprising dual, parallel multiply and accumulate units whose multiplier outputs are cross-coupled to the adder units.
Abstract: A digital signal processor architecture particularly adapted for performing fast Fourier Transform algorithms efficiently. The architecture comprises dual, parallel multiply and accumulate units in which the output of the multiplier circuit portion of each MAC is cross-coupled to an input of the adder unit of the other MAC as well as to an input of the adder unit of the same MAC to which the multiplier belongs.

Journal ArticleDOI
TL;DR: Using an enhanced multiple output domino logic (EMODL) implementation of a carry lookahead adder (CLA), sums of several consecutive bits can be built in one nFET tree with a single carry-in.
Abstract: Using an enhanced multiple output domino logic (EMODL) implementation of a carry lookahead adder (CLA), sums of several consecutive bits can be built in one nFET tree with a single carry-in. Based on this result, a new sparse carry chain architecture is proposed for the CLA. We demonstrate the design approach using a 32-b adder, and show that only four carries are sufficient for generating all sums, with a consequent reduction in the number of stage delays. Using a 1.2-μm CMOS technology, we verify our simulation procedures by fabrication and measurement of a 2.7 ns critical path.

Patent
Min-joong Rim
05 Aug 1997
TL;DR: In this article, a modulo address generator is presented, which includes a first adder for adding a current address to an address increment to generate an incremented address, an inverter for producing a complement of a maximum address, a second adder for generating a circular correction value by adding the complement of the maximum address to a minimum address, and an adder/subtracter for generating a corrected next address by adding or subtracting the circular correction value to or from the incremented address according to a sign value of the address increment.
Abstract: A modulo address generating apparatus and method are disclosed which obtain high speed performance with reduced integrated circuit area. A modulo address generator according to the present invention includes a first adder for adding a current address to an address increment to generate an incremented address, an inverter for producing a complement of a maximum address, a second adder for generating a circular correction value by adding the complement of the maximum address to a minimum address, an adder/subtracter for generating a corrected next address by adding or subtracting the circular correction value to or from the incremented address according to a sign value of the address increment, a comparator for checking whether the incremented address is within an address range defined by the maximum and minimum addresses, and a multiplexor controlled by the comparator which selects the incremented address for output as a next address when the incremented address is within the address range and selects the corrected address for output as the next address when the incremented address is outside the address range.
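
The datapath in this abstract can be traced with a short behavioural model. The sketch below follows the named blocks one by one, but it is a software approximation under assumptions: addresses are plain Python integers rather than fixed-width words, the increment magnitude does not exceed the buffer length, and the function name `next_address` is hypothetical.

```python
def next_address(current, increment, min_addr, max_addr):
    """Behavioural model of one step of a circular (modulo) address generator."""
    incremented = current + increment   # first adder
    complement = ~max_addr              # inverter: one's complement of the maximum
    correction = complement + min_addr  # second adder: min - max - 1 = -(buffer length)

    # Adder/subtracter: the sign of the increment selects add or subtract.
    if increment >= 0:
        corrected = incremented + correction  # wrap back after passing the maximum
    else:
        corrected = incremented - correction  # wrap forward after passing the minimum

    # Comparator and multiplexor: keep the incremented address if it is in range.
    in_range = min_addr <= incremented <= max_addr
    return incremented if in_range else corrected
```

For a buffer spanning addresses 4 to 7, stepping by +2 from address 7 yields 5, and stepping by -1 from address 4 yields 7.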


Proceedings ArticleDOI
12 Oct 1997
TL;DR: The IP fault model is described and a method for analyzing IP faults using standard SSL-based fault simulators and test generation tools is provided, used to generate test sets that target the IP faults of the ISCAS85 benchmark circuits and a carry-lookahead adder.
Abstract: Recent work in IC failure analysis strongly indicates the need for fault models that directly analyze the function of circuit primitives. The input pattern (IP) fault model is a functional fault model that allows for both complete and partial functional verification of every circuit module, independent of the design level. We describe the IP fault model and provide a method for analyzing IP faults using standard SSL-based fault simulators and test generation tools. The method is used to generate test sets that target the IP faults of the ISCAS85 benchmark circuits and a carry-lookahead adder. Improved IP fault coverage for the benchmarks and the adder is obtained by adding a small number of test patterns to tests that target only SSL faults. We also conducted fault simulation experiments that show IP test patterns are effective in detecting non-targeted faults such as bridging and transistor stuck-on faults. Finally, we discuss the notion of IP redundancy and show how large amounts of this redundancy exist in the benchmarks and in SSL-irredundant adder circuits.

Patent
16 May 1997
TL;DR: In this paper, a multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described, and new instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction.
Abstract: A multiplier array processing system which improves the utilization of the multiplier and adder array for lower-precision arithmetic is described. New instructions are defined which provide for the deployment of additional multiply and add operations as a result of a single instruction, and for the deployment of greater multiply and add operands as the symbol size is decreased.

Patent
19 Feb 1997
TL;DR: In this paper, the demodulator has a plurality of matched filters in parallel, each with a different binary PN code, a plurality of sample holders, a plurality of multipliers, an adder, and a controller.
Abstract: The demodulator has a plurality of matched filters in parallel. Each matched filter has a different binary PN code, a plurality of sample holders, a plurality of multipliers, an adder, and a controller. Each sample holder has a common input, a switch, a first capacitor, a first inverse amplifier with an output and an input connected to the common input through the switch and the capacitor, and a first feedback capacitor for feeding the output of the first inverse amplifier back to the input. Each multiplier has first and second sub-multiplexers, one selecting the corresponding sample holder output and the other selecting a reference voltage.

Proceedings ArticleDOI
02 Nov 1997
TL;DR: The adder cell is anatomized into smaller modules using the proposed structured approach to construct 24 different 1-bit full adder cells that exhibit different power consumption, speed, area, and driving capability figures.
Abstract: A performance analysis of a general 1-bit full adder cell is presented. The adder cell is anatomized into smaller modules using the proposed structured approach. The modules are studied extensively and several designs of each of them are shown. Connecting combinations of designs of these modules together we construct 24 different 1-bit full adder cells (some of them are novel circuits). Each of these cells exhibits different power consumption, speed, area, and driving capability figures. Some of the new cells outperform existing standard designs of the full adder cell.

Journal ArticleDOI
TL;DR: A new built-in self-test (BIST) methodology is presented in which all generation and compaction functions are executed by basic building blocks such as adders, ALU's, and multipliers, performing regular arithmetic functions in digital signal processing (DSP) cores.
Abstract: A new built-in self-test (BIST) methodology is presented in which all generation and compaction functions are executed by basic building blocks such as adders, ALU's, and multipliers, performing regular arithmetic functions in digital signal processing (DSP) cores. It is demonstrated how these components are themselves tested, and subsequently used to perform more complex testing functions. The need for extra hardware is either entirely eliminated or drastically reduced, test vectors can be easily distributed to different modules of the system, test responses can be collected in parallel, and there is virtually no performance degradation. As an integral part of the proposed BIST environment, arithmetic two-dimensional (2-D) generators of pseudorandom test vectors are also introduced to further integrate the scheme with parallel scan and boundary scan designs used to test peripheral devices of the core.

Journal ArticleDOI
TL;DR: Two novel iterative algorithms and their array structures for integer modular multiplication are presented; they are designed for Rivest-Shamir-Adleman (RSA) cryptography and are based on the familiar iterative Horner's rule, but use precalculated complements of the modulus.
Abstract: We present two novel iterative algorithms and their array structures for integer modular multiplication. The algorithms are designed for Rivest-Shamir-Adleman (RSA) cryptography and are based on the familiar iterative Horner's rule, but use precalculated complements of the modulus. The problem of deciding which multiples of the modulus to subtract in intermediate iteration stages has been simplified using simple look-up of precalculated complement numbers, thus allowing a finer-grain pipeline. Both algorithms use a carry save adder scheme with modular reduction performed on each intermediate partial product, which results in an output in carry-save format. Regularity and local connections make both algorithms suitable for high-performance array implementation in FPGA's or deep submicron VLSI. The processing nodes consist of just one or two full adders and a simple multiplexor. The stored complement numbers need to be precalculated only when the modulus is changed, thus not affecting the performance of the main computation. In both cases, there exists a bit-level systolic schedule, which means the array can be fully pipelined for high performance and can also easily be mapped to linear arrays for various space/time tradeoffs.
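
As background for the Horner's-rule iteration this abstract builds on, here is a minimal sketch of bit-serial modular multiplication with reduction after every step. It is the generic textbook form, not either of the paper's algorithms: it uses plain compare-and-subtract instead of carry-save arithmetic and precalculated complements, and the function name and parameters are assumptions.

```python
def modmul_horner(a, b, m, bits):
    """Compute (a * b) mod m by scanning the bits of a from the most significant end."""
    assert 0 <= a < (1 << bits) and 0 <= b < m
    p = 0
    for i in reversed(range(bits)):
        # Horner step: double the partial product, add b if the current bit of a is set.
        p = 2 * p + ((a >> i) & 1) * b
        # Here p < 3m, so at most two subtractions bring it back to 0 <= p < m.
        if p >= m:
            p -= m
        if p >= m:
            p -= m
    return p
```

The decision of how many multiples of the modulus to subtract at each step is the part the abstract says is simplified by looking up precalculated complement numbers.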

Patent
24 Jan 1997
TL;DR: In this article, a constant multiplication device is designed for multiplying a received binary multiplicand by a constant multiplier which, when expressed in binary or signed-digit notation, includes a repeated pattern with three or more nonzero values.
Abstract: A constant multiplication device is designed for multiplying a received binary multiplicand by a constant multiplier which, when expressed in binary or signed-digit notation, includes a repeated pattern with three or more non-zero values. The device includes a pattern-product term generator that receives the multiplicand and generates terms corresponding to each of the non-zero values of the pattern. If, when all instances of the pattern are subtracted from the multiplier there are non-zero values in the difference, the pattern-product term generator can also generate remainder-product terms. The pattern-product terms, but not the remainder-product terms, are input to a pattern compressor that yields pattern-product partials; the compressor can be a carry-save adder and the partials can be in the form of a pseudo sum and a pseudo carry. A replica generator generates shifted replicas of each pattern-product partial. The replicas are input to a replica compressor, as are any remainder-product terms. The replica compressor converts these inputs to final-product partials. The replica compressor can be a carry-save adder and the final-product partials can be a pseudo sum and a pseudo carry. These are input to a product ripple accumulator, which can be a carry-propagate adder, to yield the product of the multiplicand and the multiplier. Since there is only one ripple stage, the device provides for relatively high-speed multiplication for multipliers with repeated patterns.
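
The reuse of a repeated bit pattern described in this abstract can be checked numerically. The sketch below computes the pattern product once, replicates it by shifting, and adds any remainder terms; it uses ordinary integer addition in place of the carry-save compressors and the final carry-propagate adder, and every name in it is hypothetical.

```python
def shifted_terms(x, bits):
    """One shifted copy of x for every non-zero bit of `bits`."""
    return [x << i for i in range(bits.bit_length()) if (bits >> i) & 1]


def multiply_by_patterned_constant(x, pattern, shifts, remainder=0):
    """Multiply x by the constant sum(pattern << s for s in shifts) + remainder."""
    # Pattern-product terms are generated and compressed once.
    pattern_product = sum(shifted_terms(x, pattern))
    # Shifted replicas of the pattern product cover every instance of the pattern.
    replicas = [pattern_product << s for s in shifts]
    # Remainder-product terms cover multiplier bits outside the pattern instances.
    return sum(replicas) + sum(shifted_terms(x, remainder))
```

For example, the constant 0b1011011011 (731) contains the pattern 0b1011 at shifts 0 and 6 plus one remainder bit at position 4, so `multiply_by_patterned_constant(x, 0b1011, (0, 6), 0b10000)` equals `731 * x` for non-negative `x`.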

Patent
20 Aug 1997
TL;DR: In this article, a spread spectrum matched filter for use with a spread spectrum receiver is presented, comprising a first plurality of shift registers (131), a second plurality of shift registers (132), a control processor (138), a multiplexer (133), a plurality of data shift registers (134), a plurality of exclusive-OR (XOR) gates (135), an adder tree (136), a memory (137), and an adder (139).
Abstract: A spread spectrum matched filter for use with a spread spectrum receiver comprising a first plurality of shift registers (131), a second plurality of shift registers (132), a control processor (138), a multiplexer (133), a plurality of data shift registers (134), a plurality of exclusive-OR (XOR) gates (135), an adder tree (136), a memory (137), and an adder (139). The first plurality of shift registers (131) stores a first portion of a reference-chip-sequence signal and the second plurality of shift registers (132) stores a second portion of the reference-chip-sequence signal. The multiplexer (133), responsive to the control processor (138), outputs the first portion during a first clock cycle and the second portion during a second clock cycle. The plurality of XOR gates (135) multiply the first portion, during a first clock cycle, by a plurality of input-data-samples shifted through the data shift registers (134) to generate a first plurality of product-output signals. The plurality of XOR gates (135) multiply the second portion by a plurality of input-data-samples shifted through the data shift registers (134) to generate a second plurality of product-output-signals. The adder tree (136) adds the first plurality of product-output signals as a first sum which is stored in the memory (137). The adder tree (136) adds the second plurality of product-output signals as a second sum. The adder (139) adds the first and second sums.
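
The two-pass use of a shared XOR array and adder tree can be sketched as follows. This is a simplified model under assumptions: chips and samples are single bits, each half of the reference is applied to the matching half of the sample window, and the output counts mismatches, so 0 indicates a perfect match. The function names are hypothetical.

```python
def xor_and_sum(reference_bits, data_bits):
    """XOR gates followed by an adder tree: count positions where the bits differ."""
    return sum(r ^ d for r, d in zip(reference_bits, data_bits))


def matched_filter_output(reference, samples):
    """Process the reference in two portions, reusing the same XOR/adder-tree model."""
    half = len(reference) // 2
    # First clock cycle: first portion of the reference; the sum is stored in memory.
    first_sum = xor_and_sum(reference[:half], samples[:half])
    # Second clock cycle: second portion of the reference.
    second_sum = xor_and_sum(reference[half:], samples[half:])
    # The final adder combines the stored first sum with the second sum.
    return first_sum + second_sum
```

Time-multiplexing the reference this way lets one set of XOR gates and one adder tree serve a reference sequence twice their length.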

Patent
Tam-Anh Chu
11 Apr 1997
TL;DR: In this paper, a Booth multiplier for multiplying a first number with a second number to produce a product has an array of adder cells arranged in a plurality of rows of adder cells and is provided with input circuitry that reduces the power consumption of the multiplier.
Abstract: A Booth multiplier for multiplying a first number with a second number to produce a product has an array of adder cells arranged in a plurality of rows of adder cells and is provided with input circuitry that reduces the power consumption of the multiplier. This input circuitry includes a plurality of Booth recoding logic cells that provide the control signals to multiplexers in the adder cells in the array. The Booth recoding logic cells receive different subsets of bits of the second number and generate the Booth recoded control signals as a function of the received subset of bits. Each Booth recoding logic cell includes balanced logic circuitry for generating all of the Booth recoded control signals from that Booth recoding logic cell at the same time. The balanced logic circuitry minimizes temporary short-circuit paths in the multiplexers in the adder cells. The input circuitry also includes a split bus that provides the first number to the array. The split bus has a first branch that provides the first number unbuffered to the top row of the array, and a second branch having a buffer circuit and providing the first number buffered to the other rows of the array. The buffer circuit has low-power, low-speed buffers since the top row is able to receive the first number unbuffered, and the remaining rows in the array do not need to receive the first number until after the top row of adder cells completes its addition.
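
For readers unfamiliar with Booth recoding, the sketch below shows how overlapping subsets of multiplier bits map to recoded digits. It assumes radix-4 (modified Booth) recoding, which the abstract does not specify, and it models only the arithmetic, not the balanced logic or split-bus circuitry of the patent. The function names are hypothetical.

```python
# Digit for each 3-bit window (y[i+1], y[i], y[i-1]): -2*y[i+1] + y[i] + y[i-1].
BOOTH_TABLE = {0: 0, 1: 1, 2: 1, 3: 2, 4: -2, 5: -1, 6: -1, 7: 0}


def booth_radix4_digits(y, bits=8):
    """Recode a two's-complement multiplier of even width into digits in -2..2."""
    assert bits % 2 == 0
    y &= (1 << bits) - 1  # keep the low `bits` bits (two's-complement view)
    padded = y << 1       # implicit 0 below the least significant bit
    # Each recoding cell sees a different overlapping 3-bit subset of the multiplier.
    return [BOOTH_TABLE[(padded >> i) & 0b111] for i in range(0, bits, 2)]


def booth_multiply(x, y, bits=8):
    """Sum the partial products selected by the recoded digits."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, bits)))
```

Each digit selects 0, ±x, or ±2x for one row, which is the role the multiplexer control signals play in the adder cell array.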

Patent
29 Dec 1997
TL;DR: In this article, an interface circuit performs a last step of an arithmetic operation and a round operation in parallel, and includes a selection circuit connected to the outputs of the first and second adder circuits.
Abstract: An interface circuit performs a last step of an arithmetic operation and a round operation in parallel. The interface circuit includes a first adder circuit that receives as an input a true result of an arithmetic operation in an intermediate format. The first adder circuit outputs both the true result in a final format and a first representable number approximating the true result. A second adder circuit is connected in parallel to the first adder circuit. The second adder circuit receives the true result in the intermediate format and a 1 as inputs. The second adder circuit outputs a second representable number approximating the true result. The interface circuit also includes a selection circuit connected to the outputs of the first and second adder circuits. The selection circuit outputs either the first or second representable numbers as a rounded result of the arithmetic operation.
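
The idea of computing both rounding candidates at once and then selecting one can be sketched numerically. The model below rests on several assumptions that are not stated in the abstract: the intermediate format is a carry-save pair of non-negative integers, the final format drops `guard_bits` low-order bits, and the selection rule is round-to-nearest with ties to even. All names are hypothetical.

```python
def round_in_parallel(sum_word, carry_word, guard_bits=2):
    """Resolve a carry-save result and round it, using two independent additions."""
    assert guard_bits >= 1
    unit = 1 << guard_bits  # weight of the final format's least significant bit

    # First adder: true result, and the representable number at or below it.
    true_result = sum_word + carry_word
    lower = true_result >> guard_bits

    # Second adder: same inputs plus a 1 at the final LSB position.
    # It does not depend on the first adder's output, so both can run in parallel.
    upper = (sum_word + carry_word + unit) >> guard_bits

    # Selection circuit: round to nearest, ties to even.
    remainder = true_result & (unit - 1)
    half = unit >> 1
    if remainder > half or (remainder == half and lower & 1):
        return upper
    return lower
```

Because neither addition waits for the other, the rounding increment adds no extra carry-propagation delay after the final addition.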

Journal ArticleDOI
Guoqiang Li1, Liren Liu1, Lan Shao1, Yaozu Yin1, Jiawen Hua1 
TL;DR: The proposed algorithm and optical system are simple, reliable, and practicable, and they have the property of parallel processing of two-dimensional data.
Abstract: On the basis of signed-digit negabinary representation, parallel two-step addition and one-step subtraction can be performed for arbitrary-length negabinary operands. The arithmetic is realized by signed logic operations and optically implemented by spatial encoding and decoding techniques. The proposed algorithm and optical system are simple, reliable, and practicable, and they have the property of parallel processing of two-dimensional data. This leads to an efficient design for the optical arithmetic and logic unit.
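
For readers unfamiliar with the number system, the sketch below shows base -2 (negabinary) encoding and one way a subtraction can finish in a single carry-free step: subtracting digit by digit leaves each result digit in {-1, 0, 1}, which is already a valid signed-digit negabinary number. This is a general property of the representation, offered as background; it is not claimed to be the authors' exact logic rules, and the function names are hypothetical.

```python
def to_negabinary(n: int) -> list[int]:
    """Return the base -2 digits of n (each 0 or 1), least significant first."""
    if n == 0:
        return [0]
    digits = []
    while n != 0:
        n, r = divmod(n, -2)
        if r < 0:  # Python yields a remainder in {0, -1}; normalise to {0, 1}
            n += 1
            r += 2
        digits.append(r)
    return digits


def from_signed_digits(digits: list[int]) -> int:
    """Evaluate a digit list (digits in {-1, 0, 1}) with weights (-2)**i."""
    return sum(d * (-2) ** i for i, d in enumerate(digits))


def subtract_one_step(a: list[int], b: list[int]) -> list[int]:
    """Digit-wise difference of two negabinary numbers, with no carries.

    Every position is independent, so all digits can be formed in parallel.
    """
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]
```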

Patent
04 Feb 1997
TL;DR: In this article, a cluster of programmable modules forms a cluster (0109) and connections determine the X and Y locations, where a register (0105) stores the number of elements in the X direction and combines with an adder (0106) using connections.
Abstract: In one version of a data processing system, a cluster of programmable modules forms a cluster (0109). Connections (0101, 0102) determine the X and Y locations. A register (0105) stores the number of elements in the X direction and combines with an adder (0106) using connections (0101). A register (0103) and an adder (0104) perform the same operation for the Y axis.

Patent
Shousei Yoshida1, Akihisa Ushirokawa1
10 Mar 1997
TL;DR: In this paper, a code-orthogonalizing filter performs inverse spread using an orthogonalizing coefficient, which is obtained through a constraint condition process on a code multiplexed received signal as an input with a desired spread code waveform and is independent of transmission line variations.
Abstract: A code-orthogonalizing filter 101 performs inverse spread using an orthogonalizing coefficient, which is obtained through a constraint condition process on a code multiplexed received signal as an input with a desired spread code waveform and is independent of transmission line variations, thus detecting a desired wave at a constant power level while suppressing interference waves. A carrier tracking circuit 102 effects carrier phase synchronization of the detected desired wave. A symbol decision unit 103 decides the output of the carrier tracking circuit 102 to be the most certain symbol. An adder 104 extracts a symbol decision error signal from the outputs of the symbol decision unit 103 and the carrier tracking circuit 102. A tap coefficient control means 105 adaptively updates the tap coefficient according to the input to the code-orthogonalizing filter 101, a reproduced carrier outputted from the carrier tracking circuit 102, the symbol decision error signal outputted from the adder 104 and the desired wave spread code waveform.

Patent
17 Jan 1997
TL;DR: In this article, vector register circuitry is provided which includes a vector register file comprising at least one vector register having a plurality of elements, the vector register file further having a data port and an address port for accessing selected ones of the elements of the register.
Abstract: Vector register circuitry is provided which includes a vector register file comprising at least one vector register having a plurality of elements, the vector register file further having at least one data port and at least one address port for accessing selected ones of the elements of the vector register. Address generation circuitry is provided coupled to the address port and includes an adder having an output coupled to the address port, a first element register having an output coupled to a first input of the adder and an element counter having an output coupled to a second input of the adder.
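
The address generation path can be modelled in a few lines: the adder sums the first element register and the element counter, and the sum drives the address port. The sketch assumes the counter starts at zero and increments by one per access, which the abstract does not specify; `element_addresses` is a hypothetical name.

```python
from typing import Iterator


def element_addresses(first_element: int, count: int) -> Iterator[int]:
    """Yield the element addresses presented to the address port.

    Assumption: the element counter starts at 0 and steps by 1 per access.
    """
    for element_counter in range(count):
        # Adder: first input is the first element register,
        # second input is the element counter.
        yield first_element + element_counter
```

For example, `list(element_addresses(3, 4))` walks four consecutive elements starting at element 3.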