Showing papers on "Arithmetic logic unit published in 2011"


Proceedings ArticleDOI
04 Jul 2011
TL;DR: Two programmable reversible logic gate structures targeted at ALU implementation are designed, their use in realizing an efficient reversible ALU is demonstrated, and the advantages of the resulting ALU over the only existing ALU design are quantitatively analyzed.
Abstract: Reversible logic is widely considered a potential logic design style for implementation in modern nanotechnology and quantum computing with minimal impact on physical entropy. Recent advances in reversible logic allow for improved quantum computer algorithms and schemes for corresponding computer architectures. Significant contributions have been made in the literature towards the design of reversible logic gate structures and arithmetic units; however, few efforts have been directed towards the design of reversible ALUs. In this paper, we propose two programmable reversible logic gate structures targeted at ALU implementation and demonstrate their use in the realization of an efficient reversible ALU. The proposed ALU design is verified and its advantages over the only existing ALU design are quantitatively analyzed.

95 citations


Journal ArticleDOI
TL;DR: The design and functionality (low-speed) test results of the 8-bit ALU are presented and the target clock frequency of the ALU is 20 GHz.
Abstract: We have designed and demonstrated an Arithmetic-Logic Unit (ALU) based on RSFQ technology as a required step toward building an 8-bit RSFQ processor datapath. The circuit was designed and fabricated with HYPRES' standard 4.5 kA/cm² process. The target clock frequency of the ALU is 20 GHz. In this paper, we present the design and functionality (low-speed) test results of the 8-bit ALU.

63 citations


Proceedings ArticleDOI
08 Apr 2011
TL;DR: Two designs of a reversible Arithmetic Logic Unit (ALU) are presented, one based on a multiplexer unit and one based on control signals; the latter is found to be advantageous over the former in terms of the number of garbage outputs and constant inputs produced.
Abstract: A function is reversible if each input vector produces a unique output vector. Reversible logic is of growing importance to many future computer technologies. In this paper, the design of a reversible Arithmetic Logic Unit (ALU) is presented, making use of a multiplexer unit as well as control signals. The ALU is one of the most important components of a CPU and can be part of a programmable reversible computing device such as a quantum computer. In the multiplexer-based ALU, the operations are performed depending on the selection line. The control-unit-based ALU is developed with 9n elementary reversible gates for four basic arithmetic and logical operations on two n-bit operands. The series of operations is performed on the same line depending on control signals, instead of selecting the desired result with a multiplexer. The latter design is found to be advantageous over the former in terms of the number of garbage outputs and constant inputs produced.

52 citations
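
The reversibility definition quoted in the abstract above (each input vector produces a unique output vector) can be checked by brute force for small gates. The following is a minimal Python sketch, not the paper's ALU design: the Toffoli gate is used only as a familiar reversible gate, and the names `is_reversible` and `and_gate_padded` are hypothetical, introduced here for illustration.

```python
from itertools import product


def is_reversible(gate, n_bits):
    """Return True if `gate` maps every n-bit input vector to a distinct output vector."""
    inputs = list(product((0, 1), repeat=n_bits))
    outputs = {tuple(gate(*bits)) for bits in inputs}
    # A reversible gate is a bijection, so no two inputs may share an output.
    return len(outputs) == len(inputs)


def toffoli(a, b, c):
    # Controlled-controlled-NOT: flips c only when a and b are both 1.
    return a, b, c ^ (a & b)


def and_gate_padded(a, b, c):
    # Hypothetical irreversible example: the third output ignores c,
    # so inputs (a, b, 0) and (a, b, 1) collide.
    return a, b, a & b


print(is_reversible(toffoli, 3))          # True
print(is_reversible(and_gate_padded, 3))  # False
```

The exhaustive check grows as 2^n, so it is only practical for gate-level building blocks, not for a full n-bit ALU.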


Proceedings ArticleDOI
01 Aug 2011
TL;DR: This paper presents for the first time a reversible floating-point adder that closely follows the IEEE754 specification for binary floating-point arithmetic and analyzes its major components in terms of quantum cost, garbage outputs, and constant inputs.
Abstract: The study of reversible circuits holds great promise for emerging technologies. Reversible circuits offer the possibility for great reductions in power consumption, and quantum computers will require logically reversible digital circuits. Many different reversible implementations of logical and arithmetic units have been proposed in the literature, but very few reversible floating-point designs exist. Floating-point operations are needed very frequently in nearly all computing disciplines, and studies have shown floating-point addition to be the most oft used floating-point operation. In this paper we present for the first time a reversible floating-point adder that closely follows the IEEE754 specification for binary floating-point arithmetic. Our design requires reversible designs of a controlled swap unit, a subtracter, an alignment unit, signed integer representation conversion units, an integer adder, a normalization unit, and a rounding unit. We analyze these major components in terms of quantum cost, garbage outputs, and constant inputs.

37 citations


Proceedings ArticleDOI
01 Dec 2011
TL;DR: High-speed adder circuits are constructed using a Hardware Description Language (HDL) in Xilinx ISE 9.2i and implemented on Field Programmable Gate Arrays (FPGAs) to analyze the design parameters; the paper concludes that the carry-save adder is the most efficient in speed and area consumption.
Abstract: This paper primarily deals with the construction of high-speed adder circuits using a Hardware Description Language (HDL) in Xilinx ISE 9.2i and their implementation on Field Programmable Gate Arrays (FPGAs) to analyze the design parameters. The motivation behind this investigation is that an adder is a very basic building block of the Arithmetic Logic Unit (ALU) and would be a limiting factor in the performance of a Central Processing Unit (CPU). In the past, thorough examination of the algorithms with respect to a particular technology has only been partially done. The merit of a new technology is to be evaluated by its ability to efficiently implement the computational algorithms. In other words, the technology is developed with the aim of efficiently serving the computation. The reverse path, evaluating the merit of the algorithms, should also be taken. Therefore, it is important to develop computational structures that fit well into the execution model of the processor and are optimized for the current technology. In such a case, optimization of the algorithms is performed globally across the critical path of their implementation. In this research article, we have simulated and synthesized various adders, namely the full adder, ripple-carry adder, carry-lookahead adder, carry-skip adder, carry-select adder and carry-save adder, using VHDL and Xilinx ISE 9.2i. The simulated results are verified, and the functionality of the high-speed adders and parameters such as area and speed are analyzed. Finally, this paper concludes that the carry-save adder is the most efficient in speed and area consumption.

35 citations
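
Of the adders listed in the abstract above, the ripple-carry adder is the simplest to model. The paper works in VHDL on FPGAs; the block below is only a bit-level behavioral sketch in Python, with hypothetical function names and an assumed default width of 8 bits, to show how the carry propagates from one full adder to the next.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(x, y, width=8):
    """Add two unsigned `width`-bit integers by chaining full adders.

    Returns (sum modulo 2**width, final carry-out bit).
    """
    carry = 0
    result = 0
    for i in range(width):
        # Each stage must wait for the carry of the previous stage,
        # which is why the delay grows linearly with the width.
        sum_bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= sum_bit << i
    return result, carry


print(ripple_carry_add(200, 100))  # (44, 1): 300 = 256 + 44
```

The sequential dependence on `carry` inside the loop is the critical path that the faster adder families in the abstract are designed to shorten.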


Patent
18 Jul 2011
Abstract: One embodiment of the present invention relates to a heap overflow detection system that includes an arithmetic logic unit, a datapath, and address violation detection logic. The arithmetic logic unit is configured to receive an instruction having an opcode and an operand and to generate a final address and to generate a compare signal on the opcode indicating a heap memory access related instruction. The datapath is configured to provide the opcode and the operand to the arithmetic logic unit. The address violation detection logic determines whether a heap memory access is a violation according to the operand and the final address on receiving the compare signal from the arithmetic logic unit.

33 citations


Proceedings Article
Guan, Li, Ding, Hang, Ni 
01 Jan 2011

28 citations


Journal ArticleDOI
TL;DR: The cell-level design of a 32-bit RSFQ dual-lane integer processor has been developed at Stony Brook University in an effort to identify and study techniques capable of tolerating significant delay variations in future wide-datapath superconductor processor circuits.
Abstract: Development of an efficient processor architecture with appropriate clocking mechanisms and datapath organization is one of the most challenging design issues for 32-/64-bit RSFQ processors. The cell-level design of a 32-bit RSFQ dual-lane integer processor has been developed at Stony Brook University in an effort to identify and study techniques capable of tolerating significant delay variations in future wide-datapath superconductor processor circuits. Several key processor blocks have been designed and quantitatively evaluated at the cell level, specifically: an instruction buffer, an instruction decoder, a multi-ported register file, a wave-pipelined arithmetic-logic unit, and an intra-processor data routing interconnect. Simulation and analysis of these blocks have been done using a generic VHDL cell library developed at Stony Brook University with cell parameters tuned to Hypres' 1.5 μm, 4.5 kA/cm² process. After assembling these blocks into a 32-bit processor datapath, an iterative approach has been used to optimize the design and reach a 20 GHz processing rate. Overall, the datapath has a total latency of ~972 ps, with the design complexity exceeding 50 K Josephson junctions.

25 citations


Proceedings ArticleDOI
21 Jul 2011
TL;DR: This paper proposes employing parallel prefix adders (fast adders) at the final stage of Wallace multipliers to reduce the delay.
Abstract: The Arithmetic & Logic Unit (ALU) of a processor, when used for scientific computations, spends most of its time in multiplications. Wallace multipliers operate in parallel, resulting in high speed. They use full adders and half adders in their reduction phase. The Reduced Complexity Wallace multiplier has fewer adders than the normal Wallace multiplier. In both multipliers, a carry-propagating adder is used at the final stage, which contributes to delay. This paper proposes employing parallel prefix adders (fast adders) at the final stage of Wallace multipliers to reduce the delay.

21 citations
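
The abstract above does not say which parallel prefix adder is used at the final stage. As an illustration only, the sketch below models a Kogge-Stone adder, one well-known parallel prefix scheme, in Python; the choice of Kogge-Stone, the function name and the default width are assumptions, not details from the paper.

```python
def kogge_stone_add(x, y, width=8):
    """Behavioral model of a Kogge-Stone parallel prefix adder.

    Returns (sum modulo 2**width, carry-out bit) for unsigned inputs.
    """
    mask = (1 << width) - 1
    generate = x & y & mask          # bit i generates a carry
    propagate = (x ^ y) & mask       # bit i propagates an incoming carry
    half_sum = propagate             # kept for the final sum computation

    distance = 1
    while distance < width:
        # Combine each position with the group `distance` bits below it.
        # All positions update at once, so only log2(width) levels are needed.
        generate |= propagate & (generate << distance)
        propagate &= propagate << distance
        distance <<= 1

    # After the prefix tree, generate[i] is the carry out of bit i.
    carries_in = generate << 1
    total = (half_sum ^ carries_in) & mask
    carry_out = (generate >> (width - 1)) & 1
    return total, carry_out


print(kogge_stone_add(15, 1, width=4))  # (0, 1)
```

Compared with a carry-propagating adder, the loop here runs a logarithmic number of levels rather than one step per bit, which is the source of the delay reduction the abstract refers to.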


Proceedings ArticleDOI
03 Jun 2011
TL;DR: A VHDL environment for pipelined floating-point arithmetic and logic unit design is introduced; pipelining provides a high-performance ALU that can execute multiple instructions simultaneously.
Abstract: A VHDL environment for floating-point arithmetic and logic unit design using pipelining is introduced; the novelty of the pipelined ALU design is that it provides a high-performance ALU able to execute multiple instructions simultaneously. In a top-down design approach, four arithmetic modules (addition, subtraction, multiplication and division) are combined to form a floating-point ALU. Each module is divided into sub-modules, and two selection bits are combined to select a particular operation. The modules are independent of each other. The modules are realized and validated using VHDL simulation in the Xilinx 12.1i software.

18 citations


Book ChapterDOI
10 Mar 2011
TL;DR: The simulated results revealed better performance characteristics for various logic and arithmetic functions of a 1-bit ALU using the GDI technique compared to conventional CMOS and nMOS PTL techniques.
Abstract: The paper presents a low-power, high-speed Arithmetic Logic Unit (ALU) in 45 nm technology using the Gate Diffusion Input (GDI) technique and its performance comparison with CMOS and nMOS Pass Transistor Logic (PTL) techniques. The simulated results revealed better performance characteristics for various logic and arithmetic functions of a 1-bit ALU using the GDI technique compared to conventional CMOS and nMOS PTL techniques. The GDI technique allows power dissipation and delay to be reduced while maintaining low complexity of logic design. MICROWIND and DSCH 3.1 EDA tools were used for the schematic layout and simulation of the ALU using the BSIM4 model.

Proceedings Article
16 Jun 2011
TL;DR: A prototype of an 8-bit RISC microcontroller, called OctaLynx, with a 16-bit address bus, which consists of the core and some peripherals and has been implemented in a Microsemi (i.e. Actel) IGLOO nano AGL250 device.
Abstract: The paper presents a prototype of an 8-bit RISC microcontroller, called OctaLynx, with a 16-bit address bus. It consists of the core (with instruction decoder, arithmetic logic unit and interrupt control unit) and some peripherals (three 8-bit general-purpose input-output ports, timers/counters, USART, SPI). The prototype uses external memory. The OctaLynx behavior is described by means of the Verilog hardware description language. It has been implemented in a Microsemi (i.e. Actel) IGLOO nano AGL250 device. Some tests were carried out

Proceedings ArticleDOI
20 Jul 2011
TL;DR: This paper analyses the use of an ancient (or Vedic) mathematical approach for building an ALU and designs 4x4 multipliers based on the Vedic and conventional methods using a SPICE simulator.
Abstract: VLSI design techniques are the key to re-engineering digital gadgets of any kind that need to operate at lower power to ensure a longer backup time. Power reduction in the Arithmetic Logic Unit (ALU) is needed to meet this requirement. Multipliers and adders are the most important structures and use a large fraction of the power in such arithmetic units. This paper analyses the use of an ancient (or Vedic) mathematical approach for building an ALU. The low-power operation of the circuit is validated by designing a conventional CMOS counterpart whose power is compared with that of our ancient arithmetic design. 4x4 multipliers based on the Vedic and conventional methods have been designed using a SPICE simulator. Simulation results show that the Vedic design reduces average power by 29%.

Patent
Xiaomin Lu
22 Feb 2011
TL;DR: In this article, a timestamp value is divided by a stride value using a plurality of binary-shift operations corresponding to a Taylor expansion series of the reciprocal stride value in a base of ½.
Abstract: In one embodiment of a header-compression method, a timestamp value is divided by a stride value using a plurality of binary-shift operations corresponding to a Taylor expansion series of the reciprocal stride value in a base of ½. When the division-logic circuitry of an arithmetic logic unit in the corresponding communication device is not designed to handle operands that can accommodate the length of the timestamp and/or stride values, the header-compression method can advantageously be used to improve the speed and efficiency of timestamp compression in communication devices.
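
One way to read the shift-based division in the abstract above is that 1/stride is written as a sum of powers of ½, so each term becomes a right shift of the timestamp. The Python sketch below follows that reading only; it is not the patented method. The function names, the 32-term truncation, the guard bits and the final correction loop are all assumptions added for illustration.

```python
def reciprocal_shifts(stride, terms=32):
    """Return shift amounts k such that 1/stride is approximately sum(2**-k).

    This is the binary expansion of 1/stride, truncated after `terms` bits.
    """
    shifts = []
    remainder = 1  # numerator of the part of 1/stride not yet represented
    for k in range(1, terms + 1):
        remainder *= 2
        if remainder >= stride:
            shifts.append(k)
            remainder -= stride
    return shifts


def divide_by_shifts(timestamp, stride, terms=32, guard_bits=32):
    """Approximate timestamp // stride using only shifts and additions,
    then correct the small truncation error (the correction is an assumption)."""
    scaled = timestamp << guard_bits  # keep fractional bits while accumulating
    accumulator = 0
    for k in reciprocal_shifts(stride, terms):
        accumulator += scaled >> k
    quotient = accumulator >> guard_bits
    # Truncation can only make the estimate too small, so nudge it upward.
    while (quotient + 1) * stride <= timestamp:
        quotient += 1
    return quotient


print(divide_by_shifts(48000, 160))  # 300
```

For a fixed stride, the list returned by `reciprocal_shifts` would be computed once and reused for every timestamp, so the per-packet work reduces to a handful of shifts and additions.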

Patent
28 Oct 2011
TL;DR: A data encoding apparatus for verifying data integrity by using a white box cipher includes an encoding unit for encoding content and an arithmetic logic unit for performing arithmetic logic operation on the white-box cipher table.
Abstract: A data encoding apparatus for verifying data integrity by using a white box cipher includes: an encoding unit for encoding content by using a white box cipher table; and an arithmetic logic unit for performing an arithmetic logic operation on the white box cipher table and content information to output an encoded white box cipher table. The arithmetic logic operation is an exclusive OR operation. The content information is license information of the content or hash value of the license information of the content.
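
The exclusive OR step described in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented apparatus: the table here is placeholder bytes rather than a real white-box cipher table, SHA-256 is an assumed choice of hash, and repeating the content information cyclically to cover the table length is a detail the abstract does not specify.

```python
import hashlib
from itertools import cycle


def xor_encode(table: bytes, content_info: bytes) -> bytes:
    """XOR every table byte with the content information.

    Assumption: content_info is non-empty and is repeated cyclically
    to match the table length.
    """
    return bytes(t ^ k for t, k in zip(table, cycle(content_info)))


# Placeholder stand-ins for a white-box cipher table and license information.
table = bytes(range(256))
license_info = b"example-license"
content_info = hashlib.sha256(license_info).digest()

encoded_table = xor_encode(table, content_info)

# XOR is its own inverse, so applying the same content information restores the table.
restored = xor_encode(encoded_table, content_info)
print(restored == table)  # True
```

Because XOR is self-inverse, a table encoded with one piece of content information decodes correctly only when the same information is supplied again, which is what ties the table to the content's license.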

Proceedings ArticleDOI
06 Jun 2011
TL;DR: This paper proposes a Tabu Search-based approach for mapping an application to the reconfigurable architecture such that performance is maximized; the performance loss is reduced by 16% for the best spare-cell placement strategy.
Abstract: In this paper we are interested in the mapping of embedded applications on a dynamically reconfigurable self-healing hardware architecture known as the eDNA (electronic DNA) architecture. The architecture consists of an array of cells interconnected through a 2D-mesh topology. Each cell consists of a processor and an Arithmetic Logic Unit (ALU). Applications are modeled as task graphs. We propose a Tabu Search-based approach for the mapping of an application to the reconfigurable architecture, such that the performance is maximized. When faults occur, the self-healing mechanism moves the affected functionality to spare cells. We optimize the number and placement of spare cells such that the performance overhead is minimized in the fault-free scenario and the application degrades gracefully in case of faults. This has been done using three different spare-cell placement strategies. We use Monte Carlo simulation to determine the average performance overhead increase due to fault occurrences. The approach has been evaluated using a large number of benchmarks, which show that the performance loss is reduced by 16% for the best spare-cell placement strategy.

Patent
02 Sep 2011
TL;DR: In this paper, a method and apparatus for performing floating-point division using input check/output correction floating point division logic and fix-up instruction (e.g., an instruction, command, signal or other indicator) is presented.
Abstract: A method and apparatus provides for performing floating-point division using input check/output correction floating-point division logic and a floating-point division fix-up instruction (e.g., an instruction, command, signal or other indicator). In one example, the apparatus includes a processor having a floating-point arithmetic logic unit (ALU) that includes the input check/output correction floating-point division logic. The input check/output correction floating-point division logic is responsive to the floating-point division fix-up instruction executable by the floating-point ALU that causes the input check/output correction floating-point division logic to examine a first input representing a numerator and a second input representing a denominator to determine whether a special case of floating-point division occurs. The floating-point division fix-up instruction also causes the input check/output correction floating-point division logic to provide an output representing a floating-point division result based on the determined special case of floating-point division and a third input representing a candidate quotient.

Proceedings ArticleDOI
01 Nov 2011
TL;DR: A FinFET-based Arithmetic Logic Unit (ALU) is developed, which acts as the core part of a CPU, with arithmetic functions such as addition and subtraction and logical functions such as AND and OR.
Abstract: In VLSI technology, the continuous scaling down of transistors has followed Moore's law, which states that the number of transistors on a chip doubles every two years. As MOS transistor sizes scale down, challenges and limitations such as short-channel effects and sub-threshold voltage variations grow with each shrink. FinFET-based transistors help reduce these challenges and allow transistor scaling to continue in nanoelectronics. In this paper, we describe a FinFET-based Arithmetic Logic Unit (ALU), which acts as the core part of a CPU, with arithmetic functions such as addition and subtraction and logical functions such as AND and OR.

Proceedings ArticleDOI
31 Aug 2011
TL;DR: An arithmetic unit based on the quaternary signed digit (QSD) number system is developed using VHDL and implemented on an FPGA device, and the results are compared with a conventional arithmetic unit.
Abstract: Arithmetic operations in digital signal processing applications suffer from problems including propagation delay and circuit complexity. QSD number representation allows fast addition/subtraction because the carry propagation chains are eliminated, which reduces the propagation time in comparison with the common radix-2 system. Here we propose an arithmetic unit based on the QSD number system, which is built on the quaternary system. The proposed design is developed using VHDL and implemented on an FPGA device, and the results are compared with a conventional arithmetic unit. The implementation of quaternary addition and multiplication results in a fixed delay independent of the number of digits. Operations on a large number of digits, such as 64, 128, or more, can be implemented with constant delay and less complexity.

Proceedings ArticleDOI
31 Aug 2011
TL;DR: A novel asynchronous bidirectional Arithmetic Logic Unit is compared with the ALU design proposed for reversible quantum computers in the CMOS context, showing a logic efficiency of around 30% in area for the proposed design.
Abstract: A novel asynchronous bidirectional Arithmetic Logic Unit (ALU) is introduced in this paper. The adder in the proposed design is a ripple-carry adder with the bidirectional characteristic. The ALU is designed in an asynchronous dual-rail circuit style. Several ALUs with sizes ranging from 4 bits to 32 bits were built. Their power and performance metrics were compared with conventional ALUs built with fast adders designed in a dynamic logic style. Significant power reduction at sub-threshold operating voltage is achieved. The design is also compared with the ALU design proposed for reversible quantum computers in the CMOS context to show the logic efficiency of the proposed design, around 30% in area. Power reductions of 9-26% for the addition operation and 19.5-75.1% for the logical operation were achieved on the proposed 32-bit ALU, compared to the conventional dynamic-logic-based ALU operated over the voltage range 0.2-0.3 V.

Patent
Peter Gentle, Scott Pitkethly
28 Sep 2011
TL;DR: In this paper, a microprocessor includes fetch logic for retrieving an instruction, decode logic for identifying an arithmetic operation specified in the instruction, and execution logic configured to receive operands specified by the instruction.
Abstract: In one embodiment, a microprocessor includes fetch logic for retrieving an instruction, decode logic configured to identify an arithmetic operation specified in the instruction, and execution logic configured to receive operands specified by the instruction. The execution logic includes a primary logic path configured to perform the arithmetic operation on such operands and a secondary parallel logic path configured to output metadata associated with the result of the arithmetic operation.

01 Jan 2011
TL;DR: The design of an Arithmetic Logic Unit (ALU) based on Redundant Binary signed Digit (RBSD) Number System is presented and its RTL view is generated by its FPGA implementation.
Abstract: In this paper we present the design of an Arithmetic Logic Unit (ALU) based on the Redundant Binary Signed Digit (RBSD) number system. A redundant binary representation is a numeral system that uses more bits than needed to represent a single binary digit, because of which most numbers have several representations. This feature of the RBSD number system allows addition without using a typical carry. The RBSD ALU is designed using VHDL and its RTL view is generated by its FPGA implementation. The FPGA implementation is done in the Xilinx ISE environment. Keywords: RBSD, FPGA, RTL, Xilinx, VHDL.
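
The statement above that most numbers have several redundant binary representations can be seen by enumeration. The sketch below is a small Python illustration, assuming the usual signed-digit set {-1, 0, 1} with radix 2; the function names are hypothetical and the code does not model the paper's VHDL design.

```python
from itertools import product


def rbsd_value(digits):
    """Value of a signed-digit string, most significant digit first."""
    value = 0
    for digit in digits:
        value = 2 * value + digit
    return value


def rbsd_representations(target, n_digits):
    """List every n-digit string over {-1, 0, 1} whose value equals `target`."""
    return [
        digits
        for digits in product((-1, 0, 1), repeat=n_digits)
        if rbsd_value(digits) == target
    ]


# Three different 3-digit encodings all represent the value 3:
# (0, 1, 1) = 2 + 1, (1, -1, 1) = 4 - 2 + 1, (1, 0, -1) = 4 - 1.
print(rbsd_representations(3, 3))
```

This freedom to choose among encodings is what lets a signed-digit adder pick intermediate digits that absorb incoming transfers locally instead of rippling a carry across the word.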

Proceedings ArticleDOI
19 Sep 2011
TL;DR: This paper investigates the performance of parallel conformal FDTD method on the Intel and AMD processors accelerated by the Vector Arithmetic Logic Unit (VALU), high performance cluster, and Graphics Processing Unit (GPU).
Abstract: In this paper, we introduce a novel hardware acceleration technique based on a vector unit built into a regular CPU for high-performance electromagnetic simulation tools. We investigate the performance of the parallel conformal FDTD method on Intel and AMD processors accelerated by the Vector Arithmetic Logic Unit (VALU), a high-performance cluster, and a Graphics Processing Unit (GPU). The FDTD method, which is parallel in nature, is one of the most popular numerical methods for simulating various electromagnetic problems and phenomena. Several examples are employed to demonstrate the engineering applications of the parallel conformal FDTD method based on VALU acceleration.

Patent
04 Jul 2011
TL;DR: In this article, a shift device is used to extend the width of a rotator without increasing propagation delays by combining a rotation result with a shift result in accordance with a mask that is selected in response to at least a portion of the value of the degree to which a data word is to be shifted.
Abstract: A processor includes a shift device for extending the width of a rotator without increasing propagation delays. An extended-width result is obtained by combining a rotation result with a shift result in accordance with a mask that is selected in response to at least a portion of the value of the degree to which a data word is to be shifted.

Patent
27 Apr 2011
TL;DR: In this article, a fault-tolerant processor includes two basic devices: a master node and an operating node, which includes an operation code decoder, a clock pulse generator, a control unit, an instruction counter, an address register and a correlation unit.
Abstract: FIELD: information technology. SUBSTANCE: fault-tolerant processor includes two basic devices: a master node and an operating node. The master node includes an operation code decoder, a clock pulse generator, a control unit, an instruction counter, an address register and a correlation unit. The operating node includes a shift counter, a number register, an accumulator register, an extra register, an extra code register, an adder and a control unit. The technical result is achieved by including a correlation unit for detecting and correcting errors of the processor control memory, as well as by including a control unit which enables detection and correction of errors of the arithmetic logic unit when performing arithmetic and logic operations. EFFECT: high failure-tolerance of the processor due to detection and correction of errors. 8 cl, 8 dwg

Proceedings ArticleDOI
13 Nov 2011
TL;DR: In this paper, an all-optical AND logic gate based on planar photonic crystal (PPC) waveguides is proposed, with the AND logic function predicted using 2D finite-difference time-domain simulations.
Abstract: The all-optical logic gate based on planar photonic crystal (PPC) waveguides is a promising technology for stimulating great developments in high-speed and high-capacity Cloud Computing Technologies (CCT) systems in the future. We design an all-optical AND gate in PPC as an ultracompact component for planar lightwave circuit integration; with a suitable choice of parameters, it realizes the AND logic function, as predicted using 2D finite-difference time-domain simulations. The PPC waveguides are composed of circular dielectric rods set in a two-dimensional triangular lattice. The combination of the ring and line-defect coupler waveguides forms the device. On the basis of our simulations, we found that the optimized scheme achieves power transmission above 80% at a wavelength of 1.55 μm. In addition, this device can be applied to an all-optical Arithmetic Logic Unit (ALU) in all-optical computing and is potentially applicable to photonic integrated circuits (PICs) in the future.

Patent
19 Oct 2011
TL;DR: In this paper, a 4-bit RISC (Reduced Instruction-Set Computer) microcontroller with a control module, a program memory, a register file, a reset module, a clock module and at least one peripheral function module is presented.
Abstract: The invention provides a 4-bit RISC (Reduced Instruction-Set Computer) microcontroller comprising a control module, a program memory, a register file, a reset module, a clock module and at least one peripheral function module. The control module comprises an instruction register, an instruction decoder, a stack, an ALU (Arithmetic Logic Unit) and a program counter. A two-stage, two-phase pipeline architecture is adopted, in which each instruction cycle is divided into a first phase and a second phase. In the first phase, the control module completes the stack push, program memory read, register read, instruction decode and ALU operations; in the second phase, the control module completes the stack pop, instruction register latch, register write and program counter update operations. The invention further provides an intelligent toy comprising a control chip. The control chip has improved voice playing effect and lower production cost.

Book ChapterDOI
10 Mar 2011
TL;DR: An improved and efficient adder circuit called the mirror adder is used, which helps decrease the RC delay, and a programmable logic circuit is included to configure the mirror adder as a subtractor depending on a programmable input, reducing parasitic capacitance and hence increasing speed.
Abstract: The Arithmetic and Logic Unit (ALU) is a combinational circuit that performs a number of arithmetic and logical operations. Over the past two decades, Complementary Metal Oxide Semiconductor (CMOS) technology has played an important role in designing high-performance systems because of the advantages that CMOS provides: an exceptionally low power-delay product and the ability to accommodate millions of devices on a single chip. To take advantage of CMOS technology, a novel ALU circuit is proposed in this paper. An improved and efficient adder circuit called the mirror adder [3] is used, which helps decrease the RC delay. A programmable logic circuit is also included to configure the mirror adder circuit as a subtractor circuit depending on a programmable input; this implementation helps reduce the transistor count and power dissipation and decreases parasitic capacitance, hence increasing speed.

Patent
09 Dec 2011
TL;DR: In this paper, a processor (100) has a process management unit implemented in hardware, where process management includes activation and suspension of processes and execution of processes according to their priority.
Abstract: The processor (100) has a process management unit implemented in hardware, where process management includes activation and suspension of processes and execution of processes according to their priority. A program memory (101) stores instructions of programs, a data memory (102) stores data manipulated by the programs and an arithmetic logic unit (103) performs arithmetic and logical operations. A controller (104) decodes the instructions and operates the arithmetic logic unit according to the decoded instructions, and the process management unit is integrated into the controller.

Patent
13 Sep 2011
TL;DR: In this paper, the authors proposed a biological signal measurement device consisting of a measurement section for measuring the biological signal and an arithmetic processing section for arithmetically processing the measured biological signal.
Abstract: PROBLEM TO BE SOLVED: To provide a biological signal measurement device capable of decreasing an entire operation amount. SOLUTION: The biological signal measurement device includes: a biological signal measurement section for measuring a biological signal; and an arithmetic processing section for arithmetically processing a measured biological signal. The arithmetic processing section has a first independently-controllable arithmetic processing part for executing an arithmetic processing required for calculation of the biological signal, and a second independently-controllable arithmetic processing part for executing a specific arithmetic processing. When the first arithmetic processing part meets a predetermined condition, it can make the second arithmetic processing part execute the specific arithmetic processing.