Journal Article

Binary-coded decimal digit multipliers

23 Jul 2007 - IET Computers and Digital Techniques (IET) - Vol. 1, Iss. 4, pp. 377-381
TL;DR: A novel design is provided for the BCD-digit multiplier, which can serve as the key building block of a decimal multiplier, irrespective of the degree of parallelism, in semi- and fully parallel hardware decimal multiplication units.
Abstract: With the growing popularity of decimal computer arithmetic in scientific, commercial, financial and Internet-based applications, hardware realisation of decimal arithmetic algorithms is gaining more importance. Hardware decimal arithmetic units now serve as an integral part of some recently commercialised general purpose processors, where complex decimal arithmetic operations, such as multiplication, have been realised by rather slow iterative hardware algorithms. However, with the rapid advances in very large scale integration (VLSI) technology, semi- and fully parallel hardware decimal multiplication units are expected to evolve soon. The dominant representation for decimal digits is the binary-coded decimal (BCD) encoding. The BCD-digit multiplier can serve as the key building block of a decimal multiplier, irrespective of the degree of parallelism. A BCD-digit multiplier produces a two-BCD digit product from two input BCD digits. We provide a novel design for the latter, showing some advantages in BCD multiplier implementations.
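The input/output contract described in the abstract (two BCD digits in, a two-digit BCD product out) can be illustrated with a short behavioural model. This is only a sketch of the function the circuit computes, not the gate-level design proposed in the paper; the name `bcd_digit_multiply` and the packed-byte output layout (tens digit in the high nibble) are assumptions made for illustration.

```python
def bcd_digit_multiply(a: int, b: int) -> int:
    """Multiply two BCD digits and return a packed two-digit BCD byte.

    Behavioural model only (hypothetical helper): the tens digit is
    placed in the high nibble and the units digit in the low nibble.
    """
    if not (0 <= a <= 9 and 0 <= b <= 9):
        raise ValueError("inputs must be valid BCD digits (0-9)")
    product = a * b                     # plain binary product, range 0..81
    tens, units = divmod(product, 10)   # split into two decimal digits
    return (tens << 4) | units          # eight output bits, two BCD digits


print(hex(bcd_digit_multiply(9, 9)))  # 0x81
print(hex(bcd_digit_multiply(7, 6)))  # 0x42
```

A hardware implementation would replace the `divmod` step with combinational logic; the model is useful mainly as a reference for checking such a circuit.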
Citations
Journal Article
TL;DR: To improve the speed of parallel decimal multiplication, the authors present a new PPG method and fine-tune the PPR method of one of the full solutions and the final addition scheme of the other, thus assembling a new full solution.
Abstract: Hardware support for decimal computer arithmetic is regaining popularity. One reason is the recent growth of decimal computations in commercial, scientific, financial, and Internet-based computer applications. Newly commercialized decimal arithmetic hardware units use radix-10 sequential multipliers that are rather slow for multiplication-intensive applications. Therefore, the future relevant processors are likely to host fast parallel decimal multiplication circuits. The corresponding hardware algorithms are normally composed of three steps: partial product generation (PPG), partial product reduction (PPR), and final carry-propagating addition. The state of the art is represented by two recent full solutions with alternative designs for all the three aforementioned steps. In addition, PPR by itself has been the focus of other recent studies. In this paper, we examine both of the full solutions and the impact of a PPR-only design on the appropriate one. In order to improve the speed of parallel decimal multiplication, we present a new PPG method, fine-tune the PPR method of one of the full solutions and the final addition scheme of the other; thus, assembling a new full solution. Logical Effort analysis and 0.13 μm synthesis show at least 13 percent speed advantage, but at a cost of at most 36 percent additional area consumption.

70 citations


Cites background from "Binary-coded decimal digit multipli..."

  • ...One could think of BCD multipliers producing all the partial products in parallel by a matrix of BCD digit multipliers [22], or through selection of precomputed multiples [23], [11], and [18]....

    [...]

  • ...Decimal multiplication, as in binary, may be accomplished sequentially, in parallel, or by a semiparallel approach [22]....

    [...]

  • ...The required simple combinational logic can be found in [22]....

    [...]

  • ...BCD digit multipliers: Fully combinational delay-optimized and area-optimized BCD digit multipliers, with eight input bits and eight output bits, are offered in [22]....

    [...]

Proceedings Article
26 Apr 2010
TL;DR: This paper presents novel high speed low power architecture for fixed bit binary to BCD conversion which is at least 28% better in terms of power-delay product than the existing designs.
Abstract: Decimal data processing applications have grown exponentially in recent years thereby increasing the need to have hardware support for decimal arithmetic. Binary to BCD conversion forms the basic building block of decimal digit multipliers. This paper presents novel high speed low power architecture for fixed bit binary to BCD conversion which is at least 28% better in terms of power-delay product than the existing designs.

39 citations


Cites background or methods or result from "Binary-coded decimal digit multipli..."

  • ...In this paper we introduce a new architecture for binary to BCD conversion of partial products which forms the core of decimal multiplication algorithms such as [7] and [8]....

    [...]

  • ...Recently, a series of BCD multipliers have been proposed [6, 7, 8] which use fixed bit binary to BCD conversion....

    [...]

  • ...The current state of art conversion scheme [7] is studied and irregularities in the implementation of their converter have been discussed....

    [...]

  • ...As the proposed implementation in [7] is misinterpreted and logically incorrect, one straightforward architecture based on the underlying principle is shown in Figure 4. This architecture has been logically verified in Verilog HDL....

    [...]

  • ...Binary Number to be converted | BCD value from [7]’s circuit | Actual BCD value...

    [...]

Proceedings Article
01 Dec 2008
TL;DR: A novel design for single digit decimal multiplication that reduces the critical path delay and area is proposed in this research and leads to more regular VLSI implementation, and does not require special registers for storing easy multiples.
Abstract: Decimal multiplication is an integral part of financial, commercial, and internet-based computations. The basic building block of a decimal multiplier is a single digit multiplier. It accepts two Binary Coded Decimal (BCD) inputs and gives a product in the range [0, 81] represented by two BCD digits. A novel design for single digit decimal multiplication that reduces the critical path delay and area is proposed in this research. Out of the possible 256 combinations for the 8-bit input, only hundred combinations are valid BCD inputs. In the hundred valid combinations only four combinations require 4 × 4 multiplication, 64 combinations need 3 × 3 multiplication, and the remaining 32 combinations use either 3 × 4 or 4 × 3 multiplication. The proposed design makes use of this property. This design leads to more regular VLSI implementation, and does not require special registers for storing easy multiples. This is a fully parallel multiplier utilizing only combinational logic, and is extended to a Hex/Decimal multiplier that gives either a decimal output or a binary output. The accumulation of partial products generated using single digit multipliers is done by an array of multi-operand BCD adders for an (n-digit × n-digit) multiplication.
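The operand-width breakdown quoted in this abstract (4, 64, and 32 of the 100 valid input pairs) can be reproduced by a brute-force count. The sketch below assumes that a digit "needs" three bits when it is in the range 0 to 7 and four bits when it is 8 or 9; the helper names are hypothetical and are not taken from the cited paper.

```python
from collections import Counter


def bits_needed(digit: int) -> int:
    """Assumed width rule: digits 0-7 fit in 3 bits, 8 and 9 need 4."""
    return 3 if digit < 8 else 4


def classify_pairs() -> Counter:
    """Count valid BCD input pairs by the multiplier width they require."""
    counts = Counter()
    for a in range(10):
        for b in range(10):
            counts[(bits_needed(a), bits_needed(b))] += 1
    return counts


counts = classify_pairs()
print(counts[(4, 4)])                   # pairs needing a 4 x 4 multiplication
print(counts[(3, 3)])                   # pairs needing a 3 x 3 multiplication
print(counts[(3, 4)] + counts[(4, 3)])  # pairs needing 3 x 4 or 4 x 3
```

Under that assumption the three printed counts are 4, 64, and 32, matching the figures in the abstract.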

34 citations


Cites methods or result from "Binary-coded decimal digit multipli..."

  • ...The table shows that the proposed design has reduced area and delay compared to the existing one in [7]....

    [...]

  • ...A comparison of the proposed design with the existing design in [7] in terms of area and critical path delay is done with the logic synthesis tool Leonardo Spectrum from Mentor Graphics Corporation using ASIC Library 0....

    [...]

  • ...A comparison of the proposed Hex/Decimal multiplier design with one designed using the multiplier in [7] in terms of area and critical path delay is done with the logic synthesis tool Leonardo Spectrum from Mentor Graphics Corporation using ASIC Library 0....

    [...]

  • ...24% compared to the Hex/Decimal multiplier designed using the multiplier in [7]....

    [...]

  • ...The design in [7] uses a similar approach and so this is also synthesized in the same environment as the proposed multiplier....

    [...]

Proceedings Article
09 Dec 2009
TL;DR: A variety of algorithms for basic one by one digit multiplication are proposed and FPGA implementations are presented, and time and area results for sequential and combinational implementations show better figures compared with previous published work.
Abstract: This paper presents a number of approaches to implement decimal multiplication algorithms on Xilinx FPGAs. A variety of algorithms for basic one by one digit multiplication are proposed and FPGA implementations are presented. Later on N by one digit and N by M digit multiplications are studied. Time and area results for sequential and combinational implementations show better figures compared with previous published work. Comparisons against binary fully-optimized multipliers emphasize the interest of the proposed design techniques.

28 citations


Cites background or methods from "Binary-coded decimal digit multipli..."

  • ...Recent fast decimal arithmetic units are proposed in the literature [5-12]....

    [...]

  • ...20, 4). Observe that “p3 p2 p1 p0” could violate the interval [0, 9], then an additional adjust could be necessary....

    [...]

  • ...In [9] this multiplication is implemented through a combinational circuit....

    [...]

  • ...A first technique is based on a binary multiplication followed by a correction stage [9]....

    [...]

Proceedings Article
09 Oct 2011
TL;DR: Two new algorithms (Three-Four split and Four-Three split) based on the principle of splitting the binary partial product into two parts and computing the contributions of the two parts to the partial BCD result in parallel are proposed.
Abstract: Decimal arithmetic has received considerable attention recently due to its suitability for many financial and commercial applications. In particular, numerous algorithms have been recently proposed for decimal multiplication. A major approach to decimal multiplication shaped by these proposals is based on performing the decimal digit-by-digit multiplication in binary, converting the binary partial product back to decimal, and then adding the decimal partial products as appropriate to form the final product in decimal. With this approach, the efficiency of binary-to-BCD partial product conversion is critical for the efficiency of the overall multiplication process. A recently proposed algorithm for this conversion is based on splitting the binary partial product into two parts (i.e., two groups of bits), and then computing the contributions of the two parts to the partial BCD result in parallel. This paper proposes two new algorithms (Three-Four split and Four-Three split) based on this principle. We present our proposed architectures that implement these algorithms and compare them to existing algorithms. The synthesis results show that the Three-Four split algorithm runs 15% faster and occupies 26.1% less area than the best performing equivalent circuit found in the literature. Furthermore, the Four-Three split algorithm occupies 37.5% less area than the state of the art equivalent circuit.
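The splitting principle described in this abstract can be sketched in software. The example below assumes a 7-bit binary partial product in the range 0 to 81, divided into its upper three bits and lower four bits; that assignment of bits, the function name, and the carry handling are illustrative assumptions and do not reproduce the Three-Four or Four-Three architectures from the cited paper.

```python
def split_binary_to_bcd(p: int) -> int:
    """Convert a 7-bit binary partial product (0..81) to packed BCD.

    Illustrative sketch: the two bit groups are converted independently
    (in hardware these would be two small lookup circuits working in
    parallel) and their decimal contributions are then merged.
    """
    if not 0 <= p <= 81:
        raise ValueError("partial product of two BCD digits must be 0..81")
    high, low = p >> 4, p & 0xF                     # upper 3 bits, lower 4 bits
    high_tens, high_units = divmod(high * 16, 10)   # contribution of upper group
    low_tens, low_units = divmod(low, 10)           # contribution of lower group
    units = high_units + low_units                  # at most 17, so one carry
    carry = 1 if units >= 10 else 0
    units -= 10 * carry
    tens = high_tens + low_tens + carry
    return (tens << 4) | units                      # tens in the high nibble


print(hex(split_binary_to_bcd(63)))  # 0x63
print(hex(split_binary_to_bcd(81)))  # 0x81
```

Because each group contributes at most one decimal digit of units, a single carry correction is enough to bring the merged units digit back into the range 0 to 9.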

22 citations


Cites background or methods from "Binary-coded decimal digit multipli..."

  • ...The saving in area and power when compared with the corrected architecture of [6] are 39....

    [...]

  • ...The results presented in Table IV show that, in terms of speed, our Three-Four split algorithm achieves 15% and 42% improvement over the architecture presented in [2] and the corrected architecture of [6] respectively....

    [...]

  • ...These architectures are: (i) our Three-Four split, (ii) our Four-Three split, (iii) the architecture proposed in [2] with the C2 corrected as noted before, and (iv) the version of architecture of [6] which is corrected in [2]....

    [...]

  • ...Parallel architectures that perform digit-by-digit multiplications include [6], [7] and multi-digit architectures include [11]....

    [...]

  • ...On the other hand, comparing with the corrected architecture of [6], the Four-Three split algorithm has less area and power by 51% and 55% respectively....

    [...]

References
Journal Article
TL;DR: A design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step, using straightforward diode-transistor logic.
Abstract: It is suggested that the economics of present large-scale scientific computers could benefit from a greater investment in hardware to mechanize multiplication and division than is now common. As a move in this direction, a design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step. Using straightforward diode-transistor logic, it appears presently possible to obtain products in under 1 μsec, and quotients in 3 μsec. A rapid square-root process is also outlined. Approximate component counts are given for the proposed design, and it is found that the cost of the unit would be about 10 per cent of the cost of a modern large-scale computer.

1,750 citations

Proceedings Article
15 Jun 2003
TL;DR: This work introduces a new approach to decimal floating point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard.
Abstract: Decimal arithmetic is the norm in human calculations, and human centric applications must use a decimal floating point arithmetic to achieve the same results. Initial benchmarks indicate that some applications spend 50% to 90% of their time in decimal processing, because software decimal arithmetic suffers a 100× to 1000× performance penalty over hardware. The need for decimal floating point in hardware is urgent. Existing designs, however, either fail to conform to modern standards or are incompatible with the established rules of decimal arithmetic. We introduce a new approach to decimal floating point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard. A hardware implementation of this arithmetic is in development, and it is expected that this will significantly accelerate a wide variety of applications.

287 citations

Proceedings Article
24 Jun 2003
TL;DR: Two novel designs for fixed-point decimal multiplication are presented that utilize decimal carry-save addition to reduce the critical path delay and can be extended to support decimal floating-point multiplication.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. We present two novel designs for fixed-point decimal multiplication that utilize decimal carry-save addition to reduce the critical path delay. First, a multiplier that stores a reduced number of multiplicand multiples and uses decimal carry-save addition in the iterative portion of the design is presented. Then, a second multiplier design is proposed with several notable improvements including fast generation of multiplicand multiples that do not need to be stored, the use of decimal (4:2) compressors, and a simplified decimal carry-propagate addition to produce the final product. When multiplying two n-digit operands to produce a 2n-digit product, the improved multiplier design has a worst-case latency of n+4 cycles and an initiation interval of n+1 cycles. Three data-dependent optimizations, which help reduce the multipliers' average latency, are also described. The multipliers presented can be extended to support decimal floating-point multiplication.

138 citations

Journal Article
G. Goto, T. Sato, M. Nakajima, T. Sukemura
TL;DR: By using recurring wire shifters, the authors can expand the level of repeated blocks to cover the entire adder tree, which simplifies the complicated Wallace tree wiring scheme.
Abstract: A 54-b × 54-b parallel multiplier was implemented in 0.88-μm CMOS using the new, regularly structured tree (RST) design approach. The circuit is basically a Wallace tree, but the tree and the set of partial-product-bit generators are combined into a recurring block which generates seven partial-product bits and compresses them to a pair of bits for the sum and carry signals. This block is used repeatedly to construct an RST block in which even wiring among blocks included in wire shifters is designed as recurring units. By using recurring wire shifters, the authors can expand the level of repeated blocks to cover the entire adder tree, which simplifies the complicated Wallace tree wiring scheme. In addition to design time savings, layout density is increased by 70% to 6400 transistors/mm², and the multiplication time is decreased by 30% to 13 ns.

138 citations

Proceedings Article
27 Jun 2005
TL;DR: A novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands thereby greatly simplifying the process of generating partial products for each multiplier digit.
Abstract: Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands thereby greatly simplifying the process of generating partial products for each multiplier digit. The partial products are generated using a digit-by-digit multiplier on a word-by-digit basis, first in a signed-digit form with two digits per position, and then combined via a combinational circuit. As the signed-digit partial products are developed one at a time while traversing the recoded multiplier operand from the least significant digit to the most significant digit, each partial product is added along with the accumulated sum of previous partial products via a signed-digit adder. This work is significantly different from other work employing digit-by-digit multipliers due to the efficiency gained by restricting the range of digits throughout the multiplication process.

104 citations