Proceedings ArticleDOI

A high performance architecture for rotating decimal coordinates

18 Nov 2008-pp 1757-1762
TL;DR: A modification of the CORDIC method for decimal arithmetic is proposed so as to produce fast rotations, achieving a significant reduction in the number of iterations in comparison to the original decimal CORDIC method.
Abstract: Although radix-10 arithmetic has been gaining renewed importance over the last few years, high performance decimal systems and techniques are still under development. In this paper, a modification of the CORDIC method for decimal arithmetic is proposed so as to produce fast rotations. The algorithm works with BCD operands and no conversion to binary is needed. A significant reduction in the number of iterations in comparison to the original decimal CORDIC method is achieved. The experiments showing the advantages of the new method are described. Finally, the results with regard to delay obtained by means of an FPGA implementation of the method are shown.

Summary

Introduction

  • Numbers are commonly expressed by human beings using decimal representation; as a consequence, in the early days of computing, most of the first computers worked with decimal operands [1].
  • The need for high precision engineering and manufacturing systems is also essential in CAD/CAM.
  • Originally, CORDIC was applied to binary arithmetic, but later its application was proposed for decimal data [13], [14].
  • Finally, in section V, the conclusions are given.

A. Reviewing the binary CORDIC method

  • Walther [16] extended the method to hyperbolic and linear coordinates.
  • In vectoring mode, the vector (x0, y0) is progressively rotated towards the x-axis by means of angles such as those previously mentioned, so that the component y approaches 0.

B. Reviewing the Decimal CORDIC Method

  • These devices usually work with numbers in decimal format and, therefore, binary CORDIC cannot be directly used.
  • The drawback of this decimal CORDIC method lies in the ratio between any two consecutive elementary angles of the form tan-1(10-j).
  • This fact facilitates convergence in binary CORDIC, as expressed in (5).
  • Therefore, the advantages of using the algorithm with BCD operands would be reduced to omitting the conversion between BCD and binary representation and, consequently, avoiding loss of precision.

III. THE NEW DECIMAL CORDIC METHOD

  • The proposal for a new decimal CORDIC method is based on the selection of successive angles αj such that: αj = tan-1(z⎣j⎦) (10) where z⎣j⎦ is the value resulting from truncating zj after the first digit on the left different from 0.
  • The computation of the factor for compensating this scaling can be obtained by means of the following expression: KND-1 = ∏j = 0..n cos(tan-1(z⎣j⎦)) (15) In Table V, the values for the scaling compensation factor incorporated within the first iterations are shown.
  • Each small LUT receives as inputs the value of a single digit of the coordinate and the mantissa and exponent of z⎣j⎦; 6 LUTs would constitute the storage block for tx,0, another 6 LUTs would make up the LUT block for tx,1, and so on.

A. Some Details on the Architecture Implementation

  • Addition on BCD operands is more complex than binary addition since the carry resulting from the sum of two digits must be propagated to the sum of the following ones [13].
  • BCD X3 representation allows decimal addition/subtraction to be more efficiently performed, since only two 4-bit binary adders are required for each pair of digits.
  • The final result is directly obtained in BCD X3.
  • Conversion from BCD to BCD X3 requires only 10 gates distributed over 3 levels, and similar resources are needed when transforming BCD X3 into BCD operands.
  • The complete architecture for each of the iterations of the proposed ND-CORDIC method is shown in Fig 1.

B. Experiments on Precision

  • Different tests were carried out so as to make a complete comparison with regard to precision between B-CORDIC, D-CORDIC, and the ND-CORDIC method proposed in this work.
  • Values within the range [0, 1) were chosen for the (x, y) coordinates and also for the rotation angle θ.
  • For D-CORDIC and ND-CORDIC, a conversion stage from BCD to BCD X3 was included, whereas for B-CORDIC the BCD operands were converted into binary numbers.
  • The error decreases much faster for ND-CORDIC.

C. Experiments on Latency and Hardware resources

  • The proposed architecture was implemented in VHDL using the Xilinx ISE 7.1i tool.
  • The global method was implemented on an unfolded architecture.
  • A homogeneous length of 28 bits was used for every number format, so six fractional digits, corresponding to 24 bits, were considered for the BCD original numbers.
  • The results for the delays and the FPGA resources used, when considering a single iteration and including number format conversions and scaling compensation, are shown in Table VI for comparison.
  • It can be observed that ND-CORDIC offers better global performance than D-CORDIC and B-CORDIC for the considered iterations.

V. CONCLUSIONS

  • One of the most important tasks in new hardware design is to achieve high performance rates with a trade-off between precision and delays of the circuitry that forms these new embedded architectures.
  • It seems that there is a growing trend towards developing new systems integrating decimal arithmetic, which is required in many practical research areas.
  • Moreover, the maximum error obtained is always lower for the proposed method than for binary and decimal CORDIC.
  • At this point, new scaling compensation techniques must be studied and developed so as to improve delay and resources utilization.
  • This work has been supported by the Generalitat Valenciana under Grant No. GV/2007/173.

A High Performance Architecture for Rotating
Decimal Coordinates
Jose-Luis Sanchez, Higinio Mora, Jeronimo Mora, Antonio Jimeno
Computer Technology Department
University of Alicante
Alicante, Spain
Email: sanchez@dtic.ua.es, hmora@dtic.ua.es, jeronimo@dtic.ua.es, jimeno@dtic.ua.es
Abstract—Although radix-10 arithmetic has been gaining
renewed importance over the last few years, high performance
decimal systems and techniques are still under development. In
this paper, a modification of the CORDIC method for decimal
arithmetic is proposed so as to produce fast rotations. The
algorithm works with BCD operands and no conversion to binary
is needed. A significant reduction in the number of iterations in
comparison to the original decimal CORDIC method is achieved.
The experiments showing the advantages of the new method are
described. Finally, the results with regard to delay obtained by
means of an FPGA implementation of the method are shown.
I. INTRODUCTION
Numbers are commonly expressed by human beings using
decimal representation; as a consequence, in the early days of
computing, most of the first computers worked with decimal
operands [1]. However, due to the greater simplicity of binary
arithmetic units and the compactness of binary numbers,
decimal arithmetic fell into disuse and for many years it has
been difficult to find new proposals of radix 10-based
computers. This fact has finally led to a preponderance of
binary systems over decimal ones. In spite of that, some
examples of decimal architectures can be found, such as
Hewlett Packard [2], Texas Instruments [3] and Casio
calculators [4], and some others [4].
In recent years, a renewed interest in decimal arithmetic
computing has arisen, since it is essential for many
applications. For instance, financial calculations are carried out
using decimal arithmetic, as binary operations often imply
rounding up or down the results when working with fractional
operands. Several studies involving financial and business-
oriented applications have revealed that 55% of the numerical
data contained in commercial databases are in decimal format
[5]. The need for high precision engineering and manufacturing
systems is also essential in CAD/CAM. When defining a radix-
10 magnitude for an object, the internal use of radix-2 usually
implies loss of precision, since the equivalent binary number is
likely to have an infinite amount of fractional digits. On the
other hand, there are currently optic and magnetic sensors
which directly provide the output in BCD format, so that the
user can easily monitor the evolution of certain magnitudes and
detect any errors [6]. The same happens with some types of
actuators which use ISO-ASCII as the code for inputting data
to the manufacturing process [7].
Proof of the importance recently given to decimal
representation is the fact that even the IEEE 854 standard uses
a radix-independent generalization of IEEE 754 and supports
decimal floating point operations [8], [9]. Recently the IBM
z900 microprocessor has been developed [10], with a decimal
arithmetic unit. Furthermore, the European Commission
specifies a certain number of decimal digits for calculating
currency conversions [11].
CORDIC (COordinate Rotation Digital Computer) is a
relevant method to approximate mathematical functions [12].
This method basically works as an iterative algorithm for
approximating rotation of a two-dimensional vector using only
add and shift operations. It is particularly suited to hardware
implementations due to the fact that it does not require any
multiplication. Originally, CORDIC was applied to binary
arithmetic, but later its application was proposed for decimal
data [13], [14].
This paper shows new results on the research in decimal
arithmetic carried out by the authors [15]. Thus, a new
CORDIC method for decimal operands is proposed, based on
the use of decimal arithmetic and on the selection of adequate
angles so as to reduce the number of iterations required to
obtain a suitable precision. In section II, both the binary and
the decimal CORDIC method are reviewed. In section III, the
new CORDIC method is proposed. In order for a real system to
operate with our method, an architecture carried out on FPGA
is proposed throughout section IV and the results of a series of
experiments with regard to precision and the required number
of stages are shown. Finally, in section V, the conclusions are
given.
II. THE CORDIC METHOD
A. Reviewing the binary CORDIC method
The rotation of a 2D point (x, y) through an angle θ can be
directly computed by means of the following equations:
$$x_R = x \cos\theta - y \sin\theta \qquad (1)$$
$$y_R = x \sin\theta + y \cos\theta \qquad (2)$$
The above equations imply a high computational cost due to
the fact that some multiplications and the previous calculation
of cos θ and sin θ must be performed.
978-1-4244-1666-0/08/$25.00 ©2008 IEEE
Authorized licensed use limited to: UNIVERSIDAD DE ALICANTE. Downloaded on October 16, 2009 at 04:53 from IEEE Xplore. Restrictions apply.

CORDIC was developed by Volder [12] for computing the
rotation of a 2D vector of circular coordinates expressed as
binary numbers, exclusively using addition and shift
operations. Walther [16] extended the method to hyperbolic
and linear coordinates. CORDIC works in two different modes.
In rotation mode, a vector $(x_0, y_0)$ is rotated through an angle
$\theta$ in order to obtain a new vector $(x_n, y_n)$. The overall rotation is
divided into micro-rotations such that, in micro-rotation $j$, an
angle $\alpha_j = \tan^{-1}(2^{-j})$ is added to or subtracted from the
remaining angle $\theta_j$. In this way, this angle approaches zero. In
vectoring mode, the vector $(x_0, y_0)$ is progressively rotated
towards the x-axis by means of angles such as those previously
mentioned, so that the component $y$ approaches 0. As a result,
the sum of all the angles applied gives the value of the angle of
vector $(x_0, y_0)$ with respect to the x-axis, whereas the final
component $x_n$ is the vector magnitude. The algorithm is based on
the following equations:
$$x_{j+1} = x_j - m\,\sigma_j\,y_j\,2^{-d(j)} \qquad (3a)$$
$$y_{j+1} = y_j + \sigma_j\,x_j\,2^{-d(j)} \qquad (3b)$$
$$z_{j+1} = z_j - \sigma_j\,w_{d(j)} \qquad (3c)$$
The values for $m$ are 1 for circular, -1 for hyperbolic, and 0
for linear coordinates. The value for $\sigma_j$ determines the direction
of micro-rotation $j$. In rotation mode, $\sigma_j$ is equal to 1 if $z_j$ is
positive, and $\sigma_j$ is equal to -1 otherwise. The values for $d(j)$
and $w_{d(j)}$ are shown in Table I, whereas Table II shows the
results provided by the algorithm in rotation mode depending
on the type of coordinates.
The elementary angles $\alpha_j$ must fulfil the following
condition [16]:
$$\alpha_j \le \sum_{k=j+1}^{n} \alpha_k + \alpha_n, \quad \forall j \ge 0 \qquad (4)$$
With regard to the elementary angles chosen for circular
coordinates, convergence is guaranteed since the following
property holds:
$$\tan^{-1}(2^{-j}) \le \sum_{k=j+1}^{n} \tan^{-1}(2^{-k}), \quad \forall j \ge 0 \qquad (5)$$
When working with hyperbolic coordinates, carrying out
each micro-rotation only once is not sufficient. Indeed,
convergence can be achieved by repeating certain iterations
[16], as shown in Table I.
In iteration $j$, a scaling factor is introduced into the new coordinates
$(x_j, y_j)$. This factor is given by the following expression:
$$K_{m,j} = \sqrt{1 + m\,2^{-2d(j)}} \qquad (6)$$
The coordinates obtained after the last iteration have to be
compensated by multiplying them by $K_m^{-1}$, taking into account
that $K_m$ results from the following product:
$$K_m = \prod_{j} K_{m,j} \qquad (7)$$
TABLE I
PARAMETERS FOR DIFFERENT COORDINATE TYPES
Type         $d(j)$      $w_{d(j)}$
Circular     $j$         $\tan^{-1}(2^{-j})$
Hyperbolic   $j - k$     $\tanh^{-1}(2^{-j})$
Linear       $j$         $2^{-j}$
(Hyperbolic: $k$ is the largest integer such that $3^{k+1} + 2k - 1 \le 2j$.)
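The hyperbolic row of Table I can be made concrete with a short sketch. It assumes that the relation defining $k$ reads $3^{k+1} + 2k - 1 \le 2j$ and that the shift is $d(j) = j - k$, as in Walther's formulation [16]; the function name `hyperbolic_shift` and the restriction to $j \ge 1$ are choices of this illustration, not of the paper.

```python
def hyperbolic_shift(j: int) -> int:
    """Shift amount d(j) = j - k for hyperbolic CORDIC, for j >= 1.

    k is taken as the largest integer with 3**(k+1) + 2*k - 1 <= 2*j
    (assumed reading of the hyperbolic row of Table I).
    """
    if j < 1:
        raise ValueError("hyperbolic iterations are indexed from j = 1 here")
    k = 0
    # Try the next candidate k + 1 until the inequality fails.
    while 3 ** (k + 2) + 2 * (k + 1) - 1 <= 2 * j:
        k += 1
    return j - k


# Shift sequence for the first 16 iterations; a repeated value marks an
# iteration that is carried out twice.
shifts = [hyperbolic_shift(j) for j in range(1, 17)]
```

Under these assumptions the shifts 4 and 13 appear twice in `shifts`, which is the repetition of certain iterations that the text attributes to hyperbolic coordinates.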
TABLE II
RESULTS FOR DIFFERENT COORDINATE TYPES
Type         Result
Circular     $x_n = K_1 (x \cos z - y \sin z)$
             $y_n = K_1 (y \cos z + x \sin z)$
             $z_n = 0$
Hyperbolic   $x_n = K_{-1} (x \cosh z + y \sinh z)$
             $y_n = K_{-1} (y \cosh z + x \sinh z)$
             $z_n = 0$
Linear       $x_n = x$
             $y_n = y + x z$
             $z_n = 0$
Several methods to avoid performing the final product by
$K_m^{-1}$ and carry out the scaling compensation in parallel with
each of the iterations have been proposed [17]-[20].
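The rotation mode reviewed in this subsection can be sketched in a few lines for circular coordinates ($m = 1$, $d(j) = j$). This is a floating-point illustration only: the function name `b_cordic_rotate`, the default of 24 micro-rotations and the final division by the accumulated scale factor are assumptions of the sketch, and a hardware design would use shifts and one of the compensation techniques cited above instead of a division.

```python
import math


def b_cordic_rotate(x: float, y: float, theta: float, n: int = 24):
    """Rotate (x, y) by theta (radians) with n binary micro-rotations."""
    z = theta
    scale = 1.0
    for j in range(n):
        sigma = 1.0 if z > 0 else -1.0           # direction of micro-rotation j
        shift = 2.0 ** -j                         # a right shift by j bits in hardware
        # Simultaneous update, so both lines use the old x and y.
        x, y = x - sigma * y * shift, y + sigma * x * shift
        z -= sigma * math.atan(shift)             # remaining angle
        scale *= math.sqrt(1.0 + shift * shift)   # per-iteration scaling factor
    # Final compensation by the inverse of the accumulated scale factor.
    return x / scale, y / scale
```

Calling `b_cordic_rotate(0.931420, 0.538504, 0.785398)` should return values close to those given by a direct evaluation of (1) and (2); the residual angle after $n$ steps is bounded by $\tan^{-1}(2^{-(n-1)})$ for inputs inside the convergence range.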
B. Reviewing the Decimal CORDIC Method
The CORDIC method is flexible and simple, so it is suitable
for environments in which a small number of hardware
resources are available. One of these environments is that of
portable calculators [2]. However, these devices usually work
with numbers in decimal format and, therefore, binary
CORDIC cannot be directly used. In [13] and [21] the use of
CORDIC for BCD operands is proposed. The modification of
the method, focusing on the case of circular coordinates, is
expressed by the following iterative equations:
$$x_{j+1} = x_j - \sigma_j\,y_j\,10^{-j} \qquad (8a)$$
$$y_{j+1} = y_j + \sigma_j\,x_j\,10^{-j} \qquad (8b)$$
$$z_{j+1} = z_j - \sigma_j \tan^{-1}(10^{-j}) \qquad (8c)$$
The drawback of this decimal CORDIC method lies in the
ratio between any two consecutive elementary angles of the
form $\tan^{-1}(10^{-j})$. The ratio between any two consecutive
angles of the form $\tan^{-1}(2^{-j})$ is approximately 2. This fact
facilitates convergence in binary CORDIC, as expressed in (5).
However, in the case of decimal representation, each angle is
approximately 10 times smaller than the previous one, so
convergence of the method cannot be directly guaranteed. This
is not specific to radix-10 representation. Recall that in binary
CORDIC applied to hyperbolic coordinates, certain iterations
must be repeated so as to guarantee convergence. In
decimal CORDIC, each iteration but the initial one must be
repeated 9 times so as to achieve convergence [13]. In this
case, the following condition is fulfilled:
$$\tan^{-1}(10^{-j}) \le 9 \sum_{k=j+1}^{n} \tan^{-1}(10^{-k}), \quad \forall j \ge 0 \qquad (9)$$
References [13] and [21] show that decimal CORDIC can
compute sine and cosine functions with a 5-digit accuracy if at
least 30 angular steps are performed. These results are suitable
in terms of precision. However, this method cannot compete
with binary CORDIC with regard to latency, since the binary
method requires a smaller number of iterations to obtain
the same precision. Therefore, the advantages of using the
algorithm with BCD operands would be reduced to omitting the
conversion between BCD and binary representation and,
consequently, avoiding loss of precision.
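The cost of the repetition rule can be seen in a small floating-point sketch of the decimal iteration. It follows one reading of the text: the angle $\tan^{-1}(10^{0})$ is applied once and every angle $\tan^{-1}(10^{-j})$ with $j \ge 1$ is applied 9 times. The function name `d_cordic_rotate`, the parameter `digits` and the final division by the accumulated scale factor are assumptions of this illustration and are not taken from [13] or [21].

```python
import math


def d_cordic_rotate(x: float, y: float, theta: float, digits: int = 6):
    """Decimal CORDIC rotation with each angle for j >= 1 repeated 9 times."""
    # Iteration schedule: j = 0 once, then j = 1..digits nine times each.
    schedule = [0] + [j for j in range(1, digits + 1) for _ in range(9)]
    z = theta
    scale = 1.0
    for j in schedule:
        sigma = 1.0 if z > 0 else -1.0
        t = 10.0 ** -j                            # a decimal digit shift on BCD data
        x, y = x - sigma * y * t, y + sigma * x * t
        z -= sigma * math.atan(t)
        scale *= math.sqrt(1.0 + t * t)
    # The schedule is fixed, so the scale factor is a constant of the method.
    return x / scale, y / scale, len(schedule)
```

With `digits = 6` the schedule contains 1 + 9 · 6 = 55 micro-rotations, which illustrates why the latency of this scheme compares poorly with that of the binary method.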
III. THE NEW DECIMAL CORDIC METHOD
The proposal for a new decimal CORDIC method is based
on the selection of successive angles $\alpha_j$ such that:
$$\alpha_j = \tan^{-1}(z_{\lfloor j \rfloor}) \qquad (10)$$
where $z_{\lfloor j \rfloor}$ is the value resulting from truncating $z_j$ after the
first digit on the left different from 0. In this way, the $z$
component for accumulating the remaining angle is calculated
by means of the following expression:
$$z_{j+1} = z_j - \tan^{-1}(z_{\lfloor j \rfloor}) \qquad (11)$$
The angles $\alpha_j$ can alternatively be determined in the following
way:
$$\tan(\alpha_j) = \tan(\tan^{-1}(z_{\lfloor j \rfloor})) = z_{\lfloor j \rfloor} \qquad (12)$$
As a consequence, the equations for the iterative
computation of x and y are expressed as follows:
$$x_{j+1} = x_j - m\,\sigma_j\,z_{\lfloor j \rfloor}\,y_j \qquad (13)$$
$$y_{j+1} = y_j + \sigma_j\,z_{\lfloor j \rfloor}\,x_j \qquad (14)$$
From this point on, the new decimal CORDIC will be
referred to as ND-CORDIC, whereas binary CORDIC and the
previous decimal CORDIC will be referred to as B-CORDIC
and D-CORDIC, respectively. Table III shows an example
where the initial rotating angle is $z_0 = 0.785398$. The different
values for $z_j$, $z_{\lfloor j \rfloor}$, and $\tan^{-1}(z_{\lfloor j \rfloor})$ according to each iteration $j$ are
presented. Table IV shows the different values for $x_j$ and $y_j$
according to each iteration and the above mentioned rotation
angle, and taking into account the initial values $x_0 = 0.931420$
and $y_0 = 0.538504$.
TABLE III
VALUES FOR $z_j$, $z_{\lfloor j \rfloor}$ AND $\tan^{-1}(z_{\lfloor j \rfloor})$
Iteration   $z_j$      $z_{\lfloor j \rfloor}$   $\tan^{-1}(z_{\lfloor j \rfloor})$
j = 0       0.785398   0.7        0.610725
j = 1       0.174672   0.1        0.099668
j = 2       0.075003   0.07       0.069886
j = 3       0.005117   0.005      0.004999
j = 4       0.000117   0.0001     0.0001
j = 5       0.000017   0.00001    0.00001
j = 6       0.000007   0.000007   0.000007
j = 7       0.000000   0.000000   0.000000
TABLE IV
VALUES FOR $x_j$, $y_j$ AND $z_{\lfloor j \rfloor}$
Iteration   $x_j$      $y_j$      $\tan(\alpha_j) = z_{\lfloor j \rfloor}$
j = 0       0.931420   0.538504   0.7
j = 1       0.554467   1.190498   0.1
j = 2       0.435417   1.245945   0.07
j = 3       0.348201   1.276424   0.005
j = 4       0.341822   1.278165   0.0001
j = 5       0.341694   1.278199   0.00001
j = 6       0.341681   1.278202   0.000007
j = 7       0.341669   1.278205   0
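The worked example of Tables III and IV can be retraced with a minimal sketch of the ND-CORDIC iteration for circular coordinates. It uses Python floats for the coordinates and the `decimal` module only for the leading-digit truncation; the helper names, the 6-digit cut-off and the stopping rule (stop once the truncated angle is zero) are assumptions of the sketch. The sign of the truncated value plays the role of $\sigma_j$.

```python
import math
from decimal import Decimal, ROUND_DOWN


def truncate_to_leading_digit(z: float, frac_digits: int = 6) -> float:
    """Keep only the first nonzero decimal digit of z (0.785398 -> 0.7)."""
    # Cut z to frac_digits fractional digits first (assumed working precision).
    d = Decimal(repr(z)).quantize(Decimal(1).scaleb(-frac_digits),
                                  rounding=ROUND_DOWN)
    if d == 0:
        return 0.0
    # adjusted() is the exponent of the most significant digit.
    leading = d.quantize(Decimal(1).scaleb(d.adjusted()), rounding=ROUND_DOWN)
    return float(leading)


def nd_cordic_rotate(x: float, y: float, theta: float,
                     frac_digits: int = 6, max_iter: int = 20):
    """Uncompensated ND-CORDIC rotation; returns (x, y, residual z, angles used)."""
    z = theta
    used = []                                   # truncated values actually applied
    for _ in range(max_iter):
        t = truncate_to_leading_digit(z, frac_digits)
        if t == 0.0:
            break                               # remaining angle below working precision
        x, y = x - t * y, y + t * x             # (13) and (14) with m = 1
        z -= math.atan(t)                       # (11)
        used.append(t)
    return x, y, z, used


x7, y7, z7, used = nd_cordic_rotate(0.931420, 0.538504, 0.785398)
```

In this sketch `used` follows the third column of Table III and `x7`, `y7` land close to the last row of Table IV; small differences in the last digit of intermediate values are to be expected, since the rounding used to produce the tables is not specified.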
As shown in the last row of Table IV, the final values for the
rotated coordinates are $x_7 = 0.341669$, $y_7 = 1.278205$. The
rotation of the original point, directly computed by means of
(1) and (2), gives as a result the values $x_R = 0.277834$ and
$y_R = 1.039393$. The discrepancy between $(x_7, y_7)$ and $(x_R, y_R)$ is
caused by the scaling factor that is incorporated within each
ND-CORDIC iteration. The factor for compensating this
scaling can be obtained by means of the following expression:
$$K_{ND}^{-1} = \prod_{j=0}^{n} \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (15)$$
In Table V, the values for the scaling compensation factor
incorporated within the first iterations are shown. For relatively
small values of $z_{\lfloor j \rfloor}$, the scaling factor can be assumed to be
equal to 1. The last row contains the value for the global
scaling factor $K_{ND}^{-1}$ as defined in (15).
TABLE V
TERMS DETERMINING THE SCALING FACTOR
Iteration     $z_{\lfloor j \rfloor}$   $\cos(\tan^{-1}(z_{\lfloor j \rfloor}))$
j = 0         0.7        0.81923192
j = 1         0.1        0.99503719
j = 2         0.07       0.99755897
j = 3         0.005      0.9999875
j = 4         0.0001     1
j = 5         0.00001    1
j = 6         0.000007   1
$K_{ND}^{-1}$            0.81316621
The results of the products $x_7 \cdot K_{ND}^{-1}$ and $y_7 \cdot K_{ND}^{-1}$ give an error
equal to $4.39821 \cdot 10^{-7}$ for coordinate x and $1.17566 \cdot 10^{-7}$ for
coordinate y.
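The compensation factor of (15) and the resulting errors can be checked numerically from the tabulated values. The sketch below uses the identity $\cos(\tan^{-1}(t)) = 1/\sqrt{1 + t^2}$, takes the truncated angles from Table V and the final coordinates from Table IV, and compares the compensated result with the rounded reference values $x_R$ and $y_R$ quoted in the text; the variable and function names are illustrative.

```python
import math

# Truncated angle values z_floor(j) for j = 0..6, as listed in Table V.
TRUNCATED = [0.7, 0.1, 0.07, 0.005, 0.0001, 0.00001, 0.000007]


def inverse_scale(truncated):
    """Product of cos(atan(t)) = 1 / sqrt(1 + t^2) over all applied angles."""
    return math.prod(1.0 / math.sqrt(1.0 + t * t) for t in truncated)


k_inv = inverse_scale(TRUNCATED)
x7, y7 = 0.341669, 1.278205        # last row of Table IV
x_ref, y_ref = 0.277834, 1.039393  # direct rotation by (1) and (2), as quoted
err_x = abs(x7 * k_inv - x_ref)
err_y = abs(y7 * k_inv - y_ref)
```

Because the inputs are rounded to six digits, `err_x` and `err_y` only agree with the errors reported above in order of magnitude.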
The scale factor compensation by means of multiplication
must be avoided due to the high computational cost of this
operation. In B-CORDIC, compensation without products is
easy to perform due to the fact that the scale factor is a constant
[12]. However, in ND-CORDIC this factor varies depending
on the different angles chosen through the method iterations, as
shown in Table V.
A technique based on LUT (look-up tables) can be used
which allows the compensation to be performed on each
iteration. Equations (13) and (14) can be modified in order to
include the compensation, which results in the following
expression, where the superscript C indicates that the
coordinates are scaling-compensated:
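$$x_{j+1}^{C} = (x_j - m\,\sigma_j\,z_{\lfloor j \rfloor}\,y_j) \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (16)$$
$$y_{j+1}^{C} = (y_j + \sigma_j\,z_{\lfloor j \rfloor}\,x_j) \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (17)$$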
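The above equations can be rewritten in the following way:
$$x_{j+1}^{C} = x_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) - m\,\sigma_j\,z_{\lfloor j \rfloor}\,y_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (18)$$
$$y_{j+1}^{C} = y_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) + \sigma_j\,z_{\lfloor j \rfloor}\,x_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (19)$$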
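In equations (18) and (19), four different terms appear:
$$t_{x,0} = x_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (20a)$$
$$t_{x,1} = z_{\lfloor j \rfloor}\,y_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (20b)$$
$$t_{y,0} = y_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (20c)$$
$$t_{y,1} = z_{\lfloor j \rfloor}\,x_j \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (20d)$$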
The proposed compensation technique consists in storing the
above four terms in four independent blocks of LUTs. The
entries for each block of LUTs consist of the one-digit mantissa
and the exponent of $z_{\lfloor j \rfloor}$, and also the value of $x_j$ or $y_j$. If each
term were stored in a single LUT, the size of each LUT would
be excessive. For instance, when a precision of 6 fractional
digits is required, 24 bits are needed for each coordinate, 4 for
indicating the mantissa of $z_{\lfloor j \rfloor}$, and 3 for indicating the exponent
of $z_{\lfloor j \rfloor}$ (from 000 to 110). Thus, the size of a monolithic LUT
for each term would be $2^{4 \cdot 6 + 3 + 4} \cdot 4 \cdot 6$ bits = 6144 MB. Instead,
much smaller LUTs can be used. If the different BCD X3 digits
of $x_j$ are considered, the term $t_{x,0}$ can be expressed as:
$$t_{x,0} = (x_j[5]\; x_j[4]\; x_j[3]\; x_j[2]\; x_j[1]\; x_j[0]) \cos(\tan^{-1}(z_{\lfloor j \rfloor})) \qquad (21)$$
Therefore, each small LUT will receive as inputs the value
of a single digit of the coordinate and the mantissa and
exponent of $z_{\lfloor j \rfloor}$. For 6 fractional digits, the size of each LUT
would be $2^{4 + 3 + 4} \cdot 4 \cdot 6$ bits = 6 KB. Since 6 fractional digits and
four terms must be considered, the overall memory size would
be 6 KB · 6 · 4 = 144 KB. In this case, 6 LUTs would constitute
the storage block for $t_{x,0}$, another 6 LUTs would make up the
LUT block for $t_{x,1}$, and so on.
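The memory figures quoted above, and the reason a per-digit split is possible, can be reproduced with a few lines of arithmetic. The sketch assumes that each LUT word holds 24 bits and that the address is formed from the coordinate digits plus the 4-bit mantissa and 3-bit exponent of the truncated angle. The function `tx0_by_digits` illustrates the linearity behind (21); its digit ordering (most significant fractional digit first) and the encoding of the truncated angle as mantissa · 10^(-exponent) are assumptions of this illustration.

```python
import math

DIGITS = 6                         # fractional digits per coordinate
DIGIT_BITS = 4                     # one BCD or BCD X3 digit
MANT_BITS = 4                      # one-digit mantissa of the truncated angle
EXP_BITS = 3                       # exponent of the truncated angle (000 to 110)
WORD_BITS = DIGIT_BITS * DIGITS    # 24-bit stored result


def lut_bytes(address_bits: int, word_bits: int = WORD_BITS) -> int:
    """Size in bytes of a LUT with 2**address_bits words of word_bits bits."""
    return (2 ** address_bits) * word_bits // 8


monolithic = lut_bytes(DIGIT_BITS * DIGITS + EXP_BITS + MANT_BITS)
per_digit = lut_bytes(DIGIT_BITS + EXP_BITS + MANT_BITS)
total = per_digit * DIGITS * 4     # 6 digit LUTs for each of the 4 terms


def tx0_by_digits(digits, mantissa: int, exponent: int) -> float:
    """Build t_x0 = x * cos(atan(z_trunc)) as a sum of per-digit entries."""
    c = math.cos(math.atan(mantissa * 10.0 ** -exponent))
    # Each summand is what one small LUT would return for its digit.
    return sum(d * 10.0 ** -(i + 1) * c for i, d in enumerate(digits))
```

Here `monolithic` corresponds to 6144 MB, `per_digit` to 6 KB and `total` to 144 KB, and `tx0_by_digits([9, 3, 1, 4, 2, 0], 7, 1)` matches `0.931420 * cos(atan(0.7))` up to rounding. In hardware the per-digit outputs would still have to be added, which the sketch hides inside `sum`.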
IV. DECIMAL CORDIC ARCHITECTURE. EXPERIMENTATION
A. Some Details on the Architecture Implementation
Addition on BCD operands is more complex than binary
addition since the carry resulting from the sum of two digits
must be propagated to the sum of the following ones [13].
Moreover, the sum of two BCD digits must be corrected by
adding the value 6 to this sum if it is greater than 9.
BCD X3 (excess-3) representation allows decimal
addition/subtraction to be more efficiently performed, since
only two 4-bit binary adders are required for each pair of
digits. The final result is directly obtained in BCD X3. More
detailed information on BCD X3 adders can be found in [13].
Conversion from BCD to BCD X3 requires only 10 gates
distributed over 3 levels, and similar resources are needed when
transforming BCD X3 into BCD operands. Therefore, the use of
BCD X3 is proposed since addition, subtraction, and other
operations are simpler than for BCD.
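The two digit-level schemes described above can be compared in a short behavioural sketch. `bcd_add_digit` applies the +6 correction, while `xs3_add_digit` models the excess-3 digit adder as two 4-bit additions: the first adds the operands, and the second adds 3 when a carry was produced or subtracts 3 (adds 13 modulo 16) otherwise. The function names and the integer-based wrapper `xs3_add` are illustrative assumptions and do not describe the adder of [13].

```python
def bcd_add_digit(a: int, b: int, carry_in: int = 0):
    """Add two BCD digits (0-9); add 6 when the binary sum exceeds 9."""
    s = a + b + carry_in
    if s > 9:
        s += 6                        # skip the six unused codes 1010..1111
    return s & 0xF, s >> 4            # (digit, carry_out)


def xs3_add_digit(a_xs3: int, b_xs3: int, carry_in: int = 0):
    """Add two excess-3 digits using two 4-bit binary additions."""
    s = a_xs3 + b_xs3 + carry_in      # first adder: operands carry an excess of 6
    carry = s >> 4                    # its carry-out is already the decimal carry
    fix = 3 if carry else 13          # second adder: +3, or -3 as +13 modulo 16
    return ((s & 0xF) + fix) & 0xF, carry


def xs3_add(a: int, b: int, ndigits: int = 7) -> int:
    """Add two non-negative integers digit by digit in excess-3."""
    carry, result = 0, 0
    for i in range(ndigits):
        da = (a // 10 ** i) % 10 + 3          # BCD digit -> excess-3
        db = (b // 10 ** i) % 10 + 3
        digit, carry = xs3_add_digit(da, db, carry)
        result += (digit - 3) * 10 ** i       # excess-3 -> BCD digit
    return result
```

The excess-3 version needs no comparison against 9, because the carry-out of the first 4-bit addition directly signals a decimal carry; the result digit is again in excess-3 form.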
The complete architecture for each of the iterations of the
proposed ND-CORDIC method is shown in Fig. 1.
B. Experiments on Precision
Different tests were carried out so as to make a complete
comparison with regard to precision between B-CORDIC,
D-CORDIC, and the ND-CORDIC method proposed in this work.
Values within the range [0, 1) were chosen for the (x, y)
coordinates and also for the rotation angle $\theta$. Original data
were represented in BCD with 6 fractional digits. For
D-CORDIC and ND-CORDIC, a conversion stage from BCD to
BCD X3 was included, whereas for B-CORDIC the BCD
operands were converted into binary numbers. In all cases,
28-bit operands were considered. The experiments were aimed at
comparing the number of iterations required in each method so
as to achieve suitable precision. In the test, 500 random points
were rotated through a random angle. The results for the error
distribution are shown in Fig. 2, whereas the results for the
maximum relative error are shown in Fig. 3.
A decreasing tendency can be observed for every method as
the number of iterations increases. However, the error
decreases much faster for ND-CORDIC. For this method, the
error reaches stability in about 10 iterations, whereas for
D-CORDIC and B-CORDIC many more iterations are required.
In addition, the maximum error is always lower for the
ND-CORDIC method. The mean and maximum relative errors for
ND-CORDIC seem to be always lower than those for
D-CORDIC and B-CORDIC.
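The kind of comparison described in this subsection can be imitated in software. The sketch below is an idealised stand-in and not the authors' experiment: it uses double-precision floats instead of 28-bit BCD X3 or binary operands, normalises the vector exactly after every micro-rotation instead of using LUTs or the method of [20], and draws 500 points and angles from [0, 1) with a fixed seed. All function names are hypothetical.

```python
import math
import random
from decimal import Decimal, ROUND_DOWN


def leading_digit(z: float, frac_digits: int = 6) -> float:
    """Truncate z to frac_digits digits, then keep its first nonzero digit."""
    d = Decimal(repr(z)).quantize(Decimal(1).scaleb(-frac_digits), rounding=ROUND_DOWN)
    if not d:
        return 0.0
    return float(d.quantize(Decimal(1).scaleb(d.adjusted()), rounding=ROUND_DOWN))


def rotate_b(x, y, theta, n):
    """Binary CORDIC, n micro-rotations, exact per-step normalisation."""
    z = theta
    for j in range(n):
        s = 1.0 if z > 0 else -1.0
        t = 2.0 ** -j
        x, y, z = x - s * y * t, y + s * x * t, z - s * math.atan(t)
        norm = math.sqrt(1.0 + t * t)
        x, y = x / norm, y / norm
    return x, y


def rotate_nd(x, y, theta, n):
    """ND-CORDIC, at most n micro-rotations, exact per-step normalisation."""
    z = theta
    for _ in range(n):
        t = leading_digit(z)
        if t == 0.0:
            break
        x, y, z = x - t * y, y + t * x, z - math.atan(t)
        norm = math.sqrt(1.0 + t * t)
        x, y = x / norm, y / norm
    return x, y


def max_relative_error(rotate, n, trials=500, seed=0):
    """Largest relative error over random points and angles in [0, 1)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x, y, theta = rng.random(), rng.random(), rng.random()
        xe = x * math.cos(theta) - y * math.sin(theta)   # reference by (1)
        ye = x * math.sin(theta) + y * math.cos(theta)   # reference by (2)
        xa, ya = rotate(x, y, theta, n)
        worst = max(worst, math.hypot(xa - xe, ya - ye) / math.hypot(xe, ye))
    return worst
```

Comparing `max_relative_error(rotate_nd, n)` with `max_relative_error(rotate_b, n)` for small `n` gives a feel for the faster error decay of ND-CORDIC, although the absolute figures from this idealised model should not be read as a reproduction of Fig. 2 or Fig. 3.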
C. Experiments on Latency and Hardware resources
The proposed architecture was implemented in VHDL using
the Xilinx ISE 7.1i tool. The Virtex4 XC4VLX60 FPGA was
chosen for simulation. The architectures for D-CORDIC
proposed in [13] and [21] and for B-CORDIC were also
implemented. In all cases, a complete stage of the method was
implemented, with the type of adder and shifter, if needed,
being varied according to each method. The global method was
implemented on an unfolded architecture. Conventional
arithmetic was used. In the case of B-CORDIC, the scaling factor
was compensated by means of the method proposed in [20],
which allows the compensating product to be transformed into
simple additional shift-add iterations. In the case of ND-CORDIC,
the compensation was achieved by means of the LUT
technique previously described. The initial conversion from
BCD to BCD X3 and the final conversion the other way were
also included for the D-CORDIC and the ND-CORDIC
methods. For B-CORDIC, an initial conversion from BCD to
binary and a final conversion the other way were also
implemented. A homogeneous length of 28 bits was used for
every number format, so six fractional digits, corresponding to
24 bits, were considered for the BCD original numbers.
The results for the delays and the FPGA resources used,
when considering a single iteration and including number
format conversions and scaling compensation, are shown in
Table VI for comparison. As can be observed, the delay for
the proposed ND-CORDIC is less than half the delay for the
[Figure: block diagram with registers x, y and z; a calculator that extracts the exponent and mantissa of the truncated angle; LUT blocks for x cos(atan(z⌊j⌋)), z⌊j⌋ x cos(atan(z⌊j⌋)), y cos(atan(z⌊j⌋)), z⌊j⌋ y cos(atan(z⌊j⌋)) and atan(z⌊j⌋); three BCD X3 adders producing x_{j+1}, y_{j+1} and z_{j+1}.]
Fig. 1. The architecture for an ND-CORDIC iteration.
[Figure: error (%) on a logarithmic scale from 1.00E-07 to 1.00E+02 against the number of iterations, 1 to 12; series for D-CORDIC, B-CORDIC and ND-CORDIC.]
Fig. 2. Relative error distribution when calculating the rotation of vectors
within the unit circle, according to the number of iterations;
logarithmic scale.
[Figure: error (%) on a logarithmic scale from 1.00E-06 to 1.00E+02 against the number of iterations, 1 to 12; series for D-CORDIC, B-CORDIC and ND-CORDIC.]
Fig. 3. Maximum relative error on calculating the rotation of vectors within
the unit circle, according to the number of iterations; logarithmic
scale.

References
Journal ArticleDOI
Jack E. Volder
TL;DR: The trigonometric algorithms used in this computer and the instrumentation of these algorithms are discussed in this paper.
Abstract: The COordinate Rotation DIgital Computer (CORDIC) is a special-purpose digital computer for real-time airborne computation. In this computer, a unique computing technique is employed which is especially suitable for solving the trigonometric relationships involved in plane coordinate rotation and conversion from rectangular to polar coordinates. CORDIC is an entire-transfer computer; it contains a special serial arithmetic unit consisting of three shift registers, three adder-subtractors, and special interconnections. By use of a prescribed sequence of conditional additions or subtractions, the CORDIC arithmetic unit can be controlled to solve either set of the following equations: Y' = K(Y cos λ + X sin λ), X' = K(X cos λ - Y sin λ), or R = K√(X² + Y²), θ = tan⁻¹(Y/X), where K is an invariable constant. This special arithmetic unit is also suitable for other computations such as multiplication, division, and the conversion between binary and mixed radix number systems. However, only the trigonometric algorithms used in this computer and the instrumentation of these algorithms are discussed in this paper.

Proceedings ArticleDOI
J. S. Walther
18 May 1971
TL;DR: This paper describes a single unified algorithm for the calculation of elementary functions including multiplication, division, sin, cos, tan, arctan, sinh, cosh, tanh, arctanh, ln, exp and square-root.
Abstract: This paper describes a single unified algorithm for the calculation of elementary functions including multiplication, division, sin, cos, tan, arctan, sinh, cosh, tanh, arctanh, ln, exp and square-root. The basis for the algorithm is coordinate rotation in a linear, circular, or hyperbolic coordinate system depending on which function is to be calculated. The only operations required are shifting, adding, subtracting and the recall of prestored constants. The limited domain of convergence of the algorithm is calculated, leading to a discussion of the modifications required to extend the domain for floating point calculations.

Journal ArticleDOI
TL;DR: The CORDIC iteration is applied to several Fourier transform algorithms and a new, especially attractive FFT computer architecture is presented as an example of the utility of this technique.
Abstract: The CORDIC iteration is applied to several Fourier transform algorithms. The number of operations is found as a function of transform method and radix representation. Using these representations, several hardware configurations are examined for cost, speed, and complexity tradeoffs. A new, especially attractive FFT computer architecture is presented as an example of the utility of this technique. Compensated and modified CORDIC algorithms are also developed.

Proceedings ArticleDOI
15 Jun 2003
TL;DR: This work introduces a new approach to decimal floating point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard.
Abstract: Decimal arithmetic is the norm in human calculations, and human centric applications must use a decimal floating point arithmetic to achieve the same results. Initial benchmarks indicate that some applications spend 50% to 90% of their time in decimal processing, because software decimal arithmetic suffers a 100× to 1000× performance penalty over hardware. The need for decimal floating point in hardware is urgent. Existing designs, however, either fail to conform to modern standards or are incompatible with the established rules of decimal arithmetic. We introduce a new approach to decimal floating point which not only provides the strict results which are necessary for commercial applications but also meets the constraints and requirements of the IEEE 854 standard. A hardware implementation of this arithmetic is in development, and it is expected that this will significantly accelerate a wide variety of applications.

Journal ArticleDOI
TL;DR: A monolithic processor computes products, quotients, and several common transcendental functions, based on the well-known principles of "CORDIC," but recourse to a subtle novel corollary results in a scale factor of unity.
Abstract: A monolithic processor computes products, quotients, and several common transcendental functions. The algorithms are based on the well-known principles of "CORDIC," but recourse to a subtle novel corollary results in a scale factor of unity. Compared to older machines, the overhead burden is significantly reduced. Also, expansion of the functional repertoire beyond the circular domain, i.e., addition to the menu of hyperbolic and linear operations, is a relatively trivial matter, in terms of both hardware cost and execution time. A bulk CMOS technology with conservative layout rules is used for the sake of high reliability, low-power consumption, and good cycle speed.

Frequently Asked Questions (2)
Q1. What are the contributions mentioned in the paper "A high performance architecture for rotating decimal coordinates" ?

In this paper, a modification of the CORDIC method for decimal arithmetic is proposed so as to produce fast rotations. 

As a future work, an interesting task consists in developing a hardware implementation of a specific CORDIC-based rotator embedded on a decimal architecture.