IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 56, NO. 9, SEPTEMBER 2009 1893
50 Years of CORDIC: Algorithms, Architectures,
and Applications
Pramod K. Meher, Senior Member, IEEE, Javier Valls, Member, IEEE, Tso-Bing Juang, Member, IEEE,
K. Sridharan, Senior Member, IEEE, and Koushik Maharatna, Member, IEEE
Abstract—The year 2009 marks the completion of 50 years since the invention of CORDIC (COordinate Rotation DIgital Computer) by Jack E. Volder. The beauty of CORDIC lies in the fact that, by simple shift-add operations, it can perform several computing tasks such as the calculation of trigonometric, hyperbolic and logarithmic functions, real and complex multiplications, division, square-root, solution of linear systems, eigenvalue estimation, singular value decomposition, QR factorization and many others. As a consequence, CORDIC has been utilized for applications in diverse areas such as signal and image processing, communication systems, robotics and 3-D graphics apart from general scientific and technical computation. In this article, we present a brief overview of the key developments in the CORDIC algorithms and architectures along with their potential and upcoming applications.
Index Terms—Arithmetic circuits, CORDIC, CORDIC algorithms, digital signal processing chip, VLSI.
I. INTRODUCTION
COordinate Rotation DIgital Computer is abbreviated as CORDIC. The key concept of CORDIC arithmetic is based on the simple and ancient principles of two-dimensional geometry. But the iterative formulation of a computational algorithm for its implementation was first described in 1959 by Jack E. Volder [1], [2] for the computation of trigonometric functions, multiplication and division. This year therefore marks the completion of 50 years of the CORDIC algorithm. Not only has a wide variety of applications of CORDIC emerged in the last 50 years, but a lot of progress has also been made in the area of algorithm design and development of architectures for high-performance and low-cost hardware solutions of those applications. CORDIC-based computing received increased attention in 1971, when John Walther [3], [4] showed that, by varying a few simple parameters, it could be used as a single algorithm for unified implementation of a wide range of elementary transcendental functions involving logarithms, exponentials, and square roots along with those suggested by Volder [1]. Around the same time, Cochran [5] benchmarked various algorithms, and showed that the CORDIC technique is a better choice for scientific calculator applications.

Manuscript received August 22, 2008; revised November 26, 2008 and April 10, 2009. First published June 19, 2009; current version published September 02, 2009. This paper was recommended by Associate Editor V. Öwall.
P. K. Meher is with the Department of Communication Systems, Institute for Infocomm Research, Singapore 138632 (e-mail: pkmeher@i2r.a-star.edu.sg).
J. Valls is with Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universidad Politécnica de Valencia, 46730 Grao de Gandia, Spain (e-mail: jvalls@eln.upv.es).
T.-B. Juang is with the Department of Computer Science and Information Engineering, National Pingtung Institute of Commerce, Pingtung City, Taiwan 900 (e-mail: tsobing@npic.edu.tw).
K. Sridharan is with the Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai 600036, India (e-mail: sridhara@iitm.ac.in).
K. Maharatna is with the School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, U.K. (e-mail: km3@ecs.soton.ac.uk).
Digital Object Identifier 10.1109/TCSI.2009.2025803

The popularity of CORDIC was very much enhanced thereafter primarily due to its potential for efficient and low-cost implementation of a large class of applications which include: the generation of trigonometric, logarithmic and transcendental elementary functions; complex number multiplication, eigenvalue computation, matrix inversion, solution of linear systems and singular value decomposition (SVD) for signal processing, image processing, and general scientific computation. Some other popular and upcoming applications are:
1) direct frequency synthesis, digital modulation and coding for speech/music synthesis and communication;
2) direct and inverse kinematics computation for robot manipulation;
3) planar and three-dimensional vector rotation for graphics and animation.
Although CORDIC may not be the fastest technique to perform these operations, it is attractive due to the simplicity of its hardware implementation, since the same iterative algorithm could be used for all these applications using the basic shift-add operations of the form $a \pm b \cdot 2^{-i}$.
Keeping the requirements and constraints of different application environments in view, the development of CORDIC algorithms and architectures has aimed at achieving a high throughput rate and reducing hardware complexity as well as the latency of implementation. Some of the typical approaches for reduced-complexity implementation focus on minimizing the complexity of the scaling operation and the complexity of the barrel-shifter in the CORDIC engine. Latency of implementation is an inherent drawback of the conventional CORDIC algorithm. Angle recoding schemes, mixed-grain rotation and higher radix CORDIC have been developed for reduced latency realization. Parallel and pipelined CORDIC have been suggested for high-throughput computation. The objective of this article is not to present a detailed survey of the developments of algorithms, architectures and applications of CORDIC, which would require a few doctoral and masters level dissertations. Rather we aim at providing the key developments in algorithms and architectures along with an overview of the major application areas and upcoming applications. We shall however discuss here the basic principles of CORDIC operations for the benefit of general readers.
1549-8328/$26.00 © 2009 IEEE
The remainder of this paper is organized as follows. In Section II, we discuss the principles of CORDIC operation, covering the elementary ideas from coordinate transformation to rotation mode and vectoring mode operations followed by design of the basic CORDIC cell and multidimensional CORDIC. The key developments in CORDIC algorithms and architectures are discussed in Section III, which covers the algorithms and architectures pertaining to higher-radix CORDIC, angle recoding, coarse-fine hybrid micro-rotations, redundant number representation, differential CORDIC, and pipeline implementation. In Section IV, we discuss the scaling and accuracy aspects including the scaling techniques, scaling-free CORDIC, quantization and area-delay-accuracy trade-off. The applications of CORDIC to scientific computations, signal processing, communications, robotics and graphics are discussed briefly in Section V. The conclusion and future research directions are presented in Section VI.
II. BASIC CORDIC TECHNIQUES
In this section, we discuss the basic principle underlying CORDIC-based computation, and present its iterative algorithm for different operating modes and planar coordinate systems. At the end of this section, we discuss the extension of two-dimensional rotation to a multidimensional formulation.
A. The CORDIC Algorithm
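As shown in Fig. 1, the rotation of a two-dimensional vector $\mathbf{p}_0 = [x_0\ y_0]^T$ through an angle $\theta$, to obtain a rotated vector $\mathbf{p}_n = [x_n\ y_n]^T$, could be performed by the matrix product $\mathbf{p}_n = \mathbf{R}\mathbf{p}_0$, where $\mathbf{R}$ is the rotation matrix:
$$\mathbf{R} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \tag{1}$$
By factoring out the cosine term in (1), the rotation matrix $\mathbf{R}$ can be rewritten as
$$\mathbf{R} = \left[(1+\tan^2\theta)^{-1/2}\right] \begin{bmatrix} 1 & -\tan\theta \\ \tan\theta & 1 \end{bmatrix} \tag{2}$$
and can be interpreted as a product of a scale-factor $K = (1+\tan^2\theta)^{-1/2}$ with a pseudorotation matrix $\mathbf{R}_c$, given by
$$\mathbf{R}_c = \begin{bmatrix} 1 & -\tan\theta \\ \tan\theta & 1 \end{bmatrix} \tag{3}$$
The pseudorotation operation rotates the vector $\mathbf{p}_0$ by an angle $\theta$ and changes its magnitude by a factor $1/K = 1/\cos\theta$, to produce a pseudo-rotated vector $\mathbf{p}'_n = \mathbf{R}_c\mathbf{p}_0$.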
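As a small worked check of this factorization (a sketch only, writing $K$ for the scale-factor and $\mathbf{R}_c$ for the pseudorotation matrix, and choosing $\theta = \pi/4$ and $\mathbf{p}_0 = [1\ 0]^T$ purely for illustration):

$$K = (1+\tan^2(\pi/4))^{-1/2} = \frac{1}{\sqrt{2}}, \qquad \mathbf{R}_c\mathbf{p}_0 = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

$$K\,\mathbf{R}_c\mathbf{p}_0 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} \cos(\pi/4) \\ \sin(\pi/4) \end{bmatrix}$$

The pseudo-rotated vector $[1\ 1]^T$ points in the correct direction but has magnitude $\sqrt{2} = 1/K$; multiplying by $K$ restores unit length.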
To achieve simplicity of hardware realization of the rotation, the key ideas used in CORDIC arithmetic are (i) to decompose the rotations into a sequence of elementary rotations through predefined angles that could be implemented with minimum hardware cost; and (ii) to avoid scaling, which might involve arithmetic operations such as square-root and division. The second idea is based on the fact that the scale-factor contains only the magnitude information but no information about the angle of rotation.
Fig. 1. Rotation of vector on a two-dimensional plane.
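1) Iterative Decomposition of Angle of Rotation: The CORDIC algorithm performs the rotation iteratively by breaking down the angle of rotation into a set of small pre-defined angles¹, $\alpha_i = \arctan(2^{-i})$, so that $\tan\alpha_i = 2^{-i}$ could be implemented in hardware by shifting through $i$ bit locations. Instead of performing the rotation directly through an angle $\theta$, CORDIC performs it by a certain number of microrotations through angle $\alpha_i$, where
$$\theta = \sum_{i=0}^{n-1} \sigma_i\alpha_i \quad\text{and}\quad \sigma_i = \pm 1 \tag{4}$$
that satisfies the CORDIC convergence theorem [3]: $\alpha_i - \sum_{j=i+1}^{n-1}\alpha_j < \alpha_{n-1}$ for all $i$, $i = 0, 1, \ldots, n-2$. But the decomposition according to (4) could be used only for $-1.74329 \le \theta \le 1.74329$ (called the "convergence range") since $\sum_{i=0}^{\infty}\alpha_i \approx 1.7432866$. Therefore, the angular decomposition of (4) is applicable for angles in the first and fourth quadrants. To obtain on-the-fly decomposition of angles into the discrete base $\alpha_i$, one may otherwise use the nonrestoring decomposition [6]
$$\omega_0 = 0 \quad\text{and}\quad \omega_{i+1} = \omega_i + \sigma_i\alpha_i \tag{5}$$
with $\sigma_i = 1$ if $\theta \ge \omega_i$ and $\sigma_i = -1$ otherwise, where the rotation matrix for the $i$th iteration corresponding to the selected angle $\alpha_i$ is given by
$$\mathbf{R}(i) = K_i \begin{bmatrix} 1 & -\sigma_i 2^{-i} \\ \sigma_i 2^{-i} & 1 \end{bmatrix} \tag{6}$$
$K_i = 1/\sqrt{1+2^{-2i}}$ being the scale-factor, and the pseudorotation matrix
$$\mathbf{R}_c(i) = \begin{bmatrix} 1 & -\sigma_i 2^{-i} \\ \sigma_i 2^{-i} & 1 \end{bmatrix} \tag{7}$$
Note that the pseudo-rotation matrix $\mathbf{R}_c(i)$ for the $i$th iteration alters the magnitude of the rotated vector by a factor $1/K_i = \sqrt{1+2^{-2i}}$ during the $i$th microrotation, which is independent of the value of $\sigma_i$ (direction of microrotation) used in the angle decomposition.
¹ All angles are measured in radians unless otherwise stated.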
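The nonrestoring decomposition can be illustrated with a short floating-point sketch. It assumes elementary angles arctan(2^-i) for i = 0, ..., n-1; the function name `decompose_angle` and the test angle 0.5 rad are illustrative choices, not taken from the paper.

```python
import math

def decompose_angle(theta, n):
    """Nonrestoring decomposition of theta into n signed elementary angles."""
    alphas = [math.atan(2.0 ** -i) for i in range(n)]
    omega = 0.0  # accumulated angle, starting from zero
    sigmas = []
    for alpha in alphas:
        # Always rotate (never skip): step towards theta from the current sum.
        sigma = 1 if theta >= omega else -1
        omega += sigma * alpha
        sigmas.append(sigma)
    return sigmas, omega

n = 16
convergence_limit = sum(math.atan(2.0 ** -i) for i in range(n))
sigmas, approx = decompose_angle(0.5, n)
print(convergence_limit)   # partial sum of the elementary angles
print(abs(approx - 0.5))   # residual is bounded by the smallest angle
```

For any angle inside the convergence range, the residual after n microrotations stays within the last elementary angle, which is why roughly one extra bit of angular resolution is gained per iteration.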
Fig. 2. Hardware implementation of a CORDIC iteration.
2) Avoidance of Scaling: The other simplification performed by Volder's algorithm [1] is to remove the scale-factor $K_i = 1/\sqrt{1+2^{-2i}}$ from (6). The removal of scaling from the iterative microrotations leads to a pseudo-rotated vector $\mathbf{p}'_n = K^{-1}\mathbf{p}_n$ instead of the desired rotated vector $\mathbf{p}_n$, where the scale-factor $K$ is given by
$$K = \prod_{i=0}^{n-1} K_i = \prod_{i=0}^{n-1} \frac{1}{\sqrt{1+2^{-2i}}} \tag{8}$$
Since the scale-factor of the microrotations does not depend on the direction of the microrotations and approaches unity monotonically as $i$ increases, the final scale-factor $K$ converges to $K \approx 0.60725$ (that is, $1/K \approx 1.64676$). Therefore, instead of scaling during each microrotation, the magnitude of the final output could be scaled by $K$. The basic CORDIC iterations are thus obtained by applying the pseudo-rotation of the vector, $\mathbf{p}_{i+1} = \mathbf{R}_c(i)\mathbf{p}_i$, together with the nonrestoring decomposition of the selected angles $\alpha_i$, as follows:
$$\begin{aligned} x_{i+1} &= x_i - \sigma_i 2^{-i} y_i \\ y_{i+1} &= y_i + \sigma_i 2^{-i} x_i \\ \omega_{i+1} &= \omega_i - \sigma_i\alpha_i \end{aligned} \tag{9}$$
CORDIC iterations of (9) could be used in two operating modes, namely the rotation mode (RM) and the vectoring mode (VM), which differ basically in how the directions of the microrotations are chosen. In the rotation mode, a vector $\mathbf{p}_0$ is rotated by an angle $\theta$ to obtain a new vector $\mathbf{p}'_n$. In this mode, with $\omega_0 = \theta$, the direction of each microrotation $\sigma_i$ is determined by the sign of $\omega_i$: if the sign of $\omega_i$ is positive, then $\sigma_i = 1$, otherwise $\sigma_i = -1$. In the vectoring mode, the vector $\mathbf{p}_0$ is rotated towards the $x$-axis so that the $y$-component approaches zero. The sum of all angles of microrotations (output angle $\omega_n$, with $\omega_0 = 0$) is equal to the angle of rotation of vector $\mathbf{p}_0$, while output $x'_n$ corresponds to its magnitude. In this operating mode, the decision about the direction of the microrotation depends on the sign of $y_i$: if it is positive then $\sigma_i = -1$, otherwise $\sigma_i = 1$. CORDIC iterations are easily implemented in both software and hardware. Fig. 2 shows the basic hardware stage for a single CORDIC iteration. After each iteration, the number of shifts performed by the pair of barrel-shifters is incremented. To have an $n$-bit output precision, $(n+1)$ CORDIC iterations are needed. Note that the shift could be implemented by a simple selection operation in serial architectures like the one proposed in the original work, while in fully parallel CORDIC architectures the shift operations could be hardwired, so that no barrel-shifters are involved.
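The two operating modes can be sketched in a few lines of floating-point Python. This is only an illustration under stated assumptions: elementary angles arctan(2^-i), the scale-factor applied once at the end, and the function name `cordic`, the iteration count 24 and the sample inputs all chosen here for demonstration. A hardware design would use fixed-point shifts and a stored angle table instead of `2.0 ** -i` and `math.atan`.

```python
import math

def cordic(x, y, omega, n, mode):
    """Apply n CORDIC iterations, then compensate the gain once at the end.

    mode="rotation":  omega is the angle to rotate by (driven towards 0).
    mode="vectoring": y is driven towards 0 and omega accumulates the angle.
    """
    scale = 1.0  # running product of 1/sqrt(1 + 2^-2i)
    for i in range(n):
        if mode == "rotation":
            sigma = 1 if omega >= 0 else -1
        else:
            sigma = -1 if y >= 0 else 1
        t = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        x, y = x - sigma * t * y, y + sigma * t * x
        omega -= sigma * math.atan(t)
        scale /= math.sqrt(1.0 + t * t)
    return x * scale, y * scale, omega

xr, yr, _ = cordic(1.0, 0.0, math.pi / 6, 24, "rotation")
mag, _, phase = cordic(3.0, 4.0, 0.0, 24, "vectoring")
print(xr, yr)      # close to cos(pi/6), sin(pi/6)
print(mag, phase)  # close to 5 and atan2(4, 3)
```

The same loop body serves both modes; only the rule for choosing the direction of each microrotation changes.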
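Finally, to overcome the problem of the limited convergence range and thus extend the CORDIC rotations to the complete range of $\pm\pi$, an extra iteration is required to be performed. This new iteration, shown in (10), is required as an initial rotation through $\pm\pi/2$.
$$\begin{aligned} x_0 &= -\sigma_{-1}\, y_{-1} \\ y_0 &= \sigma_{-1}\, x_{-1} \\ \omega_0 &= \omega_{-1} - \sigma_{-1}\alpha_{-1} \end{aligned} \qquad \text{where } \alpha_{-1} = \pi/2 \tag{10}$$
TABLE I. GENERALIZED CORDIC ALGORITHM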
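A minimal sketch of this range extension is given below. It assumes that the initial step is an exact rotation through plus or minus pi/2 (unit gain, so it does not affect the scale-factor) chosen by the sign of the input angle; the function name `rotate_full_range` is hypothetical.

```python
import math

def rotate_full_range(x, y, theta, n=24):
    """Rotate (x, y) by theta in [-pi, pi] using an initial +/- pi/2 step."""
    # Extra initial iteration: exact rotation through +/- pi/2.
    sigma = 1 if theta >= 0 else -1
    x, y = -sigma * y, sigma * x
    omega = theta - sigma * math.pi / 2
    # Conventional rotation-mode iterations on the remaining angle.
    scale = 1.0
    for i in range(n):
        sigma = 1 if omega >= 0 else -1
        t = 2.0 ** -i
        x, y = x - sigma * t * y, y + sigma * t * x
        omega -= sigma * math.atan(t)
        scale /= math.sqrt(1.0 + t * t)
    return x * scale, y * scale

xa, ya = rotate_full_range(1.0, 0.0, 2.5)
xb, yb = rotate_full_range(1.0, 0.0, -3.0)
print(xa, ya)  # close to cos(2.5), sin(2.5)
print(xb, yb)  # close to cos(-3.0), sin(-3.0)
```

After the initial step the remaining angle has magnitude at most pi/2, which lies inside the convergence range of the conventional iterations.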
B. Generalization of the CORDIC Algorithm
In 1971, Walther found how CORDIC iterations could be modified to compute hyperbolic functions [3] and reformulated the CORDIC algorithm into a generalized and unified form which is suitable to perform rotations in circular, hyperbolic and linear coordinate systems. The unified formulation includes a new variable $m$, which is assigned different values for different coordinate systems. The generalized CORDIC is formulated as follows:
$$\begin{aligned} x_{i+1} &= x_i - m\,\sigma_i 2^{-i} y_i \\ y_{i+1} &= y_i + \sigma_i 2^{-i} x_i \\ \omega_{i+1} &= \omega_i - \sigma_i\alpha_i \end{aligned} \tag{11}$$
where
$$\sigma_i = \begin{cases} \operatorname{sign}(\omega_i), & \text{for rotation mode} \\ -\operatorname{sign}(y_i), & \text{for vectoring mode} \end{cases}$$
For $m = 1$, $0$ or $-1$, and $\alpha_i = \tan^{-1}(2^{-i})$, $2^{-i}$ or $\tanh^{-1}(2^{-i})$, the algorithm given by (11) works in circular, linear or hyperbolic coordinate systems, respectively. Table I summarizes the operations that can be performed in rotation and vectoring modes² in each of these coordinate systems. The convergence range of linear and hyperbolic CORDIC is obtained, as in the case of the circular coordinate system, by the sum of all $\alpha_i$, given by $C = \sum_i \alpha_i$. The hyperbolic CORDIC requires the iterations for $i = 4, 13, 40, \ldots$ to be executed twice to ensure convergence. Consequently, these repetitions must be considered while computing the scale-factor $K_h = \prod_i \sqrt{1-2^{-2i}}$, which converges to 0.8281.
² In the rotation mode, the components of a vector resulting from the rotation of a vector through a given angle are derived, while in the vectoring mode the magnitude as well as the phase angle of a vector are estimated from the component values. The rotation and vectoring modes are also known as the vector rotation mode and the angle accumulation mode, respectively.
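The unified iteration can be sketched with a single routine parameterized by the coordinate-system variable m. The code below is an illustrative floating-point sketch, not the paper's architecture: the names `iteration_indices`, `elementary_angle` and `unified_cordic` are hypothetical, the hyperbolic iterations are assumed to start at i = 1 with indices 4, 13, 40, ... repeated, and the routine returns the unscaled outputs together with the accumulated gain of the pseudo-rotations.

```python
import math

def iteration_indices(m, n):
    """Shift indices; the hyperbolic case repeats 4, 13, 40, ... (k -> 3k + 1)."""
    if m != -1:
        return list(range(n))
    indices, repeat = [], 4
    for i in range(1, n + 1):
        indices.append(i)
        if i == repeat:
            indices.append(i)
            repeat = 3 * repeat + 1
    return indices

def elementary_angle(m, i):
    t = 2.0 ** -i
    if m == 1:
        return math.atan(t)   # circular
    if m == 0:
        return t              # linear
    return math.atanh(t)      # hyperbolic

def unified_cordic(x, y, omega, m, mode, n=32):
    """Return unscaled (x, y, omega) and the gain of the pseudo-rotations."""
    gain = 1.0
    for i in iteration_indices(m, n):
        if mode == "rotation":
            sigma = 1 if omega >= 0 else -1
        else:
            sigma = -1 if y >= 0 else 1
        t = 2.0 ** -i
        x, y = x - m * sigma * t * y, y + sigma * t * x
        omega -= sigma * elementary_angle(m, i)
        gain *= math.sqrt(1.0 + m * t * t)
    return x, y, omega, gain

_, product, _, _ = unified_cordic(3.0, 0.0, 0.75, 0, "rotation")
xh, yh, _, gh = unified_cordic(1.0, 0.0, 0.5, -1, "rotation")
print(product)           # linear mode: approximates 3 * 0.75
print(xh / gh, yh / gh)  # hyperbolic mode: approximates cosh(0.5), sinh(0.5)
print(gh)                # hyperbolic gain, including the repeated iterations
```

In the linear case the gain is exactly one, so no scaling is needed for multiplication; in the hyperbolic case the repeated iterations contribute to the gain and must be included when it is precomputed.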
C. Multidimensional CORDIC
The CORDIC algorithm was extended to higher dimensions using simple Householder reflection [7]. The Householder reflection matrix is defined as
$$\mathbf{H}_m = \mathbf{I}_m - 2\,\frac{\mathbf{u}\mathbf{u}^T}{\mathbf{u}^T\mathbf{u}} \tag{12}$$
where $\mathbf{u}$ is an $m$-dimensional vector and $\mathbf{I}_m$ is the $m \times m$ identity matrix. The product $\mathbf{H}_m\mathbf{v}$ reflects the $m$-dimensional vector $\mathbf{v}$ with respect to the hyperplane with normal $\mathbf{u}$ that passes through the origin. Basically, the Householder-based CORDIC performs the vectoring operation of an $m$-dimensional vector to one of the axes.
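For the sake of clarity, we consider here the case of a 3-D vector $\mathbf{p}_0 = [x_0\ y_0\ z_0]^T$ projected on to the $x$-axis in the Euclidean space. The rotation matrix for the 3-D case, corresponding to the $i$th iteration, $\mathbf{R}_{H3}(i)$, is given by the product of two simple Householder reflections as
$$\mathbf{R}_{H3}(i) = \left[\mathbf{I}_3 - 2\,\frac{\mathbf{e}_1\mathbf{e}_1^T}{\mathbf{e}_1^T\mathbf{e}_1}\right]\left[\mathbf{I}_3 - 2\,\frac{\mathbf{u}_i\mathbf{u}_i^T}{\mathbf{u}_i^T\mathbf{u}_i}\right] \tag{13}$$
where $\mathbf{e}_1 = [1\ 0\ 0]^T$, and $\mathbf{u}_i = [1\ \ \sigma_{y_i}t_i\ \ \sigma_{z_i}t_i]^T$ with $t_i = \tan\alpha_i = 2^{-i}$, and $\sigma_{y_i} = \operatorname{sign}(x_i y_i)$ and $\sigma_{z_i} = \operatorname{sign}(x_i z_i)$ being the directions of microrotations. One can write the $i$th rotation matrix in terms of the pseudo-rotation matrix as $\mathbf{R}_{H3}(i) = K_{H_i}\mathbf{R}_{Hc3}(i)$, where $K_{H_i} = 1/(1+2^{-2i+1})$ is the scale-factor and $\mathbf{R}_{Hc3}(i)$ is the pseudo-rotation matrix which could be expressed as a function of the shifting and decision variables as
$$\mathbf{R}_{Hc3}(i) = \begin{bmatrix} 1-2^{-2i+1} & \sigma_{y_i}2^{-i+1} & \sigma_{z_i}2^{-i+1} \\ -\sigma_{y_i}2^{-i+1} & 1 & -\sigma_{y_i}\sigma_{z_i}2^{-2i+1} \\ -\sigma_{z_i}2^{-i+1} & -\sigma_{y_i}\sigma_{z_i}2^{-2i+1} & 1 \end{bmatrix} \tag{14}$$
Therefore, the $i$th iteration of the 3-D Householder CORDIC rotation results in $\mathbf{p}_{i+1} = \mathbf{R}_{Hc3}(i)\mathbf{p}_i$, and the vector is projected on to the $x$-axis, such that after $n$ iterations $x_n$ gives the length of the vector scaled by $1/\prod_i K_{H_i}$ with $n$-bit precision [8].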
III. ADVANCED CORDIC ALGORITHMS AND ARCHITECTURES
CORDIC computation is inherently sequential due to two main bottlenecks: 1) the micro-rotation for any iteration is performed on the intermediate vector computed by the previous iteration and 2) the $(i+1)$th iteration could be started only after the completion of the $i$th iteration, since the value of $\sigma_{i+1}$ which is required to start the $(i+1)$th iteration could be known only after the completion of the $i$th iteration. To alleviate the second bottleneck some attempts have been made for evaluation of $\sigma_i$ values corresponding to small micro-rotation angles [9], [10]. However, the CORDIC iterations still could not be performed in parallel due to the first bottleneck. A partial parallelization has been realized in [11] by combining a pair of conventional CORDIC iterations into a single merged iteration which provides better area-delay efficiency. But the accuracy is slightly affected by such merging, and the approach cannot be extended to a higher number of conventional CORDIC iterations since the induced error becomes unacceptable [11]. Parallel realization of CORDIC iterations to handle the first bottleneck by direct unfolding of micro-rotations is possible, but that would result in an increase in computational complexity, and the advantage of simplicity of the CORDIC algorithm gets degraded [12], [13]. Although no popular architectures are known to us for fully parallel implementation of CORDIC, different forms of pipelined implementation of CORDIC have however been proposed for improving the computational throughput [14].
Since the CORDIC algorithm exhibits linear-rate convergence, it requires $(n+1)$ iterations to have $n$-bit precision of the output. Overall latency of the computation thus amounts to the product of the word-length and the CORDIC iteration period. The speed of CORDIC operations is therefore constrained either by the precision requirement (iteration count) or the duration of the clock period. The duration of the clock period, on the other hand, mainly depends on the large carry propagation time for the addition/subtraction during each micro-rotation. It is a straightforward choice to use fast adders for reducing the iteration period at the expense of large silicon area. Use of a carry-save adder is a good option to reduce the iteration period and overall latency [15]. Timmermann and others have suggested a method of truncation of the CORDIC algorithm after $(n-1)/2$ iterations (for $n$-bit precision), where the last iteration performs a single rotation for implementing the remaining angle. It lowers the latency time but involves one multiplication or division, respectively, in the rotation or vectoring mode [9].
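The idea of truncating the iterations and finishing with one rotation through the remaining angle can be sketched as follows. This is an illustration under assumptions and not the method of [9] in detail: the truncation point is taken as (n - 1) // 2 iterations, the final step uses the small-angle approximation tan(w) ~ w for the residual angle w, and the function name `truncated_rotation` is hypothetical.

```python
import math

def truncated_rotation(x, y, theta, n):
    """Rotation mode stopped early, followed by one residual-angle rotation."""
    scale = 1.0
    for i in range((n - 1) // 2):
        sigma = 1 if theta >= 0 else -1
        t = 2.0 ** -i
        x, y = x - sigma * t * y, y + sigma * t * x
        theta -= sigma * math.atan(t)
        scale /= math.sqrt(1.0 + t * t)
    # Single final rotation through the remaining (small) angle; this is the
    # step that costs a multiplication instead of further shift-add iterations.
    x, y = x - theta * y, y + theta * x
    return x * scale, y * scale

xt, yt = truncated_rotation(1.0, 0.0, 0.7, 16)
print(xt, yt)  # approximates cos(0.7), sin(0.7)
```

The accuracy of the final step depends on how small the residual angle is at the truncation point, so the choice of that point trades latency against precision.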
To handle latency bottlenecks, various techniques have been developed and reported in the literature. Most of the well-known algorithms could be grouped under high-radix CORDIC, the angle-recoding method, the hybrid micro-rotation scheme, redundant CORDIC and differential CORDIC, which we discuss briefly in the following subsections.
A. Higher Radix CORDIC Algorithm
The radix-4 CORDIC algorithm [16] is given by
$$\begin{aligned} x_{i+1} &= x_i - \sigma_i 4^{-i} y_i \\ y_{i+1} &= y_i + \sigma_i 4^{-i} x_i \\ \omega_{i+1} &= \omega_i - \alpha_i[\sigma_i] \end{aligned} \tag{15}$$
where $\sigma_i \in \{-2, -1, 0, 1, 2\}$ and the elementary angles $\alpha_i[\sigma_i] = \tan^{-1}(\sigma_i 4^{-i})$. The scale-factor for the $i$th iteration is $K_i = 1/\sqrt{1+\sigma_i^2 4^{-2i}}$. In order to preserve the norm of the vector, the output of the micro-rotations is required to be scaled by a factor
$$K = \prod_i K_i = \prod_i \left(1+\sigma_i^2 4^{-2i}\right)^{-1/2} \tag{16}$$
To have $n$-bit output precision, the radix-4 CORDIC algorithm requires $n/2$ micro-rotations, which is half that of the radix-2 algorithm. However, it requires more computation time for each iteration and involves more hardware compared to the radix-2 CORDIC to select the value of $\sigma_i$ out of five different possibilities. Moreover, the scale-factor, given by (16), also varies with the rotation angles since it depends on $\sigma_i$, which could have any of the five different values. Some techniques have therefore been suggested for scale-factor compensation through iterative shift-add operations [16], [17]. A high-radix CORDIC algorithm in vectoring mode is also suggested in [18], which can be used for reduced latency operation at the cost of larger tables for storing the elementary angles and pre-scaling factors than the radix-2 and radix-4 implementations.
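A floating-point sketch of radix-4 rotation is given below. The digit selection here simply picks the nearest of the five candidate angles, which is a software stand-in chosen for clarity and not the selection logic of [16]; the scale-factor is accumulated at run time because it depends on the digits chosen. The function name `radix4_rotate` is hypothetical.

```python
import math

def radix4_rotate(x, y, theta, n):
    """Rotate (x, y) by theta using n // 2 radix-4 micro-rotations."""
    omega, scale = theta, 1.0
    for i in range(n // 2):
        t = 4.0 ** -i
        # Pick the digit whose elementary angle is closest to the residual.
        sigma = min((-2, -1, 0, 1, 2),
                    key=lambda s: abs(omega - math.atan(s * t)))
        x, y = x - sigma * t * y, y + sigma * t * x
        omega -= math.atan(sigma * t)
        # The scale-factor is data dependent: it varies with sigma.
        scale /= math.sqrt(1.0 + sigma * sigma * t * t)
    return x * scale, y * scale, omega

x4, y4, residual4 = radix4_rotate(1.0, 0.0, 0.7, 16)
print(x4, y4)     # approximates cos(0.7), sin(0.7) in 8 iterations
print(residual4)  # remaining angle after the last micro-rotation
```

Multiplication by a digit of magnitude 2 is still a shift, so each iteration remains shift-add, but the digit selection and the variable scale-factor are the added costs mentioned above.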
B. Angle Recoding (AR) Methods
The purpose of angle recoding (AR) is to reduce the number of CORDIC iterations by encoding the angle of rotation as a linear combination of a set of selected elementary angles of micro-rotations. AR methods are well-suited for many signal processing and image processing applications where the rotation angle is known a priori, such as when performing discrete orthogonal transforms like the discrete Fourier transform (DFT), the discrete cosine transform (DCT), etc.
1) Elementary-Angle-Set Recoding: In the conventional CORDIC, any given rotation angle is expressed as a linear combination of $n$ values of elementary angles that belong to the set $S = \{(\sigma \cdot \arctan 2^{-r}) : \sigma \in \{-1, 1\},\ r \in \{0, 1, \ldots, n-1\}\}$ in order to obtain an $n$-bit value as $\theta = \sum_{i=0}^{n-1} \sigma_i \arctan 2^{-i}$. However, in AR methods, this constraint is relaxed by adding zeros to the linear combination to obtain the desired angle using relatively fewer terms of the form $(\sigma \cdot \arctan 2^{-r})$ for $\sigma \in \{-1, 0, 1\}$. The elementary-angle-set (EAS) used by the AR scheme is given by $S_{EAS} = \{(\sigma \cdot \arctan 2^{-r}) : \sigma \in \{-1, 0, 1\},\ r \in \{0, 1, \ldots, n-1\}\}$. One of the simplest forms of the angle recoding method, based on the greedy algorithm proposed by Hu and Naganathan [19], tries to represent the remaining angle using the closest elementary angle $\pm\arctan 2^{-i}$. The angle recoding algorithm of [19] is briefly stated in Table II. Using this recoding scheme the total number of iterations could be reduced by at least 50% while keeping the same $n$-bit accuracy. A similar method of angle recoding in vectoring mode, called backward angle recoding, is suggested in [20].
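The greedy recoding idea can be sketched as follows. This is an illustrative reading of the greedy approach rather than a transcription of Table II: it assumes elementary angles arctan(2^-i) for i = 0, ..., n-1, an input angle of magnitude at most pi/4, and termination once the residual is smaller than the smallest elementary angle. The function name `greedy_angle_recoding` is hypothetical.

```python
import math

def greedy_angle_recoding(theta, n):
    """Return [(index, sign), ...] of the selected angles and the residual."""
    alphas = [math.atan(2.0 ** -i) for i in range(n)]
    residual, terms = theta, []
    # Stop when the residual is below the smallest elementary angle.
    while abs(residual) >= alphas[-1] and len(terms) < n:
        # Closest elementary angle to the magnitude of the residual.
        i = min(range(n), key=lambda j: abs(abs(residual) - alphas[j]))
        sigma = 1 if residual >= 0 else -1
        terms.append((i, sigma))
        residual -= sigma * alphas[i]
    return terms, residual

terms, residual = greedy_angle_recoding(0.7, 16)
print(len(terms), terms)  # far fewer than 16 micro-rotations are needed
print(residual)
```

Indices that are never selected correspond to zero coefficients in the recoded angle, and the micro-rotations for those indices are simply skipped.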
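2) Extended Elementary-Angle-Set Recoding: Wu et al. [21] have suggested an AR scheme based on an extended elementary-angle-set (EEAS), which provides a more flexible way of decomposing the target rotation angle. In the EEAS approach, the set $S_{EAS}$ of the elementary-angle set is extended further to $S_{EEAS} = \{\arctan(\sigma_1 2^{-r_1} + \sigma_2 2^{-r_2}) : \sigma_1, \sigma_2 \in \{-1, 0, 1\}$ and $r_1, r_2 \in \{0, 1, \ldots, n-1\}\}$. EEAS has better recoding efficiency in terms of the number of iterations and can yield better error performance than the AR scheme based on EAS. The pseudo-rotation for the $i$th micro-rotation based on the EEAS scheme is given by
$$\begin{aligned} x_{i+1} &= x_i - \left[\sigma_1(i)2^{-r_1(i)} + \sigma_2(i)2^{-r_2(i)}\right] y_i \\ y_{i+1} &= y_i + \left[\sigma_1(i)2^{-r_1(i)} + \sigma_2(i)2^{-r_2(i)}\right] x_i \end{aligned} \tag{17}$$
The pseudo-rotated vector $[x_{R_m}\ y_{R_m}]^T$, obtained after $R_m$ (the required number of micro-rotations) iterations according to (17), needs to be scaled by a factor $K = \prod_i K_i$, where $K_i = \left[1 + \left(\sigma_1(i)2^{-r_1(i)} + \sigma_2(i)2^{-r_2(i)}\right)^2\right]^{-1/2}$, to produce the rotated vector. For reducing the scaling approximation and for a more flexible implementation of scaling, similar to the EEAS scheme for the micro-rotation phase, a method has also been suggested in [21], as given below
$$\begin{aligned} \tilde{x}_{i+1} &= \tilde{x}_i + \left[k_1(i)2^{-s_1(i)} + k_2(i)2^{-s_2(i)}\right]\tilde{x}_i \\ \tilde{y}_{i+1} &= \tilde{y}_i + \left[k_1(i)2^{-s_1(i)} + k_2(i)2^{-s_2(i)}\right]\tilde{y}_i \end{aligned} \tag{18}$$
where $\tilde{x}_0 = x_{R_m}$ and $\tilde{y}_0 = y_{R_m}$, with $k_1, k_2 \in \{-1, 0, 1\}$ and $s_1, s_2 \in \{0, 1, \ldots, n-1\}$. The iterations for the micro-rotation phase as well as the scaling phase could be implemented in the same architecture to reduce the hardware cost, as shown in Fig. 3.
TABLE II. ANGLE RECODING ALGORITHM
Fig. 3. EEAS-based CORDIC architecture. BS represents the Barrel Shifter, and C denotes the control signals for the micro-rotations.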
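The benefit of the extended angle set can be illustrated with a simple sketch. Note the swapped-in technique: the angle selection below is a plain greedy nearest-angle search, used here only for illustration and not the search method of [21], and the final scaling uses an exact square root rather than the shift-add scaling phase. The names `eeas_tangents` and `eeas_rotate`, the word-length 16 and the choice of 4 micro-rotations are all assumptions.

```python
import math
from itertools import product

def eeas_tangents(w):
    """All values s1 * 2^-r1 + s2 * 2^-r2 with s in {-1, 0, 1}, r in 0..w-1."""
    values = set()
    for s1, s2 in product((-1, 0, 1), repeat=2):
        for r1, r2 in product(range(w), repeat=2):
            values.add(s1 * 2.0 ** -r1 + s2 * 2.0 ** -r2)
    return sorted(values)

def eeas_rotate(x, y, theta, steps, w=16):
    """Rotate (x, y) by theta with `steps` micro-rotations from the EEAS."""
    tangents = eeas_tangents(w)
    scale = 1.0
    for _ in range(steps):
        # Greedy choice: extended elementary angle closest to the residual.
        t = min(tangents, key=lambda v: abs(theta - math.atan(v)))
        x, y = x - t * y, y + t * x   # two shift-add terms per update
        theta -= math.atan(t)
        scale /= math.sqrt(1.0 + t * t)
    return x * scale, y * scale, theta

xe, ye, residual_e = eeas_rotate(1.0, 0.0, 0.6, steps=4)
print(xe, ye)      # approximates cos(0.6), sin(0.6) after 4 micro-rotations
print(residual_e)
```

Because each extended angle combines two signed powers of two, every micro-rotation removes a larger share of the residual than a single elementary angle would, at the cost of one extra shift-add per coordinate.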
3) Parallel Angle Recoding: The AR methods [19], [21] could be used to reduce the number of iterations by more than 50%, when the angle of rotation is known in advance. However, for unknown rotation angles, their hardware implementation involves more cycle time than the conventional implementation, which results in a reduction in overall efficacy of the algorithm. To reduce the cycle time of CORDIC iterations in such cases, a parallel angle selection scheme is suggested in [22], which can be used in conjunction with the AR method to gain the advantages of the reduction in iteration count without further increase in the cycle time. The parallel AR scheme in [22] is based on dynamic angle selection, where the elementary angles $\alpha_i$ can be tested in parallel and the direction for the micro-rotations can be determined quickly to minimize the iteration period. During each iteration, the residual angle $\omega_i$ is passed to a set of $n$ adder-subtractor units that compute $\Delta_j = \left|\,|\omega_i| - \arctan 2^{-j}\right|$ for each elementary angle in parallel, and the differences $\Delta_j$ for $j = 0, 1, \ldots, n-1$ are then fed to a binary-tree like structure to compare them against each other to find the smallest