A low-power, high-performance, 1024-point FFT processor

doi:10.1109/4.748190

Home
/
Papers
/
A low-power, high-performance, 1024-point FFT processor

Journal Article•DOI•

A low-power, high-performance, 1024-point FFT processor

Bevan M. Baas¹•Institutions (1)

Stanford University¹

01 Mar 1999-IEEE Journal of Solid-state Circuits (IEEE)-Vol. 34, Iss: 3, pp 380-387

TL;DR: This paper presents an energy-efficient, single-chip, 1024-point fast Fourier transform (FFT) processor, which has been fabricated in a standard 0.7 /spl mu/m CMOS process and is fully functional on first-pass silicon.

read less

Abstract: This paper presents an energy-efficient, single-chip, 1024-point fast Fourier transform (FFT) processor. The 460000-transistor design has been fabricated in a standard 0.7 /spl mu/m (L/sub poly/=0.6 /spl mu/m) CMOS process and is fully functional on first-pass silicon. At a supply voltage of 1.1 V, it calculates a 1024-point complex FFT in 330 /spl mu/s while consuming 9.5 mW, resulting in an adjusted energy efficiency more than 16 times greater than the previously most efficient known FFT processor. At 3.3 V, it operates at 173 MHz-which is a clock rate 2.6 times greater than the previously fastest rate.

...read moreread less

Content maybe subject to copyright Report

Citations

PDF

Open Access

More filters

Journal Article•DOI•

A 180-mV subthreshold FFT processor using a minimum energy design methodology

[...]

Alice Wang¹, Anantha P. Chandrakasan²•Institutions (2)

Texas Instruments¹, Massachusetts Institute of Technology²

03 Jan 2005

TL;DR: New subthreshold logic and memory design methodologies are developed and demonstrated on a fast Fourier transform (FFT) processor that is designed to investigate the estimated minimum energy point.

...read moreread less

Abstract: In emerging embedded applications such as wireless sensor networks, the key metric is minimizing energy dissipation rather than processor speed. Minimum energy analysis of CMOS circuits estimates the optimal operating point of clock frequencies, supply voltage, and threshold voltage according to A. Chandrakasan et al. (see ibid., vol.27, no.4, p.473-84, Apr. 1992). The minimum energy analysis shows that the optimal power supply typically occurs in subthreshold (e.g., supply voltages that are below device thresholds). New subthreshold logic and memory design methodologies are developed and demonstrated on a fast Fourier transform (FFT) processor. The FFT processor uses an energy-aware architecture that allows for variable FFT length (128-1024 point), variable bit-precision (8 b and 16 b) and is designed to investigate the estimated minimum energy point. The FFT processor is fabricated using a standard 0.18-/spl mu/m CMOS logic process and operates down to 180 mV. The minimum energy point for the 16-b 1024-point FFT processor occurs at 350-mV supply voltage where it dissipates 155 nJ/FFT at a clock frequency of 10 kHz.

...read moreread less

619 citations

Cites background from "A low-power, high-performance, 1024..."

...Dedicated low-power FFT processors are desired to sustain low-power requirements of various embedded applications [11]....
[...]

Proceedings Article•DOI•

A 180mV FFT processor using subthreshold circuit techniques

[...]

Alice Wang¹, Anantha P. Chandrakasan¹•Institutions (1)

Massachusetts Institute of Technology¹

13 Sep 2004

TL;DR: Logic and memory design techniques allowing subthreshold operation are developed and demonstrated and the fabricated 1024-point FFT processor operates down to 180mV using a standard 0.18/spl mu/m CMOS logic process while using 155nJ/FFT at the optimal operating point.

...read moreread less

Abstract: Minimizing energy requires scaling supply voltages below device thresholds. Logic and memory design techniques allowing subthreshold operation are developed and demonstrated. The fabricated 1024-point FFT processor operates down to 180mV using a standard 0.18/spl mu/m CMOS logic process while using 155nJ/FFT at the optimal operating point.

...read moreread less

270 citations

Journal Article•DOI•

A 1-GS/s FFT/IFFT processor for UWB applications

[...]

Yu-Wei Lin, Hsuan-Yu Liu, Chen-Yi Lee

25 Jul 2005-IEEE Journal of Solid-state Circuits

TL;DR: A novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems and the proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme.

...read moreread less

Abstract: In this paper, we present a novel 128-point FFT/IFFT processor for ultrawideband (UWB) systems. The proposed pipelined FFT architecture, called mixed-radix multipath delay feedback (MRMDF), can provide a higher throughput rate by using the multidata-path scheme. Furthermore, the hardware costs of memory and complex multipliers in MRMDF are only 38.9% and 44.8% of those in the known FFT processor by means of the delay feedback and the data scheduling approaches. The high-radix FFT algorithm is also realized in our processor to reduce the number of complex multiplications. A test chip for the UWB system has been designed and fabricated using 0.18-/spl mu/m single-poly and six-metal CMOS process with a core area of 1.76/spl times/1.76 mm/sup 2/, including an FFT/IFFT processor and a test module. The throughput rate of this fabricated FFT processor is up to 1 Gsample/s while it consumes 175 mW. Power dissipation is 77.6 mW when its throughput rate meets UWB standard in which the FFT throughput rate is 409.6 Msample/s.

...read moreread less

220 citations

Cites background from "A low-power, high-performance, 1024..."

...Various FFT architectures, such as single-memory architecture, dual-memory architecture [2], pipelined architecture [3], array architecture [4], and cached-memory architecture [5], have been proposed in the last three decades....
[...]

Journal Article•DOI•

Reliable low-power digital signal processing via reduced precision redundancy

[...]

Byonghyo Shim¹, S.R. Sridhara¹, Naresh R. Shanbhag¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 May 2004-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: The resulting soft digital signal processing system achieves up to 60% and 44% energy savings with no loss in the signal-to-noise ratio (SNR) for receive filtering in a QPSK system and the butterfly of fast Fourier transform in a WLAN OFDM system.

...read moreread less

Abstract: In this paper, we present a novel algorithmic noise-tolerance (ANT) technique referred to as reduced precision redundancy (RPR). RPR requires a reduced precision replica whose output can be employed as the corrected output in case the original system computes erroneously. When combined with voltage overscaling (VOS), the resulting soft digital signal processing system achieves up to 60% and 44% energy savings with no loss in the signal-to-noise ratio (SNR) for receive filtering in a QPSK system and the butterfly of fast Fourier transform (FFT) in a WLAN OFDM system, respectively. These energy savings are with respect to optimally scaled (i.e., the supply voltage equals the critical voltage V/sub dd-crit/) present day systems. Further, we show that the RPR technique is able to maintain the output SNR for error rates of up to 0.09/sample and 0.06/sample in an finite impulse response filter and a FFT block, respectively.

...read moreread less

196 citations

Cites background or methods from "A low-power, high-performance, 1024..."

...format requires the eventual truncation of the butterfly outputs when writing the outputs back to memory, computation of all the multiplier output bits is unnecessary [23]....
[...]
...The processor’s datapath computes one complex radix-2 DIT butterfly per cycle [23]....
[...]

Journal Article•DOI•

Design of an FFT/IFFT Processor for MIMO OFDM Systems

[...]

Yu-Wei Lin¹, Chen-Yi Lee²•Institutions (2)

MediaTek¹, National Chiao Tung University²

16 Apr 2007-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: A novel 128/64 point fast Fourier transform (FFT)/ inverse FFT (IFFT) processor for the applications in a multiple-input multiple-output orthogonal frequency-division multiplexing based IEEE 802.11n wireless local area network baseband processor.

...read moreread less

Abstract: In this paper, we present a novel 128/64 point fast Fourier transform (FFT)/ inverse FFT (IFFT) processor for the applications in a multiple-input multiple-output orthogonal frequency-division multiplexing based IEEE 802.11n wireless local area network baseband processor. The unfolding mixed-radix multipath delay feedback FFT architecture is proposed to efficiently deal with multiple data sequences. The proposed processor not only supports the operation of FFT/IFFT in 128 points and 64 points but can also provide different throughput rates for 1-4 simultaneous data sequences to meet IEEE 802.11n requirements. Furthermore, less hardware complexity is needed in our design compared with traditional four-parallel approach. The proposed FFT/IFFT processor is designed in a 0.13-mum single-poly and eight-metal CMOS process. The core area is 660times2142 mum2 , including an FFT/IFFT processor and a test module. At the operation clock rate of 40 MHz, our proposed processor can calculate 128-point FFT with four independent data sequences within 3.2 mus meeting IEEE 802.11n standard requirements

...read moreread less

143 citations

Cites background from "A low-power, high-performance, 1024..."

...[2], pipelined architecture [3], array architecture [4], and cached-memory architecture [5], have been proposed....
[...]

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64

Collapse

References

PDF

Open Access

More filters

Book•

Computer Architecture: A Quantitative Approach

[...]

John L. Hennessy¹, David A. Patterson²•Institutions (2)

Stanford University¹, University of California, Berkeley²

01 Dec 1989

TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.

...read moreread less

Abstract: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today. In this edition, the authors bring their trademark method of quantitative analysis not only to high-performance desktop machine design, but also to the design of embedded and server systems. They have illustrated their principles with designs from all three of these domains, including examples from consumer electronics, multimedia and Web technologies, and high-performance computing.

...read moreread less

11,671 citations

"A low-power, high-performance, 1024..." refers background in this paper

...It is well known that data caches increase the effective bandwidth to a memory—but only if the memory access pattern exhibits sufficient locality [ 10 ]....
[...]

Book•

Discrete-Time Signal Processing

[...]

Alan V. Oppenheim, Ronald W. Schafer

01 Jan 1989

TL;DR: In this paper, the authors provide a thorough treatment of the fundamental theorems and properties of discrete-time linear systems, filtering, sampling, and discrete time Fourier analysis.

...read moreread less

Abstract: For senior/graduate-level courses in Discrete-Time Signal Processing. THE definitive, authoritative text on DSP -- ideal for those with an introductory-level knowledge of signals and systems. Written by prominent, DSP pioneers, it provides thorough treatment of the fundamental theorems and properties of discrete-time linear systems, filtering, sampling, and discrete-time Fourier Analysis. By focusing on the general and universal concepts in discrete-time signal processing, it remains vital and relevant to the new challenges arising in the field --without limiting itself to specific technologies with relatively short life spans.

...read moreread less

10,388 citations

Book•

Theory and application of digital signal processing

[...]

Lawrence R. Rabiner, Ben Gold, C. K. Yuen¹•Institutions (1)

Australian National University¹

01 Jan 1975

TL;DR: Feyman and Wing as discussed by the authors introduced the simplicity of the invariant imbedding method to tackle various problems of interest to engineers, physicists, applied mathematicians, and numerical analysts.

...read moreread less

Abstract: sprightly style and is interesting from cover to cover. The comments, critiques, and summaries that accompany the chapters are very helpful in crystalizing the ideas and answering questions that may arise, particularly to the self-learner. The transparency in the presentation of the material in the book equips the reader to proceed quickly to a wealth of problems included at the end of each chapter. These problems ranging from elementary to research-level are very valuable in that a solid working knowledge of the invariant imbedding techniques is acquired as well as good insight in attacking problems in various applied areas. Furthermore, a useful selection of references is given at the end of each chapter. This book may not appeal to those mathematicians who are interested primarily in the sophistication of mathematical theory, because the authors have deliberately avoided all pseudo-sophistication in attaining transparency of exposition. Precisely for the same reason the majority of the intended readers who are applications-oriented and are eager to use the techniques quickly in their own fields will welcome and appreciate the efforts put into writing this book. From a purely mathematical point of view, some of the invariant imbedding results may be considered to be generalizations of the classical theory of first-order partial differential equations, and a part of the analysis of invariant imbedding is still at a somewhat heuristic stage despite successes in many computational applications. However, those who are concerned with mathematical rigor will find opportunities to explore the foundations of the invariant imbedding method. In conclusion, let me quote the following: "What is the best method to obtain the solution to a problem'? The answer is, any way that works." (Richard P. Feyman, Engineering and Science, March 1965, Vol. XXVIII, no. 6, p. 9.) In this well-written book, Bellman and Wing have indeed accomplished the task of introducing the simplicity of the invariant imbedding method to tackle various problems of interest to engineers, physicists, applied mathematicians, and numerical analysts.

...read moreread less

3,249 citations

Journal Article•DOI•

Low-power CMOS digital design

[...]

Anantha P. Chandrakasan¹, Samuel Sheng¹, Robert W. Brodersen¹•Institutions (1)

University of California, Berkeley¹

01 Apr 1992-IEEE Journal of Solid-state Circuits

TL;DR: In this paper, techniques for low power operation are presented which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations to reduce power consumption in CMOS digital circuits while maintaining computational throughput.

...read moreread less

Abstract: Motivated by emerging battery-operated applications that demand intensive computation in portable environments, techniques are investigated which reduce power consumption in CMOS digital circuits while maintaining computational throughput. Techniques for low-power operation are shown which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations. An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations. This optimum is achieved by trading increased silicon area for reduced power consumption. >

...read moreread less

2,690 citations

Journal Article•

Low-Power CMOS Digital Design

[...]

Anantha P. Chandrakasan, Samuel Sheng, Robert W. Brodersen

25 Apr 1992-IEICE Transactions on Electronics

TL;DR: An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations, and is achieved by trading increased silicon area for reduced power consumption.

...read moreread less

Abstract: Motivated by emerging battery-operated applications that demand intensive computation in portable environments, techniques are investigated which reduce power consumption in CMOS digital circuits while maintaining computational throughput Techniques for low-power operation are shown which use the lowest possible supply voltage coupled with architectural, logic style, circuit, and technology optimizations An architecturally based scaling strategy is presented which indicates that the optimum voltage is much lower than that determined by other scaling considerations This optimum is achieved by trading increased silicon area for reduced power consumption >

...read moreread less

2,337 citations