Author

Takagi

Bio: Takagi is an academic researcher from Kyoto University. The author has contributed to research in topics: Sorting & Serial binary adder. The author has an h-index of 2 and has co-authored 2 publications receiving 409 citations.

Papers
Journal ArticleDOI
Takagi, Yasuura, Yajima
TL;DR: Since the multiplier has a regular cellular array structure similar to an array multiplier, it is suitable for VLSI implementation and is excellent in both computation speed and regularity in layout.
Abstract: A high-speed VLSI multiplication algorithm internally using redundant binary representation is proposed. In n-bit binary integer multiplication, n partial products are first generated and then added up pairwise by means of a binary tree of redundant binary adders. Since parallel addition of two n-digit redundant binary numbers can be performed in a constant time independent of n without carry propagation, n-bit multiplication can be performed in a time proportional to log_2 n. The computation time is almost the same as that by a multiplier with a Wallace tree, in which three partial products will be converted into two, in contrast to our two-to-one conversion, and is much shorter than that by an array multiplier for longer operands. The number of computation elements of an n-bit multiplier based on the algorithm is proportional to n^2. It is almost the same as those of conventional ones. Furthermore, since the multiplier has a regular cellular array structure similar to an array multiplier, it is suitable for VLSI implementation. Thus, the multiplier is excellent in both computation speed and regularity in layout. It can be implemented on a VLSI chip with an area proportional to n^2 log_2 n. The algorithm can be directly applied to both unsigned and 2's complement binary integer multiplication.

344 citations
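
The carry-free addition and the binary tree of adders described in the abstract above can be illustrated in software. The following is a minimal sketch, not the authors' circuit: it assumes digits drawn from {-1, 0, 1}, uses one commonly cited carry-free rule for radix-2 signed-digit addition (the abstract does not state the rule, so the case table here is an assumption), covers unsigned operands only, and converts the final result by plain integer evaluation. The names `rb_add`, `rb_value`, and `rb_multiply` are hypothetical.

```python
def rb_add(x, y):
    """Add two redundant binary numbers (little-endian digits in {-1, 0, 1}).

    Each output digit depends only on a fixed number of neighbouring input
    digits, so no carry ripples along the word.
    """
    n = max(len(x), len(y))
    x = x + [0] * (n - len(x))
    y = y + [0] * (n - len(y))
    w = [0] * (n + 1)  # intermediate sum digits
    c = [0] * (n + 1)  # c[i] is the carry entering position i
    for i in range(n):
        s = x[i] + y[i]
        # Assumed rule: look one position down to pick a (carry, interim)
        # pair with s == 2 * carry + interim that cannot overflow later.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if s == 2:
            carry, interim = 1, 0
        elif s == 1:
            carry, interim = (1, -1) if lower_nonneg else (0, 1)
        elif s == 0:
            carry, interim = 0, 0
        elif s == -1:
            carry, interim = (0, -1) if lower_nonneg else (-1, 1)
        else:  # s == -2
            carry, interim = -1, 0
        w[i] = interim
        c[i + 1] = carry
    return [w[i] + c[i] for i in range(n + 1)]


def rb_value(digits):
    """Integer value of a little-endian redundant binary digit list."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def rb_multiply(a, b, n):
    """Multiply unsigned n-bit integers a and b (n >= 1) via a tree of rb_add."""
    a_bits = [(a >> k) & 1 for k in range(n)]
    # n partial products; plain binary digits are valid redundant binary digits.
    level = [[0] * i + (a_bits if (b >> i) & 1 else [0] * n) for i in range(n)]
    while len(level) > 1:  # roughly log_2 n levels of two-to-one reduction
        nxt = [rb_add(level[k], level[k + 1]) for k in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
    return rb_value(level[0])


print(rb_multiply(13, 11, 4))  # 143
```

In hardware the last step would be a redundant-binary-to-binary converter rather than `rb_value`; the sketch only shows why the tree depth, not the word length, bounds the number of addition stages.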

Journal ArticleDOI
Yasuura, Takagi, Yajima
TL;DR: A new hardware algorithm for parallel enumeration sorting circuits is designed; its processing time is linearly proportional to the number of data items to be sorted, and it is suitable for VLSI implementation.
Abstract: We propose a new parallel sorting scheme, called the parallel enumeration sorting scheme, which is suitable for VLSI implementation. This scheme can be introduced to conventional computer systems without changing their architecture. In this scheme, sorting is divided into two stages, the ordering process and the rearranging one. The latter can be efficiently performed by central processing units or intelligent memory devices. For implementations of the ordering process by VLSI technology, we design a new hardware algorithm of parallel enumeration sorting circuits whose processing time is linearly proportional to the number of data for sorting. Data are serially transmitted between the sorting circuit and memory devices and the total communication between them is minimized. The basic structure used in the algorithm is called a bus connected cellular array structure with pipeline and parallel processing. The circuit consists of a linear array of one type of simple cell and two buses connecting all cells for efficient global communications in the circuit. The sorting circuit is simple, regular and small enough for realization by today's VLSI technology. We discuss several applications of the sorting circuit and evaluate its performance.

69 citations
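
The two stages named in the abstract above, ordering and rearranging, can be sketched sequentially. This is an illustrative software version of enumeration (rank) sorting under stated assumptions, not the bus connected cellular array circuit itself: each item's rank is taken to be the number of items that must precede it, with ties broken by original position (an assumption, since the abstract does not specify tie handling). The function names are hypothetical.

```python
def enumeration_ranks(data):
    """Ordering stage: ranks[i] is the final position of data[i].

    In a parallel circuit each cell could accumulate one rank while the data
    stream past it; here the same counts are computed with two nested loops.
    """
    n = len(data)
    ranks = [0] * n
    for i in range(n):
        for j in range(n):
            # Count items smaller than data[i]; equal items that appear
            # earlier also count, so every rank is unique.
            if data[j] < data[i] or (data[j] == data[i] and j < i):
                ranks[i] += 1
    return ranks


def rearrange(data, ranks):
    """Rearranging stage: move each item to the position given by its rank."""
    out = [None] * len(data)
    for item, r in zip(data, ranks):
        out[r] = item
    return out


values = [3, 1, 2, 1]
ranks = enumeration_ranks(values)
print(ranks)                      # [3, 0, 2, 1]
print(rearrange(values, ranks))   # [1, 1, 2, 3]
```

Separating the two functions mirrors the split in the abstract: only the rank computation would be offloaded to the sorting circuit, while the rearrangement could stay with a CPU or memory device.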


Cited by
Book
28 Feb 1999
TL;DR: Switching Theory for Logic Synthesis introduces and explains various topics that make up the subject of logic synthesis: multi-valued input two-valued output function, logic design for PLDs/FPGAs, EXOR-based design, and complexity theories of logic networks.
Abstract: From the Publisher: Switching Theory for Logic Synthesis covers the basic topics of switching theory and logic synthesis in fourteen chapters. Chapters 1 through 5 provide the mathematical foundation. Chapters 6 through 8 include an introduction to sequential circuits, optimization of sequential machines and asynchronous sequential circuits. Chapters 9 through 14 are the main feature of the book. These chapters introduce and explain various topics that make up the subject of logic synthesis: multi-valued input two-valued output function, logic design for PLDs/FPGAs, EXOR-based design, and complexity theories of logic networks. An appendix providing a history of switching theory is included. The reference list consists of over four hundred entries. Switching Theory for Logic Synthesis is based on the author's lectures at Kyushu Institute of Technology as well as seminars for CAD engineers from various Japanese technology companies. Switching Theory for Logic Synthesis will be of interest to CAD professionals and students at the advanced level. It is also useful as a textbook, as each chapter contains examples, illustrations, and exercises.

375 citations

Book
01 Jun 1994
TL;DR: In this paper, a tight lower bound of the VLSI layout area of the binary de Bruijn multiprocessor network (BDM) is derived; a procedure for an area-optimal VLSI layout is also described.
Abstract: It is shown that the binary de Bruijn multiprocessor network (BDM) can solve a wide variety of classes of problems. The BDM admits an N-node linear array, an N-node ring, (N-1)-node complete binary trees, ((3N/4)-2)-node tree machines, and an N-node one-step shuffle-exchange network, where N (= 2^k, k an integer) is the total number of nodes. The de Bruijn multiprocessor networks are proved to be fault-tolerant as well as extensible. A tight lower bound of the VLSI layout area of the BDM is derived; a procedure for an area-optimal VLSI layout is also described. It is demonstrated that the BDM is more versatile than the shuffle-exchange and the cube-connected cycles. Recent work has classified sorting architectures into (1) sequential input/sequential output, (2) parallel input/sequential output, (3) parallel input/parallel output, (4) sequential input/parallel output, and (5) hybrid input/hybrid output. It is demonstrated that the de Bruijn multiprocessor networks can sort data items in all of the abovementioned categories. No other network which can sort data items in all the categories is known.

266 citations

MonographDOI
10 Dec 2008
TL;DR: FPGA-based Implementation of Signal Processing Systems is an important reference for practising engineers and researchers working on the design and development of DSP systems for radio, telecommunication, information, audio-visual and security applications.
Abstract: Field programmable gate arrays (FPGAs) are an increasingly popular technology for implementing digital signal processing (DSP) systems. By allowing designers to create circuit architectures developed for the specific applications, high levels of performance can be achieved for many DSP applications, providing considerable improvements over conventional microprocessor and dedicated DSP processor solutions. The book addresses the key issue in this process, specifically the methods and tools needed for the design, optimization and implementation of DSP systems in programmable FPGA hardware. It presents a review of the leading-edge techniques in this field, analyzing advanced DSP-based design flows for both signal flow graph- (SFG-) based and dataflow-based implementation, system on chip (SoC) aspects, and future trends and challenges for FPGAs. The automation of the techniques for component architectural synthesis, computational models, and the reduction of energy consumption to help improve FPGA performance are given in detail. Written from a system level design perspective and with a DSP focus, the authors present many practical application examples of complex DSP implementation, involving: high-performance computing, e.g. matrix operations such as matrix multiplication; high-speed filtering, including finite impulse response (FIR) filters and wave digital filters (WDFs); adaptive filtering, e.g. recursive least squares (RLS) filtering; and transforms such as the fast Fourier transform (FFT). FPGA-based Implementation of Signal Processing Systems is an important reference for practising engineers and researchers working on the design and development of DSP systems for radio, telecommunication, information, audio-visual and security applications. Senior level electrical and computer engineering graduates taking courses in signal processing or digital signal processing shall also find this volume of interest.

215 citations

Journal ArticleDOI
TL;DR: The area-time complexity of sorting is analyzed under an updated model of VLSI computation, which makes a distinction between "processing" circuits and "memory" circuits; the latter are less important since they are denser and consume less power.
Abstract: The area-time complexity of sorting is analyzed under an updated model of VLSI computation. The new model makes a distinction between "processing" circuits and "memory" circuits; the latter are less important since they are denser and consume less power. Other adjustments to the model make it possible to compare pipelined and nonpipelined designs.

214 citations

Book ChapterDOI
TL;DR: The chapter presents a unified treatment of various parallel sorting algorithms by bringing out clearly the relation between the architecture of parallel computers and the structure of algorithms.
Abstract: Publisher Summary: This chapter presents a survey on various parallel sorting algorithms. Sorting is a nontrivial problem and has widespread commercial and business applications. Serial algorithms for sorting have been available since the days of punched-card machines. At present, there is a considerable body of literature on serial sorting algorithms. Parallel algorithms for sorting are of recent origin and came into existence over the past decade. The chapter presents a unified treatment of various parallel sorting algorithms by bringing out clearly the relation between the architecture of parallel computers and the structure of algorithms. In the design of parallel algorithms in general, and of parallel sorting algorithms in particular, two models have been widely used: (1) models based on fixed interconnection networks, such as single instruction, multiple data (SIMD) machines with a mesh-connected network, and (2) models based on a global memory, which is shared by various processors. The special-purpose network-sorting algorithms are described. Algorithms for SIMD machines are given.

197 citations