Author

M. Vishwanath

Other affiliations: Xerox, PARC
Bio: M. Vishwanath is an academic researcher from Pennsylvania State University. The author has contributed to research in topics: Discrete wavelet transform & Wavelet transform. The author has an h-index of 11 and has co-authored 24 publications receiving 1339 citations. Previous affiliations of M. Vishwanath include Xerox & PARC.

Papers
Journal ArticleDOI
TL;DR: In this paper, a class of VLSI architectures based on linear systolic arrays, for computing the 1-D Discrete Wavelet Transform (DWT), is presented, where DWT is computed in real time (running DWT), using just $N_w(J-1)$ cells of storage.
Abstract: A class of VLSI architectures based on linear systolic arrays, for computing the 1-D Discrete Wavelet Transform (DWT), is presented. The various architectures of this class differ only in the design of their routing networks, which could be systolic, semisystolic, or RAM-based. These architectures compute the Recursive Pyramid Algorithm, which is a reformulation of Mallat's pyramid algorithm for the DWT. The DWT is computed in real time (running DWT), using just $N_w(J-1)$ cells of storage, where $N_w$ is the length of the filter and $J$ is the number of octaves. They are ideally suited for single-chip implementation due to their practical I/O rate, small storage, and regularity. The $N$-point 1-D DWT is computed in $2N$ cycles. The period can be reduced to $N$ cycles by using $N_w$ extra MACs. Our architectures are shown to be optimal in both computation time and in area. A utilization of 100% is achieved for the linear array. Extensions of our architecture for computing the M-band DWT are discussed. Also, two architectures for computing the 2-D DWT (separable case) are discussed. One of these architectures, based on a combination of systolic and parallel filters, computes the $N^2$-point 2-D DWT, in real time, in $N^2+N$ cycles, using $2NN_w$ cells of storage.

308 citations
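
The pyramid algorithm that the abstract above builds on can be sketched in a few lines. This is a minimal illustration, not the paper's architecture: it assumes the two-tap Haar filter pair for simplicity, and the names `haar_step`, `pyramid_dwt`, and `rpa_storage_cells` are hypothetical. The last function only evaluates the $N_w(J-1)$ storage figure quoted in the abstract.

```python
import math


def haar_step(x):
    """One octave of the pyramid algorithm with Haar filters (an assumption).

    Returns the (approximation, detail) sequences, each half as long as x.
    """
    s = 1 / math.sqrt(2)
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in range(half)]
    return approx, detail


def pyramid_dwt(x, octaves):
    """Apply haar_step repeatedly to the approximation, one pass per octave."""
    approx = list(x)
    details = []
    for _ in range(octaves):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


def rpa_storage_cells(filter_length, octaves):
    """Storage figure quoted in the abstract: N_w * (J - 1) cells."""
    return filter_length * (octaves - 1)
```

For example, `pyramid_dwt([1, 1, 1, 1], 2)` yields a single approximation coefficient close to 2 and all-zero details, and `rpa_storage_cells(4, 3)` evaluates the quoted storage bound for a four-tap filter over three octaves.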

Journal ArticleDOI
TL;DR: The proposed systolic array and the parallel filter architectures implement these on-line algorithms and are optimal both with respect to area and time (under the word-serial model).
Abstract: This paper presents a wide range of algorithms and architectures for computing the 1D and 2D discrete wavelet transform (DWT) and the 1D and 2D continuous wavelet transform (CWT). The algorithms and architectures presented are independent of the size and nature of the wavelet function. New on-line algorithms are proposed for the DWT and the CWT that require significantly less storage. The proposed systolic array and the parallel filter architectures implement these on-line algorithms and are optimal both with respect to area and time (under the word-serial model). Moreover, these architectures are very regular and support single chip implementations in VLSI. The proposed SIMD architectures implement the existing pyramid and à trous algorithms and are optimal with respect to time.

244 citations

Journal ArticleDOI
TL;DR: The recursive pyramid algorithm (RPA) is a reformulation of the classical pyramid algorithm (PA) that computes the discrete wavelet transform (DWT) using just $L(\log N - 1)$ words of storage, as compared with $O(N)$ words required by the PA.
Abstract: The recursive pyramid algorithm (RPA) is a reformulation of the classical pyramid algorithm (PA) for computing the discrete wavelet transform (DWT). The RPA computes the $N$-point DWT in real time (running DWT) using just $L(\log N - 1)$ words of storage, as compared with $O(N)$ words required by the PA. $L$ is the length of the wavelet filter. The RPA is combined with the short-length FIR filter algorithms to reduce the number of multiplications and additions.

234 citations
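
The idea of a running DWT with storage proportional to the number of octaves, rather than to the signal length, can be illustrated with a streaming cascade. The sketch below is an assumption-laden simplification and not the RPA as published: it uses the two-tap Haar filter, so each octave holds at most one pending sample, and the function name `running_haar_dwt` is hypothetical. The exact output schedule in the paper may differ from the interleaving shown here.

```python
import math


def running_haar_dwt(samples, octaves):
    """Stream samples through a cascade of per-octave buffers (Haar, L = 2).

    Returns (approx, details, schedule), where schedule records which octave
    produced each detail coefficient, in emission order.
    """
    s = 1 / math.sqrt(2)
    pending = [None] * octaves          # one held sample per octave
    details = [[] for _ in range(octaves)]
    approx = []                         # outputs of the last octave
    schedule = []
    for x in samples:
        value, j = x, 0
        # Push the value up the cascade while each octave completes a pair.
        while j < octaves:
            if pending[j] is None:
                pending[j] = value      # first of a pair: hold and wait
                break
            a, b = pending[j], value
            pending[j] = None
            details[j].append((a - b) * s)
            schedule.append(j + 1)
            value = (a + b) * s         # approximation feeds the next octave
            j += 1
        else:
            # No break: the value passed through every octave.
            approx.append(value)
    return approx, details, schedule
```

Feeding eight samples through three octaves interleaves the octave outputs as they become available, so no octave ever buffers a full-length intermediate signal. That buffering of whole intermediate signals is what gives the classical pyramid algorithm its $O(N)$ storage.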

Patent
M. Vishwanath, Philip A. Chou
17 Aug 1995
TL;DR: In this paper, a weighted wavelet hierarchical vector quantization (WWHVQ) procedure is initiated by obtaining an N×N pixel image where 8 bits per pixel are used.
Abstract: A weighted wavelet hierarchical vector quantization (WWHVQ) procedure is initiated by obtaining an N×N pixel image where 8 bits per pixel are used (steps 10 and 12). A look-up operation is performed to obtain data representing a discrete wavelet transform (DWT) followed by a quantization of the data (step 14). Upon completion of the look-up, a data compression will have been performed. Further stages and look-ups will result in further compression of the data, i.e., 4:1, 8:1, 16:1, 32:1, 64:1, etc. Accordingly, a determination is made whether the compression is complete (step 16). If the compression is incomplete, further look-up is performed. If the compression is complete, however, the compressed data is transmitted (step 18). It is determined at a gateway whether further compression is required (step 19). If so, transcoding is performed (step 20). The receiver receives the compressed data (step 22). Subsequently, a second look-up operation is performed to obtain data representing an inverse discrete wavelet transform of the decompressed data (step 24). After one iteration, the data is decompressed by a factor of two. Further iterations allow for further decompression of the data. Accordingly, a determination is made whether decompression is complete (step 26). If the decompression is incomplete, further look-ups are performed. If, however, the decompression is complete, the WWHVQ procedure is ended (step 28).

205 citations

Journal ArticleDOI
01 Nov 1996
TL;DR: This paper surveys the VLSI architectures that have been proposed for computing the Discrete and Continuous Wavelet Transforms for 1-D and 2-D signals and finds that they are optimal with respect to both area and time under the word-serial model.
Abstract: Wavelet transforms have proven to be useful tools for several applications, including signal analysis, signal compression and numerical analysis. This paper surveys the VLSI architectures that have been proposed for computing the Discrete and Continuous Wavelet Transforms for 1-D and 2-D signals. The architectures are based upon on-line versions of the wavelet transform algorithms. These architectures support single chip implementations and are optimal with respect to both area and time under the word-serial model.

166 citations


Cited by
Book
02 Nov 2007
TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.
Abstract: The main characteristic of Reconfigurable Computing is the presence of hardware that can be reconfigured to implement specific functionality more suitable for specially tailored hardware than on a simple uniprocessor. Reconfigurable computing systems join microprocessors and programmable hardware in order to take advantage of the combined strengths of hardware and software and have been used in applications ranging from embedded systems to high performance computing. Many of the fundamental theories have been identified and used by the Hardware/Software Co-Design research field. Although the same background ideas are shared in both areas, they have different goals and use different approaches. This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology. It will take a reader with a background in the basics of digital design and software programming and provide them with the knowledge needed to be an effective designer or researcher in this rapidly evolving field.
· Treatment of FPGAs as computing vehicles rather than glue-logic or ASIC substitutes
· Views of FPGA programming beyond Verilog/VHDL
· Broad set of case studies demonstrating how to use FPGAs in novel and efficient ways

531 citations

Journal ArticleDOI
TL;DR: A line-based approach for the implementation of the wavelet transform is introduced, which yields the same results as a "normal" implementation, but where, unlike prior work, the memory issues arising from the need to synchronize encoder and decoder are addressed.
Abstract: This paper addresses the problem of low memory wavelet image compression. While wavelet or subband coding of images has been shown to be superior to more traditional transform coding techniques, little attention has been paid until recently to the important issue of whether both the wavelet transforms and the subsequent coding can be implemented in low memory without significant loss in performance. We present a complete system to perform low memory wavelet image coding. Our approach is "line-based" in that the images are read line by line and only the minimum required number of lines is kept in memory. There are two main contributions of our work. First, we introduce a line-based approach for the implementation of the wavelet transform, which yields the same results as a "normal" implementation, but where, unlike prior work, we address memory issues arising from the need to synchronize encoder and decoder. Second, we propose a novel context-based encoder which requires no global information and stores only a local set of wavelet coefficients. This low memory coder achieves performance comparable to state of the art coders at a fraction of their memory utilization.

369 citations

Journal ArticleDOI
TL;DR: An architecture consisting of two row processors, two column processors, and two memory modules is proposed that performs the forward and inverse discrete wavelet transform (DWT) using a lifting-based scheme for the set of seven filters proposed in JPEG2000.
Abstract: We propose an architecture that performs the forward and inverse discrete wavelet transform (DWT) using a lifting-based scheme for the set of seven filters proposed in JPEG2000. The architecture consists of two row processors, two column processors, and two memory modules. Each processor contains two adders, one multiplier, and one shifter. The precision of the multipliers and adders has been determined using extensive simulation. Each memory module consists of four banks in order to support the high computational bandwidth. The architecture has been designed to generate an output every cycle for the JPEG2000 default filters. The schedules have been generated by hand and the corresponding timings listed. Finally, the architecture has been implemented in behavioral VHDL. The estimated area of the proposed architecture in 0.18-μm technology is 2.8 mm square, and the estimated frequency of operation is 200 MHz.

350 citations
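
A lifting-based scheme of the kind mentioned in the abstract above splits the signal into even and odd samples, then applies a predict step and an update step. The sketch below assumes the reversible integer 5/3 filter, which is one of the JPEG2000 filters, and clamps indices at the boundaries as a simple edge convention. The function names are hypothetical and the code does not model the row and column processors of the proposed architecture.

```python
def lift_53_forward(x):
    """One level of the integer 5/3 lifting transform on an even-length list.

    Returns (s, d): low-pass (approximation) and high-pass (detail) halves.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("expected an even-length signal with at least 2 samples")
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict: odd sample minus the floored average of its even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update: even sample plus a rounded quarter of the neighbouring details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s, d


def lift_53_inverse(s, d):
    """Undo lift_53_forward by reversing the update and predict steps."""
    half = len(s)
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because each lifting step is undone by subtracting exactly what was added, `lift_53_inverse(*lift_53_forward(x))` returns `x` for any even-length list of integers. The steps use only additions and shifts, which fits the abstract's mention of adders and a shifter in each processor.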

Journal ArticleDOI
TL;DR: In this paper, a class of VLSI architectures based on linear systolic arrays, for computing the 1-D Discrete Wavelet Transform (DWT), is presented, where DWT is computed in real time (running DWT), using just $N_w(J-1)$ cells of storage.
Abstract: A class of VLSI architectures based on linear systolic arrays, for computing the 1-D Discrete Wavelet Transform (DWT), is presented. The various architectures of this class differ only in the design of their routing networks, which could be systolic, semisystolic, or RAM-based. These architectures compute the Recursive Pyramid Algorithm, which is a reformulation of Mallat's pyramid algorithm for the DWT. The DWT is computed in real time (running DWT), using just $N_w(J-1)$ cells of storage, where $N_w$ is the length of the filter and $J$ is the number of octaves. They are ideally suited for single-chip implementation due to their practical I/O rate, small storage, and regularity. The $N$-point 1-D DWT is computed in $2N$ cycles. The period can be reduced to $N$ cycles by using $N_w$ extra MACs. Our architectures are shown to be optimal in both computation time and in area. A utilization of 100% is achieved for the linear array. Extensions of our architecture for computing the M-band DWT are discussed. Also, two architectures for computing the 2-D DWT (separable case) are discussed. One of these architectures, based on a combination of systolic and parallel filters, computes the $N^2$-point 2-D DWT, in real time, in $N^2+N$ cycles, using $2NN_w$ cells of storage.

308 citations

Patent
09 Apr 2010
TL;DR: In this article, a personal media broadcasting system enables video distribution over a computer network and allows a user to view and control media sources over the computer network from a remote location, where a personal broadcaster receives an input from one or more types of media sources, digitizes and compresses the content, and streams the compressed media over the network.
Abstract: A personal media broadcasting system enables video distribution over a computer network and allows a user to view and control media sources over a computer network from a remote location. A personal broadcaster receives an input from one or more types of media sources, digitizes and compresses the content, and streams the compressed media over a computer network to a media player running on any of a wide range of client devices for viewing the media. The system may allow the user to issue control commands (e.g., “channel up”) from the media player to the broadcaster, causing the source device to execute the commands. The broadcaster and the media player may employ several techniques for buffering, transmitting, and viewing the content to improve the user's experience.

256 citations