Author

Ajay H. Daptardar

Bio: Ajay H. Daptardar is an academic researcher from Brandeis University. The author has contributed to research in the topics of Image retrieval and Content-based image retrieval, has an h-index of 6, and has co-authored 7 publications receiving 85 citations.

Papers
Journal ArticleDOI
TL;DR: A bitwise KMP algorithm is proposed that can move one extra bit in the case of a mismatch, since the alphabet is binary, and is combined with two practical Huffman decoding schemes that handle more than a single bit per machine operation.
Abstract: In the present work we perform compressed pattern matching in binary Huffman encoded texts [Huffman, D. (1952). A method for the construction of minimum redundancy codes, Proc. of the IRE, 40, 1098-1101]. A modified Knuth-Morris-Pratt algorithm is used in order to overcome the problem of false matches, i.e., an occurrence of the encoded pattern in the encoded text that does not correspond to an occurrence of the pattern itself in the original text. We propose a bitwise KMP algorithm that can move one extra bit in the case of a mismatch, since the alphabet is binary. To avoid processing any bit of the encoded text more than once, a preprocessed table is used to determine how far to back up when a mismatch is detected; it is defined so that we are always able to align the start of the encoded pattern with the start of a codeword in the encoded text. We combine our KMP algorithm with two practical Huffman decoding schemes which handle more than a single bit per machine operation: skeleton trees defined by Klein [Klein, S. T. (2000). Skeleton trees for efficient decoding of Huffman encoded texts. Information Retrieval, 3, 7-23], and numerical comparisons between special canonical values and portions of a sliding window presented in Moffat and Turpin [Moffat, A., & Turpin, A. (1997). On the implementation of minimum redundancy prefix codes. IEEE Transactions on Communications, 45, 1200-1207]. Experiments show that our algorithms search much faster than the "decompress then search" method, so files can be kept in their compressed form, saving memory space. When compression gain is important, these algorithms are preferable to cgrep [Ferragina, P., Tommasi, A., & Manzini, G. (2004). C Library to search over compressed texts, http://roquefort.di.unipi.it/~ferrax/CompressedSearch], which is only slightly faster than ours.
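
To make the bit-level search concrete, here is a minimal KMP sketch that scans a Huffman encoded text given as a string of '0'/'1' characters. It is an illustration only: it uses the standard failure function and omits the paper's binary-alphabet shortcut and the codeword-alignment table, so it reports false matches as well as real ones.

```python
def build_fail(pat):
    """Standard KMP failure table: fail[j] is the length of the
    longest proper border of pat[:j+1]."""
    fail = [0] * len(pat)
    k = 0
    for j in range(1, len(pat)):
        while k > 0 and pat[j] != pat[k]:
            k = fail[k - 1]
        if pat[j] == pat[k]:
            k += 1
        fail[j] = k
    return fail

def kmp_bits(encoded_text, encoded_pat):
    """Report every bit offset where encoded_pat occurs in encoded_text.
    Offsets that do not start on a codeword boundary are false matches;
    the paper's alignment table would filter these out."""
    fail = build_fail(encoded_pat)
    k, hits = 0, []
    for i, bit in enumerate(encoded_text):
        while k > 0 and bit != encoded_pat[k]:
            k = fail[k - 1]
        if bit == encoded_pat[k]:
            k += 1
        if k == len(encoded_pat):
            hits.append(i - k + 1)
            k = fail[k - 1]
    return hits

print(kmp_bits('110100110', '10'))  # bit offsets: [1, 3, 7]
```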

28 citations

Proceedings ArticleDOI
28 Mar 2006
TL;DR: A low complexity approach for content-based image retrieval (CBIR) using vector quantization (VQ), where the VQ codebooks serve as generative image models and are used to represent images while computing their similarity.
Abstract: We present a low complexity approach for content-based image retrieval (CBIR) using vector quantization (VQ). The VQ codebooks serve as generative image models and are used to represent images while computing their similarity. The hope is that encoding an image with a codebook of a similar image will yield a better representation than when a codebook of a dissimilar image is used. Experiments performed on a color image database support this hypothesis, and retrieval based on this method compares well with previous work. Our basic method "tags" each image with a thumbnail and a small VQ codebook of only 8 entries, where each entry is a 6 element color feature vector. In addition, we consider augmenting feature vectors with x-y coordinates associated with the entry.
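
A minimal sketch of the codebook-as-model idea, assuming numpy arrays: one image's feature vectors (the abstract uses 6-element color vectors and 8-entry codebooks) are quantized with another image's codebook, and the resulting mean distortion serves as an inverse similarity score. The function name is hypothetical.

```python
import numpy as np

def encode_distortion(features, codebook):
    """Mean squared error when each row of `features` (N x d) is
    replaced by its nearest codeword in `codebook` (K x d)."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

# Low distortion when image B's codebook encodes image A's features
# suggests A and B are visually similar; high distortion suggests not.
rng = np.random.default_rng(0)
feats_a = rng.random((500, 6))   # e.g. 6-element color feature vectors
codebook_b = rng.random((8, 6))  # 8-entry codebook of another image
print(encode_distortion(feats_a, codebook_b))
```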

14 citations

Proceedings ArticleDOI
23 Mar 2004
TL;DR: A bitwise KMP algorithm is proposed that can move one extra bit in the case of a mismatch, since the alphabet is binary, to overcome the problem of false matches in Huffman encoded texts.
Abstract: This paper presents compressed pattern matching in Huffman encoded texts. A modified Knuth-Morris-Pratt (KMP) algorithm is used in order to overcome the problem of false matches. The paper also proposes a bitwise KMP algorithm that can move one extra bit in the case of a mismatch, since the alphabet is binary. The KMP algorithm is combined with two Huffman decoding schemes, sk-kmp and win-kmp, which handle more than a single bit per machine operation; the former uses skeleton trees for efficient decoding of Huffman encoded texts.
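
A tiny example of the false-match problem the modified KMP guards against, using a hypothetical three-symbol prefix code (not one from the paper): the encoded pattern can appear at a bit offset that straddles codeword boundaries even though the pattern never occurs in the original text.

```python
# Hypothetical prefix code for illustration:
code = {'a': '0', 'b': '10', 'c': '11'}

def encode(s):
    return ''.join(code[ch] for ch in s)

text, pattern = 'ca', 'b'
print(encode(text), encode(pattern))    # 110 10
print(encode(pattern) in encode(text))  # True  -- '10' occurs at bit offset 1
print(pattern in text)                  # False -- a false match: 'ca' has no 'b'
```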

13 citations

Proceedings ArticleDOI
25 Mar 2008
TL;DR: Experiments performed on the COREL image database show this new approach to provide almost equivalent retrieval precision to the previous method of jointly trained codebooks (and an improvement over previous methods) at much lower complexity.
Abstract: We present a new lower-complexity approach for content-based image retrieval based on a relative compressibility similarity measure using VQ codebooks with feature vectors based on color and position. In previous work we developed a system whose feature vectors combine color and position; in this paper, we present a new approach that decouples color and position. We present this approach as two methods. The first trains separate codebooks for color and position features, eliminating the need for potentially application-specific feature weightings during training. The second method achieves nearly the same performance at greatly reduced complexity by partitioning images into regions and training high-rate TSVQ codebooks for each region (i.e., position information is made implicit). Features extracted from query regions are encoded with the corresponding database region codebooks. The maximum number of codewords that a database region codebook may contain is determined at runtime as a function of the query features, and region codebooks are pruned accordingly before encoding query features. Experiments performed on the COREL image database show this new approach to provide almost equivalent retrieval precision to our previous method of jointly trained codebooks (and an improvement over previous methods) at much lower complexity.
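
A rough sketch of the second method's region idea, with assumed details (a fixed grid, raw RGB pixels as features, and plain flat codebooks in place of the paper's high-rate TSVQ with runtime pruning): position becomes implicit because each query region is scored only against the database codebook for the same region.

```python
import numpy as np

def region_features(img, grid=4):
    """Split an H x W x 3 image into a grid x grid array of regions and
    return each region's pixels as an (N, 3) float array."""
    H, W, _ = img.shape
    hs, ws = H // grid, W // grid
    return [img[r*hs:(r+1)*hs, c*ws:(c+1)*ws].reshape(-1, 3).astype(float)
            for r in range(grid) for c in range(grid)]

def region_distortion(query_img, db_region_codebooks, grid=4):
    """Encode each query region with the database codebook for the *same*
    region and sum the mean distortions; position is implicit."""
    total = 0.0
    for feats, cb in zip(region_features(query_img, grid), db_region_codebooks):
        d2 = ((feats[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        total += d2.min(axis=1).mean()
    return total

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64, 3))
codebooks = [rng.random((8, 3)) * 255 for _ in range(16)]  # one per region
print(region_distortion(img, codebooks))
```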

12 citations

Book ChapterDOI
05 Dec 2005
TL;DR: This work presents a novel approach for content-based image retrieval (CBIR) using vector quantization (VQ), which allows us to retain the image database in compressed form without any need to store additional features for image retrieval.
Abstract: Image retrieval and image compression are each areas that have received considerable attention in the past. However, there have been fewer advances that address both problems simultaneously. In this work, we present a novel approach for content-based image retrieval (CBIR) using vector quantization (VQ). Using VQ allows us to retain the image database in compressed form without any need to store additional features for image retrieval. The VQ codebooks serve as generative image models and are used to represent images while computing their similarity. The hope is that encoding an image with a codebook of a similar image will yield a better representation than when a codebook of a dissimilar image is used. Experiments performed on a color image database over a range of codebook sizes support this hypothesis, and retrieval based on this method compares well with previous work.
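
At retrieval time the stored codebooks are all that is needed; a sketch of the database scan, reusing the cross-encoding distortion idea shown earlier (the names are illustrative, not from the paper):

```python
import numpy as np

def distortion(feats, codebook):
    """Mean squared error of nearest-codeword quantization."""
    d2 = ((feats[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).mean()

def retrieve(query_feats, db_codebooks, top=10):
    """Rank database images by how well their stored VQ codebooks encode
    the query's features; no separate feature index is required."""
    ranked = sorted(db_codebooks.items(),
                    key=lambda item: distortion(query_feats, item[1]))
    return [name for name, _ in ranked[:top]]
```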

12 citations


Cited by
Journal ArticleDOI
TL;DR: A similarity measure based on compression with dictionaries, the Fast Compression Distance (FCD), is proposed; it reduces the complexity of these methods without degrading performance.

66 citations

Book
12 Jul 2013
TL;DR: This monograph surveys and appraises techniques for pattern matching in compressed text and images, identifies the important relationship between pattern matching and compression, and proposes performance measures for compressed pattern matching algorithms.
Abstract: Pattern Matching in Compressed Texts and Images surveys and appraises techniques for pattern matching in compressed text and images. Normally, compressed data needs to be decompressed before it is processed. If, however, the compression has been done in the right way, it is often possible to search the data without having to decompress it, or at least to only partially decompress it. The problem can be divided into lossless and lossy compression methods, and in each of these cases the pattern matching can be either exact or inexact. Much work has been reported in the literature on techniques for all of these cases, including algorithms that are suitable for pattern matching under various compression methods, and compression methods designed specifically for pattern matching. This monograph provides a survey of this work while also identifying the important relationship between pattern matching and compression and proposing some performance measures for compressed pattern matching algorithms. Pattern Matching in Compressed Texts and Images is an excellent reference text for anyone with an interest in the problem of searching compressed text and images. It concludes with a particularly insightful section on the ideas and research directions that are likely to occupy researchers in this field in the short and long term.

30 citations

Journal ArticleDOI
TL;DR: The experimental results show that the proposed algorithm could achieve an excellent compression ratio without losing data when compared to the standard compression algorithms.
Abstract: The development of multimedia and digital imaging has led to a high quantity of data being required to represent modern imagery. This demands large disk space for storage and long transmission times over computer networks, both of which are relatively expensive. These factors motivate image compression. Image compression addresses the problem of reducing the amount of space required to represent a digital image, yielding a compact representation and thereby reducing storage and transmission requirements. The key idea is to remove the redundancy present within an image so as to reduce its size without affecting its essential information. We are concerned with lossless image compression in this paper. Our proposed approach combines a number of existing techniques and works as follows: first, we apply the well-known Lempel-Ziv-Welch (LZW) algorithm to the image at hand. The output of this first step is forwarded to a second step, where the Bose-Chaudhuri-Hocquenghem (BCH) error detection and correction algorithm is used. To improve the compression ratio, the proposed approach applies the BCH algorithm repeatedly until "inflation" is detected. The experimental results show that the proposed algorithm achieves an excellent compression ratio without losing data when compared to standard compression algorithms.
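
The control flow described in the abstract is easy to state in code. The sketch below captures only the repeat-until-inflation loop; `first_stage` and `repeat_stage` are stand-ins for the paper's actual LZW coder and BCH-based stage, which are not specified here, and the demo uses zlib purely as a placeholder.

```python
import zlib

def compress_pipeline(data, first_stage, repeat_stage):
    """One pass of `first_stage`, then `repeat_stage` applied repeatedly
    until a pass inflates the output ("inflation" detected)."""
    out = first_stage(data)
    while True:
        candidate = repeat_stage(out)
        if len(candidate) >= len(out):  # inflation: keep the previous result
            return out
        out = candidate

# Placeholder demo: zlib stands in for both the LZW and BCH stages.
print(len(compress_pipeline(b'ab' * 1000, zlib.compress, zlib.compress)))
```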

29 citations

Journal ArticleDOI
TL;DR: The wavelet tree is adapted in this paper to Fibonacci codes, so that in addition to supporting direct access to the Fibonacci encoded file, it also increases the compression savings relative to the original Fibonacci compressed file.
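
For context, Fibonacci codes assign each positive integer a self-delimiting codeword ending in '11' (its Zeckendorf representation plus a terminating 1). A minimal encoder/decoder sketch, independent of the paper's wavelet-tree construction:

```python
def fib_encode(n):
    """Fibonacci codeword of a positive integer as a '0'/'1' string."""
    fibs = [1, 2]
    while fibs[-1] < n:
        fibs.append(fibs[-1] + fibs[-2])
    if fibs[-1] > n:
        fibs.pop()                  # largest Fibonacci number <= n
    bits = [0] * len(fibs)
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= n:            # greedy Zeckendorf decomposition
            bits[i] = 1
            n -= fibs[i]
    return ''.join(map(str, bits)) + '1'  # the '11' suffix ends the codeword

def fib_decode(code):
    """Inverse of fib_encode for a single codeword."""
    fibs = [1, 2]
    while len(fibs) < len(code) - 1:
        fibs.append(fibs[-1] + fibs[-2])
    return sum(f for f, b in zip(fibs, code[:-1]) if b == '1')

print([fib_encode(n) for n in (1, 2, 3, 4)])  # ['11', '011', '0011', '1011']
print(fib_decode('1011'))                     # 4
```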

27 citations

Journal ArticleDOI
01 Sep 2017
TL;DR: This research models the search process of the Knuth-Morris-Pratt algorithm as an easy-to-understand visualization; the algorithm was chosen because it is easy to learn and easy to implement in many programming languages.
Abstract: This research models the search process of the Knuth-Morris-Pratt algorithm in the form of an easy-to-understand visualization. The Knuth-Morris-Pratt algorithm was selected because it is easy to learn and easy to implement in many programming languages.
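
As a text-mode stand-in for the visualization the paper describes (the actual tool is presumably graphical), a KMP search can simply print each shift it performs; a small hypothetical sketch:

```python
def kmp_trace(text, pat):
    """KMP search that prints every mismatch-driven shift and every match."""
    fail = [0] * len(pat)  # failure (border) table
    k = 0
    for j in range(1, len(pat)):
        while k > 0 and pat[j] != pat[k]:
            k = fail[k - 1]
        if pat[j] == pat[k]:
            k += 1
        fail[j] = k
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pat[k]:
            print(f"mismatch at text[{i}]: fall back to prefix length {fail[k-1]}")
            k = fail[k - 1]
        if ch == pat[k]:
            k += 1
        if k == len(pat):
            print(f"match at index {i - k + 1}")
            k = fail[k - 1]

kmp_trace('ababcabab', 'abab')  # match at 0, one fallback at 'c', match at 5
```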

26 citations