Topic

Quantization (image processing)

About: Quantization (image processing) is a research topic. Over the lifetime, 7977 publications have been published within this topic receiving 126632 citations.


Papers
Journal ArticleDOI
TL;DR: A hybrid neural network for human face recognition that compares favourably with other methods; the authors analyze its computational complexity and discuss how new classes could be added to the trained recognizer.
Abstract: We present a hybrid neural network for human face recognition which compares favourably with other methods. The system combines local image sampling, a self-organizing map (SOM) neural network, and a convolutional neural network. The SOM provides a quantization of the image samples into a topological space where inputs that are nearby in the original space are also nearby in the output space, thereby providing dimensionality reduction and invariance to minor changes in the image sample; the convolutional neural network provides partial invariance to translation, rotation, scale, and deformation. The convolutional network extracts successively larger features in a hierarchical set of layers. We present results using the Karhunen-Loeve transform in place of the SOM, and a multilayer perceptron (MLP) in place of the convolutional network, for comparison. We use a database of 400 images of 40 individuals which contains quite a high degree of variability in expression, pose, and facial details. We analyze the computational complexity and discuss how new classes could be added to the trained recognizer.

2,954 citations
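
The SOM's role in the system above is vector quantization: each image sample is replaced by the grid coordinates of its best-matching prototype, so nearby inputs land on nearby grid nodes. As a rough illustration, here is a minimal NumPy sketch of that idea; the grid size, learning-rate schedule, and neighbourhood width are illustrative placeholders, not the paper's settings.

```python
import numpy as np

def train_som(samples, grid=(8, 8), epochs=20, lr0=0.5, sigma0=3.0):
    """Fit a 2-D SOM to `samples` (n, dim); returns the (h, w, dim) prototype grid."""
    rng = np.random.default_rng(0)
    h, w = grid
    weights = rng.normal(size=(h, w, samples.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    n_steps, step = epochs * len(samples), 0
    for _ in range(epochs):
        for x in samples:
            t = step / n_steps
            lr = lr0 * (1.0 - t)              # decaying learning rate
            sigma = sigma0 * (1.0 - t) + 0.5  # shrinking neighbourhood radius
            # best-matching unit (BMU): grid node whose prototype is closest to x
            bmu = np.array(np.unravel_index(
                np.argmin(np.linalg.norm(weights - x, axis=-1)), (h, w)))
            # pull the BMU and its grid neighbours toward the sample
            g = np.exp(-np.sum((coords - bmu) ** 2, axis=-1) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
            step += 1
    return weights

def quantize(weights, x):
    """Map a sample to the grid coordinates of its best-matching unit."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), weights.shape[:2])
```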

Journal ArticleDOI
TL;DR: Some of the most significant features of the standard are presented, such as region-of-interest coding, scalability, visual weighting, error resilience and file format aspects, and some comparative results are reported.
Abstract: One of the aims of the standardization committee has been the development of Part I, which could be used on a royalty- and fee-free basis. This is important for the standard to become widely accepted. The standardization process, which is coordinated by the JTC1/SC29/WG1 of the ISO/IEC, has already produced the international standard (IS) for Part I. In this article the structure of Part I of the JPEG 2000 standard is presented and performance comparisons with established standards are reported. This article is intended to serve as a tutorial for the JPEG 2000 standard. The main application areas and their requirements are given. The architecture of the standard follows, with the description of the tiling, multicomponent transformations, wavelet transforms, quantization and entropy coding. Some of the most significant features of the standard are presented, such as region-of-interest coding, scalability, visual weighting, error resilience and file format aspects. Finally, some comparative results are reported and the future parts of the standard are discussed.

1,842 citations
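
For context, the quantization stage of JPEG 2000 Part 1 is a dead-zone uniform scalar quantizer applied to the wavelet coefficients: q = sign(y) * floor(|y| / delta). A minimal sketch follows; the step size delta is an illustrative value (in the standard it is signalled per subband), and the reconstruction offset r = 0.5 is a common decoder choice rather than a mandated one.

```python
import numpy as np

def deadzone_quantize(y, delta):
    # q = sign(y) * floor(|y| / delta); the decision interval around zero is
    # twice as wide as the others, hence the name "dead zone"
    return np.sign(y) * np.floor(np.abs(y) / delta)

def deadzone_dequantize(q, delta, r=0.5):
    # Place the reconstruction point a fraction r into the decision interval;
    # zero-quantized coefficients reconstruct exactly to zero.
    return np.sign(q) * (np.abs(q) + r) * delta * (q != 0)
```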

Proceedings ArticleDOI
23 Jun 2008
TL;DR: Each visual region is mapped to a weighted set of visual words chosen by proximity in descriptor space; the paper shows how this representation is incorporated into a standard tf-idf architecture and how spatial verification is modified under soft-assignment.
Abstract: The state of the art in visual object retrieval from large databases is achieved by systems that are inspired by text retrieval. A key component of these approaches is that local regions of images are characterized using high-dimensional descriptors which are then mapped to "visual words" selected from a discrete vocabulary. This paper explores techniques to map each visual region to a weighted set of words, allowing the inclusion of features which were lost in the quantization stage of previous systems. The set of visual words is obtained by selecting words based on proximity in descriptor space. We describe how this representation may be incorporated into a standard tf-idf architecture, and how spatial verification is modified in the case of this soft-assignment. We evaluate our method on the standard Oxford Buildings dataset, and introduce a new dataset for evaluation. Our results exceed the current state-of-the-art retrieval performance on these datasets, particularly on queries with poor initial recall where techniques like query expansion suffer. Overall we show that soft-assignment is always beneficial for retrieval with large vocabularies, at a cost of increased storage requirements for the index.

1,630 citations
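
The soft-assignment step itself is compact enough to sketch. Following the paper's general recipe, each descriptor is assigned to its k nearest visual words with Gaussian weights exp(-d^2 / (2 sigma^2)), normalised to sum to one; the particular values of k and sigma below are placeholders rather than tuned settings.

```python
import numpy as np

def soft_assign(descriptor, vocabulary, k=3, sigma=50.0):
    """Return (word index, weight) pairs for the k nearest cluster centres."""
    d2 = np.sum((vocabulary - descriptor) ** 2, axis=1)  # squared distances
    nearest = np.argsort(d2)[:k]                         # k closest visual words
    weights = np.exp(-d2[nearest] / (2 * sigma ** 2))    # Gaussian weighting
    weights /= weights.sum()                             # normalise to sum to 1
    return list(zip(nearest.tolist(), weights.tolist()))
```

Each (word, weight) pair is then accumulated into the tf-idf representation in place of the single hard-assigned word.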

Proceedings ArticleDOI
21 Feb 2016
TL;DR: This paper presents an in-depth analysis of state-of-the-art CNN models, shows that convolutional layers are computation-centric and fully-connected layers are memory-centric, and proposes a CNN accelerator design on embedded FPGA for ImageNet large-scale image classification.
Abstract: In recent years, convolutional neural network (CNN) based methods have achieved great success in a large number of applications and have been among the most powerful and widely used techniques in computer vision. However, CNN-based methods are computation-intensive and resource-consuming, and thus are hard to integrate into embedded systems such as smartphones, smart glasses, and robots. FPGA is one of the most promising platforms for accelerating CNN, but the limited bandwidth and on-chip memory size limit the performance of FPGA accelerators for CNN. In this paper, we go deeper with the embedded FPGA platform on accelerating CNNs and propose a CNN accelerator design on embedded FPGA for ImageNet large-scale image classification. We first present an in-depth analysis of state-of-the-art CNN models and show that convolutional layers are computation-centric and fully-connected layers are memory-centric. Then the dynamic-precision data quantization method and a convolver design that is efficient for all layer types in CNN are proposed to improve the bandwidth and resource utilization. Results show that only 0.4% accuracy loss is introduced by our data quantization flow for the very deep VGG16 model when 8/4-bit quantization is used. A data arrangement method is proposed to further ensure a high utilization of the external memory bandwidth. Finally, a state-of-the-art CNN, VGG16-SVD, is implemented on an embedded FPGA platform as a case study. VGG16-SVD is the largest and most accurate network that has been implemented on FPGA end-to-end so far. The system on the Xilinx Zynq ZC706 board achieves a frame rate of 4.45 fps with a top-5 accuracy of 86.66% using 16-bit quantization. The average performance of the convolutional layers and the full CNN is 187.8 GOP/s and 137.0 GOP/s respectively under a 150 MHz working frequency, significantly outperforming previous approaches.

1,172 citations
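
The dynamic-precision idea can be sketched abstractly: instead of fixing one fixed-point format network-wide, each layer searches for the fractional bit-width that minimises its own quantization error. The sketch below is an assumption-laden illustration (the search range and the L2 error metric are placeholders), not the paper's exact flow.

```python
import numpy as np

def quantize_fixed(x, bw, frac):
    """Round x to bw-bit signed fixed point with `frac` fractional bits."""
    scale = 2.0 ** frac
    qmin, qmax = -(2 ** (bw - 1)), 2 ** (bw - 1) - 1
    return np.clip(np.round(x * scale), qmin, qmax) / scale

def best_frac(x, bw=8, candidates=range(-8, 16)):
    """Pick the fractional length minimising L2 quantization error for one layer."""
    errors = [(np.sum((x - quantize_fixed(x, bw, f)) ** 2), f) for f in candidates]
    return min(errors)[1]
```

Because each layer gets its own fractional length, a short word (8 or even 4 bits) can track the differing dynamic ranges of weights and activations across layers.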

Proceedings ArticleDOI
TL;DR: A new dataset, UCID (pronounced "use it"), an Uncompressed Colour Image Dataset, is introduced to address the lack of standardised image databases and to enable objective evaluation of image retrieval algorithms that operate in the compressed domain.
Abstract: Standardised image databases, or rather the lack of them, are one of the main weaknesses in the field of content-based image retrieval (CBIR). Authors often use their own images or do not specify the source of their datasets. Naturally, this makes comparison of results somewhat difficult. While a first approach towards a common colour image set has been taken by the MPEG-7 committee, their database does not cater for all strands of research in the CBIR community. In particular, as the MPEG-7 images only exist in compressed form, it does not allow for an objective evaluation of image retrieval algorithms that operate in the compressed domain, or for judging the influence image compression has on the performance of CBIR algorithms. In this paper we introduce a new dataset, UCID (pronounced "use it"), an Uncompressed Colour Image Dataset which tries to bridge this gap. The UCID dataset currently consists of 1338 uncompressed images together with a ground truth of a series of query images with corresponding models that an ideal CBIR algorithm would retrieve. While its initial intention was to provide a dataset for the evaluation of compressed domain algorithms, the UCID database also represents a good benchmark set for the evaluation of any kind of CBIR method as well as an image set that can be used to evaluate image compression and colour quantisation algorithms.

1,117 citations
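
Since the abstract names colour quantisation as one of UCID's intended evaluation targets, a toy example of such an algorithm may help: k-means palette quantization, which maps every pixel to the nearest of k representative colours. The palette size, iteration count, and random initialisation below are arbitrary illustrative choices, not tied to the paper.

```python
import numpy as np

def colour_quantize(image, k=16, iters=10, seed=0):
    """image: (H, W, 3) uint8 array -> copy whose pixels use a k-colour palette."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3).astype(np.float64)
    palette = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest palette colour for every pixel
        dists = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: move each palette colour to the mean of its pixels
        for j in range(k):
            if np.any(labels == j):
                palette[j] = pixels[labels == j].mean(axis=0)
    return palette[labels].round().astype(np.uint8).reshape(image.shape)
```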


Network Information
Related Topics (5)

Feature extraction: 111.8K papers, 2.1M citations (84% related)
Image segmentation: 79.6K papers, 1.8M citations (84% related)
Feature (computer vision): 128.2K papers, 1.7M citations (84% related)
Image processing: 229.9K papers, 3.5M citations (83% related)
Robustness (computer science): 94.7K papers, 1.6M citations (81% related)
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2022    8
2021    354
2020    283
2019    294
2018    259
2017    295