Proceedings ArticleDOI

Supervised dictionary learning using distance dependent Indian buffet process

TL;DR: A novel Dictionary Learning algorithm based on the Distance Dependent Indian Buffet Process model is proposed and demonstrates higher classification accuracy than existing DL-based classification methods.
Abstract: This paper proposes a novel Dictionary Learning (DL) algorithm for pattern classification tasks. Based on the Distance Dependent Indian Buffet Process (DDIBP) model, a shared dictionary is learned for signals belonging to different classes so that the learned sparse codes are highly discriminative, which improves pattern classification performance. Moreover, using this non-parametric method, an appropriate dictionary size can be inferred. The proposed method, evaluated on several standard databases, demonstrates higher classification accuracy than other existing DL-based classification methods.
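For orientation, the generative structure behind IBP-based dictionary learning can be written compactly. The sketch below is a generic linear-Gaussian IBP dictionary model (in the spirit of beta-process factor analysis), not necessarily the paper's exact DDIBP formulation: each signal activates a binary subset of atoms, the distance-dependent prior encourages nearby (e.g., same-class) signals to share atoms, and the effective dictionary size K is inferred rather than fixed.

```latex
% Generic linear-Gaussian IBP dictionary model (a sketch, not the
% paper's exact DDIBP formulation): signal x_i activates a binary
% subset z_i of atoms from dictionary D; K is inferred from data.
\begin{align*}
  x_i &= D\,(z_i \odot s_i) + \varepsilon_i,
      \qquad \varepsilon_i \sim \mathcal{N}(0,\sigma^2 I),\\
  Z &= [z_1,\dots,z_N]^\top \sim \text{dd-IBP}(\alpha,\mathrm{dist}),
      \qquad z_i \in \{0,1\}^{K},\ K\ \text{unbounded},\\
  s_i &\sim \mathcal{N}(0,\sigma_s^2 I).
\end{align*}
% The distance-dependent prior makes signals that are close under
% "dist" (e.g., same-class samples) likely to share atoms, which is
% what renders the sparse codes discriminative.
```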
Citations
Journal ArticleDOI
TL;DR: An approach based on Sum-to-k constrained non-negative matrix factorization (S2K-NMF) is proposed, which is able to effectively extract perceptually meaningful sources from complex mixtures.
Abstract: Energy disaggregation or non-intrusive load monitoring addresses the issue of extracting device-level energy consumption information by monitoring the aggregated signal at one single measurement point without installing meters on each individual device. Energy disaggregation can be formulated as a source separation problem, where the aggregated signal is expressed as a linear combination of basis vectors in a matrix factorization framework. In this paper, an approach based on Sum-to-k constrained non-negative matrix factorization (S2K-NMF) is proposed. By imposing the sum-to-k constraint and the non-negative constraint, S2K-NMF is able to effectively extract perceptually meaningful sources from complex mixtures. The strength of the proposed algorithm is demonstrated through two sets of experiments: energy disaggregation in a residential smart home, and energy monitoring of heating, ventilating, and air conditioning components in an industrial building testbed maintained at Oak Ridge National Laboratory. Extensive experimental results demonstrate the superior performance of S2K-NMF as compared to state-of-the-art decomposition-based disaggregation algorithms.
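As a rough illustration of the idea, and not the authors' published algorithm, the sketch below adds a soft sum-to-k penalty on each activation column to a plain alternating projected-gradient NMF; the function name s2k_nmf and all parameter values are hypothetical.

```python
import numpy as np

def s2k_nmf(V, r, k, lam=1.0, lr=1e-3, iters=2000, seed=0):
    """Toy NMF with a soft sum-to-k penalty on each activation column.

    Sketch only: the published S2K-NMF enforces the constraint with its
    own update rules; here we merely add lam * (1'h - k)^2 per column
    and run alternating projected gradient descent.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    for _ in range(iters):
        gW = (W @ H - V) @ H.T                      # gradient of the fit term
        W = np.maximum(W - lr * gW, 0.0)            # project onto W >= 0
        gH = W.T @ (W @ H - V) \
            + 2.0 * lam * (H.sum(axis=0, keepdims=True) - k)
        H = np.maximum(H - lr * gH, 0.0)            # project onto H >= 0
    return W, H

# Toy usage: factor a nonnegative "aggregate signal" matrix.
V = np.abs(np.random.default_rng(1).normal(size=(20, 50)))
W, H = s2k_nmf(V, r=4, k=1.0)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # relative fit error
print(H.sum(axis=0)[:5])                              # columns pulled toward k
```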

107 citations


Cites background from "Supervised dictionary learning usin..."

  • ...Learning the model from the training samples instead of using some predefined bases such as Fourier or wavelet bases has been shown to produce more accurate results [28]–[33]....


Proceedings ArticleDOI
Alireza Rahimpour, Liu Liu, Ali Taalimi, Yang Song, Hairong Qi
14 Sep 2017
TL;DR: A novel approach based on a gradient-based attention mechanism in a deep convolutional neural network for solving the person re-identification problem, which learns to focus selectively on the parts of the input image to which the network's output is most sensitive.
Abstract: Despite recent attempts at solving the person re-identification problem, it remains a challenging task, since a person's appearance can vary significantly under large variations in view angle, human pose, and illumination. The concept of attention is one of the most interesting recent architectural innovations in neural networks. Inspired by it, in this paper we propose a novel approach based on a gradient-based attention mechanism in a deep convolutional neural network for solving the person re-identification problem. Our model learns to focus selectively on the parts of the input image to which the network's output is most sensitive. Extensive comparative evaluations demonstrate that the proposed method outperforms state-of-the-art approaches, including both traditional and deep neural network-based methods, on the challenging CUHK01 and CUHK03 datasets.
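A minimal sketch of the gradient-based attention idea, assuming PyTorch; the tiny network, the 160x64 image size, and the use of the embedding norm as a scalar output are all stand-ins, not the authors' architecture.

```python
import torch
import torch.nn as nn

# Sketch: the saliency (attention) of each input pixel is the magnitude
# of the gradient of the network's output with respect to that pixel.
net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 128),                   # embedding used for re-id matching
)

x = torch.rand(1, 3, 160, 64, requires_grad=True)  # a "person image"
net(x).norm().backward()                  # scalar proxy for the output
attention = x.grad.abs().sum(dim=1)       # per-pixel sensitivity map
print(attention.shape)                    # torch.Size([1, 160, 64])
```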

29 citations


Cites background from "Supervised dictionary learning usin..."

  • ...metric [7–13] and developing a new feature representation [14–25]....


01 Jan 2017
TL;DR: The goal is to extract a discriminative representation of multimodal data by learning a dictionary whose decomposition coefficient vectors have maximal discriminatory power, so that the data's essential characteristics are easily found in subsequent analysis steps such as regression and classification.
Abstract: A phenomenon or event can be observed by various kinds of detectors or under different conditions. Each such acquisition framework is a modality of the phenomenon. Because the modalities of a multimodal phenomenon are related, a single modality cannot fully describe the event of interest, and having several modalities report on the same event introduces new challenges compared to exploiting each modality separately. We are interested in designing new algorithmic tools for sensor fusion within the signal representation framework of sparse coding, a popular methodology in signal processing, machine learning, and statistics for representing data. This coding scheme has been demonstrated to be capable of representing many modalities, such as natural images. We consider situations where we are interested not only in the support of the model being sparse, but also in reflecting a priori knowledge about the application at hand. Our goal is to extract a discriminative representation of the multimodal data that makes its essential characteristics easy to find in the subsequent analysis step, e.g., regression and classification. More precisely, sparse coding represents signals as linear combinations of a small number of bases from a dictionary; the idea is to learn a dictionary that encodes intrinsic properties of the multimodal data in a decomposition coefficient vector with maximal discriminatory power. We carefully design a multimodal representation framework to learn discriminative feature representations by fully exploiting the modality-shared information, which is shared across the various modalities, and the modality-specific information, which is the content unique to each individual modality.
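Schematically, the sparse coding objective the abstract describes, with the dictionary split into modality-shared and modality-specific parts, might be written as follows (a sketch of the general idea, not the thesis' exact formulation; $D_0$ denotes the shared dictionary and $D_m$ the one specific to modality $m$):

```latex
% x^{(m)} is the observation in modality m; alpha_m its sparse code.
\min_{D_0,\{D_m\},\{\alpha_m\}} \;
  \sum_{m=1}^{M} \left\| x^{(m)} - [\,D_0 \;\; D_m\,]\,\alpha_m \right\|_2^2
  \;+\; \lambda \sum_{m=1}^{M} \|\alpha_m\|_1
```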

5 citations

Proceedings ArticleDOI
07 Dec 2016
TL;DR: The proposed randomized approach is shown to bring about substantial savings in complexity and memory requirements for robust subspace learning over conventional approaches that use the full-scale data.
Abstract: This paper develops and analyzes a randomized design for robust Principal Component Analysis (PCA). In the proposed randomized method, a data sketch is constructed using random row sampling followed by random column sampling. The proposed randomized approach is shown to bring about substantial savings in complexity and memory requirements for robust subspace learning over conventional approaches that use the full-scale data. A characterization of the sample and computational complexity for the randomized approach is derived. It is shown that the correct subspace can be recovered with computational and sample complexity that are almost independent of the size of the data. The results of the mathematical analysis are confirmed through numerical simulations using both synthetic and real data.
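The column-sampling half of the idea can be illustrated in a few lines. This is a toy sketch only: it uses a plain SVD on the sampled columns, whereas the paper's method adds row sampling and an outlier-robust decomposition; sampled_subspace and the sizes are hypothetical.

```python
import numpy as np

def sampled_subspace(X, n_cols, r, seed=0):
    """Estimate the r-dimensional column subspace from sampled columns."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(X.shape[1], size=n_cols, replace=False)
    U, _, _ = np.linalg.svd(X[:, cols], full_matrices=False)
    return U[:, :r]

# Rank-5 data with 10,000 columns: 50 sampled columns already span the
# subspace, so the cost is nearly independent of the total data size.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 10_000))
U = sampled_subspace(X, n_cols=50, r=5)
print(np.linalg.norm(X - U @ (U.T @ X)) / np.linalg.norm(X))  # ~1e-15
```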

1 citation


Cites background from "Supervised dictionary learning usin..."

  • ...INTRODUCTION Linear subspace models are widely used in signal processing and data analysis since many datasets can be well-approximated with low-dimensional subspaces [1–12]....


Posted Content
TL;DR: The proposed algorithm separates background and foreground pixels by fitting the pixel values in each block to a smooth function with a robust regression method, and is shown to outperform other methods, such as hierarchical k-means clustering, shape primitive extraction and coding, and the least absolute deviation fitting scheme for foreground segmentation.
Abstract: This paper considers how to separate text and/or graphics from smooth background in screen content and mixed content images and proposes an algorithm to perform this segmentation task. The proposed method makes use of the fact that the background in each block usually varies smoothly and can be modeled well by a linear combination of a few smoothly varying basis functions, while foreground text and graphics create sharp discontinuities. The algorithm separates background and foreground pixels by trying to fit the pixel values in the block to a smooth function using a robust regression method. The inlier pixels that are well represented by the smooth model are considered background, while the remaining outlier pixels are considered foreground. We have also created a dataset of screen content images extracted from HEVC standard test sequences for screen content coding, with ground truth segmentation results, which can be used for this task. The proposed algorithm has been tested on this dataset and is shown to have superior performance over other methods, such as the hierarchical k-means clustering algorithm, shape primitive extraction and coding, and the least absolute deviation fitting scheme for foreground segmentation.
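A simplified sketch of the block-wise robust fit, assuming a low-order 2D polynomial basis and an L1-flavored iteratively reweighted least-squares loop; the paper's actual basis functions, robust estimator, and threshold may differ, and segment_block is an illustrative name.

```python
import numpy as np

def segment_block(block, degree=2, iters=10, thresh=10.0):
    """Split a block into smooth background vs. foreground pixels.

    Fit a low-order 2D polynomial by iteratively reweighted least
    squares (approximating a robust L1 fit), then call pixels with
    small residuals background.
    """
    h, w = block.shape
    yy, xx = np.mgrid[0:h, 0:w]
    x, y = xx.ravel() / w, yy.ravel() / h
    # Design matrix with terms 1, x, y, x^2, xy, y^2, ... up to `degree`.
    A = np.stack([x**i * y**j
                  for i in range(degree + 1)
                  for j in range(degree + 1 - i)], axis=1)
    z = block.ravel().astype(float)
    wts = np.ones_like(z)
    for _ in range(iters):
        sw = np.sqrt(wts)
        coef, *_ = np.linalg.lstsq(A * sw[:, None], z * sw, rcond=None)
        resid = np.abs(z - A @ coef)
        wts = 1.0 / np.maximum(resid, 1e-6)   # reweight toward an L1 fit
    return (resid < thresh).reshape(h, w)      # True = background pixel
```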

Cites background from "Supervised dictionary learning usin..."

  • ...It is good to note that algorithms based on supervised dictionary learning and subspace learning are also useful for deriving the smooth representation of background component [25]-[28]....


References
Journal ArticleDOI
01 Jan 1998
TL;DR: Convolutional neural networks trained with gradient-based learning are shown to outperform all other techniques on a standard handwritten digit recognition task, and a new learning paradigm, graph transformer networks (GTN), allows multi-module document recognition systems to be trained globally.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multi-module systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.
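For concreteness, a LeNet-flavored network for 28x28 digit images might look as follows in PyTorch; this is a schematic of the architecture family the paper describes, not a faithful reproduction of LeNet-5.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 6, 5, padding=2), nn.Tanh(), nn.AvgPool2d(2),   # 28 -> 14
    nn.Conv2d(6, 16, 5), nn.Tanh(), nn.AvgPool2d(2),             # 14 -> 5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
    nn.Linear(120, 84), nn.Tanh(),
    nn.Linear(84, 10),                                           # 10 digits
)
print(model(torch.rand(1, 1, 28, 28)).shape)   # torch.Size([1, 10])
```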

42,067 citations

01 Jan 1998

12,940 citations


"Supervised dictionary learning usin..." refers background in this paper

  • ...MNIST [19] and USPS [20] are standard handwritten digit databases, the ISOLET dataset [21] comprises examples of letters of the alphabet spoken in isolation by 30 individual speakers, and COIL2 [22] is a two-class object recognition dataset....


Journal ArticleDOI
TL;DR: This work considers the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise, and proposes a general classification algorithm for (image-based) object recognition based on a sparse representation computed by ℓ1-minimization.
Abstract: We consider the problem of automatically recognizing human faces from frontal views with varying expression and illumination, as well as occlusion and disguise. We cast the recognition problem as one of classifying among multiple linear regression models and argue that new theory from sparse signal representation offers the key to addressing this problem. Based on a sparse representation computed by ℓ1-minimization, we propose a general classification algorithm for (image-based) object recognition. This new framework provides new insights into two crucial issues in face recognition: feature extraction and robustness to occlusion. For feature extraction, we show that if sparsity in the recognition problem is properly harnessed, the choice of features is no longer critical. What is critical, however, is whether the number of features is sufficiently large and whether the sparse representation is correctly computed. Unconventional features such as downsampled images and random projections perform just as well as conventional features such as eigenfaces and Laplacianfaces, as long as the dimension of the feature space surpasses a certain threshold predicted by the theory of sparse representation. This framework can handle errors due to occlusion and corruption uniformly by exploiting the fact that these errors are often sparse with respect to the standard (pixel) basis. The theory of sparse representation helps predict how much occlusion the recognition algorithm can handle and how to choose the training images to maximize robustness to occlusion. We conduct extensive experiments on publicly available databases to verify the efficacy of the proposed algorithm and corroborate the above claims.
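The classification rule the abstract describes can be sketched compactly: code the test sample over a dictionary whose atoms are the training samples themselves, then assign the class whose atoms give the smallest reconstruction residual. The ISTA solver below is one standard way to compute the ℓ1-regularized code; the paper does not prescribe this particular solver, and ista/src_classify are illustrative names.

```python
import numpy as np

def ista(D, x, lam=0.1, iters=500):
    """Minimize 0.5*||x - D a||^2 + lam*||a||_1 by soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(iters):
        g = a - D.T @ (D @ a - x) / L      # gradient step on the fit term
        a = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
    return a

def src_classify(D, labels, x, lam=0.1):
    """SRC sketch: code x over the training samples (columns of D),
    then pick the class whose atoms reconstruct x with least residual."""
    a = ista(D, x, lam)
    best, best_r = None, np.inf
    for c in np.unique(labels):
        r = np.linalg.norm(x - D @ np.where(labels == c, a, 0.0))
        if r < best_r:
            best, best_r = c, r
    return best
```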

9,658 citations


"Supervised dictionary learning usin..." refers methods in this paper

  • ...[11] used training data as atoms of the dictionary for the face recognition task....



  • ...We compare the proposed method (PM) to four popular DL based classification methods, RECL [11], SRSC [23], DLSI [6], FDDL [12], and two classical classification methods, K-nearest neighbor (K=3) and linear SVM....


Journal ArticleDOI
TL;DR: The K-SVD algorithm, a novel method for adapting dictionaries to achieve sparse signal representations, iterates between sparse coding of the examples based on the current dictionary and updating the dictionary atoms to better fit the data.
Abstract: In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.
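A compact sketch of the alternation the abstract describes, simplified from the published algorithm: a bare-bones OMP for the sparse coding step and a rank-1 SVD update per atom; omp/ksvd and all sizes are illustrative.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: select k atoms for signal x."""
    resid, idx = x.copy(), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(D.T @ resid))))
        coef, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
        resid = x - D[:, idx] @ coef
    a = np.zeros(D.shape[1])
    a[idx] = coef
    return a

def ksvd(X, n_atoms, k, iters=10, seed=0):
    """Bare-bones K-SVD: alternate OMP coding with rank-1 atom updates."""
    rng = np.random.default_rng(seed)
    D = rng.normal(size=(X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(iters):
        A = np.stack([omp(D, x, k) for x in X.T], axis=1)  # sparse coding
        for j in range(n_atoms):                           # dictionary update
            used = A[j] != 0
            if not used.any():
                continue
            # Residual with atom j removed, over the signals that use it.
            E = X[:, used] - D @ A[:, used] + np.outer(D[:, j], A[j, used])
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j], A[j, used] = U[:, 0], s[0] * Vt[0]    # rank-1 refit
    return D, A
```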

8,905 citations

Book
01 Jan 1999
TL;DR: This new edition contains five completely new chapters covering recent developments; the first edition (1999) sold 4,300 copies worldwide.
Abstract: We have sold 4300 copies worldwide of the first edition (1999). This new edition contains five completely new chapters covering new developments.

6,884 citations


"Supervised dictionary learning usin..." refers methods in this paper

  • ...Hence, we resort to Gibbs sampling [17] to approximate the posterior with S samples (the Gibbs sampling equations are available in [18] and are omitted due to lack of space)....

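Since the snippet above leans on Gibbs sampling, here is the mechanic in its simplest generic form, far from the paper's model-specific conditionals: alternately sample each variable from its conditional given the others, shown here for a bivariate normal with correlation rho.

```python
import numpy as np

def gibbs_bivariate_normal(S=5000, rho=0.8, seed=0):
    """Draw S Gibbs samples from a standard bivariate normal with
    correlation rho by alternating the two full conditionals."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    samples = np.empty((S, 2))
    sd = np.sqrt(1.0 - rho**2)        # std. dev. of each conditional
    for s in range(S):
        x = rng.normal(rho * y, sd)   # sample x | y
        y = rng.normal(rho * x, sd)   # sample y | x
        samples[s] = x, y
    return samples

print(np.corrcoef(gibbs_bivariate_normal().T)[0, 1])   # close to 0.8
```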