
Showing papers on "Autoencoder published in 2014"


Proceedings ArticleDOI
02 Dec 2014
TL;DR: It is demonstrated that autoencoders can detect subtle anomalies that linear PCA fails to detect, and that they are useful as nonlinear techniques that avoid the complex computation kernel PCA requires.
Abstract: This paper proposes to use autoencoders with nonlinear dimensionality reduction in the anomaly detection task. The authors apply dimensionality reduction using an autoencoder to both artificial data and real data, and compare it with linear PCA and kernel PCA to clarify its properties. The artificial data is generated from the Lorenz system, and the real data is spacecraft telemetry data. This paper demonstrates that autoencoders are able to detect subtle anomalies that linear PCA fails to detect. Also, autoencoders can increase their accuracy by being extended to denoising autoencoders. Moreover, autoencoders are useful as nonlinear techniques that do not require the complex computation of kernel PCA. Finally, the authors examine the features learned in the hidden layer of autoencoders, and show that autoencoders learn the normal state properly and activate differently on anomalous input.
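The anomaly score the paper relies on is the reconstruction error of the trained model. The NumPy sketch below illustrates that scoring rule only; the closed-form linear "autoencoder" (equivalent to PCA) and the toy low-dimensional data are illustrative stand-ins, not the paper's nonlinear or denoising models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: normal samples lie near a 1-D subspace of 3-D space.
normal = rng.normal(size=(500, 1)) @ np.array([[1.0, 2.0, -1.0]])
normal += 0.05 * rng.normal(size=normal.shape)

# Stand-in for a trained autoencoder: a linear AE with squared loss learns
# the PCA subspace, so the closed form suffices for this illustration.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
W = vt[:1]                                   # 1-D bottleneck

def reconstruct(x):
    return mean + (x - mean) @ W.T @ W

def anomaly_score(x):
    # Reconstruction error: large for points off the learned normal manifold.
    return np.sum((x - reconstruct(x)) ** 2, axis=-1)

test = np.vstack([normal[:5], rng.normal(size=(5, 3)) * 3.0])
print(anomaly_score(test))                   # the last 5 rows score higher
```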

860 citations


Posted Content
TL;DR: The Deep Contractive Network is proposed, a model with a new end-to-end training procedure that includes a smoothness penalty inspired by the contractive autoencoder (CAE); it increases network robustness to adversarial examples without a significant performance penalty.
Abstract: Recent work has shown deep neural networks (DNNs) to be highly susceptible to well-designed, small perturbations at the input layer, so-called adversarial examples. Taking images as an example, such distortions are often imperceptible, but can result in 100% misclassification for a state-of-the-art DNN. We study the structure of adversarial examples and explore network topology, pre-processing and training strategies to improve the robustness of DNNs. We perform various experiments to assess the removability of adversarial examples by corrupting with additional noise and pre-processing with denoising autoencoders (DAEs). We find that DAEs can remove substantial amounts of the adversarial noise. However, when stacking the DAE with the original DNN, the resulting network can again be attacked by new adversarial examples with even smaller distortion. As a solution, we propose the Deep Contractive Network, a model with a new end-to-end training procedure that includes a smoothness penalty inspired by the contractive autoencoder (CAE). This increases the network's robustness to adversarial examples, without a significant performance penalty.

632 citations


Proceedings Article
27 Jul 2014
TL;DR: This work proposes a simple method that first learns a nonlinear embedding of the original graph with a stacked autoencoder and then runs the k-means algorithm on the embedding to obtain the clustering result, significantly outperforming conventional spectral clustering.
Abstract: Recently, deep learning has been successfully adopted in many applications such as speech recognition and image classification. In this work, we explore the possibility of employing deep learning in graph clustering. We propose a simple method that first learns a nonlinear embedding of the original graph with a stacked autoencoder and then runs the k-means algorithm on the embedding to obtain the clustering result. We show that this simple method has a solid theoretical foundation, due to the similarity between autoencoders and spectral clustering in terms of what they actually optimize. We then demonstrate that the proposed method is more efficient and flexible than spectral clustering. First, the computational complexity of the autoencoder is much lower than that of spectral clustering: the former can be linear in the number of nodes of a sparse graph, while the latter is super-quadratic due to the eigenvalue decomposition. Second, when an additional sparsity constraint is imposed, we can simply employ the sparse autoencoder developed in the deep learning literature; in contrast, it is not straightforward to implement a sparse spectral method. Experimental results on various graph datasets show that the proposed method significantly outperforms conventional spectral clustering, which clearly indicates the effectiveness of deep learning in graph clustering.
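A minimal sketch of the embed-then-cluster pipeline, assuming a one-layer sigmoid autoencoder in place of the paper's stacked, sparsity-constrained version; the toy two-clique graph, layer sizes, and learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: two 10-node cliques joined by a single edge.
A = np.zeros((20, 20))
A[:10, :10] = 1.0; A[10:, 10:] = 1.0; A[0, 10] = A[10, 0] = 1.0
np.fill_diagonal(A, 0.0)
X = A / A.sum(1, keepdims=True)        # row-normalized similarities

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One-layer sigmoid autoencoder trained by plain gradient descent.
W1 = 0.1 * rng.normal(size=(20, 4)); W2 = 0.1 * rng.normal(size=(4, 20))
for _ in range(2000):
    H = sigmoid(X @ W1)                # the embedding
    G = 2.0 * (H @ W2 - X) / len(X)    # gradient of the reconstruction MSE
    gW1 = X.T @ (G @ W2.T * H * (1 - H))
    gW2 = H.T @ G
    W1 -= 0.5 * gW1; W2 -= 0.5 * gW2

# k-means (k=2) on the learned embedding.
H = sigmoid(X @ W1)
c = H[rng.choice(len(H), size=2, replace=False)]
for _ in range(20):
    lab = np.argmin(((H[:, None] - c) ** 2).sum(-1), axis=1)
    c = np.array([H[lab == k].mean(0) if (lab == k).any() else c[k]
                  for k in range(2)])
print(lab)                             # the two cliques separate
```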

596 citations


Proceedings ArticleDOI
03 Nov 2014
TL;DR: The problem of cross-modal retrieval, e.g., using a text query to search for images and vice versa, is considered, and a novel model, the correspondence autoencoder (Corr-AE), is proposed for solving it; the model is constructed by correlating the hidden representations of two uni-modal autoencoders.
Abstract: The problem of cross-modal retrieval, e.g., using a text query to search for images and vice versa, is considered in this paper. A novel model involving a correspondence autoencoder (Corr-AE) is proposed here for solving this problem. The model is constructed by correlating the hidden representations of two uni-modal autoencoders. A novel objective, which minimizes a linear combination of the representation learning error for each modality and the correlation learning error between the hidden representations of the two modalities, is used to train the model as a whole. Minimizing the correlation learning error forces the model to learn hidden representations containing only the information common to the two modalities, while minimizing the representation learning error ensures the hidden representations are good enough to reconstruct the input of each modality. A parameter $\alpha$ is used to balance the representation learning error and the correlation learning error. Based on two different multi-modal autoencoders, Corr-AE is extended to two other correspondence models, called Corr-Cross-AE and Corr-Full-AE. The proposed models are evaluated on three publicly available data sets from real scenes. We demonstrate that the three correspondence autoencoders perform significantly better than three canonical correlation analysis based models and two popular multi-modal deep models on cross-modal retrieval tasks.
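The objective is compact enough to sketch directly. The version below assumes squared error for both the reconstruction and the correlation terms; the exact distance used for the correlation term and the value of $\alpha$ are assumptions.

```python
import numpy as np

def corr_ae_loss(x, x_rec, y, y_rec, hx, hy, alpha=0.2):
    """Corr-AE objective (schematic): a linear combination of each modality's
    reconstruction error and a correlation-learning term that pulls the two
    hidden codes together. Squared distances and alpha are assumptions."""
    rec = np.mean((x - x_rec) ** 2) + np.mean((y - y_rec) ** 2)
    corr = np.mean((hx - hy) ** 2)
    return (1 - alpha) * rec + alpha * corr

# Toy usage with random arrays standing in for the two autoencoders' outputs.
rng = np.random.default_rng(0)
x, y = rng.random((8, 100)), rng.random((8, 50))  # e.g., image vs. text inputs
h = rng.random((8, 16))
print(corr_ae_loss(x, x * 0.9, y, y * 0.9, h, h + 0.01))
```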

558 citations


Journal ArticleDOI
TL;DR: A deep learning network (DLN) is proposed to discover unknown feature correlations between input signals that are crucial for the learning task; it provides better performance than SVM and naive Bayes classifiers.
Abstract: Automatic emotion recognition is one of the most challenging tasks. To detect emotion from nonstationary EEG signals, a sophisticated learning algorithm that can represent high-level abstraction is required. This study proposes using a deep learning network (DLN) to discover unknown feature correlations between input signals that are crucial for the learning task. The DLN is implemented with a stacked autoencoder (SAE) using a hierarchical feature learning approach. The input features of the network are the power spectral densities of 32-channel EEG signals from 32 subjects. To alleviate the overfitting problem, principal component analysis (PCA) is applied to extract the most important components of the initial input features. Furthermore, covariate shift adaptation of the principal components is implemented to minimize the nonstationary effect of the EEG signals. Experimental results show that the DLN is capable of classifying three different levels of valence and arousal with accuracies of 49.52% and 46.03%, respectively. Principal-component-based covariate shift adaptation enhances the respective classification accuracies by 5.55% and 6.53%. Moreover, the DLN provides better performance than SVM and naive Bayes classifiers.

432 citations


Journal ArticleDOI
TL;DR: A training method that encodes each word into a distinct vector in semantic space is presented, together with its relation to low-entropy coding, and is applied to stylistic analyses of two Chinese novels.

390 citations


Posted Content
TL;DR: This work explores the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments, and achieves state-of-the-art performance.
Abstract: Cross-language learning allows us to use training data from one language to build models for a different language. Many approaches to bilingual learning require word-level alignment of sentences from parallel corpora. In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are aligned between two languages, while not relying on word-level alignments. We show that by simply learning to reconstruct the bag-of-words representations of aligned sentences, within and between languages, we can in fact learn high-quality representations and do without word alignments. Since training autoencoders on word observations presents certain computational issues, we propose and compare different variations adapted to this setting. We also propose an explicit correlation-maximizing regularizer that leads to significant performance improvements. We empirically investigate the success of our approach on the problem of cross-language text classification, where a classifier trained on a given language (e.g., English) must learn to generalize to a different language (e.g., German). These experiments demonstrate that our approaches are competitive with the state of the art, achieving improvements of up to 10-14 percentage points over the best reported results on this task.

330 citations


Proceedings ArticleDOI
23 Jun 2014
TL;DR: A dimensionality reduction method based on manifold learning, which iteratively explores data relations and uses them to pursue the manifold structure, and a multilayer architecture of the generalized autoencoder, called the deep generalized autoencoder, to handle highly complex datasets.
Abstract: The autoencoder algorithm and its deep version, as traditional dimensionality reduction methods, have achieved great success via the powerful representability of neural networks. However, they use each instance only to reconstruct itself, and neglect to explicitly model the data relations that would reveal the underlying effective manifold structure. In this paper, we propose a dimensionality reduction method based on manifold learning, which iteratively explores data relations and uses them to pursue the manifold structure. The method is realized by a so-called "generalized autoencoder" (GAE), which extends the traditional autoencoder in two aspects: (1) each instance $x_i$ is used to reconstruct a set of instances $\{x_j\}$ rather than just itself; (2) the reconstruction error for each instance ($\|x_j - x'_i\|^2$) is weighted by a relational function of $x_i$ and $x_j$ defined on the learned manifold. Hence, the GAE captures the structure of the data space by minimizing the weighted distances between reconstructed instances and the original ones. The generalized autoencoder provides a general neural network framework for dimensionality reduction. In addition, we propose a multilayer architecture of the generalized autoencoder, called the deep generalized autoencoder, to handle highly complex datasets. Finally, to evaluate the proposed methods, we perform extensive experiments on three datasets. The experiments demonstrate that the proposed methods achieve promising performance.
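The weighted objective can be sketched directly from the two extensions above. The Gaussian affinity used for the relation matrix S below is one typical choice, assumed here rather than taken from the paper.

```python
import numpy as np

def gae_loss(X, X_rec, S):
    """Generalized-autoencoder objective: the reconstruction x'_i produced
    from instance x_i is compared against a set of instances x_j, and each
    error ||x_j - x'_i||^2 is weighted by the relation S[i, j]."""
    diff = ((X[None, :, :] - X_rec[:, None, :]) ** 2).sum(-1)  # diff[i, j]
    return (S * diff).sum() / len(X)

rng = np.random.default_rng(0)
X = rng.random((5, 3))
S = np.exp(-((X[:, None] - X[None]) ** 2).sum(-1))   # Gaussian affinities
print(gae_loss(X, X + 0.1 * rng.random((5, 3)), S))
```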

320 citations


Journal ArticleDOI
TL;DR: An Adaptive Denoising Autoencoder, an unsupervised domain adaptation method in which prior knowledge learned from a target set is used to regularize training on a source set, achieving a matched feature-space representation for the target and source sets while ensuring target-domain knowledge transfer.
Abstract: With the availability of speech data obtained from different devices and under varied acquisition conditions, we are often faced with scenarios where an intrinsic discrepancy between the training and the test data has an adverse impact on affective speech analysis. To address this issue, this letter introduces an Adaptive Denoising Autoencoder based on an unsupervised domain adaptation method, in which prior knowledge learned from a target set is used to regularize the training on a source set. Our goal is to achieve a matched feature-space representation for the target and source sets while ensuring target-domain knowledge transfer. The method has been successfully evaluated using the 2009 INTERSPEECH Emotion Challenge's FAU Aibo Emotion Corpus as the target corpus and two other publicly available speech emotion corpora as sources. The experimental results show that our method significantly improves over the baseline performance and outperforms related feature-domain adaptation methods.

253 citations


Proceedings Article
Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra
21 Jun 2014
TL;DR: In this paper, a deep, generative autoencoder capable of learning hierarchies of distributed representations from data is introduced, where successive deep stochastic hidden layers are equipped with autoregressive connections to enable the model to be sampled from quickly and exactly via ancestral sampling.
Abstract: We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data. Successive deep stochastic hidden layers are equipped with autoregressive connections, which enable the model to be sampled from quickly and exactly via ancestral sampling. We derive an efficient approximate parameter estimation method based on the minimum description length (MDL) principle, which can be seen as maximising a variational lower bound on the log-likelihood, with a feedforward neural network implementing approximate inference. We demonstrate state-of-the-art generative performance on a number of classic data sets, including several UCI data sets, MNIST and Atari 2600 games.
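The autoregressive connections are what make exact ancestral sampling possible: each stochastic unit is sampled conditioned only on the units before it. Below is a sketch of that sampling step for a single binary layer; the logistic conditional and the sizes are illustrative assumptions.

```python
import numpy as np

def sample_autoregressive_layer(W, b, rng):
    """Ancestral sampling of one stochastic binary layer with autoregressive
    connections: unit i is sampled conditioned on units 0..i-1, so W must be
    strictly lower triangular. The logistic conditional is an assumption."""
    h = np.zeros(len(b))
    for i in range(len(b)):
        p = 1.0 / (1.0 + np.exp(-(W[i, :i] @ h[:i] + b[i])))
        h[i] = float(rng.random() < p)
    return h

rng = np.random.default_rng(1)
W = np.tril(rng.normal(size=(12, 12)), k=-1)   # strictly lower triangular
b = np.zeros(12)
print(sample_autoregressive_layer(W, b, rng))  # one exact ancestral sample
```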

Book ChapterDOI
TL;DR: A network supporting deep unsupervised learning is presented: an autoencoder with lateral shortcut connections from the encoder to the decoder at each level of the hierarchy, analogous to hierarchical latent variable models.
Abstract: A network supporting deep unsupervised learning is presented. The network is an autoencoder with lateral shortcut connections from the encoder to the decoder at each level of the hierarchy. The lateral shortcut connections allow the higher levels of the hierarchy to focus on abstract invariant features. Whereas standard autoencoders are analogous to latent variable models with a single layer of stochastic variables, the proposed network is analogous to hierarchical latent variable models. Learning combines the denoising autoencoder and denoising source separation frameworks. Each layer of the network contributes to the cost function a term that measures the distance between the representations produced by the encoder and the decoder. Since training signals originate from all levels of the network, all layers can learn efficiently, even in deep networks. The speedup offered by cost terms from higher levels of the hierarchy and the ability to learn invariant features are demonstrated in experiments.
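The per-layer cost terms can be sketched directly from the description above; the squared distance and the per-layer weights are assumptions.

```python
import numpy as np

def layerwise_cost(encoder_reps, decoder_reps, weights):
    """Total training cost as the abstract describes it: every level of the
    hierarchy contributes a term measuring the distance between the
    representation the encoder produced there and the one the decoder
    reconstructed there."""
    return sum(w * np.mean((e - d) ** 2)
               for w, e, d in zip(weights, encoder_reps, decoder_reps))

# Toy usage: three levels of activations from a hypothetical encoder/decoder.
rng = np.random.default_rng(0)
enc = [rng.random((4, n)) for n in (32, 16, 8)]
dec = [e + 0.05 * rng.random(e.shape) for e in enc]
print(layerwise_cost(enc, dec, weights=(1.0, 0.5, 0.25)))
```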

Proceedings Article
08 Dec 2014
TL;DR: This article explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments.
Abstract: Cross-language learning allows one to use training data from one language to build models for a different language. Many approaches to bilingual learning require word-level alignment of sentences from parallel corpora. In this work we explore the use of autoencoder-based methods for cross-language learning of vectorial word representations that are coherent between two languages, while not relying on word-level alignments. We show that by simply learning to reconstruct the bag-of-words representations of aligned sentences, within and between languages, we can in fact learn high-quality representations and do without word alignments. We empirically investigate the success of our approach on the problem of cross-language text classification, where a classifier trained on a given language (e.g., English) must learn to generalize to a different language (e.g., German). In experiments on three language pairs, we show that our approach achieves state-of-the-art performance, outperforming a method exploiting word alignments and a strong machine translation baseline.
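The core training signal, reconstructing each language's bag-of-words from the code of either sentence in an aligned pair, can be sketched as follows. The linear encoder/decoder maps, squared error, and vocabulary/code sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vx, vy, k = 1000, 1200, 40                 # vocab sizes and code size (arbitrary)
Ex, Ey = rng.normal(0, 0.01, (vx, k)), rng.normal(0, 0.01, (vy, k))
Dx, Dy = rng.normal(0, 0.01, (k, vx)), rng.normal(0, 0.01, (k, vy))

def bilingual_loss(x_bow, y_bow):
    """Reconstruct each language's bag-of-words from the code of either
    sentence in an aligned pair: within- and between-language terms."""
    loss = 0.0
    for h in (x_bow @ Ex, y_bow @ Ey):     # code from either language
        loss += np.sum((h @ Dx - x_bow) ** 2) + np.sum((h @ Dy - y_bow) ** 2)
    return loss

x = (rng.random(vx) < 0.01).astype(float)  # toy aligned bag-of-words pair
y = (rng.random(vy) < 0.01).astype(float)
print(bilingual_loss(x, y))
```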

Proceedings ArticleDOI
20 Sep 2014
TL;DR: With experiments on gene annotation data from the Gene Ontology project, it is shown that deep autoencoder networks achieve better performance than other standard machine learning methods, including the popular truncated singular value decomposition.
Abstract: The annotation of genomic information is a major challenge in biology and bioinformatics. Existing databases of known gene functions are incomplete and prone to errors, and the biomolecular experiments needed to improve these databases are slow and costly. While computational methods are not a substitute for experimental verification, they can help in two ways: algorithms can aid in the curation of gene annotations by automatically suggesting inaccuracies, and they can predict previously unidentified gene functions, accelerating the rate of gene function discovery. In this work, we develop an algorithm that achieves both goals using deep autoencoder neural networks. With experiments on gene annotation data from the Gene Ontology project, we show that deep autoencoder networks achieve better performance than other standard machine learning methods, including the popular truncated singular value decomposition.

Journal ArticleDOI
TL;DR: A novel computational framework, based on a deep learning approach, is proposed that enables the integration of sensory-motor time-series data and the self-organization of multimodal fused representations.

Proceedings ArticleDOI
01 Nov 2014
TL;DR: Denoising autoencoders (DAs), which employ a data-defined learning objective independent of known biology, are used to identify and extract complex patterns from genomic data, constructing features that represent tumor or normal samples, estrogen receptor (ER) status, and molecular subtypes.
Abstract: Big data bring new opportunities for methods that efficiently summarize and automatically extract knowledge from large data compendia. While both supervised learning algorithms and unsupervised clustering algorithms have been successfully applied to biological data, they are either dependent on known biology or limited to discerning the most significant signals in the data. Here we present denoising autoencoders (DAs), which employ a data-defined learning objective independent of known biology, as a method to identify and extract complex patterns from genomic data. We evaluate the performance of DAs by applying them to a large collection of breast cancer gene expression data. Results show that DAs successfully construct features that contain both clinical and molecular information. There are features that represent tumor or normal samples, estrogen receptor (ER) status, and molecular subtypes. Features constructed by the autoencoder generalize to an independent dataset collected using a distinct experimental platform. By integrating data from ENCODE for feature interpretation, we discover a feature representing ER status through its association with key transcription factors in breast cancer. We also identify a feature highly predictive of patient survival that is enriched for the FOXM1 signaling pathway. The features constructed by DAs are often bimodally distributed, with one peak near zero and another near one, which facilitates discretization. In summary, we demonstrate that DAs effectively extract key biological principles from gene expression data and summarize them into constructed features with convenient properties.
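Two ingredients from the description above are easy to sketch: the denoising corruption applied during training and the discretization enabled by the bimodal features. The masking-noise form, the corruption rate, and the 0.5 threshold are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_level=0.1):
    """Masking noise: randomly zero out a fraction of the input genes; the
    DA is trained to reconstruct the clean expression profile from this
    corrupted copy."""
    return x * (rng.random(x.shape) >= noise_level)

def discretize(activations, threshold=0.5):
    # Learned features are often bimodal near 0 and 1, so thresholding
    # gives a clean binary label per sample; 0.5 is an assumed cut point.
    return (activations > threshold).astype(int)

# Toy usage: a batch of "expression profiles" scaled to [0, 1].
x = rng.random((3, 20))
print(corrupt(x)[0])
print(discretize(rng.random((3, 5))))
```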

Proceedings ArticleDOI
12 Jul 2014
TL;DR: Experimental results indicate that this GA-assisted approach improves the performance of a deep autoencoder, producing a sparser neural network.
Abstract: In recent years, deep learning methods applying unsupervised learning to train deep layers of neural networks have achieved remarkable results in numerous fields. In the past, many genetic algorithm based methods have been successfully applied to training neural networks. In this paper, we extend previous work and propose a GA-assisted method for deep learning. Our experimental results indicate that this GA-assisted approach improves the performance of a deep autoencoder, producing a sparser neural network.

Proceedings Article
08 Dec 2014
TL;DR: This work shows how a bi-linear model of transformations, such as a gated autoencoder, can be turned into a recurrent network, by training it to predict future frames from the current one and the inferred transformation using backprop-through-time.
Abstract: We propose modeling time series by representing the transformations that take a frame at time t to a frame at time t+1. To this end we show how a bi-linear model of transformations, such as a gated autoencoder, can be turned into a recurrent network, by training it to predict future frames from the current one and the inferred transformation using backprop-through-time. We also show how stacking multiple layers of gating units in a recurrent pyramid makes it possible to represent the "syntax" of complicated time series, and that it can outperform standard recurrent neural networks in terms of prediction accuracy on a variety of tasks.
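A schematic of the prediction step with a factored gated autoencoder, assuming the usual three-way factorization: mapping units are inferred from two consecutive frames and then re-applied to the latest frame to predict the next one. The weights below are left untrained and all sizes are arbitrary; the paper trains such a model with backprop-through-time.

```python
import numpy as np

rng = np.random.default_rng(0)
d, f, n_map = 8, 16, 6                    # frame, factor, mapping sizes (arbitrary)
U = rng.normal(0, 0.1, (f, d))            # factor weights on frame t
V = rng.normal(0, 0.1, (f, d))            # factor weights on frame t+1
Wm = rng.normal(0, 0.1, (n_map, f))       # factors -> mapping units

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_transform(x_t, x_t1):
    # Mapping units encode the transformation between consecutive frames.
    return sigmoid(Wm @ ((U @ x_t) * (V @ x_t1)))

def predict_next(x_latest, m_units):
    # Re-apply the inferred transformation to the latest frame.
    return V.T @ ((U @ x_latest) * (Wm.T @ m_units))

x0, x1 = rng.normal(size=d), rng.normal(size=d)
m_units = infer_transform(x0, x1)
x2_hat = predict_next(x1, m_units)        # shapes only: weights are untrained
print(x2_hat.shape)
```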

Posted Content
TL;DR: This work projects 3D shapes into 2D space and uses an autoencoder for feature learning on the 2D images, and shows that the proposed deep learning feature is complementary to conventional local image descriptors, obtaining state-of-the-art performance on 3D shape retrieval benchmarks.
Abstract: We study the problem of how to build a deep learning representation for 3D shape. Deep learning has been shown to be very effective in a variety of visual applications, such as image classification and object detection. However, it has not been successfully applied to 3D shape recognition, because 3D shape has complex structure in 3D space and only a limited number of 3D shapes are available for feature learning. To address these problems, we project 3D shapes into 2D space and use an autoencoder for feature learning on the 2D images. High-accuracy 3D shape retrieval performance is obtained by aggregating the features learned on the 2D images. In addition, we show that the proposed deep learning feature is complementary to conventional local image descriptors. By combining the global deep learning representation and the local descriptor representation, our method obtains state-of-the-art performance on 3D shape retrieval benchmarks.

Proceedings ArticleDOI
01 Nov 2014
TL;DR: Experimental results on the Japanese Female Facial Expression database and the extended Cohn-Kanade dataset show that the method outperforms others and demonstrate the effectiveness and robustness of the face-parsing components.
Abstract: This paper studies facial expression recognition using components obtained by face parsing (FP). Since different parts of the face contain different amounts of information about facial expression, and the weighting of these parts is not the same across faces, we propose recognizing facial expressions using the components that are most active in expression disclosure. The face-parsing detectors are trained via a deep belief network and tuned by logistic regression. The detectors first detect the face, and then detect the nose, eyes, and mouth hierarchically. A deep architecture pretrained with a stacked autoencoder is applied to facial expression recognition using the concentrated features of the detected components. The parsing components remove redundant information in expression recognition, and the images need neither alignment nor any other artificial treatment. Experimental results on the Japanese Female Facial Expression database and the extended Cohn-Kanade dataset show that the method outperforms others and demonstrate the effectiveness and robustness of this algorithm.

Proceedings ArticleDOI
01 Apr 2014
TL;DR: This study trained a deep autoencoder to build compact representations of short-term spectra of multiple speakers, using this compact representation as mapping features, and trained an artificial neural network to predict target voice features from source voice features.
Abstract: In this study, we trained a deep autoencoder to build compact representations of short-term spectra of multiple speakers. Using this compact representation as mapping features, we then trained an artificial neural network to predict target voice features from source voice features. Finally, we constructed a deep neural network from the trained deep autoencoder and artificial neural network weights, which were then fine-tuned using back-propagation. We compared the proposed method to existing methods using Gaussian mixture models and frame-selection. We evaluated the methods objectively, and also conducted perceptual experiments to measure both the conversion accuracy and speech quality of selected systems. The results showed that, for 70 training sentences, frame-selection performed best, regarding both accuracy and quality. When using only two training sentences, the pre-trained deep neural network performed best, regarding both accuracy and quality.

Book ChapterDOI
03 Nov 2014
TL;DR: This research proposes an autoencoder-based collaborative filtering method that provides pretraining and stacking mechanisms; experiments show its potential and effectiveness in achieving higher recall.
Abstract: Collaborative filtering is currently widely used in recommender systems. With the development of deep learning, much research has been conducted to improve collaborative filtering by integrating deep learning techniques. In this research, we propose an autoencoder-based collaborative filtering method that provides pretraining and stacking mechanisms. The experimental study on the commonly used MovieLens datasets shows its potential and effectiveness in achieving higher recall.
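A sketch of the two pieces such a recommender needs around the autoencoder: a loss computed over observed ratings only, and ranking of unseen items by their reconstructed scores. The masked squared error is a common choice, assumed here rather than taken from the paper.

```python
import numpy as np

def masked_mse(R, R_hat, mask):
    """Reconstruction loss over observed ratings only; unobserved entries
    must not contribute to the training signal."""
    return np.sum(mask * (R - R_hat) ** 2) / mask.sum()

def recommend(R_hat, mask, user, k=1):
    # Rank the items the user has not rated by their reconstructed scores.
    scores = np.where(mask[user] == 0, R_hat[user], -np.inf)
    return np.argsort(scores)[::-1][:k]

R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.]])
mask = (R > 0).astype(float)              # 0 marks an unrated item
R_hat = R + 0.1                           # stand-in for the autoencoder output
print(masked_mse(R, R_hat, mask), recommend(R_hat, mask, user=1, k=2))
```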

Proceedings Article
01 Jan 2014
TL;DR: In this paper, the k-sparse autoencoder is proposed: an autoencoder with a linear activation function in which only the k highest activities in the hidden layer are kept. It achieves better classification results than denoising autoencoders, networks trained with dropout, and RBMs.
Abstract: Recently, it has been observed that when representations are learnt in a way that encourages sparsity, improved performance is obtained on classification tasks. These methods involve combinations of activation functions, sampling steps and different kinds of penalties. To investigate the effectiveness of sparsity by itself, we propose the k-sparse autoencoder, which is an autoencoder with a linear activation function, where in the hidden layer only the k highest activities are kept. When applied to the MNIST and NORB datasets, we find that this method achieves better classification results than denoising autoencoders, networks trained with dropout, and RBMs. k-sparse autoencoders are simple to train and the encoding stage is very fast, making them well suited to large problem sizes, where conventional sparse coding algorithms cannot be applied.
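The k-sparse operation itself is a few lines over the hidden activities. Per the abstract, this sketch takes the top k by value (the k highest activities), not by magnitude.

```python
import numpy as np

def k_sparse(h, k):
    """Keep only the k highest activities in each hidden vector and zero the
    rest; with linear units this support selection is the autoencoder's only
    nonlinearity."""
    h = np.asarray(h, dtype=float)
    idx = np.argpartition(h, -k, axis=-1)[..., -k:]
    out = np.zeros_like(h)
    np.put_along_axis(out, idx, np.take_along_axis(h, idx, axis=-1), axis=-1)
    return out

print(k_sparse([[0.3, -1.2, 0.9, 0.1]], k=2))   # keeps 0.9 and 0.3
```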

Proceedings ArticleDOI
03 Nov 2014
TL;DR: A novel attribute discovery approach that can automatically identify, model, and name attributes from an arbitrary set of image and text pairs that can be easily gathered on the Web, and is thus able to build a large visual knowledge base without any human effort.
Abstract: Higher-level semantics such as visual attributes are crucial for fundamental multimedia applications. We present a novel attribute discovery approach that can automatically identify, model and name attributes from an arbitrary set of image and text pairs that can be easily gathered on the Web. Unlike conventional attribute discovery methods, our approach does not rely on any pre-defined vocabularies or human labeling. Therefore, we are able to build a large visual knowledge base without any human effort. The discovery is based on a novel deep architecture, named the Independent Component Multimodal Autoencoder (ICMAE), that can continually learn shared higher-level representations across the visual and textual modalities. With the help of the resulting representations, which encode strong visual and semantic evidence, we propose to (a) identify attributes and their corresponding high-quality training images, (b) iteratively model them with maximum compactness and comprehensiveness, and (c) name the attribute models with human-understandable words. To date, the proposed system has discovered 1,898 attributes over 1.3 million pairs of image and text. Extensive experiments on various real-world multimedia datasets demonstrate the quality and effectiveness of the discovered attributes, facilitating multimedia applications such as image annotation and retrieval in comparison with state-of-the-art approaches.

Proceedings ArticleDOI
04 May 2014
TL;DR: A robust stacked autoencoder based on the maximum correntropy criterion (MCC) is proposed to deal with data containing non-Gaussian noises and outliers; experimental results show that the R-SAE is capable of learning robust features from noisy data.
Abstract: Unsupervised feature learning with deep networks has been widely studied in recent years. Despite the progress, most existing models are fragile to non-Gaussian noises and outliers due to the criterion of mean square error (MSE). In this paper, we propose a robust stacked autoencoder (R-SAE) based on the maximum correntropy criterion (MCC) to deal with data containing non-Gaussian noises and outliers. By replacing MSE with MCC, the anti-noise ability of the stacked autoencoder is improved. The proposed method is evaluated using the MNIST benchmark dataset. Experimental results show that, compared with the ordinary stacked autoencoder, the R-SAE improves classification accuracy by 14% and reduces the reconstruction error by 39%, which demonstrates that the R-SAE is capable of learning robust features from noisy data.
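The maximum correntropy criterion replaces MSE with a Gaussian-kernel similarity, so large errors saturate instead of dominating the objective. A minimal sketch, with the kernel width sigma as a free parameter:

```python
import numpy as np

def correntropy_loss(x, x_rec, sigma=1.0):
    """Negative correntropy between input and reconstruction. The Gaussian
    kernel saturates for large errors, so outliers and non-Gaussian noise
    contribute little, unlike MSE."""
    err2 = (x - x_rec) ** 2
    return -np.mean(np.exp(-err2 / (2.0 * sigma ** 2)))

x = np.zeros(100)
x_rec = x.copy(); x_rec[0] = 100.0            # one gross outlier
print(correntropy_loss(x, x_rec))             # barely moves from -1.0
print(np.mean((x - x_rec) ** 2))              # MSE is dominated: 100.0
```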

Proceedings ArticleDOI
31 Jul 2014
TL;DR: The proposed Stacked Sparse Autoencoder (SSAE) based framework for nuclei classification on breast cancer histopathology yields an accuracy of 83.7%, an F1 score of 82%, and an AUC of 0.8992, outperforming the Softmax classifier, PCA+Softmax, and SAE+Softmax.
Abstract: In this paper, a Stacked Sparse Autoencoder (SSAE) based framework is presented for nuclei classification on breast cancer histopathology. SSAE works very well in learning useful high-level features for better representation of raw input data. To show the effectiveness of the proposed framework, SSAE+Softmax is compared with the conventional Softmax classifier, PCA+Softmax, and a single-layer Sparse Autoencoder (SAE)+Softmax in classifying nuclei and non-nuclei patches extracted from breast cancer histopathology. SSAE+Softmax for nuclei patch classification yields an accuracy of 83.7%, an F1 score of 82%, and an AUC of 0.8992, outperforming the Softmax classifier, PCA+Softmax, and SAE+Softmax.

Proceedings ArticleDOI
04 May 2014
TL;DR: The experimental results show that the SHLA method significantly improves over the baseline performance and outperforms today's state-of-the-art domain adaptation methods.
Abstract: This study addresses a situation common in practice, where training and test samples come from different corpora - here in acoustic emotion recognition. In this situation, a model is trained on one database while tested on another, disjoint one. The typical inherent mismatch between the corpora, and thereby between the test and training sets, usually leads to significant performance degradation. To cope with this problem when no training data from the target domain exists, we propose a `shared-hidden-layer autoencoder' (SHLA) approach for learning common feature representations shared across the training and test sets, in order to reduce the discrepancy between them. To demonstrate the effectiveness of our approach, we select the Interspeech Emotion Challenge's FAU Aibo Emotion Corpus as the test database and two other publicly available databases as the training set for extensive evaluation. The experimental results show that our SHLA method significantly improves over the baseline performance and outperforms today's state-of-the-art domain adaptation methods.

Proceedings ArticleDOI
Kyunghyun Cho, Xi Chen
01 Jan 2014
TL;DR: This paper proposes a novel system to recognize actions from skeleton data with simple but effective features using deep neural networks; it achieves an accuracy above 95%, which is, to the authors' knowledge, the state-of-the-art result for such a large dataset.
Abstract: Gesture recognition using motion capture data and depth sensors has recently drawn more attention in vision research. Currently, most systems classify datasets with only a couple of dozen different actions. Moreover, feature extraction from the data is often computationally complex. In this paper, we propose a novel system to recognize actions from skeleton data with simple but effective features using deep neural networks. Features are extracted for each frame based on the relative positions of joints (PO), temporal differences (TD), and normalized trajectories of motion (NT). Given these features, a hybrid multi-layer perceptron is trained, which simultaneously classifies and reconstructs the input data. We use a deep autoencoder to visualize the learned features. The experiments show that deep neural networks can capture more discriminative information than, for instance, principal component analysis can. We test our system on a public database with 65 classes and more than 2,000 motion sequences. We obtain an accuracy above 95%, which is, to our knowledge, the state-of-the-art result for such a large dataset.

Proceedings Article
21 Jun 2014
TL;DR: Nested dropout, a procedure for stochastically removing coherent nested sets of hidden units in a neural network, is introduced and it is rigorously shown that the application of nested dropout enforces identifiability of the units, which leads to an exact equivalence with PCA.
Abstract: In this paper, we present results on ordered representations of data in which different dimensions have different degrees of importance. To learn these representations we introduce nested dropout, a procedure for stochastically removing coherent nested sets of hidden units in a neural network. We first present a sequence of theoretical results for the special case of a semilinear autoencoder. We rigorously show that the application of nested dropout enforces identifiability of the units, which leads to an exact equivalence with PCA. We then extend the algorithm to deep models and demonstrate the relevance of ordered representations to a number of applications. Specifically, we use the ordered property of the learned codes to construct hash-based data structures that permit very fast retrieval, achieving retrieval in time logarithmic in the database size and independent of the dimensionality of the representation. This allows codes that are hundreds of times longer than currently feasible for retrieval. We therefore avoid the diminished quality associated with short codes, while still performing retrieval that is competitive in speed with existing methods. We also show that ordered representations are a promising way to learn adaptive compression for efficient online data reconstruction.
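The nested dropout step is simple to state: sample a truncation index b and zero every hidden unit after it, so unit i can only ever rely on units before it. The geometric distribution below matches the ordered-importance intent but is an assumed choice in this sketch.

```python
import numpy as np

def nested_dropout(h, rho=0.1, rng=None):
    """Sample a truncation index b and zero all hidden units after it. This
    nested (rather than independent) masking is what imposes the ordering
    on the representation; Geometric(rho) is the assumed index law."""
    rng = rng or np.random.default_rng()
    h = np.asarray(h, dtype=float)
    b = min(int(rng.geometric(rho)), h.shape[-1])
    out = h.copy()
    out[..., b:] = 0.0
    return out

print(nested_dropout(np.arange(1.0, 9.0), rho=0.3,
                     rng=np.random.default_rng(0)))
```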

Posted Content
TL;DR: This work shows that negative biases are a natural result of using a hidden layer whose responsibility is both to represent the input data and to act as a selection mechanism ensuring sparsity of the representation, and it proposes a new activation function that decouples these two roles of the hidden layer.
Abstract: Regularized training of an autoencoder typically results in hidden unit biases that take on large negative values. We show that negative biases are a natural result of using a hidden layer whose responsibility is both to represent the input data and to act as a selection mechanism that ensures sparsity of the representation. We then show that negative biases impede the learning of data distributions whose intrinsic dimensionality is high. We also propose a new activation function that decouples the two roles of the hidden layer and allows us to learn representations on data with very high intrinsic dimensionality, where standard autoencoders typically fail. Since the decoupled activation function acts as an implicit regularizer, the model can be trained by minimizing the reconstruction error of the training data, without requiring any additional regularization.
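One concrete way to decouple "selection" from "representation" is a thresholded-linear unit: an explicit threshold test decides whether the unit is active, and an active unit passes its value through linearly, so no negative bias is needed to enforce sparsity. The exact functional form in the paper may differ; this is an illustrative sketch.

```python
import numpy as np

def thresholded_linear(z, theta=1.0):
    """Selection and representation decoupled: a unit is selected by an
    explicit threshold test on |z|, but when active it passes its linear
    value through unchanged."""
    z = np.asarray(z, dtype=float)
    return z * (np.abs(z) > theta)

print(thresholded_linear([-2.0, -0.5, 0.2, 1.5]))   # -> [-2.  0.  0.  1.5]
```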