Showing papers on "Unsupervised learning published in 2015"

PDF

Open Access

Journal Article•DOI•

[...]

01 Jan 2015-Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

...read moreread less

14,635 citations

Posted Content•

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

[...]

Alec Radford, Luke Metz, Soumith Chintala¹•Institutions (1)

Facebook¹

19 Nov 2015-arXiv: Learning

TL;DR: This work introduces a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrates that they are a strong candidate for unsupervised learning.

...read moreread less

Abstract: In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.

...read moreread less

6,759 citations

Proceedings Article•

Unsupervised Learning of Video Representations using LSTMs

[...]

Nitish Srivastava¹, Elman Mansimov¹, Ruslan Salakhudinov¹•Institutions (1)

University of Toronto¹

06 Jul 2015

TL;DR: In this paper, an encoder LSTM is used to map an input video sequence into a fixed length representation, which is then decoded using single or multiple decoder Long Short Term Memory (LSTM) networks to perform different tasks.

...read moreread less

Abstract: We use Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting the future sequence. We experiment with two kinds of input sequences - patches of image pixels and high-level representations ("percepts") of video frames extracted using a pretrained convolutional net. We explore different design choices such as whether the decoder LSTMs should condition on the generated output. We analyze the outputs of the model qualitatively to see how well the model can extrapolate the learned video representation into the future and into the past. We further evaluate the representations by finetuning them for a supervised learning problem - human action recognition on the UCF-101 and HMDB-51 datasets. We show that the representations help improve classification accuracy, especially when there are only few training examples. Even models pretrained on unrelated datasets (300 hours of YouTube videos) can help action recognition performance.

...read moreread less

2,217 citations

Proceedings Article•

Skip-thought vectors

[...]

Ryan Kiros¹, Yukun Zhu¹, Ruslan Salakhutdinov², Richard S. Zemel², Antonio Torralba³, Raquel Urtasun¹, Sanja Fidler¹ - Show less +3 more•Institutions (3)

University of Toronto¹, Canadian Institute for Advanced Research², Massachusetts Institute of Technology³

07 Dec 2015

TL;DR: This article used the continuity of text from books to train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage, which can produce highly generic sentence representations that are robust and perform well in practice.

...read moreread less

Abstract: We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next introduce a simple vocabulary expansion method to encode words that were not seen as part of training, allowing us to expand our vocabulary to a million words. After training our model, we extract and evaluate our vectors with linear models on 8 tasks: semantic relatedness, paraphrase detection, image-sentence ranking, question-type classification and 4 benchmark sentiment and subjectivity datasets. The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.

...read moreread less

1,802 citations

Book Chapter•DOI•

Deep metric learning using Triplet network

[...]

Elad Hoffer¹, Nir Ailon¹•Institutions (1)

Technion – Israel Institute of Technology¹

12 Oct 2015

TL;DR: This paper proposes the triplet network model, which aims to learn useful representations by distance comparisons, and demonstrates using various datasets that this model learns a better representation than that of its immediate competitor, the Siamese network.

...read moreread less

Abstract: Deep learning has proven itself as a successful set of models for learning useful semantic representations of data. These, however, are mostly implicitly learned as part of a classification task. In this paper we propose the triplet network model, which aims to learn useful representations by distance comparisons. A similar model was defined by Wang et al. (2014), tailor made for learning a ranking for image information retrieval. Here we demonstrate using various datasets that our model learns a better representation than that of its immediate competitor, the Siamese network. We also discuss future possible usage as a framework for unsupervised learning.

...read moreread less

1,635 citations

Posted Content•

Deep Unsupervised Learning using Nonequilibrium Thermodynamics

[...]

Jascha Sohl-Dickstein¹, Eric L. Weiss², Niru Maheswaranathan¹, Surya Ganguli¹•Institutions (2)

Stanford University¹, University of California, Berkeley²

12 Mar 2015-arXiv: Learning

TL;DR: This work develops an approach to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process, then learns a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data.

...read moreread less

Abstract: A central problem in machine learning involves modeling complex data-sets using highly flexible families of probability distributions in which learning, sampling, inference, and evaluation are still analytically or computationally tractable. Here, we develop an approach that simultaneously achieves both flexibility and tractability. The essential idea, inspired by non-equilibrium statistical physics, is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. We then learn a reverse diffusion process that restores structure in data, yielding a highly flexible and tractable generative model of the data. This approach allows us to rapidly learn, sample from, and evaluate probabilities in deep generative models with thousands of layers or time steps, as well as to compute conditional and posterior probabilities under the learned model. We additionally release an open source reference implementation of the algorithm.

...read moreread less

1,481 citations

Journal Article•DOI•

Machine learning applications in genetics and genomics

[...]

Maxwell W. Libbrecht¹, William Stafford Noble¹•Institutions (1)

University of Washington¹

07 May 2015-Nature Reviews Genetics

TL;DR: An overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data is provided.

...read moreread less

Abstract: The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets. Here, we provide an overview of machine learning applications for the analysis of genome sequencing data sets, including the annotation of sequence elements and epigenetic, proteomic or metabolomic data. We present considerations and recurrent challenges in the application of supervised, semi-supervised and unsupervised machine learning methods, as well as of generative and discriminative modelling approaches. We provide general guidelines to assist in the selection of these machine learning methods and their practical application for the analysis of genetic and genomic data sets.

...read moreread less

1,317 citations

Proceedings Article•

Semi-supervised learning with Ladder networks

[...]

Antti Rasmus, Harri Valpola, Mikko Honkala¹, Mathias Berglund², Tapani Raiko² - Show less +1 more•Institutions (2)

Nokia¹, Aalto University²

07 Dec 2015

TL;DR: This work builds on top of the Ladder network proposed by Valpola which is extended by combining the model with supervision and shows that the resulting model reaches state-of-the-art performance in semi-supervised MNIST and CIFAR-10 classification in addition to permutation-invariant MNIST classification with all labels.

...read moreread less

Abstract: We combine supervised learning with unsupervised learning in deep neural networks. The proposed model is trained to simultaneously minimize the sum of supervised and unsupervised cost functions by backpropagation, avoiding the need for layer-wise pre-training. Our work builds on top of the Ladder network proposed by Valpola [1] which we extend by combining the model with supervision. We show that the resulting model reaches state-of-the-art performance in semi-supervised MNIST and CIFAR-10 classification in addition to permutation-invariant MNIST classification with all labels.

...read moreread less

1,162 citations

Posted Content•

Skip-Thought Vectors

[...]

Ryan Kiros¹, Yukun Zhu¹, Ruslan Salakhutdinov², Richard S. Zemel², Antonio Torralba³, Raquel Urtasun¹, Sanja Fidler¹ - Show less +3 more•Institutions (3)

University of Toronto¹, Canadian Institute for Advanced Research², Massachusetts Institute of Technology³

22 Jun 2015-arXiv: Computation and Language

TL;DR: The approach for unsupervised learning of a generic, distributed sentence encoder is described, using the continuity of text from books to train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage.

...read moreread less

1,115 citations

Journal Article•DOI•

Unsupervised learning of digit recognition using spike-timing-dependent plasticity.

[...]

Peter U. Diehl¹, Matthew Cook¹•Institutions (1)

University of Zurich¹

03 Aug 2015-Frontiers in Computational Neuroscience

TL;DR: A SNN for digit recognition which is based on mechanisms with increased biological plausibility, i.e., conductance-based instead of current-based synapses, spike-timing-dependent plasticity with time-dependent weight change, lateral inhibition, and an adaptive spiking threshold is presented.

...read moreread less

Abstract: In order to understand how the mammalian neocortex is performing computations, two things are necessary; we need to have a good understanding of the available neuronal processing units and mechanisms, and we need to gain a better understanding of how those mechanisms are combined to build functioning systems. Therefore, in recent years there is an increasing interest in how spiking neural networks (SNN) can be used to perform complex computations or solve pattern recognition tasks. However, it remains a challenging task to design SNNs which use biologically plausible mechanisms (especially for learning new patterns), since most of such SNN architectures rely on training in a rate-based network and subsequent conversion to a SNN. We present a SNN for digit recognition which is based on mechanisms with increased biological plausibility, i.e. conductance-based instead of current-based synapses, spike-timing-dependent plasticity with time-dependent weight change, lateral inhibition, and an adaptive spiking threshold. Unlike most other systems, we do not use a teaching signal and do not present any class labels to the network. Using this unsupervised learning scheme, our architecture achieves 95% accuracy on the MNIST benchmark, which is better than previous SNN implementations without supervision. The fact that we used no domain-specific knowledge points toward the general applicability of our network design. Also, the performance of our network scales well with the number of neurons used and shows similar performance for four different learning rules, indicating robustness of the full combination of mechanisms, which suggests applicability in heterogeneous biological neural networks.

...read moreread less

1,098 citations

Proceedings Article•DOI•

Unsupervised Learning of Visual Representations Using Videos

[...]

Xiaolong Wang¹, Abhinav Gupta¹•Institutions (1)

Carnegie Mellon University¹

07 Dec 2015

TL;DR: A simple yet surprisingly powerful approach for unsupervised learning of CNN that uses hundreds of thousands of unlabeled videos from the web to learn visual representations and designs a Siamese-triplet network with a ranking loss function to train this CNN representation.

...read moreread less

Abstract: Is strong supervision necessary for learning a good visual representation? Do we really need millions of semantically-labeled images to train a Convolutional Neural Network (CNN)? In this paper, we present a simple yet surprisingly powerful approach for unsupervised learning of CNN. Specifically, we use hundreds of thousands of unlabeled videos from the web to learn visual representations. Our key idea is that visual tracking provides the supervision. That is, two patches connected by a track should have similar visual representation in deep feature space since they probably belong to same object or object part. We design a Siamese-triplet network with a ranking loss function to train this CNN representation. Without using a single image from ImageNet, just using 100K unlabeled videos and the VOC 2012 dataset, we train an ensemble of unsupervised networks that achieves 52% mAP (no bounding box regression). This performance comes tantalizingly close to its ImageNet-supervised counterpart, an ensemble which achieves a mAP of 54.4%. We also show that our unsupervised network can perform competitively in other tasks such as surface-normal estimation.

...read moreread less

Journal Article•DOI•

Transfer learning using computational intelligence

[...]

Jie Lu¹, Vahid Behbood¹, Peng Hao¹, Hua Zuo¹, Shan Xue¹, Guangquan Zhang¹ - Show less +2 more•Institutions (1)

University of Technology, Sydney¹

01 May 2015-Knowledge Based Systems

TL;DR: This paper systematically examines computational intelligence-based transfer learning techniques and clusters related technique developments into four main categories and provides state-of-the-art knowledge that will directly support researchers and practice-based professionals to understand the developments in computational Intelligence- based transfer learning research and applications.

...read moreread less

Abstract: Transfer learning aims to provide a framework to utilize previously-acquired knowledge to solve new but similar problems much more quickly and effectively. In contrast to classical machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling consisting of different data patterns in the current domain. To improve the performance of existing transfer learning methods and handle the knowledge transfer process in real-world systems, computational intelligence has recently been applied in transfer learning. This paper systematically examines computational intelligence-based transfer learning techniques and clusters related technique developments into four main categories: (a) neural network-based transfer learning; (b) Bayes-based transfer learning; (c) fuzzy transfer learning, and (d) applications of computational intelligence-based transfer learning. By providing state-of-the-art knowledge, this survey will directly support researchers and practice-based professionals to understand the developments in computational intelligence-based transfer learning research and applications.

...read moreread less

Posted Content•

Unsupervised Learning of Video Representations using LSTMs

[...]

Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov

16 Feb 2015-arXiv: Learning

TL;DR: This work uses Long Short Term Memory networks to learn representations of video sequences and evaluates the representations by finetuning them for a supervised learning problem - human action recognition on the UCF-101 and HMDB-51 datasets.

...read moreread less

Abstract: We use multilayer Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting the future sequence. We experiment with two kinds of input sequences - patches of image pixels and high-level representations ("percepts") of video frames extracted using a pretrained convolutional net. We explore different design choices such as whether the decoder LSTMs should condition on the generated output. We analyze the outputs of the model qualitatively to see how well the model can extrapolate the learned video representation into the future and into the past. We try to visualize and interpret the learned features. We stress test the model by running it on longer time scales and on out-of-domain data. We further evaluate the representations by finetuning them for a supervised learning problem - human action recognition on the UCF-101 and HMDB-51 datasets. We show that the representations help improve classification accuracy, especially when there are only a few training examples. Even models pretrained on unrelated datasets (300 hours of YouTube videos) can help action recognition performance.

...read moreread less

Journal Article•DOI•

Gaussian Processes for Data-Efficient Learning in Robotics and Control

[...]

Marc Peter Deisenroth¹, Dieter Fox², Carl Edward Rasmussen³•Institutions (3)

Imperial College London¹, University of Washington², University of Cambridge³

01 Feb 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper learns a probabilistic, non-parametric Gaussian process transition model of the system and applies it to autonomous learning in real robot and control tasks, achieving an unprecedented speed of learning.

...read moreread less

Abstract: Autonomous learning has been a promising direction in control and robotics for more than a decade since data-driven learning allows to reduce the amount of engineering knowledge, which is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the art RL our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.

...read moreread less

Journal Article•DOI•

Unsupervised Deep Feature Extraction for Remote Sensing Image Classification

[...]

Adriana Romero, Carlo Gatta¹, Gustau Camps-Valls•Institutions (1)

Autonomous University of Barcelona¹

25 Nov 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: The proposed algorithm clearly outperforms standard principal component analysis and its kernel counterpart (kPCA), as well as current state-of-the-art algorithms of aerial classification, while being extremely computationally efficient at learning representations of data.

...read moreread less

Abstract: This paper introduces the use of single layer and deep convolutional networks for remote sensing data analysis. Direct application to multi- and hyper-spectral imagery of supervised (shallow or deep) convolutional networks is very challenging given the high input data dimensionality and the relatively small amount of available labeled data. Therefore, we propose the use of greedy layer-wise unsupervised pre-training coupled with a highly efficient algorithm for unsupervised learning of sparse features. The algorithm is rooted on sparse representations and enforces both population and lifetime sparsity of the extracted features, simultaneously. We successfully illustrate the expressive power of the extracted representations in several scenarios: classification of aerial scenes, as well as land-use classification in very high resolution (VHR), or land-cover classification from multi- and hyper-spectral images. The proposed algorithm clearly outperforms standard Principal Component Analysis (PCA) and its kernel counterpart (kPCA), as well as current state-of-the-art algorithms of aerial classification, while being extremely computationally efficient at learning representations of data. Results show that single layer convolutional networks can extract powerful discriminative features only when the receptive field accounts for neighboring pixels, and are preferred when the classification requires high resolution and detailed results. However, deep architectures significantly outperform single layers variants, capturing increasing levels of abstraction and complexity throughout the feature hierarchy.

...read moreread less

Journal Article•DOI•

Saliency-Guided Unsupervised Feature Learning for Scene Classification

[...]

Fan Zhang¹, Bo Du¹, Liangpei Zhang¹•Institutions (1)

Wuhan University¹

01 Apr 2015-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: The proposed unsupervised-feature-learning-based scene classification method provides more accurate classification results than the other latent-Dirichlet-allocation-based methods and the sparse coding method.

...read moreread less

Abstract: Due to the rapid technological development of various different satellite sensors, a huge volume of high-resolution image data sets can now be acquired. How to efficiently represent and recognize the scenes from such high-resolution image data has become a critical task. In this paper, we propose an unsupervised feature learning framework for scene classification. By using the saliency detection algorithm, we extract a representative set of patches from the salient regions in the image data set. These unlabeled data patches are exploited by an unsupervised feature learning method to learn a set of feature extractors which are robust and efficient and do not need elaborately designed descriptors such as the scale-invariant-feature-transform-based algorithm. We show that the statistics generated from the learned feature extractors can characterize a complex scene very well and can produce excellent classification accuracy. In order to reduce overfitting in the feature learning step, we further employ a recently developed regularization method called “dropout,” which has proved to be very effective in image classification. In the experiments, the proposed method was applied to two challenging high-resolution data sets: the UC Merced data set containing 21 different aerial scene categories with a submeter resolution and the Sydney data set containing seven land-use categories with a 60-cm spatial resolution. The proposed method obtained results that were equal to or even better than the previous best results with the UC Merced data set, and it also obtained the highest accuracy with the Sydney data set, demonstrating that the proposed unsupervised-feature-learning-based scene classification method provides more accurate classification results than the other latent-Dirichlet-allocation-based methods and the sparse coding method.

...read moreread less

Posted Content•

Unsupervised Learning of Visual Representations using Videos

[...]

Xiaolong Wang¹, Abhinav Gupta¹•Institutions (1)

Carnegie Mellon University¹

04 May 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, a Siamese-triplet network with a ranking loss function was proposed to train a CNN representation for unsupervised learning of visual representations. But this method requires a large amount of unlabeled videos from the web to train the network.

...read moreread less

Abstract: Is strong supervision necessary for learning a good visual representation? Do we really need millions of semantically-labeled images to train a Convolutional Neural Network (CNN)? In this paper, we present a simple yet surprisingly powerful approach for unsupervised learning of CNN. Specifically, we use hundreds of thousands of unlabeled videos from the web to learn visual representations. Our key idea is that visual tracking provides the supervision. That is, two patches connected by a track should have similar visual representation in deep feature space since they probably belong to the same object or object part. We design a Siamese-triplet network with a ranking loss function to train this CNN representation. Without using a single image from ImageNet, just using 100K unlabeled videos and the VOC 2012 dataset, we train an ensemble of unsupervised networks that achieves 52% mAP (no bounding box regression). This performance comes tantalizingly close to its ImageNet-supervised counterpart, an ensemble which achieves a mAP of 54.4%. We also show that our unsupervised network can perform competitively in other tasks such as surface-normal estimation.

...read moreread less

Journal Article•DOI•

Self-labeled techniques for semi-supervised learning: taxonomy, software and empirical study

[...]

Isaac Triguero¹, Salvador García², Francisco Herrera¹•Institutions (2)

University of Granada¹, University of Jaén²

01 Feb 2015-Knowledge and Information Systems

TL;DR: This paper provides a survey of self-labeled methods for semi-supervised classification and proposes a taxonomy based on the main characteristics presented in them, aiming to measure their performance in terms of transductive and inductive classification capabilities.

...read moreread less

Abstract: Semi-supervised classification methods are suitable tools to tackle training sets with large amounts of unlabeled data and a small quantity of labeled data. This problem has been addressed by several approaches with different assumptions about the characteristics of the input data. Among them, self-labeled techniques follow an iterative procedure, aiming to obtain an enlarged labeled data set, in which they accept that their own predictions tend to be correct. In this paper, we provide a survey of self-labeled methods for semi-supervised classification. From a theoretical point of view, we propose a taxonomy based on the main characteristics presented in them. Empirically, we conduct an exhaustive study that involves a large number of data sets, with different ratios of labeled data, aiming to measure their performance in terms of transductive and inductive classification capabilities. The results are contrasted with nonparametric statistical tests. Note is then taken of which self-labeled models are the best-performing ones. Moreover, a semi-supervised learning module has been developed for the Knowledge Extraction based on Evolutionary Learning software, integrating analyzed methods and data sets.

...read moreread less

Proceedings Article•DOI•

Deep multiple instance learning for image classification and auto-annotation

[...]

Jiajun Wu¹, Yinan Yu², Chang Huang², Kai Yu²•Institutions (2)

Massachusetts Institute of Technology¹, Baidu²

07 Jun 2015

TL;DR: This paper attempts to model deep learning in a weakly supervised learning (multiple instance learning) framework, where each image follows a dual multi-instance assumption, where its object proposals and possible text annotations can be regarded as two instance sets.

...read moreread less

Abstract: The recent development in learning deep representations has demonstrated its wide applications in traditional vision tasks like classification and detection. However, there has been little investigation on how we could build up a deep learning framework in a weakly supervised setting. In this paper, we attempt to model deep learning in a weakly supervised learning (multiple instance learning) framework. In our setting, each image follows a dual multi-instance assumption, where its object proposals and possible text annotations can be regarded as two instance sets. We thus design effective systems to exploit the MIL property with deep learning strategies from the two ends; we also try to jointly learn the relationship between object and annotation proposals. We conduct extensive experiments and prove that our weakly supervised deep learning framework not only achieves convincing performance in vision tasks including classification and image annotation, but also extracts reasonable region-keyword pairs with little supervision, on both widely used benchmarks like PASCAL VOC and MIT Indoor Scene 67, and also a dataset for image-and patch-level annotations.

...read moreread less

Journal Article•DOI•

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

[...]

Yi Yang¹, Zhigang Ma², Feiping Nie³, Xiaojun Chang¹, Alexander G. Hauptmann² - Show less +1 more•Institutions (3)

University of Technology, Sydney¹, Carnegie Mellon University², Northwestern Polytechnical University³

01 Jun 2015-International Journal of Computer Vision

TL;DR: This paper proposes a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition that exploits the whole active pool to evaluate the uncertainty of the data, and proposes to make the selected data as diverse as possible.

...read moreread less

Abstract: As a way to relieve the tedious work of manual annotation, active learning plays important roles in many applications of visual concept recognition. In typical active learning scenarios, the number of labelled data in the seed set is usually small. However, most existing active learning algorithms only exploit the labelled data, which often suffers from over-fitting due to the small number of labelled examples. Besides, while much progress has been made in binary class active learning, little research attention has been focused on multi-class active learning. In this paper, we propose a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition. Our algorithm exploits the whole active pool to evaluate the uncertainty of the data. Considering that uncertain data are always similar to each other, we propose to make the selected data as diverse as possible, for which we explicitly impose a diversity constraint on the objective function. As a multi-class active learning algorithm, our algorithm is able to exploit uncertainty across multiple classes. An efficient algorithm is used to optimize the objective function. Extensive experiments on action recognition, object classification, scene recognition, and event detection demonstrate its advantages.

...read moreread less

Book•

Machine Learning: A Bayesian and Optimization Perspective

[...]

Sergios Theodoridis

10 Apr 2015

TL;DR: This tutorial text gives a unifying perspective on machine learning by covering both Probabilistic and deterministic approaches - which are based on optimization techniques together with the Bayesian inference approach, whose essence lies in the use of a hierarchy of probabilistic models.

...read moreread less

Abstract: This tutorial text gives a unifying perspective on machine learning by covering bothprobabilistic and deterministic approaches -which are based on optimization techniques together with the Bayesian inference approach, whose essence liesin the use of a hierarchy of probabilistic models. The book presents the major machine learning methods as they have been developed in different disciplines, such as statistics, statistical and adaptive signal processing and computer science. Focusing on the physical reasoning behind the mathematics, all the various methods and techniques are explained in depth, supported by examples and problems, giving an invaluable resource to the student and researcher for understanding and applying machine learning concepts. The book builds carefully from the basic classical methods to the most recent trends, with chapters written to be as self-contained as possible, making the text suitable for different courses: pattern recognition, statistical/adaptive signal processing, statistical/Bayesian learning, as well as short courses on sparse modeling, deep learning, and probabilistic graphical models. All major classical techniques: Mean/Least-Squares regression and filtering, Kalman filtering, stochastic approximation and online learning, Bayesian classification, decision trees, logistic regression and boosting methods. The latest trends: Sparsity, convex analysis and optimization, online distributed algorithms, learning in RKH spaces, Bayesian inference, graphical and hidden Markov models, particle filtering, deep learning, dictionary learning and latent variables modeling. Case studies - protein folding prediction, optical character recognition, text authorship identification, fMRI data analysis, change point detection, hyperspectral image unmixing, target localization, channel equalization and echo cancellation, show how the theory can be applied. MATLAB code for all the main algorithms are available on an accompanying website, enabling the reader to experiment with the code.

...read moreread less

Journal Article•DOI•

Multi-View Intact Space Learning

[...]

Chang Xu¹, Dacheng Tao², Chao Xu¹•Institutions (2)

Peking University¹, University of Technology, Sydney²

01 Dec 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, the authors proposed the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data.

...read moreread less

Abstract: It is practical to assume that an individual view is unlikely to be sufficient for effective multi-view learning Therefore, integration of multi-view information is both valuable and necessary In this paper, we propose the Multi-view Intact Space Learning (MISL) algorithm, which integrates the encoded complementary information in multiple views to discover a latent intact representation of the data Even though each view on its own is insufficient, we show theoretically that by combing multiple views we can obtain abundant information for latent intact space learning Employing the Cauchy loss (a technique used in statistical learning) as the error measurement strengthens robustness to outliers We propose a new definition of multi-view stability and then derive the generalization error bound based on multi-view stability and Rademacher complexity, and show that the complementarity between multiple views is beneficial for the stability and generalization MISL is efficiently optimized using a novel Iteratively Reweight Residuals (IRR) technique, whose convergence is theoretically analyzed Experiments on synthetic data and real-world datasets demonstrate that MISL is an effective and promising algorithm for practical applications

...read moreread less

Journal Article•DOI•

Robust network traffic classification

[...]

Jun Zhang¹, Xiao Chen¹, Yang Xiang¹, Wanlei Zhou¹, Jie Wu² - Show less +1 more•Institutions (2)

Deakin University¹, Temple University²

01 Aug 2015-IEEE ACM Transactions on Networking

TL;DR: The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes and is significantly better than four state-of-the-art methods.

...read moreread less

Abstract: As a fundamental tool for network management and security, traffic classification has attracted increasing attention in recent years. A significant challenge to the robustness of classification performance comes from zero-day applications previously unknown in traffic classification systems. In this paper, we propose a new scheme of Robust statistical Traffic Classification (RTC) by combining supervised and unsupervised machine learning techniques to meet this challenge. The proposed RTC scheme has the capability of identifying the traffic of zero-day applications as well as accurately discriminating predefined application classes. In addition, we develop a new method for automating the RTC scheme parameters optimization process. The empirical study on real-world traffic data confirms the effectiveness of the proposed scheme. When zero-day applications are present, the classification performance of the new scheme is significantly better than four state-of-the-art methods: random forest, correlation-based classification, semi-supervised clustering, and one-class SVM.

...read moreread less

Posted Content•

Towards Biologically Plausible Deep Learning

[...]

Yoshua Bengio, Dong-Hyun Lee, Jörg Bornschein, Zhouhan Lin

14 Feb 2015-arXiv: Learning

TL;DR: The theory about the probabilistic interpretation of auto-encoders is extended to justify improved sampling schemes based on the generative interpretation of denoising auto- Encoder, and these ideas are validated on generative learning tasks.

...read moreread less

Abstract: Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology. We explore more biologically plausible versions of deep representation learning, focusing here mostly on unsupervised learning but developing a learning mechanism that could account for supervised, unsupervised and reinforcement learning. The starting point is that the basic learning rule believed to govern synaptic weight updates (Spike-TimingDependent Plasticity) can be interpreted as gradient descent on some objective function so long as the neuronal dynamics push firing rates towards better values of the objective function (be it supervised, unsupervised, or reward-driven). The second main idea is that this corresponds to a form of the variational EM algorithm, i.e., with approximate rather than exact posteriors, implemented by neural dynamics. Another contribution of this paper is that the gradients required for updating the hidden states in the above variational interpretation can be estimated using an approximation that only requires propagating activations forward and backward, with pairs of layers learning to form a denoising auto-encoder. Finally, we extend the theory about the probabilistic interpretation of auto-encoders to justify improved sampling schemes based on the generative interpretation of denoising auto-encoders, and we validate all these ideas on generative learning tasks.

...read moreread less

Posted Content•

Stacked What-Where Auto-encoders

[...]

Junbo Jake Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun

08 Jun 2015-arXiv: Machine Learning

TL;DR: A novel architecture, the "stacked what-where auto-encoders" (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training.

...read moreread less

Abstract: We present a novel architecture, the "stacked what-where auto-encoders" (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvolutional net (Deconvnet) (Zeiler et al. (2010)) to produce the reconstruction. The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet. Each pooling layer produces two sets of variables: the "what" which are fed to the next layer, and its complementary variable "where" that are fed to the corresponding layer in the generative decoder.

...read moreread less

Proceedings Article•DOI•

From Group to Individual Labels Using Deep Features

[...]

Dimitrios Kotzias¹, Misha Denil², Nando de Freitas², Padhraic Smyth¹•Institutions (2)

University of California, Irvine¹, University of Oxford²

10 Aug 2015

TL;DR: This paper proposes a new objective function that encourages smoothness of inferred instance-level labels based on instance- level similarity, while at the same time respecting group-level label constraints, and applies this approach to the problem of predicting labels for sentences given labels for reviews, using a convolutional neural network to infer sentence similarity.

...read moreread less

Abstract: In many classification problems labels are relatively scarce. One context in which this occurs is where we have labels for groups of instances but not for the instances themselves, as in multi-instance learning. Past work on this problem has typically focused on learning classifiers to make predictions at the group level. In this paper we focus on the problem of learning classifiers to make predictions at the instance level. To achieve this we propose a new objective function that encourages smoothness of inferred instance-level labels based on instance-level similarity, while at the same time respecting group-level label constraints. We apply this approach to the problem of predicting labels for sentences given labels for reviews, using a convolutional neural network to infer sentence similarity. The approach is evaluated using three large review data sets from IMDB, Yelp, and Amazon, and we demonstrate the proposed approach is both accurate and scalable compared to various alternatives.

...read moreread less

Journal Article•DOI•

Domain Adaptation Extreme Learning Machines for Drift Compensation in E-Nose Systems

[...]

Lei Zhang¹, David Zhang²•Institutions (2)

Chongqing University¹, Hong Kong Polytechnic University²

01 Jul 2015-IEEE Transactions on Instrumentation and Measurement

TL;DR: Experiments on the popular sensor drift data with multiple batches collected using E-nose system clearly demonstrate that the proposed DAELM significantly outperforms existing drift-compensation methods without cumbersome measures, and also bring new perspectives for ELM.

...read moreread less

Abstract: This paper addresses an important issue known as sensor drift, which exhibits a nonlinear dynamic property in electronic nose (E-nose), from the viewpoint of machine learning. Traditional methods for drift compensation are laborious and costly owing to the frequent acquisition and labeling process for gas samples’ recalibration. Extreme learning machines (ELMs) have been confirmed to be efficient and effective learning techniques for pattern recognition and regression. However, ELMs primarily focus on the supervised, semisupervised, and unsupervised learning problems in single domain (i.e., source domain). To our best knowledge, ELM with cross-domain learning capability has never been studied. This paper proposes a unified framework called domain adaptation extreme learning machine (DAELM), which learns a robust classifier by leveraging a limited number of labeled data from target domain for drift compensation as well as gas recognition in E-nose systems, without losing the computational efficiency and learning ability of traditional ELM. In the unified framework, two algorithms called source DAELM (DAELM-S) and target DAELM (DAELM-T) are proposed in this paper. In order to perceive the differences among ELM, DAELM-S, and DAELM-T, two remarks are provided. Experiments on the popular sensor drift data with multiple batches collected using E-nose system clearly demonstrate that the proposed DAELM significantly outperforms existing drift-compensation methods without cumbersome measures, and also bring new perspectives for ELM.

...read moreread less

Book Chapter•DOI•

Deep Learning, Sparse Coding, and SVM for Melanoma Recognition in Dermoscopy Images

[...]

Noel C. F. Codella¹, Junjie Cai¹, Mani Abedini¹, Rahil Garnavi¹, Alan Halpern², John R. Smith¹ - Show less +2 more•Institutions (2)

IBM¹, Memorial Sloan Kettering Cancer Center²

05 Oct 2015

TL;DR: An approach for melanoma recognition in dermoscopy images that combines deep learning, sparse coding, and support vector machine SVM learning algorithms is presented, suggesting the proposed approach is an effective improvement over prior state-of-art.

...read moreread less

Abstract: This work presents an approach for melanoma recognition in dermoscopy images that combines deep learning, sparse coding, and support vector machine SVM learning algorithms. One of the beneficial aspects of the proposed approach is that unsupervised learning within the domain, and feature transfer from the domain of natural photographs, eliminates the need of annotated data in the target task to learn good features. The applied feature transfer also allows the system to draw analogies between observations in dermoscopic images and observations in the natural world, mimicking the process clinical experts themselves employ to describe patterns in skin lesions. To evaluate the methodology, performance is measured on a dataset obtained from the International Skin Imaging Collaboration, containing 2624 clinical cases of melanoma 334, atypical nevi 144, and benign lesions 2146. The approach is compared to the prior state-of-art method on this dataset. Two-fold cross-validation is performed 20 times for evaluation 40 total experiments, and two discrimination tasks are examined: 1 melanoma vs. all non-melanoma lesions, and 2 melanoma vs. atypical lesions only. The presented approach achieves an accuracy of 93.1% 94.9% sensitivity, and 92.8% specificity for the first task, and 73.9% accuracy 73.8% sensitivity, and 74.3% specificity for the second task. In comparison, prior state-of-art ensemble modeling approaches alone yield 91.2% accuracy 93.0% sensitivity, and 91.0% specificity first the first task, and 71.5% accuracy 72.7% sensitivity, and 68.9% specificity for the second. Differences in performance were statistically significant p $$<$$ 0.05, suggesting the proposed approach is an effective improvement over prior state-of-art.

...read moreread less

Proceedings Article•DOI•

Analysis of function of rectified linear unit used in deep learning

[...]

Kazuyuki Hara¹, Daisuke Saito², Hayaru Shouno³•Institutions (3)

College of Industrial Technology¹, Nihon University², University of Electro-Communications³

12 Jul 2015

TL;DR: A rectified linear unit (ReLU) is proposed to speed up the learning convergence of the deep learning using a using simpler network called the soft-committee machine and the reasons for the speedup are clarified.

...read moreread less

Abstract: Deep Learning is attracting much attention in object recognition and speech processing. A benefit of using the deep learning is that it provides automatic pre-training. Several proposed methods that include auto-encoder are being successfully used in various applications. Moreover, deep learning uses a multilayer network that consists of many layers, a huge number of units, and huge amount of data. Thus, executing deep learning requires heavy computation, so deep learning is usually utilized with parallel computation with many cores or many machines. Deep learning employs the gradient algorithm, however this traps the learning into the saddle point or local minima. To avoid this difficulty, a rectified linear unit (ReLU) is proposed to speed up the learning convergence. However, the reasons the convergence is speeded up are not well understood. In this paper, we analyze the ReLU by a using simpler network called the soft-committee machine and clarify the reason for the speedup. We also train the network in an on-line manner. The soft-committee machine provides a good test bed to analyze deep learning. The results provide some reasons for the speedup of the convergence of the deep learning.

...read moreread less

Proceedings Article•DOI•

Learning Image Representations Tied to Ego-Motion

[...]

Dinesh Jayaraman¹, Kristen Grauman¹•Institutions (1)

University of Texas at Austin¹

07 Dec 2015

TL;DR: This work proposes to exploit proprioceptive motor signals to provide unsupervised regularization in convolutional neural networks to learn visual representations from egocentric video to enforce that the authors' learned features exhibit equivariance, i.e, they respond predictably to transformations associated with distinct ego-motions.

...read moreread less

Abstract: Understanding how images of objects and scenes behave in response to specific ego-motions is a crucial aspect of proper visual development, yet existing visual learning methods are conspicuously disconnected from the physical source of their images. We propose to exploit proprioceptive motor signals to provide unsupervised regularization in convolutional neural networks to learn visual representations from egocentric video. Specifically, we enforce that our learned features exhibit equivariance, i.e, they respond predictably to transformations associated with distinct ego-motions. With three datasets, we show that our unsupervised feature learning approach significantly outperforms previous approaches on visual recognition and next-best-view prediction tasks. In the most challenging test, we show that features learned from video captured on an autonomous driving platform improve large-scale scene recognition in static images from a disjoint domain.

...read moreread less

Collapse