
Showing papers on "Active learning (machine learning)" published in 2015


Book
01 Jan 2015
TL;DR: The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way in an advanced undergraduate or beginning graduate course.
Abstract: Machine learning is one of the fastest growing areas of computer science, with far-reaching applications. The aim of this textbook is to introduce machine learning, and the algorithmic paradigms it offers, in a principled way. The book provides an extensive theoretical account of the fundamental ideas underlying machine learning and the mathematical derivations that transform these principles into practical algorithms. Following a presentation of the basics of the field, the book covers a wide array of central topics that have not been addressed by previous textbooks. These include a discussion of the computational complexity of learning and the concepts of convexity and stability; important algorithmic paradigms including stochastic gradient descent, neural networks, and structured output learning; and emerging theoretical concepts such as the PAC-Bayes approach and compression-based bounds. Designed for an advanced undergraduate or beginning graduate course, the text makes the fundamentals and algorithms of machine learning accessible to students and non-expert readers in statistics, computer science, mathematics, and engineering.

3,857 citations


Proceedings ArticleDOI
10 Dec 2015
TL;DR: Experimental results indicate that the proposed feature selection method based on a mutual information criterion is capable of improving the performance of the machine learning models in terms of prediction accuracy and reduction in training time.
Abstract: The application of machine learning models such as support vector machines (SVM) and artificial neural networks (ANN) to predicting reservoir properties has been effective in recent years when compared with the traditional empirical methods. Despite this, machine learning models suffer considerably in the face of uncertain data, a common characteristic of well log datasets. Sources of uncertainty in well log datasets include missing scales, data interpretation issues, and measurement errors. Feature selection aims to select a feature subset that is relevant to the property being predicted. In this paper, a feature selection method based on a mutual information criterion is proposed; its strong point is a threshold chosen on a statistically sound criterion for the typical greedy forward method of feature selection. Experimental results indicate that the proposed method is capable of improving the performance of the machine learning models in terms of prediction accuracy and reduction in training time.
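
As a rough illustration, mutual-information feature selection with a stopping threshold might look like the sketch below; the function name, the use of scikit-learn's MI estimator, and the fixed cutoff are assumptions, and the paper's statistically grounded choice of threshold is not reproduced here.

    # Hedged sketch: keep the features whose estimated mutual information
    # with the target clears a threshold, taking the strongest first.
    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    def mi_threshold_selection(X, y, threshold=0.01):
        mi = mutual_info_regression(X, y)    # MI of each feature with y
        order = np.argsort(mi)[::-1]         # greedily take the best first
        return [int(j) for j in order if mi[j] >= threshold]

The paper's contribution is precisely a principled rule for setting that threshold; the plain cutoff above just shows where such a rule plugs in.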

825 citations


Journal ArticleDOI
TL;DR: This paper systematically examines computational intelligence-based transfer learning techniques, clusters related technique developments into four main categories, and provides state-of-the-art knowledge that will directly support researchers and practice-based professionals in understanding the developments in computational intelligence-based transfer learning research and applications.
Abstract: Transfer learning aims to provide a framework to utilize previously-acquired knowledge to solve new but similar problems much more quickly and effectively. In contrast to classical machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling consisting of different data patterns in the current domain. To improve the performance of existing transfer learning methods and handle the knowledge transfer process in real-world systems, computational intelligence has recently been applied in transfer learning. This paper systematically examines computational intelligence-based transfer learning techniques and clusters related technique developments into four main categories: (a) neural network-based transfer learning; (b) Bayes-based transfer learning; (c) fuzzy transfer learning; and (d) applications of computational intelligence-based transfer learning. By providing state-of-the-art knowledge, this survey directly supports researchers and practice-based professionals in understanding the developments in computational intelligence-based transfer learning research and applications.

662 citations


Journal ArticleDOI
TL;DR: A systematic overview of the emerging field of quantum machine learning can be found in this paper, which presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.
Abstract: Machine learning algorithms learn a desired input-output relation from examples in order to interpret new inputs. This is important for tasks such as image and speech recognition or strategy optimisation, with growing applications in the IT industry. In the last couple of years, researchers investigated if quantum computing can help to improve classical machine learning algorithms. Ideas range from running computationally costly algorithms or their subroutines efficiently on a quantum computer to the translation of stochastic methods into the language of quantum theory. This contribution gives a systematic overview of the emerging field of quantum machine learning. It presents the approaches as well as technical details in an accessible way, and discusses the potential of a future theory of quantum learning.

580 citations


Journal ArticleDOI
TL;DR: This paper learns a probabilistic, non-parametric Gaussian process transition model of the system and applies it to autonomous learning in real robot and control tasks, achieving an unprecedented speed of learning.
Abstract: Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning allows one to reduce the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
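
A minimal sketch of the core ingredient, a Gaussian process transition model whose predictive uncertainty is available to the planner, assuming scikit-learn as a stand-in for the paper's GP machinery (the data, kernel, and dimensions are purely illustrative):

    # Fit a GP to observed transitions (state, action) -> next state, then
    # query mean AND standard deviation: the paper's point is that planning
    # should use the full predictive distribution, not only the mean.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))   # [state, action] inputs (illustrative)
    y = rng.normal(size=(200, 2))   # next-state targets (illustrative)

    gp = GaussianProcessRegressor(RBF() + WhiteKernel()).fit(X, y)
    mean, std = gp.predict(rng.normal(size=(1, 3)), return_std=True)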

575 citations


Proceedings ArticleDOI
18 Mar 2015
TL;DR: An empirical evaluation shows that Explanatory Debugging increased participants' understanding of the learning system by 52% and allowed participants to correct its mistakes up to twice as efficiently as participants using a traditional learning system.
Abstract: How can end users efficiently influence the predictions that machine learning systems make on their behalf? This paper presents Explanatory Debugging, an approach in which the system explains to users how it made each of its predictions, and the user then explains any necessary corrections back to the learning system. We present the principles underlying this approach and a prototype instantiating it. An empirical evaluation shows that Explanatory Debugging increased participants' understanding of the learning system by 52% and allowed participants to correct its mistakes up to twice as efficiently as participants using a traditional learning system.

445 citations




Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper attempts to model deep learning in a weakly supervised (multiple instance learning) framework, in which each image follows a dual multi-instance assumption: its object proposals and possible text annotations can be regarded as two instance sets.
Abstract: The recent development in learning deep representations has demonstrated its wide applications in traditional vision tasks like classification and detection. However, there has been little investigation on how we could build up a deep learning framework in a weakly supervised setting. In this paper, we attempt to model deep learning in a weakly supervised learning (multiple instance learning) framework. In our setting, each image follows a dual multi-instance assumption, where its object proposals and possible text annotations can be regarded as two instance sets. We thus design effective systems to exploit the MIL property with deep learning strategies from the two ends; we also try to jointly learn the relationship between object and annotation proposals. We conduct extensive experiments and prove that our weakly supervised deep learning framework not only achieves convincing performance in vision tasks including classification and image annotation, but also extracts reasonable region-keyword pairs with little supervision, on widely used benchmarks like PASCAL VOC and MIT Indoor Scene 67, as well as a dataset for image- and patch-level annotations.
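
The multi-instance assumption can be made concrete with a small sketch: an image is a bag of instance scores (e.g., per-proposal classifier outputs), and bag-level scores are obtained by pooling. Max-pooling is one common choice and is an assumption here, not necessarily the paper's exact aggregation.

    # An image = a bag of instances (object proposals); the bag-level score
    # for each class is the maximum over its instances, so an image is
    # positive for a class if at least one proposal is.
    import numpy as np

    proposals = np.random.rand(50, 20)      # 50 proposals x 20 classes
    image_scores = proposals.max(axis=0)    # one score per class per image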

406 citations


Journal ArticleDOI
TL;DR: This paper proposes a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition that exploits the whole active pool to evaluate the uncertainty of the data, and proposes to make the selected data as diverse as possible.
Abstract: As a way to relieve the tedious work of manual annotation, active learning plays an important role in many applications of visual concept recognition. In typical active learning scenarios, the amount of labelled data in the seed set is usually small. However, most existing active learning algorithms only exploit the labelled data, and so often suffer from over-fitting due to the small number of labelled examples. Besides, while much progress has been made in binary-class active learning, little research attention has been focused on multi-class active learning. In this paper, we propose a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition. Our algorithm exploits the whole active pool to evaluate the uncertainty of the data. Considering that uncertain data are often similar to each other, we propose to make the selected data as diverse as possible, for which we explicitly impose a diversity constraint on the objective function. As a multi-class active learning algorithm, our algorithm is able to exploit uncertainty across multiple classes. An efficient algorithm is used to optimize the objective function. Extensive experiments on action recognition, object classification, scene recognition, and event detection demonstrate its advantages.
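
In the spirit of the paper's objective, a batch selector that trades off multi-class uncertainty against diversity could be sketched as follows; the entropy score, the cosine penalty, and the weight lam are illustrative assumptions, not the paper's exact formulation.

    # Greedy batch selection: prefer points whose predicted class
    # distribution is uncertain (high entropy) but which are dissimilar
    # to points already chosen for the batch.
    import numpy as np

    def select_batch(probs, X, batch_size, lam=0.5):
        entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
        chosen = []
        for _ in range(batch_size):
            if chosen:
                sims = X @ X[chosen].T
                norms = (np.linalg.norm(X, axis=1, keepdims=True)
                         * np.linalg.norm(X[chosen], axis=1))
                penalty = (sims / (norms + 1e-12)).max(axis=1)
            else:
                penalty = 0.0
            score = entropy - lam * penalty
            score[chosen] = -np.inf   # never re-select a point
            chosen.append(int(np.argmax(score)))
        return chosen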

401 citations


Proceedings ArticleDOI
Xinli Yang, David Lo, Xin Xia, Yun Zhang, Jianling Sun
03 Aug 2015
TL;DR: An approach named Deeper is proposed, which leverages deep learning techniques to predict defect-prone changes: expressive features are built with a deep belief network algorithm, and a machine learning classifier is built on the selected features.
Abstract: Defect prediction is a very meaningful topic, particularly at the change level. Change-level defect prediction, also referred to as just-in-time defect prediction, can not only ensure software quality in the development process, but also help developers check and fix defects in time. Nowadays, deep learning is a hot topic in the machine learning literature, yet whether deep learning can be used to improve the performance of just-in-time defect prediction is still uninvestigated. In this paper, to bridge this research gap, we propose an approach named Deeper which leverages deep learning techniques to predict defect-prone changes. We first build a set of expressive features from a set of initial change features by leveraging a deep belief network algorithm. Next, a machine learning classifier is built on the selected features. To evaluate the performance of our approach, we use datasets from six large open source projects, i.e., Bugzilla, Columba, JDT, Platform, Mozilla, and PostgreSQL, containing a total of 137,417 changes. We compare our approach with the approach proposed by Kamei et al. The experimental results show that on average across the 6 projects, Deeper discovers 32.22% more bugs than Kamei et al.'s approach (51.04% versus 18.82% on average). In addition, Deeper achieves F1-scores of 0.22-0.63, which are statistically significantly higher than those of Kamei et al.'s approach on 4 out of the 6 projects.
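
As a rough stand-in for the Deeper pipeline, the sketch below pairs a single RBM layer (instead of the paper's deep belief network) with a standard classifier; the component choices and hyperparameters are assumptions made for illustration.

    # Unsupervised feature learning followed by a conventional classifier,
    # mirroring Deeper's "DBN features + ML classifier" structure.
    # Note: BernoulliRBM expects inputs scaled to [0, 1].
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    deeper_like = Pipeline([
        ("rbm", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    # deeper_like.fit(change_features, defect_labels)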

305 citations


Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper proposes an architecture to create a flexible and scalable machine learning as a service, using real-world sensor and weather data by running different algorithms at the same time.
Abstract: The demand for knowledge extraction has been increasing. With the growing amount of data being generated by global data sources (e.g., social media and mobile apps) and the popularization of context-specific data (e.g., the Internet of Things), companies and researchers need to connect all these data and extract valuable information. Machine learning has been gaining much attention in data mining, giving birth to new solutions. This paper proposes an architecture to create a flexible and scalable machine learning as a service. An open source solution was implemented and is presented. As a case study, a forecast of electricity demand was generated using real-world sensor and weather data by running different algorithms at the same time.

Posted Content
TL;DR: This paper develops a stochastic optimisation algorithm that allows for scalable information maximisation and empowerment-based reasoning directly from pixels to actions, focusing on the problem of intrinsically-motivated learning.
Abstract: The mutual information is a core statistical quantity that has applications in all areas of machine learning, whether this is in training of density models over multiple data modalities, in maximising the efficiency of noisy transmission channels, or when learning behaviour policies for exploration by artificial agents. Most learning algorithms that involve optimisation of the mutual information rely on the Blahut-Arimoto algorithm, an enumerative algorithm with exponential complexity that is not suitable for modern machine learning applications. This paper provides a new approach for scalable optimisation of the mutual information by merging techniques from variational inference and deep learning. We develop our approach by focusing on the problem of intrinsically-motivated learning, where the mutual information forms the definition of a well-known internal drive known as empowerment. Using a variational lower bound on the mutual information, combined with convolutional networks for handling visual input streams, we develop a stochastic optimisation algorithm that allows for scalable information maximisation and empowerment-based reasoning directly from pixels to actions.
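
The variational lower bound mentioned here is, in its standard Barber-Agakov form (notation assumed; the paper specializes it to actions and future states for empowerment):

    % For any variational decoder q(x|y), with equality when q = p(x|y):
    I(X;Y) = H(X) - H(X \mid Y)
           \ge H(X) + \mathbb{E}_{p(x,y)}\left[\log q(x \mid y)\right]

Maximizing the right-hand side over the parameters of q (a convolutional network in the paper) replaces the enumerative Blahut-Arimoto step with stochastic gradient ascent.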

Proceedings ArticleDOI
12 Jul 2015
TL;DR: The rectified linear unit (ReLU), proposed to speed up learning convergence in deep learning, is analyzed using a simpler network called the soft-committee machine, and the reasons for the speedup are clarified.
Abstract: Deep learning is attracting much attention in object recognition and speech processing. A benefit of using deep learning is that it provides automatic pre-training. Several proposed methods that include auto-encoders are being successfully used in various applications. Moreover, deep learning uses a multilayer network that consists of many layers, a huge number of units, and a huge amount of data. Thus, executing deep learning requires heavy computation, so deep learning is usually utilized with parallel computation on many cores or many machines. Deep learning employs the gradient algorithm; however, this traps the learning in saddle points or local minima. To avoid this difficulty, the rectified linear unit (ReLU) was proposed to speed up the learning convergence. However, the reasons the convergence is sped up are not well understood. In this paper, we analyze the ReLU by using a simpler network called the soft-committee machine and clarify the reason for the speedup. We also train the network in an on-line manner. The soft-committee machine provides a good test bed to analyze deep learning. The results provide some reasons for the speedup of the convergence of deep learning.
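
For reference, a soft-committee machine is a two-layer network whose hidden-to-output weights are fixed to one, so only the input-to-hidden weights are learned; the analysis then comes down to the choice of activation g (notation assumed):

    % Soft-committee machine with K hidden units; g is the activation,
    % classically sigmoidal, here replaced by the ReLU.
    y(\mathbf{x}) = \sum_{k=1}^{K} g(\mathbf{w}_k \cdot \mathbf{x}),
    \qquad g(u) = \max(0, u) \ \text{for the ReLU}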

Proceedings Article
25 Jan 2015
TL;DR: The reader's attention is drawn to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model, and the Socratic dialogue style aims to stimulate critical thinking.
Abstract: I draw the reader's attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.
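
In its simplest form, the machine teaching problem stated here can be written as an optimization over training sets (a schematic formulation; the notation is assumed):

    % Find the cheapest training set D that drives learner A to target theta*.
    \min_{D}\ \mathrm{TeachingCost}(D)
    \quad \text{subject to} \quad A(D) = \theta^{*}

Here A maps a training set to the model the learner would produce, theta* is the target model, and TeachingCost is often simply the size of D.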

Journal ArticleDOI
TL;DR: Two new statistical learning methods for estimating the optimal DTR are introduced, termed backward outcome weighted learning (BOWL) and simultaneous outcome weighted learning (SOWL); it is proved that the resulting rules are consistent, and finite sample bounds are provided for the errors using the estimated rules.
Abstract: Dynamic treatment regimes (DTRs) are sequential decision rules for individual patients that can adapt over time to an evolving illness. The goal is to accommodate heterogeneity among patients and find the DTR which will produce the best long-term outcome if implemented. We introduce two new statistical learning methods for estimating the optimal DTR, termed backward outcome weighted learning (BOWL) and simultaneous outcome weighted learning (SOWL). These approaches convert individualized treatment selection into either a sequential or a simultaneous classification problem, and can thus be applied by modifying existing machine learning techniques. The proposed methods are based on directly maximizing over all DTRs a nonparametric estimator of the expected long-term outcome; this is fundamentally different from regression-based methods, for example Q-learning, which indirectly attempt such maximization and rely heavily on the correctness of postulated regression models. We prove that the resulting rules are consistent, and provide finite sample bounds for the errors using the estimated rules.
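
Both methods build on outcome weighted learning, where the value of a decision rule d is estimated by inverse-probability weighting and maximized directly (single-stage form shown; notation assumed, with R the reward, A the assigned treatment, X the covariates, and pi the treatment randomization probability):

    % Pick the rule whose recommendations agree with high-reward observed
    % treatments, reweighted by 1/pi.
    \hat{d} = \arg\max_{d}\
    \mathbb{E}_n\!\left[\frac{R\,\mathbf{1}\{A = d(X)\}}{\pi(A \mid X)}\right]

BOWL applies this stage by stage backward through the treatment sequence, while SOWL optimizes all stages simultaneously.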

Journal ArticleDOI
TL;DR: This work considers an attacker that aims to maximize the SVM's classification error by flipping a number of labels in the training data; it formalizes a corresponding optimal attack strategy and solves it by means of heuristic approaches to keep the computational complexity tractable.
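
A brute-force greedy version of such an attack can be sketched as follows; the paper's heuristics are more refined, and the retrain-and-evaluate loop here is purely illustrative (and expensive).

    # Greedy label-flip attack: repeatedly flip the single training label
    # whose flip most increases the retrained SVM's error.
    from sklearn.svm import SVC

    def greedy_label_flips(X_tr, y_tr, X_val, y_val, budget):
        y_adv = y_tr.copy()
        for _ in range(budget):
            best_i, best_err = None, -1.0
            for i in range(len(y_adv)):
                y_try = y_adv.copy()
                y_try[i] = -y_try[i]   # binary labels in {-1, +1}
                model = SVC(kernel="linear").fit(X_tr, y_try)
                err = 1.0 - model.score(X_val, y_val)
                if err > best_err:
                    best_i, best_err = i, err
            y_adv[best_i] = -y_adv[best_i]
        return y_adv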

Journal ArticleDOI
TL;DR: This paper presents a complete approach to a successful utilization of a high-performance extreme learning machine (ELM) toolbox for Big Data; it summarizes recent advances in algorithmic performance, gives a fresh view on the ELM solution in relation to the traditional linear algebraic performance, and reaps the latest software and hardware performance achievements.
Abstract: This paper presents a complete approach to a successful utilization of a high-performance extreme learning machine (ELM) toolbox for Big Data. It summarizes recent advances in algorithmic performance; gives a fresh view on the ELM solution in relation to the traditional linear algebraic performance; and reaps the latest software and hardware performance achievements. The results are applicable to a wide range of machine learning problems and thus provide a solid ground for tackling numerous Big Data challenges. The included toolbox is targeted at enabling the full potential of ELMs to the widest range of users.
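
The core ELM computation that such a toolbox accelerates is small enough to sketch directly: random, fixed hidden-layer weights, with only the output weights solved in closed form. This is a minimal sketch; the toolbox adds the big-data and performance machinery.

    # Basic ELM: random hidden layer + least-squares output weights.
    import numpy as np

    def elm_fit(X, y, n_hidden=100, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
        b = rng.standard_normal(n_hidden)                # random biases
        H = np.tanh(X @ W + b)                           # hidden activations
        beta, *_ = np.linalg.lstsq(H, y, rcond=None)     # output weights
        return W, b, beta

    def elm_predict(X, W, b, beta):
        return np.tanh(X @ W + b) @ beta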

Book ChapterDOI
01 Jan 2015
TL;DR: The ability of machine learning algorithms to learn from the current context and generalize to unseen tasks would allow improvements in both the safety and efficacy of radiotherapy practice, leading to better outcomes.
Abstract: Machine learning is an evolving branch of computational algorithms that are designed to emulate human intelligence by learning from the surrounding environment. These algorithms are considered the workhorse of the new era of so-called big data. Techniques based on machine learning have been applied successfully in diverse fields ranging from pattern recognition, computer vision, spacecraft engineering, finance, entertainment, and computational biology to biomedical and medical applications. More than half of the patients with cancer receive ionizing radiation (radiotherapy) as part of their treatment, and it is the main treatment modality at advanced stages of local disease. Radiotherapy involves a large set of processes that not only span the period from consultation to treatment but also extend beyond that to ensure that the patients have received the prescribed radiation dose and are responding well. The degree of complexity of these processes can vary and may involve several stages of sophisticated human-machine interactions and decision making, which would naturally invite the use of machine learning algorithms to optimize and automate these processes, including but not limited to radiation physics quality assurance, contouring and treatment planning, image-guided radiotherapy, respiratory motion management, treatment response modeling, and outcomes prediction. The ability of machine learning algorithms to learn from the current context and generalize to unseen tasks would allow improvements in both the safety and efficacy of radiotherapy practice, leading to better outcomes.

Journal ArticleDOI
TL;DR: A new learning function based on information entropy is proposed; it can help select the next point effectively and add it to the design of experiments to update the metamodel in a more efficient way.
Abstract: In structural reliability, an important challenge is to reduce the number of calls to the performance function, especially when it is a finite element model in an engineering problem, which usually involves complex computer codes and requires time-consuming computations. To solve this problem, a metamodel, Kriging, is introduced as a surrogate for the original model. Kriging presents interesting characteristics such as exact interpolation and a local index of uncertainty on the prediction, which can be used in an active learning method. In this paper, a new learning function based on information entropy is proposed. The new learning criterion can help select the next point effectively and add it to the design of experiments to update the metamodel. It is then applied in a new method constructed in this paper, which combines Kriging and Line Sampling to estimate the reliability of structures in a more efficient way. In the end, several examples involving non-linearity, high dimensionality, and engineering problems are used to demonstrate the efficiency of the methods with the proposed learning function.
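
Under the Gaussian predictor that Kriging provides, an entropy-based learning function can be sketched as the entropy of the predicted failure indicator. This is a plausible reading of the criterion; the paper's exact function may differ in detail.

    # At each candidate x, the Kriging model gives mean mu and std sigma of
    # the performance function g(x). The next point to evaluate is the one
    # whose failure indicator 1{g(x) < 0} is most uncertain.
    import numpy as np
    from scipy.stats import norm

    def entropy_criterion(mu, sigma):
        p = norm.cdf(-mu / np.maximum(sigma, 1e-12))   # P(g(x) < 0)
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -(p * np.log(p) + (1 - p) * np.log(1 - p))

    # next_index = np.argmax(entropy_criterion(mu, sigma)); evaluate the
    # true model there, add the point to the design, and refit the Kriging.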

Journal ArticleDOI
TL;DR: The first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer is reported, which can be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.
Abstract: Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, and is ubiquitous in various fields such as computer science, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv:1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors into different clusters using a small-scale photonic quantum computer; the classifications are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

Proceedings Article
25 Jul 2015
TL;DR: This paper investigates the intersection of reinforcement learning and expert demonstrations, leveraging the theoretical guarantees provided by reinforcement learning, and using expert demonstrations to speed up this learning by biasing exploration through a process called reward shaping.
Abstract: Reinforcement learning describes how a learning agent can achieve optimal behaviour based on interactions with its environment and reward feedback. A limiting factor in reinforcement learning as employed in artificial intelligence is the need for an often prohibitively large number of environment samples before the agent reaches a desirable level of performance. Learning from demonstration is an approach that provides the agent with demonstrations by a supposed expert, from which it should derive suitable behaviour. Yet, one of the challenges of learning from demonstration is that no guarantees can be provided for the quality of the demonstrations, and thus the learned behavior. In this paper, we investigate the intersection of these two approaches, leveraging the theoretical guarantees provided by reinforcement learning, and using expert demonstrations to speed up this learning by biasing exploration through a process called reward shaping. This approach allows us to leverage human input without making an erroneous assumption regarding demonstration optimality. We show experimentally that this approach requires significantly fewer demonstrations, is more robust against suboptimality of demonstrations, and achieves much faster learning than the recently developed HAT algorithm.
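
The shaping mechanism referred to here is potential-based reward shaping, which provably leaves the optimal policy unchanged (Ng, Harada & Russell, 1999). One illustrative way to turn demonstrations into a potential is a kernel similarity to expert-visited states; that construction is an assumption, not the paper's.

    # Potential-based shaping bonus F(s, s') = gamma * phi(s') - phi(s),
    # with phi large near expert-visited states, biasing exploration
    # toward the demonstrations without changing the optimal policy.
    import numpy as np

    def make_shaping(demo_states, gamma=0.99, bandwidth=1.0):
        demo = np.asarray(demo_states, dtype=float)
        def phi(s):
            d2 = ((demo - s) ** 2).sum(axis=1)
            return np.exp(-d2 / (2 * bandwidth ** 2)).max()
        def bonus(s, s_next):
            return gamma * phi(s_next) - phi(s)
        return bonus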

Book
01 Oct 2015
TL;DR: The aim of this dissertation is to take inspiration from the literature of active learning for classification (regression) problems and develop new methods for the new-user problem in recommender systems.
Abstract: Recommender systems learn user preferences and provide users with personalized recommendations. Evidently, the performance of recommender systems depends on the amount of information that users provide regarding items, most often in the form of ratings. This problem is amplified for new users, because they have not provided any ratings, which impacts negatively on the quality of generated recommendations. This is called the new-user problem. A simple and effective way to overcome it is posing queries to new users so that they express their preferences about selected items, e.g., by rating them. Nevertheless, the selection of items must take into consideration that users are not willing to answer a lot of such queries. To address this problem, active learning methods have been proposed to acquire the most informative ratings, i.e., ratings from users that will help most in determining their interests. Active learning refers to learning algorithms that can interactively query the Oracle to obtain labels for data instances; the Oracle is a user or teacher who knows the labels. The aim of this dissertation [8] is to take inspiration from the literature of active learning for classification (regression) problems and develop new methods for the new-user problem in recommender systems. In the recommender system context, new users play the role of the Oracle and provide ratings (labels) for items (data instances). Specifically, the following questions are addressed in this dissertation: (1) which recommendation model is suitable for active-learning purposes? (Sect. 2) (2) how can active learning criteria be adapted and customized for the new-user problem, and which one is the best? (Sect. 3) (3) what are the specific requirements and properties of the new-user problem that do not exist in active learning, and how can new active learning methods be developed based on these properties? (Sects. 4, 5)

Posted Content
TL;DR: It is shown that training with cyclical learning rates achieves near optimal classification accuracy without tuning and often in many fewer iterations.
Abstract: It is known that the learning rate is the most important hyper-parameter to tune for training deep convolutional neural networks (i.e., a "guessing game"). This report describes a new method for setting the learning rate, named cyclical learning rates, that eliminates the need to experimentally find the best values and schedule for the learning rates. Instead of setting the learning rate to fixed values, this method lets the learning rate cyclically vary within reasonable boundary values. This report shows that training with cyclical learning rates achieves near optimal classification accuracy without tuning and often in many fewer iterations. This report also describes a simple way to estimate "reasonable bounds" - by linearly increasing the learning rate in one training run of the network for only a few epochs. In addition, cyclical learning rates are demonstrated on training with the CIFAR-10 dataset and the AlexNet and GoogLeNet architectures on the ImageNet dataset. These methods are practical tools for everyone who trains convolutional neural networks.
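
The triangular policy described in the report reduces to a few lines; this sketch follows the formula given in the report, with base_lr, max_lr, and stepsize as the user-chosen bounds and half-cycle length.

    # Triangular cyclical learning rate: sweeps linearly between base_lr
    # and max_lr, completing one full cycle every 2 * stepsize iterations.
    import numpy as np

    def triangular_clr(it, base_lr=1e-4, max_lr=1e-2, stepsize=2000):
        cycle = np.floor(1 + it / (2 * stepsize))
        x = np.abs(it / stepsize - 2 * cycle + 1)
        return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

The "reasonable bounds" themselves come from the one-run range test the report describes: increase the learning rate linearly for a few epochs and note where accuracy starts to improve and where it begins to degrade.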

Book
27 Apr 2015
TL;DR: Efficient Learning Machines as mentioned in this paper explores the major topics of machine learning, including knowledge discovery, classifications, genetic algorithms, neural networks, kernel methods, and biologically-inspired techniques.
Abstract: Machine learning techniques provide cost-effective alternatives to traditional methods for extracting underlying relationships between information and data and for predicting future events by processing existing information to train models. Efficient Learning Machines explores the major topics of machine learning, including knowledge discovery, classification, genetic algorithms, neural networks, kernel methods, and biologically-inspired techniques. Mariette Awad and Rahul Khanna's synthetic approach weaves together the theoretical exposition, design principles, and practical applications of efficient machine learning. Their experiential emphasis, expressed in their close analysis of sample algorithms throughout the book, aims to equip engineers, students of engineering, and system designers to design and create new and more efficient machine learning systems. Readers of Efficient Learning Machines will learn how to recognize and analyze the problems that machine learning technology can solve for them, how to implement and deploy standard solutions to sample problems, and how to design new systems and solutions. Advances in computing performance, storage, memory, unstructured information retrieval, and cloud computing have coevolved with a new generation of machine learning paradigms and big data analytics, which the authors present in the conceptual context of their traditional precursors. Awad and Khanna explore current developments in the deep learning techniques of deep neural networks, hierarchical temporal memory, and cortical algorithms. Nature suggests sophisticated learning techniques that deploy simple rules to generate highly intelligent and organized behaviors with adaptive, evolutionary, and distributed properties. The authors examine the most popular biologically-inspired algorithms, together with a sample application to distributed datacenter management. They also discuss machine learning techniques for addressing problems of multi-objective optimization in which solutions in real-world systems are constrained and evaluated based on how well they perform with respect to multiple objectives in aggregate. Two chapters on support vector machines and their extensions focus on recent improvements to the classification and regression techniques at the core of machine learning. What you'll learn: Efficient Learning Machines systematically guides readers to an understanding and practical mastery of the following techniques: the machine learning techniques most commonly used to solve complex real-world problems; recent improvements to classification and regression techniques; the application of bio-inspired techniques to real-life problems; new deep learning techniques that exploit advances in computing performance and storage; and machine learning techniques for solving multi-objective optimization problems with nondominated methods that minimize distance to the Pareto front. Who this book is for: Efficient Learning Machines equips engineers, students of engineering, and system designers with the knowledge and guidance to design and create new and more efficient machine learning systems.

Journal ArticleDOI
TL;DR: A general learning framework, termed multiple kernel extreme learning machines (MK-ELM), is proposed to address the lack of a general framework for ELM to integrate multiple heterogeneous data sources for classification; it can achieve comparable or even better classification performance than state-of-the-art MKL algorithms, while incurring much less computational cost.

Posted Content
TL;DR: G-learning, as proposed in this paper, regularizes the noise in the space of optimal actions by penalizing deterministic policies at the beginning of the learning process, which enables naturally incorporating prior distributions over optimal actions when available.
Abstract: Model-free reinforcement learning algorithms such as Q-learning perform poorly in the early stages of learning in noisy environments, because much effort is spent on unlearning biased estimates of the state-action function. The bias comes from selecting, among several noisy estimates, the apparent optimum, which may actually be suboptimal. We propose G-learning, a new off-policy learning algorithm that regularizes the noise in the space of optimal actions by penalizing deterministic policies at the beginning of the learning process. Moreover, it enables naturally incorporating prior distributions over optimal actions when available. The stochastic nature of G-learning also makes it more cost-effective than Q-learning in noiseless but exploration-risky domains. We illustrate these ideas in several examples where G-learning results in significant improvements of the learning rate and the learning cost.
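
The penalty is implemented as a divergence from a prior policy rho, which softens Q-learning's hard max. A generic KL-regularized backup of this kind looks as follows; the sign and notation conventions are assumptions, not the paper's exact update.

    % Soft backup with inverse temperature beta and prior policy rho;
    % beta -> infinity recovers the hard max of Q-learning.
    G(s,a) \leftarrow r(s,a) + \gamma\,
    \mathbb{E}_{s'}\left[\frac{1}{\beta}\log \sum_{a'}
    \rho(a' \mid s')\, e^{\beta G(s',a')}\right]

Scheduling beta from small (near the prior, strongly penalizing determinism) to large over the course of learning yields the early-stage regularization the abstract describes.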

Journal ArticleDOI
TL;DR: The effectiveness of applying DELM to EEG classification is confirmed; MLELM not only approximates complicated functions but also does not need to iterate during the training process.
Abstract: Recently, deep learning has aroused wide interest in machine learning fields. Deep learning is a multilayer perceptron artificial neural network algorithm. Deep learning has the advantage of approximating complicated functions and alleviating the optimization difficulty associated with deep models. The multilayer extreme learning machine (MLELM) is a learning algorithm for artificial neural networks which takes advantage of both deep learning and the extreme learning machine. Not only does MLELM approximate complicated functions but it also does not need to iterate during the training process. In this paper, we combine MLELM with the kernel extreme learning machine (KELM) to put forward the deep extreme learning machine (DELM) and apply it to EEG classification. This paper focuses on the application of DELM to the classification of the visual feedback experiment, using MATLAB and the second brain-computer interface (BCI) competition datasets. By simulating and analyzing the results of the experiments, the effectiveness of applying DELM to EEG classification is confirmed.

Proceedings ArticleDOI
01 Sep 2015
TL;DR: The unprecedented accuracy of deep learning methods has turned them into the foundation of new AI-based services on the Internet and commercial companies that collect user data on a large scale have been the main beneficiaries.
Abstract: Deep learning based on artificial neural networks is a very popular approach to modeling, classifying, and recognizing complex data such as images, speech, and text. The unprecedented accuracy of deep learning methods has turned them into the foundation of new AI-based services on the Internet. Commercial companies that collect user data on a large scale have been the main beneficiaries of this trend since the success of deep learning techniques is directly proportional to the amount of data available for training.

Proceedings ArticleDOI
27 May 2015
TL;DR: A new approach named factorized learning is introduced that pushes ML computations through joins and avoids redundancy in both I/O and computations; it is often substantially faster than the alternatives, but not always the fastest, necessitating a cost-based approach.
Abstract: Enterprise data analytics is a booming area in the data management industry. Many companies are racing to develop toolkits that closely integrate statistical and machine learning techniques with data management systems. Almost all such toolkits assume that the input to a learning algorithm is a single table. However, most relational datasets are not stored as single tables due to normalization. Thus, analysts often perform key-foreign key joins before learning on the join output. This strategy of learning after joins introduces redundancy avoided by normalization, which could lead to poorer end-to-end performance and maintenance overheads due to data duplication. In this work, we take a step towards enabling and optimizing learning over joins for a common class of machine learning techniques called generalized linear models that are solved using gradient descent algorithms in an RDBMS setting. We present alternative approaches to learn over a join that are easy to implement over existing RDBMSs. We introduce a new approach named factorized learning that pushes ML computations through joins and avoids redundancy in both I/O and computations. We study the tradeoff space for all our approaches both analytically and empirically. Our results show that factorized learning is often substantially faster than the alternatives, but is not always the fastest, necessitating a cost-based approach. We also discuss extensions of all our approaches to multi-table joins as well as to Hive.
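
The essence of factorized learning, pushing the gradient computation through a key-foreign key join instead of materializing the join, can be sketched for squared loss as follows; the table layouts and names are illustrative, and general GLMs replace the residual with the derivative of their loss.

    # Gradient of a linear model over S JOIN R without forming the join:
    # R's contribution to each score is a lookup of a precomputed partial
    # inner product, and R's gradient uses residuals grouped by key.
    import numpy as np

    def factorized_gradient(S_feats, S_fk, S_y, R_feats, w_S, w_R):
        partial_R = R_feats @ w_R                 # one pass over R alone
        scores = S_feats @ w_S + partial_R[S_fk]  # join replaced by lookup
        resid = scores - S_y                      # squared-loss residuals
        grad_S = S_feats.T @ resid
        grouped = np.zeros(R_feats.shape[0])
        np.add.at(grouped, S_fk, resid)           # sum residuals per key
        grad_R = R_feats.T @ grouped
        return grad_S, grad_R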

Journal ArticleDOI
TL;DR: This work discusses two strategies towards making machine learning algorithms more autonomous: automated optimization of hyperparameters (including mechanisms for feature selection, preprocessing, model selection, etc.) and the development of algorithms with reduced sets of hyperparameters.
Abstract: The success of hand-crafted machine learning systems in many applications raises the question of making machine learning algorithms more autonomous, i.e., reducing the requirement of expert input to a minimum. We discuss two strategies towards this goal: (1) automated optimization of hyperparameters (including mechanisms for feature selection, preprocessing, model selection, etc.) and (2) the development of algorithms with reduced sets of hyperparameters. Since many research directions (e.g., deep learning) show a tendency towards increasingly complex algorithms with more and more hyperparameters, the demand for both of these strategies continuously increases. We review recent hyperparameter optimization methods and discuss data-driven approaches to avoid the introduction of hyperparameters using unsupervised learning. We end by discussing how these complementary strategies can work hand-in-hand, representing a very promising approach towards autonomous machine learning.
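
The first strategy, automated hyperparameter optimization, is easy to illustrate with its simplest instance, random search over a log-scaled space; the model, dataset, and search ranges below are illustrative choices, not from the paper.

    # Random search: sample hyperparameters, score by cross-validation,
    # keep the best. More elaborate optimizers replace only the sampler.
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    rng = np.random.default_rng(0)
    best_params, best_score = None, -np.inf
    for _ in range(20):
        params = {"C": 10 ** rng.uniform(-2, 2),
                  "gamma": 10 ** rng.uniform(-3, 1)}
        score = cross_val_score(SVC(**params), X, y, cv=3).mean()
        if score > best_score:
            best_params, best_score = params, score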