Showing papers on "Active learning (machine learning)" published in 2011

Open Access

Journal Article•

Scikit-learn: Machine Learning in Python

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel¹, Peter Prettenhofer², Ron Weiss³, Vincent Dubourg, Jake Vanderplas⁴, Alexandre Passos⁵, David Cournapeau, Matthieu Brucher⁶, Matthieu Perrot, Edouard Duchesnay•Institutions (6)

Kobe University¹, Bauhaus University, Weimar², Google³, University of Washington⁴, University of Massachusetts Amherst⁵, Total S.A.⁶

01 Feb 2011-Journal of Machine Learning Research

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.

Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Proceedings Article•

A Three-Way Model for Collective Learning on Multi-Relational Data

Maximilian Nickel, Volker Tresp¹, Hans-Peter Kriegel•Institutions (1)

Siemens¹

28 Jun 2011

TL;DR: This work presents a novel approach to relational learning based on the factorization of a three-way tensor. The method can perform collective learning via the latent components of the model, and the authors provide an efficient algorithm to compute the factorization.

Abstract: Relational learning is becoming increasingly important in many areas of application. Here, we present a novel approach to relational learning based on the factorization of a three-way tensor. We show that unlike other tensor approaches, our method is able to perform collective learning via the latent components of the model and provide an efficient algorithm to compute the factorization. We substantiate our theoretical considerations regarding the collective learning capabilities of our model by means of experiments on both a new dataset and a dataset commonly used in entity resolution. Furthermore, we show on common benchmark datasets that our approach achieves better or on-par results compared to current state-of-the-art relational learning solutions, while it is significantly faster to compute.

1,830 citations

Journal Article•DOI•

AK-MCS: An active learning reliability method combining Kriging and Monte Carlo Simulation

B. Echard¹, Nicolas Gayton¹, Maurice Lemaire¹•Institutions (1)

Institut Français¹

01 Mar 2011-Structural Safety

TL;DR: An iterative approach based on Monte Carlo Simulation and a Kriging metamodel is proposed to assess the reliability of structures more efficiently. The probability of failure obtained with AK-MCS is shown to be very accurate while requiring only a small number of calls to the performance function.

1,234 citations

Proceedings Article•DOI•

Adversarial machine learning

Ling Huang¹, Anthony D. Joseph², Blaine Nelson³, Benjamin I. P. Rubinstein⁴, J. D. Tygar²•Institutions (4)

Intel¹, University of California, Berkeley², University of Tübingen³, Microsoft⁴

21 Oct 2011

TL;DR: In this article, the authors discuss an emerging field of study: adversarial machine learning (AML), the study of effective machine learning techniques against an adversarial opponent, and give a taxonomy for classifying attacks against online machine learning algorithms.

Abstract: In this paper (expanded from an invited talk at AISEC 2010), we discuss an emerging field of study: adversarial machine learning, the study of effective machine learning techniques against an adversarial opponent. In this paper, we: give a taxonomy for classifying attacks against online machine learning algorithms; discuss application-specific factors that limit an adversary's capabilities; introduce two models for modeling an adversary's capabilities; explore the limits of an adversary's knowledge about the algorithm, feature space, training, and input data; explore vulnerabilities in machine learning algorithms; discuss countermeasures against attacks; introduce the evasion challenge; and discuss privacy-preserving learning techniques.

947 citations

Posted Content•

Bayesian Active Learning for Classification and Preference Learning

Neil Houlsby, Ferenc Huszar, Zoubin Ghahramani, Máté Lengyel

24 Dec 2011-arXiv: Machine Learning

TL;DR: This work proposes an approach that expresses information gain in terms of predictive entropies, applies this method to the Gaussian Process Classifier (GPC), and makes minimal approximations to the full information theoretic objective.

Abstract: Information theoretic active learning has been widely studied for probabilistic models. For simple regression an optimal myopic policy is easily tractable. However, for other tasks and with more complex models, such as classification with nonparametric models, the optimal solution is harder to compute. Current approaches make approximations to achieve tractability. We propose an approach that expresses information gain in terms of predictive entropies, and apply this method to the Gaussian Process Classifier (GPC). Our approach makes minimal approximations to the full information theoretic objective. Our experimental performance compares favourably to many popular active learning algorithms, and has equal or lower computational complexity. We compare well to decision theoretic approaches also, which are privy to more information and require much more computational time. Secondly, by developing further a reformulation of binary preference learning to a classification problem, we extend our algorithm to Gaussian Process preference learning.
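
Illustration (not from the paper): the idea of expressing information gain through predictive entropies can be sketched as the entropy of the averaged prediction minus the average entropy of the individual predictions. The sketch below assumes a finite set of sampled models standing in for the posterior, which is a simplification and not the paper's Gaussian Process Classifier approximation; the names `entropy`, `information_gain`, `select_query`, and the toy `pool` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_gain(sampled_probs):
    """Estimate H[mean prediction] - mean H[prediction] from sampled models.

    sampled_probs: list of class-probability vectors, one per sampled model
    (an assumed stand-in for draws from the model posterior).
    """
    n = len(sampled_probs)
    k = len(sampled_probs[0])
    mean_probs = [sum(s[c] for s in sampled_probs) / n for c in range(k)]
    expected_entropy = sum(entropy(s) for s in sampled_probs) / n
    return entropy(mean_probs) - expected_entropy


def select_query(pool):
    """Return the key of the candidate with the largest estimated gain."""
    return max(pool, key=lambda name: information_gain(pool[name]))


# Toy pool: predictions of two sampled models for three unlabeled candidates.
pool = {
    "models_disagree": [[0.9, 0.1], [0.1, 0.9]],
    "models_agree_unsure": [[0.5, 0.5], [0.5, 0.5]],
    "models_agree_sure": [[0.9, 0.1], [0.9, 0.1]],
}
```

In this toy pool, `select_query(pool)` picks the candidate on which the sampled models disagree. A candidate on which every model is equally unsure scores zero, because labeling it would not distinguish between the models.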

578 citations

Journal Article•DOI•

Adaptive submodularity: theory and applications in active learning and stochastic optimization

Daniel Golovin¹, Andreas Krause²•Institutions (2)

California Institute of Technology¹, ETH Zurich²

01 Sep 2011-Journal of Artificial Intelligence Research

TL;DR: In this article, the concept of adaptive submodularity is introduced, which generalizes submodular set functions to adaptive policies and provides performance guarantees for both stochastic maximization and coverage, and can be exploited to speed up the greedy algorithm by using lazy evaluations.

Abstract: Many problems in artificial intelligence require adaptively making a sequence of decisions with uncertain outcomes under partial observability. Solving such stochastic optimization problems is a fundamental but notoriously difficult challenge. In this paper, we introduce the concept of adaptive submodularity, generalizing submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy. In addition to providing performance guarantees for both stochastic maximization and coverage, adaptive submodularity can be exploited to drastically speed up the greedy algorithm by using lazy evaluations. We illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse AI applications including management of sensing resources, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases, improve approximation guarantees and handle natural generalizations.
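
Illustration (not from the paper): lazy evaluation is easiest to see in the classical, non-adaptive setting, so the sketch below uses plain submodular set coverage, not the adaptive policies studied in the paper. Because marginal gains can only shrink as the selection grows, a previously computed gain is a valid upper bound and most candidates never need re-evaluation. The function `lazy_greedy_cover` and the data in `example_sets` are hypothetical.

```python
import heapq


def lazy_greedy_cover(sets, budget):
    """Pick up to `budget` sets greedily, maximizing the number of covered items.

    Coverage is submodular: a set's marginal gain can only shrink as the
    selection grows, so a stale gain is an upper bound on the current one.
    Returns the chosen names, the covered items, and the re-evaluation count.
    """
    covered = set()
    chosen = []
    evaluations = 0
    # Max-heap via negated gains; the third field records how many sets had
    # been chosen when the gain was computed.
    heap = [(-len(items), name, 0) for name, items in sets.items()]
    heapq.heapify(heap)
    while heap and len(chosen) < budget:
        neg_gain, name, stamp = heapq.heappop(heap)
        if stamp == len(chosen):
            # The gain is current and at least as large as every bound below it.
            if neg_gain == 0:
                break  # nothing left to gain
            chosen.append(name)
            covered |= sets[name]
        else:
            # Stale entry: recompute its marginal gain and push it back.
            evaluations += 1
            gain = len(sets[name] - covered)
            heapq.heappush(heap, (-gain, name, len(chosen)))
    return chosen, covered, evaluations


example_sets = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}
chosen, covered, evaluations = lazy_greedy_cover(example_sets, budget=2)
```

With this data, the second pick re-evaluates only two of the three remaining sets; the stale bound for `"d"` is already no better than the fresh gain of `"c"`, so `"d"` is never recomputed.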

570 citations

Proceedings Article•DOI•

Scaling up machine learning: parallel and distributed approaches

Ron Bekkerman¹, Mikhail Bilenko², John Langford•Institutions (2)

LinkedIn¹, Microsoft²

21 Aug 2011

TL;DR: This tutorial gives a broad view of modern approaches for scaling up machine learning and data mining methods on parallel/distributed platforms and provides an integrated overview of state-of-the-art platforms and algorithm choices.

Abstract: This tutorial gives a broad view of modern approaches for scaling up machine learning and data mining methods on parallel/distributed platforms. Demand for scaling up machine learning is task-specific: for some tasks it is driven by the enormous dataset sizes, for others by model complexity or by the requirement for real-time prediction. Selecting a task-appropriate parallelization platform and algorithm requires understanding their benefits, trade-offs and constraints. This tutorial focuses on providing an integrated overview of state-of-the-art platforms and algorithm choices. These span a range of hardware options (from FPGAs and GPUs to multi-core systems and commodity clusters), programming frameworks (including CUDA, MPI, MapReduce, and DryadLINQ), and learning settings (e.g., semi-supervised and online learning). The tutorial is example-driven, covering a number of popular algorithms (e.g., boosted trees, spectral clustering, belief propagation) and diverse applications (e.g., recommender systems and object recognition in vision). The tutorial is based on (but not limited to) the material from our upcoming Cambridge U. Press edited book which is currently in production. Visit the tutorial website at http://hunch.net/~large_scale_survey/

415 citations

Journal Article•DOI•

Hyperspectral Image Segmentation Using a New Bayesian Approach With Active Learning

Jun Li¹, Jose M. Bioucas-Dias, Antonio Plaza¹•Institutions (1)

University of Extremadura¹

12 May 2011-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: A new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of a multinomial logistic regression model to learn the class posterior probability distributions and a new active sampling approach, called modified breaking ties, which is able to provide an unbiased sampling.

Abstract: This paper introduces a new supervised Bayesian approach to hyperspectral image segmentation with active learning, which consists of two main steps. First, we use a multinomial logistic regression (MLR) model to learn the class posterior probability distributions. This is done by using a recently introduced logistic regression via splitting and augmented Lagrangian algorithm. Second, we use the information acquired in the previous step to segment the hyperspectral image using a multilevel logistic prior that encodes the spatial information. In order to reduce the cost of acquiring large training sets, active learning is performed based on the MLR posterior probabilities. Another contribution of this paper is the introduction of a new active sampling approach, called modified breaking ties, which is able to provide an unbiased sampling. Furthermore, we have implemented our proposed method in an efficient way. For instance, in order to obtain the time-consuming maximum a posteriori segmentation, we use the α-expansion min-cut-based integer optimization algorithm. The state-of-the-art performance of the proposed approach is illustrated using both simulated and real hyperspectral data sets in a number of experimental comparisons with recently introduced hyperspectral image analysis methods.
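
Illustration (not from the paper): active sampling from class posterior probabilities can be sketched with the standard breaking-ties rule, which queries the samples whose two most probable classes are closest in probability. This is the plain criterion, not the modified variant the paper introduces, and the posteriors below are made-up values, not output of an MLR model. The names `breaking_ties_scores` and `select_batch` are hypothetical.

```python
def breaking_ties_scores(posteriors):
    """Return, per sample, the gap between its two largest class probabilities."""
    scores = []
    for probs in posteriors:
        top, second = sorted(probs, reverse=True)[:2]
        scores.append(top - second)
    return scores


def select_batch(posteriors, batch_size):
    """Indices of the `batch_size` samples with the smallest top-two gap."""
    scores = breaking_ties_scores(posteriors)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return ranked[:batch_size]


# Made-up class posteriors for four unlabeled pixels and three classes.
posteriors = [
    [0.80, 0.15, 0.05],
    [0.40, 0.38, 0.22],
    [0.34, 0.33, 0.33],
    [0.55, 0.40, 0.05],
]
queried = select_batch(posteriors, batch_size=2)
```

Here `queried` holds the indices of the two pixels whose leading classes are nearly tied, which are the ones an annotator would be asked to label next.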

414 citations

Journal Article•DOI•

A Short Introduction to Learning to Rank

Hang Li¹•Institutions (1)

Microsoft¹

01 Oct 2011-IEICE Transactions on Information and Systems

TL;DR: This short paper explains the fundamental problems, existing approaches, and future work of learning to rank, and describes several learning to rank methods using SVM techniques in detail.

Abstract: Learning to rank refers to machine learning techniques for training the model in a ranking task. Learning to rank is useful for many applications in Information Retrieval, Natural Language Processing, and Data Mining. Intensive studies have been conducted on the problem and significant progress has been made [1], [2]. This short paper gives an introduction to learning to rank, and it specifically explains the fundamental problems, existing approaches, and future work of learning to rank. Several learning to rank methods using SVM techniques are described in detail.

341 citations

Proceedings Article•

Active Learning from Crowds

Yan Yan¹, Glenn Fung², Rómer Rosales³, Jennifer G. Dy¹•Institutions (3)

Northeastern University¹, Siemens², Yahoo!³

28 Jun 2011

TL;DR: In this article, the authors employ a probabilistic model for learning from multiple annotators that can also learn the annotator expertise even when their expertise may not be consistently accurate across the task domain.

Abstract: Obtaining labels can be expensive or time-consuming, but unlabeled data is often abundant and easier to obtain. Most learning tasks can be made more efficient, in terms of labeling cost, by intelligently choosing specific unlabeled instances to be labeled by an oracle. The general problem of optimally choosing these instances is known as active learning. As it is usually set in the context of supervised learning, active learning relies on a single oracle playing the role of a teacher. We focus on the multiple annotator scenario where an oracle, who knows the ground truth, no longer exists; instead, multiple labelers, with varying expertise, are available for querying. This paradigm posits new challenges to the active learning scenario. We can now ask which data sample should be labeled next and which annotator should be queried to benefit our learning model the most. In this paper, we employ a probabilistic model for learning from multiple annotators that can also learn the annotator expertise even when their expertise may not be consistently accurate across the task domain. We then focus on providing a criterion and formulation that allows us to select both a sample and the annotator/s to query the labels from.

309 citations

Journal Article•DOI•

On the use of stochastic hessian information in optimization methods for machine learning

Richard H. Byrd¹, Gillian M. Chin², Will Neveitt³, Jorge Nocedal²•Institutions (3)

University of Colorado Boulder¹, Northwestern University², Google³

22 Sep 2011-SIAM Journal on Optimization

TL;DR: Curvature information is incorporated in two subsampled Hessian algorithms, one based on a matrix-free inexact Newton iteration and one on a preconditioned limited memory BFGS iteration.

Abstract: This paper describes how to incorporate sampled curvature information in a Newton-CG method and in a limited memory quasi-Newton method for statistical learning. The motivation for this work stems from supervised machine learning applications involving a very large number of training points. We follow a batch approach, also known in the stochastic optimization literature as a sample average approximation approach. Curvature information is incorporated in two subsampled Hessian algorithms, one based on a matrix-free inexact Newton iteration and one on a preconditioned limited memory BFGS iteration. A crucial feature of our technique is that Hessian-vector multiplications are carried out with a significantly smaller sample size than is used for the function and gradient. The efficiency of the proposed methods is illustrated using a machine learning application involving speech recognition.

Journal Article•DOI•

Testing and validating machine learning classifiers by metamorphic testing

Xiaoyuan Xie¹, Joshua W. K. Ho², Christian Murphy³, Gail E. Kaiser⁴, Baowen Xu⁵, Tsong Yueh Chen¹•Institutions (5)

Swinburne University of Technology¹, Brigham and Women's Hospital², University of Pennsylvania³, Columbia University⁴, Nanjing University⁵

01 Apr 2011

TL;DR: This paper presents a technique for testing implementations of machine learning classification algorithms, based on "metamorphic testing", which has been shown to be effective in alleviating the oracle problem.

Abstract: Machine learning algorithms have provided core functionality to many application domains, such as bioinformatics and computational linguistics. However, it is difficult to detect faults in such applications because often there is no "test oracle" to verify the correctness of the computed outputs. To help address the software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique "metamorphic testing", which has been shown to be effective in alleviating the oracle problem. We also present a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has high effectiveness in killing mutants, and that observing the expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program.
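
Illustration (not from the paper): metamorphic testing checks relations between the outputs of related inputs, so no oracle for any single output is needed. The sketch below assumes a tiny k-nearest-neighbour classifier written for this example; the two relations (shuffling the training set, and uniformly scaling and shifting all features) are plausible choices for such a classifier and are not taken from the paper's list. All function names and data are hypothetical.

```python
import random


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    def sq_dist(point):
        return sum((a - b) ** 2 for a, b in zip(point, query))

    # Sort by (distance, label) so equal-distance ties do not depend on order.
    nearest = sorted(train, key=lambda item: (sq_dist(item[0]), item[1]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Break vote ties by label so the result does not depend on input order.
    return max(sorted(votes), key=lambda label: votes[label])


def check_permutation_relation(train, queries, seed=0):
    """Relation 1: shuffling the training set must not change any prediction."""
    shuffled = list(train)
    random.Random(seed).shuffle(shuffled)
    return all(knn_predict(train, q) == knn_predict(shuffled, q) for q in queries)


def check_affine_relation(train, queries, scale=2.0, shift=5.0):
    """Relation 2: scaling and shifting every feature by the same constants
    preserves the ordering of distances, so predictions must not change."""
    def move(point):
        return tuple(scale * x + shift for x in point)

    moved_train = [(move(p), label) for p, label in train]
    return all(
        knn_predict(train, q) == knn_predict(moved_train, move(q)) for q in queries
    )


train = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
queries = [(0.5, 0.5), (5.5, 5.5), (2.0, 2.0)]
relations_hold = (
    check_permutation_relation(train, queries)
    and check_affine_relation(train, queries)
)
```

A violation of either relation would indicate a fault in the implementation even though the correct label for each query is never specified.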

Journal Article•DOI•

Machine learning in side-channel analysis: a first study

Gabriel Hospodar¹, Benedikt Gierlichs¹, Elke De Mulder¹, Ingrid Verbauwhede¹, Joos Vandewalle¹•Institutions (1)

Katholieke Universiteit Leuven¹

27 Oct 2011-Journal of Cryptographic Engineering

TL;DR: This work comprehensively investigates the application of a machine learning technique in side-channel analysis (SCA). The technique is a powerful kernel-based learning algorithm, the Least Squares Support Vector Machine (LS-SVM), and the target is a software implementation of the Advanced Encryption Standard.

Abstract: Electronic devices may undergo attacks going beyond traditional cryptanalysis. Side-channel analysis (SCA) is an alternative attack that exploits information leaking from physical implementations of e.g. cryptographic devices to discover cryptographic keys or other secrets. This work comprehensively investigates the application of a machine learning technique in SCA. The considered technique is a powerful kernel-based learning algorithm: the Least Squares Support Vector Machine (LS-SVM). The chosen side-channel is the power consumption and the target is a software implementation of the Advanced Encryption Standard. In this study, the LS-SVM technique is compared to Template Attacks. The results show that the choice of parameters of the machine learning technique strongly impacts the performance of the classification. In contrast, the number of power traces and time instants does not influence the results in the same proportion. This effect can be attributed to the usage of data sets with straightforward Hamming weight leakages in this first study.

Journal Article•DOI•

Batch-Mode Active-Learning Methods for the Interactive Classification of Remote Sensing Images

Begum Demir¹, Claudio Persello², Lorenzo Bruzzone²•Institutions (2)

Kocaeli University¹, University of Trento²

01 Mar 2011-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: This paper investigates different batch-mode active-learning techniques for the classification of remote sensing images with support vector machines and proposes a novel query function that is based on a kernel-clustering technique for assessing the diversity of samples and a new strategy for selecting the most informative representative sample from each cluster.

Abstract: This paper investigates different batch-mode active-learning (AL) techniques for the classification of remote sensing (RS) images with support vector machines. This is done by generalizing to multiclass problems techniques defined for binary classifiers. The investigated techniques exploit different query functions, which are based on the evaluation of two criteria: uncertainty and diversity. The uncertainty criterion is associated with the confidence of the supervised algorithm in correctly classifying the considered sample, while the diversity criterion aims at selecting a set of unlabeled samples that are as diverse (distant from one another) as possible, thus reducing the redundancy among the selected samples. The combination of the two criteria results in the selection of the potentially most informative set of samples at each iteration of the AL process. Moreover, we propose a novel query function that is based on a kernel-clustering technique for assessing the diversity of samples and a new strategy for selecting the most informative representative sample from each cluster. The investigated and proposed techniques are theoretically and experimentally compared with state-of-the-art methods adopted for RS applications. This is accomplished by considering very high resolution multispectral and hyperspectral images. By this comparison, we observed that the proposed method resulted in better accuracy with respect to other investigated and state-of-the-art methods on both the considered data sets. Furthermore, we derived some guidelines on the design of AL systems for the classification of different types of RS images.

Proceedings Article•

Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis

Graham Neubig¹, Yosuke Nakata¹, Shinsuke Mori¹•Institutions (1)

Kyoto University¹

19 Jun 2011

TL;DR: A pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging is presented, able to outperform the current state-of-the-art structured approach, and achieves accuracy similar to that of structured predictors using the same feature set.

Abstract: We present a pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging. Despite the lack of structure, it is able to outperform the current state-of-the-art structured approach for Japanese MA, and achieves accuracy similar to that of structured predictors using the same feature set. We also find that the method is both robust to out-of-domain data, and can be easily adapted through the use of a combination of partial annotation and active learning.

Journal Article•DOI•

Basic elements and characteristics of mobile learning

Fezile Ozdamli¹, Nadire Cavus¹•Institutions (1)

Near East University¹

01 Jan 2011-Procedia - Social and Behavioral Sciences

TL;DR: The basic elements and characteristics of mobile learning are described in light of new trends in developing technology, to better understand the motivations that lead academics to adopt mobile learning.

Journal Article•DOI•

A study on effectiveness of extreme learning machine

Yu Guang Wang¹, Feilong Cao¹, Yubo Yuan¹•Institutions (1)

China Jiliang University¹

01 Sep 2011-Neurocomputing

TL;DR: An improved algorithm called EELM is proposed that makes a proper selection of the input weights and bias before calculating the output weights, which ensures the full column rank of H in theory and improves to some extent the learning rate and the robustness of the networks.

Proceedings Article•DOI•

Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning

Marc Peter Deisenroth¹, Carl Edward Rasmussen², Dieter Fox¹•Institutions (2)

University of Washington¹, University of Cambridge²

27 Jun 2011

TL;DR: It is demonstrated how a low-cost off-the-shelf robotic system can learn closed-loop policies for a stacking task in only a handful of trials, from scratch.

Abstract: Over the last years, there has been substantial progress in robust manipulation in unstructured environments. The long-term goal of our work is to get away from precise, but very expensive robotic systems and to develop affordable, potentially imprecise, self-adaptive manipulator systems that can interactively perform tasks such as playing with children. In this paper, we demonstrate how a low-cost off-the-shelf robotic system can learn closed-loop policies for a stacking task in only a handful of trials, from scratch. Our manipulator is inaccurate and provides no pose feedback. For learning a controller in the work space of a Kinect-style depth camera, we use a model-based reinforcement learning technique. Our learning method is data efficient, reduces model bias, and deals with several noise sources in a principled way during long-term planning. We present a way of incorporating state-space constraints into the learning process and analyze the learning gain by exploiting the sequential structure of the stacking task.

Book•

Learning with Support Vector Machines

Colin Campbell¹, Yiming Ying²•Institutions (2)

University of Bristol¹, University of Exeter²

15 Feb 2011

TL;DR: This book starts with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise, and shows that this framework can be extended to many other scenarios such as prediction with real-valued outputs, novelty detection and the handling of complex output structures such as parse trees.

Abstract: Support Vector Machines have become a well-established tool within machine learning. They work well in practice and have now been used across a wide range of applications from recognizing hand-written digits, to face identification, text categorisation, bioinformatics, and database marketing. In this book we give an introductory overview of this subject. We start with a simple Support Vector Machine for performing binary classification before considering multi-class classification and learning in the presence of noise. We show that this framework can be extended to many other scenarios such as prediction with real-valued outputs, novelty detection and the handling of complex output structures such as parse trees. Finally, we give an overview of the main types of kernels which are used in practice and how to learn and make predictions from multiple types of input data. Table of Contents: Support Vector Machines for Classification / Kernel-based Models / Learning with Kernels

Journal Article•DOI•

Reinforcement learning in feedback control

Roland Hafner¹, Martin Riedmiller¹•Institutions (1)

University of Freiburg¹

01 Jul 2011-Machine Learning

TL;DR: This article focuses on the presentation of four typical benchmark problems whilst highlighting important and challenging aspects of technical process control: nonlinear dynamics; varying set-points; long-term dynamic effects; influence of external variables; and the primacy of precision.

Abstract: Technical process control is a highly interesting area of application with high practical impact. Since classical controller design is, in general, a demanding job, this area constitutes a highly attractive domain for the application of learning approaches, in particular reinforcement learning (RL) methods. RL provides concepts for learning controllers that, by cleverly exploiting information from interactions with the process, can acquire high-quality control behaviour from scratch. This article focuses on the presentation of four typical benchmark problems whilst highlighting important and challenging aspects of technical process control: nonlinear dynamics; varying set-points; long-term dynamic effects; influence of external variables; and the primacy of precision. We propose performance measures for controller quality that apply both to classical control design and learning controllers, measuring precision, speed, and stability of the controller. A second set of key-figures describes the performance from the perspective of a learning approach while providing information about the efficiency of the method with respect to the learning effort needed. For all four benchmark problems, extensive and detailed information is provided with which to carry out the evaluations outlined in this article. A close evaluation of our own RL learning scheme, NFQCA (Neural Fitted Q Iteration with Continuous Actions), in accordance with the proposed scheme on all four benchmarks, thereby provides performance figures on both control quality and learning behavior.

Book Chapter•DOI•

Statistical Learning Theory: Models, Concepts, and Results

U von Luxburg¹, Bernhard Schölkopf¹•Institutions (1)

Max Planck Society¹

01 May 2011

TL;DR: Statistical learning theory is regarded as one of the most beautifully developed branches of artificial intelligence, and it provides the theoretical basis for many of today's machine learning algorithms.

Abstract: Statistical learning theory is regarded as one of the most beautifully developed branches of artificial intelligence. It provides the theoretical basis for many of today's machine learning algorithms. The theory helps to explore what permits drawing valid conclusions from empirical data. This chapter provides an overview of the key ideas and insights of statistical learning theory. Statistical learning theory begins with a class of hypotheses and uses empirical data to select one hypothesis from the class. If the data generating mechanism is benign, then it is observed that the difference between the training error and test error of a hypothesis from the class is small. Statistical learning theory generally avoids metaphysical statements about aspects of the true underlying dependency, and thus is precise by referring to the difference between training and test error. The chapter also describes some other variants of machine learning.

Proceedings Article•

Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances

Burr Settles¹•Institutions (1)

Carnegie Mellon University¹

27 Jul 2011

TL;DR: A novel semi-supervised training algorithm developed for this setting is presented, which is fast enough to support real-time interactive speeds, and at least as accurate as preexisting methods for learning with mixed feature and instance labels.

Abstract: This paper describes DUALIST, an active learning annotation paradigm which solicits and learns from labels on both features (e.g., words) and instances (e.g., documents). We present a novel semi-supervised training algorithm developed for this setting, which is (1) fast enough to support real-time interactive speeds, and (2) at least as accurate as preexisting methods for learning with mixed feature and instance labels. Human annotators in user studies were able to produce near-state-of-the-art classifiers (on several corpora in a variety of application domains) with only a few minutes of effort.

Posted Content•

Guided Data Repair

Mohamed Yakout¹, Ahmed K. Elmagarmid², Jennifer Neville¹, Mourad Ouzzani¹, Ihab F. Ilyas³•Institutions (3)

Purdue University¹, Qatar Computing Research Institute², University of Waterloo³

16 Mar 2011-arXiv: Databases

TL;DR: Guided Data Repair (GDR) is a framework that incorporates user feedback into data cleaning: it consults the user on the updates most likely to improve data quality and uses machine learning methods to identify and apply other correct updates directly to the database without involving the user.

Abstract: In this paper we present GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques while minimizing user involvement. GDR consults the user on the updates that are most likely to be beneficial in improving data quality. GDR also uses machine learning methods to identify and apply the correct updates directly to the database without the actual involvement of the user on these specific updates. To rank potential updates for consultation by the user, we first group these repairs and quantify the utility of each group using the decision-theory concept of value of information (VOI). We then apply active learning to order updates within a group based on their ability to improve the learned model. User feedback is used to repair the database and to adaptively refine the training set for the model. We empirically evaluate GDR on a real-world dataset and show significant improvement in data quality using our user-guided repairing process. We also assess the trade-off between the user effort and the resulting data quality.

Journal Article•DOI•

Guided data repair

Mohamed Yakout¹, Ahmed K. Elmagarmid¹, Jennifer Neville², Mourad Ouzzani², Ihab F. Ilyas¹•Institutions (2)

Qatar Computing Research Institute¹, Purdue University²

01 Feb 2011

TL;DR: This paper presents GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques while minimizing user involvement.

Proceedings Article•DOI•

Human model evaluation in interactive supervised learning

Rebecca Fiebrink¹, Perry R. Cook¹, Dan Trueman¹•Institutions (1)

Princeton University¹

07 May 2011

TL;DR: This work studies the evaluation practices of end users interactively building supervised learning systems for real-world gesture analysis problems, and observes that users employed evaluation techniques not only to make relevant judgments of algorithms' performance and interactively improve the trained models, but also to learn to provide more effective training data.

Abstract: Model evaluation plays a special role in interactive machine learning (IML) systems in which users rely on their assessment of a model's performance in order to determine how to improve it. A better understanding of what model criteria are important to users can therefore inform the design of user interfaces for model evaluation as well as the choice and design of learning algorithms. We present work studying the evaluation practices of end users interactively building supervised learning systems for real-world gesture analysis problems. We examine users' model evaluation criteria, which span conventionally relevant criteria such as accuracy and cost, as well as novel criteria such as unexpectedness. We observed that users employed evaluation techniques (including cross-validation and direct, real-time evaluation) not only to make relevant judgments of algorithms' performance and interactively improve the trained models, but also to learn to provide more effective training data. Furthermore, we observed that evaluation taught users about what types of models were easy or possible to build, and users sometimes used this information to modify the learning problem definition or their plans for using the trained models in practice. We discuss the implications of these findings with regard to the role of generalization accuracy in IML, the design of new algorithms and interfaces, and the scope of potential benefits of incorporating human interaction in the design of supervised learning systems.

Journal Article•DOI•

Incremental Learning From Stream Data

Haibo He¹, Sheng Chen², Kang Li³, Xin Xu⁴•Institutions (4)

University of Rhode Island¹, Stevens Institute of Technology², Queen's University Belfast³, National University of Defense Technology⁴

01 Dec 2011-IEEE Transactions on Neural Networks

TL;DR: This paper proposes a general adaptive incremental learning framework named ADAIN that is capable of learning from continuous raw data, accumulating experience over time, and using such knowledge to improve future learning and prediction performance.

Abstract: Recent years have witnessed an incredibly increasing interest in the topic of incremental learning. Unlike conventional machine learning situations, data flow targeted by incremental learning becomes available continuously over time. Accordingly, it is desirable to be able to abandon the traditional assumption of the availability of representative training data during the training period to develop decision boundaries. Under scenarios of continuous data flow, the challenge is how to transform the vast amount of stream raw data into information and knowledge representation, and accumulate experience over time to support future decision-making process. In this paper, we propose a general adaptive incremental learning framework named ADAIN that is capable of learning from continuous raw data, accumulating experience over time, and using such knowledge to improve future learning and prediction performance. Detailed system level architecture and design strategies are presented in this paper. Simulation results over several real-world data sets are used to validate the effectiveness of this method.

Book•

A First Course in Machine Learning

Simon Rogers, Mark Girolami

25 Oct 2011

TL;DR: A First Course in Machine Learning covers the core mathematical and statistical techniques needed to understand some of the most popular machine learning algorithms and provides students with the knowledge and confidence to explore the machine learning literature and research specific methods in more detail.

Abstract: A First Course in Machine Learning covers the core mathematical and statistical techniques needed to understand some of the most popular machine learning algorithms. The algorithms presented span the main problem areas within machine learning: classification, clustering and projection. The text gives detailed descriptions and derivations for a small number of algorithms rather than cover many algorithms in less detail. Referenced throughout the text and available on a supporting website (http://bit.ly/firstcourseml), an extensive collection of MATLAB/Octave scripts enables students to recreate plots that appear in the book and investigate changing model specifications and parameter values. By experimenting with the various algorithms and concepts, students see how an abstract set of equations can be used to solve real problems. Requiring minimal mathematical prerequisites, the classroom-tested material in this text offers a concise, accessible introduction to machine learning. It provides students with the knowledge and confidence to explore the machine learning literature and research specific methods in more detail.

Posted Content•

Differentially Private Online Learning

Prateek Jain¹, Pravesh K. Kothari², Abhradeep Thakurta³•Institutions (3)

Microsoft¹, University of Texas at Austin², Pennsylvania State University³

01 Sep 2011-arXiv: Learning

TL;DR: This paper provides a general framework to convert the given algorithm into a privacy preserving OCP algorithm with good (sub-linear) regret, and shows that this framework can be used to provide differentially private algorithms for offline learning as well.

Abstract: In this paper, we consider the problem of preserving privacy in the online learning setting. We study the problem in the online convex programming (OCP) framework, a popular online learning setting with several interesting theoretical and practical implications, while using differential privacy as the formal privacy measure. For this problem, we distill two critical attributes that a private OCP algorithm should have in order to provide reasonable privacy as well as utility guarantees: 1) linearly decreasing sensitivity, i.e., as new data points arrive their effect on the learning model decreases, and 2) a sub-linear regret bound, where the regret bound is a popular goodness/utility measure of an online learning algorithm. Given an OCP algorithm that satisfies these two conditions, we provide a general framework to convert the given algorithm into a privacy-preserving OCP algorithm with good (sub-linear) regret. We then illustrate our approach by converting two popular online learning algorithms into their differentially private variants while guaranteeing sub-linear regret ($O(\sqrt{T})$). Next, we consider the special case of online linear regression problems, a practically important class of online learning problems, for which we generalize an approach by Dwork et al. to provide a differentially private algorithm with just $O(\log^{1.5} T)$ regret. Finally, we show that our online learning framework can be used to provide differentially private algorithms for offline learning as well. For the offline learning problem, our approach obtains better error bounds and can handle a larger class of problems than the existing state-of-the-art methods of Chaudhuri et al.

Proceedings Article•DOI•

Integrating reinforcement learning with human demonstrations of varying ability

Matthew E. Taylor¹, Halit Bener Suay², Sonia Chernova²•Institutions (2)

Lafayette College¹, Worcester Polytechnic Institute²

02 May 2011

TL;DR: This work introduces Human-Agent Transfer (HAT), an algorithm that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains.

Abstract: This work introduces Human-Agent Transfer (HAT), an algorithm that combines transfer learning, learning from demonstration and reinforcement learning to achieve rapid learning and high performance in complex domains. Using experiments in a simulated robot soccer domain, we show that human demonstrations transferred into a baseline policy for an agent and refined using reinforcement learning significantly improve both learning time and policy performance. Our evaluation compares three algorithmic approaches to incorporating demonstration rule summaries into transfer learning, and studies the impact of demonstration quality and quantity, as well as the effect of combining demonstrations from multiple teachers. Our results show that all three transfer methods lead to statistically significant improvement in performance over learning without demonstration. The best performance was achieved by combining the best demonstrations from two teachers.

Journal Article•

An Overview on Theory and Algorithm of Support Vector Machines

QI Bing-juan

01 Jan 2011-Journal of the University of Electronic Science and Technology of China

TL;DR: The theoretical basis of support vector machines (SVM) is described systematically, the mainstream training algorithms of traditional SVM and some new learning models and algorithms are summarized in detail, and the research and development prospects of SVM are pointed out.

Abstract: Statistical learning theory is the statistical theory of small samples, and it focuses on the statistical laws and the nature of learning from small samples. The support vector machine is a machine learning method based on statistical learning theory, and it has become an active research field in machine learning because of its excellent performance. This paper systematically describes the theoretical basis of support vector machines (SVM), summarizes in detail the mainstream training algorithms of traditional SVM and some new learning models and algorithms, and finally points out the research and development prospects of support vector machines.
