
Showing papers on "Active learning (machine learning)" published in 2010


Journal ArticleDOI
TL;DR: The relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift are discussed.
Abstract: A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but only have sufficient training data in another domain, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding expensive data-labeling effort. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.

18,616 citations


Proceedings Article
06 Dec 2010
TL;DR: A novel, iterative self-paced learning algorithm where each iteration simultaneously selects easy samples and learns a new parameter vector that outperforms the state-of-the-art method for learning a latent structural SVM on four applications.
Abstract: Latent variable models are a powerful tool for addressing several tasks in machine learning. However, the algorithms for learning the parameters of latent variable models are prone to getting stuck in a bad local optimum. To alleviate this problem, we build on the intuition that, rather than considering all samples simultaneously, the algorithm should be presented with the training data in a meaningful order that facilitates learning. The order of the samples is determined by how easy they are. The main challenge is that often we are not provided with a readily computable measure of the easiness of samples. We address this issue by proposing a novel, iterative self-paced learning algorithm where each iteration simultaneously selects easy samples and learns a new parameter vector. The number of samples selected is governed by a weight that is annealed until the entire training data has been considered. We empirically demonstrate that the self-paced learning algorithm outperforms the state-of-the-art method for learning a latent structural SVM on four applications: object localization, noun phrase coreference, motif finding and handwritten digit recognition.
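
To make the alternating scheme concrete, here is a minimal sketch under stated assumptions: the base learner is swapped for weighted ridge regression rather than the paper's latent structural SVM, and the threshold 1/K with annealing factor mu follows the easiness rule described above; all names are illustrative.

```python
import numpy as np

def weighted_ridge(X, y, v, lam=1e-3):
    # Weighted least squares: minimize sum_i v_i * (x_i.w - y_i)^2 + lam * ||w||^2
    Xv = X * v[:, None]
    return np.linalg.solve(Xv.T @ X + lam * np.eye(X.shape[1]), Xv.T @ y)

def self_paced_fit(X, y, K=1.0, mu=1.3, n_iters=25):
    w = weighted_ridge(X, y, np.ones(len(y)))   # start from all samples
    for _ in range(n_iters):
        losses = (X @ w - y) ** 2
        v = (losses < 1.0 / K).astype(float)    # keep only the "easy" samples
        w = weighted_ridge(X, y, v)             # refit on the selected subset
        K /= mu                                 # anneal: 1/K grows, admitting harder samples
    return w
```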

1,220 citations


Posted Content
TL;DR: In this article, a no-regret algorithm is proposed to train a stationary deterministic policy with good performance under the distribution of observations it induces in such sequential settings, and it outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
Abstract: Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches provide stronger guarantees in this setting, but remain somewhat unsatisfactory as they train either non-stationary or stochastic policies and require a large number of iterations. In this paper, we propose a new iterative algorithm that trains a stationary deterministic policy and can be seen as a no-regret algorithm in an online learning setting. We show that any such no-regret algorithm, combined with additional reduction assumptions, must find a policy with good performance under the distribution of observations it induces in such sequential settings. We demonstrate that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem.
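
The iterative scheme can be sketched as follows; the toy environment, expert rule, and drift term are hypothetical stand-ins, meant only to show the aggregate-and-retrain loop in which the learner's own rollouts generate the states that the expert then labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
expert = lambda s: (s @ np.array([1.0, -0.5]) > 0).astype(int)  # stand-in expert policy

def rollout(policy, n=200):
    s = rng.normal(size=(n, 2))
    s[:, 0] += 0.5 * policy(s)       # the learner's own actions shift the state distribution
    return s

X_agg, y_agg = [], []
policy = expert                       # first iteration rolls out the expert itself
learner = LogisticRegression()
for _ in range(10):
    states = rollout(policy)
    X_agg.append(states)
    y_agg.append(expert(states))      # label every visited state with the expert's action
    learner.fit(np.vstack(X_agg), np.concatenate(y_agg))
    policy = lambda s: learner.predict(s)  # stationary deterministic policy for the next round
```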

1,176 citations


Proceedings Article
06 Dec 2010
TL;DR: The proposed QUIRE approach provides a systematic way for measuring and combining the informativeness and representativeness of an unlabeled instance by incorporating the correlation among labels and is extended to multi-label learning by actively querying instance-label pairs.
Abstract: Most active learning approaches select either informative or representative unlabeled instances to query their labels. Although several active learning algorithms have been proposed to combine the two criteria for query selection, they are usually ad hoc in finding unlabeled instances that are both informative and representative. We address this challenge with a principled approach, termed QUIRE, based on the min-max view of active learning. The proposed approach provides a systematic way for measuring and combining the informativeness and representativeness of an instance. Extensive experimental results show that the proposed QUIRE approach outperforms several state-of-the-art active learning approaches.
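
QUIRE's min-max criterion itself involves inverses of kernel sub-matrices and is not reproduced here; the sketch below is a deliberately simple stand-in that scores each pool point by uncertainty (informativeness) times average similarity to the pool (representativeness), just to make the two criteria concrete. All names are illustrative.

```python
import numpy as np

def select_query(probs, similarity, unlabeled, beta=1.0):
    # probs:      model-estimated P(y=1 | x) for every point in the pool
    # similarity: precomputed pairwise similarity matrix over the pool
    # unlabeled:  array of integer indices of the currently unlabeled points
    informativeness = 1.0 - np.abs(2.0 * probs[unlabeled] - 1.0)  # high near the boundary
    representativeness = similarity[np.ix_(unlabeled, unlabeled)].mean(axis=1)
    return unlabeled[np.argmax(informativeness * representativeness ** beta)]
```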

518 citations


Journal Article
TL;DR: This paper presents a mature, flexible, and adaptive machine learning toolkit for regression modeling and active learning to tackle issues of computational cost and model accuracy.
Abstract: An exceedingly large number of scientific and engineering fields are confronted with the need for computer simulations to study complex, real world phenomena or solve challenging design problems. However, due to the computational cost of these high fidelity simulations, the use of neural networks, kernel methods, and other surrogate modeling techniques has become indispensable. Surrogate models are compact and cheap to evaluate, and have proven very useful for tasks such as optimization, design space exploration, prototyping, and sensitivity analysis. Consequently, in many fields there is great interest in tools and techniques that facilitate the construction of such regression models, while minimizing the computational cost and maximizing model accuracy. This paper presents a mature, flexible, and adaptive machine learning toolkit for regression modeling and active learning to tackle these issues. The toolkit brings together algorithms for data fitting, model selection, sample selection (active learning), hyperparameter optimization, and distributed computing in order to empower a domain expert to efficiently generate an accurate model for the problem or data at hand.
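
The kind of adaptive sampling loop such a toolkit automates might look like the following sketch: a Gaussian-process surrogate is refit as samples arrive, and each new simulation is placed where the surrogate's predictive uncertainty is largest (one of several selection criteria such toolkits typically offer; the simulator here is a stand-in).

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def expensive_simulation(x):                  # stand-in for a costly simulator run
    return np.sin(3 * x) + 0.1 * x ** 2

X = np.linspace(0, 4, 5)[:, None]             # small initial design
y = expensive_simulation(X).ravel()
candidates = np.linspace(0, 4, 400)[:, None]  # cheap-to-score candidate inputs

gp = GaussianProcessRegressor()
for _ in range(15):                           # active learning: sample where the model is least sure
    gp.fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    x_new = candidates[np.argmax(std)]        # point of maximum predictive uncertainty
    X = np.vstack([X, x_new[None, :]])
    y = np.append(y, expensive_simulation(x_new))
```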

490 citations


Journal ArticleDOI
TL;DR: This paper aims to clarify the use of alternative MI assumptions by reviewing the work done in this area, and focuses on a relaxed view of the MI problem, where the standard MI assumption is dropped and alternative assumptions are considered instead.
Abstract: Multi-instance (MI) learning is a variant of inductive machine learning, where each learning example contains a bag of instances instead of a single feature vector. The term commonly refers to the supervised setting, where each bag is associated with a label. This type of representation is a natural fit for a number of real-world learning scenarios, including drug activity prediction and image classification, hence many MI learning algorithms have been proposed. Any MI learning method must relate instances to bag-level class labels, but many types of relationships between instances and class labels are possible. All early work in MI learning assumed a specific MI concept class known to be appropriate for a drug activity prediction domain; however, this ‘standard MI assumption’ is not guaranteed to hold in other domains. Much of the recent work in MI learning has concentrated on a relaxed view of the MI problem, where the standard MI assumption is dropped, and alternative assumptions are considered instead. However, often it is not clearly stated which particular assumption is used and how it relates to other assumptions that have been proposed. In this paper, we aim to clarify the use of alternative MI assumptions by reviewing the work done in this area.
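
To make the ‘standard MI assumption’ concrete: a bag is positive if and only if at least one of its instances is positive, so bag-level predictions follow from any instance-level scorer by taking a maximum. A minimal sketch (names illustrative):

```python
import numpy as np

def bag_predict_standard_mi(bags, instance_score, threshold=0.5):
    # bags:           list of arrays, one (n_instances, n_features) array per bag
    # instance_score: any instance-level scorer returning P(positive) per instance
    # Standard MI assumption: a bag is positive iff some instance is positive.
    return np.array([instance_score(b).max() > threshold for b in bags])
```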

378 citations


Proceedings Article
06 Dec 2010
TL;DR: This paper proposes an alternative formulation for multi-task learning by extending the recently published large margin nearest neighbor (lmnn) algorithm to the MTL paradigm and shows that it consistently outperforms single-task kNN under several metrics and state-of-the-art MTL classifiers.
Abstract: Multi-task learning (MTL) improves the prediction performance on multiple, different but related, learning problems through shared parameters or representations. One of the most prominent multi-task learning algorithms is an extension to support vector machines (svm) by Evgeniou et al. [15]. Although very elegant, multi-task svm is inherently restricted by the fact that support vector machines require each class to be addressed explicitly with its own weight vector which, in a multi-task setting, requires the different learning tasks to share the same set of classes. This paper proposes an alternative formulation for multi-task learning by extending the recently published large margin nearest neighbor (lmnn) algorithm to the MTL paradigm. Instead of relying on separating hyperplanes, its decision function is based on the nearest neighbor rule which inherently extends to many classes and becomes a natural fit for multi-task learning. We evaluate the resulting multi-task lmnn on real-world insurance data and speech classification problems and show that it consistently outperforms single-task kNN under several metrics and state-of-the-art MTL classifiers.

293 citations


Proceedings ArticleDOI
03 May 2010
TL;DR: This paper derives a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals, and believes that this new algorithm, Policy Improvement with Path Integrals (PI2), offers currently one of the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.
Abstract: Reinforcement learning (RL) is one of the most general approaches to learning control. Applying it to complex motor systems, however, has so far been largely impossible due to the computational difficulties that reinforcement learning encounters in high-dimensional continuous state-action spaces. In this paper, we derive a novel approach to RL for parameterized control policies based on the framework of stochastic optimal control with path integrals. While solidly grounded in optimal control theory and estimation theory, the update equations for learning are surprisingly simple and have no danger of numerical instabilities as neither matrix inversions nor gradient learning rates are required. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. Finally, a learning experiment on a robot dog illustrates the functionality of our algorithm in a real-world scenario. We believe that our new algorithm, Policy Improvement with Path Integrals (PI2), offers currently one of the most efficient, numerically robust, and easy-to-implement algorithms for RL in robotics.
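
A minimal sketch of the probability-weighted update at the heart of PI2, stripped of the full algorithm's time-indexed trajectory structure: exploration noise is averaged with weights given by exponentiated trajectory cost, so no gradients or matrix inversions appear. The cost function, noise scale, and temperature below are illustrative assumptions.

```python
import numpy as np

def pi2_update(theta, rollout_cost, n_rollouts=20, sigma=0.1, lam=1.0):
    eps = np.random.randn(n_rollouts, theta.size) * sigma     # exploration noise per rollout
    costs = np.array([rollout_cost(theta + e) for e in eps])  # trajectory cost per rollout
    s = np.exp(-(costs - costs.min()) / lam)                  # exponentiated (shifted) costs
    w = s / s.sum()                                           # softmax weights: low cost, high weight
    return theta + w @ eps                                    # probability-weighted noise average

# Toy usage: drive a quadratic "trajectory cost" toward zero.
theta = np.ones(5)
for _ in range(200):
    theta = pi2_update(theta, lambda t: float(t @ t))
```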

289 citations




Journal ArticleDOI
TL;DR: An ensemble based ELM (EN-ELM) algorithm is proposed where ensemble learning and cross-validation are embedded into the training phase so as to alleviate the overtraining problem and enhance the predictive stability.
Abstract: Extreme learning machine (ELM) was proposed as a new class of learning algorithm for the single-hidden-layer feedforward neural network (SLFN). To achieve good generalization performance, ELM minimizes training error on the entire training data set; it may therefore suffer from overfitting, as the learned model approximates all training samples well. In this letter, an ensemble-based ELM (EN-ELM) algorithm is proposed in which ensemble learning and cross-validation are embedded into the training phase so as to alleviate the overtraining problem and enhance predictive stability. Experimental results on several benchmark databases demonstrate that EN-ELM is robust and efficient for classification.
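
For reference, the base learner that EN-ELM ensembles is easy to state: hidden-layer weights are drawn at random and only the output weights are fit, in closed form. The sketch below shows just this base ELM (EN-ELM's embedded cross-validation and voting are omitted); the activation and layer size are assumptions.

```python
import numpy as np

def train_elm(X, Y, n_hidden=50, seed=0):
    # Single-hidden-layer ELM: random input weights, least-squares output weights.
    # X is (n, d); Y is one-hot (n, n_classes).
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)          # random hidden-layer features
    beta = np.linalg.pinv(H) @ Y    # closed-form fit; no iterative training
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```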

222 citations


Posted Content
TL;DR: This work investigates fast methods that allow us to quickly eliminate variables (features) in supervised learning problems involving a convex loss function and a $l_1$-norm penalty, leading to a potentially substantial reduction in the number of variables prior to running the supervised learning algorithm.
Abstract: We investigate fast methods that allow us to quickly eliminate variables (features) in supervised learning problems involving a convex loss function and a $l_1$-norm penalty, leading to a potentially substantial reduction in the number of variables prior to running the supervised learning algorithm. The methods are not heuristic: they only eliminate features that are guaranteed to be absent after solving the learning problem. Our framework applies to a large class of problems, including support vector machine classification, logistic regression and least-squares. The complexity of the feature elimination step is negligible compared to the typical computational effort involved in the sparse supervised learning problem: it grows linearly with the number of features times the number of examples, and is much lower when the data is sparse. We apply our method to data sets arising in text classification and observe a dramatic reduction in dimensionality, and hence in the computational effort required to solve the learning problem, especially when very sparse classifiers are sought. Our method allows us to immediately extend the scope of existing algorithms, letting us run them on data sets of sizes that were previously out of reach.
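
As a rough illustration of such a guaranteed screening test, here is a reconstruction of a basic SAFE-style rule for the lasso case; the exact bound below is written from memory of this line of work and should be checked against the paper before use.

```python
import numpy as np

def safe_screen_lasso(X, y, lam):
    # Features failing the test are (per the SAFE guarantee) zero at the optimum
    # of min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.
    scores = np.abs(X.T @ y)
    lam_max = scores.max()                 # smallest lam giving the all-zero solution
    radius = np.linalg.norm(y) * (lam_max - lam) / lam_max
    keep = scores >= lam - np.linalg.norm(X, axis=0) * radius
    return keep                            # boolean mask of surviving features
```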

Proceedings ArticleDOI
06 Jun 2010
TL;DR: This work considers the problem of learning a record matching package (classifier) in an active learning setting, and presents new algorithms for this problem that overcome the limitations of previous approaches.
Abstract: We consider the problem of learning a record matching package (classifier) in an active learning setting. In active learning, the learning algorithm picks the set of examples to be labeled, unlike the more traditional passive learning setting where a user selects the labeled examples. Active learning is important for record matching since manually identifying a suitable set of labeled examples is difficult. Previous algorithms that use active learning for record matching have serious limitations: the packages that they learn lack quality guarantees and the algorithms do not scale to large input sizes. We present new algorithms for this problem that overcome these limitations. Our algorithms are fundamentally different from traditional active learning approaches, and are designed from the ground up to exploit problem characteristics specific to record matching. We include a detailed experimental evaluation on real-world data demonstrating the effectiveness of our algorithms.

Journal ArticleDOI
TL;DR: In this paper, a review of rule extraction from SVM classifiers is presented, and a comparison of the algorithms' salient features and relative performance as measured by a number of metrics is made.

Journal ArticleDOI
TL;DR: This work shows that with an appropriate combination of kernels a significant boost in classification performance is possible, and indicates the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
Abstract: Discriminative methods for visual object category recognition are typically non-probabilistic, predicting class labels but not directly providing an estimate of uncertainty. Gaussian Processes (GPs) provide a framework for deriving regression techniques with explicit uncertainty models; we show here how Gaussian Processes with covariance functions defined based on a Pyramid Match Kernel (PMK) can be used for probabilistic object category recognition. Our probabilistic formulation provides a principled way to learn hyperparameters, which we utilize to learn an optimal combination of multiple covariance functions. It also offers confidence estimates at test points, and naturally allows for an active learning paradigm in which points are optimally selected for interactive labeling. We show that with an appropriate combination of kernels a significant boost in classification performance is possible. Further, our experiments indicate the utility of active learning with probabilistic predictive models, especially when the amount of training data labels that may be sought for a category is ultimately very small.
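
The active-learning paradigm described above can be sketched with any probabilistic classifier: repeatedly query the unlabeled point whose predictive distribution is most uncertain. The snippet below uses scikit-learn's GP classifier with its default kernel as a stand-in for the paper's GP-PMK model (the pyramid match kernel is not reproduced), and a synthetic oracle in place of an interactive labeler.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

def most_uncertain(gp, X_pool):
    # Query the pool point with predictive probability closest to 0.5 (binary case).
    p = gp.predict_proba(X_pool)[:, 1]
    return int(np.argmin(np.abs(p - 0.5)))

# Usage sketch: fit on the labeled set, query one point, label it, repeat.
rng = np.random.default_rng(1)
X_lab = rng.normal(size=(10, 5))
y_lab = (X_lab[:, 0] > 0).astype(int)          # synthetic oracle stand-in
X_pool = rng.normal(size=(200, 5))
for _ in range(5):
    gp = GaussianProcessClassifier().fit(X_lab, y_lab)
    i = most_uncertain(gp, X_pool)
    X_lab = np.vstack([X_lab, X_pool[i]])
    y_lab = np.append(y_lab, int(X_pool[i, 0] > 0))
    X_pool = np.delete(X_pool, i, axis=0)
```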

Journal ArticleDOI
TL;DR: This study formulates an optimization problem that models the objectives and criteria for determining personalized context-aware ubiquitous learning paths that maximize learning efficacy for individual students, taking into account the meaningfulness of the learning paths and the number of simultaneous visitors to each learning object.
Abstract: In a context-aware ubiquitous learning environment, learning systems can detect students' learning behaviors in the real world with the help of context-aware (sensor) technology; that is, students can be guided to observe or operate real-world objects with personalized support from the digital world. In this study, we formulate an optimization problem that models the objectives and criteria for determining personalized context-aware ubiquitous learning paths that maximize learning efficacy for individual students, taking into account the meaningfulness of the learning paths and the number of simultaneous visitors to each learning object. Moreover, a heuristic algorithm is proposed to find a quality solution. Experimental results from the learning activities conducted in a natural-science butterfly-ecology course of an elementary school are also given to illustrate the benefits of the approach.

Journal ArticleDOI
TL;DR: It is shown that imitation learning has helped significantly to start learning with reasonable initial behavior; however, many applications are still restricted to rather low-dimensional domains and toy applications.
Abstract: Recent trends in robot learning are to use trajectory-based optimal control techniques and reinforcement learning to scale complex robotic systems. On the one hand, increased computational power and multiprocessing, and on the other hand, probabilistic reinforcement learning methods and function approximation, have contributed to a steadily increasing interest in robot learning. Imitation learning has helped significantly to start learning with reasonable initial behavior. However, many applications are still restricted to rather low-dimensional domains and toy applications. Future work will have to demonstrate the continual and autonomous learning abilities that were alluded to in the introduction.

Proceedings Article
06 Dec 2010
TL;DR: In this article, a greedy active learning algorithm called EC2 was proposed for Bayesian active learning with noisy observations, and it was shown that it is competitive with the optimal policy.
Abstract: We tackle the fundamental problem of Bayesian active learning with noise, where we need to adaptively select from a number of expensive tests in order to identify an unknown hypothesis sampled from a known prior distribution. In the case of noise-free observations, a greedy algorithm called generalized binary search (GBS) is known to perform near-optimally. We show that if the observations are noisy, perhaps surprisingly, GBS can perform very poorly. We develop EC2, a novel, greedy active learning algorithm and prove that it is competitive with the optimal policy, thus obtaining the first competitiveness guarantees for Bayesian active learning with noisy observations. Our bounds rely on a recently discovered diminishing returns property called adaptive submodularity, generalizing the classical notion of submodular set functions to adaptive policies. Our results hold even if the tests have non-uniform cost and their noise is correlated. We also propose EffECXtive, a particularly fast approximation of EC2, and evaluate it on a Bayesian experimental design problem involving human subjects, intended to tease apart competing economic theories of how people make decisions under uncertainty.
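
For contrast, the noise-free GBS baseline mentioned above is simple to state: choose the test whose outcome most evenly bisects the current probability mass over hypotheses. A minimal sketch with binary test outcomes (EC2 itself, which cuts edges between equivalence classes of hypotheses, is not reproduced here):

```python
import numpy as np

def gbs_pick_test(prior, outcomes):
    # prior:    probability of each hypothesis, shape (H,)
    # outcomes: outcomes[t, h] in {0, 1}, the deterministic outcome of test t
    #           if hypothesis h is true (the noise-free setting GBS assumes)
    mass_if_one = outcomes @ prior                       # mass answering 1, per test
    return int(np.argmin(np.abs(2 * mass_if_one - 1)))  # closest to an even split
```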

Journal ArticleDOI
TL;DR: A hierarchical controller is proposed that is able to quickly learn good grasps of a novel object in an unstructured environment, by executing smooth reaching motions and preshaping the hand depending on the object's geometry.

Proceedings Article
06 Dec 2010
TL;DR: This work presents and analyze an agnostic active learning algorithm that works without keeping a version space, unlike all previous approaches where a restricted set of candidate hypotheses is maintained throughout learning, and only hypotheses from this set are ever returned.
Abstract: We present and analyze an agnostic active learning algorithm that works without keeping a version space. This is unlike all previous approaches where a restricted set of candidate hypotheses is maintained throughout learning, and only hypotheses from this set are ever returned. By avoiding this version space approach, our algorithm sheds the computational burden and brittleness associated with maintaining version spaces, yet still allows for substantial improvements over supervised learning for classification.

Proceedings Article
01 Jan 2010
TL;DR: This paper proposes Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic translation for low-resource language pairs and sees significant improvements in translation quality.
Abstract: Large-scale parallel data generation for new language pairs requires intensive human effort and the availability of experts. It becomes immensely difficult and costly to provide Statistical Machine Translation (SMT) systems for most languages due to the paucity of expert translators to provide parallel data. Even when experts are available, the costs involved make it appear infeasible. In this paper we propose Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic translation for low-resource language pairs. Active learning aims at reducing the cost of label acquisition by prioritizing the most informative data for annotation, while crowd-sourcing reduces cost by using the power of the crowds to make up for the lack of expensive language experts. We experiment and compare our active learning strategies with strong baselines and see significant improvements in translation quality. Similarly, our experiments with crowd-sourcing on Mechanical Turk have shown that it is possible to create parallel corpora using non-experts, and that with sufficient quality assurance, a translation system trained on such a corpus approaches expert quality.

Proceedings Article
11 Jul 2010
TL;DR: Bagel is presented, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators, and can generate natural and informative utterances from unseen inputs in the information presentation domain.
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents Bagel, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that Bagel can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.

Proceedings ArticleDOI
04 Feb 2010
TL;DR: By proposing optimization strategies that allow short-circuiting score computations in additive learning systems, this paper is able to speed up the score computation process by more than four times with almost no loss in result quality.
Abstract: Some commercial web search engines rely on sophisticated machine learning systems for ranking web documents. Due to very large collection sizes and tight constraints on query response times, the online efficiency of these learning systems forms a bottleneck. An important problem in such systems is to speed up the ranking process without sacrificing much from the quality of results. In this paper, we propose optimization strategies that allow short-circuiting score computations in additive learning systems. The strategies are evaluated over a state-of-the-art machine learning system and a large, real-life query log obtained from Yahoo!. With the proposed strategies, we are able to speed up the score computations by more than four times with almost no loss in result quality.
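
One way such short-circuiting can be realized (a sketch, not necessarily the paper's exact strategy): precompute, for each prefix of the additive ensemble, the largest score the remaining stages could still contribute, then stop scoring a document as soon as even that best case cannot reach the pruning threshold.

```python
def score_with_early_exit(features, stages, max_tail, threshold):
    # stages:   scoring functions whose outputs are summed (e.g. boosted trees)
    # max_tail: max_tail[i] = largest total score stages i+1..end can still add
    score = 0.0
    for i, stage in enumerate(stages):
        score += stage(features)
        if score + max_tail[i] < threshold:
            return None    # even the best case cannot reach the threshold: prune
    return score
```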

Journal ArticleDOI
TL;DR: The navigation support problem for context-aware ubiquitous learning is formulated and two navigation support algorithms are proposed by taking learning efficacy and navigation efficiency into consideration and it is concluded that the innovative approach is helpful to the students to more effectively and efficiently utilize the learning resources and achieve better learning efficacy.
Abstract: In context-aware ubiquitous learning, students are guided to learn in the real world with personalized support from the learning system. As the learning resources are real-world objects, certain physical constraints, such as the limit on the number of people who can visit the same learning object at the same time, the time for moving from one object to another, and the environmental parameters, need to be taken into account. Moreover, the values of these context-dependent parameters are likely to change swiftly during the learning process, which makes it a challenging and important issue to find a navigation support mechanism for suggesting learning paths for individual students in real time. In this paper, the navigation support problem for context-aware ubiquitous learning is formulated and two navigation support algorithms are proposed by taking learning efficacy and navigation efficiency into consideration. From the simulation results of learning in a butterfly museum setting, it is concluded that the innovative approach helps students utilize the learning resources more effectively and efficiently and achieve better learning efficacy.

Proceedings Article
Yuhong Guo1
06 Dec 2010
TL;DR: A novel batch-mode active learning approach that selects a batch of queries in each iteration by maximizing a natural mutual information criterion between the labeled and unlabeled instances by employing a Gaussian process framework.
Abstract: Recently, batch-mode active learning has attracted a lot of attention. In this paper, we propose a novel batch-mode active learning approach that selects a batch of queries in each iteration by maximizing a natural mutual information criterion between the labeled and unlabeled instances. By employing a Gaussian process framework, this mutual information based instance selection problem can be formulated as a matrix partition problem. Although matrix partition is an NP-hard combinatorial optimization problem, we show that a good local solution can be obtained by exploiting an effective local optimization technique on a relaxed continuous optimization problem. The proposed active learning approach is independent of employed classification models. Our empirical studies show this approach can achieve comparable or superior performance to discriminative batch-mode active learning methods.
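
The matrix-partition formulation is not reproduced here, but the flavor of the mutual-information criterion can be seen in a common greedy stand-in: under a Gaussian process, repeatedly add the pool point that most increases the log-determinant of the covariance block over the selected batch. Names are illustrative.

```python
import numpy as np

def greedy_logdet_batch(K, batch_size):
    # K: kernel (covariance) matrix over the unlabeled pool, shape (n, n).
    # Greedy proxy for a GP mutual-information batch criterion.
    selected = []
    for _ in range(batch_size):
        best_j, best_val = -1, -np.inf
        for j in range(K.shape[0]):
            if j in selected:
                continue
            idx = selected + [j]
            _, logdet = np.linalg.slogdet(K[np.ix_(idx, idx)])
            if logdet > best_val:                # largest joint "information volume"
                best_j, best_val = j, logdet
        selected.append(best_j)
    return selected
```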

Proceedings ArticleDOI
25 Jul 2010
TL;DR: This work proposes a novel active learning strategy that exploits a priori domain knowledge provided by the expert (specifically, labeled features) and extends this model via a Linear Programming algorithm for situations where the expert can provide ranked labeled features.
Abstract: Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe a real-world, deployed application of AL to the problem of biomedical citation screening for systematic reviews at the Tufts Medical Center's Evidence-based Practice Center. We propose a novel active learning strategy that exploits a priori domain knowledge provided by the expert (specifically, labeled features) and extend this model via a Linear Programming algorithm for situations where the expert can provide ranked labeled features. Our methods outperform existing AL strategies on three real-world systematic review datasets. We argue that evaluation must be specific to the scenario under consideration. To this end, we propose a new evaluation framework for finite-pool scenarios, wherein the primary aim is to label a fixed set of examples rather than to simply induce a good predictive model. We use a method from medical decision theory for eliciting the relative costs of false positives and false negatives from the domain expert, constructing a utility measure of classification performance that integrates the expert's preferences. Our findings suggest that the expert can, and should, provide more information than instance labels alone. In addition to achieving strong empirical results on the citation screening problem, this work outlines many important steps for moving away from simulated active learning and toward deploying AL for real-world applications.

Posted Content
TL;DR: In this paper, an agnostic active learning algorithm that works without keeping a version space is presented and analyzed, unlike all previous approaches where a restricted set of candidate hypotheses is maintained throughout learning, and only hypotheses from this set are ever returned.
Abstract: We present and analyze an agnostic active learning algorithm that works without keeping a version space. This is unlike all previous approaches where a restricted set of candidate hypotheses is maintained throughout learning, and only hypotheses from this set are ever returned. By avoiding this version space approach, our algorithm sheds the computational burden and brittleness associated with maintaining version spaces, yet still allows for substantial improvements over supervised learning for classification.

Book ChapterDOI
31 Aug 2010
TL;DR: It is shown that, by exploiting links between three widely used modeling frameworks for reactive systems, any tool for active learning of Mealy machines can be used for learning I/O automata that are deterministic and output determined.
Abstract: Links are established between three widely used modeling frameworks for reactive systems: the ioco theory of Tretmans, the interface automata of De Alfaro and Henzinger, and Mealy machines. It is shown that, by exploiting these links, any tool for active learning of Mealy machines can be used for learning I/O automata that are deterministic and output determined. The main idea is to place a transducer in between the I/O automata teacher and the Mealy machine learner, which translates concepts from the world of I/O automata to the world of Mealy machines, and vice versa. The transducer comes equipped with an interface automaton that allows us to focus the learning process on those parts of the behavior that can effectively be tested and/or are of particular interest. The approach has been implemented on top of the LearnLib tool and has been applied successfully to three case studies.

Proceedings ArticleDOI
25 Jul 2010
TL;DR: A novel algorithm for multi-task learning with boosted decision trees that learns several different learning tasks with a joint model, explicitly addressing the specifics of each learning task with task-specific parameters and the commonalities between them through shared parameters.
Abstract: In this paper we propose a novel algorithm for multi-task learning with boosted decision trees. We learn several different learning tasks with a joint model, explicitly addressing the specifics of each learning task with task-specific parameters and the commonalities between them through shared parameters. This enables implicit data sharing and regularization. We evaluate our learning method on web-search ranking data sets from several countries. Here, multi-task learning is particularly helpful as data sets from different countries vary largely in size because of the cost of editorial judgments. Our experiments validate that learning various tasks jointly can lead to significant improvements in performance with surprising reliability.

Proceedings ArticleDOI
03 Oct 2010
TL;DR: Gestalt allows developers to implement a classification pipeline, analyze data as it moves through that pipeline, and easily transition between implementation and analysis, and significantly improves the ability of developers to find and fix bugs in machine learning systems.
Abstract: We present Gestalt, a development environment designed to support the process of applying machine learning. While traditional programming environments focus on source code, we explicitly support both code and data. Gestalt allows developers to implement a classification pipeline, analyze data as it moves through that pipeline, and easily transition between implementation and analysis. An experiment shows this significantly improves the ability of developers to find and fix bugs in machine learning systems. Our discussion of Gestalt and our experimental observations provide new insight into general-purpose support for the machine learning process.

Proceedings ArticleDOI
04 Aug 2010
TL;DR: A system for tracking economic sentiment in online media that has been deployed since August 2009 is described, which uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items.
Abstract: Tracking sentiment in the popular media has long been of interest to media analysts and pundits. With the availability of news content via online syndicated feeds, it is now possible to automate some aspects of this process. There is also great potential to crowdsource much of the annotation work required to train a machine learning system to perform sentiment scoring. (Crowdsourcing is a term, sometimes associated with Web 2.0 technologies, that describes the outsourcing of tasks to a large, often anonymous community.) We describe such a system for tracking economic sentiment in online media that has been deployed since August 2009. It uses annotations provided by a cohort of non-expert annotators to train a learning system to classify a large body of news items. We report on the design challenges addressed in managing the effort of the annotators and in making annotation an interesting experience.