Showing papers on "Active learning (machine learning)" published in 2008

Journal Article•DOI•

KEEL: a software tool to assess evolutionary algorithms for data mining problems

Jesús Alcalá-Fdez¹, Luciano Sánchez², Salvador García¹, M. J. del Jesus³, Sebastián Ventura⁴, Josep Maria Garrell⁵, José Otero², Cristóbal Romero⁴, Jaume Bacardit⁶, Víctor M. Rivas³, Juan Carlos Fernández⁴, Francisco Herrera¹•Institutions (6)

University of Granada¹, University of Oviedo², University of Jaén³, University of Córdoba (Spain)⁴, Ramon Llull University⁵, University of Nottingham⁶

15 Oct 2008

TL;DR: KEEL is a software tool for assessing evolutionary algorithms on data mining problems such as regression, classification, and unsupervised learning; it includes evolutionary learning algorithms based on the Pittsburgh, Michigan, and IRL approaches.

Abstract: This paper introduces KEEL, a software tool to assess evolutionary algorithms for Data Mining problems of various kinds, including regression, classification, unsupervised learning, etc. It includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL, as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing it to perform a complete analysis of any learning model in comparison to existing software tools. Moreover, KEEL has been designed with a double goal: research and educational.

1,297 citations

Proceedings Article•DOI•

An Analysis of Active Learning Strategies for Sequence Labeling Tasks

Burr Settles¹, Mark Craven¹•Institutions (1)

University of Wisconsin-Madison¹

25 Oct 2008

TL;DR: This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.

Abstract: Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed light on the best active learning approaches for sequence labeling tasks such as information extraction and document segmentation. We survey previously used query selection strategies for sequence models, and propose several novel algorithms to address their shortcomings. We also conduct a large-scale empirical comparison using multiple corpora, which demonstrates that our proposed methods advance the state of the art.
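To make the idea of a query selection strategy for sequence models concrete, here is a minimal sketch of one uncertainty-based heuristic: pick the unlabeled sequence whose per-token label distributions have the highest average entropy. This is an illustration only and is not claimed to be one of the paper's proposed algorithms. The names `token_entropy` and `pick_query`, and the toy marginals in `pool`, are assumptions made up for this example; in practice the marginals would come from a trained sequence model.

```python
import math
from typing import List, Sequence


def token_entropy(marginals: Sequence[Sequence[float]]) -> float:
    """Average per-token Shannon entropy (in nats) of a sequence's label marginals."""
    total = 0.0
    for dist in marginals:
        # Skip zero probabilities, since p * log(p) tends to 0 as p tends to 0.
        total -= sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(marginals)


def pick_query(pool: List[Sequence[Sequence[float]]]) -> int:
    """Return the index of the sequence the model is least certain about."""
    return max(range(len(pool)), key=lambda i: token_entropy(pool[i]))


# Hypothetical pool: each sequence is a list of per-token label distributions.
pool = [
    [[0.9, 0.1], [0.8, 0.2]],
    [[0.5, 0.5], [0.6, 0.4]],
    [[0.99, 0.01], [0.95, 0.05], [0.9, 0.1]],
]
query_index = pick_query(pool)
```

Averaging over tokens rather than summing is a design choice that avoids favoring long sequences simply because they contain more tokens.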

1,003 citations

Journal Article•DOI•

2008 Special Issue: Reinforcement learning of motor skills with policy gradients

Jan Peters¹, Stefan Schaal¹•Institutions (1)

University of Southern California¹

01 May 2008-Neural Networks

TL;DR: This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning.

921 citations

Proceedings Article•DOI•

Hierarchical sampling for active learning

Sanjoy Dasgupta¹, Daniel Hsu¹•Institutions (1)

University of California, San Diego¹

05 Jul 2008

TL;DR: This work presents an active learning scheme that exploits cluster structure in data.

Abstract: We present an active learning scheme that exploits cluster structure in data.

493 citations

Journal Article•DOI•

Optimization Techniques for Semi-Supervised Support Vector Machines

Olivier Chapelle¹, Vikas Sindhwani, S. Sathiya Keerthi•Institutions (1)

Max Planck Society¹

01 Jun 2008-Journal of Machine Learning Research

TL;DR: This paper reviews key ideas in the literature on semi-supervised support vector machines (S3VMs) and studies the performance and behavior of various S3VM algorithms under a common experimental setting.

Abstract: Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S3VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms has recently been proposed for solving S3VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3VM algorithms are studied together, under a common experimental setting.

437 citations

Proceedings Article•

Zero-data learning of new tasks

Hugo Larochelle¹, Dumitru Erhan¹, Yoshua Bengio¹•Institutions (1)

Université de Montréal¹

13 Jul 2008

TL;DR: The main contributions of this work lie in the presentation of a general formalization of zero-data learning, in an experimental analysis of its properties and in empirical evidence showing that generalization is possible and significant in this context.

Abstract: We introduce the problem of zero-data learning, where a model must generalize to classes or tasks for which no training data are available and only a description of the classes or tasks is provided. Zero-data learning is useful for problems where the set of classes to distinguish or tasks to solve is very large and is not entirely covered by the training data. The main contributions of this work lie in the presentation of a general formalization of zero-data learning, in an experimental analysis of its properties and in empirical evidence showing that generalization is possible and significant in this context. The experimental work of this paper addresses two classification problems of character recognition and a multitask ranking problem in the context of drug discovery. Finally, we conclude by discussing how this new framework could lead to a novel perspective on how to extend machine learning towards AI, where an agent can be given a specification for a learning problem before attempting to solve it (with very few or even zero examples).

437 citations

Book Chapter•DOI•

Policy Search for Motor Primitives in Robotics

Jens Kober¹, Jan Peters¹•Institutions (1)

Max Planck Society¹

08 Dec 2008

TL;DR: This paper extends previous work on policy learning from the immediate reward case to episodic reinforcement learning, resulting in a general, common framework also connected to policy gradient methods and yielding a novel algorithm for policy learning that is particularly well-suited for dynamic motor primitives.

Abstract: Many motor skills in humanoid robotics can be learned using parametrized motor primitives as done in imitation learning. However, most interesting motor learning problems are high-dimensional reinforcement learning problems often beyond the reach of current methods. In this paper, we extend previous work on policy learning from the immediate reward case to episodic reinforcement learning. We show that this results in a general, common framework also connected to policy gradient methods and yielding a novel algorithm for policy learning that is particularly well-suited for dynamic motor primitives. The resulting algorithm is an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to several well-known parametrized policy search methods and show that it outperforms them. We apply it in the context of motor learning and show that it can learn a complex Ball-in-a-Cup task using a real Barrett WAM™ robot arm.

411 citations

Journal Article•DOI•

Pareto-Based Multiobjective Machine Learning: An Overview and Case Studies

Yaochu Jin¹, Bernhard Sendhoff¹•Institutions (1)

Honda¹

01 May 2008

TL;DR: An overview of the existing research on multiobjective machine learning, focusing on supervised learning is provided, and a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning.

Abstract: Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggregated to a scalar cost function. This can be mainly attributed to the fact that most conventional learning algorithms can only deal with a scalar cost function. Over the last decade, efforts on solving machine learning problems using the Pareto-based multiobjective optimization methodology have gained increasing impetus, particularly due to the great success of multiobjective optimization using evolutionary algorithms and other population-based stochastic search methods. It has been shown that Pareto-based multiobjective learning approaches are more powerful compared to learning algorithms with a scalar cost function in addressing various topics of machine learning, such as clustering, feature selection, improvement of generalization ability, knowledge extraction, and ensemble generation. One common benefit of the different multiobjective learning approaches is that a deeper insight into the learning problem can be gained by analyzing the Pareto front composed of multiple Pareto-optimal solutions. This paper provides an overview of the existing research on multiobjective machine learning, focusing on supervised learning. In addition, a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning, e.g., how to identify interpretable models and models that can generalize on unseen data from the obtained Pareto-optimal solutions. Three approaches to Pareto-based multiobjective ensemble generation are compared and discussed in detail. Finally, potentially interesting topics in multiobjective machine learning are suggested.
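As a small illustration of what "analyzing the Pareto front" means, the sketch below filters a set of candidate models down to the non-dominated ones when two objectives are both minimized. It is a brute-force example, not the evolutionary procedure discussed in the paper. The function name `pareto_front` and the (validation error, parameter count) pairs are hypothetical values chosen for this example.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the points not dominated by any other point (both objectives minimized)."""
    front = []
    for p in points:
        # q dominates p if q is no worse in both objectives and strictly better in one.
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and (q[0] < p[0] or q[1] < p[1])
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (validation error, number of parameters) pairs for five models.
models = [(0.10, 50), (0.12, 20), (0.08, 200), (0.12, 40), (0.20, 10)]
front = pareto_front(models)
```

Here the model `(0.12, 40)` drops out because `(0.12, 20)` matches its error with fewer parameters; the remaining four each represent a different trade-off between accuracy and complexity.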

399 citations

Proceedings Article•DOI•

On the Class Imbalance Problem

Xinjian Guo¹, Yilong Yin¹, Cailing Dong¹, Gongping Yang¹, Guang-Tong Zhou¹•Institutions (1)

Shandong University¹

18 Oct 2008

TL;DR: This paper reviews academic activities devoted to the class imbalance problem, investigates various remedies at four different levels according to learning phases, and closes with some future directions.

Abstract: The class imbalance problem has been recognized in many practical domains and has been a hot topic of machine learning in recent years. In such a problem, almost all the examples are labeled as one class, while far fewer examples are labeled as the other class, usually the more important class. In this case, standard machine learning algorithms tend to be overwhelmed by the majority class and ignore the minority class, since traditional classifiers seek accurate performance over the full range of instances. This paper first reviews academic activities devoted to the class imbalance problem, then investigates various remedies at four different levels according to learning phases. After surveying evaluation metrics and some other related factors, the paper closes with some future directions.

384 citations

Journal Article•DOI•

Intelligent web-based learning system with personalized learning path guidance

Chih-Ming Chen¹•Institutions (1)

National Chengchi University¹

01 Sep 2008-Computers & Education

TL;DR: Experimental results indicated that the proposed genetic-based personalized e-learning system is superior to the freely browsing learning mode for web-based learning because it provides high-quality, concise learning paths for individual learners.

Abstract: Personalized curriculum sequencing is an important research issue for web-based learning systems because no fixed learning paths will be appropriate for all learners. Therefore, many researchers focused on developing e-learning systems with personalized learning mechanisms to assist on-line web-based learning and adaptively provide learning paths in order to promote the learning performance of individual learners. However, most personalized e-learning systems usually neglect to consider if learner ability and the difficulty level of the recommended courseware are matched to each other while performing personalized learning services. Moreover, the problem of concept continuity of learning paths also needs to be considered while implementing personalized curriculum sequencing because smooth learning paths enhance the linked strength between learning concepts. Generally, inappropriate courseware leads to learner cognitive overload or disorientation during learning processes, thus reducing learning performance. Therefore, compared to the freely browsing learning mode without any personalized learning path guidance used in most web-based learning systems, this paper assesses whether the proposed genetic-based personalized e-learning system, which can generate appropriate learning paths according to the incorrect testing responses of an individual learner in a pre-test, provides benefits in terms of learning performance promotion while learning. Based on the results of pre-test, the proposed genetic-based personalized e-learning system can conduct personalized curriculum sequencing through simultaneously considering courseware difficulty level and the concept continuity of learning paths to support web-based learning. 
Experimental results indicated that the proposed genetic-based personalized e-learning system is superior to the freely browsing learning mode for web-based learning because it provides high-quality, concise learning paths for individual learners.

353 citations

Proceedings Article•

Exploiting machine learning to subvert your spam filter

Blaine Nelson¹, Marco Barreno¹, Fuching Jack Chi¹, Anthony D. Joseph¹, Benjamin I. P. Rubinstein¹, Udam Saini¹, Charles Sutton¹, J. D. Tygar¹, Kai Xia¹•Institutions (1)

University of California, Berkeley¹

15 Apr 2008

TL;DR: This paper shows how an adversary can exploit statistical machine learning, as used in the SpamBayes spam filter, to render it useless--even if the adversary's access is limited to only 1% of the training messages.

Abstract: Using statistical machine learning for making security decisions introduces new vulnerabilities in large scale systems. This paper shows how an adversary can exploit statistical machine learning, as used in the SpamBayes spam filter, to render it useless--even if the adversary's access is limited to only 1% of the training messages. We further demonstrate a new class of focused attacks that successfully prevent victims from receiving specific email messages. Finally, we introduce two new types of defenses against these attacks.

Proceedings Article•

Translated Learning: Transfer Learning across Different Feature Spaces

Wenyuan Dai¹, Yuqiang Chen¹, Gui-Rong Xue¹, Qiang Yang², Yong Yu¹•Institutions (2)

Shanghai Jiao Tong University¹, Hong Kong University of Science and Technology²

08 Dec 2008

TL;DR: Through experiments on the text-aided image classification and cross-language classification tasks, it is demonstrated that the translated learning framework can greatly outperform many state-of-the-art baseline methods.

Abstract: This paper investigates a new machine learning strategy called translated learning. Unlike many previous learning tasks, we focus on how to use labeled data from one feature space to enhance the classification of other entirely different learning spaces. For example, we might wish to use labeled text data to help learn a model for classifying image data, when the labeled images are difficult to obtain. An important aspect of translated learning is to build a "bridge" to link one feature space (known as the "source space") to another space (known as the "target space") through a translator in order to migrate the knowledge from source to target. The translated learning solution uses a language model to link the class labels to the features in the source spaces, which in turn is translated to the features in the target spaces. Finally, this chain of linkages is completed by tracing back to the instances in the target spaces. We show that this path of linkage can be modeled using a Markov chain and risk minimization. Through experiments on the text-aided image classification and cross-language classification tasks, we demonstrate that our translated learning framework can greatly outperform many state-of-the-art baseline methods.

Journal Article•DOI•

An Active Learning Approach to Hyperspectral Data Classification

Suju Rajan¹, Joydeep Ghosh¹, Melba M. Crawford²•Institutions (2)

University of Texas at Austin¹, Purdue University²

21 Mar 2008-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: This paper proposes an active learning technique that efficiently updates existing classifiers using fewer labeled data points than semisupervised methods, and that is well suited for learning or adapting classifiers when the spectral signatures change substantially between labeled and unlabeled data.

Abstract: Obtaining training data for land cover classification using remotely sensed data is time consuming and expensive especially for relatively inaccessible locations. Therefore, designing classifiers that use as few labeled data points as possible is highly desirable. Existing approaches typically make use of small-sample techniques and semisupervision to deal with the lack of labeled data. In this paper, we propose an active learning technique that efficiently updates existing classifiers by using fewer labeled data points than semisupervised methods. Further, unlike semisupervised methods, our proposed technique is well suited for learning or adapting classifiers when there is substantial change in the spectral signatures between labeled and unlabeled data. Thus, our active learning approach is also useful for classifying a series of spatially/temporally related images, wherein the spectral signatures vary across the images. Our interleaved semisupervised active learning method was tested on both single and spatially/temporally related hyperspectral data sets. We present empirical results that establish the superior performance of our proposed approach versus other active learning and semisupervised methods.

Journal Article•DOI•

Ubiquitous learning website: Scaffold learners by mobile devices with information-aware techniques

Gwo-Dong Chen¹, Chih-Kai Chang², Chin-Yeh Wang¹•Institutions (2)

National Central University¹, National University of Tainan²

01 Jan 2008-Computers & Education

TL;DR: This study creates a website for ubiquitous learning that lets learning take place anytime and anywhere with any available learning device, adapting to the properties of mobile devices. Experimental results indicate that the proposed system can enhance three learning performance indicators, namely academic performance, task accomplishment rates, and learning goals achievement rates.

Abstract: The portability and immediate communication properties of mobile devices influence the learning processes in interacting with peers, accessing resources and transferring data. For example, the short message and browsing functions in a cell phone provide users with timely and adaptive information access. Although many studies of mobile learning indicate the pedagogical potential of mobile devices, the screen size, computational power, battery capacity, input interfaces, and network bandwidth are too restricted to develop acceptable functionality for the entire learning processes in a handheld device. Therefore, mobile devices can be adopted to fill the gap between Web-based learning and ubiquitous mobile learning. This study first creates a website, providing functions enabling learning to take place anytime and anywhere with any available learning device, for ubiquitous learning according to various properties of mobile devices. Nowadays, learners' behaviors on a website can be recorded as learning portfolios and analyzed for behavioral diagnosis or instructional planning. A student model is then built according to the analytical results of learning portfolios and a concept map of the learning domain. Based on the student model and learners' available learning devices, three modules are developed to build a ubiquitous learning environment to enhance learning performance via learning status awareness, schedule reminders and mentor recommendation. Finally, an experiment is conducted with 54 college students after implementation of the ubiquitous learning website. Experimental results indicate that the proposed system can enhance three learning performance indicators, namely academic performance, task accomplishment rates, and learning goals achievement rates.

Proceedings Article•DOI•

Proactive learning: cost-sensitive active learning with multiple imperfect oracles

Pinar Donmez¹, Jaime G. Carbonell¹•Institutions (1)

Carnegie Mellon University¹

26 Oct 2008

TL;DR: Proactive learning generalizes active learning by relaxing unrealistic assumptions about the oracle; it jointly selects the optimal oracle and instance by casting the problem as a utility optimization problem subject to a budget constraint.

Abstract: Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications. Active learning seeks to select the most informative unlabeled instances and ask an omniscient oracle for their labels, so as to retrain the learning algorithm maximizing accuracy. However, the oracle is assumed to be infallible (never wrong), indefatigable (always answers), individual (only one oracle), and insensitive to costs (always free or always charges the same). Proactive learning relaxes all four of these assumptions, relying on a decision-theoretic approach to jointly select the optimal oracle and instance, by casting the problem as a utility optimization problem subject to a budget constraint. Results on multi-oracle optimization over several data sets demonstrate the superiority of our approach over the single-imperfect-oracle baselines in most cases.
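The joint choice of oracle and instance under a budget can be sketched as a simple greedy loop. This is only an illustration under strong assumptions and is not the paper's decision-theoretic method: here the utility of a query is assumed to be the instance's informativeness times the oracle's reliability, and pairs are ranked by utility per unit cost. The function `select_queries`, the oracle names, and all numeric values are hypothetical.

```python
from typing import Dict, List, Tuple


def select_queries(
    informativeness: Dict[str, float],
    oracles: Dict[str, Tuple[float, float]],  # name -> (reliability, cost)
    budget: float,
) -> List[Tuple[str, str]]:
    """Greedily pick (instance, oracle) pairs with the best utility per unit cost."""
    chosen: List[Tuple[str, str]] = []
    remaining = dict(informativeness)
    while remaining:
        # Only consider oracles we can still afford with the remaining budget.
        candidates = [
            (info * reliability / cost, inst, name, cost)
            for inst, info in remaining.items()
            for name, (reliability, cost) in oracles.items()
            if cost <= budget
        ]
        if not candidates:
            break
        _, inst, name, cost = max(candidates)
        chosen.append((inst, name))
        budget -= cost
        del remaining[inst]  # each instance is queried at most once
    return chosen


# Hypothetical pool and oracles (assumed values, for illustration only).
pool = {"x1": 0.9, "x2": 0.5, "x3": 0.2}
oracles = {"expert": (0.95, 1.2), "novice": (0.70, 1.0)}
plan = select_queries(pool, oracles, budget=3.5)
```

With these numbers the expert offers more utility per unit cost, so it is used until the remaining budget only covers the cheaper novice, which then labels the last instance.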

Journal Article•DOI•

Survey paper: Optimal experimental design and some related control problems

Luc Pronzato¹•Institutions (1)

University of Nice Sophia Antipolis¹

01 Feb 2008-Automatica

TL;DR: The strong relations between experimental design and control are traced, such as the use of optimal inputs to obtain precise parameter estimation in dynamical systems and the introduction of suitably designed perturbations in adaptive control.

Active Learning with Real Annotation Costs

Burr Settles, Mark Craven, Lewis A. Friedland

01 Jan 2008

TL;DR: A detailed empirical study of active learning with annotation costs in four real-world domains involving human annotators is presented, to better understand the nature of actual labeling costs in domains where labeling costs may vary.

Abstract: The goal of active learning is to minimize the cost of training an accurate model by allowing the learner to choose which instances are labeled for training. However, most research in active learning to date has assumed that the cost of acquiring labels is the same for all instances. In domains where labeling costs may vary, a reduction in the number of labeled instances does not guarantee a reduction in cost. To better understand the nature of actual labeling costs in such domains, we present a detailed empirical study of active learning with annotation costs in four real-world domains involving human annotators.
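The point that fewer labels need not mean lower cost can be seen with a toy calculation. The per-instance annotation times below are invented for illustration and are not data from the study; the variable names are likewise assumptions.

```python
# Hypothetical annotation times in seconds (invented numbers, not from the study).
uniform_sample = [20, 25, 22, 18, 24, 21, 23, 19]  # 8 quick-to-label instances
active_sample = [60, 75, 58, 66]                   # 4 harder, slower instances

uniform_cost = sum(uniform_sample)
active_cost = sum(active_sample)

# The actively chosen batch has half as many labels but a higher total cost.
fewer_labels = len(active_sample) < len(uniform_sample)
higher_cost = active_cost > uniform_cost
```

If the instances a learner finds most informative also tend to be the slowest to annotate, counting labels alone can overstate the savings.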

Posted Content•

Adaptive design and analysis of supercomputer experiments

Robert B. Gramacy¹, Herbert K. H. Lee²•Institutions (2)

University of Cambridge¹, University of California, Santa Cruz²

28 May 2008-arXiv: Applications

TL;DR: This article develops an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment, using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature.

Abstract: Computer experiments are often performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in obtaining insufficient information in parts of the space, particularly when the surface calls for a nonstationary model. We propose an approach that automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. The newly developed Bayesian treed Gaussian process is used as the surrogate model, and a fully Bayesian approach allows explicit measures of uncertainty. We develop an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment, by using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature. The merits of this approach are borne out in several examples, including the motivating computational fluid dynamics simulation of a rocket booster.

Proceedings Article•DOI•

Video corpus annotation using active learning

Stéphane Ayache, Georges Quénot

30 Mar 2008

TL;DR: This paper describes the collaborative annotation system used to annotate the High Level Features (HLF) in the development set of TRECVID 2007 and shows that Active Learning makes it possible both to get the most useful information from the partial annotation and to significantly reduce the annotation effort per participant relative to previous collaborative annotations.

Abstract: Concept indexing in multimedia libraries is very useful for users searching and browsing, but it is a very challenging research problem as well. Beyond the systems' implementation issues, semantic indexing is strongly dependent upon the size and quality of the training examples. In this paper, we describe the collaborative annotation system used to annotate the High Level Features (HLF) in the development set of TRECVID 2007. This system is web-based and takes advantage of an Active Learning approach. We show that Active Learning makes it possible both to get the most useful information from the partial annotation and to significantly reduce the annotation effort per participant relative to previous collaborative annotations.

Proceedings Article•DOI•

Semi-supervised SVM batch mode active learning for image retrieval

Steven C. H. Hoi¹, Rong Jin², Jianke Zhu³, Michael R. Lyu³•Institutions (3)

Nanyang Technological University¹, Michigan State University², The Chinese University of Hong Kong³

23 Jun 2008

TL;DR: This paper proposes a novel scheme that exploits both semi-supervised kernel learning and batch mode active learning for relevance feedback in CBIR and shows that the proposed scheme is significantly more effective than other state-of-the-art approaches.

Abstract: Active learning has been shown as a key technique for improving content-based image retrieval (CBIR) performance. Among various methods, support vector machine (SVM) active learning is popular for its application to relevance feedback in CBIR. However, the regular SVM active learning has two main drawbacks when used for relevance feedback. First, SVM often suffers from learning with a small number of labeled examples, which is the case in relevance feedback. Second, SVM active learning usually does not take into account the redundancy among examples, and therefore could select multiple examples in relevance feedback that are similar (or even identical) to each other. In this paper, we propose a novel scheme that exploits both semi-supervised kernel learning and batch mode active learning for relevance feedback in CBIR. In particular, a kernel function is first learned from a mixture of labeled and unlabeled examples. The kernel will then be used to effectively identify the informative and diverse examples for active learning via a min-max framework. An empirical study with relevance feedback of CBIR showed that the proposed scheme is significantly more effective than other state-of-the-art approaches.

Proceedings Article•DOI•

Authorship Attribution and Verification with Many Authors and Limited Data

Kim Luyckx¹, Walter Daelemans¹•Institutions (1)

University of Antwerp¹

18 Aug 2008

TL;DR: Using a new corpus with 145 authors, this paper shows how having many authors affects feature selection and learning, and shows that a memory-based learning approach is robust for authorship attribution and verification with many authors and limited training data when compared to eager learning methods such as SVMs and maximum entropy learning.

Abstract: Most studies in statistical or machine learning based authorship attribution focus on two or a few authors. This leads to an overestimation of the importance of the features extracted from the training data and found to be discriminating for these small sets of authors. Most studies also use sizes of training data that are unrealistic for situations in which stylometry is applied (e.g., forensics), and thereby overestimate the accuracy of their approach in these situations. A more realistic interpretation of the task is as an authorship verification problem that we approximate by pooling data from many different authors as negative examples. In this paper, we show, on the basis of a new corpus with 145 authors, what the effect is of many authors on feature selection and learning, and show robustness of a memory-based learning approach in doing authorship attribution and verification with many authors and limited training data when compared to eager learning methods such as SVMs and maximum entropy learning.

Journal Article•DOI•

Classifying Single-Trial EEG During Motor Imagery by Iterative Spatio-Spectral Patterns Learning (ISSPL)

Wei Wu¹, Xiaorong Gao¹, Bo Hong¹, Shangkai Gao¹•Institutions (1)

Tsinghua University¹

16 May 2008-IEEE Transactions on Biomedical Engineering

TL;DR: Experimental results on two datasets show that the proposed algorithm can correctly identify the discriminative frequency bands, demonstrating the algorithm's superiority over contemporary approaches in classification performance.

Abstract: In most current motor-imagery-based brain-computer interfaces (BCIs), machine learning is carried out in two consecutive stages: feature extraction and feature classification. Feature extraction has focused on automatic learning of spatial filters, with little or no attention being paid to optimization of parameters for temporal filters that still require time-consuming, ad hoc manual tuning. In this paper, we present a new algorithm termed iterative spatio-spectral patterns learning (ISSPL) that employs statistical learning theory to perform automatic learning of spatio-spectral filters. In ISSPL, spectral filters and the classifier are simultaneously parameterized for optimization to achieve good generalization performance. A detailed derivation and theoretical analysis of ISSPL are given. Experimental results on two datasets show that the proposed algorithm can correctly identify the discriminative frequency bands, demonstrating the algorithm's superiority over contemporary approaches in classification performance.

Journal Article•DOI•

Learning to Control in Operational Space

Jan Peters¹, Stefan Schaal¹•Institutions (1)

University of Southern California¹

01 Feb 2008-The International Journal of Robotics Research

TL;DR: The proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm and demonstrates feasibility for complex high degree-of-freedom robots.

Abstract: One of the most general frameworks for phrasing control problems for complex, redundant robots is operational-space control. However, while this framework is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots (e.g. humanoid robots). In this paper, we suggest a learning approach for operational-space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees of freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component of our work is based on the insight that many operational-space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational-space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees-of-freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.

Proceedings Article•DOI•

Active Learning with Sampling by Uncertainty and Density for Word Sense Disambiguation and Text Classification

Jingbo Zhu¹, Huizhen Wang¹, Tianshun Yao¹, Benjamin K. Tsou²•Institutions (2)

Northeastern University (China)¹, City University of Hong Kong²

18 Aug 2008

TL;DR: A new selective sampling technique, sampling by uncertainty and density (SUD), is presented, in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier.

Abstract: This paper addresses two issues of active learning. First, because uncertainty sampling often fails by selecting outliers, this paper presents a new selective sampling technique, sampling by uncertainty and density (SUD), in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier. Second, a technique of sampling by clustering (SBC) is applied to build a representative initial training data set for active learning. Finally, we implement a new active learning algorithm that combines the SUD and SBC techniques. The experimental results from three real-world data sets show that our method outperforms competing methods, particularly at the early stages of active learning.
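
The selection rule described in this abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes that uncertainty is the entropy of the classifier's posterior, that density is the mean cosine similarity to the k nearest neighbors in the unlabeled pool, and that the two are combined by multiplication. The function names (`entropy`, `knn_density`, `sud_select`) are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a class-posterior distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_density(i, vectors, k):
    """Mean similarity of example i to its k most similar pool neighbors."""
    sims = sorted(
        (cosine(vectors[i], v) for j, v in enumerate(vectors) if j != i),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0

def sud_select(vectors, posteriors, k=2):
    """Return the index with the highest (assumed) entropy * density score."""
    scores = [
        entropy(p) * knn_density(i, vectors, k)
        for i, p in enumerate(posteriors)
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

Under these assumptions, an isolated example with a maximally uncertain posterior is down-weighted by its low density, so a slightly less uncertain example inside a dense region is queried instead.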

Journal Article•DOI•

Bayesian approaches to associative learning: From passive to active learning

John K. Kruschke¹•Institutions (1)

Indiana University¹

01 Aug 2008-Learning & Behavior

TL;DR: The first part of this article reviews two Bayesian accounts of backward blocking, a phenomenon that is challenging for many traditional theories; the second part focuses on two formalizations of optimal active learning: maximizing either the expected information gain or the probability gain.

Abstract: Traditional associationist models represent an organism's knowledge state by a single strength of association on each associative link. Bayesian models instead represent knowledge by a distribution of graded degrees of belief over a range of candidate hypotheses. Many traditional associationist models assume that the learner is passive, adjusting strengths of association only in reaction to stimuli delivered by the environment. Bayesian models, on the other hand, can describe how the learner should actively probe the environment to learn optimally. The first part of this article reviews two Bayesian accounts of backward blocking, a phenomenon that is challenging for many traditional theories. The broad Bayesian framework, in which these models reside, is also selectively reviewed. The second part focuses on two formalizations of optimal active learning: maximizing either the expected information gain or the probability gain. New analyses of optimal active learning by a Kalman filter and by a noisy-logic gate show that these two Bayesian models make different predictions for some environments. The Kalman filter predictions are disconfirmed in at least one case.
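
The two objectives named in this abstract can be illustrated for a discrete hypothesis space. The sketch below is a generic textbook formulation, not the article's Kalman-filter or noisy-logic-gate analysis: it assumes a finite set of hypotheses with a prior, and a query described by a table `likelihood[o][h]` giving P(outcome o | hypothesis h). The function name `expected_gains` is hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def expected_gains(prior, likelihood):
    """Return (expected information gain, expected probability gain) for a query.

    prior[h]         : P(h) over candidate hypotheses
    likelihood[o][h] : P(outcome o | h) for this query
    """
    info_gain = 0.0
    prob_gain = 0.0
    for lik_o in likelihood:
        joint = [p * l for p, l in zip(prior, lik_o)]
        p_outcome = sum(joint)
        if p_outcome == 0:
            continue  # impossible outcome contributes nothing
        post = [j / p_outcome for j in joint]
        # Information gain: reduction in uncertainty over hypotheses.
        info_gain += p_outcome * (entropy(prior) - entropy(post))
        # Probability gain: increase in belief in the most probable hypothesis.
        prob_gain += p_outcome * (max(post) - max(prior))
    return info_gain, prob_gain
```

An active learner in this setting would evaluate `expected_gains` for each candidate query and choose the query that maximizes whichever of the two quantities it is optimizing; the two criteria need not rank queries identically.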

Book Chapter•DOI•

Towards Scalable Dataset Construction: An Active Learning Approach

Brendan M. Collins¹, Jia Deng¹, Kai Li¹, Li Fei-Fei¹•Institutions (1)

Princeton University¹

20 Oct 2008

TL;DR: This work presents a discriminative learning process which employs active, online learning to quickly classify many images with minimal user input, and demonstrates precision which is often superior to the state-of-the-art, with scalability which exceeds previous work.

Abstract: As computer vision research considers more object categories and greater variation within object categories, it is clear that larger and more exhaustive datasets are necessary. However, the process of collecting such datasets is laborious and monotonous. We consider the setting in which many images have been automatically collected for a visual category (typically by automatic internet search), and we must separate relevant images from noise. We present a discriminative learning process which employs active, online learning to quickly classify many images with minimal user input. The principal advantage of this work over previous endeavors is its scalability. We demonstrate precision which is often superior to the state-of-the-art, with scalability which exceeds previous work.

Book•DOI•

Data Analysis, Machine Learning, and Applications

Christine Preisach, Hans Burkhardt, Lars Schmidt-Thieme, Reinhold Decker

01 Jan 2008

TL;DR: A truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance are considered and it is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool.

Abstract: We consider distance-based similarity measures for real-valued vectors of interest in kernel-based machine learning algorithms, in particular a truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance. It is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool. These kernels may be better suited than standard kernels (like the RBF) in certain situations, which are described in the paper. Some rather general results concerning positivity properties are presented in detail, as well as some interesting ways of proving the p.s.d. property.

Proceedings Article•DOI•

Two-Dimensional Active Learning for image classification

Guo-Jun Qi¹, Xian-Sheng Hua², Yong Rui², Jinhui Tang¹, Hong-Jiang Zhang²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

23 Jun 2008

TL;DR: This paper proposes a two-dimensional active learning scheme that not only considers the sample dimension but also the label dimension, and it is shown that the traditional active learning formulation is a special case of 2DAL when there is only one label.

Abstract: In this paper, we propose a two-dimensional active learning scheme and show its application in image classification. Traditional active learning methods select samples only along the sample dimension. While this is the right strategy in binary classification, it is sub-optimal for multi-label classification. In multi-label classification, we argue that, for each selected sample, only a subset of the more effective labels needs to be annotated, while the others can be inferred by exploring the correlations among the labels. The reason is that different labels contribute differently to minimizing the classification error, owing to the inherent label correlations. To this end, we propose to select sample-label pairs, rather than only samples, to minimize a multi-label Bayesian classification error bound. This new active learning strategy considers not only the sample dimension but also the label dimension, and we call it Two-Dimensional Active Learning (2DAL). We also show that the traditional active learning formulation is a special case of 2DAL when there is only one label. Extensive experiments conducted on two real-world applications show that 2DAL significantly outperforms the best existing approaches, which do not take label correlation into account.

Journal Article•DOI•

A stopping criterion for active learning

Andreas Vlachos¹•Institutions (1)

University of Cambridge¹

01 Jul 2008-Computer Speech & Language

TL;DR: This work presents a stopping criterion for active learning based on the way instances are selected during uncertainty-based sampling and verifies its applicability in a variety of settings.

Journal Article•DOI•

Using a style-based ant colony system for adaptive learning

Tzone-I Wang¹, Kun-Te Wang¹, Yueh-Min Huang¹•Institutions (1)

National Cheng Kung University¹

01 May 2008-Expert Systems With Applications

TL;DR: An extended approach of ant colony optimization is proposed, which is based on a recent metaheuristic method for discovering group patterns that is designed to help learners advance their on-line learning along an adaptive learning path.

Abstract: Adaptive learning provides an alternative to the traditional "one size fits all" approach and has driven the development of teaching and learning towards a dynamic learning process. Therefore, exploring adaptive paths that suit learners' personalized needs is an interesting issue. This paper proposes an extended approach of ant colony optimization, which is based on a recent metaheuristic method for discovering group patterns that is designed to help learners advance their on-line learning along an adaptive learning path. The investigation emphasizes the relationship of learning content to the learning style of each participant in adaptive learning. An adaptive learning rule was developed to identify how learners of different learning styles may associate those contents which have the higher probability of being useful to form an optimal learning path. A style-based ant colony system is implemented and its algorithm parameters are optimized to conform to the actual pedagogical process. A survey was also conducted to evaluate the validity and efficiency of the system in producing adaptive paths for different learners. The results reveal that both the learners and the lecturers agree that the style-based ant colony system is able to provide useful supplementary learning paths.
