
Showing papers on "Active learning (machine learning)" published in 2008


Journal ArticleDOI
15 Oct 2008
TL;DR: KEEL as discussed by the authors is a software tool to assess evolutionary algorithms for data mining problems of various kinds including regression, classification, unsupervised learning, etc., which includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL.
Abstract: This paper introduces KEEL, a software tool to assess evolutionary algorithms for Data Mining problems of various kinds, including regression, classification, unsupervised learning, etc. It includes evolutionary learning algorithms based on different approaches: Pittsburgh, Michigan and IRL, as well as the integration of evolutionary learning techniques with different pre-processing techniques, allowing it to perform a complete analysis of any learning model in comparison to existing software tools. Moreover, KEEL has been designed with a double goal: research and education.

1,297 citations


Proceedings ArticleDOI
25 Oct 2008
TL;DR: This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Abstract: Active learning is well-suited to many problems in natural language processing, where unlabeled data may be abundant but annotation is slow and expensive. This paper aims to shed light on the best active learning approaches for sequence labeling tasks such as information extraction and document segmentation. We survey previously used query selection strategies for sequence models, and propose several novel algorithms to address their shortcomings. We also conduct a large-scale empirical comparison using multiple corpora, which demonstrates that our proposed methods advance the state of the art.

1,003 citations


Journal ArticleDOI
TL;DR: This paper examines learning of complex motor skills with human-like limbs, and combines the idea of modular motor control by means of motor primitives as a suitable way to generate parameterized control policies for reinforcement learning with the theory of stochastic policy gradient learning.

921 citations


Proceedings ArticleDOI
05 Jul 2008
TL;DR: This work presents an active learning scheme that exploits cluster structure in data.
Abstract: We present an active learning scheme that exploits cluster structure in data.

493 citations


Journal ArticleDOI
TL;DR: This paper reviews key ideas in the literature on semi-supervised support vector machines (S3VMs) and studies the performance and behavior of various S3VM algorithms together, under a common experimental setting.
Abstract: Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S3VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms has recently been proposed for solving S3VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3VM algorithms are studied together, under a common experimental setting.

437 citations


Proceedings Article
13 Jul 2008
TL;DR: The main contributions of this work lie in the presentation of a general formalization of zero-data learning, in an experimental analysis of its properties and in empirical evidence showing that generalization is possible and significant in this context.
Abstract: We introduce the problem of zero-data learning, where a model must generalize to classes or tasks for which no training data are available and only a description of the classes or tasks is provided. Zero-data learning is useful for problems where the set of classes to distinguish or tasks to solve is very large and is not entirely covered by the training data. The main contributions of this work lie in the presentation of a general formalization of zero-data learning, in an experimental analysis of its properties and in empirical evidence showing that generalization is possible and significant in this context. The experimental work of this paper addresses two classification problems of character recognition and a multitask ranking problem in the context of drug discovery. Finally, we conclude by discussing how this new framework could lead to a novel perspective on how to extend machine learning towards AI, where an agent can be given a specification for a learning problem before attempting to solve it (with very few or even zero examples).

437 citations


Book ChapterDOI
08 Dec 2008
TL;DR: This paper extends previous work on policy learning from the immediate reward case to episodic reinforcement learning, resulting in a general, common framework also connected to policy gradient methods and yielding a novel algorithm for policy learning that is particularly well-suited for dynamic motor primitives.
Abstract: Many motor skills in humanoid robotics can be learned using parametrized motor primitives as done in imitation learning. However, most interesting motor learning problems are high-dimensional reinforcement learning problems often beyond the reach of current methods. In this paper, we extend previous work on policy learning from the immediate reward case to episodic reinforcement learning. We show that this results in a general, common framework also connected to policy gradient methods and yielding a novel algorithm for policy learning that is particularly well-suited for dynamic motor primitives. The resulting algorithm is an EM-inspired algorithm applicable to complex motor learning tasks. We compare this algorithm to several well-known parametrized policy search methods and show that it outperforms them. We apply it in the context of motor learning and show that it can learn a complex Ball-in-a-Cup task using a real Barrett WAM™ robot arm.

411 citations


Journal ArticleDOI
Yaochu Jin1, Bernhard Sendhoff1
01 May 2008
TL;DR: An overview of the existing research on multiobjective machine learning, focusing on supervised learning is provided, and a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning.
Abstract: Machine learning is inherently a multiobjective task. Traditionally, however, either only one of the objectives is adopted as the cost function or multiple objectives are aggregated to a scalar cost function. This can be mainly attributed to the fact that most conventional learning algorithms can only deal with a scalar cost function. Over the last decade, efforts on solving machine learning problems using the Pareto-based multiobjective optimization methodology have gained increasing impetus, particularly due to the great success of multiobjective optimization using evolutionary algorithms and other population-based stochastic search methods. It has been shown that Pareto-based multiobjective learning approaches are more powerful compared to learning algorithms with a scalar cost function in addressing various topics of machine learning, such as clustering, feature selection, improvement of generalization ability, knowledge extraction, and ensemble generation. One common benefit of the different multiobjective learning approaches is that a deeper insight into the learning problem can be gained by analyzing the Pareto front composed of multiple Pareto-optimal solutions. This paper provides an overview of the existing research on multiobjective machine learning, focusing on supervised learning. In addition, a number of case studies are provided to illustrate the major benefits of the Pareto-based approach to machine learning, e.g., how to identify interpretable models and models that can generalize on unseen data from the obtained Pareto-optimal solutions. Three approaches to Pareto-based multiobjective ensemble generation are compared and discussed in detail. Finally, potentially interesting topics in multiobjective machine learning are suggested.

399 citations
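
The Pareto-based view in the multiobjective learning survey above can be illustrated with a small sketch. It is not taken from the paper: the two objectives (error and model complexity, both minimized) and the candidate values are hypothetical, and the filter is the plain definition of non-dominance rather than any particular evolutionary algorithm.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (error, complexity) pairs for five candidate models.
models = [(0.10, 50.0), (0.12, 20.0), (0.20, 5.0), (0.15, 30.0), (0.25, 5.0)]
front = pareto_front(models)
print(front)
```

Each surviving point is a different trade-off between the two objectives; a scalar cost function would have returned only one of them.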


Proceedings ArticleDOI
Xinjian Guo1, Yilong Yin1, Cailing Dong1, Gongping Yang1, Guang-Tong Zhou1 
18 Oct 2008
TL;DR: This paper reviews academic activities specific to the class imbalance problem, investigates various remedies at four different levels according to learning phases, and closes with some future directions.
Abstract: The class imbalance problem has been recognized in many practical domains and has been a hot topic of machine learning in recent years. In such a problem, almost all the examples are labeled as one class, while far fewer examples are labeled as the other class, usually the more important class. In this case, standard machine learning algorithms tend to be overwhelmed by the majority class and ignore the minority class, since traditional classifiers seek accurate performance over a full range of instances. This paper first reviews academic activities specific to the class imbalance problem. It then investigates various remedies at four different levels according to learning phases. After surveying evaluation metrics and some other related factors, the paper closes with some future directions.

384 citations


Journal ArticleDOI
TL;DR: Experimental results indicated that applying the proposed genetic-based personalized e-learning system for web-based learning is superior to the freely browsing learning mode because of high quality and concise learning path for individual learners.
Abstract: Personalized curriculum sequencing is an important research issue for web-based learning systems because no fixed learning paths will be appropriate for all learners. Therefore, many researchers focused on developing e-learning systems with personalized learning mechanisms to assist on-line web-based learning and adaptively provide learning paths in order to promote the learning performance of individual learners. However, most personalized e-learning systems usually neglect to consider if learner ability and the difficulty level of the recommended courseware are matched to each other while performing personalized learning services. Moreover, the problem of concept continuity of learning paths also needs to be considered while implementing personalized curriculum sequencing because smooth learning paths enhance the linked strength between learning concepts. Generally, inappropriate courseware leads to learner cognitive overload or disorientation during learning processes, thus reducing learning performance. Therefore, compared to the freely browsing learning mode without any personalized learning path guidance used in most web-based learning systems, this paper assesses whether the proposed genetic-based personalized e-learning system, which can generate appropriate learning paths according to the incorrect testing responses of an individual learner in a pre-test, provides benefits in terms of learning performance promotion while learning. Based on the results of pre-test, the proposed genetic-based personalized e-learning system can conduct personalized curriculum sequencing through simultaneously considering courseware difficulty level and the concept continuity of learning paths to support web-based learning. Experimental results indicated that applying the proposed genetic-based personalized e-learning system for web-based learning is superior to the freely browsing learning mode because of high quality and concise learning path for individual learners.

353 citations


Proceedings Article
15 Apr 2008
TL;DR: This paper shows how an adversary can exploit statistical machine learning, as used in the SpamBayes spam filter, to render it useless--even if the adversary's access is limited to only 1% of the training messages.
Abstract: Using statistical machine learning for making security decisions introduces new vulnerabilities in large scale systems. This paper shows how an adversary can exploit statistical machine learning, as used in the SpamBayes spam filter, to render it useless--even if the adversary's access is limited to only 1% of the training messages. We further demonstrate a new class of focused attacks that successfully prevent victims from receiving specific email messages. Finally, we introduce two new types of defenses against these attacks.

Proceedings Article
08 Dec 2008
TL;DR: Through experiments on the text-aided image classification and cross-language classification tasks, it is demonstrated that the translated learning framework can greatly outperform many state-of-the-art baseline methods.
Abstract: This paper investigates a new machine learning strategy called translated learning. Unlike many previous learning tasks, we focus on how to use labeled data from one feature space to enhance the classification of other entirely different learning spaces. For example, we might wish to use labeled text data to help learn a model for classifying image data, when the labeled images are difficult to obtain. An important aspect of translated learning is to build a "bridge" to link one feature space (known as the "source space") to another space (known as the "target space") through a translator in order to migrate the knowledge from source to target. The translated learning solution uses a language model to link the class labels to the features in the source spaces, which in turn is translated to the features in the target spaces. Finally, this chain of linkages is completed by tracing back to the instances in the target spaces. We show that this path of linkage can be modeled using a Markov chain and risk minimization. Through experiments on the text-aided image classification and cross-language classification tasks, we demonstrate that our translated learning framework can greatly outperform many state-of-the-art baseline methods.

Journal ArticleDOI
TL;DR: An active learning technique that efficiently updates existing classifiers by using fewer labeled data points than semisupervised methods is proposed that is well suited for learning or adapting classifiers when there is substantial change in the spectral signatures between labeled and unlabeled data.
Abstract: Obtaining training data for land cover classification using remotely sensed data is time consuming and expensive especially for relatively inaccessible locations. Therefore, designing classifiers that use as few labeled data points as possible is highly desirable. Existing approaches typically make use of small-sample techniques and semisupervision to deal with the lack of labeled data. In this paper, we propose an active learning technique that efficiently updates existing classifiers by using fewer labeled data points than semisupervised methods. Further, unlike semisupervised methods, our proposed technique is well suited for learning or adapting classifiers when there is substantial change in the spectral signatures between labeled and unlabeled data. Thus, our active learning approach is also useful for classifying a series of spatially/temporally related images, wherein the spectral signatures vary across the images. Our interleaved semisupervised active learning method was tested on both single and spatially/temporally related hyperspectral data sets. We present empirical results that establish the superior performance of our proposed approach versus other active learning and semisupervised methods.

Journal ArticleDOI
TL;DR: This study creates a website that enables ubiquitous learning anytime and anywhere with any available learning device, according to various properties of mobile devices; experimental results indicate that the proposed system can enhance three learning performance indicators, namely academic performance, task accomplishment rates, and learning goals achievement rates.
Abstract: The portability and immediate communication properties of mobile devices influence the learning processes in interacting with peers, accessing resources and transferring data. For example, the short message and browsing functions in a cell phone provide users with timely and adaptive information access. Although many studies of mobile learning indicate the pedagogical potential of mobile devices, the screen size, computational power, battery capacity, input interfaces, and network bandwidth are too restricted to develop acceptable functionality for the entire learning processes in a handheld device. Therefore, mobile devices can be adopted to fill the gap between Web-based learning and ubiquitous mobile learning. This study first creates a website, providing functions enabling learning to take place anytime and anywhere with any available learning device, for ubiquitous learning according to various properties of mobile devices. Nowadays, learners' behaviors on a website can be recorded as learning portfolios and analyzed for behavioral diagnosis or instructional planning. A student model is then built according to the analytical results of learning portfolios and a concept map of the learning domain. Based on the student model and learners' available learning devices, three modules are developed to build a ubiquitous learning environment to enhance learning performance via learning status awareness, schedule reminders and mentor recommendation. Finally, an experiment is conducted with 54 college students after implementation of the ubiquitous learning website. Experimental results indicate that the proposed system can enhance three learning performance indicators, namely academic performance, task accomplishment rates, and learning goals achievement rates.

Proceedings ArticleDOI
26 Oct 2008
TL;DR: Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications, by casting the problem as a utility optimization problem subject to a budget constraint.
Abstract: Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications. Active learning seeks to select the most informative unlabeled instances and ask an omniscient oracle for their labels, so as to retrain the learning algorithm maximizing accuracy. However, the oracle is assumed to be infallible (never wrong), indefatigable (always answers), individual (only one oracle), and insensitive to costs (always free or always charges the same). Proactive learning relaxes all four of these assumptions, relying on a decision-theoretic approach to jointly select the optimal oracle and instance, by casting the problem as a utility optimization problem subject to a budget constraint. Results on multi-oracle optimization over several data sets demonstrate the superiority of our approach over the single-imperfect-oracle baselines in most cases.
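
The joint choice of oracle and instance under a budget, as described in the abstract above, can be sketched as follows. This is a minimal illustration and not the paper's method: the utility formula (informativeness times answer probability times correctness probability, per unit cost), the greedy loop, and all names and numbers are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def select_queries(
    informativeness: Dict[str, float],
    oracles: Dict[str, Dict[str, float]],
    budget: float,
) -> List[Tuple[str, str]]:
    """Greedily pick (instance, oracle) pairs by assumed utility per unit cost."""
    chosen: List[Tuple[str, str]] = []
    remaining = dict(informativeness)
    while remaining:
        best, best_score = None, 0.0
        for inst, info in remaining.items():
            for name, o in oracles.items():
                if o["cost"] > budget:
                    continue  # this oracle is no longer affordable
                # Assumed utility: value of the label, discounted by the chance
                # the oracle answers at all and the chance its answer is right.
                score = info * o["p_answer"] * o["p_correct"] / o["cost"]
                if score > best_score:
                    best, best_score = (inst, name), score
        if best is None:
            break  # nothing affordable is left
        chosen.append(best)
        budget -= oracles[best[1]]["cost"]
        del remaining[best[0]]
    return chosen


# Hypothetical instances and two hypothetical oracles.
informativeness = {"x1": 0.9, "x2": 0.5, "x3": 0.1}
oracles = {
    "expert": {"cost": 3.0, "p_answer": 1.0, "p_correct": 0.95},
    "novice": {"cost": 1.0, "p_answer": 0.5, "p_correct": 0.6},
}
plan = select_queries(informativeness, oracles, budget=4.0)
print(plan)
```

With these numbers the most informative instance goes to the reliable but expensive oracle, and the remaining budget buys one query from the cheaper, less reliable one.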

Journal ArticleDOI
TL;DR: The strong relations between experimental design and control are traced, such as the use of optimal inputs to obtain precise parameter estimation in dynamical systems and the introduction of suitably designed perturbations in adaptive control.

01 Jan 2008
TL;DR: A detailed empirical study of active learning with annotation costs in four real-world domains involving human annotators is presented, to better understand the nature of actual labeling costs in domains where labeling costs may vary.
Abstract: The goal of active learning is to minimize the cost of training an accurate model by allowing the learner to choose which instances are labeled for training. However, most research in active learning to date has assumed that the cost of acquiring labels is the same for all instances. In domains where labeling costs may vary, a reduction in the number of labeled instances does not guarantee a reduction in cost. To better understand the nature of actual labeling costs in such domains, we present a detailed empirical study of active learning with annotation costs in four real-world domains involving human annotators.

Posted Content
TL;DR: In this article, an adaptive sequential design framework was developed to cope with an asynchronous, random, agent-based supercomputing environment, by using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature.
Abstract: Computer experiments are often performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in obtaining insufficient information in parts of the space, particularly when the surface calls for a nonstationary model. We propose an approach that automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. The newly developed Bayesian treed Gaussian process is used as the surrogate model, and a fully Bayesian approach allows explicit measures of uncertainty. We develop an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment, by using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature. The merits of this approach are borne out in several examples, including the motivating computational fluid dynamics simulation of a rocket booster.

Proceedings ArticleDOI
30 Mar 2008
TL;DR: This paper describes the collaborative annotation system used to annotate the High Level Features (HLF) in the development set of TRECVID 2007 and shows that Active Learning allows simultaneously getting the most useful information from the partial annotation and significantly reducing the annotation effort per participant relative to previous collaborative annotations.
Abstract: Concept indexing in multimedia libraries is very useful for users searching and browsing but it is a very challenging research problem as well. Beyond system implementation issues, semantic indexing is strongly dependent upon the size and quality of the training examples. In this paper, we describe the collaborative annotation system used to annotate the High Level Features (HLF) in the development set of TRECVID 2007. This system is web-based and takes advantage of an Active Learning approach. We show that Active Learning allows simultaneously getting the most useful information from the partial annotation and significantly reducing the annotation effort per participant relative to previous collaborative annotations.

Proceedings ArticleDOI
23 Jun 2008
TL;DR: This paper proposes a novel scheme that exploits both semi-supervised kernel learning and batch mode active learning for relevance feedback in CBIR and shows that the proposed scheme is significantly more effective than other state-of-the-art approaches.
Abstract: Active learning has been shown as a key technique for improving content-based image retrieval (CBIR) performance. Among various methods, support vector machine (SVM) active learning is popular for its application to relevance feedback in CBIR. However, the regular SVM active learning has two main drawbacks when used for relevance feedback. First, SVM often suffers from learning with a small number of labeled examples, which is the case in relevance feedback. Second, SVM active learning usually does not take into account the redundancy among examples, and therefore could select multiple examples in relevance feedback that are similar (or even identical) to each other. In this paper, we propose a novel scheme that exploits both semi-supervised kernel learning and batch mode active learning for relevance feedback in CBIR. In particular, a kernel function is first learned from a mixture of labeled and unlabeled examples. The kernel will then be used to effectively identify the informative and diverse examples for active learning via a min-max framework. An empirical study with relevance feedback of CBIR showed that the proposed scheme is significantly more effective than other state-of-the-art approaches.

Proceedings ArticleDOI
18 Aug 2008
TL;DR: Using a new corpus with 145 authors, this paper shows the effect of many authors on feature selection and learning, and the robustness of a memory-based learning approach to authorship attribution and verification with many authors and limited training data, compared to eager learning methods such as SVMs and maximum entropy learning.
Abstract: Most studies in statistical or machine learning based authorship attribution focus on two or a few authors. This leads to an overestimation of the importance of the features extracted from the training data and found to be discriminating for these small sets of authors. Most studies also use sizes of training data that are unrealistic for situations in which stylometry is applied (e.g., forensics), and thereby overestimate the accuracy of their approach in these situations. A more realistic interpretation of the task is as an authorship verification problem that we approximate by pooling data from many different authors as negative examples. In this paper, we show, on the basis of a new corpus with 145 authors, what the effect is of many authors on feature selection and learning, and show robustness of a memory-based learning approach in doing authorship attribution and verification with many authors and limited training data when compared to eager learning methods such as SVMs and maximum entropy learning.

Journal ArticleDOI
TL;DR: Experimental results on two datasets show that the proposed algorithm can correctly identify the discriminative frequency bands, demonstrating the algorithm's superiority over contemporary approaches in classification performance.
Abstract: In most current motor-imagery-based brain-computer interfaces (BCIs), machine learning is carried out in two consecutive stages: feature extraction and feature classification. Feature extraction has focused on automatic learning of spatial filters, with little or no attention being paid to optimization of parameters for temporal filters that still require time-consuming, ad hoc manual tuning. In this paper, we present a new algorithm termed iterative spatio-spectral patterns learning (ISSPL) that employs statistical learning theory to perform automatic learning of spatio-spectral filters. In ISSPL, spectral filters and the classifier are simultaneously parameterized for optimization to achieve good generalization performance. A detailed derivation and theoretical analysis of ISSPL are given. Experimental results on two datasets show that the proposed algorithm can correctly identify the discriminative frequency bands, demonstrating the algorithm's superiority over contemporary approaches in classification performance.

Journal ArticleDOI
TL;DR: The proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm and demonstrates feasibility for complex high degree-of-freedom robots.
Abstract: One of the most general frameworks for phrasing control problems for complex, redundant robots is operational-space control. However, while this framework is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots (e.g. humanoid robots). In this paper, we suggest a learning approach for operational-space control as a direct inverse model learning problem. A first important insight for this paper is that a physically correct solution to the inverse problem with redundant degrees of freedom does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component of our work is based on the insight that many operational-space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational-space controller. From the machine learning point of view, this learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward. We employ an expectation-maximization policy search algorithm in order to solve this problem. Evaluations on a three degrees-of-freedom robot arm are used to illustrate the suggested approach. The application to a physically realistic simulator of the anthropomorphic SARCOS Master arm demonstrates feasibility for complex high degree-of-freedom robots. We also show that the proposed method works in the setting of learning resolved motion rate control on a real, physical Mitsubishi PA-10 medical robotics arm.

Proceedings ArticleDOI
18 Aug 2008
TL;DR: A new selective sampling technique, sampling by uncertainty and density (SUD), is presented, in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier.
Abstract: This paper addresses two issues of active learning. First, because uncertainty sampling often fails by selecting outliers, this paper presents a new selective sampling technique, sampling by uncertainty and density (SUD), in which a k-Nearest-Neighbor-based density measure is adopted to determine whether an unlabeled example is an outlier. Second, a technique of sampling by clustering (SBC) is applied to build a representative initial training data set for active learning. Finally, we implement a new algorithm of active learning with SUD and SBC techniques. The experimental results from three real-world data sets show that our method outperforms competing methods, particularly at the early stages of active learning.
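
The idea of combining uncertainty with a k-Nearest-Neighbor density measure can be sketched as below. This is one plausible reading of the abstract, not the authors' implementation: the choice of entropy as the uncertainty measure, cosine similarity for the neighbors, the product as the combination rule, and the toy vectors and posteriors are all assumptions.

```python
import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn_density(i: int, vectors: List[Sequence[float]], k: int) -> float:
    """Average similarity of example i to its k most similar neighbors."""
    sims = sorted(
        (cosine(vectors[i], v) for j, v in enumerate(vectors) if j != i),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def sud_scores(vectors, posteriors, k: int = 2) -> List[float]:
    """Assumed combination: uncertainty multiplied by local density."""
    return [entropy(p) * knn_density(i, vectors, k) for i, p in enumerate(posteriors)]


# Three clustered unlabeled examples and one outlier (index 3), all hypothetical.
vectors = [(1.0, 0.1), (1.0, 0.2), (0.9, 0.15), (-0.1, 1.0)]
posteriors = [(0.6, 0.4), (0.9, 0.1), (0.9, 0.1), (0.5, 0.5)]

uncertainty = [entropy(p) for p in posteriors]
scores = sud_scores(vectors, posteriors)
print(uncertainty.index(max(uncertainty)), scores.index(max(scores)))
```

Plain uncertainty sampling would query the outlier here, while the density-weighted score prefers an uncertain example that sits inside the cluster.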

Journal ArticleDOI
TL;DR: The first part of this article reviews two Bayesian accounts of backward blocking, a phenomenon that is challenging for many traditional theories; the second part focuses on two formalizations of optimal active learning: maximizing either the expected information gain or the probability gain.
Abstract: Traditional associationist models represent an organism's knowledge state by a single strength of association on each associative link. Bayesian models instead represent knowledge by a distribution of graded degrees of belief over a range of candidate hypotheses. Many traditional associationist models assume that the learner is passive, adjusting strengths of association only in reaction to stimuli delivered by the environment. Bayesian models, on the other hand, can describe how the learner should actively probe the environment to learn optimally. The first part of this article reviews two Bayesian accounts of backward blocking, a phenomenon that is challenging for many traditional theories. The broad Bayesian framework, in which these models reside, is also selectively reviewed. The second part focuses on two formalizations of optimal active learning: maximizing either the expected information gain or the probability gain. New analyses of optimal active learning by a Kalman filter and by a noisy-logic gate show that these two Bayesian models make different predictions for some environments. The Kalman filter predictions are disconfirmed in at least one case.
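
The two criteria named in the abstract, expected information gain and probability gain, can be computed for a discrete hypothesis space as in the sketch below. The three hypotheses, their prior, and the two candidate queries are hypothetical and are not the Kalman filter or noisy-logic gate analyses from the article; the definitions used are the standard ones (expected reduction in posterior entropy, and expected increase in the probability of the most likely hypothesis).

```python
import math
from typing import List, Sequence, Tuple


def entropy_bits(dist: Sequence[float]) -> float:
    return -sum(p * math.log2(p) for p in dist if p > 0.0)


def expected_gains(
    prior: Sequence[float], likelihood: List[Sequence[float]]
) -> Tuple[float, float]:
    """Return (expected information gain, probability gain) for one query.

    likelihood[o][h] is P(outcome o | hypothesis h) under that query.
    """
    info_gain = entropy_bits(prior)
    prob_gain = -max(prior)
    for lik_o in likelihood:
        joint = [p * l for p, l in zip(prior, lik_o)]
        p_o = sum(joint)
        if p_o == 0.0:
            continue  # this outcome cannot occur
        post = [j / p_o for j in joint]
        info_gain -= p_o * entropy_bits(post)
        prob_gain += p_o * max(post)
    return info_gain, prob_gain


prior = [0.5, 0.25, 0.25]
# Query A separates hypothesis 0 from the other two.
query_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
# Query B separates hypothesis 1 from the other two.
query_b = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]

ig_a, pg_a = expected_gains(prior, query_a)
ig_b, pg_b = expected_gains(prior, query_b)
print(ig_a, pg_a, ig_b, pg_b)
```

In this toy case information gain prefers query A, whereas probability gain is indifferent between the two queries, so the two criteria need not rank queries identically.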

Book ChapterDOI
20 Oct 2008
TL;DR: This work presents a discriminative learning process which employs active, online learning to quickly classify many images with minimal user input, and demonstrates precision which is often superior to the state-of-the-art, with scalability which exceeds previous work.
Abstract: As computer vision research considers more object categories and greater variation within object categories, it is clear that larger and more exhaustive datasets are necessary. However, the process of collecting such datasets is laborious and monotonous. We consider the setting in which many images have been automatically collected for a visual category (typically by automatic internet search), and we must separate relevant images from noise. We present a discriminative learning process which employs active, online learning to quickly classify many images with minimal user input. The principal advantage of this work over previous endeavors is its scalability. We demonstrate precision which is often superior to the state-of-the-art, with scalability which exceeds previous work.

BookDOI
01 Jan 2008
TL;DR: A truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance are considered and it is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool.
Abstract: We consider distance-based similarity measures for real-valued vectors of interest in kernel-based machine learning algorithms. In particular, we study a truncated Euclidean similarity measure and a self-normalized similarity measure related to the Canberra distance. It is proved that they are positive semi-definite (p.s.d.), thus facilitating their use in kernel-based methods, like the Support Vector Machine, a very popular machine learning tool. These kernels may be better suited than standard kernels (like the RBF) in certain situations, which are described in the paper. Some rather general results concerning positivity properties are presented in detail, as well as some interesting ways of proving the p.s.d. property.
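
The p.s.d. property can be spot-checked numerically by building a Gram matrix and inspecting its smallest eigenvalue. The sketch below does this for one Canberra-type similarity on strictly positive vectors, s(x, y) = mean_i (1 - |x_i - y_i| / (x_i + y_i)). This formula is an assumption chosen for illustration and is not necessarily the measure defined in the paper; the function names are hypothetical. A numerical check on sample data is only a sanity test, not a proof.

```python
import numpy as np

def canberra_similarity_gram(X):
    """Gram matrix of an assumed Canberra-type similarity.

    Assumption (not taken from the paper): for strictly positive vectors,
    s(x, y) = mean_i (1 - |x_i - y_i| / (x_i + y_i)).
    """
    diff = np.abs(X[:, None, :] - X[None, :, :])
    total = X[:, None, :] + X[None, :, :]
    return (1.0 - diff / total).mean(axis=2)

def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix; >= 0 (up to rounding) if p.s.d."""
    return float(np.linalg.eigvalsh(K).min())

# Sample strictly positive data so the denominators never vanish.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 5.0, size=(30, 4))
K = canberra_similarity_gram(X)
```

If `min_eigenvalue(K)` is non-negative up to floating-point tolerance, the Gram matrix is consistent with a p.s.d. kernel on this sample, and it could be passed to a kernel method that accepts precomputed Gram matrices.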

Proceedings ArticleDOI
23 Jun 2008
TL;DR: This paper proposes a two-dimensional active learning scheme that not only considers the sample dimension but also the label dimension, and it is shown that the traditional active learning formulation is a special case of 2DAL when there is only one label.
Abstract: In this paper, we propose a two-dimensional active learning scheme and show its application in image classification. Traditional active learning methods select samples only along the sample dimension. While this is the right strategy in binary classification, it is sub-optimal for multi-label classification. In multi-label classification, we argue that, for each selected sample, only the more effective labels need to be annotated, while the others can be inferred by exploiting the correlations among the labels. The reason is that different labels contribute differently to minimizing the classification error because of the inherent label correlations. To this end, we propose to select sample-label pairs, rather than only samples, to minimize a multi-label Bayesian classification error bound. This new active learning strategy considers not only the sample dimension but also the label dimension, and we call it Two-Dimensional Active Learning (2DAL). We also show that the traditional active learning formulation is a special case of 2DAL when there is only one label. Extensive experiments conducted on two real-world applications show that 2DAL significantly outperforms the best existing approaches that do not take label correlation into account.

Journal ArticleDOI
TL;DR: This work presents a stopping criterion for active learning based on the way instances are selected during uncertainty-based sampling and verifies its applicability in a variety of settings.

Journal ArticleDOI
TL;DR: An extended approach of ant colony optimization is proposed, which is based on a recent metaheuristic method for discovering group patterns that is designed to help learners advance their on-line learning along an adaptive learning path.
Abstract: Adaptive learning provides an alternative to the traditional "one size fits all" approach and has driven the development of teaching and learning towards a dynamic learning process. Therefore, exploring adaptive paths that suit learners' personalized needs is an interesting issue. This paper proposes an extended approach of ant colony optimization, which is based on a recent metaheuristic method for discovering group patterns that is designed to help learners advance their on-line learning along an adaptive learning path. The investigation emphasizes the relationship of learning content to the learning style of each participant in adaptive learning. An adaptive learning rule was developed to identify how learners of different learning styles may associate those contents which have the higher probability of being useful to form an optimal learning path. A style-based ant colony system is implemented and its algorithm parameters are optimized to conform to the actual pedagogical process. A survey was also conducted to evaluate the validity and efficiency of the system in producing adaptive paths for different learners. The results reveal that both the learners and the lecturers agree that the style-based ant colony system is able to provide useful supplementary learning paths.