
Showing papers on "Active learning (machine learning)" published in 2004


Book
01 Oct 2004
TL;DR: Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts, and discussing many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining.
Abstract: The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data. Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts. In order to present a unified treatment of machine learning problems and solutions, it discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining. All learning algorithms are explained so that the student can easily move from the equations in the book to a computer program. The text covers such topics as supervised learning, Bayesian decision theory, parametric methods, multivariate methods, multilayer perceptrons, local models, hidden Markov models, assessing and comparing classification algorithms, and reinforcement learning. New to the second edition are chapters on kernel machines, graphical models, and Bayesian estimation; expanded coverage of statistical tests in a chapter on design and analysis of machine learning experiments; case studies available on the Web (with downloadable results for instructors); and many additional exercises. All chapters have been revised and updated. Introduction to Machine Learning can be used by advanced undergraduates and graduate students who have completed courses in computer programming, probability, calculus, and linear algebra. It will also be of interest to engineers in the field who are concerned with the application of machine learning methods. (Adaptive Computation and Machine Learning series.)

3,950 citations


Proceedings ArticleDOI
25 Jul 2004
TL;DR: A new learning algorithm for single-hidden-layer feedforward neural networks (SLFNs), called the extreme learning machine (ELM), is proposed; it randomly chooses the input weights and analytically determines the output weights.
Abstract: It is clear that the learning speed of feedforward neural networks is in general far slower than required, and this has been a major bottleneck in their applications for past decades. Two key reasons behind this may be: 1) slow gradient-based learning algorithms are extensively used to train neural networks, and 2) all the parameters of the networks are tuned iteratively by using such learning algorithms. Unlike these traditional implementations, this paper proposes a new learning algorithm called extreme learning machine (ELM) for single-hidden layer feedforward neural networks (SLFNs) which randomly chooses the input weights and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed. The experimental results based on real-world benchmarking function approximation and classification problems, including large complex applications, show that the new algorithm can produce the best generalization performance in some cases and can learn much faster than traditional popular learning algorithms for feedforward neural networks.
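A minimal sketch of the ELM training step described above, assuming a tanh hidden layer (illustrative code, not the authors' implementation; function names are ours):

```python
import numpy as np

def elm_train(X, y, n_hidden=50, rng=None):
    # Input weights and biases are chosen randomly and never tuned.
    rng = np.random.default_rng(rng)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)              # hidden-layer output matrix
    # The only "learning": output weights via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```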

3,643 citations


Proceedings ArticleDOI
04 Jul 2004
TL;DR: This paper proposes to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs, and demonstrates the versatility and effectiveness of the method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
Abstract: Learning general functional dependencies is one of the main goals in machine learning. Recent progress in kernel-based methods has focused on designing flexible and powerful input representations. This paper addresses the complementary issue of problems involving complex outputs such as multiple dependent output variables and structured output spaces. We propose to generalize multiclass Support Vector Machine learning in a formulation that involves features extracted jointly from inputs and outputs. The resulting optimization problem is solved efficiently by a cutting plane algorithm that exploits the sparseness and structural decomposition of the problem. We demonstrate the versatility and effectiveness of our method on problems ranging from supervised grammar learning and named-entity recognition, to taxonomic text classification and sequence alignment.
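The solver's key ingredient is a separation oracle that finds the most violated output for each example. The sketch below shows that oracle for the multiclass special case, with the paper's cutting-plane QP replaced by a plain subgradient step on the structured hinge loss (a simplification under an assumed one-hot joint feature map, not the paper's solver):

```python
import numpy as np

def psi(x, y, n_classes):
    # Joint input-output feature map: x placed in the block for class y.
    f = np.zeros(n_classes * x.size)
    f[y * x.size:(y + 1) * x.size] = x
    return f

def most_violated(w, x, y_true, n_classes):
    # Separation oracle: the output maximizing loss plus score.
    scores = [int(y != y_true) + w @ psi(x, y, n_classes)
              for y in range(n_classes)]
    return int(np.argmax(scores))

def train(X, Y, n_classes, epochs=50, lr=0.1, C=1.0):
    w = np.zeros(n_classes * X.shape[1])
    for _ in range(epochs):
        for x, y in zip(X, Y):
            y_hat = most_violated(w, x, y, n_classes)
            g = w / C                    # regularizer subgradient
            if y_hat != y:               # margin violated
                g = g + psi(x, y_hat, n_classes) - psi(x, y, n_classes)
            w = w - lr * g
    return w
```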

1,446 citations


Journal ArticleDOI
TL;DR: This paper gives an introduction to Gaussian processes on a fairly elementary level, with special emphasis on characteristics relevant in machine learning, and draws precise connections to other "kernel machines" popular in the community.
Abstract: Gaussian processes (GPs) are natural generalisations of multivariate Gaussian random variables to infinite (countable or continuous) index sets. GPs have been applied in a large number of fields to a diverse range of ends, and very many deep theoretical analyses of various properties are available. This paper gives an introduction to Gaussian processes on a fairly elementary level with special emphasis on characteristics relevant in machine learning. It draws explicit connections to branches such as spline smoothing models and support vector machines in which similar ideas have been investigated. Gaussian process models are routinely used to solve hard machine learning problems. They are attractive because of their flexible non-parametric nature and computational simplicity. Treated within a Bayesian framework, very powerful statistical methods can be implemented which offer valid estimates of uncertainties in our predictions and generic model selection procedures cast as nonlinear optimization problems. Their main drawback of heavy computational scaling has recently been alleviated by the introduction of generic sparse approximations. The mathematical literature on GPs is large and often uses deep concepts which are not required to fully understand most machine learning applications. In this tutorial paper, we aim to present characteristics of GPs relevant to machine learning and to draw precise connections to other "kernel machines" popular in the community. Our focus is on a simple presentation, but references to more detailed sources are provided.
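A minimal sketch of the standard GP regression equations such a tutorial covers (RBF kernel assumed; illustrative code):

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(X, y, X_star, noise=1e-2):
    # Posterior mean and variance at test points X_star, via Cholesky.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    K_s = rbf(X, X_star)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf(X_star, X_star).diagonal() - (v ** 2).sum(0)
    return mean, var
```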

752 citations


Book ChapterDOI
TL;DR: This tutorial introduces the techniques that are used to obtain results in the form of so-called error bounds in statistical learning theory.
Abstract: The goal of statistical learning theory is to study, in a statistical framework, the properties of learning algorithms. In particular, most results take the form of so-called error bounds. This tutorial introduces the techniques that are used to obtain such results.

602 citations


Proceedings ArticleDOI
04 Jul 2004
TL;DR: A formal framework that incorporates clustering into two-class active learning, making it possible to select the most representative samples and to avoid repeatedly labeling samples in the same cluster.
Abstract: The paper is concerned with two-class active learning. While the common approach for collecting data in active learning is to select samples close to the classification boundary, better performance can be achieved by taking the prior data distribution into account. The main contribution of the paper is a formal framework that incorporates clustering into active learning. The algorithm first constructs a classifier on the set of cluster representatives, and then propagates the classification decision to the other samples via a local noise model. The proposed model makes it possible to select the most representative samples and to avoid repeatedly labeling samples in the same cluster. During the active learning process, the clustering is adjusted using a coarse-to-fine strategy in order to balance the advantage of large clusters against the accuracy of the data representation. The results of experiments on image databases show that our algorithm performs better than current methods.
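A simplified sketch of the representative-labeling idea (k-means stands in for the paper's clustering; the local noise model and coarse-to-fine refinement are omitted; names are ours):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import pairwise_distances_argmin_min

def cluster_active_learning(X, oracle, n_clusters=10):
    # Query the oracle only for one representative per cluster, then
    # propagate each representative's label to its cluster members.
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    reps, _ = pairwise_distances_argmin_min(km.cluster_centers_, X)
    rep_labels = np.array([oracle(X[i]) for i in reps])  # human labels
    y_prop = rep_labels[km.labels_]                      # propagated labels
    return LogisticRegression().fit(X, y_prop)
```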

548 citations


Book
01 Jan 2004
TL;DR: This book provides a valuable primer that delineates what the authors know, what they would like to know, and the limits of what they can know, when they try to learn about a system that is composed of other learners.
Abstract: Contents: 1. The Interactive Learning Problem; 2. Reinforcement and Regret; 3. Equilibrium; 4. Conditional No-Regret Learning; 5. Prediction, Postdiction, and Calibration; 6. Fictitious Play and Its Variants; 7. Bayesian Learning; 8. Hypothesis Testing; 9. Conclusion.

504 citations


Proceedings ArticleDOI
10 Oct 2004
TL;DR: MRBIR first makes use of a manifold ranking algorithm to explore the relationship among all the data points in the feature space, and then measures relevance between the query and all the images in the database accordingly, which is different from traditional similarity metrics based on pair-wise distance.
Abstract: In this paper, we propose a novel transductive learning framework named manifold-ranking based image retrieval (MRBIR). Given a query image, MRBIR first makes use of a manifold ranking algorithm to explore the relationship among all the data points in the feature space, and then measures relevance between the query and all the images in the database accordingly, which is different from traditional similarity metrics based on pair-wise distance. In relevance feedback, if only positive examples are available, they are added to the query set to improve the retrieval result; if examples of both labels can be obtained, MRBIR discriminately spreads the ranking scores of positive and negative examples, considering the asymmetry between these two types of images. Furthermore, three active learning methods are incorporated into MRBIR, which select images in each round of relevance feedback according to different principles, aiming to maximally improve the ranking result. Experimental results on a general-purpose image database show that MRBIR attains a significant improvement over existing systems from all aspects.
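A sketch of the manifold-ranking core that MRBIR builds on (the closed-form score propagation of Zhou et al.; the relevance-feedback and active learning layers are omitted):

```python
import numpy as np

def manifold_ranking(X, query_idx, sigma=1.0, alpha=0.99):
    # Build a normalized affinity graph, then solve (I - alpha*S) f = y;
    # f scores every image by its manifold proximity to the query.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(1)))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    y = np.zeros(len(X))
    y[query_idx] = 1.0
    f = np.linalg.solve(np.eye(len(X)) - alpha * S, y)
    return np.argsort(-f)  # database images ranked by relevance
```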

382 citations


Proceedings ArticleDOI
04 Jul 2004
TL;DR: The multi-task IVM (MT-IVM) saves computation by greedily selecting the most informative examples from the separate tasks, and is shown to be more efficient than random sub-sampling on an artificial data set and more effective than the traditional IVM in a speaker-dependent phoneme recognition task.
Abstract: This paper describes an efficient method for learning the parameters of a Gaussian process (GP). The parameters are learned from multiple tasks which are assumed to have been drawn independently from the same GP prior. An efficient algorithm is obtained by extending the informative vector machine (IVM) algorithm to handle the multi-task learning case. The multi-task IVM (MT-IVM) saves computation by greedily selecting the most informative examples from the separate tasks. The MT-IVM is also shown to be more efficient than random sub-sampling on an artificial data set and more effective than the traditional IVM in a speaker-dependent phoneme recognition task.

359 citations



Proceedings Article
01 Dec 2004
TL;DR: The core search problem of active learning schemes is abstracted out, and it is proved that a popular greedy active learning rule is approximately as good as any other strategy for minimizing the number of queried labels.
Abstract: We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.
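A toy instance of the greedy rule the paper analyzes, for the hypothesis class of 1-D thresholds: always query the point whose answer splits the surviving version space most evenly (illustrative code; assumes the oracle is consistent with some threshold):

```python
import numpy as np

def greedy_active_learning(points, thresholds, oracle):
    # Version space = thresholds still consistent with the labels seen.
    version = list(thresholds)
    while len(version) > 1:
        # For each candidate query, how unevenly would its answer split
        # the version space?  (h_t(x) = 1 iff x >= t.)
        splits = [abs(2 * sum(x >= t for t in version) - len(version))
                  for x in points]
        best = int(np.argmin(splits))
        if splits[best] == len(version):   # no query distinguishes them
            break
        label = oracle(points[best])       # ask for one label
        version = [t for t in version if int(points[best] >= t) == label]
    return version
```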

Journal ArticleDOI
TL;DR: It is shown that a probabilistic active learning method can be used to actively query the user, thereby solving the "new user problem" of memory-based collaborative filtering.
Abstract: Memory-based collaborative filtering (CF) has been studied extensively in the literature and has proven to be successful in various types of personalized recommender systems. In this paper, we develop a probabilistic framework for memory-based CF (PMCF). While this framework has clear links with classical memory-based CF, it allows us to find principled solutions to known problems of CF-based recommender systems. In particular, we show that a probabilistic active learning method can be used to actively query the user, thereby solving the "new user problem." Furthermore, the probabilistic framework allows us to reduce the computational cost of memory-based CF by working on a carefully selected subset of user profiles, while retaining high accuracy. We report experimental results based on two real-world data sets, which demonstrate that our proposed PMCF framework allows an accurate and efficient prediction of user preferences.

Proceedings Article
01 Dec 2004
TL;DR: This paper studied various Human Interactive Proofs (HIPs) on the market, and found that most HIPs are pure recognition tasks which can easily be broken using machine learning.
Abstract: Machine learning is often used to automatically solve human tasks. In this paper, we look for tasks where machine learning algorithms are not as good as humans with the hope of gaining insight into their current limitations. We studied various Human Interactive Proofs (HIPs) on the market, because they are systems designed to tell computers and humans apart by posing challenges presumably too hard for computers. We found that most HIPs are pure recognition tasks which can easily be broken using machine learning. The harder HIPs use a combination of segmentation and recognition tasks. From this observation, we found that building segmentation tasks is the most effective way to confuse machine learning algorithms. This has enabled us to build effective HIPs (which we deployed in MSN Passport), as well as design challenging segmentation tasks for machine learning algorithms.

Journal ArticleDOI
TL;DR: The problem of scarcity of labeled pixels, required for segmentation of remotely sensed satellite images in a supervised pixel classification framework, is addressed in this article, and a support vector machine (SVM) is considered for classifying the pixels into different landcover types.

Proceedings ArticleDOI
01 Dec 2004
TL;DR: This paper proves that the algorithm guarantees at most zero average regret while demonstrating that it converges in many situations of self-play, and suggests a third learning criterion combining convergence and regret, which is called negative non-convergence regret (NNR).
Abstract: Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to deception, where the other agents may be able to exploit a learner's particular dynamics. In the worst case, this could result in poorer performance than if the agent was not learning at all. These challenges are identifiable in the two most common evaluation criteria for multiagent learning algorithms: convergence and regret. Algorithms focusing on convergence or regret in isolation are numerous. In this paper, we seek to address both criteria in a single algorithm by introducing GIGA-WoLF, a learning algorithm for normal-form games. We prove the algorithm guarantees at most zero average regret, while demonstrating the algorithm converges in many situations of self-play. We prove convergence in a limited setting and give empirical results in a wider variety of situations. These results also suggest a third new learning criterion combining convergence and regret, which we call negative non-convergence regret (NNR).
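A sketch of the GIGA-WoLF strategy update as we read it from the paper (a fast gradient step, a slower baseline strategy, and a pull toward the baseline scaled by how much the baseline itself moved); the 1/3 step-size factor follows that description and should be checked against the original:

```python
import numpy as np

def proj_simplex(v):
    # Euclidean projection onto the probability simplex (sort-based).
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u > css / np.arange(1, len(v) + 1))[0][-1]
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def giga_wolf_step(x, z, reward_grad, eta):
    # x: current mixed strategy, z: slower baseline strategy,
    # reward_grad: gradient of expected reward w.r.t. own strategy.
    x_hat = proj_simplex(x + eta * reward_grad)
    z_new = proj_simplex(z + eta * reward_grad / 3.0)
    denom = np.linalg.norm(z_new - x_hat)
    delta = 1.0 if denom == 0.0 else min(1.0, np.linalg.norm(z_new - z) / denom)
    return x_hat + delta * (z_new - x_hat), z_new
```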

Journal ArticleDOI
TL;DR: It was found that the Naïve Bayes algorithm is the most appropriate for the construction of a software support tool: it has more than satisfactory accuracy, its overall sensitivity is extremely satisfactory, and it is the easiest algorithm to implement.
Abstract: The ability to predict a student's performance could be useful in a great number of different ways associated with university-level distance learning. Students' key demographic characteristics and their marks on a few written assignments can constitute the training set for a supervised machine learning algorithm. The learning algorithm could then be able to predict the performance of new students, thus becoming a useful tool for identifying predicted poor performers. The scope of this work is to compare some of the state-of-the-art learning algorithms. Two experiments have been conducted with six algorithms, which were trained using data sets provided by the Hellenic Open University. Among other significant conclusions, it was found that the Naive Bayes algorithm is the most appropriate for the construction of a software support tool: it has more than satisfactory accuracy, its overall sensitivity is extremely satisfactory, and it is the easiest algorithm to implement.
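A minimal illustration of the setup described above, with hypothetical stand-in features (the real study used demographics and assignment marks from the Hellenic Open University):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                  # stand-in: demographics + early marks
y = (X[:, -2:].mean(axis=1) > 0).astype(int)   # stand-in: pass / fail outcome

print(cross_val_score(GaussianNB(), X, y, cv=5).mean())
```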

Proceedings ArticleDOI
21 Jul 2004
TL;DR: A multi-criteria-based active learning approach is proposed and effectively applied to named entity recognition and includes all the criteria using two selection strategies, both of which result in less labeling cost than single-criterion-based method.
Abstract: In this paper, we propose a multi-criteria-based active learning approach and effectively apply it to named entity recognition. Active learning targets to minimize the human annotation efforts by selecting examples for labeling. To maximize the contribution of the selected examples, we consider the multiple criteria: informativeness, representativeness and diversity and propose measures to quantify them. More comprehensively, we incorporate all the criteria using two selection strategies, both of which result in less labeling cost than single-criterion-based method. The results of the named entity recognition in both MUC-6 and GENIA show that the labeling cost can be reduced by at least 80% without degrading the performance.
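A sketch of how the three criteria can be combined in a batch selection rule (the weights and diversity threshold here are illustrative, not the paper's; the paper defines its own measures for a sequence-labeling model):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def select_batch(probs, X, k, w_info=0.6, w_rep=0.4, div_thresh=0.9):
    # probs: model's class probabilities per example; X: feature vectors.
    info = 1.0 - probs.max(axis=1)      # informativeness: uncertainty
    sim = cosine_similarity(X)
    rep = sim.mean(axis=1)              # representativeness: avg similarity
    score = w_info * info + w_rep * rep
    chosen = []
    for i in np.argsort(-score):        # diversity: greedy de-duplication
        if all(sim[i, j] < div_thresh for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return chosen
```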

Proceedings ArticleDOI
15 Nov 2004
TL;DR: This work presents democratic co-learning, in which multiple algorithms instead of multiple views enable learners to label data for each other, and democratic priority sampling, a new example selection method for active learning.
Abstract: For many machine learning applications it is important to develop algorithms that use both labeled and unlabeled data. We present democratic co-learning, in which multiple algorithms instead of multiple views enable learners to label data for each other. Our technique leverages the fact that different learning algorithms have different inductive biases and that better predictions can be made by the voted majority. We also present democratic priority sampling, a new example selection method for active learning.
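A simplified sketch of the co-labeling loop (binary labels assumed; the published algorithm additionally uses confidence intervals to decide when a majority label is trustworthy, which is omitted here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

def democratic_colearn(X_lab, y_lab, X_unlab, rounds=5):
    # Learners with different inductive biases label data for each other.
    learners = [LogisticRegression(max_iter=1000), GaussianNB(),
                DecisionTreeClassifier(max_depth=5)]
    train = [(X_lab, y_lab) for _ in learners]
    for _ in range(rounds):
        for m, (Xi, yi) in zip(learners, train):
            m.fit(Xi, yi)
        votes = np.stack([m.predict(X_unlab) for m in learners])
        majority = (votes.mean(axis=0) > 0.5).astype(int)
        for k in range(len(learners)):
            wrong = votes[k] != majority   # majority out-votes learner k
            train[k] = (np.vstack([X_lab, X_unlab[wrong]]),
                        np.concatenate([y_lab, majority[wrong]]))
    return learners
```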

01 Jan 2004
TL;DR: This thesis is a comprehensive study of rating-based, pure, non-sequential collaborative filtering; it implements a total of nine prediction methods and conducts large-scale prediction accuracy experiments.
Abstract: Collaborative Filtering: A Machine Learning Perspective. Benjamin Marlin, Master of Science, Graduate Department of Computer Science, University of Toronto, 2004. Collaborative filtering was initially proposed as a framework for filtering information based on the preferences of users, and has since been refined in many different ways. This thesis is a comprehensive study of rating-based, pure, non-sequential collaborative filtering. We analyze existing methods for the task of rating prediction from a machine learning perspective. We show that many existing methods proposed for this task are simple applications or modifications of one or more standard machine learning methods for classification, regression, clustering, dimensionality reduction, and density estimation. We introduce new prediction methods in all of these classes. We introduce a new experimental procedure for testing stronger forms of generalization than has been used previously. We implement a total of nine prediction methods, and conduct large scale prediction accuracy experiments. We show interesting new results on the relative performance of these methods.

Proceedings ArticleDOI
28 Sep 2004
TL;DR: This paper presents a new algorithm for walk optimization based on an evolutionary approach, which makes it more robust to noise in parameter evaluations and avoids prematurely converging to local optima, a problem encountered by both of the previously suggested algorithms.
Abstract: Developing fast gaits for legged robots is a difficult task that requires optimizing parameters in a highly irregular, multidimensional space. In the past, walk optimization for quadruped robots, namely the Sony AIBO robot, was done by hand-tuning the parameterized gaits. In addition to requiring a lot of time and human expertise, this process produced sub-optimal results. Several recent projects have focused on using machine learning to automate the parameter search. Algorithms utilizing Powell's minimization method and policy gradient reinforcement learning have shown significant improvement over previous walk optimization results. In this paper we present a new algorithm for walk optimization based on an evolutionary approach. Unlike previous methods, our algorithm does not attempt to approximate the gradient of the multidimensional space. This makes it more robust to noise in parameter evaluations and avoids prematurely converging to local optima, a problem encountered by both of the previously suggested algorithms. Our evolutionary algorithm matches the best previous learning method, achieving several different walks of high quality. Furthermore, the best learned walks represent an impressive 20% improvement over our own best hand-tuned walks.
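A generic mutation-based evolutionary search of the kind described above (illustrative; the paper's operators and population settings differ). Because candidates are only ranked, noisy fitness evaluations, such as timing a physical robot's walk, do not corrupt a gradient estimate:

```python
import numpy as np

def evolve(fitness, dim, pop_size=20, n_parents=5, sigma=0.1, gens=50, rng=None):
    rng = np.random.default_rng(rng)
    pop = rng.standard_normal((pop_size, dim))        # gait parameter vectors
    for _ in range(gens):
        scores = np.array([fitness(p) for p in pop])  # e.g. measured walk speed
        parents = pop[np.argsort(-scores)[:n_parents]]
        # Children are mutated copies of the fittest parents.
        pop = (parents[rng.integers(n_parents, size=pop_size)]
               + sigma * rng.standard_normal((pop_size, dim)))
    return pop[np.argmax([fitness(p) for p in pop])]
```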

Book ChapterDOI
20 Sep 2004
TL;DR: This paper compares the performance of many learning models on a substantial benchmark of binary text classification tasks having small training sets and varies the training size and class distribution to examine the learning surface, as opposed to the traditional learning curve.
Abstract: Many real-world machine learning tasks are faced with the problem of small training sets. Additionally, the class distribution of the training set often does not match the target distribution. In this paper we compare the performance of many learning models on a substantial benchmark of binary text classification tasks having small training sets. We vary the training size and class distribution to examine the learning surface, as opposed to the traditional learning curve. The models tested include various feature selection methods each coupled with four learning algorithms: Support Vector Machines (SVM), Logistic Regression, Naive Bayes, and Multinomial Naive Bayes. Different models excel in different regions of the learning surface, leading to meta-knowledge about which to apply in different situations. This helps guide the researcher and practitioner when facing choices of model and feature selection methods in, for example, information retrieval settings and others.

Journal ArticleDOI
TL;DR: Machine learning methods are shown to make it possible to automatically build models for retrieving high-quality, content-specific articles in internal medicine, using inclusion or citation by the ACP Journal Club in a given time period as a gold standard; these models perform better than the 1994 PubMed clinical query filters.

Book
03 Sep 2004
TL;DR: Three leading researchers bridge the gap between research, design, and deployment, introducing key algorithms as well as practical implementation techniques to construct robust information processing systems for biometric authentication in both face and voice recognition systems.
Abstract: A breakthrough approach to improving biometrics performance: constructing robust information processing systems for face and voice recognition; supporting high-performance data fusion in multimodal systems; algorithms, implementation techniques, and application examples. As they improve, biometric authentication systems are becoming increasingly indispensable for protecting life and property. This book introduces powerful machine learning techniques that significantly improve biometric performance in a broad spectrum of application domains. Three leading researchers bridge the gap between research, design, and deployment, introducing key algorithms as well as practical implementation techniques. They demonstrate how to construct robust information processing systems for biometric authentication in both face and voice recognition systems, and to support data fusion in multimodal systems. Coverage includes: how machine learning approaches differ from conventional template matching; theoretical pillars of machine learning for complex pattern recognition and classification; expectation-maximization (EM) algorithms and support vector machines (SVM); multi-layer learning models and back-propagation (BP) algorithms; probabilistic decision-based neural networks (PDNNs) for face biometrics; flexible structural frameworks for incorporating machine learning subsystems in biometric applications; hierarchical mixture of experts and inter-class learning strategies based on class-based modular networks; multi-cue data fusion techniques that integrate face and voice recognition; and application case studies.

Proceedings ArticleDOI
10 Oct 2004
TL;DR: A coherent language model for automatic image annotation is proposed that takes into account word-to-word correlation; it can automatically determine the annotation length and can be combined with active learning to significantly reduce the required number of annotated image examples.
Abstract: Image annotations allow users to access a large image database with textual queries. There have been several studies on automatic image annotation utilizing machine learning techniques, which automatically learn statistical models from annotated images and apply them to generate annotations for unseen images. One common problem shared by most previous learning approaches for automatic image annotation is that each annotated word is predicted for an image independently from other annotated words. In this paper, we propose a coherent language model for automatic image annotation that takes into account the word-to-word correlation by estimating a coherent language model for an image. This new approach has two important advantages: 1) it is able to automatically determine the annotation length to improve the accuracy of retrieval results, and 2) it can be used with active learning to significantly reduce the required number of annotated image examples. Empirical studies with the Corel dataset are presented to show the effectiveness of the coherent language model for automatic image annotation.

Journal ArticleDOI
TL;DR: The model explains the idea of a "copy-exactly" ramp-up, which freezes the process for some time period, thereby exhibiting a nonmonotone trajectory, which is shown to be optimal if the initial knowledge level is low, the lifecycle short and demand growth is steep, and learning is difficult.
Abstract: Production ramp-up is the period of time during which a manufacturing process is scaled up from a small laboratory-like environment to high-volume production. During this scale-up, the firm needs to overcome the numerous discrepancies between how the process is specified to operate as written in the process recipe and how it actually is operated at large volume. The reduction of these discrepancies, a process that we will refer to as learning, will lead to improved production yields and higher output. In addition to its learning effort, however, the firm also attempts to change the process recipe itself, which can be in direct conflict with the learning objective. We formalize this intertemporal tradeoff between learning and process change in the form of a dynamic optimization problem. Our model explains the idea of a "copy-exactly" ramp-up, which freezes the process for some time period, i.e., does not allow for any change in the process. Mathematically, this corresponds to a process improvement policy which delays process changes, thereby exhibiting a nonmonotone trajectory, which we show to be optimal if the initial knowledge level is low, the lifecycle is short, demand growth is steep, and learning is difficult.

Proceedings ArticleDOI
07 Jul 2004
TL;DR: In this paper, the authors take into account the posterior distribution of the estimated model, which results in a more robust active learning algorithm, and they show that when the number of ratings from the active user is restricted to be small, active learning methods based only on the estimated model do not perform well, while the active learning method using the model distribution achieves substantially better performance.
Abstract: Collaborative filtering is a useful technique for exploiting the preference patterns of a group of users to predict the utility of items for the active user. In general, the performance of collaborative filtering depends on the number of rated examples given by the active user: the more rated examples the active user gives, the more accurate the predicted ratings will be. Active learning provides an effective way to acquire the most informative rated examples from active users. Previous work on active learning for collaborative filtering only considers the expected loss function based on the estimated model, which can be misleading when the estimated model is inaccurate. This paper takes this one step further by taking into account the posterior distribution of the estimated model, which results in a more robust active learning algorithm. Empirical studies with datasets of movie ratings show that when the number of ratings from the active user is restricted to be small, active learning methods based only on the estimated model do not perform well, while the active learning method using the model distribution achieves substantially better performance.
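The key departure can be written generically: average the expected loss of a candidate query over posterior samples of the model instead of trusting a single point estimate (a schematic sketch; `models`, `weights`, and `loss` are placeholders for the paper's model posterior and loss function):

```python
import numpy as np

def select_item(models, weights, candidate_items, loss):
    # Expected loss of querying each item, averaged over the model posterior.
    exp_loss = [sum(w * loss(m, item) for m, w in zip(models, weights))
                for item in candidate_items]
    return candidate_items[int(np.argmin(exp_loss))]
```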

Proceedings Article
01 Jul 2004
TL;DR: This work explores one such strategy, using a model during annotation to automate some of the decisions, showing an 80% reduction in annotation cost compared with labeling randomly selected data with a single model.
Abstract: Active learning (AL) promises to reduce the cost of annotating labeled datasets for trainable human language technologies. Contrary to expectations, when creating labeled training material for HPSG parse selection and later reusing it with other models, gains from AL may be negligible or even negative. This has serious implications for using AL, showing that additional cost-saving strategies may need to be adopted. We explore one such strategy: using a model during annotation to automate some of the decisions. Our best results show an 80% reduction in annotation cost compared with labeling randomly selected data with a single model.

Journal ArticleDOI
TL;DR: A probabilistic active learning strategy for support vector machine (SVM) design in large data applications that queries for a set of points according to a distribution as determined by the current separating hyperplane and a newly defined concept of an adaptive confidence factor.
Abstract: The paper describes a probabilistic active learning strategy for support vector machine (SVM) design in large data applications. The learning strategy is motivated by the statistical query model. While most existing methods of active SVM learning query for points based on their proximity to the current separating hyperplane, the proposed method queries for a set of points according to a distribution as determined by the current separating hyperplane and a newly defined concept of an adaptive confidence factor. This enables the algorithm to have more robust and efficient learning capabilities. The confidence factor is estimated from local information using the k-nearest-neighbor principle. The effectiveness of the method is demonstrated on real-life data sets in terms of generalization performance, query complexity, and training time.
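A sketch of the sampling idea: points are drawn for labeling with probability concentrated near the current hyperplane, with a confidence parameter controlling how sharply (here `conf` is a fixed stand-in for the paper's adaptive confidence factor, which is estimated from k-nearest-neighbor information):

```python
import numpy as np

def query_batch(clf, X_pool, k, conf=0.5, rng=None):
    # clf: a fitted margin classifier, e.g. sklearn.svm.SVC.
    rng = np.random.default_rng(rng)
    margin = np.abs(clf.decision_function(X_pool))
    p = np.exp(-conf * margin)       # closer to the hyperplane -> likelier
    p /= p.sum()
    return rng.choice(len(X_pool), size=k, replace=False, p=p)
```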

Proceedings ArticleDOI
24 Oct 2004
TL;DR: A multilabel SVM active learning method to reduce the human effort of labelling images, especially multilabel images, together with two selection strategies: the Max Loss strategy and the Mean Max Loss strategy.
Abstract: Image classification is an important task in computer vision. However, how to assign suitable labels to images is a subjective matter, especially when some images can be categorized into multiple classes simultaneously. Multilabel image classification focuses on the problem that each image can have one or multiple labels. It is known that manually labelling images is time-consuming and expensive. In order to reduce the human effort of labelling images, especially multilabel images, we propose a multilabel SVM active learning method. We also propose two selection strategies: the Max Loss strategy and the Mean Max Loss strategy. Experimental results on both artificial data and real-world images demonstrate the advantage of the proposed method.
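One plausible reading of the two strategies, sketched for one-vs-rest SVM outputs: estimate a per-label loss from the decision values, then aggregate over labels by the maximum (Max Loss) or the mean (Mean Max Loss); the hinge-style loss estimate below is our illustration, not the paper's exact formula:

```python
import numpy as np

def select(scores, k, strategy="max"):
    # scores: decision values of one-vs-rest SVMs, shape (n_samples, n_labels).
    loss = np.maximum(0.0, 1.0 - np.abs(scores))  # uncertain labels -> high loss
    agg = loss.max(axis=1) if strategy == "max" else loss.mean(axis=1)
    return np.argsort(-agg)[:k]                   # samples to send for labeling
```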

01 Jan 2004
TL;DR: This paper uses a machine learning approach to characterize and partition C-space into regions that are well suited to one of the methods in the authors' library of roadmap-based motion planners, and demonstrates that the simple prototype system reliably outperforms any of the planners on their own.
Abstract: Although there are many motion planning techniques, there is no method that outperforms all others for all problem instances. Rather, each technique has different strengths and weaknesses which make it best-suited for certain types of problems. Moreover, since an environment can contain vastly different regions, there may not be a single planner that will perform well in all its regions. Ideally, one would use a suite of planners in concert and would solve the problem by applying the best-suited planner in each region. In this paper, we propose an automated framework for feature-sensitive motion planning. We use a machine learning approach to characterize and partition C-space into regions that are well suited to one of the methods in our library of roadmap-based motion planners. After the best-suited method is applied in each region, the resulting region roadmaps are combined to form a roadmap of the entire planning space. Over a range of problems, we demonstrate that our simple prototype system reliably outperforms any of the planners on their own.
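The region-to-planner mapping can be illustrated as a plain supervised classifier over region features (everything here is hypothetical: the feature values and planner labels are stand-ins, and the paper's actual features and learner may differ):

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: per-region features (e.g. free-space ratio,
# clutter) paired with the planner that performed best in that region.
region_features = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
best_planner = ["PRM", "OBPRM", "GaussianPRM"]   # assumed planner names

model = DecisionTreeClassifier().fit(region_features, best_planner)
print(model.predict([[0.8, 0.2]]))   # pick a planner for a new region
```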