
Showing papers in "Machine Learning in 1998"


Journal ArticleDOI
TL;DR: This case study relates such issues as problem formulation, selection of evaluation measures, and data preparation to properties of the oil spill application, such as its imbalanced class distribution, which are shown to be common to many applications.
Abstract: During a project examining the use of machine learning techniques for oil spill detection, we encountered several essential questions that we believe deserve the attention of the research community. We use our particular case study to illustrate such issues as problem formulation, selection of evaluation measures, and data preparation. We relate these issues to properties of the oil spill application, such as its imbalanced class distribution, that are shown to be common to many applications. Our solutions to these issues are implemented in the Canadian Environmental Hazards Detection System (CEHDS), which is about to undergo field testing.

1,279 citations


Journal ArticleDOI
TL;DR: This work introduces, analyzes and demonstrates a recursive hierarchical generalization of the widely used hidden Markov models, which is motivated by the complex multi-scale structure which appears in many natural sequences, particularly in language, handwriting and speech.
Abstract: We introduce, analyze and demonstrate a recursive hierarchical generalization of the widely used hidden Markov models, which we name Hierarchical Hidden Markov Models (HHMM). Our model is motivated by the complex multi-scale structure which appears in many natural sequences, particularly in language, handwriting and speech. We seek a systematic unsupervised approach to the modeling of such structures. By extending the standard Baum-Welch (forward-backward) algorithm, we derive an efficient procedure for estimating the model parameters from unlabeled data. We then use the trained model for automatic hierarchical parsing of observation sequences. We describe two applications of our model and its parameter estimation procedure. In the first application we show how to construct hierarchical models of natural English text. In these models different levels of the hierarchy correspond to structures on different length scales in the text. In the second application we demonstrate how HHMMs can be used to automatically identify repeated strokes that represent combinations of letters in cursive handwriting.

1,050 citations


Journal ArticleDOI
TL;DR: This paper addresses the problem of building large-scale geometric maps of indoor environments with mobile robots as a constrained, probabilistic maximum-likelihood estimation problem, and devises a practical algorithm for generating the most likely map from data, along with the best path taken by the robot.
Abstract: This paper addresses the problem of building large-scale geometric maps of indoor environments with mobile robots. It poses the map-building problem as a constrained, probabilistic maximum-likelihood estimation problem. It then devises a practical algorithm for generating the most likely map from data, along with the most likely path taken by the robot. Experimental results in cyclic environments of size up to 80 by 25 meters illustrate the appropriateness of the approach.

826 citations


Journal ArticleDOI
TL;DR: The generalization allows the sequence to be partitioned into segments, and the goal is to bound the additional loss of the algorithm over the sum of the losses of the best experts for each segment to model situations in which the examples change and different experts are best for certain segments of the sequence of examples.
Abstract: We generalize the recent relative loss bounds for on-line algorithms where the additional loss of the algorithm on the whole sequence of examples over the loss of the best expert is bounded. The generalization allows the sequence to be partitioned into segments, and the goal is to bound the additional loss of the algorithm over the sum of the losses of the best experts for each segment. This is to model situations in which the examples change and different experts are best for certain segments of the sequence of examples. In the single segment case, the additional loss is proportional to log n, where n is the number of experts and the constant of proportionality depends on the loss function. Our algorithms do not produce the best partition; however, the loss bound shows that our predictions are close to those of the best partition. When the number of segments is k+1 and the sequence is of length e, we can bound the additional loss of our algorithm over the best partition by O(k log n + k log(e/k)). For the case when the loss per trial is bounded by one, we obtain an algorithm whose additional loss over the loss of the best partition is independent of the length of the sequence. The additional loss becomes O(k log n + k log(L/k)), where L is the loss of the best partition with k+1 segments. Our algorithms for tracking the predictions of the best expert are simple adaptations of Vovk's original algorithm for the single best expert case. As in the original algorithms, we keep one weight per expert, and spend O(1) time per weight in each trial.
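The bookkeeping described above (one weight per expert, O(1) work per weight, plus a mechanism that lets weight return to an expert that becomes best again in a later segment) can be sketched as follows. This is a generic fixed-share-style illustration under squared loss, not the paper's exact algorithm; the constants eta and alpha are illustrative:

```python
import numpy as np

def track_best_expert(expert_preds, outcomes, eta=2.0, alpha=0.01):
    """Predict with one weight per expert; multiplicative loss updates
    plus a small 'share' step so weight can flow back to an expert that
    becomes best again in a later segment."""
    T, n = expert_preds.shape
    w = np.full(n, 1.0 / n)
    preds = np.empty(T)
    for t in range(T):
        preds[t] = w @ expert_preds[t]              # master prediction
        losses = (expert_preds[t] - outcomes[t]) ** 2
        w *= np.exp(-eta * losses)                  # penalize lossy experts
        w /= w.sum()
        w = (1 - alpha) * w + alpha / n             # share step enables tracking
    return preds
```

With two constant experts and an outcome sequence that switches halfway, the master prediction follows the first segment's best expert and moves to the other within a few trials of the switch.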

589 citations


Journal ArticleDOI
Eibe Frank1, Yong Wang1, S. Inglis1, Geoffrey Holmes1, Ian H. Witten1 
TL;DR: Surprisingly, using this simple transformation the model tree inducer M5′, based on Quinlan's M5, generates more accurate classifiers than the state-of-the-art decision tree learner C5.0, particularly when most of the attributes are numeric.
Abstract: Model trees, which are a type of decision tree with linear regression functions at the leaves, form the basis of a recent successful technique for predicting continuous numeric values. They can be applied to classification problems by employing a standard method of transforming a classification problem into a problem of function approximation. Surprisingly, using this simple transformation the model tree inducer M5′, based on Quinlan's M5, generates more accurate classifiers than the state-of-the-art decision tree learner C5.0, particularly when most of the attributes are numeric.
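The transformation mentioned here is easy to make concrete: fit one function approximator per class to that class's 0/1 membership indicator and classify by the largest predicted value. In this sketch an ordinary least-squares linear model stands in for the M5′ model tree:

```python
import numpy as np

class RegressionOneVsAll:
    """Classification via function approximation: one regressor per
    class, trained on the 0/1 class-membership indicator; predict the
    class whose regressor outputs the largest value."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
        # one least-squares fit per class against its 0/1 indicator
        self.W = np.stack([
            np.linalg.lstsq(Xb, (y == c).astype(float), rcond=None)[0]
            for c in self.classes_
        ])
        return self

    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return self.classes_[np.argmax(Xb @ self.W.T, axis=1)]
```

Because the indicators sum to one, the fitted regressors' outputs also sum to one for a linear model, so the argmax acts as a soft membership comparison.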

396 citations


Journal ArticleDOI
TL;DR: Application papers focus research on important unsolved problems that currently restrict the practical applicability of machine learning methods and help to attract high caliber students.
Abstract: Common arguments for including applications papers in the Machine Learning literature are often based on the papers' value for advertising success stories and for boosting morale. For example, high-profile applications can help to secure funding for future research and can help to attract high caliber students. However, there is another reason why such papers are of value to the field, which is, arguably, even more vital. Application papers are essential in order for Machine Learning to remain a viable science. They focus research on important unsolved problems that currently restrict the practical applicability of machine learning methods. Much of the “science” of Machine Learning is a science of engineering.

331 citations


Journal ArticleDOI
TL;DR: The power of multi-agent RL on a very large scale stochastic dynamic optimization problem of practical utility is demonstrated, with results that in simulation surpass the best of the heuristic elevator control algorithms of which the author is aware.
Abstract: Recent algorithmic and theoretical advances in reinforcement learning (RL) have attracted widespread interest. RL algorithms have appeared that approximate dynamic programming on an incremental basis. They can be trained on the basis of real or simulated experiences, focusing their computation on areas of state space that are actually visited during control, making them computationally tractable on very large problems. If each member of a team of agents employs one of these algorithms, a new collective learning algorithm emerges for the team as a whole. In this paper we demonstrate that such collective RL algorithms can be powerful heuristic methods for addressing large-scale control problems. Elevator group control serves as our testbed. It is a difficult domain posing a combination of challenges not seen in most multi-agent learning research to date. We use a team of RL agents, each of which is responsible for controlling one elevator car. The team receives a global reward signal which appears noisy to each agent due to the effects of the actions of the other agents, the random nature of the arrivals and the incomplete observation of the state. In spite of these complications, we show results that in simulation surpass the best of the heuristic elevator control algorithms of which we are aware. These results demonstrate the power of multi-agent RL on a very large scale stochastic dynamic optimization problem of practical utility.

299 citations


Journal ArticleDOI
TL;DR: A rigorous Bayesian analysis of probabilistic localization is presented, which produces a rational argument for evaluating features, for selecting them optimally, and for training the networks that approximate the optimal solution.
Abstract: To operate successfully in indoor environments, mobile robots must be able to localize themselves. Most current localization algorithms lack flexibility, autonomy, and often optimality, since they rely on a human to determine what aspects of the sensor data to use in localization (e.g., what landmarks to use). This paper describes a learning algorithm, called BaLL, that enables mobile robots to learn what features/landmarks are best suited for localization, and also to train artificial neural networks for extracting them from the sensor data. A rigorous Bayesian analysis of probabilistic localization is presented, which produces a rational argument for evaluating features, for selecting them optimally, and for training the networks that approximate the optimal solution. In a systematic experimental study, BaLL outperforms two other recent approaches to mobile robot localization.

244 citations


Journal ArticleDOI
TL;DR: This work investigates how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a “meta-game” of self-learning.
Abstract: Following Tesauro's work on TD-Gammon, we used a 4,000 parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a “meta-game” of self-learning.
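The champion-versus-challenger loop can be sketched in a few lines. As a stand-in for winning backgammon games, the toy "match" below is won by the weight vector closer to a hidden target; everything except the replace-champion-if-challenger-wins structure is illustrative:

```python
import numpy as np

def hillclimb_selfplay(n_weights=10, iters=2000, sigma=0.05, seed=0):
    """Hillclimbing in a relative-fitness setting (sketch): start from
    all-zero weights, mutate the champion slightly, and adopt the
    challenger only if it wins the match.  Here 'winning' means being
    closer to a hidden target vector -- a toy fitness, not backgammon."""
    rng = np.random.default_rng(seed)
    target = rng.normal(size=n_weights)      # unknown optimum
    champ = np.zeros(n_weights)              # initial champion: all zeros
    for _ in range(iters):
        challenger = champ + rng.normal(scale=sigma, size=n_weights)
        # challenger replaces the champion only if it wins
        if np.linalg.norm(challenger - target) < np.linalg.norm(champ - target):
            champ = challenger
    return champ, target
```

Because only winning challengers are adopted, the distance to the optimum decreases monotonically; the interesting phenomenon the paper studies is why this ratchet avoids suboptimal equilibria in true self-play, where fitness is relative rather than fixed.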

217 citations


Journal ArticleDOI
TL;DR: An off-line, meta-learning approach for the identification of hidden context that uses an existing batch learner and the process of contextual clustering to identify stable hidden contexts and the associated context specific, locally stable concepts is presented.
Abstract: Concept drift due to hidden changes in context complicates learning in many domains including financial prediction, medical diagnosis, and communication network performance. Existing machine learning approaches to this problem use an incremental learning, on-line paradigm. Batch, off-line learners tend to be ineffective in domains with hidden changes in context as they assume that the training set is homogeneous. An off-line, meta-learning approach for the identification of hidden context is presented. The new approach uses an existing batch learner and the process of contextual clustering to identify stable hidden contexts and the associated context-specific, locally stable concepts. The approach is broadly applicable to the extraction of context reflected in time and spatial attributes. Several algorithms for the approach are presented and evaluated. A successful application of the approach to a complex flight simulator control task is also presented.

188 citations


Journal ArticleDOI
TL;DR: This paper presents a case study of a machine-aided knowledge discovery process within the general area of drug design, and the Inductive Logic Programming (ILP) system progol is applied to the problem of identifying potential pharmacophores for ACE inhibition.
Abstract: This paper presents a case study of a machine-aided knowledge discovery process within the general area of drug design. Within drug design, the particular problem of pharmacophore discovery is isolated, and the Inductive Logic Programming (ILP) system progol is applied to the problem of identifying potential pharmacophores for ACE inhibition. The case study reported in this paper supports four general lessons for machine learning and knowledge discovery, as well as more specific lessons for pharmacophore discovery, for Inductive Logic Programming, and for ACE inhibition. The general lessons for machine learning and knowledge discovery are as follows. 1. An initial rediscovery step is a useful tool when approaching a new application domain. 2. General machine learning heuristics may fail to match the details of an application domain, but it may be possible to successfully apply a heuristic-based algorithm in spite of the mismatch. 3. A complete search for all plausible hypotheses can provide useful information to a user, although experimentation may be required to choose between competing hypotheses. 4. A declarative knowledge representation facilitates the development and debugging of background knowledge in collaboration with a domain expert, as well as the communication of final results.

Journal ArticleDOI
TL;DR: This note compares two choices of basis for models parameterized by probabilities, showing that it is possible to improve on the traditional choice, the probability simplex, by transforming to the 'softmax' basis.
Abstract: Maximum a posteriori optimization of parameters and the Laplace approximation for the marginal likelihood are both basis-dependent methods. This note compares two choices of basis for models parameterized by probabilities, showing that it is possible to improve on the traditional choice, the probability simplex, by transforming to the ‘softmax’ basis.
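A concrete instance of the basis dependence, assuming a uniform Dirichlet(1) prior over the multinomial parameters: the simplex-basis MAP estimate equals the maximum-likelihood frequencies, while the softmax change of variables contributes a Jacobian factor of the product of the p_i, moving the mode to add-one smoothing:

```python
import numpy as np

counts = np.array([8.0, 2.0, 0.0])   # observed multinomial counts
N, I = counts.sum(), len(counts)

# MAP over the probability simplex with a uniform Dirichlet(1) prior
# coincides with maximum likelihood -- and assigns a hard zero:
map_simplex = counts / N

# In the softmax basis p_i = exp(a_i) / sum_j exp(a_j), the change of
# variables multiplies the posterior density by prod_i p_i, so the mode
# becomes (n_i + 1) / (N + I), i.e. add-one smoothing with no zeros:
map_softmax = (counts + 1) / (N + I)
```

The unseen third outcome gets probability zero in the simplex basis but a small positive probability in the softmax basis, which is one practical sense in which the latter basis "improves on the traditional choice".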

Journal ArticleDOI
TL;DR: Important aspects of the application process not commonly encountered in the “toy world,” including obtaining labeled training data, the difficulties of working with pixel data, and the automatic extraction of higher-level features are discussed.
Abstract: Dramatic improvements in sensor and image acquisition technology have created a demand for automated tools that can aid in the analysis of large image databases. We describe the development of JARtool, a trainable software system that learns to recognize volcanoes in a large data set of Venusian imagery. A machine learning approach is used because it is much easier for geologists to identify examples of volcanoes in the imagery than it is to specify domain knowledge as a set of pixel-level constraints. This approach can also provide portability to other domains without the need for explicit reprogramming; the user simply supplies the system with a new set of training examples. We show how the development of such a system requires a completely different set of skills than are required for applying machine learning to “toy world” domains. This paper discusses important aspects of the application process not commonly encountered in the “toy world,” including obtaining labeled training data, the difficulties of working with pixel data, and the automatic extraction of higher-level features.

Journal ArticleDOI
TL;DR: In this paper, a randomized version of Littlestone's Winnow algorithm for learning k-literal disjunctions is presented, which can predict the best disjunction schedule for an arbitrary sequence of examples.
Abstract: Littlestone developed a simple deterministic on-line learning algorithm for learning k-literal disjunctions. This algorithm (called Winnow) keeps one weight for each of the n variables and does multiplicative updates to its weights. We develop a randomized version of Winnow and prove bounds for an adaptation of the algorithm for the case when the disjunction may change over time. In this case a possible target disjunction schedule T is a sequence of disjunctions (one per trial) and the shift size is the total number of literals that are added/removed from the disjunctions as one progresses through the sequence. We develop an algorithm that predicts nearly as well as the best disjunction schedule for an arbitrary sequence of examples. The algorithm that allows us to track the predictions of the best disjunction is hardly more complex than the original version. However, the amortized analysis needed for obtaining worst-case mistake bounds requires new techniques. In some cases our lower bounds show that the upper bounds of our algorithm have the right constant in front of the leading term in the mistake bound and almost the right constant in front of the second leading term. Computer experiments support our theoretical findings.
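The deterministic base algorithm the paper randomizes can be sketched directly; alpha = 2 and threshold n/2 are the textbook defaults, not necessarily the paper's settings:

```python
import numpy as np

def winnow(examples, alpha=2.0, threshold=None):
    """Littlestone's Winnow for k-literal monotone disjunctions:
    one weight per variable, multiplicative updates on mistakes only.
    examples: iterable of (x, y) with x in {0,1}^n and y in {0,1}."""
    n = len(examples[0][0])
    theta = threshold if threshold is not None else n / 2
    w = np.ones(n)
    mistakes = 0
    for x, y in examples:
        yhat = 1 if w @ x >= theta else 0
        if yhat != y:
            mistakes += 1
            if y == 1:
                w[x == 1] *= alpha     # promote active variables
            else:
                w[x == 1] /= alpha     # demote active variables
    return w, mistakes
```

The mistake bound is O(k log n), essentially independent of the number of irrelevant variables, which is what makes the multiplicative update attractive.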

Journal ArticleDOI
TL;DR: This work presents a generic multiagent exchange situation, in which competitive behavior constitutes a conjectural equilibrium, and introduces an agent that executes a more sophisticated strategic learning strategy, building a model of the response of other agents.
Abstract: Learning in a multiagent environment is complicated by the fact that as other agents learn, the environment effectively changes. Moreover, other agents' actions are often not directly observable, and the actions taken by the learning agent can strongly bias which range of behaviors are encountered. We define the concept of a conjectural equilibrium, where all agents' expectations are realized, and each agent responds optimally to its expectations. We present a generic multiagent exchange situation, in which competitive behavior constitutes a conjectural equilibrium. We then introduce an agent that executes a more sophisticated strategic learning strategy, building a model of the response of other agents. We find that the system reliably converges to a conjectural equilibrium, but that the final result achieved is highly sensitive to initial belief. In essence, the strategic learner's actions tend to fulfill its expectations. Depending on the starting point, the agent may be better or worse off than had it not attempted to learn a model of the other agents at all.

Journal ArticleDOI
TL;DR: The faster Q(λ)-learning method is based on the observation that Q-value updates may be postponed until they are needed, and its update complexity is bounded by the number of actions.
Abstract: Q(λ)-learning uses TD(λ)-methods to accelerate Q-learning. The update complexity of previous online Q(λ) implementations based on lookup tables is bounded by the size of the state/action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
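The postponement idea can be illustrated with a lazily decayed table: instead of multiplying every eligibility-trace entry by the decay factor each step, a single global multiplier is advanced in O(1), and an entry's accumulated decay is applied only when that entry is read or written. This sketches the underlying mechanism, not the full Q(λ) algorithm:

```python
class LazyDecay:
    """Lazily decayed table: conceptually every entry is multiplied by
    `decay` each step, but entries are only touched when read/written."""
    def __init__(self, decay):
        self.decay = decay
        self.global_scale = 1.0
        self.vals = {}                  # key -> (value, scale_at_last_touch)

    def step(self):
        """O(1): implicitly decays *all* entries at once."""
        self.global_scale *= self.decay

    def get(self, key):
        v, s = self.vals.get(key, (0.0, self.global_scale))
        return v * (self.global_scale / s)   # apply the postponed decay

    def add(self, key, amount):
        """Touch only this entry; all pending decay is settled here."""
        self.vals[key] = (self.get(key) + amount, self.global_scale)
```

Over very long runs the global multiplier underflows, so a practical implementation would renormalize periodically; the sketch omits that.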

Journal ArticleDOI
TL;DR: This paper defines and characterize the process of developing a “real-world” Machine Learning application, with its difficulties and relevant issues, distinguishing it from the popular practice of exploiting ready-to-use data sets.
Abstract: In this paper we define and characterize the process of developing a “real-world” Machine Learning application, with its difficulties and relevant issues, distinguishing it from the popular practice of exploiting ready-to-use data sets. To this aim, we analyze and summarize the lessons learned from applying Machine Learning techniques to a variety of problems. We believe that these lessons, though primarily based on our personal experience, can be generalized to a wider range of situations and are supported by the reported experiences of other researchers.

Journal ArticleDOI
TL;DR: A learning method to identify what information will improve coordination in specific problem-solving situations is presented, accomplished by recording and analyzing traces of inferences after problem solving.
Abstract: Coordination is an essential technique in cooperative, distributed multiagent systems. However, sophisticated coordination strategies are not always cost-effective in all problem-solving situations. This paper presents a learning method to identify what information will improve coordination in specific problem-solving situations. Learning is accomplished by recording and analyzing traces of inferences after problem solving. The analysis identifies situations where inappropriate coordination strategies caused redundant activities, or the lack of timely execution of important activities, thus degrading system performance. To remedy this problem, situation-specific control rules are created which acquire additional nonlocal information about activities in the agent networks and then select another plan or another scheduling strategy. Examples from a real distributed problem-solving application involving diagnosis of a local area network are described.

Journal ArticleDOI
TL;DR: The results suggest that in some multiagent learning scenarios direct search in policy space can offer advantages over EF-based approaches, including PIPE and CO-PIPE, which do not depend on EFs and find good policies faster and more reliably.
Abstract: We use simulated soccer to study multiagent learning. Each team's players (agents) share action set and policy, but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively in case of goals. We conduct simulations with varying team sizes, and compare several learning algorithms: TD-Q learning with linear neural networks (TD-Q), Probabilistic Incremental Program Evolution (PIPE), and a PIPE version that learns by coevolution (CO-PIPE). TD-Q is based on learning evaluation functions (EFs) mapping input/action pairs to expected reward. PIPE and CO-PIPE search policy space directly. They use adaptive probability distributions to synthesize programs that calculate action probabilities from current inputs. Our results show that linear TD-Q encounters several difficulties in learning appropriate shared EFs. PIPE and CO-PIPE, however, do not depend on EFs and find good policies faster and more reliably. This suggests that in some multiagent learning scenarios direct search in policy space can offer advantages over EF-based approaches.

Journal ArticleDOI
TL;DR: It is argued that the approach enables fast, semi-automatic, but still high quality robot-control as no fine-tuning of the local controllers is needed and supports the view that adaptive algorithms are advantageous to non-adaptive ones in complex environments.
Abstract: The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous both in time and space, and moreover are partially observable. Here we suggest a systematic approach to solve such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: (i) decompose the task into subtasks using the qualitative knowledge at hand; (ii) design local controllers to solve the subtasks using the available quantitative knowledge; and (iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high quality robot-control as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment which were not foreseen in advance, thus supporting the view that adaptive algorithms are advantageous to non-adaptive ones in complex environments.

Journal ArticleDOI
TL;DR: This paper discusses the emergence of sensorimotor coordination for ESCHeR, a 4DOF redundant foveated robot-head, by interaction with its environment, and explains how the development of ESCHeR's visual abilities can be related to the Piagetian ‘stage theory’.
Abstract: This paper discusses the emergence of sensorimotor coordination for ESCHeR, a 4DOF redundant foveated robot-head, by interaction with its environment. A feedback-error-learning (FEL) based distributed control provides the system with explorative abilities, with reflexes constraining the learning space. A Kohonen network, trained at run-time, categorizes the sensorimotor patterns obtained over ESCHeR's interaction with its environment and enables the reinforcement of frequently executed actions, thus stabilizing the learning activity over time. We explain how the development of ESCHeR's visual abilities (namely gaze fixation and saccadic motion), from a context-free reflex-based control process to a context-dependent, pattern-based sensorimotor coordination, can be related to the Piagetian ‘stage theory’.

Journal ArticleDOI
TL;DR: A multi-year collaboration among computer scientists, toxicologists, chemists, and a statistician, in which the RL induction program was used to assist toxicologists in analyzing relationships among various features of chemical compounds and their carcinogenicity in rodents demonstrated the utility of knowledge-based rule induction.
Abstract: In this paper, we report on a multi-year collaboration among computer scientists, toxicologists, chemists, and a statistician, in which the RL induction program was used to assist toxicologists in analyzing relationships among various features of chemical compounds and their carcinogenicity in rodents. Our investigation demonstrated the utility of knowledge-based rule induction in the problem of predicting rodent carcinogenicity and the place of rule induction in the overall process of discovery. Flexibility of the program in accepting different definitions of background knowledge and preferences was considered essential in this exploratory effort. This investigation has made significant contributions not only to predicting carcinogenicity and non-carcinogenicity in rodents, but to understanding how to extend a rule induction program into an exploratory data analysis tool.

Journal ArticleDOI
TL;DR: This paper proposes a principled solution, based upon Bayesian ideas of competitive models and inference robustification, to the issue of catastrophic fusion in multimodal recognition systems that integrate the output from several modules while working in non-stationary environments.
Abstract: This paper analyzes the issue of catastrophic fusion, a problem that occurs in multimodal recognition systems that integrate the output from several modules while working in non-stationary environments. For concreteness we frame the analysis with regard to the problem of automatic audio visual speech recognition (AVSR), but the issues at hand are very general and arise in multimodal recognition systems which need to work in a wide variety of contexts. Catastrophic fusion is said to have occurred when the performance of a multimodal system is inferior to the performance of some isolated modules, e.g., when the performance of the audio visual speech recognition system is inferior to that of the audio system alone. Catastrophic fusion arises because recognition modules make implicit assumptions and thus operate correctly only within a certain context. Practice shows that when modules are tested in contexts inconsistent with their assumptions, their influence on the fused product tends to increase, with catastrophic results. We propose a principled solution to this problem based upon Bayesian ideas of competitive models and inference robustification. We study the approach analytically on a classic Gaussian discrimination task and then apply it to a realistic problem on audio visual speech recognition (AVSR) with excellent results.

Journal ArticleDOI
TL;DR: The model where a feed-forward network learns from examples generated by a time dependent teacher of the same architecture is analyzed and the best possible generalization ability is determined exactly, through the use of a variational method.
Abstract: We review the application of statistical mechanics methods to the study of online learning of a drifting concept in the limit of large systems. The model where a feed-forward network learns from examples generated by a time dependent teacher of the same architecture is analyzed. The best possible generalization ability is determined exactly, through the use of a variational method. The constructive variational method also suggests a learning algorithm. It depends, however, on some unavailable quantities, such as the present performance of the student. The construction of estimators for these quantities permits the implementation of a very effective, highly adaptive algorithm. Several other algorithms are also studied for comparison with the optimal bound and the adaptive algorithm, for different types of time evolution of the rule.

Journal Article
TL;DR: A modal language for talking about projective planes, containing formulas to be evaluated at points and at lines, and it is decidable whether a given formula is satisfiable in a projective plane.
Abstract: We introduce a modal language for talking about projective planes. This language is two-sorted, containing formulas to be evaluated at points and at lines, respectively. The language has two diamonds whose intended accessibility relations are the two directions of the incidence relation between points and lines. We provide a sound and complete axiomatization for the formulas that are valid in the class of projective planes. We also show that it is decidable whether a given formula is satisfiable in a projective plane, and we characterize the computational complexity of this satisfaction problem.

Journal ArticleDOI
Josef Pauli1
TL;DR: This work uniformly approximates the necessary functions by networks of Gaussian basis functions (GBF networks); by modifying the number of basis functions and/or the size of the Gaussian support, the quality of the function approximation changes.
Abstract: We apply techniques of computer vision and neural network learning to get a versatile robot manipulator. All work conducted follows the principle of autonomous learning from visual demonstration. The user must demonstrate the relevant objects, situations, and/or actions, and the robot vision system must learn from those. For approaching and grasping technical objects three principal tasks have to be done: calibrating the camera-robot coordination, detecting the desired object in the images, and choosing a stable grasping pose. These procedures are based on (nonlinear) functions, which are not known a priori and therefore have to be learned. We uniformly approximate the necessary functions by networks of Gaussian basis functions (GBF networks). By modifying the number of basis functions and/or the size of the Gaussian support, the quality of the function approximation changes. The appropriate configuration is learned in the training phase and applied during the operation phase. All experiments are carried out in real-world applications using an industrial articulation robot manipulator and the computer vision system KHOROS.

Journal ArticleDOI
TL;DR: The authors' analysis provides a mathematical ground to the intuition that localization is indeed much easier than identification and upper-bounds on the hardness of localization are established by applying a new, algebraic-geometry based, general tool for the calculation of the VC-dimension of classes of algebraically defined objects.
Abstract: How difficult is it to find the position of a known object using random samples? We study this question, which is central to Computer Vision and Robotics, in a formal way. We compare the information complexity of two types of tasks: the task of identification of an unknown object from labeled examples input, and the task of localization in which the identity of the target is known and its location in some background scene has to be determined. We carry out the comparison of these tasks using two measuring rods for the complexity of classes of sets: the Vapnik-Chervonenkis dimension and the ε-entropy of relevant classes. The VC-dimension analysis yields bounds on the sample complexity of performing these tasks in the PAC-learning scenario whereas the ε-entropy parameter reflects the complexity of the relevant learning tasks when the examples are generated by the uniform distribution (over the background scene). Our analysis provides a mathematical ground to the intuition that localization is indeed much easier than identification. Our upper-bounds on the hardness of localization are established by applying a new, algebraic-geometry based, general tool for the calculation of the VC-dimension of classes of algebraically defined objects. This technique was independently discovered by Goldberg and Jerrum. We believe that our techniques will prove useful for further VC-dimension estimation problems.
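The VC-dimension bookkeeping behind such sample-complexity bounds can be made concrete with a brute-force shattering check on a toy class — intervals on a grid, which have VC-dimension 2 — rather than the algebraically defined classes the paper treats. Everything below is illustrative.

```python
from itertools import product  # (only stdlib needed)

def shatters(concepts, points):
    """True iff the concept class realizes every binary labeling of `points`."""
    labelings = {tuple(int(c(p)) for p in points) for c in concepts}
    return len(labelings) == 2 ** len(points)

# Class of intervals [a, b] on a small grid: h(x) = 1 iff a <= x <= b.
grid = range(6)
intervals = [lambda x, a=a, b=b: a <= x <= b for a in grid for b in grid]

print(shatters(intervals, [2, 4]))     # True: any 2 points are shattered
print(shatters(intervals, [1, 3, 5]))  # False: the labeling (1,0,1) is impossible
```

No 3 points can be shattered because an interval containing 1 and 5 must also contain 3, so the VC-dimension of intervals is exactly 2; the paper's tool computes such dimensions for far richer, algebraically defined families.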

Journal ArticleDOI
TL;DR: Nonparametric regression estimators—based on neural networks—that select the number of “hidden units” (or “neurons”) using either prequential model selection or delete-one cross-validation are proposed.
Abstract: Prequential model selection and delete-one cross-validation are data-driven methodologies for choosing between rival models on the basis of their predictive abilities. For a given set of observations, the predictive ability of a model is measured by the model's accumulated prediction error and by the model's average-out-of-sample prediction error, respectively, for prequential model selection and for cross-validation. In this paper, given i.i.d. observations, we propose nonparametric regression estimators—based on neural networks—that select the number of “hidden units” (or “neurons”) using either prequential model selection or delete-one cross-validation. As our main contributions: (i) we establish rates of convergence for the integrated mean-squared errors in estimating the regression function using “off-line” or “batch” versions of the proposed estimators and (ii) we establish rates of convergence for the time-averaged expected prediction errors in using “on-line” versions of the proposed estimators. We also present computer simulations (i) empirically validating the proposed estimators and (ii) empirically comparing the proposed estimators with certain novel prequential and cross-validated “mixture” regression estimators.
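The delete-one selection principle can be sketched with a simpler linear-in-parameters family, where polynomial degree stands in for the number of hidden units: each candidate complexity is scored by its average out-of-sample squared error over all leave-one-out splits. The data and settings below are illustrative, not the paper's neural-network estimators.

```python
import numpy as np

def loo_error(x, y, degree):
    """Delete-one cross-validation score for polynomial regression."""
    n = len(x)
    errs = []
    for i in range(n):
        mask = np.arange(n) != i           # delete observation i
        coef = np.polyfit(x[mask], y[mask], degree)
        pred = np.polyval(coef, x[i])      # predict the held-out point
        errs.append((pred - y[i]) ** 2)
    return float(np.mean(errs))

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 40)
y = x ** 3 - x + 0.1 * rng.normal(size=40)  # true regression is cubic

scores = {d: loo_error(x, y, d) for d in range(1, 8)}
best = min(scores, key=scores.get)
print(best, round(scores[best], 4))  # typically a low degree near the true cubic
```

The prequential variant mentioned in the abstract would instead score each candidate by its accumulated one-step-ahead prediction error as the observations arrive in sequence.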

Journal ArticleDOI
TL;DR: A new approach for learning multiagent coordination strategies that addresses these issues is presented and the effectiveness of the technique is demonstrated using a synthetic domain and the predator and prey pursuit problem.
Abstract: A central issue in the design of cooperative multiagent systems is how to coordinate the behavior of the agents to meet the goals of the designer. Traditionally, this has been accomplished by hand-coding the coordination strategies. However, this task is complex due to the interactions that can take place among agents. Recent work in the area has focused on how strategies can be learned. Yet, many of these systems suffer from convergence, complexity and performance problems. This paper presents a new approach for learning multiagent coordination strategies that addresses these issues. The effectiveness of the technique is demonstrated using a synthetic domain and the predator and prey pursuit problem.

Journal ArticleDOI
TL;DR: The open problem of relating the cost of self-directed learning to the VC-dimension is answered by showing that no such relation exists and it is proved that an algorithm that makes fewer queries throughout its learning process, necessarily suffers a higher number of mistakes.
Abstract: We study the self-directed (SD) learning model. In this model a learner chooses examples, guesses their classification and receives immediate feedback indicating the correctness of its guesses. We consider several fundamental questions concerning this model: the parameters of a task that determine the cost of learning, the computational complexity of a student, and the relationship between this model and the teacher-directed (TD) learning model. We answer the open problem of relating the cost of self-directed learning to the VC-dimension by showing that no such relation exists. Furthermore, we refute the conjecture that for the intersection-closed case, the cost of self-directed learning is bounded by the VC-dimension. We also show that the cost of SD learning may be arbitrarily higher than that of TD learning. Finally, we discuss the number of queries needed for learning in this model and its relationship to the number of mistakes the student incurs. We prove a trade-off formula showing that an algorithm that makes fewer queries throughout its learning process necessarily suffers a higher number of mistakes.
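A toy instance of the SD protocol: for threshold concepts on a line, a learner that chooses its own query order (ascending) and guesses 0 until its first mistake incurs at most one mistake in total. This classic warm-up example only illustrates the model; it is not one of the paper's constructions.

```python
def self_directed_threshold(n, target_t):
    """Self-directed learning of a threshold concept on {0, ..., n-1}.

    The target labels x as 1 iff x >= target_t. The learner queries points
    in increasing order, guessing 0 until its first mistake; that single
    mistake pins the threshold down exactly, so it guesses 1 afterwards.
    """
    mistakes = 0
    for x in range(n):
        guess = 0 if mistakes == 0 else 1
        label = 1 if x >= target_t else 0
        if guess != label:
            mistakes += 1
    return mistakes

# Worst case over all thresholds: one mistake suffices.
print(max(self_directed_threshold(20, t) for t in range(21)))  # → 1
```

Because the learner controls the query order, it pays for each wrong guess but never for a query whose answer it already knows — the trade-off between query count and mistake count is exactly what the abstract's formula quantifies.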