Author

R. Bharat Rao

Bio: R. Bharat Rao is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in topics: Generalization & Measure (mathematics). The author has an h-index of 4 and has co-authored 4 publications receiving 104 citations.

Papers
Book Chapter
01 Jan 1995
TL;DR: In this article, the conservation law for generalization performance in a uniformly random universe was studied and a more meaningful measure of generalization was introduced, expected generalization, which is conserved only when certain symmetric properties hold in our universe.
Abstract: The “Conservation Law for Generalization Performance” [Schaffer, 1994] states that for any learning algorithm and bias, “generalization is a zero-sum enterprise.” In this paper we study the law and show that while the law is true, the manner in which the Conservation Law adds up generalization performance over all target concepts, without regard to the probability with which each concept occurs, is relevant only in a uniformly random universe. We then introduce a more meaningful measure of generalization, expected generalization performance. Unlike the Conservation Law's measure of generalization performance (which is, in essence, defined to be zero), expected generalization performance is conserved only when certain symmetric properties hold in our universe. There is no reason to believe, a priori, that such symmetries exist; learning algorithms may well exhibit non-zero (expected) generalization performance.

69 citations
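
The distinction drawn in the abstract above, between summing generalization performance over all target concepts and weighting each concept by its probability, can be illustrated with a toy enumeration. The following is a minimal sketch, not the paper's own formulation: the four-point Boolean domain, the `majority_learner`, the `skewed` prior, and all function names are assumptions made for illustration, and generalization performance is taken here to be off-training-set accuracy minus 0.5.

```python
from itertools import product


def majority_learner(train, x):
    """Hypothetical learner: predict the majority training label (ties go to 1).

    The query point x is ignored; this learner always outputs one constant label.
    """
    ones = sum(label for _, label in train)
    return 1 if 2 * ones >= len(train) else 0


def expected_generalization(learner, prior, n_points=4, n_train=2):
    """Prior-weighted sum of (off-training-set accuracy - 0.5) over all concepts."""
    test_idx = range(n_train, n_points)
    total = 0.0
    # Enumerate every Boolean labeling of the domain as a possible target concept.
    for concept in product([0, 1], repeat=n_points):
        train = [(i, concept[i]) for i in range(n_train)]
        correct = sum(learner(train, i) == concept[i] for i in test_idx)
        accuracy = correct / len(test_idx)
        total += prior(concept) * (accuracy - 0.5)
    return total


def uniform(concept):
    """Uniformly random universe: every concept is equally likely."""
    return 1.0 / 2 ** len(concept)


def skewed(concept, p=0.9):
    """Assumed non-uniform universe: each point is labeled 1 with probability p."""
    prob = 1.0
    for label in concept:
        prob *= p if label else 1 - p
    return prob


print(expected_generalization(majority_learner, uniform))
print(expected_generalization(majority_learner, skewed))
```

Under the uniform prior the gains and losses cancel across concepts, while under the skewed prior the same learner comes out ahead on average. The skewed prior is only one arbitrary choice of non-uniform universe; other priors could make the same learner perform worse than chance.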

Proceedings Article
12 Jul 1992
TL;DR: The minimum description length principle, together with the KEDS algorithm, is used to guide the partitioning of the problem space and has been tested on discovering models for predicting the performance efficiencies of an internal combustion engine.
Abstract: This paper discusses discovery of mathematical models from engineering data sets. KEDS, a Knowledge-based Equation Discovery System, identifies several potentially overlapping regions in the problem space, each associated with an equation of different complexity and accuracy. The minimum description length principle, together with the KEDS algorithm, is used to guide the partitioning of the problem space. The KEDSMDL algorithm has been tested on discovering models for predicting the performance efficiencies of an internal combustion engine.

19 citations

Proceedings Article
09 Jul 1995
TL;DR: An improved vehicle rear-view mirror assembly of the type in which the mirror support arm is carried by means of at least one movable joint from the mounting part of the mirror assembly which is adapted to be secured to the bodywork of the vehicle includes a shroud of a resilient and flexible material mounted on the back of the mirror.
Abstract: An improved vehicle rear-view mirror assembly of the type in which the mirror support arm is carried by means of at least one movable joint from the mounting part of the mirror assembly which is adapted to be secured to the bodywork of the vehicle includes a shroud of a resilient and flexible material mounted on the back of the mirror, and an extension having at least one movable joint with the end region of the extension in sealing relation with the mounting part, the extension enshrouded by a bellows-like tubular gaiter of flexible material. The shroud and the extension reduce the risk of injury to a person who may be accidentally hit by, or hit, the mirror assembly and the bellows-like gaiter also protects the movable joint or joints from dirt and moisture while readily flexing to allow the joint or joints to move or to be adjusted.

11 citations

Book Chapter
01 Jun 1991
TL;DR: KEDS is presented, a Knowledge-based Equation Discovery System, which uses a model-driven approach to discover equations and thereby build models from engineering data.
Abstract: Machine learning techniques that can compress datasets into relationships that underlie the data have great utility for engineers. Most model discovery systems are not well suited to discovering comprehensible models in engineering domains. In this paper we present KEDS, a Knowledge-based Equation Discovery System, which uses a model-driven approach to discover equations and thereby build models from engineering data.

8 citations


Cited by
Journal Article
Ricardo Vilalta1, Youssef Drissi1
TL;DR: This paper provides its own perspective view in which the goal is to build self-adaptive learners that improve their bias dynamically through experience by accumulating meta-knowledge, and provides a survey of meta-learning as reported by the machine-learning literature.
Abstract: Different researchers hold different views of what the term meta-learning exactly means. The first part of this paper provides our own perspective view in which the goal is to build self-adaptive learners (i.e. learning algorithms that improve their bias dynamically through experience by accumulating meta-knowledge). The second part provides a survey of meta-learning as reported by the machine-learning literature. We find that, despite different views and research lines, a question remains constant: how can we exploit knowledge about learning (i.e. meta-knowledge) to improve the performance of learning algorithms? Clearly the answer to this question is key to the advancement of the field and continues being the subject of intensive research.

1,052 citations

Journal Article
TL;DR: This paper compares the performance of Ant-Miner with CN2, a well-known data mining algorithm for classification, in six public domain data sets and provides evidence that Ant-Miner is competitive with CN2 with respect to predictive accuracy and that the rule lists discovered by Ant-Miner are considerably simpler than those discovered by CN2.
Abstract: The paper proposes an algorithm for data mining called Ant-Miner (ant-colony-based data miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is inspired by both research on the behavior of real ant colonies and some data mining concepts as well as principles. We compare the performance of Ant-Miner with CN2, a well-known data mining algorithm for classification, in six public domain data sets. The results provide evidence that: 1) Ant-Miner is competitive with CN2 with respect to predictive accuracy, and 2) the rule lists discovered by Ant-Miner are considerably simpler (smaller) than those discovered by CN2.

994 citations

Journal Article
TL;DR: MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees that is able to harness both AdaBoost's high bias and variance reduction with wagging's superior variance reduction.
Abstract: MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees. MultiBoosting can be viewed as combining AdaBoost with wagging. It is able to harness both AdaBoost's high bias and variance reduction with wagging's superior variance reduction. Using C4.5 as the base learning algorithm, MultiBoosting is demonstrated to produce decision committees with lower error than either AdaBoost or wagging significantly more often than the reverse over a large representative cross-section of UCI data sets. It offers the further advantage over AdaBoost of suiting parallel execution.

729 citations

Journal Article
TL;DR: A new version of the SUBDUE substructure discovery system based on the minimum description length principle is described, which discovers substructures that compress the original data and represent structural concepts in the data.
Abstract: The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUBDUE's ability to find substructures capable of compressing the original data and to discover structural concepts important to the domain.

527 citations

Journal Article
TL;DR: It is argued that Occam's razor's continued use in KDD risks causing significant opportunities to be missed, and should therefore be restricted to the comparatively few applications where it is appropriate.
Abstract: Many KDD systems incorporate an implicit or explicit preference for simpler models, but this use of “Occam's razor” has been strongly criticized by several authors (e.g., Schaffer, 1993; Webb, 1996). This controversy arises partly because Occam's razor has been interpreted in two quite different ways. The first interpretation (simplicity is a goal in itself) is essentially correct, but is at heart a preference for more comprehensible models. The second interpretation (simplicity leads to greater accuracy) is much more problematic. A critical review of the theoretical arguments for and against it shows that it is unfounded as a universal principle, and demonstrably false. A review of empirical evidence shows that it also fails as a practical heuristic. This article argues that its continued use in KDD risks causing significant opportunities to be missed, and should therefore be restricted to the comparatively few applications where it is appropriate. The article proposes and reviews the use of domain constraints as an alternative for avoiding overfitting, and examines possible methods for handling the accuracy–comprehensibility trade-off.

423 citations