
Showing papers on "Decision tree model" published in 1996


Proceedings Article
04 Aug 1996
TL;DR: This work proposes a lazy decision tree algorithm--LAZYDT--that conceptually constructs the "best" decision tree for each test instance, and is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees.
Abstract: Lazy learning algorithms, exemplified by nearest-neighbor algorithms, do not induce a concise hypothesis from a given training set; the inductive process is delayed until a test instance is given. Algorithms for constructing decision trees, such as C4.5, ID3, and CART create a single "best" decision tree during the training phase, and this tree is then used to classify test instances. The tests at the nodes of the constructed tree are good on average, but there may be better tests for classifying a specific instance. We propose a lazy decision tree algorithm--LAZYDT--that conceptually constructs the "best" decision tree for each test instance. In practice, only a path needs to be constructed, and a caching scheme makes the algorithm fast. The algorithm is robust with respect to missing values without resorting to the complicated methods usually seen in induction of decision trees. Experiments on real and artificial problems are presented.

290 citations
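The lazy idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give LAZYDT's split criterion or its caching scheme, so the code below is an assumption-laden simplification: it greedily picks the test whose branch (the one the test instance follows) has the lowest class entropy, and it skips any attribute the instance is missing. The names `entropy` and `lazy_classify` are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def lazy_classify(train, test, default=None):
    """Classify `test` by growing only the path that `test` itself follows.

    train: list of (features: dict, label) pairs.
    test:  dict of feature values; a value of None means "missing",
           and such a feature is simply never tested.
    Assumption: the split criterion (lowest branch entropy) is a stand-in,
    not the criterion used by LAZYDT.
    """
    rows = list(train)
    unused = {f for f, v in test.items() if v is not None}
    while rows and unused:
        if len({y for _, y in rows}) == 1:
            break  # the remaining instances agree on the class
        candidates = []
        for f in sorted(unused):
            # Keep only training rows that agree with the test instance on f.
            branch = [(x, y) for x, y in rows if x.get(f) == test[f]]
            if branch:
                candidates.append((entropy([y for _, y in branch]), f, branch))
        if not candidates:
            break  # no test leaves any training rows; stop and vote
        _, best, rows = min(candidates, key=lambda c: c[0])
        unused.discard(best)
    if not rows:
        return default
    return Counter(y for _, y in rows).most_common(1)[0][0]
```

For example, with four training rows over `outlook` and `windy`, `lazy_classify(train, {"outlook": "rain", "windy": None})` tests only `outlook`, because the missing `windy` value is never consulted.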


MonographDOI
01 Jan 1996

262 citations


Journal ArticleDOI
TL;DR: The model LIGNUM, which describes the three-dimensional structure of the tree crown and defines the growth in terms of the metabolism taking place in these units, has been parametrized for young Scots pine (Pinus sylvestris L.) trees.

256 citations


Journal ArticleDOI
TL;DR: Three different types of complexity lower bounds for the one-way unbounded-error and bounded-error probabilistic communication protocols for boolean functions are proved.

108 citations


Journal ArticleDOI
TL;DR: It is proved that the probabilistic communication complexity of the identity function in a 3-computer model is O(√n).
Abstract: It is proved that the probabilistic communication complexity of the identity function in a 3-computer model is O(√n).

62 citations


Journal ArticleDOI
TL;DR: An alternative model of human cognitive development on the balance scale task is presented, which features the symbolic learning algorithm C4.5 as a transition mechanism and exhibits regularities found in the human data including orderly stage progression, U-shaped development, and the torque difference effect.
Abstract: We present an alternative model of human cognitive development on the balance scale task. Study of this task has inspired a wide range of human and computational work. The task requires that children predict the outcome of placing a discrete number of weights at various distances on either side of a fulcrum. Our model, which features the symbolic learning algorithm C4.5 as a transition mechanism, exhibits regularities found in the human data including orderly stage progression, U-shaped development, and the torque difference effect. Unlike previous successful models of the task, the current model uses a single free parameter, is not restricted in the size of the balance scale that it can accommodate, and does not require the assumption of a highly structured output representation or a training environment biased towards weight or distance information. The model makes a number of predictions differing from those of previous computational efforts.

34 citations



Proceedings ArticleDOI
12 Aug 1996
TL;DR: This paper proposes an area model to estimate the area of single-output Boolean functions given only their functional description based on a new complexity measure called the average cube complexity, and empirical results demonstrating its feasibility and utility are presented.
Abstract: Estimation of the area complexity of a Boolean function from its functional description is an important step towards a power estimation capability at the register transfer level (RTL). This paper addresses the problem of computing the area complexity of single-output Boolean functions given only their functional description, where area complexity is measured in terms of the number of gates required for an optimal implementation of the function. We propose an area model to estimate the area based on a new complexity measure called the average cube complexity. This model has been implemented, and empirical results demonstrating its feasibility and utility are presented.

25 citations


Proceedings ArticleDOI
22 Oct 1996
TL;DR: A tree-based modeling method for identifying fault-prone software modules is presented, which has been applied to a subsystem of the Joint Surveillance Target Attack Radar System (JSTARS), a large tactical military system.
Abstract: Tactical military software is required to have high reliability. Each software function is often considered mission critical, and the lives of military personnel often depend on mission success. The paper presents a tree-based modeling method for identifying fault-prone software modules, which has been applied to a subsystem of the Joint Surveillance Target Attack Radar System (JSTARS), a large tactical military system. We developed a decision tree model using software product metrics from one iteration of a spiral life cycle to predict whether or not each module in the next iteration would be considered fault prone. Model results could be used to identify those modules that would probably benefit from extra reviews and testing and thus reduce the risk of discovering faults later on. Identifying fault-prone modules early in development can lead to better reliability. High reliability of each iteration translates into a highly reliable final product. A decision tree also facilitates interpretation of software product metrics to characterize the fault-prone class. The decision tree was constructed using the TREEDISC algorithm, which is a refinement of the CHAID algorithm. This algorithm partitions the ranges of independent variables based on chi-squared tests with the dependent variable. In contrast to algorithms used by previous tree-based studies of software metric data, there is no restriction to binary trees, and statistically significant relationships with the dependent variable are the basis for branching.

24 citations
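To make the chi-squared partitioning described above concrete, here is a minimal sketch of the statistic such a procedure would evaluate for one candidate partition of a metric's range. It is not the TREEDISC or CHAID implementation, whose merging and significance-testing steps are not described in the abstract; the function names, the cut points, and the toy data are all assumptions for illustration.

```python
from bisect import bisect_right

def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def interval_table(values, fault_prone, cut_points):
    """Cross-tabulate metric intervals against the fault-prone flag.

    cut_points must be sorted; k cut points give k + 1 intervals, so the
    partition is not restricted to a binary split.
    Each row is [count not fault prone, count fault prone].
    """
    table = [[0, 0] for _ in range(len(cut_points) + 1)]
    for value, is_fault_prone in zip(values, fault_prone):
        table[bisect_right(cut_points, value)][1 if is_fault_prone else 0] += 1
    return table
```

A larger statistic for one set of cut points than for another suggests a stronger association between the partitioned metric and the fault-prone class; turning the statistic into a significance decision would additionally need the chi-squared distribution with the appropriate degrees of freedom.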



Book ChapterDOI
22 Feb 1996
TL;DR: In this paper, the authors develop a framework for a complexity theory of formal verification with binary decision diagrams, based on read-once projections, and show that the class of functions with polynomial-size free binary decision diagrams has no complete problem, while complete problems are presented for the corresponding classes of the other types of binary decision diagrams considered.
Abstract: Computational complexity is concerned with the complexity of solving problems and computing functions and not with the complexity of verifying circuit designs. The importance of formal verification is evident. Therefore, the framework of a complexity theory for formal verification with binary decision diagrams is developed. This theory is based on read-once projections. For many problems it is determined whether and how they are related with respect to read-once projections. The result that circuits for squaring may be harder to verify than circuits for multiplication is derived and discussed. It is shown that the class of functions with polynomial-size free binary decision diagrams has no complete problem while for the corresponding classes for the other considered types of binary decision diagrams complete problems are presented.

Proceedings ArticleDOI
01 Aug 1996
TL;DR: Empirical results show how expected utility increases with the size of the tree and the number of Bayesian net calculations.
Abstract: We report on work towards flexible algorithms for solving decision problems represented as influence diagrams. An algorithm is given to construct a tree structure for each decision node in an influence diagram. Each tree represents a decision function and is constructed incrementally. The improvements to the tree converge to the optimal decision function (neglecting computational costs) and the asymptotic behaviour is only a constant factor worse than dynamic programming techniques, counting the number of Bayesian network queries. Empirical results show how expected utility increases with the size of the tree and the number of Bayesian net calculations.

Journal ArticleDOI
TL;DR: A more general and efficient tree communication scheme for hypercube multiprocessors is proposed, together with fault-tolerant algorithms that exploit the unique properties of the scheme.
Abstract: The tree communication scheme was shown to be very efficient for global operations on data residing in the processors of a hypercube with time complexity of O(log_2 N), where N is the number of processors. This communication scheme is very useful for many parallel algorithms on hypercube multiprocessors. If a problem can be divided into independent subproblems, each subproblem can first be solved by one of the processors. Then, the tree communication scheme is invoked to merge the subresults into the final results. All the algorithms for problems with this property can benefit from the tree communication scheme. We propose a more general and efficient tree communication scheme in this paper. In addition, we also propose fault-tolerant algorithms for the tree communication scheme, by exploiting the unique properties of the tree communication scheme. The computation and communication slowdown is small (<2) under the effect of multiple link and/or node failures.
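The logarithmic merge step mentioned in the abstract can be sketched as a sequential simulation of a textbook tree reduction on a hypercube. This is the standard fault-free scheme, not the generalized or fault-tolerant variants the paper proposes, and the name `hypercube_reduce` is hypothetical. In each step, nodes that differ in exactly one address bit are paired, so N processors finish in log_2 N steps.

```python
def hypercube_reduce(values, combine):
    """Simulate a tree reduction over a hypercube with len(values) nodes.

    values[i] is the subresult held by the processor with address i.
    Returns (merged result held by node 0, number of communication steps).
    Assumes the number of processors is a power of two.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("number of processors must be a power of two")
    data = list(values)
    steps = 0
    bit = 1
    while bit < n:
        # Receivers are the nodes whose address bits up to and including
        # `bit` are zero; each merges the value of its neighbour across
        # that dimension (address node ^ bit).
        for node in range(0, n, 2 * bit):
            data[node] = combine(data[node], data[node ^ bit])
        bit <<= 1
        steps += 1
    return data[0], steps
```

With eight processors holding 0 through 7 and addition as the merge operation, the call returns the sum 28 after 3 steps, matching log_2 8.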

Book ChapterDOI
TL;DR: In this article, the relationships between the complexity of a task description and the minimal complexity of deterministic and nondeterministic decision trees solving the task are studied.
Abstract: We study the relationships between the complexity of a task description and the minimal complexity of deterministic and nondeterministic decision trees solving this task. We investigate decision trees assuming a global approach i.e. arbitrary checks from a given check system can be used for constructing decision trees.


Proceedings ArticleDOI
01 Jul 1996
TL;DR: A new method is presented for deriving lower bounds to the expected number of queries made by noisy decision trees computing Boolean functions that has the feature that expectations are taken with respect to a uniformly distributed random input, as well as to the random noise, thus yielding stronger lower bounds.
Abstract: We present a new method for deriving lower bounds to the expected number of queries made by noisy decision trees computing Boolean functions. The new method has the feature that expectations are taken with respect to a uniformly distributed random input, as well as with respect to the random noise, thus yielding stronger lower bounds. It also applies to many more functions than do previous results. The method yields a simple proof of the result (previously established by Reischuk and Schmeltz) that almost all Boolean functions of n arguments require Ω(n log n) queries, and strengthens this bound from the worst-case over inputs to the average over inputs. The method also yields bounds for specific Boolean functions in terms of their spectra (their Fourier transforms). The simplest instance of this spectral bound yields the result (previously established by Feige, Peleg, Raghavan and Upfal) that the parity function of n arguments requires Ω(n log n) queries, and again strengthens this bound from the worst-case over inputs to the average over inputs. In its full generality, the spectral bound applies to the "highly resilient" functions introduced by Chor, Friedman, Goldreich, Håstad, Rudich and Smolensky, and it yields non-linear lower bounds whenever the resiliency is asymptotic to the number of arguments.

Journal ArticleDOI
01 May 1996 - Networks
TL;DR: In this article, the authors consider the problem of finding an optimal spanning tree with respect to objective functions which depend on the set of leaves of the tree, and address 18 different such problems and determine their computational complexity.
Abstract: We consider the problem of finding an optimal spanning tree with respect to objective functions which depend on the set of leaves of the tree. We address 18 different such problems and determine their computational complexity. Only a few of the problems examined have been given attention in the existing literature.

Proceedings ArticleDOI
07 May 1996
TL;DR: A new statistical model for acoustic observations in speech recognition that represents intra-utterance phone correlation and can be viewed as a probabilistic model of an utterance or a speaker.
Abstract: This paper introduces a new statistical model for acoustic observations in speech recognition. The model represents intra-utterance phone correlation and can be viewed as a probabilistic model of an utterance or a speaker. The phone correlation is modeled by a dependence tree, using the mutual information between pairs of phones as a measure of correlation. The experiments presented focus on robust tree topology design and on robust estimation for a given topology. With appropriate design algorithms, the dependence trees are shown to provide a better model for independent test data than an independent phone model.
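A dependence tree built from pairwise mutual information can be sketched as a maximum-weight spanning tree, in the style of the classic Chow-Liu construction. Whether the paper uses exactly this procedure is not stated in the abstract, and its robust topology design and estimation steps are not reproduced here. The sketch assumes discrete toy variables in place of acoustic phone observations; `mutual_information` and `dependence_tree` are hypothetical names.

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information (bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def dependence_tree(columns):
    """Return the edges of a maximum-weight spanning tree under pairwise MI.

    columns: dict mapping a variable name to its list of observations
    (all lists the same length). Uses Kruskal's algorithm with union-find.
    """
    names = sorted(columns)
    edges = sorted(((mutual_information(columns[a], columns[b]), a, b)
                    for a, b in combinations(names, 2)), reverse=True)
    parent = {name: name for name in names}

    def find(v):
        # Follow parent links to the component root, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # adding this edge does not create a cycle
            parent[root_a] = root_b
            tree.append((a, b))
    return tree
```

Strongly correlated variables end up adjacent in the tree, while a variable that is independent of the rest is attached by a zero-weight edge only to keep the tree connected.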

Journal ArticleDOI
TL;DR: The complexity of computation in real-valued models has recently been studied intensively from such diverse aspects as the theory of real-valued complexity classes in the Blum-Shub-Smale model and the computational capabilities of feedforward neural networks.

01 Jan 1996
TL;DR: In this article, the authors introduce the bounded theory of finite trees, which replaces the usual equality =, interpreted as identity, with the infinite family of approximate equalities down to a fixed given depth.
Abstract: The theory of finite trees is the full first-order theory of equality in the Herbrand universe (the set of ground terms) over a functional signature containing non-unary function symbols and constants. Albeit decidable, this theory turns out to be of non-elementary complexity [Vor96a]. To overcome the intractability of the theory of finite trees, we introduce in this paper the bounded theory of finite trees. This theory replaces the usual equality =, interpreted as identity, with the infinite family of approximate equalities "down to a fixed given depth" {=_d}_{d∈ω}, with d written in binary, and s =_d t meaning that the ground terms s and t coincide if all their branches longer than d are cut off. By using a refinement of Ferrante-Rackoff's complexity-tailored Ehrenfeucht-Fraïssé games, we demonstrate that the bounded theory of finite trees can be decided within linear double exponential space 2^{2^{cn}} (n is the length of input) for some constant c > 0.


Book ChapterDOI
01 Jan 1996
TL;DR: This chapter discusses how the ratios covered in previous chapters relate to each other and construct a profit tree to show the profitability pathways adopted by an organisation.
Abstract: When you have finished working through this chapter you should be able to: discuss how the ratios covered in previous chapters relate to each other and construct a profit tree to show the profitability pathways adopted by an organisation; make use of the information provided in the profit tree to construct a coherent and logical argument identifying the reasons behind an organisation's performance and relating overall performance back to decisions and systems at the operational level; use the profit tree model to carry out a detailed comparative analysis and interpretation of two sets of accounts; make use of a spreadsheet to construct a template for the profit tree model; and use the profit tree spreadsheet to examine the impact of a range of proposed changes at the operating level on the overall future performance of the business.

01 Jan 1996
TL;DR: In this paper, the role of randomness for the decisional complexity in algebraic decision trees was considered, i.e., the number of comparisons ignoring all other computation, and it was shown that in general the randomized complexity is logarithmic in the size of the decision tree.
Abstract: We consider the role of randomness for the decisional complexity in algebraic decision (or computation) trees, i.e., the number of comparisons ignoring all other computation. Recently Ting and Yao showed that the problem of finding the maximum of n elements has decisional complexity O(log² n) (1994, Inform. Process. Lett., 49, 39–43). In contrast, Rabin showed in 1972 an Ω(n) bound for the deterministic case (1972, J. Comput. System Sci., 6, 639–650). We point out that their technique is applicable to several problems for which corresponding Ω(n) lower bounds hold. We show that in general the randomized decisional complexity is logarithmic in the size of the decision tree. We then turn to the question of the number of random bits needed to obtain the Ting and Yao result. We provide a deterministic O(k log n) algorithm for finding the elements which are larger than a given element, given a bound k on the number of these elements. We use this algorithm to obtain an O(log² n) random bits and O(log² n) queries algorithm for finding the maximum.