
Showing papers on "Decision tree model published in 1994"


Journal ArticleDOI
TL;DR: This paper studies the depth of noisy decision trees in which each node gives the wrong answer with some constant probability, giving tight bounds for several problems.
Abstract: This paper studies the depth of noisy decision trees in which each node gives the wrong answer with some constant probability. In the noisy Boolean decision tree model, tight bounds are given on the number of queries to input variables required to compute threshold functions, the parity function and symmetric functions. In the noisy comparison tree model, tight bounds are given on the number of noisy comparisons for searching, sorting, selection and merging. The paper also studies parallel selection and sorting with noisy comparisons, giving tight bounds for several problems.

338 citations
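To make the noisy Boolean decision tree model concrete, here is a minimal sketch of the obvious baseline: repeat each noisy query and take a majority vote. It is not the paper's algorithm, and it does not reproduce the tight bounds mentioned in the abstract. The function names and the parameters `p` (per-query error probability) and `reps` (repetitions per variable) are assumptions made for illustration.

```python
import random


def noisy_query(bit, p, rng):
    """Return `bit`, flipped with probability p (one noisy node answer)."""
    return bit ^ int(rng.random() < p)


def majority_query(bit, p, reps, rng):
    """Query the same input variable `reps` times and return the majority answer."""
    ones = sum(noisy_query(bit, p, rng) for _ in range(reps))
    return int(2 * ones > reps)


def noisy_or(bits, p, reps, rng):
    """Compute OR of the input bits using only majority-voted noisy queries."""
    return int(any(majority_query(b, p, reps, rng) for b in bits))


if __name__ == "__main__":
    rng = random.Random(0)
    print(noisy_or([0, 0, 1, 0], p=0.1, reps=41, rng=rng))
```

With an odd `reps` there are no ties, and the chance that a single majority vote is wrong drops exponentially in `reps`.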


Journal Article
TL;DR: The main emphasis is on the computational power of various acyclic and cyclic network models; the complexity aspects of synthesizing networks from examples of their behavior are also discussed briefly.
Abstract: We survey some of the central results in the complexity theory of discrete neural networks, with pointers to the literature. Our main emphasis is on the computational power of various acyclic and cyclic network models, but we also discuss briefly the complexity aspects of synthesizing networks from examples of their behavior.

91 citations


Proceedings ArticleDOI
23 May 1994
TL;DR: This paper designs an efficient approximation algorithm with performance ratio 2 for tree alignment and extends it to a polynomial-time approximation scheme, which also implies a polynomial-time approximation scheme for planar Steiner trees under a given topology (with any constant degree).
Abstract: We study the following fundamental problem in computational molecular biology: Given a set of DNA sequences representing some species and a phylogenetic tree depicting the ancestral relationship among these species, compute an optimal alignment of the sequences by means of constructing a minimum-cost evolutionary tree. The problem is an important variant of multiple sequence alignment, and is widely known as tree alignment. A more general version of the problem, called generalized tree alignment in this paper, is that we are given the DNA sequences only and still have to construct a minimum-cost evolutionary tree. The paper presents some hardness results as well as approximation algorithms. It is shown that tree alignment is NP-hard and generalized tree alignment is MAX SNP-hard. On the positive side, we design an efficient approximation algorithm with performance ratio 2 for tree alignment. The algorithm is then extended to a polynomial-time approximation scheme. The construction actually works for Steiner trees in any metric space, and thus implies a polynomial-time approximation scheme for planar Steiner trees under a given topology (with any constant degree). To our knowledge, this is the first polynomial-time approximation scheme in the fields of computational biology and Steiner trees. The contrast between the approximability of tree alignment and generalized tree alignment shows that a phylogenetic tree can indeed help in multiple alignment. The approximation algorithms may be useful in evolutionary genetics practice as they can provide a good initial alignment for the iterative method in [24].
Author notes: Supported in part by a grant from SERB, McMaster University, and NSERC Operating Grant OGP0046613 (Department of Computer Science, McMaster University, Hamilton, Ont. L8S 4K1, Canada); by US Department of Energy Grant DE-FG03-90ER6099 (Computer Science Division, University of California, Berkeley, CA 94720, USA; email: lawler@cs.berkeley.edu); and by NSERC Operating Grant OGP0046613 (Department of Electrical and Computer Engineering, McMaster University, Hamilton, Ontario L8S 4K1, Canada; email: lwang@maccs.mcmaster.ca). STOC '94, 5/94, Montreal, Quebec, Canada. © 1994 ACM 0-89791-663-8/94/0005.

70 citations


Proceedings ArticleDOI
16 Jul 1994
TL;DR: A statistical approach to decision tree modeling is described, in which each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions, yielding a likelihood measure of goodness of fit.
Abstract: A statistical approach to decision tree modeling is described. In this approach, each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions. The resulting model yields a likelihood measure of goodness of fit, allowing ML and MAP estimation techniques to be utilized. An efficient algorithm is presented to estimate the parameters in the tree. The model selection problem is presented and several alternative proposals are considered. A hidden Markov version of the tree is described for data sequences that have temporal dependencies.

45 citations


Proceedings ArticleDOI
28 Jun 1994
TL;DR: A general setting is given in which the complexity of solving two independent problems is the product of the associated individual complexities; several concrete results of this type are derived for decision trees and communication complexity.
Abstract: Gives a general setting in which the complexity (or quality) of solving two independent problems is the product of the associated individual complexities. The authors then derive from this setting several concrete results of this type for decision trees and communication complexity.

29 citations


Journal Article
TL;DR: The computational complexity of languages with interactive proofs of logarithmic knowledge complexity is studied, and it is shown that all such languages can be recognized in ${\cal BPP}^{\cal NP}$.
Abstract: We study the computational complexity of languages which have interactive proofs of logarithmic knowledge complexity. We show that all such languages can be recognized in ${\cal BPP}^{\cal NP}$. Prior to this work, for languages with greater-than-zero knowledge complexity only trivial computational complexity bounds were known. In the course of our proof, we relate statistical knowledge complexity to perfect knowledge complexity; specifically, we show that, for the honest verifier, these hierarchies coincide up to a logarithmic additive term.

28 citations



Book ChapterDOI
05 Jun 1994
TL;DR: A sequence of tree matching operations based on a classification of properties preserved in matching is defined, and the time complexity of the primitives is analyzed.
Abstract: We consider primitives for retrieving information from trees. We define a sequence of tree matching operations based on a classification of properties preserved in matching. We analyze the time complexity of the primitives. The addition of logical variables to the primitives is also considered, and its effects on the complexities are studied.

15 citations


Journal ArticleDOI
TL;DR: It is shown that if f can be represented in k-DNF form and in j-CNF form, then O(n log(min(k, j)/q)) queries suffice to compute f with error probability less than q, where n is the number of input bits.
Abstract: We consider the problem of computing with faulty components in the context of the Boolean decision tree model, in which cost is measured by the number of input bits queried, and the responses to queries are faulty with a fixed probability. We show that if f can be represented in k-DNF form and in j-CNF form, then O(n log(min(k, j)/q)) queries suffice to compute f with error probability less than q, where n is the number of input bits. © 1994 John Wiley & Sons, Inc.

12 citations
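For context, the following is a sketch of the naive baseline that a bound of the form O(n log(min(k, j)/q)) improves on. It is a standard Hoeffding plus union bound argument, not the paper's proof. The symbols $p$ (per-query error probability, assumed to satisfy $p < 1/2$) and $r$ (repetitions per input bit) are assumptions introduced here.

$$
\Pr[\text{majority of } r \text{ answers to one bit is wrong}] \;\le\; \exp\!\left(-2r\left(\tfrac{1}{2}-p\right)^{2}\right)
$$

$$
n \exp\!\left(-2r\left(\tfrac{1}{2}-p\right)^{2}\right) \le q
\quad\Longleftarrow\quad
r \;\ge\; \frac{\ln(n/q)}{2\left(\tfrac{1}{2}-p\right)^{2}}
$$

Reading all $n$ bits this way and then evaluating $f$ on the recovered input costs $nr = O(n \log(n/q))$ queries for constant $p$. The stated result replaces $n$ inside the logarithm with $\min(k, j)$.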


Book ChapterDOI
10 Jul 1994
TL;DR: A statistical approach to decision tree modeling is described, in which each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions, yielding a likelihood measure of goodness of fit.
Abstract: A statistical approach to decision tree modeling is described. In this approach, each decision in the tree is modeled parametrically as is the process by which an output is generated from an input and a sequence of decisions. The resulting model yields a likelihood measure of goodness of fit, allowing ML and MAP estimation techniques to be utilized. An efficient algorithm is presented to estimate the parameters in the tree. The model selection problem is presented and several alternative proposals are considered. A hidden Markov version of the tree is described for data sequences that have temporal dependencies.

3 citations


Proceedings ArticleDOI
02 May 1994
TL;DR: A fast algorithm for data exchange in a network of processors organized as a reconfigurable tree structure that has linear time complexity and provides a large reduction in run-time compared to an existing algorithm.
Abstract: The paper presents a fast algorithm for data exchange in a network of processors organized as a reconfigurable tree structure. For a given data exchange table, the algorithm generates a sequence of tree configurations in which the data exchanges are to be executed. A significant feature of the algorithm is that each exchange is executed in a tree configuration in which the source and destination nodes are adjacent to each other. It has been proved in a theorem that for every pair of nodes in the reconfigurable tree structure, there always exist two and only two configurations in which these two nodes are adjacent to each other. The algorithm utilizes this fact and determines the solution so as to optimize both the number of configurations required and the time to perform the data exchanges. Analysis of the algorithm shows that it has linear time complexity, and provides a large reduction in run-time as compared to an existing algorithm. This is well confirmed by the experimental results obtained by executing a large number of randomly generated data exchange tables. Another significant feature of the algorithm is that the bit size of the routing information code is always two bits, irrespective of the number of nodes in the tree. This not only increases the speed of the algorithm but also results in simpler hardware inside each node.

Book ChapterDOI
01 Mar 1994
TL;DR: This paper introduces the notion of rewriting systems for sets, and considers the complexity of the reachability problems for these systems, showing that this problem is PSPACE-complete in the general case and is P-complete for particular rewriting systems.
Abstract: In this paper we introduce the notion of rewriting systems for sets, and consider the complexity of the reachability problems for these systems, showing that this problem is PSPACE-complete in the general case and is P-complete for particular rewriting systems. As a consequence, we show that the emptiness and finiteness problems for E0L systems are PSPACE-complete, solving in this way an open problem. Finally, we give completeness results for some decision problems concerning binary systolic tree automata.


Journal ArticleDOI
TL;DR: Three methods to estimate the accuracy of a pruned decision tree are given and a general bound is provided which requires no assumption over the instance space.
Abstract: Many studies have shown that decision tree induction methods could be used to determine rules for expert systems. Pruning techniques are often used to increase the accuracy of an induced decision tree over the instance space. While recent results of decision tree induction show that large samples may be required to induce a decision tree of small error, recent expository studies have used very small sample sizes. In such cases it is of value to obtain a posterior evaluation of the error of the induced concept. In this paper we give three methods to estimate the accuracy of a pruned decision tree. The first method assumes uniform prior distribution. For those cases where uniform prior is not appropriate, we develop a method to obtain appropriate prior using a beta distribution. Finally, we provide a general bound which requires no assumption over the instance space. These results can be used when a pruned decision tree is used to classify the original domain or another close domain.
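As a rough illustration of what a posterior error estimate under a uniform or beta prior can look like, the sketch below applies the textbook Beta-Binomial conjugate update to the error rate at a single leaf. It is not claimed to be any of the paper's three methods. The function name and the parameters `a` and `b` (prior pseudo-counts, with `a = b = 1` giving the uniform prior) are assumptions made for illustration.

```python
def posterior_error(errors, n, a=1.0, b=1.0):
    """Posterior mean and variance of a leaf's error rate.

    Assumes a Beta(a, b) prior on the error rate and `errors`
    misclassified samples among `n` samples reaching the leaf.
    """
    a_post = a + errors          # prior pseudo-errors plus observed errors
    b_post = b + (n - errors)    # prior pseudo-successes plus observed successes
    total = a_post + b_post
    mean = a_post / total
    var = a_post * b_post / (total ** 2 * (total + 1))
    return mean, var


if __name__ == "__main__":
    # Uniform prior, 0 errors in 8 samples: the estimate is 1/10, not 0.
    print(posterior_error(0, 8))
    # Informative Beta(2, 8) prior, 1 error in 10 samples.
    print(posterior_error(1, 10, a=2.0, b=8.0))
```

With very small samples the prior keeps the estimate away from the over-optimistic raw error rate, which is the situation the abstract describes.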

Journal ArticleDOI
01 Dec 1994
TL;DR: An automatic method for target identification which uses a pattern recognition algorithm to analyse an ensemble of range image profiles is presented, and results derived from real data show that a high classification rate can be achieved.
Abstract: An automatic method for target identification which uses a pattern recognition algorithm to analyse an ensemble of range image profiles is presented. Such profiles are typical of those produced by high resolution radar and sonar systems employing pulse compression techniques. The approach uses an attributed relational tree model to characterise features extracted from the waveform image profile. The algorithm is capable of learning generic models for each type of target during a supervised training session. Targets are then classified by matching tree models to a database of stored prototypes using a dynamic programming alignment algorithm. Probability attributes are used to model the large amount of scan to scan distortion in the signal caused by target motion. An experimental system has been implemented, and results derived from real data show that a high classification rate can be achieved.


Journal ArticleDOI
TL;DR: A systolic algorithm for solving the 0/1-knapsack problem with n items is presented, and the computational model used is a tree structure which consists of 2^n identical processing elements (PEs).
Abstract: A systolic algorithm for solving the 0/1-knapsack problem with n items is presented. The computational model used is a tree structure which consists of 2^n identical processing elements (PEs). Each PE executes the same program at any time step. The time complexity varies from n to 3n - 2 steps, which includes all the input/output data communication time. The design process and the correctness verification of this algorithm are considered in detail.
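The sketch below is a sequential simulation of the underlying idea only: a binary decision tree in which level i decides whether item i is packed, with one leaf per subset of the n items. It is not the systolic algorithm from the paper, and it models neither the processing elements nor the data communication. The function name `knapsack_tree` is an assumption made for illustration.

```python
def knapsack_tree(weights, values, capacity):
    """Best total value for 0/1 knapsack by expanding a full binary decision tree."""
    # Each frontier entry is (total_weight, total_value) for one tree node.
    frontier = [(0, 0)]
    for w, v in zip(weights, values):
        next_frontier = []
        for total_w, total_v in frontier:
            next_frontier.append((total_w, total_v))          # child 1: skip the item
            next_frontier.append((total_w + w, total_v + v))  # child 2: take the item
        frontier = next_frontier
    # The frontier now holds all leaves; keep only those within capacity.
    return max(total_v for total_w, total_v in frontier if total_w <= capacity)


if __name__ == "__main__":
    print(knapsack_tree([2, 3, 4], [3, 4, 5], capacity=5))
```

In a parallel tree of processing elements each level could be expanded in one step, which is why a number of steps linear in n is plausible even though the tree has exponentially many leaves.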

Proceedings ArticleDOI
02 Oct 1994
TL;DR: This work presents an extremely promising heuristic method for creating effective decision trees, and computational results show that the method obtains optimal solutions for 95% of the cases tested.
Abstract: We consider the problem of identifying the state of an n component coherent system, where each component can be working or failed. It is costly to determine the states of the components. The goal is to find a decision tree which specifies the order of the components to be tested with minimum expected cost. The problem is known to be NP-hard. We present an extremely promising heuristic method for creating effective decision trees, and computational results show that the method obtains optimal solutions for 95% of the cases tested.
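To illustrate what "minimum expected cost" means here, the sketch below handles only the simplest coherent system, a series system, where testing stops at the first failed component and the decision tree degenerates to a single testing order. It finds the best order by brute force and is not the paper's heuristic. The function names and the inputs `costs` and `p_work` (independent working probabilities) are assumptions made for illustration.

```python
from itertools import permutations


def expected_cost(order, costs, p_work):
    """Expected testing cost of a series system when components are tested in `order`."""
    total, reach = 0.0, 1.0
    for i in order:
        total += reach * costs[i]  # component i is tested only if all earlier ones worked
        reach *= p_work[i]
    return total


def best_order(costs, p_work):
    """Brute-force search over all testing orders (exponential; small n only)."""
    return min(permutations(range(len(costs))),
               key=lambda order: expected_cost(order, costs, p_work))


if __name__ == "__main__":
    costs, p_work = [4, 1, 2], [0.9, 0.5, 0.8]
    order = best_order(costs, p_work)
    print(order, expected_cost(order, costs, p_work))
```

For general coherent systems the next test depends on earlier outcomes, so the search space is a set of decision trees rather than a set of permutations.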

Book ChapterDOI
15 Dec 1994
TL;DR: It is shown that any parallel algorithm in the fixed degree algebraic decision tree model that answers membership queries in W ⊆ R^n using p processors requires Ω(|W|/n log(p/n)) rounds, where |W| is the number of connected components of W.
Abstract: We show that any parallel algorithm in the fixed degree algebraic decision tree model that answers membership queries in W ⊆ R^n using p processors requires Ω(|W|/n log(p/n)) rounds, where |W| is the number of connected components of W. We further prove a similar result for the average case complexity. We give applications of this result to various fundamental problems in computational geometry like convex-hull construction and trapezoidal decomposition and also present algorithms with matching upper bounds.

14 Jan 1994
TL;DR: The computational complexity of the subclass problem is evaluated and it is shown that the problem is NP-hard.
Abstract: When training samples of several classes in a Euclidean space $R^n$ are given, finding boundaries in $R^n$ that include only samples of a certain class is one of the most important issues in the field of pattern recognition. A subclass problem is one such problem, in which the boundaries are limited to hyper-rectangles. This paper evaluates the computational complexity of the subclass problem and shows that the problem is NP-hard.

M. Kudo and M. Shimbo, "Computational Complexity of Subclass Problems". Information Engineering. Email: mine@huie.hokudai.ac.jp

1. Introduction

In the field of pattern recognition, it is very important for each class to find discrimination boundaries that can be calculated efficiently in both processes: the construction of the boundaries and the judgement of membership for unknown samples, i.e., whether the samples belong to the class or not. Such processes are carried out on the basis of training samples for given classes. A subclass problem 1) is a problem of this kind: boundaries are restricted to hyper-rectangles and are required to include only the training samples of a certain class and to be maximal among such hyper-rectangles. This problem or similar ones have appeared several times in the pattern recognition literature 2,3), and some methods have been proposed to solve those problems optimally or suboptimally. The applicable area includes not only the discrimination of unknown samples but also feature selection 4-6).

In this paper, we discuss the computational complexity of the subclass problem. To do this, we divide the problem into two stages and investigate their complexities separately. One of them is related to a decision problem which is proven to be NP-complete. Finally, the subclass problem is proven to be NP-hard.

2. Preparation

In this section, we give some notations and definitions. For the notation, see reference 6).

2.1. Some classes of problems. A problem $\Pi$ consists of a set $D_\Pi$ of instances. A decision problem $\Pi$ is a problem that asks for a "yes" or "no" answer, given an instance $I \in D_\Pi$. One of the well-known decision problems is the following.

CLIQUE (with K)
INSTANCE: Graph $G = (V, E)$, positive integer $K \le |V|$.
QUESTION: Does $G$ contain a clique of size $K$ or more, i.e., a subset $V' \subseteq V$ with $|V'| \ge K$ such that every two vertices in $V'$ are connected by an edge of $E$?

A search problem $\Pi$ consists of $D_\Pi$ and, for each instance $I \in D_\Pi$, a set $S_\Pi[I]$ of finite objects called solutions for $I$. An algorithm is said to solve a search problem $\Pi$ if, given as input any instance $I \in D_\Pi$, it returns the answer "no" whenever $S_\Pi[I]$ is empty and otherwise returns some solution $s \in S_\Pi[I]$. A decision problem can be associated with a search problem by answering "yes" if $S_\Pi[I] \neq \emptyset$ and "no" otherwise. In a search problem $\Pi$, each instance $I \in D_\Pi$ has an associated solution set $S_\Pi[I]$, and for the given $I$, we are required to find one element of $S_\Pi[I]$. The enumeration problem based on the search problem $\Pi$ is "Given $I$, what is the cardinality of $S_\Pi[I]$, i.e., how many solutions are there?" For example, for CLIQUE, the following is the enumeration problem.

ENUM CLIQUE (with K)
INSTANCE: Graph $G = (V, E)$, positive integer $K \le |V|$.
QUESTION: How many cliques of size $K$ or more are there in $G$?

Furthermore, we introduce a new class of problems. A list problem $\Pi$ is associated with the search problem and requires us to list (display) all solutions in $S_\Pi[I]$ for a given $I$.

2.2. Complexity. The enumeration problems associated with NP-complete decision problems are clearly NP-hard, since if we know the answer to an enumeration problem we can easily answer "yes" or "no" for the corresponding NP-complete problem, according to whether the answer is greater than 0 or not. However, some enumeration problems do not belong to the class P, even if the underlying problems belong to P 6). For this reason, Valiant 7) proposed a new complexity class called #P-complete, which contains many enumeration problems associated with NP-complete problems. This class is defined as follows.

Definition 2.1 (Valiant, 1979). The class #P consists of all problems computed by nondeterministic polynomial-time Turing machines that have the additional facility of outputting the number of accepting computations. An enumeration problem $\Pi$ is said to be #P-complete when $\Pi \in \#P$ and, for all $\Pi' \in \#P$, $\Pi' \propto_T \Pi$, where $\propto_T$ denotes that there is a polynomial Turing reduction from $\Pi'$ to $\Pi$ (for the definition of polynomial Turing reduction, see reference 6)).

By a similar argument about the complexity of enumeration problems, we can say that the list problems associated with #P-complete enumeration problems are clearly #P-hard, since if we can list the solutions we can easily count them by incrementing a counter by one instead of listing each solution.

2.3. Polynomial transformability. To prove NP-completeness of a problem $\Pi$, it is enough to find an NP-complete problem $\Pi'$ and a polynomial transformation $f: \Pi' \to \Pi$, where $f$ can be calculated in a number of steps polynomial in the size of $I \in D_{\Pi'}$ and gives an obvious correspondence between solutions of $\Pi'$ and those of $\Pi$. Similarly, to prove #P-completeness of a problem $\Pi$, it is known to be enough to find a #P-complete problem $\Pi'$ and a polynomial-time parsimonious transformation $f: \Pi' \to \Pi$, which preserves the number of solutions 6).

3. Subclass Problem

In this section, we describe the SUBCLASS problem (denoted by SP for simplicity). The problem can be written as follows.

SUBCLASS
INSTANCE: $S = (S^+, S^-, d)$, where $S^+$ and $S^-$ are collections of $d$-dimensional vectors.
QUESTION: List every subset $P$ of $S^+$ such that (1) for all $y \in S^-$, $y \notin \mathrm{Rect}(P)$ (exclusiveness), and (2) for any $P'$ satisfying exclusiveness, $P \not\subset P'$ (maximality), where $\mathrm{Rect}(P)$ denotes the minimal hyper-rectangle that includes every $x \in P$. We call $P$ a subclass.

For example, $S^+ = \{x_1 = (0,0),\ x_2 = (1,3),\ x_3 = (3,1)\}$ and $S^- = \{y = (2,2)\}$ provide an instance $(S^+, S^-, 2)$ of SP (see Fig. 1). In this instance, the subsets of $S^+$ satisfying exclusiveness are the six sets $\emptyset$, $\{x_1\}$, $\{x_2\}$, $\{x_3\}$, $\{x_1, x_2\}$ and $\{x_1, x_3\}$; e.g., the rectangle $\mathrm{Rect}(\{x_1, x_2\}) = [0, 1] \times [0, 3]$ does not include $y = (2,2)$. By maximality, the final answer consists of the two sets $\{x_1, x_2\}$ and $\{x_1, x_3\}$.

In SP, we divide the whole problem into two stages, LIST EX and LIST SUB: (1) LIST EX: list the exclusive subsets of $S^+$, and (2) LIST SUB: list the maximal elements of the set consisting of all exclusive subsets. Related to LIST EX, we consider the following three problems:
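
The two stages on the worked example (positive points (0,0), (1,3), (3,1) and negative point (2,2)) can be checked with a brute-force sketch. This is only an illustration of the definitions, not an algorithm from the paper. It enumerates every subset and therefore takes exponential time. The function names and the dictionary-based input format are assumptions made for illustration.

```python
from itertools import combinations


def rect_contains(points, y):
    """True if y lies in the minimal axis-aligned hyper-rectangle around `points`."""
    if not points:
        return False  # the rectangle of the empty set contains nothing
    return all(min(p[k] for p in points) <= y[k] <= max(p[k] for p in points)
               for k in range(len(y)))


def exclusive_subsets(s_plus, s_minus):
    """Stage 1: all subsets of s_plus whose rectangle excludes every negative sample."""
    names = sorted(s_plus)
    found = []
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            pts = [s_plus[name] for name in combo]
            if not any(rect_contains(pts, y) for y in s_minus):
                found.append(frozenset(combo))
    return found


def subclasses(s_plus, s_minus):
    """Stage 2: keep the exclusive subsets not properly contained in another one."""
    exclusive = exclusive_subsets(s_plus, s_minus)
    return [p for p in exclusive if not any(p < q for q in exclusive)]


if __name__ == "__main__":
    s_plus = {"x1": (0, 0), "x2": (1, 3), "x3": (3, 1)}
    s_minus = [(2, 2)]
    print(len(exclusive_subsets(s_plus, s_minus)))
    print([sorted(p) for p in subclasses(s_plus, s_minus)])
```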