Topic
Decision tree model
About: Decision tree model is a research topic. Over the lifetime, 2256 publications have been published within this topic receiving 38142 citations.
Papers
30 Jun 2000
TL;DR: A Markov random field (MRF) approach based on frequent sets and maximum entropy is studied, and it is found that the MRF model provides substantially more accurate probability estimates than the other methods but is more expensive from a computational and memory viewpoint.
Abstract: Large sparse sets of binary transaction data with millions of records and thousands of attributes occur in various domains: customers purchasing products, users visiting web pages, and documents containing words are just three typical examples. Real-time query selectivity estimation (the problem of estimating the number of rows in the data satisfying a given predicate) is an important practical problem for such databases.
We investigate the application of probabilistic models to this problem. In particular, we study a Markov random field (MRF) approach based on frequent sets and maximum entropy, and compare it to the independence model and the Chow-Liu tree model. We find that the MRF model provides substantially more accurate probability estimates than the other methods but is more expensive from a computational and memory viewpoint. To alleviate the computational requirements we show how one can apply bucket elimination and clique tree approaches to take advantage of structure in the models and in the queries. We provide experimental results on two large real-world transaction datasets.
30 citations
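As a rough illustration of the independence-model baseline mentioned in the abstract above, the following sketch estimates the selectivity of a conjunctive query on binary transaction data by multiplying per-item frequencies. The toy transactions and the helper names (`item_frequencies`, `independence_estimate`, `exact_count`) are hypothetical and not taken from the paper; the MRF and Chow-Liu models it compares against are not shown.

```python
from typing import Dict, List, Set


def item_frequencies(transactions: List[Set[str]]) -> Dict[str, float]:
    """Fraction of transactions containing each item."""
    n = len(transactions)
    counts: Dict[str, int] = {}
    for t in transactions:
        for item in t:
            counts[item] = counts.get(item, 0) + 1
    return {item: c / n for item, c in counts.items()}


def independence_estimate(freqs: Dict[str, float], query: Set[str], n_rows: int) -> float:
    """Estimated number of rows containing every item in `query`,
    assuming items occur independently of one another."""
    p = 1.0
    for item in query:
        p *= freqs.get(item, 0.0)
    return p * n_rows


def exact_count(transactions: List[Set[str]], query: Set[str]) -> int:
    """True number of rows containing every item in `query`."""
    return sum(1 for t in transactions if query <= t)


# Hypothetical toy data: four market-basket transactions.
transactions = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"eggs"}]
freqs = item_frequencies(transactions)
query = {"bread", "milk"}

estimate = independence_estimate(freqs, query, len(transactions))
actual = exact_count(transactions, query)
print(estimate, actual)
```

On this toy data the estimate (0.75 × 0.5 × 4 rows) falls below the true count because bread and milk co-occur more often than independence predicts, which is the kind of error that models capturing item dependencies aim to reduce.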
TL;DR: It is shown that a ⌈lg k⌉ height binary decision tree always exists for k polygonal models (in fixed position) and an efficient algorithm for constructing such decision trees is given when the models are given as a set of polygons in the plane.
Abstract: A fundamental problem in model-based computer vision is that of identifying which of a given set of geometric models is present in an image. Considering a "probe" to be an oracle that tells us whether or not a model is present at a given point, we study the problem of computing efficient strategies ("decision trees") for probing an image, with the goal of minimizing the number of probes necessary (in the worst case) to determine which single model is present. We show that a ⌈lg k⌉ height binary decision tree always exists for k polygonal models (in fixed position), provided (1) they are non-degenerate (do not share boundaries) and (2) they share a common point of intersection. Further, we give an efficient algorithm for constructing such decision trees when the models are given as a set of polygons in the plane. We show that constructing a minimum height tree is NP-complete if either of the two assumptions is omitted. We provide an efficient greedy heuristic strategy and show that, in the general case, it yields a decision tree whose height is at most ⌈lg k⌉ times that of an optimal tree. Finally, we discuss some restricted cases whose special structure allows for improved results.
30 citations
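The probing idea in the abstract above can be sketched in a simplified discrete setting. This is an assumption-laden toy, not the paper's algorithm: models are represented as hypothetical sets of grid cells rather than polygons, a probe is a cell membership test, and the function `greedy_height` simply picks the most balanced probe at each step, in the spirit of the greedy heuristic the abstract mentions.

```python
from math import ceil, log2
from typing import FrozenSet, List, Tuple

Point = Tuple[int, int]
Model = FrozenSet[Point]


def greedy_height(models: List[Model]) -> int:
    """Height of a probe decision tree built by always choosing the probe
    that splits the remaining candidate models most evenly."""
    if len(models) <= 1:
        return 0  # a single candidate needs no further probes
    best = None
    for p in sorted(set().union(*models)):
        inside = [m for m in models if p in m]
        outside = [m for m in models if p not in m]
        if not inside or not outside:
            continue  # this probe does not distinguish any candidates
        worst = max(len(inside), len(outside))
        if best is None or worst < best[0]:
            best = (worst, inside, outside)
    if best is None:
        raise ValueError("remaining models cannot be told apart by any probe")
    _, inside, outside = best
    return 1 + max(greedy_height(inside), greedy_height(outside))


# Hypothetical toy input: four nested strips of cells that all contain (0, 0).
models = [frozenset((x, 0) for x in range(i + 1)) for i in range(4)]

height = greedy_height(models)
bound = ceil(log2(len(models)))
print(height, bound)
```

For these four nested toy models the greedy tree needs two probes in the worst case, which matches ⌈lg 4⌉ = 2. The grid abstraction does not encode the paper's non-degeneracy condition, so it should be read only as an illustration of halving the candidate set with each probe.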
01 Aug 1991
TL;DR: This paper investigates the complexity hierarchy of P-time incremental problems, inherently Exp-time incremental problems, and non-incremental problems, with results restricted to locally persistent algorithms.
Abstract: Our results, together with some previously known ones, shed light on the organization of the complexity hierarchy that exists when incremental-computation problems are classified according to their incremental complexity with respect to locally persistent algorithms. In particular, these results separate the classes of P-time incremental problems, inherently Exp-time incremental problems, and non-incremental problems.
30 citations
01 Jan 2005
TL;DR: This paper presents the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach, and proposes a novel and unique embedding list representation that is suitable for describing embedded subtrees.
Abstract: Tree mining has many useful applications in areas such as Bioinformatics, XML mining, Web mining, etc. In general, most of the formally represented information in these domains is in a tree-structured form. In this paper we focus on mining frequent embedded subtrees from databases of rooted labeled ordered subtrees. We propose a novel and unique embedding list representation that is suitable for describing embedded subtrees. This representation is completely different from the string-like or conventional adjacency list representation previously utilized for trees. We present the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach previously introduced in [8]. The key characteristic of the TMG approach is that it enumerates fewer candidates by ensuring that only valid candidates that conform to the structural aspects of the data are generated, as opposed to the join approach. Our experiments with both synthetic and real-life datasets provide comparisons against one of the state-of-the-art algorithms, TreeMiner [15], and they demonstrate the effectiveness and the efficiency of the technique.
29 citations
TL;DR: In this article, the DSMART algorithm was used to disaggregate conventional soil maps and to produce high-quality soil maps when point observations are not available, and the results demonstrated that a suitable approach can provide reliable soil maps at a national extent.
29 citations