scispace - formally typeset
Search or ask a question
Topic

Decision tree model

About: Decision tree model is a research topic. Over the lifetime, 2256 publications have been published within this topic receiving 38142 citations.


Papers
More filters
Proceedings Article
30 Jun 2000
TL;DR: A Markov random field (MRF) approach based on frequent sets and maximum entropy is studied, and it is found that the MRF model provides substantially more accurate probability estimates than the other methods but is more expensive from a computational and memory viewpoint.
Abstract: Large sparse sets of binary transaction data with millions of records and thousands of attributes occur in various domains: customers purchasing products, users visiting web pages, and documents containing words are just three typical examples. Real-time query selectivity estimation (the problem of estimating the number of rows in the data satisfying a given predicate) is an important practical problem for such databases. We investigate the application of probabilistic models to this problem. In particular, we study a Markov random field (MRF) approach based on frequent sets and maximum entropy, and compare it to the independence model and the Chow-Liu tree model. We find that the MRF model provides substantially more accurate probability estimates than the other methods but is more expensive from a computational and memory viewpoint. To alleviate the computational requirements we show how one can apply bucket elimination and clique tree approaches to take advantage of structure in the models and in the queries. We provide experimental results on two large real-world transaction datasets.

30 citations

Journal ArticleDOI
TL;DR: It is shown that a ⌈lg k⌉ height binary decision tree always exists for k polygonal models (in fixed position) and an efficient algorithm for constructing such decision tress is given when the models are given as a set of polygons in the plane.
Abstract: A fundamental problem in model-based computer vision is that of identifying which of a given set of geometric models is present in an image. Considering a "probe" to be an oracle that tells us whether or not a model is present at a given point, we study the problem of computing efficient strategies ("decision trees") for probing an image, with the goal to minimize the number of probes necessary (in the worst case) to determine which single model is present. We show that a ⌈lg k⌉ height binary decision tree always exists for k polygonal models (in fixed position), provided (1) they are non-degenerate (do not share boundaries) and (2) they share a common point of intersection. Further, we give an efficient algorithm for constructing such decision tress when the models are given as a set of polygons in the plane. We show that constructing a minimum height tree is NP-complete if either of the two assumptions is omitted. We provide an efficient greedy heuristic strategy and show that, in the general case, it yields a decision tree whose height is at most ⌈lg k⌉ times that of an optimal tree. Finally, we discuss some restricted cases whose special structure allows for improved results.

30 citations

01 Aug 1991
TL;DR: In this paper, the complexity hierarchy of P-time incremental problems, inherently Exp~ time incremental problems and non-incremental problems is investigated. But the results in this paper are restricted to locally persistent algorithms.
Abstract: Our results, together with some previously known ones, shed light on the organization of the complexity hierarchy that exists when incremental-computation problems are classified according to their incremental complexity with respect to locally persistent algorithms. In particular, these results separate the classes of P-time incremental problems, inherently Exp~ time incremental problems, and non-incremental problems.

30 citations

01 Jan 2005
TL;DR: This paper presents the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach, and proposes a novel and unique embedding list representation that is suitable for describing embedded subtrees.
Abstract: Tree mining has many useful applications in areas such as Bioinformatics, XML mining, Web mining, etc. In general, most of the formally represented information in these domains is a tree structured form. In this paper we focus on mining frequent embedded subtrees from databases of rooted labeled ordered subtrees. We propose a novel and unique embedding list representation that is suitable for describing embedded subtrees. This representation is completely different from the string-like or conventional adjacency list representation previously utilized for trees. We present the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach previously introduced in [8]. The key characteristic of the TMG approach is that it enumerates fewer candidates by ensuring that only valid candidates that conform to the structural aspects of the data are generated as opposed to the join approach. Our experiments with both synthetic and real-life datasets provide comparisons against one of the state-of-the-art algorithms, TreeMiner [15], and they demonstrate the effectiveness and the efficiency of the technique.

29 citations

Journal ArticleDOI
01 May 2019-Geoderma
TL;DR: In this article, the DSMART algorithm was used to disaggregate conventional soil maps and to produce high-quality soil maps when point observations are not available, and the results demonstrated that a suitable approach can provide reliable soil maps at a national extent.

29 citations


Network Information
Related Topics (5)
Cluster analysis
146.5K papers, 2.9M citations
80% related
Artificial neural network
207K papers, 4.5M citations
78% related
Fuzzy logic
151.2K papers, 2.3M citations
77% related
The Internet
213.2K papers, 3.8M citations
77% related
Deep learning
79.8K papers, 2.1M citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202310
202224
2021101
2020163
2019158
2018121