Showing papers on "Interval tree" published in 2009


Journal ArticleDOI
TL;DR: This paper proposes three novel tree structures to efficiently perform incremental and interactive HUP mining that can capture the incremental data without any restructuring operation, and shows that these tree structures are very efficient and scalable.
Abstract: High utility pattern (HUP) mining has recently become one of the most important research issues in data mining due to its ability to consider the nonbinary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP transaction frequency tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, the IHUP transaction-weighted utilization tree (IHUPTWU-Tree), is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.

555 citations
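
The IHUPTWU-Tree entry above orders items by their TWU value in descending order. The abstract does not spell out how TWU is computed, so the sketch below assumes the usual definition from the HUP literature: the utility of a transaction is the sum of quantity times profit over its items, and the transaction-weighted utilization (TWU) of an item is the sum of the utilities of all transactions that contain it. The function names and the toy data are hypothetical, and only the ordering step is shown, not the tree itself.

```python
from collections import defaultdict


def transaction_utility(transaction, profit):
    """Utility of one transaction: sum of quantity * unit profit over its items."""
    return sum(qty * profit[item] for item, qty in transaction.items())


def twu(transactions, profit):
    """TWU of each item: total utility of the transactions that contain it."""
    totals = defaultdict(int)
    for transaction in transactions:
        utility = transaction_utility(transaction, profit)
        for item in transaction:
            totals[item] += utility
    return dict(totals)


def twu_descending_order(transactions, profit):
    """Items sorted by TWU, largest first (ties broken by item name)."""
    scores = twu(transactions, profit)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical toy database: each transaction maps item -> purchased quantity.
profit = {"a": 2, "b": 6, "c": 3}
transactions = [
    {"a": 2, "b": 1},  # utility 2*2 + 1*6 = 10
    {"b": 2, "c": 1},  # utility 2*6 + 1*3 = 15
    {"a": 1, "c": 4},  # utility 1*2 + 4*3 = 14
]
print(twu_descending_order(transactions, profit))
```

With this toy data the TWU values are a = 24, b = 25 and c = 29, so the items would be inserted into a TWU-ordered tree as c, b, a.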


Patent
25 Aug 2009
TL;DR: In this article, the authors recursively partitioned an image into a tree of image regions having the image as a tree root and at least one image patch in each leaf image region of the tree, the tree having nodes defined by the image regions and edges defined by pairs of nodes connected by edges.
Abstract: Classification of image regions comprises: recursively partitioning an image into a tree of image regions having the image as a tree root and at least one image patch in each leaf image region of the tree, the tree having nodes defined by the image regions and edges defined by pairs of nodes connected by edges of the tree; assigning unary classification potentials to nodes of the tree; assigning pairwise classification potentials to edges of the tree; and labeling the image regions of the tree of image regions based on optimizing an objective function comprising an aggregation of the unary classification potentials and the pairwise classification potentials.

89 citations
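
The patent entry above labels tree nodes by optimizing an aggregation of unary and pairwise potentials. It does not say which optimizer is used. As a minimal sketch, assuming the aggregation is a plain sum to be minimized and that the pairwise potential depends only on the parent and child labels, an exact labeling of a rooted tree can be found with a bottom-up dynamic program. The data layout (`children`, `unary`, `pairwise`) and the function name are hypothetical.

```python
def best_labeling(children, unary, pairwise, root=0):
    """Minimise the sum of unary[node][label] over nodes plus
    pairwise[parent_label][child_label] over tree edges.

    children: dict mapping node -> list of child nodes
    unary:    dict mapping node -> list of per-label costs
    pairwise: square matrix of edge costs indexed [parent_label][child_label]
    Returns (total_cost, dict mapping node -> chosen label).
    """
    n_labels = len(unary[root])
    cost = {}    # cost[node][label]: best cost of the subtree under node
    choice = {}  # choice[(child, parent_label)]: best label for child

    def up(node):
        for child in children.get(node, []):
            up(child)
        cost[node] = list(unary[node])
        for child in children.get(node, []):
            for parent_label in range(n_labels):
                best = min(
                    range(n_labels),
                    key=lambda lc: pairwise[parent_label][lc] + cost[child][lc],
                )
                choice[(child, parent_label)] = best
                cost[node][parent_label] += (
                    pairwise[parent_label][best] + cost[child][best]
                )

    def down(node, label, labels):
        labels[node] = label
        for child in children.get(node, []):
            down(child, choice[(child, label)], labels)

    up(root)
    root_label = min(range(n_labels), key=lambda label: cost[root][label])
    labels = {}
    down(root, root_label, labels)
    return cost[root][root_label], labels


# Hypothetical example: a root region with two child regions and two labels.
children = {0: [1, 2]}
unary = {0: [0, 5], 1: [4, 0], 2: [1, 3]}
pairwise = [[0, 2], [2, 0]]  # cost 2 when parent and child labels differ
print(best_labeling(children, unary, pairwise))
```

The recursion is kept for brevity; a deep region tree would need an iterative traversal instead.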


Proceedings ArticleDOI
29 Jun 2009
TL;DR: An improved redesign of the R*-tree is presented that is entirely suitable for running within a DBMS, and an insertion is guaranteed to be restricted to a single path because re-insertion could be abandoned.
Abstract: In this paper we present an improved redesign of the R*-tree that is entirely suitable for running within a DBMS. Most importantly, an insertion is guaranteed to be restricted to a single path because re-insertion could be abandoned. We re-engineered both, subtree choice and split algorithm, to be more robust against specific data distributions and insertion orders, as well as peculiarities often found in real multidimensional data sets. This comes along with a substantial reduction in CPU-time. Our experimental setup covers a wide range of different artificial and real data sets. The experimental comparison shows that the search performance of our revised R*-tree is superior to that of its three most important competitors. In comparison to its predecessor, the original R*-tree, the creation of a tree is substantially faster, while the I/O cost required for processing queries is improved by more than 30% on average for two- and three-dimensional data. For higher dimensional data, particularly for real data sets, much larger improvements are achieved.

86 citations


Book ChapterDOI
06 Jul 2009
TL;DR: A cache-oblivious Cartesian tree for solving the range minimum query problem, a Cartesian tree of a tree for the bottleneck edge query problem on trees and undirected graphs, and a proof that no Cartesian tree exists for the two-dimensional version of the range minimum query problem are introduced.
Abstract: We present new results on Cartesian trees with applications in range minimum queries and bottleneck edge queries. We introduce a cache-oblivious Cartesian tree for solving the range minimum query problem, a Cartesian tree of a tree for the bottleneck edge query problem on trees and undirected graphs, and a proof that no Cartesian tree exists for the two-dimensional version of the range minimum query problem.

75 citations
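
For readers unfamiliar with the link between Cartesian trees and range minimum queries used in the entry above, here is a minimal in-memory sketch. It is the textbook stack-based construction, not the cache-oblivious variant the chapter introduces, and the function names are hypothetical. The answer to a range minimum query on positions i..j is the lowest common ancestor of i and j in the tree; the ancestor walk below is deliberately naive and takes time proportional to the tree height.

```python
def build_cartesian_tree(values):
    """Build a min-Cartesian tree over array positions in O(n).

    Returns (root, parent) where parent[i] is the index of i's parent
    (-1 for the root). Ties keep the leftmost minimum closer to the root.
    """
    parent = [-1] * len(values)
    stack = []  # indices on the rightmost path, values non-decreasing
    for i, value in enumerate(values):
        last_popped = -1
        while stack and values[stack[-1]] > value:
            last_popped = stack.pop()
        if last_popped != -1:
            parent[last_popped] = i   # popped chain becomes i's left subtree
        if stack:
            parent[i] = stack[-1]     # i becomes the right child of the top
        stack.append(i)
    root = stack[0] if stack else -1
    return root, parent


def range_min_index(parent, i, j):
    """Index of the minimum in positions i..j: the LCA of i and j."""
    ancestors = set()
    node = i
    while node != -1:
        ancestors.add(node)
        node = parent[node]
    node = j
    while node not in ancestors:
        node = parent[node]
    return node


values = [3, 1, 4, 1, 5]
root, parent = build_cartesian_tree(values)
print(root, range_min_index(parent, 2, 4))
```

A practical range minimum structure would replace the ancestor walk with a constant-time lowest common ancestor query.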


Journal ArticleDOI
TL;DR: TIs that were easier to obtain but traditionally too complex for simulation can now be considered as input to SNESIM, and searching a vector of smaller trees instead of a single large one significantly increases the simulation speed.

71 citations


Proceedings ArticleDOI
12 May 2009
TL;DR: An algorithm is presented to recover parameters such as branch locations, angles, radii, and lengths, as well as connectivity information between branches, which can then be fed into functional-structural plant models to study the relationships between the structure of a plant, its environment, and its internal biology.
Abstract: We present a method for reconstructing 3D models of tree branch structure from laser range data. Our approach is probabilistic, and uses general knowledge of tree structure to guide an iterative reconstruction process. Our goal is to recover parameters such as branch locations, angles, radii, and lengths, as well as connectivity information between branches. These parameters can then be fed into functional-structural plant models to study the relationships between the structure of a plant, its environment, and its internal biology. In this paper we present an algorithm for finding these parameters, and results on both simulated and real datasets.

40 citations


Journal ArticleDOI
TL;DR: This work proposes an algorithm that dynamically discretizes the continuous label at each node during the tree induction process that outperforms the preprocessing approach, the regression tree approach, and several nontree-based algorithms.
Abstract: In traditional decision (classification) tree algorithms, the label is assumed to be a categorical (class) variable. When the label is a continuous variable in the data, two possible approaches based on existing decision tree algorithms can be used to handle the situations. The first uses a data discretization method in the preprocessing stage to convert the continuous label into a class label defined by a finite set of nonoverlapping intervals and then applies a decision tree algorithm. The second simply applies a regression tree algorithm, using the continuous label directly. These approaches have their own drawbacks. We propose an algorithm that dynamically discretizes the continuous label at each node during the tree induction process. Extensive experiments show that the proposed method outperforms the preprocessing approach, the regression tree approach, and several nontree-based algorithms.

36 citations


Patent
26 Feb 2009
TL;DR: In this article, the tree information measuring method is characterized by including a step S10 measuring distance data to any parts of a measured object at a plurality of spots by using distance data obtaining means 16 and 11, and steps S20 and S201 in which a feature data extracting means extracts a mass of feature data equivalent to a trunk of tree from the distance data.
Abstract: PROBLEM TO BE SOLVED: To provide a tree information measuring method, a tree information measuring device, and a program for obtaining highly accurate tree information with less labor. SOLUTION: The tree information measuring method is characterized by including a step S10 measuring distance data to any parts of a measured object at a plurality of spots by using distance data obtaining means 16 and 11, steps S20 and S201 in which a feature data extracting means extracts a mass of feature data equivalent to a trunk of tree from the distance data, steps S30 and S206 in which a matching means 32 makes the distance data at a plurality of spots correspond to each other by scan matching and specifies the distance data in a three-dimensional coordinate system, steps S40 and S300 in which a single tree extraction means 33 extracts a single tree from the coordinate point data specified in the three-dimensional coordinate system, and steps S50 and S600 in which a tree information detection means 34 detects, on a per-single-tree basis, tree information including one or more of the height of a tree, the trunk diameter, and the tree canopy length or diameter. COPYRIGHT: (C)2010, JPO&INPIT

35 citations


Proceedings ArticleDOI
20 Jun 2009
TL;DR: In this paper, a model for segmentation and reconstruction of branching structures, like vascular trees, is proposed, which relies on an explicit representation of a deformable tree, where topological relationships between segments are modeled.
Abstract: The proposed model is devoted to the segmentation and reconstruction of branching structures, like vascular trees. We rely on an explicit representation of a deformable tree, where topological relationships between segments are modeled. This allows easy posterior interactions and quantitative analysis, such as measuring diameters or lengths of vessels. Starting from a unique user-provided root point, an initial tree is built with a technique relying on minimal paths. Within the constructed tree, the central curve of each segment and an associated variable radius function evolve in order to satisfy a region homogeneity criterion.

30 citations


Journal ArticleDOI
TL;DR: Simulations show that DST-based searches behave better than searches on randomly generated graphs and trees, as DST generates fewer messages to query all computers while avoiding the tree bottlenecks.
Abstract: Search algorithms are a key issue for sharing resources in large distributed systems such as peer networks. Several distributed interconnection structures and algorithms have already been studied in this context. With expanding ring algorithms, the efficiency of searches depends on the topology used to send query requests and the dynamics of the structure. In this paper, we present an interconnection structure that limits the number of messages needed for search queries. This structure, called distributed spanning tree (DST), defines each node as the root of a spanning tree. So, it behaves as a tree for the number of messages but it balances the load generated by the requests among computers, and thus, it avoids overloading the root node. This structure is scalable because it needs only a logarithmic memory space per computer to be maintained. A formal and practical description of the structure is presented and we describe traversal algorithms. Simulations show that DST-based searches behave better than searches on randomly generated graphs and trees, as DST generates fewer messages to query all computers while avoiding the tree bottlenecks.

24 citations


Proceedings ArticleDOI
07 Oct 2009
TL;DR: Algorithms for the computation of parse-forests, best tree probability, inside probability (called partition function), and prefix probability are provided by extending to weighted tree automata the Bar-Hillel technique, as defined for context-free grammars.
Abstract: We investigate several algorithms related to the parsing problem for weighted automata, under the assumption that the input is a string rather than a tree. This assumption is motivated by several natural language processing applications. We provide algorithms for the computation of parse-forests, best tree probability, inside probability (called partition function), and prefix probability. Our algorithms are obtained by extending to weighted tree automata the Bar-Hillel technique, as defined for context-free grammars.

Proceedings ArticleDOI
19 Jul 2009
TL;DR: K-tree is an efficient approximation of the k-means clustering algorithm that is suitable for large document collections and allows for efficient disk based implementations where space requirements exceed that of main memory.
Abstract: We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk based implementations where space requirements exceed that of main memory.

Patent
Mitali Singh1
15 Jun 2009
TL;DR: In this paper, an interval tree data structure is assembled from one or more address prefix outbound route filter (ORF) entries by partitioning a two-dimensional space defined by the address prefix and prefix length domains into nonoverlapping intervals.
Abstract: In general, the invention is directed to techniques for improving the performance of route filtering methods. More specifically, an interval tree data structure is assembled from one or more address prefix outbound route filter (ORF) entries by partitioning, according to the characteristics of the ORF entries, a two-dimensional space defined by the address prefix and prefix length domains into non-overlapping intervals. Each node in the interval tree represents a non-overlapping interval in the address prefix dimension. In addition, each node includes a distinct tree structure having nodes that maintain information about the ORF entries that map onto the represented interval for various non-overlapping intervals in the prefix length domain. By traversing the two tiers of trees, a network device can quickly determine the appropriate action to apply to a route.

Journal ArticleDOI
Navin Kashyap1
TL;DR: It is observed that the treewidth of a code is related to a measure of graph complexity, also called treewidth, and this connection is exploited to resolve a conjecture of Forney's regarding the gap between the minimum trellis constraint complexity and the treewidth of the code.
Abstract: A tree decomposition of the coordinates of a code is a mapping from the coordinate set to the set of vertices of a tree. A tree decomposition can be extended to a tree realization, i.e., a cycle-free realization of the code on the underlying tree, by specifying a state space at each edge of the tree, and a local constraint code at each vertex of the tree. The constraint complexity of a tree realization is the maximum dimension of any of its local constraint codes. A measure of the complexity of maximum-likelihood (ML) decoding for a code is its treewidth, which is the least constraint complexity of any of its tree realizations. It is known that among all tree realizations of a linear code that extends a given tree decomposition, there exists a unique minimal realization that minimizes the state-space dimension at each vertex of the underlying tree. In this paper, we give two new constructions of these minimal realizations. As a by-product of the first construction, a generalization of the state-merging procedure for trellis realizations, we obtain the fact that the minimal tree realization also minimizes the local constraint code dimension at each vertex of the underlying tree. The second construction relies on certain code decomposition techniques that we develop. We further observe that the treewidth of a code is related to a measure of graph complexity, also called treewidth. We exploit this connection to resolve a conjecture of Forney's regarding the gap between the minimum trellis constraint complexity and the treewidth of a code. We present a family of codes for which this gap can be arbitrarily large.

Journal ArticleDOI
TL;DR: An algorithm that allows the incremental addition or removal of unranked ordered trees to a minimal frontier-to-root deterministic finite-state tree automaton (DTA) that can be used to efficiently maintain dictionaries which store large collections of trees or tree fragments.
Abstract: We describe an algorithm that allows the incremental addition or removal of unranked ordered trees to a minimal frontier-to-root deterministic finite-state tree automaton (DTA). The algorithm takes a tree t and a minimal DTA A as input; it outputs a minimal DTA A′ which accepts the language L(A) accepted by A incremented (or decremented) with the tree t. The algorithm can be used to efficiently maintain dictionaries which store large collections of trees or tree fragments.

Proceedings ArticleDOI
02 Aug 2009
TL;DR: Results on the NIST MT-05 Chinese-English translation task show that the proposed model statistically significantly outperforms the baseline systems and an algorithm targeting the noncontiguous constituent decoding is also proposed.
Abstract: The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of subtrees. This paper goes further to present a translation model based on non-contiguous tree sequence alignment, where a non-contiguous tree sequence is a sequence of sub-trees and gaps. Compared with the contiguous tree sequence-based model, the proposed model can well handle non-contiguous phrases with any large gaps by means of non-contiguous tree sequence alignment. An algorithm targeting the non-contiguous constituent decoding is also proposed. Experimental results on the NIST MT-05 Chinese-English translation task show that the proposed model statistically significantly outperforms the baseline systems.

Patent
05 Feb 2009
TL;DR: In this article, the authors propose a method for rendering a deformable object from a 3D volumetric voxel dataset of a region, such region having therein an object to be rendered; building a tree hierarchical structure for the obtained volumetric dataset, such tree structure blocks as the nodes of a primary tree hierarchy and bricks being those blocks stored as textures in a video memory.
Abstract: A method for rendering a deformable object. The method includes: obtaining a 3D volumetric voxel dataset of a region, such region having therein an object to be rendered; building a tree hierarchical structure for the obtained volumetric dataset, such tree structure blocks as the nodes of a primary tree hierarchy and bricks being those blocks stored as textures in a video memory; augmenting the primary tree hierarchical structure with maximum and minimum values of the data contained within a block; creating a neighborhood tree hierarchy having for each leaf block of the neighborhood tree hierarchy a reference to the neighboring leaf blocks in the neighborhood tree hierarchy as well as references to neighboring bricks in the neighborhood tree hierarchy; updating the information about minimum and maximum in the primary tree hierarchy by saving for each block the minimum and maximum of the neighboring blocks; and rendering the leaf blocks in visibility order.

Book ChapterDOI
07 Feb 2009
TL;DR: A tree distance function based on multi-sets is introduced, it is shown that this function is a metric on tree spaces, and an algorithm to compute the distance between trees of size at most n in O(n^2) time and O(n) space is designed.
Abstract: We introduce a tree distance function based on multi-sets. We show that this function is a metric on tree spaces, and we design an algorithm to compute the distance between trees of size at most n in O(n^2) time and O(n) space. Contrary to other tree distance functions that require expensive memory allocations to maintain dynamic programming tables of forests, our function can be implemented over simple and static structures. Additionally, we present a case study in which we compare our function with two other distance functions.

Book ChapterDOI
18 Jun 2009
TL;DR: This paper studies the node-weighted Steiner tree problem in unit disk graphs and presents a (1+ε)-approximation algorithm for any ε > 0, when the given set of vertices is c-local.
Abstract: The node-weighted Steiner tree problem is a variation of the classical Steiner minimum tree problem. Given a graph G = (V, E) with node weight function C: V → R+ and a subset X of V, the node-weighted Steiner tree problem is to find a Steiner tree for the set X such that its total weight is minimum. In this paper, we study this problem in unit disk graphs and present a (1+ε)-approximation algorithm for any ε > 0, when the given set of vertices is c-local. As an application, we use node-weighted Steiner tree to solve the node-weighted connected dominating set problem in unit disk graphs and obtain a (5+ε)-approximation algorithm.

Journal ArticleDOI
TL;DR: This paper first proves that deciding whether there is a tree homeomorphism is LOGSPACE-complete, improving on the current LOGCFL upper bound, and develops a practical algorithm for the tree homeomorphism decision problem that is both space- and time-efficient.

Posted Content
TL;DR: The SXSI system stores the tree structure of an XML document using a bit array of opening and closing brackets, stores the text nodes of the document using a global compressed self-index, and outperforms all other systems for counting-only queries.
Abstract: A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed self-index of the document's text nodes. Most queries, however, contain some parts of querying the text of the document, plus some parts of querying the tree structure. It is therefore a challenge to choose an appropriate evaluation order for a given query, which optimally leverages the execution speeds of the text and tree indexes. Here the SXSI system is introduced; it stores the tree structure of an XML document using a bit array of opening and closing brackets, and stores the text nodes of the document using a global compressed self-index. On top of these indexes sits an XPath query engine that is based on tree automata. The engine uses fast counting queries of the text index in order to dynamically determine whether to evaluate top-down or bottom-up with respect to the tree structure. The resulting system has several advantages over existing systems: (1) on pure tree queries (without text search) such as the XPathMark queries, the SXSI system performs on par or better than the fastest known systems MonetDB and Qizx, (2) on queries that use text search, SXSI outperforms the existing systems by 1-3 orders of magnitude (depending on the size of the result set), and (3) with respect to memory consumption, SXSI outperforms all other systems for counting-only queries.

Proceedings ArticleDOI
04 Jan 2009
TL;DR: A data structure, called a biased range tree, is presented that preprocesses a set S of n points in ℝ² and a query distribution D for 2-sided orthogonal range counting queries (a.k.a. dominance counting queries); its expected query time matches, to within a constant factor, that of the optimal comparison tree for S and D.
Abstract: A data structure, called a biased range tree, is presented that preprocesses a set S of n points in ℝ² and a query distribution D for 2-sided orthogonal range counting queries. The expected query time for this data structure, when queries are drawn according to D, matches, to within a constant factor, that of the optimal comparison tree for S and D. The memory and preprocessing requirements of the data structure are O(n log n).
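
To make the query type in the entry above concrete: a 2-sided orthogonal range counting (dominance counting) query asks how many points lie at or below and at or to the left of a query point. The sketch below is a standard offline baseline using a sweep over x and a Fenwick tree over y ranks. It is not the biased range tree, it ignores the query distribution D, and the inclusive boundaries and the function name are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def dominance_counts(points, queries):
    """For each query (qx, qy), count points (x, y) with x <= qx and y <= qy."""
    ys = sorted({y for _, y in points})
    tree = [0] * (len(ys) + 1)  # Fenwick tree over y ranks, 1-indexed

    def add(rank):
        pos = rank + 1
        while pos <= len(ys):
            tree[pos] += 1
            pos += pos & -pos

    def prefix(count):
        # Number of inserted points among the `count` smallest y ranks.
        total = 0
        while count > 0:
            total += tree[count]
            count -= count & -count
        return total

    pts = sorted(points)
    order = sorted(range(len(queries)), key=lambda k: queries[k][0])
    answers = [0] * len(queries)
    p = 0
    for k in order:  # sweep queries by increasing qx
        qx, qy = queries[k]
        while p < len(pts) and pts[p][0] <= qx:
            add(bisect_left(ys, pts[p][1]))
            p += 1
        answers[k] = prefix(bisect_right(ys, qy))
    return answers


points = [(1, 1), (2, 3), (3, 2), (4, 4)]
queries = [(2, 3), (3, 3), (0, 5), (4, 4)]
print(dominance_counts(points, queries))
```

This baseline answers a batch of m queries on n points in O((n + m) log n) time after sorting, with every query treated alike, whereas the biased range tree adapts its expected query time to D.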

Journal ArticleDOI
TL;DR: It is shown that the optimal tree is such a tree where any internal node has degree at most 7 and children of nodes with degree not equal to 2 or 3 are all leaves, and a dynamic programming algorithm with O(n^2) time is given.

Proceedings ArticleDOI
20 Apr 2009
TL;DR: This paper addresses the open question of which domain should be partitioned first to obtain a better re-use rate, and shows both theoretically and experimentally that partitioning the time domain first is better.
Abstract: In this paper, we propose a novel out-of-core volume rendering algorithm for large time-varying fields. Exploring temporal and spatial coherences has been an important direction for speeding up the rendering of time-varying data. Previously, there were techniques that hierarchically partition both the time and space domains into a data structure so as to re-use some results from the previous time step in multiresolution rendering; however, it has not been studied on which domain should be partitioned first to obtain a better re-use rate. We address this open question, and show both theoretically and experimentally that partitioning the time domain first is better. We call the resulting structure (a binary time tree as the primary structure and an octree as the secondary structure) the space-partitioning time (SPT) tree. Typically, our SPT-tree rendering has a higher level of details, a higher re-use rate, and runs faster. In addition, we devise a novel cut-finding algorithm to facilitate efficient out-of-core volume rendering using our SPT tree, we develop a novel out-of-core preprocessing algorithm to build our SPT tree I/O-efficiently, and we propose modified error metrics with a theoretical guarantee of a monotonicity property that is desirable for the tree search. The experiments on datasets as large as 25GB using a PC with only 2GB of RAM demonstrated the efficacy of our new approach.

Proceedings ArticleDOI
09 Nov 2009
TL;DR: Effective segmental results demonstrated that this approach could be applied to nondestructive measurements in forestry and can be used as a solution for tree segmentation from 3D point cloud data with few restrictions.
Abstract: Tree segmentation is an important step in tree reconstruction from scanned data. A new method is presented for automatic extraction of single objects from a complex scene. The proposed method can be used as a solution for tree segmentation from 3D point cloud data with few restrictions, where many complex objects are included in the scene, like trees, buildings, cars, and so on. The scene data is initially segmented into several small regions according to the distances between points. A weighted combination is constructed on distances and normal angles in each small region for further segmentation. The minimization of the function will be used to determine whether these regions will be merged or not. This method is tested on several data sets. Effective segmentation results demonstrated that this approach could be applied to nondestructive measurements in forestry.

Book ChapterDOI
17 Nov 2009
TL;DR: A strategy by which a Self-Organizing Map with an underlying Binary Search Tree structure can be adaptively re-structured using conditional rotations, whereby the neurons are ultimately placed in the input space so as to represent its stochastic distribution.
Abstract: We present a strategy by which a Self-Organizing Map (SOM) with an underlying Binary Search Tree (BST) structure can be adaptively re-structured using conditional rotations. These rotations on the nodes of the tree are local and are performed in constant time, guaranteeing a decrease in the Weighted Path Length (WPL) of the entire tree. As a result, the algorithm, referred to as the Tree-based Topology-Oriented SOM with Conditional Rotations (TTO-CONROT), converges in such a manner that the neurons are ultimately placed in the input space so as to represent its stochastic distribution, and additionally, the neighborhood properties of the neurons suit the best BST that represents the data.

Patent
Oliver Draese1
08 Jan 2009
TL;DR: In this paper, a set of arrays associated with each node of a binary tree can correspond to one of the columns that comprises the value of the data type represented by the node of the binary tree.
Abstract: Provided is a solution for storing data, the data comprising a set of tables, each table comprising a set of columns, each column comprising a set of values, each value being one or more data types. In the solution, a binary tree can be created for each of the data types. Each binary tree can comprise a set of nodes. A set of arrays can be associated with each node of the binary tree. The array associated with each node of each binary tree can correspond to one of the columns that comprises the value of the data type represented by the node of the binary tree. Each array can indicate at least one table row and column from the plurality of tables in which the value of the data type represented by the node of the binary tree occurs.

01 Jan 2009
TL;DR: This paper introduces some problems on interval graphs that are solved using the interval tree data structure, and presents some of that structure's important properties.
Abstract: Interval graphs are a very important subclass of intersection graphs and perfect graphs. They have many applications in different real-life situations. Problems on interval graphs are solved using different data structures, among which the interval tree is very useful. During the last decade this data structure has been used to solve many problems on interval graphs due to its nice properties. Some of its important properties are presented here. We also introduce some problems on interval graphs that are solved using the interval tree data structure. A brief review of interval graphs is also given.
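
The entry above relies on the interval tree without describing it. As a minimal sketch, assuming the common textbook variant (a binary search tree keyed on the low endpoint, with each node also storing the largest high endpoint in its subtree), the code below builds a balanced static tree and reports every stored interval that overlaps a query interval. The paper may use a different variant; class and function names are hypothetical, and intervals are treated as closed.

```python
class Node:
    def __init__(self, low, high):
        self.low = low
        self.high = high
        self.max_high = high  # largest high endpoint in this subtree
        self.left = None
        self.right = None


def build(intervals):
    """Build a balanced tree from a list of closed intervals (low, high)."""
    items = sorted(intervals)

    def rec(lo, hi):
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        node = Node(*items[mid])
        node.left = rec(lo, mid)
        node.right = rec(mid + 1, hi)
        for child in (node.left, node.right):
            if child is not None:
                node.max_high = max(node.max_high, child.max_high)
        return node

    return rec(0, len(items))


def overlapping(node, low, high, out=None):
    """Collect stored intervals that intersect [low, high], sorted by low."""
    if out is None:
        out = []
    if node is None or node.max_high < low:
        return out  # nothing in this subtree reaches the query
    overlapping(node.left, low, high, out)
    if node.low <= high and low <= node.high:
        out.append((node.low, node.high))
    if node.low <= high:  # right subtree only holds lows >= node.low
        overlapping(node.right, low, high, out)
    return out


intervals = [(5, 20), (10, 30), (12, 15), (15, 20), (17, 19), (30, 40)]
tree = build(intervals)
print(overlapping(tree, 14, 16))
```

In interval graph terms, each stored interval is a vertex and the reported list is the set of vertices adjacent to the vertex represented by the query interval.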

Journal ArticleDOI
01 Jan 2009
TL;DR: C-Classifier and M-Classifier algorithms for rule-based classification of tree-structured data, based on extracting special tree patterns from the training dataset that are capable of capturing the characteristics of training trees completely and non-redundantly.
Abstract: Recently, tree structures have become a popular way of storing and manipulating huge amounts of data. Classification of these data can facilitate storage, retrieval, indexing, query answering and different processing operations. In this paper, we present C-Classifier and M-Classifier algorithms for rule-based classification of tree-structured data. These algorithms are based on extracting special tree patterns from the training dataset. These tree patterns, i.e. closed tree patterns and maximal tree patterns, are capable of extracting characteristics of training trees completely and non-redundantly. Our experiments show that M-Classifier significantly reduces running time and complexity. As experimental results show, accuracies of M-Classifier and C-Classifier depend on whether or not we want to classify all of the data points (even uncovered data). In the case of complete classification, C-Classifier shows the best classification quality. On the other hand, in the case of partial classification, M-Classifier improves classification quality measures.

Book ChapterDOI
24 Jul 2009
TL;DR: This paper examines the properties needed for planar trees to lock, with a focus on finding the smallest locked trees according to different measures of complexity, and suggests some new avenues of research for the problem of algorithmic characterization.
Abstract: Locked tree linkages have been known to exist in the plane since 1998, but it is still open whether they have a polynomial-time characterization. This paper examines the properties needed for planar trees to lock, with a focus on finding the smallest locked trees according to different measures of complexity, and suggests some new avenues of research for the problem of algorithmic characterization. First we present a locked linear tree with only eight edges. In contrast, the smallest previous locked tree has 15 edges. We further show minimality by proving that every locked linear tree has at least eight edges. We also show that a six-edge tree can interlock with a four-edge chain, which is the first locking result for individually unlocked trees. Next we present several new examples of locked trees with varying minimality results. Finally, we provide counterexamples to two conjectures of [12], [13] by showing the existence of two new types of locked tree: a locked orthogonal tree (all edges horizontal and vertical) and a locked equilateral tree (all edges unit length).