
Showing papers on "Tree (data structure)" published in 1995


Journal ArticleDOI
TL;DR: The authors derive a natural upper bound on the cumulative redundancy of the method for individual sequences that shows that the proposed context-tree weighting procedure is optimal in the sense that it achieves the Rissanen (1984) lower bound.
Abstract: Describes a sequential universal data compression procedure for binary tree sources that performs the "double mixture." Using a context tree, this method weights in an efficient recursive way the coding distributions corresponding to all bounded memory tree sources, and achieves a desirable coding distribution for tree sources with an unknown model and unknown parameters. Computational and storage complexity of the proposed procedure are both linear in the source sequence length. The authors derive a natural upper bound on the cumulative redundancy of the method for individual sequences. The three terms in this bound can be identified as coding, parameter, and model redundancy. The bound holds for all source sequence lengths, not only for asymptotically large lengths. The analysis that leads to this bound is based on standard techniques and turns out to be extremely simple. The upper bound on the redundancy shows that the proposed context-tree weighting procedure is optimal in the sense that it achieves the Rissanen (1984) lower bound.

999 citations
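
The recursive weighting described in the context-tree weighting abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Krichevsky-Trofimov estimator at each node, an equal 1/2 weight between "stop here" and "split further", and an all-zero past before the first symbol. The names `kt_probability` and `ctw_probability` are hypothetical. The sketch recomputes everything in batch with exact fractions for clarity, so it does not show the linear-time sequential update the abstract refers to.

```python
from fractions import Fraction


def kt_probability(zeros, ones):
    """Krichevsky-Trofimov probability of any binary block with these counts."""
    p, a, b = Fraction(1), 0, 0
    for _ in range(zeros):
        p *= Fraction(2 * a + 1, 2 * (a + b + 1))  # P(next = 0 | a zeros, b ones)
        a += 1
    for _ in range(ones):
        p *= Fraction(2 * b + 1, 2 * (a + b + 1))  # P(next = 1 | a zeros, b ones)
        b += 1
    return p


def ctw_probability(bits, depth):
    """Weighted probability of `bits` at the root of a depth-`depth` context tree."""
    history = [0] * depth  # assumption: the unseen past is all zeros
    counts = {}            # context (most recent bit first) -> [zeros, ones]
    for bit in bits:
        context = tuple(reversed(history[len(history) - depth:]))
        for k in range(depth + 1):
            counts.setdefault(context[:k], [0, 0])[bit] += 1
        history.append(bit)

    def weighted(context):
        zeros, ones = counts.get(context, (0, 0))
        if zeros + ones == 0:
            return Fraction(1)            # unvisited subtree contributes a factor of 1
        estimate = kt_probability(zeros, ones)
        if len(context) == depth:
            return estimate               # leaf: no deeper contexts to mix in
        children = weighted(context + (0,)) * weighted(context + (1,))
        return (estimate + children) / 2  # equal-weight mixture of the two options

    return weighted(())
```

Because the root value is a mixture of valid coding distributions, the probabilities of all binary sequences of a fixed length sum to one, which is a convenient sanity check for any implementation.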


Proceedings Article
27 Nov 1995
TL;DR: This work presents a novel algorithm, TREPAN, for extracting comprehensible, symbolic representations from trained neural networks, which is general in its applicability and scales well to large networks and problems with high-dimensional input spaces.
Abstract: A significant limitation of neural networks is that the representations they learn are usually incomprehensible to humans. We present a novel algorithm, TREPAN, for extracting comprehensible, symbolic representations from trained neural networks. Our algorithm uses queries to induce a decision tree that approximates the concept represented by a given network. Our experiments demonstrate that TREPAN is able to produce decision trees that maintain a high level of fidelity to their respective networks while being comprehensible and accurate. Unlike previous work in this area, our algorithm is general in its applicability and scales well to large networks and problems with high-dimensional input spaces.

679 citations


Proceedings Article
11 Sep 1995
TL;DR: A data structure called a GNAT (Geometric Near-neighbor Access Tree) is introduced to solve the problem of finding approximate matches in a large database; it is based on the philosophy that the data structure should act as a hierarchical geometrical model of the data as opposed to a simple decomposition of the data that does not use its intrinsic geometry.
Abstract: Given user data, one often wants to find approximate matches in a large database. A good example of such a task is finding images similar to a given image in a large collection of images. We focus on the important and technically difficult case where each data element is high dimensional, or more generally, is represented by a point in a large metric space and distance calculations are computationally expensive. In this paper we introduce a data structure to solve this problem called a GNAT (Geometric Near-neighbor Access Tree). It is based on the philosophy that the data structure should act as a hierarchical geometrical model of the data as opposed to a simple decomposition of the data that does not use its intrinsic geometry. In experiments, we find that GNATs outperform previous data structures in a number of applications. Keywords: near neighbor, metric space, approximate queries, data mining, Dirichlet domains, Voronoi regions

631 citations
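
To make the idea of using the data's geometry in the GNAT entry above concrete, the following sketch builds a single level of such a structure and answers a range query with triangle-inequality pruning. It is an assumption-laden illustration rather than the paper's code: split points are chosen at random, only one level is built (each group is scanned linearly instead of being indexed recursively), and the names `build_level`, `range_search` and `manhattan` are hypothetical.

```python
import random


def manhattan(a, b):
    """An example metric; any function obeying the triangle inequality works."""
    return sum(abs(u - v) for u, v in zip(a, b))


def build_level(points, split_count, dist, rng):
    """Pick split points, group the rest by nearest split point, store distance ranges."""
    split_ids = rng.sample(range(len(points)), split_count)
    chosen = set(split_ids)
    splits = [points[i] for i in split_ids]
    groups = [[] for _ in splits]
    for i, x in enumerate(points):
        if i not in chosen:
            nearest = min(range(split_count), key=lambda j: dist(splits[j], x))
            groups[nearest].append(x)
    # ranges[i][j] = (min, max) distance from split i to group j and its split point
    ranges = []
    for p in splits:
        row = []
        for j, group in enumerate(groups):
            ds = [dist(p, x) for x in group + [splits[j]]]
            row.append((min(ds), max(ds)))
        ranges.append(row)
    return splits, groups, ranges


def range_search(level, query, radius, dist):
    """Return every stored point within `radius` of `query`."""
    splits, groups, ranges = level
    alive = set(range(len(splits)))
    hits = []
    for i in range(len(splits)):
        if i not in alive:
            continue                      # already ruled out, skip the distance call
        d = dist(query, splits[i])
        if d <= radius:
            hits.append(splits[i])
        for j in list(alive):
            low, high = ranges[i][j]
            if d + radius < low or d - radius > high:
                alive.discard(j)          # group j cannot contain a match
    for j in alive:
        hits.extend(x for x in groups[j] if dist(query, x) <= radius)
    return hits
```

A group is discarded without computing any distance to its members whenever the interval [d - radius, d + radius] misses the stored range, which is where the savings come from when distance calculations are expensive.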


Proceedings Article
11 Sep 1995
TL;DR: The Generalized Search Tree is introduced, an index structure supporting an extensible set of queries and data types and providing all the basic search tree logic required by a database system, thereby unifying disparate structures such as B+-trees and R-trees in a single piece of code and opening the application of search trees to general extensibility.
Abstract: This paper introduces the Generalized Search Tree (GiST), an index structure supporting an extensible set of queries and data types. The GiST allows new data types to be indexed in a manner supporting queries natural to the types; this is in contrast to previous work on tree extensibility which only supported the traditional set of equality and range predicates. In a single data structure, the GiST provides all the basic search tree logic required by a database system, thereby unifying disparate structures such as B+-trees and R-trees in a single piece of code, and opening the application of search trees to general extensibility. To illustrate the flexibility of the GiST, we provide simple method implementations that allow it to behave like a B+-tree, an R-tree, and an RD-tree, a new index for data with set-valued attributes. We also present a preliminary performance analysis of RD-trees, which leads to discussion on the nature of tree indices and how they behave for various datasets.

563 citations


Journal ArticleDOI
TL;DR: It is confirmed that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree, and several new methods for forming multivariate decision trees are presented.
Abstract: Unlike a univariate decision tree, a multivariate decision tree is not restricted to splits of the instance space that are orthogonal to the features' axes. This article addresses several issues for constructing multivariate decision trees: representing a multivariate test, including symbolic and numeric features, learning the coefficients of a multivariate test, selecting the features to include in a test, and pruning of multivariate decision trees. We present several new methods for forming multivariate decision trees and compare them with several well-known methods. We compare the different methods across a variety of learning tasks, in order to assess each method's ability to find concise, accurate decision trees. The results demonstrate that some multivariate methods are in general more effective than others (in the context of our experimental assumptions). In addition, the experiments confirm that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree.

346 citations
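
The difference between a univariate and a multivariate test in the entry above can be illustrated with a small sketch. The data set, the hand-picked weights and the helper names `multivariate_test` and `best_univariate_hits` are all assumptions made for illustration; the article's methods for learning coefficients, selecting features and pruning are not shown.

```python
def multivariate_test(x, weights, threshold):
    """Linear-combination test: the split may be oblique to the feature axes."""
    return sum(w * v for w, v in zip(weights, x)) > threshold


def best_univariate_hits(samples):
    """Most samples any single-feature threshold test (either polarity) gets right."""
    best = 0
    for feature in range(len(samples[0][0])):
        for x, _ in samples:
            threshold = x[feature]
            hits = sum((v[feature] > threshold) == label for v, label in samples)
            best = max(best, hits, len(samples) - hits)  # second term flips polarity
    return best


# Toy data: the classes are separated by the oblique line x0 + x1 = 10.
samples = [((2, 9), True), ((9, 2), True), ((1, 8), False), ((8, 1), False)]
oblique_hits = sum(multivariate_test(x, (1, 1), 10) == label for x, label in samples)
```

On this toy data a single oblique test classifies all four samples, while no single axis-parallel threshold does, so a univariate tree would need more than one split to separate the classes.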


Journal ArticleDOI
TL;DR: It is shown that this universal source incorporates any minimal data-generating tree machine in an asymptotically optimal manner in the following sense: the negative logarithm of the probability it assigns to any long typical sequence, generated by any tree machine, approaches that assigned by the tree machine at the best possible rate.
Abstract: An irreducible parameterization for a finite memory source is constructed in the form of a tree machine. A universal information source for the set of finite memory sources is constructed by a predictive modification of an earlier studied algorithm, Context. It is shown that this universal source incorporates any minimal data-generating tree machine in an asymptotically optimal manner in the following sense: the negative logarithm of the probability it assigns to any long typical sequence, generated by any tree machine, approaches that assigned by the tree machine at the best possible rate.

255 citations


Proceedings ArticleDOI
01 Jan 1995
TL;DR: This work visualizes the structure of sections of the World Wide Web by constructing graphical representations in 3D hyperbolic space and uses the Geomview/WebOOGL 3D Web browser as an interface between the 3D representation and the actual documents on the Web.
Abstract: We visualize the structure of sections of the World Wide Web by constructing graphical representations in 3D hyperbolic space. The felicitous property that hyperbolic space has “more room” than Euclidean space allows more information to be seen amid less clutter, and motion by hyperbolic isometries provides for mathematically elegant navigation. The 3D graphical representations, available in the WebOOGL or VRML file formats, contain link anchors which point to the original pages on the Web itself. We use the Geomview/WebOOGL 3D Web browser as an interface between the 3D representation and the actual documents on the Web. The Web is just one example of a hierarchical tree structure with links “back up the tree” i.e. a directed graph which contains cycles. Our information visualization techniques are appropriate for other types of directed graphs with cycles, such as filesystems with symbolic links.

237 citations


Journal ArticleDOI
TL;DR: The REALISM system decreases the time for collision detection with a three-stage process that uses a spherical geometry approximation to rapidly locate regions of potential collision and then applies a local intersection test with actual object geometry information, making it both fast and accurate.
Abstract: The detection of collisions between moving polyhedral objects is one of the most computationally intensive tasks in the computer animation process. The use of object-oriented techniques to encapsulate data within the objects' structures compounds this problem through the requirement for inter-object message passing in order to obtain geometric information for collision detection. The REALISM system decreases the time for collision detection by using a three stage process. The first stage identifies objects in the same locality using a global bounding volume table. The second stage locates regions of possible collision using a sphere-tree data structure (a hierarchical tree of spheres based on octree-type spatial subdivision). The final stage finds intersections between polygonal faces of the objects that are contained within the intersecting pairs of leaf nodes. Hence the algorithm uses a spherical geometry approximation rapidly to locate regions of potential collisions and then uses a local intersection test with actual object geometry information. The system is therefore fast and accurate. Tests for various geometric objects support this and show performance improvements of five times over traditional polyhedral intersection tests.

219 citations
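
The second stage of the REALISM entry above, descending two sphere-trees to find leaf pairs that may collide, can be sketched as follows. This is an illustrative sketch only: it assumes every parent sphere fully encloses its children, uses a simple "descend the larger node first" rule, and omits both the global bounding volume table and the final polygon intersection test. The `SphereNode` type and function names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SphereNode:
    center: tuple
    radius: float
    children: list = field(default_factory=list)
    label: str = ""


def spheres_overlap(a, b):
    """Two spheres intersect when their centers are no farther apart than the radii sum."""
    return math.dist(a.center, b.center) <= a.radius + b.radius


def overlapping_leaves(a, b):
    """Return label pairs of leaf spheres, one from each tree, that overlap."""
    if not spheres_overlap(a, b):
        return []  # prune: enclosed descendants cannot overlap either
    if not a.children and not b.children:
        return [(a.label, b.label)]
    # Descend into whichever node still has children, preferring the larger sphere.
    if a.children and (not b.children or a.radius >= b.radius):
        return [pair for child in a.children for pair in overlapping_leaves(child, b)]
    return [pair for child in b.children for pair in overlapping_leaves(a, child)]
```

Only the leaf pairs returned here would need to be passed on to an exact face-to-face intersection test, which is the point of the cheap spherical approximation.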


Journal Article
TL;DR: In this paper, new approximation algorithms for the Steiner tree problem are designed using a novel technique of choosing Steiner points in dependence on the possible deviation from the optimal solutions, achieving approximation ratios of 1.644 in an arbitrary metric and 1.267 in the rectilinear plane.
Abstract: The Steiner tree problem asks for the shortest tree connecting a given set of terminal points in a metric space. We design new approximation algorithms for the Steiner tree problem using a novel technique of choosing Steiner points in dependence on the possible deviation from the optimal solutions. We achieve the best approximation ratios to date: 1.644 in an arbitrary metric and 1.267 in the rectilinear plane.

196 citations



Journal ArticleDOI
TL;DR: A new hierarchical triangle-based model for representing surfaces over sampled data, based on the subdivision of the surface domain into nested triangulations, is proposed, which allows compression of spatial data and representation of a surface at successively finer degrees of resolution.
Abstract: A new hierarchical triangle-based model for representing surfaces over sampled data is proposed, which is based on the subdivision of the surface domain into nested triangulations, called a hierarchical triangulation (HT). The model allows compression of spatial data and representation of a surface at successively finer degrees of resolution. An HT is a collection of triangulations organized in a tree, where each node, except for the root, is a triangulation refining a face belonging to its parent in the hierarchy. We present a topological model for representing an HT, and algorithms for its construction and for the extraction of a triangulation at a given degree of resolution. The surface model, called a hierarchical triangulated surface (HTS) is obtained by associating data values with the vertices of triangles, and by defining suitable functions that describe the surface over each triangular patch. We consider an application of a piecewise-linear version of the HTS to interpolate topographical data, and we describe a specialized version of the construction algorithm that builds an HTS for a terrain starting from a high-resolution rectangular grid of sampled data. Finally, we present an algorithm for extracting representations of terrain at variable resolution over the domain.

Journal ArticleDOI
TL;DR: A constructive procedure is presented for converting a CFG into a left anchored LTIG that preserves ambiguity and generates the same trees and it was possible to parse more quickly with the LTIGs than with the original CFGs.

Journal ArticleDOI
TL;DR: The development of management techniques that allow the manipulation of tree root systems to maximize benefit and minimize competition is proposed as an important task for future agroforestry research.
Abstract: This literature review presents information about the role of tree root systems for the functioning of agroforestry associations and rotations and attempts to identify root-related criteria for the selection of agroforestry tree species and the design of agroforestry systems. Tree roots are expected to enrich soil with organic matter, feed soil biomass, reduce nutrient leaching, recycle nutrients from the subsoil below the crop rooting zone and improve soil physical properties, among other functions. On the other hand, they can depress crop yields in tree-crop associations through root competition. After a brief review of favourable tree root effects in agroforestry, four strategies are discussed as potential solutions to the dilemma of the simultaneous occurrence of desirable and undesirable tree root functions: 1) the selection of tree species with low root competitiveness, eventually supplemented by shoot pruning; 2) the identification of trees with a root distribution complementary to that of the crops; 3) the reduction of tree root length density by trenching or tillage; and 4) the use of agroforestry rotations instead of tree-crop associations. The potential and limitations of these strategies are discussed, and deficits in current understanding of tree root ecology in agroforestry are identified. In addition to the selection of tree species and provenances according to root-related criteria, the development of management techniques that allow the manipulation of tree root systems to maximize benefit and minimize competition are proposed as important tasks for future agroforestry research.

Patent
07 Jun 1995
TL;DR: In this article, a vector quantization (VQ) method is used to build a codebook for the compression of data. The codebook is initialized by establishing N initial nodes and creating the remainder as a binary codebook, and the splitting and reassociating process continues until the maximum number of terminal nodes is created in the tree, a total error or distortion threshold has been reached, or some other criterion is met.
Abstract: Improved method and apparatus for vector quantization (VQ) to build a codebook for the compression of data. The codebook or "tree" is initialized by establishing N initial nodes and creating the remainder of the codebook as a binary codebook. Children entries are split upon determination of various attributes, such as maximum distortion, population, etc. Vectors obtained from the data are associated with the children nodes, and then representative children entries are recalculated. This splitting/reassociation continues iteratively until a difference in error associated with the previous children and current children becomes less than a threshold. This splitting and reassociating process continues until the maximum number of terminal nodes is created in the tree, a total error or distortion threshold has been reached, or some other criterion is met. The data may then be transmitted as a compressed bitstream comprising a codebook and indices referencing the codebook.


Book ChapterDOI
Lars Arge
16 Aug 1995
TL;DR: This paper shows how a technique for transforming an internal memory tree data structure into an external storage structure can be used to develop a search-tree-like structure, a priority-queue, a (one-dimensional) range-tree and a segment-tree, and gives examples of how these structures can be used to develop efficient I/O-algorithms.
Abstract: In this paper we develop a technique for transforming an internal memory tree data structure into an external storage structure. We show how the technique can be used to develop a search-tree-like structure, a priority-queue, a (one-dimensional) range-tree and a segment-tree, and give examples of how these structures can be used to develop efficient I/O-algorithms. All our algorithms are either extremely simple or straightforward generalizations of known internal memory algorithms, given the developed external data structures.

Patent
02 Oct 1995
TL;DR: In this paper, a method for interactively presenting multi-media information in a computer system is presented. The method includes the steps of receiving the information, and converting the information to a common intermediate representation stored in a memory of a computer system in the form of a hierarchical attribute tree.
Abstract: In a computer system, a method is implemented for interactively presenting electronically encoded multi-media information. The information includes marks that indicate its structure. The method includes the steps of receiving the information, and converting the information to a common intermediate representation stored in a memory of a computer system in the form of a hierarchical attribute tree. The tree has a plurality of document objects; the document objects represent the information, the structure of the information, and procedures which can operate on the information. The common intermediate representation is presented using a plurality of user communication modalities according to the hierarchical attribute tree. While presenting the information, the method receives control signals from a user using the plurality of user communication modalities to enable the user to interactively and independently control the receiving of the information and the presentation of the information in a plurality of presentation modalities.


Patent
24 May 1995
TL;DR: In this article, the user can separate a portion of a tree control at a node and create a new tree control for viewing and editing, and changes to a newly created tree control propagate through to related tree controls.
Abstract: A method for interactive display of a graphical tree structure in a windowing environment. A tree control graphically represents hierarchical data. The user can separate a portion of a tree control at a node and create a new tree control for viewing and editing. Changes to a newly created tree control propagate through to related tree controls.

Patent
28 Dec 1995
TL;DR: In this paper, a system for the creation and completion of goal oriented electronic forms is presented, which includes a form creation mode and a run-time mode for the user to specify a set of tree branches, tree nodes, and conclusions in association with fields of the form.
Abstract: A system for creation and completion of goal oriented electronic forms creates a graphical image data file which defines: a graphical image of a form for display and printing; a graphical image of tree branches, tree nodes, and conclusions in association with fields of the form; reading and writing links between form fields and data sources and destinations; and links to other forms which, with the original form, comprise a related stack of forms. The system includes a form creation mode and a run time mode. The trees are defined by an application developer using the form creation mode to establish both qualitative and quantitative relationships between the various fields on the forms thereby providing the basis for the goal oriented prompting for the application user using the run time mode.

Journal ArticleDOI
TL;DR: The authors introduce a classification tree to manage the relationships among different classes of layout structures and propose a method to recognize the layout structures of multiple kinds of table-form document images.
Abstract: Many approaches have reported that knowledge-based layout recognition methods are very successful in classifying the meaningful data from document images automatically. However, these approaches are applicable to only the same kind of documents because they are based on the paradigm that specifies the structure definition information in advance so as to be able to analyze a particular class of documents intelligently. In this paper, the authors propose a method to recognize the layout structures of multiple kinds of table-form document images. For this purpose, the authors introduce a classification tree to manage the relationships among different classes of layout structures. The authors' recognition system has two modes: layout knowledge acquisition and layout structure recognition. In the layout knowledge acquisition mode, table-form document images are distinguished according to this classification tree, and then the structure description trees, which specify the logical structures of table-form documents, are generated automatically. In the layout structure recognition mode, individual item fields in the table-form document images are extracted and classified successfully by searching the classification tree and interpreting the structure description tree.


Journal ArticleDOI
TL;DR: It is proved that performance improves remarkably when using a tree-based iterative method, which iteratively refines an alignment whenever two subalignments are merged in a tree-based way.
Abstract: Multiple sequence alignment is an important problem in the biosciences. To date, most multiple alignment systems have employed a tree-based algorithm, which combines the results of two-way dynamic programming in a tree-like order of sequence similarity. The alignment quality is not, however, high enough when the sequence similarity is low. Once an error occurs in the alignment process, that error can never be corrected. Recently, an effective new class of algorithms has been developed. These algorithms iteratively apply dynamic programming to partially aligned sequences to improve their alignment quality. The iteration corrects any errors that may have occurred in the alignment process. Such an iterative strategy requires heuristic search methods to solve practical alignment problems. Incorporating such methods yields various iterative algorithms. This paper reports our comprehensive comparison of iterative algorithms. We proved that performance improves remarkably when using a tree-based iterative method, which iteratively refines an alignment whenever two subalignments are merged in a tree-based way. We propose a tree-dependent, restricted partitioning technique to efficiently reduce the execution time of iterative algorithms.

Journal ArticleDOI
TL;DR: The load is distributed in more than one installment in an optimal manner to minimize the processing time; this is a deviation from, and an improvement over, earlier studies in which the load distribution is done in only one installment.
Abstract: This paper presents a new strategy for load distribution in a single-level tree network equipped with or without front-ends. The load is distributed in more than one installment in an optimal manner to minimize the processing time. This is a deviation from, and an improvement over, earlier studies in which the load distribution is done in only one installment. Recursive equations for the general case, and their closed-form solutions for a special case in which the network has identical processors and identical links, are derived. An asymptotic analysis of the network performance with respect to the number of processors and the number of installments is carried out. Discussions of the results in terms of some practical issues like the tradeoff relationship between the number of processors and the number of installments are also presented.

Journal ArticleDOI
TL;DR: A model for the optimal layout selection for a network with a tree structure is described, which includes the use of an efficient tree growing algorithm and the incorporation of redundant ‘genetic’ information within the ‘reproduction’ phase.
Abstract: A model for the optimal layout selection for a network with a tree structure is described. Such networks occur in sewerage, irrigation, water and gas supply and distribution, and in many other diverse areas of engineering. The model is based on Evolutionary Design and Genetic Algorithm principles. Novel features include the use of an efficient tree growing algorithm and the incorporation of redundant ‘genetic’ information within the ‘reproduction’ phase. Software performance is described and discussed using two examples.

Proceedings ArticleDOI
09 May 1995
TL;DR: A new search strategy, particularly effective for very large vocabulary word recognition, performs a tree-based, time-synchronous, left-to-right beam search that develops time-dependent acoustic and phonetic hypotheses.
Abstract: The paper presents a fast segmental Viterbi algorithm, a new search strategy particularly effective for very large vocabulary word recognition. It performs a tree-based, time-synchronous, left-to-right beam search that develops time-dependent acoustic and phonetic hypotheses. At any given time, it makes active a sub-word unit associated to an arc of a lexical tree only if that time is likely to be the boundary between the current and the next unit. This new technique, tested with a vocabulary of 188892 directory entries, achieves the same results obtained with the Viterbi algorithm, with a 35% speedup. Results are also presented for a 718-word, speaker-independent continuous speech recognition task.

Journal ArticleDOI
TL;DR: In this paper, a theory based on the concepts of hydrological equilibrium and ecological field theory is developed, which accounts for (occasional) non-linear behaviour of the flux/leaf area relationship in evergreen trees.
Abstract: Extrapolation of measurements of water use by individual trees to that for a stand of trees is a critical step in linking plant physiology and hydrology. Limitations in sampling resources and variation in tree sizes within a stand necessitate the use of some scaling relationship. Further, to scale tree water use in space as well as time, the relationship must reflect the changing availabilities of energy and water supply. It is argued here that tree leaf area is the most appropriate covariate of water use to achieve this aim. However, empirical results show that the relationship is not always linear. A theory is developed, based on the concepts of hydrological equilibrium (sensu Nemani and Running, 1989) and ecological field theory (Walker et al., 1989) which accounts for (occasional) non-linear behaviour of the flux/leaf area relationship in evergreen trees. A key feature of this theory is the notion of a non-linear, quasi-equilibrium reflecting plant water stress. An equation is derived from these concepts and a standard, explicit treatment of tree water use (Landsberg and McMurtrie, 1984), which is used to characterise this relationship. This equation has the form Q = aIA + bΨf s A f . The theory is tested against field data and published reports on Eucalyptus tree water use.


Journal ArticleDOI
TL;DR: The problem of computing a constrained editing distance between ordered labeled trees, which can be applied to pattern recognition, syntactic tree comparison, classification tree comparison and other applications, is considered.

Book ChapterDOI
01 Jan 1995
TL;DR: The concept of "on-the-fly" map generalization is very different from the implementation approaches described in the paper by Muller et al.: batch and interactive generalization (Chapter 1), so a special structure is proposed: the GAP-tree.
Abstract: The concept of "on-the-fly" map generalization is very different from the implementation approaches described in the paper by Muller et al.: batch and interactive generalization (Chapter 1). The term batch generalization is used for the process in which a computer gets an input dataset and returns an output dataset using algorithms, rules, or constraints (Lagrange et al., 1993) without the intervention of humans. Area partitioning poses some special problems when being generalized. In order to avoid gaps when not selecting small area features, a special structure is proposed: the GAP-tree. Section 9.3 describes two other reactive data structures, which will be used in combination with the new GAP-tree: the Reactive-tree and the BLG-tree. The implementation and test results are given in section 9.4, where both visual and numerical results are shown. Finally, conclusions and future work are summarized in section 9.5. -from Author