
Showing papers on "Tree (data structure)" published in 1993


Journal ArticleDOI
TL;DR: The theoretical basis of the minimum-evolution method of phylogenetic inference is presented by showing that the expectation of the sum of branch length estimates for the true tree is smallest among all possible trees, provided that the evolutionary distances used are statistically unbiased and that the branch lengths are estimated by the ordinary least-squares method.
Abstract: The minimum-evolution (ME) method of phylogenetic inference is based on the assumption that the tree with the smallest sum of branch length estimates is most likely to be the true one. In the past this assumption has been used without mathematical proof. Here we present the theoretical basis of this method by showing that the expectation of the sum of branch length estimates for the true tree is smallest among all possible trees, provided that the evolutionary distances used are statistically unbiased and that the branch lengths are estimated by the ordinary least-squares method. We also present simple mathematical formulas for computing branch length estimates and their standard errors for any unrooted bifurcating tree, with the least-squares approach. As a numerical example, we have analyzed mtDNA sequence data obtained by Vigilant et al. and have found the ME tree for 95 human and 1 chimpanzee (outgroup) sequences. The tree was somewhat different from the neighbor-joining tree constructed by Tamura and Nei, but there was no statistically significant difference between them.

621 citations


Proceedings ArticleDOI
01 Dec 1993
TL;DR: The authors report on an efficient adaptive N-body method which computes the forces on an arbitrary distribution of bodies in a time which scales as N log N with the particle number.
Abstract: The authors report on an efficient adaptive N-body method which we have recently designed and implemented. The algorithm computes the forces on an arbitrary distribution of bodies in a time which scales as N log N with the particle number. The accuracy of the force calculations is analytically bounded, and can be adjusted via a user defined parameter between a few percent relative accuracy, down to machine arithmetic accuracy. Instead of using pointers to indicate the topology of the tree, the authors identify each possible cell with a key. The mapping of keys into memory locations is achieved via a hash table. This allows the program to access data in an efficient manner across multiple processors. Performance of the parallel program is measured on the 512 processor Intel Touchstone Delta system. Comments on a number of wide-ranging applications which can benefit from application of this type of algorithm are included.
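The key-plus-hash-table idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the key layout (root key 1, three bits appended per level), the fixed tree depth, and the names `child_key`, `parent_key`, `key_for_point`, and `insert_body` are all assumptions made for the example, and a Python `dict` stands in for the hash table.

```python
# Sketch of pointer-free oct-tree cells addressed by integer keys.
# Assumed key layout (not taken from the abstract): the root has key 1,
# and a child's key is its parent's key with 3 octant bits appended.

ROOT_KEY = 1


def child_key(parent: int, octant: int) -> int:
    """Key of the child cell in the given octant (0..7)."""
    if not 0 <= octant < 8:
        raise ValueError("octant must be in 0..7")
    return (parent << 3) | octant


def parent_key(key: int) -> int:
    """Key of the parent cell; the root has no parent."""
    if key == ROOT_KEY:
        raise ValueError("root has no parent")
    return key >> 3


def level(key: int) -> int:
    """Depth of a cell: 0 for the root, 1 for its children, and so on."""
    return (key.bit_length() - 1) // 3


def key_for_point(x: float, y: float, z: float, depth: int) -> int:
    """Key of the depth-`depth` cell containing a point in the unit cube [0, 1)^3."""
    key = ROOT_KEY
    for _ in range(depth):
        x, y, z = 2 * x, 2 * y, 2 * z
        ix, iy, iz = int(x), int(y), int(z)
        x, y, z = x - ix, y - iy, z - iz
        key = child_key(key, (ix << 2) | (iy << 1) | iz)
    return key


def insert_body(cells: dict, pos: tuple, mass: float, depth: int) -> None:
    """Add a body's mass to its leaf cell and to every ancestor cell.

    `cells` is the hash table: it maps a key to that cell's summary data.
    """
    key = key_for_point(*pos, depth)
    while True:
        cell = cells.setdefault(key, {"mass": 0.0, "count": 0})
        cell["mass"] += mass
        cell["count"] += 1
        if key == ROOT_KEY:
            break
        key = parent_key(key)
```

Because parent and child keys follow from bit shifts alone, any cell can be looked up directly from its key without following pointers, which is the property the abstract credits for efficient access across processors. The sketch uses a fixed depth for brevity, whereas the method described is adaptive.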

457 citations


Journal ArticleDOI
TL;DR: A new spatially explicit model of forest dynamics is introduced that is constructed from submodels that predict an individual tree's growth, survival, dispersal, and recruitment, and submodels...
Abstract: We introduce a new spatially explicit model of forest dynamics. The model is constructed from submodels that predict an individual tree's growth, survival, dispersal, and recruitment, and submodels...

451 citations


Journal ArticleDOI
TL;DR: A package of programs (run by a management program called TREECON) was developed for the construction and drawing of evolutionary trees and the modules TREE, ROOT and DRAW are applicable to any kind of dissimilarity matrix.
Abstract: A package of programs (run by a management program called TREECON) was developed for the construction and drawing of evolutionary trees. The program MATRIX calculates dissimilarity values and can perform bootstrap analysis on nucleic acid sequences. TREE implements different evolutionary tree constructing methods based on distance matrices. Because some of these methods produce unrooted evolutionary trees, a program ROOT places a root on the tree. Finally, the program DRAW draws the evolutionary tree, changes its size or topology, and produces drawings suitable for publication. Whereas MATRIX is suited only for nucleic acids, the modules TREE, ROOT and DRAW are applicable to any kind of dissimilarity matrix. The programs run on IBM-compatible microcomputers using the DOS operating system.

419 citations


Journal ArticleDOI
TL;DR: A tree-based method for censored survival data is developed, based on maximizing the difference in survival between groups of patients represented by nodes in a binary tree; the method includes a pruning algorithm with optimal properties analogous to the classification and regression tree (CART) pruning algorithm.
Abstract: A tree-based method for censored survival data is developed, based on maximizing the difference in survival between groups of patients represented by nodes in a binary tree. The method includes a pruning algorithm with optimal properties analogous to the classification and regression tree (CART) pruning algorithm. Uniform convergence of the estimates of the conditional cumulative hazard and survival functions is discussed, and an example is given to show the utility of the algorithm for developing prognostic classifications for patients.

324 citations



Patent
Itsuko Kiuchi
30 Nov 1993
TL;DR: In this paper, the user selects an intended node from among the nodes on the displayed hierarchical tree and enters a concept for classifying nodes of sub-concepts of the selected nodes.
Abstract: In a method of displaying items of information organized in a hierarchical structure, a hierarchical tree of nodes is displayed, which are items of information, classified based on a specified concept. The user selects an intended node from among the nodes on the displayed hierarchical tree and enters a concept for classifying nodes of sub-concepts of the selected nodes. The selected node and a partial hierarchical tree that is created based on the entered concept are displayed.

191 citations


Journal ArticleDOI
TL;DR: In this paper, the central problem in one-to-many wide-area communications is forming the delivery tree, the collection of nodes and links that a multicast packet traverses.
Abstract: One of the central problems in one-to-many wide-area communications is forming the delivery tree - the collection of nodes and links that a multicast packet traverses. Significant problems remain t...

184 citations


Book ChapterDOI
01 Nov 1993
TL;DR: The basic idea of the algorithm is to use the heterogeneity of the search space for a density-based structuring and to employ this precomputed structure, a k-d tree, for efficient case retrieval according to a given similarity measure.
Abstract: Retrieval of cases is one important step within the case-based reasoning paradigm. We propose an improvement of this stage in the process model for finding most similar cases with an average effort of O(log₂ n), where n is the number of cases. The basic idea of the algorithm is to use the heterogeneity of the search space for a density-based structuring and to employ this precomputed structure, a k-d tree, for efficient case retrieval according to a given similarity measure. Besides illustrating the basic idea, we present empirical results of a comparison of four different k-d tree generating strategies and introduce the notion of dynamic bounds which significantly reduce the retrieval effort. The presented approach is fully implemented and used within two case-based reasoning systems for classification and diagnostic tasks, Patdex and Inreca.
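Retrieval of the most similar case from a precomputed k-d tree, as described in the abstract above, can be illustrated with a textbook sketch. It assumes a median split on cycling axes and squared Euclidean distance as the (dis)similarity measure; it does not reproduce the paper's density-based generating strategies or its dynamic bounds, and the names `Node`, `build`, and `nearest` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # the case stored at this node
    axis: int                       # attribute used to split at this node
    left: Optional["Node"] = None   # cases with smaller value on `axis`
    right: Optional["Node"] = None  # cases with larger or equal value


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting at the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance, standing in for a similarity measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to `query`."""
    if node is None:
        return best
    if best is None or sq_dist(node.point, query) < sq_dist(best, query):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that subtree cannot contain a
    # closer case and is skipped.
    if diff * diff < sq_dist(best, query):
        best = nearest(far, query, best)
    return best
```

For example, `nearest(build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]), (9, 2))` returns `(8, 1)`. Skipping subtrees is what keeps the average retrieval effort logarithmic for well-distributed cases.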

150 citations


Journal ArticleDOI
TL;DR: This paper compares the performance of logistic regression to decision-tree induction in classifying patients as having acute cardiac ischemia using the database of 5773 patients originally used to develop the logistic-regression tool and test it prospectively.

145 citations


Journal ArticleDOI
TL;DR: This algorithm dynamically searches for the globally optimal solution on a minimum solution tree which is a subtree of the solution tree used in the traditional branch and bound algorithm.

Proceedings ArticleDOI
19 Apr 1993
TL;DR: A class of tree structures, called generalization trees, that can be applied efficiently to compute spatial joins in a hierarchical manner are described.
Abstract: Spatial joins are join operations that involve spatial data types and operators. Due to basic properties of spatial data, many conventional join strategies suffer serious performance penalties or are not applicable at all. The join strategies known from conventional databases that can be applied to spatial joins and the ways in which some of these techniques can be modified to be more efficient in the context of spatial data are discussed. A class of tree structures, called generalization trees, that can be applied efficiently to compute spatial joins in a hierarchical manner are described. The performances of the most promising strategies are analytically modeled and compared.

Journal ArticleDOI
TL;DR: A simple index is proposed for the agreement of different estimation methods that can serve as a measure of the reliability of the joint estimate and was coupled with an unbiased character weighting procedure to increase the accuracy of estimation.
Abstract: Three methods of phylogenetic tree estimation, UPGMA clustering (UP), maximum parsimony (MP), and neighbor joining (NJ), were used to estimate trees from a large simulation study of 5,400 eight-taxon data sets. The data sets represented nine evolutionary models and 20 tree topologies. The agreement of the trees estimated by the three methods was highly correlated (0.71) with the average agreement with the true tree from which the data sets had been generated. A simple index is proposed for the agreement of different estimation methods that can serve as a measure of the reliability of the joint estimate. This index was coupled with an unbiased character weighting procedure to increase the accuracy of estimation. Accuracy was increased by an average of 24% by use of this procedure in 47 of 61 data sets examined

Patent
06 Oct 1993
TL;DR: In this article, a system for the creation and completion of goal oriented electronic forms is presented, which includes a form creation mode and a run-time mode for the user to specify a set of tree branches, tree nodes, and conclusions in association with fields of the form.
Abstract: A system for creation and completion of goal oriented electronic forms creates a graphical image data file which defines: a graphical image of a form for display and printing; a graphical image of tree branches, tree nodes, and conclusions in association with fields of the form; reading and writing links between form fields and data sources and destinations; and links to other forms which, with the original form, comprise a related stack of forms. The system includes a form creation mode and a run time mode. The trees are defined by an application developer using the form creation mode to establish both qualitative and quantitative relationships between the various fields on the forms thereby providing the basis for the goal oriented prompting for the application user using the run time mode.

Journal ArticleDOI
TL;DR: It is shown that solving the path-based problem formulations by decomposition results in substantially fewer master problem iterations and lower CPU times than by using decomposition on the equivalent tree-based formulations.
Abstract: This paper investigates the impact of problem formulation on Dantzig--Wolfe decomposition for the multicommodity network flow problem. These problems are formulated in three ways: origin-destination specific, destination specific, and product specific. The path-based origin-destination specific formulation is equivalent to the tree-based destination specific formulation by a simple transformation. Supersupply and superdemand nodes are appended to the tree-based product specific formulation to create an equivalent path-based product specific formulation. We show that solving the path-based problem formulations by decomposition results in substantially fewer master problem iterations and lower CPU times than by using decomposition on the equivalent tree-based formulations. Computational results on a series of multicommodity network flow problems are presented.

Proceedings ArticleDOI
01 Jul 1993
TL;DR: Two performance-driven Steiner tree algorithms for global routing are presented which consider the minimization of timing delay during the tree construction as the goal and are based on nonlinear optimization method and heuristic approach.
Abstract: This paper presents two performance-driven Steiner tree algorithms for global routing which consider the minimization of timing delay during the tree construction as the goal. One algorithm is based on a nonlinear optimization method; the other uses a heuristic approach to guide the construction of the Steiner tree. A new timing model is established which includes both total length and critical path between source and sink in the delay formulation, and an upper bound for timing delay is deduced and used to guide both algorithms. Experimental results are given to demonstrate the effectiveness of the two algorithms.

Journal ArticleDOI
TL;DR: A new method for weighting characters according to their homoplasy is proposed; the method is non-iterative and does not require independent estimations of weights.

Journal ArticleDOI
TL;DR: The generalized ID3 (GID3) algorithm, which takes a training set of experimental data and produces a decision tree that predicts the outcome of future experiments under various, more general conditions, is described.
Abstract: The generalized ID3 (GID3) algorithm, which takes a training set of experimental data and produces a decision tree that predicts the outcome of future experiments under various, more general conditions, is described. The tree can then be translated into a set of rules for an expert system. Two extensions to GID3, RIST and KARSM, that deal with the problems of noisy data and the limited availability of training data are discussed. The application of GID3 to reactive ion etching manufacturing process diagnosis and optimization and to knowledge acquisition for an expert system is described.

Journal ArticleDOI
TL;DR: The authors investigated the relationship between the size of a decision tree consistent with some training data and the accuracy of the tree on test data and found that smaller decision trees are on average less accurate than the average accuracy of slightly larger trees.
Abstract: We report on a series of experiments in which all decision trees consistent with the training data are constructed. These experiments were run to gain an understanding of the properties of the set of consistent decision trees and the factors that affect the accuracy of individual trees. In particular, we investigated the relationship between the size of a decision tree consistent with some training data and the accuracy of the tree on test data. The experiments were performed on a massively parallel Maspar computer. The results of the experiments on several artificial and two real world problems indicate that, for many of the problems investigated, smaller consistent decision trees are on average less accurate than the average accuracy of slightly larger trees.

Patent
Florin Oprescu
16 Dec 1993
TL;DR: In this article, a hierarchical tree structure where there is only one root node is proposed and a signaling scheme is developed in which nodes, via on-board communications hardware, signal all connected nodes and respond accordingly until hierarchical relationships are established.
Abstract: A system and method are described which take an arbitrarily assembled collection of nodes on a bus or network and impose an optimized hierarchical tree structure where there is only one root node. Nodes having both parent and child nodes are considered branch nodes while nodes having only parent nodes are leaf nodes. Loops or cycles in the physical topology are resolved into a logical topology that is acyclic and directed. A signaling scheme is developed in which nodes, via on-board communications hardware, signal all connected nodes and respond accordingly until hierarchical relationships are established. Cycles are resolved by intelligently breaking links to yield an acyclic graph. Direction is established by each node recognizing its parent/child status with respect to connected nodes until a single node is established as a root node.
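The end result described in the abstract above (a cyclic physical topology resolved into a directed, acyclic tree with a single root) can be illustrated with a small sketch. Note the substitution: the patent describes a distributed signaling scheme between nodes, whereas this example uses a centralized breadth-first search purely to show the resulting parent/child structure. The function names `impose_tree` and `classify`, and the caller-chosen root, are assumptions for illustration.

```python
from collections import deque


def impose_tree(links, root):
    """Return a parent map that turns an undirected graph into a rooted tree.

    `links` is an iterable of (a, b) pairs describing physical connections,
    which may contain cycles. A link that would close a cycle is simply
    never adopted as a parent/child edge, which logically "breaks" it.
    """
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    parent = {root: None}  # the root is the only node without a parent
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorted only to make the result deterministic in this sketch.
        for neighbour in sorted(adjacency.get(node, ())):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return parent


def classify(parent):
    """Label each node as root, branch (has parent and child), or leaf."""
    has_child = {p for p in parent.values() if p is not None}
    roles = {}
    for node, p in parent.items():
        if p is None:
            roles[node] = "root"
        elif node in has_child:
            roles[node] = "branch"
        else:
            roles[node] = "leaf"
    return roles
```

With links `A-B`, `B-C`, `C-A`, `C-D` and root `A`, the cycle is broken at `B-C`: `B` and `C` become children of `A`, and `D` becomes a child of `C`, so `C` is a branch node while `B` and `D` are leaves.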

Proceedings ArticleDOI
24 Aug 1993
TL;DR: Fractal approaches to the problems associated with visualizing huge hierarchies are described and a prototype visualization system for UNIX directories is shown.
Abstract: This paper describes fractal approaches to the problems associated with visualizing huge hierarchies. The geometrical characteristic of a fractal, self-similarity, allows users to visually interact with a huge tree in the same manner at every level of the tree. The fractal dimension, a measure of complexity, makes it possible to control the total number of displayed nodes. A prototype visualization system for UNIX directories is also shown.

Patent
17 Jun 1993
TL;DR: In this paper, the authors present a database system for organizing large amounts of data to be accessed by a digital computer, in the form of a summarized, multikey tree, built from files stored on the computer.
Abstract: The subject invention is directed to a database system for organizing large amounts of data to be accessed by a digital computer. More particularly, a free form type database, in the form of a summarized, multikey tree, is built from files stored on the computer. After a building operation, the user obtains specified information by using the summarized database. Information in the files is divided into three categories: a dimension field which comprises data to be organized, a summary field which comprises a numeric quantity on which calculations can be performed, and a non-summary field which comprises other information associated with an input record. The internal nodes of the tree summarize and organize sets of input records. Methods are provided for reducing the amount of storage space used by cutting off the tree when the size of a set goes below a given threshold, and sharing parts of the tree so that each record does not appear n! times in the database.

Journal ArticleDOI
TL;DR: This article proposes a sampling method in which the genetic criterion is taken as the most important: samples created with this method will reflect optimally the diversity of the languages of the world.
Abstract: In recent years more attention is being paid to the quality of language samples in typological work. Without an adequate sampling strategy, samples may suffer from various kinds of bias. In this article we propose a sampling method in which the genetic criterion is taken as the most important: samples created with this method will reflect optimally the diversity of the languages of the world. On the basis of the internal structure of each genetic language tree a measure is computed that reflects the linguistic diversity in the language families represented by these trees. This measure is used to determine how many languages from each phylum should be selected, given any required sample size.

Patent
19 Jan 1993
TL;DR: In this paper, the authors propose a self-generating nodal network for communicating in a client/server system wherein the method includes the steps of creating a server nodal tree, which includes both process steps for communicating to an operating system and service nodes.
Abstract: Method for building a self-generating nodal network for communicating in a client/server system, wherein the method includes the step of creating a server nodal network tree, which includes the steps of generating a server root node that includes both process steps for communicating to an operating system and service nodes, and process steps for building service nodes which correspond to servers within the client/server system; each service node includes both process steps for advertising a service to the server root node and process steps for building a topic node, which includes both process steps for accessing a server and process steps for building a job node for storing a job request. The method also includes the step of creating a client nodal network tree, which includes the steps of generating a client root node that includes both process steps for communicating to an operating system and client service nodes, and process steps for building client service nodes corresponding to each service node of the server nodal network tree; each client service node includes both process steps for communicating to an application program and process steps to create a job request in accordance with a job request designated by the application program, wherein the client service node receiving a job request from the application program propagates the request back through the client nodal network tree to the server nodal network tree for execution.

Proceedings ArticleDOI
01 Nov 1993
TL;DR: An image compression algorithm based on optimal bit rate allocation between scalar and tree-structured quantizers is proposed, and achieves excellent coding efficiency in the rate-distortion sense.
Abstract: Wavelet image decompositions generate a tree-structured set of coefficients, providing a hierarchical data structure for representing images. While early wavelet-based algorithms for image compression concentrated on optimal quantization of wavelet coefficients, several recent researchers have proposed approaches which couple coefficient quantization (either scalar or vector-based) with various strategies for quantizing the tree itself. This paper proposes an image compression algorithm based on optimal bit rate allocation between scalar and tree-structured quantizers. A predictive approach to representing the pruned tree structure is presented, and the entropy of this representation is included in the optimal allocation problem. The algorithm couples Lagrangian optimization of scalar quantizers with a marginal analysis approach for optimizing the tree structure, and achieves excellent coding efficiency in the rate-distortion sense.

Proceedings Article
24 Aug 1993
TL;DR: Experimental results show that, while dynamic programming produces the best plans, simple heuristics often do nearly as well as dynamic programming, and the advantages of bushy execution trees over more restricted tree shapes are highlighted.
Abstract: This paper looks at the problem of multi-join query optimization for symmetric multiprocessors. Optimization algorithms based on dynamic programming and greedy heuristics are described that, unlike traditional methods, include memory resources and pipelining in their cost model. An analytical model is presented and used to compare the quality of plans produced by each optimization algorithm. Experimental results show that, while dynamic programming produces the best plans, simple heuristics often do nearly as well. The same results are also used to highlight the advantages of bushy execution trees over more restricted tree shapes.

Patent
12 Jul 1993
TL;DR: In this paper, the problems of efficiently building a large software system are solved by the present invention of language scoping for effective configuration descriptions, where a software system is defined by a tree of system models which are written in a functional language.
Abstract: The problems of efficiently building a large software system are solved by the present invention of language scoping for effective configuration descriptions. A software system is defined by a tree of system models which are written in a functional language. The functional language provides a unique combination of static and dynamic scoping. This combination is required to write modular, flexible, and concise, yet complete, configuration descriptions.

Proceedings ArticleDOI
20 Sep 1993
TL;DR: The paper assesses the strengths and weaknesses of the PMH model as a generic model and recommends a parameterized generic model that captures the features of diverse computer architectures to facilitate the development of portable programs.
Abstract: A parameterized generic model that captures the features of diverse computer architectures would facilitate the development of portable programs. Specific models appropriate to particular computers are obtained by specifying parameters of the generic model. A generic model should be simple, and for each machine that it is intended to represent, it should have a reasonably accurate specific model. The Parallel Memory Hierarchy (PMH) model of computation uses a single mechanism to model the costs of both interprocessor communication and memory hierarchy traffic. A computer is modeled as a tree of memory modules with processors at the leaves. All data movement takes the form of block transfers between children and their parents. The paper assesses the strengths and weaknesses of the PMH model as a generic model.

Journal ArticleDOI
TL;DR: A broad overview is presented of quantitative methods, based on bibliometric data, for constructing specific indicators for important aspects of scientific research, the main types being characteristics of scientific output, characteristics of scientific impact, and maps of science.
Abstract: A broad overview is presented of quantitative methods, based on bibliometric data. Specific indicators for important aspects of scientific research can be constructed, the main types being characteristics of scientific output; characteristics of scientific impact; and maps of science. Copyright , Beech Tree Publishing.

Patent
Satyajit Rao, James V. Mahoney
24 Nov 1993
TL;DR: In this article, the authors define an input image set that shows a node-link structure, such as a directed graph, an undirected graph, a tree, a flow chart, a circuit diagram, or a state-transition diagram.
Abstract: Input image data define an input image set that shows a node-link structure, such as a directed graph, an undirected graph, a tree, a flow chart, a circuit diagram, or a state-transition diagram. The input image set can include one image showing the node-link structure or two images, one showing graphical features that are a subset of the nodes and the other an image of an overlay with editing marks that include the links and another subset of the nodes. The input image data are used to obtain likely node-link data indicating parts of the input image set that satisfy a constraint on nodes and parts that satisfy a constraint on links. The likely node-link data are used to obtain constrained node-link data indicating subsets of the likely nodes and links that satisfy a constraint on node-link structures. The likely node-link data can include data defining a likely node image showing parts that meet a node criterion and data defining a likely link image showing parts that meet a link criterion. The constrained node-link data can be obtained by iteratively applying a link nearness criterion to the likely nodes and a node nearness criterion to the likely links until stability is reached. The constrained node-link data can be used to obtain output image data defining an output image that includes a precisely formed version of the node-link structure or an edited version of an input image. Or the constrained node-link data can be used to provide control signals to a system.