
Showing papers on "Tree (data structure)" published in 2008


Journal ArticleDOI
TL;DR: The Dynamic Tree Cut R package implements novel dynamic branch cutting methods that detect clusters in a dendrogram depending on their shape and can optionally combine the advantages of hierarchical clustering and partitioning around medoids, giving better detection of outliers.
Abstract: Summary: Hierarchical clustering is a widely used method for detecting clusters in genomic data. Clusters are defined by cutting branches off the dendrogram. A common but inflexible method uses a constant height cutoff value; this method exhibits suboptimal performance on complicated dendrograms. We present the Dynamic Tree Cut R package that implements novel dynamic branch cutting methods for detecting clusters in a dendrogram depending on their shape. Compared to the constant height cutoff method, our techniques offer the following advantages: (1) they are capable of identifying nested clusters; (2) they are flexible—cluster shape parameters can be tuned to suit the application at hand; (3) they are suitable for automation; and (4) they can optionally combine the advantages of hierarchical clustering and partitioning around medoids, giving better detection of outliers. We illustrate the use of these methods by applying them to protein–protein interaction network data and to a simulated gene expression data set. Availability: The Dynamic Tree Cut method is implemented in an R package available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting Contact: stevitihit@yahoo.com Supplementary information: Supplementary data are available at Bioinformatics online.

1,661 citations
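
For contrast with the dynamic methods, the constant height cutoff that the abstract calls inflexible can be sketched in a few lines. This is not the package's algorithm: the function name, the merge-list format `(a, b, height)` and the assumption that merge heights never decrease are all illustrative choices made here.

```python
def cut_at_height(n_leaves, merges, height):
    """Assign cluster labels by cutting a dendrogram at one fixed height.

    merges[k] = (a, b, h) joins clusters a and b at height h and creates
    cluster id n_leaves + k (hypothetical format; heights assumed monotone).
    """
    parent = list(range(n_leaves + len(merges)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for k, (a, b, h) in enumerate(merges):
        if h <= height:  # keep only the merges below the cutoff
            new_id = n_leaves + k
            parent[find(a)] = new_id
            parent[find(b)] = new_id

    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n_leaves)]


# Toy dendrogram over 4 leaves (made-up heights).
toy_merges = [(0, 1, 2.0), (2, 4, 6.0), (3, 5, 10.0)]
```

Every branch is cut at the same level, so a tight cluster nested inside a loose one cannot be separated without also splitting the loose one, which is the limitation the shape-dependent methods are meant to address.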


Proceedings Article
08 Dec 2008
TL;DR: A fast hierarchical language model is introduced, along with a simple feature-based algorithm for automatically constructing word trees from the data; the resulting models can outperform non-hierarchical neural models as well as the best n-gram models.
Abstract: Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models. The main drawback of NPLMs is their extremely long training and testing times. Morin and Bengio have proposed a hierarchical language model built around a binary tree of words, which was two orders of magnitude faster than the non-hierarchical model it was based on. However, it performed considerably worse than its non-hierarchical counterpart in spite of using a word tree created using expert knowledge. We introduce a fast hierarchical language model along with a simple feature-based algorithm for automatic construction of word trees from the data. We then show that the resulting models can outperform non-hierarchical neural models as well as the best n-gram models.

989 citations
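
The speed advantage of a binary word tree comes from replacing one normalization over the whole vocabulary with a short sequence of left/right decisions. The sketch below illustrates only that idea: the four-word tree, the node names and the fixed node scores are hypothetical, and a real model would compute the scores from the context words.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical toy tree: each word is reached from the root by a path of
# (internal_node, go_left) decisions.
PATHS = {
    "the": [("root", True), ("n1", True)],
    "cat": [("root", True), ("n1", False)],
    "sat": [("root", False), ("n2", True)],
    "mat": [("root", False), ("n2", False)],
}


def word_probability(word, node_scores):
    """P(word) as the product of binary decisions along its tree path."""
    p = 1.0
    for node, go_left in PATHS[word]:
        p_left = sigmoid(node_scores[node])
        p *= p_left if go_left else 1.0 - p_left
    return p


# Made-up scores standing in for context-dependent model outputs.
scores = {"root": 0.3, "n1": -1.2, "n2": 2.0}
total = sum(word_probability(w, scores) for w in PATHS)
```

Because each internal node splits its probability mass between two children, the word probabilities sum to one without any explicit normalization, and evaluating one word touches only the nodes on its path.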


Journal ArticleDOI
TL;DR: HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time, and it is concluded that HMC trees should definitely be considered in HMC tasks where interpretable models are desired.
Abstract: Hierarchical multi-label classification (HMC) is a variant of classification where instances may belong to multiple classes at the same time and these classes are organized in a hierarchy. This article presents several approaches to the induction of decision trees for HMC, as well as an empirical study of their use in functional genomics. We compare learning a single HMC tree (which makes predictions for all classes together) to two approaches that learn a set of regular classification trees (one for each class). The first approach defines an independent single-label classification task for each class (SC). Obviously, the hierarchy introduces dependencies between the classes. While they are ignored by the first approach, they are exploited by the second approach, named hierarchical single-label classification (HSC). Depending on the application at hand, the hierarchy of classes can be such that each class has at most one parent (tree structure) or such that classes may have multiple parents (DAG structure). The latter case has not been considered before and we show how the HMC and HSC approaches can be modified to support this setting. We compare the three approaches on 24 yeast data sets using as classification schemes MIPS's FunCat (tree structure) and the Gene Ontology (DAG structure). We show that HMC trees outperform HSC and SC trees along three dimensions: predictive accuracy, model size, and induction time. We conclude that HMC trees should definitely be considered in HMC tasks where interpretable models are desired.

616 citations


Proceedings ArticleDOI
17 May 2008
TL;DR: A simple variant of the k-d tree which automatically adapts to intrinsic low dimensional structure in data without having to explicitly learn this structure is presented.
Abstract: We present a simple variant of the k-d tree which automatically adapts to intrinsic low dimensional structure in data without having to explicitly learn this structure.

435 citations


Proceedings Article
22 Oct 2008
TL;DR: This paper puts forward Monte-Carlo Tree Search as a novel, unified framework to game AI, and demonstrates that it can be applied effectively to classic board-games, modern board-games, and video games.
Abstract: Classic approaches to game AI require either a high quality of domain knowledge, or a long time to generate effective AI behaviour. These two characteristics hamper the goal of establishing challenging game AI. In this paper, we put forward Monte-Carlo Tree Search as a novel, unified framework to game AI. In the framework, randomized explorations of the search space are used to predict the most promising game actions. We will demonstrate that Monte-Carlo Tree Search can be applied effectively to (1) classic board-games, (2) modern board-games, and (3) video games.

336 citations
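
The abstract does not say how the search balances trying new actions against revisiting promising ones. A common choice in Monte-Carlo Tree Search is the UCB1 rule, sketched here as an assumption rather than as the paper's method; the `Node` class, the constant `c` and the single-perspective reward are illustrative simplifications.

```python
import math


class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = {}       # action -> Node
        self.visits = 0
        self.total_reward = 0.0


def select_child(node, c=1.4):
    """Return the action with the highest UCB1 score; unvisited actions first."""
    best_action, best_score = None, -math.inf
    for action, child in node.children.items():
        if child.visits == 0:
            return action
        mean = child.total_reward / child.visits
        bonus = c * math.sqrt(math.log(node.visits) / child.visits)
        if mean + bonus > best_score:
            best_action, best_score = action, mean + bonus
    return best_action


def backpropagate(node, reward):
    """Add one playout result to every node from the leaf up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += reward
        node = node.parent
```

A full search loop would repeat selection, expansion, a random playout and backpropagation; a two-player game would also flip the reward sign at alternating depths, which this sketch leaves out.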


Journal ArticleDOI
TL;DR: The design of the Word Tree is described, along with some of the technical issues that arise in its implementation, and the results of several months of public deployment of word trees on Many Eyes are discussed, which provides a window onto the ways in which users obtain value from the visualization.
Abstract: We introduce the Word Tree, a new visualization and information-retrieval technique aimed at text documents. A Word Tree is a graphical version of the traditional "keyword-in-context" method, and enables rapid querying and exploration of bodies of text. In this paper we describe the design of the technique, along with some of the technical issues that arise in its implementation. In addition, we discuss the results of several months of public deployment of word trees on Many Eyes, which provides a window onto the ways in which users obtain value from the visualization.

330 citations
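
The data structure behind a keyword-in-context tree can be sketched as a trie of the words that follow each occurrence of a chosen root word. This is an illustrative reconstruction, not the Many Eyes implementation: the function name, the `depth` parameter and the whitespace tokenization are assumptions.

```python
def build_word_tree(text, root_word, depth=3):
    """Trie of the up-to-`depth` words following each occurrence of root_word."""
    words = text.lower().split()   # naive tokenization; punctuation is not handled
    tree = {}
    for i, word in enumerate(words):
        if word != root_word:
            continue
        node = tree
        for nxt in words[i + 1 : i + 1 + depth]:
            entry = node.setdefault(nxt, {"count": 0, "children": {}})
            entry["count"] += 1
            node = entry["children"]
    return tree


sample = "if love be rough with you be rough with love if love be blind"
tree = build_word_tree(sample, "love")
```

Shared continuations collapse into one branch with a count, which is what lets a visualization scale the font size of a branch by how often that phrase occurs.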


Book ChapterDOI
15 Sep 2008
TL;DR: This paper compares the performance of several popular decision tree splitting criteria, identifies Hellinger distance as a new skew-insensitive measure, proposes its application in forming decision trees, and performs a comprehensive comparative analysis of the decision tree construction methods.
Abstract: Learning from unbalanced datasets presents a convoluted problem in which traditional learning algorithms may perform poorly. The objective functions used for learning the classifiers typically tend to favor the larger, less important classes in such problems. This paper compares the performance of several popular decision tree splitting criteria (information gain, Gini measure, and DKM) and identifies a new skew-insensitive measure in Hellinger distance. We outline the strengths of Hellinger distance in class imbalance, propose its application in forming decision trees, and perform a comprehensive comparative analysis between each decision tree construction method. In addition, we consider the performance of each tree within a powerful sampling wrapper framework to capture the interaction of the splitting metric and sampling. We evaluate over a wide range of datasets and determine which operate best under class imbalance.

247 citations
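
The abstract does not give the formula, so the sketch below uses the form of the Hellinger distance usually quoted for two-class splits: the distance between the distribution of positives and the distribution of negatives over the branches of a candidate split. Treat the exact form and the function name as assumptions.

```python
import math


def hellinger_split_value(branches):
    """Hellinger distance of a candidate split.

    branches: list of (positive_count, negative_count), one pair per child.
    Each class is normalized by its own total, so class priors cancel out.
    """
    total_pos = sum(p for p, _ in branches)
    total_neg = sum(n for _, n in branches)
    if total_pos == 0 or total_neg == 0:
        return 0.0   # only one class present: nothing to separate
    return math.sqrt(sum(
        (math.sqrt(p / total_pos) - math.sqrt(n / total_neg)) ** 2
        for p, n in branches
    ))
```

The value ranges from 0 for a split that distributes both classes identically to the square root of 2 for a perfect split. Since only within-class fractions enter the formula, multiplying the number of negatives by ten leaves the score unchanged, which is the sense in which the measure is insensitive to skew.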


Book ChapterDOI
20 May 2008
TL;DR: A tree-based mining algorithm is proposed to efficiently find frequent patterns from uncertain data, where each item in the transactions is associated with an existential probability.
Abstract: Many frequent pattern mining algorithms find patterns from traditional transaction databases, in which the content of each transaction--namely, items--is definitely known and precise. However, there are many real-life situations in which the content of transactions is uncertain. To deal with these situations, we propose a tree-based mining algorithm to efficiently find frequent patterns from uncertain data, where each item in the transactions is associated with an existential probability. Experimental results show the efficiency of our proposed algorithm.

228 citations
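
With existential probabilities attached to items, the count of a pattern is commonly replaced by its expected support. The sketch below assumes that definition and assumes items within a transaction are independent; neither is stated in the abstract, and the brute-force scan stands in for the tree-based algorithm, which is not reproduced here.

```python
from math import prod


def expected_support(itemset, transactions):
    """Sum, over transactions, of the probability that all items are present.

    transactions: list of dicts mapping item -> existential probability.
    Assumes independent items within a transaction (illustrative assumption).
    """
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += prod(t[item] for item in itemset)
    return total


# Made-up uncertain database with three transactions.
db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.4},
    {"a": 1.0, "b": 0.2, "c": 0.7},
]
```

A pattern would then be called frequent when its expected support reaches a user-chosen threshold, in the same way that a plain count is compared with a minimum support in precise data.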


Journal ArticleDOI
TL;DR: A novel approach is described for autonomous and incremental learning of motion pattern primitives by observation of human motion; the patterns are abstracted into a dynamic stochastic model that serves both recognition and generation, analogous to the mirror neuron hypothesis in primates.
Abstract: This paper describes a novel approach for autonomous and incremental learning of motion pattern primitives by observation of human motion. Human motion patterns are abstracted into a dynamic stochastic model, which can be used for both subsequent motion recognition and generation, analogous to the mirror neuron hypothesis in primates. The model size is adaptable based on the discrimination requirements in the associated region of the current knowledge base. A new algorithm for sequentially training the Markov chains is developed, to reduce the computation cost during model adaptation. As new motion patterns are observed, they are incrementally grouped together using hierarchical agglomerative clustering based on their relative distance in the model space. The clustering algorithm forms a tree structure, with specialized motions at the tree leaves, and generalized motions closer to the root. The generated tree structure will depend on the type of training data provided, so that the most specialized motions will be those for which the most training has been received. Tests with motion capture data for a variety of motion primitives demonstrate the efficacy of the algorithm.

226 citations


Proceedings Article
08 Dec 2008
TL;DR: This work provides an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT) and employs a tree data structure, resulting in four evaluation methods whose performance varies based on the distribution of sources and targets and input parameters such as desired accuracy and bandwidth.
Abstract: Many machine learning algorithms require the summation of Gaussian kernel functions, an expensive operation if implemented straightforwardly. Several methods have been proposed to reduce the computational complexity of evaluating such sums, including tree and analysis based methods. These achieve varying speedups depending on the bandwidth, dimension, and prescribed error, making the choice between methods difficult for machine learning tasks. We provide an algorithm that combines tree methods with the Improved Fast Gauss Transform (IFGT). As originally proposed the IFGT suffers from two problems: (1) the Taylor series expansion does not perform well for very low bandwidths, and (2) parameter selection is not trivial and can drastically affect performance and ease of use. We address the first problem by employing a tree data structure, resulting in four evaluation methods whose performance varies based on the distribution of sources and targets and input parameters such as desired accuracy and bandwidth. To solve the second problem, we present an online tuning approach that results in a black box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth. In addition, the new IFGT parameter selection approach allows for tighter error bounds. Our approach chooses the fastest method at negligible additional cost, and has superior performance in comparisons with previous approaches.

218 citations


Patent
03 Oct 2008
TL;DR: A technique for organizing data to facilitate data deduplication divides a block-based set of data into multiple “chunks”, where the chunk boundaries are independent of the block boundaries (due to the hashing algorithm).
Abstract: A technique for organizing data to facilitate data deduplication includes dividing a block-based set of data into multiple “chunks”, where the chunk boundaries are independent of the block boundaries (due to the hashing algorithm). Metadata of the data set, such as block pointers for locating the data, are stored in a tree structure that includes multiple levels, each of which includes at least one node. The lowest level of the tree includes multiple nodes that each contain chunk metadata relating to the chunks of the data set. In each node of the lowest level of the buffer tree, the chunk metadata contained therein identifies at least one of the chunks. The chunks (user-level data) are stored in one or more system files that are separate from the buffer tree and not visible to the user.

Journal Article
TL;DR: In this paper, three parallelization methods for Monte-Carlo Tree Search (MCTS) are discussed: leaf parallelization, root parallelization, and tree parallelization (tree parallelization requires two techniques: adequate handling of local mutexes and virtual loss).
Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search method that started a revolution in the field of Computer Go. Parallelizing MCTS is an important way to increase the strength of any Go program. In this article, we discuss three parallelization methods for MCTS: leaf parallelization, root parallelization, and tree parallelization. To be effective, tree parallelization requires two techniques: adequate handling of (1) local mutexes and (2) virtual loss. Experiments in 13×13 Go reveal that in the program Mango root parallelization may lead to the best results for a specific time setting and specific program parameters. However, as soon as the selection mechanism is able to handle more adequately the balance of exploitation and exploration, tree parallelization should have attention too and could become a second choice for parallelizing MCTS. Preliminary experiments on the smaller 9×9 board provide promising prospects for tree parallelization.

Journal ArticleDOI
TL;DR: It is found that proposals producing topology changes as a side effect of branch length changes (LOCAL and Continuous Change) consistently perform worse than those involving stochastic branch rearrangements (nearest neighbor interchange, subtree pruning and regrafting, tree bisection and reconnection, or subtree swapping).
Abstract: The main limiting factor in Bayesian MCMC analysis of phylogeny is typically the efficiency with which topology proposals sample tree space. Here we evaluate the performance of seven different proposal mechanisms, including most of those used in current Bayesian phylogenetics software. We sampled 12 empirical nucleotide data sets--ranging in size from 27 to 71 taxa and from 378 to 2,520 sites--under difficult conditions: short runs, no Metropolis-coupling, and an oversimplified substitution model producing difficult tree spaces (Jukes Cantor with equal site rates). Convergence was assessed by comparison to reference samples obtained from multiple Metropolis-coupled runs. We find that proposals producing topology changes as a side effect of branch length changes (LOCAL and Continuous Change) consistently perform worse than those involving stochastic branch rearrangements (nearest neighbor interchange, subtree pruning and regrafting, tree bisection and reconnection, or subtree swapping). Among the latter, moves that use an extension mechanism to mix local with more distant rearrangements show better overall performance than those involving only local or only random rearrangements. Moves with only local rearrangements tend to mix well but have long burn-in periods, whereas moves with random rearrangements often show the reverse pattern. Combinations of moves tend to perform better than single moves. The time to convergence can be shortened considerably by starting with a good tree, but this comes at the cost of compromising convergence diagnostics based on overdispersed starting points. Our results have important implications for developers of Bayesian MCMC implementations and for the large group of users of Bayesian phylogenetics software.

Journal ArticleDOI
TL;DR: A fast updated FP-tree (FUFP-tree) structure is proposed, which makes the tree update process easier, and an incremental FUFP-tree maintenance algorithm is also proposed for reducing the execution time in reconstructing the tree when new transactions are inserted.
Abstract: The frequent-pattern-tree (FP-tree) is an efficient data structure for association-rule mining without generation of candidate itemsets. It was used to compress a database into a tree structure which stored only large items. It, however, needed to process all transactions in a batch way. In real-world applications, new transactions are usually inserted into databases. In this paper, we thus attempt to modify the FP-tree construction algorithm for efficiently handling new transactions. A fast updated FP-tree (FUFP-tree) structure is proposed, which makes the tree update process become easier. An incremental FUFP-tree maintenance algorithm is also proposed for reducing the execution time in reconstructing the tree when new transactions are inserted. Experimental results also show that the proposed FUFP-tree maintenance algorithm runs faster than the batch FP-tree construction algorithm for handling new transactions and generates nearly the same tree structure as the FP-tree algorithm. The proposed approach can thus achieve a good trade-off between execution time and tree complexity.
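
To make the underlying structure concrete, here is a minimal sketch of inserting one transaction into a plain FP-tree. It is not the FUFP-tree maintenance algorithm, whose details the abstract does not give; the class name, the function name and the fixed `item_order` list of large items are assumptions for illustration.

```python
class FPNode:
    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}   # item -> FPNode


def insert_transaction(root, transaction, item_order):
    """Insert one transaction, keeping only items listed in item_order.

    item_order: the large (frequent) items in their fixed tree order.
    Items outside this list are dropped, as in a plain FP-tree.
    """
    rank = {item: i for i, item in enumerate(item_order)}
    items = sorted((i for i in set(transaction) if i in rank), key=rank.__getitem__)
    node = root
    for item in items:
        node = node.children.setdefault(item, FPNode(item))
        node.count += 1


root = FPNode()
for t in (["b", "a"], ["a", "c", "x"], ["a", "b", "c"]):
    insert_transaction(root, t, item_order=["a", "b", "c"])
```

Transactions that share a prefix in the fixed order share a path, so the tree compresses the database. Handling the case where new transactions change which items are large, or their order, is the harder part that an incremental maintenance scheme has to address.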

Book ChapterDOI
29 Sep 2008
TL;DR: Three parallelization methods for MCTS are discussed: leaf parallelization, root parallelization, and tree parallelization.
Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search method that started a revolution in the field of Computer Go. Parallelizing MCTS is an important way to increase the strength of any Go program. In this article, we discuss three parallelization methods for MCTS: leaf parallelization, root parallelization, and tree parallelization. To be effective, tree parallelization requires two techniques: adequate handling of (1) local mutexes and (2) virtual loss. Experiments in 13×13 Go reveal that in the program Mango root parallelization may lead to the best results for a specific time setting and specific program parameters. However, as soon as the selection mechanism is able to handle more adequately the balance of exploitation and exploration, tree parallelization should have attention too and could become a second choice for parallelizing MCTS. Preliminary experiments on the smaller 9×9 board provide promising prospects for tree parallelization.

Journal ArticleDOI
TL;DR: This work motivates the use of tree transducers for natural language and addresses the training problem for probabilistic tree-to-tree and tree-to-string transducers.
Abstract: Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to draw on, as it has been worked out in an extensive literature. We motivate the use of tree transducers for natural language and address the training problem for probabilistic tree-to-tree and tree-to-string transducers.

Proceedings ArticleDOI
13 Apr 2008
TL;DR: This paper presents a systematic in-depth study on the existence, importance, and application of stable nodes in peer-to-peer live video streaming, and presents a tiered overlay design, with stable nodes being organized into a tier-1 backbone for serving tier-2 nodes.
Abstract: This paper presents a systematic in-depth study on the existence, importance, and application of stable nodes in peer-to-peer live video streaming. Using traces from a real large-scale system as well as analytical models, we show that, while the number of stable nodes is small throughout a whole session, their longer lifespans make them constitute a significant portion in a per-snapshot view of a peer-to-peer overlay. As a result, they have substantially affected the performance of the overall system. Inspired by this, we propose a tiered overlay design, with stable nodes being organized into a tier-1 backbone for serving tier-2 nodes. It offers a highly cost-effective and deployable alternative to proxy-assisted designs. We develop a comprehensive set of algorithms for stable node identification and organization. Specifically, we present a novel structure, Labeled Tree, for the tier-1 overlay, which, leveraging stable peers, simultaneously achieves low overhead and high transmission reliability. Our tiered framework flexibly accommodates diverse existing overlay structures in the second tier. Our extensive simulation results demonstrated that the customized optimization using selected stable nodes boosts the streaming quality and also effectively reduces the control overhead. This is further validated through prototype experiments over the PlanetLab network.

Proceedings ArticleDOI
13 Apr 2008
TL;DR: This work designs an algorithm which starts from an arbitrary tree and iteratively reduces the load on bottleneck nodes (nodes likely to soon deplete their energy due to high degree or low remaining energy) and shows that the algorithm terminates in polynomial time and is provably near optimal.
Abstract: Energy efficiency is critical for wireless sensor networks. The data gathering process must be carefully designed to conserve energy and extend the network lifetime. For applications where each sensor continuously monitors the environment and periodically reports to a base station, a tree-based topology is often used to collect data from sensor nodes. In this work, we study the construction of a data gathering tree to maximize the network lifetime, which is defined as the time until the first node depletes its energy. The problem is shown to be NP-complete. We design an algorithm which starts from an arbitrary tree and iteratively reduces the load on bottleneck nodes (nodes likely to soon deplete their energy due to high degree or low remaining energy). We show that the algorithm terminates in polynomial time and is provably near optimal.

Journal ArticleDOI
TL;DR: A new approach for image matching and the related software package is developed and used in 3D tree modelling, showing results from analogue and digital aerial images and high‐resolution satellite images (IKONOS).
Abstract: Image matching is a key procedure in the process of generation of Digital Surface Models (DSM). We have developed a new approach for image matching and the related software package. This technique has proved its good performance in many applications. Here, we demonstrate its use in 3D tree modelling. After a brief description of our image matching technique, we show results from analogue and digital aerial images and high-resolution satellite images (IKONOS). In some cases, comparisons with manual measurements and/or airborne laser data have been performed. The evaluation of the results, qualitative and quantitative, indicates the very good performance of our matcher. Depending on the data acquisition parameters, the photogrammetric DSM can be denser than a DSM generated by laser, and its accuracy may be better than that from laser, as in these investigations. The tree canopy is well modelled, without smoothing of small details and avoiding the canopy penetration occurring with laser. Depending on the image scale, not only dense forest areas but also individual trees can be modelled.

Proceedings Article
01 Jun 2008
TL;DR: A forest-based approach is proposed that translates a packed forest of exponentially many parses, which encodes many more alternatives than standard n-best lists, and takes even less time than decoding with 30-best parses.
Abstract: Among syntax-based translation models, the tree-based approach, which takes as input a parse tree of the source sentence, is a promising direction being faster and simpler than its string-based counterpart. However, current tree-based systems suffer from a major drawback: they only use the 1-best parse to direct the translation, which potentially introduces translation mistakes due to parsing errors. We propose a forest-based approach that translates a packed forest of exponentially many parses, which encodes many more alternatives than standard n-best lists. Large-scale experiments show an absolute improvement of 1.7 BLEU points over the 1-best baseline. This result is also 0.8 points higher than decoding with 30-best parses, and takes even less time.

Journal ArticleDOI
TL;DR: Simulation results show that the LFSD can be used to approach the performance of the LSD while having a lower and fixed complexity, making the algorithm suitable for hardware implementation.
Abstract: A list extension for a fixed-complexity sphere decoder (FSD) to perform iterative detection and decoding in turbo-multiple input-multiple output (MIMO) systems is proposed in this paper. The algorithm obtains a list of candidates that can be used to calculate likelihood information about the transmitted bits required by the outer decoder. The list FSD (LFSD) overcomes the two main problems of the list sphere decoder (LSD), namely, its variable complexity and the sequential nature of its tree search. It combines a search through a very small subset of the complete transmit constellation and a specific channel matrix ordering to approximate the soft-quality of the list of candidates obtained by the LSD. A simple method is proposed to generate that subset, extending the subset searched by the original FSD. Simulation results show that the LFSD can be used to approach the performance of the LSD while having a lower and fixed complexity, making the algorithm suitable for hardware implementation.

Proceedings ArticleDOI
Longhua Qian, Guodong Zhou, Fang Kong, Qiaoming Zhu, Peide Qian
18 Aug 2008
TL;DR: Evaluation on the ACE RDC 2004 corpus shows that the dynamic syntactic parse tree outperforms all previous tree spans, and that the composite kernel combining this tree kernel with a linear state-of-the-art feature-based kernel achieves the best performance so far.
Abstract: This paper proposes a new approach to dynamically determine the tree span for tree kernel-based semantic relation extraction. It exploits constituent dependencies to keep the nodes and their head children along the path connecting the two entities, while removing the noisy information from the syntactic parse tree, eventually leading to a dynamic syntactic parse tree. This paper also explores entity features and their combined features in a unified parse and semantic tree, which integrates both structured syntactic parse information and entity-related semantic information. Evaluation on the ACE RDC 2004 corpus shows that our dynamic syntactic parse tree outperforms all previous tree spans, and that the composite kernel combining this tree kernel with a linear state-of-the-art feature-based kernel achieves the best performance so far.

Journal ArticleDOI
01 Jul 2008
TL;DR: A novel class of memory-constrained UPGMA (MC-UPGMA) algorithms is presented, and it is demonstrated that leveraging the entire mass embodied in all sequence similarities allows significant improvement on current protein family clusterings, which are unable to directly tackle the sheer mass of this data.
Abstract: Motivation: UPGMA (average linking) is probably the most popular algorithm for hierarchical data clustering, especially in computational biology. However, UPGMA requires the entire dissimilarity matrix in memory. Due to this prohibitive requirement, UPGMA is not scalable to very large datasets. Application: We present a novel class of memory-constrained UPGMA (MC-UPGMA) algorithms. Given any practical memory size constraint, this framework guarantees the correct clustering solution without explicitly requiring all dissimilarities in memory. The algorithms are general and are applicable to any dataset. We present a data-dependent characterization of hardness and clustering efficiency. The presented concepts are applicable to any agglomerative clustering formulation. Results: We apply our algorithm to the entire collection of protein sequences, to automatically build a comprehensive evolutionary-driven hierarchy of proteins from sequence alone. The newly created tree captures protein families better than state-of-the-art large-scale methods such as CluSTr, ProtoNet4 or single-linkage clustering. We demonstrate that leveraging the entire mass embodied in all sequence similarities allows significant improvement on current protein family clusterings, which are unable to directly tackle the sheer mass of this data. Furthermore, we argue that non-metric constraints are an inherent complexity of the sequence space and should not be overlooked. The robustness of UPGMA allows significant improvement, especially for multidomain proteins, and for large or divergent families. Availability: A comprehensive tree built from all UniProt sequence similarities, together with navigation and classification tools will be made available as part of the ProtoNet service. A C++ implementation of the algorithm is available on request. Contact: lonshy@cs.huji.ac.il
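
The memory requirement mentioned in the abstract is easy to see in a textbook UPGMA implementation, which keeps every pairwise dissimilarity in memory. The sketch below is that naive baseline, not the MC-UPGMA algorithm; the function name and the merge-list output format are illustrative choices.

```python
def upgma(matrix):
    """Naive average-linkage clustering on a full symmetric matrix.

    Returns a list of merges (a, b, height); the k-th merge creates
    cluster id n + k. All O(n^2) dissimilarities are held in memory.
    """
    n = len(matrix)
    size = {i: 1 for i in range(n)}   # cluster id -> number of leaves
    d = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(size) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        merges.append((a, b, height))
        for c in size:
            if c in (a, b):
                continue
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            # Size-weighted average keeps the mean over all leaf pairs.
            d[(c, next_id)] = (size[a] * dac + size[b] * dbc) / (size[a] + size[b])
        del d[(a, b)]
        size[next_id] = size.pop(a) + size.pop(b)
        next_id += 1
    return merges


# Made-up dissimilarities between 4 items.
toy = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
toy_tree = upgma(toy)
```

The dictionary `d` starts with n(n-1)/2 entries, which is what becomes prohibitive for millions of sequences and what a memory-constrained variant has to avoid.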

Journal ArticleDOI
TL;DR: In this paper, the relationship between electrical tree propagation and the material morphology in XLPE cable insulation has been studied by researching the structure and growth characteristics of a double structure electrical tree.
Abstract: This paper presents our experiments and analysis of electrical tree growth characteristics. The relationship between electrical tree propagation and the material morphology in XLPE cable insulation has been studied by researching the structure and growth characteristics of a double structure electrical tree. It has been found that, due to the influence of the uneven congregating state, differences in crystalline structure, and the existence of residual stress in the semi-crystalline polymer, five types of electrical tree structures (branch, bush, bine-branch, pine-branch, and mixed configurations) can propagate in XLPE cable insulation. Three basic propagation phases (initiation, stagnation, and rapid propagation) are identified in the electrical tree propagation process. If the initiation phase is very active, a single branch tree will propagate, while if this phase is weak, a bush tree will occur more easily. A clear double structure appears when the electrical tree grows in a region of the material whose submicroscopic structure is uneven. A new parameter, the expansion coefficient, is introduced to describe the electrical tree propagation characteristics. In addition, two other coefficients used to describe our experimental results are the dynamic fractal dimension and the growth rate of the electrical tree.

Journal ArticleDOI
TL;DR: This paper presents an efficient, simulation-free algorithm for computing two important and ubiquitous evolutionary trajectory properties, the mean number of trait changes, and the mean evolutionary reward, accrued proportionally to the time a trait occupies each of its states.
Abstract: Mapping evolutionary trajectories of discrete traits onto phylogenies receives considerable attention in evolutionary biology. Given the trait observations at the tips of a phylogenetic tree, researchers are often interested in where on the tree the trait changes its state and whether some changes are preferential in certain parts of the tree. In a model-based phylogenetic framework, such questions translate into characterizing probabilistic properties of evolutionary trajectories. Current methods of assessing these properties rely on computationally expensive simulations. In this paper, we present an efficient, simulation-free algorithm for computing two important and ubiquitous evolutionary trajectory properties. The first is the mean number of trait changes, where changes can be divided into classes of interest (e.g. synonymous/non-synonymous mutations). The mean evolutionary reward, accrued proportionally to the time a trait occupies each of its states, is the second property. To illustrate the usefulness of our results, we first employ our simulation-free stochastic mapping to execute a posterior predictive test of correlation between two evolutionary traits. We conclude by mapping synonymous and non-synonymous mutations onto branches of an HIV intrahost phylogenetic tree and comparing selection pressure on terminal and internal tree branches.

Journal ArticleDOI
TL;DR: An innovative approach for the optimal matching of independently optimum sum and difference patterns through sub-arrayed monopulse linear arrays is presented, along with a fast resolution algorithm that exploits the presence of elements more suitable to change subarray membership.
Abstract: An innovative approach for the optimal matching of independently optimum sum and difference patterns through sub-arrayed monopulse linear arrays is presented. By exploiting the relationship between the independently optimal sum and difference excitations, the set of possible solutions is considerably reduced and the synthesis problem is recast as the search of the best solution in a noncomplete binary tree. Towards this end, a fast resolution algorithm that exploits the presence of elements more suitable to change subarray membership is presented. The results of a set of numerical experiments are reported in order to validate the proposed approach pointing out its effectiveness also in comparison with state-of-the-art optimal matching techniques.

Journal ArticleDOI
TL;DR: In this article, the authors evaluated the performance of two fundamentally different automated tree detection and measurement algorithms (spatial wavelet analysis (SWA) and variable window filters (VWF)) across a full range of canopy conditions in a mixed-species, structurally diverse conifer forest in northern Idaho, USA.
Abstract: Individual tree detection algorithms can provide accurate measurements of individual tree locations, crown diameters (from aerial photography and light detection and ranging (lidar) data), and tree heights (from lidar data). However, to be useful for forest management goals relating to timber harvest, carbon accounting, and ecological processes, there is a need to assess the performance of these image-based tree detection algorithms across a full range of canopy structure conditions. We evaluated the performance of two fundamentally different automated tree detection and measurement algorithms (spatial wavelet analysis (SWA) and variable window filters (VWF)) across a full range of canopy conditions in a mixed-species, structurally diverse conifer forest in northern Idaho, USA. Each algorithm performed well in low canopy cover conditions (<50% canopy cover), detecting over 80% of all trees with measurements, and producing tree height and crown diameter estimates that are well correlated with field measure...

Proceedings ArticleDOI
12 May 2008
TL;DR: A new algorithm is proposed that enables fast recovery of piecewise smooth signals, a large and useful class of signals whose sparse wavelet expansions feature a distinct "connected tree" structure, and which outperforms the standard compressive recovery algorithms as well as previously proposed wavelet-based recovery algorithms.
Abstract: Compressive sensing aims to recover a sparse or compressible signal from a small set of projections onto random vectors; conventional solutions involve linear programming or greedy algorithms that can be computationally expensive. Moreover, these recovery techniques are generic and assume no particular structure in the signal aside from sparsity. In this paper, we propose a new algorithm that enables fast recovery of piecewise smooth signals, a large and useful class of signals whose sparse wavelet expansions feature a distinct "connected tree" structure. Our algorithm fuses recent results on iterative reweighted ℓ1-norm minimization with the wavelet Hidden Markov Tree model. The resulting optimization-based solver outperforms the standard compressive recovery algorithms as well as previously proposed wavelet-based recovery algorithms. As a bonus, the algorithm reduces the number of measurements necessary to achieve low-distortion reconstruction.

Patent
25 Feb 2008
TL;DR: In this article, a method for determining bandwidth allocations in a content distribution network that comprises multiple trees, where the root of each tree has a server that broadcasts multiple programs throughout the tree, is presented.
Abstract: A method is presented for determining bandwidth allocations in a content distribution network that comprises multiple trees, where the root of each tree has a server that broadcasts multiple programs throughout the tree. Each network link has limited capacity and may be used by one or more of these trees. The allocation problem is formulated as an equitable resource allocation problem with a lexicographic maximin objective function that attempts to provide equitable service performance for all requested programs at the various nodes. The constraints include link capacity constraints and tree-like ordering constraints imposed on each of the programs. The algorithm provides an equitable solution in polynomial time for wide classes of performance functions. At each iteration, the algorithm solves single-link maximin optimization problems while relaxing the ordering constraints, selects a bottleneck link and fixes various variables at their optimal value.
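As a rough illustration of the maximin idea on a single link, the sketch below splits one link's capacity among competing demands so that the smallest allocation is as large as possible, then the next smallest, and so on (progressive filling). It is a deliberately simplified stand-in: it ignores the performance functions, multiple trees, and tree-like ordering constraints described in the abstract, and the name `max_min_share` and its inputs are hypothetical.

```python
def max_min_share(capacity, demands):
    """Max-min fair split of one link's capacity among capped demands.

    Simplified illustration only; not the method described in the patent.
    """
    alloc = [0.0] * len(demands)
    remaining = float(capacity)
    # Serve the smallest demands first; each one receives either its full
    # demand or an equal share of whatever capacity is still unassigned.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for k, i in enumerate(order):
        share = remaining / (len(order) - k)
        alloc[i] = min(demands[i], share)
        remaining -= alloc[i]
    return alloc
```

With capacity 10 and demands 2, 8, and 4, the small demand is met in full and the remaining capacity is split evenly between the other two.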

Journal ArticleDOI
TL;DR: PhyloWidget is a web-based tool for the visualization and manipulation of phylogenetic tree data, and a simple URL-based API allows databases to easily link to and customize PhyloWidget for interactively viewing medium- to large-sized trees.
Abstract: Summary: PhyloWidget is a web-based tool for the visualization and manipulation of phylogenetic tree data. It can be accessed online or downloaded as a standalone application. A simple URL-based API allows databases to easily link to and customize PhyloWidget for interactively viewing medium- to large-sized trees. Availability: PhyloWidget is available for online use or download at http://www.phylowidget.org/. Its source code is released under the GNU General Public License.