Showing papers on "Tree (data structure)" published in 2009



Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes WhereNext, a method for predicting, with a certain level of accuracy, the next location of a moving object. The prediction uses previously extracted movement patterns, named Trajectory Patterns, which concisely represent the behaviors of moving objects as sequences of regions frequently visited with typical travel times.
Abstract: The pervasiveness of mobile devices and location-based services is leading to an increasing volume of mobility data. This side effect provides the opportunity for innovative methods that analyse the behaviors of movements. In this paper we propose WhereNext, a method aimed at predicting, with a certain level of accuracy, the next location of a moving object. The prediction uses previously extracted movement patterns named Trajectory Patterns, which are a concise representation of behaviors of moving objects as sequences of regions frequently visited with a typical travel time. A decision tree, named T-pattern Tree, is built and evaluated with a formal training and test process. The tree is learned from the Trajectory Patterns that hold in a certain area, and it may be used as a predictor of the next location of a new trajectory by finding the best matching path in the tree. Three different best-matching methods to classify a new moving object are proposed, and their impact on the quality of prediction is studied extensively. Using Trajectory Patterns as predictive rules has the following implications: (I) the learning depends on the movement of all available objects in a certain area instead of on the individual history of an object; (II) the prediction tree intrinsically contains the spatio-temporal properties that have emerged from the data, and this allows us to define matching methods that strictly depend on the properties of such movements. In addition, we propose a set of other measures that evaluate a priori the predictive power of a set of Trajectory Patterns. These measures were tuned on a real-life case study. Finally, an exhaustive set of experiments and results on the real dataset are presented.
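
A toy sketch of the best-matching-path idea: descend a pattern tree along the new trajectory's regions and travel times, accumulate a support-based score, and predict from the deepest matching node. The node layout, the (min, max) travel-time test, and the support-sum score below are illustrative assumptions, not the authors' WhereNext implementation.

```python
class PatternNode:
    def __init__(self, region, travel_time=(0, float("inf")), support=1):
        self.region = region            # spatial region id
        self.travel_time = travel_time  # typical (min, max) travel time from parent
        self.support = support          # trajectories supporting this step
        self.children = []

def best_match(node, trajectory, score=0):
    """Follow the tree along the trajectory; return (score, predicted next region)."""
    if not trajectory:
        # Predict the child with the highest support, if any.
        best = max(node.children, key=lambda c: c.support, default=None)
        return score, (best.region if best else None)
    (region, dt), rest = trajectory[0], trajectory[1:]
    results = [(score, None)]
    for child in node.children:
        lo, hi = child.travel_time
        if child.region == region and lo <= dt <= hi:
            results.append(best_match(child, rest, score + child.support))
    return max(results, key=lambda r: r[0])

root = PatternNode("start")
a = PatternNode("A", (60, 600), support=40); root.children.append(a)
a.children.append(PatternNode("B", (120, 900), support=25))
print(best_match(root, [("A", 300)]))   # -> (40, 'B')
```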

610 citations


Journal ArticleDOI
TL;DR: This paper proposes three novel tree structures to efficiently perform incremental and interactive HUP mining; the first captures incremental data without any restructuring operation, and experiments show that all three structures are very efficient and scalable.
Abstract: Recently, high utility pattern (HUP) mining is one of the most important research issues in data mining due to its ability to consider the nonbinary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP transaction frequency tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, IHUP-transaction-weighted utilization tree (IHUPTWU-Tree) is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.
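
The lexicographic ordering is what makes the first structure update-friendly: because the item order never depends on the data, a new transaction is merged by plain prefix-tree insertion. A minimal sketch, with the utility bookkeeping simplified to a per-node accumulator:

```python
class IHUPNode:
    def __init__(self, item=None):
        self.item = item
        self.utility = 0          # accumulated utility of transactions through this node
        self.children = {}        # item -> IHUPNode

    def insert(self, transaction):
        """transaction: list of (item, utility) pairs."""
        node = self
        for item, util in sorted(transaction):   # lexicographic order never changes
            node = node.children.setdefault(item, IHUPNode(item))
            node.utility += util

root = IHUPNode()
root.insert([("b", 12), ("a", 5), ("d", 2)])
root.insert([("a", 3), ("d", 7)])              # incremental update: plain insertion
print(root.children["a"].utility)              # -> 8
```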

555 citations


Journal ArticleDOI
TL;DR: PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data.
Abstract: Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types.
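
Since phyloXML is plain XML, a tree with branch lengths can be read with standard tooling. A minimal sketch using Python's standard library; the two-leaf document follows the schema's element names (phylogeny, clade, name, branch_length), but real files carry many more annotation types:

```python
import xml.etree.ElementTree as ET

DOC = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <clade><name>A</name><branch_length>0.12</branch_length></clade>
      <clade><name>B</name><branch_length>0.34</branch_length></clade>
    </clade>
  </phylogeny>
</phyloxml>"""

NS = {"px": "http://www.phyloxml.org"}
root = ET.fromstring(DOC)
for clade in root.iter("{http://www.phyloxml.org}clade"):
    name = clade.find("px:name", NS)
    blen = clade.find("px:branch_length", NS)
    if name is not None:
        print(name.text, float(blen.text))   # A 0.12 / B 0.34
```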

527 citations


Journal ArticleDOI
TL;DR: In this article, the results of nine leading orthology identification projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) were systematically compared with respect to both phylogeny and function, using six different tests.
Abstract: Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. Second, it introduces new methodology to verify orthology. And third, it sets performance standards for current and future approaches.
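
For reference, the bidirectional best-hit baseline that the study finds surprisingly competitive can be stated in a few lines: x in proteome A and y in proteome B are predicted orthologs when each is the other's highest-scoring hit. The similarity scores below are made up:

```python
def best_hits(scores):
    """scores: dict (query, target) -> similarity; returns query -> best target."""
    best = {}
    for (q, t), s in scores.items():
        if q not in best or s > scores[(q, best[q])]:
            best[q] = t
    return best

a_vs_b = {("a1", "b1"): 95, ("a1", "b2"): 40, ("a2", "b2"): 88}
b_vs_a = {("b1", "a1"): 93, ("b2", "a1"): 40, ("b2", "a2"): 90}

fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
bbh = [(x, y) for x, y in fwd.items() if rev.get(y) == x]
print(bbh)   # -> [('a1', 'b1'), ('a2', 'b2')]
```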

421 citations


Posted Content
08 Sep 2009
TL;DR: In this paper, a tree-guided group lasso is proposed for estimating structured sparsity under multi-response regression, employing a novel penalty function constructed from the tree. A systematic weighting scheme for the overlapping groups in the tree penalty ensures that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships caused by overlaps among groups.
Abstract: We consider the problem of estimating a sparse multi-response regression function, with an application to expression quantitative trait locus (eQTL) mapping, where the goal is to discover genetic variations that influence gene-expression levels. In particular, we investigate a shrinkage technique capable of capturing a given hierarchical structure over the responses, such as a hierarchical clustering tree with leaf nodes for responses and internal nodes for clusters of related responses at multiple granularities, and we seek to leverage this structure to recover covariates relevant to each hierarchically defined cluster of responses. We propose a tree-guided group lasso, or tree lasso, for estimating such structured sparsity under multi-response regression by employing a novel penalty function constructed from the tree. We describe a systematic weighting scheme for the overlapping groups in the tree penalty such that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships of the regression coefficients due to overlaps among groups. For efficient optimization, we employ a smoothing proximal gradient method that was originally developed for a general class of structured-sparsity-inducing penalties. Using simulated and yeast data sets, we demonstrate that our method shows superior performance in terms of both prediction errors and recovery of true sparsity patterns, compared to other methods for learning a multivariate-response regression.
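
The penalty being described has the general form Omega(beta) = sum over tree nodes v of w_v * ||beta restricted to the leaves under v||_2. A sketch of evaluating such a penalty on a toy tree; the weights and groups are illustrative and do not reproduce the paper's balancing scheme:

```python
import numpy as np

# Each node: (weight w_v, indices of coefficients grouped under the node).
tree_groups = [
    (1.0, [0]), (1.0, [1]), (1.0, [2]),   # leaves
    (0.7, [0, 1]),                        # internal node grouping responses 0, 1
    (0.4, [0, 1, 2]),                     # root
]

def tree_lasso_penalty(beta):
    # Sum of weighted L2 norms over (overlapping) groups induced by the tree.
    return sum(w * np.linalg.norm(beta[idx]) for w, idx in tree_groups)

beta = np.array([0.5, -0.5, 0.0])
print(round(tree_lasso_penalty(beta), 4))   # ~1.7778
```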

387 citations


Book ChapterDOI
27 Aug 2009
TL;DR: A method for developing algorithms that can adaptively learn from data streams that drift over time, based on using change detectors and estimator modules at the right places and choosing implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm.
Abstract: We propose and illustrate a method for developing algorithms that can adaptively learn from data streams that drift over time. As an example, we take the Hoeffding Tree, an incremental decision tree inducer for data streams, and use it as a basis to build two new methods that can deal with distribution and concept drift: a sliding window-based algorithm, Hoeffding Window Tree, and an adaptive method, Hoeffding Adaptive Tree. Our methods are based on using change detectors and estimator modules at the right places; we choose implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm. A main advantage of our methods is that they require no guess about how fast or how often the stream will drift; other methods typically have several user-defined parameters to this effect. In our experiments, the new methods never do worse, and in some cases do much better, than CVFDT, a well-known method for tree induction on data streams with drift.
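
The plug-in role of the change detector can be illustrated with a crude stand-in: compare the error rate in a recent window against an older reference window and signal drift when the means diverge. This fixed-threshold test has none of the guarantees of the detectors the paper relies on; it only shows where such a module sits inside an adaptive tree node:

```python
from collections import deque

class SimpleDriftDetector:
    def __init__(self, window=100, threshold=0.15):
        self.ref = deque(maxlen=window)      # older errors
        self.recent = deque(maxlen=window)   # newest errors
        self.threshold = threshold

    def add(self, error):
        # Slide the oldest "recent" observation into the reference window.
        if len(self.recent) == self.recent.maxlen:
            self.ref.append(self.recent.popleft())
        self.recent.append(error)

    def drift(self):
        if len(self.ref) < self.ref.maxlen:
            return False
        mean = lambda w: sum(w) / len(w)
        return abs(mean(self.recent) - mean(self.ref)) > self.threshold

# Usage inside a tree node: feed per-example 0/1 errors via add(); when
# drift() fires, grow an alternate subtree on fresh data and swap it in
# once it outperforms the current one.
```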

384 citations


Proceedings Article
01 Jan 2009
TL;DR: A hierarchical computational structure to recognize emotions is introduced that maps an input speech utterance into one of the multiple emotion classes through subsequent layers of binary classifications and is effective for classifying emotional utterances in multiple database contexts.
Abstract: Automated emotion state tracking is a crucial element in the computational study of human communication behaviors. It is important to design robust and reliable emotion recognition systems that are suitable for real-world applications, both to enhance analytical abilities to support human decision making and to design human-machine interfaces that facilitate efficient communication. We introduce a hierarchical computational structure to recognize emotions. The proposed structure maps an input speech utterance into one of the multiple emotion classes through subsequent layers of binary classifications. The key idea is that the levels in the tree are designed to solve the easiest classification tasks first, allowing us to mitigate error propagation. We evaluated the classification framework on two different emotional databases using acoustic features: the AIBO database and the USC IEMOCAP database. In the case of the AIBO database, we obtain a balanced recall on each of the individual emotion classes using this hierarchical structure. The performance measure of the average unweighted recall on the evaluation data set improves by 3.37% absolute (8.82% relative) over a Support Vector Machine baseline model. In the USC IEMOCAP database, we obtain an absolute improvement of 7.44% (14.58% relative) over a baseline Support Vector Machine model. The results demonstrate that the presented hierarchical approach is effective for classifying emotional utterances in multiple database contexts.
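
Structurally, the classifier is a top-down binary decision cascade: each internal node runs one binary classifier and routes the utterance onward, so the easiest discriminations can sit near the root. A skeleton with stubbed-out classifiers; the two-dimensional features and emotion labels are illustrative, not the paper's acoustic feature set:

```python
class CascadeNode:
    def __init__(self, classify, if_true, if_false):
        self.classify = classify          # features -> bool
        self.if_true, self.if_false = if_true, if_false   # subtree or final label

    def predict(self, x):
        branch = self.if_true if self.classify(x) else self.if_false
        return branch.predict(x) if isinstance(branch, CascadeNode) else branch

# Toy 2-D features: (arousal, valence).
tree = CascadeNode(
    lambda x: x[0] > 0.5,                               # easiest split first
    CascadeNode(lambda x: x[1] < 0.0, "angry", "happy"),
    CascadeNode(lambda x: x[1] < 0.0, "sad", "neutral"),
)
print(tree.predict((0.9, -0.4)))   # -> angry
```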

339 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: The prototype-based approach enables robust action matching in very challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences.
Abstract: A prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, first, an action prototype tree is learned in a joint shape and motion space via hierarchical k-means clustering; then a lookup table of prototype-to-prototype distances is generated. During testing, based on a joint likelihood model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint likelihood, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance matrices used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in very challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 91.07% on a large gesture dataset (with dynamic backgrounds), 100% on the Weizmann action dataset and 95.77% on the KTH action dataset.

327 citations


Journal ArticleDOI
TL;DR: This paper derives a tight lower bound on the amount of data that must be communicated to complete the all-reduce operation on large data sizes in cluster environments, and proposes a ring-based algorithm that requires only tree connectivity to achieve bandwidth optimality.
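
A ring-based all-reduce is commonly realized as a reduce-scatter phase followed by an all-gather phase; each of the p nodes then communicates 2(p-1)/p of the data in total, which is what makes the approach bandwidth optimal. A small in-process simulation of that two-phase schedule (a sketch of the general pattern, not the paper's specific algorithm):

```python
def ring_allreduce(chunks):
    """chunks[i][c] = node i's value for chunk c; p nodes, p chunks."""
    p = len(chunks)
    # Reduce-scatter: after p-1 steps, node i holds the full sum of chunk (i+1) % p.
    for step in range(p - 1):
        sends = [(i, (i - step) % p, chunks[i][(i - step) % p]) for i in range(p)]
        for i, c, val in sends:                      # all sends happen "at once"
            chunks[(i + 1) % p][c] += val
    # All-gather: circulate each fully reduced chunk around the ring.
    for step in range(p - 1):
        sends = [(i, (i + 1 - step) % p, chunks[i][(i + 1 - step) % p]) for i in range(p)]
        for i, c, val in sends:
            chunks[(i + 1) % p][c] = val
    return chunks

vals = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]             # 3 nodes, 3 chunks each
print(ring_allreduce([row[:] for row in vals]))      # every node ends with [12, 15, 18]
```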

324 citations


Journal ArticleDOI
01 Aug 2009
TL;DR: This paper describes PLANET: a scalable distributed framework for learning tree models over large datasets, and shows how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models.
Abstract: Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state of the art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google's computing infrastructure is based on commodity hardware.In this paper, we describe PLANET: a scalable distributed framework for learning tree models over large datasets. PLANET defines tree learning as a series of distributed computations, and implements each one using the MapReduce model of distributed computation. We show how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models. We discuss the benefits and challenges of using a MapReduce compute cluster for tree learning, and demonstrate the scalability of this approach by applying it to a real world learning task from the domain of computational advertising.
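
The core pattern is that one MapReduce round computes, for the tree node being expanded, sufficient statistics for every candidate split, from which a controller picks the best. The in-process simulation below uses a toy squared-error criterion; the shard contents and candidate thresholds are illustrative, and the real system runs this over many machines:

```python
from collections import defaultdict

def map_shard(rows, thresholds):
    """rows: (x, y) pairs. Emit (threshold, side) -> (count, sum y, sum y^2)."""
    out = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y in rows:
        for t in thresholds:
            agg = out[(t, "L" if x <= t else "R")]
            agg[0] += 1; agg[1] += y; agg[2] += y * y
    return out

def reduce_merge(partials):
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for part in partials:
        for key, (n, s, ss) in part.items():
            agg = total[key]
            agg[0] += n; agg[1] += s; agg[2] += ss
    return total

def sse(n, s, ss):                       # sum of squared errors around the mean
    return ss - s * s / n if n else 0.0

def best_split(total, thresholds):
    return min(thresholds,
               key=lambda t: sum(sse(*total[(t, side)]) for side in "LR"))

shards = [[(1, 1.0), (2, 1.2)], [(8, 5.0), (9, 5.3)]]
thresholds = [3, 5, 8]
total = reduce_merge([map_shard(s, thresholds) for s in shards])
print(best_split(total, thresholds))     # -> 3 (ties with 5): separates the clusters
```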


Journal ArticleDOI
TL;DR: A quantitative evaluation of structural attributes, like the vertical foliage and wood area profiles, as well as the shoot orientation distribution, confirmed the appropriateness of the proposed tree reconstruction model for the generation of structurally and radiatively faithful copies of existing plant and canopy architectures.

Journal ArticleDOI
TL;DR: This article presents a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, and proves the optimality of the algorithm among the family of decomposition strategy algorithms, which also includes the previous fastest algorithms, by tightening the known lower bound.
Abstract: The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this article, we present a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, improving the previous best O(n^3 log n)-time algorithm. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems, together with a deeper understanding of the previous algorithms for the problem. We prove the optimality of our algorithm among the family of decomposition strategy algorithms, which also includes the previous fastest algorithms, by tightening the known lower bound of Omega(n^2 log^2 n) to Omega(n^3), matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds for decomposition strategy algorithms of Theta(nm^2 (1 + log(n/m))) when the two trees have sizes m and n, where m <= n.
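
For intuition about what is being computed, the edit-distance recurrence itself fits in a few lines if one ignores efficiency. The memoized sketch below explores far more subproblems than the paper's decomposition-strategy algorithms and is nowhere near O(n^3); it only demonstrates the three elementary operations with unit costs:

```python
from functools import lru_cache

# A tree is (label, children-tuple); a forest is a tuple of trees.
def size(forest):
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def ted(F, G):
    if not F: return size(G)                 # insert all of G
    if not G: return size(F)                 # delete all of F
    (l1, c1) = F[-1]; (l2, c2) = G[-1]
    return min(
        ted(F[:-1] + c1, G) + 1,             # delete rightmost root of F
        ted(F, G[:-1] + c2) + 1,             # insert rightmost root of G
        ted(c1, c2) + ted(F[:-1], G[:-1]) + (l1 != l2),  # match the two roots
    )

t1 = ("f", (("a", ()), ("b", ())))
t2 = ("f", (("a", ()), ("c", (("b", ()),))))
print(ted((t1,), (t2,)))    # -> 1: insert node 'c' above 'b'
```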

Proceedings ArticleDOI
08 Jul 2009
TL;DR: The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.
Abstract: Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.

Proceedings ArticleDOI
22 Jun 2009
TL;DR: This paper describes Treedoc, a novel CRDT design for cooperative text editing in which the identifiers of Treedoc atoms are selected from a dense space; the design is validated with traces from existing edit histories.
Abstract: A Commutative Replicated Data Type (CRDT) is one where all concurrent operations commute. The replicas of a CRDT converge automatically, without complex concurrency control. This paper describes Treedoc, a novel CRDT design for cooperative text editing. An essential property is that the identifiers of Treedoc atoms are selected from a dense space. We discuss practical alternatives for implementing the identifier space based on an extended binary tree. We also discuss storage alternatives for data and meta-data, and mechanisms for compacting the tree. In the best case, Treedoc incurs no overhead with respect to a linear text buffer. We validate the results with traces from existing edit histories.
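
The dense-identifier idea can be illustrated with binary-tree paths ordered infix: between any two positions there is always room for a fresh one, so concurrent inserts never need to shift existing atoms. A toy version that omits Treedoc's disambiguators, sparseness heuristics, and tree-compaction machinery:

```python
def key(path):
    # Appending 0.5 as a sentinel makes lexicographic order on keys equal to
    # infix order on tree positions: left subtree < node < right subtree.
    return path + (0.5,)

def between(a, b):
    """Return a fresh identifier strictly between positions a and b."""
    assert key(a) < key(b)
    for cand in (a + (1,), b + (0,)):       # right child of a, left child of b
        if key(a) < key(cand) < key(b):
            return cand

# Insert "B" between "A" and "C" in a replicated sequence of (id, char) atoms.
doc = {(0,): "A", (1,): "C"}
doc[between((0,), (1,))] = "B"
print("".join(ch for _, ch in sorted(doc.items(), key=lambda kv: key(kv[0]))))  # -> ABC
```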

Journal ArticleDOI
TL;DR: It is shown that searching for the species tree in the compatibility graph of the clusters induced by the gene trees may be sufficient in practice, a finding that helps ameliorate the computational requirements of optimization solutions.
Abstract: In a 1997 seminal paper, W. Maddison proposed minimizing deep coalescences, or MDC, as an optimization criterion for inferring the species tree from a set of incongruent gene trees, assuming the incongruence is exclusively due to lineage sorting. In a subsequent paper, Maddison and Knowles provided and implemented a search heuristic for optimizing the MDC criterion, given a set of gene trees. However, the heuristic is not guaranteed to compute optimal solutions, and its hill-climbing search makes it slow in practice. In this paper, we provide two exact solutions to the problem of inferring the species tree from a set of gene trees under the MDC criterion. In other words, our solutions are guaranteed to find the tree that minimizes the total number of deep coalescences from a set of gene trees. One solution is based on a novel integer linear programming (ILP) formulation, and another is based on a simple dynamic programming (DP) approach. Powerful ILP solvers, such as CPLEX, make the first solution appealing, particularly for very large-scale instances of the problem, whereas the DP-based solution eliminates dependence on proprietary tools, and its simplicity makes it easy to integrate with other genomic events that may cause gene tree incongruence. Using the exact solutions, we analyze a data set of 106 loci from eight yeast species, a data set of 268 loci from eight Apicomplexan species, and several simulated data sets. We show that the MDC criterion provides very accurate estimates of the species tree topologies, and that our solutions are very fast, thus allowing for the accurate analysis of genome-scale data sets. Further, the efficiency of the solutions allows for quick exploration of sub-optimal solutions, which is important for a parsimony-based criterion such as MDC, as we show. We show that searching for the species tree in the compatibility graph of the clusters induced by the gene trees may be sufficient in practice, a finding that helps ameliorate the computational requirements of optimization solutions. Further, we study the statistical consistency and convergence rate of the MDC criterion, as well as its optimality in inferring the species tree. Finally, we show how our solutions can be used to identify potential horizontal gene transfer events that may have caused some of the incongruence in the data, thus augmenting Maddison's original framework. We have implemented our solutions in the PhyloNet software package, which is freely available at: http://bioinfo.cs.rice.edu/phylonet

Proceedings ArticleDOI
29 Jun 2009
TL;DR: A cost model is proposed that accurately captures the actual runtime behavior of a plan; choosing the optimal plan can yield a factor of four or more speedup over an NFA-based approach. A dynamic programming algorithm built on this cost model efficiently searches for the optimal query plan for a given pattern.
Abstract: Composite (or Complex) event processing (CEP) systems search sequences of incoming events for occurrences of user-specified event patterns. Recently, they have gained more attention in a variety of areas due to their powerful and expressive query language and performance potential. Sequentiality (temporal ordering) is the primary way in which CEP systems relate events to each other. In this paper, we present a CEP system called ZStream to efficiently process such sequential patterns. Besides simple sequential patterns, ZStream is also able to detect other patterns, including conjunction, disjunction, negation and Kleene closure. Unlike most recently proposed CEP systems, which use non-deterministic finite automata (NFA's) to detect patterns, ZStream uses tree-based query plans for both the logical and physical representation of query patterns. By carefully designing the underlying infrastructure and algorithms, ZStream is able to unify the evaluation of sequence, conjunction, disjunction, negation, and Kleene closure as variants of the join operator. Under this framework, a single pattern in ZStream may have several equivalent physical tree plans, with different evaluation costs. We propose a cost model to estimate the computation costs of a plan. We show that our cost model can accurately capture the actual runtime behavior of a plan, and that choosing the optimal plan can result in a factor of four or more speedup versus an NFA based approach. Based on this cost model and using a simple set of statistics about operator selectivity and data rates, ZStream is able to adaptively and seamlessly adjust the order in which it detects patterns on the fly. Finally, we describe a dynamic programming algorithm used in our cost model to efficiently search for an optimal query plan for a given pattern.
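
The plan-selection problem is analogous to join ordering: every binary bracketing of a sequence pattern is a tree plan with its own cost, and an interval dynamic program finds the cheapest. The sketch below uses a deliberately simplified, assumed cost model (output rate = product of input rates times one selectivity; cost = total intermediate output), not ZStream's actual statistics:

```python
from functools import lru_cache

rates = (50, 10, 40, 5)     # events/sec arriving on each input stream
SEL = 0.01                  # fraction of pairs satisfying the temporal predicate

@lru_cache(maxsize=None)
def plan(i, j):
    """Cheapest plan for streams i..j; returns (cost, output rate, plan tree)."""
    if i == j:
        return (0.0, rates[i], str(i))
    best = None
    for k in range(i, j):                       # split point = root of the tree plan
        lc, lr, lp = plan(i, k)
        rc, rr, rp = plan(k + 1, j)
        out = SEL * lr * rr
        cand = (lc + rc + out, out, f"({lp} {rp})")
        if best is None or cand[0] < best[0]:
            best = cand
    return best

cost, rate, shape = plan(0, len(rates) - 1)
print(round(cost, 2), shape)    # -> 2.3 (0 (1 (2 3))): a right-deep plan wins here
```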

Journal ArticleDOI
TL;DR: In this article, the exact shapes of irregular tree crowns of various tree species are captured from lidar point clouds, and their crown formation is visualized in three-dimensional space using computer graphics.

Journal ArticleDOI
TL;DR: For the first time, by using the properties of the XBW-transform, compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space.
Abstract: Consider an ordered, static tree T where each node has a label from alphabet Σ. Tree T may be of arbitrary degree and shape. Our goal is designing a compressed storage scheme of T that supports basic navigational operations among the immediate neighbors of a node (i.e. parent, ith child, or any child with some label,…) as well as more sophisticated path-based search operations over its labeled structure.We present a novel approach to this problem by designing what we call the XBW-transform of the tree in the spirit of the well-known Burrows-Wheeler transform for strings [1994]. The XBW-transform uses path-sorting to linearize the labeled tree T into two coordinated arrays, one capturing the structure and the other the labels. For the first time, by using the properties of the XBW-transform, our compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space.Our XBW-transform is simple and likely to spur new results in the theory of tree compression and indexing, as well as interesting application contexts. As an example, we use the XBW-transform to design and implement a compressed index for XML documents whose compression ratio is significantly better than the one achievable by state-of-the-art tools, and its query time performance is order of magnitudes faster.
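
The transform itself is short to state: list each node as (last-child flag, label, upward path of ancestor labels) in preorder, then stably sort the rows by the upward path; the flag and label columns are the two coordinated arrays. A sketch for single-character labels (the real construction also distinguishes internal from leaf labels and adds rank/select machinery on top):

```python
def xbw(tree):
    """tree = (label, [children]); returns the (S_last, S_alpha) arrays."""
    rows = []                                   # (upward path, last flag, label)
    def visit(node, is_last, pi):
        label, children = node
        rows.append((pi, is_last, label))
        for i, child in enumerate(children):
            visit(child, i == len(children) - 1, label + pi)
    visit(tree, True, "")
    rows.sort(key=lambda r: r[0])               # Python's sort is stable
    return [last for _, last, _ in rows], [lab for _, _, lab in rows]

tree = ("A", [("B", [("D", [])]), ("C", [])])
S_last, S_alpha = xbw(tree)
print(S_alpha)   # -> ['A', 'B', 'C', 'D']: labels in path-sorted (XBW) order
```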

Journal ArticleDOI
TL;DR: This paper presents a probabilistic model integrating gene duplication, sequence evolution, and a relaxed molecular clock for substitution rates, enabling genome-wide analysis of gene families; the model is able to draw biologically relevant conclusions concerning gene duplications that created key yeast phenotypes.
Abstract: We present GSR, a probabilistic model integrating gene duplication, sequence evolution, and a relaxed molecular clock for substitution rates, that enables genomewide analysis of gene families. The gene duplication and loss process is a major cause for incongruence between gene and species tree, and deterministic methods have been developed to explain such differences through tree reconciliations. Although probabilistic methods for phylogenetic inference have been around for decades, probabilistic reconciliation methods are far less established. Based on our model, we have implemented a Bayesian analysis tool, PrIME-GSR, for gene tree inference that takes a known species tree into account. Our implementation is sound and we demonstrate its utility for genomewide gene-family analysis by applying it to recently presented yeast data. We validate PrIME-GSR by comparing with previous analyses of these data that take advantage of gene order information. In a case study we apply our method to the ADH gene family and are able to draw biologically relevant conclusions concerning gene duplications creating key yeast phenotypes. On a higher level this shows the biological relevance of our method. The obtained results demonstrate the value of a relaxed molecular clock. Our good performance will extend to species where gene order conservation is insufficient.

Journal ArticleDOI
27 Jul 2009
TL;DR: Simulations of this method robustly generate a wide range of realistic trees and bushes, which can be controlled with a variety of interactive techniques, including procedural brushes, sketching, and editing operations such as pruning and bending of branches.
Abstract: We present a method for generating realistic models of temperate-climate trees and shrubs. This method is based on the biological hypothesis that the form of a developing tree emerges from a self-organizing process dominated by the competition of buds and branches for light or space, and regulated by internal signaling mechanisms. Simulations of this process robustly generate a wide range of realistic trees and bushes. The generated forms can be controlled with a variety of interactive techniques, including procedural brushes, sketching, and editing operations such as pruning and bending of branches. We illustrate the usefulness and versatility of the proposed method with diverse tree models, forest scenes, animations of tree development, and examples of combined interactive-procedural tree modeling.

Proceedings ArticleDOI
21 Jan 2009
TL;DR: A new verification method for temporal properties of higher-order functional programs, which takes advantage of Ong's recent result on the decidability of the model-checking problem for higher-order recursion schemes (HORS's).
Abstract: We propose a new verification method for temporal properties of higher-order functional programs, which takes advantage of Ong's recent result on the decidability of the model-checking problem for higher-order recursion schemes (HORS's). A program is transformed to an HORS that generates a tree representing all the possible event sequences of the program, and then the HORS is model-checked. Unlike most of the previous methods for verification of higher-order programs, our verification method is sound and complete. Moreover, this new verification framework allows a smooth integration of abstract model checking techniques into verification of higher-order programs. We also present a type-based verification algorithm for HORS's. The algorithm can deal with only a fragment of the properties expressed by modal mu-calculus, but the algorithm and its correctness proof are (arguably) much simpler than those of Ong's game-semantics-based algorithm. Moreover, while the HORS model checking problem is n-EXPTIME in general, our algorithm is linear in the size of HORS, under the assumption that the sizes of types and specification formulas are bounded by a constant.

Proceedings ArticleDOI
11 Aug 2009
TL;DR: This work gives an alternative, type-based verification method for modal mu-calculus model checking of trees generated by order-n recursion scheme, and its correctness proof is comparatively easy to understand.
Abstract: The model checking of higher-order recursion schemes has important applications in the verification of higher-order programs. Ong has previously shown that the modal mu-calculus model checking of trees generated by order-n recursion scheme is n-EXPTIME complete, but his algorithm and its correctness proof were rather complex. We give an alternative, type-based verification method: Given a modal mu-calculus formula, we can construct a type system in which a recursion scheme is typable if, and only if, the (possibly infinite, ranked) tree generated by the scheme satisfies the formula. The model checking problem is thus reduced to a type checking problem. Our type-based approach yields a simple verification algorithm, and its correctness proof (constructed without recourse to game semantics) is comparatively easy to understand. Furthermore, the algorithm is polynomial-time in the size of the recursion scheme, assuming that the formula and the largest order and arity of non-terminals of the recursion scheme are fixed.

Journal ArticleDOI
01 Aug 2009
TL;DR: This paper presents the Lazy-Adaptive Tree (LA-Tree), a novel index structure designed to improve performance by minimizing accesses to flash: it amortizes the cost of node reads and writes by performing update operations lazily through cascaded buffers.
Abstract: Flash memories are in ubiquitous use for storage on sensor nodes, mobile devices, and enterprise servers. However, they present significant challenges in designing tree indexes due to their fundamentally different read and write characteristics in comparison to magnetic disks.In this paper, we present the Lazy-Adaptive Tree (LA-Tree), a novel index structure that is designed to improve performance by minimizing accesses to flash. The LA-tree has three key features: 1) it amortizes the cost of node reads and writes by performing update operations in a lazy manner using cascaded buffers, 2) it dynamically adapts buffer sizes to workload using an online algorithm, which we prove to be optimal under the cost model for raw NAND flashes, and 3) it optimizes index parameters, memory management, and storage reclamation to address flash constraints. Our performance results on raw NAND flashes show that the LA-Tree achieves 2x to 12x gains over the best of alternate schemes across a range of workloads and memory constraints. Initial results on SSDs are also promising, with 3x to 6x gains in most cases.
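
The first feature, cascaded lazy buffering, can be sketched briefly: updates stop in a node's buffer and are pushed down in batches only on overflow, so many updates share each expensive flash node write. The fanout-2 range tree and fixed buffer bound below are toy choices; the LA-Tree's adaptive buffer sizing and flash-aware storage management are omitted:

```python
BUFFER_CAP = 3      # max pending updates per buffer (toy value; the paper adapts this)

class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.buffer, self.keys, self.children = [], {}, None

    def insert(self, key, value):
        self.buffer.append((key, value))          # lazy: no tree descent yet
        if len(self.buffer) > BUFFER_CAP:
            self.flush()

    def flush(self):
        if self.children is None:                 # leaf: apply the whole batch
            self.keys.update(self.buffer)
            if len(self.keys) > 2 * BUFFER_CAP:
                self.split()
        else:                                     # internal: push the batch down
            for k, v in self.buffer:
                self.children[k >= self.children[0].hi].buffer.append((k, v))
            for child in self.children:
                if len(child.buffer) > BUFFER_CAP:
                    child.flush()                 # cascade only when full
        self.buffer = []

    def split(self):
        mid = (self.lo + self.hi) // 2
        self.children = [Node(self.lo, mid), Node(mid, self.hi)]
        for k, v in self.keys.items():
            self.children[k >= mid].keys[k] = v
        self.keys = {}

root = Node(0, 100)
for k in [5, 60, 12, 70, 33, 45, 8, 90]:
    root.insert(k, "v%d" % k)
print(root.children[0].keys if root.children else root.keys)
```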

Journal ArticleDOI
TL;DR: This work presents a semantics-driven approach for stroke-based painterly rendering, based on recent image parsing techniques, which benefits from richer meaningful image semantic information, which leads to better simulation of painting techniques of artists using the high-quality brush dictionary.
Abstract: We present a semantics-driven approach for stroke-based painterly rendering, based on recent image parsing techniques [Tu et al. 2005; Tu and Zhu 2006] in computer vision. Image parsing integrates segmentation for regions, sketching for curves, and recognition for object categories. In an interactive manner, we decompose an input image into a hierarchy of its constituent components in a parse tree representation with occlusion relations among the nodes in the tree. To paint the image, we build a brush dictionary containing a large set (760) of brush examples of four shape/appearance categories, which are collected from professional artists, then we select appropriate brushes from the dictionary and place them on the canvas guided by the image semantics included in the parse tree, with each image component and layer painted in various styles. During this process, the scene and object categories also determine the color blending and shading strategies for inhomogeneous synthesis of image details. Compared with previous methods, this approach benefits from richer meaningful image semantic information, which leads to better simulation of painting techniques of artists using the high-quality brush dictionary. We have tested our approach on a large number (hundreds) of images and it produced satisfactory painterly effects.

Proceedings ArticleDOI
12 May 2009
TL;DR: An approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses and shows that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.
Abstract: We present an approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses. Instead of specifying a discrete set of goals in the manipulator's configuration space, we specify goals more intuitively as volumes in the manipulator's workspace. We show that WGRs provide a common framework for describing goal regions that are useful for grasping and manipulation. We also describe two randomized planning algorithms capable of planning with WGRs. The first is an extension of RRT-JT that interleaves exploration using a Rapidly-exploring Random Tree (RRT) with exploitation using Jacobian-based gradient descent toward WGR samples. The second is the IKBiRRT algorithm, which uses a forward-searching tree rooted at the start and a backward-searching tree that is seeded by WGR samples. We demonstrate both simulation and experimental results for a 7-DOF WAM arm with a mobile base performing reaching and pick-and-place tasks. Our results show that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.
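
Stripped of arm kinematics, the goal-region idea amounts to biasing a sampling-based planner's samples toward a goal volume rather than a single configuration. A bare-bones RRT for a 2-D point robot, with the goal region reduced to an axis-aligned box and no obstacles (the paper's RRT-JT variant and IKBiRRT handle the full manipulator case):

```python
import math, random

GOAL = (8.0, 9.0, 8.0, 9.0)                  # goal region box: xmin, xmax, ymin, ymax
STEP, GOAL_BIAS = 0.5, 0.2

def sample():
    if random.random() < GOAL_BIAS:          # exploit: sample inside the goal region
        return (random.uniform(GOAL[0], GOAL[1]), random.uniform(GOAL[2], GOAL[3]))
    return (random.uniform(0, 10), random.uniform(0, 10))   # explore the workspace

def rrt(start, iters=5000):
    nodes, parent = [start], {start: None}
    for _ in range(iters):
        q = sample()
        near = min(nodes, key=lambda n: math.dist(n, q))
        d = math.dist(near, q)
        if d == 0:
            continue
        new = (near[0] + STEP * (q[0] - near[0]) / d,
               near[1] + STEP * (q[1] - near[1]) / d)
        nodes.append(new); parent[new] = near
        if GOAL[0] <= new[0] <= GOAL[1] and GOAL[2] <= new[1] <= GOAL[3]:
            path = [new]                     # walk back to the start
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]

path = rrt((1.0, 1.0))
print(len(path) if path else "no path found")
```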

Journal ArticleDOI
TL;DR: An efficient technique to discover the complete set of recent frequent patterns from a high-speed data stream over a sliding window is proposed and the concept of dynamic tree restructuring in the CPS-tree is introduced to produce a highly compact frequency-descending tree structure at runtime.

Journal ArticleDOI
TL;DR: The CP-tree introduces the concept of dynamic tree restructuring to produce a highly compact frequency-descending tree structure at runtime; it captures database information with a single scan (insertion phase) and provides the same mining performance as the FP-growth method (restructuring phase).
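
The restructuring idea in miniature: insert transactions in arrival order, then periodically rebuild the prefix tree with items reordered by descending frequency so that shared prefixes compress better. The naive rebuild below re-inserts extracted branches one by one; the paper's in-place restructuring is considerably more efficient:

```python
from collections import Counter

class Trie(dict):                       # item -> child Trie; counts on nodes
    count = 0

def insert(root, items):
    node = root
    for it in items:
        node = node.setdefault(it, Trie())
        node.count += 1

def paths(node, prefix=()):
    """Yield (branch, count) pairs covering every transaction in the tree."""
    passed = 0
    for it, child in node.items():
        yield from paths(child, prefix + (it,))
        passed += child.count
    if prefix and node.count > passed:  # transactions that end at this node
        yield prefix, node.count - passed

def restructure(root, freq):
    order = lambda items: sorted(items, key=lambda it: -freq[it])
    new_root = Trie()
    for branch, cnt in paths(root):
        for _ in range(cnt):
            insert(new_root, order(branch))
    return new_root

freq = Counter()
root = Trie()
for tx in [["c", "a", "b"], ["a", "b"], ["b"]]:
    freq.update(tx)
    insert(root, tx)                    # arrival order; no restructuring yet
root = restructure(root, freq)          # frequency-descending order: b, a, c
print(len(root))                        # -> 1: all branches share the 'b' prefix
```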

Proceedings ArticleDOI
01 Sep 2009
TL;DR: This paper introduces a novel hierarchical scheme for computing Structure and Motion that has lower computational complexity, is independent of the initial pair of views, and copes better with drift problems.
Abstract: This paper introduces a novel hierarchical scheme for computing Structure and Motion. The images are organized into a tree with agglomerative clustering, using a measure of overlap as the distance. The reconstruction follows this tree from the leaves to the root. As a result, the problem is broken into smaller instances, which are then separately solved and combined. Compared to the standard sequential approach, this framework has lower computational complexity, is independent of the initial pair of views, and copes better with drift problems. A formal complexity analysis and some experimental results support these claims.