Showing papers on "Tree (data structure)" published in 2009



Proceedings ArticleDOI
28 Jun 2009
TL;DR: This paper proposes WhereNext, a method for predicting, with a certain level of accuracy, the next location of a moving object. The prediction uses previously extracted movement patterns, named Trajectory Patterns, which concisely represent the behaviors of moving objects as sequences of regions frequently visited with typical travel times.
Abstract: The pervasiveness of mobile devices and location-based services is leading to an increasing volume of mobility data. This side effect provides the opportunity for innovative methods that analyse the behaviors of movements. In this paper we propose WhereNext, a method aimed at predicting, with a certain level of accuracy, the next location of a moving object. The prediction uses previously extracted movement patterns named Trajectory Patterns, which are a concise representation of behaviors of moving objects as sequences of regions frequently visited with a typical travel time. A decision tree, named T-pattern Tree, is built and evaluated with a formal training and test process. The tree is learned from the Trajectory Patterns that hold in a certain area, and it may be used as a predictor of the next location of a new trajectory by finding the best matching path in the tree. Three different best-matching methods to classify a new moving object are proposed, and their impact on the quality of prediction is studied extensively. Using Trajectory Patterns as predictive rules has the following implications: (I) the learning depends on the movement of all available objects in a certain area instead of on the individual history of an object; (II) the prediction tree intrinsically contains the spatio-temporal properties that have emerged from the data, and this allows us to define matching methods that strictly depend on the properties of such movements. In addition, we propose a set of other measures that evaluate a priori the predictive power of a set of Trajectory Patterns. These measures were tuned on a real-life case study. Finally, an exhaustive set of experiments and results on the real dataset are presented.
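
A toy sketch of the best-matching-path idea: descend a pattern tree along the new trajectory's regions and travel times, accumulate a support-based score, and predict from the deepest matching node. The node layout, the (min, max) travel-time test, and the support-sum score below are illustrative assumptions, not the authors' WhereNext implementation.

```python
class PatternNode:
    def __init__(self, region, travel_time=(0, float("inf")), support=1):
        self.region = region            # spatial region id
        self.travel_time = travel_time  # typical (min, max) travel time from parent
        self.support = support          # trajectories supporting this step
        self.children = []

def best_match(node, trajectory, score=0):
    """Follow the tree along the trajectory; return (score, predicted next region)."""
    if not trajectory:
        # Predict the child with the highest support, if any.
        best = max(node.children, key=lambda c: c.support, default=None)
        return score, (best.region if best else None)
    (region, dt), rest = trajectory[0], trajectory[1:]
    results = [(score, None)]
    for child in node.children:
        lo, hi = child.travel_time
        if child.region == region and lo <= dt <= hi:
            results.append(best_match(child, rest, score + child.support))
    return max(results, key=lambda r: r[0])

root = PatternNode("start")
a = PatternNode("A", (60, 600), support=40); root.children.append(a)
a.children.append(PatternNode("B", (120, 900), support=25))
print(best_match(root, [("A", 300)]))   # -> (40, 'B')
```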

610 citations


Journal ArticleDOI
TL;DR: This paper proposes three novel tree structures to efficiently perform incremental and interactive HUP mining; the first captures incremental data without any restructuring operation, and experiments show that all three structures are very efficient and scalable.
Abstract: Recently, high utility pattern (HUP) mining is one of the most important research issues in data mining due to its ability to consider the nonbinary frequency values of items in transactions and different profit values for every item. On the other hand, incremental and interactive data mining provide the ability to use previous data structures and mining results in order to reduce unnecessary calculations when a database is updated, or when the minimum threshold is changed. In this paper, we propose three novel tree structures to efficiently perform incremental and interactive HUP mining. The first tree structure, Incremental HUP Lexicographic Tree (IHUPL-Tree), is arranged according to an item's lexicographic order. It can capture the incremental data without any restructuring operation. The second tree structure is the IHUP transaction frequency tree (IHUPTF-Tree), which obtains a compact size by arranging items according to their transaction frequency (descending order). To reduce the mining time, the third tree, IHUP-transaction-weighted utilization tree (IHUPTWU-Tree) is designed based on the TWU value of items in descending order. Extensive performance analyses show that our tree structures are very efficient and scalable for incremental and interactive HUP mining.
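
The lexicographic ordering is what makes the first structure update-friendly: because the item order never depends on the data, a new transaction is merged by plain prefix-tree insertion. A minimal sketch, with the utility bookkeeping simplified to a per-node accumulator:

```python
class IHUPNode:
    def __init__(self, item=None):
        self.item = item
        self.utility = 0          # accumulated utility of transactions through this node
        self.children = {}        # item -> IHUPNode

    def insert(self, transaction):
        """transaction: list of (item, utility) pairs."""
        node = self
        for item, util in sorted(transaction):   # lexicographic order never changes
            node = node.children.setdefault(item, IHUPNode(item))
            node.utility += util

root = IHUPNode()
root.insert([("b", 12), ("a", 5), ("d", 2)])
root.insert([("a", 3), ("d", 7)])              # incremental update: plain insertion
print(root.children["a"].utility)              # -> 8
```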

555 citations


Journal ArticleDOI
TL;DR: PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data.
Abstract: Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types.
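
Since phyloXML is plain XML, a tree with branch lengths can be read with standard tooling. A minimal sketch using Python's standard library; the two-leaf document follows the schema's element names (phylogeny, clade, name, branch_length), but real files carry many more annotation types:

```python
import xml.etree.ElementTree as ET

DOC = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <clade><name>A</name><branch_length>0.12</branch_length></clade>
      <clade><name>B</name><branch_length>0.34</branch_length></clade>
    </clade>
  </phylogeny>
</phyloxml>"""

NS = {"px": "http://www.phyloxml.org"}
root = ET.fromstring(DOC)
for clade in root.iter("{http://www.phyloxml.org}clade"):
    name = clade.find("px:name", NS)
    blen = clade.find("px:branch_length", NS)
    if name is not None:
        print(name.text, float(blen.text))   # A 0.12 / B 0.34
```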

527 citations


Journal ArticleDOI
TL;DR: In this article, the results of nine leading orthology identification projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) were systematically compared with respect to both phylogeny and function, using six different tests.
Abstract: Accurate genome-wide identification of orthologs is a central problem in comparative genomics, a fact reflected by the numerous orthology identification projects developed in recent years. However, only a few reports have compared their accuracy, and indeed, several recent efforts have not yet been systematically evaluated. Furthermore, orthology is typically only assessed in terms of function conservation, despite the phylogeny-based original definition of Fitch. We collected and mapped the results of nine leading orthology projects and methods (COG, KOG, Inparanoid, OrthoMCL, Ensembl Compara, Homologene, RoundUp, EggNOG, and OMA) and two standard methods (bidirectional best-hit and reciprocal smallest distance). We systematically compared their predictions with respect to both phylogeny and function, using six different tests. This required the mapping of millions of sequences, the handling of hundreds of millions of predicted pairs of orthologs, and the computation of tens of thousands of trees. In phylogenetic analysis or in functional analysis where high specificity is required, we find that OMA and Homologene perform best. At lower functional specificity but higher coverage level, OrthoMCL outperforms Ensembl Compara, and to a lesser extent Inparanoid. Lastly, the large coverage of the recent EggNOG can be of interest to build broad functional grouping, but the method is not specific enough for phylogenetic or detailed function analyses. In terms of general methodology, we observe that the more sophisticated tree reconstruction/reconciliation approach of Ensembl Compara was at times outperformed by pairwise comparison approaches, even in phylogenetic tests. Furthermore, we show that standard bidirectional best-hit often outperforms projects with more complex algorithms. First, the present study provides guidance for the broad community of orthology data users as to which database best suits their needs. Second, it introduces new methodology to verify orthology. And third, it sets performance standards for current and future approaches.
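
For reference, the bidirectional best-hit baseline that the study finds surprisingly competitive can be stated in a few lines: x in proteome A and y in proteome B are predicted orthologs when each is the other's highest-scoring hit. The similarity scores below are made up:

```python
def best_hits(scores):
    """scores: dict (query, target) -> similarity; returns query -> best target."""
    best = {}
    for (q, t), s in scores.items():
        if q not in best or s > scores[(q, best[q])]:
            best[q] = t
    return best

a_vs_b = {("a1", "b1"): 95, ("a1", "b2"): 40, ("a2", "b2"): 88}
b_vs_a = {("b1", "a1"): 93, ("b2", "a1"): 40, ("b2", "a2"): 90}

fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
bbh = [(x, y) for x, y in fwd.items() if rev.get(y) == x]
print(bbh)   # -> [('a1', 'b1'), ('a2', 'b2')]
```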

421 citations


Posted Content
08 Sep 2009
TL;DR: In this paper, a tree-guided group lasso is proposed for estimating structured sparsity under multi-response regression, employing a novel penalty function constructed from the tree. A systematic weighting scheme for the overlapping groups in the tree penalty ensures that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships caused by overlaps among groups.
Abstract: We consider the problem of estimating a sparse multi-response regression function, with an application to expression quantitative trait locus (eQTL) mapping, where the goal is to discover genetic variations that influence gene-expression levels. In particular, we investigate a shrinkage technique capable of capturing a given hierarchical structure over the responses, such as a hierarchical clustering tree with leaf nodes for responses and internal nodes for clusters of related responses at multiple granularities, and we seek to leverage this structure to recover covariates relevant to each hierarchically defined cluster of responses. We propose a tree-guided group lasso, or tree lasso, for estimating such structured sparsity under multi-response regression by employing a novel penalty function constructed from the tree. We describe a systematic weighting scheme for the overlapping groups in the tree penalty such that each regression coefficient is penalized in a balanced manner despite the inhomogeneous multiplicity of group memberships of the regression coefficients due to overlaps among groups. For efficient optimization, we employ a smoothing proximal gradient method that was originally developed for a general class of structured-sparsity-inducing penalties. Using simulated and yeast data sets, we demonstrate that our method shows superior performance in terms of both prediction errors and recovery of true sparsity patterns, compared to other methods for learning a multivariate-response regression.
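
The penalty being described has the general form Omega(beta) = sum over tree nodes v of w_v * ||beta restricted to the leaves under v||_2. A sketch of evaluating such a penalty on a toy tree; the weights and groups are illustrative and do not reproduce the paper's balancing scheme:

```python
import numpy as np

# Each node: (weight w_v, indices of coefficients grouped under the node).
tree_groups = [
    (1.0, [0]), (1.0, [1]), (1.0, [2]),   # leaves
    (0.7, [0, 1]),                        # internal node grouping responses 0, 1
    (0.4, [0, 1, 2]),                     # root
]

def tree_lasso_penalty(beta):
    # Sum of weighted L2 norms over (overlapping) groups induced by the tree.
    return sum(w * np.linalg.norm(beta[idx]) for w, idx in tree_groups)

beta = np.array([0.5, -0.5, 0.0])
print(round(tree_lasso_penalty(beta), 4))   # ~1.7778
```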

387 citations


Book ChapterDOI
27 Aug 2009
TL;DR: A method for developing algorithms that can adaptively learn from data streams that drift over time, based on using change detectors and estimator modules at the right places and choosing implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm.
Abstract: We propose and illustrate a method for developing algorithms that can adaptively learn from data streams that drift over time. As an example, we take the Hoeffding Tree, an incremental decision tree inducer for data streams, and use it as a basis to build two new methods that can deal with distribution and concept drift: a sliding window-based algorithm, Hoeffding Window Tree, and an adaptive method, Hoeffding Adaptive Tree. Our methods are based on using change detectors and estimator modules at the right places; we choose implementations with theoretical guarantees in order to extend such guarantees to the resulting adaptive learning algorithm. A main advantage of our methods is that they require no guess about how fast or how often the stream will drift; other methods typically have several user-defined parameters to this effect. In our experiments, the new methods never do worse, and in some cases do much better, than CVFDT, a well-known method for tree induction on data streams with drift.
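
The plug-in role of the change detector can be illustrated with a crude stand-in: compare the error rate in a recent window against an older reference window and signal drift when the means diverge. This fixed-threshold test has none of the guarantees of the detectors the paper relies on; it only shows where such a module sits inside an adaptive tree node:

```python
from collections import deque

class SimpleDriftDetector:
    def __init__(self, window=100, threshold=0.15):
        self.ref = deque(maxlen=window)      # older errors
        self.recent = deque(maxlen=window)   # newest errors
        self.threshold = threshold

    def add(self, error):
        # Slide the oldest "recent" observation into the reference window.
        if len(self.recent) == self.recent.maxlen:
            self.ref.append(self.recent.popleft())
        self.recent.append(error)

    def drift(self):
        if len(self.ref) < self.ref.maxlen:
            return False
        mean = lambda w: sum(w) / len(w)
        return abs(mean(self.recent) - mean(self.ref)) > self.threshold

# Usage inside a tree node: feed per-example 0/1 errors via add(); when
# drift() fires, grow an alternate subtree on fresh data and swap it in
# once it outperforms the current one.
```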

384 citations


Proceedings Article
01 Jan 2009
TL;DR: A hierarchical computational structure to recognize emotions is introduced that maps an input speech utterance into one of the multiple emotion classes through subsequent layers of binary classifications and is effective for classifying emotional utterances in multiple database contexts.
Abstract: Automated emotion state tracking is a crucial element in the computational study of human communication behaviors. It is important to design robust and reliable emotion recognition systems that are suitable for real-world applications, both to enhance analytical abilities to support human decision making and to design human-machine interfaces that facilitate efficient communication. We introduce a hierarchical computational structure to recognize emotions. The proposed structure maps an input speech utterance into one of the multiple emotion classes through subsequent layers of binary classifications. The key idea is that the levels in the tree are designed to solve the easiest classification tasks first, allowing us to mitigate error propagation. We evaluated the classification framework on two different emotional databases using acoustic features: the AIBO database and the USC IEMOCAP database. In the case of the AIBO database, we obtain a balanced recall on each of the individual emotion classes using this hierarchical structure. The performance measure of the average unweighted recall on the evaluation data set improves by 3.37% absolute (8.82% relative) over a Support Vector Machine baseline model. In the USC IEMOCAP database, we obtain an absolute improvement of 7.44% (14.58% relative) over a baseline Support Vector Machine model. The results demonstrate that the presented hierarchical approach is effective for classifying emotional utterances in multiple database contexts.
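
Structurally, the classifier is a top-down binary decision cascade: each internal node runs one binary classifier and routes the utterance onward, so the easiest discriminations can sit near the root. A skeleton with stubbed-out classifiers; the two-dimensional features and emotion labels are illustrative, not the paper's acoustic feature set:

```python
class CascadeNode:
    def __init__(self, classify, if_true, if_false):
        self.classify = classify          # features -> bool
        self.if_true, self.if_false = if_true, if_false   # subtree or final label

    def predict(self, x):
        branch = self.if_true if self.classify(x) else self.if_false
        return branch.predict(x) if isinstance(branch, CascadeNode) else branch

# Toy 2-D features: (arousal, valence).
tree = CascadeNode(
    lambda x: x[0] > 0.5,                               # easiest split first
    CascadeNode(lambda x: x[1] < 0.0, "angry", "happy"),
    CascadeNode(lambda x: x[1] < 0.0, "sad", "neutral"),
)
print(tree.predict((0.9, -0.4)))   # -> angry
```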

339 citations


Proceedings ArticleDOI
01 Sep 2009
TL;DR: The prototype-based approach enables robust action matching in very challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences.
Abstract: A prototype-based approach is introduced for action recognition. The approach represents an action as a sequence of prototypes for efficient and flexible action matching in long video sequences. During training, first, an action prototype tree is learned in a joint shape and motion space via hierarchical k-means clustering; then a lookup table of prototype-to-prototype distances is generated. During testing, based on a joint likelihood model of the actor location and action prototype, the actor is tracked while a frame-to-prototype correspondence is established by maximizing the joint likelihood, which is efficiently performed by searching the learned prototype tree; then actions are recognized using dynamic prototype sequence matching. Distance matrices used for sequence matching are rapidly obtained by look-up table indexing, which is an order of magnitude faster than brute-force computation of frame-to-frame distances. Our approach enables robust action matching in very challenging situations (such as moving cameras, dynamic backgrounds) and allows automatic alignment of action sequences. Experimental results demonstrate that our approach achieves recognition rates of 91.07% on a large gesture dataset (with dynamic backgrounds), 100% on the Weizmann action dataset and 95.77% on the KTH action dataset.

327 citations


Journal ArticleDOI
TL;DR: This paper derives a tight lower bound on the amount of data that must be communicated to complete the all-reduce operation on large data sizes in cluster environments, and proposes a ring-based algorithm that requires only tree connectivity to achieve bandwidth optimality.
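
A ring-based all-reduce is commonly realized as a reduce-scatter phase followed by an all-gather phase; each of the p nodes then communicates 2(p-1)/p of the data in total, which is what makes the approach bandwidth optimal. A small in-process simulation of that two-phase schedule (a sketch of the general pattern, not the paper's specific algorithm):

```python
def ring_allreduce(chunks):
    """chunks[i][c] = node i's value for chunk c; p nodes, p chunks."""
    p = len(chunks)
    # Reduce-scatter: after p-1 steps, node i holds the full sum of chunk (i+1) % p.
    for step in range(p - 1):
        sends = [(i, (i - step) % p, chunks[i][(i - step) % p]) for i in range(p)]
        for i, c, val in sends:                      # all sends happen "at once"
            chunks[(i + 1) % p][c] += val
    # All-gather: circulate each fully reduced chunk around the ring.
    for step in range(p - 1):
        sends = [(i, (i + 1 - step) % p, chunks[i][(i + 1 - step) % p]) for i in range(p)]
        for i, c, val in sends:
            chunks[(i + 1) % p][c] = val
    return chunks

vals = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]             # 3 nodes, 3 chunks each
print(ring_allreduce([row[:] for row in vals]))      # every node ends with [12, 15, 18]
```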

324 citations


Journal ArticleDOI
01 Aug 2009
TL;DR: This paper describes PLANET: a scalable distributed framework for learning tree models over large datasets, and shows how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models.
Abstract: Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state of the art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google's computing infrastructure is based on commodity hardware.In this paper, we describe PLANET: a scalable distributed framework for learning tree models over large datasets. PLANET defines tree learning as a series of distributed computations, and implements each one using the MapReduce model of distributed computation. We show how this framework supports scalable construction of classification and regression trees, as well as ensembles of such models. We discuss the benefits and challenges of using a MapReduce compute cluster for tree learning, and demonstrate the scalability of this approach by applying it to a real world learning task from the domain of computational advertising.
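
The core pattern is that one MapReduce round computes, for the tree node being expanded, sufficient statistics for every candidate split, from which a controller picks the best. The in-process simulation below uses a toy squared-error criterion; the shard contents and candidate thresholds are illustrative, and the real system runs this over many machines:

```python
from collections import defaultdict

def map_shard(rows, thresholds):
    """rows: (x, y) pairs. Emit (threshold, side) -> (count, sum y, sum y^2)."""
    out = defaultdict(lambda: [0, 0.0, 0.0])
    for x, y in rows:
        for t in thresholds:
            agg = out[(t, "L" if x <= t else "R")]
            agg[0] += 1; agg[1] += y; agg[2] += y * y
    return out

def reduce_merge(partials):
    total = defaultdict(lambda: [0, 0.0, 0.0])
    for part in partials:
        for key, (n, s, ss) in part.items():
            agg = total[key]
            agg[0] += n; agg[1] += s; agg[2] += ss
    return total

def sse(n, s, ss):                       # sum of squared errors around the mean
    return ss - s * s / n if n else 0.0

def best_split(total, thresholds):
    return min(thresholds,
               key=lambda t: sum(sse(*total[(t, side)]) for side in "LR"))

shards = [[(1, 1.0), (2, 1.2)], [(8, 5.0), (9, 5.3)]]
thresholds = [3, 5, 8]
total = reduce_merge([map_shard(s, thresholds) for s in shards])
print(best_split(total, thresholds))     # -> 3 (ties with 5): separates the clusters
```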


Journal ArticleDOI
TL;DR: A quantitative evaluation of structural attributes, like the vertical foliage and wood area profiles, as well as the shoot orientation distribution, confirmed the appropriateness of the proposed tree reconstruction model for the generation of structurally and radiatively faithful copies of existing plant and canopy architectures.

Journal ArticleDOI
TL;DR: This article presents a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, and proves the optimality of the algorithm among the family of decomposition strategy algorithms, which also includes the previous fastest algorithms, by tightening the known lower bound.
Abstract: The edit distance between two ordered rooted trees with vertex labels is the minimum cost of transforming one tree into the other by a sequence of elementary operations consisting of deleting and relabeling existing nodes, as well as inserting new nodes. In this article, we present a worst-case O(n^3)-time algorithm for the problem when the two trees have size n, improving the previous best O(n^3 log n)-time algorithm. Our result requires a novel adaptive strategy for deciding how a dynamic program divides into subproblems, together with a deeper understanding of the previous algorithms for the problem. We prove the optimality of our algorithm among the family of decomposition strategy algorithms, which also includes the previous fastest algorithms, by tightening the known lower bound of Omega(n^2 log^2 n) to Omega(n^3), matching our algorithm's running time. Furthermore, we obtain matching upper and lower bounds for decomposition strategy algorithms of Theta(nm^2 (1 + log(n/m))) when the two trees have sizes m and n, where m <= n.
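
For intuition about what is being computed, the edit-distance recurrence itself fits in a few lines if one ignores efficiency. The memoized sketch below explores far more subproblems than the paper's decomposition-strategy algorithms and is nowhere near O(n^3); it only demonstrates the three elementary operations with unit costs:

```python
from functools import lru_cache

# A tree is (label, children-tuple); a forest is a tuple of trees.
def size(forest):
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def ted(F, G):
    if not F: return size(G)                 # insert all of G
    if not G: return size(F)                 # delete all of F
    (l1, c1) = F[-1]; (l2, c2) = G[-1]
    return min(
        ted(F[:-1] + c1, G) + 1,             # delete rightmost root of F
        ted(F, G[:-1] + c2) + 1,             # insert rightmost root of G
        ted(c1, c2) + ted(F[:-1], G[:-1]) + (l1 != l2),  # match the two roots
    )

t1 = ("f", (("a", ()), ("b", ())))
t2 = ("f", (("a", ()), ("c", (("b", ()),))))
print(ted((t1,), (t2,)))    # -> 1: insert node 'c' above 'b'
```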

Proceedings ArticleDOI
08 Jul 2009
TL;DR: The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.
Abstract: Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.

Proceedings ArticleDOI
22 Jun 2009
TL;DR: This paper describes Treedoc, a novel CRDT design for cooperative text editing in which the identifiers of Treedoc atoms are selected from a dense space; the design is validated with traces from existing edit histories.
Abstract: A Commutative Replicated Data Type (CRDT) is one where all concurrent operations commute. The replicas of a CRDT converge automatically, without complex concurrency control. This paper describes Treedoc, a novel CRDT design for cooperative text editing. An essential property is that the identifiers of Treedoc atoms are selected from a dense space. We discuss practical alternatives for implementing the identifier space based on an extended binary tree. We also discuss storage alternatives for data and meta-data, and mechanisms for compacting the tree. In the best case, Treedoc incurs no overhead with respect to a linear text buffer. We validate the results with traces from existing edit histories.
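
The dense-identifier idea can be illustrated with binary-tree paths ordered infix: between any two positions there is always room for a fresh one, so concurrent inserts never need to shift existing atoms. A toy version that omits Treedoc's disambiguators, sparseness heuristics, and tree-compaction machinery:

```python
def key(path):
    # Appending 0.5 as a sentinel makes lexicographic order on keys equal to
    # infix order on tree positions: left subtree < node < right subtree.
    return path + (0.5,)

def between(a, b):
    """Return a fresh identifier strictly between positions a and b."""
    assert key(a) < key(b)
    for cand in (a + (1,), b + (0,)):       # right child of a, left child of b
        if key(a) < key(cand) < key(b):
            return cand

# Insert "B" between "A" and "C" in a replicated sequence of (id, char) atoms.
doc = {(0,): "A", (1,): "C"}
doc[between((0,), (1,))] = "B"
print("".join(ch for _, ch in sorted(doc.items(), key=lambda kv: key(kv[0]))))  # -> ABC
```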

Journal ArticleDOI
TL;DR: It is shown that searching for the species tree in the compatibility graph of the clusters induced by the gene trees may be sufficient in practice, a finding that helps ameliorate the computational requirements of optimization solutions.
Abstract: In a 1997 seminal paper, W. Maddison proposed minimizing deep coalescences, or MDC, as an optimization criterion for inferring the species tree from a set of incongruent gene trees, assuming the incongruence is exclusively due to lineage sorting. In a subsequent paper, Maddison and Knowles provided and implemented a search heuristic for optimizing the MDC criterion, given a set of gene trees. However, the heuristic is not guaranteed to compute optimal solutions, and its hill-climbing search makes it slow in practice. In this paper, we provide two exact solutions to the problem of inferring the species tree from a set of gene trees under the MDC criterion. In other words, our solutions are guaranteed to find the tree that minimizes the total number of deep coalescences from a set of gene trees. One solution is based on a novel integer linear programming (ILP) formulation, and another is based on a simple dynamic programming (DP) approach. Powerful ILP solvers, such as CPLEX, make the first solution appealing, particularly for very large-scale instances of the problem, whereas the DP-based solution eliminates dependence on proprietary tools, and its simplicity makes it easy to integrate with other genomic events that may cause gene tree incongruence. Using the exact solutions, we analyze a data set of 106 loci from eight yeast species, a data set of 268 loci from eight Apicomplexan species, and several simulated data sets. We show that the MDC criterion provides very accurate estimates of the species tree topologies, and that our solutions are very fast, thus allowing for the accurate analysis of genome-scale data sets. Further, the efficiency of the solutions allows for quick exploration of sub-optimal solutions, which is important for a parsimony-based criterion such as MDC, as we show. We show that searching for the species tree in the compatibility graph of the clusters induced by the gene trees may be sufficient in practice, a finding that helps ameliorate the computational requirements of optimization solutions. Further, we study the statistical consistency and convergence rate of the MDC criterion, as well as its optimality in inferring the species tree. Finally, we show how our solutions can be used to identify potential horizontal gene transfer events that may have caused some of the incongruence in the data, thus augmenting Maddison's original framework. We have implemented our solutions in the PhyloNet software package, which is freely available at: http://bioinfo.cs.rice.edu/phylonet

Proceedings ArticleDOI
29 Jun 2009
TL;DR: A cost model is proposed that accurately captures the actual runtime behavior of a plan; choosing the optimal plan can yield a factor of four or more speedup over an NFA-based approach. A dynamic programming algorithm built on this cost model efficiently searches for the optimal query plan for a given pattern.
Abstract: Composite (or Complex) event processing (CEP) systems search sequences of incoming events for occurrences of user-specified event patterns. Recently, they have gained more attention in a variety of areas due to their powerful and expressive query language and performance potential. Sequentiality (temporal ordering) is the primary way in which CEP systems relate events to each other. In this paper, we present a CEP system called ZStream to efficiently process such sequential patterns. Besides simple sequential patterns, ZStream is also able to detect other patterns, including conjunction, disjunction, negation and Kleene closure. Unlike most recently proposed CEP systems, which use non-deterministic finite automata (NFA's) to detect patterns, ZStream uses tree-based query plans for both the logical and physical representation of query patterns. By carefully designing the underlying infrastructure and algorithms, ZStream is able to unify the evaluation of sequence, conjunction, disjunction, negation, and Kleene closure as variants of the join operator. Under this framework, a single pattern in ZStream may have several equivalent physical tree plans, with different evaluation costs. We propose a cost model to estimate the computation costs of a plan. We show that our cost model can accurately capture the actual runtime behavior of a plan, and that choosing the optimal plan can result in a factor of four or more speedup versus an NFA based approach. Based on this cost model and using a simple set of statistics about operator selectivity and data rates, ZStream is able to adaptively and seamlessly adjust the order in which it detects patterns on the fly. Finally, we describe a dynamic programming algorithm used in our cost model to efficiently search for an optimal query plan for a given pattern.
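
The plan-selection problem is analogous to join ordering: every binary bracketing of a sequence pattern is a tree plan with its own cost, and an interval dynamic program finds the cheapest. The sketch below uses a deliberately simplified, assumed cost model (output rate = product of input rates times one selectivity; cost = total intermediate output), not ZStream's actual statistics:

```python
from functools import lru_cache

rates = (50, 10, 40, 5)     # events/sec arriving on each input stream
SEL = 0.01                  # fraction of pairs satisfying the temporal predicate

@lru_cache(maxsize=None)
def plan(i, j):
    """Cheapest plan for streams i..j; returns (cost, output rate, plan tree)."""
    if i == j:
        return (0.0, rates[i], str(i))
    best = None
    for k in range(i, j):                       # split point = root of the tree plan
        lc, lr, lp = plan(i, k)
        rc, rr, rp = plan(k + 1, j)
        out = SEL * lr * rr
        cand = (lc + rc + out, out, f"({lp} {rp})")
        if best is None or cand[0] < best[0]:
            best = cand
    return best

cost, rate, shape = plan(0, len(rates) - 1)
print(round(cost, 2), shape)    # -> 2.3 (0 (1 (2 3))): a right-deep plan wins here
```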

Journal ArticleDOI
TL;DR: In this article, the exact shapes of irregular tree crowns of various tree species are captured from lidar point clouds, and their crown formation is visualized in three-dimensional space using computer graphics.

Journal ArticleDOI
TL;DR: For the first time, by using the properties of the XBW-transform, compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space.
Abstract: Consider an ordered, static tree T where each node has a label from alphabet Σ. Tree T may be of arbitrary degree and shape. Our goal is designing a compressed storage scheme of T that supports basic navigational operations among the immediate neighbors of a node (i.e. parent, ith child, or any child with some label,…) as well as more sophisticated path-based search operations over its labeled structure.We present a novel approach to this problem by designing what we call the XBW-transform of the tree in the spirit of the well-known Burrows-Wheeler transform for strings [1994]. The XBW-transform uses path-sorting to linearize the labeled tree T into two coordinated arrays, one capturing the structure and the other the labels. For the first time, by using the properties of the XBW-transform, our compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space.Our XBW-transform is simple and likely to spur new results in the theory of tree compression and indexing, as well as interesting application contexts. As an example, we use the XBW-transform to design and implement a compressed index for XML documents whose compression ratio is significantly better than the one achievable by state-of-the-art tools, and its query time performance is order of magnitudes faster.
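
The transform itself is short to state: list each node as (last-child flag, label, upward path of ancestor labels) in preorder, then stably sort the rows by the upward path; the flag and label columns are the two coordinated arrays. A sketch for single-character labels (the real construction also distinguishes internal from leaf labels and adds rank/select machinery on top):

```python
def xbw(tree):
    """tree = (label, [children]); returns the (S_last, S_alpha) arrays."""
    rows = []                                   # (upward path, last flag, label)
    def visit(node, is_last, pi):
        label, children = node
        rows.append((pi, is_last, label))
        for i, child in enumerate(children):
            visit(child, i == len(children) - 1, label + pi)
    visit(tree, True, "")
    rows.sort(key=lambda r: r[0])               # Python's sort is stable
    return [last for _, last, _ in rows], [lab for _, _, lab in rows]

tree = ("A", [("B", [("D", [])]), ("C", [])])
S_last, S_alpha = xbw(tree)
print(S_alpha)   # -> ['A', 'B', 'C', 'D']: labels in path-sorted (XBW) order
```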

Journal ArticleDOI
TL;DR: This paper presents a probabilistic model integrating gene duplication, sequence evolution, and a relaxed molecular clock for substitution rates, enabling genome-wide analysis of gene families; the model is able to draw biologically relevant conclusions concerning gene duplications that created key yeast phenotypes.
Abstract: We present GSR, a probabilistic model integrating gene duplication, sequence evolution, and a relaxed molecular clock for substitution rates, that enables genomewide analysis of gene families. The gene duplication and loss process is a major cause for incongruence between gene and species tree, and deterministic methods have been developed to explain such differences through tree reconciliations. Although probabilistic methods for phylogenetic inference have been around for decades, probabilistic reconciliation methods are far less established. Based on our model, we have implemented a Bayesian analysis tool, PrIME-GSR, for gene tree inference that takes a known species tree into account. Our implementation is sound and we demonstrate its utility for genomewide gene-family analysis by applying it to recently presented yeast data. We validate PrIME-GSR by comparing with previous analyses of these data that take advantage of gene order information. In a case study we apply our method to the ADH gene family and are able to draw biologically relevant conclusions concerning gene duplications creating key yeast phenotypes. On a higher level this shows the biological relevance of our method. The obtained results demonstrate the value of a relaxed molecular clock. Our good performance will extend to species where gene order conservation is insufficient.

Journal ArticleDOI
27 Jul 2009
TL;DR: Simulations of this method robustly generate a wide range of realistic trees and bushes, which can be controlled with a variety of interactive techniques, including procedural brushes, sketching, and editing operations such as pruning and bending of branches.
Abstract: We present a method for generating realistic models of temperate-climate trees and shrubs. This method is based on the biological hypothesis that the form of a developing tree emerges from a self-organizing process dominated by the competition of buds and branches for light or space, and regulated by internal signaling mechanisms. Simulations of this process robustly generate a wide range of realistic trees and bushes. The generated forms can be controlled with a variety of interactive techniques, including procedural brushes, sketching, and editing operations such as pruning and bending of branches. We illustrate the usefulness and versatility of the proposed method with diverse tree models, forest scenes, animations of tree development, and examples of combined interactive-procedural tree modeling.

Proceedings ArticleDOI
21 Jan 2009
TL;DR: A new verification method for temporal properties of higher-order functional programs, which takes advantage of Ong's recent result on the decidability of the model-checking problem for higher-order recursion schemes (HORS's).
Abstract: We propose a new verification method for temporal properties of higher-order functional programs, which takes advantage of Ong's recent result on the decidability of the model-checking problem for higher-order recursion schemes (HORS's). A program is transformed to an HORS that generates a tree representing all the possible event sequences of the program, and then the HORS is model-checked. Unlike most of the previous methods for verification of higher-order programs, our verification method is sound and complete. Moreover, this new verification framework allows a smooth integration of abstract model checking techniques into verification of higher-order programs. We also present a type-based verification algorithm for HORS's. The algorithm can deal with only a fragment of the properties expressed by modal mu-calculus, but the algorithm and its correctness proof are (arguably) much simpler than those of Ong's game-semantics-based algorithm. Moreover, while the HORS model checking problem is n-EXPTIME in general, our algorithm is linear in the size of HORS, under the assumption that the sizes of types and specification formulas are bounded by a constant.

Proceedings ArticleDOI
11 Aug 2009
TL;DR: This work gives an alternative, type-based verification method for modal mu-calculus model checking of trees generated by order-n recursion scheme, and its correctness proof is comparatively easy to understand.
Abstract: The model checking of higher-order recursion schemes has important applications in the verification of higher-order programs. Ong has previously shown that the modal mu-calculus model checking of trees generated by order-n recursion scheme is n-EXPTIME complete, but his algorithm and its correctness proof were rather complex. We give an alternative, type-based verification method: Given a modal mu-calculus formula, we can construct a type system in which a recursion scheme is typable if, and only if, the (possibly infinite, ranked) tree generated by the scheme satisfies the formula. The model checking problem is thus reduced to a type checking problem. Our type-based approach yields a simple verification algorithm, and its correctness proof (constructed without recourse to game semantics) is comparatively easy to understand. Furthermore, the algorithm is polynomial-time in the size of the recursion scheme, assuming that the formula and the largest order and arity of non-terminals of the recursion scheme are fixed.

Journal ArticleDOI
01 Aug 2009
TL;DR: This paper presents the Lazy-Adaptive Tree (LA-Tree), a novel index structure designed to improve performance by minimizing accesses to flash: it amortizes the cost of node reads and writes by performing update operations lazily through cascaded buffers.
Abstract: Flash memories are in ubiquitous use for storage on sensor nodes, mobile devices, and enterprise servers. However, they present significant challenges in designing tree indexes due to their fundamentally different read and write characteristics in comparison to magnetic disks.In this paper, we present the Lazy-Adaptive Tree (LA-Tree), a novel index structure that is designed to improve performance by minimizing accesses to flash. The LA-tree has three key features: 1) it amortizes the cost of node reads and writes by performing update operations in a lazy manner using cascaded buffers, 2) it dynamically adapts buffer sizes to workload using an online algorithm, which we prove to be optimal under the cost model for raw NAND flashes, and 3) it optimizes index parameters, memory management, and storage reclamation to address flash constraints. Our performance results on raw NAND flashes show that the LA-Tree achieves 2x to 12x gains over the best of alternate schemes across a range of workloads and memory constraints. Initial results on SSDs are also promising, with 3x to 6x gains in most cases.
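
The first feature, cascaded lazy buffering, can be sketched briefly: updates stop in a node's buffer and are pushed down in batches only on overflow, so many updates share each expensive flash node write. The fanout-2 range tree and fixed buffer bound below are toy choices; the LA-Tree's adaptive buffer sizing and flash-aware storage management are omitted:

```python
BUFFER_CAP = 3      # max pending updates per buffer (toy value; the paper adapts this)

class Node:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.buffer, self.keys, self.children = [], {}, None

    def insert(self, key, value):
        self.buffer.append((key, value))          # lazy: no tree descent yet
        if len(self.buffer) > BUFFER_CAP:
            self.flush()

    def flush(self):
        if self.children is None:                 # leaf: apply the whole batch
            self.keys.update(self.buffer)
            if len(self.keys) > 2 * BUFFER_CAP:
                self.split()
        else:                                     # internal: push the batch down
            for k, v in self.buffer:
                self.children[k >= self.children[0].hi].buffer.append((k, v))
            for child in self.children:
                if len(child.buffer) > BUFFER_CAP:
                    child.flush()                 # cascade only when full
        self.buffer = []

    def split(self):
        mid = (self.lo + self.hi) // 2
        self.children = [Node(self.lo, mid), Node(mid, self.hi)]
        for k, v in self.keys.items():
            self.children[k >= mid].keys[k] = v
        self.keys = {}

root = Node(0, 100)
for k in [5, 60, 12, 70, 33, 45, 8, 90]:
    root.insert(k, "v%d" % k)
print(root.children[0].keys if root.children else root.keys)
```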

Journal ArticleDOI
TL;DR: This work presents a semantics-driven approach for stroke-based painterly rendering, based on recent image parsing techniques, which benefits from richer meaningful image semantic information, which leads to better simulation of painting techniques of artists using the high-quality brush dictionary.
Abstract: We present a semantics-driven approach for stroke-based painterly rendering, based on recent image parsing techniques [Tu et al. 2005; Tu and Zhu 2006] in computer vision. Image parsing integrates segmentation for regions, sketching for curves, and recognition for object categories. In an interactive manner, we decompose an input image into a hierarchy of its constituent components in a parse tree representation with occlusion relations among the nodes in the tree. To paint the image, we build a brush dictionary containing a large set (760) of brush examples of four shape/appearance categories, which are collected from professional artists, then we select appropriate brushes from the dictionary and place them on the canvas guided by the image semantics included in the parse tree, with each image component and layer painted in various styles. During this process, the scene and object categories also determine the color blending and shading strategies for inhomogeneous synthesis of image details. Compared with previous methods, this approach benefits from richer meaningful image semantic information, which leads to better simulation of painting techniques of artists using the high-quality brush dictionary. We have tested our approach on a large number (hundreds) of images and it produced satisfactory painterly effects.

Proceedings ArticleDOI
12 May 2009
TL;DR: An approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses and shows that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.
Abstract: We present an approach to path planning for manipulators that uses Workspace Goal Regions (WGRs) to specify goal end-effector poses. Instead of specifying a discrete set of goals in the manipulator's configuration space, we specify goals more intuitively as volumes in the manipulator's workspace. We show that WGRs provide a common framework for describing goal regions that are useful for grasping and manipulation. We also describe two randomized planning algorithms capable of planning with WGRs. The first is an extension of RRT-JT that interleaves exploration using a Rapidly-exploring Random Tree (RRT) with exploitation using Jacobian-based gradient descent toward WGR samples. The second is the IKBiRRT algorithm, which uses a forward-searching tree rooted at the start and a backward-searching tree that is seeded by WGR samples. We demonstrate both simulation and experimental results for a 7-DOF WAM arm with a mobile base performing reaching and pick-and-place tasks. Our results show that planning with WGRs provides an intuitive and powerful method of specifying goals for a variety of tasks without sacrificing efficiency or desirable completeness properties.
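
Stripped of arm kinematics, the goal-region idea amounts to biasing a sampling-based planner's samples toward a goal volume rather than a single configuration. A bare-bones RRT for a 2-D point robot, with the goal region reduced to an axis-aligned box and no obstacles (the paper's RRT-JT variant and IKBiRRT handle the full manipulator case):

```python
import math, random

GOAL = (8.0, 9.0, 8.0, 9.0)                  # goal region box: xmin, xmax, ymin, ymax
STEP, GOAL_BIAS = 0.5, 0.2

def sample():
    if random.random() < GOAL_BIAS:          # exploit: sample inside the goal region
        return (random.uniform(GOAL[0], GOAL[1]), random.uniform(GOAL[2], GOAL[3]))
    return (random.uniform(0, 10), random.uniform(0, 10))   # explore the workspace

def rrt(start, iters=5000):
    nodes, parent = [start], {start: None}
    for _ in range(iters):
        q = sample()
        near = min(nodes, key=lambda n: math.dist(n, q))
        d = math.dist(near, q)
        if d == 0:
            continue
        new = (near[0] + STEP * (q[0] - near[0]) / d,
               near[1] + STEP * (q[1] - near[1]) / d)
        nodes.append(new); parent[new] = near
        if GOAL[0] <= new[0] <= GOAL[1] and GOAL[2] <= new[1] <= GOAL[3]:
            path = [new]                     # walk back to the start
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]

path = rrt((1.0, 1.0))
print(len(path) if path else "no path found")
```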

Journal ArticleDOI
TL;DR: An efficient technique to discover the complete set of recent frequent patterns from a high-speed data stream over a sliding window is proposed and the concept of dynamic tree restructuring in the CPS-tree is introduced to produce a highly compact frequency-descending tree structure at runtime.

Journal ArticleDOI
TL;DR: The CP-tree introduces the concept of dynamic tree restructuring to produce a highly compact frequency-descending tree structure at runtime; it captures database information with a single scan (insertion phase) and provides the same mining performance as the FP-growth method (restructuring phase).
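
The restructuring idea in miniature: insert transactions in arrival order, then periodically rebuild the prefix tree with items reordered by descending frequency so that shared prefixes compress better. The naive rebuild below re-inserts extracted branches one by one; the paper's in-place restructuring is considerably more efficient:

```python
from collections import Counter

class Trie(dict):                       # item -> child Trie; counts on nodes
    count = 0

def insert(root, items):
    node = root
    for it in items:
        node = node.setdefault(it, Trie())
        node.count += 1

def paths(node, prefix=()):
    """Yield (branch, count) pairs covering every transaction in the tree."""
    passed = 0
    for it, child in node.items():
        yield from paths(child, prefix + (it,))
        passed += child.count
    if prefix and node.count > passed:  # transactions that end at this node
        yield prefix, node.count - passed

def restructure(root, freq):
    order = lambda items: sorted(items, key=lambda it: -freq[it])
    new_root = Trie()
    for branch, cnt in paths(root):
        for _ in range(cnt):
            insert(new_root, order(branch))
    return new_root

freq = Counter()
root = Trie()
for tx in [["c", "a", "b"], ["a", "b"], ["b"]]:
    freq.update(tx)
    insert(root, tx)                    # arrival order; no restructuring yet
root = restructure(root, freq)          # frequency-descending order: b, a, c
print(len(root))                        # -> 1: all branches share the 'b' prefix
```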

Proceedings ArticleDOI
01 Sep 2009
TL;DR: This paper introduces a novel hierarchical scheme for computing Structure and Motion that has lower computational complexity, is independent of the initial pair of views, and copes better with drift problems.
Abstract: This paper introduces a novel hierarchical scheme for computing Structure and Motion. The images are organized into a tree with agglomerative clustering, using a measure of overlap as the distance. The reconstruction follows this tree from the leaves to the root. As a result, the problem is broken into smaller instances, which are then separately solved and combined. Compared to the standard sequential approach, this framework has lower computational complexity, is independent of the initial pair of views, and copes better with drift problems. A formal complexity analysis and some experimental results support these claims.