Showing papers on "Binary tree" published in 2009


Journal ArticleDOI
TL;DR: A binary tree structure is exploited to solve the problem of communicating pairs of peak points and distribution of pixel differences is used to achieve large hiding capacity while keeping the distortion low.
Abstract: In this letter, we present a reversible data hiding scheme based on histogram modification. We exploit a binary tree structure to solve the problem of communicating pairs of peak points. Distribution of pixel differences is used to achieve large hiding capacity while keeping the distortion low. We also adopt a histogram shifting technique to prevent overflow and underflow. Performance comparisons with other existing schemes are provided to demonstrate the superiority of the proposed scheme.

550 citations


Proceedings ArticleDOI
22 Jun 2009
TL;DR: Treedoc is described, a novel CRDT design for cooperative text editing in which the identifiers of Treedoc atoms are selected from a dense space; the results are validated with traces from existing edit histories.
Abstract: A Commutative Replicated Data Type (CRDT) is one where all concurrent operations commute. The replicas of a CRDT converge automatically, without complex concurrency control. This paper describes Treedoc, a novel CRDT design for cooperative text editing. An essential property is that the identifiers of Treedoc atoms are selected from a dense space. We discuss practical alternatives for implementing the identifier space based on an extended binary tree. We also discuss storage alternatives for data and meta-data, and mechanisms for compacting the tree. In the best case, Treedoc incurs no overhead with respect to a linear text buffer. We validate the results with traces from existing edit histories.
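The idea of a dense identifier space over a binary tree can be pictured with a minimal sketch. It assumes, for illustration only, that an atom's identifier is the bit string of its path from the root (`0` = left child, `1` = right child) and that the document is read in infix order. The names `infix_key` and `id_between` are hypothetical, and the paper's extended tree, storage layout, and compaction are not modelled.

```python
def infix_key(path: str) -> tuple:
    """Sort key ordering binary-tree paths ('0'/'1' strings) by infix traversal.

    Each '0' (left) maps to -1 and each '1' (right) to +1; the trailing 0
    stands for the node itself, so a node sorts after everything in its left
    subtree and before everything in its right subtree.
    """
    return tuple(-1 if bit == "0" else 1 for bit in path) + (0,)


def id_between(left: str, right: str) -> str:
    """Return a path strictly between `left` and `right` in infix order.

    Hypothetical helper: it only shows that such a path always exists
    (density); it does not check whether the slot is already occupied.
    """
    if infix_key(left) >= infix_key(right):
        raise ValueError("left must precede right in infix order")
    if right.startswith(left):
        # `right` lies in the right subtree of `left`: descend to its left.
        return right + "0"
    # Otherwise `right` lies outside the subtree of `left`: go right of `left`.
    return left + "1"
```

Because a new path can always be generated between two existing ones, an insertion never requires renumbering the atoms around it, which is the property the abstract calls density.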

229 citations


Journal ArticleDOI
TL;DR: A novel non-revisiting genetic algorithm is reported that remembers every position it has searched before; its binary space partitioning tree archive also acts as a parameter-free adaptive mutation operator that shows and maintains stable good performance.
Abstract: A novel genetic algorithm is reported that is non-revisiting: It remembers every position that it has searched before. An archive is used to store all the solutions that have been explored before. Different from other memory schemes in the literature, a novel binary space partitioning tree archive design is advocated. Not only is the design an efficient method to check for revisits, if any, it in itself constitutes a novel adaptive mutation operator that has no parameter. To demonstrate the power of the method, the algorithm is evaluated using 19 famous benchmark functions. The results are as follows. (1) Though it only uses finite resolution grids, when compared with a canonical genetic algorithm, a generic real-coded genetic algorithm, a canonical genetic algorithm with simple diversity mechanism, and three particle swarm optimization algorithms, it shows a significant improvement. (2) The new algorithm also shows superior performance compared to covariance matrix adaptation evolution strategy (CMA-ES), a state-of-the-art method for adaptive mutation. (3) It can work with problems that have large search spaces with dimensions as high as 40. (4) The corresponding CPU overhead of the binary space partitioning tree design is insignificant for applications with expensive or time-consuming fitness evaluations, and for such applications, the memory usage due to the archive is acceptable. (5) Though the adaptive mutation is parameter-less, it shows and maintains a stable good performance. However, for other algorithms we compare, the performance is highly dependent on suitable parameter settings.

154 citations


Journal ArticleDOI
TL;DR: Theoretical analysis and simulation results demonstrate the validity and practicality of the BAT scheme, which can effectively eliminate the performance bottleneck when verifying a mass of signatures within a rigorously required interval, even under adverse scenarios with bogus messages.
Abstract: In this paper, we propose a robust and efficient signature scheme for vehicle-to-infrastructure communications, called binary authentication tree (BAT). The BAT scheme can effectively eliminate the performance bottleneck when verifying a mass of signatures within a rigorously required interval, even under adverse scenarios with bogus messages. Given any n received messages with k ≥ 1 bogus ones, the computation cost to verify all these messages only requires approximately (k + 1) · log(n/k) + 4k - 2 time-consuming pairing operations. The BAT scheme can also be gracefully transplanted to other similar batch signature schemes. In addition, it offers the other conventional security for vehicular networks, such as identity privacy and traceability. Theoretical analysis and simulation results demonstrate the validity and practicality of the BAT scheme.
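To get a feel for the cost estimate quoted in the abstract, the short sketch below simply evaluates it. The function name is hypothetical, and the logarithm is assumed to be base 2 (natural for a binary tree, but not stated in the abstract).

```python
import math


def bat_pairing_estimate(n: int, k: int) -> float:
    """Evaluate the abstract's estimate (k + 1) * log(n / k) + 4k - 2.

    n: number of received messages; k: number of bogus ones (1 <= k <= n).
    Assumption: the logarithm is taken base 2.
    """
    if not 1 <= k <= n:
        raise ValueError("requires 1 <= k <= n")
    return (k + 1) * math.log2(n / k) + 4 * k - 2
```

Under that assumption, 1024 messages with a single bogus one give an estimate of 22 pairing operations, and with four bogus ones an estimate of 54.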

117 citations


Journal ArticleDOI
01 Dec 2009
TL;DR: This work presents a new, simple algorithmic idea for the collective communication operations broadcast, reduction, and scan (prefix sums), which beats all previous algorithms for reduction and scan.
Abstract: We present a new, simple algorithmic idea for the collective communication operations broadcast, reduction, and scan (prefix sums). The algorithms concurrently communicate over two binary trees which both span the entire network. By careful layout and communication scheduling, each tree communicates as efficiently as a single tree with exclusive use of the network. Our algorithms thus achieve up to twice the bandwidth of most previous algorithms. In particular, our approach beats all previous algorithms for reduction and scan. Experiments on clusters with Myrinet and InfiniBand interconnect show significant reductions in running time for all three operations, sometimes even close to the best possible factor of two.

96 citations


Book ChapterDOI
03 Oct 2009
TL;DR: A family of pairwise tournaments reducing k-class classification to binary classification is presented; the reductions are provably robust against a constant fraction of binary errors, and match the best possible computation and regret up to a constant.
Abstract: We present a family of pairwise tournaments reducing k-class classification to binary classification. These reductions are provably robust against a constant fraction of binary errors, and match the best possible computation and regret up to a constant.

84 citations


Journal ArticleDOI
TL;DR: A simple and practical method that computes the exact (r)SPR distance with integer linear programming is presented, and it is shown that this new method outperforms existing software tools in terms of accuracy and efficiency.
Abstract: Motivation: Subtree prune and regraft (SPR) is one kind of tree rearrangement that has seen applications in solving several computational biology problems. The minimum number of rooted SPR (rSPR) operations needed to transform one rooted binary tree to another is called the rSPR distance between the two trees. Computing the rSPR distance has been actively studied in recent years. Currently, there is a lack of practical software tools for computing the rSPR distance for relatively large trees with large rSPR distance. Results: In this article, we present a simple and practical method that computes the exact rSPR distance with integer linear programming. By applying this new method on several simulated and real biological datasets, we show that our new method outperforms existing software tools in terms of accuracy and efficiency. Our experimental results indicate that our method can compute the exact rSPR distance for many large trees with large rSPR distance. Availability: A software tool, SPRDist, is available for download from the web page: http://www.engr.uconn.edu/~ywu. Contact: ywu@engr.uconn.edu

66 citations


Journal ArticleDOI
TL;DR: Experimental results show that the CBKM achieves better clustering quality than that of KM, BKM, and single linkage (SL) algorithms with comparable time performance over a number of artificial, text documents, and gene expression datasets.

63 citations


Journal ArticleDOI
TL;DR: In this paper, it was shown that an infinite weighted tree admits a bi-Lipschitz embedding into Hilbert space if and only if it does not contain arbitrarily large complete binary trees with uniformly bounded distortion.
Abstract: We show that an infinite weighted tree admits a bi-Lipschitz embedding into Hilbert space if and only if it does not contain arbitrarily large complete binary trees with uniformly bounded distortion. We also introduce a new metric invariant called Markov convexity, and show how it can be used to compute the Euclidean distortion of any metric tree up to universal factors.

62 citations


Proceedings Article
10 May 2009
TL;DR: This paper investigates the computational complexity of tournament schedule control, i.e., designing a tournament that maximizes the winning probability of a target player.
Abstract: Knockout tournaments constitute a common format of sporting events, and also model a specific type of election scheme (namely, sequential pairwise elimination election). In such tournaments the designer controls the shape of the tournament (a binary tree) and the seeding of the players (their assignment to the tree leaves). In this paper we investigate the computational complexity of tournament schedule control, i.e., designing a tournament that maximizes the winning probability of a target player. We start with a generic probabilistic model consisting of a matrix of pairwise winning probabilities, and then investigate the problem under two types of constraint: constraints on the probability matrix, and constraints on the allowable tournament structure. While the complexity of the general problem is as yet unknown, these various constraints -- all naturally occurring in practice -- serve to push the problem to one side or the other: easy (polynomial) or hard (NP-complete).
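The quantity being optimized can be made concrete with a small sketch that evaluates one fixed tournament. It assumes a balanced bracket (the paper also allows other tree shapes), a seeding given as the left-to-right order of players at the leaves, and a matrix `p` where `p[a][b]` is the probability that `a` beats `b`. The function name is hypothetical, and the sketch evaluates a design rather than searching for the best one.

```python
from typing import Dict, Sequence


def win_probabilities(seeding: Sequence[int],
                      p: Sequence[Sequence[float]]) -> Dict[int, float]:
    """Probability that each player wins a balanced knockout tournament.

    seeding: players in left-to-right leaf order (length a power of two).
    p[a][b]: probability that player a beats player b.
    """
    n = len(seeding)
    if n == 0 or n & (n - 1):
        raise ValueError("seeding length must be a power of two")
    if n == 1:
        return {seeding[0]: 1.0}
    mid = n // 2
    # Solve the two sub-brackets independently, then combine at the root.
    left = win_probabilities(seeding[:mid], p)
    right = win_probabilities(seeding[mid:], p)
    result: Dict[int, float] = {}
    for a, pa in left.items():
        result[a] = pa * sum(pb * p[a][b] for b, pb in right.items())
    for b, pb in right.items():
        result[b] = pb * sum(pa * p[b][a] for a, pa in left.items())
    return result
```

With a matrix in which player 0 always beats players 1 and 2 but always loses to player 3, and player 3 always loses to 1 and 2, the seeding `[0, 1, 3, 2]` lets player 0 win with certainty while `[0, 3, 1, 2]` eliminates player 0 in the first round, which shows how much the seeding alone can matter.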

61 citations


Journal ArticleDOI
01 Aug 2009
TL;DR: A novel framework, called Pruned Wavelet Trees, is proposed that aims for the best combination of Wavelet Trees of properly-designed shapes and compressors, either binary (like Run-Length encoders) or non-binary (like Huffman and Arithmetic encoders).
Abstract: Wavelet Trees have been introduced by Grossi et al. in SODA 2003 and have been rapidly recognized as a very flexible tool for the design of compressed full-text indexes and data compression algorithms. Although several papers have investigated the properties and usefulness of this data structure in the full-text indexing scenario, its impact on data compression has not been fully explored. In this paper we provide a thorough theoretical analysis of a wide class of compression algorithms based on Wavelet Trees. Also, we propose a novel framework, called Pruned Wavelet Trees, that aims for the best combination of Wavelet Trees of properly-designed shapes and compressors, either binary (like Run-Length encoders) or non-binary (like Huffman and Arithmetic encoders).

Book ChapterDOI
05 Dec 2009
TL;DR: A data structure in the RAM model is presented supporting queries that given two indices i ≤ j and an integer k report the k smallest elements in the subarray A[i..j] in sorted order in optimal O(k) time.
Abstract: We study the following one-dimensional range reporting problem: On an array A of n elements, support queries that given two indices i ≤ j and an integer k report the k smallest elements in the subarray A[i..j] in sorted order. We present a data structure in the RAM model supporting such queries in optimal O(k) time. The structure uses O(n) words of space and can be constructed in O(n log n) time. The data structure can be extended to solve the online version of the problem, where the elements in A[i..j] are reported one-by-one in sorted order, in O(1) worst-case time per element. The problem is motivated by (and is a generalization of) a problem with applications in search engines: On a tree where leaves have associated rank values, report the highest ranked leaves in a given subtree. Finally, the problem studied generalizes the classic range minimum query (RMQ) problem on arrays.
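The query itself is easy to state in code. The sketch below is only a naive reference for its semantics, not the paper's data structure: it scans the whole subarray on every query, so it takes time proportional to the range length rather than O(k). The function name is an assumption.

```python
import heapq
from typing import List, Sequence


def k_smallest_in_range(a: Sequence[int], i: int, j: int, k: int) -> List[int]:
    """Return the k smallest elements of a[i..j] (inclusive) in sorted order.

    Naive baseline: scans the subarray on each query instead of using a
    preprocessed structure.
    """
    if not 0 <= i <= j < len(a):
        raise IndexError("requires 0 <= i <= j < len(a)")
    return heapq.nsmallest(k, a[i:j + 1])
```

With `k = 1` the query returns the range minimum, which is the sense in which the problem generalizes RMQ.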

Journal ArticleDOI
TL;DR: In this paper, the authors used a natural ordered extension of the Chinese Restaurant Process to grow a two-parameter family of binary self-similar continuum fragmentation trees, and provided an explicit embedding of Ford's sequence of alpha model trees in the continuum tree which they identified in a previous article as a distributional scaling limit of Ford’s trees.
Abstract: We use a natural ordered extension of the Chinese Restaurant Process to grow a two-parameter family of binary self-similar continuum fragmentation trees. We provide an explicit embedding of Ford’s sequence of alpha model trees in the continuum tree which we identified in a previous article as a distributional scaling limit of Ford’s trees. In general, the Markov branching trees induced by the two-parameter growth rule are not sampling consistent, so the existence of compact limiting trees cannot be deduced from previous work on the sampling consistent case. We develop here a new approach to establish such limits, based on regenerative interval partitions and the urn-model description of sampling from Dirichlet random distributions. AMS 2000 subject classifications: 60J80.

Journal ArticleDOI
TL;DR: VINE is a parallel N-body/SPH simulation code that uses a binary tree for gravitational force calculations and neighbor searches; this paper describes its memory-layout and tree optimizations, its support for special-purpose GRAPE hardware, and its serial and shared-memory parallel performance.
Abstract: We continue our presentation of VINE. In this paper, we begin with a description of relevant architectural properties of the serial and shared memory parallel computers on which VINE is intended to run, and describe their influences on the design of the code itself. We continue with a detailed description of a number of optimizations made to the layout of the particle data in memory and to our implementation of a binary tree used to access that data for use in gravitational force calculations and searches for SPH neighbor particles. We describe the modifications to the code necessary to obtain forces efficiently from special purpose ‘GRAPE’ hardware, the interfaces required to allow transparent substitution of those forces in the code instead of those obtained from the tree, and the modifications necessary to use both tree and GRAPE together as a fused GRAPE/tree combination. We conclude with an extensive series of performance tests, which demonstrate that the code can be run efficiently and without modification in serial on small workstations or in parallel using the OpenMP compiler directives on large scale, shared memory parallel machines. We analyze the effects of the code optimizations and estimate that they improve its overall performance by more than an order of magnitude over that obtained by many other tree codes. Scaled parallel performance of the gravity and SPH calculations, together the most costly components of most simulations, is nearly linear up to at least 120 processors on moderate sized test problems using the Origin 3000 architecture, and to the maximum machine sizes available to us on several other architectures. At similar accuracy, performance of VINE, used in GRAPE-tree mode, is approximately a factor two slower than that of VINE, used in host-only mode. Further optimizations of the GRAPE/host communications could improve the speed by as much as a factor of three, but have not yet been implemented in VINE. 
Finally, we find that although parallel performance on small problems may reach a plateau beyond which more processors bring no additional speedup, performance never decreases, a factor important for running large simulations on many processors with individual time steps, where only a small fraction of the total particles require updates at any given moment. Subject headings: methods: numerical — methods: N-body simulations

Proceedings ArticleDOI
18 Nov 2009
TL;DR: An efficient indexing and retrieval approach for human motion data is presented based on a novel similarity metric that can not only cluster the motion data accurately, but also discover the relationships between different motion types by a binary tree structure.
Abstract: For the convenient reuse of large-scale 3D motion capture data, browsing and searching methods for the data should be explored. In this paper, an efficient indexing and retrieval approach for human motion data is presented based on a novel similarity metric. We divide the human character model into three partitions to reduce the spatial complexity and measure the temporal similarity of each partition by self-organizing map and Smith-Waterman algorithm. The overall similarity between two motion clips can be achieved by integrating the similarities of the separate body partitions. Then the hierarchical clustering method is implemented, which can not only cluster the motion data accurately, but also discover the relationships between different motion types by a binary tree structure. With our typical cluster locating algorithm and motion motif mining method, fast and accurate retrieval can be performed. The experimental results show the effectiveness of our approach.

Proceedings ArticleDOI
06 Aug 2009
TL;DR: A method for generating natural language sentences from meaning representations is built on top of a hybrid tree representation that jointly encodes both the meaning representation and the natural language in a tree structure; the model performs better than a previous state-of-the-art natural language generation model.
Abstract: This paper presents an effective method for generating natural language sentences from their underlying meaning representations. The method is built on top of a hybrid tree representation that jointly encodes both the meaning representation and the natural language in a tree structure. By using a tree conditional random field on top of the hybrid tree representation, we are able to explicitly model phrase-level dependencies amongst neighboring natural language phrases and meaning representation components in a simple and natural way. We show that the additional dependencies captured by the tree conditional random field allow it to perform better than directly inverting a previously developed hybrid tree semantic parser. Furthermore, we demonstrate that the model performs better than a previous state-of-the-art natural language generation model. Experiments are performed on two benchmark corpora with standard automatic evaluation metrics.

Journal ArticleDOI
TL;DR: An adaptive binary tree (ABT) is proposed to reduce the test computational complexity of multiclass support vector machines (SVMs); it achieves fast classification by selecting the binary SVMs with the fewest average number of support vectors (SVs).

Journal ArticleDOI
TL;DR: This paper proposes a fast adaptive balancing method, in which a binary tree structure is used to partition the simulation region into sub-domains, and compares it with two previously proposed algorithms on an artificial case and a real distributed case, respectively.

Journal ArticleDOI
TL;DR: A class of one-dimensional maps is introduced which can be used to generate the binary trees in different ways, and their ergodic properties are studied along with some random processes arising in a natural way in this context.
Abstract: This paper is devoted to a systematic study of a class of binary trees encoding the structure of rational numbers from both the arithmetic and the dynamical point of view. The paper is divided into two parts. The first one is a critical review of rather standard topics such as Stern-Brocot and Farey trees and their connections with continued fraction expansion and the question mark function. In the second part we introduce a class of one-dimensional maps which can be used to generate the binary trees in different ways and study their ergodic properties. This also leads us to study some random processes (Markov chains and martingales) arising in a natural way in this context.
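As a reminder of how such a tree encodes the positive rationals, here is a minimal sketch of the standard mediant construction of the Stern-Brocot tree. It is the textbook construction, not the one-dimensional maps introduced in the paper, and the function name is an assumption.

```python
from fractions import Fraction
from typing import List


def stern_brocot_levels(depth: int) -> List[List[Fraction]]:
    """Return the first `depth` levels of the Stern-Brocot tree.

    Every node is the mediant (a + c) / (b + d) of the two fractions a/b and
    c/d that bound it; the root is bounded by 0/1 and 1/0 (standing for
    infinity), so bounds are kept as (numerator, denominator) pairs.
    """
    levels: List[List[Fraction]] = []
    intervals = [((0, 1), (1, 0))]
    for _ in range(depth):
        level, next_intervals = [], []
        for (a, b), (c, d) in intervals:
            mediant = (a + c, b + d)
            level.append(Fraction(*mediant))
            # The left child lives between a/b and the mediant, the right
            # child between the mediant and c/d.
            next_intervals.append(((a, b), mediant))
            next_intervals.append((mediant, (c, d)))
        levels.append(level)
        intervals = next_intervals
    return levels
```

The first three levels are 1/1, then 1/2 and 2/1, then 1/3, 2/3, 3/2 and 3/1; every positive rational appears exactly once somewhere in the tree.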

Book ChapterDOI
06 Jul 2009
TL;DR: These new representations are able to simultaneously emulate the BP, DFUDS and partitioned representations using a single instance of the data structure, and thus aim towards universality.
Abstract: We consider the succinct representation of ordinal and cardinal trees on the RAM with logarithmic word size. Given a tree T, our representations support the following operations in O(1) time: (i) BP-substring(i, b), which reports the substring of length b bits (b is at most the word size) beginning at position i of the balanced parenthesis representation of T, (ii) DFUDS-substring(i, b), which does the same for the depth first unary degree sequence representation, and (iii) a similar operation for tree-partition based representations of T. We give: an asymptotically space-optimal 2n + o(n) bit representation of n-node ordinal trees that supports all the above operations with b = Θ(log n), answering an open question from [He et al., ICALP'07]; and an asymptotically space-optimal C(n, k) + o(n)-bit representation of k-ary cardinal trees that supports (with b = Θ(√(log n))) the operations (ii) and (iii) above, on the ordinal tree obtained by removing labels from the cardinal tree, as well as the usual label-based operations. As a result, we obtain a fully-functional cardinal tree representation with the above space complexity. This answers an open question from [Raman et al., SODA'02]. Our new representations are able to simultaneously emulate the BP, DFUDS and partitioned representations using a single instance of the data structure, and thus aim towards universality. They not only support the union of all the ordinal tree operations supported by these representations, but will also automatically inherit any new operations supported by these representations in the future.

Proceedings ArticleDOI
28 Jun 2009
TL;DR: This work analyzes some random binary tree sequences for which the normalized entropies H(T_n)/n converge to a limit as n → ∞, as well as some other sequences in which the normalized entropies fail to converge.
Abstract: For each positive integer n, let T_n be a random rooted full binary tree having 2n-1 vertices. We can view H(T_n), the entropy of T_n, as a measure of the structural complexity of tree T_n in the sense that approximately H(T_n) bits suffice to construct T_n. We analyze some random binary tree sequences (T_n : n = 1, 2, …) for which the normalized entropies H(T_n)/n converge to a limit as n → ∞, as well as some other sequences (T_n) in which the normalized entropies fail to converge.
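One simple case makes the normalized entropy concrete. The sketch below assumes, purely for illustration, that T_n is uniformly distributed over the ordered (plane) full binary trees with n leaves, of which there are Catalan(n - 1); the abstract does not say which distributions the paper studies. Under that assumption H(T_n) is the base-2 logarithm of the count, and H(T_n)/n approaches 2 from below.

```python
import math


def catalan(m: int) -> int:
    """m-th Catalan number: the number of ordered full binary trees with m + 1 leaves."""
    return math.comb(2 * m, m) // (m + 1)


def normalized_uniform_entropy(n: int) -> float:
    """H(T_n) / n in bits when T_n is uniform over ordered full binary trees
    with n leaves (2n - 1 vertices). Assumption: uniform distribution."""
    return math.log2(catalan(n - 1)) / n
```

For example, `normalized_uniform_entropy(1000)` is a little below 2, consistent with needing about two bits per leaf to describe a tree's shape.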

Book ChapterDOI
02 Dec 2009
TL;DR: By providing a fast transformation to the Balanced Subgraph problem, it is shown that the problem admits an O(2^k n^4) algorithm, improving upon a previous fixed-parameter approach with running time O(c^k n^O(1)) where c ≈ 1000.
Abstract: Given two binary phylogenetic trees covering the same n species, it is useful to compare them by drawing them with leaves arranged side-by-side. To facilitate comparison, we would like to arrange the trees to minimize the number of crossings k induced by connecting pairs of identical species. This is the NP-hard Tanglegram Layout problem. By providing a fast transformation to the Balanced Subgraph problem, we show that the problem admits an O(2^k n^4) algorithm, improving upon a previous fixed-parameter approach with running time O(c^k n^O(1)) where c ≈ 1000. We enhance a Balanced Subgraph implementation based on data reduction and iterative compression with improvements tailored towards these instances, and run experiments with real-world data to show the practical applicability of this approach. All practically relevant (k ≤ 1000) Tanglegram Layout instances can be solved exactly within seconds. Additionally, we provide a kernel-like bound by showing how to reduce the Balanced Subgraph instances for Tanglegram Layout on complete binary trees to a size of O(k log k).

Proceedings Article
Petr Vilím
20 Sep 2009
TL;DR: A new Edge Finding algorithm for discrete cumulative resources, i.e. resources which can process several activities simultaneously up to some maximal capacity C, is presented, based on the Θ-tree data structure which already proved to be very useful in filtering algorithms for unary resource constraints.
Abstract: This paper presents a new Edge Finding algorithm for discrete cumulative resources, i.e. resources which can process several activities simultaneously up to some maximal capacity C. The algorithm has better time complexity than the current version of this algorithm: O(kn log n) versus O(kn^2), where n is the number of activities on the resource and k is the number of distinct capacity demands. Moreover, the new algorithm is slightly stronger and it is able to handle optional activities. The algorithm is based on the Θ-tree - a binary tree data structure which already proved to be very useful in filtering algorithms for unary resource constraints.

Journal ArticleDOI
TL;DR: An incremental procedure based on the iterative use of tree-based method is proposed and a suitable Incremental Imputation Algorithm is introduced to define a lexicographic ordering of cases and variables so that conditional mean imputation via binary trees can be performed incrementally.
Abstract: In the framework of incomplete data analysis, this paper provides a nonparametric approach to missing data imputation based on Information Retrieval. In particular, an incremental procedure based on the iterative use of tree-based method is proposed and a suitable Incremental Imputation Algorithm is introduced. The key idea is to define a lexicographic ordering of cases and variables so that conditional mean imputation via binary trees can be performed incrementally. A simulation study and real data applications are carried out to describe the advantages and the performance with respect to standard approaches.

Book ChapterDOI
23 Sep 2009
TL;DR: A new method based on binary space partitions is proposed to simultaneously mesh and compress a depth map, representing it as a compressed adaptive mesh that can be directly applied to render the 3D scene.
Abstract: We propose in this paper a new method based on binary space partitions to simultaneously mesh and compress a depth map. The method divides the map adaptively into a mesh that has the form of a binary triangular tree (tritree). The nodes of the mesh are the sparse non-uniform samples of the depth map and are able to interpolate the other pixels with minimal error. We apply differential coding after that to represent the sparse disparities at the mesh nodes. We then use entropy coding to compress the encoded disparities. We finally benefit from the binary tree and compress the mesh via binary tree coding to condense its representation. The results we obtained on various depth images show that the proposed scheme leads to lower depth error rate at higher compression ratios when compared to standard compression techniques like JPEG 2000. Moreover, using our method, a depth map is represented with a compressed adaptive mesh that can be directly applied to render the 3D scene.

Journal ArticleDOI
TL;DR: In this article, the authors considered random Schrödinger operators on tree graphs and proved absolutely continuous spectrum at small disorder for two models. The first model is the usual binary tree with certain strongly correlated random potentials.
Abstract: We consider random Schrödinger operators on tree graphs and prove absolutely continuous spectrum at small disorder for two models. The first model is the usual binary tree with certain strongly correlated random potentials. These potentials are of interest since for complete correlation they exhibit localization at all disorders. In the second model, we change the tree graph by adding all possible edges to the graph inside each sphere, with weights proportional to the number of points in the sphere.

Patent
Robert Glenn Deen, Hongxia Jin
05 Dec 2009
TL;DR: A system, method and computer program product for synchronizing content directories on cluster devices are provided, which generate a binary tree for each device in a cluster of devices, the binary tree representing the locations of all copies of all content residing in the device.
Abstract: According to embodiments of the invention, a system, method and computer program product for synchronizing content directories on cluster devices are provided. Embodiments generate a binary tree for each device in a cluster of devices, the binary tree representing the locations of all copies of content residing in the device. The binary trees for a plurality of other devices in the cluster may be stored in each device. The binary trees for the plurality of other devices may be used to determine availability of content and the available content may be displayed to a user.

Journal Article
LU Hua-pu
TL;DR: A KNN-NPR (K-nearest neighbors non-parametric regression) method based on a balanced binary tree is presented to forecast short-term traffic flow, improving forecasting precision and meeting the real-time requirement.
Abstract: Real-time and accurate short-term traffic flow forecasting has become critical in traffic control and guidance. Non-parametric regression is a good way to solve the problem, but the foundation of the case database and the search speed are two obstacles for application. A KNN-NPR (K-nearest neighbors non-parametric regression) method based on a balanced binary tree to forecast short-term traffic flow is presented. In order to improve forecasting precision and meet the real-time requirement, clustering methods and a balanced binary tree are adopted to build the case database. An example is given to show its availability.

Book ChapterDOI
05 Feb 2009
TL;DR: It is shown that the maximization version of the dual problem for general binary trees can be reduced to a version of MaxCut for which the algorithm of Goemans and Williamson yields a 0.878-approximation.
Abstract: A binary tanglegram is a pair 〈S,T〉 of binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. For applications, for example in phylogenetics, it is essential that both trees are drawn without edge crossings and that the inter-tree edges have as few crossings as possible. It is known that finding a drawing with the minimum number of crossings is NP-hard and that the problem is fixed-parameter tractable with respect to that number. We prove that under the Unique Games Conjecture there is no constant-factor approximation for general binary trees. We show that the problem is hard even if both trees are complete binary trees. For this case we give an O(n^3)-time 2-approximation and a new and simple fixed-parameter algorithm. We show that the maximization version of the dual problem for general binary trees can be reduced to a version of MaxCut for which the algorithm of Goemans and Williamson yields a 0.878-approximation.
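The objective being minimized can be illustrated with a short sketch that counts crossings for one fixed drawing. It assumes each tree's layout is summarized by the top-to-bottom order of its leaves, with matching leaves sharing a label; two inter-tree edges cross exactly when their endpoints appear in opposite orders on the two sides. The function name is hypothetical, and the hard part of the problem, choosing which children to swap at internal nodes, is not addressed.

```python
from typing import Hashable, Sequence


def count_crossings(order_s: Sequence[Hashable],
                    order_t: Sequence[Hashable]) -> int:
    """Count crossings among inter-tree edges for fixed leaf orders.

    order_s, order_t: leaf labels of the two trees in drawing order; both
    must contain the same labels. Quadratic-time inversion count, kept
    simple for clarity.
    """
    position_in_t = {leaf: index for index, leaf in enumerate(order_t)}
    sequence = [position_in_t[leaf] for leaf in order_s]
    return sum(
        1
        for x in range(len(sequence))
        for y in range(x + 1, len(sequence))
        if sequence[x] > sequence[y]
    )
```

For instance, leaf orders `a b c d` and `b a d c` give two crossings, while fully reversed orders on four leaves give six.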