
Showing papers on "Tree (data structure)" published in 2004


Journal ArticleDOI
TL;DR: The simulation results show that the accuracy of NJ trees declines only by approximately 5% when the number of sequences used increases from 32 to 4,096 (128 times) even in the presence of extensive variation in the evolutionary rate among lineages or significant biases in the nucleotide composition and transition/transversion ratio.
Abstract: Current efforts to reconstruct the tree of life and histories of multigene families demand the inference of phylogenies consisting of thousands of gene sequences. However, for such large data sets even a moderate exploration of the tree space needed to identify the optimal tree is virtually impossible. For these cases the neighbor-joining (NJ) method is frequently used because of its demonstrated accuracy for smaller data sets and its computational speed. As data sets grow, however, the fraction of the tree space examined by the NJ algorithm becomes minuscule. Here, we report the results of our computer simulation for examining the accuracy of NJ trees for inferring very large phylogenies. First we present a likelihood method for the simultaneous estimation of all pairwise distances by using biologically realistic models of nucleotide substitution. Use of this method corrects up to 60% of NJ tree errors. Our simulation results show that the accuracy of NJ trees declines only by ≈5% when the number of sequences used increases from 32 to 4,096 (128 times) even in the presence of extensive variation in the evolutionary rate among lineages or significant biases in the nucleotide composition and transition/transversion ratio. Our results encourage the use of complex models of nucleotide substitution for estimating evolutionary distances and hint at bright prospects for the application of the NJ and related methods in inferring large phylogenies.
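
For readers unfamiliar with neighbor joining, the sketch below shows a single agglomeration step of the standard NJ algorithm: it picks the pair of taxa that minimizes the usual Q criterion and computes their branch lengths to the new internal node. It is an illustration only, not the authors' code; the function name `nj_pick_pair` and the toy distance matrix are assumptions, and the likelihood-based distance estimation described in the abstract is not covered.

```python
def nj_pick_pair(d):
    """One neighbor-joining step on a symmetric distance matrix d (list of lists).

    Returns the pair (i, j) that minimizes the Q criterion, plus the branch
    lengths from i and from j to the new internal node. Requires len(d) >= 3.
    """
    n = len(d)
    totals = [sum(row) for row in d]  # total distance from each taxon to all others
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best
    # Branch lengths from the joined taxa to the new node.
    li = d[i][j] / 2 + (totals[i] - totals[j]) / (2 * (n - 2))
    lj = d[i][j] - li
    return (i, j), (li, lj)


# Hypothetical five-taxon distance matrix used only for illustration.
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, lengths = nj_pick_pair(toy)
```

A full NJ run would repeat this step, replacing the joined pair with the new node and updating the distance matrix, until only two nodes remain.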

4,489 citations


Journal ArticleDOI
TL;DR: A novel frequent-pattern tree (FP-tree) structure is proposed, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and an efficient FP-tree-based mining method, FP-growth, is developed for mining the complete set of frequent patterns by pattern fragment growth.
Abstract: Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist a large number of patterns and/or long patterns. In this study, we propose a novel frequent-pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a condensed, smaller data structure, the FP-tree, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern-fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent-pattern mining methods.
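
The compression idea in point (1) can be illustrated with a minimal prefix-tree construction. The sketch below assumes the usual two-scan build (count items, then insert each transaction's frequent items in descending frequency order). The names `FPNode`, `build_fp_tree`, and `count_nodes` are hypothetical, the header table and node links are omitted, and the FP-growth mining step itself is not shown.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, a count, and child nodes."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    # First scan: count in how many transactions each item occurs.
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in freq.items() if c >= min_support}
    root = FPNode(None)
    # Second scan: insert the frequent items of each transaction,
    # most frequent first, so that shared prefixes share nodes.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())
```

For the toy database `[["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "b", "c"]]` with `min_support=2`, the 11 item occurrences are stored in 6 tree nodes, because every transaction shares the prefix `b`.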

2,567 citations


Journal ArticleDOI
01 Jul 2004
TL;DR: The design, implementation, and evaluation of Ganglia are presented along with experience gained through real world deployments on systems of widely varying scale, configurations, and target application domains over the last two and a half years.
Abstract: Ganglia is a scalable distributed monitoring system for high performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It relies on a multicast-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on over 500 clusters around the world. This paper presents the design, implementation, and evaluation of Ganglia along with experience gained through real world deployments on systems of widely varying scale, configurations, and target application domains over the last two and a half years.

1,401 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a new normalized information distance based on the non-computable notion of Kolmogorov complexity, which minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities).
Abstract: A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance," based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To evidence generality and robustness, we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in a first completely automatically computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages.
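
Because Kolmogorov complexity is noncomputable, practical use replaces it with the output length of a real compressor. The sketch below shows the commonly used compression-based approximation, with Python's `zlib` (DEFLATE, the same algorithm family as gzip) standing in for the compressors named in the abstract. The function names `csize` and `ncd` are assumptions, and this is an illustration rather than the authors' tool.

```python
import zlib


def csize(data: bytes) -> int:
    """Compressed size in bytes, used as a computable stand-in for K(data)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based approximation of the normalized information distance."""
    cx, cy, cxy = csize(x), csize(y), csize(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Values near 0 indicate that one input helps compress the other a great deal; values near 1 indicate little shared structure. A matrix of such pairwise distances can then be fed to any distance-based tree-building method.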

1,002 citations


Journal ArticleDOI
TL;DR: A formal methodology is introduced, which allows us to compare multiple split criteria and permits us to present fundamental insights into the decision process.
Abstract: Knowledge Discovery in Databases (KDD) is an active and important research area with the promise for a high payoff in many business and scientific applications. One of the main tasks in KDD is classification. A particularly efficient method for classification is decision tree induction. The selection of the attribute used at each node of the tree to split the data (split criterion) is crucial in order to correctly classify objects. Different split criteria were proposed in the literature (Information Gain, Gini Index, etc.). It is not obvious which of them will produce the best decision tree for a given data set. A large number of empirical tests were conducted in order to answer this question. No conclusive results were found. In this paper we introduce a formal methodology, which allows us to compare multiple split criteria. This permits us to present fundamental insights into the decision process. Furthermore, we are able to present a formal description of how to select between split criteria for a given data set. As an illustration we apply the methodology to two widely used split criteria: Gini Index and Information Gain.
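
The two criteria named at the end of the abstract can be stated compactly using their textbook definitions. The sketch below is not the paper's formal methodology; the function names are assumptions. It scores a candidate split by the drop in impurity from the parent node to the size-weighted average of its children.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits of the class distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(ch) / n * impurity(ch) for ch in children)
    return impurity(parent) - weighted
```

Calling `split_gain(parent, children, entropy)` gives Information Gain, and passing `gini` instead gives the corresponding Gini-based score, so both criteria can be compared on the same candidate split.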

554 citations


Journal ArticleDOI
TL;DR: This paper introduces the concept of dynamic convoy tree-based collaboration, and formalizes it as a multiple objective optimization problem which needs to find a convoy tree sequence with high tree coverage and low energy consumption.
Abstract: Most existing work on sensor networks concentrates on finding efficient ways to forward data from the information source to the data centers, and not much work has been done on collecting local data and generating the data report. This paper studies this issue by proposing techniques to detect and track a mobile target. We introduce the concept of dynamic convoy tree-based collaboration, and formalize it as a multiple objective optimization problem which needs to find a convoy tree sequence with high tree coverage and low energy consumption. We propose an optimal solution which achieves 100% coverage and minimizes the energy consumption under certain ideal situations. Considering the real constraints of a sensor network, we propose several practical implementations: the conservative scheme and the prediction-based scheme for tree expansion and pruning; the sequential and the localized reconfiguration schemes for tree reconfiguration. Extensive experiments are conducted to compare the practical implementations and the optimal solution. The results show that the prediction-based scheme outperforms the conservative scheme and it can achieve similar coverage and energy consumption to the optimal solution. The experiments also show that the localized reconfiguration scheme outperforms the sequential reconfiguration scheme when the node density is high, and the trend is reversed when the node density is low.

499 citations


Proceedings ArticleDOI
17 May 2004
TL;DR: A decision tree learning approach to diagnosing failures in large Internet sites is presented, and it is found that, among hundreds of potential causes, the algorithm successfully identifies 13 out of 14 true causes of failure, along with 2 false positives.
Abstract: We present a decision tree learning approach to diagnosing failures in large Internet sites. We record runtime properties of each request and apply automated machine learning and data mining techniques to identify the causes of failures. We train decision trees on the request traces from time periods in which user-visible failures are present. Paths through the tree are ranked according to their degree of correlation with failure, and nodes are merged according to the observed partial order of system components. We evaluate this approach using actual failures from eBay, and find that, among hundreds of potential causes, the algorithm successfully identifies 13 out of 14 true causes of failure, along with 2 false positives. We discuss some results in applying simplified decision trees on eBay's production site for several months. In addition, we give a cost-benefit analysis of manual vs. automated diagnosis systems. Our contributions include the statistical learning approach, the adaptation of decision trees to the context of failure diagnosis, and the deployment and evaluation of our tools on a high-volume production service.

387 citations


Journal ArticleDOI
TL;DR: In Zigzag, the multicast tree has a height logarithmic with the number of clients, and a node degree bounded by a constant, so that the end-to-end delay is kept small.
Abstract: Given that the Internet does not widely support Internet protocol multicast while content-distribution-network technologies are costly, the concept of peer-to-peer could be a promising start for enabling large-scale streaming systems. In our so-called Zigzag approach, we propose a method for clustering peers into a hierarchy called the administrative organization for easy management, and a method for building the multicast tree atop this hierarchy for efficient content transmission. In Zigzag, the multicast tree has a height logarithmic with the number of clients, and a node degree bounded by a constant. This helps reduce the number of processing hops on the delivery path to a client while avoiding network bottlenecks. Consequently, the end-to-end delay is kept small. Although one could build a tree satisfying such properties easily, an efficient control protocol between the nodes must be in place to maintain the tree under the effects of network dynamics. Zigzag handles such situations gracefully, requiring a constant amortized worst-case control overhead. Especially, failure recovery is done regionally with impact on, at most, a constant number of existing clients and with mostly no burden on the server.

386 citations


Proceedings ArticleDOI
07 Mar 2004
TL;DR: It is proved that building an optimal data gathering tree is NP-complete and various distributed approximation algorithms are proposed for the explicit communication case.
Abstract: We consider the problem of correlated data gathering by a network with a sink node and a tree communication structure, where the goal is to minimize the total transmission cost of transporting the information collected by the nodes, to the sink node. Two coding strategies are analyzed: a Slepian-Wolf model where optimal coding is complex and transmission optimization is simple, and a joint entropy coding model with explicit communication where coding is simple and transmission optimization is difficult. This problem requires a joint optimization of the rate allocation at the nodes and of the transmission structure. For the Slepian-Wolf setting, we derive a closed form solution and an efficient distributed approximation algorithm with a good performance. For the explicit communication case, we prove that building an optimal data gathering tree is NP-complete and we propose various distributed approximation algorithms.

369 citations


Book ChapterDOI
31 Aug 2004
TL;DR: This work represents moving-object locations as vectors that are timestamped based on their update time and shows that it is capable of substantially outperforming the R-tree based TPR-tree for both single and concurrent access scenarios.
Abstract: A number of emerging applications of data management technology involve the monitoring and querying of large quantities of continuous variables, e.g., the positions of mobile service users, termed moving objects. In such applications, large quantities of state samples obtained via sensors are streamed to a database. Indexes for moving objects must support queries efficiently, but must also support frequent updates. Indexes based on minimum bounding regions (MBRs) such as the R-tree exhibit high concurrency overheads during node splitting, and each individual update is known to be quite costly. This motivates the design of a solution that enables the B+-tree to manage moving objects. We represent moving-object locations as vectors that are timestamped based on their update time. By applying a novel linearization technique to these values, it is possible to index the resulting values using a single B+-tree that partitions values according to their timestamp and otherwise preserves spatial proximity. We develop algorithms for range and k nearest neighbor queries, as well as continuous queries. The proposal can be grafted into existing database systems cost effectively. An extensive experimental study explores the performance characteristics of the proposal and also shows that it is capable of substantially outperforming the R-tree based TPR-tree for both single and concurrent access scenarios.

347 citations


Proceedings ArticleDOI
17 May 2004
TL;DR: A novel, unsupervised approach is presented for training an efficient and robust detector which is capable of not only detecting the presence of human hands within an image but also classifying the hand shape.
Abstract: The ability to detect a person's unconstrained hand in a natural video sequence has applications in sign language, gesture recognition and HCI. This paper presents a novel, unsupervised approach to training an efficient and robust detector which is capable of not only detecting the presence of human hands within an image but classifying the hand shape. A database of images is first clustered using a k-method clustering algorithm with a distance metric based upon shape context. From this, a tree structure of boosted cascades is constructed. The head of the tree provides a general hand detector while the individual branches of the tree classify a valid shape as belonging to one of the predetermined clusters exemplified by an indicative hand shape. Preliminary experiments carried out showed that the approach boasts a promising 99.8% success rate on hand detection and 97.4% success at classification. Although we demonstrate the approach within the domain of hand shape it is equally applicable to other problems where both detection and classification are required for objects that display high variability in appearance.

Proceedings Article
01 Jan 2004

Journal ArticleDOI
João Gama
TL;DR: In this article, the authors study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning, and propose a framework combining a univariate decision tree with a linear function by means of constructive induction.
Abstract: In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model trees algorithms explore multiple representation languages but using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves. The experimental evaluation shows that all functional trees variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study and for classification and regression, the use of multivariate decision nodes has more impact in the bias component of the error, while the use of multivariate decision leaves has more impact in the variance component.

Proceedings ArticleDOI
30 Mar 2004
TL;DR: This work proposes a new labeling scheme for XML that takes advantage of the unique property of prime numbers to provide efficient support for order-sensitive queries and updates.
Abstract: Efficient evaluation of XML queries requires the determination of whether a relationship exists between two elements. A number of labeling schemes have been designed to label the element nodes such that the relationships between nodes can be easily determined by comparing their labels. With the increased popularity of XML on the Web, finding a labeling scheme that is able to support order-sensitive queries in the presence of dynamic updates becomes urgent. We propose a new labeling scheme that takes advantage of the unique property of prime numbers to meet this need. The global order of the nodes can be captured by generating simultaneous congruence values from the prime number node labels. Theoretical analysis of the label size requirements for the various labeling schemes is given. Experimental results indicate that the prime number labeling scheme is compact compared to existing dynamic labeling schemes, and provides efficient support to order-sensitive queries and updates.
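
One way to read the core idea is that each node receives its own prime, a node's label is the product of that prime and its parent's label, and an ancestor test becomes a divisibility check. The sketch below illustrates that reading only; the preorder prime assignment, the function names, and the toy document are assumptions, and the simultaneous-congruence mechanism for document order is not shown.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for small trees)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def label_tree(tree, root):
    """tree maps a node to its list of children; returns node -> integer label."""
    gen = primes()
    labels = {}

    def visit(node, parent_label):
        # Label = parent's label times a prime owned by this node alone.
        labels[node] = parent_label * next(gen)
        for child in tree.get(node, []):
            visit(child, labels[node])

    visit(root, 1)
    return labels


def is_ancestor(labels, a, d):
    """a is a proper ancestor of d exactly when a's label divides d's label."""
    return a != d and labels[d] % labels[a] == 0
```

Because every prime is used once, inserting a new node only requires a fresh prime for that node, which is why such labels can tolerate updates without relabeling existing nodes.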

Proceedings ArticleDOI
21 Jul 2004
TL;DR: A new approach for coreference resolution is proposed which uses the Bell tree to represent the search space and casts the coreference resolution problem as finding the best path from the root of the Bell tree to the leaf nodes.
Abstract: This paper proposes a new approach for coreference resolution which uses the Bell tree to represent the search space and casts the coreference resolution problem as finding the best path from the root of the Bell tree to the leaf nodes. A Maximum Entropy model is used to rank these paths. The coreference performance on the 2002 and 2003 Automatic Content Extraction (ACE) data will be reported. We also train a coreference system using the MUC6 data and competitive results are obtained.
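
The search space can be made concrete with a small enumeration. In the sketch below, mentions are processed in order and each one is either linked to an existing entity or starts a new entity, so every root-to-leaf path ends in one partition of the mentions and the number of leaves is a Bell number. The function names and the caller-supplied `score` are assumptions; the paper ranks paths with a Maximum Entropy model, which this exhaustive toy version does not implement.

```python
def bell_tree_leaves(mentions):
    """Yield every partition of `mentions` into entities (the Bell tree leaves)."""

    def expand(i, entities):
        if i == len(mentions):
            yield [list(e) for e in entities]
            return
        m = mentions[i]
        # Option 1: link the mention to each existing entity in turn.
        for k in range(len(entities)):
            linked = entities[:k] + [entities[k] + [m]] + entities[k + 1:]
            yield from expand(i + 1, linked)
        # Option 2: start a new entity with this mention.
        yield from expand(i + 1, entities + [[m]])

    yield from expand(0, [])


def best_leaf(mentions, score):
    """Return the highest-scoring partition under a caller-supplied score."""
    return max(bell_tree_leaves(mentions), key=score)
```

The leaf count grows very quickly with the number of mentions (5 leaves for three mentions, 15 for four), so a practical system would have to prune or search the tree selectively instead of enumerating it.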

Proceedings ArticleDOI
04 Jul 2004
TL;DR: This work presents an algorithmic framework for supervised classification learning where the set of labels is organized in a predefined hierarchical structure encoded by a rooted tree which induces a metric over the label set.
Abstract: We present an algorithmic framework for supervised classification learning where the set of labels is organized in a predefined hierarchical structure. This structure is encoded by a rooted tree which induces a metric over the label set. Our approach combines ideas from large margin kernel methods and Bayesian analysis. Following the large margin principle, we associate a prototype with each label in the tree and formulate the learning task as an optimization problem with varying margin constraints. In the spirit of Bayesian methods, we impose similarity requirements between the prototypes corresponding to adjacent labels in the hierarchy. We describe new online and batch algorithms for solving the constrained optimization problem. We derive a worst case loss-bound for the online algorithm and provide generalization analysis for its batch counterpart. We demonstrate the merits of our approach with a series of experiments on synthetic, text and speech data.
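
A rooted label tree induces a natural distance between labels: the number of edges on the path connecting them. The sketch below computes that path length from a parent map. The function name and the toy hierarchy are assumptions, and the large margin learning algorithms themselves are not shown.

```python
def tree_distance(parent, a, b):
    """Number of edges on the path between labels a and b in a rooted tree.

    `parent` maps each label to its parent; the root maps to None.
    """

    def path_to_root(v):
        path = [v]
        while parent.get(v) is not None:
            v = parent[v]
            path.append(v)
        return path

    pa, pb = path_to_root(a), path_to_root(b)
    steps_from_a = {v: i for i, v in enumerate(pa)}
    # The first node on b's path that also lies on a's path is their
    # lowest common ancestor; add the two partial path lengths.
    for j, v in enumerate(pb):
        if v in steps_from_a:
            return steps_from_a[v] + j
    raise ValueError("labels are not in the same tree")
```

With such a distance, predicting a sibling of the correct label costs less than predicting a label from a distant subtree, which is the kind of graded error a hierarchical classifier can exploit.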

Proceedings ArticleDOI
07 Jun 2004
TL;DR: This paper focuses on the segmentation of ladar data using local 3-D point statistics into three classes: clutter to capture grass and tree canopy, linear to capture thin objects like wires or tree branches, and finally surface to capture solid objects like ground terrain surface, rocks or tree trunks.
Abstract: Because of the difficulty of interpreting laser data in a meaningful way, safe navigation in vegetated terrain is still a daunting challenge. In this paper, we focus on the segmentation of ladar data using local 3-D point statistics into three classes: clutter to capture grass and tree canopy, linear to capture thin objects like wires or tree branches, and finally surface to capture solid objects like ground terrain surface, rocks or tree trunks. We present the details of the proposed method and the modifications we made to implement it on board an autonomous ground vehicle. Finally, we present results from field tests using this rover and results produced from different stationary laser sensors.

01 Jan 2004
TL;DR: In this article, the authors present an evaluation theory tree whose trunk is built on a dual foundation of accountability and systematic social inquiry and which has three primary branches.
Abstract: Our evaluation theory tree is presented in Figure 2.1, in which we depict the trunk and the three primary branches of the family tree. The trunk is built on a dual foundation of accountability and systematic social inquiry. These two areas have supported the development of the field in different ways. The need and desire for accountability presents a need for evaluation. The importance of accounting for actions or for resources used in the conduct of programs is particularly evident for programs supported by government entities. The same accountability demand could be said to be present for corporate businesses that annually are required to provide reports from outside auditors to their shareholders (although the accounting scandals of the early 2000s, e.g., Enron, would dispute how accountable these audits actually have been). As a root for program evaluation, we think of accountability in the broadest way possible. That is, accountability is not a limiting activity, but, rather, is designed to improve and better programs and society. The social inquiry root of the tree emanates from a concern for employing a systematic and justifiable set of methods for determining accountability. While accountability provides the rationale, it is primarily from social inquiry that evaluation models have been derived. The main branch of the tree is the continuation of the social inquiry trunk. This is the evaluation as research, or evaluation guided by research methods, branch. This branch we have designated methods since in its purest form, it

20 Dec 2004
TL;DR: This master's thesis describes fundamental principles of tree construction, different splitting algorithms and pruning procedures, and discusses why one should or should not use the CART method.
Abstract: This master's thesis is devoted to Classification and Regression Trees (CART). CART is a classification method which uses historical data to construct decision trees. Depending on the available information about the dataset, either a classification tree or a regression tree can be constructed. The constructed tree can then be used for classification of new observations. The first part of the thesis describes fundamental principles of tree construction, different splitting algorithms and pruning procedures. The second part addresses the question of why we should or should not use the CART method. Advantages and weaknesses of the method are discussed and tested in detail. In the last part, CART is applied to real data, using the statistical software XploRe. Here different statistical macros (quantlets), graphical and plotting tools are presented.

Journal ArticleDOI
TL;DR: It is shown that if new tree species begin as small populations, species that are now common must have spread more quickly than chance allows, and most tree species have some setting in which they can increase when rare, and these trade-offs underlie the mechanisms maintaining α-diversity and species turnover.
Abstract: Understanding why there are so many kinds of tropical trees requires learning, not only how tree species coexist, but what factors drive tree speciation and what governs a tree clade’s diversification rate. Many report that hybrid sterility evolves very slowly between separated tree populations. If so, tree species rarely originate by splitting of large populations. Instead, they begin with few trees. The few studies available suggest that reproductive isolation between plant populations usually results from selection driven by lowered fitness of hybrids: speciation is usually a response to a “niche opportunity.” Using Hubbell’s neutral theory of forest dynamics as a null hypothesis, we show that if new tree species begin as small populations, species that are now common must have spread more quickly than chance allows. Therefore, most tree species have some setting in which they can increase when rare. Trees face trade-offs in suitability for different microhabitats, different-sized clearings, different soils and climates, and resistance to different pests. These trade-offs underlie the mechanisms maintaining α-diversity and species turnover. Disturbance and microhabitat specialization appear insufficient to maintain α-diversity of tropical trees, although they may maintain tree diversity north of Mexico or in northern Europe. Many studies show that where trees grow readily, tree diversity is higher and temperature and rainfall are less seasonal. The few data available suggest that pest pressure is higher, maintaining higher tree diversity, where winter is absent. Tree α-diversity is also higher in regions with more tree species, which tend to be larger, free for a longer time from major shifts of climate, or in the tropics, where there are more opportunities for local coexistence.

Book ChapterDOI
02 May 2004
TL;DR: In this paper, the authors presented a technique for Merkle tree traversal which requires only logarithmic space and time, where the units of computation are hash function evaluations or leaf value computations.
Abstract: We present a technique for Merkle tree traversal which requires only logarithmic space and time. For a tree with N leaves, our algorithm computes sequential tree leaves and authentication path data in time 2 log2(N) and space less than 3 log2(N), where the units of computation are hash function evaluations or leaf value computations, and the units of space are the number of node values stored. This result is an asymptotic improvement over all other previous results (for example, measuring cost=space*time). We also prove that the complexity of our algorithm is optimal: There can exist no Merkle tree traversal algorithm which consumes both less than O(log2(N)) space and less than O(log2(N)) time. Our algorithm is especially of practical interest when space efficiency is required.
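
To make "authentication path data" concrete, the sketch below builds a Merkle tree over a power-of-two number of leaves, extracts the sibling hashes along the path from a leaf to the root, and verifies a leaf against the root. It deliberately keeps the whole tree in memory, so it uses O(N) space and is not the logarithmic-space traversal algorithm of the paper. SHA-256 and all function names are assumptions chosen for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, leaf hashes first; len(leaves) must be a power of two."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def auth_path(levels, index):
    """Sibling hashes from the leaf level up to just below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # the sibling differs only in the lowest bit
        index //= 2
    return path


def verify(leaf, index, path, root):
    """Recompute the root from a leaf value and its authentication path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root
```

For N leaves the path holds log2(N) hashes, which is the quantity a traversal algorithm must produce for each successive leaf; the difficulty addressed by the paper is producing it without storing or recomputing the whole tree.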

Patent
09 Dec 2004
TL;DR: In this article, a method of generating an optimized grammar for speech recognition from a data set or big list of items is presented, including the steps of obtaining a tree representing items in the data set, and generating the grammar using the tree.
Abstract: A method of generating an optimized grammar, for use in speech recognition, from a data set or big list of items, is disclosed. The method includes the steps of obtaining a tree representing items in the data set, and generating the grammar using the tree. The tree or tree data structure representing items in the data set is a simulated recognition search tree, representing items in the data set, which can be automatically generated from the data set.

Patent
22 Jan 2004
TL;DR: In this paper, a system and method for controlling the flow of a dialog within a spoken dialog service dialog management module is described, which uses a dialog disambiguation rooted tree, the rooted tree having a root node.
Abstract: A system and method are disclosed for controlling the flow of a dialog within a spoken dialog service dialog management module. The method uses a dialog disambiguation rooted tree, the rooted tree having a root node, nodes descending from the root node organized in categories, and leaf nodes. The method comprises gathering input from a user to match with at least one node and node condition, wherein a first prompt from the dialog manager relates to a focus root node, lighting at least one relevant node according to the received user input, generalizing by attempting to select a new focus node further from a current focus node by: (1) assigning a node as a new focus node if it is the only lit direct descendent of a focus node after the lighting step; and (2) assigning a lowest common ancestor node as a new focus node if there are multiple descendent nodes that are lit and the step of assigning a new node a new focus if it is the only lit direct descendent does not apply. This method enables improved disambiguation of user intent in a spoken dialog service.
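
The two focus-selection rules can be sketched over a parent map. The code below is one possible reading of steps (1) and (2) and not the patented method: the function names, the choice to keep the current focus when neither rule applies, and the toy dialog tree are all assumptions.

```python
def path_to_root(parent, node):
    path = [node]
    while parent.get(node) is not None:
        node = parent[node]
        path.append(node)
    return path


def lowest_common_ancestor(parent, nodes):
    """Deepest node that lies on the root path of every node in `nodes`."""
    paths = [path_to_root(parent, n)[::-1] for n in nodes]  # root first
    lca = None
    for step in zip(*paths):
        if len(set(step)) != 1:
            break
        lca = step[0]
    return lca


def next_focus(parent, focus, lit):
    """Pick a new focus node from the lit nodes below the current focus."""
    lit_below = [n for n in lit
                 if n != focus and focus in path_to_root(parent, n)]
    direct = [n for n in lit_below if parent[n] == focus]
    if len(direct) == 1:
        return direct[0]  # rule (1): the only lit direct descendant
    if len(lit_below) > 1:
        return lowest_common_ancestor(parent, lit_below)  # rule (2)
    return focus  # assumption: otherwise the focus stays where it is
```

In a toy tree where `billing` has children `refund` and `invoice`, lighting both leaves from the root moves the focus to `billing`, so the next prompt only needs to distinguish between those two.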

Proceedings ArticleDOI
07 Mar 2004
TL;DR: This work formalizes the tree reconfiguration problem as finding a min-cost convoy tree sequence, and solves it by proposing an optimized complete reconfiguration scheme and an optimized interception-based reconfiguration scheme.
Abstract: Sensor nodes have limited sensing range and are not very reliable. To obtain accurate sensing data, many sensor nodes should be deployed and then the collaboration among them becomes an important issue. In W. Zhang and G. Cao, a tree-based approach has been proposed to facilitate sensor nodes collaborating in detecting and tracking a mobile target. As the target moves, many nodes in the tree may become far away from the root of the tree, and hence a large amount of energy may be wasted for them to send their sensing data to the root. We address the tree reconfiguration problem. We formalize it as finding a min-cost convoy tree sequence, and solve it by proposing an optimized complete reconfiguration scheme and an optimized interception-based reconfiguration scheme. Analysis and simulation are conducted to compare the proposed schemes with each other and with other reconfiguration schemes. The results show that the proposed schemes are more energy efficient than others.

Journal ArticleDOI
TL;DR: The estimate for the overall topology and, for land plants, divergence times of the plant tree of life is presented and several major controversies and unsolved problems in resolving portions of this tree are discussed.
Abstract: We provide a brief overview of this special issue on the plant tree of life, describing its history and the general nature of its articles. We then present our estimate for the overall topology and, for land plants, divergence times of the plant tree of life. We discuss several major controversies and unsolved problems in resolving portions of this tree. We conclude with a few thoughts about the prospects for obtaining a comprehensive, robustly resolved, and accurately dated plant tree of life and the importance of such a grand endeavor.

Book ChapterDOI
TL;DR: In this paper, the problem of allocating space at berth for vessels with the objective of minimizing total weighted flow time is considered and two mathematical formulations are considered where one is used to develop a tree search procedure while the other is used for developing a lower bound that can speed up the tree search.
Abstract: In this paper, we consider the problem of allocating space at berth for vessels with the objective of minimizing total weighted flow time. Two mathematical formulations are considered where one is used to develop a tree search procedure while the other is used to develop a lower bound that can speed up the tree search procedure. Furthermore, a composite heuristic combining the tree search procedure and pair-wise exchange heuristic is proposed for large size problems. Finally, computational experiments are reported to evaluate the efficiency of the methods.

Proceedings ArticleDOI
20 Jun 2004
TL;DR: Through extensive simulations, it is shown that setting up the clock out timer based on a node's position in the aggregation tree results in a beneficial "cascading effect", yielding considerable energy efficiency, yet maintaining data accuracy and freshness.
Abstract: This paper evaluates the effect of timing in data aggregation algorithms. In-network aggregation achieves energy-efficient data propagation by processing data as it flows from information sources to sinks. Our goal is to show that the decision of when to "clock out" data as it is processed by nodes has a significant performance impact in terms of data accuracy and freshness. Using the sensor network paradigm where all nodes produce information periodically, we compare three aggregation timing policies. Through extensive simulations, we show that setting up the clock out timer based on a node's position in the aggregation tree results in a beneficial "cascading effect", yielding considerable energy efficiency, yet maintaining data accuracy and freshness.
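
The idea of tying a node's clock-out timer to its position in the aggregation tree can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the timer formula, so the linear rule used here (timeout shrinks by a fixed `hop_delay` per level of depth) and the names `node_depths` and `cascading_timeouts` are hypothetical.

```python
from typing import Dict, Hashable, Optional


def node_depths(parent: Dict[Hashable, Optional[Hashable]]) -> Dict[Hashable, int]:
    """Depth of every node; `parent` maps node -> parent, and the sink maps to None."""
    memo: Dict[Hashable, int] = {}

    def depth(node: Hashable) -> int:
        if node not in memo:
            p = parent[node]
            memo[node] = 0 if p is None else depth(p) + 1
        return memo[node]

    return {node: depth(node) for node in parent}


def cascading_timeouts(
    parent: Dict[Hashable, Optional[Hashable]],
    round_period: int,
    hop_delay: int,
) -> Dict[Hashable, int]:
    """Assumed rule: timeout = round_period - depth * hop_delay.

    Deeper nodes clock out earlier, so each parent waits long enough
    to receive its children's data before forwarding its own aggregate.
    """
    depths = node_depths(parent)
    if max(depths.values()) * hop_delay >= round_period:
        raise ValueError("round_period too short for this tree depth")
    return {node: round_period - d * hop_delay for node, d in depths.items()}
```

With this rule every child times out strictly before its parent, which is the ordering that lets aggregates flow toward the sink in a single cascade per round.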

Journal Article
TL;DR: In this article, the problem of broadcasting confidential information to a collection of n devices while providing the ability to revoke an arbitrary subset of those devices and tolerating collusion among the revoked devices was studied.
Abstract: We study the problem of broadcasting confidential information to a collection of n devices while providing the ability to revoke an arbitrary subset of those devices (and tolerating collusion among the revoked devices). In this paper, we restrict our attention to low-memory devices, that is, devices that can store at most O(log n) keys. We consider solutions for both zero-state and low-state cases, where such devices are organized in a tree structure T. We allow the group controller to encrypt broadcasts to any subtree of T, even if the tree is based on a multi-way organizational chart or a severely unbalanced multicast tree.

Book ChapterDOI
15 Aug 2004
TL;DR: This paper restricts its attention to low-memory devices, that is, devices that can store at most O(log n) keys, and considers solutions for both zero-state and low-state cases, where such devices are organized in a tree structure T.
Abstract: We study the problem of broadcasting confidential information to a collection of n devices while providing the ability to revoke an arbitrary subset of those devices (and tolerating collusion among the revoked devices). In this paper, we restrict our attention to low-memory devices, that is, devices that can store at most O(log n) keys. We consider solutions for both zero-state and low-state cases, where such devices are organized in a tree structure T. We allow the group controller to encrypt broadcasts to any subtree of T, even if the tree is based on a multi-way organizational chart or a severely unbalanced multicast tree.
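
To make the O(log n) key budget and subtree broadcasts concrete, here is a minimal sketch of the classic complete-subtree idea on a balanced binary tree. It is not the scheme from this paper, which also covers multi-way and unbalanced trees. The assumptions are that n is a power of two, tree nodes use heap indexing (root 1, leaf for device i at n + i), and each node index stands in for a key; `device_keys` and `cover` are hypothetical names.

```python
from typing import Iterable, List


def device_keys(n: int, device: int) -> List[int]:
    """Key indices a device stores: every node on its leaf-to-root path.

    For n devices this is log2(n) + 1 keys.
    """
    node = n + device
    keys = []
    while node >= 1:
        keys.append(node)
        node //= 2
    return keys


def cover(n: int, revoked: Iterable[int]) -> List[int]:
    """Subtree roots whose keys reach exactly the non-revoked devices."""
    revoked = set(revoked)
    if not revoked:
        return [1]  # nobody revoked: the root key reaches everyone
    # Every node on a path from a revoked leaf to the root is unusable.
    tainted = set()
    for device in revoked:
        tainted.update(device_keys(n, device))
    # Use each untainted child hanging off a tainted internal node.
    result = []
    for node in sorted(tainted):
        if node < n:  # internal node
            for child in (2 * node, 2 * node + 1):
                if child not in tainted:
                    result.append(child)
    return result
```

A broadcast is then encrypted once under each key in the cover. A non-revoked device holds exactly one of those keys, while a revoked device holds none of them, so collusion among revoked devices gains nothing in this simplified setting.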

Journal ArticleDOI
TL;DR: A cluster-based tree algorithm to accelerate k-NN classification without any presuppositions about the metric form and properties of a dissimilarity measure is proposed.
Abstract: Most fast k-nearest neighbor (k-NN) algorithms exploit metric properties of distance measures for reducing computation cost and a few can work effectively on both metric and nonmetric measures. We propose a cluster-based tree algorithm to accelerate k-NN classification without any presuppositions about the metric form and properties of a dissimilarity measure. A mechanism of early decision making and minimal side-operations for choosing searching paths largely contribute to the efficiency of the algorithm. The algorithm is evaluated through extensive experiments over standard NIST and MNIST databases.