
Showing papers on "Graph (abstract data type) published in 2003"


Proceedings ArticleDOI
24 Aug 2003
TL;DR: A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods and shows that it not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
Abstract: Recent research on pattern discovery has progressed from mining frequent itemsets and sequences to mining structured patterns including trees, lattices, and graphs. As a general data structure, a graph can model complicated relations among data, with wide applications in bioinformatics and Web exploration. However, mining large graph patterns is challenging due to the presence of an exponential number of frequent subgraphs. Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. Our performance study shows that CloseGraph not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
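The closedness criterion can be illustrated with itemsets standing in for subgraphs (CloseGraph itself prunes during the search over graph patterns; this post-filter only demonstrates the definition, and the patterns and supports below are invented):

```python
# A pattern is closed if no proper super-pattern has the same support.
# Itemsets (frozensets) are used here as a simple stand-in for subgraphs.
def closed_patterns(patterns):
    """patterns: dict mapping frozenset pattern -> support count."""
    closed = {}
    for p, sup in patterns.items():
        # p is closed unless some proper superset q has identical support
        if not any(p < q and patterns[q] == sup for q in patterns):
            closed[p] = sup
    return closed

freq = {
    frozenset("A"): 3,
    frozenset("B"): 3,
    frozenset("AB"): 3,    # same support as A and B, so A and B are not closed
    frozenset("ABC"): 2,
}
print(closed_patterns(freq))   # only AB and ABC survive
```

Mining only closed patterns loses no support information: every frequent pattern's support is recoverable from its smallest closed superset.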

722 citations


Proceedings ArticleDOI
19 Nov 2003
TL;DR: This work proposes a novel frequent subgraph mining algorithm: FFSM, which employs a vertical search scheme within an algebraic graph framework it has developed to reduce the number of redundant candidates proposed.
Abstract: Frequent subgraph mining is an active research topic in the data mining community. A graph is a general model to represent data and has been used in many domains like cheminformatics and bioinformatics. Mining patterns from graph databases is challenging since graph related operations, such as subgraph testing, generally have higher time complexity than the corresponding operations on itemsets, sequences, and trees, which have been studied extensively. We propose a novel frequent subgraph mining algorithm: FFSM, which employs a vertical search scheme within an algebraic graph framework we have developed to reduce the number of redundant candidates proposed. Our empirical study on synthetic and real datasets demonstrates that FFSM achieves a substantial performance gain over the current state-of-the-art subgraph mining algorithm gSpan.

699 citations


Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work shows how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric.
Abstract: Geodesic active contours and graph cuts are two standard image segmentation techniques. We introduce a new segmentation method combining some of their benefits. Our main intuition is that any cut on a graph embedded in some continuous space can be interpreted as a contour (in 2D) or a surface (in 3D). We show how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric. There are two interesting consequences of this technical result. First, graph cut algorithms can be used to find globally minimum geodesic contours (minimal surfaces in 3D) under arbitrary Riemannian metric for a given set of boundary conditions. Second, we show how to minimize metrication artifacts in existing graph-cut based methods in vision. Theoretically speaking, our work provides an interesting link between several branches of mathematics: differential geometry, integral geometry, and combinatorial optimization. The main technical problem is solved using the Cauchy-Crofton formula from integral geometry.

654 citations


Proceedings Article
16 Sep 2003
TL;DR: This work combines active and semi-supervised learning techniques under a Gaussian random field model, which requires a much smaller number of queries to achieve high accuracy compared with random query selection.
Abstract: Active and semi-supervised learning are important techniques when labeled data are scarce. We combine the two under a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The semi-supervised learning problem is then formulated in terms of a Gaussian random field on this graph, the mean of which is characterized in terms of harmonic functions. Active learning is performed on top of the semi-supervised learning scheme by greedily selecting queries from the unlabeled data to minimize the estimated expected classification error (risk); in the case of Gaussian fields the risk is efficiently computed using matrix methods. We present experimental results on synthetic data, handwritten digit recognition, and text classification tasks. The active learning scheme requires a much smaller number of queries to achieve high accuracy compared with random query selection.
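The harmonic-function mean of the Gaussian field has a closed form in terms of the graph Laplacian: with W the symmetric weight matrix, D the degree matrix, and L = D - W, the values on the unlabeled nodes solve L_uu f_u = W_ul f_l. A minimal sketch on a toy path graph (the graph and labels here are invented):

```python
import numpy as np

# Path graph 0-1-2-3-4; nodes 0 and 4 are labeled 0.0 and 1.0.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
labeled, unlabeled = [0, 4], [1, 2, 3]
f_l = np.array([0.0, 1.0])

D = np.diag(W.sum(axis=1))
L = D - W
# Harmonic solution: f_u = L_uu^{-1} W_ul f_l
f_u = np.linalg.solve(L[np.ix_(unlabeled, unlabeled)],
                      W[np.ix_(unlabeled, labeled)] @ f_l)
print(f_u)  # [0.25 0.5 0.75] -- each value is the mean of its neighbors
```

On a path the solution interpolates linearly between the two labeled endpoints, which is exactly the harmonic (mean-of-neighbors) property.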

578 citations


Proceedings Article
09 Dec 2003
TL;DR: It is shown that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.
Abstract: Many real-world domains are relational in nature, consisting of a set of objects related to each other in complex ways. This paper focuses on predicting the existence and the type of links between entities in such domains. We apply the relational Markov network framework of Taskar et al. to define a joint probabilistic model over the entire link graph — entity attributes and links. The application of the RMN algorithm to this task requires the definition of probabilistic patterns over subgraph structures. We apply this method to two new relational datasets, one involving university webpages, and the other a social network. We show that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.

524 citations


Proceedings ArticleDOI
24 Aug 2003
TL;DR: Two techniques for graph-based anomaly detection are introduced, along with a new method for calculating the regularity of a graph, with applications to anomaly detection.
Abstract: Anomaly detection is an area that has received much attention in recent years. It has a wide variety of applications, including fraud detection and network intrusion detection. A good deal of research has been performed in this area, often using strings or attribute-value data as the medium from which anomalies are to be extracted. Little work, however, has focused on anomaly detection in graph-based data. In this paper, we introduce two techniques for graph-based anomaly detection. In addition, we introduce a new method for calculating the regularity of a graph, with applications to anomaly detection. We hypothesize that these methods will prove useful both for finding anomalies, and for determining the likelihood of successful anomaly detection within graph-based data. We provide experimental results using both real-world network intrusion data and artificially-created data.

504 citations


Proceedings ArticleDOI
03 Nov 2003
TL;DR: Two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns.
Abstract: A common method for finding information in an organization is to use social networks---ask people, following referrals until someone with the right information is found. Another way is to automatically mine documents to determine who knows what. Email documents seem particularly well suited to this task of "expertise location", as people routinely communicate what they know. Moreover, because people explicitly direct email to one another, social networks are likely to be contained in the patterns of communication. Can these patterns be used to discover experts on particular topics? Is this approach better than mining message content alone? To find answers to these questions, two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns. An evaluation was done using email and explicit expertise ratings from two different organizations. The rankings given by each algorithm were compared to the explicit rankings with the precision and recall measures commonly used in information retrieval, as well as the d' measure commonly used in signal-detection theory. Results show that the graph-based algorithm performs better than the content-based algorithm at identifying experts in both cases, demonstrating that the graph-based algorithm effectively extracts more information than is found in content alone.
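The graph-based ranking the paper applies is HITS, which can be sketched as power iteration on a "who emails whom" adjacency matrix (the matrix below is made up; the real system also incorporates message text):

```python
import numpy as np

A = np.array([  # A[i, j] = 1 if person i emails person j
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# HITS mutual reinforcement: hubs point to good authorities,
# authorities are pointed to by good hubs.
auth = np.ones(4)
for _ in range(100):
    hub = A @ auth
    auth = A.T @ hub
    auth /= np.linalg.norm(auth)   # normalize to avoid overflow
print("authorities:", np.round(auth, 3))
```

Person 2, who receives mail from three others, gets the highest authority score, matching the intuition that frequently consulted people are likely experts.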

395 citations


Journal ArticleDOI
TL;DR: A method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network that exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors.
Abstract: Motivation: The development of experimental methods for genome scale analysis of molecular interaction networks has made possible new approaches to inferring protein function. This paper describes a method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network. The method exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors. A binomial model of local neighbor function labeling probability is combined with a Markov random field propagation algorithm to assign function probabilities for proteins in the network. Results: We applied the method to a protein-protein interaction dataset for the yeast Saccharomyces cerevisiae using the Gene Ontology (GO) terms as function labels. The method reconstructed known GO term assignments with high precision, and produced putative GO assignments to 320 proteins that currently lack GO annotation, which represents about 10% of the unlabeled proteins in S. cerevisiae.
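The "neighbors share function" intuition can be sketched as a one-step neighbor vote (the interaction data and labels below are invented; the paper's actual method combines a binomial neighbor model with Markov random field propagation rather than this single pass):

```python
# Score a candidate label for an unannotated protein by the fraction of
# its interaction partners that carry the label.
interactions = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P1", "P3"},
}
labels = {"P2": {"kinase"}, "P3": {"kinase"}, "P4": {"transport"}}

def label_score(protein, label):
    nbrs = interactions[protein]
    hits = sum(1 for n in nbrs if label in labels.get(n, set()))
    return hits / len(nbrs)

print(label_score("P1", "kinase"))   # 2 of 3 partners are kinases
```

MRF propagation extends this idea by letting putative labels on unannotated neighbors influence each other until the labeling is globally consistent.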

387 citations


Proceedings ArticleDOI
02 Jun 2003
TL;DR: In this article, a more general algorithm which selects maximal speedup convex subgraphs of the application dataflow graph under fundamental micro-architectural constraints is presented, which improves significantly on the state of the art.
Abstract: Many commercial processors now offer the possibility of extending their instruction set for a specific application - that is, to introduce customized functional units. There is a need to develop algorithms that decide automatically, from high-level application code, which operations are to be carried out in the customized extensions. A few algorithms exist but are severely limited in the type of operation clusters they can choose and hence reduce significantly the effectiveness of specialization. In this paper, we introduce a more general algorithm which selects maximal-speedup convex subgraphs of the application dataflow graph under fundamental microarchitectural constraints, and which improves significantly on the state of the art.

355 citations


Journal ArticleDOI
TL;DR: A new generic graph model for traffic grooming in heterogeneous WDM mesh networks, based on the auxiliary graph, is proposed which can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities.
Abstract: As the operation of our fiber-optic backbone networks migrates from interconnected SONET rings to arbitrary mesh topology, traffic grooming on wavelength-division multiplexing (WDM) mesh networks becomes an extremely important research problem. To address this problem, we propose a new generic graph model for traffic grooming in heterogeneous WDM mesh networks. The novelty of our model is that, by only manipulating the edges of the auxiliary graph created by our model and the weights of these edges, our model can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities. Based on the auxiliary graph, we develop an integrated traffic-grooming algorithm (IGABAG) and an integrated grooming procedure (INGPROC) which jointly solve several traffic-grooming subproblems by simply applying the shortest-path computation method. Different grooming policies can be represented by different weight-assignment functions, and the performance of these grooming policies is compared under both nonblocking and blocking scenarios. The IGABAG can be applied to both static and dynamic traffic grooming. In static grooming, the traffic-selection scheme is key to achieving good network performance. We propose several traffic-selection schemes based on this model and we evaluate their performance for different network topologies.

355 citations


Book ChapterDOI
16 Sep 2003
TL;DR: An experimental evaluation of graph clustering approaches based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity is conducted, and a new approach combining proven techniques from graph partitioning and geometric clustering, which compares favorably, is introduced.
Abstract: A promising approach to graph clustering is based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. As a first step towards understanding the consequences of particular conceptions, we conducted an experimental evaluation of graph clustering approaches. By combining proven techniques from graph partitioning and geometric clustering, we also introduce a new approach that compares favorably.

Book
01 Jan 2003
TL;DR: This book covers two good counting algorithms, #P-completeness, the reduction of approximate counting to almost uniform sampling via Markov chains, coupling and canonical-path arguments, volume estimation of convex bodies including a proof of the Poincaré inequality (Theorem 6.7), and inapproximability.
Abstract: Foreword.- 1 Two good counting algorithms.- 1.1 Spanning trees.- 1.2 Perfect matchings in a planar graph.- 2 #P-completeness.- 2.1 The class #P.- 2.2 A primal #P-complete problem.- 2.3 Computing the permanent is hard on average.- 3 Sampling and counting.- 3.1 Preliminaries.- 3.2 Reducing approximate counting to almost uniform sampling.- 3.3 Markov chains.- 4 Coupling and colourings.- 4.1 Colourings of a low-degree graph.- 4.2 Bounding mixing time using coupling.- 4.3 Path coupling.- 5 Canonical paths and matchings.- 5.1 Matchings in a graph.- 5.2 Canonical paths.- 5.3 Back to matchings.- 5.4 Extensions and further applications.- 5.5 Continuous time.- 6 Volume of a convex body.- 6.1 A few remarks on Markov chains with continuous state space.- 6.2 Invariant measure of the ball walk.- 6.3 Mixing rate of the ball walk.- 6.4 Proof of the Poincaré inequality (Theorem 6.7).- 6.5 Proofs of the geometric lemmas.- 6.6 Relaxing the curvature condition.- 6.7 Using samples to estimate volume.- 6.8 Appendix: a proof of Corollary 6.8.- 7 Inapproximability.- 7.1 Independent sets in a low degree graph.

Proceedings ArticleDOI
09 Jun 2003
TL;DR: The D(k) index is introduced, an adaptive structural summary for general graph structured documents based on the concept of bisimilarity, and is shown to be a more effective structural summary than previous static ones, as a result of its query load sensitivity.
Abstract: To facilitate queries over semi-structured data, various structural summaries have been proposed. Structural summaries are derived directly from the data and serve as indices for evaluating path expressions on semi-structured or XML data. We introduce the D(k)-index, an adaptive structural summary for general graph structured documents. Building on previous work, the 1-index and A(k)-index, the D(k)-index is also based on the concept of bisimilarity. However, as a generalization of the 1-index and A(k)-index, the D(k)-index possesses the adaptive ability to adjust its structure according to the current query load. This dynamism also facilitates efficient update algorithms, which are crucial to practical applications of structural indices, but have not been adequately addressed in previous index proposals. Our experiments show that the D(k)-index is a more effective structural summary than previous static ones, as a result of its query load sensitivity. In addition, update operations on the D(k)-index can be performed more efficiently than on its predecessors.

Proceedings ArticleDOI
01 Jun 2003
TL;DR: In this paper, the authors propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols.
Abstract: We propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols. To evaluate the performance of our approach, we estimate the value of k that guarantees connectivity of the communication graph with high probability. We then define k-Neigh, a fully distributed, asynchronous, and localized protocol that follows the above approach and uses distance estimation. We prove that k-Neigh terminates at every node after a total of 2n messages have been exchanged (with n nodes in the network) and within strictly bounded time. Finally, we present simulation results which show that our approach is about 20% more energy-efficient than a widely-studied existing protocol.
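The core construction, each node listing its k nearest neighbors and keeping only symmetric links, can be sketched centrally (node positions below are random toy data; the actual k-Neigh protocol computes this in a distributed fashion from distance estimates):

```python
import math, random

random.seed(1)
nodes = [(random.random(), random.random()) for _ in range(20)]
k = 5

def nearest(i):
    """Indices of the k nodes closest to node i."""
    others = sorted((j for j in range(len(nodes)) if j != i),
                    key=lambda j: math.dist(nodes[i], nodes[j]))
    return set(others[:k])

lists = [nearest(i) for i in range(len(nodes))]
# Keep an edge only if both endpoints list each other (symmetry).
edges = {(i, j) for i in range(len(nodes)) for j in lists[i]
         if i < j and i in lists[j]}
print(len(edges), "symmetric edges")
```

The symmetry filter is what guarantees every retained link is usable in both directions, which is the property higher-layer protocols rely on.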

Proceedings ArticleDOI
20 May 2003
TL;DR: A new algorithm, OPIC, is introduced that works on-line, uses far fewer resources, does not require storing the link matrix, and can be used to focus crawling on the most interesting pages.
Abstract: The computation of page importance in a huge dynamic graph has recently attracted a lot of attention because of the web. Page importance, or page rank, is defined as the fixpoint of a matrix equation. Previous algorithms compute it off-line and require the use of a lot of extra CPU as well as disk resources (e.g. to store, maintain and read the link matrix). We introduce a new algorithm, OPIC, that works on-line and uses far fewer resources. In particular, it does not require storing the link matrix. It is on-line in that it continuously refines its estimate of page importance while the web/graph is visited. Thus it can be used to focus crawling on the most interesting pages. We prove the correctness of OPIC. We present Adaptive OPIC, which also works on-line but adapts dynamically to changes of the web. A variant of this algorithm is now used by Xyleme. We report on experiments with synthetic data. In particular, we study the convergence and adaptiveness of the algorithms for various scheduling strategies for the pages to visit. We also report on experiments based on crawls of significant portions of the web.
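OPIC's central mechanism can be sketched as follows (the link structure below is invented, and this is a simplified rendering of the idea, not the paper's full algorithm): each page holds some "cash"; visiting a page credits its history and splits its cash among its out-links, so only two numbers per page are stored, never the link matrix.

```python
import random

graph = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # made-up out-link lists
n = len(graph)
cash = [1.0 / n] * n      # cash currently held by each page
history = [0.0] * n       # total cash that has flowed through each page

random.seed(0)
for _ in range(10000):
    page = random.randrange(n)        # any fair visit order works
    history[page] += cash[page]
    share = cash[page] / len(graph[page])
    for out in graph[page]:           # distribute cash along out-links
        cash[out] += share
    cash[page] = 0.0

total = sum(history)
importance = [h / total for h in history]   # on-line importance estimate
print([round(x, 2) for x in importance])
```

The estimate sharpens as crawling continues, which is why the same quantity can steer the crawler toward high-importance pages.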

Proceedings ArticleDOI
20 May 2003
TL;DR: This paper presents the notion of Semantic Associations as complex relationships between resource entities based on a specific notion of similarity called r-isomorphism, and formalizes these notions for the RDF data model, by introducing a notion of a Property Sequence as a type.
Abstract: This paper presents the notion of Semantic Associations as complex relationships between resource entities. These relationships capture both the connectivity of entities and the similarity of entities based on a specific notion of similarity called r-isomorphism. It formalizes these notions for the RDF data model, by introducing the notion of a Property Sequence as a type. In the context of a graph model such as that for RDF, Semantic Associations amount to certain graph signatures. Specifically, they refer to sequences (i.e. directed paths) here called Property Sequences, between entities, networks of Property Sequences (i.e. undirected paths), or subgraphs of r-isomorphic Property Sequences. The ability to query about the existence of such relationships is fundamental to tasks in analytical domains such as national security and business intelligence, where tasks often focus on finding complex yet meaningful and obscured relationships between entities. However, support for such queries is lacking in contemporary query systems, including those for RDF.

Proceedings ArticleDOI
12 Jan 2003
TL;DR: An algorithm with an asymptotic approximation factor of |S|/4 is presented, which gives a sufficient condition for the existence of k edge-disjoint Steiner trees in a graph in terms of the edge-connectivity of the graph.
Abstract: The Steiner packing problem is to find the maximum number of edge-disjoint subgraphs of a given graph G that connect a given set of required points S. This problem is motivated by practical applications in VLSI layout and broadcasting, as well as theoretical reasons. In this paper, we study this problem and present an algorithm with an asymptotic approximation factor of |S|/4. This gives a sufficient condition for the existence of k edge-disjoint Steiner trees in a graph in terms of the edge-connectivity of the graph. We will show that this condition is the best possible if the number of terminals is 3. At the end, we consider the fractional version of this problem, and observe that it can be reduced to the minimum Steiner tree problem via the ellipsoid algorithm.

Journal ArticleDOI
TL;DR: In this article, a general weak law of large numbers for functionals of binomial point processes in d-dimensional space is established, with a limit that depends explicitly on the density of the point process.
Abstract: Using a coupling argument, we establish a general weak law of large numbers for functionals of binomial point processes in d-dimensional space, with a limit that depends explicitly on the (possibly nonuniform) density of the point process. The general result is applied to the minimal spanning tree, the k-nearest neighbors graph, the Voronoi graph and the sphere of influence graph. Functionals of interest include total edge length with arbitrary weighting, number of vertices of specified degree and number of components. We also obtain weak laws of large numbers for functionals of marked point processes, including statistics of Boolean models.

Book ChapterDOI
Olaf Sporns1
01 Jan 2003
TL;DR: Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context.
Abstract: This paper summarizes a set of graph theory methods that are of special relevance to the computational analysis of neural connectivity patterns. Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context. A set of Matlab functions implementing these methods is available for download at http://php.indiana.edu/~osporns/graphmeasures.htm.
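One of the listed measures, the cluster index (clustering coefficient), can be sketched for an undirected adjacency matrix (the toy matrix here is invented, and this is an illustration of the measure rather than the paper's Matlab toolbox):

```python
import numpy as np

A = np.array([          # symmetric 0/1 adjacency matrix, no self-loops
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])

def clustering(A, i):
    """Fraction of node i's neighbor pairs that are themselves connected."""
    nbrs = np.flatnonzero(A[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = A[np.ix_(nbrs, nbrs)].sum() / 2   # edges among the neighbors
    return links / (k * (k - 1) / 2)

print(clustering(A, 0))   # 1/3: one of three possible neighbor pairs linked
```

High average clustering combined with short path lengths is the "small-world" signature that such connectivity analyses often probe in cortical networks.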

Book
01 Jan 2003
TL;DR: This work defines graph algebras and reveals their applicability to automata theory and explores assorted monoids, semigroups, rings, codes, and other algebraic structures to outline theorems and algorithms for finite state automata and grammars.
Abstract: Graph algebras possess the capacity to relate fundamental concepts of computer science, combinatorics, graph theory, operations research, and universal algebra. They are used to identify nontrivial connections across notions, expose conceptual properties, and mediate the application of methods from one area toward questions of the other four. After a concentrated review of the prerequisite mathematical background, Graph Algebras and Automata defines graph algebras and reveals their applicability to automata theory. It proceeds to explore assorted monoids, semigroups, rings, codes, and other algebraic structures and to outline theorems and algorithms for finite state automata and grammars.

Proceedings ArticleDOI
27 Oct 2003
TL;DR: Techniques to automatically learn attack strategies from correlated intrusion alerts are presented; the similarity measurement of attack strategies is reduced to an error-tolerant graph/subgraph isomorphism problem, and similarity between attack strategies is measured in terms of the cost to transform one strategy into another.
Abstract: Understanding strategies of attacks is crucial for security applications such as computer and network forensics, intrusion response, and prevention of future attacks. This paper presents techniques to automatically learn attack strategies from correlated intrusion alerts. Central to these techniques is a model that represents an attack strategy as a graph of attacks with constraints on the attack attributes and the temporal order among these attacks. To learn the intrusion strategy is then to extract such a graph from a sequence of intrusion alerts. To further facilitate the analysis of attack strategies, which is essential to many security applications such as computer and network forensics, this paper presents techniques to measure the similarity between attack strategies. The basic idea is to reduce the similarity measurement of attack strategies to an error-tolerant graph/subgraph isomorphism problem, and to measure the similarity between attack strategies in terms of the cost to transform one strategy into another. Finally, this paper presents some experimental results, which demonstrate the potential of the proposed techniques.

Journal ArticleDOI
TL;DR: This review covers Algorithms in C++ and a book on pattern matching, whose authors make the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature.
Abstract: deeper into graph theory, thereby generating algorithms that are more challenging to the reader. Topics such as Depth-First Search, Hamiltonian Paths, Kruskal's Algorithm and Euclidean Networks are explored in detail. I have studied graph theory and therefore I was able to appreciate the examples and algorithms given in the text. However, I believe the author gives enough of an introduction in the beginning and explanations throughout the text so that a reader without any prior exposure to graph theory can still gain valuable experience in developing algorithms to solve complex problems. This book would be an excellent tool for a graph theory course (assuming the student is familiar with programming) or perhaps an advanced programming course dealing with algorithms or object oriented design methods. I found that the explanations of theorems and proofs in this text were excellent and helped me to further my knowledge and appreciation of graph theory. The object-oriented approach to implementing algorithms in C++ broadened my programming experience and helped to keep my interest in the topic. Occasionally the author assumes that the reader either has read the first volume, or has the text available for review. The first two volumes can be purchased as a bundle, and I suggest the reader consider obtaining both texts. However the programs from both volumes are available for download on the author's website, so it is not necessary to have both books if the reader is comfortable with programming topics such as queues. Overall, I enjoyed Algorithms in C++, and I plan to purchase the first and third volumes to complement this text. I am certain that I will refer to all three in the future when I am in need of guidance, or perhaps even diversion.

Pattern matching in strings is a basic problem in many areas of computer science, but particularly in applications that deal with text searching and genetic sequences.
Information retrieval and computational biology are generating dramatic increases both in the size of texts to search and in the sophistication of the searches. The authors are two academics with bioinformatics industry experience. They use this book to make the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature. Further, they claim that the only way to truly determine the fastest algorithm …

Proceedings ArticleDOI
05 Nov 2003
TL;DR: An architectural framework, DFuse, consisting of a data fusion API and a distributed algorithm for energy-aware role assignment, enables an application to be specified as a coarse-grained dataflow graph and eases application development and deployment.
Abstract: Simple in-network data aggregation (or fusion) techniques for sensor networks have been the focus of several recent research efforts, but they are insufficient to support advanced fusion applications. We extend these techniques to future sensor networks and ask two related questions: (a) what is the appropriate set of data fusion techniques, and (b) how do we dynamically assign aggregation roles to the nodes of a sensor network. We have developed an architectural framework, DFuse, for answering these two questions. It consists of a data fusion API and a distributed algorithm for energy-aware role assignment. The fusion API enables an application to be specified as a coarse-grained dataflow graph, and eases application development and deployment. The role assignment algorithm maps the graph onto the network, and optimally adapts the mapping at run-time using role migration. Experiments on an iPAQ farm show that the fusion API has low overhead, and the role assignment algorithm with role migration significantly increases the network lifetime compared to any static assignment.

Journal ArticleDOI
TL;DR: In this paper, a probabilistic path planning and hierarchical displacement mapping are combined with a posture transition graph to guide the locomotion of a biped figure in a virtual environment.
Abstract: Typical high-level directives for locomotion of human-like characters are useful for interactive games and simulations as well as for off-line production animation. In this paper, we present a new scheme for planning natural-looking locomotion of a biped figure to facilitate rapid motion prototyping and task-level motion generation. Given start and goal positions in a virtual environment, our scheme gives a sequence of motions to move from the start to the goal using a set of live-captured motion clips. Based on a novel combination of probabilistic path planning and hierarchical displacement mapping, our scheme consists of three parts: roadmap construction, roadmap search, and motion generation. We randomly sample a set of valid footholds of the biped figure from the environment to construct a directed graph, called a roadmap, that guides the locomotion of the figure. Every edge of the roadmap is associated with a live-captured motion clip. Augmenting the roadmap with a posture transition graph, we traverse it to obtain the sequence of input motion clips and that of target footprints. We finally adapt the motion sequence to the constraints specified by the footprint sequence to generate a desired locomotion.

Book ChapterDOI
25 Jul 2003
TL;DR: A novel view of the spectral approach is presented, which provides a direct link between eigenvectors and the aesthetic properties of the layout and is accompanied by an aesthetically-motivated algorithm, which is much easier to understand and to implement than the standard numerical algorithms for computing eigenvectors.
Abstract: The spectral approach for graph visualization computes the layout of a graph using certain eigenvectors of related matrices. Some important advantages of this approach are an ability to compute optimal layouts (according to specific requirements) and a very rapid computation time. In this paper we explore spectral visualization techniques and study their properties. We present a novel view of the spectral approach, which provides a direct link between eigenvectors and the aesthetic properties of the layout. In addition, we present a new formulation of the spectral drawing method with some aesthetic advantages. This formulation is accompanied by an aesthetically-motivated algorithm, which is much easier to understand and to implement than the standard numerical algorithms for computing eigenvectors.
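A minimal sketch of the basic spectral method the abstract builds on: use the eigenvectors of the graph Laplacian with the two smallest nonzero eigenvalues as x and y coordinates. This is the textbook formulation, not the paper's aesthetically-motivated variant:

```python
import numpy as np

def spectral_layout(adj):
    """2-D spectral layout: coordinates come from the Laplacian eigenvectors
    belonging to the two smallest nonzero eigenvalues (a standard sketch of
    the approach, not the paper's refined formulation)."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A     # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)     # eigh returns eigenvalues in ascending order
    return vecs[:, 1], vecs[:, 2]      # skip the constant eigenvector (eigenvalue 0)

# 4-cycle: the spectral layout spreads the vertices evenly around the origin
adj = [[0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 1, 0]]
x, y = spectral_layout(adj)
```

Both coordinate vectors are orthogonal to the all-ones vector, so the drawing is automatically centered; this is one of the direct links between eigenvectors and layout aesthetics that the paper examines.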

Journal ArticleDOI
TL;DR: A novel program representation, called concept lattice of decomposition slices, is shown to be an extension of the decomposition slice graph, and is obtained by means of concept analysis, with additional nodes associated with weak interferences between computations, i.e., shared statements which are not decomposition slices.
Abstract: The decomposition slice graph and concept lattice are two program representations used to abstract the details of code into a higher-level view of the program. The decomposition slice graph partitions the program into computations performed on different variables and shows the dependence relation between computations, holding when a computation needs another computation as a building block. The concept lattice groups program entities which share common attributes and organizes such groupings into a hierarchy of concepts, which are related through generalizations/specializations. This paper investigates the relationship existing between these two program representations. The main result of this paper is a novel program representation, called concept lattice of decomposition slices, which is shown to be an extension of the decomposition slice graph, and is obtained by means of concept analysis, with additional nodes associated with weak interferences between computations, i.e., shared statements which are not decomposition slices. The concept lattice of decomposition slices can be used to support software maintenance by providing relevant information about the computations performed by a program and the related dependences/interferences, as well as by representing a natural data structure on which to conduct impact analysis. Preliminary results on small to medium size code support the applicability of this method at the intraprocedural level or when investigating the dependences among small groups of procedures.
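The concept-analysis step underlying the lattice can be illustrated by brute-force enumeration of formal concepts from a tiny binary context. Here objects and attributes are generic placeholders (in the paper, they would relate slices and statements); the example data are made up:

```python
from itertools import combinations

def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small binary
    context {object: set(attributes)} by brute force -- fine for tiny inputs,
    and only a sketch of the concept-analysis step the paper relies on."""
    objs = list(context)
    found = set()
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            # intent: attributes shared by every object in the subset
            intent = set.intersection(*(context[o] for o in subset)) if subset \
                     else set.union(*context.values())
            # extent: every object having all attributes of the intent
            extent = frozenset(o for o in objs if intent <= context[o])
            found.add((extent, frozenset(intent)))
    return found

ctx = {"s1": {"x", "y"}, "s2": {"y"}, "s3": {"y", "z"}}
for extent, intent in sorted(concepts(ctx), key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent))
```

The resulting concepts, ordered by extent inclusion, form the lattice; real implementations use incremental algorithms rather than this exponential enumeration.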

Patent
25 Sep 2003
TL;DR: In this article, the authors present a method for reporting data network monitoring information, which includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values.
Abstract: A method for reporting data network monitoring information. The method includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values. For a range of the trace, a histogram is built and displayed corresponding to the graph data points (step 430). For a user interface, a performance monitoring display is generated including a graph of the trace relative to an x-axis and a y-axis and a representation of the histogram. Using the graphical user interface (GUI), the user can access a selection mechanism by moving the range selector to define the selected histogram range (steps 440 and 470). Each graph data point in the trace corresponds to a histogram previously built from the performance metric values, and the trace is generated by determining and plotting the average value of each of the graph data point histograms. The building of the histogram for the performance monitoring display involves combining the graph data point histograms corresponding to the graph data points in the selected histogram range (step 460).
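The mechanism described can be sketched as follows: each graph data point carries a histogram of metric samples, the trace plots each histogram's average, and a selected range is summarized by combining its histograms bin-wise. The bin representation and sample data below are illustrative:

```python
def histogram_average(hist):
    """Mean of a histogram given as {bin_midpoint: count}."""
    total = sum(hist.values())
    return sum(mid * n for mid, n in hist.items()) / total

def combine_histograms(hists):
    """Bin-wise sum of several histograms over the selected range."""
    combined = {}
    for h in hists:
        for mid, n in h.items():
            combined[mid] = combined.get(mid, 0) + n
    return combined

# one histogram of metric samples per graph data point (e.g. per minute)
point_hists = [
    {10: 2, 20: 2},        # average 15.0
    {20: 1, 30: 3},        # average 27.5
    {10: 1, 20: 1},        # average 15.0
]
trace = [histogram_average(h) for h in point_hists]
print(trace)                               # [15.0, 27.5, 15.0]
print(combine_histograms(point_hists[:2])) # {10: 2, 20: 3, 30: 3}
```

`trace` is what gets plotted against the x- and y-axes, while `combine_histograms` over the user-selected range produces the histogram shown alongside the graph.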

Book Chapter
01 Jan 2003
TL;DR: In this article, the authors propose two methods for inferring semantic similarity between terms from a corpus: one defines word similarity in terms of document similarity and vice versa, giving rise to a system of equations whose equilibrium point yields a semantic similarity measure; the other models semantic relations by a diffusion process on a graph defined by lexicon and co-occurrence information.
Abstract: The standard representation of text documents as bags of words suffers from well known limitations, mostly due to its inability to exploit semantic similarity between terms. Attempts to incorporate some notion of term similarity include latent semantic indexing [8], the use of semantic networks [9], and probabilistic methods [5]. In this paper we propose two methods for inferring such similarity from a corpus. The first one defines word-similarity based on document-similarity and vice versa, giving rise to a system of equations whose equilibrium point we use to obtain a semantic similarity measure. The second method models semantic relations by means of a diffusion process on a graph defined by lexicon and co-occurrence information. Both approaches produce valid kernel functions parametrised by a real number. The paper shows how the alignment measure can be used to successfully perform model selection over this parameter. Combined with the use of support vector machines we obtain positive results.
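The first method's fixed-point idea can be sketched with a tiny document-term matrix: a word-similarity matrix W and a document-similarity matrix D are defined in terms of each other and iterated to an (approximate) equilibrium. The max-normalization and iteration count below are simplifying assumptions, not the paper's exact formulation:

```python
def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def normalize(M):
    """Scale a matrix so its largest entry has magnitude 1."""
    m = max(abs(v) for row in M for v in row) or 1.0
    return [[v / m for v in row] for row in M]

def similarity_fixed_point(X, iters=50):
    """Iterate D <- X W X^T, W <- X^T D X (normalized each step) -- a sketch
    of defining word and document similarity in terms of each other."""
    n_words = len(X[0])
    XT = [list(col) for col in zip(*X)]
    W = [[1.0 if i == j else 0.0 for j in range(n_words)] for i in range(n_words)]
    for _ in range(iters):
        D = normalize(matmul(matmul(X, W), XT))
        W = normalize(matmul(matmul(XT, D), X))
    return W, D

# tiny doc-term matrix: rows are documents, columns are words
X = [[1, 1, 0],
     [0, 1, 1]]
W, D = similarity_fixed_point(X)
# words 0 and 2 never co-occur, but both co-occur with word 1,
# so they acquire nonzero similarity through the documents
print(W[0][2] > 0)  # True
```

This transitivity of similarity through documents is exactly what the bag-of-words representation cannot express on its own.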

Journal ArticleDOI
TL;DR: A new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network is proposed, and a new graph selection criterion is theoretically derived from a Bayesian approach in general situations.
Abstract: We propose a new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network. An essential point of Bayesian network construction is the estimation of the conditional distribution of each random variable. We consider fitting nonparametric regression models with heterogeneous error variances to the microarray gene expression data to capture the nonlinear structures between genes. Selecting the optimal graph, which gives the best representation of the system among genes, is still a problem to be solved. We theoretically derive a new graph selection criterion from a Bayesian approach in general situations. The proposed method includes previous methods based on Bayesian networks. We demonstrate the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae gene expression data newly obtained by disrupting 100 genes.
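The graph-selection idea can be illustrated with a toy score: compare a one-parent linear-Gaussian model of a gene against a parent-free model, preferring the graph with the higher score. This BIC-style criterion is a simple stand-in for the paper's nonparametric, heteroscedastic criterion; the data are synthetic:

```python
import math

def bic_score(child, parent=None):
    """BIC of a linear-Gaussian model child ~ parent (or child ~ const).
    An illustrative stand-in for the paper's graph selection criterion."""
    n = len(child)
    if parent is None:
        mu = sum(child) / n
        rss = sum((y - mu) ** 2 for y in child)
        k = 1                              # one free parameter: the mean
    else:
        mx = sum(parent) / n
        my = sum(child) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(parent, child))
        sxx = sum((x - mx) ** 2 for x in parent)
        b = sxy / sxx                      # least-squares slope
        a = my - b * mx                    # intercept
        rss = sum((y - (a + b * x)) ** 2 for x, y in zip(parent, child))
        k = 2                              # slope and intercept
    # log-likelihood term minus complexity penalty
    return -0.5 * n * math.log(rss / n + 1e-12) - 0.5 * k * math.log(n)

# gene y depends almost linearly on gene x: the one-parent graph scores higher
x = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]
y = [0.2, 1.1, 1.9, 2.7, 3.4, 4.2]
print(bic_score(y, x) > bic_score(y))  # True
```

Scoring every candidate parent set for every gene and keeping the best acyclic combination is the (computationally hard) search problem that network-construction methods must address.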

Journal ArticleDOI
TL;DR: A class of multiparticle entanglement purification protocols is introduced that allows distillation of a large class of entangled states, including cluster states, Greenberger-Horne-Zeilinger states, and various error correction codes.
Abstract: We introduce a class of multiparticle entanglement purification protocols that allow us to distill a large class of entangled states. These include cluster states, Greenberger-Horne-Zeilinger states, and various error correction codes all of which belong to the class of two-colorable graph states. We analyze these schemes under realistic conditions and observe that they are scalable; i.e., the threshold value for imperfect local operations does not depend on the number of parties for many of these states. When compared to schemes based on bipartite entanglement purification, the protocol is more efficient and the achievable quality of the purified states is larger. As an application we discuss an experimental realization of the protocol in optical lattices which allows one to purify cluster states.