
Showing papers on "Graph (abstract data type) published in 2003"


Proceedings ArticleDOI
24 Aug 2003
TL;DR: A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods and shows that it not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
Abstract: Recent research on pattern discovery has progressed from mining frequent itemsets and sequences to mining structured patterns including trees, lattices, and graphs. As a general data structure, a graph can model complicated relations among data, with wide applications in bioinformatics and Web exploration. However, mining large graph patterns is challenging due to the presence of an exponential number of frequent subgraphs. Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. Our performance study shows that CloseGraph not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
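The closedness criterion can be illustrated with itemsets standing in for subgraphs (CloseGraph itself prunes during the search over graph patterns; this post-filter only demonstrates the definition, and the patterns and supports below are invented):

```python
# A pattern is closed if no proper super-pattern has the same support.
# Itemsets (frozensets) are used here as a simple stand-in for subgraphs.
def closed_patterns(patterns):
    """patterns: dict mapping frozenset pattern -> support count."""
    closed = {}
    for p, sup in patterns.items():
        # p is closed unless some proper superset q has identical support
        if not any(p < q and patterns[q] == sup for q in patterns):
            closed[p] = sup
    return closed

freq = {
    frozenset("A"): 3,
    frozenset("B"): 3,
    frozenset("AB"): 3,    # same support as A and B, so A and B are not closed
    frozenset("ABC"): 2,
}
print(closed_patterns(freq))   # only AB and ABC survive
```

Mining only closed patterns loses no support information: every frequent pattern's support is recoverable from its smallest closed superset.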

722 citations


Proceedings ArticleDOI
19 Nov 2003
TL;DR: This work proposes a novel frequent subgraph mining algorithm: FFSM, which employs a vertical search scheme within an algebraic graph framework it has developed to reduce the number of redundant candidates proposed.
Abstract: Frequent subgraph mining is an active research topic in the data mining community. A graph is a general model to represent data and has been used in many domains like cheminformatics and bioinformatics. Mining patterns from graph databases is challenging since graph related operations, such as subgraph testing, generally have higher time complexity than the corresponding operations on itemsets, sequences, and trees, which have been studied extensively. We propose a novel frequent subgraph mining algorithm: FFSM, which employs a vertical search scheme within an algebraic graph framework we have developed to reduce the number of redundant candidates proposed. Our empirical study on synthetic and real datasets demonstrates that FFSM achieves a substantial performance gain over the current state-of-the-art subgraph mining algorithm gSpan.

699 citations


Proceedings ArticleDOI
13 Oct 2003
TL;DR: This work shows how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric.
Abstract: Geodesic active contours and graph cuts are two standard image segmentation techniques. We introduce a new segmentation method combining some of their benefits. Our main intuition is that any cut on a graph embedded in some continuous space can be interpreted as a contour (in 2D) or a surface (in 3D). We show how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric. There are two interesting consequences of this technical result. First, graph cut algorithms can be used to find globally minimum geodesic contours (minimal surfaces in 3D) under arbitrary Riemannian metric for a given set of boundary conditions. Second, we show how to minimize metrication artifacts in existing graph-cut based methods in vision. Theoretically speaking, our work provides an interesting link between several branches of mathematics: differential geometry, integral geometry, and combinatorial optimization. The main technical problem is solved using the Cauchy-Crofton formula from integral geometry.

654 citations


Proceedings Article
16 Sep 2003
TL;DR: This work combines active and semi-supervised learning techniques under a Gaussian random field model, which requires a much smaller number of queries to achieve high accuracy compared with random query selection.
Abstract: Active and semi-supervised learning are important techniques when labeled data are scarce. We combine the two under a Gaussian random field model. Labeled and unlabeled data are represented as vertices in a weighted graph, with edge weights encoding the similarity between instances. The semi-supervised learning problem is then formulated in terms of a Gaussian random field on this graph, the mean of which is characterized in terms of harmonic functions. Active learning is performed on top of the semi-supervised learning scheme by greedily selecting queries from the unlabeled data to minimize the estimated expected classification error (risk); in the case of Gaussian fields the risk is efficiently computed using matrix methods. We present experimental results on synthetic data, handwritten digit recognition, and text classification tasks. The active learning scheme requires a much smaller number of queries to achieve high accuracy compared with random query selection.
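The harmonic-function mean of the Gaussian field has a closed form in terms of the graph Laplacian: with W the symmetric weight matrix, D the degree matrix, and L = D - W, the values on the unlabeled nodes solve L_uu f_u = W_ul f_l. A minimal sketch on a toy path graph (the graph and labels here are invented):

```python
import numpy as np

# Path graph 0-1-2-3-4; nodes 0 and 4 are labeled 0.0 and 1.0.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
labeled, unlabeled = [0, 4], [1, 2, 3]
f_l = np.array([0.0, 1.0])

D = np.diag(W.sum(axis=1))
L = D - W
# Harmonic solution: f_u = L_uu^{-1} W_ul f_l
f_u = np.linalg.solve(L[np.ix_(unlabeled, unlabeled)],
                      W[np.ix_(unlabeled, labeled)] @ f_l)
print(f_u)  # [0.25 0.5 0.75] -- each value is the mean of its neighbors
```

On a path the solution interpolates linearly between the two labeled endpoints, which is exactly the harmonic (mean-of-neighbors) property.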

578 citations


Proceedings Article
09 Dec 2003
TL;DR: It is shown that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.
Abstract: Many real-world domains are relational in nature, consisting of a set of objects related to each other in complex ways. This paper focuses on predicting the existence and the type of links between entities in such domains. We apply the relational Markov network framework of Taskar et al. to define a joint probabilistic model over the entire link graph — entity attributes and links. The application of the RMN algorithm to this task requires the definition of probabilistic patterns over subgraph structures. We apply this method to two new relational datasets, one involving university webpages, and the other a social network. We show that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.

524 citations


Proceedings ArticleDOI
24 Aug 2003
TL;DR: Two techniques for graph-based anomaly detection are introduced, along with a new method for calculating the regularity of a graph, with applications to anomaly detection.
Abstract: Anomaly detection is an area that has received much attention in recent years. It has a wide variety of applications, including fraud detection and network intrusion detection. A good deal of research has been performed in this area, often using strings or attribute-value data as the medium from which anomalies are to be extracted. Little work, however, has focused on anomaly detection in graph-based data. In this paper, we introduce two techniques for graph-based anomaly detection. In addition, we introduce a new method for calculating the regularity of a graph, with applications to anomaly detection. We hypothesize that these methods will prove useful both for finding anomalies, and for determining the likelihood of successful anomaly detection within graph-based data. We provide experimental results using both real-world network intrusion data and artificially-created data.

504 citations


Proceedings ArticleDOI
03 Nov 2003
TL;DR: Two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns.
Abstract: A common method for finding information in an organization is to use social networks---ask people, following referrals until someone with the right information is found. Another way is to automatically mine documents to determine who knows what. Email documents seem particularly well suited to this task of "expertise location", as people routinely communicate what they know. Moreover, because people explicitly direct email to one another, social networks are likely to be contained in the patterns of communication. Can these patterns be used to discover experts on particular topics? Is this approach better than mining message content alone? To find answers to these questions, two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns. An evaluation was done using email and explicit expertise ratings from two different organizations. The rankings given by each algorithm were compared to the explicit rankings with the precision and recall measures commonly used in information retrieval, as well as the d' measure commonly used in signal-detection theory. Results show that the graph-based algorithm performs better than the content-based algorithm at identifying experts in both cases, demonstrating that the graph-based algorithm effectively extracts more information than is found in content alone.
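The graph-based ranking the paper applies is HITS, which can be sketched as power iteration on a "who emails whom" adjacency matrix (the matrix below is made up; the real system also incorporates message text):

```python
import numpy as np

A = np.array([  # A[i, j] = 1 if person i emails person j
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# HITS mutual reinforcement: hubs point to good authorities,
# authorities are pointed to by good hubs.
auth = np.ones(4)
for _ in range(100):
    hub = A @ auth
    auth = A.T @ hub
    auth /= np.linalg.norm(auth)   # normalize to avoid overflow
print("authorities:", np.round(auth, 3))
```

Person 2, who receives mail from three others, gets the highest authority score, matching the intuition that frequently consulted people are likely experts.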

395 citations


Journal ArticleDOI
TL;DR: A method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network that exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors.
Abstract: Motivation: The development of experimental methods for genome scale analysis of molecular interaction networks has made possible new approaches to inferring protein function. This paper describes a method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network. The method exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors. A binomial model of local neighbor function labeling probability is combined with a Markov random field propagation algorithm to assign function probabilities for proteins in the network. Results: We applied the method to a protein-protein interaction dataset for the yeast Saccharomyces cerevisiae using the Gene Ontology (GO) terms as function labels. The method reconstructed known GO term assignments with high precision, and produced putative GO assignments to 320 proteins that currently lack GO annotation, which represents about 10% of the unlabeled proteins in S. cerevisiae.
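The "neighbors share function" intuition can be sketched as a one-step neighbor vote (the interaction data and labels below are invented; the paper's actual method combines a binomial neighbor model with Markov random field propagation rather than this single pass):

```python
# Score a candidate label for an unannotated protein by the fraction of
# its interaction partners that carry the label.
interactions = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P1", "P3"},
}
labels = {"P2": {"kinase"}, "P3": {"kinase"}, "P4": {"transport"}}

def label_score(protein, label):
    nbrs = interactions[protein]
    hits = sum(1 for n in nbrs if label in labels.get(n, set()))
    return hits / len(nbrs)

print(label_score("P1", "kinase"))   # 2 of 3 partners are kinases
```

MRF propagation extends this idea by letting putative labels on unannotated neighbors influence each other until the labeling is globally consistent.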

387 citations


Proceedings ArticleDOI
02 Jun 2003
TL;DR: In this article, a more general algorithm which selects maximal speedup convex subgraphs of the application dataflow graph under fundamental micro-architectural constraints is presented, which improves significantly on the state of the art.
Abstract: Many commercial processors now offer the possibility of extending their instruction set for a specific application - that is, to introduce customized functional units. There is a need to develop algorithms that decide automatically, from high-level application code, which operations are to be carried out in the customized extensions. A few algorithms exist but are severely limited in the type of operation clusters they can choose and hence reduce significantly the effectiveness of specialization. In this paper, we introduce a more general algorithm which selects maximal-speedup convex subgraphs of the application dataflow graph under fundamental microarchitectural constraints, and which improves significantly on the state of the art.

355 citations


Journal ArticleDOI
TL;DR: A new generic graph model for traffic grooming in heterogeneous WDM mesh networks, based on the auxiliary graph, is proposed which can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities.
Abstract: As the operation of our fiber-optic backbone networks migrates from interconnected SONET rings to arbitrary mesh topology, traffic grooming on wavelength-division multiplexing (WDM) mesh networks becomes an extremely important research problem. To address this problem, we propose a new generic graph model for traffic grooming in heterogeneous WDM mesh networks. The novelty of our model is that, by only manipulating the edges of the auxiliary graph created by our model and the weights of these edges, our model can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities. Based on the auxiliary graph, we develop an integrated traffic-grooming algorithm (IGABAG) and an integrated grooming procedure (INGPROC) which jointly solve several traffic-grooming subproblems by simply applying the shortest-path computation method. Different grooming policies can be represented by different weight-assignment functions, and the performance of these grooming policies is compared under both nonblocking and blocking scenarios. The IGABAG can be applied to both static and dynamic traffic grooming. In static grooming, the traffic-selection scheme is key to achieving good network performance. We propose several traffic-selection schemes based on this model and we evaluate their performance for different network topologies.

355 citations


Book ChapterDOI
16 Sep 2003
TL;DR: An experimental evaluation of graph clustering approaches based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity is conducted, and a new approach combining proven techniques from graph partitioning and geometric clustering, which compares favorably, is introduced.
Abstract: A promising approach to graph clustering is based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. As a first step towards understanding the consequences of particular conceptions, we conducted an experimental evaluation of graph clustering approaches. By combining proven techniques from graph partitioning and geometric clustering, we also introduce a new approach that compares favorably.

Book
01 Jan 2003
TL;DR: This book covers two good counting algorithms, #P-completeness, the reduction of approximate counting to almost uniform sampling via Markov chains, coupling and canonical-path arguments, volume estimation of convex bodies including a proof of the Poincaré inequality (Theorem 6.7), and inapproximability.
Abstract: Foreword.- 1 Two good counting algorithms.- 1.1 Spanning trees.- 1.2 Perfect matchings in a planar graph.- 2 #P-completeness.- 2.1 The class #P.- 2.2 A primal #P-complete problem.- 2.3 Computing the permanent is hard on average.- 3 Sampling and counting.- 3.1 Preliminaries.- 3.2 Reducing approximate counting to almost uniform sampling.- 3.3 Markov chains.- 4 Coupling and colourings.- 4.1 Colourings of a low-degree graph.- 4.2 Bounding mixing time using coupling.- 4.3 Path coupling.- 5 Canonical paths and matchings.- 5.1 Matchings in a graph.- 5.2 Canonical paths.- 5.3 Back to matchings.- 5.4 Extensions and further applications.- 5.5 Continuous time.- 6 Volume of a convex body.- 6.1 A few remarks on Markov chains with continuous state space.- 6.2 Invariant measure of the ball walk.- 6.3 Mixing rate of the ball walk.- 6.4 Proof of the Poincaré inequality (Theorem 6.7).- 6.5 Proofs of the geometric lemmas.- 6.6 Relaxing the curvature condition.- 6.7 Using samples to estimate volume.- 6.8 Appendix: a proof of Corollary 6.8.- 7 Inapproximability.- 7.1 Independent sets in a low degree graph.

Proceedings ArticleDOI
09 Jun 2003
TL;DR: The D(k) index is introduced, an adaptive structural summary for general graph structured documents based on the concept of bisimilarity, and is shown to be a more effective structural summary than previous static ones, as a result of its query load sensitivity.
Abstract: To facilitate queries over semi-structured data, various structural summaries have been proposed. Structural summaries are derived directly from the data and serve as indices for evaluating path expressions on semi-structured or XML data. We introduce the D(k)-index, an adaptive structural summary for general graph structured documents. Building on previous work, the 1-index and A(k)-index, the D(k)-index is also based on the concept of bisimilarity. However, as a generalization of the 1-index and A(k)-index, the D(k)-index possesses the adaptive ability to adjust its structure according to the current query load. This dynamism also facilitates efficient update algorithms, which are crucial to practical applications of structural indices, but have not been adequately addressed in previous index proposals. Our experiments show that the D(k)-index is a more effective structural summary than previous static ones, as a result of its query load sensitivity. In addition, update operations on the D(k)-index can be performed more efficiently than on its predecessors.

Proceedings ArticleDOI
01 Jun 2003
TL;DR: In this paper, the authors propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols.
Abstract: We propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols. To evaluate the performance of our approach, we estimate the value of k that guarantees connectivity of the communication graph with high probability. We then define k-Neigh, a fully distributed, asynchronous, and localized protocol that follows the above approach and uses distance estimation. We prove that k-Neigh terminates at every node after a total of 2n messages have been exchanged (with n nodes in the network) and within strictly bounded time. Finally, we present simulation results which show that our approach is about 20% more energy-efficient than a widely-studied existing protocol.
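The core construction, each node listing its k nearest neighbors and keeping only symmetric links, can be sketched centrally (node positions below are random toy data; the actual k-Neigh protocol computes this in a distributed fashion from distance estimates):

```python
import math, random

random.seed(1)
nodes = [(random.random(), random.random()) for _ in range(20)]
k = 5

def nearest(i):
    """Indices of the k nodes closest to node i."""
    others = sorted((j for j in range(len(nodes)) if j != i),
                    key=lambda j: math.dist(nodes[i], nodes[j]))
    return set(others[:k])

lists = [nearest(i) for i in range(len(nodes))]
# Keep an edge only if both endpoints list each other (symmetry).
edges = {(i, j) for i in range(len(nodes)) for j in lists[i]
         if i < j and i in lists[j]}
print(len(edges), "symmetric edges")
```

The symmetry filter is what guarantees every retained link is usable in both directions, which is the property higher-layer protocols rely on.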

Proceedings ArticleDOI
20 May 2003
TL;DR: A new algorithm, OPIC, is introduced that works on-line, uses far fewer resources, does not require storing the link matrix, and can be used to focus crawling on the most interesting pages.
Abstract: The computation of page importance in a huge dynamic graph has recently attracted a lot of attention because of the web. Page importance, or page rank, is defined as the fixpoint of a matrix equation. Previous algorithms compute it off-line and require the use of a lot of extra CPU as well as disk resources (e.g. to store, maintain and read the link matrix). We introduce a new algorithm, OPIC, that works on-line and uses far fewer resources. In particular, it does not require storing the link matrix. It is on-line in that it continuously refines its estimate of page importance while the web/graph is visited. Thus it can be used to focus crawling on the most interesting pages. We prove the correctness of OPIC. We present Adaptive OPIC, which also works on-line but adapts dynamically to changes of the web. A variant of this algorithm is now used by Xyleme. We report on experiments with synthetic data. In particular, we study the convergence and adaptiveness of the algorithms for various scheduling strategies for the pages to visit. We also report on experiments based on crawls of significant portions of the web.
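OPIC's central mechanism can be sketched as follows (the link structure below is invented, and this is a simplified rendering of the idea, not the paper's full algorithm): each page holds some "cash"; visiting a page credits its history and splits its cash among its out-links, so only two numbers per page are stored, never the link matrix.

```python
import random

graph = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # made-up out-link lists
n = len(graph)
cash = [1.0 / n] * n      # cash currently held by each page
history = [0.0] * n       # total cash that has flowed through each page

random.seed(0)
for _ in range(10000):
    page = random.randrange(n)        # any fair visit order works
    history[page] += cash[page]
    share = cash[page] / len(graph[page])
    for out in graph[page]:           # distribute cash along out-links
        cash[out] += share
    cash[page] = 0.0

total = sum(history)
importance = [h / total for h in history]   # on-line importance estimate
print([round(x, 2) for x in importance])
```

The estimate sharpens as crawling continues, which is why the same quantity can steer the crawler toward high-importance pages.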

Proceedings ArticleDOI
20 May 2003
TL;DR: This paper presents the notion of Semantic Associations as complex relationships between resource entities based on a specific notion of similarity called r-isomorphism, and formalizes these notions for the RDF data model, by introducing a notion of a Property Sequence as a type.
Abstract: This paper presents the notion of Semantic Associations as complex relationships between resource entities. These relationships capture both the connectivity of entities and the similarity of entities based on a specific notion of similarity called r-isomorphism. It formalizes these notions for the RDF data model, by introducing the notion of a Property Sequence as a type. In the context of a graph model such as that for RDF, Semantic Associations amount to certain graph signatures. Specifically, they refer to sequences (i.e. directed paths) here called Property Sequences, between entities, networks of Property Sequences (i.e. undirected paths), or subgraphs of r-isomorphic Property Sequences. The ability to query about the existence of such relationships is fundamental to tasks in analytical domains such as national security and business intelligence, where tasks often focus on finding complex yet meaningful and obscured relationships between entities. However, support for such queries is lacking in contemporary query systems, including those for RDF.

Proceedings ArticleDOI
12 Jan 2003
TL;DR: An algorithm with an asymptotic approximation factor of |S|/4 is presented, which gives a sufficient condition for the existence of k edge-disjoint Steiner trees in a graph in terms of the edge-connectivity of the graph.
Abstract: The Steiner packing problem is to find the maximum number of edge-disjoint subgraphs of a given graph G that connect a given set of required points S. This problem is motivated by practical applications in VLSI layout and broadcasting, as well as theoretical reasons. In this paper, we study this problem and present an algorithm with an asymptotic approximation factor of |S|/4. This gives a sufficient condition for the existence of k edge-disjoint Steiner trees in a graph in terms of the edge-connectivity of the graph. We will show that this condition is the best possible if the number of terminals is 3. At the end, we consider the fractional version of this problem, and observe that it can be reduced to the minimum Steiner tree problem via the ellipsoid algorithm.

Journal ArticleDOI
TL;DR: In this article, a general weak law of large numbers for functionals of binomial point processes in d-dimensional space is established, with a limit that depends explicitly on the density of the point process.
Abstract: Using a coupling argument, we establish a general weak law of large numbers for functionals of binomial point processes in d-dimensional space, with a limit that depends explicitly on the (possibly nonuniform) density of the point process. The general result is applied to the minimal spanning tree, the k-nearest neighbors graph, the Voronoi graph and the sphere of influence graph. Functionals of interest include total edge length with arbitrary weighting, number of vertices of specified degree and number of components. We also obtain weak laws of large numbers for functionals of marked point processes, including statistics of Boolean models.

Book ChapterDOI
Olaf Sporns1
01 Jan 2003
TL;DR: Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context.
Abstract: This paper summarizes a set of graph theory methods that are of special relevance to the computational analysis of neural connectivity patterns. Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context. A set of Matlab functions implementing these methods is available for download at http://php.indiana.edu/~osporns/graphmeasures.htm.
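One of the listed measures, the cluster index (clustering coefficient), can be sketched for an undirected adjacency matrix (the toy matrix here is invented, and this is an illustration of the measure rather than the paper's Matlab toolbox):

```python
import numpy as np

A = np.array([          # symmetric 0/1 adjacency matrix, no self-loops
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
])

def clustering(A, i):
    """Fraction of node i's neighbor pairs that are themselves connected."""
    nbrs = np.flatnonzero(A[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = A[np.ix_(nbrs, nbrs)].sum() / 2   # edges among the neighbors
    return links / (k * (k - 1) / 2)

print(clustering(A, 0))   # 1/3: one of three possible neighbor pairs linked
```

High average clustering combined with short path lengths is the "small-world" signature that such connectivity analyses often probe in cortical networks.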

Book
01 Jan 2003
TL;DR: This work defines graph algebras and reveals their applicability to automata theory and explores assorted monoids, semigroups, rings, codes, and other algebraic structures to outline theorems and algorithms for finite state automata and grammars.
Abstract: Graph algebras possess the capacity to relate fundamental concepts of computer science, combinatorics, graph theory, operations research, and universal algebra. They are used to identify nontrivial connections across notions, expose conceptual properties, and mediate the application of methods from one area toward questions of the other four. After a concentrated review of the prerequisite mathematical background, Graph Algebras and Automata defines graph algebras and reveals their applicability to automata theory. It proceeds to explore assorted monoids, semigroups, rings, codes, and other algebraic structures and to outline theorems and algorithms for finite state automata and grammars.

Proceedings ArticleDOI
27 Oct 2003
TL;DR: Techniques to automatically learn attack strategies from correlated intrusion alerts are presented; the similarity measurement of attack strategies is reduced to an error-tolerant graph/subgraph isomorphism problem, and similarity between attack strategies is measured in terms of the cost to transform one strategy into another.
Abstract: Understanding strategies of attacks is crucial for security applications such as computer and network forensics, intrusion response, and prevention of future attacks. This paper presents techniques to automatically learn attack strategies from correlated intrusion alerts. Central to these techniques is a model that represents an attack strategy as a graph of attacks with constraints on the attack attributes and the temporal order among these attacks. To learn the intrusion strategy is then to extract such a graph from a sequence of intrusion alerts. To further facilitate the analysis of attack strategies, which is essential to many security applications such as computer and network forensics, this paper presents techniques to measure the similarity between attack strategies. The basic idea is to reduce the similarity measurement of attack strategies to an error-tolerant graph/subgraph isomorphism problem, and to measure the similarity between attack strategies in terms of the cost to transform one strategy into another. Finally, this paper presents some experimental results, which demonstrate the potential of the proposed techniques.

Journal ArticleDOI
TL;DR: This review covers Algorithms in C++ and a book on pattern matching, whose authors make the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature.
Abstract: deeper into graph theory, thereby generating algorithms that are more challenging to the reader. Topics such as Depth-First Search, Hamiltonian Paths, Kruskal's Algorithm and Euclidean Networks are explored in detail. I have studied graph theory and therefore I was able to appreciate the examples and algorithms given in the text. However, I believe the author gives enough of an introduction in the beginning and explanations throughout the text so that a reader without any prior exposure to graph theory can still gain valuable experience in developing algorithms to solve complex problems. This book would be an excellent tool for a graph theory course (assuming the student is familiar with programming) or perhaps an advanced programming course dealing with algorithms or object oriented design methods. I found that the explanations of theorems and proofs in this text were excellent and helped me to further my knowledge and appreciation of graph theory. The object-oriented approach to implementing algorithms in C++ broadened my programming experience and helped to keep my interest in the topic. Occasionally the author assumes that the reader either has read the first volume, or has the text available for review. The first two volumes can be purchased as a bundle, and I suggest the reader consider obtaining both texts. However the programs from both volumes are available for download on the author's website, so it is not necessary to have both books if the reader is comfortable with programming topics such as queues. Overall, I enjoyed Algorithms in C++, and I plan to purchase the first and third volumes to complement this text. I am certain that I will refer to all three in the future when I am in need of guidance, or perhaps even diversion.

Pattern matching in strings is a basic problem in many areas of computer science, but particularly in applications that deal with text searching and genetic sequences.
Information retrieval and computational biology are generating dramatic increases both in the size of texts to search and in the sophistication of the searches. The authors are two academics with bioinformatics industry experience. They use this book to make the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature. Further, they claim that the only way to truly determine the fastest algorithm …

Proceedings ArticleDOI
05 Nov 2003
TL;DR: An architectural framework, DFuse, consisting of a data fusion API and a distributed algorithm for energy-aware role assignment, enables an application to be specified as a coarse-grained dataflow graph and eases application development and deployment.
Abstract: Simple in-network data aggregation (or fusion) techniques for sensor networks have been the focus of several recent research efforts, but they are insufficient to support advanced fusion applications. We extend these techniques to future sensor networks and ask two related questions: (a) what is the appropriate set of data fusion techniques, and (b) how do we dynamically assign aggregation roles to the nodes of a sensor network. We have developed an architectural framework, DFuse, for answering these two questions. It consists of a data fusion API and a distributed algorithm for energy-aware role assignment. The fusion API enables an application to be specified as a coarse-grained dataflow graph, and eases application development and deployment. The role assignment algorithm maps the graph onto the network, and optimally adapts the mapping at run-time using role migration. Experiments on an iPAQ farm show that the fusion API has low overhead, and the role assignment algorithm with role migration significantly increases the network lifetime compared to any static assignment.

Journal ArticleDOI
TL;DR: In this paper, a probabilistic path planning and hierarchical displacement mapping are combined with a posture transition graph to guide the locomotion of a biped figure in a virtual environment.
Abstract: Typical high-level directives for locomotion of human-like characters are useful for interactive games and simulations as well as for off-line production animation. In this paper, we present a new scheme for planning natural-looking locomotion of a biped figure to facilitate rapid motion prototyping and task-level motion generation. Given start and goal positions in a virtual environment, our scheme gives a sequence of motions to move from the start to the goal using a set of live-captured motion clips. Based on a novel combination of probabilistic path planning and hierarchical displacement mapping, our scheme consists of three parts: roadmap construction, roadmap search, and motion generation. We randomly sample a set of valid footholds of the biped figure from the environment to construct a directed graph, called a roadmap, that guides the locomotion of the figure. Every edge of the roadmap is associated with a live-captured motion clip. Augmenting the roadmap with a posture transition graph, we traverse it to obtain the sequence of input motion clips and that of target footprints. We finally adapt the motion sequence to the constraints specified by the footprint sequence to generate a desired locomotion.

Book ChapterDOI
25 Jul 2003
TL;DR: A novel view of the spectral approach is presented, which provides a direct link between eigenvectors and the aesthetic properties of the layout and is accompanied by an aesthetically-motivated algorithm, which is much easier to understand and to implement than the standard numerical algorithms for computing eigenvectors.
Abstract: The spectral approach for graph visualization computes the layout of a graph using certain eigenvectors of related matrices. Some important advantages of this approach are an ability to compute optimal layouts (according to specific requirements) and a very rapid computation time. In this paper we explore spectral visualization techniques and study their properties. We present a novel view of the spectral approach, which provides a direct link between eigenvectors and the aesthetic properties of the layout. In addition, we present a new formulation of the spectral drawing method with some aesthetic advantages. This formulation is accompanied by an aesthetically-motivated algorithm, which is much easier to understand and to implement than the standard numerical algorithms for computing eigenvectors.
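A minimal sketch of the basic spectral method the abstract builds on: use the eigenvectors of the graph Laplacian with the two smallest nonzero eigenvalues as x and y coordinates. This is the textbook formulation, not the paper's aesthetically-motivated variant:

```python
import numpy as np

def spectral_layout(adj):
    """2-D spectral layout: coordinates come from the Laplacian eigenvectors
    belonging to the two smallest nonzero eigenvalues (a standard sketch of
    the approach, not the paper's refined formulation)."""
    A = np.asarray(adj, dtype=float)
    L = np.diag(A.sum(axis=1)) - A     # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)     # eigh returns eigenvalues in ascending order
    return vecs[:, 1], vecs[:, 2]      # skip the constant eigenvector (eigenvalue 0)

# 4-cycle: the spectral layout spreads the vertices evenly around the origin
adj = [[0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 1, 0]]
x, y = spectral_layout(adj)
```

Both coordinate vectors are orthogonal to the all-ones vector, so the drawing is automatically centered; this is one of the direct links between eigenvectors and layout aesthetics that the paper examines.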

Journal ArticleDOI
TL;DR: A novel program representation, called concept lattice of decomposition slices, is shown to be an extension of the decomposition slice graph, and is obtained by means of concept analysis, with additional nodes associated with weak interferences between computations, i.e., shared statements which are not decomposition slices.
Abstract: The decomposition slice graph and concept lattice are two program representations used to abstract the details of code into a higher-level view of the program. The decomposition slice graph partitions the program into computations performed on different variables and shows the dependence relation between computations, holding when a computation needs another computation as a building block. The concept lattice groups program entities which share common attributes and organizes such groupings into a hierarchy of concepts, which are related through generalizations/specializations. This paper investigates the relationship existing between these two program representations. The main result of this paper is a novel program representation, called concept lattice of decomposition slices, which is shown to be an extension of the decomposition slice graph, and is obtained by means of concept analysis, with additional nodes associated with weak interferences between computations, i.e., shared statements which are not decomposition slices. The concept lattice of decomposition slices can be used to support software maintenance by providing relevant information about the computations performed by a program and the related dependences/interferences, as well as by representing a natural data structure on which to conduct impact analysis. Preliminary results on small to medium size code support the applicability of this method at the intraprocedural level or when investigating the dependences among small groups of procedures.
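The concept-analysis step underlying the lattice can be illustrated by brute-force enumeration of formal concepts from a tiny binary context. Here objects and attributes are generic placeholders (in the paper, they would relate slices and statements); the example data are made up:

```python
from itertools import combinations

def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small binary
    context {object: set(attributes)} by brute force -- fine for tiny inputs,
    and only a sketch of the concept-analysis step the paper relies on."""
    objs = list(context)
    found = set()
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            # intent: attributes shared by every object in the subset
            intent = set.intersection(*(context[o] for o in subset)) if subset \
                     else set.union(*context.values())
            # extent: every object having all attributes of the intent
            extent = frozenset(o for o in objs if intent <= context[o])
            found.add((extent, frozenset(intent)))
    return found

ctx = {"s1": {"x", "y"}, "s2": {"y"}, "s3": {"y", "z"}}
for extent, intent in sorted(concepts(ctx), key=lambda c: sorted(c[0])):
    print(sorted(extent), sorted(intent))
```

The resulting concepts, ordered by extent inclusion, form the lattice; real implementations use incremental algorithms rather than this exponential enumeration.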

Patent
25 Sep 2003
TL;DR: In this article, the authors present a method for reporting data network monitoring information, which includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values.
Abstract: A method for reporting data network monitoring information. The method includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values. For a range of the trace, a histogram is built and displayed corresponding to the graph data points (step 430). For a user interface, a performance monitoring display is generated including a graph of the trace relative to an x-axis and a y-axis and a representation of the histogram. Using the graphical user interface (GUI), the user can access a selection mechanism by moving the range selector to define the selected histogram range (steps 440 and 470). Each graph data point in the trace corresponds to a histogram previously built from the performance metric values, and the trace is generated by determining and plotting the average value of each of the graph data point histograms. The building of the histogram for the performance monitoring display involves combining the graph data point histograms corresponding to the graph data points in the selected histogram range (step 460).
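The mechanism described can be sketched as follows: each graph data point carries a histogram of metric samples, the trace plots each histogram's average, and a selected range is summarized by combining its histograms bin-wise. The bin representation and sample data below are illustrative:

```python
def histogram_average(hist):
    """Mean of a histogram given as {bin_midpoint: count}."""
    total = sum(hist.values())
    return sum(mid * n for mid, n in hist.items()) / total

def combine_histograms(hists):
    """Bin-wise sum of several histograms over the selected range."""
    combined = {}
    for h in hists:
        for mid, n in h.items():
            combined[mid] = combined.get(mid, 0) + n
    return combined

# one histogram of metric samples per graph data point (e.g. per minute)
point_hists = [
    {10: 2, 20: 2},        # average 15.0
    {20: 1, 30: 3},        # average 27.5
    {10: 1, 20: 1},        # average 15.0
]
trace = [histogram_average(h) for h in point_hists]
print(trace)                               # [15.0, 27.5, 15.0]
print(combine_histograms(point_hists[:2])) # {10: 2, 20: 3, 30: 3}
```

`trace` is what gets plotted against the x- and y-axes, while `combine_histograms` over the user-selected range produces the histogram shown alongside the graph.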

Book Chapter
01 Jan 2003
TL;DR: In this article, the authors propose two methods for inferring semantic similarity between terms from a corpus: one defines word similarity in terms of document similarity and vice versa, giving rise to a system of equations whose equilibrium point yields a semantic similarity measure; the other models semantic relations by a diffusion process on a graph defined by lexicon and co-occurrence information.
Abstract: The standard representation of text documents as bags of words suffers from well known limitations, mostly due to its inability to exploit semantic similarity between terms. Attempts to incorporate some notion of term similarity include latent semantic indexing [8], the use of semantic networks [9], and probabilistic methods [5]. In this paper we propose two methods for inferring such similarity from a corpus. The first one defines word-similarity based on document-similarity and vice versa, giving rise to a system of equations whose equilibrium point we use to obtain a semantic similarity measure. The second method models semantic relations by means of a diffusion process on a graph defined by lexicon and co-occurrence information. Both approaches produce valid kernel functions parametrised by a real number. The paper shows how the alignment measure can be used to successfully perform model selection over this parameter. Combined with the use of support vector machines we obtain positive results.
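The first method's fixed-point idea can be sketched with a tiny document-term matrix: a word-similarity matrix W and a document-similarity matrix D are defined in terms of each other and iterated to an (approximate) equilibrium. The max-normalization and iteration count below are simplifying assumptions, not the paper's exact formulation:

```python
def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def normalize(M):
    """Scale a matrix so its largest entry has magnitude 1."""
    m = max(abs(v) for row in M for v in row) or 1.0
    return [[v / m for v in row] for row in M]

def similarity_fixed_point(X, iters=50):
    """Iterate D <- X W X^T, W <- X^T D X (normalized each step) -- a sketch
    of defining word and document similarity in terms of each other."""
    n_words = len(X[0])
    XT = [list(col) for col in zip(*X)]
    W = [[1.0 if i == j else 0.0 for j in range(n_words)] for i in range(n_words)]
    for _ in range(iters):
        D = normalize(matmul(matmul(X, W), XT))
        W = normalize(matmul(matmul(XT, D), X))
    return W, D

# tiny doc-term matrix: rows are documents, columns are words
X = [[1, 1, 0],
     [0, 1, 1]]
W, D = similarity_fixed_point(X)
# words 0 and 2 never co-occur, but both co-occur with word 1,
# so they acquire nonzero similarity through the documents
print(W[0][2] > 0)  # True
```

This transitivity of similarity through documents is exactly what the bag-of-words representation cannot express on its own.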

Journal ArticleDOI
TL;DR: A new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network is proposed, and a new graph selection criterion is theoretically derived from a Bayesian approach in general situations.
Abstract: We propose a new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network. An essential point of Bayesian network construction is the estimation of the conditional distribution of each random variable. We consider fitting nonparametric regression models with heterogeneous error variances to the microarray gene expression data to capture the nonlinear structures between genes. Selecting the optimal graph, which gives the best representation of the system among genes, is still a problem to be solved. We theoretically derive a new graph selection criterion from a Bayesian approach in general situations. The proposed method includes previous methods based on Bayesian networks. We demonstrate the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae gene expression data newly obtained by disrupting 100 genes.
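The graph-selection idea can be illustrated with a toy score: compare a one-parent linear-Gaussian model of a gene against a parent-free model, preferring the graph with the higher score. This BIC-style criterion is a simple stand-in for the paper's nonparametric, heteroscedastic criterion; the data are synthetic:

```python
import math

def bic_score(child, parent=None):
    """BIC of a linear-Gaussian model child ~ parent (or child ~ const).
    An illustrative stand-in for the paper's graph selection criterion."""
    n = len(child)
    if parent is None:
        mu = sum(child) / n
        rss = sum((y - mu) ** 2 for y in child)
        k = 1                              # one free parameter: the mean
    else:
        mx = sum(parent) / n
        my = sum(child) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(parent, child))
        sxx = sum((x - mx) ** 2 for x in parent)
        b = sxy / sxx                      # least-squares slope
        a = my - b * mx                    # intercept
        rss = sum((y - (a + b * x)) ** 2 for x, y in zip(parent, child))
        k = 2                              # slope and intercept
    # log-likelihood term minus complexity penalty
    return -0.5 * n * math.log(rss / n + 1e-12) - 0.5 * k * math.log(n)

# gene y depends almost linearly on gene x: the one-parent graph scores higher
x = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]
y = [0.2, 1.1, 1.9, 2.7, 3.4, 4.2]
print(bic_score(y, x) > bic_score(y))  # True
```

Scoring every candidate parent set for every gene and keeping the best acyclic combination is the (computationally hard) search problem that network-construction methods must address.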

Journal ArticleDOI
TL;DR: A class of multiparticle entanglement purification protocols is introduced that allows distillation of a large class of entangled states, including cluster states, Greenberger-Horne-Zeilinger states, and various error correction codes.
Abstract: We introduce a class of multiparticle entanglement purification protocols that allow us to distill a large class of entangled states. These include cluster states, Greenberger-Horne-Zeilinger states, and various error correction codes all of which belong to the class of two-colorable graph states. We analyze these schemes under realistic conditions and observe that they are scalable; i.e., the threshold value for imperfect local operations does not depend on the number of parties for many of these states. When compared to schemes based on bipartite entanglement purification, the protocol is more efficient and the achievable quality of the purified states is larger. As an application we discuss an experimental realization of the protocol in optical lattices which allows one to purify cluster states.