
Showing papers on "Graph database published in 2004"


Proceedings ArticleDOI
13 Jun 2004
TL;DR: The gIndex approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining.
Abstract: Graphs have become increasingly important in modelling complicated structures and schemaless data such as proteins, chemical compounds, and XML documents. Given a graph query, it is desirable to retrieve graphs quickly from a large database via graph-based indices. In this paper, we investigate the issues of indexing graphs and propose a novel solution by applying a graph mining technique. Different from the existing path-based methods, our approach, called gIndex, makes use of frequent substructures as the basic indexing feature. Frequent substructures are ideal candidates since they explore the intrinsic characteristics of the data and are relatively stable to database updates. To reduce the size of the index structure, two techniques, size-increasing support constraint and discriminative fragments, are introduced. Our performance study shows that gIndex has a 10 times smaller index size, but achieves 3--10 times better performance in comparison with a typical path-based method, GraphGrep. The gIndex approach not only provides an elegant solution to the graph indexing problem, but also demonstrates how database indexing and query processing can benefit from data mining, especially frequent pattern mining. Furthermore, the concepts developed here can be applied to indexing sequences, trees, and other complicated structures as well.
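As a toy illustration of the fragment-based filtering idea (hypothetical data: graphs are abstracted to sets of fragment identifiers, so the subgraph-isomorphism tests that gIndex would run on the surviving candidates are omitted), an index of this kind might be sketched as:

```python
# Sketch of fragment-based graph indexing in the spirit of gIndex.
# Names and data are illustrative; real indexing features are mined subgraphs.

def build_index(db):
    """Map each indexed fragment to the set of graph IDs containing it."""
    index = {}
    for gid, fragments in db.items():
        for f in fragments:
            index.setdefault(f, set()).add(gid)
    return index

def candidates(index, query_fragments, all_ids):
    """Intersect posting lists of the query's fragments to prune the search."""
    result = set(all_ids)
    for f in query_fragments:
        if f in index:
            result &= index[f]
    return result

db = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"b", "c"}}
idx = build_index(db)
print(sorted(candidates(idx, {"a", "b"}, db)))  # -> [1]
```

Only the graphs that contain every indexed fragment of the query survive the filter; exact matching then runs on this (hopefully small) candidate set.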

706 citations


Proceedings ArticleDOI
22 Aug 2004
TL;DR: This paper proposes a new algorithm that mines only maximal frequent subgraphs, i.e. subgraphs that are not a part of any other frequent subgraph, and demonstrates that this algorithm can achieve a five-fold speed up over the current state-of-the-art subgraph mining algorithms.
Abstract: One fundamental challenge for mining recurring subgraphs from semi-structured data sets is the overwhelming abundance of such patterns. In large graph databases, the total number of frequent subgraphs can become too large to allow a full enumeration using reasonable computational resources. In this paper, we propose a new algorithm that mines only maximal frequent subgraphs, i.e. subgraphs that are not a part of any other frequent subgraph. This may exponentially decrease the size of the output set in the best case; in our experiments on practical data sets, mining maximal frequent subgraphs reduces the total number of mined patterns by two to three orders of magnitude. Our method first mines all frequent trees from a general graph database and then reconstructs all maximal subgraphs from the mined trees. Using two chemical structure benchmarks and a set of synthetic graph data sets, we demonstrate that, in addition to decreasing the output size, our algorithm can achieve a five-fold speed up over the current state-of-the-art subgraph mining algorithms.
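The maximality criterion itself is simple to state. In the sketch below (illustrative only: mined subgraphs are abstracted to frozensets of edge labels, which sidesteps the graph-containment tests a real miner performs), a pattern is kept only if it is not a proper subset of another frequent pattern:

```python
def maximal_patterns(frequent):
    """Keep only patterns that are not a proper subset of another frequent pattern."""
    return [p for p in frequent if not any(p < q for q in frequent)]

# Edge-set abstraction of mined subgraphs (illustrative data):
frequent = [frozenset({"a-b"}), frozenset({"b-c"}), frozenset({"a-b", "b-c"})]
print([sorted(p) for p in maximal_patterns(frequent)])  # -> [['a-b', 'b-c']]
```

The two single-edge patterns are absorbed by the larger one, which is the output-shrinking effect the paper exploits at scale.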

367 citations


Proceedings ArticleDOI
22 Aug 2004
TL;DR: The experimental results show that cyclic pattern kernels can be computed quickly and offer predictive performance superior to recent graph kernels based on frequent patterns.
Abstract: With applications in biology, the world-wide web, and several other areas, mining of graph-structured objects has received significant interest recently. One of the major research directions in this field is concerned with predictive data mining in graph databases where each instance is represented by a graph. Some of the proposed approaches for this task rely on the excellent classification performance of support vector machines. To control the computational cost of these approaches, the underlying kernel functions are based on frequent patterns. In contrast to these approaches, we propose a kernel function based on a natural set of cyclic and tree patterns independent of their frequency, and discuss its computational aspects. To practically demonstrate the effectiveness of our approach, we use the popular NCI-HIV molecule dataset. Our experimental results show that cyclic pattern kernels can be computed quickly and offer predictive performance superior to recent graph kernels based on frequent patterns.
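A pattern kernel of this family reduces, once each molecule's cyclic and tree patterns have been extracted, to comparing pattern sets. The sketch below shows only that final step (the pattern names are invented and the cycle/tree extraction, the substantive part of the paper, is omitted), using a set-intersection kernel with cosine-style normalisation:

```python
import math

def pattern_kernel(pa, pb):
    """Intersection kernel over pattern sets: count shared patterns."""
    return len(pa & pb)

def normalized_kernel(pa, pb):
    """Normalise so that identical pattern sets score 1.0."""
    denom = math.sqrt(pattern_kernel(pa, pa) * pattern_kernel(pb, pb))
    return pattern_kernel(pa, pb) / denom if denom else 0.0

mol_a = {"cycle:C6", "tree:C-O"}        # hypothetical cyclic/tree patterns
mol_b = {"cycle:C6", "tree:C-N"}
print(normalized_kernel(mol_a, mol_b))  # 1 shared pattern of 2 each -> 0.5
```

Such a kernel is positive semi-definite and can be plugged directly into an SVM.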

363 citations


Proceedings ArticleDOI
Chen Wang1, Wei Wang1, Jian Pei2, Yongtai Zhu1, Baile Shi1 
22 Aug 2004
TL;DR: An effective index structure, ADI (for adjacency index), is developed to support mining various graph patterns over large databases that cannot be held in main memory; it is faster than gSpan when both can run in main memory.
Abstract: Mining frequent structural patterns from graph databases is an interesting problem with broad applications. Most of the previous studies focus on pruning unfruitful search subspaces effectively, but few of them address mining on large, disk-based databases. As many graph databases in applications cannot be held in main memory, scalable mining of large, disk-based graph databases remains a challenging problem. In this paper, we develop an effective index structure, ADI (for adjacency index), to support mining various graph patterns over large databases that cannot be held in main memory. The index is simple and efficient to build. Moreover, the new index structure can be easily adopted in various existing graph pattern mining algorithms. As an example, we adapt the well-known gSpan algorithm by using the ADI structure. The experimental results show that the new index structure enables scalable graph pattern mining over large databases. In one set of the experiments, the new disk-based method can mine graph databases with one million graphs, while the original gSpan algorithm can only handle databases of up to 300 thousand graphs. Moreover, our new method is faster than gSpan when both can run in main memory.

104 citations


Proceedings ArticleDOI
07 Jun 2004
TL;DR: An improved method of 3D mesh model indexing for content-based retrieval in databases with shape similarity and appearance queries, which obtains a flexible multiresolutional and multicriteria representation called the augmented Reeb graph (ARG).
Abstract: This work presents an improved method of 3D mesh model indexing for content-based retrieval in databases with shape similarity and appearance queries. The approach is based on the multiresolutional Reeb graph matching presented by Hilaga et al. (2001). The original method only takes topological information into account, which is often not sufficient for effective matching. We therefore propose to augment this graph with geometrical attributes. We also provide a new topological coherence condition to improve the graph matching. Moreover, 2D appearance attributes and 3D features are extracted and merged to improve the estimation of the similarity between models. All these new attributes are user-dependent, as they can be weighted by variable terms. We obtain a flexible multiresolutional and multicriteria representation called the augmented Reeb graph (ARG). Good preliminary results have been obtained in a shape-based matching framework compared to existing methods based only on statistical measures. In addition, our study led us to an innovative part-matching scheme based on the same approach as our augmented Reeb graph matching.

98 citations


Book ChapterDOI
31 Aug 2004
TL;DR: A generalized ranking framework is provided that extends link analysis algorithms such as PageRank to relational databases, gives this extension a random querier interpretation, and explores the properties of database graphs.
Abstract: Link analysis methods show that the interconnections between web pages carry a lot of valuable information. The link analysis methods are, however, inherently oriented towards analyzing binary relations. We consider the question of generalizing link analysis methods for analyzing relational databases. To this aim, we provide a generalized ranking framework and address its practical implications. More specifically, we associate with each relational database and set of queries a unique weighted directed graph, which we call the database graph. We explore the properties of database graphs. In analogy to link analysis algorithms, which use the Web graph to rank web pages, we use the database graph to rank partial tuples. In this way we can, e.g., extend the PageRank link analysis algorithm to relational databases and give this extension a random querier interpretation. Similarly, we extend the HITS link analysis algorithm to relational databases. We conclude with some preliminary experimental results.
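The ranking step itself is ordinary PageRank run over the database graph rather than the Web graph. A minimal power-iteration sketch (generic PageRank only; the paper's contribution, constructing the weighted database graph from queries, is not shown, and the example graph is invented):

```python
def pagerank(nodes, edges, d=0.85, iters=100):
    """Power iteration for PageRank on a directed graph of (src, dst) pairs."""
    out = {n: [] for n in nodes}
    for s, t in edges:
        out[s].append(t)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1 - d) / n for v in nodes}
        for v in nodes:
            targets = out[v] or nodes  # dangling mass spread uniformly
            for t in targets:
                new[t] += d * rank[v] / len(targets)
        rank = new
    return rank

ranks = pagerank(["a", "b", "c"], [("a", "b"), ("c", "b"), ("b", "a")])
```

With edge weights (as on a database graph), the uniform share `rank[v] / len(targets)` would become proportional to each outgoing edge's weight.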

86 citations


Journal ArticleDOI
TL;DR: The results show the graph-based approach can outperform traditional vector-based methods in terms of accuracy, dimensionality and execution time.
Abstract: In this paper we describe a classification method that allows the use of graph-based representations of data instead of traditional vector-based representations. We compare the vector approach combined with the k-Nearest Neighbor (k-NN) algorithm to the graph-matching approach when classifying three different web document collections, using the leave-one-out approach for measuring classification accuracy. We also compare the performance of different graph distance measures as well as various document representations that utilize graphs. The results show the graph-based approach can outperform traditional vector-based methods in terms of accuracy, dimensionality and execution time.
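The classification pipeline the paper studies pairs a graph distance with k-NN voting. The sketch below is a drastic simplification (documents are reduced to sets of node labels, and the maximum common subgraph is crudely approximated by label overlap, whereas the paper uses true graph distance measures; all data is invented):

```python
def graph_distance(g1, g2):
    """MCS-style distance; shared node labels stand in for the true
    maximum common subgraph (a simplification)."""
    return 1.0 - len(g1 & g2) / max(len(g1), len(g2))

def knn_classify(query, labeled, k=3):
    """Classify a graph by majority vote among its k nearest neighbours."""
    nearest = sorted(labeled, key=lambda gl: graph_distance(query, gl[0]))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

docs = [({"price", "buy", "cart"}, "shop"),
        ({"goal", "match", "team"}, "sport"),
        ({"buy", "sale", "cart"}, "shop")]
print(knn_classify({"buy", "cart", "team"}, docs, k=1))  # -> shop
```

Because the distance operates on graphs directly, no fixed-length feature vector (and hence no high-dimensional vector space) is needed, which is the source of the dimensionality advantage reported.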

86 citations


Proceedings ArticleDOI
04 Jun 2004
TL;DR: In this article, the authors present a system for the visualization of computing literature with an emphasis on collaboration patterns, interactions between related research specialties and the evolution of these characteristics through time.
Abstract: We present a system for the visualization of computing literature with an emphasis on collaboration patterns, interactions between related research specialties and the evolution of these characteristics through time. Our computing literature visualization system has four major components: a mapping of bibliographical data to a relational schema coupled with an RDBMS to store the relational data, an interactive GUI that allows queries and the dynamic construction of graphs, a temporal graph layout algorithm, and an interactive visualization tool. We use a novel technique for visualization of large graphs that evolve through time. Given a dynamic graph, the layout algorithm produces two-dimensional representations of each timeslice, while preserving the mental map of the graph from one slice to the next. A combined view with all the timeslices can also be viewed and explored. For our analysis we use data from the Association for Computing Machinery's Digital Library of Scientific Literature, which contains more than one hundred thousand research papers and authors. Our system can be found online at http://tgrip.cs.arizona.edu. © 2004 SPIE--The International Society for Optical Engineering.

84 citations


Proceedings ArticleDOI
23 Mar 2004
TL;DR: The WebGraph framework is described, which provides simple methods to manage very large graphs, specially tailored around Web graphs; it contains a fully documented implementation of various instantaneous codes and of parameterisable compression algorithms that achieve high compression rates by introducing a new technique called intervalisation.
Abstract: This paper describes the WebGraph framework, which provides simple methods to manage very large graphs, specially tailored around Web graphs. A fundamental observation about compression of the Web graph was made in the construction of the LINK database. WebGraph contains a fully documented implementation of various instantaneous codes, and of parameterisable compression algorithms that achieve high compression rates by introducing a new technique called intervalisation. WebGraph also contains algorithms for accessing a compressed graph.
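The intuition behind intervalisation is that Web adjacency lists contain long runs of consecutive node IDs, which compress well when stored as (start, length) intervals. A rough sketch of the run-splitting step (illustrative only; WebGraph's actual format also uses reference compression, gap encoding, and instantaneous codes, none of which are shown):

```python
def intervalise(succ, min_len=2):
    """Split a sorted successor list into maximal runs of consecutive IDs
    (kept as (start, length) intervals) plus leftover residuals."""
    intervals, residuals = [], []
    i = 0
    while i < len(succ):
        j = i
        while j + 1 < len(succ) and succ[j + 1] == succ[j] + 1:
            j += 1  # extend the consecutive run
        run = j - i + 1
        if run >= min_len:
            intervals.append((succ[i], run))
        else:
            residuals.extend(succ[i:j + 1])
        i = j + 1
    return intervals, residuals

print(intervalise([3, 4, 5, 9, 12, 13]))  # -> ([(3, 3), (12, 2)], [9])
```

Each interval then costs two small integers regardless of its length, and only the residuals need individual codes.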

69 citations


Proceedings ArticleDOI
13 Jun 2004
TL;DR: A thorough evaluation over 315 real web databases as well as over TREC data suggests that the shrinkage-based content summaries are substantially more complete than their "unshrunk" counterparts.
Abstract: Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which are not typically exported by databases. Previous research has developed algorithms for constructing an approximate content summary of a text database from a small document sample extracted via querying. Unfortunately, Zipf's law practically guarantees that content summaries built this way for any relatively large database will fail to cover many low-frequency words. Incomplete content summaries might negatively affect the database selection process, especially for short queries with infrequent words. To improve the coverage of approximate content summaries, we build on the observation that topically similar databases tend to have related vocabularies. Therefore, the approximate content summaries of topically related databases can complement each other and increase their coverage. Specifically, we exploit a (given or derived) hierarchical categorization of the databases and adapt the notion of "shrinkage" (a form of smoothing that has been used successfully for document classification) to the content summary construction task. A thorough evaluation over 315 real web databases as well as over TREC data suggests that the shrinkage-based content summaries are substantially more complete than their "unshrunk" counterparts. We also describe how to modify existing database selection algorithms to adaptively decide, at run time, whether to apply shrinkage for a query. Our experiments, which rely on TREC data sets, queries, and the associated "relevance judgments," show that our shrinkage-based approach significantly improves state-of-the-art database selection algorithms, and also outperforms a recently proposed hierarchical strategy that exploits database classification as well.
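Shrinkage here amounts to linearly mixing a database's sparse, sample-based word distribution with the distributions of its category ancestors, so that words missing from the sample inherit probability mass from topically related databases. A minimal sketch (the weights and word probabilities are invented; the paper estimates the mixing weights rather than fixing them):

```python
def shrink(path_probs, weights):
    """Mix a database's word distribution with its category ancestors'.
    path_probs: word-probability dicts from the leaf database up to the root;
    weights: one mixing weight per level (should sum to 1)."""
    words = set().union(*path_probs)
    return {w: sum(lam * p.get(w, 0.0) for lam, p in zip(weights, path_probs))
            for w in words}

leaf = {"gene": 1.0}                    # sparse sample-based summary
parent = {"gene": 0.5, "protein": 0.5}  # category-level summary
smoothed = shrink([leaf, parent], [0.6, 0.4])
```

Note that "protein", absent from the leaf's sample, now receives nonzero probability, which is exactly the coverage improvement the paper targets.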

60 citations


Patent
03 Feb 2004
TL;DR: In this article, a data flow diagram is created in response to input, including displaying a first plurality of nodes on a display which are executable to create at least a portion of the scene graph, and connecting the nodes to create the data flow diagram, where the nodes are connected to specify data flow among the plurality of nodes.
Abstract: System and method for creating a scene graph. A data flow diagram is created in response to input, including displaying a first plurality of nodes on a display which are executable to create at least a portion of the scene graph, and connecting the nodes to create the data flow diagram, where the nodes are connected to specify data flow among the plurality of nodes. The data flow diagram is executed to create the scene graph. The scene graph specifies a plurality of objects and relationships between the objects, e.g., via an object hierarchy, and is usable in rendering a graphical image of the plurality of objects, e.g., a 3D scene. The scene graph is stored in a memory medium. At least one render node may be included in the data flow diagram which receives the scene graph as an input and renders the image based on the scene graph.

Proceedings ArticleDOI
10 Oct 2004
TL;DR: This work investigates a method for mining fragments which consists of three phases: first, a preprocessing phase for turning molecular databases into graph databases; second, the Gaston frequent graph mining phase for mining frequent paths, free trees and cyclic graphs; and third, a postprocessing phase in which redundant frequent fragments are removed.
Abstract: Molecular fragment mining is a promising approach for discovering novel fragments for drugs. We investigate a method for mining fragments which consists of three phases: first, a preprocessing phase for turning molecular databases into graph databases; second, the Gaston frequent graph mining phase for mining frequent paths, free trees and cyclic graphs; and third, a postprocessing phase in which redundant frequent fragments are removed. We devote most of our attention to the frequent graph mining phase, as this phase is computationally the most demanding, but also look at the other phases.

Proceedings ArticleDOI
02 May 2004
TL;DR: This paper uses a two-step approach, by firstly identifying the semantic roles in a sentence, and then using these roles, together with semi-automatically compiled domain-specific knowledge to construct the conceptual graph representation.
Abstract: This paper describes a system for constructing conceptual graph representation of text by using a combination of existing linguistic resources (VerbNet and WordNet). We use a two-step approach, by firstly identifying the semantic roles in a sentence, and then using these roles, together with semi-automatically compiled domain-specific knowledge to construct the conceptual graph representation.

Patent
15 Apr 2004
TL;DR: A method and system for identifying words, text fragments, or concepts of interest in a corpus of text is presented in this paper, where a graph is built which includes nodes and links where nodes represent a word or a concept and links between the nodes represent directed relation names.
Abstract: The present invention is a method and system for identifying words, text fragments, or concepts of interest in a corpus of text. A graph is built which covers the corpus of text. The graph includes nodes and links, where nodes represent a word or a concept and links between the nodes represent directed relation names. A score is then computed for each node in the graph. Scores can also be computed for larger sub-graph portions of the graph (such as tuples). The scores are used to identify desired sub-graph portions of the graph, those sub-graph portions being referred to as graph fragments.

BookDOI
01 Jan 2004
TL;DR: Graph Transformations in OMG's Model-Driven Architecture, including Meta-Modelling, Graph Transformation and Model Checking for the Analysis of Hybrid Systems, are presented.
Abstract: Web Applications.- Graph Transformation for Merging User Navigation Histories.- Towards Validation of Session Management in Web Applications based on Graph Transformation.- Data Structures and Data Bases.- Specifying Pointer Structures by Graph Reduction.- Specific Graph Models and Their Mappings to a Common Model.- Engineering Applications.- Transforming Graph Based Scenarios into Graph Transformation Based JUnit Tests.- On Graphs in Conceptual Engineering Design.- Parameterized Specification of Conceptual Design Tools in Civil Engineering.- Agent-Oriented and Functional Programs, Distribution.- Design of an Agent-Oriented Modeling Language Based on Graph Transformation.- Specification and Analysis of Fault Behaviours Using Graph Grammars.- Object and Aspect-Oriented Systems.- Integrating Graph Rewriting and Standard Software Tools.- Expressing Component-Relating Aspects with Graph Transformations.- Natural Languages: Processing and Structuring.- Modeling Discontinuous Constituents with Hypergraph Grammars.- Authoring Support Based on User-Serviceable Graph Transformation.- Re-engineering.- Re-engineering a Medical Imaging System Using Graph Transformations.- Behavioral Analysis of Telecommunication Systems by Graph Transformations.- Reuse and Integration.- Specifying Integrated Refactoring with Distributed Graph Transformations.- A Domain Specific Architecture Tool: Rapid Prototyping with Graph Grammars.- Modelling Languages.- Graph Transformations in OMG's Model-Driven Architecture.- Computing Reading Trees for Constraint Diagrams.- UML Interaction Diagrams: Correct Translation of Sequence Diagrams into Collaboration Diagrams.- Meta-Modelling, Graph Transformation and Model Checking for the Analysis of Hybrid Systems.- Bioinformatics.- Proper Down-Coloring Simple Acyclic Digraphs.- Local Specification of Surface Subdivision Algorithms.- Transforming Toric Digraphs.- Management of Development and Processes.- Graph-Based Specification of a Management System for Evolving Development Processes.- Graph-Based Tools for Distributed Cooperation in Dynamic Development Processes.- Multimedia, Picture, and Visual Languages.- MPEG-7 Semantic Descriptions: Graph Transformations, Graph Grammars, and the Description of Multimedia.- Collage Grammars for Collision-Free Growing of Objects in 3D Scenes.- VisualDiaGen - A Tool for Visually Specifying and Generating Visual Editors.- Demos.- GenGED - A Visual Definition Tool for Visual Modeling Environments.- CHASID - A Graph-Based Authoring Support System.- Interorganizational Management of Development Processes.- Conceptual Design Tools for Civil Engineering.- E-CARES - Telecommunication Re- and Reverse Engineering Tools.- AGG: A Graph Transformation Environment for Modeling and Validation of Software.- Process Evolution Support in the AHEAD System.- Fire3: Architecture Refinement for A-posteriori Integration.- A Demo of OptimixJ.- Visual Specification of Visual Editors with VisualDiaGen.- The GROOVE Simulator: A Tool for State Space Generation.- Summaries of the Workshop.- AGTIVE'03: Summary from the Outside In.- AGTIVE'03: Summary from the Theoretical Point of View.- AGTIVE'03: Summary from the Viewpoint of Graph Transformation Specifications.- AGTIVE'03: Summary from a Tool Builder's Viewpoint.- Best Presentation and Demonstration Awards.

Book ChapterDOI
25 Mar 2004
TL;DR: An abundance of biological data sources contain data on classes of scientific entities, such as genes and sequences; logical relationships between scientific objects are implemented as URLs and foreign IDs, and the data objects in these sources and the links between objects are modeled as an object graph.
Abstract: An abundance of biological data sources contain data on classes of scientific entities, such as genes and sequences. Logical relationships between scientific objects are implemented as URLs and foreign IDs. Query processing typically involves traversing links and paths (concatenations of links) through these sources. We model the data objects in these sources and the links between objects as an object graph. Analogous to database cost models, we use samples and statistics from the object graph to develop a framework to estimate the result size for a query on the object graph.

Proceedings ArticleDOI
17 Jun 2004
TL;DR: This paper investigates query evaluation techniques applicable to graph-structured data, proposes efficient algorithms for the case of directed acyclic graphs, which appear in many real-world situations, and tailors the approaches to handle other directed graphs as well.
Abstract: XML and semi-structured data is usually modeled using graph structures. Structural summaries, which have been proposed to speed up XML query processing, have graph forms as well. The existing approaches for evaluating queries over tree-structured data (i.e. data whose underlying structure is a tree) are not directly applicable when the data is modeled as a random graph. Moreover, they cannot be applied when structural summaries are employed and, to the best of our knowledge, no analogous techniques have been reported for this case either. As a result, the potential of structural summaries is not fully exploited. In this paper, we investigate query evaluation techniques applicable to graph-structured data. We propose efficient algorithms for the case of directed acyclic graphs, which appear in many real-world situations. We then tailor our approaches to handle other directed graphs as well. Our experimental evaluation reveals the advantages of our solutions over existing methods for graph-structured data.
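A basic building block for evaluating path queries over graph-structured (rather than tree-structured) data is reachability: an ancestor-descendant step must find everything reachable from a node, which on a DAG can revisit nodes through multiple paths. A minimal sketch (illustrative only; the paper's algorithms are more refined than a plain traversal):

```python
def reachable(dag, src):
    """All nodes reachable from src (iterative DFS; the seen-set makes it
    safe even when several paths lead to the same node)."""
    seen, stack = set(), [src]
    while stack:
        for nxt in dag.get(stack.pop(), []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(sorted(reachable(dag, "a")))  # -> ['b', 'c', 'd']
```

On a tree each node has one incoming path, so the seen-set is redundant; on a DAG it is what prevents duplicated work and duplicated answers.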

Journal ArticleDOI
TL;DR: An efficient graph-based algorithm is presented to discover interesting association rules embedded in the transaction database and the customer database, showing that the total execution time can be reduced significantly.

Journal ArticleDOI
TL;DR: Detailed analysis of the eye movement data provided clear support for the three-stage model of graph comprehension, which suggests that participants serialize problem solving tasks in order to minimize the overall processing load.
Abstract: Hierarchical graphs represent the relationships between non-numerical entities or concepts (like computer file systems, family trees, etc). Graph nodes represent the concepts and interconnecting lines represent the relationships. We recorded participants' eye movements while viewing such graphs to test two possible models of graph comprehension. Graph readers had to answer interpretive questions, which required comparisons between two graph nodes. One model postulates a search and a combined search-reasoning stage of graph comprehension (two-stage model), whereas the second model predicts three stages, two stages devoted to the search of the relevant graph nodes and a separate reasoning stage. A detailed analysis of the eye movement data provided clear support for the three-stage model. This is in line with recent studies, which suggest that participants serialize problem solving tasks in order to minimize the overall processing load. Copyright © 2004 John Wiley & Sons, Ltd.

Journal ArticleDOI
TL;DR: The essence of the technique is to keep track of all possible matchings of graph transformation rules in database tables, and update these tables incrementally to exploit the fact that rules typically perform only local modifications to models.

01 Jan 2004
TL;DR: This work quantitatively compares this representation and learning method against other algorithms on the task of predicting future links and new “friendships” in a variety of real world data sets.
Abstract: Many techniques in the social sciences and graph theory deal with the problem of examining and analyzing patterns found in the underlying structure and associations of a group of entities. However, much of this work assumes that this underlying structure is known or can easily be inferred from data, which may often be an unrealistic assumption for many real-world problems. Below we consider the problem of learning and querying a graph-based model of this underlying structure. The model is learned from noisy observations linking sets of entities. We explicitly allow different types of links (representing different types of relations) and temporal information indicating when a link was observed. We quantitatively compare this representation and learning method against other algorithms on the task of predicting future links and new “friendships” in a variety of real world data sets.
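For the link-prediction task the paper evaluates against, a classic baseline (not the paper's learned model) scores each absent pair of entities by the number of neighbours they share. A sketch with invented data:

```python
def common_neighbour_scores(adj):
    """Score each unlinked pair by the number of neighbours they share."""
    nodes = sorted(adj)
    return {(u, v): len(adj[u] & adj[v])
            for i, u in enumerate(nodes) for v in nodes[i + 1:]
            if v not in adj[u]}

# Undirected "friendship" graph as neighbour sets (hypothetical data):
adj = {"ann": {"bob", "cai"}, "bob": {"ann", "cai"},
       "cai": {"ann", "bob", "dee"}, "dee": {"cai"}}
scores = common_neighbour_scores(adj)
print(max(scores, key=scores.get))  # most likely new "friendship"
```

The paper's model goes beyond this baseline by exploiting link types and observation timestamps, which a pure common-neighbour count ignores.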

Book ChapterDOI
29 Sep 2004
TL;DR: Gravisto, the Graph Visualization Toolkit, is more than a (Java-based) editor for graphs; it includes data structures, graph algorithms, several layout algorithms, and a graph viewer component.
Abstract: Gravisto, the Graph Visualization Toolkit, is more than a (Java-based) editor for graphs. It includes data structures, graph algorithms, several layout algorithms, and a graph viewer component. As a general toolkit for the visualization and automatic layout of graphs it is extensible with plug-ins and is suited for the integration in other Java-based applications.

Book ChapterDOI
26 May 2004
TL;DR: This paper has not only shown how the Subdue class of algorithms can be translated to SQL-based algorithms, but also demonstrated that scalability can be achieved without sacrificing performance.
Abstract: In contrast to mining over transactional data, graph mining is done over structured data represented in the form of a graph. Data having structural relationships lends itself to graph mining. Subdue is one of the early main-memory graph mining algorithms that detects the best substructure that compresses a graph using the minimum description length principle. The database approach to graph mining presented in this paper overcomes the problems – performance and scalability – inherent in main-memory algorithms. The focus of this paper is the development of graph mining algorithms (specifically Subdue) using SQL and stored procedures in a relational database environment. We have not only shown how the Subdue class of algorithms can be translated to SQL-based algorithms, but also demonstrated that scalability can be achieved without sacrificing performance.

01 Jan 2004
TL;DR: A node-centric technique for visualizing Resource Description Framework graphs is described, in which wider views are created by sorting and displaying nodes based on the number of incoming and outgoing arcs.
Abstract: Keywords: RDF, visualization, Resource Description Framework, graph, browser, node-centric. This paper describes a node-centric technique for visualizing Resource Description Framework (RDF) graphs. Nodes of interest are discovered by searching over literals. Subgraphs for display are constructed by using the area around selected nodes. Wider views are created by sorting and displaying nodes based on the number of incoming and outgoing arcs.
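The degree-based ordering used for the wider views can be sketched directly: count incoming and outgoing arcs per node over the triples and sort. (The triples and predicate names below are invented; this is only the counting step, not the paper's visualization.)

```python
from collections import Counter

def rank_nodes(triples):
    """Order RDF nodes by total number of incoming plus outgoing arcs."""
    degree = Counter()
    for subj, _pred, obj in triples:
        degree[subj] += 1  # outgoing arc
        degree[obj] += 1   # incoming arc
    return [node for node, _ in degree.most_common()]

triples = [("doc1", "dc:creator", "alice"),
           ("doc2", "dc:creator", "alice"),
           ("doc1", "dc:subject", "graphs"),
           ("doc3", "dc:creator", "alice")]
print(rank_nodes(triples)[0])  # -> alice
```

Highly connected nodes surface first, which is the heuristic the browser uses to pick what to show in a wide view.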

Proceedings ArticleDOI
12 May 2004
TL;DR: The Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison showing the framework's general applicability to content-based image retrieval of medical images.
Abstract: Content-based image retrieval requires a formal description of visual information. In medical applications, all relevant biological objects have to be represented by this description. Although color as the primary feature has proven successful in publicly available retrieval systems of general purpose, this description is not applicable to most medical images. Additionally, it has been shown that global features characterizing the whole image do not lead to acceptable results in the medical context or that they are only suitable for specific applications. For a general purpose content-based comparison of medical images, local, i.e. regional features that are collected on multiple scales must be used. A hierarchical attributed region adjacency graph (HARAG) provides such a representation and transfers image comparison to graph matching. However, building a HARAG from an image requires a restriction in size to be computationally feasible while at the same time all visually plausible information must be preserved. For this purpose, mechanisms for the reduction of the graph size are presented. Even with a reduced graph, the problem of graph matching remains NP-complete. In this paper, the Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison. Based on synthetic image material built from simple geometric objects, all visually similar regions were matched accordingly, showing the framework's general applicability to content-based image retrieval of medical images.

Patent
16 Mar 2004
TL;DR: In this paper, the authors present a system for automatically generating a hierarchical register consolidation structure from a High-Level Design Language (HDL) file using a graph generator and a graph converter.
Abstract: A system for, and method of, automatically generating a hierarchical register consolidation structure. In one embodiment, the system includes: (1) a graph generator that parses a High-level Design Language (HDL) file to generate an intermediate graph containing definitions of microprocessor-accessible registers, node interrelationships and summary bits and masks associated with alarm registers, (2) a graph converter, associated with the graph generator, that selectively adds virtual elements and nodes to the intermediate graph to transform the intermediate graph into a mathematical tree and (3) a description generator, associated with the graph converter, that employs the mathematical tree to generate a static tree description in a programming language suitable for use by a device-independent condition management structure.

01 Jan 2004
TL;DR: In this paper, the authors present an algorithm for the layering step of hierarchical graph drawing methods that is able to take the sizes of the nodes into account and further allows the user to choose between compact drawings with many temporary (dummy) nodes and less compact drawings with fewer dummy nodes.
Abstract: Graph drawing is an important area of information visualization which concerns itself with the visualization of relational data structures. Relational data like networks, hierarchies, or database schemas can be modelled by graphs and represented visually using graph drawing algorithms. Most existing graph drawing algorithms do not consider the size of nodes when creating a drawing. In most real world applications, however, nodes contain information which has to be displayed and nodes thus need a specific area to display this information. The required area can vary significantly between different nodes in the same graph. In this paper we present an algorithm for the layering step of hierarchical graph drawing methods that is able to take the sizes of the nodes into account. It further allows the user to choose between compact drawings with many temporary (dummy) nodes and less compact drawings with fewer dummy nodes. A large number of dummy nodes can significantly increase the running time of the subsequent steps of hierarchical graph drawing methods.
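A minimal sketch of a layering step that accounts for node heights, assuming plain longest-path layering on an acyclic successor map; the paper's actual algorithm and its compactness/dummy-node trade-off are more elaborate than this:

```python
def layer_with_heights(succ, height, gap=10):
    """Assign each node a (layer index, y-coordinate) pair, leaving
    vertical room for the tallest node in each layer.

    succ:   dict node -> list of successors (must be acyclic)
    height: dict node -> node height in drawing units
    """
    layer = {}

    def depth(v):  # longest path from any source to v, memoised in `layer`
        if v not in layer:
            preds = [u for u in succ if v in succ[u]]
            layer[v] = 1 + max((depth(u) for u in preds), default=-1)
        return layer[v]

    for v in succ:
        depth(v)

    # y-coordinate of each layer = cumulative tallest-node heights + gaps.
    n_layers = max(layer.values()) + 1
    tallest = [max((height[v] for v in layer if layer[v] == i), default=0)
               for i in range(n_layers)]
    y, ys = 0, []
    for h in tallest:
        ys.append(y)
        y += h + gap
    return {v: (layer[v], ys[layer[v]]) for v in layer}
```

On a diamond graph with differing node heights, the tall node in the middle layer pushes the bottom layer further down, which is exactly the effect size-oblivious layering algorithms miss.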

Journal ArticleDOI
TL;DR: A new edge classification scheme to extend the graph-based algorithms to handle test parts with curved faces is reported and a unique method of representing a feature, called a feature vector, is developed.
Abstract: Recognition of machining features is a vital link for the effective integration of various modules of computer integrated manufacturing systems (CIMS). Graph-based recognition is the most researched method due to the sound mathematical background of graph theory and a graph's structural similarity with B-Rep computer-aided design modellers’ database. The method, however, is criticized for its high computational requirement of graph matching, its difficulty in building a feature template library, its ability to handle only polyhedral parts and its inability to handle interacting features. The paper reports a new edge classification scheme to extend the graph-based algorithms to handle test parts with curved faces. A unique method of representing a feature, called a feature vector, is developed. The feature vector generation heuristic results in a recognition system with polynomial time complexity for any arbitrary attributed adjacency graph. The feature vector can be generated automatically from B-Rep mode...
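The convex/concave edge attribute at the heart of attributed adjacency graphs can be sketched from the two face normals and the shared edge's direction. The sign convention below is an assumption for illustration; real B-Rep kernels fix it through the orientation of the face loops:

```python
def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def edge_attribute(normal1, normal2, edge_dir):
    """Classify the edge shared by two faces as 'convex' or 'concave'
    from the face normals and a consistently oriented edge direction.
    The positive-sign-means-convex convention here is an assumption."""
    return "convex" if _dot(_cross(normal1, normal2), edge_dir) > 0 else "concave"
```

For a block's outer edge (top face normal up, side face normal out) the attribute comes out convex; flipping the side face inward, as at the wall of a slot, makes it concave.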

Journal Article
TL;DR: This paper describes the Oracle implementation and provides case studies from the life sciences to demonstrate the functionality to model data as a graph, and thereby has the potential to greatly facilitate research.
Abstract: New technologies have been developed in the life sciences that allow researchers to study biological systems in rich detail. These advances have resulted in an abundance of data that describes the relations between the fundamental components of biological systems, such as genes, proteins, and metabolites. The network of relations between the components holds insights as to how biological systems function, and consequently can help researchers understand the mechanisms behind disease. Biological networks are commonly managed and analyzed in a graph representation. Oracle Database 10g has the functionality to model data as a graph, and thereby has the potential to greatly facilitate research. In this paper we describe the Oracle implementation and provide case studies from the life sciences.
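A generic relational encoding of such a biological network, node and edge tables plus a recursive query for reachability, can be sketched with SQLite standing in for Oracle; the schema, table names, and sample data are illustrative assumptions and not Oracle's actual graph model:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes(id TEXT PRIMARY KEY, kind TEXT);
CREATE TABLE edges(src TEXT, dst TEXT, relation TEXT);
""")
conn.executemany("INSERT INTO nodes VALUES (?, ?)",
                 [("geneA", "gene"), ("protA", "protein"), ("met1", "metabolite")])
conn.executemany("INSERT INTO edges VALUES (?, ?, ?)",
                 [("geneA", "protA", "encodes"), ("protA", "met1", "produces")])

# Everything reachable from geneA, via a recursive common table expression.
rows = conn.execute("""
WITH RECURSIVE reach(id) AS (
    SELECT 'geneA'
    UNION
    SELECT e.dst FROM edges e JOIN reach r ON e.src = r.id
)
SELECT id FROM reach ORDER BY id;
""").fetchall()
```

The recursive query walks gene → protein → metabolite transitively, the kind of traversal a graph representation of pathway data is meant to support.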

Journal ArticleDOI
TL;DR: A new graph representation, the graph matrix, is presented, which combines the adjacency matrix with linked lists, allowing for the fastest possible access to different types of information on a graph.
Abstract: The paper presents a new graph representation, the graph matrix, which combines the adjacency matrix with linked lists, allowing for the fastest possible access to different types of information on a graph. This is increasingly important for high search performance, for instance when rapidly extracting information from the link structure in a hub-and-authority graph of the World Wide Web. A very recent application for the proposed data structure arises from categorical data clustering, which defines proximity and similarity of data through their patterns of co-occurrence.
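An illustrative reconstruction of the idea (not the paper's exact layout): each matrix cell holds an edge record that is also threaded into a per-row out-edge list and a per-column in-edge list, giving O(1) edge tests from the matrix part and degree-proportional neighbour iteration from the list part.

```python
class GraphMatrix:
    """Adjacency matrix whose cells reference edge records threaded into
    per-row (out-edge) and per-column (in-edge) singly linked lists."""

    class _Edge:
        __slots__ = ("src", "dst", "next_out", "next_in")
        def __init__(self, src, dst):
            self.src, self.dst = src, dst
            self.next_out = self.next_in = None

    def __init__(self, n):
        self.cell = [[None] * n for _ in range(n)]  # matrix part
        self.out_head = [None] * n                  # row list heads
        self.in_head = [None] * n                   # column list heads

    def add_edge(self, u, v):
        if self.cell[u][v] is None:
            e = self._Edge(u, v)
            e.next_out, self.out_head[u] = self.out_head[u], e
            e.next_in, self.in_head[v] = self.in_head[v], e
            self.cell[u][v] = e

    def has_edge(self, u, v):
        """O(1) edge test via the matrix part."""
        return self.cell[u][v] is not None

    def successors(self, u):
        """Iterate out-neighbours in O(out-degree) via the row list."""
        e = self.out_head[u]
        while e:
            yield e.dst
            e = e.next_out
```

A plain adjacency matrix needs O(n) to enumerate a node's neighbours and a plain adjacency list needs O(degree) for an edge test; threading the two together gets the better bound for each query, at the cost of the matrix's O(n²) memory.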