
Showing papers on "Graph database" published in 2000


Journal ArticleDOI
01 Jun 2000
TL;DR: The study of the web as a graph yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution.
Abstract: The study of the web as a graph is not only fascinating in its own right, but also yields valuable insight into web algorithms for crawling, searching and community discovery, and the sociological phenomena which characterize its evolution. We report on experiments on local and global properties of the web graph using two Altavista crawls each with over 200 million pages and 1.5 billion links. Our study indicates that the macroscopic structure of the web is considerably more intricate than suggested by earlier experiments on a smaller scale.

2,973 citations


Journal ArticleDOI
TL;DR: This is a survey on graph visualization and navigation techniques, as used in information visualization, which approaches the results of traditional graph drawing from a different perspective.
Abstract: This is a survey on graph visualization and navigation techniques, as used in information visualization. Graphs appear in numerous applications such as Web browsing, state-transition diagrams, and data structures. The ability to visualize and to navigate in these potentially large, abstract graphs is often a crucial part of an application. Information visualization has specific requirements, which means that this survey approaches the results of traditional graph drawing from a different perspective.

1,648 citations


Journal ArticleDOI
01 Nov 2000
TL;DR: A survey of recently proposed alternatives to the standard graph partitioning methodology, which the authors argue minimizes the wrong metric and lacks expressibility.
Abstract: Calculations can naturally be described as graphs in which vertices represent computation and edges reflect data dependencies. By partitioning the vertices of a graph, the calculation can be divided among processors of a parallel computer. However, the standard methodology for graph partitioning minimizes the wrong metric and lacks expressibility. We survey several recently proposed alternatives and discuss their relative merits.

448 citations
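
The partitioning setup described in the abstract above (vertices as computation, edges as data dependencies, vertices divided among processors) can be sketched in a few lines. The abstract does not say which metric it considers wrong, so the edge-cut count below is only an assumption: it is one commonly used measure of a partition, shown next to the per-processor load. The graph and the assignment are hypothetical.

```python
from collections import Counter

def edge_cut(edges, part):
    """Count edges whose endpoints are assigned to different processors."""
    return sum(1 for u, v in edges if part[u] != part[v])

def loads(part):
    """Number of vertices (units of computation) assigned to each processor."""
    return Counter(part.values())

# Hypothetical dependency graph: the path 0-1-2-3-4-5 plus the edge 0-5.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
# Hypothetical assignment of the six vertices to two processors.
part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

cut = edge_cut(edges, part)   # edges (2, 3) and (0, 5) cross the partition
load = loads(part)            # three vertices on each processor
```

Both functions are plain counts, so they only describe a given partition; finding a good partition is the hard problem that the surveyed methods address.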


Proceedings ArticleDOI
01 May 2000
TL;DR: A set of algorithms that operate on the Web graph is reviewed, addressing problems from Web search, automatic community discovery, and classification, and a new family of random graph models is proposed.
Abstract: The pages and hyperlinks of the World-Wide Web may be viewed as nodes and edges in a directed graph. This graph has about a billion nodes today, several billion links, and appears to grow exponentially with time. There are many reasons—mathematical, sociological, and commercial—for studying the evolution of this graph. We first review a set of algorithms that operate on the Web graph, addressing problems from Web search, automatic community discovery, and classification. We then recall a number of measurements and properties of the Web graph. Noting that traditional random graph models do not explain these observations, we propose a new family of random graph models.

347 citations


Patent
03 Oct 2000
TL;DR: In a knowledge classification system, both the information sources and queries are processed to generate knowledge representation graph structures, which are then converted to views and displayed to a searcher.
Abstract: In a knowledge classification system, both the information sources and queries are processed to generate knowledge representation graph structures. The graph structures for both the query and the information sources are then converted to views and displayed to a searcher. By manipulating the graph structure views for each information source, the searcher can examine the source for relevance. A search can be performed by comparing the graph structure of the query to the graph structure of each information source by a graph matching computer algorithm. Information sources are classified by constructing hierarchies of knowledge representations. The simplest construction is obtained by using the knowledge representation of a query as the top of the hierarchy. The structures in the hierarchy are substructures of the query. The hierarchy of structures may also be constructed by using the knowledge representation of the query as the bottom of the hierarchy. Structures in the hierarchy, in this case, are structures that contain the query. The vertices of a graph structure view can be displayed on a computer screen next to the corresponding items, such as words, phrases and visual features, of an information source view. Selecting a vertex in the graph structure causes the selected vertex and vertices adjacent to the selected vertex to be “highlighted.” By selecting a succession of vertices in the graph structure, a searcher can perform knowledge navigation of the information source. By successively selecting items of the information source, a searcher can perform knowledge exploration of the information source.

226 citations


Journal ArticleDOI
01 Sep 2000
TL;DR: This paper describes how the storage manager, the query execution engine, and the query optimizer of a database system can be extended to deal with compressed data and shows how compression can be integrated into a relational database system.
Abstract: In this paper, we show how compression can be integrated into a relational database system. Specifically, we describe how the storage manager, the query execution engine, and the query optimizer of a database system can be extended to deal with compressed data. Our main result is that compression can significantly improve the response time of queries if very light-weight compression techniques are used. We will present such light-weight compression techniques and give the results of running the TPC-D benchmark on a so compressed database and a non-compressed database using the AODB database system, an experimental database system that was developed at the Universities of Mannheim and Passau. Our benchmark results demonstrate that compression indeed offers high performance gains (up to 50%) for IO-intensive queries and moderate gains for CPU-intensive queries. Compression can, however, also increase the running time of certain update operations. In all, we recommend to extend today's database systems with light-weight compression techniques and to make extensive use of this feature.

201 citations


BookDOI
01 Jan 2000
TL;DR: A proceedings volume on applications of graph transformation, covering modularization concepts, distributed system modelling, software architectures, visual graph transformation languages, process modeling, and tool demonstrations.
Abstract: Modularization Concepts.- Term Graph Rewriting and Mobile Expressions in Functional Languages.- Graph Transformation Modules and Their Composition.- Modeling Distributed Systems by Modular Graph Transformation Based on Refinement via Rule Expressions.- Distributed System Modelling.- From UML Descriptions of High-Level Software Architectures to LQN Performance Models.- On a Uniform Representation of Transformation Systems.- A Note on Modeling Agent Systems by Graph Transformation.- Compositional Construction of Simulation Models Using Graph Grammars.- Software Architectures: Evolution and Reengineering.- Graph-Based Reverse Engineering and Reengineering Tools.- Support for Design Patterns through Graph Transformation Tools.- Conditional Graph Rewriting as a Domain-Independent Formalism for Software Evolution.- Visual Graph Transformation Languages.- Visual Languages: Where Do We Stand?.- From Graph Transformation to Rule-Based Programming with Diagrams.- Using Fujaba for the Development of Production Control Systems.- Visual Language Modeling and Tool Development.- A Formal Definition of Structured Analysis with Programmable Graph Grammars.- Creating Semantic Representations of Diagrams.- Defining the Syntax and Semantics of Natural Visual Languages.- GENGED A Development Environment for Visual Languages.- Tool Development and Knowledge Modeling in Different Applications.- Graph Visualisation in ArchiCAD.- A Combined Graph Schema and Graph Grammar Approach to Consistency in Distributed Modeling.- Improving the Publication Chain through High-Level Authoring Support.- Learning and Rewriting in Fuzzy Rule Graphs.- A Proof Tool Dedicated to Clean.- Image Recognition and Constraint Solving.- Document Table Recognition by Graph Rewriting.- Image Structure from Monotonic Dual Graph Contraction.- Planning Geometric Constraint Decomposition via Optimal Graph Transformations.- Process Modeling and View Integration.- AHEAD: A Graph-Based System for Modeling and Managing Development Processes.- Formalizing UML-Based Process Models Using Graph Transformations.- Formal Integration of Software Engineering Aspects Using a Graph Rewrite System - A Typical Experience ?! -.- Towards Integrating Multiple Perspectives by Distributed Graph Transformation.- Visualization and Animation Tools.- Graph Algorithm Animation with Grrr.- An L-System-Based Plant Modeling Language.- Tool Demonstrations.- TREEBAG - a Short Presentation.- Tool Support for ViewPoint-Oriented Software Development.- UPGRADE - A Framework for Graph-Based Visual Applications.- Generating Diagram Editors with DiaGen.- PROgrammed Graph REwriting System PROGRES.- Testing and Simulating Production Control Systems Using the Fujaba Environment.- L-Studio/cpfg: A Software System for Modeling Plants.- DiTo - A Distribution Tool Based on Graph Rewriting.- A Demonstration of the Grrr Graph Rewriting Programming Language.- AGG: A Tool Environment for Algebraic Graph Transformation.- AGTIVE Workshop/Symposium Panel Discussion on Industrial Relevance of Graph Transformation: The Reality and Our Dreams.- Best Presentation and Demonstration Awards.

193 citations


Proceedings ArticleDOI
Horst Bunke
01 Sep 2000
TL;DR: Topics to be addressed include graph clustering and efficient indexing of large databases of graphs, and theoretical work showing various relations between different similarity measures is discussed.
Abstract: Graphs are a powerful and versatile tool useful in various subfields of science and engineering. In many applications, for example, in pattern recognition and computer vision, it is required to measure the similarity of objects. When graphs are used for the representation of structured objects, then the problem of measuring object similarity turns into the problem of computing the similarity of graphs, which is also known as graph matching. In this paper, similarity measures on graphs and related algorithms are reviewed. Also theoretical work showing various relations between different similarity measures is discussed. Other topics to be addressed include graph clustering and efficient indexing of large databases of graphs.

186 citations


Journal ArticleDOI
TL;DR: This paper presents the results of experiments investigating the relative worth of graph drawing aesthetics and algorithms using a single graph, and indicates that while some individual aesthetics affect human performance, it is difficult to say that one algorithm is ‘better’ than another from a relational understanding point of view.

146 citations


Book ChapterDOI
27 Mar 2000
TL;DR: This paper presents a language for searching graph-like databases, which permits us to express paths in a graph by means of extended regular expressions, and presents an algebra for partially ordered relations and an algorithm for the computation of path queries.
Abstract: Graph data is an emerging model for representing a variety of database contexts ranging from object-oriented databases to hypertext data. Also many of the recursive queries that arise in relational databases are, in practice, graph traversals. In this paper we present a language for searching graph-like databases. The language permits us to express paths in a graph by means of extended regular expressions. The proposed extension is based on the introduction of constructs which permit us i) to define a partial order on the paths used to search the graph and, consequently, on the answers of queries, and ii) to cut off, nondeterministically, tuples with low priority. We present an algebra for partially ordered relations and an algorithm for the computation of path queries. Finally, we present applications to hypertext databases such as the Web.

141 citations
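
The core idea in the abstract above, expressing paths in a graph with regular expressions, can be illustrated with a minimal sketch. It assumes the expression has already been compiled by hand into a small automaton, and it explores pairs of (graph node, automaton state) breadth-first. The paper's partial order on paths and its nondeterministic cut-off are not modelled here; the graph, the labels `link` and `cite`, and the function name are hypothetical.

```python
from collections import deque

def path_query(graph, nfa, start_state, accept_states, source):
    """Return the nodes reachable from `source` along a path whose
    sequence of edge labels is accepted by the automaton.

    graph: dict node -> list of (label, target) edges
    nfa:   dict (state, label) -> set of next states
    """
    seen = {(source, start_state)}
    queue = deque(seen)
    answers = set()
    while queue:
        node, state = queue.popleft()
        if state in accept_states:
            answers.add(node)
        for label, target in graph.get(node, []):
            for nxt in nfa.get((state, label), ()):
                if (target, nxt) not in seen:
                    seen.add((target, nxt))
                    queue.append((target, nxt))
    return answers

# Hypothetical hypertext graph with two edge labels.
graph = {
    "a": [("link", "b")],
    "b": [("link", "c"), ("cite", "d")],
    "c": [("cite", "e")],
    "d": [("link", "a")],
}
# Hand-built automaton for the expression  link* cite
nfa = {(0, "link"): {0}, (0, "cite"): {1}}

result = path_query(graph, nfa, start_state=0, accept_states={1}, source="a")
```

Tracking visited (node, state) pairs rather than visited nodes keeps the search finite on cyclic graphs while still allowing a node to be revisited in a different automaton state.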


Book ChapterDOI
20 Sep 2000
TL;DR: GraphXML is a graph description language in XML that can be used as an interchange format for graph drawing and visualization packages and supports the pure, mathematical description of a graph.
Abstract: GraphXML is a graph description language in XML that can be used as an interchange format for graph drawing and visualization packages. The generality and rich features of XML make it possible to define an interchange format that not only supports the pure, mathematical description of a graph, but also the needs of information visualization applications that use graph--based data structures.

Book
01 Jan 2000
TL;DR: The Graph Visualization Framework (GVF) is an architecture that supports the tasks common to most graph browsers and editors; the article gives an overview of its design and focuses on the core classes used to represent and manipulate graphs.
Abstract: Many applications, from everyday file system browsers to visual programming tools, require the display of network and graph structures. The Graph Visualization Framework (GVF) is an architecture that supports the tasks common to most graph browsers and editors. This article gives a brief overview of the design of the GVF and focuses on the core classes that are used to represent and manipulate graphs. The design of the core classes is justified by the requirements for navigation and visualization.

Book ChapterDOI
27 Mar 2000
TL;DR: An approximation approach is proposed for graph schema extraction by summarizing the semi-structured data graph using an incremental clustering method; preliminary results show that approximate graph schemas are more compact than conventional accurate graph schemas and promising in query evaluation involving regular path expressions.
Abstract: Semi-structured data are typically represented in the form of labeled directed graphs. They are self-describing and schemaless. The lack of a schema renders query processing over semi-structured data expensive. To overcome this predicament, some researchers proposed to use the structure of the data for schema representation. Such schemas are commonly referred to as graph schemas. Nevertheless, since semi-structured data are irregular and frequently subjected to modifications, it is costly to construct an accurate graph schema and worse still, it is difficult to maintain it thereafter. Furthermore, an accurate graph schema is generally very large, hence impractical. In this paper, an approximation approach is proposed for graph schema extraction. Approximation is achieved by summarizing the semi-structured data graph using an incremental clustering method. The preliminary experimental results have shown that approximate graph schemas were more compact than the conventional accurate graph schemas and promising in query evaluation that involved regular path expressions.

Book ChapterDOI
18 May 2000
TL;DR: In this article, the problem of rewriting queries using views has been studied in the context of information integration systems such as the Information Manifold, where the data sources are modelled as sound views over a global schema.
Abstract: Rewriting queries using views is a powerful technique that has applications in data integration, data warehousing and query optimization. Query rewriting in relational databases is by now rather well investigated. However, in the framework of semistructured data the problem of rewriting has received much less attention. In this paper we identify some difficulties with currently known methods for using rewritings in semistructured databases. We study the problem in a realistic setting, proposed in information integration systems such as the Information Manifold, in which the data sources are modelled as sound views over a global schema. We give a new rewriting, which we call the possibility rewriting, that can be used in pruning the search space when answering queries using views. The possibility rewriting can be computed in time polynomial in the size of the original query and the view definitions. Finally, we show by means of a realistic example that our method can reduce the search space by an order of magnitude.

Book ChapterDOI
25 Mar 2000
TL;DR: A demand-driven call graph construction framework is presented, focusing on the dynamic calls due to polymorphism in object-oriented languages, and it is shown that the demand-driven technique has the same accuracy as the corresponding exhaustive technique.
Abstract: Call graph construction has been an important area of research within the compilers and programming languages community. However, all existing techniques focus on exhaustive analysis of all the call-sites in the program. With increasing importance of just-in-time or dynamic compilation and use of program analysis as part of the software development environments, we believe that there is a need for techniques for demand-driven construction of the call graph. We present a demand-driven call graph construction framework in this paper, focusing on the dynamic calls due to polymorphism in object-oriented languages. We use a variant of Callahan's Program Summary Graph (PSG) and perform analysis over a set of influencing nodes. We show that our demand-driven technique has the same accuracy as the corresponding exhaustive technique. The reduction in the graph construction time depends upon the ratio of the cardinality of the set of influencing nodes to the set of all nodes.

Book ChapterDOI
14 Aug 2000
TL;DR: This paper discusses several levels of conceptual authoring support, which provide the author with directions and examples that (if adopted) remain linked to the text and a restriction on the possible document structures the author is allowed to build.
Abstract: Conceptual authoring support provides tools to help authors construct and organize their document on the conceptual level. As computer-based tools are purely formal entities, they cannot handle natural language itself. Instead, they provide the author with directions and examples that (if adopted) remain linked to the text. This paper discusses several levels of such directions: A Pattern describes a solution for a common problem, here a combination of audience and topic. It may point to several Schemata, which may be expanded in the document structure graph, leaving the author with more specific graph structures to expand and text gaps to fill in. A Type Definition is finally a restriction on the possible document structures the author is allowed to build. Several examples of such patterns, schemata and types are presented.


Book ChapterDOI
Prabhakar Raghavan
10 Apr 2000
TL;DR: The subject of this survey is the directed graph induced by the hyperlinks between Web pages; it is suggested that there are several hundred million nodes in the Web graph; this quantity is growing by several percent each month.
Abstract: The subject of this survey is the directed graph induced by the hyperlinks between Web pages; we refer to this as the Web graph. Nodes represent static html pages and hyperlinks represent directed edges between them. Recent estimates [5] suggest that there are several hundred million nodes in the Web graph; this quantity is growing by several percent each month. The average node has roughly seven hyperlinks (directed edges) to other pages, making for a total of several billion hyperlinks in all.
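
The arithmetic in the abstract above (pages as nodes, hyperlinks as directed edges, total links equal to node count times average out-degree) can be checked with a small sketch. The three-page graph and the figure of 300 million pages are hypothetical; the abstract only says "several hundred million" nodes and "roughly seven" links per node.

```python
def average_out_degree(web):
    """web: dict page -> list of pages it links to (directed edges)."""
    pages = set(web)
    for targets in web.values():
        pages.update(targets)          # include pages that only receive links
    links = sum(len(targets) for targets in web.values())
    return links / len(pages)

# Hypothetical three-page Web graph with four hyperlinks.
toy_web = {
    "index.html": ["a.html", "b.html"],
    "a.html": ["b.html"],
    "b.html": ["index.html"],
}
avg = average_out_degree(toy_web)      # 4 links over 3 pages

# Assumed node count; with about seven links per page this gives
# a total on the order of a few billion hyperlinks.
assumed_pages = 300_000_000
estimated_links = assumed_pages * 7
```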

Journal ArticleDOI
01 Dec 2000
TL;DR: A graph-based cluster-sequencing method to minimize the I/O cost in spatial join processing is proposed and it is proved that the approximation to MO order obtained from the method is close to the optimal result.
Abstract: In this paper, we propose a graph-based cluster-sequencing method to minimize the I/O cost in spatial join processing. We first define the maximum overlapping (MO) order in a graph, proving that the problem of finding an MO order in a graph is NP-complete. Then, we propose an algorithm to find an approximation to MO order in a graph. We also prove that the approximation to MO order obtained from our method is close to the optimal result. Simulations have been conducted to demonstrate the saving of I/O cost in spatial join by using our method. (C) 2000 Elsevier Science B.V. All rights reserved.

Proceedings ArticleDOI
01 Sep 2000
TL;DR: A novel image analysis system based on attributed relational graph matching, called accumulative Hopfield matching, is proposed and applied to automatic object recognition and modeling of articulated objects with good results.
Abstract: In this paper, a novel image analysis system based on attributed relational graph matching is proposed, which is called accumulative Hopfield matching. We first divide the scene graph into many sub-graphs, and a modified Hopfield network is then constructed to obtain the sub-graph isomorphism between each sub-scene graph and model graph. The final result is deduced by accumulating the solutions of all small subnetworks. The proposed system was applied to automatic object recognition and modeling of articulated objects with good results.

Journal Article
TL;DR: A bottom-up approach for identifying and recognizing tables within a document using a set of rules designed for and based on a priori document knowledge and general formatting conventions is proposed.
Abstract: This paper proposes a bottom-up approach for identifying and recognizing tables within a document. This approach is based on the paradigm of graph rewriting. First, the document image is transformed into a layout graph whose nodes and edges respectively represent document entities and their interrelations. This graph is subsequently rewritten using a set of rules designed for and based on a priori document knowledge and general formatting conventions. The resulting graph provides both logical and layout views of the document content.

Proceedings ArticleDOI
08 Nov 2000
TL;DR: This work proposes a meta-data framework based on a graph modeling technique and shows how this framework can be applied to making two genome databases interoperable, namely DB/12 and GDB.
Abstract: The proliferation, diversity and complexity of genome databases pose a significant challenge to the multidatabase research community. We propose a meta-data framework based on a graph modeling technique and show how this framework can be applied to making two genome databases interoperable. Our technique maps individual database schemas expressed in heterogeneous data models into a common graph representation. This graph-based framework is designed to express both regular data and meta-data. Modeling both types of data is necessary for establishing database correspondences. We show how inter-database relationships can be expressed in terms of correspondences amongst basic graph units, including nodes, edges and the "value path". We apply these basic graph units to perform query interoperation between two databases, namely DB/12 (Database of Human Chromosome 12) and GDB (Genome Database).

Proceedings ArticleDOI
10 Sep 2000
TL;DR: This work proposes a solution that manages the visual complexity of graph-based diagrams with many hundreds, and even thousands of nodes by applying heuristics to choose the best path for an edge and attributing to it a color defined automatically according to its origin and destination nodes.
Abstract: One major challenge with graph-based visual languages is managing the complexity and maintaining a good readability as the density of edges in the graph increases. To improve the graph readability we propose a solution that manages the visual complexity of graph-based diagrams with many hundreds, and even thousands of nodes. It consists in applying heuristics to choose the best path for an edge and attributing to it a color defined automatically according to its origin and destination nodes. A grid system where certain areas are reserved for the display of nodes and others for the edges, is used for the layout of the nodes and edges.

Journal ArticleDOI
TL;DR: The performance evaluation demonstrates that the proposed approach can result in significant cost savings over the current join processing methods, for low to modest values of the join selectivity factor.
Abstract: The focus of this work is on join optimization in relational database systems. The importance of join optimization is critically underscored by the high cost of relational joins and their frequent needs in traditional as well as emerging database applications. We demonstrate that the sequence in which the pages of the relations are accessed to process a join is a critical determinant of the join execution cost and an optimization of this sequence can lead to a significant improvement in performance over the traditional approaches. We initially develop three network structures to represent a join on two relations: page connectivity graph, cascade and block tree cascade. A page connectivity graph is a bipartite representation of the set of connected pages in the two relations according to the join predicate. To reveal the structural properties of the join, the nodes of the bipartite graph are ordered into a set of levels, and the resulting isomorphic structure is termed a cascade. From the cascade, a tree structure termed a block tree cascade is derived by selectively grouping nodes at each level of the cascade into blocks. We formulate the join as a tree traversal process, and accordingly develop efficient tree traversal algorithms. We develop a compact data structure to store the resulting access path, and provide a comprehensive analysis of the algorithms with detailed assessments of their performance. The performance evaluation demonstrates that the proposed approach can result in significant cost savings over the current join processing methods, for low to modest values of the join selectivity factor.
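
The page connectivity graph described in the abstract above can be sketched as a bipartite mapping from pages of one relation to pages of the other. This is a minimal illustration under two assumptions: the join predicate is equality on a single key, and each page is represented only by the list of key values stored on it. The page names and contents are hypothetical, and the cascade and block tree cascade structures are not attempted.

```python
from collections import defaultdict

def page_connectivity_graph(r_pages, s_pages):
    """Link a page of R to a page of S whenever the two pages hold at
    least one pair of tuples with equal join keys.

    r_pages, s_pages: dict page_id -> list of join-key values on that page.
    Returns dict r_page_id -> set of connected s_page_ids.
    """
    # Index: join key -> pages of S that contain it.
    key_to_s_pages = defaultdict(set)
    for s_id, keys in s_pages.items():
        for key in keys:
            key_to_s_pages[key].add(s_id)

    graph = {}
    for r_id, keys in r_pages.items():
        connected = set()
        for key in keys:
            connected |= key_to_s_pages.get(key, set())
        graph[r_id] = connected
    return graph

# Hypothetical page contents (join-key values only).
r_pages = {"R1": [1, 2], "R2": [3], "R3": [9]}
s_pages = {"S1": [2, 3], "S2": [1], "S3": [7]}

pcg = page_connectivity_graph(r_pages, s_pages)
```

Pages with an empty neighbour set, such as `R3` here, contribute nothing to the join result and never need to be paired with a page of the other relation.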

Proceedings ArticleDOI
01 May 2000
TL;DR: This paper developed heuristics that improve the readability of control- and data-flow diagrams with many hundreds, and even thousands of nodes.
Abstract: The visual interface plays a significant role in a visual programming system. We therefore developed heuristics that improve the readability of control- and data-flow diagrams with many hundreds, and even thousands of nodes. In this paper, we study how the body of research in graph drawing (GD) can be applied to an actual graph-based interface.

Proceedings ArticleDOI
31 Jan 2000
TL;DR: Generic database model tools for L-systems (DBM-L) are presented, which represent L-system objects as database structures through a generic automatic translation process, together with a visualization tool (VT) for the different data structures.
Abstract: In this paper we present generic database model tools for L-systems (DBM-L). DBM-L allows representation of L-system objects as database structures by a generic automatic translation process. This research contributes the idea of pre-computing recursive structure data into derived attributes using compiler generation. We supply a method that establishes a correspondence between biologists' terms and compiler-generated terms and provides a biologist-friendly computing environment. We also present a visualization tool (VT) for representing the different data structures. The whole DBM-L provides a valuable package for biological scientific database management.

01 Mar 2000
TL;DR: Several modifications to a basic serial graph rewriting paradigm are presented and how they improve coding programs in the Grrr graph rewriting programming language are discussed.
Abstract: Graph rewriting is becoming increasingly popular as a method for programming with graph based data structures. We present several modifications to a basic serial graph rewriting paradigm and discuss how they improve coding programs in the Grrr graph rewriting programming language. The constructs we present are once only nodes, attractor nodes and single match rewrites. We illustrate the operation of the constructs by example. The advantages of adding these new rewrite modifiers are reduced program size, improved efficiency of execution and a simpler host graph undergoing rewriting.

Proceedings Article
01 Jan 2000

Journal ArticleDOI
01 Sep 2000
TL;DR: A comprehensive process for re-engineering procedural, legacy code to an object-oriented architecture based on a program representation graph, called a statement dependence graph, which includes a technique to recognize potential object hierarchies, state variables and operations.
Abstract: Maintenance of legacy systems is a laborious, error-prone task. It is often difficult to define encapsulated components in procedural programs. We define a comprehensive process for re-engineering procedural, legacy code to an object-oriented architecture. The process is based on a program representation graph, called a statement dependence graph. The process includes a technique to recognize potential object hierarchies, state variables and operations. Procedures are partitioned into operations by analyzing variable use-def chains. The statement dependence graph is restructured by merging cohesive parts of the graph to produce a restructured graph. From the restructured graph, we identify hierarchies of objects. The process to encapsulate the objects includes streamlining the interfaces. Copyright © 2000 John Wiley & Sons, Ltd.

Proceedings ArticleDOI
24 Sep 2000
TL;DR: The edge types and inheritance of the proposed graph type model are useful modeling tools and the template illustrates the usual usage of a substructure, as opposed to the minimal one required or the maximal one allowed by the structure definition.
Abstract: Addresses two problems of technical authors in structured environments: (1) structure definitions of the SGML school are limiting: they require one primary hierarchy and do not cater for link types; and (2) real-life structure definitions are too large to be comprehended easily. As solutions, we propose graph types and usage templates. The edge types and inheritance of the proposed graph type model are useful modeling tools. We give examples for structures that can be expressed more precisely and with gain for the author using graph structures. There are also graphical tools available to define graph types and to specify operations on graphs. Templates can be used as a simple parameterization mechanism. A template illustrates the usual usage of a substructure, as opposed to the minimal one required by a structure definition or the maximal one allowed by it. We also present a prototype authoring application based on these ideas.