
Showing papers on "Adjacency list published in 2005"


Journal ArticleDOI
TL;DR: A general framework for 'soft' thresholding that assigns a connection weight to each gene pair is described, and several node connectivity measures are introduced, with empirical evidence that they can be important for predicting the biological significance of a gene.
Abstract: Gene co-expression networks are increasingly used to explore the system-level functionality of genes. The network construction is conceptually straightforward: nodes represent genes and nodes are connected if the corresponding genes are significantly co-expressed across appropriately chosen tissue samples. In reality, it is tricky to define the connections between the nodes in such networks. An important question is whether it is biologically meaningful to encode gene co-expression using binary information (connected=1, unconnected=0). We describe a general framework for 'soft' thresholding that assigns a connection weight to each gene pair. This leads us to define the notion of a weighted gene co-expression network. For soft thresholding we propose several adjacency functions that convert the co-expression measure to a connection weight. For determining the parameters of the adjacency function, we propose a biologically motivated criterion (referred to as the scale-free topology criterion). We generalize the following important network concepts to the case of weighted networks. First, we introduce several node connectivity measures and provide empirical evidence that they can be important for predicting the biological significance of a gene. Second, we provide theoretical and empirical evidence that the 'weighted' topological overlap measure (used to define gene modules) leads to more cohesive modules than its 'unweighted' counterpart. Third, we generalize the clustering coefficient to weighted networks. Unlike the unweighted clustering coefficient, the weighted clustering coefficient is not inversely related to the connectivity. We provide a model that shows how an inverse relationship between clustering coefficient and connectivity arises from hard thresholding. We apply our methods to simulated data, a cancer microarray data set, and a yeast microarray data set.
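The soft-thresholding idea described above can be sketched in a few lines. The power adjacency function a_ij = |cor(i, j)|**beta is one of the adjacency functions the authors propose; the helper names and the choice beta = 6 here are illustrative (in practice beta would be chosen by the scale-free topology criterion), and the data is random, not a real expression matrix.

```python
import numpy as np

def soft_threshold_adjacency(expr, beta=6):
    """Weighted adjacency for a genes-x-samples expression matrix,
    using the power adjacency function a_ij = |cor(i, j)| ** beta."""
    cor = np.corrcoef(expr)        # gene-gene Pearson correlations
    adj = np.abs(cor) ** beta      # soft thresholding: weights stay in [0, 1]
    np.fill_diagonal(adj, 0.0)     # no self-connections
    return adj

def connectivity(adj):
    """Weighted node connectivity: the sum of a node's connection weights."""
    return adj.sum(axis=1)

rng = np.random.default_rng(0)
expr = rng.standard_normal((5, 20))   # 5 genes measured in 20 samples
adj = soft_threshold_adjacency(expr, beta=6)
k = connectivity(adj)
```

Unlike hard thresholding, every gene pair keeps a weight in [0, 1], so no information is discarded by a binary cut-off.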

4,448 citations


Journal ArticleDOI
TL;DR: A close to optimal loop closing method is proposed that, while maintaining independence at the local level, imposes consistency at the global level at a computational cost that is linear with the size of the loop.
Abstract: In this paper, we present a hierarchical mapping method that allows us to obtain accurate metric maps of large environments in real time. The lower (or local) map level is composed of a set of local maps that are guaranteed to be statistically independent. The upper (or global) level is an adjacency graph whose arcs are labeled with the relative location between local maps. An estimation of these relative locations is maintained at this level in a relative stochastic map. We propose a close to optimal loop closing method that, while maintaining independence at the local level, imposes consistency at the global level at a computational cost that is linear with the size of the loop. Experimental results demonstrate the efficiency and precision of the proposed method by mapping the Ada Byron building at our campus. We also analyze, using simulations, the precision and convergence of our method for larger loops.

415 citations


Proceedings ArticleDOI
Qing Fang, Jie Gao, Leonidas J. Guibas, V. de Silva, Li Zhang
13 Mar 2005
TL;DR: This work develops a protocol which in a preprocessing phase discovers the global topology of the sensor field and partitions the nodes into routable tiles - regions where the node placement is sufficiently dense and regular that local greedy methods can work well.
Abstract: We present gradient landmark-based distributed routing (GLIDER), a novel naming/addressing scheme and associated routing algorithm, for a network of wireless communicating nodes. We assume that the nodes are fixed (though their geographic locations are not necessarily known), and that each node can communicate wirelessly with some of its geographic neighbors - a common scenario in sensor networks. We develop a protocol which in a preprocessing phase discovers the global topology of the sensor field and, as a byproduct, partitions the nodes into routable tiles - regions where the node placement is sufficiently dense and regular that local greedy methods can work well. Such global topology includes not just connectivity but also higher order topological features, such as the presence of holes. We address each node by the name of the tile containing it and a set of local coordinates derived from connectivity graph distances between the node and certain landmark nodes associated with its own and neighboring tiles. We use the tile adjacency graph for global route planning and the local coordinates for realizing actual inter- and intra-tile routes. We show that efficient load-balanced global routing can be implemented quite simply using such a scheme.

287 citations


Proceedings ArticleDOI
04 Jul 2005
TL;DR: A novel approach to the surface reconstruction problem that takes as its input an oriented point set and returns a solid, water-tight model by using Stokes' Theorem to compute the characteristic function of the solid model.
Abstract: In this paper we present a novel approach to the surface reconstruction problem that takes as its input an oriented point set and returns a solid, water-tight model. The idea of our approach is to use Stokes' Theorem to compute the characteristic function of the solid model (the function that is equal to one inside the model and zero outside of it). Specifically, we provide an efficient method for computing the Fourier coefficients of the characteristic function using only the surface samples and normals, we compute the inverse Fourier transform to get back the characteristic function, and we use iso-surfacing techniques to extract the boundary of the solid model. The advantage of our approach is that it provides an automatic, simple, and efficient method for computing the solid model represented by a point set without requiring the establishment of adjacency relations between samples or iteratively solving large systems of linear equations. Furthermore, our approach can be directly applied to models with holes and cracks, providing a method for hole-filling and zippering of disconnected polygonal models.

204 citations


Journal ArticleDOI
TL;DR: The aim is to convert graphs to string sequences so that string matching techniques can be used and to compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice.
Abstract: This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph-matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems.
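The seriation step described above, ordering nodes by the components of the leading eigenvector of the adjacency matrix, can be sketched as follows. This is only the string-conversion step, not the paper's full MAP alignment, and the function name is ours.

```python
import numpy as np

def seriation_order(adj):
    """Order graph nodes by the components of the leading eigenvector of
    the adjacency matrix (spectral seriation). For a connected graph the
    Perron-Frobenius theorem lets us take this eigenvector entrywise
    non-negative, so sorting its components gives a serial ordering."""
    vals, vecs = np.linalg.eigh(adj)   # symmetric matrix: real eigenpairs
    lead = vecs[:, np.argmax(vals)]    # eigenvector of the largest eigenvalue
    if lead.sum() < 0:                 # resolve the overall sign ambiguity
        lead = -lead
    return np.argsort(-lead)           # node indices by descending component

# path graph 0-1-2-3: the interior nodes get the largest components
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
order = seriation_order(A)
```

Two graphs serialized this way can then be compared with ordinary string edit operations, which is the formal footing the paper is after.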

191 citations


Journal ArticleDOI
TL;DR: In this paper, an exhaustive arithmetic criterion for adjacency of vertices in the prime graph of a finite non-Abelian simple group is given. The criterion is applied to several problems in finite group theory, including the recognition-by-spectra problem.
Abstract: For every finite non-Abelian simple group, we give an exhaustive arithmetic criterion for adjacency of vertices in a prime graph of the group. For the prime graph of every finite simple group, this criterion is used to determine an independent set with a maximal number of vertices and an independent set with a maximal number of vertices containing 2, and to define orders on these sets; the information obtained is collected in tables. We consider several applications of these results to various problems in finite group theory, in particular, to the recognition-by-spectra problem for finite groups.

134 citations


Journal ArticleDOI
TL;DR: This work presents an exact algorithmic approach to a spatial problem arising in forest harvesting based on determining a strong formulation of the linear programming problem through a clique representation of a projected problem.
Abstract: We consider a spatial problem arising in forest harvesting. For regulatory reasons, blocks harvested should not exceed a certain total area, typically 49 hectares. Traditionally, this problem, called the adjacency problem, has been approached by forming a priori blocks from basic cells of 5 to 25 hectares and solving the resulting mixed-integer program. Superior solutions can be obtained by including the construction of blocks in the decision process. The resulting problem is far more complex combinatorially. We present an exact algorithmic approach that has yielded good results in computational tests. This solution approach is based on determining a strong formulation of the linear programming problem through a clique representation of a projected problem.
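The clique idea behind the strong formulation can be illustrated with a toy sketch: enumerate the maximal cliques of the cell-adjacency graph and emit one at-most-one constraint per clique. This is only a schematic of the general idea under simplified assumptions (no areas, no block construction); the paper's actual formulation is considerably richer.

```python
def maximal_cliques(adj):
    """Bron-Kerbosch enumeration of the maximal cliques of an undirected
    graph given as {node: set of neighbours}."""
    cliques = []
    def bk(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            bk(r | {v}, p & adj[v], x & adj[v])
            p.discard(v)
            x.add(v)
    bk(set(), set(adj), set())
    return cliques

def clique_constraints(adj):
    """One constraint per maximal clique C of the cell-adjacency graph:
    sum_{i in C} x_i <= 1, i.e. at most one cell of any mutually adjacent
    group may be harvested. Each clique constraint dominates all the
    pairwise constraints x_i + x_j <= 1 among its members."""
    return sorted(sorted(c) for c in maximal_cliques(adj))

# four cells: 0, 1, 2 mutually adjacent; cell 3 adjacent only to cell 2
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cons = clique_constraints(adj)
```

Replacing many weak pairwise constraints by fewer clique constraints is a standard way to tighten the linear relaxation of such adjacency models.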

126 citations


Journal ArticleDOI
TL;DR: In this article, an exhaustive arithmetic criterion of adjacency in the prime graph of every finite nonabelian simple group is given and used to determine independence sets with the maximal number of vertices, together with their orders.
Abstract: In the paper we give an exhaustive arithmetic criterion of adjacency in prime graph $GK(G)$ for every finite nonabelian simple group $G$. By using this criterion for all finite simple groups an independence set with the maximal number of vertices, an independence set containing 2 with the maximal number of vertices, and the orders of these independence sets are given. We assemble this information in the tables at the end of the paper. Several applications of obtained results for various problems of finite group theory are considered.

114 citations


Journal ArticleDOI
TL;DR: This paper analyzes the Enron email data set to discover structures within the organization and shows that preprocessing of data has a significant impact on the results; thus a standard form is needed for establishing a benchmark data set.
Abstract: Analysis of social networks to identify communities and model their evolution has been an active area of recent research. This paper analyzes the Enron email data set to discover structures within the organization. The analysis is based on constructing an email graph and studying its properties with both graph theoretical and spectral analysis techniques. The graph theoretical analysis includes the computation of several graph metrics such as degree distribution, average distance ratio, clustering coefficient and compactness over the email graph. The spectral analysis shows that the email adjacency matrix has a rank-2 approximation. It is shown that preprocessing of data has a significant impact on the results; thus a standard form is needed for establishing a benchmark data set.
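Two of the measurements mentioned above, the clustering coefficient and the quality of a low-rank approximation of the adjacency matrix, are easy to sketch. The example graph is a synthetic two-community stand-in, not the Enron data, and the function names are ours.

```python
import numpy as np

def rank_k_energy(adj, k):
    """Fraction of the squared-Frobenius energy of a matrix captured by
    its best rank-k approximation (Eckart-Young, via singular values)."""
    s = np.linalg.svd(adj, compute_uv=False)
    return float((s[:k] ** 2).sum() / (s ** 2).sum())

def clustering_coefficient(adj):
    """Global clustering coefficient, 3 * triangles / connected triples,
    computed from powers of the 0/1 adjacency matrix."""
    a = np.asarray(adj, dtype=float)
    triangles = np.trace(a @ a @ a) / 6.0   # each triangle = 6 closed 3-walks
    deg = a.sum(axis=1)
    triples = (deg * (deg - 1) / 2.0).sum()
    return 3.0 * triangles / triples if triples else 0.0

# synthetic two-community graph: two disjoint 4-cliques
block = np.ones((4, 4)) - np.eye(4)
A = np.block([[block, np.zeros((4, 4))],
              [np.zeros((4, 4)), block]])
```

For this graph the top two singular values carry 75% of the energy, which is the kind of evidence behind a "rank-2 approximation" claim for a strongly two-community matrix.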

113 citations


Proceedings ArticleDOI
26 Oct 2005
TL;DR: Improved visual clustering is applied to previously described network protection domains (attack graph cliques), and the attack graph adjacency matrix is visualized, showing patterns of network attack while avoiding the clutter usually associated with drawing large graphs.
Abstract: While efficient graph-based representations have been developed for modeling combinations of low-level network attacks, relatively little attention has been paid to effective techniques for visualizing such attack graphs. This paper describes a number of new attack graph visualization techniques, each having certain desirable properties and offering different perspectives for solving different kinds of problems. Moreover, the techniques we describe can be applied not only separately, but can also be combined into coordinated attack graph views. We apply improved visual clustering to previously described network protection domains (attack graph cliques), which reduces graph complexity and makes the overall attack flow easier to understand. We also visualize the attack graph adjacency matrix, which shows patterns of network attack while avoiding the clutter usually associated with drawing large graphs. We show how the attack graph adjacency matrix concisely conveys the impact of network configuration changes on attack graphs. We also describe a novel attack graph filtering technique based on the interactive navigation of a hierarchy of attack graph constraints. Overall, our techniques scale quadratically with the number of machines in the attack graph.

108 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown that the integrated density of states of the adjacency operator on a percolation graph has discontinuities precisely at a dense set of energies, and that the convergence of the integrated densities of states of finite box Hamiltonians to the one on the whole space holds even at these points of discontinuity.
Abstract: We study the family of Hamiltonians which corresponds to the adjacency operators on a percolation graph. We characterise the set of energies which are almost surely eigenvalues with finitely supported eigenfunctions. This set of energies is a dense subset of the algebraic integers. The integrated density of states has discontinuities precisely at this set of energies. We show that the convergence of the integrated densities of states of finite box Hamiltonians to the one on the whole space holds even at the points of discontinuity. For this we use an equicontinuity-from-the-right argument. The same statements hold for the restriction of the Hamiltonian to the infinite cluster. In this case we prove that the integrated density of states can be constructed using local data only. Finally we study some mixed Anderson-Quantum percolation models and establish results in the spirit of Wegner, and Delyon and Souillard.

Journal ArticleDOI
TL;DR: This paper presents a novel compact adjacency‐based topological data structure for finite element mesh representation, designed to support, under the same framework, both two‐ and three‐dimensional meshes, with any type of elements defined by templates of ordered nodes.
Abstract: This paper presents a novel compact adjacency-based topological data structure for finite element mesh representation. The proposed data structure is designed to support, under the same framework, both two- and three-dimensional meshes, with any type of elements defined by templates of ordered nodes. When compared to other proposals, our data structure reduces the required storage space while being ‘complete’, in the sense that it preserves the ability to retrieve all topological adjacency relationships in constant time or in time proportional to the number of retrieved entities. Element and node are the only entities explicitly represented. Other topological entities, which include facet, edge, and vertex, are implicitly represented. In order to simplify accessing topological adjacency relationships, we also define and implicitly represent oriented entities, associated to the use of facets, edges, and vertices by an element. All implicit entities are represented by concrete types, being handled as values, which avoids usual problems encountered in other reduced data structures when performing operations such as entity enumeration and attribute attachment. We also extend the data structure with the use of ‘reverse indices’, which improves performance for extracting adjacency relationships while maintaining storage space within reasonable limits. The data structure effectiveness is demonstrated by two different applications: for supporting fragmentation simulation and for supporting volume rendering algorithms. Copyright 2005 John Wiley & Sons, Ltd.
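The "implicit entity" idea, storing only elements and nodes and deriving edges (plus a reverse index from edge to elements) on demand from ordered-node templates, can be sketched for triangles. This is a simplified illustration; the names and the dictionary representation are ours, not the paper's.

```python
def edge_use(elements, edge_template):
    """Derive implicit edge entities, and a reverse index edge -> elements,
    from elements stored as ordered node tuples. Edges are never stored
    explicitly; they are reconstructed on demand from the template."""
    edges = {}
    for e, nodes in enumerate(elements):
        for a, b in edge_template:
            key = tuple(sorted((nodes[a], nodes[b])))   # orientation-free key
            edges.setdefault(key, []).append(e)
    return edges

# two triangles sharing the edge (1, 2)
tris = [(0, 1, 2), (1, 3, 2)]
TRI_EDGES = [(0, 1), (1, 2), (2, 0)]   # local edge template for a triangle
edges = edge_use(tris, TRI_EDGES)
```

The same template mechanism extends to facets of tetrahedra, which is how a single framework can cover both 2D and 3D meshes.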

Journal ArticleDOI
TL;DR: A novel graph-theoretical formulation of the problem of automatically acquiring a generic 2D view-based class model from a set of images, each containing an exemplar object belonging to that class, and presents a shortest path-based approximation algorithm to yield an efficient solution.
Abstract: The recognition community has typically avoided bridging the representational gap between traditional, low-level image features and generic models. Instead, the gap has been artificially eliminated by either bringing the image closer to the models using simple scenes containing idealized, textureless objects or by bringing the models closer to the images using 3D CAD model templates or 2D appearance model templates. In this paper, we attempt to bridge the representational gap for the domain of model acquisition. Specifically, we address the problem of automatically acquiring a generic 2D view-based class model from a set of images, each containing an exemplar object belonging to that class. We introduce a novel graph-theoretical formulation of the problem in which we search for the lowest common abstraction among a set of lattices, each representing the space of all possible region groupings in a region adjacency graph representation of an input image. The problem is intractable and we present a shortest path-based approximation algorithm to yield an efficient solution. We demonstrate the approach on real imagery.

Journal ArticleDOI
TL;DR: The proposed method is based on a Markovian regularization of an elevation field defined on a region adjacency graph (RAG) obtained by oversegmenting the optical image and takes into account discontinuities of buildings thanks to an implicit edge process.
Abstract: This paper deals with the estimation of an elevation model using a pair of synthetic aperture radar (SAR) images and an optical image in semiurban areas. The proposed method is based on a Markovian regularization of an elevation field defined on a region adjacency graph (RAG). This RAG is obtained by oversegmenting the optical image. The support for elevation hypotheses is given by the structural matching of features extracted from both SAR images. The regularization model takes into account discontinuities of buildings thanks to an implicit edge process. Starting from a good initialization, optimization is obtained through an iterated conditional mode algorithm.

Journal ArticleDOI
TL;DR: In this paper, it was proved that the graph Zn is determined by its adjacency spectrum as well as its Laplacian spectrum.

Proceedings ArticleDOI
08 Jun 2005
TL;DR: This work presents SCGP, an algorithm for finding a single cluster of well-connected nodes in a graph that produces an approximate solution in O(N²) time by considering the spectral properties of the graph's adjacency matrix.
Abstract: We present SCGP, an algorithm for finding a single cluster of well-connected nodes in a graph. The general problem is NP-hard, but our algorithm produces an approximate solution in O(N²) time by considering the spectral properties of the graph's adjacency matrix. We show how this algorithm can be used to find sets of self-consistent hypotheses while rejecting incorrect hypotheses, a problem that frequently arises in robotics. We present results from a range-only SLAM system, a polynomial time data association algorithm, and a method for parametric line fitting that can outperform RANSAC.
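The spectral idea behind such single-cluster extraction can be sketched as follows: take the dominant eigenvector of the adjacency matrix and keep the nodes with large components. The thresholding rule here (half the maximum component) is an illustrative simplification of ours, not the paper's actual rule.

```python
import numpy as np

def single_cluster(adj):
    """Sketch of spectral single-cluster extraction: keep the nodes whose
    components in the dominant eigenvector of the (symmetric,
    non-negative) adjacency matrix exceed a simple threshold."""
    vals, vecs = np.linalg.eigh(adj)
    v = vecs[:, np.argmax(vals)]
    if v.sum() < 0:
        v = -v                          # Perron vector: make it non-negative
    return np.flatnonzero(v > 0.5 * v.max())

# a well-connected triangle {0, 1, 2} plus a weakly attached node 3
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
cluster = single_cluster(A)
```

In the hypothesis-filtering application, A would encode pairwise consistency between hypotheses, and the extracted cluster is the mutually consistent set.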

Book ChapterDOI
18 May 2005
TL;DR: This work proposes a local approach that computes clusters in graphs one at a time, relying only on the neighborhoods of the vertices included in the current cluster candidate, which enables a local and parameter-free algorithm.
Abstract: Most graph-theoretical clustering algorithms require the complete adjacency relation of the graph representing the examined data. This is infeasible for very large graphs currently emerging in many application areas. We propose a local approach that computes clusters in graphs, one at a time, relying only on the neighborhoods of the vertices included in the current cluster candidate. This enables implementing a local and parameter-free algorithm. Approximate clusters may be identified quickly by heuristic methods. We report experimental results on clustering graphs using simulated annealing.

Proceedings Article
01 Jan 2005
TL;DR: This article formulates the semi-supervised learning problem by a regularization approach, discretizes the resulting problem in function space by the sparse grid method, and solves the arising equations using the so-called combination technique.
Abstract: Sparse grids were recently introduced for classification and regression problems. In this article we apply the sparse grid approach to semi-supervised classification. We formulate the semi-supervised learning problem by a regularization approach. Here, besides a regression formulation for the labeled data, an additional term is involved which is based on the graph Laplacian for an adjacency graph of all labeled and unlabeled data points. It reflects the intrinsic geometric structure of the data distribution. We discretize the resulting problem in function space by the sparse grid method and solve the arising equations using the so-called combination technique. In contrast to recently proposed kernel-based methods, which currently scale cubically in the overall number of data points, our method scales only linearly, provided that a sparse graph Laplacian is used. This makes it possible to deal with huge data sets involving millions of points. We show experimental results with the new approach.
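The graph-Laplacian regularization term can be illustrated without the sparse-grid machinery: for a small labeled/unlabeled set, the regularized least-squares problem has a closed-form solution. This is a generic sketch of the Laplacian term only, not the paper's combination-technique solver, and the names are ours.

```python
import numpy as np

def laplacian_ssl(W, y, labeled, lam=1.0):
    """Laplacian-regularized least squares:
    minimize sum_{i in labeled} (f_i - y_i)^2 + lam * f^T L f,
    with L = D - W the combinatorial graph Laplacian. The minimizer
    solves (S + lam * L) f = S y, where S is the 0/1 diagonal selector
    of the labeled points."""
    L = np.diag(W.sum(axis=1)) - W
    S = np.zeros_like(W)
    S[labeled, labeled] = 1.0
    return np.linalg.solve(S + lam * L, S @ y)

# a chain of 5 points with only the endpoints labeled -1 and +1:
# the unlabeled interior is filled in smoothly along the graph
W = np.zeros((5, 5))
for i in range(4):
    W[i, i + 1] = W[i + 1, i] = 1.0
y = np.array([-1.0, 0.0, 0.0, 0.0, 1.0])
f = laplacian_ssl(W, y, labeled=[0, 4], lam=0.1)
```

The f^T L f term penalizes functions that vary across edges of the adjacency graph, which is exactly how the unlabeled points' geometry enters the problem.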

Proceedings ArticleDOI
18 Apr 2005
TL;DR: This paper presents a new approach to automatic segmentation of the foreground objects from a sequence of images by integrating techniques of background subtraction and motion-based segmentation in an elegant way to make foreground detection more accurate.
Abstract: This paper presents a new approach to automatic segmentation of the foreground objects from a sequence of images by integrating techniques of background subtraction and motion-based segmentation. At first, a background model is built to represent information of both color and motion of the background scene. Based on temporal and spatial information, an initial partition of each image is obtained. Next, we formulate the classification problem as a graph labeling over a region adjacency graph (RAG) based on the Markov random fields (MRFs) statistical framework. The Bhattacharyya distance, estimating the similarity between the color and motion distributions of the background model and the currently obtained regions, is used to model the likelihood energies. An object tracking strategy for finding the correspondence between regions at different time instants is used to maintain the temporal coherence of the segmentation. For spatial coherence, the length of the common boundaries of two regions is taken into consideration for classification. Both spatial and temporal coherence are incorporated into the prior energy to maintain the continuity of the segmentation. Finally, a labeling is obtained by maximizing a posterior probability of the MRFs. Under such formulation, we integrate two different kinds of framework in an elegant way to make foreground detection more accurate. Experimental results for two image sequences, including the hall monitoring and our e-home demo site, are provided to demonstrate the effectiveness of the proposed approach.

Patent
Ryan Rifkin, Stuart Andrews
14 Apr 2005
TL;DR: In this paper, a local-neighborhood Laplacian Eigenmap (LNLE) algorithm is proposed for semi-supervised learning on manifolds of data points in a high-dimensional space.
Abstract: A local-neighborhood Laplacian Eigenmap (LNLE) algorithm is provided for methods and systems for semi-supervised learning on manifolds of data points in a high-dimensional space. In one embodiment, an LNLE based method includes building an adjacency graph over a dataset of labelled and unlabelled points. The adjacency graph is then used for finding a set of local neighbors with respect to an unlabelled data point to be classified. An eigen decomposition of the local subgraph provides a smooth function over the subgraph. The smooth function can be evaluated and based on the function evaluation the unclassified data point can be labelled. In one embodiment, a transductive inference (TI) algorithmic approach is provided. In another embodiment, a semi-supervised inductive inference (SSII) algorithmic approach is provided for classification of subsequent data points. A confidence determination can be provided based on a number of labeled data points within the local neighborhood. Experimental results comparing LNLE and simple LE approaches are presented.

Journal ArticleDOI
TL;DR: A new linear integer programming formulation of adjacency constraints for the area restriction model is presented; these constraints are small in number and form a strong model for the adjacency problem.
Abstract: We present a new linear integer programming formulation of adjacency constraints for the area restriction model. These constraints are small in number and are a strong model for the adjacency problem. We describe constraint development, including strengthening and lifting, to improve the basic formulation. The model does not prohibit all adjacency violations, but computations show they are few in number. Using example forests ranging from 750 to more than 6000 polygons, optimization problems were solved and good solutions obtained in very short computational time.

Proceedings ArticleDOI
31 Jul 2005
TL;DR: This chapter provides an implementation of a DEC-friendly tetrahedral mesh data structure in C++ and documents the ideas behind the implementation.
Abstract: The methods of Discrete Exterior Calculus (DEC) have given birth to many new algorithms applicable to areas such as fluid simulation, deformable body simulation, and others. Despite the (possibly intimidating) mathematical theory that went into deriving these algorithms, in the end they lead to simple, elegant, and straightforward implementations. However, readers interested in implementing them should note that the algorithms presume the existence of a suitable simplicial complex data structure. Such a data structure needs to support local traversal of elements, adjacency information for all dimensions of simplices, a notion of a dual mesh, and all simplices must be oriented. Unfortunately, most publicly available tetrahedral mesh libraries provide only unoriented representations with little more than vertex-tet adjacency information (while we need vertex-edge, edge-triangle, edge-tet, etc.). For those eager to implement and build on the algorithms presented in this course without having to worry about these details, we provide an implementation of a DEC-friendly tetrahedral mesh data structure in C++. This chapter documents the ideas behind the implementation.

Journal ArticleDOI
TL;DR: A new classification method for vector-valued images is proposed, based on a causal Markovian model, defined on the hierarchy of a multiscale region adjacency tree (MRAT), and a set of nonparametric dissimilarity measures that express the data likelihoods.
Abstract: We propose a new classification method for vector-valued images, based on: 1) a causal Markovian model, defined on the hierarchy of a multiscale region adjacency tree (MRAT), and 2) a set of nonparametric dissimilarity measures that express the data likelihoods. The image classification is treated as a hierarchical labeling of the MRAT, using a finite set of interpretation labels (e.g., land cover classes). This is accomplished via a noniterative estimation of the modes of posterior marginals (MPM), inspired from existing approaches for Bayesian inference on the quadtree. The paper describes the main principles of our method and illustrates classification results on a set of artificial and remote sensing images, together with qualitative and quantitative comparisons with a variety of pixel-based techniques that follow the Bayesian-Markovian framework either on hierarchical structures or the original image lattice.

Journal ArticleDOI
TL;DR: In this paper, the continuous Laplacian on an infinite locally finite network with equal edge lengths is considered, under natural transition conditions such as continuity at the ramification nodes and classical Kirchhoff conditions at all vertices.
Abstract: We consider the continuous Laplacian on an infinite locally finite network with equal edge lengths under natural transition conditions such as continuity at the ramification nodes and classical Kirchhoff conditions at all vertices. It is shown that eigenvalues of the Laplacian in an L∞-setting are closely related to those of the adjacency and transition operators of the network. In this way the point spectrum is determined completely in terms of combinatorial quantities and properties of the underlying graph, as in the finite case [2]. Moreover, the occurrence of infinite geometric multiplicity on trees and some periodic graphs is investigated.

Patent
14 Jul 2005
TL;DR: In this paper, the authors treat a protected forwarding adjacency (FA) as a dynamic entity in that it allows a backup tunnel associated with the FA to carry traffic for the FA, when its primary tunnel has failed, up to a predetermined amount of time.
Abstract: A technique treats a protected forwarding adjacency (FA) as a dynamic entity in that it allows a backup tunnel associated with the FA to carry traffic for the FA, when its primary tunnel has failed, up to a predetermined amount of time. If after the predetermined amount of time has elapsed the FA has not recovered (e.g., the primary tunnel has not been reestablished), a network topology change is automatically triggered, causing the network to converge on a new network topology. By triggering the network topology change, a path that is more optimal than the path associated with the backup tunnel may be subsequently determined to carry the traffic.

Proceedings ArticleDOI
14 May 2005
TL;DR: A fairly simple and theoretically validated energy model is presented for computing hierarchical graph layouts, i.e. positions of the nodes in two- or three-dimensional space.
Abstract: Hierarchical graphs are widely used as models of the structure of software systems. A central problem in the visualization of hierarchical graphs is the computation of layouts, i.e. of positions of the nodes in two- or three-dimensional space. We derive requirements for graph layouts from various software analysis questions, and classify the required layouts along three dimensions: layouts with meaningful distances between single nodes vs. layouts with meaningful distances between groups of nodes, layouts reflecting adjacency vs. layouts reflecting hierarchy, and layouts that faithfully reflect the size of subgraphs vs. layouts where certain subgraphs are magnified. We present a fairly simple and theoretically validated energy model for computing such layouts.

Journal ArticleDOI
TL;DR: A new matrix called the adjusted adjacency matrix is proposed that meets the requirement that a graph contain at least one distinct eigenvalue; the eigensystem approach based on it is shown to be not only effective but also more efficient than that based on the adjacency matrix.
Abstract: Many science and engineering problems can be represented by a network, a generalization of which is a graph. Examples of problems that can be represented by a graph include cyclic sequential circuits, organic molecule structures, mechanical structures, etc. The most fundamental issue with these problems (e.g., designing a molecule structure) is the identification of structure, which further reduces to the identification of a graph. The problem of identifying a graph is called graph isomorphism. The graph isomorphism problem is an NP problem according to computational complexity theory. Numerous methods and algorithms have been proposed to solve this problem. Elsewhere we presented an approach called the eigensystem approach. This approach is based on a combination of eigenvalues and eigenvectors, which are in turn associated with the adjacency matrix. The eigensystem approach has been shown to be very effective but requires that a graph contain at least one distinct eigenvalue. The adjacency matrix has not been shown to meet this requirement in general. In this paper, we propose a new matrix, called the adjusted adjacency matrix, that meets this requirement. We show that the eigensystem approach based on the adjusted adjacency matrix is not only effective but also more efficient than that based on the adjacency matrix.
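The spectral idea underlying such approaches is that the characteristic polynomial of the adjacency matrix is a graph invariant: isomorphic graphs always share it, so differing polynomials certify non-isomorphism (though the converse fails in general, since cospectral non-isomorphic graphs exist). The sketch below computes it exactly with the standard Faddeev-LeVerrier recurrence; it illustrates plain adjacency spectra, not the paper's adjusted adjacency matrix:

```python
from fractions import Fraction

def char_poly(A):
    """Coefficients [1, c1, ..., cn] of det(xI - A), computed exactly
    via the Faddeev-LeVerrier recurrence:
        M_0 = I,  M_k = A*M_{k-1} + c_{k-1}*I,  c_k = -tr(A*M_{k-1}) / k
    The result is a graph invariant when A is an adjacency matrix."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M0 = I
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][t] * M[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k   # c_k = -tr(A M_{k-1}) / k
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs
```

For the 3-vertex path the polynomial is x^3 - 2x regardless of vertex labeling, while the triangle gives x^3 - 3x - 2, so the two are immediately distinguished.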

01 Jan 2005
TL;DR: This paper presents the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach, and proposes a novel and unique embedding list representation that is suitable for describing embedded subtrees.
Abstract: Tree mining has many useful applications in areas such as Bioinformatics, XML mining, Web mining, etc. In general, most of the formally represented information in these domains is in a tree-structured form. In this paper we focus on mining frequent embedded subtrees from databases of rooted labeled ordered trees. We propose a novel and unique embedding list representation that is suitable for describing embedded subtrees. This representation is completely different from the string-like or conventional adjacency list representations previously utilized for trees. We present the mathematical model of a breadth-first-search Tree Model Guided (TMG) candidate generation approach previously introduced in [8]. The key characteristic of the TMG approach is that, in contrast to the join approach, it enumerates fewer candidates by ensuring that only valid candidates conforming to the structural aspects of the data are generated. Our experiments with both synthetic and real-life datasets provide comparisons against one of the state-of-the-art algorithms, TreeMiner [15], and demonstrate the effectiveness and efficiency of the technique.
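For contrast with the proposed embedding list, the string-like representation mentioned above (the style used by TreeMiner) can be sketched as a depth-first traversal that emits a node's label on descent and a backtrack marker (-1) on each return; the dict-based tree layout below is chosen for illustration only and is not the paper's data structure:

```python
def encode_preorder(tree, root):
    """String-like encoding of a rooted labeled ordered tree: preorder
    labels interleaved with -1 backtrack markers. `tree` maps a node id
    to (label, [ordered child ids])."""
    label, children = tree[root]
    out = [label]
    for child in children:
        out.extend(encode_preorder(tree, child))
        out.append(-1)                 # done with this subtree: backtrack
    return out

# a rooted ordered tree: A with children B and C, where C has child B
tree = {0: ("A", [1, 2]), 1: ("B", []), 2: ("C", [3]), 3: ("B", [])}
```

Such encodings flatten the tree into a sequence, whereas an embedding list keeps, per node, the chain of ancestors needed to enumerate embedded (ancestor-descendant) subtrees directly.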

Journal ArticleDOI
TL;DR: A unified method is developed for calculating the eigenvalues of the weighted adjacency and Laplacian matrices of three different graph products, which have many applications in computational mechanics.
Abstract: In this paper, a unified method is developed for calculating the eigenvalues of the weighted adjacency and Laplacian matrices of three different graph products. These products have many applications in computational mechanics, such as ordering, graph partitioning, and subdomaining of finite element models.
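For the Cartesian graph product, the factor-by-factor approach rests on the identity that its adjacency matrix is the Kronecker sum A⊗I + I⊗B, whose eigenvalues are exactly the pairwise sums λᵢ + μⱼ of the factors' eigenvalues. A minimal sketch of that construction (pure Python with dense lists; an illustration of the identity, not the paper's method):

```python
def kron_sum(A, B):
    """Adjacency matrix of the Cartesian product of two graphs, formed
    as the Kronecker sum A (x) I + I (x) B. Vertex (i, j) of the product
    maps to index i*m + j."""
    n, m = len(A), len(B)
    S = [[0] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(m):
            for k in range(n):
                S[i * m + j][k * m + j] += A[i][k]   # A (x) I part
            for l in range(m):
                S[i * m + j][i * m + l] += B[j][l]   # I (x) B part
    return S

K2 = [[0, 1], [1, 0]]        # a single edge; eigenvalues +1 and -1
C4 = kron_sum(K2, K2)        # K2 x K2 is the 4-cycle
```

For example, the all-ones vector is an eigenvector of K2 with eigenvalue 1, so its Kronecker square is an eigenvector of the product with eigenvalue 1 + 1 = 2, which is why the product's spectrum never has to be computed from scratch.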

Journal ArticleDOI
TL;DR: Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics, and SpectralNET is a flexible application for analyzing and visualizing these biological and chemical networks.
Abstract: Graph theory provides a computational framework for modeling a variety of datasets including those emerging from genomics, proteomics, and chemical genetics. Networks of genes, proteins, small molecules, or other objects of study can be represented as graphs of nodes (vertices) and interactions (edges) that can carry different weights. SpectralNET is a flexible application for analyzing and visualizing these biological and chemical networks. Available both as a standalone .NET executable and as an ASP.NET web application, SpectralNET was designed specifically with the analysis of graph-theoretic metrics in mind, a computational task not easily accessible using currently available applications. Users can choose either to upload a network for analysis using a variety of input formats, or to have SpectralNET generate an idealized random network for comparison to a real-world dataset. Whichever graph-generation method is used, SpectralNET displays detailed information about each connected component of the graph, including graphs of degree distribution, clustering coefficient by degree, and average distance by degree. In addition, extensive information about the selected vertex is shown, including degree, clustering coefficient, various distance metrics, and the corresponding components of the adjacency, Laplacian, and normalized Laplacian eigenvectors. SpectralNET also displays several graph visualizations, including a linear dimensionality reduction for uploaded datasets (Principal Components Analysis) and a non-linear dimensionality reduction that provides an elegant view of global graph structure (Laplacian eigenvectors). SpectralNET provides an easily accessible means of analyzing graph-theoretic metrics for data modeling and dimensionality reduction. SpectralNET is publicly available as both a .NET application and an ASP.NET web application from http://chembank.broad.harvard.edu/resources/ . Source code is available upon request.