Journal ArticleDOI

Network sampling and classification: An investigation of network model representations

TL;DR: It is argued that conclusions based on simulated network studies must focus on the full features of the connectivity patterns of a network instead of on the limited set of network metrics for a specific network type.
Abstract: Methods for generating a random sample of networks with desired properties are important tools for the analysis of social, biological, and information networks. Algorithm-based approaches to sampling networks have received a great deal of attention in recent literature. Most of these algorithms are based on simple intuitions that associate the full features of connectivity patterns with specific values of only one or two network metrics. Substantive conclusions are crucially dependent on this association holding true. However, the extent to which this simple intuition holds true is not yet known. In this paper, we examine the association between the connectivity patterns that a network sampling algorithm aims to generate and the connectivity patterns of the generated networks, measured by an existing set of popular network metrics. We find that different network sampling algorithms can yield networks with similar connectivity patterns. We also find that the alternative algorithms for the same connectivity pattern can yield networks with different connectivity patterns. We argue that conclusions based on simulated network studies must focus on the full features of the connectivity patterns of a network instead of on the limited set of network metrics for a specific network type. This fact has important implications for network data analysis: for instance, implications related to the way significance is currently assessed.
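The abstract's central claim, that one or two metrics do not pin down a connectivity pattern, can be illustrated with a small sketch. The code below is not the paper's procedure: the two samplers (a ring lattice and a uniform random graph with a fixed edge count), the sizes `n` and `k`, and the choice of density and triangle count as metrics are all assumptions made for illustration.

```python
import random
from itertools import combinations

def ring_lattice(n, k):
    """Each node links to its k nearest ring neighbours (k even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def random_graph(n, m, rng):
    """Uniform random graph with exactly m edges."""
    adj = {v: set() for v in range(n)}
    for u, v in rng.sample(list(combinations(range(n), 2)), m):
        adj[u].add(v)
        adj[v].add(u)
    return adj

def density(adj):
    """Fraction of possible node pairs that are connected."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    return 2 * m / (n * (n - 1))

def triangles(adj):
    """Number of triangles; each one is seen once from each corner."""
    closed = sum(1 for nb in adj.values()
                 for a, b in combinations(nb, 2) if b in adj[a])
    return closed // 3

rng = random.Random(0)          # fixed seed, illustrative only
n, k = 200, 6                   # hypothetical sizes
lattice = ring_lattice(n, k)
rand = random_graph(n, n * k // 2, rng)

print("density  :", density(lattice), density(rand))
print("triangles:", triangles(lattice), triangles(rand))
```

Both graphs have the same number of nodes and edges, so a density-only comparison cannot separate them, while the triangle counts differ sharply. This is one concrete instance of the kind of mismatch the abstract warns about.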


Citations
Posted Content
TL;DR: Two novel approaches for characterizing functional network connectivity from electroencephalography (EEG) are provided, one of which represents functional connectivity structure through the distribution of eigenvalues making up channel coherence matrices in multiple frequency bands and the other uses a connectivity matrix at each frequency band.
Abstract: Studies in recent years have demonstrated that neural organization and structure impact an individual's ability to perform a given task. Specifically, more efficient functional networks have been shown to produce better performance. We apply this principle to evaluation of a working memory task by providing two novel approaches for characterizing functional network connectivity from electroencephalography (EEG). Our first approach represents functional connectivity structure through the distribution of eigenvalues making up channel coherence matrices in multiple frequency bands. Our second approach uses a connectivity matrix at each frequency band, assessing variability in average path lengths and degree across the network. We also use features based on the pattern of frequency band power across the EEG channels. Failures in digit and sentence recall on single trials are detected using a Gaussian classifier for each feature set at each frequency band. The classifier results are then fused across frequency bands, with the resulting detection performance summarized using the area under the receiver operating characteristic curve (AUC) statistic. Fused AUC results of 0.63/0.58/0.61 for digit recall failure and 0.57/0.59/0.47 for sentence recall failure are obtained from the connectivity structure, graph variability, and channel power features respectively.

1 citations


Cites background from "Network sampling and classification..."

  • ...As in [2], we consider statistics of the vertex-based features (average path length, and degree) across all vertices in the graph....

    [...]

Proceedings ArticleDOI
01 Sep 2018
TL;DR: This study uses the graph representation combined with Random Forest to discriminate between Erdos-Renyi, Stochastic Block Model and Planted Clique models, and combines this representation with a Squared Mahalanobis Distance-based test to reject a model given an observed network.
Abstract: We present a novel approach of graph representation based on mutual information of a random walk in a graph. This representation, as any global metric of a graph, can be used to identify the model generator of the observed network. In this study, we use our graph representation combined with Random Forest (RF) to discriminate between Erdos-Renyi (ER), Stochastic Block Model (SBM) and Planted Clique (PC) models. We also combine our graph representation with a Squared Mahalanobis Distance (SMD)-based test to reject a model given an observed network. We test the proposed method with computer simulations.

1 citations


Cites background from "Network sampling and classification..."

  • ...In [1], the authors include 47 measures that combine local and global graph characteristics....

    [...]

12 Mar 2015
TL;DR: This paper proposes a methodology for exploring researchers' curricula vitae, using software to build data spreadsheets, count co-authorships, and analyze social networks.
Abstract: This study, supported by CNPq, aimed to develop markers for evaluating interactive work processes in research networks. It is based on the understanding that this type of network is established when a group collaborates with the intention of producing knowledge. Starting from the theory, a methodology was developed for exploring researchers' curricula vitae, using software to build data spreadsheets, count co-authorships, and analyze social networks. The researchers' bibliographic output was examined through graphs representing their research collaboration networks, and a protocol was built to evaluate them. The results identify 10 qualitative-quantitative markers/indicators for evaluating networked research processes, which were tested and validated in an application context. Keywords: Evaluation. Research networks. Scientific collaboration. Indicators. Link to the full text (PDF): http://www.revistas.ufg.br/index.php/ci/article/view/31820

1 citations

Journal ArticleDOI
TL;DR: In this article, a decision tree based generative model selection for complex networks (GMSCN) is proposed to select the model that is able to generate graphs similar to a given network instance.
Abstract: Real networks exhibit nontrivial topological features such as heavy-tailed degree distribution, high clustering, and small-worldness. Researchers have developed several generative models for synthesizing artificial networks that are structurally similar to real networks. An important research problem is to identify the generative model that best fits a target network. In this paper, we investigate this problem, and our goal is to select the model that is able to generate graphs similar to a given network instance. By means of generating synthetic networks with seven outstanding generative models, we have utilized machine learning methods to develop a decision tree for model selection. Our proposed method, which is named "Generative Model Selection for Complex Networks" (GMSCN), outperforms existing methods with respect to accuracy, scalability and size-independence.

1 citations

References
Journal ArticleDOI
04 Jun 1998-Nature
TL;DR: Simple models of networks that can be tuned through this middle ground: regular networks ‘rewired’ to introduce increasing amounts of disorder are explored, finding that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.
Abstract: Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
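The rewiring idea in this abstract can be sketched in a few lines. The following is a simplified, assumed variant of the procedure (each lattice edge is moved to a uniformly chosen non-neighbour with probability `p`), not a verified reproduction of the original algorithm; the function names, the sizes `n` and `k`, and the seed are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

def rewired_ring(n, k, p, rng):
    """Ring of n nodes tied to their k nearest neighbours (k even);
    each lattice edge is rewired with probability p."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            adj[v].add((v + step) % n)
            adj[(v + step) % n].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if rng.random() < p and w in adj[v]:
                # candidates exclude self-loops and duplicate edges
                choices = [u for u in range(n) if u != v and u not in adj[v]]
                if choices:
                    new = rng.choice(choices)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

def average_clustering(adj):
    """Mean fraction of each node's neighbour pairs that are linked."""
    total = 0.0
    for nb in adj.values():
        d = len(nb)
        if d >= 2:
            links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
            total += 2 * links / (d * (d - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs (BFS)."""
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

rng = random.Random(0)
regular = rewired_ring(300, 6, 0.0, rng)   # p = 0: pure lattice
small = rewired_ring(300, 6, 0.1, rng)     # a little disorder
for name, g in (("regular", regular), ("rewired", small)):
    print(name, average_clustering(g), average_path_length(g))
```

Under these assumptions, the lightly rewired graph keeps most of the lattice's clustering while its average path length falls well below the lattice value, which is the combination the abstract calls "small-world".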

39,297 citations


"Network sampling and classification..." refers background in this paper

  • ...An analysis would then claim, for example, that scale-free networks are characterized by having a power-law degree distribution [40]....

    [...]

  • ...[33], and it “sounds” like a plausible explanation [33,40]....

    [...]

  • ...Small world [40] Θ=(n,k,pn) Nodes, neighbors, pr rewire 2....

    [...]

  • ...Algorithm-based approaches to sampling networks [3,34,40] have received a great deal of attention in recent literature [9,11,15,30,41]....

    [...]

  • ...(Small world) Each node is connected to several of its neighbors and a few distant nodes, according to the ring-induced distance [40] (Fig....

    [...]

Journal ArticleDOI
15 Oct 1999-Science
TL;DR: A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
Abstract: Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
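The two mechanisms named in the abstract, growth and preferential attachment, can be sketched as follows. This is a minimal illustration under assumptions: the seed graph (a complete graph on `m + 1` nodes), the parameter names `n` and `m`, and the fixed random seed are hypothetical choices, and the parameterization differs from the `(n, n0, p0, pn)` form quoted in the citing snippet below.

```python
import random
from collections import Counter

def preferential_attachment(n, m, rng):
    """Grow a graph to n nodes; each new node attaches m edges to
    distinct existing nodes chosen with probability proportional
    to their current degree."""
    # Assumed seed: a complete graph on the first m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Every node appears in `ends` once per incident edge, so a uniform
    # draw from `ends` is a degree-proportional draw over nodes.
    ends = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

rng = random.Random(1)
n, m = 2000, 2
edges = preferential_attachment(n, m, rng)
degree = Counter(x for e in edges for x in e)
mean_degree = 2 * len(edges) / n
max_degree = max(degree.values())
print("mean degree:", mean_degree, "max degree:", max_degree)
```

In a run like this the largest degree sits far above the mean, the heavy-tailed signature the abstract describes. Confirming an actual power law would need a proper fit to the degree distribution, which this sketch does not attempt.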

33,771 citations


"Network sampling and classification..." refers background in this paper

  • ...Scale free [7] Θ=(n,n0,p0,pn) Nodes, init nodes, pr init edge, pr edge 4....

    [...]

Book
28 Jul 2013
TL;DR: In this paper, the authors describe the important ideas in these areas in a common conceptual framework, and the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Abstract: During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

19,261 citations

Journal ArticleDOI
TL;DR: In this paper, a simple model based on the power-law degree distribution of real networks was proposed, which was able to reproduce the power law degree distribution in real networks and to capture the evolution of networks, not just their static topology.
Abstract: The emergence of order in natural systems is a constant source of inspiration for both physical and biological sciences. While the spatial order characterizing, for example, crystals has been the basis of many advances in contemporary physics, most complex systems in nature do not offer such a high degree of order. Many of these systems form complex networks whose nodes are the elements of the system and edges represent the interactions between them. Traditionally complex networks have been described by the random graph theory founded in 1959 by Paul Erdős and Alfréd Rényi. One of the defining features of random graphs is that they are statistically homogeneous, and their degree distribution (characterizing the spread in the number of edges starting from a node) is a Poisson distribution. In contrast, recent empirical studies, including the work of our group, indicate that the topology of real networks is much richer than that of random graphs. In particular, the degree distribution of real networks is a power-law, indicating a heterogeneous topology in which the majority of the nodes have a small degree, but there is a significant fraction of highly connected nodes that play an important role in the connectivity of the network. The scale-free topology of real networks has very important consequences on their functioning. For example, we have discovered that scale-free networks are extremely resilient to the random disruption of their nodes. On the other hand, the selective removal of the nodes with highest degree induces a rapid breakdown of the network to isolated subparts that cannot communicate with each other. The non-trivial scaling of the degree distribution of real networks is also an indication of their assembly and evolution. Indeed, our modeling studies have shown us that there are general principles governing the evolution of networks. Most networks start from a small seed and grow by the addition of new nodes which attach to the nodes already in the system. This process obeys preferential attachment: the new nodes are more likely to connect to nodes with already high degree. We have proposed a simple model based on these two principles which was able to reproduce the power-law degree distribution of real networks. Perhaps even more importantly, this model paved the way to a new paradigm of network modeling, trying to capture the evolution of networks, not just their static topology.

18,415 citations

Book
25 Nov 1994
TL;DR: This paper presents mathematical representation of social networks in the social and behavioral sciences through the lens of Dyadic and Triadic Interaction Models, which describes the relationships between actor and group measures and the structure of networks.
Abstract: Part I. Introduction: Networks, Relations, and Structure: 1. Relations and networks in the social and behavioral sciences 2. Social network data: collection and application Part II. Mathematical Representations of Social Networks: 3. Notation 4. Graphs and matrices Part III. Structural and Locational Properties: 5. Centrality, prestige, and related actor and group measures 6. Structural balance, clusterability, and transitivity 7. Cohesive subgroups 8. Affiliations, co-memberships, and overlapping subgroups Part IV. Roles and Positions: 9. Structural equivalence 10. Blockmodels 11. Relational algebras 12. Network positions and roles Part V. Dyadic and Triadic Methods: 13. Dyads 14. Triads Part VI. Statistical Dyadic Interaction Models: 15. Statistical analysis of single relational networks 16. Stochastic blockmodels and goodness-of-fit indices Part VII. Epilogue: 17. Future directions.

17,104 citations