Journal ArticleDOI

Network sampling and classification: An investigation of network model representations

TL;DR: It is argued that conclusions based on simulated network studies must focus on the full features of the connectivity patterns of a network instead of on the limited set of network metrics for a specific network type.
Abstract: Methods for generating a random sample of networks with desired properties are important tools for the analysis of social, biological, and information networks. Algorithm-based approaches to sampling networks have received a great deal of attention in recent literature. Most of these algorithms are based on simple intuitions that associate the full features of connectivity patterns with specific values of only one or two network metrics. Substantive conclusions are crucially dependent on this association holding true. However, the extent to which this simple intuition holds true is not yet known. In this paper, we examine the association between the connectivity patterns that a network sampling algorithm aims to generate and the connectivity patterns of the generated networks, measured by an existing set of popular network metrics. We find that different network sampling algorithms can yield networks with similar connectivity patterns. We also find that alternative algorithms for the same connectivity pattern can yield networks with different connectivity patterns. We argue that conclusions based on simulated network studies must focus on the full features of the connectivity patterns of a network instead of on a limited set of network metrics for a specific network type. This fact has important implications for network data analysis: for instance, implications related to the way significance is currently assessed.
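The kind of check the abstract describes can be sketched in a few lines: draw samples from two different generators and summarize each sample with a handful of popular metrics. This is not the authors' code; the networkx generators, sample sizes, and metric set are illustrative assumptions.

```python
# A minimal sketch, assuming networkx generators stand in for the paper's
# sampling algorithms; parameters and the metric set are illustrative.
import networkx as nx

def metric_profile(G):
    """Three popular connectivity metrics of the kind the paper examines."""
    return {
        "avg_clustering": nx.average_clustering(G),
        "avg_path_length": nx.average_shortest_path_length(G),
        "assortativity": nx.degree_assortativity_coefficient(G),
    }

n = 500
samples = {
    "small_world": [nx.connected_watts_strogatz_graph(n, 6, 0.1, seed=s) for s in range(10)],
    "scale_free": [nx.barabasi_albert_graph(n, 3, seed=s) for s in range(10)],
}

for name, graphs in samples.items():
    profiles = [metric_profile(G) for G in graphs]
    means = {k: sum(p[k] for p in profiles) / len(profiles) for k in profiles[0]}
    print(name, means)
```

If two generators aimed at different connectivity patterns land close together on such a profile, relying on one or two metrics to certify the network type becomes questionable, which is the paper's point.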


Citations
Journal ArticleDOI
TL;DR: The proposed method provides a principled approach to incorporating uncertainty into predictions used in the design of network-based interventions, as developing such interventions often requires predicting the network structure in the presence and absence of the intervention.
Abstract: We present a statistical framework for generating predicted dynamic networks based on the observed evolution of social relationships in a population. The framework includes a novel and flexible procedure to sample dynamic networks given a probability distribution on evolving network properties; it permits the use of a broad class of approaches to model trends, seasonal variability, uncertainty, and changes in population composition. Current methods do not account for the variability in the observed historical networks when predicting the network structure; the proposed method provides a principled approach to incorporating uncertainty in prediction. This advance aids the design of network-based interventions, as their development often requires predicting the network structure in the presence and absence of the intervention. Two simulation studies demonstrate the usefulness of generating predicted networks when designing network-based interventions. The framework is also illustrated by investigating the effects of potential interventions on bill passage rates, using a dynamic network of sponsor/co-sponsor relationships among senators derived from bills introduced in the US Senate from 2003 to 2016.
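A minimal sketch of the sampling idea, under strong simplifying assumptions (one evolving property, edge density, with a linear trend; this is an illustration, not the authors' procedure): sample a trajectory of the property, then sample a network consistent with each value, so that uncertainty in the property propagates into the predicted networks.

```python
# A minimal sketch, assuming edge density is the evolving network property;
# the real framework handles a broad class of properties and models.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
n_nodes, horizon = 50, 12

def sample_predicted_trajectory(base_density=0.10, trend=0.005, sd=0.01):
    """Sample one future trajectory: a property path, then one network per step."""
    networks = []
    for t in range(horizon):
        density = rng.normal(base_density + trend * t, sd)  # uncertainty in the property
        density = float(np.clip(density, 0.0, 1.0))
        networks.append(nx.gnp_random_graph(n_nodes, density, seed=int(rng.integers(10**9))))
    return networks

# Repeated draws give a distribution over future structures; an intervention
# could be represented, e.g., as a shifted trend, and the two sets compared.
trajectories = [sample_predicted_trajectory() for _ in range(100)]
```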

5 citations


Cites methods or results from "Network sampling and classification..."

  • ...This approach shares similarities with the network classification method by Airoldi et al. (2011). We visually inspect the differences, but statistical tests can be applied, such as a Chi-squared test if the values can be binned....

Journal ArticleDOI
TL;DR: The proposed "six-element" analysis method provides a basis for decision making and an analytical approach for identifying core terrorists and key terrorist organizations, determining key alert periods, locating counter-terrorism efforts, and giving early warning of the means of major terrorist activities.
Abstract: The rapid development of social network theory provides a new perspective for research on counter-terrorism; however, current research mostly concerns terrorists and terrorist organizations. First, a "six-element" analysis method for terrorist activities based on social networks is proposed in this paper: a variety of sub-networks are constructed according to the correlations among the six elements (people, organization, time, location, manner, and event). These sub-networks are assessed using centrality analysis, cohesive subgroup analysis, spatial correlation analysis, invulnerability analysis, and descriptive statistical analysis, and the characteristics and regularities of terrorist activities are revealed from several different perspectives. Then, the "six-element" analysis method is applied in an empirical study of "East Turkistan" terrorist activities since the founding of the People's Republic of China, in order to identify core people and key organizations of the "East Turkistan" terrorist activity network, to assess the invulnerability of that network, and to reveal the temporal and spatial distribution patterns as well as the characteristics of the means and manners adopted in previous terrorist activities. Lastly, the analysis results are interpreted qualitatively. This research provides a basis for decision making and an analytical method for identifying core terrorists and key terrorist organizations, determining key alert periods, locating counter-terrorism efforts, and giving early warning of the means of major terrorist activities.
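To make the sub-network construction concrete, here is a toy sketch with hypothetical data (the records, the choice of elements, and the use of networkx are assumptions, not the paper's software): build a two-mode person-event network, project it onto people, and rank nodes by centrality to flag candidate core members.

```python
# A minimal sketch with hypothetical person-event records; networkx stands in
# for the social network analysis software the paper's workflow relies on.
import networkx as nx

records = [("p1", "e1"), ("p1", "e2"), ("p2", "e1"), ("p3", "e2"), ("p3", "e3")]

B = nx.Graph(records)                 # two-mode (person-event) sub-network
people = {p for p, _ in records}
P = nx.bipartite.weighted_projected_graph(B, people)  # person-person sub-network

# Centrality analysis: the highest-centrality people are candidate core nodes.
print(sorted(nx.degree_centrality(P).items(), key=lambda kv: -kv[1]))
```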

4 citations


Cites methods from "Network sampling and classification..."

  • ...The network data, which are made up of these different node sets and their correlations, can construct various network graphs by using such social network analysis software as UCINET and ORA (Airoldi et al. 2011; Carley and Dereno 2011; Carrington et al. 2005) and inputting the matrix constructed by the above method....

Journal ArticleDOI
TL;DR: In this paper, the authors propose a quantitative dissimilarity metric for weighted networks (WD-metric) based on the D-measure, which was proposed for unweighted networks, and construct a distance probability matrix of the weighted network.
Abstract: Measuring the dissimilarity between networks is a basic problem that is widely encountered in many fields. Building on the D-measure, which was proposed for unweighted networks, we propose a quantitative dissimilarity metric for weighted networks (WD-metric). Crucially, we construct a distance probability matrix of the weighted network, which captures comprehensive information about the weighted network. Moreover, we define the complementary graph and alpha centrality of a weighted network. Several synthetic and real-world networks are used to verify the effectiveness of the WD-metric. Experimental results show that the WD-metric can effectively capture the influence of weights on network structure and quantitatively measure the dissimilarity of weighted networks. It can also be used as a criterion for backbone-extraction algorithms for complex networks.
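The main ingredient the WD-metric inherits from the D-measure, comparing networks through their node-distance distributions, can be sketched as follows; the unweighted simplification, the generators, and the truncation length are assumptions for illustration only.

```python
# A minimal sketch, assuming an unweighted simplification of the distance
# distribution; the WD-metric itself is built on weighted distances.
import numpy as np
import networkx as nx
from scipy.spatial.distance import jensenshannon

def distance_distribution(G, max_len):
    """P(d): fraction of connected node pairs at shortest-path distance d.
    Distances beyond max_len are truncated (an assumption for simplicity)."""
    counts = np.zeros(max_len + 1)
    for _, lengths in nx.shortest_path_length(G):
        for d in lengths.values():
            if 0 < d <= max_len:
                counts[d] += 1
    return counts / counts.sum()

G1 = nx.connected_watts_strogatz_graph(200, 6, 0.05, seed=1)
G2 = nx.barabasi_albert_graph(200, 3, seed=1)
L = 30  # assumed common support for both distributions
print(jensenshannon(distance_distribution(G1, L), distance_distribution(G2, L)))
```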

3 citations

Journal ArticleDOI
TL;DR: Simulated data show that tests based on the proposed permutation approach to the two-sample testing problem for network-valued data exhibit statistical power that is either the best, or second-best but very close to the best, across a variety of possible alternative hypotheses and competing statistics.
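A bare-bones permutation two-sample test for network-valued data might look like the sketch below. The summary statistic used here, mean clustering, is exactly the kind of single-indicator comparison the authors argue can cost power relative to whole-structure statistics; the data and parameters are illustrative assumptions.

```python
# A minimal sketch, assuming mean clustering as the summary statistic; the
# samples are synthetic stand-ins for two populations of networks.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
sample_a = [nx.watts_strogatz_graph(100, 6, 0.05, seed=s) for s in range(20)]
sample_b = [nx.watts_strogatz_graph(100, 6, 0.30, seed=s) for s in range(20)]

stats = np.array([nx.average_clustering(G) for G in sample_a + sample_b])
labels = np.array([0] * len(sample_a) + [1] * len(sample_b))

def gap(lab):
    """Absolute difference in group means of the summary statistic."""
    return abs(stats[lab == 0].mean() - stats[lab == 1].mean())

observed = gap(labels)
null = [gap(rng.permutation(labels)) for _ in range(2000)]
print("p-value:", np.mean([g >= observed for g in null]))
```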

3 citations


Cites methods from "Network sampling and classification..."

  • ...The goal is to demonstrate that using summary indicators (e.g. clustering coefficient) to compare samples of networks, which is the most popular approach (e.g. Airoldi et al., 2011), could yield less powerful test procedures with respect to using the entire network structures....


Posted Content
TL;DR: This paper employs distance metric learning algorithms in order to construct an integrated distance metric for comparing structural properties of complex networks, and applies it as the distance metric in K-nearest-neighbors classification.
Abstract: Graph comparison plays a major role in many network applications. We often need a similarity metric for comparing networks according to their structural properties. Various network features, such as degree distribution and clustering coefficient, provide measurements for comparing networks from different points of view, but a global and integrated distance metric is still missing. In this paper, we employ distance metric learning algorithms in order to construct an integrated distance metric for comparing structural properties of complex networks. Using natural witnesses of network similarity (such as network categories), the distance metric is learned by means of a dataset of labeled real networks. To evaluate the proposed method, called NetDistance, we apply it as the distance metric in k-nearest-neighbors classification. Empirical results show that NetDistance outperforms previous methods by at least 20 percent with respect to precision.
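The feature-vector-plus-KNN pipeline can be sketched as follows, with toy stand-ins: three metrics instead of the 47 cited from the background work, synthetic category labels, and scikit-learn's default Euclidean distance in place of the learned metric.

```python
# A minimal sketch, assuming three metrics (NetDistance uses far more) and
# scikit-learn's default Euclidean distance instead of the learned metric.
import numpy as np
import networkx as nx
from sklearn.neighbors import KNeighborsClassifier

def features(G):
    """A tiny structural feature vector standing in for the full metric set."""
    degs = [d for _, d in G.degree()]
    return [nx.average_clustering(G), float(np.mean(degs)), float(np.std(degs))]

graphs, labels = [], []
for s in range(30):
    graphs.append(nx.barabasi_albert_graph(200, 3, seed=s)); labels.append("scale_free")
    graphs.append(nx.watts_strogatz_graph(200, 6, 0.1, seed=s)); labels.append("small_world")

X = np.array([features(G) for G in graphs])
clf = KNeighborsClassifier(n_neighbors=5).fit(X, labels)
print("training accuracy:", clf.score(X, labels))  # illustration only
```

In NetDistance proper, the Euclidean distance would be replaced by the metric learned from labeled real networks.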

3 citations


Cites background from "Network sampling and classification..."

  • ...[40] propose a vector of 47 metrics as the feature vector....


References
Journal ArticleDOI
04 Jun 1998-Nature
TL;DR: Simple models of networks that can be tuned through this middle ground are explored: regular networks 'rewired' to introduce increasing amounts of disorder. These systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs.
Abstract: Networks of coupled dynamical systems have been used to model biological oscillators, Josephson junction arrays, excitable media, neural networks, spatial games, genetic control networks and many other self-organizing systems. Ordinarily, the connection topology is assumed to be either completely regular or completely random. But many biological, technological and social networks lie somewhere between these two extremes. Here we explore simple models of networks that can be tuned through this middle ground: regular networks 'rewired' to introduce increasing amounts of disorder. We find that these systems can be highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. We call them 'small-world' networks, by analogy with the small-world phenomenon (popularly known as six degrees of separation). The neural network of the worm Caenorhabditis elegans, the power grid of the western United States, and the collaboration graph of film actors are shown to be small-world networks. Models of dynamical systems with small-world coupling display enhanced signal-propagation speed, computational power, and synchronizability. In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
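The signature small-world behavior is straightforward to reproduce (a sketch; parameter values are arbitrary choices, and networkx's generator implements the rewiring construction described above): over intermediate rewiring probabilities, clustering stays near the lattice value while the characteristic path length falls toward the random-graph value.

```python
# A minimal sketch, assuming n=1000 nodes and k=10 neighbors; ratios are
# taken relative to the regular lattice (p = 0), as in the paper's figures.
import networkx as nx

G0 = nx.watts_strogatz_graph(1000, 10, 0.0, seed=0)  # regular ring lattice
C0 = nx.average_clustering(G0)
L0 = nx.average_shortest_path_length(G0)

for p in [0.001, 0.01, 0.1, 1.0]:
    G = nx.connected_watts_strogatz_graph(1000, 10, p, seed=0)
    print(p, nx.average_clustering(G) / C0, nx.average_shortest_path_length(G) / L0)
```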

39,297 citations


"Network sampling and classification..." refers background in this paper

  • ...An analysis would then claim, for example, that scale-free networks are characterized by having a power-law degree distribution [40]....


  • ...[33], and it “sounds” like a plausible explanation [33,40]....


  • ...Small world [40]: Θ = (n, k, pn), i.e., nodes, neighbors, probability of rewiring....


  • ...Algorithm-based approaches to sampling networks [3,34,40] have received a great deal of attention in recent literature [9,11,15,30,41]....


  • ...(Small world) Each node is connected to several of its neighbors and a few distant nodes, according to the ring-induced distance [40] (Fig....


Journal ArticleDOI
15 Oct 1999-Science
TL;DR: A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
Abstract: Systems as diverse as genetic networks or the World Wide Web are best described as networks with complex topology. A common property of many large networks is that the vertex connectivities follow a scale-free power-law distribution. This feature was found to be a consequence of two generic mechanisms: (i) networks expand continuously by the addition of new vertices, and (ii) new vertices attach preferentially to sites that are already well connected. A model based on these two ingredients reproduces the observed stationary scale-free distributions, which indicates that the development of large networks is governed by robust self-organizing phenomena that go beyond the particulars of the individual systems.
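Both ingredients, growth and preferential attachment, are implemented in networkx's Barabási-Albert generator, which makes the heavy-tailed degree distribution easy to inspect; the sizes and seed below are arbitrary choices.

```python
# A minimal sketch, assuming arbitrary size and seed; each new node attaches
# m = 3 edges preferentially to well-connected existing nodes.
import collections
import networkx as nx

G = nx.barabasi_albert_graph(n=10000, m=3, seed=42)  # growth + preferential attachment
degree_counts = collections.Counter(d for _, d in G.degree())

# On a log-log plot these counts fall roughly on a straight line,
# consistent with the stationary scale-free distribution described above.
for k in sorted(degree_counts)[:10]:
    print(k, degree_counts[k])
```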

33,771 citations


"Network sampling and classification..." refers background in this paper

  • ...Scale free [7]: Θ = (n, n0, p0, pn), i.e., nodes, initial nodes, probability of initial edge, probability of edge....


Book
28 Jul 2013
TL;DR: In this book, the authors describe the important ideas in these areas in a common conceptual framework; the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Abstract: During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting, the first comprehensive treatment of this topic in any book. This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

19,261 citations

Journal ArticleDOI
TL;DR: In this paper, a simple model based on growth and preferential attachment is proposed that reproduces the power-law degree distribution of real networks and aims to capture the evolution of networks, not just their static topology.
Abstract: The emergence of order in natural systems is a constant source of inspiration for both the physical and biological sciences. While the spatial order characterizing, for example, crystals has been the basis of many advances in contemporary physics, most complex systems in nature do not offer such a high degree of order. Many of these systems form complex networks whose nodes are the elements of the system and whose edges represent the interactions between them. Traditionally, complex networks have been described by the random graph theory founded in 1959 by Paul Erdős and Alfréd Rényi. One of the defining features of random graphs is that they are statistically homogeneous, and their degree distribution (characterizing the spread in the number of edges starting from a node) is a Poisson distribution. In contrast, recent empirical studies, including the work of our group, indicate that the topology of real networks is much richer than that of random graphs. In particular, the degree distribution of real networks is a power law, indicating a heterogeneous topology in which the majority of the nodes have a small degree, but there is a significant fraction of highly connected nodes that play an important role in the connectivity of the network. The scale-free topology of real networks has very important consequences for their functioning. For example, we have discovered that scale-free networks are extremely resilient to the random disruption of their nodes. On the other hand, the selective removal of the nodes with the highest degree induces a rapid breakdown of the network into isolated subparts that cannot communicate with each other. The non-trivial scaling of the degree distribution of real networks is also an indication of their assembly and evolution. Indeed, our modeling studies have shown that there are general principles governing the evolution of networks. Most networks start from a small seed and grow by the addition of new nodes which attach to the nodes already in the system. This process obeys preferential attachment: the new nodes are more likely to connect to nodes with already high degree. We have proposed a simple model based on these two principles which was able to reproduce the power-law degree distribution of real networks. Perhaps even more importantly, this model paved the way to a new paradigm of network modeling, trying to capture the evolution of networks, not just their static topology.
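The resilience claim lends itself to a short simulation, sketched below under assumed parameters: remove 5% of nodes either at random or in order of degree from a scale-free network, and compare the surviving giant component.

```python
# A minimal sketch, assuming a 2000-node Barabási-Albert network and a 5%
# removal fraction; it contrasts random failure with degree-targeted attack.
import random
import networkx as nx

def giant_fraction(G):
    """Size of the largest connected component as a fraction of all nodes."""
    return max(len(c) for c in nx.connected_components(G)) / G.number_of_nodes()

G = nx.barabasi_albert_graph(2000, 3, seed=0)
k = int(0.05 * G.number_of_nodes())

random.seed(0)
G_rand = G.copy()
G_rand.remove_nodes_from(random.sample(list(G), k))

G_targ = G.copy()
hubs = sorted(G, key=G.degree, reverse=True)[:k]
G_targ.remove_nodes_from(hubs)

print("random removal:   giant component =", giant_fraction(G_rand))
print("targeted removal: giant component =", giant_fraction(G_targ))
```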

18,415 citations

Book
25 Nov 1994
TL;DR: This book presents mathematical representations of social networks in the social and behavioral sciences, covering dyadic and triadic interaction models and describing the relationships between actor and group measures and the structure of networks.
Abstract (table of contents):
Part I. Introduction: Networks, Relations, and Structure. 1. Relations and networks in the social and behavioral sciences. 2. Social network data: collection and application.
Part II. Mathematical Representations of Social Networks. 3. Notation. 4. Graphs and matrices.
Part III. Structural and Locational Properties. 5. Centrality, prestige, and related actor and group measures. 6. Structural balance, clusterability, and transitivity. 7. Cohesive subgroups. 8. Affiliations, co-memberships, and overlapping subgroups.
Part IV. Roles and Positions. 9. Structural equivalence. 10. Blockmodels. 11. Relational algebras. 12. Network positions and roles.
Part V. Dyadic and Triadic Methods. 13. Dyads. 14. Triads.
Part VI. Statistical Dyadic Interaction Models. 15. Statistical analysis of single relational networks. 16. Stochastic blockmodels and goodness-of-fit indices.
Part VII. Epilogue: 17. Future directions.

17,104 citations