Showing papers on "Degree distribution" published in 2011


Journal ArticleDOI
12 May 2011-Nature
TL;DR: In this article, the authors developed analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system's entire dynamics.
Abstract: The ultimate proof of our understanding of natural or technological systems is reflected in our ability to control them. Although control theory offers mathematical tools for steering engineered and natural systems towards a desired state, a framework to control complex self-organized systems is lacking. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system's entire dynamics. We apply these tools to several real networks, finding that the number of driver nodes is determined mainly by the network's degree distribution. We show that sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes. Counterintuitively, we find that in both model and real systems the driver nodes tend to avoid the high-degree nodes.

2,889 citations
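
The abstract does not say how the driver nodes are identified. One characterization from structural control theory, assumed here rather than taken from the text above, is that the nodes left unmatched by a maximum matching of the directed graph must be driven, and that one driver suffices when the matching is perfect. A minimal sketch under that assumption, using a simple augmenting-path matching (the function name `min_driver_nodes` is hypothetical):

```python
def min_driver_nodes(n, edges):
    """Nodes are 0..n-1; each edge (u, v) is a directed link u -> v.

    Assumption: driver nodes = nodes with no matched incoming edge
    in a maximum matching of the directed graph.
    """
    out = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
    matched_by = [-1] * n  # matched_by[v] = u if edge u -> v is in the matching

    def try_augment(u, seen):
        # Kuhn-style search for an augmenting path starting at u
        for v in out[u]:
            if not seen[v]:
                seen[v] = True
                if matched_by[v] == -1 or try_augment(matched_by[v], seen):
                    matched_by[v] = u
                    return True
        return False

    for u in range(n):
        try_augment(u, [False] * n)

    unmatched = [v for v in range(n) if matched_by[v] == -1]
    # With a perfect matching, a single (arbitrary) driver node is assumed to suffice
    return unmatched if unmatched else [0]


path = min_driver_nodes(4, [(0, 1), (1, 2), (2, 3)])   # directed chain
star = min_driver_nodes(4, [(0, 1), (0, 2), (0, 3)])   # hub pointing at three leaves
cycle = min_driver_nodes(3, [(0, 1), (1, 2), (2, 0)])  # directed cycle
```

On these toy graphs the chain needs one driver while the star needs three; in the star the hub can be matched to only one leaf, so the other two leaves must each be driven directly.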


Posted Content
TL;DR: A strong effect of age on friendship preferences as well as a globally modular community structure driven by nationality are observed, but it is shown that while the Facebook graph as a whole is clearly sparse, the graph neighborhoods of users contain surprisingly dense structure.
Abstract: We study the structure of the social graph of active Facebook users, the largest social network ever analyzed. We compute numerous features of the graph including the number of users and friendships, the degree distribution, path lengths, clustering, and mixing patterns. Our results center around three main observations. First, we characterize the global structure of the graph, determining that the social network is nearly fully connected, with 99.91% of individuals belonging to a single large connected component, and we confirm the "six degrees of separation" phenomenon on a global scale. Second, by studying the average local clustering coefficient and degeneracy of graph neighborhoods, we show that while the Facebook graph as a whole is clearly sparse, the graph neighborhoods of users contain surprisingly dense structure. Third, we characterize the assortativity patterns present in the graph by studying the basic demographic and network properties of users. We observe clear degree assortativity and characterize the extent to which "your friends have more friends than you". Furthermore, we observe a strong effect of age on friendship preferences as well as a globally modular community structure driven by nationality, but we do not find any strong gender homophily. We compare our results with those from smaller social networks and find mostly, but not entirely, agreement on common structural network characteristics.

938 citations
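
The "your friends have more friends than you" observation can be illustrated on a toy edge list. The sketch below is not the authors' pipeline; it assumes an undirected graph given as pairs and compares the mean degree with the mean degree found at the end of a randomly chosen friendship (the helper name `degree_stats` is hypothetical).

```python
from collections import Counter


def degree_stats(edges):
    """Return (degree histogram, mean degree, mean degree of a friend)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    histogram = Counter(degree.values())  # degree value -> number of nodes
    mean_degree = sum(degree.values()) / len(degree)
    # Each edge (u, v) shows deg(v) to u and deg(u) to v
    friend_degree_total = sum(degree[u] + degree[v] for u, v in edges)
    mean_friend_degree = friend_degree_total / (2 * len(edges))
    return histogram, mean_degree, mean_friend_degree


# Toy graph: node 0 has three friends, node 3 has two, the rest have one
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
histogram, mean_degree, mean_friend_degree = degree_stats(edges)
```

Here the mean degree is 1.6 while the mean degree of a friend is 2.0, because high-degree nodes are counted once per friendship they take part in.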


Proceedings ArticleDOI
28 Mar 2011
TL;DR: In this paper, a Layered Label Propagation algorithm is proposed to reorder very large graphs (billions of nodes); the implementation uses task decomposition to perform aggressively on multi-core architectures, making it possible to reorder graphs of more than 600 million nodes in a few hours.
Abstract: We continue the line of research on graph compression started with WebGraph, but we move our focus to the compression of social networks in a proper sense (e.g., LiveJournal): the approaches that have been used for a long time to compress web graphs rely on a specific ordering of the nodes (lexicographical URL ordering) whose extension to general social networks is not trivial. In this paper, we propose a solution that mixes clusterings and orders, and devise a new algorithm, called Layered Label Propagation, that builds on previous work on scalable clustering and can be used to reorder very large graphs (billions of nodes). Our implementation uses task decomposition to perform aggressively on multi-core architecture, making it possible to reorder graphs of more than 600 million nodes in a few hours. Experiments performed on a wide array of web graphs and social networks show that combining the order produced by the proposed algorithm with the WebGraph compression framework provides a major increase in compression with respect to all currently known techniques, both on web graphs and on social networks. These improvements make it possible to analyse in main memory significantly larger graphs.

637 citations


Proceedings ArticleDOI
28 Mar 2011
TL;DR: This work describes a sequential triangle counting algorithm that achieves a factor of 10-100 speed up over the naive approach, shows how to adapt it to the MapReduce setting, and presents a new algorithm designed specifically for the MapReduce framework.
Abstract: The clustering coefficient of a node in a social network is a fundamental measure that quantifies how tightly-knit the community is around the node. Its computation can be reduced to counting the number of triangles incident on the particular node in the network. In case the graph is too big to fit into memory, this is a non-trivial task, and previous researchers showed how to estimate the clustering coefficient in this scenario. A different avenue of research is to perform the computation in parallel, spreading it across many machines. In recent years MapReduce has emerged as a de facto programming paradigm for parallel computation on massive data sets. The main focus of this work is to give MapReduce algorithms for counting triangles which we use to compute clustering coefficients. Our contributions are twofold. First, we describe a sequential triangle counting algorithm and show how to adapt it to the MapReduce setting. This algorithm achieves a factor of 10-100 speed up over the naive approach. Second, we present a new algorithm designed specifically for the MapReduce framework. A key feature of this approach is that it allows for a smooth tradeoff between the memory available on each individual machine and the total memory available to the algorithm, while keeping the total work done constant. Moreover, this algorithm can use any triangle counting algorithm as a black box and distribute the computation across many machines. We validate our algorithms on real world datasets comprising millions of nodes and over a billion edges. Our results show both algorithms effectively deal with skew in the degree distribution and lead to dramatic speed ups over the naive implementation.

450 citations
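
The reduction from clustering coefficients to per-node triangle counts can be sketched sequentially. The abstract does not spell out its sequential algorithm; the version below is an assumption that charges each triangle to its lowest-ranked node under a degree ordering, which is one common way to limit the work done at high-degree nodes. Function names are hypothetical, and node ids are assumed to be mutually comparable for tie-breaking.

```python
from collections import defaultdict
from itertools import combinations


def triangles_per_node(edges):
    """Count triangles incident on each node of an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    # Total order on nodes: by degree, ties broken by node id
    rank = {v: (len(adj[v]), v) for v in adj}
    tri = {v: 0 for v in adj}
    for v in adj:
        # Only pair up neighbours ranked above v, so each triangle is found once
        higher = [u for u in adj[v] if rank[u] > rank[v]]
        for u, w in combinations(higher, 2):
            if w in adj[u]:
                tri[v] += 1
                tri[u] += 1
                tri[w] += 1
    return adj, tri


def local_clustering(adj, tri):
    """Local clustering coefficient: triangles at v over possible neighbour pairs."""
    cc = {}
    for v, nbrs in adj.items():
        d = len(nbrs)
        cc[v] = 2 * tri[v] / (d * (d - 1)) if d > 1 else 0.0
    return cc


# Toy graph: one triangle 0-1-2 with a pendant node 3 attached to node 2
adj, tri = triangles_per_node([(0, 1), (1, 2), (0, 2), (2, 3)])
cc = local_clustering(adj, tri)
```

In the toy graph, nodes 0 and 1 have coefficient 1.0, node 2 has 1/3 (one triangle out of three neighbour pairs), and node 3 has 0.0.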


Journal ArticleDOI
TL;DR: In this paper, the authors examined the network structure and nodal centrality of individual cities in the air transport network of China (ATNC) using a complex network approach and found that the ATNC has a cumulative degree distribution captured by an exponential function, and displays some small-world network properties with an average path length of 2.23 and a clustering coefficient of 0.69.

398 citations


Journal ArticleDOI
TL;DR: An extension of a combinatorial characterization due to Erdős and Gallai is used to develop a sequential algorithm for generating a random labeled graph with a given degree sequence, which allows for surprisingly efficient sequential importance sampling.
Abstract: Random graphs with given degrees are a natural next step in complexity beyond the Erdős–Rényi model, yet the degree constraint greatly complicates simulation and estimation. We use an extension of a combinatorial characterization due to Erdős and Gallai to develop a sequential algorithm for generating a random labeled graph with a given degree sequence. The algorithm is easy to implement and allows for surprisingly efficient sequential importance sampling. The resulting probabilities are easily computed on the fly, allowing the user to reweight estimators appropriately, in contrast to some ad hoc approaches that generate graphs with the desired degrees but with completely unknown probabilities. Applications are given, including simulating an ecological network and estimating the number of graphs with a given degree sequence.

355 citations
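
The classical Erdős–Gallai characterization that the abstract builds on can be checked directly: a non-increasing sequence d_1 >= ... >= d_n of non-negative integers is the degree sequence of a simple graph exactly when its sum is even and, for every k, d_1 + ... + d_k <= k(k-1) + sum over i > k of min(d_i, k). The sketch below tests only this basic condition; it is not the paper's extension or its sampling algorithm, and the function name `is_graphical` is hypothetical.

```python
def is_graphical(degrees):
    """Erdős–Gallai test: can `degrees` be realized by a simple undirected graph?"""
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 == 1:
        return False
    n = len(d)
    for k in range(1, n + 1):
        lhs = sum(d[:k])                                     # demand of the k largest degrees
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])    # most edges they can absorb
        if lhs > rhs:
            return False
    return True


complete_k4 = is_graphical([3, 3, 3, 3])   # complete graph on four nodes
star = is_graphical([4, 1, 1, 1, 1])       # hub with four leaves
impossible = is_graphical([3, 3, 1, 1])    # fails the k = 2 inequality
```

This direct form costs O(n^2) time, which is fine for illustration; prefix sums would make it faster.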


Journal ArticleDOI
TL;DR: It is shown that a synergy exists between the failure of connectivity and dependency links that leads to an iterative process of cascading failures that has a devastating effect on the network stability.
Abstract: Current network models assume one type of links to define the relations between the network entities. However, many real networks can only be correctly described using two different types of relations: connectivity links that enable the nodes to function cooperatively as a network, and dependency links that bind the failure of one network element to the failure of other network elements. Here we present an analytical framework for studying the robustness of networks that include both connectivity and dependency links. We show that a synergy exists between the failure of connectivity and dependency links that leads to an iterative process of cascading failures that has a devastating effect on the network stability. We present exact analytical results for the dramatic change in the network behavior when introducing dependency links. For a high density of dependency links, the network disintegrates in a form of a first-order phase transition, whereas for a low density of dependency links, the network disintegrates in a second-order transition. Moreover, opposed to networks containing only connectivity links where a broader degree distribution results in a more robust network, when both types of links are present a broad degree distribution leads to higher vulnerability.

271 citations


Journal ArticleDOI
TL;DR: A scheme is described that recovers the (dynamic) Bayesian dependency graph (connections in a network) using observed network activity, furnishing a network description of distributed activity in the brain that is optimal in the sense of having the greatest conditional probability, relative to other networks.

260 citations


Proceedings ArticleDOI
12 Nov 2011
TL;DR: In this article, the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms, is explored, and two highly-tuned parallel approaches for BFS on large parallel systems are presented: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph and a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
Abstract: Data-intensive, graph-based computations are pervasive in several scientific applications, and are known to be quite challenging to implement on distributed memory systems. In this work, we explore the design space of parallel algorithms for Breadth-First Search (BFS), a key subroutine in several graph algorithms. We present two highly-tuned parallel approaches for BFS on large parallel systems: a level-synchronous strategy that relies on a simple vertex-based partitioning of the graph, and a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead. For both approaches, we also present hybrid versions with intra-node multithreading. Our novel hybrid two-dimensional algorithm reduces communication times by up to a factor of 3.5, relative to a common vertex based approach. Our experimental study identifies execution regimes in which these approaches will be competitive, and we demonstrate extremely high performance on leading distributed-memory parallel systems. For instance, for a 40,000-core parallel execution on Hopper, an AMD Magny-Cours based system, we achieve a BFS performance rate of 17.8 billion edge visits per second on an undirected graph of 4.3 billion vertices and 68.7 billion edges with skewed degree distribution.

229 citations
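
The level-synchronous strategy named in the abstract processes the graph one frontier at a time: every vertex at distance d is expanded before any vertex at distance d + 1. The sketch below shows only that control structure in a single process; the partitioning, communication, and multithreading that the paper is about are not modelled, and the adjacency-dict input format is an assumption.

```python
def bfs_levels(adj, source):
    """Level-synchronous BFS; returns {vertex: distance from source}."""
    level = {source: 0}
    frontier = [source]
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        # Expand the whole current frontier before moving to the next level
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in level:
                    level[v] = depth
                    next_frontier.append(v)
        frontier = next_frontier
    return level


# Toy undirected graph stored as symmetric adjacency lists
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
levels = bfs_levels(adj, 0)
```

In a distributed setting the inner loops would run per partition, with the newly discovered vertices exchanged between processes at the end of each level.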


Journal ArticleDOI
TL;DR: The robustness of CCN increases with the broadness of their degree distribution, and the system undergoes a percolation transition at a certain fraction p=p(c), which is always smaller than p(c) for randomly coupled networks with the same P(k).
Abstract: We study a problem of failure of two interdependent networks in the case of identical degrees of mutually dependent nodes. We assume that both networks (A and B) have the same number of nodes N connected by the bidirectional dependency links establishing a one-to-one correspondence between the nodes of the two networks in such a way that the mutually dependent nodes have the same number of connectivity links; i.e., their degrees coincide. This implies that both networks have the same degree distribution P(k). We call such networks correspondently coupled networks (CCNs). We assume that the nodes in each network are randomly connected. We define the mutually connected clusters and the mutual giant component as in earlier works on randomly coupled interdependent networks and assume that only the nodes that belong to the mutual giant component remain functional. We assume that initially a 1-p fraction of nodes are randomly removed because of an attack or failure and find analytically, for an arbitrary P(k), the fraction of nodes μ(p) that belong to the mutual giant component. We find that the system undergoes a percolation transition at a certain fraction p=p(c), which is always smaller than p(c) for randomly coupled networks with the same P(k). We also find that the system undergoes a first-order transition at p(c)>0 if P(k) has a finite second moment. For the case of scale-free networks with 2 0. Finally, we find that the robustness of CCNs increases with the broadness of their degree distribution.

220 citations


Proceedings ArticleDOI
TL;DR: A set of tools that are developed to analyze specific properties of social-network graphs, i.e., among others, degree distribution, centrality measures, scaling laws and distribution of friendship, are described.
Abstract: We describe our work in the collection and analysis of massive data describing the connections between participants to online social networks. Alternative approaches to social network data collection are defined and evaluated in practice, against the popular Facebook Web site. Thanks to our ad-hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, i.e., among others, degree distribution, centrality measures, scaling laws and distribution of friendship.

Journal ArticleDOI
TL;DR: It is shown that bond percolation-based approximations can be highly biased if one incorrectly assumes that infectious periods are homogeneous, and the magnitude of this bias increases with the amount of clustering in the network.
Abstract: The spread of infectious diseases fundamentally depends on the pattern of contacts between individuals. Although studies of contact networks have shown that heterogeneity in the number of contacts and the duration of contacts can have far-reaching epidemiological consequences, models often assume that contacts are chosen at random and thereby ignore the sociological, temporal and/or spatial clustering of contacts. Here we investigate the simultaneous effects of heterogeneous and clustered contact patterns on epidemic dynamics. To model population structure, we generalize the configuration model which has a tunable degree distribution (number of contacts per node) and level of clustering (number of three cliques). To model epidemic dynamics for this class of random graph, we derive a tractable, low-dimensional system of ordinary differential equations that accounts for the effects of network structure on the course of the epidemic. We find that the interaction between clustering and the degree distribution is complex. Clustering always slows an epidemic, but simultaneously increasing clustering and the variance of the degree distribution can increase final epidemic size. We also show that bond percolation-based approximations can be highly biased if one incorrectly assumes that infectious periods are homogeneous, and the magnitude of this bias increases with the amount of clustering in the network. We apply this approach to model the high clustering of contacts within households, using contact parameters estimated from survey data of social interactions, and we identify conditions under which network models that do not account for household structure will be biased.

Journal ArticleDOI
TL;DR: The realizability of scale-free networks with a given degree sequence is studied, showing that the fraction of realizable sequences undergoes two first-order transitions at the values 0 and 2 of the power-law exponent.
Abstract: We study the realizability of scale-free networks with a given degree sequence, showing that the fraction of realizable sequences undergoes two first-order transitions at the values 0 and 2 of the power-law exponent. We substantiate this finding by analytical reasoning and by a numerical method, proposed here, based on extreme value arguments, which can be applied to any given degree distribution. Our results reveal a fundamental reason why large scale-free networks without constraints on minimum and maximum degree must be sparse.

Journal ArticleDOI
TL;DR: This work develops an analytical approach that accurately captures the dynamical interaction between epidemics on overlay networks by exploiting a correspondence between the propagation dynamics and a dynamical process performing progressive network generation.
Abstract: Epidemics seldom occur as isolated phenomena. Typically, two or more viral agents spread within the same host population and may interact dynamically with each other. We present a general model where two viral agents interact via an immunity mechanism as they propagate simultaneously on two networks connecting the same set of nodes. By exploiting a correspondence between the propagation dynamics and a dynamical process performing progressive network generation, we develop an analytical approach that accurately captures the dynamical interaction between epidemics on overlay networks. The formalism allows for overlay networks with arbitrary joint degree distribution and overlap. To illustrate the versatility of our approach, we consider a hypothetical delayed intervention scenario in which an immunizing agent is disseminated in a host population to hinder the propagation of an undesirable agent (e.g., the spread of preventive information in the context of an emerging infectious disease).

Journal ArticleDOI
TL;DR: The results show that robust networks have a novel 'onion-like' topology consisting of a core of highly connected nodes hierarchically surrounded by rings of nodes with decreasing degree.
Abstract: We develop a method to generate robust networks against malicious attacks, as well as to substantially improve the robustness of a given network by swapping edges and keeping the degree distribution fixed. The method, based on persistence of the size of the largest cluster during attacks, was applied to several types of networks with broad degree distributions, including a real network—the Internet. We find that our method can improve the robustness significantly. Our results show that robust networks have a novel 'onion-like' topology consisting of a core of highly connected nodes hierarchically surrounded by rings of nodes with decreasing degree.
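
The move that keeps the degree distribution fixed is a double edge swap: pick two edges a-b and c-d and rewire them to a-d and c-b, so every node keeps its degree. The sketch below implements only that move on an undirected simple graph. The robustness-based criterion the abstract uses to decide which swaps to keep is not implemented, and the function names, retry limit, and seed parameter are assumptions for illustration.

```python
import random
from collections import Counter


def double_edge_swap(edges, n_swaps, seed=0):
    """Apply up to n_swaps degree-preserving swaps to an undirected simple graph."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done, attempts = 0, 0
    max_attempts = 100 * n_swaps  # arbitrary cap so the loop always terminates
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both ways of pairing the endpoints
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # swap would create a duplicate edge
        present.difference_update((frozenset((a, b)), frozenset((c, d))))
        present.update((new1, new2))
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


def degree_sequence(edges):
    """Sorted list of node degrees for an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sorted(deg.values())


ring = [(i, (i + 1) % 6) for i in range(6)]
rewired = double_edge_swap(ring, n_swaps=3, seed=1)
```

A robustness-optimizing procedure in the spirit of the abstract would wrap this move in a loop that keeps a swap only when a chosen robustness measure improves.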

Journal ArticleDOI
TL;DR: It is shown that to explain the growth of the citation network by preferential attachment, one has to accept that individual nodes exhibit heterogeneous fitness values that decay with time, which makes the model an apt candidate for modeling a wide range of real systems.
Abstract: We show that to explain the growth of the citation network by preferential attachment (PA), one has to accept that individual nodes exhibit heterogeneous fitness values that decay with time. While previous PA-based models assumed either heterogeneity or decay in isolation, we propose a simple analytically treatable model that combines these two factors. Depending on the input assumptions, the resulting degree distribution shows an exponential, log-normal or power-law decay, which makes the model an apt candidate for modeling a wide range of real systems.

Journal ArticleDOI
25 May 2011-PLOS ONE
TL;DR: The utility of ERGMs for modeling, analyzing, and simulating complex whole-brain networks with network data from normal subjects is illustrated and a foundation for the selection of important local features is provided through the implementation and assessment of three selection approaches.
Abstract: Exponential random graph models (ERGMs), also known as p* models, have been utilized extensively in the social science literature to study complex networks and how their global structure depends on underlying structural components. However, the literature on their use in biological networks (especially brain networks) has remained sparse. Descriptive models based on a specific feature of the graph (clustering coefficient, degree distribution, etc.) have dominated connectivity research in neuroscience. Corresponding generative models have been developed to reproduce one of these features. However, the complexity inherent in whole-brain network data necessitates the development and use of tools that allow the systematic exploration of several features simultaneously and how they interact to form the global network architecture. ERGMs provide a statistically principled approach to the assessment of how a set of interacting local brain network features gives rise to the global structure. We illustrate the utility of ERGMs for modeling, analyzing, and simulating complex whole-brain networks with network data from normal subjects. We also provide a foundation for the selection of important local features through the implementation and assessment of three selection approaches: a traditional p-value based backward selection approach, an information criterion approach (AIC), and a graphical goodness of fit (GOF) approach. The graphical GOF approach serves as the best method given the scientific interest in being able to capture and reproduce the structure of fitted brain networks.

Journal ArticleDOI
TL;DR: This paper quantifies and corrects the degree bias of BFS and proposes a practical BFS-bias correction procedure that performs well when applied to a broad range of Internet topologies and to two large BFS samples of Facebook and Orkut networks.
Abstract: Breadth First Search (BFS) is a widely used approach for sampling large graphs. However, it has been empirically observed that BFS sampling is biased toward high-degree nodes, which may strongly affect the measurement results. In this paper, we quantify and correct the degree bias of BFS. First, we consider a random graph RG(pk) with an arbitrary degree distribution pk. For this model, we calculate the node degree distribution expected to be observed by BFS as a function of the fraction f of covered nodes. We also show that, for RG(pk), all commonly used graph traversal techniques (BFS, DFS, Forest Fire, Snowball Sampling, RDS) have exactly the same bias. Next, we propose a practical BFS-bias correction procedure that takes as input a collected BFS sample together with the fraction f. Our correction technique is exact (i.e., leads to unbiased estimation) for RG(pk). Furthermore, it performs well when applied to a broad range of Internet topologies and to two large BFS samples of Facebook and Orkut networks.

Posted Content
TL;DR: In this article, the degree bias of incomplete BFS is quantified and a practical BFS-bias correction procedure is proposed, which takes as input a collected BFS sample together with its fraction f. Even though RG(pk) does not capture many graph properties common in real-life graphs (such as assortativity), the RG(pk)-based correction technique performs well on a broad range of Internet topologies and on two large BFS samples of Facebook and Orkut networks.
Abstract: Breadth First Search (BFS) is a widely used approach for sampling large unknown Internet topologies. Its main advantage over random walks and other exploration techniques is that a BFS sample is a plausible graph on its own, and therefore we can study its topological characteristics. However, it has been empirically observed that incomplete BFS is biased toward high-degree nodes, which may strongly affect the measurements. In this paper, we first analytically quantify the degree bias of BFS sampling. In particular, we calculate the node degree distribution expected to be observed by BFS as a function of the fraction f of covered nodes, in a random graph RG(pk) with an arbitrary degree distribution pk. We also show that, for RG(pk), all commonly used graph traversal techniques (BFS, DFS, Forest Fire, Snowball Sampling, RDS) suffer from exactly the same bias. Next, based on our theoretical analysis, we propose a practical BFS-bias correction procedure. It takes as input a collected BFS sample together with its fraction f. Even though RG(pk) does not capture many graph properties common in real-life graphs (such as assortativity), our RG(pk)-based correction technique performs well on a broad range of Internet topologies and on two large BFS samples of Facebook and Orkut networks. Finally, we consider and evaluate a family of alternative correction procedures, and demonstrate that, although they are unbiased for an arbitrary topology, their large variance makes them far less effective than the RG(pk)-based technique.

Proceedings ArticleDOI
25 May 2011
TL;DR: In this paper, the authors describe a set of tools that can analyze specific properties of social-network graphs, i.e., degree distribution, centrality measures, scaling laws and distribution of friendship.
Abstract: We describe our work in the collection and analysis of massive data describing the connections between participants to online social networks. Alternative approaches to social network data collection are defined and evaluated in practice, against the popular Facebook Web site. Thanks to our ad-hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, i.e., among others, degree distribution, centrality measures, scaling laws and distribution of friendship.

Journal ArticleDOI
TL;DR: An analytical approach is presented for determining the expected cascade size in a broad range of dynamical models on the class of random networks with arbitrary degree distribution and nonzero clustering; for site percolation, bond percolation, and Watts' threshold model, the analytical results give excellent agreement with numerical simulations.
Abstract: We present an analytical approach to determining the expected cascade size in a broad range of dynamical models on the class of random networks with arbitrary degree distribution and nonzero clustering introduced previously in [M. E. J. Newman, Phys. Rev. Lett. 103, 058701 (2009)]. A condition for the existence of global cascades is derived as well as a general criterion that determines whether increasing the level of clustering will increase, or decrease, the expected cascade size. Applications, examples of which are provided, include site percolation, bond percolation, and Watts' threshold model; in all cases analytical results give excellent agreement with numerical simulations.

Journal ArticleDOI
TL;DR: The synchrony analysis takes advantage of the framework of second order networks, which defines four second order connectivity statistics based on the relative frequency of two-connection network motifs, and identifies two of them, convergent connections, and chain connections, as highly influencing the synchrony.
Abstract: We investigate how network structure can influence the tendency for a neuronal network to synchronize, or its synchronizability, independent of the dynamical model for each neuron. The synchrony analysis takes advantage of the framework of second order networks (SONETs), which defines four second order connectivity statistics based on the relative frequency of two-connection network motifs. The analysis identifies two of these statistics, convergent connections and chain connections, as highly influencing the synchrony. Simulations verify that synchrony decreases with the frequency of convergent connections and increases with the frequency of chain connections. These trends persist with simulations of multiple models for the neuron dynamics and for different types of networks. Surprisingly, divergent connections, which determine the fraction of shared inputs, do not strongly influence the synchrony. The critical role of chains, rather than divergent connections, in influencing synchrony can be explained by a pool and redistribute mechanism. The pooling of many inputs averages out independent fluctuations, amplifying weak correlations in the inputs. With increased chain connections, neurons with many inputs tend to have many outputs. Hence, chains ensure that the amplified correlations in the neurons with many inputs are redistributed throughout the network, enhancing the development of synchrony across the network.

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This paper analyzes state-of-the-art graph sampling algorithms and evaluates their performance on some widely recognized graph properties on directed graphs using large-scale social network datasets, finding that none of the algorithms obtains satisfactory sampling results for both of these properties.
Abstract: Being able to keep the graph scale small while capturing the properties of the original social graph, graph sampling provides an efficient, yet inexpensive solution for social network analysis. The challenge is how to create a small, but representative sample out of the massive social graph with millions or even billions of nodes. Several sampling algorithms have been proposed in previous studies, but a fair evaluation and comparison among them is lacking. In this paper, we analyze state-of-the-art graph sampling algorithms and evaluate their performance on some widely recognized graph properties on directed graphs using large-scale social network datasets. We evaluate not only the commonly used node degree distribution, but also the clustering coefficient, which quantifies how well connected the neighbors of a node in a graph are. Through the comparison we have found that none of the algorithms is able to obtain satisfactory sampling results for both of these properties, and the performance of each algorithm differs considerably across different kinds of datasets.

Journal ArticleDOI
TL;DR: The effect of broad degree distributions on network dynamics is studied by interpolating between a binomial and a truncated power-law distribution for the in-degree and out-degree independently.
Abstract: Neuronal network models often assume a fixed probability of connection between neurons. This assumption leads to random networks with binomial in-degree and out-degree distributions which are relatively narrow. Here I study the effect of broad degree distributions on network dynamics by interpolating between a binomial and a truncated power-law distribution for the in-degree and out-degree independently. This is done both for an inhibitory network (I network) as well as for the recurrent excitatory connections in a network of excitatory and inhibitory neurons (EI network). In both cases increasing the width of the in-degree distribution affects the global state of the network by driving transitions between asynchronous behavior and oscillations. This effect is reproduced in a simplified rate model which includes the heterogeneity in neuronal input due to the in-degree of cells. On the other hand, broadening the out-degree distribution is shown to increase the fraction of common inputs to pairs of neurons. This leads to increases in the amplitude of the cross-correlation (CC) of synaptic currents. In the case of the I network, despite strong oscillatory CCs in the currents, CCs of the membrane potential are low due to filtering and reset effects, leading to very weak CCs of the spike count. In the asynchronous regime of the EI network, broadening the out-degree increases the amplitude of CCs in the recurrent excitatory currents, while CC of the total current is essentially unaffected as are pairwise spiking correlations. This is due to a dynamic balance between excitatory and inhibitory synaptic currents. In the oscillatory regime, changes in the out-degree can have a large effect on spiking correlations and even on the qualitative dynamical state of the network.

Journal ArticleDOI
07 Sep 2011-PLOS ONE
TL;DR: This work provides a universal analytical description of this classic scenario in terms of the horizontal visibility graphs associated with the dynamics within the attractors, that it calls Feigenbaum graphs, independent of map nonlinearity or other particulars, and shows that the network entropy mimics the Lyapunov exponent of the map independently of its sign.
Abstract: The recently formulated theory of horizontal visibility graphs transforms time series into graphs and allows the possibility of studying dynamical systems through the characterization of their associated networks. This method leads to a natural graph-theoretical description of nonlinear systems with qualities in the spirit of symbolic dynamics. We support our claim via the case study of the period-doubling and band-splitting attractor cascades that characterize unimodal maps. We provide a universal analytical description of this classic scenario in terms of the horizontal visibility graphs associated with the dynamics within the attractors, that we call Feigenbaum graphs, independent of map nonlinearity or other particulars. We derive exact results for their degree distribution and related quantities, recast them in the context of the renormalization group and find that its fixed points coincide with those of network entropy optimization. Furthermore, we show that the network entropy mimics the Lyapunov exponent of the map independently of its sign, hinting at a Pesin-like relation equally valid out of chaos.
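The mapping from a time series to a horizontal visibility graph can be sketched as follows. This is a minimal illustration using the standard criterion (two points are linked when every value strictly between them is lower than both), not the authors' code; the logistic map helper and its parameters are chosen here only as an example of a unimodal map.

```python
from collections import Counter


def horizontal_visibility_graph(series):
    """Return the set of edges (i, j), i < j, of the horizontal visibility graph."""
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # largest value seen strictly after i
        for j in range(i + 1, n):
            if series[j] > highest_between:
                # Every intermediate value is below series[j] and below series[i].
                edges.add((i, j))
                highest_between = series[j]
            if highest_between >= series[i]:
                break  # the view from i is blocked from here on
    return edges


def degree_distribution(num_nodes, edges):
    """Return a dict mapping degree k to the fraction of nodes with that degree."""
    degree = [0] * num_nodes
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    counts = Counter(degree)
    return {k: c / num_nodes for k, c in sorted(counts.items())}


def logistic_series(r, x0, length, transient=1000):
    """Iterate x -> r * x * (1 - x), discarding an initial transient."""
    x = x0
    for _ in range(transient):
        x = r * x * (1 - x)
    out = []
    for _ in range(length):
        x = r * x * (1 - x)
        out.append(x)
    return out


if __name__ == "__main__":
    series = logistic_series(r=3.5, x0=0.3, length=2000)
    edges = horizontal_visibility_graph(series)
    print(degree_distribution(len(series), edges))
```

Sweeping `r` through the period-doubling cascade and comparing the resulting degree distributions is one way to explore the family of graphs the abstract describes.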

Journal ArticleDOI
18 Oct 2011-PLOS ONE
TL;DR: The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations, which prevents fragmentation of neutral networks and thus enhances the navigability of sequence space.
Abstract: The evolution and adaptation of molecular populations is constrained by the diversity accessible through mutational processes. RNA is a paradigmatic example of a biopolymer where genotype (sequence) and phenotype (approximated by the secondary structure fold) are identified in a single molecule. The extreme redundancy of the genotype-phenotype map leads to large ensembles of RNA sequences that fold into the same secondary structure and can be connected through single-point mutations. These ensembles define neutral networks of phenotypes in sequence space. Here we analyze the topological properties of neutral networks formed by 12-nucleotide RNA sequences, obtained through the exhaustive folding of sequence space. A total of 4^12 sequences fragments into 645 subnetworks that correspond to 57 different secondary structures. The topological analysis reveals that each subnetwork is far from being random: it has a degree distribution with a well-defined average and a small dispersion, a high clustering coefficient, and an average shortest path between nodes close to its minimum possible value, i.e. the Hamming distance between sequences. RNA neutral networks are assortative due to the correlation in the composition of neighboring sequences, a feature that together with the symmetries inherent to the folding process explains the existence of communities. Several topological relationships can be analytically derived attending to structural restrictions and generic properties of the folding process. The average degree of these phenotypic networks grows logarithmically with their size, such that abundant phenotypes have the additional advantage of being more robust to mutations. This property prevents fragmentation of neutral networks and thus enhances the navigability of sequence space. In summary, RNA neutral networks show unique topological properties, unknown to other networks previously described.

Journal ArticleDOI
TL;DR: This paper relates the onion structure to graphs with good expander properties and argues that networks of skewed degree distributions with large spectral gaps are typically onion structured, and proposes a generative algorithm producing synthetic scale-free networks with onion structure.
Abstract: In a recent work [Proc. Natl. Acad. Sci. USA 108, 3838 (2011)], Schneider et al. proposed a new measure for network robustness and investigated optimal networks with respect to this quantity. For networks with a power-law degree distribution, the optimized networks have an onion structure: high-degree vertices forming a core with radially decreasing degrees and an over-representation of edges within the same radial layer. In this paper we relate the onion structure to graphs with good expander properties (another characterization of robust networks) and argue that networks of skewed degree distributions with large spectral gaps (and thus good expander properties) are typically onion structured. Furthermore, we propose a generative algorithm producing synthetic scale-free networks with onion structure, circumventing the optimization procedure of Schneider et al. We validate the robustness of our generated networks against malicious attacks and random removals.

Journal ArticleDOI
TL;DR: It is shown that although degree distribution is sufficient to predict disease behaviour on very sparse or very dense human contact networks, for intermediate density networks the authors must include information on clustering and path length to accurately predict disease behaviour.
Abstract: Recent studies have increasingly turned to graph theory to model more realistic contact structures that characterize disease spread. Because of the computational demands of these methods, many researchers have sought to use measures of network structure to modify analytically tractable differential equation models. Several of these studies have focused on the degree distribution of the contact network as the basis for their modifications. We show that although degree distribution is sufficient to predict disease behaviour on very sparse or very dense human contact networks, for intermediate density networks we must include information on clustering and path length to accurately predict disease behaviour. Using these three metrics, we were able to explain more than 98 per cent of the variation in endemic disease levels in our stochastic simulations.

Journal ArticleDOI
TL;DR: It is demonstrated that eukaryotic and viral PPI networks may belong to different graph model families and show that topology-based clustering can reveal important functional similarities between proteins within yeast and human PPI networks.
Abstract: Recent advancements in experimental biotechnology have produced large amounts of protein-protein interaction (PPI) data. The topology of PPI networks is believed to have a strong link to their function. Hence, the abundance of PPI data for many organisms stimulates the development of computational techniques for the modeling, comparison, alignment, and clustering of networks. In addition, finding representative models for PPI networks will improve our understanding of the cell just as a model of gravity has helped us understand planetary motion. To decide if a model is representative, we need quantitative comparisons of model networks to real ones. However, exact network comparison is computationally intractable and therefore several heuristics have been used instead. Some of these heuristics are easily computable "network properties," such as the degree distribution, or the clustering coefficient. An important special case of network comparison is the network alignment problem. Analogous to sequence alignment, this problem asks to find the "best" mapping between regions in two networks. It is expected that network alignment might have as strong an impact on our understanding of biology as sequence alignment has had. Topology-based clustering of nodes in PPI networks is another example of an important network analysis problem that can uncover relationships between interaction patterns and phenotype. We introduce the GraphCrunch 2 software tool, which addresses these problems. It is a significant extension of GraphCrunch which implements the most popular random network models and compares them with the data networks with respect to many network properties. Also, GraphCrunch 2 implements the GRAph ALigner algorithm ("GRAAL") for purely topological network alignment. GRAAL can align any pair of networks and exposes large, dense, contiguous regions of topological and functional similarities far larger than any other existing tool. Finally, GraphCrunch 2 implements an algorithm for clustering nodes within a network based solely on their topological similarities. Using GraphCrunch 2, we demonstrate that eukaryotic and viral PPI networks may belong to different graph model families and show that topology-based clustering can reveal important functional similarities between proteins within yeast and human PPI networks. GraphCrunch 2 is a software tool that implements the latest research on biological network analysis. It parallelizes computationally intensive tasks to fully utilize the potential of modern multi-core CPUs. It is open-source and freely available for research use. It runs under the Windows and Linux platforms.

Journal ArticleDOI
TL;DR: The most well known network algorithms produce undirected networks, and this work emphasizes this point by highlighting how simple adaptations can instead produce directed networks.
Abstract: Many simulations of networks in computational neuroscience assume completely homogeneous random networks of the Erdős-Rényi type, or regular networks, despite it being recognized for some time that anatomical brain networks are more complex in their connectivity and can, for example, exhibit the 'scale-free' and 'small-world' properties. We review the most well known algorithms for constructing networks with given non-homogeneous statistical properties and provide simple pseudo-code for reproducing such networks in software simulations. We also review some useful mathematical results and approximations associated with the statistics that describe these network models, including degree distribution, average path length and clustering coefficient. We demonstrate how such results can be used as partial verification and validation of implementations. Finally, we discuss a sometimes overlooked modeling choice that can be crucially important for the properties of simulated networks: that of network directedness. The most well known network algorithms produce undirected networks, and we emphasize this point by highlighting how simple adaptations can instead produce directed networks.
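The point about directedness, and about using known statistics as a check on an implementation, can be illustrated with a minimal sketch. This is not the paper's pseudo-code: it is an assumed directed variant of the Erdős-Rényi G(n, p) model in which every ordered pair (i, j) with i ≠ j receives an edge independently with probability `p`. The function names are hypothetical.

```python
import random


def directed_erdos_renyi(n, p, seed=0):
    """Return a list of directed edges (source, target) with no self-loops."""
    rng = random.Random(seed)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and rng.random() < p
    ]


def degree_sequences(n, edges):
    """Return (in_degrees, out_degrees) as lists indexed by node."""
    in_deg = [0] * n
    out_deg = [0] * n
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg


if __name__ == "__main__":
    n, p = 200, 0.1
    edges = directed_erdos_renyi(n, p)
    in_deg, out_deg = degree_sequences(n, edges)
    # Partial verification: both means should be close to p * (n - 1).
    print(sum(in_deg) / n, sum(out_deg) / n, p * (n - 1))
```

Restricting the loop to pairs with i < j and treating each sampled pair as a symmetric connection gives the undirected version, so the choice between the two is a small change in code with potentially large consequences for the simulated network.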