Showing papers by "Santa Fe Institute published in 2002"


Journal ArticleDOI
TL;DR: This article proposes a method for detecting communities, built around the idea of using centrality indices to find community boundaries, and tests it on computer-generated and real-world graphs whose community structure is already known and finds that the method detects this known structure with high sensitivity and reliability.
Abstract: A number of recent studies have focused on the statistical properties of networked systems such as social networks and the Worldwide Web. Researchers have concentrated particularly on a few properties that seem to be common to many networks: the small-world property, power-law degree distributions, and network transitivity. In this article, we highlight another property that is found in many networks, the property of community structure, in which network nodes are joined together in tightly knit groups, between which there are only looser connections. We propose a method for detecting such communities, built around the idea of using centrality indices to find community boundaries. We test our method on computer-generated and real-world graphs whose community structure is already known and find that the method detects this known structure with high sensitivity and reliability. We also apply the method to two networks whose community structure is not well known—a collaboration network and a food web—and find that it detects significant and informative community divisions in both cases.

14,429 citations
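
The abstract above says only that centrality indices are used to find community boundaries. The sketch below is one way to illustrate that idea, assuming the centrality is edge betweenness (the number of shortest paths that run along an edge) and that the highest-scoring edge is removed repeatedly until the graph splits. The function names and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import deque


def edge_betweenness(adj):
    """Return {frozenset({u, v}): score} for an undirected, unweighted graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    sigma[w] = 0.0
                    preds[w] = []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing path counts among edges.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered pair of nodes was counted once from each end.
    return {edge: value / 2.0 for edge, value in score.items()}


def components(adj):
    """Return the connected components of the graph as a list of sets."""
    seen, found = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        found.append(comp)
    return found


def split_once(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {v: set(ns) for v, ns in adj.items()}
    start = len(components(adj))
    while len(components(adj)) == start:
        scores = edge_betweenness(adj)
        if not scores:
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy graph (hypothetical): two triangles joined by the single bridge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
communities = split_once(graph)
```

On this toy graph the bridge carries every shortest path between the two triangles, so it is removed first and the two triangles are returned as separate groups.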


Journal ArticleDOI
TL;DR: This work proposes a model of an assortatively mixed network and finds that networks percolate more easily if they are assortative and that they are also more robust to vertex removal.
Abstract: A network is said to show assortative mixing if the nodes in the network that have many connections tend to be connected to other nodes with many connections. Here we measure mixing patterns in a variety of networks and find that social networks are mostly assortatively mixed, but that technological and biological networks tend to be disassortative. We propose a model of an assortatively mixed network, which we study both analytically and numerically. Within this model we find that networks percolate more easily if they are assortative and that they are also more robust to vertex removal.

4,752 citations
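
The abstract does not state how mixing is measured. A common choice, assumed here purely for illustration, is the Pearson correlation between the degrees found at the two ends of each edge: positive values indicate assortative mixing and negative values disassortative mixing. The function name and the example graph are hypothetical.

```python
from math import sqrt


def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count each edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # all degrees equal: correlation is undefined
    return cov / sqrt(var_x * var_y)


# A star: one hub joined to four leaves, so high degree always meets low degree.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
r_star = degree_assortativity(star)
```

A star is the extreme disassortative case, so the coefficient for it is -1.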


Journal ArticleDOI
TL;DR: This paper shows that a large class of standard epidemiological models, the so-called susceptible/infective/removed (SIR) models can be solved exactly on a wide variety of networks.
Abstract: The study of social networks, and in particular the spread of disease on networks, has attracted considerable recent attention in the physics community. In this paper, we show that a large class of standard epidemiological models, the so-called susceptible/infective/removed (SIR) models can be solved exactly on a wide variety of networks. In addition to the standard but unrealistic case of fixed infectiveness time and fixed and uncorrelated probability of transmission between all pairs of individuals, we solve cases in which times and probabilities are nonuniform and correlated. We also consider one simple case of an epidemic in a structured population, that of a sexually transmitted disease in a population divided into men and women. We confirm the correctness of our exact solutions with numerical simulations of SIR epidemics on networks.

3,138 citations
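
The numerical check mentioned in the abstract can be pictured with a minimal simulation. The sketch below assumes the simplest case the abstract names, a fixed and uncorrelated transmission probability for every pair of connected individuals, and lets each infected node try once to infect each neighbour before it is removed. The function name, parameter names and ring network are hypothetical.

```python
import random
from collections import deque


def sir_outbreak_size(adj, seed_node, transmissibility, rng):
    """Return how many nodes are ever infected in one simulated outbreak."""
    ever_infected = {seed_node}
    queue = deque([seed_node])
    while queue:
        v = queue.popleft()  # v makes its attempts and is then removed
        for w in adj[v]:
            if w not in ever_infected and rng.random() < transmissibility:
                ever_infected.add(w)
                queue.append(w)
    return len(ever_infected)


# Hypothetical contact network: a ring of ten individuals.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
rng = random.Random(1)
size_certain = sir_outbreak_size(ring, 0, 1.0, rng)  # every contact transmits
size_never = sir_outbreak_size(ring, 0, 0.0, rng)    # no contact transmits
```

Averaging the returned size over many runs and many seed nodes gives the kind of estimate that could be compared against an analytic prediction.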


Journal ArticleDOI
TL;DR: Food-web structure mediates dramatic effects of biodiversity loss including secondary and ‘cascading’ extinctions and robustness increases with food-web connectance but appears independent of species richness and omnivory.
Abstract: Food-web structure mediates dramatic effects of biodiversity loss including secondary and ‘cascading’ extinctions. We studied these effects by simulating primary species loss in 16 food webs from terrestrial and aquatic ecosystems and measuring robustness in terms of the secondary extinctions that followed. As observed in other networks, food webs are more robust to random removal of species than to selective removal of species with the most trophic links to other species. More surprisingly, robustness increases with food-web connectance but appears independent of species richness and omnivory. In particular, food webs experience ‘rivet-like’ thresholds past which they display extreme sensitivity to removal of highly connected species. Higher connectance delays the onset of this threshold. Removing species with few trophic connections generally has little effect though there are several striking exceptions. These findings emphasize how the number of species removed affects ecosystems differently depending on the trophic functions of species removed.

1,466 citations
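
The simulation described above can be sketched in a few lines. The extinction rule used here is an assumption, since the abstract does not spell it out: a consumer goes secondarily extinct once all of its prey are gone, and basal species (those with no prey) are lost only by primary removal. The function name and the five-species web are hypothetical.

```python
def secondary_extinctions(prey_of, removed):
    """Return the species lost as a consequence of removing `removed`.

    prey_of maps each species to the set of species it eats;
    an empty set marks a basal species.
    """
    gone = set(removed)
    changed = True
    while changed:  # repeat until the cascade stops
        changed = False
        for species, prey in prey_of.items():
            if species in gone or not prey:
                continue
            if prey <= gone:  # every prey item has disappeared
                gone.add(species)
                changed = True
    return gone - set(removed)


# Hypothetical toy food web.
web = {
    "plant": set(),
    "algae": set(),
    "herbivore": {"plant"},
    "omnivore": {"plant", "algae"},
    "predator": {"herbivore"},
}
lost_after_plant = secondary_extinctions(web, {"plant"})
lost_after_algae = secondary_extinctions(web, {"algae"})
```

Removing the plant takes the herbivore and then the predator with it, while the omnivore survives on algae. Removing algae alone causes no secondary loss.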


Journal ArticleDOI
TL;DR: It is found that in some cases, the models are in remarkable agreement with the data, whereas in others the agreement is poorer, perhaps indicating the presence of additional social structure in the network that is not captured by the random graph.
Abstract: We describe some new exactly solvable models of the structure of social networks, based on random graphs with arbitrary degree distributions. We give models both for simple unipartite networks, such as acquaintance networks, and bipartite networks, such as affiliation networks. We compare the predictions of our models to data for a number of real-world social networks and find that in some cases, the models are in remarkable agreement with the data, whereas in others the agreement is poorer, perhaps indicating the presence of additional social structure in the network that is not captured by the random graph.

1,408 citations


Journal ArticleDOI
17 May 2002-Science
TL;DR: A model is presented that explains social network searchability in terms of recognizable personal identities (sets of characteristics measured along a number of social dimensions) and defines a class of searchable networks, with a search method that may be applicable to many network search problems.
Abstract: Social networks have the surprising property of being "searchable": Ordinary people are capable of directing messages through their network of acquaintances to reach a specific but distant target person in only a few steps. We present a model that offers an explanation of social network searchability in terms of recognizable personal identities: sets of characteristics measured along a number of social dimensions. Our model defines a class of searchable networks and a method for searching them that may be applicable to many network search problems, including the location of data files in peer-to-peer networks, pages on the World Wide Web, and information in distributed databases.

1,015 citations


Journal ArticleDOI
TL;DR: This article asks how level the intergenerational playing field is, what causal mechanisms underlie the intergenerational transmission of economic status, and whether those mechanisms are amenable to public policies that would make the attainment of economic success more fair.
Abstract: How level is the intergenerational playing field? What are the causal mechanisms that underlie the intergenerational transmission of economic status? Are these mechanisms amenable to public policies in a way that would make the attainment of economic success more fair? These are the questions we will try to answer.

857 citations


Journal ArticleDOI
02 May 2002-Nature
TL;DR: A general model is derived, based on first principles of allometry and biochemical kinetics, that predicts the time of ontogenetic development as a function of body mass and temperature, and suggests a general definition of biological time that is approximately invariant and common to all organisms.
Abstract: Body size and temperature are the two most important variables affecting nearly all biological rates and times. The relationship of size and temperature to development is of particular interest, because during ontogeny size changes and temperature often varies. Here we derive a general model, based on first principles of allometry and biochemical kinetics, that predicts the time of ontogenetic development as a function of body mass and temperature. The model fits embryonic development times spanning a wide range of egg sizes and incubation temperatures for birds and aquatic ectotherms (fish, amphibians, aquatic insects and zooplankton). The model also describes nearly 75% of the variation in post-embryonic development among a diverse sample of zooplankton. The remaining variation is partially explained by stoichiometry, specifically the whole-body carbon to phosphorus ratio. Development in other animals at other life stages is also described by this model. These results suggest a general definition of biological time that is approximately invariant and common to all organisms.

841 citations


Journal ArticleDOI
TL;DR: The structure of the directed social network formed by email address books, over which email viruses spread, is investigated empirically using data drawn from a large computer installation, and the implications for the understanding and prevention of computer virus epidemics are discussed.
Abstract: Many computer viruses spread via electronic mail, making use of computer users' email address books as a source for email addresses of new victims. These address books form a directed social network of connections between individuals over which the virus spreads. Here we investigate empirically the structure of this network using data drawn from a large computer installation, and discuss the implications of this structure for the understanding and prevention of computer virus epidemics.

808 citations


Journal ArticleDOI
TL;DR: This work presents a method for computing the consensus structure of a set of aligned RNA sequences, taking into account both thermodynamic stability and sequence covariation, and shows that the Early Noduline mRNA contains significant secondary structure that is supported by sequence covariation.

666 citations


Journal ArticleDOI
TL;DR: The analysis of some species-rich, well-defined food webs shows that they display the so-called small world behavior shared by a number of disparate complex systems, suggesting that communities might be self-organized in a non-random fashion that might have important consequences in their resistance to perturbations.

Journal ArticleDOI
TL;DR: It is suggested that antibiotic interactions within microbial communities may be very effective in maintaining diversity, based on a spatially explicit game theoretical model with multiply cyclic dominance structures.
Abstract: Evolutionary processes generating biodiversity and ecological mechanisms maintaining biodiversity seem to be diverse themselves. Conventional explanations of biodiversity such as niche differentiation, density-dependent predation pressure, or habitat heterogeneity seem satisfactory to explain diversity in communities of macrobial organisms such as higher plants and animals. For a long time the often high diversity among microscopic organisms in seemingly uniform environments, the famous "paradox of the plankton," has been difficult to understand. The biodiversity in bacterial communities has been shown to be sometimes orders of magnitudes higher than the diversity of known macrobial systems. Based on a spatially explicit game theoretical model with multiply cyclic dominance structures, we suggest that antibiotic interactions within microbial communities may be very effective in maintaining diversity.

Journal ArticleDOI
TL;DR: In this paper, a market maker based method of price formation is used to study the price dynamics induced by several commonly used financial trading strategies, showing how they amplify noise, induce structure in prices, and cause phenomena such as excess and clustered volatility.
Abstract: A deterministic trading strategy can be regarded as a signal processing element that uses external information and past prices as inputs and incorporates them into future prices. This paper uses a market maker based method of price formation to study the price dynamics induced by several commonly used financial trading strategies, showing how they amplify noise, induce structure in prices, and cause phenomena such as excess and clustered volatility.

Journal ArticleDOI
TL;DR: It is shown that species within large communities from a variety of aquatic and terrestrial ecosystems are on average two links apart, with >95% of species typically within three links of each other, which indicates that the dynamics of species within ecosystems may be more highly interconnected than previously thought.
Abstract: Feeding relationships can cause invasions, extirpations, and population fluctuations of a species to dramatically affect other species within a variety of natural habitats. Empirical evidence suggests that such strong effects rarely propagate through food webs more than three links away from the initial perturbation. However, the size of these spheres of potential influence within complex communities is generally unknown. Here, we show that species within large communities from a variety of aquatic and terrestrial ecosystems are on average two links apart, with >95% of species typically within three links of each other. Species are drawn even closer as network complexity and, more unexpectedly, species richness increase. Our findings are based on seven of the largest and most complex food webs available as well as a food-web model that extends the generality of the empirical results. These results indicate that the dynamics of species within ecosystems may be more highly interconnected and that biodiversity loss and species invasions may affect more species than previously thought.
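
The two statistics quoted above (the average number of links between species, and the share of species pairs within three links) can be computed with a breadth-first search. The sketch below assumes that feeding links are treated as undirected and that only pairs connected by some path are counted; the function name and the five-species chain are hypothetical.

```python
from collections import deque


def link_distance_summary(feeding_links, within=3):
    """Return (mean distance, fraction of connected pairs within `within` links)."""
    adj = {}
    for consumer, resource in feeding_links:
        adj.setdefault(consumer, set()).add(resource)
        adj.setdefault(resource, set()).add(consumer)
    total = pairs = close = 0
    for start in adj:
        # Breadth-first search gives shortest distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != start:
                total += d
                pairs += 1
                close += d <= within
    return total / pairs, close / pairs


# Hypothetical chain of five species: 0 eats 1, 1 eats 2, and so on.
chain = [(0, 1), (1, 2), (2, 3), (3, 4)]
mean_links, share_within_three = link_distance_summary(chain)
```

For the chain the mean distance is 2.0 links and nine of the ten species pairs lie within three links of each other.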

Journal ArticleDOI
TL;DR: Recent progress and future prospects are described for understanding the mechanisms that generate power laws, and for explaining the diversity of species and complexity of ecosystems in terms of fundamental principles of physical and biological science.
Abstract: Underlying the diversity of life and the complexity of ecology is order that reflects the operation of fundamental physical and biological processes. Power laws describe empirical scaling relationships that are emergent quantitative features of biodiversity. These features are patterns of structure or dynamics that are self-similar or fractal-like over many orders of magnitude. Power laws allow extrapolation and prediction over a wide range of scales. Some appear to be universal, occurring in virtually all taxa of organisms and types of environments. They offer clues to underlying mechanisms that powerfully constrain biodiversity. We describe recent progress and future prospects for understanding the mechanisms that generate these power laws, and for explaining the diversity of species and complexity of ecosystems in terms of fundamental principles of physical and biological science.

Journal ArticleDOI
TL;DR: This work presents a simple model of proteome evolution that is able to reproduce many of the observed statistical regularities reported from the analysis of the yeast proteome, and suggests that the observed patterns can be explained by a process of gene duplication and diversification that would evolve proteome networks under a selection pressure.
Abstract: The next step in the understanding of the genome organization, after the determination of complete sequences, involves proteomics. The proteome includes the whole set of protein-protein interactions, and two recent independent studies have shown that its topology displays a number of surprising features shared by other complex networks, both natural and artificial. In order to understand the origins of this topology and its evolutionary implications, we present a simple model of proteome evolution that is able to reproduce many of the observed statistical regularities reported from the analysis of the yeast proteome. Our results suggest that the observed patterns can be explained by a process of gene duplication and diversification that would evolve proteome networks under a selection pressure, favoring robustness against failure of its individual components.

Journal ArticleDOI
01 Nov 2002-EPL
TL;DR: This letter presents the first evidence for the emergence of scaling (and the presence of small-world behavior) in software architecture graphs from a well-defined local optimization process, and the consequences for other complex networks.
Abstract: A large number of complex networks, both natural and artificial, share the presence of highly heterogeneous, scale-free degree distributions. A few mechanisms for the emergence of such patterns have been suggested, optimization not being one of them. In this letter we present the first evidence for the emergence of scaling (and the presence of small-world behavior) in software architecture graphs from a well-defined local optimization process. Although the rules that define the strategies involved in software engineering should lead to a tree-like structure, the final net is scale-free, perhaps reflecting the presence of conflicting constraints unavoidable in a multidimensional optimization process. The consequences for other complex networks are outlined.

Journal ArticleDOI
TL;DR: In financial markets, an excess of buying tends to drive prices up and an excess of selling tends to drive them down; this is called market impact. A unique functional form for market impact is derived from a simplified model of market making and used to formulate a nonequilibrium theory of price formation, in which a market is regarded as a financial ecology.
Abstract: In financial markets an excess of buying tends to drive prices up, and an excess of selling tends to drive them down. This is called market impact. Based on a simplified model for market making, it is possible to derive a unique functional form for market impact. This can be used to formulate a nonequilibrium theory for price formation. Commonly used trading strategies such as value investing and trend following induce characteristic dynamics in the price. Although there is a tendency for self-fulfilling prophecies, this is not always the case; in particular, many value investing strategies fail to make prices reflect values. When there is a diversity of perceived values, nonlinear strategies give rise to excess volatility. Many market phenomena such as trends and temporal correlations in volume and volatility have simple explanations. The theory is both simple and experimentally testable. Under this theory there is an emphasis on the interrelationships of strategies that makes it natural to regard a market as a financial ecology. A variety of examples show how diversity emerges automatically as new strategies exploit the inefficiencies of old strategies. This results in capital reallocations that evolve on longer timescales, and cause apparent nonstationarities on shorter timescales. The drive toward market efficiency can be studied in the dynamical context of pattern evolution. The evolution of the capital of a strategy is analogous to the evolution of the population of a biological species. Several different arguments suggest that the timescale for market efficiency is years to decades.

Journal ArticleDOI
TL;DR: It is proposed that antiredundancy is as important for developmental robustness as redundancy, and is an essential mechanism for ensuring tissue-level stability in complex multicellular organisms.
Abstract: Genetic mutations that lead to undetectable or minimal changes in phenotypes are said to reveal redundant functions. Redundancy is common among phenotypes of higher organisms that experience low mutation rates and small population sizes. Redundancy is less common among organisms with high mutation rates and large populations, or among the rapidly dividing cells of multicellular organisms. In these cases, one even observes the opposite tendency: a hypersensitivity to mutation, which we refer to as antiredundancy. In this paper we analyze the evolutionary dynamics of redundancy and antiredundancy. Assuming a cost of redundancy, we find that large populations will evolve antiredundant mechanisms for removing mutants and thereby bolster the robustness of wild-type genomes; whereas small populations will evolve redundancy to ensure that all individuals have a high chance of survival. We propose that antiredundancy is as important for developmental robustness as redundancy, and is an essential mechanism for ensuring tissue-level stability in complex multicellular organisms. We suggest that antiredundancy deserves greater attention in relation to cancer, mitochondrial disease, and virus infection.

Posted Content
Mark Newman
TL;DR: In this article, generalized random graph models of both directed and undirected networks that incorporate arbitrary non-Poisson degree distributions, and extensions of these models that incorporate clustering too are described.
Abstract: The random graph of Erdos and Renyi is one of the oldest and best studied models of a network, and possesses the considerable advantage of being exactly solvable for many of its average properties. However, as a model of real-world networks such as the Internet, social networks or biological networks it leaves a lot to be desired. In particular, it differs from real networks in two crucial ways: it lacks network clustering or transitivity, and it has an unrealistic Poissonian degree distribution. In this paper we review some recent work on generalizations of the random graph aimed at correcting these shortcomings. We describe generalized random graph models of both directed and undirected networks that incorporate arbitrary non-Poisson degree distributions, and extensions of these models that incorporate clustering too. We also describe two recent applications of random graph models to the problems of network robustness and of epidemics spreading on contact networks.
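
One standard way to build a random graph with a prescribed degree sequence is to give each node as many "stubs" as its degree and pair the stubs at random. The sketch below assumes that construction for an undirected graph; it is an illustration rather than the review's own procedure, it allows self-loops and repeated edges, and the function name and degree sequence are hypothetical.

```python
import random


def random_graph_from_degrees(degrees, rng):
    """Pair stubs uniformly at random; return a list of undirected edges."""
    # Node i appears in the stub list degrees[i] times.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    if len(stubs) % 2:
        raise ValueError("the degrees must sum to an even number")
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list are joined into edges.
    return list(zip(stubs[::2], stubs[1::2]))


# Hypothetical non-Poisson degree sequence: one hub and several low-degree nodes.
wanted = [5, 2, 2, 1, 1, 1]
edges = random_graph_from_degrees(wanted, random.Random(7))

# Recount degrees from the edges; a self-loop contributes two to its node.
realised = [0] * len(wanted)
for u, v in edges:
    realised[u] += 1
    realised[v] += 1
```

The recount at the end reproduces the requested degrees exactly, which is the property that makes this construction convenient for arbitrary degree distributions.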

Journal ArticleDOI
TL;DR: This work integrated the global sequence and immunology databases to systematically explore the relationship between HIV-1 amino acid sequences and CTL epitope distributions, and identified distinct characteristics of HIV amino acid sequences that correlate with CTL epitope localization.
Abstract: The human cytotoxic T-lymphocyte (CTL) response to human immunodeficiency virus type 1 (HIV-1) has been intensely studied, and hundreds of CTL epitopes have been experimentally defined, published, and compiled in the HIV Molecular Immunology Database. Maps of CTL epitopes on HIV-1 protein sequences reveal that defined epitopes tend to cluster. Here we integrate the global sequence and immunology databases to systematically explore the relationship between HIV-1 amino acid sequences and CTL epitope distributions. CTL responses to five HIV-1 proteins, Gag p17, Gag p24, reverse transcriptase (RT), Env, and Nef, have been particularly well characterized in the literature to date. Through comparing CTL epitope distributions in these five proteins to global protein sequence alignments, we identified distinct characteristics of HIV amino acid sequences that correlate with CTL epitope localization. First, experimentally defined HIV CTL epitopes are concentrated in relatively conserved regions. Second, the highly variable regions that lack epitopes bear cumulative evidence of past immune escape that may make them relatively refractive to CTLs: a paucity of predicted proteasome processing sites and an enrichment for amino acids that do not serve as C-terminal anchor residues. Finally, CTL epitopes are more highly concentrated in alpha-helical regions of proteins. Based on amino acid sequence characteristics, in a blinded fashion, we predicted regions in HIV regulatory and accessory proteins that would be likely to contain CTL epitopes; these predictions were then validated by comparison to new sets of experimentally defined epitopes in HIV-1 Rev, Tat, Vif, and Vpr.

Journal ArticleDOI
Mark Newman
TL;DR: The author reviews some of the interesting issues in this area of characterization and modeling of networks and recounts some recent work on these issues by himself and by others.

Journal ArticleDOI
TL;DR: An updated review of nonextensive statistical mechanics and thermodynamics is colloquially presented in this article, where the value of q −1 (entropic nonextensivity) is used as a simple and efficient way to provide, at least for some classes of systems, some characterization of the degree of complexity.
Abstract: An updated review (corresponding to the inaugural talk delivered at the International Workshop on Classical and Quantum Complexity and Nonextensive Thermodynamics, Denton, TX, April 3–6, 2000) of nonextensive statistical mechanics and thermodynamics is colloquially presented. Quite naturally the possibility emerges for using the value of q −1 (entropic nonextensivity) as a simple and efficient manner to provide, at least for some classes of systems, some characterization of the degree of what is currently referred to as complexity (M. Gell-Mann, The Quark and the Jaguar, Freeman, New York, 1994). A few historical digressions are included as well.

Posted Content
TL;DR: This article provides a simple stochastic OLG model with a cyclical structure that generates cyclical P/E ratios, and calibrates the model to roughly fit the cyclical features of historical P/E ratios.
Abstract: Stock market price/earnings ratios should be influenced by demography. Since demography is predictable, stock returns should be as well. We provide a simple stochastic OLG model with a cyclical structure which generates cyclical P/E ratios. We calibrate the model to roughly fit the cyclical features of historical P/E ratios.

Journal ArticleDOI
TL;DR: Here different characteristic features of complex nets, as well as their behavior under different sources of perturbation, are considered.
Abstract: Here different characteristic features of complex nets, as well as their behavior under different sources of perturbation, are considered.
Table: Summary of the Basic Features that Relate and Distinguish Different Types of Complex Networks, Both Natural and Artificial
Property | Proteomics | Ecology | Language | Technology
Tinkering | Gene duplication and recruitation | Local assemblages from regional species pools and priority effects | Creation of words from already established ones | Reutilization of modules and components
Hubs | Cellular signaling genes (e.g., p53) | Omnivorous and most abundant species | Function words | Most used components
What can be optimized? | Communication speed and linking cost | Unclear | Communication speed with restrictions | Minimize development effort within constraints
Failures | Small phenotypic effect of random mutations | Loss of only a few species-specific functions | Maintenance of expression and communication | Loss of functionality
Attacks | Large alterations of cell-cycle and apoptosis (e.g., cancer) | Many coextinctions and loss of several ecosystems functions | Agrammatism (i.e., great difficulties for building complex sentences) | Avalanches of changes and large development costs
Redundancy (R) and degeneracy (D) | Redundant genes rapidly lost | R minimized and D restricted to non-keystone species | Great D | Certain degree of R but no D

Journal ArticleDOI
TL;DR: The folding of RNA sequences into secondary structures is a simple yet biophysically grounded model of a genotype-phenotype map that has uncovered a surprisingly rich statistical structure characterized by shape space covering, neutral networks and plastogenetic congruence.
Abstract: The folding of RNA sequences into secondary structures is a simple yet biophysically grounded model of a genotype-phenotype map. Its computational and mathematical analysis has uncovered a surprisingly rich statistical structure characterized by shape space covering, neutral networks and plastogenetic congruence. I review these concepts and discuss their evolutionary implications.

Journal ArticleDOI
16 Aug 2002-Science
TL;DR: This Perspective explores a new study that provides experimental evidence for stochasticity in bacterial gene expression, and reflects on the importance of stochastic processes for the evolution and adaptation of organisms.
Abstract: Is the regulation of gene expression in a cell a random (stochastic) process? In an ambitious Perspective, [Fedoroff and Fontana][1] explore a new study [(Elowitz et al.)][2] that provides experimental evidence for stochasticity in bacterial gene expression. They reflect on the importance of stochastic processes for the evolution and adaptation of organisms. [1]: http://www.sciencemag.org/cgi/content/full/297/5584/1129 [2]: http://www.sciencemag.org/cgi/content/short/297/5584/1183

Journal ArticleDOI
TL;DR: An analytic solution of this model using a combination of generating function methods and high-order series expansion gives accurate predictions for quantities such as the position of the percolation threshold and the typical size of disease outbreaks as a function of the density of "shortcuts" in the small-world network.
Abstract: Percolation on two-dimensional small-world networks has been proposed as a model for the spread of plant diseases. In this paper we give an analytic solution of this model using a combination of generating function methods and high-order series expansion. Our solution gives accurate predictions for quantities such as the position of the percolation threshold and the typical size of disease outbreaks as a function of the density of "shortcuts" in the small-world network. Our results agree with scaling hypotheses and numerical simulations for the same model.

Journal ArticleDOI
TL;DR: In this article, the authors demonstrate a striking regularity in the way people place limit orders in financial markets, using a data set of roughly two million orders from the London Stock Exchange: the unconditional cumulative distribution of relative limit prices decays roughly as a power law with exponent approximately −1.5.
Abstract: In this paper we demonstrate a striking regularity in the way people place limit orders in financial markets, using a data set consisting of roughly two million orders from the London Stock Exchange. We define the relative limit price as the difference between the limit price and the best price available. Merging the data from 50 stocks, we demonstrate that for both buy and sell orders, the unconditional cumulative distribution of relative limit prices decays roughly as a power law with exponent approximately −1.5. This behaviour spans more than two decades, ranging from a few ticks to about 2000 ticks. Time series of relative limit prices show interesting temporal structure, characterized by an autocorrelation function that asymptotically decays as C(τ) ∼ τ^(−0.4). Furthermore, relative limit price levels are positively correlated with and are led by price volatility. This feedback may potentially contribute to clustered volatility.
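
A cumulative distribution that decays as a power law with exponent −1.5 means that the probability of a relative limit price exceeding x falls off as x^(−1.5). The sketch below shows how such an exponent might be estimated from data. It runs on synthetic samples only, not on the London Stock Exchange orders, and the maximum-likelihood (Hill-type) estimator and all names in it are assumptions made for illustration, not the paper's fitting procedure.

```python
import math
import random


def sample_power_law(n, exponent, x_min, rng):
    """Draw n values with P(X > x) = (x / x_min) ** -exponent for x >= x_min."""
    # Inverse-transform sampling; 1 - random() lies in (0, 1], so no division by zero.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / exponent) for _ in range(n)]


def tail_exponent(values, x_min):
    """Maximum-likelihood estimate of the cumulative tail exponent above x_min."""
    tail = [x for x in values if x >= x_min]
    return len(tail) / sum(math.log(x / x_min) for x in tail)


rng = random.Random(0)
synthetic_prices = sample_power_law(20000, 1.5, 1.0, rng)  # in units of ticks
estimate = tail_exponent(synthetic_prices, 1.0)
```

With 20,000 synthetic points the estimate lands close to the 1.5 used to generate them. On real order data the choice of `x_min` and the range over which the power law holds would both need checking.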

Journal ArticleDOI
TL;DR: A robust semi-parametric normalization technique is developed, based on the assumption that the large majority of genes will not have their relative expression levels changed from one treatment group to the next, and on the assumption that departures of the response from linearity are small and slowly varying.
Abstract: Background: With the advent of DNA hybridization microarrays comes the remarkable ability, in principle, to simultaneously monitor the expression levels of thousands of genes. The quantitative comparison of two or more microarrays can reveal, for example, the distinct patterns of gene expression that define different cellular phenotypes or the genes induced in the cellular response to insult or changing environmental conditions. Normalization of the measured intensities is a prerequisite of such comparisons, and indeed, of any statistical analysis, yet insufficient attention has been paid to its systematic study. The most straightforward normalization techniques in use rest on the implicit assumption of linear response between true expression level and output intensity. We find that these assumptions are not generally met, and that these simple methods can be improved.