
Showing papers by Duncan J. Watts published in 2016


Journal ArticleDOI
15 Apr 2016 - PLOS ONE
TL;DR: This study reports results from an online experiment in which 47 teams of size ranging from n = 1 to 32 collaborated on a realistic crisis mapping task, finding that individuals in teams exerted lower overall effort than independent workers, in part by allocating their effort to less demanding sub-tasks; however, individuals in teams also collaborated more as team size increased.
Abstract: The relationship between team size and productivity is a question of broad relevance across economics, psychology, and management science. For complex tasks, however, where both the potential benefits and costs of coordinated work increase with the number of workers, neither theoretical arguments nor empirical evidence consistently favor larger vs. smaller teams. Experimental findings, meanwhile, have relied on small groups and highly stylized tasks, and hence are hard to generalize to realistic settings. Here we narrow the gap between real-world task complexity and experimental control, reporting results from an online experiment in which 47 teams of size ranging from n = 1 to 32 collaborated on a realistic crisis mapping task. We find that individuals in teams exerted lower overall effort than independent workers, in part by allocating their effort to less demanding (and less productive) sub-tasks; however, we also find that individuals in teams collaborated more with increasing team size. Directly comparing these competing effects, we find that the largest teams outperformed an equivalent number of independent workers, suggesting that gains to collaboration dominated losses to effort. Importantly, these teams also performed comparably to a field deployment of crisis mappers, suggesting that experiments of the type described here can help solve practical problems as well as advance the science of collective intelligence.

96 citations


Proceedings ArticleDOI
11 Apr 2016
TL;DR: In this article, a simple stylized model of success is presented, which attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand, and inherent unpredictability of complex social systems on the other.
Abstract: How predictable is success in complex social systems? In spite of a recent profusion of prediction studies that exploit online social and information network data, this question remains unanswered, in part because it has not been adequately specified. In this paper we attempt to clarify the question by presenting a simple stylized model of success that attributes prediction error to one of two generic sources: insufficiency of available data and/or models on the one hand; and inherent unpredictability of complex social systems on the other. We then use this model to motivate an illustrative empirical study of information cascade size prediction on Twitter. Despite an unprecedented volume of information about users, content, and past performance, our best performing models can explain less than half of the variance in cascade sizes. In turn, this result suggests that even with unlimited data predictive performance would be bounded well below deterministic accuracy. Finally, we explore this potential bound theoretically using simulations of a diffusion process on a random scale-free network similar to Twitter. We show that although higher predictive power is possible in theory, such performance requires a homogeneous system and perfect ex-ante knowledge of it: even a small degree of uncertainty in estimating product quality or slight variation in quality across products leads to substantially more restrictive bounds on predictability. We conclude that realistic bounds on predictive accuracy are not dissimilar from those we have obtained empirically, and that such bounds for other complex social systems for which data is more difficult to obtain are likely even lower.
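
The theoretical exercise in the abstract (diffusion on a random scale-free network with imperfectly known quality) can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the Barabási-Albert graph as a stand-in for the Twitter-like network, the independent-cascade spreading rule, and all parameter values are assumptions chosen only to make the variance-decomposition idea concrete.

```python
import random
import networkx as nx

def cascade_size(G, quality, seed):
    """Independent-cascade diffusion: each newly exposed neighbor adopts
    with probability `quality`. Returns the final number of adopters."""
    adopted, frontier = {seed}, [seed]
    while frontier:
        nxt = []
        for u in frontier:
            for v in G.neighbors(u):
                if v not in adopted and random.random() < quality:
                    adopted.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(adopted)

random.seed(0)
G = nx.barabasi_albert_graph(10_000, 2)   # heavy-tailed degrees, Twitter-like
qualities, sizes = [], []
for _ in range(2000):
    q = random.uniform(0.01, 0.05)        # per-exposure adoption probability
    qualities.append(q)
    sizes.append(cascade_size(G, q, random.randrange(G.number_of_nodes())))

# Even a predictor that knows q exactly can explain at most the
# between-quality share of the variance in cascade size; estimate that
# share by grouping runs with (nearly) identical quality.
mean = sum(sizes) / len(sizes)
total = sum((s - mean) ** 2 for s in sizes) / len(sizes)
groups = {}
for q, s in zip(qualities, sizes):
    groups.setdefault(round(q, 3), []).append(s)
between = sum(len(g) * (sum(g) / len(g) - mean) ** 2
              for g in groups.values()) / len(sizes)
print(f"upper bound on R^2 with perfect knowledge of quality: {between / total:.2f}")
```

Run-to-run randomness in who exposes whom puts the remaining variance out of reach, which is the sense in which predictive performance is bounded well below deterministic accuracy even with unlimited data.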

69 citations


Journal ArticleDOI
TL;DR: A virtual lab experiment in which 94 subjects play up to 400 ten-round games of Prisoner's Dilemma over the course of twenty consecutive weekdays predicts that a sufficiently large minority of resilient cooperators can permanently stabilize unraveling among a majority of rational players.
Abstract: The dynamics of learning in finitely repeated games of cooperation remains an open question in large part because the timescale on which learning takes place is much longer than that of traditional lab experiments. Here we report results of a “virtual lab” experiment in which 94 subjects play up to 400 ten-round games of Prisoner's Dilemma over the course of twenty consecutive weekdays. Consistent with previous work, the typical round at which players first defect creeps steadily earlier over the first several days; however, this unraveling process slows after roughly one week and remains stable for the rest of the experiment. Analyzing individual strategies shows that roughly 40% of players resist the temptation to unravel, cooperating conditionally throughout the experiment, even at a significant cost to themselves. We call these players resilient cooperators. Finally, using a standard learning model we predict that the presence of more than a critical fraction of resilient cooperators can permanently stabilize unraveling among a majority of rational players. These results shed new and hopeful light on the long-term dynamics of cooperation, and demonstrate the importance of conducting behavioral experiments on longer timescales than previously contemplated.
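
The unraveling-and-stabilization dynamic is easy to caricature in simulation. The sketch below is not the paper's fitted learning model: the myopic best-response rule (defect one round before your opponent, or in the last round against a full cooperator), the learning rate, and the random matching scheme are all illustrative assumptions.

```python
import random

ROUNDS = 10          # rounds per game, as in the experiment
NEVER = ROUNDS + 1   # stands for "cooperate through the final round"
ALPHA = 0.2          # learning rate (assumed)

def simulate(n_players=94, frac_resilient=0.4, n_days=20, games_per_day=20):
    resilient = [i < int(frac_resilient * n_players) for i in range(n_players)]
    plan = [float(NEVER)] * n_players   # planned round of first defection
    for day in range(n_days):
        for _ in range(games_per_day):
            order = list(range(n_players))
            random.shuffle(order)
            for i, j in zip(order[::2], order[1::2]):
                # best response: defect just before the opponent does, but
                # no earlier than the last round against a full cooperator
                ti = max(1.0, min(plan[j] - 1, ROUNDS))
                tj = max(1.0, min(plan[i] - 1, ROUNDS))
                if not resilient[i]:
                    plan[i] += ALPHA * (ti - plan[i])
                if not resilient[j]:
                    plan[j] += ALPHA * (tj - plan[j])
        rational = [p for p, r in zip(plan, resilient) if not r]
        print(f"day {day + 1:2d}: mean planned first defection "
              f"(rational players) = {sum(rational) / len(rational):.2f}")

random.seed(1)
simulate()
```

With 40% resilient cooperators, the rational players' first-defection round drifts earlier and then plateaus (here around round 8.5) rather than unraveling to the first round, because every match against a conditional cooperator pulls the best response back toward the end of the game.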

22 citations


Reference EntryDOI
Duncan J. Watts
14 Apr 2016

12 citations


Proceedings ArticleDOI
Duncan J. Watts
13 Aug 2016
TL;DR: In this article, the authors highlight examples of research that would not have been possible just a handful of years ago and that illustrate both the promise of computational social science and its limitations.
Abstract: The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers, leading some to herald the emergence of a new field: "computational social science." Against these exciting developments stands a stubborn fact: that in spite of many thousands of published papers, there has been surprisingly little progress on the "big" questions that motivated the field in the first place: questions concerning systemic risk in financial systems, problem solving in complex organizations, and the dynamics of epidemics or social movements, among others. In this talk I highlight some examples of research that would not have been possible just a handful of years ago and that illustrate the promise of CSS. At the same time, they illustrate its limitations. I then conclude with some thoughts on how CSS can bridge the gap between its current state and its potential.

12 citations


Posted Content
28 Nov 2016
TL;DR: This paper investigates a particular scenario in time series data, the split-door setting, that permits causal identification in the presence of unobserved confounders, and presents an algorithm to automatically find such scenarios; when both parts of the outcome variable are caused by the same (unobserved) confounders, the problem of identification reduces to that of testing for independence among observed variables.
Abstract: Unobserved or unknown confounders complicate even the simplest attempts to estimate the effect of one variable on another using observational data. When cause and effect are both affected by unobserved confounders, methods based on identifying natural experiments have been proposed to eliminate confounds. However, their validity is hard to verify because they depend on assumptions about the independence of variables that, by definition, cannot be measured. In this paper we investigate a particular scenario in time series data that permits causal identification in the presence of unobserved confounders and present an algorithm to automatically find such scenarios. Specifically, we examine what we call the split-door setting, when the effect variable can be split up into two parts: one that is potentially affected by the cause, and another that is independent of it. We show that when both of these variables are caused by the same (unobserved) confounders, the problem of identification reduces to that of testing for independence among observed variables. We discuss various situations in which split-door variables are commonly recorded in both online and offline settings, and demonstrate the method by estimating the causal impact of Amazon's recommender system, obtaining more than 23,000 natural experiments that provide similar, but more precise, estimates than past studies.
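
As a concrete (and heavily simplified) sketch of the split-door idea, not the authors' implementation: slice the paired time series into windows, keep the windows in which the candidate cause appears independent of the "direct" outcome component, and estimate the effect only there. The window length, the Pearson correlation as the independence check, and the within-window regression are all stand-in assumptions.

```python
import numpy as np
from scipy.stats import pearsonr

def split_door_estimate(x, y_ref, y_direct, window=15, p_threshold=0.05):
    """x: candidate cause (e.g., recommendation click-throughs);
    y_ref: outcome component possibly caused by x;
    y_direct: outcome component that should not be caused by x."""
    effects = []
    for start in range(0, len(x) - window + 1, window):
        xs, yr, yd = (a[start:start + window] for a in (x, y_ref, y_direct))
        if np.std(xs) == 0 or np.std(yd) == 0:
            continue
        _, p = pearsonr(xs, yd)
        if p > p_threshold:                           # cannot reject independence,
            effects.append(np.polyfit(xs, yr, 1)[0])  # so treat window as identified
    return float(np.mean(effects)) if effects else float("nan")

# Toy data: demand (the unobserved confounder) is constant within each window.
rng = np.random.default_rng(0)
demand = np.repeat(rng.poisson(20, 20), 15).astype(float)
x = demand + rng.normal(0, 3, 300)
y_ref = 0.3 * x + rng.normal(0, 1, 300)      # truly caused by x
y_direct = demand + rng.normal(0, 3, 300)    # confounded, but not caused by x
print(split_door_estimate(x, y_ref, y_direct))   # recovers roughly 0.3
```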

8 citations


Posted Content
TL;DR: It is found that the widely-used click-through rate (CTR) metric overestimates the causal impact of recommender systems; depending on the product category, an estimated 50-80% of the traffic attributed to recommender systems would have happened even without any recommendations.
Abstract: We present a method for estimating causal effects in time series data when fine-grained information about the outcome of interest is available. Specifically, we examine what we call the split-door setting, where the outcome variable can be split into two parts: one that is potentially affected by the cause being studied and another that is independent of it, with both parts sharing the same (unobserved) confounders. We show that under these conditions, the problem of identification reduces to that of testing for independence among observed variables, and present a method that uses this approach to automatically find subsets of the data that are causally identified. We demonstrate the method by estimating the causal impact of Amazon's recommender system on traffic to product pages, finding thousands of examples within the dataset that satisfy the split-door criterion. Unlike past studies based on natural experiments that were limited to a single product category, our method applies to a large and representative sample of products viewed on the site. In line with previous work, we find that the widely-used click-through rate (CTR) metric overestimates the causal impact of recommender systems; depending on the product category, we estimate that 50-80% of the traffic attributed to recommender systems would have happened even without any recommendations. We conclude with guidelines for using the split-door criterion as well as a discussion of other contexts where the method can be applied.
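
The headline number is simple accounting once the causal estimate is in hand; a toy calculation follows, with figures invented for illustration rather than taken from the paper.

```python
# CTR-based attribution credits the recommender with every visit that
# followed a recommendation click; the split-door estimate says only a
# fraction of those visits were actually caused by the recommendation.
attributed = 1000     # visits credited to recommendations by CTR (assumed)
causal_rate = 0.3     # assumed split-door estimate of caused visits per attributed visit
caused = attributed * causal_rate
print(f"caused by recommender:      {caused:.0f}")
print(f"would have happened anyway: {attributed - caused:.0f} "
      f"({1 - causal_rate:.0%})")   # consistent with the paper's 50-80% range
```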

2 citations


