Author

Daniel Gayo-Avello

Bio: Daniel Gayo-Avello is an academic researcher at the University of Oviedo. The author has contributed to research on the topics of social media and web query classification. The author has an h-index of 16 and has co-authored 40 publications receiving 1683 citations.

Papers
Proceedings Article
05 Jul 2011
TL;DR: This work applies techniques that had reportedly led to positive election predictions in the past, on the Twitter data collected from the 2010 US congressional elections, but finds no correlation between the analysis results and the electoral outcomes, contradicting previous reports.
Abstract: Using social media for political discourse is becoming common practice, especially around election time. One interesting aspect of this trend is the possibility of pulsing the public’s opinion about the elections, and that has attracted the interest of many researchers and the press. Allegedly, predicting electoral outcomes from social media data can be feasible and even simple. Positive results have been reported, but without an analysis on what principle enables them. Our work puts to test the purported predictive power of social media metrics against the 2010 US congressional elections. Here, we applied techniques that had reportedly led to positive election predictions in the past, on the Twitter data collected from the 2010 US congressional elections. Unfortunately, we find no correlation between the analysis results and the electoral outcomes, contradicting previous reports. Observing that 80 years of polling research would support our findings, we argue that one should not be accepting predictions about events using social media data as a black box. Instead, scholarly research should be accompanied by a model explaining the predictive power of social media, when there is one.

288 citations

Journal Article
TL;DR: It is revealed that its presumed predictive power regarding electoral prediction has been somewhat exaggerated and further work on this topic is required, along with tighter integration with traditional electoral forecasting research.
Abstract: Electoral prediction from Twitter data is an appealing research topic. It seems relatively straightforward and the prevailing view is overly optimistic. This is problematic because while simple approaches are assumed to be good enough, core problems are not addressed. Thus, this article aims to (1) provide a balanced and critical review of the state of the art; (2) cast light on the presumed predictive power of Twitter data; and (3) propose some considerations to push forward the field. Hence, a scheme to characterize Twitter prediction methods is proposed. It covers every aspect from data collection to performance evaluation, through data processing and vote inference. Using that scheme, prior research is analyzed and organized to explain the main approaches taken up to date but also their weaknesses. This is the first meta-analysis of the whole body of research regarding electoral prediction from Twitter data. It reveals that its presumed predictive power regarding electoral prediction has been somewhat exaggerated: Social media may provide a glimpse on electoral outcomes but, up to now, research has not provided strong evidence to support that it can currently replace traditional polls. Nevertheless, there are some reasons for optimism and, hence, further work on this topic is required, along with tighter integration with traditional electoral forecasting research.

283 citations

Posted Content
TL;DR: It can be concluded that the predictive power of Twitter regarding elections has been greatly exaggerated, and that hard research problems still lie ahead.
Abstract: Predicting X from Twitter is a popular fad within the Twitter research subculture. It seems both appealing and relatively easy. Among such studies, electoral prediction is maybe the most attractive, and at this moment there is a growing body of literature on the topic. This is not only an interesting research problem but, above all, it is extremely difficult. However, most of the authors seem to be more interested in claiming positive results than in providing sound and reproducible methods. It is also especially worrisome that many recent papers seem to only acknowledge those studies supporting the idea of Twitter predicting elections, instead of conducting a balanced literature review showing both sides of the matter. After reading many such papers I have decided to write such a survey myself. Hence, in this paper, every study relevant to the matter of electoral prediction using social media is discussed. From this review it can be concluded that the predictive power of Twitter regarding elections has been greatly exaggerated, and that hard research problems still lie ahead.

232 citations

Journal Article
TL;DR: It is argued that statistical models seem to be the most fruitful approach for making predictions from social media data.
Abstract: Social media provide an impressive amount of data about users and their interactions, thereby offering computer and social scientists, economists, and statisticians – among others – new opportunities for research. Arguably, one of the most interesting lines of work is that of predicting future events and developments from social media data. However, current work is fragmented and lacks widely accepted evaluation approaches. Moreover, since the first techniques emerged rather recently, little is known about their overall potential, limitations and general applicability to different domains. Therefore, better understanding the predictive power and limitations of social media is of utmost importance. Different types of forecasting models and their adaptation to the special circumstances of social media are analyzed and the most representative research conducted up to date is surveyed. Presentations of current research on techniques, methods, and empirical studies aimed at the prediction of future or current events from social media data are provided. A taxonomy of prediction models is introduced, along with their relative advantages and the particular scenarios where they have been applied. The main areas of prediction that have attracted research so far are described, and the main contributions made by the papers in this special issue are summarized. Finally, it is argued that statistical models seem to be the most fruitful approach to apply to make predictions from social media data. This special issue raises important questions to be addressed in the field of social media-based prediction and forecasting, fills some gaps in current research, and outlines future lines of work.

221 citations

Journal Article
TL;DR: While simple approaches are purported to be good enough, the predictive power of Twitter regarding elections has been greatly exaggerated, and difficult research problems still lie ahead.
Abstract: Predicting X from Twitter is a popular fad within the Twitter research subculture. It seems both appealing and relatively easy. Among such studies, electoral prediction is maybe the most attractive, and a growing body of literature exists on this topic. This research problem isn't only interesting, but is also extremely difficult. However, most authors seem to be more interested in claiming positive results than in providing sound and reproducible methods. It's also especially worrisome that recent papers seem to only acknowledge those studies supporting the idea that Twitter can predict elections. This is all problematic because while simple approaches are purported to be good enough, the predictive power of Twitter regarding elections has been greatly exaggerated, and difficult research problems still lie ahead.

163 citations


Cited by
Journal Article

1,549 citations

01 Jan 2013

1,098 citations

Journal Article
TL;DR: A review of The Net Delusion: The Dark Side of Internet Freedom by Evgeny Morozov (New York: Public Affairs, 2011, 409 pages, $16.99).
Abstract: The Net Delusion: The Dark Side of Internet Freedom, by Evgeny Morozov. New York: Public Affairs, 2011. 409 pages. $16.99. In January 2010, Secretary of State Hillary Clinton gave a highly touted speech on Internet freedom in which she stated, "The freedom to connect is like the freedom of assembly, only in cyberspace. It allows individuals to get online, come together, and hopefully cooperate. Once you're on the Internet, you don't need to be a tycoon or a rock star to have a huge impact on society." Evgeny Morozov, in his book The Net Delusion, takes great issue with the implication, however, that the so-called "Arab Spring" and "Twitter Revolution" were caused by unfettered access to the Internet. Instead, Morozov, a research academic, provides a cautionary tale about what he argues is any attempt to establish a monocausal relationship to meaningful political change (especially when that single focus is information technology). The book opens with a discussion of cyber-utopianism and Internet-centrism--mind-sets that focus on the positive "emancipatory" aspects of Internet communication while ignoring the downsides. The argument throughout centers on nation-state policy, or lack thereof, that attacks the "wicked" problem of authoritarianism by, as a colleague of mine has dubbed it, "wiring the world." Morozov, expectantly, but importantly, cites the hedonistic world portrayed by Huxley and the "Big Brother" world of Orwell to consider both the proactive and reactive approaches to Internet freedom by authoritarian regimes. Interestingly, he notes that there is often a mix of both. Such regimes certainly use the anonymity and openness of the Internet to spy on their people and shut down undesirable sites. But there is also a subtle approach that belies the jackboot on the keyboard methodology.
While China may be known more for suppressing the Internet and for employing the masses to counter antiregime rhetoric, Russia imposes no formal Internet censorship. It relies on entertainment (porn is specifically cited) to soothe the masses, assuming that given options for political discourse and anything else, most opt for "anything else." Hitler would understand. And in nations where freedom is not widely understood from a western perspective, any bit of additional mindless diversion may be viewed as liberty by the populace. Perhaps most importantly, Morozov rails against social media determinism as driving the end of authoritarianism, labeling it "an intellectually impoverished, lazy way to study the past, understand the present, and predict the future." He does not dismiss the value of Facebook and Twitter to quickly mobilize like-minded individuals. He notes as well that the development of that very like-mindedness is complex and potentially can be manipulated by authoritarian governments using the same Internet freedom. …

870 citations

Posted Content
TL;DR: Data collected using Twitter's sampled API service is compared with data collected using the full, albeit costly, Firehose stream that includes every single published tweet to help researchers and practitioners understand the implications of using the Streaming API.
Abstract: Twitter is a social media giant famous for the exchange of short, 140-character messages called "tweets". In the scientific community, the microblogging site is known for openness in sharing its data. It provides a glance into its millions of users and billions of tweets through a "Streaming API" which provides a sample of all tweets matching some parameters preset by the API user. The API service has been used by many researchers, companies, and governmental institutions that want to extract knowledge in accordance with a diverse array of questions pertaining to social media. The essential drawback of the Twitter API is the lack of documentation concerning what and how much data users get. This leads researchers to question whether the sampled data is a valid representation of the overall activity on Twitter. In this work we embark on answering this question by comparing data collected using Twitter's sampled API service with data collected using the full, albeit costly, Firehose stream that includes every single published tweet. We compare both datasets using common statistical metrics as well as metrics that allow us to compare topics, networks, and locations of tweets. The results of our work will help researchers and practitioners understand the implications of using the Streaming API.

848 citations