Topic

Microblogging

About: Microblogging is a research topic. Over its lifetime, 4,186 publications have appeared within this topic, receiving 137,030 citations. The topic is also known as: microblog.


Papers
Proceedings Article
28 Jul 2013
TL;DR: This paper empirically establishes that a novel method of tweet pooling by hashtags leads to a vast improvement in a variety of measures for topic coherence across three diverse Twitter datasets in comparison to an unmodified LDA baseline and a range of pooling schemes.
Abstract: Twitter, or the world of 140 characters, poses serious challenges to the efficacy of topic models on short, messy text. While topic models such as Latent Dirichlet Allocation (LDA) have a long history of successful application to news articles and academic abstracts, they are often less coherent when applied to microblog content like Twitter. In this paper, we investigate methods to improve topics learned from Twitter content without modifying the basic machinery of LDA; we achieve this through various pooling schemes that aggregate tweets in a data preprocessing step for LDA. We empirically establish that a novel method of tweet pooling by hashtags leads to a vast improvement in a variety of measures for topic coherence across three diverse Twitter datasets in comparison to an unmodified LDA baseline and a variety of pooling schemes. An additional contribution of automatic hashtag labeling further improves on the hashtag pooling results for a subset of metrics. Overall, these two novel schemes lead to significantly improved LDA topic models on Twitter content.

475 citations
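
The hashtag pooling step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pool_by_hashtag`, the hashtag regular expression, the example tweets, and the choice to keep hashtag-free tweets as single-tweet documents are all assumptions made for illustration. The resulting pseudo-documents would then be passed to an unmodified LDA implementation.

```python
import re
from collections import defaultdict

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def pool_by_hashtag(tweets):
    """Aggregate tweets into pseudo-documents, one per hashtag.

    A tweet with several hashtags is added to every matching pool.
    A tweet with no hashtag is kept as its own document (an assumption
    of this sketch, not a detail taken from the paper).
    """
    pools = defaultdict(list)
    unpooled = []
    for text in tweets:
        # Case-fold tags so that #LDA and #lda share one pool.
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        if tags:
            for tag in tags:
                pools[tag].append(text)
        else:
            unpooled.append(text)
    # Sort by hashtag so the output order is deterministic.
    documents = [" ".join(texts) for _, texts in sorted(pools.items())]
    documents.extend(unpooled)
    return documents


# Hypothetical example tweets.
tweets = [
    "Training a topic model tonight #nlp #lda",
    "Short texts are hard for #LDA",
    "Great conference talk on parsing #nlp",
    "Coffee first",
]
docs = pool_by_hashtag(tweets)
for doc in docs:
    print(doc)
```

Here four short tweets become three longer documents (one for `lda`, one for `nlp`, and one unpooled tweet), which gives the topic model more word co-occurrence evidence per document than single tweets do.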

Proceedings Article
21 Jun 2013
TL;DR: In this paper, the authors compare data collected using Twitter's sampled API service with data collected from the full, albeit costly, Firehose stream that includes every single published tweet, using common statistical metrics as well as metrics that allow them to compare topics, networks, and locations of tweets.
Abstract: Twitter is a social media giant famous for the exchange of short, 140-character messages called "tweets". In the scientific community, the microblogging site is known for openness in sharing its data. It provides a glance into its millions of users and billions of tweets through a "Streaming API" which provides a sample of all tweets matching some parameters preset by the API user. The API service has been used by many researchers, companies, and governmental institutions that want to extract knowledge in accordance with a diverse array of questions pertaining to social media. The essential drawback of the Twitter API is the lack of documentation concerning what and how much data users get. This leads researchers to question whether the sampled data is a valid representation of the overall activity on Twitter. In this work we embark on answering this question by comparing data collected using Twitter's sampled API service with data collected using the full, albeit costly, Firehose stream that includes every single published tweet. We compare both datasets using common statistical metrics as well as metrics that allow us to compare topics, networks, and locations of tweets. The results of our work will help researchers and practitioners understand the implications of using the Streaming API.

469 citations

Proceedings Article
08 Feb 2012
TL;DR: An efficient hybrid approach based on a linear regression for predicting the spread of an idea in a given time frame is presented and it is shown that a combination of content features with temporal and topological features minimizes prediction error.
Abstract: Current social media research mainly focuses on temporal trends of the information flow and on the topology of the social graph that facilitates the propagation of information. In this paper we study the effect of the content of the idea on the information propagation. We present an efficient hybrid approach based on a linear regression for predicting the spread of an idea in a given time frame. We show that a combination of content features with temporal and topological features minimizes prediction error. Our algorithm is evaluated on Twitter hashtags extracted from a dataset of more than 400 million tweets. We analyze the contribution and the limitations of the various feature types to the spread of information, demonstrating that content aspects can be used as strong predictors and thus should not be disregarded. We also study the dependencies between global features such as graph topology and content features.

466 citations

Proceedings Article
11 Feb 2012
TL;DR: It is shown that users are poor judges of truthfulness based on content alone, and instead are influenced by heuristics such as user name when making credibility assessments.
Abstract: Twitter is now used to distribute substantive content such as breaking news, increasing the importance of assessing the credibility of tweets. As users increasingly access tweets through search, they have less information on which to base credibility judgments as compared to consuming content from direct social network connections. We present survey results regarding users' perceptions of tweet credibility. We find a disparity between features users consider relevant to credibility assessment and those currently revealed by search engines. We then conducted two experiments in which we systematically manipulated several features of tweets to assess their impact on credibility ratings. We show that users are poor judges of truthfulness based on content alone, and instead are influenced by heuristics such as user name when making credibility assessments. Based on these findings, we discuss strategies tweet authors can use to enhance their credibility with readers (and strategies astute readers should be aware of!). We propose design improvements for displaying social search results so as to better convey credibility.

466 citations

Journal Article
TL;DR: Analysis of the type of content that legislators are posting to Twitter shows that Congresspeople are primarily using Twitter to disperse information, particularly links to news articles about themselves and to their blog posts, and to report on their daily activities.
Abstract: Twitter is a microblogging and social networking service with millions of members, and it is growing at a tremendous rate. With the buzz surrounding the service have come claims of its ability to transform the way people interact and share information, and calls for public figures to start using the service. In this study, we are interested in the type of content that legislators, particularly members of the United States Congress, are posting to the service. We read and analyzed the content of over 6,000 posts from all members of Congress using the site. Our analysis shows that Congresspeople are primarily using Twitter to disperse information, particularly links to news articles about themselves and to their blog posts, and to report on their daily activities. These tend not to provide new insights into government or the legislative process or to improve transparency; rather, they are vehicles for self-promotion. However, Twitter is also facilitating direct communication between Congresspeople and citizens, though this is a less popular activity. We report on our findings and analysis and discuss other uses of Twitter for legislators. © 2010 Wiley Periodicals, Inc.

459 citations


Network Information
Related Topics (5)
Social network
42.9K papers, 1.5M citations
85% related
Social media
76K papers, 1.1M citations
83% related
The Internet
213.2K papers, 3.8M citations
82% related
Active learning
42.3K papers, 1.1M citations
79% related
Information system
107.5K papers, 1.8M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    202
2022    551
2021    153
2020    238
2019    226
2018    282