On Predicting the Popularity of Newly Emerging Hashtags in Twitter

doi:10.1002/ASI.22844

Journal Article

Zongyang Ma, +2 more

- 01 Jul 2013 -

Journal of the Association for Informati...

- Vol. 64, Iss: 7, pp 1399-1410

TLDR

This article proposes methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task and shows that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features.

Abstract:

Because of Twitter’s popularity and the viral nature of information dissemination on Twitter, predicting which Twitter topics will become popular in the near future is a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (i.e., naive Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is the identification of effective features for describing new hashtags. We extract 7 content features from a hashtag string and the collection of tweets containing the hashtag, and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs the best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
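The abstract compares classifiers by Micro-F1. As a rough illustration of that measure (not the authors' evaluation code), the sketch below pools true positives, false positives, and false negatives over all classes before computing F1, and contrasts the result with Macro-F1 and with a majority-class baseline. The class labels `low`, `mid`, and `high`, the toy predictions, and the majority-class baseline are all assumptions made for this example; the abstract does not name the popularity classes or specify the baselines.

```python
from collections import Counter


def counts(y_true, y_pred, label):
    """Return (TP, FP, FN) for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    return tp, fp, fn


def f1(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true, y_pred):
    """Pool TP/FP/FN across classes, then compute a single F1."""
    labels = set(y_true) | set(y_pred)
    per_class = [counts(y_true, y_pred, lab) for lab in labels]
    tp = sum(c[0] for c in per_class)
    fp = sum(c[1] for c in per_class)
    fn = sum(c[2] for c in per_class)
    return f1(tp, fp, fn)


def macro_f1(y_true, y_pred):
    """Compute F1 per class, then take the unweighted mean."""
    labels = set(y_true) | set(y_pred)
    scores = [f1(*counts(y_true, y_pred, lab)) for lab in labels]
    return sum(scores) / len(scores)


# Hypothetical popularity classes and predictions (illustration only).
y_true = ["low", "low", "low", "low", "mid", "mid", "high", "high"]
y_pred = ["low", "low", "low", "mid", "mid", "low", "high", "mid"]

# Hypothetical baseline: always predict the most frequent class.
# For brevity the majority is taken from y_true itself; a real
# experiment would take it from a separate training set.
majority = Counter(y_true).most_common(1)[0][0]
y_base = [majority] * len(y_true)

classifier_micro = micro_f1(y_true, y_pred)
classifier_macro = macro_f1(y_true, y_pred)
baseline_micro = micro_f1(y_true, y_base)
```

On this toy data the classifier's Micro-F1 is 0.625 and the baseline's is 0.5. When every hashtag receives exactly one label, Micro-F1 equals plain accuracy, so frequent classes dominate the score; Macro-F1 weights each class equally and would penalize poor performance on rare classes more heavily.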

Citations

Proceedings Article

Can cascades be predicted

Justin Cheng, +4 more

TL;DR: Using a large sample of photo reshare cascades on Facebook, this work predicts whether a cascade will continue to grow in the future. It finds that the relative growth of a cascade becomes more predictable as more of its reshares are observed, that temporal and structural features are key predictors of cascade size, and that initially, breadth rather than depth in a cascade is a better indicator of larger cascades.

Journal Article

The Structural Virality of Online Diffusion

Sharad Goel, +3 more

- 22 Jul 2015 -

Management Science

TL;DR: This work proposes a formal measure of what it labels “structural virality” that interpolates between two conceptual extremes: content that gains its popularity through a single, large broadcast and that which grows through multiple generations with any one individual directly responsible for only a fraction of the total adoption.

Proceedings Article

Can Cascades be Predicted

Justin Cheng, +4 more

- 18 Mar 2014 -

arXiv: Social and Information Networks

TL;DR: This work develops a framework for addressing cascade prediction problems, and finds that the relative growth of a cascade becomes more predictable as the authors observe more of its reshares, that temporal and structural features are key predictors of cascade size, and that initially, breadth, rather than depth, is a better indicator of larger cascades.

Journal Article

Measuring social media influencer index: insights from Facebook, Twitter and Instagram

Anuja Arora, +4 more

- 01 Jan 2019 -

Journal of Retailing and Consumer Servic...

TL;DR: A mechanism for measuring the influencer index across popular social media platforms including Facebook, Twitter, and Instagram is proposed, and findings indicate that engagement, outreach, sentiment, and growth play a key role in determining the influencers.

Proceedings Article

DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades

Qi Cao, +4 more

TL;DR: DeepHawkes inherits the high interpretability of the Hawkes process and possesses the high predictive power of deep learning methods, bridging the gap between prediction and understanding of information cascades.
