Journal Article

On Predicting the Popularity of Newly Emerging Hashtags in Twitter

TLDR
This article proposes methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task and shows that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features.
Abstract
Because of Twitter’s popularity and the viral nature of information dissemination on Twitter, predicting which Twitter topics will become popular in the near future has become a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (i.e., naive Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is the identification of effective features for describing new hashtags. We extract 7 content features from a hashtag string and the collection of tweets containing the hashtag, and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs the best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
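The abstract reports results in terms of the Micro-F1 measure. As a minimal sketch of how that measure is commonly computed for a single-label classification task, the snippet below pools true positives, false positives, and false negatives over all classes before computing F1. The function name `micro_f1` and the popularity class labels ("low", "medium", "high") are hypothetical and chosen only for illustration; the abstract does not state the class definitions or the evaluation code the authors used.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    # F1 = 2*TP / (2*TP + FP + FN), computed on the pooled counts
    denom = 2 * total_tp + total_fp + total_fn
    return 2 * total_tp / denom if denom else 0.0


# Hypothetical labels (assumption: three popularity classes, not from the source)
y_true = ["low", "low", "medium", "high", "low", "medium"]
y_pred = ["low", "medium", "medium", "high", "low", "low"]
score = micro_f1(y_true, y_pred)
```

When each hashtag receives exactly one predicted class, every misclassification counts once as a false positive and once as a false negative, so the pooled score coincides with plain accuracy.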


Citations
Proceedings Article

Can cascades be predicted

TL;DR: In this paper, a large sample of photo reshare cascades on Facebook was used to predict whether a cascade will continue to grow in the future. The authors found that the relative growth of a cascade becomes more predictable as more of its reshares are observed, that temporal and structural features are key predictors of cascade size, and that initially, breadth rather than depth in a cascade is a better indicator of larger cascades.
Journal Article

The Structural Virality of Online Diffusion

TL;DR: This work proposes a formal measure of what it labels “structural virality” that interpolates between two conceptual extremes: content that gains its popularity through a single, large broadcast and that which grows through multiple generations with any one individual directly responsible for only a fraction of the total adoption.
Proceedings Article

Can Cascades be Predicted

TL;DR: This work develops a framework for addressing cascade prediction problems, and finds that the relative growth of a cascade becomes more predictable as the authors observe more of its reshares, that temporal and structural features are key predictors of cascade size, and that initially, breadth, rather than depth, is a better indicator of larger cascades.
Journal Article

Measuring social media influencer index: insights from Facebook, Twitter and Instagram

TL;DR: A mechanism for measuring the influencer index across popular social media platforms including Facebook, Twitter, and Instagram is proposed and findings indicate that engagement, outreach, sentiment, and growth play a key role in determining the influencers.
Proceedings Article

DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades

TL;DR: DeepHawkes inherits the high interpretability of the Hawkes process and possesses the high predictive power of deep learning methods, bridging the gap between prediction and understanding of information cascades.
References
Journal Article

Latent Dirichlet Allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models, including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Proceedings Article

What is Twitter, a social network or a news media?

TL;DR: In this paper, the authors have crawled the entire Twittersphere and found a non-power-law follower distribution, a short effective diameter, and low reciprocity, which all mark a deviation from known characteristics of human social networks.
Proceedings Article

Earthquake shakes Twitter users: real-time event detection by social sensors

TL;DR: This paper investigates the real-time interaction of events such as earthquakes on Twitter, proposes an algorithm that monitors tweets to detect a target event, and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.
Proceedings Article

Graphs over time: densification laws, shrinking diameters and possible explanations

TL;DR: A new graph generator is provided, based on a "forest fire" spreading process, that has a simple, intuitive justification, requires very few parameters (like the "flammability" of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.