
Showing papers on "Microblogging" published in 2020


Journal ArticleDOI
TL;DR: Public opinion in the early stages of COVID-19 in China is explored by analyzing Sina-Weibo texts in terms of space, time, and content to better understand the public opinion and sentiments towards COVID-19, to accelerate emergency responses, and to support post-disaster management.
Abstract: The outbreak of Corona Virus Disease 2019 (COVID-19) is a grave global public health emergency. Nowadays, social media has become the main channel through which the public can obtain information and express their opinions and feelings. This study explored public opinion in the early stages of COVID-19 in China by analyzing Sina-Weibo (a Twitter-like microblogging system in China) texts in terms of space, time, and content. Temporal changes within one-hour intervals and the spatial distribution of COVID-19-related Weibo texts were analyzed. Based on the latent Dirichlet allocation model and the random forest algorithm, a topic extraction and classification model was developed to hierarchically identify seven COVID-19-relevant topics and 13 sub-topics from Weibo texts. The results indicate that the number of Weibo texts varied over time for different topics and sub-topics corresponding with the different developmental stages of the event. The spatial distribution of COVID-19-relevant Weibo was mainly concentrated in Wuhan, Beijing-Tianjin-Hebei, the Yangtze River Delta, the Pearl River Delta, and the Chengdu-Chongqing urban agglomeration. There is a synchronization between frequent daily discussions on Weibo and the trend of the COVID-19 outbreak in the real world. Public response is very sensitive to the epidemic and significant social events, especially in urban agglomerations with convenient transportation and a large population. The timely dissemination and updating of epidemic-related information and the popularization of such information by the government can contribute to stabilizing public sentiments. However, the surge of public demand and the hysteresis of social support demonstrated that the allocation of medical resources was under enormous pressure in the early stage of the epidemic. 
It is suggested that the government should strengthen the response in terms of public opinion and epidemic prevention and exert control in key epidemic areas, urban agglomerations, and transboundary areas at the province level. In controlling the crisis, accurate response countermeasures should be formulated following public help demands. The findings can help government and emergency agencies to better understand the public opinion and sentiments towards COVID-19, to accelerate emergency responses, and to support post-disaster management.

237 citations


Journal ArticleDOI
TL;DR: The results of this study provide initial insight into the origins of the COVID-19 outbreak based on quantitative and qualitative analysis of Chinese social media data at the initial epicenter in Wuhan City.
Abstract: BACKGROUND: The coronavirus disease (COVID-19) pandemic, which began in Wuhan, China in December 2019, is rapidly spreading worldwide with over 1.9 million cases as of mid-April 2020. Infoveillance approaches using social media can help characterize disease distribution and public knowledge, attitudes, and behaviors critical to the early stages of an outbreak. OBJECTIVE: The aim of this study is to conduct a quantitative and qualitative assessment of Chinese social media posts originating in Wuhan City on the Chinese microblogging platform Weibo during the early stages of the COVID-19 outbreak. METHODS: Chinese-language messages from Wuhan were collected for 39 days between December 23, 2019, and January 30, 2020, on Weibo. For quantitative analysis, the total daily cases of COVID-19 in Wuhan were obtained from the Chinese National Health Commission, and a linear regression model was used to determine if Weibo COVID-19 posts were predictive of the number of cases reported. Qualitative content analysis and an inductive manual coding approach were used to identify parent classifications of news and user-generated COVID-19 topics. RESULTS: A total of 115,299 Weibo posts were collected during the study time frame consisting of an average of 2956 posts per day (minimum 0, maximum 13,587). Quantitative analysis found a positive correlation between the number of Weibo posts and the number of reported cases from Wuhan, with approximately 10 more COVID-19 cases per 40 social media posts (P<.001). This effect size was also larger than what was observed for the rest of China excluding Hubei Province (where Wuhan is the capital city) and held when comparing the number of Weibo posts to the incidence proportion of cases in Hubei Province. 
Qualitative analysis of 11,893 posts during the first 21 days of the study period with COVID-19-related posts uncovered four parent classifications including Weibo discussions about the causative agent of the disease, changing epidemiological characteristics of the outbreak, public reaction to outbreak control and response measures, and other topics. Generally, these themes also exhibited public uncertainty and changing knowledge and attitudes about COVID-19, including posts exhibiting both protective and higher-risk behaviors. CONCLUSIONS: The results of this study provide initial insight into the origins of the COVID-19 outbreak based on quantitative and qualitative analysis of Chinese social media data at the initial epicenter in Wuhan City. Future studies should continue to explore the utility of social media data to predict COVID-19 disease severity, measure public reaction and behavior, and evaluate effectiveness of outbreak communication.
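The reported effect size (about 10 more cases per 40 posts) corresponds to a regression slope of roughly 0.25 cases per post. The following is a minimal sketch of an ordinary least squares fit, assuming a single predictor; the daily counts are made-up illustrative values, not the study's data, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily counts (illustrative only, not taken from the study).
daily_posts = [0, 400, 800, 1200, 1600]
daily_cases = [5, 105, 205, 305, 405]

slope, intercept = fit_line(daily_posts, daily_cases)
print(f"{slope:.2f} cases per post, i.e. {slope * 40:.0f} cases per 40 posts")
```

Expressing the slope per 40 posts, as the abstract does, is only a change of units; the fitted line is the same.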

134 citations


Journal ArticleDOI
TL;DR: Analysis of social media information helps decision makers grasp the spatiotemporal characteristics of social media topics, so that relevant departments can accurately understand the public's subjective ideas and emotional expressions, and it provides decision support for macro-control response strategies, measures, and risk communication.
Abstract: COVID-19 led to the lockdown of Wuhan, China, which was sealed off on Chinese New Year's Eve. Research on COVID-19 topics and emotional expressions published on social media during this period can provide decision support for the management and control of large-scale public health events. This study analyzed microblog text topics with the LDA model and obtained 8 topics ("origin", "host", "organization", "quarantine measures", "role models", "education", "economic", "rumor") and 28 interactive topics. Data were obtained with crawler tools and, with the help of big data technology, social media topics and the characteristics of emotional change were analyzed from a spatiotemporal perspective. The results show that: (1) A "double peaks" feature appears in the epidemic topic search curve. Weibo posts on the epidemic gradually decreased after January 24; however, the proportion of epidemic topic searches gradually increased, and a "double peaks" phenomenon appeared within a week; (2) Topics change with time, and the fluctuation of the topic discussion rate gradually weakens. The number of texts on different topics and interactive topics changes with time, while the discussion rate of epidemic topics gradually weakens; (3) Political and economic centers are areas of high social media attention. The areas formed by Beijing, Shanghai, Guangdong, Sichuan and Hubei published more microblog texts, and the spatial distribution of the number of Weibo texts is highly correlated with the division of economic zones; (4) The topic of "rumor" prompts more communication and discussion. The interactive topics of "rumors" consistently have higher topic popularity and low emotion text expressions. Analysis of social media information helps decision makers grasp the spatiotemporal characteristics of social media topics, so that relevant departments can accurately understand the public's subjective ideas and emotional expressions, and it provides decision support for macro-control response strategies, measures, and risk communication.

85 citations


Journal ArticleDOI
TL;DR: This work shows that textual and imagery content on social media provide complementary information useful to improve situational awareness and proposes a methodological approach that combines several computational techniques effectively in a unified framework to help humanitarian organisations in their relief efforts.
Abstract: People increasingly use microblogging platforms such as Twitter during natural disasters and emergencies. Research studies have revealed the usefulness of the data available on Twitter for several ...

76 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed a hybrid recommendation system for movies that leverages the best of the concepts used in CF and CBF along with sentiment analysis of tweets from microblogging sites.
Abstract: Recommendation systems (RSs) have garnered immense interest for applications in e-commerce and digital media. Traditional approaches in RSs include collaborative filtering (CF) and content-based filtering (CBF), though these approaches have certain limitations, such as the necessity of prior user history and habits for performing the task of recommendation. To minimize the effect of such limitations, this article proposes a hybrid RS for movies that leverages the best of the concepts used in CF and CBF along with sentiment analysis of tweets from microblogging sites. The purpose of using movie tweets is to understand the current trends, public sentiment, and user response to the movie. Experiments conducted on the public database have yielded promising results.

74 citations


Journal ArticleDOI
Zhiyuan Hou1, Fanxing Du1, Hao Jiang1, Xinyu Zhou1, Leesa Lin2 
TL;DR: Social media surveillance can enable timely assessments of public reaction to risk communication and epidemic control measures, and the immediate clarification of rumours, and should be fully incorporated into epidemic preparedness and response systems.
Abstract: Background: Using social media surveillance data, this study aimed to assess public attention and awareness, risk perception, emotion, and behavioural response to the COVID-19 outbreak in real time. Methods: We collected social media data from the three most popular platforms in China: Sina Weibo (microblog), Baidu search engine, and Ali e-commerce marketplace, from the beginning of the outbreak, 1 Dec 2019, to 15 Feb 2020. Quantitative behavioural data including Weibo post counts, Weibo Hot Search ranking, and Baidu searches were used to generate indices assessing public attention and awareness. Public intention and actual adoption of recommended personal protection measures (e.g. hand sanitisers) or panic buying triggered by rumours and misinformation (e.g. garlics) were measured by Baidu and Ali indices. Correlation analysis was performed to detect consistency among the three indices. Qualitative data from Weibo posts were collected and analysed by the Linguistic Inquiry and Word Count (LIWC) text analysis programme to assess public emotion responses to epidemiological events, governments’ announcements, and epidemic control measures. Findings: We identified two missed windows of opportunity for early epidemic control during the early stages of the COVID-19 outbreak, one in Dec 2019 and the other between 31 Dec and 19 Jan, when public attention and awareness was very low despite the emerging outbreak. Delayed release of information ignited negative public emotions. The public responded quickly to government announcements and adopted recommended behaviours according to issued guidelines. We found rumours and misinformation regarding remedies and cures led to panic buying during the outbreak, and timely clarification of rumours effectively reduced irrational behaviour. Interpretation: Social media surveillance can enable timely assessments of public reaction to risk communication and epidemic control measures, and the immediate clarification of rumours. 
This should be fully incorporated into epidemic preparedness and response systems. Funding Statement: Z.H. acknowledges financial support from the National Natural Science Foundation of China (No. 71874034) Declaration of Interests: The authors have no conflict of interest. Ethics Approval Statement: All data are publicly available.

61 citations


Journal ArticleDOI
TL;DR: Concerns expressed by social media users are highly correlated with the evolution of the global pandemic, and government departments can create appropriate policies in a timely manner by monitoring social media platforms to guide public opinion and behavior during epidemics.
Abstract: Background: The COVID-19 pandemic has created a global health crisis that is affecting economies and societies worldwide. During times of uncertainty and unexpected change, people have turned to social media platforms as communication tools and primary information sources. Platforms such as Twitter and Sina Weibo have allowed communities to share discussion and emotional support; they also play important roles for individuals, governments, and organizations in exchanging information and expressing opinions. However, research that studies the main concerns expressed by social media users during the pandemic is limited. Objective: The aim of this study was to examine the main concerns raised and discussed by citizens on Sina Weibo, the largest social media platform in China, during the COVID-19 pandemic. Methods: We used a web crawler tool and a set of predefined search terms (New Coronavirus Pneumonia, New Coronavirus, and COVID-19) to investigate concerns raised by Sina Weibo users. Textual information and metadata (number of likes, comments, retweets, publishing time, and publishing location) of microblog posts published between December 1, 2019, and July 32, 2020, were collected. After segmenting the words of the collected text, we used a topic modeling technique, latent Dirichlet allocation (LDA), to identify the most common topics posted by users. We analyzed the emotional tendencies of the topics, calculated the proportional distribution of the topics, performed user behavior analysis on the topics using data collected from the number of likes, comments, and retweets, and studied the changes in user concerns and differences in participation between citizens living in different regions of mainland China. Results: Based on the 203,191 eligible microblog posts collected, we identified 17 topics and grouped them into 8 themes. 
These topics were pandemic statistics, domestic epidemic, epidemics in other countries worldwide, COVID-19 treatments, medical resources, economic shock, quarantine and investigation, patients’ outcry for help, work and production resumption, psychological influence, joint prevention and control, material donation, epidemics in neighboring countries, vaccine development, fueling and saluting antiepidemic action, detection, and study resumption. The mean sentiment was positive for 11 topics and negative for 6 topics. The topic with the highest mean of retweets was domestic epidemic, while the topic with the highest mean of likes was quarantine and investigation. Conclusions: Concerns expressed by social media users are highly correlated with the evolution of the global pandemic. During the COVID-19 pandemic, social media has provided a platform for Chinese government departments and organizations to better understand public concerns and demands. Similarly, social media has provided channels to disseminate information about epidemic prevention and has influenced public attitudes and behaviors. Government departments, especially those related to health, can create appropriate policies in a timely manner through monitoring social media platforms to guide public opinion and behavior during epidemics.

59 citations


Journal ArticleDOI
TL;DR: An integrated research model is proposed with the aim of understanding the factors that affect users’ continuous content contribution behaviours (CCCB) on microblogs and indicates that perceived gratification had a positive but surprisingly trivial effect on continuous content contribution behaviours.
Abstract: Microblogs are revolutionising the way users produce, consume and distribute short content. The continuous content contributions of users are crucial for the sustainable development of microblogs. ...

55 citations


Journal ArticleDOI
TL;DR: A multiple-information susceptible-discussing-immune (M-SDI) model is proposed in order to understand the patterns of key information propagation on social networks; it takes into account the behavior that users may re-enter another related topic or Weibo after discussing one.
Abstract: The outbreak of a novel coronavirus (COVID-19) generated an outbreak of public opinion on the Chinese Sina-microblog. To help in designing effective communication strategies during a major public health emergency, we propose a multiple-information susceptible-discussing-immune (M-SDI) model in order to understand the patterns of key information propagation on social networks. We develop the M-SDI model based on the public discussion quantity and take into account the behavior that users may re-enter another related topic or Weibo after discussing one. Data fitting using the real data of COVID-19 public opinion obtained from the Chinese Sina-microblog can parameterize the model to make accurate predictions of the public opinion trend until the next major news item occurs. The reproduction ratio has fallen from 1.7769 and stabilized around 0.97, which indicates that the peak of public opinion has passed but that the discussion will continue for a period of time.
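To illustrate why a reproduction ratio above or below 1 matters, here is a minimal sketch of a single-topic susceptible-discussing-immune model integrated with Euler steps. It is a simplification in the spirit of SIR-type models, not the authors' M-SDI equations: the rate names `beta` (contact rate) and `alpha` (rate of losing interest), the function names, and all numbers are assumptions made for illustration.

```python
def simulate_sdi(s0, d0, beta, alpha, steps, dt=0.1):
    """Euler integration of a simplified single-topic SDI model.

    Returns the number of discussing users at each time step.
    """
    n = s0 + d0  # total population, constant over time
    s, d, i = float(s0), float(d0), 0.0
    discussing = [d]
    for _ in range(steps):
        new_discussing = beta * s * d / n * dt  # susceptible users who start discussing
        new_immune = alpha * d * dt             # discussing users who lose interest
        s -= new_discussing
        d += new_discussing - new_immune
        i += new_immune
        discussing.append(d)
    return discussing


def reproduction_ratio(s, n, beta, alpha):
    """New discussers generated per discussing user (assumed SIR-style form)."""
    return beta * s / (alpha * n)


# Hypothetical parameters: one run with ratio above 1, one with ratio below 1.
growing = simulate_sdi(9900, 100, beta=0.5, alpha=0.25, steps=600)
declining = simulate_sdi(9900, 100, beta=0.2, alpha=0.25, steps=600)
print(reproduction_ratio(9900, 10000, 0.5, 0.25))  # above 1: discussion grows first
print(reproduction_ratio(9900, 10000, 0.2, 0.25))  # below 1: discussion only decays
```

In this simplified form, a ratio below 1 means the discussion can only decay, though it takes time to die out, which is consistent with the abstract's reading of a ratio around 0.97.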

51 citations


Journal ArticleDOI
TL;DR: This paper proposes a different approach to detect spammers on Twitter based on the similarities that exist among spam accounts, and results revealed that Random Forest achieved the highest accuracy, precision, recall, and F-measure.
Abstract: Twitter social network has gained more popularity due to the increase in social activities of registered users. Twitter performs dual functions of online social network (OSN), acting as a microblogging OSN, and at the same time as a news update platform. Recently, the growth in Twitter social interactions has attracted the attention of cybercriminals. Spammers have used Twitter to spread malicious messages, post phishing links, flood the network with fake accounts, and engage in other malicious activities. The process of detecting the network of spammers who engage in these activities is an important step toward identifying individual spam account. Researchers have proposed a number of approaches to identify a group of spammers. However, each of these approaches addressed a specific category of spammer. This paper proposes a different approach to detect spammers on Twitter based on the similarities that exist among spam accounts. A number of features were introduced to improve the performance of the three classification algorithms selected in this study. The proposed approach applied principal component analysis and tuned K-means algorithm to cluster over 200,000 accounts, randomly selected from more than 2 million tweets to detect the clusters of spammers. Experimental results show that Random Forest achieved the highest accuracy of 96.30%. This result is followed by multilayer perceptron with 96.00% and support vector machine, which achieved 95.60%. The performance of the selected classifiers based on class imbalance also revealed that Random Forest achieved the highest accuracy, precision, recall, and F-measure.

50 citations


Journal ArticleDOI
TL;DR: This study analyzes the public engagement of communication scientists by using the example of their Twitter activity and theoretically distinguish eight types of engagement, which can offer a starting point for other fields of public engagement and the impact of the discipline on the public discourse.
Abstract: Recent publications question the public visibility of communication science as a discipline and its relevance for the broader society. To address this issue, we analyze the public engagement of com...

Journal ArticleDOI
TL;DR: Twitter is used as a source of opinionated data, R is used for acquiring, pre-processing, and analyzing the tweets, and sentiment analysis is then performed based on the different approaches; the result obtained was in full compliance with the actual election results obtained in May 2019.

Journal ArticleDOI
TL;DR: It is suggested that microblogging platforms such as Weibo can function as public forums for discussing GMOs that expose users to ideologically cross-cutting viewpoints that could mitigate the likelihood of opinion polarization.
Abstract: The spread of rumors on social media has caused increasing concerns about an under-informed or even misinformed public when it comes to scientific issues. However, researchers have rarely investigated their diffusion in non-western contexts. This study aims to systematically examine the content and network structure of rumor-related discussions around genetically modified organisms (GMOs) on Chinese social media. This study identified 21,837 rumor-related posts of GMOs on Weibo, one of China's most popular social media platforms. An approach combining social network analysis and content analysis was employed to classify user attitudes toward rumors, measure the level of homophily of their attitudes and examine the nature of their interactions. Though a certain level of homophily existed in the interaction networks, referring to the observed echo chamber effect, Weibo also served as a public forum for GMO discussions in which cross-cutting ties between communities existed. A considerable amount of interactions emerged between the pro- and anti-GMO camps, and most of them involved providing or requesting information, which could mitigate the likelihood of opinion polarization. Moreover, this study revealed the declining role of traditional opinion leaders and pointed toward the need for alternative strategies for efficient fact-checking. In general, the findings of this study suggested that microblogging platforms such as Weibo can function as public forums for discussing GMOs that expose users to ideologically cross-cutting viewpoints. This study stands to provide important insights into the viral processes of scientific rumors on social media.

Journal ArticleDOI
TL;DR: The results showed that the developed system performs very well for automatic topic detection and categorization, with an AUC of 0.97 indicating a better test than similar methods.

Proceedings ArticleDOI
19 Jul 2020
TL;DR: The experimental results show that BDANN outperforms the state-of-the-art models, and the existence of noisy images in the Weibo dataset that may affect the results is discussed.
Abstract: Nowadays, with the rapid growth of microblogging networks for news propagation, there are increasingly more people accessing news through such emerging social media. In the meantime, fake news now spreads at a faster pace and affects a larger population than ever before. Compared with traditional text news, the news posted on microblogs often has attached images. So how to correctly and autonomously detect fake news in a multi-modal manner becomes a prominent challenge to be addressed. In this paper, we propose an end-to-end model, named BERT-based domain adaptation neural network for multi-modal fake news detection (BDANN). BDANN comprises three main modules: a multi-modal feature extractor, a domain classifier and a fake news detector. Specifically, the multi-modal feature extractor employs the pretrained BERT model to extract text features and the pretrained VGG-19 model to extract image features. The extracted features are then concatenated and fed to the detector to distinguish fake news. The role of the domain classifier is mainly to map the multi-modal features of different events to the same feature space. To assess the performance of BDANN, we conduct extensive experiments on two multimedia datasets: Twitter and Weibo. The experimental results show that BDANN outperforms the state-of-the-art models. Moreover, we further discuss the existence of noisy images in the Weibo dataset that may affect the results.

Proceedings ArticleDOI
01 Feb 2020
TL;DR: This paper presents an approach for sentiment analysis by adapting a Hadoop framework and deep learning classifier, which offered improved classification accuracy, better sensitivity, and high specificity than classical strategies.
Abstract: Sentiment analysis has attracted considerable attention on microblogging websites. It is the practice of categorizing and identifying opinions articulated in speech, text, database sources and tweets to detect whether an opinion is negative, positive or neutral. The challenge lies in determining sentiment from tweets, owing to the unique characteristics of Twitter data. This paper presents an approach for sentiment analysis by adapting a Hadoop framework and deep learning classifier. The Hadoop cluster is used for the distribution of data for extracting the features. Then, the significant features are extracted from the Twitter data. The deep learning classifier, namely a deep recurrent neural network classifier, is used to assign a real-valued review to each input Twitter data item, thus classifying the input data into two classes: positive review and negative review. The performance is analyzed using metrics such as classification accuracy, sensitivity and specificity. In contrast to classical strategies, the proposed method offered improved classification accuracy of 0.9302, better sensitivity of 0.9404 and high specificity of 0.9157.
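The three metrics reported above follow directly from a binary confusion matrix. A minimal sketch, assuming labels are encoded as 1 (positive review) and 0 (negative review); the function name and the example labels are hypothetical and unrelated to the paper's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of actual positives that were found
        "specificity": tn / (tn + fp),  # share of actual negatives that were found
    }


# Hypothetical labels: 4 positive and 6 negative reviews.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy shows whether a classifier favors one class, which accuracy alone can hide when classes are imbalanced.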

Proceedings ArticleDOI
12 Mar 2020
TL;DR: Sentiment Analysis has been performed by using Machine Learning Classifiers, and Polarity-based sentiment analysis, and Deep Learning Models are used to classify user's tweets as having ‘positive’ or ‘negative’ sentiment.
Abstract: With the increasing rate at which data is created by internet users on various platforms, it becomes necessary to analyze and make use of the data by the Defense and other Government Organizations and know the sentiment of the people. This shall help the organizations take control of their actions and decide the steps to be taken shortly. Added to it, when something crucial is happening in the nation, it is of paramount importance to decide every step without hurting/violating the sentiments of the people. In the era of Microblogging, which has become quite a popular tool of communication, millions of users share their views and opinions on various day-to-day life issues concerning them directly or indirectly through social media platforms like Twitter, Reddit, Tumblr, Facebook. Data from these sites can be efficiently used for marketing or social studies. In this paper, we have taken into account various methods to perform sentiment analysis. Sentiment Analysis has been performed by using Machine Learning Classifiers. Polarity-based sentiment analysis, and Deep Learning Models are used to classify user's tweets as having ‘positive’ or ‘negative’ sentiment. The idea behind taking in various model architectures was to account for the variance in the opinions and thoughts existing on such social media platforms. These classification models can further be implemented to classify live tweets on twitter on any topic.

Journal ArticleDOI
TL;DR: This research has proposed a sustainable approach, namely Weighted Correlated Influence (WCI), which incorporates the relative impact of timeline-based and trend-specific features of online users and considers merging the profile activity and underlying network topology to designate online users with an influence score, which represents the combined effect.
Abstract: In the era of advanced mobile technology, freedom of expression over social media has become prevalent among online users. This generates a huge amount of communication that eventually forms a ground for extensive research and analysis. The social network analysis allows identifying the influential people in society over microblogging platforms. Twitter, being an evolving social media platform, has become increasingly vital for online dialogues, trends, and content virality. Applications of discovering influential users over Twitter are manifold. It includes viral marketing, brand analysis, news dissemination, health awareness spreading, propagating political movement, and opinion leaders for empowering governance. In our research, we have proposed a sustainable approach, namely Weighted Correlated Influence (WCI), which incorporates the relative impact of timeline-based and trend-specific features of online users. Our methodology considers merging the profile activity and underlying network topology to designate online users with an influence score, which represents the combined effect. To quantify the performance of our proposed method, the Twitter trend #CoronavirusPandemic is used. Also, the results are validated for another social media trend. The experimental outcomes depict enhanced performance of proposed WCI over existing methods that are based on precision, recall, and F1-measure for validation.

Journal ArticleDOI
TL;DR: This work analyzes the online activity of millions of users in a popular microblogging platform during exceptional events, from NBA Finals to the elections of Pope Francis and the discovery of gravitational waves to demonstrate how combining simple mechanisms provides a route towards understanding complex social phenomena.
Abstract: In the era of social media, every day billions of individuals produce content in socio-technical systems resulting in a deluge of information. However, human attention is a limited resource and it is increasingly challenging to consume the most suitable content for one's interests. In fact, the complex interplay between individual and social activities in social systems overwhelmed by information results in bursty activity of collective attention which are still poorly understood. Here, we tackle this challenge by analyzing the online activity of millions of users in a popular microblogging platform during exceptional events, from NBA Finals to the elections of Pope Francis and the discovery of gravitational waves. We observe extreme fluctuations in collective attention that we are able to characterize and explain by considering the co-occurrence of two fundamental factors: the heterogeneity of social interactions and the preferential attention towards influential users. Our findings demonstrate how combining simple mechanisms provides a route towards understanding complex social phenomena.

Journal ArticleDOI
27 Feb 2020
TL;DR: In this paper, the authors used the event of Hurricane Irma and combined it with the life cycle of online public opinion evolution to understand the effect of different types of emotional microblogs (tweets) on information dissemination.
Abstract: Microblogging is an important channel used to disseminate online public opinion during an emergency. Analyzing the features and evolution mechanism of online public opinion during an emergency plays a significant role in crisis management. This paper uses the event of Hurricane Irma and combines it with the life cycle of online public opinion evolution to understand the effect of different types of emotional (joy, anger, sadness, fear, disgust) microblogs (tweets) on information dissemination. The research was performed in the context of Hurricane Irma by using tweets associated with that event. This paper demonstrates that negative emotional information has a greater communication effect, and further, the target audience that receives more exposure to negative emotional microblogs has a stronger tendency to retweet. Meanwhile, emotions expressed in tweets and the life cycle of public opinion evolution exert interactive effects on the retweeting behavior of the target audience. For future research, a professional dictionary and the context should be taken into consideration to make the modeling in the text more normative and analyzable. This paper aims to reveal how the emotions of a tweet affect its virality in terms of diffusion volume in the context of an emergency event. The conclusion made in this paper can shed light on the real-time regulation and public opinion transmission, as well as for efficient intelligence service and emergency management. In this study, Hurricane Irma is taken as an example to explore the factors influencing the information dissemination during emergencies on the social media environment. The relationship between the sentiment of a tweet and the life cycle of public opinion and its effect on tweet volume were investigated.

Journal ArticleDOI
TL;DR: It is demonstrated that sentiment analysis can be an effective and useful tool for sports-related content and is intended to stimulate the increased use of and discussion on sentiment analysis in sports science.
Abstract: Sentiment analysis refers to the algorithmic extraction of subjective information from textual data and—driven by the increasing amount of online communication—has become one of the fastest growing research areas in computer science with applications in several domains. Although sports events such as football matches are accompanied by a huge public interest and large amount of related online communication, social media analysis in general and sentiment analysis in particular are almost unused tools in sports science so far. The present study tests the feasibility of lexicon-based tools of sentiment analysis with regard to football-related textual data on the microblogging platform Twitter. The sentiment of a total of 10,000 tweets with reference to ten top-level football matches was analyzed both manually by human annotators and algorithmically by means of publicly available sentiment analysis tools. Results show that the general sentiment of realistic sets (1000 tweets with a proportion of 60% having the same polarity) can be classified correctly with more than 95% accuracy. The present paper demonstrates that sentiment analysis can be an effective and useful tool for sports-related content and is intended to stimulate the increased use of and discussion on sentiment analysis in sports science.
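
The lexicon-based, set-level classification described in this abstract can be illustrated with a minimal sketch. The word lists and the function names `tweet_polarity` and `set_polarity` are hypothetical and are not taken from the study or from any of the publicly available tools it evaluated; the sketch only shows the general idea of scoring each tweet against a lexicon and then taking the majority polarity of the set.

```python
# Toy lexicon, hypothetical: not taken from the study or from any published tool.
POSITIVE = {"great", "win", "brilliant", "love"}
NEGATIVE = {"bad", "lose", "awful", "hate"}


def tweet_polarity(text: str) -> int:
    """Return +1, -1, or 0 from the count of positive minus negative lexicon words."""
    words = [w.strip(".,!?#@").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def set_polarity(tweets: list[str]) -> int:
    """Majority polarity of a set of tweets; neutral tweets do not vote."""
    total = sum(tweet_polarity(t) for t in tweets)
    return (total > 0) - (total < 0)
```

Classifying the set rather than each tweet lets individual errors cancel out, which is one plausible reading of why set-level accuracy can be much higher than per-tweet accuracy.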

Journal ArticleDOI
Wei Ren1, Yaping Guo1
TL;DR: This article analyzed the frequency and percentage of each self-praise strategy deployed by Chinese users, supplemented by a qualitative analysis of individual strategies with examples, and found that three main pragmatic strategies are employed for self-praise in Chinese microblogs.

Journal ArticleDOI
TL;DR: An overview of the algorithms and approaches that have been used for sentiment analysis in Twitter is provided, and directions for future research on how Twitter sentiment analysis approaches can utilize theories and technologies from other fields such as cognitive science, semantic Web, big data and visualization are discussed.
Abstract: Twitter is one of the most popular microblogging and social networking platforms where massive instant messages (i.e. tweets) are posted every day. Twitter sentiment analysis tackles the problem of analyzing users’ tweets in terms of thoughts, interests and opinions in a variety of contexts and domains. Such analysis can be valuable for several researchers and applications that require understanding people's views about a particular topic or event. The study carried out in this paper provides an overview of the algorithms and approaches that have been used for sentiment analysis in Twitter. The reviewed articles are categorized into four categories based on the approach they use. Furthermore, we discuss directions for future research on how Twitter sentiment analysis approaches can utilize theories and technologies from other fields such as cognitive science, semantic Web, big data and visualization.

Proceedings ArticleDOI
10 Jun 2020
TL;DR: A pattern-based approach to sarcasm detection is proposed using Twitter data, in which four sets of features covering many specific types of sarcasm are used to classify tweets as sarcastic or non-sarcastic.
Abstract: Sarcasm is a subtle type of irony that is widely used in social networks. It is usually used to convey hidden information in order to criticize and ridicule a person. A sarcasm recognition system is very helpful for improving automatic sentiment analysis of data collected from different social networks and microblogging sites. Sentiment analysis refers to the identification and aggregation of the attitudes and opinions expressed by internet users of a particular community. In this paper, to detect sarcasm, a pattern-based approach is proposed using Twitter data. Four sets of features covering many specific types of sarcasm are proposed and used to classify tweets as sarcastic or non-sarcastic. The proposed feature sets are studied, and the additional cost each adds to the classification is evaluated.

Journal ArticleDOI
01 Jan 2020
TL;DR: This paper reviews core components that enable large-scale querying and indexing for microblogs data, and discusses system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems.
Abstract: Microblogs data is the microlength user-generated data that is posted on the web, e.g., tweets, online reviews, comments on news and social media. It has gained considerable attention in recent years due to its widespread popularity, rich content, and value in several societal applications. Nowadays, microblogs applications span a wide spectrum of interests including targeted advertising, market reports, news delivery, political campaigns, rescue services, and public health. Consequently, major research efforts have been spent to manage, analyze, and visualize microblogs to support different applications. This paper gives a comprehensive review of major research and system work in microblogs data management. The paper reviews core components that enable large-scale querying and indexing for microblogs data. A dedicated part gives particular focus for discussing system-level issues and on-going effort on supporting microblogs through the rising wave of big data systems. In addition, we review the major research topics that exploit these core data management components to provide innovative and effective analysis and visualization for microblogs, such as event detection, recommendations, automatic geotagging, and user queries. Throughout the different parts, we highlight the challenges, innovations, and future opportunities in microblogs data research.

Journal ArticleDOI
TL;DR: A model for extracting and classifying emotions in Arabic tweets based on four emotions: sad, joy, disgust, and anger is presented, which improves the state of the art in the classification of Arabic tweets using support vector machine (SVM) and Naïve Bayes (NB) that give the best results.
Abstract: Twitter is one of the most used microblogs in social media communication channels. Emotion detection has recently emerged as an important research field. Extracting emotions in Twitter microblogs ha...

Journal ArticleDOI
TL;DR: This work focuses on how COVID-19 has influenced the attention dynamics on the biggest Chinese microblogging website Sina Weibo during the first four months of the pandemic, and explores the dynamics of HSL by measuring the ranking dynamics and the lifetimes of hashtags on the list.
Abstract: Understanding attention dynamics on social media during pandemics could help governments minimize the effects. We focus on how COVID-19 has influenced the attention dynamics on the biggest Chinese microblogging website Sina Weibo during the first four months of the pandemic. We study the real-time Hot Search List (HSL), which provides the ranking of the most popular 50 hashtags based on the amount of Sina Weibo searches. We show how the specific events, measures and developments during the epidemic affected the emergence of different kinds of hashtags and the ranking on the HSL. A significant increase of COVID-19 related hashtags started to occur on HSL around January 20, 2020, when the transmission of the disease between humans was announced. Then very rapidly a situation was reached where COVID-related hashtags occupied 30-70% of the HSL, however, with changing content. We give an analysis of how the hashtag topics changed during the investigated time span and conclude that there are three periods separated by February 12 and March 12. In period 1, we see strong topical correlations and clustering of hashtags; in period 2, the correlations are weakened, without clustering pattern; in period 3, we see a potential of clustering while not as strong as in period 1. We further explore the dynamics of HSL by measuring the ranking dynamics and the lifetimes of hashtags on the list. This way we can obtain information about the decay of attention, which is important for decisions about the temporal placement of governmental measures to achieve permanent awareness. Furthermore, our observations indicate abnormally higher rank diversity in the top 15 ranks on HSL due to the COVID-19 related hashtags, revealing the possibility of algorithmic intervention from the platform provider.
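
The two list-level measures mentioned in this abstract, hashtag lifetimes and rank diversity, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: each snapshot is assumed to be a list of hashtags ordered by rank, lifetime is taken as the number of snapshots in which a hashtag appears, and rank diversity uses one common definition (distinct hashtags observed at a rank, divided by the number of snapshots). The function names are hypothetical.

```python
from collections import defaultdict


def hashtag_lifetimes(snapshots):
    """Count in how many snapshots each hashtag appears on the list."""
    lifetimes = defaultdict(int)
    for ranking in snapshots:
        for tag in set(ranking):  # set() guards against duplicates in a snapshot
            lifetimes[tag] += 1
    return dict(lifetimes)


def rank_diversity(snapshots):
    """Per rank: distinct hashtags seen at that rank / number of snapshots."""
    seen = defaultdict(set)
    for ranking in snapshots:
        for rank, tag in enumerate(ranking):
            seen[rank].add(tag)
    return [len(seen[r]) / len(snapshots) for r in sorted(seen)]


# Three toy snapshots of a top-3 list (index 0 is the top rank).
snapshots = [["a", "b", "c"], ["a", "c", "b"], ["a", "d", "b"]]
lifetimes = hashtag_lifetimes(snapshots)
diversity = rank_diversity(snapshots)
```

A rank whose occupant changes in every snapshot has diversity 1.0, while a rank held by a single hashtag throughout approaches 0 as the number of snapshots grows.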

Journal ArticleDOI
TL;DR: This work addresses two challenges in microblog content summarization by considering both topic sentiments and topic aspects in tandem and is able to outperform existing methods on standard metrics such as ROUGE-1.
Abstract: Recent advances in microblog content summarization have primarily viewed this task in the context of traditional multi-document summarization techniques where a microblog post or a collection of posts forms one document. While these techniques already facilitate information aggregation, categorization and visualization of microblog posts, they fall short in two aspects: i) when summarizing a certain topic from microblog content, not all existing techniques take topic polarity into account. This is an important consideration in that the summarization of a topic should cover all aspects of the topic and hence taking polarity into account (sentiment) can lead to the inclusion of the less popular polarity in the summarization process. ii) Some summarization techniques produce summaries at the topic level. However, it is possible that a given topic can have more than one important aspect that needs to have representation in the summarization process. Our work in this paper addresses these two challenges by considering both topic sentiments and topic aspects in tandem. We compare our work with the state-of-the-art Twitter summarization techniques and show that our method is able to outperform existing methods on standard metrics such as ROUGE-1.
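
For readers unfamiliar with the ROUGE-1 metric named in this abstract, it measures unigram overlap between a candidate summary and a reference summary. The sketch below is a simplified illustration, assuming whitespace tokenization and lowercasing with no stemming or stopword handling; the function name `rouge1` is hypothetical and this is not the evaluation code used in the paper.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 between two summaries."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared unigram.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat", "the cat sat on the mat")
```

In this toy case all three candidate words appear in the six-word reference, so precision is 1.0 and recall is 0.5.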

Journal ArticleDOI
TL;DR: A theoretical model is built to explain the formation of a user-to-user interaction network and to understand the underlying mechanism by which a user comes to be replied to by others in government microblogs, and experimental results provide important empirical evidence that user interaction in online government microblogs is highly reciprocal and transitive.
Abstract: Although social media plays a significant role in public affairs, some online government communities still lack sufficient response, triggering a negative environment for the policy decision process. In order to foster a healthy online public community, understanding the mechanism and drivers of user interaction becomes critical, especially for user-to-user interaction. We build a theoretical model to explain the formation of a user-to-user interaction network and to understand the underlying mechanism by which a user comes to be replied to by others in government microblogs. We collect a dataset from government microblogs on Sina Weibo to construct attributed-user reply networks containing 2461 users and 3937 replies. By using the exponential random graph model (ERGM), we test how network structures and attributes, especially interest similarity and emotion expression, affect the formation of user reply networks. Our experimental results provide important empirical evidence that user interaction in online government microblogs is highly reciprocal and transitive. Interest-based homophily is a significant predictor, whereas gender homophily is not. Moreover, users with high social influence and with more extreme emotional signals are more likely to get replies from others. However, users with high social influence and activeness are, in turn, unlikely to reply to others.
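
The terms "reciprocal" and "transitive" in this abstract refer to structural tendencies of the directed reply network. The sketch below computes simple descriptive versions of both on a toy edge list. It is not an ERGM and does not reproduce the paper's analysis; an ERGM estimates such tendencies as model parameters while controlling for other effects. The edge direction (replier, replied-to) and the function names are assumptions made for illustration.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse edge (v, u) also exists."""
    edges = set(edges)
    if not edges:
        return 0.0
    return sum((v, u) in edges for u, v in edges) / len(edges)


def transitivity(edges):
    """Fraction of directed two-paths i -> j -> k that are closed by an edge i -> k."""
    edges = set(edges)
    successors = {}
    for u, v in edges:
        successors.setdefault(u, set()).add(v)
    two_paths = closed = 0
    for i, j in edges:
        if i == j:
            continue  # ignore self-replies
        for k in successors.get(j, ()):
            if k != i and k != j:
                two_paths += 1
                closed += (i, k) in edges
    return closed / two_paths if two_paths else 0.0


# Toy reply network: (replier, replied-to); user names are placeholders.
replies = [("a", "b"), ("b", "a"), ("b", "c"), ("a", "c")]
```

Here two of the four edges are reciprocated, and both two-paths (a to b to c, and b to a to c) are closed by a direct edge.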

Posted Content
TL;DR: This paper releases a multilingual dataset of social media posts related to COVID-19, consisting of microblogs in English and Japanese from Twitter and those in Chinese from Weibo, and provides a quantitative as well as qualitative analysis of these datasets by creating daily word clouds as an example of text-mining analysis.
Abstract: Since the outbreak of coronavirus disease 2019 (COVID-19) in late 2019, it has affected over 200 countries and billions of people worldwide. This has affected the social life of people owing to enforcements, such as "social distancing" and "stay at home." This has resulted in an increasing interaction through social media. Given that social media can bring us valuable information about COVID-19 at a global scale, it is important to share the data and encourage social media studies against COVID-19 or other infectious diseases. Therefore, we have released a multilingual dataset of social media posts related to COVID-19, consisting of microblogs in English and Japanese from Twitter and those in Chinese from Weibo. The data cover microblogs from January 20, 2020, to March 24, 2020. This paper also provides a quantitative as well as qualitative analysis of these datasets by creating daily word clouds as an example of text-mining analysis. The dataset is now available on GitHub. This dataset can be analyzed in a multitude of ways and is expected to help in efficient communication of precautions related to COVID-19.
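
A daily word cloud is essentially a rendering of per-day word frequencies. The sketch below shows only that counting step, under simplifying assumptions: posts arrive as hypothetical (date string, text) pairs, and tokenization is by whitespace, which suits English but not Chinese or Japanese text, where a word segmenter would be needed. The function name `daily_word_counts` and the sample posts are illustrative and do not come from the released dataset.

```python
from collections import Counter, defaultdict


def daily_word_counts(posts):
    """Map each date to a Counter of lowercased, punctuation-stripped words."""
    counts = defaultdict(Counter)
    for day, text in posts:
        words = (w.strip(".,!?\"'()").lower() for w in text.split())
        counts[day].update(w for w in words if w)  # drop empty tokens
    return counts


# Hypothetical sample posts, not drawn from the dataset.
posts = [
    ("2020-01-20", "Stay home, stay safe"),
    ("2020-01-20", "Stay safe!"),
    ("2020-01-21", "Wash hands"),
]
counts = daily_word_counts(posts)
```

The resulting counters can be passed to any word-cloud renderer, or inspected directly with `counts[day].most_common(n)`.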