Book ChapterDOI

Assessing the Effects of Social Familiarity and Stance Similarity in Interaction Dynamics

TL;DR: The current work provides insights into the impact of topic-specific stance similarity and social familiarity on social interaction dynamics, with respect to specific familiarities between user pairs as well as social communities.
Abstract: Homophily, the phenomenon of similar people getting connected to and being socially familiar with each other, is well known on online social networks. Detection of user stance towards given topics on online social networks, specifically Twitter, has emerged as a mainstream research topic. The current work provides insights into the impact of topic-specific stance similarity and social familiarity on social interaction dynamics. This is a novel yet fundamental problem in social network research that has so far remained unexplored in the literature. Specifically, we address two key aspects. One, we investigate whether the smoothness (politeness) level of conversations between user pairs relates to their overall stance similarity (spanning across topics). Two, we examine how interaction smoothness (politeness) varies with social familiarity and topical stance similarity. We propose a novel approach based on word embeddings to compare across users and across topics. We analyze the relationship between topical stance similarity, social familiarity, and interaction politeness of users, with respect to specific familiarities between user pairs as well as social communities.
Citations
Journal ArticleDOI
TL;DR: An exhaustive review of stance detection techniques on social media, including the task definition, different types of targets in stance detection, features set used, and various machine learning approaches applied is presented.
Abstract: Stance detection on social media is an emerging opinion mining paradigm for various social and political applications in which sentiment analysis may be sub-optimal. There has been growing research interest in developing effective stance detection methods across multiple communities, including natural language processing, web science, and social computing, each of which has modeled stance detection in different ways. In this paper, we survey the work on stance detection across those communities and present an exhaustive review of stance detection techniques on social media, including the task definition, different types of targets in stance detection, feature sets used, and various machine learning approaches applied. Our survey reports state-of-the-art results on the existing benchmark datasets on stance detection, and discusses the most effective approaches. In addition, we explore the emerging trends and different applications of stance detection on social media, including opinion mining and prediction and, more recently, fake news detection. The study concludes by discussing the gaps in existing research and highlights possible future directions for stance detection on social media.

121 citations


Cites methods from "Assessing the Effects of Social Fam..."

  • ...Another element that has been heavily investigated is hashtags, this element has been used in literature to infer similarity between users in order to predict the stance [Darwish et al. 2017a; Dey et al. 2017]....

    [...]

  • ...The work of [Dey et al. 2017] used soft cosine similarity to gauge the similarity between the users who post on the same hashtags....

    [...]
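The soft cosine similarity mentioned in the excerpt above can be illustrated with a minimal sketch. It uses the standard definition, sim(a, b) = aᵀSb / (√(aᵀSa) · √(bᵀSb)), where S holds pairwise feature similarities. The hashtag vectors and the similarity matrix below are hypothetical values chosen for illustration; they are not taken from Dey et al. 2017, and this is not presented as their implementation.

```python
import math


def soft_cosine(a, b, S):
    """Soft cosine similarity of vectors a and b under feature-similarity matrix S.

    S[i][j] is the assumed similarity between feature i and feature j.
    With S equal to the identity matrix this reduces to ordinary cosine similarity.
    """
    def bilinear(x, y):
        # Computes x^T S y with plain Python loops.
        return sum(S[i][j] * x[i] * y[j]
                   for i in range(len(x))
                   for j in range(len(y)))

    denom = math.sqrt(bilinear(a, a)) * math.sqrt(bilinear(b, b))
    return bilinear(a, b) / denom if denom > 0 else 0.0


# Hypothetical setup: three hashtags, where the first two are assumed related (0.8).
S = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
IDENTITY = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

user_a = [1, 0, 0]  # posts only on hashtag 0
user_b = [0, 1, 0]  # posts only on hashtag 1

soft = soft_cosine(user_a, user_b, S)          # credits the related hashtags
plain = soft_cosine(user_a, user_b, IDENTITY)  # ordinary cosine: no overlap
```

Unlike ordinary cosine similarity, the soft variant gives two users a non-zero score when they post on different but related hashtags, which is why it suits sparse hashtag usage vectors.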

Proceedings ArticleDOI
17 Aug 2022
TL;DR: A novel model named BIC is proposed that makes the text and graph modalities interactive and also detects semantic consistency within tweet content; experiments demonstrate the effectiveness of the proposed interaction and semantic consistency detection.
Abstract: Twitter bots are automatic programs operated by malicious actors to manipulate public opinion and spread misinformation. Research efforts have been made to automatically identify bots based on texts and networks on social media. Existing methods only leverage texts or networks alone, and while few works explored the shallow combination of the two modalities, we hypothesize that the interaction and information exchange between texts and graphs could be crucial for holistically evaluating bot activities on social media. In addition, according to a recent survey (Cresci, 2020), Twitter bots are constantly evolving while advanced bots steal genuine users’ tweets and dilute their malicious content to evade detection. This results in greater inconsistency across the timeline of novel Twitter bots, which warrants more attention. In light of these challenges, we propose BIC, a Twitter Bot detection framework with text-graph Interaction and semantic Consistency. Specifically, in addition to separately modeling the two modalities on social media, BIC employs a text-graph interaction module to enable information exchange across modalities in the learning process. In addition, given the stealing behavior of novel Twitter bots, BIC proposes to model semantic consistency in tweets based on attention weights while using it to augment the decision process. Extensive experiments demonstrate that BIC consistently outperforms state-of-the-art baselines on two widely adopted datasets. Further analyses reveal that text-graph interactions and modeling semantic consistency are essential improvements and help combat bot evolution.

6 citations

Book ChapterDOI
TL;DR: It is empirically shown that homophily grows linearly with increasing familiarity, reaches a peak, and subsequently falls, indicating that familiarity correlates with similarity up to a point, beyond which similarity occurs for other reasons.
Abstract: We perform a first-of-its-kind characterization of topical homophily - familiarity co-occurring with topic-participation similarity of user pairs - by correlating topic participation similarity and degree of familiarity of users on Twitter. We quantify similarity between a user pair by measuring their distribution of participation in topics, wherein topics are defined as clusters of hashtags formed using semantically related user-generated content. We examine the topic participation similarity of users against different degrees of familiarity: edges, shared neighbors, and structural communities. We provide varying relaxation in identifying topics, and characterize the correlation of topical similarity with the degree of familiarity over the range of relaxation. We empirically substantiate the characteristics of topical homophily, over the varying relaxation of identified topics. We empirically show that homophily grows linearly with increasing familiarity, reaches a peak, and subsequently falls, indicating that familiarity correlates with similarity up to a point, beyond which similarity occurs for other reasons.

4 citations

Proceedings ArticleDOI
08 Feb 2023
TL;DR: Zhang et al. propose a method to extract social relation information from (dis)agreement data into an inductive social relation graph, merely using the comment-reply pairs without any additional platform-specific information.
Abstract: (Dis)agreement detection aims to identify the authors’ attitudes or positions (agree, disagree, neutral) towards a specific text. Existing methods that rely only on textual information are limited in identifying (dis)agreements, especially in cross-domain settings. Social relation information can play a supporting role in the (dis)agreement task alongside textual information. We propose a novel method to extract such relation information from (dis)agreement data into an inductive social relation graph, merely using the comment-reply pairs without any additional platform-specific information. The inductive social relation graph globally considers the historical discussion and the relation between authors. Textual information based on a pre-trained language model and social relation information encoded by pre-trained RGCN are jointly considered for (dis)agreement detection. Experimental results show that our model achieves state-of-the-art performance for both the in-domain and cross-domain tasks on the benchmark – DEBAGREEMENT. We find social relations can boost the performance of the (dis)agreement detection model, especially for the long-token comment-reply pairs, demonstrating the effectiveness of the social relation graph. We also explore the effect of the knowledge graph embedding methods, the information fusing method, and the time interval in constructing the social relation graph, which shows the effectiveness of our model.
References
Journal ArticleDOI
TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Abstract: We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.

30,570 citations

Proceedings ArticleDOI
01 Oct 2014
TL;DR: A new global logbilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods and produces a vector space with meaningful substructure.
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global logbilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word cooccurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.

30,558 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Abstract: We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI) [3]. In the context of text modeling, our model posits that each document is generated as a mixture of topics, where the continuous-valued mixture proportions are distributed as a latent Dirichlet random variable. Inference and learning are carried out efficiently via variational algorithms. We present empirical results on applications of this model to problems in text modeling, collaborative filtering, and text classification.

25,546 citations

Posted Content
TL;DR: This paper proposed two novel model architectures for computing continuous vector representations of words from very large data sets, and the quality of these representations is measured in a word similarity task and the results are compared to the previously best performing techniques based on different types of neural networks.
Abstract: We propose two novel model architectures for computing continuous vector representations of words from very large data sets. The quality of these representations is measured in a word similarity task, and the results are compared to the previously best performing techniques based on different types of neural networks. We observe large improvements in accuracy at much lower computational cost, i.e. it takes less than a day to learn high quality word vectors from a 1.6 billion words data set. Furthermore, we show that these vectors provide state-of-the-art performance on our test set for measuring syntactic and semantic word similarities.

20,077 citations

Journal ArticleDOI
TL;DR: The homophily principle states that similarity breeds connection, with the result that people's personal networks are homogeneous with regard to many sociodemographic, behavioral, and intrapersonal characteristics.
Abstract: Similarity breeds connection. This principle—the homophily principle—structures network ties of every type, including marriage, friendship, work, advice, support, information transfer, exchange, comembership, and other types of relationship. The result is that people's personal networks are homogeneous with regard to many sociodemographic, behavioral, and intrapersonal characteristics. Homophily limits people's social worlds in a way that has powerful implications for the information they receive, the attitudes they form, and the interactions they experience. Homophily in race and ethnicity creates the strongest divides in our personal environments, with age, religion, education, occupation, and gender following in roughly that order. Geographic propinquity, families, organizations, and isomorphic positions in social systems all create contexts in which homophilous relations form. Ties between nonsimilar individuals also dissolve at a higher rate, which sets the stage for the formation of niches (localize...

15,738 citations