Conference

Web Search and Data Mining

About: Web Search and Data Mining is an academic conference. It publishes mainly in the areas of computer science and recommender systems. Over its lifetime, the conference has published 1,424 publications, which have received 87,855 citations.

Topics: Computer science, Recommender system, Ranking (information retrieval), Web search query, Social media

Papers

Proceedings Article

TwitterRank: finding topic-sensitive influential twitterers

Jianshu Weng¹, Ee-Peng Lim¹, Jing Jiang¹, Qi He²

Singapore Management University¹, Pennsylvania State University²

04 Feb 2010

TL;DR: TwitterRank, an extension of PageRank proposed to measure the influence of users in Twitter, is shown experimentally to outperform the algorithm Twitter currently uses and other related algorithms, including the original PageRank and Topic-sensitive PageRank.

Abstract: This paper focuses on the problem of identifying influential users of micro-blogging services. Twitter, one of the most notable micro-blogging services, employs a social-networking model called "following", in which each user can choose who she wants to "follow" to receive tweets from without requiring the latter to give permission first. In a dataset prepared for this study, it is observed that (1) 72.4% of the users in Twitter follow more than 80% of their followers, and (2) 80.5% of the users have 80% of users they are following follow them back. Our study reveals that the presence of "reciprocity" can be explained by phenomenon of homophily. Based on this finding, TwitterRank, an extension of PageRank algorithm, is proposed to measure the influence of users in Twitter. TwitterRank measures the influence taking both the topical similarity between users and the link structure into account. Experimental results show that TwitterRank outperforms the one Twitter currently uses and other related algorithms, including the original PageRank and Topic-sensitive PageRank.

1,974 citations
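
The TwitterRank abstract above describes an extension of PageRank that takes both topical similarity between users and the link structure into account. A minimal sketch of that general idea follows, assuming a toy follower graph and made-up similarity scores. The function name `similarity_weighted_pagerank`, the damping value, and the edge-weighting scheme are illustrative assumptions, not the paper's actual formulation.

```python
def similarity_weighted_pagerank(follows, sim, damping=0.85, iters=100):
    """Power iteration on a follower graph with similarity-weighted edges.

    follows: dict mapping each user to the list of users they follow.
    sim: dict mapping (follower, followee) to a non-negative similarity.
    Both structures are hypothetical inputs chosen for this sketch.
    """
    users = sorted(set(follows) | {v for vs in follows.values() for v in vs})
    n = len(users)
    rank = {u: 1.0 / n for u in users}
    for _ in range(iters):
        # Teleportation mass shared equally by all users.
        new = {u: (1.0 - damping) / n for u in users}
        for u in users:
            out = follows.get(u, [])
            total = sum(sim.get((u, v), 0.0) for v in out)
            if total == 0.0:
                # No usable outgoing edge: spread this user's rank uniformly.
                for v in users:
                    new[v] += damping * rank[u] / n
            else:
                # Pass rank to followees in proportion to topical similarity.
                for v in out:
                    new[v] += damping * rank[u] * sim.get((u, v), 0.0) / total
        rank = new
    return rank


# Toy data: "a" follows "b" and "c" but is topically closer to "c".
follows = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
sim = {("a", "b"): 0.2, ("a", "c"): 0.8, ("b", "c"): 1.0, ("c", "a"): 1.0}
ranks = similarity_weighted_pagerank(follows, sim)
```

In this toy graph, "c" ends up ranked above "b" because it receives most of the influence flowing out of "a" and all of it flowing out of "b".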

Proceedings Article

Everyone's an influencer: quantifying influence on twitter

Eytan Bakshy¹, Jake M. Hofman², Winter Mason², Duncan J. Watts²

University of Michigan¹, Yahoo!²

09 Feb 2011

TL;DR: It is concluded that word-of-mouth diffusion can only be harnessed reliably by targeting large numbers of potential influencers, thereby capturing average effects, and that predictions of which particular user or URL will generate large cascades are relatively unreliable.

Abstract: In this paper we investigate the attributes and relative influence of 1.6M Twitter users by tracking 74 million diffusion events that took place on the Twitter follower graph over a two month interval in 2009. Unsurprisingly, we find that the largest cascades tend to be generated by users who have been influential in the past and who have a large number of followers. We also find that URLs that were rated more interesting and/or elicited more positive feelings by workers on Mechanical Turk were more likely to spread. In spite of these intuitive results, however, we find that predictions of which particular user or URL will generate large cascades are relatively unreliable. We conclude, therefore, that word-of-mouth diffusion can only be harnessed reliably by targeting large numbers of potential influencers, thereby capturing average effects. Finally, we consider a family of hypothetical marketing strategies, defined by the relative cost of identifying versus compensating potential "influencers." We find that although under some circumstances, the most influential users are also the most cost-effective, under a wide range of plausible assumptions the most cost-effective performance can be realized using "ordinary influencers"---individuals who exert average or even less-than-average influence.

1,834 citations

Proceedings Article

Recommender systems with social regularization

Hao Ma¹, Dengyong Zhou², Chao Liu², Michael R. Lyu¹, Irwin King³

The Chinese University of Hong Kong¹, Microsoft², AT&T Labs³

09 Feb 2011

TL;DR: This paper proposes a matrix factorization framework with social regularization that can be easily extended to incorporate other contextual information, such as social tags, and demonstrates on two large datasets that the approaches outperform other state-of-the-art methods.

Abstract: Although Recommender Systems have been comprehensively analyzed in the past decade, the study of social-based recommender systems just started. In this paper, aiming at providing a general method for improving recommender systems by incorporating social network information, we propose a matrix factorization framework with social regularization. The contributions of this paper are four-fold: (1) We elaborate how social network information can benefit recommender systems; (2) We interpret the differences between social-based recommender systems and trust-aware recommender systems; (3) We coin the term Social Regularization to represent the social constraints on recommender systems, and we systematically illustrate how to design a matrix factorization objective function with social regularization; and (4) The proposed method is quite general, which can be easily extended to incorporate other contextual information, like social tags, etc. The empirical analysis on two large datasets demonstrates that our approaches outperform other state-of-the-art methods.

1,573 citations
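
The social regularization abstract above mentions designing a matrix factorization objective function with social constraints. One way such an objective could look is sketched below, assuming a pairwise penalty that pulls the latent factors of connected users together. The abstract does not give the exact form of the regularizer, so the function name `social_mf_loss`, the weights `lam` and `beta`, and the penalty itself are assumptions made for illustration.

```python
def social_mf_loss(ratings, U, V, friends, lam=0.1, beta=0.5):
    """Rating error + L2 penalty + a social term between friends' factors.

    ratings: dict mapping (user, item) to an observed rating.
    U, V: dicts mapping user / item ids to latent factor lists.
    friends: dict mapping a user to a list of (friend, similarity) pairs.
    All names and the penalty form are hypothetical, not taken from the paper.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def sq_norm(a):
        return sum(x * x for x in a)

    # Squared error over observed ratings only.
    error = sum((r - dot(U[u], V[i])) ** 2 for (u, i), r in ratings.items())
    # Standard L2 regularization on all latent factors.
    l2 = sum(sq_norm(vec) for vec in U.values())
    l2 += sum(sq_norm(vec) for vec in V.values())
    # Social term: similarity-weighted squared distance between friends.
    social = sum(
        s * sq_norm([x - y for x, y in zip(U[u], U[f])])
        for u, fs in friends.items()
        for f, s in fs
    )
    return 0.5 * error + 0.5 * lam * l2 + 0.5 * beta * social


ratings = {("u1", "i1"): 5.0, ("u2", "i1"): 3.0}
U = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
V = {"i1": [2.0, 1.0]}
friends = {"u1": [("u2", 1.0)]}
loss_with_social = social_mf_loss(ratings, U, V, friends, beta=0.5)
loss_without_social = social_mf_loss(ratings, U, V, friends, beta=0.0)
```

Setting `beta=0.0` recovers a plain regularized matrix factorization loss, which makes it easy to see how much the social term contributes for a given set of factors.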

Proceedings Article

A holistic lexicon-based approach to opinion mining

Xiaowen Ding¹, Bing Liu¹, Philip S. Yu¹

University of Illinois at Chicago¹

11 Feb 2008

TL;DR: This paper proposes a holistic lexicon-based approach to solving the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews by exploiting external evidences and linguistic conventions of natural language expressions.

Abstract: One of the important types of information on the Web is the opinions expressed in the user generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews. This problem has many applications, e.g., opinion mining, summarization and search. Most existing techniques utilize a list of opinion (bearing) words (also called opinion lexicon) for the purpose. Opinion words are words that express desirable (e.g., great, amazing, etc.) or undesirable (e.g., bad, poor, etc.) states. These approaches, however, all have some major shortcomings. In this paper, we propose a holistic lexicon-based approach to solving the problem by exploiting external evidences and linguistic conventions of natural language expressions. This approach allows the system to handle opinion words that are context dependent, which cause major difficulties for existing algorithms. It also deals with many special words, phrases and language constructs which have impacts on opinions based on their linguistic patterns. It also has an effective function for aggregating multiple conflicting opinion words in a sentence. A system, called Opinion Observer, based on the proposed technique has been implemented. Experimental results using a benchmark product review data set and some additional reviews show that the proposed technique is highly effective. It outperforms existing methods significantly.

1,404 citations
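
The opinion mining abstract above refers to a function for aggregating multiple conflicting opinion words in a sentence. A minimal sketch of one plausible aggregation follows, assuming each opinion word contributes its lexicon polarity divided by its token distance to the product feature. The function name `feature_orientation`, the tiny lexicon, and the distance weighting are assumptions for illustration; the abstract does not specify the function used in Opinion Observer.

```python
def feature_orientation(tokens, feature, lexicon):
    """Classify the opinion on `feature` from nearby opinion words.

    tokens: list of lowercase words in one sentence.
    feature: the product feature word, which must appear in tokens.
    lexicon: dict mapping opinion words to +1 (desirable) or -1 (undesirable).
    The distance weighting below is a hypothetical choice for this sketch.
    """
    pos = tokens.index(feature)
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok in lexicon and i != pos:
            # Closer opinion words count more toward the feature's score.
            score += lexicon[tok] / abs(i - pos)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


lexicon = {"great": 1, "amazing": 1, "bad": -1, "poor": -1}
tokens = "the battery is great but the screen is poor".split()
battery_opinion = feature_orientation(tokens, "battery", lexicon)
screen_opinion = feature_orientation(tokens, "screen", lexicon)
```

The same sentence contains one desirable and one undesirable opinion word, yet each feature is assigned the orientation of the word nearest to it.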

Proceedings Article

Opinion spam and analysis

Nitin Jindal¹, Bing Liu¹

University of Illinois at Chicago¹

11 Feb 2008

TL;DR: It is shown that opinion spam is quite different from Web spam and email spam and thus requires different detection techniques; the paper analyzes such spam activities and presents some novel techniques to detect them.

Abstract: Evaluative texts on the Web have become a valuable source of opinions on products, services, events, individuals, etc. Recently, many researchers have studied such opinion sources as product reviews, forum posts, and blogs. However, existing research has been focused on classification and summarization of opinions using natural language processing and data mining techniques. An important issue that has been neglected so far is opinion spam or trustworthiness of online opinions. In this paper, we study this issue in the context of product reviews, which are opinion rich and are widely used by consumers and product manufacturers. In the past two years, several startup companies also appeared which aggregate opinions from product reviews. It is thus high time to study spam in reviews. To the best of our knowledge, there is still no published study on this topic, although Web spam and email spam have been investigated extensively. We will see that opinion spam is quite different from Web spam and email spam, and thus requires different detection techniques. Based on the analysis of 5.8 million reviews and 2.14 million reviewers from amazon.com, we show that opinion spam in reviews is widespread. This paper analyzes such spam activities and presents some novel techniques to detect them.

1,385 citations

Performance

Metrics

Papers: 1,424

Citations: 87,853

No. of papers from the Conference in previous years
Year	Papers
2023	128
2022	51
2021	157
2020	129
2019	122
2018	112