Negative Deceptive Opinion Spam

Proceedings Article•

Negative Deceptive Opinion Spam

Myle Ott¹, Claire Cardie¹, Jeffrey T. Hancock¹•Institutions (1)

01 Jun 2013-pp 497-501

TL;DR: This work creates and studies the first dataset of deceptive opinion spam with negative sentiment reviews, and finds that standard n-gram text categorization techniques can detect negative deceptive opinion spam with performance far surpassing that of human judges.

Abstract: The rising influence of user-generated online reviews (Cone, 2011) has led to growing incentive for businesses to solicit and manufacture DECEPTIVE OPINION SPAM—fictitious reviews that have been deliberately written to sound authentic and deceive the reader. Recently, Ott et al. (2011) have introduced an opinion spam dataset containing gold standard deceptive positive hotel reviews. However, the complementary problem of negative deceptive opinion spam, intended to slander competitive offerings, remains largely unstudied. Following an approach similar to Ott et al. (2011), in this work we create and study the first dataset of deceptive opinion spam with negative sentiment reviews. Based on this dataset, we find that standard n-gram text categorization techniques can detect negative deceptive opinion spam with performance far surpassing that of human judges. Finally, in conjunction with the aforementioned positive review dataset, we consider the possible interactions between sentiment and deception, and present initial results that encourage further exploration of this relationship.
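
As a rough illustration of what "standard n-gram text categorization" can look like, the sketch below trains a tiny multinomial Naive Bayes classifier on unigram and bigram counts. It is not the authors' pipeline: the choice of Naive Bayes, the `ngrams` helper, the `NgramNaiveBayes` class, and the four toy reviews are all assumptions invented for this example.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max from a lowercased text."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over n-gram counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)  # number of documents per class
        self.feat_counts = {}              # class -> n-gram counts
        self.vocab = set()
        for text, label in zip(texts, labels):
            counts = self.feat_counts.setdefault(label, Counter())
            feats = ngrams(text)
            counts.update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counts in self.feat_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[label] / n_docs)
            for feat in ngrams(text):
                if feat in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[feat] + 1) / (total + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Toy, hand-written reviews (hypothetical; not drawn from the actual dataset).
train_texts = [
    "my husband and i had a terrible terrible stay",
    "worst hotel ever my family was so disappointed",
    "the room was small and the bathroom floor was dirty",
    "check in took an hour and the elevator was broken",
]
train_labels = ["deceptive", "deceptive", "truthful", "truthful"]

model = NgramNaiveBayes().fit(train_texts, train_labels)
print(model.predict("my family had a terrible stay"))
```

A realistic experiment would need far more data, cross-validation, and a stronger classifier; the point here is only how n-gram counts become features for a text classifier.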

Citations

A Study of the "Grammar of Conversation": The Case of the Longman Grammar of Spoken and Written English

周飯島

01 Jan 1999

TL;DR: The Longman Grammar of Spoken and Written English, published in 1999, is a large-scale grammar of English that aims to show how grammatical choices are used to create discourse in different situations.

Abstract: These tell us what choices are available in the grammar, but we also need to understand how these choices are used to create discourse in different situations. The year 1999 saw the publication of a large-scale grammar of English with the aim of meeting the above needs: the Longman ...

1,038 citations

Journal Article•DOI•

A survey on opinion mining and sentiment analysis

Kumar Satish Ravi¹, Vadlamani Ravi•Institutions (1)

University UCINF¹

01 Nov 2015-Knowledge Based Systems

TL;DR: A rigorous survey on sentiment analysis is presented, which portrays views presented by over one hundred articles published in the last decade regarding necessary tasks, approaches, and applications of sentiment analysis.

Abstract: With the advent of Web 2.0, people became more eager to express and share their opinions on web regarding day-to-day activities and global issues as well. Evolution of social media has also contributed immensely to these activities, thereby providing us a transparent platform to share views across the world. These electronic Word of Mouth (eWOM) statements expressed on the web are much prevalent in business and service industry to enable customer to share his/her point of view. In the last one and half decades, research communities, academia, public and service industries are working rigorously on sentiment analysis, also known as, opinion mining, to extract and analyze public mood and views. In this regard, this paper presents a rigorous survey on sentiment analysis, which portrays views presented by over one hundred articles published in the last decade regarding necessary tasks, approaches, and applications of sentiment analysis. Several sub-tasks need to be performed for sentiment analysis which in turn can be accomplished using various approaches and techniques. This survey covering published literature during 2002-2015, is organized on the basis of sub-tasks to be performed, machine learning and natural language processing techniques used and applications of sentiment analysis. The paper also presents open issues and along with a summary table of a hundred and sixty-one articles.

1,011 citations

Cites background or methods from "Negative Deceptive Opinion Spam"

...Document level 73 [13], [18], [22], [32], [33], [36], [40], [43], [45], [48], [50], [51], [53], [54], [61], [64], [66], [77], [81], [80], [85], [88], [90], [91], [94], [96], [101], [111], [117], [121], [123], [130], [131], [132], [148], [155], [156], [157], [158], [167], [168], [169], [175], [176], [177], [179], [180], [182], [194], [195], [197], [200], [203], [205], [206], [207], [209], [210], [211], [212], [217], [220], [221], [222], [223], [224], [225], [226], [227], [228], [229], [231], [232]...
[...]
...[200] dataset, which contains 400 deceptive and 400 truthful reviews on each positive and negative category....
[...]
...Some promising review spam detection methods included duplicate finding methods [234], concept similarity based method [235], content based method [200, 210], and review and reviewer oriented features based method [236] etc....
[...]
...[200] developed a negative deceptive opinion dataset and performed spam classification using SVM....
[...]
...
S# | Tasks and applications | #Articles | References
1 | Subjectivity Classification | 6 | [44], [75], [110], [163], [167], [174]
2 | Polarity determination | 43 | [12], [26], [29], [32], [33], [35], [40], [45], [48], [50], [54], [57], [66], [85], [95], [96], [108], [109], [112], [114], [123], [126], [154], [156], [157], [160], [162], [165], [166], [168], [169], [170], [171], [172], [176], [177], [178], [179], [180], [203], [205], [206], [209]
3 | Vagueness in opinionated text | 5 | [22], [41], [86], [216], [217]
4 | Multi- & cross-lingual SA | 6 | [46], [88], [94], [115], [148], [173]
5 | Cross-domain SA | 4 | [36], [98], [99], [121]
6 | Review usefulness measurement | 13 | [76], [78], [81], [130], [221], [222], [223], [224], [225], [226], [227], [228], [229]
7 | Opinion spam detection | 7 | [199], [200], [212], [216], [220], [231], [232]
8 | Lexica and corpora creation | 22 | [21], [23], [24], [30], [52], [55], [56], [69], [74], [97], [106], [111], [116], [117], [118], [127], [136], [202], [207], [211], [213], [214]
9 | Opinion word and aspects extraction, entity recognition, name disambiguation | 36 | [8], [11], [25], [27], [35], [37], [59], [60], [61], [62], [63], [67], [68], [92], [93], [100], [101], [102], [107], [125], [132], [175], [182], [185], [186], [189], [190], [191], [193], [194], [195], [196], [218], [240], [241], [243]
10 | Applications of SA | 21 | [13], [18], [43], [47], [49], [51], [53], [58], [64], [73], [77], [79], [80], [90], [91], [124], [131], [155], [158], [183], [184]
Total | | 163 |
...
[...]

Journal Article•DOI•

Automatic deception detection: methods for finding fake news

Niall J. Conroy¹, Victoria L. Rubin¹, Yimin Chen¹•Institutions (1)

University of Western Ontario¹

06 Nov 2015

TL;DR: This research surveys the current state-of-the-art technologies that are instrumental in the adoption and development of fake news detection.

Abstract: This research surveys the current state-of-the-art technologies that are instrumental in the adoption and development of fake news detection. "Fake news detection" is defined as the task of categorizing news along a continuum of veracity, with an associated measure of certainty. Veracity is compromised by the occurrence of intentional deceptions. The nature of online news publication has changed, such that traditional fact checking and vetting from potential deception is impossible against the flood arising from content generators, as well as various formats and genres. The paper provides a typology of several varieties of veracity assessment methods emerging from two major categories -- linguistic cue approaches (with machine learning), and network analysis approaches. We see promise in an innovative hybrid approach that combines linguistic cue and machine learning, with network-based behavioral data. Although designing a fake news detector is not a straightforward problem, we propose operational guidelines for a feasible fake news detecting system.

715 citations

Cites background from "Negative Deceptive Opinion Spam"

...The classification of sentiment (Pang & Lee, 2008; Ott et al., 2013) is based on the underlying intuition that deceivers use unintended emotional communication, judgment or evaluation of affective state (Hancock, Woodworth, & Porter, 2011)....
[...]
...Comparison between human judgement and SVM classifiers showed 86% performance accuracy on negative deceptive opinion spam (Ott et al., 2013)....
[...]

Journal Article•DOI•

Survey of review spam detection using machine learning techniques

Michael Crawford¹, Taghi M. Khoshgoftaar¹, Joseph D. Prusa¹, Aaron N. Richter¹, Hamzah Al Najada¹•Institutions (1)

Florida Atlantic University¹

05 Oct 2015-Journal of Big Data

TL;DR: This paper aims to provide a strong and comprehensive comparative study of current research on detecting review spam using various machine learning techniques and to devise a methodology for conducting further investigation.

Abstract: Online reviews are often the primary factor in a customer’s decision to purchase a product or service, and are a valuable source of information that can be used to determine public opinion on these products or services. Because of their impact, manufacturers and retailers are highly concerned with customer feedback and reviews. Reliance on online reviews gives rise to the potential concern that wrongdoers may create false reviews to artificially promote or devalue products and services. This practice is known as Opinion (Review) Spam, where spammers manipulate and poison reviews (i.e., making fake, untruthful, or deceptive reviews) for profit or gain. Since not all online reviews are truthful and trustworthy, it is important to develop techniques for detecting review spam. By extracting meaningful features from the text using Natural Language Processing (NLP), it is possible to conduct review spam detection using various machine learning techniques. Additionally, reviewer information, apart from the text itself, can be used to aid in this process. In this paper, we survey the prominent machine learning techniques that have been proposed to solve the problem of review spam detection and the performance of different approaches for classification and detection of review spam. The majority of current research has focused on supervised learning methods, which require labeled data, a scarcity when it comes to online review spam. Research on methods for Big Data are of interest, since there are millions of online reviews, with many more being generated daily. To date, we have not found any papers that study the effects of Big Data analytics for review spam detection. The primary goal of this paper is to provide a strong and comprehensive comparative study of current research on detecting review spam using various machine learning techniques and to devise methodology for conducting further investigation.

355 citations

Proceedings Article•DOI•

Towards a General Rule for Identifying Deceptive Opinion Spam

Jiwei Li¹, Myle Ott², Claire Cardie², Eduard Hovy¹•Institutions (2)

Carnegie Mellon University¹, Cornell University²

01 Jun 2014

TL;DR: This paper explores generalized approaches for identifying online deceptive opinion spam based on a new gold standard dataset comprising data from three different domains, each of which contains three types of reviews: customer-generated truthful reviews, Turker-generated deceptive reviews, and employee (domain-expert) generated deceptive reviews.

Abstract: Consumers’ purchase decisions are increasingly influenced by user-generated online reviews. Accordingly, there has been growing concern about the potential for posting deceptive opinion spam: fictitious reviews that have been deliberately written to sound authentic, to deceive the reader. In this paper, we explore generalized approaches for identifying online deceptive opinion spam based on a new gold standard dataset, which is comprised of data from three different domains (i.e. Hotel, Restaurant, Doctor), each of which contains three types of reviews, i.e. customer generated truthful reviews, Turker generated deceptive reviews and employee (domain-expert) generated deceptive reviews. Our approach tries to capture the general difference of language usage between deceptive and truthful reviews, which we hope will help customers when making purchase decisions and review portal operators, such as TripAdvisor or Yelp, investigate possible fraudulent activity on their sites.

293 citations

Cites background or methods from "Negative Deceptive Opinion Spam"

...Turk. A couple of follow-up works have been introduced based on Ott et al.’s dataset, including estimating prevalence of deception in online reviews (Ott et al., 2012), identification of negative deceptive opinion spam (Ott et al., 2013), and identifying manipulated offerings (Li et al., 2013b)....
[...]
...Ott et al. created a gold-standard collection by employing Turkers to write fake reviews, and follow-up research was based on their data (Ott et al., 2012; Ott et al., 2013; Li et al., 2013b; Feng and Hirst, 2013)....
[...]
...Identifying positive/negative opinion spam is explored in (Ott et al., 2011; Ott et al., 2013)...
[...]

References

Journal Article•DOI•

The measurement of observer agreement for categorical data

J. R. Landis¹, Gary G. Koch•Institutions (1)

University of Michigan¹

01 Mar 1977-Biometrics

TL;DR: A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented; tests for interobserver bias are expressed in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.

Abstract: This paper presents a general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies. The procedure essentially involves the construction of functions of the observed proportions which are directed at the extent to which the observers agree among themselves and the construction of test statistics for hypotheses involving these functions. Tests for interobserver bias are presented in terms of first-order marginal homogeneity and measures of interobserver agreement are developed as generalized kappa-type statistics. These procedures are illustrated with a clinical diagnosis example from the epidemiological literature.
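
To make the idea of a kappa-type agreement statistic concrete, here is a minimal sketch of the simplest member of that family, Cohen's kappa for two raters: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. This is a simplification for illustration, not the generalized multi-observer procedure the abstract describes; the function name `cohen_kappa` and the example labels are assumptions.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments: "d" = deceptive, "t" = truthful.
judge_1 = ["d", "d", "t", "t"]
judge_2 = ["d", "t", "t", "t"]
print(cohen_kappa(judge_1, judge_2))
```

In this toy case the judges agree on three of four items (p_o = 0.75), chance agreement is p_e = 0.5, and so kappa is 0.5.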

64,109 citations

Book•

Longman Grammar of Spoken and Written English

Douglas Biber¹, Randolph Quirk²•Institutions (2)

Iowa State University¹, University of St. Thomas (Minnesota)²

01 Jan 1999

TL;DR: The authors compare the frequency of constructions in different contexts, from conversation to fiction to academic prose, using the 40 million-word Longman Spoken and Written English (LSWE) Corpus.

Abstract:
* Over 350 tables and graphs show the frequency of constructions in different contexts, from conversation to fiction to academic prose
* Entirely corpus-based with 6000 authentic examples from the 40 million-word Longman Spoken and Written English Corpus
* Suggests the reasons why we choose a particular structure in a particular context
* Compares British and American spoken and written English

Areas covered include basic grammar: description and distribution, key word classes and their phrases and complex structures. Each area is subdivided into more detailed content.

3,876 citations

Journal Article•DOI•

Generalized additive models for location, scale and shape

Robert A. Rigby¹, Dimitrios Stasinopoulos¹•Institutions (1)

London Metropolitan University¹

01 Jun 2005-Journal of The Royal Statistical Society Series C-applied Statistics

TL;DR: The generalized additive model for location, scale and shape (GAMLSS) is a general class of statistical models for a univariate response variable that assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects.

Abstract: Summary. A general class of statistical models for a univariate response variable is presented which we call the generalized additive model for location, scale and shape (GAMLSS). The model assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects. The distribution for the response variable in the GAMLSS can be selected from a very general family of distributions including highly skew or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or random-effects terms. Maximum (penalized) likelihood estimation is used to fit the (non)parametric models. A Newton–Raphson or Fisher scoring algorithm is used to maximize the (penalized) likelihood. The additive terms in the model are fitted by using a backfitting algorithm. Censored data are easily incorporated into the framework. Five data sets from different fields of application are analysed to emphasize the generality of the GAMLSS class of models.

2,386 citations

"Negative Deceptive Opinion Spam" refers methods in this paper

...We use the R package GAMLSS (Rigby and Stasinopoulos, 2005) to fit a log-normal distribution (left truncated at 150 characters) to the lengths of the deceptive reviews....
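
The excerpt names the R package GAMLSS; as a language-neutral illustration of the underlying idea, the Python sketch below fits a left-truncated log-normal by maximum likelihood. It does not reproduce GAMLSS or the authors' code: the crude coordinate search, the helper names, and the synthetic "review lengths" are all assumptions. The key step is dividing the log-normal density by P(X > 150), the probability mass that survives truncation.

```python
import math
import random

TRUNC = 150  # left-truncation point in characters, taken from the excerpt


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_log_lik(mu, sigma, lengths, trunc=TRUNC):
    """Negative log-likelihood of a log-normal left-truncated at `trunc`."""
    if sigma <= 0:
        return math.inf
    tail = 1.0 - norm_cdf((math.log(trunc) - mu) / sigma)  # P(X > trunc)
    if tail <= 0:
        return math.inf
    nll = len(lengths) * math.log(tail)
    for x in lengths:
        z = (math.log(x) - mu) / sigma
        nll += math.log(x * sigma * math.sqrt(2.0 * math.pi)) + 0.5 * z * z
    return nll


def fit_truncated_lognormal(lengths, trunc=TRUNC, rounds=100):
    """Crude coordinate search for (mu, sigma); a stand-in for a real optimizer."""
    logs = [math.log(x) for x in lengths]
    mu = sum(logs) / len(logs)
    sigma = max(math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs)), 1e-3)
    best = neg_log_lik(mu, sigma, lengths, trunc)
    step = 0.5
    for _ in range(rounds):
        improved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            cand = neg_log_lik(mu + d_mu, sigma + d_sigma, lengths, trunc)
            if cand < best:
                mu, sigma, best = mu + d_mu, sigma + d_sigma, cand
                improved = True
        if not improved:
            step /= 2.0  # shrink the search step once no direction helps
    return mu, sigma


# Synthetic lengths (hypothetical data): log-normal draws, kept only above 150.
random.seed(0)
sample = [x for x in (random.lognormvariate(6.0, 0.5) for _ in range(2000)) if x > TRUNC]
mu_hat, sigma_hat = fit_truncated_lognormal(sample)
print(round(mu_hat, 2), round(sigma_hat, 2))
```

With real data, a library optimizer would be a better choice than this hand-rolled search; the sketch only shows how truncation enters the likelihood.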
[...]

Journal Article•DOI•

Nonverbal Leakage and Clues to Deception

Paul Ekman, Wallace V. Friesen

01 Feb 1969-Psychiatry MMC

TL;DR: The study explores the interaction situation, and considers how within deception interactions differences in neuroanatomy and cultural influences combine to produce specific types of body movements and facial expressions which escape efforts to deceive and emerge as leakage or deception clues.

Abstract: Research relevant to psychotherapy regarding facial expression and body movement has shown that the kind of information which can be gleaned from the patient's words - information about affects, attitudes, interpersonal styles, psychodynamics - can also be derived from his concomitant nonverbal behavior. The study explores the interaction situation, and considers how within deception interactions differences in neuroanatomy and cultural influences combine to produce specific types of body movements and facial expressions which escape efforts to deceive and emerge as leakage or deception clues.

1,594 citations

"Negative Deceptive Opinion Spam" refers background in this paper

..., 2001), and (3) increased negative emotion terms, often attributed to leakage cues (Ekman and Friesen, 1969), but perhaps better explained in our case as an exaggeration of the underlying review sentiment....
[...]

Journal Article•DOI•

Accuracy of Deception Judgments

Charles F. Bond¹, Bella M. DePaulo²•Institutions (2)

Texas Christian University¹, University of California, Santa Barbara²

01 Jan 2006-Personality and Social Psychology Review

TL;DR: It is proposed that people judge others' deceptions more harshly than their own and that this double standard in evaluating deceit can explain much of the accumulated literature.

Abstract: We analyze the accuracy of deception judgments, synthesizing research results from 206 documents and 24,483 judges. In relevant studies, people attempt to discriminate lies from truths in real time with no special aids or training. In these circumstances, people achieve an average of 54% correct lie-truth judgments, correctly classifying 47% of lies as deceptive and 61% of truths as nondeceptive. Relative to cross-judge differences in accuracy, mean lie-truth discrimination abilities are nontrivial, with a mean accuracy d of roughly .40. This produces an effect that is at roughly the 60th percentile in size, relative to others that have been meta-analyzed by social psychologists. Alternative indexes of lie-truth discrimination accuracy correlate highly with percentage correct, and rates of lie detection vary little from study to study. Our meta-analyses reveal that people are more accurate in judging audible than visible lies, that people appear deceptive when motivated to be believed, and that individuals regard their interaction partners as honest. We propose that people judge others' deceptions more harshly than their own and that this double standard in evaluating deceit can explain much of the accumulated literature.
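
The three percentages in this abstract are linked by simple arithmetic: overall accuracy is a weighted average of the per-class hit rates. The sketch below assumes, purely for illustration, that judges saw equal numbers of lies and truths; the abstract does not state the mix, and the function name `overall_accuracy` is hypothetical.

```python
def overall_accuracy(lie_hit_rate, truth_hit_rate, lie_share=0.5):
    """Overall accuracy as a weighted average of per-class hit rates.

    `lie_share` is the assumed fraction of judged statements that are lies.
    """
    return lie_share * lie_hit_rate + (1.0 - lie_share) * truth_hit_rate


# 47% of lies and 61% of truths classified correctly, balanced mix assumed.
print(overall_accuracy(0.47, 0.61))
```

Under the balanced assumption this gives (47% + 61%) / 2 = 54%, matching the reported average; the gap between 61% and 47% reflects the tendency, noted in the abstract, to regard interaction partners as honest.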

1,493 citations

"Negative Deceptive Opinion Spam" refers background or result in this paper

...To validate the credibility of our deceptive reviews, we show that human deception detection performance on the negative reviews is low, in agreement with decades of traditional deception detection research (Bond and DePaulo, 2006)....
[...]
...Recent large-scale meta-analyses have shown human deception detection performance is low, with accuracies often not much better than chance (Bond and DePaulo, 2006)....
[...]