Journal ArticleDOI

Neural networks for deceptive opinion spam detection

01 Apr 2017-Information Sciences (Elsevier)-Vol. 385, pp 213-224
TL;DR: This work empirically explores a neural network model that learns document-level representations for detecting deceptive opinion spam and shows that the proposed method outperforms state-of-the-art methods.
About: This article was published in Information Sciences on 2017-04-01 and has received 181 citations to date. The article focuses on the topics: Sentence & Feature learning.
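Pieced together from the TL;DR above and the citing excerpts below, the model learns sentence representations with a convolutional network and composes them into a document representation with a gated recurrent network. The PyTorch sketch below illustrates that pipeline only; it is not the authors' implementation, and all layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    """Encode one sentence (a sequence of word embeddings) into a fixed
    vector via 1-D convolution followed by max-over-time pooling."""
    def __init__(self, emb_dim=100, n_filters=100, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel, padding=1)

    def forward(self, x):                      # x: [batch, seq_len, emb_dim]
        h = self.conv(x.transpose(1, 2))       # [batch, n_filters, seq_len]
        return torch.relu(h).max(dim=2).values # [batch, n_filters]

class DocumentModel(nn.Module):
    """Compose sentence vectors with a GRU; classify the final state."""
    def __init__(self, sent_dim=100, hid=128, n_classes=2):
        super().__init__()
        self.gru = nn.GRU(sent_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, n_classes)

    def forward(self, sents):                  # sents: [batch, n_sents, dim]
        _, h_n = self.gru(sents)               # h_n: [1, batch, hid]
        return self.out(h_n.squeeze(0))        # deceptive vs. truthful logits

# Hypothetical usage: 4 documents, 6 sentences each, 20 words per sentence.
words = torch.randn(4 * 6, 20, 100)            # stand-in word embeddings
sent_vecs = SentenceCNN()(words).view(4, 6, -1)
logits = DocumentModel()(sent_vecs)
```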
Citations
Posted Content
02 Dec 2018
TL;DR: This survey comprehensively and systematically reviews fake news research and identifies fundamental theories across various disciplines, e.g., psychology and social science, to facilitate and enhance interdisciplinary research on fake news.
Abstract: The explosive growth of fake news and its erosion of democracy, justice, and public trust have increased the demand for fake news analysis, detection, and intervention. This survey comprehensively and systematically reviews fake news research. It identifies and specifies fundamental theories across various disciplines, e.g., psychology and social science, to facilitate and enhance interdisciplinary research on fake news. Current fake news research is reviewed, summarized, and evaluated. These studies examine fake news from four perspectives: (1) the false knowledge it carries, (2) its writing style, (3) its propagation patterns, and (4) the credibility of its creators and spreaders. We characterize each perspective with the analyzable and utilizable information provided by news and its spreaders, the strategies and frameworks that are adaptable, and the techniques that are applicable. By reviewing the characteristics of fake news and open issues in fake news studies, we highlight potential research tasks at the end of this survey.

212 citations


Cites background from "Neural networks for deceptive opini..."

  • ...This constant evolution in content style demands a real-time representation and/or learning of news content style, where e.g., deep learning can be helpful [Gogate et al. 2017; Li et al. 2017b; Ren and Ji 2017; Wang et al. 2018]....


  • ...Related studies can be seen in, e.g., [Jindal and Liu 2008; Li et al. 2014, 2017b; Mukherjee et al. 2013b; Ott et al. 2011; Popoola 2018; Ren and Ji 2017; Shojaee et al. 2013; Zhang et al. 2016]....


Journal ArticleDOI
01 May 2020
TL;DR: This study comprehensively compiles and summarizes the existing public datasets related to fake reviews and proposes an antecedent–consequence–intervention conceptual framework to develop an initial research agenda for investigating fake reviews.
Abstract: Fake online reviews in e-commerce significantly affect online consumers, merchants, and, as a result, market efficiency. Despite scholarly efforts to examine fake reviews, no survey yet systematically analyzes and summarizes their antecedents and consequences. This study proposes an antecedent–consequence–intervention conceptual framework to develop an initial research agenda for investigating fake reviews. Based on a review of the extant literature on this issue, we identify 20 future research questions and suggest 18 propositions. Notably, research on fake reviews is often limited by a lack of high-quality datasets. To alleviate this problem, we comprehensively compile and summarize the existing public datasets related to fake reviews. We conclude by presenting the theoretical and practical implications of the current research.

156 citations

Journal ArticleDOI
TL;DR: The main contribution of this work is the development of a fraud detection system that employs a deep learning architecture together with an advanced feature engineering process based on homogeneity-oriented behavior analysis (HOBA) to efficiently identify fraudulent transactions.

111 citations

Journal ArticleDOI
TL;DR: This paper reviews deep learning approaches that have been applied to various sentiment analysis tasks, traces their development trends, and provides a performance analysis of different deep learning models on a representative dataset for each task.
Abstract: Nowadays, with the growing number of Web 2.0 tools, users generate huge amounts of data in an enormous and dynamic way. In this regard, sentiment analysis has emerged as an important tool for automating the extraction of insights from user-generated data. Recently, deep learning approaches have been proposed for different sentiment analysis tasks and have achieved state-of-the-art results. Therefore, to help researchers quickly grasp the current progress as well as the issues that remain to be addressed, in this paper we review deep learning approaches that have been applied to various sentiment analysis tasks and their trends of development. This study also provides a performance analysis of different deep learning models on a particular dataset at the end of each sentiment analysis task. Toward the end, the review highlights current issues and hypothesized solutions to be taken into account in future work. Moreover, based on knowledge learned from previous studies, the future work subsection offers suggestions that can be incorporated into new deep learning models to yield better performance. Suggestions include the use of bidirectional encoder representations from transformers (BERT), sentiment-specific word embedding models, cognition-based attention models, common sense knowledge, reinforcement learning, and generative adversarial networks.

105 citations

Journal ArticleDOI
TL;DR: Two neural network models are proposed that integrate traditional bag-of-words features with word context and consumer emotions; they perform well on all datasets, irrespective of sentiment polarity and product category.
Abstract: Fake consumer review detection has attracted much interest in recent years owing to the increasing number of Internet purchases. Existing approaches to detect fake consumer reviews use the review content, product and reviewer information and other features to detect fake reviews. However, as shown in recent studies, the semantic meaning of reviews might be particularly important for text classification. In addition, the emotions hidden in the reviews may represent another potential indicator of fake content. To improve the performance of fake review detection, here we propose two neural network models that integrate traditional bag-of-words as well as the word context and consumer emotions. Specifically, the models learn document-level representation by using three sets of features: (1) n-grams, (2) word embeddings and (3) various lexicon-based emotion indicators. Such a high-dimensional feature representation is used to classify fake reviews into four domains. To demonstrate the effectiveness of the presented detection systems, we compare their classification performance with several state-of-the-art methods for fake review detection. The proposed systems perform well on all datasets, irrespective of their sentiment polarity and product category.

86 citations


Cites background, methods, or results from "Neural networks for deceptive opini..."

  • ...In [61], the pre-trained CBOW model was tuned on actual review datasets using CNN to improve detection accuracy....


  • ...To overcome this problem, Ren and Ji [61] developed a gated recurrent NN model combining sentence representations to detect deceptive opinion spam....


  • ...However, as reported by its authors [50], the CBOW model used in [61] is not effective in generating a generalizable context model....


  • ...Inspired by these state-of-the-art models [43, 61], here we use word embeddings to obtain the semantic representation...


  • ...Therefore, deep NN models such as DFFNNs [10], CNNs [43], general regression neural networks [61], generative adversarial...

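The fake-review detection entry above (86 citations) builds its document representation from three feature sets: n-grams, word embeddings, and lexicon-based emotion indicators. The following Python sketch shows one hypothetical way to assemble such a hybrid representation; the toy reviews, lexicon, and embedding table are stand-ins, and the logistic-regression classifier is a placeholder for the paper's neural models.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the three feature sources described in the abstract.
reviews = ["great hotel loved the room", "terrible stay awful service awful"]
labels = [0, 1]                                  # 0 = genuine, 1 = fake (toy)
emotion_lexicon = {"loved": "joy", "awful": "anger", "terrible": "anger"}
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50)             # hypothetical 50-d vectors
              for text in reviews for w in text.split()}

# (1) n-gram counts.
vec = CountVectorizer(ngram_range=(1, 2))
ngrams = vec.fit_transform(reviews).toarray()

# (2) mean word embedding per document.
def doc_embedding(text):
    vs = [embeddings[w] for w in text.split() if w in embeddings]
    return np.mean(vs, axis=0) if vs else np.zeros(50)

# (3) lexicon-based emotion counts per document.
emotions = sorted(set(emotion_lexicon.values()))
def emotion_counts(text):
    c = {e: 0 for e in emotions}
    for w in text.split():
        if w in emotion_lexicon:
            c[emotion_lexicon[w]] += 1
    return np.array([c[e] for e in emotions])

# Concatenate the three feature sets and fit a placeholder classifier.
X = np.hstack([ngrams,
               np.vstack([doc_embedding(t) for t in reviews]),
               np.vstack([emotion_counts(t) for t in reviews])])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
```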

References
Journal ArticleDOI
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations
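The gating mechanism described above is easy to make concrete as a single time step. The numpy sketch below follows the now-standard LSTM formulation with input, forget, and output gates around an additive cell state (the "constant error carousel"); note that the forget gate postdates the 1997 paper, and all dimensions and initializations here are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters for the
    input gate i, forget gate f, output gate o, and candidate cell g."""
    z = W @ x + U @ h_prev + b            # pre-activations, shape [4*hid]
    hid = h_prev.shape[0]
    i = sigmoid(z[0*hid:1*hid])           # input gate: admit new content
    f = sigmoid(z[1*hid:2*hid])           # forget gate: decay old content
    o = sigmoid(z[2*hid:3*hid])           # output gate: expose the cell
    g = np.tanh(z[3*hid:4*hid])           # candidate cell content
    c = f * c_prev + i * g                # additive update: the "carousel"
    h = o * np.tanh(c)                    # hidden state read out of the cell
    return h, c

# Illustrative sizes: 8-d input, 16-d hidden state, 5 time steps.
rng = np.random.default_rng(0)
x_dim, hid = 8, 16
W = rng.normal(scale=0.1, size=(4*hid, x_dim))
U = rng.normal(scale=0.1, size=(4*hid, hid))
b = np.zeros(4*hid)
h, c = np.zeros(hid), np.zeros(hid)
for x in rng.normal(size=(5, x_dim)):
    h, c = lstm_step(x, h, c, W, U, b)
```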

Proceedings ArticleDOI
01 Oct 2014
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.

30,558 citations
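The objective in the abstract can be written down compactly: for each nonzero co-occurrence count X_ij, the model fits w_i · w̃_j + b_i + b̃_j ≈ log X_ij under a weighting f(X_ij) that down-weights rare pairs. Below is a minimal numpy sketch of that weighted least-squares objective with plain SGD; the toy counts and dimensions are illustrative, while f uses the hyperparameters reported in the paper (x_max = 100, α = 3/4).

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 6, 10                              # toy vocabulary size, vector size
X = rng.integers(0, 50, size=(V, V)).astype(float)  # toy co-occurrence counts

W  = rng.normal(scale=0.1, size=(V, d))   # word vectors w_i
Wc = rng.normal(scale=0.1, size=(V, d))   # context vectors w~_j
b  = np.zeros(V)                          # word biases
bc = np.zeros(V)                          # context biases

def f(x, x_max=100.0, alpha=0.75):
    """GloVe weighting: down-weights rare pairs, caps frequent ones."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

lr = 0.05
for _ in range(200):                      # plain SGD over nonzero entries only
    for i, j in zip(*np.nonzero(X)):
        diff = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
        g = f(X[i, j]) * diff             # weighted residual for this pair
        W[i], Wc[j] = W[i] - lr * g * Wc[j], Wc[j] - lr * g * W[i]
        b[i]  -= lr * g
        bc[j] -= lr * g
```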

Proceedings Article
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

24,012 citations
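Two pieces of this abstract reduce to a few lines of code: the negative-sampling objective, which pushes a word vector toward its observed context word and away from k sampled noise words, and the data-driven phrase score used to merge frequent bigrams into single tokens. A minimal numpy sketch with toy vectors and counts:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_sampling_loss(v_w, v_ctx, v_noise):
    """Negative-sampling objective for one (word, context) pair:
    log sigma(v_ctx . v_w) + sum_k log sigma(-v_noise_k . v_w)."""
    pos = np.log(sigmoid(v_ctx @ v_w))
    neg = np.sum(np.log(sigmoid(-(v_noise @ v_w))))
    return -(pos + neg)                    # minimize negative log-likelihood

def phrase_score(count_bigram, count_a, count_b, delta=5.0):
    """Phrase-finding score from the paper: bigrams scoring above a
    threshold are merged into single tokens (e.g. "new_york")."""
    return (count_bigram - delta) / (count_a * count_b)

rng = np.random.default_rng(0)
v_w, v_ctx = rng.normal(size=50), rng.normal(size=50)
v_noise = rng.normal(size=(5, 50))         # k = 5 sampled noise words
print(neg_sampling_loss(v_w, v_ctx, v_noise))
print(phrase_score(count_bigram=80, count_a=1000, count_b=400))
```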

Proceedings ArticleDOI
01 Jan 2014
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Abstract: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

19,998 citations
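The abstract describes a two-RNN structure: one network compresses the source sequence into a fixed-length vector, and the other unrolls from that vector to score a target sequence. Below is a minimal PyTorch sketch of that idea using GRUs; the vocabulary size, layer widths, and teacher-forced training step are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, vocab=1000, emb=64, hid=128):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, emb)
        self.tgt_emb = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, src, tgt_in):
        # Encode the source into a single fixed-length vector (final state).
        _, h = self.encoder(self.src_emb(src))
        # Decode conditioned on that vector, one target token at a time.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)           # logits over the target vocabulary

# Toy usage: maximize P(target | source) with teacher forcing.
model = EncoderDecoder()
src = torch.randint(0, 1000, (4, 7))       # 4 source sequences, length 7
tgt = torch.randint(0, 1000, (4, 9))       # 4 target sequences, length 9
logits = model(src, tgt[:, :-1])           # predict each next target token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tgt[:, 1:].reshape(-1))
loss.backward()
```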

Proceedings ArticleDOI
Yoon Kim
25 Aug 2014
TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification; a simple modification to the architecture allows the use of both task-specific and static vectors.
Abstract: We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.

9,776 citations


"Neural networks for deceptive opini..." refers methods in this paper

  • ...1, a convolutional neural network [22,23,25] is used to learn continuous representations of a sentence, as it does not rely on an external parse tree....

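The multichannel idea in the Kim (2014) abstract above can be sketched directly: two copies of the pretrained word vectors, one frozen ("static") and one fine-tuned, are stacked as input channels to convolutions of several filter widths, followed by max-over-time pooling. A minimal PyTorch sketch; sizes and the toy input are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultichannelCNN(nn.Module):
    def __init__(self, pretrained, n_classes=2, n_filters=100, widths=(3, 4, 5)):
        super().__init__()
        # Static channel: pretrained vectors kept fixed during training.
        self.static = nn.Embedding.from_pretrained(pretrained, freeze=True)
        # Non-static channel: the same vectors, fine-tuned for the task.
        self.tuned = nn.Embedding.from_pretrained(pretrained, freeze=False)
        emb = pretrained.size(1)
        self.convs = nn.ModuleList(
            [nn.Conv2d(2, n_filters, (w, emb)) for w in widths])
        self.out = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):            # tokens: [batch, seq_len]
        x = torch.stack([self.static(tokens), self.tuned(tokens)], dim=1)
        # x: [batch, 2 channels, seq_len, emb]
        pooled = [torch.relu(c(x)).squeeze(3).max(dim=2).values
                  for c in self.convs]    # max-over-time per filter width
        return self.out(torch.cat(pooled, dim=1))

# Toy usage: a 500-word vocabulary with random "pretrained" 50-d vectors.
vectors = torch.randn(500, 50)
model = MultichannelCNN(vectors)
logits = model(torch.randint(0, 500, (8, 30)))   # 8 sentences, 30 tokens
```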