Journal ArticleDOI

Neural networks for deceptive opinion spam detection

01 Apr 2017-Information Sciences (Elsevier)-Vol. 385, pp 213-224
TL;DR: This work empirically explores a neural network model that learns document-level representations for detecting deceptive opinion spam and shows that the proposed method outperforms state-of-the-art methods.
About: This article was published in Information Sciences on 2017-04-01 and has received 181 citations to date. The article focuses on the topics: Sentence & Feature learning.
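Pieced together from the TL;DR above and the citing excerpts below, the model learns sentence representations with a convolutional network and composes them into a document representation with a gated recurrent network. The PyTorch sketch below illustrates that pipeline only; it is not the authors' implementation, and all layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class SentenceCNN(nn.Module):
    """Encode one sentence (a sequence of word embeddings) into a fixed
    vector via 1-D convolution followed by max-over-time pooling."""
    def __init__(self, emb_dim=100, n_filters=100, kernel=3):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel, padding=1)

    def forward(self, x):                      # x: [batch, seq_len, emb_dim]
        h = self.conv(x.transpose(1, 2))       # [batch, n_filters, seq_len]
        return torch.relu(h).max(dim=2).values # [batch, n_filters]

class DocumentModel(nn.Module):
    """Compose sentence vectors with a GRU; classify the final state."""
    def __init__(self, sent_dim=100, hid=128, n_classes=2):
        super().__init__()
        self.gru = nn.GRU(sent_dim, hid, batch_first=True)
        self.out = nn.Linear(hid, n_classes)

    def forward(self, sents):                  # sents: [batch, n_sents, dim]
        _, h_n = self.gru(sents)               # h_n: [1, batch, hid]
        return self.out(h_n.squeeze(0))        # deceptive vs. truthful logits

# Hypothetical usage: 4 documents, 6 sentences each, 20 words per sentence.
words = torch.randn(4 * 6, 20, 100)            # stand-in word embeddings
sent_vecs = SentenceCNN()(words).view(4, 6, -1)
logits = DocumentModel()(sent_vecs)
```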
Citations
Posted Content
02 Dec 2018
TL;DR: This survey comprehensively and systematically reviews fake news research and identifies fundamental theories across various disciplines, e.g., psychology and social science, to facilitate and enhance interdisciplinary research on fake news.
Abstract: The explosive growth of fake news and its erosion of democracy, justice, and public trust have increased the demand for fake news analysis, detection, and intervention. This survey comprehensively and systematically reviews fake news research. It identifies and specifies fundamental theories across various disciplines, e.g., psychology and social science, to facilitate and enhance interdisciplinary research on fake news. Current fake news research is reviewed, summarized, and evaluated. These studies examine fake news from four perspectives: (1) the false knowledge it carries, (2) its writing style, (3) its propagation patterns, and (4) the credibility of its creators and spreaders. We characterize each perspective with the analyzable and utilizable information provided by news and its spreaders, the strategies and frameworks that are adaptable, and the techniques that are applicable. By reviewing the characteristics of fake news and open issues in fake news studies, we highlight potential research tasks at the end of this survey.

212 citations


Cites background from "Neural networks for deceptive opini..."

  • ...This constant evolution in content style demands a real-time representation and/or learning of news content style, where e.g., deep learning can be helpful [Gogate et al. 2017; Li et al. 2017b; Ren and Ji 2017; Wang et al. 2018]....


  • ...Related studies can be seen in, e.g., [Jindal and Liu 2008; Li et al. 2014, 2017b; Mukherjee et al. 2013b; Ott et al. 2011; Popoola 2018; Ren and Ji 2017; Shojaee et al. 2013; Zhang et al. 2016]....


Journal ArticleDOI
01 May 2020
TL;DR: This study comprehensively compiles and summarizes the existing public datasets related to fake reviews and proposes an antecedent–consequence–intervention conceptual framework to develop an initial research agenda for investigating fake reviews.
Abstract: Fake online reviews in e-commerce significantly affect online consumers, merchants, and, as a result, market efficiency. Despite scholarly efforts to examine fake reviews, no survey yet systematically analyzes and summarizes their antecedents and consequences. This study proposes an antecedent–consequence–intervention conceptual framework to develop an initial research agenda for investigating fake reviews. Based on a review of the extant literature on this issue, we identify 20 future research questions and suggest 18 propositions. Notably, research on fake reviews is often limited by a lack of high-quality datasets. To alleviate this problem, we comprehensively compile and summarize the existing public datasets related to fake reviews. We conclude by presenting the theoretical and practical implications of the current research.

156 citations

Journal ArticleDOI
TL;DR: The main contribution of this work is the development of a fraud detection system that employs a deep learning architecture together with an advanced feature engineering process based on homogeneity-oriented behavior analysis (HOBA) to efficiently identify fraudulent transactions.

111 citations

Journal ArticleDOI
TL;DR: This paper reviews deep learning approaches that have been applied to various sentiment analysis tasks, traces their development trends, and provides a performance analysis of different deep learning models on a representative dataset for each task.
Abstract: Nowadays, with the growing number of Web 2.0 tools, users generate huge amounts of data in an enormous and dynamic way. In this regard, sentiment analysis has emerged as an important tool for automating the extraction of insights from user-generated data. Recently, deep learning approaches have been proposed for different sentiment analysis tasks and have achieved state-of-the-art results. Therefore, to help researchers quickly grasp the current progress as well as the issues that remain to be addressed, in this paper we review deep learning approaches that have been applied to various sentiment analysis tasks and their trends of development. This study also provides a performance analysis of different deep learning models on a particular dataset at the end of each sentiment analysis task. Toward the end, the review highlights current issues and hypothesized solutions to be taken into account in future work. Moreover, based on knowledge learned from previous studies, the future work subsection offers suggestions that can be incorporated into new deep learning models to yield better performance. Suggestions include the use of bidirectional encoder representations from transformers (BERT), sentiment-specific word embedding models, cognition-based attention models, common sense knowledge, reinforcement learning, and generative adversarial networks.

105 citations

Journal ArticleDOI
TL;DR: Two neural network models are proposed that integrate traditional bag-of-words features with word context and consumer emotions; they perform well on all datasets, irrespective of sentiment polarity and product category.
Abstract: Fake consumer review detection has attracted much interest in recent years owing to the increasing number of Internet purchases. Existing approaches to detect fake consumer reviews use the review content, product and reviewer information and other features to detect fake reviews. However, as shown in recent studies, the semantic meaning of reviews might be particularly important for text classification. In addition, the emotions hidden in the reviews may represent another potential indicator of fake content. To improve the performance of fake review detection, here we propose two neural network models that integrate traditional bag-of-words as well as the word context and consumer emotions. Specifically, the models learn document-level representation by using three sets of features: (1) n-grams, (2) word embeddings and (3) various lexicon-based emotion indicators. Such a high-dimensional feature representation is used to classify fake reviews into four domains. To demonstrate the effectiveness of the presented detection systems, we compare their classification performance with several state-of-the-art methods for fake review detection. The proposed systems perform well on all datasets, irrespective of their sentiment polarity and product category.

86 citations


Cites background, methods, or results from "Neural networks for deceptive opini..."

  • ...In [61], the pre-trained CBOW model was tuned on actual review datasets using CNN to improve detection accuracy....


  • ...To overcome this problem, Ren and Ji [61] developed a gated recurrent NN model combining sentence representations to detect deceptive opinion spam....


  • ...However, as reported by its authors [50], the CBOW model used in [61] is not effective in generating a generalizable context model....


  • ...Inspired by these state-of-the-art models [43, 61], here we use word embeddings to obtain the semantic representation...


  • ...Therefore, deep NN models such as DFFNNs [10], CNNs [43], general regression neural networks [61], generative adversarial...

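The fake-review detection entry above (86 citations) builds its document representation from three feature sets: n-grams, word embeddings, and lexicon-based emotion indicators. The following Python sketch shows one hypothetical way to assemble such a hybrid representation; the toy reviews, lexicon, and embedding table are stand-ins, and the logistic-regression classifier is a placeholder for the paper's neural models.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for the three feature sources described in the abstract.
reviews = ["great hotel loved the room", "terrible stay awful service awful"]
labels = [0, 1]                                  # 0 = genuine, 1 = fake (toy)
emotion_lexicon = {"loved": "joy", "awful": "anger", "terrible": "anger"}
rng = np.random.default_rng(0)
embeddings = {w: rng.normal(size=50)             # hypothetical 50-d vectors
              for text in reviews for w in text.split()}

# (1) n-gram counts.
vec = CountVectorizer(ngram_range=(1, 2))
ngrams = vec.fit_transform(reviews).toarray()

# (2) mean word embedding per document.
def doc_embedding(text):
    vs = [embeddings[w] for w in text.split() if w in embeddings]
    return np.mean(vs, axis=0) if vs else np.zeros(50)

# (3) lexicon-based emotion counts per document.
emotions = sorted(set(emotion_lexicon.values()))
def emotion_counts(text):
    c = {e: 0 for e in emotions}
    for w in text.split():
        if w in emotion_lexicon:
            c[emotion_lexicon[w]] += 1
    return np.array([c[e] for e in emotions])

# Concatenate the three feature sets and fit a placeholder classifier.
X = np.hstack([ngrams,
               np.vstack([doc_embedding(t) for t in reviews]),
               np.vstack([emotion_counts(t) for t in reviews])])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
```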

References
Journal ArticleDOI
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Abstract: Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, backpropagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

72,897 citations
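The gating mechanism described above is easy to make concrete as a single time step. The numpy sketch below follows the now-standard LSTM formulation with input, forget, and output gates around an additive cell state (the "constant error carousel"); note that the forget gate postdates the 1997 paper, and all dimensions and initializations here are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters for the
    input gate i, forget gate f, output gate o, and candidate cell g."""
    z = W @ x + U @ h_prev + b            # pre-activations, shape [4*hid]
    hid = h_prev.shape[0]
    i = sigmoid(z[0*hid:1*hid])           # input gate: admit new content
    f = sigmoid(z[1*hid:2*hid])           # forget gate: decay old content
    o = sigmoid(z[2*hid:3*hid])           # output gate: expose the cell
    g = np.tanh(z[3*hid:4*hid])           # candidate cell content
    c = f * c_prev + i * g                # additive update: the "carousel"
    h = o * np.tanh(c)                    # hidden state read out of the cell
    return h, c

# Illustrative sizes: 8-d input, 16-d hidden state, 5 time steps.
rng = np.random.default_rng(0)
x_dim, hid = 8, 16
W = rng.normal(scale=0.1, size=(4*hid, x_dim))
U = rng.normal(scale=0.1, size=(4*hid, hid))
b = np.zeros(4*hid)
h, c = np.zeros(hid), np.zeros(hid)
for x in rng.normal(size=(5, x_dim)):
    h, c = lstm_step(x, h, c, W, U, b)
```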

Proceedings ArticleDOI
01 Oct 2014
TL;DR: A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Abstract: Recent methods for learning vector space representations of words have succeeded in capturing fine-grained semantic and syntactic regularities using vector arithmetic, but the origin of these regularities has remained opaque. We analyze and make explicit the model properties needed for such regularities to emerge in word vectors. The result is a new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods. Our model efficiently leverages statistical information by training only on the nonzero elements in a word-word co-occurrence matrix, rather than on the entire sparse matrix or on individual context windows in a large corpus. The model produces a vector space with meaningful substructure, as evidenced by its performance of 75% on a recent word analogy task. It also outperforms related models on similarity tasks and named entity recognition.

30,558 citations
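The objective in the abstract can be written down compactly: for each nonzero co-occurrence count X_ij, the model fits w_i · w̃_j + b_i + b̃_j ≈ log X_ij under a weighting f(X_ij) that down-weights rare pairs. Below is a minimal numpy sketch of that weighted least-squares objective with plain SGD; the toy counts and dimensions are illustrative, while f uses the hyperparameters reported in the paper (x_max = 100, α = 3/4).

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 6, 10                              # toy vocabulary size, vector size
X = rng.integers(0, 50, size=(V, V)).astype(float)  # toy co-occurrence counts

W  = rng.normal(scale=0.1, size=(V, d))   # word vectors w_i
Wc = rng.normal(scale=0.1, size=(V, d))   # context vectors w~_j
b  = np.zeros(V)                          # word biases
bc = np.zeros(V)                          # context biases

def f(x, x_max=100.0, alpha=0.75):
    """GloVe weighting: down-weights rare pairs, caps frequent ones."""
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

lr = 0.05
for _ in range(200):                      # plain SGD over nonzero entries only
    for i, j in zip(*np.nonzero(X)):
        diff = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
        g = f(X[i, j]) * diff             # weighted residual for this pair
        W[i], Wc[j] = W[i] - lr * g * Wc[j], Wc[j] - lr * g * W[i]
        b[i]  -= lr * g
        bc[j] -= lr * g
```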

Proceedings Article
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

24,012 citations
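Two pieces of this abstract reduce to a few lines of code: the negative-sampling objective, which pushes a word vector toward its observed context word and away from k sampled noise words, and the data-driven phrase score used to merge frequent bigrams into single tokens. A minimal numpy sketch with toy vectors and counts:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neg_sampling_loss(v_w, v_ctx, v_noise):
    """Negative-sampling objective for one (word, context) pair:
    log sigma(v_ctx . v_w) + sum_k log sigma(-v_noise_k . v_w)."""
    pos = np.log(sigmoid(v_ctx @ v_w))
    neg = np.sum(np.log(sigmoid(-(v_noise @ v_w))))
    return -(pos + neg)                    # minimize negative log-likelihood

def phrase_score(count_bigram, count_a, count_b, delta=5.0):
    """Phrase-finding score from the paper: bigrams scoring above a
    threshold are merged into single tokens (e.g. "new_york")."""
    return (count_bigram - delta) / (count_a * count_b)

rng = np.random.default_rng(0)
v_w, v_ctx = rng.normal(size=50), rng.normal(size=50)
v_noise = rng.normal(size=(5, 50))         # k = 5 sampled noise words
print(neg_sampling_loss(v_w, v_ctx, v_noise))
print(phrase_score(count_bigram=80, count_a=1000, count_b=400))
```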

Proceedings ArticleDOI
01 Jan 2014
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Abstract: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

19,998 citations
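The abstract describes a two-RNN structure: one network compresses the source sequence into a fixed-length vector, and the other unrolls from that vector to score a target sequence. Below is a minimal PyTorch sketch of that idea using GRUs; the vocabulary size, layer widths, and teacher-forced training step are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    def __init__(self, vocab=1000, emb=64, hid=128):
        super().__init__()
        self.src_emb = nn.Embedding(vocab, emb)
        self.tgt_emb = nn.Embedding(vocab, emb)
        self.encoder = nn.GRU(emb, hid, batch_first=True)
        self.decoder = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab)

    def forward(self, src, tgt_in):
        # Encode the source into a single fixed-length vector (final state).
        _, h = self.encoder(self.src_emb(src))
        # Decode conditioned on that vector, one target token at a time.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), h)
        return self.out(dec_out)           # logits over the target vocabulary

# Toy usage: maximize P(target | source) with teacher forcing.
model = EncoderDecoder()
src = torch.randint(0, 1000, (4, 7))       # 4 source sequences, length 7
tgt = torch.randint(0, 1000, (4, 9))       # 4 target sequences, length 9
logits = model(src, tgt[:, :-1])           # predict each next target token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, 1000), tgt[:, 1:].reshape(-1))
loss.backward()
```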

Proceedings ArticleDOI
Yoon Kim
25 Aug 2014
TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification; a simple modification to the architecture allows the use of both task-specific and static vectors.
Abstract: We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.

9,776 citations


"Neural networks for deceptive opini..." refers methods in this paper

  • ...1, a convolutional neural network [22,23,25] is used to learn continuous representations of a sentence, as it does not rely on an external parse tree....

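The multichannel idea in the Kim (2014) abstract above can be sketched directly: two copies of the pretrained word vectors, one frozen ("static") and one fine-tuned, are stacked as input channels to convolutions of several filter widths, followed by max-over-time pooling. A minimal PyTorch sketch; sizes and the toy input are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultichannelCNN(nn.Module):
    def __init__(self, pretrained, n_classes=2, n_filters=100, widths=(3, 4, 5)):
        super().__init__()
        # Static channel: pretrained vectors kept fixed during training.
        self.static = nn.Embedding.from_pretrained(pretrained, freeze=True)
        # Non-static channel: the same vectors, fine-tuned for the task.
        self.tuned = nn.Embedding.from_pretrained(pretrained, freeze=False)
        emb = pretrained.size(1)
        self.convs = nn.ModuleList(
            [nn.Conv2d(2, n_filters, (w, emb)) for w in widths])
        self.out = nn.Linear(n_filters * len(widths), n_classes)

    def forward(self, tokens):            # tokens: [batch, seq_len]
        x = torch.stack([self.static(tokens), self.tuned(tokens)], dim=1)
        # x: [batch, 2 channels, seq_len, emb]
        pooled = [torch.relu(c(x)).squeeze(3).max(dim=2).values
                  for c in self.convs]    # max-over-time per filter width
        return self.out(torch.cat(pooled, dim=1))

# Toy usage: a 500-word vocabulary with random "pretrained" 50-d vectors.
vectors = torch.randn(500, 50)
model = MultichannelCNN(vectors)
logits = model(torch.randint(0, 500, (8, 30)))   # 8 sentences, 30 tokens
```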