A Hybrid Model for Paraphrase Detection Combines pros of Text Similarity with Deep Learning
TLDR
This paper proposes a hybrid model that combines the text similarity approach with the deep learning approach to improve paraphrase detection, and verifies the results on the Microsoft Research Paraphrase Corpus dataset.
Abstract:
Paraphrase detection (PD) is an essential task in natural language processing. The goal of paraphrase detection is to check whether two statements written in natural language have the same meaning. It matters in many fields, such as plagiarism detection, question answering, document clustering, and information retrieval. This paper proposes a hybrid model that combines the text similarity approach with the deep learning approach to improve paraphrase detection. Evaluated on the Microsoft Research Paraphrase Corpus (MSRP) dataset, the model achieves an accuracy of about 76.6% and an F-measure of about 83.5%.
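The accuracy and F-measure quoted in the abstract are standard binary classification metrics. The following is a minimal sketch of how they are typically computed, assuming label 1 means "paraphrase", label 0 means "not a paraphrase", and the F-measure is the balanced F1 score. The function name `accuracy_and_f1` and the toy labels are illustrative only and do not come from the paper.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, where 1 is the positive class."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    accuracy = (tp + tn) / len(gold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, f1


# Toy example (hypothetical labels, not data from the paper).
gold_labels = [1, 1, 0, 0, 1]
pred_labels = [1, 0, 0, 1, 1]
acc, f1 = accuracy_and_f1(gold_labels, pred_labels)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

An F-measure higher than the accuracy, as in the figures quoted above, is common on datasets where the positive (paraphrase) class is the majority, because F1 ignores true negatives.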
Citations
Journal Article
Paraphrase identification using collaborative adversarial networks
Journal Article
Corpus-Based Paraphrase Detection Experiments and Review
Tedo Vrbanec, Ana Meštrović
TL;DR: A performance overview of various types of corpus-based models, especially deep learning (DL) models, on the task of paraphrase detection shows that DL models are very competitive with traditional state-of-the-art approaches and have potential that should be further developed. Paraphrase detection is important for a number of applications, including plagiarism detection, authorship attribution, question answering, text summarization, and text mining in general.
Journal Article
Applying BERT for Early-Stage Recognition of Persistence in Chat-Based Social Engineering Attacks
TL;DR: In this paper, a natural language processing model called CSE-PersistenceBERT is proposed for paraphrase detection, to recognize persistence as a social engineering attacker's behavior during a chat-based dialogue.
Book Chapter
Evaluation of Similarity Measures in a Benchmark for Spanish Paraphrasing Detection
Helena Gómez-Adorno, Gemma Bel-Enguix, Gerardo Sierra, Juan-Manuel Torres-Moreno, Renata Martinez, Pedro Serrano
TL;DR: A similarity-based approach towards paraphrase detection in Spanish is presented and a threshold is obtained for each of the similarity metrics with the aim of determining a classification boundary to decide if two sentences are paraphrased.
References
Journal Article
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article
Skip-thought vectors
Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler
TL;DR: This article uses the continuity of text from books to train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage, producing highly generic sentence representations that are robust and perform well in practice.
Proceedings Article
Corpus-based and knowledge-based measures of text semantic similarity
TL;DR: This paper shows that the semantic similarity method out-performs methods based on simple lexical matching, resulting in up to 13% error rate reduction with respect to the traditional vector-based similarity metric.
Proceedings Article
Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
TL;DR: This work introduces a method for paraphrase detection based on recursive autoencoders (RAEs). The unsupervised RAEs are based on a novel unfolding objective and learn feature vectors for phrases in syntactic trees, which are used to measure word- and phrase-wise similarity between two sentences.
Journal Article
A Survey of Text Similarity Approaches
Wael Hassan Gomaa, Aly A. Fahmy
TL;DR: This survey discusses existing work on text similarity by partitioning it into three approaches: String-based, Corpus-based, and Knowledge-based similarities. Samples of combinations of these similarities are also presented.