Machine Learning Models for Paraphrase Identification and its Applications on Plagiarism Detection

doi:10.1109/ICBK.2019.00021

Proceedings Article

- pp 97-104

TLDR

Among the compared models, as expected, the Recurrent Neural Network is best suited for the paraphrase identification task, and plagiarism detection is proposed as one area where Paraphrase Identification can be applied effectively.

Abstract:

Paraphrase Identification or Natural Language Sentence Matching (NLSM) is an important and challenging task in Natural Language Processing: given a pair of sentences, decide whether one is a paraphrase of the other. A paraphrase of a sentence conveys the same meaning, but its structure and word order differ. The task is challenging because a short sentence gives little context to infer from, and defining similarity metrics over the inferred context of a sentence pair is not straightforward either. Its applications, however, are numerous. This work explores various machine learning algorithms for modeling the task and applies different input encoding schemes. Specifically, we built models using Logistic Regression, Support Vector Machines, and different Neural Network architectures. Among the compared models, as expected, the Recurrent Neural Network (RNN) is best suited for our paraphrase identification task. We also propose plagiarism detection as one area where Paraphrase Identification can be applied effectively.
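To make the task framing concrete, the following is a minimal sketch of paraphrase identification as binary classification over a sentence pair, using logistic regression (one of the model families named in the abstract). Everything specific here is an assumption for illustration: the two lexical-overlap features (Jaccard and bag-of-words cosine), the hand-written toy pairs in TOY_PAIRS, and the helper names are hypothetical and are not the input encodings, data, or models used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical toy data (sentence 1, sentence 2, label); 1 = paraphrase.
# These pairs are invented for illustration and are not from the paper.
TOY_PAIRS = [
    ("The cat sat on the mat", "On the mat sat the cat", 1),
    ("The company reported strong profits this quarter",
     "This quarter the company reported strong earnings", 1),
    ("The cat sat on the mat", "Stock prices fell sharply today", 0),
    ("He bought a new car yesterday", "The weather is lovely in spring", 0),
    ("The company reported strong profits this quarter",
     "The cat chased the mouse", 0),
]


def tokens(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def pair_features(s1, s2):
    """Return [Jaccard overlap, bag-of-words cosine] for a sentence pair."""
    t1, t2 = tokens(s1), tokens(s2)
    set1, set2 = set(t1), set(t2)
    union = set1 | set2
    jaccard = len(set1 & set2) / len(union) if union else 0.0
    c1, c2 = Counter(t1), Counter(t2)
    dot = sum(c1[w] * c2[w] for w in c1)
    norm = (math.sqrt(sum(v * v for v in c1.values()))
            * math.sqrt(sum(v * v for v in c2.values())))
    cosine = dot / norm if norm else 0.0
    return [jaccard, cosine]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias with batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, s1, s2):
    """Estimated probability that s2 is a paraphrase of s1."""
    x = pair_features(s1, s2)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


if __name__ == "__main__":
    X = [pair_features(s1, s2) for s1, s2, _ in TOY_PAIRS]
    y = [label for _, _, label in TOY_PAIRS]
    w, b = train_logistic_regression(X, y)
    print(predict_proba(w, b, "A dog barked at the mailman",
                        "At the mailman a dog barked"))
```

Word-overlap features like these only capture surface similarity, which is exactly the limitation the abstract points to: two sentences can share a meaning with few shared words, or share many words with different meanings. That gap is one plausible reason sequence models such as RNNs are reported to do better on this task.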

Citations

Journal Article

Deep Physical Informed Neural Networks for Metamaterial Design

Zhiwei Fang, +1 more

- 01 Jan 2020 -

IEEE Access

TL;DR: A physical informed neural network approach for designing electromagnetic metamaterials is proposed, along with a method to solve the high-frequency Helmholtz equation, which is widely used in physics and engineering.

Journal Article

BLSTM-API: Bi-LSTM Recurrent Neural Network-Based Approach for Arabic Paraphrase Identification

Adnen Mahmoud, +2 more

- 24 Feb 2021 -

Arabian Journal for Science and Engineer...

TL;DR: In this paper, an Arabic extrinsic paraphrase identification method is proposed based on a Siamese recurrent neural network architecture, which is useful for identifying semantic similarity between the obtained source and suspect vectors.

Journal Article

An Evolutionary Approach to Compact DAG Neural Network Optimization

Carter Chiu, +1 more

- 20 Nov 2019 -

IEEE Access

TL;DR: This work proposes the use of compact directed acyclic graph neural networks (DAG-NNs) and an evolutionary approach for automating the optimization of their structure and parameters and demonstrates that this approach consistently outperforms conventional neural networks, even while employing fewer nodes.

Proceedings Article

A Study of Ensemble Methods for Cyber Security

Nicholas Lower, +1 more

TL;DR: This study examines the advantages of ensemble methods in the cybersecurity domain using the widely used NSL-KDD intrusion detection dataset. The algorithms tested are the Voting classifier, boosting, the Random Forest classifier, and the AdaBoost classifier.

Proceedings Article

Segregating Hazardous Waste Using Deep Neural Networks in Real-Time Video

Dorothy Hua, +5 more

TL;DR: Through the use of machine learning, the model is able to identify hazardous objects and recyclable items within a pile of trash to help protect all individuals.
