Topic

Plagiarism detection

About: Plagiarism detection is a research topic. Over its lifetime, 1,790 publications have been published on this topic, receiving 24,740 citations.


Papers
01 Jan 2010
TL;DR: This paper reports an approach to detecting external plagiarism that identifies non-English documents, translates them into English using an online translator tool, and retrieves the top documents that are similar to the suspicious documents.
Abstract: In this paper, we report our approach to detecting external plagiarism. In the pre-processing stage, we identify non-English documents and translate them into English using an online translator tool. We then index and retrieve the top documents that are similar to the suspicious documents. We divide the retrieved documents into passages of twenty sentences each. Plagiarism is detected by counting the overlapping words between suspicious and source passages.

8 citations
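
The passage-level word-overlap step described in the abstract above could look roughly like the following. This is a minimal sketch, not the authors' implementation: the sentence splitter, the tokenizer, the `min_overlap` threshold, and all function names are assumptions made for illustration, and the translation and retrieval stages are left out.

```python
import re

def split_passages(text, sentences_per_passage=20):
    """Split text into passages holding a fixed number of sentences."""
    # Assumption: a naive splitter that breaks after '.', '!' or '?' plus whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + sentences_per_passage])
        for i in range(0, len(sentences), sentences_per_passage)
    ]

def word_set(passage):
    """Lowercased set of alphanumeric tokens in a passage."""
    return set(re.findall(r"[a-z0-9]+", passage.lower()))

def overlap_count(suspicious, source):
    """Number of distinct words shared by two passages."""
    return len(word_set(suspicious) & word_set(source))

def flag_passages(suspicious_text, source_text, min_overlap=5,
                  sentences_per_passage=20):
    """Return (suspicious_index, source_index, overlap) for passage pairs
    whose shared-word count reaches the hypothetical min_overlap threshold."""
    hits = []
    source_passages = split_passages(source_text, sentences_per_passage)
    for i, susp in enumerate(split_passages(suspicious_text, sentences_per_passage)):
        for j, src in enumerate(source_passages):
            n = overlap_count(susp, src)
            if n >= min_overlap:
                hits.append((i, j, n))
    return hits
```

In practice the threshold would have to be tuned, since common function words inflate raw overlap counts; the abstract does not say how the authors handled this.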

01 Jan 2012
TL;DR: This work examines a simplified approach to unsupervised authorship and plagiarism detection based on a binary bag-of-words representation, which achieved an overall average accuracy of 84% and a 2nd-place rank in the competition.
Abstract: Identifying shifts and variations in writing style is a fundamental capability when addressing authorship-related tasks. In this work we examine a simplified approach to unsupervised authorship and plagiarism detection based on a binary bag-of-words representation. We evaluate our approach using the PAN-2012 Authorship Attribution challenge data, which includes both open/closed-class authorship identification and intrinsic plagiarism tasks. Our approach proved successful, achieving an overall average accuracy of 84% and a 2nd-place rank in the competition.

8 citations
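
A binary bag-of-words representation, as named in the abstract above, records only whether each vocabulary word occurs in a text chunk, not how often. The sketch below illustrates the representation and one possible way to compare two chunks. The tokenizer, the function names, and the choice of a Jaccard-style similarity are assumptions for illustration; the abstract does not specify how the authors compared vectors.

```python
import re

def tokenize(text):
    """Lowercased word tokens (assumption: letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(chunks):
    """Sorted list of every distinct word seen across all chunks."""
    return sorted({word for chunk in chunks for word in tokenize(chunk)})

def binary_bow(chunk, vocabulary):
    """1 if the vocabulary word occurs in the chunk, else 0 (counts are ignored)."""
    present = set(tokenize(chunk))
    return [1 if word in present else 0 for word in vocabulary]

def binary_similarity(u, v):
    """Jaccard-style similarity: shared words divided by words in either vector."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0
```

For style-shift detection, one could compare each chunk of a document against the others and treat chunks with unusually low similarity as candidates for a different author.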

Proceedings Article
01 Jan 2017
TL;DR: This method for language-independent source code similarity detection is based on the idea of maximum reusability of standard Unix filters and achieved significantly better results than competitors that are considered the gold standard in plagiarism detection.
Abstract: The paper describes a method for language-independent source code similarity detection. It is based on the idea of maximum reusability of standard Unix filters. The method was implemented and benchmarked on real-world datasets (students' assignments) and on synthetic datasets (a perfect plagiarism experiment). It achieved significantly better results than competitors that are considered the gold standard in plagiarism detection.

8 citations

Proceedings Article
17 Jan 2013
TL;DR: This paper proposes a novel software similarity analysis tool based on runtime semantic information. It focuses on the AMC (Abstract Memory Context), which can represent the intrinsic features of the analyzed code, and introduces two advanced techniques for AMC comparison: region filtering and sequence alignment.
Abstract: Analyzing software similarity has emerged as a key ingredient for various applications such as software maintenance, bug finding, malware clustering and copyright protection. In this paper, we propose a novel software similarity analysis tool for plagiarism detection. The tool, which we refer to as RAMC (Runtime Abstract Memory Context based tool), has the following three characteristics. First, it is based on runtime semantic information, which makes it feasible to investigate similarity in binary code without access to source code. Second, among runtime semantic information, it focuses on the AMC (Abstract Memory Context), which can represent the intrinsic features of the analyzed code. Finally, it introduces two advanced techniques for AMC comparison, namely region filtering and sequence alignment. Experiments with a real implementation have shown that RAMC can appropriately identify similarity between original and plagiarized binaries.

8 citations
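
The abstract above names sequence alignment as one of the comparison techniques but does not say which alignment algorithm RAMC uses. As a generic illustration only, the sketch below computes a global alignment score in the style of Needleman-Wunsch over two sequences of hashable items. The scoring parameters and the idea of encoding runtime events as short tokens are assumptions, not details from the paper.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of sequences a and b (Needleman-Wunsch style).

    Uses two rows of the dynamic-programming table, so memory is O(len(b)).
    The match, mismatch and gap scores are illustrative defaults.
    """
    # Row 0: aligning an empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned with a gap
            left = curr[j - 1] + gap  # b[j-1] aligned with a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

A higher score means the two sequences can be lined up with more matches and fewer gaps; for example, `alignment_score(["r", "w", "r"], ["r", "r"])` aligns the two `"r"` items around a single gap.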

Proceedings Article
Sun Weisong, Xingya Wang, Haoran Wu, Duan Ding, Sun Zesong, Zhenyu Chen
27 May 2019
TL;DR: The experimental results show that MAF can effectively improve the performance of similarity measures for test code plagiarism detection, and it can also be extended for use in test recommendation, test reuse and other engineering applications.
Abstract: Software engineering education has become popular due to the rapid development of the software industry. In order to reduce learning costs and improve learning efficiency, some online practice platforms have emerged. This paper proposes a novel test code plagiarism detection technology, namely MAF, which introduces bidirectional static slicing to anchor methods under test and extract fragments of test code. Combined with similarity measures, MAF can achieve effective plagiarism detection by avoiding large amounts of unrelated, noisy test code. The experiment is conducted on the dataset of Mooctest, which has supported hundreds of test activities around the world over the past 3 years. The experimental results show that MAF can effectively improve the performance (precision, recall and F1-measure) of similarity measures for test code plagiarism detection. We believe that MAF can further expand and promote software testing education, and that it can also be extended for use in test recommendation, test reuse and other engineering applications.

8 citations
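
The three evaluation metrics named in the abstract above (precision, recall and F1-measure) have standard definitions, sketched below for a set of flagged submission pairs checked against a set of known plagiarized pairs. The function name and the pair encoding are assumptions for illustration; the abstract does not describe how the authors computed their scores.

```python
def precision_recall_f1(predicted, actual):
    """Precision, recall and F1 for predicted items against ground-truth items.

    Both arguments are iterables of hashable items, for example
    (submission_a, submission_b) pairs flagged as plagiarism.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Precision: fraction of flagged pairs that are truly plagiarized.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of truly plagiarized pairs that were flagged.
    recall = true_positives / len(actual) if actual else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Because F1 is a harmonic mean, a detector that flags everything (high recall, low precision) scores poorly, which is why it is commonly reported alongside the other two metrics.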


Network Information
Related Topics (5)
Active learning: 42.3K papers, 1.1M citations, 78% related
The Internet: 213.2K papers, 3.8M citations, 77% related
Software development: 73.8K papers, 1.4M citations, 77% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 76% related
Deep learning: 79.8K papers, 2.1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    126
2021    83
2020    118
2019    130
2018    125