Plagiarism detection

About: Plagiarism detection is a research topic. Over its lifetime, 1,790 publications have been published within this topic, receiving 24,740 citations.

Papers published on a yearly basis (chart omitted; the year axis runs from 1981 to 2023)

Papers

Plagiarism detection and document chunking methods

Máté Pataki

01 May 2003

TL;DR: Tests of the chunking methods used for plagiarism detection make it possible to choose the best-fitting chunking method for a given application.

Abstract: This paper describes tests of the chunking methods used for plagiarism detection. The results make it possible to choose the best-fitting chunking method for a given application. For example, overlapping word chunking is good for a grammar analyzer or for small databases; sentence chunking is best suited to finding quoted text; hashed breakpoint chunking is the fastest method and is therefore advisable for searching a large set of documents; and if more reliability is needed, overlapping hashed breakpoint chunking can be used as well.

9 citations
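
The chunking strategies named in the abstract above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the chunk size `n`, the `modulus` parameter, and the use of MD5 as the word hash are all hypothetical choices made for this example. Overlapping hashed breakpoint chunking is not shown, because the abstract does not describe how it works.

```python
import hashlib
import re


def words(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def overlapping_word_chunks(text, n=3):
    """Every run of n consecutive words (a sliding window with step 1)."""
    w = words(text)
    return [" ".join(w[i:i + n]) for i in range(len(w) - n + 1)]


def sentence_chunks(text):
    """One chunk per sentence, split naively on '.', '!' and '?'."""
    parts = re.split(r"[.!?]+", text)
    return [" ".join(words(p)) for p in parts if words(p)]


def word_hash(word):
    """Stable integer hash of a word (MD5 is an assumption, chosen for repeatability)."""
    return int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16)


def hashed_breakpoint_chunks(text, modulus=4):
    """Close the current chunk after each word whose hash is divisible by modulus."""
    chunks, current = [], []
    for w in words(text):
        current.append(w)
        if word_hash(w) % modulus == 0:
            chunks.append(" ".join(current))
            current = []
    if current:  # trailing words after the last breakpoint
        chunks.append(" ".join(current))
    return chunks
```

In this sketch the overlapping variant produces roughly one chunk per word, while the breakpoint variant produces on average about one chunk per `modulus` words, which is consistent with the abstract's remark that breakpoint chunking suits large document sets.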

Journal Article

Plagiarism Detection Algorithm for Source Code in Computer Science Education

Xin Liu¹, Chan Xu¹, Boyu Ouyang¹

Xiangtan University¹

01 Oct 2015, International Journal of Distance Education Technologies

TL;DR: The authors designed a method for detecting source code plagiarism that targets the ways students commonly plagiarize, built on an improved Longest Common Subsequence algorithm for text matching that uses the statement as the unit of matching.

Abstract: Computer programming is increasingly necessary in college program design courses. However, some students submit homework that is plagiarized with only minor modifications, and it is not easy for teachers to judge whether source code has been plagiarized. Traditional detection algorithms do not fit this situation. The authors designed an effective and complete method to detect source code plagiarism according to the ways students commonly plagiarize. The algorithm rests on two basic concepts. The first is to standardize the source code by filtering, which removes most of the noise that plagiarists intentionally blend in. The second is an improved Longest Common Subsequence algorithm for text matching, using the statement as the unit of matching. The authors also designed a hash function to increase the efficiency of matching. Based on the algorithm, a system was designed that proved practical, runs well, and meets practical requirements in application.

9 citations
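
The two concepts in the abstract above (filtering the source, then matching statements with a Longest Common Subsequence) can be sketched as follows. This is a textbook LCS under stated assumptions, not the authors' improved algorithm or their hash function: the names `normalize`, `lcs_length` and `similarity`, the C-like comment syntax, and the use of Python's built-in `hash` are all hypothetical choices for illustration.

```python
import re


def normalize(source):
    """Strip comments and whitespace, then split into statements.

    Assumes C-like syntax. String literals that contain comment markers,
    semicolons or braces are not handled in this sketch.
    """
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", "", source)                     # line comments
    statements = []
    for raw in re.split(r"[;{}]", source):
        stmt = re.sub(r"\s+", "", raw)  # layout changes should not matter
        if stmt:
            statements.append(stmt)
    return statements


def lcs_length(a, b):
    """Classic dynamic-programming LCS length over two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(src_a, src_b):
    """Share of statements the two sources have in common, in [0, 1]."""
    # Hash each statement once so the LCS compares integers, not strings.
    # Hash collisions are possible in principle and ignored here.
    a = [hash(s) for s in normalize(src_a)]
    b = [hash(s) for s in normalize(src_b)]
    if not a or not b:
        return 0.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))
```

Because whole statements are the matching unit, added comments, blank lines and spacing changes do not lower the score in this sketch, whereas reordered or rewritten statements do.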

Proceedings Article

SPPlagiarise: A Tool for Generating Simulated Semantics-Preserving Plagiarism of Java Source Code

Hayden Cheers¹, Yuqing Lin¹, Shamus P. Smith¹

University of Newcastle¹

01 Oct 2019

TL;DR: The paper presents SPPlagiarise, a tool designed to produce simulated plagiarism of Java source code, together with an evaluation of a generated plagiarism data set.

Abstract: Source code plagiarism is a common occurrence in undergraduate computer science education. Studies have indicated that at least 50% of students plagiarize during their undergraduate career. To identify cases of source code plagiarism, many source code plagiarism detection tools have been proposed. However, conclusively determining the effectiveness of these tools at identifying cases of source code plagiarism is difficult. Evaluations are typically performed using unreleased data sets. Without a comprehensive, publicly available data set for evaluating source code plagiarism detection, it is difficult to perform unbiased and reproducible evaluations of tools. To address this problem, this paper presents a tool, SPPlagiarise, which is designed to produce simulated source code plagiarism of Java source code. SPPlagiarise applies a random number of semantics-preserving source code obfuscations at random locations to a Java code base to simulate source code plagiarism. The paper presents the design of the tool and an evaluation of a generated plagiarism data set.

9 citations
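
The idea of applying a random number of obfuscations at random locations can be illustrated with a toy sketch. It is not SPPlagiarise and does not reproduce its transformations, which the abstract does not list: the two transforms here (inserting `//` comment lines and blank lines) are hypothetical stand-ins, chosen because they leave Java semantics unchanged in ordinary code. One known gap in the sketch is that a line inserted inside a multi-line text block would alter that string.

```python
import random


def insert_comments(lines, rng, max_inserts=3):
    """Insert 0..max_inserts '//' comment lines at random positions."""
    out = list(lines)
    for i in range(rng.randint(0, max_inserts)):
        out.insert(rng.randint(0, len(out)), f"// note {i}")
    return out


def insert_blank_lines(lines, rng, max_inserts=3):
    """Insert 0..max_inserts empty lines at random positions."""
    out = list(lines)
    for _ in range(rng.randint(0, max_inserts)):
        out.insert(rng.randint(0, len(out)), "")
    return out


# Hypothetical transform set; a real tool would use richer obfuscations.
TRANSFORMS = [insert_comments, insert_blank_lines]


def simulate_plagiarism(source, seed=None):
    """Apply a random number of randomly chosen transforms to the source."""
    rng = random.Random(seed)
    lines = source.splitlines()
    for _ in range(rng.randint(1, 4)):
        lines = rng.choice(TRANSFORMS)(lines, rng)
    return "\n".join(lines)
```

Passing a fixed `seed` makes a generated variant reproducible, which matters for the kind of repeatable evaluation the abstract calls for.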

Book Chapter

Comparing Images for Document Plagiarism Detection

Marcin Iwanowski¹, Arkadiusz Cacko¹, Grzegorz Sarwas

Warsaw University of Technology¹

19 Sep 2016

TL;DR: The paper investigates various combinations of feature-point detectors and descriptors as potential tools for finding similar images in documents, and considers how algorithms that compute image similarity may extend the functionality of plagiarism detection systems.

Abstract: The paper presents the results of research on applying image processing methods to document comparison, with a view to their use in plagiarism detection systems. Among image processing methods, the feature-point ones are best suited to computing image similarity, thanks to their invariance to various image transforms. The paper investigates various combinations of feature-point detectors and descriptors as potential tools for finding similar images in documents. The methods are tested on a database of scientific papers containing 5 well-known image processing test images. The paper also presents an idea of how algorithms that compute image similarity may extend the functionality of plagiarism detection systems.

9 citations
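
One common way to turn feature-point descriptors into an image similarity score is nearest-neighbour matching with a ratio test, sketched below. The abstract does not say which detectors, descriptors or matching rule the authors used, so everything here is an assumption: descriptors are taken to be binary and are represented as Python integers, extraction from actual images is out of scope, and the names `hamming`, `match_descriptors` and `image_similarity` are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only if the best distance is clearly smaller than the
    second best (a ratio test), which filters out ambiguous matches.
    With a single candidate in desc_b the ratio test is skipped.
    """
    matches = []
    for i, d in enumerate(desc_a):
        ranked = sorted((hamming(d, e), j) for j, e in enumerate(desc_b))
        if not ranked:
            continue
        best_dist, best_j = ranked[0]
        if len(ranked) == 1 or best_dist < ratio * ranked[1][0]:
            matches.append((i, best_j, best_dist))
    return matches


def image_similarity(desc_a, desc_b, ratio=0.75):
    """Fraction of descriptors in desc_a that found a match in desc_b."""
    if not desc_a or not desc_b:
        return 0.0
    return len(match_descriptors(desc_a, desc_b, ratio)) / len(desc_a)
```

A brief usage example with made-up 8-bit descriptors:

```python
a = [0b11110000, 0b00001111, 0b10101010]
b = [0b11110001, 0b00001111, 0b01010101]
print(match_descriptors(a, b))  # list of (index in a, index in b, distance)
print(image_similarity(a, b))
```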

Proceedings Article

Preference comparison for plagiarism detection systems

Sarka Krizkova¹, Hana Tomaskova¹, Martin Gavalec¹

University of Hradec Králové¹

24 Jul 2016

TL;DR: The extent and practicality of plagiarism detection systems using multiple classifications of detection engines are studied using 8 individual articles from different fields of work to determine the effectiveness and extent of each detection engine.

Abstract: This article studies the extent and practicality of plagiarism detection systems using multiple classifications of detection engines, further described within the article. An in-depth analysis of 8 individual articles from different fields of work was carried out allowing comparisons both between detection systems and different writing styles/formats. The first analysis used unmodified versions of the 8 selected papers as a control and base for the performance of the detection engines, before a second analysis was conducted. This analysis used modified versions of the selected papers by formatting the plagiarized sentences detected in the first test. This formatting involved simple shuffling and manipulation of the text to determine the effectiveness and extent of each detection engine.

9 citations
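
The second analysis described above, shuffling text that was flagged in the first pass and checking again, can be illustrated with a small sketch. The article does not say how the tested engines measure similarity, so the two measures below (word-set overlap and word-trigram overlap) are assumptions picked to show why shuffling defeats some detectors and not others; all function names are hypothetical.

```python
import random
import re


def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def shuffle_sentence(sentence, rng):
    """Return the sentence's words in a different random order."""
    original = tokens(sentence)
    if len(set(original)) < 2:
        return " ".join(original)  # nothing to reorder
    shuffled = original[:]
    while shuffled == original:  # retry until the order actually changes
        rng.shuffle(shuffled)
    return " ".join(shuffled)


def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def ngrams(words, n=3):
    """All runs of n consecutive words, as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def compare(original, modified):
    """Return (word-set similarity, word-trigram similarity)."""
    o, m = tokens(original), tokens(modified)
    return jaccard(o, m), jaccard(ngrams(o), ngrams(m))
```

In this sketch a shuffled sentence keeps a word-set similarity of 1.0 while its trigram similarity drops, so an order-sensitive measure would miss text that an order-insensitive one still flags.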

Metrics

Papers: 1,976

Citations: 29,005

No. of papers in the topic in previous years
Year	Papers
2023	59
2022	126
2021	83
2020	118
2019	130
2018	125
