Topic

Plagiarism detection

About: Plagiarism detection is a research topic. Over its lifetime, 1,790 publications have been published on this topic, receiving 24,740 citations.


Papers
Dissertation
12 Jun 2013
TL;DR: This dissertation investigates whether plagiarism involves an intention to deceive and, if so, whether forensic linguistic evidence can provide clues to this intentionality. It also evaluates current computational approaches to plagiarism detection and identifies strategies that these systems fail to detect.
Abstract: This study investigates plagiarism detection, with an application in forensic contexts. Two types of data were collected for the purposes of this study. Data in the form of written texts were obtained from two Portuguese universities and from a Portuguese newspaper. These data are analysed linguistically to identify instances of verbatim, morpho-syntactical, lexical and discursive overlap. Survey data were obtained from two higher education institutions in Portugal, and another two in the United Kingdom. These data are analysed using a 2 by 2 between-groups univariate Analysis of Variance (ANOVA), to reveal cross-cultural divergences in the perceptions of plagiarism. The study discusses the legal and social circumstances that may contribute to adopting a punitive approach to plagiarism, or, conversely, to rejecting punishment. The research adopts a critical approach to plagiarism detection. On the one hand, it describes the linguistic strategies adopted by plagiarists when borrowing from other sources, and, on the other hand, it discusses the relationship between these instances of plagiarism and the context in which they appear. A focus of this study is whether plagiarism involves an intention to deceive, and, if so, whether forensic linguistic evidence can provide clues to this intentionality. It also evaluates current computational approaches to plagiarism detection, and identifies strategies that these systems fail to detect. Specifically, a method is proposed to detect translingual plagiarism. The findings indicate that, although cross-cultural aspects influence the different perceptions of plagiarism, a distinction needs to be made between intentional and unintentional plagiarism. The linguistic analysis demonstrates that linguistic elements can contribute to finding clues for the plagiarist’s intentionality. Furthermore, the findings show that translingual plagiarism can be detected by using the method proposed, and that plagiarism detection software can be improved using existing computer tools.

22 citations

Proceedings ArticleDOI
08 Sep 2014
TL;DR: A hybrid approach is proposed that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve higher detection performance for disguised plagiarism forms. It is shown that this approach makes semantic plagiarism detection feasible even on large collections for the first time.
Abstract: This paper proposes a hybrid approach to plagiarism detection in academic documents that integrates detection methods using citations, semantic argument structure, and semantic word similarity with character-based methods to achieve a higher detection performance for disguised plagiarism forms. Currently available software for plagiarism detection exclusively performs text string comparisons. These systems find copies, but fail to identify disguised plagiarism, such as paraphrases, translations, or idea plagiarism. Detection approaches that consider semantic similarity on word and sentence level exist and have consistently achieved higher detection accuracy for disguised plagiarism forms compared to character-based approaches. However, the high computational effort of these semantic approaches makes them infeasible for use in real-world plagiarism detection scenarios. The proposed hybrid approach uses citation-based methods as a preliminary heuristic to reduce the retrieval space with a relatively low loss in detection accuracy. This preliminary step can then be followed by a computationally more expensive semantic and character-based analysis. We show that such a hybrid approach allows semantic plagiarism detection to become feasible even on large collections for the first time.
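The two-stage idea in this abstract (a cheap citation-based filter followed by a more expensive comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Document` class, the shared-citation count used as the filter, the character n-gram Jaccard score standing in for the expensive analysis, and the `min_shared` and `threshold` values are all assumptions chosen for illustration. The semantic analysis described in the abstract is not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical document record; field names are illustrative assumptions."""
    doc_id: str
    text: str
    citations: frozenset  # identifiers of the works this document cites


def char_ngrams(text, n=5):
    """Set of character n-grams of the lowercased, whitespace-normalised text."""
    normalised = " ".join(text.lower().split())
    return {normalised[i:i + n] for i in range(len(normalised) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def citation_candidates(query, collection, min_shared=2):
    """Stage 1 (assumed heuristic): keep documents sharing enough cited works."""
    return [
        doc for doc in collection
        if doc.doc_id != query.doc_id
        and len(query.citations & doc.citations) >= min_shared
    ]


def detect(query, collection, min_shared=2, threshold=0.3):
    """Stage 2: run the costlier text comparison only on stage-1 candidates."""
    query_grams = char_ngrams(query.text)
    hits = []
    for doc in citation_candidates(query, collection, min_shared):
        score = jaccard(query_grams, char_ngrams(doc.text))
        if score >= threshold:
            hits.append((doc.doc_id, score))
    # Highest-scoring candidates first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

The point of the design is that documents rejected by `citation_candidates` never reach the text comparison, which is where the saving in retrieval space would come from; the trade-off, as the abstract notes, is some loss in detection accuracy for sources that share few or no citations.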

22 citations

Book ChapterDOI
21 Mar 2010
TL;DR: An approach using an extension of the method Encoplot, which won the 1st international competition on plagiarism detection in 2009, is presented, tested on a large-scale corpus of artificial plagiarism, with good results.
Abstract: Determining the direction of plagiarism (who plagiarized whom in a given pair of documents) is one of the most interesting problems in the field of automatic plagiarism detection. We present here an approach using an extension of the method Encoplot, which won the 1st international competition on plagiarism detection in 2009. We have tested it on a large-scale corpus of artificial plagiarism, with good results.

22 citations

01 Jan 2004
TL;DR: In this paper, the authors describe the strategic framework at the University of Tasmania (UTAS) for managing academic integrity issues, focusing on the introduction of plagiarism detection software, which has served to highlight the wide variety of issues associated with academic integrity and the importance of embedding good practice on the part of both staff and students.
Abstract: Academic integrity issues are currently a major focus of concern at most tertiary institutions. This paper details the strategic framework for work at the University of Tasmania (UTAS) for management of these issues. It focuses on the introduction of plagiarism detection software which has served to highlight the wide variety of issues associated with academic integrity and the importance of embedding good practice on the part of both staff and students. The paper reports on the Pandora’s box of implementation issues – legal, workload, training and support – that have emerged and the strategies being used to manage these, as part of the project. It recommends the use of a model which focuses on an educative approach to the management of academic integrity, as well as including mechanisms for identifying and discouraging plagiarism, and where it occurs, proceeding against it as academic misconduct. Many of the issues raised by the project have challenged the ‘comfort zones’ of students, staff and university academic administration. These are being managed both through the approaches being used in the pilot and the project governance adopted.

21 citations

Journal ArticleDOI
TL;DR: Students perceived that plagiarism is an important issue; detection software makes it easier for lecturers; it is fair to use detection software; students support its use; and it will have some effect in preventing plagiarism, but students' concerns included being caught for unintentional plagiarism.
Abstract: The aim of this research was to determine student and staff perceptions of the effectiveness of plagiarism detection software. A mixed methods approach was undertaken, using a research model adapted from the literature. Eight hours of interviews were conducted with six students and six teaching staff from Curtin Business School at Curtin University of Technology, which had trialled the plagiarism detection software EVE2. A survey questionnaire was completed by 171 students involved in the trial. The summary indication was that students perceived that plagiarism is an important issue; detection software makes it easier for lecturers; it is fair to use detection software; students support its use; and it will have some effect in preventing plagiarism. However, students' concerns included being caught for unintentional plagiarism, teaching staff placing too much emphasis on detection results above student ability, and the accuracy of the software at detecting plagiarism. Concerns for teaching staff included the time taken for the detection process, limitation of the software to publicly based Internet sources and direct copying, and the extra workload involved with pursuing academic misconduct.

21 citations


Network Information
Related Topics (5)
Active learning: 42.3K papers, 1.1M citations, 78% related
The Internet: 213.2K papers, 3.8M citations, 77% related
Software development: 73.8K papers, 1.4M citations, 77% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 76% related
Deep learning: 79.8K papers, 2.1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    126
2021    83
2020    118
2019    130
2018    125