Topic

Plagiarism detection

About: Plagiarism detection is a research topic. Over its lifetime, 1,790 publications have been published within this topic, receiving 24,740 citations.


Papers
Proceedings Article
05 Feb 2017
TL;DR: This work proposes a novel hybrid approach for automatic plagiarism detection in programming assignments using static features extracted from the intermediate representation of a program in a compiler infrastructure such as gcc and demonstrates the use of unsupervised learning techniques on the extracted feature representations.
Abstract: In this work, we propose a novel hybrid approach for automatic plagiarism detection in programming assignments. Most of the well-known plagiarism detectors either employ a text-based approach or use features based on the properties of the program at a syntactic level. However, both of these approaches succumb to code obfuscation, which is a huge obstacle for automatic software plagiarism detection. Our proposed method uses static features extracted from the intermediate representation of a program in a compiler infrastructure such as gcc. We demonstrate the use of unsupervised learning techniques on the extracted feature representations and show that our system is robust to code obfuscation. We test our method on assignments from an introductory programming course. The preliminary results show that our system performs better than other popular tools such as MOSS. To visualize the local and global structure of the features, we obtained low-dimensional representations of our features using a popular technique called t-SNE, a variation of Stochastic Neighbor Embedding, which can preserve neighborhood identity in low dimensions. Based on this idea of preserving neighborhood identity, we mine interesting information such as the diversity in student solution approaches to a given problem. The presence of well-defined clusters in the low-dimensional visualizations demonstrates that our features are capable of capturing interesting programming patterns.

14 citations
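
The idea of comparing submissions through structural features rather than source text can be sketched as follows. This is a minimal illustration, not the authors' method: the feature names (basic blocks, loops, calls, branches), the sample counts, and the 0.99 threshold are all assumptions, and a plain pairwise cosine threshold stands in for the unsupervised learning techniques the abstract mentions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_similar_pairs(features, threshold=0.99):
    """Return (name_a, name_b, score) for every pair at or above the threshold."""
    flagged = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(features.items()), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            flagged.append((name_a, name_b, round(score, 3)))
    return flagged


# Hypothetical per-submission counts: [basic blocks, loops, calls, branches].
submissions = {
    "alice": [12, 3, 5, 7],
    "bob": [12, 3, 5, 7],    # same structure as alice, e.g. after renaming variables
    "carol": [4, 0, 9, 1],
}

print(flag_similar_pairs(submissions))
```

Because renaming identifiers or reordering comments does not change counts like these, two such submissions still produce identical vectors, which is the intuition behind robustness to simple obfuscation.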

Proceedings Article
01 Jan 2015
TL;DR: The RDI system for extrinsic plagiarism detection (RDI_RED) performs remarkably well on a wide spectrum of plagiarism techniques, from simple copy-paste to word shuffling and complete sentence rephrasing.
Abstract: Extrinsic plagiarism detection has attracted the attention of many researchers lately. Plagiarism has become increasingly difficult to detect due to the appearance of sophisticated approaches beyond direct copy and paste, such as phrase rephrasing, word shuffling, and semantic substitution. In this paper, we present the RDI system for extrinsic plagiarism detection (RDI_RED). The RDI_RED system performs remarkably well on a wide spectrum of plagiarism techniques, from simple copy-paste to word shuffling and complete sentence rephrasing. The RDI_RED system achieved the first three positions in the Arabic language plagiarism detection competition with a Plagdet (Plagiarism Detection score) of 80%, which is 20% higher than the baseline and 18% higher than the second-best competing system.

13 citations
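
For readers unfamiliar with the Plagdet score, the sketch below shows how it is commonly defined in the PAN evaluation framework: the F1 of precision and recall, divided by log2(1 + granularity), where granularity penalizes reporting one plagiarized passage as several fragments. The function name and the input values are illustrative assumptions and are not taken from the RDI_RED paper.

```python
import math


def plagdet(precision, recall, granularity):
    """Plagdet = F1(precision, recall) / log2(1 + granularity).

    granularity is >= 1; a value of 1 means each plagiarized passage
    was reported as exactly one detection.
    """
    if granularity < 1:
        raise ValueError("granularity must be >= 1")
    if precision + recall == 0:
        return 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return f1 / math.log2(1 + granularity)


# Hypothetical values, chosen only to show the effect of granularity.
print(plagdet(0.8, 0.8, 1.0))
print(plagdet(0.8, 0.8, 3.0))
```

With a granularity of 1 the score equals F1; fragmenting each detection into three pieces halves it, since log2(1 + 3) = 2.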

Journal Article
TL;DR: Medical researchers and authors may improve their writing skills and avoid the same errors by consulting the list of retractions due to plagiarism which are tracked on the PubMed platform and discussed on the Retraction Watch blog.
Abstract: Plagiarism is an ethical misconduct affecting the quality, readability, and trustworthiness of scholarly publications. Improving researcher awareness of plagiarism of words, ideas, and graphics is essential for avoiding unacceptable writing practices. Global editorial associations have publicized their statements on strategies to clean literature from redundant, stolen, and misleading information. Consulting related documents is advisable for upgrading author instructions and warning plagiarists of academic and other consequences of the unethical conduct. A lack of creative thinking and poor academic English skills are believed to compound most instances of redundant and “copy-and-paste” writing. Plagiarism detection software largely relies on reporting text similarities. However, manual checks are required to reveal inappropriate referencing, copyright violations, and substandard English writing. Medical researchers and authors may improve their writing skills and avoid the same errors by consulting the list of retractions due to plagiarism which are tracked on the PubMed platform and discussed on the Retraction Watch blog.

13 citations

Proceedings Article
22 May 2017
TL;DR: A report on how semantic similarity measures can be used in the plagiarism detection task is presented; the problem of identifying paraphrasing or obfuscation plagiarism, however, remains unresolved.
Abstract: Academic plagiarism is a serious problem nowadays. Due to the existence of inexhaustible sources of digital information, it is easier to plagiarize today than ever before. The good news is that plagiarism detection techniques have improved and are powerful enough to detect attempts at plagiarism in education. We are now witnessing efficient plagiarism detection software in action, such as Turnitin, iThenticate or SafeAssign. In the introduction we explore software that is used within the Croatian academic community for plagiarism detection in universities and/or in scientific journals. The question is: is this enough? Current software has proven to be successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved. In this paper we present a report on how semantic similarity measures can be used in the plagiarism detection task.

13 citations
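
Why paraphrasing remains hard for overlap-based checks can be seen in a small sketch. The example below uses word-trigram Jaccard similarity, a purely lexical measure chosen here for illustration; it is not the semantic similarity approach the paper reports on, and the sentences are made up.

```python
def word_ngrams(text, n=3):
    """Set of lowercase word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity of the word n-gram sets of two texts."""
    grams_a, grams_b = word_ngrams(a, n), word_ngrams(b, n)
    if not grams_a and not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


source = "the quick brown fox jumps over the lazy dog"
copied = "the quick brown fox jumps over the lazy dog"
paraphrase = "a fast dark fox leaps above a sleepy hound"

print(jaccard(source, copied))      # verbatim copy: full overlap
print(jaccard(source, paraphrase))  # paraphrase: no shared trigrams
```

The verbatim copy scores 1.0 while the paraphrase scores 0.0 even though it conveys the same meaning, which is the gap that semantic similarity measures aim to close.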

Journal Article
31 Mar 2014
TL;DR: This case study describes what happened in Slovakia in the last few years, compares the situation with other European countries, and discusses the results of the European project "Impact of Policies for Plagiarism in Higher Education Across Europe".
Abstract: The European project “Impact of Policies for Plagiarism in Higher Education Across Europe” has identified best practices and gaps related to plagiarism in different European countries. Slovakia is one of the interesting cases, where a national repository for plagiarism detection was established. However, there are still gaps in terms of policies and overall understanding of plagiarism. This case study describes what happened in Slovakia in the last few years, compares the situation with other European countries and discusses the results. Additionally, the number of occurrences of the terms “plagiarism” and “academic integrity” in the media and on the Internet is examined in relation to recent changes.

13 citations


Network Information
Related Topics (5)
Active learning: 42.3K papers, 1.1M citations, 78% related
The Internet: 213.2K papers, 3.8M citations, 77% related
Software development: 73.8K papers, 1.4M citations, 77% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 76% related
Deep learning: 79.8K papers, 2.1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    126
2021    83
2020    118
2019    130
2018    125