Topic

Plagiarism detection

About: Plagiarism detection is a research topic. Over the lifetime, 1790 publications have been published within this topic receiving 24740 citations.


Papers
Proceedings Article
05 Sep 2019
TL;DR: The results call into question the usefulness of automated plagiarism detection for SQL, since they imply that a lot of manual inspection will still be needed, although the high false-positive rate may be restricted to shorter queries (e.g. under 200 characters).
Abstract: Automated assessment is becoming increasingly common in Computer Science and with it automated plagiarism detection is also common. However, little attention has been paid to SQL assessment where submissions are much shorter and must be less varied than in imperative languages. This brings the challenge of avoiding high false-positive rates that require manual inspection and undermine the usefulness of automated detection. In this paper we investigate the false-positive rate of various automated plagiarism detection algorithms. We find that there is a significant false-positive rate of between 15% and 64%. These results call into question the usefulness of automated detection for SQL since they imply that a lot of manual inspection will still be needed. However, our results suggest that the false-positive rate may be restricted to shorter queries (e.g. under 200 characters). Further research is needed because our datasets consist mostly of short queries and the results for longer queries are based on a small subset of the data.
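The false-positive problem described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the token-set Jaccard measure, the 0.8 threshold, the helper names, and the definition of the false-positive rate as the share of flagged pairs that are not known copies are all assumptions made for illustration.

```python
import re
from itertools import combinations


def tokens(query: str) -> set[str]:
    """Lowercased words and punctuation symbols of a SQL string."""
    return set(re.findall(r"\w+|[^\w\s]", query.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two queries, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 1.0


def false_positive_rate(submissions, copied_pairs, threshold=0.8):
    """Share of flagged pairs that are NOT known copies.

    `copied_pairs` is a hypothetical ground-truth set of index pairs (i, j), i < j.
    """
    flagged = [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(submissions), 2)
        if jaccard(a, b) >= threshold
    ]
    if not flagged:
        return 0.0
    wrong = [pair for pair in flagged if pair not in copied_pairs]
    return len(wrong) / len(flagged)
```

Because a short query leaves little room for variation, independently written answers to the same exercise can easily exceed the threshold, which is the effect the abstract reports.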

5 citations

Journal Article
TL;DR: The most frequently used strategies were rubrics, plagiarism detection software, multistep assignments, and examples of well-written papers, all strategies that increased in use over the 5-year study.
Abstract: Background Nursing students are underprepared for the rigors of graduate writing. The lack of sufficient writing opportunities and skill development in prelicensure nursing education creates barriers that threaten course and program progression. Approach This study used a prospective, repeated-measures design to evaluate 5 years of faculty-implemented writing development strategies in a DNP program. Outcomes Faculty adopted 12 strategies in 10 courses. The strategies addressed skill building in content, construction, format, plagiarism, and citation use. The most frequently used strategies were rubrics, plagiarism detection software, multistep assignments, and examples of well-written papers, all strategies that increased in use over the 5-year study. Conclusions Graduate faculty interact with students and assess writing development outcomes firsthand. Changes in faculty practices over time can indicate the strategies they consider most valuable for writing development.

5 citations

Proceedings Article
29 Oct 2020
TL;DR: This article presents a method, supported by a deep learning model for Persian, to detect exact plagiarism and rewriting in Persian scientific texts; the results indicate that structural and semantic information improves the performance of the proposed method.
Abstract: In recent years, the rapid growth of Persian electronic resources and the ease of access to them have seriously aggravated the plagiarism problem in the Iranian scientific community. Despite automatic plagiarism detection systems such as Turnitin and Eve2, the problem has persisted because these systems lack support for Persian. The main purpose of this article is to detect exact plagiarism and rewriting in Persian scientific texts. In our proposed method, after candidate retrieval based on statistical characteristics, the text alignment step performs structural and semantic analysis of expressions to detect rewritten plagiarism. First, a data-driven dependency parser is improved with the help of a deep learning model for Persian to analyze the structure of an expression, and the degree of structural similarity is then evaluated through analysis of the dependency tree. To examine the semantic similarity of expressions, we suggest using the semantic role labeling obtained from the presented deep learning model. The experiments were performed on the corpus prepared for AAIC2015 and the corpus of the PAN2015 competition. The results indicate that structural and semantic information improves the performance of the proposed method. ParsiPayesh is available at http://www.parsipayesh.ir.

5 citations

01 Jun 2009
TL;DR: Most plagiarism detection software, including the web service CrossCheck, is based on concordance, i.e., comparison of texts in which program tools isolate and mark corresponding parts of the text and calculate their share of the whole text.
Abstract: Plagiarism is unauthorized appropriation of other people’s ideas, processes or text without giving correct credit and with intention to present it as own property. Appropriation of own published ideas or text and passing it as original is denominated self-plagiarism and considered as bad as plagiarism. The frequency of plagiarism is increasing and development of information and communication technologies facilitates it, but simultaneously, thanks to the same technology, plagiarism detection software is developing. There are different software solutions for checking plagiarism. Most of them are based on concordance, i.e., comparison of text where program tools isolate and mark correspondent parts of the text and calculate its rate regarding the whole text. Several programs, besides comparing the texts, also search the Internet aiming for text with corresponding content. All programs can compare text written in the same language but translingual comparison with plagiarism detection software is not yet possible. The software is available through computer programs (WCopyfind) or Web Services (eTBlast, CrossCheck). Their advantage is in the possibility of finding the original source paper. eTBlast is the free of charge web based service for searching corresponding and highly similar scientific paper abstracts (it searches also Medline database), which served as the ground for constructing Deja vu database. Web based service CrossCheck is accessible only for members (academic institutions and journals) and by using computer similarity algorithm iThenticate of company iParadigms (Oakland, CA, USA), it checks accordance of the given text with the complete texts in the CrossCheck database. It is organized by collaboration of journal editorial boards and publishers who pass the published papers to the base and enable searching of content usually protected by subscription.
The importance of recognizing and teaching about plagiarism in the academic community at all levels of education is enormous. Scientific journal editors and scientists should fight together against unethical research that is contrary to the scientific idea and harmful to the scientific community and society, critically read and examine scientific publications, and report plagiarism and other suspected research misconduct to journal editorial boards and institutional authorities.
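The concordance idea in this abstract (isolate the corresponding passages, then express them as a rate of the whole text) can be sketched as follows. This is only an illustration using Python's standard `difflib.SequenceMatcher`; it is not how WCopyfind, eTBlast, or CrossCheck work internally, and the function name `concordance` and the `min_words` cutoff are assumptions.

```python
from difflib import SequenceMatcher


def concordance(suspect: str, source: str, min_words: int = 3):
    """Return the passages shared with `source` and their share of `suspect`.

    Only word runs of at least `min_words` (an assumed cutoff) count as matches.
    """
    s_words, o_words = suspect.split(), source.split()
    matcher = SequenceMatcher(None, s_words, o_words, autojunk=False)
    passages = [
        " ".join(s_words[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]
    matched_words = sum(len(p.split()) for p in passages)
    rate = matched_words / len(s_words) if s_words else 0.0
    return passages, rate
```

Calling `concordance(submitted_text, published_text)` yields the marked passages and the fraction of the submitted text they cover. A real service would also need to search a large corpus for candidate sources first, which this sketch leaves out.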

5 citations

Proceedings Article
18 Jun 2008
TL;DR: This paper approximates a student's coding style, a superficial feature of source code, with a stochastic model called a coding model, based on a Hidden Markov Model, and uses it as authentication information for an author.
Abstract: Measuring similarity among source codes produced in a programming class, hereinafter called 'in-class' source codes, for grading or detecting plagiarism is a laborious task. A special similarity measuring method for in-class source codes is needed because: (1) they are often too short to extract enough algorithmic features, and (2) they naturally have strong algorithmic similarity since they are written for the same purpose, so it is difficult to distinguish plagiarism from coincidental similarity in them. The contribution of this paper is to quantify features based on students' coding style instead of algorithmic features. We approximate a student's coding style, which is a superficial feature of source code, with a stochastic model called a coding model, based on a Hidden Markov Model, and use it as authentication information for an author.

5 citations


Network Information
Related Topics (5)
Active learning
42.3K papers, 1.1M citations
78% related
The Internet
213.2K papers, 3.8M citations
77% related
Software development
73.8K papers, 1.4M citations
77% related
Graph (abstract data type)
69.9K papers, 1.2M citations
76% related
Deep learning
79.8K papers, 2.1M citations
76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    126
2021    83
2020    118
2019    130
2018    125