Topic

Plagiarism detection

About: Plagiarism detection is a research topic. Over its lifetime, 1,790 publications have been published within this topic, receiving 24,740 citations.


Papers
08 Jun 2018
TL;DR: The proposed framework performs efficient section-wise plagiarism detection and provides suggestions for improving documents. Precision, recall, and accuracy for different n-gram features are presented, showing the strictness of higher-level n-gram features.
Abstract: Various approaches to plagiarism detection have been implemented for authors' work and academic publications. With the increasing number of publications, there is a need for reliable and performant plagiarism detection. Plagiarism is a serious offense in which one author presents someone else's work as their own. Moreover, existing algorithms do not consider similar sections for efficient comparison. The proposed framework performs efficient section-wise plagiarism detection and provides suggestions for improving documents. Precision, recall, and accuracy for different n-gram features are presented, showing the strictness of higher-level n-gram features.

6 citations
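
The abstract above notes that higher-level n-gram features are stricter. A minimal sketch of that effect, assuming word n-grams and Jaccard similarity as the comparison measure (both are illustrative choices, not the paper's framework; the function names and sample sentences are hypothetical):

```python
import re

def word_ngrams(text, n):
    """Return the set of word n-grams (as tuples) found in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def section_similarity(section_a, section_b, n):
    """Compare two document sections using word n-grams of size n."""
    return jaccard(word_ngrams(section_a, n), word_ngrams(section_b, n))

# Hypothetical sections that differ by a single word.
source = "the quick brown fox jumps over the lazy dog"
suspect = "the quick brown fox leaps over the lazy dog"

# The same one-word change lowers the score more as n grows.
scores = {n: section_similarity(source, suspect, n) for n in (1, 2, 3)}
```

With these two sentences the score falls from about 0.78 for unigrams to 0.6 for bigrams and 0.4 for trigrams, which is one way to read "strictness" of higher-level n-grams.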

Journal ArticleDOI
TL;DR: A generic program structure comparison framework that is designed to transform the source code into mathematical objects, use appropriate reduction and comparison methods on these, and interpret the results appropriately is presented.
Abstract: The paper presents a plagiarism detection framework whose goal is to determine whether two programs are similar to each other and, if so, to what extent. Plagiarism detection has been considered earlier for written material, such as student essays, and text-based algorithms have been published for it. We argue that in the case of program code comparison, structure-based techniques may be much more suitable. The main idea is to transform the source code into mathematical objects, use appropriate reduction and comparison methods on these, and interpret the results appropriately. We have designed a generic program structure comparison framework and implemented it for the Prolog and SML programming languages. We have been using the implementation at BUTE to successfully detect plagiarism in homework assignments for years.

6 citations
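
The idea of transforming source code into a structural object and comparing those objects can be sketched as follows. This is only an illustration under stated assumptions: it targets Python source (the paper's implementation is for Prolog and SML), reduces each program to the sequence of its AST node type names, and compares sequences with `difflib.SequenceMatcher`. The function names and sample programs are hypothetical.

```python
import ast
from difflib import SequenceMatcher

def structure_signature(source):
    """Reduce Python source to the sequence of its AST node type names.

    Identifiers and literal values are discarded, so renaming variables
    does not change the signature.
    """
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]

def structural_similarity(source_a, source_b):
    """Return a similarity ratio in [0, 1] between two program structures."""
    sig_a = structure_signature(source_a)
    sig_b = structure_signature(source_b)
    return SequenceMatcher(None, sig_a, sig_b).ratio()

original = """
def total(values):
    result = 0
    for v in values:
        result += v
    return result
"""

# Same structure, every identifier renamed.
renamed = """
def add_all(numbers):
    acc = 0
    for item in numbers:
        acc += item
    return acc
"""

# A structurally different program.
different = """
def greet(name):
    print("hello", name)
"""

same_structure = structural_similarity(original, renamed)
other_structure = structural_similarity(original, different)
```

A text-based comparison would see `original` and `renamed` as largely different, while the structural signature treats them as identical, which is the motivation the abstract gives for structure-based techniques.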

Journal ArticleDOI
TL;DR: IBFET uses the MapReduce divide-and-conquer approach to detect code clones at very large scale (billions of LOC) at file-level granularity, and performs preprocessing, indexing, and clone detection for more than 324 billion LOC in a Hadoop distributed environment.
Abstract: Many techniques have been developed over the years to detect code clones in different software systems to maintain security measures. These techniques often require the source code to compare the subject system against a very large data set of big code. This paper presents the index-based features extraction technique (IBFET) to detect code clones at very large scale, up to billions of LOC, at file-level granularity. We performed preprocessing, indexing, and clone detection for more than 324 billion LOC using a Hadoop distributed environment, which is faster and more efficient than existing distributed indexing and clone detection techniques; it also detects all three types of clones efficiently. The MapReduce rule of divide and conquer is used to count and retrieve the similar features between different systems. We evaluated the execution time, scalability, precision, and recall of IBFET using the well-known clone detection data sets IJaDataset and BigCloneBench; furthermore, we compared the results with other state-of-the-art tools. Our approach is faster, flexible, and scalable, provides accurate results with high authenticity, and can be implemented at large scale.

6 citations
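
The map and reduce steps described in the abstract above (count and retrieve similar features between systems) can be sketched in a single process. This is an assumption-laden illustration, not IBFET and not Hadoop code: a "feature" here is simply a whitespace-stripped source line, and all function and file names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(files):
    """Emit (feature, file_name) pairs; a feature is a stripped, non-empty line."""
    for name, source in files.items():
        features = {line.strip() for line in source.splitlines() if line.strip()}
        for feature in features:
            yield feature, name

def reduce_phase(pairs):
    """Group file names by feature, producing an inverted index."""
    index = defaultdict(set)
    for feature, name in pairs:
        index[feature].add(name)
    return index

def shared_feature_counts(index):
    """Count how many features each pair of files has in common."""
    counts = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            counts[(a, b)] += 1
    return dict(counts)

# Hypothetical file contents; a.java and b.java are exact copies.
files = {
    "a.java": "int x = 0;\nx++;\nreturn x;",
    "b.java": "int x = 0;\nx++;\nreturn x;",
    "c.java": "int y = 1;\nreturn y;",
}

counts = shared_feature_counts(reduce_phase(map_phase(files)))
```

Because each feature is processed independently in the map step and grouped by key in the reduce step, the same logic could in principle be partitioned across workers; file pairs with a high shared-feature count would then be candidates for file-level clones.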

01 Jan 2008
TL;DR: This work designs a plagiarism detection tool for an e-learning system that offers accurate and fast detection, using a fingerprint technique based on tokenisation into word n-grams, to ensure that copied material is detected and that no student is unfairly accused of copying.
Abstract: Cheating is one of the major problems in education. With the increase in the use of computers and the Internet, the opportunities for students to plagiarize have increased many times. Electronic source material coupled with word-processed submission facilitates a 'cut and paste' mentality where a student's engagement with a subject can be bypassed. Students share their work and collaborate with other students through email, or copy material from published works or the internet without proper citation. In computer programming classes, students can easily share their source code for programming assignments. E-learning benefits the educational process, but it offers a high opportunity for plagiarism. There is a need for a plagiarism detection tool to help a lecturer assess students fairly. This work designed a plagiarism detection tool for the e-learning system. The aim of any plagiarism detection process is to ensure that copied material is detected, and that no student is unfairly accused of copying. Software exists to help achieve this aim, but (obviously) no system is perfect, and no fully automatic plagiarism detection system exists that will identify all and only cheaters. Similarity measurement determines the accuracy of plagiarism detection, and processing time determines the quality of service, because it depends on the number of documents and the amount of ideas inside them. Using a fingerprint technique based on tokenisation into word n-grams, the tool achieves accurate and faster detection.

6 citations
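
A fingerprint technique based on word n-grams, as named in the abstract above, might look like the following. This is a minimal sketch under assumptions: the fingerprint is the set of SHA-256 hashes of word trigrams, an optional `modulus` parameter keeps only hashes divisible by it (a common way to shrink fingerprints, not something the abstract specifies), and containment is used as the score. All names and sample texts are hypothetical.

```python
import hashlib
import re

def fingerprints(text, n=3, modulus=1):
    """Hash every word n-gram and keep hashes divisible by modulus.

    modulus=1 keeps every hash; larger values keep a smaller sample.
    """
    words = re.findall(r"\w+", text.lower())
    selected = set()
    for i in range(len(words) - n + 1):
        gram = " ".join(words[i:i + n])
        value = int(hashlib.sha256(gram.encode("utf-8")).hexdigest(), 16)
        if value % modulus == 0:
            selected.add(value)
    return selected

def containment(suspect_fp, source_fp):
    """Fraction of the suspect's fingerprints that also occur in the source."""
    if not suspect_fp:
        return 0.0
    return len(suspect_fp & source_fp) / len(suspect_fp)

# Hypothetical texts: the suspect reuses a five-word run from the source.
source_text = "students can easily share their source code for programming assignments"
suspect_text = "they easily share their source code with friends"
unrelated_text = "the weather was cold and windy all week"

score = containment(fingerprints(suspect_text), fingerprints(source_text))
unrelated_score = containment(fingerprints(unrelated_text), fingerprints(source_text))
```

Here three of the suspect's six trigrams also appear in the source, so `score` is 0.5, while the unrelated text scores 0.0. Comparing sets of integers rather than raw text is what makes this kind of check fast across many documents.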


Network Information
Related Topics (5)
Active learning: 42.3K papers, 1.1M citations, 78% related
The Internet: 213.2K papers, 3.8M citations, 77% related
Software development: 73.8K papers, 1.4M citations, 77% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 76% related
Deep learning: 79.8K papers, 2.1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    59
2022    126
2021    83
2020    118
2019    130
2018    125