Automatic Generation of Plagiarism Detection Among Student Programs

doi:10.1109/ITHET.2006.339768

Proceedings Article

pp. 226-235

Abstract:

A system for the automatic generation of plagiarism detectors that find similar programs in a set of student programs is presented. Existing plagiarism detectors are either general-purpose or tied to one programming language or a predefined set of programming languages. General-purpose detectors usually employ string matching, with similarity measures designed for plagiarism detection among documents in general rather than programs in particular, thus losing much of the structure and logic of programs in the process. Plagiarism detectors for specific languages, on the other hand, cater only to that particular set of languages. This study provides a means for the user to specify the programming language of the student programs to be analyzed. Moreover, an automatic plagiarism detection system must be immune to the transformations that students perform on copied programs. These transformations usually depend on several factors, namely the type of programming problem, the corresponding complexity of the project to be implemented by the students, and the programming language paradigm of the programs. Thus, the similarity measures employed by the system should be determined by these factors and can be specified by the professor, who has the option to specify how the similarities among the student programs will be captured. The system provides an interface for specifying the particular programming language in which the student programs are implemented, and a knowledge base of similarity measures that the user would like to include in the analysis of the student programs. Hence, the system provides flexibility in both the programming language of the student programs to be analyzed and the similarity measures that the professor wishes to employ.
Initial qualitative and quantitative evaluations illustrate a flexible, convenient and cost-effective tool for building plagiarism detectors for effective detection of plagiarized programs in various imperative and procedural programming languages. The approach also addresses some of the changes that students perform on copied programs which JPlag fails to handle, thus allowing for improved accuracy in terms of the reduction of false positives and increasing the chance of catching plagiarized programs. These changes include modification of control structures, use of temporary variables and subexpressions, inlining and refactoring of methods, and redundancy (variables or methods that were not used). Comprehensive tests on other programming languages under various programming language paradigms, such as object-oriented, logic and functional languages, considering the different changes that students apply to copied programs (such as the tests done for JPlag), are also recommended for empirical evaluation.
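
As an illustration of why a measure that looks at program tokens can survive a transformation that defeats plain string matching, the following is a minimal sketch of one generic token-based similarity measure. It is not the system described in the abstract, nor one of the measures in its knowledge base: the function names (`normalized_tokens`, `kgrams`, `similarity`), the choice of Python as the analyzed language, the k-gram size, and the use of Jaccard similarity are all assumptions made for this example.

```python
import io
import keyword
import tokenize


def normalized_tokens(source):
    """Tokenize Python source, replacing identifiers and literals with placeholders.

    Hypothetical helper for illustration only.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            # Keep keywords (they carry structure); hide user-chosen names.
            tokens.append(tok.string if keyword.iskeyword(tok.string) else "ID")
        elif tok.type == tokenize.NUMBER:
            tokens.append("NUM")
        elif tok.type == tokenize.STRING:
            tokens.append("STR")
        elif tok.type == tokenize.OP:
            tokens.append(tok.string)
        # Comments, newlines and indentation tokens are ignored in this sketch.
    return tokens


def kgrams(tokens, k=4):
    """Return the set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(source_a, source_b, k=4):
    """Jaccard similarity of the two programs' normalized token k-grams."""
    grams_a = kgrams(normalized_tokens(source_a), k)
    grams_b = kgrams(normalized_tokens(source_b), k)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def sum_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n"

# Renaming every identifier leaves the normalized token stream unchanged.
print(similarity(original, renamed))
```

Because identifiers are replaced by a placeholder, renaming variables does not change the score. Transformations such as modified control structures or inlined methods would still change the token stream, so a sketch like this does not address them.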

References

Elements of software science

Winnowing: local algorithms for document fingerprinting

Elements of software science (Operating and programming systems series)

Finding Plagiarisms among a Set of Programs with JPlag

An algorithmic approach to the detection and prevention of plagiarism
