Proceedings Article

A Plagiarism Detection Technique for Java Program Using Bytecode Analysis

Jeong-Hoon Ji, +2 more
- Vol. 1, pp 1092-1098
TLDR
This paper proposes a plagiarism detection technique for Java programs that uses bytecode without referring to the source code. The technique can serve as a preliminary verification tool before plagiarism is detected by source code comparison.
Abstract
Most plagiarism detection systems evaluate the similarity of source code and detect plagiarized program pairs. When source code is used for plagiarism detection, its security can become a significant problem. Plagiarism detection based on target code can protect the security of the source code. In this paper, we propose a new plagiarism detection technique for Java programs that uses bytecode without referring to the source code. The detection procedure consists of two major steps. First, we generate token sequences from the Java class file by analyzing the code area of its methods. Then, we evaluate the similarity between token sequences using adaptive local alignment. According to the experimental results, the distribution of similarities for source code and that for bytecode are very similar. Also, the correlation between the similarities of source code pairs and those of bytecode pairs is high enough for typical test data. The bytecode-based plagiarism detection system can be used as a preliminary verification tool before detecting plagiarism by source code comparison.
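
The second step of the abstract, scoring two token sequences by local alignment, can be illustrated with a minimal Python sketch. It uses the standard Smith-Waterman recurrence with fixed scores, not the adaptive scoring scheme of the paper, whose details are not given here. The function names, the score values (`match=2`, `mismatch=-1`, `gap=-1`), the normalization by the shorter sequence, and the sample opcode sequences are all assumptions made for illustration.

```python
from typing import Sequence


def local_alignment_score(a: Sequence[str], b: Sequence[str],
                          match: int = 2, mismatch: int = -1,
                          gap: int = -1) -> int:
    """Return the best Smith-Waterman local alignment score of a and b.

    The scores are fixed, hypothetical values; the paper's adaptive
    variant is not reproduced here.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0,
                          prev[j - 1] + step,  # align a[i-1] with b[j-1]
                          prev[j] + gap,       # skip a token of a
                          curr[j - 1] + gap)   # skip a token of b
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a: Sequence[str], b: Sequence[str],
               match: int = 2, mismatch: int = -1, gap: int = -1) -> float:
    """Normalize the score by the best score the shorter sequence allows."""
    if not a or not b:
        return 0.0
    score = local_alignment_score(a, b, match, mismatch, gap)
    return score / (match * min(len(a), len(b)))


# Hypothetical token sequences built from JVM opcode mnemonics.
original = ["iload", "iload", "iadd", "istore", "iload", "ireturn"]
padded = ["nop"] + original  # same method body with one extra instruction
unrelated = ["aload", "invokevirtual", "pop", "return"]

print(similarity(original, padded))     # 1.0
print(similarity(original, unrelated))  # 0.0
```

Because the alignment is local, an instruction inserted before or after a copied region does not lower the score, while an insertion inside the region costs only one gap penalty.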


Citations
Journal Article

Source-code Similarity Detection and Detection Tools Used in Academia: A Systematic Review

TL;DR: This review gives an overview of definitions of plagiarism, plagiarism detection tools, comparison metrics, obfuscation methods, datasets used for comparison, and algorithm types and identifies interesting insights about metrics and datasets for quantitative tool comparison and categorisation of detection algorithms.
Journal Article

Code Authorship Attribution: Methods and Challenges

TL;DR: This article presents the first comprehensive review of research on code authorship attribution, summarizes various methods of authorship attribution, and highlights challenges in the field.
Proceedings Article

Detecting source code plagiarism on introductory programming course assignments using a bytecode approach

TL;DR: Based on the evaluation, it can be concluded that the proposed source code plagiarism detection approach is more effective at detecting most plagiarism attack types than the raw source code approach in introductory programming courses.
Proceedings Article

Similarity Detection Techniques for Academic Source Code Plagiarism and Collusion: A Review

TL;DR: The mechanisms by which each of these code similarity detection techniques works are summarized, compiled from publications listed by Google Scholar and one or more of the ACM digital library, IEEE Xplore digital library, ScienceDirect, Scopus, and the references of already listed publications.
Journal Article

Detecting Source Code Plagiarism on .NET Programming Languages using Low-level Representation and Adaptive Local Alignment

TL;DR: A source code plagiarism detection approach is proposed that relies on a low-level representation and is more effective and efficient than the standard lexical-token approach.
References
Journal Article

Identification of common molecular subsequences.

TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article

Finding Plagiarisms among a Set of Programs with JPlag

TL;DR: JPlag is a web service that finds pairs of similar programs among a given set of programs. Its architecture and its comparison algorithm, which is based on a known algorithm called Greedy String Tiling, are described.
Journal Article

Shared information and program plagiarism detection

TL;DR: A metric based on Kolmogorov complexity is proposed and proven to be universal in measuring the amount of shared information between two computer programs, enabling plagiarism detection. A practical system is designed and implemented that approximates this metric with a heuristic compression algorithm.
Journal Article

Plagiarism in programming assignments

TL;DR: The authors have developed a package which will allow programming assignments to be submitted online, and which includes software to assist in detecting possible instances of plagiarism, and consider its implications for large group teaching.
Proceedings Article

Deducing similarities in Java sources from bytecodes

TL;DR: Experimental results indicate that these techniques can be very effective: even changes of 30% to the source file will usually result in bytecode that can be associated with the original.