Proceedings ArticleDOI
A Plagiarism Detection Technique for Java Program Using Bytecode Analysis
Jeong-Hoon Ji, Gyun Woo, Hwan-Gue Cho +2 more
- Vol. 1, pp 1092-1098
TLDR
This paper proposes a plagiarism detection technique for Java programs that uses bytecode without referring to the source code; it can serve as a preliminary verification tool before plagiarism is detected by source code comparison.
Abstract:
Most plagiarism detection systems evaluate the similarity of source codes and detect plagiarized program pairs. If source code is used for plagiarism detection, source code security can become a significant problem. Plagiarism detection based on target code can protect the security of source code. In this paper, we propose a new plagiarism detection technique for Java programs that uses bytecode without referring to the source code. The bytecode-based detection procedure consists of two major steps. First, we generate token sequences from the Java class file by analyzing the code area of its methods. Then, we evaluate the similarity between token sequences using adaptive local alignment. According to the experimental results, the distribution of similarities among source codes and that among bytecodes are very similar. Also, the correlation between the similarities of source code pairs and those of bytecode pairs is high enough for typical test data. The plagiarism detection system using bytecode can be used as a preliminary verification tool before detecting plagiarism by source code comparison.
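The second step described in the abstract, scoring two token sequences by local alignment, can be illustrated with a minimal sketch. It uses plain Smith-Waterman local alignment with fixed scores, not the adaptive variant the paper proposes, whose scoring details are not given here. The function names, the scoring values (`match`, `mismatch`, `gap`), the normalization by the shorter sequence, and the example token sequences (JVM opcode mnemonics used as tokens) are all assumptions made for illustration.

```python
from typing import Sequence


def local_alignment_score(a: Sequence[str], b: Sequence[str],
                          match: int = 2, mismatch: int = -1,
                          gap: int = -1) -> int:
    """Best Smith-Waterman local alignment score of two token sequences.

    Fixed scores are an assumption; the paper's adaptive scheme is not
    reproduced here.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity(a: Sequence[str], b: Sequence[str], match: int = 2) -> float:
    """Score normalized by the best possible score of the shorter sequence
    (a hypothetical normalization, chosen so identical inputs give 1.0)."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return local_alignment_score(a, b, match=match) / (match * shorter)


# Hypothetical token sequences; real ones would come from the code area
# of the methods in a class file.
original = ["iload", "iload", "iadd", "istore", "iload", "ireturn"]
copied = ["iload", "iload", "iadd", "istore", "nop", "iload", "ireturn"]
unrelated = ["aload", "invokevirtual", "pop", "return"]

print(similarity(original, copied))
print(similarity(original, unrelated))
```

In this sketch an inserted instruction costs only one gap penalty, so the padded copy still scores close to the original, while a sequence that shares no tokens scores zero.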
Citations
Journal ArticleDOI
Source-code Similarity Detection and Detection Tools Used in Academia: A Systematic Review
TL;DR: This review gives an overview of definitions of plagiarism, plagiarism detection tools, comparison metrics, obfuscation methods, datasets used for comparison, and algorithm types, and identifies insights about metrics and datasets for quantitative tool comparison and about the categorisation of detection algorithms.
Journal ArticleDOI
Code Authorship Attribution: Methods and Challenges
TL;DR: This article presents the first comprehensive review of research on code authorship attribution, summarizes various methods of authorship attribution, and highlights challenges in the field.
Proceedings ArticleDOI
Detecting source code plagiarism on introductory programming course assignments using a bytecode approach
TL;DR: Based on the evaluation, it can be concluded that the proposed source code plagiarism detection approach is more effective at detecting most plagiarism attack types than the raw source code approach in an introductory programming course.
Proceedings ArticleDOI
Similarity Detection Techniques for Academic Source Code Plagiarism and Collusion: A Review
TL;DR: The mechanisms by which each of these code similarity detection techniques works are summarized, compiled from publications listed by Google Scholar and one or more of the ACM digital library, IEEE Xplore digital library, ScienceDirect, Scopus, and the references of already listed publications.
Journal ArticleDOI
Detecting Source Code Plagiarism on .NET Programming Languages using Low-level Representation and Adaptive Local Alignment
TL;DR: A source code plagiarism detection approach is proposed that relies on a low-level representation and is more effective and efficient than the standard lexical-token approach.
References
Journal ArticleDOI
Identification of common molecular subsequences.
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
Journal Article
Finding Plagiarisms among a Set of Programs with JPlag
TL;DR: JPlag is a web service that finds pairs of similar programs among a given set of programs; its architecture and its comparison algorithm, which is based on a known algorithm called Greedy String Tiling, are described.
Journal ArticleDOI
Shared information and program plagiarism detection
TL;DR: A metric based on Kolmogorov complexity is proposed and proven to be universal in measuring the amount of shared information between two computer programs, enabling plagiarism detection, and a practical system is designed and implemented that approximates this metric with a heuristic compression algorithm.
Journal ArticleDOI
Plagiarism in programming assignments
Mike Joy, Michael Luck +1 more
TL;DR: The authors have developed a package that allows programming assignments to be submitted online and includes software to assist in detecting possible instances of plagiarism, and they consider its implications for large group teaching.
Proceedings Article
Deducing similarities in Java sources from bytecodes
Brenda S. Baker, Udi Manber +1 more
TL;DR: Experimental results indicate that these techniques can be very effective; even changes of 30% to the source file will usually result in bytecode that can be associated with the original.