Proceedings Article

Detecting Java Code Clones with Multi-granularities Based on Bytecode

TLDR
A novel code clone detection method based on Java bytecode that can simultaneously detect code clones at both method level and block level and is shown to be more effective than the state-of-the-art methods.
Abstract
Sequences of duplicate code, either with or without modification, are known as code clones or just clones. Code clones are generally considered undesirable for a number of reasons, although they can offer some convenience to developers. The detection of code clones helps to improve the quality of source code through software re-engineering. Numerous methods have been proposed for code clone detection in Java code. However, the existing methods are mostly based on the Java source code, while only a few focus on its bytecode. In fact, Java bytecode reflects more of the semantic nature of the source code than the source code itself does. In this paper, we propose a novel code clone detection method based on Java bytecode. Using block-level code fragments extracted from bytecode, it can simultaneously detect code clones at both the method level and the block level. In addition, during code clone detection, the similarities of both method call sequences and instruction sequences are calculated in order to improve accuracy. We conduct two extensive experiments to evaluate the performance of our method. The results show that the proposed method detects code clones more effectively than the state-of-the-art methods.
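The abstract does not say how the two similarities are computed or combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each block has already been reduced to a list of bytecode opcode mnemonics and a list of invoked method names, and it uses the matching-subsequence ratio from Python's `difflib` as a stand-in similarity measure. The dictionary keys, the `call_weight` parameter, and the `threshold` value are all hypothetical.

```python
from difflib import SequenceMatcher


def sequence_similarity(a, b):
    """Return a similarity in [0, 1] for two sequences of hashable items.

    Stand-in measure (an assumption): difflib's ratio, 2 * matches / total length.
    """
    if not a and not b:
        # Two empty sequences are treated as identical.
        return 1.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def block_similarity(block_a, block_b, call_weight=0.5):
    """Combine instruction-sequence and method-call-sequence similarity.

    The weighted average and the default weight are hypothetical choices.
    """
    instr = sequence_similarity(block_a["instructions"], block_b["instructions"])
    calls = sequence_similarity(block_a["calls"], block_b["calls"])
    return call_weight * calls + (1.0 - call_weight) * instr


def is_clone(block_a, block_b, threshold=0.8):
    """Flag two blocks as a clone pair when the combined score reaches the threshold."""
    return block_similarity(block_a, block_b) >= threshold


# Two hypothetical blocks that differ in a single arithmetic instruction.
block_add = {
    "instructions": ["iload_1", "iload_2", "iadd", "ireturn"],
    "calls": ["java/lang/Math.abs"],
}
block_sub = {
    "instructions": ["iload_1", "iload_2", "isub", "ireturn"],
    "calls": ["java/lang/Math.abs"],
}

print(block_similarity(block_add, block_sub))
print(is_clone(block_add, block_sub))
```

Scoring the call sequence separately from the instruction sequence means two blocks that share many generic load and store opcodes but invoke different methods score lower than the opcodes alone would suggest. The same scoring function could be applied to whole methods by concatenating the sequences of their blocks.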


Citations
Journal Article

A Systematic Review on Code Clone Detection

TL;DR: The review finds a need for novel approaches with complete tool support that detect all four types of clones collectively, and for more approaches that simplify the construction of a program dependency graph (PDG) when detecting Type-4 clones.
Journal Article

Detecting Java Code Clones Based on Bytecode Sequence Alignment

TL;DR: An approach based on Java bytecode is introduced, which mainly contains the steps of bytecode sequence alignment and similarity score comparison, and separately considers the similarities between instruction sequences and method call sequences, thus improving its effectiveness in detecting code clones.
Proceedings Article

DCCD: An Efficient and Scalable Distributed Code Clone Detection Technique for Big Code.

TL;DR: Presents the DCCD technique, which detects clones in big code bases based on feature extraction, and reports that it is faster, flexible, and scalable and provides 87% accurate results with authenticity, ease of accessibility, upgradeability, and maintainability.
Journal Article

IBFET: Index‐based features extraction technique for scalable code clone detection at file level granularity

TL;DR: IBFET uses the MapReduce divide-and-conquer model to detect code clones at file-level granularity at very large scale (billions of LOC), and performs preprocessing, indexing, and clone detection for more than 324 billion LOC using a Hadoop distributed environment.
Proceedings Article

Software Product Line Extraction from Bytecode Based Applications

TL;DR: A Software Product Line (SPL) extraction approach that handles legacy software systems running on the Java Virtual Machine (JVM) for which the source code is unavailable, and that factors in all input programming languages for the JVM.