scispace - formally typeset
Search or ask a question
Journal ArticleDOI

Quantifying the Effect of Code Smells on Maintenance Effort

Dag I. K. Sjøberg1, A. Yamashita1, Bente Anda1, Audris Mockus2, Tore Dybå1 
01 Aug 2013-IEEE Transactions on Software Engineering (IEEE)-Vol. 39, Iss: 8, pp 1144-1156
TL;DR: To reduce maintenance effort, a focus on reducing code size and the work practices that limit the number of changes may be more beneficial than refactoring code smells.
Abstract: Context: Code smells are assumed to indicate bad design that leads to less maintainable code. However, this assumption has not been investigated in controlled studies with professional software developers. Aim: This paper investigates the relationship between code smells and maintenance effort. Method: Six developers were hired to perform three maintenance tasks each on four functionally equivalent Java systems originally implemented by different companies. Each developer spent three to four weeks. In total, they modified 298 Java files in the four systems. An Eclipse IDE plug-in measured the exact amount of time a developer spent maintaining each file. Regression analysis was used to explain the effort using file properties, including the number of smells. Result: None of the 12 investigated smells was significantly associated with increased effort after we adjusted for file size and the number of changes; Refused Bequest was significantly associated with decreased effort. File size and the number of changes explained almost all of the modeled variation in effort. Conclusion: The effects of the 12 smells on maintenance effort were limited. To reduce maintenance effort, a focus on reducing code size and the work practices that limit the number of changes may be more beneficial than refactoring code smells.
Citations
More filters
Journal ArticleDOI
TL;DR: The findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks, and call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.
Abstract: Technical debt is a metaphor introduced by Cunningham to indicate “not quite right code which we postpone making it right”. One noticeable symptom of technical debt is represented by code smells, defined as symptoms of poor design and implementation choices. Previous studies showed the negative impact of code smells on the comprehensibility and maintainability of code. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced, what is their survivability , and how they are removed by developers. To empirically corroborate such anecdotal evidence, we conducted a large empirical study over the change history of 200 open source projects. This study required the development of a strategy to identify smell-introducing commits, the mining of over half a million of commits, and the manual analysis and classification of over 10K of them. Our findings mostly contradict common wisdom, showing that most of the smell instances are introduced when an artifact is created and not as a result of its evolution. At the same time, 80 percent of smells survive in the system. Also, among the 20 percent of removed instances, only 9 percent are removed as a direct consequence of refactoring operations.

309 citations


Cites background from "Quantifying the Effect of Code Smel..."

  • ...[43] confirmed that smells do not always constitute a problem, and that often class size impacts maintainability more than the presence of smells....

    [...]

  • ...In the past and, most notably, in recent years, several studies investigated the relevance that code smells have for developers [37], [50], the extent to which code smells tend to remain in a software system for long periods of time [3], [15], [32], [40], as well as the side effects of code smells, such as increase in change- and fault-proneness [25], [26] or decrease of software understandability [1] and maintainability [43], [49], [48]....

    [...]

Journal ArticleDOI
TL;DR: The largest experiment of applying machine learning algorithms to code smells to the best of the authors' knowledge concludes that the application of machine learning to the detection of these code smells can provide high accuracy (>96 %), and only a hundred training examples are needed to reach at least 95 % accuracy.
Abstract: Several code smell detection tools have been developed providing different results, because smells can be subjectively interpreted, and hence detected, in different ways. In this paper, we perform the largest experiment of applying machine learning algorithms to code smells to the best of our knowledge. We experiment 16 different machine-learning algorithms on four code smells (Data Class, Large Class, Feature Envy, Long Method) and 74 software systems, with 1986 manually validated code smell samples. We found that all algorithms achieved high performances in the cross-validation data set, yet the highest performances were obtained by J48 and Random Forest, while the worst performance were achieved by support vector machines. However, the lower prevalence of code smells, i.e., imbalanced data, in the entire data set caused varying performances that need to be addressed in the future studies. We conclude that the application of machine learning to the detection of these code smells can provide high accuracy (>96 %), and only a hundred training examples are needed to reach at least 95 % accuracy.

288 citations

Proceedings ArticleDOI
16 May 2015
TL;DR: The findings mostly contradict common wisdom, showing that most of the smell instances are introduced when an artifact is created and not as a result of its evolution, and at the same time, 80 percent of smells survive in the system.
Abstract: In past and recent years, the issues related to managing technical debt received significant attention by researchers from both industry and academia. There are several factors that contribute to technical debt. One of these is represented by code bad smells, i.e., symptoms of poor design and implementation choices. While the repercussions of smells on code quality have been empirically assessed, there is still only anecdotal evidence on when and why bad smells are introduced. To fill this gap, we conducted a large empirical study over the change history of 200 open source projects from different software ecosystems and investigated when bad smells are introduced by developers, and the circumstances and reasons behind their introduction. Our study required the development of a strategy to identify smell-introducing commits, the mining of over 0.5M commits, and the manual analysis of 9,164 of them (i.e., those identified as smell-introducing). Our findings mostly contradict common wisdom stating that smells are being introduced during evolutionary tasks. In the light of our results, we also call for the need to develop a new generation of recommendation systems aimed at properly planning smell refactoring activities.

245 citations


Cites background from "Quantifying the Effect of Code Smel..."

  • ...[43] confirmed that smells do not always constitute a problem, and that often class size impacts maintainability more than the presence of smells....

    [...]

  • ...In the past and, most notably, in recent years, several studies investigated the relevance that code smells have for developers [37], [50], the extent to which code smells tend to remain in a software system for long periods of time [3], [15], [32], [40], as well as the side effects of code smells, such as increase in change- and fault-proneness [25], [26] or decrease of software understandability [1] and maintainability [43], [49], [48]....

    [...]

Journal ArticleDOI
TL;DR: The results show that smells characterized by long and/or complex code (e.g., Complex Class) are highly diffused, and that smelly classes have a higher change- and fault-proneness than smell-free classes.
Abstract: Code smells are symptoms of poor design and implementation choices that may hinder code comprehensibility and maintainability. Despite the effort devoted by the research community in studying code smells, the extent to which code smells in software systems affect software maintainability remains still unclear. In this paper we present a large scale empirical investigation on the diffuseness of code smells and their impact on code change- and fault-proneness. The study was conducted across a total of 395 releases of 30 open source projects and considering 17,350 manually validated instances of 13 different code smell kinds. The results show that smells characterized by long and/or complex code (e.g., Complex Class) are highly diffused, and that smelly classes have a higher change- and fault-proneness than smell-free classes.

215 citations


Cites background from "Quantifying the Effect of Code Smel..."

  • ...Sjoberg et al (2013) investigated the impact of twelve code smells on the maintainability of software systems....

    [...]

  • ...…properties, such as program comprehensibility (Abbes et al, 2011), fault- and change-proneness (Khomh et al, 2012, 2009a; D’Ambros et al, 2010), and code maintainability (Yamashita and Moonen, 2012, 2013; Deligiannis et al, 2004; Li and Shatnawi, 2007; Olbrich et al, 2010; Sjoberg et al, 2013)....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors provide an interdisciplinary survey on challenges and state-of-the-art in evolution of automated production systems, and summarize future research directions to address the challenges of evolution in automated production system.

213 citations

References
More filters
Book
Martin Fowler1
01 Jan 1999
TL;DR: Almost every expert in Object-Oriented Development stresses the importance of iterative development, but how do you add function to the existing code base while still preserving its design integrity?
Abstract: Almost every expert in Object-Oriented Development stresses the importance of iterative development. As you proceed with the iterative development, you need to add function to the existing code base. If you are really lucky that code base is structured just right to support the new function while still preserving its design integrity. Of course most of the time we are not lucky, the code does not quite fit what we want to do. You could just add the function on top of the code base. But soon this leads to applying patch upon patch making your system more complex than it needs to be. This complexity leads to bugs, and cripples your productivity.

5,174 citations

Journal Article
TL;DR: An introduction to the R project for statistical computing (www.R-project.org) is presented to make the professional community aware of "R" as a potent and free software for graphical and statistical analysis of medical data.
Abstract: An introduction to the R project for statistical computing (www.R-project.org) is presented. The main topics are: 1. To make the professional community aware of "R" as a potent and free software for graphical and statistical analysis of medical data; 2. Simple well-known statistical tests are fairly easy to perform in R, but more complex modelling requires programming skills; 3. R is seen as a tool for teaching statistics and implementing complex modelling of medical data among medical professionals.

2,670 citations


"Quantifying the Effect of Code Smel..." refers methods in this paper

  • ...All statistical analyses were performed using R [43]....

    [...]

Journal ArticleDOI
TL;DR: This research is compared and discussed based on a number of different criteria: the refactoring activities that are supported, the specific techniques and formalisms that are used for supporting these activities, the types of software artifacts that are being refactored, the important issues that need to be taken into account when buildingRefactoring tool support, and the effect of refactors on the software process.
Abstract: We provide an extensive overview of existing research in the field of software refactoring. This research is compared and discussed based on a number of different criteria: the refactoring activities that are supported, the specific techniques and formalisms that are used for supporting these activities, the types of software artifacts that are being refactored, the important issues that need to be taken into account when building refactoring tool support, and the effect of refactoring on the software process. A running example is used to explain and illustrate the main concepts.

1,206 citations


Additional excerpts

  • ...In total, they modified 298 Java files in the four systems....

    [...]

Journal ArticleDOI
TL;DR: Although there are a set of fault prediction studies in which confidence is possible, more studies are needed that use a reliable methodology and which report their context, methodology, and performance comprehensively.
Abstract: Background: The accurate prediction of where faults are likely to occur in code can help direct test effort, reduce costs, and improve the quality of software. Objective: We investigate how the context of models, the independent variables used, and the modeling techniques applied influence the performance of fault prediction models. Method: We used a systematic literature review to identify 208 fault prediction studies published from January 2000 to December 2010. We synthesize the quantitative and qualitative results of 36 studies which report sufficient contextual and methodological information according to the criteria we develop and apply. Results: The models that perform well tend to be based on simple modeling techniques such as Naive Bayes or Logistic Regression. Combinations of independent variables have been used by models that perform well. Feature selection has been applied to these combinations when models are performing particularly well. Conclusion: The methodology used to build models seems to be influential to predictive performance. Although there are a set of fault prediction studies in which confidence is possible, more studies are needed that use a reliable methodology and which report their context, methodology, and performance comprehensively.

1,012 citations

Journal ArticleDOI
TL;DR: DETEX is proposed, a method that embodies and defines all the steps necessary for the specification and detection of code and design smells, and a detection technique that instantiates this method, and an empirical validation in terms of precision and recall of DETEX.
Abstract: Code and design smells are poor solutions to recurring implementation and design problems. They may hinder the evolution of a system by making it hard for software engineers to carry out changes. We propose three contributions to the research field related to code and design smells: (1) DECOR, a method that embodies and defines all the steps necessary for the specification and detection of code and design smells, (2) DETEX, a detection technique that instantiates this method, and (3) an empirical validation in terms of precision and recall of DETEX. The originality of DETEX stems from the ability for software engineers to specify smells at a high level of abstraction using a consistent vocabulary and domain-specific language for automatically generating detection algorithms. Using DETEX, we specify four well-known design smells: the antipatterns Blob, Functional Decomposition, Spaghetti Code, and Swiss Army Knife, and their 15 underlying code smells, and we automatically generate their detection algorithms. We apply and validate the detection algorithms in terms of precision and recall on XERCES v2.7.0, and discuss the precision of these algorithms on 11 open-source systems.

710 citations


"Quantifying the Effect of Code Smel..." refers methods in this paper

  • ...This study was conducted on four different but functionally equivalent (with the same requirements specifications) web-based information systems originally implemented (primarily in Java) by different contractors [3]....

    [...]

  • ...Method: Six developers were hired to perform three maintenance tasks each on four functionally equivalent Java systems originally implemented by different companies....

    [...]

  • ...We hired six developers who performed three maintenance tasks each on two of four functionally equivalent but independently developed Java systems....

    [...]

  • ...The 12 smells investigated in this study were those that these tools detected in the four Java systems used in our study....

    [...]

  • ...Furthermore, the scope of this study was real, albeit small, web-based information systems primarily implemented in Java....

    [...]