
Showing papers in "Information & Software Technology in 2015"


Journal ArticleDOI
TL;DR: There was a need to provide an update on how to conduct systematic mapping studies and to revise the guidelines based on the lessons learned from existing systematic maps and SLR guidelines.
Abstract: Context Systematic mapping studies are used to structure a research area, while systematic reviews are focused on gathering and synthesizing evidence. The most recent guidelines for systematic mapping are from 2008. Since that time, many suggestions have been made of how to improve systematic literature reviews (SLRs). There is a need to evaluate how researchers conduct the process of systematic mapping and identify how the guidelines should be updated based on the lessons learned from the existing systematic maps and SLR guidelines. Objective To identify how the systematic mapping process is conducted (including search, study selection, analysis and presentation of data, etc.); to identify potential improvements in conducting the systematic mapping process and to update the guidelines accordingly. Method We conducted a systematic mapping study of systematic maps, considering some practices of systematic review guidelines as well (in particular in relation to defining the search and conducting a quality assessment). Results In a large number of studies multiple guidelines are used and combined, which leads to different ways of conducting mapping studies. The reason for combining guidelines was that they differed in the recommendations given. Conclusion The most frequently followed guidelines are not sufficient alone. Hence, there was a need to provide an update of how to conduct systematic mapping studies. New guidelines have been proposed consolidating existing findings.

1,598 citations


Journal ArticleDOI
TL;DR: A systematic mapping of the field of gamification in software engineering is carried out in an attempt to characterize the state of the art of this field, identifying gaps and opportunities for further research.
Abstract: Context Gamification seeks to improve the user's engagement, motivation, and performance when carrying out a certain task by incorporating game mechanics and elements, thus making that task more attractive. Much research work has studied the application of gamification in software engineering for increasing the engagement and results of developers. Objective The objective of this paper is to carry out a systematic mapping of the field of gamification in software engineering in an attempt to characterize the state of the art of this field, identifying gaps and opportunities for further research. Method We carried out a systematic mapping with a view to finding the primary studies in the existing literature, which were later classified and analyzed according to four criteria: the software process area addressed, the gamification elements used, the type of research method followed, and the type of forum in which they were published. A subjective evaluation of the studies was also carried out to evaluate them in terms of methodology, empirical evidence, integration with the organization, and replicability. Results As a result of the systematic mapping we found 29 primary studies, published between January 2011 and June 2014. Most of them focus on software development, and to a lesser extent, requirements, project management, and other support areas. In the main, they consider very simple gamification mechanics such as points and badges, and few provide empirical evidence of the impact of gamification. Conclusions Existing research in the field is quite preliminary, and more research effort analyzing the impact of gamification in SE would be needed. Future research work should look at other game mechanics in addition to the basic ones and should tackle software process areas that have not been fully studied, such as requirements, project management, maintenance, or testing. Most studies share a lack of methodological support that would make their proposals replicable in other settings. The integration of gamification with an organization's existing tools is also an important challenge that needs to be taken up in this field.

312 citations


Journal ArticleDOI
TL;DR: Tackling software data issues, including redundancy, correlation, feature irrelevance and missing samples, with the proposed combined learning model resulted in remarkable classification performance, paving the way for successful quality control.
Abstract: Context Several issues hinder software defect data, including redundancy, correlation, feature irrelevance and missing samples. It is also hard to ensure a balanced distribution between data pertaining to defective and non-defective software. In most experimental cases, data related to the latter software class is dominantly present in the dataset. Objective The objectives of this paper are to demonstrate the positive effects of combining feature selection and ensemble learning on the performance of defect classification. Along with efficient feature selection, a new two-variant (with and without feature selection) ensemble learning algorithm is proposed to provide robustness to both data imbalance and feature redundancy. Method We carefully combine selected ensemble learning models with efficient feature selection to address these issues and mitigate their effects on the defect classification performance. Results Forward selection showed that only a few features contribute to high area under the receiver-operating curve (AUC). On the tested datasets, the greedy forward selection (GFS) method outperformed other feature selection techniques such as Pearson’s correlation. This suggests that features are highly unstable. However, ensemble learners like random forests and the proposed algorithm, average probability ensemble (APE), are not as affected by poor features as in the case of weighted support vector machines (W-SVMs). Moreover, the APE model combined with greedy forward selection (enhanced APE) achieved AUC values of approximately 1.0 for the NASA datasets: PC2, PC4, and MC1. Conclusion This paper shows that features of a software dataset must be carefully selected for accurate classification of defective components. Furthermore, tackling the software data issues mentioned above with the proposed combined learning model resulted in remarkable classification performance, paving the way for successful quality control.

293 citations
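The enhanced APE described above couples greedy forward selection, scored by AUC, with an ensemble that averages the class probabilities of several learners. The sketch below is only an illustration of that general shape, assuming synthetic data and an arbitrary pair of scikit-learn base learners rather than the paper's exact learner set or datasets.

```python
# Illustrative sketch of greedy forward feature selection driven by AUC,
# feeding an "average probability" ensemble. The learner mix and data are
# placeholders, not the paper's exact configuration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=12, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

def ensemble_auc(cols):
    """Average predicted probabilities of several learners (APE-style)."""
    learners = [RandomForestClassifier(n_estimators=50, random_state=0),
                LogisticRegression(max_iter=1000)]
    probs = []
    for clf in learners:
        clf.fit(X_tr[:, cols], y_tr)
        probs.append(clf.predict_proba(X_te[:, cols])[:, 1])
    return roc_auc_score(y_te, np.mean(probs, axis=0))

selected, remaining, best_auc = [], list(range(X.shape[1])), 0.0
while remaining:                              # greedy forward selection
    scores = {f: ensemble_auc(selected + [f]) for f in remaining}
    f_best, auc = max(scores.items(), key=lambda kv: kv[1])
    if auc <= best_auc:                       # stop when no feature improves AUC
        break
    selected.append(f_best)
    remaining.remove(f_best)
    best_auc = auc
print("selected features:", selected, "AUC:", round(best_auc, 3))
```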


Journal ArticleDOI
Peng He, Bing Li, Xiao Liu, Jun Chen, Yutao Ma
TL;DR: The experimental results indicate that the choice of training data for defect prediction should depend on the specific requirement of accuracy, and that the minimum metric subset can be identified to facilitate general defect prediction with acceptable loss of prediction precision in practice.
Abstract: Context Software defect prediction plays a crucial role in estimating the most defect-prone components of software, and a large number of studies have pursued improving prediction accuracy within a project or across projects. However, the rules for making an appropriate decision between within- and cross-project defect prediction when available historical data are insufficient remain unclear. Objective The objective of this work is to validate the feasibility of the predictor built with a simplified metric set for software defect prediction in different scenarios, and to investigate practical guidelines for the choice of training data, classifier and metric subset of a given project. Method First, based on six typical classifiers, three types of predictors using the size of software metric set were constructed in three scenarios. Then, we validated the acceptable performance of the predictor based on Top-k metrics in terms of statistical methods. Finally, we attempted to minimize the Top-k metric subset by removing redundant metrics, and we tested the stability of such a minimum metric subset with one-way ANOVA tests. Results The study has been conducted on 34 releases of 10 open-source projects available at the PROMISE repository. The findings indicate that the predictors built with either Top-k metrics or the minimum metric subset can provide an acceptable result compared with benchmark predictors. The guideline for choosing a suitable simplified metric set in different scenarios is presented in Table 12. Conclusion The experimental results indicate that (1) the choice of training data for defect prediction should depend on the specific requirement of accuracy; (2) the predictor built with a simplified metric set works well and is very useful in case limited resources are supplied; (3) simple classifiers (e.g., Naive Bayes) also tend to perform well when using a simplified metric set for defect prediction; and (4) in several cases, the minimum metric subset can be identified to facilitate the procedure of general defect prediction with acceptable loss of prediction precision in practice.

255 citations
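As a rough illustration of the Top-k idea in the abstract above, the sketch below ranks metrics by a simple criterion (mutual information, chosen here only as a stand-in for the paper's ranking), keeps the top k, and evaluates a Naive Bayes predictor on the reduced set; the data and parameters are synthetic placeholders.

```python
# Sketch: rank software metrics, keep the Top-k, and train a simple
# classifier (Naive Bayes) on the reduced set. The ranking criterion and
# dataset are illustrative stand-ins, not the paper's exact procedure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=1)
k = 3                                         # size of the simplified metric set
scores = mutual_info_classif(X, y, random_state=1)
top_k = np.argsort(scores)[::-1][:k]          # indices of the Top-k metrics

nb = GaussianNB()
auc = cross_val_score(nb, X[:, top_k], y, cv=5, scoring="roc_auc").mean()
print("Top-k metric indices:", top_k.tolist(), "mean AUC:", round(auc, 3))
```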


Journal ArticleDOI
TL;DR: The results have revealed that UI does contribute positively to system success, but it is a double-edged sword: if not managed carefully, it may cause more problems than benefits.
Abstract: Context For more than four decades it has been intuitively accepted that user involvement (UI) during the system development lifecycle leads to system success. However, when researchers have evaluated the user involvement and system success (UI-SS) relationship empirically, the results have not always been positive. Objective Our objective was to explore the UI-SS relationship by synthesizing the results of all the studies that have empirically investigated this complex phenomenon. Method We performed a Systematic Literature Review (SLR) following the steps provided in the guidelines of Evidence Based Software Engineering. From the resulting studies we extracted data to answer our 9 research questions related to the UI-SS relationship, the identification of users, perspectives of UI, benefits, problems and challenges of UI, degree and level of UI, relevance of stages of the software development lifecycle (SDLC), and the research method employed on the UI-SS relationship. Results Our systematic review resulted in selecting 87 empirical studies published during the period 1980–2012. Among the 87 studies reviewed, 52 reported that UI positively contributes to system success, 12 suggested a negative contribution and 23 were uncertain. The UI-SS relationship is neither direct nor binary, and there are various confounding factors that play their role. The identification of users, their degree/level of involvement, the stage of SDLC for UI, and the choice of research method have been claimed to have an impact on the UI-SS relationship. However, there is not sufficient empirical evidence available to support these claims. Conclusion Our results have revealed that UI does contribute positively to system success. However, it is a double-edged sword, and if not managed carefully it may cause more problems than benefits. Based on the analysis of 87 studies, we were able to identify factors for effective management of UI, alluding to the causes of inconsistency in the results of the published literature.

230 citations


Journal ArticleDOI
TL;DR: This paper investigates the following research question: Which principles constitute a user-centered agile software development approach? It identifies generic principles of UCASD and associates them with specific practices and processes.
Abstract: Context In the last decade, software development has been characterized by two major approaches: agile software development, which aims to achieve increased velocity and flexibility during the development process, and user-centered design, which places the goals and needs of the system's end-users at the center of software development in order to deliver software with appropriate usability. Hybrid development models, referred to as user-centered agile software development (UCASD) in this article, propose to combine the merits of both approaches in order to design software that is both useful and usable. Objective This paper aims to capture the current state of the art in UCASD approaches and to derive generic principles from these approaches. More specifically, we investigate the following research question: Which principles constitute a user-centered agile software development approach? Method We conduct a systematic review of the literature on UCASD. Identified works are analyzed using a coding scheme that differentiates four levels of UCASD: the process, practices, people/social and technology dimensions. Through subsequent synthesis, we derive generic principles of UCASD. Results We identified and analyzed 83 relevant publications. The analysis resulted in a comprehensive coding system and five principles for UCASD: (1) separate product discovery and product creation, (2) iterative and incremental design and development, (3) parallel interwoven creation tracks, (4) continuous stakeholder involvement, and (5) artifact-mediated communication. Conclusion Our paper contributes to the software development body of knowledge by (1) providing a broad overview of existing works in the area of UCASD, (2) deriving an analysis framework (in the form of a coding system) for works in this area, going beyond former classifications, and (3) identifying generic principles of UCASD and associating them with specific practices and processes.

228 citations


Journal ArticleDOI
TL;DR: A systematic literature review of papers reporting empirical evidence regarding the barriers that newcomers face when contributing to open source software (OSS) projects identified 20 studies providing such evidence.
Abstract: Context Numerous open source software projects are based on volunteers' collaboration and require a continuous influx of newcomers for their continuity. Newcomers face barriers that can lead them to give up. These barriers hinder both developers willing to make a single contribution and those willing to become a project member. Objective This study aims to identify and classify the barriers that newcomers face when contributing to open source software projects. Method We conducted a systematic literature review of papers reporting empirical evidence regarding the barriers that newcomers face when contributing to open source software (OSS) projects. We retrieved 291 studies by querying 4 digital libraries. Twenty studies were identified as primary. We performed a backward snowballing approach, and searched for other papers published by the authors of the selected papers to identify potential studies. Then, we used a coding approach inspired by open coding and axial coding procedures from Grounded Theory to categorize the barriers reported by the selected studies. Results We identified 20 studies providing empirical evidence of barriers faced by newcomers to OSS projects while making a contribution. From the analysis, we identified 15 different barriers, which we grouped into five categories: social interaction, newcomers' previous knowledge, finding a way to start, documentation, and technical hurdles. We also classified the problems with regard to their origin: newcomers, community, or product. Conclusion The results are useful to researchers and OSS practitioners willing to investigate or to implement tools to support newcomers. We mapped technical and non-technical barriers that hinder newcomers' first contributions. The most evidenced barriers are related to socialization, appearing in 75% (15 out of 20) of the studies analyzed, with a high focus on interactions in mailing lists (receiving answers and socialization with other members). There is a lack of in-depth studies on technical issues, such as code issues. We also noticed that the majority of the studies relied on historical data gathered from software repositories and that there was a lack of experiments and qualitative studies in this area.

179 citations


Journal ArticleDOI
TL;DR: It is shown that although Agile teams use many metrics suggested in the Agile literature, they also use many custom metrics, and the most influential metrics in the primary studies are Velocity and Effort estimate.
Abstract: Context The software industry has widely adopted Agile software development methods. The Agile literature proposes a few key metrics, but little is known of the actual metrics use in Agile teams. Objective The objective of this paper is to increase knowledge of the reasons for and effects of using metrics in industrial Agile development. We focus on the metrics that Agile teams use, rather than the ones used from outside by software engineering researchers. In addition, we analyse the influence of the used metrics. Method This paper presents a systematic literature review (SLR) on using metrics in industrial Agile software development. We identified 774 papers, which we reduced to 30 primary studies through our paper selection process. Results The results indicate that the reasons for and the effects of using metrics are focused on the following areas: sprint planning, progress tracking, software quality measurement, fixing software process problems, and motivating people. Additionally, we show that although Agile teams use many metrics suggested in the Agile literature, they also use many custom metrics. Finally, the most influential metrics in the primary studies are Velocity and Effort estimate. Conclusion The use of metrics in Agile software development is similar to that in traditional software development. Projects and sprints need to be planned and tracked. Quality needs to be measured. Problems in the process need to be identified and fixed. Future work should focus on metrics that had high importance but low prevalence in our study, as they can offer the largest impact to the software industry.

167 citations


Journal ArticleDOI
TL;DR: A glossary of terms and a classification scheme for financial approaches used for managing technical debt are introduced; these are expected to benefit the communication between technical managers and project managers and to help in setting up quality-related goals during software development.
Abstract: Context Technical debt is a software engineering metaphor, referring to the eventual financial consequences of trade-offs between shrinking product time to market and poorly specifying, or implementing, a software product, throughout all development phases. Based on its inter-disciplinary nature, i.e. software engineering and economics, research on managing technical debt should be balanced between software engineering and economic theories. Objective The aim of this study is to analyze research efforts on technical debt, by focusing on their financial aspect. Specifically, the analysis is carried out with respect to: (a) how financial aspects are defined in the context of technical debt and (b) how they relate to the underlying software engineering concepts. Method In order to achieve the abovementioned goals, we employed a standard method for SLRs and applied it to studies retrieved from seven general-scope digital libraries. In total we selected 69 studies relevant to the financial aspect of technical debt. Results The most common financial terms that are used in technical debt research are principal and interest, whereas the financial approaches that have been most frequently applied for managing technical debt are real options, portfolio management, cost/benefit analysis and value-based analysis. However, the application of such approaches lacks consistency, i.e., the same approach is applied differently in different studies, and in some cases lacks a clear mapping between financial and software engineering concepts. Conclusion The results are expected to prove beneficial for the communication between technical managers and project managers, in the sense that they will provide a common vocabulary and will help in setting up quality-related goals during software development. To achieve this we introduce: (a) a glossary of terms and (b) a classification scheme for financial approaches used for managing technical debt. Based on these, we have been able to underline interesting implications for researchers and practitioners.

159 citations


Journal ArticleDOI
TL;DR: A systematic mapping of studies whose primary goal is to develop or improve ASEE techniques, published in the period 1990–2012, revealed that most researchers focus on addressing problems related to the first step of an ASEE process, that is, feature and case subset selection.
Abstract: Context Analogy-based Software development Effort Estimation (ASEE) techniques have gained considerable attention from the software engineering community. However, existing systematic map and review studies on software development effort prediction have not investigated in depth several issues of ASEE techniques, with the exception of comparisons with other types of estimation techniques. Objective The objective of this research is twofold: (1) to classify ASEE studies whose primary goal is to propose new or modified ASEE techniques according to five criteria: research approach, contribution type, techniques used in combination with ASEE methods, and ASEE steps, as well as identifying publication channels and trends; and (2) to analyze these studies from five perspectives: estimation accuracy, accuracy comparison, estimation context, impact of the techniques used in combination with ASEE methods, and ASEE tools. Method We performed a systematic mapping of studies for which the primary goal is to develop or to improve ASEE techniques published in the period 1990–2012, and reviewed them based on an automated search of four electronic databases. Results In total, we identified 65 studies published between 1990 and 2012, and classified them based on our predefined classification criteria. The mapping study revealed that most researchers focus on addressing problems related to the first step of an ASEE process, that is, feature and case subset selection. The results of our detailed analysis show that ASEE methods outperform the eight techniques with which they were compared, and tend to yield acceptable results especially when combining ASEE techniques with Fuzzy Logic (FL) or Genetic Algorithms (GA). Conclusion Based on the findings of this study, the use of other techniques such as FL and GA in combination with an ASEE method is promising for generating more accurate estimates. However, the use of ASEE techniques by practitioners is still limited: developing more ASEE tools may facilitate the application of these techniques and thereby increase their use in industry.

156 citations
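For readers unfamiliar with the core ASEE mechanism that these studies refine, the following minimal sketch estimates a new project's effort as the mean effort of its k most similar historical projects; the feature set, normalisation, similarity measure, and numbers are illustrative assumptions, not taken from any of the mapped studies.

```python
# Minimal analogy-based effort estimation (ASEE) sketch: estimate a new
# project's effort as the mean effort of its k most similar past projects.
# Feature values and efforts below are made-up illustrations.
import numpy as np

# columns: size (KLOC), team experience (years), number of requirements
past_projects = np.array([[12.0, 3.0, 40],
                          [45.0, 5.0, 120],
                          [8.0,  2.0, 25],
                          [30.0, 4.0, 90]])
past_effort = np.array([320.0, 1500.0, 200.0, 950.0])   # person-hours

def estimate(new_project, k=2):
    # min-max normalise so no single feature dominates the distance
    lo, hi = past_projects.min(axis=0), past_projects.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    p_norm = (past_projects - lo) / scale
    q_norm = (np.asarray(new_project) - lo) / scale
    dist = np.linalg.norm(p_norm - q_norm, axis=1)       # similarity measure
    nearest = np.argsort(dist)[:k]                       # case subset selection
    return past_effort[nearest].mean()

print("estimated effort:", estimate([20.0, 3.5, 60]))
```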


Journal ArticleDOI
TL;DR: It is concluded that the advent of new eye-trackers makes the use of these tools easier and less obtrusive and that the software engineering community could benefit more from this technology.
Abstract: Context Eye-tracking is a means of collecting evidence regarding participants' cognitive processes. Eye-trackers monitor participants' visual attention by collecting eye-movement data. These data are useful to get insights into participants' cognitive processes during reasoning tasks. Objective The Evidence-based Software Engineering (EBSE) paradigm was proposed in 2004 and, since then, has been used to provide detailed insights regarding different topics in software engineering research and practice. Systematic Literature Reviews (SLR) are also useful in the context of EBSE by bringing together all existing evidence of research and results about a particular topic. This SLR evaluates the current state of the art of using eye-trackers in software engineering and provides evidence on the uses and contributions of eye-trackers to empirical studies in software engineering. Method We perform a SLR covering eye-tracking studies in software engineering published from 1990 up to the end of 2014. To search all recognised resources, instead of applying manual search, we perform an extensive automated search using Engineering Village. We identify 36 relevant publications, including nine journal papers, two workshop papers, and 25 conference papers. Results The software engineering community started using eye-trackers in the 1990s and they have become increasingly recognised as useful tools to conduct empirical studies from 2006. We observe that researchers use eye-trackers to study model comprehension, code comprehension, debugging, collaborative interaction, and traceability. Moreover, we find that studies use different metrics based on eye-movement data to obtain quantitative measures. We also report the limitations of current eye-tracking technology, which threaten the validity of previous studies, along with suggestions to mitigate these limitations. Conclusion Notwithstanding these limitations and threats, we conclude that the advent of new eye-trackers makes the use of these tools easier and less obtrusive and that the software engineering community could benefit more from this technology.

Journal ArticleDOI
TL;DR: It is concluded that organisations need to be well prepared to handle technical and social adoption challenges with their existing expertise, processes and tools before adopting the CD process.
Abstract: Context Continuous Deployment (CD) is an emerging software development process, with organisations such as Facebook, Microsoft, and IBM successfully implementing and using the process. The CD process aims to immediately deploy software to customers as soon as new code is developed, and can result in a number of benefits for organisations, such as new business opportunities, reduced risk for each release, and the prevention of wasted software development. There is little academic literature on the challenges organisations face when adopting the CD process; however, organisations have voiced many anecdotal challenges on their online blogs. Objective The aim of this research is to examine the challenges faced by organisations when adopting CD as well as the strategies to mitigate these challenges. Method An explorative case study technique that involves in-depth interviews with software practitioners in an organisation that has adopted CD was conducted to identify these challenges. Results This study found a total of 20 technical and social adoption challenges that organisations may face when adopting the CD process. The results are discussed to gain a deeper understanding of the strategies employed by organisations to mitigate the impacts of these challenges. Conclusion While a number of individual technical and social adoption challenges were uncovered by the case study in this research, most challenges were not faced in isolation. The severity of these challenges was reduced by a number of mitigation strategies employed by the case study organisation. It is concluded that organisations need to be well prepared to handle technical and social adoption challenges with their existing expertise, processes and tools before adopting the CD process. For practitioners, knowing how to address the challenges an organisation may face when adopting the CD process provides a level of awareness that they previously may not have had.

Journal ArticleDOI
TL;DR: Agile methodologies can be used by companies to reduce the effort of getting to levels 2 and 3 of CMMI; however, according to the studies, agile methodologies alone were not sufficient to obtain a rating at a given level.
Abstract: Background The search for adherence to maturity levels by using lightweight processes that require low levels of effort is regarded as a challenge for software development organizations. Objective This study seeks to evaluate, synthesize, and present results on the use of the Capability Maturity Model Integration (CMMI) in combination with agile software development, and thereafter to give an overview of the topics researched, which includes a discussion of their benefits and limitations, the strength of the findings, and the implications for research and practice. Methods The method applied was a Systematic Literature Review on studies published up to (and including) 2011. Results The search strategy identified 3193 results, of which 81 were included as studies on the use of CMMI together with agile methodologies. The benefits found were grouped into two main categories: those related to the organization in general and those related to the development process, and were organized into subcategories according to the area to which they refer. The limitations were also grouped into these categories. Using the criteria defined, the strength of the evidence found was considered low. The implications of the results for research and practice are discussed. Conclusion Agile methodologies can be used by companies to reduce the effort involved in getting to levels 2 and 3 of CMMI, there even being reports of applying agile practices that led to achieving level 5. However, agile methodologies alone, according to the studies, were not sufficient to obtain a rating at a given level, it being necessary to resort to additional practices to do so.

Journal ArticleDOI
TL;DR: A novel algorithm named Double Transfer Boosting (DTB) is introduced to improve the performance of cross-company defect prediction by reducing negative samples in CC data; experimental results demonstrate that it could be an effective model for early software defect detection.
Abstract: Context Software defect prediction has been widely studied based on various machine-learning algorithms. Previous studies usually focus on within-company defects prediction (WCDP), but lack of training data in the early stages of software testing limits the efficiency of WCDP in practice. Thus, recent research has largely examined the cross-company defects prediction (CCDP) as an alternative solution. Objective However, the gap of different distributions between cross-company (CC) data and within-company (WC) data usually makes it difficult to build a high-quality CCDP model. In this paper, a novel algorithm named Double Transfer Boosting (DTB) is introduced to narrow this gap and improve the performance of CCDP by reducing negative samples in CC data. Method The proposed DTB model integrates two levels of data transfer: first, the data gravitation method reshapes the whole distribution of CC data to fit WC data. Second, the transfer boosting method employs a small ratio of labeled WC data to eliminate negative instances in CC data. Results The empirical evaluation was conducted based on 15 publicly available datasets. CCDP experiment results indicated that the proposed model achieved better overall performance than compared CCDP models. DTB was also compared to WCDP in two different situations. Statistical analysis suggested that DTB performed significantly better than WCDP models trained by limited samples and produced comparable results to WCDP with sufficient training data. Conclusions DTB reforms the distribution of CC data from different levels to improve the performance of CCDP, and experimental results and analysis demonstrate that it could be an effective model for early software defects detection.
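The abstract's second transfer step, down-weighting cross-company instances that a small within-company sample contradicts, follows the general transfer-boosting (TrAdaBoost-style) pattern sketched below. The data-gravitation step, the exact DTB weight-update rules, and the datasets are not reproduced here; this is only an assumed, simplified illustration on synthetic data.

```python
# Simplified transfer-boosting sketch: cross-company (source) instances that
# keep getting misclassified lose weight, misclassified within-company
# (target) instances gain weight. Not the paper's DTB algorithm.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

Xs, ys = make_classification(n_samples=400, n_features=10, random_state=0)  # CC data
Xt, yt = make_classification(n_samples=60, n_features=10, random_state=1)   # small WC sample

X, y = np.vstack([Xs, Xt]), np.hstack([ys, yt])
n_s, n_rounds = len(ys), 10
w = np.ones(len(y)) / len(y)
learners, betas = [], []

for _ in range(n_rounds):
    clf = DecisionTreeClassifier(max_depth=2).fit(X, y, sample_weight=w)
    miss = clf.predict(X) != y
    err_t = w[n_s:][miss[n_s:]].sum() / max(w[n_s:].sum(), 1e-12)
    err_t = min(max(err_t, 1e-6), 0.499)                 # keep beta_t in (0, 1)
    beta_t = err_t / (1 - err_t)
    beta_s = 1.0 / (1.0 + np.sqrt(2 * np.log(n_s) / n_rounds))
    w[:n_s][miss[:n_s]] *= beta_s        # down-weight misclassified source instances
    w[n_s:][miss[n_s:]] /= beta_t        # up-weight misclassified target instances
    w /= w.sum()
    learners.append(clf)
    betas.append(beta_t)

def predict(Xq):
    """Simple weighted vote of the weak learners."""
    votes = sum(np.log(1 / b) * clf.predict(Xq) for clf, b in zip(learners, betas))
    return (votes > 0.5 * sum(np.log(1 / b) for b in betas)).astype(int)

print("WC training accuracy:", (predict(Xt) == yt).mean())
```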

Journal ArticleDOI
TL;DR: The results indicate that data preprocessing techniques may significantly influence the final prediction of software cost estimation methods, and sometimes might have negative impacts on prediction performance of ML methods.
Abstract: Context Due to the complex nature of the software development process, traditional parametric models and statistical methods often appear to be inadequate to model the increasingly complicated relationship between project development cost and the project features (or cost drivers). Machine learning (ML) methods, with several reported successful applications, have gained popularity for software cost estimation in recent years. Data preprocessing has been claimed by many researchers as a fundamental stage of ML methods; however, very few works have focused on the effects of data preprocessing techniques. Objective This study aims for an empirical assessment of the effectiveness of data preprocessing techniques on ML methods in the context of software cost estimation. Method In this work, we first conduct a literature survey of the recent publications using data preprocessing techniques, followed by a systematic empirical study to analyze the strengths and weaknesses of individual data preprocessing techniques as well as their combinations. Results Our results indicate that data preprocessing techniques may significantly influence the final prediction. They sometimes might have negative impacts on the prediction performance of ML methods. Conclusion In order to reduce prediction errors and improve efficiency, a careful selection is necessary, according to the characteristics of the machine learning methods as well as the datasets used for software cost estimation.
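A hedged illustration of the kind of comparison the study describes: the same ML estimator is evaluated with different preprocessing choices, and the error can change noticeably. The preprocessing options, learner, and synthetic effort data below are assumptions for demonstration, not the paper's experimental setup.

```python
# Sketch: measure how a preprocessing choice changes the error of an ML
# effort estimator. Data and pipelines are illustrative placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler, FunctionTransformer
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
size = rng.uniform(2, 400, 150)          # KLOC: dominates raw distances
team = rng.uniform(1, 10, 150)           # team experience: tiny numeric range
effort = 3.0 * size + 40.0 * team + rng.normal(0, 25, 150)
X = np.column_stack([size, team])

candidates = {
    "raw features": make_pipeline(KNeighborsRegressor(n_neighbors=3)),
    "standardised": make_pipeline(StandardScaler(),
                                  KNeighborsRegressor(n_neighbors=3)),
    "log + standardised": make_pipeline(FunctionTransformer(np.log1p),
                                        StandardScaler(),
                                        KNeighborsRegressor(n_neighbors=3)),
}
for name, model in candidates.items():
    mae = -cross_val_score(model, X, effort, cv=5,
                           scoring="neg_mean_absolute_error").mean()
    print(f"{name:20s} MAE = {mae:7.1f}")
```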

Journal ArticleDOI
TL;DR: Empirical evidence is summarized on the impact of global dispersion dimensions on coordination, team performance and project outcomes to reveal several opportunities for future research, such as coordination challenges in inter-organizational software projects.
Abstract: Context Global software development (GSD) contains different context-setting dimensions, which are essential for effective teamwork and the success of projects. Although considerable research effort has been made in this area, as yet, no agreement has been reached about the impact of these dispersion dimensions on team coordination and project outcomes. Objective This paper summarizes empirical evidence on the impact of global dispersion dimensions on coordination, team performance and project outcomes. Method We performed a systematic literature review of 46 publications from 25 journals and 19 conference and workshop proceedings, which were published between 2001 and 2013. Thematic analysis was used to identify global dimensions and their measures. Vote counting was used to decide on the impact trends of dispersion dimensions on team performance and software quality. Results Global dispersion dimensions are consistently conceptualized, but quantified in many different ways. Different dispersion dimensions are associated with a distinct set of coordination challenges. Overall, geographical dispersion tends to have a negative impact on team performance and software quality. Temporal dispersion tends to have a negative impact on software quality, but its impact on team performance is inconsistent and can be explained by the type of performance. Conclusion For researchers, we reveal several opportunities for future research, such as coordination challenges in inter-organizational software projects, the impact of mismatches in processes and practices on project outcomes, the evolution of coordination needs and mechanisms over time, and the impact of dispersion dimensions on open source project outcomes. For practitioners, we suggest considering the trade-off between cost and benefits while dispersing tasks, the alignment of dispersion dimensions with individual and organizational objectives, coordination mechanisms as situational approaches, and the collocation of development activities of components with high quality demands in GSD projects.
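As a toy illustration of the vote-counting step mentioned in the method, each primary study's reported direction of effect is tallied per dispersion dimension and the majority becomes the trend; the studies and counts below are invented placeholders, not the review's actual data.

```python
# Toy vote counting over primary studies: tally each study's reported
# direction of effect and take the majority as the trend per dimension.
from collections import Counter

findings = {  # dimension -> reported effects on team performance (invented)
    "geographical": ["negative", "negative", "none", "negative", "positive"],
    "temporal":     ["negative", "positive", "none", "negative"],
}
for dimension, votes in findings.items():
    tally = Counter(votes)
    trend, _ = tally.most_common(1)[0]
    print(f"{dimension:12s} votes={dict(tally)} -> trend: {trend}")
```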

Journal ArticleDOI
TL;DR: There are still problems in the components of EAIM, including a lack of tool support for all parts of EA implementation and deficiencies in addressing EAIM practices, especially in modeling, management, and maintenance.
Abstract: Context Enterprise Architecture (EA) is a strategy to align business and Information Technology (IT) within an enterprise. EA is managed, developed, and maintained through the EA Implementation Methodology (EAIM). Objective The aims of this study are to identify the existing effective practices that are used by existing EAIMs, identify the factors that affect the effectiveness of EAIM, identify the current tools that are used by existing EAIMs, and identify the open problems and areas related to EAIM for improvement. Method A Systematic Literature Review (SLR) was carried out. 669 papers were retrieved by a manual search in 6 databases and 46 primary studies were finally included. Results Of these studies, 33% were journal articles, 41% were conference papers, and 26% were book chapters. Consequently, 28 practices, 19 factors, and 15 tools were identified and analysed. Conclusion Several rigorous studies have been carried out in order to provide effective EAIM; however, there are still problems in the components of EAIM, including: a lack of tool support for all parts of EA implementation, deficiencies in addressing EAIM practices (especially in modeling, management, and maintenance), a lack of consideration of non-functional requirements in existing EAIMs, and no appropriate consideration of requirements analysis in most existing EAIMs. This review provides researchers with some guidelines for future research on this topic. It also provides broad information on EAIM, which could be useful for practitioners.

Journal ArticleDOI
TL;DR: There is no single solution that can effectively mitigate XSS attacks, and more research is needed in the area of vulnerability removal from the source code of the applications before deployment.
Abstract: Context: Cross-site scripting (XSS) is a security vulnerability that affects web applications. It occurs due to improper sanitization, or a lack of sanitization, of user inputs. This security vulnerability has caused many problems for users and server applications. Objective: To conduct a systematic literature review on the studies done on XSS vulnerabilities and attacks. Method: We followed the standard guidelines for systematic literature review as documented by Barbara Kitchenham and reviewed a total of 115 studies related to cross-site scripting from various journals and conference proceedings. Results: Research on XSS is still very active, with publications across many conference proceedings and journals. Attack prevention and vulnerability detection are the areas focused on by most of the studies. Dynamic analysis techniques form the majority among the solutions proposed by the various studies. The type of XSS addressed the most is reflected XSS. Conclusion: XSS still remains a big problem for web applications, despite the bulk of solutions provided so far. There is no single solution that can effectively mitigate XSS attacks. More research is needed in the area of vulnerability removal from the source code of applications before deployment.
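As a minimal illustration of the sanitization issue the abstract describes (not a solution drawn from the reviewed studies), the snippet below shows untrusted input being HTML-escaped before it is echoed back; real applications should rely on context-aware, framework-provided escaping rather than ad-hoc code like this.

```python
# Minimal illustration of the sanitisation idea behind reflected XSS
# defences: untrusted input is escaped before being echoed into HTML.
import html

user_input = '<script>alert("xss")</script>'

unsafe_page = f"<p>Search results for: {user_input}</p>"             # vulnerable
safe_page = f"<p>Search results for: {html.escape(user_input)}</p>"  # escaped

print(unsafe_page)
print(safe_page)   # the <script> payload is rendered inert as &lt;script&gt;...
```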

Journal ArticleDOI
TL;DR: A systematic literature review of peer-reviewed articles on business process modeling quality published between 2000 and August 2013 suggests that there is no generally accepted framework of model quality types and that there is a need for the development of a broader quality framework capable of dealing with the different aspects of business process modeling quality.
Abstract: Context Business process modeling is an essential part of understanding and redesigning the activities that a typical enterprise uses to achieve its business goals. The quality of a business process model has a significant impact on the development of any enterprise and IT support for that process. Objective Since the insights on what constitutes modeling quality are constantly evolving, it is unclear whether research on business process modeling quality already covers all major aspects of modeling quality. Therefore, the objective of this research is to determine the state of the art on business process modeling quality: What aspects of process modeling quality have been addressed until now and which gaps remain to be covered? Method We performed a systematic literature review of peer reviewed articles as published between 2000 and August 2013 on business process modeling quality. To analyze the contributions of the papers we use the Formal Concept Analysis technique. Results We found 72 studies addressing quality aspects of business process models. These studies were classified into different dimensions: addressed model quality type, research goal, research method, and type of research result. Our findings suggest that there is no generally accepted framework of model quality types. Most research focuses on empirical and pragmatic quality aspects, specifically with respect to improving the understandability or readability of models. Among the various research methods, experimentation is the most popular one. The results from published research most often take the form of intangible knowledge. Conclusion We believe there is a lack of an encompassing and generally accepted definition of business process modeling quality. This evidences the need for the development of a broader quality framework capable of dealing with the different aspects of business process modeling quality. Different dimensions of business process quality and of the process of modeling still require further research.

Journal ArticleDOI
TL;DR: A taxonomy of the factors and their influence in the accumulation of Architectural Technical Debt is compiled, and two qualitative models of how the debt is accumulated and refactored over time in the studied companies are provided.
Abstract: We provide a taxonomy of the causes for Architectural Technical Debt accumulation. Crisis model: shows the increasing accumulation of Architectural Technical Debt. Phases model: shows when Architectural Technical Debt is accumulated and refactored. Refactoring strategies: best and worst case scenarios with respect to crises. We conduct an empirical evaluation of the factors and models. Context A known problem in large software companies is to balance the prioritization of short-term with long-term feature delivery speed. Specifically, Architecture Technical Debt is regarded as sub-optimal architectural solutions taken to deliver fast that might hinder future feature development, which, in turn, would hinder agility. Objective This paper aims at improving software management by shedding light on the current factors responsible for the accumulation of Architectural Technical Debt and to understand how it evolves over time. Method We conducted an exploratory multiple-case embedded case study in 7 sites at 5 large companies. We evaluated the results with additional cross-company interviews and an in-depth, company-specific case study in which we initially evaluate factors and models. Results We compiled a taxonomy of the factors and their influence in the accumulation of Architectural Technical Debt, and we provide two qualitative models of how the debt is accumulated and refactored over time in the studied companies. We also list a set of exploratory propositions on possible refactoring strategies that can be useful as insights for practitioners and as hypotheses for further research. Conclusion Several factors cause constant and unavoidable accumulation of Architecture Technical Debt, which leads to development crises. Refactorings are often overlooked in prioritization and they are often triggered by development crises, in a reactive fashion. Some of the factors are manageable, while others are external to the companies. ATD needs to be made visible, in order to postpone the crises according to the strategic goals of the companies. There is a need for practices and automated tools to proactively manage ATD.

Journal ArticleDOI
TL;DR: ELBlocker achieves a substantial and statistically significant improvement over the state-of-the-art methods, i.e., Garcia and Shihab’s method, SMOTE, OSS, and Bagging.
Abstract: Context Blocking bugs are bugs that prevent other bugs from being fixed. Previous studies show that blocking bugs take approximately two to three times longer to be fixed compared to non-blocking bugs. Objective Thus, automatically predicting blocking bugs early on, so that developers are aware of them, can help reduce the impact of or avoid blocking bugs. However, a major challenge when predicting blocking bugs is that only a small proportion of bugs are blocking bugs, i.e., there is an unequal distribution between blocking and non-blocking bugs. For example, in Eclipse and OpenOffice, only 2.8% and 3.0% of bugs are blocking bugs, respectively. We refer to this as the class imbalance phenomenon. Method In this paper, we propose ELBlocker to identify blocking bugs given a training dataset. ELBlocker first randomly divides the training data into multiple disjoint sets, and for each disjoint set, it builds a classifier. Next, it combines these multiple classifiers, and automatically determines an appropriate imbalance decision boundary to differentiate blocking bugs from non-blocking bugs. With the imbalance decision boundary, a bug report will be classified as a blocking bug when its likelihood score is larger than the decision boundary, even if its likelihood score is low. Results To examine the benefits of ELBlocker, we perform experiments on 6 large open source projects – namely Freedesktop, Chromium, Mozilla, Netbeans, OpenOffice, and Eclipse – containing a total of 402,962 bugs. We find that ELBlocker achieves F1 and EffectivenessRatio@20% scores of up to 0.482 and 0.831, respectively. On average across the 6 projects, ELBlocker improves the F1 and EffectivenessRatio@20% scores over the state-of-the-art method proposed by Garcia and Shihab by 14.69% and 8.99%, respectively. Statistical tests show that the improvements are significant and the effect sizes are large. Conclusion ELBlocker can help deal with the class imbalance phenomenon and improve the prediction of blocking bugs. ELBlocker achieves a substantial and statistically significant improvement over the state-of-the-art methods, i.e., Garcia and Shihab’s method, SMOTE, OSS, and Bagging.
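A rough sketch of the ELBlocker recipe as described in the abstract: disjoint training slices, one classifier per slice, averaged likelihood scores, and a threshold chosen to cope with the class imbalance. The base learner, the threshold-selection rule (here simply maximising F1 on a validation split), and the synthetic data are assumptions, not the paper's exact design.

```python
# Sketch of the ELBlocker idea: per-slice classifiers, averaged likelihood
# scores, and an imbalance-aware decision boundary. Illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

X, y = make_classification(n_samples=3000, weights=[0.97, 0.03], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, stratify=y,
                                            random_state=0)

n_subsets = 5
idx = np.random.default_rng(0).permutation(len(y_tr))
classifiers = []
for part in np.array_split(idx, n_subsets):          # disjoint training slices
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(X_tr[part], y_tr[part])
    classifiers.append(clf)

# average likelihood score over the per-slice classifiers
score = np.mean([c.predict_proba(X_val)[:, 1] for c in classifiers], axis=0)

best_thr, best_f1 = 0.5, 0.0
for thr in np.linspace(0.01, 0.5, 50):               # imbalance decision boundary
    f1 = f1_score(y_val, score >= thr)
    if f1 > best_f1:
        best_thr, best_f1 = thr, f1
print(f"chosen threshold={best_thr:.2f}, validation F1={best_f1:.3f}")
```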

Journal ArticleDOI
TL;DR: An empirical evaluation of the ability of static code analysis tools to detect security vulnerabilities, conducted with the objective of better understanding their strengths and shortcomings, found them to be not very effective at detecting security vulnerabilities.
Abstract: Context: Static analysis of source code is a scalable method for discovery of software faults and security vulnerabilities. Techniques for static code analysis have matured in the last decade and many tools have been developed to support automatic detection. Objective: This research work is focused on empirical evaluation of the ability of static code analysis tools to detect security vulnerabilities with an objective to better understand their strengths and shortcomings. Method: We conducted an experiment which consisted of using the benchmarking test suite Juliet to evaluate three widely used commercial tools for static code analysis. Using a design of experiments approach to conduct the analysis and evaluation and including statistical testing of the results are unique characteristics of this work. In addition to the controlled experiment, the empirical evaluation included case studies based on three open source programs. Results: Our experiment showed that 27% of C/C++ vulnerabilities and 11% of Java vulnerabilities were missed by all three tools. Some vulnerabilities were detected by only one or a combination of two tools; 41% of C/C++ and 21% of Java vulnerabilities were detected by all three tools. More importantly, static code analysis tools did not show a statistically significant difference in their ability to detect security vulnerabilities for both C/C++ and Java. Interestingly, all tools had median and mean of the per-CWE recall values and overall recall across all CWEs close to or below 50%, which indicates comparable or worse performance than random guessing. While for C/C++ vulnerabilities one of the tools had better performance in terms of probability of false alarm than the other two tools, there was no statistically significant difference among tools' probability of false alarm for Java test cases. Conclusions: Despite recent advances in methods for static code analysis, the state-of-the-art tools are not very effective in detecting security vulnerabilities.
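The two measures the experiment reports, recall and probability of false alarm, can be illustrated with a small sketch over a labelled benchmark of flawed and flaw-free test cases; the tool output below is invented for demonstration and does not correspond to any of the evaluated tools.

```python
# Sketch of the two measures: recall (fraction of known vulnerabilities a
# tool flags) and probability of false alarm (fraction of clean test cases
# it flags anyway), computed from made-up tool output.
def recall_and_pfa(flagged, vulnerable_ids, clean_ids):
    tp = len(flagged & vulnerable_ids)
    fp = len(flagged & clean_ids)
    return tp / len(vulnerable_ids), fp / len(clean_ids)

vulnerable = {f"case_{i}" for i in range(0, 100)}      # cases with a seeded flaw
clean = {f"case_{i}" for i in range(100, 200)}         # flaw-free counterparts
tool_report = ({f"case_{i}" for i in range(0, 45)}     # hypothetical findings
               | {f"case_{i}" for i in range(100, 115)})

recall, pfa = recall_and_pfa(tool_report, vulnerable, clean)
print(f"recall={recall:.2f}, probability of false alarm={pfa:.2f}")  # 0.45, 0.15
```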

Journal ArticleDOI
TL;DR: A broad basis for the development and application of quality models in industrial practice as well as a basis for further extension, validation and comparison with other approaches in research is provided.
Abstract: Context Software quality models provide either abstract quality characteristics or concrete quality measurements; there is no seamless integration of these two aspects. Quality assessment approaches are, hence, also very specific or remain abstract. Reasons for this include the complexity of quality and the various quality profiles in different domains which make it difficult to build operationalised quality models. Objective In the project Quamoco, we developed a comprehensive approach aimed at closing this gap. Method The project combined constructive research, which involved a broad range of quality experts from academia and industry in workshops, sprint work and reviews, with empirical studies. All deliverables within the project were peer-reviewed by two project members from a different area. Most deliverables were developed in two or three iterations and underwent an evaluation. Results We contribute a comprehensive quality modelling and assessment approach: (1) A meta quality model defines the structure of operationalised quality models. It includes the concept of a product factor, which bridges the gap between concrete measurements and abstract quality aspects, and allows modularisation to create modules for specific domains. (2) A largely technology-independent base quality model reduces the effort and complexity of building quality models for specific domains. For Java and C# systems, we refined it with about 300 concrete product factors and 500 measures. (3) A concrete and comprehensive quality assessment approach makes use of the concepts in the meta-model. (4) An empirical evaluation of the above results using real-world software systems showed: (a) The assessment results using the base model largely match the expectations of experts for the corresponding systems. (b) The approach and models are well understood by practitioners and considered to be both consistent and well suited for getting an overall view on the quality of a software product. The validity of the base quality model could not be shown conclusively, however. (5) The extensive, open-source tool support is in a mature state. (6) The model for embedded software systems is a proof-of-concept for domain-specific quality models. Conclusion We provide a broad basis for the development and application of quality models in industrial practice as well as a basis for further extension, validation and comparison with other approaches in research.
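To make the layering in result (1) concrete, the sketch below aggregates normalised measures into product factors and product factors into a quality aspect using simple weighted averages; the factor names, weights, and linear aggregation are illustrative assumptions, not the actual contents of the Quamoco base model.

```python
# Rough sketch of measure -> product factor -> quality aspect aggregation.
# All names, values, and weights are invented for illustration.
measures = {                      # normalised to [0, 1], 1 = good
    "clone coverage ok": 0.7,
    "comment ratio ok": 0.6,
    "avg method length ok": 0.8,
}
product_factors = {               # factor -> (measure, weight) pairs
    "duplication of source code": [("clone coverage ok", 1.0)],
    "detail complexity of methods": [("avg method length ok", 1.0)],
    "documentation degree": [("comment ratio ok", 1.0)],
}
quality_aspects = {               # aspect -> (factor, weight) pairs
    "maintainability": [("duplication of source code", 0.4),
                        ("detail complexity of methods", 0.4),
                        ("documentation degree", 0.2)],
}

factor_value = {f: sum(measures[m] * w for m, w in ms) / sum(w for _, w in ms)
                for f, ms in product_factors.items()}
for aspect, fs in quality_aspects.items():
    value = sum(factor_value[f] * w for f, w in fs) / sum(w for _, w in fs)
    print(f"{aspect}: {value:.2f}")
```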

Journal ArticleDOI
TL;DR: The study identified the need to improve the robustness of the empirical evaluation of existing research, a lack of extensive and robust tool support, and multiple avenues worthy of further investigation.
Abstract: Context Search-Based Software Engineering (SBSE) is an emerging discipline that focuses on the application of search-based optimization techniques to software engineering problems. Software Product Lines (SPLs) are families of related software systems whose members are distinguished by the set of features each one provides. SPL development practices have proven benefits such as improved software reuse, better customization, and faster time to market. A typical SPL usually involves a large number of systems and features, a fact that makes them attractive for the application of SBSE techniques which are able to tackle problems that involve large search spaces. Objective The main objective of our work is to identify the quantity and the type of research on the application of SBSE techniques to SPL problems. More concretely, we consider the SBSE techniques that have been used and at what stage of the SPL life cycle, the type of case studies employed and their empirical analysis, and the fora where the research has been published. Method A systematic mapping study was conducted with five research questions and assessed 77 publications from 2001, when the term SBSE was coined, until 2014. Results The most common application of SBSE techniques found was testing, followed by product configuration, with genetic algorithms and multi-objective evolutionary algorithms being the two most commonly used techniques. Our study identified the need to improve the robustness of the empirical evaluation of existing research, a lack of extensive and robust tool support, and multiple avenues worthy of further investigation. Conclusions Our study attested the great synergy existing between both fields, corroborated the increasing and ongoing interest in research on the subject, and revealed challenging open research questions.

Journal ArticleDOI
TL;DR: A comprehensive categorization of the risks faced by practitioners in managing DAD projects is given, together with frequently used methods to reduce their impact, to fill the research void in this field.
Abstract: Context Organizations combine the agile approach and Distributed Software Development (DSD) in order to develop better quality software solutions in less time and at lower cost. This helps to reap the benefits of both agile and distributed development but poses significant challenges and risks. The relatively scanty research evidence on the risks prevailing in distributed agile development (DAD) motivated this study. Objective This paper aims at creating a comprehensive set of risk factors that affect the performance of distributed agile development projects, and identifies the risk management methods which are frequently used in practice for controlling those risks. Method The study is an exploration of practitioners’ experience, using the constant comparison method to analyze in-depth interviews of thirteen practitioners and the work documents of twenty-eight projects from thirteen different information technology (IT) organizations. The field experience was supported by extensive research literature on risk management in traditional, agile and distributed development. Results Analysis of qualitative data from interviews and project work documents resulted in the categorization of forty-five DAD risk factors grouped under five core risk categories. The risk categories were mapped to Leavitt’s model of organizational change to facilitate the implementation of the results in the real world. The risk factors could be attributed to the conflicting properties of DSD and agile development. Besides that, some new risk factors have been experienced by practitioners and need further exploration, as understanding them will help practitioners to act in time. Conclusion Organizations are adopting DAD for developing solutions that cater to changing business needs, while utilizing global talent. The conflicting properties of DSD and the agile approach pose several risks for DAD. This study gives a comprehensive categorization of the risks faced by practitioners in managing DAD projects and presents frequently used methods to reduce their impact. The work fills a yawning research void in this field.

Journal ArticleDOI
TL;DR: A fuzzy logic based phase-wise software defect density indicator prediction model is proposed, using the topmost reliability-relevant metrics of each phase of the SDLC, to analyze defect severity in the different artifacts of the SDLC of a software project.
Abstract: We present a fuzzy logic based approach for phase-wise software defect prediction. The topmost reliability-relevant software metrics of the SDLC are considered. The proposed model is validated on 20 real software projects. The sensitivity analysis of software metrics is presented. It is useful to analyze the defect severity in different artifacts of the SDLC. Context Software defect prediction during software development has recently attracted the attention of many researchers. The prediction of a software defect density indicator in each phase of the software development life cycle (SDLC) is desirable for developing a reliable software product. Software defect prediction at the end of the testing phase may be less beneficial, because the changes that need to be performed in the previous phases of the SDLC may require a huge amount of money and effort in order to achieve the target software quality. Therefore, a phase-wise software defect density indicator prediction model is of great importance. Objective In this paper, a fuzzy logic based phase-wise software defect prediction model is proposed using the topmost reliability-relevant metrics of each phase of the SDLC. Method In the proposed model, the defect density indicator in the requirement analysis, design, coding and testing phases is predicted using nine software metrics of these four phases. The defect density indicator metric predicted at the end of each phase is also taken as an input to the next phase. Software metrics are assessed in linguistic terms and a fuzzy inference system has been employed to develop the model. Results The predictive accuracy of the proposed model is validated using twenty real software project data. Validation results are satisfactory. Measures based on the mean magnitude of relative error and balanced mean magnitude of relative error decrease significantly as the software project size increases. Conclusion In this paper, a fuzzy logic based model is proposed for predicting the software defect density indicator at each phase of the SDLC. The predicted defects of twenty different software projects are found to be very near to the actual defects detected during testing. The predicted defect density indicators are very helpful for analyzing the defect severity in different artifacts of the SDLC of a software project.
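A minimal sketch of the fuzzy-inference idea described above: a phase metric assessed in linguistic terms via triangular membership functions, a tiny rule base mapping those terms to a defect density indicator, and weighted-average defuzzification. The metric, breakpoints, rules, and scale are invented for illustration and are not the paper's nine metrics or rule base.

```python
# Toy fuzzy inference: linguistic assessment of one metric (low/medium/high)
# mapped to a defect density indicator via a weighted average of rule
# consequents. All numbers are illustrative placeholders.
def tri(x, a, b, c):
    """Triangular membership of x in the fuzzy set defined by (a, b, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def defect_density_indicator(requirements_complexity):
    # degree to which the metric is low / medium / high (on a 0-10 scale)
    weights = {
        "low": tri(requirements_complexity, -1, 0, 5),
        "medium": tri(requirements_complexity, 0, 5, 10),
        "high": tri(requirements_complexity, 5, 10, 11),
    }
    # rule consequents: representative indicator value per linguistic term
    consequents = {"low": 0.2, "medium": 0.5, "high": 0.8}
    total = sum(weights.values())
    return sum(weights[t] * consequents[t] for t in weights) / total

for x in (2.0, 5.0, 8.5):
    print(f"complexity={x:4.1f} -> indicator={defect_density_indicator(x):.2f}")
```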

Journal ArticleDOI
TL;DR: This paper provides a framework for assessing and comparing process variability approaches and the support they provide for the different phases of the business process lifecycle.
Abstract: Context: The increasing adoption of process-aware information systems (PAISs) such as workflow management systems, enterprise resource planning systems, or case management systems, together with the high variability in business processes (e.g., sales processes may vary depending on the respective products and countries), has resulted in large industrial process model repositories. To cope with this business process variability, the proper management of process variants along the entire process lifecycle becomes crucial. Objective: The goal of this paper is to develop a fundamental understanding of business process variability. In particular, the paper provides a framework for assessing and comparing process variability approaches and the support they provide for the different phases of the business process lifecycle.

Journal ArticleDOI
TL;DR: In this paper, a new approach based on Cuckoo Search concepts is used to generate combinatorial test suites for a new type of application and case study; the strategy is evaluated against different benchmarks and achieves comparable results.
Abstract: A new approach is used to generate combinatorial test suites. The application of Cuckoo Search is investigated for a new type of application and case study. The strategy opens a new approach in Search Based Software Testing (SBST). The strategy is evaluated against different benchmarks and achieves comparable results. The application of combinatorial optimization is also investigated. Context Software has become an innovative solution for many applications and methods in science and engineering. Ensuring the quality and correctness of software is challenging because each program has different configurations and input domains. To ensure software quality, all possible configurations and input combinations would need to be evaluated against their expected outputs. However, such exhaustive testing is impractical because of time and resource constraints given the large domain of inputs and configurations. Thus, different sampling techniques have been used to sample these input domains and configurations. Objective Combinatorial testing can be used to effectively detect faults in the software under test. This technique uses combinatorial optimization concepts to systematically minimize the number of test cases by considering combinations of inputs. This paper proposes a new strategy to generate combinatorial test suites using Cuckoo Search concepts. Method Cuckoo Search is used in the design and implementation of a strategy to construct optimized combinatorial sets. The strategy consists of different construction algorithms that are combined to serve the Cuckoo Search. Results The efficiency and performance of the new technique were demonstrated through different sets of experiments. The effectiveness of the strategy was assessed by applying the generated test suites to a real-world case study for the purpose of functional testing. Conclusion Results show that the generated test suites can detect faults effectively. In addition, the strategy opens a new direction for the application of Cuckoo Search in the context of software engineering.
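As a rough illustration of how a cuckoo-search-style loop can drive combinatorial (here, pairwise) test-suite construction, the sketch below builds one test case per outer iteration and uses a mutate-and-replace inner loop with nest abandonment. The greedy one-test-at-a-time wrapper, the parameter values and the crude Lévy-flight approximation are assumptions for illustration, not the strategy implemented in the paper.

```python
import random
from itertools import combinations

def all_pairs(params):
    """Every (param_index, value, param_index, value) combination to cover (2-way)."""
    pairs = set()
    for (i, vi), (j, vj) in combinations(enumerate(params), 2):
        pairs.update((i, a, j, b) for a in vi for b in vj)
    return pairs

def pairs_of(test):
    return {(i, test[i], j, test[j]) for i, j in combinations(range(len(test)), 2)}

def cuckoo_pairwise(params, nests=10, iterations=50, abandon=0.25, seed=1):
    random.seed(seed)
    uncovered, suite = all_pairs(params), []
    while uncovered:
        # Seed each "nest" with a candidate test case that covers at least one
        # uncovered pair, so every outer iteration is guaranteed to make progress.
        def seeded():
            i, a, j, b = random.choice(tuple(uncovered))
            t = [random.choice(v) for v in params]
            t[i], t[j] = a, b
            return tuple(t)
        pop = [seeded() for _ in range(nests)]
        fitness = lambda t: len(pairs_of(t) & uncovered)
        for _ in range(iterations):
            # Levy flight crudely approximated by mutating one random parameter.
            cur = list(random.choice(pop))
            k = random.randrange(len(params))
            cur[k] = random.choice(params[k])
            new, victim = tuple(cur), random.randrange(nests)
            if fitness(new) > fitness(pop[victim]):
                pop[victim] = new
            # Abandon a fraction of the worst nests; rebuild them from uncovered pairs.
            pop.sort(key=fitness)
            for i in range(int(abandon * nests)):
                pop[i] = seeded()
        best = max(pop, key=fitness)      # nest covering the most uncovered pairs
        suite.append(best)
        uncovered -= pairs_of(best)
    return suite

# Example: 3 parameters with 3, 2 and 2 values; exhaustive testing needs 3*2*2 = 12 cases.
suite = cuckoo_pairwise([[0, 1, 2], ["a", "b"], [True, False]])
print(len(suite), suite)
```

The point of the example is only the shape of the search: candidates act as nests, better eggs replace randomly chosen ones, and the worst fraction is abandoned each round, with coverage of still-uncovered pairs as the fitness function.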

Journal ArticleDOI
TL;DR: A framework for estimating, planning and managing Web projects is presented, combining existing Agile techniques with Web Engineering principles into a unified framework that uses business value to guide the delivery of features.
Abstract: This paper addresses the following question by identifying Agile practices and adapting them into a coherent framework: "Is it possible to define an Agile approach to estimate, plan and manage Web projects guided by business value?" The paper reports the results obtained from a real-world experience of applying the proposed framework and states the relevant conclusions. We propose a framework for Agile Web projects. The framework focuses on estimation, planning, and management activities. We recommend an approach guided by business value that includes continuous improvement throughout the project. We present our first empirical experience with this framework and draw an initial set of conclusions in order to extend the model. Context The processes of estimating, planning and managing are crucial for software development projects, since the results must be related to several business strategies. The broad expansion of the Internet and the global, interconnected economy mean that Web development projects are often characterized by demands such as delivering as soon as possible, reducing time to market and adapting to undefined requirements. In this kind of environment, traditional methodologies based on predictive techniques sometimes do not offer very satisfactory results. The rise of Agile methodologies and practices has provided useful tools that, combined with Web Engineering techniques, can help to establish a framework to estimate, manage and plan Web development projects. Objective This paper presents a proposal for estimating, planning and managing Web projects by combining existing Agile techniques with Web Engineering principles into a unified framework that uses business value to guide the delivery of features. Method The proposal is analyzed by means of a case study based on a real-life project in order to obtain relevant conclusions. Results The results achieved after using the framework in a development project are presented, including findings on project planning and estimation as well as on team productivity throughout the project. Conclusion The framework can be useful for better managing Web-based projects through a continuous, value-based estimation and management process.
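As a small illustration of value-guided planning (not the framework defined in the paper), the sketch below ranks backlog features by business value per story point and fills an iteration up to its capacity. The feature names, the value and effort scales and the ratio-based ranking are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Feature:
    name: str
    business_value: int   # e.g. 1-100, assigned by the product owner (assumed scale)
    story_points: int     # the team's effort estimate

def plan_iteration(backlog, capacity):
    """Greedily pick the highest value-per-point features that fit the capacity."""
    ranked = sorted(backlog, key=lambda f: f.business_value / f.story_points, reverse=True)
    plan, remaining = [], capacity
    for f in ranked:
        if f.story_points <= remaining:
            plan.append(f)
            remaining -= f.story_points
    return plan

backlog = [Feature("checkout", 80, 8), Feature("wishlist", 30, 5),
           Feature("search filters", 55, 3), Feature("dark mode", 10, 2)]
for f in plan_iteration(backlog, capacity=12):
    print(f.name)   # prints "search filters" then "checkout"
```

Re-running such a ranking at every iteration boundary, with updated estimates and value judgments, is one simple way to realize the continuous value-based estimation that the abstract advocates.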

Journal ArticleDOI
TL;DR: The results indicate that research in the area of identifying refactoring opportunities is highly active and show that researchers use six primary existing approaches to identify refactoring opportunities and six approaches to empirically evaluate the identification techniques.
Abstract: Context Identifying refactoring opportunities in object-oriented code is an important stage that precedes the actual refactoring process. Several techniques have been proposed in the literature to identify opportunities for various refactoring activities. Objective This paper provides a systematic literature review of existing studies identifying opportunities for code refactoring activities. Method We performed an automatic search of the relevant digital libraries for potentially relevant studies published through the end of 2013, performed pilot and author-based searches, and selected 47 primary studies (PSs) based on inclusion and exclusion criteria. The PSs were analyzed based on a number of criteria, including the refactoring activities, the approaches to refactoring opportunity identification, the empirical evaluation approaches, and the data sets used. Results The results indicate that research in the area of identifying refactoring opportunities is highly active. Most of the studies have been performed by academic researchers using nonindustrial data sets. Extract Class and Move Method were found to be the most frequently considered refactoring activities. The results show that researchers use six primary existing approaches to identify refactoring opportunities and six approaches to empirically evaluate the identification techniques. Most of the systems used in the evaluation process were open-source, which helps to make the studies repeatable. However, a relatively high percentage of the data sets used in the empirical evaluations were small, which limits the generality of the results. Conclusions It would be beneficial to perform further studies that consider more refactoring activities, involve researchers from industry, and use large-scale and industrial-based systems.
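For readers unfamiliar with metric-based identification, the sketch below shows one common style of heuristic discussed in this literature: flagging Move Method candidates when a method accesses members of another class noticeably more often than members of its own class ("feature envy"). The data structure, the margin threshold and the example values are illustrative assumptions rather than the technique of any specific primary study; in practice the access counts would come from a static analyzer.

```python
def move_method_candidates(methods, margin=2):
    """methods: list of dicts with 'name', 'owner', and 'accesses' (class name -> count)."""
    candidates = []
    for m in methods:
        own = m["accesses"].get(m["owner"], 0)
        # Find the foreign class this method touches most often.
        foreign_class, foreign = max(
            ((cls, n) for cls, n in m["accesses"].items() if cls != m["owner"]),
            key=lambda kv: kv[1], default=(None, 0))
        # Suggest moving the method when foreign accesses clearly dominate.
        if foreign_class is not None and foreign >= own + margin:
            candidates.append((m["name"], m["owner"], foreign_class))
    return candidates

methods = [
    {"name": "printInvoice", "owner": "Order",
     "accesses": {"Order": 1, "Invoice": 4}},   # envies Invoice -> candidate
    {"name": "addItem", "owner": "Order",
     "accesses": {"Order": 5, "Item": 1}},      # cohesive with its own class
]
print(move_method_candidates(methods))  # [('printInvoice', 'Order', 'Invoice')]
```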