scispace - formally typeset
Search or ask a question

Showing papers in "arXiv: Software Engineering in 2017"


Posted Content
TL;DR: The entire ImageJ codebase was rewrote, engineering a redesigned plugin mechanism intended to facilitate extensibility at every level, with the goal of creating a more powerful tool that continues to serve the existing community while addressing a wider range of scientific requirements.
Abstract: ImageJ is an image analysis program extensively used in the biological sciences and beyond. Due to its ease of use, recordable macro language, and extensible plug-in architecture, ImageJ enjoys contributions from non-programmers, amateur programmers, and professional developers alike. Enabling such a diversity of contributors has resulted in a large community that spans the biological and physical sciences. However, a rapidly growing user base, diverging plugin suites, and technical limitations have revealed a clear need for a concerted software engineering effort to support emerging imaging paradigms, to ensure the software's ability to handle the requirements of modern science. Due to these new and emerging challenges in scientific imaging, ImageJ is at a critical development crossroads. We present ImageJ2, a total redesign of ImageJ offering a host of new functionality. It separates concerns, fully decoupling the data model from the user interface. It emphasizes integration with external applications to maximize interoperability. Its robust new plugin framework allows everything from image formats, to scripting languages, to visualization to be extended by the community. The redesigned data model supports arbitrarily large, N-dimensional datasets, which are increasingly common in modern image acquisition. Despite the scope of these changes, backwards compatibility is maintained such that this new functionality can be seamlessly integrated with the classic ImageJ interface, allowing users and developers to migrate to these new methods at their own pace. ImageJ2 provides a framework engineered for flexibility, intended to support these requirements as well as accommodate future needs.

2,156 citations


Posted Content
TL;DR: This article presents a taxonomy based on the underlying design principles of each model and uses it to navigate the literature and discuss cross-cutting and application-specific challenges and opportunities.
Abstract: Research at the intersection of machine learning, programming languages, and software engineering has recently taken important steps in proposing learnable probabilistic models of source code that exploit code's abundance of patterns. In this article, we survey this work. We contrast programming languages against natural languages and discuss how these similarities and differences drive the design of probabilistic models. We present a taxonomy based on the underlying design principles of each model and use it to navigate the literature. Then, we review how researchers have adapted these models to application areas and discuss cross-cutting and application-specific challenges and opportunities.

503 citations


Posted Content
TL;DR: Storm as mentioned in this paper is a probabilistic model checker for both discrete and continuous-time variants of both Markov chains and MDPs that supports the PRISM and JANI modeling languages.
Abstract: We launch the new probabilistic model checker storm. It features the analysis of discrete- and continuous-time variants of both Markov chains and MDPs. It supports the PRISM and JANI modeling languages, probabilistic programs, dynamic fault trees and generalized stochastic Petri nets. It has a modular set-up in which solvers and symbolic engines can easily be exchanged. It offers a Python API for rapid prototyping by encapsulating storm's fast and scalable algorithms. Experiments on a variety of benchmarks show its competitive performance.

206 citations


Proceedings ArticleDOI
TL;DR: FairFuzz as discussed by the authors proposes an approach that automatically prioritizes inputs exercising rare parts of the program under test, and adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts.
Abstract: In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its ease-of-use and bug-finding power. However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage. We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps. First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test. Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program. We conduct evaluation on real-world programs against state-of-the-art versions of AFL, thoroughly repeating experiments to get good measures of variability. We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to state-of-the-art versions of AFL, while on others it achieves high program coverage at a significantly faster rate.

204 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigate the extent of which developers update their library dependencies and find that even though third-party reuse is commonplace, the practice of updating a dependency is not as common for many developers.
Abstract: Third-party library reuse has become common practice in contemporary software development, as it includes several benefits for developers. Library dependencies are constantly evolving, with newly added features and patches that fix bugs in older versions. To take full advantage of third-party reuse, developers should always keep up to date with the latest versions of their library dependencies. In this paper, we investigate the extent of which developers update their library dependencies. Specifically, we conducted an empirical study on library migration that covers over 4,600 GitHub software projects and 2,700 library dependencies. Results show that although many of these systems rely heavily on dependencies, 81.5% of the studied systems still keep their outdated dependencies. In the case of updating a vulnerable dependency, the study reveals that affected developers are not likely to respond to a security advisory. Surveying these developers, we find that 69% of the interviewees claim that they were unaware of their vulnerable dependencies. Furthermore, developers are not likely to prioritize library updates, citing it as extra effort and added responsibility. This study concludes that even though third-party reuse is commonplace, the practice of updating a dependency is not as common for many developers.

169 citations


Posted Content
TL;DR: GASPER as mentioned in this paper is a new tool for automatically locating gas-costly patterns by analyzing smart contracts' bytecodes, and the preliminary results on discovering 3 representative patterns from 4,240 real smart contracts show that 93.5%, 90.1% and 80% contracts suffer from these 3 patterns, respectively.
Abstract: Smart contracts are full-fledged programs that run on blockchains (e.g., Ethereum, one of the most popular blockchains). In Ethereum, gas (in Ether, a cryptographic currency like Bitcoin) is the execution fee compensating the computing resources of miners for running smart contracts. However, we find that under-optimized smart contracts cost more gas than necessary, and therefore the creators or users will be overcharged. In this work, we conduct the first investigation on Solidity, the recommended compiler, and reveal that it fails to optimize gas-costly programming patterns. In particular, we identify 7 gas-costly patterns and group them to 2 categories. Then, we propose and develop GASPER, a new tool for automatically locating gas-costly patterns by analyzing smart contracts' bytecodes. The preliminary results on discovering 3 representative patterns from 4,240 real smart contracts show that 93.5%, 90.1% and 80% contracts suffer from these 3 patterns, respectively.

159 citations


Posted Content
TL;DR: In this paper, the authors present guidelines on how to conduct multivocal literature review (MLR) studies in SE from the planning phase, over conducting the review to the final reporting of the review.
Abstract: Context: A Multivocal Literature Review (MLR) is a form of a Systematic Literature Review (SLR) which includes the grey literature (e.g., blog posts and white papers) in addition to the published (formal) literature (e.g., journal and conference papers). MLRs are useful for both researchers and practitioners since they provide summaries both the state-of-the art and -practice in a given area. Objective: There are several guidelines to conduct SLR studies in SE. However, given the facts that several phases of MLRs differ from those of traditional SLRs, for instance with respect to the search process and source quality assessment. Therefore, SLR guidelines are only partially useful for conducting MLR studies. Our goal in this paper is to present guidelines on how to conduct MLR studies in SE. Method: To develop the MLR guidelines, we benefit from three inputs: (1) existing SLR guidelines in SE, (2), a literature survey of MLR guidelines and experience papers in other fields, and (3) our own experiences in conducting several MLRs in SE. All derived guidelines are discussed in the context of three examples MLRs as running examples (two from SE and one MLR from the medical sciences). Results: The resulting guidelines cover all phases of conducting and reporting MLRs in SE from the planning phase, over conducting the review to the final reporting of the review. In particular, we believe that incorporating and adopting a vast set of recommendations from MLR guidelines and experience papers in other fields have enabled us to propose a set of guidelines with solid foundations. Conclusion: Having been developed on the basis of three types of solid experience and evidence, the provided MLR guidelines support researchers to effectively and efficiently conduct new MLRs in any area of SE.

153 citations


Posted Content
TL;DR: In this paper, a systematic review and comparative evaluation of automated process discovery methods, using an open-source benchmark and covering twelve publicly-available real-life event logs, twelve proprietary real life event logs and nine quality metrics, is presented.
Abstract: Process mining allows analysts to exploit logs of historical executions of business processes to extract insights regarding the actual performance of these processes. One of the most widely studied process mining operations is automated process discovery. An automated process discovery method takes as input an event log, and produces as output a business process model that captures the control-flow relations between tasks that are observed in or implied by the event log. Various automated process discovery methods have been proposed in the past two decades, striking different tradeoffs between scalability, accuracy and complexity of the resulting models. However, these methods have been evaluated in an ad-hoc manner, employing different datasets, experimental setups, evaluation measures and baselines, often leading to incomparable conclusions and sometimes unreproducible results due to the use of closed datasets. This article provides a systematic review and comparative evaluation of automated process discovery methods, using an open-source benchmark and covering twelve publicly-available real-life event logs, twelve proprietary real-life event logs, and nine quality metrics. The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.

127 citations


Posted Content
TL;DR: The challenges and opportunities of blockchain for business process management (BPM) are outlined and how blockchains could be used in the context of the established BPM lifecycle and how they might become relevant beyond are reflected.
Abstract: Blockchain technology promises a sizable potential for executing inter-organizational business processes without requiring a central party serving as a single point of trust (and failure). This paper analyzes its impact on business process management (BPM). We structure the discussion using two BPM frameworks, namely the six BPM core capabilities and the BPM lifecycle. This paper provides research directions for investigating the application of blockchain technology to BPM.

124 citations


Proceedings ArticleDOI
TL;DR: In this article, the authors show that applying a simple optimizer called DE to fine tune SVM, it can achieve similar (and sometimes better) results than deep learning method, i.e. 84 times faster hours than deep Learning method.
Abstract: While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training time limits the ability of (a)~a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b)~other researchers to repeat, improve, or even refute that original work. For example, recently, deep learning was used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE to fine tune SVM, it can achieve similar (and sometimes better) results. The DE approach terminated in 10 minutes; i.e. 84 times faster hours than deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives.

123 citations


Proceedings ArticleDOI
TL;DR: DeepRepair as discussed by the authors uses code similarities to prioritize and transform statements in a codebase for patch generation, and finds code elements that contain suspicious statements by mapping out-of-scope identifiers to similar identifiers in scope.
Abstract: In the field of automated program repair, the redundancy assumption claims large programs contain the seeds of their own repair. However, most redundancy-based program repair techniques do not reason about the repair ingredients---the code that is reused to craft a patch. We aim to reason about the repair ingredients by using code similarities to prioritize and transform statements in a codebase for patch generation. Our approach, DeepRepair, relies on deep learning to reason about code similarities. Code fragments at well-defined levels of granularity in a codebase can be sorted according to their similarity to suspicious elements (i.e., code elements that contain suspicious statements) and statements can be transformed by mapping out-of-scope identifiers to similar identifiers in scope. We examined these new search strategies for patch generation with respect to effectiveness from the viewpoint of a software maintainer. Our comparative experiments were executed on six open-source Java projects including 374 buggy program revisions and consisted of 19,949 trials spanning 2,616 days of computation time. DeepRepair's search strategy using code similarities generally found compilable ingredients faster than the baseline, jGenProg, but this improvement neither yielded test-adequate patches in fewer attempts (on average) nor found significantly more patches than the baseline. Although the patch counts were not statistically different, there were notable differences between the nature of DeepRepair patches and baseline patches. The results demonstrate that our learning-based approach finds patches that cannot be found by existing redundancy-based repair techniques.

Posted Content
TL;DR: This work implements several neural models including LSTMs and sequence-to-sequence models that can encode variable length input files and incorporates them in the state-of-the-art AFL (American Fuzzy Lop) fuzzer and shows significant improvements in terms of code coverage, unique code paths, and crashes.
Abstract: Fuzzing is a popular dynamic program analysis technique used to find vulnerabilities in complex software. Fuzzing involves presenting a target program with crafted malicious input designed to cause crashes, buffer overflows, memory errors, and exceptions. Crafting malicious inputs in an efficient manner is a difficult open problem and often the best approach to generating such inputs is through applying uniform random mutations to pre-existing valid inputs (seed files). We present a learning technique that uses neural networks to learn patterns in the input files from past fuzzing explorations to guide future fuzzing explorations. In particular, the neural models learn a function to predict good (and bad) locations in input files to perform fuzzing mutations based on the past mutations and corresponding code coverage information. We implement several neural models including LSTMs and sequence-to-sequence models that can encode variable length input files. We incorporate our models in the state-of-the-art AFL (American Fuzzy Lop) fuzzer and show significant improvements in terms of code coverage, unique code paths, and crashes for various input formats including ELF, PNG, PDF, and XML.

Proceedings ArticleDOI
TL;DR: In this article, a self-tuning version of SMOTE is used to improve the weaker regions of the training data, which leads to dramatically large increases in software defect predictions.
Abstract: We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria and they did not (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, which is a self-tuning version of SMOTE). This approach leads to dramatically large increases in software defect predictions. When applied in a 5*5 cross-validation study for 3,681 JAVA classes (containing over a million lines of code) from open source systems, SMOTUNED increased AUC and recall by 60% and 20% respectively. These improvements are independent of the classifier used to predict for quality. Same kind of pattern (improvement) was observed when a comparative analysis of SMOTE and SMOTUNED was done against the most recent class imbalance technique. In conclusion, for software analytic tasks like defect prediction, (1) data pre-processing can be more important than classifier choice, (2) ranking studies are incomplete without such pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing.

Posted Content
TL;DR: In this paper, the architecture centric development method with rapid prototyping is proposed to reduce the number of iterations across the stages of ACDM and reduce the complexity of the process.
Abstract: Many engineering processes exist in the industry, text books and international standards. However, in practice rarely any of the processes are followed consistently and literally. It is observed across industries the processes are altered based on the requirements of the projects. Two features commonly lacking from many engineering processes are, 1) the formal capacity to rapidly develop prototypes in the rudimentary stage of the project, 2) transitioning of requirements into architectural designs, when and how to evaluate designs and how to use the throw away prototypes throughout the system lifecycle. Prototypes are useful for eliciting requirements, generating customer feedback and identifying, examining or mitigating risks in a project where the product concept is at a cutting edge or not fully perceived. Apart from the work that the product is intended to do, systemic properties like availability, performance and modifiability matter as much as functionality. Architects must even these concerns with the method they select to promote these systemic properties and at the same time equip the stakeholders with the desired functionality. Architectural design and prototyping is one of the key ways to build the right product embedded with the desired systemic properties. Once the product is built it can be almost impossible to retrofit the system with the desired attributes. This paper customizes the architecture centric development method with rapid prototyping to achieve the above-mentioned goals and reducing the number of iterations across the stages of ACDM.

Posted Content
TL;DR: The main features of microservices are described and how these features improve scalability are highlighted to highlight how this style of computing improves scalability.
Abstract: The microservice architecture is a style inspired by service-oriented computing that has recently started gaining popularity and that promises to change the way in which software is perceived, conceived and designed. In this paper, we describe the main features of microservices and highlight how these features improve scalability.

Posted Content
TL;DR: The experimental results show that ENTRUST can be used to engineer self-adaptive software systems in different application domains and to generate dynamic assurance cases for these systems.
Abstract: Building on concepts drawn from control theory, self-adaptive software handles environmental and internal uncertainties by dynamically adjusting its architecture and parameters in response to events such as workload changes and component failures. Self-adaptive software is increasingly expected to meet strict functional and non-functional requirements in applications from areas as diverse as manufacturing, healthcare and finance. To address this need, we introduce a methodology for the systematic ENgineering of TRUstworthy Self-adaptive sofTware (ENTRUST). ENTRUST uses a combination of (1) design-time and runtime modelling and verification, and (2) industry-adopted assurance processes to develop trustworthy self-adaptive software and assurance cases arguing the suitability of the software for its intended application. To evaluate the effectiveness of our methodology, we present a tool-supported instance of ENTRUST and its use to develop proof-of-concept self-adaptive software for embedded and service-based systems from the oceanic monitoring and e-finance domains, respectively. The experimental results show that ENTRUST can be used to engineer self-adaptive software systems in different application domains and to generate dynamic assurance cases for these systems.

Book ChapterDOI
TL;DR: This work aims to understand how Scrum is adapted in different contexts and what are the reasons for these changes.
Abstract: Background: Agile software development has become a popular way of developing software. Scrum is the most frequently used agile framework, but it is often reported to be adapted in practice. Objective: Thus, we aim to understand how Scrum is adapted in different contexts and what are the reasons for these changes. Method: Using a structured interview guideline, we interviewed ten German companies about their concrete usage of Scrum and analysed the results qualitatively. Results: All companies vary Scrum in some way. The least variations are in the Sprint length, events, team size and requirements engineering. Many users varied the roles, effort estimations and quality assurance. Conclusions: Many variations constitute a substantial deviation from Scrum as initially proposed. For some of these variations, there are good reasons. Sometimes, however, the variations are a result of a previous non-agile, hierarchical organisation.

Posted Content
TL;DR: This dataset by far has collected over five million apps and over 20 types of metadata such as VirusTotal reports, to contribute to ongoing research efforts, as well as to enable new potential research topics on Android Apps.
Abstract: We present a growing collection of Android apps collected from several sources, including the official Google Play app market and a growing collection of various metadata of those collected apps aiming at facilitating the Android-relevant research works. Our dataset by far has collected over five million apps and over 20 types of metadata such as VirusTotal reports. Our objective of collecting this dataset is to contribute to ongoing research efforts, as well as to enable new potential research topics on Android Apps. By releasing our app and metadata set to the research community, we also aim at encouraging our fellow researchers to engage in reproducible experiments. This article will be continuously updated based on the growing apps and metadata collected in the AndroZoo project. If you have specific metadata that you want to collect from AndroZoo and which are not yet provided by far, please let us know. We will thereby prioritise it in our collecting process so as to provide it to our fellow researchers in a short manner.

Proceedings ArticleDOI
TL;DR: A large-scale case study that applied clone detection to 28 requirements specifications with a total of 8,667 pages found that the amount of redundancy found in real-world specifications, namely redundancy that stems from copy & paste operations, was high.
Abstract: Due to their pivotal role in software engineering, considerable effort is spent on the quality assurance of software requirements specifications. As they are mainly described in natural language, relatively few means of automated quality assessment exist. However, we found that clone detection, a technique widely applied to source code, is promising to assess one important quality aspect in an automated way, namely redundancy that stems from copy&paste operations. This paper describes a large-scale case study that applied clone detection to 28 requirements specifications with a total of 8,667 pages. We report on the amount of redundancy found in real-world specifications, discuss its nature as well as its consequences and evaluate in how far existing code clone detection approaches can be applied to assess the quality of requirements specifications in practice.

Proceedings ArticleDOI
TL;DR: The results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information.
Abstract: The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these apps has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug's resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto complete reproduction steps in bug reports for mobile apps. FUSION links user provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off device bug reporting can be conducted for mobile software projects. In a study involving 28 participants we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real world bugs found in 14 open source Android apps while qualitatively and qualitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information.

Journal ArticleDOI
TL;DR: The need for energy transparency in software development is discussed and how such transparency can be realized to help tackle the IoT energy challenge is emphasized.
Abstract: The Internet of Things (IoT) sparks a whole new world of embedded applications. Most of these applications are based on deeply embedded systems that have to operate on limited or unreliable sources of energy, such as batteries or energy harvesters. Meeting the energy requirements for such applications is a hard challenge, which threatens the future growth of the IoT. Software has the ultimate control over hardware. Therefore, its role is significant in optimizing the energy consumption of a system. Currently, programmers have no feedback on how their software affects the energy consumption of a system. Such feedback can be enabled by energy transparency, a concept that makes a program's energy consumption visible, from hardware to software. This paper discusses the need for energy transparency in software development and emphasizes on how such transparency can be realized to help tackling the IoT energy challenge.

Posted Content
TL;DR: This paper describes a new approach, built upon the powerful deep learning Long Short Term Memory model, to automatically learn both semantic and syntactic features in code, which demonstrates that the prediction power obtained is equal or even superior to what is achieved by state of the art vulnerability prediction models.
Abstract: Code flaws or vulnerabilities are prevalent in software systems and can potentially cause a variety of problems including deadlock, information loss, or system failure. A variety of approaches have been developed to try and detect the most likely locations of such code vulnerabilities in large code bases. Most of them rely on manually designing features (e.g. complexity metrics or frequencies of code tokens) that represent the characteristics of the code. However, all suffer from challenges in sufficiently capturing both semantic and syntactic representation of source code, an important capability for building accurate prediction models. In this paper, we describe a new approach, built upon the powerful deep learning Long Short Term Memory model, to automatically learn both semantic and syntactic features in code. Our evaluation on 18 Android applications demonstrates that the prediction power obtained from our learned features is equal or even superior to what is achieved by state of the art vulnerability prediction models: 3%--58% improvement for within-project prediction and 85% for cross-project prediction.

Posted Content
TL;DR: This paper proposes ARJA, a new GP based repair approach for automated repair of Java programs, and presents a novel lower-granularity patch representation that properly decouples the search subspaces of likely-buggy locations, operation types and potential fix ingredients, enabling GP to explore the search space more effectively.
Abstract: Recent empirical studies show that the performance of GenProg is not satisfactory, particularly for Java. In this paper, we propose ARJA, a new GP based repair approach for automated repair of Java programs. To be specific, we present a novel lower-granularity patch representation that properly decouples the search subspaces of likely-buggy locations, operation types and potential fix ingredients, enabling GP to explore the search space more effectively. Based on this new representation, we formulate automated program repair as a multi-objective search problem and use NSGA-II to look for simpler repairs. To reduce the computational effort and search space, we introduce a test filtering procedure that can speed up the fitness evaluation of GP and three types of rules that can be applied to avoid unnecessary manipulations of the code. Moreover, we also propose a type matching strategy that can create new potential fix ingredients by exploiting the syntactic patterns of the existing statements. We conduct a large-scale empirical evaluation of ARJA along with its variants on both seeded bugs and real-world bugs in comparison with several state-of-the-art repair approaches. Our results verify the effectiveness and efficiency of the search mechanisms employed in ARJA and also show its superiority over the other approaches. In particular, compared to jGenProg (an implementation of GenProg for Java), an ARJA version fully following the redundancy assumption can generate a test-suite adequate patch for more than twice the number of bugs (from 27 to 59), and a correct patch for nearly four times of the number (from 5 to 18), on 224 real-world bugs considered in Defects4J. Furthermore, ARJA is able to correctly fix several real multi-location bugs that are hard to be repaired by most of the existing repair approaches.

Proceedings ArticleDOI
TL;DR: It is shown that some maintenance practices---specifically the adoption of contributing guidelines and continuous integration---have an important association with a project failure or success and the principal strategies developers have tried to overcome the failure of the studied projects.
Abstract: Open source is experiencing a renaissance period, due to the appearance of modern platforms and workflows for developing and maintaining public code. As a result, developers are creating open source software at speeds never seen before. Consequently, these projects are also facing unprecedented mortality rates. To better understand the reasons for the failure of modern open source projects, this paper describes the results of a survey with the maintainers of 104 popular GitHub systems that have been deprecated. We provide a set of nine reasons for the failure of these open source projects. We also show that some maintenance practices -- specifically the adoption of contributing guidelines and continuous integration -- have an important association with a project failure or success. Finally, we discuss and reveal the principal strategies developers have tried to overcome the failure of the studied projects.

Journal ArticleDOI
TL;DR: Consequences of happiness and unhappiness that are beneficial and detrimental for developers' mental well-being, the software development process, and the produced artifacts are found.
Abstract: The growing literature on affect among software developers mostly reports on the linkage between happiness, software quality, and developer productivity. Understanding happiness and unhappiness in all its components -- positive and negative emotions and moods -- is an attractive and important endeavor. Scholars in industrial and organizational psychology have suggested that understanding happiness and unhappiness could lead to cost-effective ways of enhancing working conditions, job performance, and to limiting the occurrence of psychological disorders. Our comprehension of the consequences of (un)happiness among developers is still too shallow, being mainly expressed in terms of development productivity and software quality. In this paper, we study what happens when developers are happy and unhappy while developing software. Qualitative data analysis of responses given by 317 questionnaire participants identified 42 consequences of unhappiness and 32 of happiness. We found consequences of happiness and unhappiness that are beneficial and detrimental for developers' mental well-being, the software development process, and the produced artifacts. Our classification scheme, available as open data enables new happiness research opportunities of cause-effect type, and it can act as a guideline for practitioners for identifying damaging effects of unhappiness and for fostering happiness on the job.

Journal ArticleDOI
TL;DR: In this article, the authors used a systematic literature review methodology based on well-known digital libraries to assess the state of the art research on the game development software engineering process and highlight areas that need further consideration by researchers.
Abstract: Software game is a kind of application that is used not only for entertainment, but also for serious purposes that can be applicable to different domains such as education, business, and health care. Although the game development process differs from the traditional software development process because it involves interdisciplinary activities. Software engineering techniques are still important for game development because they can help the developer to achieve maintainability, flexibility, lower effort and cost, and better design. The purpose of this study is to assesses the state of the art research on the game development software engineering process and highlight areas that need further consideration by researchers. In the study, we used a systematic literature review methodology based on well-known digital libraries. The largest number of studies have been reported in the production phase of the game development software engineering process life cycle, followed by the pre-production phase. By contrast, the post-production phase has received much less research activity than the pre-production and production phases. The results of this study suggest that the game development software engineering process has many aspects that need further attention from researchers; that especially includes the postproduction phase.

Posted Content
TL;DR: Context2Name is presented, a deep learningbased technique that partially reverses the effect of minification by predicting natural identifier names for minified names and complements the state-of-the-art by predicting 5.3% additional identifiers that are missed by both existing tools.
Abstract: Most of the JavaScript code deployed in the wild has been minified, a process in which identifier names are replaced with short, arbitrary and meaningless names. Minified code occupies less space, but also makes the code extremely difficult to manually inspect and understand. This paper presents Context2Name, a deep learningbased technique that partially reverses the effect of minification by predicting natural identifier names for minified names. The core idea is to predict from the usage context of a variable a name that captures the meaning of the variable. The approach combines a lightweight, token-based static analysis with an auto-encoder neural network that summarizes usage contexts and a recurrent neural network that predict natural names for a given usage context. We evaluate Context2Name with a large corpus of real-world JavaScript code and show that it successfully predicts 47.5% of all minified identifiers while taking only 2.9 milliseconds on average to predict a name. A comparison with the state-of-the-art tools JSNice and JSNaughty shows that our approach performs comparably in terms of accuracy while improving in terms of efficiency. Moreover, Context2Name complements the state-of-the-art by predicting 5.3% additional identifiers that are missed by both existing tools.

Posted Content
TL;DR: A real world case study is presented to demonstrate how scalability is positively affected by re-implementing a monolithic architecture into a microservices architecture (MSA) in the specific setting of financial domain.
Abstract: The microservices paradigm aims at changing the way in which software is perceived, conceived and designed. One of the foundational characteristics of this new promising paradigm, compared for instance to monolithic architectures, is scalability. In this paper, we present a real world case study in order to demonstrate how scalability is positively affected by re-implementing a monolithic architecture into microservices. The case study is based on the FX Core system, a mission critical system of Danske Bank, the largest bank in Denmark and one of the leading financial institutions in Northern Europe.

Proceedings ArticleDOI
TL;DR: RefDiff as mentioned in this paper is an automated approach that identifies refactorings performed between two code revisions in a git repository, using a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types.
Abstract: Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is a valuable information to understand software evolution, adapt software components, merge code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has superior precision and recall than existing state-of-the-art approaches.

Journal ArticleDOI
TL;DR: This study is the first that draws a comprehensive picture of the different engineering goals proposed in the literature for test amplification, and it is believed that this survey will help researchers and practitioners entering this new field to understand more quickly and more deeply the intuitions, concepts and techniques used fortest amplification.
Abstract: The adoption of agile development approaches has put an increased emphasis on developer testing, resulting in software projects with strong test suites. These suites include a large number of test cases, in which developers embed knowledge about meaningful input data and expected properties in the form of oracles. This article surveys various works that aim at exploiting this knowledge in order to enhance these manually written tests with respect to an engineering goal (e.g., improve coverage of changes or increase the accuracy of fault localization). While these works rely on various techniques and address various goals, we believe they form an emerging and coherent field of research, which we call `test amplification'. We devised a first set of papers from DBLP, looking for all papers containing `test' and `amplification' in their title. We reviewed the 70 papers in this set and selected the 4 papers that fit our definition of test amplification. We use these 4 papers as the seed for our snowballing study, and systematically followed the citation graph. This study is the first that draws a comprehensive picture of the different engineering goals proposed in the literature for test amplification. In particular, we note that the goal of test amplification goes far beyond maximizing coverage only. We believe that this survey will help researchers and practitioners entering this new field to understand more quickly and more deeply the intuitions, concepts and techniques used for test amplification.