scispace - formally typeset
Search or ask a question

Showing papers by "James A. Jones published in 2017"


Proceedings ArticleDOI
20 May 2017
TL;DR: This paper provides a more expressive model of code evolution, the fuzzy history graph, by representing code lineage as a continuous (i.e., fuzzy) metric rather than a discrete one, and finds that the use of such a fuzzy model of history provided improved accuracy for code-history analysis tasks.
Abstract: Existing software-history techniques represent source-code evolution as an absolute and unambiguous mapping of lines of code in prior revisions to lines of code in subsequent revisions. However, the true evolutionary lineage of a line of code is often complex, subjective, and ambiguous. As such, existing techniques are predisposed to, both, overestimate and underestimate true evolution lineage. In this paper, we seek to address these issues by providing a more expressive model of code evolution, the fuzzy history graph, by representing code lineage as a continuous (i.e., fuzzy) metric rather than a discrete (i.e., absolute) one. Using this more descriptive model, we additionally provide a novel multi-revision code-history analysis --- fuzzy history slicing. In our experiments over three real-world software systems, we found that the fuzzy history graph provides a tunable balance of precision and recall, and an overall improved accuracy over existing code-evolution models. Furthermore, we found that the use of such a fuzzy model of history provided improved accuracy for code-history analysis tasks.

17 citations


Journal ArticleDOI
TL;DR: This work summarizes and reuse the data- and control dependencies between the inputs and outputs of software components, and shows results showing a 13× speedup with the use of dynamic dependence summaries, with an accuracy of 90% in a real-world software engineering task.
Abstract: Software engineers construct modern-day software applications by building on existing software libraries and components that they necessarily do not author themselves. Thus, contemporary software applications rely heavily on existing standard and third-party libraries for their execution and behavior. As such, effective runtime analysis of such a software application’s behavior is met with new challenges. To perform dynamic analysis of a software application, all transitively dependent external libraries must also be monitored and analyzed at each layer of the software application’s call stack. However, monitoring and analyzing large and often numerous external libraries may prove to be prohibitively expensive. Moreover, an overabundance of library-level analyses may obfuscate the details of the actual software application’s dynamic behavior. In other words, the extensive use of existing libraries by a software application renders the results of its dynamic analysis both expensive to compute and difficult to understand. We model software component behavior as dynamically observed data- and control dependencies between inputs and outputs of a software component. Such data- and control dependencies are monitored at a fine-grain instruction-level and are collected as dynamic execution traces for software runs. As an approach to address the complexities and expenses associated with analyzing dynamically observable behavior of software components, we summarize and reuse the data- and control dependencies between the inputs and outputs of software components. Dynamically monitored data- and control dependencies, between the inputs and outputs of software components, upon summarization are called dynamic dependence summaries. Software components, equipped with dynamic dependence summaries, afford the omission of their exhaustive runtime analysis. Nonetheless, the reuse of dependence summaries would necessitate the abstraction of any concrete runtime information enclosed within the summary, thus potentially causing a loss in the information modeled by the dependence summary. Therefore, benefits to the efficiency of dynamic analyses that use such summarization may be afforded with losses of accuracy. As such, we evaluate the potential accuracy loss and the potential performance gain with the use of dynamic dependence summaries. Our results show, on average, a 13× speedup with the use of dynamic dependence summaries, with an accuracy of 90% in a real-world software engineering task.

5 citations