This work proposes several changes to usual edit sequences, specifically augmenting edits with content data and using fuzzy matching, in an attempt to improve semantic preservation.
Abstract:
Genetic improvement uses automated search to find improved versions of existing software. Edit sequences have been proposed as a very convenient way to represent code modifications, focusing on the changes themselves rather than duplicating the entire program. However, edits are usually defined in terms of practical operations rather than in terms of semantic changes; indeed, crossover and other edit sequence mutations usually do not guarantee semantic preservation. We propose several changes to usual edit sequences, specifically augmenting edits with content data and using fuzzy matching, in an attempt to improve semantic preservation.
TL;DR: In this paper, the authors proposed an experimental analysis of the above-mentioned viruses data using correlation methods, which considered the distribution of various amino acids, protein sequences, 3D modelling of viruses, pairwise alignment of proteins that comprise the DNA genome of the viruses.
TL;DR: This paper describes GenProg, an automated method for repairing defects in off-the-shelf, legacy programs without formal specifications, program annotations, or special coding practices, and analyzes the generated repairs qualitatively and quantitatively to demonstrate the process efficiently produces evolved programs that repair the defect.
TL;DR: This paper evaluates GenProg, which uses genetic programming to repair defects in off-the-shelf C programs, and proposes novel algorithmic improvements that allow it to scale to large programs and find repairs 68% more often.
TL;DR: It is shown that the genetic improvement of programs (GIP) can scale by evolving increased performance in a widely used and highly complex 50,000-line system.
TL;DR: In this article, the plastic surgery hypothesis is validated empirically and the extent to which the content of new code can be assembled out of fragments of code that already exist in the code base is investigated.
TL;DR: A comprehensive survey of this nascent field of research with a focus on the core papers in the area published between 1995 and 2015, identifying core publications including empirical studies, 96% of which use evolutionary algorithms (genetic programming in particular).
Q1. What are the contributions in "Fuzzy edit sequences in genetic improvement"?
The authors propose several changes to usual edit sequences, specifically augmenting edits with content data and using fuzzy matching, in an attempt to improve semantic preservation.
Q2. What is the purpose of the article?
The authors expect that content data may be used to track semantic changes, which then, through fuzzy matching, may lead to the generation of new edits beneficial to the overall GI process.
Q3. How do the authors deal with the lookup function f not being invertible?
Because the lookup function f cannot be inverted (while locations are unique, content is not), the authors propose to use both content and location in each edit, i.e., “op((α,a),(β,b))”.
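As a rough illustration of why content alone cannot identify a location, the following Python sketch pairs each location with its content. The list-of-statements program model, the names `f`, `inverse_f`, `AugmentedEdit`, and `make_edit`, and the reading of α and β as the content found at locations a and b are assumptions made for this example, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical program: one statement per modification point.
program = ["x = 1", "foo()", "x = 1", "bar(x)"]


def f(location: int, source: List[str]) -> str:
    """Lookup function: maps a (unique) location to its content."""
    return source[location]


def inverse_f(content: str, source: List[str]) -> List[int]:
    """Attempted inverse: content may occur at several locations."""
    return [i for i, line in enumerate(source) if line == content]


@dataclass(frozen=True)
class AugmentedEdit:
    """An edit of the form op((alpha, a), (beta, b))."""
    op: str
    target: Tuple[str, int]      # (alpha, a): content and location
    ingredient: Tuple[str, int]  # (beta, b): content and location


def make_edit(op: str, a: int, b: int, source: List[str]) -> AugmentedEdit:
    """Record the content seen at a and b when the edit is created."""
    return AugmentedEdit(op, (f(a, source), a), (f(b, source), b))


edit = make_edit("insert", 1, 3, program)
ambiguous = inverse_f("x = 1", program)  # two locations share this content
```

Here `ambiguous` holds two locations for the same content, which is why the content alone cannot replace the location in an edit.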
Q4. How are edits traditionally defined?
In the literature [5]–[8], [10], [11], edits are traditionally based on location first, and content second: in the edit “i(a,b)”, “a” and “b” refer to modification points, usually in the original source code, and refer to the content at these locations only through them.
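A minimal Python sketch of such a location-first edit is given below. It assumes that “i(a,b)” inserts a copy of the statement at “b” before location “a”, and that a program is a list of statements; both are assumptions for illustration, and `apply_insert` is a hypothetical helper.

```python
from typing import List


def apply_insert(source: List[str], a: int, b: int) -> List[str]:
    """Location-based i(a, b): copy the statement at b and insert it before a.

    The edit itself stores only the two locations; the content is
    resolved from the source at application time.
    """
    variant = list(source)        # leave the input untouched
    variant.insert(a, source[b])  # content is reached only through location b
    return variant


original = ["a()", "b()", "c()"]
changed = ["a()", "b()", "d()"]   # same locations, different content at 2

from_original = apply_insert(original, 0, 2)
from_changed = apply_insert(changed, 0, 2)
```

The same pair of locations inserts `c()` in one case and `d()` in the other, so the edit carries no record of which content it was originally meant to use.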
Q5. What can be used to help the GI process?
When a conflict arises and one or more plausible matches are found, these matches can be used to give the GI process additional diversity.
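One way to collect plausible matches is sketched below in Python. The matching criterion is not specified here, so the sketch uses the standard library's `difflib.SequenceMatcher` string similarity as a stand-in; the function `plausible_matches`, the `cutoff` value, and the sample program are all assumptions for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def plausible_matches(content: str, source: List[str],
                      cutoff: float = 0.6) -> List[Tuple[int, float]]:
    """Return (location, similarity) pairs whose similarity reaches cutoff.

    Results are ordered from most to least similar; ties keep the
    lower location first.
    """
    scored = []
    for location, line in enumerate(source):
        score = SequenceMatcher(None, content, line).ratio()
        if score >= cutoff:
            scored.append((location, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


source = ["bar()", "x = 1", "foo()", "foo(x)"]
matches = plausible_matches("foo()", source)
locations = [location for location, _ in matches]
```

Each returned location could seed a separate variant of the conflicting edit, which is one way the extra matches might translate into diversity.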
Q6. How can the authors use crossover to influence the creation of new edits?
It has already been shown, using crossover on decoupled edit sequences [7], that re-using knowledge already present in existing sequences (here, over all possible modification points only “a”, “b”, and “z” were used) can positively influence the creation of new edits.
Q7. What is the way to avoid losing potentially useful edits?
In any case, it can be expected that falling back to the current approach (i.e., applying edits without regard to the original content) should still be considered to avoid losing potentially useful edits.
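The fallback can be sketched as the last step of a location resolver, as in the Python example below. The three-step order, the exact-match comparison, the nearest-candidate tie-break, and the name `resolve_location` are assumptions for illustration only.

```python
from typing import List, Tuple


def resolve_location(content: str, location: int,
                     source: List[str]) -> Tuple[int, str]:
    """Decide where an edit recorded as (content, location) should apply."""
    # 1. The stored location still holds the stored content.
    if 0 <= location < len(source) and source[location] == content:
        return location, "unchanged"
    # 2. The content now lives elsewhere: pick the nearest occurrence
    #    (assumed tie-break; other choices are possible).
    candidates = [i for i, line in enumerate(source) if line == content]
    if candidates:
        nearest = min(candidates, key=lambda i: abs(i - location))
        return nearest, "relocated"
    # 3. Fallback: keep the stored location regardless of its content,
    #    so the edit is not discarded.
    return location, "fallback"


unchanged = resolve_location("foo()", 1, ["init()", "foo()", "done()"])
relocated = resolve_location("foo()", 1, ["log()", "init()", "foo()", "done()"])
fallback = resolve_location("foo()", 1, ["init()", "done()"])
```

In the fallback case the caller would still need to check that the returned location exists in the current program before applying the edit.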
Q8. What is the purpose of this proposal?
In the following, the authors propose to use the content relevant at the time an edit is created as a marker of the edit's initial semantics, in order to generate new variants of the edit when these semantics are modified.
Q9. What is the motivation behind the proposal?
In their motivating example, this means that when appending the insertion of Listing 4 at the end of the edit sequence of Listing 3, the GI process now has the opportunity to realise that the line “foo()” has changed location.