Dongsun Kim

Other affiliations: Kyungpook National University, Sogang University, Hong Kong University of Science and Technology

Bio: Dongsun Kim is an academic researcher from University of Luxembourg. The author has contributed to research in topics: Software system & Software construction. The author has an h-index of 20 and has co-authored 49 publications receiving 1664 citations. Previous affiliations of Dongsun Kim include Kyungpook National University & Sogang University.

Papers published on a yearly basis (chart; years shown: 2005–2009, 2011, 2013, 2014, 2016–2023)

Papers

Proceedings Article•DOI•

Automatic patch generation learned from human-written patches

Dongsun Kim¹, Jaechang Nam¹, Jaewoo Song¹, Sunghun Kim¹•Institutions (1)

Hong Kong University of Science and Technology¹

18 May 2013

TL;DR: This paper proposes Pattern-based Automatic program Repair (Par), a patch generation approach that uses fix patterns learned from existing human-written patches; in a user study, its patches were judged more acceptable than those generated by GenProg.

Abstract: Patch generation is an essential software maintenance task because most software systems inevitably have bugs that need to be fixed. Unfortunately, human resources are often insufficient to fix all reported and known bugs. To address this issue, several automated patch generation techniques have been proposed. In particular, a genetic-programming-based patch generation technique, GenProg, proposed by Weimer et al., has shown promising results. However, these techniques can generate nonsensical patches due to the randomness of their mutation operations. To address this limitation, we propose a novel patch generation approach, Pattern-based Automatic program Repair (Par), using fix patterns learned from existing human-written patches. We manually inspected more than 60,000 human-written patches and found there are several common fix patterns. Our approach leverages these fix patterns to generate program patches automatically. We experimentally evaluated Par on 119 real bugs. In addition, a user study involving 89 students and 164 developers confirmed that patches generated by our approach are more acceptable than those generated by GenProg. Par successfully generated patches for 27 out of 119 bugs, while GenProg was successful for only 16 bugs.

549 citations

Journal Article•DOI•

Where Should We Fix This Bug? A Two-Phase Recommendation Model

Dongsun Kim¹, Yida Tao¹, Sunghun Kim¹, Andreas Zeller•Institutions (1)

Hong Kong University of Science and Technology¹

01 Nov 2013-IEEE Transactions on Software Engineering

TL;DR: This paper proposes a two-phase prediction model that uses bug reports' contents to suggest the files likely to be fixed; compared with three other prediction models (the Usual Suspects, the one-phase model, and BugScout), it shows the best prediction performance.

Abstract: To support developers in debugging and locating bugs, we propose a two-phase prediction model that uses bug reports' contents to suggest the files likely to be fixed. In the first phase, our model checks whether the given bug report contains sufficient information for prediction. If so, the model proceeds to predict files to be fixed, based on the content of the bug report. In other words, our two-phase model "speaks up" only if it is confident of making a suggestion for the given bug report; otherwise, it remains silent. In the evaluation on the Mozilla "Firefox" and "Core" packages, the two-phase model was able to make predictions for almost half of all bug reports; on average, 70 percent of these predictions pointed to the correct files. In addition, we compared the two-phase model with three other prediction models: the Usual Suspects, the one-phase model, and BugScout. The two-phase model manifests the best prediction performance.
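The "speak up only when confident" idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `TwoPhaseRecommender`, the token-overlap scoring, and the `min_known_tokens` threshold are hypothetical stand-ins, not the model described in the paper.

```python
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase word tokens of a bug report (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class TwoPhaseRecommender:
    """Toy two-phase model: phase 1 gates, phase 2 ranks files."""

    def __init__(self, min_known_tokens=2, top_k=1):
        self.min_known_tokens = min_known_tokens  # hypothetical phase-1 threshold
        self.top_k = top_k
        self.file_tokens = defaultdict(Counter)   # file path -> token counts
        self.vocab = set()

    def fit(self, history):
        """history: iterable of (report_text, fixed_file_path) pairs."""
        for text, path in history:
            toks = tokens(text)
            self.file_tokens[path].update(toks)
            self.vocab.update(toks)

    def is_predictable(self, text):
        """Phase 1: does the report share enough tokens with past reports?"""
        known = {t for t in tokens(text) if t in self.vocab}
        return len(known) >= self.min_known_tokens

    def recommend(self, text):
        """Phase 2: rank files by token overlap, or return None to stay silent."""
        if not self.is_predictable(text):
            return None
        query = Counter(tokens(text))
        scores = {
            path: sum(min(c, counts[t]) for t, c in query.items())
            for path, counts in self.file_tokens.items()
        }
        ranked = sorted(scores, key=lambda p: (-scores[p], p))
        return ranked[: self.top_k]


model = TwoPhaseRecommender()
model.fit([
    ("crash when opening bookmark toolbar", "browser/bookmarks.js"),
    ("bookmark toolbar icons missing", "browser/bookmarks.js"),
    ("css layout overflow in table rendering", "layout/table.cpp"),
])
print(model.recommend("toolbar bookmark crash"))
print(model.recommend("it does not work"))
```

The second report shares no tokens with the training history, so phase 1 rejects it and the sketch makes no recommendation rather than guessing.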

179 citations

Proceedings Article•DOI•

TBar: Revisiting Template-based Automated Program Repair

Kui Liu¹, Anil Koyuncu¹, Dongsun Kim¹, Tegawendé F. Bissyandé¹•Institutions (1)

University of Luxembourg¹

20 Mar 2019-arXiv: Software Engineering

TL;DR: It is demonstrated that TBar correctly fixes 43 bugs from Defects4J, an unprecedented performance in the literature (including all approaches, i.e., template-based, stochastic mutation-based or synthesis-based APR).

Abstract: We revisit the performance of template-based APR to build comprehensive knowledge about the effectiveness of fix patterns, and to highlight the importance of complementary steps such as fault localization or donor code retrieval. To that end, we first investigate the literature to collect, summarize and label recurrently-used fix patterns. Based on the investigation, we build TBar, a straightforward APR tool that systematically attempts to apply these fix patterns to program bugs. We thoroughly evaluate TBar on the Defects4J benchmark. In particular, we assess the actual qualitative and quantitative diversity of fix patterns, as well as their effectiveness in yielding plausible or correct patches. Eventually, we find that, assuming a perfect fault localization, TBar correctly/plausibly fixes 74/101 bugs. Replicating a standard and practical pipeline of APR assessment, we demonstrate that TBar correctly fixes 43 bugs from Defects4J, an unprecedented performance in the literature (including all approaches, i.e., template-based, stochastic mutation-based or synthesis-based APR).

122 citations

Proceedings Article•DOI•

FaCoY: a code-to-code search engine

Kisub Kim¹, Dongsun Kim¹, Tegawendé F. Bissyandé¹, Eunjong Choi², Li Li³, Jacques Klein¹, Yves Le Traon¹•Institutions (3)

University of Luxembourg¹, Nara Institute of Science and Technology², Monash University³

27 May 2018

TL;DR: This paper proposes FaCoY, a novel approach for statically finding code fragments that may be semantically similar to user input code; it is more effective than online code-to-code search engines and can be useful in code/patch recommendation.

Abstract: Code search is an unavoidable activity in software development. Various approaches and techniques have been explored in the literature to support code search tasks. Most of these approaches focus on serving user queries provided as natural language free-form input. However, there exists a wide range of use-case scenarios where a code-to-code approach would be most beneficial. For example, research directions in code transplantation, code diversity, patch recommendation can leverage a code-to-code search engine to find essential ingredients for their techniques. In this paper, we propose FaCoY, a novel approach for statically finding code fragments which may be semantically similar to user input code. FaCoY implements a query alternation strategy: instead of directly matching code query tokens with code in the search space, FaCoY first attempts to identify other tokens which may also be relevant in implementing the functional behavior of the input code. With various experiments, we show that (1) FaCoY is more effective than online code-to-code search engines; (2) FaCoY can detect more semantic code clones (i.e., Type-4) in BigCloneBench than the state-of-the-art; (3) FaCoY, while static, can detect code fragments which are indeed similar with respect to runtime execution behavior; and (4) FaCoY can be useful in code/patch recommendation.
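The query alternation strategy described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions only: the `RELATED` table is hand-written and hypothetical, and the overlap-count ranking is a stand-in; the abstract does not say how FaCoY identifies related tokens or scores matches.

```python
import re

# Hypothetical table of tokens that tend to implement similar behavior.
RELATED = {
    "fileinputstream": {"bufferedreader", "filereader", "files"},
    "read": {"readline", "readalllines"},
}


def code_tokens(code):
    """Lowercased identifier-like tokens of a code fragment."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", code.lower()))


def alternate_query(query_tokens):
    """Expand the query with tokens that may implement the same behavior."""
    expanded = set(query_tokens)
    for tok in query_tokens:
        expanded |= RELATED.get(tok, set())
    return expanded


def search(query_code, corpus):
    """Rank corpus fragments by overlap with the expanded query tokens."""
    expanded = alternate_query(code_tokens(query_code))
    hits = []
    for name, code in corpus.items():
        overlap = len(expanded & code_tokens(code))
        if overlap:
            hits.append((overlap, name))
    return [name for overlap, name in sorted(hits, key=lambda h: (-h[0], h[1]))]


corpus = {
    "A": "new BufferedReader(new FileReader(path)).readLine()",
    "B": "Math.sqrt(x)",
}
print(search("new FileInputStream(f).read()", corpus))
```

Matching on the query's own tokens alone, fragment A would share only the keyword `new` with the query; the expansion step is what surfaces the shared file-reading behavior.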

107 citations

Proceedings Article•DOI•

TBar: revisiting template-based automated program repair

Kui Liu¹, Anil Koyuncu¹, Dongsun Kim¹, Tegawendé F. Bissyandé¹•Institutions (1)

University of Luxembourg¹

10 Jul 2019

TL;DR: TBar is a template-based APR tool that systematically attempts to apply fix patterns collected from the literature to program bugs; assuming perfect fault localization, it correctly/plausibly fixes 74/101 bugs.

103 citations

Cited by

Journal Article•DOI•

Subset Selection in Regression

Anthony B. Atkinson¹•Institutions (1)

London School of Economics and Political Science¹

01 Jan 1992-Journal of The Royal Statistical Society Series A-statistics in Society

TL;DR: A review of Subset Selection in Regression by A. J. Miller (Monographs on Statistics and Applied Probability, no. 40; Chapman and Hall, London, 1990).

Abstract: 8. Subset Selection in Regression (Monographs on Statistics and Applied Probability, no. 40). By A. J. Miller. ISBN 0 412 35380 6. Chapman and Hall, London, 1990. 240 pp. £25.00.

1,154 citations

Proceedings Article•DOI•

A practical guide for using statistical tests to assess randomized algorithms in software engineering

Andrea Arcuri¹, Lionel C. Briand¹•Institutions (1)

Simula Research Laboratory¹

21 May 2011

TL;DR: It is shown that randomized algorithms are used in a significant percentage of papers but that, in most cases, randomness is not properly accounted for, which casts doubts on the validity of most empirical results assessing randomized algorithms.

Abstract: Randomized algorithms have been used to successfully address many different types of software engineering problems. This type of algorithm employs a degree of randomness as part of its logic. Randomized algorithms are useful for difficult problems where a precise solution cannot be derived in a deterministic way within reasonable time. However, randomized algorithms produce different results on every run when applied to the same problem instance. It is hence important to assess the effectiveness of randomized algorithms by collecting data from a large enough number of runs. The use of rigorous statistical tests is then essential to provide support to the conclusions derived by analyzing such data. In this paper, we provide a systematic review of the use of randomized algorithms in selected software engineering venues in 2009. Its goal is not to perform a complete survey but to get a representative snapshot of current practice in software engineering research. We show that randomized algorithms are used in a significant percentage of papers but that, in most cases, randomness is not properly accounted for. This casts doubts on the validity of most empirical results assessing randomized algorithms. There are numerous statistical tests, based on different assumptions, and it is not always clear when and how to use these tests. We hence provide practical guidelines to support empirical research on randomized algorithms in software engineering.
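The point about collecting data from many runs can be made concrete with a small sketch. The toy `random_search` function, the run count of 30, and the pairwise effect-size measure are all illustrative assumptions; the abstract does not name specific tests or thresholds.

```python
import random
import statistics


def random_search(seed, budget):
    """Toy randomized algorithm: best of `budget` uniform samples in [0, 1)."""
    rng = random.Random(seed)  # one independent seed per run
    return max(rng.random() for _ in range(budget))


def effect_size(xs, ys):
    """Share of (x, y) pairs where x beats y, counting ties as half.

    0.5 means no difference; values near 1.0 mean xs usually wins.
    """
    wins = sum((x > y) + 0.5 * (x == y) for x in xs for y in ys)
    return wins / (len(xs) * len(ys))


def compare(runs=30):
    """Run two configurations `runs` times each and summarize the results."""
    small = [random_search(seed, budget=10) for seed in range(runs)]
    large = [random_search(seed + 1000, budget=100) for seed in range(runs)]
    return {
        "median_small": statistics.median(small),
        "median_large": statistics.median(large),
        "effect_size": effect_size(large, small),
        "runs": len(small),
    }


print(compare())
```

A single run of each configuration could favor either one by chance; repeating the runs and reporting a distribution-level summary is what makes the comparison defensible.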

859 citations

Journal Article•DOI•

A Hitchhiker's guide to statistical tests for assessing randomized algorithms in software engineering

Andrea Arcuri¹, Lionel C. Briand²•Institutions (2)

Simula Research Laboratory¹, University of Luxembourg²

01 May 2014-Software Testing, Verification & Reliability

TL;DR: This paper provides guidelines on how to carry out and properly analyse randomized algorithms applied to solve software engineering tasks, with a particular focus on software testing, which is by far the most frequent application area of randomized algorithms within software engineering.

Abstract: Randomized algorithms are widely used to address many types of software engineering problems, especially in the area of software verification and validation with a strong emphasis on test automation. However, randomized algorithms are affected by chance and so require the use of appropriate statistical tests to be properly analysed in a sound manner. This paper features a systematic review regarding recent publications in 2009 and 2010 showing that, overall, empirical analyses involving randomized algorithms in software engineering tend to not properly account for the random nature of these algorithms. Many of the novel techniques presented clearly appear promising, but the lack of soundness in their empirical evaluations casts unfortunate doubts on their actual usefulness. In software engineering, although there are guidelines on how to carry out empirical analyses involving human subjects, those guidelines are not directly and fully applicable to randomized algorithms. Furthermore, many of the textbooks on statistical analysis are written from the viewpoints of social and natural sciences, which present different challenges from randomized algorithms. To address the questionable overall quality of the empirical analyses reported in the systematic review, this paper provides guidelines on how to carry out and properly analyse randomized algorithms applied to solve software engineering tasks, with a particular focus on software testing, which is by far the most frequent application area of randomized algorithms within software engineering. Copyright © 2012 John Wiley & Sons, Ltd.

510 citations

Proceedings Article•DOI•

Automatic patch generation by learning correct code

Fan Long¹, Martin Rinard¹•Institutions (1)

Massachusetts Institute of Technology¹

11 Jan 2016

TL;DR: Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.

Abstract: We present Prophet, a novel patch generation system that works with a set of successful human patches obtained from open-source software repositories to learn a probabilistic, application-independent model of correct code. It generates a space of candidate patches, uses the model to rank the candidate patches in order of likely correctness, and validates the ranked patches against a suite of test cases to find correct patches. Experimental results show that, on a benchmark set of 69 real-world defects drawn from eight open-source projects, Prophet significantly outperforms the previous state-of-the-art patch generation system.
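The generate, rank, and validate steps in this abstract can be sketched on a toy problem. The candidate set, the hand-set `model_score` weights, and the test suite below are hypothetical; in particular, `model_score` is only a stand-in for a learned probabilistic model and is not Prophet's.

```python
def generate_candidates():
    """Candidate patches for a buggy abs_diff(a, b) that returned a - b."""
    return {
        "return a - b": lambda a, b: a - b,
        "return b - a": lambda a, b: b - a,
        "return abs(a - b)": lambda a, b: abs(a - b),
        "return 0": lambda a, b: 0,
    }


def model_score(patch_text):
    """Stand-in for a learned correctness model (hand-set weights)."""
    score = 0.0
    if "abs(" in patch_text:
        score += 1.0   # assumed feature: uses a helper matching the intent
    if patch_text.strip() == "return 0":
        score -= 1.0   # assumed feature: constant return is suspicious
    return score


# Test suite used for validation: ((a, b), expected result).
TESTS = [((3, 1), 2), ((1, 3), 2), ((2, 2), 0)]


def passes_all(fn):
    """Validate one candidate against every test case."""
    return all(fn(*args) == expected for args, expected in TESTS)


def repair():
    """Try candidates in model-ranked order; return the first that validates."""
    candidates = generate_candidates()
    ranked = sorted(candidates, key=model_score, reverse=True)
    for text in ranked:
        if passes_all(candidates[text]):
            return text
    return None


print(repair())
```

Ranking before validation matters because running the test suite is the expensive step: a good ranking means fewer candidates are executed before a passing one is found.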

495 citations

Proceedings Article•DOI•

Angelix: scalable multiline program patch synthesis via symbolic analysis

Sergey Mechtaev¹, Jooyong Yi¹, Abhik Roychoudhury¹•Institutions (1)

National University of Singapore¹

14 May 2016

TL;DR: Angelix is a novel semantics-based repair method that scales up to programs of similar size as are handled by search-based repair tools such as GenProg and SPR, and is more scalable than previously proposed semantics-based repair methods such as SemFix and DirectFix.

Abstract: Since debugging is a time-consuming activity, automated program repair tools such as GenProg have garnered interest. A recent study revealed that the majority of GenProg repairs avoid bugs simply by deleting functionality. We found that SPR, a state-of-the-art repair tool proposed in 2015, still deletes functionality in their many "plausible" repairs. Unlike generate-and-validate systems such as GenProg and SPR, semantic analysis based repair techniques synthesize a repair based on semantic information of the program. While such semantics-based repair methods show promise in terms of quality of generated repairs, their scalability has been a concern so far. In this paper, we present Angelix, a novel semantics-based repair method that scales up to programs of similar size as are handled by search-based repair tools such as GenProg and SPR. This shows that Angelix is more scalable than previously proposed semantics based repair methods such as SemFix and DirectFix. Furthermore, our repair method can repair multiple buggy locations that are dependent on each other. Such repairs are hard to achieve using SPR and GenProg. In our experiments, Angelix generated repairs from large-scale real-world software such as wireshark and php, and these generated repairs include multi-location repairs. We also report our experience in automatically repairing the well-known Heartbleed vulnerability.

464 citations
