Home
/
Authors
/
Marco Tulio Valente

Author

Marco Tulio Valente

Other affiliations: The Catholic University of America, Centro Federal de Educação Tecnológica de Minas Gerais, Pontifícia Universidade Católica de Minas Gerais

Bio: Marco Tulio Valente is an academic researcher from Universidade Federal de Minas Gerais. The author has contributed to research in topics: Code refactoring & Source code. The author has an hindex of 32, co-authored 171 publications receiving 3476 citations. Previous affiliations of Marco Tulio Valente include The Catholic University of America & Centro Federal de Educação Tecnológica de Minas Gerais.

Topics: Code refactoring, Source code, Software development, Software system, JavaScript ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2003

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Why we refactor? confessions of GitHub contributors

[...]

Danilo Silva¹, Nikolaos Tsantalis², Marco Tulio Valente¹•Institutions (2)

Universidade Federal de Minas Gerais¹, Concordia University²

01 Nov 2016

TL;DR: This work monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked developers to explain the reasons behind their decision to refactor the code, compiling a catalogue of 44 distinct motivations for 12 well-known refactoring types.

...read moreread less

Abstract: Refactoring is a widespread practice that helps developers to improve the maintainability and readability of their code. However, there is a limited number of studies empirically investigating the actual motivations behind specific refactoring operations applied by developers. To fill this gap, we monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked the developers to explain the reasons behind their decision to refactor the code. By applying thematic analysis on the collected responses, we compiled a catalogue of 44 distinct motivations for 12 well-known refactoring types. We found that refactoring activity is mainly driven by changes in the requirements and much less by code smells. Extract Method is the most versatile refactoring operation serving 11 different purposes. Finally, we found evidence that the IDE used by the developers affects the adoption of automated refactoring tools.

...read moreread less

200 citations

Journal Article•DOI•

What’s in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform

[...]

Hudson Borges¹, Marco Tulio Valente¹•Institutions (1)

Universidade Federal de Minas Gerais¹

01 Dec 2018-Journal of Systems and Software

TL;DR: A throughout study on the meaning, characteristics, and dynamic growth of GitHub stars is provided and a list of recommendations to open source project managers and GitHub users and Software Engineering researchers is provided.

...read moreread less

161 citations

Journal Article•DOI•

What's in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform

[...]

Hudson Borges¹, Marco Tulio Valente¹•Institutions (1)

Universidade Federal de Minas Gerais¹

19 Nov 2018-arXiv: Software Engineering

TL;DR: In this paper, the authors provide a study on the meaning, characteristics, and dynamic growth of GitHub stars and propose four patterns to describe stars growth, which are derived after clustering the time series representing the number of stars of the studied repositories.

...read moreread less

Abstract: Besides a git-based version control system, GitHub integrates several social coding features Particularly, GitHub users can star a repository, presumably to manifest interest or satisfaction with an open source project However, the real and practical meaning of starring a project was never the subject of an in-depth and well-founded empirical investigation Therefore, we provide in this paper a throughout study on the meaning, characteristics, and dynamic growth of GitHub stars First, by surveying 791 developers, we report that three out of four developers consider the number of stars before using or contributing to a GitHub project Then, we report a quantitative analysis on the characteristics of the top-5,000 most starred GitHub repositories We propose four patterns to describe stars growth, which are derived after clustering the time series representing the number of stars of the studied repositories; we also reveal the perception of 115 developers about these growth patterns To conclude, we provide a list of recommendations to open source project managers (eg, on the importance of social media promotion) and to GitHub users and Software Engineering researchers (eg, on the risks faced when selecting projects by GitHub stars)

...read moreread less

147 citations

Proceedings Article•DOI•

Understanding the Factors That Impact the Popularity of GitHub Repositories

[...]

Hudson Borges¹, Andre Hora¹, Marco Tulio Valente¹•Institutions (1)

Universidade Federal de Minas Gerais¹

01 Oct 2016

TL;DR: Wang et al. as discussed by the authors describe a study on the popularity of software systems hosted at GitHub and identify four main patterns of popularity growth, which are derived after clustering the time series representing the number of stars of 2,279 popular GitHub repositories.

...read moreread less

Abstract: Software popularity is a valuable information to modern open source developers, who constantly want to know if their systems are attracting new users, if new releases are gaining acceptance, or if they are meeting user's expectations. In this paper, we describe a study on the popularity of software systems hosted at GitHub, which is the world's largest collection of open source software. GitHub provides an explicit way for users to manifest their satisfaction with a hosted repository: the stargazers button. In our study, we reveal the main factors that impact the number of stars of GitHub projects, including programming language and application domain. We also study the impact of new features on project popularity. Finally, we identify four main patterns of popularity growth, which are derived after clustering the time series representing the number of stars of 2,279 popular GitHub repositories. We hope our results provide valuable insights to developers and maintainers, which could help them on building and evolving systems in a competitive software market.

...read moreread less

143 citations

Proceedings Article•DOI•

RefDiff: detecting refactorings in version histories

[...]

Danilo Silva¹, Marco Tulio Valente¹•Institutions (1)

Universidade Federal de Minas Gerais¹

20 May 2017

TL;DR: An automated approach that identifies refactorings performed between two code revisions in a git repository, which suggests that RefDiff has superior precision and recall than existing state-of-the-art approaches.

...read moreread less

Abstract: Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is a valuable information to understand software evolution, adapt software components, merge code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has superior precision and recall than existing state-of-the-art approaches.

...read moreread less

113 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

物件導向軟體之架構(Object-Oriented Software Construction)探討

[...]

簡聰富

01 Dec 1989

4,898 citations

Journal Article•

An overview of AspectJ

[...]

Gregor Kiczales¹, Erik Hilsdale², Jim Hugunin², Mik Kersten², Jeffrey Palm², William G. Griswold³ - Show less +2 more•Institutions (3)

University of British Columbia¹, PARC², University of California³

01 Jan 2001-Lecture Notes in Computer Science

TL;DR: AspectJ as mentioned in this paper is a simple and practical aspect-oriented extension to Java with just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns.

...read moreread less

Abstract: Aspect] is a simple and practical aspect-oriented extension to Java With just a few new constructs, AspectJ provides support for modular implementation of a range of crosscutting concerns. In AspectJ's dynamic join point model, join points are well-defined points in the execution of the program; pointcuts are collections of join points; advice are special method-like constructs that can be attached to pointcuts; and aspects are modular units of crosscutting implementation, comprising pointcuts, advice, and ordinary Java member declarations. AspectJ code is compiled into standard Java bytecode. Simple extensions to existing Java development environments make it possible to browse the crosscutting structure of aspects in the same kind of way as one browses the inheritance structure of classes. Several examples show that AspectJ is powerful, and that programs written using it are easy to understand.

...read moreread less

2,947 citations

The C programming language

[...]

Brian W. Kernighan¹, Dennis M. Ritchie¹•Institutions (1)

AT&T¹

01 Jan 1978

TL;DR: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.), and is a "must-have" reference for every serious programmer's digital library.

...read moreread less

Abstract: This ebook is the first authorized digital version of Kernighan and Ritchie's 1988 classic, The C Programming Language (2nd Ed.). One of the best-selling programming books published in the last fifty years, "K&R" has been called everything from the "bible" to "a landmark in computer science" and it has influenced generations of programmers. Available now for all leading ebook platforms, this concise and beautifully written text is a "must-have" reference for every serious programmers digital library. As modestly described by the authors in the Preface to the First Edition, this "is not an introductory programming manual; it assumes some familiarity with basic programming concepts like variables, assignment statements, loops, and functions. Nonetheless, a novice programmer should be able to read along and pick up the language, although access to a more knowledgeable colleague will help."

...read moreread less

2,120 citations

Journal Article•

When is nearest neighbor meaningful

[...]

Kevin S. Beyer, Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft

01 Jan 1999-Lecture Notes in Computer Science

TL;DR: In this article, the authors explore the effect of dimensionality on the nearest neighbor problem and show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance of the farthest data point.

...read moreread less

Abstract: We explore the effect of dimensionality on the nearest neighbor problem. We show that under a broad set of conditions (much broader than independent and identically distributed dimensions), as dimensionality increases, the distance to the nearest data point approaches the distance to the farthest data point. To provide a practical perspective, we present empirical results on both real and synthetic data sets that demonstrate that this effect can occur for as few as 10-15 dimensions. These results should not be interpreted to mean that high-dimensional indexing is never meaningful; we illustrate this point by identifying some high-dimensional workloads for which this effect does not occur. However, our results do emphasize that the methodology used almost universally in the database literature to evaluate high-dimensional indexing techniques is flawed, and should be modified. In particular, most such techniques proposed in the literature are not evaluated versus simple linear scan, and are evaluated over workloads for which nearest neighbor is not meaningful. Often, even the reported experiments, when analyzed carefully, show that linear scan would outperform the techniques being proposed on the workloads studied in high (10-15) dimensionality!.

...read moreread less

1,992 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse