scispace - formally typeset
Search or ask a question
Author

Filippo Lanubile

Bio: Filippo Lanubile is an academic researcher from University of Bari. The author has contributed to research in topics: Software development & Social software engineering. The author has an hindex of 34, co-authored 180 publications receiving 4516 citations. Previous affiliations of Filippo Lanubile include University of Maryland, College Park.


Papers
More filters
Journal ArticleDOI
TL;DR: The paper discusses the experience of the authors, based upon a collection of experiments, in terms of a framework for organizing sets of related studies, with specific emphasis on persistent problems encountered in experimental design, threats to validity, criteria for evaluation, and execution of experiments in the domain of software engineering.
Abstract: Experimentation in software engineering is necessary but difficult. One reason is that there are a large number of context variables and, so, creating a cohesive understanding of experimental results requires a mechanism for motivating studies and integrating results. It requires a community of researchers that can replicate studies, vary context variables, and build models that represent the common observations about the discipline. The paper discusses the experience of the authors, based upon a collection of experiments, in terms of a framework for organizing sets of related studies. With such a framework, experiments can be viewed as part of common families of studies, rather than being isolated events. Common families of studies can contribute to important and relevant hypotheses that may not be suggested by individual experiments. A framework also facilitates building knowledge in an incremental manner through the replication of experiments within families of studies. To support the framework, the paper discusses the experiences of the authors in carrying out empirical studies, with specific emphasis on persistent problems encountered in experimental design, threats to validity, criteria for evaluation, and execution of experiments in the domain of software engineering.

852 citations

Journal ArticleDOI
TL;DR: This article summarizes experiences and trends chosen from recent IEEE International Conference on Global Software Engineering (IGSCE) conferences.
Abstract: Software engineering involves people collaborating to develop better software. Collaboration is challenging, especially across time zones and without face-to-face meetings. We therefore use collaboration tools all along the product life cycle to let us work together, stay together, and achieve results together. This article summarizes experiences and trends chosen from recent IEEE International Conference on Global Software Engineering (IGSCE) conferences.

196 citations

Journal ArticleDOI
TL;DR: Senti4SD as mentioned in this paper is a classifier specifically trained to support sentiment analysis in developers' communication channels, which is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity.
Abstract: The role of sentiment analysis is increasingly emerging to study software developers' emotions by mining crowd-generated content within social software engineering tools. However, off-the-shelf sentiment analysis tools have been trained on non-technical domains and general-purpose social media, thus resulting in misclassifications of technical jargon and problem reports. Here, we present Senti4SD, a classifier specifically trained to support sentiment analysis in developers' communication channels. Senti4SD is trained and validated using a gold standard of Stack Overflow questions, answers, and comments manually annotated for sentiment polarity. It exploits a suite of both lexicon- and keyword-based features, as well as semantic features based on word embedding. With respect to a mainstream off-the-shelf tool, which we use as a baseline, Senti4SD reduces the misclassifications of neutral and positive posts as emotionally negative. To encourage replications, we release a lab package including the classifier, the word embedding space, and the gold standard with annotation guidelines.

176 citations

Proceedings ArticleDOI
01 Sep 2015
TL;DR: This paper aims at assessing the suitability of a state-of-the-art sentiment analysis tool, already applied in social computing, for detecting affective expressions in Stack Overflow, and verifying the construct validity of choosing sentiment polarity and strength as an appropriate way to operationalize affective states in empirical studies on Stack overflow.
Abstract: A recent research trend has emerged to study the role of affect in in the social programmer ecosystem, by applying sentiment analysis to the content available in sites such as GitHub and Stack Overflow. In this paper, we aim at assessing the suitability of a state-of-the-art sentiment analysis tool, already applied in social computing, for detecting affective expressions in Stack Overflow. We also aim at verifying the construct validity of choosing sentiment polarity and strength as an appropriate way to operationalize affective states in empirical studies on Stack Overflow. Finally, we underline the need to overcome the limitations induced by domain-dependent use of lexicon that may produce unreliable results.

125 citations

Journal ArticleDOI
TL;DR: This work applies program slicing, a program decomposition method, to the problem of extracting reusable functions from ill structured programs, and extends the definition of program slice to a transform slice, one that includes statements which contribute directly or indirectly to transform a set of input variables into aSet of output variables.
Abstract: An alternative approach to developing reusable components from scratch is to recover them from existing systems. We apply program slicing, a program decomposition method, to the problem of extracting reusable functions from ill structured programs. As with conventional slicing first described by M. Weiser (1984), a slice is obtained by iteratively solving data flow equations based on a program flow graph. We extend the definition of program slice to a transform slice, one that includes statements which contribute directly or indirectly to transform a set of input variables into a set of output variables. Unlike conventional program slicing, these statements do not include either the statements necessary to get input data or the statements which test the binding conditions of the function. Transform slicing presupposes the knowledge that a function is performed in the code and its partial specification, only in terms of input and output data. Using domain knowledge we discuss how to formulate expectations of the functions implemented in the code. In addition to the input/output parameters of the function, the slicing criterion depends on an initial statement, which is difficult to obtain for large programs. Using the notions of decomposition slice and concept validation we show how to produce a set of candidate functions, which are independent of line numbers but must be evaluated with respect to the expected behavior. Although human interaction is required, the limited size of candidate functions makes this task easier than looking for the last function instruction in the original source code.

119 citations


Cited by
More filters
Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for \"experimenters\") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the \"why,\" and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 2009

7,241 citations

Journal ArticleDOI

3,628 citations

Journal Article
TL;DR: This sales letter may not influence you to be smarter, but the book that this research methods in social relations will evoke you to being smarter.
Abstract: This sales letter may not influence you to be smarter, but the book that we offer will evoke you to be smarter. Yeah, at least you'll know more than others who don't. This is what called as the quality life improvisation. Why should this research methods in social relations? It's because this is your favourite theme to read. If you like this theme about, why don't you read the book to enrich your discussion?

2,382 citations