
National Research
Council Canada
Institute for
Information Technology
Conseil national
de recherches Canada
Institut de technologie
de l'information
Preliminary Guidelines for Empirical Research in
Software Engineering *
Kitchenham, B.A., Pfleeger, S.L., Pickard, L.M., Jones, P.W.,
Hoaglin, D.C., El-Emam, K., Rosenberg, J.
January 2001
* published as NRC/ERB-1082. January 2001. 27 pages. NRC 44158.
Copyright 2001 by
National Research Council of Canada
Permission is granted to quote short excerpts and to reproduce figures and tables from this report,
provided that the source of such material is fully acknowledged.


Preliminary guidelines for empirical research in software
engineering
Barbara A. Kitchenham*, Shari Lawrence Pfleeger**, Lesley M. Pickard*,
Peter W. Jones*, David C. Hoaglin***, Khaled El-Emam****,
and Jarrett Rosenberg*****
* Keele University, Keele, Staffordshire, UK
** Systems/Software, Inc., Washington, DC, USA
*** Abt Associates Inc., Cambridge, MA, USA
**** National Research Council of Canada, Ottawa, Ontario, Canada
***** Sun Microsystems, Palo Alto, CA, USA
Abstract
Empirical software engineering research needs research guidelines to improve the
research and reporting processes. We propose a preliminary set of research guidelines
aimed at stimulating discussion among software researchers. They are based on a review
of research guidelines developed for medical researchers and on our own experience in
doing and reviewing software engineering research. The guidelines are intended to assist
researchers, reviewers and meta-analysts in designing, conducting and evaluating
empirical studies. Editorial boards of software engineering journals may wish to use our
recommendations as a basis for developing guidelines for reviewers and for framing
policies for dealing with the design, data collection and analysis and reporting of
empirical studies.
Keywords: empirical software research; research guidelines; statistical mistakes.
1. Introduction
We have spent many years both undertaking empirical studies in software engineering
ourselves, and reviewing reports of empirical studies submitted to journals or presented
as postgraduate theses or dissertations. In our view, the standard of empirical software
engineering research is poor. This includes case studies, surveys and formal experiments,
whether observed in the field or in a laboratory or classroom. This statement is not a
criticism of software researchers in particular; many applied disciplines have problems
performing empirical studies. For example, Yancey [50] found that many articles in the
American Journal of Surgery (1987 and 1988) contained methodological errors serious
enough to invalidate the authors' conclusions. McGuigan [31] reviewed 164 papers
containing numerical results published in the British Journal of Psychiatry in 1993
and found that 40% of them had statistical errors. When Welch and Gabbe [48] reviewed
clinical articles in six issues of the American Journal of Obstetrics and Gynecology,
they found more than half the studies impossible to assess because the statistical
techniques used were not reported in sufficient detail. Furthermore, nearly one third
of the articles contained inappropriate uses of statistics. If researchers have
difficulty in a discipline such as medicine, which has a rich history of empirical
research, it is hardly surprising that software engineering researchers have problems.
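To make "inappropriate uses of statistics" concrete, consider one of the most common
mistakes in applied research: performing many significance tests without adjusting the
significance level. The minimal Python sketch below (our illustration, not drawn from
the cited reviews) shows how quickly the family-wise false-positive rate grows as
uncorrected tests accumulate:

    # Probability of at least one spurious "significant" result when running
    # k independent tests of true null hypotheses at significance level alpha.
    alpha = 0.05
    for k in (1, 5, 10, 20):
        familywise = 1 - (1 - alpha) ** k
        print(f"{k:2d} uncorrected tests -> P(>=1 false positive) = {familywise:.2f}")

With ten uncorrected tests, the chance of at least one false positive is already about
0.40, which is one reason reviewers need studies to report exactly which statistical
techniques were used.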
In a previous investigation of the use of meta-analysis in software engineering [34], three
of us identified the need to assess the quality of the individual studies included in a meta-
analysis. In this paper, we extend those ideas to discuss several guidelines that can be
used both to improve the quality of ongoing and proposed empirical studies and to
encourage critical assessment of existing studies. We believe that adoption of such
guidelines will not only improve the quality of individual studies but will also increase
the likelihood that we can use meta-analysis to combine the results of related
studies. The guidelines presented in this paper are a first attempt; wider debate is
needed before the software engineering research community can develop and agree on
definitive guidelines.
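To illustrate what combining results involves, the following minimal Python sketch
implements a fixed-effect (inverse-variance) meta-analysis, one standard way of
pooling effect estimates across studies; the three studies and their effect sizes
below are hypothetical:

    import math

    def fixed_effect_meta(effects, std_errors):
        # Weight each study by the inverse of its variance, so that more
        # precise studies contribute more to the pooled estimate.
        weights = [1.0 / se ** 2 for se in std_errors]
        pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
        pooled_se = math.sqrt(1.0 / sum(weights))
        return pooled, pooled_se

    # Hypothetical per-study effect estimates and standard errors.
    effects = [0.30, 0.12, 0.45]
    std_errors = [0.10, 0.08, 0.20]

    pooled, se = fixed_effect_meta(effects, std_errors)
    print(f"pooled effect = {pooled:.3f}, "
          f"95% CI = ({pooled - 1.96 * se:.3f}, {pooled + 1.96 * se:.3f})")

A pooled estimate like this is only meaningful when the individual studies are sound
and fully reported, which is precisely why guidelines for primary studies matter to
meta-analysts.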
Before we describe our guidelines, it may be helpful to understand who we are and
how we developed these guidelines. Kitchenham, Pickard, Pfleeger and El-Emam are
software engineering researchers with backgrounds in statistics as well as computer
science. We regularly review papers and dissertations, and we often participate in
empirical research. Rosenberg is a statistician who applies statistical methods to software
engineering problems. Jones is a medical statistician with experience in developing
standards for improving medical research studies. Hoaglin is a statistician who has long
been interested in software and computing. He reviewed eight papers published in
Transactions on Software Engineering in the last few years. These papers were not
chosen at random. Rather, they were selected (by those of us whose primary focus is
software engineering) because their authors are well-known for their empirical software
engineering work, and because their techniques are typical of papers submitted to this
journal. Hoaglin's independent comments on these papers confirmed our suspicions that
the current state of empirical studies as published in Transactions on Software
Engineering is similar to that found in medical studies. He found examples of poor
experimental design, inappropriate use of statistical techniques and conclusions that did
not follow from the reported results. We omit the titles of these papers. We want the
focus of our guidelines to be overall improvement of our discipline, not finger-pointing at
previous work. We do, however, cite papers that include specific statistical mistakes
when they help illustrate the reason that a particular guideline should be followed.
The main sources for this paper, apart from our own experience, are:
- The Yancey paper already mentioned. Yancey identifies ten rules for reading
  clinical research results. Many of the rules can also serve as guidelines for
  authors.
- A paper by Sacks et al. [43] that considers quality criteria for meta-analyses of
  randomized controlled trials. Sacks et al. point out that the quality of papers
  included in a meta-analysis is important. In particular, they suggest considering
  the quality of features such as the randomization process, the statistical
  analysis, and the handling of withdrawals.
- A paper on guidelines for contributors to journals by Altman [1].
- The guidelines for statistical review of general papers and clinical trials
  prepared by the British Medical Journal. (These guidelines are listed in Altman
  et al. [3], chapter 10 of Gardner and Altman [14], and on the journal's web page:
  http://www.bmj.com/advice)
- A book by Lang and Secic [28] with guidelines for reporting medical statistics.
- The CONSORT statement on reporting the results of randomized trials in medicine
  [4]. This statement has been adopted by seventy medical journals.
