Showing papers in "Software Quality Journal in 2006"


Journal ArticleDOI
TL;DR: This paper first reviews existing usability standards and models, highlighting the limitations and complementarities of the various standards, and then explains how these models can be unified into a single consolidated, hierarchical model of usability measurement.
Abstract: Usability is increasingly recognized as an important quality factor for interactive software systems, including traditional GUI-style applications, Web sites, and the large variety of mobile and PDA interactive services. Unusable user interfaces are probably the single largest reason why encompassing interactive systems (computers plus people) fail in actual use. Designing this diversity of applications so that they actually achieve their intended purposes in terms of ease of use is not an easy task. Although there are many individual methods for evaluating usability, they are not well integrated into a single conceptual framework that facilitates their usage by developers who are not trained in the field of HCI. This is true in part because there are now several different standards (e.g., ISO 9241, ISO/IEC 9126, IEEE Std. 610.12) or conceptual models (e.g., Metrics for Usability Standards in Computing [MUSiC]) for usability, and not all of these standards or models describe the same operational definitions and measures. This paper first reviews existing usability standards and models, highlighting the limitations and complementarities of the various standards. It then explains how these various models can be unified into a single consolidated, hierarchical model of usability measurement. This consolidated model is called Quality in Use Integrated Measurement (QUIM). Included in the QUIM model are 10 factors, each of which corresponds to a specific facet of usability that is identified in an existing standard or model. These 10 factors are decomposed into a total of 26 sub-factors, or measurable criteria, that are further decomposed into 127 specific metrics. The paper also explains how a consolidated model, such as QUIM, can help in developing a usability measurement theory.

630 citations
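
The decomposition described above lends itself to a simple hierarchical data structure. The sketch below is illustrative only; the factor, criterion and metric names are placeholders, not the paper's actual catalogue of 10 factors, 26 criteria and 127 metrics.

```python
# Illustrative sketch of a QUIM-style hierarchy: factors decompose into
# measurable criteria, which decompose into concrete metrics.
from dataclasses import dataclass, field

@dataclass
class Metric:
    name: str
    unit: str

@dataclass
class Criterion:
    name: str
    metrics: list = field(default_factory=list)

@dataclass
class Factor:
    name: str
    criteria: list = field(default_factory=list)

# Hypothetical fragment of such a hierarchy (names are placeholders).
efficiency = Factor("efficiency", [
    Criterion("time behaviour", [Metric("task completion time", "seconds")]),
    Criterion("resource utilisation", [Metric("clicks per task", "count")]),
])

def metric_count(factor: Factor) -> int:
    """Count the metrics reachable from a factor."""
    return sum(len(c.metrics) for c in factor.criteria)

print(metric_count(efficiency))  # -> 2
```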


Journal ArticleDOI
TL;DR: This paper renders the notion of adequate identifier naming far more precise: a formal model, based on bijective mappings between concepts and names, provides a solid foundation for the definition of precise rules for concise and consistent naming.
Abstract: Approximately 70% of the source code of a software system consists of identifiers. Hence, the names chosen as identifiers are of paramount importance for the readability of computer programs and therewith their comprehensibility. However, virtually every programming language allows programmers to use almost arbitrary sequences of characters as identifiers which far too often results in more or less meaningless or even misleading naming. Coding style guides somehow address this problem but are usually limited to general and hard to enforce rules like "identifiers should be self-describing". This paper renders adequate identifier naming far more precisely. A formal model, based on bijective mappings between concepts and names, provides a solid foundation for the definition of precise rules for concise and consistent naming. The enforcement of these rules is supported by a tool that incrementally builds and maintains a complete identifier dictionary while the system is being developed. The identifier dictionary explains the language used in the software system, aids in consistent naming, and supports programmers by proposing suitable names depending on the current context.

303 citations
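
A minimal sketch of the idea behind such an identifier dictionary, under the simplifying assumption that each concept and each name is a plain string: the bijective mapping is violated either by a synonym (one concept bound to two names) or by a homonym (one name bound to two concepts), and both are flagged.

```python
# Illustrative sketch of a bijective concept<->name identifier dictionary.
class IdentifierDictionary:
    def __init__(self):
        self.concept_to_name = {}
        self.name_to_concept = {}

    def register(self, concept: str, name: str) -> list:
        """Register a binding and return any naming-consistency warnings."""
        warnings = []
        if self.concept_to_name.get(concept, name) != name:
            warnings.append(f"synonym: concept '{concept}' already named "
                            f"'{self.concept_to_name[concept]}'")
        if self.name_to_concept.get(name, concept) != concept:
            warnings.append(f"homonym: name '{name}' already denotes "
                            f"'{self.name_to_concept[name]}'")
        self.concept_to_name.setdefault(concept, name)
        self.name_to_concept.setdefault(name, concept)
        return warnings

d = IdentifierDictionary()
d.register("customer account", "account")
print(d.register("customer account", "acct"))   # synonym warning
print(d.register("user login", "account"))      # homonym warning
```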


Journal ArticleDOI
TL;DR: This paper reviews some of the key cognitive theories of program comprehension that have emerged over the past thirty years and explores how tools that are commonly used today have evolved to support program comprehension.
Abstract: Program comprehension research can be characterized by both the theories that provide rich explanations about how programmers understand software, as well as the tools that are used to assist in comprehension tasks. In this paper, I review some of the key cognitive theories of program comprehension that have emerged over the past thirty years. Using these theories as a canvas, I then explore how tools that are commonly used today have evolved to support program comprehension. Specifically, I discuss how the theories and tools are related and reflect on the research methods that were used to construct the theories and evaluate the tools. The reviewed theories and tools are distinguished according to human characteristics, program characteristics, and the context for the various comprehension tasks. Finally, I predict how these characteristics will change in the future and speculate on how a number of important research directions could lead to improvements in program comprehension tool development and research methods.

132 citations


Journal ArticleDOI
TL;DR: The CBR system using the Mahalanobis distance similarity function and the inverse distance weighted solution algorithm yielded the best fault prediction, and the CBR models have better performance than models based on multiple linear regression.
Abstract: The resources allocated for software quality assurance and improvement have not increased with the ever-increasing need for better software quality. A targeted software quality inspection can detect faulty modules and reduce the number of faults occurring during operations. We present a software fault prediction modeling approach with case-based reasoning (CBR), a part of the computational intelligence field focusing on automated reasoning processes. A CBR system functions as a software fault prediction model by quantifying, for a module under development, the expected number of faults based on similar modules that were previously developed. Such a system is composed of a similarity function, the number of nearest neighbor cases used for fault prediction, and a solution algorithm. The selection of a particular similarity function and solution algorithm may affect the performance accuracy of a CBR-based software fault prediction system. This paper presents an empirical study investigating the effects of using three different similarity functions and two different solution algorithms on the prediction accuracy of our CBR system. The influence of varying the number of nearest neighbor cases on the performance accuracy is also explored. Moreover, the benefits of using metric-selection procedures for our CBR system are also evaluated. Case studies of a large legacy telecommunications system are used for our analysis. It is observed that the CBR system using the Mahalanobis distance similarity function and the inverse distance weighted solution algorithm yielded the best fault prediction. In addition, the CBR models have better performance than models based on multiple linear regression.

97 citations
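
A minimal sketch of this kind of case-based prediction (not the authors' implementation): the fault count of a module under development is estimated from its k nearest previously developed modules, with similarity measured by Mahalanobis distance and the neighbours' fault counts combined by inverse-distance weighting. The module metrics and fault counts below are hypothetical.

```python
# Sketch of CBR-style fault prediction with Mahalanobis distance similarity
# and an inverse-distance-weighted solution algorithm.
import numpy as np

def mahalanobis(x, y, inv_cov):
    d = x - y
    return float(np.sqrt(d @ inv_cov @ d))

def predict_faults(query, case_metrics, case_faults, k=3, eps=1e-9):
    inv_cov = np.linalg.pinv(np.cov(case_metrics, rowvar=False))
    dists = np.array([mahalanobis(query, c, inv_cov) for c in case_metrics])
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + eps)        # inverse-distance weighting
    return float(np.dot(weights, case_faults[nearest]) / weights.sum())

# Hypothetical case base: rows are module metrics, e.g. [LOC, cyclomatic complexity]
cases = np.array([[120, 8], [300, 22], [80, 4], [500, 35], [200, 15]], dtype=float)
faults = np.array([1, 5, 0, 9, 3], dtype=float)
print(predict_faults(np.array([250.0, 18.0]), cases, faults))
```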


Journal ArticleDOI
TL;DR: Six critical factors are identified and an instrument is developed and validated so as to measure the customer’s perception of quality management in the software industry.
Abstract: Most of the available literature on quality management is based on management's perception; few studies examine critical issues of quality management from the customer's perspective, especially in the software industry. In order to gain insight into what customers expect from a product/service, an analysis of quality management from the customer's point of view is essential. Such an understanding would help managers adopt strategies that can enhance the satisfaction level of their customers. The present study highlights the critical factors of quality management in the software industry from the customer's perspective. Six critical factors are identified, and an instrument comprising these factors is developed and validated to measure the customer's perception of quality management in the software industry.

52 citations


Journal ArticleDOI
TL;DR: In this paper, the authors report on a combined experiment in which they try to identify crosscutting concerns in the JHotDraw framework automatically using three independently developed aspect mining techniques; they present three interesting combinations of these techniques and show how the combinations provide a more complete coverage of the detected concerns.
Abstract: Understanding a software system at source-code level requires understanding the different concerns that it addresses, which in turn requires a way to identify these concerns in the source code. Whereas some concerns are explicitly represented by program entities (like classes, methods and variables) and thus are easy to identify, crosscutting concerns are not captured by a single program entity but are scattered over many program entities and are tangled with the other concerns. Because of their crosscutting nature, such crosscutting concerns are difficult to identify, and reduce the understandability of the system as a whole. In this paper, we report on a combined experiment in which we try to identify crosscutting concerns in the JHotDraw framework automatically. We first apply three independently developed aspect mining techniques to JHotDraw and evaluate and compare their results. Based on this analysis, we present three interesting combinations of these three techniques, and show how these combinations provide a more complete coverage of the detected concerns as compared to the original techniques individually. Our results are a first step towards improving the understandability of a system that contains crosscutting concerns, and can be used as a basis for refactoring the identified crosscutting concerns into aspects.

49 citations
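
The combination step can be pictured as simple set operations over the candidate concerns each technique reports. The sketch below is illustrative only; the concern names and the three combination rules (union, intersection, majority vote) are generic stand-ins, not the paper's actual techniques or combinations.

```python
# Illustrative combination of candidate crosscutting concerns reported by
# three hypothetical aspect mining techniques (concern names are made up).
technique_a = {"undo", "persistence", "observer"}
technique_b = {"undo", "observer", "command"}
technique_c = {"persistence", "observer", "command"}

candidates = [technique_a, technique_b, technique_c]

union        = set.union(*candidates)                      # broad coverage
intersection = set.intersection(*candidates)               # high confidence
majority     = {c for c in union
                if sum(c in s for s in candidates) >= 2}   # at least 2 of 3

print(union, intersection, majority, sep="\n")
```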


Journal ArticleDOI
TL;DR: This paper provides practical insight into the usability of SPC for the selected metrics in the specific processes and describes observations on the difficulties and benefits of applying SPC to an emergent software organization.
Abstract: Common wisdom in the domain of software engineering tells us that companies should be mature enough to apply Statistical Process Control (SPC) techniques. Since reaching high maturity levels (in CMM or similar models such as ISO 15504) usually takes 5–10 years, should software companies wait years to utilize Statistical Process Control techniques? To answer this question, we performed a case study of the application of SPC techniques using existing measurement data in an emergent software organization. Specifically, defect density, rework percentage and inspection performance metrics are analyzed. This paper provides practical insight into the usability of SPC for the selected metrics in the specific processes and describes our observations on the difficulties and the benefits of applying SPC to an emergent software organization.

39 citations
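
As an example of the kind of SPC technique involved, the sketch below computes control limits for an individuals/moving-range (XmR) chart over hypothetical defect-density data; the constant 2.66 is the standard XmR factor. This is a generic illustration, not the study's actual charts or data.

```python
# Sketch of an individuals/moving-range (XmR) control chart for a metric
# such as defect density.
def xmr_limits(values):
    mean = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    ucl = mean + 2.66 * mr_bar
    lcl = max(0.0, mean - 2.66 * mr_bar)   # defect density cannot be negative
    return lcl, mean, ucl

# Hypothetical defect densities (defects/KLOC) from successive inspections
densities = [3.1, 2.7, 3.4, 2.9, 5.8, 3.0, 2.6]
lcl, centre, ucl = xmr_limits(densities)
signals = [x for x in densities if x > ucl or x < lcl]
print(f"LCL={lcl:.2f} CL={centre:.2f} UCL={ucl:.2f} out-of-control={signals}")
```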


Journal ArticleDOI
TL;DR: Key criteria and guidelines for the effective layout of UML class and sequence diagrams from the perspective of perceptual theories are presented.
Abstract: UML class and sequence diagrams are helpful for understanding the static structure and dynamic behavior of a software system. Algorithms and tools have been developed to generate these UML diagrams automatically for program understanding purposes. Many tools, however, often ignore perceptual factors in the layout of these diagrams. Therefore, users still have to spend much time and effort rearranging boxes and lines to make the diagram understandable. This article presents key criteria and guidelines for the effective layout of UML class and sequence diagrams from the perspective of perceptual theories. Two UML tools are evaluated to illustrate how the criteria can be applied to assess the readability of their generated diagrams.

37 citations


Journal ArticleDOI
TL;DR: The ISO/IEC 14598-5 standard is already used as a methodology basis for evaluating software products; this article explores how it can be combined with the CMMI to produce a methodology that can be tailored for process evaluation, helping small organizations improve their software processes.
Abstract: Many small software organizations have recognized the need to improve their software product. Evaluating the software product alone seems insufficient, since it is known that its quality is largely dependent on the process that is used to create it. Thus, small organizations are asking for evaluation of their software processes and products. The ISO/IEC 14598-5 standard is already used as a methodology basis for evaluating software products. This article explores how it can be combined with the CMMI to produce a methodology that can be tailored for process evaluation, in order to help these organizations improve their software processes.

27 citations


Journal ArticleDOI
TL;DR: This paper presents an innovative quantitative method of setting technical targets in SQFD that enables analysis of the impact of unachieved target values on customer satisfaction; it improves on existing quantitative methods, which are based only on linear regression.
Abstract: Target setting in software quality function deployment (SQFD) is very important, since it is directly related to the development of high-quality products with high customer satisfaction. However, target setting is usually done subjectively in practice, which is not scientific. Two quantitative approaches for setting target values, benchmarking and primitive linear regression, have been developed and applied in the past to overcome this problem (Akao and Yoji, 1990). But these approaches cannot be used to assess the impact of unachieved targets on satisfaction of customers for customer requirements. In addition, both of them are based on linear regression and not very practical in many applications. In this paper, we present an innovative quantitative method of setting technical targets in SQFD to enable analysis of the impact of unachieved target values on customer satisfaction. It is based on assessment of the impact of technical attributes on satisfaction of customer requirements. In addition, both linear and non-linear regression techniques are utilized in our method, which improves on the existing quantitative methods that are based only on linear regression.

24 citations
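
A minimal sketch of the underlying idea, with hypothetical data: fit the relationship between a technical attribute and customer satisfaction (linearly and non-linearly), then use the fitted curve to estimate how much satisfaction is lost when a target value is not achieved. The paper's actual models and data differ.

```python
# Illustrative regression-based impact analysis for SQFD target setting.
import numpy as np

# Hypothetical data: technical attribute value (e.g. response time in ms,
# lower is better) versus surveyed customer satisfaction (1-5 scale).
attribute    = np.array([100, 200, 300, 400, 500, 600], dtype=float)
satisfaction = np.array([4.8, 4.5, 4.0, 3.2, 2.1, 1.5])

linear    = np.poly1d(np.polyfit(attribute, satisfaction, deg=1))
quadratic = np.poly1d(np.polyfit(attribute, satisfaction, deg=2))  # non-linear fit

target, achieved = 250.0, 350.0   # hypothetical target vs. achieved value
impact = quadratic(target) - quadratic(achieved)
print(f"Estimated satisfaction lost by missing the target: {impact:.2f} points")
```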


Journal ArticleDOI
TL;DR: Proposed XML-based Service Level Agreement (SLA) languages are reviewed as a means of providing quality assurances in machine-readable ways and interesting research proposals for proactively ensuring that good quality of service is obtained are reviewed.
Abstract: Web Services technologies and their supporting collection of de facto standards are now reaching the point of maturity where they are appearing in production software systems. Service Oriented Architectures (SOAs) using Web Services as an enabling technology are also being discussed widely in the IT press. However, despite the numerous and real advantages of these architectural patterns there are still many software quality challenges that remain unresolved. This is particularly true as we consider more advanced architectures that exploit the technology to its maximum advantage: utility computing and on-demand service discovery and composition, grid computing and multi-agent systems will only become pervasive once the software quality challenges of real-world industrial applications have been addressed. In this paper potential quality issues such as performance, reliability and availability are addressed in terms of the quality assurances that might need to be provided to consumers of services. Proposed XML-based Service Level Agreement (SLA) languages are reviewed as a means of providing these quality assurances in machine-readable ways. We also discuss how SLAs might be automatically negotiated to enable automated, on-demand service discovery and composition. The next section of this paper addresses quality issues from a service provider's perspective. The providers of such services will need to ensure that SLA commitments are met and this poses interesting problems in terms of application management. Network quality of service is currently addressed through such means as IntServ and DiffServ. Research proposals to introduce similar techniques at an application level are described. From the service consumer's perspective, interesting research proposals for proactively ensuring that good quality of service is obtained are also reviewed. These could be particularly important for creating confidence, from a consumer's perspective, in these architectures. Finally, the paper evaluates the challenges and suggests areas where further research is most urgently required.
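
As a rough illustration of the kind of machine-readable assurance such SLA languages express, the sketch below represents two hypothetical service level objectives and checks measured quality-of-service values against them; it does not follow any specific SLA language or schema.

```python
# Illustrative sketch of checking measured QoS against service level objectives.
from dataclasses import dataclass

@dataclass
class ServiceLevelObjective:
    metric: str
    operator: str      # ">=" or "<="
    threshold: float

    def satisfied_by(self, measured: float) -> bool:
        if self.operator == ">=":
            return measured >= self.threshold
        return measured <= self.threshold

# Hypothetical SLA and measurements
sla = [
    ServiceLevelObjective("availability_pct", ">=", 99.5),
    ServiceLevelObjective("p95_response_ms", "<=", 800.0),
]
measurements = {"availability_pct": 99.7, "p95_response_ms": 950.0}

violations = [o.metric for o in sla if not o.satisfied_by(measurements[o.metric])]
print(violations)   # -> ['p95_response_ms']
```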

Journal ArticleDOI
TL;DR: The rationale for the overall approach is described, evidence of its appropriateness is provided through a concrete empirical study that analyses the COCOMO II DOCU cost driver, and an aggregation mechanism for the selection of input values based on existing data is described.
Abstract: Parametric cost estimation models are widely used effort prediction tools for software development projects. These are mathematical models that use specific values of relevant cost drivers as inputs. The selection of these inputs is, in many cases, driven by public prescriptive rules that determine the selection of the values. Nonetheless, such selection may in some cases be restrictive and somewhat contradictory with empirical evidence; in other cases the selection procedure is subject to ambiguity. This paper presents an approach to improve the quality of the selection of adequate cost driver values in parametric models through a process of adjustment to bodies of empirical evidence. The approach has two essential elements. Firstly, it proceeds by analyzing the diverse factors potentially affecting the values a cost driver input might adopt for a given project. Secondly, an aggregation mechanism for the selection of input values based on existing data is explicitly devised. This paper describes the rationale for the overall approach and provides evidence of its appropriateness through a concrete empirical study that analyses the COCOMO II DOCU cost driver.
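
For context, cost driver values such as DOCU enter the COCOMO II post-architecture effort equation as effort multipliers; the constants shown below are approximately those of the published COCOMO II.2000 calibration.

```latex
% COCOMO II post-architecture effort equation (approximate COCOMO II.2000 calibration)
\mathit{PM} \;=\; A \cdot \mathit{Size}^{\,E} \cdot \prod_{i=1}^{17} \mathit{EM}_i,
\qquad
E \;=\; B + 0.01 \sum_{j=1}^{5} \mathit{SF}_j,
\qquad
A \approx 2.94,\; B \approx 0.91
```

Here Size is measured in KSLOC, the EM_i are the effort multipliers (DOCU among them) and the SF_j are the five scale factors. The paper's contribution concerns how the value of a multiplier such as DOCU is selected from empirical evidence, not the equation itself.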

Journal ArticleDOI
TL;DR: The study illustrates how productivity rates might be misleading unless these factors are taken into account and how confidence about the company's performance can be expressed in terms of Bayesian confidence intervals for the ratio of the arithmetic means of the two data sets.
Abstract: A productivity benchmarking case study is presented. Empirically valid evidence exists to suggest that certain project factors, such as development type and language type, influence project effort and productivity, and a comparison is made taking these and other factors into account. The case study identifies a reasonably comparable set of data that was taken from a large benchmarking data repository by using the factors. This data set was then compared with the small data set presented by a company for benchmarking. The study illustrates how productivity rates might be misleading unless these factors are taken into account. Further, rather than simply giving a ratio for the company's productivity performance against the benchmark, the study shows how confidence about the company's performance can be expressed in terms of Bayesian confidence (credible) intervals for the ratio of the arithmetic means of the two data sets.
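
A minimal sketch of one way such a credible interval can be approximated (not necessarily the paper's exact Bayesian model), assuming hypothetical productivity data and a normal approximation to the posterior of each mean:

```python
# Approximate credible interval for the ratio of two means via posterior sampling.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical productivity values (e.g. function points per person-month)
company   = np.array([9.2, 11.5, 8.7, 10.1, 12.3])
benchmark = np.array([7.8, 8.1, 9.0, 6.9, 8.4, 7.5, 9.3, 8.8])

def posterior_mean_draws(x, n_draws=100_000):
    # Normal approximation to the posterior of the mean (flat prior)
    return rng.normal(x.mean(), x.std(ddof=1) / np.sqrt(len(x)), n_draws)

ratio = posterior_mean_draws(company) / posterior_mean_draws(benchmark)
lo, hi = np.percentile(ratio, [2.5, 97.5])
print(f"95% credible interval for the ratio of means: ({lo:.2f}, {hi:.2f})")
```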

Journal ArticleDOI
TL;DR: A new technique that end-users’ quality assurance (QA) teams can employ to test the new version of a component in its application context by using the existing version as a baseline is presented.
Abstract: Advancement in reusable component technology has had a significant impact on the development of complex graphical user interfaces (GUIs), which are front-ends to most of today's software. Software developers can, with very little effort, integrate components into their software's GUI. Problems, however, arise when new versions of GUI components replace their predecessors in an implementation. Often, the inclusion of a new version of a component breaks some part of the software, i.e., tasks that end-users were able to perform before modifications were made can no longer be performed. Software developers (who also happen to be component users) are unable to perform adequate regression testing in this context because of several factors, including lack of source code, environmental side-effects on GUI rendering, event-driven nature of GUIs, and large number of possible permutations of events. This paper presents a new technique that end-users' quality assurance (QA) teams can employ to test the new version of a component in its application context by using the existing version as a baseline. The technique combines lightweight event-level dynamic profiling to collect user profiles in a transparent manner, GUI reverse engineering to extract the structure of the component's GUI, test case execution to replay the collected profiles on the new version, and GUI oracles that collect properties from the existing version. Empirical studies demonstrate the practicality, usefulness, and limitations of the technique.
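
A minimal sketch of the replay-and-compare idea, using a deliberately hypothetical component: a recorded event profile is replayed on the baseline and the new version, and properties collected from the baseline act as the oracle for the new version.

```python
# Illustrative replay-and-compare check between two versions of a component.
class CounterWidgetV1:
    """Hypothetical baseline component."""
    def __init__(self):
        self.value = 0

    def reset(self):
        self.value = 0

    def dispatch(self, event, target):
        if event == "click" and target == "increment":
            self.value += 1

    def get_property(self, name):
        return getattr(self, name)


class CounterWidgetV2(CounterWidgetV1):
    """Hypothetical new version containing a regression."""
    def dispatch(self, event, target):
        if event == "click" and target == "increment":
            self.value += 2   # behaviour change the baseline oracle should catch


def replay_and_compare(profile, baseline, candidate, properties):
    """Replay a recorded event profile on both versions and report divergences."""
    for widget in (baseline, candidate):
        widget.reset()
        for event in profile:
            widget.dispatch(*event)
    return [(p, baseline.get_property(p), candidate.get_property(p))
            for p in properties
            if baseline.get_property(p) != candidate.get_property(p)]


profile = [("click", "increment"), ("click", "increment")]   # collected user profile
print(replay_and_compare(profile, CounterWidgetV1(), CounterWidgetV2(), ["value"]))
# -> [('value', 2, 4)]
```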

Journal ArticleDOI
TL;DR: The stakeholder identification and subsequent analysis provides an effective complement to the original method and can clearly aid in change management within information system redesign.
Abstract: PISO® (Process Improvement for Strategic Objectives) is a method that engages system users in the redesign of their own work-based information systems. PisoSIA® (stakeholder identification and analysis) is an enhancement to the original method that helps in the identification of a system's stakeholders, analyses the impact they have on the system and also considers the effect of change upon those stakeholders. Overviews of the original and enhanced methods are provided and research investigations centred on four case studies are reported. Each of the case studies made use of the original PISO® method and two made use of the enhanced pisoSIA® method. These case studies demonstrate the worth of the enhanced approach. The stakeholder identification and subsequent analysis provides an effective complement to the original method and can clearly aid in change management within information system redesign.

Journal ArticleDOI
TL;DR: This case study presents an initial analysis of audit findings that led to the need to review some of the approaches taken in gathering audit data, including the techniques used and the motivation of auditors.
Abstract: The analysis of audit findings should prove useful in uncovering the problems practitioners have in implementing a software quality management regime. The understanding gained from this analysis could then be used to solve the issues involved, and make software management, e.g. development or procurement, more effective. This case study presents an initial analysis of audit findings that led to the need to review some of the approaches taken in gathering audit data. This review included the techniques used and the motivation of auditors. A detailed implementation rating system was devised to further investigate and accurately identify specific problems. It was also used to test and validate initial conclusions and highlight problems with audit sampling. Without proper management, particularly for the analysis of audit findings, the internal audit process can be an ineffective use of resources. The recommendations made by this paper can provide practical solutions to making internal auditing a cost-effective, problem-solving management tool.

Journal ArticleDOI
TL;DR: A procedure for building software quality classification models from the limited resources perspective is presented and an empirical case study of a large-scale software system demonstrates the promising results of using the MECM measure to select an appropriate resource-based rule-based classification model.
Abstract: The amount of resources allocated for software quality improvements is often not enough to achieve the desired software quality. Software quality classification models that yield a risk-based quality estimation of program modules, such as fault-prone (fp) and not fault-prone (nfp), are useful as software quality assurance techniques. Their usefulness is largely dependent on whether enough resources are available for inspecting the fp modules. Since a given development project has its own budget and time limitations, a resource-based software quality improvement seems more appropriate for achieving its quality goals. A classification model should provide quality improvement guidance so as to maximize resource utilization. We present a procedure for building software quality classification models from the limited resources perspective. The essence of the procedure is the use of our recently proposed Modified Expected Cost of Misclassification (MECM) measure for developing resource-oriented software quality classification models. The measure penalizes a model, in terms of costs of misclassifications, if the model predicts more fp modules than can be inspected with the allotted resources. Our analysis is presented in the context of our Rule-Based Classification Modeling (RBCM) technique. An empirical case study of a large-scale software system demonstrates the promising results of using the MECM measure to select an appropriate resource-based rule-based classification model.
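
The exact MECM formula is the paper's own; the sketch below only illustrates the general idea with an assumed penalty form: a classic expected cost of misclassification plus an extra cost whenever the model flags more fault-prone modules than the inspection budget allows.

```python
# Illustrative (assumed) resource-penalized misclassification cost, in the
# spirit of MECM. Not the paper's actual formula.
def expected_cost(n_fp_missed, n_nfp_flagged, n_modules, cost_type2, cost_type1=1.0):
    """Classic ECM: Type I = nfp wrongly flagged as fp, Type II = fp missed."""
    return (cost_type1 * n_nfp_flagged + cost_type2 * n_fp_missed) / n_modules

def penalized_cost(n_fp_missed, n_nfp_flagged, n_modules,
                   n_predicted_fp, inspection_budget,
                   cost_type2=10.0, penalty_per_excess=1.0):
    ecm = expected_cost(n_fp_missed, n_nfp_flagged, n_modules, cost_type2)
    excess = max(0, n_predicted_fp - inspection_budget)
    return ecm + penalty_per_excess * excess / n_modules   # assumed penalty form

# Hypothetical comparison of two candidate models on 1000 modules, with
# resources to inspect only 100 of them.
print(penalized_cost(n_fp_missed=20, n_nfp_flagged=60, n_modules=1000,
                     n_predicted_fp=140, inspection_budget=100))
print(penalized_cost(n_fp_missed=30, n_nfp_flagged=40, n_modules=1000,
                     n_predicted_fp=95,  inspection_budget=100))
```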

Journal ArticleDOI
TL;DR: After the major newspapers failed to provide good ideas for an editorial, the author visited the Risks forum, the most reliable source for stories about unreliability; almost all software failures are described in the Risks forum before they reach the popular press.
Abstract: As I was pondering topics for an editorial, I began to scan some on-line newspapers for good stories that describe horrible software failures. Alas, I found no good stories in that day's New York Times, Washington Post, Los Angeles Times, etc. Then I thought that surely the San Jose Mercury News would have news on some awful software glitch, but I had no such luck. Perhaps our software was not giving us problems this week, or perhaps the media is so used to hearing about software problems that they are no longer newsworthy. The last headline story in my mind that day was the Sony music CD copy protection software that made serious modifications to the Windows operating system, which included leaving security holes in the system. But that was last week's news (although it is a continuing story). After the major newspapers failed to provide good ideas for an editorial, I visited the most reliable source for stories about unreliability, the Risks forum (www.risks.org). Virtually all software failures are described in the Risks forum before they reach the popular press. Let's take a look at some of the headlines from the forum on Thursday, 17 November 2005: "Software bug crashes Japanese stock exchange." "Flight Booking System Can't Recognise February 29." "Fun with Daylight Saving Time: . . . One wonders how well the embedded time-aware code in most electronic equipment will handle this." "Computer Glitch Lets Prisoners Out Early: . . . Some prisoners were also let out too late, which is just as bad." "Freddie Mac profits misstated due to software error." "Some Fast Lane accounts double-billed." "Sony CD DRM Blow-Up Continues – Recalls Ordered, Lawsuits Possible": this story continues. Stories on the Risks forum tend to be precursors and follow-ups to widely reported stories. These stories reach the mainstream press, since reporters also scan Risks forum stories. What is it that these reported problems have in common? For one thing, the failures of these systems had enough impact, or potential impact, on people to be reported. What makes a failure of great enough importance to report?

Journal ArticleDOI
TL;DR: The notion of free/open source software (F/OSS) development is intriguing: in theory, such software is developed by volunteers who see the software as fulfilling their own needs, yet F/OSS projects can wither and die in part because few developers are attracted to the project.
Abstract: The notion of free/open source software (F/OSS) development is intriguing. In theory, such software is developed by volunteers who see the software as fulfilling their own needs. Developers work on what they want, anyone can have a copy of the source code and contribute towards improving the system by reporting and/or fixing bugs, and developing needed new functionality. The code is high quality because of the large number of people involved in reporting and fixing errors, and adding functionality. The F/OSS process should satisfy user and developer needs. In contrast, the list of software disasters produced through "traditional" commercial development is staggering. You can easily find a long list of (the most frightening) software failures by searching for "software disasters" using your favorite search engine. Another indicator of problems with traditional development is the high proportion of software development projects that are cancelled. Virtually all commercial software developers with five or more years of experience have worked on one or many projects that have been cancelled. Estimates in software engineering textbooks, research papers, and from a web search for "cancelled software projects" indicate that from 33–44% of commercial software projects end up cancelled. I also find estimates that up to 67% of commercial software projects experience large cost overruns. Do F/OSS projects get cancelled and experience cost overruns? Are such questions at all relevant to F/OSS projects? F/OSS projects do not get cancelled by the "suits" or "bean counters" when the corporate mission changes, or the company reorganizes, or for some other arbitrary reason. However, F/OSS projects can wither and die in part because few developers are attracted to the project. For example, Crowston et al. found that 67% of nearly 100,000 projects listed on SourceForge attracted no more than one developer at any point in a five year period (Crowston et al., 2006). Rather than being cancelled explicitly, F/OSS projects without a "market" fade into inactivity.