
Showing papers on "Software" published in 2006


Journal ArticleDOI
TL;DR: COPASI is presented, a platform-independent and user-friendly biochemical simulator that offers several unique features, and numerical issues with these features are discussed; in particular, the criteria to switch between stochastic and deterministic simulation methods, hybrid deterministic-stochastic methods, and the importance of random number generator numerical resolution in stochastic simulation.
Abstract: Motivation: Simulation and modeling are becoming a standard approach to understand complex biochemical processes. Therefore, there is a great need for software tools that allow access to diverse simulation and modeling methods as well as support for the use of these methods. Results: Here, we present COPASI, a platform-independent and user-friendly biochemical simulator that offers several unique features. We discuss numerical issues with these features; in particular, the criteria to switch between stochastic and deterministic simulation methods, hybrid deterministic-stochastic methods, and the importance of random number generator numerical resolution in stochastic simulation. Availability: The complete software is available in binary (executable) form for MS Windows, OS X, Linux (Intel) and Sun Solaris (SPARC), as well as the full source code under an open source license from http://www.copasi.org. Contact: mendes@vbi.vt.edu
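
To make the switching idea concrete, the following is a minimal sketch (not COPASI code) of a particle-number-based switch between a stochastic Gillespie step and a deterministic Euler step for a single birth-death reaction; the rate constants, switching threshold, and step size are illustrative assumptions.

    # Minimal sketch of switching between stochastic (Gillespie SSA) and deterministic
    # (Euler ODE) simulation of a birth-death process: X -> X+1 at rate k_in,
    # X -> X-1 at rate k_out * X. Rates, threshold, and step size are illustrative.
    import random
    import math

    K_IN, K_OUT = 50.0, 0.1     # birth rate (molecules/s), death rate constant (1/s)
    SWITCH_AT = 200             # use the deterministic regime above this copy number
    DT = 0.01                   # Euler step for the deterministic regime

    def step_stochastic(x, t):
        """One Gillespie SSA step: sample waiting time and which reaction fires."""
        a1, a2 = K_IN, K_OUT * x
        a0 = a1 + a2
        t += -math.log(random.random()) / a0
        x += 1 if random.random() * a0 < a1 else -1
        return x, t

    def step_deterministic(x, t):
        """One explicit Euler step of dx/dt = k_in - k_out * x."""
        return x + DT * (K_IN - K_OUT * x), t + DT

    x, t = 10.0, 0.0
    while t < 100.0:
        if x < SWITCH_AT:
            x, t = step_stochastic(max(int(round(x)), 0), t)  # discrete copy numbers
        else:
            x, t = step_deterministic(x, t)
    print(f"t={t:.2f}s  X={x:.1f}  (deterministic steady state = {K_IN/K_OUT:.0f})")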

2,351 citations


Book ChapterDOI
TL;DR: This chapter describes each component of the TM4 suite of open‐source tools for data management and reporting, image analysis, normalization and pipeline control, and data mining and visualization and includes a sample analysis walk‐through.
Abstract: Powerful specialized software is essential for managing, quantifying, and ultimately deriving scientific insight from results of a microarray experiment. We have developed a suite of software applications, known as TM4, to support such gene expression studies. The suite consists of open-source tools for data management and reporting, image analysis, normalization and pipeline control, and data mining and visualization. An integrated MIAME-compliant MySQL database is included. This chapter describes each component of the suite and includes a sample analysis walk-through.
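
As a concrete illustration of the normalization stage mentioned above, the sketch below median-centers log2 two-channel ratios. It shows the kind of operation such a pipeline performs; it is not TM4/MIDAS code, and the spot intensities are made up.

    # Minimal sketch of a common two-channel microarray normalization step:
    # global median-centering of log2(Cy5/Cy3) ratios. Illustrative only, not TM4 code.
    import math

    def median(values):
        s = sorted(values)
        n = len(s)
        return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

    def normalize_log_ratios(cy5, cy3):
        """Return median-centered log2 ratios for paired channel intensities."""
        ratios = [math.log2(r / g) for r, g in zip(cy5, cy3) if r > 0 and g > 0]
        m = median(ratios)
        return [lr - m for lr in ratios]

    # Toy intensities for five spots (hypothetical values)
    cy5 = [1200.0, 340.0, 5600.0, 80.0, 950.0]
    cy3 = [1000.0, 400.0, 5000.0, 120.0, 700.0]
    print([round(v, 3) for v in normalize_log_ratios(cy5, cy3)])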

1,931 citations


Journal ArticleDOI
TL;DR: Meta-DiSc is a comprehensive and dedicated test accuracy meta-analysis software that has already been used and cited in several meta-analyses published in high-ranking journals and is publicly available.
Abstract: Systematic reviews and meta-analyses of test accuracy studies are increasingly being recognised as central in guiding clinical practice. However, there is currently no dedicated and comprehensive software for meta-analysis of diagnostic data. In this article, we present Meta-DiSc, a Windows-based, user-friendly, freely available (for academic use) software that we have developed, piloted, and validated to perform diagnostic meta-analysis. Meta-DiSc a) allows exploration of heterogeneity, with a variety of statistics including chi-square, I-squared and Spearman correlation tests, b) implements meta-regression techniques to explore the relationships between study characteristics and accuracy estimates, c) performs statistical pooling of sensitivities, specificities, likelihood ratios and diagnostic odds ratios using fixed and random effects models, both overall and in subgroups and d) produces high quality figures, including forest plots and summary receiver operating characteristic curves that can be exported for use in manuscripts for publication. All computational algorithms have been validated through comparison with different statistical tools and published meta-analyses. Meta-DiSc has a Graphical User Interface with roll-down menus, dialog boxes, and online help facilities. Meta-DiSc is a comprehensive and dedicated test accuracy meta-analysis software. It has already been used and cited in several meta-analyses published in high-ranking journals. The software is publicly available at http://www.hrc.es/investigacion/metadisc_en.htm .
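
The pooling step described in point (c) can be illustrated with a small sketch: fixed-effect, inverse-variance pooling of sensitivities on the logit scale with a 0.5 continuity correction. The study counts are invented, and this is only one of several pooling approaches such a tool offers; it is not Meta-DiSc's code.

    # Sketch of fixed-effect (inverse-variance) pooling of sensitivities on the logit
    # scale, one computation a diagnostic meta-analysis tool performs. Hypothetical data.
    import math

    # (true positives, false negatives) per study -- made-up counts
    studies = [(45, 5), (30, 10), (80, 12), (22, 8)]

    def logit_pool(counts):
        num = den = 0.0
        for tp, fn in counts:
            tp, fn = tp + 0.5, fn + 0.5          # continuity correction
            logit = math.log(tp / fn)            # logit(sensitivity)
            var = 1.0 / tp + 1.0 / fn            # approximate variance of the logit
            w = 1.0 / var                        # inverse-variance weight
            num += w * logit
            den += w
        pooled_logit = num / den
        se = math.sqrt(1.0 / den)
        def to_prob(x):
            return 1.0 / (1.0 + math.exp(-x))
        return (to_prob(pooled_logit),
                to_prob(pooled_logit - 1.96 * se),
                to_prob(pooled_logit + 1.96 * se))

    sens, lo, hi = logit_pool(studies)
    print(f"Pooled sensitivity {sens:.3f} (95% CI {lo:.3f}-{hi:.3f})")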

1,727 citations


Book
22 Nov 2006
TL;DR: A practical, example-driven guide to fitting linear mixed models for clustered, repeated-measures, and longitudinal data, with analysis steps, diagnostics, and software notes and recommendations compared across statistical packages.
Abstract: INTRODUCTION What Are Linear Mixed Models (LMMs)? A Brief History of Linear Mixed Models
LINEAR MIXED MODELS: AN OVERVIEW Introduction Specification of LMMs The Marginal Linear Model Estimation in LMMs Computational Issues Tools for Model Selection Model-Building Strategies Checking Model Assumptions (Diagnostics) Other Aspects of LMMs Power Analysis for Linear Mixed Models Chapter Summary
TWO-LEVEL MODELS FOR CLUSTERED DATA: THE RAT PUP EXAMPLE Introduction The Rat Pup Study Overview of the Rat Pup Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Estimating the Intraclass Correlation Coefficients (ICCs) Calculating Predicted Values Diagnostics for the Final Model Software Notes and Recommendations
THREE-LEVEL MODELS FOR CLUSTERED DATA: THE CLASSROOM EXAMPLE Introduction The Classroom Study Overview of the Classroom Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Estimating the Intraclass Correlation Coefficients (ICCs) Calculating Predicted Values Diagnostics for the Final Model Software Notes Recommendations
MODELS FOR REPEATED-MEASURES DATA: THE RAT BRAIN EXAMPLE Introduction The Rat Brain Study Overview of the Rat Brain Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Diagnostics for the Final Model Software Notes Other Analytic Approaches Recommendations
RANDOM COEFFICIENT MODELS FOR LONGITUDINAL DATA: THE AUTISM EXAMPLE Introduction The Autism Study Overview of the Autism Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Calculating Predicted Values Diagnostics for the Final Model Software Note: Computational Problems with the D Matrix An Alternative Approach: Fitting the Marginal Model with an Unstructured Covariance Matrix
MODELS FOR CLUSTERED LONGITUDINAL DATA: THE DENTAL VENEER EXAMPLE Introduction The Dental Veneer Study Overview of the Dental Veneer Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Diagnostics for the Final Model Software Notes and Recommendations Other Analytic Approaches
MODELS FOR DATA WITH CROSSED RANDOM FACTORS: THE SAT SCORE EXAMPLE Introduction The SAT Score Study Overview of the SAT Score Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Recommended Diagnostics for the Final Model Software Notes and Additional Recommendations
APPENDIX A: STATISTICAL SOFTWARE RESOURCES
APPENDIX B: CALCULATION OF THE MARGINAL VARIANCE-COVARIANCE MATRIX
APPENDIX C: ACRONYMS/ABBREVIATIONS
BIBLIOGRAPHY
INDEX
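
For readers who want a concrete starting point, here is a minimal sketch of the kind of two-level random-intercept model fitted in the clustered-data chapters, using Python's statsmodels on synthetic data (the book itself works through SAS, SPSS, R, Stata, and HLM, not Python).

    # Minimal sketch of a two-level random-intercept linear mixed model on synthetic data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n_clusters, n_per = 30, 10
    cluster = np.repeat(np.arange(n_clusters), n_per)
    u = rng.normal(0, 2.0, n_clusters)[cluster]       # cluster-level random intercepts
    x = rng.normal(size=n_clusters * n_per)
    y = 1.0 + 0.5 * x + u + rng.normal(size=x.size)   # fixed effects + residual noise

    data = pd.DataFrame({"y": y, "x": x, "cluster": cluster})
    model = smf.mixedlm("y ~ x", data, groups=data["cluster"])
    result = model.fit()
    print(result.summary())                            # fixed effects and variance components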

1,680 citations


Patent
10 May 2006
TL;DR: The patent describes a set of improvements to the MS Windows operating system, including a "Reset" function based on a saved post-boot system image, a powerful undo feature, improved undo and change-tracking in word processing, improved file comparison and backup features, and many additional refinements.
Abstract: Although MS Windows (in its various versions) is at present the most popular OS (Operating System) in personal computers, after years of consecutive improvements there are still various issues which need to be improved, which include for example issues of efficiency, comfort, and/or reliability. The present invention tries to solve the above problems in new ways that include considerable improvements over the prior art. Preferably the system allows for example a “Reset” function, which means that preferably an Image of the state of the OS (including all loaded software) is saved immediately after a successful boot on the disk or other non-volatile memory and is preferably automatically updated when new drivers and/or software that change the state after a boot are added, so that if the system gets stuck it can be instantly restarted as if it has been rebooted. Other features include for example solving the problem that the focus can be grabbed while the user is typing something, allowing the user to easily define or increase or decrease the priority of various processes or open windows, a powerful undo feature that can include preferably even any changes to the hard disk, improved undo features in word processing, improved file comparison features, being able for example to track changes retroactively, improved backup features, and many additional improvements. The application covers also improvements that are related for example to Word processing (since for example in Microsoft Windows, Word behaves like an integral part of the system) and things that are related to the user's Internet surfing experience, including for example improved search experience (This is important since for example in Microsoft Windows, Internet Explorer is practically an integral part of the OS). The invention deals also with some preferable improvements in the performance of the hard disk and also with some other smart computerized devices.

1,185 citations


Proceedings ArticleDOI
TL;DR: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes.
Abstract: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes. From the earliest design discussions, CIAO was planned as a general-purpose scientific data analysis system optimized for X-ray astronomy, and consists mainly of command line tools (allowing easy pipelining and scripting) with a parameter-based interface layered on a flexible data manipulation I/O library. The same code is used for the standard Chandra archive pipeline, allowing users to recalibrate their data in a consistent way. We will discuss the lessons learned from the first six years of the software's evolution. Our initial approach to documentation evolved to concentrate on recipe-based "threads" which have proved very successful. A multi-dimensional abstract approach to data analysis has allowed new capabilities to be added while retaining existing interfaces. A key requirement for our community was interoperability with other data analysis systems, leading us to adopt standard file formats and an architecture which was as robust as possible to the input of foreign data files, as well as re-using a number of external libraries. We support users who are comfortable with coding themselves via a flexible user scripting paradigm, while the availability of tightly constrained pipeline programs is of benefit to less computationally-advanced users. As with other analysis systems, we have found that infrastructure maintenance and re-engineering is a necessary and significant ongoing effort and needs to be planned into any long-lived astronomy software.

1,145 citations


Journal ArticleDOI
TL;DR: This paper inductively derives a framework for understanding participation from the perspective of the individual software developer based on data from two software communities with different governance structures.
Abstract: Open source software projects rely on the voluntary efforts of thousands of software developers, yet we know little about why developers choose to participate in this collective development process. This paper inductively derives a framework for understanding participation from the perspective of the individual software developer based on data from two software communities with different governance structures. In both communities, a need for software-related improvements drives initial participation. The majority of participants leave the community once their needs are met; however, a small subset remains involved. For this set of developers, motives evolve over time and participation becomes a hobby. These hobbyists are critical to the long-term viability of the software code: They take on tasks that might otherwise go undone and work to maintain the simplicity and modularity of the code. Governance structures affect this evolution of motives. Implications for firms interested in implementing hybrid strategies designed to combine the advantages of open source software development with proprietary ownership and control are discussed.

905 citations


Journal ArticleDOI
TL;DR: It is argued that the open source software phenomenon has metamorphosed into a more mainstream and commercially viable form, which the author labels OSS 2.0, and that the bazaar metaphor has actually shifted to become better suited to the OSS 2.0 product delivery and support process.
Abstract: A frequent characterization of open source software is the somewhat outdated, mythical one of a collective of supremely talented software hackers freely volunteering their services to produce uniformly high-quality software. I contend that the open source software phenomenon has metamorphosed into a more mainstream and commercially viable form, which I label as OSS 2.0. I illustrate this transformation using a framework of process and product factors, and discuss how the bazaar metaphor, which up to now has been associated with the open source development process, has actually shifted to become a metaphor better suited to the OSS 2.0 product delivery and support process. Overall the OSS 2.0 phenomenon is significantly different from its free software antecedent. Its emergence accentuates the fundamental alteration of the basic ground rules in the software landscape, signifying the end of the proprietary-driven model that has prevailed for the past 20 years or so. Thus, a clear understanding of the characteristics of the emergent OSS 2.0 phenomenon is required to address key challenges for research and practice.

837 citations


Proceedings ArticleDOI
28 May 2006
TL;DR: Using principal component analysis on the code metrics, this work built regression models that accurately predict the likelihood of post-release defects for new entities and can be generalized to arbitrary projects.
Abstract: What is it that makes software fail? In an empirical study of the post-release defect history of five Microsoft software systems, we found that failure-prone software entities are statistically correlated with code complexity measures. However, there is no single set of complexity metrics that could act as a universally best defect predictor. Using principal component analysis on the code metrics, we built regression models that accurately predict the likelihood of post-release defects for new entities. The approach can easily be generalized to arbitrary projects; in particular, predictors obtained from one project can also be significant for new, similar projects.
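
A minimal sketch of the modeling pipeline described above, principal component analysis over code metrics feeding a regression-based defect predictor, is given below. The data are synthetic, and the choice of logistic regression and three components is an illustrative assumption rather than the study's exact setup.

    # Sketch of PCA on code metrics followed by a regression model predicting
    # defect-proneness. Synthetic data; not the Microsoft study's dataset or code.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    n = 500
    # Six correlated complexity metrics (e.g., lines of code, cyclomatic complexity, fan-out)
    base = rng.normal(size=(n, 1))
    metrics = np.hstack([base + 0.3 * rng.normal(size=(n, 1)) for _ in range(6)])
    defect_prone = (base[:, 0] + 0.5 * rng.normal(size=n) > 0.8).astype(int)

    model = make_pipeline(StandardScaler(), PCA(n_components=3), LogisticRegression())
    scores = cross_val_score(model, metrics, defect_prone, cv=5, scoring="roc_auc")
    print("Cross-validated AUC:", scores.round(3))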

803 citations


Journal ArticleDOI
TL;DR: The main goals of this article are to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach, and to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers.
Abstract: In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but as with other open-source software projects, contributions are being made elsewhere in the US and abroad. The main goals of this article are: (1) to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach; (2) to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers; and (3) to describe the parallel implementation and data structures with supporting discussion of domain decomposition, message passing, and details related to dynamic repartitioning for parallel AMR/C. Other aspects related to C++ programming paradigms, reusability for diverse applications, adaptive modeling, physics-independent error indicators, and similar concepts are briefly discussed. Finally, results from some applications using the library are presented and areas of future research are discussed.
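
The AMR/C loop at the heart of such a framework can be sketched in a few lines: solve, compute a per-element error indicator, refine the worst elements, and repeat. The 1-D example below is conceptual only and does not use libMesh or its API; the target function, indicator, and refinement fraction are arbitrary choices.

    # Conceptual sketch of an error-indicator-driven adaptive mesh refinement loop in 1D.
    import numpy as np

    def solve(nodes):
        """Placeholder 'solve': sample a target function with a steep interior layer."""
        return np.tanh(20 * (nodes - 0.5))

    def error_indicator(nodes, values):
        """Jump in slope between neighbouring elements as a cheap error indicator."""
        slopes = np.diff(values) / np.diff(nodes)
        jumps = np.abs(np.diff(slopes))
        ind = np.zeros(len(nodes) - 1)
        ind[1:] += jumps
        ind[:-1] += jumps
        return ind

    nodes = np.linspace(0.0, 1.0, 11)                 # coarse initial mesh
    for cycle in range(5):
        values = solve(nodes)
        ind = error_indicator(nodes, values)
        refine = ind > 0.5 * ind.max()                # refine the worst elements
        midpoints = 0.5 * (nodes[:-1] + nodes[1:])[refine]
        nodes = np.sort(np.concatenate([nodes, midpoints]))
        print(f"cycle {cycle}: {len(nodes) - 1} elements")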

761 citations


Journal ArticleDOI
TL;DR: This paper reports data from a study that seeks to characterize the differences in design structure between complex software products, using design structure matrices to map dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs.
Abstract: This paper reports data from a study that seeks to characterize the differences in design structure between complex software products. We use design structure matrices (DSMs) to map dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs. We use these metrics to compare the architectures of two software products---the Linux operating system and the Mozilla Web browser---that were developed via contrasting modes of organization: specifically, open source versus proprietary development. We then track the evolution of Mozilla, paying attention to a purposeful “redesign” effort undertaken with the intention of making the product more “modular.” We find significant differences in structure between Linux and the first version of Mozilla, suggesting that Linux had a more modular architecture. Yet we also find that the redesign of Mozilla resulted in an architecture that was significantly more modular than that of its predecessor and, indeed, than that of Linux. Our results, while exploratory, are consistent with a view that different modes of organization are associated with designs that possess different structures. However, they also suggest that purposeful managerial actions can have a significant impact in adapting a design's structure. This latter result is important given recent moves to release proprietary software into the public domain. These moves are likely to fail unless the product possesses an “architecture for participation.”
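
One metric of the kind used in such DSM studies is the "propagation cost": the density of the transitive closure of the dependency matrix, i.e., how much of the system a change can reach on average. The sketch below computes it for a small made-up DSM; it is not the paper's code or dataset.

    # Sketch of a DSM-based modularity metric: propagation cost.
    import numpy as np

    def propagation_cost(dsm):
        """Fraction of (i, j) pairs where a change to j can propagate to i."""
        n = dsm.shape[0]
        reach = ((dsm > 0) | np.eye(n, dtype=bool)).astype(int)
        for _ in range(n):
            reach = ((reach @ reach) > 0).astype(int)   # Boolean transitive closure
        return reach.sum() / (n * n)

    # Rows depend on columns: element 0 depends on 1, 1 on 2, etc. (hypothetical DSM)
    dsm = np.array([[0, 1, 0, 0, 0, 0],
                    [0, 0, 1, 0, 0, 0],
                    [0, 0, 0, 1, 0, 0],
                    [0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 1],
                    [0, 0, 0, 0, 0, 0]])
    print(f"propagation cost = {propagation_cost(dsm):.2f}")   # lower = more modular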

Journal ArticleDOI
TL;DR: Recent work to redesign the EIDORS software structure in order to simplify its use and provide a uniform interface, permitting easier modification and customization is described.
Abstract: EIDORS is an open source software suite for image reconstruction in electrical impedance tomography and diffuse optical tomography, designed to facilitate collaboration, testing and new research in these fields. This paper describes recent work to redesign the software structure in order to simplify its use and provide a uniform interface, permitting easier modification and customization. We describe the key features of this software, followed by examples of its use. One general issue with inverse problem software is the difficulty of correctly implementing algorithms and the consequent ease with which subtle numerical bugs can be inadvertently introduced. EIDORS helps with this issue, by allowing sharing and reuse of well-documented and debugged software. On the other hand, since EIDORS is designed to facilitate use by non-specialists, its use may inadvertently result in such numerical errors. In order to address this issue, we develop a list of ways in which such errors with inverse problems (which we refer to as 'cheats') may occur. Our hope is that such an overview may assist authors of software to avoid such implementation issues.
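
The kind of linearized, regularized reconstruction step such inverse-problem packages implement can be sketched generically as x = (J'J + lambda^2 R)^-1 J'b. The example below uses a random stand-in Jacobian and a hand-picked regularization parameter; it is not EIDORS itself (which is MATLAB/Octave code).

    # Generic sketch of one-step Tikhonov-regularized difference reconstruction.
    import numpy as np

    rng = np.random.default_rng(1)
    n_meas, n_elem = 208, 576                 # e.g. measurement count vs. mesh elements
    J = rng.normal(size=(n_meas, n_elem))     # stand-in sensitivity (Jacobian) matrix
    x_true = np.zeros(n_elem)
    x_true[100:110] = 1.0                     # a small conductivity perturbation
    b = J @ x_true + 0.01 * rng.normal(size=n_meas)   # noisy difference data

    lam = 10.0                                # regularization hyperparameter (hand-tuned)
    R = np.eye(n_elem)                        # zeroth-order Tikhonov prior
    x_hat = np.linalg.solve(J.T @ J + lam**2 * R, J.T @ b)
    print("reconstruction error:", round(float(np.linalg.norm(x_hat - x_true)), 3))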

Journal ArticleDOI
TL;DR: This paper proposes a tool, CP-Miner, that uses data mining techniques to efficiently identify copy-pasted code and copy-paste bugs in large software suites, and that has detected many new bugs in popular operating systems.
Abstract: Recent studies have shown that large software suites contain significant amounts of replicated code. It is assumed that some of this replication is due to copy-and-paste activity and that a significant proportion of bugs in operating systems are due to copy-paste errors. Existing static code analyzers are either not scalable to large software suites or do not perform robustly where replicated code is modified with insertions and deletions. Furthermore, the existing tools do not detect copy-paste related bugs. In this paper, we propose a tool, CP-Miner, that uses data mining techniques to efficiently identify copy-pasted code in large software suites and detects copy-paste bugs. Specifically, it takes less than 20 minutes for CP-Miner to identify 190,000 copy-pasted segments in Linux and 150,000 in FreeBSD. Moreover, CP-Miner has detected many new bugs in popular operating systems, 49 in Linux and 31 in FreeBSD, most of which have since been confirmed by the corresponding developers and have been rectified in the following releases. In addition, we have found some interesting characteristics of copy-paste in operating system code. Specifically, we analyze the distribution of copy-pasted code by size (number of lines of code), granularity (basic blocks and functions), and modification within copy-pasted code. We also analyze copy-paste across different modules and various software versions.
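
A greatly simplified illustration of copy-paste detection is sketched below: hash fixed-size windows of normalized lines and report windows that occur more than once. CP-Miner itself mines frequent subsequences of tokenized statements and tolerates insertions and deletions; this sketch finds only exact normalized clones, and the window size and toy code are arbitrary.

    # Toy clone detector: identifier/number-insensitive hashing of line windows.
    import re
    from collections import defaultdict

    WINDOW = 4   # minimum clone length, in lines

    def normalize(line):
        """Strip whitespace and map identifiers and numbers to placeholders."""
        line = re.sub(r"\b[A-Za-z_]\w*\b", "ID", line.strip())
        return re.sub(r"\b\d+\b", "NUM", line)

    def find_clones(lines):
        index = defaultdict(list)
        for i in range(len(lines) - WINDOW + 1):
            key = tuple(normalize(l) for l in lines[i:i + WINDOW])
            index[key].append(i)
        return [locs for locs in index.values() if len(locs) > 1]

    code = [
        "x = a + b",
        "y = x * 2",
        "print(y)",
        "z = 0",
        "u = c + d",
        "v = u * 2",
        "print(v)",
        "w = 0",
    ]
    print(find_clones(code))   # the two structurally identical 4-line blocks are reported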

01 Jan 2006
TL;DR: GCDkit is a program for handling and recalculation of geochemical data from igneous and metamorphic rocks using the Windows version of R, which provides a flexible and comprehensive language and environment for data analysis and graphics.
Abstract: Geochemical Data Toolkit (GCDkit) is a program for handling and recalculation of geochemical data from igneous and metamorphic rocks. It is built using the Windows version of R, which provides a flexible and comprehensive language and environment for data analysis and graphics. GCDkit was designed to eliminate routine and tedious operations involving large collections of whole-rock data and, at the same time, provide access to the wealth of statistical functions built into R. Data management tools include import and export of data files in a number of formats, data editing, searching, grouping and generation of subsets. Included are a variety of calculation and normative schemes, for instance CIPW and Mesonorm, as are the common geochemical graphs (e.g. binary and ternary graphs, Harker plots, spider plots, and several dozens of classification and geotectonic discrimination diagrams). The graphical output is publication ready but can be further retouched if required. The system can be further expanded by means of plug-in modules that provide specialist applications. GCDkit is available as Free Software under the terms of the Free Software Foundation's GNU General Public License and can be downloaded from http://www.gla.ac.uk/gcdkit. The product is actively maintained and updated to provide additional functionality; Unix/Linux and Mac OS versions are being developed.

Journal ArticleDOI
TL;DR: The purpose of this paper is to present and compare these implementations of support vector machines, among the most popular and efficient classification and regression methods currently available.
Abstract: Being among the most popular and efficient classification and regression methods currently available, implementations of support vector machines exist in almost every popular programming language. Currently four R packages contain SVM related software. The purpose of this paper is to present and compare these implementations. (authors' abstract)

Book
01 Jan 2006
TL;DR: This is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field.
Abstract: Apply powerful Data Mining Methods and Models to Leverage your Data for Actionable Results Data Mining Methods and Models provides: * The latest techniques for uncovering hidden nuggets of information * The insight into how the data mining algorithms actually work * The hands-on experience of performing data mining on large data sets Data Mining Methods and Models: * Applies a "white box" methodology, emphasizing an understanding of the model structures underlying the software * Walks the reader through the various algorithms and provides examples of the operation of the algorithms on actual large data sets, including a detailed case study, "Modeling Response to Direct-Mail Marketing" * Tests the reader's level of understanding of the concepts and methodologies, with over 110 chapter exercises * Demonstrates the Clementine data mining software suite, WEKA open source data mining software, SPSS statistical software, and Minitab statistical software * Includes a companion Web site, www.dataminingconsultant.com, where the data sets used in the book may be downloaded, along with a comprehensive set of data mining resources. Faculty adopters of the book have access to an array of helpful resources, including solutions to all exercises, a PowerPoint(r) presentation of each chapter, sample data mining course projects and accompanying data sets, and multiple-choice chapter quizzes. With its emphasis on learning by doing, this is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field. An Instructor's Manual presenting detailed solutions to all the problems in the book is available online.

Journal ArticleDOI
TL;DR: The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments, and distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the educational features.
Abstract: Meta-analysis has become a well-known method for synthesis of quantitative data from previously conducted research in applied health sciences. So far, meta-analysis has been particularly useful in evaluating and comparing therapies and in assessing causes of disease. Consequently, the number of software packages that can perform meta-analysis has increased over the years. Unfortunately, it can take a substantial amount of time to get acquainted with some of these programs and most contain little or no interactive educational material. We set out to create and validate an easy-to-use and comprehensive meta-analysis package that would be simple enough programming-wise to remain available as a free download. We specifically aimed at students and researchers who are new to meta-analysis, with important parts of the development oriented towards creating internal interactive tutoring tools and designing features that would facilitate usage of the software as a companion to existing books on meta-analysis. We took an unconventional approach and created a program that uses Excel as a calculation and programming platform. The main programming language was Visual Basic, as implemented in Visual Basic 6 and Visual Basic for Applications in Excel 2000 and higher. The development took approximately two years and resulted in the 'MIX' program, which can be downloaded from the program's website free of charge. Next, we set out to validate the MIX output with two major software packages as reference standards, namely STATA (metan, metabias, and metatrim) and Comprehensive Meta-Analysis Version 2. Eight meta-analyses that had been published in major journals were used as data sources. All numerical and graphical results from analyses with MIX were identical to their counterparts in STATA and CMA. The MIX program distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the educational features. The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments. It can be downloaded free of charge via http://www.mix-for-meta-analysis.info or http://sourceforge.net/projects/meta-analysis .
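
For readers new to the computations such tools perform, the sketch below implements DerSimonian-Laird random-effects pooling alongside a fixed-effect estimate. The effect sizes and variances are made up, and this is an illustration of the method, not MIX's Excel/VBA code.

    # Fixed-effect and DerSimonian-Laird random-effects pooling of generic effect sizes.
    import math

    effects   = [0.30, 0.10, 0.45, 0.25, 0.05]   # study effect sizes (e.g., log odds ratios)
    variances = [0.04, 0.02, 0.09, 0.03, 0.05]   # their within-study variances

    w = [1.0 / v for v in variances]             # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Between-study variance (tau^2), DerSimonian-Laird moment estimator
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    print(f"fixed = {fixed:.3f}, random = {pooled:.3f} "
          f"(95% CI {pooled - 1.96 * se:.3f} to {pooled + 1.96 * se:.3f}), tau^2 = {tau2:.4f}")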

Proceedings ArticleDOI
28 May 2006
TL;DR: The approach separates the adaptation behavior and non-adaptive behavior specifications of adaptive programs, making the models easier to specify and more amenable to automated analysis and visual inspection.
Abstract: Increasingly, software should dynamically adapt its behavior at run-time in response to changing conditions in the supporting computing and communication infrastructure, and in the surrounding physical environment. In order for an adaptive program to be trusted, it is important to have mechanisms to ensure that the program functions correctly during and after adaptations. Adaptive programs are generally more difficult to specify, verify, and validate due to their high complexity. Particularly, when involving multi-threaded adaptations, the program behavior is the result of the collaborative behavior of multiple threads and software components. This paper introduces an approach to create formal models for the behavior of adaptive programs. Our approach separates the adaptation behavior and non-adaptive behavior specifications of adaptive programs, making the models easier to specify and more amenable to automated analysis and visual inspection. We introduce a process to construct adaptation models, automatically generate adaptive programs from the models, and verify and validate the models. We illustrate our approach through the development of an adaptive GSM-oriented audio streaming protocol for a mobile computing application.
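
The separation of adaptation behavior from non-adaptive behavior can be illustrated, very loosely, with executable code: the steady-state component below only encodes frames, while a separate adaptation manager decides when to switch codecs. The class and codec names are invented, and this is an analogy to the idea only, not the paper's formal modeling approach.

    # Simplified separation of steady-state behavior from adaptation behavior.

    class SteadyStateBehavior:
        """Non-adaptive behavior: encode frames with whatever codec is current."""
        def __init__(self, codec):
            self.codec = codec
        def process(self, frame):
            return f"{self.codec}:{frame}"

    class AdaptationManager:
        """Adaptation behavior: decide when to switch codecs, kept separate from the
        steady-state logic so each part can be specified and checked on its own."""
        def __init__(self, behavior):
            self.behavior = behavior
        def observe(self, packet_loss):
            if packet_loss > 0.10 and self.behavior.codec != "gsm-lossy":
                self.behavior.codec = "gsm-lossy"      # degrade gracefully under loss
            elif packet_loss <= 0.02 and self.behavior.codec != "gsm-full":
                self.behavior.codec = "gsm-full"       # restore quality when stable

    stream = SteadyStateBehavior("gsm-full")
    manager = AdaptationManager(stream)
    for loss, frame in [(0.01, "f1"), (0.20, "f2"), (0.15, "f3"), (0.00, "f4")]:
        manager.observe(loss)
        print(stream.process(frame))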

Journal ArticleDOI
TL;DR: The free software package described here was designed to help neurophysiologists process and view recorded data in an efficient and user-friendly manner and consists of several well-integrated applications.

Journal ArticleDOI
TL;DR: A case study is presented to demonstrate how Bunch can be used to create views of the structure of significant software systems and research is outlined to evaluate the software clustering results produced by Bunch.
Abstract: Since modern software systems are large and complex, appropriate abstractions of their structure are needed to make them more understandable and, thus, easier to maintain. Software clustering techniques are useful to support the creation of these abstractions by producing architectural-level views of a system's structure directly from its source code. This paper examines the Bunch clustering system which, unlike other software clustering tools, uses search techniques to perform clustering. Bunch produces a subsystem decomposition by partitioning a graph of the entities (e.g., classes) and relations (e.g., function calls) in the source code. Bunch uses a fitness function to evaluate the quality of graph partitions and uses search algorithms to find a satisfactory solution. This paper presents a case study to demonstrate how Bunch can be used to create views of the structure of significant software systems. This paper also outlines research to evaluate the software clustering results produced by Bunch.
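
The search-based clustering idea can be sketched as a small hill climb over module assignments of a dependency graph, maximizing a simple cohesion-versus-coupling score used here as a stand-in for Bunch's MQ fitness function. The graph, number of clusters, and step count below are arbitrary.

    # Hill-climbing software clustering sketch in the spirit of Bunch.
    import random

    edges = {("a", "b"), ("b", "c"), ("a", "c"),       # one tightly coupled group
             ("d", "e"), ("e", "f"), ("d", "f"),       # another group
             ("c", "d")}                               # a single cross link
    nodes = sorted({n for e in edges for n in e})

    def fitness(assign):
        intra = sum(1 for u, v in edges if assign[u] == assign[v])
        inter = len(edges) - intra
        return intra - inter                           # simplified quality measure

    def hill_climb(steps=2000, clusters=2, seed=0):
        rng = random.Random(seed)
        assign = {n: rng.randrange(clusters) for n in nodes}
        best = fitness(assign)
        for _ in range(steps):
            n = rng.choice(nodes)
            old = assign[n]
            assign[n] = rng.randrange(clusters)
            new = fitness(assign)
            if new >= best:
                best = new                             # keep the non-worsening move
            else:
                assign[n] = old                        # revert
        return assign, best

    assignment, score = hill_climb()
    print(score, assignment)                           # expect {a,b,c} vs {d,e,f}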

Proceedings ArticleDOI
18 Dec 2006
TL;DR: Different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes) show different effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique.
Abstract: Automated diagnosis of software faults can improve the efficiency of the debugging process, and is therefore an important technique for the development of dependable software. In this paper we study different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes). The coefficients studied are taken from the systems diagnosis/automated debugging tools Pinpoint, Tarantula, and AMPLE, and from the molecular biology domain (the Ochiai coefficient). We evaluate these coefficients on the Siemens Suite of benchmark faults, and assess their effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique. Our experiments indicate that the Ochiai coefficient consistently outperforms the coefficients currently used by the tools mentioned. In terms of the amount of code that needs to be inspected, this coefficient improves on the next best technique by 5% on average, and by up to 30% in specific cases.
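
The coefficients studied can be computed directly from program spectra. The sketch below ranks statements by Ochiai and Tarantula suspiciousness for a toy coverage matrix; the formulas follow their standard definitions, while the coverage data and test verdicts are invented.

    # Spectrum-based fault localization with Ochiai and Tarantula coefficients.
    import math

    # coverage[i][j] == 1 if statement j was executed in test run i
    coverage = [
        [1, 1, 0, 1],   # run 0
        [1, 0, 1, 1],   # run 1
        [0, 1, 1, 1],   # run 2
        [1, 1, 1, 1],   # run 3
    ]
    failed = [0, 1, 0, 1]          # verdict per run (1 = failing)

    def suspiciousness(coverage, failed):
        total_fail = sum(failed)
        total_pass = len(failed) - total_fail
        scores = []
        for j in range(len(coverage[0])):
            ef = sum(1 for i, row in enumerate(coverage) if row[j] and failed[i])
            ep = sum(1 for i, row in enumerate(coverage) if row[j] and not failed[i])
            ochiai = ef / math.sqrt(total_fail * (ef + ep)) if ef else 0.0
            tarantula = ((ef / total_fail) /
                         (ef / total_fail + ep / total_pass)) if ef or ep else 0.0
            scores.append((j, round(ochiai, 3), round(tarantula, 3)))
        return sorted(scores, key=lambda s: -s[1])     # rank by Ochiai, highest first

    for stmt, ochiai, tarantula in suspiciousness(coverage, failed):
        print(f"stmt {stmt}: Ochiai={ochiai}, Tarantula={tarantula}")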

BookDOI
01 Jan 2006
TL;DR: This book covers topics such as End-User Development as Adaptive Maintenance, Psychological Issues in End-User Programming, and Meta-design: A Framework for the Future of End-User Development.
Abstract: End-User Development: An Emerging Paradigm - Psychological Issues in End-User Programming - More Natural Programming Languages and Environments - What Makes End-User Development Tick? 13 Design Guidelines - An Integrated Software Engineering Approach for End-User Programmers - Component-Based Approaches to Tailorable Systems - Natural Development of Nomadic Interfaces Based on Conceptual Descriptions - End User Development of Web Applications - End-User Development: The Software Shaping Workshop Approach - Participatory Programming: Developing Programmable Bioinformatics Tools for End-Users - Challenges for End-User Development in Intelligent Environments - Fuzzy Rewriting - Breaking It Up: An Industrial Case Study of Component-Based Tailorable Software Design - End-User Development as Adaptive Maintenance - Supporting Collaborative Tailoring - EUD as Integration of Components Off-The-Shelf: The Role of Software Professionals Knowledge Artifacts - Organizational View of End-User Development - A Semiotic Framing for End-User Development - Meta-design: A Framework for the Future of End-User Development - Feasibility Studies for Programming in Natural Language - Future Perspectives in End-User Development

Journal ArticleDOI
TL;DR: The architecture of the MAPGPS software, which automates the processing of GPS data into global total electron content (TEC) maps, is described, and three different methods for solving the receiver bias problem are presented in detail.
Abstract: A software package known as MIT Automated Processing of GPS (MAPGPS) has been developed to automate the processing of GPS data into global total electron content (TEC) maps. The goal of the MAPGPS software is to produce reliable TEC data automatically, although not yet in real time. Observations are used from all available GPS receivers during all geomagnetic conditions where data has been successfully collected. In this paper, the architecture of the MAPGPS software is described. Particular attention is given to the algorithms used to estimate the individual receiver biases. One of the largest sources of error in estimating TEC from GPS data is the determination of these unknown receiver biases. The MAPGPS approach to solving the receiver bias problem uses three different methods: minimum scalloping, least squares, and zero-TEC. These methods are described in detail, along with their relative performance characteristics. A brief comparison of the JPL and MAPGPS receiver biases is presented, and a possible remaining error source in the receiver bias estimation is discussed. Finally, the Madrigal database, which allows Web access to the MAPGPS TEC data and maps, is described.
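
The least-squares flavour of bias estimation can be illustrated with a toy linear system: treat (measured minus modeled) TEC as receiver bias plus satellite bias plus noise and solve for the biases, pinning the mean satellite bias to zero to remove the rank deficiency. The sizes, noise level, and constraint weighting below are assumptions, and this is a sketch of the general idea only, not the MAPGPS algorithm.

    # Toy least-squares estimation of receiver and satellite biases from residual TEC.
    import numpy as np

    rng = np.random.default_rng(3)
    n_rx, n_sat = 5, 8
    true_rx = rng.normal(0, 5, n_rx)           # receiver biases (TECu), made up
    true_sat = rng.normal(0, 2, n_sat)
    true_sat -= true_sat.mean()                # same convention as the constraint below

    rows, residuals = [], []
    for r in range(n_rx):
        for s in range(n_sat):
            row = np.zeros(n_rx + n_sat)
            row[r] = 1.0
            row[n_rx + s] = 1.0
            rows.append(row)
            residuals.append(true_rx[r] + true_sat[s] + rng.normal(0, 0.5))

    # Constraint row: satellite biases sum to zero (removes the rank deficiency)
    constraint = np.concatenate([np.zeros(n_rx), np.ones(n_sat)])
    A = np.vstack(rows + [100.0 * constraint])
    b = np.array(residuals + [0.0])
    est = np.linalg.lstsq(A, b, rcond=None)[0]
    print("max receiver bias error:", round(float(np.abs(est[:n_rx] - true_rx).max()), 3))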

01 Jan 2006
TL;DR: In this paper, the main characteristics of the DRAGON software for the calculation of molecular descriptors are briefly illustrated.
Abstract: Due to the relevance that molecular descriptors are constantly gaining in several scientific fields, software for the calculation of molecular descriptors has become a very important tool for scientists. In this paper, the main characteristics of the DRAGON software for the calculation of molecular descriptors are briefly illustrated.

Book
03 Aug 2006
TL;DR: A novel metrics-based approach for detecting design problems in object-oriented software and introduces an important suite of detection strategies for the identification of different well-known design flaws as well as some rarely mentioned ones.
Abstract: Presents a novel metrics-based approach for detecting design problems in object-oriented software. Introduces an important suite of detection strategies for the identification of different well-known design flaws as well as some rarely mentioned ones.
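
A detection strategy of this kind combines several class-level metrics with thresholds. The sketch below flags "god class" candidates using the commonly cited WMC/ATFD/TCC combination; the threshold values and the sample metrics dictionary are illustrative, not the book's calibrated ones.

    # Sketch of a metrics-based detection strategy for a well-known design flaw.

    def is_god_class_candidate(m, wmc_very_high=47, atfd_few=3, tcc_low=1 / 3):
        """A class is suspicious if it is complex, uses many foreign attributes,
        and has low internal cohesion -- all three conditions at once."""
        return (m["WMC"] >= wmc_very_high          # high weighted method count
                and m["ATFD"] > atfd_few           # accesses many foreign data members
                and m["TCC"] < tcc_low)            # low tight class cohesion

    # Hypothetical per-class metrics, e.g. extracted by a metrics tool
    classes = {
        "OrderManager":  {"WMC": 112, "ATFD": 9, "TCC": 0.12},
        "Invoice":       {"WMC": 18,  "ATFD": 1, "TCC": 0.55},
        "ReportBuilder": {"WMC": 60,  "ATFD": 2, "TCC": 0.20},
    }
    for name, metrics in classes.items():
        if is_god_class_candidate(metrics):
            print("design flaw candidate:", name)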

Journal ArticleDOI
TL;DR: The quantification of map similarities and dissimilarities using the Map Comparison Kit (MCK) software is addressed, which is unique in having two map comparison techniques based on fuzzy-set calculation rules.
Abstract: Comparing maps is an important issue in environmental research. There are many reasons to compare maps: (i) to detect temporal/spatial changes or hot-spots, (ii) to compare different models, methodologies or scenarios, (iii) to calibrate, validate land-use models, (iv) to analyse model uncertainty and sensitivity, and (v) to assess map accuracy. This paper addresses the quantification of map similarities and dissimilarities using the Map Comparison Kit (MCK) software. Software and documentation are publicly available on the RIKS website free of charge (http://www.riks.nl/MCK/). The main focus is on 'categorical' or 'nominal' maps. Four different nominal map-comparison techniques are integrated in the software. Maps on ordinal, ratio and interval scale can be dealt with as well. The software is unique in having two map comparison techniques based on fuzzy-set calculation rules. The rationale is that fuzzy-set map comparison is very close to human judgement. Both fuzziness in location and fuzziness in category definitions are dealt with in the software.
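
The fuzziness-of-location idea can be sketched as follows: a cell (partially) matches if its category occurs nearby in the other map, with membership decaying with distance. MCK's actual fuzzy-set rules (including fuzzy category definitions and fuzzy kappa) are more elaborate; the neighbourhood radius, decay, and toy maps below are assumptions.

    # Simplified fuzzy-location agreement between two categorical maps.
    import numpy as np

    def fuzzy_agreement(map_a, map_b, radius=1, halving=1.0):
        rows, cols = map_a.shape
        total = 0.0
        for i in range(rows):
            for j in range(cols):
                best = 0.0
                for di in range(-radius, radius + 1):
                    for dj in range(-radius, radius + 1):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < rows and 0 <= nj < cols and map_b[ni, nj] == map_a[i, j]:
                            dist = max(abs(di), abs(dj))
                            best = max(best, 2.0 ** (-dist / halving))   # distance decay
                total += best
        return total / (rows * cols)

    a = np.array([[1, 1, 2], [1, 2, 2], [3, 3, 2]])
    b = np.array([[1, 2, 2], [1, 1, 2], [3, 2, 2]])   # small displacements of the same pattern
    print("cell-by-cell agreement:", round(float(np.mean(a == b)), 2))
    print("fuzzy agreement:       ", round(fuzzy_agreement(a, b), 2))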

Book ChapterDOI
31 Dec 2006
TL;DR: This book presents a unified account of symbolic data analysis methods in a consistent statistical framework, illustrated with examples from a range of application areas and implemented using the freely available SODAS software.
Abstract: The first book to present a unified account of symbolic data analysis methods in a consistent statistical framework, Symbolic Data Analysis features a substantial number of examples from a range of application areas, including health, the social sciences, economics, and computer science. It includes implementation of the methods described using SODAS software, which has been developed by a team led by Edwin Diday and is freely available on the Web, with an additional chapter that provides a basic guide to the software. It also features exercises at the end of each chapter to help the reader develop their understanding of the methodology, and to enable use of the book as a course text. The book is supported by a website featuring a link to download SODAS software, datasets, solutions to exercises, and additional teaching material.

Journal ArticleDOI
01 May 2006
TL;DR: Compatibility of GA with MPI enables the programmer to take advantage of existing MPI software and libraries when available and appropriate, and the variety of applications implemented with Global Arrays attests to the attractiveness of using higher level abstractions to write parallel code.
Abstract: This paper describes capabilities, evolution, performance, and applications of the Global Arrays (GA) toolkit. GA was created to provide application programmers with an interface that allows them to distribute data while maintaining the type of global index space and programming syntax similar to that available when programming on a single processor. The goal of GA is to free the programmer from the low level management of communication and allow them to deal with their problems at the level at which they were originally formulated. At the same time, compatibility of GA with MPI enables the programmer to take advantage of the existing MPI software/libraries when available and appropriate. The variety of applications that have been implemented using Global Arrays attests to the attractiveness of using higher level abstractions to write parallel code.
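
The "global index space over distributed data" idea can be illustrated with a toy block-distributed array whose callers use global indices while a thin mapping layer locates the owning process and local offset. This shows the programming model only; it is not the Global Arrays API (the real toolkit is a compiled library accessed from C, C++, and Fortran programs).

    # Toy block-distributed 1-D array with a global index space.

    class BlockDistributedArray:
        def __init__(self, global_size, n_procs):
            self.global_size = global_size
            self.n_procs = n_procs
            self.block = -(-global_size // n_procs)          # ceiling division
            # In a real run each process would hold only its own block;
            # here every block lives in one Python list for illustration.
            self.blocks = [[0.0] * min(self.block, max(0, global_size - p * self.block))
                           for p in range(n_procs)]

        def owner(self, i):
            """Translate a global index into (owning process, local index)."""
            return i // self.block, i % self.block

        def put(self, i, value):
            p, local = self.owner(i)
            self.blocks[p][local] = value                    # would be a one-sided put

        def get(self, i):
            p, local = self.owner(i)
            return self.blocks[p][local]                     # would be a one-sided get

    ga = BlockDistributedArray(global_size=10, n_procs=4)
    ga.put(7, 3.14)
    print(ga.owner(7), ga.get(7))    # (2, 1) 3.14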

01 Jan 2006
TL;DR: This paper presents examples of ontology applications throughout the Software Engineering lifecycle, discusses the advantages of ontologies in each case, and provides a framework for classifying the usage of ontologies in Software Engineering.
Abstract: The emerging field of semantic web technologies promises new stimulus for Software Engineering research. However, since the underlying concepts of the semantic web have a long tradition in the knowledge engineering field, it is sometimes hard for software engineers to get an overview of the variety of ontology-enabled approaches to Software Engineering. In this paper we therefore present some examples of ontology applications throughout the Software Engineering lifecycle. We discuss the advantages of ontologies in each case and provide a framework for classifying the usage of ontologies in Software Engineering.