
Showing papers on "Software" published in 2006


Journal ArticleDOI
TL;DR: COPASI is presented, a platform-independent and user-friendly biochemical simulator that offers several unique features, and numerical issues with these features are discussed; in particular, the criteria to switch between stochastic and deterministic simulation methods, hybrid deterministic-stochastic methods, and the importance of random number generator numerical resolution in stochastic simulation.
Abstract: Motivation: Simulation and modeling are becoming a standard approach to understand complex biochemical processes. Therefore, there is a great need for software tools that allow access to diverse simulation and modeling methods as well as support for the use of these methods. Results: Here, we present COPASI, a platform-independent and user-friendly biochemical simulator that offers several unique features. We discuss numerical issues with these features; in particular, the criteria to switch between stochastic and deterministic simulation methods, hybrid deterministic-stochastic methods, and the importance of random number generator numerical resolution in stochastic simulation. Availability: The complete software is available in binary (executable) form for MS Windows, OS X, Linux (Intel) and Sun Solaris (SPARC), as well as the full source code under an open source license from http://www.copasi.org. Contact: mendes@vbi.vt.edu
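
To make the switching idea concrete, the following is a minimal sketch (not COPASI code) of a particle-number-based switch between a stochastic Gillespie step and a deterministic Euler step for a single birth-death reaction; the rate constants, switching threshold, and step size are illustrative assumptions.

    # Minimal sketch of switching between stochastic (Gillespie SSA) and deterministic
    # (Euler ODE) simulation of a birth-death process: X -> X+1 at rate k_in,
    # X -> X-1 at rate k_out * X. Rates, threshold, and step size are illustrative.
    import random
    import math

    K_IN, K_OUT = 50.0, 0.1     # birth rate (molecules/s), death rate constant (1/s)
    SWITCH_AT = 200             # use the deterministic regime above this copy number
    DT = 0.01                   # Euler step for the deterministic regime

    def step_stochastic(x, t):
        """One Gillespie SSA step: sample waiting time and which reaction fires."""
        a1, a2 = K_IN, K_OUT * x
        a0 = a1 + a2
        t += -math.log(random.random()) / a0
        x += 1 if random.random() * a0 < a1 else -1
        return x, t

    def step_deterministic(x, t):
        """One explicit Euler step of dx/dt = k_in - k_out * x."""
        return x + DT * (K_IN - K_OUT * x), t + DT

    x, t = 10.0, 0.0
    while t < 100.0:
        if x < SWITCH_AT:
            x, t = step_stochastic(max(int(round(x)), 0), t)  # discrete copy numbers
        else:
            x, t = step_deterministic(x, t)
    print(f"t={t:.2f}s  X={x:.1f}  (deterministic steady state = {K_IN/K_OUT:.0f})")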

2,351 citations


Book ChapterDOI
TL;DR: This chapter describes each component of the TM4 suite of open‐source tools for data management and reporting, image analysis, normalization and pipeline control, and data mining and visualization and includes a sample analysis walk‐through.
Abstract: Powerful specialized software is essential for managing, quantifying, and ultimately deriving scientific insight from results of a microarray experiment. We have developed a suite of software applications, known as TM4, to support such gene expression studies. The suite consists of open-source tools for data management and reporting, image analysis, normalization and pipeline control, and data mining and visualization. An integrated MIAME-compliant MySQL database is included. This chapter describes each component of the suite and includes a sample analysis walk-through.
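
As a concrete illustration of the normalization stage mentioned above, the sketch below median-centers log2 two-channel ratios. It shows the kind of operation such a pipeline performs; it is not TM4/MIDAS code, and the spot intensities are made up.

    # Minimal sketch of a common two-channel microarray normalization step:
    # global median-centering of log2(Cy5/Cy3) ratios. Illustrative only, not TM4 code.
    import math

    def median(values):
        s = sorted(values)
        n = len(s)
        return s[n // 2] if n % 2 else 0.5 * (s[n // 2 - 1] + s[n // 2])

    def normalize_log_ratios(cy5, cy3):
        """Return median-centered log2 ratios for paired channel intensities."""
        ratios = [math.log2(r / g) for r, g in zip(cy5, cy3) if r > 0 and g > 0]
        m = median(ratios)
        return [lr - m for lr in ratios]

    # Toy intensities for five spots (hypothetical values)
    cy5 = [1200.0, 340.0, 5600.0, 80.0, 950.0]
    cy3 = [1000.0, 400.0, 5000.0, 120.0, 700.0]
    print([round(v, 3) for v in normalize_log_ratios(cy5, cy3)])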

1,931 citations


Journal ArticleDOI
TL;DR: Meta-DiSc is a comprehensive and dedicated test accuracy meta-analysis software that has already been used and cited in several meta-analyses published in high-ranking journals and is publicly available.
Abstract: Systematic reviews and meta-analyses of test accuracy studies are increasingly being recognised as central in guiding clinical practice. However, there is currently no dedicated and comprehensive software for meta-analysis of diagnostic data. In this article, we present Meta-DiSc, a Windows-based, user-friendly, freely available (for academic use) software that we have developed, piloted, and validated to perform diagnostic meta-analysis. Meta-DiSc a) allows exploration of heterogeneity, with a variety of statistics including chi-square, I-squared and Spearman correlation tests, b) implements meta-regression techniques to explore the relationships between study characteristics and accuracy estimates, c) performs statistical pooling of sensitivities, specificities, likelihood ratios and diagnostic odds ratios using fixed and random effects models, both overall and in subgroups and d) produces high quality figures, including forest plots and summary receiver operating characteristic curves that can be exported for use in manuscripts for publication. All computational algorithms have been validated through comparison with different statistical tools and published meta-analyses. Meta-DiSc has a Graphical User Interface with roll-down menus, dialog boxes, and online help facilities. Meta-DiSc is a comprehensive and dedicated test accuracy meta-analysis software. It has already been used and cited in several meta-analyses published in high-ranking journals. The software is publicly available at http://www.hrc.es/investigacion/metadisc_en.htm .
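
The pooling step described in point (c) can be illustrated with a small sketch: fixed-effect, inverse-variance pooling of sensitivities on the logit scale with a 0.5 continuity correction. The study counts are invented, and this is only one of several pooling approaches such a tool offers; it is not Meta-DiSc's code.

    # Sketch of fixed-effect (inverse-variance) pooling of sensitivities on the logit
    # scale, one computation a diagnostic meta-analysis tool performs. Hypothetical data.
    import math

    # (true positives, false negatives) per study -- made-up counts
    studies = [(45, 5), (30, 10), (80, 12), (22, 8)]

    def logit_pool(counts):
        num = den = 0.0
        for tp, fn in counts:
            tp, fn = tp + 0.5, fn + 0.5          # continuity correction
            logit = math.log(tp / fn)            # logit(sensitivity)
            var = 1.0 / tp + 1.0 / fn            # approximate variance of the logit
            w = 1.0 / var                        # inverse-variance weight
            num += w * logit
            den += w
        pooled_logit = num / den
        se = math.sqrt(1.0 / den)
        def to_prob(x):
            return 1.0 / (1.0 + math.exp(-x))
        return (to_prob(pooled_logit),
                to_prob(pooled_logit - 1.96 * se),
                to_prob(pooled_logit + 1.96 * se))

    sens, lo, hi = logit_pool(studies)
    print(f"Pooled sensitivity {sens:.3f} (95% CI {lo:.3f}-{hi:.3f})")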

1,727 citations


Book
22 Nov 2006
TL;DR: A practical, example-driven guide to fitting linear mixed models for clustered, repeated-measures, and longitudinal data, with analysis steps, diagnostics, and software notes and recommendations compared across statistical packages.
Abstract: INTRODUCTION What Are Linear Mixed Models (LMMs)? A Brief History of Linear Mixed Models
LINEAR MIXED MODELS: AN OVERVIEW Introduction Specification of LMMs The Marginal Linear Model Estimation in LMMs Computational Issues Tools for Model Selection Model-Building Strategies Checking Model Assumptions (Diagnostics) Other Aspects of LMMs Power Analysis for Linear Mixed Models Chapter Summary
TWO-LEVEL MODELS FOR CLUSTERED DATA: THE RAT PUP EXAMPLE Introduction The Rat Pup Study Overview of the Rat Pup Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Estimating the Intraclass Correlation Coefficients (ICCs) Calculating Predicted Values Diagnostics for the Final Model Software Notes and Recommendations
THREE-LEVEL MODELS FOR CLUSTERED DATA: THE CLASSROOM EXAMPLE Introduction The Classroom Study Overview of the Classroom Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Estimating the Intraclass Correlation Coefficients (ICCs) Calculating Predicted Values Diagnostics for the Final Model Software Notes Recommendations
MODELS FOR REPEATED-MEASURES DATA: THE RAT BRAIN EXAMPLE Introduction The Rat Brain Study Overview of the Rat Brain Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Diagnostics for the Final Model Software Notes Other Analytic Approaches Recommendations
RANDOM COEFFICIENT MODELS FOR LONGITUDINAL DATA: THE AUTISM EXAMPLE Introduction The Autism Study Overview of the Autism Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model Calculating Predicted Values Diagnostics for the Final Model Software Note: Computational Problems with the D Matrix An Alternative Approach: Fitting the Marginal Model with an Unstructured Covariance Matrix
MODELS FOR CLUSTERED LONGITUDINAL DATA: THE DENTAL VENEER EXAMPLE Introduction The Dental Veneer Study Overview of the Dental Veneer Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Diagnostics for the Final Model Software Notes and Recommendations Other Analytic Approaches
MODELS FOR DATA WITH CROSSED RANDOM FACTORS: THE SAT SCORE EXAMPLE Introduction The SAT Score Study Overview of the SAT Score Data Analysis Analysis Steps in the Software Procedures Results of Hypothesis Tests Comparing Results across the Software Procedures Interpreting Parameter Estimates in the Final Model The Implied Marginal Variance-Covariance Matrix for the Final Model Recommended Diagnostics for the Final Model Software Notes and Additional Recommendations
APPENDIX A: STATISTICAL SOFTWARE RESOURCES
APPENDIX B: CALCULATION OF THE MARGINAL VARIANCE-COVARIANCE MATRIX
APPENDIX C: ACRONYMS/ABBREVIATIONS
BIBLIOGRAPHY
INDEX
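
For readers who want a concrete starting point, here is a minimal sketch of the kind of two-level random-intercept model fitted in the clustered-data chapters, using Python's statsmodels on synthetic data (the book itself works through SAS, SPSS, R, Stata, and HLM, not Python).

    # Minimal sketch of a two-level random-intercept linear mixed model on synthetic data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n_clusters, n_per = 30, 10
    cluster = np.repeat(np.arange(n_clusters), n_per)
    u = rng.normal(0, 2.0, n_clusters)[cluster]       # cluster-level random intercepts
    x = rng.normal(size=n_clusters * n_per)
    y = 1.0 + 0.5 * x + u + rng.normal(size=x.size)   # fixed effects + residual noise

    data = pd.DataFrame({"y": y, "x": x, "cluster": cluster})
    model = smf.mixedlm("y ~ x", data, groups=data["cluster"])
    result = model.fit()
    print(result.summary())                            # fixed effects and variance components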

1,680 citations


Patent
10 May 2006
TL;DR: The patent describes a set of improvements to the MS Windows operating system, including a "Reset" function based on a saved post-boot system image, a powerful undo feature, improved undo and change-tracking in word processing, improved file comparison and backup features, and many additional refinements.
Abstract: Although MS Windows (in its various versions) is at present the most popular OS (Operating System) in personal computers, after years of consecutive improvements there are still various issues which need to be improved, which include for example issues of efficiency, comfort, and/or reliability. The present invention tries to solve the above problems in new ways that include considerable improvements over the prior art. Preferably the system allows for example a “Reset” function, which means that preferably an Image of the state of the OS (including all loaded software) is saved immediately after a successful boot on the disk or other non-volatile memory and is preferably automatically updated when new drivers and/or software that change the state after a boot are added, so that if the system gets stuck it can be instantly restarted as if it has been rebooted. Other features include for example solving the problem that the focus can be grabbed while the user is typing something, allowing the user to easily define or increase or decrease the priority of various processes or open windows, a powerful undo feature that can include preferably even any changes to the hard disk, improved undo features in word processing, improved file comparison features, being able for example to track changes retroactively, improved backup features, and many additional improvements. The application covers also improvements that are related for example to Word processing (since for example in Microsoft Windows, Word behaves like an integral part of the system) and things that are related to the user's Internet surfing experience, including for example improved search experience (This is important since for example in Microsoft Windows, Internet Explorer is practically an integral part of the OS). The invention deals also with some preferable improvements in the performance of the hard disk and also with some other smart computerized devices.

1,185 citations


Proceedings ArticleDOI
TL;DR: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes.
Abstract: The CIAO (Chandra Interactive Analysis of Observations) software package was first released in 1999 following the launch of the Chandra X-ray Observatory and is used by astronomers across the world to analyze Chandra data as well as data from other telescopes. From the earliest design discussions, CIAO was planned as a general-purpose scientific data analysis system optimized for X-ray astronomy, and consists mainly of command line tools (allowing easy pipelining and scripting) with a parameter-based interface layered on a flexible data manipulation I/O library. The same code is used for the standard Chandra archive pipeline, allowing users to recalibrate their data in a consistent way. We will discuss the lessons learned from the first six years of the software's evolution. Our initial approach to documentation evolved to concentrate on recipe-based "threads" which have proved very successful. A multi-dimensional abstract approach to data analysis has allowed new capabilities to be added while retaining existing interfaces. A key requirement for our community was interoperability with other data analysis systems, leading us to adopt standard file formats and an architecture which was as robust as possible to the input of foreign data files, as well as re-using a number of external libraries. We support users who are comfortable with coding themselves via a flexible user scripting paradigm, while the availability of tightly constrained pipeline programs is of benefit to less computationally-advanced users. As with other analysis systems, we have found that infrastructure maintenance and re-engineering is a necessary and significant ongoing effort and needs to be planned into any long-lived astronomy software.

1,145 citations


Journal ArticleDOI
TL;DR: This paper inductively derives a framework for understanding participation from the perspective of the individual software developer based on data from two software communities with different governance structures.
Abstract: Open source software projects rely on the voluntary efforts of thousands of software developers, yet we know little about why developers choose to participate in this collective development process. This paper inductively derives a framework for understanding participation from the perspective of the individual software developer based on data from two software communities with different governance structures. In both communities, a need for software-related improvements drives initial participation. The majority of participants leave the community once their needs are met; however, a small subset remains involved. For this set of developers, motives evolve over time and participation becomes a hobby. These hobbyists are critical to the long-term viability of the software code: They take on tasks that might otherwise go undone and work to maintain the simplicity and modularity of the code. Governance structures affect this evolution of motives. Implications for firms interested in implementing hybrid strategies designed to combine the advantages of open source software development with proprietary ownership and control are discussed.

905 citations


Journal ArticleDOI
TL;DR: It is argued that the open source software phenomenon has metamorphosed into a more mainstream and commercially viable form, which the author labels OSS 2.0, and that the bazaar metaphor has actually shifted to become better suited to the OSS 2.0 product delivery and support process.
Abstract: A frequent characterization of open source software is the somewhat outdated, mythical one of a collective of supremely talented software hackers freely volunteering their services to produce uniformly high-quality software. I contend that the open source software phenomenon has metamorphosed into a more mainstream and commercially viable form, which I label as OSS 2.0. I illustrate this transformation using a framework of process and product factors, and discuss how the bazaar metaphor, which up to now has been associated with the open source development process, has actually shifted to become a metaphor better suited to the OSS 2.0 product delivery and support process. Overall the OSS 2.0 phenomenon is significantly different from its free software antecedent. Its emergence accentuates the fundamental alteration of the basic ground rules in the software landscape, signifying the end of the proprietary-driven model that has prevailed for the past 20 years or so. Thus, a clear understanding of the characteristics of the emergent OSS 2.0 phenomenon is required to address key challenges for research and practice.

837 citations


Proceedings ArticleDOI
28 May 2006
TL;DR: Using principal component analysis on the code metrics, this work built regression models that accurately predict the likelihood of post-release defects for new entities and can be generalized to arbitrary projects.
Abstract: What is it that makes software fail? In an empirical study of the post-release defect history of five Microsoft software systems, we found that failure-prone software entities are statistically correlated with code complexity measures. However, there is no single set of complexity metrics that could act as a universally best defect predictor. Using principal component analysis on the code metrics, we built regression models that accurately predict the likelihood of post-release defects for new entities. The approach can easily be generalized to arbitrary projects; in particular, predictors obtained from one project can also be significant for new, similar projects.
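
A minimal sketch of the modeling pipeline described above, principal component analysis over code metrics feeding a regression-based defect predictor, is given below. The data are synthetic, and the choice of logistic regression and three components is an illustrative assumption rather than the study's exact setup.

    # Sketch of PCA on code metrics followed by a regression model predicting
    # defect-proneness. Synthetic data; not the Microsoft study's dataset or code.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(42)
    n = 500
    # Six correlated complexity metrics (e.g., lines of code, cyclomatic complexity, fan-out)
    base = rng.normal(size=(n, 1))
    metrics = np.hstack([base + 0.3 * rng.normal(size=(n, 1)) for _ in range(6)])
    defect_prone = (base[:, 0] + 0.5 * rng.normal(size=n) > 0.8).astype(int)

    model = make_pipeline(StandardScaler(), PCA(n_components=3), LogisticRegression())
    scores = cross_val_score(model, metrics, defect_prone, cv=5, scoring="roc_auc")
    print("Cross-validated AUC:", scores.round(3))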

803 citations


Journal ArticleDOI
TL;DR: The main goals of this article are to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach, and to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers.
Abstract: In this paper we describe the libMesh (http://libmesh.sourceforge.net) framework for parallel adaptive finite element applications. libMesh is an open-source software library that has been developed to facilitate serial and parallel simulation of multiscale, multiphysics applications using adaptive mesh refinement and coarsening strategies. The main software development is being carried out in the CFDLab (http://cfdlab.ae.utexas.edu) at the University of Texas, but as with other open-source software projects, contributions are being made elsewhere in the US and abroad. The main goals of this article are: (1) to provide a basic reference source that describes libMesh and the underlying philosophy and software design approach; (2) to give sufficient detail and references on the adaptive mesh refinement and coarsening (AMR/C) scheme for applications analysts and developers; and (3) to describe the parallel implementation and data structures with supporting discussion of domain decomposition, message passing, and details related to dynamic repartitioning for parallel AMR/C. Other aspects related to C++ programming paradigms, reusability for diverse applications, adaptive modeling, physics-independent error indicators, and similar concepts are briefly discussed. Finally, results from some applications using the library are presented and areas of future research are discussed.
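
The AMR/C loop at the heart of such a framework can be sketched in a few lines: solve, compute a per-element error indicator, refine the worst elements, and repeat. The 1-D example below is conceptual only and does not use libMesh or its API; the target function, indicator, and refinement fraction are arbitrary choices.

    # Conceptual sketch of an error-indicator-driven adaptive mesh refinement loop in 1D.
    import numpy as np

    def solve(nodes):
        """Placeholder 'solve': sample a target function with a steep interior layer."""
        return np.tanh(20 * (nodes - 0.5))

    def error_indicator(nodes, values):
        """Jump in slope between neighbouring elements as a cheap error indicator."""
        slopes = np.diff(values) / np.diff(nodes)
        jumps = np.abs(np.diff(slopes))
        ind = np.zeros(len(nodes) - 1)
        ind[1:] += jumps
        ind[:-1] += jumps
        return ind

    nodes = np.linspace(0.0, 1.0, 11)                 # coarse initial mesh
    for cycle in range(5):
        values = solve(nodes)
        ind = error_indicator(nodes, values)
        refine = ind > 0.5 * ind.max()                # refine the worst elements
        midpoints = 0.5 * (nodes[:-1] + nodes[1:])[refine]
        nodes = np.sort(np.concatenate([nodes, midpoints]))
        print(f"cycle {cycle}: {len(nodes) - 1} elements")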

761 citations


Journal ArticleDOI
TL;DR: This paper reports data from a study that seeks to characterize the differences in design structure between complex software products, using design structure matrices to map dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs.
Abstract: This paper reports data from a study that seeks to characterize the differences in design structure between complex software products. We use design structure matrices (DSMs) to map dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs. We use these metrics to compare the architectures of two software products---the Linux operating system and the Mozilla Web browser---that were developed via contrasting modes of organization: specifically, open source versus proprietary development. We then track the evolution of Mozilla, paying attention to a purposeful “redesign” effort undertaken with the intention of making the product more “modular.” We find significant differences in structure between Linux and the first version of Mozilla, suggesting that Linux had a more modular architecture. Yet we also find that the redesign of Mozilla resulted in an architecture that was significantly more modular than that of its predecessor and, indeed, than that of Linux. Our results, while exploratory, are consistent with a view that different modes of organization are associated with designs that possess different structures. However, they also suggest that purposeful managerial actions can have a significant impact in adapting a design's structure. This latter result is important given recent moves to release proprietary software into the public domain. These moves are likely to fail unless the product possesses an “architecture for participation.”
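
One metric of the kind used in such DSM studies is the "propagation cost": the density of the transitive closure of the dependency matrix, i.e., how much of the system a change can reach on average. The sketch below computes it for a small made-up DSM; it is not the paper's code or dataset.

    # Sketch of a DSM-based modularity metric: propagation cost.
    import numpy as np

    def propagation_cost(dsm):
        """Fraction of (i, j) pairs where a change to j can propagate to i."""
        n = dsm.shape[0]
        reach = ((dsm > 0) | np.eye(n, dtype=bool)).astype(int)
        for _ in range(n):
            reach = ((reach @ reach) > 0).astype(int)   # Boolean transitive closure
        return reach.sum() / (n * n)

    # Rows depend on columns: element 0 depends on 1, 1 on 2, etc. (hypothetical DSM)
    dsm = np.array([[0, 1, 0, 0, 0, 0],
                    [0, 0, 1, 0, 0, 0],
                    [0, 0, 0, 1, 0, 0],
                    [0, 0, 0, 0, 0, 0],
                    [0, 0, 0, 0, 0, 1],
                    [0, 0, 0, 0, 0, 0]])
    print(f"propagation cost = {propagation_cost(dsm):.2f}")   # lower = more modular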

Journal ArticleDOI
TL;DR: Recent work to redesign the EIDORS software structure in order to simplify its use and provide a uniform interface, permitting easier modification and customization is described.
Abstract: EIDORS is an open source software suite for image reconstruction in electrical impedance tomography and diffuse optical tomography, designed to facilitate collaboration, testing and new research in these fields. This paper describes recent work to redesign the software structure in order to simplify its use and provide a uniform interface, permitting easier modification and customization. We describe the key features of this software, followed by examples of its use. One general issue with inverse problem software is the difficulty of correctly implementing algorithms and the consequent ease with which subtle numerical bugs can be inadvertently introduced. EIDORS helps with this issue, by allowing sharing and reuse of well-documented and debugged software. On the other hand, since EIDORS is designed to facilitate use by non-specialists, its use may inadvertently result in such numerical errors. In order to address this issue, we develop a list of ways in which such errors with inverse problems (which we refer to as 'cheats') may occur. Our hope is that such an overview may assist authors of software to avoid such implementation issues.
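
The kind of linearized, regularized reconstruction step such inverse-problem packages implement can be sketched generically as x = (J'J + lambda^2 R)^-1 J'b. The example below uses a random stand-in Jacobian and a hand-picked regularization parameter; it is not EIDORS itself (which is MATLAB/Octave code).

    # Generic sketch of one-step Tikhonov-regularized difference reconstruction.
    import numpy as np

    rng = np.random.default_rng(1)
    n_meas, n_elem = 208, 576                 # e.g. measurement count vs. mesh elements
    J = rng.normal(size=(n_meas, n_elem))     # stand-in sensitivity (Jacobian) matrix
    x_true = np.zeros(n_elem)
    x_true[100:110] = 1.0                     # a small conductivity perturbation
    b = J @ x_true + 0.01 * rng.normal(size=n_meas)   # noisy difference data

    lam = 10.0                                # regularization hyperparameter (hand-tuned)
    R = np.eye(n_elem)                        # zeroth-order Tikhonov prior
    x_hat = np.linalg.solve(J.T @ J + lam**2 * R, J.T @ b)
    print("reconstruction error:", round(float(np.linalg.norm(x_hat - x_true)), 3))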

Journal ArticleDOI
TL;DR: This paper proposes a tool, CP-Miner, that uses data mining techniques to efficiently identify copy-pasted code and copy-paste bugs in large software suites, and that has detected many new bugs in popular operating systems.
Abstract: Recent studies have shown that large software suites contain significant amounts of replicated code. It is assumed that some of this replication is due to copy-and-paste activity and that a significant proportion of bugs in operating systems are due to copy-paste errors. Existing static code analyzers are either not scalable to large software suites or do not perform robustly where replicated code is modified with insertions and deletions. Furthermore, the existing tools do not detect copy-paste related bugs. In this paper, we propose a tool, CP-Miner, that uses data mining techniques to efficiently identify copy-pasted code in large software suites and detects copy-paste bugs. Specifically, it takes less than 20 minutes for CP-Miner to identify 190,000 copy-pasted segments in Linux and 150,000 in FreeBSD. Moreover, CP-Miner has detected many new bugs in popular operating systems, 49 in Linux and 31 in FreeBSD, most of which have since been confirmed by the corresponding developers and have been rectified in the following releases. In addition, we have found some interesting characteristics of copy-paste in operating system code. Specifically, we analyze the distribution of copy-pasted code by size (number of lines of code), granularity (basic blocks and functions), and modification within copy-pasted code. We also analyze copy-paste across different modules and various software versions.
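
A greatly simplified illustration of copy-paste detection is sketched below: hash fixed-size windows of normalized lines and report windows that occur more than once. CP-Miner itself mines frequent subsequences of tokenized statements and tolerates insertions and deletions; this sketch finds only exact normalized clones, and the window size and toy code are arbitrary.

    # Toy clone detector: identifier/number-insensitive hashing of line windows.
    import re
    from collections import defaultdict

    WINDOW = 4   # minimum clone length, in lines

    def normalize(line):
        """Strip whitespace and map identifiers and numbers to placeholders."""
        line = re.sub(r"\b[A-Za-z_]\w*\b", "ID", line.strip())
        return re.sub(r"\b\d+\b", "NUM", line)

    def find_clones(lines):
        index = defaultdict(list)
        for i in range(len(lines) - WINDOW + 1):
            key = tuple(normalize(l) for l in lines[i:i + WINDOW])
            index[key].append(i)
        return [locs for locs in index.values() if len(locs) > 1]

    code = [
        "x = a + b",
        "y = x * 2",
        "print(y)",
        "z = 0",
        "u = c + d",
        "v = u * 2",
        "print(v)",
        "w = 0",
    ]
    print(find_clones(code))   # the two structurally identical 4-line blocks are reported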

01 Jan 2006
TL;DR: GCDkit is a program for handling and recalculation of geochemical data from igneous and metamorphic rocks using the Windows version of R, which provides a flexible and comprehensive language and environment for data analysis and graphics.
Abstract: Geochemical Data Toolkit (GCDkit) is a program for handling and recalculation of geochemical data from igneous and metamorphic rocks. It is built using the Windows version of R, which provides a flexible and comprehensive language and environment for data analysis and graphics. GCDkit was designed to eliminate routine and tedious operations involving large collections of whole-rock data and, at the same time, provide access to the wealth of statistical functions built into R. Data management tools include import and export of data files in a number of formats, data editing, searching, grouping and generation of subsets. Included are a variety of calculation and normative schemes, for instance CIPW and Mesonorm, as are the common geochemical graphs (e.g. binary and ternary graphs, Harker plots, spider plots, and several dozens of classification and geotectonic discrimination diagrams). The graphical output is publication ready but can be further retouched if required. The system can be further expanded by means of plug-in modules that provide specialist applications. GCDkit is available as Free Software under the terms of the Free Software Foundation's GNU General Public License and can be downloaded from http://www.gla.ac.uk/gcdkit. The product is actively maintained and updated to provide additional functionality; Unix/Linux and Mac OS versions are being developed.

Journal ArticleDOI
TL;DR: The purpose of this paper is to present and compare these implementations of support vector machines, among the most popular and efficient classification and regression methods currently available.
Abstract: Being among the most popular and efficient classification and regression methods currently available, implementations of support vector machines exist in almost every popular programming language. Currently four R packages contain SVM related software. The purpose of this paper is to present and compare these implementations. (authors' abstract)

Book
01 Jan 2006
TL;DR: This is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field.
Abstract: Apply powerful Data Mining Methods and Models to Leverage your Data for Actionable Results Data Mining Methods and Models provides: * The latest techniques for uncovering hidden nuggets of information * The insight into how the data mining algorithms actually work * The hands-on experience of performing data mining on large data sets Data Mining Methods and Models: * Applies a "white box" methodology, emphasizing an understanding of the model structures underlying the software * Walks the reader through the various algorithms and provides examples of the operation of the algorithms on actual large data sets, including a detailed case study, "Modeling Response to Direct-Mail Marketing" * Tests the reader's level of understanding of the concepts and methodologies, with over 110 chapter exercises * Demonstrates the Clementine data mining software suite, WEKA open source data mining software, SPSS statistical software, and Minitab statistical software * Includes a companion Web site, www.dataminingconsultant.com, where the data sets used in the book may be downloaded, along with a comprehensive set of data mining resources. Faculty adopters of the book have access to an array of helpful resources, including solutions to all exercises, a PowerPoint(r) presentation of each chapter, sample data mining course projects and accompanying data sets, and multiple-choice chapter quizzes. With its emphasis on learning by doing, this is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field. An Instructor's Manual presenting detailed solutions to all the problems in the book is available online.

Journal ArticleDOI
TL;DR: The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments, and distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the educational features.
Abstract: Meta-analysis has become a well-known method for synthesis of quantitative data from previously conducted research in applied health sciences. So far, meta-analysis has been particularly useful in evaluating and comparing therapies and in assessing causes of disease. Consequently, the number of software packages that can perform meta-analysis has increased over the years. Unfortunately, it can take a substantial amount of time to get acquainted with some of these programs and most contain little or no interactive educational material. We set out to create and validate an easy-to-use and comprehensive meta-analysis package that would be simple enough programming-wise to remain available as a free download. We specifically aimed at students and researchers who are new to meta-analysis, with important parts of the development oriented towards creating internal interactive tutoring tools and designing features that would facilitate usage of the software as a companion to existing books on meta-analysis. We took an unconventional approach and created a program that uses Excel as a calculation and programming platform. The main programming language was Visual Basic, as implemented in Visual Basic 6 and Visual Basic for Applications in Excel 2000 and higher. The development took approximately two years and resulted in the 'MIX' program, which can be downloaded from the program's website free of charge. Next, we set out to validate the MIX output with two major software packages as reference standards, namely STATA (metan, metabias, and metatrim) and Comprehensive Meta-Analysis Version 2. Eight meta-analyses that had been published in major journals were used as data sources. All numerical and graphical results from analyses with MIX were identical to their counterparts in STATA and CMA. The MIX program distinguishes itself from most other programs by the extensive graphical output, the click-and-go (Excel) interface, and the educational features. The MIX program is a valid tool for performing meta-analysis and may be particularly useful in educational environments. It can be downloaded free of charge via http://www.mix-for-meta-analysis.info or http://sourceforge.net/projects/meta-analysis .
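
For readers new to the computations such tools perform, the sketch below implements DerSimonian-Laird random-effects pooling alongside a fixed-effect estimate. The effect sizes and variances are made up, and this is an illustration of the method, not MIX's Excel/VBA code.

    # Fixed-effect and DerSimonian-Laird random-effects pooling of generic effect sizes.
    import math

    effects   = [0.30, 0.10, 0.45, 0.25, 0.05]   # study effect sizes (e.g., log odds ratios)
    variances = [0.04, 0.02, 0.09, 0.03, 0.05]   # their within-study variances

    w = [1.0 / v for v in variances]             # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Between-study variance (tau^2), DerSimonian-Laird moment estimator
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    print(f"fixed = {fixed:.3f}, random = {pooled:.3f} "
          f"(95% CI {pooled - 1.96 * se:.3f} to {pooled + 1.96 * se:.3f}), tau^2 = {tau2:.4f}")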

Proceedings ArticleDOI
28 May 2006
TL;DR: The approach separates the adaptation behavior and non-adaptive behavior specifications of adaptive programs, making the models easier to specify and more amenable to automated analysis and visual inspection.
Abstract: Increasingly, software should dynamically adapt its behavior at run-time in response to changing conditions in the supporting computing and communication infrastructure, and in the surrounding physical environment. In order for an adaptive program to be trusted, it is important to have mechanisms to ensure that the program functions correctly during and after adaptations. Adaptive programs are generally more difficult to specify, verify, and validate due to their high complexity. Particularly, when involving multi-threaded adaptations, the program behavior is the result of the collaborative behavior of multiple threads and software components. This paper introduces an approach to create formal models for the behavior of adaptive programs. Our approach separates the adaptation behavior and non-adaptive behavior specifications of adaptive programs, making the models easier to specify and more amenable to automated analysis and visual inspection. We introduce a process to construct adaptation models, automatically generate adaptive programs from the models, and verify and validate the models. We illustrate our approach through the development of an adaptive GSM-oriented audio streaming protocol for a mobile computing application.
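
The separation of adaptation behavior from non-adaptive behavior can be illustrated, very loosely, with executable code: the steady-state component below only encodes frames, while a separate adaptation manager decides when to switch codecs. The class and codec names are invented, and this is an analogy to the idea only, not the paper's formal modeling approach.

    # Simplified separation of steady-state behavior from adaptation behavior.

    class SteadyStateBehavior:
        """Non-adaptive behavior: encode frames with whatever codec is current."""
        def __init__(self, codec):
            self.codec = codec
        def process(self, frame):
            return f"{self.codec}:{frame}"

    class AdaptationManager:
        """Adaptation behavior: decide when to switch codecs, kept separate from the
        steady-state logic so each part can be specified and checked on its own."""
        def __init__(self, behavior):
            self.behavior = behavior
        def observe(self, packet_loss):
            if packet_loss > 0.10 and self.behavior.codec != "gsm-lossy":
                self.behavior.codec = "gsm-lossy"      # degrade gracefully under loss
            elif packet_loss <= 0.02 and self.behavior.codec != "gsm-full":
                self.behavior.codec = "gsm-full"       # restore quality when stable

    stream = SteadyStateBehavior("gsm-full")
    manager = AdaptationManager(stream)
    for loss, frame in [(0.01, "f1"), (0.20, "f2"), (0.15, "f3"), (0.00, "f4")]:
        manager.observe(loss)
        print(stream.process(frame))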

Journal ArticleDOI
TL;DR: The free software package described here was designed to help neurophysiologists process and view recorded data in an efficient and user-friendly manner and consists of several well-integrated applications.

Journal ArticleDOI
TL;DR: A case study is presented to demonstrate how Bunch can be used to create views of the structure of significant software systems and research is outlined to evaluate the software clustering results produced by Bunch.
Abstract: Since modern software systems are large and complex, appropriate abstractions of their structure are needed to make them more understandable and, thus, easier to maintain. Software clustering techniques are useful to support the creation of these abstractions by producing architectural-level views of a system's structure directly from its source code. This paper examines the Bunch clustering system which, unlike other software clustering tools, uses search techniques to perform clustering. Bunch produces a subsystem decomposition by partitioning a graph of the entities (e.g., classes) and relations (e.g., function calls) in the source code. Bunch uses a fitness function to evaluate the quality of graph partitions and uses search algorithms to find a satisfactory solution. This paper presents a case study to demonstrate how Bunch can be used to create views of the structure of significant software systems. This paper also outlines research to evaluate the software clustering results produced by Bunch.
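
The search-based clustering idea can be sketched as a small hill climb over module assignments of a dependency graph, maximizing a simple cohesion-versus-coupling score used here as a stand-in for Bunch's MQ fitness function. The graph, number of clusters, and step count below are arbitrary.

    # Hill-climbing software clustering sketch in the spirit of Bunch.
    import random

    edges = {("a", "b"), ("b", "c"), ("a", "c"),       # one tightly coupled group
             ("d", "e"), ("e", "f"), ("d", "f"),       # another group
             ("c", "d")}                               # a single cross link
    nodes = sorted({n for e in edges for n in e})

    def fitness(assign):
        intra = sum(1 for u, v in edges if assign[u] == assign[v])
        inter = len(edges) - intra
        return intra - inter                           # simplified quality measure

    def hill_climb(steps=2000, clusters=2, seed=0):
        rng = random.Random(seed)
        assign = {n: rng.randrange(clusters) for n in nodes}
        best = fitness(assign)
        for _ in range(steps):
            n = rng.choice(nodes)
            old = assign[n]
            assign[n] = rng.randrange(clusters)
            new = fitness(assign)
            if new >= best:
                best = new                             # keep the non-worsening move
            else:
                assign[n] = old                        # revert
        return assign, best

    assignment, score = hill_climb()
    print(score, assignment)                           # expect {a,b,c} vs {d,e,f}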

Proceedings ArticleDOI
18 Dec 2006
TL;DR: Different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes) show different effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique.
Abstract: Automated diagnosis of software faults can improve the efficiency of the debugging process, and is therefore an important technique for the development of dependable software. In this paper we study different similarity coefficients that are applied in the context of a program spectral approach to software fault localization (single programming mistakes). The coefficients studied are taken from the systems diagnosis/automated debugging tools Pinpoint, Tarantula, and AMPLE, and from the molecular biology domain (the Ochiai coefficient). We evaluate these coefficients on the Siemens Suite of benchmark faults, and assess their effectiveness in terms of the position of the actual fault in the probability ranking of fault candidates produced by the diagnosis technique. Our experiments indicate that the Ochiai coefficient consistently outperforms the coefficients currently used by the tools mentioned. In terms of the amount of code that needs to be inspected, this coefficient improves on the next best technique by 5% on average, and by up to 30% in specific cases.
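
The coefficients studied can be computed directly from program spectra. The sketch below ranks statements by Ochiai and Tarantula suspiciousness for a toy coverage matrix; the formulas follow their standard definitions, while the coverage data and test verdicts are invented.

    # Spectrum-based fault localization with Ochiai and Tarantula coefficients.
    import math

    # coverage[i][j] == 1 if statement j was executed in test run i
    coverage = [
        [1, 1, 0, 1],   # run 0
        [1, 0, 1, 1],   # run 1
        [0, 1, 1, 1],   # run 2
        [1, 1, 1, 1],   # run 3
    ]
    failed = [0, 1, 0, 1]          # verdict per run (1 = failing)

    def suspiciousness(coverage, failed):
        total_fail = sum(failed)
        total_pass = len(failed) - total_fail
        scores = []
        for j in range(len(coverage[0])):
            ef = sum(1 for i, row in enumerate(coverage) if row[j] and failed[i])
            ep = sum(1 for i, row in enumerate(coverage) if row[j] and not failed[i])
            ochiai = ef / math.sqrt(total_fail * (ef + ep)) if ef else 0.0
            tarantula = ((ef / total_fail) /
                         (ef / total_fail + ep / total_pass)) if ef or ep else 0.0
            scores.append((j, round(ochiai, 3), round(tarantula, 3)))
        return sorted(scores, key=lambda s: -s[1])     # rank by Ochiai, highest first

    for stmt, ochiai, tarantula in suspiciousness(coverage, failed):
        print(f"stmt {stmt}: Ochiai={ochiai}, Tarantula={tarantula}")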

BookDOI
01 Jan 2006
TL;DR: This book covers topics such as End-User Development as Adaptive Maintenance, Psychological Issues in End-User Programming, and Meta-design: A Framework for the Future of End-User Development.
Abstract: End-User Development: An Emerging Paradigm - Psychological Issues in End-User Programming - More Natural Programming Languages and Environments - What Makes End-User Development Tick? 13 Design Guidelines - An Integrated Software Engineering Approach for End-User Programmers - Component-Based Approaches to Tailorable Systems - Natural Development of Nomadic Interfaces Based on Conceptual Descriptions - End User Development of Web Applications - End-User Development: The Software Shaping Workshop Approach - Participatory Programming: Developing Programmable Bioinformatics Tools for End-Users - Challenges for End-User Development in Intelligent Environments - Fuzzy Rewriting - Breaking It Up: An Industrial Case Study of Component-Based Tailorable Software Design - End-User Development as Adaptive Maintenance - Supporting Collaborative Tailoring - EUD as Integration of Components Off-The-Shelf: The Role of Software Professionals Knowledge Artifacts - Organizational View of End-User Development - A Semiotic Framing for End-User Development - Meta-design: A Framework for the Future of End-User Development - Feasibility Studies for Programming in Natural Language - Future Perspectives in End-User Development

Journal ArticleDOI
TL;DR: The architecture of the MAPGPS software, which automates the processing of GPS data into global total electron content (TEC) maps, is described, and three different methods for solving the receiver bias problem are presented in detail.
Abstract: A software package known as MIT Automated Processing of GPS (MAPGPS) has been developed to automate the processing of GPS data into global total electron content (TEC) maps. The goal of the MAPGPS software is to produce reliable TEC data automatically, although not yet in real time. Observations are used from all available GPS receivers during all geomagnetic conditions where data has been successfully collected. In this paper, the architecture of the MAPGPS software is described. Particular attention is given to the algorithms used to estimate the individual receiver biases. One of the largest sources of error in estimating TEC from GPS data is the determination of these unknown receiver biases. The MAPGPS approach to solving the receiver bias problem uses three different methods: minimum scalloping, least squares, and zero-TEC. These methods are described in detail, along with their relative performance characteristics. A brief comparison of the JPL and MAPGPS receiver biases is presented, and a possible remaining error source in the receiver bias estimation is discussed. Finally, the Madrigal database, which allows Web access to the MAPGPS TEC data and maps, is described.
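
The least-squares flavour of bias estimation can be illustrated with a toy linear system: treat (measured minus modeled) TEC as receiver bias plus satellite bias plus noise and solve for the biases, pinning the mean satellite bias to zero to remove the rank deficiency. The sizes, noise level, and constraint weighting below are assumptions, and this is a sketch of the general idea only, not the MAPGPS algorithm.

    # Toy least-squares estimation of receiver and satellite biases from residual TEC.
    import numpy as np

    rng = np.random.default_rng(3)
    n_rx, n_sat = 5, 8
    true_rx = rng.normal(0, 5, n_rx)           # receiver biases (TECu), made up
    true_sat = rng.normal(0, 2, n_sat)
    true_sat -= true_sat.mean()                # same convention as the constraint below

    rows, residuals = [], []
    for r in range(n_rx):
        for s in range(n_sat):
            row = np.zeros(n_rx + n_sat)
            row[r] = 1.0
            row[n_rx + s] = 1.0
            rows.append(row)
            residuals.append(true_rx[r] + true_sat[s] + rng.normal(0, 0.5))

    # Constraint row: satellite biases sum to zero (removes the rank deficiency)
    constraint = np.concatenate([np.zeros(n_rx), np.ones(n_sat)])
    A = np.vstack(rows + [100.0 * constraint])
    b = np.array(residuals + [0.0])
    est = np.linalg.lstsq(A, b, rcond=None)[0]
    print("max receiver bias error:", round(float(np.abs(est[:n_rx] - true_rx).max()), 3))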

01 Jan 2006
TL;DR: In this paper, the main characteristics of the DRAGON software for the calculation of molecular descriptors are briefly illustrated.
Abstract: Due to the relevance that molecular descriptors are constantly gaining in several scientific fields, software for the calculation of molecular descriptors has become a very important tool for scientists. In this paper, the main characteristics of the DRAGON software for the calculation of molecular descriptors are briefly illustrated.

Book
03 Aug 2006
TL;DR: A novel metrics-based approach for detecting design problems in object-oriented software and introduces an important suite of detection strategies for the identification of different well-known design flaws as well as some rarely mentioned ones.
Abstract: Presents a novel metrics-based approach for detecting design problems in object-oriented software. Introduces an important suite of detection strategies for the identification of different well-known design flaws as well as some rarely mentioned ones.
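
A detection strategy of this kind combines several class-level metrics with thresholds. The sketch below flags "god class" candidates using the commonly cited WMC/ATFD/TCC combination; the threshold values and the sample metrics dictionary are illustrative, not the book's calibrated ones.

    # Sketch of a metrics-based detection strategy for a well-known design flaw.

    def is_god_class_candidate(m, wmc_very_high=47, atfd_few=3, tcc_low=1 / 3):
        """A class is suspicious if it is complex, uses many foreign attributes,
        and has low internal cohesion -- all three conditions at once."""
        return (m["WMC"] >= wmc_very_high          # high weighted method count
                and m["ATFD"] > atfd_few           # accesses many foreign data members
                and m["TCC"] < tcc_low)            # low tight class cohesion

    # Hypothetical per-class metrics, e.g. extracted by a metrics tool
    classes = {
        "OrderManager":  {"WMC": 112, "ATFD": 9, "TCC": 0.12},
        "Invoice":       {"WMC": 18,  "ATFD": 1, "TCC": 0.55},
        "ReportBuilder": {"WMC": 60,  "ATFD": 2, "TCC": 0.20},
    }
    for name, metrics in classes.items():
        if is_god_class_candidate(metrics):
            print("design flaw candidate:", name)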

Journal ArticleDOI
TL;DR: The quantification of map similarities and dissimilarities using the Map Comparison Kit (MCK) software is addressed, which is unique in having two map comparison techniques based on fuzzy-set calculation rules.
Abstract: Comparing maps is an important issue in environmental research. There are many reasons to compare maps: (i) to detect temporal/spatial changes or hot-spots, (ii) to compare different models, methodologies or scenarios, (iii) to calibrate, validate land-use models, (iv) to analyse model uncertainty and sensitivity, and (v) to assess map accuracy. This paper addresses the quantification of map similarities and dissimilarities using the Map Comparison Kit (MCK) software. Software and documentation are publicly available on the RIKS website free of charge (http://www.riks.nl/MCK/). The main focus is on 'categorical' or 'nominal' maps. Four different nominal map-comparison techniques are integrated in the software. Maps on ordinal, ratio and interval scale can be dealt with as well. The software is unique in having two map comparison techniques based on fuzzy-set calculation rules. The rationale is that fuzzy-set map comparison is very close to human judgement. Both fuzziness in location and fuzziness in category definitions are dealt with in the software.
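
The fuzziness-of-location idea can be sketched as follows: a cell (partially) matches if its category occurs nearby in the other map, with membership decaying with distance. MCK's actual fuzzy-set rules (including fuzzy category definitions and fuzzy kappa) are more elaborate; the neighbourhood radius, decay, and toy maps below are assumptions.

    # Simplified fuzzy-location agreement between two categorical maps.
    import numpy as np

    def fuzzy_agreement(map_a, map_b, radius=1, halving=1.0):
        rows, cols = map_a.shape
        total = 0.0
        for i in range(rows):
            for j in range(cols):
                best = 0.0
                for di in range(-radius, radius + 1):
                    for dj in range(-radius, radius + 1):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < rows and 0 <= nj < cols and map_b[ni, nj] == map_a[i, j]:
                            dist = max(abs(di), abs(dj))
                            best = max(best, 2.0 ** (-dist / halving))   # distance decay
                total += best
        return total / (rows * cols)

    a = np.array([[1, 1, 2], [1, 2, 2], [3, 3, 2]])
    b = np.array([[1, 2, 2], [1, 1, 2], [3, 2, 2]])   # small displacements of the same pattern
    print("cell-by-cell agreement:", round(float(np.mean(a == b)), 2))
    print("fuzzy agreement:       ", round(fuzzy_agreement(a, b), 2))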

Book ChapterDOI
31 Dec 2006
TL;DR: This book presents a unified account of symbolic data analysis methods in a consistent statistical framework, illustrated with examples from a range of application areas and implemented using the freely available SODAS software.
Abstract: The first book to present a unified account of symbolic data analysis methods in a consistent statistical framework, Symbolic Data Analysis features a substantial number of examples from a range of application areas, including health, the social sciences, economics, and computer science. It includes implementation of the methods described using SODAS software, which has been developed by a team led by Edwin Diday and is freely available on the Web, with an additional chapter that provides a basic guide to the software. It also features exercises at the end of each chapter to help the reader develop their understanding of the methodology, and to enable use of the book as a course text. The book is supported by a website featuring a link to download SODAS software, datasets, solutions to exercises, and additional teaching material.

Journal ArticleDOI
01 May 2006
TL;DR: Compatibility of GA with MPI enables the programmer to take advantage of existing MPI software and libraries when available and appropriate, and the variety of applications implemented with Global Arrays attests to the attractiveness of using higher level abstractions to write parallel code.
Abstract: This paper describes capabilities, evolution, performance, and applications of the Global Arrays (GA) toolkit. GA was created to provide application programmers with an interface that allows them to distribute data while maintaining the type of global index space and programming syntax similar to that available when programming on a single processor. The goal of GA is to free the programmer from the low level management of communication and allow them to deal with their problems at the level at which they were originally formulated. At the same time, compatibility of GA with MPI enables the programmer to take advantage of the existing MPI software/libraries when available and appropriate. The variety of applications that have been implemented using Global Arrays attests to the attractiveness of using higher level abstractions to write parallel code.
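
The "global index space over distributed data" idea can be illustrated with a toy block-distributed array whose callers use global indices while a thin mapping layer locates the owning process and local offset. This shows the programming model only; it is not the Global Arrays API (the real toolkit is a compiled library accessed from C, C++, and Fortran programs).

    # Toy block-distributed 1-D array with a global index space.

    class BlockDistributedArray:
        def __init__(self, global_size, n_procs):
            self.global_size = global_size
            self.n_procs = n_procs
            self.block = -(-global_size // n_procs)          # ceiling division
            # In a real run each process would hold only its own block;
            # here every block lives in one Python list for illustration.
            self.blocks = [[0.0] * min(self.block, max(0, global_size - p * self.block))
                           for p in range(n_procs)]

        def owner(self, i):
            """Translate a global index into (owning process, local index)."""
            return i // self.block, i % self.block

        def put(self, i, value):
            p, local = self.owner(i)
            self.blocks[p][local] = value                    # would be a one-sided put

        def get(self, i):
            p, local = self.owner(i)
            return self.blocks[p][local]                     # would be a one-sided get

    ga = BlockDistributedArray(global_size=10, n_procs=4)
    ga.put(7, 3.14)
    print(ga.owner(7), ga.get(7))    # (2, 1) 3.14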

01 Jan 2006
TL;DR: This paper presents examples of ontology applications throughout the Software Engineering lifecycle, discusses the advantages of ontologies in each case, and provides a framework for classifying the usage of ontologies in Software Engineering.
Abstract: The emerging field of semantic web technologies promises new stimulus for Software Engineering research. However, since the underlying concepts of the semantic web have a long tradition in the knowledge engineering field, it is sometimes hard for software engineers to get an overview of the variety of ontology-enabled approaches to Software Engineering. In this paper we therefore present some examples of ontology applications throughout the Software Engineering lifecycle. We discuss the advantages of ontologies in each case and provide a framework for classifying the usage of ontologies in Software Engineering.