
Showing papers by "Ron S. Kenett published in 2014"


Book
28 Jan 2014
TL;DR: Graduate and post-graduate students in the areas of statistical quality and engineering, as well as industrial statisticians, researchers and practitioners in these fields, will all benefit from the comprehensive combination of theoretical and practical information provided in this single volume.
Abstract: Fully revised and updated, this book combines a theoretical background with examples and references to R, MINITAB and JMP, enabling practitioners to find state-of-the-art material on both foundation and implementation tools to support their work. Topics addressed include computer-intensive data analysis, acceptance sampling, univariate and multivariate statistical process control, design of experiments, quality by design, and reliability using classical and Bayesian methods. The book can be used for workshops or courses on acceptance sampling, statistical process control, design of experiments, and reliability. Graduate and post-graduate students in the areas of statistical quality and engineering, as well as industrial statisticians, researchers and practitioners in these fields will all benefit from the comprehensive combination of theoretical and practical information provided in this single volume. Modern Industrial Statistics: With applications in R, MINITAB and JMP:
- Combines a practical approach with theoretical foundations and computational support.
- Provides examples in R using a dedicated package called MISTAT, and also refers to MINITAB and JMP.
- Includes exercises at the end of each chapter to aid learning and test knowledge.
- Provides over 40 data sets representing real-life case studies.
- Is complemented by a comprehensive website providing an introduction to R, and installations of JMP scripts and MINITAB macros, including effective tutorials with introductory material: www.wiley.com/go/modern_industrial_statistics.
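For a flavor of the univariate statistical process control material covered, here is a minimal illustrative sketch in Python of computing Shewhart X-bar control limits from subgrouped data. It is not taken from the book or from the MISTAT package (which uses R), and the measurements are simulated:

```python
import numpy as np

# Simulated subgrouped measurements: 20 subgroups of size 5 (hypothetical process data)
rng = np.random.default_rng(1)
subgroups = rng.normal(loc=10.0, scale=0.2, size=(20, 5))

xbar = subgroups.mean(axis=1)             # subgroup means
rbar = np.ptp(subgroups, axis=1).mean()   # average subgroup range

# Shewhart X-bar chart limits, using the A2 constant for subgroups of size 5
A2 = 0.577
center = xbar.mean()
ucl, lcl = center + A2 * rbar, center - A2 * rbar

print(f"CL = {center:.3f}, UCL = {ucl:.3f}, LCL = {lcl:.3f}")
print("Out-of-control subgroups:", np.where((xbar > ucl) | (xbar < lcl))[0])
```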

82 citations


Journal ArticleDOI
TL;DR: In this article, the authors define the concept of information quality "InfoQ" as the potential of a data set to achieve a specific (scientific or practical) goal by using a given empirical analysis method.
Abstract: We define the concept of information quality 'InfoQ' as the potential of a data set to achieve a specific (scientific or practical) goal by using a given empirical analysis method. InfoQ is different from data quality and analysis quality, but is dependent on these components and on the relationship between them. We survey statistical methods for increasing InfoQ at the study design and post-data-collection stages, and we consider them relative to what we define as InfoQ. We propose eight dimensions that help to assess InfoQ: data resolution, data structure, data integration, temporal relevance, generalizability, chronology of data and goal, construct operationalization and communication. We demonstrate the concept of InfoQ, its components (what it is) and assessment (how it is achieved) through three case studies in on-line auctions research. We suggest that formalizing the concept of InfoQ can help to increase the value of statistical analysis and data mining, both methodologically and practically, thus contributing to a general theory of applied statistics. © 2013 Royal Statistical Society.
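The eight dimensions invite a simple numerical assessment. As a hypothetical illustration (the 1-5 ratings and the geometric-mean scoring rule are assumptions made for this sketch, not necessarily the assessment procedure of the paper), ratings on the eight dimensions can be combined so that a single weak dimension pulls the overall score down:

```python
import math

# Hypothetical ratings (1 = poor, 5 = excellent) for the eight InfoQ dimensions of a study
ratings = {
    "data resolution": 4,
    "data structure": 3,
    "data integration": 2,
    "temporal relevance": 5,
    "generalizability": 3,
    "chronology of data and goal": 4,
    "construct operationalization": 3,
    "communication": 4,
}

# Geometric mean of the ratings, rescaled to [0, 1]; one weak dimension lowers the score
geometric_mean = math.prod(ratings.values()) ** (1 / len(ratings))
infoq_score = (geometric_mean - 1) / 4
print(f"Illustrative InfoQ score: {infoq_score:.2f}")
```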

78 citations


Journal ArticleDOI
TL;DR: This innovative inorganic DoE-optimized NP surface modification by [CeLn]3/4+ complexes enables an effective "fully inorganic-type" coordination attachment of a branched poly-cationic 25 kDa b-PEI25 polymer for siRNA loading and gene silencing.
Abstract: Pre-formed Massart magnetite (Fe3O4) nanoparticles (NPs) have been successfully modified by positively charged lanthanide Ce(III/IV) cations/[CeLn]3/4+ complexes, using the strong mono-electronic oxidant ceric ammonium nitrate (CAN) as a Ce donor. The doping process is promoted by high-power ultrasonic irradiation. The reaction has been statistically optimized by Design of Experiments (DoE, MINITAB® 16 DoE software) to afford globally optimized, magnetically responsive, ultra-small 6.61 ± 2.04 nm-sized CANDOE-γ-Fe2O3 NPs that are highly positively charged (ζ potential: +45.7 mV). This innovative inorganic DoE-optimized NP surface modification by [CeLn]3/4+ complexes enables an effective "fully inorganic-type" coordination attachment of a branched poly-cationic 25 kDa b-PEI25 polymer for siRNA loading and gene silencing. This innovative NP platform technology paves an efficient way for the successful development of a wide range of biomedicine and diagnostic-related applications.
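The synthesis conditions were tuned with a designed experiment. As a generic illustration of the kind of two-level factorial design such a DoE optimization typically starts from (the factor names, levels and responses below are hypothetical, not those of the study), a full factorial can be generated and its main effects estimated:

```python
import itertools
import numpy as np

# Hypothetical two-level factors for a nanoparticle synthesis DoE (coded -1 / +1)
factors = {"CAN_conc": (-1, 1), "sonication_power": (-1, 1), "reaction_time": (-1, 1)}
design = np.array(list(itertools.product(*factors.values())))  # 2^3 = 8 runs

# Hypothetical measured response for each run (e.g., mean particle size in nm)
response = np.array([7.9, 7.1, 7.5, 6.6, 7.7, 6.9, 7.2, 6.3])

# Main effect of each factor: mean response at +1 minus mean response at -1
for name, column in zip(factors, design.T):
    effect = response[column == 1].mean() - response[column == -1].mean()
    print(f"Main effect of {name}: {effect:+.2f} nm")
```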

32 citations


OtherDOI
29 Sep 2014
TL;DR: This section covers the Ishikawa fishbone diagram, structural equation models, and Bayesian networks, which have been proposed to map cause and effect relationships.
Abstract: Cause-and-effect relationships represent basic knowledge, driven by theoretical and empirical considerations. Several tools have been proposed to map cause-and-effect relationships, some more heuristic, others highly quantitative. In this section we cover the Ishikawa fishbone diagram, structural equation models, and Bayesian networks. Keywords: scatter plots; Ishikawa diagrams; structural equation models; Bayesian networks; integrated management models
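Of the three tools, Bayesian networks are the most directly computational. A minimal sketch of a two-node network and a diagnostic query on it (the variables and probabilities are hypothetical, and the query is computed by hand rather than with a dedicated BN library):

```python
# Minimal two-node Bayesian network: Process -> Defect
# P(Process = good) and P(Defect | Process) are hypothetical numbers.
p_process_good = 0.9
p_defect_given = {"good": 0.02, "bad": 0.30}

# Marginal probability of a defect, summing over the parent node
p_defect = (p_process_good * p_defect_given["good"]
            + (1 - p_process_good) * p_defect_given["bad"])

# Diagnostic (effect-to-cause) query via Bayes' rule: P(Process = bad | Defect observed)
p_bad_given_defect = (1 - p_process_good) * p_defect_given["bad"] / p_defect
print(f"P(defect) = {p_defect:.3f}")
print(f"P(process bad | defect) = {p_bad_given_defect:.3f}")
```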

22 citations


Proceedings ArticleDOI
21 Jul 2014
TL;DR: The ability to systematically capture, filter, analyze, reason about, and build theories upon, the behavior of an open source community in combination with the structured elicitation of expert opinions on potential organizational business risk is presented.
Abstract: Free Libre Open Source Software (FLOSS) has become a strategic asset in software development, and the open source communities behind FLOSS are a key player in the field. The analysis of open source community dynamics is a key capability in risk management practices focused on the integration of FLOSS in all types of organizations. We are conducting research in developing methodologies for managing risks of FLOSS adoption and deployment in various application domains. This paper is about the ability to systematically capture, filter, analyze, reason about, and build theories upon the behavior of an open source community, in combination with the structured elicitation of expert opinions on potential organizational business risk. The novel methodology presented here blends qualitative and quantitative information as part of a wider analytics platform. The approach combines big data analytics with automatic scripting of scenarios that permits experts to assess risk indicators and business risks in focused tactical and strategic workshops. These workshops generate data that is used to construct Bayesian networks that map community risk drivers into statistical distributions feeding the platform's risk management dashboard. A special feature of this model is that the dynamics of an open source community are tracked using social network metrics that capture the structure of unstructured chat data. The method is illustrated with a running example based on experience gained in implementing our approach in an academic smart environment setting involving Moodbile, Mobile Learning for Moodle (www.moodbile.org). This example is the first in a series of planned experiences in the domain of smart environments, with the ultimate goal of deriving a complete risk model in that field.
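The community-tracking step relies on social network metrics computed from communication data. A minimal sketch of that idea, with hypothetical chat participants and using the networkx library (not any tooling from the paper):

```python
import networkx as nx

# Hypothetical "who replied to whom" pairs extracted from community chat logs
interactions = [
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("dave", "alice"), ("dave", "bob"), ("erin", "alice"),
]

G = nx.Graph()
G.add_edges_from(interactions)

# Simple structural metrics that can serve as community risk drivers
centrality = nx.degree_centrality(G)
print("Degree centrality:", {name: round(score, 2) for name, score in centrality.items()})
print("Network density:", round(nx.density(G), 2))
```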

16 citations


Journal ArticleDOI
TL;DR: This paper proposes an approach to sensitivity analysis for identifying the drivers of overall satisfaction and shows how such an analysis generates high information quality and can be effectively combined with an integrated analysis considering various models.
Abstract: Modelling relationships between variables has been a major challenge for statisticians in a wide range of application areas. In conducting customer satisfaction surveys, one main objective is to identify the drivers of overall satisfaction (or dissatisfaction) in order to initiate proactive actions for containing problems and/or improving customer satisfaction. Bayesian Networks (BN) combine graphical analysis with Bayesian analysis to represent relations linking measured and target variables. Such graphical maps are used for diagnostic and predictive analytics. This paper is about the use of BN in the analysis of customer survey data. We propose an approach to sensitivity analysis for identifying the drivers of overall satisfaction. We also address the problem of selecting robust networks. Moreover, we show how such an analysis generates high information quality (InfoQ) and can be effectively combined with an integrated analysis considering various models.
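In much simplified form, driver analysis can be sketched as a what-if comparison: for each candidate driver, compare the estimated probability of high overall satisfaction when the driver is rated high versus low. The survey data and thresholds below are hypothetical, and a BN-based sensitivity analysis as in the paper would condition the network on each driver instead:

```python
import pandas as pd

# Hypothetical survey responses on a 1-5 scale
df = pd.DataFrame({
    "overall":  [5, 4, 2, 5, 3, 1, 4, 5, 2, 4],
    "support":  [5, 4, 2, 4, 3, 1, 4, 5, 3, 4],
    "pricing":  [3, 4, 3, 5, 2, 2, 5, 4, 1, 3],
    "delivery": [5, 3, 1, 5, 4, 2, 3, 5, 2, 5],
})

satisfied = df["overall"] >= 4  # "high overall satisfaction"

for driver in ["support", "pricing", "delivery"]:
    high, low = df[driver] >= 4, df[driver] <= 2
    delta = satisfied[high].mean() - satisfied[low].mean()
    print(f"{driver}: P(satisfied | high) - P(satisfied | low) = {delta:+.2f}")
```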

9 citations


Book ChapterDOI
08 Oct 2014
TL;DR: An approach for monitoring community dynamics continuously, including communications like email and blogs, and repositories of bugs and fixes is suggested, which can be used to drive selective testing for effective validation and verification of OSS components.
Abstract: The increasing adoption of open source software (OSS) components in software systems introduces new quality risks and testing challenges. OSS components are developed and maintained by open communities, and the fluctuation of community members and structures can result in instability of the software quality. Hence, an investigation is necessary to analyze the impact of open community dynamics on the quality of the OSS, considering, for example, the level and trends of internal communications and content distribution. The analysis results provide inputs to drive selective testing for effective validation and verification of OSS components. The paper suggests an approach for monitoring community dynamics continuously, including communications like email and blogs, and repositories of bugs and fixes. Detection of patterns in the monitored behavior, such as changes in traffic levels within and across clusters, can in turn be used to drive testing efforts. Our proposal is demonstrated in the case of the XWiki OSS, a Java-based environment that allows for the storing of structured data and the execution of server side scripts within the wiki interface. We illustrate the concepts and methods behind this approach to risk-based testing of OSS.
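One concrete way to flag the kind of pattern change mentioned above is to monitor communication volume with a simple control rule. The sketch below uses hypothetical weekly message counts and a plain EWMA rather than any specific monitoring scheme from the chapter; weeks whose traffic deviates strongly from the smoothed level trigger extra testing:

```python
# Hypothetical weekly message counts on a community mailing list
weekly_counts = [42, 40, 45, 38, 41, 44, 39, 43, 20, 18, 41, 40]

alpha, k = 0.3, 2.5   # EWMA smoothing factor and alert threshold (in standard deviations)
mean = weekly_counts[0]
var = 25.0            # assumed baseline variance of weekly counts

for week, count in enumerate(weekly_counts[1:], start=1):
    std = var ** 0.5
    if abs(count - mean) > k * std:
        print(f"Week {week}: traffic {count} deviates from expected {mean:.1f} -> trigger extra testing")
    # update the smoothed level and a crude smoothed variance estimate
    mean = alpha * count + (1 - alpha) * mean
    var = alpha * (count - mean) ** 2 + (1 - alpha) * var
```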

7 citations


OtherDOI
29 Sep 2014
TL;DR: Response surfaces are a general strategy for combining designed experiments and regression analysis to explore the relationship between one or more response variables and a set of factors that are thought to affect the responses as mentioned in this paper.
Abstract: Response surface methodology is a general strategy for combining designed experiments and regression analysis to explore the relationship between one or more response variables and a set of factors that are thought to affect the responses. Some of the key elements in this strategy include sequential progress, a matching of experimental designs to the complexity of needed regression models and rapid progress to the most interesting parts of the factor space. These methods are often used to seek optimal settings of the factors for reaching specific values of the response, such as a maximum or minimum value. Response surface methods are widely used in research and development and industrial applications. Keywords: response surfaces; sequential experimentation; factorial design; central composite design; Box-Behnken design
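As an illustration of the regression step in this strategy, a full second-order (quadratic) response surface in two coded factors can be fitted by least squares and its stationary point located. The design and response values below are hypothetical, and the sketch is not taken from the article:

```python
import numpy as np

# Hypothetical central-composite-style design in two coded factors x1, x2
x1 = np.array([-1, -1,  1,  1, -1.41, 1.41,     0,    0, 0, 0, 0])
x2 = np.array([-1,  1, -1,  1,     0,    0, -1.41, 1.41, 0, 0, 0])
y  = np.array([76, 78, 79, 83,    75,   82,    77,   81, 85, 84, 86])  # response, e.g. yield

# Fit y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
b0, b1, b2, b11, b22, b12 = np.linalg.lstsq(X, y, rcond=None)[0]

# Stationary point: solve the gradient equations 2*b11*x1 + b12*x2 = -b1 and b12*x1 + 2*b22*x2 = -b2
A = np.array([[2 * b11, b12], [b12, 2 * b22]])
stationary = np.linalg.solve(A, -np.array([b1, b2]))
print("Stationary point (coded units):", np.round(stationary, 2))
```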

6 citations


OtherDOI
29 Sep 2014
TL;DR: In this section, the issue of missing data and imputation is addressed, covering a wide variety of applications and solutions proposed in the recent literature.
Abstract: Handling missing data is a core requirement of data analysis. The assessment of the process that caused missing data and the approach for handling missing data can have significant impact on the results of the analysis. In this section, we address the issue of missing data and imputation, covering a wide variety of applications and solutions proposed in the recent literature. Keywords: missing data; imputation; EM algorithm; listwise deletion; weighting; hot-deck; bootstrapping; multiple imputation
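As a small illustration of the difference the handling choice makes, the sketch below (on hypothetical data) compares listwise deletion with simple single mean imputation; the methods surveyed in the section, such as multiple imputation, go well beyond this:

```python
import numpy as np
import pandas as pd

# Hypothetical data set with missing values in one variable
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.1, np.nan, 6.2, np.nan, 9.8, 12.1],
})

# Listwise deletion: drop any row with a missing value
deleted = df.dropna()

# Single mean imputation: replace missing y with the observed mean
imputed = df.copy()
imputed["y"] = imputed["y"].fillna(imputed["y"].mean())

# Mean imputation understates variability and attenuates correlation
print("SD of y after listwise deletion:", round(deleted["y"].std(), 2))
print("SD of y after mean imputation: ", round(imputed["y"].std(), 2))
print("Correlation (deletion vs imputation):",
      round(deleted["x"].corr(deleted["y"]), 2),
      round(imputed["x"].corr(imputed["y"]), 2))
```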

6 citations


Journal ArticleDOI
TL;DR: The concept of information quality, or InfoQ, is defined in this paper as "the potential of a dataset to achieve a specific goal using a given empirical analysis method." It relies on identifying and examining the relationships between four components: the analysis goal, the data, the data analysis, and the utility.
Abstract: The term quality of statistical data, developed and used in official statistics and in international organizations such as the IMF and the OECD, refers to the usefulness of summary statistics generated by producers of official statistics. Similarly, in the context of survey quality, official agencies such as Eurostat, NCSES and Statistics Canada created dimensions for evaluating the quality of a survey for obtaining 'accurate survey data'. The concept of Information Quality, or InfoQ (Kenett and Shmueli, 2014), provides a general framework applicable to data analysis in a broader sense than summary statistics: InfoQ is defined as "the potential of a dataset to achieve a specific goal using a given empirical analysis method." It relies on identifying and examining the relationships between four components: the analysis goal, the data, the data analysis, and the utility. The InfoQ framework relies on eight dimensions used to deconstruct InfoQ and thereby provide an approach for assessing it. We compare and contrast the InfoQ framework and dimensions with those typically used by statistical agencies. We discuss how the InfoQ approach can support the use of official statistics not only by government for policy decision making, but also by other stakeholders, such as industry, through the integration of official and organizational data.

5 citations


Proceedings ArticleDOI
21 Jul 2014
TL;DR: This research introduces a strategy of risk-based adaptive testing of OSS by combining information on the OSS community ecosystem with a risk-driven test selection and scheduling strategy, and illustrates the concepts and methods behind risk-based testing.
Abstract: Open Source Software (OSS) has become a strategic asset for a number of reasons, such as its short time-to-market software service and product delivery, reduced development and maintenance costs, introduction of innovative features and its customization capabilities. By 2016 an estimated 95% of all commercial software packages will include OSS components. This pervasive adoption is not without risks for an industry that has experienced significant failures in product quality, timelines and delivery costs. Exhaustive testing of any software system and, specifically, of open source software components is usually not feasible due to limitations in time and resources. In a risk-based testing approach, test cases are selected and scheduled based on software risk analysis. This research introduces a strategy of risk-based adaptive testing of OSS by combining information on the OSS community ecosystem with a risk-driven test selection and scheduling strategy. A key feature of the proposed approach is the monitoring and analysis of OSS community dynamics, including chats and email communications, blogs, repositories of bugs and fixes, and more. The community and its dynamics are monitored to detect anomalous communication among community members. Our approach is demonstrated on the XWiki OSS, a Java-based environment that allows for the storing of structured data and the execution of server side scripts within the wiki interface. We illustrate the concepts and methods behind this approach to risk-based testing.
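The core of risk-based test selection can be sketched very simply: each test target receives a risk score (likelihood of failure times business impact), and the test budget is spent on the highest-risk items first. The component names, scores and budget below are hypothetical, used only to illustrate the selection step:

```python
# Hypothetical OSS components with likelihood-of-failure and business-impact scores (1-5)
components = [
    {"name": "auth-module",   "likelihood": 4, "impact": 5, "test_cost_h": 6},
    {"name": "wiki-renderer", "likelihood": 2, "impact": 3, "test_cost_h": 4},
    {"name": "rest-api",      "likelihood": 3, "impact": 4, "test_cost_h": 5},
    {"name": "search-index",  "likelihood": 5, "impact": 2, "test_cost_h": 3},
]

budget_hours = 10
for c in components:
    c["risk"] = c["likelihood"] * c["impact"]

# Spend the testing budget on the highest-risk components first
plan, used = [], 0
for c in sorted(components, key=lambda c: c["risk"], reverse=True):
    if used + c["test_cost_h"] <= budget_hours:
        plan.append(c["name"])
        used += c["test_cost_h"]

print("Selected for testing:", plan, f"({used}h of {budget_hours}h)")
```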

Book ChapterDOI
06 May 2014
TL;DR: A pattern-based approach and risk reasoning techniques that link risks to business goals are proposed in the third layer of a layered approach to managing risks in OSS projects.
Abstract: In this paper, we propose a layered approach to managing risks in OSS projects. We define three layers: the first for defining risk drivers by collecting and summarising available data from different data sources, including human-provided contextual information; the second for converting these risk drivers into risk indicators; and the third for assessing how these indicators impact the business of the adopting organisation. The contributions are: 1) the complexity of gathering data is isolated in one layer using appropriate techniques, 2) the context needed to interpret this data is provided by expert involvement, evaluating risk scenarios and answering questionnaires, in a second layer, 3) a pattern-based approach and risk reasoning techniques that link risks to business goals are proposed in the third layer.
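A compact way to picture the three layers is as successive transformations of the same data: measured drivers are converted into indicator levels, and indicators are mapped to a business-level assessment. Everything below (driver names, thresholds, rules) is hypothetical and illustrates only the layering, not the paper's actual models:

```python
# Layer 1: raw risk drivers collected from data sources and experts (hypothetical values)
drivers = {"commit_frequency": 3.2, "core_devs": 2, "open_bugs_trend": 15}

# Layer 2: convert drivers into discrete risk indicators via simple threshold rules
indicators = {
    "community_activity": "low" if drivers["commit_frequency"] < 5 else "ok",
    "bus_factor_risk":    "high" if drivers["core_devs"] <= 2 else "low",
    "quality_trend":      "worsening" if drivers["open_bugs_trend"] > 10 else "stable",
}

# Layer 3: assess impact on a business goal from the indicator pattern
timeliness_at_risk = (indicators["community_activity"] == "low"
                      and indicators["quality_trend"] == "worsening")
print("Indicators:", indicators)
print("Business goal 'timely releases' at risk:", timeliness_at_risk)
```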

OtherDOI
29 Sep 2014
TL;DR: This section presents various situations in which multivariate statistical process control techniques are used and provides details on their practical implementation.
Abstract: Owing to the pervasive capabilities of modern data collection, analysis, and graphical displays, multivariate statistical process control is playing an increasingly central role in industry and services. Its application requires careful consideration of the data context and the statistical tools involved. In this section we present various situations in which such techniques are used and provide details on their practical implementation. Keywords: multivariate statistical process control; Mahalanobis T2 charts; multivariate tolerance regions
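The workhorse of multivariate process control is the T2 statistic, which measures the Mahalanobis distance of each multivariate observation from the in-control mean. A minimal sketch with hypothetical two-dimensional measurements:

```python
import numpy as np

# Hypothetical in-control (phase I) data: 50 observations of 2 correlated quality characteristics
rng = np.random.default_rng(0)
phase1 = rng.multivariate_normal(mean=[10, 5], cov=[[1.0, 0.6], [0.6, 1.0]], size=50)

mean = phase1.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(phase1, rowvar=False))

def t2(x):
    """Hotelling T^2 (squared Mahalanobis distance) of observation x from the phase I mean."""
    d = x - mean
    return float(d @ cov_inv @ d)

# New (phase II) observations: the second one is shifted in both characteristics
for x in [np.array([10.2, 5.1]), np.array([12.5, 3.0])]:
    print(x, "-> T2 =", round(t2(x), 2))
```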

Journal ArticleDOI
TL;DR: In this article, an expanded view of the role of statistics in research, business, industry, and service organizations is presented, which can contribute to close the gap between theory and practice and improve the position of statistics as a scientific discipline with wide relevance to organizations and research activities.
Abstract: Statistics has gained a reputation as being focused only on data collection and data analysis. This paper is about an expanded view of the role of statistics in research, business, industry and service organizations. Such an approach provides an antidote to the narrow view of statistics outlined above. The life cycle view we elaborate on can contribute to closing the gap between theory and practice and improve the position of statistics as a scientific discipline with wide relevance to organizations and research activities. Specifically, we discuss here a "life cycle view" consisting of: 1) Problem elicitation, 2) Goal formulation, 3) Data collection, 4) Data analysis, 5) Formulation of findings, 6) Operationalization of findings, 7) Communication and 8) Impact assessment. These 8 phases are conducted with internal iterations that reflect the inductive-deductive learning process studied by George Box (Box, 1997). Covering these 8 phases, beyond the data analysis phase, increases the impact of statistical analysis and enhances the level of generated knowledge and the information quality it leads to. The envisaged overall approach is that applied statistics needs to involve a trilogy combining: 1) a life cycle view, 2) an analysis of impact and 3) an assessment of the quality of the generated information and knowledge. We begin with a section introducing the problem, continue with a review of the InfoQ concept presented in Kenett and Shmueli (2013) and proceed with a description of the eight life cycle phases listed above. Adopting a life cycle view of statistics has obvious implications for research, education and statistical practice. We conclude with a discussion of such implications.

Proceedings ArticleDOI
01 Jan 2014
TL;DR: RISCOSS is the only platform to deliver a complete solution rather than a piecemeal approach to enable mainstream product developers to safely integrate open source software in their developments.
Abstract: Open Source Software (OSS) has become a strategic asset in software development, and open source communities behind OSS are a key player in the field. By 2016 an estimated 95% of all commercial software packages will include OSS. Yet this widespread adoption has not kept failure rates in OSS projects from reaching 50%. Inadequate risk management has been identified among the top mistakes to avoid when implementing OSS-based solutions. Understanding, managing and mitigating OSS adoption risks is therefore crucial to avoid potentially significant adverse impact on the business. This chapter introduces the RISCOSS decision support platform. RISCOSS develops a risk management-based methodology to facilitate the adoption of open source code into mainstream products and services. RISCOSS develops a methodology and a software platform that integrate the whole decision-making chain, from technology criteria to strategic concerns. Using advanced software engineering techniques and risk management methodologies, RISCOSS develops innovative tools and methods to identify, manage and mitigate risks of integrating third-party open source software. RISCOSS is the only platform to deliver a complete solution rather than a piecemeal approach to enable mainstream product developers to safely integrate open source software in their developments. RISCOSS is itself an open source project.

OtherDOI
29 Sep 2014
TL;DR: Methods used to analyze software failure data include software reliability models for predicting the number of hidden faults, software trouble assessment for assessing the inspection process, Pareto charts of software errors, and control charts for tracking the weekly number of new and fixed bugs.
Abstract: Effectively and efficiently analyzing software failure data is critical to competitive organizations developing software. This article covers several methods used to analyze such data, with examples. The methods covered include software reliability models for predicting the number of hidden faults, software trouble assessment for assessing the inspection process, Pareto charts of software errors, and control charts for tracking the weekly number of new and fixed bugs. Keywords: software failures; software reliability models; Pareto analysis; control charts; software trouble assessment
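To give one concrete instance of the first method mentioned, a nonhomogeneous Poisson process reliability growth model such as the Goel-Okumoto model, mu(t) = a(1 - exp(-b t)), can be fitted to cumulative failure counts. The failure data and fitting choice below are hypothetical, used only to sketch the idea:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical cumulative number of software failures observed by week t
weeks = np.arange(1, 13)
cum_failures = np.array([8, 15, 21, 26, 30, 33, 36, 38, 40, 41, 42, 43])

# Goel-Okumoto mean value function: expected cumulative failures by time t
def goel_okumoto(t, a, b):
    return a * (1 - np.exp(-b * t))

(a_hat, b_hat), _ = curve_fit(goel_okumoto, weeks, cum_failures, p0=[50, 0.1])

print(f"Estimated total faults a = {a_hat:.1f}, detection rate b = {b_hat:.3f}")
print(f"Estimated hidden faults remaining: {a_hat - cum_failures[-1]:.1f}")
```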

OtherDOI
29 Sep 2014
TL;DR: In this paper, the authors present several multivariate capability indices, and extend the idea of multivariate tolerance regions for assessing the capability of a process, based on which a process capability is determined by comparing the actual performance with required specifications.
Abstract: Process capability is determined by comparing the actual performance of a process with required specifications. Several indices have been proposed to report the capability of a process in the univariate case. When performance is tracked in several dimensions, an extension of these indices is required. We present several multivariate capability indices, and extend the idea of multivariate tolerance regions for assessing the capability of a process. Keywords: process capability; specification limits; multivariate statistical process control; Mahalanobis T2 charts; multivariate tolerance regions
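A simple way to get a feel for multivariate capability (not one of the specific indices presented in the article) is to estimate the fraction of the fitted process distribution that falls inside the rectangular specification region, alongside the usual univariate Cp values. All numbers below are hypothetical:

```python
import numpy as np

# Hypothetical bivariate process data and specification limits for the two characteristics
rng = np.random.default_rng(42)
data = rng.multivariate_normal(mean=[10.0, 5.0], cov=[[0.04, 0.01], [0.01, 0.02]], size=200)
spec_low, spec_high = np.array([9.5, 4.6]), np.array([10.5, 5.4])

# Estimate the conforming fraction by simulating from the fitted multivariate normal
mean, cov = data.mean(axis=0), np.cov(data, rowvar=False)
sim = rng.multivariate_normal(mean, cov, size=100_000)
inside = np.all((sim >= spec_low) & (sim <= spec_high), axis=1)
print(f"Estimated conforming fraction: {inside.mean():.4f}")

# Univariate Cp for each characteristic, for comparison with the joint view
cp = (spec_high - spec_low) / (6 * data.std(axis=0, ddof=1))
print("Univariate Cp per characteristic:", np.round(cp, 2))
```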