
Showing papers in "Technometrics in 2007"


Journal ArticleDOI
TL;DR: This book deals with probability distributions, discrete and continuous densities, distribution functions, bivariate distributions, means, variances, covariance, correlation, and some random process material.
Abstract: Chapter 3 deals with probability distributions, discrete and continuous densities, distribution functions, bivariate distributions, means, variances, covariance, correlation, and some random process material. Chapter 4 is a detailed study of the concept of utility including the psychological aspects, risk, attributes, rules for utilities, multidimensional utility, and normal form of analysis. Chapter 5 treats games and optimization, linear optimization, and mixed strategies. Entropy is the topic of Chapter 6 with sections devoted to entropy, disorder, information, Shannon’s theorem, demon’s roulette, Maxwell–Boltzmann distribution, Schrödinger’s nutshell, maximum entropy probability distributions, blackbodies, and Bose–Einstein distribution. Chapter 7 is standard statistical fare including transformations of random variables, characteristic functions, generating functions, and the classic limit theorems such as the central limit theorem and the laws of large numbers. Chapter 8 is about exchangeability and inference with sections on Bayesian techniques and classical inference. Partial exchangeability is also treated. Chapter 9 considers such things as order statistics, extreme value, intensity, hazard functions, and Poisson processes. Chapter 10 covers basic elements of risk and reliability, while Chapter 11 is devoted to curve fitting, regression, and Monte Carlo simulation. There is an ample number of exercises at the ends of the chapters with answers or comments on many of them in an appendix in the back of the book. Other appendices are on the common discrete and continuous distributions and mathematical aspects of integration.

19,893 citations


Journal ArticleDOI
TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in very mathematical fashion and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.

18,802 citations


Journal ArticleDOI
TL;DR: Robinson, R. (2007). Generalized Additive Models: An Introduction With R.
Abstract: (2007). Generalized Additive Models: An Introduction With R. Technometrics: Vol. 49, No. 3, pp. 360-361.

4,367 citations


Journal ArticleDOI
TL;DR: There are valuable points here; they are, however, generally too hard to find and some of them are undercut by the author’s misguided attempt to be “fair.”
Abstract: (2007). Introduction to Statistical Quality Control. Technometrics: Vol. 49, No. 1, pp. 108-109.

3,358 citations


Journal ArticleDOI
Abstract: (2007). Nonlinear Programming Theory and Algorithms. Technometrics: Vol. 49, No. 1, pp. 105-105.

1,317 citations


Journal ArticleDOI
TL;DR: This book puts together several fault-tolerant control and fault diagnosis approaches, with an emphasis on the work of the authors, and complements the material of the book with methods for systems with Markovian parameters.
Abstract: One of the major concerns in engineering is that the systems we design behave reasonably well in practice. We deal with imperfect models, model uncertainties, uncertainties in the interaction with the environment, and the finite dependability of the hardware and software components. There are various methods used in the engineering fields that account for such difficulties, including methods that check the dependability of designs by simulations and formal verification, methods for uncertainties in models such as worst case analysis and Monte Carlo analysis, and specific methods that can deal with certain faulty situations, such as in the areas of self-stabilization in computer science and error-correcting codes in coding theory. In control systems, we have disciplines such as robust control, adaptive control, and fault-tolerant control. While these three disciplines have similar goals, a careful look reveals the differences. Following the authors of the book, we notice that fault-tolerant control “aims at changing the control law so as to cancel the effects of the faults and/or to attenuate them to an acceptable level.” Compared to disturbances and model uncertainties, faults are more severe changes that cannot be suppressed by a fixed controller. This distinguishes fault-tolerant control from robust control, in which a fixed controller is designed as a tradeoff between performance and robustness. Further, the principle of adaptive control is “particularly efficient only for plants described by linear models with slowly varying parameters.” Fault-tolerant control, however, must also deal with systems exhibiting nonlinear behavior, and faults typically involve sudden parameter changes. A further distinction can be made between traditional fault tolerance and model-based fault-tolerant control, the latter being the approach taken in the book. Traditional fault tolerance improves the dependability of the system based on physical redundancy: a component is replaced with a component of the same type when it fails. Model-based fault-tolerant control achieves dependability by means of analytical redundancy: in case of faults, changes are made in the control law and possibly also in the plant, by means of reconfigurations. The field of fault-tolerant control is relatively new. Two surveys of the field are [2] and [3]. Note that fault diagnosis, which has been studied extensively in the literature, is required for the implementation of fault-tolerant control. The book puts together several fault-tolerant control and fault diagnosis approaches, with an emphasis on the work of the authors. Application examples are also given, allowing the reader to compare the approaches proposed in the book. In the literature, there is another book on fault-tolerant control [1], which complements the material of the book with methods for systems with Markovian parameters. The book is organized in ten chapters and six appendices. Chapters 1–3 are introductory, presenting an overview of the main ideas of the book, examples, and the various types of models used in the book. Chapters 4 and 5 present methods applicable at a higher level of abstraction, in which the analytical details of the plant model are absent. Chapters 6 and 7 address fault diagnosis and fault-tolerant control

897 citations


Journal ArticleDOI
TL;DR: In this article, a simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and produces sparse predictive models for text data is presented and applied to a range of document classification problems.
Abstract: Logistic regression analysis of high-dimensional data, such as natural language text, poses computational and statistical challenges. Maximum likelihood estimation often fails in these applications. We present a simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and produces sparse predictive models for text data. We apply this approach to a range of document classification problems and show that it produces compact predictive models at least as effective as those produced by support vector machine classifiers or ridge logistic regression combined with feature selection. We describe our model fitting algorithm, our open source implementations (BBR and BMR), and experimental results.
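
To make the connection concrete, here is a minimal sketch of the key idea: with independent Laplace (double-exponential) priors on the coefficients, the posterior mode is the L1-penalized (lasso) logistic regression fit, which sets many coefficients exactly to zero. The sketch uses scikit-learn rather than the authors' BBR/BMR software, and the corpus, features, and regularization strength C are illustrative assumptions, not settings from the paper.

```python
# Minimal sketch: MAP estimation under a Laplace prior corresponds to L1-penalized
# logistic regression. Corpus, categories, and C are illustrative assumptions.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

train = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])
X = TfidfVectorizer(min_df=2).fit_transform(train.data)   # sparse document-term matrix

# penalty="l1" is the MAP analogue of independent Laplace priors on the coefficients;
# smaller C means a tighter prior, more shrinkage, and a sparser model.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
clf.fit(X, train.target)
print("nonzero coefficients:", (clf.coef_ != 0).sum(), "of", clf.coef_.size)
```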

829 citations


Journal ArticleDOI
TL;DR: The present book applies kernel regression techniques to functional data problems such as functional regression or classification, where the predictor is a function; nonparametric statisticians should feel very much at home with the approach taken in this book.
Abstract: (2007). Nonparametric Functional Data Analysis: Theory And Practice. Technometrics: Vol. 49, No. 2, pp. 226-226.

805 citations


Journal ArticleDOI
TL;DR: A framework that enables computer model evaluation oriented toward answering the question: Does the computer model adequately represent reality?
Abstract: We present a framework that enables computer model evaluation oriented toward answering the question: Does the computer model adequately represent reality? The proposed validation framework is a six-step procedure based on Bayesian and likelihood methodology. The Bayesian methodology is particularly well suited to treating the major issues associated with the validation process: quantifying multiple sources of error and uncertainty in computer models, combining multiple sources of information, and updating validation assessments as new information is acquired. Moreover, it allows inferential statements to be made about predictive error associated with model predictions in untested situations. The framework is implemented in a test bed example of resistance spot welding, to provide context for each of the six steps in the proposed validation process.

693 citations


Journal ArticleDOI
TL;DR: This book deals with probability distributions, discrete and continuous densities, distribution functions, bivariate distributions, means, variances, covariance, correlation, and some random process material.
Abstract: Chapter 3 deals with probability distributions, discrete and continuous densities, distribution functions, bivariate distributions, means, variances, covariance, correlation, and some random process material. Chapter 4 is a detailed study of the concept of utility including the psychological aspects, risk, attributes, rules for utilities, multidimensional utility, and normal form of analysis. Chapter 5 treats games and optimization, linear optimization, and mixed strategies. Entropy is the topic of Chapter 6 with sections devoted to entropy, disorder, information, Shannon’s theorem, demon’s roulette, Maxwell–Boltzmann distribution, Schrödinger’s nutshell, maximum entropy probability distributions, blackbodies, and Bose–Einstein distribution. Chapter 7 is standard statistical fare including transformations of random variables, characteristic functions, generating functions, and the classic limit theorems such as the central limit theorem and the laws of large numbers. Chapter 8 is about exchangeability and inference with sections on Bayesian techniques and classical inference. Partial exchangeability is also treated. Chapter 9 considers such things as order statistics, extreme value, intensity, hazard functions, and Poisson processes. Chapter 10 covers basic elements of risk and reliability, while Chapter 11 is devoted to curve fitting, regression, and Monte Carlo simulation. There is an ample number of exercises at the ends of the chapters with answers or comments on many of them in an appendix in the back of the book. Other appendices are on the common discrete and continuous distributions and mathematical aspects of integration.

539 citations


Journal ArticleDOI
TL;DR: The book describes clearly and intuitively the differences between exploratory and confirmatory factor analysis, and discusses how to construct, validate, and assess the goodness of fit of a measurement model in SEM by confirmatory factor analysis.
Abstract: Examples are discussed to show the differences among discriminant analysis, logistic regression, and multiple regression. Chapter 6, “Multivariate Analysis of Variance,” presents advantages of multivariate analysis of variance (MANOVA) over univariate analysis of variance (ANOVA), discusses assumptions of MANOVA, and assesses validations of MANOVA assumptions and model estimation. The authors also discuss post hoc tests of MANOVA and multivariate analysis of covariance. Chapter 7, “Conjoint Analysis,” explains what conjoint analysis does and how it is different from other multivariate techniques. Guidelines for selecting attributes, models, and methods of data collection are presented. Chapter 8, “Cluster Analysis,” studies objectives, roles, and limitations of cluster analysis. Two basic concepts, similarity and distance, are discussed. The authors also discuss details of the five most popular hierarchical algorithms (single linkage, complete linkage, average linkage, the centroid method, and Ward’s method) and three nonhierarchical algorithms (the sequential threshold method, the parallel threshold method, and the optimizing procedure). Profiles of clusters and guidelines for cluster validation are studied as well. Chapter 9, “Multidimensional Scaling and Correspondence Analysis,” introduces two interdependence techniques to display the relationships in the data. The book describes clearly and intuitively the differences between the two techniques and how these two techniques are performed. Chapters 10–12 cover topics in SEM. Chapter 10, “Structural Equation Modeling: An Introduction,” introduces SEM and related concepts such as exogenous and endogenous constructs, points out the differences between SEM and other multivariate techniques, and gives an overview of the decision process of SEM. Chapter 11, “Confirmatory Factor Analysis,” explains the differences between exploratory and confirmatory factor analysis and discusses how to construct, validate, and assess the goodness of fit of a measurement model in SEM by confirmatory factor analysis. Chapter 12, “Testing a Structural Model,” presents some SEM methods for examining the relationships between latent constructs. The book is excellent for people in management and marketing. For the Technometrics audience, this book does not have much flavor of physical, chemical, and engineering sciences. For example, partial least squares, a very popular method in Chemometrics, is discussed but not in as much detail as other techniques in the book. Furthermore, due to the amount of material covered in the book, it might be inappropriate for someone who is new to multivariate analysis.

Journal ArticleDOI
TL;DR: DMA over a large model space led to better predictions than the single best performing physically motivated model, and it recovered both constant and time-varying regression parameters and model specifications quite well.
Abstract: We consider the problem of online prediction when it is uncertain what the best prediction model to use is. We develop a method called Dynamic Model Averaging (DMA) in which a state space model for the parameters of each model is combined with a Markov chain model for the correct model. This allows the "correct" model to vary over time. The state space and Markov chain models are both specified in terms of forgetting, leading to a highly parsimonious representation. As a special case, when the model and parameters do not change, DMA is a recursive implementation of standard Bayesian model averaging, which we call recursive model averaging. The method is applied to the problem of predicting the output strip thickness for a cold rolling mill, where the output is measured with a time delay. We found that when only a small number of physically motivated models were considered and one was clearly best, the method quickly converged to the best model, and the cost of model uncertainty was small; indeed DMA performed slightly better than the best physical model. When model uncertainty and the number of models considered were large, our method ensured that the penalty for model uncertainty was small. At the beginning of the process, when control is most difficult, we found that DMA over a large model space led to better predictions than the single best performing physically motivated model. We also applied the method to several simulated examples, and found that it recovered both constant and time-varying regression parameters and model specifications quite well.
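
As a rough illustration of the recursion, the sketch below keeps a bank of linear models, updates each with a Kalman-filter step that uses covariance inflation for parameter forgetting, and flattens the model probabilities with an exponential forgetting factor before combining the per-model forecasts. This is a minimal reconstruction of the general idea under assumed settings (forgetting factors, observation variance, and a toy data layout), not the authors' implementation or the rolling-mill application.

```python
import numpy as np

def dma_predict(ys, X_list, alpha=0.99, lam=0.99, v=1.0):
    """One-step-ahead DMA-style predictions for a bank of linear models."""
    K = len(X_list)
    theta = [np.zeros(X.shape[1]) for X in X_list]   # per-model coefficients
    P = [np.eye(X.shape[1]) * 10.0 for X in X_list]  # per-model coefficient covariances
    w = np.full(K, 1.0 / K)                          # model probabilities
    preds = []
    for t, y in enumerate(ys):
        w = w**alpha / np.sum(w**alpha)              # model forgetting: flatten probabilities
        yhat, lik = np.zeros(K), np.zeros(K)
        for k in range(K):
            x = X_list[k][t]
            R = P[k] / lam                           # parameter forgetting: inflate covariance
            yhat[k] = x @ theta[k]
            s = x @ R @ x + v                        # predictive variance under model k
            lik[k] = np.exp(-0.5 * (y - yhat[k])**2 / s) / np.sqrt(2 * np.pi * s)
            gain = R @ x / s                         # Kalman gain
            theta[k] = theta[k] + gain * (y - yhat[k])
            P[k] = R - np.outer(gain, x @ R)
        preds.append(w @ yhat)                       # DMA forecast: probability-weighted average
        w = w * lik / np.sum(w * lik)                # update model probabilities
    return np.array(preds)

# toy usage: two candidate predictor sets for the same series
T = 200
rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=(T, 1)), rng.normal(size=(T, 2))
y = 0.8 * x1[:, 0] + rng.normal(scale=0.3, size=T)
print(dma_predict(y, [x1, x2])[-5:])
```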

Journal ArticleDOI
TL;DR: Adaptive procedures add the following features to the possible decisions at the interim analyses: (1) the addition or deletion of trial arms in a multiple-armed clinical trial, (2) an increase or decrease in the total sample at the end of the study (based on interim estimates of variability and/or other assumed parameters, e.g., effect size), and (3) other changes to the design.
Abstract: (2007). Taguchi's Quality Engineering Handbook. Technometrics: Vol. 49, No. 2, pp. 224-225.

Journal ArticleDOI
TL;DR: An algorithm for constructing orthogonal Latin hypercubes, given a fixed sample size, in more dimensions than previous approaches is presented, and a method that dramatically improves the space-filling properties of the resultant Latinhypercubes is detailed.
Abstract: This article presents an algorithm for constructing orthogonal Latin hypercubes, given a fixed sample size, in more dimensions than previous approaches. In addition, we detail a method that dramatically improves the space-filling properties of the resultant Latin hypercubes at the expense of inducing small correlations between the columns in the design matrix. Although the designs are applicable to many situations, they were developed to provide Department of Defense analysts flexibility in fitting models when exploring high-dimensional computer simulations where there is considerable a priori uncertainty about the forms of the response surfaces.
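
For orientation, the sketch below generates the basic building block, a random Latin hypercube in the unit cube, and reports the largest between-column correlation, the quantity the article's orthogonal construction drives to (near) zero. It is the generic construction only, not the authors' algorithm; the sample size and dimension are illustrative assumptions.

```python
import numpy as np

def latin_hypercube(n, k, seed=0):
    """Random Latin hypercube sample of n points in [0, 1)^k."""
    rng = np.random.default_rng(seed)
    # each column gets a random permutation of the n strata, jittered within the stratum
    strata = rng.permuted(np.tile(np.arange(n), (k, 1)), axis=1).T
    return (strata + rng.random((n, k))) / n

X = latin_hypercube(n=17, k=7)
corr = np.corrcoef(X, rowvar=False)
print("max |off-diagonal column correlation|:", np.abs(corr - np.eye(7)).max().round(3))
```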

Journal ArticleDOI
TL;DR: There are valuable points here; they are, however, generally too hard to find and some of them are undercut by the author’s misguided attempt to be “fair.”
Abstract: (2007). Uncertainty Analysis With High Dimensional Dependence Modelling. Technometrics: Vol. 49, No. 1, pp. 108-108.

Journal ArticleDOI
TL;DR: This book aims to introduce simulation techniques for practitioners in the financial and risk management industry at an intermediate level through extensive simulation examples using S-PLUS or Visual Basic.
Abstract: (2007). Stochastic Ageing and Dependence for Reliability. Technometrics: Vol. 49, No. 2, pp. 222-222.

Journal ArticleDOI
TL;DR: A novel multivariate exponentially weighted moving average monitoring scheme for a general linear profile is proposed, with a variable sampling interval and a parametric diagnostic approach introduced to improve its performance; it is illustrated on a deep reactive ion etching profile that fits a quadratic polynomial regression model well.
Abstract: We propose a statistical process control scheme that can be implemented in industrial practice, in which the quality of a process can be characterized by a general linear profile. We start by reviewing the general linear profile model and the existing monitoring methods. Based on this, we propose a novel multivariate exponentially weighted moving average monitoring scheme for such a profile. We introduce two other enhancement features, the variable sampling interval and the parametric diagnostic approach, to further improve the performance of the proposed scheme. Throughout the article, we use a deep reactive ion etching example from semiconductor manufacturing, which has a profile that fits a quadratic polynomial regression model well, to illustrate the implementation of the proposed approach.
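
To make the mechanics concrete, the sketch below applies a generic multivariate EWMA to the least squares coefficients fitted to each sampled quadratic profile and signals when a Hotelling-type statistic exceeds a control limit. It follows the general MEWMA idea rather than the exact charting statistic of the proposed scheme, omits the variable sampling interval and diagnostic features, and uses an assumed in-control profile, noise level, smoothing constant, and control limit.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 20)
D = np.column_stack([np.ones_like(x), x, x**2])    # quadratic profile design matrix
beta0 = np.array([1.0, 0.5, -0.3])                 # assumed in-control coefficients
sigma = 0.1                                        # assumed noise level
Sigma_b = sigma**2 * np.linalg.inv(D.T @ D)        # covariance of the OLS coefficients

lam, h = 0.2, 12.0                                 # smoothing constant and assumed limit
z = np.zeros(3)
for t in range(50):
    shift = np.array([0.0, 0.08, 0.0]) if t >= 25 else 0.0   # slope shift after sample 25
    y = D @ (beta0 + shift) + rng.normal(0, sigma, x.size)
    bhat = np.linalg.lstsq(D, y, rcond=None)[0]    # per-sample profile fit
    z = lam * (bhat - beta0) + (1 - lam) * z       # MEWMA recursion on coefficient deviations
    T2 = z @ np.linalg.solve(lam / (2 - lam) * Sigma_b, z)   # Hotelling-type statistic
    if T2 > h:
        print(f"out-of-control signal at sample {t}, T2 = {T2:.1f}")
        break
```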

Journal ArticleDOI
TL;DR: One of the main strengths of this book is that it introduces the public domain R software and nicely explains how it can be used in computations of methods presented in the book.
Abstract: The target audience for this book is graduate students in engineering and medical statistics courses, and it may be useful for a senior undergraduate statistics course. To get the maximum benefit from this book, one should have a good knowledge and understanding of calculus and sufficient background in elementary probability theory to understand the central limit theorem and the law of large numbers. Some more sophisticated probability terminologies and concepts are defined for a smooth reading of the monograph. This monograph has 10 chapters, including the introduction. Chapter 2 deals with the ageing concept, and some common parametric families of probability distributions are presented in Chapter 3. Parametric and nonparametric statistical inference are nicely treated in Chapters 4 and 5. Chapter 5 also offers tests for exponentiality, which is one of the main features of the monograph. Chapters 7 and 8 cover two-sample and regression problems, respectively. All of the preceding chapters showcase results for both complete and censored data. One of the interesting contributions is with regard to the analysis of competing risks, which is presented in Chapter 9. Finally, Chapter 10 introduces repairable systems. One of the main strengths of this book is that it introduces the public domain R software and nicely explains how it can be used to carry out the computations for the methods presented in the book. This book has sufficient material and examples to cover a one-semester (13-week) course. However, I would be reluctant to adopt this book for one simple reason—there are no exercises. Having said that, the monograph would be useful to some applied researchers in related fields.

Journal ArticleDOI
TL;DR: The book elucidates design adequately and illuminates Taguchi’s advances, and the extensive set of case studies that cover each topic from the previous chapters will appeal to virtually any practitioner, regardless of specific field.
Abstract: Empirical evidence to lend proper credence, however, continues to elude the quality literature. This hardly vexes Taguchi (or most of those who produce the corpus of the discipline), but it is importunate to the reviewer. In many settings, the loss function is unlikely to be symmetric with respect to the target and, furthermore, the behavior on either side of the target is not necessarily the same. Such seemingly obvious deviations have not deterred the vast majority from proclaiming the ubiquity of the function. The current book offers no new insights here. The treatment of experimental design is fairly strong. Taguchi’s use of outer arrays is one of his greatest contributions (and one that has caught the ire of a few academics). The book elucidates design adequately and illuminates Taguchi’s advances. Anyone who is well versed in design will be able to skip the introductions and go straight to the discussion of orthogonal arrays. In this reviewer’s opinion, this is the major strength of the book. Another strength is the extensive set of case studies that cover each topic from the previous chapters. Applications include robust engineering in polymer chemistry, material design in automatic transmissions, improvements in omelet taste, and the use of Mahalanobis distance to measure drug efficacy. The sheer range of topical coverage in the cases will doubtless appeal to virtually any practitioner, regardless of specific field. There is the obligatory mention of Six Sigma as it relates to Taguchi’s work. Given the scope of Six Sigma in the current landscape, finding your place therein is necessary. A glaring omission is the lack of a similar consideration of ISO and QS certifications (as is given in Juran). Do not assume that the reviewer sees this as a negative. It is hoped here that Taguchi sees these quality certifications as largely specious and unworthy of a reference. Overall, it is hard not to be impressed with the utter volume of Taguchi’s output. The expanse of coverage is not to be dismissed. As a vehicle for presenting his prolific production, the handbook succeeds. The book may appear to be somewhat self-indulgent (as if 1600+ pages about your previous work could appear otherwise!). The book is no doubt an ambitious undertaking, but the authors nevertheless generally hit their mark. One would be hard-pressed not to at least enjoy most of the ride. What is positive (negative) about the book is largely what one perceives to be positive (negative) about Taguchi. The aforementioned lack of scholarly references is unsurprising, because Taguchi largely practiced beyond the boundaries of academia. Many academics have tended to reciprocate with less attention to his work than is probably deserved. What can safely be said is that if you are a fan of Taguchi’s work, this is definitely for you. If you need a single reference for his work or simply desire a “complete quality library,” you cannot go wrong here. Otherwise, it is unlikely that you would be interested. But in the event that you are a practitioner itching to get acquainted with Taguchi and have $150 burning a hole in your wallet or Visa, this one’s a winner.

Journal ArticleDOI
TL;DR: The present book applies kernel regression techniques to functional data problems such as functional regression or classification, where the predictor is a function; nonparametric statisticians should feel very much at home with the approach taken in this book.
Abstract: This is a research monograph rather than a practical book or, even less, a textbook. For the latter, your needs are better served by Ramsay and Silverman (2005a, b). In a sense, the present book is heavily biased toward statistical theory and is weak on practice and applications. For the theory aspect, the present book does bring something new and, indeed, some novel theoretical investigations into the kinds of functional data problems not addressed by Ramsay and Silverman (2005a). While Ramsay and Silverman’s books focus on exploratory and data analytic techniques for sparsely observed functional data, and employ techniques for smoothing and extrapolation using smoothing spline methods, the present book focuses on issues that arise from the analysis of high-resolution functional data, which can be easily registered or made to balance (pp. 33–34). The present book applies kernel regression techniques to functional data problems such as functional regression or classification, where the predictor is a function. The use of “nonparametric” in the title, although appropriate, is not totally distinctive, because I consider most techniques used in Ramsay and Silverman (2005a) nonparametric as well. The book mentions several applications in chemometrics, speech recognition, and electricity consumption forecasting. I would like to add more, such as climate data analysis, material sciences, and bioinformatics. As someone who works closely with scientists on various interdisciplinary investigations on a daily basis, I feel strongly that there is a need for new statistics that can deal with increasingly high-throughput and high-resolution measurements. Modern data analysis can benefit greatly from the recent advent of functional data analysis in statistics. I think there will be many developments in the area of high-dimensional statistics, where there are more observed variables than replicates or samples, and multivariate statistics should receive revived interest in statistical research. In a sense, rather than sticking strictly with the existing techniques, one should adopt a pioneering attitude toward functional data analysis, and professional statisticians should be prepared to develop their own techniques appropriate for a given problem, because much new statistics remains to be developed for the emerging problems (Lu 2006). Nonparametric statisticians should feel very much at home with the approach taken in this book. The authors have defined a broad and interesting framework in Part I, such as functional statistics, semimetrics, and locally weighted regression for functional data. Theoretical results, mainly asymptotics, are provided in Part II. Part V also contains some relevant theory and should be read right after Part II. I should point out some very relevant early work on nonparametric regression with fractal design (Lu 1999). Part III of the book deals with classification problems of functional data. Part IV is unusual, in that it deals with time series and dependent data. Although time series is among my favorite subjects, it does not appear obvious how this part fits into the functional data framework, although one may argue that for high-frequency time series, functional statistics may be very relevant. Notwithstanding, I do think the present book is a worthy contribution to the literature. The authors have done a nice job of summarizing some of the ongoing research, for which some of the papers exist only in proceedings or in the French literature.
Researchers in the growing functional statistics community should be glad to have a copy of the book.


Journal ArticleDOI
TL;DR: This book covers asymmetric queues, approximation algorithms for networks, an introduction to the performance analysis software PEPSY (Performance Evaluation Prediction SYstem), SPNP (Stochastic Petri Net Package), MOSEL-2 (MOdeling, Specification, and Evaluation Language), and SHARPE (Symbolic Hierarchical Automated Reliability Performance Evaluator).
Abstract: (2007). Modeling Financial Time Series With S—Plus. Technometrics: Vol. 49, No. 1, pp. 105-106.

Journal ArticleDOI
TL;DR: The editors–authors have collected 16 chapters by 34 leading experts on the mathematical theory of stochastic optimization methods with potential for application to the engineering design of complex systems (e.g., telecom and computer networks and aircraft control systems).
Abstract: (2007). Estimation in Surveys with Nonresponse. Technometrics: Vol. 49, No. 2, pp. 227-227.

Journal ArticleDOI
TL;DR: In this paper, the authors consider a sum of independent processes in which each process is obtained by applying a first-order differential operator to a fully symmetric process on sphere × time.
Abstract: For space–time processes on global or large scales, it is critical to use models that respect the Earth's spherical shape. The covariance functions of such processes should be not only positive definite on sphere × time, but also capable of capturing the dynamics of the processes well. We develop space–time covariance functions on sphere × time that are flexible in producing space–time interactions, especially space–time asymmetries. Our idea is to consider a sum of independent processes in which each process is obtained by applying a first-order differential operator to a fully symmetric process on sphere × time. The resulting covariance functions can produce various types of space–time interactions and give different covariance structures along different latitudes. Our approach yields explicit expressions for the covariance functions, which has great advantages in computation. Moreover, it applies equally well to generating asymmetric space–time covariance functions on flat or other spatial domains. We ...

Journal ArticleDOI
TL;DR: The R Development Core Team (2006), “R: A Language and Environment for Statistical computing,” in R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, available at http://www.R-project.org.
Abstract: Bobman, S., Riederer, S., Lee, J., Suddarth, S., Wang, H., and MacFall, J. (1985), “Synthesized MR Images: Comparison With Acquired Images,” Radiology, 155, 731–738. Bobman, S., Riederer, S., Lee, J., Tasciyan, T., Farzaneh, F., and Wang, H. (1986), “Pulse Sequence Extrapolation With MR Image Synthesis,” Radiology, 159, 253–258. Glad, I., and Sebastiani, G. (1995), “A Bayesian Approach to Synthetic Magnetic Resonance Imaging,” Biometrika, 82, 237–250. Hyvärinen, A., Karhunen, J., and Oja, E. (2001), Independent Component Analysis, New York: Wiley. Maitra, R., and Besag, J. E. (1998), “Bayesian Reconstruction in Synthetic Magnetic Resonance Imaging,” in Bayesian Inference in Inverse Problems. Proceedings of the Society of Photo-Optical Instrumentation Engineers (SPIE 1998) Meetings, Vol. 3459, ed. A. Mohammad-Djafari, pp. 39–47. R Development Core Team (2006), “R: A Language and Environment for Statistical Computing,” in R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, available at http://www.R-project.org.

Journal ArticleDOI
TL;DR: This work proposes an efficient variable selection strategy to specifically address the unique challenges faced by the analysis of experiments; the proposed methods can be computed very rapidly and can find sparse models that better satisfy the goals of experiments.
Abstract: The analysis of experiments in which numerous potential variables are examined is driven by the principles of effect sparsity, effect hierarchy, and effect heredity. We propose an efficient variable selection strategy to specifically address the unique challenges faced by such analysis. The proposed methods are natural extensions of the LARS general-purpose variable selection algorithm. They can be computed very rapidly and can find sparse models that better satisfy the goals of experiments. Simulations and real examples are used to illustrate the wide applicability of the proposed methods.
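
As a point of reference, the sketch below runs the general-purpose LARS algorithm that the proposed methods extend on a simulated two-level design with main effects and two-factor interactions. It uses scikit-learn's standard LARS, so it does not enforce effect heredity or the other principles exploited in the article; the simulated design, active effects, and noise level are illustrative assumptions.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lars

rng = np.random.default_rng(2)
n, k = 16, 6
X_main = rng.choice([-1.0, 1.0], size=(n, k))              # two-level main-effect columns
X_int = np.column_stack([X_main[:, i] * X_main[:, j]       # all two-factor interactions
                         for i, j in combinations(range(k), 2)])
X = np.hstack([X_main, X_int])

# effect sparsity: only A, B, and the A*B interaction are active
y = (2.0 * X_main[:, 0] + 1.5 * X_main[:, 1]
     + 1.0 * X_main[:, 0] * X_main[:, 1] + rng.normal(0, 0.5, n))

fit = Lars(n_nonzero_coefs=5).fit(X, y)                     # stop after a few LARS steps
print("selected columns:", np.flatnonzero(fit.coef_))
```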

Journal ArticleDOI
TL;DR: Semiparametric Theory and Missing Data is an excellent addition to the literature and is suitable for an advanced graduate course or for self-study by doctoral students or researchers in statistics and biostatistics.
Abstract: methodology for obtaining regular asymptotic linear AIPWCC (augmented inverse probability weighted complete case) estimators. Additionally, Chapter 9 includes a presentation of the relationship between monotone coarsening and censoring. Chapters 10 and 11 develop methodology for obtaining efficient and robust estimators. Initially, optimal influence functions, whose structure leads to the space of double robust influence functions, are identified. The discussion is then divided into three parts according to the structure of coarsening, namely two levels of missingness, monotone coarsening, and nonmonotone coarsening. The concepts are presented nicely for each case, and the corresponding estimators are developed via both theory and intuition. Chapter 11 considers efficient estimation in the class of double robust estimators. The ideas are based on AIPWCC estimators, but as the author shows, there are computational problems with their implementation. Motivated by the preceding results, Chapter 12 develops estimation within a restricted class of AIPWCC estimators. In particular, detailed proof of the form of the estimating equations is provided in two cases. Initially, it is assumed that the estimating function belongs to the q-replicating linear subspace of the space of influence functions and the q-replicating linear subspace of the augmentation space, where both of the spaces are assumed to be linear and finite dimensional. The second case considers the q-replicating linear subspace of the space of influence functions to be finite dimensional, while there is no restriction on the augmentation space. Examples are worked out in great detail for both cases. Chapter 13 applies the theory to the problem of estimating the average causal treatment effect. The idea is based on the so-called stable unit treatment value assumption, which implies that estimation of the average causal treatment effect is equivalent to estimation with missing data. Estimators are derived in great detail, reinforcing the previous results. The last chapter examines the asymptotic properties of multiple imputation estimators. The prerequisites vary in level and depth as the material advances, but a graduate class in statistics at the level of Casella and Berger (1990) suffices to follow the exposition. The writing style is excellent, and all the main concepts and ideas are presented in a clear and pedagogical way. For instance, a detailed study of both restricted moment and logistic models throughout the book is instrumental in illustrating the abstract theory with some standard data analysis tools. At the end of almost every chapter, there are problems and a summary, which recapitulates notation usage and important results. The author should have put more emphasis on real data examples—applications are missing from the presentation. This book is suitable for an advanced graduate course or for self-study by doctoral students or researchers in statistics and biostatistics. It provides a valuable resource because it contains an up-to-date literature review and an exceptional account of state-of-the-art research on the necessary theory. Overall, Semiparametric Theory and Missing Data is an excellent addition to the literature and, without any hesitation, I recommend it to any professional statistician.

Journal ArticleDOI
TL;DR: This book presents Wilcoxon-type rank-sum precedence tests; Chapter 7 discusses the extension of the precedence-type tests from the preceding chapters to the case of progressive sampling; Chapter 8 presents the generalization of the hitherto discussed two-sample case to the k-sample situation; Chapter 9 discusses the problem of selecting the “best” population if the null hypothesis of homogeneity of populations is rejected by using a test of equality based on a precedence statistic.
Abstract: presents Wilcoxon-type rank-sum precedence tests; Chapter 7 discusses the extension of the precedence-type tests from the preceding chapters to the case of progressive sampling; Chapter 8 presents the generalization of the hitherto discussed two-sample case to the k-sample situation; Chapter 9 discusses the problem of selecting the “best” (in terms of lifetime) population if the null hypothesis of homogeneity of populations is rejected by using a test of equality based on a precedence statistic; Chapter 10, finally, discusses the selection of the “best” population when a test of equality based on a minimal Wilcoxon rank-sum precedence statistic is used. A reader who is interested in precedence-type tests will find this book to be invaluable. It gives all the theory and formulas needed to understand and apply the methods that are discussed. However, the applications mentioned in the title of the book are mainly confined to a dataset on the times to breakdown of an insulating fluid that is subjected to high-voltage stress, which is used in almost all chapters. This hardly justifies the mention of applications in the title. Further, the book contains page after page of tables from Monte Carlo simulations of the power of the tests. While these may be of interest to some readers, they break up the flow of the text and make it less coherent. It would have been better if the tables had been placed in an appendix instead of in the text. In summary, this book is a valuable contribution to the literature in reliability and lifetime data analysis. Researchers and practitioners in this field may find it quite useful, but persons outside this field will probably have little use for it.


Journal ArticleDOI
TL;DR: The statistical papers emphasize data and the statistical analysis of various forms of classification, but the computer science papers take a different approach which will be difficult for statisticians to follow.
Abstract: marily statisticians and computer scientists, but some papers are by economists, musicologists, and others working in applied fields. The book is divided into five chapters on theory (Similarity and Dissimilarity, Classification and Clustering, Network and Graph Analysis, Analysis of Symbolic Data, General Data Analysis Methods) and three chapters on applications (Data and Web Mining, Analysis of Music Data, Gene and Microarray Analysis). Within each chapter are three to five papers of eight to twelve pages in length. Some of these papers are densely packed with detail, while others appear to have been stretched to fill eight pages. The typesetting and page layout are well done, and the graphics are very clear. Most of the papers either present new statistics for identifying complex features seen in data, or they present new classification methods. There are a few survey articles, and some articles that broadly discuss applications. Because the authors come from diverse backgrounds, there is great variety in writing style among the papers. In spite of the multidisciplinary nature of the field, there is still some distance between various academic approaches to it. The computer science papers reference computer science journals, while the statisticians reference statistical journals. The statistical papers emphasize data and the statistical analysis of various forms of classification, but the computer science papers take a different approach that will be difficult for statisticians to follow. Some of the differences are fundamental, as in the case of technical papers on fuzzy sets. Other difficulties arise from differences in notation, and differences in what is emphasized in discussions. Readers with a background in computer science will have similar difficulties with the statistical papers. The main market for this book would be libraries, and researchers wanting a record of recent advances in statistical learning. One section of more general interest is dedicated to classification of music. Musicologists seek to identify similarities between music played at different tempos and in different scales, for purposes as diverse as ethnomusicology and investigations of copyright infringement. A survey article is presented, along with three papers on similarity measures and other statistics designed to describe specific aspects of music.