
Showing papers in "Statistical Science in 2006"


Journal ArticleDOI
TL;DR: This paper reviews many of the AT models that have been used successfully in this area and highlights statisticians' contributions to the development of appropriate stochastic models for AT data.
Abstract: Engineers in the manufacturing industries have used accelerated test (AT) experiments for many decades. The purpose of AT experiments is to acquire reliability information quickly. Test units of a material, component, subsystem, or entire systems are subjected to higher-than-usual levels of one or more accelerating variables such as temperature or stress. Then the AT results are used to predict life of the units at use conditions. The extrapolation is typically justified (correctly or incorrectly) on the basis of physically motivated models or a combination of empirical model fitting with a sufficient amount of previous experience in testing similar units. The need to extrapolate in both time and the accelerating variables generally necessitates the use of fully parametric models. Statisticians have made important contributions in the development of appropriate stochastic models for AT data (typically a distribution for the response and regression relationships between the parameters of this distribution and the accelerating variable(s)), statistical methods for AT planning (choice of accelerating variable levels and allocation of available test units to those levels), and methods of estimation of suitable reliability metrics. This paper provides a review of many of the AT models that have been used successfully in this area.

622 citations
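
To make the kind of fully parametric AT model concrete, a common textbook choice (an illustrative assumption here, not necessarily the example used in the paper) is a lognormal life distribution whose location parameter follows an Arrhenius regression in absolute temperature; the implied acceleration factor is what justifies extrapolation from test to use conditions.

```latex
% Illustrative AT model: Arrhenius-lognormal form (my choice, not quoted from the paper).
% Life t at absolute temperature T (kelvin):
\log t \sim \mathrm{N}\bigl(\mu(T), \sigma^{2}\bigr), \qquad
\mu(T) = \beta_0 + \beta_1 \frac{11605}{T},
\qquad
\mathrm{AF}(T_{\mathrm{use}}, T_{\mathrm{test}}) =
\exp\Bigl\{\beta_1\Bigl(\frac{11605}{T_{\mathrm{use}}} - \frac{11605}{T_{\mathrm{test}}}\Bigr)\Bigr\}.
```

Here 11605 is approximately 1/k_B in kelvin per electron-volt, and AF is the acceleration factor used to translate lifetimes observed at the test temperature to the use temperature.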


Journal ArticleDOI
TL;DR: The authors argued that simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.
Abstract: A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.

533 citations
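
The kind of comparison being questioned is easy to stage; the sketch below (my own illustration on synthetic data, not one of the studies the paper reviews) pits linear discriminant analysis against a random forest and reports cross-validated accuracy with its variability, which is often of the same order as the gap between the two methods.

```python
# Illustrative comparison of a simple vs. a sophisticated classifier
# (not a reproduction of any study cited in the paper).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           flip_y=0.05, random_state=0)  # 5% label noise

for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("Random forest", RandomForestClassifier(random_state=0))]:
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: accuracy {scores.mean():.3f} +/- {scores.std():.3f}")
```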


Journal ArticleDOI
TL;DR: In this article, the authors survey the literature on network-based marketing with an emphasis on the statistical methods used and the data to which these methods have been applied and provide a discussion of challenges and opportunities for this burgeoning research topic.
Abstract: Network-based marketing refers to a collection of marketing techniques that take advantage of links between consumers to increase sales. We concentrate on the consumer networks formed using direct interactions (e.g., communications) between consumers. We survey the diverse literature on such marketing with an emphasis on the statistical methods used and the data to which these methods have been applied. We also provide a discussion of challenges and opportunities for this burgeoning research topic. Our survey highlights a gap in the literature. Because of inadequate data, prior studies have not been able to provide direct, statistical support for the hypothesis that network linkage can directly affect product/service adoption. Using a new data set that represents the adoption of a new telecommunications service, we show very strong support for the hypothesis. Specifically, we show three main results: (1) “Network neighbors”—those consumers linked to a prior customer—adopt the service at a rate 3–5 times greater than baseline groups selected by the best practices of the firm’s marketing team. In addition, analyzing the network allows the firm to acquire new customers who otherwise would have fallen through the cracks, because they would not have been identified based on traditional attributes. (2) Statistical models, built with a very large amount of geographic, demographic and prior purchase data, are significantly and substantially improved by including network information. (3) More detailed network information allows the ranking of the network neighbors so as to permit the selection of small sets of individuals with very high probabilities of adoption.

504 citations


Journal ArticleDOI
TL;DR: First hitting times arise naturally in many types of stochastic processes, ranging from Wiener processes to Markov chains, and have been investigated as models for survival data.
Abstract: Many researchers have investigated first hitting times as models for survival data. First hitting times arise naturally in many types of stochastic processes, ranging from Wiener processes to Markov chains. In a survival context, the state of the underlying process represents the strength of an item or the health of an individual. The item fails or the individual experiences a clinical endpoint when the process reaches an adverse threshold state for the first time. The time scale can be calendar time or some other operational measure of degradation or disease progression. In many applications, the process is latent (i.e., unobservable). Threshold regression refers to first-hitting-time models with regression structures that accommodate covariate data. The parameters of the process, threshold state and time scale may depend on the covariates. This paper reviews aspects of this topic and discusses fruitful avenues for future research.

258 citations
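
For the canonical Wiener-process case mentioned above, the first hitting time has a closed form, and threshold regression then ties the process parameters to covariates; the sketch below states that standard result in generic notation (not formulas quoted from the paper).

```latex
% Latent health process X(t) = x_0 + \nu t + \sigma W(t), with x_0 > 0 and drift \nu < 0;
% failure occurs at the first time S that X(t) reaches the threshold 0.
% S then has an inverse Gaussian density:
f(s) = \frac{x_0}{\sigma\sqrt{2\pi s^{3}}}
       \exp\!\left\{-\frac{(x_0 + \nu s)^{2}}{2\sigma^{2} s}\right\}, \qquad s > 0.
```

A threshold regression might then let, say, \nu = z^\top\beta and \log x_0 = z^\top\gamma depend on covariates z, which is the "regression structure" referred to in the abstract.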


Journal ArticleDOI
TL;DR: Using data from a popular movie website, a metric of a purchasing population's propensity to rate a product online is introduced and it is found that it exhibits several relationships that have been previously found to exist between aspects of a product and consumers' propensity to engage in offline WOM about it.
Abstract: The emergence of online communities has enabled firms to monitor consumer-generated online word-of-mouth (WOM) in real-time by mining publicly available information from the Internet. A prerequisite for harnessing this new ability is the development of appropriate WOM metrics and the identification of relationships between such metrics and consumer behavior. Along these lines this paper introduces a metric of a purchasing population’s propensity to rate a product online. Using data from a popular movie website we find that our metric exhibits several relationships that have been previously found to exist between aspects of a product and consumers’ propensity to engage in offline WOM about it. Our study, thus, provides positive evidence for the validity of our metric as a proxy of a population’s propensity to engage in post-purchase online WOM. Our results also suggest that the antecedents of offline and online WOM exhibit important similarities.

238 citations


Journal ArticleDOI
TL;DR: This paper is intended as an introduction to SVMs and their applications, emphasizing their key features; some algorithmic extensions and illustrative real-world applications of SVMs are also shown.
Abstract: Support vector machines (SVMs) appeared in the early nineties as optimal margin classifiers in the context of Vapnik's statistical learning theory. Since then SVMs have been successfully applied to real-world data analysis problems, often providing improved results compared with other techniques. The SVMs operate within the framework of regularization theory by minimizing an empirical risk in a well-posed and consistent way. A clear advantage of the support vector approach is that sparse solutions to classification and regression problems are usually obtained: only a few samples are involved in the determination of the classification or regression functions. This fact facilitates the application of SVMs to problems that involve a large amount of data, such as text processing and bioinformatics tasks. This paper is intended as an introduction to SVMs and their applications, emphasizing their key features. In addition, some algorithmic extensions and illustrative real-world applications of SVMs are shown.

232 citations
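
A minimal illustration of the sparseness property described above, using scikit-learn (generic SVM behavior, not code from the paper):

```python
# Fit a soft-margin SVM and count how many training points end up as support vectors;
# only these points determine the fitted decision function.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, n_informative=5,
                           class_sep=2.0, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)
print("training points:", len(X))
print("support vectors:", clf.support_vectors_.shape[0])  # typically a modest fraction
```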


Journal ArticleDOI
TL;DR: In this paper, the authors use potential outcomes to define causal effects, followed by principal stratification on the intermediate outcomes (e.g., survival), and conclude that causal inference is best understood using potential outcomes.
Abstract: Causal inference is best understood using potential outcomes. This use is particularly important in more complex settings, that is, observational studies or randomized experiments with complications such as noncompliance. The topic of this lecture, the issue of estimating the causal effect of a treatment on a primary outcome that is "censored" by death, is another such complication. For example, suppose that we wish to estimate the effect of a new drug on Quality of Life (QOL) in a randomized experiment, where some of the patients die before the time designated for their QOL to be assessed. Another example with the same structure occurs with the evaluation of an educational program designed to increase final test scores, which are not defined for those who drop out of school before taking the test. A further application is to studies of the effect of job-training programs on wages, where wages are only defined for those who are employed. The analysis of examples like these is greatly clarified using potential outcomes to define causal effects, followed by principal stratification on the intermediate outcomes (e.g., survival).

198 citations
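
The estimand implicit in the QOL example can be written down with potential outcomes. With survival as the intermediate outcome, the comparison is restricted to the principal stratum of units who would survive under either assignment (often called the survivor average causal effect in this literature); a sketch in standard notation, not the lecture's exact display:

```latex
% Treatment z in {0,1}; survival indicators S_i(z); QOL Y_i(z) defined only when S_i(z) = 1.
\mathrm{SACE} \;=\; E\bigl[\, Y_i(1) - Y_i(0) \;\bigm|\; S_i(1) = 1,\ S_i(0) = 1 \,\bigr].
```

Because membership in the "always-survivor" stratum is unaffected by treatment, this is a well-defined causal effect, unlike a naive comparison of observed survivors in the two arms.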


Journal ArticleDOI
TL;DR: A Bayesian approach using Markov chain Monte Carlo (MCMC) for estimation and inference is described for general design generalized linear mixed models, and the MCMC package WinBUGS is shown to facilitate sound fitting of such models in practice.
Abstract: Linear mixed models are able to handle an extraordinary range of complications in regression-type analyses. Their most common use is to account for within-subject correlation in longitudinal data analysis. They are also the standard vehicle for smoothing spatial count data. However, when treated in full generality, mixed models can also handle spline-type smoothing and closely approximate kriging. This allows for nonparametric regression models (e.g., additive models and varying coefficient models) to be handled within the mixed model framework. The key is to allow the random effects design matrix to have general structure; hence our label general design. For continuous response data, particularly when Gaussianity of the response is reasonably assumed, computation is now quite mature and supported by the R, SAS and S-PLUS packages. Such is not the case for binary and count responses, where generalized linear mixed models (GLMMs) are required, but are hindered by the presence of intractable multivariate integrals. Software known to us supports special cases of the GLMM (e.g., PROC NLMIXED in SAS or glmmML in R) or relies on the sometimes crude Laplace-type approximation of integrals (e.g., the SAS macro glimmix or glmmPQL in R). This paper describes the fitting of general design generalized linear mixed models. A Bayesian approach is taken and Markov chain Monte Carlo (MCMC) is used for estimation and inference. In this generalized setting, MCMC requires sampling from nonstandard distributions. In this article, we demonstrate that the MCMC package WinBUGS facilitates sound fitting of general design Bayesian generalized linear mixed models in practice.

181 citations
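
For binary responses, the model being fitted is a logistic mixed model whose random effects are handled by MCMC rather than by analytic integration. The sketch below is a small Python/PyMC analogue of that workflow on simulated data (my own illustration; the paper itself demonstrates WinBUGS).

```python
# Bayesian random-intercept logistic GLMM fitted by MCMC (illustrative analogue of the
# WinBUGS fitting described in the paper; simulated data).
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(0)
n_groups, n_per = 20, 15
group = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
u_true = rng.normal(scale=0.8, size=n_groups)
y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + 1.0 * x + u_true[group]))))

with pm.Model():
    beta0 = pm.Normal("beta0", 0.0, 5.0)
    beta1 = pm.Normal("beta1", 0.0, 5.0)
    sigma_u = pm.HalfNormal("sigma_u", 1.0)
    u = pm.Normal("u", 0.0, sigma_u, shape=n_groups)      # random intercepts
    eta = beta0 + beta1 * x + u[group]
    pm.Bernoulli("y", p=pm.math.invlogit(eta), observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)

print(az.summary(idata, var_names=["beta0", "beta1", "sigma_u"]))
```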


Journal ArticleDOI
TL;DR: A framework is presented in which the observed events are modeled as marked point processes, with marks labeling the types of events; the emphasis is more on modeling than on statistical inference.
Abstract: We review basic modeling approaches for failure and maintenance data from repairable systems. In particular we consider imperfect repair models, defined in terms of virtual age processes, and the trend-renewal process which extends the nonhomogeneous Poisson process and the renewal process. In the case where several systems of the same kind are observed, we show how observed covariates and unobserved heterogeneity can be included in the models. We also consider various approaches to trend testing. Modern reliability data bases usually contain information on the type of failure, the type of maintenance and so forth in addition to the failure times themselves. Basing our work on recent literature we present a framework where the observed events are modeled as marked point processes, with marks labeling the types of events. Throughout the paper the emphasis is more on modeling than on statistical inference.

176 citations
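
For reference, the trend-renewal process mentioned above can be defined in one line; this is my summary of the standard definition rather than wording from the paper.

```latex
% Trend-renewal process TRP(F, \lambda):
% trend function \lambda(t) with cumulative \Lambda(t) = \int_0^t \lambda(u)\,du,
% and F a lifetime distribution (conventionally with mean 1).
T_1 < T_2 < \cdots \ \text{follow}\ \mathrm{TRP}(F, \lambda)
\iff \Lambda(T_1), \Lambda(T_2), \ldots \ \text{form a renewal process with interarrival distribution } F.
```

Taking F standard exponential recovers the nonhomogeneous Poisson process with intensity \lambda(t); taking \lambda \equiv 1 recovers the ordinary renewal process, which is the sense in which the TRP extends both.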


Journal ArticleDOI
TL;DR: A survey of various existing techniques dealing with the dependence problem is provided in this article, including locally optimal designs, sequential designs, Bayesian designs and the quantile dispersion graph approach for comparing designs for generalized linear models.
Abstract: Generalized linear models (GLMs) have been used quite effectively in the modeling of a mean response under nonstandard conditions, where discrete as well as continuous data distributions can be accommodated. The choice of design for a GLM is a very important task in the development and building of an adequate model. However, one major problem that handicaps the construction of a GLM design is its dependence on the unknown parameters of the fitted model. Several approaches have been proposed in the past 25 years to solve this problem. These approaches, however, have provided only partial solutions that apply in only some special cases, and the problem, in general, remains largely unresolved. The purpose of this article is to focus attention on the aforementioned dependence problem. We provide a survey of various existing techniques dealing with the dependence problem. This survey includes discussions concerning locally optimal designs, sequential designs, Bayesian designs and the quantile dispersion graph approach for comparing designs for GLMs.

137 citations
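
The dependence problem has a compact mathematical statement: the information matrix of a GLM design involves the unknown parameters through the GLM weight function, so any optimality criterion can only be evaluated at a guessed value. In standard notation (my sketch, not the article's):

```latex
% Design \xi with weights w_i on support points x_i; linear predictor \eta = x^\top\beta.
M(\xi, \beta) = \sum_i w_i\, \omega\!\bigl(x_i^{\top}\beta\bigr)\, x_i x_i^{\top},
\qquad \omega(\eta) = \frac{e^{\eta}}{(1 + e^{\eta})^{2}} \ \ \text{(logistic case)}.
```

A locally D-optimal design maximizes \det M(\xi, \beta_0) at a prior guess \beta_0; a Bayesian D-criterion instead maximizes \int \log\det M(\xi, \beta)\,\pi(\beta)\,d\beta, and sequential designs update the guess as data accrue.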


Journal ArticleDOI
TL;DR: It is argued that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask new research questions that classical statistical methods are not able to address.
Abstract: This paper describes opportunities and challenges of using functional data analysis (FDA) for the exploration and analysis of data originating from electronic commerce (eCommerce). We discuss the special data structures that arise in the online environment and why FDA is a natural approach for representing and analyzing such data. The paper reviews several FDA methods and motivates their usefulness in eCommerce research by providing a glimpse into new domain insights that they allow. We argue that the wedding of eCommerce with FDA leads to innovations both in statistical methodology, due to the challenges and complications that arise in eCommerce data, and in online research, by being able to ask (and subsequently answer) new research questions that classical statistical methods are not able to address, and also by expanding on research questions beyond the ones traditionally asked in the offline environment. We describe several applications originating from online transactions which are new to the statistics literature, and point out statistical challenges accompanied by some solutions. We also discuss some promising future directions for joint research efforts between researchers in eCommerce and statistics.

Journal ArticleDOI
TL;DR: This article empirically measure the distribution of bid timings and the extent of multiple bidding in a large set of online auctions, using bidder experience as a mediating variable and finds a nonmonotonic impact of bidder experience on the timing of bid placements.
Abstract: Online auctions are fast gaining popularity in today’s electronic commerce. Relative to offline auctions, there is a greater degree of multiple bidding and late bidding in online auctions, an empirical finding by some recent research. These two behaviors (multiple bidding and late bidding) are of “strategic” importance to online auctions and hence important to investigate. In this article we empirically measure the distribution of bid timings and the extent of multiple bidding in a large set of online auctions, using bidder experience as a mediating variable. We use data from the popular auction site www.eBay.com to investigate more than 10,000 auctions from 15 consumer product categories. We estimate the distribution of late bidding and multiple bidding, which allows us to place these product categories along a continuum of these metrics (the extent of late bidding and the extent of multiple bidding). Interestingly, the results of the analysis distinguish most of the product categories from one another with respect to these metrics, implying that product categories, after controlling for bidder experience, differ in the extent of multiple bidding and late bidding observed in them. We also find a nonmonotonic impact of bidder experience on the timing of bid placements. Experienced bidders are “more” active either toward the close of auction or toward the start of auction. The impact of experience on the extent of multiple bidding, though, is monotonic across the auction interval; more experienced bidders tend to indulge “less” in multiple bidding.

Journal ArticleDOI
TL;DR: Propensity score methods were proposed by Rosenbaum and Rubin (Biometrika 70 (1983) 41-55) as central tools to help assess the causal effects of interventions as discussed by the authors.
Abstract: Propensity score methods were proposed by Rosenbaum and Rubin (Biometrika 70 (1983) 41-55) as central tools to help assess the causal effects of interventions. Since their introduction more than two decades ago, they have found wide application in a variety of areas, including medical research, economics, epidemiology and education, especially in those situations where randomized experiments are either difficult to perform, or raise ethical questions, or would require extensive delays before answers could be obtained. In the past few years, the number of published applications using propensity score methods to evaluate medical and epidemiological interventions has increased dramatically. Nevertheless, thus far, we believe that there have been few applications of propensity score methods to evaluate marketing interventions (e.g., advertising, promotions), where the tradition is to use generally inappropriate techniques, which focus on the prediction of an outcome from background characteristics and an indicator for the intervention using statistical tools such as least-squares regression, data mining, and so on. With these techniques, an estimated parameter in the model is used to estimate some global "causal" effect. This practice can generate grossly incorrect answers that can be self-perpetuating: polishing the Ferraris rather than the Jeeps "causes" them to continue to win more races than the Jeeps, visiting the high-prescribing doctors rather than the low-prescribing doctors "causes" them to continue to write more prescriptions. This presentation will take "causality" seriously, not just as a casual concept implying some predictive association in a data set, and will illustrate why propensity score methods are generally superior in practice to the standard predictive approaches for estimating causal effects.
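
A minimal sketch of the workflow being advocated (estimate the propensity score, then compare like with like), here via inverse-probability weighting on simulated data; this illustrates the general method, not the authors' own analyses.

```python
# Propensity-score analysis sketch: estimate e(x) = P(treated | x),
# then use inverse-probability weights to estimate the average treatment effect.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=(n, 3))
p_treat = 1 / (1 + np.exp(-(x[:, 0] + 0.5 * x[:, 1])))           # selection on covariates
t = rng.binomial(1, p_treat)
y = 2.0 * t + x @ np.array([1.0, 1.0, 0.5]) + rng.normal(size=n)  # true effect = 2

e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]         # estimated propensity score
w = t / e + (1 - t) / (1 - e)                                     # inverse-probability weights
ate = np.average(y[t == 1], weights=w[t == 1]) - np.average(y[t == 0], weights=w[t == 0])
print(f"naive difference: {y[t == 1].mean() - y[t == 0].mean():.2f},  IPW estimate: {ate:.2f}")
```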

Journal ArticleDOI
TL;DR: This paper examines the sources, in the work of the earlier scholars whose ideas Kolmogorov synthesized, of the two aspects of the Grundbegriffe: its mathematical formalism and its philosophy of probability, an explanation of how the formalism can be connected to the world of experience.
Abstract: Andrei Kolmogorov’s Grundbegriffe der Wahrscheinlichkeitsrechnung put probability’s modern mathematical formalism in place. It also provided a philosophy of probability—an explanation of how the formalism can be connected to the world of experience. In this article, we examine the sources of these two aspects of the Grundbegriffe—the work of the earlier scholars whose ideas Kolmogorov synthesized.

Journal ArticleDOI
TL;DR: An overview of the various modeling frameworks for non-Gaussian longitudinal data is provided, with a focus on generalized linear mixed-effects models, whose parameters can be estimated using full likelihood, and on generalized estimating equations, a nonlikelihood method that requires modification to be valid under MAR.
Abstract: Commonly used methods to analyze incomplete longitudinal clinical trial data include complete case analysis (CC) and last observation carried forward (LOCF). However, such methods rest on strong assumptions, including missing completely at random (MCAR) for CC and unchanging profile after dropout for LOCF. Such assumptions are too strong to generally hold. Over the last decades, a number of full longitudinal data analysis methods have become available, such as the linear mixed model for Gaussian outcomes, that are valid under the much weaker missing at random (MAR) assumption. Such a method is useful, even if the scientific question is in terms of a single time point, for example, the last planned measurement occasion, and it is generally consistent with the intention-to-treat principle. The validity of such a method rests on the use of maximum likelihood, under which the missing data mechanism is ignorable as soon as it is MAR. In this paper, we will focus on non-Gaussian outcomes, such as binary, categorical or count data. This setting is less straightforward since there is no unambiguous counterpart to the linear mixed model. We first provide an overview of the various modeling frameworks for non-Gaussian longitudinal data, and subsequently focus on generalized linear mixed-effects models, on the one hand, of which the parameters can be estimated using full likelihood, and on generalized estimating equations, on the other hand, which is a nonlikelihood method and hence requires a modification to be valid under MAR. We briefly comment on the position of models that assume missingness not at random and argue they are most useful to perform sensitivity analysis. Our developments are underscored using data from two studies. While the case studies feature binary outcomes, the methodology applies equally well to other discrete-data settings, hence the qualifier "discrete" in the title.
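
The contrast drawn between likelihood and nonlikelihood methods can be made concrete. Standard GEE solves a moment equation that is unbiased under MCAR but not, in general, under MAR, which is why the weighted modification mentioned above is needed; in the usual notation (a generic sketch, not the paper's display):

```latex
% Standard (unweighted) generalized estimating equations, with V_i a working covariance:
\sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl(y_i - \mu_i(\beta)\bigr) = 0,
\qquad D_i = \frac{\partial \mu_i}{\partial \beta^{\top}}.
```

Weighted GEE reweights each subject's (or each observed measurement's) contribution by the inverse of its estimated probability of being observed, restoring validity under MAR; likelihood-based GLMMs need no such correction because an MAR mechanism is ignorable under likelihood inference.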

Journal ArticleDOI
TL;DR: The role of experts in problem structuring and in developing failure mitigation options is discussed, along with the need to take into account the reliability potential of future mitigation measures downstream in the system life cycle.
Abstract: This paper reviews the role of expert judgement to support reliability assessments within the systems engineering design process. Generic design processes are described to give the context and a discussion is given about the nature of the reliability assessments required in the different systems engineering phases. It is argued that, as far as meeting reliability requirements is concerned, the whole design process is more akin to a statistical control process than to a straightforward statistical problem of assessing an unknown distribution. This leads to features of the expert judgement problem in the design context which are substantially different from those seen, for example, in risk assessment. In particular, the role of experts in problem structuring and in developing failure mitigation options is much more prominent, and there is a need to take into account the reliability potential for future mitigation measures downstream in the system life cycle. An overview is given of the stakeholders typically involved in large scale systems engineering design projects, and this is used to argue the need for methods that expose potential judgemental biases in order to generate analyses that can be said to provide rational consensus about uncertainties. Finally, a number of key points are developed with the aim of moving toward a framework that provides a holistic method for tracking reliability assessment through the design process.

Journal ArticleDOI
TL;DR: In this paper, a review of methodology that has been proposed for addressing system reliability with limited full system testing is presented, and methodological contributions to resource allocation considerations for system reliability assessment are also made.
Abstract: The systems that statisticians are asked to assess, such as nuclear weapons, infrastructure networks, supercomputer codes and munitions, have become increasingly complex. It is often costly to conduct full system tests. As such, we present a review of methodology that has been proposed for addressing system reliability with limited full system testing. The first approaches presented in this paper are concerned with the combination of multiple sources of information to assess the reliability of a single component. The second general set of methodology addresses the combination of multiple levels of data to determine system reliability. We then present developments for complex systems beyond traditional series/parallel representations through the use of Bayesian networks and flowgraph models. We also include methodological contributions to resource allocation considerations for system reliability assessment. We illustrate each method with applications primarily encountered at Los Alamos National Laboratory.
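
One simple instance of combining component-level information into a full-system assessment when full-system tests are scarce is to propagate component posteriors through the system structure by simulation. The sketch below does this for a series system with Beta posteriors; it illustrates the general idea and is not a method attributed to the paper.

```python
# Combine component pass/fail test data into a posterior for series-system reliability.
import numpy as np

rng = np.random.default_rng(0)
# (successes, trials) from component-level testing; hypothetical numbers.
component_tests = [(48, 50), (29, 30), (95, 100)]

draws = []
for s, n in component_tests:
    # Beta(1, 1) prior + binomial data -> Beta(1 + s, 1 + n - s) posterior.
    draws.append(rng.beta(1 + s, 1 + n - s, size=100_000))

system_reliability = np.prod(np.vstack(draws), axis=0)   # series system: product of components
lo, hi = np.percentile(system_reliability, [2.5, 97.5])
print(f"posterior mean {system_reliability.mean():.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```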

Journal ArticleDOI
TL;DR: This work presents an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included, and explores their link to the related literature on privacy-preserving data mining.
Abstract: The growing expanse of e-commerce and the widespread availability of online databases raise many fears regarding loss of privacy and many statistical challenges. Even with encryption and other nominal forms of protection for individual databases, we still need to protect against the violation of privacy through linkages across multiple databases. These issues parallel those that have arisen and received some attention in the context of homeland security. Following the events of September 11, 2001, there has been heightened attention in the United States and elsewhere to the use of multiple government and private databases for the identification of possible perpetrators of future attacks, as well as an unprecedented expansion of federal government data mining activities, many involving databases containing personal information. We present an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included. We also explore their link to the related literature on privacy-preserving data mining. In particular, we focus on the matching problem across databases and the concept of “selective revelation” and their confidentiality implications.

Journal ArticleDOI
TL;DR: The optimality of price discrimination in the software industry is assessed using a large e-commerce panel data set gathered from Amazon.com, including a method for "reverse engineering" demand levels from reported sales ranks.
Abstract: As Internet-based commerce becomes increasingly widespread, large data sets about the demand for and pricing of a wide variety of products become available. These present exciting new opportunities for empirical economic and business research, but also raise new statistical issues and challenges. In this article, we summarize research that aims to assess the optimality of price discrimination in the software industry using a large e-commerce panel data set gathered from Amazon.com. We describe the key parameters that relate to demand and cost that must be reliably estimated to accomplish this research successfully, and we outline our approach to estimating these parameters. This includes a method for “reverse engineering” actual demand levels from the sales ranks reported by Amazon, and approaches to estimating demand elasticity, variable costs and the optimality of pricing choices directly from publicly available e-commerce data. Our analysis raises many new challenges to the reliable statistical analysis of e-commerce data and we conclude with a brief summary of some salient ones.
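
The "reverse engineering" step rests on the empirical regularity, widely used in this literature, that quantity and sales rank are approximately linked by a power law, so a log-log fit calibrated on a few known (rank, quantity) pairs can impute demand for observed ranks. The sketch below shows that calibration with made-up numbers; the functional form is a common assumption, not a formula quoted from the article.

```python
# Calibrate log(quantity) ~ a + b * log(rank) from a few (rank, quantity) pairs,
# then impute demand for observed sales ranks. Numbers are hypothetical.
import numpy as np

calib_rank = np.array([50, 400, 2000, 10000])
calib_qty = np.array([300, 60, 15, 4])          # known weekly sales at those ranks

b, a = np.polyfit(np.log(calib_rank), np.log(calib_qty), deg=1)   # slope (negative), intercept

def predict_qty(rank):
    return np.exp(a + b * np.log(rank))

observed_ranks = np.array([120, 850, 5000])
print(predict_qty(observed_ranks))
```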

Journal ArticleDOI
TL;DR: In this paper, the authors examined the price dynamics of online auctions of modern Indian art using functional data analysis and identified several factors, such as artist characteristics (established or emerging artist; prior sales history), art characteristics (size; painting medium), competition characteristics (current number of bidders; current number of bids) and auction design characteristics (opening bid; position of the lot in the auction), that explain the dynamics of price movement in an on-line art auction.
Abstract: In this paper, we examine the price dynamics of on-line art auctions of modern Indian art using functional data analysis. The purpose here is not just to understand what determines the final prices of art objects, but also the price movement during the entire auction. We identify several factors, such as artist characteristics (established or emerging artist; prior sales history), art characteristics (size; painting medium—canvas or paper), competition characteristics (current number of bidders; current number of bids) and auction design characteristics (opening bid; position of the lot in the auction), that explain the dynamics of price movement in an on-line art auction. We find that the effects on price vary over the duration of the auction, with some of these effects being stronger at the beginning of the auction (such as the opening bid and historical prices realized). In some cases, the rate of change in prices (velocity) increases at the end of the auction (for canvas paintings and paintings by established artists). Our analysis suggests that the opening bid is positively related to on-line auction price levels of art at the beginning of the auction, but its effect declines toward the end of the auction. The order in which the lots appear in an art auction is negatively related to the current price level, with this relationship decreasing toward the end of the auction. This implies that lots that appear earlier have higher current prices during the early part of the auction, but that effect diminishes by the end of the auction. Established artists show a positive relationship with the price level at the beginning of the auction. Reputation or popularity of the artists and their investment potential as assessed by previous history of sales are positively related to the price levels at the beginning of the auction. The medium (canvas or paper) of the painting does not show any relationship with art auction price levels, but the size of the painting is negatively related to the current price during the early part of the auction. Important implications for auction design are drawn from the analysis.
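
The "velocity" referred to above is the first derivative of a smoothed price curve. A minimal version of that FDA step, smoothing one auction's bid history with a spline and differentiating it, is sketched below with made-up data.

```python
# Smooth one auction's price path with a spline and compute price velocity (1st derivative).
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical bid times (fraction of auction elapsed) and current prices.
t = np.array([0.02, 0.10, 0.18, 0.30, 0.45, 0.60, 0.75, 0.88, 0.96, 1.00])
price = np.array([100, 140, 150, 170, 185, 190, 205, 240, 290, 340], dtype=float)

curve = UnivariateSpline(t, price, k=4, s=500)   # smoothing spline fit to the price path
velocity = curve.derivative(1)                   # price velocity as a function of time

grid = np.linspace(0, 1, 11)
for ti, p, v in zip(grid, curve(grid), velocity(grid)):
    print(f"t={ti:.1f}  price={p:7.1f}  velocity={v:8.1f}")
```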

Journal ArticleDOI
TL;DR: In this article, the causal effect of customer relationship management (CRM) applications on one-to-one marketing effectiveness is assessed using a potential outcomes-based propensity score approach.
Abstract: This article provides an assessment of the causal effect of customer relationship management (CRM) applications on one-to-one marketing effectiveness. We use a potential outcomes based propensity score approach to assess this causal effect. We find that firms using CRM systems have greater levels of one-to-one marketing effectiveness. We discuss the strengths and challenges of using the propensity score approach to design and execute CRM related observational studies. We also discuss the applicability of the framework in this paper to study typical causal questions in business and electronic commerce research at the firm, individual and economy levels, and to clarify the assumptions that researchers must make to infer causality from observational data.

Journal ArticleDOI
TL;DR: This review article provides an overview of recent work in the modelling and analysis of recurrent events arising in engineering, reliability, public health, biomedical, and other areas and describes a recent general class of models for recurrent events which simultaneously accommodates these aspects.
Abstract: This review article provides an overview of recent work in the modelling and analysis of recurrent events arising in engineering, reliability, public health, biomedical, and other areas. Recurrent event modelling possesses unique facets making it different and more difficult to handle than single event settings. For instance, the impact of an increasing number of event occurrences needs to be taken into account, the effects of covariates should be considered, potential association among the inter-event times within a unit cannot be ignored, and the effects of performed interventions after each event occurrence need to be factored in. A recent general class of models for recurrent events which simultaneously accommodates these aspects is described. Statistical inference methods for this class of models are presented and illustrated through applications to real data sets. Some existing open research problems are described.
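
A model with the ingredients listed above (accumulating event count, covariates, within-unit association, and intervention effects through an effective age) can be written as an intensity of the following general shape; treat this as a hedged paraphrase of the structure of such a class rather than the paper's exact specification.

```latex
\lambda_i\bigl(t \mid \mathcal{H}_{i,t^-}\bigr)
  = Z_i\, \lambda_0\bigl(\mathcal{E}_i(t)\bigr)\,
    \rho\bigl(N_i(t^-); \alpha\bigr)\,
    \exp\bigl\{\beta^{\top} x_i(t)\bigr\}.
```

Here \lambda_0 is a baseline hazard evaluated at an effective (virtual) age \mathcal{E}_i(t) encoding the impact of interventions, \rho(\cdot; \alpha) modulates the intensity by the number of prior events N_i(t^-), x_i(t) are covariates, and Z_i is a unit-level frailty capturing association among the inter-event times.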

Journal ArticleDOI
TL;DR: The application of functional data analysis (FDA) as a means to study the dynamics of software evolution in the open source context is explored and some patterns in which the complexity of software decreased as the software grew in size are demonstrated.
Abstract: This paper explores the application of functional data analy sis (FDA) as a means to study the dynamics of software evolution in the open source context. Several challenges in analyzing the data from software projects are discussed, an approach to overcoming those challenges is de scribed, and preliminary results from the analysis of a sample of open source software (OSS) projects are provided. The results demonstrate the utility of FDA for uncovering and categorizing multiple distinct patterns of evolution in the complexity of OSS projects. These results are promising in that they demonstrate some patterns in which the complexity of software decreased as the software grew in size, a particularly novel result. The paper reports pre liminary explorations of factors that may be associated with decreasing com plexity patterns in these projects. The paper concludes by describing several next steps for this research project as well as some questions for which more sophisticated analytical techniques may be needed.

Journal ArticleDOI
TL;DR: An Incremental Quantile (IQ) estimation method that is designed for performance monitoring at arbitrary levels of network aggregation and time resolution when only a limited amount of data can be transferred is described.
Abstract: Networked applications have software components that reside on different computers. Email, for example, has database, processing, and user interface components that can be distributed across a network and shared by users in different locations or work groups. End-to-end performance and reliability metrics describe the software quality experienced by these groups of users, taking into account all the software components in the pipeline. Each user produces only some of the data needed to understand the quality of the application for the group, so group performance metrics are obtained by combining summary statistics that each end computer periodically (and automatically) sends to a central server. The group quality metrics usually focus on medians and tail quantiles rather than on averages. Distributed quantile estimation is challenging, though, especially when passing large amounts of data around the network solely to compute quality metrics is undesirable. This paper describes an Incremental Quantile (IQ) estimation method that is designed for performance monitoring at arbitrary levels of network aggregation and time resolution when only a limited amount of data can be transferred. Applications to both real and simulated data are provided.
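
The flavor of incremental quantile estimation can be conveyed by a classical stochastic-approximation update that maintains a running estimate from a stream without storing the raw data; this is a generic illustration, not the specific IQ algorithm of the paper.

```python
# Generic streaming quantile tracker (stochastic approximation), illustrating the idea of
# updating a quantile estimate incrementally without retaining raw observations.
import numpy as np

def track_quantile(stream, p, step=1.0):
    """Return running estimates of the p-th quantile of a data stream."""
    q = None
    estimates = []
    for n, x in enumerate(stream, start=1):
        if q is None:
            q = x                                # initialize at the first observation
        else:
            q += (step / n) * (p - (x <= q))     # move up if x > q, down otherwise
        estimates.append(q)
    return np.array(estimates)

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50_000)   # simulated latency-like measurements
est = track_quantile(data, p=0.95, step=20.0)
print("tracked 95th percentile:", est[-1], " exact:", np.quantile(data, 0.95))
```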

Journal ArticleDOI
TL;DR: This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods.
Abstract: Modern business is rushing toward e-commerce. If the transition is done properly, it enables better management, new services, lower transaction costs and better customer relations. Success depends on skilled information technologists, among whom are statisticians. This paper focuses on some of the contributions that statisticians are making to help change the business world, especially through the development and application of data mining methods. This is a very large area, and the topics we cover are chosen to avoid overlap with other papers in this special issue, as well as to respect the limitations of our expertise. Inevitably, electronic commerce has raised and is raising fresh research problems in a very wide range of statistical areas, and we try to emphasize those challenges.

Journal ArticleDOI
TL;DR: In this paper, the authors share their experiences in collecting, validating, storing and analyzing large Internet-based data sets in the area of online auctions, music file sharing and online retailer pricing.
Abstract: Widespread e-commerce activity on the Internet has led to new opportunities to collect vast amounts of micro-level market and nonmarket data. In this paper we share our experiences in collecting, validating, storing and analyzing large Internet-based data sets in the area of online auctions, music file sharing and online retailer pricing. We demonstrate how such data can advance knowledge by facilitating sharper and more extensive tests of existing theories and by offering observational underpinnings for the development of new theories. Just as experimental economics pushed the frontiers of economic thought by enabling the testing of numerous theories of economic behavior in the environment of a controlled laboratory, we believe that observing, often over extended periods of time, real-world agents participating in market and nonmarket activity on the Internet can lead us to develop and test a variety of new theories. Internet data gathering is not controlled experimentation. We cannot randomly assign participants to treatments or determine event orderings. Internet data gathering does offer potentially large data sets with repeated observation of individual choices and action. In addition, the automated data collection holds promise for greatly reduced cost per observation. Our methods rely on technological advances in automated data collection agents. Significant challenges remain in developing appropriate sampling techniques, integrating data from heterogeneous sources in a variety of formats, constructing generalizable processes and understanding legal constraints. Despite these challenges, the early evidence from those who have harvested and analyzed large amounts of e-commerce data points toward a significant leap in our ability to understand the functioning of electronic commerce.

Journal ArticleDOI
TL;DR: The suggestion in the paper is that the field has advanced very little over the past ten or so years in spite of all of the excitement to the contrary.
Abstract: This paper provides a valuable service by asking us to reflect on recent developments in classification methodology to ascertain how far we have progressed and what remains to be done. The suggestion in the paper is that the field has advanced very little over the past ten or so years in spite of all of the excitement to the contrary. It is of course natural to become overenthusiastic about new methods. Academic disciplines are as susceptible to fads as any other endeavor. Statistics and machine learning are not exempt from this phenomenon. Often a new method is heavily championed by its developer(s) as the “magic bullet” that renders past methodology obsolete. Sometimes these arguments are accompanied by nontechnical metaphors such as brain biology, natural selection and human reasoning. The developers become gurus of a movement that eventually attracts disciples who in turn spread the word that a new dawn has emerged. All of this enthusiasm is infectious and the new method is adopted by practitioners who often uncritically assume that they are realizing benefits not afforded by previous methodology. Eventually realism sets in as the limitations of the newer methods emerge and they are placed in proper perspective. Such realism is often not immediately welcomed. Suggesting that an exciting new method may not bring as great an improvement as initially envisioned or that it may simply be a variation of existing methodology expressed in new vocabulary often elicits a strong reaction. Thus, the messengers who bring this news tend to be, at least initially, unpopular among their colleagues in the field. It therefore takes

Journal ArticleDOI
TL;DR: A family of interlinked errors is described in which an analysis uses an outcome of treatment as if it were a covariate measured before treatment; defining causal effects through potential outcomes makes such mistakes harder, since outcomes exist in several versions depending on the treatment whereas covariates exist in a single version.
Abstract: Donald Rubin’s lucid discussion of censoring by death comments on several issues: he warns against mistakes, describes obstacles to inference that might be surmounted within a given investigation, and discusses barriers to inference that direct attention to new data from outside the current investigation. Censoring by death creates outcomes that are defined only contingently, such as quality of life defined only for survivors. If the contingency is an outcome of treatment—if survival could be affected by the treatment—then, as Rubin demonstrates, it is a serious analytical mistake to act as if the contingency were a covariate, a variable unaffected by treatment, when studying the effect of the treatment on the contingently defined outcome. This is one instance of a family of interlinked errors in which an analysis uses an outcome of treatment as if it were a covariate measured before treatment. Other instances in this same family are adjusting for an outcome as if it were a covariate (Rosenbaum, 1984), or attempting to define an interaction effect between a treatment and an outcome of treatment (Rosenbaum, 2004). One of the several advantages of defining outcomes of treatment as comparisons of potential responses under alternative treatments (Neyman, 1923; Rubin, 1974) is that it becomes difficult to make these mistakes: outcomes exist in several versions depending upon the treatment, whereas covariates exist in a single version. Figure 1 depicts the mistake Rubin warns against. It is a simulated randomized experiment, with N = 650 subjects, of whom n = 325 were randomized to treatment, where 16 died, and m = 325 were randomized to control, where 111 died, and Figure 1 depicts quality of life scores for survivors. Beginning with the structure as Rubin develops it, I will propose a somewhat different analysis. In Section 2 notation describes a completely randomized experiment of the type depicted

Journal ArticleDOI
TL;DR: This rejoinder questions whether the advances, when taken in the context of real practical problems, are as great as is often claimed, and points to the limitations of the new methods to which Professor Friedman refers.
Abstract: I would like to thank the discussants for some very stimulating comments. Being only human, I am naturally pleased when others produce evidence or arguments in support of my contentions, but being a scientist, I am also pleased when others produce evidence or arguments against my proposals (although I may have to take a deep breath first), since this represents the scientific process in action. I should first make one thing clear: I agree with Professor Friedman that substantial advances have been made in recent years. Indeed, in my paper I remarked that "developments such as the bootstrap and other resampling approaches ... have led to significant advances in classification and other statistical models." However, what I question is whether the advances, when taken in the context of real practical problems, are as great as is often claimed, hence the recognition of the limitations of the new methods to which Professor Friedman refers.

Journal ArticleDOI
TL;DR: Newton's 1693 answer to a query from Samuel Pepys about a problem involving dice is discussed, and attention is drawn to an error he made.
Abstract: In 1693, Isaac Newton answered a query from Samuel Pepys about a problem involving dice. Newton’s analysis is discussed and attention is drawn to an error he made.
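
The query is generally identified as the Newton-Pepys problem: whether it is more likely to get at least one six when six dice are thrown, at least two sixes with twelve dice, or at least three sixes with eighteen. The binomial calculation below reproduces the standard answer (the six-dice event is the most probable); attributing this exact wording to the 1693 correspondence is my reading of the usual account, not a quotation from the paper.

```python
# Newton-Pepys problem: P(at least k sixes in 6k fair dice) for k = 1, 2, 3.
from math import comb

def at_least(k, n, p=1/6):
    """P(Binomial(n, p) >= k)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

for k in (1, 2, 3):
    print(f"at least {k} sixes in {6 * k} dice: {at_least(k, 6 * k):.4f}")
# -> roughly 0.6651, 0.6187, 0.5973: throwing six dice gives the best chance.
```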