Showing papers in "Technometrics in 2016"


Journal ArticleDOI
TL;DR: In this article, the authors showed that the computation of distance covariance and distance correlation of real-valued random variables can be done in O(n log n) time, and that their new unbiased estimator of squared distance covariance is a U-statistic.
Abstract: Distance covariance and distance correlation have been widely adopted in measuring dependence of a pair of random variables or random vectors. If the computation of distance covariance and distance correlation is implemented directly according to its definition, then its computational complexity is O(n²), which is a disadvantage compared to other faster methods. In this article we show that the computation of distance covariance and distance correlation of real-valued random variables can be implemented by an O(n log n) algorithm and this is comparable to other computationally efficient algorithms. The new formula we derive for an unbiased estimator for squared distance covariance turns out to be a U-statistic. This fact implies some nice asymptotic properties that were derived before via more complex methods. We apply the fast computing algorithm to some synthetic data. Our work will make distance correlation applicable to a much wider class of problems. A supplementary file to this article is available online.

117 citations
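As a point of reference for the fast algorithm described above, here is a minimal sketch (Python/NumPy, not from the article) of the direct O(n²) computation of squared sample distance covariance from its definition; the article's O(n log n) algorithm and unbiased U-statistic estimator are not reproduced here.

import numpy as np

def distance_covariance_sq_naive(x, y):
    """Squared sample distance covariance of two 1-D samples, O(n^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise distance matrices
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-centering: subtract row and column means, add back the grand mean
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    # Biased (V-statistic) version of squared distance covariance
    return (A * B).mean()

Squared distance correlation then follows as distance_covariance_sq_naive(x, y) divided by the square root of the product of the two marginal quantities.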


Journal ArticleDOI
Elizabeth D. Schifano1, Jing Wu1, Chun Wang1, Jun Yan1, Ming-Hui Chen1 
TL;DR: In this article, the authors present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data.
Abstract: We present statistical methods for big data arising from online analytical processing, where large amounts of data arrive in streams and require fast analysis without storage/access to the historical data. In particular, we develop iterative estimating algorithms and statistical inferences for linear models and estimating equations that update as new data arrive. These algorithms are computationally efficient, minimally storage-intensive, and allow for possible rank deficiencies in the subset design matrices due to rare-event covariates. Within the linear model setting, the proposed online-updating framework leads to predictive residual tests that can be used to assess the goodness of fit of the hypothesized model. We also propose a new online-updating estimator under the estimating equation setting. Theoretical properties of the goodness-of-fit tests and proposed estimators are examined in detail. In simulation studies and real data applications, our estimator compares favorably with competing approaches...

115 citations
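A minimal sketch of the online-updating idea for the linear-model case described above, assuming data arrive in chunks (X_k, y_k); only the accumulated sufficient statistics X'X and X'y are kept, so no historical data need be stored. This is illustrative only and does not reproduce the article's full framework (estimating equations, rank-deficiency handling for subset designs, predictive residual tests).

import numpy as np

class OnlineOLS:
    def __init__(self, p):
        self.xtx = np.zeros((p, p))
        self.xty = np.zeros(p)

    def update(self, X_k, y_k):
        # A new data chunk arrives: fold it into the sufficient statistics.
        self.xtx += X_k.T @ X_k
        self.xty += X_k.T @ y_k

    def coef(self):
        # Pseudo-inverse tolerates rank deficiency in the accumulated design.
        return np.linalg.pinv(self.xtx) @ self.xty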


Journal ArticleDOI
TL;DR: A new modeling, monitoring, and diagnosis framework is proposed for phase-I analysis of multichannel profiles, under the assumption that different profile channels have similar structure; the proposed approach shows good performance in identifying change-points in various situations compared with some existing methods.
Abstract: Process monitoring and fault diagnosis using profile data remains an important and challenging problem in statistical process control (SPC). Although the analysis of profile data has been extensively studied in the SPC literature, the challenges associated with monitoring and diagnosis of multichannel (multiple) nonlinear profiles are yet to be addressed. Motivated by an application in multioperation forging processes, we propose a new modeling, monitoring, and diagnosis framework for phase-I analysis of multichannel profiles. The proposed framework is developed under the assumption that different profile channels have similar structure so that we can gain strength by borrowing information from all channels. The multidimensional functional principal component analysis is incorporated into change-point models to construct monitoring statistics. Simulation results show that the proposed approach has good performance in identifying change-points in various situations compared with some existing methods. The ...

102 citations
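One building block named in the abstract, sketched in isolation as an illustration only: functional principal component analysis on profiles discretized to a common grid (one row per profile, an assumption of this sketch), computed via the SVD of the centered data matrix. The multichannel extension and the change-point monitoring statistics from the article are not shown.

import numpy as np

def functional_pca(profiles, n_components=3):
    """profiles: (n_profiles, n_grid_points) array on a common grid."""
    Y = profiles - profiles.mean(axis=0)          # center each grid point
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # FPC scores per profile
    components = Vt[:n_components]                     # discretized eigenfunctions
    return scores, components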


Journal ArticleDOI
TL;DR: In this article, a combination of response surface modeling, expected improvement, and the augmented Lagrangian numerical optimization framework is proposed to solve the problem of constrained black-box optimization.
Abstract: Constrained blackbox optimization is a difficult problem, with most approaches coming from the mathematical programming literature. The statistical literature is sparse, especially in addressing problems with nontrivial constraints. This situation is unfortunate because statistical methods have many attractive properties: global scope, handling noisy objectives, sensitivity analysis, and so forth. To narrow that gap, we propose a combination of response surface modeling, expected improvement, and the augmented Lagrangian numerical optimization framework. This hybrid approach allows the statistical model to think globally and the augmented Lagrangian to act locally. We focus on problems where the constraints are the primary bottleneck, requiring expensive simulation to evaluate and substantial modeling effort to map out. In that context, our hybridization presents a simple yet effective solution that allows existing objective-oriented statistical approaches, like those based on Gaussian process surrogates ...

85 citations
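A minimal sketch of an augmented Lagrangian outer loop for inequality constraints c_j(x) <= 0, with a generic local optimizer (SciPy's Nelder-Mead) standing in for the expected-improvement search over Gaussian process surrogates that the article uses for the inner subproblem. The multiplier and penalty updates below follow a standard augmented Lagrangian recipe and are illustrative, not the authors' exact schedule; function names and parameters are assumptions of this sketch.

import numpy as np
from scipy.optimize import minimize

def auglag_sketch(f, cons, x0, n_outer=20, rho=1.0):
    lam = np.zeros(len(cons))
    x = np.asarray(x0, dtype=float)
    for _ in range(n_outer):
        def AL(z):
            c = np.array([cj(z) for cj in cons])
            # Augmented Lagrangian: objective + multipliers + quadratic penalty
            return f(z) + lam @ c + np.sum(np.maximum(c, 0.0) ** 2) / (2.0 * rho)
        x = minimize(AL, x, method="Nelder-Mead").x   # inner subproblem
        c = np.array([cj(x) for cj in cons])
        lam = np.maximum(0.0, lam + c / rho)          # multiplier update
        if np.any(c > 0):                             # still infeasible:
            rho *= 0.5                                # tighten the penalty
    return x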


Journal ArticleDOI
TL;DR: A new nonparametric methodology for monitoring location parameters when only a small reference dataset is available and the key idea is to construct a series of conditionally distribution-free test statistics in the sense that their distributions are free of the underlying distribution given the empirical distribution functions.
Abstract: Monitoring multivariate quality variables or data streams remains an important and challenging problem in statistical process control (SPC). Although the multivariate SPC has been extensively studied in the literature, designing distribution-free control schemes is still challenging and yet to be addressed well. This article develops a new nonparametric methodology for monitoring location parameters when only a small reference dataset is available. The key idea is to construct a series of conditionally distribution-free test statistics in the sense that their distributions are free of the underlying distribution given the empirical distribution functions. The conditional probability that the charting statistic exceeds the control limit at present given that there is no alarm before the current time point can be guaranteed to attain a specified false alarm rate. The success of the proposed method lies in the use of data-dependent control limits, which are determined based on the observations online rather...

68 citations


Journal ArticleDOI
TL;DR: It is demonstrated that a three-level compromise plan with a small proportion of units allocated to the middle stress is, in general, a good strategy for ADT allocation, and the penalties of using nonoptimum allocation rules are addressed.
Abstract: The optimum allocation problem in accelerated degradation tests (ADTs) is an important task for reliability analysts. Several researchers have attempted to address this decision problem, but their results have been based only on specific degradation models. Therefore, they lack a unified approach toward general degradation models. This study proposes a class of exponential dispersion (ED) degradation models to overcome this difficulty. Assuming that the underlying degradation path comes from the ED class, we analytically derive the optimum allocation rules (by minimizing the asymptotic variance of the estimated q-quantile of the product's lifetime) for two-level and three-level ADT allocation problems, whether or not the testing stress levels are pre-fixed. For a three-level allocation problem, we show that all test units should be allocated into two out of three stresses, depending on certain specific conditions. Two examples are used to illustrate the proposed procedure. Furthermore, the penalties of using nonopt...

54 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a statistical approach that explicitly accounts for the space-time dependence of the data for annual global 3D temperature fields in an initial condition ensemble, which can be used to instantaneously reproduce the temperature fields with a substantial saving in storage and time.
Abstract: One of the main challenges when working with modern climate model ensembles is the increasingly larger size of the data produced, and the consequent difficulty in storing large amounts of spatio-temporally resolved information. Many compression algorithms can be used to mitigate this problem, but since they are designed to compress generic scientific datasets, they do not account for the nature of climate model output and they compress only individual simulations. In this work, we propose a different, statistics-based approach that explicitly accounts for the space-time dependence of the data for annual global three-dimensional temperature fields in an initial condition ensemble. The set of estimated parameters is small (compared to the data size) and can be regarded as a summary of the essential structure of the ensemble output; therefore, it can be used to instantaneously reproduce the temperature fields in an ensemble with a substantial saving in storage and time. The statistical model exploits the gri...

47 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that searching the space radially, that is, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative to the exhaustive and discrete search subroutine involved in building such local designs, which may be overly conservative.
Abstract: Recent implementations of local approximate Gaussian process models have pushed computational boundaries for nonlinear, nonparametric prediction problems, particularly when deployed as emulators for computer experiments. Their flavor of spatially independent computation accommodates massive parallelization, meaning that they can handle designs two or more orders of magnitude larger than previously. However, accomplishing that feat can still require massive computational horsepower. Here we aim to ease that burden. We study how predictive variance is reduced as local designs are built up for prediction. We then observe how the exhaustive and discrete nature of an important search subroutine involved in building such local designs may be overly conservative. Rather, we suggest that searching the space radially, that is, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative. Our empirical work demonstrates that ray-based search yields predictors with accur...

45 citations


Journal ArticleDOI
TL;DR: In this article, a new sparse PCA algorithm is presented, which is robust against outliers, based on the ROBPCA algorithm that generates robust but nonsparse loadings.
Abstract: A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.

39 citations


Journal ArticleDOI
TL;DR: An algorithm based on swarm intelligence is proposed to find E(s2)-optimal SSDs, with optimality verified by showing that they attain the theoretical lower bounds found in previous literature; the algorithm consistently produces SSDs that are at least as efficient as those from the traditional CP exchange method.
Abstract: Supersaturated designs (SSDs) are often used to reduce the number of experimental runs in screening experiments with a large number of factors. As more factors are used in the study, the search for an optimal SSD becomes increasingly challenging because of the large number of feasible selections of factor-level settings. This article tackles this discrete optimization problem via an algorithm based on swarm intelligence. Using the commonly used E(s2) criterion as an illustrative example, we propose an algorithm to find E(s2)-optimal SSDs and verify optimality by showing that they attain the theoretical lower bounds found in previous literature. We show that, in terms of computational effort and frequency of finding the E(s2)-optimal SSD, our algorithm consistently produces SSDs that are at least as efficient as those from the traditional CP exchange method, and that it also has good potential for finding D3-, D4-, and D5-optimal SSDs. Supplementary materials for this article are available online.

36 citations
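For concreteness, a minimal sketch of the E(s2) criterion itself (not the swarm-intelligence search from the article): for an n-by-m two-level supersaturated design X with entries +1/-1, E(s2) is the average of the squared off-diagonal entries of X'X.

import numpy as np

def e_s2(X):
    """E(s^2) criterion of a two-level design matrix with entries +/-1."""
    XtX = X.T @ X
    m = X.shape[1]
    off = XtX[np.triu_indices(m, k=1)]       # off-diagonal inner products s_ij
    return np.sum(off ** 2) / (m * (m - 1) / 2)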


Journal ArticleDOI
TL;DR: A probabilistic spatial-temporal model is described for analyzing local wind fields, constructed from measurements taken from a large number of turbines in a wind farm rather than from data aggregated into a single time series; the two modeling elements are found to benefit short-term wind speed forecasts.
Abstract: Turbine operations in a wind farm benefit from an understanding of the near-ground behavior of wind speeds. This article describes a probabilistic spatial-temporal model for analyzing local wind fields. Our model is constructed based on measurements taken from a large number of turbines in a wind farm, as opposed to aggregating the data into a single time-series. The model incorporates both temporal and spatial characteristics of wind speed data: in addition to using a time epoch mechanism to model temporal nonstationarity, our model identifies an informative neighborhood of turbines that are spatially related, and consequently, constructs an ensemble-like predictor using the data associated with the neighboring turbines. Using actual wind data measured at 200 wind turbines in a wind farm, we found that the two modeling elements benefit short-term wind speed forecasts. We also investigate the use of regime switching to account for the effect of wind direction and the use of geostrophic wind to account for...

Journal ArticleDOI
TL;DR: An order-constrained version of ℓ1-regularized regression (Lasso) is proposed, and it is shown how to solve it efficiently using the well-known pool adjacent violators algorithm as its proximal operator.
Abstract: We consider regression scenarios where it is natural to impose an order constraint on the coefficients. We propose an order-constrained version of l1-regularized regression (Lasso) for this problem, and show how to solve it efficiently using the well-known pool adjacent violators algorithm as its proximal operator. The main application of this idea is to time-lagged regression, where we predict an outcome at time t from features at the previous K time points. In this setting, it is natural to assume that the coefficients decay as we move farther away from t, and hence the order constraint is reasonable. Potential application areas include financial time series and prediction of dynamic patient outcomes based on clinical measurements. We illustrate this idea on real and simulated data.
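The pool adjacent violators algorithm (PAVA) mentioned above is a standard routine; the following is a minimal Python sketch of it, projecting a vector onto nondecreasing sequences in least squares (reverse the input to enforce the decaying order natural for time-lagged coefficients). How PAVA is embedded as the proximal operator inside the l1-regularized solver is not reproduced here.

def pava(y):
    """Least-squares projection of y onto nondecreasing sequences (equal weights)."""
    blocks = []                     # each block stored as [sum, count]
    for v in map(float, y):
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the current one
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)     # expand each pooled block to its mean
    return out

# Example: pava([3.0, 1.0, 2.0]) returns [2.0, 2.0, 2.0].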

Journal ArticleDOI
TL;DR: This article develops orthogonal blocking schemes for definitive screening designs, which are quite flexible in that the numbers of blocks may vary from two to the number of factors, and block sizes need not be equal.
Abstract: In earlier work, Jones and Nachtsheim proposed a new class of screening designs called definitive screening designs. As originally presented, these designs are three-level designs for quantitative factors that provide estimates of main effects that are unbiased by any second-order effect and require only one more than twice as many runs as there are factors. Definitive screening designs avoid direct confounding of any pair of second-order effects, and, for designs that have more than five factors, project to efficient response surface designs for any two or three factors. Recently, Jones and Nachtsheim expanded the applicability of these designs by showing how to include any number of two-level categorical factors. However, methods for blocking definitive screening designs have not been addressed. In this article we develop orthogonal blocking schemes for definitive screening designs. We separately consider the cases where all of the factors are quantitative and where there is a mix of quantitative and tw...

Journal ArticleDOI
TL;DR: The proposed designs are a kind of sliced Latin hypercube design with points clustered in the design region and possess good uniformity for each slice, which helps measure the similarities among responses of different level-combinations of the qualitative variables.
Abstract: Computer experiments have received a great deal of attention in many fields of science and technology. Most literature assumes that all the input variables are quantitative. However, researchers often encounter computer experiments involving both qualitative and quantitative variables (BQQV). In this article, a new interface on design and analysis for computer experiments with BQQV is proposed. The new designs are one kind of sliced Latin hypercube designs with points clustered in the design region and possess good uniformity for each slice. For computer experiments with BQQV, such designs help to measure the similarities among responses of different level-combinations in the qualitative variables. An adaptive analysis strategy intended for the proposed designs is developed. The proposed strategy allows us to automatically extract information from useful auxiliary responses to increase the precision of prediction for the target response. The interface between the proposed design and the analysis strategy ...

Journal ArticleDOI
TL;DR: An IM-based technique is employed to marginalize out the unknown parameters, yielding prior-free probabilistic prediction of future observables, which is expected to be a useful tool for practitioners.
Abstract: Prediction of future observations is a fundamental problem in statistics. Here we present a general approach based on the recently developed inferential model (IM) framework. We employ an IM-based technique to marginalize out the unknown parameters, yielding prior-free probabilistic prediction of future observables. Verifiable sufficient conditions are given for validity of our IM for prediction, and a variety of examples demonstrate the proposed method’s performance. Thanks to its generality and ease of implementation, we expect that our IM-based method for prediction will be a useful tool for practitioners. Supplementary materials for this article are available online.

Journal ArticleDOI
TL;DR: In this paper, the authors exploit the independent-increments structure of maximum likelihood estimators to produce complementary plots with greater interpretability, and suggest a simple likelihood-based procedure that allows for automated threshold selection.
Abstract: To model the tail of a distribution, one has to define the threshold above or below which an extreme value model produces a suitable fit. Parameter stability plots, whereby one plots maximum likelihood estimates of supposedly threshold-independent parameters against threshold, form one of the main tools for threshold selection by practitioners, principally due to their simplicity. However, one repeated criticism of these plots is their lack of interpretability, with pointwise confidence intervals being strongly dependent across the range of thresholds. In this article, we exploit the independent-increments structure of maximum likelihood estimators to produce complementary plots with greater interpretability, and suggest a simple likelihood-based procedure that allows for automated threshold selection. Supplementary materials for this article are available online.
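A minimal sketch of the raw material behind a parameter-stability plot: fit a generalized Pareto distribution to the exceedances over a grid of candidate thresholds and record the estimates (here via scipy.stats.genpareto with the location fixed at zero). The article's contribution, exploiting the independent-increments structure of the estimators and automating the threshold choice, is not reproduced; the threshold grid is an input chosen by the user.

import numpy as np
from scipy.stats import genpareto

def gpd_estimates_over_thresholds(data, thresholds):
    """Return an array of (threshold, shape, scale) GPD fits to exceedances."""
    data = np.asarray(data, dtype=float)
    est = []
    for u in thresholds:
        exc = data[data > u] - u                      # exceedances over u
        shape, _, scale = genpareto.fit(exc, floc=0.0)
        est.append((u, shape, scale))
    return np.array(est)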

Journal ArticleDOI
TL;DR: An efficient iterative algorithm is introduced that orthogonalizes a design matrix by adding new rows and then solves the original problem by embedding the augmented design in a missing-data framework; it is considerably faster than competing methods when n is much larger than p.
Abstract: We introduce an efficient iterative algorithm, intended for various least squares problems, based on a design of experiments perspective. The algorithm, called orthogonalizing EM (OEM), works for ordinary least squares (OLS) and can be easily extended to penalized least squares. The main idea of the procedure is to orthogonalize a design matrix by adding new rows and then solve the original problem by embedding the augmented design in a missing data framework. We establish several attractive theoretical properties concerning OEM. For the OLS with a singular regression matrix, an OEM sequence converges to the Moore-Penrose generalized inverse-based least squares estimator. For ordinary and penalized least squares with various penalties, it converges to a point having grouping coherence for fully aliased regression matrices. Convergence and the convergence rate of the algorithm are examined. Finally, we demonstrate that OEM is highly efficient for large-scale least squares and penalized least squares proble...
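A minimal sketch of the OEM-style update for ordinary least squares, under the reading that the implicit row augmentation turns X'X into a multiple d of the identity, which yields the closed-form iteration below. The penalized variants and the convergence results in the article are not shown, and choosing d as the largest eigenvalue of X'X is one illustrative option.

import numpy as np

def oem_ols(X, y, n_iter=500):
    xtx = X.T @ X
    xty = X.T @ y
    d = np.linalg.eigvalsh(xtx).max()    # any d >= largest eigenvalue of X'X works
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # Iterate beta <- (X'y + (dI - X'X) beta) / d
        beta = (xty + d * beta - xtx @ beta) / d
    return beta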

Journal ArticleDOI
TL;DR: A new approach is proposed to screen for active factorial effects from unreplicated factorial experiments; it uses the potential outcomes framework, is based on sequential posterior predictive model checks, and has the ability to broaden the standard definition of active effects and to link their definition to the population of interest.
Abstract: Unreplicated factorial designs have been widely used in scientific and industrial settings, when it is important to distinguish “active” or real factorial effects from “inactive” or noise factorial effects used to estimate residual or “error” terms. We propose a new approach to screen for active factorial effects from such experiments that uses the potential outcomes framework and is based on sequential posterior predictive model checks. One advantage of the proposed method is its ability to broaden the standard definition of active effects and to link their definition to the population of interest. Another important aspect of this approach is its conceptual connection to Fisherian randomization tests. Extensive simulation studies are conducted, which demonstrate the superiority of the proposed approach over existing ones in the situations considered.

Journal ArticleDOI
TL;DR: New quantile function estimators are introduced for spatial and temporal data, with a fused adaptive Lasso penalty to accommodate the dependence in space and time; they are suited to applications with features ordered in time or space and without replicated observations.
Abstract: Quantile functions are important in characterizing the entire probability distribution of a random variable, especially when the tail of a skewed distribution is of interest. This article introduces new quantile function estimators for spatial and temporal data with a fused adaptive Lasso penalty to accommodate the dependence in space and time. This method penalizes the difference among neighboring quantiles, hence it is desirable for applications with features ordered in time or space without replicated observations. The theoretical properties are investigated and the performances of the proposed methods are evaluated by simulations. The proposed method is applied to particulate matter (PM) data from the Community Multiscale Air Quality (CMAQ) model to characterize the upper quantiles, which are crucial for studying spatial association between PM concentrations and adverse human health effects.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a data-augmentation algorithm for truncated field return data with only returned failures available, based on an idea of revealing the hidden unobserved lifetimes.
Abstract: Field data are an important source of reliability information for many commercial products. Because field data are often collected by the maintenance department, information on failed and returned units is well maintained. Nevertheless, information on unreturned units is generally unavailable. The unavailability leads to truncation in the lifetime data. This study proposes a data-augmentation algorithm for this type of truncated field return data, with only returned failures available. The algorithm is based on an idea of revealing the hidden unobserved lifetimes. Theoretical justifications of the procedure for augmenting the hidden unobserved lifetimes are given. Moreover, the algorithm is iterative in nature. Asymptotic properties of the estimators from the iterations are investigated. Both point estimation and the information matrix of the parameters can be directly obtained from the algorithm. In addition, a by-product of the algorithm is a nonparametric estimator of the installation time distribution. An ...

Journal ArticleDOI
TL;DR: This article develops a Bayesian statistical calibration approach that is well suited to challenging, high-dimensional calibration problems and leverages recent ideas from Bayesian additive regression tree models to construct a random basis representation of the simulator outputs and observational data.
Abstract: Complex natural phenomena are increasingly investigated by the use of a complex computer simulator. To leverage the advantages of simulators, observational data need to be incorporated in a probabilistic framework so that uncertainties can be quantified. A popular framework for such experiments is the statistical computer model calibration experiment. A limitation often encountered in current statistical approaches for such experiments is the difficulty in modeling high-dimensional observational datasets and simulator outputs as well as high-dimensional inputs. As the complexity of simulators seems to only grow, this challenge will continue unabated. In this article, we develop a Bayesian statistical calibration approach that is ideally suited for such challenging calibration problems. Our approach leverages recent ideas from Bayesian additive regression tree models to construct a random basis representation of the simulator outputs and observational data. The approach can flexibly handle high-dimensional...

Journal Article
TL;DR: A data-augmentation algorithm for truncated field return data with only returned failures available is proposed, based on an idea of revealing the hidden unobserved lifetimes.
Abstract: Supplementary material to "Augmenting the Unreturned for Field Data With Information on Returned Failures Only"

Journal Article
TL;DR: This approach is based on a block-splitting variant of the alternating directions method of multipliers, carefully reconfigured to handle very large random feature matrices under memory constraints, while exploiting hybrid parallelism typically found in modern clusters of multicore machines.
Abstract: Supplementary material to "High-Performance Kernel Machines With Implicit Distributed Optimization and Randomization"

Journal ArticleDOI
TL;DR: In this paper, a sliced orthogonal array-based Latin hypercube design is proposed to achieve one- and two-dimensional uniformity, which can be used for uncertainty quantification of computer models, cross-validation, and efficient allocation of computing resources.
Abstract: We propose an approach for constructing a new type of design, called a sliced orthogonal array-based Latin hypercube design. This approach exploits a slicing structure of orthogonal arrays with strength two and makes use of sliced random permutations. Such a design achieves one- and two-dimensional uniformity and can be divided into smaller Latin hypercube designs with one-dimensional uniformity. Sampling properties of the proposed designs are derived. Examples are given for illustrating the construction method and corroborating the derived theoretical results. Potential applications of the constructed designs include uncertainty quantification of computer models, computer models with qualitative and quantitative factors, cross-validation and efficient allocation of computing resources. Supplementary materials for this article are available online.

Journal ArticleDOI
TL;DR: This article proposes a class of monotonic regression models, which consists of functional analysis of variance (FANOVA) decomposition components modeled with Bernstein polynomial bases for estimating quantiles as a function of multiple inputs.
Abstract: Quantile regression is an important tool to determine the quality level of service, product, and operation systems via stochastic simulation. It is frequently known that the quantiles of the output distribution are monotonic functions of certain inputs to the simulation model. Because there is typically high variability in estimation of tail quantiles, it can be valuable to incorporate this information in quantile modeling. However, the existing literature on monotone quantile regression with multiple inputs is sparse. In this article, we propose a class of monotonic regression models, which consists of functional analysis of variance (FANOVA) decomposition components modeled with Bernstein polynomial bases for estimating quantiles as a function of multiple inputs. The polynomial degrees of the bases for the model and the FANOVA components included in the model are selected by a greedy algorithm. Real examples demonstrate the advantages of incorporating the monotonicity assumption in quantile regression a...
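As a small illustration of one ingredient, here is a Bernstein polynomial basis on [0, 1]: a fit whose coefficients are nondecreasing in the index k is automatically nondecreasing in x, which is the kind of monotonicity constraint the FANOVA components described above can exploit. The quantile loss, the greedy degree and component selection, and the multivariate decomposition from the article are not reproduced; the function name and degree are assumptions of this sketch.

import numpy as np
from scipy.special import comb

def bernstein_basis(x, degree):
    """Evaluate the Bernstein basis B_{k,degree}(x) for x in [0, 1]."""
    x = np.asarray(x, dtype=float)[:, None]
    k = np.arange(degree + 1)[None, :]
    # B_{k,d}(x) = C(d, k) x^k (1 - x)^(d - k)
    return comb(degree, k) * x ** k * (1.0 - x) ** (degree - k)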

Journal ArticleDOI
TL;DR: A self-starting exponentially weighted moving average (EWMA) control scheme based on a parametric bootstrap method is proposed; it is useful in rare-event studies during the start-up stage of a monitoring process and has good in-control and out-of-control performance under various situations.
Abstract: In this article, we consider the problem of monitoring Poisson rates when the population sizes are time-varying and the nominal value of the process parameter is unavailable. Almost all previous control schemes for the detection of increases in the Poisson rate in Phase II are constructed based on assumed knowledge of the process parameters, for example, the expectation of the count of a rare event when the process of interest is in control. In practice, however, this parameter is usually unknown and not able to be estimated with a sufficiently large number of reference samples. A self-starting exponentially weighted moving average (EWMA) control scheme based on a parametric bootstrap method is proposed. The success of the proposed method lies in the use of probability control limits, which are determined based on the observations during rather than before monitoring. Simulation studies show that our proposed scheme has good in-control and out-of-control performance under various situations. In particular...
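A minimal sketch of the EWMA recursion applied to observed rates X_t / n_t for Poisson counts with time-varying population sizes n_t. The article's key ingredients, the self-starting parameter estimation and the bootstrap-based probability control limits, are deliberately not reproduced here; the smoothing constant lam is an illustrative choice.

import numpy as np

def ewma_rates(counts, sizes, lam=0.1):
    """EWMA of the observed rates counts[t] / sizes[t]."""
    rates = np.asarray(counts, dtype=float) / np.asarray(sizes, dtype=float)
    z = np.empty_like(rates)
    z[0] = rates[0]
    for t in range(1, len(rates)):
        # Standard EWMA recursion: Z_t = lam * rate_t + (1 - lam) * Z_{t-1}
        z[t] = lam * rates[t] + (1.0 - lam) * z[t - 1]
    return z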

Journal ArticleDOI
TL;DR: In this paper, an online in situ method is presented for identifying a reduced set of time steps of a simulation to save, as an alternative to the usual practice of saving evenly spaced time steps with spacing defined by the budget for storage and transfer.
Abstract: As computer simulations continue to grow in size and complexity, they present a particularly challenging class of big data problems. Many application areas are moving toward exascale computing systems, systems that perform 10^18 FLOPS (FLoating-point Operations Per Second)—a billion billion calculations per second. Simulations at this scale can generate output that exceeds both the storage capacity and the bandwidth available for transfer to storage, making post-processing and analysis challenging. One approach is to embed some analyses in the simulation while the simulation is running—a strategy often called in situ analysis—to reduce the need for transfer to storage. Another strategy is to save only a reduced set of time steps rather than the full simulation. Typically the selected time steps are evenly spaced, where the spacing can be defined by the budget for storage and transfer. This article combines these two ideas to introduce an online in situ method for identifying a reduced set of time steps of ...

Journal ArticleDOI
TL;DR: The so-called bootstrap Metropolis–Hastings (BMH) algorithm is proposed, which provides a general framework for how to tame powerful MCMC methods to be used for big data analysis, that is, to replace the full data log-likelihood by a Monte Carlo average of the log- likelihoods that are calculated in parallel from multiple bootstrap samples.
Abstract: Markov chain Monte Carlo (MCMC) methods have proven to be a very powerful tool for analyzing data of complex structures. However, their computer-intensive nature, which typically requires a large number of iterations and a complete scan of the full dataset for each iteration, precludes their use for big data analysis. In this article, we propose the so-called bootstrap Metropolis–Hastings (BMH) algorithm that provides a general framework for how to tame powerful MCMC methods to be used for big data analysis, that is, to replace the full data log-likelihood by a Monte Carlo average of the log-likelihoods that are calculated in parallel from multiple bootstrap samples. The BMH algorithm possesses an embarrassingly parallel structure and avoids repeated scans of the full dataset in iterations, and is thus feasible for big data problems. Compared to the popular divide-and-combine method, BMH can be generally more efficient as it can asymptotically integrate the whole data information into a single simulation r...
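A minimal sketch of the core substitution described above: a random-walk Metropolis sampler whose log-likelihood is an average over k bootstrap samples of the data (drawn once up front here, and evaluated sequentially rather than in parallel). The function names, the number of bootstrap samples, and the proposal scale are illustrative assumptions; the article's BMH algorithm includes refinements not reproduced in this sketch.

import numpy as np

def bmh_sketch(loglik, data, theta0, k=10, n_iter=5000, step=0.1, seed=None):
    """loglik(theta, sample) -> float; data: 1-D NumPy array."""
    rng = np.random.default_rng(seed)
    n = len(data)
    # Bootstrap samples drawn once up front (the paper evaluates them in parallel)
    boot = [data[rng.integers(0, n, size=n)] for _ in range(k)]

    def avg_loglik(theta):
        # Monte Carlo average of bootstrap-sample log-likelihoods
        return np.mean([loglik(theta, b) for b in boot])

    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    cur = avg_loglik(theta)
    draws = []
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.shape)
        new = avg_loglik(prop)
        if np.log(rng.uniform()) < new - cur:   # symmetric proposal: MH ratio
            theta, cur = prop, new
        draws.append(theta.copy())
    return np.array(draws)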

Journal Article
TL;DR: A self-starting exponentially weighted moving average (EWMA) control scheme based on a parametric bootstrap method is proposed for monitoring Poisson count data with varying population sizes.
Abstract: Supplementary material to "Self-Starting Monitoring Scheme for Poisson Count Data With Varying Population Sizes"

Journal ArticleDOI
TL;DR: In this article, a block-splitting variant of the alternating directions method of multipliers is proposed to handle very large random feature matrices under memory constraints, while exploiting hybrid parallelism typically found in modern clusters of multicore machines.
Abstract: We propose a framework for massive-scale training of kernel-based statistical models, based on combining distributed convex optimization with randomization techniques. Our approach is based on a block-splitting variant of the alternating directions method of multipliers, carefully reconfigured to handle very large random feature matrices under memory constraints, while exploiting hybrid parallelism typically found in modern clusters of multicore machines. Our high-performance implementation supports a variety of statistical learning tasks by enabling several loss functions, regularization schemes, kernels, and layers of randomized approximations for both dense and sparse datasets, in an extensible framework. We evaluate our implementation on large-scale model construction tasks and provide a comparison against existing sequential and parallel libraries. Supplementary materials for this article are available online.
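One ingredient mentioned above, sketched in isolation: random Fourier features approximating a Gaussian (RBF) kernel, so that a kernel model can be fit as a linear model on the resulting feature matrix. The distributed block-splitting ADMM solver described in the article is not reproduced; an ordinary least-squares solve stands in, and the function name and parameters are assumptions of this sketch.

import numpy as np

def random_fourier_features(X, n_features=500, gamma=1.0, seed=None):
    """Features Z with Z Z' approximating the kernel exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Usage: Z = random_fourier_features(X_train); then fit a (regularized) linear
# model on Z, e.g. np.linalg.lstsq(Z, y_train, rcond=None), instead of working
# with the n-by-n kernel matrix.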