
Showing papers in "Journal of The Royal Statistical Society Series B-statistical Methodology in 2006"


Journal ArticleDOI
TL;DR: In this paper, instead of selecting factors by stepwise backward elimination, the authors focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection.
Abstract: Summary. We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor analysis-of-variance problem as the most important and well-known example. Instead of selecting factors by stepwise backward elimination, we focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection. The lasso, the LARS algorithm and the non-negative garrotte are recently proposed regression methods that can be used to select individual variables. We study and propose efficient algorithms for the extensions of these methods for factor selection and show that these extensions give superior performance to the traditional stepwise backward elimination method in factor selection problems. We study the similarities and the differences between these methods. Simulations and real examples are used to illustrate the methods.

7,400 citations
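
A minimal sketch of the grouped-selection idea behind these extensions, a group lasso fitted by proximal gradient descent with block soft-thresholding, is given below. It is an illustration on simulated data, not the authors' algorithms; all names and tuning values are made up for the example.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Proximal operator of t * ||v||_2 (block soft-thresholding)."""
    norm = np.linalg.norm(v)
    if norm <= t:
        return np.zeros_like(v)
    return (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal-gradient fit of (1/2n)||y - Xb||^2 + lam * sum_g ||b_g||_2."""
    n, p = X.shape
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)   # 1 / Lipschitz constant of the gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        for g in np.unique(groups):
            idx = groups == g
            beta[idx] = group_soft_threshold(z[idx], step * lam)
    return beta

# Toy data: two factors, each represented by a block of three columns; only the first is active.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
beta_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=100)
groups = np.array([0, 0, 0, 1, 1, 1])
print(np.round(group_lasso(X, y, groups, lam=0.3), 2))   # second factor's block is shrunk to (near) zero
```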


Journal ArticleDOI
TL;DR: In this paper, the authors propose a methodology to sample sequentially from a sequence of probability distributions that are defined on a common space, each distribution being known up to a normalizing constant.
Abstract: Summary. We propose a methodology to sample sequentially from a sequence of probability distributions that are defined on a common space, each distribution being known up to a normalizing constant. These probability distributions are approximated by a cloud of weighted random samples which are propagated over time by using sequential Monte Carlo methods. This methodology allows us to derive simple algorithms to make parallel Markov chain Monte Carlo algorithms interact to perform global optimization and sequential Bayesian estimation and to compute ratios of normalizing constants. We illustrate these algorithms for various integration tasks arising in the context of Bayesian inference.

1,684 citations
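
The sketch below illustrates the core of such a sampler on a toy one-dimensional tempering sequence: reweight, resample, move with a Markov kernel, and accumulate a ratio of normalizing constants. The targets, schedule and move kernel are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_g0(x):                 # unnormalised N(0, 3^2): Z0 = 3*sqrt(2*pi)
    return -x**2 / (2 * 9.0)

def log_g1(x):                 # unnormalised N(2, 1):   Z1 = sqrt(2*pi)
    return -(x - 2.0)**2 / 2.0

N = 2000
phis = np.linspace(0.0, 1.0, 21)          # tempering schedule between the two targets
x = rng.normal(0.0, 3.0, size=N)          # particle cloud drawn from pi_0
log_ratio = 0.0                           # accumulates log(Z1/Z0)

for phi_prev, phi in zip(phis[:-1], phis[1:]):
    # incremental importance weights for moving from phi_prev to phi
    logw = (phi - phi_prev) * (log_g1(x) - log_g0(x))
    log_ratio += np.log(np.mean(np.exp(logw)))
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # multinomial resampling
    x = x[rng.choice(N, size=N, p=w)]
    # one random-walk Metropolis move targeting the current tempered density
    prop = x + rng.normal(0.0, 1.0, size=N)
    log_acc = ((1 - phi) * (log_g0(prop) - log_g0(x))
               + phi * (log_g1(prop) - log_g1(x)))
    accept = np.log(rng.uniform(size=N)) < log_acc
    x = np.where(accept, prop, x)

print(np.exp(log_ratio), "vs true Z1/Z0 =", 1.0 / 3.0)
```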


Journal ArticleDOI
TL;DR: Monte Carlo methods are proposed, which build on recent advances on the exact simulation of diffusions, for performing maximum likelihood and Bayesian estimation for discretely observed diffusions.
Abstract: Summary. The objective of the paper is to present a novel methodology for likelihood-based inference for discretely observed diffusions. We propose Monte Carlo methods, which build on recent advances on the exact simulation of diffusions, for performing maximum likelihood and Bayesian estimation.

423 citations


Journal ArticleDOI
TL;DR: In this paper, it is shown how the properties of functional principal component analysis can be elucidated through stochastic expansions and related results, which can be used to explore properties of existing methods and to suggest new techniques.
Abstract: Summary. Functional data analysis is intrinsically infinite dimensional; functional principal component analysis reduces dimension to a finite level, and points to the most significant components of the data. However, although this technique is often discussed, its properties are not as well understood as they might be. We show how the properties of functional principal component analysis can be elucidated through stochastic expansions and related results. Our approach quantifies the errors that arise through statistical approximation, in successive terms of orders n−1/2, n−1, n−3/2, …, where n denotes sample size. The expansions show how spacings among eigenvalues impact on statistical performance. The term of size n−1/2 illustrates first-order properties and leads directly to limit theory which describes the dominant effect of spacings. Thus, for example, spacings are seen to have an immediate, first-order effect on properties of eigenfunction estimators, but only a second-order effect on eigenvalue estimators. Our results can be used to explore properties of existing methods, and also to suggest new techniques. In particular, we suggest bootstrap methods for constructing simultaneous confidence regions for an infinite number of eigenvalues, and also for individual eigenvalues and eigenvectors.

420 citations
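
As a rough companion to the suggestion of bootstrap confidence regions, the sketch below computes percentile bootstrap intervals for the two leading eigenvalues of the empirical covariance of simulated curves. It is an assumption-laden illustration (crude quadrature, pointwise rather than simultaneous intervals), not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated functional data: n curves on a grid, two dominant components (variances 4 and 1).
n, m = 200, 50
t = np.linspace(0, 1, m)
scores = rng.normal(size=(n, 2)) * np.array([2.0, 1.0])
basis = np.vstack([np.sqrt(2) * np.sin(np.pi * t), np.sqrt(2) * np.sin(2 * np.pi * t)])
curves = scores @ basis + 0.1 * rng.normal(size=(n, m))

def leading_eigenvalues(X, k=2):
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(X)
    vals = np.linalg.eigvalsh(cov)[::-1][:k]
    return vals / (m - 1)          # crude quadrature: eigenvalues on the function scale

est = leading_eigenvalues(curves)
boot = np.array([leading_eigenvalues(curves[rng.integers(0, n, n)]) for _ in range(500)])
lo, hi = np.percentile(boot, [2.5, 97.5], axis=0)
for j in range(2):
    print(f"eigenvalue {j+1}: {est[j]:.3f}  95% CI ({lo[j]:.3f}, {hi[j]:.3f})")
```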


Journal ArticleDOI
TL;DR: New methodology is presented that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach, which is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework.
Abstract: Increasingly, scientific studies yield functional data, in which the ideal units of observation are curves and the observed data consist of sets of curves that are sampled on a fine grid. We present new methodology that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach. This method is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework. It yields nonparametric estimates of the fixed and random-effects functions as well as the various between-curve and within-curve covariance matrices. The functional fixed effects are adaptively regularized as a result of the non-linear shrinkage prior that is imposed on the fixed effects' wavelet coefficients, and the random-effect functions experience a form of adaptive regularization because of the separately estimated variance components for each wavelet coefficient. Because we have posterior samples for all model quantities, we can perform pointwise or joint Bayesian inference or prediction on the quantities of the model. The adaptiveness of the method makes it especially appropriate for modelling irregular functional data that are characterized by numerous local features like peaks.

408 citations


Journal ArticleDOI
TL;DR: In this paper, a score test on a hyperparameter in an empirical Bayesian model was proposed as an alternative to classical tests for high-dimensional data, where there are more parameters than observations.
Abstract: Summary. As the dimensionality of the alternative hypothesis increases, the power of classical tests tends to diminish quite rapidly. This is especially true for high dimensional data in which there are more parameters than observations. We discuss a score test on a hyperparameter in an empirical Bayesian model as an alternative to classical tests. It gives a general test statistic which can be used to test a point null hypothesis against a high dimensional alternative, even when the number of parameters exceeds the number of samples. This test will be shown to have optimal power on average in a neighbourhood of the null hypothesis, which makes it a proper generalization of the locally most powerful test to multiple dimensions. To illustrate this new locally most powerful test we investigate the case of testing the global null hypothesis in a linear regression model in more detail. The score test is shown to have significantly more power than the F-test whenever under the alternative the large variance principal components of the design matrix explain substantially more of the variance of the outcome than do the small variance principal components. The score test is also useful for detecting sparse alternatives in truly high dimensional data, where its power is comparable with the test based on the maximum absolute t-statistic.

246 citations
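
A simplified, permutation-calibrated score-type statistic of this flavour (for the global null in a linear regression with more covariates than observations) is sketched below; the paper's exact statistic, its empirical Bayes derivation and its asymptotic null distribution are not reproduced, and the data here are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def score_statistic(X, y):
    """Score-type statistic for the global null beta = 0: yc' X X' yc / (yc' yc)."""
    yc = y - y.mean()
    return (yc @ X) @ (X.T @ yc) / (yc @ yc)

# p > n example: 200 covariates, 50 observations, weak signal spread over many coefficients.
n, p = 50, 200
X = rng.normal(size=(n, p))
beta = rng.normal(scale=0.1, size=p)
y = X @ beta + rng.normal(size=n)

obs = score_statistic(X, y)
perm = np.array([score_statistic(X, rng.permutation(y)) for _ in range(2000)])
print("permutation p-value:", np.mean(perm >= obs))
```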


Journal ArticleDOI
TL;DR: An arithmetic of arrays is developed which allows us to define the expectation of the data array as a sequence of nested matrix operations on a coefficient array, and it is shown how this arithmetic leads to low storage, high speed computation in the scoring algorithm of the generalized linear model.
Abstract: Summary. Data with an array structure are common in statistics, and the design or regression matrix for analysis of such data can often be written as a Kronecker product. Factorial designs, contingency tables and smoothing of data on multidimensional grids are three such general classes of data and models. In such a setting, we develop an arithmetic of arrays which allows us to define the expectation of the data array as a sequence of nested matrix operations on a coefficient array. We show how this arithmetic leads to low storage, high speed computation in the scoring algorithm of the generalized linear model. We refer to a generalized linear array model and apply the methodology to the smoothing of multidimensional arrays. We illustrate our procedure with the analysis of three data sets: mortality data indexed by age at death and year of death, spatially varying microarray background data and disease incidence data indexed by age at death, year of death and month of death.

221 citations
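
The identity that underlies this arithmetic in the two-dimensional case can be checked numerically: a Kronecker-product linear predictor equals a sequence of nested matrix operations on the coefficient array, so the Kronecker product never needs to be formed. The small matrices below are arbitrary and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
B1 = rng.normal(size=(7, 3))      # marginal basis for dimension 1
B2 = rng.normal(size=(5, 4))      # marginal basis for dimension 2
Theta = rng.normal(size=(3, 4))   # coefficient array

# Naive approach: build the full Kronecker design matrix (prohibitively large in real problems).
eta_naive = np.kron(B2, B1) @ Theta.ravel(order="F")

# Array approach: nested matrix operations on the coefficient array, no Kronecker product formed.
eta_array = (B1 @ Theta @ B2.T).ravel(order="F")

print(np.allclose(eta_naive, eta_array))   # True
```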


Journal ArticleDOI
TL;DR: In this paper, empirical-likelihood-based inference for the parameters in a partially linear single-index model is investigated; a bias correction is proposed so that the empirical likelihood ratio attains the standard χ²-limit, and the norm-1 constraint on the index is exploited to sharpen the confidence regions.
Abstract: Summary. Empirical-likelihood-based inference for the parameters in a partially linear single-index model is investigated. Unlike existing empirical likelihood procedures for other simpler models, if there is no bias correction the limit distribution of the empirical likelihood ratio cannot be asymptotically tractable. To attack this difficulty we propose a bias correction to achieve the standard χ²-limit. The bias-corrected empirical likelihood ratio shares some of the desired features of the existing least squares method: the estimation of the parameters is not needed; when estimating nonparametric functions in the model, undersmoothing for ensuring √n-consistency of the estimator of the parameters is avoided; the bias-corrected empirical likelihood is self-scale invariant and no plug-in estimator for the limiting variance is needed. Furthermore, since the index is of norm 1, we use this constraint as information to increase the accuracy of the confidence regions (smaller regions at the same nominal level). As a by-product, our approach of using bias correction may also shed light on nonparametric estimation in model checking for other semiparametric regression models. A simulation study is carried out to assess the performance of the bias-corrected empirical likelihood. An application to a real data set is illustrated.

215 citations


Journal ArticleDOI
TL;DR: In this paper, bootstrap-based methods are developed for constructing bias-corrected mean-squared error estimators and prediction intervals in small area estimation; unlike existing techniques based on Taylor expansions, they require no analytical calculation and apply to general two-level models.
Abstract: Summary. The particularly wide range of applications of small area prediction, e.g. in policy making decisions, has meant that this topic has received substantial attention in recent years. The problems of estimating mean-squared predictive error, of correcting that estimator for bias and of constructing prediction intervals have been addressed by various workers, although existing methodology is still restricted to a narrow range of models. To overcome this difficulty we develop new, bootstrap-based methods, which are applicable in very general settings, for constructing bias-corrected estimators of mean-squared error and for computing prediction regions. Unlike existing techniques, which are based largely on Taylor expansions, our bias-corrected mean-squared error estimators do not require analytical calculation. They also have the property that they are non-negative. Our prediction intervals have a high degree of coverage accuracy, O(n−3), where n is the number of areas, if double-bootstrap methods are employed. The techniques do not depend on the form of the small area estimator and are applicable to general two-level, small area models, where the variables at either level can be discrete or continuous and, in particular, can be non-normal. Most importantly, the new methods are simple and easy to apply.

204 citations


Journal ArticleDOI
TL;DR: The authors show that conditional maximum likelihood eliminates the bias that arises when the random effects in a generalized linear mixed model are correlated with a predictor, and that partitioning the covariate into between- and within-cluster components, with separate model terms for each, also removes the source of the bias.
Abstract: Summary. We consider the situation where the random effects in a generalized linear mixed model may be correlated with one of the predictors, which leads to inconsistent estimators. We show that conditional maximum likelihood can eliminate this bias. Conditional likelihood leads naturally to the partitioning of the covariate into between- and within-cluster components and models that include separate terms for these components also eliminate the source of the bias. Another viewpoint that we develop is the idea that many violations of the assumptions (including correlation between the random effects and a covariate) in a generalized linear mixed model may be cast as misspecified mixing distributions. We illustrate the results with two examples and simulations.

193 citations
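
The between/within partition can be illustrated with a linear (identity-link) toy example rather than a full generalized linear mixed model: when the cluster-level random effect is correlated with the covariate, the pooled slope is biased but the within-cluster slope is not. The simulation below is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(5)
n_clusters, n_per = 200, 10
beta_within = 1.0                                        # true covariate effect

cluster_mean_x = rng.normal(size=n_clusters)
u = 2.0 * cluster_mean_x + rng.normal(size=n_clusters)   # random intercept correlated with x
cluster = np.repeat(np.arange(n_clusters), n_per)
x = cluster_mean_x[cluster] + rng.normal(size=n_clusters * n_per)
y = beta_within * x + u[cluster] + rng.normal(size=n_clusters * n_per)

def ols_slope(x, y):
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / (xc @ xc)

# Naive pooled regression ignores the covariate/random-effect correlation and is biased (about 2 here).
print("pooled OLS slope:    ", round(ols_slope(x, y), 2))

# Partition x into between- and within-cluster components; the within-cluster slope is unconfounded.
x_bar = np.array([x[cluster == c].mean() for c in range(n_clusters)])[cluster]
y_bar = np.array([y[cluster == c].mean() for c in range(n_clusters)])[cluster]
print("within-cluster slope:", round(ols_slope(x - x_bar, y - y_bar), 2))
```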


Journal ArticleDOI
TL;DR: In this paper, an iterative estimation procedure for performing functional principal component analysis is proposed, which aims at functional or longitudinal data where the repeated measurements from the same subject are correlated, and the resulting data after iteration are theoretically shown to be asymptotically equivalent (in probability) to a set of independent data.
Abstract: Summary. We propose an iterative estimation procedure for performing functional principal component analysis. The procedure aims at functional or longitudinal data where the repeated measurements from the same subject are correlated. An increasingly popular smoothing approach, penalized spline regression, is used to represent the mean function. This allows straightforward incorporation of covariates and simple implementation of approximate inference procedures for coefficients. For the handling of the within-subject correlation, we develop an iterative procedure which reduces the dependence between the repeated measurements that are made for the same subject. The resulting data after iteration are theoretically shown to be asymptotically equivalent (in probability) to a set of independent data. This suggests that the general theory of penalized spline regression that has been developed for independent data can also be applied to functional data. The effectiveness of the proposed procedure is demonstrated via a simulation study and an application to yeast cell cycle gene expression data.

Journal ArticleDOI
TL;DR: The nonparametric Bayes model extends the scope of traditional Bayes wavelet methods to functional clustering and allows the elicitation of prior belief about the regularity of the functions and the number of clusters by suitably mixing the Dirichlet processes.
Abstract: Summary. We propose a nonparametric Bayes wavelet model for clustering of functional data. The wavelet-based methodology is aimed at the resolution of generic global and local features during clustering and is suitable for clustering high dimensional data. Based on the Dirichlet process, the nonparametric Bayes model extends the scope of traditional Bayes wavelet methods to functional clustering and allows the elicitation of prior belief about the regularity of the functions and the number of clusters by suitably mixing the Dirichlet processes. Posterior inference is carried out by Gibbs sampling with conjugate priors, which makes the computation straightforward. We use simulated as well as real data sets to illustrate the suitability of the approach over other alternatives.

Journal ArticleDOI
TL;DR: In this article, a wavelet-based nonparametric regression approach is proposed to estimate conditional expectations by using appropriate wavelet decompositions of the segmented sample paths, and a notion of similarity is used to calibrate the prediction.
Abstract: Summary. We consider the prediction problem of a time series on a whole time interval in terms of its past. The approach that we adopt is based on functional kernel nonparametric regression estimation techniques where observations are discrete recordings of segments of an underlying stochastic process considered as curves. These curves are assumed to lie within the space of continuous functions, and the discretized time series data set consists of a relatively small, compared with the number of segments, number of measurements made at regular times. We estimate conditional expectations by using appropriate wavelet decompositions of the segmented sample paths. A notion of similarity, based on wavelet decompositions, is used to calibrate the prediction. Asymptotic properties when the number of segments grows to ∞ are investigated under mild conditions, and a nonparametric resampling procedure is used to generate, in a flexible way, valid asymptotic pointwise prediction intervals for the trajectories predicted. We illustrate the usefulness of the proposed functional wavelet–kernel methodology in finite sample situations by means of a simulated example and two real life data sets, and we compare the resulting predictions with those obtained by three other methods in the literature, in particular with a smoothing spline method, with an exponential smoothing procedure and with a seasonal autoregressive integrated moving average model.

Journal ArticleDOI
TL;DR: This article proposed profile kernel and backfitting estimation methods for a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function.
Abstract: Summary. The paper considers a wide class of semiparametric problems with a parametric part for some covariate effects and repeated evaluations of a nonparametric function. Special cases in our approach include marginal models for longitudinal or clustered data, conditional logistic regression for matched case–control studies, multivariate measurement error models, generalized linear mixed models with a semiparametric component, and many others. We propose profile kernel and backfitting estimation methods for these problems, derive their asymptotic distributions and show that in likelihood problems the methods are semiparametric efficient. Although generally not true, it transpires that with our methods profiling and backfitting are asymptotically equivalent. We also consider pseudolikelihood methods where some nuisance parameters are estimated from a different algorithm. The methods proposed are evaluated by using simulation studies and applied to the Kenya haemoglobin data.

Journal ArticleDOI
TL;DR: In this paper, the bias of the multiple-imputation variance estimator for data that are collected with a complex sample design is analysed; the bias may be sizable for certain estimators, such as domain means, when a large fraction of the values are imputed, and a bias-adjusted variance estimator is suggested.
Abstract: Summary. Multiple imputation is a method of estimating the variances of estimators that are constructed with some imputed data. We give an expression for the bias of the multiple-imputation variance estimator for data that are collected with a complex sample design. The bias may be sizable for certain estimators, such as domain means, when a large fraction of the values are imputed. A bias-adjusted variance estimator is suggested.
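
For reference, the multiple-imputation variance estimator whose bias the paper analyses is the usual Rubin combining rule, sketched below with made-up numbers; the paper's bias expression and adjustment for complex designs are not reproduced here.

```python
import numpy as np

def multiple_imputation_variance(estimates, variances):
    """Rubin's combining rules: T = Ubar + (1 + 1/m) * B over m completed-data analyses."""
    estimates, variances = np.asarray(estimates), np.asarray(variances)
    m = len(estimates)
    qbar = estimates.mean()                       # combined point estimate
    ubar = variances.mean()                       # average within-imputation variance
    b = estimates.var(ddof=1)                     # between-imputation variance
    return qbar, ubar + (1.0 + 1.0 / m) * b

# Illustrative numbers: point estimates and variances from m = 5 imputed data sets.
qbar, total_var = multiple_imputation_variance(
    estimates=[10.2, 9.8, 10.5, 10.1, 9.9],
    variances=[0.40, 0.38, 0.42, 0.41, 0.39])
print(qbar, total_var)
```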

Journal ArticleDOI
TL;DR: In this paper, the authors considered the analysis of three-arm randomized trials with noncompliance and derived sharp bounds for the causal effects within principal strata for binary outcomes and constructed confidence intervals to cover the identification regions.
Abstract: Summary. The paper considers the analysis of three-arm randomized trials with non-compliance. In these trials, the average causal effects of treatments within principal strata of compliance behaviour are of interest for better understanding the effect of the treatment. Unfortunately, even with the usual assumptions, the average causal effects of treatments within principal strata are not point identified. However, the observable data do provide useful information on the bounds of the identification regions of the parameters of interest. Under two sets of assumptions, we derive sharp bounds for the causal effects within principal strata for binary outcomes and construct confidence intervals to cover the identification regions. The methods are illustrated by an analysis of data from a randomized study of treatments for alcohol dependence.

Journal ArticleDOI
TL;DR: An efficient randomized response model that can easily be adjusted to be more efficient than the Warner, Mangat and Singh, and Mangat methods by selecting certain parameters of the proposed randomization device is suggested.
Abstract: Summary. We suggest an efficient randomized response model that can easily be adjusted to be more efficient than the Warner, Mangat and Singh, and Mangat methods by selecting certain parameters of the proposed randomization device.
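
For context, the sketch below simulates Warner's original randomization device and its standard estimator, one of the baselines against which the proposed model is compared; the paper's more efficient device is not reproduced, and all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def warner_estimate(yes_rate, p):
    """Warner (1965) estimator of the sensitive proportion from the observed 'yes' rate."""
    return (yes_rate - (1.0 - p)) / (2.0 * p - 1.0)

# Warner's device: with probability p the sensitive statement is presented, otherwise its
# complement; only the 'yes'/'no' answer is observed, never which statement was answered.
n, p, pi_true = 5000, 0.7, 0.30
sensitive = rng.uniform(size=n) < pi_true
asked_sensitive = rng.uniform(size=n) < p
answer_yes = np.where(asked_sensitive, sensitive, ~sensitive)

lam_hat = answer_yes.mean()
pi_hat = warner_estimate(lam_hat, p)
var_hat = lam_hat * (1 - lam_hat) / (n * (2 * p - 1) ** 2)
print(round(pi_hat, 3), "+/-", round(1.96 * np.sqrt(var_hat), 3))
```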

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of assessing the finite dimensionality of functional data observed with noise and suggest bootstrap-based techniques for doing so, under the assumption that the signal is finite dimensional.
Abstract: Summary. If a problem in functional data analysis is low dimensional then the methodology for its solution can often be reduced to relatively conventional techniques in multivariate analysis. Hence, there is intrinsic interest in assessing the finite dimensionality of functional data. We show that this problem has several unique features. From some viewpoints the problem is trivial, in the sense that continuously distributed functional data which are exactly finite dimensional are immediately recognizable as such, if the sample size is sufficiently large. However, in practice, functional data are almost always observed with noise, for example, resulting from rounding or experimental error. Then the problem is almost insolubly difficult. In such cases a part of the average noise variance is confounded with the true signal and is not identifiable. However, it is possible to define the unconfounded part of the noise variance. This represents the best possible lower bound to all potential values of average noise variance and is estimable in low noise settings. Moreover, bootstrap methods can be used to describe the reliability of estimates of unconfounded noise variance, under the assumption that the signal is finite dimensional. Motivated by these ideas, we suggest techniques for assessing the finiteness of dimensionality. In particular, we show how to construct a critical point vq such that, if the distribution of our functional data has fewer than q - 1 degrees of freedom, then we should be willing to assume that the average variance of the added noise is at least vq. If this level seems too high then we must conclude that the dimension is at least q - 1. We show that simpler, more conventional techniques, based on hypothesis testing, are generally not effective.

Journal ArticleDOI
TL;DR: In this paper, a novel technique for simulating from the exact distribution of a continuous time Markov chain over an interval, given the start and end states and the infinitesimal generator, is used to construct a Gibbs sampler for the hidden chain of a Markov modulated Poisson process.
Abstract: A Markov modulated Poisson process (MMPP) is a Poisson process whose intensity varies according to a Markov process. We present a novel technique for simulating from the exact distribution of a continuous time Markov chain over an interval given the start and end states and the infinitesimal generator, and use this to create a Gibbs sampler which samples from the exact distribution of the hidden Markov chain in an MMPP. We apply the Gibbs sampler to modelling the occurrence of a rare DNA motif (the Chi site) and to inferring regions of the genome with evidence of high or low intensities for occurrences of this site.
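
The sketch below draws a continuous-time Markov chain path conditioned on its start and end states by naive rejection of forward simulations. This is exact but inefficient, and is only meant to make the target of the paper's direct simulation technique concrete; the generator and interval are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

def sample_ctmc_path(Q, start, T):
    """Forward-simulate a continuous-time Markov chain with generator Q on [0, T]."""
    times, states = [0.0], [start]
    t, s = 0.0, start
    while True:
        rate = -Q[s, s]
        t += rng.exponential(1.0 / rate)
        if t >= T:
            return times, states            # states[i] is occupied from times[i] onwards
        probs = Q[s].copy()
        probs[s] = 0.0
        s = rng.choice(len(Q), p=probs / probs.sum())
        times.append(t)
        states.append(s)

def sample_ctmc_bridge(Q, start, end, T, max_tries=100000):
    """Rejection sampler: keep forward paths that occupy `end` at time T (exact but slow)."""
    for _ in range(max_tries):
        times, states = sample_ctmc_path(Q, start, T)
        if states[-1] == end:
            return times, states
    raise RuntimeError("no accepted path")

Q = np.array([[-1.0, 1.0],
              [ 2.0, -2.0]])                # two-state generator, illustrative values
times, states = sample_ctmc_bridge(Q, start=0, end=1, T=1.0)
print(list(zip(np.round(times, 3), states)))
```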

Journal ArticleDOI
TL;DR: In this article, the authors proposed a new "Haar-Fisz" technique for estimating the time-varying, piecewise constant local variance of a locally stationary Gaussian time series.
Abstract: Summary. We propose a new 'Haar-Fisz' technique for estimating the time-varying, piecewise constant local variance of a locally stationary Gaussian time series. We apply our technique to the estimation of the spectral structure in the locally stationary wavelet model. Our method combines Haar wavelets and the variance stabilizing Fisz transform. The resulting estimator is mean square consistent, rapidly computable and easy to implement, and performs well in practice. We also introduce the 'Haar-Fisz transform', a device for stabilizing the variance of scaled χ²-data and bringing their distribution close to Gaussianity.
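
A rough numerical sketch of the Haar-Fisz idea is given below: take a Haar transform, divide each detail coefficient by its matching smooth coefficient (the Fisz ratio appropriate for χ²-type data), and invert. The exact form of the transform and the spectral estimator it feeds into should be taken from the paper; the data and the choice of ratio here are illustrative assumptions.

```python
import numpy as np

def haar_fisz_chi2(x):
    """Haar-Fisz-type transform for nonnegative chi-squared-type data (length a power of two):
    Haar detail coefficients are divided by the matching smooth coefficients, then the
    Haar transform is inverted."""
    x = np.asarray(x, dtype=float)
    J = int(np.log2(len(x)))
    smooth, details = x.copy(), []
    for _ in range(J):                                  # forward Haar with Fisz ratios
        s = (smooth[0::2] + smooth[1::2]) / 2.0
        d = (smooth[0::2] - smooth[1::2]) / 2.0
        details.append(np.where(s > 0, d / np.where(s > 0, s, 1.0), 0.0))
        smooth = s
    out = smooth                                        # inverse Haar using the Fisz details
    for f in reversed(details):
        up = np.empty(2 * len(out))
        up[0::2], up[1::2] = out + f, out - f
        out = up
    return out

# Locally stationary Gaussian series with a piecewise constant variance; work with the squares.
rng = np.random.default_rng(8)
x = rng.normal(scale=np.r_[np.full(512, 1.0), np.full(512, 5.0)])
z = haar_fisz_chi2(x ** 2)
print(z[:512].std(), z[512:].std())    # spreads are comparable after the transform
```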

Journal ArticleDOI
TL;DR: In this paper, first-order residual analysis for spatio-temporal point processes is developed, and principles for second-order residual analysis based on the viewpoint of martingales are proposed.
Abstract: Summary. The paper gives first-order residual analysis for spatiotemporal point processes that is similar to the residual analysis that has been developed by Baddeley and co-workers for spatial point processes and also proposes principles for second-order residual analysis based on the viewpoint of martingales. Examples are given for both first- and second-order residuals. In particular, residual analysis can be used as a powerful tool in model improvement. Taking a spatiotemporal epidemic-type aftershock sequence model for earthquake occurrences as the base-line model, second-order residual analysis can be useful for identifying many features of the data that are not implied in the base-line model, providing us with clues about how to formulate better models.

Journal ArticleDOI
TL;DR: This paper describes a flexible class of probing experiments (‘flexicast’) for data collection, develops conditions under which the link level delay distributions are identifiable, and proposes a faster algorithm based on solving for local maximum likelihood estimators and combining their information.
Abstract: Summary. Estimating and monitoring the quality of service of computer and communications networks is a problem of considerable interest. The paper focuses on estimating link level delay distributions from end-to-end path level data collected by using active probing experiments. This is an interesting large scale statistical inverse (deconvolution) problem. We describe a flexible class of probing experiments (‘flexicast’) for data collection and develop conditions under which the link level delay distributions are identifiable. Maximum likelihood estimation using the EM algorithm is studied. It does not scale well for large trees, so a faster algorithm based on solving for local maximum likelihood estimators and combining their information is proposed. The usefulness of the methods is illustrated on real voice over Internet protocol data that were collected from the University of North Carolina campus network.

Journal ArticleDOI
TL;DR: In this paper, the problem of estimating the mean-squared error of small area estimators within a Fay-Herriot normal error model is studied theoretically in the common setting where the model is fitted to a logarithmically transformed response variable.
Abstract: Summary. The problem of accurately estimating the mean-squared error of small area estimators within a Fay–Herriot normal error model is studied theoretically in the common setting where the model is fitted to a logarithmically transformed response variable. For bias-corrected empirical best linear unbiased predictor small area point estimators, mean-squared error formulae and estimators are provided, with biases of smaller order than the reciprocal of the number of small areas. The performance of these mean-squared error estimators is illustrated by a simulation study and a real data example relating to the county level estimation of child poverty rates in the US Census Bureau’s on-going ‘Small area income and poverty estimation’ project.

Journal ArticleDOI
TL;DR: In this article, the authors introduce an approach for formulating and testing linear hypotheses on the transition probabilities of the latent process, for a class of latent Markov models for discrete variables having a longitudinal structure, and outline an EM algorithm based on well-known recursions in the hidden Markov literature.
Abstract: Summary. For a class of latent Markov models for discrete variables having a longitudinal structure, we introduce an approach for formulating and testing linear hypotheses on the transition probabilities of the latent process. For the maximum likelihood estimation of a latent Markov model under hypotheses of this type, we outline an EM algorithm that is based on well-known recursions in the hidden Markov literature. We also show that, under certain assumptions, the asymptotic null distribution of the likelihood ratio statistic for testing a linear hypothesis on the transition probabilities of a latent Markov model, against a less stringent linear hypothesis on the transition probabilities of the same model, is of χ² type. As a particular case, we derive the asymptotic distribution of the likelihood ratio statistic between a latent class model and its latent Markov version, which may be used to test the hypothesis of absence of transition between latent states. The approach is illustrated through a series of simulations and two applications, the first of which is based on educational testing data that have been collected within the National Assessment of Educational Progress 1996, and the second on data, concerning the use of marijuana, which have been collected within the National Youth Survey 1976–1980.
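
The 'well-known recursions' referred to are the forward (and backward) recursions of the hidden Markov literature; a minimal forward recursion for the log-likelihood of a discrete latent Markov model is sketched below with illustrative parameter values. The EM algorithm and the χ²-type likelihood ratio test themselves are not reproduced.

```python
import numpy as np

def hmm_loglik(obs, pi0, P, B):
    """Log-likelihood of a discrete latent (hidden) Markov model by the scaled forward recursion.
    pi0: initial latent probabilities, P: latent transition matrix, B[state, symbol]: emissions."""
    alpha = pi0 * B[:, obs[0]]
    loglik = 0.0
    for y in obs[1:]:
        c = alpha.sum()                 # scale to avoid numerical underflow
        loglik += np.log(c)
        alpha = (alpha / c) @ P * B[:, y]
    return loglik + np.log(alpha.sum())

# Illustrative 2-state model with binary responses.
pi0 = np.array([0.6, 0.4])
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
B = np.array([[0.8, 0.2],
              [0.3, 0.7]])
obs = np.array([0, 0, 1, 1, 1, 0, 1, 1])
print(hmm_loglik(obs, pi0, P, B))

# A likelihood ratio between a fitted latent Markov model and its latent class version
# (transition matrix restricted to the identity) would be referred to the chi-squared-type
# distribution derived in the paper.
```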

Journal ArticleDOI
TL;DR: In this paper, a framework for non-linear multiscale decompositions of Poisson data with piecewise smooth intensity curves is introduced, together with a Bayesian shrinkage approach based on an original prior for the coefficients of the decomposition; the method combines the advantages of the Haar-Fisz transform, wavelet smoothing and Bayesian multiscale likelihood models.
Abstract: Summary. The paper introduces a framework for non-linear multiscale decompositions of Poisson data that have piecewise smooth intensity curves. The key concept is conditioning on the sum of the observations that are involved in the computation of a given multiscale coefficient. Within this framework, most classical wavelet thresholding schemes for data with additive homoscedastic noise can be used. Any family of wavelet transforms (orthogonal, biorthogonal or second generation) can be incorporated in this framework. Our second contribution is to propose a Bayesian shrinkage approach with an original prior for coefficients of this decomposition. As such, the method combines the advantages of the Haar-Fisz transform with wavelet smoothing and (Bayesian) multiscale likelihood models, with additional benefits, such as extendability towards arbitrary wavelet families. Simulations show an important reduction in average squared error of the output, compared with the present techniques of Anscombe or Fisz variance stabilization or multiscale likelihood modelling.

Journal ArticleDOI
TL;DR: This work discusses in more detail the fitting of time‐varying AR(p) processes for which the problem of the selection of the order p is treated, and proposes an iterative algorithm for the computation of the estimator.
Abstract: Over recent decades increasingly more attention has been paid to the problem of how to fit a parametric model of time series with time-varying parameters. A typical example is given by autoregressive models with time-varying parameters. We propose a procedure to fit such time-varying models to general non-stationary processes. The estimator is a maximum Whittle likelihood estimator on sieves. The results do not assume that the observed process belongs to a specific class of time varying parametric models. We discuss in more detail the fitting of time-varying AR(p) processes for which we treat the problem of the selection of the order p, and we propose an iterative algorithm for the computation of the estimator. A comparison with model selection by Akaike's information criterion is provided through simulations.

Journal ArticleDOI
TL;DR: In this paper, nonparametric techniques for analysing data generated by the Berkson errors-in-variables model are introduced, a practical method for smoothing parameter choice is suggested, and the case of dosage error data is also discussed.
Abstract: Summary. It is common, in errors-in-variables problems in regression, to assume that the errors are incurred 'after the experiment', in that the observed value of the explanatory variable is an independent perturbation of its true value. However, if the errors are incurred 'before the experiment' then the true value of the explanatory variable equals a perturbation of its observed value. This is the context of the Berkson model, which is increasingly attracting attention in parametric and semiparametric settings. We introduce and discuss nonparametric techniques for analysing data that are generated by the Berkson model. Our approach permits both random and regularly spaced values of the target doses. In the absence of data on dosage error it is necessary to propose a distribution for the latter, but we show numerically that our method is robust against that assumption. The case of dosage error data is also discussed. A practical method for smoothing parameter choice is suggested. Our techniques for errors-in-variables regression are shown to achieve theoretically optimal convergence rates.

Journal ArticleDOI
TL;DR: In this paper, a general approach to constructing confidence regions of smaller volume than the standard spheres for the mean vector of a multivariate normal distribution is developed and used to calculate a lower bound on the attainable volume; Bayes and fiducial methods are involved in the calculation.
Abstract: Summary. Since Stein's original proposal in 1962, a series of papers have constructed confidence regions of smaller volume than the standard spheres for the mean vector of a multivariate normal distribution. A general approach to this problem is developed here and used to calculate a lower bound on the attainable volume. Bayes and fiducial methods are involved in the calculation. Scheffé-type problems are used to show that low volume by itself does not guarantee favourable inferential properties.

Journal Article
TL;DR: In this article, the authors compare and contrast six types of multiple randomizations, using a wide range of examples, and discuss their use in designing experiments, and outline a system of describing the randomizations in terms of sets of objects, their associated tiers and the factor nesting, using randomization diagrams.
Abstract: Multitiered experiments are characterized by involving multiple randomizations, in a sense that we make explicit. We compare and contrast six types of multiple randomizations, using a wide range of examples, and discuss their use in designing experiments. We outline a system of describing the randomizations in terms of sets of objects, their associated tiers and the factor nesting, using randomization diagrams, which give a convenient and readily assimilated summary of an experiment's randomization. We also indicate how to formulate a randomization-based mixed model for the analysis of data from such experiments.