
Showing papers in "Electronic Journal of Statistics in 2014"


Journal ArticleDOI
TL;DR: A critical inferential challenge resulting from nonregularity in inference for parameters of the optimal dynamic treatment regime is reviewed; the asymptotic (limiting) distribution of the estimators is sensitive to local perturbations.
Abstract: Dynamic treatment regimes are of growing interest across the clinical sciences because these regimes provide one way to operationalize and thus inform sequential personalized clinical decision making. Formally, a dynamic treatment regime is a sequence of decision rules, one per stage of clinical intervention. Each decision rule maps up-to-date patient information to a recommended treatment. We briefly review a variety of approaches for using data to construct the decision rules. We then review a critical inferential challenge that results from nonregularity, which often arises in this area. In particular, nonregularity arises in inference for parameters in the optimal dynamic treatment regime; the asymptotic, limiting, distribution of estimators is sensitive to local perturbations. We propose and evaluate a locally consistent Adaptive Confidence Interval (ACI) for the parameters of the optimal dynamic treatment regime. We use data from the Adaptive Pharmacological and Behavioral Treatments for Children with ADHD Trial as an illustrative example. We conclude by highlighting and discussing emerging theoretical problems in this area.

154 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider the multivariate normal mean model in the situation that the mean vector is sparse in the nearly black sense and show that the posterior distribution of the horseshoe prior may be more informative than that of other one-component priors, including the Lasso.
Abstract: We consider the horseshoe estimator due to Carvalho, Polson and Scott (2010) for the multivariate normal mean model in the situation that the mean vector is sparse in the nearly black sense. We assume the frequentist framework where the data is generated according to a fixed mean vector. We show that if the number of nonzero parameters of the mean vector is known, the horseshoe estimator attains the minimax l2 risk, possibly up to a multiplicative constant. We provide conditions under which the horseshoe estimator combined with an empirical Bayes estimate of the number of nonzero means still yields the minimax risk. We furthermore prove an upper bound on the rate of contraction of the posterior distribution around the horseshoe estimator, and a lower bound on the posterior variance. These bounds indicate that the posterior distribution of the horseshoe prior may be more informative than that of other one-component priors, including the Lasso. MSC 2010 subject classifications: 62F15, 62F10.
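As a concrete illustration of the shrinkage the horseshoe prior induces (not the estimators or bounds analyzed in the paper), the sketch below computes the coordinatewise horseshoe posterior mean for a single normal observation by numerically integrating out the half-Cauchy local scale; the function name and the fixed global scale `tau` are illustrative assumptions.

```python
import numpy as np
from scipy import integrate

def horseshoe_posterior_mean(y, tau=1.0, sigma=1.0):
    """Posterior mean E[theta | y] under y ~ N(theta, sigma^2),
    theta | lam ~ N(0, (lam * tau)^2), lam ~ half-Cauchy(0, 1).
    Numerical-integration sketch with tau held fixed, not the
    empirical Bayes estimator studied in the paper."""
    def marginal(lam):
        s2 = sigma**2 + (lam * tau)**2               # marginal variance of y given lam
        density = np.exp(-0.5 * y**2 / s2) / np.sqrt(2 * np.pi * s2)
        prior = 2.0 / (np.pi * (1.0 + lam**2))       # half-Cauchy(0, 1) density
        return density * prior

    def weighted(lam):
        s2 = sigma**2 + (lam * tau)**2
        shrink = (lam * tau)**2 / s2                 # posterior mean of theta given lam is shrink * y
        return shrink * marginal(lam)

    num, _ = integrate.quad(weighted, 0.0, np.inf)
    den, _ = integrate.quad(marginal, 0.0, np.inf)
    return y * num / den                             # heavy shrinkage near 0, little for large |y|
```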

132 citations


Journal ArticleDOI
TL;DR: In this paper, a short proof is given that the adjacency spectral embedding can be used to obtain perfect vertex clustering for the stochastic blockmodel.
Abstract: Vertex clustering in a stochastic blockmodel graph has wide applicability and has been the subject of extensive research. In this paper, we provide a short proof that the adjacency spectral embedding can be used to obtain perfect clustering for the stochastic blockmodel.
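A minimal sketch of the pipeline the result supports, adjacency spectral embedding followed by clustering, is given below; it is a standard implementation rather than the paper's proof, and the embedding dimension `d` and number of blocks `K` are assumed known.

```python
import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def adjacency_spectral_clustering(A, d, K):
    """Embed each vertex via the top-d scaled eigenvectors of the symmetric
    adjacency matrix A (the adjacency spectral embedding), then run k-means."""
    A = np.asarray(A, dtype=float)
    vals, vecs = eigsh(A, k=d, which="LA")      # d largest-algebraic eigenpairs
    X_hat = vecs * np.sqrt(np.abs(vals))        # n x d embedding
    return KMeans(n_clusters=K, n_init=10).fit_predict(X_hat)
```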

121 citations


Journal ArticleDOI
TL;DR: In this paper, a generalization of the Sobol indices is proposed for outputs taking values in a Hilbert space of finite or infinite dimension, and the asymptotic behaviour of a Pick-and-Freeze estimation scheme for these indices is investigated.
Abstract: Let $X:=(X_1, \ldots, X_p)$ be random objects (the inputs), defined on some probability space $(\Omega,{\mathcal{F}}, \mathbb P)$ and valued in some measurable space $E=E_1\times\ldots \times E_p$. Further, let $Y:=f(X_1, \ldots, X_p)$ be the output. Here, $f$ is a measurable function from $E$ to some Hilbert space $\mathbb{H}$ ($\mathbb{H}$ could be either of finite or infinite dimension). In this work, we give a natural generalization of the Sobol indices (that are classically defined when $Y\in\mathbb{R}$), when the output belongs to $\mathbb{H}$. These indices have very nice properties. First, they are invariant under isometry and scaling. Further they can be, as in dimension $1$, easily estimated by using the so-called Pick and Freeze method. We investigate the asymptotic behaviour of such an estimation scheme.
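For the scalar case $Y\in\mathbb{R}$, the Pick-and-Freeze estimator mentioned above can be sketched as follows; the Hilbert-valued generalization of the paper is not reproduced here, and `sample_inputs` is a hypothetical user-supplied sampler of the input vector.

```python
import numpy as np

def pick_freeze_first_order(f, sample_inputs, i, n=10_000, rng=None):
    """Pick-and-Freeze estimate of the first-order Sobol index of input i for a
    scalar-valued model f. Sketch of the real-valued case only; sample_inputs(n, rng)
    is assumed to return an (n, p) array of i.i.d. input draws."""
    rng = np.random.default_rng() if rng is None else rng
    X = sample_inputs(n, rng)                        # "pick": one i.i.d. sample of the inputs
    Xp = sample_inputs(n, rng)                       # independent copy of all inputs
    Xp[:, i] = X[:, i]                               # "freeze": reuse coordinate i from the first sample
    Y, Yi = f(X), f(Xp)
    mu = np.mean((Y + Yi) / 2.0)
    num = np.mean(Y * Yi) - mu**2                    # estimates Cov(Y, Y^i)
    den = np.mean((Y**2 + Yi**2) / 2.0) - mu**2      # estimates Var(Y)
    return num / den
```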

91 citations


Journal ArticleDOI
TL;DR: In this paper, an appropriate BIC expression that is consistent with the random effect structure of the mixed effects model is derived for variable selection in mixed effects models.
Abstract: The Bayesian Information Criterion (BIC) is widely used for variable selection in mixed effects models. However, its expression is unclear in typical situations of mixed effects models, where simple definition of the sample size is not meaningful. We derive an appropriate BIC expression that is consistent with the random effect structure of the mixed effects model. We illustrate the behavior of the proposed criterion through a simulation experiment and a case study and we recommend its use as an alternative to various existing BIC versions that are implemented in available software.
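The ambiguity the paper addresses is which "sample size" should enter the BIC penalty; the sketch below merely contrasts the two conventional choices (total observations versus number of groups) for an already fitted mixed model. It does not reproduce the criterion derived in the paper, and all argument names are illustrative.

```python
import numpy as np

def bic_variants(loglik, n_params, n_obs, n_groups):
    """Two conventional BIC versions for a fitted mixed effects model:
    penalty based on log(total observations) vs. log(number of groups).
    Neither is the paper's derived criterion; they only illustrate why
    the definition of 'sample size' matters for BIC in mixed models."""
    bic_obs = -2.0 * loglik + n_params * np.log(n_obs)
    bic_grp = -2.0 * loglik + n_params * np.log(n_groups)
    return {"BIC_N": bic_obs, "BIC_m": bic_grp}
```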

89 citations


Journal ArticleDOI
TL;DR: The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes to estimate the size of a target population based on data collected through RDS.
Abstract: Respondent-Driven Sampling (RDS) is an approach to sampling design and inference in hard-to-reach human populations. It is often used in situations where the target population is rare and/or stigmatized in the larger population, so that it is prohibitively expensive to contact them through the available frames. Common examples include injecting drug users, men who have sex with men, and female sex workers. Most analysis of RDS data has focused on estimating aggregate characteristics, such as disease prevalence. However, RDS is often conducted in settings where the population size is unknown and of great independent interest. This paper presents an approach to estimating the size of a target population based on data collected through RDS. The proposed approach uses a successive sampling approximation to RDS to leverage information in the ordered sequence of observed personal network sizes. The inference uses the Bayesian framework, allowing for the incorporation of prior knowledge. A flexible class of priors for the population size is used that aids elicitation. An extensive simulation study provides insight into the performance of the method for estimating population size under a broad range of conditions. A further study shows the approach also improves estimation of aggregate characteristics. Finally, the method demonstrates sensible results when used to estimate the size of known networked populations from the National Longitudinal Study of Adolescent Health, and when used to estimate the size of a hard-to-reach population at high risk for HIV.

73 citations


Journal ArticleDOI
TL;DR: The principal finding is that the most natural, and simplest, mean field variational Bayes algorithm can perform quite poorly due to posterior dependence among auxiliary variables, so more sophisticated algorithms are shown to be superior.
Abstract: We investigate mean field variational approximate Bayesian inference for models that use continuous distributions, Horseshoe, Negative-Exponential-Gamma and Generalized Double Pareto, for sparse signal shrinkage. Our principal finding is that the most natural, and simplest, mean field variational Bayes algorithm can perform quite poorly due to posterior dependence among auxiliary variables. More sophisticated algorithms, based on special functions, are shown to be superior. Continued fraction approximations via Lentz's Algorithm are developed to make the algorithms
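The continued-fraction machinery referred to above can be evaluated with the modified Lentz algorithm; a generic sketch is given below (the specific continued fractions and special functions used in the paper are not reproduced), with `a` and `b` being user-supplied coefficient functions.

```python
def lentz(a, b, tol=1e-12, tiny=1e-30, max_iter=500):
    """Evaluate b(0) + a(1)/(b(1) + a(2)/(b(2) + ...)) with the modified
    Lentz algorithm; a generic sketch, not the paper's implementation."""
    f = b(0) if b(0) != 0 else tiny
    C, D = f, 0.0
    for j in range(1, max_iter):
        D = b(j) + a(j) * D
        D = tiny if D == 0 else D
        C = b(j) + a(j) / C
        C = tiny if C == 0 else C
        D = 1.0 / D
        delta = C * D            # multiplicative correction for this term
        f *= delta
        if abs(delta - 1.0) < tol:
            break
    return f
```

For example, `lentz(lambda j: 1.0, lambda j: 1.0)` evaluates the all-ones continued fraction and returns approximately 1.618, the golden ratio.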

64 citations


Journal ArticleDOI
TL;DR: In this article, an empirical Bayes posterior distribution with desirable properties under mild conditions is proposed: it concentrates on balls, centered at the true mean vector, with squared radius proportional to the minimax rate, and its posterior mean is an asymptotically minimax estimator.
Abstract: For the important classical problem of inference on a sparse high-dimensional normal mean vector, we propose a novel empirical Bayes model that admits a posterior distribution with desirable properties under mild conditions. In particular, our empirical Bayes posterior distribution concentrates on balls, centered at the true mean vector, with squared radius proportional to the minimax rate, and its posterior mean is an asymptotically minimax estimator. We also show that, asymptotically, the support of our empirical Bayes posterior has roughly the same effective dimension as the true sparse mean vector. Simulation from our empirical Bayes posterior is straightforward, and our numerical results demonstrate the quality of our method compared to others having similar large-sample properties.

52 citations


Journal ArticleDOI
TL;DR: In this article, the conditional Akaike information criterion (CAIC) is used for model selection in linear mixed models and a general framework for the calculation of conditional AIC for different exponential family distributions is developed.
Abstract: The conditional Akaike information criterion, AIC, has been frequently used for model selection in linear mixed models. We develop a general framework for the calculation of the conditional AIC for different exponential family distributions. This unified framework incorporates the conditional AIC for the Gaussian case, gives a new justification for Poisson distributed data and yields a new conditional AIC for exponentially distributed responses but cannot be applied to the binomial and gamma distributions. The proposed conditional Akaike information criteria are unbiased for finite samples, do not rely on a particular estimation method and do not assume that the variance-covariance matrix of the random effects is known. The theoretical results are investigated in a simulation study. The practical use of the method is illustrated by application to a data set on tree growth.

50 citations


Journal ArticleDOI
TL;DR: A dynamic programming algorithm is presented for the optimal selection of stratified chain event graphs that maximizes a decomposable score derived from a complete independent sample, and the algorithm is shown to be suitable for small problems.
Abstract: We introduce a subclass of chain event graphs that we call stratified chain event graphs, and present a dynamic programming algorithm for the optimal selection of such chain event graphs that maximizes a decomposable score derived from a complete independent sample. We apply the algorithm to such a dataset, with a view to deducing the causal structure of the variables under the hypothesis that there are no unobserved confounders. We show that the algorithm is suitable for small problems. Similarities with and differences to a dynamic programming algorithm for MAP learning of Bayesian networks are highlighted, as are the relations to causal discovery using Bayesian networks.

43 citations


Journal ArticleDOI
TL;DR: In this paper, the authors monitor very robust regression by looking at the behaviour of residuals and test statistics as they smoothly change the robustness of parameter estimation from a breakdown point of 50% to non-robust least squares.
Abstract: Robust methods are little applied (although much studied by statisticians). We monitor very robust regression by looking at the behaviour of residuals and test statistics as we smoothly change the robustness of parameter estimation from a breakdown point of 50% to non-robust least squares. The resulting procedure provides insight into the structure of the data including outliers and the presence of more than one population. Monitoring overcomes the hindrances to the routine adoption of robust methods, being informative about the choice between the various robust procedures. Methods tuned to give nominal high efficiency fail with our most complicated example. We find that the most informative analyses come from S estimates combined with Tukey's biweight or with the optimal functions. For our major example with 1,949 observations and 13 explanatory variables, we combine robust S estimation with regression using the forward search, so obtaining an understanding of the importance of individual observations, which is missing from standard robust procedures. We discover that the data come from two different populations. They also contain six outliers. Our analyses are accompanied by numerous graphs. Algebraic results are contained in two appendices, the second of which provides useful new results on the absolute odd moments of elliptically truncated multivariate normal random variables.
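A much-simplified analogue of the monitoring idea, using M-estimation with Tukey's biweight in statsmodels rather than the S-estimation and forward search used in the paper, is sketched below: residuals are refitted and tracked as the tuning constant moves from very robust toward (nearly) least squares.

```python
import numpy as np
import statsmodels.api as sm

def monitor_biweight_residuals(y, X, tuning_constants=(1.5, 3.0, 4.685, 8.0, 20.0)):
    """Refit a robust regression over a grid of Tukey-biweight tuning constants
    (small c = very robust, large c = close to least squares) and return the
    residuals from each fit so their evolution can be plotted and monitored.
    A simplified M-estimation analogue of the paper's S-estimation monitoring."""
    X = sm.add_constant(X)
    residuals = {}
    for c in tuning_constants:
        fit = sm.RLM(y, X, M=sm.robust.norms.TukeyBiweight(c=c)).fit()
        residuals[c] = fit.resid
    return residuals
```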

Journal ArticleDOI
TL;DR: In this paper, the convergence rate of the posterior distribution and of Bayes estimators based on the graphical model is obtained in the $L_{\infty}$-operator norm, uniformly over a class of precision matrices, even if the true precision matrix does not have a banded structure.
Abstract: We consider Bayesian estimation of a $p\times p$ precision matrix, when $p$ can be much larger than the available sample size $n$. It is well known that consistent estimation in such ultra-high dimensional situations requires regularization such as banding, tapering or thresholding. We consider a banding structure in the model and induce a prior distribution on a banded precision matrix through a Gaussian graphical model, where an edge is present only when two vertices are within a given distance. For a proper choice of the order of graph, we obtain the convergence rate of the posterior distribution and Bayes estimators based on the graphical model in the $L_{\infty}$-operator norm uniformly over a class of precision matrices, even if the true precision matrix may not have a banded structure. Along the way to the proof, we also compute the convergence rate of the maximum likelihood estimator (MLE) under the same set of conditions, which is of independent interest. The graphical model based MLE and Bayes estimators are automatically positive definite, which is a desirable property not possessed by some other estimators in the literature. We also conduct a simulation study to compare finite sample performance of the Bayes estimators and the MLE based on the graphical model with that obtained by using a Cholesky decomposition of the precision matrix. Finally, we discuss a practical method of choosing the order of the graphical model using the marginal likelihood function.

Journal ArticleDOI
TL;DR: Many methods exist for one-dimensional curve registration, but how the methods compare has not been made clear in the literature; this special section presents a detailed comparison of a number of major methods, carried out during a recent workshop.
Abstract: Many methods exist for one dimensional curve registration, and how methods compare has not been made clear in the literature. This special section is a summary of a detailed comparison of a number of major methods, done during a recent workshop. The basis of the comparison was simultaneous analysis of a set of four real data sets, which engendered a high level of informative discussion. Most research groups in this area were represented, and many insights were gained, which are discussed here. The format of this special section is four papers introducing the data, each accompanied by a number of analyses by different groups, plus a discussion summary of the lessons learned.

Journal ArticleDOI
TL;DR: In this article, the authors investigate the cost of using the exact one- and two-sided Clopper-Pearson confidence intervals rather than shorter approximate intervals, first in terms of increased expected length and then in terms of the increase in sample size required to obtain a desired expected length.
Abstract: When computing a confidence interval for a binomial proportion $p$ one must choose between using an exact interval, which has a coverage probability of at least $1-\alpha$ for all values of $p$, and a shorter approximate interval, which may have lower coverage for some $p$ but that on average has coverage equal to $1-\alpha$. We investigate the cost of using the exact one and two-sided Clopper–Pearson confidence intervals rather than shorter approximate intervals, first in terms of increased expected length and then in terms of the increase in sample size required to obtain a desired expected length. Using asymptotic expansions, we also give a closed-form formula for determining the sample size for the exact Clopper–Pearson methods. For two-sided intervals, our investigation reveals an interesting connection between the frequentist Clopper–Pearson interval and Bayesian intervals based on noninformative priors.
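For reference, the exact two-sided Clopper–Pearson interval discussed above can be computed from beta quantiles; the minimal sketch below gives the standard construction, not the paper's closed-form sample-size formula.

```python
from scipy.stats import beta

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) Clopper-Pearson confidence interval for a
    binomial proportion, given x successes in n trials, via beta quantiles."""
    lower = 0.0 if x == 0 else beta.ppf(alpha / 2.0, x, n - x + 1)
    upper = 1.0 if x == n else beta.ppf(1.0 - alpha / 2.0, x + 1, n - x)
    return lower, upper

# e.g. clopper_pearson(7, 50) covers the true p with probability >= 0.95 for every p.
```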

Journal ArticleDOI
TL;DR: In this article, it was shown that exponential weights methods also succeed at variable selection and estimation under the near minimum condition on the design matrix, instead of much stronger assumptions required by other methods such as the Lasso or the Dantzig Selector.
Abstract: In the context of a linear model with a sparse coefficient vector, exponential weights methods have been shown to achieve oracle inequalities for denoising/prediction. We show that such methods also succeed at variable selection and estimation under the near minimum condition on the design matrix, instead of much stronger assumptions required by other methods such as the Lasso or the Dantzig Selector. The same analysis yields consistency results for Bayesian methods and BIC-type variable selection under similar conditions.

Journal ArticleDOI
TL;DR: In this paper, the authors characterize the dynamic properties of generalized autoregressive score models by identifying the regions of the parameter space that imply stationarity and ergodicity of the corresponding nonlinear time series process.
Abstract: We characterize the dynamic properties of generalized autoregressive score models by identifying the regions of the parameter space that imply stationarity and ergodicity of the corresponding nonlinear time series process. We show how these regions are affected by the choice of parameterization and scaling, which are key features for the class of generalized autoregressive score models compared to other observation driven models. All results are illustrated for the case of time-varying means, variances, or higher-order moments.
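To fix ideas, a GAS(1,1) recursion for a time-varying variance under Gaussian observations, with the score scaled by the inverse Fisher information, is sketched below; it illustrates the score-driven update rather than the stationarity and ergodicity results of the paper, and the parameter names are illustrative.

```python
import numpy as np

def gas_variance_filter(y, omega, A, B):
    """Score-driven (GAS) recursion f_{t+1} = omega + A * s_t + B * f_t for a
    time-varying variance f_t of zero-mean Gaussian observations y_t, using
    inverse-information scaling of the score. Illustrative sketch only."""
    y = np.asarray(y, dtype=float)
    f = np.empty_like(y)
    f[0] = np.var(y)                                 # crude initialization
    for t in range(len(y) - 1):
        score = (y[t]**2 - f[t]) / (2.0 * f[t]**2)   # d/df log N(y_t; 0, f_t)
        s = (2.0 * f[t]**2) * score                  # inverse Fisher information scaling
        f[t + 1] = omega + A * s + B * f[t]
    return f
```

With this particular scaling the scaled score reduces to y_t^2 - f_t, so the recursion coincides with a GARCH(1,1)-type update, which illustrates how the choice of scaling shapes the dynamics.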

Journal ArticleDOI
TL;DR: In this article, an overview of uniform convergence of empirical norms is given, including the linear (anisotropic) case, together with new results for inner products of functions and convergence of the least squares estimator when the model is wrong.
Abstract: Uniform convergence of empirical norms - empirical measures of squared functions - is a topic which has received considerable attention in the literature on empirical processes. The results are relevant as empirical norms occur due to symmetrization. They also play a prominent role in statistical applications. The contraction inequality has been a main tool but recently other approaches have been shown to lead to better results in important cases. We present an overview including the linear (anisotropic) case, and give new results for inner products of functions. Our main application will be the estimation of the parental structure in a directed acyclic graph. As an intermediate result we establish convergence of the least squares estimator when the model is wrong.

Journal ArticleDOI
TL;DR: In this article, a drift and minorization-based analysis of the Gibbs sampling Markov chains corresponding to the Dirichlet-Laplace prior and the Normal-Gamma prior is presented.
Abstract: In recent years, a large variety of continuous shrinkage priors have been developed for a Bayesian analysis of the standard linear regression model in high dimensional settings. We consider two such priors, the Dirichlet-Laplace prior (developed in Bhattacharya et al. (2013)), and the Normal-Gamma prior (developed in (Griffin and Brown, 2010)). For both Dirichlet-Laplace and Normal-Gamma priors, Gibbs sampling Markov chains have been developed to generate approximate samples from the corresponding posterior distributions. We show by using a drift and minorization based analysis that the Gibbs sampling Markov chains corresponding to the aforementioned models are geometrically ergodic. Establishing geometric ergodicity of these Markov chains is crucial, as it provides theoretical justification for the use of Markov chain CLT, which can then be used to obtain asymptotic standard errors for Markov chain based estimates of posterior quantities. Both Gibbs samplers in the paper use the Generalized Inverse Gaussian (GIG) distribution, as one of the conditional distributions. A novel contribution of our convergence analysis is the use of drift functions which include terms that are negative fractional powers of normal random variables, to tackle the presence of the GIG distribution.

Journal ArticleDOI
TL;DR: In this article, the authors consider the problem of aggregating a general collection of affine estimators for fixed design regression and show that the proposed estimator leads simultaneously to all the best known bounds for aggregation with high probability.
Abstract: We consider the problem of aggregating a general collection of affine estimators for fixed design regression. Relevant examples include some commonly used statistical estimators such as least squares, ridge and robust least squares estimators. Dalalyan and Salmon [DS12] have established that, for this problem, exponentially weighted (EW) model selection aggregation leads to sharp oracle inequalities in expectation, but similar bounds in deviation were not previously known. While results [DRZ12] indicate that the same aggregation scheme may not satisfy sharp oracle inequalities with high probability, we prove a weaker notion of oracle inequality for EW that holds with high probability. Moreover, using a generalization of the newly introduced $Q$-aggregation scheme we also prove sharp oracle inequalities that hold with high probability. Finally, we apply our results to universal aggregation and show that our proposed estimator leads simultaneously to all the best known bounds for aggregation, including $\ell_{q}$-aggregation, $q\in(0,1)$, with high probability.

Journal ArticleDOI
TL;DR: In this paper, lower bounds for the empirical compatibility constant and the empirical restricted eigenvalue are derived by showing that quadratic forms of the empirical inner product matrix are lower bounded, uniformly over a set of $\ell_{1}$-constrained vectors, by a value converging to one.
Abstract: This study aims at contributing to lower bounds for empirical compatibility constants or empirical restricted eigenvalues. This is of importance in compressed sensing and theory for $\ell_{1}$-regularized estimators. Let $X$ be an $n\times p$ data matrix with rows being independent copies of a $p$-dimensional random variable. Let $\hat{\Sigma}:=X^{T}X/n$ be the inner product matrix. We show that the quadratic forms $u^{T}\hat{\Sigma}u$ are lower bounded by a value converging to one, uniformly over the set of vectors $u$ with $u^{T}\Sigma_{0}u$ equal to one and $\ell_{1}$-norm at most $M$. Here $\Sigma_{0}:=\mathbb{E} \hat{\Sigma}$ is the theoretical inner product matrix, which we assume to exist. The constant $M$ is required to be of small order $\sqrt{n/\log p}$. We assume moreover $m$-th order isotropy for some $m>2$ and sub-exponential tails or moments up to order $\log p$ for the entries in $X$. As a consequence, we obtain convergence of the empirical compatibility constant to its theoretical counterpart, and similarly for the empirical restricted eigenvalue. If the data matrix $X$ is first normalized so that its columns all have equal length we obtain lower bounds assuming only isotropy and no further moment conditions on its entries. The isotropy condition is shown to hold for certain martingale situations.

Journal ArticleDOI
TL;DR: In this paper, a nonparametric approach to link prediction in large-scale dynamic networks is proposed, which uses graph-based features of pairs of nodes as well as those of their local neighborhoods to predict whether those nodes will be linked at each time step.
Abstract: We propose a nonparametric approach to link prediction in large-scale dynamic networks. Our model uses graph-based features of pairs of nodes as well as those of their local neighborhoods to predict whether those nodes will be linked at each time step. The model allows for different types of evolution in different parts of the graph (e.g., growing or shrinking communities). We focus on large-scale graphs and present an implementation of our model that makes use of locality-sensitive hashing to allow it to be scaled to large problems. Experiments with simulated data as well as five real-world dynamic graphs show that we outperform the state of the art, especially when sharp fluctuations or nonlinearities are present. We also establish theoretical properties of our estimator, in particular consistency and weak convergence, the latter making use of an elaboration of Stein’s method for dependency graphs.

Journal ArticleDOI
TL;DR: The angular Mahalanobis depth as discussed by the authors combines the advantages of both the depth and quantile settings: appealing depth-based geometric properties of the contours (convexity, nestedness, rotation-equivariance) and typical quantile-asymptotics, namely Bahadur-type representation and asymptotic normality.
Abstract: In this paper, we introduce a new concept of quantiles and depth for directional (circular and spherical) data. In view of the similarities with the classical Mahalanobis depth for multivariate data, we call it the angular Mahalanobis depth. Our unique concept combines the advantages of both the depth and quantile settings: appealing depth-based geometric properties of the contours (convexity, nestedness, rotation-equivariance) and typical quantile-asymptotics, namely we establish a Bahadur-type representation and asymptotic normality (these results are corroborated by a Monte Carlo simulation study). We introduce new user-friendly statistical tools such as directional DD- and QQ-plots and a quantile-based goodness-of-fit test. We illustrate the power of our new procedures by analyzing a cosmic rays data set.

Journal ArticleDOI
TL;DR: In this article, an alternative single-gamma approximation to a sum of independent gamma variables is proposed, based on the observation that the jump density of the sum, viewed as a sum of independent gamma processes evaluated at time $1$, bears an evident similarity to that of a generic gamma variable; the methodology also yields gamma approximations to other infinitely divisible distributions such as the negative binomial and generalized Dickman distributions.
Abstract: It is well known that the sum $S$ of $n$ independent gamma variables—which occurs often, in particular in practical applications—can typically be well approximated by a single gamma variable with the same mean and variance (the distribution of $S$ being quite complicated in general). In this paper, we propose an alternative (and apparently at least as good) single-gamma approximation to $S$. The methodology used to derive it is based on the observation that the jump density of $S$ bears an evident similarity to that of a generic gamma variable, $S$ being viewed as a sum of $n$ independent gamma processes evaluated at time $1$. This observation motivates the idea of a gamma approximation to $S$ in the first place, and, in principle, a variety of such approximations can be made based on it. The same methodology can be applied to obtain gamma approximations to a wide variety of important infinitely divisible distributions on $\mathbb{R}_{+}$ or at least predict/confirm the appropriateness of the moment-matching method (where the first two moments are matched); this is demonstrated neatly in the cases of negative binomial and generalized Dickman distributions, thus highlighting the paper’s contribution to the overall topic.
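The moment-matching benchmark mentioned in the abstract (matching the first two moments of $S$ with a single gamma variable) takes only a few lines; the paper's alternative approximation via the jump density is not reproduced here.

```python
import numpy as np

def moment_matched_gamma(shapes, rates):
    """Shape and rate of the single gamma variable with the same mean and
    variance as S = sum of independent Gamma(shape_i, rate_i) variables.
    This is the classical moment-matching approximation, not the paper's
    jump-density-based alternative."""
    shapes = np.asarray(shapes, dtype=float)
    rates = np.asarray(rates, dtype=float)
    mean = np.sum(shapes / rates)
    var = np.sum(shapes / rates**2)
    return mean**2 / var, mean / var   # (shape, rate)
```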

Journal ArticleDOI
TL;DR: In this article, quantile estimation using Markov chain Monte Carlo is considered and conditions under which the sampling distribution of the Monte Carlo error is approximately Normal are established, which enables construction of an asymptotically valid interval estimator.
Abstract: We consider quantile estimation using Markov chain Monte Carlo and establish conditions under which the sampling distribution of the Monte Carlo error is approximately Normal. Further, we investigate techniques to estimate the associated asymptotic variance, which enables construction of an asymptotically valid interval estimator. Finally, we explore the finite sample properties of these methods through examples and provide some recommendations to practitioners.
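A simple batch-means style interval for a posterior quantile, in the spirit of (but not necessarily identical to) the variance estimators studied in the paper, might look as follows; the batch count and critical value are illustrative choices.

```python
import numpy as np

def mcmc_quantile_interval(chain, q, n_batches=30, z=1.96):
    """Point estimate and rough batch-means interval estimator for the
    q-quantile of the target distribution, from a single MCMC chain.
    A sketch in the spirit of the paper's asymptotic-variance estimators,
    not the exact method analyzed there."""
    chain = np.asarray(chain, dtype=float)
    xi_hat = np.quantile(chain, q)                        # full-chain estimate
    batch_estimates = np.array(
        [np.quantile(b, q) for b in np.array_split(chain, n_batches)]
    )
    se = batch_estimates.std(ddof=1) / np.sqrt(n_batches)
    return xi_hat, (xi_hat - z * se, xi_hat + z * se)
```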

Journal ArticleDOI
TL;DR: This paper proposes an extension of the Gaussian approach to variational Bayesian inference that uses Gaussian mixtures as approximations, providing an alternative to the more commonly employed factorization approach and enlarging the range of tractable distributions.
Abstract: Variational Bayesian inference with a Gaussian posterior approximation provides an alternative to the more commonly employed factorization approach and enlarges the range of tractable distributions. In this paper, we propose an extension to the Gaussian approach which uses Gaussian mixtures as approximations. A general problem for variational inference with mixtures is posed by the calculation of the entropy term in the Kullback-Leibler distance, which becomes analytically intractable. We deal with this problem by using a simple lower bound for the entropy and imposing restrictions on the form of the Gaussian covariance matrix. In this way, efficient numerical calculations become possible. To illustrate the method, we discuss its application to an isotropic generalized normal target density, a non-Gaussian state space model, and the Bayesian lasso. For heavy-tailed distributions, the examples show that the mixture approach indeed leads to improved approximations in the sense of a reduced Kullback-Leibler distance. From a more practical point of view, mixtures can improve estimates of posterior marginal variances. Furthermore, they provide an initial estimate of posterior skewness which is not possible with single Gaussians. We also discuss general sufficient conditions under which mixtures are guaranteed to provide improvements over single-component approximations.

Journal ArticleDOI
TL;DR: The focus of this work lies on the optimal data-driven choice of the smoothing parameter using a penalization strategy; kernel estimators are introduced and upper risk bounds are presented.
Abstract: A density deconvolution problem with unknown distribution of the errors is considered. To make the target density identifiable, one has to assume that some additional information on the noise is available. We consider two different models: the framework where some additional sample of the pure noise is available, as well as the repeated observation model, where the contaminated random variable of interest can be observed repeatedly. We introduce kernel estimators and present upper risk bounds. The focus of this work lies on the optimal data driven choice of the smoothing parameter using a penalization strategy.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the problem of providing nonparametric confidence guarantees, with finite sample Berry-Esseen bounds, for undirected graphs under weak assumptions, and prove lower bounds showing that accurate inference under weak assumptions requires the dimension $D$ to be less than $n$.
Abstract: We consider the problem of providing nonparametric confidence guarantees — with finite sample Berry-Esseen bounds — for undirected graphs under weak assumptions. We do not assume sparsity or incoherence. We allow the dimension $D$ to increase with the sample size $n$. First, we prove lower bounds that show that if we want accurate inferences with weak assumptions then $D$ must be less than $n$. In that case, we show that methods based on Normal approximations and on the bootstrap lead to valid inferences and we provide new Berry-Esseen bounds on the accuracy of the Normal approximation and the bootstrap. When the dimension is large relative to sample size, accurate inferences for graphs under weak assumptions are not possible. Instead we propose to estimate something less demanding than the entire partial correlation graph. In particular, we consider: cluster graphs, restricted partial correlation graphs and correlation graphs.

Journal ArticleDOI
TL;DR: This dataset was collected for the study of the aneurysmal pathology, within the AneuRisk Project, and includes the geometrical reconstructions of one of the main cerebral vessels, the Inner Carotid Artery, described in terms of the vessel centreline and the vessel radius profile.
Abstract: We describe the AneuRisk65 data, obtained from image reconstruction of three-dimensional cerebral angiographies. This dataset was collected for the study of the aneurysmal pathology, within the AneuRisk Project. It includes the geometrical reconstructions of one of the main cerebral vessels, the Inner Carotid Artery, described in terms of the vessel centreline and of the vessel radius profile. We briefly illustrate the data derivation and processing, explaining various aspects that are of interest for this applied problem, while also discussing the peculiarities and critical issues concerning the definition of phase and amplitude variabilities for these three-dimensional functional data.

Journal ArticleDOI
TL;DR: It is shown via simulations and an fMRI data set that failure to regularize the estimates of the spectral density matrix can yield unstable statistics, and that this can be alleviated by shrinkage estimation.
Abstract: Time series data obtained from neurophysiological signals is often high-dimensional and the length of the time series is often short relative to the number of dimensions. Thus, it is difficult or sometimes impossible to compute statistics that are based on the spectral density matrix because estimates of these matrices are often numerically unstable. In this work, we discuss the importance of regularization for spectral analysis of high-dimensional time series and propose shrinkage estimation for estimating high-dimensional spectral density matrices. We use and develop the multivariate Time-frequency Toggle (TFT) bootstrap procedure for multivariate time series to estimate the shrinkage parameters, and show that the multivariate TFT bootstrap is theoretically valid. We show via simulations and an fMRI data set that failure to regularize the estimates of the spectral density matrix can yield unstable statistics, and that this can be alleviated by shrinkage estimation.
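The basic shrinkage step, a convex combination of the raw spectral density matrix estimate and a simple diagonal target, can be sketched as below; in the paper the shrinkage weight is estimated via the multivariate TFT bootstrap, whereas here it is a user-supplied number.

```python
import numpy as np

def shrink_spectral_matrix(S_hat, weight):
    """Shrink an estimated (possibly ill-conditioned) spectral density matrix
    toward its diagonal: (1 - weight) * S_hat + weight * diag(S_hat).
    The paper chooses the weight via a TFT bootstrap; here it is given."""
    S_hat = np.asarray(S_hat, dtype=complex)
    target = np.diag(np.diag(S_hat))
    return (1.0 - weight) * S_hat + weight * target
```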

Journal ArticleDOI
TL;DR: The number of components of a mixture of Gaussian regressions, along with the other parameters, is estimated by a penalized maximum likelihood approach, and a lower bound on the penalty that ensures an oracle inequality for the resulting estimator is provided.
Abstract: In the framework of conditional density estimation, we use candidates taking the form of mixtures of Gaussian regressions with logistic weights and means depending on the covariate. We aim at estimating the number of components of this mixture, as well as the other parameters, by a penalized maximum likelihood approach. We provide a lower bound on the penalty that ensures an oracle inequality for our estimator. We perform some numerical experiments that support our theoretical analysis.