ISSN: 1935-7524

Electronic Journal of Statistics 

Institute of Mathematical Statistics
About: Electronic Journal of Statistics is an academic journal published by the Institute of Mathematical Statistics. The journal publishes mainly in the areas of estimators and asymptotic distributions. Its ISSN is 1935-7524, and it is open access. Over its lifetime, the journal has published 1551 papers, which have received 33686 citations.


Papers
Journal ArticleDOI
TL;DR: Proposes a method for constructing a sparse estimator of the inverse covariance (concentration) matrix in high-dimensional settings; the estimator uses a penalized normal likelihood approach and forces sparsity with a lasso-type penalty.
Abstract: The paper proposes a method for constructing a sparse estimator for the inverse covariance (concentration) matrix in high-dimensional settings. The estimator uses a penalized normal likelihood approach and forces sparsity by using a lasso-type penalty. We establish a rate of convergence in the Frobenius norm as both data dimension p and sample size n are allowed to grow, and show that the rate depends explicitly on how sparse the true concentration matrix is. We also show that a correlation-based version of the method exhibits better rates in the operator norm. We also derive a fast iterative algorithm for computing the estimator, which relies on the popular Cholesky decomposition of the inverse but produces a permutation-invariant estimator. The method is compared to other estimators on simulated data and on a real data example of tumor tissue classification using gene expression data.
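The penalized-likelihood idea can be sketched numerically. The following is a minimal illustration, not the paper's Cholesky-based algorithm: it minimizes the objective tr(S Theta) - log det(Theta) + lam * ||offdiag(Theta)||_1 with a simple proximal-gradient loop (all parameter values and the test problem are illustrative):

```python
import numpy as np

def sparse_precision(S, lam=0.1, step=0.05, iters=500):
    """Proximal-gradient sketch for min over Theta of
    tr(S Theta) - logdet(Theta) + lam * ||offdiag(Theta)||_1."""
    p = S.shape[0]
    Theta = np.eye(p)
    for _ in range(iters):
        # gradient of the smooth part is S - Theta^{-1}
        Theta = Theta - step * (S - np.linalg.inv(Theta))
        # soft-threshold off-diagonal entries (the lasso-type penalty)
        off = Theta - np.diag(np.diag(Theta))
        off = np.sign(off) * np.maximum(np.abs(off) - step * lam, 0.0)
        Theta = off + np.diag(np.diag(Theta))
        # clip eigenvalues to keep the iterate positive definite
        w, V = np.linalg.eigh(Theta)
        Theta = (V * np.clip(w, 1e-3, None)) @ V.T
    return Theta

# tridiagonal true concentration matrix -> sparse off-diagonal structure
rng = np.random.default_rng(0)
p, n = 5, 400
Theta_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(Theta_true), size=n)
Theta_hat = sparse_precision(X.T @ X / n)
```

Entries far outside the true band are shrunk toward zero by the penalty, while the banded entries survive.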

996 citations

Journal ArticleDOI
TL;DR: The first result establishes consistency of the estimate Θ̂ in the elementwise maximum norm, which in turn allows convergence rates in Frobenius and spectral norms to be derived; simulations show good correspondence between the theoretical predictions and observed behavior.
Abstract: Given i.i.d. observations of a random vector X ∈ R^p, we study the problem of estimating both its covariance matrix Σ* and its inverse covariance or concentration matrix Θ* = (Σ*)^(-1). We estimate Θ* by minimizing an l1-penalized log-determinant Bregman divergence; in the multivariate Gaussian case, this approach corresponds to l1-penalized maximum likelihood, and the structure of Θ* is specified by the graph of an associated Gaussian Markov random field. We analyze the performance of this estimator under high-dimensional scaling, in which the number of nodes in the graph p, the number of edges s and the maximum node degree d are allowed to grow as a function of the sample size n. In addition to the parameters (p, s, d), our analysis identifies other key quantities that control rates: (a) the l∞-operator norm of the true covariance matrix Σ*; (b) the l∞-operator norm of the submatrix Γ*_SS, where S indexes the graph edges and Γ* = (Θ*)^(-1) ⊗ (Θ*)^(-1); (c) a mutual incoherence or irrepresentability measure on the matrix Γ*; and (d) the rate of decay 1/f(n, δ) of the probabilities P[|Σ̂^n − Σ*| > δ], where Σ̂^n is the sample covariance based on n samples. Our first result establishes consistency of our estimate Θ̂ in the elementwise maximum norm. This in turn allows us to derive convergence rates in Frobenius and spectral norms, with improvements upon existing results for graphs with maximum node degree d = o(√s). In our second result, we show that with probability converging to one, the estimate Θ̂ correctly specifies the zero pattern of the concentration matrix Θ*. We illustrate our theoretical results via simulations for various graphs and problem parameters, showing good correspondence between the theoretical predictions and behavior in simulations.
1. Introduction.
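Quantity (d) above, the elementwise deviation of the sample covariance from the truth, is easy to see by simulation. A minimal sketch (the covariance model, dimensions and sample sizes are illustrative):

```python
import numpy as np

def max_norm_deviation(n, p=30, seed=0):
    """Elementwise max-norm ||Sigma_hat^n - Sigma*||_inf from n Gaussian samples."""
    rng = np.random.default_rng(seed)
    Sigma = 0.3 * np.ones((p, p)) + 0.7 * np.eye(p)   # true covariance Sigma*
    X = rng.standard_normal((n, p)) @ np.linalg.cholesky(Sigma).T
    S = X.T @ X / n                                    # sample covariance Sigma_hat^n
    return np.max(np.abs(S - Sigma))

# the deviation shrinks as the sample size n grows
devs = [max_norm_deviation(n) for n in (100, 1000, 10000)]
```

For Gaussian data, this maximum deviation decays on the order of sqrt(log p / n), which is the kind of tail control the abstract's rate analysis relies on.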
The area of high-dimensional statistics deals with estimation in the "large p, small n" setting, where p and n correspond, respectively, to the dimensionality of the data and the sample size. Such high-dimensional problems arise in a variety of applications, among them remote sensing, computational biology and natural language processing, where the model dimension may be comparable to or substantially larger than the sample size. It is well known that such high-dimensional scaling can lead to dramatic breakdowns in many classical procedures. In the absence of additional model assumptions, it is frequently impossible to obtain consistent procedures when p ≫ n. Accordingly, an active line of statistical research is based on imposing various restrictions on the model (for instance, sparsity, manifold structure, or graphical model structure) and then studying the scaling behavior of different estimators as a function of sample size n, ambient dimension p and additional parameters related to these structural assumptions.

669 citations

Journal ArticleDOI
TL;DR: In this article, the restricted eigenvalue condition and the slightly weaker compatibility condition are shown to be sufficient for oracle results for a general class of design matrices; hence, optimality of the Lasso for prediction and estimation holds in more general situations than coherence or restricted isometry assumptions would suggest.
Abstract: Oracle inequalities and variable selection properties for the Lasso in linear models have been established under a variety of different assumptions on the design matrix. We show in this paper how the different conditions and concepts relate to each other. The restricted eigenvalue condition [2] and the slightly weaker compatibility condition [18] are sufficient for oracle results. We argue that both these conditions allow for a fairly general class of design matrices. Hence, optimality of the Lasso for prediction and estimation holds in more general situations than coherence [5, 4] or restricted isometry [10] assumptions would suggest.
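The coherence condition compared in the abstract can be made concrete: for a design matrix with unit-norm columns, mutual coherence is the largest absolute inner product between distinct columns. A small sketch for a random Gaussian design (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 500
X = rng.standard_normal((n, p))
X /= np.linalg.norm(X, axis=0)   # normalize columns to unit length
G = np.abs(X.T @ X)              # absolute Gram matrix
np.fill_diagonal(G, 0.0)         # ignore the trivial diagonal
coherence = G.max()              # mutual coherence of the design
```

For Gaussian designs, the coherence concentrates on the order of sqrt(log p / n), so coherence-based guarantees become restrictive when p ≫ n; the compatibility condition studied here is a weaker requirement.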

596 citations

Journal ArticleDOI
TL;DR: In this paper, the authors studied oracle properties of l1-penalized least squares in nonparametric regression with random design and showed that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of nonzero components of the oracle vector.
Abstract: This paper studies oracle properties of l1-penalized least squares in a nonparametric regression setting with random design. We show that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of non-zero components of the oracle vector. The results are valid even when the dimension of the model is (much) larger than the sample size and the regression matrix is not positive definite. They can be applied to high-dimensional linear regression, to nonparametric adaptive regression estimation and to the problem of aggregation of arbitrary estimators.

471 citations

Journal ArticleDOI
TL;DR: This paper proposes a family of statistical models for social network evolution over time, which represents an extension of Exponential Random Graph Models (ERGMs), and gives examples of their use for hypothesis testing and classification.
Abstract: We propose a family of statistical models for social network evolution over time, which represents an extension of Exponential Random Graph Models (ERGMs). Many of the methods for ERGMs are readily adapted for these models, including maximum likelihood estimation algorithms. We discuss models of this type and their properties, and give examples, as well as a demonstration of their use for hypothesis testing and classification. We believe our temporal ERG models represent a useful new framework for modeling time-evolving social networks, and for rewiring networks from other domains such as gene regulation circuitry and communication networks.
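As a toy illustration of the time-evolving, rewiring networks such models target, one can simulate a directed network in which edges persist or form with fixed probabilities, and track snapshot statistics of the kind a temporal ERG model conditions on (edge counts and edge stability across consecutive snapshots). All probabilities and sizes below are illustrative, not drawn from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, T = 20, 5
p_keep, p_new = 0.8, 0.05          # persistence / formation probabilities

A = (rng.random((n_nodes, n_nodes)) < 0.1).astype(int)  # initial directed graph
np.fill_diagonal(A, 0)
stats = []
for _ in range(T):
    keep = (rng.random(A.shape) < p_keep) & (A == 1)    # edges that persist
    new = (rng.random(A.shape) < p_new) & (A == 0)      # edges that form
    A_next = (keep | new).astype(int)
    np.fill_diagonal(A_next, 0)
    edges = int(A_next.sum())                # edge count at time t+1
    stability = int((A_next * A).sum())      # edges shared with snapshot t
    stats.append((edges, stability))
    A = A_next
```

Fitting a temporal ERGM would then maximize a likelihood over such transition statistics; the loop above only produces the raw snapshot summaries.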

463 citations

Performance Metrics

Number of papers from the journal in previous years:

Year  Papers
2023      37
2022     164
2021      92
2020     125
2019     132
2018     135