Home
/
Authors
/
Ming Yuan

Author

Ming Yuan

Other affiliations: University of Wisconsin-Madison, Georgia Institute of Technology, University of North Carolina at Chapel Hill ...read more

Bio: Ming Yuan is an academic researcher from Columbia University. The author has contributed to research in topics: Estimator & Minimax. The author has an hindex of 46, co-authored 172 publications receiving 15913 citations. Previous affiliations of Ming Yuan include University of Wisconsin-Madison & Georgia Institute of Technology.

Topics: Estimator, Minimax, Rank (linear algebra), Tensor, Tensor (intrinsic definition) ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Model selection and estimation in regression with grouped variables

[...]

Ming Yuan¹, Yi Lin²•Institutions (2)

Georgia Institute of Technology¹, University of Wisconsin-Madison²

01 Feb 2006-Journal of The Royal Statistical Society Series B-statistical Methodology

TL;DR: In this paper, instead of selecting factors by stepwise backward elimination, the authors focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection.

...read moreread less

Abstract: Summary. We consider the problem of selecting grouped variables (factors) for accurate prediction in regression. Such a problem arises naturally in many practical situations with the multifactor analysis-of-variance problem as the most important and well-known example. Instead of selecting factors by stepwise backward elimination, we focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection. The lasso, the LARS algorithm and the non-negative garrotte are recently proposed regression methods that can be used to select individual variables. We study and propose efficient algorithms for the extensions of these methods for factor selection and show that these extensions give superior performance to the traditional stepwise backward elimination method in factor selection problems. We study the similarities and the differences between these methods. Simulations and real examples are used to illustrate the methods.

...read moreread less

7,400 citations

Journal Article•DOI•

Model selection and estimation in the Gaussian graphical model

[...]

Ming Yuan¹, Yi Lin²•Institutions (2)

Georgia Institute of Technology¹, University of Wisconsin-Madison²

01 Mar 2007-Biometrika

TL;DR: The implementation of the penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model is nontrivial, but it is shown that the computation can be done effectively by taking advantage of the efficient maxdet algorithm developed in convex optimization.

...read moreread less

Abstract: SUMMARY We propose penalized likelihood methods for estimating the concentration matrix in the Gaussian graphical model. The methods lead to a sparse and shrinkage estimator of the concentration matrix that is positive definite, and thus conduct model selection and estimation simultaneously. The implementation of the methods is nontrivial because of the positive definite constraint on the concentration matrix, but we show that the computation can be done effectively by taking advantage of the efficient maxdet algorithm developed in convex optimization. We propose a BIC-type criterion for the selection of the tuning parameter in the penalized likelihood methods. The connection between our methods and existing methods is illustrated. Simulations and real examples demonstrate the competitive performance of the new methods.

...read moreread less

1,824 citations

Journal Article•DOI•

High dimensional semiparametric gaussian copula graphical models

[...]

Han Liu¹, Fang Han², Ming Yuan³, John Lafferty⁴, Larry Wasserman⁴ - Show less +1 more•Institutions (4)

Johns Hopkins University¹, University of Minnesota², Georgia Institute of Technology³, Carnegie Mellon University⁴

01 Aug 2012-Annals of Statistics

TL;DR: In this article, the non-paranormal graphical models are used as a safe replacement of the popular Gaussian graphical models, even when the data are truly Gaussian, for graph recovery and parameter estimation.

...read moreread less

Abstract: ing the Spearman’s rho and Kendall’s tau. We prove that the nonparanormal skeptic achieves the optimal parametric rates of convergence for both graph recovery and parameter estimation. This result suggests that the nonparanormal graphical models can be used as a safe replacement of the popular Gaussian graphical models, even when the data are truly Gaussian. Besides theoretical analysis, we also conduct thorough numerical simulations to compare the graph recovery performance of dierent estimators under both ideal and noisy settings. The proposed methods are then applied on a largescale genomic dataset to illustrate their empirical usefulness. The R package huge implementing the proposed methods is available on the Comprehensive R Archive Network: http://cran.r-project.org/.

...read moreread less

521 citations

Journal Article•DOI•

Composite quantile regression and the oracle Model Selection Theory

[...]

Hui Zou, Ming Yuan¹•Institutions (1)

Georgia Institute of Technology¹

18 Jun 2008-arXiv: Statistics Theory

TL;DR: In this paper, a new regression method called composite quantile regression (CQR) was proposed, and the authors show that CQR can be much more efficient and sometimes arbitrarily more efficient than least squares.

...read moreread less

Abstract: Coefficient estimation and variable selection in multiple linear regression is routinely done in the (penalized) least squares (LS) framework. The concept of model selection oracle introduced by Fan and Li [J. Amer. Statist. Assoc. 96 (2001) 1348--1360] characterizes the optimal behavior of a model selection procedure. However, the least-squares oracle theory breaks down if the error variance is infinite. In the current paper we propose a new regression method called composite quantile regression (CQR). We show that the oracle model selection theory using the CQR oracle works beautifully even when the error variance is infinite. We develop a new oracular procedure to achieve the optimal properties of the CQR oracle. When the error variance is finite, CQR still enjoys great advantages in terms of estimation efficiency. We show that the relative efficiency of CQR compared to the least squares is greater than 70% regardless the error distribution. Moreover, CQR could be much more efficient and sometimes arbitrarily more efficient than the least squares. The same conclusions hold when comparing a CQR-oracular estimator with a LS-oracular estimator.

...read moreread less

458 citations

Journal Article•

High Dimensional Inverse Covariance Matrix Estimation via Linear Programming

[...]

Ming Yuan¹•Institutions (1)

Georgia Institute of Technology¹

01 Mar 2010-Journal of Machine Learning Research

TL;DR: An estimating procedure is proposed that can effectively exploit "sparsity" of the inverse covariance matrix and can be computed using linear programming and therefore has the potential to be used in very high dimensional problems.

...read moreread less

Abstract: This paper considers the problem of estimating a high dimensional inverse covariance matrix that can be well approximated by "sparse" matrices. Taking advantage of the connection between multivariate linear regression and entries of the inverse covariance matrix, we propose an estimating procedure that can effectively exploit such "sparsity". The proposed method can be computed using linear programming and therefore has the potential to be used in very high dimensional problems. Oracle inequalities are established for the estimation error in terms of several operator norms, showing that the method is adaptive to different types of sparsity of the problem.

...read moreread less

447 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37

Collapse

Cited by

PDF

Open Access

More filters

Book•

Compressed sensing

[...]

D.L. Donoho¹•Institutions (1)

Stanford University¹

01 Jan 2004

TL;DR: It is possible to design n=O(Nlog(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients, and a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program-Basis Pursuit in signal processing.

...read moreread less

Abstract: Suppose x is an unknown vector in Ropfm (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n=O(m1/4log5/2(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor)-so the coefficients belong to an lscrp ball for 0

...read moreread less

18,609 citations

Book•

Distributed Optimization and Statistical Learning Via the Alternating Direction Method of Multipliers

[...]

Stephen Boyd¹, Neal Parikh¹, Eric Chu¹, Borja Peleato¹, Jonathan Eckstein² - Show less +1 more•Institutions (2)

Stanford University¹, Rutgers University²

23 May 2011

TL;DR: It is argued that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas.

...read moreread less

Abstract: Many problems of recent interest in statistics and machine learning can be posed in the framework of convex optimization. Due to the explosion in size and complexity of modern datasets, it is increasingly important to be able to solve problems with a very large number of features or training examples. As a result, both the decentralized collection or storage of these datasets as well as accompanying distributed solution methods are either necessary or at least highly desirable. In this review, we argue that the alternating direction method of multipliers is well suited to distributed convex optimization, and in particular to large-scale problems arising in statistics, machine learning, and related areas. The method was developed in the 1970s, with roots in the 1950s, and is equivalent or closely related to many other algorithms, such as dual decomposition, the method of multipliers, Douglas–Rachford splitting, Spingarn's method of partial inverses, Dykstra's alternating projections, Bregman iterative algorithms for l1 problems, proximal methods, and others. After briefly surveying the theory and history of the algorithm, we discuss applications to a wide variety of statistical and machine learning problems of recent interest, including the lasso, sparse logistic regression, basis pursuit, covariance selection, support vector machines, and many others. We also discuss general distributed optimization, extensions to the nonconvex setting, and efficient implementation, including some details on distributed MPI and Hadoop MapReduce implementations.

...read moreread less

17,433 citations

Journal Article•DOI•

Regularization Paths for Generalized Linear Models via Coordinate Descent

[...]

Jerome H. Friedman¹, Trevor Hastie¹, Robert Tibshirani•Institutions (1)

Stanford University¹

02 Feb 2010-Journal of Statistical Software

TL;DR: In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.

...read moreread less

Abstract: We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multinomial regression problems while the penalties include l(1) (the lasso), l(2) (ridge regression) and mixtures of the two (the elastic net). The algorithms use cyclical coordinate descent, computed along a regularization path. The methods can handle large problems and can also deal efficiently with sparse features. In comparative timings we find that the new algorithms are considerably faster than competing methods.

...read moreread less

13,656 citations

Book•

Machine Learning : A Probabilistic Perspective

[...]

Kevin P. Murphy

24 Aug 2012

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

...read moreread less

Abstract: Today's Web-enabled deluge of electronic data calls for automated methods of data analysis. Machine learning provides these, developing methods that can automatically detect patterns in data and then use the uncovered patterns to predict future data. This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach. The coverage combines breadth and depth, offering necessary background material on such topics as probability, optimization, and linear algebra as well as discussion of recent developments in the field, including conditional random fields, L1 regularization, and deep learning. The book is written in an informal, accessible style, complete with pseudo-code for the most important algorithms. All topics are copiously illustrated with color images and worked examples drawn from such application domains as biology, text processing, computer vision, and robotics. Rather than providing a cookbook of different heuristic methods, the book stresses a principled model-based approach, often using the language of graphical models to specify models in a concise and intuitive way. Almost all the models described have been implemented in a MATLAB software package--PMTK (probabilistic modeling toolkit)--that is freely available online. The book is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.

...read moreread less

8,059 citations

Journal Article•DOI•

Statistics for Spatial Data.

[...]

Andrew B. Lawson¹, Noel A Cressie•Institutions (1)

University of Dundee¹

01 Mar 1993-The Statistician

6,278 citations