Journal ArticleDOI

Residual Ratio Thresholding for Linear Model Order Selection

TL;DR: In this article, the authors propose residual ratio thresholding (RRT) for model order selection (MOS) in linear regression models and provide a rigorous mathematical analysis of RRT for MOS.
Abstract: Model order selection (MOS) in linear regression models is a widely studied problem in signal processing. Penalized log-likelihood techniques based on information theoretic criteria (ITC) are the algorithms of choice in MOS problems. Recently, a number of model selection problems have been successfully solved with explicit finite sample guarantees using a concept called residual ratio thresholding (RRT). This paper proposes to use RRT for MOS in linear regression models and provides a rigorous mathematical analysis of RRT. RRT is numerically shown to deliver highly competitive performance compared to popular MOS criteria such as the Akaike information criterion, the Bayesian information criterion, and penalized adaptive likelihood, especially when the sample size is small. We also analytically establish an interesting interpretation of RRT in terms of ITC, thereby linking these two model selection principles.
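To make the residual-ratio rule concrete, here is a minimal sketch, assuming nested models given by the leading columns of a design matrix X and a fixed illustrative threshold gamma; the paper instead derives data-adaptive thresholds with explicit finite sample guarantees, and the function name rrt_order is hypothetical:

```python
import numpy as np

def rrt_order(y, X, k_max, gamma=0.5):
    # Residual norms of nested least-squares fits of order k = 1..k_max.
    r = []
    for k in range(1, k_max + 1):
        coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        r.append(np.linalg.norm(y - X[:, :k] @ coef))
    # Residual ratio RR(k) = r_k / r_{k-1} for k = 2..k_max.
    rr = {k: r[k - 1] / r[k - 2] for k in range(2, k_max + 1)}
    # Pick the largest order whose ratio falls below the threshold: once all
    # true regressors are included, subsequent ratios stay close to one.
    candidates = [k for k, v in rr.items() if v <= gamma]
    return max(candidates) if candidates else 1
```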
Citations
Journal ArticleDOI
TL;DR: In this paper, the authors proposed two techniques, namely residual ratio minimization (RRM) and residual ratio thresholding with adaptation (RRTA), to operate the OMP algorithm without a priori knowledge of the noise variance and signal sparsity.

2 citations

Posted Content
TL;DR: A novel technique called generalized residual ratio thresholding (GRRT) is presented for operating SOMP and BOMP without a priori knowledge of signal sparsity and noise variance, and finite sample and finite signal-to-noise ratio (SNR) guarantees for exact support recovery are derived.

Abstract: Simultaneous orthogonal matching pursuit (SOMP) and block OMP (BOMP) are two widely used techniques for sparse support recovery in multiple measurement vector (MMV) and block sparse (BS) models, respectively. For optimal performance, both SOMP and BOMP require a priori knowledge of the signal sparsity or noise variance. However, sparsity and noise variance are unavailable in most practical applications. This letter presents a novel technique called generalized residual ratio thresholding (GRRT) for operating SOMP and BOMP without a priori knowledge of signal sparsity and noise variance, and derives finite sample and finite signal-to-noise ratio (SNR) guarantees for exact support recovery. Numerical simulations indicate that GRRT performs similarly to BOMP and SOMP supplied with a priori knowledge of signal and noise statistics.

1 citation
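A hedged sketch of the idea behind GRRT as applied to SOMP: run the greedy selection up to a maximum size, then choose the model size where the Frobenius-norm residual ratio drops below a threshold. The fixed gamma below is a placeholder for the letter's derived threshold, and somp_rrt is an illustrative name:

```python
import numpy as np

def somp_rrt(Y, A, k_max, gamma=0.5):
    # Y: (n, L) multiple measurement vectors, A: (n, p) dictionary.
    support, residual = [], Y.copy()
    res_norms = [np.linalg.norm(Y)]
    for _ in range(k_max):
        # Select the atom most correlated with the residual across all
        # measurement vectors (SOMP's l2 selection rule).
        scores = np.linalg.norm(A.T @ residual, axis=1)
        scores[support] = -np.inf
        support.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(A[:, support], Y, rcond=None)
        residual = Y - A[:, support] @ coef
        res_norms.append(np.linalg.norm(residual))
    # Residual ratios over iterations; keep the largest size below gamma.
    ratios = [res_norms[k] / res_norms[k - 1] for k in range(1, len(res_norms))]
    k_hat = max((k + 1 for k, v in enumerate(ratios) if v <= gamma), default=1)
    return support[:k_hat]
```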

Journal ArticleDOI
TL;DR: This work proposes two new parallel iterative algorithms, extensions of the Gauss–Seidel algorithm (GSA), to solve large-scale regression problems involving many variables, as arise in big-data analytics with large matrices.
Abstract: In order to perform big-data analytics, regression involving large matrices is often necessary. In particular, large scale regression problems are encountered when one wishes to extract semantic patterns for knowledge discovery and data mining. When a large matrix can be processed in its factorized form, advantages arise in terms of computation, implementation, and data-compression. In this work, we propose two new parallel iterative algorithms as extensions of the Gauss–Seidel algorithm (GSA) to solve regression problems involving many variables. The convergence study in terms of error-bounds of the proposed iterative algorithms is also performed, and the required computation resources, namely time- and memory-complexities, are evaluated to benchmark the efficiency of the proposed new algorithms. Finally, the numerical results from both Monte Carlo simulations and real-world datasets are presented to demonstrate the striking effectiveness of our proposed new methods.

1 citation
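For context, a sketch of the classical sequential Gauss–Seidel iteration applied to the normal equations of least squares, the baseline that the paper parallelizes; the function name and iteration count are illustrative, and the diagonal of X^T X is assumed nonzero:

```python
import numpy as np

def gauss_seidel_regression(X, y, n_iter=200):
    # Solve (X^T X) beta = X^T y by cyclic coordinate updates.
    G, b = X.T @ X, X.T @ y
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        for j in range(len(beta)):
            # Update coordinate j using the latest values of the others:
            # beta_j = (b_j - sum_{i != j} G_{ji} beta_i) / G_{jj}.
            beta[j] = (b[j] - G[j] @ beta + G[j, j] * beta[j]) / G[j, j]
    return beta
```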

Proceedings ArticleDOI
20 Jul 2022
TL;DR: Three model order selection algorithms whose penalty terms depend on the observed data are described, and these algorithms are applied to the problem of estimating the number of signals with unknown amplitudes.
Abstract: Model order selection problems are important for signal processing and its various applications in wireless communications, radar theory, navigation, control theory, and others. We describe three model order selection algorithms whose penalty terms depend on the observed data and apply these algorithms to the problem of estimating the number of signals with unknown amplitudes. Performance analysis for these algorithms is carried out using the error probability as the performance measure. Based on the results of the performance analysis, we find optimal values of the tuning parameters of the algorithms where needed and compare all the algorithms with one another.

1 citation
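As background for data-dependent penalties, here is a generic penalized-criterion selector under Gaussian noise, with the classical fixed AIC and BIC penalties as examples; the algorithms studied in the paper replace these fixed penalties with ones computed from the observed data. All names below are illustrative:

```python
import numpy as np

def itc_order(y, X, k_max, penalty):
    # Minimize n*log(RSS_k / n) + penalty(k, n) over nested orders k,
    # i.e. the concentrated Gaussian log-likelihood plus a penalty term.
    n = len(y)
    crit = []
    for k in range(1, k_max + 1):
        coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
        rss = np.sum((y - X[:, :k] @ coef) ** 2)
        crit.append(n * np.log(rss / n) + penalty(k, n))
    return int(np.argmin(crit)) + 1

aic = lambda k, n: 2 * k          # Akaike information criterion
bic = lambda k, n: k * np.log(n)  # Bayesian information criterion
```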

Proceedings ArticleDOI
26 Jul 2022
TL;DR: A class of signal parameters is described for which the widely used maximum likelihood method is useless for estimating the number of signals, and it is established that the amplitude parameters belong to this class.
Abstract: Model order selection is an important stage in many technical areas. Estimating the number of signals with unknown parameters is a special case of the model order selection problem. We describe a class of signal parameters for which we show that the widely used maximum likelihood method is useless for estimating the number of signals. It is established that the amplitude parameters belong to this class. Therefore, we study the estimation problem for the number of signals with unknown amplitudes in discrete time. Five algorithms for estimating the number of signals are described, including a new one. Finally, we provide a performance analysis of these algorithms, comparing them using both analytical and numerical approaches.
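A toy numerical illustration of why bare maximum likelihood fails for amplitude parameters: with nested linear (amplitude) models, the residual sum of squares is non-increasing in the model order, so the maximized Gaussian likelihood never decreases and ML alone always favors the largest candidate order. The sizes and coefficients below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_k = 100, 2
X = rng.standard_normal((n, 8))
y = X[:, :true_k] @ np.array([3.0, -2.0]) + rng.standard_normal(n)

# RSS decreases monotonically with k even past the true order 2,
# so an unpenalized likelihood criterion selects k = 8 here.
for k in range(1, 9):
    coef, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
    print(k, round(np.sum((y - X[:, :k] @ coef) ** 2), 2))
```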
References
Journal ArticleDOI
TL;DR: f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program), and numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted.
Abstract: This paper considers a natural error correcting problem with real-valued input/output. We wish to recover an input vector $f \in \mathbb{R}^n$ from corrupted measurements $y = Af + e$. Here, $A$ is an $m \times n$ (coding) matrix and $e$ is an arbitrary and unknown vector of errors. Is it possible to recover $f$ exactly from the data $y$? We prove that under suitable conditions on the coding matrix $A$, the input $f$ is the unique solution to the $\ell_1$-minimization problem ($\|x\|_{\ell_1} := \sum_i |x_i|$) $\min_{g \in \mathbb{R}^n} \|y - Ag\|_{\ell_1}$ provided that the support of the vector of errors is not too large, $\|e\|_{\ell_0} := |\{i : e_i \neq 0\}| \leq \rho \cdot m$ for some $\rho > 0$. In short, $f$ can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program). In addition, numerical experiments suggest that this recovery procedure works unreasonably well; $f$ is recovered exactly even in situations where a significant fraction of the output is corrupted. This work is related to the problem of finding sparse solutions to vastly underdetermined systems of linear equations. There are also significant connections with the problem of recovering signals from highly incomplete measurements. In fact, the results introduced in this paper improve on our earlier work. Finally, underlying the success of $\ell_1$ is a crucial property we call the uniform uncertainty principle that we shall describe in detail.

6,853 citations
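The $\ell_1$-minimization step recasts as a standard linear program: minimize $\sum_i t_i$ subject to $-t \leq y - Ag \leq t$. A minimal sketch using scipy.optimize.linprog, with l1_decode an illustrative name:

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(A, y):
    # Variables z = [g (n), t (m)]; objective sum(t).
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])
    # A g - t <= y  and  -A g - t <= -y  encode  |y - A g| <= t.
    G = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    h = np.concatenate([y, -y])
    bounds = [(None, None)] * n + [(0, None)] * m
    res = linprog(c, A_ub=G, b_ub=h, bounds=bounds, method="highs")
    return res.x[:n]
```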

Book
30 Nov 2012
TL;DR: This book covers a much wider range of topics than a typical introductory text on mathematical statistics, and includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses.
Abstract: WINNER OF THE 2005 DEGROOT PRIZE! This book is for people who want to learn probability and statistics quickly. It brings together many of the main ideas in modern statistics in one place. The book is suitable for students and researchers in statistics, computer science, data mining and machine learning. This book covers a much wider range of topics than a typical introductory text on mathematical statistics. It includes modern topics like nonparametric curve estimation, bootstrapping and classification, topics that are usually relegated to follow-up courses. The reader is assumed to know calculus and a little linear algebra. No previous knowledge of probability and statistics is required. The text can be used at the advanced undergraduate and graduate level.

1,540 citations

Journal ArticleDOI
TL;DR: It is shown that under conditions on the mutual incoherence and the minimum magnitude of the nonzero components of the signal, the support of the signal can be recovered exactly by the OMP algorithm with high probability.
Abstract: We consider the orthogonal matching pursuit (OMP) algorithm for the recovery of a high-dimensional sparse signal based on a small number of noisy linear measurements. OMP is an iterative greedy algorithm that selects at each step the column, which is most correlated with the current residuals. In this paper, we present a fully data driven OMP algorithm with explicit stopping rules. It is shown that under conditions on the mutual incoherence and the minimum magnitude of the nonzero components of the signal, the support of the signal can be recovered exactly by the OMP algorithm with high probability. In addition, we also consider the problem of identifying significant components in the case where some of the nonzero components are possibly small. It is shown that in this case the OMP algorithm will still select all the significant components before possibly selecting incorrect ones. Moreover, with modified stopping rules, the OMP algorithm can ensure that no zero components are selected.

1,093 citations
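A minimal sketch of the OMP iteration with a correlation-based stopping rule; the threshold tau stands in for the paper's explicit noise-dependent stopping rules, and the function name is illustrative:

```python
import numpy as np

def omp(A, y, tau):
    # Greedily add the column most correlated with the residual,
    # stopping once the maximum correlation drops below tau.
    support, residual = [], y.copy()
    while True:
        corr = np.abs(A.T @ residual)
        corr[support] = 0.0  # never reselect a chosen column
        j = int(np.argmax(corr))
        if corr[j] <= tau or len(support) >= A.shape[1]:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    return support
```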

Journal ArticleDOI
TL;DR: The parametric (or model-based) methods of signal processing often require not only the estimation of a vector of real-valued parameters but also the selection of one or several integer-valued parameters that are equally important for the specification of a data model.
Abstract: The parametric (or model-based) methods of signal processing often require not only the estimation of a vector of real-valued parameters but also the selection of one or several integer-valued parameters that are equally important for the specification of a data model. Examples of these integer-valued parameters of the model include the orders of an autoregressive moving average model, the number of sinusoidal components in a sinusoids-in-noise signal, and the number of source signals impinging on a sensor array. In each of these cases, the integer-valued parameters determine the dimension of the parameter vector of the data model, and they must be estimated from the data.

1,075 citations

01 Jan 1997
TL;DR: In this paper, the authors considered the problem of selecting a linear model to approximate the true unknown regression model; some necessary and/or sufficient conditions are established for the asymptotic validity of various model selection procedures such as Akaike's AIC, Mallows' Cp, Shibata's FPEλ, Schwarz's BIC, generalized AIC, and cross-validation.
Abstract: In the problem of selecting a linear model to approximate the true unknown regression model, some necessary and/or sufficient conditions are established for the asymptotic validity of various model selection procedures such as Akaike's AIC, Mallows' Cp, Shibata's FPEλ, Schwarz's BIC, generalized AIC, cross-validation, and generalized cross-validation. It is found that these selection procedures can be classified into three classes according to their asymptotic behavior. Under some fairly weak conditions, the selection procedures in one class are asymptotically valid if there exist fixed-dimension correct models; the selection procedures in another class are asymptotically valid if no fixed-dimension correct model exists. The procedures in the third class are compromises of the procedures in the first two classes. Some empirical results are also presented.

595 citations