Open Access · Journal Article · DOI

Implementation of algorithms for tuning parameters in regularized least squares problems in system identification

Tianshi Chen, +1 more
01 Jul 2013 · Vol. 49, Iss. 7, pp. 2213-2220
TLDR
This work investigates the implementation of algorithms for solving the hyper-parameter estimation problem that can deal with both large data sets and possibly ill-conditioned computations, and proposes a QR-factorization-based, matrix-inversion-free algorithm to evaluate the cost function in an efficient and accurate way.
About
This article is published in Automatica. The article was published on 2013-07-01 and is currently open access. It has received 126 citations to date. The article focuses on the topics: QR decomposition & System identification.


Citations
Journal ArticleDOI

Kernel methods in system identification, machine learning and function estimation: A survey

TL;DR: This survey covers kernel-based regularization and its connections with reproducing kernel Hilbert spaces and Bayesian estimation of Gaussian processes, demonstrating that learning techniques tailored to the specific features of dynamic systems may outperform conventional parametric approaches for identification of stable linear systems.
Journal ArticleDOI

System Identification Via Sparse Multiple Kernel-Based Regularization Using Sequential Convex Optimization Techniques

TL;DR: A multiple kernel-based regularization method is proposed to handle model estimation and structure detection with short data records and it is shown that the locally optimal solutions lead to good performance for randomly generated starting points.
Journal ArticleDOI

A shift in paradigm for system identification

TL;DR: The purpose of this contribution is to provide an accessible account of the main ideas and results of kernel-based regularisation methods for system identification.
Journal ArticleDOI

On kernel design for regularized LTI system identification

TL;DR: This paper proposes two methods to design kernels, one from a machine learning perspective and one from a system theory perspective, and provides analysis results for both; the analysis not only enhances the understanding of existing kernels but also guides the design of new ones.
Journal ArticleDOI

Maximum entropy properties of discrete-time first-order stable spline kernel

TL;DR: In this article, the authors formulate the exact maximum entropy problem solved by the first-order stable spline (SS-1) kernel without Gaussian and uniform sampling assumptions, and derive the special structure of the SS-1 kernel under general sampling assumptions.
References
Book

System Identification: Theory for the User

Lennart Ljung
TL;DR: The book treats system identification in the theoretical areas that bear directly on the understanding and practical application of the various identification methods.
Book

Applied Regression Analysis

TL;DR: In this article, the Straight Line Case is used to fit a straight line by least squares, and the Durbin-Watson Test is used for checking the straight line fit.
Book

Gaussian Processes for Machine Learning

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
Book

Linear statistical inference and its applications

TL;DR: Algebra of Vectors and Matrices, Probability Theory, Tools and Techniques, and Continuous Probability Models.
Book

System identification

Frequently Asked Questions (13)
Q1. What have the authors contributed in "Implementation of algorithms for tuning parameters in regularized least squares problems in system identification" ?

There has recently been a trend to study linear system identification with high-order finite impulse response (FIR) models using the regularized least-squares approach.
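For context, the regularized least-squares FIR estimate can be sketched directly in NumPy. This is an illustrative version under assumed notation (input u, kernel matrix P(α), noise variance σ²), not the paper's implementation:

```python
import numpy as np

def regularized_ls_fir(u, y, P, sigma2):
    """Illustrative regularized least-squares FIR estimate.

    u, y   : input/output data of length N
    P      : n x n prior covariance (kernel) matrix P(alpha)
    sigma2 : noise variance
    Returns theta_hat = (Phi^T Phi + sigma2 * P^{-1})^{-1} Phi^T y,
    where row k of Phi holds the regressor [u[k], u[k-1], ..., u[k-n+1]].
    """
    n = P.shape[0]
    N = len(y)
    Phi = np.zeros((N, n))
    for k in range(N):
        for j in range(n):
            if k - j >= 0:
                Phi[k, j] = u[k - j]
    # Normal-equations form; fine for a sketch, though the paper's point
    # is precisely that more careful factorizations are needed in practice.
    A = Phi.T @ Phi + sigma2 * np.linalg.inv(P)
    return np.linalg.solve(A, Phi.T @ y)
```

With noiseless data and a small σ², the estimate recovers the true impulse response almost exactly.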

When seeking efficient algorithms to compute the cost function in (6) for very large N with N ≫ n, the numerical accuracy depends on the conditioning and the magnitude of the matrices P(α) and Φ_N Φ_N^T.

For the impulse response estimation problem, the matrix A^T A = σ^2 I_n + L^T Φ_N Φ_N^T L (17) can be ill-conditioned due to two problems.

The command fmincon is used here to solve the nonconvex optimization problem (6), with the trust-region-reflective algorithm selected.
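The paper's implementation is in MATLAB; as a rough open-source stand-in (an assumption, not the authors' code), the bounded hyper-parameter search can be sketched with SciPy's L-BFGS-B and a toy one-parameter kernel P(α) = αI:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_marglik(alpha, Phi, y, sigma2, kernel):
    """Cost of type (6): negative log marginal likelihood of Y (up to
    constants) for hyper-parameters alpha, with Phi of shape n x N and
    Y = Phi^T theta + V."""
    P = kernel(alpha)                                  # n x n kernel matrix
    S = Phi.T @ P @ Phi + sigma2 * np.eye(len(y))      # cov of Y
    _, logdet = np.linalg.slogdet(S)
    return float(y @ np.linalg.solve(S, y) + logdet)

def tune(Phi, y, sigma2, n, bounds=((1e-3, 1e3),)):
    """Bounded local search over alpha; the kernel P(alpha) = alpha * I
    is purely illustrative, not one of the paper's kernels."""
    kernel = lambda a: a[0] * np.eye(n)
    res = minimize(neg_log_marglik, x0=[1.0],
                   args=(Phi, y, sigma2, kernel),
                   method="L-BFGS-B", bounds=bounds)
    return res.x[0]
```

As with fmincon, this finds only a local optimum of the nonconvex cost, so the starting point matters.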

For each data set, the authors aim to estimate FIR model (3) with n = 125 using the regularized least squares (5) including the empirical Bayes method (6). 

For the “DC” kernel (31c), further assume 0.72 ≤ λ < 1 and −0.99 ≤ ρ ≤ 0.99 so that the condition number of the DC kernel is smaller than 2.0×10^20.
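Assuming the standard DC-kernel parameterization P[j,k] = c λ^((j+k)/2) ρ^(|j−k|) (the exact form of (31c) is not reproduced on this page), the ill-conditioning that these constraints guard against is easy to observe numerically:

```python
import numpy as np

def dc_kernel(n, c, lam, rho):
    """DC kernel in its common form P[j,k] = c * lam**((j+k)/2) * rho**|j-k|
    with 1-based indices j, k = 1..n (an assumed parameterization)."""
    j = np.arange(1, n + 1)
    return (c * lam ** ((j[:, None] + j[None, :]) / 2)
              * rho ** np.abs(j[:, None] - j[None, :]))

P = dc_kernel(50, 1.0, 0.72, 0.99)
# Diagonal entries decay like lam**j, so the condition number blows up
# quickly as n grows, even within the constrained (lam, rho) box.
cond = np.linalg.cond(P)
```

Even at the boundary values λ = 0.72, ρ = 0.99 and a modest n = 50, the condition number already exceeds 10^6, which motivates the paper's factorization-based evaluation of the cost.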

It is well known, e.g. (Golub & Van Loan, 1996, Section 5), that the least squares problem (15) can be solved more accurately with the QR factorization method than with the Cholesky factorization method when the matrix A^T A defined in (17) is ill-conditioned.
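The point can be illustrated by solving the stacked least-squares problem via a thin QR factorization instead of forming A^T A explicitly. This is a sketch under assumed shapes (Φ is n×N with Y = Φ^T θ + V, and P(α) = LL^T):

```python
import numpy as np

def solve_rls_qr(Phi, y, L, sigma):
    """Solve min_x ||y - Phi^T L x||^2 + sigma^2 ||x||^2 with a thin QR
    factorization of the stacked matrix A = [Phi^T L; sigma * I], then
    return theta = L x.  A^T A = sigma^2 I + L^T Phi Phi^T L is never
    formed, avoiding the squaring of the condition number."""
    n = L.shape[1]
    A = np.vstack([Phi.T @ L, sigma * np.eye(n)])
    b = np.concatenate([y, np.zeros(n)])
    Q, R = np.linalg.qr(A)           # thin QR: A = Q R, R upper triangular
    x = np.linalg.solve(R, Q.T @ b)  # back-substitution of R x = Q^T b
    return L @ x
```

On well-conditioned data this agrees with the normal-equations solution; the QR route only pulls ahead numerically once A^T A becomes ill-conditioned.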

The linear least squares problem to estimate linear regressions is one of the most basic estimation problems, and there is an extensive literature around it, e.g. (Rao, 1973; Daniel & Wood, 1980; Draper & Smith, 1981). 

So one may question if the extra constraints can eventually cause performance loss in the regularized least squares estimate (5b). 

The magnitude of L^T Φ_N Φ_N^T L can be very large if the element of α that controls the magnitude of P(α) is large, which is often the case for the stable spline kernel (Pillonetto & De Nicolao, 2010; Pillonetto et al., 2011).

It is the maximum likelihood method to estimate α from (4) under the (Bayesian) assumptions that θ is Gaussian with zero mean and covariance matrix P(α) and V_N is Gaussian with zero mean and covariance matrix σ^2 I_N.
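Under these assumptions, Y = Φ_N^T θ + V_N is zero-mean Gaussian with covariance Φ_N^T P(α) Φ_N + σ² I_N, which is what makes the marginal-likelihood cost well defined. A quick Monte-Carlo sanity check (illustrative numbers, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, sigma2 = 3, 4, 0.25
Phi = rng.standard_normal((n, N))   # regression matrix, Y = Phi^T theta + V
P = np.array([[2.0, 0.5, 0.0],      # an arbitrary valid prior covariance
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 0.5]])

# Implied marginal covariance of Y under the Bayesian assumptions:
Sigma_theory = Phi.T @ P @ Phi + sigma2 * np.eye(N)

# Empirical covariance from sampled (theta, V) pairs:
M = 200000
Theta = rng.multivariate_normal(np.zeros(n), P, size=M)   # theta ~ N(0, P)
V = np.sqrt(sigma2) * rng.standard_normal((M, N))         # V ~ N(0, sigma2 I)
Y = Theta @ Phi + V
Sigma_emp = (Y.T @ Y) / M
```

The empirical covariance of the sampled Y matches Φ^T P Φ + σ² I to within Monte-Carlo error, confirming the marginal model behind the empirical Bayes cost.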

In this case, as discussed in (Chen, Ohlsson & Ljung, 2012), the authors can first estimate, with the Maximum Likelihood/Prediction Error Method e.g. (Ljung, 1999), a low-order “base-line model” that can take care of the dominating part of the impulse response. 

The authors then use regularized least squares (based on Algorithm 2) to estimate an FIR model with reasonably large n, which should capture the residual (fast decaying) dynamics (Chen, Ohlsson & Ljung, 2012).