
Complexity analysis of interior point algorithms for non-Lipschitz and nonconvex minimization

TLDR
A first order interior point algorithm is proposed for a class of non-Lipschitz and nonconvex minimization problems with box constraints, which arise from applications in variable selection and regularized optimization; the objective function value is reduced monotonically along the iteration points.
Abstract
We propose a first order interior point algorithm for a class of non-Lipschitz and nonconvex minimization problems with box constraints, which arise from applications in variable selection and regularized optimization. The objective functions of these problems are continuously differentiable typically at interior points of the feasible set. Our first order algorithm is easy to implement and the objective function value is reduced monotonically along the iteration points. We show that the worst-case iteration complexity for finding an ε scaled first order stationary point is O(ε^{-2}). Furthermore, we develop a second order interior point algorithm using the Hessian matrix, and solve a quadratic program with a ball constraint at each iteration. Although the second order interior point algorithm costs more computational time than the first order algorithm in each iteration, its worst-case iteration complexity for finding an ε scaled second order stationary point is reduced to O(ε^{-3/2}). Note that an ε scaled second order stationary point must also be an ε scaled first order stationary point.



Mathematical Programming manuscript No.
(will be inserted by the editor)
Complexity Analysis of Interior Point Algorithms
for Non-Lipschitz and Nonconvex Minimization
Wei Bian · Xiaojun Chen · Yinyu Ye
July 25, 2012
Abstract We propose a first order interior point algorithm for a class of non-Lipschitz and nonconvex minimization problems with box constraints, which arise from applications in variable selection and regularized optimization. The objective functions of these problems are continuously differentiable typically at interior points of the feasible set. Our algorithm is easy to implement and the objective function value is reduced monotonically along the iteration points. We show that the worst-case complexity for finding an ε scaled first order stationary point is O(ε^{-2}). Moreover, we develop a second order interior point algorithm using the Hessian matrix, and solve a quadratic program with a ball constraint at each iteration. Although the second order interior point algorithm costs more computational time than the first order algorithm in each iteration, its worst-case complexity for finding an ε scaled second order stationary point is reduced to O(ε^{-3/2}). An ε scaled second order stationary point is also an ε scaled first order stationary point.
Keywords constrained non-Lipschitz optimization · complexity analysis ·
interior point method · first order algorithm · second order algorithm
This work was supported partly by Hong Kong Research Council Grant PolyU5003/10p, The Hong Kong Polytechnic University Postdoctoral Fellowship Scheme and the National Natural Science Foundation of China (11101107, 10971043).
Wei Bian
Department of Mathematics, Harbin Institute of Technology, Harbin, China. Current ad-
dress: Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong
Kong. E-mail: bianweilvse520@163.com.
Xiaojun Chen
Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong,
China. E-mail: maxjchen@polyu.edu.hk
Yinyu Ye
Department of Management Science and Engineering, Stanford University, Stanford, CA
94305. E-mail: yinyu-ye@stanford.edu

Mathematics Subject Classification (2000) 90C30 · 90C26 · 65K05 ·
49M37
1 Introduction
In this paper, we consider the following optimization problem:

    min  f(x) = H(x) + λ Σ_{i=1}^n ϕ(x_i^p)
    s.t. x ∈ Ω = {x : 0 ≤ x ≤ b},                                      (1)

where H : R^n → R is continuously differentiable, ϕ : R_+ → R is continuous and concave, λ > 0, 0 < p < 1, and b = (b_1, b_2, . . . , b_n)^T with b_i > 0, i = 1, 2, . . . , n. Moreover, ϕ is continuously differentiable in R_{++} and ϕ(0) = 0. Without loss of generality, we assume that a minimizer of (1) exists and min_{x∈Ω} f(x) ≥ 0.
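To make the setting concrete, the following is a minimal Python sketch of evaluating f in (1) for one common instance, taking H(x) = (1/2)‖Ax − c‖² as the data fitting term and ϕ(t) = t. The data A, c, λ, p, b below are illustrative choices of ours, not taken from the paper.

    import numpy as np

    # Minimal sketch: evaluate f(x) = H(x) + lam * sum(phi(x_i^p)) from (1) for the
    # common instance H(x) = 0.5*||Ax - c||^2 and phi(t) = t. Data are illustrative.
    rng = np.random.default_rng(0)
    m, n = 5, 3
    A, c = rng.standard_normal((m, n)), rng.standard_normal(m)
    lam, p = 0.1, 0.5
    b = np.ones(n)                          # box upper bounds of Omega = [0, b]

    def f(x):
        H = 0.5 * np.sum((A @ x - c) ** 2)  # smooth data-fitting term
        penalty = lam * np.sum(x ** p)      # concave, non-Lipschitz regularizer (needs x >= 0)
        return H + penalty

    x = 0.5 * b                             # an interior point of the box
    print(f(x))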
Problem (1) is nonsmooth, nonconvex, and non-Lipschitz, and has been used extensively in image restoration, signal processing and variable selection; see, e.g., [8,11,13,19]. The function H(x) is often used as a data fitting term, while the function Σ_{i=1}^n ϕ(x_i^p) is used as a regularization term. Numerical experiments indicate that these types of problems can be solved effectively for finding a local minimizer or stationary point. However, little theoretical complexity or convergence rate analysis is known for these problems, in contrast to the complexity study of convex optimization over the past thirty years.
There have been few results on the complexity analysis of nonconvex optimization problems. Using an interior-point algorithm, Ye [17] proved that an ε scaled KKT or first order stationary point of general quadratic programming can be computed in O(ε^{-1} log(ε^{-1})) iterations, where each iteration solves a ball-constrained or trust-region quadratic program that is equivalent to a simplex convex quadratic minimization problem. He also proved that, as ε → 0, the iterative sequence converges to a point satisfying the scaled second order necessary optimality condition. The same complexity result was extended to linearly constrained concave minimization by Ge et al. [10].
Cartis, Gould and Toint [2] estimated the worst-case complexity of a first order trust-region or quadratic regularization method for solving the following unconstrained nonsmooth, nonconvex minimization problem

    min_{x∈R^n} Φ_h(x) := H(x) + h(c(x)),

where h : R^m → R is convex but may be nonsmooth and c : R^n → R^m is continuously differentiable. They showed that their method takes at most O(ε^{-2}) steps to reduce the size of a first order criticality measure below ε, which is of the same order as the worst-case complexity of steepest-descent methods applied to unconstrained, nonconvex smooth optimization. However, f in (1) cannot be written in the form of Φ_h.

Garmanjani and Vicente [9] proposed a class of smoothing direct-search methods for the unconstrained optimization of nonsmooth functions by applying a direct-search method to the smoothing function f̃ of the objective function f [3]. When f is locally Lipschitz, the smoothing direct-search method [9] takes at most O(ε^{-3} log ε^{-1}) iterations to find an x such that ‖∇f̃(x, µ)‖ ≤ ε and µ ≤ ε, where µ is the smoothing parameter. As µ ↓ 0, f̃(x, µ) → f(x) and ∇f̃(x, µ) → v with v ∈ ∂f(x).
Recently, Bian and Chen [1] proposed a smoothing sequential quadratic programming (SSQP) algorithm for solving the following non-Lipschitz unconstrained minimization problem

    min_{x∈R^n} f_0(x) := H(x) + λ Σ_{i=1}^n ϕ(|x_i|^p).               (2)

At each step, the SSQP algorithm solves a strongly convex quadratic minimization problem with a diagonal Hessian matrix, which has a simple closed-form solution. The SSQP algorithm is easy to implement and its worst-case complexity for reaching an ε scaled stationary point is O(ε^{-2}).
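The closed form referred to above is easy to see in the simplest case: an unconstrained strongly convex quadratic with a diagonal Hessian separates coordinate-wise. The sketch below only illustrates this general point; the actual SSQP subproblem in [1] has additional structure that is not shown here, and the data are illustrative.

    import numpy as np

    # Sketch: minimizing q(d) = <g, d> + 0.5 * d^T D d with D diagonal and positive
    # definite separates into n scalar problems, so the minimizer is d_i = -g_i / D_ii.
    g = np.array([1.0, -2.0, 0.5])
    D = np.array([2.0, 4.0, 1.0])           # diagonal entries of the Hessian, all > 0

    d = -g / D                              # coordinate-wise closed-form minimizer
    print(d)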
Obviously, the objective functions f of (1) and f_0 of (2) are identical in R^n_+. Moreover, f is smooth in the interior of R^n_+. Note that problem (2) includes the l_2-l_p problem

    min_{x∈R^n} ‖Ax − c‖^2 + λ Σ_{i=1}^n |x_i|^p                       (3)

as a special case, where A ∈ R^{m×n}, c ∈ R^m, and p ∈ (0, 1).
In this paper, we analyze the worst-case complexity of interior point methods for solving problem (1). We propose a first order interior point algorithm whose worst-case complexity for finding an ε scaled first order stationary point is O(ε^{-2}), and we develop a second order interior point algorithm whose worst-case complexity for finding an ε scaled second order stationary point is O(ε^{-3/2}).
Our paper is organized as follows. In Section 2, a first order interior point algorithm is proposed for solving (1), which only uses ∇f and a Lipschitz constant of ∇H on Ω and is easy to implement. Every iterate x^k > 0 belongs to Ω and the objective function is monotonically decreasing along the generated sequence {x^k}. Moreover, the algorithm produces an ε scaled first order stationary point of (1) in at most O(ε^{-2}) steps. In Section 3, a second order interior point algorithm is given to solve a special case of (1), where H is twice continuously differentiable, ϕ(t) := t and Ω = {x : x ≥ 0}. By using the Hessian of H, the second order interior point algorithm can generate an interior ε scaled second order stationary point in at most O(ε^{-3/2}) steps.
Throughout this paper, K = {0, 1, 2, . . .}, I = {1, 2, . . . , n}, I_b = {i ∈ {1, 2, . . . , n} : b_i < +∞} and e_n = (1, 1, . . . , 1)^T ∈ R^n. For x ∈ R^n, A = (a_{ij})_{m×n} ∈ R^{m×n} and q > 0, we write |A|^q = (|a_{ij}|^q)_{m×n}, x^q = (x_1^q, x_2^q, . . . , x_n^q)^T, [x_i]_{i=1}^n := x, |x| = (|x_1|, |x_2|, . . . , |x_n|)^T, ‖x‖ = ‖x‖_2 := (x_1^2 + x_2^2 + · · · + x_n^2)^{1/2} and diag(x) = diag(x_1, x_2, . . . , x_n). For two matrices A, B ∈ R^{n×n}, we write A ⪰ B when A − B is positive semi-definite.
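A small numpy rendering of this notation may help; the arrays below are illustrative only.

    import numpy as np

    # Entrywise powers, diag, and the positive semidefinite ordering used above.
    A = np.array([[1.0, -2.0], [0.5, 3.0]])
    x = np.array([4.0, 9.0])
    q = 0.5

    abs_A_q = np.abs(A) ** q                # |A|^q = (|a_ij|^q)
    x_q = x ** q                            # x^q = (x_1^q, ..., x_n^q)^T
    X = np.diag(x)                          # diag(x)

    def psd_geq(M, N, tol=1e-10):           # "M >= N" here: M - N is positive semidefinite
        return np.all(np.linalg.eigvalsh(M - N) >= -tol)

    print(abs_A_q, x_q, psd_geq(X, np.eye(2)), sep="\n")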

2 First Order Method
In this section, we propose a first order interior point algorithm for solving (1), which uses a first order reduction technique and keeps all iterates x^k > 0 in the feasible set Ω. We show that the objective function value f(x^k) is monotonically decreasing along the sequence {x^k} generated by the algorithm, and that the worst-case complexity of the algorithm for generating an interior ε scaled first order stationary point of (1) is O(ε^{-2}), which is of the same order as the worst-case complexity of steepest-descent methods for nonconvex smooth optimization problems. Moreover, it is worth noting that the proposed first order interior point algorithm is easy to implement, and the computational cost of each step is small.
Throughout this section, we need the following assumptions.

Assumption 2.1: ∇H is globally Lipschitz on Ω with a Lipschitz constant β. In particular, when I_b ≠ ∅ we choose β such that β ≥ max_{i∈I_b} 1/b_i and β ≥ 1.

Assumption 2.2: For any given x^0 ∈ Ω, there is R ≥ 1 such that

    sup{‖x‖ : f(x) ≤ f(x^0), x ∈ Ω} ≤ R.

When H(x) = (1/2)‖Ax − c‖^2, Assumption 2.1 holds with β = ‖A^T A‖. Assumption 2.2 holds if Ω is bounded, or if H(x) ≥ 0 for all x ∈ Ω and ϕ(t) → ∞ as t → ∞.
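For the quadratic H mentioned in the remark above, the constants in the assumptions can be computed directly; the following sketch uses illustrative data and a bounded box, and the choice R = max(1, ‖b‖) is our own simple bound rather than anything prescribed by the paper.

    import numpy as np

    # Sketch: for H(x) = 0.5*||Ax - c||^2, grad H is Lipschitz with beta = ||A^T A||
    # (spectral norm); on a bounded box [0, b], R can be taken as max(1, ||b||).
    rng = np.random.default_rng(1)
    A = rng.standard_normal((6, 4))
    b = 2.0 * np.ones(4)                          # finite upper bounds, so I_b = {1,...,n}

    beta = np.linalg.norm(A.T @ A, 2)             # Lipschitz constant of grad H
    beta = max(beta, 1.0, float(np.max(1.0 / b))) # enforce beta >= 1 and beta >= max 1/b_i
    R = max(1.0, float(np.linalg.norm(b)))        # bounds ||x|| over the level set in Omega

    print(beta, R)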
2.1 First Order Necessary Conditions
Note that for problem (1), when 0 < p < 1, the Clarke generalized gradient of ϕ(|s|^p) does not exist at 0. Inspired by the scaled first and second order necessary conditions for local minimizers of unconstrained non-Lipschitz optimization in [5,6], we give scaled first and second order necessary conditions for local minimizers of the constrained non-Lipschitz optimization problem (1) in this section. Then, for any ε ∈ (0, 1], the ε scaled first order and second order stationary points of (1) can be deduced directly.

First, for ε > 0, an ε global minimizer of (1) is defined as a feasible solution x_ε with 0 ≤ x_ε ≤ b and

    f(x_ε) ≤ min_{0≤x≤b} f(x) + ε.

It has been proved in [4] that finding a global minimizer of the unconstrained l_2-l_p minimization problem (3) is strongly NP hard. For the unconstrained l_2-l_p problem (3), any local minimizer x satisfies the first order necessary condition [6]

    X A^T (Ax − c) + λp|x|^p = 0,                                      (4)

and the second order necessary condition

    X A^T A X + λp(p − 1)|X|^p ⪰ 0,                                    (5)

where X = diag(x) and |X|^p = diag(|x_1|^p, . . . , |x_n|^p).
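As a sanity check, the left-hand sides of the two scaled conditions (4) and (5) are straightforward to evaluate numerically at a candidate point; the data and the candidate x below are illustrative, so the printed residual need not be zero.

    import numpy as np

    # Sketch: evaluate the left-hand sides of the scaled conditions (4) and (5) for the
    # l2-lp problem (3) at a candidate point x. Data and x are illustrative only.
    A = np.array([[1.0, 0.0], [1.0, 1.0]])
    c = np.array([1.0, 2.0])
    lam, p = 0.05, 0.5
    x = np.array([0.9, 1.1])                      # candidate point (not a verified minimizer)

    X = np.diag(x)
    lhs4 = X @ A.T @ (A @ x - c) + lam * p * np.abs(x) ** p               # vector in (4)
    lhs5 = X @ A.T @ A @ X + lam * p * (p - 1) * np.diag(np.abs(x) ** p)  # matrix in (5)

    print(np.linalg.norm(lhs4))                   # zero at a point satisfying (4)
    print(np.linalg.eigvalsh(lhs5))               # all nonnegative when (5) holds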

For (1), if x is a local minimizer of (1) at which f is differentiable, then x satisfies
(i) [∇f(x)]_i = 0 if x_i ≠ b_i;
(ii) [∇f(x)]_i ≤ 0 if x_i = b_i.
Although [∇f(x)]_i does not exist when x_i = 0, one can see that, as x_i → 0+, [∇f(x)]_i → +∞.
Denote

    X∇f(x) = X∇H(x) + λp[ϕ′(x_i^p) x_i^p]_{i=1}^n.

Similarly, using X as a scaling matrix, if x is a local minimizer of (1), then x satisfies the scaled first order necessary condition

    [X∇f(x)]_i = 0 if x_i ≠ b_i;                                       (6a)
    [∇f(x)]_i ≤ 0 if x_i = b_i.                                        (6b)
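The effect of the scaling is easy to see numerically: as a component x_i approaches 0, the corresponding entry of ∇f(x) blows up while the entry of X∇f(x) stays bounded. The sketch below uses ϕ(t) = t and illustrative data of our own choosing.

    import numpy as np

    # Sketch: compare [grad f(x)]_1 and [X grad f(x)]_1 as x_1 -> 0+ for
    # f(x) = 0.5*||Ax - c||^2 + lam * sum(x_i^p), i.e. phi(t) = t. Illustrative data.
    A = np.array([[1.0, 0.0], [0.0, 2.0]])
    c = np.array([1.0, 1.0])
    lam, p = 0.1, 0.5

    def grad_f(x):                                 # valid for x > 0
        return A.T @ (A @ x - c) + lam * p * x ** (p - 1)

    for t in [1e-1, 1e-3, 1e-6]:
        x = np.array([t, 1.0])
        g = grad_f(x)
        print(t, g[0], (x * g)[0])                 # unscaled entry diverges, scaled entry -> 0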
Now we can define an ε scaled first order stationary point of (1).

Definition 1 For a given 0 < ε ≤ 1, we call x ∈ Ω an ε scaled first order stationary point of (1) if there is δ > 0 such that
(i) |[X∇f(x)]_i| ≤ ε if x_i < b_i − δ;
(ii) [∇f(x)]_i ≤ ε if x_i ≥ b_i − δ.

Definition 1 is consistent with the first order necessary conditions in (6a)-(6b) with ε = 0. Moreover, Definition 1 is consistent with the first order necessary conditions given in [6] with ε = 0 for unconstrained optimization.
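A direct transcription of Definition 1 into a check is given below; the inequality direction in (ii) follows the reconstruction of (6b) above, and the inputs (the gradient, δ, ε) are assumed to be supplied by the caller at a point x > 0 where f is differentiable.

    import numpy as np

    # Sketch of the check in Definition 1. grad is grad f(x); the scaled gradient
    # [X grad f(x)]_i equals x_i * grad_i. All inputs are illustrative assumptions.
    def is_eps_scaled_first_order_point(x, grad, b, eps, delta):
        scaled = x * grad                          # [X grad f(x)]_i
        interior = x < b - delta                   # components covered by condition (i)
        cond_i = np.all(np.abs(scaled[interior]) <= eps)
        cond_ii = np.all(grad[~interior] <= eps)   # components covered by condition (ii)
        return bool(cond_i and cond_ii)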
2.2 First Order Interior Point Algorithm
Note that for any x, x^+ ∈ (0, b], Assumption 2.1 implies that

    H(x^+) ≤ H(x) + ⟨∇H(x), x^+ − x⟩ + (β/2)‖x^+ − x‖^2.               (7)

Since ϕ is concave on [0, +∞), for any s, t ∈ (0, +∞),

    ϕ(t) ≤ ϕ(s) + ⟨∇ϕ(s), t − s⟩.                                      (8)

Thus, for any x, x^+ ∈ (0, b], we obtain

    f(x^+) ≤ f(x) + ⟨∇f(x), x^+ − x⟩ + (β/2)‖x^+ − x‖^2.               (9)

Let X d_x = x^+ − x. We obtain

    f(x^+) ≤ f(x) + ⟨X∇f(x), d_x⟩ + (β/2)‖X d_x‖^2.                    (10)
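Inequality (10) can be checked numerically for a concrete instance; the sketch below takes H(x) = (1/2)‖Ax − c‖², ϕ(t) = t and β = ‖A^T A‖, all of which are illustrative choices consistent with the remark after Assumption 2.2.

    import numpy as np

    # Sketch: verify f(x+) <= f(x) + <X grad f(x), d_x> + (beta/2)*||X d_x||^2 from (10)
    # for H(x) = 0.5*||Ax - c||^2, phi(t) = t, beta = ||A^T A||, and X d_x = x+ - x.
    rng = np.random.default_rng(2)
    A = rng.standard_normal((5, 3))
    c = rng.standard_normal(5)
    lam, p = 0.1, 0.5
    beta = np.linalg.norm(A.T @ A, 2)

    f = lambda x: 0.5 * np.sum((A @ x - c) ** 2) + lam * np.sum(x ** p)
    grad_f = lambda x: A.T @ (A @ x - c) + lam * p * x ** (p - 1)

    x = 0.6 * np.ones(3)                           # interior point of (0, b]
    x_plus = np.array([0.4, 0.7, 0.5])             # another interior point
    d_x = (x_plus - x) / x                         # X d_x = x+ - x with X = diag(x)
    rhs = f(x) + (x * grad_f(x)) @ d_x + 0.5 * beta * np.sum((x * d_x) ** 2)
    print(f(x_plus) <= rhs + 1e-12)                # inequality (10) should hold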
We now use the reduction idea to solve the constrained non-Lipschitz optimization problem (1). To achieve a reduction of the objective function, we
