Journal ArticleDOI

Tucker factorization with missing data with application to low-n-rank tensor completion

01 Jul 2015 - Multidimensional Systems and Signal Processing (Springer US) - Vol. 26, Iss. 3, pp. 677-692
TL;DR: This paper proposes a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-n-rank tensor completion and demonstrates in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated.
Abstract: The problem of tensor completion arises often in signal processing and machine learning. It consists of recovering a tensor from a subset of its entries. The usual structural assumption on a tensor that makes the problem well posed is that the tensor has low rank in every mode. Several tensor completion methods based on minimization of nuclear norm, which is the closest convex approximation of rank, have been proposed recently, with applications mostly in image inpainting problems. It is often stated in these papers that methods based on Tucker factorization perform poorly when the true ranks are unknown. In this paper, we propose a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-n-rank tensor completion. The algorithm is similar to a previously proposed method for PARAFAC decomposition with missing data. We demonstrate in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated. Approximate reconstruction can be obtained when the ranks are underestimated. The algorithm outperforms nuclear norm minimization methods when the fraction of known elements of a tensor is low.

Summary (1 min read)

1 Introduction

  • The low-rank matrix completion problem was studied extensively in recent years (Recht et al 2010; Candes and Recht 2009).
  • In Subsection 1.1, the authors review basics of tensor notation and terminology.
  • The emphasis is on synthetic experiments, which are used to demonstrate the efficiency of the proposed method on exactly low-rank problems.
  • In (Liu et al 2009), the authors introduced an extension of the nuclear norm to tensors.
  • Therefore, one approach to tensor completion could be to use one of these algorithms, and an approximation of the complete tensor can be obtained from its Tucker factorization.

2 Proposed approach

  • The authors suppose that they have some estimations of true ranks.
  • Any first-order optimization method can be used for minimizing fW in (15) with respect to one of the parameters.
  • For this reason, the authors have used the nonlinear conjugate gradient method, as in (Acar et al 2011), from the Poblano toolbox (Dunlavy et al 2010) for MATLAB.
  • The core and the factors are initialized by the HOSVD algorithm (De Lathauwer et al 2000; Kolda and Bader 2009), applied to the initial approximation X̂ (a standalone sketch of the overall loop follows this list).
  • The experiments show that an accurate reconstruction of tensor X can be obtained when the true ranks are overestimated (for example, r̂i = ri + 10).
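
To make the summary concrete, the following is a minimal NumPy sketch of a completion loop of this kind. It is an illustration, not the authors' implementation: instead of minimizing the weighted objective with nonlinear conjugate gradients (the Poblano-based approach described above), it alternates a truncated HOSVD fit with re-imputation of the missing entries; tucker_complete is a hypothetical name, and the helpers unfold, hosvd and tucker_to_tensor are sketched in Subsections 1.1 and 1.2 below.

import numpy as np

def tucker_complete(T_obs, mask, ranks, n_iter=200):
    # T_obs: tensor holding the observed values (arbitrary where mask is False),
    # mask:  boolean tensor, True at observed entries,
    # ranks: (over)estimates (r_1, ..., r_N) of the true n-ranks.
    X = np.where(mask, T_obs, T_obs[mask].mean())  # crude initial fill
    for _ in range(n_iter):
        # Refit a rank-(r_1, ..., r_N) Tucker model to the current estimate.
        G, factors = hosvd(X, ranks)
        X_hat = tucker_to_tensor(G, factors)
        # Keep the observed entries fixed; impute the rest from the model.
        X = np.where(mask, T_obs, X_hat)
    return X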

3 Experimental results

  • Several experiments were performed on synthetic and realistic data to demonstrate the efficiency of the proposed method.
  • The maximal number of iterations was set to 5000, although the algorithm stabilized (reached a stationary point) in far fewer iterations in all experiments.
  • It can be seen that good quality of reconstruction (better than using the nuclear norm minimization) can be obtained when the true ranks are over- or even underestimated.

4 Conclusions

  • The authors have proposed a Tucker factorization-based approach to low-n-rank tensor completion, following a similar approach to (Acar et al 2011), where it was used for PARAFAC decomposition with missing data.
  • It was demonstrated that the proposed method can recover the underlying low-n-rank tensor even when the true tensor ranks are unknown.
  • An important assumption was that the true ranks can be overestimated.
  • Of course, there are no theoretical guarantees for the proposed method (since it is based on non-convex optimization), which is its main flaw.
  • The authors would like to thank the project leader Ivica Kopriva.


Tucker factorization with missing data with application to low-n-rank tensor completion
Marko Filipović · Ante Jukić
Abstract The problem of tensor completion arises often in signal processing and machine learning. It consists of recovering a tensor from a subset of its entries. The usual structural assumption on a tensor that makes the problem well posed is that the tensor has low rank in every mode. Several tensor completion methods based on minimization of nuclear norm, which is the closest convex approximation of rank, have been proposed recently, with applications mostly in image inpainting problems. It is often stated in these papers that methods based on Tucker factorization perform poorly when the true ranks are unknown. In this paper, we propose a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-n-rank tensor completion. The algorithm is similar to a previously proposed method for PARAFAC decomposition with missing data. We demonstrate in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated. Approximate reconstruction can be obtained when the ranks are underestimated. The algorithm outperforms nuclear norm minimization methods when the fraction of known elements of a tensor is low.
Keywords Tucker factorization · Tensor completion · Low-n-rank tensor · Missing data
Mathematics Subject Classification (2000) MSC 68U99
M. Filipović
Rudjer Bošković Institute
Bijenička 54, 10000 Zagreb, Croatia
Tel.: +385-1-4571241
Fax: +385-1-4680104
E-mail: filipov@irb.hr

A. Jukić
Department of Medical Physics and Acoustics
University of Oldenburg
26111 Oldenburg, Germany
E-mail: ante.jukic@uni-oldenburg.de

1 Introduction
The low-rank matrix completion problem was studied extensively in recent years (Recht et al 2010; Candes and Recht 2009). It arises naturally in many practical problems when one would like to recover a matrix from a subset of its entries. On the other hand, in many applications one is dealing with multi-way data, which are naturally represented by tensors. Tensors are multi-dimensional arrays, i.e. higher-order generalizations of matrices. Multi-way data analysis was originally developed in the fields of psychometrics and chemometrics, but nowadays it also has applications in signal processing, machine learning and data analysis. Here, we are interested in the problem of recovering a partially observed tensor, or the tensor completion problem. Examples of applications where the problem arises include image occlusion/inpainting problems, social network data analysis, network traffic data analysis, bibliometric data analysis, spectroscopy, multidimensional NMR (Nuclear Magnetic Resonance) data analysis, EEG (electroencephalogram) data analysis and many others. For a more detailed description of applications, the interested reader is referred to (Acar et al 2011) and references therein.
In the matrix case, it is often realistic to assume that the matrix that we want to reconstruct from a subset of its entries has low rank. This assumption enables matrix completion from only a small number of its entries. However, the rank function is discrete and nonconvex, which makes its optimization hard in practice. Therefore, the nuclear norm has been used in many papers as its approximation. The nuclear norm is defined as the sum of singular values of a matrix, and it is the tightest convex lower bound of the rank of a matrix on the set of matrices $\{Y : \|Y\|_2 \le 1\}$ (here, $\|\cdot\|_2$ denotes the usual matrix 2-norm). When the rank is replaced by the nuclear norm, the resulting problem of nuclear norm minimization is convex, and, as shown in (Candes and Recht 2009), if the matrix rank is low enough, the solution of the original (rank minimization) problem can be found by minimizing the nuclear norm. In several recent papers on tensor completion, the definition of nuclear norm was extended to tensors. There, it was stated that methods based on Tucker factorization perform poorly when the true ranks of the tensor are unknown. In this paper, we propose a method for Tucker factorization with missing data, with application in tensor completion. We demonstrate in several numerical experiments that the method performs well even when the true ranks are significantly overestimated. Namely, it can estimate the exact ranks from the data. Also, it outperforms nuclear norm minimization methods when the fraction of known elements of a tensor is low.
The rest of the paper is organized as follows. In Subsection 1.1, we review basics of tensor notation and terminology. Problem setting and previous work are described in Subsection 1.3. We describe our approach in Section 2. In Subsections 2.1 and 2.2, details related to the optimization method and the implementation of the algorithm are described. Several numerical experiments are presented in Section 3. The emphasis is on synthetic experiments, which are used to demonstrate the efficiency of the proposed method on exactly low-rank problems. However, we also perform some experiments on realistic data. Conclusions are presented in Section 4.

1.1 Tensor notation and terminology
We denote scalars by regular lowercase or uppercase letters, vectors by bold lowercase letters, matrices by bold uppercase letters, and tensors by bold Euler script letters. For more details on tensor notation and terminology, the reader is also referred to (Kolda and Bader 2009).
The order of a tensor is the number of its dimensions (also called ways or modes). We denote the vector space of tensors of order N and size $I_1 \times \cdots \times I_N$ by $\mathbb{R}^{I_1 \times \cdots \times I_N}$. Elements of a tensor X of order N are denoted by $x_{i_1 \ldots i_N}$.
A fiber of a tensor is defined as a vector obtained by fixing all indices but one. Fibers are generalizations of matrix columns and rows. Mode-n fibers are obtained by fixing all indices but the n-th. Mode-n matricization (unfolding) of a tensor X, denoted as $X_{(n)}$, is obtained by arranging all mode-n fibers as columns of a matrix. The precise order in which fibers are stacked as columns is not important as long as it is consistent. Folding is the inverse operation of matricization/unfolding.
The mode-n product of a tensor X and a matrix A is denoted by $X \times_n A$. It is defined as

$$Y = X \times_n A \quad \Leftrightarrow \quad Y_{(n)} = A X_{(n)}.$$

The mode-n product is commutative (when applied in distinct modes), i.e.

$$(X \times_n A) \times_m B = (X \times_m B) \times_n A$$

for $m \neq n$. Repeated mode-n products can be expressed as

$$(X \times_n A) \times_n B = X \times_n (BA).$$
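
As an illustration of these definitions, here is a small NumPy sketch. The function names are ours; since, as noted above, any consistent fiber ordering works, this sketch simply moves mode n to the front before flattening.

import numpy as np

def unfold(X, n):
    # Mode-n matricization: mode-n fibers become the columns of a matrix.
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(M, n, shape):
    # Inverse of unfold, for a tensor of the given full shape.
    lead = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(lead), 0, n)

def mode_n_product(X, A, n):
    # Y = X x_n A, defined through Y_(n) = A X_(n).
    shape = list(X.shape)
    shape[n] = A.shape[0]
    return fold(A @ unfold(X, n), n, shape)

# Commutativity in distinct modes (here modes 0 and 1):
X = np.random.randn(3, 4, 5)
A, B = np.random.randn(6, 3), np.random.randn(7, 4)
assert np.allclose(mode_n_product(mode_n_product(X, A, 0), B, 1),
                   mode_n_product(mode_n_product(X, B, 1), A, 0))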
There are several definitions of tensor rank. In this paper, we are interested in the n-rank. For an N-way tensor X, the n-rank is defined as the rank of $X_{(n)}$. If we denote $r_n = \operatorname{rank} X_{(n)}$, for $n = 1, \ldots, N$, we say that X is a rank-$(r_1, \ldots, r_N)$ tensor. In the experimental section (Section 3) of this paper we denote an estimation of the n-rank of a given tensor X by $\hat{r}_n$.
For completeness, we also state the usual definition of the rank of a tensor. We say that an N-way tensor $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is rank-1 if it can be written as the outer product of N vectors, i.e.

$$X = a^{(1)} \circ \cdots \circ a^{(N)}, \qquad (1)$$

where $\circ$ denotes the vector outer product. Elementwise, (1) is written as $x_{i_1 \ldots i_N} = a^{(1)}_{i_1} \cdots a^{(N)}_{i_N}$, for all $1 \le i_n \le I_n$. The tensor rank of X is defined as the minimal number of rank-1 tensors that generate X as their sum. As opposed to the n-rank of a tensor, the tensor rank is hard to compute (Håstad 1990).
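
In contrast, the n-ranks are just matrix ranks of the unfoldings and are cheap to compute; a sketch using the unfold and mode_n_product helpers above:

def n_ranks(X):
    # n-rank: the ordinary matrix rank of each mode-n unfolding.
    return [np.linalg.matrix_rank(unfold(X, n)) for n in range(X.ndim)]

# A tensor with n-ranks (2, 2, 2), built from a 2x2x2 core by mode products:
G = np.random.randn(2, 2, 2)
T = G
for n, size in enumerate((10, 11, 12)):
    T = mode_n_product(T, np.random.randn(size, 2), n)
print(n_ranks(T))  # [2, 2, 2] with probability one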
The Hadamard product of tensors is the componentwise product, i.e. for N-way tensors X, Y, it is defined as $(X \ast Y)_{i_1 \ldots i_N} = x_{i_1 \ldots i_N} \, y_{i_1 \ldots i_N}$.
The Frobenius norm of a tensor X of size $I_1 \times \cdots \times I_N$ is denoted by $\|X\|_F$ and defined as

$$\|X\|_F = \left( \sum_{i_1 = 1}^{I_1} \cdots \sum_{i_N = 1}^{I_N} x^2_{i_1 \ldots i_N} \right)^{1/2}. \qquad (2)$$

1.2 Tensor factorizations/decompositions
Two of the most often used tensor factorizations/decompositions are the PARAFAC (parallel factors) decomposition and the Tucker factorization. The PARAFAC decomposition is also called the canonical decomposition (CANDECOMP) or the CANDECOMP/PARAFAC (CP) decomposition. For a given tensor $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, it is defined as a decomposition of X as a linear combination of a minimal number of rank-1 tensors:

$$X = \sum_{r=1}^{R} \lambda_r \, a^{(1)}_r \circ \cdots \circ a^{(N)}_r. \qquad (3)$$

For more details regarding the PARAFAC decomposition, the reader is referred to (Kolda and Bader 2009), since here we are interested in the Tucker factorization.
The Tucker factorization (also called N-mode PCA or higher-order SVD) of a tensor X can be written as

$$X = G \times_1 A_1 \times_2 A_2 \cdots \times_N A_N, \qquad (4)$$

where $G \in \mathbb{R}^{J_1 \times \cdots \times J_N}$ is the core tensor with $J_i \le I_i$, for $i = 1, \ldots, N$, and $A_i$, $i = 1, \ldots, N$, are, usually orthogonal, factor matrices. The factor matrices $A_i$ are of size $I_i \times r_i$, for $i = 1, \ldots, N$, if X is rank-$(r_1, \ldots, r_N)$. A tensor that has low rank in every mode can be represented by its Tucker factorization with a small core tensor (whose dimensions correspond to the ranks in the corresponding modes). The mode-n matricization $X_{(n)}$ of X in (4) can be written as

$$X_{(n)} = A^{(n)} G_{(n)} \left( A^{(N)} \otimes \cdots \otimes A^{(n+1)} \otimes A^{(n-1)} \otimes \cdots \otimes A^{(1)} \right)^T, \qquad (5)$$

where $G_{(n)}$ denotes the mode-n matricization of G, $\otimes$ denotes the Kronecker product of matrices, and $M^T$ denotes the transpose of a matrix M. If the factor matrices $A^{(i)}$ are constrained to be orthogonal, then they can be interpreted as the principal components in the corresponding modes, while the elements of the core tensor G show the level of interaction between different modes. In general, the Tucker factorization is not unique. However, in practical applications some constraints are often imposed on the core and the factors to obtain a meaningful factorization, for example orthogonality, non-negativity or sparsity. For more details, the reader is referred to (Tucker 1966; Kolda and Bader 2009; De Lathauwer et al 2000).
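
Continuing the earlier sketch, the factorization (4) and a truncated higher-order SVD (the HOSVD used, per the summary above, to initialize the core and factors) can be written as follows; hosvd and tucker_to_tensor are our illustrative names.

def tucker_to_tensor(G, factors):
    # Evaluate X = G x_1 A_1 x_2 A_2 ... x_N A_N, as in (4).
    X = G
    for n, A in enumerate(factors):
        X = mode_n_product(X, A, n)
    return X

def hosvd(X, ranks):
    # Truncated HOSVD: A_n holds the leading left singular vectors of X_(n);
    # the core is G = X x_1 A_1^T x_2 ... x_N A_N^T (factors are orthogonal).
    factors = []
    for n, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, n), full_matrices=False)
        factors.append(U[:, :r])
    G = X
    for n, A in enumerate(factors):
        G = mode_n_product(G, A.T, n)
    return G, factors

For a tensor that is exactly rank-(r_1, ..., r_N), tucker_to_tensor(*hosvd(X, ranks)) recovers X up to floating-point error.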
1.3 Problem definition and previous work
Let us denote by $T \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ a tensor that is low-rank in every mode (a low-n-rank tensor), and by $T_\Omega$ the projection of T onto the indices of observed entries. Here, $\Omega$ is a subset of $\{1, \ldots, I_1\} \times \{1, \ldots, I_2\} \times \cdots \times \{1, \ldots, I_N\}$, consisting of the positions of observed tensor entries. The problem of low-n-rank tensor completion was formulated in (Gandy et al 2011) as

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_N}} \; \sum_{n=1}^{N} \operatorname{rank}\left( X_{(n)} \right) \quad \text{subject to} \quad X_\Omega = T_\Omega. \qquad (6)$$
Some other function of the n-ranks of a tensor can also be considered here, for example any linear combination of the n-ranks. Nuclear norm minimization approaches to tensor completion, described in the following, are based on this type of problem formulation. Namely, the idea is to replace $\operatorname{rank}(X_{(n)})$ with the nuclear norm of $X_{(n)}$. This leads to the problem formulation

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_N}} \; \sum_{n=1}^{N} \left\| X_{(n)} \right\|_* \quad \text{subject to} \quad X_\Omega = T_\Omega.$$
Here, for a given matrix $X \in \mathbb{R}^{d_1 \times d_2}$, $\|X\|_*$ denotes the nuclear norm (or trace norm) of X, defined as $\|X\|_* = \sum_{i=1}^{\min(d_1, d_2)} \sigma_i(X)$, where $\sigma_i(X)$ are the singular values of X. The corresponding unconstrained formulation is

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_N}} \; \sum_{n=1}^{N} \left\| X_{(n)} \right\|_* + \frac{\lambda}{2} \left\| X_\Omega - T_\Omega \right\|_2^2, \qquad (7)$$

where $\lambda > 0$ is a regularization parameter. This problem formulation was used in (Gandy et al 2011). In (Tomioka et al 2011; see footnote 1), the similar formulation

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_N}} \; \sum_{n=1}^{N} \gamma_n \left\| X_{(n)} \right\|_* + \frac{1}{2\lambda} \left\| X_\Omega - T_\Omega \right\|_2^2, \qquad (8)$$

where $\lambda > 0$ and $\gamma_n \ge 0$ are parameters, was used. In (Liu et al 2013), the formulation
$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_N}} \; \sum_{n=1}^{N} \alpha_n \left\| X_{(n)} \right\|_* + \frac{\gamma_n}{2} \left\| X_{(n)} - T_{(n)} \right\|_F^2 \quad \text{subject to} \quad X_\Omega = T_\Omega, \qquad (9)$$

where again $\alpha_n \ge 0$ and $\gamma_n > 0$ are parameters, was used. In the above equations (7), (8) and (9), we have used the notation from the corresponding papers. Therefore, note that $\lambda$ in (7) and (8), i.e. $\gamma_n$ in (8) and (9), have different interpretations.
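
The formulations (7)-(9) are typically attacked with algorithms whose core step is the proximal operator of the matrix nuclear norm, i.e. singular value thresholding applied to an unfolding. A minimal sketch of that operator (illustrative, not any specific published implementation):

def svt(M, tau):
    # Proximal operator of tau * nuclear norm: soft-threshold the singular
    # values of M and reassemble the matrix.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

Since a full SVD of every unfolding is needed in each iteration, this step dominates the cost of the nuclear norm methods (the computational complexity issue mentioned in the FAQ below).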
The first paper that proposed an extension of the low-rank matrix completion concept to tensors seems to be (Liu et al 2009). There, the authors introduced an extension of the nuclear norm to tensors. They focused on the n-rank, and defined the nuclear norm of a tensor as the average of the nuclear norms of its unfoldings. In the subsequent paper (Liu et al 2013), they defined the nuclear norm of a tensor more generally, as a convex combination of the nuclear norms of its unfoldings. Similar approaches were used in (Gandy et al 2011) and (Tomioka et al 2011).
In (Liu et al 2013), three algorithms were proposed. Simple low rank tensor completion (SiLRTC) is a block coordinate descent method that is guaranteed to find the optimal solution since the objective is convex. To improve its convergence speed, the authors in (Liu et al 2013) proposed another algorithm: fast low rank tensor completion (FaLRTC). FaLRTC uses a smoothing scheme to convert the original nonsmooth problem into a smooth one. Then, an acceleration scheme is used to improve the convergence speed of the algorithm. Finally, the authors also proposed the highly accurate low rank tensor completion (HaLRTC) algorithm, which applies the alternating direction method of multipliers (ADMM) to low rank tensor completion problems. It was shown to be slower than FaLRTC, but it can achieve higher accuracy. A similar algorithm was derived in (Gandy et al 2011).
1. R. Tomioka, K. Hayashi and H. Kashima: Estimation of low-rank tensors via convex optimization. Technical report, http://arxiv.org/abs/1010.0789

Citations
Journal ArticleDOI
TL;DR: It is shown how constrained multiblock tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data.
Abstract: With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this paper, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multi-block multiway (tensor) data. We show how constrained multi-block tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multi-block data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.

153 citations


Cites methods from "Tucker factorization with missing d..."

  • ...Several recently proposed tensor completion methods including the high accuracy low rank tensor completion (HaLRTC) method [103], the generalized higher-order orthogonal iteration (gHOOI) method [122], and the one based on weighted Tucker (WTucker) decomposition [123] were also used for comparisons....

    [...]

Journal ArticleDOI
TL;DR: A modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics characterized by diverse variety, large volume, and high velocity is provided.
Abstract: Tensor completion is a problem of filling the missing or unobserved entries of partially observed tensors. Due to the multidimensional character of tensors in describing complex datasets, tensor completion algorithms and their applications have received wide attention and achievement in areas like data mining, computer vision, signal processing, and neuroscience. In this survey, we provide a modern overview of recent advances in tensor completion algorithms from the perspective of big data analytics characterized by diverse variety, large volume, and high velocity. We characterize these advances from the following four perspectives: general tensor completion algorithms, tensor completion with auxiliary information (variety), scalable tensor completion algorithms (volume), and dynamic tensor completion algorithms (velocity). Further, we identify several tensor completion applications on real-world data-driven problems and present some common experimental frameworks popularized in the literature along with several available software repositories. Our goal is to summarize these popular methods and introduce them to researchers and practitioners for promoting future research and applications. We conclude with a discussion of key challenges and promising research directions in this community for future exploration.

145 citations

Journal ArticleDOI
06 Jan 2016
TL;DR: In this paper, a review of existing matrix-based component analysis methods for the joint analysis of multiblock data with a focus on biomedical applications is provided, and the authors discuss their important extensions and generalization to multi-lock multiway (tensor) data.
Abstract: With the increasing availability of various sensor technologies, we now have access to large amounts of multiblock (also called multiset, multirelational, or multiview) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this article, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multiblock multiway (tensor) data. We show how constrained multiblock tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multiblock data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.

137 citations

Proceedings Article
13 Feb 2017
TL;DR: It is argued that low-rank constraint, albeit useful, is not effective enough to exploit the local smooth and piecewise priors of visual data, and proposed integrating total variation into low- rank tensor completion (LRTC) to address the drawback.
Abstract: With the advance of acquisition techniques, plentiful higherorder tensor data sets are built up in a great variety of fields such as computer vision, neuroscience, remote sensing and recommender systems. The real-world tensors often contain missing values, which makes tensor completion become a prerequisite to utilize them. Previous studies have shown that imposing a low-rank constraint on tensor completion produces impressive performances. In this paper, we argue that low-rank constraint, albeit useful, is not effective enough to exploit the local smooth and piecewise priors of visual data. We propose integrating total variation into low-rank tensor completion (LRTC) to address the drawback. As LRTC can be formulated by both tensor unfolding and tensor decomposition, we develop correspondingly two methods, namely LRTC-TV-I and LRTC-TVII, and their iterative solvers. Extensive experimental results on color image and medical image inpainting tasks show the effectiveness and superiority of the two methods against state-of-the-art competitors.

107 citations


Cites background or methods from "Tucker factorization with missing d..."

  • ...2014), while the other line performs CP or Tucker decomposition and try to make the decomposition factors be low-rank (Chen, Hsu, and Liao 2014; Liu et al. 2014a; Filipović and Jukić 2015; Zhao, Zhang, and Cichocki 2015; Liu et al. 2014b)....

    [...]

  • ...Similarly, a weighted Tucker model (WTucker) is introduced for LRTC (Filipović and Jukić 2015)....

    [...]


  • ...…Tomioka, Hayashi, and Kashima 2010; Signoretto et al. 2014), while the other line performs CP or Tucker decomposition and try to make the decomposition factors be low-rank (Chen, Hsu, and Liao 2014; Liu et al. 2014a; Filipović and Jukić 2015; Zhao, Zhang, and Cichocki 2015; Liu et al. 2014b)....

    [...]

Journal ArticleDOI
TL;DR: An overview of the emerging field of massive MIMO localization is provided, which can be used to meet the requirements of 5G, by exploiting different spatial signatures of users.

97 citations

References
Journal ArticleDOI
TL;DR: This survey provides an overview of higher-order tensor decompositions, their applications, and available software.
Abstract: This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or $N$-way array. Decompositions of higher-order tensors (i.e., $N$-way arrays with $N \geq 3$) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal component analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox, Tensor Toolbox, and Multilinear Engine are examples of software packages for working with tensors.

9,227 citations


"Tucker factorization with missing d..." refers background in this paper

  • ...The first paper that proposed an extension of low-rank matrix completion concept to tensors seems to be (Liu et al 2009)....

    [...]

Proceedings ArticleDOI
07 Jul 2001
TL;DR: In this paper, the authors present a database containing ground truth segmentations produced by humans for images of a wide variety of natural scenes, and define an error measure which quantifies the consistency between segmentations of differing granularities.
Abstract: This paper presents a database containing 'ground truth' segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.

6,505 citations

Journal ArticleDOI
TL;DR: It is proved that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries, and that objects other than signals and images can be perfectly reconstructed from very limited information.
Abstract: We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys $$m\ge C\,n^{1.2}r\log n$$ for some positive numerical constant C, then with very high probability, most n×n matrices of rank r can be perfectly recovered by solving a simple convex optimization program. This program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large. However, if one replaces the 1.2 exponent with 1.25, then the result holds for all values of the rank. Similar results hold for arbitrary rectangular matrices as well. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.

5,274 citations


"Tucker factorization with missing d..." refers background in this paper

  • ...Low-rank matrix completion problem was studied extensively in recent years (Recht et al 2010; Candes and Recht 2009)....

    [...]

  • ...When the rank is replaced by nuclear norm, the resulting problem of nuclear norm minimization is convex, and, as shown in (Candes and Recht 2009), if the matrix rank is low enough, the solution of the original (rank minimization) problem can be found by minimizing the nuclear norm....

    [...]

Journal ArticleDOI
TL;DR: There is a strong analogy between several properties of the matrix and the higher-order tensor decomposition; uniqueness, link with the matrix eigenvalue decomposition, first-order perturbation effects, etc., are analyzed.
Abstract: We discuss a multilinear generalization of the singular value decomposition. There is a strong analogy between several properties of the matrix and the higher-order tensor decomposition; uniqueness, link with the matrix eigenvalue decomposition, first-order perturbation effects, etc., are analyzed. We investigate how tensor symmetries affect the decomposition and propose a multilinear generalization of the symmetric eigenvalue decomposition for pair-wise symmetric tensors.

4,101 citations


"Tucker factorization with missing d..." refers background or methods in this paper

  • ...For more details, the reader is referred to (Tucker 1966; Kolda and Bader 2009; De Lathauwer et al 2000)....

    [...]

  • ...The core and the factors are initialized by HOSVD algorithm (De Lathauwer et al 2000; Kolda and Bader 2009), applied to the initial approximation X̂....

    [...]

Journal ArticleDOI
TL;DR: The model for three-mode factor analysis is discussed in terms of newer applications of mathematical processes including a type of matrix process termed the Kronecker product and the definition of combination variables.
Abstract: The model for three-mode factor analysis is discussed in terms of newer applications of mathematical processes including a type of matrix process termed the Kronecker product and the definition of combination variables. Three methods of analysis to a type of extension of principal components analysis are discussed. Methods II and III are applicable to analysis of data collected for a large sample of individuals. An extension of the model is described in which allowance is made for unique variance for each combination variable when the data are collected for a large sample of individuals.

3,810 citations


"Tucker factorization with missing d..." refers background in this paper

  • ...For more details, the reader is referred to (Tucker 1966; Kolda and Bader 2009; De Lathauwer et al 2000)....

    [...]

Frequently Asked Questions (15)
Q1. What are the contributions in "Tucker factorization with missing data with application to low-n-rank tensor completion" ?

In this paper, the authors propose a simple algorithm for Tucker factorization of a tensor with missing data and its application to low-n-rank tensor completion. The authors demonstrate in several numerical experiments that the proposed algorithm performs well even when the ranks are significantly overestimated. 

Since the proposed approach is based on unconstrained optimization, possible extensions include introducing some constraints on the factors in the model, for example orthogonality or non-negativity. 

The nonlinear conjugate gradient method, implemented in the Poblano toolbox (Dunlavy et al 2010), was used for optimization because of its speed.

Examples of applications where the problem arises include image occlusion/inpainting problems, social network data analysis, network traffic data analysis, bibliometric data analysis, spectroscopy, multidimensional NMR (Nuclear Magnetic Resonance) data analysis, EEG (electroencephalogram) data analysis and many others. 

In the first setting, the size of the tensor was 20 × 20 × 20 × 20 × 20, all n-mode ranks were set to 2, and the fraction of known entries of the tensor was 0.2.

The problem with the approaches that use nuclear norm is their computational complexity, since in every iteration the singular value decomposition (SVD) needs to be computed. 

When the rank is replaced by nuclear norm, the resulting problem of nuclear norm minimization is convex, and, as shown in (Candes and Recht 2009), if the matrix rank is low enough, the solution of the original (rank minimization) problem can be found by minimizing the nuclear norm. 

The objective function used in (Tomasi and Bro 2005; Acar et al 2011) was of the form (for 3-way tensors) $f_W(A, B, C) = \sum_{i=1}^{I} \sum_{j=1}^{J} \sum_{k=1}^{K} \left\{ w_{ijk} \left( x_{ijk} - \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} \right) \right\}^2$ (11), where W is a tensor of the same size as X, defined as $w_{ijk} = 1$ if $x_{ijk}$ is known and $w_{ijk} = 0$ if $x_{ijk}$ is missing (12). This approach differs from the one taken in (Andersson and Bro 1998).
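
A direct NumPy transcription of (11)-(12) for the 3-way case (a sketch; the cited papers minimize this objective with gradient-based methods such as nonlinear conjugate gradients):

import numpy as np

def f_W(A, B, C, X, W):
    # Weighted objective (11): squared error between X and the CP model
    # sum_r a_r o b_r o c_r, with the binary weight tensor W of (12)
    # zeroing out the missing entries.
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.sum((W * (X - X_hat)) ** 2)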

The authors say that an N-way tensor $X \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is rank-1 if it can be written as the outer product of N vectors, i.e. $X = a^{(1)} \circ \cdots \circ a^{(N)}$ (1), where $\circ$ denotes the vector outer product.

It can be seen that the proposed method can reconstruct the underlying low-n-rank tensor even for a small number of observed entries (20 percent or more), fewer than required by the nuclear norm minimization approach, despite the fact that the ranks were significantly overestimated.

As demonstrated in the numerical experiments in Section 3, the proposed algorithm can estimate the exact n-ranks of a tensor as long as the initial approximations of the n-ranks are overestimates of the exact ranks.

Relative error was calculated as $\| \hat{X} - X \|_{\Omega^C, F} \,/\, \| X \|_{\Omega^C, F}$ (18), where $\hat{X}$ denotes the output of the algorithm and $\| \cdot \|_{\Omega^C, F}$ denotes the norm calculated only on the set $\Omega^C$.
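
In code, with a boolean mask that is True at observed entries (so that ~mask is $\Omega^C$), (18) reads (a sketch):

def relative_error(X_hat, X, mask):
    # Relative error (18), evaluated only on the unobserved entries Omega^C.
    unobs = ~mask
    return np.linalg.norm(X_hat[unobs] - X[unobs]) / np.linalg.norm(X[unobs])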

As another contribution, the authors show that the proposed method performs better than nuclear norm minimization methods when the fraction of known tensor elements is low. 

Since here the authors concentrate mostly on synthetic experiments, where data are generated randomly, a natural question is how confident the reported results are, since the authors used at most 100 repetitions (i.e. different random realizations) for a given problem setting.

For this number of iterations, the algorithm took about 22 minutes (the algorithms from (Tomioka et al 2011) took about 42 minutes for 5000 iterations).