Proceedings Article • DOI: 10.1109/ICDM.2010.115 • 2010 IEEE International Conference on Data Mining, 13 Dec 2010, pp. 1037-1042
Interval-valued Matrix Factorization with Applications
Zhiyong Shen¹³, Liang Du²¹, Xukun Shen³, Yidong Shen²
¹ Hewlett Packard Labs China, zhiyongs@hp.com
² State Key Laboratory of Computer Science, China, {duliang,ydshen}@ios.ac.cn
³ State Key Laboratory of Virtual Reality Technology and Systems, China, xkshen@vrlab.buaa.edu.cn
Abstract—In this paper, we propose the Interval-valued Matrix Factorization (IMF) framework. Matrix Factorization (MF) is a fundamental building block of data mining. MF techniques, such as Nonnegative Matrix Factorization (NMF) and Probabilistic Matrix Factorization (PMF), are widely used in data mining applications. For example, NMF has shown its advantage in Face Analysis (FA), while PMF has been successfully applied to Collaborative Filtering (CF). In this paper, we analyze the data approximation in FA as well as CF applications and construct interval-valued matrices to capture these approximation phenomena. We adapt the basic NMF and PMF models to interval-valued matrices and propose Interval-valued NMF (I-NMF) as well as Interval-valued PMF (I-PMF). We conduct extensive experiments to show that the proposed I-NMF and I-PMF significantly outperform their single-valued counterparts in FA and CF applications.
Keywords: matrix factorization, uncertainty
I. INTRODUCTION
Exploring data approximation has attracted much attention in uncertain data mining [1] and privacy preserving data mining [2]. Data approximation might be caused by limitations of measurement, delayed data updates, or intentional data perturbation. When traditional data mining techniques are employed, taking data approximation into account may improve the quality of the results. Thus, various data mining techniques, such as clustering, classification, and association mining, have been adapted to handle data approximation. In this paper, we aim to inject data approximation into Matrix Factorization (MF) techniques. MF, also known as matrix decomposition, underlies many data mining techniques including clustering, dimensionality reduction, and missing data prediction. It decomposes an input data matrix into a number of low-rank factor matrices, which leads to a more compact linear approximation of the original data matrix. Variations of MF have been extensively studied in the literature.
In this paper, we pay special attention to Nonnegative Matrix Factorization (NMF) [3], [4] and Probabilistic Matrix Factorization (PMF) [5]. Each of these MF techniques is suited for a particular class of applications. For example, NMF has shown its advantage in Face Analysis (FA) [4]. In FA applications, each face is represented by a feature vector. NMF factorizes the matrix of multiple face feature vectors into factor matrices and thus achieves a more compact representation of the original face data. On the other hand, PMF has been successfully applied to Collaborative Filtering (CF) [6]. CF is one of the most successful techniques for automatic recommendation systems, which need only an observed rating matrix as input. PMF decomposes the sparse rating matrix into a user profile matrix and an item profile matrix, and then makes predictions for the unknown entries. However, traditional NMF and PMF ignore the following data approximation phenomena in FA and CF.
Alignment approximation in FA: The faces need to be rotated and aligned so that the same columns in the data matrix correspond to the same positions in faces. Such alignment is rarely perfect in practice, i.e., there is approximation in the alignment in FA applications (see Section II-A for details).
Rating approximation in CF: When a user rates an item in a real-life rating system, she/he usually selects a discretized rating value which is close to the ideal numerical preference value (the exact preference degree). Thus, the rating matrix does contain approximation to some degree (see Section II-B for details).
Interval bounds are better suited than single-valued variables to describe the above approximation phenomena. Many application areas have taken advantage of interval-valued data analysis (see for instance [7]), such as object tracking, market analysis, quantitative economics and so on. In traditional MF techniques, the input data matrices might contain real values, non-negative values or binary values, all of which are single-valued. In this paper, we introduce a new type of data matrix, the interval-valued matrix, to MF, which captures approximation in the observed data matrix. Then, we propose a novel MF framework, Interval-valued Matrix Factorization (IMF), to decompose such matrices. Under the IMF framework, we inject data approximation into NMF and PMF and extend them to interval-valued NMF (I-NMF for short) and interval-valued PMF (I-PMF for short). Therefore, our work is a marriage between interval-valued data analysis [7] and MF, and our contributions to both research areas are summarized as follows:
• We analyze the alignment approximation in FA as well as the rating approximation in CF, and formalize them with interval-valued matrices (Section II).
• We propose the IMF framework, under which we extend two representative basic MF techniques, NMF
and PMF, to I-NMF and I-PMF, which are capable of handling interval-valued matrices (Section IV).
• We conduct extensive experiments to show that the proposed I-NMF and I-PMF significantly outperform their traditional single-valued counterparts in FA and CF applications (Section V).
II. INTERVAL-VALUED MATRIX AND DATA APPROXIMATION
In this section we formalize the approximation in CF and FA problems with interval-valued matrices. First of all, we give formal definitions of the interval-valued matrix.
Let $\mathbf{X} \in \mathbb{R}^{n \times d}$ denote the input data matrix, with entries denoted as $X_{ij}$. Let $I(\mathbf{X})$ denote the interval-valued matrix corresponding to $\mathbf{X}$; we have the following two equivalent representations for $I(\mathbf{X})$.
Deļ¬nition 1 (Center-radius representation). We denote the
interval with center š‘‹
š‘–š‘—
and radius š›æ
š‘–š‘—
as
š¼(š‘‹
š‘–š‘—
)=āŸØš‘‹
š‘–š‘—
,š›æ
š‘–š‘—
āŸ© (1)
For entire matrices, we have š¼(š‘æ)=āŸØš‘æ, šœ¹āŸ©.
Deļ¬nition 2 (Min-max representation). We denote the in-
terval bounds as š‘‹
low
š‘–š‘—
= š‘‹
š‘–š‘—
āˆ’ š›æ
š‘–š‘—
and š‘‹
up
š‘–š‘—
= š‘‹
š‘–š‘—
+ š›æ
š‘–š‘—
.
š¼(š‘‹
š‘–š‘—
)=[š‘‹
low
š‘–š‘—
,š‘‹
up
š‘–š‘—
] (2)
For entire matrices, we have š¼(š‘æ)=[š‘æ
low
, š‘æ
up
].
In practice, we might only observe single-valued data matrices rather than interval-valued ones. In the following subsections we give empirical methods to construct $I(\mathbf{X})$ based on $\mathbf{X}$. The above definitions have already been adopted in interval-valued data analysis [8]. In our work, we use the center-radius representation (Definition 1) to formalize the rating approximation in CF and the alignment approximation in FA, and then construct interval-valued matrices. The min-max representation (Definition 2) will be used as input for the proposed IMF models introduced in Section IV.
A. Alignment Approximation in FA
In many FA techniques, we need to align the face images such that, ideally, pixels with the same coordinates correspond to identical positions of a face. In Figure 1, we take the position of the nose tip as an example to show that the alignment is not perfect. Although the same position of a face is not exactly aligned across images, the corresponding pixels should be near to each other. Taking the first row as an example, the pixel with coordinates (33,35) may correspond to the face position with coordinates (41,34) in the second image, or (33,40) in the third, and so on. Formally, the value of a pixel with coordinates $(x, y)$, $x \in \{1, \ldots, d_x\}$, $y \in \{1, \ldots, d_y\}$, might correspond to a pixel with coordinates $(x + \Delta x, y + \Delta y)$, $0 \le \Delta x, \Delta y \le r$.
[Figure 1. Illustration of alignment approximation: face images with the coordinates of the nose tip marked, e.g. (33,35), (41,34), (33,40), (25,32), (44,33), (34,32), (29,38), (32,36), (35,38), (37,37).]
[Figure 2. An example of the $\boldsymbol{\delta}$ matrix corresponding to the faces in Figure 1.]
In MF, the $i$'th face is represented by a vector $\mathbf{X}_{i\cdot}$ with dimensionality $d = d_x \times d_y$. We use $(x^{(i,j)}, y^{(i,j)})$ to denote the coordinates of the pixel in the $i$'th image which corresponds to the $j$'th element of the vector $\mathbf{X}_{i\cdot}$, namely $X_{ij}$. Then, we define the following set of entries in $\mathbf{X}$ for each $X_{ij}$:
$$\mathcal{S}^{\mathrm{FA}(r)}_{ij} = \{X_{ij'} \mid |x^{(i,j')} - x^{(i,j)}| \le r \,\wedge\, |y^{(i,j')} - y^{(i,j)}| \le r\} \quad (3)$$
The elements in $\mathcal{S}^{\mathrm{FA}(r)}_{ij}$ correspond to pixels around $(x^{(i,j)}, y^{(i,j)})$ within a range $r$. Intuitively, $X_{ij}$ may correspond to a value in the interval $[\min(\mathcal{S}^{\mathrm{FA}(r)}_{ij}), \max(\mathcal{S}^{\mathrm{FA}(r)}_{ij})]$, which coincides with the min-max definition (Definition 2). However, min-max statistics are not robust in practice; alternatively, we construct $I(X_{ij})$ based on the standard deviation, to capture the variation in $\mathcal{S}^{\mathrm{FA}(r)}_{ij}$. According to Definition 1, we set $X_{ij}$ as the center of $I(X_{ij})$ and calculate the radius via
$$\delta^{\mathrm{FA}(r)}_{ij} := \alpha \cdot \mathrm{std}(\mathcal{S}^{\mathrm{FA}(r)}_{ij}) \quad (4)$$
where $\alpha \in \mathbb{R}_{+}$ is a multiplicative scale coefficient. Based on Definition 2, it is then easy to calculate the bounds of the interval-valued input for I-NMF according to the min-max representation. Examples of the $\boldsymbol{\delta}^{\mathrm{FA}(r)}_{i\cdot}$ corresponding to the faces in Figure 1 are shown in Figure 2, where a lighter gray level represents a larger radius. In Figure 2, we can see that positions such as the eyes or nose have larger radii. These positions are more sensitive to alignment errors, which may hurt the performance of single-valued techniques. With a relatively large radius, the interval-valued techniques may be more tolerant to such alignment errors.
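To make the construction in (3)-(4) concrete, the following is a minimal NumPy sketch of how the per-pixel radius could be computed for one aligned face image; the function name, array layout and default values of r and alpha are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fa_radius(face, r=5, alpha=2.5):
    """Sketch of eqs. (3)-(4): interval radius delta^{FA(r)} for each pixel.

    face  : 2-D array (d_y x d_x) holding one aligned face image.
    r     : neighborhood range; S^{FA(r)}_ij collects pixels within +/- r in x and y.
    alpha : multiplicative scale coefficient applied to the local standard deviation.
    """
    d_y, d_x = face.shape
    delta = np.zeros_like(face, dtype=float)
    for y in range(d_y):
        for x in range(d_x):
            # S^{FA(r)}_ij: pixels whose coordinates differ from (x, y) by at most r
            neighborhood = face[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
            delta[y, x] = alpha * neighborhood.std()
    return delta

# Min-max bounds (Definition 2) for the flattened face vector X_i.:
# x = face.reshape(-1); d = fa_radius(face).reshape(-1)
# x_low_row, x_up_row = x - d, x + d
```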
Table I
EXAMPLES OF SINGLE-VALUED AND INTERVAL-VALUED RATING MATRICES FOR CF
(a) A single-valued rating matrix $\mathbf{X}$ over users $u_1, \ldots, u_6$ and items $m_1, \ldots, m_5$ (blank entries are unobserved); the observed ratings per user are:
$u_1$: 1, 4, 5;  $u_2$: 3, 1, 2;  $u_3$: 1, 4;  $u_4$: 5;  $u_5$: 1, 4, 2;  $u_6$: 3, 2, 5
(b) The corresponding interval-valued rating matrix $I(\mathbf{X})$:
$u_1$: [0.6,1.4], [3.5,4.5], [4.8,5.2];  $u_2$: [2.8,3.2], [0.5,1.5], [1.5,2.5];  $u_3$: [0.7,1.3], [3.5,4.5];  $u_4$: [4.5,5.5];  $u_5$: [0.4,1.6], [3.7,4.3], [1.8,2.2];  $u_6$: [2.7,3.3], [1.4,2.6], [4.2,5.8]
B. Rating Approximation in CF
In CF, the rating degree is actually an approximation of the actual preference degree of a user $u$ for an item. For example, suppose a web site allows users to rate items from one star to five stars. User $u$ may think the two items $a$ and $b$ are worth more than two stars but not worth four stars, and he may prefer $a$ to $b$. Suppose the continuous preference degrees of user $u$ on $a$ and $b$ are 3.4 and 2.8, respectively. Due to the constraint of the rating system, $u$ can only rate both $a$ and $b$ as three stars, and the difference between $a$ and $b$ disappears. This also indicates that the rating degree actually represents a continuous interval, which may include the ideal preference degree. Intuitively, the rating degree $X_{ij}$ is affected by both the $i$'th user and the $j$'th item. Therefore, we define the observations relevant to $X_{ij}$ with the following set:
$$\mathcal{S}^{\mathrm{CF}}_{ij} = \{X_{i'j'} \mid (i' = i \vee j' = j) \wedge (i', j') \in (\mathbf{i}, \mathbf{j})\} \quad (5)$$
where $(\mathbf{i}, \mathbf{j})$ denotes the set of observed entries.
š’®
CF
š‘–š‘—
is actually constructed by the observed rating degrees
in the š‘–-th row and š‘—-th column of the rating matrix in CF.
Again, we calculate the radius š›æ
CF
š‘–š‘—
for each observed r ating
degree š‘‹
š‘–š‘—
according to Deļ¬nition 1 based on the standard
deviation of the ratings in š’®
CF
š‘–š‘—
:
š›æ
CF
š‘–š‘—
:= š›¼ ā‹… std(š’®
CF
š‘–š‘—
) (6)
where š›¼ āˆˆ ā„
+
is again a multiplicative scale coefļ¬cient.
Intuitively, a userā€™s ratings on different items (or the ratings
of a item from different users) vary greatly, we should assign
a big value of interval radius to this entry. Then, itā€™s easy
to calculate the bounds of interval-valued input for I-PMF
according to min-max representation (Deļ¬nition 2). A exam-
ple of interval-valued rating matrix with its corresponding
single-valued matrix in min-max representation are shown
in Table II-A
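As a companion to (5)-(6), here is a small NumPy sketch of how the interval bounds of an observed rating matrix could be derived; the function name, the boolean `observed` mask and the default alpha are assumptions made for illustration.

```python
import numpy as np

def cf_bounds(X, observed, alpha=2.5):
    """Sketch of eqs. (5)-(6): min-max bounds for each observed rating.

    X        : (n x d) rating matrix (values at unobserved entries are ignored).
    observed : boolean (n x d) mask marking the observed entries.
    alpha    : multiplicative scale coefficient.
    """
    n, d = X.shape
    X_low = X.astype(float).copy()
    X_up = X.astype(float).copy()
    for i in range(n):
        for j in range(d):
            if not observed[i, j]:
                continue
            # S^CF_ij: observed ratings in the i-th row or the j-th column
            row_vals = X[i, observed[i, :]]
            col_mask = observed[:, j].copy()
            col_mask[i] = False              # avoid counting X[i, j] twice
            s = np.concatenate([row_vals, X[col_mask, j]])
            delta = alpha * s.std()          # eq. (6)
            X_low[i, j] = X[i, j] - delta    # Definition 2
            X_up[i, j] = X[i, j] + delta
    return X_low, X_up
```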
III. MATRIX FACTORIZATION WITH APPLICATIONS
In this section we briefly discuss MF techniques and their applications. We devote special attention to the NMF and PMF techniques, since they serve as the single-valued counterparts of the proposed IMF models.
MF provides a linear approximate representation of the original data matrix $\mathbf{X} \in \mathbb{R}^{n \times d}$. Generally, we have
$$\mathbf{X} \rightarrow \mathbf{U}\mathbf{V} \quad (7)$$
where $\mathbf{U} \in \mathbb{R}^{n \times k}$ and $\mathbf{V} \in \mathbb{R}^{k \times d}$. Each data instance $\mathbf{X}_{i\cdot}$ is approximated by a linear combination of the rows of $\mathbf{V}$ with weight vector $\mathbf{U}_{i\cdot}$, the $i$'th row of $\mathbf{U}$. Thus, we call $\mathbf{U}$ the weight matrix and $\mathbf{V}$ the basis matrix. The ranks of $\mathbf{U}$ and $\mathbf{V}$ are always much lower than the rank of $\mathbf{X}$, i.e., $k \ll \min(n, d)$. After learning $\mathbf{U}$ and $\mathbf{V}$, we can reconstruct $\mathbf{X}$ as follows:
$$\hat{\mathbf{X}} \leftarrow \mathbf{U}\mathbf{V} \quad (8)$$
Various assumptions over $\mathbf{U}$ and $\mathbf{V}$ lead to different MF models, which have been widely used in data mining applications. The following two series of applications are relevant to this paper:
Parts-based representation: MF naturally represents the original data matrix $\mathbf{X}$ by parts. The rows of $\mathbf{V}$, the so-called basis vectors, are optimized for the linear approximation of $\mathbf{X}$, and $\mathbf{U}_{i\cdot}$ can be regarded as a lower-dimensional representation of $\mathbf{X}_{i\cdot}$. NMF has been successfully applied to find additive parts-based representations for face images (see Section III-A for details).
Missing data prediction: The reconstructed matrix $\hat{\mathbf{X}}$ is a full matrix. Therefore, when $\mathbf{X}$ is sparse, we can make predictions for its missing entries based on $\hat{\mathbf{X}}$. For example, PMF has been successfully applied to predict the missing entries of the rating matrices in CF (see Section III-B for details).
A. Nonnegative Matrix Factorization
NMF aims to factorize a nonnegative matrix $\mathbf{X} \in \mathbb{R}^{n \times d}_{+}$ into two nonnegative matrices $\mathbf{U} \in \mathbb{R}^{n \times k}_{+}$ and $\mathbf{V} \in \mathbb{R}^{k \times d}_{+}$ which minimize the following $L_2$ loss function:
$$\mathcal{L}_{\mathrm{NMF}} = \|\mathbf{X} - \mathbf{U}\mathbf{V}\|^2_F \quad \text{s.t. } \mathbf{U} \ge 0, \; \mathbf{V} \ge 0 \quad (9)$$
where $\|\cdot\|_F$ denotes the Frobenius norm. The estimates of $\mathbf{U}$ and $\mathbf{V}$ can be found via the multiplicative update rules proposed in [3], which iteratively update $\mathbf{U}$ and $\mathbf{V}$ as follows:
š‘ˆ
š‘–š‘—
ā† š‘ˆ
š‘–š‘—
(š‘暝‘½
š‘‡
)
š‘–š‘—
(š‘¼š‘½ š‘½
š‘‡
)
š‘–š‘—
š‘‰
š‘–š‘—
ā† š‘‰
š‘–š‘—
(š‘¼
š‘‡
š‘æ)
š‘–š‘—
(š‘¼
š‘‡
š‘¼š‘½ )
š‘–š‘—
(10)
The update rules in (10) can be derived from the Karush-Kuhn-Tucker optimality conditions [9] for the inequality constraints (see [10] for details). In [3], it is proved that the updates in (10) lead to a local minimum of (9). The non-negativity constraints on $\mathbf{U}$ and $\mathbf{V}$ allow only additive linear combinations of the basis vectors in $\mathbf{V}$, the so-called parts-based representation [4]. NMF is suited to many real-world applications such as human face analysis [4]. In human face analysis, the resulting matrix $\mathbf{U}$ constitutes an optimized representation of the original data instances. Many FA algorithms, such as face recognition and face clustering, may be directly applied on $\mathbf{U}$ instead of the original data matrix $\mathbf{X}$.
B. Probabilistic Matrix Factorization
In CF, the PMF model [5] assumes that the ratings are drawn from a Gaussian distribution:
$$p(X_{ij} \mid i, j, \mathbf{U}, \mathbf{V}, \sigma^2) = \mathcal{G}(X_{ij} \mid \mathbf{U}_{i\cdot}\mathbf{V}_{\cdot j}, \sigma^2) \quad (11)$$
For š‘¼ and š‘½ , they place zero-mean spherical Gaussian
priors
š‘(š‘¼ āˆ£šœŽ
2
1
)=
āˆ
š‘–
G(š‘¼
š‘–ā‹…
āˆ£0,šœŽ
2
1
š‘°),š‘(š‘½ āˆ£šœŽ
2
1
)=
āˆ
š‘—
G(š‘½
ā‹…š‘—
āˆ£0,šœŽ
2
1
š‘°)
(12)
The š‘¼ and š‘½ are computed via over the observed ratings
ā„’
PMF
= āˆ„š‘æ āˆ’ š‘¼š‘½ āˆ„
2
F
+ šœ†
[
āˆ„š‘¼ āˆ„
2
F
+ āˆ„š‘½ āˆ„
2
F
]
(13)
where šœ† = šœŽ
2
/šœŽ
2
1
. A local minimum of (13) can be found
via gradient decent in š‘¼
š‘–ā‹…
and š‘½
ā‹…š‘—
āˆ‚ā„’
PMF
āˆ‚š‘¼
š‘–ā‹…
=
āˆ‘
š‘—āˆˆj
š‘–
(š‘¼
š‘–ā‹…
š‘½
ā‹…š‘—
āˆ’ š‘‹
š‘–š‘—
)š‘½
š‘‡
ā‹…š‘—
+ šœ†š‘¼
š‘–ā‹…
āˆ‚ā„’
PMF
āˆ‚š‘½
ā‹…š‘—
=
āˆ‘
š‘–āˆˆi
š‘—
(š‘¼
š‘–ā‹…
š‘½
š‘‡
ā‹…š‘—
āˆ’ š‘‹
š‘–š‘—
)š‘¼
š‘‡
š‘–ā‹…
+ šœ†š‘½
ā‹…š‘—
(14)
Based on the learned $\mathbf{U}$ and $\mathbf{V}$, we can estimate the unknown ratings in $\mathbf{X}$ via
$$\hat{X}_{ij} = \mathbf{U}_{i\cdot}\mathbf{V}_{\cdot j} \quad (15)$$
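The following NumPy sketch illustrates how a local minimum of (13) might be reached; it sweeps over the observed entries and applies the per-entry terms of the gradients (14) in a stochastic fashion, so the learning rate, regularization weight and epoch count shown here are illustrative assumptions.

```python
import numpy as np

def pmf(X, observed, k, lam=0.05, lr=0.005, n_epochs=100, seed=0):
    """Sketch of PMF: stochastic gradient steps on the loss (13), gradients (14)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((k, d))
    rows, cols = np.nonzero(observed)
    for _ in range(n_epochs):
        for i, j in zip(rows, cols):
            err = U[i] @ V[:, j] - X[i, j]      # prediction error for entry (i, j)
            grad_u = err * V[:, j] + lam * U[i]
            grad_v = err * U[i] + lam * V[:, j]
            U[i] -= lr * grad_u
            V[:, j] -= lr * grad_v
    return U, V

# Unknown ratings are estimated via eq. (15): X_hat[i, j] = U[i] @ V[:, j]
```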
IV. INTERVAL-VALUED MATRIX FACTORIZATION
In this section, we introduce the IMF framework. The proposed framework is based on the min-max representation of the interval-valued matrix: $I(\mathbf{X}) = [\mathbf{X}^{\mathrm{low}}, \mathbf{X}^{\mathrm{up}}]$. We extend the original MF over $\mathbf{X}$ to a joint MF over $\mathbf{X}^{\mathrm{low}}$ and $\mathbf{X}^{\mathrm{up}}$. First, we assume each $X_{ij}$ is drawn from a uniform distribution with parameters $X^{\mathrm{low}}_{ij}$ and $X^{\mathrm{up}}_{ij}$:
$$X_{ij} \sim \mathrm{uniform}(X^{\mathrm{low}}_{ij}, X^{\mathrm{up}}_{ij}) \quad (16)$$
Based on this assumption, we have
$$\mathbb{E}(X_{ij}) = \tfrac{1}{2}(X^{\mathrm{low}}_{ij} + X^{\mathrm{up}}_{ij}) \quad (17)$$
Therefore, we propose to first estimate the bounds of $I(\mathbf{X})$ via the following joint MF:
$$\mathbf{X}^{\mathrm{low}} \rightarrow \mathbf{U}\mathbf{V}^{\mathrm{low}}, \qquad \mathbf{X}^{\mathrm{up}} \rightarrow \mathbf{U}\mathbf{V}^{\mathrm{up}} \quad (18)$$
We ļ¬x the weight matrix š‘¼ to make a unique proļ¬le for
each data instance and use š‘½
low
, š‘½
up
to maintain the data
approximation. The reconstructions of š‘æ
low
and š‘æ
up
could
be calculated as follows
Ė†
š‘æ
low
ā† š‘¼š‘½
low
,
Ė†
š‘æ
up
ā† š‘¼š‘½
up
(19)
According to (17) and (19), we can reconstruct š‘æ via
Ė†
š‘æ ā†
1
2
(
Ė†
š‘æ
low
+
Ė†
š‘æ
up
) (20)
A. Interval-valued NMF
According to (9) and (18), the $L_2$ loss function of interval-valued NMF (I-NMF for short) is
$$\mathcal{L}_{\mathrm{I\text{-}NMF}} = \|\mathbf{X}^{\mathrm{low}} - \mathbf{U}\mathbf{V}^{\mathrm{low}}\|^2_F + \|\mathbf{X}^{\mathrm{up}} - \mathbf{U}\mathbf{V}^{\mathrm{up}}\|^2_F \quad \text{s.t. } \mathbf{U} \ge 0, \; \mathbf{V}^{\mathrm{low}} \ge 0, \; \mathbf{V}^{\mathrm{up}} \ge 0 \quad (21)$$
Similar to traditional NMF, we have the following multiplicative update rules for $\mathbf{U}$, $\mathbf{V}^{\mathrm{low}}$ and $\mathbf{V}^{\mathrm{up}}$:
š‘ˆ
š‘”+1
š‘–š‘—
ā† š‘ˆ
š‘”
š‘–š‘—
[š‘æ
low
(š‘½
low
)
š‘‡
+ š‘æ
up
(š‘½
up
)
š‘‡
]
š‘–š‘—
[š‘¼š‘½
low
(š‘½
low
)
š‘‡
+ š‘¼š‘½
up
(š‘½
up
)
š‘‡
]
š‘–š‘—
š‘‰
low,š‘”+1
š‘–š‘—
ā† š‘‰
low,š‘”
š‘–š‘—
(š‘¼
š‘‡
š‘æ
low
)
š‘–š‘—
(š‘¼
š‘‡
š‘¼š‘½
low
)
š‘–š‘—
š‘‰
up,š‘”+1
š‘–š‘—
ā† š‘‰
up,š‘”
š‘–š‘—
(š‘¼
š‘‡
š‘æ
up
)
š‘–š‘—
(š‘¼
š‘‡
š‘¼š‘½
up
)
š‘–š‘—
(22)
Similar to traditional NMF, the $L_2$ loss function $\mathcal{L}_{\mathrm{I\text{-}NMF}}$ in (21) is nonincreasing under the multiplicative update rules in (22).
Traditional NMF decomposes the original data matrix into two low-rank factor matrices: one profiles the data instances while the other profiles the features. In I-NMF, the proposed joint matrix factorization framework lets the feature profile factor matrices $\mathbf{V}^{\mathrm{low}}$ and $\mathbf{V}^{\mathrm{up}}$ capture the data approximation while preserving a unique profile $\mathbf{U}_{i\cdot}$ for each data instance. We can directly apply face analysis techniques over $\mathbf{U}$.
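A minimal NumPy sketch of the I-NMF updates (22) on the joint loss (21) is shown below; as with the NMF sketch above, the random initialization, iteration count and epsilon are assumptions made for illustration.

```python
import numpy as np

def i_nmf(X_low, X_up, k, n_iter=200, eps=1e-9, seed=0):
    """Sketch of I-NMF: multiplicative updates (22) with a shared weight matrix U."""
    rng = np.random.default_rng(seed)
    n, d = X_low.shape
    U = rng.random((n, k))
    V_low = rng.random((k, d))
    V_up = rng.random((k, d))
    for _ in range(n_iter):
        num = X_low @ V_low.T + X_up @ V_up.T
        den = U @ V_low @ V_low.T + U @ V_up @ V_up.T + eps
        U *= num / den                                      # update U
        V_low *= (U.T @ X_low) / (U.T @ U @ V_low + eps)    # update V^low
        V_up *= (U.T @ X_up) / (U.T @ U @ V_up + eps)       # update V^up
    return U, V_low, V_up

# Reconstruction via (19)-(20): X_hat = 0.5 * (U @ V_low + U @ V_up)
```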
B. Interval-valued PMF
In this section we introduce interval-valued PMF (I-PMF for short). Analogously to (13), and according to (18), we have the following regularized $L_2$ loss:
$$\mathcal{L}_{\mathrm{I\text{-}PMF}} = \|\mathbf{X}^{\mathrm{low}} - \mathbf{U}\mathbf{V}^{\mathrm{low}}\|^2_F + \|\mathbf{X}^{\mathrm{up}} - \mathbf{U}\mathbf{V}^{\mathrm{up}}\|^2_F + \lambda \left( \|\mathbf{U}\|^2_F + \|\mathbf{V}^{\mathrm{low}}\|^2_F + \|\mathbf{V}^{\mathrm{up}}\|^2_F \right) \quad (23)$$
[Figure 3. Performance comparison in face analysis: F1 measure for face recognition, RE for face reconstruction, and ACC and NMI for face clustering, plotted against the number of factors $k$ (20 to 40), for Raw, NMF and I-NMF on ORL32 and ORL64.]
It is easy to derive a gradient descent in $\mathbf{U}_{i\cdot}$, $\mathbf{V}^{\mathrm{low}}_{\cdot j}$ and $\mathbf{V}^{\mathrm{up}}_{\cdot j}$ to find a local minimum of (23):
$$\frac{\partial \mathcal{L}_{\mathrm{I\text{-}PMF}}}{\partial \mathbf{U}_{i\cdot}} = \sum_{j \in \mathbf{j}_i} \left[ (\mathbf{U}_{i\cdot}\mathbf{V}^{\mathrm{low}}_{\cdot j} - X^{\mathrm{low}}_{ij})\,(\mathbf{V}^{\mathrm{low}}_{\cdot j})^T + (\mathbf{U}_{i\cdot}\mathbf{V}^{\mathrm{up}}_{\cdot j} - X^{\mathrm{up}}_{ij})\,(\mathbf{V}^{\mathrm{up}}_{\cdot j})^T \right] + \lambda \mathbf{U}_{i\cdot}$$
$$\frac{\partial \mathcal{L}_{\mathrm{I\text{-}PMF}}}{\partial \mathbf{V}^{\mathrm{low}}_{\cdot j}} = \sum_{i \in \mathbf{i}_j} (\mathbf{U}_{i\cdot}\mathbf{V}^{\mathrm{low}}_{\cdot j} - X^{\mathrm{low}}_{ij})\,\mathbf{U}^T_{i\cdot} + \lambda \mathbf{V}^{\mathrm{low}}_{\cdot j}$$
$$\frac{\partial \mathcal{L}_{\mathrm{I\text{-}PMF}}}{\partial \mathbf{V}^{\mathrm{up}}_{\cdot j}} = \sum_{i \in \mathbf{i}_j} (\mathbf{U}_{i\cdot}\mathbf{V}^{\mathrm{up}}_{\cdot j} - X^{\mathrm{up}}_{ij})\,\mathbf{U}^T_{i\cdot} + \lambda \mathbf{V}^{\mathrm{up}}_{\cdot j} \quad (24)$$
For the CF application, we can use the learned $\mathbf{U}$, $\mathbf{V}^{\mathrm{low}}$ and $\mathbf{V}^{\mathrm{up}}$ to compute the unknown ratings via (19) and (20).
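As with PMF above, a local minimum of (23) can be sketched with stochastic steps over the observed bounds, following the per-entry terms of the gradients (24); the hyper-parameters below are illustrative assumptions.

```python
import numpy as np

def i_pmf(X_low, X_up, observed, k, lam=0.05, lr=0.005, n_epochs=100, seed=0):
    """Sketch of I-PMF: stochastic gradient steps on the loss (23), gradients (24)."""
    rng = np.random.default_rng(seed)
    n, d = X_low.shape
    U = 0.1 * rng.standard_normal((n, k))
    V_low = 0.1 * rng.standard_normal((k, d))
    V_up = 0.1 * rng.standard_normal((k, d))
    rows, cols = np.nonzero(observed)
    for _ in range(n_epochs):
        for i, j in zip(rows, cols):
            e_low = U[i] @ V_low[:, j] - X_low[i, j]   # error on the lower bound
            e_up = U[i] @ V_up[:, j] - X_up[i, j]      # error on the upper bound
            U[i] -= lr * (e_low * V_low[:, j] + e_up * V_up[:, j] + lam * U[i])
            V_low[:, j] -= lr * (e_low * U[i] + lam * V_low[:, j])
            V_up[:, j] -= lr * (e_up * U[i] + lam * V_up[:, j])
    return U, V_low, V_up

# Unknown ratings via (19)-(20): X_hat = 0.5 * (U @ V_low + U @ V_up)
```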
V. EXPERIMENTAL RESULTS
We divide the experiments into two parts: In Section V-A
we conduct the comparison of I-NMF against the basic NMF
for FA applications, and in Section V-B we compare the
performance of I-PMF and PMF over CF applications.
A. Comparison of I-NMF against NMF
We compare the performance of NMF and I-NMF on
various FA applications including face recognition, face
reconstruction and face clustering.
1) Data Description and Evaluation Setting: We use the Olivetti Research Laboratory (ORL) face data sets to evaluate the NMF and I-NMF models, which contain ten different images of each of 40 distinct persons ($n = 10 \times 40 = 400$ in total). Two versions of the processed data sets (http://www.cs.uiuc.edu/homes/dengcai2/Data/FaceData.html), one with resolution 32×32 (ORL32) and the other with 64×64 (ORL64), are used for our experimental evaluation. In ORL32, each face image is represented by a vector with dimensionality $d = 32 \times 32 = 1024$, while in ORL64, $d = 64 \times 64 = 4096$.
We implement I-NMF based on the multiplicative update rules introduced in Section IV-A. The experiments for NMF are based on the DTU NMF toolbox (http://isp.imm.dtu.dk/toolbox/nmf/nmf_toolbox_ver1.4.zip). Various classifiers have been adopted for face recognition; in this paper, we apply the nearest neighbor method for its simplicity. For face clustering, we choose the popular K-means algorithm. All the classification and clustering algorithms are applied on the output weight matrices $\mathbf{U}$ from NMF and I-NMF, and we also report the performance of these algorithms over the raw data matrix $\mathbf{X}$ as a baseline. In the construction of interval-valued matrices (4), we set $r = 5$ and $\alpha = 2.5$.
We evaluate the proposed models in terms of face recognition and clustering effectiveness. Note that face recognition is actually a classification problem; to evaluate classification effectiveness, we use the standard F1 measure. We adopt two popular metrics, Normalized Mutual Information (NMI) [11] and Clustering Accuracy (ACC), for cluster evaluation. Based on NMF, the faces are reconstructed as weighted summations of the basis vectors. We use the following Reconstruction Error (RE):
$$\mathrm{RE}(\hat{\mathbf{X}}, \mathbf{X}) = \sqrt{\frac{\sum_{i=1}^{n}\sum_{j=1}^{d}(\hat{X}_{ij} - X_{ij})^2}{n \times d}}$$
to evaluate the goodness of the reconstructed matrix $\hat{\mathbf{X}}$ with respect to the original data matrix $\mathbf{X}$.
Note that larger values of F1, NMI and ACC indicate better face recognition or clustering results, while smaller values of RE indicate better face reconstruction performance.
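For clarity, the RE defined above is simply the root mean squared difference over all $n \times d$ entries; a one-line NumPy sketch (the function name is ours):

```python
import numpy as np

def reconstruction_error(X_hat, X):
    """RE as defined above: root mean squared difference over all n x d entries."""
    return np.sqrt(np.mean((X_hat - X) ** 2))
```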
2) Evaluation Results: We compare the models with varying rank of the factor matrices ($k$) and varying interval sizes.
Evaluation with varying $k$: The face clustering and face reconstruction tasks are evaluated over the entire data sets. For the face recognition task, we make ten rounds of random sampling of 50% of the data for training. In general, the performance of NMF and I-NMF on all the face analysis tasks varies with the number of latent factors ($k$). For each value of $k$, we run 100 rounds of NMF and I-NMF. The average values of the performance metrics are plotted for each model in Figure 3, where each sub-figure corresponds to a face analysis task with a specific evaluation metric and each line corresponds to a model on a specific data set. From Figure 3, we see that I-NMF outperforms NMF with statistical significance over all evaluation metrics on both data sets.
B. Comparison of I-PMF against PMF
1) Data Description and Evaluation Setting: In this part of the experiments, we also use two data sets for evaluation. The Movielens data set (http://www.grouplens.org/system/files/ml-data_0.zip) is downloaded from the web site of the GroupLens research group; we use the subset which contains 100,000 ratings for $d = 1682$ movies by $n = 943$ users of the online movie recommender service, and name this data set Movielens-100K. The Netflix data set (http://archive.ics.uci.edu/ml/datasets/Netflix+Prize) is the official data set used in the Netflix Prize competition. Again,

Citations

Journal Article (DOI)
This paper proposes matrix decomposition techniques that consider the existence of interval-valued data. It shows that naive ways to deal with such imperfect data may introduce errors in analysis, and presents factorization techniques that are especially effective when the amount of imprecise information is large. (1 citation)
It cites background or methods from "Interval-valued Matrix Factorization with Applications", for example:
• "As discussed above, interval NMF and PMF [9] also have been studied to resolve alignment approximation in face analysis and rating approximation in collaborative filtering."
• "As the chart shows, the prediction accuracy of all algorithms improves as we consider higher decomposition ranks and the proposed latent semantic alignment based approach, AIPMF, leads to better prediction performance than both PMF and I-PMF, for decomposition ranks > 60."
• "[9] extended these to interval-valued matrices as follows: ..."
• "As described in Section 6.1.2, we also compare proposed ISVD approaches with NMF and I-NMF [9] for the face analysis tasks: data reconstruction and classification."
• "For collaborative filtering with social media data, discussed in Section 6.1.3, we used PMF and I-PMF [9] as competitors."

Proceedings Article (DOI), 01 May 2019
This paper proposes a probabilistic model for analyzing the generalized interval-valued matrix, a matrix that has scalar-valued elements and bounded/unbounded interval-valued elements. A majorization-minimization algorithm is derived for parameter estimation, and it is proved that the objective function is monotonically decreasing under the parameter updates. An experiment shows that the proposed model handles interval-valued elements well and offers improved performance.

References
T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley, 1991.
D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, pp. 788-791, 21 Oct. 1999.
D. D. Lee and H. S. Seung, "Algorithms for non-negative matrix factorization," in Advances in Neural Information Processing Systems (NIPS), 2000.
