
AperTO - Archivio Istituzionale Open Access dell'Università di Torino

Original citation:
Matrix Factorization with Interval-Valued Data
Published version DOI: 10.1109/TKDE.2019.2942310

Availability:
This is the author's manuscript, available at http://hdl.handle.net/2318/1726448 since 2020-02-03.

Terms of use: Open Access
Anyone can freely access the full text of works made available as "Open Access". Works made available under a Creative Commons license can be used according to the terms and conditions of that license. Use of all other works requires the consent of the rights holder (author or publisher), unless exempted from copyright protection by the applicable law.

(Article begins on next page)

Matrix Factorization with Interval-Valued Data

Journal: Transactions on Knowledge and Data Engineering
Manuscript ID: TKDE-2018-12-1264.R1
Manuscript Type: Regular
Keywords: Matrix factorization; interval-valued data; data mining (H.2.8.d); mining methods and algorithms (H.2.8.i); singular value decomposition (G.1.3.h); uncertainty, "fuzzy," and probabilistic reasoning (I.2.3.l)

Matrix Factorization with Interval-Valued Data

Mao-Lin Li, Francesco Di Mauro, K. Selçuk Candan, and Maria Luisa Sapino

Mao-Lin Li and K. Selçuk Candan are with the School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ 85281, USA. E-mail: {mao-lin.li,candan}@asu.edu
Francesco Di Mauro and Maria Luisa Sapino are with the Department of Computer Science, University of Turin, Italy. E-mail: {dimauro,mlsapino}@di.unito.it
Abstract—With many applications relying on multi-dimensional datasets for decision making, matrix factorization (or decomposition) is becoming the basis for many knowledge discovery and machine learning tasks, from clustering, trend detection, and anomaly detection to correlation analysis. Unfortunately, a major shortcoming of matrix analysis operations is that, despite their effectiveness when the data is scalar, these operations become difficult to apply in the presence of non-scalar data, as they are not designed for data that include non-scalar observations, such as intervals. Yet, in many applications, the available data are inherently non-scalar for various reasons, including imprecision in data collection, conflicts in aggregated data, data summarization, or privacy issues, where one is provided with a reduced, clustered, or intentionally noisy and obfuscated version of the data to hide information. In this paper, we propose matrix decomposition techniques that consider the existence of interval-valued data. We show that naive ways to deal with such imperfect data may introduce errors in analysis, and we present factorization techniques that are especially effective when the amount of imprecise information is large.
Index Terms—Matrix factorization, interval-valued data.
1 INTRODUCTION
With many machine-learning applications requiring latent semantics underlying the data sets, matrix factorization has emerged as a successful tool for discovering latent patterns in data [1], [2]: matrices are used to encode relationships among pairs of entities, and data are analyzed for their latent semantics through matrix decomposition operations, such as singular value decomposition (SVD [3]) or principal component analysis (PCA [4]).
1.1 Challenge: Interval-Valued Data
In many applications, data need to be represented as ranges or intervals of possible values, as opposed to scalar data:
Summarized data. Analyzing reduced or summarized data sets can be more efficient, especially for implementing interactive applications [5], [6]. When several observations are grouped and collapsed into a single observation, data may need to be represented as value ranges. While it may sometimes be possible to associate statistical meanings to the intervals and (assuming that appropriate generative models, probability distributions, and conditioning strategies are found) it might be possible to leverage probabilistic matrix factorization techniques, such as [7], this approach may be infeasible or ineffective due to the lack of appropriate statistical representations and/or the cost.
Data with conflicts. When a data set reflects knowledge integrated from different data sources, it might not be possible to assign a single scalar weight to each observation, and an interval of possible values might be a more appropriate representation [6]. Moreover, when analyzing such integrated data, the resulting intervals may not have a statistical interpretation beyond presenting the spread (i.e., minimum and maximum values) of the data.
Anonymized data. Various privacy-preserving data publishing algorithms, such as recoding techniques [8], replace precise scalar values with less precise value ranges or intervals (such as those obtained through value generalization [8]). The resulting intervals (intentionally) do not represent any specific data distribution; consequently, associating a statistical interpretation to the interval is not necessarily appropriate. This means that probabilistic techniques for data analysis are not appropriate for anonymized data sets. In Section 6, we see that the proposed approach is highly effective for interval-valued data generated through generalization.
Imprecise data. Data imprecision may be caused by various reasons, including limitations in measurement. For example, minute variations in multiple facial images from a single individual may be represented using interval-valued data (see [9] and Section 6.1.2). As we further discuss in Section 6.1.3, ambiguities in users' ratings in a collaborative filtering application may also be captured using interval-valued data [10], [11].
The key challenge in performing decompositions over interval-valued matrices is that definitions of basic algebraic operations, such as multiplication and inversion (needed to implement factorization operations), are not as straightforward for intervals as they are for scalars (see Section 2.1). Also, unlike scalars, which are totally ordered, intervals are often only partially ordered. Furthermore, a naive approach (which would exhaustively enumerate all possible decompositions) would significantly increase the computational complexity of the problem (which is already high for the case with scalar weights). Therefore, many basic operations need to be redefined to accommodate such non-scalar data, and these need to be implemented in ways that avoid increases in computational costs.
1.2 Contributions of this Paper
In this paper, we study the problem of obtaining decompositions of interval-valued matrices:
- We present the decomposition problem for interval-valued data sets. We introduce interval-valued algebra and discuss the core challenges presented by the decomposition of interval-valued data.
- We propose an interval-valued latent semantics alignment scheme and, relying on this, we develop algorithms for obtaining eigenvalue-based (such as SVD [3]) or probabilistic (such as PMF [7]) decompositions for interval-valued data matrices.
- We finally study the effectiveness of the proposed schemes in several applications, including face image analysis and collaborative filtering.
1.3 Organization of the Paper
The paper is organized as follows: We introduce the background and review the related work in Section 2. Section 3 introduces the problem and presents the key observations and mathematical formulations regarding interval-valued latent spaces. Section 4 presents the proposed interval singular value decomposition (ISVD) algorithm to obtain SVD decompositions in the presence of interval-valued data. Section 5 shows that the proposed semantic alignment technique can also be used in probabilistic matrix factorization scenarios. Section 6 reports our experimental results under diverse scenarios. We conclude the paper in Section 7.
2 BACKGROUND AND RELATED WORK
2.1 Interval Algebra
We first formalize the definitions for interval-valued data and its algebraic operations:
Definition 1 (Interval representation). An interval $\mathbf{a}$ is a pair
$$\mathbf{a} = [\underline{a}, \overline{a}], \quad \underline{a} \leq \overline{a},$$
where $\underline{a}$ is the minimum value and $\overline{a}$ is the maximum value of the interval $\mathbf{a}$. If $\underline{a} = \overline{a}$, then $\mathbf{a}$ is scalar.

Definition 2 (Interval span). Given an interval $\mathbf{a}$, the corresponding span is computed as
$$\mathrm{span}(\mathbf{a}) = \mathrm{span}([\underline{a}, \overline{a}]) = (\overline{a} - \underline{a}) \in \mathbb{R}.$$
Definition 3 (Interval algebraic operations). Given two intervals, $[\underline{a}, \overline{a}]$ and $[\underline{b}, \overline{b}]$, we adopt the following interval algebraic operations on them [12]:
- addition: $[\underline{a}, \overline{a}] + [\underline{b}, \overline{b}] = [\underline{a} + \underline{b},\ \overline{a} + \overline{b}]$,
- subtraction: $[\underline{a}, \overline{a}] - [\underline{b}, \overline{b}] = [\underline{a} - \overline{b},\ \overline{a} - \underline{b}]$,
- multiplication: $[\underline{a}, \overline{a}] \times [\underline{b}, \overline{b}] = [\min(\underline{a}\underline{b},\ \underline{a}\overline{b},\ \overline{a}\underline{b},\ \overline{a}\overline{b}),\ \max(\underline{a}\underline{b},\ \underline{a}\overline{b},\ \overline{a}\underline{b},\ \overline{a}\overline{b})]$.
When one of the values, say $a$, is scalar, the multiplication $[a, a] \times [\underline{b}, \overline{b}]$ can be written as $[\min(a\underline{b},\ a\overline{b}),\ \max(a\underline{b},\ a\overline{b})]$, and the corresponding value of the span is $\mathrm{span}(a \times [\underline{b}, \overline{b}]) = |a| \times \mathrm{span}([\underline{b}, \overline{b}])$.
Note that, given the above definition of interval algebraic operations, more complex interval-valued operations, such as interval-valued matrix algebra, can be defined by replacing scalar addition, subtraction, and multiplication operations with their interval-valued counterparts.
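As an illustration, here is a minimal Python sketch of the interval operations in Definitions 1-3; the `Interval` class and the example values are ours, for illustration only, not from the paper.

```python
# Minimal sketch of Definitions 1-3: an interval as a (lo, hi) pair.
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float  # minimum value (underlined a in Definition 1)
    hi: float  # maximum value (overlined a in Definition 1)

    def __post_init__(self):
        assert self.lo <= self.hi, "an interval requires lo <= hi"

    @property
    def span(self) -> float:
        # Definition 2: span([lo, hi]) = hi - lo
        return self.hi - self.lo

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        # note the cross terms: [a, a'] - [b, b'] = [a - b', a' - b]
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        # Definition 3: min/max over all four endpoint products
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))

# e.g. [1, 2] * [-3, 4] = [-6, 8]; span([1, 2] + [0, 1]) = 2.0
print(Interval(1, 2) * Interval(-3, 4), (Interval(1, 2) + Interval(0, 1)).span)
```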
2.2 Matrix Factorization
Feature selection and dimensionality reduction techniques [13] usually involve some (often linear) transformation of the vector space containing the data to help focus on a few features (or combinations of features) that best discriminate the data in a given corpus. Among these transformations, the Karhunen-Loève Transform, KLT (also known as principal component analysis, PCA [14]), and singular value decomposition, SVD [3], have the key property that the vectors selected as the dimensions of the space are mutually orthogonal and, hence, linearly independent (i.e., there is no redundancy among the dimensions). The resulting basis vectors are referred to as the latent variables [15] or the latent semantics of the data [3]. While KLT and SVD may result in negative values, in non-negative matrix factorization (NMF) [16], [17], the factor matrices are non-negative, which enables a probabilistic interpretation of the results and the discovery of generative models. Below, we outline three common matrix factorization schemes: singular value decomposition (SVD [3]), non-negative matrix factorization (NMF [16], [17]), and probabilistic matrix factorization (PMF [7]).
2.2.1 Singular Value Decomposition (SVD)
Let $M \in \mathbb{R}^{n \times m}$ represent the input matrix, and let the rank, $r$, be a positive integer with $r \leq \min(n, m)$. In this paper, we denote the value at the $i$-th row and $j$-th column of $M$ as $M[i, j]$; the $j$-th column vector of $M$ is similarly denoted as $M[j]$. $M$ can be decomposed into $M = U \Sigma V^T$ through singular value decomposition (SVD), where
- $U \in \mathbb{R}^{n \times r}$, with orthonormal columns, i.e., $U^T U = I_r$;
- $\Sigma \in \mathrm{diag}(\mathbb{R}^r_+)$;
- $V^T = \mathrm{transpose}(V)$, $V \in \mathbb{R}^{m \times r}$, with orthonormal columns, i.e., $V^T V = I_r$.
The columns of $U$, also called the left singular vectors of matrix $M$, are the eigenvectors of the $n \times n$ matrix $M M^T$. The columns of $V$, or the right singular vectors of $M$, are the eigenvectors of the $m \times m$ matrix $M^T M$. Note that both the columns of $U$ and the columns of $V$ are mutually orthogonal.
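For readers who want to experiment, the following short NumPy sketch computes a rank-$r$ SVD as described above; the input matrix and the target rank are arbitrary illustrative choices.

```python
# Rank-r SVD with NumPy; matrix and rank are arbitrary examples.
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((6, 4))   # n x m input matrix
r = 2                             # target rank, r <= min(n, m)

U_full, s, Vt_full = np.linalg.svd(M, full_matrices=False)
U, Sigma, Vt = U_full[:, :r], np.diag(s[:r]), Vt_full[:r, :]

# Columns of U (resp. V) are eigenvectors of M M^T (resp. M^T M);
# U^T U = I_r and V^T V = I_r up to floating-point error.
print(np.allclose(U.T @ U, np.eye(r)))     # True
print(np.linalg.norm(M - U @ Sigma @ Vt))  # rank-r reconstruction residual
```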
2.2.2 Non-negative Matrix Factorization (NMF)
Given a non-negative matrix $M \in \mathbb{R}^{n \times m}_+$, NMF factorizes $M$ into two non-negative matrices $U \in \mathbb{R}^{n \times r}_+$ and $V \in \mathbb{R}^{m \times r}_+$ with target rank $r$, which minimize the $L_2$ loss function
$$\mathcal{L}_{NMF} = \| M - U V^T \|^2_F,$$
where $U \geq 0$, $V \geq 0$, and $\|\cdot\|^2_F$ denotes the (squared) Frobenius norm. Approximate solutions for $U$ and $V$ are commonly found by iterative update rules, such as [17]
$$U[i, j] \leftarrow U[i, j]\,\frac{(M V)[i, j]}{(U V^T V)[i, j]}, \qquad V^T[i, j] \leftarrow V^T[i, j]\,\frac{(U^T M)[i, j]}{(U^T U V^T)[i, j]}.$$
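The multiplicative updates above translate directly into a few lines of NumPy; the sketch below is an illustrative implementation (the initialization, iteration count, and the small epsilon guarding against division by zero are our choices, not from [17]).

```python
# Multiplicative-update NMF following the rules of [17] shown above.
import numpy as np

def nmf(M, r, iters=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = rng.random((n, r))
    V = rng.random((m, r))
    for _ in range(iters):
        # U[i,j] <- U[i,j] * (M V)[i,j] / (U V^T V)[i,j]
        U *= (M @ V) / (U @ V.T @ V + eps)
        # V update, written row-wise for V (equivalent to the V^T rule)
        V *= (M.T @ U) / (V @ U.T @ U + eps)
    return U, V

M = np.abs(np.random.default_rng(1).standard_normal((6, 4)))
U, V = nmf(M, r=2)
print(np.linalg.norm(M - U @ V.T))  # Frobenius reconstruction error
```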
[9] extended these to interval-valued matrices as follows:
$$\mathcal{L}_{INMF} = \| \underline{M} - U \underline{V}^T \|^2_F + \| \overline{M} - U \overline{V}^T \|^2_F$$
$$U[i, j] \leftarrow U[i, j]\,\frac{(\underline{M}\,\underline{V} + \overline{M}\,\overline{V})[i, j]}{\big(U (\underline{V}^T \underline{V} + \overline{V}^T \overline{V})\big)[i, j]}$$
$$\underline{V}^T[i, j] \leftarrow \underline{V}^T[i, j]\,\frac{(U^T \underline{M})[i, j]}{(U^T U \underline{V}^T)[i, j]}, \qquad \overline{V}^T[i, j] \leftarrow \overline{V}^T[i, j]\,\frac{(U^T \overline{M})[i, j]}{(U^T U \overline{V}^T)[i, j]}.$$
Note that this scheme, called I-NMF, factorizes the matrix into a scalar-valued $U$ and an interval-valued $\mathbf{V} = [\underline{V}, \overline{V}]$.
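The following is a hedged NumPy sketch of this I-NMF scheme as reconstructed above: a shared non-negative scalar-valued $U$ with an interval-valued $\mathbf{V} = [\underline{V}, \overline{V}]$. The $U$ update implements the multiplicative rule derived from the combined loss; it reflects our reading of [9] rather than code from that paper.

```python
# Sketch of I-NMF (our reconstruction, not verbatim from [9]):
# scalar-valued U shared across both interval endpoints of V.
import numpy as np

def inmf(M_lo, M_hi, r, iters=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    n, m = M_lo.shape
    U = rng.random((n, r))
    V_lo, V_hi = rng.random((m, r)), rng.random((m, r))
    for _ in range(iters):
        # multiplicative U update derived from the combined loss
        U *= (M_lo @ V_lo + M_hi @ V_hi) / (
            U @ (V_lo.T @ V_lo + V_hi.T @ V_hi) + eps)
        # endpoint-wise V updates, mirroring the scalar NMF rule
        V_lo *= (M_lo.T @ U) / (V_lo @ U.T @ U + eps)
        V_hi *= (M_hi.T @ U) / (V_hi @ U.T @ U + eps)
    return U, V_lo, V_hi

# toy usage: sorted random endpoints so that M_lo <= M_hi elementwise
M = np.sort(np.random.default_rng(1).random((2, 6, 4)), axis=0)
U, V_lo, V_hi = inmf(M[0], M[1], r=2)
```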
2.2.3 Probabilistic Matrix Factorization (PMF)
Probabilistic matrix factorization (PMF) [7] assumes that matrix entries are drawn from Gaussian distributions. In particular, given a matrix $M \in \mathbb{R}^{n \times m}$, the conditional distribution over the observed values is defined as
$$p(M \mid U, V, \sigma^2) = \prod_{i=1}^{n} \prod_{j=1}^{m} \big[\mathcal{N}(M[i, j] \mid U_{[i,:]} V_{[j,:]}^T,\ \sigma^2)\big]^{I_{ij}},$$
where $\mathcal{N}(\cdot \mid \mu, \sigma^2)$ is the probability density function of the Gaussian distribution with mean $\mu$ and variance $\sigma^2$, and $I_{ij}$ is the indicator function that is equal to 1 if $M[i, j]$ is not null and 0 otherwise. Above, $U_{[i,:]}$ and $V_{[j,:]}$ are row vectors¹ in $U$ and $V$, such that $M[i, j] \approx U_{[i,:]} V_{[j,:]}^T$, and zero-mean spherical Gaussian priors are placed on the latent semantics². The factors, $U$ and $V$, are computed via the loss function
$$\mathcal{L}_{PMF} = \| M - U V^T \|^2_F + \lambda_U \| U \|^2_F + \lambda_V \| V^T \|^2_F,$$
where $\lambda_U = \sigma^2/\sigma_U^2$, $\lambda_V = \sigma^2/\sigma_V^2$, and $\|\cdot\|^2_F$ denotes the (squared) Frobenius norm. A local minimum of the loss function $\mathcal{L}_{PMF}$ can be found via gradient descent in $U_{[i,:]}$ and $V_{[j,:]}^T$:
$$\frac{\partial \mathcal{L}_{PMF}}{\partial U_{[i,:]}} = \sum_{j=1}^{m} \big(U_{[i,:]} V_{[j,:]}^T - M[i, j]\big) V_{[j,:]} + \lambda_U U_{[i,:]}$$
$$\frac{\partial \mathcal{L}_{PMF}}{\partial V_{[j,:]}^T} = \sum_{i=1}^{n} \big(U_{[i,:]} V_{[j,:]}^T - M[i, j]\big) U_{[i,:]}^T + \lambda_V V_{[j,:]}^T.$$
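As a concrete illustration, the sketch below minimizes the regularized PMF loss by full-batch gradient descent over the observed entries; the learning rate, regularization weights, and initialization are illustrative assumptions (PMF implementations often use stochastic updates instead).

```python
# Full-batch gradient descent on the regularized PMF loss above.
import numpy as np

def pmf(M, mask, r, lam_u=0.1, lam_v=0.1, lr=0.01, iters=500, seed=0):
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = 0.1 * rng.standard_normal((n, r))
    V = 0.1 * rng.standard_normal((m, r))
    for _ in range(iters):
        # residuals restricted to observed entries (I_ij = mask[i, j])
        E = mask * (U @ V.T - M)
        U -= lr * (E @ V + lam_u * U)    # gradient with respect to U
        V -= lr * (E.T @ U + lam_v * V)  # gradient with respect to V
    return U, V

# toy usage: a rank-1 5x4 matrix with one entry treated as unobserved
M = np.outer(np.arange(1, 6.0), np.arange(1, 5.0))
mask = np.ones_like(M)
mask[0, 0] = 0
U, V = pmf(M, mask, r=2)
print(np.linalg.norm(mask * (M - U @ V.T)))  # small observed-entry error
```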
2.3 Analysis of Symbolic and Interval-Valued Data
In the real world, data rarely come in simple scalar form. Variables of interest may take complex, often symbolic, forms, including sets, histograms, vectors, intervals, or probability distributions [18], [19], [20], [21]. This is especially true when data is aggregated [22] or anonymized [8]. Consequently, several data analysis tools, including regression [23], [24], canonical analysis [25], and multi-dimensional scaling [26], have been developed for symbolic and interval-valued data. Given the popularity of PCA in data analysis, several interval-valued PCA algorithms have also been proposed [27], [28], [29], [30], most of which leverage the specific statistical and geometric meanings of the principal components of a system of variables. As discussed above, interval NMF and PMF [9] have also been studied, to resolve alignment approximation in face analysis and rating approximation in collaborative filtering. In contrast, we develop a more general interval-valued latent semantic alignment algorithm, which can be integrated into common matrix factorization approaches and directly leverages interval-valued properties.

1. In this paper, we use $X_{[i,:]}$ to denote the $i$-th row vector and $X_{[:,j]}$ to denote the $j$-th column vector of matrix $X$.
2. Note that here, and in the rest of the paper, we use $\approx$ to denote "approximately equal".

[Fig. 1: Scalar latent semantic spaces]
[Fig. 2: Interval-valued latent semantic spaces; (a) an interval-valued latent space, (b) a point in an interval-valued latent space]
3 INTERVAL-VALUED LATENT SPACES
In Section 2.2.1, we have seen that a scalar matrix can be decomposed into factor matrices ($U$ and $V$) and a core matrix ($\Sigma$). The columns in the factor matrices are referred to as latent semantics (LS) and, preferably, they are mutually orthogonal, so that they can serve as a basis of the transformed space. Figure 1 shows a scalar-valued latent semantic space superimposed on the original space; the figure also shows a scalar-valued data point projected onto both the original and latent spaces. Unfortunately, scalar-valued latent spaces are not sufficient to represent interval-valued data.

3.1 Interval-Valued Decomposition
Here, we first extend the definition of singular value decomposition to take into account the presence of interval-valued data.

Definition 4 (Interval-valued decomposition). Given an interval-valued matrix $\mathbf{M} \in \mathbb{R}^{n \times m}$ and a target rank $r \leq \min(n, m)$, interval-valued SVD would decompose $\mathbf{M}$ into $\mathbf{M} \approx \mathbf{U} \mathbf{\Sigma} \mathbf{V}^T$, such that
- $\mathbf{U} \in \mathbb{R}^{n \times r}$ and $\mathbf{V} \in \mathbb{R}^{m \times r}$ are (potentially interval-valued) matrices, such that the columns of $\mathbf{U}$ and $\mathbf{V}$ are quasi-orthonormal; i.e., given column indexes $h$ and $l$,

Citations

Tensor-Train Decomposition in the Presence of Interval-Valued Data (Journal Article)
TL;DR: Extends the known Tensor-Train technique for tensor decomposition to handle uncertain data, here modeled as intervals.

SIRTEM: Spatially Informed Rapid Testing for Epidemic Modeling and Response to COVID-19 (Journal Article)
TL;DR: Presents SIRTEM, a novel extended spatially informed epidemic model that integrates a multi-modal testing strategy accounting for test accuracies, and develops an optimization model that provides a cost-effective testing strategy when multiple test types are available.

Matrix Factorization with Interval-Valued Data (Conference Paper)
TL;DR: Proposes matrix decomposition techniques that consider the existence of interval-valued data, shows that naive ways to deal with such imperfect data may introduce errors in analysis, and presents factorization techniques that are especially effective when the amount of imprecise information is large.
References

Elements of Information Theory (Book)
TL;DR: Examines the role of entropy, inequality, and randomness in the design and construction of codes.

Indexing by Latent Semantic Analysis (Journal Article)
TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents ("semantic structure") in order to improve the detection of relevant documents on the basis of terms found in queries.

Learning the Parts of Objects by Non-negative Matrix Factorization (Journal Article)
TL;DR: Demonstrates an algorithm for non-negative matrix factorization that is able to learn parts of faces and semantic features of text, in contrast to other methods that learn holistic, not parts-based, representations.

Learning Parts of Objects by Non-negative Matrix Factorization (D. D. Lee)
TL;DR: Uses non-negative matrix factorization to learn parts of faces and semantic features of text, in contrast to principal components analysis and vector quantization, which learn holistic, not parts-based, representations.

Probabilistic Matrix Factorization (Conference Paper)
TL;DR: Presents the Probabilistic Matrix Factorization (PMF) model, which scales linearly with the number of observations, performs well on the large, sparse, and very imbalanced Netflix dataset, and can be extended to include an adaptive prior on the model parameters.
Frequently Asked Questions

Q1. What are the contributions in "Matrix Factorization with Interval-Valued Data"?
In many applications, the available data are inherently non-scalar for various reasons, including imprecision in data collection, conflicts in aggregated data, data summarization, or privacy issues, where one is provided with a reduced, clustered, or intentionally noisy and obfuscated version of the data to hide information. In this paper, the authors propose matrix decomposition techniques that consider the existence of interval-valued data. The authors show that naive ways to deal with such imperfect data may introduce errors in analysis and present factorization techniques that are especially effective when the amount of imprecise information is large.

Additional excerpts from the paper:

- Let matrices $V_*$ and $V^*$ capture the eigenvectors, and $\Sigma_*$ and $\Sigma^*$ be diagonal matrices encoding the square roots of the eigenvalues, of the corresponding matrices …

- Corollary 2: Corollary 1 further implies that $U_\dagger U_\dagger^T$ and $V_\dagger V_\dagger^T$ cannot be equal to the scalar-valued identity matrix $I$, which means that an exact decomposition of interval-valued matrices is not possible.

- The result is visualized in Figure 5(b): after the recomputation step, the $V_*$ and $V^*$ matrices become much more similar, indicating more precise factor matrices, an improvement which (as detailed in Section 6) contributes to more accurate decompositions.

- This approach results in $U$ and $V$ matrices that are scalar and orthonormal and a $\Sigma$ core matrix that is also scalar-valued (i.e., it is compatible only with the decomposition target-c, discussed in Section 3.4).

- Figure 6 provides an overview of the accuracy and execution time results for the default configuration: the highest accuracies are obtained using the ISVD#-b class of techniques (returning both scalar-valued factors and an interval-valued core), with the highest overall accuracy provided by ISVD4-b, which leverages both semantic alignment and latent space recomputation; the ISVD#-c class of techniques (returning scalar-valued factor and core matrices) approximates the accuracy of the ISVD0 technique but includes redundant work; linear-programming-based competitors [33], [35] have poor accuracy and massive execution times, since (as also acknowledged by their authors) those approaches are effective only when the interval ranges are very small, while the proposed approaches handle intervals of varying sizes effectively.

- Given this observation, the authors present an interval latent semantic alignment problem, which optimally combines minimum and maximum vectors to form an interval-valued latent space.