(Open Access) 3WaySym-Scal: three-way symbolic multidimensional scaling (2007)

3WaySym-Scal: Three-Way Symbolic Multidimensional Scaling

P.J.F. Groenen and S. Winsberg

P.J.F. Groenen: Econometric Institute, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands, email: groenen@few.eur.nl

S. Winsberg: Predisoft, San Pedro, Costa Rica, email: SuzanneWinsberg@predisoft.com

Econometric Institute Report EI 2006-49

Abstract. Multidimensional scaling aims at reconstructing dissimilarities between pairs of objects by distances in a low-dimensional space. However, in some cases the dissimilarity itself is not known, but the range or a histogram of the dissimilarities is given. This type of data falls in the wider class of symbolic data (see Bock and Diday (2000)). We model three-way two-mode data consisting of an interval of dissimilarities for each object pair from each of K sources by a set of intervals of the distances, defined as the minimum and maximum distance between two sets of embedded rectangles representing the objects. In this paper, we provide a new algorithm called 3WaySym-Scal using iterative majorization. It is based on I-Scal, an algorithm developed for the two-way case where the dissimilarities are given by a range of values, i.e., an interval (see Groenen et al. (2006)). The advantage of iterative majorization is that each iteration is guaranteed to improve the solution until no improvement is possible. We present the results on an empirical data set on synthetic musical tones.

Keywords: Multidimensional scaling, Three-way data, Interval data, Symbolic data analysis, 3WaySym-Scal.

1 Introduction

Classical multidimensional scaling (MDS) models the dissimilarities among a set of objects as distances between points in a low-dimensional space. The aim of MDS is to represent and recover the relationships among the objects and to reveal the dimensions giving rise to the space. To illustrate, the goal in many MDS studies, for example in psychoacoustics or marketing, is to visualize the objects and the distances among them and to discover and reveal the dimensions underlying the dissimilarity ratings, that is, the most important perceptual attributes of the objects.

Often, the proximity data available for the n objects consist of a single numerical value for the dissimilarity $\delta_{ij}$ between each object pair. Then, the data may be presented in a single dissimilarity matrix with the entry for the i-th row and the j-th column being a single numerical value representing the dissimilarity between the i-th and j-th object (with i = 1, . . . , n and j = 1, . . . , n). Techniques for analyzing this two-way, one-mode data have been developed (see, e.g., Kruskal (1964), Winsberg and Carroll (1989), or Borg and Groenen (2005)). Sometimes proximity data are collected from K sources, for example, a panel of K judges or under K different conditions, yielding three-way two-mode data and an n × n × K array of single numerical values. Techniques have been developed to deal with this form of data, permitting the study of individual or group differences underlying the dissimilarity ratings (see, e.g., Carroll and Chang (1972), Winsberg and De Soete (1993)).

All of the above-mentioned MDS techniques require that each entry of the dissimilarity matrix or matrices be a single numerical value. However, the objects in the set under consideration may be of such a complex nature that the dissimilarity between each pair of them is better represented by a range, that is, an interval of values, or by a histogram of values rather than a single value. For example, if the number of objects under study becomes very large, it may be unreasonable to collect pairwise dissimilarities from each judge, and one may wish to aggregate the ratings from many judges where each judge has rated the dissimilarities for a subset of all the pairs. Then, rather than using an average value of dissimilarity for each object pair, one would wish to retain the information contained in the interval or histogram of dissimilarities obtained for each pair of objects. Or, it might be useful to collect data reflecting the imprecision or fuzziness of the dissimilarity between each object pair. Then, the ij-th entry in the n × n data matrix, that is, the dissimilarity between objects i and j, is either an interval or an empirical distribution of values (a histogram). In these cases, the data matrix consists of symbolic data.
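To make the aggregation scenario concrete, the following minimal sketch builds interval dissimilarities from incomplete judge ratings. It assumes, purely for illustration, that the interval for a pair is the range (minimum and maximum) of the ratings it received; the array `ratings`, its values, and all variable names are hypothetical and do not come from the paper.

```python
import numpy as np

# ratings[k, i, j] is the dissimilarity judge k gave to pair (i, j);
# NaN marks pairs that judge k did not rate. All values are made up.
n_judges, n_objects = 3, 3
ratings = np.full((n_judges, n_objects, n_objects), np.nan)
ratings[0, 0, 1], ratings[1, 0, 1] = 2.0, 4.0
ratings[0, 0, 2], ratings[2, 0, 2] = 5.0, 7.0
ratings[1, 1, 2], ratings[2, 1, 2] = 1.0, 3.0

rated = ~np.isnan(ratings)

# Lower and upper bound per pair, taken over the judges who rated that pair.
delta_low = np.where(rated, ratings, np.inf).min(axis=0)
delta_up = np.where(rated, ratings, -np.inf).max(axis=0)

# Pairs that nobody rated carry no information: mark them as missing.
unrated = ~rated.any(axis=0)
delta_low[unrated] = np.nan
delta_up[unrated] = np.nan
```

Only the entries with i < j are filled in this toy example. Averaging the ratings instead would collapse each interval to a single number and discard the spread among judges.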

By now, several techniques exist for MDS of symbolic data. The case where the dissimilarity between each object pair is represented by a range or interval of values has been treated by Denœux and Masson (2000) and Masson and Denœux (2002). They model each object as either a hyperbox (hypercube) or a hypersphere in a low-dimensional space and use a gradient descent algorithm. Groenen et al. (2006) have developed an MDS technique for interval data which yields a representation of the objects as hyperboxes in a low-dimensional Euclidean space rather than hyperspheres, because the hyperbox representation is reflected as a conjunction of p properties, where p is the dimensionality of the space. We shall follow this latter approach here.

The hyperbox representation is interesting for two reasons. First, a hyperbox is more appealing because it allows a strict separation between the units of the dimensions it uses. For example, the top speed of a certain type of car might be between 170 and 190 km/h and its fuel consumption between 8 and 10 liters per 100 km. These aspects can equally be described as an average top speed of 180 km/h plus or minus 10 km/h and an average fuel consumption of 9 liters per 100 km plus or minus 1. Both formulations are in line with the hyperbox approach. However, the hypersphere interpretation would be to state that the car is centered around a top speed of 180 km/h and a fuel consumption of 9 liters per 100 km and to give a radius. The units of this radius can no longer be expressed easily. A second reason for using hyperboxes is that we would like to discover relationships in terms of the underlying dimensions. The use of hyperboxes leads to unique dimensions, whereas the use of hyperspheres introduces the freedom of rotation so that dimensions are no longer unique.

Groenen and Winsberg (2006) have extended the method developed by Groenen et al. (2006) to deal with the case in which the dissimilarity between object i and object j is an empirical distribution of values or, equivalently, a histogram.

All of the methods described above for MDS of symbolic data treat the two-way one-mode case. That is, they deal with a single data matrix. Here, we extend that approach to deal with the two-mode three-way case. We consider the case where each of K judges denotes the dissimilarity between the i-th and j-th objects as an interval or a histogram, thereby giving a range of values or a fuzzy dissimilarity. So, the accent here will be on individual differences. Of course, the method also applies to the case where data are collected for K conditions, where for each condition the dissimilarity between the i-th and j-th objects is an interval or a histogram.

In the next section, we review briefly the I-Scal algorithm developed by Groenen et al. (2006) for MDS of interval dissimilarities based on iterative majorization. Then, we present an extension of the method to the three-way two-mode case and analyze an empirical data set dealing with dissimilarities of sounds. The paper ends with some conclusions and suggestions for continued research.

2 MDS of Interval Dissimilarities

We now review briefly the case of two-way one-mode MDS of interval dissimilarities. In this case, an interval of a dissimilarity will be represented by a range of distances between the two hyperboxes of objects i and j. This objective is achieved by representing the objects by rectangles and approximating the upper bound of the dissimilarity by the maximum distance between the rectangles and the lower bound by the minimum distance between the rectangles. An example of rectangle representation is shown in Figure 1. It also indicates how the minimum and maximum distances between two rectangles are defined.

Fig. 1. Example of distances in MDS for interval dissimilarities where the objects are represented by rectangles. (Plot not reproduced; it marks the minimum distance, labeled (L), and the maximum distance, labeled (U), between two rectangles.)

By using hyperboxes, both the distances and the coordinates are ranges. Let the coordinates of the centers of the rectangles be given by the rows of the n × p matrix X, where n is the number of objects and p the dimensionality. The distance from the center of rectangle i along axis s, denoted by the spread, is represented by $r_{is}$, which is by definition nonnegative. The maximum Euclidean distance between rectangles i and j is given by

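$$d_{ij}^{(U)}(X, R) = \left( \sum_{s=1}^{p} \left[ |x_{is} - x_{js}| + (r_{is} + r_{js}) \right]^2 \right)^{1/2} \tag{1}$$

and the minimum Euclidean distance by

$$d_{ij}^{(L)}(X, R) = \left( \sum_{s=1}^{p} \max\left[ 0, |x_{is} - x_{js}| - (r_{is} + r_{js}) \right]^2 \right)^{1/2}. \tag{2}$$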

This definition implies that rotation of the axes changes the distances between the hyperboxes because they are always parallel to the rotated axes. This sensitivity to rotation can be seen as an asset because it makes a solution rotationally unique, which is not true for ordinary MDS. In the special case of R = 0, the hyperboxes become points and the rotational uniqueness disappears, as in ordinary MDS.
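The two distance definitions (1) and (2), and the effect of rotation just described, can be checked numerically with a small sketch. The function name `box_distances` and the example coordinates are illustrative assumptions, not part of the I-Scal software.

```python
import numpy as np

def box_distances(X, R):
    """Return (d_low, d_up): minimum and maximum Euclidean distances
    between axis-parallel hyperboxes, following Eqs. (1) and (2).

    X : (n, p) array of box centers
    R : (n, p) array of nonnegative spreads (half-widths)
    """
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    gap = np.abs(X[:, None, :] - X[None, :, :])   # |x_is - x_js|, shape (n, n, p)
    spread = R[:, None, :] + R[None, :, :]        # r_is + r_js
    d_up = np.sqrt(((gap + spread) ** 2).sum(axis=2))
    d_low = np.sqrt((np.maximum(0.0, gap - spread) ** 2).sum(axis=2))
    return d_low, d_up

# Two rectangles in the plane (made-up values); only entries with i != j matter.
X = np.array([[0.0, 0.0], [3.0, 4.0]])
R = np.array([[1.0, 1.0], [1.0, 1.0]])
d_low, d_up = box_distances(X, R)

# Rotate the centers by 45 degrees while the boxes stay axis-parallel.
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X_rot = X @ Q.T
_, d_up_rot = box_distances(X_rot, R)

# With R = 0 the boxes are points and rotation leaves distances unchanged.
_, d_up_pts = box_distances(X, np.zeros_like(R))
_, d_up_pts_rot = box_distances(X_rot, np.zeros_like(R))
```

In this example the maximum distance between the two rectangles changes after rotation, whereas the point-to-point distance (R = 0) stays the same, which is the rotational uniqueness argument in miniature.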

Symbolic MDS for interval dissimilarities aims at approximating the lower and upper bounds of the dissimilarities by minimum and maximum distances between rectangles. This objective is formalized by the I-Stress loss function

$$\sigma_I^2(X, R) = \sum_{i<j} w_{ij} \left[ \delta_{ij}^{(U)} - d_{ij}^{(U)}(X, R) \right]^2 + \sum_{i<j} w_{ij} \left[ \delta_{ij}^{(L)} - d_{ij}^{(L)}(X, R) \right]^2,$$

where $\delta_{ij}^{(U)}$ is the upper bound of the dissimilarity of objects i and j, $\delta_{ij}^{(L)}$ the lower bound, and $w_{ij}$ is a given nonnegative weight. $\sigma_I^2(X, R)$ can be minimized by iterative majorization (see Groenen et al. (2006)).
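The I-Stress definition above can be evaluated directly, as in the following minimal sketch. It only computes the loss for given centers and spreads and does not implement the majorization algorithm; the function name `i_stress`, the default of unit weights, and the example numbers are assumptions made for illustration.

```python
import numpy as np

def i_stress(X, R, delta_low, delta_up, W=None):
    """Evaluate I-Stress: weighted squared errors between the interval
    bounds and the min/max distances between axis-parallel hyperboxes."""
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    delta_low = np.asarray(delta_low, dtype=float)
    delta_up = np.asarray(delta_up, dtype=float)
    n = X.shape[0]
    if W is None:
        W = np.ones((n, n))                      # assumed default: unit weights
    gap = np.abs(X[:, None, :] - X[None, :, :])  # |x_is - x_js|
    spread = R[:, None, :] + R[None, :, :]       # r_is + r_js
    d_up = np.sqrt(((gap + spread) ** 2).sum(axis=2))
    d_low = np.sqrt((np.maximum(0.0, gap - spread) ** 2).sum(axis=2))
    iu = np.triu_indices(n, k=1)                 # all pairs with i < j
    upper_term = (W[iu] * (delta_up[iu] - d_up[iu]) ** 2).sum()
    lower_term = (W[iu] * (delta_low[iu] - d_low[iu]) ** 2).sum()
    return float(upper_term + lower_term)

# Made-up example: two objects with the interval [2, 8] as their dissimilarity.
X = np.array([[0.0, 0.0], [3.0, 4.0]])
R = np.array([[1.0, 1.0], [1.0, 1.0]])
delta_low = np.array([[0.0, 2.0], [2.0, 0.0]])
delta_up = np.array([[0.0, 8.0], [8.0, 0.0]])
stress = i_stress(X, R, delta_low, delta_up)
```

A value of zero would mean that every interval is reproduced exactly by the minimum and maximum distances between the corresponding rectangles.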


Iterative majorization has the advantage that I-Stress is guaranteed to decrease in each iteration from any starting configuration until a stationary point is obtained. In practice, the algorithm stops at a stationary point that is a local minimum. Another important property for the purpose of this paper is that, in each iteration, the algorithm operates on a quadratic function in X and R. Groenen et al. (2006) have derived the quadratic majorizing function for $\sigma_I^2(X, R)$ as the one at the right-hand side of

$$\sigma_I^2(X, R) \leq \sum_{s=1}^{p} \left( x_s^\top A_s^{(1)} x_s - 2 x_s^\top B_s^{(1)} y_s \right) + \sum_{s=1}^{p} \left( r_s^\top A_s^{(2)} r_s - 2 r_s^\top b_s^{(2)} \right) + \sum_{s=1}^{p} \sum_{i<j} \left( \gamma_{ijs}^{(1)} + \gamma_{ijs}^{(2)} \right), \tag{3}$$

where $x_s$ is column s of X, $r_s$ is column s of R, and $y_s$ is column s of Y (the previous estimate of X). The matrices $A_s^{(1)}$, $B_s^{(1)}$, $A_s^{(2)}$, vectors $b_s^{(2)}$, and scalars $\gamma_{ijs}^{(1)}$, $\gamma_{ijs}^{(2)}$ all depend on previous estimates of X and R, hence they are known at the present iteration. Their exact definition can be found in Groenen et al. (2006). For our purposes, it is important to realize that the majorizing function at the right of (3) is quadratic in X and R, so that an update can be readily derived by setting the derivatives equal to zero.
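As a hedged illustration of that last step, the stationarity conditions of the right-hand side of (3) can be written out for each dimension s. The sketch assumes that the matrices $A_s^{(1)}$ and $A_s^{(2)}$ are symmetric and ignores the nonnegativity requirement on the spreads; the actual update in Groenen et al. (2006) may treat both points differently.

```latex
\begin{align*}
\frac{\partial}{\partial x_s}
  \left( x_s^\top A_s^{(1)} x_s - 2\, x_s^\top B_s^{(1)} y_s \right)
  &= 2 A_s^{(1)} x_s - 2 B_s^{(1)} y_s = 0
  &&\Longrightarrow\quad A_s^{(1)} x_s = B_s^{(1)} y_s, \\
\frac{\partial}{\partial r_s}
  \left( r_s^\top A_s^{(2)} r_s - 2\, r_s^\top b_s^{(2)} \right)
  &= 2 A_s^{(2)} r_s - 2\, b_s^{(2)} = 0
  &&\Longrightarrow\quad A_s^{(2)} r_s = b_s^{(2)}.
\end{align*}
```

Each update therefore amounts to solving one linear system per dimension for the centers and one for the spreads, with all coefficients fixed by the previous iterate. The terms $\gamma_{ijs}^{(1)}$ and $\gamma_{ijs}^{(2)}$ are constants and drop out of the derivatives.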

Another important feature of the majorizing function being quadratic is that it becomes easy to impose the constraints that we will need for the extension to two-mode three-way symbolic MDS proposed in this paper. For more details on iterative majorization and its use in three-way MDS, see, for example, De Leeuw and Heiser (1980) and Borg and Groenen (2005).

3 Two-Mode Three-Way MDS of Interval Data

The I-Scal algorithm developed by Groenen et al. (2006) can be extended quite easily to two-mode three-way interval data. In this case, an interval of dissimilarities is available for each replication $\ell = 1, \ldots, L$. Then, $\delta_{ij\ell}^{(L)}$ and $\delta_{ij\ell}^{(U)}$ are the lower and upper boundaries of the interval of $\delta_{ij}$ for replication $\ell$. Of course, a normal I-Scal solution could be computed for every replication separately. However, here we impose the restrictions of the weighted Euclidean model, similar to the Indscal approach of Carroll and Chang (1972). The main idea is to have a single common space of hyperboxes and to allow each replication $\ell$ to stretch or shrink the dimensions to fit its ranges of dissimilarities as well as possible. Let X and R denote here the centers and spreads of the hyperboxes in the common space. Then, the weighted Euclidean model restrictions imply that the hyperboxes for the individual replication $\ell$ are modelled as

$$X_\ell = X V_\ell \tag{4}$$

$$R_\ell = R V_\ell, \tag{5}$$
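The restrictions (4) and (5) can be illustrated with a short sketch. It assumes that $V_\ell$ is a diagonal matrix of nonnegative dimension weights, as in the usual weighted Euclidean (Indscal-type) model; the text available here does not state this explicitly, and the function name `replication_boxes` and the numbers are hypothetical.

```python
import numpy as np

def replication_boxes(X, R, v):
    """Hyperboxes of one replication under the weighted Euclidean model:
    X_l = X V_l and R_l = R V_l, with V_l = diag(v) assumed diagonal."""
    X = np.asarray(X, dtype=float)
    R = np.asarray(R, dtype=float)
    v = np.asarray(v, dtype=float)
    if np.any(v < 0):
        raise ValueError("dimension weights are assumed to be nonnegative")
    V = np.diag(v)
    return X @ V, R @ V

# Common space with two rectangles (made-up values).
X = np.array([[0.0, 0.0], [3.0, 4.0]])
R = np.array([[1.0, 1.0], [1.0, 1.0]])

# This replication stretches dimension 1 and shrinks dimension 2.
X_l, R_l = replication_boxes(X, R, v=[2.0, 0.5])
```

Because the same weights multiply centers and spreads, each replication rescales the common rectangles along the axes without rotating them.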

References

Modern Multidimensional Scaling: Theory and Applications (Second Edition)

A Latent Class Approach to Fitting the Weighted Euclidean Model, CLASCAL.

A Quasi-Nonmetric Method for Multidimensional Scaling via an Extended Euclidean Model.

Multidimensional scaling of interval-valued dissimilarity data

I-Scal: Multidimensional scaling of interval dissimilarities
