Stochastic block models for multiplex networks: an application to a multilevel network of researchers

doi:10.1111/RSSA.12193

Stochastic Block Models for Multiplex networks: an application to networks of researchers

Pierre Barbillon∗¹, Sophie Donnet¹, Emmanuel Lazega², and Avner Bar-Hen³

¹ AgroParisTech / UMR INRA MIA 518, 16 rue Claude Bernard, 75231 Paris Cedex 05, France

² Institut d'Études Politiques de Paris (Sciences Po), Département de Sociologie, Centre de Sociologie des Organisations, 19 rue Amélie, 75007 Paris, France

³ MAP5, UFR de Mathématiques et Informatique, Université Paris Descartes, 45 rue des Saints-Pères, 75270 Paris Cedex 06

Abstract

Modeling relations between individuals is a classical question in social sciences, and clustering individuals according to the observed patterns of interactions makes it possible to uncover a latent structure in the data. The stochastic block model (SBM) is a popular approach for grouping individuals with respect to their social behavior. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks in which more than one edge can exist between two nodes. In this paper, we extend the SBM to multiplex networks in order to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters, such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups, through a variational Expectation-Maximization procedure. Consistency of the estimates as well as statistical properties of the model are obtained. The number of groups is chosen using the Integrated Completed Likelihood criterion, a penalized likelihood criterion. The multiplex stochastic block model arises in many situations, but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their labs. Our results show strong interactions between these two kinds of connections, and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.

∗ pierre.barbillon@agroparistech.fr

arXiv:1501.06444v1 [stat.ME] 26 Jan 2015

Keywords: Bivariate Stochastic Block Model, Multilevel / Multiplex networks, Social network.

1 Introduction

Network analysis has emerged as a key technique for understanding and for investigating social interactions through the properties of relations between and within units. From a statistical point of view, a network is a realization of a random graph formed by a set of nodes V representing the units (e.g. individuals, actors, companies) and a set of edges E representing relationships between pairs of nodes.

The system in which the same nodes belong to multiple networks is typically referred to as a multiplex network or multigraph (see Wasserman (1994) for example). In recent literature, there has been an upsurge of interest in multiplex networks (see for example Cozzo et al. (2012); Loe and Jeldtoft Jensen (2014); Rank et al. (2010); Szell et al. (2010); Mucha et al. (2010); Maggioni et al. (2013); Brummitt et al. (2012); Saumell-Mendiola et al. (2012); Bianconi (2013); Nicosia et al. (2013)). In these multiplex networks, different kinds of links (or connections) are possible for each pair of nodes. This induced link multiplexity is a fundamental aspect of social relations (Snijders and Baerveldt, 2003) since these multiple links are frequently interdependent: links in one network may have an influence on the formation or dissolution of links in other networks.

The simultaneous analysis of several networks also arises when one is interested in the social behavior of individuals belonging to organized entities (such as companies, laboratories, political groups, etc.), with some individuals possibly belonging to the same institution. While the actors exchange resources (such as advice) at the individual level, their respective organizations of affiliation also share resources at the institutional level (financial resources for instance). Each level (individuals and organizations) constitutes a system of exchange of different resources that has its own logic and could be studied separately. However, studying the two networks jointly (and hence embedding the individuals in the multilevel relational and organizational structures constituting the inter-organizational context of their actions) would allow us to identify the individuals that benefit from relatively easy access to the resources circulating in each level, which is of much more interest. In other words, studying the two levels jointly could help us understand how an individual can benefit from the position of their organization in the institutional network.

In this paper, we are interested in studying the advice relations between researchers and the exchanges of resources between laboratories. We adopt the following individual-oriented strategy (this point is discussed in the paper): the institutional network is used to define a new network on the individual level, i.e. the set of nodes consists of the set of individuals and, for a pair of individuals, two kinds of links are possible: a direct connection given by the individual network and a connection through their organizations given by the organizational network. As a consequence, the individual and institutional levels are fused into a multiplex.
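As an illustration of this construction, the following Python sketch builds a two-layer array from an individual-level adjacency matrix, an organization-level adjacency matrix and an affiliation vector. The function name `build_two_layer_multiplex` and its arguments `advice`, `lab_net` and `lab_of` are hypothetical. The treatment of two researchers from the same laboratory (they inherit the diagonal entry of `lab_net`) is an assumption of this sketch, not something stated in the text.

```python
import numpy as np

def build_two_layer_multiplex(advice, lab_net, lab_of):
    """Fuse an individual network and an organizational network.

    advice  : (n, n) 0/1 matrix, advice[i, j] = 1 if i is linked to j directly.
    lab_net : (m, m) 0/1 matrix of links between laboratories.
    lab_of  : length-n vector, lab_of[i] is the laboratory index of researcher i.

    Returns an array X of shape (2, n, n): X[0] is the direct layer and
    X[1] is the layer of connections through the laboratories.
    """
    advice = np.asarray(advice, dtype=int)
    lab_net = np.asarray(lab_net, dtype=int)
    lab_of = np.asarray(lab_of, dtype=int)
    n = lab_of.size
    if advice.shape != (n, n):
        raise ValueError("advice must be an (n, n) matrix with n = len(lab_of)")
    # Researcher i is linked to researcher j in the second layer when the
    # laboratory of i is linked to the laboratory of j.
    # Assumption: same-lab pairs take the diagonal value lab_net[a, a].
    via_lab = lab_net[lab_of[:, None], lab_of[None, :]]
    return np.stack([advice, via_lab])
```

For each ordered pair of researchers, the resulting array stores one binary value per kind of link, which is the form of data considered in the rest of the paper.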

We then develop a statistical model able to detect substantial non-trivial topological features in a multiplex network, with patterns of connection between its elements that are not purely regular. Several models such as scale-free networks and small-world networks have been proposed to describe and understand the heterogeneity observed in networks. These models make it possible to derive properties of the network at the macro-scale and to understand the outcomes of interactions. To explore heterogeneity at other scales (such as the micro- or meso-scale) in social networks, specific models such as the stochastic block models (SBM) (Snijders and Nowicki (1997)) have been developed for uniplex networks. In this paper, we propose an original extension of the SBMs to the multiplex case. Our model captures not only the main effects (which correspond to a classical uniplex network) but also the pairwise interactions between the nodes. We estimate the parameters of the multiplex SBM using an extension of the variational EM algorithm. Consistency of the estimation of the parameters is proved. As for the uniplex SBM, a key issue is to choose the number of blocks. We use a penalized likelihood criterion, namely the Integrated Completed Likelihood (ICL). The inference procedure is performed on the cancer researchers / laboratories dataset.

The paper is organized as follows. The extension of the SBM to multiplex networks is presented in Section 2; the proofs of model identifiability and of the consistency of the variational EM procedure are deferred to Appendices A and B. In Section 3, we describe the dataset of Lazega et al. (2008), apply the new model and discuss the results. Finally, the contribution of the multiplex SBM to the analysis of multiplex networks is highlighted in Section 4.

2 Multiplex stochastic block model

The main objective is to cluster the individuals (or nodes) into blocks sharing connection properties with the other individuals of the multiplex network. Stochastic block models (Nowicki and Snijders, 2001) for random graphs have emerged as a natural tool to perform such a clustering based on uniplex networks (directed or not, valued or not).

In the following, we propose an extension of the Stochastic Block Model (SBM) to multiplex networks. The SBM for multiplex networks builds on a multiplex Erdős-Rényi model, described in subsection 2.1, and is itself derived in subsection 2.2.

2.1 Erdős-Rényi model for multiplex networks

Let $X^1, \dots, X^K$ be $K$ directed graphs relying on the same set of nodes $E = \{1, \dots, n\}$. We assume that $\forall (i,j)$, $i \neq j$, $\forall k \in \{1, \dots, K\}$, $X^k_{ij} \in \{0,1\}$ and $X_{ii} \neq 0$. We define a joint distribution on $X^{1:K} = (X^1, \dots, X^K)$ as: $\forall (i,j) \in \{1, \dots, n\}^2$, $i \neq j$, $\forall w \in \{0,1\}^K$,

\[
\mathbb{P}(X^{1:K}_{ij} = w) = \pi^{(w)} \quad \text{where} \quad \sum_{w \in \{0,1\}^K} \pi^{(w)} = 1, \tag{1}
\]

and $(X^{1:K}_{ij})_{i,j}$ are mutually independent.
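To make the joint distribution (1) concrete, here is a minimal Python sketch that simulates the $K$ layers by drawing one edge pattern $w$ per ordered pair $(i, j)$, $i \neq j$, independently across pairs. The function name `sample_multiplex_er` is hypothetical, as is the encoding of $w$ as an integer in $\{0, \dots, 2^K - 1\}$ with $w_1$ as the most significant bit. Since (1) only specifies the distribution for $i \neq j$, the diagonal is filled with 0 as an arbitrary placeholder.

```python
import numpy as np

def sample_multiplex_er(n, pi, rng=None):
    """Simulate K directed layers on n nodes from the multiplex ER model.

    pi  : sequence of length 2**K summing to 1; pi[c] is the probability of
          the pattern w whose binary encoding is c (w_1 = most significant bit).
    rng : seed or numpy Generator (optional).

    Returns an integer array X of shape (K, n, n) with X[k, i, j] in {0, 1}.
    """
    rng = np.random.default_rng(rng)
    pi = np.asarray(pi, dtype=float)
    K = pi.size.bit_length() - 1
    if 2 ** K != pi.size or not np.isclose(pi.sum(), 1.0):
        raise ValueError("pi must have length 2**K and sum to 1")
    # One pattern code per ordered pair, drawn independently across pairs.
    codes = rng.choice(pi.size, size=(n, n), p=pi)
    # Decode each code into its K binary components (one per layer).
    shifts = np.arange(K - 1, -1, -1).reshape(K, 1, 1)
    X = (codes[None, :, :] >> shifts) & 1
    # Placeholder for self-pairs, which (1) does not cover.
    idx = np.arange(n)
    X[:, idx, idx] = 0
    return X
```

For example, with $K = 2$, `sample_multiplex_er(50, [0.4, 0.1, 0.2, 0.3])` draws the patterns $(0,0)$, $(0,1)$, $(1,0)$ and $(1,1)$ with probabilities 0.4, 0.1, 0.2 and 0.3. The layers of a given pair are drawn jointly, so they may be dependent, while distinct pairs are independent.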

The maximum likelihood estimate of the parameter of interest $\pi = (\pi^{(w)})_{w \in \{0,1\}^K}$ is, for all $w \in \{0,1\}^K$:

\[
\widehat{\pi}^{(w)} = \frac{1}{n(n-1)} \sum_{i,j,\, i \neq j} \mathbb{I}_{\{X^{1:K}_{ij} = w\}}.
\]
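The estimator above is the empirical frequency of each edge pattern over the $n(n-1)$ ordered pairs. A minimal Python sketch follows; the function name `multiplex_er_mle` and the integer encoding of the patterns $w$ (with $w_1$ as the most significant bit) are illustrative choices, not part of the paper.

```python
import numpy as np

def multiplex_er_mle(X):
    """Empirical frequency of each edge pattern w over ordered pairs i != j.

    X : array of shape (K, n, n) with 0/1 entries.

    Returns an array of length 2**K whose entry c estimates the probability
    of the pattern w with binary encoding c (w_1 = most significant bit).
    """
    X = np.asarray(X, dtype=np.int64)
    K, n, _ = X.shape
    # Encode the K binary values of each pair as one integer code.
    weights = 1 << np.arange(K - 1, -1, -1)
    codes = np.tensordot(weights, X, axes=1)      # shape (n, n)
    # Keep only the pairs with i != j, as in the sum defining the estimator.
    off_diag = ~np.eye(n, dtype=bool)
    counts = np.bincount(codes[off_diag], minlength=2 ** K)
    return counts / (n * (n - 1))
```

The returned vector sums to 1 by construction, since every ordered pair contributes to exactly one pattern.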

This model is quite simple since any relation between two individuals (a relation being a collection of edges) does not depend on the relations between the other individuals. However, the different kinds of relations (edges) between two individuals are not assumed to be independent.

Remark 1. This model is clearly an extension of the Erdős-Rényi model since the marginal distribution of $X^k_{ij}$ (for any $k = 1, \dots, K$) is Bernoulli with density:

\[
\mathbb{P}(X^k_{ij} = x^k_{ij}) = \Bigg( \sum_{w \in \{0,1\}^K \,|\, w_k = 1} \pi^{(w)} \Bigg)^{x^k_{ij}} \Bigg( \sum_{w \in \{0,1\}^K \,|\, w_k = 0} \pi^{(w)} \Bigg)^{1 - x^k_{ij}}.
\]
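The marginal in this remark amounts to summing $\pi^{(w)}$ over all patterns whose $k$-th component equals 1. The following Python sketch computes these marginal edge probabilities for every layer; the function name `layer_marginals` and the array encoding of $\pi$ (index $c$ is the binary encoding of $w$, with $w_1$ as the most significant bit) are assumptions made for illustration.

```python
import numpy as np

def layer_marginals(pi):
    """Marginal edge probability P(X^k_ij = 1) for each layer k = 1..K.

    pi : sequence of length 2**K; pi[c] is the probability of the pattern w
         whose binary encoding is c (w_1 = most significant bit).
    """
    pi = np.asarray(pi, dtype=float)
    K = pi.size.bit_length() - 1
    if 2 ** K != pi.size:
        raise ValueError("pi must have length 2**K")
    # View pi as a K-way table indexed by (w_1, ..., w_K).
    table = pi.reshape((2,) * K)
    marginals = np.empty(K)
    for k in range(K):
        other_axes = tuple(a for a in range(K) if a != k)
        # Sum out every component except w_k, then read off w_k = 1.
        marginals[k] = table.sum(axis=other_axes)[1]
    return marginals
```

With $K = 2$ and $\pi = (0.4, 0.1, 0.2, 0.3)$ for the patterns $(0,0)$, $(0,1)$, $(1,0)$, $(1,1)$, the marginals are $0.2 + 0.3 = 0.5$ for the first layer and $0.1 + 0.3 = 0.4$ for the second. Their product, 0.2, differs from $\pi^{(1,1)} = 0.3$, which illustrates that the two kinds of edges need not be independent.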

Moreover, any conditional distribution of $X^k_{ij}$ given $(X^l_{ij})_{l \in S_{\setminus k}}$ (where $S_{\setminus k}$ is a subset of


References

Stochastic blockmodels: First steps

On comparing partitions

Community Structure in Time-Dependent, Multiscale, and Multiplex Networks

Assessing a mixture model for clustering with the integrated completed likelihood

Estimation and prediction for stochastic blockstructures
