
Stochastic block models for multiplex networks: an application to a multilevel network of researchers

TLDR
This work extends stochastic block models to multiplex networks to obtain a clustering based on more than one kind of relationship, and finds strong interactions between the two kinds of connection studied (direct links between researchers and links through their laboratories).
Abstract
Modelling relationships between individuals is a classical question in social sciences and clustering individuals according to the observed patterns of interactions allows us to uncover a latent structure in the data. The stochastic block model is a popular approach for grouping individuals with respect to their social comportment. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks where more than one edge can exist between the nodes. We extend stochastic block models to multiplex networks to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters, such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups, through a variational expectation-maximization procedure. Consistency of the estimates is studied. The number of groups is chosen by using the integrated completed likelihood criterion, which is a penalized likelihood criterion. Multiplex stochastic block models arise in many situations but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their laboratories. Our results show strong interactions between these two kinds of connection and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.


Stochastic Block Models for Multiplex networks: an
application to networks of researchers
Pierre Barbillon (1), Sophie Donnet (1), Emmanuel Lazega (2), and Avner Bar-Hen (3)
(1) AgroParisTech / UMR INRA MIA 518, 16 rue Claude Bernard, 75231 Paris Cedex 05, France
(2) Institut d'Études Politiques de Paris (Sciences Po), Département de Sociologie, Centre de Sociologie des Organisations, 19 rue Amélie, 75007 Paris, France
(3) MAP5, UFR de Mathématiques et Informatique, Université Paris Descartes, 45 rue des Saints-Pères, 75270 Paris Cedex 06
pierre.barbillon@agroparistech.fr
arXiv:1501.06444v1 [stat.ME] 26 Jan 2015
Abstract
Modeling relations between individuals is a classical question in social sciences, and clustering individuals according to the observed patterns of interactions allows us to uncover a latent structure in the data. The stochastic block model (SBM) is a popular approach for grouping individuals with respect to their social behavior. When several relationships of various types can occur jointly between the individuals, the data are represented by multiplex networks where more than one edge can exist between the nodes. In this paper, we extend the SBM to multiplex networks in order to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters (such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups) through a variational Expectation-Maximization procedure. Consistency of the estimates as well as statistical properties of the model are obtained. The number of groups is chosen using the Integrated Completed Likelihood criterion, a penalized likelihood criterion. The multiplex stochastic block model arises in many situations but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their labs. Our results show strong interactions between these two kinds of connections, and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.
Keywords— Bivariate Stochastic Block Model, Multilevel / Multiplex networks,
Social network.
1 Introduction
Network analysis has emerged as a key technique for understanding and investigating social interactions through the properties of relations between and within units. From a statistical point of view, a network is a realization of a random graph formed by a set of nodes V representing the units (e.g. individuals, actors, companies) and a set of edges E representing relationships between pairs of nodes.
A system in which the same nodes belong to multiple networks is typically referred to as a multiplex network or multigraph (see Wasserman (1994) for example). In recent literature, there has been an upsurge of interest in multiplex networks (see for example Cozzo et al. (2012); Loe and Jeldtoft Jensen (2014); Rank et al. (2010); Szell et al. (2010); Mucha et al. (2010); Maggioni et al. (2013); Brummitt et al. (2012); Saumell-Mendiola et al. (2012); Bianconi (2013); Nicosia et al. (2013)). In these multiplex networks, different kinds of links (or connections) are possible for each pair of nodes. This induced link multiplexity is a fundamental aspect of social relations (Snijders and Baerveldt, 2003) since these multiple links are frequently interdependent: links in one network may influence the formation or dissolution of links in other networks.
The simultaneous analysis of several networks also arises when one is interested in the social behavior of individuals belonging to organized entities (such as companies, laboratories, political groups, etc.), with some individuals possibly belonging to the same institution. While the actors exchange resources (such as advice) at the individual level, their respective organizations of affiliation also share resources at the institutional level (financial resources for instance). Each level (individuals and organizations) constitutes a system of exchange of different resources that has its own logic and could be studied separately. However, studying the two networks jointly (and hence embedding the individuals in the multilevel relational and organizational structures constituting the inter-organizational context of their actions) would allow us to identify the individuals that benefit from relatively easy access to the resources circulating in each level, which is of much more interest. In other words, studying the two levels jointly could help us understand how an individual can benefit from the position of their organization in the institutional network.
In this paper, we are interested in studying the advice relations between researchers and the exchanges of resources between laboratories. We adopt the following individual-oriented strategy (this point is discussed in the paper): the institutional network is used to define a new network on the individual level, i.e. the set of nodes consists of the set of individuals and, for a pair of individuals, two kinds of links are possible: a direct connection given by the individual network and a connection through their organizations given by the organizational network. As a consequence, the individual and institutional levels are fused into a multiplex.
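The fusion of the two levels described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' code: the function name `fuse_levels`, its argument names, and the toy data are hypothetical. In particular, the rule used for two researchers of the same laboratory (they inherit the diagonal entry of the laboratory network) is an assumption made only for illustration.

```python
import numpy as np

def fuse_levels(advice, lab_exchange, lab_of):
    """Stack an individual network and a lab-induced network into one multiplex.

    advice:       (n, n) 0/1 array, the directed network between individuals
    lab_exchange: (m, m) 0/1 array, the directed network between laboratories
    lab_of:       length-n integer array, lab_of[i] = index of i's laboratory
    Returns an array X of shape (2, n, n) with X[0] = advice layer and
    X[1][i, j] = lab_exchange[lab_of[i], lab_of[j]] (illustrative assumption).
    """
    advice = np.asarray(advice)
    lab_exchange = np.asarray(lab_exchange)
    lab_of = np.asarray(lab_of)
    # Lift the laboratory network to the individual level.
    induced = lab_exchange[np.ix_(lab_of, lab_of)]
    X = np.stack([advice, induced]).astype(int)
    # Remove self-loops in both layers.
    idx = np.arange(advice.shape[0])
    X[:, idx, idx] = 0
    return X

# Toy data: three researchers, the first two in lab 0, the third in lab 1.
advice = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 0]])
lab_exchange = np.array([[0, 1],
                         [0, 0]])
lab_of = np.array([0, 0, 1])
X = fuse_levels(advice, lab_exchange, lab_of)
```

Each ordered pair of researchers then carries a pair of binary values, one per kind of link, which is the kind of data the multiplex model takes as input.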
We then develop a statistical model able to detect substantial non-trivial topological features in multiplex networks, with patterns of connection between their elements that are not purely regular. Several models such as scale-free networks and small-world networks have been proposed to describe and understand the heterogeneity observed in networks. These models allow one to derive properties of the network at the macro-scale and to understand the outcomes of interactions. To explore heterogeneity at other scales (such as micro- or meso-scale) in social networks, specific models such as stochastic block models (SBM) (Snijders and Nowicki (1997)) have been developed for uniplex networks. In this paper, we propose an original extension of SBMs to the multiplex case. Our model captures not only the main effects (which correspond to a classical uniplex) but also the pairwise interactions between the nodes. We estimate the parameters of the multiplex SBM using an extension of the variational EM algorithm. Consistency of the estimation of the parameters is proved. As for uniplex SBM, a key issue is to choose the number of blocks. We use a penalized likelihood criterion, namely the Integrated Completed Likelihood (ICL). The inference procedure is performed on the cancer researchers / laboratories dataset.
The paper is organized as follows. The extension of SBM to multiplex networks is presented in Section 2; the proofs of model identifiability and of the consistency of the variational EM procedure are postponed to Appendices A and B. In Section 3, we describe the dataset of Lazega et al. (2008), apply the new model and discuss the results. Finally, the contribution of multiplex SBM to the analysis of multiplex networks is highlighted in Section 4.
2 Multiplex stochastic block model
The main objective is to cluster the individuals (or nodes) into blocks sharing connection properties with the other individuals of the multiplex network. Stochastic block models (Nowicki and Snijders, 2001) for random graphs have emerged as a natural tool to perform such a clustering based on uniplex networks (directed or not, valued or not).
In the following, we propose an extension of the Stochastic Block Model (SBM) to multiplex networks. The SBM for multiplex networks is derived from a multiplex Erdős-Rényi model, which is described in subsection 2.1. The SBM for multiplex networks is derived in subsection 2.2.
2.1 Erdős-Rényi model for multiplex networks
Let $X^1, \ldots, X^K$ be $K$ directed graphs relying on the same set of nodes $E = \{1, \ldots, n\}$. We assume that $\forall (i, j)$, $i \neq j$, $\forall k \in \{1, \ldots, K\}$, $X^k_{ij} \in \{0, 1\}$ and $X^k_{ii} = 0$ (no self-loops). We define a joint distribution on $X^{1:K} = (X^1, \ldots, X^K)$ as: $\forall (i, j) \in \{1, \ldots, n\}^2$, $i \neq j$, $\forall w \in \{0, 1\}^K$,
\[
\mathbb{P}(X^{1:K}_{ij} = w) = \pi^{(w)} \quad \text{where} \quad \sum_{w \in \{0,1\}^K} \pi^{(w)} = 1, \tag{1}
\]
and $(X^{1:K}_{ij})_{i,j}$ are mutually independent.
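To make definition (1) concrete, the following Python sketch draws a multiplex graph by sampling, independently for each ordered pair $(i, j)$ with $i \neq j$, one pattern $w \in \{0,1\}^K$ with probability $\pi^{(w)}$. The function name `simulate_multiplex_er`, the dictionary representation of $\pi$, and the numerical values are assumptions chosen for illustration; they do not come from the paper.

```python
import itertools
import numpy as np

def simulate_multiplex_er(n, pi, rng):
    """Draw K directed layers on n nodes from the multiplex model (1).

    pi maps each pattern w (a tuple in {0,1}^K) to its probability pi^(w).
    Returns an integer array X of shape (K, n, n) with X[k, i, j] = X^k_ij.
    """
    patterns = list(pi.keys())
    probs = np.array([pi[w] for w in patterns], dtype=float)
    if not np.isclose(probs.sum(), 1.0):
        raise ValueError("the probabilities pi^(w) must sum to 1")
    K = len(patterns[0])
    X = np.zeros((K, n, n), dtype=int)
    # One independent draw of the whole pattern w per ordered pair i != j.
    for i, j in itertools.permutations(range(n), 2):
        w = patterns[rng.choice(len(patterns), p=probs)]
        X[:, i, j] = w
    return X

# Hypothetical parameter for K = 2 layers.
pi = {(0, 0): 0.6, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.2}
X = simulate_multiplex_er(50, pi, np.random.default_rng(0))
```

Sampling the whole vector $w$ at once, rather than each layer separately, is what allows the layers to be dependent within a pair of nodes while pairs remain independent of each other.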
The maximum likelihood estimate of the parameter of interest $\pi = (\pi^{(w)})_{w \in \{0,1\}^K}$ is, for all $w \in \{0,1\}^K$:
\[
\widehat{\pi}^{(w)} = \frac{1}{n(n-1)} \sum_{i,j,\, i \neq j} \mathbb{I}_{\{X^{1:K}_{ij} = w\}}.
\]
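The estimate above is simply the empirical frequency of each pattern $w$ among the $n(n-1)$ ordered pairs. A minimal Python sketch follows; the function name `estimate_pi`, the `(K, n, n)` array layout, and the toy graph are illustrative assumptions.

```python
from collections import Counter
import itertools
import numpy as np

def estimate_pi(X):
    """Empirical frequency of each edge pattern w over ordered pairs i != j.

    X is a 0/1 array of shape (K, n, n) with X[k, i, j] = X^k_ij.
    Returns a dict mapping every w in {0,1}^K to its estimated probability.
    """
    K, n, _ = X.shape
    counts = Counter(
        tuple(int(x) for x in X[:, i, j])
        for i, j in itertools.permutations(range(n), 2)
    )
    return {w: counts.get(w, 0) / (n * (n - 1))
            for w in itertools.product((0, 1), repeat=K)}

# Toy multiplex graph with K = 2 layers and n = 3 nodes.
X_toy = np.zeros((2, 3, 3), dtype=int)
X_toy[0, 0, 1] = 1  # pair (0, 1) is linked in both layers
X_toy[1, 0, 1] = 1
X_toy[0, 1, 2] = 1  # pair (1, 2) is linked in the first layer only
pi_hat = estimate_pi(X_toy)
```

In this toy graph, one of the six ordered pairs shows the pattern $(1, 1)$, one shows $(1, 0)$, and the remaining four show $(0, 0)$, so the estimated frequencies are $1/6$, $1/6$ and $4/6$ respectively.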
This model is quite simple since any relation between two individuals (a relation being a collection of edges) does not depend on the relations between the other individuals. However, the different kinds of relations between two individuals (edges) are not assumed to be independent.
Remark 1. This model is clearly an extension of the Erdős-Rényi model since the marginal distribution of $X^k_{ij}$ (for any $k = 1, \ldots, K$) is Bernoulli with density:
\[
\mathbb{P}(X^k_{ij} = x^k_{ij}) = \Bigl(\sum_{w \in \{0,1\}^K \mid w_k = 1} \pi^{(w)}\Bigr)^{x^k_{ij}} \Bigl(\sum_{w \in \{0,1\}^K \mid w_k = 0} \pi^{(w)}\Bigr)^{1 - x^k_{ij}}.
\]
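As an illustration of this marginal, the Python sketch below sums $\pi^{(w)}$ over the patterns whose $k$-th coordinate equals 1 and compares the joint probability of both edges with the product of the marginals. The helper name `marginal_edge_prob`, the zero-based layer index, and the numerical values of $\pi$ are hypothetical.

```python
def marginal_edge_prob(pi, k):
    """P(X^k_ij = 1): sum of pi^(w) over patterns w whose k-th entry is 1.

    pi maps tuples w in {0,1}^K to probabilities; k is a zero-based layer index.
    """
    return sum(p for w, p in pi.items() if w[k] == 1)

# Hypothetical parameter for K = 2 layers.
pi = {(0, 0): 0.6, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.2}
p_first = marginal_edge_prob(pi, 0)   # pi^(1,0) + pi^(1,1)
p_second = marginal_edge_prob(pi, 1)  # pi^(0,1) + pi^(1,1)

# The two layers are independent within a pair only if the joint probability
# of both edges equals the product of the two marginals.
layers_independent = abs(pi[(1, 1)] - p_first * p_second) < 1e-12
```

With these values each layer is marginally an Erdős-Rényi graph, yet the probability of observing both edges differs from the product of the marginals, so the two kinds of relations are dependent within a pair.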
Moreover, any conditional distribution of $X^k_{ij}$ given $(X^l_{ij})_{l \in S_{\setminus k}}$ (where $S_{\setminus k}$ is a subset of

References
Stochastic blockmodels: First steps
On comparing partitions
Community Structure in Time-Dependent, Multiscale, and Multiplex Networks
Assessing a mixture model for clustering with the integrated completed likelihood
Estimation and prediction for stochastic blockstructures