
Accuracy of variational estimates for random graph mixture models

TL;DR: In this paper, four different variational estimates of the parameters of these models are compared through simulation studies; the comparison shows that the variational Bayes estimates seem the most accurate for moderate graph sizes.
Abstract: Variational and variational Bayes techniques are popular approaches for statistical inference of complex models but their theoretical properties are still not well known. Because of both unobserved variables and intricate dependency structures, mixture models for random graphs constitute a good case study. We first present four different variational estimates for the parameters of these models. We then compare their accuracy through simulation studies and show that the variational Bayes estimates seem the most accurate for moderate graph size. We finally re-analyse the regulatory network of Escherichia coli with this approach.


HAL Id: inria-00494740
https://hal.inria.fr/inria-00494740
Submitted on 24 Jun 2010
Accuracy of Variational Estimates for Random Graph Mixture Models
Steven Gazal, Jean-Jacques Daudin, Stéphane Robin
To cite this version:
Steven Gazal, Jean-Jacques Daudin, Stéphane Robin. Accuracy of Variational Estimates for Random Graph Mixture Models. 42èmes Journées de Statistique, 2010, Marseille, France. ⟨inria-00494740⟩

Accuracy of Variational Estimates for Random Graph Mixture Models

Steven Gazal (a,b) & Jean-Jacques Daudin (a,b) & Stéphane Robin (a,b)

(a) AgroParisTech, UMR 518, F-75005, Paris, France
(b) INRA, UMR 518, F-75005, Paris, France
Network analysis has attracted growing interest in recent years. Data consisting of measured relationships between items are increasingly available, and they depart from the usual individuals-by-variables structure of a data set in favour of an individuals-by-individuals structure. Such "relational" data are very often displayed as a graph, even though this representation has its limits, in particular when the number of individuals exceeds a hundred or so. The graphical representation of network data is thus attractive, but it calls for a synthetic model.

The oldest and most widely used graph model is the Erdős-Rényi mixture model. It is a simple model whose average or asymptotic properties are known: degree distribution, graph density, clustering coefficient, mean diameter, and so on. The likelihood of this model is very simple to write down, but its computation time grows exponentially with the number of individuals, so the usual estimation algorithms such as EM cannot be used. A variational approach has been used as an alternative to implement an algorithm for estimating the model parameters, even for very large networks (Daudin et al. (2008)).

The variational method is an optimization technique that finds the maximum of a function by optimizing a lower bound (Jaakkola (2000)). The solutions of these problems are often fixed-point equations.
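As a hedged sketch of that principle (with notation chosen here, not taken from the paper): for observations $X$, hidden variables $Z$ and parameters $\theta$, any distribution $q(Z)$ provides the lower bound

$$\log p(X;\theta) \;\geq\; \mathcal{J}(q,\theta) \;=\; \mathbb{E}_q[\log p(X,Z;\theta)] - \mathbb{E}_q[\log q(Z)],$$

and, for a factorized (mean-field) family $q(Z)=\prod_i q_i(Z_i)$, maximizing $\mathcal{J}$ one coordinate at a time yields updates of the form $q_i(Z_i) \propto \exp\{\mathbb{E}_{q_{-i}}[\log p(X,Z;\theta)]\}$, a system of fixed-point equations that is solved by iterating until stabilization.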
The statistical properties of the estimators produced by this approach are, however, poorly understood. Gunawardana and Byrne (2005) proved that the variational algorithm converges to a solution that minimizes the divergence but is not a stationary point of the likelihood, except for degenerate models. McGrory and Titterington (2007, 2009) studied the properties of variational estimators in terms of model selection for Gaussian mixtures and hidden Markov models. Apart from these works, we have no further information on the statistical properties of variational estimators.

The aim of our work is therefore to study by simulation the convergence of several variational estimators. We study the classical variational estimator (VEM), the Belief Propagation estimator (BP) (Yedidia et al. (2003)), and an extension to the Bayesian setting (VB) (Beal & Ghahramani (2003)) in which the parameters are treated as hidden variables.

We first simulated networks of "restricted" size in order to compare the quality and precision of the EM and variational estimators. We then varied the graph size to study the convergence of the variational estimators alone.
The results show good convergence of the variational estimators, with a quality very close to that of the EM estimators.
We also proved the consistency of the variational estimator theoretically. The proof will not be presented, but its main ideas will be outlined.
Finally, we illustrate our different variational estimators, VEM and VB, on the regulatory network of E. coli.
Complex networks are increasingly studied in various domains such as the social sciences and biology. The network representation of the data is graphically attractive, but there is a clear need for a synthetic model giving an enlightening representation of complex networks. Statistical methods have been developed for analyzing complex data such as networks in a way that can reveal underlying patterns through some form of classification.
The Erdős-Rényi mixture model is one of the most widely used network models. It is very simple and its properties are well known. Its likelihood is easy to write down, but the size of the graph forbids the use of the traditional EM algorithm: the dependency between all the nodes implies that the E step must explore every configuration of the graph, which is too costly to compute when the number of nodes is large. A variational method has therefore been used to estimate the parameters in a reasonable time (Daudin et al. (2008)).
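As a hedged illustration of the model in question (the notation below is ours, following the stochastic block model formulation of Daudin et al. (2008)): each node $i$ belongs to one of $Q$ hidden groups, and edges are drawn independently given the groups,

$$Z_i \sim \mathcal{M}(1; \alpha_1,\dots,\alpha_Q), \qquad X_{ij} \mid \{Z_i = q, Z_j = \ell\} \sim \mathcal{B}(\pi_{q\ell}),$$

so that the observed-data likelihood sums over all $Q^n$ possible group configurations,

$$p(X;\alpha,\pi) = \sum_{z} \prod_{i} \alpha_{z_i} \prod_{i<j} \pi_{z_i z_j}^{X_{ij}} (1-\pi_{z_i z_j})^{1-X_{ij}},$$

which is why the E step of EM, requiring the conditional $p(Z \mid X)$, is intractable for all but very small graphs.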
Variational methods refer to a large collection of optimization techniques (Jaakkola (2000)). They consist in replacing a complex likelihood by a lower bound that is simpler to compute. The statistical properties of the estimators that maximize this lower bound within an EM-type algorithm are largely unknown. Gunawardana and Byrne (2005) showed that variational methods converge to solutions that minimize a divergence, but these are not necessarily stationary points of the likelihood; they coincide with stationary points of the likelihood only in degenerate cases. McGrory and Titterington (2007, 2009) studied the properties of variational estimates in terms of model selection for Gaussian mixtures and hidden Markov models. Despite these works and others, we still do not have an overall picture of the statistical properties of variational estimates.
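To make the link between the lower bound and the divergence explicit (a standard identity, not specific to this paper): with $\mathcal{J}$ as in the sketch above,

$$\log p(X;\theta) = \mathcal{J}(q,\theta) + \mathrm{KL}\big(q(Z)\,\|\,p(Z \mid X;\theta)\big),$$

so maximizing the tractable bound $\mathcal{J}$ over $q$ amounts to minimizing the Kullback-Leibler divergence between $q$ and the true conditional distribution of the hidden variables; when $q$ is restricted to a factorized family, the resulting maximizer need not correspond to a stationary point of $\log p(X;\theta)$ itself, which is precisely the concern raised by Gunawardana & Byrne (2005).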
We present several variational estimates of the model parameters and compare their accuracy through a simulation study. We study the variational estimator (VEM), the Belief Propagation estimate (BP) (Yedidia et al. (2003)), and an extension of the variational approach to the Bayesian setting (VB) (Beal and Ghahramani (2003)), in which the parameters are treated as unobserved variables.
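For concreteness, the following is a minimal mean-field VEM sketch for a binary (Bernoulli) stochastic block model. It is our own illustrative code, not the authors' implementation: the function name vem_sbm, the random initialization and the stopping rule are assumptions made here.

    import numpy as np

    def vem_sbm(X, Q, n_iter=200, tol=1e-6, seed=0):
        """Naive mean-field VEM for a Bernoulli stochastic block model (illustrative sketch).

        X : (n, n) symmetric 0/1 adjacency matrix with a zero diagonal.
        Q : number of groups.
        Returns group proportions alpha, connectivity matrix pi, and the
        variational posteriors tau, where tau[i, q] approximates P(Z_i = q | X).
        """
        rng = np.random.default_rng(seed)
        n = X.shape[0]
        tau = rng.dirichlet(np.ones(Q), size=n)   # random initialization of q(Z_i)
        eps = 1e-10
        off_diag = 1.0 - np.eye(n)
        for _ in range(n_iter):
            # M step: closed-form maximization of the lower bound in (alpha, pi)
            alpha = tau.mean(axis=0)
            num = tau.T @ X @ tau                 # expected edge counts between groups
            den = tau.T @ off_diag @ tau          # expected numbers of possible dyads
            pi = np.clip(num / np.maximum(den, eps), eps, 1 - eps)
            # E step: one pass of the fixed-point update of tau
            log_tau = (np.log(alpha + eps)[None, :]
                       + X @ tau @ np.log(pi).T
                       + (off_diag - X) @ tau @ np.log(1 - pi).T)
            new_tau = np.exp(log_tau - log_tau.max(axis=1, keepdims=True))
            new_tau /= new_tau.sum(axis=1, keepdims=True)
            converged = np.abs(new_tau - tau).max() < tol
            tau = new_tau
            if converged:
                break
        return alpha, pi, tau

Simulating a small graph from the model and checking that alpha and pi are recovered up to a relabelling of the groups reproduces, in miniature, the kind of comparison carried out in the simulation study described below.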
First we simulated small networks in order to be able to compute the EM estimate and to compare its quality with that of the variational estimators. Then we increased the size of our networks to study the convergence of the variational estimates only.
The results show good convergence of the variational estimates and a quality close to that of the EM estimate.
We also proved the consistency of the variational estimate. The proof will not be detailed, but its main ideas will be presented.
We finally illustrate the differences between the variational estimates VEM and VB on the regulatory network of E. coli.
Keywords: Graphical models - Bayesian methods
Bibliographie
[1] Balazsi, G., Barabasi, A.-L., Oltvai, Z. (2005) Topological units of environmental signal processing in the transcriptional network of Escherichia coli. PNAS, 102(22), 7841-7846.
[2] Beal, M. J., Ghahramani, Z. (2003) The variational Bayesian EM algorithm for incomplete data: with application to scoring graphical model structures. Bayesian Statistics 7, Oxford University Press, pp. 543-52.
[3] Besag, J. (1986) On the statistical analysis of dirty pictures. J. R. Statist. Soc. B, 48 (3), 259-302.
[4] Daudin, J.-J., Picard, F., Robin, S. (2008) A mixture model for random graphs. Stat Comput, 18, 173-183.
[5] Dempster, A. P., Laird, N. M., Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. B, 39, 1-38.
[6] Gunawardana, A., Byrne, W. (2005) Convergence theorems for generalized alternating minimization procedures. J. Machine Learn. Res., 6, 2049-2073.
[7] Jaakkola, T. S. (2000) Tutorial on variational approximation methods. MIT Press.
[8] Jordan, M. I., Ghahramani, Z., Jaakkola, T., Saul, L. K. (1999) An introduction to variational methods for graphical models. Machine Learning, 37 (2), 183-233.
[9] Latouche, P., Birmelé, E., Ambroise, C. (2008) Bayesian methods for graph clustering. SSB Research Report 17.
[10] MacKay, D. J. (2003) Information Theory, Inference, and Learning Algorithms. Cambridge University Press.
[11] McGrory, C. A., Titterington, D. M. (2009) Variational Bayesian analysis for hidden Markov models. Austr. & New Zeal. J. Statist., 51 (2), 227-44.
[12] McGrory, C. A., Titterington, D. M. (2007) Variational approximations in Bayesian model selection for finite mixture distributions. Comput. Statist. and Data Analysis, 51, 5332-67.
[13] McLachlan, G., Peel, D. (2000) Finite Mixture Models. Wiley.

[14] Nowicki, K., Snijders, T. (2001) Estimation and prediction for stochastic blockstructures. J. Amer. Statist. Assoc., 96, 1077-87.
[15] Pattison, P. E., Robins, G. L. (2007) Handbook of Probability Theory with Applications. Sage Publications, Ch. Probabilistic Network Theory.
[16] Picard, F., Miele, V., Daudin, J. J., Cottret, L., Robin, S. (2009) Deciphering the connectivity structure of biological networks using MixNet. BMC Bioinformatics, 10.
[17] Shen-Orr, S. S., Milo, R., Mangan, S., Alon, U. (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genetics, 31, 64-68.
[18] Yedidia, J. S., Freeman, W. T., Weiss, Y. (2003) Understanding belief propagation and its generalizations. Exploring Artificial Intelligence in the New Millennium, 8, 239-269.
Citations
Journal ArticleDOI
TL;DR: Using an existing variational Bayes algorithm for the stochastic block models (SBMs) along with the corresponding weights for model averaging, an estimate of the graphon function as an average of SBMs with increasing number of blocks is derived.
Abstract: W-graph refers to a general class of random graph models that can be seen as a random graph limit. It is characterized by both its graphon function and its motif frequencies. In this paper, relying on an existing variational Bayes algorithm for the stochastic block models along with the corresponding weights for model averaging, we derive an estimate of the graphon function as an average of stochastic block models with increasing number of blocks. In the same framework, we derive the variational posterior frequency of any motif. A simulation study and an illustration on a social network complete our work.

42 citations

Journal ArticleDOI
TL;DR: This work establishes sufficient conditions for the groups posterior distribution to converge (as the size of the data increases) to a Dirac mass located at the actual (random) groups configuration.
Abstract: We propose a unified framework for studying both latent and stochastic block models, which are used to cluster simultaneously rows and columns of a data matrix. In this new framework, we study the behaviour of the groups posterior distribution, given the data. We characterize whether it is possible to asymptotically recover the actual groups on the rows and columns of the matrix, relying on a consistent estimate of the parameter. In other words, we establish sufficient conditions for the groups posterior distribution to converge (as the size of the data increases) to a Dirac mass located at the actual (random) groups configuration. In particular, we highlight some cases where the model assumes symmetries in the matrix of connection probabilities that prevents recovering the original groups. We also discuss the validity of these results when the proportion of non-null entries in the data matrix converges to zero.

26 citations

Posted Content
TL;DR: A selective review on probabilistic modeling of heterogeneity in random graphs focuses on latent space models and more particularly on stochastic block models and their extensions that have undergone major developments in the last five years.
Abstract: We present a selective review on probabilistic modeling of heterogeneity in random graphs. We focus on latent space models and more particularly on stochastic block models and their extensions that have undergone major developments in the last five years.

24 citations

