# Aspects of Bayesian Inference for (Curved) Exponential Families of Distributions for Graphs and Digraphs - The Linked Importance Sampler Auxiliary Variable (LISA) Metropolis-Hastings for Distributions with Intractable Normalising Constants

*Applied Statistics Seminar*

Joint Applied Statistics/Psychology Seminar

*by Dr Johan Koskinen*

*Institution:* Department of Psychology, The University of Melbourne

*Date: Wed 6th December 2006*

*Time: 11:00 AM*

*Location: Room 213, Richard Berry Building, The University of Melbourne*

*Abstract*: We consider here a probability model for the edge set of a graph that is commonly referred to as the exponential random graph model (ERGM), and its extension, the curved ERGM. Although some issues remain to be resolved about how to specify the ERGM, this class of models holds some promise for capturing network processes. Currently the favoured methods for statistical inference are Markov chain Monte Carlo (MCMC) maximum likelihood estimation (MLE) and an MCMC implementation of the Robbins-Monro algorithm, both of which rely on the properties of the method of moments for exponential family distributions. We propose instead to take a Bayesian approach that, among other things, (i) yields clearly defined answers in terms of probabilities (the asymptotic properties of the MLE are not fully understood in the case of the ERGM); (ii) offers a rich picture of uncertainty (the MLEs and approximate standard errors do not adequately reflect the uncertainty stemming from the pronounced dependencies between observations); (iii) makes allowances for penalising "degenerate parts" of the parameter space using proper subjective prior distributions; (iv) provides us with a natural and probabilistic approach for handling missing data; (v) offers a principled and probabilistic procedure for performing model selection; and (vi) provides us with posterior predictive distributions.
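For readers less familiar with the model, the ERGM and the posterior discussed above can be written in a standard form. This is a sketch in assumed notation, not taken from the talk: $x$ is the observed graph, $z(x)$ a vector of network statistics, $\theta$ the parameter, $\pi(\theta)$ the prior, and $\mathcal{X}$ the set of all graphs on the same node set.

```latex
p(x \mid \theta) = \frac{\exp\{\theta^{\top} z(x)\}}{c(\theta)},
\qquad
c(\theta) = \sum_{y \in \mathcal{X}} \exp\{\theta^{\top} z(y)\},
\qquad
\pi(\theta \mid x) \propto \pi(\theta)\,\frac{\exp\{\theta^{\top} z(x)\}}{c(\theta)}.
```

The sum defining $c(\theta)$ runs over every graph in $\mathcal{X}$, which for an undirected graph on $n$ nodes means $2^{n(n-1)/2}$ terms. In the curved ERGM, $\theta^{\top} z(x)$ is replaced by $\eta(\theta)^{\top} z(x)$ for some mapping $\eta$.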

How to implement a Bayesian inference scheme for the ERGM is, however, far from straightforward. It is clear that in all but trivial cases we have to rely on numerical methods. It is probably fair to say that as far as numerical methods go, MCMC is the gold standard. Thus far, however, efforts at designing an MCMC algorithm for the ERGM have been hampered by the fact that it is typically not possible to evaluate the normalising constant (the partition function) in the likelihood function. Although (pure) MCMC does not require that we can evaluate the normalising constant in the posterior distribution, it usually requires that we can evaluate the likelihood function. Recently an auxiliary variable MCMC (SISA; our acronym) was proposed that circumvents the need to evaluate the partition function. The key is to introduce an auxiliary variable defined on the same state space as the data. However, while SISA performs well enough to be useful for "simpler" models like the Ising model, it appears to run into serious problems when applied to the ERGM. It is not only a question of whether the mixing is good or not; rather, it is a question of whether it mixes at all. The reasons are easily understood when SISA is viewed in terms of the Simple Importance Sampler (SIS). We propose a solution (LISA) where the (single) auxiliary variable is replaced by an auxiliary variable defined on an extended state space. Whereas SISA may be seen as an algorithm that performs a one-sample-point SIS in each iteration of the Metropolis-Hastings sampler, LISA performs a bridged (linked) importance sampling (LIS) estimation in each iteration, with the number of bridging distributions and sample points chosen to tune mixing. The additional computation that LISA requires compared with SISA is negligible. We illustrate LISA when applied to the analysis of the Ising model on a 50x50 grid and a network for a New England law firm.
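The contrast between a simple and a bridged importance sampling estimate of a ratio of normalising constants can be illustrated on a toy model. The sketch below is not the LISA algorithm from the talk and does not implement linked importance sampling proper. It assumes a 4-node undirected graph with edge and triangle statistics, small enough that all 64 graphs can be enumerated, so the exact ratio is available for comparison and the draws can be made exactly rather than by MCMC. All function names and parameter values are hypothetical.

```python
import itertools
import math
import random

NODES = 4
DYADS = list(itertools.combinations(range(NODES), 2))  # the 6 possible edges


def stats(edges):
    """Assumed statistics z(x) = (number of edges, number of triangles)."""
    edge_set = set(edges)
    triangles = sum(
        1
        for a, b, c in itertools.combinations(range(NODES), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), triangles)


# Statistics of all 2^6 = 64 graphs; feasible only because the graph is tiny.
ALL_STATS = [
    stats([d for d, keep in zip(DYADS, mask) if keep])
    for mask in itertools.product([0, 1], repeat=len(DYADS))
]


def log_q(theta, z):
    """Log of the unnormalised probability, theta' z(x)."""
    return sum(t * s for t, s in zip(theta, z))


def normalising_constant(theta):
    """Exact partition function by brute-force enumeration."""
    return sum(math.exp(log_q(theta, z)) for z in ALL_STATS)


def sample_stats(theta, n, rng):
    """Exact draws of z(y), y ~ p(.|theta), using the enumeration."""
    weights = [math.exp(log_q(theta, z)) for z in ALL_STATS]
    return rng.choices(ALL_STATS, weights=weights, k=n)


def sis_ratio(theta_from, theta_to, n, rng):
    """Simple importance sampling estimate of Z(theta_to) / Z(theta_from)."""
    draws = sample_stats(theta_from, n, rng)
    total = sum(
        math.exp(log_q(theta_to, z) - log_q(theta_from, z)) for z in draws
    )
    return total / n


def bridged_ratio(theta_from, theta_to, n_bridges, n, rng):
    """Product of SIS estimates along a straight path of bridging parameters."""
    steps = n_bridges + 1
    path = [
        tuple(a + (b - a) * k / steps for a, b in zip(theta_from, theta_to))
        for k in range(steps + 1)
    ]
    estimate = 1.0
    for lo, hi in zip(path, path[1:]):
        estimate *= sis_ratio(lo, hi, n, rng)
    return estimate


rng = random.Random(0)
theta_from, theta_to = (-0.5, 0.2), (0.0, 0.0)
exact = normalising_constant(theta_to) / normalising_constant(theta_from)
print("exact ratio:   ", exact)
print("one-sample SIS:", sis_ratio(theta_from, theta_to, 1, rng))
print("bridged, n=1:  ", bridged_ratio(theta_from, theta_to, 3, 1, rng))
```

With `n=1`, `sis_ratio` corresponds to the one-sample-point estimate described above, while `bridged_ratio` spreads the move from `theta_from` to `theta_to` over several smaller steps. In a realistic ERGM the enumeration is impossible and the draws would themselves have to come from a sampler.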

BIOGRAPHY OF THE SPEAKER:

Johan Koskinen, Ph.D. Stockholm University (2005), is a Postdoctoral Fellow in the Department of Psychology, University of Melbourne. His research interests are in the areas of Bayesian analysis for social networks, cognitive social structures, longitudinal social network data, social influence, bipartite graphs, and missing data in demographic hazard rate models. In 2005 he was a visiting researcher at the Department of Psychology, University of Melbourne, and during 2006 he has been working as a researcher at the Swedish Institute for Social Research (SOFI).

*For More Information:* Dr Owen Jones: odj@mailhost.ms.unimelb.edu.au