next up previous
Next: Simulated annealing Up: Generic introduction to Previous: Review of Markov

Definition of MCMC

 

MCMC is used when we wish to simulate from a distribution known only up to a constant factor, i.e.

where is known but C is unknown and too horrible to calculate. Such problems arose in statistical physics around the time of the development of the hydrogen bomb after the end of World War II. Metropolis et al. [8] proposed to simulate a Markov chain whose stationary distribution is , using only ratios in so doing.

We skip to the more general formulation due to W. K. Hastings [2]. Later, we will see how the Metropolis et al. algorithm is a special case.

Start with any Markov chain having transition matrix over the set of states i on which is defined. Given , the idea is to simulate a random variable with distribution , i.e. . The distribution is called the proposal distribution.

Suppose we have some numbers and that we accept with probability and reject it otherwise, i.e.

The transition probabilities of this two-stage chain are:

The detailed balance condition which would imply that is the stationary distribution requires that for all i and :

To satisfy this condition, we choose the acceptance probabilities as follows:

The s have come to be called Hastings ratios.

Exercise 1 : Show that detailed balance is satisfied when the s are as above.

Metropolis et al. [8] used a symmetric Q. Then and the Hastings ratio simplifies to .

Notice that in computing the s, only the ratios of the are needed. That's how we avoid the need to compute the intractable normalizing constant of the distribution.

Another special case of the Hastings algorithm is the Gibbs sampler (Geman and Geman [1]). Here, we consider states that can be partitioned into coordinates , e.g. the ordered multilocus genotypes of n individuals in a pedigree or a spatial Markov field (also called a Gibbs state).

At each step, we update a single coordinate of the state, let's say the , conditional on the value of all coordinates excluding the . More formally, the step from to is governed by the proposal distribution:

where designates the vector i less the coordinate.

The computation of the proposal distribution must be feasible. The Gibbs sampler is used in situations where each coordinate depends only on other coordinates within some well-defined neighborhood. That's exactly what happens with the ordered multilocus genotype of an individual in a pedigree which depends only on its parents, spouses and children (see week 7, section 2).

It turns out that with the Gibbs sampler the acceptance probabilities are all equal to 1 and this eliminates the acceptance/rejection step of the algorithm.

Exercise 2 : Show that the acceptance probabilities are all equal to 1.

The coordinate to update at each step can be chosen at random. Alternatively (and more usually), we can work along the coordinates of systematically.

Exercise 3 : Give a justification of this systematic updating scheme.

Finally, notice that the Gibbs sampler is particularly well suited for simulating conditional distributions. In this case,

where represents the unconditional distribution and is hard to compute. It is easy to see that the proposal distribution can be obtained directly from the s.



next up previous
Next: Simulated annealing Up: Generic introduction to Previous: Review of Markov



Simon Cawley
Wed Apr 22 19:50:08 PDT 1998