MCMC is used when we wish to simulate from a distribution
known only up to a constant factor, i.e.

where
is known but C is unknown and too horrible to
calculate. Such problems
arose in statistical physics around the time of the development of the
hydrogen bomb after the end of World War II. Metropolis et al.
[8] proposed to simulate a Markov chain
whose
stationary distribution is
, using only ratios
in so doing.
We skip to the more general formulation due to
W. K. Hastings [2]. Later, we will see how the
Metropolis et al. algorithm is a special case.
Start with any Markov chain having transition matrix
over
the set of states i on which
is defined. Given
,
the idea is to simulate a random variable
with distribution
, i.e.
. The distribution
is called the proposal distribution.
Suppose we have some numbers
and that we accept
with probability
and reject it otherwise, i.e.

The transition probabilities
of this two-stage chain are:

The detailed balance condition which would imply that
is
the stationary distribution requires that for all i and
:

To satisfy this condition, we choose the acceptance probabilities
as follows:

The
s have come to be called Hastings ratios.
Exercise 1 : Show that detailed balance is satisfied when the
s are as above.
Metropolis et al. [8] used a symmetric
Q. Then
and the Hastings ratio simplifies to
.
Notice that in computing the
s, only the ratios of the
are needed. That's how we avoid the need to compute the
intractable normalizing
constant of the distribution.
Another special case of the Hastings algorithm is the Gibbs sampler
(Geman and Geman [1]). Here, we consider states that can be
partitioned into coordinates
, e.g. the ordered
multilocus genotypes of n individuals in a pedigree or a spatial
Markov field (also called a Gibbs state).
At each step, we update a single coordinate of the state, let's say
the
, conditional on the value of all coordinates excluding
the
. More formally, the step from
to
is governed by the
proposal distribution:

where
designates the vector i less the
coordinate.
The computation of the proposal distribution
must be feasible. The Gibbs sampler is used in situations where each
coordinate depends only on other coordinates within some well-defined
neighborhood. That's exactly what happens with the ordered multilocus
genotype of an individual in a pedigree which depends only on its
parents, spouses and children (see week 7, section 2).
It turns out that with the Gibbs sampler the acceptance probabilities
are all equal to 1 and this eliminates the acceptance/rejection step of
the algorithm.
Exercise 2 : Show that the acceptance probabilities are all
equal to 1.
The coordinate to update at each step can be chosen at
random. Alternatively (and more usually), we can work along the
coordinates of
systematically.
Exercise 3 : Give a justification of this systematic updating
scheme.
Finally, notice that the Gibbs sampler is particularly well suited for simulating conditional distributions. In this case,

where
represents the unconditional distribution and
is hard to compute. It is easy to see that the
proposal distribution
can be obtained directly from the
s.