In the context of pedigree analysis, the states of the Markov chain
are the unknown genetic states X. MCMC implementations described in
the literature have used haplotypes, genotypes and inheritance vectors
as genetic states X. Notice that X includes all the loci for all the
pedigree members (or all the non-founders in the case of inheritance
vectors). With the Gibbs sampler, X is partitioned by individual in
the pedigree, $X = (X_1, \ldots, X_n)$, where $X_i$ is the genetic state
of individual $i$.
The probability distribution that we want to simulate from is
$P_\theta(X \mid Y)$, where Y represents the observed data, including
disease phenotype, observed marker data, etc., and $\theta$ represents
the parameters of the genetic model, including recombination fractions,
penetrance functions, etc.
That probability distribution can be expressed as:
\[ P_\theta(X \mid Y) = \frac{P_\theta(X, Y)}{P_\theta(Y)} \]
The denominator is the likelihood of the data at $\theta$,
$L(\theta) = P_\theta(Y)$, and is hard or impossible to work out, but
the numerator is usually tractable. It can be written as
\[ P_\theta(X, Y) = P_\theta(Y \mid X) \, P_\theta(X) \]
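$P_\theta(X)$ is the ``prior'' probability of X, and computing
$P_\theta(Y \mid X)$ is usually straightforward. The acceptance
probabilities of the Metropolis-Hastings algorithm are computed using
only ratios of $P_\theta(X, Y)$, so the intractable denominator
$P_\theta(Y)$ cancels and is never needed.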
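The role of these ratios can be illustrated with a minimal sketch. It is a toy example, not an implementation from the literature: the "pedigree" is a single affected individual, X is that individual's genotype at one biallelic locus, the prior is Hardy-Weinberg with a hypothetical disease-allele frequency `q`, and `f` is a hypothetical table of penetrances. The names `prior`, `joint`, `metropolis_hastings` and `exact_posterior` are all assumptions made for this illustration. The sampler only ever evaluates the joint probability of X and Y, never the likelihood of Y.

```python
import random

GENOTYPES = ("AA", "Aa", "aa")  # 'a' is the (hypothetical) disease allele


def prior(x, q):
    """Hardy-Weinberg prior probability of genotype x, allele frequency q."""
    p = 1.0 - q
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}[x]


def joint(x, q, f):
    """Joint probability of genotype x and Y = 'affected': penetrance * prior."""
    return f[x] * prior(x, q)


def metropolis_hastings(n_steps, q, f, seed=0):
    """Sample genotypes given Y = 'affected' using only ratios of the joint."""
    rng = random.Random(seed)
    x = "Aa"  # starting state; must have positive joint probability
    samples = []
    for _ in range(n_steps):
        proposal = rng.choice(GENOTYPES)  # symmetric (uniform) proposal
        ratio = joint(proposal, q, f) / joint(x, q, f)
        if rng.random() < min(1.0, ratio):
            x = proposal  # accept; otherwise the chain stays at x
        samples.append(x)
    return samples


def exact_posterior(q, f):
    """Exact posterior, feasible here only because X has three states."""
    total = sum(joint(x, q, f) for x in GENOTYPES)
    return {x: joint(x, q, f) / total for x in GENOTYPES}
```

With only three states the normalizing constant is trivial to compute, which is what makes it possible to check the sampler against `exact_posterior`; in a real pedigree the sum over X is the part that cannot be done.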
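The ability to simulate genetic states X conditional on the observed data Y is an important feature of MCMC. Before the introduction of MCMC in pedigree analysis, the likelihood of the data under a specified genetic model had been expressed as an expectation that could be approximated by Monte Carlo simulation (see Ott [11]):
\[ L(\theta) = P_\theta(Y) = \sum_X P_\theta(Y \mid X) \, \frac{P_\theta(X)}{P_{\theta_0}(X)} \, P_{\theta_0}(X) = E_{\theta_0}\!\left[ P_\theta(Y \mid X) \, \frac{P_\theta(X)}{P_{\theta_0}(X)} \right] \]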
The advantage of Lange's formulation is that it allows us to simulate
X at $\theta_0$, which could be easier or more effective than simulating
X at $\theta$, the parameter value of interest.
The problem is that the distribution of X, $P_{\theta_0}(X)$, is not
conditional on the data. Most values of X are incompatible with the
data, so that $P_\theta(Y \mid X) = 0$. The result is that a large
proportion of the simulated Xs contribute nothing to the
likelihood. With MCMC, we simulate from $P_{\theta_0}(X \mid Y)$, and the
chain always steps between values of X compatible with Y.
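How wasteful unconditional simulation can be is easy to see in a small sketch. The setup is an assumption chosen for simplicity and is not taken from the text: five unrelated individuals rather than a pedigree, a fully penetrant recessive trait (affected exactly when the genotype is "aa"), and simulation at the same parameter value as the one of interest. The function names are hypothetical.

```python
import random

GENOTYPES = ("AA", "Aa", "aa")  # 'a' is the (hypothetical) disease allele


def hw_weights(q):
    """Hardy-Weinberg genotype probabilities for disease-allele frequency q."""
    p = 1.0 - q
    return [p * p, 2 * p * q, q * q]


def penetrance(y, x):
    """Probability of phenotypes y given genotypes x: 1 if compatible, else 0."""
    compatible = all((g == "aa") == affected for g, affected in zip(x, y))
    return 1.0 if compatible else 0.0


def naive_likelihood(y, q, n_sims, seed=0):
    """Average the penetrance over genotypes drawn from the prior.

    Returns the likelihood estimate and the fraction of simulated X that
    were incompatible with y and therefore contributed nothing.
    """
    rng = random.Random(seed)
    weights = hw_weights(q)
    total = 0.0
    wasted = 0
    for _ in range(n_sims):
        x = rng.choices(GENOTYPES, weights=weights, k=len(y))
        contribution = penetrance(y, x)
        total += contribution
        if contribution == 0.0:
            wasted += 1
    return total / n_sims, wasted / n_sims


def exact_likelihood(y, q):
    """Exact likelihood, available here because individuals are independent."""
    p_affected = q * q
    result = 1.0
    for affected in y:
        result *= p_affected if affected else 1.0 - p_affected
    return result
```

For example, `naive_likelihood((True, False, False, True, False), 0.3, 100000)` returns an estimate close to `exact_likelihood(...)`, but the second returned value shows that the great majority of simulated genotype vectors were incompatible with the phenotypes. The problem gets worse as the number of individuals and loci grows.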
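One thing we can do with MCMC is to estimate a likelihood ratio using the following equality (Thompson and Guo [13], Thompson [14]):
\[ \frac{L(\theta)}{L(\theta_0)} = E_{\theta_0}\!\left[ \left. \frac{P_\theta(X, Y)}{P_{\theta_0}(X, Y)} \,\right|\, Y \right] \]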
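Proof of the equality:
\[ E_{\theta_0}\!\left[ \left. \frac{P_\theta(X, Y)}{P_{\theta_0}(X, Y)} \,\right|\, Y \right]
 = \sum_X \frac{P_\theta(X, Y)}{P_{\theta_0}(X, Y)} \, P_{\theta_0}(X \mid Y)
 = \sum_X \frac{P_\theta(X, Y)}{P_{\theta_0}(X, Y)} \, \frac{P_{\theta_0}(X, Y)}{P_{\theta_0}(Y)}
 = \frac{\sum_X P_\theta(X, Y)}{P_{\theta_0}(Y)}
 = \frac{P_\theta(Y)}{P_{\theta_0}(Y)}
 = \frac{L(\theta)}{L(\theta_0)} \]
The sums run over the values of X with $P_{\theta_0}(X, Y) > 0$, so the equality requires that every X compatible with Y under $\theta$ is also compatible under $\theta_0$.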
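To estimate the likelihood ratio using MCMC, we simulate from
$P_{\theta_0}(X \mid Y)$, getting $X^{(1)}, \ldots, X^{(N)}$, and forming
\[ \frac{1}{N} \sum_{i=1}^{N} \frac{P_\theta(X^{(i)}, Y)}{P_{\theta_0}(X^{(i)}, Y)} \]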
In practice, we may want to discard the first observations because the chain has not yet reached equilibrium. This initial phase is called the burn-in.
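The estimator and the burn-in can be put together in a short sketch. As before, this is a toy model chosen so that the exact answer is available for comparison, not a method from the cited papers: one affected individual, one biallelic locus, a Hardy-Weinberg prior whose allele frequency `q` plays the role of the parameter, and a hypothetical fixed penetrance table `PENETRANCE`. All function names are assumptions.

```python
import random

GENOTYPES = ("AA", "Aa", "aa")
PENETRANCE = {"AA": 0.05, "Aa": 0.5, "aa": 0.9}  # hypothetical values


def joint(x, q):
    """Joint probability of genotype x and Y = 'affected' at allele frequency q."""
    p = 1.0 - q
    prior = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}[x]
    return PENETRANCE[x] * prior


def likelihood(q):
    """Exact likelihood of Y; only feasible because X has three states."""
    return sum(joint(x, q) for x in GENOTYPES)


def sample_conditional(n_keep, q0, burn_in, seed=0):
    """Metropolis-Hastings draws of X given Y at q0, discarding the burn-in."""
    rng = random.Random(seed)
    x = "Aa"
    draws = []
    for step in range(burn_in + n_keep):
        proposal = rng.choice(GENOTYPES)  # symmetric proposal
        if rng.random() < min(1.0, joint(proposal, q0) / joint(x, q0)):
            x = proposal
        if step >= burn_in:  # keep only post-burn-in states
            draws.append(x)
    return draws


def likelihood_ratio_estimate(draws, q, q0):
    """Average of the joint-probability ratios over the simulated states."""
    return sum(joint(x, q) / joint(x, q0) for x in draws) / len(draws)
```

A typical use is `draws = sample_conditional(20000, 0.3, burn_in=1000)` followed by `likelihood_ratio_estimate(draws, 0.35, 0.3)`, which can be compared with `likelihood(0.35) / likelihood(0.3)`. The same `draws` can be reused for any other value of `q`.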
We can estimate the likelihood ratio over a range of values of $\theta$
by simulating at a single value $\theta_0$. However, the estimate will be
good only for $\theta$ in the neighborhood of $\theta_0$, where the
distribution $P_\theta(X \mid Y)$ is close to $P_{\theta_0}(X \mid Y)$.
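The loss of precision away from the simulation value can be quantified exactly in a toy setting. The sketch below is an illustration under stated assumptions, not a result from the text: `n` unrelated affected individuals, one biallelic locus, a Hardy-Weinberg prior with allele frequency `q` as the parameter, and a hypothetical penetrance table. It computes the coefficient of variation of a single term of the likelihood-ratio average when X is drawn from the conditional distribution at `q0`. The function name `weight_cv` is hypothetical.

```python
import math

GENOTYPES = ("AA", "Aa", "aa")
PENETRANCE = {"AA": 0.05, "Aa": 0.5, "aa": 0.9}  # hypothetical values


def joint(x, q):
    """Joint probability of genotype x and Y = 'affected' at allele frequency q."""
    p = 1.0 - q
    prior = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}[x]
    return PENETRANCE[x] * prior


def weight_cv(q, q0, n):
    """Coefficient of variation of one ratio term for n independent individuals.

    For one individual the term is w(x) = joint(x, q) / joint(x, q0), with x
    drawn from the conditional distribution at q0. For n independent
    individuals the term is a product of n such factors, so its first and
    second moments are the n-th powers of the single-individual moments.
    """
    norm = sum(joint(x, q0) for x in GENOTYPES)
    mean = 0.0
    second = 0.0
    for x in GENOTYPES:
        post0 = joint(x, q0) / norm
        w = joint(x, q) / joint(x, q0)
        mean += post0 * w
        second += post0 * w * w
    # max() guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, (second / (mean * mean)) ** n - 1.0))
```

With `q0 = 0.3` and `n = 20`, `weight_cv(0.3, 0.3, 20)` is essentially zero, and the value grows steadily as `q` moves to 0.35, 0.45 and 0.6. A larger coefficient of variation means that proportionally more simulated states are needed for the same precision, which is one way to read "good only in the neighborhood" of the simulation value.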