In this section we consider modifications to our analysis necessary
when we have marker data at or near a locus of interest on sibs and
possibly parents too, not necessarily determining IBD status. Let's begin by looking at some examples.
Example A. At a single marker locus, parental mating type is
and sib genotypes are
and
. Clearly, the sibs share DNA IBD on 1 chromosome (paternally inherited allele "1").
Example B. As above, but parental mating type is
and
sib genotypes are
and
. Again, it is clear that the sibs share DNA IBD on 0 chromosomes at this locus.
Example C. Parental mating type now is
, and sib
genotypes are (a)
,
, or (b)
,
. In case (a) we
could have IBD = 0 (the "3" alleles are distinct by descent, i.e. come
from different maternal chromosomes) or IBD = 1 (the "3" alleles are
IBD, i.e. come from the same maternal chromosome). Intuitively, these
two possibilities are equally likely, given no extra information, but
what if the sibs are both affected? Similarly, for case (b), the sibs
can share DNA IBD on either 1 or 2 chromosomes.
Example D. Parental mating type is
and sib
genotypes are (a)
,
; or (b)
,
; or (c)
,
; or (d)
,
. It is clear that in cases (a), (b) and
(c), the sibs share 2, 1 and 0 alleles IBD at this locus, respectively, while in
case (d) they could share 0 (the two "1" alleles come from different parents, as do the two "2"
alleles) or 2 (the two "1" alleles come from the same parent, as do the two "2"
alleles). Again, it is intuitively clear that these last two options
are equally probable, given no further information. To check this a bit more formally, list the 16 possible inheritance vectors, and look at those
compatible with these data. If we consider the ordered parental
genotype
compatible with mating type
,
where
denotes a maternal allele, then the four inheritance vectors
compatible with the data on the sibs are
,
,
, and
, which involve sharing of 2, 0, 0 and
2 alleles IBD, respectively. These 4 vectors are equally likely given
no further information. Similarly for the other three ordered parental
genotypes. What if both sibs are affected?
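The enumeration described above can be sketched in a few lines of Python. This is only a minimal sketch, assuming that the mating type in Example D is 1/2 x 1/2 and that both sibs are 1/2 in case (d), which is what the description of case (d) suggests. The names FATHER, MOTHER, sib_genotype, ibd_count and compatible_vectors are illustrative and not part of the original analysis. An inheritance vector is coded here as (p1, m1, p2, m2), where each entry is 0 or 1 and says which of the father's (p) or mother's (m) two alleles is passed to sib 1 or sib 2.

```python
from itertools import product

# Assumed ordered parental genotypes for mating type 1/2 x 1/2.
# Index 0 or 1 selects which of the parent's two alleles is transmitted.
FATHER = (1, 2)
MOTHER = (1, 2)


def sib_genotype(pat_idx, mat_idx):
    """Unordered genotype of a sib receiving FATHER[pat_idx] and MOTHER[mat_idx]."""
    return tuple(sorted((FATHER[pat_idx], MOTHER[mat_idx])))


def ibd_count(v):
    """Number of alleles shared IBD for inheritance vector v = (p1, m1, p2, m2)."""
    p1, m1, p2, m2 = v
    return int(p1 == p2) + int(m1 == m2)


def compatible_vectors(sib1, sib2):
    """Inheritance vectors (out of 16) that reproduce the observed sib genotypes."""
    return [
        v for v in product((0, 1), repeat=4)
        if sib_genotype(v[0], v[1]) == sib1 and sib_genotype(v[2], v[3]) == sib2
    ]


# Case (d) as assumed here: both sibs have genotype 1/2.
vectors = compatible_vectors((1, 2), (1, 2))
ibd_values = [ibd_count(v) for v in vectors]
for v, k in zip(vectors, ibd_values):
    print(v, "IBD =", k)
# (0, 1, 0, 1) IBD = 2
# (0, 1, 1, 0) IBD = 0
# (1, 0, 0, 1) IBD = 0
# (1, 0, 1, 0) IBD = 2
```

Under these assumptions the four compatible vectors share 2, 0, 0 and 2 alleles IBD, in agreement with the count given above.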
Let pgm denote ordered parental genotypes at the marker, mtm
parental mating type at the marker (as defined in Section 2.2), and
sgm sib genotypes at the marker (unordered, and the two sibs can be
permuted). The observed marker data are denoted m, with
m = (mtm, sgm), or m = sgm
if no parental data are available.
With no phenotype information on the sibs

where pgm is any ordered genotype compatible with mtm.
The general approach to incomplete IBD information is to expand
in terms of expressions which we can evaluate. To do this we need the following additional genetic assumptions:
Assumption G3. Within a family, sib phenotypes
are conditionally independent of any marker genotype data given
multilocus genotypes at the DS loci.
Assumption G4. There is linkage equilibrium
between marker and DS loci, i.e. parental genotypes at the marker are independent of parental genotypes at the DS loci.
This seems like a fairly strong assumption, and it clearly excludes "markers" right on top of a DS locus. Nevertheless, we are unable to get the conclusion of the following proposition without it. Of course, one doesn't need the proposition if IBD can be established directly.


In practice, we don't need to sum over all parental genotypes pgm,
since
,

Hence

where the first sum is over all parental mating types mtm at the marker
compatible with observed parental marker data (if any), and pgm is
any ordered parental genotype compatible with mtm. When the parental
mating type mtm is known, likelihood-based tests for linkage
do not depend on
, the parental mating-type frequencies.
Let's use this proposition on Examples C and D(d) above. With Example
C, we have
and
. There are 8 inheritance
vectors consistent with sgm and a representative ordered parental genotype
, namely
,
,
, and
together with 4 other vectors obtained by permuting the two sibs. Of these 8 vectors, 4 have IBD = 0 and 4 have IBD = 1. Hence,

Similarly, with Example D(d),

If we had two sibs with marker genotypes
and NO
parental information, it would be necessary to sum over those parental
mating types at the marker compatible with sgm, specifically
. The resulting probability
would then involve the
parental mating type frequencies, typically calculated under the
assumptions of Hardy-Weinberg equilibrium and random mating, from
observed marker allele frequencies. Just how sensitive the results
will be to violations of these (usually unexamined) assumptions is difficult to say.
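To see how the mating-type frequencies enter in the absence of parental data, the following sketch computes the distribution of IBD given the sib genotypes alone, with no phenotype information. It is an illustration under stated assumptions rather than the calculation of the text: ordered parental genotypes are weighted by products of allele frequencies (Hardy-Weinberg equilibrium and random mating), the 16 inheritance vectors are taken as equally likely, and the function name ibd_given_sibs and the allele frequencies in the usage lines are hypothetical.

```python
from fractions import Fraction
from itertools import product


def ibd_given_sibs(sib1, sib2, freqs):
    """P(IBD = j | unordered sib genotypes), with no parental or phenotype data.

    freqs maps each marker allele to its population frequency. Ordered
    parental genotypes are weighted under Hardy-Weinberg equilibrium and
    random mating; inheritance vectors are treated as equally likely.
    """
    # The sib pair is unordered, so compare sorted pairs of sorted genotypes.
    target = sorted([tuple(sorted(sib1)), tuple(sorted(sib2))])
    weight = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for f1, f2, m1, m2 in product(list(freqs), repeat=4):
        father, mother = (f1, f2), (m1, m2)
        p_parents = freqs[f1] * freqs[f2] * freqs[m1] * freqs[m2]
        for a, b, c, d in product((0, 1), repeat=4):
            s1 = tuple(sorted((father[a], mother[b])))
            s2 = tuple(sorted((father[c], mother[d])))
            if sorted([s1, s2]) == target:
                weight[int(a == c) + int(b == d)] += p_parents / 16
    total = sum(weight.values())  # zero if the genotypes use unknown alleles
    return {j: w / total for j, w in weight.items()}


# Hypothetical two-allele marker with equal allele frequencies; both sibs 1/2.
freqs = {1: Fraction(1, 2), 2: Fraction(1, 2)}
post = ibd_given_sibs((1, 2), (1, 2), freqs)
print(post)  # IBD 0, 1, 2 receive probabilities 1/5, 2/5, 2/5
```

Changing the allele frequencies in freqs changes the result, which is one way to probe the sensitivity to the frequency assumptions mentioned above.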
What do we do with these expressions? Under the sampling assumptions
introduced earlier (Assumptions S1, S2), plus the following
additional assumption, we calculate the likelihood of marker data
,
, on n ASPs.
Assumption S3. For a particular sib-pair, the
parental genotypes at the marker and at the DS loci are independent of
any phenotype and marker data from other families, i.e.

where OTH denotes any marker and phenotype data from other families.
Under Assumptions S1, S2, S3, the likelihood of the marker data given the phenotype data is:

We can now go on to carry out likelihood-based tests as before,
e.g. a likelihood ratio test of
vs.
or
of
vs.
, or a score
test of
. Maximization of the likelihood with
respect to the
's is done using the EM algorithm.
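As an illustration of the EM step, here is a minimal sketch. It assumes that the likelihood has the mixture form of a product over sib-pairs i of sum_j pi_j * w_ij, where pi_j is the probability that an ASP shares j alleles IBD and w_ij is a known weight computed from the marker data of pair i given IBD = j. That form, the symbol names pi and w, and the function name em_ibd_proportions are assumptions made for this example. The sketch also ignores any constraints on the pi's.

```python
def em_ibd_proportions(weights, n_iter=200, tol=1e-10):
    """EM for pi = (pi_0, pi_1, pi_2) maximizing prod_i sum_j pi_j * weights[i][j].

    weights[i][j] is the (assumed known) weight of pair i's marker data
    given IBD = j; each row must have at least one positive entry.
    """
    pi = [0.25, 0.5, 0.25]  # start at the no-linkage IBD distribution
    for _ in range(n_iter):
        counts = [0.0, 0.0, 0.0]
        # E-step: posterior IBD probabilities for each pair, summed over pairs.
        for w in weights:
            denom = sum(p * wj for p, wj in zip(pi, w))
            for j in range(3):
                counts[j] += pi[j] * w[j] / denom
        # M-step: the new pi is the average posterior IBD distribution.
        new_pi = [c / len(weights) for c in counts]
        if max(abs(a - b) for a, b in zip(pi, new_pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


# Hypothetical data: three pairs known to share 2 alleles IBD, one sharing 1.
weights = [(0, 0, 1), (0, 0, 1), (0, 0, 1), (0, 1, 0)]
pi_hat = em_ibd_proportions(weights)
print(pi_hat)  # the estimate is the observed proportions (0, 1/4, 3/4)
```

When every pair is fully informative, as in this usage example, the estimate reduces to the observed IBD proportions. Pairs with ambiguous IBD status contribute fractional counts in the E-step.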