
Sib-pair multilocus analysis

 

We first explain the multilocus (multipoint) calculations in the context of the incomplete IBD information on sibs discussed in section 9 of week 6. We consider the case where only the sibs are observed. For example, the sibs' multilocus genotypes could be:

The sibs' unordered multilocus genotypes contain information on the haplotypes they inherited from their parents. Here, for example, the most likely haplotypes as reconstructed by MAPMAKER/SIBS are:

Computing haplotype probabilities conditional on the marker data is not, however, the final step of the analysis. The probability of interest is $P(\mathrm{IBD}_x \mid m)$, where $\mathrm{IBD}_x$ is the IBD status at locus x (cf. week 6, section 9) and $m$ is the multilocus marker data; it is derived from the haplotype probabilities.

In what follows, the notation from week 6 is used unless otherwise specified. One of the elements of the HMM is the probability $P(m_l \mid v_l)$ of the observed marker genotypes $m_l$ given the inheritance vector $v_l$ at marker locus l. To compute it, we must condition on the ordered parental genotypes at the marker ($g_l$):

$$P(m_l \mid v_l) = \sum_{g_l} P(m_l \mid v_l, g_l)\, P(g_l \mid v_l) = \sum_{g_l} P(m_l \mid v_l, g_l)\, P(g_l).$$

We can write the last expression because the ordered parental genotypes $g_l$ are independent of the inheritance vectors under Assumption G2. Note that $P(m_l \mid v_l, g_l)$ is an indicator, since $v_l$ and $g_l$ completely determine the sibs' genotypes at the marker ($m_l$).

$P(g_l)$ is a product of allele frequencies under the random mating and HW assumptions. $P(m_l \mid v_l, g_l)$ is 1 if the marker genotype is consistent with $v_l$ and $g_l$, and 0 otherwise. For a marker with k alleles, there are $k^4$ terms in the summation, one for each ordered assignment of alleles to the two parents. This number is usually small enough that the computation is done quickly, but we will see that in general pedigrees the complexity increases and it becomes advantageous to restrict the summation to the terms where $P(m_l \mid v_l, g_l)$ is 1.
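The summation over ordered parental genotypes described above can be sketched in Python. This is a minimal illustration, not the MAPMAKER/SIBS implementation. The function name `emission_prob`, the encoding of the inheritance vector as four bits `(p1, m1, p2, m2)` (which of the father's and mother's two ordered alleles each sib received), and the allele-frequency dictionary are all assumptions introduced here for the example.

```python
from itertools import product


def emission_prob(sib1, sib2, v, freqs):
    """Probability of the sibs' unordered marker genotypes given an inheritance vector.

    sib1, sib2 : unordered genotypes of the two sibs, e.g. (1, 2)
    v          : (p1, m1, p2, m2), each 0 or 1 (assumed encoding): p1 picks which
                 of the father's two ordered alleles sib 1 received, m1 which of
                 the mother's, and likewise p2, m2 for sib 2
    freqs      : dict mapping each allele to its population frequency
    """
    p1, m1, p2, m2 = v
    target1, target2 = sorted(sib1), sorted(sib2)
    total = 0.0
    # k^4 ordered parental genotypes: (father allele 0, father allele 1,
    # mother allele 0, mother allele 1)
    for f0, f1, mo0, mo1 in product(freqs, repeat=4):
        father, mother = (f0, f1), (mo0, mo1)
        # Indicator term: the parental genotypes and v determine the sibs' genotypes.
        consistent = (
            sorted((father[p1], mother[m1])) == target1
            and sorted((father[p2], mother[m2])) == target2
        )
        if consistent:
            # Product of allele frequencies (random mating and HW assumed).
            total += freqs[f0] * freqs[f1] * freqs[mo0] * freqs[mo1]
    return total
```

For a two-allele marker the loop visits 2^4 = 16 ordered parental genotypes. With `freqs = {1: 0.3, 2: 0.7}`, calling `emission_prob((1, 1), (1, 1), (0, 0, 0, 0), freqs)` sums only the terms in which the two transmitted alleles are both allele 1.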

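The next elements we need are the transition matrices between adjacent marker loci. They have the same form as the transition matrix between a disease and a marker locus described in section 5 of week 6, namely:

$$P(v_{l+1} = w \mid v_l = v) = \theta_l^{\,d(v,w)}\,(1-\theta_l)^{4-d(v,w)},$$

where $d(v, w)$ is the number of positions (out of the four meioses of a sib pair) at which the inheritance vectors $v$ and $w$ differ, $\theta_l$ is the recombination fraction in interval l, and l indexes the intervals between markers.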
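As an illustration of such a transition matrix, the sketch below builds it for a sib pair under two assumptions that are made here and should be checked against week 6: the inheritance vector is coded as four bits (one per meiosis), and a single sex-averaged recombination fraction `theta` applies to every meiosis. The name `transition_matrix` is hypothetical.

```python
from itertools import product


def transition_matrix(theta, n_meioses=4):
    """Transition probabilities between inheritance vectors at adjacent loci.

    Each of the n_meioses bits flips independently with probability theta,
    so an entry depends only on the number of bits that differ.
    """
    states = list(product((0, 1), repeat=n_meioses))
    T = {}
    for v in states:
        for w in states:
            d = sum(a != b for a, b in zip(v, w))  # number of recombinant meioses
            T[v, w] = theta ** d * (1 - theta) ** (n_meioses - d)
    return states, T
```

Each row of the resulting matrix sums to 1, and at `theta = 0.5` every entry equals 1/16, which corresponds to unlinked loci.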

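We now have all the ingredients to set up forward and backward recurrence relations to compute $P(v_l \mid m)$, the inheritance distribution at marker locus l conditional on the multilocus marker data $m = (m_1, \dots, m_L)$. This is similar to what we did in section 5.1 of week 3. Define the forward variables $\alpha_l(v) = P(m_1, \dots, m_l, v_l = v)$. Then

$$\alpha_1(v) = P(v_1 = v)\, P(m_1 \mid v_1 = v), \qquad \alpha_{l+1}(w) = P(m_{l+1} \mid v_{l+1} = w) \sum_v \alpha_l(v)\, P(v_{l+1} = w \mid v_l = v).$$

Similarly, for the backward variables $\beta_l(v) = P(m_{l+1}, \dots, m_L \mid v_l = v)$,

$$\beta_L(v) = 1, \qquad \beta_l(v) = \sum_w P(v_{l+1} = w \mid v_l = v)\, P(m_{l+1} \mid v_{l+1} = w)\, \beta_{l+1}(w).$$

We can see that

$$P(m, v_l = v) = \alpha_l(v)\, \beta_l(v).$$

The inheritance distribution at locus l is given by the $\gamma$-variables

$$\gamma_l(v) = P(v_l = v \mid m) = \frac{\alpha_l(v)\, \beta_l(v)}{\sum_w \alpha_l(w)\, \beta_l(w)},$$

which are functions of the forward and backward variables.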
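The forward and backward recurrences can be sketched as below. This is a minimal, unscaled version suitable only for a handful of markers, not a definitive implementation. It assumes a four-bit inheritance vector with a uniform prior over the 16 states, one recombination fraction per interval, and per-locus emission probabilities supplied as lists indexed in the order produced by `itertools.product`. The names `forward_backward` and `_transition` are hypothetical.

```python
from itertools import product


def _transition(theta, states):
    """Matrix of transition probabilities; each meiosis recombines w.p. theta."""
    n = len(states[0])
    rows = []
    for v in states:
        row = []
        for w in states:
            d = sum(a != b for a, b in zip(v, w))
            row.append(theta ** d * (1 - theta) ** (n - d))
        rows.append(row)
    return rows


def forward_backward(emissions, thetas, n_meioses=4):
    """Return (states, gamma), gamma[l][i] = P(state i at locus l | all marker data).

    emissions[l][i] : probability of the marker data at locus l given state i
    thetas[l]       : recombination fraction between loci l and l + 1 (0-based)
    """
    states = list(product((0, 1), repeat=n_meioses))
    S, L = len(states), len(emissions)
    T = [_transition(theta, states) for theta in thetas]

    # Forward pass: alpha[l][i] = P(data at loci 0..l, state i at locus l).
    alpha = [[e / S for e in emissions[0]]]  # uniform prior 1/S (assumption)
    for l in range(1, L):
        prev = alpha[-1]
        alpha.append([
            emissions[l][w] * sum(prev[v] * T[l - 1][v][w] for v in range(S))
            for w in range(S)
        ])

    # Backward pass: beta[l][i] = P(data at loci l+1..L-1 | state i at locus l).
    beta = [[1.0] * S for _ in range(L)]
    for l in range(L - 2, -1, -1):
        beta[l] = [
            sum(T[l][v][w] * emissions[l + 1][w] * beta[l + 1][w] for w in range(S))
            for v in range(S)
        ]

    # Combine and normalise; the normaliser is the likelihood of the marker data.
    gamma = []
    for l in range(L):
        joint = [a * b for a, b in zip(alpha[l], beta[l])]
        total = sum(joint)  # zero only if the data are impossible under the model
        gamma.append([x / total for x in joint])
    return states, gamma
```

With many markers the forward and backward variables become very small, so a practical version would rescale them at each locus; that refinement is omitted here to keep the recurrences visible.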

From there, the conditional inheritance distribution at a locus x on which we have no data is obtained as a function of the $\gamma$-variables and the recombination fractions between the point x and the marker l. We get the IBD probabilities at locus x (or at a marker locus) by summing over the inheritance vectors corresponding to IBD = 0, 1 or 2. The use of $P(\mathrm{IBD}_x \mid m)$ in linkage analysis on sib pairs is described in section 9.6 of week 6.
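The last two steps can be sketched as follows, again under assumptions introduced for the example: the inheritance vector is coded as `(p1, m1, p2, m2)`, so that the sibs share their paternal allele IBD when `p1 == p2` and their maternal allele when `m1 == m2`. The helper names `propagate` and `ibd_probs` are hypothetical. The one-marker propagation shown here is exact only when all marker data lie on one side of x (for instance, x beyond an end marker); for a point between two markers the data on both sides contribute.

```python
from itertools import product

# All 16 inheritance vectors (p1, m1, p2, m2); the encoding is an assumption.
STATES = list(product((0, 1), repeat=4))


def propagate(dist, theta):
    """Carry an inheritance distribution across recombination fraction theta."""
    out = []
    for w in STATES:
        total = 0.0
        for p, v in zip(dist, STATES):
            d = sum(a != b for a, b in zip(v, w))  # meioses that recombined
            total += p * theta ** d * (1 - theta) ** (4 - d)
        out.append(total)
    return out


def ibd_probs(dist):
    """Return [P(IBD=0), P(IBD=1), P(IBD=2)] from a distribution over STATES."""
    probs = [0.0, 0.0, 0.0]
    for p, (p1, m1, p2, m2) in zip(dist, STATES):
        shared = int(p1 == p2) + int(m1 == m2)  # alleles shared IBD
        probs[shared] += p
    return probs
```

A uniform distribution over the 16 vectors gives the familiar prior IBD probabilities of 1/4, 1/2 and 1/4, which is a quick sanity check on the encoding.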

The assumption of linkage equilibrium between the marker loci (Assumption G6), which is required to establish the recurrence relations, is a strong assumption on top of the random mating and HW assumptions made to compute the probability $P(g_l)$ of the ordered parental genotypes. It is the price we pay for not observing the parents' genotypes. However, if these assumptions seem reasonable, genotyping only the sibs and not their parents at a number of markers (not too widely spaced) is a big saving in genotyping work with almost no loss of information. Researchers can then employ their resources to type more sib pairs.






Simon Cawley
Thu Apr 16 15:30:12 PDT 1998