
Multilocus analysis on general pedigrees

 

The multilocus approach described in section 3.2 extends to pedigrees of arbitrary structure. The program GENEHUNTER (Kruglyak et al. [1]) was created to perform multilocus linkage analysis in this setting.

We consider an arbitrary pedigree with f founders and n non-founders. (A sib-pair pedigree is the special case with 2 founders and 2 non-founders.) The length of the inheritance vector is now 2n, and the number of possible vectors is $2^{2n}$. The multilocus genotypes of some number of pedigree members (possibly including founders) have been collected and constitute the marker data m.

The computation of the probability of the marker data at a locus given the inheritance vector has to be reformulated. We give distinct labels to the alleles carried by the founders at a marker locus, as we did in section 1.1 of week 6, and we call them "genes", in the broad sense of a heritable DNA segment, to distinguish them from the observable allele types.

An allelic assignment is then defined as a mapping from the founder genes to the allele types. For example, the assignment could be:

When the unordered genotypes of the founders are known, the actual assignment is known up to a permutation of the labels within a founder. When the genotypes of some or all founders are missing, we have to sum over the allelic assignments for the missing founders, just as we summed over ordered parental genotypes in section 3.2 above.
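The enumeration just described can be sketched in a few lines of Python. This is only an illustration: the function names, the allele types "A" and "B", and the convention that each founder carries two consecutive genes are assumptions made for the example, not part of GENEHUNTER.

```python
from itertools import product

def founder_assignments(genotype, allele_types):
    """Ordered allele pairs (gene 1, gene 2) that one founder may carry.

    genotype: an unordered pair such as ("A", "B"), or None if missing.
    """
    if genotype is None:
        # Missing founder: every ordered pair of allele types is possible.
        return list(product(allele_types, repeat=2))
    x, y = genotype
    # Known founder: the assignment is fixed up to swapping the two labels.
    return sorted({(x, y), (y, x)})

def allelic_assignments(founder_genotypes, allele_types):
    """All assignments of allele types to the 2f founder genes."""
    per_founder = [founder_assignments(g, allele_types)
                   for g in founder_genotypes]
    # Concatenate the per-founder pairs into one tuple of length 2f.
    return [sum(combo, ()) for combo in product(*per_founder)]

allele_types = ["A", "B"]
# Both founders typed: only the within-founder label swap remains.
known = allelic_assignments([("A", "B"), ("A", "A")], allele_types)
# Second founder untyped: all of its ordered pairs must be summed over.
partly_missing = allelic_assignments([("A", "B"), None], allele_types)
```

With both founders typed, `known` holds two assignments (the heterozygous founder's labels can be swapped, the homozygous founder's cannot). With the second founder missing, `partly_missing` holds 2 x 4 = 8 assignments.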

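Denote by $a_l$ the vector of alleles assigned to the founders at locus l. Then the probability of the marker data $m_l$ at locus l given the inheritance vector $v$ is

$$P(m_l \mid v) = \sum_{a_l} P(a_l) \prod_{j=1}^{n} P(g_{jl} \mid a_l, v)$$

where $g_{jl}$ is the genotype of the non-founder j at locus l and may be either observed or missing.

The probability $P(a_l)$ of the vector of assigned alleles is computed as a product of allele frequencies by assuming independent sampling of the founder genes from the gene pool of the population (random mating). $P(g_{jl} \mid a_l, v)$ is 1 if $g_{jl}$ is compatible with $a_l$ and $v$ and 0 otherwise.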
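As a rough sketch of this computation, the brute-force version below sums the product of allele frequencies over every allelic assignment that is compatible with the data, for a fixed inheritance vector. The data layout (founders first, two inheritance bits per non-founder, genotypes as sorted tuples) and all function names are assumptions chosen for the illustration. It deliberately ignores the efficient restriction to non-zero terms discussed next.

```python
from itertools import product

def gene_flow(n_founders, parents, v):
    """Founder-gene labels carried by each individual under vector v.

    parents: (father, mother) indices for each non-founder, listed so that
    parents come before their children. v holds two bits per non-founder:
    which of the father's two genes was passed, then which of the mother's.
    """
    genes = [(2 * i, 2 * i + 1) for i in range(n_founders)]
    for j, (father, mother) in enumerate(parents):
        genes.append((genes[father][v[2 * j]], genes[mother][v[2 * j + 1]]))
    return genes

def marker_likelihood(n_founders, parents, v, observed, freqs):
    """P(marker data at one locus | v), summing over allelic assignments.

    observed: one unordered genotype (sorted tuple) or None per individual.
    freqs: dict mapping allele type to population frequency.
    """
    genes = gene_flow(n_founders, parents, v)
    total = 0.0
    for a in product(freqs, repeat=2 * n_founders):
        # Indicator term: 1 if every observed genotype matches, else 0.
        compatible = all(
            g is None or tuple(sorted((a[p], a[q]))) == g
            for g, (p, q) in zip(observed, genes)
        )
        if compatible:
            prob = 1.0
            for allele in a:  # independent sampling of founder genes
                prob *= freqs[allele]
            total += prob
    return total

# Sib pair: founders 0 and 1 untyped, both sibs observed as A/B.
freqs = {"A": 0.5, "B": 0.5}
sibs = [(0, 1), (0, 1)]
observed = [None, None, ("A", "B"), ("A", "B")]
ibd2 = marker_likelihood(2, sibs, (0, 0, 0, 0), observed, freqs)
ibd0 = marker_likelihood(2, sibs, (0, 0, 1, 1), observed, freqs)
```

Here `ibd2` is 0.5 and `ibd0` is 0.25: two sibs with the same heterozygous genotype are more probable when they inherit the same two founder genes than when they share none.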

With k allele types at a marker locus, there are $k^{2f}$ different allelic assignments, a number that grows exponentially with the number of founders in the pedigree. Most of these assignments, however, are incompatible with the genotype data, so summing over all of them wastes time. An efficient algorithm has been developed that restricts the summation to the non-zero terms (Sobel and Lange [6], Kruglyak et al. [1]). The algorithm is efficient because the number of operations it involves grows only linearly with the number of founders. A description of the algorithm can be found in appendix A.

The transition matrix between adjacent marker loci generalizes the sib-pair transition matrix.

The $\alpha$s and $\beta$s of the forward-backward algorithm can now be computed using equations (1) and (2) of section 3.2. Difficulties arise because we have to repeatedly multiply transition matrices by vectors, a computation that appears to require $O(N^2)$ operations, where $N = 2^{2n}$ is the number of inheritance vectors. It is possible to take advantage of the fact that the transition matrix is a Kronecker product to perform the computation in $O(N \log N)$ operations. Kruglyak et al. [3] describe an algorithm that achieves this improvement. Their algorithm achieves the same order of simplification as Yates' algorithm, presented in appendix B.
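The saving from the Kronecker structure can be illustrated with a small sketch. It assumes that every bit of the inheritance vector flips independently between adjacent loci with the same recombination fraction theta, so that the transition matrix is the Kronecker product of 2n identical 2 x 2 matrices. The code is a Yates-style butterfly written for this illustration, not the implementation of Kruglyak et al., and the function names are hypothetical.

```python
def kron_matvec(theta, x):
    """Multiply the 2^m x 2^m transition matrix by x in about m * 2^m steps.

    Each pass applies the 2 x 2 factor [[1-theta, theta], [theta, 1-theta]]
    to one bit of the inheritance vector index.
    """
    y = list(x)
    size = len(y)
    stride = 1
    while stride < size:
        for start in range(0, size, 2 * stride):
            for i in range(start, start + stride):
                a, b = y[i], y[i + stride]
                y[i] = (1 - theta) * a + theta * b
                y[i + stride] = theta * a + (1 - theta) * b
        stride *= 2
    return y

def dense_matvec(theta, x):
    """Same product from the explicit matrix, for checking: 2^m * 2^m steps."""
    size = len(x)
    m = size.bit_length() - 1
    out = []
    for v in range(size):
        total = 0.0
        for w in range(size):
            # Entry (v, w) depends only on the number of differing bits.
            flips = bin(v ^ w).count("1")
            total += theta ** flips * (1 - theta) ** (m - flips) * x[w]
        out.append(total)
    return out

n = 2                                    # two non-founders, 2n = 4 bits
x = [float(i) for i in range(2 ** (2 * n))]
fast = kron_matvec(0.1, x)
slow = dense_matvec(0.1, x)
```

For n = 2 the two routines agree up to rounding. The dense product needs 16 x 16 multiply-adds, while the factored one needs only 4 passes over 16 entries, and the gap widens quickly as n grows.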

The conditional inheritance distribution at any point x along a chromosome is obtained in the same way as in the sib-pair analysis of section 3.2. This probability is used differently with arbitrary pedigrees than with sib pairs. The most common statistical procedure is to compute a location score. Before explaining how GENEHUNTER calculates a location score, we discuss what it is in the next section.






Simon Cawley
Thu Apr 16 15:30:12 PDT 1998