Next: The program VITESSE Up: Stat 260: Statistics Previous: Combining peeling and

Allele recoding

 

Many techniques have been developed to reduce the number of genotypes to store in memory and make peeling feasible with multilocus genotypes of highly polymorphic markers. O'Connell and Weeks [5] list a few of them:

  1. We can lump the alleles that do not appear in the pedigree into a single allele whose frequency equals the sum of the frequencies of the non-appearing alleles.
  2. When everyone in the pedigree is typed, the population allele frequencies do not enter into the calculation, so alleles can be replaced by arbitrary labels. Only 4 labels are needed to represent the 4 parental alleles in a nuclear family, and the same 4 labels can be reused in the other families.
  3. For simple diseases, we can treat only the meioses involving affected individuals as informative. This again allows us to reuse allele labels. It does not work for complex diseases with incomplete penetrance or phenocopies (the disease phenotype occurring without the disease genotype at the locus under study, for example because of a different disease locus).
  4. An unobserved founder with only one typed offspring can be made homozygous for the allele transmitted to that offspring. The number of genotypes to sum over for that founder is therefore reduced to one.
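
As an illustration of technique 1, here is a minimal Python sketch. It assumes allele frequencies are held in a dictionary keyed by allele label; the function name lump_unobserved_alleles and the label "other" are hypothetical and are not taken from O'Connell and Weeks or from any linkage package.

```python
def lump_unobserved_alleles(freqs, observed, lumped_label="other"):
    """Merge every allele not seen in the pedigree into one lumped allele.

    freqs        -- dict mapping allele label -> population frequency
    observed     -- set of allele labels that appear in the pedigree
    lumped_label -- hypothetical label for the merged allele
    """
    if lumped_label in freqs:
        raise ValueError("lumped_label collides with an existing allele")
    # Alleles seen in the pedigree keep their own frequency.
    recoded = {a: f for a, f in freqs.items() if a in observed}
    # All remaining alleles are summed into a single frequency.
    rest = sum(f for a, f in freqs.items() if a not in observed)
    if rest > 0:
        recoded[lumped_label] = rest
    return recoded


# Example: a 4-allele marker where only alleles "2" and "3" are observed.
freqs = {"1": 0.125, "2": 0.25, "3": 0.125, "4": 0.5}
recoded = lump_unobserved_alleles(freqs, observed={"2", "3"})
print(recoded)
```

In this example the marker goes from 4 alleles to 3, so the number of single-locus genotypes drops from 10 to 6. The saving grows quickly for highly polymorphic markers in small pedigrees.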

The storage space required for peeling will also depend on the representation of the genotypes in the computer.

The first approach is to store all possible multilocus genotypes for each person in a matrix with M(M+1)/2 distinct elements, where M is the product of the numbers of alleles at the loci, that is, the number of possible haplotypes. Many of these genotypes are inconsistent with the data observed on the pedigree, but all genotypes are readily available. This is the method used in FASTLINK, a fast version of the classical pedigree analysis package LINKAGE.
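
The count M(M+1)/2 is the number of unordered pairs of haplotypes, including pairs of identical haplotypes. A small Python sketch of the arithmetic follows; the function name full_table_size is hypothetical and the code is not taken from FASTLINK.

```python
from math import prod


def full_table_size(alleles_per_locus):
    """Return (M, M(M+1)/2) for a list of per-locus allele counts."""
    m = prod(alleles_per_locus)   # number of possible haplotypes
    return m, m * (m + 1) // 2    # unordered haplotype pairs


# Example: two loci with 3 and 4 alleles.
m, n_genotypes = full_table_size([3, 4])
print(m, n_genotypes)
```

With 3 and 4 alleles there are M = 12 haplotypes and 12 x 13 / 2 = 78 multilocus genotypes per person.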

The second approach is to store only single-locus genotype lists, and to build valid multilocus genotypes from the lists when they are needed. The genotype reconstruction takes more time, but the reduction in size is drastic. Consider for example five marker loci with 10 alleles each. Each locus has 10 x 11 / 2 = 55 single-locus genotypes, so the data can be stored in 5 lists of 55 genotypes, for a total of 275 elements. With the first approach, M = 10^5 = 100,000 and M(M+1)/2 = 5,000,050,000 multilocus genotypes need to be stored in memory.
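
The idea of storing per-locus lists and building combinations on demand can be sketched in Python as follows. This is only an illustration under simplifying assumptions: it enumerates combinations of unordered single-locus genotypes, ignores phase, and does not filter out genotypes that are inconsistent with the pedigree, all of which a real program would have to handle. The function names are hypothetical.

```python
from itertools import combinations_with_replacement, product


def single_locus_genotypes(n_alleles):
    """List the unordered genotypes at one locus: n(n+1)/2 allele pairs."""
    alleles = range(1, n_alleles + 1)
    return list(combinations_with_replacement(alleles, 2))


def multilocus_genotypes(per_locus_lists):
    """Lazily yield one tuple of single-locus genotypes per combination.

    Nothing is materialised up front; combinations are produced only
    as the caller iterates. Phase is ignored in this sketch.
    """
    return product(*per_locus_lists)


# Five loci with 10 alleles each, as in the example in the text.
lists = [single_locus_genotypes(10) for _ in range(5)]
stored = sum(len(genotypes) for genotypes in lists)
print(stored)  # number of elements actually held in memory

# Build the first multilocus combination only when it is needed.
first = next(multilocus_genotypes(lists))
print(first)
```

Only the 275 single-locus genotypes are held in the lists; each multilocus combination is created when the iterator is advanced and can be discarded afterwards, which trades computing time for memory.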








Simon Cawley
Thu Apr 16 15:30:12 PDT 1998