
Quantitative traits.

As you will recall, the first discussion of quantitative traits (QT from now on) from a Mendelian viewpoint was given by Mendel himself. In section 10 (our numbering) of his paper, he indicated how the flower color differences he observed in bean plants might be the result of two independently segregating factors, assuming two co-dominant alleles for each factor, acting additively.

Exercise 1: Check that this Mendelian model gives up to 5 different gradations in color, segregating as 1:4:6:4:1 in the intercross. That is, the palest and the darkest colors each arise in 1/16 of the progeny, the next darkest and the next palest each in 4/16, and the intermediate color in 6/16.
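One way to check the ratio in Exercise 1 is by brute-force enumeration. The sketch below assumes, purely for illustration, that the two factors are labelled A/a and B/b, that an upper-case allele adds one unit of color, and that both intercross parents are double heterozygotes; none of these labels come from Mendel's paper.

```python
from collections import Counter
from itertools import product

# Assumed labelling: factors A/a and B/b, upper case = one unit of "dark".
# A double heterozygote produces the gametes AB, Ab, aB, ab with equal probability.
gametes = [x + y for x, y in product("Aa", "Bb")]

def dark_alleles(genotype):
    """Number of color-adding (upper-case) alleles in a four-letter genotype."""
    return sum(ch.isupper() for ch in genotype)

# Enumerate the 16 equally likely gamete pairings of the intercross.
counts = Counter(dark_alleles(g1 + g2) for g1, g2 in product(gametes, repeat=2))
ratio = [counts[k] for k in range(5)]
print(ratio)  # [1, 4, 6, 4, 1]
```

The five classes correspond to 0, 1, 2, 3 or 4 color-adding alleles, which is why the counts are the binomial coefficients for four fair draws.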

Of course one of Mendel's traits (stem length of the pea plant) was really a QT, but he ``dichotomized'' it, and was able to analyze it successfully as a qualitative trait. In a sense, this contrasting pair of characters (short and long) defined the first QTL: quantitative trait locus. Our topic for this week is the mapping of QTL.

The few remarks in Mendel's section 10 were obviously not well understood in the years immediately following the rediscovery of his work, because a much-discussed battle is said to have raged between ``Mendelians'' and ``the biometric school''. The latter apparently believed that Mendelian segregation was incompatible with the inheritance of QTs (continuously varying characters, to use the jargon of the day). We will pick up a small part of that battle, an interchange between the statisticians Yule and Pearson. In 1904 Pearson gave an analysis of the inheritance of QTs which led him to believe that Mendelian segregation could not explain observed correlations of about 1/2 between parent and offspring QT values such as stature. (Recall from your other statistical studies that Pearson and Lee had amassed such data, following Galton's discovery of correlation and regression.) Pearson's conclusion was that the Mendelian theory ``was not sufficiently elastic to cover the observed facts''. Yule (1906) redid Pearson's analysis under slightly more general assumptions and showed this conclusion to be false. We'll run through his argument now.

For this analysis, we need to refer to simple population-genetic notions, ideas we have not discussed so far. Let us assume that we have a single Mendelian factor with two variant forms, Q and q, say, which exist in a population with equal frequency (1/2, 1/2). Further, let us suppose that the genotypes QQ, Qq and qq are in the population in proportions 1:2:1 (just like the progeny of an intercross). These are known as Hardy-Weinberg frequencies, and we'll get to them in due course. We'll suppose (with Yule) that people with these genotypes have QT values of a, b and c respectively. (In practice, these could be average values.)

We need to calculate the correlation between QT values of a father and a son, so let's begin by assuming that fathers have genotypes QQ, Qq and qq in the proportions just mentioned, and that mothers also have these genotypes with the same frequencies, independently of their husband's genotype. What are the possibilities for a son? You can easily check the following assertions:

If the father is QQ, the son will be QQ or Qq with equal probabilities. If the father is qq, the son will be Qq or qq with equal probabilities. Finally, if the father is Qq, the son will be QQ, Qq or qq in proportions 1:2:1.

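Thus the joint distribution of father's and son's genotypes at the locus Q, and hence of their QT values, is as given in the following table (entries are probabilities):

                      Son
    Father      a       b       c      Total
      a        2/16    2/16     0      4/16
      b        2/16    4/16    2/16    8/16
      c         0      2/16    2/16    4/16
    Total      4/16    8/16    4/16      1

where a denotes QQ, b denotes Qq and c denotes qq.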
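The conditional probabilities stated above, and the joint distribution they imply, can be checked mechanically. The following is a minimal sketch under the assumptions already made in the text (genotype proportions 1:2:1 in both parents, mothers chosen independently of fathers); the function names are ours.

```python
from fractions import Fraction
from itertools import product

# Genotype proportions 1:2:1, as assumed in the text for both parents.
GENOTYPE_FREQ = {"QQ": Fraction(1, 4), "Qq": Fraction(1, 2), "qq": Fraction(1, 4)}
SON_BY_Q_COUNT = {2: "QQ", 1: "Qq", 0: "qq"}

def gamete_probs(genotype):
    """Probability that a parent of this genotype transmits Q or q."""
    p_Q = Fraction(genotype.count("Q"), 2)
    return {"Q": p_Q, "q": 1 - p_Q}

def joint_father_son():
    """Joint distribution of (father genotype, son genotype), summing over mothers."""
    joint = {(f, s): Fraction(0) for f in GENOTYPE_FREQ for s in GENOTYPE_FREQ}
    for (f, pf), (m, pm) in product(GENOTYPE_FREQ.items(), repeat=2):
        for (x, px), (y, py) in product(gamete_probs(f).items(),
                                        gamete_probs(m).items()):
            son = SON_BY_Q_COUNT[(x + y).count("Q")]
            joint[(f, son)] += pf * pm * px * py
    return joint

joint = joint_father_son()
for f in GENOTYPE_FREQ:
    print(f, [str(joint[(f, s)]) for s in GENOTYPE_FREQ])
```

Each printed row is a father's genotype and the columns are the son's genotypes in the order QQ, Qq, qq; the entries sum to 1 over the whole table.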

With this background, obtaining the correlation $\rho$ between the QT values of father and son (the parent-offspring correlation) is just simple algebra. It turns out to be

\[
  \rho = \frac{(a-c)^2}{2(a-c)^2 + (a-2b+c)^2} .
\]
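For readers who want the intermediate steps, here is one way the algebra can go. This is our own working from the 1:2:1 genotype proportions and the father-son transmission probabilities, not a transcription of Yule's; the symbols X (father's QT value), Y (son's QT value) and $\rho$ are our notation.

```latex
\begin{align*}
E[X] = E[Y] &= \tfrac{1}{4}\,(a + 2b + c), \\
E[XY] &= \tfrac{1}{16}\,\bigl(2a^2 + 4ab + 4b^2 + 4bc + 2c^2\bigr), \\
\operatorname{Cov}(X,Y) &= E[XY] - E[X]\,E[Y] = \tfrac{1}{16}\,(a - c)^2, \\
\operatorname{Var}(X) = \operatorname{Var}(Y)
  &= \tfrac{1}{4}\,(a^2 + 2b^2 + c^2) - \tfrac{1}{16}\,(a + 2b + c)^2 \\
  &= \tfrac{1}{16}\,\bigl[\,2(a - c)^2 + (a - 2b + c)^2\,\bigr], \\
\rho = \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}
  &= \frac{(a - c)^2}{2(a - c)^2 + (a - 2b + c)^2}.
\end{align*}
```

Note that the value b of the heterozygote enters only through the term $(a - 2b + c)^2$, which measures its departure from the midpoint of the two homozygotes.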

What can one do with such a formula? Firstly, one can note that it applies a little more generally. Suppose that we have n biallelic loci, segregating independently of one another, each having the same population distribution as our Q and q above, again independently of one another, and that our population mates without regard to genotypes at any of these loci (a lot of independence assumptions!). If the ith locus contributes $a_i$, $b_i$ and $c_i$ to the QT value, according to the individual's genotype at that locus, just as in our original case, and these contributions are additive, the correlation coefficient will be just like the one above, with sums over i in the numerator and the denominator.
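The multi-locus version can be sketched numerically as follows. The sketch assumes that, for a single locus with allele frequencies 1/2, the father-son covariance is (a - c)^2/16 and the variance is [2(a - c)^2 + (a - 2b + c)^2]/16 (our own working from the 1:2:1 proportions, not quoted from Yule), and that under the independence and additivity assumptions these quantities simply add across loci. The function names are hypothetical.

```python
from fractions import Fraction

def locus_cov_var(a, b, c):
    """Father-son covariance and variance contributed by one locus
    with allele frequencies 1/2 and genotype values a (QQ), b (Qq), c (qq)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    cov = (a - c) ** 2 / 16
    var = (2 * (a - c) ** 2 + (a - 2 * b + c) ** 2) / 16
    return cov, var

def parent_offspring_correlation(loci):
    """loci: list of (a_i, b_i, c_i) triples, one per independent locus.
    Covariances and variances are summed over loci before dividing."""
    parts = [locus_cov_var(*abc) for abc in loci]
    total_cov = sum(cov for cov, _ in parts)
    total_var = sum(var for _, var in parts)
    return total_cov / total_var

# One additive locus and one fully dominant locus (illustrative values only).
print(parent_offspring_correlation([(2, 1, 0), (1, 1, 0)]))  # 5/11
```

With every locus additive the result is 1/2 whatever the effect sizes, and with every locus fully dominant it is 1/3; mixtures fall in between.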

Exercise 2: Obtain the expression for the correlation coefficient $\rho$ in the simpler, and then the more general, case discussed above.

Having this formula, Yule pointed out Pearson's mistake. Pearson had assumed dominance in the action of the factor on the trait, i.e. that a = b, and got a correlation of $\rho = 1/3$, markedly different from 1/2. Yule pointed out that if, instead, he had assumed additivity,

\[
  b = \tfrac{1}{2}(a + c),
\]

he would have obtained 1/2, the accepted value for the parent-offspring correlation. In my view Yule's simple analysis is elegant, compelling and important. It links Mendelian genetics with Galtonian statistics, illuminating both. Fisher usually gets all the credit for this, because of his famous 1918 paper ``The correlation between relatives on the supposition of Mendelian inheritance''. This is indeed a great paper, redoing Yule's analysis more carefully, examining independence assumptions, having more general population frequencies for the alleles, considering non-random mating, inventing ANOVA, and much more. But in my view, the beauty and simplicity of Yule's reasoning cannot be beaten, and in a real sense, his additivity story gets to the heart of the matter.
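The contrast between dominance and additivity can also be seen in a small simulation. The sketch below is an illustration under the model of this section (one locus, allele frequencies 1/2, random mating); the function names, family count and seed are arbitrary choices of ours, and the genotype values passed in are only examples.

```python
import random

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_correlation(a, b, c, n_families=200_000, seed=0):
    """Simulated father-son correlation for genotype values a (QQ), b (Qq), c (qq)."""
    rng = random.Random(seed)
    value = {2: a, 1: b, 0: c}  # QT value indexed by number of Q alleles
    fathers, sons = [], []
    for _ in range(n_families):
        # Two fair allele draws per parent give genotypes in proportions 1:2:1.
        father = rng.getrandbits(1) + rng.getrandbits(1)
        mother = rng.getrandbits(1) + rng.getrandbits(1)
        # Each parent transmits Q with probability (number of Q alleles) / 2.
        son = (rng.random() < father / 2) + (rng.random() < mother / 2)
        fathers.append(value[father])
        sons.append(value[son])
    return pearson(fathers, sons)

print("dominance  (a = b):      ", simulate_correlation(1, 1, 0))
print("additivity (b = (a+c)/2):", simulate_correlation(2, 1, 0))
```

With this many simulated families the two estimates should come out close to 1/3 and 1/2 respectively, up to sampling noise in the third decimal place.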






Simon Cawley
Mon Apr 20 19:59:26 PDT 1998