As you will recall, the first discussion of quantitative traits (QT from now on) from a Mendelian viewpoint was given by Mendel. In section 10 (our numbering) of his paper, he indicated how the flower color differences he observed in bean plants might possibly be the result of two independently segregating factors, assuming two co-dominant alleles for each factor, acting additively.
Exercise 1:
Check that this Mendelian model gives up to 5 different gradations
in color, segregating as 1:4:6:4:1 in the
intercross. That is,
the palest and the darkest colors each arise in 1/16 of the
progeny, the next darkest and the next palest each in 4/16, and the
intermediate color in 6/16.
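One way to check the claimed ratio is to enumerate the gametes directly. The sketch below assumes, as in the exercise, two independently segregating factors with additive, co-dominant alleles, and simply counts how many "dark" alleles each offspring of the intercross carries. The function name and the 0/1 coding of alleles are illustrative choices, not part of Mendel's account.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Each intercross parent is heterozygous at both factors and transmits
# one allele per factor, each with probability 1/2.
# A gamete is coded as (dark alleles at factor 1, dark alleles at factor 2).
gametes = list(product([0, 1], repeat=2))

def intercross_distribution():
    """Return {number of dark alleles: probability} among intercross progeny."""
    counts = Counter()
    for g_father, g_mother in product(gametes, repeat=2):
        counts[sum(g_father) + sum(g_mother)] += 1
    total = sum(counts.values())
    return {k: Fraction(n, total) for k, n in sorted(counts.items())}

dist = intercross_distribution()
for dark_alleles, prob in dist.items():
    print(dark_alleles, prob)
```

The 16 equally likely gamete pairs fall into five classes with 0 to 4 dark alleles, in counts 1, 4, 6, 4 and 1.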
Of course one of Mendel's traits (stem length of the pea plant) was really a QT, but he ``dichotomized'' it, and was able to analyze it successfully as a qualitative trait. In a sense, this contrasting pair of characters (short and long) defined the first QTL: quantitative trait locus. Our topic for this week is the mapping of QTL.
The few remarks in his section 10 were obviously not well understood in the years immediately following the rediscovery of Mendel, because a much-discussed battle is said to have raged between ``Mendelians'' and ``the biometric school''. The latter apparently believed that Mendelian segregation was incompatible with the inheritance of QTs (continuously varying characters, to use the jargon of the day). We will pick up a small part of that battle, an interchange between the statisticians Yule and Pearson. In 1904 Pearson gave an analysis of the inheritance of QTs which led him to believe that Mendelian segregation could not explain observed correlations of about 1/2 between parent-offspring QT values such as stature. (Recall from your other statistical studies that Pearson and Lee had amassed such data, following Galton's discovery of correlation and regression.) Pearson's conclusion was that the Mendelian theory ``was not sufficiently elastic to cover the observed facts''. Yule (1906) redid Pearson's analysis under slightly more general assumptions, and showed this conclusion to be false, and we'll run through his argument now.
For this analysis, we need to refer to simple population-genetic notions,
ideas we have not discussed so far. Let us assume that we have a single
Mendelian factor with two variant forms, Q and q, say, which exist in
a population with equal frequency (1/2, 1/2). Further, let us suppose
that the genotypes QQ, Qq and qq are in the population in proportions
1:2:1 (just like the progeny of an
intercross). These are known
as Hardy-Weinberg frequencies, and we'll get to that in due course.
We'll suppose (with Yule) that people with these genotypes have QT
values of a, b and c respectively. (In practice, these could be average
values.)
We need to calculate the correlation between QT values of a father and a son, so let's begin by assuming that fathers have genotypes QQ, Qq and qq in the proportions just mentioned, and that mothers also have these genotypes with the same frequencies, independently of their husband's genotype. What are the possibilities for a son? You can easily check the following assertions:
If the father is QQ, the son will be QQ or Qq with equal probabilities. If the father is qq, the son will be Qq or qq with equal probabilities. Finally, if the father is Qq, the son will be QQ, Qq or qq in proportions 1:2:1.
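Thus the joint distribution of father's and son's genotypes at the locus Q, and hence of their QT values, is as given in the following table:

Father \ Son    QQ (a)    Qq (b)    qq (c)    Total
QQ (a)          1/8       1/8       0         1/4
Qq (b)          1/8       1/4       1/8       1/2
qq (c)          0         1/8       1/8       1/4
Total           1/4       1/2       1/4       1

Here a, b and c denote the QT values of genotypes QQ, Qq and qq respectively.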
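The joint distribution can also be obtained by brute-force enumeration over couples and transmitted alleles. The following is a minimal sketch under the assumptions stated above (genotype proportions 1:2:1 in both sexes, mates chosen independently of genotype, each parental allele transmitted with probability 1/2). The names `FREQ`, `child_genotype` and `father_son_joint` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ["QQ", "Qq", "qq"]
# Population proportions 1:2:1, assumed for fathers and mothers alike.
FREQ = {"QQ": Fraction(1, 4), "Qq": Fraction(1, 2), "qq": Fraction(1, 4)}

def child_genotype(allele_from_father, allele_from_mother):
    """Name the genotype formed by two transmitted alleles."""
    n_Q = (allele_from_father + allele_from_mother).count("Q")
    return {2: "QQ", 1: "Qq", 0: "qq"}[n_Q]

def father_son_joint():
    """Joint distribution {(father, son): probability} under random mating."""
    joint = {pair: Fraction(0) for pair in product(GENOTYPES, repeat=2)}
    for father, mother in product(GENOTYPES, repeat=2):
        p_couple = FREQ[father] * FREQ[mother]
        # Each parent transmits either of its two alleles with probability 1/2,
        # so each of the four allele pairs has probability 1/4.
        for af, am in product(father, mother):
            joint[(father, child_genotype(af, am))] += p_couple * Fraction(1, 4)
    return joint

joint = father_son_joint()
for father in GENOTYPES:
    print(father, [str(joint[(father, son)]) for son in GENOTYPES])
```

Each printed row lists the joint probabilities for sons QQ, Qq and qq given the father's genotype; the rows sum to the fathers' proportions 1/4, 1/2 and 1/4.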
With this background, computing the correlation between QT values of father and son,
which we write $\rho_{PO}$ (PO for parent-offspring), is just simple algebra. It turns out to be
$$\rho_{PO} = \frac{(a-c)^2}{2(a-c)^2 + (a-2b+c)^2}.$$
What can one do with such a formula? Firstly, one can note that it applies
a little more generally. Suppose that we have n biallelic loci, segregating
independently of one another, each having the same population distribution as
our Q and q above, again independently of one another, and that our
population mates without regard to genotypes at any of these loci
(a lot of independence assumptions!). If the ith locus contributes
$a_i$, $b_i$ and $c_i$ to the QT value, according to the individual's
genotype at that locus, just as in our original case, and these
contributions are additive, the parent-offspring correlation coefficient
will be just like the one above, with sums over i in both the numerator
and the denominator.
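A sketch of why the sums appear, under the independence assumptions just listed. The notation is introduced here for illustration only: $X = \sum_i X_i$ and $Y = \sum_i Y_i$ are the father's and son's QT values, $X_i$ and $Y_i$ are the contributions of locus $i$, and $a_i$, $b_i$, $c_i$ are the contributions of genotypes QQ, Qq and qq at that locus. Because loci are independent, the cross terms $\operatorname{Cov}(X_i, Y_j)$ with $i \neq j$ vanish, so covariances and variances simply add over loci.

```latex
\begin{align*}
\operatorname{Cov}(X, Y)
  &= \sum_{i=1}^{n} \operatorname{Cov}(X_i, Y_i)
   = \frac{1}{16} \sum_{i=1}^{n} (a_i - c_i)^2, \\
\operatorname{Var}(X) = \operatorname{Var}(Y)
  &= \sum_{i=1}^{n} \operatorname{Var}(X_i)
   = \frac{1}{16} \sum_{i=1}^{n}
     \left[ 2 (a_i - c_i)^2 + (a_i - 2 b_i + c_i)^2 \right], \\
\operatorname{Corr}(X, Y)
  &= \frac{\sum_{i=1}^{n} (a_i - c_i)^2}
          {\sum_{i=1}^{n} \left[ 2 (a_i - c_i)^2 + (a_i - 2 b_i + c_i)^2 \right]}.
\end{align*}
```

With $n = 1$ this reduces to the single-locus case, and the per-locus terms follow from the father-son joint distribution with genotype proportions 1:2:1.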
Exercise 2: Obtain the expression for the parent-offspring correlation
in the simpler,
and then the more general case discussed above.
Having this formula, Yule pointed out Pearson's mistake. Pearson had assumed
dominance in the action of the trait, i.e. that a = b, and got
a parent-offspring correlation of 1/3,
markedly different from 1/2.
Yule pointed out that if, instead, he had assumed additivity,
i.e. that b = (a + c)/2,
he would have obtained 1/2, the accepted value for
the parent-offspring correlation. In my view
Yule's simple analysis is elegant, compelling and important. It links
Mendelian genetics with Galtonian statistics, illuminating both. Fisher
usually gets all the credit for this, because of his famous 1918 paper
``The correlation between relatives on the supposition of Mendelian
inheritance''. This is indeed a great paper, redoing Yule's analysis
more carefully, examining independence assumptions, having more
general population frequencies for the alleles, considering
non-random mating, inventing ANOVA, and much more. But in my view, the
beauty and simplicity of Yule's reasoning cannot be beaten, and in a
real sense, his additivity story gets to the heart of the matter.
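The two cases contrasted above, dominance (a = b) and additivity (b midway between a and c), can be checked numerically. This is a minimal sketch rather than Yule's own calculation: it computes the father-son correlation directly from the joint genotype distribution implied by the transmission rules given earlier (genotype proportions 1:2:1, random mating). The function name `parent_offspring_correlation` and the example trait values are illustrative assumptions.

```python
from fractions import Fraction

# Joint probabilities for (father, son) genotypes, both in the order QQ, Qq, qq,
# as implied by random mating with genotype proportions 1:2:1.
JOINT = [
    [Fraction(1, 8), Fraction(1, 8), Fraction(0)],
    [Fraction(1, 8), Fraction(1, 4), Fraction(1, 8)],
    [Fraction(0), Fraction(1, 8), Fraction(1, 8)],
]

def parent_offspring_correlation(a, b, c):
    """Exact correlation of father's and son's QT values for genotype values a, b, c."""
    values = [Fraction(a), Fraction(b), Fraction(c)]
    # The table is symmetric, so fathers and sons share the same marginal.
    marginal = [sum(row) for row in JOINT]
    mean = sum(p * v for p, v in zip(marginal, values))
    var = sum(p * (v - mean) ** 2 for p, v in zip(marginal, values))
    cov = sum(
        JOINT[i][j] * (values[i] - mean) * (values[j] - mean)
        for i in range(3)
        for j in range(3)
    )
    return cov / var

additive = parent_offspring_correlation(2, 1, 0)   # b = (a + c) / 2
dominant = parent_offspring_correlation(1, 1, 0)   # a = b
print(additive, dominant)
```

Exact fractions are used so that the results come out as 1/2 and 1/3 without rounding. The function is undefined when a = b = c, since the trait then has zero variance.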