next up previous
Next: Variationsextensions and Up: Statistics in Genetics Previous: Estimating recombination.

Stochastic models of recombination.

Why might we want them? Firstly, one might hope that a suitable stochastic model for recombination would lead to more efficient analyses of recombination data, and to an extent this is true. But as in all modelling, something is lost, and there is certainly a view that the losses outweigh the gains. Morgan's school, which did so much mapping in Drosophila, never made use of stochastic models of the kind I will shortly discuss. However, they usually collected lots of data, and did everything empirically. A second reason for introducing models is the possibility that they might throw light on mechanisms, and here it is hard to say that this has occurred. Much of the great work elucidating the nature of recombination was built on simple statistical summaries of experiments or carefully collected data, not models. Nevertheless, models are widely used in modern linkage analysis, in part because there always seems to be too little data, and models help, but also because it turns out to be impossible to carry out what are known as multilocus analyses without models.

In order to model recombination, we need to model the appropriate stage of meiosis, namely reciprocal meiotic exchange between non-sister chromatids. This takes place at the 4-strand stage, although there are models of the relevant features of meiotic products which do not begin at the 4-strand stage.

It is convenient to consider chromosome arms separately, as the exchanges on separate arms generally seem to be independent. The conventional approach requires specification of

i) a point process defining the locations of the exchanges along the 4-strand bundle, known as a bivalent; and

ii) a process which specifies the strands involved in the exchanges, given their locations.

The standard (near universal) model uses a Poisson point process for i), and assumes that all strand choice is random, that is, the pair of strands involved in any given exchange is equally likely to be any one of the four possible non-sister pairs, each with probability 1/4, independently of the choices of strands for all other exchanges, and of the positions of the exchanges. This assumption for ii) is known as No Chromatid Interference, abbreviated NCI.
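The standard model can be sketched as a small simulation. This is only an illustration under stated assumptions: the function names, the labelling of strands 0,1 (sisters from one homologue) and 2,3 (sisters from the other), and the default rate of 2 exchanges per unit length on the bivalent are all choices made here, not part of the notes.

```python
import random

# Assumed labelling: strands 0 and 1 are sisters from one homologue,
# strands 2 and 3 are sisters from the other homologue.
NON_SISTER_PAIRS = [(0, 2), (0, 3), (1, 2), (1, 3)]


def simulate_exchanges(arm_length, rng, rate=2.0):
    """Simulate one meiosis on a chromosome arm.

    Exchange positions form a homogeneous Poisson process on
    [0, arm_length] (exponential gaps with the given rate, an assumed
    parameter); each exchange picks one of the four non-sister pairs
    uniformly and independently (NCI).
    Returns a list of (position, (i, j)) in increasing position order.
    """
    exchanges = []
    position = rng.expovariate(rate)
    while position < arm_length:
        exchanges.append((position, rng.choice(NON_SISTER_PAIRS)))
        position += rng.expovariate(rate)
    return exchanges


def recombinant_products(exchanges, a, b):
    """For each of the 4 meiotic products, report whether it is
    recombinant between loci at positions a < b.

    A product is followed from the strand it occupies at a; every
    exchange in (a, b) involving its current strand moves it to the
    partner strand. It is recombinant if it ends on the other homologue.
    """
    flags = []
    for start in range(4):
        current = start
        for position, (i, j) in exchanges:
            if a < position < b:
                if current == i:
                    current = j
                elif current == j:
                    current = i
        flags.append(current // 2 != start // 2)
    return flags


def estimate_recombination_fraction(a, b, arm_length, meioses, rng):
    """Monte Carlo estimate of the recombinant fraction between a and b."""
    recombinant = 0
    for _ in range(meioses):
        exchanges = simulate_exchanges(arm_length, rng)
        recombinant += sum(recombinant_products(exchanges, a, b))
    return recombinant / (4 * meioses)
```

For instance, `estimate_recombination_fraction(0.2, 0.7, 1.0, 20000, random.Random(1))` gives a simulated recombination fraction for an interval of length 0.5 on an arm of length 1.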

Consider two loci A and B on a chromosome arm, and let N denote the number of exchanges occurring on the 4-strand bundle between A and B. Mather's formula, actually first proved by Emerson and Rhoades (1933), relates the unobservable exchange events on the 4-strand bundle to potentially observable events on meiotic products. It has a number of forms, the most basic of which is the fact that whatever the value of N, as long as N>0, exactly half of the meiotic products arising from the $4^N$ possible strand-choice configurations (four products each) will be recombinant and half parental at A and B. (This is in theory; the issue of detecting these events is not relevant here.)

Exercise 5. Prove the previous assertion by a counting argument. (Hint: obtain a recursion for the number of recombinant strands among the exchange configurations.)
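Before attempting the proof, the assertion can be checked numerically for small N by brute-force enumeration. The sketch below is a check, not a proof; the function name and the strand labelling (0,1 sisters from one homologue, 2,3 sisters from the other) are assumptions made for illustration.

```python
from itertools import product

# Assumed labelling: strands 0,1 from one homologue, 2,3 from the other.
NON_SISTER_PAIRS = [(0, 2), (0, 3), (1, 2), (1, 3)]


def count_recombinants(n):
    """Enumerate all 4**n strand-choice configurations for n exchanges
    between A and B and count recombinant meiotic products.

    Returns (number of recombinant products, total number of products).
    """
    recombinant = 0
    total = 0
    for configuration in product(NON_SISTER_PAIRS, repeat=n):
        for start in range(4):
            # Follow the product that starts on strand `start` at A.
            current = start
            for i, j in configuration:
                if current == i:
                    current = j
                elif current == j:
                    current = i
            # Recombinant if it ends on the other homologue at B.
            recombinant += current // 2 != start // 2
            total += 1
    return recombinant, total
```

For example, `count_recombinants(2)` returns `(32, 64)`: the 2-strand, 3-strand and 4-strand double exchanges contribute 0, 2 and 4 recombinants per configuration, and these average out to exactly half.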

A second form of Mather's formula is simply a probability interpretation of the previous assertion. We state it as: the chance that a random meiotic product is recombinant between A and B, given that N=n, is 1/2 if n>0, and 0 otherwise.

To see this, take a random meiotic product and consider the number of exchanges between A and B that involve it. If this number is odd, it is recombinant across A-B; if the number is even, it is parental across A-B. By NCI the number of such exchanges will be binomially distributed with parameters n and 1/2. Thus, given N=n, the recombination fraction r between A and B is the sum of $\binom{n}{k}(1/2)^n$ over all odd numbers k between 0 and n, and this is readily seen to be 1/2 if n>0, and 0 otherwise.
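One way to see the value of the odd-term binomial sum, writing the summand as $\binom{n}{k}(1/2)^n$, is to subtract the binomial expansion of $(\tfrac12 - \tfrac12)^n$ from that of $(\tfrac12 + \tfrac12)^n$, which cancels the even terms and doubles the odd ones:

$$\sum_{k\ \mathrm{odd}} \binom{n}{k} \left(\tfrac{1}{2}\right)^n = \tfrac{1}{2}\left[\left(\tfrac{1}{2}+\tfrac{1}{2}\right)^n - \left(\tfrac{1}{2}-\tfrac{1}{2}\right)^n\right] = \tfrac{1}{2}\left(1 - 0^n\right).$$

With the convention $0^0 = 1$, the right-hand side is 1/2 for n>0 and 0 for n=0.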

Our third form of Mather's formula comes by writing $p_n = P(N=n)$ for the distribution of the number of exchanges on the bivalent between A and B, and is as follows:

$$r = \tfrac{1}{2} \sum_{n>0} p_n = \tfrac{1}{2}\,(1 - p_0).$$

Of course this is obtained by multiplying the second form by the probability of obtaining N=n, and summing over all n. From it we can make two important observations: under NCI, recombination fractions are bounded above by 1/2, and are non-decreasing in the size of the chromosomal interval A-B.
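The summation just described can be written out directly. In this minimal sketch the function name and the list representation `pmf[n] = P(N = n)` are illustrative assumptions; any distribution for the exchange count can be supplied.

```python
def recombination_fraction(pmf):
    """Recombination fraction under NCI for an exchange-count distribution.

    pmf[n] is P(N = n). Each term is weighted by the conditional chance
    of recombination given N = n: 1/2 if n > 0, and 0 if n = 0.
    """
    if abs(sum(pmf) - 1.0) > 1e-9:
        raise ValueError("pmf must sum to 1")
    return sum(p * (0.5 if n > 0 else 0.0) for n, p in enumerate(pmf))
```

Since only the term for n = 0 contributes nothing, the result always equals half the probability of at least one exchange, which makes the upper bound of 1/2 immediate.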

If N has a Poisson distribution with mean 2d, so that the mean number of exchanges involving a random meiotic product is d, then this d is related to the recombination fraction r via:

$$r = \tfrac{1}{2}\left(1 - e^{-2d}\right).$$

This relation is known as Haldane's mapping function, and is widely used in multilocus mapping, although it usually greatly overestimates the mean number d of exchanges per strand per meiosis.
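As a small sketch, Haldane's mapping function and its inverse can be coded as follows, taking the relation in the form r = (1/2)(1 - exp(-2d)) with d in Morgans. The function names are hypothetical, chosen here for illustration.

```python
import math


def haldane_r(d):
    """Recombination fraction r for map distance d (Morgans)."""
    if d < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def haldane_d(r):
    """Map distance d (Morgans) for recombination fraction r in [0, 1/2)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)
```

The inverse diverges as r approaches 1/2, reflecting the fact that loci far apart on an arm are indistinguishable from unlinked loci under this model.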






Simon Cawley
Mon Apr 20 19:50:16 PDT 1998