Next: References Up: Stat 260: Statistics Previous: Application

Mathematical model: branching process

Like all other biochemical processes, PCR is not a perfect process. Occasionally, DNA polymerase will substitute, add, or delete a nucleotide in the growing DNA chain, and these errors, occurring in vitro, cannot be repaired the way the DNA replication machinery in the cell repairs them. Statistical models can be used to describe the distribution of the number of mutations in PCR; see Sun [4]. We present some basic results below.

Without loss of generality, we study only one strand of the DNA. Let $S_0$ be the initial number of identical copies of the single-stranded sequence, which serve as templates for DNA replication. During each cycle, we assume that DNA polymerase forms a new strand from each existing template with probability $\lambda$. The newly formed strands, as well as the old templates, serve as templates in the next cycle. Let $S_n$ be the number of single-stranded sequences containing the target and the two primers after $n$ PCR cycles. Then the sequence $\{S_n\}$ is a branching process. Further, we make the following two assumptions.

  1. $\{S_n\}$ is a Markov process, that is, the distribution of $S_{n+1}$ depends only on $S_n$.
  2. All the templates during each cycle are i.i.d., that is, each template is copied independently and with the same probability.
Under these assumptions, $\{S_n\}$ is a Galton-Watson process (see [1]).

Now we define the generation number of a sequence. The original sequences are called the 0-th generation. The sequences generated directly from the original sequences are called the first generation. Inductively, the sequences generated from the $k$-th generation are called the $(k+1)$-th generation. Let $S_n^k$ be the number of $k$-th generation sequences after $n$ cycles. Then $S_n = \sum_{k=0}^{n} S_n^k$. It can be shown that $E(S_n^k) = S_0 \binom{n}{k} \lambda^k$, and thus $E(S_n) = S_0 (1+\lambda)^n$. After $n$ cycles, the probability that a randomly chosen sequence belongs to the $k$-th generation is $S_n^k / S_n$, which can be approximated by $\binom{n}{k} \lambda^k / (1+\lambda)^n$ if $S_0$ is sufficiently large. This is a consequence of the strong law of large numbers. Therefore, the following assumption can be made when $S_0$ is sufficiently large.

Assumption (A1). The distribution of the generation number $K$ of a randomly chosen sequence after $n$ PCR cycles is $\mathrm{Binomial}(n, \lambda/(1+\lambda))$.

For the proof of this and a number of related results, see [4].
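
The model above can also be explored numerically. The following is a minimal simulation sketch, not taken from [4]. It assumes an initial template count `s0` and a per-cycle copying probability `lam` (both names, and the values used, are illustrative choices), tracks how many sequences belong to each generation, and compares the empirical generation fractions with a binomial distribution with parameters n and lam/(1+lam), which is the form the generation counts are expected to follow when `s0` is large.

```python
import random
from math import comb


def simulate_pcr(s0, lam, n_cycles, rng):
    """Simulate n_cycles of PCR; return counts[k] = number of k-th generation sequences.

    Assumption (illustrative): every existing sequence is copied independently
    with probability lam in each cycle, and a copy of a k-th generation
    template belongs to generation k + 1.
    """
    counts = [s0] + [0] * n_cycles
    for _ in range(n_cycles):
        new = [0] * (n_cycles + 1)
        for k in range(n_cycles):
            # Number of successful copies made from the k-th generation templates.
            new[k + 1] = sum(1 for _ in range(counts[k]) if rng.random() < lam)
        # Old templates are kept; new strands join them for the next cycle.
        counts = [old + added for old, added in zip(counts, new)]
    return counts


def binomial_pmf(n, p, k):
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    s0, lam, n_cycles = 2000, 0.8, 8  # hypothetical values, chosen for illustration
    counts = simulate_pcr(s0, lam, n_cycles, random.Random(0))
    total = sum(counts)
    print("simulated total:", total, " expected:", s0 * (1 + lam) ** n_cycles)
    p = lam / (1 + lam)
    for k, c in enumerate(counts):
        print(k, round(c / total, 4), round(binomial_pmf(n_cycles, p, k), 4))
```

With a large `s0` the two printed columns should agree closely; with a small `s0` the early cycles are noisier and the agreement is looser, which is why the assumption is stated only for a sufficiently large initial number of templates.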






Simon Cawley
Wed Apr 29 19:50:12 PDT 1998