next up previous
Next: Local Alignment Up: Stat 260: Statistics Previous: Global pairwise alignment

Dynamic Programming

 

A directed graph is defined as a finite set of vertices and edges, with and E a set of ordered pairs from V. A path of length l-1 is a sequence of vertices with for all k, or equivalently for paths of positive length as the corresponding sequence of edges. Paths of length 0 are just single vertices. A cycle is a path that ends at the same vertex as it begins. We will be concerned with directed acyclic graphs (DAGs) which as the name suggests are directed graphs that do not contain any cycles. A graph may be weighted by a real function W defined on the edges (and possibly also the vertices). The weight of a path P is then .

Now we give a theorem that is the essence of all dynamic programming algorithms.

 

Proof For each vertex v define to be the maximum weight over all paths ending in v. Let be a path achieving this maximum, and be the vertex preceding v in P, i.e. , or if P has length 0. These can be written recursively as

Using these formulae we can calculate and given the values of these quantities for all j<i. We see that the total number of terms to be maximized in all our calculations of is |E|+|V|, and comes for free as the argument in the maximization giving . Variants of this algorithm are

  1. Writing for the maximum weight over all paths starting in v, and a path achieving this maximum gives a similar proof, but working backwards rather than forwards through the DAG. This is what Needleman and Wunsch chose to do.

  2. Use positive weights and multiply along paths, implemented by taking logs and adding.

  3. Restrict the starting and ending positions. Insisting on starting on the first row or column and ending on the last row or column forces the algorithm to give a global alignment.

  4. Find the total sum over all paths ending at a particular vertex.

We now cover some applications of dynamic programming algorithms in statistical genetics, the first example being the one directly relevant to sequence alignment.





next up previous
Next: Local Alignment Up: Stat 260: Statistics Previous: Global pairwise alignment



Simon Cawley
Fri May 8 16:20:53 PDT 1998