03-511/711, 15-495/856 Course Notes - Sept 3, 2009
Pairwise alignment continued
Alignment algorithms
The dynamic programs for sequence alignment compute a matrix a[i,j],
which gives the scores of the optimal alignments of all prefixes. These algorithms have four components:
- Initialization of the first row and column of a[i,j].
- A recurrence relation for a[i,j], i,j >= 1.
- Determination of the score of the optimal alignment
from the matrix a[i,j] in O(mn) time.
- Trace back through the alignment matrix to obtain the
optimal alignment in O(m+n) time.
The details of each of these steps are what differentiate global,
semi-global and local alignment.
Local Alignment
- Initialize the first row and column to zero: a[i,0] = a[0,j] = 0 for all i and j.
- Recurrence:
a[i,j] = max { a[i-1,j] + g,
               a[i-1,j-1] + p(s[i], t[j]),
               a[i,j-1] + g,
               0 }
- The score of the optimal alignment is max{ a[i,j]}, where the
maximum is taken over all i and all j.
- Trace back starting at a*[i,j], the cell holding the
maximum score. End the trace back at the first cell whose
score is zero.
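
The four steps above can be sketched in Python as follows. This is a minimal illustration, not the course's reference implementation: the function name local_align, the default scores (match = 1, mismatch = -1, gap = -2) and the simple match/mismatch form of p are assumptions made for the example. Ties in the trace back are broken in a fixed order, so only one of the possibly many optimal alignments is returned.

```python
def local_align(s, t, match=1, mismatch=-1, gap=-2):
    """Local alignment of s and t with a linear gap score.

    The scoring values are illustrative assumptions; p(x, y) is taken
    to be `match` when x == y and `mismatch` otherwise.
    Returns (score, aligned_s, aligned_t).
    """
    m, n = len(s), len(t)

    def p(x, y):
        return match if x == y else mismatch

    # Initialization: first row and first column are zero.
    a = [[0] * (n + 1) for _ in range(m + 1)]

    # Recurrence, tracking the maximum cell as the matrix is filled.
    best, best_i, best_j = 0, 0, 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a[i][j] = max(
                a[i - 1][j] + gap,                        # s[i] against a gap
                a[i - 1][j - 1] + p(s[i - 1], t[j - 1]),  # s[i] against t[j]
                a[i][j - 1] + gap,                        # t[j] against a gap
                0,                                        # start a new alignment
            )
            if a[i][j] > best:
                best, best_i, best_j = a[i][j], i, j

    # Trace back from the maximum cell until a zero score is reached.
    i, j = best_i, best_j
    top, bottom = [], []
    while i > 0 and j > 0 and a[i][j] > 0:
        if a[i][j] == a[i - 1][j - 1] + p(s[i - 1], t[j - 1]):
            top.append(s[i - 1])
            bottom.append(t[j - 1])
            i, j = i - 1, j - 1
        elif a[i][j] == a[i - 1][j] + gap:
            top.append(s[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(t[j - 1])
            j -= 1

    return best, "".join(reversed(top)), "".join(reversed(bottom))


score, aligned_s, aligned_t = local_align("TTACGTT", "GGACGGG")
```

Python strings are indexed from 0, so s[i-1] in the code plays the role of s[i] in the recurrence. Filling the matrix and locating its maximum both take O(mn) time; the trace back visits at most m+n cells.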
Note that:
- There can be more than one optimal alignment
- Suboptimal alignments may be of interest
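- The scoring parameters should satisfy M > m > 2g, where M is the
match score, m is the mismatch score and g is the gap score, so that
a match is preferred to a mismatch and a mismatch is preferred to
a pair of gaps.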
- Global and semi-global alignments can use distance or
similarity functions.
- Local pairwise alignment requires that:
  - The scoring function be a similarity function.
  - The similarity matrix, p[i,j], contain at least one positive
  value.
  - The expected random alignment score be negative.
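
The last two requirements can be checked numerically. The sketch below assumes that the expected random alignment score means the expected score of a single aligned pair of residues drawn independently from background frequencies, that is, the sum over all x and y of f(x) f(y) p(x,y). The function name check_local_scoring, the uniform DNA frequencies and the +1/-1 matrix are illustrative assumptions, not values taken from these notes.

```python
from itertools import product


def check_local_scoring(p, freqs):
    """Check two requirements for local alignment scoring.

    p     : dict mapping (x, y) -> similarity score (hypothetical input format)
    freqs : dict mapping residue -> background frequency (assumed independent)
    Returns (has_positive_entry, expected_score_per_aligned_pair).
    """
    has_positive = any(score > 0 for score in p.values())
    expected = sum(
        freqs[x] * freqs[y] * p[(x, y)] for x, y in product(freqs, repeat=2)
    )
    return has_positive, expected


# Illustrative values: +1 for a match, -1 for a mismatch, uniform frequencies.
alphabet = "ACGT"
p = {(x, y): (1 if x == y else -1) for x, y in product(alphabet, repeat=2)}
freqs = {x: 0.25 for x in alphabet}

has_positive, expected = check_local_scoring(p, freqs)
suitable = has_positive and expected < 0
```

With these values the 4 matching pairs contribute 4 * (1/16) * 1 and the 12 mismatching pairs contribute 12 * (1/16) * (-1), giving an expected score of -0.5, so both conditions hold. If the expected score were positive, extending an alignment through unrelated sequence would tend to increase its score and the zero option in the recurrence would rarely be chosen.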
Last modified: September 3 2009.
Maintained by Dannie Durand (durand@cs.cmu.edu) and Annette McLeod.