03-511/711, 15-495/856 Course Notes

03-511/711, 15-495/856 Course Notes - Sept 7, 2010

Pairwise alignment continued

Affine gap functions

Empirically, we tend to observe that indels occur in consecutive blocks. This makes intuitive sense: it is plausible that a fragment encoding a structural or functional module will be inserted in a single event.

However, the gap function discussed in previous lectures does not prefer blocks of gaps. The cost of an alignment with $i$ indels is the same regardless of whether the indels form a single block or are scattered throughout the alignment.

An alternative is to use a more complex gap function, one in which the cost of adding another indel to an existing gap is lower than the cost of initiating a new gap. A broad class of functions has this property.

The function most often used in practice is the affine gap function:

  w(i) = h + i g.

Here, h is the gap initiation (or gap opening) penalty, g is the gap extension penalty, and i is the length of the gap. This function performs fairly well in practice and is easily implemented in an O(n²) algorithm. Details are given in Setubal and Meidanis, pp. 64-66, available through electronic reserves.
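As a minimal sketch of how the O(n²) computation might look, the following Python function fills three matrices a, b and c using recurrences of the kind described in Setubal and Meidanis. The function name, the match and mismatch scores, and the default values h = 2 and g = 1 are illustrative assumptions, not values from the lecture. Here a is taken to hold alignments ending in a pair of residues, b those ending with a gap in s, and c those ending with a gap in t. The sketch returns only the optimal score and omits the traceback.

```python
NEG_INF = float("-inf")

def affine_score(s, t, match=1, mismatch=-1, h=2, g=1):
    """Optimal global alignment score with gap penalty w(i) = h + i*g.

    All scoring parameters are assumed values chosen for illustration.
    """
    m, n = len(s), len(t)
    # a: last column pairs s[i] with t[j]
    # b: last column pairs a gap (in s) with t[j]
    # c: last column pairs s[i] with a gap (in t)
    a = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    b = [[NEG_INF] * (n + 1) for _ in range(m + 1)]
    c = [[NEG_INF] * (n + 1) for _ in range(m + 1)]

    a[0][0] = 0
    for j in range(1, n + 1):
        b[0][j] = -(h + g * j)   # one leading gap of length j in s
    for i in range(1, m + 1):
        c[i][0] = -(h + g * i)   # one leading gap of length i in t

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            p = match if s[i - 1] == t[j - 1] else mismatch
            a[i][j] = p + max(a[i - 1][j - 1], b[i - 1][j - 1], c[i - 1][j - 1])
            # Extending an existing gap costs g; opening a new one costs h + g.
            b[i][j] = max(a[i][j - 1] - (h + g),
                          b[i][j - 1] - g,
                          c[i][j - 1] - (h + g))
            c[i][j] = max(a[i - 1][j] - (h + g),
                          b[i - 1][j] - (h + g),
                          c[i - 1][j] - g)

    return max(a[m][n], b[m][n], c[m][n])
```

With these assumed scores, affine_score("BAA", "BANANA") evaluates to -2: three matches, minus one gap of length 3 costing h + 3g = 5.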

In class, we worked through an example, aligning "BAA" with "BANANA". In this example, the matrices are color-coded (a: white, b: orange, c: blue). The color of a cell indicates the matrix from which the last transition was made. For example, if cell a[i,j] is orange, then the previous cell was b[i-1,j-1].
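The color coding amounts to recording, for each cell, which matrix supplied the maximum. A minimal sketch for a single cell of matrix a is given below. The function name, the tie-breaking rule (the first of a, b, c wins) and the sample values in the usage note are assumptions made for illustration.

```python
# Color assigned to each source matrix, as in the class example.
COLORS = {"a": "white", "b": "orange", "c": "blue"}

def fill_a_cell(p, diag_a, diag_b, diag_c):
    """Return (score, color) for a[i][j].

    p is the score for pairing the two residues; diag_a, diag_b and
    diag_c are the values at [i-1][j-1] in matrices a, b and c.
    """
    candidates = {"a": diag_a, "b": diag_b, "c": diag_c}
    # Ties go to the first maximal key in insertion order (an assumption).
    source = max(candidates, key=candidates.get)
    return p + candidates[source], COLORS[source]
```

For instance, fill_a_cell(1, -3, 0, -5) returns (1, "orange"), since the best predecessor is the cell in matrix b.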

Note that with an affine gap function, the optimal alignment is

        BANANA
        BA___A

With a linear gap function, w(i) = i g, which has no gap opening penalty, other alignments with scattered indels, such as this one:

        BANANA
        B__A_A

would also have the optimal score.
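
This comparison can be checked by scoring the two alignments directly. The sketch below is illustrative only: the function name, the match score of +1, the mismatch score of -1, and the values h = 2 and g = 1 are assumptions, not values used in class. A linear gap function is obtained by setting h = 0.

```python
from itertools import groupby

def alignment_score(top, bottom, h, g, match=1, mismatch=-1, gap="_"):
    """Score a given alignment with gap penalty w(i) = h + i*g per gap.

    Assumes both rows have equal length and no column holds two gaps.
    """
    assert len(top) == len(bottom)
    score = 0
    labels = []
    for x, y in zip(top, bottom):
        if x == gap:
            labels.append("gap in top")
        elif y == gap:
            labels.append("gap in bottom")
        else:
            labels.append("pair")
            score += match if x == y else mismatch
    # Each maximal run of gap columns in one row is one gap of that length.
    for kind, run in groupby(labels):
        if kind != "pair":
            score -= h + g * len(list(run))
    return score

affine_block = alignment_score("BANANA", "BA___A", h=2, g=1)
affine_scattered = alignment_score("BANANA", "B__A_A", h=2, g=1)
linear_block = alignment_score("BANANA", "BA___A", h=0, g=1)
linear_scattered = alignment_score("BANANA", "B__A_A", h=0, g=1)
```

Under these assumed parameters, the affine scores are -2 for the single block and -4 for the scattered indels, whereas the linear scores are both 0.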

Last modified: September 7, 2010.
Maintained by Dannie Durand (durand@cs.cmu.edu) and Annette McLeod.