Patrice Koehl
Department of Computer Science
Genome Center
Room 4319, Genome Center, GBSF
451 East Health Sciences Drive
University of California
Davis, CA 95616
Phone: (530) 754 5121

Computational Structural Biology: Winter 2024

Sequence Alignment

A sequence alignment is a way of arranging the sequences of DNA, RNA, or protein molecules to identify regions of similarity that may reveal functional, structural, or evolutionary relationships between the sequences. If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as insertion or deletion mutations. In sequence alignments of proteins, the degree of similarity between amino acids occupying a particular position in the sequence can be interpreted as a rough measure of how conserved a particular region is.
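As a small illustration of how the columns of a pairwise alignment are read, the sketch below classifies each column as a match, a mismatch (a candidate point mutation), or a gap (a candidate insertion or deletion). The function name `column_summary`, the `-` gap character, and the two example rows are assumptions made for this illustration; they are not taken from the lecture material.

```python
def column_summary(row_a: str, row_b: str, gap: str = "-") -> dict:
    """Count matches, mismatches and gap columns in an aligned pair.

    Both rows must already be aligned, so they have the same length
    and use `gap` wherever a residue is missing.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")

    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            counts["gap"] += 1        # insertion or deletion
        elif a == b:
            counts["match"] += 1      # conserved position
        else:
            counts["mismatch"] += 1   # candidate point mutation
    return counts


# Hypothetical aligned rows, for illustration only.
summary = column_summary("GATTACA-", "GACT-CAT")
print(summary)  # {'match': 5, 'mismatch': 1, 'gap': 2}
```

For proteins, a substitution matrix would normally replace the plain match/mismatch test, so that similar amino acids count as closer than dissimilar ones.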

Very short or very similar sequences can be aligned by hand. However, most interesting problems require the alignment of long, often highly variable, and numerous sequences, which cannot be aligned by hand. Human knowledge is instead applied in constructing algorithms to produce high-quality sequence alignments, and occasionally in adjusting the final results to reflect patterns that are difficult to represent algorithmically (especially in the case of nucleotide sequences). Computational approaches to sequence alignment generally fall into two categories: global alignments and local alignments. Calculating a global alignment is a form of global optimization that forces the alignment to span the entire length of all query sequences. By contrast, local alignments identify regions of similarity within long sequences that are often widely divergent overall.
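The difference between the two categories can be sketched with a single dynamic-programming routine. The code below is a minimal illustration, not the method used in the lecture notes: it uses the classic Needleman-Wunsch recurrence for the global score and the Smith-Waterman variant for the local score. The function name `alignment_score`, the scoring values (match +1, mismatch -1, linear gap -1), and the example sequences are all assumptions chosen for the example.

```python
def alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -1, local: bool = False) -> int:
    """Return the optimal pairwise alignment score of a and b.

    local=False: global score (Needleman-Wunsch recurrence).
    local=True:  local score (Smith-Waterman recurrence).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # A global alignment pays for gaps at the sequence ends.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # A local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell

    # Global: the score must cover both full sequences (bottom-right cell).
    # Local: the best-scoring cell anywhere in the table.
    return best if local else score[-1][-1]


# Hypothetical sequences sharing the region GATTACA inside unrelated flanks.
seq1 = "TTTGATTACAGGG"
seq2 = "CCGATTACACC"
print(alignment_score(seq1, seq2, local=True))   # 7
print(alignment_score(seq1, seq2, local=False))  # lower: the flanks are penalized
```

The local score recovers the shared region alone, while the global score is pulled down by the dissimilar flanks that a global alignment is forced to include. Recovering the alignment itself, rather than only its score, would require a traceback through the table.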

Homology: paralogs and orthologs

Lecture Notes

Download document:

Powerpoint document (click to download)
PDF document (click to download)
PDF document: 3 slides/page (click to download)

Further Reading

  Page last modified 29 February 2024