ECS 224 STRING ALGORITHMS AND APPLICATIONS IN COMPUTATIONAL BIOLOGY (4) I
Lecture: 3 hours
Discussion: 1 hour
Prerequisite: Course ECS 122A
Grading: Letter; homework (30%), project (25%), final (45%)
Algorithms that operate on strings. Pattern matching, sets of patterns, regular expression pattern matching, suffix trees and applications, inexact similarity, parametric sequence alignment, applications to DNA sequencing and protein database searching. Offered in alternate years.
The purpose of the course is to introduce important techniques in pattern matching and pattern discovery using combinatorial algorithms that operate on strings. This area has recently become of great interest because of its utility in computational biology, particularly in sequence database searching. This course will focus on computer science techniques with an eye towards their productive application in biology. No background in biology will be assumed.
Expanded Course Description:
- Exact matching using the Z and the Boyer-Moore algorithms
- Suffix trees and linear time algorithms for computing them
- Applications of suffix trees in pattern matching and computational biology
- Inexact matching, edit distance and alignment
- Accelerations for inexact matching
- Additional alignment models for biological applications of inexact matching. Gap questions.
- Hybrid dynamic programming — integrating suffix trees with inexact matching
- Database searching
- Applications of string algorithms in DNA sequencing problems
- String algorithms in phylogenetic reconstruction
The project involves some computer implementation of an important algorithm or the theoretical development of some new method.
D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge Press, 1997
Instructor: D. Gusfield
Prepared by: D. Gusfield (September 2002)
There is no significant overlap with any other course.