Specific Curriculum proposals

We should develop additional undergraduate and graduate courses in bioinformatics and genomics. The syllabus below describes only one of several possible courses: an undergraduate course to be taught for the first time in spring 2000.

ECS 124 THEORY AND PRACTICE OF BIOINFORMATICS (4) III

Lecture: 3 hours

Laboratory: 1 hour

Prerequisites: Course CSE 10 or 30 or E 5 or E 6; Stat 12 or 13 or 32 or 100 or Math 131/Stat 131A; Bio Sci 1A or 101 or MCB 10

Grading: Letter; 5-7 homework/laboratory sets (60

Catalog Description:

Fundamental biological, mathematical and algorithmic models underlying bioinformatics; sequence analysis, database search, gene prediction, molecular structure comparison and prediction, phylogenetic trees, high throughput biology, massive datasets; applications in molecular biology and genetics; use and extension of common bioinformatics tools.

Goals:

I. Understanding the role and utility of bioinformatics in modern biology

II. Understanding basic biological, mathematical and algorithmic concepts, techniques and models underlying bioinformatics tools

III. Mastery of common bioinformatics tools

IV. Simple programming in Perl or Java to extend the utility of common bioinformatics tools

Expanded Course Description:

I. Initial examples of the power of bioinformatics in modern biology
   A. The importance of sequence and structure comparison and of database search
   B. The use of sequence analysis in laboratory protocols
   C. The use of phylogenetics in evolution and non-evolutionary areas of biology

II. Sequence analysis
   A. Probabilistic and biological models underlying sequence alignment
   B. Computational efficiency and the need for compromises in the models
   C. The general technique of Dynamic Programming
   D. Pairwise sequence alignment - algorithms for global, local alignment and variations
   E. Algorithms for multiple sequence alignment and the identification/use of motifs
   F. Database search - FASTA, BLAST, PSI-BLAST, scoring matrices, statistical significance and its significance
   G. Creation and use of motif models
   H. Novel uses of sequence analysis in studying DNA, RNA and proteins
   I. Sequence analysis in genomics and high throughput biology

III. Phylogenetic algorithms
   A. Probabilistic and ideal-data models underlying phylogenetic algorithms
   B. Distance-based methods
   C. Character/parsimony-based methods
   D. Maximum-likelihood methods
   E. PAUP, PHYLIP
   F. Evolutionary and non-evolutionary uses for phylogenetics
   G. The interaction of phylogenetics and sequence analysis

IV. Protein and RNA structure comparison and prediction
   A. Ideal-data models underlying structure comparison and prediction
   B. Algorithms for RNA folding
   C. Methods and problems in protein structure comparison and prediction
   D. Biological use of structure prediction and comparison tools

V. Overview of common bioinformatics utilities and web-based resources such as GCG and Entrez
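
As an illustration of the dynamic programming technique named in items II.C and II.D, the following is a minimal sketch of global pairwise alignment scoring. It is written in Python for brevity, although the course itself names Perl or Java. The function name and the match, mismatch and gap values are illustrative assumptions and are not part of the syllabus.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of strings a and b.

    Uses a linear gap penalty. The scoring values are illustrative
    assumptions, not values prescribed by the course.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Either align a[i-1] with b[j-1], or put a gap in one sequence.
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = score[i - 1][j - 1] + pair
            gap_in_b = score[i - 1][j] + gap
            gap_in_a = score[i][j - 1] + gap
            score[i][j] = max(diagonal, gap_in_b, gap_in_a)

    return score[-1][-1]
```

The table has (len(a) + 1) x (len(b) + 1) entries and each entry is filled in constant time, which is the kind of efficiency consideration item II.B refers to. Local alignment differs mainly in flooring each entry at zero and taking the maximum over the whole table.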

Textbooks:

Required Texts

R. Durbin et al., Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998

A. Baxevanis and B. Ouellette, Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley-Interscience, 1998

Supplemental Texts

M. Bishop and C. Rawlings, DNA and Protein Sequence Analysis: A Practical Approach, IRL Press, 1997

D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, 1997

Supplemental readings and notes distributed in class

Homework:

Each homework set includes creative problems as well as recitation problems, both to strengthen understanding and to lead students to discover new material.

Computer Usage:

The lab portion of the class will emphasize practical computer exercises, both using established bioinformatics software and writing simple programs in Perl or Java.
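
To give a sense of what a "simple program" at this level might look like, the following is a hypothetical sketch that reads sequences in FASTA format and reports their GC content. It is written in Python for brevity, although the course names Perl or Java. The function names and the sample data are assumptions made for illustration, not assigned course material.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header to sequence.

    A header line starts with '>'; the lines that follow, up to the next
    header, are joined into one sequence. Blank lines are ignored.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(seq):
    """Return the fraction of G and C characters in seq (0.0 if empty)."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = ">seq1 example\nACGT\nGGCC\n>seq2\nATAT\n"
    for name, seq in parse_fasta(sample).items():
        print(name, len(seq), round(gc_content(seq), 2))
```

A program of this size takes output in a standard format and computes something the original tool did not report, which is the sense of "extending" a tool used in the goals above.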

ABET Category Content:

Engineering Science: 1 unit

Engineering Design: 0 units

Instructor: Dan Gusfield

Prepared by: Dan Gusfield (April 1999)

The course is aimed at both biology and computer science students. The typical biology student is expected to have a stronger background in molecular biology, genetics and biochemistry (not listed as a prerequisite) than the prerequisite list reflects, and the typical computer science student is expected to have a stronger background in programming and mathematics than the prerequisites require. Some of the laboratory assignments will be done by groups that mix biology and computer science students, whose backgrounds should complement each other.

The laboratory portions of the course will teach hands-on use of the computer tools, while the lectures will focus on the fundamental biological, mathematical and algorithmic chain of reasoning leading to the models that underlie these tools. Thus, the course requires some sophistication in mathematics and some intuitive understanding of what can be accomplished by computer programming, but it does not require an extensive background in programming.

Dan Gusfield
1999-11-03