Reading List
Week 1: Intro to Genomics, Biotechnologies, CS for Biologists
Week 2: Gene Expression Data Analysis, Classification
- Old Lecture Notes: Analysis, Classification
- Lecture Papers:
- Additional Material:
- "Statistical Analysis of Gene Expression Microarray Data,"
Terry Speed editor, Chapman and Hall, 2003 (a text covering both
statistical analysis and data mining)
- Hughes et al., Expression profiling using microarrays fabricated by
an ink-jet oligonucleotide synthesizer, Nature Biotech. 2001, 19,
342-347 (array design)
- Kerr, Churchill, Experimental Design for Gene Expression
Microarrays, Biostatistics 2001, 2, 183-201 (experiment design)
- Visit Terry Speed's Microarray
Data Analysis Group Page for a number of great papers/software (take
a look at their Always Log! page in Hints/Prejudices there)
- Yang et al., Normalization for cDNA microarray data: a robust
composite method addressing single and multiple slide systematic
variation, NAR, 2002, 30, e15. (loess or lowess normalization)
- Durbin et al., A variance-stabilizing transformation for
gene-expression data, Bioinformatics, 2002, 18, 105-110 (statistical
treatment)
- Dudoit et al., Statistical methods for identifying differentially
expressed genes in replicated cDNA microarray experiments, Tech. Report
578, Stats Dept., UC Berkeley, 2000 (Differential Expression,
Multiple testing: FWER)
- Tusher et al., Significance testing of microarrays applied to the
ionizing radiation response, PNAS, 2001, 98, 5116-5121 (Multiple
testing: FDR)
- Troyanskaya et al., Missing value estimation methods for DNA
microarrays, Bioinformatics, 2001, 17(6), 520-525 (missing data is a
very important concern in microarray data analysis)
- Dudoit et al. (2002), "Comparison of discrimination methods
for the classification of tumors using gene expression data", Journal
of the American Statistical Association, 97(457):77-87. (Classification)
- Pomeroy et al. (2002), "Prediction of central nervous
system embryonal tumour outcome based on gene expression", Nature,
415:436-442. (Classification, various classification methods used)
- Software: Many of the analysis and classification methods are
implemented in R (see resources)
Week 3: Presentations
Week 4: Clustering
- Lecture Notes
- Old Lecture Notes: Clustering
- Lecture Papers:
- Additional Reading:
- The above book by T. Speed.
- Alon et al.(1999), "Broad patterns of gene expression
revealed by clustering analysis of tumor and normal colon tissues
probed by oligonucleotide arrays", Proceedings of the National Academy
of Sciences, 96(12):6745-6750 (nice application paper)
- Ross et al. (2000), "Systematic variation in gene expression patterns in human cancer cell lines", Nature Genetics,
24(3):227-235 (nice application paper)
- Tamayo et al. (1999), "Interpreting patterns of gene expression with
self-organizing maps", PNAS, 96, 2907-2912. (first paper using
self-organizing maps to cluster microarray data)
- Sharan and Shamir (2000), "CLICK: A clustering algorithm
with applications to gene expression analysis", Proceedings of the
Eighth International Conference on Intelligent Systems for Molecular
Biology (AAAI Press), pp.307-316. (Graph theoretic clustering)
- Cheng and Church (2000), "Biclustering of expression data",
Proceedings of the Eighth International Conference on Intelligent
Systems for Molecular Biology (AAAI Press), pp.93-103. (first biclustering paper on microarray data)
- Madeira and Oliveira (2004), "Biclustering Algorithms for
Biological Data Analysis: A Survey," IEEE/ACM Transactions on
Computational Biology and Bioinformatics 1(1): 24 - 45 (a survey of biclustering methods)
- Getz et al. (2000), "Coupled Two Way Clustering (CTWC)," PNAS 97, 12079
- Bergmann et al. (2003), "Iterative signature
algorithm for the analysis of large-scale gene expression data,"
Physical Review E 67, 031902
- Segal et al. (2004), "A module map showing conditional activity of
expression modules in cancer," Nature Genetics, 36(10):1090-1098
- Microarray Data Mining Tools Based on Clustering:
- R, Bioconductor (in Resources on the class web page)
- Expander
- GeneXPress
- for others see Resources on the class web page
Week 4: Promoter Region Analysis
Week 5: Data Integration I: Expression + Promoter Region
Week 5: Data Integration II: Expression, TF DNA, PPI and others
Week 6: Presentations
Week 7 & 8: Biological Networks, Gene Network Modeling and Inference
Week 9: Presentations
Week 10: Data Integration III: Towards Networks