Reading List
Week 1 (Apr. 5, 7): Intro to Genomics, Biotechnologies
- In-class handouts
- Lecture Notes: 1, 2, 3
Week 2 (Apr. 12, 14): Gene Expression Data Analysis, Classification
- Old Lecture Notes: Analysis, Classification
- Lecture Papers:
- Additional Material:
- "Statistical Analysis of Gene Expression Microarray Data,"
Terry Speed, editor, Chapman and Hall, 2003 (a text covering both
statistical analysis and data mining)
- Hughes et al., Expression profiling using microarrays fabricated by
an ink-jet oligonucleotide synthesizer, Nature Biotech. 2001, 19,
342-347 (array design)
- Kerr, Churchill, Experimental Design for Gene Expression
Microarrays, Biostatistics 2001, 2, 183-201 (experiment design)
- Visit Terry Speed's Microarray
Data Analysis Group Page for a number of great papers/software (take
a look at their Always Log! page in Hints/Prejudices there)
- Yang et al., Normalization for cDNA microarray data: a robust
composite method addressing single and multiple slide systematic
variation, NAR, 2002, 30, e15 (loess or lowess normalization)
- Durbin et al., A variance-stabilizing transformation for
gene-expression data, Bioinformatics, 2002, 18, 105-110 (statistical
treatment)
- Dudoit et al., Statistical methods for identifying differentially
expressed genes in replicated cDNA microarray experiments, Tech. Report
578, Stats Dept., UC Berkeley, 2000 (Differential Expression,
Multiple testing: FWER)
- Tusher et al., Significance testing of microarrays applied to the
ionizing radiation response, PNAS, 2001, 98, 5116-5121 (Multiple
testing: FDR)
- Troyanskaya et al., Missing value estimation methods for DNA microarrays, Bioinformatics 2001 Jun;17(6):520-5.
(Missing data is a very important concern in microarray data
analysis)
- Dudoit et al. (2002), "Comparison of discrimination methods for the classification of tumors using gene expression data", Journal of the American Statistical Association, 97(457):77-87.
(Classification)
- Pomeroy et al. (2002), "Prediction of central nervous system embryonal tumour outcome based on gene expression", Nature, 415:436-442.
(Classification; several classification methods are compared)
- Software: Many of the analysis and classification methods are
implemented in R (see Resources on the class web page)
Week 3: Presentations
Week 4 (Apr. 26): Clustering
- Lecture Notes
- Old Lecture Notes: Clustering
- Lecture Papers:
- Additional Reading:
- The above book by T. Speed.
- Alon et al. (1999), "Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays", Proceedings of the National Academy of Sciences,
96(12):6745-6750 (nice application paper)
- Ross et al. (2000), "Systematic variation in gene expression patterns in human cancer cell lines", Nature Genetics,
24(3):227-235 (nice application paper)
- Tamayo et al. (1999), "Interpreting patterns of gene expression with self-organizing maps",
PNAS, 96:2907-2912 (first paper using self-organizing maps to
cluster microarray data)
- Sharan and Shamir (2000), "CLICK: A clustering algorithm with applications to gene expression analysis", Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (AAAI Press), pp. 307-316.
(Graph-theoretic clustering)
- Cheng and Church (2000), "Biclustering of expression data", Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
(AAAI Press), pp. 93-103.
(First biclustering paper on microarray data)
- Madeira and Oliveira (2004), "Biclustering Algorithms for Biological Data Analysis: A Survey," IEEE/ACM Transactions on Computational Biology and Bioinformatics
1(1): 24 - 45 (a survey of biclustering methods)
- Getz et al. (2000), "Coupled Two Way Clustering (CTWC)," PNAS 97, 12079
- Bergmann et al. (2003), "Iterative signature
algorithm for the analysis of large-scale gene expression data,"
Physical Review E 67, 031902
- Segal et al. (2004), "A module map showing conditional activity of expression modules in
cancer," Nature Genetics, 36(10):1090-8
- Microarray Data Mining Tools Based on Clustering:
- R, Bioconductor (in Resources on the class web page)
- Expander
- GeneXPress
- for others see Resources on the class web page
Week 4 (Apr. 28): Promoter Region Analysis
Week 5 (May 3): Data Integration I: Expression + Promoter Region
Week 5 (May 5): Data Integration II: Expression, TF DNA, PPI and others
Week 6 (May 10, 12): Presentations
Weeks 7 & 8 (May 19, 24): Biological Networks, Gene Network Modeling and Inference
Week 9 (May 26, 31): Presentations
Week 10 (June 2): Data Integration III: Towards Networks