Date
|
Topics
|
Slides/Lecture notes
|
Add'l resources
|
8 / 31
|
Introduction. Molecular biology primer.
|
Homework 1
Lecture notes 1
|
|
9 / 5
|
Molecular biology primer.
|
Lecture notes 2
|
|
9 / 7
|
Exact matching |
Gusfield chap. 1
Homework 2
|
|
9 / 12
|
Boyer-Moore algorithm |
Gusfield chap. 2
|
|
9 / 14
|
Proof that Boyer-Moore runs in linear time |
Gusfield chap. 3
Homework 3
|
|
9 / 19
|
Proof that Boyer-Moore runs in linear time
KMP algorithm |
Gusfield chap. 3
Gusfield chap. 2
|
|
9 / 21
|
KMP algorithm
Aho-Corasick algorithm
Proof Aho-Corasick construction is linear time |
Gusfield chap. 2
Gusfield chap. 3
Homework 4
|
|
9 / 26
|
Proof Aho-Corasick construction is linear time
Aho-Corasick if patterns can be substrings of another patterns
Matching with don't care symbols
Suffix trees - introduction
Using suffix trees for exact matching |
Gusfield chap. 3
Gusfield chap. 6
|
|
9 / 28
|
Suffix trees - Ukkonen's construction |
Gusfield chap. 6
Homework 5
|
|
10 / 3
|
Suffix trees - Ukkonen's construction cont'd. |
Gusfield chap. 6
|
Giegerich and Kurtz, Algorithmica, 19:331 (1997)
|
10 / 5
|
Suffix trees - implementation details
Suffix trees - applications |
Gusfield chap. 6
Gusfield chap. 7
Project 1 handed out
(no homework)
|
Note: Project 1 assumes all sequences are "forward" (no need to reverse complement)
Text.fa
Pattern.fa
|
10 / 10
|
Suffix arrays
Data compression - Lempel-Ziv, Burrows-Wheeler |
Gusfield chap. 7
|
Manber, U. and Myers, G. SODA, 1990.
Abouelhoda, Kurtz, Ohlebusch . WABI 2002
Wikipedia Lempel-Ziv entry
Wikipedia Burrows-Wheeler entry
|
10 / 12
|
Inexact matching (edit distance, global alignment) |
Gusfield chap. 11
Homework 6
|
|
10 / 17
|
Inexact matching (sequence similarity, affine gap scores) |
Gusfield chap. 11
|
|
10 / 19
|
Inexact matching (heuristics: linear space, bounded error, chaining algorithm) |
Gusfield chap. 12
Gusfield chap. 13
no homework
|
|
10 / 24
|
Inexact matching (heuristics: MUMmer, FASTA, BLAST, minimizers, significance scores), p-values/e-values) |
|
Delcher et. al. NAR, 1999 (MUMmer)
Delcher et al. NAR, 2002 (MUMmer)
Lipman, Pearson. Science, 1985 (FASTA)
Pearson, Lipman. PNAS, 1988 (FASTA)
Altshul et al. J. Mol. Bio, 1990 (BLAST)
Roberts et al. Bioinformatics, 2004 (Minimizers)
|
10 / 26
|
Inexact matching (significance scores, p-values/e-values)
Multiple sequence alignment |
Durbin, 2.1, 2.7, 2.8
Gusfield chap. 14
Gusfield chap. 15
Homework 7
|
Korf, Yandell, Bedell. BLAST, O'Reilly, 2003. chap. 4
|
10 / 31
|
Multiple sequence alignment |
Gusfield chap. 14
Project 1 due
|
|
11 / 2
|
Midterm exam
|
|
|
11 / 7
|
Multiple sequence alignment |
Gusfield chap. 14
Final project handed out
|
|
11 / 9
|
Multiple sequence alignment |
Gusfield chap. 14
Homework 8
|
|
11 / 14
|
Multiple sequence alignment - practical considerations (Muscle, ClustalW, ...)
Phylogenetic trees (intro) |
Durbin, chap. 7
|
Anson, Myers. J. Comp. Bio. 1997
Gribskov et al. PNAS 1987
Thompson et al. NAR, 1994
Edgar, NAR 2004
Edgar, BMC Bioinformatics 2004
Lee et al. Bioinformatics 2002
Lee, Bioinformatics 2003
Grasso et al. Bioinformatics 2003
|
11 / 16
|
Project proposal defense |
Project proposals due
Homework 9
|
|
11 / 21
|
Phylogenetic trees (distance based methods) |
Durbin, chap. 7
|
|
11 / 23
|
Thanksgiving (no class)
|
|
Additional reading
|
11 / 28
|
Phylogenetic trees (maximum likelihood) |
Durbin, chap. 8
|
|
11 / 30
|
RNA folding |
Durbin, chap. 10
|
Zuker, Science 1989
|
12 / 5
|
Protein folding (on a lattice) |
Lecture notes 25
|
Hart, Istrail, STOC 1995
Berger, Leighton, RECOMB 1998
|
12 / 7
|
Protein folding (threading) |
Lecture notes 26
|
Bowie, Luthy, Eisenberg, Science 1991
Lathrop, Protein Eng., 1994
Lathrop, Smith, 1996
|
12 / 12
|
Protein folding, recap, Q&A |
Lecture notes 27
Final project due
|
|
12/14
|
Final exam (8-10am)
|
GOOD LUCK!
|
|