Course objectives: Cover interesting algorithms and methods for the
analysis of biological data. We will cover string matching algorithms, string
searching, string pattern finding (gene finding, discovery of protein binding
sites), genome assembly, phylogenetics, and several topics of current research
interest in bioinformatics.
Class time: Tue/Thu 12:30pm-1:45pm in CSIC 2107.
Professor: Carl Kingsford, Office: CBCB 3113. Email: carlk AT cs.
Office hours: Wednesdays, 11:00-noon in CBCB 3113. If you
cannot attend office hours, email me about scheduling a different time.
Grades: will be posted at http://grades.cs.umd.edu
TA: Darya Filippova (dfilippo AT cs.umd.edu)
TA Office Hours: Mondays 1-3 in CBCB 3118, and
Thursdays 10-noon in the office hours room on the first floor of AVW. If you cannot get into the building, please use the call box to
call Denise Cross (035) or Carl (032).
Announcements:
- Some collected bioinformatics lectures
- Homework 4
- Partner Evaluation Form
- Python code for Gibbs sampling
- Solution to Project 1 (see email for password)
- Project #2 is posted. It is due on Dec 8 at 11:59pm.
- Project clarifications, etc.:
- You should create the given output file --- it might not exist. If
it does exist, just overwrite it.
- When computing the alignment SP-score to output on the "SP-score" line, use
the rule on page 10 of this lecture. That is: compute the SP-score as the sum,
over all columns, of the substitution scores between all pairs of characters in
that column. This will ignore the gap open costs. You should still use the gap
open costs when computing the pairwise alignments.
- To ensure your tests run correctly on the submit server, please: (a) put
your main() function in a class "Main" in the package
edu.umd.cbcb.align. The "starter files" on the submit server include a
template file where you can put your main function (or a call to your existing
main function). (b) don't use System.exit() or uncaught exceptions to exit
your program prematurely. Your main() function should return normally (since
the testing code will only run after your main function returns). (c) you have
to submit your source files, not only your compiled .class files. Submit a
zipped folder with your source code and package hierarchy, and the submit
server will automatically compile and test your code. I encourage you to
submit something early to make sure your submission works with the submit
server.
- When you are constructing the progressive alignment, you can sometimes have
several legal ways of preserving the mapping I talked about today in class. For
example, suppose your pairwise alignments are:
SC: cat-----hat SC: cat--hat
S1: catinthehat S2: catinhat
When adding S2 to the SC/S1 alignment, you could either have:
SC: cat-----hat
S1: catinthehat
S2: catin---hat
or
SC: cat-----hat
S1: catinthehat
S2: cat---inhat
or any other placement of "in" within the gap, as all preserve the alignments
with the center sequence SC. Any such alignment is correct.
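The "cat in the hat" example above can be reproduced with a short Python sketch. It is only an illustration of the idea, not the required project solution: the function name add_to_alignment and its arguments are hypothetical, and it assumes row 0 of the growing alignment is the center sequence. When both the existing alignment and the new pairwise alignment have a gap in the center at the same point, this sketch lets them share a column, which yields the first of the placements shown above; other placements are equally legal.

```python
def add_to_alignment(msa, center_aln, other_aln, gap="-"):
    """Add one sequence to a center-based multiple alignment.

    msa        -- list of equal-length rows; msa[0] is the center sequence row
    center_aln -- the center sequence as gapped in the new pairwise alignment
    other_aln  -- the new sequence as gapped in that same pairwise alignment
    Returns the new list of rows, with the new sequence last.
    """
    merged = [[] for _ in range(len(msa) + 1)]
    i = j = 0  # i walks the columns of msa, j walks the pairwise alignment
    width, n = len(msa[0]), len(center_aln)
    while i < width or j < n:
        msa_gap = i < width and msa[0][i] == gap
        new_gap = j < n and center_aln[j] == gap
        if i < width and j < n and msa_gap == new_gap:
            # Both are center characters, or both are gaps: share one column.
            if not msa_gap and msa[0][i] != center_aln[j]:
                raise ValueError("center sequences do not match")
            take_msa, take_new = True, True
        elif msa_gap:
            # Only the existing alignment has a gap here: pad the new row.
            take_msa, take_new = True, False
        elif new_gap:
            # Only the new pairwise alignment has a gap here: pad old rows.
            take_msa, take_new = False, True
        else:
            raise ValueError("center sequences do not match")
        for r, row in enumerate(msa):
            merged[r].append(row[i] if take_msa else gap)
        merged[-1].append(other_aln[j] if take_new else gap)
        i += 1 if take_msa else 0
        j += 1 if take_new else 0
    return ["".join(row) for row in merged]


if __name__ == "__main__":
    rows = add_to_alignment(["cat-----hat", "catinthehat"], "cat--hat", "catinhat")
    print("\n".join(rows))
```

Every existing row keeps its columns in order and the new row keeps its pairing with the center, so the mapping to SC is preserved no matter the order in which sequences are added.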
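The SP-score rule from the clarifications above (sum over all columns of the substitution score between all pairs of characters in the column) can be sketched in Python as follows. The names sp_score and toy_score are hypothetical. The toy scoring used here (+1 match, -1 mismatch, -1 character against gap, 0 gap against gap) is purely an assumption for illustration; the project's actual substitution scores and its treatment of gap characters come from the assignment and the lecture, not from this sketch.

```python
from itertools import combinations


def sp_score(alignment, sub_score):
    """Sum-of-pairs score: for each column, add sub_score(a, b) over all
    unordered pairs of rows. Gap open costs are deliberately not included."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of the alignment must have the same length")
    total = 0
    for column in zip(*alignment):           # iterate column by column
        for a, b in combinations(column, 2):  # every pair of rows in the column
            total += sub_score(a, b)
    return total


def toy_score(a, b, gap="-"):
    """Assumed scoring for illustration only; replace with the real matrix."""
    if a == gap and b == gap:
        return 0
    if a == gap or b == gap:
        return -1
    return 1 if a == b else -1


if __name__ == "__main__":
    first = ["cat-----hat", "catinthehat", "catin---hat"]
    second = ["cat-----hat", "catinthehat", "cat---inhat"]
    print(sp_score(first, toy_score))   # 10 under the toy scoring
    print(sp_score(second, toy_score))  # 6 under the toy scoring
```

Under this toy scoring the two placements of "in" get different scores even though both are correct alignments.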
- Homework #3 is posted and is due on Nov 17 at the start of class.
- Project #1 is posted. It is due on Nov 15 at 11:59pm.
- Darya's office hours today (Monday, Oct 17) will be shifted one hour later, to 2-4pm.
- Answers to some of the homework problems are posted
- Homework 2 is posted, and is due Oct 6 at the start of class.
- Office hours have been changed from what was originally posted. They are now:
Darya: Mondays, 1-3pm in CBCB 3118 and Thursdays 10-noon in AVW TA room.
Carl: Wednesdays, 11-noon in CBCB 3113
- Homework 1 is posted, and due Sept 22 at the start of class.
- Extra Credit (due Thursday, Sep. 15): send me up to 10 word pairs that both
(a) have an interesting alignment (gaps, mismatches, unexpected matches) and
(b) have some clever association with each other. Some that have already been submitted:
gattaca/genetical;
transsubstantiation/superstition;
unsinkabletitanic/hunkofice;
computerscience/liberalarts;
goldmansachs/lehmanbrothers;
underarmour/nike;
bioinformatics/algorithms;
fox news/faux news;
harrypotter/wizard;
einstein/brainiac;
starcraft/terriblewasteoftime;
programming/painful;
money/power;
primal/dual;
omnipotent/omniscient;
grill/skillet;
mathematics/computer science;
dissertation/thesis;
symmetry/dihedral;
traffic/pain;
submodular tree cover/polymatroid steiner tree;
nash equilibrium/dominant strategy
Handouts:
- Homework 2 (due Oct 6 at the start of class)
- Homework 1 (due Sept 22 at the start of class)
- The syllabus can be found here.
Demos & Code:
Optional Reading:
Reviews and Research papers where some of the techniques we have discussed were first introduced
will be posted here.
Lecture Slides: