CMSC858E: Algorithms for Biosequence Analysis (Fall 2006)
Instructor: Mihai Pop
Office hours: Mondays,
3120F Biomolecular Sciences Building (bldg #296). The building is
usually locked. Call me from the intercom and I'll buzz you in.
Qualifier status: counts as a PhD and MS qualifying course
in the Algorithms/Theory area.
This course will cover the algorithms and heuristics used in analyzing
biological sequences, with a focus on string
matching and alignment algorithms, and their application to
analysis. A particular emphasis will be placed on the
design of efficient algorithms and on techniques for analyzing the time
and space complexity of these algorithms. The course will
present the computational concepts in
the context of current biological applications and will provide CS
students with a basic overview of molecular biology.
To get a better idea of the topics covered you can view last
year's syllabus. Note that the last third of this
course will cover phylogenetic analysis and protein folding instead of
Hidden Markov Models and haplotype phasing.
This course is intended for graduate students with a strong background
in algorithms and data structures. Programming expertise is a
must. No background in biology is required. If you are
uncertain about meeting these requirements please contact me.
There are no required textbooks for this course. Additional
material will be provided, as needed, during the class. Most of the
material covered can be found in the following two books:
Algorithms on strings, trees,
Durbin, Eddy, Krogh,
Biological sequence analysis.
Both books have been placed on reserve at the library.
The course will cover the following main areas. A detailed syllabus is
- Introduction to molecular biology
- Sequence alignment: exact and inexact string matching,
- Phylogenetic tree construction
- Protein folding
Coursework and grading
Regular homework assignments will consist of a combination of one or
more of the following: (i) exercises from one of the textbooks; (ii)
small programming assignments; (iii) "discovery" exercises using
publicly available bioinformatics tools. In addition, all
students must complete two programming projects, the first selected by
the instructor and the second chosen by the students in consultation
with the instructor.
The final grade will be a combination of the grades for the homework,
the projects, and the midterm and final exams. In addition,
participation in the class will be taken into account for extra credit.
The breakdown of your final grade is shown below.
Homework - 10%
Project 1 - 15%
Project 2 - 15%
Midterm - 25%
Final - 35%
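For those who like to check the arithmetic, the breakdown above can be sketched as a weighted average. This is only an illustration, not the official grade calculation: the function name `final_grade` and the dictionary keys are made up for the example, each component is assumed to be scored on a 0-100 scale, and extra credit for participation is not modeled.

```python
# Weights copied from the breakdown above (percent of the final grade).
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "homework": 10,
    "project1": 15,
    "project2": 15,
    "midterm": 25,
    "final": 35,
}


def final_grade(scores):
    """Return the weighted final grade.

    Assumes every entry in `scores` is on a 0-100 scale and that
    all five components are present. Extra credit is ignored.
    """
    # Sanity check: the weights should add up to 100 percent.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {
        "homework": 90,
        "project1": 80,
        "project2": 100,
        "midterm": 70,
        "final": 80,
    }
    print(final_grade(example))
```

Note that the two exams together account for 60% of the grade under these weights, so the homework and projects alone cannot make up for weak exam scores.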
Unless otherwise indicated in class, most assignments will be given out
on Thursdays of each week and will be due at the beginning of
the Tuesday class. Remember, the office hours are on Mondays so
come by if you have any questions about your assignments.
Assignments submitted late will
be graded as follows: up to 1 day late - 10 points will be deducted
from the grade, up to 2 days late - 20 points will be deducted.
Your assignment will not be graded beyond the second day past the
deadline. If for reasons outside your control you will not be
able to submit an assignment on time, see me as soon as possible to discuss an alternate deadline.
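The late policy above can be sketched as a small function. This is a rough illustration under several assumptions that the policy itself does not spell out: the assignment is scored out of 100 points, lateness is measured in hours past the deadline, a "day" means a full 24-hour period, and the adjusted score is not allowed to drop below zero. The name `late_adjusted_score` is made up for the example.

```python
import math


def late_adjusted_score(score, hours_late):
    """Apply the late penalty to an assignment score.

    Assumptions (not stated in the policy): `score` is out of 100,
    lateness is counted in hours after the deadline, and the result
    is clamped at zero. Returns None when the assignment is more
    than two days late and therefore will not be graded.
    """
    if hours_late <= 0:
        return score  # submitted on time, no deduction
    days_late = math.ceil(hours_late / 24)  # any part of a day counts as a day
    if days_late > 2:
        return None  # past the second day: not graded
    return max(0, score - 10 * days_late)


if __name__ == "__main__":
    print(late_adjusted_score(85, 0))   # on time
    print(late_adjusted_score(85, 5))   # within the first day
    print(late_adjusted_score(85, 30))  # within the second day
    print(late_adjusted_score(85, 49))  # too late to be graded
```

An alternate deadline arranged in advance for reasons outside your control is not covered by this sketch.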
This course follows the University's
attendance policy. In short, if you will miss class
for any reason you should let me know in advance, unless this
is not possible (e.g. sudden illness). In any case, please
let me know as soon as you are aware that you will not be able to attend a
class (e-mail is OK). I will work with you to help you catch
up on homework or exams if you have to miss any of the lectures.
I expect that the students taking this class fully adhere to the Code of Academic
Integrity. Please read this document in full if you
have not already done so. In addition, the University
requires that you sign the Honor Pledge on every examination you turn in. Please read the relevant excerpt from
the Code of Academic Integrity (reproduced below).