CMSC858W: Algorithms for Biosequence Analysis (Spring 2010)

Essential details

Time: Tuesday & Thursday, 3:30-4:45pm
Location: CSIC 3118
Instructor: Mihai Pop (mpop at umiacs) x5-7245
Office hours: TBA
Office address: 3120F Biomolecular Sciences Building (bldg #296).
Building is usually locked. Call me from the intercom and I'll buzz you in.
3223 AVW (by appointment)
Qualifier status: counts as a PhD and MS qualifying course in the Algorithms/Theory area.



This course covers a range of string matching/sequence alignment topics with a focus on biological applications. In addition to a survey of several classical algorithms in string matching and alignment, a major focus will be placed on recent advances in this field, including space-efficient indices (e.g. extensions of the Burrows-Wheeler transform) and parallel string-matching algorithms.

While the focus of the course is on biological applications, the algorithms and techniques described in the course have broader application in other areas of CS (including NLP, software engineering, and security).


This course is intended for graduate students with a strong background in algorithms and data structures. Programming expertise is a must. No background in biology is required. If you are uncertain about meeting these requirements, please contact me.


There are no required textbooks for this course. Some of the material covered can be found in the following two books. In addition, we will rely on scientific articles describing recent advances in the field.

Gusfield. Algorithms on strings, trees, and sequences.
Durbin, Eddy, Krogh, Mitchison. Biological sequence analysis.

Course topics

The course will cover the following main areas.

  • Introduction to molecular biology

  • Sequence alignment:

    • exact string matching

    • inexact alignment

    • multiple sequence alignment

    • parallel string matching algorithms

  • Genome assembly: a graph-theoretical perspective

  • (if time allows) Clustering and phylogenetic analysis

Coursework and grading

Regular homework assignments will consist of a combination of one or more of the following: (i) exercises from one of the textbooks; (ii) small programming assignments; (iii) "discovery" exercises using publicly available bioinformatics tools. In addition, all students must complete a programming project, chosen by the students in consultation with the instructor.

Final grades will be a combination of the grades for the homework, the project, and the midterm and final exams. In addition, class participation will be taken into account for extra credit. The breakdown of your final grade is shown below.

Homework - 10%
Project - 30%
Midterm - 25%
Final - 35%
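
As a rough illustration of how the breakdown above combines, the sketch below computes a weighted grade. It assumes each component is scored on a 0-100 scale, which the syllabus does not state, and it leaves out the participation extra credit because no amount is given. The function name and the example scores are hypothetical.

```python
# Weights (in percent) taken from the breakdown above.
WEIGHTS = {"homework": 10, "project": 30, "midterm": 25, "final": 35}


def weighted_grade(scores):
    """Combine component scores (assumed to be on a 0-100 scale) into one grade.

    Participation extra credit is not modeled here.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Integer weights keep the arithmetic exact until the final division.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example = {"homework": 90, "project": 80, "midterm": 70, "final": 85}
print(weighted_grade(example))  # 80.25
```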

Assignments submitted late will be graded as follows: up to 1 day late, 10 points will be deducted from the grade; up to 2 days late, 20 points will be deducted. Assignments will not be graded more than two days past the deadline. If, for reasons outside your control, you will not be able to submit an assignment on time, see me as soon as possible to discuss an alternate deadline.
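
The late policy above can be sketched as a small function. This is one reading of the policy, not an official grading script: the function name is hypothetical, lateness is assumed to be measured in (possibly fractional) days, and flooring the score at zero is an assumption the syllabus does not make.

```python
def late_adjusted_score(score, days_late):
    """Apply the late policy to a raw assignment score.

    Returns None when the assignment is more than two days late,
    meaning it is not graded at all.
    """
    if days_late <= 0:
        return score               # on time: no deduction
    if days_late <= 1:
        return max(score - 10, 0)  # up to 1 day late: 10 points off
    if days_late <= 2:
        return max(score - 20, 0)  # up to 2 days late: 20 points off
    return None                    # beyond the second day: not graded


print(late_adjusted_score(85, 0.5))  # 75
print(late_adjusted_score(85, 2))    # 65
print(late_adjusted_score(85, 3))    # None
```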

Attendance policy

This course follows the University's attendance policy. In short, if you will miss class for any reason, you should let me know in advance, unless this is not possible (e.g. sudden illness). In any case, please let me know as soon as you are aware that you will not be able to attend a class (e-mail is OK). I will work with you to help you catch up on homework or exams if you have to miss any of the lectures.

Academic integrity

I expect that the students taking this class fully adhere to the Code of Academic Integrity. Please read this document in full if you have not already done so. In addition, the University suggests that you sign the Honor Pledge on every examination you turn in. Please read the relevant excerpt from the Code of Academic Integrity (reproduced below).

Honor Pledge

  • On every examination, paper or other academic exercise not specifically exempted by the instructor, the student shall write by hand and sign the following pledge:

    I pledge on my honor that I have not given or received any unauthorized assistance on this examination.

    Failure to sign the pledge is not an honors offense, but neither is it a defense in case of violation of this Code. Students who do not sign the pledge will be given the opportunity to do so. Refusal to sign must be explained to the instructor. Signing or non-signing of the pledge will not be considered in grading or judicial procedures. Material submitted electronically should contain the pledge; submission implies signing the pledge.

  • On examinations, no assistance is authorized unless given by or expressly allowed by the instructor. On other assignments, the pledge means that the assignment has been done without academic dishonesty, as defined above.

  • The pledge is a reminder that at the University of Maryland students carry primary responsibility for academic integrity because the meaningfulness of their degrees depends on it. Faculty is urged to emphasize the importance of academic honesty and of the pledge as its symbol. Reference on syllabuses to the pledge and to this Code, including where it can be found on the Internet and in the Undergraduate Catalog, is encouraged.