Comparative genomics/sequence alignment

Sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity that may reflect functional, structural, or evolutionary relationships between them.
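As an illustration of what "arranging" sequences means in practice, here is a minimal sketch of global pairwise alignment by dynamic programming (the Needleman-Wunsch approach). It is not taken from any of the projects listed on this page; the function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen only for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not a recommended scheme.
    """
    n, m = len(a), len(b)

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two characters
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = global_align("ACGT", "AGT")
    print(s)
    print(x)
    print(y)
```

The gap characters in the output mark positions where one sequence has no counterpart in the other. Tools built for whole genomes use far more scalable strategies than this quadratic-time table.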


(New in 2014) Harvest is a suite of core-genome alignment and visualization tools for quickly analyzing thousands of intraspecific microbial genomes. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together, they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees.

Genome Assembly and Analysis with Optical Restriction Maps

Optical Mapping Data as a Guide for Genome Assembly

Genome assembly -- the task of reconstructing a genome from the small fragments of DNA that can be sequenced by modern technologies -- is a difficult computational problem, in no small part because the shotgun sequencing process cannot preserve the long-range structure of the genome being assembled. Optical mapping is a genomic technology, pioneered by David Schwartz, that maps the locations of restriction sites along a chromosome. Thus, optical mapping provides a long-range sparse representation
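To make the idea of a restriction map concrete, the following sketch computes an idealized in silico map for a known sequence: the positions of a recognition site and the ordered fragment lengths between consecutive sites. It is a simplification under stated assumptions, not the project's method: the function names are hypothetical, the molecule is treated as linear, cuts are placed at the start of each site, and the sizing errors and missed cuts of real optical maps are ignored. The example site GAATTC is the EcoRI recognition sequence.

```python
def restriction_sites(seq, site):
    """Return the 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        # Resume one base later so overlapping occurrences are also found
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq_len, positions):
    """Return ordered fragment lengths of a linear molecule cut at positions.

    Assumption for illustration: each cut falls at the start of the site.
    """
    cuts = [0] + list(positions) + [seq_len]
    return [right - left for left, right in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    seq = "AAGAATTCCCGGGAATTCTT"
    sites = restriction_sites(seq, "GAATTC")
    print(sites)
    print(fragment_lengths(len(seq), sites))
```

An ordered list of fragment lengths like this is the kind of sparse, long-range signal that can be compared against maps predicted from assembled contigs.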



A comparative genome assembler, which uses one genome as a reference for assembling the genome of another, closely related species. See the journal paper here.

AMOS Assembler project

AMOS is a set of tools, libraries, and freestanding genome assemblers, all open source. AMOS is also an open consortium that includes TIGR, the University of Maryland, the Karolinska Institutet, and the Marine Biological Laboratory.

Algorithms for the Analysis of Data from Massively-parallel Genome Sequencing

New-generation DNA sequencing technologies are revolutionizing modern biological research. Scientists can now generate the rough equivalent of an entire human genome (~3 billion base pairs of DNA) in just a few days with a single sequencing instrument. Until recently, such amounts of data could only be generated at large genome centers using hundreds of sequencers. The analysis of these data is complicated by their size: a single run of a sequencing instrument yields terabytes of information, often requiring a significant scale-up of the existing computational infrastructure.
