Niranjan Nagarajan

Current Position: Senior Research Scientist, Computational and Mathematical Biology, Genome Institute of Singapore

Postdoctoral Fellow, 2007-2009 (advisor: Mihai Pop)
Center for Bioinformatics and Computational Biology,
and UM Institute for Advanced Computer Studies

Ph.D., Cornell University, 2006 (advisor: Uri Keich)
M.S., Cornell University, 2004
B.A., Ohio Wesleyan University, 2000

niranjan [at] umiacs.umd.edu
Center for Bioinformatics and Computational Biology
Biomolecular Sciences Bldg #296
College Park, MD 20742
301-405-8804


Genome Assembly

Genome Assembly (the next generation!)

    Deciphering the genome of an organism is a computationally challenging task akin to solving a very large one-dimensional puzzle. This is because we currently know of no way to "read" the letters of a genome from start to finish. Various sequencing technologies do exist, however, that can read short stretches of DNA (30-1000 bases), subject to some experimental error. A common approach, called Whole Genome Shotgun sequencing, is therefore to shred the DNA into small pieces, read the sequence of these pieces, and then use software to put them together into longer sequences. Recently, several new sequencing technologies (454, Illumina, SOLiD) have been introduced that sequence faster and more cheaply by several orders of magnitude. The reads, however, can be very short (~30 bases for some technologies), and the genomes reconstructed from them are often highly fragmented (thousands of pieces).

    A promising solution to this problem is to use ordered restriction maps (such as optical maps and nanocode maps) to order the sequence fragments on a genome-wide map. In recent work, we designed SOMA (Scaffolding using Optical Map Alignment), a robust system for "scaffolding" genomic sequences onto such maps that can handle sequencing errors and detect misassemblies. The SOMA package (Nagarajan et al., 2008) is freely available and has been used to scaffold nearly a dozen bacterial genomes (see also: Yersinia genomes).
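
    To make the idea of placing sequence fragments on a restriction map concrete, here is a minimal sketch in Python. It is not the SOMA algorithm or its interface: the function names `in_silico_digest` and `place_contig`, the 10% size tolerance, and the example recognition site are all assumptions chosen for illustration. The sketch cuts a contig at every occurrence of a recognition site, then slides the resulting fragment sizes along the map's fragment sizes and keeps the best-matching position. A real system also has to cope with missing or false cut sites, sizing error that depends on fragment length, and contigs in reverse orientation, none of which is handled here.

```python
def in_silico_digest(sequence, site):
    """Cut `sequence` at the start of every occurrence of `site`
    and return the list of fragment lengths, in order."""
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def place_contig(contig_frags, map_frags, tolerance=0.1):
    """Slide the contig's fragment sizes along the map's fragment sizes.

    Only interior contig fragments are compared, because the first and
    last fragments are cut off by the contig ends rather than by sites.
    Map fragment sizes are assumed to be positive.

    Returns (offset, score) for the best placement in which every
    fragment is within `tolerance` (relative error) of the map, or None
    if no placement qualifies. Lower scores are better.
    """
    interior = contig_frags[1:-1]
    if not interior:
        return None  # too few cut sites to place this contig
    best = None
    for offset in range(len(map_frags) - len(interior) + 1):
        window = map_frags[offset:offset + len(interior)]
        errors = [abs(c - m) / m for c, m in zip(interior, window)]
        if max(errors) <= tolerance:
            score = sum(errors)
            if best is None or score < best[1]:
                best = (offset, score)
    return best


if __name__ == "__main__":
    # Hypothetical map: ordered fragment sizes (in bases) along the genome.
    optical_map = [500, 1200, 300, 800, 950]
    # Hypothetical contig fragment sizes, e.g. from in_silico_digest().
    contig = [40, 1180, 310, 60]
    print(place_contig(contig, optical_map))
```

    Contigs that receive a placement can then be ordered by their offsets on the map, which is the essence of scaffolding. Contigs with fewer than two cut sites have no interior fragments and cannot be placed by this simple scheme.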