NIH Project R01-LM006845: Computational Gene Modeling and Genome Sequence Assembly

Project PI: Steven L. Salzberg, Ph.D.

Senior Personnel: Arthur L. Delcher, Ph.D., Mihaela Pertea, Ph.D., Mihai Pop, Ph.D.


Software systems supported by this grant

Assembly and Alignment Software

A Modular Open-Source assembler (AMOS), http://amos.sourceforge.net/

A comparative genome assembler, AMOScmp, http://amos.sourceforge.net/docs/pipeline/AMOScmp.html

Minimus, an assembler for small genome sequencing projects, http://amos.sourceforge.net/docs/pipeline/minimus.html

Bowtie, an ultrafast, memory-efficient short read aligner that aligns short DNA sequences to the human genome at a rate of about 25 million reads per hour on a typical workstation with 2 GB of memory. http://bowtie.cbcb.umd.edu/

TopHat, a short read aligner for RNA-Seq experiments. TopHat discovers novel exon-exon splice junctions and can align millions of RNA-Seq reads to a mammalian genome per hour. http://tophat.cbcb.umd.edu/

Cufflinks, a transcript assembler and abundance estimator for RNA-Seq, http://cufflinks.cbcb.umd.edu/

MUMmerGPU, high-throughput sequence alignment using Graphics Processing Units (GPUs), http://mummergpu.sourceforge.net/

Computational Gene Finding (and motif finding) Software

GlimmerHMM, a eukaryotic genefinder, at http://cbcb.umd.edu/software/glimmerhmm/

JIGSAW, a software system for combining the results of multiple gene finding methods, at http://cbcb.umd.edu/software/jigsaw/

TWAIN, a gene finder for finding genes in two genomes in parallel, at http://cbcb.umd.edu/software/twain/twaindoc.html

GeneZilla, a eukaryotic gene finder, at http://www.genezilla.org/

AutoEditor, software for automated correction of sequencing and basecaller errors, http://sourceforge.net/apps/mediawiki/amos/index.php?title=AutoEditor

GeneSplicer, software for predicting splice sites in eukaryotic genomes, at http://cbcb.umd.edu/software/GeneSplicer/

TransTerm, a system for finding transcription terminators in bacteria, at http://cbcb.umd.edu/software/

Selected publications supported in part by this grant

2009
  • B. Langmead, M.C. Schatz, J. Lin, M. Pop, and S.L. Salzberg.  Searching for SNPs with cloud computing.  Genome Biology (2009) 10:R134. doi:10.1186/gb-2009-10-11-r134.
  • A. Brady and S.L. Salzberg.  Phymm and PhymmBL: Metagenomic Phylogenetic Classification with Interpolated Markov Models.  Nature Methods, 6:9 (2009), 673-676.
  • A.V. Zimin, A.L. Delcher, L. Florea, D.A. Kelley, M.C. Schatz, D. Puiu, F. Hanrahan, G. Pertea, C.P. Van Tassell, T.S. Sonstegard, G. Marçais, M. Roberts, P. Subramanian, J.A. Yorke, and S.L. Salzberg.  A whole-genome assembly of the domestic cow, Bos taurus.  Genome Biology (2009), 10:R42.  Highly accessed.
  • C. Kingsford, N. Nagarajan and S.L. Salzberg.  2009 Swine-Origin Influenza A (H1N1) Resembles Previous Influenza Isolates. PLoS ONE 4:7 (2009), e6402. (doi:10.1371/journal.pone.0006402)
  • M. Berriman, B.J. Haas, P.T. LoVerde, R.A. Wilson, G.P. Dillon, G.C. Cerqueira, S.T. Mashiyama, B. Al-Lazikani, L.F. Andrade, P.D. Ashton, M.A. Aslett, D.C. Bartholomeu, G. Blandin, C.R. Caffrey, A. Coghlan, R. Coulson, T.A. Day, A. Delcher, R. DeMarco, A. Djikeng, T. Eyre, J.A. Gamble, E. Ghedin, Y. Gu, C. Hertz-Fowler, H. Hirai, Y. Hirai, R. Houston, A. Ivens, D.A. Johnston, D. Lacerda, C.D. Macedo, P. McVeigh, Z. Ning, G. Oliveira, J.P. Overington, J. Parkhill, M. Pertea, R.J. Pierce, A.V. Protasio, M.A. Quail, M.-A. Rajandream, J. Rogers, M. Sajid, S.L. Salzberg, M. Stanke, A.R. Tivey, O. White, D.L. Williams, J. Wortman, W. Wu, M. Zamanian, A. Zerlotini, C.M. Fraser-Liggett, B.G. Barrell, and N.M. El-Sayed.  The genome of the blood fluke Schistosoma mansoni.  Nature 460 (2009), 352-358.
  • B. Langmead, C. Trapnell, M. Pop, and S.L. Salzberg.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.  Genome Biology (2009) 10:R25. (Ranked 1st among most-accessed articles in Genome Biology for many consecutive months).
  • Cole Trapnell, Lior Pachter, and Steven L. Salzberg.  TopHat: Discovering splice junctions with RNA-Seq.  Bioinformatics 25 (2009), 1105-1111.
  • Cole Trapnell and Steven L. Salzberg. How to map billions of short reads onto genomes.  Nature Biotechnology 27:5 (2009), 455-457.
  • A.M. Phillippy, X. Deng, W. Zhang, and S.L. Salzberg. Efficient oligonucleotide probe selection for pan-genomic tiling arrays.  BMC Bioinformatics 10:293 (2009). (doi:10.1186/1471-2105-10-293)
  • S.L. Salzberg, D. Puiu, D.D. Sommer, V. Nene, and N.H. Lee. The genome sequence of Wolbachia endosymbiont of Culex quinquefasciatus JHB.  J. Bacteriology 191:5 (2009), 1725.
  • J. Ravel, L. Jiang, S.T. Stanley, M.R. Wilson, R.S. Decker, T.D. Read, P. Worsham, P.S. Keim, S.L. Salzberg, C.M. Fraser-Liggett, and D.A. Rasko.  The complete genome sequence of Bacillus anthracis Ames “Ancestor.”  J. Bacteriology 191:1 (2009), 445-446.
  • M. Pertea, K. Ayanbule, M. Smedinghoff, and S.L. Salzberg. OperonDB: a comprehensive database of predicted operons in microbial genomes.  Nucleic Acids Research 2009 37(Database issue):D479-D482; doi:10.1093/nar/gkn784.
2008
  • S.L. Salzberg, D.D. Sommer, D. Puiu, and V.T. Lee.  Gene-boosted assembly of a novel bacterial genome from very short reads.  PLoS Computational Biology, 4:9 (2008), e1000186. doi:10.1371/journal.pcbi.1000186.
  • J.M. Carlton, J.H. Adams, J.C. Silva, S.L. Bidwell, H. Lorenzi, E. Caler, J. Crabtree, S.V. Angiuoli, E.F. Merino, P. Amedeo, Q. Cheng, R.M.R. Coulson, B.S. Crabb, H.A. del Portillo, K. Essien, T.V. Feldblyum, C. Fernandez-Becerra, P.R. Gilson, A.H. Gueye, X. Guo, S. Kang'a, T.W.A. Kooij, M. Korsinczky, E.V.-S. Meyer, Vish Nene, I. Paulsen, O. White, S.A. Ralph, Q. Ren, T.J. Sargeant, S.L. Salzberg, C.J. Stoeckert, S.A. Sullivan, M.M. Yamamoto, S.L. Hoffman, J.R. Wortman, M.J. Gardner, M.R. Galinski, J.W. Barnwell, and C.M. Fraser-Liggett.  Comparative genomics of the neglected human malaria parasite Plasmodium vivax. Nature 455 (2008), 757-763.
  • E.V. Shakirov, S.L. Salzberg, M. Alam, and D.E. Shippen.  Analysis of Carica papaya Telomeres and Telomere-Associated Proteins: Insights into the Evolution of Telomere Maintenance in Brassicales.  Tropical Plant Biology (2008) doi:10.1007/s12042-008-9018-x.
  • H. Lu, P. Patil, M. Van Sluys, F.F. White, R.P. Ryan, J.M. Dow, P. Rabinowicz, S.L. Salzberg, J.E. Leach, R. Sonti, V. Brendel, and A.J. Bogdanove. Acquisition and Evolution of Plant Pathogenesis–Associated Gene Clusters and Candidate Determinants of Tissue-Specificity in Xanthomonas.  PLoS ONE 3:11 (2008), e3828. doi:10.1371/journal.pone.0003828.
  • Daniela Puiu and Steven L. Salzberg.  Re-assembly of the genome of Francisella tularensis subsp. holarctica OSU18. PLoS ONE 3:10 (2008): e3427. doi:10.1371/journal.pone.0003427.
  • C. Kingsford and S.L. Salzberg.  What are decision trees?  Nature Biotechnology 26:9 (2008), 1011-1013.
  • N. Nagarajan, R. Navajas-Pérez, M. Pop, M. Alam, R. Ming, A.H. Paterson, and S.L. Salzberg. Genome-wide analysis of repetitive elements in papaya.  Tropical Plant Biology (2008) doi:10.1007/s12042-008-9015-0.
  • S.L. Salzberg.  The contents of the syringe.  Nature 454 (2008), 160-162.
  • S.L. Salzberg, D.D. Sommer, M.C. Schatz, A.M. Phillippy, P.D. Rabinowicz, S. Tsuge, A. Furutani, H. Ochiai, A.L. Delcher, D. Kelley, R. Madupu, D. Puiu, D. Radune, M. Shumway, C. Trapnell, G. Aparna, G. Jha, A. Pandey, P.B. Patil, H. Ishihara, D.F. Meyer, B. Szurek, V. Verdier, R. Koebnik, J.M. Dow, R.P. Ryan, H. Hirata, S. Tsuyumu, S.W. Lee, Y.-S. Seo, M. Sriariyanum, P.C. Ronald, R.V. Sonti, M. Van Sluys, J.E. Leach, F.F. White, and A.J. Bogdanove. Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A.  BMC Genomics 9:204 (2008). Highly accessed.
  • Ray Ming, Shaobin Hou, Yun Feng, Qingyi Yu, Alexandre Dionne-Laporte, Jimmy H. Saw, Pavel Senin, Wei Wang, Benjamin V. Ly, Kanako L. T. Lewis, Steven L. Salzberg, Lu Feng, Meghan R. Jones, Rachel L. Skelton, Jan E. Murray, Cuixia Chen, Wubin Qian, Junguo Shen, Peng Du, Moriah Eustice, Eric Tong, Haibao Tang, Eric Lyons, Robert E. Paull, Todd P. Michael, Kerr Wall, Danny W. Rice, Henrik Albert, Ming-Li Wang, Yun J. Zhu, Michael Schatz, Niranjan Nagarajan, Ricelle A. Acob, Peizhu Guan, Andrea Blas, Ching Man Wai, Christine M. Ackerman, Yan Ren, Chao Liu, Jianmei Wang, Jianping Wang, Jong-Kuk Na, Eugene V. Shakirov, Brian Haas, Jyothi Thimmapuram, David Nelson, Xiyin Wang, John E. Bowers, Andrea R. Gschwend, Arthur L. Delcher, Ratnesh Singh, Jon Y. Suzuki, Savarni Tripathi, Kabi Neupane, Hairong Wei, Beth Irikura, Maya Paidi, Ning Jiang, Wenli Zhang, Gernot Presting, Aaron Windsor, Rafael Navajas-Pérez, Manuel J. Torres, F. Alex Feltus, Brad Porter, Yingjun Li, A. Max Burroughs, Ming-Cheng Luo, Lei Liu, David A. Christopher, Stephen M. Mount, Paul H. Moore, Tak Sugimura, Jiming Jiang, Mary A. Schuler, Vikki Friedman, Thomas Mitchell-Olds, Dorothy E. Shippen, Claude W. dePamphilis, Jeffrey D. Palmer, Michael Freeling, Andrew H. Paterson, Dennis Gonsalves, Lei Wang, and Maqsudul Alam.  The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature 452 (2008), 991-996.
  • M. Pop and S.L. Salzberg.  Bioinformatics challenges of new sequencing technology.  Trends in Genetics 24:3 (2008), 142-149.
  • B.J. Haas, S.L. Salzberg, W. Zhu, M. Pertea, J.E. Allen, J. Orvis, O. White, C.R. Buell, and J.R. Wortman.  Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments.  Genome Biology (2008), 9:R7, http://genomebiology.com/2008/9/1/R7. Highly accessed.
2007
  • Drosophila 12 Genomes Consortium.  Evolution of genes and genomes on the Drosophila phylogeny.  Nature 450 (2007), 203-218.
  • E. Ghedin, S. Wang, D. Spiro, E. Caler, Q. Zhao, J. Crabtree, J.E. Allen, A.L. Delcher, D.B. Guiliano, D. Miranda-Saavedra, S.V. Angiuoli, T. Creasy, P. Amedeo, B. Haas, N.M. El-Sayed, J.R. Wortman, T. Feldblyum, L. Tallon, M. Schatz, M. Shumway, H. Koo, S.L. Salzberg, S. Schobel, M. Pertea, M. Pop, O. White, G.J. Barton, C.K.S. Carlow, M. Crawford, J. Daub, M.W. Dimmic, C.F. Estes, J.M. Foster, M. Ganatra, W.F. Gregory, N.M. Johnson, J. Jin, R. Komuniecki, I. Korf, S. Kumar, S. Laney, B.-W. Li, W. Li, T.H. Lindblom, S. Lustigman, D. Ma, C.V. Maina, D.M.A. Martin, J.P. McCarter, L. McReynolds, M. Mitreva, T.B. Nutman, J. Parkinson, J.M. Peregrín-Alvarez, C. Poole, Q. Ren, L. Saunders, A.E. Sluder, K. Smith, M. Stanke, T.R. Unnasch, J. Ware, A.D. Wei, G. Weil, D.J. Williams, Y. Zhang, S.A. Williams, C. Fraser-Liggett, B. Slatko, M.L. Blaxter, and A.L. Scott.  Draft Genome of the Filarial Nematode Parasite Brugia malayi.  Science 317:5845 (2007), 1756-60.
  • C. Kingsford, A.L. Delcher, and S.L. Salzberg.  A unified model explaining the offsets of overlapping and near-overlapping prokaryotic genes.  Molecular Biology and Evolution, 24:9 (2007), 2091-2098.
  • S.L. Salzberg, C. Kingsford, G. Cattoli, D.J. Spiro, D.A. Janies, M.M. Aly, I.H. Brown, E. Couacy-Hymann, G.M. De Mia, D.H. Dung, A. Guercio, T. Joannis, A.S. Maken Ali, A. Osmani, I. Padalino, M.D. Saad, V. Savić, N.A. Sengamalay, S. Yingst, J. Zaborsky, O. Zorman-Rojs, E. Ghedin, and I. Capua. Genome analysis linking recent European and African influenza (H5N1) viruses.  Emerging Infectious Diseases 13:5 (2007), 713-718 (http://www.cdc.gov/EID/content/13/5/713.htm).
  • A.L. Delcher, K.A. Bratke, E.C. Powers, and S.L. Salzberg.  Identifying bacterial genes and endosymbiont DNA with Glimmer.  Bioinformatics 23:6 (2007), 673-679.
  • B.J. Haas and S.L. Salzberg. Finding repeats in genome sequences.  In Bioinformatics – From Genomes to Therapies, Volume 1: Molecular Sequences and Structures (T. Lengauer, ed.).  Weinheim, Germany: Wiley-VCH, 2007, 197-234.
  • V. Nene, J.R. Wortman, D. Lawson, B. Haas, C. Kodira, Z. Tu, B. Loftus, Z. Xi, K. Megy, M. Grabherr, Q. Ren, E.M. Zdobnov, N.F. Lobo, K.S. Campbell, S.E. Brown, M.F. Bonaldo, J. Zhu, S.P. Sinkins, D.G. Hogenkamp, P. Amedeo, P. Arensburger, P.W. Atkinson, S. Bidwell, J. Biedler, E. Birney, R.V. Bruggner, J. Costas, M.R. Coy, J. Crabtree, M. Crawford, B. deBruyn, D. DeCaprio, K. Eiglmeier, E. Eisenstadt, H. El-Dorry, W.M. Gelbart, S.L. Gomes, M. Hammond, L.I. Hannick, J.R. Hogan, M.H. Holmes, D. Jaffe, J.S. Johnston, R.C. Kennedy, H. Koo, S. Kravitz, E.V. Kriventseva, D. Kulp, K. LaButti, E. Lee, S. Li, D.D. Lovin, C. Mao, E. Mauceli, C.F.M. Menck, J.R. Miller, P. Montgomery, A. Mori, A.L. Nascimento, H.F. Naveira, C. Nusbaum, S. O'Leary, J. Orvis, M. Pertea, H. Quesneville, K.R. Reidenbach, Y.-H. Rogers, C.W. Roth, J.R. Schneider, M. Schatz, M. Shumway, M. Stanke, E.O. Stinson, J.M.C. Tubio, J.P. VanZee, S. Verjovski-Almeida, D. Werner, O. White, S. Wyder, Q. Zeng, Q. Zhao, Y. Zhao, C.A. Hill, A.S. Raikhel, M.B. Soares, D.L. Knudson, N.H. Lee, J. Galagan, S.L. Salzberg, I.T. Paulsen, G. Dimopoulos, F.H. Collins, B. Birren, C.M. Fraser-Liggett, and D.W. Severson.  Genome Sequence of Aedes aegypti, a Major Arbovirus Vector.  Science 316:5832 (2007), 1718-1723.
  • M. Pertea, S.M. Mount, and S.L. Salzberg. A computational survey of candidate exon splicing enhancer motifs in the model plant Arabidopsis thaliana.  BMC Bioinformatics (2007), 8:159.
  • S.L. Salzberg. Genome re-annotation: a wiki solution? Genome Biology 2007, 8:102. Highly accessed.
  • D.D. Sommer, A.L. Delcher, S.L. Salzberg, and M. Pop.  Minimus: A fast, lightweight genome assembler.  BMC Bioinformatics 8:64 (2007). Highly accessed.
  • J.M. Carlton, R.P. Hirt, J.C. Silva, A.L. Delcher, M. Schatz, Q. Zhao, J.R. Wortman, S.L. Bidwell, U.C.M. Alsmark, S. Besteiro, T. Sicheritz-Ponten, C.J. Noel, J.B. Dacks, P.G. Foster, C. Simillion, Y. Van de Peer, D. Miranda-Saavedra, G.J. Barton, G.D. Westrop, S. Müller, D. Dessi, P.L. Fiori, Q. Ren, I. Paulsen, H. Zhang, F.D. Bastida-Corcuera, A. Simoes-Barbosa, M.T. Brown, R.D. Hayes, M. Mukherjee, C.Y. Okumura, R. Schneider, A.J. Smith, S. Vanacova, M. Villalvazo, B.J. Haas, M. Pertea, T.V. Feldblyum, T.R. Utterback, C.-L. Shu, K. Osoegawa, P.J. de Jong, I. Hrdy, L. Horvathova, Z. Zubacova, P. Dolezal, S.-B. Malik, J.M. Logsdon Jr., K. Henze, A. Gupta, C.C. Wang, R.L. Dunne, J.A. Upcroft, P. Upcroft, O. White, S.L. Salzberg, P. Tang, C.-H. Chiu, Y.-S. Lee, T.M. Embley, G.H. Coombs, J.C. Mottram, J. Tachezy, C.M. Fraser-Liggett, and P.J. Johnson. Draft Genome Sequence of the Sexually Transmitted Pathogen Trichomonas vaginalis. Science 315 (2007), 207-212.
  • C.L. Kingsford, K. Ayanbule, and S.L. Salzberg. Rapid, accurate computational discovery of rho-independent transcription terminators illuminates their relationship to DNA uptake. Genome Biology (2007), 8:R22.
  • M.C. Schatz, A.M. Phillippy, B. Shneiderman, and S.L. Salzberg. Hawkeye: An interactive visual analytics tool for genome assembly. Genome Biology (2007), 8:R34. Highly accessed.

2006
  • J.E. Allen, W.H. Majoros, M. Pertea, and S.L. Salzberg.  JIGSAW, GeneZilla, and GlimmerHMM: puzzling out the features of human genes in the ENCODE regions.  Genome Biology 7 (2006), Suppl 1:S9.
  • J.E. Allen and S.L. Salzberg.  A phylogenetic generalized hidden Markov model for predicting alternatively spliced exons.  Algorithms for Molecular Biology 1:14 (2006). Highly accessed.
  • J. Quackenbush and S.L. Salzberg.  It is time to end the patenting of software.  Bioinformatics 22:12 (2006), 1416-1417.



2005 and earlier
  1. Large-scale sequencing of human influenza reveals the dynamic nature of viral genome evolution.  E. Ghedin, N.A. Sengamalay, M. Shumway, J. Zaborsky, T. Feldblyum, V. Subbu, D.J. Spiro, J. Sitz, H. Koo, P. Bolotov, D. Dernovoy, T. Tatusova, Y. Bao, K. St George, J. Taylor, D.J. Lipman, C.M. Fraser, J.K. Taubenberger, and S.L. Salzberg.  Nature (2005), 1162-1166.
  2. Serendipitous discovery of Wolbachia genomes in multiple Drosophila species.  S.L. Salzberg, J.C. Dunning Hotopp, A.L. Delcher, M. Pop, D.R. Smith, M.B. Eisen, and W.C. Nelson.  Genome Biology 2005, 6:R23.
  3. Efficient implementation of a generalized pair hidden Markov model for comparative gene finding.  W.H. Majoros, M. Pertea, and S.L. Salzberg. Bioinformatics 21:9 (2005), 1782-88.
  4. Efficient decoding algorithms for generalized hidden Markov model gene finders.  W.H. Majoros, M. Pertea, A.L. Delcher, and S.L. Salzberg.  BMC Bioinformatics 6 (2005), 16.
  5. The genome assembly archive: a new public resource.  S.L. Salzberg, D. Church, M. DiCuccio, E. Yaschenko, and J. Ostell. PLoS Biology 2:9 (2004), 1273-1275.
  6. Yeast rises again.  S.L. Salzberg.  Nature 423 (2003), 233-234.
  7. Comparative genome assembly.  M. Pop, A. Phillippy, A.L. Delcher, and S.L. Salzberg.  Briefings in Bioinformatics 5:3 (2004), 237-248.
  8. Automated correction of genome sequence errors.  P. Gajer, M. Schatz, and S.L. Salzberg.  Nucleic Acids Research 32:2 (2004), 562-569.  This describes the AutoEditor system, which is available as open source code.
  9. Shotgun sequence assembly.  M. Pop.  Advances in Computers vol. 60, M. Zelkowitz, ed., June 2004.
  10. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders.  W.H. Majoros, M. Pertea, and S.L. Salzberg.  Bioinformatics 20:16 (2004), 2878-79.
  11. An empirical analysis of training protocols for probabilistic gene finders.  W.H. Majoros and S.L. Salzberg.  BMC Bioinformatics 5 (2004), 206.
  12. Versatile and open software for comparing large genomes.  S. Kurtz, A. Phillippy, A.L. Delcher, M. Smoot, M. Shumway, C. Antonescu, and S.L. Salzberg.  Genome Biology 5:R12 (2004), http://genomebiology.com/2004/5/2/R12.  This is the MUMmer3 paper; the software is available as open source code.
  13. DAGChainer: A tool for mining segmental genome duplications and synteny.  B.J. Haas, A.L. Delcher, J.R. Wortman, and S.L. Salzberg.  Bioinformatics 20:18 (2004), 3643-6.
  14. Hierarchical scaffolding with Bambus. M. Pop, D. Kosack, and S.L. Salzberg.  Genome Research 14 (2004), 149-159.  This describes our open source system for the scaffolding phase of genome assembly.
  15. Computational gene prediction using multiple sources of evidence.  J.E. Allen, M. Pertea, and S.L. Salzberg.  Genome Research 14 (2004), 142-148.  This describes our open source system for producing a gene prediction based on multiple gene finders, alignment programs, and other evidence.
  16. Comparative genome sequencing for discovery of novel polymorphisms in Bacillus anthracis.  T.D. Read, S.L. Salzberg, M. Pop, M. Shumway, L. Umayam, L. Jiang, E. Holtzapple, J. Busch, K.L. Smith, J.M. Schupp, D. Solomon, P. Keim, and C.M. Fraser. Science 296 (2002), 2028-2033.
  17. Fast algorithms for large-scale genome alignment and comparison.  A.L. Delcher, A. Phillippy, J. Carlton, and S.L. Salzberg. Nucleic Acids Research 30:11 (2002), 2478-2483.  (This is the MUMmer 2 paper.)
  18. Full-length messenger RNA sequences greatly improve genome annotation.  B.J. Haas, N. Volfovsky, C.D. Town, M. Troukhan, N. Alexandrov, K.A. Feldmann, R.B. Flavell, O. White, and S.L. Salzberg.  Genome Biology 3:6 (2002), research0029.1-12.
  19. Genome Sequence Assembly: Algorithms and Issues.  M. Pop, S.L. Salzberg, and M. Shumway.  IEEE Computer 35:7 (2002), 47-54.
  20. Microbial Genes in the Human Genome: Lateral Transfer or Gene Loss?  S.L. Salzberg, O. White, J. Peterson, and J.A. Eisen.  Science 292 (2001), 1903-1906.  See also the Enhanced Perspective in Science, and the annotated version of this paper, designed to help students and teachers of science, developed by the SCOPE project and the Editors of Science.
  21. GeneSplicer: a new computational method for splice site prediction.  M. Pertea, X. Lin, and S.L. Salzberg.  Nucleic Acids Research 29:5 (2001), 1185-1190.
  22. A probabilistic method for identifying start codons in bacterial genomes.  B.E. Suzek, M.D. Ermolaeva, M. Schreiber, and S.L. Salzberg.  Bioinformatics 17:12 (2001), 1123-1130.
  23. Prediction of operons in microbial genomes. M.D. Ermolaeva, O. White and S.L. Salzberg.  Nucleic Acids Research 29:5 (2001), 1216-1221.
  24. A clustering method for repeat analysis in DNA sequences.  N. Volfovsky, B.J. Haas, and S.L. Salzberg.  Genome Biology 2:8 (2001), research0027:1-11.  This describes the RepeatFinder software.
  25. Finding genes in Plasmodium falciparum chromosome 3.  M. Pertea, S.L. Salzberg, and M.J. Gardner. Nature 404 (2000), 34.
  26. Prediction of transcription terminators in bacterial genomes.  M.D. Ermolaeva, H. Khalak, O. White, H.O. Smith, and S.L. Salzberg.  J. Molecular Biology 301 (2000), 27-33.
  27. Improved microbial gene identification with GLIMMER.  A.L. Delcher, D. Harmon, S. Kasif, O. White, and S.L. Salzberg.  Nucleic Acids Research, 27:23 (1999), 4636-4641.