Adam M. Phillippy, Ph.D.
National Human Genome Research Institute
GitHub: Maryland Bioinformatics Laboratories (MarBL)
Published May 2015: Long-read assembly with MinHash
Assembling Large Genomes with Single-Molecule Sequencing and Locality-Sensitive Hashing
The resulting assemblies include fully resolved chromosome arms and close persistent gaps in these important reference genomes, including heterochromatic and telomeric transition sequences. For D. melanogaster, MHAP achieved a 600-fold speedup relative to prior methods and a cloud computing cost of a few hundred dollars. These results demonstrate that single-molecule sequencing alone can produce near-complete eukaryotic genomes at modest cost.
Complete Genome Assembly with Long Reads: See me present this work at ISMB 2014
Free preprint on bioRxiv
Published Dec 2014: Long-read sequencing and assembly review
One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly
Long reads promise to improve the quality of reference databases and facilitate new studies of chromosomal structure and variation. We present an overview of these new technologies and the methods used to assemble long reads into complete genomes.
I am interested in the design and application of efficient algorithms for the analysis of massive genomic sequencing data. My current research focuses on methods for the assembly, comparison, and exploration of large genomes and metagenomes, and on the production of high-quality reference genomes to enable novel variation discovery.
Abbreviated list of my research interests:
- DNA sequencing
- Whole-genome alignment
- Sequence assembly and validation
- Microbial genomics and forensics
I have helped develop the following software tools:
- MUMmer - Rapid whole-genome alignment and matching using suffix trees
- Mash - Fast genome distance estimation using MinHash
- PBcR - PacBio read correction and assembly pipeline
- MHAP - MinHash Alignment Process for noisy read overlapping
- Celera Assembler - Large-genome shotgun sequence assembly
- Harvest - A suite of alignment tools for thousands of microbial genomes
- MetAMOS - A metagenomic assembly and analysis pipeline
- Krona - Interactive visualization of hierarchical data for metagenomics
- Insignia - DNA signature discovery for microbial detection and diagnostics
- PanArray - Pan-genome tiling array design for rapid comparative genomics
And some older tools no longer actively supported:
- AMOS - Shotgun sequence assembly infrastructure
- AMOScmp - Comparative genome assembly
- Automated genome assembly validation
- Visual analytics for genome assembly
- PhyloTrac - Environmental sample analysis and visualization for the Berkeley Lab PhyloChip
- GenomeMTV - Analysis and display for tiling and expression microarrays
Google Scholar, PubMed
- Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.
Berlin K, Koren S, Chin CS, Drake JP, Landolin JM, Phillippy AM. Nat Biotechnol. 2015 May 25.
- One chromosome, one contig: complete microbial genomes from long-read sequencing and assembly.
Koren S, Phillippy AM. Curr Opin Microbiol. 2015 February 23.
- The Harvest suite for rapid core-genome alignment and visualization of thousands of intraspecific microbial genomes.
Treangen TJ, Ondov BO, Koren S, Phillippy AM. Genome Biol. 2014 November 19.
- Automated ensemble assembly and validation of microbial genomes.
Koren S, Treangen TJ, Hill CM, Pop M, Phillippy AM. BMC Bioinformatics. 2014 May 3.
- Reducing assembly complexity of microbial genomes with single-molecule sequencing.
Koren S, Harhay GP, Smith TPL, Bono JL, Harhay DM, McVey DS, Radune D, Bergman NH, Phillippy AM. Genome Biol. 2013 September 13.
- Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species.
The Assemblathon Group. GigaScience. 2013 July 22.
- MetAMOS: a modular and open source metagenomic assembly and analysis pipeline.
Treangen TJ, Koren S, Sommer DD, Liu B, Astrovskaya I, Ondov B, Darling AE, Phillippy AM, Pop M. Genome Biol. 2013 January 15.
- Irreconcilable differences: divorcing geographic mutation and recombination rates within a global MRSA clone.
Treangen TJ, Phillippy AM. Genome Biol. 2012 December 27.
- The rise of a digital immune system.
Schatz MC, Phillippy AM. GigaScience. 2012 July 12.
- Molecular epidemiologic investigation of an anthrax outbreak among heroin users, Europe.
Price EP, Seymour ML, Sarovich DS, Latham J, Wolken R, Mason J, Vincent G, Drees KP, Beckstrom-Sternberg SM, Phillippy AM, Koren S, Okinaka RT, Chung W, Schupp JM, Wagner DM, Vipond R, Foster JT, Bergman NH, Burans J, Pearson T, Brooks T, Keim P. Emerging Infectious Diseases. 2012 July 5.
- Hybrid error correction and de novo assembly of single-molecule sequencing reads.
Koren S, Schatz MC, Walenz BP, Martin J, Howard JT, Ganapathy G, Wang Z, Rasko DA, McCombie WR, Jarvis ED, Phillippy AM. Nat Biotechnol. 2012 July 1.
- Hawkeye and AMOS: visualizing and assessing the quality of genome assemblies.
Schatz MC, Phillippy AM, Sommer DD, Delcher AL, Puiu D, Narzisi G, Salzberg SL, Pop M. Brief Bioinform. 2011 Dec 23.
- GAGE: A critical evaluation of genome assemblies and assembly algorithms.
Salzberg SL, Phillippy AM, Zimin AV, Puiu D, Magoc T, Koren S, Treangen T, Schatz MC, Delcher AL, Roberts M, Marcais G, Pop M, Yorke JA. Genome Res. 2011 Dec 6.
- Complex microbiome underlying secondary and primary metabolism in the tunicate-Prochloron symbiosis.
Donia MS, Fricke WF, Partensky F, Cox J, Elshahawi SI, White JR, Phillippy AM, Schatz MC, Piel J, Haygood MG, Ravel J, Schmidt EW. Proc Natl Acad Sci. 2011 Nov 28.
- Interactive metagenomic visualization in a Web browser.
Ondov BD, Bergman NH, Phillippy AM. BMC Bioinformatics. 2011 Sep 30;12(1):385.
- Assemblathon 1: A competitive assessment of de novo short read assembly methods.
The Assemblathon Group. Genome Res. 2011 Sep 16.
- Two new complete genome sequences offer insight into host and tissue specificity of plant pathogenic Xanthomonas spp.
Bogdanove AJ, Koebnik R, Lu H, Furutani A, Angiuoli SV, Patil PB, Van Sluys MA, Ryan RP, Meyer DF, Han SW, Aparna G, Rajaram M, Delcher AL, Phillippy AM, Puiu D, Schatz MC, Shumway M, Sommer DD, Trapnell C, Benahmed F, Dimitrov G, Madupu R, Radune D, Sullivan S, Jha G, Ishihara H, Lee SW, Pandey A, Sharma V, Sriariyanun M, Szurek B, Vera-Cruz CM, Dorman KS, Ronald PC, Verdier V, Dow JM, Sonti RV, Tsuge S, Brendel V, Rabinowicz PD, Leach JE, White FF, Salzberg SL. J Bacteriol. 2011 Jul 22.
- Genomic comparison of multi-drug resistant invasive and colonizing Acinetobacter baumannii isolated from diverse human body sites reveals genomic plasticity.
Sahl JW, Johnson JK, Harris AD, Phillippy AM, Hsiao WW, Thom KA, Rasko DA. BMC Genomics. 2011 Jun 4;12(1):291.
- Bacillus anthracis comparative genome analysis in support of the Amerithrax investigation.
Rasko DA, Worsham PL, Abshire TG, Stanley ST, Bannan JD, Wilson MR, Langham RJ, Decker RS, Jiang L, Read TD, Phillippy AM, Salzberg SL, Pop M, Van Ert MN, Kenefic LJ, Keim PS, Fraser-Liggett CM, Ravel J. Proc Natl Acad Sci. 2011 Mar 7.
- Probing the pan-genome of Listeria monocytogenes: new insights into intraspecific niche expansion and genomic diversification.
Deng X, Phillippy AM, Li Z, Salzberg SL, Zhang W. BMC Genomics. 2010 Sep 16;11(1):500.
- Integrated microbial survey analysis of prokaryotic communities for the PhyloChip microarray.
Schatz MC, Phillippy AM, Gajer P, Desantis TZ, Andersen GL, Ravel J. Appl and Environ Microbiol. 2010 June 25;AEM.00303-10.
- Transcriptomic response of Salmonella enterica Enteritidis and Typhimurium to chlorine based oxidative stress.
Wang S, Phillippy AM, Deng K, Rui X, Li Z, Tortorello ML, Zhang W. Appl and Environ Microbiol. 2010 June 18;AEM.00823-10.
- Efficient oligonucleotide probe selection for pan-genomic tiling arrays.
Phillippy AM, Deng X, Zhang W, Salzberg SL. BMC Bioinformatics. 2009 September 16;10:293.
- Insignia: a DNA signature search web server for diagnostic assay development.
Phillippy AM, Ayanbule K, Edwards NJ, Salzberg SL. Nucleic Acids Res. 2009 Jul 1;37(Web Server issue):W229-34.
- sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A.
Salzberg SL, Sommer DD, Schatz MC, Phillippy AM, Rabinowicz PD, Tsuge S, Furutani A, Ochiai H, Delcher AL, Kelley D, Madupu R, Puiu D, Radune D, Shumway M, Trapnell C, Aparna G, Jha G, Pandey A, Patil PB, Ishihara H, Meyer DF, Szurek B, Verdier V, Koebnik R, Dow JM, Ryan RP, Hirata H, Tsuyumu S, Lee SW, Ronald PC, Sonti RV, Van Sluys MA, Leach JE, White FF, Bogdanove AJ. BMC Genomics. 2008 May 1;9(1):204.
- assembly forensics: finding the elusive mis-assembly.
Phillippy AM, Schatz MC, Pop M. Genome
- of genes and genomes on the Drosophila phylogeny.
Drosophila 12 Genomes Consortium. Nature. 2007 Nov
- DNA signature discovery and validation.
Phillippy AM, Mason JA, Ayanbule K, Sommer DD, Taviani E, Huq A, Colwell RR, Knight IT, Salzberg SL. PLoS Comput Biol. 2007 May 18;3(5):e98.
- discovery using the sand spatial browser.
Phillippy A, Sankaranarayanan J. Proceedings of the 7th National Conference on Digital Government Research. 2007 May pages 284-285, Philadelphia, PA.
- an interactive visual analytics tool for genome
Schatz MC, Phillippy AM, Shneiderman B, Salzberg SL. Genome
- sequence of the PCE-dechlorinating bacterium Dehalococcoides
Seshadri R, Adrian L, Fouts DE, Eisen JA, Phillippy AM, Methe BA, Ward NL, Nelson WC, Deboy RT, Khouri HM, Kolonay JF, Dodson RJ, Daugherty SC, Brinkac LM, Sullivan SA, Madupu R, Nelson KE, Kang KH, Impraim M, Tran K, Robinson JM, Forberger HA, Fraser CM, Zinder SH, Heidelberg JF. Science. 2005 Jan 7;307(5706):105-8.
Pop M, Phillippy A, Delcher AL, Salzberg SL. Brief Bioinform. 2004
- and open software for comparing large genomes.
Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, Salzberg SL. Genome
- Using MUMmer to identify similar regions in large sequence
Delcher AL, Salzberg SL, Phillippy AM. Current Protocols in Bioinformatics. John Wiley & Sons, 2003.
- algorithms for large-scale genome alignment and
Delcher AL, Phillippy A, Carlton J, Salzberg SL. Nucleic Acids Res. 2002
- Long-read, whole-genome shotgun sequence data for five model organisms.
Kim KE, Peluso P, Babayan P, Yeadon PJ, Yu C, Fisher WW, Chin C-S, Rapicavoli NA, Rank DR, Li J, Catcheside DEA, Celniker SE, Phillippy AM, Bergman CM, Landolin JM. Scientific Data. 2014 Nov 25.
- Complete genome sequence of the quality control strain Staphylococcus aureus subsp. aureus ATCC 25923.
Treangen TJ, Maybank RA, Enke S, Friss MB, Diviak LF, Karaolis DK, Koren S, Ondov B, Phillippy AM, Bergman NH, Rosovitz MJ. Genome Announc. 2014 Nov 6.
- High-coverage sequencing and annotated assemblies of the budgerigar
Ganapathy G, Howard JT, Ward JM, Li J, Li B, Li Y, Xiong Y, Zhang Y, Zhou S, Schwartz DC, Schatz M, Aboukhalil R, Fedrigo O, Bukovnik L, Wang T, Wray G, Rasolonjatovo I, Winer R, Knight JR, Koren S, Warren WC, Zhang G, Phillippy AM, Jarvis ED. GigaScience. 2014 Jul 8.
- Complete closed genome sequences of three Bibersteinia trehalosi nasopharyngeal isolates from cattle with shipping fever.
Harhay GP, McVey DS, Koren S, Phillippy AM, Bono J, Harhay DM, Clawson ML, Heaton MP, Chitko-McKown CG, Korlach J, Smith TP. Genome Announc. 2014 Feb 13.
- Complete closed genome sequences of four Mannheimia varigena isolates from cattle with shipping fever.
Harhay GP, Murray RW, Lubbers B, Griffin D, Koren S, Phillippy AM, Harhay DM, Bono J, Clawson ML, Heaton MP, Chitko-McKown CG, Smith TP. Genome Announc. 2014 Feb 13.
- Complete closed genome sequences of Mannheimia haemolytica serotypes A1 and A6, isolated from cattle.
Harhay GP, Koren S, Phillippy AM, McVey DS, Kuszak J, Clawson ML, Harhay DM, Heaton MP, Chitko-McKown CG, Smith TPL. Genome Announc. 2013 May/June.
- Genome sequence of the attenuated Carbosap vaccine strain of Bacillus anthracis.
Harrington R, Ondov BD, Radune D, Friss MB, Klubnik J, Diviak L, Hnath J, Cendrowski SR, Blank TE, Karaolis D, Friedlander AM, Burans JP, Rosovitz MJ, Treangen T, Phillippy AM, Bergman NH. Genome Announc. 2013 January/February.
Talks (since 2010)
- "Bioinformatics for the long read era." Invited talk. Applied Bioinformatics and Public Health Microbiology. Hinxton, UK. May 2015.
- "Single molecule sequencing: affordable finished genomes and mobile DNA sensors." Invited seminar. NIH/NHGRI. Bethesda, MD. February 2015.
- "Sequencing and informatics for microbial forensics." Invited talk. NICBR Symposium. Frederick, MD. December 2014.
- "Assembling DNA puzzles from millions of pieces." Invited seminar. Loyola University Maryland ACM. Baltimore, MD. November 2014.
- "Assembling DNA puzzles from millions of pieces." Invited keynote. Michigan State Cyberinfrastructure Days. East Lansing, MI. October 2014.
- "Assembling a puzzle with a billion pieces: what could go wrong?" Invited talk. NCBI Annotation Workshop. Bethesda, MD. October 2014.
- "Sequencing and informatics for microbial forensics." Invited talk. NIST. Gaithersburg, MD. October 2014.
- "Complete genome assembly with long reads." Selected talk. ISMB. Boston, MA. July 2014. (ISMB link)
- "Efficient chromosome-scale assembly of eukaryotic genomes." Invited talk. PacBio User Group Meeting. University of Maryland. Baltimore, MD. July 2014.
- "Keeping pace with advances in DNA sequencing for microbial forensics." Invited talk. MAD SSCi Meeting. University of Maryland. Baltimore, MD. June 2014.
- "Complete genome assembly with long reads." Invited seminar. PacBio Symposium. University of Liverpool. Liverpool, UK. April 2014.
- "Microbial forensics and interdisciplinary science." Invited seminar. NICBR Exploring Careers in a Scientific Environment Symposium. National Cancer Institute. Frederick, MD. February 2014.
- "Genome assembly, microbial forensics, and the digital immune system." Invited seminar. Johns Hopkins University. Baltimore, MD. October 2013.
- "Assembly of complete microbial genomes and metagenomes." Invited seminar. CDC. Atlanta, GA. June 2013.
- "Low-cost finished genomes: what's possible?" Invited talk. PacBio User Group Meeting. Baltimore, MD. June 2013.
- "Reducing assembly complexity with single-molecule sequencing." Selected talk. Sequencing, Finishing and Analysis in the Future. Santa Fe, NM. May 2013.
- "Sequence and analysis standards for microbial forensics." Selected talk. GSC 15. Bethesda, MD. April 2013.
- "Where are standards in genomics most needed now?" Panel discussion. GSC 15. Bethesda, MD. April 2013.
- "Genomics and informatics at the National Bioforensic Analysis Center." Invited seminar. FDA. White Oak, MD. March 2013.
- "Genome assembly and microbial forensics." Invited seminar. Johns Hopkins University. Baltimore, MD. March 2013.
- "Whole genome alignment, microbial genomics, and the digital immune system." Invited seminar. Northern Arizona University. Flagstaff, AZ. October 2012.
- "One-step bacterial genome closure with single-molecule hybrid assembly." Invited talk. Pacific Biosciences User Group Meeting. Menlo Park, CA. October 2012.
- "Algorithms for the digital immune system." Invited talk. Disease Outbreak Detection in the Genomics Era. NCBI. Bethesda, MD. September 2012.
- "Hybrid error correction and de novo assembly of single-molecule sequencing reads." Invited talk. NGS Leaders Featured Webinar. August 2012. (YouTube link)
- "Sequencing alchemy: turning single-molecule reads to gold using hybrid error correction." Invited talk. Advances in Genome Biology and Technology. Marco Island, FL. February 2012.
- "The future of fighting infectious disease." Panel discussion. Advances in Genome Biology and Technology. Marco Island, FL. February 2012.
- "Embracing next generation technologies in outbreak investigations: assessing the needs and bottlenecks." Panel discussion. Microbial Evolution and Cutting Edge Tools for Outbreak Investigations. Centers for Disease Control and Prevention. Atlanta, GA. September 2011.
- "Of parrots and pathogens: benchmarking of short and long read sequencing." Invited talk. Sequencing, Finishing, and Analysis in the Future. Santa Fe, NM. June 2011.
- "The bioinformatics of microbial forensics." Invited seminar. University of Notre Dame. September 2010.
- "Bioinformatics for real-time pathogen detection and characterization." Invited seminar. NCBI Computational Biology Branch. April
- "Bioinformatics for nucleic acid diagnostics." Invited seminar. National Biodefense Analysis and Countermeasures Center. February
- "Comparative genomics of Listeria monocytogenes." Invited seminar. University of Notre Dame. January 2010.