Harvest

Harvest is a suite of core-genome alignment and visualization tools for quickly analyzing thousands of intraspecific microbial genomes. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together, they provide interactive core-genome alignments, variant calls, recombination detection, and phylogenetic trees.

Harvest components

  1. Harvest tools
    --> binary format and conversion utilities
  2. Gingr
    --> GUI, interactive visualization of alignments, trees and variants
  3. Parsnp
    --> multiple core genome alignment, SNP filtration, core genome phylogeny

Documentation on Read the Docs

Latest release and source code on GitHub

Parsnp binaries used in manuscript (v1.0)

Experimental results

  • Dataset #1: 32 simulated E. coli W3110 strains
  • Dataset #2: 826 P. difficile genomes
    • Gingr input format: GGR
    • Newick formatted phylogeny: TREE
    • XMFA formatted alignment: XMFA
  • Dataset #3: 224 M. tuberculosis genomes
    • Gingr input format: GGR
    • Newick formatted phylogeny: TREE
    • XMFA formatted alignment: XMFA
    • Common SNP calls: VCF
  • Dataset #4: 31 S. pneumoniae TIGR4 genomes
  • Dataset #5: 10,000 simulated S. pneumoniae TIGR4 strains
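
Several of the datasets above include a Newick formatted phylogeny. As a quick sanity check after downloading one, the leaf labels can be counted and compared with the number of genomes listed for that dataset. The following is a minimal sketch, not part of Harvest: the function name `newick_leaf_names` is hypothetical, and the parser assumes plain, unquoted labels and a tree with at least one pair of parentheses.

```python
import re


def newick_leaf_names(newick: str) -> list[str]:
    """Return leaf labels from a Newick string.

    Assumption: labels are unquoted and contain no parentheses,
    colons, commas, semicolons, or whitespace.
    """
    # A leaf label directly follows '(' or ','. Internal node labels and
    # branch lengths follow ')' or ':' instead, so they are not captured.
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


# Toy tree for illustration only; a real tree would be read from a file.
example = "((A:0.1,B:0.2):0.05,(C:0.3,D:0.4):0.1);"
leaves = newick_leaf_names(example)
print(len(leaves), leaves)
```

For a real file, read its contents into a string first and pass that string to the function. Trees with quoted labels need a full Newick parser instead of this regular expression.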