PacBio Corrected Reads (PBcR) pipeline

Usage and Example Data
  • For a tutorial on using the pipeline for correction (including self-correction) and assembly, please see the PBcR wiki.
    • If you encounter issues or have questions, please contact the authors of the pipeline, Sergey Koren (sergek AT or Adam M. Phillippy (aphillippy AT
  • For best results with high-coverage PacBio RS data (over 50X), we recommend using the longest 25X of post-correction sequences for assembly.
  • For known issues, please see the known issues wiki page.
  • Spec files are available for an SGE grid and for a high-memory multi-core environment.
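
The recommendation above to assemble only the longest 25X of corrected sequences can be illustrated with a small Python sketch. This is not part of the pipeline: the function name `select_longest`, its parameters, and the in-memory `(name, sequence)` representation are assumptions made for illustration, and the genome size must be supplied by the user.

```python
def select_longest(reads, genome_size, target_coverage=25):
    """Keep the longest reads until their total length reaches
    target_coverage * genome_size bases.

    reads: iterable of (name, sequence) tuples (hypothetical in-memory form).
    genome_size: estimated genome size in bases (supplied by the user).
    """
    budget = genome_size * target_coverage
    selected = []
    total = 0
    # Visit reads from longest to shortest.
    for name, seq in sorted(reads, key=lambda r: len(r[1]), reverse=True):
        if total >= budget:
            break
        selected.append((name, seq))
        total += len(seq)
    return selected


if __name__ == "__main__":
    # Toy data: a 2-base "genome" at 25X gives a budget of 50 bases.
    toy = [("a", "A" * 10), ("b", "A" * 30), ("c", "A" * 20), ("d", "A" * 5)]
    for name, seq in select_longest(toy, genome_size=2):
        print(name, len(seq))
```

Because the loop stops only after the budget has been met, the selected set may slightly exceed the target coverage by up to one read length.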

Utilities related to the pipeline and publications

  • Validation scripts for corrected sequences and assembled contigs used in the publication. Note that these scripts require MUMmer 3.23.
    • sh <reference fasta file> <corrected sequence fasta file> <uncorrected fasta/fastq file> will output statistics on chimeric and improperly trimmed sequences compared to the reference.
    • sh <directory containing results, can be .> <reference fasta file> <assembly contig fasta file> will output assembly statistics following the GAGE methodology.
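
As a rough illustration of the kind of assembly statistic such a report contains, the sketch below computes an N50-style value from a list of contig lengths. It is not the validation script itself and makes no claim about that script's exact output; the function name `n50` and the optional `genome_size` parameter are assumptions. When `genome_size` is given, the value is computed relative to that size rather than to the total assembly length.

```python
def n50(lengths, genome_size=None):
    """Return the length L such that contigs of length >= L cover at least
    half of the reference size.

    lengths: iterable of contig lengths in bases.
    genome_size: optional reference size (assumed parameter); defaults to
                 the sum of the contig lengths.
    Returns 0 if the contigs never reach half of the reference size.
    """
    ordered = sorted(lengths, reverse=True)
    reference = genome_size if genome_size is not None else sum(ordered)
    half = reference / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


if __name__ == "__main__":
    contigs = [10, 20, 30, 40]
    print(n50(contigs))                   # relative to total assembly length
    print(n50(contigs, genome_size=160))  # relative to a larger reference size
```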

Publications and Supporting Data