Production Assemblies

With the exception of the simplest genomes, the assembly process requires some human intervention. Modern assembly programs are complex systems that must be tuned to the specific characteristics of the genome being assembled. Properties of the genome, such as its size, repeat content, or the presence of polymorphisms, guide the choice of parameters for the assembly algorithm. This information is frequently not known in advance and has to be gleaned from preliminary assemblies and specialized analyses of the shotgun data. Often, the data provided to the assembler must also be cleaned up to meet the input standards of the assembly program. The assembly of a complex genome is therefore often performed iteratively, as researchers repeatedly tune the assembly program to obtain the best results.

Where lab notes are listed for a genome below, they detail the special techniques used for that genome.
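The iterative tuning described above can be illustrated with a minimal sketch. It assumes that each preliminary assembly is summarized by its list of contig lengths and that runs are compared by N50, a common contiguity statistic; the choice of metric, the function names, and the parameter labels are all hypothetical and are not taken from the lab notes.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # no contigs at all


def best_run(runs):
    """Pick the label of the run whose contigs have the highest N50.

    `runs` maps a (hypothetical) parameter-setting label to the list of
    contig lengths produced by a preliminary assembly with that setting.
    """
    return max(runs, key=lambda label: n50(runs[label]))


# Made-up contig lengths from two preliminary assemblies (illustration only)
runs = {
    "setting_a": [50_000, 30_000, 20_000, 5_000],
    "setting_b": [80_000, 15_000, 6_000, 4_000],
}
chosen = best_run(runs)
```

In practice a single number is rarely enough: a higher N50 can also result from misassembled repeats, so a comparison like this would only be a first filter before closer inspection of each run.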


Assembly Projects


A selection of the assemblies our team has contributed to. See the Genome Assembly page for more information.

Fruit Fly Endosymbionts

Steven Salzberg and colleagues identified the sequence of the bacterial endosymbiont Wolbachia pipientis within the publicly available sequence data of several species of fruit fly.  These results were reported in the open access journal Genome Biology:

Salzberg, S.L., Hotopp, J.C., Delcher, A.L., Pop, M., Smith, D.R., Eisen, M.B., Nelson, W.C. (2005) Serendipitous discovery of Wolbachia genomes in multiple Drosophila species. Genome Biol 6 (3):R23.

The assemblies of the endosymbiont genomes identified in this study can be obtained below.  The assemblies were performed with AMOScmp.

Wolbachia endosymbiont of Drosophila ananassae - GenBank entry
Wolbachia endosymbiont of Drosophila simulans -  GenBank entry
Wolbachia endosymbiont of Drosophila willistoni - contigs  trace IDs


Fruit Fly Assemblies

Genome | Sequencing Center | Assembler | Downloads
Drosophila pseudoobscura | Baylor College of Medicine Human Genome Sequencing Center | Celera Assembler | Data download (ftp), GenBank entry, Lab Notes
Drosophila yakuba | Washington University Genome Sequencing Center | Celera Assembler | Data download
Drosophila virilis | Agencourt Bioscience | Celera Assembler | Data download, Assembly Archive, Lab Notes

Protozoa

Genome | Sequencing Center | Assembler | Downloads
Tetrahymena thermophila | The Institute for Genomic Research | Celera Assembler | Data download
Trypanosoma cruzi | The Institute for Genomic Research | Celera Assembler | Data download, Lab Notes

Bacterial Genomes

Genome | Sequencing Center | Assembler | Downloads
Bacillus anthracis Ames Ancestor* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. A1055* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. Australia 94* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. CNEVA-9066* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. Kruger B* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. Vollum* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Bacillus anthracis str. Western North America USA5153* | The Institute for Genomic Research | Celera Assembler | Assembly Archive
Borrelia afzelii | The Institute for Genomic Research | Celera Assembler | Lab Notes
Burkholderia cepacia R1808 | DOE Joint Genome Institute | Celera Assembler | Data download
Chloroflexus aurantiacus | DOE Joint Genome Institute | Celera Assembler | Data download
Methylobacillus flagellatus | DOE Joint Genome Institute | Celera Assembler | Data download
Xanthomonas oryzae pathovar oryzicola | The Institute for Genomic Research | Celera Assembler | Data download, Lab Notes
Xylella fastidiosa ANN1 | DOE Joint Genome Institute | Celera Assembler | Data download
Xylella fastidiosa DIXON | DOE Joint Genome Institute | Celera Assembler | Data download

Eukaryotes

Genome | Sequencing Center | Assembler | Downloads
Trichomonas vaginalis | The Institute for Genomic Research | Celera Assembler | Assembly Archive, Lab Notes

Nematodes

Genome | Sequencing Center | Assembler | Downloads
Brugia malayi | The Institute for Genomic Research | Celera Assembler | Data download, Lab Notes
Caenorhabditis briggsae | Sanger Center | Celera Assembler | Data download

*Assemblies completed in partnership with TIGR