Spanki is a set of tools to facilitate analysis of alternative splicing from RNA-Seq data. Spanki compiles quantitative and qualitative information about junction alignments from input BAM files, and analyzes junction-level splicing along with pairwise-defined splicing events. A simulator is also included to evaluate junction detection performance.
Sturgill D, Malone JH, Sun X, Smith HE, Rabinow L, Samson ML, Oliver B.
Design of RNA splicing analysis null models for post hoc filtering of Drosophila head RNA-Seq data with the splicing analysis kit (Spanki).
BMC Bioinformatics. 2013 Nov 9;14(1):320.

PubMed page, PMID: 24209455
Spanki release 0.5.0 (August 15, 2014) Important update!
-- This release fixes a bug that caused p-values to be calculated in 'precise' mode (which considers only junctions at the site where the splice path diverges) instead of the default sensitive mode when junctions that join exons outside the event are present.
-- Changes in annotate_junctions, including less verbose output
-- Fixed a Hamming distance error when sequences are of different lengths
-- Several other reporting changes

Spanki release 0.4.3 (February 3, 2014)
-- Added depth-normalized junction coverage reporting
-- Added error-free option to simulator
-- Changed default intron retention in simulator to zero
-- Optimized spankijunc to handle large tables better
-- Several minor bug fixes and reporting changes

Spanki release 0.4.2 (August 6, 2013)
-- Several minor bug fixes and reporting changes
-- Changes and additions to example run commands

Spanki release 0.4.1 (June 17, 2013)
-- Several minor bug fixes and reporting changes
-- Note that GitHub no longer hosts files for download, so only the current release is linked from the "Download" tab on the right. Alternatively, you may clone the repository and check out the branch corresponding to the current release (see Installation instructions).

Spanki release 0.4.0 (Oct. 31, 2012)
-- Adds a junction annotator. From a set of junctions, it outputs qualitative characteristics (including gene assignments, annotation status, flanking sequence, and presence of proximal NAG motifs).

Quick start


Clone the repository for most recent changes:
   git clone https://github.com/dsturg/Spanki.git
Check out the branch for the most recent stable release, e.g.:
   git checkout release-0.4.1
Or download the most recent stable release.

Install using the python setup script:
   sudo python setup.py install
Install without sudo (for example, on a cluster):
   python setup.py install --user

Sample code is included with the software for performing simulations ('simulation_example.txt') and splicing analysis ('analysis_commands.sh').

A test data set is available to try out Spanki's features:    http://www.cbcb.umd.edu/software/spanki/testdata.tar.gz

Instructions for using the example data:
- Make a working directory, e.g. "spankitest"
- Go to the spankitest directory, and extract the test data there
- Copy the "analysis_commands.sh" file from the Spanki directory into it
- Run the commands: ./analysis_commands.sh, or manually copy and paste lines of code
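
The steps above can also be scripted. The following is a minimal Python sketch, not part of Spanki itself: the function names (fetch_testdata, prepare_workdir, run_analysis) are hypothetical, it assumes a Python 3 interpreter, and it assumes you pass in the directory of your Spanki checkout that contains analysis_commands.sh.

```python
import shutil
import subprocess
import tarfile
import urllib.request
from pathlib import Path

# URL of the test data set as given in this document.
TESTDATA_URL = "http://www.cbcb.umd.edu/software/spanki/testdata.tar.gz"


def fetch_testdata(dest: Path) -> Path:
    """Download the test data archive to dest (needs network access)."""
    urllib.request.urlretrieve(TESTDATA_URL, str(dest))
    return dest


def prepare_workdir(workdir: Path, archive: Path, spanki_dir: Path) -> Path:
    """Create workdir, extract the archive into it, and copy the example script.

    spanki_dir is assumed to be the directory holding analysis_commands.sh.
    Returns the path of the copied script.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    # Only extract archives from a source you trust.
    with tarfile.open(archive) as tar:
        tar.extractall(workdir)
    script = workdir / "analysis_commands.sh"
    shutil.copy(spanki_dir / "analysis_commands.sh", script)
    return script


def run_analysis(script: Path) -> None:
    """Run the copied script with bash from inside the working directory."""
    subprocess.run(["bash", script.name], cwd=script.parent, check=True)
```

A typical sequence would be to call fetch_testdata(Path("testdata.tar.gz")), then prepare_workdir(Path("spankitest"), Path("testdata.tar.gz"), Path("Spanki")), and finally run_analysis on the returned script path. The "Spanki" path here is a placeholder for wherever the repository was cloned.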
Please see the manual pages for details about installation and usage.

Spanki has been tested on Mac OS X 10.6 and 10.7 and Ubuntu Linux 10.04.


Python modules

Spanki requires the following Python packages (Python will attempt to install them for you):
  • pyfasta
  • pysam
  • numpy
  • Biopython
  • scikits.statsmodels
  • fisher
If you encounter problems getting numpy and scipy installed together, we recommend installing the Enthought Python Distribution (http://enthought.com/products/epd.php).
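
To see which of the packages listed above are already available before running the setup script, a quick check along these lines may help. This is a sketch under assumptions: it targets a Python 3 interpreter, the helper name missing_modules is hypothetical, and the import names are guesses from the package names (Biopython, for instance, is imported as "Bio"), so adjust them to match your installation.

```python
import importlib.util

# Import names are assumptions derived from the package list above; a package's
# import name can differ from its install name (Biopython is imported as "Bio").
REQUIRED_MODULES = [
    "pyfasta",
    "pysam",
    "numpy",
    "Bio",
    "scikits.statsmodels",
    "fisher",
]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when a parent package such as "scikits"
            # is itself absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling missing_modules(REQUIRED_MODULES) returns a list of the names that could not be found; an empty list suggests every listed module is importable in the current environment.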

Other required programs

samtools http://samtools.sourceforge.net/
Cufflinks (Spanki uses the gtf_to_sam utility to create a SAM representation of a GTF reference) http://cufflinks.cbcb.umd.edu

Required for splicing analysis:
AStalavista (or a precomputed splicing event file) http://genome.crg.es/astalavista/
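
Before running an analysis, it can be useful to confirm that the external programs are on your PATH. The sketch below assumes the executables are named "samtools" and "gtf_to_sam", as the tool names above suggest. The helper missing_programs is hypothetical, and AStalavista is left out because a precomputed splicing event file can be used in its place.

```python
import shutil

# Executable names are assumptions based on the tool names mentioned above.
REQUIRED_PROGRAMS = ["samtools", "gtf_to_sam"]


def missing_programs(names):
    """Return the program names that are not found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]
```

Calling missing_programs(REQUIRED_PROGRAMS) returns the programs that still need to be installed or added to PATH.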


Please contact Dave Sturgill with questions:
dave.sturgill [at] gmail.com