




OVERVIEW

This page is a repository for our software, documentation, experimental results, and "tips, tricks, & lessons learned" for gene-finding in particular and bioinformatics prediction problems in general.  It includes full source code for all TIGR gene-finders, along with C++ class libraries and other components for prediction and machine-learning tasks.





 Source Code -- Complete Projects 
Project
Description
Language
TigrScan
Generalized hidden Markov model (GHMM) gene-finder in the style of Genscan/Genie, written in highly optimized C++ and designed to be extensible and reusable for other gene-finding tasks.
C++
GlimmerHMM
GHMM gene-finder like Genscan/Genie, written in C.  Very fast and accurate.

Combiner


GlimmerM


Unveil
Pure HMM gene-finder built on the VEIL model.  Highly optimized C++.
C++
ELPH
Gibbs sampler for finding motifs in DNA; has been used to detect exonic splicing enhancers (ESEs).  Also applicable to other motif-detection tasks.

GeneSplicer


Glimmer


TransTerm


RBSfinder


VEIL
The original VEIL gene-finder.
C++
MORGAN


All of the software listed above is Open Source and is distributed under the ARTISTIC LICENSE.  See www.opensource.org.


 
Source Code -- Reusable Software Components
Package
Description
Language
OC1
oblique decision trees for classification
C
NET
backpropagation neural networks for classification
C++
ET
entropy-based decision trees for classification
C++
Suffix Trees
Suffix trees by Stefan Kurtz

tigr++
C++ container class library used by several TIGR genefinders and other packages.  Covers string & sequence processing, math/statistics, many efficient data structures, GFF parsing, sorting, and I/O.
C++
regress
multivariate regression for classification
C++
bayes
Naive Bayes classifier
C++
GP
genetic algorithms / genetic programming
C++
KNN
K-nearest-neighbors classifier with Mahalanobis distance (to control for correlation among attributes) and feature selection based on F-ratio.
C++









All of the software listed above is Open Source and is distributed under the ARTISTIC LICENSE.  See www.opensource.org.
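
As an illustration of the idea behind the KNN entry above, here is a minimal sketch of nearest-neighbor classification using Mahalanobis distance.  It is not taken from the KNN package; the names (mahalanobisSq, knnClassify, Example) are hypothetical, and the inverse covariance matrix is assumed to have been computed elsewhere.  Covariance estimation, matrix inversion, and F-ratio feature selection are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Squared Mahalanobis distance (x - y)^T * invCov * (x - y).
// invCov is the inverse of the attribute covariance matrix (assumed given).
// With invCov = identity this reduces to squared Euclidean distance.
double mahalanobisSq(const Vec& x, const Vec& y, const Mat& invCov) {
    assert(x.size() == y.size() && invCov.size() == x.size());
    const std::size_t n = x.size();
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(invCov[i].size() == n);
        for (std::size_t j = 0; j < n; ++j)
            total += (x[i] - y[i]) * invCov[i][j] * (x[j] - y[j]);
    }
    return total;
}

// Hypothetical training example: attribute vector plus integer class label.
struct Example {
    Vec features;
    int label;
};

// Classify query by majority vote among its k nearest training examples.
// Vote ties go to the smallest label value.
int knnClassify(const std::vector<Example>& train, const Vec& query,
                const Mat& invCov, std::size_t k) {
    assert(!train.empty() && k > 0);
    std::vector<std::pair<double, int>> scored;
    scored.reserve(train.size());
    for (const Example& e : train)
        scored.emplace_back(mahalanobisSq(e.features, query, invCov), e.label);

    k = std::min(k, scored.size());
    // Only the k smallest distances need to be in sorted order.
    std::partial_sort(scored.begin(),
                      scored.begin() + static_cast<std::ptrdiff_t>(k),
                      scored.end());

    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[scored[i].second];

    int best = scored[0].second;
    int bestCount = 0;
    for (const auto& kv : votes) {
        if (kv.second > bestCount) {
            best = kv.first;
            bestCount = kv.second;
        }
    }
    return best;
}
```

Weighting the difference vector by the inverse covariance is what discounts attributes that merely repeat each other's information; plain Euclidean distance would count such correlated attributes twice.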



Documentation
TigrScan User Manual
How to install and use the TigrScan gene-finder
TigrScan Training Manual
How to train the TigrScan gene-finder
TigrScan Software Architecture
How the TigrScan gene-finder software is structured -- for those who wish to modify the program



Training Data
Arabidopsis.thaliana.tar.gz
GFF coordinates & FASTA file
Aspergillus.fumigatus.tar.gz
GFF coordinates & FASTA file
Aspergillus.spp.tar.gz
GFF coordinates & FASTA file
Homo.sapiens.tar.gz
GFF coordinates & FASTA file
Mus.musculus.tar.gz
GFF coordinates & FASTA file
Plasmodium.falciparum.tar.gz
GFF coordinates & FASTA file
ml-training-sets.tar.gz
sample training/test/configuration files for machine-learning packages
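
For readers who want to inspect the GFF coordinate files before training, the following is a minimal sketch of reading one GFF line in C++.  It assumes the standard tab-separated GFF columns (seqname, source, feature, start, end, score, strand, frame, attributes) with 1-based inclusive coordinates; the files in the archives above may deviate from this.  GffRecord and parseGffLine are hypothetical names, not the tigr++ GFF parser.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// One feature line of a GFF file (hypothetical struct for illustration).
struct GffRecord {
    std::string seqname, source, feature;
    long start = 0;          // 1-based, inclusive
    long end = 0;            // 1-based, inclusive
    std::string score;       // kept as text; "." means no score
    char strand = '.';
    std::string frame;       // "0", "1", "2", or "."
    std::string attributes;  // optional ninth column

    long length() const {
        assert(end >= start);
        return end - start + 1;
    }
};

// Split a line on tab characters.
static std::vector<std::string> splitTabs(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, '\t')) fields.push_back(field);
    return fields;
}

// Parse a whole string as a base-10 integer; false if any character is left over.
static bool parseLong(const std::string& text, long& value) {
    char* endPtr = nullptr;
    value = std::strtol(text.c_str(), &endPtr, 10);
    return endPtr != text.c_str() && *endPtr == '\0';
}

// Returns true and fills 'out' for a well-formed feature line;
// returns false for comments, blank lines, and malformed lines.
bool parseGffLine(const std::string& line, GffRecord& out) {
    if (line.empty() || line[0] == '#') return false;
    const std::vector<std::string> f = splitTabs(line);
    if (f.size() < 8) return false;

    long start = 0, end = 0;
    if (!parseLong(f[3], start) || !parseLong(f[4], end)) return false;
    if (start < 1 || end < start) return false;  // GFF requires start <= end

    out.seqname = f[0];
    out.source = f[1];
    out.feature = f[2];
    out.start = start;
    out.end = end;
    out.score = f[5];
    out.strand = f[6].empty() ? '.' : f[6][0];
    out.frame = f[7];
    out.attributes = f.size() > 8 ? f[8] : std::string();
    return true;
}
```

Calling parseGffLine on each line read with std::getline and keeping the records for which it returns true gives the feature coordinates, which can then be used to cut the matching subsequences out of the FASTA file.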




Experimental Results
Description
Posted by
Date
comparison of machine learning methods for discriminating exons from non-exon ORFs
bmajoros
1-19-04










External Links
Link
Description
GFF
General Feature Format definition at the Sanger Centre
ORF Finder
program at NCBI to find ORFs in a sequence
GenomeScan
Pair HMM
HMMgene

WebGene

SplicePredictor

GRAIL
Neural-network-based gene-finder
NetPlantGene

TWINSCAN
GHMM "informant" method
GeneMark(tm)
Gene-finder from Georgia Institute of Technology
Bibliography
Bibliography on Computational Gene Recognition
GENSCAN
GHMM-based gene-finder for human
SLAM
Pair GHMM-based syntenic gene-finder
EuGene

Genie
GHMM-based gene-finder
NetGene2

GrailEXP
Neural-network-based gene-finder


TIGR assumes no responsibility for the content of the pages linked in this table.

