Matrices & Candidates


ELPH weight matrices for Arabidopsis and Drosophila.


About ESE Candidates

Most of the ESE candidates are hexamers. They were chosen using two methods: the RESCUE-ESE algorithm and ELPH. Candidates are highlighted whenever they overlap one of the three 9-mers GAAGAAGAA, CGATCAACG and TGCTGCTGG, which we have found to be very effective ESEs in plants.
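As a rough illustration of the highlighting rule, the sketch below flags a hexamer when it occurs as a substring of one of the three 9-mers. This is only one possible reading of "overlap" and is an assumption; the function name `overlapping_ninemers` is hypothetical and not part of SEE ESE.

```python
# The three 9-mers named in the text above.
NINEMERS = ("GAAGAAGAA", "CGATCAACG", "TGCTGCTGG")

def overlapping_ninemers(hexamer):
    """Return the 9-mers that contain the hexamer as a substring.

    Assumption: "overlap" is read as full containment of the hexamer
    in the 9-mer; the page does not define the term more precisely.
    """
    hexamer = hexamer.upper()
    return [ninemer for ninemer in NINEMERS if hexamer in ninemer]

if __name__ == "__main__":
    for candidate in ("GAAGAA", "GCTGCT", "TTTTTT"):
        hits = overlapping_ninemers(candidate)
        print(candidate, "highlighted" if hits else "not highlighted", hits)
```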

  • RESCUE-ESE: The ESE candidates are identified in a fashion similar to that described for the RESCUE-ESE algorithm (see http://genes.mit.edu/burgelab/rescue-ese/). The ESE predictions include all hexamers that have both a significantly higher frequency of occurrence in exons than in introns and a significantly higher frequency of occurrence in exons with weak (non-consensus) splice sites than in exons with strong (consensus) splice sites.
    • weak exons: exons whose splice site scores fall in the bottom 25%
    • strong exons: exons whose splice site scores fall in the top 25%

    Each hexamer was assigned 2 scores:
    • Delta(EI) - scaled difference (in SD units) between the frequency of occurrence in exons and in introns
    • Delta(5WS)/Delta(3WS) - scaled difference (in SD units) between the frequency of occurrence at the 5'/3' end of weak exons and of strong exons

    The current ESE candidates were identified by applying a statistical significance threshold of 1.5 SD units above the mean to both scores.
  • ELPH: Another set of candidates was found by running ELPH on the data. Of all 4096 hexamers, we retained only those with statistically significant representation in the data. Many of these candidates overlap the ones determined by the RESCUE-ESE method. For the Arabidopsis data, we used our segmental genome duplication data to further refine the potential candidates. For all significant hexamers appearing as motifs in the data, we computed their position in each duplicated gene pair (based on ELPH score). A hexamer was retained as an ESE candidate if synonymous mutations in the third position of all codons inside the motif were significantly conserved compared to the rest of the data. The results of this analysis can be consulted here.
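The RESCUE-ESE style selection described above can be sketched as follows. This is a minimal illustration, not the code used to produce the candidates: it assumes that "scaled difference in SD units" means a z-score of the frequency difference taken across all 4096 hexamers, which may differ from the normalization actually used. All function names are hypothetical, and the sketch does not restrict counting to the 5' or 3' end of exons.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

K = 6
ALL_HEXAMERS = ["".join(p) for p in product("ACGT", repeat=K)]  # 4096 hexamers

def quartile_split(scored_exons):
    """Split (sequence, splice_site_score) pairs into weak and strong exons
    (bottom 25% and top 25% of scores)."""
    ordered = sorted(scored_exons, key=lambda pair: pair[1])
    n = len(ordered) // 4
    weak = [seq for seq, _ in ordered[:n]]
    strong = [seq for seq, _ in ordered[len(ordered) - n:]]
    return weak, strong

def hexamer_frequencies(sequences):
    """Relative frequency of every hexamer over all overlapping windows."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - K + 1):
            counts[seq[i:i + K]] += 1
    # Windows containing non-ACGT characters are ignored.
    total = sum(counts[h] for h in ALL_HEXAMERS)
    if total == 0:
        return {h: 0.0 for h in ALL_HEXAMERS}
    return {h: counts[h] / total for h in ALL_HEXAMERS}

def scaled_difference(set_a, set_b):
    """Frequency difference (set_a minus set_b) per hexamer, in SD units.

    Assumption: SD units are obtained by standardizing the differences
    across all hexamers (z-score).
    """
    freq_a = hexamer_frequencies(set_a)
    freq_b = hexamer_frequencies(set_b)
    diff = {h: freq_a[h] - freq_b[h] for h in ALL_HEXAMERS}
    mu = mean(diff.values())
    sd = pstdev(diff.values())
    if sd == 0:
        return {h: 0.0 for h in ALL_HEXAMERS}
    return {h: (diff[h] - mu) / sd for h in ALL_HEXAMERS}

def select_candidates(exons, introns, weak_exons, strong_exons, threshold=1.5):
    """Keep hexamers whose two scores both exceed the threshold."""
    delta_ei = scaled_difference(exons, introns)
    delta_ws = scaled_difference(weak_exons, strong_exons)
    return sorted(
        h for h in ALL_HEXAMERS
        if delta_ei[h] > threshold and delta_ws[h] > threshold
    )

if __name__ == "__main__":
    # Toy sequences for illustration only, not real data.
    exons = ["GAAGAA" * 20]
    introns = ["TTTTTT" * 20]
    print(select_candidates(exons, introns, exons, ["CCCCCC" * 20]))
```

Requiring both scores to pass the same threshold mirrors the "both scores" criterion in the text; a real analysis would compute the weak versus strong score separately for the 5' and 3' exon ends.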


Check the latest experimental data.

SEE ESE is OSI Certified Open Source Software.

Acknowledgements

The project is supported in part by NSF Award MCB-0114792, "Arabidopsis 2010: Pre-mRNA Splicing Signals in Arabidopsis".

Contact Us

Feel free to send your comments or questions to us