TY - Generic
T1 - Inexact Local Alignment Search over Suffix Arrays
T2 - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09
Y1 - 2009
A1 - Ghodsi, M.
A1 - Pop, M.
KW - bacteria
KW - Bioinformatics
KW - biology computing
KW - Computational Biology
KW - Costs
KW - DNA
KW - DNA homology searches
KW - DNA sequences
KW - Educational institutions
KW - generalized heuristic
KW - genes
KW - Genetics
KW - genome alignment
KW - Genomics
KW - human
KW - inexact local alignment search
KW - inexact seeds
KW - local alignment
KW - local alignment tools
KW - memory efficient suffix array
KW - microorganisms
KW - molecular biophysics
KW - mouse
KW - Organisms
KW - Sensitivity and Specificity
KW - sequences
KW - suffix array
KW - USA Councils
AB - We describe an algorithm for finding approximate seeds for DNA homology searches. In contrast to previous algorithms that use exact or spaced seeds, our approximate seeds may contain insertions and deletions. We present a generalized heuristic for finding such seeds efficiently and prove that the heuristic does not affect sensitivity. We show how to adapt this algorithm to work over the memory efficient suffix array with provably minimal overhead in running time. We demonstrate the effectiveness of our algorithm on two tasks: whole genome alignment of bacteria and alignment of the DNA sequences of 177 genes that are orthologous in human and mouse. We show our algorithm achieves better sensitivity and uses less memory than other commonly used local alignment tools.
JA - IEEE International Conference on Bioinformatics and Biomedicine, 2009. BIBM '09
PB - IEEE
SN - 978-0-7695-3885-3
ER -