Cmatch: Fast Exact String Matching on the GPU

Michael C. Schatz and Cole Trapnell


Note: The program has been discontinued and superceeded by MUMmerGPU. Similar to Cmatch, MUMmerGPU aligns reads in parallel on the GPU, but MUMmerGPU can compute substring alignments in addition to end-to-end alignments computed by Cmatch.

Our program, Cmatch, performs exact string matching for a set of query sequences and achieves a speedup of as much as 35x on a recent GPU over the equivalent CPU-bound version. String matching has a long history in computational biology with roots in finding similar proteins and gene sequences in a database of known sequences. The explosion in sequence data available in the 80s and 90s motivated the development of ever faster techniques for searching for similar sequences, and ultimately lead the use of parallelized execution of string matching algorithms using sophisticated data structures called suffix trees. Suffix trees can be constructed time proportional to the length of the corpus, and provide exact matching of a query in time proporional to the length of the query, completely independent of the size of the corpus. Here, we present our string-matching kernel for use in the Compute Unified Device Architecture, which executes parallelized searching of a suffix tree for finding exact matches for a set of query strings. We compare our GPGPU suffix tree search to a serial CPU version of the algorithm, and analogous components of the widely used CPU program MUMmer, and explore issues associated with storing a suffix tree in a graphics card’s memory, and data distribution among the GPU’s processing units.

Paper
Powerpoint Presentation (PDF)
Source Code
Sample Data