CBCB computers

Mission

The University of Maryland Center for Bioinformatics and Computational Biology is a multidisciplinary center dedicated to research on questions arising from the genome revolution. CBCB brings together scientists and engineers from many fields, including computer science, molecular biology, genomics, genetics, mathematics, statistics, and physics, all of whom share a common interest in gaining a better understanding of how life works. The Center for Bioinformatics and Computational Biology is organized as a center within the University of Maryland Institute for Advanced Computer Studies (UMIACS), an interdisciplinary research institute supporting high-performance computing research across the College Park campus.

3115 Biomolecular Sciences Bldg #296, University of Maryland, College Park, MD 20742 • Tel (301) 405-5936; Fax: (301) 314-1341. Directions to CBCB

Open Position for Professor and Director of CBCB

2012 Summer Internship Program

News & Events

March 30, 2012. CBCB graduate student Joseph Paulson has received the prestigious Graduate Research Fellowship from the National Science Foundation. At the same time, his colleague Lee Mendelowitz has received an honorable mention for the same award.

February 15, 2012. CBCB faculty Carl Kingsford has received the prestigious Sloan Research Fellowship in the area of evolutionary and computational molecular biology. Prof. Kingsford is one of 12 awardees in this area and the only Sloan awardee from our University. The Alfred P. Sloan Foundation awards research fellowships to early-career scientists whose achievements and potential identify them as rising stars and future scientific leaders.

April 16, 2011. The Washington Post reports how scientists at TIGR, including several CBCB faculty, helped the FBI to solve the mystery of the 2001 anthrax attacks by identifying unique DNA sequence markers in the bacteria used in the attacks. A scientific account appeared in a recent issue of the Proceedings of the National Academy of Sciences (see Science Highlights on this page).

January 20, 2011. CBCB scientists along with collaborators from several other institutions announce the GAGE (Genome Assembly Gold-Standard Evaluations) competition. GAGE is an evaluation of the very latest large-scale genome assembly algorithms, using recently generated next-generation sequence data from a highly diverse set of genomes.

January 10, 2011. Nature Biotechnology highlights the Cufflinks program as one of 2010's breakthroughs of the year in computational biology. Cufflinks was developed by former CBCB student Cole Trapnell, in collaboration with CBCB members Geo Pertea and Steven Salzberg, and external collaborators Lior Pachter (UC Berkeley) and Barbara Wold (Caltech).

November 1, 2010. Sridhar Hannenhalli joins CBCB and the University of Maryland as an Associate Professor in the Department of Cell Biology and Molecular Genetics. Prof. Hannenhalli, who was previously in the Dept. of Genetics at the University of Pennsylvania, works on biological questions pertaining to eukaryotic gene regulation and molecular evolution.

October 26, 2010. A UMD team consisting of Graduate Student Rob Patro and Prof. Carl Kingsford won the 2010 DREAM5 Challenge 1 competition by producing the most accurate computational method to predict which peptides bind to intravenous immunoglobulin antibodies. Better methods to predict binding affinity will lead to a greater understanding of how antibodies operate in the cell. DREAM is a competition held annually in concert with the RECOMB Systems Biology conference. Rob will present his method there in November.

July 1, 2010. Hector Corrada Bravo, who works on statistical algorithms for analysis of high-throughput genomics data, joins CBCB and the University of Maryland faculty as an Assistant Professor in the Department of Computer Science. Lately, he has focused on methods for analysis of second generation sequencing data.

June 2010. CBCB faculty member Carl Kingsford received an R21 research grant from the NIH to study computational techniques for detecting genetic mixing among strains of the influenza virus. This is joint work with former CBCB postdoc Niranjan Nagarajan.

June 2010. Ben Langmead traveled to London to accept the Genome Biology Award for his paper, jointly authored with Cole Trapnell, Mihai Pop, and Steven Salzberg, describing the sequence aligner Bowtie. The Genome Biology Award recognizes the best article published in Genome Biology in 2009 and was introduced to celebrate the ten-year anniversary of this journal.

June 2010. Recent CBCB Ph.D. graduate Mike Schatz was profiled in Nature Methods for his work on next-generation sequencing and cloud computing.

May 2010. CBCB student Bo Liu received a best poster award at the International Symposium on Bioinformatics Research and Applications (ISBRA).

Science Highlights

June 26, 2011. CBCB scientists Héctor Corrada Bravo and Ben Langmead, along with colleagues from Johns Hopkins University, publish an article in Nature Genetics reporting whole genome profiling of DNA methylation in three colon cancers and matched normal tissue and two adenomatous polyps. The authors identify large blocks of relative hypomethylation and gene expression variability that encompass over half of the genome. Increased methylation variation is also reported in consistent epigenetic domains in five different cancer types: lung, breast, colon, thyroid and Wilms tumor (a childhood kidney cancer). The paper suggests a model for cancer involving loss of epigenetic stability of well-defined genomic domains that underlies increased methylation variability in cancer that may contribute to tumor heterogeneity.

May 31, 2011. Ancient DNA reveals that North American mammoths and woolly mammoths were interbreeding. Scientists from McMaster University along with collaborators from CBCB, the Museum national d'Histoire Naturelle (Paris), the Univ. of Utah, the Univ. of Michigan, and the American Museum of Natural History published results today on DNA extracted from an 11,000-year-old mammoth bone found in Utah. The team describes the surprising finding that the mitochondrial DNA of the American mammoth was nearly identical to that of the much smaller woolly mammoth.

March 7, 2011. Scientists at CBCB, in collaboration with colleagues from the FBI, USAMRIID, and UMD School of Medicine, publish the long-awaited scientific results behind the investigation into the 2001 anthrax attacks. The research, conducted primarily while the UMD scientists were at The Institute for Genomic Research, revealed four specific genetic markers in the Bacillus anthracis strain used in the attacks. These markers were shared by the anthrax powder in the letters mailed to U.S. Senators and in only one source, a vial of anthrax from the U.S. Army Medical Research Institute of Infectious Diseases in Ft. Detrick, Maryland. CBCB faculty members Steven Salzberg and Mihai Pop, along with former CBCB graduate student Adam Phillippy, led the computational analysis that first identified the unique DNA sequence mutations in the anthrax used in the attacks. The findings remained confidential for many years while the FBI investigation was ongoing, but are now finally being made public.

December 26, 2010. The genome of woodland strawberry (Fragaria vesca) was published in Nature Genetics by an international team of scientists from 38 organisations in ten countries, led by Vladimir Shulaev (Univ. of North Texas) and Kevin Folta (Univ. of Florida). The consortium includes CBCB scientists Art Delcher and Steven Salzberg.

October 7, 2010. CBCB scientists Steven Salzberg and Mihaela Pertea publish an article in Genome Biology on "Do-it-yourself genetic testing." In this article, they describe a method for diagnosing mutations in the cancer-causing genes BRCA1 and BRCA2 directly from whole-genome data. The freely downloadable software, which can easily be extended to other genes, represents a challenge to the patents that cover the BRCA genes.

September 14, 2010. CBCB scientists Ben Langmead and Hector Corrada Bravo, in collaboration with scientists from multiple institutions, publish an article in Nature Reviews Genetics on "Tackling the widespread and critical impact of batch effects in high-throughput data." In this article, Ben and Hector describe an analysis of exome sequencing data from second-generation technologies demonstrating the impact of batch effects on copy number estimation.

September 8, 2010. CBCB scientists and collaborators publish the genome of the domesticated turkey. The consortium announced today that they have sequenced the majority of the genome of Meleagris gallopavo, the domesticated turkey, creating the first-ever turkey genome map. The nearly complete map could help growers to more efficiently produce bigger, meatier turkeys. The research is reported today in PLoS Biology, an online journal of the Public Library of Science.

May 2, 2010. CBCB scientists Cole Trapnell, Geo Pertea, and Steven Salzberg, in collaboration with the Pachter lab at Berkeley and the Wold lab at Caltech, publish Cufflinks, the first comprehensive package for transcript assembly and quantification with RNA-Seq. In a study of developing muscle cells, they identified thousands of unannotated mRNAs and tracked differential splicing and promoter use in hundreds of genes.

March 22, 2010. Dr. Rita Colwell, Distinguished University Professor and a faculty member in CBCB, has been named the 2010 Stockholm Water Prize Laureate. "Dr. Rita Colwell's numerous seminal contributions towards solving the world's water and water-related public health problems, particularly her work to prevent the spread of cholera, is of utmost global importance," noted the Stockholm Prize nominating committee.

January 4, 2010. CBCB researchers together with colleagues from the Naval Medical Research Center publish in the journal Genome Biology an analysis of 8 genomes from the Yersinia genus. This is the first genomic characterization of environmental cousins of the human pathogen Yersinia pestis, the causative agent of plague. Also a first, this study demonstrates the power of high-throughput genomic technologies (a combination of 454 pyrosequencing and optical mapping) to enable the rapid construction of high-quality genomic assemblies without the need for substantial manual curation. The data presented in this paper were generated with the help of open-source optical map scaffolding software developed at the CBCB (see SOMA).

November 20, 2009. CBCB students Ben Langmead, Mike Schatz, and several faculty colleagues publish a paper in Genome Biology describing Crossbow, a cloud-computing system that combines the aligner Bowtie and the SNP caller SOAPsnp. Executing on "the cloud," Crossbow can run a SNP analysis of 38-fold coverage of the human genome in three hours or less.

August 2, 2009. CBCB scientists publish a paper in Nature Methods describing Phymm + PhymmBL, new software for phylogenetic classification of DNA sequences as short as 100 bp, designed to facilitate analysis of metagenomic data. The Phymm family uses variable-order Markov models to identify taxon-specific sequence patterns, providing a classification accuracy several times higher than existing metagenomics sequence classification tools.
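
The idea of scoring a read against per-taxon Markov models can be illustrated with a much simpler stand-in. The sketch below uses fixed-order Markov chains with add-one smoothing, not the variable-order models the Phymm family uses, and every name in it (train_markov, log_likelihood, classify, the toy taxa) is hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def train_markov(seqs, order=2, alphabet="ACGT", pseudocount=1.0):
    """Count (context -> next base) transitions for one taxon.

    Fixed-order simplification; NOT the variable-order scheme used by Phymm.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1.0
    # Freeze into plain dicts so lookups never create new entries.
    frozen = {ctx: dict(nxt) for ctx, nxt in counts.items()}
    return {"order": order, "alphabet": alphabet,
            "pseudocount": pseudocount, "counts": frozen}


def log_likelihood(model, read):
    """Log-probability of a read under one taxon's smoothed Markov chain."""
    k = model["order"]
    pseudo = model["pseudocount"]
    n_symbols = len(model["alphabet"])
    total = 0.0
    for i in range(k, len(read)):
        ctx = model["counts"].get(read[i - k:i], {})
        numerator = ctx.get(read[i], 0.0) + pseudo
        denominator = sum(ctx.values()) + pseudo * n_symbols
        total += math.log(numerator / denominator)
    return total


def classify(models, read):
    """Return the taxon label whose model gives the read the highest score."""
    return max(models, key=lambda name: log_likelihood(models[name], read))


# Toy training data (made up for this example).
models = {
    "taxon_A": train_markov(["ACACACACACACACAC"]),
    "taxon_B": train_markov(["GGTTGGTTGGTTGGTT"]),
}
print(classify(models, "ACACAC"))
print(classify(models, "GGTTGG"))
```

A read is assigned to whichever taxon's model explains it best; real classifiers of this kind train on whole reference genomes rather than toy strings.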

July 28, 2009. CBCB scientists Carl Kingsford, Niranjan Nagarajan, and Steven Salzberg publish a paper describing their finding that the reassortment that created the 2009 H1N1 pandemic influenza virus was remarkably similar to a previous reassortment that occurred in Thailand, but that produced only one documented human infection, in 2005.

July 15, 2009. CBCB scientist Najib El-Sayed led an international team, along with Matt Berriman at the Sanger Institute (UK), that sequenced the genome of the parasitic worm Schistosoma mansoni. Schistosomiasis is a devastating tropical disease that afflicts over 200 million people in the developing world. The paper appeared in this week's issue of Nature.

June 5, 2009. CBCB graduate student Michael Schatz, in collaboration with researchers from the USDA Bee Lab, Columbia University, and 454 Life Sciences, published the draft genome sequence of the honey bee fungal pathogen Nosema ceranae. This fungus is believed to be an agent in honey bee colony collapse disorder, and the genome sequence will aid in understanding this disease and the development of treatments.

May 14, 2009. CBCB scientist Rita Colwell and colleagues publish a review of the new scientific discipline of microbial oceanography - the study of the ocean as "a habitat for the evolution and regulation of microbial-based processes and their ecological consequences" - in the journal Nature.

April 24, 2009. CBCB scientists, jointly with scientists from the USDA, publish the genome of the domestic cow, Bos taurus.  The cow genome contains 2.86 billion bases spread across 30 chromosomes.  The new paper, in the journal Genome Biology, describes how the genome was assembled and presents in greater detail than previously how the cow genome can be mapped onto the human genome.  In addition, a Bowtie index is now available for rapid mapping of short reads to the new Bos taurus genome.

April 10, 2009. CBCB scientists publish a paper in PLoS Computational Biology describing a new statistical method for comparing metagenomic datasets in a clinical setting. Their method, Metastats, allows scientists to compare two treatment populations (e.g. sick and healthy patients), each comprising multiple samples, in order to determine individual features (organisms, genes, or pathways) that explain the difference between the two populations.
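
A comparison of one feature across two populations can be sketched with a generic permutation test. This is an illustration of the general idea only and is not Metastats' implementation; the function names and the abundance values below are invented for the example.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Two-sample t statistic with unequal variances."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    if se == 0.0:
        # Degenerate case: no spread within either group.
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    """Estimate a two-sided p-value by shuffling group labels."""
    rng = random.Random(seed)
    observed = abs(welch_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(welch_t(pooled[:len(a)], pooled[len(a):]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical relative abundances of one organism in each sample.
sick = [0.30, 0.35, 0.32, 0.31, 0.34, 0.33]
healthy = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10]
p_value = permutation_pvalue(sick, healthy)
print(p_value < 0.05)
```

In practice such a test would be run once per feature, and the resulting p-values would need a multiple-testing correction before any feature is called differentially abundant.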

April 8, 2009. Mike Schatz, a graduate student in Computer Science, and a member of the CBCB, publishes a paper in the journal Bioinformatics describing the use of Cloud Computing (highly-parallel computing infrastructure available through the internet) to speed up sequence alignment algorithms. His program CloudBurst, available open-source from http://cloudburst-bio.sourceforge.net, can achieve speed-ups of up to 100-fold over current alignment programs.

March 4, 2009. CBCB scientists publish a paper in Genome Biology describing Bowtie, a new and extremely fast system for aligning short DNA sequences to the human genome or to other large genomes. Bowtie's innovative use of the Burrows-Wheeler Transform allows it to run many times faster than other leading short-read aligners, and its remarkably small memory footprint allows users to run it on a standard desktop or laptop computer.
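
For readers unfamiliar with the Burrows-Wheeler Transform, the following minimal sketch shows the transform and how exact matches can be counted from it by backward search. It is a toy illustration, not Bowtie's implementation: it builds the transform by sorting all rotations and stores full occurrence tables, which is only practical for tiny inputs. The function names are hypothetical.

```python
def bwt(text):
    """Burrows-Wheeler Transform via sorted rotations (tiny inputs only)."""
    assert "$" not in text, "'$' is reserved as the end-of-text sentinel"
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_str, pattern):
    """Count exact matches of pattern in the original text using only its BWT."""
    alphabet = sorted(set(bwt_str))
    # first[c]: number of characters in the text that sort before c.
    first, running = {}, 0
    for c in alphabet:
        first[c] = running
        running += bwt_str.count(c)
    # occ[c][i]: number of occurrences of c in bwt_str[:i].
    occ = {c: [0] * (len(bwt_str) + 1) for c in alphabet}
    for i, ch in enumerate(bwt_str):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    # Backward search: narrow the range of matching rotations,
    # consuming the pattern from its last character to its first.
    lo, hi = 0, len(bwt_str)
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(transformed)                            # annb$aa
print(count_occurrences(transformed, "ana"))  # 2
```

The search cost depends on the pattern length rather than the text length, which is one reason BWT-based indexes suit short-read alignment. Production aligners additionally compress and sample these tables to keep memory use low.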

January 2009. The 2009 database issue of the journal Nucleic Acids Research features two CBCB databases: OperonDB - a database of predicted operons in microbial genomes; and ARDB - a database of antibiotic resistance genes.

Oct. 9, 2008. Scientists publish the genome of the human malaria parasite Plasmodium vivax, which is responsible for 25–40% of the approximately 515 million annual cases of malaria worldwide. The study, led by NYU's Jane Carlton, included CBCB scientist Steven Salzberg and graduate student Sam Angiuoli as co-authors, and appeared in the journal Nature.

Sept. 2008. CBCB scientists publish a new method for assembling a bacterial genome from very short "next-gen" sequencing data, and describe its application to a new strain of the bacterium Pseudomonas aeruginosa.

July 2008. A Nature special section on human and avian influenza features a commentary by Steven Salzberg proposing greater openness in the process of designing the flu vaccine each year.

May 2008. CBCB scientists led a consortium that published the complete genome of the bacterium Xanthomonas oryzae pv. oryzae, which causes bacterial blight in rice. The international collaboration included 35 scientists from the U.S., Japan, India, France, and Ireland.

April 24, 2008. Scientists this week published in the journal Nature a description of the genome of the papaya tree, the first transgenic crop ever to have its genome sequenced. The "SunUp" papaya plant includes an artificially inserted virus protein that confers resistance to papaya ringspot virus. The collaboration included CBCB scientists Salzberg, Schatz, Nagarajan, Delcher, and Mount.

March 10, 2008. CBCB scientists Schatz, Trapnell, Delcher and Varshney publish MUMmerGPU, a short-read mapping program. MUMmerGPU uses the graphics processing unit (GPU) in a desktop PC to map short reads to a reference genome up to 10-fold faster than conventional CPU-based programs.