CBCB Seminar Series
Fall 2009
2:00 p.m. Thursday, Sept. 10, 2009 - two talks
Venue: Biomolecular Science Building, Room 3118
Title:
Searching for Genes in Novel Genomes.
By: Brona Brejova, Department of Computer Science, Comenius
University, Slovakia
Abstract:
New rapid sequencing methods now allow affordable sequencing of
previously unexplored genomes. Gene prediction in these novel
genomes is difficult due to the lack of reliable training data
necessary for adjusting parameters of models used for this task.
We have developed a novel method for estimating the parameters of
hidden Markov models for gene finding in newly sequenced species. Our
approach does not rely on curated training data sets, but instead uses
extrinsic evidence (including paired-end ditags that have not been
used in gene finding previously) and iterative training. This new
method is particularly suitable for annotation of species with large
evolutionary distance to the closest annotated species. We have used
our approach to produce an initial annotation of the newly sequenced
Schistosoma japonicum draft genome. Our new gene set provides a first
glimpse at the gene complement of a flatworm (phylum Platyhelminthes).
Joint work with Tomas Vinar, Dan Brown, Ming Li, and Yan Zhou.
Title:
Evolutionary Histories of Gene Clusters in Primates.
By: Tomas Vinar, Dept. of Applied Informatics, Comenius
University in Bratislava
Abstract:
Approximately 5% of the human genome is composed of complex gene
clusters that arose by repeated segmental duplications. These
regions are hot spots of evolutionary innovation and contain many
biomedically important gene families. We propose that these gene
clusters should be analyzed in the context of their duplication
histories that allow construction of accurate gene trees for the
purpose of comparative genomic analysis, enable analysis of
chimeric genes and promoter regions, and facilitate transfer of
annotations between species.
We have developed novel methods for reconstructing the
duplication histories from genomic sequences of multiple
species. Our methods are based on a simple probabilistic model of
evolution of gene clusters by segmental duplication, and we use
MCMC sampling to infer duplication histories with high likelihood
under this model.
This is joint work with Brona Brejova (Comenius), Adam Siepel
(Cornell), Webb Miller (Penn State U.), and Eric Green (NHGRI).
2:00 p.m. Thursday, Oct. 8, 2009
Title: Chasing Change: Primate Centromere Evolution
By: Mary Schueler, National Human Genome Research
Institute, National Institutes of Health
Venue: Biomolecular Science Building, Room 3118
Abstract: Rapid evolution is a
hallmark of centromeric DNA in eukaryotic genomes. The centromere has a
conserved functional role mediated by the kinetochore protein complex
in all species. We performed comparative mapping and sequencing of
centromeric regions and the genomic loci of three foundation
kinetochore proteins (Centromere Proteins A, B, and C) to gain a
detailed view of the evolutionary events that have shaped the primate
centromere.
A Histone H3 variant, Centromere Protein A (CENP-A), is the foundation
of the centromere-specific nucleosome. Comparative sequence analyses
involving 14 primate species have, for the first time, identified amino
acid residues within both the histone fold domain and the N-terminal
tail that are under strong positive selection in the primate lineage.
Similar comparative analyses of CENP-B, a kinetochore protein with a
specific binding site within alpha-satellite DNA, somewhat
surprisingly, do not show signs of positive selection. However, CENP-C,
another foundation protein essential for centromere function, is under
strong positive selection. Residues under selection are found
throughout the protein, including several in the
centromere-localization and DNA-binding regions.
A model of progressive proximal expansion of alpha-satellite DNA at the
primate X centromere predicts that older alpha satellite lacking
higher-order structure lies adjacent to the chromosome arms, while
regions of more recently evolved alpha satellite flank the higher-order
alpha-satellite arrays. Comparative mapping and sequencing of these
regions confirm this predicted organization in six primates. Our
ongoing additional comparative genomic studies should further develop
this model of centromere evolution, and provide the reagents necessary
for testing a correlation between evolution of kinetochore proteins and
centromeric DNA.
2:00 p.m. Thursday, Oct. 15, 2009
Title: Computational Techniques for Inferring Phylogenetic Relationships Using Multiple Loci.
By: Luay Nakhleh, Department of Computer Science, Rice
University
Venue: Biomolecular Science Building, Room 3118
Abstract:
Accurate inference of phylogenetic relationships of species, and
understanding their relationships with gene trees are two central
themes in molecular and evolutionary biology. Traditionally, a species
tree is inferred by (1) sequencing a genomic region of interest from
the group of species under study, (2) reconstructing its evolutionary
history, and (3) declaring it to be the estimate of the species tree.
However, recent analyses of increasingly available multi-locus data
from various groups of organisms have demonstrated that different
genomic regions may have evolutionary histories (called “gene trees”)
that may disagree with each other, as well as with that of the
species. This observation has called into question the suitability of
the traditional approach to species tree inference. Further, when
some, or all, of these disagreements are caused by reticulate
evolutionary events, such as hybridization, then the phylogenetic
relationship of the species is more appropriately modeled by a
phylogenetic network than a tree. As a result, a new, post-genomic
paradigm has emerged, in which multiple genomic regions are analyzed
simultaneously, and their evolutionary histories are reconciled in
order to infer the evolutionary history of the species, which may not
necessarily be treelike.
In this talk, I will describe our recent work on developing
mathematical criteria and algorithmic techniques for analyzing
incongruence among gene trees, and inferring phylogenetic
relationships among species despite such incongruence. This includes
work on lineage sorting, reticulate evolution, as well as simultaneous
treatment of both.
Speaker BIO:
Luay Nakhleh is an Assistant Professor of Computer Science and
Biochemistry and Cell Biology at Rice University. He received the
B.Sc. degree from the Technion, Israel Institute of Technology, in
1996, the Master’s degree from Texas A&M University in 1998, and the
PhD degree from the University of Texas at Austin in 2004, all three
degrees in Computer Science. His research interests fall in the
general areas of computational biology and bioinformatics; in
particular, he works on computational phylogenomics and its connection
with other fields in biology. Luay has published over 50 manuscripts
on his work, supervised the dissertations of two recent PhD graduates,
and currently supervises the dissertations of 6 PhD students. Luay has
received several awards, including the Texas Excellent Teaching Award
from UT Austin in 2001, the Outstanding Dissertation Award from UT
Austin in 2005, the Roy E. Campbell Faculty Development Award from
Rice University in 2006, the DOE Early Career Award in 2006, the NSF
CAREER Award in 2009, and the Phi Beta Kappa Teaching Prize in 2009.
11:00 a.m. Thursday, Oct. 22, 2009
Title:
Finding the trees in Darwin's forest.
By: Robert K. Bradley, Massachusetts Institute of Technology
Venue: Biomolecular Science Building, Room 3118
Abstract:
TBA
2:00 p.m. Thursday, Dec. 17, 2009
Title: "Genetic analysis of O-repeat biosynthesis in Neisseria sicca 4320"
By: Clinton Miller
Venue: Biomolecular Science Building, Room 3118
Abstract:
One of the important virulence determinants found in pathogenic Neisseria is lipooligosaccharide (LOS). LOS differs from lipopolysaccharide (LPS) in that it lacks the o-repeat characteristic of LPS. LOS has been shown to be important for invasion, host immune evasion, and bacterial attachment to host tissue. A great diversity of structures is found both within pathogenic species and commensal species. Variations between strains are mediated by changes in biosynthetic gene clusters. Variation within a strain is mediated by changes in the expression state of genes. While the genetic basis of LOS production in the pathogenic Neisseria has been extensively studied, little research has focused on the genetics underlying LPS/LOS production and resulting diversity in commensal Neisseria. A commensal strain that caused a fatal case of bacterial endocarditis, Neisseria sicca 4320, was found to produce a novel o-repeat structure in addition to the typical Neisserial LOS. The genome of N. sicca 4320 was sequenced and analyzed to identify genes possibly involved in the synthesis of the o-repeat structure. The identified genes were cloned and prepared for inactivation in order to generate knockout mutants.