CMSC858B: Computational Systems Biology and Functional Genomics (Spring 2012)

Course Information

Course Calendar

Date | Lecture Name | Readings (recommended)
1/25 | Course introduction and administrivia |
1/31 | Molecular biology for computer scientists and statisticians | Hunter. Life and its molecules; Hunter. Molecular Biology for Computer Scientists
2/1 | Statistical Learning: a whirlwind tour of statistical inference, machine learning and probabilistic models |
2/6 | The R statistical programming environment | Scripts used in class: twitterMap.R; twitterUtils.R
2/8 | The R/Bioconductor genomics analysis environment | Setup script: setup.R; script used in class: bioconductor.R
2/13 | Gene expression analysis: Overview of microarray technology, preprocessing methods and algorithms | [1] Chs. 1 and 2 (specifically Section 2.3.2); [2] Sec. 6.1 Local polynomial regression; Quantile Normalization paper (Bolstad et al., Bioinformatics 2003)
2/15 | Differential expression analysis | [1] Ch. 11 and Ch. 14
2/20 | Empirical Bayes, SAM and multiple testing | SAM; limma; q-value
2/22 | Gene Set Enrichment Analysis | GOstats and references therein; GSEA
2/27 | Overview of second generation sequencing technology | HW 1 due; RNA-seq review article; Bowtie
2/29 | RNA sequencing analysis | Myrna; DESeq
3/5 | Isoform expression quantification and transcriptome assembly | Jiang and Wong; Salzman, Jiang and Wong; Cufflinks
3/7 | Unsupervised methods | [1] Chs. 12 and 13; [2] Ch. 14
3/12 | Classification and prediction methods | [2] Ch. 4
3/14 | Classification and prediction methods (2) | HW 2 due on 3/16
3/19 | No class: Spring Break |
3/21 | No class: Spring Break |
3/26 | Sparse methods in genomics | [2] Ch. 18; Project proposal due
3/28 | Gene regulation: Transcription factor analysis (ChIP-seq and motif finding) | Ji et al., Nature Biotech 2008; Supplementary
4/2 | Regulatory network discovery | Segal et al., 2003
4/4 | Epigenetics: Intro to epigenetics: chromatin modifications, DNA methylation and the CpG genomic landscape | Bock and Lengauer, 2008; Baylin and Jones, 2012; Wu et al., 2010
4/9 | Analysis of differential methylation with microarrays and sequencing | Hansen et al., 2011; Supplementary
4/11 | Midterm exam |
4/16 | Analysis of differential methylation with microarrays and sequencing (cont'd) | Hansen et al., 2011; Supplementary
4/18 | Cancelled |
4/23 | Genetics: Genomic variant discovery with sequencing technology | SOAP; Li, 2011
4/25 | Chromatin modifications and conformation | Chromatin HMM, Ernst et al., 2011; Supplementary; Hi-C: Chromosome conformation, Lieberman-Aiden, 2009; Supplementary
4/30 | Genotype/phenotype association discovery and analysis | RAPID; Graph Splines; Lirnet
5/2 | Data integration: Methods and algorithms for genomic data integration | modENCODE; HEFalMp
5/7 | Final take-home exam due |
5/7 | Project presentations (1) |
5/9 | Project presentations (2) |
5/16 | Final project writeup due |

Many slides are borrowed from a number of sources (hopefully cited in the slides); a lot of them are from Rafael A. Irizarry. Lectures linked are from last semester and are very likely to change near lecture time.

[1] Gentleman, R., Carey, V.J., et al. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer, 2005.
[2] Hastie, T., Tibshirani, R. and Friedman, J. The Elements of Statistical Learning. Springer, 2009.

Homeworks

Homework | Date posted | Due date
Homework 1 | Feb 17 | Feb 27
Homework 2 | Mar 7 | Mar 16

Resources

Syllabus

The official syllabus, detailing class policies, the calendar and other details, can be found here [pdf].

Description

Major advances in technology for genomic studies are bringing the prospect of personalized medicine closer to reality. Many of these advances are predicated on the ability to generate data at an unprecedented rate, which creates a significant need for computational data analysis that is robust and both clinically and biologically useful.

This course will concentrate on the fundamental computational and statistical methods required to meet this need. It will cover topics in functional genomics, population genetics and epigenetics. Computational methods studied for this type of analysis include: supervised, unsupervised and semi-supervised learning, data visualization, statistical modeling and inference, probabilistic graphical models, sparse methods, and numerical optimization. Machine learning methods will be a core component of this class. No prior knowledge of biology is required.

Topics to be covered (not an exhaustive list)