CBCB Research in Progress Series (RIPS)

UPDATE (Sept. 3, 2021): For Fall 2021, CBCB RIP Seminars will be held in a hybrid format. Unless otherwise noted, they will take place on Thursdays from 2-3PM in the Iribe Center, Room 4105, and the talks will be streamed via Zoom for those attending remotely. The format for the RIPS talks will be the same as in previous semesters: each week we will aim to have two 30-minute talks. However, if you feel you need an hour slot, please specify this in the second row of the corresponding date on the spreadsheet. The signup sheet for this semester is here. For Zoom meeting details to view RIP talks, contact Barbara Lewis.

The CBCB RIP series provides an informal forum for computational biologists to keep abreast of colleagues' projects, to help students and postdocs hone their presentation skills, and to get expert feedback on new or ongoing projects. The forum is targeted towards anyone working at the interface of biology and the analytical sciences. This is a great opportunity for everyone in our CBCB community to come together and learn about the research being done by our colleagues in different labs within the center.


Other seminars you may be interested in attending can be found HERE

    CBCB RIPS Schedule for Fall Semester 2021

    Date

    Speaker(s)

    PI/Lab/Host

    Topic & Abstract

    Time (if other than 2PM)

    9/9/21

    CBCB Member Introductions

    9/9/21

    9/16/21

    Shiva Mehravaran

    BISI-CBBG

    Pancorneal Symmetry Analysis of Fellow Eyes: A Machine Learning Proof of Concept Study

    Abstract: Despite great scientific progress in our knowledge of corneal properties and technological advances in corneal imaging, identifying subclinical forms of corneal degenerative disorders remains a major challenge. In this proof of concept project, the feasibility of a machine learning approach was tested for clustering attributes created from interocular difference data. The raw anterior corneal elevation data of the entire anterior surface from 4613 bilateral cases were used as the input. Python packages such as Pandas, NumPy, Matplotlib, and Seaborn, as well as various modules and custom code, were used to process the data, compute elevation difference matrices, create colormaps, and engineer features for unsupervised machine learning. Clustering was performed with the SimpleKMeans algorithm in WEKA (Waikato Environment for Knowledge Analysis) using the attributes created in the first stage of the project. Three clusters were generated, and mean interocular differences for measures of corneal thickness and keratometry in these clusters were in agreement with their corresponding groups reported in the literature. Proving the feasibility of this approach is the first step in creating a novel diagnostic index for identifying abnormal corneas.

    9/16/21

    9/23/21

    9/23/21

    9/30/21

    9/30/21

    10/7/21

    Mohsen Zakeri

    CBCB/Patro's Lab

    Title: Accurate and efficient quantification of spliced and unspliced RNA-seq single-cell reads with Alevin-fry

    Abstract: The rapid growth of high-throughput single-cell and single-nucleus RNA sequencing technologies has produced a wealth of data over the past few years. The available technologies continue to evolve and experiments continue to increase in both number and scale. The size, volume, and distinctive characteristics of these data necessitate the development of new software and associated computational methods to accurately and efficiently quantify single-cell and single-nucleus RNA-seq data into count matrices that constitute the input to downstream analyses. The alevin-fry framework is developed in the COMBINE-lab for quantifying single-cell and single-nucleus RNA-seq data. Alevin-fry is able to avoid false-positive expression by mapping the sequencing reads to both spliced and unspliced regions of the genome. It selects a minimal number of additional sequences from the intronic regions of the genome in addition to the transcriptome to build the index for mapping the reads. Alevin-fry builds its reference index with Pufferfish and employs a pseudoalignment algorithm with structural constraints, the sketch algorithm, to map the reads to the index (including both spliced and unspliced sequences). Alevin-fry offers different options for generating a permitlist (whitelist) from the input set of barcode sequences, as well as different options for UMI deduplication and generating the gene expression count matrices. Despite being faster and more memory-frugal than other accurate and scalable quantification approaches, alevin-fry does not suffer from the false-positive expression or memory scalability issues that are exhibited by other lightweight tools.
    I will talk about how alevin-fry can be effectively used to quantify single-cell and single-nucleus RNA-seq data, and also how the spliced and unspliced molecule quantification required as input for RNA velocity analyses can be seamlessly extracted from the same preprocessed data used to generate regular gene expression count matrices.

    10/7/21

    10/14/21

    Brantley Hall

    TBD

    10/14/21

    10/21/21

    Kiran Javkar

    CBCB/Pop

    SIMILE: Discovering shared genomic regions across a collection of metagenomic assemblies

    10/21/21

    10/28/21

    Jackie Michaelis

    CBCB/Pop

    Graph-based variant discovery reveals novel dynamics in the human microbiome

    Abstract: Sequence variation within metagenomes imparts important information about microbial changes in human and ecological health. However, many existing methods for variant detection are reference-dependent and limited to single nucleotide polymorphisms, missing more complex functional and structural changes. We use assembly graphs to detect structural variants in almost 1,000 metagenomes from the Human Microbiome Project. We identified over nine million variants representing insertion/deletion events, strain differences, plasmids, and repeats. Our analysis revealed striking differences in the rate of variation across body sites, highlighting niche-specific mechanisms of bacterial adaptation. Within indels and interspersed repeats, we also found mobile genetic elements, including potential phage that had integrated into a host bacterial genome. This work highlights the utility of using graph-based variant detection to capture biologically significant signals in microbial populations.

    10/28/21

    11/4/21

    11/4/21

    11/11/21

    11/11/21

    11/18/21

    11/18/21

    11/25/21

    Thanksgiving Holiday

    11/25/21

    Thanksgiving Holiday

    12/2/21

    12/2/21

    12/9/21

    12/9/21