I am a PhD student at Yale University in the lab of Matthew Simon. I have developed a number of bioinformatic tools for processing and analyzing nucleotide recoding RNA-seq (NR-seq) experiments. NR-seq is a class of methods for quantifying the kinetics of RNA synthesis and degradation, and includes methods like TimeLapse-seq (developed in the Simon lab), SLAM-seq, and TUC-seq. These are powerful tools that are allowing us to study the dynamic lives of RNA at unprecedented resolution, and I have had a lot of fun dreaming up ways in which we can extract exciting biological insights from NR-seq data.
I am also a statistician, and have developed a number of novel methods for analyzing NR-seq data. One of my passions is helping biologists better understand the statistical methods powering their favorite bioinformatic tools. This led me to develop a course at Yale called “Statistical Intuition for Modern (RNA) Biochemists”. It’s inspired by Susan Holmes and Wolfgang Huber’s book of a similar title and Richard McElreath’s cultural phenomenon “Statistical Rethinking”. It aims to be an accessible introduction to statistics for biochemistry majors. We will be covering the basic machinery of statistical modeling and its use in popular methods like linear modeling and clustering. Rather than relying on mathematical formalism to convey these concepts, though, we will use simulations and interactive exercises to help students develop an intuition for them. We will start the course by introducing students to RNA-seq, so as to have a common data language that we can connect all concepts back to. Eventually, all of the course material will be hosted at this repo.
Education
Yale University | New Haven, CT | August 2019 - Present
PhD in Molecular Biophysics and Biochemistry
Centre College | Danville, KY | August 2015 - May 2019
B.S. in Physics, minor in Math
Experience
Programming in R (since 2019)
Scripting (example repo)
R package development (most notably, EZbakR)
Shiny app development (e.g., RNAdecayCafe)
Snakemake pipeline development (since 2021)
fastq2EZbakR: Flexible processing of NR-seq data.
NRsim: Simulating NR-seq data to test analysis strategies and pipelines.
AnnotationCleaner: Assembling annotations using StringTie and some custom scripts.
THE_Aligner: Aligning almost any kind of RNA-seq data.
PROseq_etal: PRO-seq/ChIP-seq/ATAC-seq pipeline.
Other experience
Python (since 2021; used throughout Snakemake pipelines listed above)
C (since 2023; example repo)
PyTorch (since 2023; example repo)
