Instructor, GENE 520: Computational Human Genomics and Epigenomics
Graduate course, Case Western Reserve University, Computer and Data Science Department, 2025
Taught units that start from the basics of Python programming and build up to dimensionality reduction and representation learning techniques for biological data, with a focus on single-cell genomics and epigenomics.
- Python basics and data analysis with pandas and NumPy
- Data visualization with Matplotlib and Seaborn
- Working with sequencing data in Python
- Assignment to build a simple transcription factor binding site predictor that scans sequences for a motif using sliding-window cross-correlation
- Linear dimensionality reduction techniques such as PCA and MDS
- Nonlinear dimensionality reduction techniques such as t-SNE and UMAP
- Neural network-based representation learning techniques such as autoencoders and variational autoencoders
- Variational autoencoders for single-cell genomics and epigenomics data
- Assignment to train an scVI model on a single-cell RNA-seq dataset and use its latent representation for downstream analyses (clustering, visualization, and cell type classification), comparing the results against PCA and other linear methods
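
As a rough illustration of the sliding-window cross-correlation idea behind the motif assignment listed above, the sketch below one-hot encodes a DNA sequence and scores every window against a motif weight matrix. It is a minimal sketch under stated assumptions, not the course's actual assignment code: the function names `one_hot` and `scan_motif`, the toy motif, and the toy sequence are all hypothetical and chosen only for illustration.

```python
import numpy as np

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as an (L, 4) array; non-ACGT bases become all-zero rows."""
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(seq), 4))
    for pos, base in enumerate(seq.upper()):
        if base in index:
            encoded[pos, index[base]] = 1.0
    return encoded


def scan_motif(seq, weights):
    """Score every window of `seq` against a (motif_len, 4) weight matrix.

    The score of a window is the sum over the four bases of the
    cross-correlation between the sequence channel and the motif channel.
    """
    if len(seq) < weights.shape[0]:
        raise ValueError("sequence is shorter than the motif")
    encoded = one_hot(seq)
    # 'valid' mode slides the motif along the sequence without padding or flipping.
    return sum(
        np.correlate(encoded[:, j], weights[:, j], mode="valid")
        for j in range(4)
    )


# Toy example (hypothetical data): the one-hot consensus is used as the weight
# matrix, so each score is simply the number of matching bases in that window.
toy_weights = one_hot("TGACTCA")
toy_scores = scan_motif("AAATGACTCAGG", toy_weights)
best_start = int(np.argmax(toy_scores))
print(best_start, toy_scores[best_start])
```

In practice the weight matrix would more likely hold log-odds scores derived from a position frequency matrix, and the reverse strand would also be scanned; both are omitted here to keep the sketch short.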
