Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique for facilitating new discoveries in data-driven research. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant for data visualization and apply these principles to genomic data, including large-scale single-cell and spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Caleb Hallinan
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday, Wednesday, and by request. See Canvas for location details.

Course Details
☞ see Course tab


All Visualizations

EC1: Comparing PCA and tSNE clustering methods with gganimate

What’s visualized? A gif visualizing the clusters derived with k-means in reduced dimensional space using linear vs non-linear methods (PCA and tSNE), as well as in the original physical space....

hw 4 DEG analysis

Description of analysis: The modification from HW3, which used the Eevee dataset, is that I changed the cluster selection based on the overall cluster visualization in physical space. Method-wise,...

HW4: Exploring Cell Type with Differentially upregulated CD4

1. Figure Description. Figure A: Total within-cluster sum of squares using different values of k. Figure B: Cluster 2 is highlighted in red in PCA space, while the remaining three...

Identifying B cell markers in imaging dataset

To begin analyzing the imaging dataset, I decided to normalize by cells’ areas, rather than use count-based normalization. Afterwards, I clustered my normalized gene expression data using k-means and determined...

Analyzing PDGFRB Gene Expression in Pikachu Dataset

Visualization Summary: In this visualization, I analyzed a cluster within the Pikachu dataset responsible for cell growth, and likely cancer. This was a major change from the Eevee sequencing dataset...

Validating Sequencing-based 10x Visium Identification of T Cell Population with Imaging-Based Spatial Transcriptomics

From last week’s results and selected cluster, I identified the genes LTB, CD247, and IL7R, all of which suggest a T cell population (or similar immune cell population comprising...

HW4 Data Exploration - Cluster 4 and MMP2 Gene

One modification for this visualization compared to the previous one was the gene I selected to focus on: the Pikachu dataset does not contain the CCN1 gene, which I focused on...

Identifying a Transcriptionally Similar Cell Type Across Datasets: Clustering and Differential Expression Analysis

I previously performed k-means clustering with k=6 to identify distinct transcriptional clusters in the Eevee dataset. When analyzing the Pikachu dataset, I initially used the same approach but found that...

Multi-dimensional Analysis of HER2+ Cells: Spatial Distribution and Gene Expression Patterns

In analyzing both the Pikachu and Eevee datasets, I successfully identified similar cell populations while making several key adjustments to account for the different data types. The most significant change...

Multi-Panel Data Visualization of Epithelial Cell Cluster in Eevee Dataset

This figure presents an analysis of cellular clusters within the Eevee dataset, focusing on the identification and characterization of a biologically relevant cluster using k-means clustering, dimensionality reduction techniques (PCA...