Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique that facilitates new discoveries in data-driven research. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant to data visualization and apply these principles to genomic data, including large-scale single-cell and spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Caleb Hallinan
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday and Wednesday, and by request. See Canvas for location details.

Course Details
☞ see Course tab


All Visualizations

Comparing PCA and t-SNE Dimensionality Reduction on Spatial Transcriptomics Dataset

In many tissues, cells with similar gene expression patterns tend to cluster together both in a dimensionality-reduced “gene expression space” (like the PCA or t-SNE plots) and in their actual...

HW2: Exploring PC1 Loading Vs. Gene Expression Variance Before and After Normalization

1. How do the gene loadings on the first PC relate to features of the genes such as their variance? Using the raw data, when the gene expression variance is...

Homework 2 submission

[description] In my visualization, I use points as the geometric primitive, and angle and color as the visual channels. The x-axis represents the PCA loadings for each gene, while the y-axis shows...

Making a Multi-Panel Data Visualization

The visualization effectively conveys relationships between gene expression and spatial organization by utilizing dimensionality reduction (PCA) to simplify high-dimensional gene expression data. The PCA scatter plot helps distinguish patterns in...

Comparison of Scaled and Unscaled PCA: Gene Mean Expression, Variance, and PC1 Loadings

1. What data types are you visualizing? I am visualizing quantitative data, which includes log-transformed mean expression (x-axis), log-transformed variance (y-axis), and PC1 loading values (color hue).

PC1 values (unscaled vs. scaled variances) as a function of spatial coordinates

1. What data types are you visualizing? I wanted to visualize spatial data (locations of spots on the tissue sample) and quantitative data (PC1 values for each spot). I looked at...

Analyzing the Relationship between Cell Gene Expression and Position in the Eevee Dataset

1. What data types are you visualizing? I am visualizing quantitative data, specifically cell gene expression and positional data (x and y coordinates).

Top 5 Genes with Highest PC1 Loads

This visualization addressed the second aim, specifically how gene loadings on the first PC relate to features of the gene when scaled and unscaled. Particularly, I used a violin plot...

PCA and Spatial Distribution Multi Panels

1. Why is My Data Visualization Effective?

Comparison of Pikachu Cells Within Gene Expression Space and Physical Space

This data visualization uses the Pikachu dataset to investigate how cells are related in gene expression space compared with their distribution in physical space. Furthermore, the visualization uncovers...

Impact of Scaling on PCA: Relationship Between PC1 Loadings and Gene Expression

This visualization explores the relationship between gene loadings on the first principal component (PC1) and mean gene expression. The results show that in unscaled data, genes with higher mean expression...