Home | Genomic Data Visualization

Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis generating and analytical technique in data-driven research to facilitate new discoveries. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant for data visualization and apply these principles to genomic data, including large-scale single-cell and spatially-resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Caleb Hallinan
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday, Wednesday, and by request. See Canvas for location details.

Course Details

☞ see Course tab

Featured Visualizations

Interrogating Spatial Spot Cluster Differential Gene Expression with 10x Visium

In these panels, I am depicting the representation of a 10x visium dataset in latent tSNE-embedded space and over the original spatial slide coordinates. I...

Sachin K
12 Feb 2025

Identifying a Cluster of Breast Granular Cells

In the top left of my figure, I am depicting both my clusters made by kmeans clustering with k=7 in PCA space (with my cluster...

Alex Gorham
12 Feb 2025

Homework 3: Differentially Expressed Genes analysis

[description] Those panels present a comprehensive visualization of Cluster 0 and its association with the gene SFRP4 through a combination of UMAP, spatial, and gene...

Yi Yang
10 Feb 2025

Spatial Transcriptomics Reveals a Distinct Epithelial Cell Population Defined by ELF3 Expression: A Multi-Dimensional Analysis of the Cluster in Interest

1. Describe your figure briefly so we know what you are depicting. Write a description to convince me that your cluster interpretation is correct.

Carol L
10 Feb 2025

Homework 2 submission

[description] In my visualization, I use points as the geometric primitive, angle and color for visual channel. The x-axis represents the PCA loadings for each...

Yunzhou L
06 Feb 2025

Comparison of Scaled and Unscaled PCA: Gene Mean Expression, Variance, and PC1 Loadings

1. What data types are you visualizing? I am visualizing quantitative data, which includes log-transformed mean expression (x-axis), log-transformed variance (y-axis), and PC1 loading values...

Daniel C
06 Feb 2025

Dimensionality Reduction using PCA

Homework 2

Kevin Meza Landeros
05 Feb 2025

HW1: Gene expression pattern for GNB1 and HES4

1. What data types are you visualizing? I am visualizing HES4 and GNB1’s spatial gene expression patterns for eevee.

Alexandra M
29 Jan 2025

10 Genes with the Highest Counts are Expressed Across Most Spots

1. What data types are you visualizing? For this data visualization of the Eevee spatial transcriptomic data, I visualized both categorical data, the 10 genes...

Alex Gorham
29 Jan 2025

Generation of Heatmap Expressing Top 20 Genes Within Pikachu Dataset

1. What data types are you visualizing? Within the Pikachu dataset that was visualized, gene expression levels across multiple individual cells proved to be a...

Meera Bhat
28 Jan 2025

HW1: gene expression scatterplot

1. What data types are you visualizing? Spatial data of each cell, i.e the location of the cell within the section of the image, which...

Suki O
28 Jan 2025

All Visualizations

Identification of White Pulp Tissue Structures within CODEX Dataset

1. Create a data visualization and write a description to convince me that your interpretation is correct.

Carol L
25 Feb 2025

CODEX dataset analysis

We conducted normalization, standardization, dimensionality reduction, k-means clustering, and differential expression analysis to reveal two distinct cell clusters within the CODEX data. tSNE plots and marker expression heatmaps is plotted...

Sky Li
24 Feb 2025

Interpreting CODEX data

Describe your figure briefly so we know what you are depicting (you no longer need to use precise data visualization terms as you have been doing). The data was normalized...

Amanda Looi
24 Feb 2025

Exploring differences between linear and non-linear dimensionality reduction methods

Description The visualization compares three different dimensionality reduction techniques—PCA (Principal Component Analysis), t-SNE (t-Distributed Stochastic Neighbor Embedding), and UMAP (Uniform Manifold Approximation and Projection)—to visualize high-dimensional gene expression data from...

Yi Yang
23 Feb 2025

Question 4: Exploring the Effect of Varying Principal Components for Non-Linear Dimensionality Reduction

For this project, I created an animation using gganimate to visualize the effect of varying the number of principal components before applying t-SNE. I used the Pikachu dataset. First, I...

Kamie Mueller
22 Feb 2025

EC1: Comparing PCA and tSNE clustering methods with gganimate

What’s vizualized? A gif visualizing the cluster derived with kmeans in reduced dimensional space using linear vs non-linear methods (PCA and tSNE), as well as in the original physical space....

Suki O
22 Feb 2025

Differential Gene Expression Analysis- Eevee vs. Pikachu

Description

Hannah Puhov
20 Feb 2025

hw 4 DEG analysis

Description of analysis The modification from the HW3 using eevee data set is that I changed the selection of cluster based on the overall cluster visualization in physical space. Method-wise,...

Hannah T
19 Feb 2025

HW4: Exploring Cell Type with Differentially upregulated CD4

1. Figure Description. Figure A: Total within-cluster sum of squares using different value of k. Figure B: Cluster 2 is highlighted in red in PCA space, while the remaining three...

Alexandra M
19 Feb 2025

Identifying B cell markers in imaging dataset

To begin analyzing the imaging dataset, I decided to normalize by cells’ areas, rather than use count-based normalization. Afterwards, I clustered my normalized gene expression data using k-means and determined...

Sabahat Rahman
19 Feb 2025

Analyzing PDGFRB Gene Expression in Pikachu Dataset

Visualization Summary In this visualization, I analyzed a cluster within the Pikachu dataset responsible for cell growth, and likely cancer. This was a major change from the Eevee sequencing dataset...

Sid R.
19 Feb 2025

Validating Sequencing-based 10x Visium Identification of T Cell Population with Imaging-Based Spatial Transcriptomics

From last week’s results and selected cluster, I identified the genes LTB, CD247, and IL7R , all of which suggest a T cell population (or similar immune cell population comprising...

Sachin K
19 Feb 2025