Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique for facilitating new discoveries in data-driven research. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant to data visualization and apply these principles to genomic data, including large-scale single-cell and spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Caleb Hallinan
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday, Wednesday, and by request. See Canvas for location details.

Course Details
☞ see Course tab


All Visualizations

Animating the tSNE embedding of CODEX spleen dataset

I thought it would be fun to animate the transition from normal space (our x/y positional coordinates) to the tSNE embedded space. Thus, after normalizing my data, I first performed...

Eevee Dataset Deconvolution and Cell-type Analysis

Visualization Summary: In this visualization, I analyzed the Eevee dataset for distinct cell types using deconvolution (through STdeconvolve) and visualized the result using scatterbar. I found 9 distinct cell types (Figure A),...

Comparison of PCA and t-SNE Methods

Visualization Summary: In this visualization, I analyzed the differences between linear and nonlinear dimensionality reduction to visualize cells in the Eevee dataset. The animation transitions between Principal Component Analysis and...

Visualization of nonlinear-embedded gex data versus linear-embedded gex data

Here, I illustrate the effect of an embedding in either PC-embedded (linear) space or tSNE-embedded (nonlinear) space. As observed, the PC-embedded shape resembles a volcano plot, with prominent spot placement...

EC1

Description:

Comparing Gene Expression in Normalized and Transformed Data

1. Written Answer Question: What happens if I do or do not normalize and/or transform the gene expression data (e.g. log and/or scale) prior to dimensionality reduction? I chose to...

A Comparative Analysis of Deconvolution and Clustering in Eevee Dataset

Write a brief description of your figure so we know what you are visualizing.

Using gganimate to understand different preprocessing methods

This animation visualizes how different preprocessing methods (raw, log-transformed, scaled, and log-scaled) affect t-SNE dimensionality reduction in spatial transcriptomics data. Each frame represents a different transformation method, highlighting variations in...

Deconvolution of Eevee Dataset for Identification of the Epithelial Cell Type

In this study, I analyzed the Eevee spatial transcriptomics dataset using deconvolution techniques and clustering methods to identify distinct cell types and visualize their gene expression patterns. The dataset was...

Visualization of t-SNE on Different PC Counts on Pikachu Dataset

1. If I perform nonlinear dimensionality reduction on PCs, what happens when I vary how many PCs I use?

Difference between STdeconvolve and K-means Clustering on the Eevee Dataset

As in homework 4, I performed STdeconvolve on the Eevee dataset to infer cell-type proportions and used K-means clustering (K=7) to analyze tissue organization. The scatterbar plot visualizes the...

Normalized Eevee Data in PCA Space with and without Log10 Transformation

This animated visualization makes salient the answer to ‘What happens if I do or do not log10-transform the normalized gene expression data prior to dimensionality reduction?’. The beginning of the...