Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique in data-driven research to facilitate new discoveries. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant for data visualization and apply these principles to genomic data, including large-scale single-cell and spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Caleb Hallinan
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday and Wednesday, and by request. See Canvas for location details.

Course Details
☞ see Course tab


All Visualizations

Hw5: Identifying Tissue Samples and Cell Types

Through my analysis, I concluded that the tissue sample is white pulp. This is because the main genes that are expressed are in the CD family, which are mainly found...

Deducing tissue structure in CODEX dataset

1. Write a description explaining why you believe your data visualization is effective using vocabulary terms from Lesson 1.

Identifying cell types in CODEX dataset

Description: The visualization utilizes UMAP to reduce the high-dimensional CODEX data into a 2D projection, which allows for effective clustering of cells with similar marker expression patterns. Each point in...

HW5

Description: For this assignment, I started out by following steps similar to those in my previous homeworks - normalizing the data, performing k-means clustering using the optimal k value, visualizing...

Uncovering spleen tissue type

[description] Figure caption: Figures A and B share the same legend. Figure A shows the physical location of each cell on this tissue slide, and each cell is colored by...

HWEC1: Exploring Differences Between Linear and Non-linear Dimensionality Reduction

1. Figure Description. Figure State 1: Eevee’s cell spots in PCA space, with the x axis for PC1 and the y axis for PC2. Figure State 2: Eevee’s cell spots in t-SNE...

Interrogating Cell Type with CODEX Spleen Dataset

I select clusters 6 and 2 for further analysis given their distinctive spatial organization. Performing differential gene expression analysis on cluster 6 with the Wilcoxon “greater-than” test yields significant gene...

Identification of White Pulp Tissue Structures within CODEX Dataset

1. Create a data visualization and write a description to convince me that your interpretation is correct.

CODEX dataset analysis

We conducted normalization, standardization, dimensionality reduction, k-means clustering, and differential expression analysis to reveal two distinct cell clusters within the CODEX data. t-SNE plots and marker expression heatmaps are plotted...

Interpreting CODEX data

Describe your figure briefly so we know what you are depicting (you no longer need to use precise data visualization terms as you have been doing). The data was normalized...

Exploring differences between linear and non-linear dimensionality reduction methods

Description: The visualization compares three different dimensionality reduction techniques, PCA (Principal Component Analysis), t-SNE (t-Distributed Stochastic Neighbor Embedding), and UMAP (Uniform Manifold Approximation and Projection), to visualize high-dimensional gene expression data from...

Question 4: Exploring the Effect of Varying Principal Components for Non-Linear Dimensionality Reduction

For this project, I created an animation using gganimate to visualize the effect of varying the number of principal components before applying t-SNE. I used the Pikachu dataset. First, I...