Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique in data-driven research, one that facilitates new discoveries. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant to data visualization and apply these principles to genomic data, including large-scale spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Suki
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Canvas for location details.
Office Hours: 10:00am-10:50am Monday, Wednesday, and by request. See Canvas for location details.

Course Details
☞ see Course tab


All Visualizations

Visualizing Spatial Gene Expression with PCA and tSNE

1. How do the gene loadings on the first PC relate to features of the genes such as their mean or variance? The first principal component captures the direction of...

HW2: How Distance Metric in tSNE affects Spatial Tissue Structure

1. What data types are you visualizing? Spatial data: X and Y coordinates representing physical tissue location. Quantitative data: t-SNE embeddings (continuous numerical values) and Euclidean distances to centroids. Categorical...

Using PCA to visualize spatial patterns in high-dimensional gene expression within coronal kidney section

1. What data types are you visualizing? The represented data type is quantitative. I am visualizing the x and y spatial positions of cells in the coronal kidney section (all...

HW2

1. Write a description explaining what you are trying to make salient. This visualization shows the expression of the five genes that contribute most positively and most negatively to PC1....

Spatial Organization of Genes with Extreme PCA Loadings

1. What data types are you visualizing? I’m visualizing both quantitative and categorical data. The dataset has quantitative spatial information of x and y coordinates for each spot in the...

Influence of Gene Mean and Gene Variance on PC1

1. What data types are you visualizing? I am visualizing quantitative data of: 1) the mean expression of each gene, averaged across all spatial spots within the data...

Comparing High and Low Loading Genes Across Spatial and PCA Spaces

Write a description explaining what you are trying to make salient and why you believe your data visualization is effective, using vocabulary terms from Lesson 1. (Question 2: How do...

HW 2: How Gene Loadings on the First PC Relate to Mean and Variance

1. What data types are you visualizing? I am visualizing quantitative data for the genes, including the PC1 loadings, mean expression per gene, and variance per gene. I also visualized...

HW 2

Summary: PC1 loadings correlate positively with both mean expression (r = 0.52-0.58) and variance (r = 0.44-0.48). This indicates PC1 primarily captures overall expression magnitude - highly expressed genes dominate...

Impact of Principal Component Selection on t-SNE Coordinates

1. What about the data would you like to make salient?

HW2

1. Write a description explaining what you are trying to make salient and why you believe your data visualization is effective using vocabulary terms from Lesson 1.

How do tSNE coordinates change as the number of PCs increases?

1. What data types are you visualizing? I am answering the question of how tSNE coordinates change as the number of PCs increases. I computed PCA on the log-transformed, normalized gene expression...