Welcome

Welcome to the Course Website for EN.580.428 Genomic Data Visualization!

As the primary mode through which analysts and audience members alike consume data, data visualization remains an important hypothesis-generating and analytical technique in data-driven research to facilitate new discoveries. However, if done poorly, data visualization can also mislead, bias, and slow down progress. This hands-on course will cover the principles of perception and cognition relevant for data visualization and apply these principles to genomic data, including large-scale single-cell and spatially resolved omics datasets, using the R statistical programming language. Students will be expected to complete class readings, create weekly data visualizations as homework assignments, and make a major class presentation.

Course Information

Course Staff: Prof. Jean Fan and Kalen Clifton
Office Hours: 10:00am-10:50am Monday, Wednesday, and Friday. See Slack for location details.
Lectures: 8:00am-9:50am Monday, Wednesday, and Friday. See Slack for location details.

Course Details
☞ see Course tab


All Visualizations

Dimensionality Reduction approach for spatial transcriptomics in genes ZEB1 and ZEB2

1. Should I normalize and/or transform the gene expression data (e.g. log and/or scale) prior to dimensionality reduction? It is recommended to normalize the data with scale to help take...

Clusters of Genes Expressing ERBB2 using t-SNE

What data types are you visualizing? I am visualizing the similarities in levels of overall gene expression in cells that have non-zero expression of ERBB2 and the level of expression...

Running tSNE analysis on genes or PCs

What data types are you visualizing? In this multi-panel plot, I am visualizing various quantitative and categorical data. For the PCA plot on the upper left, I am visualizing quantitative...

AQP1 Expression for Contrasting Principal Component Numbers

What data types are you visualizing? I am using categorical data (zero and nonzero expression) as well as quantitative (color gradient of expression).

Homework 3 Submission: Comparing Pre-processing Methods Prior to PCA

This series of plots demonstrates the effects of pre-processing steps taken prior to utilizing PCA for dimensionality reduction of multi-dimensional gene expression data. Some pre-processing steps include but are not...

Homework 3


The effect of Count Per Million normalization on Dimensionality Reduction

What data types are you visualizing? I am visualizing quantitative data of cells’ positions in the tSNE-embedded 2-dimensional space. I am also visualizing the total gene count for each individual...

Comparing TSNE on gene expression matrix and top principal components

What data types are you visualizing? I am visualizing the qualitative expression data of PTPRC, the most dominant gene in PCA, with respect to different TSNE reductions upon gene expression or...

Spatial Distribution of GATA3 and ADIPOQ expressions

What data types are you visualizing? I am visualizing quantitative data of the expression of GATA3 (gene of interest) and ADIPOQ (the most highly variable gene) for cells with at...

Relationship between the centroid mass, area and its occurrences of ZEB1

What data types are you visualizing? I am visualizing quantitative data on how many ZEB1 genes are found in a specific area given the x_centroid (4000-4500) and y_centroid (3000-3250)....

Spatial Relationship between ACTG2, ADAM9, and BASP1

What data types are you visualizing? My categorical data is the gene with the highest expression dictating the color, my spatial data is the x and y centroids depicting the...