Comparing tSNE and PCA in Visium Data


Yuqi Zhang
I am an undergrad junior at Johns Hopkins University. I double major in Biomedical Engineering and Computer Science. I like jogging in my free time. Looking forward to learning in this class!

Comparing tSNE and PCA in Visium Data

I am visualizing PC1 and PC2 from PCA in Visium Data as well as tSNE Dimension 1 in Visium. PC1, PC2, and tSNE Dimension 1 are all quantitative data.

I use the geometric primitive of points to represent cells, and the position visual channel is used to encode PC1 and PC2 for each cell. The color visual channel is used to encode tSNE Dimension 1 for each cell.

The variation among cells explained by PC1 and PC2 is demonstrated using proximity in Gesalt principle. The cells closer to each other in the physical 2D plane tend to be preceived as being in a group. The variation among cells explained by tSNE Dimension 1 is demonstrated using similarity in Gesalt principle. Cells of similar color visual channel tend to be preceived as being in a group. Both principles yields two groups of cells which align with each other.

Therefore, in this exploratory data visualization, I try to make salient that there is agreement between variation explained by PC1 and PC2 and variation explained by tSNE Dimension 1.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
library(ggplot2)
visium_data = read.csv("~/genomic-data-visualization/data/Visium_Cortex_varnorm.csv.gz", 
                row.names = 1)
visium_data_gene = visium_data[, c(3:ncol(visium_data))]
# Normalization: 
visium_data_gene_normal = visium_data_gene/rowSums(visium_data_gene)*1e6
visium_data_gene_normal = log10(visium_data_gene_normal + 1) 

# PCA and tSNE
pcs = prcomp(visium_data_gene_normal)
tsne_res = Rtsne(visium_data_gene_normal, perplexity=20)

visium_data_gene_normal$PC1 = pcs$x[, 1]
visium_data_gene_normal$PC2 = pcs$x[, 2]
visium_data_gene_normal$t1 = tsne_res$Y[, 1]
visium_data_gene_normal$t2 = tsne_res$Y[, 2]

ggplot(visium_data_gene_normal, aes(PC1,PC2)) + geom_point(mapping = aes(color = t1), size = 1.5) +
  theme_bw() + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) + 
  ggtitle("tSNE Dim 1 Compared with PC1 and PC2 \nin Visium Dataset") +
  labs(color = "tSNE Dim 1")
ggsave("~/genomic-data-visualization/homework/hw1/yuqiz_hw1.png", height = 5, width = 5.5)