Lesson 3: Summary Statistics and QC Metrics
Lecture 3
3.0 Lesson learning objectives
By the end of this lesson, we should understand the sources of quality variability in spatially resolved transcriptomics data, how to evaluate the quality of such data, and how to use data visualizations to assist us in this assessment.
3.1 Why summarize?
3.1.1 Types of summary statistics
- Sum (total)
- Mean (average)
- Spread (variance)
- Dependence (pairwise correlations)
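As a minimal sketch of these four summaries, the base R snippet below computes each one per gene on a small made-up cells-by-genes count matrix. The matrix `counts`, its values, and the cell and gene names are hypothetical illustration data, not values from the MERFISH or Visium datasets.

```r
# Toy cells-by-genes count matrix (hypothetical values, for illustration only)
counts <- matrix(
  c(5, 0, 2,
    3, 1, 0,
    8, 0, 4,
    0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("geneA", "geneB", "geneC"))
)

gene_sum  <- colSums(counts)        # sum (total) of each gene across cells
gene_mean <- colMeans(counts)       # mean (average) of each gene across cells
gene_var  <- apply(counts, 2, var)  # spread (sample variance) of each gene
gene_cor  <- cor(counts)            # dependence (pairwise Pearson correlations between genes)
```

Swapping `colSums` and `colMeans` for `rowSums` and `rowMeans` would give the same summaries per cell instead of per gene.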
3.2 Quality Control
- Total number of genes per cell
- Number of unique gene species per cell
- Total number of cells expressing a gene
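The three metrics above can be sketched in base R as follows. This assumes a cells-by-genes count matrix and reads "total number of genes per cell" as the total count of gene molecules detected in each cell; the matrix `counts` and its values are hypothetical illustration data.

```r
# Toy cells-by-genes count matrix (hypothetical values, for illustration only)
counts <- matrix(
  c(5, 0, 2,
    3, 1, 1,
    8, 0, 0,
    0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("geneA", "geneB", "geneC"))
)

total_per_cell  <- rowSums(counts)      # total gene counts detected in each cell
unique_per_cell <- rowSums(counts > 0)  # number of distinct gene species detected in each cell
cells_per_gene  <- colSums(counts > 0)  # number of cells in which each gene is detected
```

Cells with unusually low `total_per_cell` or `unique_per_cell`, and genes with very low `cells_per_gene`, are the usual candidates to inspect more closely during quality control.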
3.3 Normalizing
Hands-on component 3
Our in-class hands-on component will involve analyzing either the MERFISH or Visium dataset to create a data visualization to evaluate the data quality. We will also do an in-class demonstration of the homework submission process.
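One possible starting point for such a visualization is sketched below with ggplot2. It uses simulated data rather than the MERFISH or Visium dataset: the data frame `qc` and its columns `x`, `y`, and `total_counts` are hypothetical stand-ins for per-cell positions and QC values computed from a real dataset.

```r
library(ggplot2)

set.seed(1)  # make the simulated data reproducible

# Simulated per-cell QC table (hypothetical; stands in for values from a real dataset)
qc <- data.frame(
  x = runif(200),                          # hypothetical spatial x coordinate
  y = runif(200),                          # hypothetical spatial y coordinate
  total_counts = rpois(200, lambda = 50)   # hypothetical total counts per cell
)

# Histogram of total counts per cell: low-count cells appear in the left tail
p_hist <- ggplot(qc, aes(x = total_counts)) +
  geom_histogram(bins = 30) +
  labs(x = "Total counts per cell", y = "Number of cells")

# Spatial scatter colored by total counts: can reveal regions of lower quality
p_space <- ggplot(qc, aes(x = x, y = y, color = total_counts)) +
  geom_point() +
  coord_fixed() +
  labs(color = "Total counts")
```

Printing `p_hist` or `p_space` at the console draws the plot. The histogram summarizes the overall distribution, while the spatial plot shows whether low-quality cells cluster in particular parts of the tissue.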
Class Lesson Notes 3
Prof. Fan’s whiteboard notes from class: genomic-data-visualization-classnotes-20220131.pdf (click to download)
Prof. Fan’s code from class: inclass-plotting-20220131.R (click to download)
Homework 3
None. Assignment and submission of Homework 3 will be pushed back one class to accommodate new students. For those less familiar with R and ggplot, please tinker and practice on your own so that you will be ready to create data visualizations of either the MERFISH or Visium dataset using ggplot in R for your next homework.