Introduction
Single-cell RNA-sequencing (scRNA-seq) is a method for molecular profiling of cellular states. A common problem in molecular profiling techniques such as scRNA-seq is "batch effects" ("batch fx"), whereby differences between samples arise from technical rather than biological differences, such as when the samples were processed, who did the experiment, which instrument they were run on, or which batch of reagents was used. This problem is not unique to RNA-seq; it was originally found in microarrays. In fact, one of the commonly used tools for RNA-seq batch effect correction is called ComBat and was originally developed for microarrays (
https://www.bu.edu/jlab/wp-assets/ComBat/Abstract.html).
Last summer, eighteen students attended Cold Spring Harbor's Single Cell Analysis course (
https://meetings.cshl.edu/courses.aspx?course=C-SINGLE&year=17 and
https://github.com/olgabot/cshl-singlecell-2017). At the course, the students worked in pairs to perform single-cell capture and RNA-seq of 800-5000 single human male fibroblasts, supposedly a very homogeneous population. However, the data is incredibly "batchy," depending on which group of people performed the experiment. The cells are clustered based on the Spearman correlation between cells (converted to a distance by sqrt(2*(1-rho))) using
PhenoGraph, which creates a KNN graph and finds "communities" in it (here with K=30):
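
For readers who want to see the distance conversion concretely, here is a minimal NumPy sketch of the Spearman-to-distance step and a plain nearest-neighbor lookup. It is not the code used for the analysis, and it does not reproduce PhenoGraph's community detection. The function names (`average_ranks`, `spearman_distance`, `knn_indices`) and the toy matrix are made up for illustration, and the input is assumed to be a cells-by-genes array.

```python
import numpy as np

def average_ranks(x):
    """Rank a 1-D array (1-based), giving tied values their average rank."""
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)                # highest rank in each tie group
    mean_rank = last_rank - (counts - 1) / 2.0   # centre of each tie group
    return mean_rank[inverse]

def spearman_distance(expr):
    """expr: cells x genes (assumed layout). Returns sqrt(2 * (1 - rho))."""
    ranks = np.apply_along_axis(average_ranks, 1, np.asarray(expr, dtype=float))
    rho = np.corrcoef(ranks)                     # Pearson on ranks is Spearman
    # Clip to guard against tiny negative values from floating-point error.
    return np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))

def knn_indices(dist, k):
    """Indices of each cell's k nearest neighbors, excluding the cell itself."""
    d = dist.copy()
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]

# Toy data (hypothetical): 3 cells x 4 genes.
toy = np.array([[1.0, 2.0, 3.0, 4.0],
                [2.0, 4.0, 6.0, 9.0],   # same gene ordering as cell 0: rho = 1
                [4.0, 3.0, 2.0, 1.0]])  # reversed ordering: rho = -1
dist = spearman_distance(toy)
neighbors = knn_indices(dist, k=1)
```

With this conversion, perfectly correlated cells sit at distance 0 and perfectly anti-correlated cells at distance 2, so the KNN graph links cells whose genes are ranked most similarly. On the real data, K would be 30 rather than 1.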