Figure 1 - RegnerM2015/scENDO_scOVAR_2020 GitHub Wiki
scRNA-seq processing workflow
The input to this workflow is the filtered feature-barcode matrix generated by cellranger for each patient tumor sample. The output is a Seurat object saved as an rds object for each patient tumor sample. Each sample is processed by essentially the same script, which performs the following tasks:
/scRNA-seq Processing Scripts/Individual_Samples/Patient*[1-11]_scRNA-seq.R
- Preprocessing & QC
- Feature selection, dimension reduction and clustering
- inferCNV
- SingleR reference-based annotation of cell types
- Save Seurat object as rds object
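As a rough illustration, the Seurat portion of the per-sample steps above might look like the sketch below. It is not the repository's script: the function name `process_sample`, the QC cutoffs, the number of principal components, and the clustering resolution are placeholder assumptions. The inferCNV and SingleR steps are omitted because they depend on reference data not described here. The sketch also assumes the cellranger matrix contains gene expression features only.

```r
# Hypothetical helper: names and thresholds are illustrative assumptions,
# not values taken from the repository's scripts. Requires the Seurat package.
process_sample <- function(matrix_dir, sample_id, out_rds,
                           min_features = 200, max_percent_mt = 20,
                           n_pcs = 30, resolution = 0.8) {
  # Load the filtered feature-barcode matrix written by cellranger
  counts <- Seurat::Read10X(data.dir = matrix_dir)
  obj <- Seurat::CreateSeuratObject(counts = counts, project = sample_id)

  # Preprocessing & QC: drop low-complexity cells and cells with high
  # mitochondrial read content (cutoffs are assumptions)
  obj[["percent.mt"]] <- Seurat::PercentageFeatureSet(obj, pattern = "^MT-")
  keep <- obj$nFeature_RNA >= min_features & obj$percent.mt <= max_percent_mt
  obj <- subset(obj, cells = colnames(obj)[keep])

  # Feature selection, dimension reduction and clustering
  obj <- Seurat::NormalizeData(obj)
  obj <- Seurat::FindVariableFeatures(obj, nfeatures = 2000)
  obj <- Seurat::ScaleData(obj)
  obj <- Seurat::RunPCA(obj, npcs = n_pcs)
  obj <- Seurat::FindNeighbors(obj, dims = seq_len(n_pcs))
  obj <- Seurat::FindClusters(obj, resolution = resolution)
  obj <- Seurat::RunUMAP(obj, dims = seq_len(n_pcs))

  # Save the Seurat object as an rds object
  saveRDS(obj, file = out_rds)
  invisible(obj)
}
```

A call such as `process_sample("path/to/filtered_feature_bc_matrix", "Patient1", "Patient1_scRNA.rds")` would then produce one rds object per sample. The paths and sample name here are made up.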
After processing each patient tumor sample, we run the following script, which combines the individual Seurat objects into one multi-sample Seurat object representing the full cohort of 11 patient samples. The inputs to this script are the individual patient Seurat objects. The output is a fully processed multi-sample (full cohort) Seurat object. The script performs the following tasks on this multi-sample Seurat object:
/scRNA-seq Processing Scripts/Full_Cohort/Patients1-11_scRNA-seq.R
- Re-normalize and re-scale
- Feature selection, dimension reduction and clustering
- Retain inferred CNVs from individual sample processing
- Assign cell type labels to clusters based on the majority label within each cluster
- Verify SingleR cell type labels with cell type gene signatures from PanglaoDB using Seurat's AddModuleScore()
- Differential expression analysis (Seurat's FindAllMarkers)
- Save Seurat object as rds object
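The majority-label step in the list above can be illustrated in base R. This is a minimal sketch on made-up data, not the repository's code: the data frame `meta` and its column names are assumptions standing in for the per-cell metadata of the multi-sample Seurat object.

```r
# Toy per-cell metadata: cluster assignment and SingleR label for each cell.
# All values are made up for illustration.
meta <- data.frame(
  cluster = c("0", "0", "0", "1", "1", "1", "1", "2", "2"),
  singler = c("Epithelial", "Epithelial", "Fibroblast",
              "T cell", "T cell", "T cell", "Macrophage",
              "Fibroblast", "Fibroblast"),
  stringsAsFactors = FALSE
)

# Cross-tabulate clusters (rows) against SingleR labels (columns)
counts <- table(meta$cluster, meta$singler)

# For each cluster, pick the label carried by the most cells.
# which.max returns the first maximum, so ties go to the first column.
majority <- colnames(counts)[apply(counts, 1, which.max)]
names(majority) <- rownames(counts)

# Map the cluster-level label back onto every cell
meta$cell_type <- unname(majority[meta$cluster])
print(majority)
```

Cells whose own SingleR label disagrees with their cluster's majority (the lone fibroblast in cluster 0 here) take on the cluster label, which is why a separate check against marker gene signatures is useful.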
scATAC-seq processing workflow
The R package ArchR was used extensively for the scATAC-seq analysis. The starting input for this script is the ATAC fragments file from each patient tumor sample generated by cellranger-atac. The output is a multi-sample ArchR Project (including cells from all 11 patients) saved as an rds object that contains 1) a 500 bp genomic tile matrix, 2) an estimated gene activity matrix, 3) an inferred gene expression matrix, and 4) a peak matrix.
/scATAC-seq Processing Scripts/Full_Cohort/Patients1-11_scATAC-seq.R
- Preprocessing & QC
- Feature selection, dimension reduction and label transferring using scRNA-seq cell type subcluster labels
- Plot intermediate UMAP plots
- Peak calling within each inferred cell type subcluster
- Plot intermediate heatmaps
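The core of the steps above could be sketched with ArchR roughly as follows. This is an outline under stated assumptions, not the repository's script: the wrapper name `build_atac_project`, its arguments, the genome build, and the name of the metadata column holding the scRNA-seq labels are all hypothetical. QC thresholds are left at ArchR's defaults, and the plotting steps are omitted.

```r
# Hypothetical wrapper: argument names, genome build, and label column are
# assumptions. Call library(ArchR) before using it; peak calling needs MACS2.
build_atac_project <- function(fragment_files, sample_names, seurat_obj,
                               rna_label_col, out_dir, out_rds,
                               genome = "hg38") {
  addArchRGenome(genome)

  # Preprocessing & QC: Arrow files hold the 500 bp tile matrix and the
  # estimated gene activity (gene score) matrix
  arrows <- createArrowFiles(
    inputFiles = fragment_files,
    sampleNames = sample_names,
    addTileMat = TRUE,
    TileMatParams = list(tileSize = 500),
    addGeneScoreMat = TRUE
  )
  proj <- ArchRProject(ArrowFiles = arrows, outputDirectory = out_dir)

  # Dimension reduction on the tile matrix
  proj <- addIterativeLSI(proj, useMatrix = "TileMatrix", name = "IterativeLSI")
  proj <- addUMAP(proj, reducedDims = "IterativeLSI", name = "UMAP")

  # Label transfer from scRNA-seq; this also stores inferred gene expression
  proj <- addGeneIntegrationMatrix(
    ArchRProj = proj,
    useMatrix = "GeneScoreMatrix",
    matrixName = "GeneIntegrationMatrix",
    reducedDims = "IterativeLSI",
    seRNA = seurat_obj,
    groupRNA = rna_label_col,
    addToArrow = TRUE,
    force = TRUE,
    nameCell = "predictedCell",
    nameGroup = "predictedGroup",
    nameScore = "predictedScore"
  )

  # Peak calling within each transferred label, then the peak matrix
  proj <- addGroupCoverages(proj, groupBy = "predictedGroup")
  proj <- addReproduciblePeakSet(proj, groupBy = "predictedGroup")
  proj <- addPeakMatrix(proj)

  saveRDS(proj, file = out_rds)
  invisible(proj)
}
```

The four matrices named in the workflow description correspond here to ArchR's `TileMatrix`, `GeneScoreMatrix`, `GeneIntegrationMatrix`, and `PeakMatrix`.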
Helpful graphic of scRNA-seq/scATAC-seq processing workflow:
Note that the last five steps shown in the graphic are performed later on.
Figure 1 plotting
The starting inputs for this script are 1) the multi-sample (full cohort) Seurat object and 2) the multi-sample ArchR project. The outputs are the UMAP plots and proportion bar charts presented in Figure 1.
/Figure_1/Figure_1.R
- Plot scRNA-seq/scATAC-seq UMAP plots colored by sample
- Plot scRNA-seq/scATAC-seq UMAP plots colored by cell type
- Plot proportion bar charts for scRNA-seq and scATAC-seq showing the contribution of each patient to each cell type subcluster
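The proportions behind such bar charts can be computed in base R. The following is a minimal sketch on made-up data, not the code in Figure_1.R: the data frame `cells` and its sample and subcluster names are assumptions standing in for the per-cell metadata.

```r
# Toy per-cell metadata; sample and subcluster names are made up.
cells <- data.frame(
  sample  = c("Patient1", "Patient1", "Patient2",
              "Patient2", "Patient2", "Patient3"),
  cluster = c("A", "B", "A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Count cells: rows = patients, columns = cell type subclusters
n <- table(cells$sample, cells$cluster)

# Normalize each column so that the patient fractions within a
# subcluster sum to 1
prop <- prop.table(n, margin = 2)
print(round(prop, 3))
```

Passing the result to `barplot(prop, legend.text = rownames(prop))` would draw one stacked bar per subcluster, with each segment showing a patient's contribution.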