quality control - WXlab-NJMU/scrna-recom GitHub Wiki

Quality Control

Tools: Seurat, DoubletFinder, SoupX

Quality control using Seurat

Thresholds:

  • barcodes: max.genes, min.genes, max.counts, min.counts, max.mt, max.hb
  • genes: min.cells

Two modes are supported:

  • set common thresholds for all samples using command-line options
  • specify thresholds for a single sample in the csv file
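
The two modes can be pictured as a dictionary merge: the command-line values act as common thresholds, and any value given for a sample in the csv file replaces the common value for that sample only. The sketch below is illustrative Python, not the script's own R code. The precedence rule (per-sample values win) and the helper name `effective_thresholds` are assumptions.

```python
# Common thresholds; the values are the defaults listed in the script's help text.
COMMON_THRESHOLDS = {
    "max.genes": 5000,
    "min.genes": 200,
    "max.counts": 40000,
    "min.counts": 500,
    "min.cells": 3,
    "max.mt": 20,
    "max.hb": 10,
}


def effective_thresholds(common, overrides):
    """Return the thresholds for one sample (hypothetical helper).

    Assumption: a value given for a single sample takes precedence over
    the common value of the same name.
    """
    unknown = sorted(set(overrides) - set(common))
    if unknown:
        raise ValueError(f"unknown threshold name(s): {', '.join(unknown)}")
    return {**common, **overrides}


# Sample "ctrl" from the csv example overrides three of the common values.
ctrl = effective_thresholds(
    COMMON_THRESHOLDS, {"min.genes": 10, "min.counts": 10, "max.mt": 15}
)
print(ctrl["min.genes"], ctrl["max.genes"], ctrl["max.mt"])
```

Rejecting unknown names is a design choice of this sketch: a misspelled threshold fails loudly instead of being silently ignored.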

Usage

quality-control.R <csv> <outdir> <project> [options]


scRNA-seq quality control using Seurat

positional arguments:
  csv              csv file including sample, path, qc thresholds (specific to a single sample)
  outdir           output result folder
  project          project name

flags:
  -h, --help       show this help message and exit

optional arguments:
  --max.genes      nFeature_RNA maximum [default: 5000]
  --min.genes      nFeature_RNA minimum [default: 200]
  --max.counts     nCount_RNA maximum [default: 40000]
  --min.counts     nCount_RNA minimum [default: 500]
  --min.cells      minimum number of cells a gene must be detected in [default: 3]
  --max.mt         maximum percent of mt genes [default: 20]
  --max.hb         maximum percent of hb genes [default: 10]
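
As a rough illustration of how the per-barcode options could combine, the Python sketch below checks one barcode's metrics against the default values. nFeature_RNA and nCount_RNA are the per-barcode gene and count columns in Seurat metadata. The function name `passes_qc` is hypothetical, the bounds are assumed to be inclusive (the script may use strict comparisons), and `--min.cells` is left out because it applies to genes rather than barcodes.

```python
def passes_qc(
    n_genes,
    n_counts,
    pct_mt,
    pct_hb,
    min_genes=200,
    max_genes=5000,
    min_counts=500,
    max_counts=40000,
    max_mt=20.0,
    max_hb=10.0,
):
    """Return True if one barcode satisfies every threshold.

    Assumption: bounds are inclusive. Defaults mirror the help text.
    """
    return (
        min_genes <= n_genes <= max_genes
        and min_counts <= n_counts <= max_counts
        and pct_mt <= max_mt
        and pct_hb <= max_hb
    )


# A barcode with 1000 genes, 3000 counts, 5% mt and 0.5% hb is kept;
# one with only 100 detected genes falls below min.genes and is dropped.
print(passes_qc(1000, 3000, 5.0, 0.5))
print(passes_qc(100, 3000, 5.0, 0.5))
```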

csv format

  • sample: sample name
  • path: cellranger matrix folder, including genes.tsv, barcodes.tsv, matrix.mtx
  • qc: thresholds specific to this sample; use & to join multiple parameters

sample,path,qc
ctrl,examples/ctrl,min.genes=10&min.counts=10&max.mt=15
stim,examples/stim,min.genes=10&min.counts=10
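
To make the `qc` column format concrete, here is a minimal Python sketch that reads the example above and splits each `qc` value on `&` and `=`. It is not the script's own parser. The helper names `parse_qc` and `read_samples` are hypothetical, and treating an empty `qc` column as "no per-sample thresholds" is an assumption.

```python
import csv
import io

EXAMPLE_CSV = """sample,path,qc
ctrl,examples/ctrl,min.genes=10&min.counts=10&max.mt=15
stim,examples/stim,min.genes=10&min.counts=10
"""


def parse_qc(field):
    """Split 'name=value&name=value' into a dict of numeric thresholds."""
    overrides = {}
    for pair in field.split("&"):
        pair = pair.strip()
        if not pair:
            continue  # assumption: an empty qc column means no overrides
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected name=value, got {pair!r}")
        overrides[name.strip()] = float(value)
    return overrides


def read_samples(text):
    """Return one dict per csv row, with the qc column already parsed."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"sample": r["sample"], "path": r["path"], "qc": parse_qc(r["qc"] or "")}
        for r in rows
    ]


for row in read_samples(EXAMPLE_CSV):
    print(row["sample"], row["qc"])
```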

Examples

# test data: edit the paths in `qc.input.csv` to absolute paths first
# run
quality-control.R examples/qc.input.csv ~/test/qc pbmc
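
Editing the paths by hand works for two samples. For longer sample sheets, the rewrite could be scripted along these lines. This is a sketch only: `absolutize_paths` is a hypothetical helper, and it assumes relative paths should be resolved against a base directory that you supply.

```python
import csv
from pathlib import Path


def absolutize_paths(csv_in, csv_out, base_dir):
    """Copy csv_in to csv_out, rewriting relative `path` entries as absolute."""
    base = Path(base_dir).resolve()
    with open(csv_in, newline="") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames
        rows = list(reader)
    for row in rows:
        path = Path(row["path"]).expanduser()
        if not path.is_absolute():
            path = base / path  # resolve relative entries against base_dir
        row["path"] = str(path)
    with open(csv_out, "w", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

For instance, `absolutize_paths("examples/qc.input.csv", "qc.input.abs.csv", ".")` would write a copy (the output file name here is made up) whose `path` column is absolute, leaving the `sample` and `qc` columns untouched.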

Outputs

qc
├── pbmc.qc.rds              # seurat object after qc
├── pbmc.qc.stat.csv         # qc statistics
├── pbmc.qc_after.pdf        # plot after qc
├── pbmc.qc_before.pdf       # plot before qc
├── ctrl.barcodes.csv        # sample barcodes after qc
└── stim.barcodes.csv        # sample barcodes after qc
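
A quick way to confirm that a run finished is to check that the files in this listing exist. The Python sketch below assumes that the file names follow the pattern suggested by the listing (`<project>.qc.*` plus one `<sample>.barcodes.csv` per sample). Both helper names are hypothetical.

```python
from pathlib import Path


def expected_qc_outputs(project, samples):
    """File names expected in the output folder (pattern assumed from the listing)."""
    names = [
        f"{project}.qc.rds",
        f"{project}.qc.stat.csv",
        f"{project}.qc_after.pdf",
        f"{project}.qc_before.pdf",
    ]
    names += [f"{sample}.barcodes.csv" for sample in samples]
    return names


def missing_outputs(outdir, project, samples):
    """Return the expected file names that are not present in outdir."""
    root = Path(outdir).expanduser()
    return [
        name
        for name in expected_qc_outputs(project, samples)
        if not (root / name).is_file()
    ]


print(expected_qc_outputs("pbmc", ["ctrl", "stim"]))
```

After the example run, `missing_outputs("~/test/qc", "pbmc", ["ctrl", "stim"])` should return an empty list if every file was written.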

Remove doublets using DoubletFinder

Usage

remove-doublet.R <input> <outdir> <project> [options]


scRNA-seq Doublet Removal using DoubletFinder

positional arguments:
  input            input seurat rds file
  outdir           output result folder
  project          project name

flags:
  -h, --help       show this help message and exit

optional arguments:
  -d, --dims       npcs in Seurat::RunPCA [default: 50]
  -n, --nfeatures  number of variable features to use for ScaleData and
                   PCA [default: 2000]

Examples

remove-doublet.R examples/input.rds ~/tests/doublet-removal test

Outputs

remove-doublet
├── pbmc.dedoublet.rds                   # seurat object after doublet removal
├── pbmc.dedoublet.stat.csv              # statistics
├── ctrl.dedoublet.dims=30.after.rds     # sample ctrl after doublet removal
├── ctrl.dedoublet.dims=30.pdf           # sample ctrl figures in doublet removal
├── ctrl.dedoublet.dims=30.stat.csv      # sample ctrl statistics in doublet removal
├── stim.dedoublet.dims=30.after.rds     # sample stim after doublet removal
├── stim.dedoublet.dims=30.pdf           # sample stim figures in doublet removal
└── stim.dedoublet.dims=30.stat.csv      # sample stim statistics in doublet removal

Remove background RNA using SoupX

Usage

remove-background.R <raw> <filtered> <outdir> <project>


scRNA-seq Background RNA Removal using SoupX

positional arguments:
  raw         cellranger raw_feature_bc_matrix folder
  filtered    cellranger filtered_feature_bc_matrix folder
  outdir      output result folder
  project     project name

flags:
  -h, --help  show this help message and exit

Examples

# download test data
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_raw_gene_bc_matrices.tar.gz
tar -zxvf pbmc4k_raw_gene_bc_matrices.tar.gz
wget https://cf.10xgenomics.com/samples/cell-exp/2.1.0/pbmc4k/pbmc4k_filtered_gene_bc_matrices.tar.gz
tar -zxvf pbmc4k_filtered_gene_bc_matrices.tar.gz
# run 
remove-background.R ./raw_gene_bc_matrices/GRCh38 ./filtered_gene_bc_matrices/GRCh38 ~/test/background-removal pbmc4k

Outputs

├── pbmc4k.bkremoval.SoupX.pdf  # features
├── soupx_filtered_matrix       # count matrix after SoupX
│   ├── barcodes.tsv
│   ├── genes.tsv
│   └── matrix.mtx
└── clustering                  # cluster information