Command line arguments for source data clustering - nicococo/scRNA GitHub Wiki
Setting up the Source Dataset
scRNA-source.sh
Input and output files:
Command line arguments | Description |
---|---|
--fname | Source data (TSV file) |
--fgene-ids | Source gene ids (TSV file) |
--fout | Result files will use this prefix |
--flabels | (optional) Source cluster labels (TSV file) |
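As a minimal sketch of how the input and output flags above fit together, the following Python helper assembles a command line for `scRNA-source.sh`. The helper name `source_command` and all file names are hypothetical placeholders. Only the flags themselves come from the table.

```python
import shlex

def source_command(fname, fgene_ids, fout, flabels=None):
    """Build an argument list for scRNA-source.sh (hypothetical helper)."""
    cmd = [
        "scRNA-source.sh",
        "--fname", fname,          # source data (TSV file)
        "--fgene-ids", fgene_ids,  # source gene ids (TSV file)
        "--fout", fout,            # prefix for all result files
    ]
    if flabels is not None:
        # --flabels is optional, so it is only added when provided
        cmd += ["--flabels", flabels]
    return cmd

# Placeholder file names, chosen for illustration only
cmd = source_command("src_data.tsv", "src_gene_ids.tsv", "results/src")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` if the script is on the `PATH`.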
Data pre-processing (gene/cell filtering arguments, SC3-inspired):
Command line arguments | Description |
---|---|
--min_expr_genes | (Cell filter) Minimum number of expressed genes (default 2000) |
--non_zero_threshold | (Cell/gene filter) Threshold for zero expression per gene (default 1.0) |
--perc_consensus_genes | (Gene filter) Filter genes that coincide across a percentage of cells (default 0.98) |
--no-cell-filter | Disable cell filter |
--no-gene-filter | Disable gene filter |
--no-transform | Disable log2(x+1) data transformation |
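The filtering and transformation options above can be illustrated with a small NumPy sketch. It is not the tool's actual implementation. It assumes a genes x cells matrix, treats a gene as expressed when its value exceeds `non_zero_threshold`, and reads the gene filter in the SC3 sense: genes that are expressed, or not expressed, in at least `perc_consensus_genes` of the cells are dropped. The function names and the order of the steps are assumptions.

```python
import numpy as np

def cell_filter(data, min_expr_genes=2000, non_zero_threshold=1.0):
    """Boolean mask over cells (columns) with enough expressed genes."""
    expressed = data > non_zero_threshold
    return expressed.sum(axis=0) >= min_expr_genes

def gene_filter(data, perc_consensus_genes=0.98, non_zero_threshold=1.0):
    """Boolean mask over genes (rows) that are neither ubiquitous nor rare."""
    frac_expressed = (data > non_zero_threshold).mean(axis=1)
    ubiquitous = frac_expressed >= perc_consensus_genes
    rare = frac_expressed <= 1.0 - perc_consensus_genes
    return ~(ubiquitous | rare)

def log_transform(data):
    """The log2(x+1) transformation that --no-transform disables."""
    return np.log2(data + 1.0)

# Toy matrix: 4 genes x 5 cells (values are made up)
counts = np.array([
    [5.0, 6.0, 7.0, 8.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [3.0, 0.0, 4.0, 0.0, 0.0],
    [9.0, 9.0, 9.0, 9.0, 0.0],
])

# A tiny min_expr_genes is used here because the toy matrix has only 4 genes
keep_cells = cell_filter(counts, min_expr_genes=2)
filtered = counts[:, keep_cells]
keep_genes = gene_filter(filtered)
transformed = log_transform(filtered[keep_genes, :])
print(transformed.shape)
```

In this reading, `--no-cell-filter` and `--no-gene-filter` would correspond to skipping the respective mask, and `--no-transform` to skipping `log_transform`.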
Test settings: the software tests every value specified in --cluster-range and stores the results for each value separately.
Command line arguments | Description |
---|---|
--cluster-range | Comma-separated list of cluster numbers to test (default 6,7,8) |
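A short sketch of how such a comma-separated value could be turned into one run per cluster number. The parsing function and the per-run output naming (`<prefix>_c<k>`) are hypothetical. The wiki only states that results are stored separately, not how the files are named.

```python
def parse_cluster_range(value="6,7,8"):
    """Split a comma-separated string into a list of cluster numbers."""
    return [int(tok) for tok in value.split(",") if tok.strip()]

def per_cluster_prefixes(fout, cluster_numbers):
    """Map each cluster number to its own output prefix (naming is made up)."""
    return {k: f"{fout}_c{k}" for k in cluster_numbers}

ks = parse_cluster_range("6,7,8")
prefixes = per_cluster_prefixes("results/src", ks)
for k in ks:
    # One clustering run per value of k would happen here
    print(k, prefixes[k])
```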
NMF-related parameters:
Command line arguments | Description |
---|---|
--nmf_alpha | Regularization strength (default 1.0) |
--nmf_l1 | L1 regularization impact, a value in [0,1] (default 0.75) |
--nmf_max_iter | Maximum number of iterations (default 4000) |
--nmf_rel_err | Relative error threshold that must be reached for convergence (default 1e-3) |
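To make the role of these four parameters concrete, here is a self-contained NumPy sketch of regularized NMF with multiplicative updates. It is an illustration under stated assumptions, not the solver used by the tool. It assumes an elastic-net style penalty in which `alpha` scales the total regularization and `l1` splits it between an L1 and an L2 term, and it interprets the relative error threshold as the relative change of the reconstruction error between iterations. The function name `nmf_sketch` and the cluster assignment via `argmax` are also assumptions.

```python
import numpy as np

def nmf_sketch(X, k, alpha=1.0, l1=0.75, max_iter=4000, rel_err=1e-3, seed=0):
    """Factorize non-negative X (genes x cells) as W @ H with k components.

    Assumed objective:
      0.5*||X - WH||^2 + alpha*l1*(sum(W) + sum(H))
                       + 0.5*alpha*(1 - l1)*(||W||^2 + ||H||^2)
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    W = rng.random((n_genes, k)) + 1e-3
    H = rng.random((k, n_cells)) + 1e-3
    eps = 1e-12                      # guards against division by zero
    l1_pen = alpha * l1              # L1 share of the penalty
    l2_pen = alpha * (1.0 - l1)      # L2 share of the penalty
    prev = np.linalg.norm(X - W @ H)
    n_iter = 0
    for n_iter in range(1, max_iter + 1):
        # Multiplicative updates keep W and H non-negative
        H *= (W.T @ X) / (W.T @ W @ H + l1_pen + l2_pen * H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + l1_pen + l2_pen * W + eps)
        err = np.linalg.norm(X - W @ H)
        # Stop once the error changes by less than rel_err (relative)
        if abs(prev - err) / max(prev, eps) < rel_err:
            break
        prev = err
    return W, H, n_iter

# Toy data: 20 genes x 12 cells of random non-negative values
X = np.random.default_rng(1).random((20, 12))
W, H, n_iter = nmf_sketch(X, k=3)
labels = H.argmax(axis=0)            # one cluster index per cell (assumed rule)
print(W.shape, H.shape, n_iter)
```

Larger `alpha` values shrink both factors more strongly, and an `l1` closer to 1 shifts the penalty toward sparsity.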
Additional commands:
Command line arguments | Description |
---|---|
--no-tsne | Skip the t-SNE plots, which can be quite time-consuming to compute |