Module : Lineage with Monocle - ComputationalSystemsBiology/Single-cell-RNA-seq GitHub Wiki
This module infers lineage using Monocle 2.
- Internal name : lineage-monocle
- Available : local mode
- Input Ports :
  - matrix : filtered expression matrix (tsv)
  - cells : normalized cells metadata (tsv)
- Output Ports :
  - none
- Optional parameters :
Parameter | Type | Description | Default Value |
---|---|---|---|
exp_family | text | Expression model family, should be one of : negbinomial, negbinomial.size | negbinomial |
detection_threshold | float | Gene detection parameter | 0.1 |
bypass | text | Use Monocle's native normalization (native) or bypass it and rely on Eoulsan normalization (bypass) | bypass |
select_genes | boolean | Whether to reduce the gene set before dimensionality reduction | False |
mean_threshold | float | Minimum mean expression value to keep a gene | 0.5 |
dispersion_fold_threshold | float | Minimum fold of dispersion to keep a gene | 1 |
max_dim | int | Maximum number of dimensions to keep after dimensionality reduction | 2 |
reduction_method | string | Dimensionality reduction method to use, should be one of : DDRTree or ICA | DDRTree |
norm_method | string | Variance stabilizing method to use, should be one of : vstExprs, log, or none | vstExprs |
reverse | boolean | Whether to represent cells in inverted order or not | False |
color_by | string | Column to use to color cells on plot | State |
- Configuration example
<step id="Lineage" skip="false">
    <module>lineage-monocle</module>
    <parameters>
        <parameter>
            <name>bypass</name>
            <value>bypass</value>
        </parameter>
        <parameter>
            <name>select_genes</name>
            <value>False</value>
        </parameter>
        <parameter>
            <name>color_by</name>
            <value>fluorescence</value>
        </parameter>
    </parameters>
</step>
Gene selection is conducted on log-transformed data. Maximum likelihood estimates of the mean and dispersion parameters of a negative binomial distribution are calculated. The results are shown in a plot of dispersion against mean expression. Only genes showing sufficient (user-defined) expression and dispersion are kept for the subsequent analysis; these genes appear in black. The red line is the estimate of dispersion as a function of the mean.
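As a sketch of how this filter might be switched on, the step below enables `select_genes` and spells out the two thresholds with their documented default values. The parameter names come from the table above. The value `True` is an assumption, chosen by analogy with the `False` used in the configuration example, and should be checked against your Eoulsan installation.

```xml
<step id="Lineage" skip="false">
    <module>lineage-monocle</module>
    <parameters>
        <!-- Assumption: "True" mirrors the "False" spelling of the documented default -->
        <parameter>
            <name>select_genes</name>
            <value>True</value>
        </parameter>
        <!-- Minimum mean expression value to keep a gene (documented default) -->
        <parameter>
            <name>mean_threshold</name>
            <value>0.5</value>
        </parameter>
        <!-- Minimum fold of dispersion to keep a gene (documented default) -->
        <parameter>
            <name>dispersion_fold_threshold</name>
            <value>1</value>
        </parameter>
    </parameters>
</step>
```

Raising either threshold should leave fewer genes in black on the plot, since only genes that pass both criteria are kept.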
The end product of the module is a lineage plot. Briefly, Monocle learns a minimum spanning tree from the data and then projects the tree onto a two-dimensional space. This tree is meant to reflect the evolution of gene expression along an activation or differentiation path. While the extremities of the plot are stable, the branching points depend greatly on the data. We suggest using it more as a way to challenge a hypothesis ("supposing this really occurs, does my data really reflect this process?") than as a reliable reconstruction of the process.
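The parameters that shape the lineage plot can be set explicitly in the step definition. The fragment below is a sketch only: it restates the documented defaults for `reduction_method`, `max_dim` and `color_by`, and flips `reverse` to `True`. The spelling `True` is an assumption based on the `False` default listed in the parameter table.

```xml
<step id="Lineage" skip="false">
    <module>lineage-monocle</module>
    <parameters>
        <!-- Dimensionality reduction method: DDRTree (default) or ICA -->
        <parameter>
            <name>reduction_method</name>
            <value>DDRTree</value>
        </parameter>
        <!-- Number of dimensions kept; 2 matches the two-dimensional plot -->
        <parameter>
            <name>max_dim</name>
            <value>2</value>
        </parameter>
        <!-- Assumption: "True" is the accepted spelling for inverting cell order -->
        <parameter>
            <name>reverse</name>
            <value>True</value>
        </parameter>
        <!-- Column used to color cells on the plot (documented default) -->
        <parameter>
            <name>color_by</name>
            <value>State</value>
        </parameter>
    </parameters>
</step>
```

Because branching points depend greatly on the data, it can help to rerun the step with a different `color_by` column, such as a known experimental annotation, and check whether the tree still supports the hypothesis.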