Synthetic Example 1 - nolanlab/citrus GitHub Wiki

Identification of populations whose abundance differs between healthy and diseased patients

This example demonstrates how to use Citrus to identify populations of cells whose abundance differs between healthy and diseased patients in a synthetic, two-dimensional dataset. In this synthetic dataset there are two lineage markers (Red & Blue) that enable resolution of 3 populations: Red-/Blue-, Red+/Blue-, & Red+/Blue+.

In this example, healthy patients have high abundances of the Red+/Blue- population and low abundances of the Red+/Blue+ population.

In turn, diseased patients have high abundances of the Red+/Blue+ population and low abundances of the Red+/Blue- population.

We now use Citrus to analyze a cohort of 10 healthy and 10 diseased patients in order to identify both the populations (R+/B- & R+/B+) and behaviors (varying abundances) that differ between the healthy and diseased patients.
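To make the design above concrete, the following is a minimal base R sketch that simulates one healthy-like and one diseased-like patient. It is not the code used to generate the distributed example files: the function `simulate_patient`, the population proportions (40% vs. 5%), and the marker levels (about 0 when negative, about 4 when positive) are all assumptions chosen for illustration.

```r
# Hypothetical simulation of the study design described above.
# NOT the code used to generate the distributed example1 files.
set.seed(1)

pop_levels <- c("neg", "red_only", "double")

simulate_patient <- function(n_cells, p_red_only, p_double) {
  # Cells not in a Red+ population fall in the Red-/Blue- population.
  probs <- c(1 - p_red_only - p_double, p_red_only, p_double)
  pop <- sample(pop_levels, n_cells, replace = TRUE, prob = probs)
  # Assumed marker levels: about 0 when negative, about 4 when positive.
  red  <- rnorm(n_cells, mean = ifelse(pop == "neg", 0, 4), sd = 0.5)
  blue <- rnorm(n_cells, mean = ifelse(pop == "double", 4, 0), sd = 0.5)
  data.frame(Red = red, Blue = blue,
             population = factor(pop, levels = pop_levels))
}

# Healthy: many Red+/Blue- cells; diseased: many Red+/Blue+ cells.
healthy  <- simulate_patient(1000, p_red_only = 0.40, p_double = 0.05)
diseased <- simulate_patient(1000, p_red_only = 0.05, p_double = 0.40)

# Fraction of cells in each population, one row per simulated patient
abund <- rbind(
  healthy  = prop.table(table(healthy$population)),
  diseased = prop.table(table(diseased$population))
)
print(round(abund, 2))
```

In this sketch the `red_only` column should be larger for the healthy patient and the `double` column larger for the diseased patient, which is the pattern Citrus is expected to recover from the real example files.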

This synthetic dataset is distributed with the Citrus source package and can be found in:

 citrus-master/inst/extdata/example1/
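If Citrus is already installed rather than unpacked from source, the same files can usually be located from within R. The sketch below relies on standard R packaging behavior (the contents of `inst/` are copied to the top level of the installed package) and assumes the example data were included in your installation; the variable names are illustrative.

```r
# system.file() returns "" when the package or the path cannot be found.
example_dir <- system.file("extdata", "example1", package = "citrus")

if (nzchar(example_dir)) {
  # List the FCS files shipped with the example
  fcs_files <- list.files(example_dir, pattern = "\\.fcs$", ignore.case = TRUE)
  print(fcs_files)
} else {
  message("citrus example data not found; check that the package is installed")
}
```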

Analysis Instructions

1. Start the R GUI, load the citrus package, and launch the citrus GUI.

 R> library("citrus")
 R> citrus.launchUI()

2. In the file selection dialog box that appears, select a single FCS file in the example1 synthetic data directory and click "Open."

3. A Citrus configuration GUI should automatically load in your web browser.

4. In the "Group 1 name" input field, change the name from "Group 1" to "Healthy". Change the name of "Group 2" to "Diseased". Note that in the Group Sample selection boxes beneath the Group Name inputs, samples should be automatically assigned to their respective groups based on their file names. Additionally, in the summary panel on the left, the assignment of each file to its group should be visible.

5. Click on the "Clustering Setup" tab. The "Red" and "Blue" parameters correspond to our lineage markers. Select them as Clustering Parameters. These data do not need to be transformed, so uncheck both parameters under the transform option. We will use the default values of 1,000 cells selected per sample and a minimum cluster size percent of 5%.

6. Click on the "Cluster Characterization" tab. In this circumstance, we are interested in looking for changes in cluster abundance. Select "Cluster abundances" as the features that we want Citrus to calculate.

7. Click on the "Regression Model Configuration" tab. Here, we'll use the nearest shrunken centroid model to identify features (specifically cluster abundances, as per step #6) that differ between our healthy and diseased patients. Check the "pamr" option. At this time, leave the "Cross Validation Folds" value set at 1.

8. Click on the "Run!" tab, select the radio button option "Quit GUI and run Citrus in R", and then click the "Run Citrus" button.

At this point, the GUI window should turn grey (you may close this window) and you should see the Citrus analysis running in R. After Citrus finishes running, a citrusOutput directory will be created in the same directory as the source data. This directory contains visual representations of the clustering hierarchy, an estimate of predictive model accuracy (i.e. a measure of how predictive the differing features are), the values of the differing features themselves in each group, and several other summary plots.
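The clustering and feature settings chosen in steps 5 and 6 can be made concrete with a little arithmetic. The base R sketch below is illustrative only and does not call Citrus. It assumes that the minimum cluster size percent is applied to the total number of sampled cells across all files, and it uses a made-up cluster membership (the `in_cluster` vector and its 40% / 5% membership rates are hypothetical) to show what a per-sample cluster abundance looks like.

```r
# Settings from steps 5 and 6 of the walkthrough
n_samples        <- 20    # 10 healthy + 10 diseased
cells_per_sample <- 1000  # default "cells selected per sample"
min_cluster_pct  <- 5     # default "minimum cluster size percent"

total_cells      <- n_samples * cells_per_sample         # 20000 cells clustered
# Assumption: the percentage is taken of all sampled cells.
min_cluster_size <- total_cells * min_cluster_pct / 100  # 1000 cells

# Hypothetical membership of every sampled cell in ONE cluster.
# Assumed rates: 40% of cells in healthy samples, 5% in diseased samples.
set.seed(42)
sample_id <- rep(seq_len(n_samples), each = cells_per_sample)
group     <- rep(c("Healthy", "Diseased"), each = n_samples / 2)
p_in      <- ifelse(group == "Healthy", 0.40, 0.05)
in_cluster <- rbinom(total_cells, size = 1, prob = p_in[sample_id]) == 1

# A cluster is retained only if it reaches the minimum size.
cluster_size <- sum(in_cluster)
keep_cluster <- cluster_size >= min_cluster_size

# "Cluster abundance" feature: fraction of each sample's cells in the cluster
abundance <- tapply(in_cluster, sample_id, mean)
print(round(tapply(abundance, group, mean), 2))
```

With one such abundance value per sample for every retained cluster, the classification model selected in step 7 has a sample-by-feature table to work from.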
