FAQ - nolanlab/spade GitHub Wiki

How do I use SPADE on non-cytometry data?

SPADE was designed to work with flow (fluorescence) and mass cytometry (CyTOF) data. In principle, however, it should work with any sort of tabular data (e.g. gene expression data). The best first step is to convert your data to an FCS file using software such as Mathematica, MATLAB or R and then just try running SPADE.
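
As a minimal sketch of the conversion step in R, assuming the Bioconductor package flowCore is installed: a numeric matrix with one row per cell and one named column per gene can be wrapped in a flowFrame and written out with write.FCS. The gene names, values, and output path below are made up for illustration.

```r
# Hypothetical expression matrix: one row per cell, one named column per gene.
set.seed(42)
expr <- matrix(
  rexp(1000 * 3, rate = 0.1),
  ncol = 3,
  dimnames = list(NULL, c("GeneA", "GeneB", "GeneC"))
)

# Illustrative output location; replace with your own path.
fcs_path <- file.path(tempdir(), "expression.fcs")

# Only attempt the conversion when flowCore is available.
if (requireNamespace("flowCore", quietly = TRUE)) {
  ff <- flowCore::flowFrame(exprs = expr)       # wrap the matrix as a flowFrame
  flowCore::write.FCS(ff, filename = fcs_path)  # write it out as an FCS file
}
```

The resulting file can then be handed to SPADE like any other FCS file, with the column names serving as the channel names.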

Often, it is desirable to transform your data. In cytometry, this is usually an ArcSinh-like transformation; in gene expression, this is usually a log transformation. If your data is already transformed, then you should set a linear transform with an argument of 1 when running SPADE:
```
 TRANSFORMS=flowCore::linearTransform(a=1)
```
If it's not already transformed, you could have SPADE do the log transform for you:
```
 TRANSFORMS=flowCore::logTransform(logbase=2) # for base 2
```
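To see what the log option does to the values, here is a small base-R sketch. It assumes that, with its other arguments left at their defaults, flowCore::logTransform(logbase=2) behaves like log(x, base = 2); the counts are made up. Zero counts, which are common in gene expression data, map to -Inf under a plain log, so you may want to handle them before handing the data to SPADE.

```r
# Made-up raw counts, including a zero.
raw_counts <- c(0, 1, 8, 1024)

# Base-R stand-in for a base-2 log transform (an assumption about
# what logTransform(logbase = 2) computes with default arguments).
log2_values <- log(raw_counts, base = 2)

# Flag values that are no longer finite after the transform.
not_finite <- !is.finite(log2_values)

print(log2_values)
print(raw_counts[not_finite])  # the counts that became -Inf
```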
SPADE does not normalize each parameter. Whether to normalize ahead of time is up to you and depends on how you expect your data to look.

Because most gene expression data sources yield information for a relatively small number of cells (e.g. 1000), it might be desirable to disable downsampling, or not downsample much. (This depends on how heterogeneous your sample is.) The following setting would target 90% of your cells:
```
 downsampling_target_percent=0.9
```
As when analyzing flow cytometry data, you should increase the target number of nodes slightly for each clustering parameter added. The defaults intended for mass cytometry data are 200 nodes and 10% of cells, assuming 10 to 15 clustering channels and >100,000 cells. Fluorescence cytometry datasets where around six clustering channels are used might target 50 to 100 nodes. Gene expression data might behave similarly but will depend on the distribution of signals for each channel.
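
Putting the settings above together, a call from R might look roughly like the following sketch. It assumes the SPADE.driver entry point of the spade R package and the argument names cluster_cols, transforms, and k (the target number of nodes); these are assumptions to check against ?SPADE.driver in your installed version. The file name, gene names, output directory, and node count are placeholders.

```r
input_file <- "expression.fcs"                 # placeholder: your converted data
cluster_genes <- c("GeneA", "GeneB", "GeneC")  # placeholder clustering channels

# Run only when the spade package and the input file are actually present.
if (requireNamespace("spade", quietly = TRUE) && file.exists(input_file)) {
  spade::SPADE.driver(
    input_file,
    out_dir = "spade_output",                          # placeholder directory
    cluster_cols = cluster_genes,
    transforms = flowCore::logTransform(logbase = 2),  # data not yet transformed
    downsampling_target_percent = 0.9,                 # target 90% of the cells
    k = 50                                             # assumed node count for few channels
  )
}
```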

If you have a success story, or run into problems using SPADE on non-cytometry data, please let us know by filing an issue.

How do I apply compensation?

By default, SPADE compensates FCS data if it finds a compensation matrix in the SPILL or SPILLOVER keyword of the FCS file. For most users this should "just work". However, problems emerge if you export only compensated data (often indicated by angle brackets in FlowJo-generated files) AND also export the compensation matrix. In this case, SPADE finds the compensation matrix and attempts to apply it, and then either fails to find the parameters (because FlowJo inserted brackets in the names) or compensates the data a second time. The error you might see in the former case is something like:
```
   Error: The following parameters in the spillover matrix
   are not present in the flowFrame:
```
So what should I do? The easiest approach is to make sure you export both compensated and uncompensated data, along with the compensation matrix. Alternatively, you can export compensated data without the compensation matrix, or export uncompensated data with the compensation matrix. Then, when setting up the clustering step in SPADE, select only the parameters without the brackets.
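
As an illustration of that last step, the sketch below filters a set of parameter names down to those without angle brackets. The channel names are hypothetical; in practice you would take them from your own FCS file.

```r
# Hypothetical parameter names from a file that contains both
# uncompensated channels and bracketed (compensated) copies.
channel_names <- c("FSC-A", "SSC-A", "FITC-A", "PE-A", "<FITC-A>", "<PE-A>")

# A name counts as bracketed when it is wrapped in angle brackets.
is_bracketed <- grepl("^<.*>$", channel_names)

# Keep only the names without brackets.
unbracketed <- channel_names[!is_bracketed]
print(unbracketed)
```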

How do I upgrade to a new version?

You need to re-install both the R-package and the Cytoscape plugin as the two are developed "in-sync" with each other, and differing versions might not be compatible. Follow the instructions on the GettingStarted page for installing the R-package and Cytoscape plugin.

Note: If you are upgrading from a very old version of SPADE, i.e., from November 2010, talk to the developers beforehand, as the newer versions are not backward compatible with results from the oldest SPADE versions. If you are upgrading from the 0.2+ versions, we don't think there should be any incompatibilities, except for "Generate PDFs" for those upgrading from version 0.2*. Newer versions of the "Generate PDFs" script are not backward compatible with results generated by 0.2* era SPADE. If this is a problem for you, contact the developers for help.

How do I export the attributes for a file?

As of SPADE 1.6 (or Bioconductor version 1.10), the attribute tables are automatically produced in output/tables.

Why does percentile-driven downsampling keep so many cells?

Often when using downsampling_target_pctile (in lieu of downsampling_samples), many more cells are retained after downsampling than expected. For example, you might set a target percentile of 10% but retain 92% of the cells. Why does this happen? As its name suggests, the target percentile is exactly that: a target. Internally, SPADE computes the density corresponding to that percentile, then retains all cells with a density below that value and retains each cell above it with a probability proportional to the ratio of the cutoff to the cell's density. When working with high-dimensional data, e.g., CyTOF data, it can be challenging to choose a meaningful radius in which to compute the density. As a result, almost all cells tend to fall within this radius of each other and the computed densities vary little, e.g., there is little difference between the 10th and, say, the 90th percentile. In these cases very few cells are eliminated in the downsampling operation. To mitigate this issue, you can specify the target number of cells (downsampling_target_number), specify the per-file percentage of cells (downsampling_target_percent), or attempt to reduce the dimensionality to just the most relevant markers.
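
The effect can be reproduced with a small base-R sketch. This is not SPADE's actual implementation: it takes the keep probability to be exactly cutoff/density (capped at 1), and the two sets of per-cell densities are simulated stand-ins, one widely spread and one nearly flat.

```r
# Sketch of percentile-targeted, density-dependent downsampling.
# Returns a logical vector marking which cells are kept.
downsample_by_pctile <- function(density, target_pctile) {
  cutoff <- quantile(density, probs = target_pctile, names = FALSE)
  # Cells at or below the cutoff get probability 1; denser cells are
  # kept with probability cutoff / density (an assumed exact form).
  keep_prob <- pmin(1, cutoff / density)
  runif(length(density)) < keep_prob
}

set.seed(1)
n_cells <- 10000

# Simulated densities that vary widely from cell to cell.
spread_density <- rlnorm(n_cells, meanlog = 3, sdlog = 1.5)
# Simulated densities that are nearly identical, as can happen in
# high dimensions when most cells fall within the radius of each other.
flat_density <- runif(n_cells, min = 95, max = 100)

spread_kept <- mean(downsample_by_pctile(spread_density, 0.10))
flat_kept <- mean(downsample_by_pctile(flat_density, 0.10))

print(c(spread = spread_kept, flat = flat_kept))
```

With the same 10% target, far fewer cells are retained from the widely spread densities than from the nearly flat ones, because in the flat case the cutoff is almost as large as every cell's density.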

On Windows, why is SPADE only using 1 core? I thought it was multithreaded!

SPADE uses OpenMP to implement multithreading. By default we disable OpenMP on Windows due to issues with the pthreads compatibility DLLs. Users wishing to enable OpenMP support on Windows will need to modify src/Makevars.win to include the following compiler flags:
```
 PKG_CXXFLAGS=-fopenmp
 PKG_LIBS=-mthreads -lgomp -lpthread
```
and ensure that the necessary pthreads compatibility libraries are installed. These libraries can be downloaded for 32-bit Windows installations at http://sourceware.org/pthreads-win32.

On OSX I get this error: Abort trap: 6

This error seems to result from how OSX implements/supports OpenMP and has been observed by a number of people. By default OpenMP creates as many threads as cores, including hyperthreading cores. Reducing the number of threads to just the number of physical cores seems to fix the issue (and is good practice generally). If you are using the runSPADE script, you can set the -num_threads argument, e.g., -num_threads=4 on a quad-core processor, or set the OMP_NUM_THREADS environment variable to the desired number of threads.
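
If you run SPADE from an R session rather than through the runSPADE script, one way to set the environment variable is sketched below. It assumes the variable is set before SPADE (and its OpenMP runtime) is loaded in the session, and that parallel::detectCores(logical = FALSE) can report the physical core count on your platform; on some platforms it cannot, hence the fallback.

```r
# Ask for the number of physical (not hyperthreaded) cores.
physical_cores <- parallel::detectCores(logical = FALSE)

# Fall back to a single thread if the count cannot be determined.
if (is.na(physical_cores)) physical_cores <- 1L

# Limit OpenMP to that many threads for this R session.
Sys.setenv(OMP_NUM_THREADS = as.character(physical_cores))
print(Sys.getenv("OMP_NUM_THREADS"))
```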

Error in checkSlotAssignment(object, name, value)

Full error:
```
 Error in checkSlotAssignment(object, name, value) :
 assignment of an object of class "numeric" is not valid for slot "transformationId" in an object of class "transform"; is(value, "character") is not TRUE
 Calls: -> @<- -> slot<- -> checkSlotAssignment
 Execution halted
```
Your computer is probably set to use commas instead of periods as the decimal point. We recently (29-September-2012) fixed this issue. Please update your version of SPADE to the latest (>=1.5.2) by following the installation instructions in the wiki Getting Started page.
