Quick hands on - adelabriere/SLAW GitHub Wiki

To quickly process some LC-MS data, complete the following steps:

  1. First, set up the latest version of the workflow:

    1. If Docker is not already installed, install it (here).
    2. Once Docker is installed, open a terminal on Mac or Linux, or PowerShell on Windows (not PowerShell ISE).
    3. Pull the workflow from Docker Hub: docker pull adelabriere/slaw:stable
  2. Pick your demo data: Demo data are provided here (https://github.com/adelabriere/SLAW/tree/master/test_data/mzML.zip). If you want to use your own data, go to step 3; otherwise, jump to step 4.

  3. Format your inputs (optional; only needed if you use your own data)

SLAW takes centroided LC-MS mzML files as input. Files can be centroided using the MSConvert software.

All mzML files that you want to process need to be stored in the same folder. Information about the samples needs to be provided in a CSV file called summary.csv. This file should contain only two columns, path and type. path should contain the unique name of each mzML file. type should contain the sample class, one of:

  • QC: Pooled QC. All QC files are expected to be identical in theory.
  • blank: Any blank that should be subtracted.
  • MS2: Any file that should not be peak picked and from which only the MS2 spectra need to be extracted.
  • sample: A file for which MS1 peak picking needs to be performed and MS2 spectra need to be extracted.
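As an illustration, summary.csv could be generated with a short script rather than typed by hand. The sketch below is not part of SLAW: it assumes a comma-separated file with a path,type header row, and the guess_type function relies on a hypothetical naming convention (the keywords blank, qc and ms2 appearing in file names) that you should adapt to your own files.

```python
import csv
from pathlib import Path


def guess_type(file_name: str) -> str:
    """Guess the sample class from the file name (hypothetical convention)."""
    lowered = file_name.lower()
    if "blank" in lowered:
        return "blank"
    if "qc" in lowered:
        return "QC"
    if "ms2" in lowered:
        return "MS2"
    return "sample"


def write_summary(input_dir: str) -> Path:
    """Write summary.csv listing every .mzML file found in input_dir."""
    folder = Path(input_dir)
    # Keep only mzML files; other files in the folder are ignored.
    names = sorted(p.name for p in folder.iterdir() if p.suffix.lower() == ".mzml")
    summary_path = folder / "summary.csv"
    with summary_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "type"])  # the two columns described above
        for name in names:
            writer.writerow([name, guess_type(name)])
    return summary_path
```

Calling write_summary("my_mzml_folder") would write my_mzml_folder/summary.csv. It is worth opening the result and checking each assigned type before running the workflow, since the keyword matching is only a guess.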

  4. Create the output folder: Create an empty directory in which you want to store the output of the processing. For a demonstration from scratch, this folder should contain only the mzML files and the summary.csv file.

  5. Generate a parameters file: Run the workflow with the following command:

docker run -v $INPUT_DIR$:/input -v $OUTPUT_DIR$:/output adelabriere/slaw:stable

where $INPUT_DIR$ is the folder containing the .mzML files and $OUTPUT_DIR$ is the folder that will receive the output. A file named parameters.txt should be generated at $OUTPUT_DIR$/parameters.txt. It is a valid YAML file that can be opened with any text editor.
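The placeholders $INPUT_DIR$ and $OUTPUT_DIR$ stand for real folders on your machine. As a rough sketch (not part of SLAW), the same command can be assembled and launched from Python; the function names slaw_command and run_slaw are made up for this example. Paths are resolved to absolute paths because docker's -v option may interpret a host path that is not absolute as a named volume rather than a folder.

```python
import subprocess
from pathlib import Path

IMAGE = "adelabriere/slaw:stable"  # image name taken from the command above


def slaw_command(input_dir: str, output_dir: str) -> list[str]:
    """Build the docker command with absolute host paths."""
    in_path = Path(input_dir).resolve()
    out_path = Path(output_dir).resolve()
    return [
        "docker", "run",
        "-v", f"{in_path}:/input",
        "-v", f"{out_path}:/output",
        IMAGE,
    ]


def run_slaw(input_dir: str, output_dir: str) -> bool:
    """Run the container; return True if parameters.txt exists afterwards."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if docker exits with an error.
    subprocess.run(slaw_command(input_dir, output_dir), check=True)
    return (Path(output_dir) / "parameters.txt").exists()
```

A first call to run_slaw should leave parameters.txt in the output folder. Calling it again with that file in place corresponds to running the actual computation.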

  6. Tune the parameters: Although the workflow includes parameter optimization, it is not turned on by default. The most important parameter is the peakwidth. If you want to run the parameter optimization, set the value of optimization/need_optimization to true and tune the value ranges of the parameters you want to optimize.

  7. Run the actual computation: To do so, simply rerun the command from step 5, this time with the parameters.txt file present in the output folder:

docker run -v $INPUT_DIR$:/input -v $OUTPUT_DIR$:/output adelabriere/slaw:stable
  8. Inspect your output: The most interesting outputs of SLAW are written to subfolders of the $OUTPUT_DIR$ directory.
  • datamatrices contains three flavors of datamatrices:
    • datamatrix****.csv: The aligned peak table containing the MS1 quantification information as well as some diagnostic metrics, and the link to the MS2 feature if present.
    • annotated****full.csv: A table similar to the first one, but with the isotopes and adducts annotated.
    • annotated****reduced.csv: A table containing only one line per adduct group.
  • fused_mgf contains the consensus MS2 output as an mgf, with the link to features given in ms2_id.
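To get a quick overview of these files, a small script can count features and spectra. This is a sketch, not a SLAW utility: the function summarize_outputs is hypothetical, the tables are assumed to be comma-separated with one header row, and spectra are counted by the BEGIN IONS lines that open each entry in the MGF format.

```python
import csv
from pathlib import Path


def summarize_outputs(output_dir: str) -> dict:
    """Count rows/columns per data matrix and spectra per MGF file."""
    out = Path(output_dir)
    report = {"tables": {}, "spectra": {}}

    # Each CSV in datamatrices: (number of data rows, number of columns).
    for table in sorted((out / "datamatrices").glob("*.csv")):
        with table.open(newline="") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            n_rows = sum(1 for _ in reader)
        n_columns = len(header) if header else 0
        report["tables"][table.name] = (n_rows, n_columns)

    # Each MGF in fused_mgf: one spectrum per "BEGIN IONS" line.
    for mgf in sorted((out / "fused_mgf").glob("*.mgf")):
        with mgf.open() as handle:
            n_spectra = sum(1 for line in handle if line.strip() == "BEGIN IONS")
        report["spectra"][mgf.name] = n_spectra

    return report
```

Comparing the row counts of the full and reduced annotated tables gives a rough idea of how many features were grouped as adducts or isotopes of the same compound.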