SAM_Processing - LappalainenLab/RNApipeline GitHub Wiki

Basic Usage

The SAM_Processing handler turns each raw SAM file into a processed BAM file: it converts the file to BAM, sorts it, and marks duplicates. The script uses both Picard and SAMtools for these steps. It also indexes the final BAM files for use with downstream tools.
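The steps described above can be sketched as a dry run that only prints commands for one sample. This is an illustration, not the handler's actual code: the function names, output file names, the `PICARD_JAR` variable, and the division of work between SAMtools and Picard are all assumptions. The `samtools sort`, `samtools index -b`/`-c`, and Picard `MarkDuplicates` invocations are standard forms of those tools.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints one possible command sequence for a single sample.
# Output names, PICARD_JAR, and tool options are assumptions, not the handler's code.

# Map an index type to the matching `samtools index` flag (-b: BAI, -c: CSI).
index_flag() {
    case "$1" in
        BAI) echo "-b" ;;
        CSI) echo "-c" ;;
        *) echo "unknown index type: $1" >&2; return 1 ;;
    esac
}

# Print the commands that would convert, sort, mark duplicates, and index.
plan_sam_processing() {
    local sam="$1" out_dir="$2" index_type="$3"
    local name flag sorted marked
    name="$(basename "${sam}" .sam)"
    flag="$(index_flag "${index_type}")" || return 1
    sorted="${out_dir}/${name}_sorted.bam"
    marked="${out_dir}/${name}_marked.bam"
    # samtools sort reads SAM and, with -O bam, writes sorted BAM in one step
    echo "samtools sort -O bam -o ${sorted} ${sam}"
    # Picard marks duplicates and writes a metrics file
    echo "java -jar \${PICARD_JAR} MarkDuplicates I=${sorted} O=${marked} M=${out_dir}/${name}_metrics.txt"
    # Index the final BAM file in the requested format
    echo "samtools index ${flag} ${marked}"
}
```

For example, `plan_sam_processing /data/sample1.sam /data/out BAI` prints three commands for inspection. The printed paths are not quoted, so the output is meant to be read rather than piped to a shell.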

To run SAM_Processing, all common and handler-specific variables must be defined within the configuration file. Once the variables have been defined, SAM_Processing can be submitted to a job scheduler with the following command (assuming that you are in the directory containing RNApipeline):

./main.sh SAM_Processing proj.conf

Handler-Specific Variables

The following is a list of variables that need to be defined within the configuration file. In addition to these handler-specific variables, all common variables must be defined.

Variable | Function | Method
SP_QSUB | QSub settings for batch submission |
MAPPED_LIST | A list of full file paths to the read-mapped samples. This will be ${OUT_DIR}/Read_Mapping/${PROJECT}_Mapped.txt if using Read_Mapping |
INDEX_TYPE | Generate either BAI or CSI indices for final BAM file |
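As a rough illustration, the handler-specific variables might be set as shown below. This sketch assumes the configuration file uses shell variable syntax. Every value is hypothetical, including the `SP_QSUB` resource string and the exact spelling accepted for `INDEX_TYPE`. `OUT_DIR` and `PROJECT` are common variables that would normally be defined elsewhere in the same file.

```shell
# Hypothetical values for illustration only; adjust to your project and cluster.

# Common variables (normally defined in the common section of the config)
OUT_DIR="/path/to/output"
PROJECT="my_project"

# Scheduler settings for this handler (format is an assumption)
SP_QSUB="mem=8gb,nodes=1:ppn=4,walltime=12:00:00"

# List of mapped samples; this is the path produced when using Read_Mapping
MAPPED_LIST="${OUT_DIR}/Read_Mapping/${PROJECT}_Mapped.txt"

# Index format for the final BAM files: BAI or CSI (accepted spelling assumed)
INDEX_TYPE="BAI"
```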

Output

SAM_Processing will create a sorted BAM file with duplicates marked for each input SAM file. An index file (BAI or CSI, as set by INDEX_TYPE) is also generated for each BAM file.

In addition, a list of all processed BAM files will be generated for use with other handlers. The full file path to this list will be ${OUT_DIR}/SAM_Processing/${PROJECT}_bams.txt
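Before passing the BAM list to another handler, it can help to confirm that every listed file and its index exist. The sketch below is one way to do that. The function name `check_bam_list` is hypothetical, and the index file names it looks for (`sample.bam.bai`, `sample.bam.csi`, or `sample.bai`) are assumptions based on common SAMtools and Picard naming, not on the handler's confirmed output.

```shell
#!/usr/bin/env bash
# Sketch: verify that each BAM in a list exists and has an index next to it.
# Index naming is assumed; adjust if the handler names its indices differently.

check_bam_list() {
    local list="$1" missing=0 bam
    # Read one path per line; also handle a final line without a newline
    while IFS= read -r bam || [ -n "${bam}" ]; do
        [ -z "${bam}" ] && continue
        if [ ! -f "${bam}" ]; then
            echo "missing BAM: ${bam}"
            missing=1
            continue
        fi
        if [ ! -f "${bam}.bai" ] && [ ! -f "${bam}.csi" ] && [ ! -f "${bam%.bam}.bai" ]; then
            echo "missing index: ${bam}"
            missing=1
        fi
    done < "${list}"
    # Exit status 0 when everything was found, 1 otherwise
    return "${missing}"
}
```

A call such as `check_bam_list "${OUT_DIR}/SAM_Processing/${PROJECT}_bams.txt"` returns a non-zero status and prints one line per problem if anything is missing.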

Dependencies

The SAM_Processing handler depends on Picard and SAMtools.

Next: Quantify_Summarize