5. Output - sneuensc/mapache GitHub Wiki
Columns in FASTQ_stats.csv
Each row in this file corresponds to one row of the samples file (that is, one FASTQ entry) mapped to one reference genome. For example, if 5 FASTQ files were mapped to two reference genomes, this file should have 5 * 2 = 10 rows (+ 1 line for the header).
- genome: Genome ID to which the reads were mapped.
- SM: Sample ID
- LB: Library ID
- ID: ID of FASTQ file analysed, as specified in the samples file.
- reads_raw: Number of starting raw reads in FASTQ file.
- reads_trim: Number of reads that passed the trimming step.
- trim_prop: Proportion of reads that passed the trimming step.
- mapped_raw: Number of reads that were mapped, passing the mapping quality threshold (duplicates included).
- length_reads_raw: Average length of starting raw reads in FASTQ file.
- length_reads_trimmed: Average length of reads that passed the trimming step.
- length_mapped_raw: Average length of reads that were mapped, passing the mapping quality threshold (duplicates included).
- endogenous_raw: Raw endogenous proportion, computed as mapped_raw / reads_raw.
Note that no statistics on duplicated reads are reported in this file, because duplicates are removed or flagged only at the library level.
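The row-count rule and the endogenous_raw formula above can be checked with a short script. This is a minimal sketch, not part of mapache: the helper names, the genome ID hg19, the sample IDs and all numbers are made up for illustration, and it assumes the file is a plain comma-separated table with the column names listed above.

```python
import csv
import io

# Illustrative content only; the IDs and counts are invented for this sketch.
EXAMPLE_CSV = """genome,SM,LB,ID,reads_raw,reads_trim,trim_prop,mapped_raw,endogenous_raw
hg19,ind1,lib1,fq1,1000,900,0.9,450,0.45
hg19,ind1,lib1,fq2,2000,1900,0.95,500,0.25
"""


def expected_data_rows(n_fastq, n_genomes):
    """Number of data rows expected in FASTQ_stats.csv (header not counted)."""
    return n_fastq * n_genomes


def check_endogenous_raw(handle, tolerance=1e-6):
    """Return (genome, ID) pairs whose endogenous_raw differs from mapped_raw / reads_raw."""
    mismatches = []
    for row in csv.DictReader(handle):
        reads_raw = float(row["reads_raw"])
        # Guard against empty FASTQ files to avoid a division by zero.
        expected = float(row["mapped_raw"]) / reads_raw if reads_raw else 0.0
        if abs(expected - float(row["endogenous_raw"])) > tolerance:
            mismatches.append((row["genome"], row["ID"]))
    return mismatches


print(expected_data_rows(5, 2))
print(check_endogenous_raw(io.StringIO(EXAMPLE_CSV)))
```

To run the check on a real file, pass an open file handle instead of the `io.StringIO` object. The tolerance may need to be loosened if the reported proportions are rounded.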
Columns in LB_stats.csv
The statistics reported in this table are grouped by library. Thus, if your samples file had 3 libraries mapped to 4 different genomes, this table should have 3 * 4 rows (+1 row for the header).
- genome: Genome ID to which the reads were mapped.
- SM: Sample ID
- LB: Library ID
- reads_raw: Number of starting raw reads in all FASTQ files of the library.
- reads_trim: Number of reads that passed the trimming step.
- trim_prop: Proportion of reads that passed the trimming step.
- mapped_raw: Number of reads that were mapped, passing the mapping quality threshold (duplicates included).
- duplicates: Number of mapped reads that were identified as duplicates.
- duplicates_prop: Proportion of mapped reads that were identified as duplicates.
- mapped_unique: Number of mapped reads passing the mapping quality filter, after removing duplicates.
- length_reads_raw: Average length of starting raw reads in the library.
- length_reads_trimmed: Average length of reads that passed the trimming step.
- length_mapped_raw: Average length of reads that were mapped, passing the mapping quality threshold (duplicates included).
- length_mapped_unique: Average length of mapped reads passing the mapping quality filter, after removing duplicates.
- endogenous_raw: Raw endogenous proportion, computed as mapped_raw / reads_raw.
- endogenous_unique: Endogenous proportion, computed as mapped_unique / reads_raw.
- Sex: Sex inferred for the individual, if the sex inference was requested (otherwise there is a message in this cell).
- read_depth: Average read depth.
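The relationships between the count columns and the proportion columns of a library can be sketched as follows. This is an illustration under two stated assumptions, not a description of how mapache computes the table: that mapped_unique equals mapped_raw minus duplicates, and that duplicates_prop uses mapped_raw as its denominator. The function name and the numbers are hypothetical.

```python
def library_proportions(reads_raw, mapped_raw, duplicates):
    """Derive the proportion columns of one LB_stats.csv row from its counts."""
    # Assumption: unique reads are the mapped reads left after removing duplicates.
    mapped_unique = mapped_raw - duplicates
    return {
        "mapped_unique": mapped_unique,
        # Assumption: the proportion of duplicates is taken over mapped reads.
        "duplicates_prop": duplicates / mapped_raw if mapped_raw else 0.0,
        # Both endogenous proportions use reads_raw as the denominator.
        "endogenous_raw": mapped_raw / reads_raw if reads_raw else 0.0,
        "endogenous_unique": mapped_unique / reads_raw if reads_raw else 0.0,
    }


# Made-up counts for a single library.
stats = library_proportions(reads_raw=1000, mapped_raw=400, duplicates=100)
print(stats)
```

With these example counts, endogenous_unique (300 / 1000) is lower than endogenous_raw (400 / 1000), because duplicates are excluded from the numerator but not from the denominator.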
Optional (see options below)
If you asked to output the average depth of coverage for a specific chromosome, you will have extra columns prefixed with depth_ and followed by the name of the chromosome.
- depth_chromosome_name: Average read depth for chromosome chromosome_name.
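Because the optional columns share the depth_ prefix, they can be collected without knowing the chromosome names in advance. The sketch below is only an illustration: the helper name, the chromosome names X and Y, and the values are hypothetical, and it assumes the depth cells hold plain numbers.

```python
import csv
import io

# Illustrative content only; chromosome names and depths are invented.
EXAMPLE_CSV = """genome,SM,LB,read_depth,depth_X,depth_Y
hg19,ind1,lib1,2.1,1.5,0.7
"""

PREFIX = "depth_"


def chromosome_depths(row):
    """Map chromosome name to average depth for every depth_* column in a row."""
    return {
        name[len(PREFIX):]: float(value)
        for name, value in row.items()
        # read_depth does not start with the prefix, so it is skipped here.
        if name.startswith(PREFIX)
    }


for row in csv.DictReader(io.StringIO(EXAMPLE_CSV)):
    print(row["LB"], chromosome_depths(row))
```

The same helper should work on rows of any table that carries the depth_ columns, since it relies only on the column prefix.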
Columns in SM_stats.csv
- genome: Genome ID to which the reads were mapped.
- SM: Sample ID
- reads_raw: Number of starting raw reads in all FASTQ files of all the libraries of the sample.
- reads_trim: Number of reads that passed the trimming step.
- trim_prop: Proportion of reads that passed the trimming step.
- mapped_raw: Number of reads that were mapped, passing the mapping quality threshold (duplicates included).
- duplicates: Number of mapped reads that were identified as duplicates.
- duplicates_prop: Proportion of mapped reads that were identified as duplicates.
- mapped_unique: Number of mapped reads passing the mapping quality filter, after removing duplicates.
- length_reads_raw: Average length of starting raw reads in the sample.
- length_reads_trimmed: Average length of reads that passed the trimming step.
- length_mapped_raw: Average length of reads that were mapped, passing the mapping quality threshold (duplicates included).
- length_mapped_unique: Average length of mapped reads passing the mapping quality filter, after removing duplicates.
- endogenous_raw: Raw endogenous proportion, computed as mapped_raw / reads_raw.
- endogenous_unique: Endogenous proportion, computed as mapped_unique / reads_raw.
- Sex: Sex inferred for the individual, if the sex inference was requested (otherwise there is a message in this cell).
- read_depth: Average read depth.
Optional (see options below)
If you asked to output the average depth of coverage for a specific chromosome, you will have extra columns prefixed with depth_ and followed by the name of the chromosome.