Methodology - jandrewrfarrell/RUFUS GitHub Wiki

"RUFUS is a reference independent method for identifying variants between next generation sequence data sets. It is based on a kmer-based approach that identifies sequence reads that contain unique DNA between two or more sequence libraries. The elimination of reference mapping or whole genome assembly from variant detection may reduce the rate of false positives caused by incorrect mapping without reducing sensitivity. First, Jellyfish produces kmer counts for each sample's set of FASTQ files independently. RUFUS uses these counts to determine which kmers are unique to a sample. Filtered FASTQ files are generated with reads with only unique KMERs and thus reads containing a mutation compared to the comparison sequence library. Filtered FASTQ files were then used for alignment and variant calling, using the same method as the unfiltered FASTQ files."

Cantarel, Brandi L. et al. “Analysis of Archived Residual Newborn Screening Blood Spots after Whole Genome Amplification.” BMC Genomics 16.1 (2015): 602. PMC. Web. 8 Jan. 2018.
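
The kmer filtering idea described in the quotation can be illustrated with a toy Python sketch. This is not RUFUS's implementation and does not call Jellyfish: it counts kmers in memory with `collections.Counter`, and the function names (`count_kmers`, `unique_kmers`, `filter_reads`), the `min_count` threshold, the value of `k`, and the example reads are all hypothetical. It also assumes that a read is kept when it contains at least one kmer absent from the comparison library, which is one reading of "reads with only unique KMERs".

```python
from collections import Counter


def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Count kmers across all reads of one sample (stand-in for a Jellyfish count table)."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def unique_kmers(subject_counts, control_counts, min_count=1):
    """Return kmers seen at least min_count times in the subject and never in the control.

    min_count is an illustrative threshold (an assumption, not a RUFUS parameter).
    """
    return {
        kmer
        for kmer, n in subject_counts.items()
        if n >= min_count and kmer not in control_counts
    }


def filter_reads(reads, unique, k):
    """Keep reads that contain at least one subject-unique kmer."""
    return [r for r in reads if any(kmer in unique for kmer in kmers(r, k))]


# Hypothetical example data: the second subject read differs from the
# control sequence by a single C>G substitution.
k = 4
control_reads = ["ACGTACGTAC", "CGTACGTACG"]
subject_reads = ["ACGTACGTAC", "ACGTAGGTAC"]

unique = unique_kmers(count_kmers(subject_reads, k), count_kmers(control_reads, k))
filtered = filter_reads(subject_reads, unique, k)
print(sorted(unique))
print(filtered)
```

In this sketch only the read carrying the substitution survives the filter, because every kmer of the other read also occurs in the control library. On real data, kmers are usually counted in canonical form (a kmer and its reverse complement treated as one) and a count threshold above 1 helps exclude sequencing errors; both are left out here for brevity.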