1 Genome Profiling - coopermkr/sdepressaAssembly GitHub Wiki
Genome profiling uses k-mer counting to estimate ploidy, genome size, heterozygosity, and the proportion of unique sequence, all of which help guide a genome assembly.
First, count k-mers in the whole-genome shotgun reads with your favorite k-mer counter. Here I'm using KMC 3:
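trim=FILENAME.fastq
mkdir -p outdir  # working directory for KMC's temporary files
# Count 21-mers: 10 threads, 64 GB RAM, keep every k-mer (-ci1), cap counters at 10000 (-cs)
kmc -k21 -t10 -m64 -ci1 -cs10000 -fq "data/$trim" trimmed.kmers outdir/
# Dump k-mers with coverage between 10 and 2300 as text
kmc_dump -ci10 -cx2300 trimmed.kmers trimmed.kmers.dump
# Same coverage window, written as a sorted dump (-s) for Smudgeplot
kmc_tools transform trimmed.kmers -ci10 -cx2300 dump -s kmcdb_L10_U2300.dump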
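The cutoffs 10 and 2300 used above are specific to this dataset. On other data, one rough way to sanity-check the lower cutoff is to find the first valley in the k-mer histogram, which separates low-coverage error k-mers from genomic ones. The sketch below assumes a two-column histogram (coverage, number of distinct k-mers), such as the one `kmc_tools transform trimmed.kmers histogram` writes. The function name `first_valley` is hypothetical, and this is a simple heuristic, not Smudgeplot's own cutoff logic.

```shell
# Print the coverage at the first local minimum of a k-mer histogram.
# $1 = histogram file with two whitespace-separated columns
#      (coverage, number of distinct k-mers), sorted by increasing coverage.
# Exits non-zero if the counts never rise, i.e. no valley was found.
first_valley() {
    awk '
        # The first row whose count exceeds the previous row marks the end
        # of the error tail, so the previous row is the valley.
        NR > 1 && $2 > prev_count { print prev_cov; found = 1; exit }
        { prev_cov = $1; prev_count = $2 }
        END { if (!found) exit 1 }
    ' "$1"
}
```

For example, `L=$(first_valley kmer_k21.hist)` would store the valley coverage in `L`, where `kmer_k21.hist` is a placeholder file name. Noisy histograms can have spurious early minima, so check the value against a plot before using it.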
Then feed the sorted k-mer dump into Smudgeplot, which extracts heterozygous k-mer pairs to estimate ploidy:
smudgeplot.py hetkmers -o kmer_pairs < kmcdb_L10_U2300.dump
Then run GenomeScope 2.0, which takes a k-mer count histogram as input, to estimate genome statistics: http://genomescope.org/genomescope2.0/analysis.php?code=dMvo6e2u2PqXQmo4me2B
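As a quick cross-check on a model-based genome size estimate, a naive figure can be computed directly from a k-mer histogram: divide the total number of k-mers above the error cutoff by the coverage of the tallest peak. The sketch below assumes the same two-column histogram format (coverage, number of distinct k-mers). The function name `naive_genome_size` and its arguments are hypothetical, and this is not the model GenomeScope fits. It also assumes the tallest peak is the homozygous one; in a highly heterozygous genome the tallest peak may sit at half that coverage, which would roughly double the result.

```shell
# Naive haploid genome size estimate from a k-mer histogram.
# $1 = histogram file (coverage, number of distinct k-mers)
# $2 = lower coverage cutoff; rows below it are treated as sequencing errors
naive_genome_size() {
    awk -v min_cov="$2" '
        $1 >= min_cov {
            total += $1 * $2                      # total k-mer occurrences
            if ($2 > peak_count) {                # track the tallest peak
                peak_count = $2
                peak_cov = $1
            }
        }
        END {
            if (peak_cov == 0) exit 1             # no rows passed the cutoff
            printf "%.0f\n", total / peak_cov     # rounded estimate in bp
        }
    ' "$1"
}
```

A call such as `naive_genome_size kmer_k21.hist 10` prints the estimate in base pairs, where the file name is a placeholder and 10 matches the lower cutoff used earlier. A large disagreement with the GenomeScope result is a hint to inspect the histogram and the chosen cutoff.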