12_Zc - eolesin/AMOR_Indiv_Assembly_Protocol GitHub Wiki
Carbon oxidation state investigations of whole metagenomes
Starting from the human-decontaminated data, we deduplicate the reads. Duplicate reads carry information that matters for read mapping and abundance estimates, but for the carbon oxidation state calculations we apparently want only non-duplicated reads.
# dereplicate Illumina paired-end reads
conda activate cd-hit
cd-hit-est -i 02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR1.fq -j 02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR2.fq -o 13_CDHIT/BS04_cdhitout_R1 -op 13_CDHIT/BS04_cdhitout_R2 -M 1000000
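To check how many reads the dereplication removed, the read counts before and after can be compared. The following is a minimal sketch, not part of the original protocol: `count_reads` is a hypothetical helper, and it assumes that both the input and the cd-hit-est output are uncompressed FASTQ files with exactly four lines per record.

```shell
# Hypothetical helper: count reads in an uncompressed FASTQ file.
# Assumes every record spans exactly four lines (no wrapped sequences).
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Paths taken from the cd-hit-est command above.
in_r1=02_HUMAN_Decontam/GS19-ROV16-BS04-cleanR1.fq
out_r1=13_CDHIT/BS04_cdhitout_R1

# Only report when both files exist, so the snippet is safe to run anywhere.
if [ -f "$in_r1" ] && [ -f "$out_r1" ]; then
    echo "R1 reads before dereplication: $(count_reads "$in_r1")"
    echo "R1 reads after dereplication:  $(count_reads "$out_r1")"
fi
```

Because the reads are paired, the R2 counts should match the R1 counts in both the input and the output, so counting one mate is enough for a quick check.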