Species assignment using targeted amplicon sequencing datasets

An amplicon panel and an accompanying species assignment method were developed to assign species across the entire genus of Anopheles mosquitoes. k-mers provide an objective way to compare highly diverged sequences. Multiple sequence alignment tends to introduce bias towards the clades that are better represented in the panel, and alignment to a single reference genome introduces bias towards the reference species. Moreover, k-mers naturally capture small indels in addition to SNPs, which considerably increases the power to distinguish between species when working with less than 10 kb of sequence.
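To make the point about SNPs and indels concrete, here is a minimal Python sketch of an alignment-free comparison. It is an illustration only, not the method implemented in the notebook: the function names `kmers` and `jaccard_distance`, the choice of `k=8`, the use of Jaccard distance, and the toy sequences are all assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of k-mers in seq, skipping any that contain N."""
    seq = seq.upper()
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if "N" not in seq[i:i + k]
    }


def jaccard_distance(a, b, k=8):
    """1 - |shared k-mers| / |all k-mers| for two sequences (hypothetical metric choice)."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 0.0  # neither sequence is long enough to yield a k-mer
    return 1.0 - len(ka & kb) / len(union)


# Toy sequences, made up for illustration (not real amplicon data).
reference = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
with_snp = reference[:15] + "T" + reference[16:]  # one substitution
with_del = reference[:15] + reference[16:]        # one-base deletion

print("identical:", jaccard_distance(reference, reference))
print("SNP:      ", jaccard_distance(reference, with_snp))
print("deletion: ", jaccard_distance(reference, with_del))
```

Both the substitution and the deletion change the k-mers that overlap the variant site, so both contribute to the distance without any alignment step or reference coordinate system.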

To perform species identification on the Anopheles dataset using pairwise k-mer comparisons, use the Jupyter notebook, which includes all the guidelines for this exercise.