weakestLinks - christianparobek/cambodiaWGS GitHub Wiki
We want to filter out the samples with the lowest coverage. It would be nice to have a complete set of SNP calls from all genomes at each variant site that we use, and the lowest-coverage samples are the ones that get in the way of that. If we remove every sample that has >=10x coverage at <75% of the genome, we have to remove:
- OM074
- OM093
- OM105
- OM108
- OM114
- OM124
- OM125
- OM303
- OM325
From now on, I'll call the 69 samples that remain after removing these "good69" and I'll call the original set "all 78".
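
The filter described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used here: the function names `breadth_at_depth` and `split_by_coverage` are made up, the per-position depths are assumed to already be loaded into Python lists (for example from a per-base depth table), and the toy sample names and depths are invented for the example.

```python
def breadth_at_depth(depths, min_depth=10):
    """Fraction of positions whose depth is at least min_depth."""
    depths = list(depths)
    if not depths:
        return 0.0
    return sum(d >= min_depth for d in depths) / len(depths)


def split_by_coverage(sample_depths, min_depth=10, min_breadth=0.75):
    """Split samples into (kept, removed) lists of names, each sorted.

    A sample is removed when fewer than min_breadth of its positions
    reach min_depth, i.e. ">=10x coverage at <75% of the genome".
    """
    kept, removed = [], []
    for name, depths in sample_depths.items():
        if breadth_at_depth(depths, min_depth) < min_breadth:
            removed.append(name)
        else:
            kept.append(name)
    return sorted(kept), sorted(removed)


# Toy data, invented for illustration only (not real depths from this project).
toy = {
    "sampleA": [12, 15, 30, 11],  # 4 of 4 positions at >=10x
    "sampleB": [12, 3, 0, 25],    # 2 of 4 positions at >=10x
    "sampleC": [10, 10, 10, 9],   # 3 of 4 positions at >=10x (exactly 75%)
}
kept, removed = split_by_coverage(toy)
print("kept:", kept)
print("removed:", removed)
```

Note that a sample sitting exactly at 75% is kept in this sketch, since the cutoff is written as strictly less than 75%.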