1KG population cheat sheet - zeeev/vcflib GitHub Wiki
# Identifying Ethnicity in 1000 Genomes VCF
This simple accessory script prints a comma-separated index of the 1000 Genomes individuals of a given ethnicity within a VCF. The file "1KG_IDs.txt" must accompany the script, as it contains the ethnic identifier for each 1KG ID.
The current install ships 1KG_IDs.txt with IDs for the 1KG version 3 release. Please be aware that this may not be up to date with the current sample set from the 1KG project. The script should be portable to new releases as long as the user generates a new tab-delimited text file with the sample ID and population.
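As a rough illustration of regenerating that file for a newer release, the sketch below copies the first two columns of a tab-delimited sample listing into a two-column table. The function name `write_id_table`, the input layout (sample ID first, population second), the header check, and the assumption that 1KG_IDs.txt has no header row are all hypothetical; check them against the file shipped with the script before relying on this.

```python
import csv


def write_id_table(panel_path, out_path="1KG_IDs.txt"):
    """Write a tab-delimited sample ID / population table.

    Assumes panel_path is tab-delimited with the sample ID in column 1
    and the population code in column 2 (an assumption, not a documented
    requirement). Returns the number of samples written.
    """
    written = 0
    with open(panel_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        for row in reader:
            # Skip blank or short lines, and an assumed header row
            # whose first field is "sample".
            if len(row) < 2 or row[0].lower() == "sample":
                continue
            writer.writerow([row[0], row[1]])
            written += 1
    return written
```

Any extra columns in the input are simply dropped, so only the two fields the script is described as needing end up in the output.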
Usage:

    usage: 1kG_ethnic_index.py [-h] VCF Population

    Determines the index of individuals of a given ethnicity within a 1000 Genomes
    VCF

    positional arguments:
      VCF         VCF of 1000 Genomes individuals
      Population  1KG identifier for population to be found in the index, enter
                  "All" to print index for all populations in the VCF

    optional arguments:
      -h, --help  show this help message and exit

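
For readers who want to see the idea behind the lookup, here is a minimal sketch of how such an index could be computed. It is not the script's actual code. The helper names are hypothetical, the VCF is assumed to be uncompressed, and the index is assumed to be zero-based over the sample columns (those after FORMAT); the real script may count differently.

```python
def load_populations(id_path):
    """Map sample ID -> population from a tab-delimited two-column file."""
    pops = {}
    with open(id_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                pops[fields[0]] = fields[1]
    return pops


def vcf_samples(vcf_path):
    """Return the sample names from the #CHROM header line of a VCF."""
    with open(vcf_path) as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # Nine fixed columns (CHROM through FORMAT) precede the samples.
                return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def population_index(samples, pops, population):
    """Comma-separated, zero-based positions of samples in `population`."""
    return ",".join(
        str(i) for i, name in enumerate(samples) if pops.get(name) == population
    )
```

Under these assumptions, `population_index(vcf_samples("my.vcf"), load_populations("1KG_IDs.txt"), "GBR")` would return the positions of the GBR samples as one comma-separated string, where `my.vcf` is a placeholder file name.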
author: Brett Kennedy & EJ Osborne