1KG population cheat sheet - zeeev/vcflib GitHub Wiki

# Identifying Ethnicity in 1000 Genomes VCF

This simple accessory script prints a comma-separated index of the 1000 Genomes individuals of a given ethnicity within a VCF. The file "1KG_IDs.txt" must accompany the script, as it contains the ethnic identifier for each 1KG ID.

The current install ships 1KG_IDs.txt with IDs for the 1KG version 3 release. Be aware that this may not match the current sample set from the 1KG project. The script should be portable to new releases as long as the user generates a new tab-delimited text file containing the sample ID and population for each individual.
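As a rough sketch of how such a file could be regenerated for a new release, the snippet below reduces a tab-delimited panel table with a header row to two-column "sample ID, population" lines. The function name `panel_to_id_table`, the column names `sample` and `pop`, the sample IDs, and the column order of the output are all assumptions made for illustration; check them against the panel file you download and against the existing 1KG_IDs.txt before relying on this.

```python
import csv
import io


def panel_to_id_table(panel_lines, sample_col="sample", pop_col="pop"):
    """Yield 'sample<TAB>population' lines from a headered, tab-delimited panel.

    The column names are assumptions; pass different ones if your panel
    file labels its columns differently.
    """
    reader = csv.DictReader(panel_lines, delimiter="\t")
    for row in reader:
        yield f"{row[sample_col]}\t{row[pop_col]}"


# Made-up sample IDs used purely for illustration (not real 1KG records).
panel = io.StringIO(
    "sample\tpop\tsuper_pop\tgender\n"
    "sampleA\tGBR\tEUR\tmale\n"
    "sampleB\tYRI\tAFR\tfemale\n"
)

id_table = list(panel_to_id_table(panel))

# Write the lines to a file to use them in place of the shipped ID table:
# with open("1KG_IDs.txt", "w") as handle:
#     handle.write("\n".join(id_table) + "\n")
```

Reading by column name rather than by position keeps the sketch tolerant of extra columns in the panel file.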

Usage:

usage: 1kG_ethnic_index.py [-h] VCF Population

Determines the index of individuals of a given ethnicity within a 1000 Genomes
VCF

positional arguments:
  VCF         VCF of 1000 Genomes individuals
  Population  1KG identifier for population to be found in the index, enter
              "All" to print index for all populations in the VCF

optional arguments:
  -h, --help  show this help message and exit
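The lookup described above can be sketched in a few lines of Python. This is not the script's actual source, only a minimal illustration under stated assumptions: the ID table has the sample ID in the first column and the population in the second, and positions are counted from 0 starting at the first sample column of the VCF `#CHROM` header line. The real script may number samples differently. All function names here are hypothetical, and the "All" option is not covered.

```python
def load_populations(id_lines):
    """Map sample ID -> population from tab-delimited 'ID<TAB>population' lines."""
    pops = {}
    for line in id_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        sample, pop = line.split("\t")[:2]
        pops[sample] = pop
    return pops


def vcf_samples(vcf_lines):
    """Return the sample names from the #CHROM header line of a VCF."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # The first nine columns are fixed VCF fields; samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def population_index(samples, pops, population):
    """Comma-separated, 0-based (assumed) positions of samples in `population`."""
    return ",".join(
        str(i) for i, sample in enumerate(samples) if pops.get(sample) == population
    )


# Made-up sample IDs used purely for illustration.
vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\n",
]
ids = ["S1\tGBR\n", "S2\tYRI\n", "S3\tGBR\n"]

index = population_index(vcf_samples(vcf), load_populations(ids), "GBR")
print(index)
```

Samples present in the VCF but missing from the ID table are silently skipped in this sketch, which is one reason to regenerate the table when working with a newer release.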

author: Brett Kennedy & EJ Osborne