Principal Coordinates Analysis and FST - UMEcolGenetics/PawPawPulation-Genetics GitHub Wiki

Introduction to Principal Coordinates Analysis (PCoA) and FST using GenAlEx

Principal Coordinates Analysis (PCoA) is a mathematically involved method, and readers are encouraged to read more about it in the GenAlEx paper [1]. In short, PCoA locates the major axes of variation within a multidimensional data set, here built from pairwise genetic distances, and plots them in 2D or 3D space. Individuals that are more genetically similar fall closer together in that space. In this tutorial, you will generate a PCoA plot and calculate pairwise FST for the pawpaw data set.

PCoA using GenAlEx

GenAlEx ("Genetic Analysis in Excel") is a cross-platform package for population genetic analysis that runs within Microsoft Excel on both Windows and Macintosh operating systems. The package opens as a macro in Excel; see the GenAlEx manual for installation instructions. We have made a reformatted data file available for this tutorial (see pawpawpartial.xlsx in the example files).

To generate the PCoA plot, first create a genetic distance matrix in GenAlEx, as shown below.

(Figure 1) (Figure 2)

Next, from this genetic distance matrix, generate the PCoA. Be sure to uncheck color data point and color code populations.

(Figure 3) (Figure 4)

Interpreting Results

The resulting plot shows the population genetic differentiation for the partial pawpaw data set. Populations are color coded, and the PET and POC populations are differentiated from the remaining populations. The first, second, and third axes explain 14.88%, 7.31%, and 6.88% of the variation, respectively.

(Figure 5)
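
For readers curious where the axes and the percentages of variation come from, the following is a rough Python sketch of classical PCoA: square the distances, double-center the matrix, and extract the leading eigenvalues and eigenvectors. It is not GenAlEx's implementation (GenAlEx offers several PCoA variants that are not reproduced here), and it assumes the input holds plain, unsquared distances that are close to Euclidean. The eigenvectors are found by simple power iteration to keep the example free of dependencies; all function names and the toy points are hypothetical.

```python
import math

def double_center(dist):
    """Gower-center the squared distances: B = -1/2 * J * D^2 * J."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in a]
    grand_mean = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]

def matvec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def top_axes(b, k=2, iters=500):
    """Leading k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    n = len(b)
    work = [row[:] for row in b]
    axes = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = matvec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, matvec(work, v)))
        axes.append((eigenvalue, v))
        # Deflate: remove the axis just found before searching for the next.
        for i in range(n):
            for j in range(n):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return axes

def pcoa(dist, k=2):
    """Return per-individual coordinates and percent variation per axis."""
    b = double_center(dist)
    total = sum(b[i][i] for i in range(len(b)))  # trace = sum of eigenvalues
    axes = top_axes(b, k)
    percent = [100.0 * lam / total for lam, _ in axes]
    coords = [[vec[i] * math.sqrt(max(lam, 0.0)) for lam, vec in axes]
              for i in range(len(b))]
    return coords, percent

# Hypothetical toy data: four points at the corners of a 3 x 4 rectangle.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
coords, percent = pcoa(dist)
print([round(p, 2) for p in percent])
```

In this toy case the longer side of the rectangle becomes the first axis and accounts for about 64% of the variation, with the shorter side accounting for the remaining 36%. Real genetic data spread their variation over many more axes, which is why the first three axes in the pawpaw plot explain only part of the total.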

FST using GenAlEx

To run an FST analysis, choose Frequency from the Frequency-based dropdown menu. Select Okay on the Allele Frequency Data Parameters and then select Pairwise FST. Select the view you would like to see for the measure, e.g., Output Pairwise Matrix or Output Labeled Pairwise Matrix.

(Figure 6) (Figure 7) (Figure 8)

Interpreting Results

The resulting data show all pairwise FST values. The range of values demonstrates that some population pairs are more genetically similar, e.g., SU1 and SEL, while others are more differentiated, e.g., LCN and FCP.

(Figure 9) (Figure 10)
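
As a rough illustration of what a pairwise FST value measures, the sketch below computes FST = (HT - HS) / HT for two populations from their allele frequencies, where HS is the mean expected heterozygosity within the populations and HT is the expected heterozygosity of the pooled (averaged) frequencies. This is a simplified frequency-based estimate under stated assumptions: equal weighting of the two populations, no sample-size correction, and heterozygosities summed across loci. It may not match the exact calculation GenAlEx performs. The function names and frequencies are hypothetical and are not taken from the pawpaw data set.

```python
def expected_het(freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def pairwise_fst(pop_a, pop_b):
    """FST between two populations.

    pop_a and pop_b are lists with one dict per locus,
    each mapping allele code -> allele frequency.
    """
    hs_total = 0.0  # within-population heterozygosity, summed over loci
    ht_total = 0.0  # pooled heterozygosity, summed over loci
    for fa, fb in zip(pop_a, pop_b):
        alleles = set(fa) | set(fb)
        pooled = {a: (fa.get(a, 0.0) + fb.get(a, 0.0)) / 2 for a in alleles}
        hs_total += (expected_het(fa) + expected_het(fb)) / 2
        ht_total += expected_het(pooled)
    if ht_total == 0.0:
        return 0.0  # no variation at all, so no measurable differentiation
    return (ht_total - hs_total) / ht_total

# Hypothetical single-locus frequencies for two populations.
pop_x = [{"A": 0.8, "B": 0.2}]
pop_y = [{"A": 0.2, "B": 0.8}]
print(round(pairwise_fst(pop_x, pop_y), 3))
```

Values near 0 indicate that the two populations share similar allele frequencies, and values near 1 indicate that they are fixed for different alleles. Repeating the calculation for every pair of populations fills in a pairwise matrix like the one GenAlEx outputs.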

References

[1]: Peakall R, Smouse PE. GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Molecular Ecology Notes. 2006;6(1):288-295.
