Factor Analysis (FA) - AAU-Dat/P5-Nonlinear-Dimensionality-Reduction GitHub Wiki

"One way to solve the problems of high dimensions is to first reduce the dimensionality of the data to a manageable size, keeping as much of the original information as possible and then feed the reduced-dimensional data into a pattern recognition system. In this situation, dimensionality reduction process becomes the preprocessing stage of the pattern recognition system."

Introduction

Factor analysis (FA) aims to reduce the dimensionality of data by combining correlated observed variables into fewer unobserved (latent) variables, called factors, where each observed variable contributes part of the weight (loading) of a factor.

To determine interrelationships between items in the data, FA examines the variance and covariance and partitions the variance into two types: common and unique. Intuitively, related items have stronger correlations and can therefore be grouped with less loss of information when the dimensionality is reduced.

In general, two categories of FA exist: Exploratory FA (EFA) and Confirmatory FA (CFA). EFA attempts to discover underlying structure or pull insight from the data using a correlation/component matrix, while CFA tests the data against a hypothesized structure, which may be derived from a prior EFA, using model equations (or matrices).

Principal Components Analysis (PCA) (@sebastianbot6969) is technically one of many types of EFA. Other variants exist, such as Principal/Common Factor Analysis (PFA), Image Factoring, Maximum Likelihood Analysis, Alpha Factoring, and Unweighted Least Squares. The most common and popular of these are PCA and PFA.

PCA vs PFA

Some key observations on PCA vs PFA:

  • PCA attempts to group data and extract meaning, whereas PFA assumes that latent meaning may not be directly observable and has to be extracted.
  • In PCA the total variance is accounted for, while PFA uses only the common variance.

Common variance is the variance shared between a set of items; highly correlated items therefore share more variance. Unique variance is any remaining variance and is further divided into error variance and specific variance.
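To make the split concrete, here is a minimal sketch assuming the Python factor_analyzer package and a hypothetical survey_items.csv dataset (neither is part of the original text): the fitted model exposes each item's communality (common variance) and uniqueness (unique variance).

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Hypothetical dataset: rows are observations, columns are observed numeric items.
df = pd.read_csv("survey_items.csv").dropna()

# Fit an unrotated common factor model with an assumed 3 factors.
fa = FactorAnalyzer(n_factors=3, rotation=None)
fa.fit(df)

# Communality: the share of each item's variance explained by the common factors.
print(pd.Series(fa.get_communalities(), index=df.columns))
# Uniqueness: the remaining (specific + error) variance, i.e. 1 - communality.
print(pd.Series(fa.get_uniquenesses(), index=df.columns))
```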

Algorithm

FA relies on certain assumptions about the dataset.

  1. There are no outliers or missing values in the data.
  2. The sample size should exceed the number of factors by a factor of 5 or more.
  3. The variables must be interrelated (Bartlett's test may be used to check this; see the sketch after this list).
  4. The data is metric (numerical), i.e., measured on at least an interval scale.
  5. The data is preferably normally distributed, but multivariate normality is not required.
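A minimal sketch of checking assumption 3 (interrelatedness) and overall sampling adequacy, assuming the Python factor_analyzer package and the same hypothetical survey_items.csv as above:

```python
import pandas as pd
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity, calculate_kmo

# Hypothetical dataset; dropping missing values to satisfy assumption 1.
df = pd.read_csv("survey_items.csv").dropna()

# Bartlett's test of sphericity: a small p-value suggests the variables are interrelated.
chi_square, p_value = calculate_bartlett_sphericity(df)
print(f"Bartlett's test: chi2={chi_square:.1f}, p={p_value:.3g}")

# Kaiser-Meyer-Olkin measure: values above roughly 0.6 indicate the data suits FA.
kmo_per_item, kmo_total = calculate_kmo(df)
print(f"Overall KMO: {kmo_total:.2f}")
```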

On applicable data, the algorithm is a two-step process of Factor Extraction followed by Factor Rotation.

Factor Extraction

Either PCA or PFA may be used for this step (see @sebastianbot6969 on PCA).

PFA is similar to PCA but excludes the unique (error) variance, so that the extracted factors track the common variance more closely.
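As a sketch of the extraction step, assuming factor_analyzer (whose default minimum-residual method is a common-factor approach, not PCA) and the Kaiser eigenvalue-greater-than-one rule for picking the number of factors:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey_items.csv").dropna()  # hypothetical dataset

# First fit without rotation, only to obtain eigenvalues for choosing the number of factors.
fa = FactorAnalyzer(rotation=None)
fa.fit(df)
eigenvalues, _ = fa.get_eigenvalues()

# Kaiser criterion: retain factors whose eigenvalue exceeds 1.
n_factors = int((eigenvalues > 1).sum())
print(f"Retained factors: {n_factors}")

# Re-fit with the chosen number of (still unrotated) factors.
fa = FactorAnalyzer(n_factors=n_factors, rotation=None)
fa.fit(df)
```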

Factor Rotation

Once the number of factors and the analysis model have been determined, Factor Rotation makes it easier to interpret how much each variable contributes to each factor (its factor loading). There are two types of rotation (see the sketch after this list):

  • Orthogonal (e.g., varimax). Assumes the factors themselves are not correlated.
  • Oblique (e.g., promax). Assumes the factors may be correlated.
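A sketch comparing the two rotation families, again assuming factor_analyzer; varimax (orthogonal) and promax (oblique) are just two common choices, and the factor count is an assumption:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

df = pd.read_csv("survey_items.csv").dropna()  # hypothetical dataset

# Orthogonal rotation: varimax keeps the factors uncorrelated.
fa_varimax = FactorAnalyzer(n_factors=3, rotation="varimax").fit(df)
# Oblique rotation: promax allows the factors to correlate.
fa_promax = FactorAnalyzer(n_factors=3, rotation="promax").fit(df)

# Loadings: how strongly each observed variable loads on each factor.
print(pd.DataFrame(fa_varimax.loadings_, index=df.columns))
print(pd.DataFrame(fa_promax.loadings_, index=df.columns))
```

Interpretation is typically easier after rotation, because each variable tends to load highly on only one factor.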

See: https://stats.oarc.ucla.edu/spss/seminars/introduction-to-factor-analysis/a-practical-introduction-to-factor-analysis/ for full details.

Examples

Dimensionality reduction for further work on the data, or for simplifying it. For example, book or restaurant reviews, where aggregated scores may be easier for humans to relate to.

Discovery of latent variables, i.e., metrics that are not directly measurable on their own, such as empathy.

Sources

  • (Best) https://www.datacamp.com/tutorial/introduction-factor-analysis
  • Khosla, Nitin: Dimensionality Reduction Using Factor Analysis
  • A Practical Introduction to Exploratory Factor Analysis
  • Confirmatory Factor Analysis (CFA)
  • Intro Guide to Factor Analysis (Python)
  • https://www.youtube.com/watch?v=Jkf-pGDdy7k