Mathematical Formulation

This page describes the mathematical formulation of generalized contrastive PCA (gcPCA).
The goal of gcPCA is to identify dimensions that capture differences in covariance structure between two datasets.


Problem Setup

Assume two datasets:

  • Condition A: Ra ∈ ℝ^(ma × p)
  • Condition B: Rb ∈ ℝ^(mb × p)

Where:

  • Rows represent samples
  • Columns represent features
  • Both datasets share the same feature space (neurons, genes, etc.)

Compute covariance matrices:

CA = Cov(Ra)
CB = Cov(Rb)

These covariance matrices summarize the structure of each dataset.
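
As a minimal illustrative sketch (the data and variable names here are hypothetical, not from the package), the two covariance matrices can be computed in NumPy as follows:

```python
import numpy as np

# Hypothetical example data: samples in rows, the same p = 5 features in columns
rng = np.random.default_rng(0)
Ra = rng.standard_normal((100, 5))   # Condition A: ma = 100 samples
Rb = rng.standard_normal((120, 5))   # Condition B: mb = 120 samples

# Covariance matrices (rowvar=False treats columns as variables/features)
CA = np.cov(Ra, rowvar=False)
CB = np.cov(Rb, rowvar=False)
```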


Review: Principal Component Analysis (PCA)

PCA identifies directions that maximize variance within a single dataset.

Given covariance matrix C, PCA solves:

argmax_x  xᵀ C x
subject to xᵀx = 1

The maximizer is the leading eigenvector of C; successive principal components are the remaining eigenvectors, obtained by additionally requiring orthogonality to the previously found directions.

PCA identifies the directions of dominant variance, but it does not distinguish between experimental conditions.
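
As a minimal sketch (reusing the covariance matrix CA from the setup above), this eigenvector solution is a standard symmetric eigendecomposition:

```python
import numpy as np

# PCA on CA: eigenvectors of the covariance matrix, sorted by descending eigenvalue
eigvals, eigvecs = np.linalg.eigh(CA)          # eigh: eigendecomposition for symmetric matrices
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The first column maximizes x^T CA x subject to x^T x = 1
pc1 = eigvecs[:, 0]
```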


Comparing Two Datasets

To compare two datasets, we want to identify directions that:

  • Have high variance in Condition A
  • Have low variance in Condition B

A natural objective is:

argmax_x  xᵀ (CA − CB) x
subject to xᵀx = 1

This formulation resembles PCA, but applied to the difference of covariance matrices.
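
A minimal sketch of this naive contrast (again reusing CA and CB from above) is a single eigendecomposition of the difference matrix:

```python
import numpy as np

# Eigendecomposition of the covariance difference
eigvals, eigvecs = np.linalg.eigh(CA - CB)

# The most positive eigenvalue marks the direction with the largest excess
# variance in Condition A; the most negative marks the excess in Condition B.
dir_A = eigvecs[:, np.argmax(eigvals)]
dir_B = eigvecs[:, np.argmin(eigvals)]
```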

However, this approach is sensitive to:

  • Sampling noise
  • High-variance dimensions
  • Finite data effects

These issues motivated the development of contrastive PCA (cPCA).


Contrastive PCA (cPCA)

cPCA introduces a hyperparameter α:

argmax_x  xᵀ (CA − α CB) x
subject to xᵀx = 1

Where:

  • α controls the influence of Condition B
  • Different values of α produce different solutions

This creates ambiguity, since there is no objective way to choose α.
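
A minimal sketch of this α dependence (an illustration only, not the original cPCA implementation, which also includes a procedure for selecting candidate α values):

```python
import numpy as np

# Sweeping alpha changes how strongly Condition B variance is penalized,
# and the leading direction generally changes with it.
for alpha in (0.1, 1.0, 10.0):
    eigvals, eigvecs = np.linalg.eigh(CA - alpha * CB)
    leading = eigvecs[:, np.argmax(eigvals)]
    print(f"alpha = {alpha}: leading direction = {leading}")
```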


Generalized Contrastive PCA (gcPCA)

gcPCA removes the hyperparameter by normalizing the contrast.

gcPCA v4 solves:

argmax_x  (xᵀ (CA − CB) x) / (xᵀ (CA + CB) x)
subject to xᵀx = 1

Because the ratio is invariant to the scale of x, the unit-norm constraint simply fixes the scale of the solution.

This formulation:

  • Penalizes high-variance dimensions
  • Reduces noise bias
  • Eliminates hyperparameter tuning

The denominator acts as a normalization factor based on total variance.
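
As a minimal sketch, the objective value for any candidate direction x can be evaluated directly (gcpca_objective is a hypothetical helper name, not part of the package API):

```python
import numpy as np

def gcpca_objective(x, CA, CB):
    """Normalized variance contrast along direction x; value lies in [-1, 1]."""
    num = x @ (CA - CB) @ x
    den = x @ (CA + CB) @ x
    return num / den
```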


Interpretation

gcPCA v4 eigenvalues lie in the range [−1, +1]:

  • +1 → variance only in Condition A
  • −1 → variance only in Condition B
  • 0 → equal variance in both conditions

This makes gcPCA results directly interpretable.
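
Because each eigenvalue equals (vA − vB) / (vA + vB), where vA and vB are the variances along that gcPC in Conditions A and B, a gcPC with, for example, vA = 3 and vB = 1 has a value of (3 − 1) / (3 + 1) = 0.5, indicating enrichment in Condition A.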

Additionally:

  • gcPCs with positive values → variance enriched in Condition A
  • gcPCs with negative values → variance enriched in Condition B

Solving the Optimization

The gcPCA objective is equivalent to the generalized eigenvalue problem

(CA − CB) x = λ (CA + CB) x

which can be solved through a symmetrizing change of variables.

Define:

M = sqrt(CA + CB)

where sqrt denotes the symmetric matrix square root, and substitute y = M x. Then find the eigendecomposition of the symmetric matrix:

M⁻¹ (CA − CB) M⁻¹

Its eigenvectors y, mapped back to the original feature space as x = M⁻¹ y, define the gcPCs; the corresponding eigenvalues are the gcPCA values described above.

This transformation:

  • Normalizes high-variance dimensions
  • Improves numerical stability
  • Produces robust solutions
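
A minimal NumPy/SciPy sketch of this procedure (an illustrative re-derivation assuming CA + CB is full rank, not the package's implementation):

```python
import numpy as np
from scipy.linalg import sqrtm

# Symmetric square root of the total covariance: M = (CA + CB)^(1/2)
M = np.real(sqrtm(CA + CB))
M_inv = np.linalg.inv(M)

# Symmetric eigenproblem in the normalized space
S = M_inv @ (CA - CB) @ M_inv
eigvals, Y = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, Y = eigvals[order], Y[:, order]   # gcPCA values, in [-1, 1]

# Map eigenvectors back to the original feature space and renormalize
X = M_inv @ Y
X = X / np.linalg.norm(X, axis=0)          # columns of X are the gcPCs
```

Projecting either dataset onto the columns of X then gives the gcPC scores for that condition.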

Orthogonality

Unlike PCA, gcPCs are:

  • Not orthogonal in the original feature space
  • Orthogonal in the normalized feature space

This is expected because gcPCA applies a linear transformation of the feature space (normalizing by the total covariance CA + CB) before computing components.
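
Concretely, two distinct gcPCs xᵢ and xⱼ satisfy

xᵢᵀ (CA + CB) xⱼ = 0   for i ≠ j

rather than xᵢᵀ xⱼ = 0; this is the standard orthogonality relation for a generalized eigenvalue problem, here with respect to the metric CA + CB.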

Orthogonal variants of gcPCA are also available (see manuscript).


Sparse gcPCA

gcPCA also supports sparse solutions.

Sparse gcPCA:

  • Applies elastic net regularization
  • Selects subsets of features
  • Improves interpretability

Sparse solutions do not have a closed-form eigendecomposition; they are computed with an iterative optimization procedure (see the manuscript for details).
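
Schematically, the sparse variant adds elastic net penalties to the gcPCA objective (a simplified form; the exact parameterization and solver are described in the manuscript):

argmax_x  (xᵀ (CA − CB) x) / (xᵀ (CA + CB) x) − λ1 ‖x‖1 − λ2 ‖x‖2²

where the ℓ1 term (λ1) drives coefficients of x to exactly zero and the ℓ2 term (λ2) provides shrinkage.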


Summary

gcPCA extends PCA to compare two datasets by:

  • Contrasting covariance structure
  • Normalizing variance differences
  • Eliminating hyperparameters
  • Producing interpretable components

For additional details and derivations, see the gcPCA manuscript.

Links to Other Pages

1. Quickstart Guide
2. Installation
3. Conceptual Overview
5. Code Reference
6. Input Data Guidelines
7. Interpreting Results