Mathematical Formulation

This page describes the mathematical formulation of generalized contrastive PCA (gcPCA).
The goal of gcPCA is to identify dimensions that capture differences in covariance structure between two datasets.


Problem Setup

Assume two datasets:

  • Condition A: Ra ∈ ℝ^(ma × p)
  • Condition B: Rb ∈ ℝ^(mb × p)

Where:

  • Rows represent samples
  • Columns represent features
  • Both datasets share the same feature space (neurons, genes, etc.)

Compute covariance matrices:

CA = Cov(Ra)
CB = Cov(Rb)

These covariance matrices summarize the structure of each dataset.
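
As a minimal illustrative sketch (the data and variable names here are hypothetical, not from the package), the two covariance matrices can be computed in NumPy as follows:

```python
import numpy as np

# Hypothetical example data: samples in rows, the same p = 5 features in columns
rng = np.random.default_rng(0)
Ra = rng.standard_normal((100, 5))   # Condition A: ma = 100 samples
Rb = rng.standard_normal((120, 5))   # Condition B: mb = 120 samples

# Covariance matrices (rowvar=False treats columns as variables/features)
CA = np.cov(Ra, rowvar=False)
CB = np.cov(Rb, rowvar=False)
```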


Review: Principal Component Analysis (PCA)

PCA identifies directions that maximize variance within a single dataset.

Given covariance matrix C, PCA solves:

argmax_x  xᵀ C x
subject to xᵀx = 1

The maximizer is the leading eigenvector of C; successive principal components are the remaining eigenvectors, obtained by additionally requiring orthogonality to the previously found directions.

PCA identifies the directions of dominant variance, but it does not distinguish between experimental conditions.
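
As a minimal sketch (reusing the covariance matrix CA from the setup above), this eigenvector solution is a standard symmetric eigendecomposition:

```python
import numpy as np

# PCA on CA: eigenvectors of the covariance matrix, sorted by descending eigenvalue
eigvals, eigvecs = np.linalg.eigh(CA)          # eigh: eigendecomposition for symmetric matrices
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The first column maximizes x^T CA x subject to x^T x = 1
pc1 = eigvecs[:, 0]
```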


Comparing Two Datasets

To compare two datasets, we want to identify directions that:

  • Have high variance in Condition A
  • Have low variance in Condition B

A natural objective is:

argmax_x  xᵀ (CA − CB) x
subject to xᵀx = 1

This formulation resembles PCA, but applied to the difference of covariance matrices.
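
A minimal sketch of this naive contrast (again reusing CA and CB from above) is a single eigendecomposition of the difference matrix:

```python
import numpy as np

# Eigendecomposition of the covariance difference
eigvals, eigvecs = np.linalg.eigh(CA - CB)

# The most positive eigenvalue marks the direction with the largest excess
# variance in Condition A; the most negative marks the excess in Condition B.
dir_A = eigvecs[:, np.argmax(eigvals)]
dir_B = eigvecs[:, np.argmin(eigvals)]
```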

However, this approach is sensitive to:

  • Sampling noise
  • High-variance dimensions
  • Finite data effects

These issues motivated the development of contrastive PCA (cPCA).


Contrastive PCA (cPCA)

cPCA introduces a hyperparameter α:

argmax_x  xᵀ (CA − α CB) x
subject to xᵀx = 1

Where:

  • α controls the influence of Condition B
  • Different values of α produce different solutions

This creates ambiguity, since there is no objective way to choose α.
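
A minimal sketch of this α dependence (an illustration only, not the original cPCA implementation, which also includes a procedure for selecting candidate α values):

```python
import numpy as np

# Sweeping alpha changes how strongly Condition B variance is penalized,
# and the leading direction generally changes with it.
for alpha in (0.1, 1.0, 10.0):
    eigvals, eigvecs = np.linalg.eigh(CA - alpha * CB)
    leading = eigvecs[:, np.argmax(eigvals)]
    print(f"alpha = {alpha}: leading direction = {leading}")
```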


Generalized Contrastive PCA (gcPCA)

gcPCA removes the hyperparameter by normalizing the contrast.

gcPCA v4 solves:

argmax_x  (xᵀ (CA − CB) x) / (xᵀ (CA + CB) x)
subject to xᵀx = 1

Because the ratio is invariant to the scale of x, the unit-norm constraint simply fixes the scale of the solution.

This formulation:

  • Penalizes high-variance dimensions
  • Reduces noise bias
  • Eliminates hyperparameter tuning

The denominator acts as a normalization factor based on total variance.
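
As a minimal sketch, the objective value for any candidate direction x can be evaluated directly (gcpca_objective is a hypothetical helper name, not part of the package API):

```python
import numpy as np

def gcpca_objective(x, CA, CB):
    """Normalized variance contrast along direction x; value lies in [-1, 1]."""
    num = x @ (CA - CB) @ x
    den = x @ (CA + CB) @ x
    return num / den
```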


Interpretation

gcPCA v4 eigenvalues lie in the range [−1, +1]:

  • +1 → variance only in Condition A
  • −1 → variance only in Condition B
  • 0 → equal variance in both conditions

This makes gcPCA results directly interpretable.
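
Because each eigenvalue equals (vA − vB) / (vA + vB), where vA and vB are the variances along that gcPC in Conditions A and B, a gcPC with, for example, vA = 3 and vB = 1 has a value of (3 − 1) / (3 + 1) = 0.5, indicating enrichment in Condition A.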

Additionally:

  • gcPCs with positive values → variance enriched in Condition A
  • gcPCs with negative values → variance enriched in Condition B

Solving the Optimization

The gcPCA objective is equivalent to the generalized eigenvalue problem

(CA − CB) x = λ (CA + CB) x

which can be solved through a symmetrizing change of variables.

Define:

M = sqrt(CA + CB)

where sqrt denotes the symmetric matrix square root, and substitute y = M x. Then find the eigendecomposition of the symmetric matrix:

M⁻¹ (CA − CB) M⁻¹

Its eigenvectors y, mapped back to the original feature space as x = M⁻¹ y, define the gcPCs; the corresponding eigenvalues are the gcPCA values described above.

This transformation:

  • Normalizes high-variance dimensions
  • Improves numerical stability
  • Produces robust solutions
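
A minimal NumPy/SciPy sketch of this procedure (an illustrative re-derivation assuming CA + CB is full rank, not the package's implementation):

```python
import numpy as np
from scipy.linalg import sqrtm

# Symmetric square root of the total covariance: M = (CA + CB)^(1/2)
M = np.real(sqrtm(CA + CB))
M_inv = np.linalg.inv(M)

# Symmetric eigenproblem in the normalized space
S = M_inv @ (CA - CB) @ M_inv
eigvals, Y = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, Y = eigvals[order], Y[:, order]   # gcPCA values, in [-1, 1]

# Map eigenvectors back to the original feature space and renormalize
X = M_inv @ Y
X = X / np.linalg.norm(X, axis=0)          # columns of X are the gcPCs
```

Projecting either dataset onto the columns of X then gives the gcPC scores for that condition.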

Orthogonality

Unlike PCA, gcPCs are:

  • Not orthogonal in the original feature space
  • Orthogonal in the normalized feature space

This is expected because gcPCA applies a linear transformation of the feature space (normalizing by the total covariance CA + CB) before computing components.
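
Concretely, two distinct gcPCs xᵢ and xⱼ satisfy

xᵢᵀ (CA + CB) xⱼ = 0   for i ≠ j

rather than xᵢᵀ xⱼ = 0; this is the standard orthogonality relation for a generalized eigenvalue problem, here with respect to the metric CA + CB.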

Orthogonal variants of gcPCA are also available (see manuscript).


Sparse gcPCA

gcPCA also supports sparse solutions.

Sparse gcPCA:

  • Applies elastic net regularization
  • Selects subsets of features
  • Improves interpretability

Sparse solutions do not have a closed-form eigendecomposition; they are computed with an iterative optimization procedure (see the manuscript for details).
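
Schematically, the sparse variant adds elastic net penalties to the gcPCA objective (a simplified form; the exact parameterization and solver are described in the manuscript):

argmax_x  (xᵀ (CA − CB) x) / (xᵀ (CA + CB) x) − λ1 ‖x‖1 − λ2 ‖x‖2²

where the ℓ1 term (λ1) drives coefficients of x to exactly zero and the ℓ2 term (λ2) provides shrinkage.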


Summary

gcPCA extends PCA to compare two datasets by:

  • Contrasting covariance structure
  • Normalizing variance differences
  • Eliminating hyperparameters
  • Producing interpretable components

For additional details and derivations, see the gcPCA manuscript.

Links to Other Pages

1. Quickstart Guide
2. Installation
3. Conceptual Overview
5. Code Reference
6. Input Data Guidelines
7. Interpreting Results