CanonicalMetric - crowlogic/arb4j GitHub Wiki
The relationship between stationary Gaussian processes and their associated metrics is a cornerstone of stochastic process theory. This page gives a rigorous proof that the square root of the variance structure function is the canonical metric for stationary Gaussian processes, and explores its implications for process analysis and applications.
A real-valued stochastic process {X(t)}t∈T is called Gaussian if all finite-dimensional distributions are multivariate normal. The process is stationary (in the strict sense) if for all n ∈ ℕ, t₁,...,tₙ ∈ T, and h ∈ T: $$ (X(t₁+h),...,X(tₙ+h)) \stackrel{d}{=} (X(t₁),...,X(tₙ)) $$
For a mean-zero stationary Gaussian process, the complete statistical characterization comes from its covariance function: $$ C(h) = 𝔼[X(t+h)X(t)] $$ Stationarity implies C(h) depends only on the lag h, not absolute position t.
The variance structure function (variogram) is defined as: $$ D(h) := 𝔼[(X(t+h) - X(t))²] = 2(C(0) - C(h)) $$ This measures the expected squared difference between process values separated by lag h.
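As a sanity check, the identity D(h) = 2(C(0) − C(h)) can be verified by Monte Carlo: draw pairs (X(t+h), X(t)) from the bivariate normal law they share under stationarity. The covariance C(h) = σ²e^{−θ|h|} used here is the OU example discussed later on this page, and all numeric values are illustrative.

```python
import numpy as np

# Assumed example covariance (Ornstein-Uhlenbeck): C(h) = sigma^2 * exp(-theta*|h|)
sigma, theta = 1.0, 0.5
C = lambda h: sigma**2 * np.exp(-theta * np.abs(h))

rng = np.random.default_rng(0)
h = 1.3
cov = np.array([[C(0.0), C(h)],
                [C(h),  C(0.0)]])
# Draw joint samples of (X(t+h), X(t)) and estimate D(h) = E[(X(t+h) - X(t))^2]
samples = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)
D_empirical = np.mean((samples[:, 0] - samples[:, 1]) ** 2)
D_exact = 2.0 * (C(0.0) - C(h))
print(D_empirical, D_exact)  # agreement up to Monte Carlo error
```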
For a mean-square continuous stationary Gaussian process {X(t)}t∈T, the function $$ d(h) := \sqrt{D(h)} = \sqrt{2(C(0) - C(h))} $$ defines a metric on the parameter space T.
We verify the four metric axioms:
1. Non-negativity
For all h ∈ T:
$$ d(h) = \sqrt{𝔼[(X(t+h)-X(t))²]} ≥ 0 $$
since variance is non-negative.
2. Identity of Indiscernibles
(⇒) If h = 0:
$$ d(0) = \sqrt{2(C(0)-C(0))} = 0 $$
(⇐) If d(h) = 0: $$ 𝔼[(X(t+h)-X(t))²] = 0 ⇒ X(t+h) = X(t)\ a.s. $$ Equivalently, C(h) = C(0). If the covariance attains its maximum C(0) only at h = 0 (as for any C that is strictly decreasing in |h|, such as the OU covariance below), this forces h = 0. Without that assumption d is only a pseudometric: it assigns distance zero to any lag at which the process almost surely repeats itself, as happens for periodic processes.
3. Symmetry
$$ d(-h) = \sqrt{2(C(0)-C(-h))} = \sqrt{2(C(0)-C(h))} = d(h) $$
since C(-h) = C(h) for real stationary processes.
4. Triangle Inequality
For h₁,h₂ ∈ T:
$$ d(h₁+h₂) ≤ d(h₁) + d(h₂) $$
Proof of Triangle Inequality
Using the Gaussian increment structure:
$$
\begin{aligned}
d(h₁+h₂)² &= 𝔼[(X(t+h₁+h₂) - X(t))²] \\
&= 𝔼[((X(t+h₁+h₂) - X(t+h₂)) + (X(t+h₂) - X(t)))²] \\
&= d(h₁)² + d(h₂)² + 2𝔼[(X(t+h₁+h₂)-X(t+h₂))(X(t+h₂)-X(t))]
\end{aligned}
$$
By stationarity:
$$ 𝔼[(X(t+h₁+h₂)-X(t+h₂))(X(t+h₂)-X(t))] = C(h₁) + C(h₂) - C(h₁+h₂) - C(0) $$
Applying the Cauchy-Schwarz inequality:
$$
\begin{aligned}
&𝔼[(X(t+h₁+h₂)-X(t+h₂))(X(t+h₂)-X(t))] \\
&≤ \sqrt{𝔼[(X(t+h₁+h₂)-X(t+h₂))²]\,𝔼[(X(t+h₂)-X(t))²]} \\
&= d(h₁)d(h₂)
\end{aligned}
$$
Thus:
$$
d(h₁+h₂)² ≤ d(h₁)² + d(h₂)² + 2d(h₁)d(h₂) = (d(h₁) + d(h₂))²
$$
Taking square roots preserves the inequality. ∎
Consider the Ornstein-Uhlenbeck (OU) process with covariance:
$$ C(h) = σ²e^{-θ|h|} $$
The canonical metric becomes:
$$ d(h) = \sqrt{2σ²(1 - e^{-θ|h|})} $$
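A quick numerical check of the metric axioms for this d(h), verifying symmetry and the triangle inequality over a grid of lags (σ and θ are arbitrary illustrative values):

```python
import numpy as np

# Canonical metric of the OU process: d(h) = sqrt(2*sigma^2*(1 - exp(-theta*|h|)))
sigma, theta = 1.0, 0.5
d = lambda h: np.sqrt(2.0 * sigma**2 * (1.0 - np.exp(-theta * np.abs(h))))

lags = np.linspace(-5.0, 5.0, 41)
# Symmetry: d(-h) == d(h)
assert np.allclose(d(-lags), d(lags))
# Triangle inequality: d(h1 + h2) <= d(h1) + d(h2) for every grid pair
h1, h2 = np.meshgrid(lags, lags)
assert np.all(d(h1 + h2) <= d(h1) + d(h2) + 1e-12)
print("metric axioms hold on the grid")
```

The triangle inequality also follows directly here: d is a concave, increasing function of |h| vanishing at 0, hence subadditive.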
This metric characterizes the process's correlation structure: d(h) grows with the lag and saturates at the asymptotic value σ√2 as h → ∞, reflecting the decay of correlation over long lags.
The canonical metric determines key sample path features through the Dudley Entropy Integral: $$ 𝔼\left[\sup_{t∈T} X(t)\right] ≤ K∫₀^∞ \sqrt{\log N(T,d,ε)} dε $$ where N(T,d,ε) is the covering number of T by d-balls of radius ε. This connects the metric geometry to process regularity.
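As a sketch of how the bound can be evaluated, take T = [0, 1] with the OU canonical metric from the example above, invert d to get the Euclidean half-length of a d-ball, and integrate numerically. The interval-covering approximation N(ε) ≈ ⌈L/(2r(ε))⌉ is an assumption of this sketch, not part of the theorem.

```python
import numpy as np

# Sketch: evaluate the Dudley entropy-integral bound for T = [0, 1]
# under the assumed OU canonical metric d(h) = sqrt(2*sigma^2*(1 - exp(-theta*|h|))).
sigma, theta, L = 1.0, 0.5, 1.0

def euclidean_radius(eps):
    # Invert d: a d-ball of radius eps is a Euclidean interval of half-length r
    return -np.log(1.0 - eps**2 / (2.0 * sigma**2)) / theta

# Diameter of T under d; the integrand vanishes for eps beyond it (N = 1)
d_max = np.sqrt(2.0 * sigma**2 * (1.0 - np.exp(-theta * L)))
eps = np.linspace(1e-6, d_max, 10_000)
N = np.maximum(np.ceil(L / (2.0 * euclidean_radius(eps))), 1.0)  # covering numbers
f = np.sqrt(np.log(N))
dudley = float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(eps)))  # trapezoid rule
print(dudley)  # finite, so the bound on E[sup X] is finite
```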
If for some α > 0 and constant C > 0: $$ d(h) ≤ C|h|^α $$ for all sufficiently small |h|, then the sample paths are almost surely Hölder continuous of every order β < α.
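For the OU example, the elementary bound 1 − e^{−x} ≤ x gives d(h) ≤ √(2σ²θ)·|h|^{1/2}, so the criterion applies with α = 1/2; a grid check:

```python
import numpy as np

# For the assumed OU metric, 1 - exp(-x) <= x gives
# d(h) <= sqrt(2*sigma^2*theta) * |h|^(1/2), i.e. the Hoelder condition with alpha = 1/2.
sigma, theta = 1.0, 0.5
d = lambda h: np.sqrt(2.0 * sigma**2 * (1.0 - np.exp(-theta * np.abs(h))))
C_const = np.sqrt(2.0 * sigma**2 * theta)

h = np.linspace(1e-9, 10.0, 100_000)
assert np.all(d(h) <= C_const * np.sqrt(h) + 1e-12)
print("OU sample paths are a.s. Hoelder continuous of every order beta < 1/2")
```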
In geostatistics, the canonical metric determines optimal weights for spatial interpolation through the (ordinary) kriging equations: $$ \sum_{j=1}^n λ_j C(t_i-t_j) + μ = C(t_i-t_0), \quad i = 1,…,n $$ together with the unbiasedness constraint $\sum_{j=1}^n λ_j = 1$; the solution depends fundamentally on d(h).
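A minimal ordinary-kriging sketch, assuming the OU covariance from the example and hypothetical observation locations: the equations above, augmented with the constraint Σⱼλⱼ = 1, are solved as a single linear system.

```python
import numpy as np

# Sketch of ordinary kriging with the assumed OU covariance C(h) = sigma^2*exp(-theta*|h|).
# Solve: sum_j lam_j C(t_i - t_j) + mu = C(t_i - t0), subject to sum_j lam_j = 1.
sigma, theta = 1.0, 0.5
C = lambda h: sigma**2 * np.exp(-theta * np.abs(h))

t = np.array([0.0, 1.0, 2.5, 4.0])   # observation locations (hypothetical)
t0 = 1.7                             # prediction location (hypothetical)

n = len(t)
A = np.ones((n + 1, n + 1))
A[:n, :n] = C(t[:, None] - t[None, :])   # covariances among observations
A[n, n] = 0.0                            # Lagrange-multiplier block
b = np.append(C(t - t0), 1.0)            # covariances to t0, plus the constraint
sol = np.linalg.solve(A, b)
lam, mu = sol[:n], sol[n]
print(lam, mu)  # weights lam sum to 1 by construction
```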
Gaussian process regression uses the canonical metric through kernel functions: $$ k(x,y) = C(|x-y|) $$ directly influencing prediction uncertainty estimates.
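A minimal GP-regression sketch along these lines, assuming the OU covariance as kernel and toy training data; the posterior mean and variance follow the standard Gaussian conditioning formulas.

```python
import numpy as np

# Sketch of GP regression with kernel k(x, y) = C(|x - y|), assumed OU covariance.
sigma, theta, jitter = 1.0, 0.5, 1e-6
k = lambda x, y: sigma**2 * np.exp(-theta * np.abs(x[:, None] - y[None, :]))

x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)            # toy targets (hypothetical)
x_test = np.array([1.5])

K = k(x_train, x_train) + jitter * np.eye(len(x_train))  # jitter for stability
K_s = k(x_test, x_train)
alpha = np.linalg.solve(K, y_train)
mean = K_s @ alpha                                        # posterior mean
var = sigma**2 - np.sum(K_s * np.linalg.solve(K, K_s.T).T, axis=1)  # posterior variance
print(mean, var)
```

The predictive variance shrinks near the observations and returns to the prior variance σ² at lags where d(h) approaches its asymptote, making the uncertainty estimate a direct function of the canonical metric.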
This exposition rigorously establishes the square root of the variance structure function as the canonical metric for stationary Gaussian processes. The proof of metric properties leverages fundamental characteristics of Gaussian increments and covariance structures, while applications demonstrate its importance across statistical theory and practice. The canonical metric serves as a vital bridge between abstract process properties and concrete computational implementations.