Mercer's Theorem

The following is reproduced directly from [Riesz and Nagy, 98]

Theorem. If the transformation $A$ generated by the continuous symmetric kernel $A(x, y)$ is positive, that is, if $(Af, f) \geq 0$ for all $f$ or, equivalently, if all the characteristic values $\mu_i \neq 0$ are positive, the development (20), $A(x, y) = \sum_{i = 1}^{\infty} \mu_i \varphi_i(x) \varphi_i(y)$, is uniformly convergent.

This theorem extends immediately to the case where all but a finite number of the $\mu_i \neq 0$ are of the same sign, positive or negative. We observe first that since the kernel $A(x, y)$ is continuous, all the image functions

$$A(f)(x) = \int A(x, y) f(y) \, dy$$

are continuous; therefore, in particular, all the characteristic functions $\varphi_n(x) = \frac{1}{\mu_n} A(\varphi_n)(x)$ are continuous. Consequently the "remainders"

$$A_n(x, y) = A(x, y) - \sum_{i = 1}^{n} \mu_i \varphi_i(x) \varphi_i(y)$$

are also continuous functions. We have

$$\int \int A_n(x, y) f(x) f(y) \, dx \, dy = \sum_{i = n + 1}^{\infty} \mu_i \left( \int f \varphi_i \right)^2 \geq 0$$

for every element $f$ of $L^2$.
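The identity can be checked in one line. The sketch below assumes the expansion theorem of the preceding section in the form $(Af, f) = \sum_i \mu_i (f, \varphi_i)^2$ (this form is not restated on this page) and writes $(f, g) = \int f g$:

$$\int \int A_n(x, y) f(x) f(y) \, dx \, dy = (Af, f) - \sum_{i = 1}^{n} \mu_i (f, \varphi_i)^2 = \sum_{i = n + 1}^{\infty} \mu_i (f, \varphi_i)^2 \geq 0$$

The final inequality holds because every $\mu_i$ is positive by hypothesis.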

From this we deduce that $A_n(x, x) \geq 0$. In fact, if we had $A_n(x_0, x_0) < 0$, we should have by continuity $A_n(x, y) < 0$ in a certain neighborhood of the point $(x_0, x_0)$, say for $x_0 - \delta < x, y < x_0 + \delta$. Setting $f(x) = 1$ for $x_0 - \delta < x < x_0 + \delta$ and $f(x) = 0$ elsewhere, the double integral above would become negative, a contradiction. Hence we have

$$A_n(x, x) = A(x, x) - \sum_{i = 1}^{n} \mu_i \varphi_i^2(x) \geq 0$$

for $n = 1, 2, \ldots$. From this we conclude that the series of positive terms

$$\sum_{i = 1}^{\infty} \mu_i \varphi_i^2(x)$$

is convergent and that its sum is $\leq A(x, x)$. Denoting by $M$ the maximum of the continuous function $A(x, x)$, we have by Cauchy's inequality, for $m \leq n$:

$$\left( \sum_{i = m}^{n} \mu_i \varphi_i(x) \varphi_i(y) \right)^2 \leq \sum_{i = m}^{n} \mu_i \varphi_i^2(x) \sum_{i = m}^{n} \mu_i \varphi_i^2(y) \leq M \sum_{i = m}^{n} \mu_i \varphi_i^2(x)$$
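The first inequality is Cauchy's inequality $\left( \sum a_i b_i \right)^2 \leq \sum a_i^2 \sum b_i^2$. One choice of terms that produces it is sketched below; the names $a_i$ and $b_i$ are introduced here only for illustration, and the square roots exist because the $\mu_i$ are positive:

$$a_i = \sqrt{\mu_i} \, \varphi_i(x), \qquad b_i = \sqrt{\mu_i} \, \varphi_i(y), \qquad a_i b_i = \mu_i \varphi_i(x) \varphi_i(y), \qquad a_i^2 = \mu_i \varphi_i^2(x), \qquad b_i^2 = \mu_i \varphi_i^2(y)$$

Each of the two sums on the right is a finite section of the series of positive terms considered above, so the one in $x$ is at most $A(x, x) \leq M$ and the one in $y$ is at most $A(y, y) \leq M$.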

From this it follows that the series

$$\sum_{i = 1}^{\infty} \mu_i \varphi_i(x) \varphi_i(y)$$

converges, for every fixed value of $x$, uniformly in $y$; its sum $B(x, y)$ is therefore a continuous function of $y$, and for every continuous function $f(y)$ we have

$$\int_a^b B(x, y) f(y) \, dy = \sum_{i = 1}^{\infty} \mu_i \varphi_i(x) \int_a^b \varphi_i(y) f(y) \, dy$$

Now by one of the theorems proved in the preceding section, the series in the second member converges to $A(f)(x)$. Hence we have

$$\int_a^b [B(x, y) - A(x, y)] f(y) \, dy = 0.$$

Setting in particular $f(y) = B(x, y) - A(x, y)$ (for a fixed value of $x$), it follows that $B(x, y) - A(x, y) = 0$ for $a \leq y \leq b$; hence, taking $y = x$,

$$A(x, x) = B(x, x) = \sum_{i = 1}^{\infty} \mu_i \varphi_i^2(x)$$

Since the terms of this series are positive continuous functions of $x$ and its sum $A(x, x)$ is a continuous function, it follows from a known theorem of Dini that the series converges uniformly. Applying Cauchy's inequality again, we deduce from this that the series $\sum_{i = 1}^{\infty} \mu_i \varphi_i(x) \varphi_i(y)$ converges uniformly with respect to $x$ as well as to $y$, and a fortiori simultaneously, which was to be proved.
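As a numerical illustration of the theorem (not part of the source text), the sketch below uses the kernel $A(x, y) = \min(x, y)$ on $[0, 1]$, which is continuous, symmetric and positive. Its characteristic functions are $\varphi_i(x) = \sqrt{2} \sin(\omega_i x)$ with $\omega_i = (i - \tfrac{1}{2}) \pi$, and its characteristic values are $\mu_i = 1 / \omega_i^2$. The kernel, the grid size and the function names are all choices made for this example.

```python
import math

def kernel(x, y):
    """Continuous symmetric positive kernel A(x, y) = min(x, y) on [0, 1]."""
    return min(x, y)

def partial_sum(x, y, n):
    """n-term partial sum of sum_i mu_i * phi_i(x) * phi_i(y)."""
    total = 0.0
    for i in range(1, n + 1):
        omega = (i - 0.5) * math.pi   # phi_i(x) = sqrt(2) * sin(omega * x)
        mu = 1.0 / (omega * omega)    # characteristic value: A phi_i = mu_i phi_i
        total += mu * 2.0 * math.sin(omega * x) * math.sin(omega * y)
    return total

def sup_error(n, grid_size=41):
    """Largest |A(x, y) - partial sum| over a uniform grid on [0, 1]^2."""
    pts = [k / (grid_size - 1) for k in range(grid_size)]
    return max(abs(kernel(x, y) - partial_sum(x, y, n)) for x in pts for y in pts)

def min_diagonal_remainder(n, grid_size=41):
    """Smallest value of the remainder A_n(x, x) on a uniform grid."""
    pts = [k / (grid_size - 1) for k in range(grid_size)]
    return min(kernel(x, x) - partial_sum(x, x, n) for x in pts)

# Grid estimate of the uniform error for a few truncation orders.
errors = {n: sup_error(n) for n in (5, 10, 20, 40)}
for n, err in errors.items():
    print(n, err)
```

The printed errors shrink as $n$ grows, consistent with uniform convergence, and `min_diagonal_remainder` stays non-negative up to rounding, consistent with $A_n(x, x) \geq 0$. A grid maximum only estimates the true supremum, so this is a plausibility check rather than a proof.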

Whatever be the continuous symmetric kernel $A(x, y)$, its iterate

$$A^{(2)}(x, y) = \int_a^b A(x, z) A(z, y) dz$$

is continuous and of positive type. In fact,

$$(A^{(2)}f, f) = (Af, Af) \geq 0$$
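The identity follows by exchanging the order of integration and using the symmetry $A(x, z) = A(z, x)$. Written out as a sketch:

$$(A^{(2)}f, f) = \int_a^b \int_a^b \left( \int_a^b A(x, z) A(z, y) \, dz \right) f(y) f(x) \, dy \, dx = \int_a^b \left( \int_a^b A(z, x) f(x) \, dx \right) \left( \int_a^b A(z, y) f(y) \, dy \right) dz = \int_a^b \left[ A(f)(z) \right]^2 dz \geq 0$$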

The characteristic functions $\varphi_i(x)$ of $A$ are also characteristic functions of $A^2$, the transformation generated by the iterated kernel $A^{(2)}(x, y)$, but they correspond to the squares of the characteristic values $\mu_i$ of $A$:

$$A^2 \varphi_i = A(A \varphi_i) = A(\mu_i \varphi_i) = \mu_i^2 \varphi_i$$

The sequence $\mu_1^2, \mu_2^2, \ldots$ contains all the characteristic values of $A^2$ different from 0, each as many times as its multiplicity indicates. If not, there would be a characteristic function $\varphi$ corresponding to a characteristic value $\mu \neq 0$ of $A^2$ and orthogonal to all the $\varphi_i$. This would be in contradiction to the fact that

$$\mu \varphi = A^2 \varphi = \sum_{i = 1}^{\infty} \mu_i (A \varphi, \varphi_i) \varphi_i = \sum_{i = 1}^{\infty} \mu_i (\varphi, A \varphi_i) \varphi_i = \sum_{i = 1}^{\infty} \mu_i^2 (\varphi, \varphi_i) \varphi_i = 0$$

By the theorem of Mercer we therefore have, for the iterate of an arbitrary continuous symmetric kernel $A(x, y)$, the uniformly convergent development:

$$A^{(2)}(x, y) = \sum_{i = 1}^{\infty} \mu_i^2 \varphi_i(x) \varphi_i(y)$$
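A small numerical check of this development, as an illustration only: the kernel $A(x, y) = \cos(x + y)$ on $[0, 2\pi]$ is continuous and symmetric but not of positive type, since its nonzero characteristic values are $+\pi$ (for $\cos x / \sqrt{\pi}$) and $-\pi$ (for $\sin x / \sqrt{\pi}$). The sketch compares the iterate, computed by a midpoint rule, with the sum of $\mu_i^2 \varphi_i(x) \varphi_i(y)$. The kernel, node count and function names are assumptions made for this example.

```python
import math

def kernel(x, y):
    """Symmetric continuous kernel on [0, 2*pi]; not of positive type."""
    return math.cos(x + y)

def iterate(x, y, nodes=64):
    """A^(2)(x, y) = integral over [0, 2*pi] of A(x, z) A(z, y) dz (midpoint rule)."""
    h = 2.0 * math.pi / nodes
    return h * sum(kernel(x, (k + 0.5) * h) * kernel((k + 0.5) * h, y)
                   for k in range(nodes))

def squared_value_sum(x, y):
    """sum_i mu_i^2 phi_i(x) phi_i(y) for the two nonzero characteristic values."""
    mu_squared = math.pi ** 2  # (+pi)^2 and (-pi)^2 coincide
    cos_term = (math.cos(x) / math.sqrt(math.pi)) * (math.cos(y) / math.sqrt(math.pi))
    sin_term = (math.sin(x) / math.sqrt(math.pi)) * (math.sin(y) / math.sqrt(math.pi))
    return mu_squared * (cos_term + sin_term)

# Largest disagreement between the two sides on a coarse grid.
grid = [2.0 * math.pi * k / 12 for k in range(13)]
max_gap = max(abs(iterate(x, y) - squared_value_sum(x, y)) for x in grid for y in grid)
print(max_gap)
```

Here the series has only two terms, so the example checks the form of the development (squared characteristic values, same characteristic functions) rather than its convergence. Both sides equal $\pi \cos(x - y)$ for this kernel.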

References

  1. Frigyes Riesz and Béla Szőkefalvi-Nagy, Functional Analysis. F. Ungar Pub. Co., New York, 1955.