matrix_derivatives - rasigadelab/mathwiki GitHub Wiki

Matrix Derivatives of $C = U L U^T$ with Respect to $U$ and $l$

Let $U$ be an $n \times n$ matrix, and let $L$ be an $n \times n$ diagonal matrix whose diagonal elements are the entries of the vector $l$. Define:

$$ C = U L U^T $$

This document provides the expressions for $C$ in index notation, as well as its first and second derivatives with respect to $U$ and $l$.


1. Expression for $C = U L U^T$ in Index Notation

Let:

  • $U$ be an $n \times n$ matrix with elements $U_{ij}$.
  • $L$ be a diagonal $n \times n$ matrix with diagonal elements given by the vector $\vec{l}$, so $L_{ij} = l_i \delta_{ij}$, where $\delta_{ij}$ is the Kronecker delta.
  • $C = U L U^T$.

In index notation, the element $C_{ij}$ of the matrix $C$ can be written as:

$$ C_{ij} = \sum_{k=1}^{n} U_{ik} \, l_k \, U_{jk} $$
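As a quick sanity check, the index expression can be compared numerically with the matrix product. This is a minimal sketch assuming NumPy; the size `n = 4` and the random test values are arbitrary choices, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # arbitrary size chosen for the check
U = rng.standard_normal((n, n))
l = rng.standard_normal(n)

# Matrix form: C = U L U^T with L = diag(l)
C_matrix = U @ np.diag(l) @ U.T

# Index form: C_ij = sum_k U_ik l_k U_jk
C_index = np.einsum("ik,k,jk->ij", U, l, U)

print(np.allclose(C_matrix, C_index))  # the two forms should agree
```

The `einsum` subscript string mirrors the index expression directly, which makes it a convenient template for the derivative tensors that follow.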


2. First Derivative of $C$ with Respect to $U$ and $\vec{l}$

Derivative of $C$ with Respect to $U$

To find $\frac{\partial C_{ij}}{\partial U_{pq}}$, we differentiate each term in $C_{ij}$ with respect to $U_{pq}$.

$$ \frac{\partial C_{ij}}{\partial U_{pq}} = \sum_k \left( \frac{\partial U_{ik}}{\partial U_{pq}} \, l_k \, U_{jk} + U_{ik} \, l_k \, \frac{\partial U_{jk}}{\partial U_{pq}} \right) $$

Using the Kronecker delta to evaluate these partial derivatives, we have:

$$ \frac{\partial U_{ik}}{\partial U_{pq}} = \delta_{ip} \delta_{kq} \quad \text{and} \quad \frac{\partial U_{jk}}{\partial U_{pq}} = \delta_{jp} \delta_{kq} $$

Substituting these into the expression, we get:

$$ \frac{\partial C_{ij}}{\partial U_{pq}} = l_q \, U_{jq} \, \delta_{ip} + l_q \, U_{iq} \, \delta_{jp} $$
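The result can be checked against central finite differences. The sketch below assumes NumPy; the helper names `C_of` and `dC_dU`, the step `eps`, and the test size are illustrative choices only.

```python
import numpy as np

def C_of(U, l):
    """C_ij = sum_k U_ik l_k U_jk."""
    return np.einsum("ik,k,jk->ij", U, l, U)

def dC_dU(U, l):
    """D[i, j, p, q] = l_q U_jq delta_ip + l_q U_iq delta_jp."""
    n = U.shape[0]
    I = np.eye(n)
    lU = U * l  # lU[j, q] = l_q * U_jq (l broadcasts over columns)
    return np.einsum("ip,jq->ijpq", I, lU) + np.einsum("jp,iq->ijpq", I, lU)

rng = np.random.default_rng(1)
n = 3
U = rng.standard_normal((n, n))
l = rng.standard_normal(n)

D = dC_dU(U, l)

# Central finite differences, perturbing one entry U_pq at a time
eps = 1e-6
D_fd = np.zeros((n, n, n, n))
for p in range(n):
    for q in range(n):
        E = np.zeros((n, n))
        E[p, q] = eps
        D_fd[:, :, p, q] = (C_of(U + E, l) - C_of(U - E, l)) / (2 * eps)

print(np.max(np.abs(D - D_fd)))  # should be at round-off level
```

Because $C$ is quadratic in $U$, the central difference has no truncation error, so any remaining discrepancy is floating-point round-off.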

Derivative of $C$ with Respect to $\vec{l}$

Now, we compute the derivative of $C_{ij}$ with respect to an element $l_m$ of $\vec{l}$.

Since $C_{ij}$ is linear in $\vec{l}$ and $\frac{\partial l_k}{\partial l_m} = \delta_{km}$, differentiating with respect to $l_m$ selects only the term where $k = m$:

$$ \frac{\partial C_{ij}}{\partial l_m} = U_{im} \, U_{jm} $$
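Read as a matrix in $(i, j)$, this derivative is the outer product of column $m$ of $U$ with itself, and contracting it with $\vec{l}$ recovers $C$. A minimal NumPy sketch of both observations, with arbitrary random test values:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
U = rng.standard_normal((n, n))
l = rng.standard_normal(n)

# G[i, j, m] = dC_ij / dl_m = U_im U_jm
G = np.einsum("im,jm->ijm", U, U)

# Each slice G[:, :, m] is the outer product of column m of U with itself
col0_outer = np.outer(U[:, 0], U[:, 0])
print(np.allclose(G[:, :, 0], col0_outer))

# C is linear in l, so contracting the derivative with l gives C back
C = U @ np.diag(l) @ U.T
C_from_G = np.einsum("ijm,m->ij", G, l)
print(np.allclose(C_from_G, C))
```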


3. Second Derivatives of $C$

Second Derivative with Respect to $U$ (Mixed Terms)

For the second derivative of $C_{ij}$ with respect to two elements of $U$, say $U_{pq}$ and $U_{rs}$ (which may coincide), we differentiate $\frac{\partial C_{ij}}{\partial U_{pq}}$ again with respect to $U_{rs}$.

From our earlier result:

$$ \frac{\partial C_{ij}}{\partial U_{pq}} = l_q \, U_{jq} \, \delta_{ip} + l_q \, U_{iq} \, \delta_{jp} $$

Differentiating this with respect to $U_{rs}$:

$$ \frac{\partial^2 C_{ij}}{\partial U_{pq} \, \partial U_{rs}} = \frac{\partial}{\partial U_{rs}} \left( l_q \, U_{jq} \, \delta_{ip} + l_q \, U_{iq} \, \delta_{jp} \right) $$

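Only the factors $U_{jq}$ and $U_{iq}$ depend on $U$, with $\frac{\partial U_{jq}}{\partial U_{rs}} = \delta_{jr} \delta_{qs}$ and $\frac{\partial U_{iq}}{\partial U_{rs}} = \delta_{ir} \delta_{qs}$. Applying the Kronecker delta as before, we find:

$$ \frac{\partial^2 C_{ij}}{\partial U_{pq} \, \partial U_{rs}} = l_q \, \delta_{qs} \left( \delta_{ip} \, \delta_{jr} + \delta_{jp} \, \delta_{ir} \right) $$

This is nonzero only when $q = s$, and it does not depend on $U$ because $C$ is quadratic in $U$.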
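Differentiating the first-derivative expression term by term gives the closed form $l_q \, \delta_{qs} ( \delta_{ip} \delta_{jr} + \delta_{jp} \delta_{ir} )$. The sketch below, assuming NumPy, builds that six-index tensor and compares it with finite differences of the analytic first derivative; the helper `dC_dU` and all test values are illustrative.

```python
import numpy as np

def dC_dU(U, l):
    """D[i, j, p, q] = l_q U_jq delta_ip + l_q U_iq delta_jp."""
    n = U.shape[0]
    I = np.eye(n)
    lU = U * l  # lU[j, q] = l_q * U_jq
    return np.einsum("ip,jq->ijpq", I, lU) + np.einsum("jp,iq->ijpq", I, lU)

rng = np.random.default_rng(3)
n = 3
U = rng.standard_normal((n, n))
l = rng.standard_normal(n)
I = np.eye(n)
Lqs = np.diag(l)  # Lqs[q, s] = l_q delta_qs

# H[i, j, p, q, r, s] = l_q delta_qs (delta_ip delta_jr + delta_jp delta_ir)
H = (np.einsum("qs,ip,jr->ijpqrs", Lqs, I, I)
     + np.einsum("qs,jp,ir->ijpqrs", Lqs, I, I))

# Finite differences of the first derivative with respect to U_rs
eps = 1e-6
H_fd = np.zeros((n,) * 6)
for r in range(n):
    for s in range(n):
        E = np.zeros((n, n))
        E[r, s] = eps
        H_fd[:, :, :, :, r, s] = (dC_dU(U + E, l) - dC_dU(U - E, l)) / (2 * eps)

print(np.allclose(H, H_fd, atol=1e-6))
```

The tensor is symmetric under swapping the index pairs $(p, q)$ and $(r, s)$, as expected of a second derivative.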

Second Derivative with Respect to $l$

The second derivative with respect to the components of $l$ is straightforward. Since

$$ \frac{\partial C_{ij}}{\partial l_m} = U_{im} \, U_{jm} $$

does not depend on $\vec{l}$, differentiating again with respect to any component $l_{m'}$ gives zero, whether or not $m' = m$:

$$ \frac{\partial^2 C_{ij}}{\partial l_m \, \partial l_{m'}} = 0 \quad \text{for all} \quad m, m' $$
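A vanishing second derivative in $\vec{l}$ is equivalent to $C$ being linear in $\vec{l}$ for fixed $U$. A small NumPy sketch of that linearity, where the coefficients `a`, `b` and the vectors `l1`, `l2` are arbitrary test values:

```python
import numpy as np

def C_of(U, l):
    """C_ij = sum_k U_ik l_k U_jk."""
    return np.einsum("ik,k,jk->ij", U, l, U)

rng = np.random.default_rng(4)
n = 3
U = rng.standard_normal((n, n))
l1 = rng.standard_normal(n)
l2 = rng.standard_normal(n)
a, b = 0.7, -1.3  # arbitrary coefficients

# Linearity in l: C(a*l1 + b*l2) = a*C(l1) + b*C(l2)
lhs = C_of(U, a * l1 + b * l2)
rhs = a * C_of(U, l1) + b * C_of(U, l2)
print(np.allclose(lhs, rhs))
```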

Mixed Second Derivative with Respect to $U$ and $l$

For the mixed second derivative with respect to $U_{pq}$ and $l_m$:

$$ \frac{\partial^2 C_{ij}}{\partial U_{pq} \, \partial l_m} = \frac{\partial}{\partial U_{pq}} \left( U_{im} \, U_{jm} \right) $$

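Differentiating with respect to $U_{pq}$ by the product rule, with $\frac{\partial U_{im}}{\partial U_{pq}} = \delta_{ip} \delta_{mq}$ and $\frac{\partial U_{jm}}{\partial U_{pq}} = \delta_{jp} \delta_{mq}$:

$$ \frac{\partial^2 C_{ij}}{\partial U_{pq} \, \partial l_m} = \delta_{ip} \, \delta_{mq} \, U_{jm} + U_{im} \, \delta_{jp} \, \delta_{mq} = \delta_{mq} \left( \delta_{ip} \, U_{jq} + \delta_{jp} \, U_{iq} \right) $$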
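Applying the product rule to $U_{im} U_{jm}$ yields $\delta_{mq} ( \delta_{ip} U_{jq} + \delta_{jp} U_{iq} )$, which can be checked numerically. This is a sketch assuming NumPy; the helper `dC_dl`, the step `eps`, and the test size are illustrative choices.

```python
import numpy as np

def dC_dl(U):
    """G[i, j, m] = dC_ij / dl_m = U_im U_jm."""
    return np.einsum("im,jm->ijm", U, U)

rng = np.random.default_rng(5)
n = 3
U = rng.standard_normal((n, n))
I = np.eye(n)

# M[i, j, p, q, m] = delta_mq (delta_ip U_jq + delta_jp U_iq)
M = (np.einsum("mq,ip,jq->ijpqm", I, I, U)
     + np.einsum("mq,jp,iq->ijpqm", I, I, U))

# Finite differences of dC/dl with respect to each entry U_pq
eps = 1e-6
M_fd = np.zeros((n,) * 5)
for p in range(n):
    for q in range(n):
        E = np.zeros((n, n))
        E[p, q] = eps
        M_fd[:, :, p, q, :] = (dC_dl(U + E) - dC_dl(U - E)) / (2 * eps)

print(np.allclose(M, M_fd, atol=1e-6))
```

The factor $\delta_{mq}$ means that $l_m$ interacts only with column $m$ of $U$, so the mixed derivative vanishes whenever $q \neq m$.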


This concludes the expressions for the first and second derivatives of $C = U L U^T$ with respect to $U$ and $\vec{l}$ in index notation.