Market returns a price ratio vector $\theta_t\in\mathbf{R}^d$.
Wealth after iteration $t$: $M_{t+1}=M_t\cdot(\theta_t^\top w_t)$.
Note: For (2), we have $\theta_{t,i}=\frac{\text{price of the }i^{\text{th}}\text{ stock at round }t+1}{\text{price of the }i^{\text{th}}\text{ stock at round }t}$.
For (3), by recursion, we have $\displaystyle M_{T+1}=M_1\cdot\prod_{t=1}^T \theta_t^\top w_t.$
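A small numerical sketch of this recursion (the arrays `price_ratios` and `weights` are hypothetical inputs, e.g. produced by an online portfolio algorithm; they are not defined in these notes):

```python
import numpy as np

def final_wealth(price_ratios, weights, initial_wealth=1.0):
    """Wealth recursion M_{t+1} = M_t * (theta_t^T w_t).

    price_ratios[t, i] = price of stock i at round t+1 / price at round t,
    weights[t]         = portfolio w_t (nonnegative entries summing to 1).
    """
    wealth = initial_wealth
    for theta_t, w_t in zip(price_ratios, weights):
        wealth *= float(theta_t @ w_t)    # one multiplicative update per round
    return wealth

# Equivalently, M_{T+1} = M_1 * prod_t theta_t^T w_t:
# initial_wealth * np.prod(np.sum(price_ratios * weights, axis=1))
```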
Theorem: A twice-differentiable function $f$ is $\alpha$-exp-concave iff $\forall w\in \mathcal{K}$, $$\nabla^2 f(w)\succeq \alpha \nabla f(w)\nabla f(w)^\top \succeq 0,$$ so in particular $\nabla^2 f(w)$ is PSD.
Comparison: A twice-differentiable function is $\lambda$-strongly convex ($\lambda$-SC) iff $\nabla^2 f(w)\succeq \lambda \mathbf{I}_d$, i.e., $\nabla^2 f(w)$ is PD.
A twice-differentiable function is convex iff $\nabla^2 f(w)\succeq 0$.
So, under the bounded-gradient assumption below, we see $\lambda$-SC $\subseteq\alpha$-exp-concave $\subseteq$ convex.
Corollary: If a function is $\lambda$-SC and its gradients satisfy $\Vert\nabla f(w)\Vert_2\leq G$ for all $w\in\mathcal{K}$, then it is also $\frac{\lambda}{G^2}$-exp-concave.
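One way to see this: since $\nabla f(w)\nabla f(w)^\top\preceq \Vert\nabla f(w)\Vert_2^2\,\mathbf{I}_d\preceq G^2\,\mathbf{I}_d$, strong convexity gives $$\nabla^2 f(w)\succeq \lambda \mathbf{I}_d\succeq \frac{\lambda}{G^2}\nabla f(w)\nabla f(w)^\top,$$ which is exactly the $\frac{\lambda}{G^2}$-exp-concavity condition above.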
Theorem: For an $\alpha$-exp-concave function $f$, suppose Assumptions 1 and 2 hold. Then $\forall x,y \in \mathcal{K}$, $$f(y) \geq f(x) + \nabla f(x)^\top (y-x) + \frac{\beta}{2}(y-x)^\top\nabla f(x)\nabla f(x)^\top(y-x),$$ where $\beta=\frac{1}{2}\min\left(\frac{1}{GD},\alpha\right)$.
Proof (idea): Recall the proof from last lecture. For $\lambda$-SC losses, the regret analysis leaves the terms
$$\frac{1}{2}\Vert w_t - w^* \Vert_2^2\left( \frac{1}{\eta_t}-\frac{1}{\eta_{t-1}} \right)-\frac{\lambda}{2} \Vert w_t - w^* \Vert_2^2,$$ and we use the negative term to cancel the positive one by setting $\eta_t=\frac{1}{\lambda t}$, so that $\frac{1}{\eta_t}-\frac{1}{\eta_{t-1}}=\lambda t-\lambda(t-1)=\lambda$.
Here, the theorem above supplies the analogous negative term
$$-\frac{\beta}{2}(w_t - w^{*} )^{\top}\nabla f_t(w_t)\nabla f_t(w_t)^{\top}(w_t - w^{*} ),$$ which we want to cancel against the positive term
$$\frac{1}{2}(w_t-w^{*})^\top\left(\frac{1}{\eta_t}A_t-\frac{1}{\eta_{t-1}}A_{t-1}\right)(w_t-w^{*}),$$
where $\displaystyle A_t=\sum_{i=1}^t\nabla f_i(w_i)\nabla f_i(w_i)^\top$. Since $A_t-A_{t-1}=\nabla f_t(w_t)\nabla f_t(w_t)^\top$, setting $\eta_t=\frac{1}{\beta}$ makes the two terms cancel.
Theorem: Suppose all loss functions are $\alpha$-exp-concave, and Assumptions 1 and 2 hold. With $\epsilon=\frac{1}{\beta^2D^2}$ and $\eta_t=\frac{1}{\beta}$, the regret of the Online Newton Step (ONS) algorithm is bounded by $$Reg_T^{\text{ONS}}\leq 2\left(\frac{1}{\alpha}+GD\right)d\log T=\mathcal{O}\left(\frac{d\log T}{\alpha}\right).$$
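For concreteness, here is a minimal sketch of the ONS update with these parameter choices. It omits the projection onto $\mathcal{K}$ in the $A_t$-norm, and the gradient oracle `loss_grad` is a hypothetical stand-in for the loss functions $f_t$, not something defined in these notes.

```python
import numpy as np

def online_newton_step(loss_grad, T, d, beta, D, w_init=None):
    """Minimal (unprojected) ONS sketch with eps = 1/(beta^2 D^2), eta_t = 1/beta.

    `loss_grad(t, w)` is a hypothetical oracle returning nabla f_t(w).
    """
    eps = 1.0 / (beta ** 2 * D ** 2)
    A = eps * np.eye(d)                              # A_0 = eps * I_d
    w = np.zeros(d) if w_init is None else np.array(w_init, dtype=float)
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w.copy())
        g = loss_grad(t, w)                          # nabla f_t(w_t)
        A += np.outer(g, g)                          # A_t = A_{t-1} + g g^T
        w = w - (1.0 / beta) * np.linalg.solve(A, g) # Newton-style step, eta_t = 1/beta
        # Full ONS would now project w onto K w.r.t. the norm induced by A_t.
    return iterates
```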
To summarize the online convex optimization results so far:
| Loss class | Algorithm | Upper Bound | Lower Bound |
| --- | --- | --- | --- |
| Convex | OGD with $\eta_t=\mathcal{O}\left(\frac{1}{\sqrt{T}}\right)$ | $\mathcal{O}(\sqrt{T})$ | $\Omega(\sqrt{T})$ |
| $\alpha$-exp-concave | ONS | $\mathcal{O}\left(\frac{d\log{T}}{\alpha}\right)$ | $\Omega\left(\frac{d\log{T}}{\alpha}\right)$ |
| $\lambda$-SC | OGD with $\eta_t=\mathcal{O}\left(\frac{1}{\lambda t}\right)$ | $\mathcal{O}\left(\frac{\log T}{\lambda}\right)$ | $\Omega\left(\frac{\log T}{\lambda}\right)$ |
Better Bounds 2
Achieving data-dependent bounds
Example 1: Gradient-dependent bound: $$\mathcal{O}\left(D\sqrt{\sum_{t=1}^T \Vert \nabla f_t(w_t)\Vert _2^2}\right)$$
Follow the Leader (FTL): At step $t$, pick $\displaystyle w_t=\underset{w\in\mathcal{K}}{\arg\min}\sum_{i=1}^{t-1} f_i(w)$, i.e., minimize the cumulative loss over all previous steps (assumes oracle access to all previous functions). Its worst-case regret is $\Omega(T)$ (see the sketch below).
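As a quick illustration of the $\Omega(T)$ behavior, here is the classic one-dimensional counterexample (a sketch with hypothetical alternating losses, not an instance from these notes): linear losses $f_t(w)=z_t w$ on $\mathcal{K}=[-1,1]$ force FTL to jump between the endpoints every round.

```python
import numpy as np

# Classic 1-d instance where FTL suffers linear regret:
# K = [-1, 1], f_t(w) = z_t * w, with z_1 = 0.5 and z_t alternating -1, +1 afterwards.
T = 1000
z = np.array([0.5] + [(-1.0) ** t for t in range(1, T)])   # 0.5, -1, +1, -1, ...

ftl_loss, cum = 0.0, 0.0
for t in range(T):
    # FTL minimizes the cumulative past loss cum * w over [-1, 1] (an endpoint, or 0 on a tie).
    w_t = 0.0 if cum == 0 else (-1.0 if cum > 0 else 1.0)
    ftl_loss += z[t] * w_t
    cum += z[t]

best_fixed = min(cum * w for w in (-1.0, 1.0))   # linear loss: best fixed point is an endpoint
print(f"FTL regret ~ {ftl_loss - best_fixed:.1f} over T = {T} rounds")   # grows linearly in T
```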
Follow the Regularized Leader (FTRL): At step $t$, pick $\displaystyle w_t=\underset{w\in\mathcal{K}}{\arg\min}\sum_{i=1}^{t-1} f_i(w)+R(w)$ with regularizer $R(w)=\lambda\Vert w\Vert_2^2$. With an appropriately tuned $\lambda$, this achieves $\mathcal{O}(\sqrt{T})$ regret (a minimal sketch follows).
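A minimal sketch of this update for linearized losses on a Euclidean ball $\mathcal{K}=\{w:\Vert w\Vert_2\le D\}$. The gradient oracle `loss_grad` is a hypothetical stand-in for the losses $f_t$; with linear losses the FTRL step has a closed form, so no inner solver is needed.

```python
import numpy as np

def project_ball(w, D):
    """Euclidean projection onto K = {w : ||w||_2 <= D}."""
    norm = np.linalg.norm(w)
    return w if norm <= D else (D / norm) * w

def ftrl_linearized(loss_grad, T, d, lam, D):
    """FTRL with linearized losses and R(w) = lam * ||w||_2^2.

    With g_i = nabla f_i(w_i), the step argmin_{w in K} (sum_i g_i)^T w + lam ||w||_2^2
    has the closed form Proj_K(-(sum_i g_i) / (2 * lam)). `loss_grad(t, w)` is a
    hypothetical oracle returning nabla f_t(w); lam ~ sqrt(T) gives the O(sqrt(T)) rate.
    """
    grad_sum = np.zeros(d)
    w = np.zeros(d)                      # w_1 minimizes R alone
    iterates = []
    for t in range(1, T + 1):
        iterates.append(w.copy())
        grad_sum += loss_grad(t, w)
        w = project_ball(-grad_sum / (2.0 * lam), D)
    return iterates
```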