Lecture 02. Linear Regression
- Need for feature scaling: when it is required, when it is not
- Common misconception to flag: "normalization means the data become normally distributed" is nonsensical, since rescaling does not change the shape of a distribution (see the sketch after this list)
- Reference: Linear Regression, Yale.edu
- Good code + animated gif (red data points and plane) here
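
To illustrate the normalization point above, here is a minimal NumPy sketch; the feature values are made up for illustration. Both rescalings are linear maps, which is why neither can turn a skewed distribution into a Gaussian:

```python
import numpy as np

# Toy feature values (made up for illustration)
x = np.array([100.0, 250.0, 400.0, 800.0, 1200.0])

# Min-max normalization: squeezes the range into [0, 1]
x_norm = (x - x.min()) / (x.max() - x.min())

# Standardization: zero mean, unit variance
x_std = (x - x.mean()) / x.std()

# Both are linear maps: they only shift and stretch the axis, so the
# shape of the distribution is unchanged -- "normalized" data are not
# normally distributed unless the raw data already were.
print(x_norm)
print(x_std)
```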
```math
\begin{align*}
\theta'_0 &= \theta_0-\alpha \frac{\partial}{\partial \theta_0} J\left(\theta\right) \\[2ex]
\theta'_1 &= \theta_1-\alpha \frac{\partial}{\partial \theta_1} J\left(\theta\right) \\
&\;\;\vdots \\
\theta'_j &= \theta_j-\alpha \frac{\partial}{\partial \theta_j} J\left(\theta\right) \\
&\;\;\vdots \\
\theta'_n &= \theta_n-\alpha \frac{\partial}{\partial \theta_n} J\left(\theta\right)
\end{align*}
```
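
To make the simultaneous update concrete, here is a minimal NumPy sketch of one gradient-descent step. It assumes $J(\theta)$ is the usual mean-squared-error cost of linear regression and that the design matrix `X` carries a leading column of ones so that `theta[0]` plays the role of the offset $\theta_0$; the data and step count are made up:

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha):
    """One simultaneous update of theta_0 ... theta_n.

    Assumes J(theta) is the mean-squared-error cost of linear
    regression and that X has a leading column of ones.
    """
    m = len(y)
    residuals = X @ theta - y          # h(x^(i)) - y^(i) for every example
    gradient = X.T @ residuals / m     # dJ/dtheta_j for every j at once
    return theta - alpha * gradient    # all parameters updated together

# Toy usage with made-up data following y = 1 + 2x
X = np.c_[np.ones(5), np.arange(5.0)]
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
theta = np.zeros(2)
for _ in range(1000):
    theta = gradient_descent_step(theta, X, y, alpha=0.1)
print(theta)   # approaches [1., 2.]
```

The key point the vectorized form captures: every $\theta_j$ is computed from the *old* parameter vector before any component is overwritten, which is exactly the simultaneous update the equations above prescribe.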
```{figure} ../images/lec02_2_normalizedContour.png
---
name: normalizedContour
width: 100%
---
The 3D rendering of the figure above. The $b$ stands for _bias_, a term similar to the offset $\theta_0$ that you will see in the literature (especially with neural networks).
```
<sub>Source: [deeplearning.ai](https://www.deeplearning.ai) | Andrew Ng</sub>
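
The practical payoff of those near-circular contours can be demonstrated with a small experiment. The sketch below uses made-up data and a hypothetical `fit` helper, and assumes plain batch gradient descent on the MSE cost:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: two features on wildly different scales
x1 = rng.uniform(0, 1, 100)
x2 = rng.uniform(0, 1000, 100)
y = 3.0 * x1 + 0.005 * x2 + rng.normal(0.0, 0.1, 100)

def fit(features, y, alpha, steps=500):
    """Plain batch gradient descent on the MSE cost (hypothetical helper)."""
    X = np.c_[np.ones(len(y)), features]   # leading column of ones = bias b
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta -= alpha * X.T @ (X @ theta - y) / len(y)
    return theta

raw = np.c_[x1, x2]
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)

# Raw features: elongated cost contours force a tiny learning rate, and
# the small-scale directions (bias and x1) barely move in 500 steps.
print(fit(raw, y, alpha=1e-6))

# Standardized features: near-circular contours, so a much larger learning
# rate converges quickly (coefficients are in standardized-feature units).
print(fit(scaled, y, alpha=0.1))
```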