Andrew NG ML Course Notes - sakthiram/100DaysOfMLCode GitHub Wiki
Week 1
Machine Learning
Arthur Samuel described it as: "the field of study that gives computers the ability to learn without being explicitly programmed."
Tom Mitchell provides a more modern definition: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."
Example: playing checkers.
E = the experience of playing many games of checkers.
T = the task of playing checkers.
P = the probability that the program will win the next game.
Supervised & Unsupervised learning
Supervised: The correct answer y(i) is given for each training sample x(i). The cost function measures how far the predictions are from these answers.
Regression: Mapping input variables (features) to a continuous output using a hypothesis function.
Classification: Mapping input variables to discrete categories.
Unsupervised: Clustering or grouping data based on relationships among the input variables (features), with no correct answers given.
[Linear Regression] Why is the cost function a sum of SQUARES, rather than a sum of absolute values or cubes?
It isn't the only possible cost function, but it has many nice properties.
Overestimates (+) and underestimates (-) are punished equally because squaring removes the sign.
Big errors get punished more than small ones.
The squaring function is smooth and its derivative is linear in the error (nice for optimization).
For linear regression the squared-error cost is convex, so any local minimum is the global minimum, and gradient descent with a suitable learning rate will converge to it.
[Linear Regression] Why can’t I use 4th powers in the cost function? Don’t they have the nice properties of squares?
Distance in Cartesian coordinates is sqrt(sum of squares of the x, y, ... distances from the origin). Think of the error as this distance.
Even when the axes (x, y, ...) are rotated, the sum of squares stays the same for a given point.
4th powers lack this property: the sum of 4th powers changes when the axes are rotated.
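A quick numerical check of the rotation argument, as a sketch only. The helper names and the example point are hypothetical, and the rotation is the standard 2D rotation about the origin.

```python
import math


def rotate(point, angle):
    """Rotate a 2D point about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


def sum_of_powers(point, p):
    """Sum of |coordinate|^p over all coordinates of the point."""
    return sum(abs(v) ** p for v in point)


error = (1.0, 0.0)                    # hypothetical error vector
rotated = rotate(error, math.pi / 4)  # same vector seen from axes turned by 45 degrees

squares_before = sum_of_powers(error, 2)
squares_after = sum_of_powers(rotated, 2)
fourth_before = sum_of_powers(error, 4)
fourth_after = sum_of_powers(rotated, 4)

print(squares_before, squares_after)  # both approximately 1.0
print(fourth_before, fourth_after)    # 1.0 vs. approximately 0.5
```

The sum of squares is unchanged by the rotation, while the sum of 4th powers drops from 1 to about 0.5 for the same point.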
Why does 1/(2 * m) make the math easier?
When we differentiate the cost to calculate the gradient, we get a factor of 2 from the exponent inside the sum. That 2 cancels the 1/2 in 1/(2 * m), giving a slightly simpler formula with a leading factor of 1/m.
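The cancellation can be verified numerically. This is a minimal sketch that assumes a one-feature hypothesis h(x) = theta0 + theta1 * x, which these notes do not spell out. The function names and data are made up for illustration.

```python
def cost(theta0, theta1, xs, ys):
    """J = 1/(2m) * sum((h(x) - y)^2), with h(x) = theta0 + theta1 * x (assumed form)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(theta0, theta1, xs, ys):
    """Analytic gradient: the 2 from the square has cancelled the 1/2, leaving 1/m."""
    m = len(xs)
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    return d_theta0, d_theta1


def numerical_gradient(theta0, theta1, xs, ys, eps=1e-6):
    """Central finite differences of the cost, used only as a cross-check."""
    d_theta0 = (cost(theta0 + eps, theta1, xs, ys) - cost(theta0 - eps, theta1, xs, ys)) / (2 * eps)
    d_theta1 = (cost(theta0, theta1 + eps, xs, ys) - cost(theta0, theta1 - eps, xs, ys)) / (2 * eps)
    return d_theta0, d_theta1


# Hypothetical toy data and parameters.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
analytic = gradient(0.5, 1.0, xs, ys)
numeric = numerical_gradient(0.5, 1.0, xs, ys)
print(analytic)
print(numeric)
```

The two gradients agree up to rounding, and the analytic version carries no leftover factor of 2.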