k nearest neighbours
- Given a new point, kNN takes the k nearest training points, checks which class is dominant among them, and assigns the new point to that class.
- K controls the shape of the decision boundary.

Pros:

- No need for training: kNN is a lazy learning algorithm and therefore requires no training prior to making real-time predictions. This makes kNN much faster to set up than algorithms that require training, e.g. SVM or linear regression.
- Few parameters: only k and the distance measure need to be defined.
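The majority-vote idea above can be sketched in a few lines. This is a minimal illustrative implementation (names like `knn_predict` and the toy dataset are made up for the example), not a library call:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # No training phase: just measure the Euclidean distance from the query
    # to every stored point at prediction time (this is the "lazy" part).
    dists = [math.dist(x, query) for x in train_X]
    # Indices of the k smallest distances
    nearest = sorted(range(len(train_X)), key=lambda i: dists[i])[:k]
    # Dominant class among the k neighbours
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy 2-D dataset: class "a" clusters near the origin, class "b" near (5, 5)
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(X, y, (0.5, 0.5), k=3))  # → a
```

Note that the only knobs are `k` and the distance function, matching the "few parameters" point above; swapping `math.dist` for another metric changes the behaviour without retraining anything.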
Cons:
- The KNN algorithm doesn't work well with high-dimensional data: with a large number of dimensions, distances between points become nearly uniform and stop being informative (the curse of dimensionality).
- The KNN algorithm has a high prediction cost for large datasets, because the distance between the new point and every stored point must be computed at prediction time.
- Finally, the KNN algorithm doesn't work well with categorical features, since it is difficult to define a meaningful distance between category values.
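The high-dimensional problem can be seen numerically: for random points, the ratio between the farthest and nearest distance shrinks toward 1 as the dimension grows, so "nearest" neighbours stop standing out. A small sketch (the function name and sample sizes are illustrative):

```python
import math
import random

random.seed(0)

def distance_spread(dim, n_points=200):
    """Ratio of farthest to nearest distance from the origin to random points.

    As `dim` grows this ratio approaches 1, i.e. every point looks almost
    equally far away and the k "nearest" neighbours lose their meaning.
    """
    pts = [[random.random() for _ in range(dim)] for _ in range(n_points)]
    origin = [0.0] * dim
    dists = [math.dist(p, origin) for p in pts]
    return max(dists) / min(dists)

for dim in (2, 10, 100, 1000):
    print(dim, round(distance_spread(dim), 2))
```

Running this shows a large spread in 2 dimensions collapsing to a ratio close to 1 by 1000 dimensions, which is why distance-based voting degrades on high-dimensional data.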