K Means Clustering - utkaln/machine-learning GitHub Wiki

K-means is an unsupervised algorithm: it works with only the input features X, with no labels Y given.
Choose the number of clusters (K) that makes the most sense for the business decision (such as demographics, type of disease, or retail groups).
K-means works as follows:
- Step 1: Randomly initialize the K cluster centroids
- Step 2: Assign each data point to its closest cluster centroid, recording that centroid's cluster index
- Step 3: Calculate the mean of the data points assigned to each cluster and move that cluster's centroid to the mean
- Repeat Steps 2 and 3 for a number of iterations to reduce the cost function (preferably under 100)
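
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the parameters `max_iters` and `seed`, and the choice to initialize the centroids by sampling K distinct data points are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Minimal K-means sketch (hypothetical helper); returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids (an assumption;
    # other random initialization schemes are possible).
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iters):
        # Step 2: assign each point to its nearest centroid (its cluster index).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of the points assigned to it.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        # Stop early once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Usage: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
centroids, labels = kmeans(X, k=2)
print(centroids)
print(labels)
```

Because the starting centroids are random, different seeds can lead to different final clusters on harder data; running the sketch with several seeds and keeping the best result is a common workaround.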

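K-means does not need to calculate a cost function explicitly during its updates, because each iteration naturally reduces (or at least never increases) the cost, which is the average squared distance between each data point and its assigned centroid.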
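
For reference, the cost that K-means implicitly reduces is usually written as below. The notation is an assumption introduced here, not taken from the notes above: $m$ is the number of data points, $x^{(i)}$ is the $i$-th point, $c^{(i)}$ is the index of the cluster it is assigned to, and $\mu_k$ is the centroid of cluster $k$.

$$
J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right) = \frac{1}{m}\sum_{i=1}^{m}\left\lVert x^{(i)} - \mu_{c^{(i)}} \right\rVert^{2}
$$

The assignment step minimizes $J$ over the $c^{(i)}$ with the centroids held fixed, and the mean-update step minimizes $J$ over the $\mu_k$ with the assignments held fixed, so neither step can increase the cost.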