python icp6 - koushikskr/python GitHub Wiki

Introduction: This ICP is about implementing the KMeans algorithm.

Question 1: Implement KMeans, remove null values, and find the number of clusters using the elbow method.

Solution steps: Created a data frame from the given data set. Found the null values in the columns using the isnull() method and replaced them with the column mean using the mean() method. Split the data into x and y using the required columns. Fit KMeans on the input data for a range of cluster counts, plotted the elbow graph, and chose the number of clusters from the bend in the curve.
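The steps for Question 1 can be sketched roughly as follows. This is a minimal illustration, not the code used in the ICP: the data set is not named on this page, so a synthetic one from make_blobs stands in for it, and the column names (f1 to f4, label), the injected missing values, and the range of 1 to 10 clusters are all assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the ICP data set (assumption: the real file is not shown here).
features, labels = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)
df = pd.DataFrame(features, columns=["f1", "f2", "f3", "f4"])
df["label"] = labels
df.iloc[::25, 1] = np.nan  # inject a few missing values for illustration

# Count null values per column, then fill them with the column mean.
print(df.isnull().sum())
df = df.fillna(df.mean())

# Split into input features (x) and the remaining column (y).
x = df.drop(columns="label")
y = df["label"]

# Elbow method: fit KMeans for several k and record the inertia
# (within-cluster sum of squared distances).
ks = list(range(1, 11))
inertias = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(x)
    inertias.append(model.inertia_)

# Plot inertia against k; the "elbow" is where the curve stops dropping sharply.
fig, ax = plt.subplots()
ax.plot(ks, inertias, marker="o")
ax.set_xlabel("Number of clusters (k)")
ax.set_ylabel("Inertia")
ax.set_title("Elbow method")
```

In a notebook, calling plt.show() after the last line displays the curve. The number of clusters is read off by eye, so the choice is a judgment call rather than a computed value.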

Question 2: Calculate the silhouette score for the above clustering.

Solution steps: Created a KMeans model and fit it on the x data. Predicted the cluster labels for the x values, then computed the silhouette score from x and the predicted labels.
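A rough sketch of the silhouette computation is given below. It again uses synthetic data in place of the ICP data set, and the choice of 3 clusters is an assumption made for the example, not the value found in the ICP.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the x data (assumption).
x, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)

# Fit KMeans with the assumed number of clusters and predict a label per row.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km.fit(x)
y_cluster = km.predict(x)

# Silhouette score lies in [-1, 1]; values closer to 1 mean
# tighter, better separated clusters.
score = silhouette_score(x, y_cluster)
print("Silhouette score:", score)
```

silhouette_score takes the feature matrix and the predicted labels, so it can be used without any ground-truth y.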

Questions 3, 4 and bonus questions: Apply scaling, PCA, and visualizations.

Solution steps: Initialized the scaler and fit it on the x data, then obtained the scaled data (x_scaler) from the transform() method. Initialized a PCA object, obtained the x_pca data, and fit the KMeans model on it. Predicted the cluster labels with the predict() method. Computed the silhouette score again; its value is lower than before. Plotted the clusters using the predicted labels.
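The scaling, PCA, and plotting steps might look like the sketch below. It assumes StandardScaler as the scaler, 2 principal components, and 3 clusters; none of these is stated on this page, and the data is again synthetic, so the resulting score will not match the ICP's numbers and need not move in the same direction.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the x data (assumption).
x, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=0)

# Scale each feature to zero mean and unit variance.
scaler = StandardScaler()
scaler.fit(x)
x_scaler = scaler.transform(x)

# Reduce the scaled data to 2 principal components (assumed count).
pca = PCA(n_components=2)
x_pca = pca.fit_transform(x_scaler)

# Cluster the reduced data and score the result.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
km.fit(x_pca)
y_cluster = km.predict(x_pca)
score = silhouette_score(x_pca, y_cluster)
print("Silhouette score after scaling and PCA:", score)

# Scatter plot of the two components, colored by predicted cluster.
fig, ax = plt.subplots()
ax.scatter(x_pca[:, 0], x_pca[:, 1], c=y_cluster)
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("KMeans clusters on PCA-reduced data")
```

Reducing to 2 components is what makes the scatter plot possible; with more components the clustering still works, but only a 2-D projection of it can be drawn this way.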