ICP6 - PallaviArikatla/Python GitHub Wiki

OBJECTIVE: Implementation of the K-means algorithm, scaling, and PCA.

QUESTION 1: Implement K-means. Remove the null values present in the data and identify the number of clusters in the data using the elbow method.

  • Remove all the nulls in the given dataset.
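A minimal sketch of null handling with pandas, using a made-up toy DataFrame rather than the ICP dataset. The column names `feature_a` and `feature_b` are hypothetical, and both options shown (dropping rows and filling with the column mean) are assumptions about how the nulls could be handled, not necessarily the approach used here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are placeholders, not the ICP dataset.
df = pd.DataFrame({
    "feature_a": [1.0, 2.0, np.nan, 4.0],
    "feature_b": [10.0, np.nan, 30.0, 40.0],
})

# Count missing values per column before cleaning.
nulls_before = df.isnull().sum()

# Option 1: drop every row that contains at least one null.
dropped = df.dropna()

# Option 2: replace each null with the mean of its column.
filled = df.fillna(df.mean())

# Confirm that no nulls remain after filling.
nulls_after = filled.isnull().sum()
print(nulls_before)
print(nulls_after)
```

Dropping rows is the simplest option but discards data; filling with the mean keeps every row at the cost of slightly flattening the column's variance.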

Null values are replaced using this function, and the output is as follows:

  • Elbow method: used to identify the number of clusters.
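The elbow method can be sketched with scikit-learn as follows. This is an illustration on synthetic data from `make_blobs`, not the ICP dataset; the range of k values, the `random_state`, and the helper name `plot_elbow` are assumptions chosen for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data with 3 well-separated groups (assumption).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit K-means for k = 1..10 and record the within-cluster sum of squares,
# which scikit-learn exposes as the fitted model's inertia_ attribute.
ks = list(range(1, 11))
wcss = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=42)
    model.fit(X)
    wcss.append(model.inertia_)


def plot_elbow(ks, wcss):
    """Hypothetical helper: plot WCSS against k to look for the elbow."""
    import matplotlib.pyplot as plt

    plt.plot(ks, wcss, marker="o")
    plt.xlabel("Number of clusters (k)")
    plt.ylabel("WCSS (inertia)")
    plt.title("Elbow method")
    plt.show()


# plot_elbow(ks, wcss)  # uncomment to display the curve
```

The elbow is the value of k after which adding more clusters reduces the WCSS only marginally.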

A graph is plotted. By analyzing the graph together with the given dataset, we can identify the number of clusters. For my data, I chose 3 clusters.

And the plotted graph will be as follows:

  • Calculate silhouette score for the original data.
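A minimal sketch of computing the silhouette score for unscaled data, again on synthetic `make_blobs` data rather than the ICP dataset. The choice of 3 clusters follows the elbow result above; everything else (sample size, `random_state`) is an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the original (unscaled) data (assumption).
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=42)

# Cluster with k = 3 and get one label per sample.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# The silhouette score ranges from -1 to 1; higher means the clusters
# are tighter and better separated.
score = silhouette_score(X, labels)
print(f"Silhouette score (original data): {score:.3f}")
```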

Score:

  • Apply scaling and calculate silhouette score for the scaled data.
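The scaling step can be sketched as below. `StandardScaler` is an assumption; the page does not say which scaler was used. The data is synthetic, with one feature deliberately inflated so that scaling has a visible effect.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption); inflate one feature's range so
# that it would dominate Euclidean distances without scaling.
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=42)
X[:, 0] *= 100.0

# Standardize every feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Cluster the scaled data and score it in the scaled space.
labels_scaled = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_scaled)
score_scaled = silhouette_score(X_scaled, labels_scaled)
print(f"Silhouette score (scaled data): {score_scaled:.3f}")
```

Because K-means relies on Euclidean distance, features with large numeric ranges can dominate the clustering unless the data is scaled first.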

Score:

  • Apply PCA on original and scaled data.
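A minimal sketch of applying PCA to both the original and the scaled data. Reducing to 2 components is an assumption made so the result can be plotted; the data is synthetic.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with 4 features (assumption).
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# PCA on the original data, keeping 2 principal components.
pca_original = PCA(n_components=2)
X_pca = pca_original.fit_transform(X)

# PCA on the scaled data, fitted separately.
pca_scaled = PCA(n_components=2)
X_scaled_pca = pca_scaled.fit_transform(X_scaled)

# Fraction of the total variance captured by each kept component.
print(pca_original.explained_variance_ratio_)
print(pca_scaled.explained_variance_ratio_)
```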

Bonus Question:

  • Apply K-means on the PCA result in two pipelines: PCA then K-means, and scaling then PCA then K-means. Plot the graph for the scaling, PCA, and K-means pipeline.
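Both pipelines can be sketched as follows, on synthetic data. The helper names `cluster_and_score` and `plot_clusters` are hypothetical, and `StandardScaler`, 2 components, and 3 clusters are assumptions carried over from the earlier steps.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption).
X, _ = make_blobs(n_samples=300, centers=3, n_features=4, random_state=42)


def cluster_and_score(data, n_clusters=3):
    """Hypothetical helper: run K-means and return (labels, silhouette score)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=42)
    labels = model.fit_predict(data)
    return labels, silhouette_score(data, labels)


# Pipeline 1: PCA, then K-means.
X_pca = PCA(n_components=2).fit_transform(X)
labels_pca, score_pca = cluster_and_score(X_pca)

# Pipeline 2: scaling, then PCA, then K-means.
X_scaled_pca = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(X))
labels_sp, score_sp = cluster_and_score(X_scaled_pca)

print(f"PCA + K-means: {score_pca:.3f}")
print(f"Scaled + PCA + K-means: {score_sp:.3f}")


def plot_clusters(points, labels):
    """Hypothetical helper: scatter the 2-D points colored by cluster label."""
    import matplotlib.pyplot as plt

    plt.scatter(points[:, 0], points[:, 1], c=labels, cmap="viridis", s=20)
    plt.xlabel("Principal component 1")
    plt.ylabel("Principal component 2")
    plt.title("Scaled + PCA + K-means")
    plt.show()


# plot_clusters(X_scaled_pca, labels_sp)  # uncomment to display the plot
```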

Scores:

Graph: