ICP6 - PallaviArikatla/Python GitHub Wiki
OBJECTIVE: Implementation of Kmeans algorithm, Scaling and PCA.
QUESTION 1: Implement Kmeans. Remove the null values present in the data and identify the number of clusters in the data using the Elbow method.
- Remove all the nulls in the given dataset.
Null values get replaced using this function, and the output is as follows:
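A minimal sketch of null handling, assuming pandas (the page does not name the library or the replacement value that was used). The toy DataFrame and its column names are hypothetical stand-ins for the ICP dataset. Two common options are shown: dropping rows that contain nulls, and filling nulls with the column mean.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the ICP dataset (not from the original page).
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [10.0, 20.0, np.nan, 40.0],
})

# Option 1: drop every row that contains at least one null.
dropped = df.dropna()

# Option 2: replace each null with the mean of its column.
filled = df.fillna(df.mean())

# Count the nulls before and after filling.
print("nulls before:", int(df.isnull().sum().sum()))
print("nulls after: ", int(filled.isnull().sum().sum()))
```

Dropping rows is simpler but discards data; filling keeps every row at the cost of introducing estimated values.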
- Elbow method: used to identify the number of clusters.
A graph is plotted. By analyzing the graph and the given dataset, we can identify the number of clusters. For my data, I chose 3 clusters.
The plotted graph is as follows:
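The Elbow method can be sketched as below, assuming scikit-learn (the page does not name the library). The data is synthetic, generated with `make_blobs` around three hand-picked centers, so it only illustrates the technique and does not reproduce the original graph. For each candidate `k`, Kmeans is fitted and the within-cluster sum of squares (`inertia_`) is recorded.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (an assumption for illustration).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.8,
    random_state=0,
)

# Fit Kmeans for k = 1..10 and record the within-cluster sum of squares.
wcss = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X)
    wcss.append(km.inertia_)

for k, value in enumerate(wcss, start=1):
    print(k, round(value, 1))
```

Plotting `wcss` against `k` (for example with matplotlib's `plt.plot(range(1, 11), wcss)`) gives the elbow curve; the bend where the decrease flattens out suggests the number of clusters.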
- Calculate silhouette score for the original data.
Score:
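A minimal sketch of the silhouette calculation on unscaled data, assuming scikit-learn's `silhouette_score`. The data is synthetic, so the printed value is not the score reported on this page.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the original (unscaled) data.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.8,
    random_state=0,
)

# Cluster with k = 3, the value chosen from the elbow graph.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Silhouette score ranges from -1 (poor) to 1 (dense, well-separated clusters).
score = silhouette_score(X, labels)
print("silhouette (original data):", score)
```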
- Apply scaling and calculate silhouette score for the scaled data.
Score:
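Scaling before clustering can be sketched as follows, assuming scikit-learn's `StandardScaler` (the page does not say which scaler was used). The second feature of the synthetic data is multiplied by 100 purely to mimic features on very different scales.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic data; the second feature is inflated to mimic mismatched units.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 6], [-6, 6]],
    cluster_std=0.8,
    random_state=0,
)
X[:, 1] *= 100

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Cluster the scaled data and score it in the scaled space.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)
score_scaled = silhouette_score(X_scaled, labels)
print("silhouette (scaled data):", score_scaled)
```

Because Kmeans relies on Euclidean distance, a feature with a much larger range would otherwise dominate the clustering.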
- Apply PCA on original and scaled data.
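Applying PCA to both versions of the data can be sketched as below, assuming scikit-learn's `PCA` and a reduction to 2 components (the page does not state the number of components). The five-feature synthetic data is a hypothetical stand-in.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic five-feature data standing in for the ICP dataset.
X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=0)

# PCA on the original data.
pca_original = PCA(n_components=2)
X_pca = pca_original.fit_transform(X)

# PCA on the scaled data.
X_scaled = StandardScaler().fit_transform(X)
pca_scaled = PCA(n_components=2)
X_scaled_pca = pca_scaled.fit_transform(X_scaled)

# Fraction of total variance kept by the two components in each case.
kept_original = float(pca_original.explained_variance_ratio_.sum())
kept_scaled = float(pca_scaled.explained_variance_ratio_.sum())
print("variance kept (original):", kept_original)
print("variance kept (scaled):  ", kept_scaled)
```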
Bonus Question:
- Apply Kmeans on the PCA results: PCA followed by Kmeans, and scaling followed by PCA and Kmeans. Then plot the graph for the scaled, PCA and Kmeans case.
Scores:
Graph:
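The two bonus pipelines can be sketched together as follows, again assuming scikit-learn and synthetic data. The helper `cluster_and_score` is a hypothetical name introduced here for illustration, and the scores it prints are not the ones reported on this page.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic five-feature data standing in for the ICP dataset.
X, _ = make_blobs(n_samples=300, n_features=5, centers=3, random_state=0)


def cluster_and_score(data, k=3):
    """Fit Kmeans with k clusters and return (labels, silhouette score)."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    return labels, silhouette_score(data, labels)


# Pipeline 1: PCA, then Kmeans.
X_pca = PCA(n_components=2).fit_transform(X)
labels_pca, score_pca = cluster_and_score(X_pca)

# Pipeline 2: scaling, then PCA, then Kmeans.
X_scaled = StandardScaler().fit_transform(X)
X_scaled_pca = PCA(n_components=2).fit_transform(X_scaled)
labels_sp, score_sp = cluster_and_score(X_scaled_pca)

print("PCA + Kmeans:          ", score_pca)
print("Scaled + PCA + Kmeans: ", score_sp)
```

For the graph, the two columns of `X_scaled_pca` can be used as the x and y axes of a scatter plot, with `labels_sp` as the point colors (for example `plt.scatter(X_scaled_pca[:, 0], X_scaled_pca[:, 1], c=labels_sp)` in matplotlib).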