ICP 6 - Saiaishwaryapuppala/CSEE5590_python_Icp GitHub Wiki

Python and Deep Learning: Special Topics

Rajeshwari Sai Aishwarya Puppala

Student ID: 16298162

Class ID: 35

In-class programming: 6


1. Apply K-means clustering to the data set provided below:


Replace any null values with the mean of the column. Use the elbow method to find a good number of clusters for the KMeans algorithm.

Calculate the silhouette score for the above clustering.

Apply PCA on the same dataset.


  • Import the required libraries: seaborn, pandas, matplotlib, KMeans (from scikit-learn), and pathlib.
  • Load the data set into a data frame, using pathlib to build the file path.
  • Print the tenure counts to get a better feel for the data.
  • Find the null counts in the data set.
  • The null counts show that Minimum_Payments and Credit_limit contain null values.
  • Replace them with the mean of the respective feature.
  • Divide the data set into train and test.
  • Use the elbow method to find the optimal number of clusters.
  • As the plot shows, 3 is the optimal number of clusters.
  • Run KMeans with 3 clusters and calculate the silhouette_score for that clustering.
  • Standardize the data and apply PCA to the standardized result.
  • Cluster the PCA output and check whether the silhouette score improves.
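The imputation, elbow, and silhouette steps above can be sketched roughly as follows. This is a minimal illustration, not the code from this ICP. It runs on a small synthetic data frame instead of the real data set. The column names Minimum_Payments and Credit_limit are taken from the steps above, while the values, the injected nulls, the range of k, and the `random_state` settings are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the real data set (hypothetical values):
# three well-separated blobs in two features.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
points = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])
df = pd.DataFrame(points, columns=["Minimum_Payments", "Credit_limit"])

# Inject a few nulls so the imputation step has something to do.
df.iloc[[3, 70], 0] = np.nan
df.iloc[[120], 1] = np.nan

# Null counts per column, then replace nulls with each column's mean.
print(df.isnull().sum())
df = df.fillna(df.mean())

# Elbow method: record the within-cluster sum of squares (inertia) for each k.
# Plotting wcss against k gives the elbow plot.
wcss = []
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(df)
    wcss.append(model.inertia_)

# Silhouette score for the chosen number of clusters (3, as in the steps above).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(df)
score = silhouette_score(df, labels)
print("silhouette score:", score)
```

The silhouette score lies between -1 and 1, and values closer to 1 indicate tighter, better separated clusters.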

Source Code


Elbow Plot

Null counts and Silhouette_score before standardization

After PCA and Silhouette_score after standardization
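One way to compare the silhouette score after standardization with the score after PCA is sketched below. It is an illustration under stated assumptions, not the original source code or its results. The data is synthetic, the helper `kmeans_silhouette` is a hypothetical name introduced here, and `n_components=2` is an assumed choice because the write-up does not state how many components were kept.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): three blobs in five features.
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [8.0, 8.0, 0.0, 0.0, 0.0],
    [0.0, 8.0, 8.0, 0.0, 0.0],
])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 5)) for c in centers])

# Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Project the standardized data onto its first two principal components
# (the number of components is an assumption for this sketch).
X_pca = PCA(n_components=2).fit_transform(X_scaled)


def kmeans_silhouette(data, k=3):
    """Hypothetical helper: cluster with KMeans and return the silhouette score."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    return silhouette_score(data, labels)


score_scaled = kmeans_silhouette(X_scaled)
score_pca = kmeans_silhouette(X_pca)
print("after standardization:", score_scaled)
print("after PCA:", score_pca)
```

Whether the score improves after PCA depends on the data, so the two printed values should be compared for the actual data set rather than assumed.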