ICP 6 - Saiaishwaryapuppala/CSEE5590_python_Icp GitHub Wiki

Python and Deep Learning: Special Topics

Rajeshwari Sai Aishwarya Puppala

Student ID: 16298162

Class ID: 35

In class programming: 6

Objectives:

1. Apply K-means clustering to the data set provided below:

Replace any null values with the mean of the column. Use the elbow method to find a good number of clusters for the KMeans algorithm.

Calculate the silhouette score for the above clustering.

Apply PCA on the same dataset.

Import the required libraries: seaborn, pandas, matplotlib, KMeans (from scikit-learn), and pathlib.
Load the data set into a data frame with the help of pathlib.
To get a first feel for the data, print the tenure counts.
Now find the null counts in the data set.
The null counts show that Minimum_Payments and Credit_limit contain null values.
Replace them with the mean of the respective feature.
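
The null-handling steps above can be sketched as follows. This is a minimal illustration, not the exact code used for this ICP: the small data frame stands in for the real data set, its values are made up, and the `Tenure` column name is an assumption (only Minimum_Payments and Credit_limit are named above).

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data set; values are invented for illustration.
# "Tenure" is an assumed column name.
df = pd.DataFrame({
    "Tenure": [12, 12, 10, 12, 6, 12],
    "Credit_limit": [1000.0, 7000.0, np.nan, 7500.0, 1200.0, 1800.0],
    "Minimum_Payments": [139.5, 1072.3, 627.3, np.nan, 244.8, np.nan],
})

# Number of rows for each tenure value.
tenure_counts = df["Tenure"].value_counts()

# Null count per column before the fix.
nulls_before = df.isnull().sum()

# Replace nulls in each numeric column with that column's mean.
df_filled = df.fillna(df.mean(numeric_only=True))
nulls_after = df_filled.isnull().sum()

print(tenure_counts)
print(nulls_before)
print(nulls_after)
```

With the real file, the data frame would instead come from something like `pd.read_csv(Path(...))`, where the path depends on where the data set is stored.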
Divide the data set into train and test.
Now use the elbow method to find the optimal number of clusters.
As the plot shows, 3 is the optimal number of clusters.
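
A minimal sketch of the elbow method, assuming synthetic data with three well-separated groups in place of the real data set. The range of cluster counts (1 to 7) and the `n_init` and `random_state` settings are illustrative choices, not values taken from this ICP.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in data: three well-separated blobs in 2D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

# Fit KMeans for each candidate k and record the within-cluster
# sum of squares (exposed by scikit-learn as inertia_).
ks = list(range(1, 8))
wcss = []
for k in ks:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(km.inertia_)

for k, value in zip(ks, wcss):
    print(k, round(value, 1))
```

Plotting `wcss` against `ks` (for example with `plt.plot(ks, wcss)`) gives the elbow curve; the "elbow" is the point after which adding clusters reduces the within-cluster sum of squares only slightly.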
Now fit KMeans with 3 clusters and calculate the silhouette_score for the resulting clustering.
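
The silhouette calculation might look like the sketch below. It again uses synthetic three-blob data as a stand-in for the real features, so the printed score is not the score obtained in this ICP.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: three well-separated blobs in 2D.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(50, 2)) for c in centers])

# Cluster into 3 groups and get one label per row.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Silhouette score lies in [-1, 1]; higher means tighter,
# better-separated clusters.
score = silhouette_score(X, labels)
print(score)
```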
Standardize the features, then apply PCA to the standardized data.
Re-run the clustering on the PCA output and check whether the silhouette score improves.
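
A hedged sketch of this comparison is shown below. The data is synthetic (three groups, four features on very different scales), the helper `kmeans_silhouette` is a hypothetical name introduced for illustration, and keeping 2 principal components is an assumption rather than the setting used in this ICP.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 3 groups, 4 features with very different scales.
rng = np.random.default_rng(0)
centers = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [5.0, 5000.0, 0.5, 50.0],
    [-5.0, 10000.0, 1.0, 100.0],
])
scales = np.array([1.0, 1000.0, 0.1, 10.0])
X = np.vstack([c + rng.normal(size=(60, 4)) * scales for c in centers])


def kmeans_silhouette(data, k=3):
    """Hypothetical helper: cluster `data` into k groups and score the result."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(data)
    return silhouette_score(data, labels)


# Baseline: cluster the raw features.
raw_score = kmeans_silhouette(X)

# Standardize each feature to zero mean and unit variance, then project
# onto 2 principal components (an assumed choice).
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Cluster again on the PCA output and compare the two scores.
pca_score = kmeans_silhouette(X_pca)

print("raw:", raw_score)
print("after scaling + PCA:", pca_score)
print("variance explained:", pca.explained_variance_ratio_.sum())
```

Whether the score goes up or down depends on the data, so the two numbers should be compared rather than assumed.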