PCA Model - setiamanlhc/python-snippet-code GitHub Wiki
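
The steps below use a DataFrame `df` and a dataset object `cancer` without defining them. A minimal setup sketch, assuming (the page does not say this) that `cancer` is scikit-learn's built-in breast cancer dataset and that `df` holds its feature columns:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Assumption: `cancer` is scikit-learn's breast cancer dataset. The page only
# shows that it exposes 'target' and 'feature_names'.
cancer = load_breast_cancer()

# One row per sample, one column per feature
df = pd.DataFrame(cancer['data'], columns=cancer['feature_names'])

print(df.shape)
```

Any other numeric DataFrame would work for the PCA steps; only the plotting and heatmap steps rely on `cancer['target']` and `cancer['feature_names']`.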

  1. Standardize the features in df (zero mean, unit variance) and store the result in scaled_data
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(df)

scaled_data = scaler.transform(df)
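
Scaling matters because PCA looks for directions of maximal variance, so a feature measured in large units would otherwise dominate the components. The effect of `StandardScaler` can be checked on toy data; this is a sketch with made-up values (`X` is hypothetical, not the page's `df`):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two features on very different scales
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(100.0, 20.0, size=200),
                     rng.normal(0.5, 0.01, size=200)])

X_scaled = StandardScaler().fit_transform(X)

# After scaling, each column has mean ~0 and standard deviation ~1
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)
print(col_means, col_stds)
```

`fit_transform` combines the separate `fit` and `transform` calls used in the step above.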
  1. Create a PCA object that keeps two components and fit it to the scaled data
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(scaled_data)
  1. Transform the data to its first two principal components
x_pca = pca.transform(scaled_data)
# Compare shapes: same number of rows, but only 2 columns after PCA
print(scaled_data.shape)
print(x_pca.shape)
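
Before relying on a two-component projection, it can help to check how much of the total variance those components retain. A fitted `PCA` object exposes this as `explained_variance_ratio_`. The sketch below assumes the breast cancer dataset as a stand-in for the page's `df`, and `pca_demo` is a name introduced here for illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the breast cancer dataset stands in for the page's `df`
X = load_breast_cancer()['data']
X_scaled = StandardScaler().fit_transform(X)

pca_demo = PCA(n_components=2).fit(X_scaled)

# Fraction of total variance captured by each kept component
ratios = pca_demo.explained_variance_ratio_
print(ratios, ratios.sum())
```

If the summed ratio is low, the 2-D scatter plot hides much of the structure in the data and a larger `n_components` may be worth trying.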
  1. Plot the two components in a scatter plot, colored by target class
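import matplotlib.pyplot as plt

# cancer['target'] supplies the class label used to color each point
plt.figure(figsize=(8, 6))
plt.scatter(x_pca[:, 0], x_pca[:, 1], c=cancer['target'], cmap='plasma')
plt.xlabel('First principal component')
plt.ylabel('Second principal component')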
  1. Interpret the components
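import pandas as pd
import seaborn as sns

# Display the components: a NumPy array of shape (n_components, n_features)
pca.components_

# Convert to a DataFrame with one column per original feature
df_comp = pd.DataFrame(pca.components_, columns=cancer['feature_names'])

# Plot a heatmap of each feature's weight in each component
plt.figure(figsize=(12, 6))
sns.heatmap(df_comp, cmap='plasma')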
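
The heatmap gives a visual overview; the same information can be read off numerically by ranking features by the absolute value of their weight in a component. A minimal sketch, again assuming the breast cancer dataset and using illustrative names (`pca_demo`, `loadings`, `top_first`) that are not part of the original page:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: same stand-in dataset as the page's `cancer` object
data = load_breast_cancer()
X_scaled = StandardScaler().fit_transform(data['data'])
pca_demo = PCA(n_components=2).fit(X_scaled)

# Rows are components, columns are the original features
loadings = pd.DataFrame(pca_demo.components_, columns=data['feature_names'])

# Five features with the largest absolute weight in the first component
top_first = loadings.loc[0].abs().sort_values(ascending=False).head(5)
print(top_first)
```

Taking the absolute value matters because the sign of a component is arbitrary: a large negative weight contributes as strongly as a large positive one.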