PCA Model - setiamanlhc/python-snippet-code GitHub Wiki

Scale the data in df to zero mean and unit variance, storing the result in scaled_data

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(df)

scaled_data = scaler.transform(df)
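
As a quick check of what the scaling step does, the sketch below compares StandardScaler with a manual (x - mean) / std computation. The small DataFrame `df_demo` is made up for illustration and only stands in for df.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for df
df_demo = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0],
                        "b": [10.0, 20.0, 30.0, 50.0]})

# fit_transform() is shorthand for fit() followed by transform()
scaled_demo = StandardScaler().fit_transform(df_demo)

# StandardScaler divides by the population standard deviation (ddof=0)
manual = (df_demo - df_demo.mean()) / df_demo.std(ddof=0)

# Both approaches should agree up to floating-point error
print(np.allclose(scaled_demo, manual.to_numpy()))
```

Scaling matters here because PCA is driven by variance, so a feature measured in large units would otherwise dominate the components.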

Create the PCA object and fit it to the scaled data

from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(scaled_data)
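
After fitting, it can be useful to see how much of the total variance the two components retain. The sketch below reads the `explained_variance_ratio_` attribute of the fitted PCA object. It assumes the data is scikit-learn's breast cancer dataset, which the page does not state explicitly; substitute your own scaled_data as needed.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the dataset behind df is load_breast_cancer()
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])

pca = PCA(n_components=2)
pca.fit(scaled_data)

# Fraction of the total variance captured by each component
ratio = pca.explained_variance_ratio_
print(ratio)

# Fraction retained by both components together
print(ratio.sum())
```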

Transform the data to its first two principal components

x_pca = pca.transform(scaled_data)
scaled_data.shape  # (n_samples, n_features)
x_pca.shape        # (n_samples, 2)
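
To make the transform step less of a black box, the sketch below reproduces it by hand: with the default settings, `transform` subtracts the fitted mean and projects onto the rows of `components_`. The random matrix `X` is hypothetical data standing in for scaled_data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data standing in for scaled_data: 100 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

pca = PCA(n_components=2).fit(X)
x_pca = pca.transform(X)

# Manual equivalent: center with the fitted mean, then project onto the components
manual = (X - pca.mean_) @ pca.components_.T

print(X.shape, x_pca.shape)        # (100, 5) (100, 2)
print(np.allclose(x_pca, manual))  # True
```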

Plot the two components using a scatter plot

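import matplotlib.pyplot as plt

# cancer is assumed to be the dataset df was built from, e.g. sklearn.datasets.load_breast_cancer()
plt.figure(figsize=(8,6))
plt.scatter(x_pca[:,0],x_pca[:,1],c=cancer['target'],cmap='plasma')
plt.xlabel('First principal component')
plt.ylabel('Second principal component')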
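
A variant of the scatter plot that draws each class separately gives a legend with readable class names instead of a color scale. This is only a sketch: it assumes `cancer` comes from scikit-learn's `load_breast_cancer()`, and the output filename `pca_scatter.png` is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for scripts; omit in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the dataset behind df is load_breast_cancer()
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])
x_pca = PCA(n_components=2).fit_transform(scaled_data)

fig, ax = plt.subplots(figsize=(8, 6))

# One scatter call per class so that each class gets its own legend entry
for label, name in enumerate(cancer["target_names"]):
    mask = cancer["target"] == label
    ax.scatter(x_pca[mask, 0], x_pca[mask, 1], label=name, alpha=0.7)

ax.set_xlabel("First principal component")
ax.set_ylabel("Second principal component")
ax.legend()
fig.savefig("pca_scatter.png")
```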

Interpreting the components

# Display the components: a NumPy array of shape (n_components, n_features)
pca.components_

# Convert to a DataFrame: one row per component, one column per feature
import pandas as pd
df_comp = pd.DataFrame(pca.components_,columns=cancer['feature_names'])

# Plot a heatmap of how strongly each feature contributes to each component
import seaborn as sns
plt.figure(figsize=(12,6))
sns.heatmap(df_comp,cmap='plasma')
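
The heatmap can be hard to read with 30 feature columns, so one possible complement is to list the features with the largest absolute weight in each component. The sketch below again assumes the breast cancer dataset from scikit-learn; the row labels "PC1" and "PC2" are names chosen here for illustration.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the dataset behind df is load_breast_cancer()
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])
pca = PCA(n_components=2).fit(scaled_data)

# One row per component, one column per original feature
df_comp = pd.DataFrame(pca.components_,
                       columns=cancer["feature_names"],
                       index=["PC1", "PC2"])

# For each component, list the five features with the largest absolute weight
for pc in df_comp.index:
    top = df_comp.loc[pc].abs().sort_values(ascending=False).head(5)
    print(pc, list(top.index))
```

The sign of a weight only indicates direction along the component, which is why the ranking uses absolute values.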