PCA Model - setiamanlhc/python-snippet-code GitHub Wiki
- Standardize the features in df and store the result in scaled_data
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(df)
scaled_data = scaler.transform(df)
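A minimal sketch of what the scaling step does, assuming a small made-up DataFrame in place of df (the column names and values are hypothetical). After scaling, every column should have mean 0 and standard deviation 1, which keeps features with large units from dominating the PCA.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame standing in for df
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "weight_kg": [50.0, 60.0, 80.0, 90.0],
})

scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)  # fit and transform in one call

# Per-column mean and (population) standard deviation after scaling
col_means = scaled_data.mean(axis=0)
col_stds = scaled_data.std(axis=0)
print(col_means, col_stds)
```

fit_transform is equivalent to calling fit and then transform on the same data. Keep the two calls separate when the same scaler must later be applied to new data.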
- Create a PCA object and fit it to the scaled data
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(scaled_data)
- Transform the data to its first 2 principal components
x_pca = pca.transform(scaled_data)
scaled_data.shape  # (n_samples, n_features)
x_pca.shape        # (n_samples, 2)
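To check how much information the two components keep, a sketch like the following may help. It assumes the data is scikit-learn's bundled breast cancer dataset (suggested by the cancer variable used on this page, but an assumption here) and reads explained_variance_ratio_ from the fitted PCA.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled breast cancer dataset stands in for df
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])

pca = PCA(n_components=2)
x_pca = pca.fit_transform(scaled_data)  # fit and project in one call

print(scaled_data.shape)  # (569, 30)
print(x_pca.shape)        # (569, 2)

# Fraction of the total variance captured by each kept component
explained = pca.explained_variance_ratio_
print(explained, explained.sum())
```

If the summed ratio is low, two components may not be enough to summarize the data, and a larger n_components is worth trying.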
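- Plot the two components using a scatter plot
import matplotlib.pyplot as plt
# cancer is assumed to be the dataset object returned by sklearn.datasets.load_breast_cancer()
plt.figure(figsize=(8,6))
plt.scatter(x_pca[:,0],x_pca[:,1],c=cancer['target'],cmap='plasma')
plt.xlabel('First principal component')
plt.ylabel('Second principal component')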
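A self-contained version of the scatter plot, as a sketch. It assumes the breast cancer dataset from scikit-learn and adds a colorbar so the class colors can be read. The Agg backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled breast cancer dataset is the data being plotted
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])
x_pca = PCA(n_components=2).fit_transform(scaled_data)

fig, ax = plt.subplots(figsize=(8, 6))
# Color each point by its class label
points = ax.scatter(x_pca[:, 0], x_pca[:, 1], c=cancer["target"], cmap="plasma")
ax.set_xlabel("First principal component")
ax.set_ylabel("Second principal component")
fig.colorbar(points, ax=ax, label="target (0 = malignant, 1 = benign)")
```

Call fig.savefig with a file name of your choice to write the figure to disk.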
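- Interpret the components
import pandas as pd
import seaborn as sns
# Display the components. This is a NumPy array of shape (n_components, n_features)
pca.components_
# Convert to a DataFrame with one row per component and one column per feature
df_comp = pd.DataFrame(pca.components_,columns=cancer['feature_names'])
# Plot a heatmap of the feature weights
plt.figure(figsize=(12,6))
sns.heatmap(df_comp,cmap='plasma')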
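The heatmap can be hard to read with 30 features, so one option is to list the features with the largest absolute weight on each component. The sketch below assumes the breast cancer dataset again. The PC1/PC2 row labels and the top_features name are introduced here for illustration only.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled breast cancer dataset stands in for df
cancer = load_breast_cancer()
scaled_data = StandardScaler().fit_transform(cancer["data"])
pca = PCA(n_components=2).fit(scaled_data)

# One row per component, one column per original feature
df_comp = pd.DataFrame(
    pca.components_,
    columns=cancer["feature_names"],
    index=["PC1", "PC2"],  # hypothetical row labels
)

# Five features with the largest absolute weight on each component
top_features = {
    pc: df_comp.loc[pc].abs().nlargest(5).index.tolist()
    for pc in df_comp.index
}
print(top_features)
```

Each row of pca.components_ is a unit-length vector, so the weights are comparable across features within a component. The sign of a weight only gives the direction of the relationship.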