Visualizing dataset - Nori12/Machine-Learning-Tutorial GitHub Wiki

Machine Learning Tutorial

Visualizing the dataset

Look at the data before applying any algorithm: plotting it can reveal anomalies and show whether the classes look separable from the available features. This example uses the iris dataset and the k-nearest neighbors (KNN) algorithm.
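Alongside a plot, a quick numeric summary is another way to look at the data first. The following is a minimal sketch, not part of the original tutorial; the names `features` and `class_counts` are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

iris_dataset = load_iris()

# Wrap the feature matrix in a DataFrame so pandas can summarize it.
# `features` is a hypothetical name chosen for this sketch.
features = pd.DataFrame(iris_dataset['data'],
                        columns=iris_dataset['feature_names'])

print(features.shape)       # (number of samples, number of features)
print(features.describe())  # per-feature count, mean, std, min, quartiles, max

# Count how many samples belong to each class (hypothetical name `class_counts`).
class_counts = dict(zip(iris_dataset['target_names'],
                        np.bincount(iris_dataset['target'])))
print(class_counts)
```

Checking the class counts early shows whether the classes are balanced, which affects how a later accuracy score should be read.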

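import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()

# Split into training and test sets (by default 75% / 25%); random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# Label the columns with the feature names so the plot axes are readable.
iris_dataframe = pd.DataFrame(X_train, columns=iris_dataset['feature_names'])

# Pairwise scatter plots of all features, colored by class label,
# with a histogram of each feature on the diagonal.
grr = pd.plotting.scatter_matrix(iris_dataframe, c=y_train, figsize=(15, 15), marker='o',
                                 hist_kwds={'bins': 20}, s=60, alpha=.8)
plt.show()  # needed when running as a script; notebooks render the figure inline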
[Image: scatter matrix of the iris training features, with points colored by class label]
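Once the data has been inspected, the k-nearest neighbors classifier mentioned above can be fitted on the same split. This is a minimal sketch rather than the tutorial's own code; `n_neighbors=1` and the name `test_accuracy` are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_dataset = load_iris()

# Same split as in the visualization code above.
X_train, X_test, y_train, y_test = train_test_split(
    iris_dataset['data'], iris_dataset['target'], random_state=0)

# n_neighbors=1 is an illustrative assumption, not a value taken from the tutorial.
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

# Fraction of held-out samples that are classified correctly.
test_accuracy = knn.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Scoring on `X_test` rather than `X_train` matters here: with one neighbor, every training point is its own nearest neighbor, so training accuracy says little about how the model generalizes.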