Cheatsheet - Achronus/Machine-Learning-101 GitHub Wiki

Table of Contents

General

Scikit-Learn (Sklearn)

General

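  • .fit() - learns the internal parameters of a model (or transformer) from the training data

  • .transform() - applies the parameters learned by .fit() to new or existing data

  • .fit_transform() - runs .fit() and then .transform() on the same data in one call

  • .predict() - uses a fitted model to make predictions on new data

  • from xgboost import XGBClassifier - gradient boosting classifier from the separate XGBoost library, with a scikit-learn-style API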
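
The difference between .fit(), .transform(), .fit_transform() and .predict() is easiest to see on a tiny dataset. The following is a minimal sketch, assuming StandardScaler as the transformer and LogisticRegression as the model; the arrays and variable names are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 training rows, 2 features, binary labels.
X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0], [4.0, 500.0]])
y_train = np.array([0, 0, 1, 1])
X_new = np.array([[1.5, 250.0], [3.5, 450.0]])

# .fit() learns the scaler's internal parameters (per-feature mean and scale).
scaler = StandardScaler()
scaler.fit(X_train)

# .transform() applies those learned parameters to data.
X_train_scaled = scaler.transform(X_train)

# .fit_transform() does both steps in one call on the same data.
X_train_scaled_alt = StandardScaler().fit_transform(X_train)

# New data is only transformed, never re-fitted, so it is scaled
# with the statistics learned from the training set.
X_new_scaled = scaler.transform(X_new)

# .fit() on a model learns its parameters; .predict() uses them.
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
predictions = model.predict(X_new_scaled)
```

Calling only .transform() on new data keeps it on the same scale as the data the model was trained on.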

Data Preprocessing

Model Selection

  • from sklearn.model_selection import ...

Accuracy & Predictions

  • from sklearn.metrics import confusion_matrix - counts correct and incorrect predictions per class, used to evaluate a trained classifier
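
As a rough illustration of how a confusion matrix relates to accuracy, the sketch below uses hand-written labels (not results from any real model). Rows correspond to the true class and columns to the predicted class.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions for a binary classifier.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# cm[i, j] = number of samples with true class i predicted as class j.
cm = confusion_matrix(y_true, y_pred)

# Correct predictions sit on the diagonal, so accuracy is trace / total.
accuracy = cm.trace() / cm.sum()

# accuracy_score computes the same value directly from the labels.
accuracy_direct = accuracy_score(y_true, y_pred)
```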

Models

  • from sklearn.preprocessing import PolynomialFeatures - generates polynomial feature terms, which a linear model can then fit to perform Polynomial Regression

  • from sklearn.svm import ...

    • SVC - the model class for Support Vector Classification (linear and kernel SVM)
    • SVR - the model class for Support Vector Regression
  • from sklearn.linear_model import ...

  • from sklearn.tree import ...

  • from sklearn.ensemble import ...

  • from sklearn.neighbors import KNeighborsClassifier - K-Nearest Neighbours classification model

  • from sklearn.naive_bayes import GaussianNB - Gaussian Naive Bayes model

  • import scipy.cluster.hierarchy as sch - SciPy module that can be used for dendrogram creation in Hierarchical Clustering

  • from sklearn.cluster import ...

  • from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA - Linear Discriminant Analysis model

  • from sklearn.decomposition import ...

    • PCA - Principal Component Analysis model
    • KernelPCA - Kernel PCA model

Keras

  • from keras.models import Sequential - a linear stack of layers; the basic container for building a model

  • from keras.layers import ...

    • Dense - a fully connected layer
    • Dropout - randomly drops a fraction of units during training to reduce overfitting
    • Flatten - flattens the multi-dimensional output of convolutional layers into a 1D vector
    • Conv2D - a basic 2D convolutional layer
    • MaxPooling2D - applies max pooling to the output of a convolutional layer
  • from keras.wrappers.scikit_learn import KerasClassifier - wraps a Keras model as a scikit-learn estimator so it can be used with scikit-learn tools such as cross-validation and grid search
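
The layers listed above are typically stacked inside a Sequential model. The sketch below shows one possible arrangement for a small image classifier; the input shape (28x28 grayscale), layer sizes, and 10 output classes are assumptions chosen for illustration, and the model is built and compiled but not trained.

```python
from keras import Input
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.models import Sequential

model = Sequential([
    Input(shape=(28, 28, 1)),                           # assumed 28x28 grayscale images
    Conv2D(32, kernel_size=(3, 3), activation="relu"),  # convolutional feature extraction
    MaxPooling2D(pool_size=(2, 2)),                     # downsample the feature maps
    Flatten(),                                          # 3D feature maps -> 1D vector
    Dense(64, activation="relu"),                       # fully connected hidden layer
    Dropout(0.5),                                       # drop half the units during training
    Dense(10, activation="softmax"),                    # assumed 10 output classes
])

# Integer class labels are assumed, hence the sparse categorical loss.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

Calling model.summary() prints the output shape and parameter count of each layer, which is a quick way to check that the stack is wired as intended.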