ICP2_7 - Hiresh12/Big-Data-Programming GitHub Wiki
Spark ICP: 7
Objective
- Perform classification using Naïve Bayes
- Perform classification using Decision Tree
- Perform classification using Random Forest
- Perform Clustering using K-means
- Perform Regression using Linear Regression
- Perform Regression using Logistic Regression
Approach for Classification Models
- Load the data into a dataset and select the label and class columns
- Load the libraries and the file
- Cast the column data types
- Create the features and output columns
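The preparation steps above can be sketched without Spark, purely to show what each step does to the data. This is a minimal illustration, not the wiki's own code: the CSV content and the column names (`age`, `hours_per_week`, `income`) are hypothetical, since the page does not name its dataset. In Spark ML the assembling step is commonly handled by `VectorAssembler`; whether the original work used it is not shown here.

```python
import csv
import io

# Hypothetical sample data; the wiki does not name its dataset or columns.
RAW_CSV = """age,hours_per_week,income
39,40,0
50,13,0
38,45,1
"""

FEATURE_COLUMNS = ["age", "hours_per_week"]  # assumed feature columns
LABEL_COLUMN = "income"                      # assumed label (output) column


def load_rows(text):
    """Read CSV text into a list of dicts; every value starts out as a string."""
    return list(csv.DictReader(io.StringIO(text)))


def cast_row(row):
    """Cast every column from string to float."""
    return {name: float(value) for name, value in row.items()}


def assemble(row):
    """Pack the feature columns into one vector and pair it with the label."""
    features = [row[name] for name in FEATURE_COLUMNS]
    return features, row[LABEL_COLUMN]


rows = [cast_row(r) for r in load_rows(RAW_CSV)]
dataset = [assemble(r) for r in rows]
print(dataset[0])
```

The same three functions map onto the list items: loading, casting, and creating the features and output column.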
Fit a Naïve Bayes model to the data, transform the data with it, and print the model's accuracy
Fit a Decision Tree model to the data, transform the data with it, and print the model's accuracy
Fit a Random Forest model to the data, transform the data with it, and print the model's accuracy
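All three classifiers follow the same fit, predict, and score pattern. As a rough illustration of that pattern, here is a library-free sketch using a Gaussian Naïve Bayes classifier. The Gaussian variant is chosen only for brevity and may differ from the variant used in the original work; the toy data is hypothetical. Accuracy is the fraction of predictions that match the true labels.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(features, labels):
    """Estimate per-class priors, feature means and feature variances."""
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps the variance away from zero.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(features), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for one feature vector."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var)
            score -= (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical, clearly separated toy data.
train_x = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [6.0, 7.0], [6.2, 6.8], [5.8, 7.2]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[1.1, 2.1], [5.9, 6.9]]
test_y = [0, 1]

model = fit_gaussian_nb(train_x, train_y)
predictions = [predict(model, x) for x in test_x]
print("accuracy:", accuracy(predictions, test_y))
```

Swapping in a different classifier changes only the fit and predict functions; the accuracy calculation stays the same.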
Approach for Clustering Method - K-means
- Load the data into a dataset and select the label and class columns
- Load the libraries and the file
- Cast the column data types
- Create the features and output columns
- Fit the model to the data and transform the data with it
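What fitting and transforming mean for K-means can be sketched in plain Python: fitting finds the cluster centers, and transforming assigns each point to its nearest center. The sketch below uses Lloyd's algorithm with hypothetical points and a fixed choice of initial centers; it is an illustration of the technique, not the Spark implementation.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
        for p in points
    ]


def update(points, assignments, centers):
    """Move each center to the mean of its assigned points."""
    new_centers = []
    for i, old in enumerate(centers):
        members = [p for p, a in zip(points, assignments) if a == i]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(old)  # keep an empty cluster's center in place
    return new_centers


def kmeans(points, centers, max_iterations=20):
    """Lloyd's algorithm: alternate assignment and update until stable."""
    for _ in range(max_iterations):
        assignments = assign(points, centers)
        new_centers = update(points, assignments, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Hypothetical two-dimensional points forming two visible groups.
points = [[1.0, 1.0], [1.5, 2.0], [1.0, 1.5], [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
initial_centers = [points[0], points[3]]  # fixed start for reproducibility
centers, clusters = kmeans(points, initial_centers)
print(centers, clusters)
```

Real implementations usually pick the initial centers randomly or with a seeding heuristic, so results can vary between runs unless a seed is fixed.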
Approach for Regression Methods - Linear Regression
- Load the data into a dataset and select the label and class columns
- Load the libraries and the file
- Cast the column data types
- Create the features and output columns
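Once the features and output column exist, fitting a linear regression amounts to finding the coefficients that minimise the squared error. A minimal single-feature sketch with hypothetical data follows; it uses the closed-form least-squares solution, which is not necessarily the solver the original work relied on.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on the given data."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(errors) / len(errors)) ** 0.5


# Hypothetical data that roughly follows y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
print("slope:", slope, "intercept:", intercept)
print("rmse:", rmse(xs, ys, slope, intercept))
```

Regression models are scored with an error measure such as RMSE rather than with accuracy, because the output is a continuous value.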
Approach for Regression Methods - Logistic Regression
- Load the data into a dataset and select the label and class columns
- Load the libraries and the file
- Cast the column data types
- Create the features and output columns
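Despite its name, logistic regression predicts the probability of a class, so its output is usually thresholded into a label. The sketch below fits one with batch gradient descent on hypothetical, already centered single-feature data. The learning rate and epoch count are illustrative assumptions, and the optimiser is not the one Spark uses.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def fit_logistic(features, labels, learning_rate=0.1, epochs=500):
    """Batch gradient descent on the mean log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(weights, bias, x) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical centered data: negative values are class 0, positive are class 1.
features = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic(features, labels)
predictions = [1 if predict_proba(weights, bias, x) >= 0.5 else 0 for x in features]
print(predictions)
```

Because the predictions are class labels, this model can be scored with accuracy in the same way as the classification models.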