ICP 14 - Gnkhakimova/CS5590-BigData GitHub Wiki
ICP 14
Source Code
Video
The task for this ICP was to run several machine learning algorithms on datasets.
Decision Tree
A decision tree is a machine learning algorithm used for classification problems. It uses a tree-like graph structure in which each level makes a decision based on a feature value, until a leaf gives the predicted class.
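To make the level-by-level decisions concrete, here is a minimal sketch in plain Python. It is not the code used in this ICP: the function names (`gini`, `best_split`, `build_tree`, `predict`), the Gini impurity criterion, and the toy data are all assumptions chosen for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means the list is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively build the tree; a leaf is simply the majority class label."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict(tree, row):
    """Walk from the root down, taking one decision per level, until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree

# Made-up toy data: [height_cm, weight_kg] -> label
X = [[150, 50], [160, 55], [170, 80], [180, 90]]
y = ["small", "small", "large", "large"]
tree = build_tree(X, y)
print(predict(tree, [155, 52]))  # small
print(predict(tree, [175, 85]))  # large
```

On this toy data a single split on the first feature is enough, so the tree has one decision node and two leaves.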
Random Forest
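Random forest is an ensemble algorithm that consists of several decision trees. At the end, the predictions of the individual trees are combined, typically by majority vote for classification or by averaging for regression.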
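The idea of combining several trees can be sketched in plain Python as follows. This is a deliberately simplified illustration, not the ICP code: each "tree" is only a one-level stump, each stump sees one randomly chosen feature, and all names (`train_stump`, `train_forest`, `forest_predict`) and the toy data are assumptions.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def train_stump(rows, labels, feature):
    """One-level tree on a single feature: keep the threshold with the fewest errors."""
    default = majority(labels)
    # (errors, threshold, left_label, right_label); threshold None means "always predict default"
    best = (sum(y != default for y in labels), None, default, default)
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted({r[feature] for r in rows})[:-1]:
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    _, t, l_lab, r_lab = best
    return feature, t, l_lab, r_lab

def stump_predict(stump, row):
    feature, t, l_lab, r_lab = stump
    return l_lab if t is None or row[feature] <= t else r_lab

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))           # one random feature per stump
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest

def forest_predict(forest, row):
    """Combine the individual predictions by majority vote."""
    return majority([stump_predict(s, row) for s in forest])

# Made-up toy data with two well-separated classes
X = [[1, 10], [2, 11], [3, 12], [7, 20], [8, 21], [9, 22]]
y = ["a", "a", "a", "b", "b", "b"]
forest = train_forest(X, y)
print(forest_predict(forest, [0, 9]), forest_predict(forest, [10, 23]))
```

A full random forest would grow deeper trees and pick a random feature subset at every split; the bootstrap sampling and the final vote are the parts this sketch keeps.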
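K-Means
K-Means is an unsupervised algorithm. Unsupervised algorithms make inferences from datasets using only input vectors, without referring to known, or labelled, outcomes. K-Means groups the input vectors into k clusters by assigning each point to its nearest cluster centroid.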
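A minimal plain-Python sketch of this kind of clustering is shown below. It assumes the standard assign-then-update loop (Lloyd's algorithm) and, for simplicity, uses the first k points as initial centroids; the function name `kmeans` and the toy points are made up for illustration and are not taken from the ICP source code.

```python
import math

def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Simplistic initialisation (an assumption of this sketch): the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Made-up toy data: two visually obvious groups, no labels supplied
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Note that no outcome labels are passed in; the cluster indices come only from the distances between input vectors.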
Logistic regression
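Logistic regression is a machine learning algorithm used for classification problems. It is a predictive analysis algorithm based on the concept of probability: it applies the logistic (sigmoid) function to a linear combination of the features to estimate the probability that an input belongs to a class.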
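The probability-based idea can be illustrated with a small plain-Python sketch trained by batch gradient descent. The function names (`train_logistic`, `predict_proba`), the learning rate, the number of epochs, and the toy data are all assumptions for illustration, not the settings used in this ICP.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, x):
    """Estimated probability that x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = predict_proba(weights, bias, x) - y  # gradient of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

# Made-up toy data: hours studied -> passed (1) or failed (0)
X = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = train_logistic(X, y)
for hours in (1.0, 3.5):
    p = predict_proba(weights, bias, [hours])
    print(hours, round(p, 3), "pass" if p >= 0.5 else "fail")
```

The model outputs a probability, and the class is obtained by thresholding it, here at 0.5.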
Naive Bayes
Naive Bayes classifiers are a collection of classification algorithms based on Bayes' theorem. Naive Bayes is not a single algorithm but a family of algorithms that share a common principle: every pair of features is assumed to be independent of each other, given the class.
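The independence assumption means the class score is just the class prior multiplied by one per-feature probability at a time. Below is a minimal plain-Python sketch for categorical features; the names (`train_nb`, `predict_nb`), the add-one (Laplace) smoothing, and the toy data are assumptions for illustration rather than the ICP implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count how often each class occurs and how often each feature value occurs per class."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (class, feature_index) -> Counter of values
    values = defaultdict(set)              # feature_index -> set of values seen in training
    for row, y in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(y, j)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values

def predict_nb(model, row):
    """Return the class with the highest log prior plus summed log feature likelihoods."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        log_p = math.log(n_c / total)  # prior P(class)
        for j, v in enumerate(row):
            # P(feature_j = v | class) with add-one smoothing so unseen values never give zero
            log_p += math.log((feature_counts[(c, j)][v] + 1) / (n_c + len(values[j])))
        scores[c] = log_p
    return max(scores, key=scores.get)

# Made-up toy data: (outlook, windy) -> decision
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no"), ("rainy", "yes")]
labels = ["play", "play", "stay", "stay", "play", "stay"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "no")))   # play
print(predict_nb(model, ("rainy", "yes")))  # stay
```

Working in log space turns the product of probabilities into a sum, which avoids numerical underflow when there are many features.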