ICP 14 - Gnkhakimova/CS5590-BigData GitHub Wiki

ICP 14

Source Code
Video
The task for this ICP was to run different machine learning algorithms on datasets.

Decision Tree

A decision tree is a machine learning algorithm used for classification problems. It uses a tree-like graph structure in which each level makes a decision by testing a feature, until a leaf assigns the final class.
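As a minimal sketch of this level-by-level decision process, the Python snippet below walks a small hand-built tree. The feature names, thresholds, and labels are illustrative assumptions, not values from this ICP; in practice the tree would be learned from training data.

```python
# Hypothetical hand-built tree: each internal node tests one feature
# against a threshold; each leaf holds a class label.
tree = {
    "feature": "petal_length", "threshold": 2.5,
    "left": {"label": "setosa"},
    "right": {
        "feature": "petal_width", "threshold": 1.7,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

def predict(node, sample):
    """Walk from the root to a leaf, making one decision per level."""
    while "label" not in node:
        go_left = sample[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

print(predict(tree, {"petal_length": 1.4, "petal_width": 0.2}))  # setosa
print(predict(tree, {"petal_length": 5.1, "petal_width": 2.0}))  # virginica
```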

Random Forest

A random forest is an algorithm that consists of several decision trees and combines their results at the end, typically by majority vote for classification or by averaging for regression.
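The combining step can be sketched as a majority vote. The three "trees" below are hypothetical one-rule stand-ins with made-up feature names; training each tree on a random sample of the data, which a real random forest does, is omitted here.

```python
from collections import Counter

# Three hypothetical "trees", each reduced to a single threshold rule.
def tree_a(x):
    return "spam" if x["links"] > 3 else "ham"

def tree_b(x):
    return "spam" if x["caps_ratio"] > 0.5 else "ham"

def tree_c(x):
    return "spam" if x["length"] < 20 else "ham"

def forest_predict(trees, sample):
    """Collect one vote per tree and return the most common label."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

sample = {"links": 5, "caps_ratio": 0.2, "length": 10}
print(forest_predict([tree_a, tree_b, tree_c], sample))  # spam (2 of 3 votes)
```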

K-Means

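K-Means is an unsupervised algorithm. Unsupervised algorithms make inferences from datasets using only input vectors, without referring to known, or labelled, outcomes. K-Means groups the input vectors into k clusters by assigning each one to its nearest cluster centre.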
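A minimal sketch of the standard K-Means loop (assign each point to the nearest centroid, then move each centroid to the mean of its points) might look like this. The sample points and the fixed starting centroids are assumptions chosen so the result is reproducible; library implementations usually pick starting centroids randomly.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and updating centroids."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# No labels are supplied: the grouping comes only from the input vectors.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```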

Logistic regression

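Logistic Regression is a machine learning algorithm used for classification problems. It is a predictive analysis algorithm based on the concept of probability: it estimates the probability that an input belongs to a class and classifies the input according to that probability.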
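To illustrate the probability-based idea, here is a small from-scratch sketch that fits a one-feature logistic regression with plain gradient descent. The data (hours studied versus pass/fail), the learning rate, and the epoch count are all illustrative assumptions, not settings from this ICP.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Hypothetical data: hours studied and whether the student passed (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train(hours, passed)

def predict_proba(x):
    """Estimated probability of the positive class for input x."""
    return sigmoid(w * x + b)

print(predict_proba(1.0))  # below 0.5, so classified as 0
print(predict_proba(4.0))  # above 0.5, so classified as 1
```

A threshold of 0.5 on the predicted probability is the usual way to turn the output into a class label.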

Naive Bayes

Naive Bayes classifiers are a family of classification algorithms based on Bayes’ Theorem. Rather than a single algorithm, they share a common principle: every pair of features is assumed to be independent of each other, given the class.
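The independence assumption means the score for a class is simply the class prior multiplied by one per-feature probability for each feature. The sketch below shows this for categorical features; the toy weather data and the add-one smoothing (which assumes each feature takes two possible values) are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train(samples, labels):
    """Count how often each class and each (class, feature, value) occurs."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts

def predict(model, features):
    """Pick the class with the highest prior * product of feature likelihoods."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            # Add-one smoothing, assuming two possible values per feature.
            score *= (feature_counts[label][i][value] + 1) / (count + 2)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical data: (outlook, wind) -> decision.
samples = [("sunny", "weak"), ("sunny", "weak"), ("sunny", "strong"),
           ("rainy", "strong"), ("rainy", "strong"), ("rainy", "weak")]
labels = ["play", "play", "play", "stay", "stay", "stay"]
model = train(samples, labels)

print(predict(model, ("sunny", "weak")))    # play
print(predict(model, ("rainy", "strong")))  # stay
```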