List of Models - AgileDataScienceUB/ADS4 GitHub Wiki
Logistic Regression
Logistic regression, despite its name, is a linear model for classification rather than regression. In this model, the probabilities describing the possible outcomes of a single trial are modeled using a logistic function.
Read more in the scikit-learn User Guide.
We will use the implementation in the LogisticRegression class from the sklearn package. This implementation can fit binary, One-vs-Rest, or multinomial logistic regression with optional L2 or L1 regularization.
Parameters
Our model will optimize the following parameters:
- penalty : str, ‘l1’ or ‘l2’. Specifies the norm used in the penalization.
- C : positive float. Inverse of regularization strength; smaller values mean stronger regularization.
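As a minimal sketch of how these two parameters could be optimized, the example below runs a cross-validated grid search with scikit-learn's GridSearchCV. The synthetic dataset, the candidate values for C, the liblinear solver, and the 5-fold setup are all assumptions made for illustration, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data (a stand-in for the project's dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical search grid over the two parameters listed above.
param_grid = {
    "penalty": ["l1", "l2"],
    "C": [0.01, 0.1, 1.0, 10.0],
}

# The liblinear solver is chosen here because it supports both l1 and l2.
model = LogisticRegression(solver="liblinear")

# 5-fold cross-validated search, scored by accuracy (both are assumptions).
search = GridSearchCV(model, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_
print(best_params, best_score)
```

After fitting, search.best_estimator_ holds a model refit on the full data with the selected penalty and C.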
XGBoost
XGBoost (eXtreme Gradient Boosting) is an advanced implementation of the gradient boosting algorithm. We will use the implementation in the xgboost package.
Parameters
There are several tuning techniques for this method. See the following guide
Random forest classifier
A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. By default, the sub-sample size is the same as the original input sample size, but the samples are drawn with replacement if bootstrap=True.
Parameters
- n_estimators : integer. The number of trees in the forest.
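A rough sketch of how n_estimators could be compared is shown below, using scikit-learn's RandomForestClassifier and cross_val_score. The synthetic data, the candidate values (10, 50, 100), and the 5-fold cross-validation are illustrative assumptions rather than the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (a stand-in for the project's dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical candidate values for the number of trees.
candidates = (10, 50, 100)

# Mean 5-fold cross-validated accuracy for each forest size.
scores = {}
for n in candidates:
    forest = RandomForestClassifier(n_estimators=n, bootstrap=True, random_state=0)
    scores[n] = cross_val_score(forest, X, y, cv=5).mean()

# Pick the forest size with the highest mean accuracy and refit on all data.
best_n = max(scores, key=scores.get)
final_forest = RandomForestClassifier(n_estimators=best_n, bootstrap=True, random_state=0)
final_forest.fit(X, y)
n_trees = len(final_forest.estimators_)
print(scores, best_n)
```

Larger forests usually cost more training time, so in practice the smallest value with acceptable accuracy may be preferable.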