Regression
- Linear Methods
  - Linear Regression
  - Logistic Regression
  - Stochastic Gradient Descent
- Shrinkage Methods
  - Ridge Regression
  - LASSO
  - LARS
- Derived Input Methods
  - Principal Components Regression
  - Partial Least Squares Regression
- Non-linear Methods
  - MARS
  - Polynomial Regression
  - LOESS
  - Splines
  - Generalized Additive Models (GAMs)
  - Isotonic Regression
- References
- Principal Components Regression: if features are correlated, run PCA first and then regress on a few principal components (see the sketch below)
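A minimal sketch of this idea, assuming scikit-learn; the synthetic data and the choice of two components are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=200)  # make two features strongly correlated
y = X @ np.array([1.0, 0.5, 0.0, 1.0, -0.5]) + rng.normal(size=200)

# PCA decorrelates the inputs; OLS is then fit on a few principal components
pcr = make_pipeline(PCA(n_components=2), LinearRegression())
pcr.fit(X, y)
print(pcr.score(X, y))  # R^2 on the training data
```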
- Isotonic Regression (notes below): focus here is only on 1-D linearly ordered isotonic regression, not general isotonic regression
- Non-parametric regression method to fit a non-decreasing function to data
- Similar to inexact smoothing splines except that monotonicity instead of smoothness is used to remove noise
- Fit a free-form line to a set of data s.t. the line is non-decreasing everywhere and minimizes MSE on the data
- No assumptions about target function (e.g., linearity like in a linear model)
- Can be weighted or unweighted (all weights must be > 0); no contradictory constraints (see the sketch below)
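A minimal sketch of this definition using scikit-learn's `IsotonicRegression`; the toy data and uniform weights are illustrative assumptions:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

x = np.arange(10)
y = np.array([1.0, 0.5, 2.0, 1.5, 3.0, 2.5, 4.0, 5.0, 4.5, 6.0])  # noisy, roughly increasing
w = np.ones_like(y)  # unweighted case; any positive weights are allowed

iso = IsotonicRegression(increasing=True)
y_fit = iso.fit_transform(x, y, sample_weight=w)
print(y_fit)  # non-decreasing sequence minimizing the weighted MSE to y
```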
- Non-parametric
- Fast
- Simple
- Points at ends of intervals can be noisy
- Works best when n > 10,000 (can smooth the outcome to improve performance when n < 10,000)
- For fitting a non-parametric model to data that is expected to be ordered
- Improve calibration of probabilistic classifiers - correct the probabilities output by classifiers like random forests, boosted trees, SVMs, etc. (but not neural networks, which are already well calibrated); see the sketch after this list
- Calibration of recommendation models
- Non-metric multi-dimensional scaling - isotonic regression is used to find distance as a function of item-item similarity
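A sketch of the calibration use case via scikit-learn's `CalibratedClassifierCV`, which fits an isotonic regression from classifier scores to probabilities when `method="isotonic"`; the random forest base model and synthetic data are placeholders:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit the base classifier and an isotonic mapping from its scores
# to calibrated probabilities on held-out folds
clf = CalibratedClassifierCV(RandomForestClassifier(random_state=0),
                             method="isotonic", cv=3)
clf.fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]  # calibrated P(y=1)
```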
- Linear ordered isotonic regression is solved using Pool Adjacent Violators Algorithm (PAVA)
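PAVA is short enough to sketch directly; below is an illustrative (not performance-tuned) implementation of the weighted 1-D case:

```python
import numpy as np

def pava(y, w=None):
    """Pool Adjacent Violators: non-decreasing fit minimizing the
    weighted sum of squared errors to y (weights must be > 0)."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    means, weights, counts = [], [], []  # one entry per pooled block
    for yi, wi in zip(y, w):
        means.append(yi); weights.append(wi); counts.append(1)
        # Pool backwards while adjacent blocks violate monotonicity
        while len(means) > 1 and means[-2] > means[-1]:
            m2, w2, c2 = means.pop(), weights.pop(), counts.pop()
            m1, w1, c1 = means.pop(), weights.pop(), counts.pop()
            wt = w1 + w2
            means.append((w1 * m1 + w2 * m2) / wt)
            weights.append(wt)
            counts.append(c1 + c2)
    # Expand block means back to one fitted value per input point
    return np.repeat(means, counts)

print(pava([1.0, 3.0, 2.0, 4.0]))  # -> [1.  2.5 2.5 4. ]
```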
- scikit-learn: `IsotonicRegression`
- R: `isoreg`, `Iso` package, `isotone` package, `isoMDS`
- Spark: `IsotonicRegression`, `IsotonicRegressionModel`
- Ad Click Prediction: a View from the Trenches, KDD 2013
- Web-Scale Bayesian Click-Through Rate Prediction for Sponsored Search Advertising in Microsoft’s Bing Search Engine, ICML 2010
- Predicting Good Probabilities With Supervised Learning, ICML 2005
- Fastest Isotonic Regression Algorithms
- Kaggle: Give Me Some Credit contest
- Platt Scaling - calibration of probabilistic classifiers using a sigmoid function. Better than isotonic regression for n < 5,000; slower than isotonic regression for all n (see the sketch below).
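Platt scaling is available through the same scikit-learn API via `method="sigmoid"`; `calibrated` below is a hypothetical helper applying the n < 5,000 heuristic from the note above:

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

def calibrated(base, n_samples):
    # Heuristic from the note above: Platt (sigmoid) for small n, isotonic otherwise
    method = "sigmoid" if n_samples < 5000 else "isotonic"
    return CalibratedClassifierCV(base, method=method, cv=3)

clf = calibrated(LinearSVC(), n_samples=2000)  # selects Platt scaling here
```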
Notes:
- Ridge regression and LASSO are forms of penalized estimation. They introduce bias into the estimation of model parameters to reduce the variance of the estimates, and they have lower MSE than OLS when multicollinearity is present. These methods are used mainly for prediction rather than inference, since it is difficult to account for the bias (see the sketch below).
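A small sketch of this bias-variance trade-off on nearly collinear features, assuming scikit-learn; the synthetic data and `alpha=1.0` are arbitrary:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 + 1e-4 * rng.normal(size=100)  # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + rng.normal(scale=0.1, size=100)

print(LinearRegression().fit(X, y).coef_)  # typically huge, offsetting coefficients
print(Ridge(alpha=1.0).fit(X, y).coef_)    # shrunken, stable coefficients
```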
- scikit-learn Generalized Linear Models
- Wikipedia - Linear Regression
- Numerical Python, Robert Johansson, Apress, 2015