Regression - niranjv/ml-notes GitHub Wiki

Contents

  • Linear Methods
  • Shrinkage methods
    • Ridge Regression
    • LASSO
    • LARS
  • Derived Input Methods
    • Principal Components Regression
    • Partial Least Squares Regression
  • Non-linear methods
    • MARS
    • Polynomial Regression
    • LOESS
    • Splines
    • Generalized Additive Models (GAMS)
    • Isotonic Regression
  • References

Logistic regression

Least Angle Regression (LARS)

Principal Component Regression

  • If features are correlated, run PCA first and then regress on a few PCs
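
A minimal sketch of the idea above, using only NumPy. The helper name `pcr_fit`, the synthetic data, and the choice of two components are assumptions made for illustration; they are not taken from any particular library.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal components regression via SVD (illustrative helper)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions of the centered features.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T                 # (p, k) loadings of the top-k PCs
    Z = Xc @ V_k                              # (n, k) PC scores
    # Ordinary least squares of the centered response on the PC scores.
    gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
    beta = V_k @ gamma                        # coefficients in the original feature space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Synthetic data: three strongly correlated features plus one independent feature.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack(
    [z + 0.05 * rng.normal(size=n) for _ in range(3)] + [rng.normal(size=n)]
)
y = 2.0 * z + X[:, 3] + 0.1 * rng.normal(size=n)

beta, intercept = pcr_fit(X, y, n_components=2)
y_hat = X @ beta + intercept
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(beta, intercept, r2)
```

With all components retained, this reduces to ordinary least squares; the benefit comes from dropping the low-variance directions that correlated features create.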

Bayesian Regression

MARS

LOESS

Isotonic Regression

  • Focus here is only on 1-d linear ordered isotonic regression, not general isotonic regression
  • Non-parametric regression method to fit a non-decreasing function to data
  • Similar to inexact smoothing splines except that monotonicity instead of smoothness is used to remove noise
  • Fit a free-form line to a set of data such that the line is non-decreasing everywhere and minimizes MSE on the data
  • No assumptions about target function (e.g., linearity like in a linear model)
  • Can be weighted or unweighted (all weights must be > 0); no contradictory constraints
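
The fit described above can be written as a weighted least-squares problem with order constraints. The notation is assumed here for illustration: observations are sorted by the predictor, y_i are the observed responses, w_i the weights, and \hat{y}_i the fitted values.

```latex
\min_{\hat{y}_1, \dots, \hat{y}_n} \; \sum_{i=1}^{n} w_i \, (y_i - \hat{y}_i)^2
\quad \text{subject to} \quad
\hat{y}_1 \le \hat{y}_2 \le \dots \le \hat{y}_n,
\qquad w_i > 0
```

The unweighted case corresponds to w_i = 1 for all i.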

Advantages

  • Non-parametric
  • Fast
  • Simple

Disadvantages

  • Points at ends of intervals can be noisy
  • Works best when n > 10,000 (can smooth outcome to improve performance when n < 10,000)

Applications

  • Fitting a non-parametric model to data where the response is expected to be monotonic in the predictor
  • Improving calibration of probabilistic classifiers: corrects probabilities output by classifiers like random forests, boosted trees, SVMs, etc. (but not neural networks, which tend to be well calibrated already)
  • Calibration of recommendation models
  • Non-metric multi-dimensional scaling - isotonic regression is used to find distance as a function of item-item similarity
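
A hedged sketch of the calibration use case, using scikit-learn's `IsotonicRegression` directly on classifier scores. The "overconfident classifier" is simulated with a made-up distortion of the true probabilities, so the data, the split, and the `brier` helper are all assumptions for illustration only.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 5000
true_p = rng.uniform(0, 1, size=n)
labels = rng.binomial(1, true_p)

# Hypothetical overconfident scores: pushed toward 0 and 1, but still monotone in true_p.
raw_scores = true_p**3 / (true_p**3 + (1 - true_p) ** 3)

# Fit the calibration map on one half, evaluate on the other half.
cal, test = slice(0, n // 2), slice(n // 2, n)
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(raw_scores[cal], labels[cal])
calibrated = iso.predict(raw_scores[test])

def brier(p, y):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return np.mean((p - y) ** 2)

print("raw Brier score:       ", brier(raw_scores[test], labels[test]))
print("calibrated Brier score:", brier(calibrated, labels[test]))
```

Because the map is fitted on held-out scores and constrained to be non-decreasing, the ranking of examples is preserved (up to ties) while the probabilities themselves are corrected.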

Algorithms

  • Linear ordered isotonic regression is solved using the Pool Adjacent Violators Algorithm (PAVA)
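
A minimal pure-Python sketch of PAVA for the non-decreasing case. The function name `pava` and its block representation are illustrative choices; the sketch assumes the responses are already sorted by the predictor and that all weights are positive.

```python
def pava(y, w=None):
    """Weighted pool-adjacent-violators fit of a non-decreasing sequence.

    Assumes y is ordered by the predictor and every weight is > 0.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block holds [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            w_tot = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w_tot, w_tot, c1 + c2])
    # Expand each block back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

print(pava([1, 3, 2, 4, 3, 5]))  # [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
```

Each violating pair is replaced by its weighted mean, and pooling continues backwards until the sequence of block means is non-decreasing, which is why the fitted function is piecewise constant.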

Implementations

  • scikit-learn: IsotonicRegression
  • R: isoreg, Iso package, isotone package, isoMDS
  • Spark: IsotonicRegression, IsotonicRegressionModel
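
A short usage sketch for the scikit-learn implementation listed above. The toy data is made up for illustration; `out_of_bounds="clip"` is one of the available options for handling predictions outside the training range.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

x = np.arange(6)
y = np.array([1.0, 3.0, 2.0, 4.0, 3.0, 5.0])

# increasing=True fits a non-decreasing function; "clip" maps new points
# outside [x.min(), x.max()] to the nearest boundary value.
iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
y_fit = iso.fit(x, y).predict(x)

# Predictions between training points are linearly interpolated.
y_new = iso.predict([2.5, 10.0])
print(y_fit, y_new)
```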

References

Stochastic Gradient Descent

Notes:

  • Ridge regression and LASSO are forms of penalized estimation. They introduce bias into the estimates of model parameters in order to reduce the variance of those estimates. They can have lower MSE than OLS when multicollinearity is present. These methods are used mainly for prediction rather than inference, because the bias is difficult to account for
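
A small sketch of the shrinkage effect for ridge regression, using its closed-form solution on two nearly collinear features. The `ridge` helper, the synthetic data, and the penalty value `lam=1.0` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)       # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)

# Center so that no intercept term is needed.
X = X - X.mean(axis=0)
y = y - y.mean()

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam * I)^{-1} X'y; lam=0 gives OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, lam=0.0)
beta_ridge = ridge(X, y, lam=1.0)
print("OLS:  ", beta_ols)
print("ridge:", beta_ridge)
```

The penalty pulls the coefficient vector toward zero, so the ridge estimate always has a smaller norm than the OLS estimate; under multicollinearity this trades a little bias for a large reduction in the variance of the individual coefficients.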

References
