Logistic Regression - niranjv/ml-notes GitHub Wiki
- Used when the dependent variable (response) is categorical (usually 2 classes). Extensions include multinomial logistic regression for > 2 classes, ordered logistic regression for ordered responses, mixed logit, conditional random fields, conditional logistic regression, etc.
- Used to estimate the probability of a binary response using 1 or more covariates
- `P(Y|X)` is Bernoulli, not Gaussian.
- `Y_i` are not identically distributed since `P(Y_i|X_i)` depends on the value of `X_i`. But `Y_i` are independent conditional on `X_i` and $\beta$
- Predicted values are probabilities (in `[0,1]` due to the logistic function); threshold predicted probabilities to classify predictions into categories
- Alternative to linear discriminant analysis.
- `Logit` function is the inverse of the `Logistic` function. `Logit`/log odds of the probability is equal to the RHS of the linear regression equation.
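The logit/logistic relationship and the thresholding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficient values `beta0` and `beta1`, the helper names, and the 0.5 threshold are hypothetical choices for the example, not values from these notes.

```python
import math

def logistic(z):
    """Map a log-odds value z (any real number) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Map a probability p in (0, 1) to its log odds; inverse of logistic()."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for a one-covariate model (illustrative values only).
beta0, beta1 = -1.5, 0.8

def predict_proba(x):
    """P(Y = 1 | X = x): logistic function applied to the linear predictor."""
    return logistic(beta0 + beta1 * x)

def classify(x, threshold=0.5):
    """Turn a predicted probability into a class label by thresholding."""
    return 1 if predict_proba(x) >= threshold else 0

if __name__ == "__main__":
    for x in (0.0, 2.0, 4.0):
        p = predict_proba(x)
        # logit(p) recovers the linear predictor beta0 + beta1 * x
        print(x, round(p, 3), round(logit(p), 3), classify(x))
```

The threshold is a modelling choice: 0.5 is a common default, but a different cut-off may suit problems where the two kinds of misclassification have unequal costs.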
Model parameters must be estimated via an iterative method; unlike linear regression, no closed-form expression is available. Failure of the method to converge can occur due to:
- `p >> n` => conservative Wald statistic => non-convergence
- Multicollinearity => high std errors of model parameters
- Sparseness in data => large number of empty cells => problematic for categorical data => no convergence because log(0) is undefined => collapse categories or add constant to all cells
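- Complete separation => all cases are classified perfectly => likelihood is maximized only as coefficients go to infinity => no convergence; re-examine the data, since errors may be present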
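The iterative estimation and the complete-separation failure mode above can be illustrated with a small sketch. It uses plain gradient ascent on the log-likelihood, swapped in here for simplicity (statistical packages typically use Newton-type methods such as IRLS). The function `fit`, the learning rate, the step counts, and both toy data sets are assumptions made up for this example.

```python
import math

def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit(xs, ys, steps, lr=0.1):
    """Gradient ascent on the mean Bernoulli log-likelihood, one covariate.

    Runs a fixed number of steps and returns (beta0, beta1).
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the log-likelihood: residual (y - p) times each input.
        g0 = sum(y - logistic(b0 + b1 * x) for x, y in zip(xs, ys))
        g1 = sum((y - logistic(b0 + b1 * x)) * x for x, y in zip(xs, ys))
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Overlapping classes (hypothetical data): a finite estimate exists.
xs_overlap = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys_overlap = [0, 0, 1, 0, 1, 1]

# Completely separated classes: every x < 0 has y = 0, every x > 0 has y = 1.
xs_sep = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys_sep = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    for steps in (1000, 10000):
        # Overlap: the slope settles. Separation: the slope keeps growing.
        print(steps, fit(xs_overlap, ys_overlap, steps)[1],
              fit(xs_sep, ys_sep, steps)[1])
```

On the overlapping data the slope stops changing once the gradient vanishes. On the separated data the gradient for the slope stays positive at every step, so the estimate only grows with more iterations and never settles.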
- Use `deviance` to assess goodness of fit of model; analogous to the residual sum of squares in linear regression. Small values => less deviance from 'full' (saturated) model, i.e. better fit
- Pseudo R^2 - several measures available, each with its own limitations
- Likelihood Ratio test
- Wald statistic
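As a rough sketch of how some of these quantities relate, the following computes the deviance, one pseudo R^2 measure (McFadden's, chosen here as just one of the several available), and the likelihood ratio statistic against an intercept-only model. The labels `ys` and probabilities `ps` are made-up values standing in for the output of a fitted model with one covariate; in a real analysis `ps` would be the maximum-likelihood fitted probabilities. All function names are hypothetical.

```python
import math

def log_likelihood(ys, ps):
    """Bernoulli log-likelihood of labels ys under predicted probabilities ps.

    Probabilities of exactly 0 or 1 would hit log(0) and raise an error.
    """
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for y, p in zip(ys, ps))

def fit_summary(ys, ps_model):
    """Deviance, McFadden pseudo R^2, and likelihood ratio statistic versus
    the intercept-only (null) model, for ungrouped 0/1 responses."""
    ll_model = log_likelihood(ys, ps_model)
    # The null model predicts the overall proportion of 1s for everyone.
    p_bar = sum(ys) / len(ys)
    ll_null = log_likelihood(ys, [p_bar] * len(ys))
    return {
        # For 0/1 responses the saturated model has log-likelihood 0,
        # so the deviance reduces to -2 * log-likelihood.
        "deviance": -2.0 * ll_model,
        "null_deviance": -2.0 * ll_null,
        "mcfadden_r2": 1.0 - ll_model / ll_null,
        "lr_statistic": 2.0 * (ll_model - ll_null),
    }

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Made-up labels and fitted probabilities, for illustration only.
ys = [0, 0, 1, 0, 1, 1]
ps = [0.1, 0.2, 0.6, 0.4, 0.7, 0.9]

if __name__ == "__main__":
    summary = fit_summary(ys, ps)
    print(summary)
    # One extra parameter (the slope) versus the null model => 1 df.
    print("LR test p-value:", chi2_sf_1df(summary["lr_statistic"]))
```

The likelihood ratio statistic equals the drop in deviance from the null model to the fitted model, which is why the two are often reported together.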
- ISLR, Section 4.3
- Wikipedia