Task 05 - Vcavalcanti1975/regression_project GitHub Wiki
Create an actionable regression model for incidence rate. It should explain as much of the variability as possible while keeping only the most important factors.
lm.fit <- lm(incidenceRate ~ log10(PovertyEst) + Name)
plot(lm.fit)
Residual standard error: 27.09 on 30720 degrees of freedom
Multiple R-squared: 0.685, Adjusted R-squared: 0.6662
F-statistic: 36.51 on 1830 and 30720 DF, p-value: < 2.2e-16
https://rpubs.com/VicCav/582077
lm.fit1 <- lm(incidenceRate ~ poly(zipCode, 5) + State + log10(PovertyEst) + Name + log10(popEst2015))
plot(lm.fit1)
Residual standard error: 22.75 on 30668 degrees of freedom
Multiple R-squared: 0.7781, Adjusted R-squared: 0.7645
F-statistic: 57.14 on 1882 and 30668 DF, p-value: < 2.2e-16
https://rpubs.com/VicCav/582079
lm.fit2 <- lm(incidenceRate ~ poly(zipCode, 5) + lev_income)
plot(lm.fit2)
Residual standard error: 40.96 on 32542 degrees of freedom
Multiple R-squared: 0.2372, Adjusted R-squared: 0.237
F-statistic: 1265 on 8 and 32542 DF, p-value: < 2.2e-16
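The three fits can be compared side by side using only the figures reported in the summaries above. The following is a minimal sketch, not part of the original analysis: the data frame `models` and its column names are hypothetical, and the inputs are the rounded values as printed, so the recomputed statistics match the reported ones only up to rounding. It recomputes the adjusted R-squared and the overall F-statistic from R-squared and the degrees of freedom, which also serves as a consistency check on the output.

```r
# Values copied from the three summary() outputs above (rounded as reported).
# The data frame and its column names are illustrative, not from the original project.
models <- data.frame(
  model    = c("lm.fit", "lm.fit1", "lm.fit2"),
  r2       = c(0.685, 0.7781, 0.2372),   # Multiple R-squared
  p        = c(1830, 1882, 8),           # model degrees of freedom
  df_resid = c(30720, 30668, 32542),     # residual degrees of freedom
  rse      = c(27.09, 22.75, 40.96)      # residual standard error
)

# Number of observations implied by each fit: n = p + df_resid + 1
models$n <- models$p + models$df_resid + 1

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
models$adj_r2 <- 1 - (1 - models$r2) * (models$n - 1) / models$df_resid

# Overall F statistic = (R^2 / p) / ((1 - R^2) / df_resid)
models$f_stat <- (models$r2 / models$p) / ((1 - models$r2) / models$df_resid)

# Rank the fits by adjusted R^2, highest first
print(models[order(-models$adj_r2), ])
```

All three summaries imply the same number of observations (32551), so the fits are comparable on the same data. On the reported numbers, lm.fit1 has the highest adjusted R-squared (0.7645) but uses 1882 model degrees of freedom, most of them presumably from the Name term, while lm.fit2 uses only 8 and reaches 0.237. Which of these best meets the goal of identifying only the most important factors is a judgment the summaries alone do not settle.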