Linear Regression project - louislau66/Traffic_Volumn_Prediction_Python GitHub Wiki
1 Introduction
The data set contains hourly westbound Interstate 94 traffic volume recorded at MN DoT ATR station 301, roughly midway between Minneapolis and St Paul, MN. Hourly weather features and holidays are included so that their impact on traffic volume can be studied. We first examine which factors affect traffic volume, then build a linear regression model to predict it and evaluate the accuracy of the predictions.
2 Attribute Information
3 Explore the dataset
4 Preprocessing
Replace readings where temp=0 with the mean temperature.
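The replacement step can be sketched with pandas as follows. This is a minimal illustration on made-up values, not the project's own code. It assumes the column is named `temp` and that the mean is taken over the non-zero readings only, which the text does not specify.

```python
import pandas as pd

# Toy frame standing in for the real data; the values are made up.
df = pd.DataFrame({"temp": [288.3, 0.0, 275.1, 0.0, 291.6]})

# Assumption: average only the valid (non-zero) readings,
# so the placeholder zeros do not drag the mean down.
valid_mean = df.loc[df["temp"] != 0, "temp"].mean()

# Overwrite the zero readings with that mean.
df.loc[df["temp"] == 0, "temp"] = valid_mean

print(df["temp"].round(2).tolist())
```

If the mean were taken over the whole column instead, the zeros would pull the replacement value well below the typical temperature.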
5 Explore relationships across the entire dataset
No strong relationships between variables were observed.
6 Traffic volume distribution
7 Correlation of dataset
From the correlation table, two columns (temp and clouds_all) show a weak correlation with the target variable (traffic_volume).
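A correlation check of this kind might look like the sketch below. The column names follow the text, but the numbers are toy values chosen for illustration, so the correlations they produce are much stronger than the weak ones reported for the real data.

```python
import pandas as pd

# Toy data; only the column names come from the write-up.
df = pd.DataFrame({
    "temp": [270.0, 275.0, 280.0, 285.0, 290.0, 295.0],
    "clouds_all": [90, 75, 40, 64, 20, 1],
    "traffic_volume": [1200, 3100, 2500, 4800, 3900, 5200],
})

# Pearson correlation of every feature with the target.
corr = df.corr()["traffic_volume"].drop("traffic_volume")

# Rank features by absolute correlation, strongest first.
ranked = corr.reindex(corr.abs().sort_values(ascending=False).index)
print(ranked)
```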
8 Date_time column impact
- An initial linear regression without the Date_time column achieved a very low R2 score (0.05025), which indicates that the other columns contribute very little to prediction accuracy.
- Day of week, time and month information was extracted from the Date_time column to build a new model.
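One way to derive these features with pandas is sketched below. The column name `date_time`, the new column names (`weekday`, `hour`, `month`) and the choice of one-hot encoding with `drop_first=True` are assumptions for illustration. The text does not say how the extracted values were encoded.

```python
import pandas as pd

# Toy rows; timestamps and volumes are illustrative only.
df = pd.DataFrame({
    "date_time": ["2012-10-02 09:00:00", "2012-10-06 17:00:00", "2013-01-01 00:00:00"],
    "traffic_volume": [5545, 4100, 1500],
})
df["date_time"] = pd.to_datetime(df["date_time"])

# Pull calendar features out of the timestamp.
df["weekday"] = df["date_time"].dt.day_name()
df["hour"] = df["date_time"].dt.hour
df["month"] = df["date_time"].dt.month

# Treat each feature as categorical: one indicator column per level,
# dropping the first level of each to avoid perfect collinearity.
features = pd.get_dummies(
    df[["weekday", "hour", "month"]],
    columns=["weekday", "hour", "month"],
    drop_first=True,
)
print(features.columns.tolist())
```

Encoding hour and month as categories rather than numbers lets a linear model fit a separate level for each hour, which matters because traffic does not rise linearly through the day.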
9 OLS Regression Results
- 102 independent variables
- R2=0.833
- Adj R2=0.832
The R2 value is acceptable, but the model contains too many variables. We therefore eliminate some variables to make the model easier to interpret.
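For reference, adjusted R2 penalises R2 for the number of predictors through the standard formula 1 - (1 - R2)(n - 1)/(n - p - 1). The helper below is a small sketch of that formula. The sample size in the example call is hypothetical, since the number of rows used for the fit is not stated here.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R2 for a model with an intercept and n_features predictors."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical sample size; only R2=0.833 and the 102 variables come from the text.
print(adjusted_r2(0.833, n_samples=40_000, n_features=102))
```

With tens of thousands of rows the penalty for 102 variables is small, which is why the two scores differ only in the third decimal place.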
10 Backward Feature Elimination
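Backward elimination starts from the full model and repeatedly drops the least significant variable until every remaining one passes a significance test. The sketch below uses NumPy only and runs on synthetic data. The stopping rule (|t| below 1.96, roughly a 5% two-sided test for large samples) and the function names are assumptions, since the criterion used in this project is not stated.

```python
import numpy as np


def t_statistics(design, y):
    """OLS t-statistic for every column of `design` (intercept included)."""
    n, k = design.shape
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - k)               # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return beta / np.sqrt(np.diag(cov))


def backward_eliminate(X, y, names, t_threshold=1.96):
    """Drop the weakest feature one at a time until all |t| >= threshold."""
    keep = list(range(X.shape[1]))
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        t = np.abs(t_statistics(design, y))[1:]    # skip the intercept
        weakest = int(np.argmin(t))
        if t[weakest] >= t_threshold:
            break                                   # everything left is significant
        keep.pop(weakest)
    return [names[i] for i in keep]


# Synthetic check: only columns "a" and "c" actually drive y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=500)
selected = backward_eliminate(X, y, ["a", "b", "c", "d"])
print(selected)
```

Refitting after every removal matters because dropping one variable changes the t-statistics of the others.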
11 OLS Regression Results after Backward feature Elimination
12 Variables in the Linear Regression model
- The Holiday, Rain_1h and Snow_1h columns of the original dataset were eliminated because they had little or no impact on the traffic volume prediction.
- The prediction model takes weather, day of week, time and month as input.
13 Model Evaluation
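Conclusion: Overall, the model predicts traffic volume well. The training R2, the test R2 and the 4-fold cross-validation score are all close to one another, so there is little sign of overfitting.
cross_val_score(clf,X_final,y,cv=4).mean()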
Out[52]: 0.8311152969743676
R2 test: 0.8038296272306586
R2 train: 0.8326675841410172
MSE: 645586.6154393161
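Metrics like the ones above could be produced with scikit-learn roughly as follows. This is a sketch on synthetic data. The names `clf`, `X_final` and `y` mirror the call shown above, while the 75/25 split, the random seeds and the choice to compute MSE on the test set are assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the selected features and the target.
rng = np.random.default_rng(42)
X_final = rng.normal(size=(400, 3))
y = X_final @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=400)

# Hold out a test set (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X_final, y, test_size=0.25, random_state=0
)

clf = LinearRegression().fit(X_train, y_train)

r2_train = r2_score(y_train, clf.predict(X_train))
r2_test = r2_score(y_test, clf.predict(X_test))
mse_test = mean_squared_error(y_test, clf.predict(X_test))

# For a regressor, cross_val_score defaults to the estimator's R2 score.
cv_r2 = cross_val_score(clf, X_final, y, cv=4).mean()

print(f"R2 train: {r2_train:.3f}")
print(f"R2 test:  {r2_test:.3f}")
print(f"MSE test: {mse_test:.3f}")
print(f"CV R2:    {cv_r2:.3f}")
```

Because MSE is in squared units of the target, taking its square root gives an error on the same scale as the hourly traffic volume, which is easier to interpret.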