Wiki of Capstone Course - dougsmithstpete/Capstone_Coursera_DS GitHub Wiki

Business Problem and Introduction

Between 2004 and 2020, 194,673 collisions were reported in the Seattle metropolitan area. The common characteristics of these collisions, including time of day, weather, road and lighting conditions, geographic location, types of vehicles involved, and whether anyone involved was impaired by drugs or alcohol, can be used to build a model that predicts the severity of bodily harm to the individuals involved.

The ultimate goal is to compile a data set and an associated predictive model that can be extended to other areas with similar data collection procedures, both elsewhere in the United States and internationally. The model's predictions could then feed a data service that uses geofencing to alert users of common GPS navigation apps (Google Maps, Waze, Apple Maps) to potential dangers at their present location.
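To make the geofencing idea concrete, here is a minimal sketch, assuming circular geofences checked with the haversine great-circle distance. The zone names, coordinates, and radii are hypothetical placeholders, not values derived from the collision data, and `haversine_m` and `active_alerts` are illustrative names rather than part of any navigation app's API.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical geofences: (label, center latitude, center longitude, radius in meters).
GEOFENCES = [
    ("Example zone A", 47.6062, -122.3321, 300.0),
    ("Example zone B", 47.6205, -122.3493, 200.0),
]


def active_alerts(lat, lon, geofences=GEOFENCES):
    """Return the labels of all geofences that contain the given position."""
    return [
        label
        for label, c_lat, c_lon, radius_m in geofences
        if haversine_m(lat, lon, c_lat, c_lon) <= radius_m
    ]
```

In a real feed, the zones would presumably come from the model's predictions for each location, and the radius would be a tuning choice.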

Data Overview

The data is provided by the SDOT Traffic Management Division, Traffic Records Group. The 194,673 collisions occurred between January 1, 2004 and May 5, 2020. Thirty-six descriptive variables will be used to predict the severity of each collision.
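The record count and date range quoted above can be checked with a short script. The following is a minimal sketch using only the Python standard library. The inline sample rows are invented for illustration, and the `INCDATE` format (`%Y/%m/%d`) is an assumption that may need adjusting to match the actual export.

```python
import csv
import io
from datetime import datetime

# Tiny inline stand-in for the collision file; both rows are invented for illustration.
SAMPLE_CSV = """INCKEY,SEVERITYCODE,INCDATE
1001,1,2004/01/01
1002,2,2020/05/05
"""


def summarize(csv_file, date_format="%Y/%m/%d"):
    """Return (row count, earliest INCDATE, latest INCDATE) for a collisions CSV.

    Assumes at least one data row and that every INCDATE matches date_format.
    """
    dates = [
        datetime.strptime(row["INCDATE"], date_format).date()
        for row in csv.DictReader(csv_file)
    ]
    return len(dates), min(dates), max(dates)


count, first, last = summarize(io.StringIO(SAMPLE_CSV))
print(count, first, last)  # prints: 2 2004-01-01 2020-05-05
```

To run it against the real data, replace `io.StringIO(SAMPLE_CSV)` with an open file handle for the downloaded CSV.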

| Attribute/Feature | Description |
| --- | --- |
| WEATHER | A description of the weather conditions at the time of the collision. |
| ROADCOND | The condition of the road during the collision. |
| LIGHTCOND | The light conditions during the collision. |
| PEDROWNOTGRNT | Whether or not the pedestrian right of way was not granted. (Y/N) |
| SDOTCOLNUM | A number given to the collision by SDOT. |
| SPEEDING | Whether or not speeding was a factor in the collision. (Y/N) |
| ST_COLCODE | A code provided by the state that describes the collision. |
| ST_COLDESC | A description that corresponds to the state's coding designation. |
| SEGLANEKEY | A key for the lane segment in which the collision occurred. |
| CROSSWALKKEY | A key for the crosswalk at which the collision occurred. |
| HITPARKEDCAR | Whether or not the collision involved hitting a parked car. (Y/N) |
| LOCATION | A description of the general location of the collision. |
| SEVERITYCODE | A code that corresponds to the severity of the collision. |
| SEVERITYDESC | A detailed description of the severity of the collision. |
| COLLISIONTYPE | The collision type. |
| PERSONCOUNT | The total number of people involved in the collision. |
| PEDCOUNT | The number of pedestrians involved in the collision. This is entered by the state. |
| PEDCYLCOUNT | The number of bicycles involved in the collision. This is entered by the state. |
| VEHCOUNT | The number of vehicles involved in the collision. This is entered by the state. |
| INJURIES | The total number of injuries in the collision. This is entered by the state. |
| SERIOUSINJURIES | The number of serious injuries in the collision. This is entered by the state. |
| FATALITIES | The number of fatalities in the collision. This is entered by the state. |
| INCDATE | The date of the incident. |
| INCDTTM | The date and time of the incident. |
| JUNCTIONTYPE | The category of junction at which the collision took place. |
| SDOT_COLCODE | A code given to the collision by SDOT. |
| SDOT_COLDESC | A description of the collision corresponding to the collision code. |
| INATTENTIONIND | Whether or not the collision was due to inattention. (Y/N) |
| UNDERINFL | Whether or not a driver involved was under the influence of drugs or alcohol. |
| OBJECTID | ESRI unique identifier. |
| SHAPE | ESRI geometry field. |
| INCKEY | A unique key for the incident. |
| COLDETKEY | A secondary key for the incident. |
| ADDRTYPE | Collision address type: Alley, Block, Intersection. |
| INTKEY | A key that corresponds to the intersection associated with a collision. |
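Several of the attributes described above are Y/N indicators, which most modeling libraries expect as numbers. Here is a minimal sketch of one way to encode them, assuming each record is a plain dictionary keyed by attribute name. Treating a blank or missing value as 0 is an assumption made for illustration, not something the source data documents, and `encode_flags` is a hypothetical helper name.

```python
# Attributes documented above as Y/N indicators.
YES_NO_COLUMNS = ["PEDROWNOTGRNT", "SPEEDING", "HITPARKEDCAR", "INATTENTIONIND"]


def encode_flags(record, columns=YES_NO_COLUMNS):
    """Return a copy of record with Y/N indicator columns mapped to 1/0.

    Assumption: a blank or missing value is treated as 0 (condition not recorded).
    Any other unexpected value raises ValueError so it is not silently miscoded.
    """
    encoded = dict(record)
    for col in columns:
        value = (record.get(col) or "").strip().upper()
        if value == "Y":
            encoded[col] = 1
        elif value in ("N", ""):
            encoded[col] = 0
        else:
            raise ValueError(f"unexpected value {value!r} in column {col}")
    return encoded


row = {"SPEEDING": "Y", "HITPARKEDCAR": "N", "WEATHER": "Raining"}
encoded_row = encode_flags(row)
```

Raising on unknown values is a deliberate choice here: it surfaces inconsistent codes during cleaning rather than letting them pass into the model.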
