Doc: TLC - wimlds/smart_cities GitHub Wiki
The New York TLC taxi data set is a public data set provided by the Taxi and Limousine Commission (TLC). It includes records of all trips completed in yellow and green taxis in New York City. Data are available for 2009-2016 for yellow taxis and for 2013-2016 for green taxis. Each trip record contains pick-up and drop-off times, pick-up and drop-off locations, trip distance, itemized fares (recorded through meters installed in each taxi), driver-reported passenger counts, tips and payment information, and several other fields.
The for-hire vehicle (FHV) data, available for 2015 and 2016, contain fewer variables and do not include latitude-longitude pick-up and drop-off locations, only a location ID.
The data are organized in separate files by vehicle type (yellow, green, or FHV) and by year. Be careful: these data sets are extremely large. The green taxi data are the smallest, with millions of rows per year (one row per ride); each FHV data set has about 60 million rows; and each year of yellow taxi data contains 100-200 million rows.
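
Because the data are split by vehicle type and year, a small helper can keep table references consistent. The sketch below assumes the BigQuery naming pattern used by the table paths listed on this page (`bigquery-public-data.new_york.tlc_<type>_trips_<year>`); the names `AVAILABLE_YEARS`, `table_path`, and `preview_query` are hypothetical and exist only for illustration.

```python
# Years covered per vehicle type, as stated in the overview above.
AVAILABLE_YEARS = {
    "yellow": range(2009, 2017),
    "green": range(2013, 2017),
    "fhv": range(2015, 2017),
}


def table_path(vehicle_type: str, year: int) -> str:
    """Return the fully qualified table name for one vehicle type and year."""
    if vehicle_type not in AVAILABLE_YEARS:
        raise ValueError(f"unknown vehicle type: {vehicle_type!r}")
    if year not in AVAILABLE_YEARS[vehicle_type]:
        raise ValueError(f"no {vehicle_type} data for {year}")
    return f"bigquery-public-data.new_york.tlc_{vehicle_type}_trips_{year}"


def preview_query(vehicle_type: str, year: int, columns: list, limit: int = 1000) -> str:
    """Build a query string that names its columns explicitly and caps the row count."""
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list}\n"
        f"FROM `{table_path(vehicle_type, year)}`\n"
        f"LIMIT {limit}"
    )
```

The function only builds a query string; running it requires a BigQuery client of your choice. Given the table sizes, naming the columns you need rather than using `SELECT *` is generally a sensible habit, since it tends to reduce the amount of data a columnar store has to read.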
TLC Yellow Taxi Trip Data
Table path:
`bigquery-public-data.new_york.tlc_yellow_trips_2009`
`bigquery-public-data.new_york.tlc_yellow_trips_2010`
`bigquery-public-data.new_york.tlc_yellow_trips_2011`
`bigquery-public-data.new_york.tlc_yellow_trips_2012`
`bigquery-public-data.new_york.tlc_yellow_trips_2013`
`bigquery-public-data.new_york.tlc_yellow_trips_2014`
`bigquery-public-data.new_york.tlc_yellow_trips_2015`
`bigquery-public-data.new_york.tlc_yellow_trips_2016`
Column Name | Type | Explanation |
---|---|---|
vendor_id | String | A code indicating the TPEP provider that provided the record. 1 = Creative Mobile Technologies, LLC; 2 = VeriFone Inc. |
pickup_datetime | Timestamp | The date and time when the meter was engaged. |
dropoff_datetime | Timestamp | The date and time when the meter was disengaged. |
passenger_count | Integer | The number of passengers in the vehicle. This is a driver-entered value. |
trip_distance | Float | The elapsed trip distance in miles reported by the taximeter. |
pickup_longitude | Float | Longitude where the meter was engaged. |
pickup_latitude | Float | Latitude where the meter was engaged. |
rate_code | Integer | The final rate code in effect at the end of the trip. 1 = Standard rate; 2 = JFK; 3 = Newark; 4 = Nassau or Westchester; 5 = Negotiated fare; 6 = Group ride. |
store_and_fwd_flag | String | This flag indicates whether the trip record was held in vehicle memory before being sent to the vendor, aka “store and forward,” because the vehicle did not have a connection to the server. Y = store and forward trip; N = not a store and forward trip. |
dropoff_longitude | Float | Longitude where the meter was disengaged. |
dropoff_latitude | Float | Latitude where the meter was disengaged. |
payment_type | String | A numeric code signifying how the passenger paid for the trip. 1 = Credit card; 2 = Cash; 3 = No charge; 4 = Dispute; 5 = Unknown; 6 = Voided trip. |
fare_amount | Float | The time-and-distance fare calculated by the meter. |
extra | Float | Miscellaneous extras and surcharges. Currently, this only includes the $0.50 and $1 rush hour and overnight charges. |
mta_tax | Float | $0.50 MTA tax that is automatically triggered based on the metered rate in use. |
tip_amount | Float | Tip amount. This field is automatically populated for credit card tips. Cash tips are not included. |
tolls_amount | Float | Total amount of all tolls paid in trip. |
total_amount | Float | The total amount charged to passengers. Does not include cash tips. |
imp_surcharge | Float | $0.30 improvement surcharge assessed on trips at the flag drop. The improvement surcharge began being levied in 2015. |
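
Several columns hold codes rather than readable values. As a minimal sketch, the lookup tables below are copied from the column descriptions above, and the hypothetical helper `decode_trip` swaps codes for labels in a row represented as a Python dict. It assumes the numeric codes documented here; any value it does not recognize is passed through unchanged rather than guessed at.

```python
# Code-to-label mappings copied from the column descriptions above.
# Keys are strings so that integer and string encodings are handled alike.
VENDORS = {
    "1": "Creative Mobile Technologies, LLC",
    "2": "VeriFone Inc",
}
RATE_CODES = {
    "1": "Standard rate",
    "2": "JFK",
    "3": "Newark",
    "4": "Nassau or Westchester",
    "5": "Negotiated fare",
    "6": "Group ride",
}
PAYMENT_TYPES = {
    "1": "Credit card",
    "2": "Cash",
    "3": "No charge",
    "4": "Dispute",
    "5": "Unknown",
    "6": "Voided trip",
}
CODE_TABLES = {
    "vendor_id": VENDORS,
    "rate_code": RATE_CODES,
    "payment_type": PAYMENT_TYPES,
}


def decode_trip(row: dict) -> dict:
    """Return a copy of `row` with known codes replaced by readable labels."""
    decoded = dict(row)  # copy so the caller's row is not modified
    for column, mapping in CODE_TABLES.items():
        if column in decoded:
            value = decoded[column]
            # Unrecognized values are kept as they are.
            decoded[column] = mapping.get(str(value), value)
    return decoded
```

For example, a row with `rate_code` 2 and `payment_type` "1" would come back labeled "JFK" and "Credit card", while all other columns are left untouched.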
TLC Green Taxi Trip Data
Table path:
`bigquery-public-data.new_york.tlc_green_trips_2013`
`bigquery-public-data.new_york.tlc_green_trips_2014`
`bigquery-public-data.new_york.tlc_green_trips_2015`
`bigquery-public-data.new_york.tlc_green_trips_2016`
Column Name | Type | Explanation |
---|---|---|
vendor_id | String | A code indicating the TPEP provider that provided the record. 1 = Creative Mobile Technologies, LLC; 2 = VeriFone Inc. |
pickup_datetime | Timestamp | The date and time when the meter was engaged. |
dropoff_datetime | Timestamp | The date and time when the meter was disengaged. |
store_and_fwd_flag | String | This flag indicates whether the trip record was held in vehicle memory before being sent to the vendor, aka “store and forward,” because the vehicle did not have a connection to the server. Y = store and forward trip; N = not a store and forward trip. |
rate_code | Integer | The final rate code in effect at the end of the trip. 1 = Standard rate; 2 = JFK; 3 = Newark; 4 = Nassau or Westchester; 5 = Negotiated fare; 6 = Group ride. |
pickup_longitude | Float | Longitude where the meter was engaged. |
pickup_latitude | Float | Latitude where the meter was engaged. |
dropoff_longitude | Float | Longitude where the meter was timed off. |
dropoff_latitude | Float | Latitude where the meter was timed off. |
passenger_count | Integer | The number of passengers in the vehicle. This is a driver-entered value. |
trip_distance | Float | The elapsed trip distance in miles reported by the taximeter. |
fare_amount | Float | The time-and-distance fare calculated by the meter. |
extra | Float | Miscellaneous extras and surcharges. Currently, this only includes the $0.50 and $1 rush hour and overnight charges. |
mta_tax | Float | $0.50 MTA tax that is automatically triggered based on the metered rate in use. |
tip_amount | Float | Tip amount. This field is automatically populated for credit card tips. Cash tips are not included. |
tolls_amount | Float | Total amount of all tolls paid in trip. |
ehail_fee | Float | |
total_amount | Float | The total amount charged to passengers. Does not include cash tips. |
payment_type | Integer | A numeric code signifying how the passenger paid for the trip. 1 = Credit card; 2 = Cash; 3 = No charge; 4 = Dispute; 5 = Unknown; 6 = Voided trip. |
distance_between_service | Float | |
time_between_service | Integer | |
trip_type | Integer | A code indicating whether the trip was a street-hail or a dispatch. It is automatically assigned based on the metered rate in use but can be altered by the driver. 1 = Street-hail; 2 = Dispatch. |
imp_surcharge | Float | $0.30 improvement surcharge assessed on hailed trips at the flag drop. The improvement surcharge began being levied in 2015. |
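
The timestamp and distance columns can be combined into derived quantities such as trip duration and average speed, which may also help spot implausible records. The following is a minimal sketch, assuming rows have already been loaded with Python `datetime` values for `pickup_datetime` and `dropoff_datetime`. The function names and the 80 mph cutoff are illustrative assumptions, not part of the data set documentation.

```python
from datetime import datetime
from typing import Optional


def trip_minutes(pickup: datetime, dropoff: datetime) -> float:
    """Elapsed time between meter engagement and disengagement, in minutes."""
    return (dropoff - pickup).total_seconds() / 60.0


def average_speed_mph(pickup: datetime, dropoff: datetime,
                      trip_distance: float) -> Optional[float]:
    """Average speed in miles per hour, or None when the duration is not positive."""
    minutes = trip_minutes(pickup, dropoff)
    if minutes <= 0:
        return None
    return trip_distance / (minutes / 60.0)


def looks_plausible(pickup: datetime, dropoff: datetime,
                    trip_distance: float, max_mph: float = 80.0) -> bool:
    """Heuristic check: positive duration, non-negative distance, speed at most max_mph.

    The default cutoff is an arbitrary illustrative value.
    """
    speed = average_speed_mph(pickup, dropoff, trip_distance)
    return speed is not None and trip_distance >= 0 and speed <= max_mph
```

A 5-mile trip that lasts 30 minutes works out to 10 mph and passes the check; a record whose drop-off time equals its pick-up time does not.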
TLC FHV Trip Data
More information on the FHV data is available in this FiveThirtyEight GitHub repository, together with Uber data from 2014.
Table path:
`bigquery-public-data.new_york.tlc_fhv_trips_2015`
`bigquery-public-data.new_york.tlc_fhv_trips_2016`
Column Name | Type | Explanation |
---|---|---|
location_id | Integer | The TLC taxi zone of the trip pick-up. |
pickup_datetime | Timestamp | The date and time of the trip pick-up. |
dispatching_base_num | String | The TLC Base License Number of the base that dispatched the trip. |
borough | String | |
zone | String | |
service_zone | String | |
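
Since these records carry only a pick-up zone and a pick-up time, a natural first summary is a count of pick-ups per zone and hour of day. The sketch below assumes rows are available as Python dicts keyed by the column names above; the helper name `pickups_by_zone_and_hour` is hypothetical, and the sample rows are invented for illustration rather than taken from the real tables.

```python
from collections import Counter
from datetime import datetime


def pickups_by_zone_and_hour(rows):
    """Count pick-ups per (location_id, hour of day) pair."""
    counts = Counter()
    for row in rows:
        key = (row["location_id"], row["pickup_datetime"].hour)
        counts[key] += 1
    return counts


# Invented sample rows for illustration only; not real trip records.
sample_rows = [
    {"location_id": 10, "pickup_datetime": datetime(2015, 3, 1, 8, 5)},
    {"location_id": 10, "pickup_datetime": datetime(2015, 3, 1, 8, 40)},
    {"location_id": 10, "pickup_datetime": datetime(2015, 3, 1, 9, 15)},
    {"location_id": 25, "pickup_datetime": datetime(2015, 3, 2, 8, 10)},
]

counts = pickups_by_zone_and_hour(sample_rows)
busiest = counts.most_common(1)[0]  # ((10, 8), 2): zone 10 at 08:00 has two pick-ups
```

With tens of millions of rows per year, this kind of aggregation is usually better pushed into the query itself; the in-memory version is meant only to show the grouping logic on a small sample.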