Doc: Citibike - wimlds/smart_cities GitHub Wiki

CitiBike provides trip and station data. The trip data goes back to 2013 and records trip information (locations, times) and user information (gender, birth year). You can read about their methodology here. CitiBike removes trips shorter than 60 seconds and staff test trips before publishing. The original source provides a host of tables; these have been aggregated into the single trips table detailed below.

CitiBike also publishes real-time station data in the General Bikeshare Feed Specification (GBFS) format. The stations table you are working with aggregates data from the specification files `station_information` and `station_status`, and also includes vendor-specific fields, indicated by the prefix `eightd`.
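To make the aggregation concrete, here is a minimal sketch of how records from the two feeds could be combined on `station_id`. It is not how the BigQuery table was actually built; the function name `merge_station_feeds` and the sample records are made up for illustration.

```python
def merge_station_feeds(information, status):
    """Join station_information and station_status records on station_id."""
    status_by_id = {rec["station_id"]: rec for rec in status}
    merged = []
    for info in information:
        stat = status_by_id.get(info["station_id"])
        if stat is None:
            continue  # skip stations with no status record (an inner join)
        merged.append({**info, **stat})  # one flat record per station
    return merged


# Made-up sample records, shaped like the two feeds.
information = [
    {"station_id": 1, "name": "Example St", "capacity": 20},
    {"station_id": 2, "name": "Sample Ave", "capacity": 30},
]
status = [{"station_id": 1, "num_bikes_available": 5, "is_renting": True}]

stations = merge_station_feeds(information, status)
```

Station 2 is dropped here because it has no status record; a left join that keeps it with missing status fields would be an equally reasonable choice.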

A blog post covering some basic analyses can be found here.

Table Paths:

  • `bigquery-public-data.new_york.citibike_trips`
    • Shape: 33319019 rows, 15 columns
  • `bigquery-public-data.new_york.citibike_stations`
    • Shape: 664 rows, 18 columns
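
As a rough sketch of how these tables might be queried from Python: the helper below only builds a Standard SQL string against the trips table path listed above. The function names and the choice to group by `usertype` are illustrative assumptions. Actually running the query assumes the `google-cloud-bigquery` package is installed and that a GCP project with credentials is configured, neither of which this page covers.

```python
TRIPS_TABLE = "bigquery-public-data.new_york.citibike_trips"


def build_trip_count_query(table: str = TRIPS_TABLE, limit: int = 10) -> str:
    """Return a Standard SQL query counting trips per user type (illustrative)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT usertype, COUNT(*) AS n_trips, AVG(tripduration) AS avg_seconds\n"
        f"FROM `{table}`\n"
        "GROUP BY usertype\n"
        "ORDER BY n_trips DESC\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str):
    """Run the query; assumes google-cloud-bigquery is installed and configured."""
    # Imported lazily so the query builder works without the package.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    print(build_trip_count_query())
```

Keeping the query builder separate from `run_query` means the SQL can be inspected or pasted into the BigQuery console without any Python dependencies.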

Trips

| Column Name | Explanation |
| --- | --- |
| tripduration | integer, in seconds |
| starttime | datetime |
| stoptime | datetime |
| start_station_id | integer |
| start_station_name | string |
| start_station_latitude | float |
| start_station_longitude | float |
| end_station_id | integer |
| end_station_name | string |
| end_station_latitude | float |
| end_station_longitude | float |
| bikeid | integer |
| usertype | string (customer = 24-hour or 7-day pass, subscriber = annual member) |
| birth_year | float |
| gender | string: female, male, or unknown |
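
The trips columns above are enough to derive simple features. As a hedged sketch, the code below estimates a straight-line (great-circle) distance between the start and end stations and an average speed from `tripduration`. The helper names and the sample row are hypothetical, and the straight-line distance understates the real route length.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_speed_kmh(trip):
    """Straight-line average speed for one trip row given as a dict.

    Returns None for round trips (same start and end station), where the
    straight-line distance is zero and says nothing about the ride.
    """
    if trip["start_station_id"] == trip["end_station_id"]:
        return None
    km = haversine_km(
        trip["start_station_latitude"], trip["start_station_longitude"],
        trip["end_station_latitude"], trip["end_station_longitude"],
    )
    hours = trip["tripduration"] / 3600  # tripduration is in seconds
    return km / hours


# Hypothetical row; the values are made up for illustration.
sample_trip = {
    "tripduration": 900,
    "start_station_id": 1, "end_station_id": 2,
    "start_station_latitude": 40.75, "start_station_longitude": -73.99,
    "end_station_latitude": 40.73, "end_station_longitude": -74.00,
}
speed = trip_speed_kmh(sample_trip)
```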

Stations

| Column Name | Explanation |
| --- | --- |
| station_id | integer, unique identifier of a station |
| name | string, public name of the station |
| short_name | string, short name or other type of identifier as used by the data publisher |
| latitude | float, WGS 84 in decimal degrees |
| longitude | float, WGS 84 in decimal degrees |
| region_id | integer, ID of the region where the station is located; there are 2 regions, 70 and 71 |
| rental_methods | string, array of enumerables containing payment methods; all stations are KEY,CREDITCARD |
| capacity | integer, total number of docking points, both available and unavailable |
| num_bikes_available | integer, number of bikes available for rental |
| num_bikes_disabled | integer, number of disabled bikes |
| num_docks_available | integer, number of docks accepting bike returns |
| num_docks_disabled | integer, number of empty but disabled docks |
| is_installed | bool, whether the station is currently on the street |
| is_renting | bool, whether the station is currently renting bikes (even if it is empty and no bikes are available) |
| is_returning | bool, whether the station is accepting bike returns (even if it is full and can't accept bikes, but would if not full) |
| last_reported | integer, POSIX timestamp of the last time this station reported its status to the backend |
| eightd_has_available_keys | bool, whether there are bike keys remaining at the station |
| eightd_has_key_dispenser | bool, whether there is a bike key dispenser at the station |
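
A few of the station columns are easier to work with after light post-processing. The sketch below converts `last_reported` from a POSIX timestamp to a UTC datetime and derives a fill ratio and two availability flags. The function name `summarize_station`, the derived field names, and the sample row are assumptions made for illustration, not part of the table.

```python
from datetime import datetime, timezone


def summarize_station(row):
    """Derive a few readable values from one stations row given as a dict."""
    capacity = row["capacity"]
    return {
        "station_id": row["station_id"],
        # last_reported is a POSIX timestamp (seconds since the epoch, UTC)
        "last_reported_utc": datetime.fromtimestamp(row["last_reported"], tz=timezone.utc),
        # share of docking points holding a rentable bike; None if capacity is 0
        "fill_ratio": row["num_bikes_available"] / capacity if capacity else None,
        # renting needs an installed, renting station with at least one bike
        "can_rent": bool(
            row["is_installed"] and row["is_renting"] and row["num_bikes_available"] > 0
        ),
        # returning needs an installed, returning station with a free dock
        "can_return": bool(
            row["is_installed"] and row["is_returning"] and row["num_docks_available"] > 0
        ),
    }


# Hypothetical row; the values are made up for illustration.
sample_station = {
    "station_id": 72, "capacity": 20, "last_reported": 0,
    "num_bikes_available": 5, "num_docks_available": 0,
    "is_installed": True, "is_renting": True, "is_returning": True,
}
summary = summarize_station(sample_station)
```

Note that `is_renting` and `is_returning` alone do not say whether a bike or dock is actually free, which is why the flags above also check the counts.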