Data - aclong/dummy_data_linkage GitHub Wiki

This process links together a few key datasets. The central one is the set of smart card transactions, as provided by a regional transport provider.

In this process the data are stored in PostgreSQL databases, where each of the variables in the lists below is a separate column.

Transaction Data:

  • Machine ID
    • The unique identifier for the ticket machine that the transaction is made on.
  • Operator Code
    • The code of the bus operator. Combined with the route number, it provides a unique identifier for each bus route in the West Midlands area.
  • Route Number
    • Route number of the bus (as seen on the front of the bus).
  • Fare Stage
    • A value that appears to be used for pricing tickets. The fare stage increases in one direction along a route and decreases in the other.
  • Transaction Datetime
    • The date and time of the transaction, precise to one minute.
  • Card ID
    • ID for the individual card.
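
As a rough illustration of how one transaction row and its route identifier might be represented, here is a minimal Python sketch. The field names, the Python types (for example an integer fare stage) and the sample values are assumptions made for the example; they are not taken from the actual database schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Transaction:
    """One smart card transaction row (field names and types are illustrative)."""

    machine_id: str
    operator_code: str
    route_number: str
    fare_stage: int  # assumed to be numeric, since it goes up or down along a route
    transaction_datetime: datetime  # precise to one minute
    card_id: str

    @property
    def route_key(self) -> Tuple[str, str]:
        # Operator code and route number together identify a bus route;
        # the route number alone may be reused by another operator.
        return (self.operator_code, self.route_number)


# Hypothetical sample rows: same route number, different operators.
tx_a = Transaction("M001", "OPA", "11A", 7, datetime(2020, 1, 6, 8, 15), "C123")
tx_b = Transaction("M002", "OPB", "11A", 3, datetime(2020, 1, 6, 8, 16), "C456")
```

Here `tx_a.route_key` and `tx_b.route_key` differ even though both rows share the route number `11A`, which is why the two columns are used together when matching transactions to routes.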

Timetable Data:

  • Operator Code
    • See above.
  • Route Number
    • See above.
  • Timetable ID
    • The ID of this particular service of the day: a combination of letters and numbers that depends on the operator's system.
  • Journey Scheduled
    • Time at which the particular bus service is scheduled to begin.
  • Arrive
    • Time that the service arrives at the stop.
  • Depart
    • Time that the service leaves the stop (seemingly identical to Arrive).
  • Stop Sequence Number
    • The position of the stop in the sequence of the route, from 1 up to n.
  • Direction
    • The direction of the service along the route. "In" or "Out".
  • Naptan Code
    • The code of the particular bus stop.
  • Days of Week Active
    • A 7-digit code in which each digit is 1 or 0, marking the days of the week on which the service runs. For example, 0000011 for weekends and 1111100 for weekdays.
  • Start Date
    • The date from which this timetable is valid.
  • Last Date
    • The date on which this timetable ends.
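
The Days of Week Active code and the Start Date / Last Date pair together decide whether a timetable row applies on a given day. The following Python sketch shows one way to check this. It assumes the 7 digits run Monday to Sunday (consistent with the examples 1111100 for weekdays and 0000011 for weekends) and that both dates are inclusive; the function name `is_active_on` and its parameters are hypothetical.

```python
from datetime import date


def is_active_on(days_of_week_active: str, start_date: date,
                 last_date: date, day: date) -> bool:
    """Return True if a timetable row applies on `day`.

    Assumptions: the code is ordered Monday..Sunday, and the
    start and last dates are both inclusive.
    """
    if len(days_of_week_active) != 7 or set(days_of_week_active) - {"0", "1"}:
        raise ValueError(
            f"expected seven 0/1 digits, got {days_of_week_active!r}"
        )
    # Outside the validity window the row never applies.
    if not (start_date <= day <= last_date):
        return False
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return days_of_week_active[day.weekday()] == "1"


# 6 January 2020 was a Monday; 4 January 2020 was a Saturday.
weekday_service = is_active_on(
    "1111100", date(2020, 1, 1), date(2020, 12, 31), date(2020, 1, 6)
)
```

Note that the code is handled as text here. If it were stored as an integer, leading zeros would be lost (0000011 would become 11), so a text column is the safer choice for this kind of value.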