Rulesy - psrc/shiny-fixie GitHub Wiki

Rulesy cleans data programmatically, performs trip linking, and preps tables for Shiny-Fixie.

  • complete documentation on Rulesy (Issue #35)

Processes

  1. Prep trip table

    • Create a trip table for data cleaning that includes only the variables relevant to cleaning
    • Development tracked in Issue #25
    • Processes: rulesy_setup_triptable.sql
      • (Add additional fields, including geometry; create indices)
      • (Determine legitimate home and standard work location)
  2. Procedure to recalculate derived fields (used in both Rulesy and Fixie)

    • Run HHSurvey.recalculate_after_edit
  3. Data Corrections

    • Revise travelers count to reflect passengers
    • Classify origin purpose when ‘other’ or missing
    • Classify destination purposes
  4. Trip Linking

  5. Mode number standardization, including access and egress characterization

    • Eliminate repeated values for modes, transit_systems, and transit_lines
    • Characterize access and egress trips, separately for 1) transit trips and 2) auto trips. (Bike/Ped trips have no access/egress)
    • Remove access/egress modes from 1) transit and 2) auto trip strings, not only at the ends but also in the middle.
    • Split the concatenated field into separate mode fields (same for transit systems & transit lines)
  6. Harmonize trips where possible

    • Add trips for non-reporting cotravelers
    • Add missing trips between destinations
    • Remove duplicates
    • Recode driver flag when mistakenly applied to passengers and a hh driver is present
    • Recode work purpose when mistakenly applied to passengers and a hh worker is present
    • Add trips where destinations are more than 500 m apart
  7. Revise travel times for excessive speed trips

    • Adjust times (preferentially departure, but also arrival if necessary) using results from the Bing Travel Matrix API
    • Change mode from non-vehicular to vehicular if the travel window is too narrow and the vehicular travel time approximates the reported time
  8. Flag inconsistencies for further scrutiny
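
The mode standardization in step 5 could be sketched roughly as follows. This is an illustration in Python, not the Rulesy SQL itself: the function names, the output field names (access, egress, mode_1 to mode_4), the comma separator, and the text mode codes are all assumptions, and the order of operations is chosen for the sketch's convenience.

```python
def dedupe(values):
    """Drop repeated values while keeping first-seen order."""
    return list(dict.fromkeys(values))


def standardize_modes(mode_string, main_modes, n_fields=4, sep=","):
    """Split a concatenated mode string into access, egress, and mode fields.

    main_modes is a hypothetical set of codes that count as the trip's
    main modes, e.g. transit codes for a transit trip.
    """
    tokens = [t.strip() for t in mode_string.split(sep) if t.strip()]
    main_positions = [i for i, t in enumerate(tokens) if t in main_modes]

    if not main_positions:
        # No transit/auto mode present (e.g. bike/ped): no access or egress.
        access, egress, kept = None, None, dedupe(tokens)
    else:
        first, last = main_positions[0], main_positions[-1]
        # Access is the mode just before the first main mode,
        # egress the mode just after the last one.
        access = tokens[first - 1] if first > 0 else None
        egress = tokens[last + 1] if last + 1 < len(tokens) else None
        # Keeping only main modes also removes access/egress-type modes
        # that sit in the middle of the string.
        kept = dedupe(t for t in tokens if t in main_modes)

    # Pad (or truncate) to a fixed number of separate mode fields.
    padded = (kept + [None] * n_fields)[:n_fields]
    row = {"access": access, "egress": egress}
    row.update({f"mode_{i + 1}": m for i, m in enumerate(padded)})
    return row
```

For example, `standardize_modes("walk,bus,bus,walk,rail,walk", {"bus", "rail"})` yields walk for both access and egress, with bus and rail in the first two mode fields. The same approach would apply to the transit systems and transit lines strings.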

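The time revision in step 7 might look something like the sketch below. It assumes an estimated travel time in minutes has already been obtained from a routing service and does not call any API. The function names and the earliest_depart bound (for instance, the arrival time of the previous trip) are hypothetical and not taken from the Rulesy code.

```python
from datetime import datetime, timedelta


def implied_speed_mph(distance_miles, depart, arrive):
    """Speed implied by the reported distance and times."""
    hours = (arrive - depart).total_seconds() / 3600
    return float("inf") if hours <= 0 else distance_miles / hours


def revise_times(depart, arrive, est_minutes, earliest_depart):
    """Widen the travel window to fit the estimated travel time.

    Departure is moved earlier first; arrival is moved later only if
    departure cannot move far enough (it may not precede earliest_depart).
    """
    needed = timedelta(minutes=est_minutes)
    if arrive - depart >= needed:
        return depart, arrive  # window already plausible

    new_depart = max(arrive - needed, earliest_depart)
    new_arrive = arrive
    if new_arrive - new_depart < needed:
        # Departure alone was not enough, so push arrival later as well.
        new_arrive = new_depart + needed
    return new_depart, new_arrive
```

A trip would first be screened with `implied_speed_mph` against a mode-specific speed threshold (the threshold values are not given here), and only trips above it would be passed to `revise_times`.
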
Steps to get Rulesy ready for data cleaning

  • Stored procedures for the 6 main Shiny-Fixie functions and the procedures they depend on
    • [completed] Dismiss Flag (check into source control and push changes to GitHub)
    • [next] Add Trip
    • [next] Delete Trip
    • [if we have time] Trip linking and trip unlinking
    • [hold off] Split from traces (not sure whether the data structure is the same across years)
  • Other Rulesy components:
    • Check the trip linking process with the modeling team