Phase 5 Scope of Work - ActivitySim/activitysim GitHub Wiki

Phase 5 will be performed during FY20, ending in June 2020, and includes the following tasks.

Task 1: Project Management
Task 2: Strategic Development and Contribution Plan
Task 3: Support for Two Zone Systems
Task 4: Input and Output Improvements
Task 5: Support for Modeling TNCs and For-Hire Vehicles
Task 6: Model Developer Tutorial
Task 7: Performance Tuning
Task 8: PopulationSim Integration
Task 9: Model Estimation Mode
Task 10: Support for Three Zone Systems and Transit Virtual Path Building

Task 1: Project Management

The purpose of this first task is to manage the overall project, including invoicing and conference calls with the project team, and coordination with the AMPO agency partners. All deliverables, including meeting notes, software, tests, documentation, and issue tracking will be managed through GitHub.

Deliverable(s): (Due 36 weeks from NTP)

Management of Bi-Weekly Meetings
Pre- and Post-Meeting Notes
Invoicing and Progress Reports
Client Coordination

Task 2: Strategic Development and Contribution Plan

The purpose of this task is to create a strategic development and contribution plan. The plan will provide further guidance on governance, coordination and management as the ActivitySim contributing and user group continues to grow and expand. The plan will address issues such as:

How to share / maintain expression files across regions, with an eye toward potential software enhancements under later phases of work
How to manage and prioritize the expansion of the Future Features wiki page
How members coordinate creating, responding to, and directing existing or new issues in the repo
What guidance (rule book) can be created so that new members can be easily informed on existing protocols, group etiquette, and past decisions
Improved guidance on coordinating and reviewing contributions by others, specifically contributions made by teams that are not formally ActivitySim members. The Contribution Review wiki guidelines will be incorporated into the strategy.

The consultant will solicit input from the Project Management Committee and finalize a list of key issues that the strategic development and contribution plan will address. The consultant will share a draft strategic development and contribution plan, and subsequently convene two online, two-hour workshops with agency partners to discuss and refine the plan. The consultant will then finalize the strategic development and contribution plan and update the project governance documents as necessary.

Deliverable(s): (Due 12 weeks from NTP)

Draft Strategic Development and Contribution Plan and Review Meeting
Final Strategic Development and Contribution Plan and Review Meeting
Updated governance document(s)

Task 3: Support for Two Zone Systems

The purpose of this task is to implement two-zone system network level-of-service handling into the ActivitySim framework. In two-zone system models, for example the PSRC and SFCTA models, trips are modeled from microzone/parcel-to-microzone/parcel and the network level-of-service zone system varies by mode:

Auto - zone-to-zone skims
Walk or bike - microzone-to-microzone for nearby zone pairs using path costs from a lookup table developed from an all-streets network
Transit - zone-to-zone skims except for access/egress time, which comes from the microzone/parcel file; that file includes access distance or time by transit submode for each microzone. The access distance or time can be any user-defined microzone/parcel data column (for example by transit unlabeled mode instead of line-haul mode if unlabeled modes are used).

The model system will be able to operate at the microzone or zone level, and it will be possible to index into zone-skims using the zone IDs or other zone attributes (such as microzone ID if using microzones as well).
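The mode-specific lookups described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `MICROZONE_TO_ZONE`, `AUTO_TIME_SKIM`, `WALK_TIME_NEARBY`, and `TRANSIT_ACCESS_TIME`, along with all of the numbers, are hypothetical and are not part of the ActivitySim API or of the design to be produced under this task.

```python
# Hypothetical data: four microzones nested in two zones (many-to-one).
MICROZONE_TO_ZONE = {101: 1, 102: 1, 201: 2, 202: 2}

# Zone-to-zone auto travel time skim (minutes), keyed by (origin zone, destination zone).
AUTO_TIME_SKIM = {(1, 1): 3.0, (1, 2): 12.0, (2, 1): 13.0, (2, 2): 4.0}

# Microzone-to-microzone walk times (minutes), stored only for nearby pairs.
WALK_TIME_NEARBY = {(101, 102): 6.0, (102, 101): 6.0, (201, 202): 8.0, (202, 201): 8.0}

# Transit access/egress time (minutes) carried on the microzone file.
TRANSIT_ACCESS_TIME = {101: 4.0, 102: 7.0, 201: 2.0, 202: 9.0}


def auto_time(orig_mz, dest_mz):
    """Index the zone-to-zone skim using each microzone's parent zone."""
    return AUTO_TIME_SKIM[(MICROZONE_TO_ZONE[orig_mz], MICROZONE_TO_ZONE[dest_mz])]


def walk_time(orig_mz, dest_mz):
    """Return the microzone-level walk time, or None when the pair is not nearby."""
    return WALK_TIME_NEARBY.get((orig_mz, dest_mz))


def transit_time(orig_mz, dest_mz, transit_ivt_skim):
    """Combine zone-to-zone in-vehicle time with microzone access and egress times."""
    zone_pair = (MICROZONE_TO_ZONE[orig_mz], MICROZONE_TO_ZONE[dest_mz])
    return (TRANSIT_ACCESS_TIME[orig_mz]
            + transit_ivt_skim[zone_pair]
            + TRANSIT_ACCESS_TIME[dest_mz])
```

The point of the sketch is that trips are always identified by microzone, while each mode decides whether to translate the microzone IDs to zone IDs before indexing its level-of-service data.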

Consultant will draft a memo on software design to support two zone systems. This includes specifying how a user configures and makes use of two zone system data in the form of the settings (such as num_zone_systems = 1 or 2) and expression files, as well as revisions to the skims management system to support two zone systems. The design will be based on the existing MTC, SANDAG, and ODOT CT-RAMP designs and the ActivitySim prototype already implemented for SANDAG. Consultant will share the design with the project team and finalize the design based on comments.

Consultant will copy the existing TM1 example and create a new example with input data revisions for testing. Consultant will switch the existing zones to microzones, split a few microzones to create a many-to-one microzone-to-zone relationship for testing, and revise microzone input data such as land use and transit access and egress distances. This new example is expected to produce similar results for verification.

Consultant will integrate the prototyped two zone system code developed for SANDAG into the activitysim code base and update all the submodels to use the new code. Consultant will update all the submodel expression files to make use of the revised two zone system input data. Consultant will update the multiprocessing shared data structures so two zone systems work for multiprocessing. Consultant will add logging, tracing, and inline code documentation. Consultant will run the full scale example and summarize results to ensure the results are correct. Consultant will share results with the project team. Consultant will add new two zone specific test methods to test the new code. Consultant will update the user documentation and release an updated version of the package on PyPI.

Deliverable(s): (Due 36 weeks from NTP)

Final Design Documented in the Wiki
Updated Code with Support for One and Two Zone System Models
Additional Full Scale Two Zone Example Test Setup
Updated Documentation, Verification of Results, and Tests

Task 4: Input and Output Improvements

The purpose of this task is to make it easier and less error-prone to implement ActivitySim and to consume ActivitySim outputs. The consultant will work with the ActivitySim Project Management Committee (PMC) to identify the list of features to be added. The final set of features implemented will not exceed the task budget. The planned set of input and output improvements will:

Allow users to specify separate input CSV tables or an HDF5 file in the initialize step
Provide automatic generation of simpler and more typical activity-based model output files such as the final household, person, tour, and trip files.
Include table indexes as columns in the output CSV tables to make it easier to join the tables later
Whenever a logit model is solved, add the ability to get the logsum in addition to the choice
Add fields to the output tables including:
- A sequential trip id in time and space by person in order to sort trips in chronological order for reporting and dynamic traffic assignment (DTA)
- Key travel data such as time, distance, and cost, which will be added to the tour and trip output tables via annotation expressions
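For the logsum item in the list above, the following minimal sketch shows how a multinomial logit solve can return the logsum alongside the choice. The function name `logit_choice_with_logsum` and its arguments are hypothetical and chosen for illustration; this is not ActivitySim's implementation.

```python
import math
import random


def logit_choice_with_logsum(utilities, rng):
    """Return (chosen alternative, logsum) for a multinomial logit model.

    utilities: dict mapping alternative name -> systematic utility.
    rng: random.Random instance, so that draws are reproducible.
    """
    max_u = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {alt: math.exp(u - max_u) for alt, u in utilities.items()}
    total = sum(exp_u.values())
    logsum = max_u + math.log(total)  # equals ln(sum of exp(utility))

    # Draw one alternative with probability exp(u) / total.
    draw = rng.random() * total
    cumulative = 0.0
    for alt, value in exp_u.items():
        cumulative += value
        if draw < cumulative:
            return alt, logsum
    return alt, logsum  # guard against floating-point round-off


choice, logsum = logit_choice_with_logsum(
    {"drive": -1.0, "transit": -1.5, "walk": -3.0}, random.Random(0)
)
```

Because the logsum falls out of the same exponentiated utilities used to compute the choice probabilities, returning it adds essentially no extra computation.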

Deliverable(s): (Due 18 weeks from NTP)

Final Task Plan in the Wiki
Updated Software
Full Scale Example Test Setup
Updated Documentation and Tests

Task 5: Support for Modeling TNCs and For-Hire Vehicles

The purpose of this task is to provide ActivitySim with sensitivity to the availability of Transportation Network Companies (TNCs) and other for-hire vehicle alternatives. The consultant will review and summarize the existing state-of-the-practice and propose methods for enhancing ActivitySim. These enhancements may involve inclusion of TNCs and other for-hire vehicles in the tour and trip mode choice models and inclusion of TNC availability in upper-level models such as auto ownership. The consultant will propose to the PMC a recommended approach for completing this task, and develop all data required for implementation, such as in-vehicle time and cost coefficients, and TNC wait times. The consultant will implement and test the changes, and will validate the revised code base against available Bay Area data provided by the PMC. Finally, the consultant will document all changes and results on the project GitHub site and wiki.

Deliverable(s): (Due 18 weeks from NTP)

Final Task Plan in the Wiki
Revised code base that includes sensitivity to TNCs and for-hire vehicles
Full Scale Example Test Setup
Verification of Results
Updated Documentation and Tests

Task 6: Model Developer Tutorial

The purpose of this task is to add a tutorial on setting up, running, and analyzing the results of ActivitySim modeling scenarios. Users of ActivitySim are expected to be familiar with the basic concepts of activity-based modeling, so those concepts will not be covered in the tutorial. The tutorial will include example data for the user to work with. The draft tutorial outline is below and will be finalized with the PMC before being implemented.

What are the inputs, including data column descriptions for each table, to ActivitySim?
What are the outputs, including data column descriptions for each table, from ActivitySim?
Setting up and running a base model
Setting up and running an alternative scenario
Comparing results
Next steps and further reading

Deliverable(s): (Due 24 weeks from NTP)

Final Tutorial Outline in the Wiki
Tutorial Added to the User Guide
Tutorial Example Test Setup
Updated Documentation and Tests

Task 7: Performance Tuning

Fast, memory-efficient activity-based model runs are critical for model relevance in the transportation planning process. The purpose of this task is to tune runtime performance by investigating and implementing strategies to reduce runtimes and memory usage across a range of hardware and deployment environments. The task includes four steps:

Profiling of the single and multi-threaded implementation to identify potential issues and areas of improvement
Investigating machine-tuning optimizations, such as the settings used with the Intel MKL, which is used by the Anaconda distribution of pandas and numpy
Implementing improvements to the existing code, such as possibly replacing costly string operations with faster categorical data operations
Documenting code updates

The expected improvements in runtime are unknown at this time and will be better understood after completing the profiling exercise and experimenting with some ideas for improvements.
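As a small illustration of the profiling and string-replacement steps above, the sketch below profiles two ways of counting records by mode using Python's standard `cProfile` and `pstats` modules: one works on string labels directly and the other on integer codes, in the spirit of a categorical column. The function names and data are hypothetical, and the sketch makes no claim about which variant is faster in ActivitySim; that is what the profiling exercise would measure.

```python
import cProfile
import io
import pstats


def count_by_label(labels):
    """Count records per string label (hashes a string on every row)."""
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return counts


def encode(labels):
    """Encode strings once as integer codes, like a categorical column."""
    categories = sorted(set(labels))
    code_of = {cat: i for i, cat in enumerate(categories)}
    return categories, [code_of[label] for label in labels]


def count_by_code(categories, codes):
    """Count records per integer code, then map the codes back to labels."""
    counts = [0] * len(categories)
    for code in codes:
        counts[code] += 1
    return dict(zip(categories, counts))


def profile(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


modes = ["DRIVE_ALONE", "WALK", "TRANSIT", "WALK"] * 1000
by_label, label_report = profile(count_by_label, modes)
categories, codes = encode(modes)
by_code, code_report = profile(count_by_code, categories, codes)
```

Printing `label_report` and `code_report` shows the call counts and cumulative times for each variant, which is the kind of evidence the task plan would rely on before committing to a code change.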

Deliverable(s): (Due 36 weeks from NTP)

Final Task Plan in the Wiki
Improved Source Code
Full Scale Example Test Setup
Updated Documentation and Tests

Task 8: PopulationSim Integration

The purpose of this task is to formally integrate PopulationSim into the consortium in order to maintain the tool moving forward. Formal integration means transferring the repository to ActivitySim's GitHub account, updating the PopulationSim user guide, updating activitysim.org, and updating any other resources such as the project wiki. PopulationSim will remain a separate repository that depends on the activitysim package. The existing ActivitySim repository, which currently contains submodules for core and abm, will remain intact. The final plan for migration of PopulationSim will be decided in cooperation with the PMC.

Deliverable(s): (Due 18 weeks from NTP)

Final Migration Plan in the Wiki
Updated Code, Test Setup, Documentation, and Website Resources

Task 9: Model Estimation Mode

The purpose of this task is to provide ActivitySim with the ability to write out all files required to estimate multinomial and nested logit models using model estimation tool(s) identified by the PMC. ActivitySim will write an estimation data bundle consisting of the following components:

Chooser table with all chooser data for each sub-model, such as household, person, and TAZ data.
Alternatives table with available alternatives for each chooser for each sub-model. An example includes time windows for tour departure and duration choice.
Utilities table with all available attribute data for each alternative in order to construct and modify utilities during estimation.

Users will have the ability to code model specifications and utility expressions within the ActivitySim framework so as to facilitate ease of use and eliminate inconsistencies and errors between the code used to estimate the models and the code used to apply the models.

The consultant will revise each submodel to support a MODEL_ESTIMATION_MODE = True/False parameter that does the following:

Reads an observed travel survey database in the activitysim pipeline datastore format
Writes the estimation data bundle described above for every household in the observed data. This is similar to the existing trace functionality except that the trace data is better formatted.
Does not make choices, but instead leaves the choices intact in the observed database so the next submodel can create estimation data based on observed choices.
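The behavior listed above can be sketched as a single branch inside a submodel. This is a minimal illustration under stated assumptions: the function `simulate_or_observe`, its arguments, and the row layout are hypothetical, and only the idea of a `MODEL_ESTIMATION_MODE` switch comes from this scope.

```python
import random


def simulate_or_observe(choosers, probabilities, observed_choices, estimation_mode, rng):
    """Return one choice per chooser plus rows for an estimation data bundle.

    choosers: dict of chooser id -> dict of chooser attributes.
    probabilities: dict of chooser id -> dict of alternative -> probability.
    observed_choices: dict of chooser id -> surveyed choice (estimation mode only).
    estimation_mode: stands in for a MODEL_ESTIMATION_MODE = True/False setting.
    """
    choices, bundle_rows = {}, []
    for chooser_id, attributes in choosers.items():
        alternatives = list(probabilities[chooser_id])
        if estimation_mode:
            # Keep the observed choice so the next submodel sees survey behavior.
            choice = observed_choices[chooser_id]
            bundle_rows.append({
                "chooser_id": chooser_id,
                **attributes,
                "alternatives": alternatives,
                "observed_choice": choice,
            })
        else:
            # Normal application mode: simulate a choice from the probabilities.
            weights = [probabilities[chooser_id][alt] for alt in alternatives]
            choice = rng.choices(alternatives, weights=weights, k=1)[0]
        choices[chooser_id] = choice
    return choices, bundle_rows
```

In estimation mode the returned choices are exactly the observed ones, so downstream submodels build their own estimation data conditional on surveyed behavior rather than on simulated outcomes.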

The consultant will also revise existing expression files to ensure separation between coefficients and data so coefficients are isolated in order to be easily updated during estimation. This includes revising all example expression files for all submodels and ensuring the same model results via the test system.
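One way to picture the separation between coefficients and data is a specification that refers to coefficients only by name, with the values held in a separate table. The sketch below is an assumption-laden illustration: the names `COEFFICIENTS`, `SPEC`, `utility`, and `with_estimates`, and all values, are hypothetical and do not describe the file layout ActivitySim will adopt.

```python
# Coefficients live in their own table, keyed by name (values are made up).
COEFFICIENTS = {"coef_ivt": -0.025, "coef_cost": -0.10, "coef_walk_time": -0.05}

# The specification pairs a data column with a coefficient name, never a number.
SPEC = [
    ("in_vehicle_time", "coef_ivt"),
    ("cost", "coef_cost"),
    ("walk_time", "coef_walk_time"),
]


def utility(row, spec, coefficients):
    """Sum coefficient * data over every term in the specification."""
    return sum(coefficients[name] * row[column] for column, name in spec)


def with_estimates(coefficients, estimates):
    """Return a new coefficient table with re-estimated values swapped in."""
    unknown = set(estimates) - set(coefficients)
    if unknown:
        raise KeyError(f"estimates name unknown coefficients: {sorted(unknown)}")
    return {**coefficients, **estimates}


trip = {"in_vehicle_time": 20.0, "cost": 2.0, "walk_time": 10.0}
base_utility = utility(trip, SPEC, COEFFICIENTS)
updated = with_estimates(COEFFICIENTS, {"coef_cost": -0.20})
updated_utility = utility(trip, SPEC, updated)
```

With this layout, an estimation tool only needs to hand back a name-to-value mapping; the specification itself is untouched, which keeps the estimated and applied models consistent.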

The consultant will inventory available open-source model estimation tools, such as Larch, Biogeme, pylogit, and choicemodels (specifically for fast estimation runtimes for destination choice estimation since this is a known issue with some alternative packages), evaluate their capabilities and limitations, including estimation runtimes, and document the findings. The consultant will work with the PMC to identify preferred model estimation tools for two core activitysim models, nested work tour mode choice and work destination choice. The consultant will create example nested work tour mode choice and work destination choice estimation scripts that take the estimation data bundle as input, prepare estimation configuration / control file(s) for the preferred model estimation tool from ActivitySim model specifications, estimate the models, and convert the estimation tool outputs such as coefficients into the form required by ActivitySim.

The consultant will use the recently completed MTC/SFCTA travel diary survey or the Bay Area component of the 2017 National Household Travel Survey (NHTS) as the observed travel survey database for verification. The PMC will provide the consultant with the observed travel diary survey data formatted to match the existing activitysim format, including all key input tables and fields for households, persons, tours, and trips. If the PMC does not provide this data, a sample of existing activitysim output will be used for development.

Deliverable(s): (Due 24 weeks from NTP)

Revised ActivitySim code base that includes Model Estimation Mode
Updates to the Documentation and Tests
Create and Document Example Nested Tour Mode Choice and Work Destination Choice Estimation Scripts with the Selected Estimation Package that provide users with a seamless pipeline from coding model specifications in ActivitySim through integrating estimation outputs back into ActivitySim
Verification of Results

Task 10: Support for Three Zone Systems and Transit Virtual Path Building

The purpose of this task is to add support for three zone systems and transit virtual path building (TVPB). Three-zone system models, for example the SANDAG, ODOT, and MTC TM2 models, read stop-to-stop skims, a set of nearby stops for each microzone, and microzone to stop network impedances, in order to build microzone-to-microzone transit impedance through the best pair of access and egress stops by market segment. The model system will be able to operate at the microzone or zone level, and it will be possible to index into zone-skims using the zone IDs or other zone attributes (such as transit stop ID if using transit stop skims as well). Output transit tours and trips will include boarding and alighting stop in addition to origin and destination zone.

Consultant will draft a memo on software design to support three zone systems and TVPB. This includes specifying how a user configures and makes use of three zone system data and TVPB in the form of the settings (such as num_zone_systems = 1 or 2 or 3) and expression files, revisions to the skims management system to support three zone systems, transit virtual path building, and runtime expectations. The design will be based on the existing MTC, SANDAG, and ODOT CT-RAMP designs and the ActivitySim prototype already implemented for SANDAG. Previous implementations of TVPB cache calculated path leg utilities (access utility, stop-to-stop utility, egress utility) by market segment and select the N best paths for mode choice to save runtime. The prototype developed earlier did not include caching or N best path selection. The activitysim multiprocessing setup uses a read-only shared data structure, so caching is non-trivial. The planned implementation of TVPB will instead pre-compute path leg utilities by market segment, so the shared data structure for multiprocessing remains read-only, and use pre-computed N best paths by market segment for mode choice. Consultant will share the design with the project team and finalize the design based on comments.
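The pre-computation idea described above can be sketched for a single market segment as follows. This is a hedged illustration only: the function `precompute_best_paths`, the dictionary layouts, and the utility values are hypothetical, and the read-only wrapper merely stands in for the shared data structure used in activitysim multiprocessing.

```python
import heapq
from types import MappingProxyType


def precompute_best_paths(access_utils, stop_to_stop_utils, egress_utils, n_best):
    """Return the n_best (utility, boarding stop, alighting stop) per microzone pair.

    access_utils: dict of origin microzone -> {boarding stop: access utility}
    stop_to_stop_utils: dict of (boarding stop, alighting stop) -> utility
    egress_utils: dict of destination microzone -> {alighting stop: egress utility}
    """
    best_paths = {}
    for orig, boardings in access_utils.items():
        for dest, alightings in egress_utils.items():
            # Total path utility is the sum of the three leg utilities.
            candidates = [
                (acc + stop_to_stop_utils[(board, alight)] + egr, board, alight)
                for board, acc in boardings.items()
                for alight, egr in alightings.items()
                if (board, alight) in stop_to_stop_utils
            ]
            best_paths[(orig, dest)] = tuple(heapq.nlargest(n_best, candidates))
    # Expose a read-only view so workers can look up paths but not modify them.
    return MappingProxyType(best_paths)


access = {101: {"A": -1.0, "B": -2.0}}
egress = {202: {"C": -0.5, "D": -1.5}}
stop_to_stop = {("A", "C"): -3.0, ("A", "D"): -1.0, ("B", "C"): -0.5}
paths = precompute_best_paths(access, stop_to_stop, egress, n_best=2)
```

Running the function once per market segment before the workers start would let mode choice read the N best paths without writing to shared memory, at the cost of computing paths for pairs that may never be requested.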

SANDAG will provide inputs for a three-zone system model setup in activitysim format that will be used for development and testing. The inputs include network level-of-service data, synthetic population, and land use data. Consultant will develop a new example that works with the provided data and makes use of three zones and TVPB. Starting from a code base that includes support for two zone systems, Consultant will integrate the prototyped three zone system and TVPB code developed for SANDAG into the activitysim code base and create a SANDAG example to use the new code. The TVPB will pre-compute path leg utilities by market segment and make them accessible to submodels. Consultant will set up submodel expression files to make use of the revised three zone system input data and the new TVPB. Consultant will update the multiprocessing shared data structures so three zone systems and TVPB work for multiprocessing. Consultant will add logging, tracing, and inline code documentation.

Consultant will run the new example and summarize results to ensure the results are correct and runtimes are reasonable. Consultant will share results with the project team. Consultant will add new three zone system and TVPB specific test methods to test the new code. Consultant will update the user documentation and release an updated version of the package on PyPI.

Deliverable(s): (Due 36 weeks from NTP)

Final Design Documented in the Wiki
Updated Code with Support for One, Two, or Three Zone System Models
Updated Code including Transit Virtual Path Builder
Additional New Three Zone + TVPB Example Test Setup
Updated Documentation, Verification of Results, and Tests
