Project Meeting 2020.12.01 - ActivitySim/activitysim GitHub Wiki

Technical Call

TVPB update
- Update from Jeff Doyle on transit virtual path building performance improvements
- TVPB pre-computer and mode choice are now running multiprocessed on Windows
- Complete model runs of the existing TM1 example and the toy 2-zone and toy 3-zone examples are also running on Windows
- Below are some settings and results for discussion
- Note: the previous single-threaded runtime, with on-demand (redo) calculations, was 3 hours
- We don't have a comparable runtime for a 775k work tours mode choice from Marin TM2
- Results confirmed to be the same as before the multiprocessing work
- Now working on verification summaries, cleaning up the examples for distribution, and user documentation

set MKL_NUM_THREADS=1

python simulation.py -c configs_3_zone_marin_full -c configs_3_zone_marin -c configs -d data_3_marin_full -o output_3_marin_full -s settings_mp.yaml

begin: initialize_tvpb
num_processes: 20
chunk_size: 2376277344 # num_taps * num_taps * rowsize / desired_num_chunks = 2376277344

begin: tour_mode_choice_simulate
num_processes: 32
chunk_size: 0

Time to execute run_sub_simulations step mp_tvpb : 670.767 seconds (11.2 minutes)
Time to execute run_sub_simulations step mp_mode_choice : 327.657 seconds (5.5 minutes)
Time to execute all models : 1333.193 seconds (22.2 minutes)
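
The comment on the `chunk_size` setting above gives the sizing rule as num_taps * num_taps * rowsize / desired_num_chunks. A minimal sketch of that arithmetic follows. The function name and the sample inputs are hypothetical and chosen only for illustration; they are not the Marin values, and rounding down to an integer is an assumption.

```python
def tvpb_chunk_size(num_taps, rowsize, desired_num_chunks):
    """Apply the sizing rule from the settings comment (hypothetical helper)."""
    if desired_num_chunks < 1:
        raise ValueError("desired_num_chunks must be at least 1")
    # Every tap-to-tap pair contributes `rowsize` units; split the total evenly.
    # Integer division is an assumption; the source comment only shows "/".
    return (num_taps * num_taps * rowsize) // desired_num_chunks


# Illustrative inputs only, not taken from the Marin run.
print(tvpb_chunk_size(num_taps=1000, rowsize=50, desired_num_chunks=4))
```

Asking for more chunks gives a smaller `chunk_size`, which trades peak memory for more passes.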

TVPB discussion
- Runtimes at 22 minutes for the example, 11 minutes for the TVPB pre-computer and 5 minutes for the 775k work tours mode choice from Marin TM2
- Note mode choice calculations are used many times in the form of logsums so this is a key part to speed up
- Could probably be sped up even more but there's not much left in this example to bite onto
- Separate from this work, RSG is now starting on the SANDAG cross-border ActivitySim model, which will use the TVPB, so I expect we'll make some improvements on that project as well. It will be good to have a full-scale example to continue to develop with.
- Need to be careful when programming shared-memory applications with numpy; otherwise Python replicates the memory in each process and we run out of RAM
- Now switching from Jeff developing to me testing, documenting, verifying
- Runtimes depend a lot on the maz-tap density/ratio
- This in turn depends on the maz-tap input files and the max_distance cutoffs
- maz-tap pair availability can also be modified in the expressions
- All this will be important to document in the user guide
- Will try a couple of different max distances and see how the runtimes compare
- Marin example has 650,000 maz-tap pairs for walk and 6200 taps
- One reason we still have dynamic / on-demand calculations is for tracing - if the HH ID is traced, then it re-runs the pre-computed TVPB calculations from within mode choice
- We could do OD tracing in addition to HH ID tracing
- Access mode is exposed to the tap-tap expressions; could easily add egress mode too if needed
- We expect there to be future optimizations as we roll this out in a few places
- There are maybe 15 times more tours in the full TM2 model, so 15 x 5 min (over an hour for mode choice alone) starts to get a little long for runtimes
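
The shared-memory caution above can be illustrated with Python's standard `multiprocessing.shared_memory` module (Python 3.8+). This is a minimal sketch and not necessarily how ActivitySim handles it: the helper names and the small sample array are hypothetical, and the "worker" attach is done in the same process to keep the example short. A real worker process would make the same attach call using the block name, shape, and dtype passed to it.

```python
from multiprocessing import shared_memory

import numpy as np


def create_shared_array(values):
    """Copy `values` once into a new shared-memory block; return (block, view)."""
    values = np.asarray(values)
    block = shared_memory.SharedMemory(create=True, size=values.nbytes)
    view = np.ndarray(values.shape, dtype=values.dtype, buffer=block.buf)
    view[:] = values  # the only copy that is made
    return block, view


def attach_shared_array(name, shape, dtype):
    """Attach to an existing block by name; no data is copied."""
    block = shared_memory.SharedMemory(name=name)
    view = np.ndarray(shape, dtype=dtype, buffer=block.buf)
    return block, view


def demo_shared_view():
    """Show that a second handle sees writes made through the first."""
    skim = np.arange(12, dtype=np.float64).reshape(3, 4)  # hypothetical data
    owner, owner_view = create_shared_array(skim)
    try:
        worker, worker_view = attach_shared_array(owner.name, skim.shape, skim.dtype)
        owner_view[0, 0] = 99.0            # write through the owner's view
        seen = float(worker_view[0, 0])    # read through the attached view
        total = float(worker_view.sum())
        del worker_view                    # release the buffer before closing
        worker.close()
    finally:
        del owner_view
        owner.close()
        owner.unlink()                     # free the block once, in the owner
    return seen, total


print(demo_shared_view())
```

The array views have to be released before `close()` is called, and only the owner should call `unlink()`. Passing the array itself to each worker, rather than the block name, is what leads to one full copy per process.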

Update from Jeff Newman on estimation integration improvements
- Nothing to report
- Jeff Doyle now turning attention to estimation mode enhancements

Update from Clint on ARC related improvements
- Added a scheduler pre-processor, but it didn't help much
- This pre-processor is on the choices rather than the choosers, since there are a lot of duplicate calcs in the choices
- Lots of duplicate calcs in the logsums by time-of-day
- Instead of doing logsums for each time period, could do for a representative time period within each skim to save runtime (this is done in some CT-RAMP models)
- Parking duration is by time period so using representative time periods is a bit of an abstraction. I'll create an issue for this.
- Clint may consolidate the group-by to O, D as opposed to O, D, duration
- Pull request for trip time-of-day choice and CBD parking location models coming soon
- Will create example_mtc_arc_extensions example to exercise new features
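
The duplicate-calculation point above (many choice rows share the same O, D, so grouping on fewer fields means fewer logsum evaluations) can be sketched in plain Python. This is a minimal illustration, not the ARC or ActivitySim code: `evaluate_once_per_key`, `fake_logsum`, and the sample rows are hypothetical, and `fake_logsum` is only a stand-in for an expensive logsum calculation.

```python
import math


def evaluate_once_per_key(rows, key_fields, compute):
    """Run `compute` once per distinct key and reuse the result for every row."""
    cache = {}
    values = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in cache:
            cache[key] = compute(key)  # the expensive step, done once per key
        values.append(cache[key])
    return values, len(cache)


def fake_logsum(key):
    """Stand-in for an expensive logsum; not a real utility specification."""
    return math.log(sum(math.exp(-0.1 * part) for part in key))


# Hypothetical choice rows.
alternatives = [
    {"origin": 1, "destination": 5, "duration": 2},
    {"origin": 1, "destination": 5, "duration": 4},
    {"origin": 1, "destination": 5, "duration": 2},
    {"origin": 3, "destination": 5, "duration": 4},
]

values_full, n_by_od_duration = evaluate_once_per_key(
    alternatives, ("origin", "destination", "duration"), fake_logsum
)
values_od, n_by_od = evaluate_once_per_key(
    alternatives, ("origin", "destination"), fake_logsum
)
print(n_by_od_duration, n_by_od)
```

Dropping duration from the key changes the result whenever the calculation truly depends on it, which is the abstraction noted above for parking duration.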

Plan to wrap up TVPB pull request and then merge/reconcile all the PRs for a release later this month
- Better for us to pull Clint's updates without tracing than to leave them orphaned
- For release planning, should do periodic (every couple of weeks) review of outstanding code and pull if easy
- Will release TVPB by end of year, along with ARC's improvements
- May include some of the estimation improvements as well
- We will first pull the multizone branch to develop and Clint will rebase his code off of multizone
- It's easier for the author to deal with merging
- I'll deal with the other smaller PRs
- Multizone code works for both spines

Oregon coordinated move to ActivitySim
- Thinking about a coordinated move to Asim by all the Oregon agencies so each doesn't have to do it on their own
- Alex shared guidance memo drafted by Joel
- Some good thoughts in the memo on the multizone system features for the user guide
- For multizone approach, make sure to include network coding implications

Chat with LBL/Berkeley folks
- Building BEAM/MATSim/ActivitySim/UrbanSim models for DOE
- Currently have models in Detroit, Austin, and SF with 6 more regions planned
- They would like a more generic/simpler model to be included with asim so it's easier to set up in new regions
- Also too many detailed skims in the current example - something simpler to start would be good
- This approach is better than stripping down / hacking up the TM1 example
- Also like the idea of model design templates so new users can simply select a template and configure it
- Reducing barriers to entry is key
- Let's discuss these ideas in more detail at the next call