Project Meeting 2022.11.10 - ActivitySim/activitysim GitHub Wiki

Agenda

Phase 8 scoping update
Scheduling upcoming meetings
Task and code review updates
Memory profiling update

Action Items

Partners to review the IRFQ distributed by Alex and provide written comments/suggested revisions before next Thursday's partner meeting (11/17).
WSP to distribute Tableau workbook with memory profiling information to the group.

Notes

Phase 8 scoping update

Alex drafted IRFQ and sent to the partners.
ACTION ITEM: Joe and Alex ask that the Partners review the doc and send them any recommended revisions before next Thursday's partner meeting (11/17) so that they can incorporate changes prior to the meeting.
Ideally, by the end of that meeting on the 17th, the text is finalized and can be sent to AMPO on Friday the 18th.

Scheduling upcoming meetings

Tuesday the 15th - CANCELED
Thursday the 17th - Update on tasks/code review, memory profiling check-in
Tuesday the 22nd - Code acceptance and management
Thursday the 24th - CANCELED

Task and Code Review Updates

School Escorting and Flexible Number of Tour and Trip IDs
- Joe has added some comments, and RSG is in the process of responding. The review looks more or less done.
Disaggregate Accessibilities - Sijia to review.
- WSP is reviewing and has provided some comments (some on code but mostly on documentation). They haven't finished reviewing and want to run some more tests.
Shadow Pricing
- It's still unclear if WSP or CS will be doing the code review. Sijia to coordinate with Jeff when he's back from vacation.
Skim Wrapper - will get update from Jeff when he's back from vacation.
Sharrow
- WSP and RSG are still working on this code review.
PTV's Windows Installer - will get update from Jeff when he's back from vacation.
Estimation Fix and Random Seed Generator

Memory profiling update

Presentation: ActivitySim Memory Profiling Task – Process Update 11-10-2022.pptx
Test runs conducted without chunking or multiprocessing
Prototype_arc runs
- Measured memory peaks at 5%, 12.5%, and 25% samples
- Ran regressions on the results, which showed a linear relationship between sample size and peak memory
- Memory peaks were measured with a tool that Jeff created as part of sharrow, which exports a CSV recording memory usage every half second while the model is running
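
For readers unfamiliar with this style of profiling, a minimal sketch of a half-second memory sampler follows. This is not Jeff's sharrow tool: the function names and CSV columns are hypothetical, and it uses the standard library's `tracemalloc`, which only sees allocations made through Python's allocator rather than the full process memory a real profiler would report.

```python
import csv
import threading
import time
import tracemalloc


def sample_memory(csv_path, stop_event, interval=0.5):
    """Write (elapsed_s, current_bytes, peak_bytes) rows until stop_event is set."""
    start = time.monotonic()
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["elapsed_s", "current_bytes", "peak_bytes"])
        while True:
            # Assumption: tracemalloc stands in for a process-level memory reading.
            current, peak = tracemalloc.get_traced_memory()
            writer.writerow([round(time.monotonic() - start, 2), current, peak])
            f.flush()
            # wait() returns True as soon as the main thread signals completion.
            if stop_event.wait(interval):
                break


def run_with_memory_log(func, csv_path, interval=0.5):
    """Run func() while a background thread logs memory usage to csv_path."""
    tracemalloc.start()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_memory, args=(csv_path, stop_event, interval)
    )
    sampler.start()
    try:
        return func()
    finally:
        stop_event.set()
        sampler.join()
        tracemalloc.stop()
```

The peak for a run is then the maximum of the memory column in the CSV, and the same file gives the time series used to compare memory profiles across sample sizes.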
Prototype_mtc_extended runs
- Also demonstrated linear relationships
Hypothesis Testing
- Relationship between sample size and memory usage
  - Produced time series of memory usage for different sample size runs.
  - ARC runs at different sample sizes show the same profile for memory usage.
  - For MTC extended (at 100% sample), memory peaks at different points. Sijia has a theory about why this is (memory may not be released fast enough when moving to next step) but needs to run some tests to confirm.
  - Haven’t run MWCOG yet but may not be able to run 100% sample so anticipating results similar to the ARC runs.
- Issue is not the skims, it’s the pipeline information
  - Looked into what the pipeline looks like at different memory levels. The chooser table is large when the memory is the highest. (There are some outliers but that may be due to the memory not being released fast enough).
- Data type issues
  - Built a table listing variable names and data types at each checkpoint, and counted the number of tables by data type
  - For ARC 25% run, many are int64
  - In the settings for the ARC model, you can set datatypes when files are read in. Sijia will test specifying lower data types to see if that helps the memory issue.
  - Request was made to look at the number of rows in each table in the data type table.
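
To give a sense of why the data types and row counts matter together, here is a rough sketch of the arithmetic, assuming the columns hold integers currently stored as int64 (8 bytes per value). The helper names are hypothetical, and the actual ActivitySim settings for specifying data types on read are not shown here.

```python
# Byte widths of the signed integer types used by NumPy and pandas.
INT_TYPES = [
    ("int8", 1),
    ("int16", 2),
    ("int32", 4),
    ("int64", 8),
]


def smallest_int_type(min_value, max_value):
    """Return (name, bytes) of the narrowest signed type holding the range."""
    for name, nbytes in INT_TYPES:
        bits = 8 * nbytes
        lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if lo <= min_value and max_value <= hi:
            return name, nbytes
    raise OverflowError("value range does not fit in int64")


def column_savings(n_rows, min_value, max_value, current_bytes=8):
    """Estimate bytes saved by storing one column in its narrowest type."""
    name, nbytes = smallest_int_type(min_value, max_value)
    return name, n_rows * (current_bytes - nbytes)
```

For example, `column_savings(1_000_000, 0, 100)` suggests int8 and a saving of 7 bytes per row. Because the saving scales with the number of rows, the requested row counts per table would show which tables are worth targeting first.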
Other things
- Run time for vehicle type is high; this is because of string checks in the model that aren’t optimized well with numba.
- Question about memory usage with multiprocessing – can this be a test as well?
- ACTION ITEM: Sijia to provide the Tableau workbook to the group.