Project Meeting 2021.05.04 - ActivitySim/activitysim GitHub Wiki
Technical meeting
Feel free to request an agenda item in the future as this is a forum for technical sharing
Issues review
No new issues since last time but let's keep talking about performance needs by commenting on issue #377
Based on the recent chunking improvements work, Jeff is even more convinced that migrating from strings to factors (pandas categoricals), especially for the tour and trip tables, should help with performance
Did we speed up sampling for one-zone systems by skipping zones with size term == 0? We think so, but will check
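To illustrate the idea behind the question above, here is a minimal sketch of dropping zero-size zones before sampling. It is not ActivitySim's sampling code: `sample_destinations` is a hypothetical helper, and weighting by size term alone is a simplifying assumption made only to keep the example short.

```python
import random


def sample_destinations(size_terms, n_samples, rng):
    """Sample zones in proportion to size term, skipping zero-size zones."""
    # Filter first, so no per-zone work is spent on zones that can never be chosen.
    candidates = {zone: size for zone, size in size_terms.items() if size > 0}
    if not candidates:
        raise ValueError("every zone has a size term of 0")
    zones = list(candidates)
    weights = list(candidates.values())
    return rng.choices(zones, weights=weights, k=n_samples)


# Hypothetical size terms: zones 2 and 4 have nothing to attract trips.
size_terms = {1: 120.0, 2: 0.0, 3: 45.5, 4: 0.0, 5: 10.0}
picks = sample_destinations(size_terms, n_samples=1000, rng=random.Random(0))
```

The saving comes from the filter, not from the draw itself: any utilities or probabilities computed per zone are only computed for zones that remain.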
Thursday we'll discuss any questions related to the bench IRFP
Update from Jeff on moving from chunking to available RAM settings, #406
Goal is to specify amount of RAM available and then stay within it
It is difficult though because of paging and virtual memory
If you watch memory usage in the Windows Resource Monitor, you often see less memory in use than you expect
So users may want to set the value higher than installed RAM
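Three ways to check RAM usage - bytes allocated, RSS (resident set size), and USS (unique set size)
Bytes allocated is reported by Python but is unreliable
RSS is always too big since it counts shared and file-backed pages that happen to be resident in memory
USS is the amount of memory that would be freed if you killed the process now
And remember we need to measure across processes as well for multiprocessing (mp)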
Currently testing a hybrid bytes allocated + USS approach
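As a rough illustration of the three measures above, the sketch below reads each one for the current process and its children. It is not ActivitySim's implementation and does not show the hybrid approach being tested: `memory_snapshot` is a hypothetical helper, `tracemalloc` stands in for "bytes allocated" as an assumption (it only sees Python allocations in the calling process), and RSS and USS come from the third-party `psutil` package when it is installed.

```python
import tracemalloc

try:
    import psutil  # third-party; needed for RSS and USS
except ImportError:
    psutil = None


def memory_snapshot():
    """Return bytes allocated, RSS and USS (None when a measure is unavailable)."""
    snapshot = {"bytes_allocated": None, "rss": None, "uss": None}

    # Bytes allocated: Python-level allocations traced in this process only.
    if tracemalloc.is_tracing():
        current, _peak = tracemalloc.get_traced_memory()
        snapshot["bytes_allocated"] = current

    # RSS and USS: summed over this process and all of its child processes.
    if psutil is not None:
        parent = psutil.Process()
        rss = uss = 0
        for proc in [parent] + parent.children(recursive=True):
            try:
                info = proc.memory_full_info()
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                continue  # process exited or cannot be inspected
            rss += info.rss
            uss += info.uss
        snapshot["rss"], snapshot["uss"] = rss, uss

    return snapshot


tracemalloc.start()
data = [bytes(1024) for _ in range(10_000)]  # roughly 10 MB of Python objects
snap = memory_snapshot()
tracemalloc.stop()
```

Summing RSS over processes counts shared pages once per process, which is one reason it overstates total usage under multiprocessing, while USS counts only pages private to each process.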
Need to test the 1-zone MTC and 3-zone SANDAG examples a number of times, and each run takes 1 to 3 hours
The exploratory (dynamic) chunking adds 10 to 20% overhead, so saving and re-using the chunk settings improves runtime by 10 to 20%
User will want to specify machine RAM, run with exploratory chunking, see usage, and adjust settings, and then re-run with cached settings
It is working well and should have numbers to share next week
There are lots of new settings to help instrument it, and we'll need to write some documentation on how to use them
Run times should improve as well
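The explore-once, then re-use workflow described above can be sketched as follows. This is a simplified illustration under stated assumptions, not ActivitySim's chunking code: `plan_chunks`, `bytes_per_row_for`, `fake_explore`, the JSON cache file, and the 2 kB per row figure are all hypothetical, and the step name is only a label.

```python
import json
import math
import os
import tempfile


def plan_chunks(total_rows, bytes_per_row, ram_budget_bytes):
    """Return (number of chunks, rows per chunk) that fit the RAM budget."""
    rows_per_chunk = max(1, ram_budget_bytes // bytes_per_row)
    return math.ceil(total_rows / rows_per_chunk), rows_per_chunk


def bytes_per_row_for(step, cache_path, explore):
    """Use the cached measurement for `step`, exploring only on a cache miss."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    if step not in cache:
        cache[step] = explore()  # the slow, exploratory measurement
        with open(cache_path, "w") as f:
            json.dump(cache, f)
    return cache[step]


explore_calls = []


def fake_explore():
    """Stand-in for a real measurement; records that it was called."""
    explore_calls.append(1)
    return 2_000  # pretend each row needs about 2 kB


cache_path = os.path.join(tempfile.mkdtemp(), "chunk_cache.json")
first_run = bytes_per_row_for("trip_mode_choice", cache_path, fake_explore)
second_run = bytes_per_row_for("trip_mode_choice", cache_path, fake_explore)

# 1,000,000 rows at 2 kB each within a 500 MB budget: 4 chunks of 250,000 rows.
n_chunks, rows_per_chunk = plan_chunks(1_000_000, second_run, 500_000_000)
```

The second call reads the cached value and never invokes the exploratory measurement, which is where the runtime saving on a re-run would come from.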
What if there just isn't enough RAM for the number of processors? Can we provide some info to the user about this?
What's a typical server config? Agencies to share machine specs
Typically the machine is dedicated to just running the model
As the model progresses through the submodels, more stuff is in RAM and therefore the headroom is reduced
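One way to give the user information about the RAM versus processor count question is a simple headroom check, sketched below. `headroom_report` and all of the numbers are hypothetical, and the sketch assumes the remaining headroom is split evenly across processes, which may not match how memory is actually shared.

```python
GIB = 1024 ** 3


def headroom_report(ram_budget, resident, num_processes, min_per_process):
    """Return a warning string if per-process headroom is too small, else None."""
    # Headroom shrinks as more tables become resident in later submodels.
    headroom = max(0, ram_budget - resident)
    share = headroom // num_processes
    if share >= min_per_process:
        return None
    max_processes = headroom // min_per_process
    return (
        f"Only {share / GIB:.1f} GiB of headroom per process; "
        f"at most {max_processes} processes fit in the current budget."
    )


# Hypothetical machine: 64 GiB budget with 40 GiB already resident.
warning = headroom_report(64 * GIB, 40 * GIB, num_processes=24, min_per_process=2 * GIB)
ok = headroom_report(64 * GIB, 40 * GIB, num_processes=8, min_per_process=2 * GIB)
```

Here 24 processes leave 1 GiB each, which is below the assumed 2 GiB minimum, so the report suggests at most 12 processes, while 8 processes pass the check.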
The tour and trip tables, for example, include string columns, which are memory hogs and slow to process
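A small pandas example shows the kind of saving at stake. The table and column below are made up for illustration and are not ActivitySim's actual trip table; the sketch only assumes that a string column has few distinct values relative to its row count.

```python
import pandas as pd

n_rows = 100_000
modes = ["DRIVEALONE", "WALK_TRANSIT", "BIKE"]

# Hypothetical trip table with one string column.
trips = pd.DataFrame({"trip_mode": [modes[i % 3] for i in range(n_rows)]})

# deep=True asks pandas to include the string data, not just the array holding it.
as_strings = trips["trip_mode"].memory_usage(deep=True)

# A categorical keeps each distinct value once plus a small integer code per row.
as_category = trips["trip_mode"].astype("category").memory_usage(deep=True)

print(f"strings:  {as_strings:,} bytes")
print(f"category: {as_category:,} bytes")
```

The saving shrinks as the number of distinct values approaches the number of rows, so columns such as mode or purpose are better candidates than near-unique identifiers.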
Jeff plans to wrap up this week and then I'll test and document