Project Meeting 2022.04.12 - ActivitySim/activitysim GitHub Wiki

Agenda

Sharrow: dictionary encoding

Notes

The last meeting discussed fixed point encoding, which reduces memory footprint by storing values in memory as integers that use fewer bits. For example, if the scale is 0.01, all values are stored as 100 x the actual value in integer form.
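As a rough illustration of the fixed point idea, here is a minimal NumPy sketch. The sample values, the int16 target type, and the function names `fixed_point_encode` / `fixed_point_decode` are assumptions made for this example; this is not sharrow's actual API.

```python
import numpy as np

# Hypothetical skim values, originally held as 32-bit floats.
values = np.array([0.0, 1.25, 3.5, 12.75, 327.5], dtype=np.float32)

SCALE = 0.01  # one integer step represents 0.01 of the original unit


def fixed_point_encode(x, scale=SCALE, dtype=np.int16):
    """Store x / scale, rounded to the nearest integer, in a small int type."""
    scaled = np.round(np.asarray(x, dtype=np.float64) / scale)
    info = np.iinfo(dtype)
    if scaled.min() < info.min or scaled.max() > info.max:
        raise OverflowError("values do not fit in the target integer type")
    return scaled.astype(dtype)


def fixed_point_decode(encoded, scale=SCALE):
    """Recover approximate original values from the stored integers."""
    return encoded.astype(np.float32) * np.float32(scale)


encoded = fixed_point_encode(values)   # 2 bytes per value instead of 4
decoded = fixed_point_decode(encoded)  # accurate to within scale / 2
```

The trade-off in this sketch is range and precision: with a scale of 0.01 and int16 storage, only values between roughly -327.68 and 327.67 can be represented, and anything finer than 0.01 is rounded away.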

Dictionary encoding takes the actual values that we want to store in an array and puts them in a secondary array that is much smaller.

  • Identify unique values, put into vector
  • In place of each value in the original array, store a pointer (an integer index) to the position of that value in the vector of unique values
  • This lookup can work across multiple skim files – for example, walk and drive access transit fare skims would often have the same unique values, so you can use the same vector of unique values, with walk access values in the first row and drive access in the second row.
  • Suggestion: develop a method that analyzes the skims and determines whether such dictionary encoding would be computationally efficient, rather than deciding in advance. Even where current fare structures are very simple, future scenarios could have much greater variation in fares. This isn’t done currently but would be worth the effort at some point.
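The steps above can be sketched with NumPy as follows. The fare matrices, the function names `dictionary_encode` and `encoded_size_ratio`, and the memory-only criterion are all assumptions made for illustration; they are not sharrow's API, and a real check would probably also weigh lookup cost.

```python
import numpy as np

# Hypothetical zone-to-zone fare skims with only a few distinct fares.
walk_fare = np.array([[0.0, 2.5, 2.5],
                      [2.5, 0.0, 4.0],
                      [2.5, 4.0, 0.0]], dtype=np.float32)
drive_fare = np.array([[0.0, 2.5, 4.0],
                       [2.5, 0.0, 4.0],
                       [4.0, 4.0, 0.0]], dtype=np.float32)


def dictionary_encode(*arrays):
    """Encode same-shaped arrays against one shared vector of unique values."""
    stacked = np.stack(arrays)  # shape: (n_arrays, rows, cols)
    uniques, inverse = np.unique(stacked, return_inverse=True)
    # Pick the smallest unsigned integer type able to index every unique value.
    code_dtype = np.min_scalar_type(len(uniques) - 1)
    codes = inverse.reshape(stacked.shape).astype(code_dtype)
    return uniques, codes


def encoded_size_ratio(uniques, codes, original_itemsize):
    """Encoded bytes divided by original bytes; below 1.0 means memory saved."""
    encoded_bytes = uniques.nbytes + codes.nbytes
    original_bytes = codes.size * original_itemsize
    return encoded_bytes / original_bytes


uniques, codes = dictionary_encode(walk_fare, drive_fare)
walk_decoded = uniques[codes[0]]   # look up walk access fares
drive_decoded = uniques[codes[1]]  # look up drive access fares
ratio = encoded_size_ratio(uniques, codes, walk_fare.itemsize)
```

In this sketch both skims share a single vector of three unique fares, and each cell shrinks from a 4-byte float to a 1-byte code. Computing a ratio like this on the actual skims, rather than assuming the fare structure is simple, is one way the suggested analysis could decide whether the encoding is worth applying.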

Jeff also provided an update on sharrow testing for MTC and other models. Runtime gains were initially not as high for SEMCOG, with smaller performance gains in trip destination choice. After investigating, Jeff found the reason for the smaller gains and made some adjustments, and is now seeing larger performance gains.