removed respondents who have a Clipper card but no passes loaded on it
re-estimated with two separate models, a subsidy model and a pass model
also added workplace accessibility term
separate models are probably easier to estimate / apply in separate regions since each requires less comprehensive observed data
subsidy model run first and then subsidy variable included in the pass model
different types of persons - for example seniors and kids - can have different subsidy levels and pass discounts
It is difficult to get the fare discount data we need from household travel surveys or OBS
But we'll express a subsidy and pass discount distribution or average by market segments and then assign a value to persons for use in mode choice
In mode choice then, each person will see a fare skim and a discount/modifier
We'll draft examples by market segment / user group - student, senior, employer, operator
To do operator, you'd need a skim to identify whether the OD pair uses the operator
Models are insensitive to pass value over time with respect to service / auto operating costs, etc. - this temporal stability is cooked into the constant
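A minimal sketch of the segment-based approach in the notes above: each person draws a discount from their market segment's distribution, and mode choice then sees the fare skim times a modifier. The segment labels, discount shares, probabilities, and function names are all hypothetical placeholders, not estimated values or existing model code.

```python
import random

# Hypothetical discount distributions by market segment, as
# (discount share of full fare, probability). Illustrative numbers only.
SEGMENT_DISCOUNTS = {
    "student": [(0.0, 0.4), (0.5, 0.6)],
    "senior": [(0.0, 0.2), (0.5, 0.8)],
    "employer": [(0.0, 0.7), (1.0, 0.3)],
    "other": [(0.0, 1.0)],
}


def assign_discount(segment, rng):
    """Draw one discount share for a person from the segment's distribution."""
    # Unknown segments fall back to "other" (no discount).
    outcomes = SEGMENT_DISCOUNTS.get(segment, SEGMENT_DISCOUNTS["other"])
    values = [value for value, _ in outcomes]
    weights = [weight for _, weight in outcomes]
    return rng.choices(values, weights=weights, k=1)[0]


def effective_fare(fare_skim, discount):
    """Fare seen in mode choice: skimmed full fare times (1 - discount)."""
    return fare_skim * (1.0 - discount)


rng = random.Random(42)  # fixed seed so the assignment is repeatable
persons = [
    {"id": 1, "segment": "student"},
    {"id": 2, "segment": "senior"},
    {"id": 3, "segment": "employer"},
    {"id": 4, "segment": "other"},
]
for person in persons:
    person["discount"] = assign_discount(person["segment"], rng)
    # 2.50 stands in for the fare skim value of the person's OD pair.
    person["fare"] = effective_fare(2.50, person["discount"])
```

Using a segment average instead of a distribution would amount to giving each segment a single outcome with probability 1.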
Discuss chunking improvement
Plan is for the user to specify amount of RAM available and then have asim figure out the right chunksize given the number of processes
This is more difficult than expected since numpy, pandas, and Windows all do many memory allocation tricks
It is difficult to know how much memory (i.e. real RAM) you are using at any point in time
Windows will also swap to disk without you knowing, which is much slower
There are many memory metrics as well which makes the task even more complicated
Looked at how others do this, for example Dask, and it uses RSS (resident set size) as its memory metric to track, so we're going to use that as well
So, basically rewriting the chunker
And then will need to spend considerable time testing it on a real server with a real problem (i.e. a 100% full scale run)
This is because numpy and the low-level C/C++ code allocate (and release) memory in blocks, so small problems don't really exercise the functionality
Will also try to re-use the calculated setup as Alex suggests, but it might not save a lot of time; we'll see
In summary, numpy and pandas are not designed to get the last bit of juice out of the orange
They are essentially designed with unlimited RAM in mind
And for problems where you just need a ton of RAM for a sec; not for hours like we need with ABMs
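A rough sketch of the arithmetic behind the chunking plan above: the user states the RAM available, it is split across processes, and a rows-per-chunk value is derived from an RSS-based per-row cost. This is not the asim chunker; the function names, the `headroom` safety factor, and the idea of measuring RSS growth over a small trial chunk are assumptions for illustration. The RSS readings are passed in as plain numbers here; in practice they could come from something like psutil's `Process().memory_info().rss`.

```python
def bytes_per_row_from_trial(rss_before, rss_after, trial_rows):
    """Per-row memory cost from RSS growth observed over a small trial chunk."""
    if trial_rows < 1:
        raise ValueError("trial_rows must be at least 1")
    # Guard against zero or negative growth (memory released during the trial).
    return max(1, (rss_after - rss_before) // trial_rows)


def estimate_chunk_size(available_ram_bytes, num_processes,
                        baseline_rss_bytes, bytes_per_row, headroom=0.8):
    """Rows per chunk that should fit within one process's share of RAM."""
    if num_processes < 1 or bytes_per_row <= 0:
        raise ValueError("num_processes and bytes_per_row must be positive")
    # Split the user-specified RAM evenly across processes, keeping a margin.
    per_process_budget = available_ram_bytes * headroom / num_processes
    # What is left for chunk data after the process's fixed overhead.
    usable = per_process_budget - baseline_rss_bytes
    if usable <= 0:
        raise ValueError("RAM budget per process is below the baseline RSS")
    return max(1, int(usable // bytes_per_row))


GiB = 1024 ** 3
# Hypothetical trial: RSS grew from 2 GiB to 3 GiB while processing 1024 rows.
per_row = bytes_per_row_from_trial(2 * GiB, 3 * GiB, 1024)
# Hypothetical server: 64 GiB specified by the user, 4 processes.
rows = estimate_chunk_size(64 * GiB, 4, 2 * GiB, per_row)
```

Because memory is allocated and released in blocks, a per-row cost measured on a small trial may not hold at full scale, which is why the notes call for testing on a 100% run.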