Adding Logsums - ActivitySim/activitysim GitHub Wiki

Scope of Work

Scoped Task 4 Logsums

Getting Started

We will start by implementing only the logsums needed by the work location choice model, in order to draft a vectorized approach.
We will create two new model steps and revise the existing step:
- workplace_location_sample - will use an expression file similar to today's (but without the logsum expression) and will build a table of workers * all zones in order to select a sample of alternative work locations for the next model. This selects 30 locations from the 1450 zones. It makes 30 picks, so the same zone can be picked more than once, and it returns the pick count as well, since the pick count is used in the later model step to compute the sample correction factor.
- workplace_location_logsums - will start with the workers * sampled alt zones table output above, add home taz, alt taz, depart period, and end period, and then add a logsum column (see issue below). The existing tour mode choice expression file is used with the following skim lookups:
  - @odt_skims - home taz, work alternative taz, time period = 8am
  - @dot_skims - work alternative taz, home taz, time period = 5pm
- workplace_location (existing) - will start with the table output above and the existing expression file in order to choose a work location like before. This step selects one location from the sample of 30, this time with the mode choice logsum included. The 30 alternatives are collapsed into the unique set of alternatives and the pick count (transformed into a sample correction factor) is added as a utility term.
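The sampling step described above can be sketched as follows. This is a minimal illustration for a single worker, assuming the picks are drawn with replacement from per-zone sampling probabilities; the function name sample_alternatives and the random weights are hypothetical and are not taken from the ActivitySim code.

```python
import numpy as np

def sample_alternatives(probs, sample_size, rng):
    """Draw sample_size zones with replacement for one chooser.

    Returns the unique sampled zone indexes, how often each was picked,
    and the sampling probability of each unique zone.
    """
    picks = rng.choice(len(probs), size=sample_size, replace=True, p=probs)
    alts, pick_count = np.unique(picks, return_counts=True)
    return alts, pick_count, probs[alts]

rng = np.random.default_rng(0)
n_zones = 1450

# Hypothetical zone weights (assumption), normalized into sampling probabilities.
weights = rng.random(n_zones)
probs = weights / weights.sum()

alts, pick_count, alt_probs = sample_alternatives(probs, 30, rng)
```

The pick counts always sum to the sample size (30 here), while the number of unique zones can be smaller when a zone is picked more than once.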
The mode choice model will be revised to add new capabilities to get logsums in addition to making choices (#164). The existing mode.tour_mode_choice_simulate method calls _mode_choice_simulate, which calls asim.simple_simulate, which calls either eval_mnl or eval_nl, which then calculates utilities and calls the logit class to calculate probabilities and make a choice. This needs to be revised as follows:
- The logit class gets a new method, get_logsum, which returns the logsum (composite utility across all modes).
- Then asim.simple_simulate and _mode_choice_simulate need to be revised to support two use cases: calculating logsums and making choices. Setting up and calculating utilities is required in both cases.
- Finally, the mode class needs to be revised to support calculating work location mode logsums using the workplace_location_logsums table calculated above.
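As a rough illustration of what a get_logsum method would return in the multinomial logit case, the sketch below computes the log of the sum of exponentiated utilities for each row. The function name mnl_logsum and the example utilities are assumptions for illustration only; the nested logit case (eval_nl) would need the root-nest logsum instead and is not covered here.

```python
import numpy as np

def mnl_logsum(utilities):
    """Row-wise log(sum(exp(u))) over the mode alternatives.

    Each row's maximum is subtracted before exponentiating so that
    large utilities do not overflow.
    """
    u = np.asarray(utilities, dtype=float)
    u_max = u.max(axis=1, keepdims=True)
    shifted_sum = np.exp(u - u_max).sum(axis=1, keepdims=True)
    return (u_max + np.log(shifted_sum)).ravel()

# Two hypothetical choosers with three modes each; -999 marks an
# effectively unavailable mode.
utilities = np.array([[0.0, 0.0, 0.0],
                      [1.0, 2.0, -999.0]])
logsums = mnl_logsum(utilities)
```

The same utilities feed both use cases: the choice path turns them into probabilities and draws a mode, while the logsum path only reduces them to one composite value per row.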

Sample Correction Factor

The sample correction factor is calculated as follows:

freq = how often an alternative is sampled (i.e. the pick_count)
prob = probability of the alternative
correction_factor = log(freq/prob), where log is the natural logarithm

For example:

freq               1.00  2.00  3.00  4.00  5.00
prob               0.30  0.30  0.30  0.30  0.30
correction_factor  1.20  1.90  2.30  2.59  2.81

The more often an alternative is sampled, the larger its correction factor, so its utility in the final selection goes up. The unique set of alternatives is passed to the final choice model and the correction factor is included in the utility. The correction factor is capped at 60 by the expression min(correction_factor,60).
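The calculation, including the cap, can be checked with a few lines of Python. This is a small sketch using NumPy; the function name sample_correction_factor is hypothetical.

```python
import numpy as np

def sample_correction_factor(pick_count, prob, cap=60.0):
    """Natural log of pick_count / prob, capped at `cap`."""
    pick_count = np.asarray(pick_count, dtype=float)
    prob = np.asarray(prob, dtype=float)
    return np.minimum(np.log(pick_count / prob), cap)

freq = np.array([1, 2, 3, 4, 5])
prob = np.full(5, 0.30)
cf = sample_correction_factor(freq, prob)
print(np.round(cf, 2))  # 1.2, 1.9, 2.3, 2.59, 2.81

# The cap only matters for extremely small probabilities.
capped = sample_correction_factor([1], [1e-30])
```

The cap guards against a very small sampling probability producing an extremely large utility term.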

To make the vectorized implementation easier, a good solution is to keep the full set of sampled alternatives (including duplicates), put the correction factor on each alternative (record), and then make the duplicate alternatives unavailable before calculating the probabilities. Adding an isDuplicate attribute to each alternative allows us to add an expression to the final choice model like: isDuplicate * -99999, which makes the duplicate unavailable. This way we do not have to modify the correction factor or the probabilities, and we can keep the existing assumption of a fixed set of alternatives.
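A minimal pandas sketch of this idea follows. The table layout, the column names person_id, alt_taz, and prob, and the zero base utility are assumptions made for illustration; only the isDuplicate flag, the -99999 term, and the capped correction factor come from the description above.

```python
import numpy as np
import pandas as pd

# Hypothetical sampled-alternatives table: one row per pick, duplicates kept.
alts = pd.DataFrame({
    "person_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "alt_taz":   [5, 9, 5, 7, 3, 3, 3, 8],
    "prob":      [0.3, 0.2, 0.3, 0.1, 0.4, 0.4, 0.4, 0.1],
})

# pick_count: how many times each (person, zone) pair was drawn.
keys = ["person_id", "alt_taz"]
alts["pick_count"] = alts.groupby(keys)["alt_taz"].transform("count")
alts["correction_factor"] = np.minimum(
    np.log(alts["pick_count"] / alts["prob"]), 60
)

# Flag every repeat of a zone after its first occurrence within a person.
alts["isDuplicate"] = alts.duplicated(subset=keys).astype(int)

# Base utility is a placeholder of zero here; the correction factor and the
# unavailability term are added as ordinary utility terms.
alts["utility"] = (
    0.0 + alts["correction_factor"] + alts["isDuplicate"] * -99999
)

# Probabilities within each person's fixed-size set of rows.
row_max = alts.groupby("person_id")["utility"].transform("max")
exp_u = np.exp(alts["utility"] - row_max)
alts["choice_prob"] = exp_u / exp_u.groupby(alts["person_id"]).transform("sum")
```

Every person keeps the same number of rows, so the table stays rectangular, and the duplicate rows end up with a choice probability of effectively zero.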