LSOA options - npct/pct-shiny GitHub Wiki

Currently (with MSOAs)

  • We load a whole region into memory on the server.
  • Each new user gets a separate thread, so three users looking at West Yorkshire mean the server holds three copies of West Yorkshire in memory.
  • Each region has a hard edge: we can't show flows from outside its boundaries.
  • It is all in Shiny, which means the server runs R, and R sends the required data to a Leaflet map in the user's browser.

Options:

  1. Extending the current region model to LSOAs
    • We have the same regions but with LSOA resolution
    • Benefits
      • Very little code change, so the existing UI keeps working (simple)
    • Problems
      • This would mean the server has vastly more data in memory (maybe too much)
      • The hard edge remains: a user can only look at one region at a time
  2. PostGIS and Shiny
    • The map features we use rely on geographical functions that a database could perform
    • Instead of Leaflet talking to Shiny (R) where the data is, Leaflet could talk to Shiny and Shiny could request the data from a database
    • Benefits:
      • This could reuse some of the existing Shiny code, so it needs few changes
      • Regions would be unbounded: we could easily pull in ANY routes that cross the map bounds (instead of the current hard edge), which allows arbitrary 'regions' wherever the map happens to be
      • It would not load the entire region into memory, only the data required
    • Problems:
      • This introduces a round trip: when a user scrolls, the browser makes a request to the server, the server makes a request to the db, and the db replies to the server, which replies to the browser
      • We would have to write geo-caching so that the server stores some data instead of requesting everything from the db every time. Over time, if a user moved around West Yorkshire, that R thread (one of many on the server) could end up caching a lot and might not save memory, so we'd have to clear the cache
      • When the user scrolls very quickly across the map, we don't want to spam the db with lots of pointless requests for data; we want to wait until the map has settled
  3. PostGIS without Shiny
    • Write Leaflet (JS) code that talks only to the db (maybe have a Shiny server but not for the map)
    • Benefits:
      • Faster: the browser requests data from the db directly. Most of what Shiny was doing would be replaced with PostGIS queries
      • There are many JS libraries that already do these things, so there is lots of code to help us (e.g. packages that handle a user scrolling quickly, clever geo-caching, etc.)
      • Very scalable: we could have multiple databases and let the browser do most of the hard work
    • Problems
      • We would have to start from scratch, writing all the code in Leaflet. It took many months to get it working in Shiny, so it would also take many months in Leaflet
      • Displaying the non-map tabs (i.e. the data tables and model output) would be hard and would probably fall back on Shiny
      • We'd need a server to act as a go-between and talk to the database, i.e. connection pooling. A database can only handle a set number of connections, so if each client talks straight to the db we are likely to overload it. See http://www.craigkerstiens.com/2014/05/22/on-connection-pooling/: "A rule of thumb I’d use is if you have over 100 connections you want to look at something more robust"
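
The geo-caching concern raised under options 2 and 3 can be made concrete with a rough sketch. The Python below is illustrative only and is not taken from the pct-shiny code base: the names `TileCache`, `tiles_for_bbox` and `fetch_tile`, the tile size and the cache limit are all assumptions. It snaps the current map bounds to a coarse grid of tiles, asks the database only for tiles it has not seen, and evicts the least recently used tile once a fixed limit is reached, so a long session cannot grow the cache without bound.

```python
import math
from collections import OrderedDict
from typing import Callable, List, Tuple

Tile = Tuple[int, int]                    # (x, y) index on a lon/lat grid
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def tiles_for_bbox(bbox: BBox, tile_deg: float) -> List[Tile]:
    """Return every grid tile that overlaps the given map bounds."""
    min_lon, min_lat, max_lon, max_lat = bbox
    xs = range(math.floor(min_lon / tile_deg), math.floor(max_lon / tile_deg) + 1)
    ys = range(math.floor(min_lat / tile_deg), math.floor(max_lat / tile_deg) + 1)
    return [(x, y) for x in xs for y in ys]


class TileCache:
    """Bounded least-recently-used cache of route ids per tile (hypothetical)."""

    def __init__(self, fetch_tile: Callable[[Tile], List[str]],
                 max_tiles: int = 64, tile_deg: float = 0.1) -> None:
        # fetch_tile stands in for the database call. With PostGIS it might
        # run a query filtered by
        #   ST_Intersects(geom, ST_MakeEnvelope(xmin, ymin, xmax, ymax, 4326))
        # against a routes table; table and column names are assumptions.
        self._fetch_tile = fetch_tile
        self._max_tiles = max_tiles
        self._tile_deg = tile_deg
        self._tiles: "OrderedDict[Tile, List[str]]" = OrderedDict()

    def routes_in_view(self, bbox: BBox) -> List[str]:
        """Return the ids of routes touching the view, fetching only new tiles."""
        routes = set()
        for tile in tiles_for_bbox(bbox, self._tile_deg):
            if tile in self._tiles:
                self._tiles.move_to_end(tile)      # mark as recently used
                tile_routes = self._tiles[tile]
            else:
                tile_routes = self._fetch_tile(tile)
                self._tiles[tile] = tile_routes
                if len(self._tiles) > self._max_tiles:
                    self._tiles.popitem(last=False)  # evict least recently used
            routes.update(tile_routes)
        return sorted(routes)
```

To use it, construct `TileCache` with a function that queries the database for one tile, then call `routes_in_view` with the map bounds each time the map settles. Because a route that crosses a tile boundary is returned for every tile it touches, the result is de-duplicated before being handed to the map. This sketch does not address rapid scrolling or connection pooling; those would need a separate debounce step and a pooler in front of the database.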