Data sources - seinecle/QualityOfLife GitHub Wiki
Notes on data sources:
- some of these variables would be selected by the user
- others would depend on the user's selected "lifestyle" (urban, countryside, "lives near a coast"…)
- others would be built into the model.
There are several types of variables to consider:
Variables that describe a whole area (a region of interest), like "air quality in a department".
Other variables of this type would be "sunny days per year", "number of rentals with gardens in the area", "surface area of parks in the area", etc.
Points of interest (POIs), like "a hospital situated at these geo coordinates". Rather than averaging the number of hospitals in a given district and taking this number, a better approach would be to treat each hospital as a POI and apply a quality-of-life score that decreases with distance, maybe with thresholds. For example: anything at < 5 km from a hospital → 100, then 80, then 60… or just a continuous decrease.
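The distance-based scoring described above could be sketched as follows. This is only an illustration: the function names, the 5 km band width applied to every step, the 25 km cut-off for the continuous variant, and the choice of scoring against the nearest POI are all assumptions, not decisions taken in these notes.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def stepped_score(distance_km, step_km=5.0, drop=20.0):
    """Threshold variant: 100 below step_km, then 80, 60, ... never below 0."""
    steps = int(distance_km // step_km)
    return max(0.0, 100.0 - drop * steps)


def continuous_score(distance_km, max_km=25.0):
    """Continuous variant: linear decrease from 100 at 0 km to 0 at max_km (assumed)."""
    return max(0.0, 100.0 * (1.0 - distance_km / max_km))


def poi_score(lat, lon, pois, score_fn=stepped_score):
    """Score a location against its nearest POI; pois is a list of (lat, lon)."""
    if not pois:
        return 0.0
    nearest = min(haversine_km(lat, lon, p_lat, p_lon) for p_lat, p_lon in pois)
    return score_fn(nearest)
```

Passing `score_fn=continuous_score` switches from the stepped to the continuous decrease without touching the distance logic.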
Then there are the POIs we like to have, but not right next to where we live:
→ airports, train stations, highways, ring roads…
These variables would decrease the quality of life right next to them, but would increase it beyond a given radius.
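One possible shape for such a variable is a score that is low next to the POI, reaches its maximum beyond a "nuisance" radius, and fades again once the POI is too far away to be useful. The sketch below assumes exactly that; the function name, the three radii and the final fade-out are hypothetical choices, not something specified in these notes.

```python
def ambivalent_score(distance_km, nuisance_km=2.0, useful_km=30.0, max_km=60.0):
    """Score in [0, 100] for POIs that are useful but unpleasant up close.

    All three radii are placeholder values (assumptions):
    - below nuisance_km the score climbs linearly from 0 to 100,
    - between nuisance_km and useful_km it stays at 100,
    - beyond useful_km it falls linearly to 0 at max_km.
    """
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    if distance_km < nuisance_km:
        return 100.0 * distance_km / nuisance_km
    if distance_km <= useful_km:
        return 100.0
    if distance_km >= max_km:
        return 0.0
    return 100.0 * (max_km - distance_km) / (max_km - useful_km)
```

Different POI types would likely need different radii: a highway and an airport are probably not a nuisance over the same distance.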
I think it would be a mistake to build rent into the model, because users will have different purchasing powers which can’t be averaged. Paris would always be shown as "too expensive for rent", when some users can actually afford an expensive rent and are ready to consider Paris.
So, something like "max rent you can afford" + "min square meters you need to live in" would be two questions to ask before the results are computed.
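Treating rent as a user-supplied filter rather than a model variable could look like the sketch below. The `Area` type, the `rent_per_m2` field and the figures in the usage note are made up for illustration; real rent data would have to come from an actual source.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    rent_per_m2: float  # monthly rent per square meter (hypothetical field)


def affordable_areas(areas, max_rent, min_surface_m2):
    """Keep the areas where the minimum surface the user needs fits their budget."""
    return [a for a in areas if a.rent_per_m2 * min_surface_m2 <= max_rent]
```

With placeholder figures such as `Area("expensive_city", 30.0)` and `Area("cheap_town", 10.0)`, a user asking for 40 m² with a budget of 1500 keeps both areas, while a user with a budget of 800 only keeps `cheap_town`. The expensive area is filtered per user instead of being penalised for everyone.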
Possible sources: OpenStreetMap, Wikipedia, INSEE for France? A real estate agency for rent prices?
Points of interest and regions of interest would be stored as geohashes. This would conveniently deal with:
- storing geo info at different zoom levels, retrieving it at a given zoom level, and monitoring how dense the info is at each zoom level (do I have a lot of city-wide info? Or lots of street-level info as well?)
- enabling quick checks of proximity, inclusion, neighborhood (I think…)
- facilitating geo ops outside of geo DBs if and when necessary
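To make the geohash idea concrete, here is a minimal sketch of the standard geohash encoding written in plain Python, plus two small helpers for the zoom-level and density points above. The helper names `same_cell` and `density_by_cell` are hypothetical; in practice an existing geohash library could replace the encoder.

```python
from collections import Counter

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a (lat, lon) pair in degrees as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def same_cell(hash_a, hash_b, level):
    """True if both geohashes fall in the same cell at the given zoom level."""
    return hash_a[:level] == hash_b[:level]


def density_by_cell(hashes, level):
    """Count how many stored items fall in each cell at the given zoom level."""
    return Counter(h[:level] for h in hashes)
```

A shorter prefix is a larger cell, so truncating a stored hash gives the coarser zoom levels for free, and inclusion is a prefix check. Proximity is less direct: two points that are close together can sit on either side of a cell boundary and share only a short prefix, so a proximity check would also need to look at neighboring cells.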