# Performance

## Tools

I've set up pgbadger on the prototype - internal access at /postgres_stats.html. Note that for the moment the stats need to be refreshed manually, which makes more sense during development: `cd /var/www/pgbadger && pgbadger /var/lib/postgresql/9.3/main/pg_log/---.log`
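
If we later want the report refreshed automatically (e.g. on a staging box), a cron entry along these lines should work - the paths follow the manual command above, but the schedule and the log glob are assumptions:

```sh
# Hypothetical cron entry: rebuild the pgbadger report hourly.
# Paths match the manual command above; the log file glob is an assumption.
0 * * * * cd /var/www/pgbadger && pgbadger --outfile postgres_stats.html /var/lib/postgresql/9.3/main/pg_log/*.log
```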

## Managing concurrent connections

It is crucial for tuning the server to have a clear target for how many concurrent users (refreshing the page at the same time) we want to allow. We are going to assume that all static files (images, CSS, JS, etc.) are served from the Nginx cache, so that one user = one connection.

For now this section is mostly about tuning PostgreSQL memory settings, though we'll need to look at CPU, etc. later as well.

These are the settings that will depend on this information:

| Setting | Notes |
|---------|-------|
| s_ck_p | Number of CKAN WSGI processes |
| s_ck_t | Number of CKAN WSGI threads per process |
| s_pg_c | Number of allowed PostgreSQL connections |
| s_wm_c_c | PostgreSQL work_mem parameter for the CKAN app accessing the CKAN db |
| s_wm_c_d | PostgreSQL work_mem parameter for the CKAN app accessing the datastore db |
| s_wm_w_d | PostgreSQL work_mem parameter for the Windshaft server accessing the datastore db |
| s_sqlp_c_c | SQLAlchemy total pool size (size+max) from CKAN core to the CKAN db. Default 15 |
| s_sqlp_d_c | SQLAlchemy total pool size (size+max) from the CKAN datastore to the CKAN db. Default 15 |
| s_sqlp_d_dw | SQLAlchemy total pool size (size+max) from the CKAN datastore to the datastore db (write). Fixed at 15 |
| s_sqlp_d_dr | SQLAlchemy total pool size (size+max) from the CKAN datastore to the datastore db (read). Fixed at 15 |

This is the list of hardware values that will come into play:

| Var | Val | Notes |
|-----|-----|-------|
| ncpu_w | 4 | Number of CPUs on the Windshaft server (assuming one worker per CPU) |
| ram_p | 12GB | Overall RAM available to the PostgreSQL server |

This is the list of application-related values that will come into play:

| Var | Val | Notes |
|-----|-----|-------|
| uc | 1 | Number of connections per user. We assume all static files (JS, CSS, etc.) come from the Nginx cache, so that 1 user = 1 connection |
| mj_c_c | 4 | Max number of joins/sorts in queries from the CKAN app to the CKAN db (guessed) |
| mj_c_d | 2 | Max number of joins/sorts in queries from the CKAN app to the datastore db (guessed) |
| mj_w_d | 4 | Max number of joins/sorts in queries from the Windshaft server to the datastore db |
| mm_c_c | 4M | Max memory used for a sort/join in queries from the CKAN app to the CKAN db (guessed) |
| mm_c_d | 3GB | Max memory used for a sort/join in queries from the CKAN app to the datastore db |
| mm_w_d | 100M | Max memory used for a sort/join in queries from the Windshaft server to the datastore db |
| cp_c_c | min(2 * s_ck_t, s_sqlp_c_c + s_sqlp_d_c) * s_ck_p | Max number of PostgreSQL connections from CKAN to the CKAN db. We assume there is never more than one connection per engine per thread. |
| cp_c_d | min(2 * s_ck_t, s_sqlp_d_dw + s_sqlp_d_dr) * s_ck_p | Max number of PostgreSQL connections from CKAN to the datastore db. We assume there is never more than one connection per engine per thread. |
| cp_w_d | ncpu_w | Max number of PostgreSQL connections from the Windshaft server to the datastore db |

These are the constraints (u being the target number of concurrent users):

| Constraint | Notes |
|------------|-------|
| s_ck_p * s_ck_t >= u * uc | Ensure we can serve all the connections clients need |
| cp_c_c + cp_c_d + cp_w_d <= s_pg_c | Ensure we have enough PostgreSQL connections |
| cp_c_c * mj_c_c * s_wm_c_c + cp_c_d * mj_c_d * s_wm_c_d + cp_w_d * mj_w_d * s_wm_w_d <= ram_p | Ensure we have enough RAM |

What we would like (though it is not a hard constraint):

| Desire | Notes |
|--------|-------|
| mm_c_c <= s_wm_c_c | Ensure CKAN sorts/joins happen in memory |
| mm_c_d <= s_wm_c_d | Ensure CKAN datastore sorts/joins happen in memory |
| mm_w_d <= s_wm_w_d | Ensure Windshaft sorts/joins happen in memory |

Now let's say we want 10 concurrent users. We'd rather keep to two processes, so let's have s_ck_p = 2 and s_ck_t = 5. Keeping the defaults for the s_sqlp_* variables, we get cp_c_c = min(2 * 5, 15 + 15) * 2 = 20, cp_c_d = 20 and cp_w_d = 4, and thus s_pg_c = 44.

Now for the work_mem values. First of all, we're not going to fit 3GB in there, so let's leave s_wm_c_d at a low value (say 10M). With this in place, we can have s_wm_c_c at 4M and s_wm_w_d at 100M and still satisfy the RAM constraint.
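
A throwaway shell check of these numbers against the constraints table:

```sh
# Connection budget: cp_c_c + cp_c_d + cp_w_d <= s_pg_c
echo $(( 20 + 20 + 4 ))                    # 44, so s_pg_c = 44 is enough
# RAM budget (in MB): sum of cp * mj * work_mem for the three paths
echo $(( 20*4*4 + 20*2*10 + 4*4*100 ))     # 2320MB, well below ram_p = 12GB
```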

## Server settings

I have not yet found these settings to have a major impact, but it is important to keep them in mind.

kernel.shmmax - the largest size of a single shared memory segment. Defaults to 33MB on Linux, though several guides I have read (including the OSM guide for setting up a tile server) suggest this needs to be increased for large PostgreSQL databases. I have set it to 4GB. I did not see any noticeable performance improvement, but it's useful to be aware of this parameter.

kernel.shmall - total available shared memory. Defaults to 8GB (the actual number is expressed in pages of 4k). I have not touched this, though it should be increased if we want to raise shared_buffers (see Postgres settings) above 8GB.
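
For reference, a sketch of the corresponding sysctl configuration - the values are the ones discussed above, but the file name is illustrative:

```sh
# /etc/sysctl.d/30-postgresql-shm.conf (illustrative file name)
kernel.shmmax = 4294967296    # 4GB, as set above
kernel.shmall = 2097152       # 8GB in 4k pages (the default, left untouched)

# Apply without a reboot:
#   sudo sysctl -p /etc/sysctl.d/30-postgresql-shm.conf
```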

## Postgres settings

The performance settings that are worth tuning are:

shared_buffers - The amount of memory available to PostgreSQL for caching data. The recommended value is about 25% of available memory. Setting this too high may be detrimental to performance, as PostgreSQL also relies on the system caches and it is useful not to claim all the memory. I have set this to the recommended value (4GB; the default is 128MB), though I have not otherwise tried to fine-tune it.

effective_cache_size - This is a hint (not an allocation) to tell PostgreSQL how much memory it may be able to rely on. It can typically be set to all the memory available to PostgreSQL (so shared_buffers + system caches). I've set this to 12GB, though I have not tried to fine-tune this value for best results. Setting both this and shared_buffers has shown some small performance improvements.

work_mem - The amount of memory used for operations such as sorting. The memory can be used once per sort/join, per query, per connection, so setting this too high can easily exhaust the RAM. This parameter will need careful tuning - see the section on managing concurrent connections.

Note: the work_mem setting can be set per role with `ALTER ROLE username SET work_mem = '100MB'`. This allows us to have a different work_mem depending on the context.
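
For example, with the values from the worked example in the concurrent connections section (the role names here are hypothetical; use whatever roles the CKAN app, the datastore API and Windshaft actually connect as):

```sql
-- Hypothetical role names: match them to the actual connection roles.
ALTER ROLE ckan_default SET work_mem = '4MB';        -- s_wm_c_c
ALTER ROLE datastore_default SET work_mem = '10MB';  -- s_wm_c_d
ALTER ROLE windshaft SET work_mem = '100MB';         -- s_wm_w_d
```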

checkpoint_segments - Defines the number of WAL segments between checkpoints, and therefore how much data is written per checkpoint. The default value of 3 (48MB) is acknowledged to be very low, and I have set this to the recommended value of 10. As we are not write-heavy, we probably don't need to tune this much further.

checkpoint_completion_target - Defines the fraction of the checkpoint interval over which a checkpoint's writes are spread. With checkpoint_segments at 10, this can be set to 0.9 (90%), which spreads the writes more evenly.

random_page_cost - The cost of a random disk access. By default this is set to 4 times the cost of a sequential read. This value is generally considered too high, though we'll need to measure properly before changing it.
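
Putting the above together, the relevant postgresql.conf lines on the prototype would look like this (the global work_mem line is illustrative, since the effective values are set per role as described above):

```
# postgresql.conf (PostgreSQL 9.3)
shared_buffers = 4GB                  # default 128MB; ~25% of RAM
effective_cache_size = 12GB           # planner hint: shared_buffers + OS cache
work_mem = 4MB                        # illustrative; overridden per role
checkpoint_segments = 10              # default 3
checkpoint_completion_target = 0.9
#random_page_cost = 4.0               # default, to revisit once measured properly
```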

## Nginx caching

We are going to use Nginx, in front of the Apache/CKAN server, for caching. As map tiles & grids go through CKAN, this will also provide caching for map tiles.

Given that it is not possible to cache all map tiles (see http://wiki.openstreetmap.org/wiki/Tile_disk_usage for instance), we should prioritize tiles that are viewed by many users. If a single user views a particular location at a high zoom level, chances are no-one else will need that tile unless it's in a popular location. That user's browser will cache the tile (we need to set browser caching directives appropriately), so we don't need to cache it.

The plan then is to have three different caches:

| Cache | Specification |
|-------|---------------|
| general | 5GB; cache all images/CSS/JS for one week and instruct the client to cache for one day. |
| map | 20GB; cache all map tiles & grids that have been accessed more than 10 times for one week and instruct the client to cache for one day. |
| grid | 5GB; cache all grid data (access to /api/3/action/datastore_search) that has been accessed more than 10 times for one week and instruct the client to cache for one day. |
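
As a sketch, the "map" cache from this table might look like the following in the Nginx configuration (zone name, paths and upstream are illustrative; the sizes and durations follow the table):

```nginx
# Illustrative config for the "map" cache described above.
proxy_cache_path /var/cache/nginx/map levels=1:2 keys_zone=map_cache:64m
                 max_size=20g inactive=7d;

location /map-tile/ {
    proxy_pass http://apache_ckan;   # illustrative upstream name
    proxy_cache map_cache;
    proxy_cache_min_uses 10;         # only cache tiles requested at least 10 times
    proxy_cache_valid 200 7d;        # keep successful responses for one week
    expires 1d;                      # instruct the client to cache for one day
}
```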

## Database indexes

The best performance has been achieved by using a partial index on the geometry fields:

```sql
CREATE INDEX the_geom_webmercator_index 
          ON "13f0ed4b-b99a-4f32-8554-ab72cb074e15" 
       USING GIST(_the_geom_webmercator) 
       WHERE _the_geom_webmercator IS NOT NULL;
```

Note that using a partial index prevents clustering; however, when testing, the performance gains from the partial index were higher than those from clustering.
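
To check that the planner actually picks the partial index up, an EXPLAIN on a bounding-box query against the same resource can be used (the envelope below is the full Web Mercator extent, as in the tile queries):

```sql
EXPLAIN ANALYZE
SELECT _the_geom_webmercator
  FROM "13f0ed4b-b99a-4f32-8554-ab72cb074e15"
 WHERE _the_geom_webmercator IS NOT NULL
   AND _the_geom_webmercator && ST_MakeEnvelope(
         -20037508.3, -20037508.3, 20037508.3, 20037508.3, 3857);
```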

## Tile queries

The tile queries, as executed (after being created in CKAN and then modified by both Windshaft and Mapnik), look like this:

```sql
SELECT ST_AsBinary("_the_geom_webmercator") AS geom 
  FROM (SELECT _the_geom_webmercator 
	  FROM (SELECT DISTINCT ON (_the_geom_webmercator) "c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator AS _the_geom_webmercator 
	          FROM "c5ac4915-ef7d-4934-b72d-0ed6f66b944e" 
	         WHERE ST_Intersects("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator, ST_Expand(ST_transform(ST_SetSrid(ST_MakeBox2D(ST_makepoint(-90.0, 66.51326044311186), ST_makepoint(0.0, 0.0)), 4326), 3857), 39135.8 * '4'))
               ) AS _mapplugin_sub ORDER BY random()
       ) as cdbq 
 WHERE "_the_geom_webmercator" && ST_SetSRID('BOX3D(-20037508.3 -20037508.3,20037508.3 20037508.3)'::box3d, 3857)
```

Things to note:

- The inner ST_Intersects is generated in the CKAN extension. Its purpose is to limit the query to fetch only the points needed for the given tile. This does not use Mapnik's !bbox! placeholder, as for some reason that is set to the extent of the whole map;
- The outer `"_the_geom_webmercator" && ...` part of the query is added by Mapnik because no !bbox! placeholder was used. The performance impact is minimal, so dealing with this is not a priority;
- This query can be generated by going to `/map-tile/2/1/1.png?filters=%5B%5D&resource_id=c5ac4915-ef7d-4934-b72d-0ed6f66b944e&style=plot`.

## Tile queries: explain analyze

```
Subquery Scan on cdbq  (cost=722623.12..727620.18 rows=333137 width=48) (actual time=2643.463..2670.674 rows=24855 loops=1)
   ->  Sort  (cost=722623.12..723455.96 rows=333137 width=48) (actual time=2643.411..2650.240 rows=24855 loops=1)
         Sort Key: (random())
         Sort Method: quicksort  Memory: 4081kB
         ->  Subquery Scan on _mapplugin_sub  (cost=675986.97..681816.87 rows=333137 width=48) (actual time=2296.780..2617.496 rows=24855 loops=1)
               ->  Unique  (cost=675986.97..677652.66 rows=333137 width=48) (actual time=2296.765..2610.315 rows=24855 loops=1)
                     ->  Sort  (cost=675986.97..676819.81 rows=333137 width=48) (actual time=2296.762..2535.627 rows=220169 loops=1)
                           Sort Key: "c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator
                           Sort Method: external merge  Disk: 11824kB
                           ->  Bitmap Heap Scan on "c5ac4915-ef7d-4934-b72d-0ed6f66b944e"  (cost=31933.81..635180.72 rows=333137 width=48) (actual time=164.464..1051.077 rows=220169 loops=1)
                                 Recheck Cond: ((_the_geom_webmercator && '0103000020110F00000100000005000000F976E22B686863C1B2999999F91B03C1F976E22B686863C1FB76E22B686863419A999999F91B0341FB76E22B686863419A999999F91B0341B2999999F91B03C1F976E22B686863C1B2999999F91B03C1'::geometry) AND (_the_geom_webmercator && '0103000020110F00000100000005000000CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1'::geometry))
                                 Filter: _st_intersects(_the_geom_webmercator, '0103000020110F00000100000005000000F976E22B686863C1B2999999F91B03C1F976E22B686863C1FB76E22B686863419A999999F91B0341FB76E22B686863419A999999F91B0341B2999999F91B03C1F976E22B686863C1B2999999F91B03C1'::geometry)
                                 ->  Bitmap Index Scan on geom_mercator_index  (cost=0.00..31850.52 rows=999411 width=0) (actual time=139.010..139.010 rows=220169 loops=1)
                                       Index Cond: ((_the_geom_webmercator && '0103000020110F00000100000005000000F976E22B686863C1B2999999F91B03C1F976E22B686863C1FB76E22B686863419A999999F91B0341FB76E22B686863419A999999F91B0341B2999999F91B03C1F976E22B686863C1B2999999F91B03C1'::geometry) AND (_the_geom_webmercator && '0103000020110F00000100000005000000CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1'::geometry))
```

We can see that the main resource usage is caused by the DISTINCT. Using a GROUP BY instead yields a minor performance improvement, though there might be better options.
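
For reference, a sketch of the GROUP BY variant of the inner subquery (same envelope as in the captured query above):

```sql
-- Equivalent to the DISTINCT ON (_the_geom_webmercator) subquery, as a GROUP BY:
SELECT _the_geom_webmercator
  FROM "c5ac4915-ef7d-4934-b72d-0ed6f66b944e"
 WHERE ST_Intersects(_the_geom_webmercator,
         ST_Expand(ST_Transform(ST_SetSRID(ST_MakeBox2D(
           ST_MakePoint(-90.0, 66.51326044311186),
           ST_MakePoint(0.0, 0.0)), 4326), 3857), 39135.8 * 4))
 GROUP BY _the_geom_webmercator;
```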

Note: this has since been improved by increasing work_mem and thereby moving the sorts into memory.

## Grid queries

The grid queries, as executed (after being created in CKAN and then modified by both Windshaft and Mapnik), look like this:

```sql
SELECT ST_AsBinary("_the_geom_webmercator") AS geom,
       "_id", "count", "lat", "lng",
       "catalogNumber", "class", "collectionCode", "continent", "country", "county", "dateIdentified",
       "family", "genus", "habitat", "identifiedBy", "institutionCode", "kingdom", "locality", "order",
       "phylum", "scientificName", "stateProvince", "subgenus" 
  FROM (SELECT _id[1] AS _id, count, st_y(_geom[1]) AS lat, st_x(_geom[1]) AS lng, _mapplugin_center AS _the_geom_webmercator,
               kingdom[1] AS kingdom, county[1] AS county, genus[1] AS genus, subgenus[1] AS subgenus,
               habitat[1] AS habitat, family[1] AS family, "stateProvince"[1] AS "stateProvince", 
               country[1] AS country, "institutionCode"[1] AS "institutionCode", locality[1] AS locality,
              "collectionCode"[1] AS "collectionCode", class[1] AS class, 
              "identifiedBy"[1] AS "identifiedBy", phylum[1] AS phylum, 
              "scientificName"[1] AS "scientificName", "dateIdentified"[1] AS "dateIdentified",
              "catalogNumber"[1] AS "catalogNumber", continent[1] AS continent, "order"[1] AS "order"
          FROM (SELECT 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".kingdom) AS kingdom,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".county) AS county,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".genus) AS genus,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".subgenus) AS subgenus,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".habitat) AS habitat,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".family) AS family,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."stateProvince") AS "stateProvince",
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".country) AS country,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."institutionCode") AS "institutionCode", 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".locality) AS locality,
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."collectionCode") AS "collectionCode", 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".class) AS class, 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."identifiedBy") AS "identifiedBy", 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".phylum) AS phylum, 
                       array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."scientificName") AS "scientificName", 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."dateIdentified") AS "dateIdentified", 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."catalogNumber") AS "catalogNumber", 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._id) AS _id, 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e".continent) AS continent, 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"."order") AS "order", 
                      count("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator) AS count, 
                      array_agg("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._geom) AS _geom, 
                      ST_SnapToGrid("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator, 39135.8 * 4, 39135.8 * 4) AS _mapplugin_center 
	         FROM "c5ac4915-ef7d-4934-b72d-0ed6f66b944e" 
	        WHERE ST_Intersects("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator, ST_transform(ST_SetSrid(ST_MakeBox2D(ST_makepoint(-90.0, 66.51326044311186), ST_makepoint(0.0, 0.0)), 4326), 3857)) 
             GROUP BY ST_SnapToGrid("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator, 39135.8 * 4, 39135.8 * 4) 
             ORDER BY count DESC
  ) AS _mapplugin_sub
) as cdbq WHERE "_the_geom_webmercator" && ST_SetSRID('BOX3D(-20037508.3 -20037508.3,20037508.3 20037508.3)'::box3d, 3857)
```

Explain Analyze:

```
 Subquery Scan on _mapplugin_sub  (cost=650334.15..651565.11 rows=61548 width=712) (actual time=4348.386..4383.037 rows=1897 loops=1)
   ->  Sort  (cost=650334.15..650488.02 rows=61548 width=355) (actual time=4348.300..4373.662 rows=1897 loops=1)
         Sort Key: (count("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator))
         Sort Method: external merge  Disk: 38280kB
         ->  GroupAggregate  (cost=627643.69..635337.19 rows=61548 width=355) (actual time=2834.964..4235.346 rows=1897 loops=1)
               ->  Sort  (cost=627643.69..627797.56 rows=61548 width=355) (actual time=2834.672..3183.612 rows=202054 loops=1)
                     Sort Key: (st_snaptogrid("c5ac4915-ef7d-4934-b72d-0ed6f66b944e"._the_geom_webmercator, 0::double precision, 0::double precision, 156543.2::double precision, 156543.2::double precision))
                     Sort Method: external merge  Disk: 50152kB
                     ->  Bitmap Heap Scan on "c5ac4915-ef7d-4934-b72d-0ed6f66b944e"  (cost=27127.96..612646.72 rows=61548 width=355) (actual time=112.946..1447.391 rows=202054 loops=1)
                           Recheck Cond: (_the_geom_webmercator && '0103000020110F0000010000000500000093107C45F81B63C100000040A65408BE93107C45F81B63C195107C45F81B6341000000000000000095107C45F81B6341000000000000000000000040A65408BE93107C45F81B63C100000040A65408BE'::geometry)
                           Filter: ((st_snaptogrid(_the_geom_webmercator, 0::double precision, 0::double precision, 156543.2::double precision, 156543.2::double precision) && '0103000020110F00000100000005000000CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B7341CDCCCC44F81B73C1CDCCCC44F81B73C1CDCCCC44F81B73C1'::geometry) AND _st_intersects(_the_geom_webmercator, '0103000020110F0000010000000500000093107C45F81B63C100000040A65408BE93107C45F81B63C195107C45F81B6341000000000000000095107C45F81B6341000000000000000000000040A65408BE93107C45F81B63C100000040A65408BE'::geometry))
                           ->  Bitmap Index Scan on geom_mercator_index  (cost=0.00..27112.58 rows=923222 width=0) (actual time=92.183..92.183 rows=202054 loops=1)
                                 Index Cond: (_the_geom_webmercator && '0103000020110F0000010000000500000093107C45F81B63C100000040A65408BE93107C45F81B63C195107C45F81B6341000000000000000095107C45F81B6341000000000000000000000040A65408BE93107C45F81B63C100000040A65408BE'::geometry)
 Total runtime: 4410.951 ms
```

Note: this has since been improved by increasing work_mem and thereby moving the sorts into memory.

This query can be generated by going to `/map-grid/2/1/1.grid.json?callback=lu0.lu_2_1_1&filters=%5B%5D&resource_id=c5ac4915-ef7d-4934-b72d-0ed6f66b944e&style=plot`.
