Metastore removal

The metastore functionality

The Metastore is an optional GeoWebCache module responsible for handling meta-information attached to each generated tile, in particular:

  • identify groups of request parameters that lead to the generation of different tile contents, associating them with a unique numerical identifier
  • perform tile locking during tile creation, thus making sure two GeoWebCache instances won't generate the same tile at the same time
  • track tile creation time, which is then used to perform tile expiration

The metastore is coded on top of H2, with some parts being H2 specific and others being generic SQL. While the H2 database can be clustered, there are issues with its usage:

  • the reports of database corruptions over time and the database format varying from release to release make it not very palatable to enterprise customers
  • in large setups the DBMS is normally a given, so imposing a specific database is not an option

However, in discussions with the GWC maintainer the desire to remove the MetaStore entirely emerged. The MetaStore currently works as part of the StorageBroker, which uses the MetaStore and the BlobStore to perform its actions. Most of the GWC code actually talks to the StorageBroker, so it's easy to replace it with a different implementation once it has been turned into an interface.

Implementation plan

The plan is then as follows:

  • make the StorageBroker into an interface
  • the old StorageBroker, MetaStore and BlobStore would make up the "LegacyStorageBroker", which is going to be kept around and used to access and convert legacy tile sets
  • the new StorageBroker would simply use an improved BlobStore that handles directly on the file system all the functionality previously managed by the legacy MetaStore and BlobStore
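For illustration, here is a minimal sketch of what the extracted interface could look like; the method names are indicative only and may not match the final GWC API, while TileObject and StorageException stand for the existing GWC storage classes.

// Indicative sketch of the extracted interface, not the final GWC API.
public interface StorageBroker {

    /** Retrieves the cached tile contents, returning false on a cache miss. */
    boolean get(TileObject tile) throws StorageException;

    /** Stores the tile contents produced by the backend. */
    void put(TileObject tile) throws StorageException;

    /** Removes a single tile from the cache. */
    boolean delete(TileObject tile) throws StorageException;

    /** Removes every tile cached for the given layer. */
    boolean delete(String layerName) throws StorageException;
}

The LegacyStorageBroker and the new file system based implementation would then both implement this contract, so the rest of the GWC code can keep talking to the StorageBroker without knowing which one is in use.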

Replacing the parameter handling

The parameter handling in the current MetaStore takes all the extra parameters and associates them with a single numeric value, a long generated by a database sequence, which is known as the parameterId.

The parameterId is used in two places in GWC:

  • to build the full path to a tile
  • to identify the tile set in the disk quota module

The current full path to a tile looks as follows:

layerName/gridSet_zoom[_hexParamId]/hx_hy/x_y.format

where:

half = 2 << (z / 2)
hx = x / half
hy = y / half
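For illustration, a small sketch of how the full tile path could be assembled from the formula above; the method and parameter names are made up for this example.

// Illustrative only: builds the on disk tile path described above.
static String tilePath(String layerName, String gridSet, int z, long x, long y,
        String hexParamId, String format) {
    long half = 2 << (z / 2);   // size of the intermediate hx_hy directories
    long hx = x / half;
    long hy = y / half;
    String params = hexParamId == null ? "" : "_" + hexParamId;
    return layerName + "/" + gridSet + "_" + z + params
            + "/" + hx + "_" + hy + "/" + x + "_" + y + "." + format;
}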

The parameter id thus avoids collisions between tile sets using different parameters. The new system will use a fully on-disk mechanism to identify the parameters:

  • the group of parameters is hashed with the SHA1 algorithm, generating a hash that is very resilient to collisions (these remain unlikely, but not impossible); a hashing sketch follows the layout below
  • the new on disk layout looks as follows:
layerName/gridSet_zoom[_paramsSHA1]/hx_hy/x_y.format
layerName/gridSet_zoom[_paramsSHA1]/params.txt
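Below is a minimal sketch of how the parameter hash could be computed; the canonical encoding of the parameters used here is an assumption, not necessarily the one GWC will adopt.

import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: sorts the parameters into a stable canonical form,
// hashes it with SHA-1 and hex-encodes the digest for use in directory names.
static String paramsSha1(Map<String, String> parameters) throws Exception {
    StringBuilder canonical = new StringBuilder();
    for (Map.Entry<String, String> entry : new TreeMap<>(parameters).entrySet()) {
        canonical.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
    }
    byte[] digest = MessageDigest.getInstance("SHA-1")
            .digest(canonical.toString().getBytes("UTF-8"));
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
        hex.append(String.format("%02x", b));
    }
    return hex.toString();
}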

The params.txt file contains the clear text version of the parameters. In the unlikely case of a collision, with two parameter sets ending up with the same SHA1 value, the code will create a lock file at layerName/gridSet_zoom[_paramsSHA1]/lock.txt, which will prevent two GWC instances from trying to create new directories at the same time, and will then try to allocate a structure like:

layerName/gridSet_zoom_paramsSHA1_cnt/hx_hy/x_y.format
layerName/gridSet_zoom_paramsSHA1_cnt/params.txt

where cnt is a progressive counter. The lock will be released once the first free number is found on the file system.

When searching for tiles with a certain parameter set, GWC will first look for the straight SHA1 directory and check that the parameters actually match the contents of params.txt; if they do not, it will fall back on a linear search of the similarly named directories. To facilitate and speed up those checks the StorageBroker will keep an in-memory cache of the available paramsSHA1_cnt combinations, falling back on a disk check only in case of a miss.
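The lookup with the in-memory cache could be sketched along these lines; the class name and the way directories are probed are hypothetical.

import java.io.File;
import java.nio.file.Files;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: remembers which directory name was resolved for each
// parameter hash, so the params.txt comparison is only paid on a cache miss.
class ParameterDirectoryCache {

    private final Map<String, String> resolved = new ConcurrentHashMap<>();

    /** Returns the directory name for the parameter set, or null if none exists yet. */
    String resolve(File layerDir, String prefix, String paramsSha1, String paramsText) {
        String key = layerDir.getPath() + "|" + prefix + "|" + paramsSha1;
        return resolved.computeIfAbsent(key, k -> {
            // try the straight SHA1 directory first, then the _cnt variants
            for (int cnt = 0; ; cnt++) {
                String dirName = prefix + "_" + paramsSha1 + (cnt == 0 ? "" : "_" + cnt);
                File paramsFile = new File(layerDir, dirName + "/params.txt");
                if (!paramsFile.exists()) {
                    return null; // first free slot: the caller allocates a new directory here
                }
                if (paramsText.equals(read(paramsFile))) {
                    return dirName; // contents match, this is the right tile set
                }
            }
        });
    }

    private static String read(File file) {
        try {
            return new String(Files.readAllBytes(file.toPath()), "UTF-8");
        } catch (java.io.IOException e) {
            throw new RuntimeException(e);
        }
    }
}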

It is assumed that this will perform nicely because:

  • the SHA1 computation is fast
  • the cache will save significant amounts of disk access
  • the SHA1 algorithm offers in any case very good collision prevention (1 / 2^51)

Tile locking

The tile locking performed by the metastore serves two purposes:

  • preventing two instances of GWC from computing the same missing tile
  • avoiding issues with two GWC instances writing to the same target file

The first is considered to be a non-issue: the percentage of tiles that actually get computed in parallel by different GWC instances is minimal, since:

  • users normally access different parts of the map, and when they don't, conflicts arise only while the missing tiles are being computed, so there is little effective duplication of work
  • seeding has the potential to duplicate a lot of work, but two instances of GWC never seed the same layer at the same time; at the time of writing they would not be able to share the workload, and in case we evolve GWC in that direction we'll also make sure to orchestrate the various instances so that they don't do overlapping work

The remaining issue is file locking. This will be addressed as follows:

  • tiles will be written on disk in a work area, with the file named after a UUID (plus the format extension)
  • once the file is fully written it will be renamed into the target location, as sketched below; this operation is atomic on various filesystems, so it will either succeed or fail as a whole. A rename does not alter the file contents, only the directory entries, so the file won't be corrupted in case of collision (all network filesystems should prevent directory entry corruption under concurrent access)
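A sketch of the write-then-rename step, shown here with the java.nio.file API; the work area location and file naming are assumptions.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

// Illustrative only: write the tile bytes under a unique name in a work area,
// then atomically move the file over its final location so readers never see
// a partially written tile.
static void writeTile(Path workArea, Path targetTile, byte[] tileBytes) throws IOException {
    Files.createDirectories(workArea);
    Files.createDirectories(targetTile.getParent());
    Path temp = workArea.resolve(UUID.randomUUID().toString() + ".png"); // extension follows the tile format
    Files.write(temp, tileBytes);
    // the rename either fully succeeds or fully fails; on POSIX filesystems an
    // existing tile is simply replaced
    Files.move(temp, targetTile, StandardCopyOption.ATOMIC_MOVE);
}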

Tile expiry

At the time of writing the tile creation time, which should coincide with the time the request to the backend drawing the tile is made, is stored in the metastore. Java allows modifying the last modification time of a file; tests on Linux and Windows filesystems show it is possible to set the last modification time of a file to any value, even one earlier than the creation time (which not all file systems track), thus allowing us to use the last modification time as the time marker for expiry.
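Stamping a tile file with its creation time, and reading it back at expiry time, can be done with the standard Java API, for example:

import java.io.File;

// Illustrative only: record the tile creation time (milliseconds since the
// epoch) as the file's last modification time.
static boolean stampCreationTime(File tileFile, long creationTimeMillis) {
    return tileFile.setLastModified(creationTimeMillis);
}

// Later, during expiry checks, the marker is read straight from the file system.
static boolean isExpired(File tileFile, long maxAgeMillis) {
    return System.currentTimeMillis() - tileFile.lastModified() > maxAgeMillis;
}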

Upgrading

Upgrading a GWC using the previous metastore to the new storage broker requires the following steps:

  • access the metastore to get each tile's creation time, setting it on the file system as the last modification time
  • access the metastore to fetch all associations between parameter ids and parameter values, and migrate the directory structure to the SHA1 based approach

The above can be done at the first startup, with the elimination of the metastore at the very end of the conversion process (care will be taken to make sure the migration can be restarted mid-way in case of malfunctions of any kind).
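A rough sketch of the directory migration step follows; the hex naming of the legacy parameter directories and the method signature are assumptions made for the example, with the SHA1 value computed as in the hashing sketch shown earlier.

import java.io.File;
import java.io.PrintWriter;
import java.util.Map;

// Hypothetical sketch: renames one legacy parameterId directory to its SHA1
// based name and writes the params.txt companion file next to the tiles.
static void migrateTileSet(File layerDir, String gridSetAndZoom, long legacyParamId,
        String paramsSha1, Map<String, String> parameterValues) throws Exception {
    File legacyDir = new File(layerDir, gridSetAndZoom + "_" + Long.toHexString(legacyParamId));
    if (!legacyDir.isDirectory()) {
        return; // nothing was cached for this parameter set
    }
    File newDir = new File(layerDir, gridSetAndZoom + "_" + paramsSha1);
    if (!legacyDir.renameTo(newDir)) {
        throw new IllegalStateException("Could not rename " + legacyDir + " to " + newDir);
    }
    try (PrintWriter out = new PrintWriter(new File(newDir, "params.txt"), "UTF-8")) {
        for (Map.Entry<String, String> entry : parameterValues.entrySet()) {
            out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}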

In case the metastore was not in use, the directory structure will not need any change and will be used as is.

The disk quota mechanism poses problems in that the parameter ids are going to change from a long to a string. The current plan is to drop any existing old JE database and start anew, modifying the disk quota mechanism to use the string based parameter ids as opposed to the numeric ones.