FlightSafety GeoPackage Tiling Experiments - sofwerx/cdb2-concept GitHub Wiki

FlightSafety GeoPackage Tiling Experiments

Setup

The data used for these experiments are primarily freely available, and include the following

Tiling Scheme

The tiling scheme uses the GNOSIS Global Grid, described through the Tile Matrix Set (TMS) extension: https://gitlab.com/imagemattersllc/ogc-vtp2/-/blob/master/extensions/14-tile-matrix-set.adoc. We are using the same type of JSON file that Ecere is using in their experiment.

LOD Grouping

The grouping is pre-set per experiment. The groups are calculated from the highest LOD back to the coarser LODs. For example, if there are 7 LODs (0-6) and a grouping of 4, then LODs 3 through 6 are in one GeoPackage, and LODs 0 through 2 are in another GeoPackage.
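The grouping rule can be sketched in a few lines of Python. This is a minimal illustration, not the tool used to build the experiments; the function name lod_groups and its parameters are hypothetical.

```python
def lod_groups(max_lod: int, grouping: int) -> list[tuple[int, int]]:
    """Return (first_lod, last_lod) ranges, coarsest group first.

    Groups are filled from the highest LOD downward, so any
    short "remainder" group ends up at the coarse end (LOD 0).
    """
    if max_lod < 0 or grouping < 1:
        raise ValueError("max_lod must be >= 0 and grouping >= 1")
    groups = []
    last = max_lod
    while last >= 0:
        first = max(0, last - grouping + 1)
        groups.append((first, last))
        last = first - 1
    groups.reverse()
    return groups


# 7 LODs (0-6) with a grouping of 4
print(lod_groups(6, 4))  # [(0, 2), (3, 6)]
```

With a maximum LOD of 16 and a grouping of 6, the same rule gives the ranges 0-4, 5-10 and 11-16.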

Directory and Naming Scheme

Each top level tile is within a directory whose name encodes the LOD, the row (rows are counted from the top, so north to south), and the column (counted by longitude, west to east). For example, "L0_R1_C2". Each tile directory contains one GeoPackage file (for example "Imagery_L0_L2_R1_C2.gpkg") and all the tile directories that refine this area (such as "L3_R9_C22"). There were two intentions behind this directory structure:

  1. Limit the number of files in a directory (to keep from running into OS limitations).
  2. Make it a bit easier to export a portion of the world by hand from one CDB X to another.
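As an illustration of the naming scheme, the sketch below builds the nested path for a tile from its LOD, row and column. It assumes that a tile's ancestor at a coarser LOD is found by halving the row and column once per level (a plain quadtree); the GNOSIS Global Grid merges tiles near the poles, so this shortcut is only an approximation there. The function name tile_path and its arguments are hypothetical.

```python
from pathlib import PureWindowsPath


def tile_path(root, layer, lod, row, col, groups):
    """Build the directory path and GeoPackage name for a tile.

    groups: (first_lod, last_lod) ranges, coarsest first.
    Assumption: ancestor row/col = row/col shifted right once
    per LOD of difference (ignores polar tile merging).
    """
    parts = [root, layer]
    filename = None
    for first, last in groups:
        if first > lod:
            break
        shift = lod - first
        r, c = row >> shift, col >> shift
        parts.append(f"L{first}_R{r}_C{c}")
        filename = f"{layer}_L{first}_L{last}_R{r}_C{c}.gpkg"
    if filename is None:
        raise ValueError("no LOD group contains the requested LOD")
    return PureWindowsPath(*parts, filename)


# LOD ranges 0-4, 5-10, 11-16 (grouping of 6, maximum LOD 16)
print(tile_path("CDBX_highres", "Imagery", 11, 1120, 2412,
                [(0, 4), (5, 10), (11, 16)]))
```

For these inputs the result is CDBX_highres\Imagery\L0_R0_C1\L5_R17_C37\L11_R1120_C2412\Imagery_L11_L16_R1120_C2412.gpkg.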

Directory and File Naming Example

This is an example of how the first two experiments are structured. Top CDB directory:

Within the Imagery layer, there are 8 sub-directories to represent the 8 tiles at the lowest LOD in the world.


Within each top tile directory, there is a single GeoPackage containing a range of LODs, and all the subdirectories that further refine this area (provided that the CDB was not built with all the LODs in a single GeoPackage). In this example, the next level of directories starts at LOD 3. The maximum number of subdirectories is capped by the number of LODs grouped into a single GeoPackage: (2^grouping)^2.
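The cap on subdirectories can be checked with a short calculation. The sketch assumes every tile splits into 2 x 2 children at each LOD, so it gives an upper bound; the function name is hypothetical.

```python
def max_subdirectories(lods_in_group: int) -> int:
    """Upper bound on child tile directories under one tile directory.

    A GeoPackage that spans `lods_in_group` LODs is refined by tiles
    that are `lods_in_group` levels finer, i.e. 2**n tiles per axis.
    """
    tiles_per_axis = 2 ** lods_in_group
    return tiles_per_axis ** 2


print(max_subdirectories(3))  # 64   (GeoPackage holds L0-L2, children at L3)
print(max_subdirectories(6))  # 4096
```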


At the leaf level, there is a single GeoPackage. We might rework this to skip the last directory level in this case, to limit the number of directories.


Experiment 1

Purpose of Experiment

This experiment shows how the top levels of the tiling scheme work, how the LODs are grouped into multiple GeoPackage files, and what the proposed directory and file naming looks like. There are 8 top level tiles (2 rows and 4 columns), and all GeoPackages that refine one of these tiles are under that tile's directory structure.

Processing

This experiment uses the NASA Blue Marble imagery to approximate world-wide imagery at a high level. This provides 7 levels of detail (L0 to L6). Normally, the GeoPackage files should be larger for efficient use, but to show the LOD groupings, only 4 LODs are grouped together. The imagery is stored as JPEG so that tools can display it more easily. We originally created JPEG 2000 files, but the results were harder to check in a tool like "DB Browser for SQLite". The data size for this experiment is around 300 MB.
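The same kind of check can be scripted. The sketch below counts tiles per zoom level in a GeoPackage tile table and reports how many start with the JPEG signature. It relies on the standard GeoPackage tile table columns zoom_level and tile_data; the table name must be supplied by the caller, and whether zoom_level equals the LOD number in these files is an assumption.

```python
import sqlite3

JPEG_MAGIC = b"\xff\xd8\xff"  # first bytes of every JPEG stream


def tile_summary(conn: sqlite3.Connection, table: str) -> dict:
    """Map zoom_level -> (tile count, count of tiles that look like JPEG)."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"SELECT zoom_level, tile_data FROM {quoted}")
    summary = {}
    for zoom, blob in rows:
        total, jpeg = summary.get(zoom, (0, 0))
        is_jpeg = blob is not None and bytes(blob[:3]) == JPEG_MAGIC
        summary[zoom] = (total + 1, jpeg + int(is_jpeg))
    return summary
```

A typical call would open one of the .gpkg files with sqlite3.connect(path) and pass the name of its imagery tile table, which can be looked up in the gpkg_contents table.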

Data Location

Compressed 7-zip file at: https://drive.google.com/drive/folders/1zfdBnAHpf9McLTbaJotdKGR_tXQad4dy?usp=sharing

Experiment 2

Purpose of Experiment

This experiment further tests the limits of the LOD grouping and directory organization. It is the World CDB X from experiment 1 with small insets of higher resolution imagery: 15 m data at LOD 12 covering New York City, and 2 ft imagery at LOD 16 covering Central Park on Manhattan Island.

Processing

Same processing as experiment 1, but with an LOD grouping of 6, which we judged during planning to be a good balance between GeoPackage size and number of subdirectories: (2^6)^2 = 4096 maximum directories within one folder. The maximum LOD for this experiment is 16 (60 cm). To find the highest resolution data, look at file CDBX_highres\Imagery\L0_R0_C1\L5_R17_C37\L11_R1120_C2412\Imagery_L11_L16_R1120_C2412.gpkg. The data size for this experiment is almost 1.5 GB.
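The resolution figures can be roughly cross-checked against the tiling. The sketch below assumes that a LOD 0 tile spans 90 degrees (consistent with the 2 x 4 top level tiles), that each tile is 256 pixels across, and that one degree is about 111.32 km at the equator. The tile pixel size and the function name are assumptions, not values taken from these experiments.

```python
METERS_PER_DEGREE = 111_320.0  # approximate, at the equator


def approx_resolution_m(lod: int, tile_pixels: int = 256,
                        lod0_tile_degrees: float = 90.0) -> float:
    """Approximate ground size of one pixel, in meters, at a given LOD."""
    degrees_per_pixel = lod0_tile_degrees / (2 ** lod * tile_pixels)
    return degrees_per_pixel * METERS_PER_DEGREE


for lod in (6, 12, 16):
    print(lod, round(approx_resolution_m(lod), 2))
```

Under these assumptions LOD 16 comes out at about 0.6 m per pixel, which matches the 60 cm quoted above.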

Data Location

Compressed 7-zip file at: https://drive.google.com/drive/folders/1zKuu0oQy3K5oKestuPIvL-BRiSPn9oaO?usp=sharing

Experiments 3 and 4

Purpose of Experiment

Goals for these experiments:

  • There are two different tiled layers, Imagery and Elevation
  • The data coverage is world-wide, containing 1000m resolution imagery and elevation.
  • The directory structure was reworked to reduce the number of directories produced, so that it is no longer a 1-to-1 file to directory ratio. To copy over a section of the world, one would need to copy both the GeoPackage and the directory with the similar name.
  • The GeoPackage files were renamed to be lod_row_col_endlod.gpkg (after the layer name), to keep the LOD/row/column triplet together. For example, Imagery_L4_R9_C6_L6.gpkg
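The renamed files can be produced and parsed with a small helper. The pattern below is inferred from the single example Imagery_L4_R9_C6_L6.gpkg; the leading layer name and the helper names are assumptions for illustration.

```python
import re
from typing import NamedTuple


class TileFile(NamedTuple):
    layer: str
    lod: int      # first (coarsest) LOD stored in the file
    row: int
    col: int
    end_lod: int  # last (finest) LOD stored in the file


_PATTERN = re.compile(
    r"^(?P<layer>.+)_L(?P<lod>\d+)_R(?P<row>\d+)_C(?P<col>\d+)"
    r"_L(?P<end_lod>\d+)\.gpkg$"
)


def format_name(t: TileFile) -> str:
    return f"{t.layer}_L{t.lod}_R{t.row}_C{t.col}_L{t.end_lod}.gpkg"


def parse_name(name: str) -> TileFile:
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"not a lod_row_col_endlod name: {name!r}")
    return TileFile(m["layer"], int(m["lod"]), int(m["row"]),
                    int(m["col"]), int(m["end_lod"]))


print(parse_name("Imagery_L4_R9_C6_L6.gpkg"))
```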

Updated Directory Structure

The directory structure was changed from having each GeoPackage within a directory of the same name (yielding a 1:1 ratio of files to directories), to having a finer resolution GeoPackage in a directory named after the coarser tile. If there is even finer resolution data beyond this GeoPackage, it will be found in a directory at the same level as the GeoPackage, named after the tile so that it matches most of the GeoPackage filename (all but the end LOD value). Pictures of the structure below:

[Images: top GeoPackage level; mid-level directory structure; leaf directory structure]

Processing

These experiments use the NASA Blue Marble imagery as world-wide imagery and USGS GTOPO30 elevation data. This provides 7 levels of detail (L0 to L6). Normally, the GeoPackage files should be larger for efficient use, but to show the LOD groupings, only 3 LODs are grouped together. The imagery is stored as JPEG, so that SQLite tools can display it more easily, and the elevation is stored as 32-bit floating point GeoTIFF files. The uncompressed data size is around 3.05 GB.

For experiment 3, the imagery and elevation layers were built into different GeoPackages and different directory structures. For experiment 4, the imagery and elevation were combined into a single set of GeoPackages and directories while keeping the LOD grouping.
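One way to see the difference between the two experiments is to list what each GeoPackage declares in its gpkg_contents table, which every GeoPackage carries. The sketch below is a minimal illustration; the table names stored in these particular files are not documented here, so any names shown by it are whatever the files contain.

```python
import sqlite3


def list_contents(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return (table_name, data_type) for every entry in gpkg_contents."""
    return conn.execute(
        "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
    ).fetchall()
```

An experiment 3 file would be expected to list a single layer, and an experiment 4 file both the imagery and the elevation layer. Tile layers normally appear with data_type "tiles" and gridded coverages with "2d-gridded-coverage".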

Data Location

Experiment 3 compressed 7-zip file at: https://drive.google.com/drive/folders/1XpljPt_TrqqxsgcWaxLhi9C1qLE0nI32?usp=sharing

Experiment 4 compressed 7-zip file at: https://drive.google.com/drive/folders/1ipgsWaQmy2GWfUwnKVffJwggm6rXKcRj?usp=sharing

Observations

  • The file names and directory names are pretty hard to read and understand by looking at the files. But since the tiles are rarely on a "geocell" boundary, there might not be a good naming scheme.
  • Creating the LOD groupings based on the highest LOD of data makes it difficult to add data of a higher resolution later on. It might also make it harder to create "Versions" of the data that have been updated.
  • There are a lot of directories created with this tiling and naming scheme. In general, there is a 1-to-1 ratio of files to directories, and directories seem to be more work for an OS to create/modify/delete.
  • Official GeoPackage standards are pretty rigid for raster data. Tiles support a very limited set of raster types (PNG or JPEG), and the coverage extension supports only 16-bit PNG or 32-bit float GeoTIFF. Current OGC CDB 1.1 supports data types of 8-bit unsigned, 8/16/32-bit signed, and 32-bit floating point, with CDB 1.2 adding the capability to support TIFF bilevel images (1-bit).
  • Do we need the extra flexibility of putting different layers in different directory structures (and thus different GeoPackage files)?
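The observation about adding higher resolution data later can be made concrete. Because groups are counted back from the highest LOD, raising the maximum LOD by one moves every group boundary. The helper below is a hypothetical sketch that lists the first LOD of each group.

```python
def group_starts(max_lod: int, grouping: int) -> list[int]:
    """First LOD of each group when grouping back from max_lod."""
    return sorted(
        {max(0, last - grouping + 1) for last in range(max_lod, -1, -grouping)}
    )


print(group_starts(6, 4))  # [0, 3]
print(group_starts(7, 4))  # [0, 4]
```

Going from a maximum LOD of 6 to 7 with a grouping of 4 moves LOD 3 from the finer GeoPackages into the coarser ones, so existing files and directories would need to be rebuilt or renamed.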