Structured grid coverage readers

Description

Exposing structure

The current grid coverage API is based on the idea of a coverage reader containing a single coverage which, depending on the read parameters, can be subsetted, rescaled and possibly reprojected. Some readers, like image mosaic, are actually based on a collection of internal raster data (referred to as "granules") which:

  • is spatially distributed
  • can have a time and elevation associated
  • can have other custom dimensions associated
  • is associated to a number of attributes that allow for general feature selection

All of the above is unfortunately mostly opaque to the caller, apart from some level of access to dimensions provided via the metadata strings. This is suboptimal in several ways, first and foremost because all the available data has to be coerced into an in-memory string representation, which then has to be parsed on the other side. Also, the choice of dimensions made by the reader is somewhat arbitrary, and the user of the reader might want to use other time/elevation attributes without having to reconfigure the reader.

The current image mosaic is not the only example of a "structured" reader: NetCDF data sources, PostGIS rasters and other in-database rasters are further examples where variables and dimensions could be exposed with more control and in greater detail. Callers can leverage this to gain flexibility in exposing dimensions, and to inspect the inner structure of a complex coverage and better tailor their requests (WCS-EO, for example, exposes the inner structure of a mosaic and lets the caller work against the single granules composing it).

Allowing granule addition/removal

Very often these structured readers need to be modified in terms of the granules they contain by:

  • adding new granules
  • removing existing granules

A simple and typical case is keeping a moving time window of data, e.g., the last month of satellite observations for a certain atmospheric gas. To support these cases we propose two interfaces that mimic the vector data sources, GranuleSource and GranuleStore, which allow access to and modification of the granules. A structured reader might be read-only, in which case it can only return a GranuleSource, or writable, in which case a GranuleStore is returned instead. FeatureSource/FeatureStore have not been used directly, in order to keep the work needed to implement a structured reader to a minimum (their interfaces are significantly larger).
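The moving time window case can be sketched as follows. This is a minimal, self-contained illustration rather than the proposed API: `Granule`, `InMemoryGranuleStore` and `pruneOlderThan` are hypothetical stand-ins, the file names are made up, and a plain `Predicate` takes the place of the filter a real GranuleStore would receive.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class MovingWindowSketch {

    /** Hypothetical stand-in for a granule: a file location plus an acquisition time. */
    public static final class Granule {
        final String location;
        final Instant time;

        public Granule(String location, Instant time) {
            this.location = location;
            this.time = time;
        }
    }

    /** In-memory stand-in for a writable granule store; a Predicate replaces the filter. */
    public static final class InMemoryGranuleStore {
        private final List<Granule> granules = new ArrayList<>();

        public void addGranules(List<Granule> toAdd) {
            granules.addAll(toAdd);
        }

        /** Removes the matching granules and returns how many were removed. */
        public int removeGranules(Predicate<Granule> filter) {
            int before = granules.size();
            granules.removeIf(filter);
            return before - granules.size();
        }

        public int getCount() {
            return granules.size();
        }
    }

    /** Drops every granule acquired more than windowDays before "now". */
    public static int pruneOlderThan(InMemoryGranuleStore store, Instant now, long windowDays) {
        Instant cutoff = now.minus(windowDays, ChronoUnit.DAYS);
        return store.removeGranules(g -> g.time.isBefore(cutoff));
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2013-06-30T00:00:00Z");
        InMemoryGranuleStore store = new InMemoryGranuleStore();
        List<Granule> batch = new ArrayList<>();
        batch.add(new Granule("no2_20130401.tif", Instant.parse("2013-04-01T00:00:00Z")));
        batch.add(new Granule("no2_20130615.tif", Instant.parse("2013-06-15T00:00:00Z")));
        batch.add(new Granule("no2_20130629.tif", Instant.parse("2013-06-29T00:00:00Z")));
        store.addGranules(batch);

        // Keep a 30 day window: only the April granule falls outside of it.
        int removed = pruneOlderThan(store, now, 30);
        System.out.println("removed=" + removed + " remaining=" + store.getCount());
    }
}
```

A periodic job would add the newest granules and then call the pruning step, so that the store always holds roughly the last month of data.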

To allow each reader to acquire extra data the way it prefers, a "harvest" operation has been added. It works against the file system and lets each reader add data to its backend storage in its preferred way: a file-oriented tool like ImageMosaic will just add references to the harvested files internally, while a database-oriented tool like PostGIS raster can instead read the files and copy them into its internal storage.

Implementors of the harvest operation will have to consider the case of harvesting from another structured data source. For example, an image mosaic could be built from NetCDF files, which in turn have their own internal structure; the harvest has to take that into account and possibly build new granules in each of the respective existing coverages. For instance, a mosaic could be made of NetCDF files each having three variables, NO2, O3 and BrO: each file contains granules for the three gases at different batches of times and elevations, and the mosaic exposes the same three coverages while hiding the fact that the bits and pieces are split among various source NetCDF files.
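The NO2/O3/BrO scenario can be sketched with plain collections. Everything here is an illustrative assumption: `HarvestRoutingSketch` and `SourceFile` are hypothetical names, the file names are made up, and routing each variable to the coverage of the same name is just one possible policy, not necessarily what ImageMosaic does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HarvestRoutingSketch {

    /** Hypothetical description of a source: its name and the variables found inside it. */
    public static final class SourceFile {
        final String name;
        final List<String> variables; // empty for unstructured sources, e.g. a plain image

        public SourceFile(String name, List<String> variables) {
            this.name = name;
            this.variables = variables;
        }
    }

    /** Coverage name to granule references; stands in for the reader's internal index. */
    private final Map<String, List<String>> coverages = new LinkedHashMap<>();

    /** Adds one granule per variable and returns the names of the coverages touched. */
    public List<String> harvest(String defaultTargetCoverage, SourceFile file) {
        List<String> targets = file.variables;
        if (targets.isEmpty()) {
            // Unstructured source: use the default target, or the first coverage if missing.
            String fallback = defaultTargetCoverage != null ? defaultTargetCoverage : firstCoverage();
            targets = Collections.singletonList(fallback);
        }
        for (String coverage : targets) {
            coverages.computeIfAbsent(coverage, k -> new ArrayList<>())
                    .add(file.name + "#" + coverage);
        }
        return targets;
    }

    private String firstCoverage() {
        if (coverages.isEmpty()) {
            throw new IllegalStateException("no coverage available as default target");
        }
        return coverages.keySet().iterator().next();
    }

    public int granuleCount(String coverage) {
        List<String> granules = coverages.get(coverage);
        return granules == null ? 0 : granules.size();
    }

    public static void main(String[] args) {
        HarvestRoutingSketch mosaic = new HarvestRoutingSketch();
        List<String> gases = Arrays.asList("NO2", "O3", "BrO");
        // Two structured files, each contributing one granule to each of the three coverages.
        mosaic.harvest(null, new SourceFile("batch1.nc", gases));
        mosaic.harvest(null, new SourceFile("batch2.nc", gases));
        // One unstructured file, sent to the explicitly named default coverage.
        mosaic.harvest("O3", new SourceFile("extra.tif", Collections.<String>emptyList()));
        for (String gas : gases) {
            System.out.println(gas + ": " + mosaic.granuleCount(gas) + " granule(s)");
        }
    }
}
```

The caller only ever sees the three coverages; which source file a given granule came from stays an internal detail of the index.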

Implementation

The proposal will come with two implementations:

  • ImageMosaic, improved to expose its internal structure. This one will be writable, allowing for harvesting and granule removal.
  • NetCDF (as a new unsupported module), as a read-only structured grid coverage that plays well with the ImageMosaic harvesting code, allowing multidimensional mosaics of NetCDF files to be constructed.

Status

This proposal is under construction.

Voting has not started yet.

Tasks

This section is used to make sure your proposal is complete (did you remember documentation?) and has enough paid or volunteer time lined up to be a success.

API Changes

StructuredGridCoverage2DReader

AFTER:

public interface StructuredGridCoverage2DReader extends GridCoverage2DReader {
    /**
     * Returns the granule source for the specified coverage
     * 
     * @param coverageName the name of the specified coverage (might be null, if there is only one supported coverage)
     * @param readOnly a boolean indicating whether read-only access is enough, or we may want to modify the GranuleSource
     * @return the requested {@link GranuleSource}
     * @throws IOException
     * @throws UnsupportedOperationException
     */
    GranuleSource getGranules(String coverageName, boolean readOnly) throws IOException, UnsupportedOperationException;
    /**
     * Returns whether this reader is read-only, that is, whether it cannot modify the granule source
     * @return true if the reader is read-only
     */
    boolean isReadOnly();
    /**
     * Creates a granule store for a new coverage with the given feature type
     */
    void createCoverage(String coverageName, SimpleFeatureType schema) throws
                                               IOException, UnsupportedOperationException;
    /**
     * Removes the granule store for the specified coverageName
     */
    boolean removeCoverage(String coverageName) throws IOException, UnsupportedOperationException;
    
    /**
     * Harvests the specified source into the reader. Depending on the implementation, the original source
     * is harvested in place (e.g., image mosaic), or might be copied into the reader persistent storage (e.g., database raster handling)
     *
     * @param defaultTargetCoverage Default target coverage, to be used in case the sources being harvested are not structured ones.
     *                        The parameter is optional, in case it's missing the reader will use the first coverage as the default target.
     *                        
     * @param source The source can be any kind of object, it's up to the reader implementation to understand and use it.
     *               Common source types could be a single file, or a folder.
     * @param hints Used to provide implementation specific hints on how to harvest the sources
     * @throws IOException
     * @throws UnsupportedOperationException
     */
    List<HarvestedSource> harvest(String defaultTargetCoverage, Object source, Hints hints) throws IOException,
            UnsupportedOperationException;

    /**
     * Describes the dimensions supported by the specified coverage, if any.
     * (coverageName might be null, if there is only one supported coverage)
     */
    List<DimensionDescriptor> getDimensionDescriptors(String coverageName) throws IOException;
 }
 
/**
 * Information about one of the sources that have been processed by
 * {@link StructuredGridCoverage2DReader#harvest(String, Object, org.geotools.factory.Hints)},
 * indicating whether the object was successfully ingested or not.
 * 
 * @author Andrea Aime - GeoSolutions
 * 
 */
public interface HarvestedSource {
    /**
     * The object that has been processed
     * 
     * @return
     */
    Object getSource();
    /**
     * If true, the file has been ingested and generated new granules in the reader, false otherwise
     * 
     * @return
     */
    boolean success();
    /**
     * In case the file was not ingested, provides a reason why it was skipped
     * 
     * @return
     */
    String getMessage();
}
/**
 * Describes a "dimension" exposed by a structured grid coverage reader. 
 */
public interface DimensionDescriptor {
    /**
     * The dimension name
     *
     * @return
     */
    String getName();
    /**
     * The dimension unit symbol
     *
     * @return
     */
    String getUnitSymbol();
    /**
     * The dimension units
     *
     * @return
     */
    String getUnits();
    /**
     * The start attribute
     *
     * @return
     */
    String getStartAttribute();
    /**
     * The end attribute (in case of dimensions with ranges)
     *
     * @return
     */
    String getEndAttribute();
}  
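As an illustration of how a caller might use the start and end attributes of a dimension, the sketch below selects the granules whose time range covers a requested instant. It does not depend on GeoTools: `Dimension` is a cut-down, hypothetical stand-in for DimensionDescriptor, granules are plain attribute maps, the attribute names `time_start` and `time_end` are invented, and treating a null end attribute as "no ranges" is an assumption.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DimensionSelectionSketch {

    /** Cut-down, hypothetical stand-in for the proposed DimensionDescriptor. */
    public static final class Dimension {
        final String name;
        final String startAttribute;
        final String endAttribute; // assumed to be null when the dimension has no ranges

        public Dimension(String name, String startAttribute, String endAttribute) {
            this.name = name;
            this.startAttribute = startAttribute;
            this.endAttribute = endAttribute;
        }
    }

    /** True when the granule's value (or range) for the dimension covers the requested instant. */
    public static boolean covers(Map<String, Instant> granule, Dimension dim, Instant requested) {
        Instant start = granule.get(dim.startAttribute);
        if (start == null) {
            return false;
        }
        if (dim.endAttribute == null) {
            return start.equals(requested); // single valued dimension
        }
        Instant end = granule.get(dim.endAttribute);
        if (end == null) {
            return false;
        }
        return !requested.isBefore(start) && !requested.isAfter(end);
    }

    /** Returns the granules covering the requested instant, in their original order. */
    public static List<Map<String, Instant>> select(
            List<Map<String, Instant>> granules, Dimension dim, Instant requested) {
        List<Map<String, Instant>> result = new ArrayList<>();
        for (Map<String, Instant> granule : granules) {
            if (covers(granule, dim, requested)) {
                result.add(granule);
            }
        }
        return result;
    }

    /** Builds a granule attribute map with the invented time_start/time_end attributes. */
    public static Map<String, Instant> granule(String start, String end) {
        Map<String, Instant> attributes = new HashMap<>();
        attributes.put("time_start", Instant.parse(start));
        attributes.put("time_end", Instant.parse(end));
        return attributes;
    }

    public static void main(String[] args) {
        Dimension time = new Dimension("time", "time_start", "time_end");
        List<Map<String, Instant>> granules = new ArrayList<>();
        granules.add(granule("2013-06-01T00:00:00Z", "2013-06-01T12:00:00Z"));
        granules.add(granule("2013-06-02T00:00:00Z", "2013-06-02T12:00:00Z"));
        Instant requested = Instant.parse("2013-06-01T06:00:00Z");
        System.out.println(select(granules, time, requested).size() + " granule(s) match");
    }
}
```

With a real reader the same attribute names would more likely be used to build a filter for the granule source, leaving the selection to the backing store.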

GranuleSource

AFTER:

public interface GranuleSource {
    /**
     * Retrieves granules, in the form of a {@code SimpleFeatureCollection}, based on a {@code Query}.
     * 
     * @param q the {@link Query} to select granules
     * @return the resulting granules.
     * @throws IOException
     */
    public SimpleFeatureCollection getGranules(Query q) throws IOException;
    /**
     * Gets the number of the granules that would be returned by the given {@code Query}, taking into account any settings for max features and start
     * index set on the {@code Query}.
     * 
     * @param q the {@link Query} to select granules
     * @return the number of granules
     * @throws IOException
     */
    public int getCount(Query q) throws IOException;
    /**
     * Get the spatial bounds of the granules that would be returned by the given {@code Query}.
     * 
     * @param q the {@link Query} to select granules
     * @return The bounding envelope of the requested data
     * @throws IOException
     */
    public ReferencedEnvelope getBounds(Query q) throws IOException;
    /**
     * Retrieves the schema (feature type) that will apply to granules retrieved from this {@code GranuleSource}.
     * 
     * @return
     * @throws IOException
     */
    public SimpleFeatureType getSchema() throws IOException;
    /**
     * This will free/release any resource (cached granules, ...).
     * 
     * @throws IOException
     */
    public void dispose() throws IOException;
} 
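The getCount contract, which takes the query's max features and start index into account, can be illustrated with a small paging sketch. `PageQuery` is a hypothetical, simplified stand-in for Query that carries only the paging settings, and the granules are an in-memory list; a real implementation would typically push paging down to its backing store.

```java
import java.util.ArrayList;
import java.util.List;

public class GranulePagingSketch {

    /** Hypothetical, simplified query: paging settings only, no filter. */
    public static final class PageQuery {
        final int startIndex;
        final int maxFeatures;

        public PageQuery(int startIndex, int maxFeatures) {
            this.startIndex = startIndex;
            this.maxFeatures = maxFeatures;
        }
    }

    /** Returns the page of granules selected by the query. */
    public static <T> List<T> getGranules(List<T> all, PageQuery q) {
        int from = Math.min(Math.max(q.startIndex, 0), all.size());
        // Use long arithmetic so that a maxFeatures of Integer.MAX_VALUE cannot overflow.
        int to = (int) Math.min((long) from + Math.max(q.maxFeatures, 0), all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    /** The count matches the size of the page that getGranules would return. */
    public static <T> int getCount(List<T> all, PageQuery q) {
        return getGranules(all, q).size();
    }

    public static void main(String[] args) {
        List<String> granules = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            granules.add("granule-" + i);
        }
        // Ask for up to 5 granules starting at index 8: only 2 are left.
        PageQuery lastPage = new PageQuery(8, 5);
        System.out.println("count=" + getCount(granules, lastPage));
        System.out.println("page=" + getGranules(granules, lastPage));
    }
}
```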

GranuleStore

AFTER:

public interface GranuleStore extends GranuleSource {
    /**
     * Add all the granules from the specified collection to this {@link GranuleStore}.
     * 
     * @param granules the granules to add
     */
    void addGranules(SimpleFeatureCollection granules);
    /**
     * Removes granules selected by the given filter.
     * 
     * @param filter an OpenGIS filter
     * 
     * @throws IOException if an error occurs modifying the data source
     */
    int removeGranules(Filter filter);
    /**
     * Modifies the attributes with the supplied values in all granules selected by the given filter.
     * 
     * @param attributeNames the attributes to modify
     * 
     * @param attributeValues the new values for the attributes
     * 
     * @param filter an OpenGIS filter
     * 
     * @throws IOException if the attribute and object arrays are not equal in length; if the value types do not match the attribute types; if
     *         modification is not supported; or if there are errors accessing the data source
     */
    void updateGranules(String[] attributeNames, Object[] attributeValues, Filter filter);
    /**
     * Gets the {@code Transaction} that this {@code GranuleStore} is currently operating against.
     * 
     * <pre>
     * <code>
     * Transaction t = GranuleStore.getTransaction();
     * try {
     *     GranuleStore.addGranules (granules);
     *     t.commit();
     * } catch( IOException erp ){
     *     // something went wrong;
     *     t.rollback();
     * }
     * </code>
     * </pre>
     * 
     * @return Transaction in use, or {@linkplain Transaction#AUTO_COMMIT}
     */
    Transaction getTransaction();
    /**
     * Provide a transaction for commit/rollback control of a modifying operation on this {@code GranuleStore}.
     * 
     * <pre>
     * <code>
     * Transaction t = new DefaultTransaction();
     * GranuleStore.setTransaction(t);
     * try {
     *     GranuleStore.addGranules (granules);
     *     t.commit();
     * } catch ( IOException ex ) {
     *     // something went wrong;
     *     t.rollback();
     * } finally {
     *     t.close();
     * }
     * </code>
     * </pre>
     * 
     * @param transaction the transaction
     */
    void setTransaction(Transaction transaction);
}
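The commit/rollback pattern described in the Javadoc above can be sketched without GeoTools. `StagedGranuleStore` and `addAtomically` are hypothetical: the stand-in keeps staged additions in a pending list, which is only one way a store could honour a transaction, and it signals failure with an `IllegalArgumentException` instead of the `IOException` used in the Javadoc examples.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GranuleTransactionSketch {

    /** Hypothetical store: additions are staged until commit() and discarded by rollback(). */
    public static final class StagedGranuleStore {
        private final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        public void addGranules(List<String> granules) {
            for (String granule : granules) {
                if (granule == null || granule.isEmpty()) {
                    throw new IllegalArgumentException("invalid granule reference");
                }
                pending.add(granule);
            }
        }

        public void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        public void rollback() {
            pending.clear();
        }

        /** Only committed granules are visible to readers. */
        public List<String> getGranules() {
            return new ArrayList<>(committed);
        }
    }

    /** Adds a batch atomically: either every granule becomes visible or none does. */
    public static boolean addAtomically(StagedGranuleStore store, List<String> batch) {
        try {
            store.addGranules(batch);
            store.commit();
            return true;
        } catch (IllegalArgumentException e) {
            // Something went wrong: drop whatever part of the batch was already staged.
            store.rollback();
            return false;
        }
    }

    public static void main(String[] args) {
        StagedGranuleStore store = new StagedGranuleStore();
        boolean first = addAtomically(store, Arrays.asList("a.tif", "b.tif"));
        boolean second = addAtomically(store, Arrays.asList("c.tif", ""));
        System.out.println(first + " " + second + " " + store.getGranules());
    }
}
```

Note that the calls are made on a store instance; the Javadoc examples above write `GranuleStore.addGranules(...)` as shorthand for the same thing.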

Documentation Changes

The interfaces are purely additive, so no existing documentation needs to be changed.

Some examples of dealing with a structured grid coverage reader could be extracted from the mosaic/netcdf reader unit tests.
