GSIP 123

Jody Garnett edited this page Jul 12, 2017 · 1 revision

GSIP 123 - WPS input and execution limits

Overview

Proposed By

Andrea Aime

Assigned to Release

This proposal is for GeoServer 2.7-beta.

State

  • Under Discussion
  • In Progress
  • Completed #874
  • Rejected
  • Deferred

Motivation

Allow the administrator to set up resource consumption limits for the WPS protocol.

Proposal

The existing core protocols (WMS, WFS, and WCS) all offer some form of resource control: they deny execution to requests that would require too many resources or, when it is hard to predict how long a request will run, kill those that are CPU bound and have taken too long to execute.

We want to provide the same facilities for WPS, on a global and per process basis.

Execution time limits

A request can invoke one or more processes (chaining) to achieve a certain result. Just like in WMS, we want to stop requests from running on when the client is unlikely to still be waiting for the response and, more generally, to avoid excessive CPU usage.

This calls for two separate limits: a synchronous one, tied to the typical lifespan of an HTTP connection, and an asynchronous one, which can be much longer.

These limits will be specified at the WPS level and enforced by a mechanism similar to the Dismiss support. As with Dismiss, some processes may be resilient to our attempts to stop them (e.g., processes that do not support progress reporting).
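
As a rough illustration of the two limits, the following Java sketch picks the synchronous or asynchronous limit for a request and checks it whenever a process reports progress. The class and method names (`ExecutionDeadline`, `forRequest`, `onProgress`) are hypothetical and not part of GeoServer, and treating a zero or negative limit as "no limit" is an assumption.

```java
/** Hypothetical helper: tracks the execution deadline of a single WPS request. */
final class ExecutionDeadline {
    private final long startMillis;
    private final long maxMillis; // zero or negative means "no limit" (assumption)

    ExecutionDeadline(long startMillis, long maxSeconds) {
        this.startMillis = startMillis;
        this.maxMillis = maxSeconds * 1000L;
    }

    /** Picks the limit that applies to the request, depending on its execution mode. */
    static ExecutionDeadline forRequest(
            boolean asynchronous, long syncLimitSeconds, long asyncLimitSeconds, long startMillis) {
        long limit = asynchronous ? asyncLimitSeconds : syncLimitSeconds;
        return new ExecutionDeadline(startMillis, limit);
    }

    boolean isExpired(long nowMillis) {
        return maxMillis > 0 && nowMillis - startMillis > maxMillis;
    }

    /**
     * Meant to be called from a progress callback. A process that never
     * reports progress never reaches this check, so it cannot be stopped this way.
     */
    void onProgress(long nowMillis) {
        if (isExpired(nowMillis)) {
            throw new IllegalStateException("WPS execution time limit exceeded");
        }
    }
}
```

Because the check only runs when progress is reported, the sketch also shows why processes without progress reporting can outlive their limit.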

Complex input size limits

The WPS specification allows a process to advertise a limit on the size of each complex input it supports via a single attribute:

<complexType name="SupportedComplexDataInputType">
	<complexContent>
		<extension base="wps:SupportedComplexDataType">
			<annotation>
				<documentation> </documentation>
			</annotation>
			<attribute name="maximumMegabytes" type="integer" use="optional">
				<annotation>
					<documentation>The maximum file size, in megabytes, of this input.  If the input exceeds this size, the server will return an error instead of processing the inputs. </documentation>
				</annotation>
			</attribute>
		</extension>
	</complexContent>
</complexType>

This limit applies to all MIME types associated with the input, regardless of how compact or verbose the encoding is. The size of a complex input is normally related to how much time it will take to retrieve and process it, and sometimes also to the memory footprint of the execution.

The work in this proposal will allow the administrator to set up two limits:

  • A global limit at the WPS level for complex inputs
  • A detailed limit per process and input, which can override the global limit
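
The override rule could be sketched in Java as follows. This is a minimal illustration, not the proposed implementation: the `InputSizeLimits` class, its methods, and the convention that a zero or negative limit means "unlimited" are all assumptions, and the process and input names in the usage note are placeholders.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the global and per-input complex input size limits. */
final class InputSizeLimits {
    private final int globalMaxMb; // zero or negative means "unlimited" (assumption)
    private final Map<String, Integer> perInputMaxMb = new HashMap<>();

    InputSizeLimits(int globalMaxMb) {
        this.globalMaxMb = globalMaxMb;
    }

    /** Registers a detailed limit for one input of one process. */
    void setLimit(String processName, String inputName, int maxMb) {
        perInputMaxMb.put(key(processName, inputName), maxMb);
    }

    /** Returns the per-input limit if one is set, the global limit otherwise. */
    int effectiveLimitMb(String processName, String inputName) {
        Integer specific = perInputMaxMb.get(key(processName, inputName));
        return specific != null ? specific : globalMaxMb;
    }

    /** Tells whether an input of the given size passes the effective limit. */
    boolean accepts(String processName, String inputName, long sizeBytes) {
        int limitMb = effectiveLimitMb(processName, inputName);
        return limitMb <= 0 || sizeBytes <= limitMb * 1024L * 1024L;
    }

    private static String key(String processName, String inputName) {
        return processName + "/" + inputName;
    }
}
```

For example, with a global limit of 10 MB and a 50 MB override for the `features` input of a process named `gs:Clip`, a 20 MB input would be accepted for that input and rejected for any other.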

Given the GeoServer WPS architecture, it is easy to determine the size of inputs coming from remote servers and the local file system, harder to compute the size of inputs provided inline, and harder still to handle inputs coming from process chaining and internal references to local layers.
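
For inputs that arrive as a byte stream, one possible way to enforce a limit is to count bytes as they are read and fail once the limit is passed. The proposal does not prescribe this technique; the `MaxSizeInputStream` class below is a hypothetical sketch built only on `java.io`.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical wrapper that fails once more than maxBytes have been read. */
final class MaxSizeInputStream extends FilterInputStream {
    private final long maxBytes;
    private long count;

    MaxSizeInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int n = super.read(buffer, offset, length);
        if (n > 0) {
            count(n);
        }
        return n;
    }

    // Accumulates the bytes read so far and fails when the limit is exceeded.
    private void count(long n) throws IOException {
        count += n;
        if (count > maxBytes) {
            throw new IOException("Input exceeds the maximum size of " + maxBytes + " bytes");
        }
    }
}
```

This stops the transfer early instead of downloading an oversized input in full. Bytes skipped with `skip()` are not counted in this sketch.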

Inline inputs may be parsed during XML parsing via parser delegates, in which case we get a Java object straight out of the Execute request parse. In these cases we'll estimate the size of the inputs just as if they were local layers or chained process outputs.

For local layers and chained process outputs, we cannot get a size unless we write the collection/raster to persistent storage in some format, which is not desirable. Instead, for these objects we'll introduce an ObjectSizeEstimator interface. It will have an easy job for rasters (we'll compute the uncompressed size) and a harder time for feature collections, where we'll have to work from assumptions (e.g., all geometries are made of 100 points, strings are 256 chars long, and so on) to estimate a size. For collections that do not provide a reasonable response to size() (e.g., -1), the estimation will just return zero. Other complex objects traded between processes will require their own size estimators. If the WPS cannot find one, it will assume the size of the object is zero (i.e., on the assumption that nobody considered it large enough to warrant an estimator).
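
A minimal Java sketch of this estimation approach is shown below. Only the `ObjectSizeEstimator` name, the 100 points per geometry, and the 256 chars per string come from the proposal. The interface signature, the `RasterSummary` and `FeatureCollectionSummary` stand-ins (used here in place of real GeoTools types), and the per-point, per-char, and per-attribute byte counts are assumptions made for illustration.

```java
/** Sketch of the estimator contract; the signature is an assumption. */
interface ObjectSizeEstimator<T> {
    long estimateBytes(T object);
}

/** Hypothetical stand-in for a raster: just the numbers needed for the estimate. */
final class RasterSummary {
    final int width, height, bands, bitsPerSample;

    RasterSummary(int width, int height, int bands, int bitsPerSample) {
        this.width = width;
        this.height = height;
        this.bands = bands;
        this.bitsPerSample = bitsPerSample;
    }
}

/** Uncompressed size, assuming each sample occupies a whole number of bytes. */
final class RasterSizeEstimator implements ObjectSizeEstimator<RasterSummary> {
    @Override
    public long estimateBytes(RasterSummary r) {
        long bytesPerSample = (r.bitsPerSample + 7) / 8;
        return (long) r.width * r.height * r.bands * bytesPerSample;
    }
}

/** Hypothetical stand-in for a feature collection and its schema. */
final class FeatureCollectionSummary {
    final int size, geometryAttributes, stringAttributes, otherAttributes;

    FeatureCollectionSummary(int size, int geometryAttributes, int stringAttributes, int otherAttributes) {
        this.size = size;
        this.geometryAttributes = geometryAttributes;
        this.stringAttributes = stringAttributes;
        this.otherAttributes = otherAttributes;
    }
}

final class FeatureCollectionSizeEstimator implements ObjectSizeEstimator<FeatureCollectionSummary> {
    static final int POINTS_PER_GEOMETRY = 100; // from the proposal
    static final int CHARS_PER_STRING = 256;    // from the proposal
    static final int BYTES_PER_POINT = 16;      // assumption: two doubles per point
    static final int BYTES_PER_CHAR = 2;        // assumption: UTF-16 chars
    static final int BYTES_PER_OTHER = 8;       // assumption: numbers, dates and the like

    @Override
    public long estimateBytes(FeatureCollectionSummary fc) {
        if (fc.size < 0) {
            return 0; // size() gave no usable answer, so the estimate is zero
        }
        long perFeature = (long) fc.geometryAttributes * POINTS_PER_GEOMETRY * BYTES_PER_POINT
                + (long) fc.stringAttributes * CHARS_PER_STRING * BYTES_PER_CHAR
                + (long) fc.otherAttributes * BYTES_PER_OTHER;
        return perFeature * fc.size;
    }
}
```

Under these assumptions, a 1000 x 1000 raster with three 8-bit bands is estimated at 3,000,000 bytes, while the feature collection estimate depends only on the feature count and schema, not on the actual data.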

To advertise the limits, a custom ProcessFilter will bake the size limits into the metadata of the corresponding process Parameter object, and DescribeProcess will take that information into account. A new Parameter metadata key named "MaxSizeMB" will be used for this purpose.

Literal input limits

Literal inputs do not normally present data transfer issues per se (besides their actual length, which is normally limited by the server POST size limit), but they can be a source of trouble if their value is directly related to the computational effort: some inputs control the number of iterations in an algorithm that works by successive approximations, or the number of points used to represent a quadrant in a loop, and so on.

In these cases it's best to allow specifying a range of possible values.

The WPS specification helps here: it allows a range of valid values to be specified in the DescribeProcess output:

<complexType name="LiteralInputType">
	<annotation>
		<documentation>Description of a process input that consists of a simple literal value (e.g., "2.1"). (Informative: This type is a subset of the ows:UnNamedDomainType defined in owsDomaintype.xsd.) </documentation>
	</annotation>
	<complexContent>
		<extension base="wps:LiteralOutputType">
			<sequence>
				<group ref="wps:LiteralValuesChoice">
					<annotation>
						<documentation>Identifies the type of this literal input and provides supporting information.  For literal values with a defined Unit of Measure, the contents of these sub-elements shall be understood to be consistent with the default Unit of Measure.</documentation>
					</annotation>
				</group>
				<element name="DefaultValue" type="string" minOccurs="0">
					<annotation>
						<documentation>Optional default value for this quantity, which should be included when this quantity has a default value.  The DefaultValue shall be understood to be consistent with the unit of measure selected in the Execute request. </documentation>
					</annotation>
				</element>
			</sequence>
		</extension>
	</complexContent>
</complexType>
<!-- ========================================================== -->
<group name="LiteralValuesChoice">
	<annotation>
		<documentation>Identifies the type of this literal input and provides supporting information. </documentation>
	</annotation>
	<choice>
		<element ref="ows:AllowedValues">
			<annotation>
				<documentation>Indicates that there are a finite set of values and ranges allowed for this input, and contains list of all the valid values and/or ranges of values. Notice that these values and ranges can be displayed to a human client. </documentation>
			</annotation>
		</element>
		<element ref="ows:AnyValue">
			<annotation>
				<documentation>Indicates that any value is allowed for this input. This element shall be included when there are no restrictions, except for data type, on the allowable value of this input. </documentation>
			</annotation>
		</element>
		<element name="ValuesReference" type="wps:ValuesReferenceType">
			<annotation>
				<documentation>Indicates that there are a finite set of values and ranges allowed for this input, which are specified in the referenced list. </documentation>
			</annotation>
		</element>
	</choice>
</group>

These limits are necessarily set up per process input. They will be baked into the process input Parameter using the existing MIN and MAX metadata keys, advertised in DescribeProcess as a result, and enforced in the InputProvider classes.
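
The enforcement step might look roughly like the Java sketch below, which checks a numeric literal against optional minimum and maximum entries in a metadata map. The `LiteralRangeCheck` class is hypothetical, and the `"min"` and `"max"` key strings are placeholders standing in for the existing MIN and MAX Parameter metadata keys.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical range check for numeric literal inputs. */
final class LiteralRangeCheck {
    static final String MIN = "min"; // placeholder for the existing MIN metadata key
    static final String MAX = "max"; // placeholder for the existing MAX metadata key

    /** Builds a metadata map; either bound may be null, meaning "unbounded". */
    static Map<String, Object> range(Number min, Number max) {
        Map<String, Object> metadata = new HashMap<>();
        if (min != null) {
            metadata.put(MIN, min);
        }
        if (max != null) {
            metadata.put(MAX, max);
        }
        return metadata;
    }

    /** Returns true when the value respects whichever bounds are present. */
    static boolean isInRange(double value, Map<String, Object> metadata) {
        Object min = metadata.get(MIN);
        Object max = metadata.get(MAX);
        if (min instanceof Number && value < ((Number) min).doubleValue()) {
            return false;
        }
        if (max instanceof Number && value > ((Number) max).doubleValue()) {
            return false;
        }
        return true;
    }

    /** Rejects the request before any computation starts. */
    static void validate(String inputName, double value, Map<String, Object> metadata) {
        if (!isInRange(value, metadata)) {
            throw new IllegalArgumentException(
                    "Value " + value + " of input '" + inputName + "' is outside the allowed range");
        }
    }
}
```

For an input that controls an iteration count, a range of 1 to 100 would let 50 through and reject 1000 before the process runs.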

User interface

The limits are traditionally set up in the service admin pages. However, process limits are well suited to integration with the process security subsystem, which moved the existing process enabled/disabled table into its own "WPS security" page under the security section in order to allow editing the roles that can access each process.

So the global limits will be edited there, and a new page will be made available for editing the limits associated with each process input. It will provide a simple table listing the inputs, allowing a size limit to be set for complex ones and a range limit for simple numerical ones.

Feedback

Backwards Compatibility

Voting

Project Steering Committee:

  • Alessio Fabiani +1
  • Andrea Aime +1
  • Ben Caradoc-Davies
  • Christian Mueller
  • Gabriel Roldán
  • Jody Garnett +1
  • Jukka Rahkonen +1
  • Justin Deoliveira
  • Phil Scadden +1
  • Simone Giannecchini +1

Links
