Processing Orders and prosEO Production Planning Strategies - dlr-eoc/prosEO GitHub Wiki
This chapter describes the way to handle processing orders and their job steps and the strategies used in each step.
Direct actions to manipulate a processing order
The commands follow the production order menu of the graphical user interface (GUI). The current user must have the appropriate UserRole to execute a command. The permitted sequence of commands (order states) is defined in Order State.
Approve
After a processing order has been created (see Creating Orders Automatically and Creating Orders Manually), it should be reviewed and, if no changes are required, approved. The new order state is APPROVED.
Plan
Planning a production order builds the jobs and job steps which will be processed on the selected processing facility (see Release). Planning runs in a new background thread, which first sets the order state to PLANNING. Then the expected jobs are created, i.e. all possible jobs, regardless of whether their output products already exist. A job represents the product(s) for a time interval (start and end time). The time intervals are defined by the slicing type and the corresponding settings:
- ORBIT: Create jobs by orbit. Preferably a list of orbits is given for the order; if no such list exists, jobs are generated orbit-wise so that the time interval is fully covered, i.e. with the first orbit starting not later than the beginning of the time interval and the last orbit ending not earlier than the end of the time interval. Jobs are linked to their respective orbits.
- CALENDAR_DAY: Create jobs by calendar day (in such a way that the first job starts not later than the beginning of the order time interval and the last job ends not earlier than the end of the time interval).
- CALENDAR_MONTH: Same as CALENDAR_DAY, but for calendar months.
- CALENDAR_YEAR: Same as CALENDAR_DAY, but for calendar years.
- TIME_SLICE: Create jobs in fixed time slices, starting with the start time of the order time interval and ending not earlier than the end of the time interval.
- NONE: Do not attempt to create slices, but create a single job spanning exactly the time interval from startTime to stopTime.
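The CALENDAR_DAY and TIME_SLICE rules above can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of prosEO; the sketch only shows how the job intervals cover the order time interval under the rules stated in the list.

```python
from datetime import datetime, timedelta


def calendar_day_slices(start, stop):
    """Whole calendar days covering [start, stop] (CALENDAR_DAY rule)."""
    day = timedelta(days=1)
    # The first job starts at midnight on or before the order start time.
    current = start.replace(hour=0, minute=0, second=0, microsecond=0)
    slices = []
    while current < stop:
        slices.append((current, current + day))
        current += day
    return slices


def time_slices(start, stop, duration):
    """Fixed-length slices beginning exactly at the order start (TIME_SLICE rule)."""
    current = start
    slices = []
    while current < stop:
        # The last slice may end later than the order stop time.
        slices.append((current, current + duration))
        current += duration
    return slices
```

For an order running from 2024-01-01 22:00 to 2024-01-03 02:00, the first function yields three daily jobs, and the second yields three jobs with a 12-hour slice duration, the last of which ends after the order stop time.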
Next, the job steps of each job are created:
- Create one job step for each requested product class and additional job steps for missing but producible input products. This backward search for producible products stops at any product class contained in the stop list of input product classes. Such products are expected to exist already in the product repository.
- If an output product already exists, the job step is not created.
- Set all necessary attributes of the job step.
- Find the applicable configured processor to create the output product.
- Finally set the state to PLANNED.
If a job has no job steps because all output products already exist, it is removed from the production order. Otherwise its state is set to PLANNED. Finally, the order state is set to PLANNED and the planning thread finishes.
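The job step creation described above can be sketched as a depth-first backward search. This is a simplified illustration, not the planner's actual code: `inputs_of` is a hypothetical mapping from a producible product class to its input product classes, and classes missing from it are treated as non-producible.

```python
def plan_job_steps(requested, inputs_of, stop_list, existing):
    """Return the product classes that need a job step, inputs first."""
    planned = []
    visited = set()

    def visit(product_class, is_input):
        if product_class in visited:
            return
        visited.add(product_class)
        if product_class in existing:
            return  # output product already exists: no job step
        if is_input and product_class in stop_list:
            return  # expected to exist in the product repository
        if product_class not in inputs_of:
            return  # non-producible (e.g. L0 or auxiliary data)
        for input_class in inputs_of[product_class]:
            visit(input_class, True)
        planned.append(product_class)

    for product_class in requested:
        visit(product_class, False)
    return planned
```

With `inputs_of = {"L2": ["L1B", "AUX"], "L1B": ["L0"]}`, a request for `L2` plans job steps for `L1B` and `L2`; putting `L1B` on the stop list leaves only the `L2` job step.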
Find the applicable configured processor to create an output product
Products are divided into two types, producible and non-producible. Non-producible products are normally L0 products and most auxiliary products. Their product class does not reference a processor class, and they have to exist in the product repository. All other products are generated by the processor class referenced by their product class. Additionally, the selection rules used to select input products can be specific to configured processors. This is sufficient to find a configured processor for product generation.
Secondly, it is possible to define requested configured processors in a processing order. This guarantees the use of a specific processor version and configuration.
The configured processor to use is determined in the following steps:
- Look up the processor class in the product class of the output product and build the list of candidate configured processors (those referencing this processor class).
- If the order defines requested configured processors, reduce the possible selection list to them. This is used to request a specific version of a processor (e.g. to reproduce a product with this version).
- If the order defines a processing mode, find the configured processors defining this mode. Otherwise search for configured processors without using the processing mode.
- Finally, select the "newest" configured processor from the resulting list. The comparison to find the newest one is based on the version of the configurations.
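The selection steps above amount to successive filters followed by a maximum over the configuration version. The following sketch assumes a simplified `ConfiguredProcessor` record with hypothetical field names and a version stored as a comparable tuple; the real data model and version comparison may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConfiguredProcessor:
    identifier: str
    processor_class: str
    configuration_version: tuple  # assumed comparable, e.g. (1, 2)
    mode: Optional[str] = None


def select_configured_processor(candidates, processor_class,
                                requested_ids=(), mode=None):
    """Apply the selection steps in order; return None if nothing fits."""
    # Step 1: configured processors referencing the processor class.
    selection = [c for c in candidates if c.processor_class == processor_class]
    # Step 2: restrict to the configured processors requested by the order.
    if requested_ids:
        selection = [c for c in selection if c.identifier in requested_ids]
    # Step 3: honour the processing mode only if the order defines one.
    if mode is not None:
        selection = [c for c in selection if c.mode == mode]
    if not selection:
        return None
    # Step 4: the "newest" one by configuration version.
    return max(selection, key=lambda c: c.configuration_version)
```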
Release
To release a planned order, the processing facility (where the job steps should run) has to be selected first. Then the release thread is started and the order state is set to RELEASING.
This thread performs the following steps for each job step:
- Check the input product queries (selection rules) of the product class produced by the job step (see Configure Product Classes and Selection Rules).
- If all mandatory products are available the job step state is set to READY.
- If the order was created and marked as on-demand, missing input products can be requested and downloaded from an LTA (Long Term Archive). Then the availability check of the previous item is repeated.
- Otherwise the job step state is set to WAITING_INPUT.
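The input check during release can be pictured as follows. This is a sketch under simplifying assumptions: products are represented by plain names, and `fetch_from_lta` is a hypothetical callback that returns the products it managed to download.

```python
def check_job_step(mandatory_inputs, available, on_demand=False,
                   fetch_from_lta=None):
    """Return the job step state after checking its mandatory inputs."""
    available = set(available)
    missing = set(mandatory_inputs) - available
    if missing and on_demand and fetch_from_lta is not None:
        # Request the missing products from the LTA, then check again.
        available |= set(fetch_from_lta(missing))
        missing = set(mandatory_inputs) - available
    return "READY" if not missing else "WAITING_INPUT"
```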
Start processing of ready job steps
- Job steps are started in order of priority (descending), then sensing start time (ascending).
- Check first whether the current time is after the execution time set in the order.
- Then check whether a free worker node exists on the processing facility.
- Start the job step if both checks succeed:
- Create the job order file (JOF) containing the parameter definitions needed to generate the output product. These are at least the processor with its configuration file, the sensing start and stop times, the input products and the output product (for more information see the IPF specification). Then send the job order file to the storage manager for use in the processor wrapper.
- Try to create and start the job step on the processing facility.
- Take processor-specific requirements such as the number of CPUs and the amount of memory into account. If there is no worker node with enough resources, the job step remains in state READY.
- If there are not enough worker nodes on the processing facility to start all job steps in state READY, or if job steps are in state WAITING_INPUT, they are checked again in The Dispatcher Cycle.
- Otherwise create and start the job step on the processing facility.
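The start order and the resource check can be sketched as shown here. The `JobStep` and `WorkerNode` records, their field names and the job step state RUNNING are assumptions made for illustration; creation of the job order file and the actual call to the processing facility are omitted.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JobStep:
    name: str
    priority: int
    sensing_start: datetime
    cpus: int
    memory_gb: int
    state: str = "READY"


@dataclass
class WorkerNode:
    name: str
    free_cpus: int
    free_memory_gb: int


def start_ready_job_steps(job_steps, nodes, now, execution_time=None):
    """Start READY job steps; return (job step, node) name pairs started."""
    started = []
    if execution_time is not None and now < execution_time:
        return started  # execution time of the order not yet reached
    ready = [s for s in job_steps if s.state == "READY"]
    # Highest priority first, then earliest sensing start time.
    ready.sort(key=lambda s: (-s.priority, s.sensing_start))
    for step in ready:
        node = next((n for n in nodes
                     if n.free_cpus >= step.cpus
                     and n.free_memory_gb >= step.memory_gb), None)
        if node is None:
            continue  # stays READY until a later dispatcher cycle
        node.free_cpus -= step.cpus
        node.free_memory_gb -= step.memory_gb
        step.state = "RUNNING"  # assumed state name
        started.append((step.name, node.name))
    return started
```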
Suspend
The processing of a production order shall be stopped smoothly.
- No further job steps are started, and running job steps are allowed to finish. The order state is set to SUSPENDING.
- After all running job steps have finished, the order state is set to PLANNED.
Suspend forced
The processing of a production order shall be stopped immediately.
- No further job steps are started and running job steps are killed. The order state is set to PLANNED.
Resume
Resume handles the processing order in the same way as Release.
Retry
Check the job steps in state FAILED. If all expected output product files exist, the job step state is set to COMPLETED. Otherwise the state is set to PLANNED so that the order can be resumed (see Resume).
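The retry rule is a simple decision per job step. In this sketch, product files are represented by plain names and the function name is hypothetical.

```python
def retry_job_step(state, expected_files, existing_files):
    """Return the new job step state after a retry."""
    if state != "FAILED":
        return state  # only FAILED job steps are affected
    if set(expected_files) <= set(existing_files):
        return "COMPLETED"  # all expected output files exist
    return "PLANNED"  # can be processed again via Resume
```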
Reset
Reset a PLANNED production order to state INITIAL. All jobs and job steps are removed, including job steps that are already COMPLETED.
Cancel
Set the state of a PLANNED production order to FAILED.
Close
Remove the input product dependencies of the job steps of a COMPLETED or FAILED production order and set the state to CLOSED.
Edit
Open a production order in state INITIAL for editing.
Copy
Copy the production order and open the editor. The new production order is in state INITIAL and the suffix "_Copy" is appended to the name.
Delete
Delete a production order in state CLOSED. The created products remain in the system.
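The state changes triggered by the commands in this section can be summarised as a transition table. The sketch below covers only commands whose source state is named or strongly implied in the text; the source states for approve and plan are assumptions, and the authoritative definition remains Order State.

```python
# (command, current order state) -> new order state
TRANSITIONS = {
    ("approve", "INITIAL"): "APPROVED",  # source state assumed
    ("plan", "APPROVED"): "PLANNING",    # source state assumed
    ("release", "PLANNED"): "RELEASING",
    ("reset", "PLANNED"): "INITIAL",
    ("cancel", "PLANNED"): "FAILED",
    ("close", "COMPLETED"): "CLOSED",
    ("close", "FAILED"): "CLOSED",
}


def apply_command(command, state):
    """Return the new order state or raise ValueError if not permitted."""
    try:
        return TRANSITIONS[(command, state)]
    except KeyError:
        raise ValueError(
            f"command {command!r} not allowed in state {state!r}") from None
```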
The Dispatcher Cycle
The role of the dispatcher cycle is:
- Look for job steps in state WAITING_INPUT and check whether the mandatory input products now exist.
- If they do, continue as described in Release.
- Otherwise keep the state WAITING_INPUT and try again in the next cycle.
- Start job steps in state READY like described in Release.
The wait time between cycle execution is defined in the configuration (application.yml) of the production planner as parameter "dispatcherwaittime".
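One pass of the dispatcher cycle and the waiting loop around it might look like the following sketch. Job steps are plain dictionaries here, `get_available` is a hypothetical callback returning the currently available products, and the unit of the wait time is an assumption (seconds), since the text does not state it.

```python
import time


def run_cycle(job_steps, available_products):
    """Promote WAITING_INPUT job steps whose inputs exist; list READY ones."""
    available = set(available_products)
    for step in job_steps:
        if step["state"] == "WAITING_INPUT" and set(step["inputs"]) <= available:
            step["state"] = "READY"
    return [s["name"] for s in job_steps if s["state"] == "READY"]


def run_dispatcher(job_steps, get_available, wait_seconds, cycles,
                   sleep=time.sleep):
    """Run a fixed number of cycles, waiting between them."""
    for _ in range(cycles):
        run_cycle(job_steps, get_available())
        sleep(wait_seconds)  # corresponds to "dispatcherwaittime"
```

The READY job steps returned by `run_cycle` would then be started as described in Release.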
Start of Production Planner
When the production planner starts, the following actions are executed:
- Connect to the processing facility. Then synchronize the job steps running on the processing facility:
- Look for job steps which are actually running and create the corresponding objects for the planner.
- Look for job steps in facility state "failed" or "failure". Finish these job steps as described in Finish Job Step.
- Restart Suspend for production orders in state SUSPENDING.
- Restart Release for production orders in state RELEASING.
- Restart Plan for production orders in state PLANNING.
Finish Job Step
After a processor has finished on the worker node, the wrapper calls the finish action of the planner job step with the result (success or failed).
- Get and store the log of the job step.
- Set the job step state to COMPLETED or FAILED.
- Remove the job step from the processing facility.
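The three finishing actions can be sketched as a single function. The dictionary layout, the `fetch_log` callback and the `facility_jobs` set are hypothetical stand-ins for the planner's job step object and the processing facility.

```python
def finish_job_step(step, success, fetch_log, facility_jobs):
    """Store the log, set the final state, remove the step from the facility."""
    step["log"] = fetch_log(step["name"])
    step["state"] = "COMPLETED" if success else "FAILED"
    facility_jobs.discard(step["name"])
    return step
```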