Spring Cloud Tasks - OpenData-tu/documentation GitHub Wiki

Authorship

Version Date Modified by Summary of changes
2017-05-31 Andres Ardila Working draft

Spring Cloud Task is designed to provide a few things:

  1. Similar abilities to Spring Batch with regards to state management without needing to create a batch job. When running a Boot application on a cloud environment, there is no standard way of getting the results from environment to environment (YARN handles job results differently from tasks on Cloud Foundry which is different from jobs on Kubernetes, etc). Spring Batch provides this but now all short lived processes need the overhead of the Batch API so Spring Cloud Task provides a lighter touch to those use cases.
  2. Automatically adds informational listeners. With Spring XD, when you ran a job in an XD container, the XD container automatically added a number of informational listeners that broadcast events that you could listen for. Spring Cloud Task brings the same functionality without the need for the XD container.
  3. Integration with Spring Cloud Stream. Spring Cloud Task provides the ability to launch tasks from messages received from Spring Cloud Stream. Also, the informational messages previously mentioned (both Batch events as well as Task events) are sent via Spring Cloud Stream channels. The DeployerPartitionHandler. When working in a cloud environment, this PartitionHandler implementation allows you to launch workers for a partitioned batch job as tasks. This allows for the dynamic scaling of partitioned batch jobs instead of the traditional option of pre-deploying workers that listen for work which wastes resources in a modern cloud environment.

For the best option for a modern batch platform, you really need to look into some from of platform first and that begins at the Cloud Foundry/Kubernetes/Mesos/YARN layer. Without that, you end up building a large part of the infrastructure yourself. That is why Spring XD evolved into Spring Cloud Data Flow. The added complexity that lived in the containers of Spring XD is removed by requiring a modern platform to run on (since they all handle those guarantees themselves). Without that piece, you're going to spend a lot of time managing the deployment and orchestration of applications that most modern platforms handle for you.

From there, the choice becomes pretty easy IMHO with Spring Cloud Task for simple tasks, Spring Batch for batch jobs, and Spring Cloud Data Flow for orchestration. [1]

Tutorials, Walk-throughs, etc.

Starting jobs

There are a couple of ways you could launch Tasks in Spring Cloud Data Flow. Both the example are referred to a Streaming processing as the Batch processing is better for short lived, on demand service:

[1] https://stackoverflow.com/questions/38448411/how-to-dynamic-deploy-for-standalone-spring-batch-using-spring-cloud-task