DSL Reference - garyrussell/spring-xd GitHub Wiki

Introduction

Spring XD provides a DSL for defining a stream. Over time the DSL is likely to evolve significantly as it gains the ability to define more and more sophisticated streams as well as the steps of a batch job.

Pipes and filters

A simple linear stream consists of a sequence of modules. Typically an Input Source, (optional) Processing Steps, and an Output Sink. As a simple example consider the collection of data from an HTTP Source writing to a File Sink. Using the DSL the stream description is:

http | file

A stream that involves some processing:

http | filter | transform | file

The modules in a stream definition are connected together using the pipe symbol |.

Module parameters

Each module may take parameters. The parameters supported by a module are defined by the module implementation. As an example the http source module exposes port setting which allows the data ingestion port to be changed from the default value.

http --port=1337

It is only necessary to quote parameter values if they contain spaces or the | character. Here the transform processor module is being passed a SpEL expression that will be applied to any data it encounters:

transform --expression='new StringBuilder(payload).reverse()'

If the parameter value needs to embed a single quote, use two single quotes:

// Query is: Select * from /Customers where name='Smith'
scan --query='Select * from /Customers where name=''Smith'''

Named channels

Instead of a source or sink it is possible to use a named channel. Normally the modules in a stream are connected by anonymous internal channels (represented by the pipes), but by using explicitly named channels it becomes possible to construct more sophisticated flows. In keeping with the unix theme, sourcing/sinking data from/to a particular channel uses the > character. A named channel is specified by using a channel type, followed by a : followed by a name. The channel types available are:

queue - this type of channel has point-to-point (p2p) semantics

topic - this type of channel has pub/sub semantics

Here is an example that shows how you can use a named channel to share a data pipeline driven by different input sources.

queue:foo > file

http > queue:foo

time > queue:foo

Now if you post data to the http source, you will see that data intermingled with the time value in the file.

The opposite case, the fanout of a message to multiple streams, is planned for a future release. However, taps are a specialization of named channels that do allow publishing data to multiple sinks. For example:

tap:stream:mystream > file

tap:stream:mystream > log

Once data is received on mystream, it will be written to both file and log.

Support for routing messages to different streams based on message content is also planned for a future release.

Labels

Labels provide a means to alias or group modules. Labels are simply a name followed by a : When used as an alias a label can provide a more descriptive name for a particular configuration of a module and possibly something easier to refer to in other streams.

mystream = http | obfuscator: transform --expression=payload.replaceAll('password','*') | file

Labels are especially useful for disambiguating when multiple modules of the same name are used:

mystream = http | uppercaser: transform --expression=payload.toUpperCase() | exclaimer: transform --expression=payload+'!' | file

Refer to this section of the Taps chapter to see how labels facilitate the creation of taps in these cases where a stream contains ambiguous modules.

Single quotes, Double quotes, Escaping

Spring XD is a complex runtime that involves a lot of systems when you look at the complete picture. There is a Spring Shell based client that talks to the admin that is responsible for parsing. In turn, modules may themselves rely on embedded languages (like the Spring Expression Language) to accomplish their behavior.

Those three components (shell, XD parser and SpEL) have rules about how they handle quotes and how syntax escaping works, and when stacked with each other, confusion may arise. This section explains the rules that apply and provides examples to the most common situations.

Note	It’s not always that complicated This section focuses on the most complicated cases, when all 3 layers are involved. Of course, if you don’t use the XD shell (for example if you’re using the REST API directly) or if module option values are not SpEL expressions, then escaping rules can be much simpler

Spring Shell

Arguably, the most complex component when it comes to quotes is the shell. The rules can be laid out quite simply, though:

a shell command is made of keys (--foo) and corresponding values. There is a special, key-less mapping though, see below
a value can not normally contain spaces, as space is the default delimiter for commands
spaces can be added though, by surrounding the value with quotes (either single ['] or double ["] quotes)
if surrounded with quotes, a value can embed a literal quote of the same kind by prefixing it with a backslash (\)
Other escapes are available, such as \t, \n, \r, \f and unicode escapes of the form \uxxxx
Lastly, the key-less mapping is handled in a special way in the sense that if does not need quoting to contain spaces

For example, the XD shell supports the ! command to execute native shell commands. The ! accepts a single, key-less argument. This is why the following works:

xd:>! rm foo

The argument here is the whole rm foo string, which is passed as is to the underlying shell.

As another example, the following commands are strictly equivalent, and the argument value is foo (without the quotes):

xd:>stream destroy foo
xd:>stream destroy --name foo
xd:>stream destroy "foo"
xd:>stream destroy --name "foo"

XD Syntax

At the XD parser level (that is, inside the body of a stream or job definition) the rules are the following:

option values are normally parsed until the first space character
they can be made of literal strings though, surrounded by single or double quotes
To embed such a quote, use two consecutive quotes of the desired kind

As such, the values of the --expression option to the filter module are semantically equivalent in the following examples:

filter --expression=payload>5
filter --expression="payload>5"
filter --expression='payload>5'
filter --expression='payload > 5'

Arguably, the last one is more readable. It is made possible thanks to the surrounding quotes. The actual expression is payload > 5 (without quotes).

Now, let’s imagine we want to test against string messages. If we’d like to compare the payload to the SpEL literal string, "foo", this is how we could do:

filter --expression=payload=='foo'           (1)
filter --expression='payload == ''foo'''     (2)
filter --expression='payload == "foo"'       (3)

This works because there are no spaces. Not very legible though
This uses single quotes to protect the whole argument, hence actual single quotes need to be doubled
But SpEL recognizes String literals with either single or double quotes, so this last method is arguably the best

Please note that the examples above are to be considered outside of the Spring XD shell. When entered inside the shell, chances are that the whole stream definition will itself be inside double quotes, which would need escaping. The whole example then becomes:

xd:>stream create foo --definition "http | filter --expression=payload='foo' | log"
xd:>stream create foo --definition "htpp | filter --expression='payload == ''foo''' | log"
xd:>stream create foo --definition "http | filter --expression='payload == \"foo\"' | log"

SpEL syntax and SpEL literals

The last piece of the puzzle is about SpEL expressions. Many modules accept options that are to be interpreted as SpEL expressions, and as seen above, String literals are handled in a special way there too. Basically,

literals can be enclosed in either single or double quotes
quotes need to be doubled to embed a literal quote. Single quotes inside double quotes need no special treatment, and vice versa

As a last example, assume you want to use the transform module. That module accepts an expression option which is a SpEL expression. It is to be evaluated against the incoming message, with a default of payload (which forwards the message payload untouched).

It is important to understand that the following are equivalent:

transform --expression=payload
transform --expression='payload'

but very different from the following:

transform --expression="'payload'"
transform --expression='''payload'''

and other variations.

The first series will simply evaluate to the message payload, while the latter examples will evaluate to the actual literal string payload (again, without quotes).

Putting it all together

As a last, complete example, let’s review how one could force the transformation of all messages to the string literal hello world, by creating a stream in the context of the XD shell:

stream create foo --definition "http | transform --expression='''hello world''' | log" (1)
stream create foo --definition "http | transform --expression='\"hello world\"' | log" (2)
stream create foo --definition "http | transform --expression=\"'hello world'\" | log" (2)

This uses single quotes around the string (at the XD parser level), but they need to be doubled because we’re inside a string literal (very first single quote after the equals sign)
use single and double quotes respectively to encompass the whole string at the XD parser level. Hence, the other kind of quote can be used inside the string. The whole thing is inside the --definition argument to the shell though, which uses double quotes. So double quotes are escaped (at the shell level)