Console Sink
If you are not familiar with the basic characteristics of a sink, please refer to the sinks documentation.
WARNING: The console sink is of little use outside interactive prototyping. Output to the "console" is written on the driver, and the driver typically runs on some arbitrary node of the cluster rather than in your terminal. The console sink also has to collect rows on the driver, which means significant data aggregation when the underlying stream is large and can cause the driver to run out of memory (OOM). PSTL is typically configured with a very small value for spark.driver.maxResultSize to minimize the risk of an OOM from misuse or abuse of interactive functionality in a production environment.
Usage
The console sink prints processed rows in a table-based format to the driver's console.
Options
numRows
Specifies the number of rows to display when rendering the table of data on the driver's console.
Defaults to 20.
SAVE STREAM foo
TO CONSOLE
OPTIONS(
'numRows'='100'
);
truncate
Specifies whether to truncate data when rendering the table of data on the driver's console. When a dataset contains many columns or long values, the rendered table may not fit the screen width properly, which is why truncation is on by default. If you are actively prototyping and inspecting data, it may be necessary to set truncate to false if you do not have more appropriate tools at your disposal.
Defaults to true.
SAVE STREAM foo
TO CONSOLE
OPTIONS(
'truncate'='false'
);
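When prototyping, it is often convenient to set both options at once. The sketch below assumes PSTL accepts a comma-separated list of key/value pairs inside OPTIONS, as Spark SQL's OPTIONS clauses do. This page only shows each option set on its own, so treat the combined form as unverified. The `--` comment line is likewise assumed to follow Spark SQL comment syntax, and foo is the same placeholder stream name used in the examples above.

```sql
-- Assumption: multiple options are comma-separated, as in Spark SQL OPTIONS clauses.
SAVE STREAM foo
TO CONSOLE
OPTIONS(
'numRows'='100',
'truncate'='false'
);
```

Keep numRows modest when disabling truncation: every displayed row is collected on the driver, so the warning about driver memory at the top of this page still applies.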