Compatibility of Steps - HiromuHota/pentaho-kettle GitHub Wiki

Standard steps

The standard steps listed below are the same as the ones listed here.

Name Category ID Executable Editable Comments
Abort Flow Abort Y Y
Add a checksum Transform CheckSum Y Y
Add constants Transform Constant Y Y
Add sequence Transform Sequence Y Y
Add value fields changing sequence Transform FieldsChangeSequence Y Y
Add XML Transform AddXML Y Y
Analytic Query Statistics AnalyticQuery Y Y
Append streams Flow Append Y Y
ARFF Output Data Mining Arff Output
Automatic Documentation Output Output AutoDoc Y Y
Avro input Input AvroInput
Avro Input (New) Input AvroInputNew Y (See #86)
Avro Output Output AvroOutput Y (See #86)
Block this step until steps finish Flow BlockUntilStepsFinish Y Y
Blocking Step Flow BlockingStep Y Y
Calculator Transform Calculator Y Y
Call DB Procedure Lookup DBProc
Call Endpoint BA Server CallEndpointStep
Change file encoding Utility ChangeFileEncoding
Cassandra input Big Data CassandraInput Y Y Tested with Cassandra 2.0.7
Cassandra output Big Data CassandraOutput Y Y Tested with Cassandra 2.0.7
Check if a column exists Lookup ColumnExists Y Y
Check if file is locked Lookup FileLocked
Check if webservice is available Lookup WebServiceAvailable Y Y
Clone row Utility CloneRow Y Y
Closure Generator Transform ClosureGenerator Y Y
Combination lookup/update Data Warehouse CombinationLookup Y Y
Concat Fields Transform ConcatFields Y Y
Copy rows to result Job RowsToResult Y Y
CouchDB Input Big Data CouchDbInput
Credit card validator Validation CreditCardValidator Y Y
CSV file input Input CsvInput Y Y
Data Grid Input DataGrid Y Y
Data Validator Validation Validator Y Y
Database join Lookup DBJoin
Database lookup Lookup DBLookup
De-serialize from file Input CubeInput
Delay row Utility Delay Y Y
Delete Output Delete Y Y
Detect empty stream Flow DetectEmptyStream
Dimension lookup/update Data Warehouse DimensionLookup Y Y
Dummy (do nothing) Flow Dummy Y Y
Dynamic SQL row Lookup DynamicSQLRow
Edi to XML Utility TypeExitEdi2XmlStep Y Y
ElasticSearch Bulk Insert Bulk loading ElasticSearchBulk
Email messages input Input MailInput
ESRI Shapefile Reader Input ShapeFileReader
ETL Metadata Injection Flow MetaInject Y Y
Execute a process Utility ExecProcess Y Y
Execute row SQL script Scripting ExecSQLRow
Execute SQL script Scripting ExecSQL
File exists Lookup FileExists Y Y
Filter Rows Flow FilterRows Y Y (0.6.1.5+)
Fixed file input Input FixedInput Y Y (0.6.1.6+) "Finish" button is mislabeled as "Next"
Formula Scripting Formula Y Y
Fuzzy match Lookup FuzzyMatch Y Y
Generate random credit card numbers Input RandomCCNumberGenerator Y Y
Generate random value Input RandomValue Y Y
Generate Rows Input RowGenerator Y Y
Get data from XML Input getXMLData Y Y
Get File Names Input GetFileNames Y Y
Get files from result Job FilesFromResult
Get Files Rows Count Input GetFilesRowsCount Y Y
Get ID from slave server Transform GetSlaveSequence
Get repository names Input GetRepositoryNames Y Y
Get rows from result Job RowsFromResult
Get Session Variables BA Server GetSessionVariableStep
Get SubFolder names Input GetSubFolders Y Y
Get System Info Input SystemInfo Y Y
Get table names Input GetTableNames Y Y
Get Variables Job GetVariable
Google Analytics Input TypeExitGoogleAnalyticsInputStep
Google Docs Input Input
Greenplum Bulk Loader Bulk loading GPBulkLoader
Greenplum Load Bulk loading GPLoad
Group by Statistics GroupBy Y Y
GZIP CSV Input Input ParallelGzipCsvInput
Hadoop File Input Big Data HadoopFileInputPlugin Y Y *1
Hadoop File Output Big Data HadoopFileOutputPlugin Y Y *1
HBase input Big Data HbaseInput Y Y *1
HBase output Big Data HbaseOutput Y Y *1
HBase Row Decoder Big Data HBaseRowDecoder
HL7 Input Input HL7Input Y Y
HTTP client Lookup HTTP Y Y
HTTP Post Lookup HTTPPOST Y Y
IBM Websphere MQ Consumer Input MQInput
IBM Websphere MQ Producer Output MQOutput
Identify last row in a stream Flow DetectLastRow Y Y
If field value is null Utility IfNull Y Y
Infobright Loader Bulk loading InfobrightOutput
Ingres VectorWise Bulk Loader Bulk loading VectorWiseBulkLoader
Injector Inline Injector
Insert / Update Output InsertUpdate Y Y
Java Filter Flow JavaFilter
JMS Consumer Input JmsInput
JMS Producer Output JmsOutput
Job Executor Flow JobExecutor
Join Rows (cartesian product) Joins JoinRows Y Y (0.6.1.5+)
JSON Input Input JsonInput Y Y
JSON output Output JsonOutput Y Y
Knowledge Flow Data Mining KF
LDAP Input Input LDAPInput
LDAP Output Output LDAPOutput
LDIF Input Input LDIFInput
Load file content in memory Input LoadFileInput Y Y
LucidDB Streaming Loader Bulk loading LucidDBStreamingLoader
Mail Utility Mail
Mail Validator Validation MailValidator
Mapping (sub-transformation) Mapping Mapping Y Y
Mapping input specification Mapping MappingInput Y Y
Mapping output specification Mapping MappingOutput Y Y
MapReduce Input Big Data HadoopEnterPlugin Y Y *1
MapReduce Output Big Data HadoopExitPlugin Y Y *1
MaxMind GeoIP Lookup Lookup MaxMindGeoIPLookup
Memory Group by Statistics MemoryGroupBy Y Y
Merge Join Joins MergeJoin Y Y
Merge Rows (diff) Joins MergeRows Y Y
Metadata structure of stream Utility StepMetastructure Y Y
Microsoft Access Input Input AccessInput
Microsoft Access Output Output AccessOutput
Microsoft Excel Input Input ExcelInput Y Y
Microsoft Excel Output Output ExcelOutput Y Y
Microsoft Excel Writer Output TypeExitExcelWriterStep
Modified Java Script Value Scripting ScriptValueMod Y Y (0.7.0.9+)
Mondrian Input Input MondrianInput
MonetDB Agile Mart Agile
MonetDB Bulk Loader Bulk loading MonetDBBulkLoader
MongoDB Input Big Data MongoDbInput Y Y
MongoDB Output Big Data MongoDbOutput Y Y
Multiway Merge Join Joins MultiwayMergeJoin
MySQL Bulk Loader Bulk loading MySQLBulkLoader
Null if... Utility NullIf Y Y
Number range Transform NumberRange Y Y
OLAP Input Input OlapInput
OpenERP Object Delete Delete OpenERPObjectDelete
OpenERP Object Input Input OpenERPObjectInput
OpenERP Object Output Output OpenERPObjectOutputImport
Oracle Bulk Loader Bulk loading OraBulkLoader
Output steps metrics Statistics StepsMetrics
Palo Cell Input Input PaloCellInput
Palo Cell Output Output PaloCellOutput
Palo Dimension Input Input PaloDimInput
Palo Dimension Output Output PaloDimOutput
Pentaho Reporting Output Output PentahoReportingOutput Y Y
PostgreSQL Bulk Loader Bulk loading PGBulkLoader
Prioritize streams Flow PrioritizeStreams
Process files Utility ProcessFiles
Properties Output Output PropertyOutput Y Y
Property Input Input PropertyInput
R script executor Statistics RScriptExecutor Y Y R: 3.3.3, rJava: 0.9-8, webSpoon: 0.7.1.11, openjdk: 1.8.0_131, libjri.so in /usr/local/tomcat/native-jni-lib/
Regex Evaluation Scripting RegexEval Y Y
Replace in string Transform ReplaceString Y Y
Reservoir Sampling Statistics ReservoirSampling Y Y
REST Client Lookup Rest Y Y
Row denormaliser Transform Denormaliser Y Y
Row flattener Transform Flattener Y Y
Row Normaliser Transform Normaliser Y Y
RSS Input Input RssInput
RSS Output Output RssOutput
Rule Executor Scripting RuleExecutor Y Y
Rule Accumulator Scripting RuleAccumulator Y Y
Run SSH commands Utility SSH
S3 CSV Input Input S3CSVINPUT Y Y
S3 File Output Output S3FileOutputPlugin Y Y
SAP HANA Bulk Loader Bulk loading HanaBulkLoader
Salesforce Delete Output SalesforceDelete
Salesforce Input Input SalesforceInput
Salesforce Insert Output SalesforceInsert
Salesforce Update Output SalesforceUpdate
Salesforce Upsert Output SalesforceUpsert
Sample rows Statistics SampleRows Y Y
SAP Input Input SapInput
SAS Input Input SASInput
Script Experimental
Secret key generator Cryptography SecretKeyGenerator
Select values Transform SelectValues Y Y
Send message to Syslog Utility SyslogMessage
Serialize to file Output CubeOutput
Set field value Transform SetValueField
Set field value to a constant Transform SetValueConstant Y Y
Set files in result Job FilesToResult
Set Session Variables BA Server SetSessionVariableStep
Set Variables Job SetVariable Y Y
SFTP Put Experimental
Simple Mapping Mapping SimpleMapping
Single Threader Flow SingleThreader
Socket reader Inline SocketReader
Socket writer Inline SocketWriter
Sort rows Transform SortRows Y Y
Sorted Merge Joins SortedMerge
Split field to rows Transform SplitFieldToRows3 Y Y
Split Fields Transform FieldSplitter Y Y
Splunk Input Transform SplunkInput
Splunk Output Transform SplunkOutput
SQL File Output Output SQLFileOutput
Stream lookup Lookup StreamLookup Y Y
SSTable Output Big Data SSTableOutput
String operations Transform StringOperations Y Y
Strings cut Transform StringCut Y Y
Switch / Case Flow SwitchCase Y Y
Symmetric Cryptography Cryptography SymmetricCryptoTrans
Synchronize after merge Output SynchronizeAfterMerge
Table Agile Mart Agile
Table Compare Utility TableCompare Y Y
Table exists Lookup TableExists Y Y
Table input Input TableInput Y Y Tested with MySQL
Table output Output TableOutput Y Y
Teradata Fastload Bulk Loader Bulk loading TeraFast
Teradata TPT Insert Upsert Bulk Loader Bulk loading TeraDataBulkLoader
Text file input Input TextFileInput Y Y
Text file output Output TextFileOutput Y Y
Transformation Executor Flow Y Y
Unique rows Transform Unique Y Y
Unique rows (HashSet) Transform UniqueRowsByHashSet Y Y
Univariate Statistics Statistics UnivariateStats Y Y
Update Output Update Y Y
User Defined Java Class Scripting UserDefinedJavaClass Y Y
User Defined Java Expression Scripting Janino Y Y
Value Mapper Transform ValueMapper Y Y
Vertica Bulk Loader Bulk loading VerticaBulkLoader Y Y
Web services lookup Lookup WebServiceLookup Y Y
Write to log Utility WriteToLog Y Y
XBase input Input XBaseInput Y Y
XML Input Stream (StAX) Input XMLInputStream Y Y
XML Join Joins XMLJoin Y Y
XML Output Output XMLOutput Y Y
XSD Validator Validation XSDValidator
XSL Transformation Transform XSLT
Yaml Input Input YamlInput Y Y
Zip File Utility ZipFile

*1: Tested with a locally deployed Cloudera quickstart VM (5.8)

Non-standard steps

Non-standard steps include, but are not limited to, those in the marketplace.

Name Category ID Executable Editable Comments
Apache Kafka Consumer Input KafkaConsumer Y Y
Apache Kafka Producer Output KafkaProducer Y Y
CKAN DataStore Writer for Pentaho Data Integration Output ckan_datastore Y Y
CPython Script Executor Statistics CPythonScriptExecutor Y* Y Cannot make a plot
Execute R Script Scripting PRScript Y Y PRScript: 0.0.4, R: 3.1.1, rJava: 0.9-8, webSpoon: 0.7.1.9, openjdk: 1.8.0_131, libjri.so in /usr/local/tomcat/native-jni-lib/
MQTT Subscriber Input MQTTSubscriberMeta Y Y
MQTT Publisher Output MQTTPublisherMeta Y Y
PDI Jira JiraPlugin Y Y Tested with 0.6.1
Ruby Script Scripting TypeExitRubyStep Y Y (0.7.1.13+) Tested with 1.3.4
Rule Engine Validation Jare_Rule_Engine_Plugin Y Y
Rule Engine Client Validation Jare_Rule_Engine_Client_Plugin Y Y