Running CoRB
There are a variety of ways to execute a CoRB job.
The entry point is the main method in the com.marklogic.developer.corb.Manager
class. CoRB requires the MarkLogic XCC JAR on the classpath (preferably the version that corresponds to the MarkLogic server version), which can be downloaded from https://developer.marklogic.com/products/xcc.
Running CoRB as a Gradle task
ml-gradle has a CorbTask that facilitates executing CoRB as a Gradle task. Refer to https://github.com/marklogic-community/ml-gradle/wiki/Corb-and-Gradle for an overview and specific instructions for configuring and executing CoRB as a Gradle task.
Configuring Options
CoRB needs options specified through one or more of the following mechanisms:
- command-line parameters
- Java system properties, e.g. -DXCC-CONNECTION-URI=xcc://user:password@localhost:8202
- a properties file on the classpath, specified using -DOPTIONS-FILE=myjob.properties. Relative and full file system paths are also supported.
If an option is specified in more than one place, a command-line parameter takes precedence over a Java system property, which takes precedence over a property from the OPTIONS-FILE properties file.
Note: Any or all of the properties can be specified as Java system properties or as key-value pairs in a properties file.
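The precedence rule above can be pictured with a small shell sketch. It is only an illustration of the ordering, not CoRB's actual implementation; the helper name resolve_option and its three positional arguments are assumptions made for this example.

```shell
# Hypothetical helper: prints the first non-empty value in precedence order.
#   $1 = value given as a command-line parameter
#   $2 = value given as a Java system property (-D...)
#   $3 = value read from the OPTIONS-FILE properties file
resolve_option() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  else
    printf '%s\n' "$3"
  fi
}

# THREAD-COUNT is 10 as a system property and 4 in the properties file,
# with no command-line value, so the system property value is printed.
resolve_option "" "10" "4"
```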
Note: CoRB exit codes:
- 0 - successful
- 0 - nothing to process (ref: EXIT-CODE-NO-URIS)
- 1 - initialization or connection error
- 2 - execution error
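A wrapper script can branch on these exit codes. The following is a minimal sketch; describe_corb_exit is a hypothetical helper name, and the commented-out java line only indicates where a real run would go.

```shell
# Hypothetical helper: maps a CoRB exit code to the descriptions listed above.
# Note that 0 covers both a successful run and a run with nothing to process.
describe_corb_exit() {
  case "$1" in
    0) echo "successful, or nothing to process" ;;
    1) echo "initialization or connection error" ;;
    2) echo "execution error" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# In a real script the code would come from "$?" right after the java command:
#   java -cp ... -DOPTIONS-FILE=job.properties com.marklogic.developer.corb.Manager
#   describe_corb_exit "$?"
describe_corb_exit 2
```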
Note: CoRB now supports logging job metrics back to the MarkLogic database log and/or as a document in the database.
Usage Examples
Usage 1 - Command line options:
java -server -cp .:marklogic-xcc-11.1.0.jar:marklogic-corb-2.5.6.jar
com.marklogic.developer.corb.Manager
XCC-CONNECTION-URI
[COLLECTION-NAME] [PROCESS-MODULE] [THREAD-COUNT] [URIS-MODULE] [MODULE-ROOT]
[MODULES-DATABASE] [INSTALL] [PROCESS-TASK] [PRE-BATCH-MODULE] [PRE-BATCH-TASK]
[POST-BATCH-MODULE] [POST-BATCH-TASK] [EXPORT-FILE-DIR] [EXPORT-FILE-NAME]
[URIS-FILE]
Usage 2 - Java system properties specifying options:
java -server -cp .:marklogic-xcc-11.1.0.jar:marklogic-corb-2.5.6.jar
-DXCC-CONNECTION-URI=xcc://user:password@host:port/[database]
-DPROCESS-MODULE=module-name.xqy -DTHREAD-COUNT=10
-DURIS-MODULE=get-uris.xqy
-DPOST-BATCH-MODULE=post-batch.xqy
-D...
com.marklogic.developer.corb.Manager
Usage 3 - Properties file specifying options:
java -server -cp .:marklogic-xcc-11.1.0.jar:marklogic-corb-2.5.6.jar
-DOPTIONS-FILE=job.properties com.marklogic.developer.corb.Manager
This looks for the job.properties file on the classpath.
Usage 4 - Combination of properties file with Java system properties and command-line options:
java -server -cp .:marklogic-xcc-11.1.0.jar:marklogic-corb-2.5.6.jar
-DOPTIONS-FILE=job.properties -DTHREAD-COUNT=10
com.marklogic.developer.corb.Manager XCC-CONNECTION-URI
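When the usage examples are typed into a shell across several lines, each line needs a trailing backslash. The sketch below wraps the Usage 4 command in a function with a dry-run switch so the assembled command can be inspected before it is run. The function name run_corb and the CORB_DRY_RUN variable are assumptions for this example, and the jar names are simply those used in the examples above.

```shell
# Jar names as used in the usage examples; adjust to the versions you have.
XCC_JAR="marklogic-xcc-11.1.0.jar"
CORB_JAR="marklogic-corb-2.5.6.jar"

# Hypothetical wrapper around the Usage 4 command.
#   $1 = XCC-CONNECTION-URI, passed as the first command-line parameter
# If CORB_DRY_RUN is set and non-empty, the command is printed instead of executed.
run_corb() {
  ${CORB_DRY_RUN:+echo} java -server \
    -cp ".:${XCC_JAR}:${CORB_JAR}" \
    -DOPTIONS-FILE=job.properties \
    -DTHREAD-COUNT=10 \
    com.marklogic.developer.corb.Manager "$1"
}

# Dry run: print the command line without starting a job.
CORB_DRY_RUN=1
run_corb "xcc://user:password@localhost:8202"
```

Unsetting CORB_DRY_RUN (or setting it to an empty value) makes the same call execute the job.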
Sample job properties
Note: any of the properties below can be specified as a Java system property (i.e. with the '-D' option).
sample 1 - simple batch job
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=transform.xqy
sample 2 - Use username, password, host and port specified separately instead of connection URI
XCC-USERNAME=username
XCC-PASSWORD=password
XCC-HOSTNAME=localhost
XCC-PORT=9999
XCC-DBNAME=ML-database
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=SampleCorbJob.xqy
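The separate connection properties carry the same information as a connection URI of the form xcc://user:password@host:port/[database] shown in the usage examples. As a rough sketch of that correspondence, the hypothetical helper xcc_uri below joins the parts into one URI. It does no URL-encoding, so it assumes the values contain no special characters.

```shell
# Hypothetical helper: joins separate connection values into a single XCC URI.
#   $1 = username, $2 = password, $3 = host, $4 = port, $5 = database (optional)
# Assumption: none of the values need URL-encoding.
xcc_uri() {
  printf 'xcc://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "${5:-}"
}

# Values taken from the sample properties above.
xcc_uri username password localhost 9999 ML-database
```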
sample 3 - simple batch with URIS-FILE (in place of URIS-MODULE)
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-FILE=input-uris.csv
PROCESS-MODULE=SampleCorbJob.xqy
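A file for URIS-FILE can be produced with any tool that writes plain text. The sketch below assumes the file holds one URI per line; the document URIs are made up for illustration.

```shell
# Assumption: URIS-FILE lists one URI per line. The URIs are hypothetical.
cat > input-uris.csv <<'EOF'
/documents/1.xml
/documents/2.xml
/documents/3.xml
EOF

# Count the URIs that the job would be given.
wc -l < input-uris.csv
```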
sample 4 - simple batch with XML-FILE (in place of URIS-MODULE)
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
XML-FILE=input.xml
XML-NODE=/rootNode/childNode
URIS-LOADER=com.marklogic.developer.corb.FileUrisXMLLoader
PROCESS-MODULE=SampleCorbJob.xqy
sample 5 - report, generates a single file with data from processing each URI
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
PROCESS-MODULE=get-data-from-document.xqy
PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
EXPORT-FILE-NAME=/local/path/to/exportmyfile.csv
sample 6 - report with header, add the following to sample 5.
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
EXPORT-FILE-TOP-CONTENT=col1,col2,col3
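For reference, the sketch below writes out the report properties together with the two header options as one file. The file name report-with-header.properties is an assumption. The report sample does not list a URI source, so a URIS-MODULE or URIS-FILE line, as in the earlier samples, would presumably still be needed for a real run.

```shell
# Report properties plus the two header options, written to a hypothetical file name.
# Assumption: a URIS-MODULE or URIS-FILE line would still have to be added.
cat > report-with-header.properties <<'EOF'
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
PROCESS-MODULE=get-data-from-document.xqy
PROCESS-TASK=com.marklogic.developer.corb.ExportBatchToFileTask
EXPORT-FILE-NAME=/local/path/to/exportmyfile.csv
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
EXPORT-FILE-TOP-CONTENT=col1,col2,col3
EOF

# Confirm that both header options are present in the file.
grep -c -E '^(PRE-BATCH-TASK|EXPORT-FILE-TOP-CONTENT)=' report-with-header.properties
```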
sample 7 - dynamic headers, assuming the pre-batch-header.xqy module returns the header row, add the following to sample 5.
PRE-BATCH-MODULE=pre-batch-header.xqy
PRE-BATCH-TASK=com.marklogic.developer.corb.PreBatchUpdateFileTask
sample 8 - pre and post batch hooks
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy
PROCESS-MODULE=transform.xqy
PRE-BATCH-MODULE=pre-batch.xqy
POST-BATCH-MODULE=post-batch.xqy
sample 9 - adhoc tasks
The XQuery modules live on the filesystem local to where CoRB runs. Any XQuery module can be adhoc.
XCC-CONNECTION-URI=xcc://user:password@localhost:8202/
THREAD-COUNT=10
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.xqy|ADHOC
PROCESS-MODULE=SampleCorbJob.xqy|ADHOC
PRE-BATCH-MODULE=/local/path/to/adhoc-pre-batch.xqy|ADHOC
sample 10 - jasypt encryption
XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed by ENC(). If JASYPT-PROPERTIES-FILE is not specified, CoRB assumes the default jasypt.properties.
XCC-CONNECTION-URI=ENC(encrypted_uri)
DECRYPTER=com.marklogic.developer.corb.JasyptDecrypter
sample jasypt.properties
jasypt.password=foo
jasypt.algorithm=PBEWithMD5AndTripleDES
sample 11 - private key encryption with Java keys
XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed by ENC().
XCC-CONNECTION-URI=encrypted_uri
DECRYPTER=com.marklogic.developer.corb.PrivateKeyDecrypter
PRIVATE-KEY-FILE=/path/to/key/private.key
PRIVATE-KEY-ALGORITHM=RSA
sample 12 - private key encryption with Unix keys
XCC-CONNECTION-URI, XCC-USERNAME, XCC-PASSWORD, XCC-HOSTNAME, XCC-PORT and/or XCC-DBNAME properties can be encrypted and optionally enclosed by ENC().
XCC-CONNECTION-URI=encrypted_uri
DECRYPTER=com.marklogic.developer.corb.PrivateKeyDecrypter
PRIVATE-KEY-FILE=/path/to/rsa/key/private.pkcs8.key
sample 13 - JavaScript modules deployed to modules database
MODULE-ROOT=/temp/
MODULES-DATABASE=MY-Modules-DB
URIS-MODULE=get-uris.sjs
PROCESS-MODULE=transform.sjs
sample 14 - Adhoc JavaScript modules
URIS-MODULE=get-uris.sjs|ADHOC
PROCESS-MODULE=extract.sjs|ADHOC