JavaScript script enrichment - OXYGEN-MARKET/oxygen-market.github.io GitHub Wiki
HOME > SNOWPLOW SETUP GUIDE > Step 3: Setting up Enrich > Configurable enrichments > JavaScript script enrichment
Warning: The JavaScript Enrichment is a powerful tool and should be used with care. Do not allow untrusted third parties to edit your script.
JSON Schema iglu:com.snowplowanalytics.snowplow/javascript_script_config/jsonschema/1-0-0 Compatibility r66+ Data provider None
This enrichment lets you write a JavaScript function which is executed in the Enrichment process for each enriched event, and returns one or more derived contexts which are attached to the final enriched event.
Use this enrichment to apply your own business logic to your enriched events at the row-level; JavaScript functions across multiple rows are not supported. Because your JavaScript function can throw exceptions which are gracefully handled by the calling Enrichment process, you can also use this enrichment to provide simple filtering of events.
This is the field which this enrichment will augment:
- derived_contexts
Your JavaScript must include a function, process(event)
, which:
- Takes a Snowplow enriched event POJO(Plain Old Java Object) as its sole argument
- Returns a JavaScript array of valid self-describing JSONs, which will be added to the
derived_contexts
field in the enriched event - Returns
[]
ornull
if there are no contexts to add to this event - Can
throw
exceptions but note that throwing an exception will cause the entire enriched event to end up in the Bad Bucket or Bad Stream
Note that you can also include other top-level functions and variables in your JavaScript script - but you must include a process(event)
function somewhere in your script.
Here is an example:
const SECRET_APP_ID = "Joshua";
/**
* Performs two roles:
* 1. If this is a server-side event, we
* validate that the app_id is our
* valid secret. Prevents spoofing of
* our server-side events
* 2. If app_id is not null, return a new
* Acme context, derived_app_id, which
* contains the upper-cased app_id
*/
function process(event) {
var appId = event.getApp_id();
if (platform == "server" && appId != SECRET_APP_ID) {
throw "Server-side event has invalid app_id: " + appId;
}
if (appId == null) {
return [];
}
var appIdUpper = new String(appId.toUpperCase());
return [ { schema: "iglu:com.acme/derived_app_id/jsonschema/1-0-0",
data: { appIdUpper: appIdUpper } } ];
}
Please note:
- You must use a Java-style getter to retrieve the
app_id
from the event (more information) - We have to convert the uppercased
appId
back to a JavaScript String (from a Java String) before we return it (more information)
The self-describing JSON to configure this enrichment with the above JavaScript script is as follows:
{
"schema": "iglu:com.snowplowanalytics.snowplow/javascript_script_config/jsonschema/1-0-0",
"data": {
"vendor": "com.snowplowanalytics.snowplow",
"name": "javascript_script_config",
"enabled": true,
"parameters": {
"script": "Y29uc3QgU0VDUkVUX0FQUF9JRCA9ICJKb3NodWEiOw0KDQovKioNCiAqIFBlcmZvcm1zIHR3byByb2xlczoNCiAqIDEuIElmIHRoaXMgaXMgYSBzZXJ2ZXItc2lkZSBldmVudCwgd2UNCiAqICAgIHZhbGlkYXRlIHRoYXQgdGhlIGFwcF9pZCBpcyBvdXINCiAqICAgIHZhbGlkIHNlY3JldC4gUHJldmVudHMgc3Bvb2Zpbmcgb2YNCiAqICAgIG91ciBzZXJ2ZXItc2lkZSBldmVudHMNCiAqIDIuIElmIGFwcF9pZCBpcyBub3QgbnVsbCwgcmV0dXJuIGEgbmV3DQogKiAgICBBY21lIGNvbnRleHQsIGRlcml2ZWRfYXBwX2lkLCB3aGljaA0KICogICAgY29udGFpbnMgdGhlIHVwcGVyLWNhc2VkIGFwcF9pZA0KICovDQpmdW5jdGlvbiBwcm9jZXNzKGV2ZW50KSB7DQogICAgdmFyIGFwcElkID0gZXZlbnQuZ2V0QXBwX2lkKCk7DQoNCiAgICBpZiAocGxhdGZvcm0gPT0gInNlcnZlciIgJiYgYXBwSWQgIT0gU0VDUkVUX0FQUF9JRCkgew0KICAgICAgICB0aHJvdyAiU2VydmVyLXNpZGUgZXZlbnQgaGFzIGludmFsaWQgYXBwX2lkOiAiICsgYXBwSWQ7DQogICAgfQ0KDQogICAgaWYgKGFwcElkID09IG51bGwpIHsNCiAgICAgICAgcmV0dXJuIFtdOw0KICAgIH0NCg0KICAgIHZhciBhcHBJZFVwcGVyID0gbmV3IFN0cmluZyhhcHBJZC50b1VwcGVyQ2FzZSgpKTsNCiAgICByZXR1cm4gWyB7IHNjaGVtYTogImlnbHU6Y29tLmFjbWUvZGVyaXZlZF9hcHBfaWQvanNvbnNjaGVtYS8xLTAtMCIsDQogICAgICAgICAgICAgICBkYXRhOiAgeyBhcHBJZFVwcGVyOiBhcHBJZFVwcGVyIH0gfSBdOw0KfQ=="
}
}
}
The "parameters" fields are as follows:
- "script": Your JavaScript function, Base64 encoded. You can use either URL-safe or regular Base64 encoding
The data source for this enrichment is the entire enriched/good
event in the form of Snowplow Plain Old Java Object produced during common enrichment process.
This enrichment uses the Rhino JavaScript engine to execute your JavaScript. Your JavaScript is pre-compiled so that your code should approach native Java speeds.
The process
function is passed the exact Snowplow enriched event POJO. The return value from the process
function is converted into a JSON string (using JSON.stringify
) in JavaScript before being retrieved in our Scala code. Our Scala code confirms that the return value is either null or an empty or non-empty array of Objects. No validation of the self-describing JSONs is performed.
You can review the exact Scala code which executes your JavaScript script in the JavascriptScriptEnrichment.scala file.
This is our most powerful enrichment yet - here are some do's and don'ts to avoid potential footguns.
Do:
- use Snowplow version tags to confirm the fields available in your Snowplow version's enriched event POJO
- return as many contexts as you want
- throw an exception if you want this enriched event to end up in the Bad Bucket or Bad Stream
- include minified, self-contained JavaScript libraries that your
process(event)
function needs - test this enrichment on sample sets of events before putting it into production
- ensure your new contexts are defined in Iglu, Redshift, JSON Paths etc
Don't:
- mutate existing fields in the supplied enriched event - return a new context instead
- try to share state across multiple enriched events - write your own Scalding or Spark job instead
- include CPU-intensive tasks without being aware of the impact on your event processing time
- allow untrusted parties to write your script - the script has access to the Java standard library and therefore to your filesystem.
As during the JavaScript enrichment process the new context is added to derived_contexts
of the enriched/good
event, the data generated will end up in its own table determined by the custom self-describing JSON provided by the user with process(event)
function encoded in javascript_script_config
file.