Upgrading data transforms to 2.0
This wiki documents Vega version 2. For Vega 3 documentation, see vega.github.io/vega.
NOTE that there's another page that describes how to update v1 specs to work with Vega 2. This page describes changes to Vega internals, particularly those related to data transforms.
In addition to the minor API changes noted elsewhere, there have been major changes to Vega internals for v2. This means that even a simple transform will require fairly major revisions, as demonstrated by the old (v1) and new (v2) versions of the basic Sort. Here are the salient differences in v2 that pertain to data transforms:
- Most JavaScript has been refactored into CommonJS modules. For a typical transform, this just means adding a few `require()` statements to assign any dependencies to local variables.
- Each transform is embodied as a subclass of Transform, then registered by exporting the subclass from the 'transforms' module so that it appears in your JS as, for example, `vg.transforms.Sort`. (See below for ways to add a new transform without rebuilding all of Vega or modifying its code.)
- Incoming parameters are defined with `Transform.addParameters`, and assigned one of several types:

  ```js
  /* Types used in `Transform.addParameters`; check out the Vega 2 transforms to learn more:
   * https://github.com/vega/vega/tree/master/src/transforms
   */
  {type: 'value'}                          // a simple value (string, boolean, number)
  {type: 'data'}
  {type: 'expr'}
  {type: 'field'}
  {type: 'custom'}                         // provide custom getter and setter
  {type: 'field', default: null}           // you can provide a default value
  {type: 'value', default: 0}
  {type: 'value', default: false}
  {type: 'array<value>'}
  {type: 'array<field>', default: ['data']}
  ```

NOTE that there's a different set of "primitive" types used in the formal JSON Schema that more clearly documents the expected parameters, their default values, and any constraints. (Another, more readable intro to JSON Schemas is http://spacetelescope.github.io/understanding-json-schema/)
- There are several flags available in each transform that help Vega to efficiently traverse the graph of possible transforms. These are not well documented beyond the code shown below, but I've had good results with educated guesswork and examining similar existing transforms, as well as studying the code in vega-dataflow.

  ```js
  var Flags = Node.Flags = {
    Router:    0x01, // Responsible for propagating tuples, cannot be skipped.
    Collector: 0x02, // Holds a materialized dataset, pulse node to reflow.
    Produces:  0x04, // Produces new tuples.
    Mutates:   0x08, // Sets properties of incoming tuples.
    Reflows:   0x10, // Forwards a reflow pulse.
    Batch:     0x20  // Performs batch data processing, needs collector.
  };
  ```

  These flags are false by default, but can be set to true by calling a series of corresponding functions on the transform before returning from your class's constructor function, for example:

  ```js
  return this.router(true).produces(true);
  ```

  ```js
  return this.router(true)
    .reflows(true)
    .mutates(true);
  ```
- The heart of your v1 transform will become an inner function `transform` assigned to its subclass as shown here.
- Most transforms will manipulate data in a new, streaming form instead of the simple "materialized" data used in v1. The `input` object passed to the inner `transform` function carries added tuples (`input.add`), modified tuples (`input.mod`) and removed tuples (`input.rem`) that your code can manipulate and return.
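To make the pieces above concrete, here is a minimal sketch of a transform that sets a flag in its constructor and processes a streaming changeset. It runs without Vega, so several things are assumptions for illustration: the `Transform` base class is a tiny stand-in (the real one does far more, including parameter handling via `Transform.addParameters`), the `flagSetter` helper and the `Double` transform are hypothetical, and the changeset is built by hand rather than supplied by the dataflow graph.

```javascript
// Stand-in for Vega's Transform base class (an assumption for this sketch;
// in real code you would require() the class from Vega's source instead).
var Flags = { Router: 0x01, Mutates: 0x08 };

function Transform() {
  this._flags = 0; // every flag starts out false
}

// Hypothetical helper: builds a chainable setter for one flag bit.
function flagSetter(bit) {
  return function(state) {
    this._flags = state ? (this._flags | bit) : (this._flags & ~bit);
    return this; // returning `this` is what allows .router(true).mutates(true)
  };
}
Transform.prototype.router = flagSetter(Flags.Router);
Transform.prototype.mutates = flagSetter(Flags.Mutates);

// Hypothetical transform: writes a `doubled` property onto each tuple.
function Double(field) {
  Transform.call(this);
  this._field = field;
  return this.mutates(true); // it sets properties of incoming tuples
}
Double.prototype = Object.create(Transform.prototype);
Double.prototype.constructor = Double;

// The inner `transform` function: receives a changeset and returns it.
Double.prototype.transform = function(input) {
  var field = this._field;
  function update(tuple) { tuple.doubled = tuple[field] * 2; }
  input.add.forEach(update); // newly added tuples
  input.mod.forEach(update); // tuples whose values changed
  // input.rem holds removed tuples; there is nothing to compute for them.
  return input;
};

// Usage with a hand-built changeset.
var input = {
  add: [{x: 1}, {x: 2}],
  mod: [{x: 5, doubled: 0}],
  rem: [{x: 9, doubled: 18}]
};
var output = new Double('x').transform(input);
```

The point of the sketch is the shape of the work: only `input.add` and `input.mod` are touched, and the same changeset object is returned so it can continue downstream.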
If you're planning to make changes to Vega, or to build compatible code like data transforms, there are some tools you'll need to install for bundling and managing its CommonJS modules.
- Node.js
- npm (Node Package Manager)
- browserify
- watchify (if you want to auto-bundle modules after any code edits)
- package.json (just a file, but it's a very important one)
There are lots of tutorials on the web for these tools -- here's one I particularly like -- so we won't go into great detail here. But once you have the general idea, it's useful to see a working package.json file for an app that includes custom transforms. This makes it easy to install all the required dependencies with npm, including "shims" for JS scripts that weren't written as CommonJS modules.
There is apparently a plugin system in the works for transforms, but for now the general assumption is that you'd add them to vega's source code and bundle it with browserify.
But if you'd rather not fork Vega to add your own transforms, it's still possible to build and use "third-party" transforms separately, by taking advantage of CommonJS modules. Simply include vega and your transforms using `require()`, then register your custom transforms with vega.
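A rough sketch of that registration step follows. It assumes, as the `vg.transforms.Sort` example earlier suggests, that vega exposes a plain, mutable `transforms` object keyed by transform name. So that the snippet runs on its own, `vg` is a stand-in for `require('vega')` and `MyTransform` is a hypothetical constructor standing in for `require('./MyTransform')`; the `registerTransforms` helper is likewise just an illustration, not part of Vega.

```javascript
// Stand-in for `var vg = require('vega');` (assumed to expose `transforms`).
var vg = { transforms: {} };

// Stand-in for `var MyTransform = require('./MyTransform');` (hypothetical).
function MyTransform() {}

// Hypothetical helper: copy custom transform constructors onto vega's
// registry, refusing to silently overwrite an existing entry.
function registerTransforms(vega, custom) {
  Object.keys(custom).forEach(function(name) {
    if (vega.transforms[name]) {
      throw new Error('Transform already registered: ' + name);
    }
    vega.transforms[name] = custom[name];
  });
  return vega;
}

registerTransforms(vg, { MyTransform: MyTransform });
```

In an actual app, the two stand-ins would be replaced by the `require()` calls shown in the comments, and the result bundled with browserify. Registration would need to happen before any spec that uses the custom transform is parsed.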