Optimizely Responder user guide

See also: Optimizely Responder setup guide

Contents

  1. Overview
  2. Responder actions
     2.1 Upload one or more targeting lists (batch)
     2.2 Upload a DCP datasource (batch)
     2.3 Load DCP datasource (streaming)
     2.4 Alias DCP customer IDs (batch)

1. Overview

This Sauna responder lets you export user-level data from your event warehouse and upload this data to Optimizely in various formats for A/B testing.

2. Responder actions

Currently this responder supports two actions:

File landing path for Sauna     Action performed in Optimizely
------------------------------  -------------------------------------------
com.optimizely/targeting_lists  Upload one or more targeting lists (batch)
com.optimizely.dcp/datasource   Upload a DCP datasource (batch)

2.1 Upload one or more targeting lists (batch)

2.1.1 Overview

This responder path lets you export user-level data from your event warehouse and upload it to Optimizely as a targeting list for A/B testing.

Optimizely supports lists which identify users for targeting based on:

  1. A cookie
  2. A query string
  3. A zip/postal code

2.1.2 File landing path

The format for the path where files should land is:

<sauna_landing_root>/com.optimizely/targeting_lists/v1/tsv:*/<sub_folders/...>

Notes on this:

  • <sauna_landing_root> should be an S3 bucket/path or local folder as appropriate.
  • com.optimizely/targeting_lists specifies our responder action.
  • This is v1, version 1, of this responder path.
  • Currently tsv (for tab-separated values) is the only supported input format.
  • Currently your files must have all the fields required by the Targeting List API, hence the tsv:*.
  • You can add as many sub-folders as required to prevent conflicts with other Sauna users in your organization.

2.1.3 File input format

At the responder path, you must upload files which contain:

  • Tab (\t)-separated values enclosed in double quotation marks (")
  • Rows separated by newlines (\n)

Fields should be as follows:

"<project_id>"\t"<list_name>"\t"<list_description>"\t"<list_type>"\t"<key_fields>"\t"<value>"\n

Where:

  • <project_id> is your Optimizely project ID, e.g. 456.
  • <list_name> is the name of your targeting list (must be unique within this project), e.g. List_1.
  • <list_description> is the description for your list, e.g. Description of List_1.
  • <list_type> must be 1 for a cookie, 2 for a query string, 3 for a zip/postal code.
  • <key_fields> is a comma-separated list of cookie names (if <list_type> is 1), or query parameters (if <list_type> is 2) to target on. Leave empty if <list_type> is 3.
  • <value> should contain a single element to add to the list.
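
If you are generating these files yourself (rather than via Redshift UNLOAD), here is a minimal sketch of producing the expected quoting and separators, assuming Python and its standard csv module (the file name and row values below are just examples, not requirements):

import csv

# Example rows matching the field order:
# project_id, list_name, list_description, list_type, key_fields, value
rows = [
    ("456", "dec_ab_group", "December 2015 A/B group", "1", "login,signup",
     "38071d80-1f03-4c50-ba94-f3daafd5a8c0"),
    ("456", "jan_ab_group", "January 2016 A/B group", "3", "", "90210"),
]

# QUOTE_ALL wraps every field in double quotation marks; the tab delimiter
# and "\n" line terminator match the expected input format.
with open("new-lists.tsv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t", quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerows(rows)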

2.1.4 Response algorithm

Sauna will take the file(s) found at the file landing path and convert them into Create/Update Targeting List call(s) to the Optimizely API.

All lines relating to a given <project_id> and <list_name> pair will be translated into a single API call (with all <value>s concatenated with commas into the underlying API's list_content property).
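
As an illustration of this grouping (a rough Python sketch, not Sauna's actual implementation; the file name is just an example), rows sharing a <project_id> and <list_name> collapse into a single payload:

import csv
from collections import defaultdict

# Group all values by (project_id, list_name); the remaining metadata must be
# identical across rows for a given pair (see Troubleshooting below).
lists = defaultdict(lambda: {"values": []})

with open("new-lists.tsv", newline="") as f:
    # Each row is expected to carry exactly the six fields described above
    for project_id, name, description, list_type, key_fields, value in \
            csv.reader(f, delimiter="\t"):
        entry = lists[(project_id, name)]
        entry.update(description=description, list_type=int(list_type),
                     key_fields=key_fields or None)
        entry["values"].append(value)

# One Create/Update Targeting List call per (project_id, list_name) pair,
# with all values joined by commas into list_content.
for (project_id, name), entry in lists.items():
    payload = {
        "name": name,
        "description": entry["description"],
        "list_type": entry["list_type"],
        "key_fields": entry["key_fields"],
        "list_content": ",".join(entry["values"]),
        "format": "csv",
    }
    print(project_id, payload)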

2.1.5 Troubleshooting

For a given <project_id> and <list_name> pair, the <list_description>, <list_type> and <key_fields> must be identical across all rows, otherwise the creation of the corresponding list will fail and an incident will be triggered. This does not affect other lists.

If you attempt to upload a single list larger than 5MB, the creation of this list will fail and trigger an incident. This does not affect other lists.

2.1.6 Usage examples

2.1.6.1 Local folder

Assuming that your Sauna root is set to local folder /opt/sauna, and the file lands as:

/opt/sauna/com.optimizely/targeting_lists/v1/tsv:*/marketing-team/mary/new-lists.tsv

The contents of new-lists.tsv are as follows:

"456"    "dec_ab_group"    "December 2015 A/B group"    "1"    "login,signup"    "38071d80-1f03-4c50-ba94-f3daafd5a8c0"
"456"    "dec_ab_group"    "December 2015 A/B group"    "1"    "login,signup"    "d6cf1245-352c-4fa6-acf3-f985e7c4d785"
"456"    "dec_ab_group"    "December 2015 A/B group"    "1"    "login,signup"    "8b2e5bd6-05cf-47b9-a26c-940e9b2345ba"
"456"    "jan_ab_group"    "January 2016 A/B group"    "3"        "90210"
"456"    "jan_ab_group"    "January 2016 A/B group"    "3"        "10065"
"456"    "jan_ab_group"    "January 2016 A/B group"    "3"        "07620"

Here we can see that Mary in the Marketing Department wants to create two new lists, both in project 456:

  1. Mary has 3 users identified by cookies holding UUIDs to add to her dec_ab_group targeting list.
  2. Mary has 3 users identified by US zip codes to add to her jan_ab_group targeting list.

2.1.6.2 Redshift UNLOAD

UNLOAD is a Redshift SQL command which lets you export the contents of a SELECT statement to Amazon S3 for further processing.

Assuming that your Sauna root is set to S3 folder s3://my-sauna-bucket/prod and you run the following SQL:

CREATE TABLE targeting_lists (
  project_id       varchar(256) NOT NULL,
  list_name        varchar(256) NOT NULL,
  list_description varchar(256) NOT NULL,
  list_type        smallint     NOT NULL,
  key_fields       varchar(256)     NULL,
  value            varchar(256) NOT NULL
);

INSERT INTO targeting_lists VALUES
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', '38071d80-1f03-4c50-ba94-f3daafd5a8c0'),
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', 'd6cf1245-352c-4fa6-acf3-f985e7c4d785'),
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', '8b2e5bd6-05cf-47b9-a26c-940e9b2345ba'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '90210'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '10065'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '07620');

UNLOAD ('select * from targeting_lists')
  TO 's3://my-sauna-bucket/prod/com.optimizely/targeting_lists/v1/tsv:*/marketing-team/mary/'
  CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
  DELIMITER AS '\t'
  ADDQUOTES
  PARALLEL OFF;

Once again, this will export Mary's two targeting lists to Amazon S3, where Sauna will detect the file landing and make the corresponding calls to the Optimizely API.

2.1.6.3 Implementation

From either of the examples above, the following PUT calls would be made to the Optimizely API, creating two new targeting lists:

$ curl \
  -H "Token: abcdefghijklmnopqrstuvwxyz:123456" \
  -H "Content-Type: application/json" \
  -d '{"name":"dec_ab_group","description":"December 2015 A/B group","list_type":1,"key_fields":"login,signup","list_content":"38071d80-1f03-4c50-ba94-f3daafd5a8c0,d6cf1245-352c-4fa6-acf3-f985e7c4d785,8b2e5bd6-05cf-47b9-a26c-940e9b2345ba","format":"csv"}' \
  -X PUT "https://www.optimizelyapis.com/experiment/v1/targeting_lists/456/"
$ curl \
  -H "Token: abcdefghijklmnopqrstuvwxyz:123456" \
  -H "Content-Type: application/json" \
  -d '{"name":"jan_ab_group","description":"January 2016 A/B group","list_type":3,"key_fields":null,"list_content":"90210,10065,07620","format":"csv"}' \
  -X PUT "https://www.optimizelyapis.com/experiment/v1/targeting_lists/456/"

2.2 Upload a DCP datasource (batch)

2.2.1 Overview

Please note that Dynamic Customer Profiles are currently in Beta and are only available for whitelisted Optimizely accounts. The APIs are subject to change based on feedback from early customers.

This Sauna responder path lets you export customer-level attributes into a datasource defined within Optimizely Dynamic Customer Profiles (DCP). This responder uses the Bulk Upload API to load these customer profiles.

2.2.2 File landing path

The format for the path where files should land is:

<sauna_landing_root>/com.optimizely.dcp/datasource/v1/<dcp_service>/<dcp_datasource>/tsv:<attribute_1>,<attribute_2>,<attribute_3>,.../<sub_folders/...>

Notes on this:

  • Your <sauna_landing_root> will be an S3 bucket/path or local folder as appropriate.
  • com.optimizely.dcp/datasource specifies our responder action.
  • This is v1, version 1, of this responder path.
  • <dcp_service> is the ID of the DCP Service that this datasource belongs to.
  • <dcp_datasource> is the ID of the Datasource that these files should be loaded into.
  • Currently tsv (for tab-separated values) is the only supported input format.
  • Because the Bulk Upload API a) lets you upload files with a subset of the datasource's defined attributes, and b) has no inherent ordering to the attributes, you must list each of the attributes after the tsv: prefix. You must include the customerId attribute.
  • You can add as many sub-folders as required to prevent conflicts with other Sauna users in your organization.

2.2.3 File input format

At the responder path, you must upload files which contain:

  • Tab (\t)-separated values enclosed in double quotation marks (").
  • Rows separated by newlines (\n).

Fields should be as follows:

"<attribute_1>"\t"<attribute_2>"\t"<attribute_3>"\t"<...>"\n

Where:

  • The order of <attribute>s exactly matches the order given in your tsv: file landing path.
  • The value for each <attribute> exactly matches the datatype in your datasource configuration.
  • One of the columns must be the customerId (it doesn't have to be the first column, however).

2.2.4 Response algorithm

Sauna will take the file(s) found at the file landing path and:

  1. Re-format them to be compatible with the Bulk Upload API;
  2. Add a header row containing each of the attributes present in this file;
  3. Upload them to the optimizely-import S3 bucket with the correct path for this DCP datasource and service;
  4. Optimizely will then run the import.

Let's go through each of these steps in turn:

2.2.4.1 Reformatting for the Bulk Upload API

To prepare the export for the DCP Bulk Upload API, Sauna will:

  • Change all tabs to commas;
  • Remove all double quotation marks around values;
  • Convert any timestamps in the data into the datetime format specified for this attribute in the DCP datasource configuration;
  • Convert any t or f in the data into the values true or false.
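
A minimal sketch of this reformatting, assuming Python (purely illustrative; the real conversion is performed by Sauna, and the timestamp-to-epoch conversion driven by the datasource configuration is omitted here):

import csv

BOOLEANS = {"t": "true", "f": "false"}

# csv.reader strips the surrounding double quotation marks; joining the
# fields with commas replaces the tab separators.
with open("warehouse.tsv", newline="") as tsv:
    rows = [
        ",".join(BOOLEANS.get(field, field) for field in fields)
        for fields in csv.reader(tsv, delimiter="\t")
    ]

# Header row listing the attributes named in the file landing path
header = "isVip,customerId,spendSegment,whenCreated"
print("\n".join([header] + rows))
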
2.2.4.2 Uploading to the Optimizely S3 bucket

Sauna will:

  • Query the DCP datasource using the <dcp_service> and the <dcp_datasource> in the file landing path;
  • Retrieve the S3 path and AWS credentials required to perform the upload;
  • Upload the file to the given S3 path using the retrieved AWS credentials.
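
In outline, the flow looks roughly like the Python sketch below. This is not Sauna's code: the full lookup URL, the field names returned by the datasource lookup (aws_access_key, aws_secret_key) and the uploaded object key are assumptions for illustration only.

import boto3
import requests

API_TOKEN = "abcdefghijklmnopqrstuvwxyz:123456"  # your Optimizely API token
dcp_service = "4083282827"
dcp_datasource = "4259602384"

# Look up the datasource to obtain the target S3 path and AWS credentials.
# NB: the URL and the response field names used below are assumptions.
resp = requests.get(
    "https://www.optimizelyapis.com/experiment/v1/dcp_datasources/" + dcp_datasource,
    headers={"Token": API_TOKEN},
)
datasource = resp.json()

# Upload the reformatted CSV using the credentials retrieved above.
s3 = boto3.client(
    "s3",
    aws_access_key_id=datasource["aws_access_key"],
    aws_secret_access_key=datasource["aws_secret_key"],
)
s3.upload_file("warehouse.csv", "optimizely-import",
               "dcp/" + dcp_service + "/" + dcp_datasource + "/warehouse.csv")
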
2.2.4.3 Optimizely importing the file

Optimizely imports the file and immediately deletes it from the bucket.

2.2.5 Troubleshooting

Make sure that your listing of attributes in your file landing path exactly matches the column order of your exported data.

Bear in mind that Optimizely will validate your file against the attributes' datatypes and formats on a per-row basis, which can result in only some of your rows being imported, rather than simply all or none.

Don't attempt to reformat your timestamps into the attribute's configured datetime format; leave them as-is and Sauna will handle the conversion.

2.2.6 Usage examples

2.2.6.1 Our target datasource

Let's imagine that we have a DCP datasource called "My Customer Warehouse" with the following details:

  • DCP Service ID: 4083282827
  • Datasource ID: 4259602384
  • Key Field Locator Type: uid

We have configured it with the following attributes:

Name           Datatype  Format
-------------  --------  ------
isVip          bool      -
lifetimeValue  double    -
spendSegment   string    -
whenCreated    datetime  epoch

Let's now populate this datasource using Sauna.

2.2.6.2 Local folder

Assuming that your Sauna root is set to local folder /opt/sauna, and the file lands as:

/opt/sauna/com.optimizely.dcp/datasource/v1/4083282827/4259602384/tsv:isVip,customerId,spendSegment,whenCreated/ua-team/joe/warehouse.tsv

The contents of warehouse.tsv are as follows:

"t"    "123"    "alpha"    "2013-12-15 14:05:06.789"
"f"    "456"    "delta"    "2014-06-10 21:48:32.712"
"f"    "789"    "omega"    "2015-01-28 07:32:16.329"

Some notes on data formats:

  • Use t or f as flag values for true and false respectively.
  • The timestamp format must be ISO 8601 with milliseconds but no timezone information.

Here we can see that Joe in the User Acquisition team wants to upload three customer profiles to the DCP datasource:

  1. The customer with customerId 123 is a VIP in the alpha spend category.
  2. Customers 456 and 789 joined more recently and are not yet VIPs.

2.2.6.3 Redshift UNLOAD

UNLOAD is a Redshift SQL command which lets you export the contents of a SELECT statement to Amazon S3 for further processing.

Assuming that your Sauna root is set to S3 folder s3://my-sauna-bucket/prod and you run the following SQL:

CREATE TABLE customer_profiles (
  customer_id      varchar(256) NOT NULL,
  is_vip           boolean      NOT NULL,
  lifetime_value   float8       NOT NULL,
  spend_segment    varchar(256) NOT NULL,
  when_created     timestamp    NOT NULL
);

INSERT INTO customer_profiles VALUES
  ('123', true, 234.23, 'alpha', '20131215 14:05:06.789'),
  ('456', false, 43.48, 'delta', '20140610 21:48:32.712'),
  ('789', false, 21.01, 'omega', '20150128 07:32:16.329');

UNLOAD ('select is_vip, customer_id, spend_segment, when_created from customer_profiles')
  TO 's3://my-sauna-bucket/prod/com.optimizely.dcp/datasource/v1/4083282827/4259602384/tsv:isVip,customerId,spendSegment,whenCreated/ua-team/joe/'
  CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
  DELIMITER AS '\t'
  ADDQUOTES
  PARALLEL OFF;

Again, this will export 3 customer profiles in a TSV file to Amazon S3, where Sauna is waiting to detect the file landing and upload it to Optimizely. Some notes:

  • The order of the fields in our UNLOAD ('select ...') statement must match the order given in the tsv:... file landing path.
  • Although our customer_profiles table defines a lifetime_value column (corresponding to the lifetimeValue attribute in our DCP datasource), we choose not to upload it to Optimizely.
  • Don't confuse the AWS credentials provided in the UNLOAD, required to write to our my-sauna-bucket S3 bucket, with the AWS credentials which will be used to upload the data to Optimizely's S3 bucket.

2.2.6.4 Implementation

From the example above (either local or S3 landing path), Sauna will generate a CSV file in the following format:

isVip,customerId,spendSegment,whenCreated
true,123,alpha,1387116306789
false,456,delta,1402436912712
false,789,omega,1422430336329

Note that Sauna has:

  • added a header row exactly matching the attributes from the file landing path;
  • removed all double quotation marks and replaced tabs with commas for Optimizely;
  • reformatted the whenCreated timestamp values as epochs (using GMT, not local time), because this is the format specified for the attribute in the DCP datasource.
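
As a quick sanity check of that epoch conversion (a standalone Python snippet, not part of Sauna):

from datetime import datetime, timezone

# "2013-12-15 14:05:06.789", interpreted as GMT/UTC...
ts = datetime.strptime("2013-12-15 14:05:06.789", "%Y-%m-%d %H:%M:%S.%f")
ts = ts.replace(tzinfo=timezone.utc)

# ...becomes 1387116306789 milliseconds since the Unix epoch, matching the
# first whenCreated value in the generated CSV above.
print(round(ts.timestamp() * 1000))  # 1387116306789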

Sauna will then upload this file to

s3://optimizely-import/dcp/4083282827/4259602384

using the AWS credentials fetched by a GET to Optimizely's /dcp_datasources/4259602384.

2.3 Load DCP datasource (streaming)

This Sauna responder agent will use the DCP customer profiles API to upload new customer profiles into DCP in a streaming manner.

This responder has not yet been implemented. If you are interested in contributing this or sponsoring us to build this, please get in touch. You can read more about the proposed implementation, give feedback or +1 the feature in issue #6.

2.4 Alias DCP customer IDs (batch)

This Sauna responder path will use the DCP alias API to alias the customer ID of the specified datasource with the customer ID of the Optimizely datasource.

This responder has not yet been implemented. If you are interested in contributing this or sponsoring us to build this, please get in touch. You can read more about the proposed implementation, give feedback or +1 the feature in issue #7.
