# Optimizely Responder user guide
See also: Optimizely Responder setup guide
- 1. Overview
- 2. Responder actions
  - 2.1 Upload one or more targeting lists (batch)
    - 2.1.1 Overview
    - 2.1.2 File landing path
    - 2.1.3 File input format
    - 2.1.4 Response algorithm
    - 2.1.5 Troubleshooting
    - 2.1.6 Usage examples
      - 2.1.6.1 Local folder
      - 2.1.6.2 Redshift UNLOAD
      - 2.1.6.3 Implementation
  - 2.2 Load DCP datasource (batch)
    - 2.2.1 Overview
    - 2.2.2 File landing path
    - 2.2.3 File input format
    - 2.2.4 Response algorithm
      - 2.2.4.1 Reformatting for the Bulk Upload API
      - 2.2.4.2 Uploading to the Optimizely S3 bucket
      - 2.2.4.3 Optimizely importing the file
    - 2.2.5 Troubleshooting
    - 2.2.6 Usage examples
      - 2.2.6.1 Our target datasource
      - 2.2.6.2 Local folder
      - 2.2.6.3 Redshift UNLOAD
      - 2.2.6.4 Implementation
  - 2.3 Load DCP datasource (streaming)
  - 2.4 Alias DCP customer IDs (batch)
## 1. Overview
This Sauna responder lets you export user-level data from your event warehouse and upload this data to Optimizely in various formats for A/B testing.
Currently this responder supports two actions:
| File landing path for Sauna | Action performed in Optimizely |
|---|---|
| `com.optimizely/targeting_lists` | Upload one or more targeting lists (batch) |
| `com.optimizely.dcp/datasource` | Upload a DCP datasource (batch) |
## 2. Responder actions

### 2.1 Upload one or more targeting lists (batch)

#### 2.1.1 Overview

This responder path lets you export user-level data from your event warehouse and upload it to Optimizely as a targeting list for A/B testing.
Optimizely supports lists which identify users for targeting based on:
- A cookie
- A query string
- A zip/postal code
#### 2.1.2 File landing path

The format for the path where files should land is:

```
<sauna_landing_root>/com.optimizely/targeting_lists/v1/tsv:*/<sub_folders/...>
```
Notes on this:

- `<sauna_landing_root>` should be an S3 bucket/path or local folder as appropriate.
- `com.optimizely/targeting_lists` specifies our responder action.
- This is `v1`, version 1, of this responder path.
- Currently `tsv` (for tab-separated values) is the only supported input format.
- Currently your files must have all the fields required by the Targeting List API, hence the `tsv:*`.
- You can add as many sub-folders as required to prevent conflicts with other Sauna users in your organization.
#### 2.1.3 File input format

At the responder path, you must upload files which contain:

- Tab (`\t`)-separated values enclosed in double quotation marks (`"`)
- Rows separated by newlines (`\n`)
Fields should be as follows:

```
"<project_id>"\t"<list_name>"\t"<list_description>"\t"<list_type>"\t"<key_fields>"\t"<value>"\n
```
Where:

- `<project_id>` is your Optimizely project ID, e.g. `456`.
- `<list_name>` is the name of your targeting list (must be unique within this project), e.g. `List_1`.
- `<list_description>` is the description for your list, e.g. `Description of List_1`.
- `<list_type>` must be `1` for a cookie, `2` for a query string, or `3` for a zip/postal code.
- `<key_fields>` is a comma-separated list of cookie names (if `<list_type>` is `1`) or query parameters (if `<list_type>` is `2`) to target on. Leave empty if `<list_type>` is `3`.
- `<value>` should contain a single element to add to the list.
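To make this row format concrete, here is a minimal Python sketch (not part of Sauna itself) that writes a targeting-list file in the expected format; the rows reuse the example values from the usage examples below:

```python
import csv

# (project_id, list_name, list_description, list_type, key_fields, value)
rows = [
    ("456", "dec_ab_group", "December 2015 A/B group", "1", "login,signup",
     "38071d80-1f03-4c50-ba94-f3daafd5a8c0"),
    ("456", "jan_ab_group", "January 2016 A/B group", "3", "", "90210"),
]

# Tab-separated fields, each enclosed in double quotation marks, one row per line.
with open("new-lists.tsv", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t", quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
```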
#### 2.1.4 Response algorithm

Sauna will take the file(s) found at the file landing path and convert them into Create/Update Targeting List call(s) to the Optimizely API.

All lines related to a given `<project_id>` and `<list_name>` pair will translate into a single API call (with all `<value>`s concatenated with commas into the underlying API's `list_content` property).
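As a rough illustration of this grouping (a sketch only, not Sauna's actual implementation), rows sharing a `<project_id>` and `<list_name>` collapse into a single payload like so:

```python
import csv
from collections import defaultdict

# Collect values per (project_id, list_name) pair from the landed TSV file.
lists = defaultdict(list)
meta = {}
with open("new-lists.tsv", newline="") as f:
    for project_id, name, description, list_type, key_fields, value in \
            csv.reader(f, delimiter="\t"):
        lists[(project_id, name)].append(value)
        meta[(project_id, name)] = (description, int(list_type), key_fields)

# Build one Create/Update Targeting List payload per pair,
# with all values concatenated into `list_content`.
payloads = {}
for (project_id, name), values in lists.items():
    description, list_type, key_fields = meta[(project_id, name)]
    payloads[(project_id, name)] = {
        "name": name,
        "description": description,
        "list_type": list_type,
        "key_fields": key_fields or None,
        "list_content": ",".join(values),
        "format": "csv",
    }
```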
#### 2.1.5 Troubleshooting

For a given `<project_id>` and `<list_name>` pair, the `<list_description>`, `<list_type>` and `<key_fields>` must be identical across all rows, otherwise the creation of the corresponding list will fail and an incident will be triggered. This does not affect other lists.

If you attempt to upload a single list larger than 5MB, the creation of this list will fail and trigger an incident. This does not affect other lists.
#### 2.1.6 Usage examples

##### 2.1.6.1 Local folder

Assuming that your Sauna root is set to the local folder `/opt/sauna`, and the file lands as:

```
/opt/sauna/com.optimizely/targeting_lists/v1/tsv:*/marketing-team/mary/new-lists.tsv
```
The contents of `new-lists.tsv` are as follows:

```
"456"	"dec_ab_group"	"December 2015 A/B group"	"1"	"login,signup"	"38071d80-1f03-4c50-ba94-f3daafd5a8c0"
"456"	"dec_ab_group"	"December 2015 A/B group"	"1"	"login,signup"	"d6cf1245-352c-4fa6-acf3-f985e7c4d785"
"456"	"dec_ab_group"	"December 2015 A/B group"	"1"	"login,signup"	"8b2e5bd6-05cf-47b9-a26c-940e9b2345ba"
"456"	"jan_ab_group"	"January 2016 A/B group"	"3"	""	"90210"
"456"	"jan_ab_group"	"January 2016 A/B group"	"3"	""	"10065"
"456"	"jan_ab_group"	"January 2016 A/B group"	"3"	""	"07620"
```
Here we can see that Mary in the Marketing Department wants to create two new lists, both in project `456`:

- Mary has 3 users, identified by cookies holding UUIDs, to add to her `dec_ab_group` targeting list.
- Mary has 3 users, identified by US zip codes, to add to her `jan_ab_group` targeting list.
##### 2.1.6.2 Redshift UNLOAD

`UNLOAD` is a Redshift SQL command which lets you export the contents of a `SELECT` statement to Amazon S3 for further processing.
Assuming that your Sauna root is set to the S3 folder `s3://my-sauna-bucket/prod` and you run the following SQL:
```sql
CREATE TABLE targeting_lists (
  project_id       varchar(256) NOT NULL,
  list_name        varchar(256) NOT NULL,
  list_description varchar(256) NOT NULL,
  list_type        smallint     NOT NULL,
  key_fields       varchar(256) NULL,
  value            varchar(256) NOT NULL
);

INSERT INTO targeting_lists VALUES
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', '38071d80-1f03-4c50-ba94-f3daafd5a8c0'),
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', 'd6cf1245-352c-4fa6-acf3-f985e7c4d785'),
  ('456', 'dec_ab_group', 'December 2015 A/B group', 1, 'login,signup', '8b2e5bd6-05cf-47b9-a26c-940e9b2345ba'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '90210'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '10065'),
  ('456', 'jan_ab_group', 'January 2016 A/B group', 3, null, '07620');

UNLOAD ('select * from targeting_lists')
TO 's3://my-sauna-bucket/prod/com.optimizely/targeting_lists/v1/tsv:*/marketing-team/mary/'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
DELIMITER AS '\t'
ADDQUOTES
PARALLEL OFF;
```
Once again, this will export Mary's two marketing lists to Amazon S3, where Sauna will detect the file landing and make the corresponding calls to the Optimizely API.
##### 2.1.6.3 Implementation

From both of the examples above, the following `PUT` calls to Optimizely would be made, creating two new targeting lists:
```bash
$ curl \
  -H "Token: abcdefghijklmnopqrstuvwxyz:123456" \
  -H "Content-Type: application/json" \
  -d '{"name":"dec_ab_group","description":"December 2015 A/B group","list_type":1,"key_fields":"login,signup","list_content":"38071d80-1f03-4c50-ba94-f3daafd5a8c0,d6cf1245-352c-4fa6-acf3-f985e7c4d785,8b2e5bd6-05cf-47b9-a26c-940e9b2345ba","format":"csv"}' \
  -X PUT "https://www.optimizelyapis.com/experiment/v1/targeting_lists/456/"

$ curl \
  -H "Token: abcdefghijklmnopqrstuvwxyz:123456" \
  -H "Content-Type: application/json" \
  -d '{"name":"jan_ab_group","description":"January 2016 A/B group","list_type":3,"key_fields":null,"list_content":"90210,10065,07620","format":"csv"}' \
  -X PUT "https://www.optimizelyapis.com/experiment/v1/targeting_lists/456/"
```
### 2.2 Load DCP datasource (batch)

#### 2.2.1 Overview

Please note that Dynamic Customer Profiles are currently in Beta and are only available for whitelisted Optimizely accounts. The APIs are subject to change based on feedback from early customers.

This Sauna responder path lets you export customer-level attributes into a datasource defined within Optimizely Dynamic Customer Profiles (DCP). This responder uses the Bulk Upload API to load these customer profiles.
#### 2.2.2 File landing path

The format for the path where files should land is:

```
<sauna_landing_root>/com.optimizely.dcp/datasource/v1/<dcp_service>/<dcp_datasource>/tsv:<attribute_1>,<attribute_2>,<attribute_3>,.../<sub_folders/...>
```
Notes on this:

- Your `<sauna_landing_root>` will be an S3 bucket/path or local folder as appropriate.
- `com.optimizely.dcp/datasource` specifies our responder action.
- This is `v1`, version 1, of this responder path.
- `<dcp_service>` is the ID of the DCP Service that this datasource belongs to.
- `<dcp_datasource>` is the ID of the Datasource that these files should be loaded into.
- Currently `tsv` (for tab-separated values) is the only supported input format.
- Because the Bulk Upload API a) lets you upload files with a subset of the datasource's defined attributes, and b) has no inherent ordering to the attributes, you must list each of the attributes after the `tsv:` prefix. You must include the `customerId` attribute.
- You can add as many sub-folders as required to prevent conflicts with other Sauna users in your organization.
#### 2.2.3 File input format

At the responder path, you must upload files which contain:

- Tab (`\t`)-separated values enclosed in double quotation marks (`"`)
- Rows separated by newlines (`\n`)
Fields should be as follows:

```
"<attribute_1>"\t"<attribute_2>"\t"<attribute_3>"\t"<...>"\n
```
Where:
- The order of
<attribute>
s exactly matches the order given in yourtsv:
file landing path. - The value for each
<attribute>
exactly matches the datatype in your datasource configuration. - One of the columns must be for the
customerId
(it doesn't have to be the first column however).
#### 2.2.4 Response algorithm

Sauna will take the file(s) found at the file landing path and:

- Re-format them to be compatible with the Bulk Upload API;
- Add a header row containing each of the attributes present in this file;
- Upload them to the `optimizely-import` S3 bucket with the correct path for this DCP datasource and service;
- Optimizely will then run the import.
Let's go through each of these steps in turn:
##### 2.2.4.1 Reformatting for the Bulk Upload API

To prepare the export for the DCP Bulk Upload API, Sauna will:

- Change all tabs to commas;
- Remove all double quotation marks around values;
- Convert any timestamps in the data into the datetime format specified for this attribute in the DCP datasource configuration;
- Convert any `t` or `f` in the data into the values `true` or `false`.
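Here is a minimal sketch of this per-row reformatting (for illustration only; it assumes the attribute order from the usage example below, and that `whenCreated` is configured with the `epoch` datetime format):

```python
import csv
from datetime import datetime, timezone

def to_epoch_millis(ts):
    # Parse a "YYYY-MM-DD HH:MM:SS.mmm" timestamp as GMT and return epoch milliseconds,
    # matching an attribute whose datetime format is `epoch`.
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)
    return str(int(dt.timestamp() * 1000))

# Attribute order as listed after the `tsv:` prefix in the file landing path.
attributes = ["isVip", "customerId", "spendSegment", "whenCreated"]

with open("warehouse.tsv", newline="") as tsv, open("upload.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(attributes)                    # header row of attribute names
    for is_vip, customer_id, segment, created in csv.reader(tsv, delimiter="\t"):
        writer.writerow([
            {"t": "true", "f": "false"}[is_vip],   # t/f flags -> true/false
            customer_id,
            segment,
            to_epoch_millis(created),              # timestamp -> epoch millis (GMT)
        ])
```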
##### 2.2.4.2 Uploading to the Optimizely S3 bucket

Sauna will:

- Query the DCP datasource using the `<dcp_service>` and the `<dcp_datasource>` in the file landing path;
- Retrieve the S3 path and AWS credentials required to perform the upload;
- Upload the file to the given S3 path using the retrieved AWS credentials.
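A rough sketch of these steps follows. The credential field names (`aws_access_key`, `aws_secret_key`) and the uploaded object key are assumptions made purely for illustration; Sauna derives the real values from the datasource lookup:

```python
import boto3
import requests

API_TOKEN = "abcdefghijklmnopqrstuvwxyz:123456"        # your Optimizely API token
DCP_SERVICE, DCP_DATASOURCE = "4083282827", "4259602384"

# 1. Look up the datasource to obtain the S3 path and AWS credentials for the upload.
resp = requests.get(
    f"https://www.optimizelyapis.com/experiment/v1/dcp_datasources/{DCP_DATASOURCE}",
    headers={"Token": API_TOKEN},
)
datasource = resp.json()

# 2. Upload the reformatted CSV to Optimizely's import bucket for this datasource.
#    (The "aws_access_key"/"aws_secret_key" field names are hypothetical.)
s3 = boto3.client(
    "s3",
    aws_access_key_id=datasource["aws_access_key"],
    aws_secret_access_key=datasource["aws_secret_key"],
)
s3.upload_file(
    "upload.csv",
    "optimizely-import",
    f"dcp/{DCP_SERVICE}/{DCP_DATASOURCE}/upload.csv",
)
```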
##### 2.2.4.3 Optimizely importing the file

Optimizely imports the file and immediately deletes it from the bucket.
#### 2.2.5 Troubleshooting

Make sure that the listing of attributes in your file landing path exactly matches the column order of your exported data.

Bear in mind that Optimizely will validate your file against the attributes' datatypes and formats on a per-row basis, which can result in only some of your rows being imported, rather than simply all or none.

Don't attempt to reformat your timestamps into the attribute's datetime format; leave them as-is and Sauna will handle the conversion.
#### 2.2.6 Usage examples

##### 2.2.6.1 Our target datasource

Let's imagine that we have a DCP datasource called "My Customer Warehouse", set up as follows:

- DCP Service ID: 4083282827
- Datasource ID: 4259602384
- Key Field Locator Type: `uid`

We have configured it with the following attributes:
| Name | Datatype | Format |
|---|---|---|
| `isVip` | bool | - |
| `lifetimeValue` | double | - |
| `spendSegment` | string | - |
| `whenCreated` | datetime | epoch |
Let's now populate this datasource using Sauna.
##### 2.2.6.2 Local folder

Assuming that your Sauna root is set to the local folder `/opt/sauna`, and the file lands as:

```
/opt/sauna/com.optimizely.dcp/datasource/v1/4083282827/4259602384/tsv:isVip,customerId,spendSegment,whenCreated/ua-team/joe/warehouse.tsv
```
The contents of `warehouse.tsv` are as follows:

```
"t"	"123"	"alpha"	"2013-12-15 14:05:06.789"
"f"	"456"	"delta"	"2014-06-10 21:48:32.712"
"f"	"789"	"omega"	"2015-01-28 07:32:16.329"
```
Some notes on data formats:

- Use `t` or `f` as flag values for `true` and `false` respectively.
- The timestamp format must be ISO 8601 with milliseconds but no timezone information.
Here we can see that Joe in the User Acquisition team wants to upload three customer profiles to the DCP datasource:

- The customer with `customerId` 123 is a VIP in the alpha spend category.
- Customers 456 and 789 joined more recently and are not yet VIPs.
##### 2.2.6.3 Redshift UNLOAD

`UNLOAD` is a Redshift SQL command which lets you export the contents of a `SELECT` statement to Amazon S3 for further processing.

Assuming that your Sauna root is set to the S3 folder `s3://my-sauna-bucket/prod` and you run the following SQL:
```sql
CREATE TABLE customer_profiles (
  customer_id    varchar(256) NOT NULL,
  is_vip         boolean      NOT NULL,
  lifetime_value float8       NOT NULL,
  spend_segment  varchar(256) NOT NULL,
  when_created   timestamp    NOT NULL
);

INSERT INTO customer_profiles VALUES
  ('123', true, 234.23, 'alpha', '20131215 14:05:06.789'),
  ('456', false, 43.48, 'delta', '20140610 21:48:32.712'),
  ('789', false, 21.01, 'omega', '20150128 07:32:16.329');

UNLOAD ('select is_vip, customer_id, spend_segment, when_created from customer_profiles')
TO 's3://my-sauna-bucket/prod/com.optimizely.dcp/datasource/v1/4083282827/4259602384/tsv:isVip,customerId,spendSegment,whenCreated/ua-team/joe/'
CREDENTIALS 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>'
DELIMITER AS '\t'
ADDQUOTES
PARALLEL OFF;
```
Again, this will export 3 customer profiles in a TSV file to Amazon S3, where Sauna is waiting to detect the file landing and upload the data to Optimizely. Some notes:

- The order of the fields in the `UNLOAD ('select ...')` must match the order given in the `tsv:...` file landing path.
- Although we have a lifetime value column defined in our `customer_profiles` table (`lifetime_value`), as we do in our DCP datasource (`lifetimeValue`), we choose not to upload it to Optimizely.
- Don't confuse the AWS credentials provided in the `UNLOAD`, which are required to write to our `my-sauna-bucket` S3 bucket, with the AWS credentials which will be used to upload the data to Optimizely's S3 bucket.
##### 2.2.6.4 Implementation

From the example above (either local or S3 landing path), Sauna will generate a CSV file in the following format:
```
isVip,customerId,spendSegment,whenCreated
true,123,alpha,1387116306789
false,456,delta,1402436912712
false,789,omega,1422430336329
```
Note that Sauna has:

- added a header row exactly matching the attributes from the file landing path;
- removed all double quotation marks and replaced tabs with commas for Optimizely;
- reformatted the `whenCreated` timestamp values as epochs (using GMT, not local time), because this is the format specified for the attribute in the DCP datasource.
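As a quick sanity check of the epoch conversion (interpreting the timestamp as GMT), the first value from the example can be reproduced like this:

```python
from datetime import datetime, timezone

ts = datetime.strptime("2013-12-15 14:05:06.789", "%Y-%m-%d %H:%M:%S.%f")
print(int(ts.replace(tzinfo=timezone.utc).timestamp() * 1000))  # 1387116306789
```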
Sauna will then upload this file to `s3://optimizely-import/dcp/4083282827/4259602384` using the AWS credentials fetched by a `GET` to Optimizely's `/dcp_datasources/4259602384`.
### 2.3 Load DCP datasource (streaming)

This Sauna responder will use the DCP customer profiles API to upload new customer profiles into DCP in a streaming manner.

This responder has not yet been implemented. If you are interested in contributing this or sponsoring us to build this, please get in touch. You can read more about the proposed implementation, give feedback or `+1` the feature in issue #6.
### 2.4 Alias DCP customer IDs (batch)

This Sauna responder path will use the DCP alias API to alias the customer ID of the specified datasource with the customer ID of the Optimizely datasource.

This responder has not yet been implemented. If you are interested in contributing this or sponsoring us to build this, please get in touch. You can read more about the proposed implementation, give feedback or `+1` the feature in issue #7.