Clojure collector

Introduction

The Clojure Collector is a Snowplow event collector written in Clojure. It is typically used in place of Snowplow's CloudFront-based collector when site visitors need to be uniquely identified across multiple domains (e.g. on a content or ad network).

It is designed to be easily runnable on Amazon Elastic Beanstalk.

How it works

There are two key aspects to the Clojure Collector:

  1. User identification - how users are uniquely identified across domains
  2. Event logging - how Snowplow events are logged ready for Enrichment

User identification

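The Clojure Collector sets a third-party cookie, which makes it possible to track users across domains. The CloudFront Collector does not support cross-domain tracking because its user IDs are set client-side, so each domain ends up with its own ID for the same visitor. The Clojure Collector sets the ID server-side, in a cookie on the collector's own domain, so the same ID is sent back whichever site the visitor is on.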

In a nutshell: the Clojure Collector receives events from the Snowplow JavaScript tracker, sets/updates a third-party user tracking cookie, and returns the pixel to the client. The ID in this third-party user tracking cookie is stored in the network_userid field in Snowplow events.

In pseudocode terms:

if (request contains an "sp" cookie) {
    Record that cookie as the user identifier
    Set that cookie with a now+1 year cookie expiry
    Add the headers and payload to the output array
} else {
    Generate a new user identifier
    Set the "sp" cookie to that identifier with a now+1 year cookie expiry
    Add the headers and payload to the output array
}

The user cookie is re-sent with a new expiry date on every response, so its lifetime is always one year from the visitor's most recent request.
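
The cookie handling described above can be sketched as follows. This is a minimal illustration in Python, not the collector's actual Clojure code: the function name handle_request, the UUID format of a newly generated ID and the Path=/ attribute are all assumptions made for the example, and the "output array" step of the pseudocode is left out.

```python
import uuid
from datetime import datetime, timedelta, timezone
from http.cookies import SimpleCookie

COOKIE_NAME = "sp"                     # cookie name used in the pseudocode
COOKIE_LIFETIME = timedelta(days=365)  # "now + 1 year"


def handle_request(cookie_header, now=None):
    """Return (user_id, Set-Cookie value) for one incoming request.

    Hypothetical helper for illustration only.
    """
    now = now or datetime.now(timezone.utc)
    cookies = SimpleCookie(cookie_header or "")

    if COOKIE_NAME in cookies:
        # Returning visitor: keep the existing identifier.
        user_id = cookies[COOKIE_NAME].value
    else:
        # New visitor: generate an identifier (UUID format is an assumption).
        user_id = str(uuid.uuid4())

    # In both branches the cookie is re-sent with a fresh one-year expiry.
    expires = now + COOKIE_LIFETIME
    out = SimpleCookie()
    out[COOKIE_NAME] = user_id
    out[COOKIE_NAME]["expires"] = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    out[COOKIE_NAME]["path"] = "/"  # assumption: cookie valid for all paths
    return user_id, out[COOKIE_NAME].OutputString()
```

Calling handle_request("sp=abc123") returns the existing ID abc123 together with a refreshed cookie, while handle_request(None) returns a newly generated ID. Either way the returned ID is what ends up in the network_userid field.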

Note that this approach to tracking users across domains works on all browsers except mobile Safari.

Event logging

The Clojure Collector does not contain any logging functionality of its own; instead, you are expected to run the Clojure Collector in a servlet container such as Tomcat, with access logging (including response headers) enabled.

The Clojure Collector ships with all of the configuration required to run it within a Tomcat container on Elastic Beanstalk. It is configured so that the output log format is identical to that generated by CloudFront.
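
Because the log format matches CloudFront's, a downstream reader can treat each line as tab-separated columns. The sketch below is a minimal illustration in Python, assuming the leading column order of AWS's documented CloudFront access log format; the function name parse_log_line and the sample line are made up for the example and are not real collector output.

```python
# Leading columns of a CloudFront access log line, per AWS's documented format.
CLOUDFRONT_FIELDS = [
    "date", "time", "x-edge-location", "sc-bytes", "c-ip", "cs-method",
    "cs(Host)", "cs-uri-stem", "sc-status", "cs(Referer)", "cs(User-Agent)",
    "cs-uri-query",
]


def parse_log_line(line):
    """Map the leading tab-separated columns of one log line to field names.

    Hypothetical helper for illustration only. Returns None for header lines.
    """
    if line.startswith("#"):  # e.g. "#Version: 1.0" and "#Fields: ..."
        return None
    values = line.rstrip("\n").split("\t")
    return dict(zip(CLOUDFRONT_FIELDS, values))


# Made-up sample line; every value here is a placeholder.
sample = "\t".join([
    "2013-10-07", "19:47:54", "-", "37", "203.0.113.7", "GET",
    "collector.example.com", "/i", "200", "http://example.com/",
    "Mozilla/5.0", "e=pv&nuid=abc123",
])

record = parse_log_line(sample)
```

Here record["cs-uri-query"] holds the logged querystring, which is where the event payload travels.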

Note that the Clojure Collector logs the cross-domain user ID held in the third-party cookie by appending &nuid= to the logged request. This prevents conflict with the &uid= (business user_id) and &duid= (domain aka first-party user ID) fields which are set in the tracker.
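
To make the field separation concrete, the following sketch appends nuid to a logged querystring and reads the three user ID fields back out. It is a minimal illustration in Python, not the collector's own code; the function name append_nuid and all sample values are assumptions.

```python
from urllib.parse import parse_qs, quote


def append_nuid(query_string, network_userid):
    """Append the third-party cookie ID to a querystring as nuid.

    Hypothetical helper for illustration only.
    """
    pair = "nuid=" + quote(network_userid, safe="")
    return query_string + "&" + pair if query_string else pair


# Made-up tracker querystring carrying uid and duid, as set by the tracker.
logged = append_nuid("e=pv&uid=alice&duid=d1f0", "abc123")
fields = parse_qs(logged)

business_user_id = fields["uid"][0]    # set in the tracker
domain_user_id = fields["duid"][0]     # first-party ID, set in the tracker
network_user_id = fields["nuid"][0]    # third-party cookie ID, set server-side
```

Because the three IDs use distinct parameter names, all of them survive in the same logged request without overwriting one another.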

Technical architecture

The Clojure Collector is built on top of Ring and Compojure.

To run it locally:

$ lein ring server
