Python Tracker v0.2 - OXYGEN-MARKET/oxygen-market.github.io GitHub Wiki
This page refers to version 0.2.0 of the Snowplow Python Tracker. Documentation for other versions is also available.
- 3.1 Configuring your tracker
  - 3.1.1 set_platform()
  - 3.1.2 set_base64_to()
- 3.2 Adding extra data
  - 3.2.1 set_app_id()
  - 3.2.2 set_user_id()
  - 3.2.3 set_screen_resolution()
  - 3.2.4 set_viewport()
  - 3.2.5 set_color_depth()
  - 3.2.6 set_lang()
- 4.1 Common
  - 4.1.1 Argument validation
  - 4.1.2 Optional timestamp argument
- 4.2 track_screen_view()
- 4.3 track_page_view()
- 4.4 track_ecommerce_transaction()
- 4.5 track_ecommerce_transaction_item()
- 4.6 track_struct_event()
- 4.7 track_unstruct_event()
  - 4.7.1 Supported datatypes
The Snowplow Python Tracker allows you to track Snowplow events from your Python apps and games.
The tracker should be straightforward to use if you are comfortable with Python development; any prior experience with Snowplow's JavaScript Tracker or Lua Tracker, Google Analytics or Mixpanel (which have similar APIs to Snowplow) is helpful but not necessary.
Note that this tracker has access to a more restricted set of Snowplow events than the JavaScript Tracker and covers almost all the events from the Lua Tracker.
Assuming you have completed the Python Tracker Setup for your Python project, you are now ready to initialize the Python Tracker.
Import the Python Tracker's module into your Python code like so:
from snowplow_tracker.tracker import Tracker
That's it - you are now ready to initialize a tracker instance.
Initialize a tracker instance like this:
tracker = Tracker("d3rkrsqld9gmqf.cloudfront.net")
You can also add an optional tracker name parameter:
tracker = Tracker("d3rkrsqld9gmqf.cloudfront.net", "cf")
This will be attached to all events which the tracker fires, allowing you to identify their origin.
Each tracker instance is completely sandboxed, so you can create multiple trackers as you see fit.
Here is an example of instantiating two separate trackers:
t1 = Tracker("d3rkrsqld9gmqf.cloudfront.net", "t1")
t1.set_platform("cnsl")
t1.track_page_view("http://www.example.com")
t2 = Tracker("my-company.c.snplow.com", "t2")
t2.set_platform("cnsl")
t2.track_screen_view("Game HUD", "23")
t1.track_screen_view("Test", "23") # Back to first tracker
Each tracker instance is initialized with sensible defaults:
- The platform the tracker is running on is set to "pc"
- Property data for unstructured events is sent Base64-encoded
However you can change either of these defaults:
You can change the platform the tracker is running on by calling:
t.set_platform(platform_code)
For example:
t.set_platform("tv") # Running on a Connected TV
For a full list of supported platforms, please see the Snowplow Tracker Protocol.
You can set whether or not to Base64-encode property data for unstructured events by calling:
t.set_base64_to( {{True OR False}} )
So to disable it and send the data URI-encoded instead:
t.set_base64_to(False)
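To make the difference between the two encodings concrete, the sketch below encodes the same property data both ways using only the standard library. It illustrates the encodings themselves, not the tracker's internal code: the JSON serialisation step, the URL-safe Base64 alphabet and the sample properties are all assumptions.

```python
import base64
import json
from urllib.parse import quote

# Hypothetical property data for an unstructured event.
properties = {"save_id": "4321", "level": 23}

# Assumption: the properties are serialised to JSON before being encoded.
payload = json.dumps(properties, sort_keys=True)

# Base64 form (the URL-safe alphabet is assumed here).
as_base64 = base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")

# URI-encoded (percent-encoded) form of the same payload.
as_uri = quote(payload, safe="")

print(as_base64)
print(as_uri)
```

The Base64 form is opaque but compact and safe to place in a querystring; the URI-encoded form stays human-readable in collector logs, which can help when debugging.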
You may have additional information about your application's environment, current user and so on, which you want to send to Snowplow with each event.
The tracker instance has a set of set...() methods to attach extra data to all tracked events. We will discuss each of these in turn below:
You can set the application ID to any string:
t.set_app_id( "{{APPLICATION ID}}" )
Example:
t.set_app_id("wow-addon-1")
You can set the user ID to any string:
t.set_user_id( "{{USER ID}}" )
Example:
t.set_user_id("alexd")
If your Python code has access to the device's screen resolution, then you can pass this in to Snowplow too:
t.set_screen_resolution( {{WIDTH}}, {{HEIGHT}} )
Both numbers should be positive integers; note the order is width followed by height. Example:
t.set_screen_resolution(1366, 768)
If your Python code has access to the dimensions of the viewport, then you can pass these in to Snowplow too:
t.set_viewport( {{WIDTH}}, {{HEIGHT}} )
Both numbers should be positive integers; note the order is width followed by height. Example:
t.set_viewport(300, 200)
If your Python code has access to the bit depth of the device's color palette for displaying images, then you can pass this in to Snowplow too:
t.set_color_depth( {{BITS PER PIXEL}} )
The number should be a positive integer, in bits per pixel. Example:
t.set_color_depth(32)
This method lets you pass a user's language in to Snowplow:
t.set_lang( {{LANGUAGE}} )
The language should be a string, such as a language code. Example:
t.set_lang('en')
Snowplow has been built to enable you to track a wide range of events that occur when users interact with your websites and apps. We are constantly growing the range of functions available in order to capture that data more richly.
Tracking methods supported by the Python Tracker at a glance:
Function | Description |
---|---|
track_page_view() | Track and record views of web pages. |
track_ecommerce_transaction() | Track an ecommerce transaction on transaction level. |
track_ecommerce_transaction_item() | Track an ecommerce transaction on item level. |
track_screen_view() | Track the user viewing a screen within the application. |
track_struct_event() | Track a Snowplow custom structured event. |
track_unstruct_event() | Track a Snowplow custom unstructured event. |
All events are tracked with specific methods on the tracker instance, of the form track_XXX(), where XXX is the name of the event to track.
Python is a dynamically typed language, but each of our track...() methods expects its arguments to be of specific types and value ranges, and validates that to be the case. These checks are done using the PyContracts library.
If the validation check fails, then a runtime error is thrown:
t = Tracker("localhost")
t.set_color_depth("walrus")
contracts.interface.ContractNotRespected: Breach for argument 'depth' to Tracker:set_color_depth().
Expected type 'int', got 'str'.
checking: Int for value: Instance of str: 'walrus'
checking: $(Int) for value: Instance of str: 'walrus'
checking: int for value: Instance of str: 'walrus'
Variables bound in inner context:
- self: Instance of Tracker: <snowplow_tracker.tracker.Tracker object... [clip]
If your value is of the wrong type, convert it before passing it into the track...() method, for example:
level_idx = 42
t.track_screen_view("Game Level", str(level_idx))
We specify the types and value ranges required for each argument below.
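If you would rather catch bad values in your own code before they reach the tracker, a small pre-check can mirror the rules listed in the "Validation" columns below. The following is a minimal sketch in plain Python; the helper names are hypothetical and this is not how the tracker's PyContracts checks are implemented.

```python
def require_non_empty_str(value, name):
    """Return value unchanged if it is a non-empty string, else raise."""
    if not isinstance(value, str) or value == "":
        raise ValueError("{} must be a non-empty string, got {!r}".format(name, value))
    return value


def require_positive_int(value, name):
    """Return value unchanged if it is a positive integer, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError("{} must be a positive integer, got {!r}".format(name, value))
    return value


# Convert first, then check, before handing the values to a track...() method.
level_idx = 42
screen_id = require_non_empty_str(str(level_idx), "id_")
depth = require_positive_int(32, "depth")
```

Raising a plain ValueError keeps the failure close to where the bad value was produced, rather than deep inside a tracker call.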
Each track...() method supports an optional timestamp as its final argument; this allows you to manually override the timestamp attached to this event.
If you do not pass this timestamp in as an argument, then the Python Tracker will use the current time as the timestamp for the event.
Here is an example tracking a structured event and supplying the optional timestamp argument. We can explicitly supply None for each of the intervening arguments which are empty:
t.track_struct_event("some cat", "save action", None, None, None, 1368725287)
Alternatively, we can use the argument name:
t.track_struct_event("some cat", "save action", tstamp=1368725287)
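The timestamp is counted in seconds since the Unix epoch, as returned by time.time() in Python. Because time.time() returns a float and the argument is validated as a positive integer, truncate the value first, for example with int(time.time()).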
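As a minimal sketch of producing a suitable value, the snippet below derives whole epoch seconds both from the current time and from a specific datetime. The helper name epoch_seconds is hypothetical and not part of the tracker.

```python
import time
from datetime import datetime, timezone


def epoch_seconds(moment):
    """Convert a timezone-aware datetime to whole seconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


# The current time, truncated to a whole number of seconds.
now_tstamp = int(time.time())

# A specific moment, expressed in UTC.
saved_at = datetime(2013, 5, 16, 17, 28, 7, tzinfo=timezone.utc)
tstamp = epoch_seconds(saved_at)
print(tstamp)  # 1368725287
```

Requiring a timezone-aware datetime avoids silently interpreting a naive value in the machine's local timezone.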
Warning: screen view tracking is implemented in the Lua and Python trackers, but it is not currently supported in the Enrichment, Storage or Analytics stages in the Snowplow data pipeline. As a result, if you use this feature, you will log screen views to your collector logs, but these will not be parsed and loaded into e.g. Redshift to analyse. (Adding this capability is on the roadmap.)
Use track_screen_view() to track a user viewing a screen (or equivalent) within your app. Arguments are:
Argument | Description | Required? | Validation |
---|---|---|---|
name | Human-readable name for this screen | Yes | Non-empty string |
id_ | Unique identifier for this screen | No | String |
tstamp | When the screen was viewed | No | Positive integer |
Example:
t.track_screen_view("HUD > Save Game", "screen23", 1368725287)
Use track_page_view() to track a user viewing a page within your app.
Arguments are:
Argument | Description | Required? | Validation |
---|---|---|---|
page_url | The URL of the page | Yes | Non-empty string |
page_title | The title of the page | No | String |
referrer | The address which linked to the page | No | String |
tstamp | When the pageview occurred | No | Positive integer |
Example:
t.track_page_view("www.example.com", "example", "www.referrer.com")
Use track_ecommerce_transaction() to track an ecommerce transaction on the transaction level.
Arguments:
Argument | Description | Required? | Validation |
---|---|---|---|
order_id | ID of the eCommerce transaction | Yes | Non-empty string |
tr_total_value | Total transaction value | Yes | Int or Float |
tr_affiliation | Transaction affiliation | No | String |
tr_tax_value | Transaction tax value | No | Int or Float |
tr_shipping | Delivery cost charged | No | Int or Float |
tr_city | Delivery address city | No | String |
tr_state | Delivery address state | No | String |
tr_country | Delivery address country | No | String |
tstamp | When the transaction event occurred | No | Positive integer |
Examples:
t.track_ecommerce_transaction("order-456", 142, None, 20, 12.99, "London", None, "United Kingdom")
t.track_ecommerce_transaction("order-456", 142, tr_city="Paris", tr_country="France")
Use track_ecommerce_transaction_item() to track an individual line item within an ecommerce transaction.
Arguments:
Argument | Description | Required? | Validation |
---|---|---|---|
ti_id | Order ID | Yes | Non-empty string |
ti_sku | Item SKU | Yes | Non-empty string |
ti_price | Item price | Yes | Int or Float |
ti_quantity | Item quantity | Yes | Int |
ti_name | Item name | No | String |
ti_category | Item category | No | String |
tstamp | When the transaction event occurred | No | Positive integer |
Example:
t.track_ecommerce_transaction_item("order-789", "2001", 49.99, 1, "Green shoes", "clothing")
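Because ti_id carries the order ID, the transaction event and its item events are linked by passing the same ID to both methods. The sketch below shows that pattern with a hypothetical stand-in class, RecordingTracker, which only records calls so the example runs without the tracker installed. The track_order helper and the shape of the items tuples are likewise assumptions made for illustration.

```python
class RecordingTracker:
    """Hypothetical stand-in that records calls instead of sending events."""

    def __init__(self):
        self.calls = []

    def track_ecommerce_transaction(self, order_id, tr_total_value, **optional):
        self.calls.append(("transaction", order_id, tr_total_value, optional))

    def track_ecommerce_transaction_item(self, ti_id, ti_sku, ti_price,
                                         ti_quantity, ti_name=None,
                                         ti_category=None):
        self.calls.append(("item", ti_id, ti_sku, ti_price, ti_quantity))


def track_order(tracker, order_id, total, items, **optional):
    """Track one transaction event, then one item event per line item."""
    tracker.track_ecommerce_transaction(order_id, total, **optional)
    for sku, price, quantity, name in items:
        # Each item's ti_id is the order ID of its parent transaction.
        tracker.track_ecommerce_transaction_item(
            order_id, sku, price, quantity, name)


t = RecordingTracker()
track_order(
    t, "order-789", 69.99,
    [("2001", 49.99, 1, "Green shoes"), ("2002", 10, 2, "Socks")],
    tr_city="London",
)
```

With a real tracker instance in place of the stand-in, the same helper would issue one transaction call followed by one call per line item.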
Use track_struct_event() to track a custom event happening in your app which fits the Google Analytics-style structure of having up to five fields (with only the first two required):
Argument | Description | Required? | Validation |
---|---|---|---|
category | The grouping of structured events which this action belongs to | Yes | Non-empty string |
action | Defines the type of user interaction which this event involves | Yes | Non-empty string |
label | A string to provide additional dimensions to the event data | No | String |
property | A string describing the object or the action performed on it | No | String |
value | A value to provide numerical data about the event | No | Int or Float |
tstamp | When the structured event occurred | No | Positive integer |
Example:
t.track_struct_event("shop", "add-to-basket", None, "pcs", 2)
Warning: this feature is implemented in the Python tracker, but it is not currently supported in the Enrichment, Storage or Analytics stages in the Snowplow data pipeline. As a result, if you use this feature, you will log unstructured events to your collector logs, but these will not be parsed and loaded into e.g. Redshift to analyse. (Adding this capability is on the roadmap.)
Use track_unstruct_event() to track a custom event which consists of a name and an unstructured set of properties. This is useful when:
- You want to track event types which are proprietary/specific to your business (i.e. not already part of Snowplow), or
- You want to track events which have unpredictable or frequently changing properties
The arguments are as follows:
Argument | Description | Required? | Validation |
---|---|---|---|
name | The name of the event | Yes | Non-empty string |
properties | The properties of the event | Yes | Non-empty dict |
tstamp | When the unstructured event occurred | No | Positive integer |
Example:
t.track_unstruct_event("save-game", {
"save_id": "4321",
"level": 23,
"difficultyLevel": "HARD",
"dl_content": True
}, 1369330929 )
The properties dict consists of a set of individual name: value pairs. The structure must be flat: properties cannot be nested. Be careful here as this is not currently enforced through validation.
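Since flatness is not enforced by the tracker, you may want to check it yourself before tracking. The following is a minimal sketch with a hypothetical helper, assert_flat; it only rejects top-level values that are themselves dicts, and leaves tuples and lists alone because those are used for geo-coordinates and arrays.

```python
def assert_flat(properties):
    """Raise ValueError if properties is empty or any value is itself a dict."""
    if not isinstance(properties, dict) or not properties:
        raise ValueError("properties must be a non-empty dict")
    for name, value in properties.items():
        if isinstance(value, dict):
            raise ValueError("property {!r} is nested".format(name))
    return properties


# Passes: every value is a scalar.
flat = assert_flat({"save_id": "4321", "level": 23})

# Fails: "player" holds a nested dict.
try:
    assert_flat({"player": {"name": "alexd"}})
    nested_rejected = False
except ValueError:
    nested_rejected = True
```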
Snowplow unstructured events support a relatively rich set of datatypes. Because these datatypes do not always map directly onto Python datatypes, we have introduced some "type suffixes" for the Python property names, so that Snowplow knows what Snowplow data types the Python data types map onto:
Snowplow datatype | Description | Python datatype | Type suffix(es) | Supports array? |
---|---|---|---|---|
Null | Absence of a value | N/A | - | No |
String | String of characters | string | - | Yes |
Boolean | True or false | boolean | - | Yes |
Integer | Number without decimal | number | $int | Yes |
Floating point | Number with decimal | number | $flt | Yes |
Geo-coordinates | Latitude and longitude | (number, number) | $geo | Yes |
Date | Date and time (ms precision) | number | $dt, $ts, $tms | Yes |
Array | Array of values | [x, y, z] | - | - |
Let's go through each of these in turn, providing some examples as we go:
Tracking a Null value for a given field is currently untested in the Python Tracker. TODO.
Tracking a String is easy:
{
"product_id": "ASO01043"
}
Tracking a Boolean is also straightforward:
{
"trial": True
}
To track an Integer, use a Python number but add a type suffix like so:
{
"in_stock$int": 23
}
Warning: if you do not add the $int type suffix, Snowplow will assume you are tracking a Floating point number.
To track a Floating point number, use a Python number; adding a type suffix is optional:
{
"price$flt": 4.99,
"sales_tax": 49.99 # Same as "sales_tax$flt": ...
}
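Forgetting the $int suffix is easy, so one option is to add numeric suffixes programmatically from the Python type of each value. The sketch below does this with a hypothetical helper, add_numeric_suffixes; it is a convenience you would write yourself, not something the tracker provides.

```python
def add_numeric_suffixes(properties):
    """Return a copy with $int or $flt appended to unsuffixed numeric names."""
    result = {}
    for name, value in properties.items():
        if "$" in name or isinstance(value, bool):
            # Already suffixed, or a Boolean (bool is a subclass of int).
            result[name] = value
        elif isinstance(value, int):
            result[name + "$int"] = value
        elif isinstance(value, float):
            result[name + "$flt"] = value
        else:
            result[name] = value
    return result


props = add_numeric_suffixes({"in_stock": 23, "price": 4.99, "trial": True})
print(props)  # {'in_stock$int': 23, 'price$flt': 4.99, 'trial': True}
```

Names that already contain a suffix are left untouched, so Date or Geo properties pass through unchanged.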
Tracking a pair of Geographic coordinates is done like so:
{
"check_in$geo": (40.11041, -88.21337) # Lat, long
}
Please note that the datatype takes the format latitude followed by longitude. That is the same order used by services such as Google Maps.
Warning: if you do not add the $geo type suffix, then the value will be incorrectly interpreted by Snowplow as an Array of Floating points.
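To reduce the risk of omitting the suffix or passing out-of-range coordinates, the pair can be built through a small helper. This is a sketch only; geo_property is a hypothetical name and the range checks are an addition, not something the tracker is documented to perform.

```python
def geo_property(name, latitude, longitude):
    """Build a one-entry dict holding a $geo-suffixed (lat, long) pair."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude out of range: {!r}".format(latitude))
    if not -180 <= longitude <= 180:
        raise ValueError("longitude out of range: {!r}".format(longitude))
    return {name + "$geo": (latitude, longitude)}


check_in = geo_property("check_in", 40.11041, -88.21337)
print(check_in)  # {'check_in$geo': (40.11041, -88.21337)}
```

The range checks catch a swapped pair only when the longitude falls outside -90 to 90, so they are a partial safeguard rather than a guarantee.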
Snowplow Dates include the date and the time, with milliseconds precision. There are three type suffixes supported for tracking a Date:
- $dt - the number of days since the epoch
- $ts - the number of seconds since the epoch
- $tms - the number of milliseconds since the epoch. In Python this can be derived from time.time(), for example int(time.time() * 1000), subject to the resolution of the platform's clock
You can track a date by adding a Python number to your properties dict. The following are all valid dates:
{
"birthday2$dt": 3996,
"registered2$ts": 1371129610,
"last_action$tms": 1368454114215, # Accurate to milliseconds
}
Note that the type suffix only indicates how the Python number sent to Snowplow is interpreted - all Snowplow Dates are stored to milliseconds precision (whether or not they include that level of precision).
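As an illustration of how the three suffixes relate, the sketch below expresses one moment under each of them. The helper date_properties is hypothetical; in practice you would normally send only the one form that matches the precision you have.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_properties(name, moment):
    """Express one timezone-aware datetime under each Date suffix."""
    delta = moment - EPOCH
    whole_seconds = delta.days * 86400 + delta.seconds
    return {
        name + "$dt": delta.days,        # whole days since the epoch
        name + "$ts": whole_seconds,     # whole seconds since the epoch
        name + "$tms": whole_seconds * 1000 + delta.microseconds // 1000,
    }


registered = datetime(2013, 6, 13, 13, 20, 10, tzinfo=timezone.utc)
props = date_properties("registered", registered)
print(props["registered$ts"])  # 1371129610
```

Working from a timedelta with integer arithmetic avoids the rounding that can creep in when multiplying a float timestamp by 1000.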
Two warnings:
- If you specify a Python number but do not add a valid Date suffix ($dt, $ts or $tms), then the value will be incorrectly interpreted by Snowplow as a Number, not a Date
- If you specify a Python number but add the wrong Date suffix, then the Date will be incorrectly interpreted by Snowplow, for example:
{
"last_ping$dt": 1371129610 # Should have been $ts. As $dt, Snowplow will interpret this as a date roughly 3.75 million years after the epoch
}
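A quick calculation shows how far off a mis-suffixed value lands. The sketch below reads the same number once as seconds and once as days; the mean Gregorian year length is used as an approximation.

```python
DAYS_PER_YEAR = 365.2425  # mean Gregorian year, used as an approximation

value = 1371129610

# Read as $ts (seconds): a few decades after 1970, which is the intended date.
years_as_seconds = value / (86400 * DAYS_PER_YEAR)

# Read as $dt (days): millions of years after 1970.
years_as_days = value / DAYS_PER_YEAR

print(round(years_as_seconds, 1))
print(round(years_as_days))
```

A sanity check along these lines, for example rejecting $dt values that imply a date far in the future, could catch this mistake before an event is sent.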