Hosted assets
To simplify setting up and running Snowplow, the Snowplow Analytics team provide public hosting for some of the Snowplow sub-components. These hosted assets are publicly available through Amazon Web Services (CloudFront and S3), and using them is free for Snowplow community members.
As we release new versions of these assets, we will leave old versions unchanged on their existing URLs - so you won't have to upgrade your own Snowplow installation unless you want to.
Disclaimer: While Snowplow Analytics Ltd will make every reasonable effort to host these assets, we will not be liable for any failure to provide this service. All of the hosted assets listed below are freely available via [our GitHub repository][snowplow-repo] and you are encouraged to host them yourselves.
The current versions of the assets hosted by the Snowplow Analytics team are as follows:
The minified JavaScript tracker is hosted on CloudFront under its full semantic version:
http(s)://d1fc8wv8zag5ca.cloudfront.net/2.1.2/sp.js
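Because old releases stay at their URLs, you can pin a page to a specific tracker version, or pull down a copy to host yourself (as the disclaimer above encourages). Below is a minimal sketch of fetching a pinned version with the Python standard library; the version string is simply the one quoted above.

```python
from urllib.request import urlretrieve

# Pin to an exact tracker release; old versions remain available at their URLs.
TRACKER_VERSION = "2.1.2"  # the version quoted above - substitute the release you target
url = f"https://d1fc8wv8zag5ca.cloudfront.net/{TRACKER_VERSION}/sp.js"

# Save a local copy, e.g. to serve from your own CDN instead of Snowplow's.
local_path, _headers = urlretrieve(url, "sp.js")
print(f"Saved {url} to {local_path}")
```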
The Clojure Collector packaged as a complete WAR file, ready for Amazon Elastic Beanstalk, is here:
s3://snowplow-hosted-assets/2-collectors/clojure-collector/clojure-collector-0.9.0-standalone.war
Right-click on this [Download link][cc-download] to save it locally via the CloudFront CDN.
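If you prefer to script the download instead, the WAR can also be fetched straight from the public S3 bucket. Here is a sketch using boto3 with unsigned (anonymous) requests; the bucket region below is an assumption, so adjust it if S3 redirects you elsewhere.

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# The hosted-assets bucket is public, so make anonymous (unsigned) requests.
# The region is an assumption - change it if S3 redirects the request.
s3 = boto3.client("s3", region_name="eu-west-1", config=Config(signature_version=UNSIGNED))

s3.download_file(
    Bucket="snowplow-hosted-assets",
    Key="2-collectors/clojure-collector/clojure-collector-0.9.0-standalone.war",
    Filename="clojure-collector-0.9.0-standalone.war",
)
```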
The Scala Stream Collector is packaged as an executable jarfile:
s3://snowplow-hosted-assets/2-collectors/scala-stream-collector/snowplow-stream-collector-0.1.0
Right-click on this [Download link][ssc-download] to save it locally via the CloudFront CDN.
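Once downloaded, the collector is an executable jarfile, so it is started with java plus a configuration file. A hedged sketch via Python's subprocess follows; the --config flag and the config filename are assumptions, so check the collector's README for the exact invocation your version expects.

```python
import subprocess

# Assumed invocation: an executable jar started with java and pointed at a
# config file. The flag name and config filename are placeholders - confirm
# them against the README for your collector version.
subprocess.run(
    ["java", "-jar", "snowplow-stream-collector-0.1.0", "--config", "collector.conf"],
    check=True,
)
```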
The Scala Hadoop Enrich process uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket, for Snowplowers who are running their Hadoop Enrich process on Amazon EMR:
s3://snowplow-hosted-assets/3-enrich/hadoop-etl/snowplow-hadoop-etl-0.9.0.jar
Right-click on this [Download link][hadoop-enrich-download] to save it locally via the CloudFront CDN.
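If you run the pipeline on EMR with EmrEtlRunner, it is normally configured to pick this jar up from the hosted bucket for you. To inspect the bucket yourself, for example to see which versions are published before upgrading, you can list the prefix anonymously; a boto3 sketch is below (the bucket region is again an assumption).

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous listing of the public hosted-assets bucket (region assumed).
s3 = boto3.client("s3", region_name="eu-west-1", config=Config(signature_version=UNSIGNED))

resp = s3.list_objects_v2(
    Bucket="snowplow-hosted-assets",
    Prefix="3-enrich/hadoop-etl/",
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```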
The Scala Hadoop Shred process uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket, for Snowplowers who are running their Hadoop Enrich & Shred process on Amazon EMR:
s3://snowplow-hosted-assets/3-enrich/scala-hadoop-shred/snowplow-hadoop-shred-0.2.1.jar
Right-click on this [Download link][hadoop-shred-download] to save it locally via the CloudFront CDN.
The Scala Hadoop Bad Rows tool uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket:
s3://snowplow-hosted-assets/3-enrich/scala-bad-rows/snowplow-bad-rows-0.1.0.jar
Right-click on this [Download link][hadoop-bad-rows-download] to save it locally via the CloudFront CDN.
The Scala Kinesis Enrich process is packaged as an executable jarfile:
s3://snowplow-hosted-assets/3-enrich/scala-kinesis-enrich/snowplow-kinesis-enrich-0.1.0
Right-click on this [Download link][kinesis-enrich-download] to save it locally via the CloudFront CDN.
Both Enrichment processes make use of the free [GeoLite City database][geolite] from [MaxMind, Inc][maxmind], also stored in this public Amazon S3 bucket:
s3://snowplow-hosted-assets/third-party/maxmind/GeoLiteCity.dat
This file is updated every month by the Snowplow Analytics team.
If you are running Scala Kinesis Enrich, you will need a local copy of this file. Right-click on this [Download link][glc-download] to save it locally via the CloudFront CDN.
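After downloading, you can sanity-check the file before pointing Kinesis Enrich at it. One option (a third-party library, not part of Snowplow) is pygeoip, which reads the legacy .dat format; the IP address below is just an example.

```python
import pygeoip  # third-party: pip install pygeoip

# Open the legacy binary GeoLite City database downloaded above.
geo = pygeoip.GeoIP("GeoLiteCity.dat")

# Look up an example address to confirm the file is readable and populated.
record = geo.record_by_addr("8.8.8.8")
print(record["country_code"], record.get("city"))
```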
Our shredding process for loading JSONs into Redshift uses a standard set of JSON Path files, available here:
s3://snowplow-hosted-assets/4-storage/redshift-storage/jsonpaths
If you are running StorageLoader, these files will be used automatically to load the corresponding JSONs into Redshift.
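Roughly speaking, StorageLoader issues Redshift COPY commands that point at these JSON Path files alongside the shredded JSON data. The sketch below illustrates the shape of such a COPY from Python with psycopg2; the cluster details, table name, data location, credentials, and JSON Path file name are all hypothetical placeholders, not the exact objects StorageLoader uses.

```python
import psycopg2  # third-party: pip install psycopg2-binary

# Hypothetical connection details - replace with your own Redshift cluster's.
conn = psycopg2.connect(
    host="my-cluster.example.eu-west-1.redshift.amazonaws.com",
    port=5439, dbname="snowplow", user="loader", password="...",
)

# COPY shredded JSON into a (hypothetical) table, telling Redshift how to map
# JSON fields to columns via a JSON Path file under the hosted prefix above.
# The file name shown is a placeholder - StorageLoader selects the real one.
copy_sql = """
    COPY atomic.my_shredded_table
    FROM 's3://my-shredded-bucket/run=2014-01-01/'
    CREDENTIALS 'aws_access_key_id=...;aws_secret_access_key=...'
    FORMAT AS JSON 's3://snowplow-hosted-assets/4-storage/redshift-storage/jsonpaths/my_event_1.json'
"""
with conn, conn.cursor() as cur:
    cur.execute(copy_sql)
```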
There are currently no hosted assets for the remaining stages of the pipeline.
As well as these hosted assets for running Snowplow, the Snowplow Analytics team also make code components and libraries available through Ruby and Java artifact repositories.
Please see the Artifact repositories wiki page for more information.