Hosted assets
To simplify setting up and running Snowplow, the Snowplow Analytics team provide public hosting for some of the Snowplow sub-components. These hosted assets are publicly available through Amazon Web Services (CloudFront and S3), and using them is free for Snowplow community members.
As we release new versions of these assets, we will leave old versions unchanged on their existing URLs - so you won't have to upgrade your own Snowplow installation unless you want to.
Disclaimer: While Snowplow Analytics Ltd will make every reasonable effort to host these assets, we will not be liable for any failure to provide this service. All of the hosted assets listed below are freely available via our GitHub repository and you are encouraged to host them yourselves.
The current versions of the assets hosted by the Snowplow Analytics team are as follows:
We are steadily moving over to Bintray for hosting binaries and artifacts which don't have to be hosted on S3.
To make operating Snowplow easier, the EmrEtlRunner and StorageLoader apps are now available as prebuilt executables in a single zipfile here:
http://dl.bintray.com/snowplow/snowplow-generic/snowplow_emr_r88_angkor_wat.zip
Right-click on this Download link to save it down locally.
Note: The link above refers to the latest version at the time of writing (R88). If you know there is a newer version, you can locate and download it from the generic Bintray page: search for the pattern snowplow_emr_ and pick the highest version number, which is the newest release.
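If you prefer to script the download rather than saving the zipfile by hand, a minimal Python sketch along these lines would work, assuming the R88 URL above is still the version you want:

```python
# Minimal sketch: fetch the prebuilt EmrEtlRunner/StorageLoader zipfile and
# unpack it. Substitute a newer URL if a later release has been published.
import io
import zipfile
import urllib.request

ZIP_URL = ("http://dl.bintray.com/snowplow/snowplow-generic/"
           "snowplow_emr_r88_angkor_wat.zip")

with urllib.request.urlopen(ZIP_URL) as response:
    archive = zipfile.ZipFile(io.BytesIO(response.read()))

# Unpack the prebuilt executables into a local directory
archive.extractall("snowplow-emr-etl-runner")
print(archive.namelist())
```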
The minified JavaScript tracker is hosted on CloudFront against its full semantic version:
http(s)://d1fc8wv8zag5ca.cloudfront.net/2.7.0/sp.js
Note: The above URL references JavaScript tracker v2.7.0. To ensure you are using the latest version, check the current release on GitHub and amend the URL accordingly.
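If you want to automate that version check, a sketch like the one below queries the GitHub releases API and builds the matching CloudFront URL. The repository name snowplow/snowplow-javascript-tracker and the assumption that release tags are plain semantic versions (e.g. 2.7.0) should be verified before relying on this:

```python
# Minimal sketch: look up the latest JavaScript tracker release on GitHub
# and print the corresponding CloudFront URL for sp.js.
import json
import urllib.request

API_URL = ("https://api.github.com/repos/"
           "snowplow/snowplow-javascript-tracker/releases/latest")

with urllib.request.urlopen(API_URL) as response:
    latest = json.load(response)

# Tolerate a leading "v" in the tag name, just in case
version = latest["tag_name"].lstrip("v")
print(f"https://d1fc8wv8zag5ca.cloudfront.net/{version}/sp.js")
```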
The Clojure Collector packaged as a complete WAR file, ready for Amazon Elastic Beanstalk, is here:
s3://snowplow-hosted-assets/2-collectors/clojure-collector/clojure-collector-1.1.0-standalone.war
Right-click on this Download link to save it down locally via CloudFront CDN.
The Scala Stream Collector is available on Bintray here:
http://dl.bintray.com/snowplow/snowplow-generic/snowplow_scala_stream_collector_0.9.0.zip
Right-click on this Download link to save it down locally.
The Scala Hadoop Enrich process uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket, for Snowplowers who are running their Hadoop Enrich process on Amazon EMR:
s3://snowplow-hosted-assets/3-enrich/scala-hadoop-enrich/snowplow-hadoop-enrich-1.8.0.jar
Right-click on this Download link to save it down locally via CloudFront CDN.
The Scala Hadoop Shred process uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket, for Snowplowers who are running their Hadoop Enrich & Shred process on Amazon EMR:
s3://snowplow-hosted-assets/3-enrich/scala-hadoop-shred/snowplow-hadoop-shred-0.11.0.jar
Right-click on this Download link to save it down locally via CloudFront CDN.
The Scala Hadoop Event Recovery (formerly Hadoop Bad Rows) tool uses a single jarfile containing the MapReduce job. This is made available in a public Amazon S3 bucket:
s3://snowplow-hosted-assets/3-enrich/hadoop-event-recovery/snowplow-hadoop-event-recovery-0.2.0.jar
Right-click on this Download link to save it down locally via CloudFront CDN.
The Stream Enrich app is available on Bintray here:
http://dl.bintray.com/snowplow/snowplow-generic/snowplow_stream_enrich_0.10.0.zip
Right-click on this Download link to save it down locally.
Both enrichment processes (Scala Hadoop Enrich and Stream Enrich) make use of the free GeoLite City database from MaxMind, Inc., also stored in this public Amazon S3 bucket:
s3://snowplow-hosted-assets/third-party/maxmind/GeoLiteCity.dat
This file is updated every month by the Snowplow Analytics team.
If you are running Stream Enrich, you will need a local copy of this file. Right-click on this Download link to save it down locally via CloudFront CDN.
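Because the file is refreshed monthly, you may want to script the download and only pull a fresh copy when the hosted one is newer than your local one. The sketch below does this with anonymous S3 access via boto3; the eu-west-1 region is an assumption and may need adjusting:

```python
# Minimal sketch: keep a local GeoLiteCity.dat in step with the monthly
# refresh by re-downloading only when the S3 copy is newer.
import os
from datetime import datetime, timezone

import boto3
from botocore import UNSIGNED
from botocore.client import Config

BUCKET = "snowplow-hosted-assets"
KEY = "third-party/maxmind/GeoLiteCity.dat"
LOCAL = "GeoLiteCity.dat"

# Anonymous (unsigned) access to the public bucket; region is an assumption
s3 = boto3.client("s3", region_name="eu-west-1",
                  config=Config(signature_version=UNSIGNED))

remote_mtime = s3.head_object(Bucket=BUCKET, Key=KEY)["LastModified"]
local_mtime = (datetime.fromtimestamp(os.path.getmtime(LOCAL), tz=timezone.utc)
               if os.path.exists(LOCAL) else None)

if local_mtime is None or remote_mtime > local_mtime:
    s3.download_file(BUCKET, KEY, LOCAL)
    print(f"Downloaded {KEY} (last modified {remote_mtime})")
else:
    print("Local copy is already up to date")
```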
Our shredding process for loading JSONs into Redshift uses a standard set of JSON Path files, available here:
s3://snowplow-hosted-assets/4-storage/redshift-storage/jsonpaths
If you are running StorageLoader, these files will automatically be used for loading corresponding JSONs into Redshift.
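If you want to check which JSON Path files are available (for example, to confirm one exists for a given shredded type), a sketch like the following lists the bucket prefix anonymously; again, the eu-west-1 region is an assumption:

```python
# Minimal sketch: list the hosted JSON Path files used for Redshift loading.
import boto3
from botocore import UNSIGNED
from botocore.client import Config

s3 = boto3.client("s3", region_name="eu-west-1",
                  config=Config(signature_version=UNSIGNED))

paginator = s3.get_paginator("list_objects_v2")
pages = paginator.paginate(Bucket="snowplow-hosted-assets",
                           Prefix="4-storage/redshift-storage/jsonpaths/")

for page in pages:
    for obj in page.get("Contents", []):
        print(obj["Key"])
```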
The Kinesis Elasticsearch Sink app is available for both Elasticsearch 1.x and 2.x on Bintray here:
http://dl.bintray.com/snowplow/snowplow-generic/snowplow_kinesis_elasticsearch_sink_0.8.0_1x.zip
http://dl.bintray.com/snowplow/snowplow-generic/snowplow_kinesis_elasticsearch_sink_0.8.0_2x.zip
Right-click on this Download link for the 1.x version and this Download link for the 2.x version.
The Kinesis S3 app is available for download separately here:
http://dl.bintray.com/snowplow/snowplow-generic/kinesis_s3_0.4.0.zip
Right-click on this Download link to save it down locally.
No hosted assets currently.
To make deployment easier, the Kinesis apps (Scala Stream Collector, Stream Enrich and Kinesis Elasticsearch Sink) are also all available in a single zip file here:
https://dl.bintray.com/snowplow/snowplow-generic/snowplow_kinesis_r85_metamorphosis.zip
Right-click on this Download link to save it down locally.
As well as these hosted assets for running Snowplow, the Snowplow Analytics team also make code components and libraries available through Ruby and Java artifact repositories.
Please see the Artifact repositories wiki page for more information.