Appendix - artsy/snowplow GitHub Wiki

Nota Bene

The information here elaborates on what was needed to set up the practices around Artsy's use of Snowplow (Heroku deployment and Snowplow app management strategy).

This appendix does not contain information that is essential for contributing to artsy/snowplow or for keeping it running.

Choosing Sonatype for hosting scala-common-enrich

This seems to be the best choice for hosting a dependency: it is the most standard way to include a library in Scala/Java, and it avoids a complicated deploy step by simply using the sbt publishSigned task.

A tutorial on pushing to Sonatype for the first time can be found here.
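As a rough sketch of what publishing to Sonatype involves, the settings below show a typical setup in sbt 0.13-style syntax. They are assumptions for illustration, not the actual artsy/snowplow build: the credentials file path is a placeholder, and the publishSigned task itself is provided by the sbt-pgp plugin, which has to be added to project/plugins.sbt separately.

```scala
// build.sbt (sketch, sbt 0.13-style syntax; not the real artsy/snowplow build)

// Publish a Maven-style POM so that Java and Scala builds can consume the artifact.
publishMavenStyle := true

// Snapshots and releases go to different Sonatype OSS repositories.
publishTo := {
  val nexus = "https://oss.sonatype.org/"
  if (isSnapshot.value) Some("snapshots" at nexus + "content/repositories/snapshots")
  else Some("releases" at nexus + "service/local/staging/deploy/maven2")
}

// Do not publish test artifacts.
publishArtifact in Test := false

// Placeholder path: a credentials file kept outside version control.
credentials += Credentials(Path.userHome / ".sbt" / ".sonatype-credentials")
```

With settings along these lines in place, running sbt publishSigned signs the artifacts and uploads them, which is the single deploy step referred to above.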

Failed alternatives for managing the scala-common-enrich dependency in scala-kinesis-enrich:

  • managing the dependency through sbt and GitHub
    • something like RootProject( uri("git://github.com/typesafehub/sbteclipse.git#v1.2") ) works great locally, but is not compatible with Heroku
  • using RootProject( file("/home/user/a-project") ) with a git submodule does not play well with Heroku. Because the deploy task changes the directory layout of the app (it unnests it from 3-enrich/scala-kinesis-enrich), the submodules break. Even after fixing them in .git and .git/config, Heroku has trouble compiling because it misreads the project name
  • specifying the project name with ProjectRef( file("/home/user/a-project"), "project-id") requires different project ids on Heroku and locally, and still does not compile properly on Heroku
  • avoiding git submodules by adding a pre-deploy task that copies the source still does not compile properly on Heroku
  • I doubt that hosting .jars on GitHub would be compatible with Heroku
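For reference, the three source-dependency forms tried above would sit in an sbt 0.13-style build definition roughly as follows. This is only a sketch: the object name, the value names, and the use of dependsOn are hypothetical, and the URI, path, and project id are the example values from the list, not the real ones.

```scala
// project/EnrichBuild.scala (sketch, sbt 0.13-style; all names are hypothetical)
import sbt._
import Keys._

object EnrichBuild extends Build {

  // Form 1: source dependency fetched from a git URI (worked locally, not on Heroku).
  lazy val fromGit = RootProject(uri("git://github.com/typesafehub/sbteclipse.git#v1.2"))

  // Form 2: source dependency from a local directory, e.g. a git submodule checkout.
  lazy val fromDir = RootProject(file("/home/user/a-project"))

  // Form 3: the same directory, with the project id given explicitly.
  lazy val byId = ProjectRef(file("/home/user/a-project"), "project-id")

  // The app depends on exactly one of the references above.
  lazy val root = Project("scala-kinesis-enrich", file("."))
    .dependsOn(byId)
}
```

Publishing scala-common-enrich to Sonatype replaces all of these with an ordinary libraryDependencies entry, which does not depend on the directory layout of the deployed app.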

Heroku setup for Snowplow Scala apps

Steps for altering scala-kinesis-enrich and scala-stream-collector to make them deployable to Heroku:

  1. keep access keys out of the conf file by referencing environment variables instead, as: ${AWS_ACCESS_KEY_ID}
    • if you get an error like adf, make sure to call resolve() on the parsed config file, like: ConfigFactory.parseFile(new File("conf/application.conf")).resolve()
  2. add addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "0.8.0-M1") to project/plugins.sbt, and add .settings(com.typesafe.sbt.SbtNativePackager.packageArchetype.java_application: _*) wherever val project is defined (project/ScalaCollectorBuild.scala)
    • this provides the sbt stage task that Heroku runs to compile the app
    • be careful to leave a blank line between entries in plugins.sbt
  3. add java.runtime.version=1.7 to system.properties in the repo root
  4. add worker: ./target/universal/stage/bin/name-of-my-app --config myconf.conf to Procfile in the repo root; check the name of the executable after running sbt compile stage
    • don't use sh here: Heroku runs Ubuntu, where sh is dash, not bash
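To illustrate step 1, here is a minimal sketch of loading a conf file whose values reference environment variables. It assumes the Typesafe Config library (com.typesafe.config) is on the classpath; the ConfigLoader object and the aws.access-key key path are hypothetical names chosen for the example. As far as I understand the library, a parsed config that still contains ${...} references cannot be read until resolve() has been called, and references that match no key in the file fall back to environment variables.

```scala
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical helper; the real apps may structure this differently.
object ConfigLoader {

  // Parse the file, then resolve ${...} substitutions. References that match
  // no key in the file itself are looked up in the environment.
  def load(file: File): Config =
    ConfigFactory.parseFile(file).resolve()

  def main(args: Array[String]): Unit = {
    val config = load(new File("conf/application.conf"))
    // "aws.access-key" is a hypothetical key path; use the one from your conf file.
    // Only report whether it is present, so the secret never reaches the logs.
    println(config.hasPath("aws.access-key"))
  }
}
```

On Heroku the variables themselves would be set as config vars on the app, so the conf file can be committed without secrets in it.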

Heroku setup for Snowplow Java apps

Steps for altering kinesis-redshift-sink to make it deployable to Heroku:

  1. keep access keys out of the conf file by referencing environment variables instead, as: ${AWS_ACCESS_KEY_ID}
  2. add java.runtime.version=1.7 to system.properties in the repo root
  3. ensure that the jar builds with mvn clean package; if it does not, include maven-shade-plugin like so
  4. add worker: java $JAVA_OPTS -jar ./target/kinesis-redshift-sink-0.0.1.jar to Procfile in the repo root; check the name of the executable jar after running mvn clean package
  5. delete the gpg signing plugin option in the Maven pom.xml if it exists