Naming - artsy/snowplow GitHub Wiki

Deployable Apps

As of Nov 27 2014 we deploy four things from snowplow (three apps and one library), so naming can get confusing. This guide covers how things are named and which things have names.

Summary

Here is a table summarizing this section:

| Deployable | Source code folder | Branch name | Heroku app name |
|---|---|---|---|
| scala-stream-collector | 2-collectors/scala-stream-collector | scala-stream-collector-master | snowplow-stream-collector |
| scala-kinesis-enrich | 3-enrich/scala-kinesis-enrich | scala-kinesis-enrich-master | snowplow-scala-kinesis-enrich |
| scala-common-enrich | 3-enrich/scala-common-enrich | scala-common-enrich-master | On Sonatype, not Heroku |
| kinesis-redshift-sink | 4-storage/kinesis-redshift-sink | kinesis-redshift-sink-master | snowplow-kinesis-redshift-sink, snowplw-redshft-snk-impression |

What we deploy

These are the names the snowplow org gives to the apps and the library we use. To see what they do, read about the event flow.

  • scala-stream-collector
  • scala-kinesis-enrich
  • scala-common-enrich (library not an app)
  • kinesis-redshift-sink

Where the source code lives by folder

  • 2-collectors/scala-stream-collector
  • 3-enrich/scala-kinesis-enrich
  • 3-enrich/scala-common-enrich (library not an app)
  • 4-storage/kinesis-redshift-sink

Where the source code lives by github branch name

We manage different apps under different branches.

The pattern is "#{app_name}-master" for branches that will be deployed. Open your PRs against these branches.
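As a quick illustration of the branch pattern, here is a minimal Python sketch that derives the deploy branch for each app name listed on this page. The function name `deploy_branch` is made up for this example and is not part of any tooling in the repo.

```python
# App and library names as listed on this page.
APP_NAMES = [
    "scala-stream-collector",
    "scala-kinesis-enrich",
    "scala-common-enrich",
    "kinesis-redshift-sink",
]


def deploy_branch(app_name: str) -> str:
    """Return the branch that gets deployed for an app: '<app_name>-master'."""
    return f"{app_name}-master"


if __name__ == "__main__":
    for name in APP_NAMES:
        print(f"{name} -> {deploy_branch(name)}")
```

The results match the branch name column in the summary table, for example scala-kinesis-enrich maps to scala-kinesis-enrich-master.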

Where the apps are deployed, by heroku app name

  • snowplow-stream-collector
    • YES, this name should be changed to snowplow-scala-stream-collector to be in line with the other naming schemes
  • snowplow-scala-kinesis-enrich
  • The scala-common-enrich library is not on heroku
  • snowplow-kinesis-redshift-sink
    • also snowplw-redshft-snk-impression (abbreviated because of the 30 char heroku name limit), which is dedicated to storing impressions from a dedicated kinesis stream into a dedicated redshift table
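Since the 30 character limit shapes these names, a small check can help before creating or renaming an app. The sketch below assumes the limit quoted above still applies; `fits_heroku_limit` is a hypothetical helper written for this page, not a Heroku API.

```python
HEROKU_NAME_LIMIT = 30  # limit as stated on this page; an assumption if Heroku changes it


def fits_heroku_limit(app_name: str, limit: int = HEROKU_NAME_LIMIT) -> bool:
    """Return True if the app name is no longer than the name limit."""
    return len(app_name) <= limit


# Current heroku app names from this page, plus the rename proposed above.
CANDIDATES = [
    "snowplow-stream-collector",
    "snowplow-scala-kinesis-enrich",
    "snowplow-kinesis-redshift-sink",
    "snowplw-redshft-snk-impression",
    "snowplow-scala-stream-collector",  # proposed rename
]

if __name__ == "__main__":
    for name in CANDIDATES:
        print(f"{name}: {len(name)} chars, fits={fits_heroku_limit(name)}")
```

Run as is, this reports the proposed name snowplow-scala-stream-collector at 31 characters, one over the limit, so that rename would need an abbreviation too.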

scala-common-enrich exception

This code is hosted in a maven repository on Sonatype rather than on heroku. Read more about its deployment here.

Environment variables

You'll notice that the myconf.conf files across the apps reference env variables with the ${NAME} syntax. The input/output variables, such as the names of where events come from and go to, are described in The Event Flow in Detail.

These variables hold the names of Kinesis streams and S3 buckets, and the Redshift endpoint/table.
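To make the ${NAME} syntax concrete, here is a rough Python sketch of what the substitution amounts to: each placeholder is replaced with the value of the environment variable of the same name. It only illustrates the idea and is not how the config library used by the apps implements it. The config line and the variable name `EXAMPLE_STREAM_NAME` are hypothetical.

```python
import os
import re

# Matches placeholders of the form ${NAME}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(text, env=None):
    """Replace each ${NAME} in text with env[NAME]; fail loudly if it is unset."""
    env = os.environ if env is None else env

    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return PLACEHOLDER.sub(lookup, text)


if __name__ == "__main__":
    # Hypothetical config line and value, for illustration only.
    line = "stream-name = ${EXAMPLE_STREAM_NAME}"
    print(resolve(line, {"EXAMPLE_STREAM_NAME": "my-stream"}))
```

Failing on an unset variable, instead of substituting an empty string, makes a missing stream or bucket name show up at startup and not later as a confusing AWS error.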

Use AWS IAM credentials to access these resources.