Snowplow Analytics SDK

Overview

We are pleased to announce the release of our first analytics SDKs for Snowplow, created for data engineers and data scientists working with Snowplow enriched events in their own applications.

Some good use cases for the SDKs include:

  1. Performing event data modeling in Apache Spark as part of our Hadoop batch pipeline (see the sketch below)
  2. Developing machine learning models on your event data using Apache Spark (e.g. using Databricks or Zeppelin on EMR)
  3. Performing analytics-on-write in AWS Lambda as part of our Kinesis real-time pipeline:

(Figure: Snowplow Analytics SDK use cases)
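
To make the first use case concrete, here is a minimal sketch of event data modeling on enriched events in Apache Spark. It assumes the Scala Analytics SDK's `EventTransformer.transform`, which converts a TSV enriched-event line into a JSON string (its exact return type varies between SDK versions); the S3 path and column names below are purely illustrative.

```scala
// Minimal sketch: event data modeling on Snowplow enriched events in Apache Spark.
// Assumes EventTransformer.transform(line) from the Scala Analytics SDK, which
// turns a tab-separated enriched-event line into a JSON string; the exact return
// type differs between SDK versions, but exposes toOption for the success case.
import org.apache.spark.sql.SparkSession
import com.snowplowanalytics.snowplow.analytics.scalasdk.json.EventTransformer

object EventDataModeling {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("snowplow-event-modeling").getOrCreate()
    import spark.implicits._

    // Enriched events in TSV format, e.g. the output of the batch enrichment step
    // (the S3 path is illustrative)
    val lines = spark.sparkContext.textFile("s3://my-snowplow-data/enriched/good/")

    // Transform each line to JSON, dropping lines that fail validation
    val jsons = lines.flatMap(line => EventTransformer.transform(line).toOption)

    // Load the JSON into a DataFrame and run a simple aggregation
    // (app_id and event_name are standard enriched-event fields)
    val events = spark.read.json(jsons.toDS())
    events.groupBy("app_id", "event_name").count().show()

    spark.stop()
  }
}
```

The resulting DataFrame could then be aggregated further or written back out (for example to S3 or Redshift) as the output of the data modeling step.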

We are hugely excited about developing our analytics SDK initiative in four directions:

  1. Adding more SDKs for other languages popular for data analytics and engineering, including Python, Node.js (for AWS Lambda) and Java
  2. Adding additional event transformers to the Scala Analytics SDK - please let us know if you have any suggestions!
  3. “Dogfooding” the Scala Analytics SDK by starting to use it in standard Snowplow components, such as our Kinesis Elasticsearch Sink
  4. Adding additional functions that are useful for processing event data, and sequences of event data in particular

Snowplow Analytics SDKs

  • Scala Analytics SDK - lets you work with Snowplow enriched events in your Scala event processing, data modeling and machine-learning jobs. You can use this SDK with Apache Spark, AWS Lambda, Apache Flink, Scalding, Apache Samza and other Scala-compatible data processing frameworks; see the sketch after this list.
  • Python Analytics SDK - lets you work with Snowplow enriched events in your Python event processing, data modeling and machine-learning jobs. You can use this SDK with Apache Spark, AWS Lambda, and other Python-compatible data processing frameworks.
  • Node.js Analytics SDK
  • Java Analytics SDK
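
As a minimal illustration of the core operation these SDKs provide (referenced from the Scala entry above), the sketch below converts a single enriched-event TSV line into its JSON representation. `EventTransformer.transform` is taken from the Scala Analytics SDK; the exact container it returns may differ between SDK versions, and reading the line from standard input is purely illustrative.

```scala
// Minimal sketch: convert one enriched-event TSV line into a JSON string.
// EventTransformer.transform comes from the Scala Analytics SDK; the container
// it returns may differ between SDK versions, but exposes toOption.
import com.snowplowanalytics.snowplow.analytics.scalasdk.json.EventTransformer

object TransformOneEvent extends App {
  // One tab-separated enriched-event line, read here from standard input
  // purely for illustration (it could equally come from Kinesis or S3)
  val tsvLine: String = scala.io.Source.stdin.getLines().next()

  EventTransformer.transform(tsvLine).toOption match {
    case Some(json) => println(json)                        // the event as a JSON string
    case None       => sys.error("Line failed validation")  // e.g. wrong field count
  }
}
```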