Setting up Druid dependencies - chuwy/snowplow-ci GitHub Wiki

The prerequisites for a production setup of Druid in AWS are as follows:

* Amazon S3 to act as the data repository for Druid ("deep storage")
* Postgres on Amazon RDS to act as the metadata storage for Druid
* Apache ZooKeeper to coordinate the Druid clusters

Let's configure/install each of these in turn.

### 1. Amazon S3

We will use Amazon S3 as the data repository for Druid ("deep storage").
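As a rough illustration of what pointing Druid at S3 involves, the sketch below renders the deep-storage entries of a Druid runtime properties file. The property names (`druid.storage.type`, `druid.storage.bucket`, `druid.storage.baseKey`) are those documented for recent Druid releases with the S3 extension loaded; older releases may use different names, so check them against the version you deploy. The helper name, bucket name and key prefix are hypothetical.

```python
def s3_deep_storage_properties(bucket: str, base_key: str) -> str:
    """Render Druid runtime properties that point deep storage at S3.

    Assumption: property names follow recent Druid releases with the
    S3 extension loaded. Credentials are deliberately left out; supply
    them separately (for example through an IAM instance role).
    """
    if not bucket:
        raise ValueError("bucket must be non-empty")
    props = {
        "druid.storage.type": "s3",
        "druid.storage.bucket": bucket,
        # Strip leading and trailing slashes so the prefix is a clean key.
        "druid.storage.baseKey": base_key.strip("/"),
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical bucket and prefix, for illustration only.
print(s3_deep_storage_properties("my-druid-bucket", "druid/segments"))
```

The output can be pasted into the properties file shared by the Druid nodes.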

ADD REST OF SECTION

### 2. Postgres on Amazon RDS

We will use a PostgreSQL instance running on Amazon RDS as the metadata storage for Druid.
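To make the connection between Druid and the RDS instance concrete, here is a minimal sketch that builds the metadata-storage entries of a Druid runtime properties file. The `druid.metadata.storage.*` property names are those documented for recent Druid releases with the PostgreSQL metadata storage extension loaded; older releases may name them differently. The helper name, RDS endpoint, database name and user are hypothetical.

```python
def postgres_metadata_properties(host: str, database: str, user: str,
                                 port: int = 5432) -> str:
    """Render Druid runtime properties for PostgreSQL metadata storage.

    Assumption: property names follow recent Druid releases. The password
    (druid.metadata.storage.connector.password) is intentionally omitted
    so that it is not hard-coded; supply it separately.
    """
    uri = f"jdbc:postgresql://{host}:{port}/{database}"
    props = {
        "druid.metadata.storage.type": "postgresql",
        "druid.metadata.storage.connector.connectURI": uri,
        "druid.metadata.storage.connector.user": user,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Hypothetical RDS endpoint, database and user, for illustration only.
print(postgres_metadata_properties(
    "druid-metadata.example.rds.amazonaws.com", "druid", "druid"))
```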

ADD REST OF SECTION

### 3. Apache ZooKeeper

We will use Apache ZooKeeper as the cluster coordination service for Druid.

Setting up and running a production ZooKeeper cluster is outside the scope of this documentation. We strongly recommend reading ZooKeeper (O'Reilly) before proceeding.

Create a ZooKeeper cluster on EC2 with an odd number of nodes (at least 3). Do not run any Druid components on these ZooKeeper nodes.
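The sizing rule follows from ZooKeeper's majority quorum: an ensemble of n nodes stays available while more than half are up, so it tolerates (n - 1) // 2 failures, and an even-sized ensemble tolerates no more failures than the odd size below it. The sketch below checks a host list against this rule and renders the `server.N` lines for `zoo.cfg` plus the `druid.zk.service.host` property for Druid. The hostnames and function names are hypothetical; ports 2888, 3888 and 2181 are ZooKeeper's conventional defaults and are an assumption about your setup.

```python
def tolerated_failures(node_count: int) -> int:
    """Number of nodes that can fail while a majority quorum survives."""
    return (node_count - 1) // 2


def validate_ensemble(hosts: list[str]) -> None:
    """Enforce the sizing rule: odd number of distinct nodes, at least 3."""
    if len(set(hosts)) != len(hosts):
        raise ValueError("ZooKeeper hosts must be distinct")
    if len(hosts) < 3:
        raise ValueError("a production ensemble needs at least 3 nodes")
    if len(hosts) % 2 == 0:
        raise ValueError("use an odd number of nodes")


def zoo_cfg_server_lines(hosts: list[str], peer_port: int = 2888,
                         election_port: int = 3888) -> list[str]:
    """Render the server.N lines of zoo.cfg, numbering servers from 1.

    Each node also needs a myid file containing its own number N.
    """
    validate_ensemble(hosts)
    return [f"server.{i}={host}:{peer_port}:{election_port}"
            for i, host in enumerate(hosts, start=1)]


def druid_zk_service_host(hosts: list[str], client_port: int = 2181) -> str:
    """Render the Druid property listing the ZooKeeper connection string."""
    validate_ensemble(hosts)
    joined = ",".join(f"{host}:{client_port}" for host in hosts)
    return f"druid.zk.service.host={joined}"


# Hypothetical EC2 hostnames, for illustration only.
zk_hosts = ["zk1.internal", "zk2.internal", "zk3.internal"]
print("\n".join(zoo_cfg_server_lines(zk_hosts)))
print(druid_zk_service_host(zk_hosts))
print(f"tolerates {tolerated_failures(len(zk_hosts))} node failure(s)")
```

Moving from 3 to 4 nodes leaves the tolerated failure count at 1, which is why the validator rejects even sizes rather than merely warning about them.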
