Setting up Druid dependencies - chuwy/snowplow-ci GitHub Wiki
The prerequisites for a production setup of Druid in AWS are as follows:
- [Amazon S3][amazon-s3] to act as the data repository for Druid ("deep storage")
- [Postgres on Amazon RDS][pg-rds] to act as the metadata storage for Druid
- [Apache ZooKeeper][zookeeper] to coordinate the Druid clusters
Let's configure/install each of these in turn.
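For orientation, the three dependencies above eventually surface as a handful of entries in Druid's runtime properties. The sketch below renders those entries from a few inputs. It is only an illustration: the property names follow recent Apache Druid documentation and may differ in the Druid version this guide targets, and the function name, bucket, hostnames, and database name are all hypothetical placeholders.

```python
def druid_dependency_properties(bucket, base_key, pg_host, pg_db, pg_user, zk_hosts):
    """Render the Druid properties that point at S3, Postgres and ZooKeeper.

    Assumption: property names are taken from recent Apache Druid docs;
    check them against the Druid release you actually deploy.
    """
    props = {
        # Deep storage on Amazon S3
        "druid.storage.type": "s3",
        "druid.storage.bucket": bucket,
        "druid.storage.baseKey": base_key,
        # Metadata storage on Postgres (Amazon RDS); 5432 is the Postgres default port
        "druid.metadata.storage.type": "postgresql",
        "druid.metadata.storage.connector.connectURI":
            f"jdbc:postgresql://{pg_host}:5432/{pg_db}",
        "druid.metadata.storage.connector.user": pg_user,
        # ZooKeeper connection string: comma-separated list of hosts
        "druid.zk.service.host": ",".join(zk_hosts),
    }
    # The database password is deliberately left out; supply it separately.
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(druid_dependency_properties(
        bucket="my-druid-deep-storage",
        base_key="druid/segments",
        pg_host="druid-metadata.example.internal",
        pg_db="druid",
        pg_user="druid",
        zk_hosts=["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"],
    ))
```

Keeping all three connection settings in one place makes it easier to check that every Druid node is pointed at the same S3 bucket, database, and ZooKeeper ensemble.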
### 1. Amazon S3

We will use [Amazon S3][amazon-s3] as the data repository for Druid ("deep storage").
ADD REST OF SECTION
### 2. Postgres on Amazon RDS

We will use a PostgreSQL instance running on Amazon RDS as the metadata storage for Druid.
ADD REST OF SECTION
### 3. Apache ZooKeeper

We will use Apache ZooKeeper as the cluster coordination service for Druid.

Setting up and running a production ZooKeeper cluster is outside the scope of this documentation. We strongly recommend reading [ZooKeeper (O'Reilly)][zookeeper-oreilly] before proceeding.

Create a ZooKeeper cluster on EC2 with an odd number of nodes (at least 3). ZooKeeper stays available only while a majority of its nodes are up, so an even-sized cluster tolerates no more failures than the next smaller odd-sized one. You should not run any Druid components on these ZooKeeper nodes.
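The sizing rule above can be illustrated with a minimal sketch that checks the node count and renders a basic `zoo.cfg` for the ensemble. The helper names, hostnames, and data directory are hypothetical; the timing values are the ones commonly shown in ZooKeeper's getting-started material, and 2181, 2888, and 3888 are the conventional client, peer, and leader-election ports. Treat it as a starting point, not a production configuration.

```python
def tolerated_failures(node_count):
    """Nodes that can fail while a strict majority of the ensemble remains."""
    return (node_count - 1) // 2


def render_zoo_cfg(hosts, data_dir="/var/lib/zookeeper"):
    """Render a basic zoo.cfg for an ensemble made of the given hosts.

    Raises ValueError unless the ensemble has an odd number of nodes, at least 3.
    """
    if len(hosts) < 3 or len(hosts) % 2 == 0:
        raise ValueError("use an odd number of ZooKeeper nodes, at least 3")
    lines = [
        "tickTime=2000",        # base time unit in milliseconds
        "initLimit=10",         # ticks a follower may take to connect and sync
        "syncLimit=5",          # ticks a follower may lag behind the leader
        f"dataDir={data_dir}",  # assumed path; also holds each node's myid file
        "clientPort=2181",      # port Druid nodes connect to
    ]
    # One server.N entry per node: peer port 2888, leader-election port 3888.
    for server_id, host in enumerate(hosts, start=1):
        lines.append(f"server.{server_id}={host}:2888:3888")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical EC2 hostnames, for illustration only.
    hosts = ["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"]
    print(render_zoo_cfg(hosts))
    print("tolerated failures:", tolerated_failures(len(hosts)))
```

Each node also needs a `myid` file in its data directory containing only its own server number (the `N` in `server.N`). Note that `tolerated_failures(4)` equals `tolerated_failures(3)`, which is why a fourth node adds cost without adding resilience.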