Kubernetes Production Cluster Hardware Recommendations - CloudCommandos/JohnChan GitHub Wiki

Master Node


This section is from https://docs.okd.io/latest/install/prerequisites.html#hardware
In a highly available OKD cluster with external etcd, a master host must meet the minimum requirements and, on top of that, have 1 CPU core and 1.5 GB of memory for each 1000 pods. The recommended size of a master host in an OKD cluster of 2000 pods is therefore the minimum requirement of 2 CPU cores and 16 GB of RAM, plus 2 CPU cores and 3 GB of RAM for the pods, for a total of 4 CPU cores and 19 GB of RAM.
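The arithmetic above can be sketched as a small Python helper. This is only an illustration of the quoted rule, not an official OKD tool: the function name `master_host_size` and the choice to round up to whole blocks of 1000 pods are assumptions made for this example.

```python
import math
from typing import Tuple

# Minimum requirements quoted in the example above.
BASE_CPU_CORES = 2
BASE_MEMORY_GB = 16.0

# Extra resources per 1000 pods, as quoted above.
CPU_CORES_PER_1000_PODS = 1
MEMORY_GB_PER_1000_PODS = 1.5


def master_host_size(pods: int) -> Tuple[int, float]:
    """Return (cpu_cores, memory_gb) for a master host (hypothetical helper)."""
    if pods < 0:
        raise ValueError("pods must be non-negative")
    # Assumption: partial blocks of 1000 pods are rounded up.
    blocks = math.ceil(pods / 1000)
    cpu_cores = BASE_CPU_CORES + blocks * CPU_CORES_PER_1000_PODS
    memory_gb = BASE_MEMORY_GB + blocks * MEMORY_GB_PER_1000_PODS
    return cpu_cores, memory_gb


print(master_host_size(2000))  # (4, 19.0), matching the 2000-pod example
```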


Worker Node


This section is from https://docs.okd.io/latest/install/prerequisites.html#hardware
The size of a node host depends on the expected size of its workload. As an OKD cluster administrator, you need to calculate the expected workload and add about 10 percent for overhead. For production environments, allocate enough resources so that a node host failure does not affect your maximum capacity.
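As a rough sketch of this sizing rule, the Python function below adds 10 percent overhead to an expected workload and then adds spare nodes so that a host failure does not reduce capacity below what the workload needs. The function name, its parameters, and the "one spare node per tolerated failure" interpretation are assumptions for illustration; the source only states the 10 percent figure and the failure-tolerance goal.

```python
import math


def worker_nodes_needed(
    workload_cpu_cores: float,
    workload_memory_gb: float,
    node_cpu_cores: float,
    node_memory_gb: float,
    overhead: float = 0.10,       # "about 10 percent" from the text above
    tolerated_failures: int = 1,  # assumption: survive one node host failure
) -> int:
    """Estimate the number of equally sized worker nodes (hypothetical helper)."""
    if node_cpu_cores <= 0 or node_memory_gb <= 0:
        raise ValueError("node capacity must be positive")
    # Expected workload plus overhead.
    required_cpu = workload_cpu_cores * (1 + overhead)
    required_memory = workload_memory_gb * (1 + overhead)
    # The scarcer resource decides how many nodes the workload needs.
    nodes_for_cpu = math.ceil(required_cpu / node_cpu_cores)
    nodes_for_memory = math.ceil(required_memory / node_memory_gb)
    # Spare nodes keep full capacity available after a failure.
    return max(nodes_for_cpu, nodes_for_memory) + tolerated_failures


# Example: a 100-core, 400 GB workload on 16-core, 64 GB nodes.
print(worker_nodes_needed(100, 400, 16, 64))
```

Real clusters also reserve resources for system daemons on each node, so treat the result as a lower bound to validate against measured usage.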


etcd


This section is from https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#hardware-recommendations

Example hardware configurations

Here are a few example hardware setups on AWS and GCE environments. These are starting points only: administrators should test an etcd deployment with a simulated workload before putting it into production.

Note that these configurations assume the machines are dedicated entirely to etcd. Running other applications alongside etcd on these machines may cause resource contention and lead to cluster instability.

Small cluster

A small cluster serves fewer than 100 clients, handles fewer than 200 requests per second, and stores no more than 100MB of data.

Example application workload: A 50-node Kubernetes cluster

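| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|-------------|---------------------|-----------------------|
| AWS | m4.large | 2 | 8 | 3600 | 56.25 |
| GCE | n1-standard-2 + 50GB PD SSD | 2 | 7.5 | 1500 | 25 |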

Medium cluster

A medium cluster serves fewer than 500 clients, handles fewer than 1,000 requests per second, and stores no more than 500MB of data.

Example application workload: A 250-node Kubernetes cluster

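| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|-------------|---------------------|-----------------------|
| AWS | m4.xlarge | 4 | 16 | 6000 | 93.75 |
| GCE | n1-standard-4 + 150GB PD SSD | 4 | 15 | 4500 | 75 |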

Large cluster

A large cluster serves fewer than 1,500 clients, handles fewer than 10,000 requests per second, and stores no more than 1GB of data.

Example application workload: A 1,000-node Kubernetes cluster

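| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|-------------|---------------------|-----------------------|
| AWS | m4.2xlarge | 8 | 32 | 8000 | 125 |
| GCE | n1-standard-8 + 250GB PD SSD | 8 | 30 | 7500 | 125 |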

xLarge cluster

An xLarge cluster serves more than 1,500 clients, handles more than 10,000 requests per second, and stores more than 1GB of data.

Example application workload: A 3,000-node Kubernetes cluster

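| Provider | Type | vCPUs | Memory (GB) | Max concurrent IOPS | Disk bandwidth (MB/s) |
|----------|------|-------|-------------|---------------------|-----------------------|
| AWS | m4.4xlarge | 16 | 64 | 16,000 | 250 |
| GCE | n1-standard-16 + 500GB PD SSD | 16 | 60 | 15,000 | 250 |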
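To make the four tiers easier to apply, here is a minimal Python sketch that maps an expected etcd workload to a tier and its example instance types. The thresholds and instance names come from the sections above; the function name `etcd_tier`, the dictionary layout, and treating 1GB as 1024MB are assumptions for illustration. A workload falls into the first tier whose limits it satisfies on all three dimensions, and anything beyond the large tier is treated as xLarge.

```python
from typing import Dict, List, Tuple

# (tier, max clients (exclusive), max requests/s (exclusive), max data in MB (inclusive))
TIER_LIMITS: List[Tuple[str, int, int, int]] = [
    ("small", 100, 200, 100),
    ("medium", 500, 1000, 500),
    ("large", 1500, 10000, 1024),  # assumption: 1GB taken as 1024MB
]

# Example instance types from the tables above: (AWS, GCE).
EXAMPLE_INSTANCES: Dict[str, Tuple[str, str]] = {
    "small": ("m4.large", "n1-standard-2 + 50GB PD SSD"),
    "medium": ("m4.xlarge", "n1-standard-4 + 150GB PD SSD"),
    "large": ("m4.2xlarge", "n1-standard-8 + 250GB PD SSD"),
    "xlarge": ("m4.4xlarge", "n1-standard-16 + 500GB PD SSD"),
}


def etcd_tier(clients: int, requests_per_second: int, data_mb: float) -> str:
    """Return the smallest tier whose limits cover the workload (hypothetical helper)."""
    for name, max_clients, max_rps, max_data_mb in TIER_LIMITS:
        if (clients < max_clients
                and requests_per_second < max_rps
                and data_mb <= max_data_mb):
            return name
    # Anything exceeding the large tier on any dimension.
    return "xlarge"


tier = etcd_tier(clients=300, requests_per_second=800, data_mb=400)
aws_type, gce_type = EXAMPLE_INSTANCES[tier]
print(tier, aws_type, gce_type)
```

The result is only a starting point for choosing hardware; the simulated-workload testing recommended above still applies.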

Useful Links:
Etcd hardware recommendations