17 ‐ Docker Swarm

Docker Swarm is Docker's native orchestration tool, used for managing a cluster of Docker nodes (machines) in a distributed environment. It enables high availability, scaling, and management of containerized applications. As a DevOps engineer, understanding Docker Swarm is essential for orchestrating containerized workloads and deploying applications in production environments. Here's a summary of key concepts and practical notes for working with Docker Swarm.


1. What is Docker Swarm?

Docker Swarm is Docker's built-in orchestration tool that allows you to deploy and manage containers across multiple Docker hosts. It turns a group of Docker engines into a cluster, making it easier to manage large-scale applications by automating container deployment, scaling, and load balancing.

  • Cluster: A group of machines (physical or virtual) running Docker Swarm.
  • Node: A single Docker engine that is part of a Swarm cluster.
  • Manager Node: Controls the swarm, handles the scheduling of tasks, and manages the Swarm state.
  • Worker Node: Runs container tasks as assigned by the manager node.

2. Key Concepts in Docker Swarm

  • Swarm Mode: Docker's native clustering and orchestration feature that can be activated with a single command.
  • Service: A containerized application that you want to run in the Swarm cluster. It defines how containers should run, including replicas, update policies, and scaling.
  • Task: A running instance of a service. It represents a container managed by Swarm.
  • Stack: A group of services defined in a docker-compose.yml file that can be deployed together as a unit in the swarm.

3. Setting Up a Docker Swarm Cluster

To start using Docker Swarm, you need at least one manager node and one or more worker nodes.

a. Initialize a Swarm Cluster (on Manager Node)

docker swarm init

This command initializes the Swarm cluster and designates the current machine as the manager node. After running it, Docker prints a docker swarm join command, including a token, that other nodes can use to join the cluster.
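If you lose that output, you can reprint the join command on the manager at any time. For example:

docker swarm join-token worker

Use docker swarm join-token manager instead to get the command for joining an additional manager node.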

b. Add Worker Nodes to the Cluster

On each worker node, run the command provided after initializing the swarm on the manager node:

docker swarm join --token <WORKER-TOKEN> <MANAGER-IP>:2377

This command will join the worker node to the Swarm cluster.

c. Check the Cluster Status

docker node ls

This command shows the status of all nodes in the Swarm cluster, including manager and worker nodes.
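For more detail on a single node (its role, availability, labels, and resources), you can inspect it; the node name here is a placeholder:

docker node inspect <node-name> --pretty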


4. Managing Services in Docker Swarm

Docker Swarm uses services to define the containers that should be deployed and managed. A service can be created directly with docker service create or defined in a docker-compose.yml file and deployed as part of a stack; either way, it can scale horizontally across nodes.

a. Create a Service

docker service create --name <service-name> --replicas <number-of-replicas> <image-name>
  • Example: Running an Nginx service with 3 replicas:
    docker service create --name nginx --replicas 3 -p 80:80 nginx

b. Scale Services

You can scale the number of containers for a service:

docker service scale <service-name>=<new-replica-count>

Example:

docker service scale nginx=5

c. List Services

To see the active services in the Swarm:

docker service ls

d. Inspect Service Details

docker service inspect <service-name>

e. Update a Service

You can update a service with a new image or configuration:

docker service update --image <new-image> <service-name>

f. Remove a Service

docker service rm <service-name>

5. Swarm Networking

Docker Swarm provides built-in networking features that make it easier to connect containers running on different nodes.

a. Overlay Networks

An overlay network allows containers running on different nodes to communicate securely. Swarm creates a default ingress overlay network when the swarm is initialized; for service-to-service traffic you typically create your own:

docker network create --driver overlay <network-name>
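A minimal sketch, with a hypothetical network name and image:

# create a user-defined overlay network
docker network create --driver overlay app-net

# attach a service to it so its tasks can reach other services on the same network
docker service create --name api --network app-net myapp-image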

b. Routing Traffic to Services

Services can be exposed using published ports. When you publish a port for a service, Swarm's routing mesh makes that port reachable on every node in the cluster and automatically load-balances incoming traffic across the service's replicas.

docker service create --name <service-name> -p <host-port>:<container-port> <image-name>

For example, to expose a web service:

docker service create --name webapp -p 80:80 myapp-image

c. Internal Communication

By default, Swarm provides internal DNS so that services attached to the same network can reach each other by service name. For example, one container can call another at http://<service-name>:<port> instead of using an IP address.
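Continuing the hypothetical example from the overlay-network section, a Redis service named db on the same network is reachable from the api tasks simply by its service name:

docker service create --name db --network app-net redis

# inside any task attached to app-net, "db" resolves via Swarm's internal DNS,
# so the application can use a connection string such as redis://db:6379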


6. Stack Deployments

You can define and deploy a complete multi-service application in Docker Swarm using Docker Compose files.

a. Create a docker-compose.yml for Swarm

Example of a stack definition in docker-compose.yml:

version: "3.8"
services:
  web:
    image: nginx
    ports:
      - "80:80"
    deploy:
      replicas: 3
  app:
    image: myapp
    deploy:
      replicas: 2

b. Deploy the Stack

docker stack deploy -c docker-compose.yml <stack-name>

This command deploys all services defined in the compose file as a stack to the swarm cluster.
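Once deployed, you can check what the stack is running; the stack name is whatever you passed to docker stack deploy:

# one line per service, with replica counts
docker stack services <stack-name>

# individual tasks and the nodes they are scheduled on
docker stack ps <stack-name>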

c. List Stacks

docker stack ls

d. Remove a Stack

docker stack rm <stack-name>

7. Rolling Updates and Rollbacks

Docker Swarm supports rolling updates for services, helping you deploy new versions with little or no downtime.

a. Perform a Rolling Update

docker service update --image <new-image> <service-name>

Swarm will update the service gradually by stopping old containers and starting new ones to avoid downtime.
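The update policy can be tuned with flags on docker service update. As an illustrative sketch (the image tag and values are placeholders), this updates one replica at a time, waits 10 seconds between batches, and rolls back automatically if the update fails:

docker service update \
  --image nginx:1.27 \
  --update-parallelism 1 \
  --update-delay 10s \
  --update-failure-action rollback \
  nginx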

b. Rollback a Service

If there are issues with the update, you can roll back to the previous version:

docker service rollback <service-name>

8. Swarm Security and Secrets Management

a. Docker Secrets

Docker Swarm supports the management of sensitive data (e.g., API keys, passwords) using Docker Secrets.

To create a secret:

echo "my_secret_password" | docker secret create my_secret_password -

To use a secret in a service:

services:
  myapp:
    image: myapp
    secrets:
      - my_secret_password

secrets:
  my_secret_password:
    external: true

The top-level secrets block marks my_secret_password as external, i.e. already created with docker secret create rather than defined by the stack file.
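Inside the running containers, Swarm mounts each secret as a file rather than exposing it as an environment variable, so the application reads it from disk:

# inside a task of the myapp service
cat /run/secrets/my_secret_password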

To list secrets:

docker secret ls

9. Monitoring and Logging

To ensure the health and performance of your Swarm cluster, monitoring and logging are critical:

  • Logs: Use docker service logs <service-name> to view the logs for a service (see the example after this list).
  • Health Checks: Define health checks in your service definition to ensure that the containers are running properly.
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost"]
      interval: 30s
      retries: 3
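As a quick example of the logs command from the list above (the flag values are arbitrary), you can follow a service's output and limit how much history is printed:

docker service logs --follow --tail 100 <service-name>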

For advanced monitoring, integrate with tools like Prometheus, Grafana, or the ELK stack.


10. Removing a Node from the Swarm

The steps differ depending on whether the node you are removing is a worker node or a manager node.

1. Remove Worker Node from Docker Swarm

Step 1: Drain the Worker Node (Optional but Recommended)

Before removing a worker node, you may want to "drain" it, which means Docker will stop scheduling new tasks on this node and will reschedule any existing tasks to other nodes in the cluster.

To drain a node, run the following command on the manager node:

docker node update --availability drain <node-name>

This prevents new tasks from being assigned to the node and attempts to move running tasks to other nodes in the Swarm.
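Before proceeding, you can confirm that no tasks are still scheduled on the drained node; the node name is a placeholder:

docker node ps <node-name>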

Step 2: Remove the Node from the Swarm

Now, remove the worker node from the swarm by running the following command on the manager node:

docker node rm <node-name>

This removes the worker node's entry from the Swarm cluster. Note that docker node rm only succeeds once the node is down or has already left the swarm (see Step 3); on a node that is still active it fails unless you add --force.

Step 3: Leave the Swarm (on the Worker Node)

On the worker node itself, you can run the following command to leave the Swarm cluster:

docker swarm leave

This makes the worker node leave the swarm. On the manager, the node will then show as Down in docker node ls until its entry is removed with docker node rm.

Note: If the node is a manager node, demote it first with docker node demote before removing it. The swarm must always retain at least one manager, and removing managers reduces the Raft quorum, so plan manager removals carefully (see the next section).


2. Remove Manager Node from Docker Swarm

If you are removing a manager node, you need to ensure that there are other manager nodes available in the cluster to maintain quorum.

Step 1: Drain the Manager Node (Optional but Recommended)

It’s a good idea to drain the manager node before removing it, so no new tasks are assigned to it:

docker node update --availability drain <manager-node-name>

Step 2: Demote the Manager Node (if you don't want it to be a manager)

If you just want to demote the manager node (i.e., make it a worker node), you can do this:

docker node demote <manager-node-name>

This will convert the manager node into a worker node, and it will stop participating in manager operations. If you want to keep the node in the Swarm but as a worker, this is the step to use.
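The inverse operation also exists if you later want the node to act as a manager again; the node name is a placeholder:

docker node promote <node-name>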

Step 3: Remove the Manager Node

If you want to completely remove the manager node from the Swarm, run the following from one of the remaining manager nodes:

docker node rm <manager-node-name>

This command removes the node's entry from the cluster. Docker will refuse to remove a node that is still an active manager or still running, so demote it and make sure it has left the swarm (or is shut down) first; it will also never remove the last remaining manager.

Step 4: Leave the Swarm (on the Manager Node)

If you are removing the manager node entirely (not demoting it), run the following command on the manager node itself:

docker swarm leave --force

The --force flag is required when a manager leaves the swarm, because losing a manager can break quorum; the flag confirms that you want the node to leave anyway.

Important: You cannot remove the last manager node from a swarm. Swarm requires at least one manager, and a majority (quorum) of manager nodes, to keep managing the cluster.


3. Forcefully Remove a Node (in case of failure)

In rare cases where a node is unresponsive or you are unable to remove it using the standard steps, you can force remove a node from the Swarm using the --force flag on the manager node.

docker node rm <node-name> --force

This will forcibly remove the node from the Swarm, but use this carefully as it can lead to data inconsistencies if the node was still part of active services.
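Keep in mind that force-removing a node only updates the cluster's view; the node itself may still hold stale swarm state. Once it is reachable again, you can clear that state by running the following on the node itself:

docker swarm leave --force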


4. Verify Node Removal

After removing the node, you can verify that it has been removed by listing the nodes in the cluster:

docker node ls

This will show the current status of all nodes in the Swarm. The removed node should no longer appear in the list.


Summary of Commands

  1. Drain Worker Node (optional but recommended):

    docker node update --availability drain <node-name>
  2. Remove Worker Node (on the manager node):

    docker node rm <node-name>
  3. Leave Swarm (on the worker node):

    docker swarm leave
  4. Demote Manager Node (if needed):

    docker node demote <manager-node-name>
  5. Remove Manager Node (on the manager node):

    docker node rm <manager-node-name>
  6. Leave Swarm (on the manager node):

    docker swarm leave --force
  7. Force Remove Node (in case of failure):

    docker node rm <node-name> --force

11. Best Practices for Docker Swarm

  • High Availability: Ensure you have at least three manager nodes to maintain high availability and fault tolerance.
  • Resource Constraints: Use resource limits (CPU, memory) for containers to avoid overconsumption (see the sketch after this list).
  • Backup and Restore: Regularly back up the Swarm manager state and services configuration.
  • Use Secrets and Configs: Always store sensitive information like credentials and configuration files using Docker Secrets and Configs for enhanced security.
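As a minimal sketch of the resource-constraints point above (the service name and the limit values are illustrative), limits and reservations go under the deploy key in a stack file:

services:
  web:
    image: nginx
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: "0.50"
          memory: 256M
        reservations:
          cpus: "0.25"
          memory: 128M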