Docker Swarm

Intro

😀 This is a less windy version of the original guide I made for the TuringPi cluster. You can find the original here: TuringPi Docker Swarm. So if you want more details, check it out there.

What is Docker Swarm?

Docker Swarm is a native clustering solution for Docker. It lets you run Docker daemons on multiple machines and manage them as a single virtual host. In essence, it's Kubernetes with fewer features and less headache. It lets us run Docker containers in a highly available fashion, scale them up and down as needed, and run multiple services on the same cluster while managing them all as a single entity.
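To give you a taste, bootstrapping a swarm and running a replicated service only takes a couple of commands. A minimal sketch (the service name and image are just examples; the IP matches the node layout further down):

  # On the first node, initialize the swarm
  docker swarm init --advertise-addr 10.0.0.60

  # Run the join command it prints on each of the other nodes:
  # docker swarm join --token <token> 10.0.0.60:2377

  # Deploy a service with three replicas; Swarm spreads them across nodes
  docker service create --name web --replicas 3 nginx

  # See where the replicas landed
  docker service ps web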

Why Docker Swarm?

I'm not going to go into detail about Docker Swarm, but I will give you a few reasons why you might choose it over Kubernetes.

  • Docker Swarm is native to Docker; if you have Docker installed, you already have it.
  • Docker Swarm is a lot easier to set up and manage than Kubernetes.
  • Docker Swarm is lighter on resources, which makes it more cost-effective than Kubernetes on small hardware.

Plan

OS

We will utilize DietPi for its ease of setup. With DietPi, you can have your system up and running with minimal effort and without the need for extensive manual configuration.
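On DietPi, Docker can be installed straight from the built-in software catalogue. A quick sketch (the software ID is from memory and may differ between DietPi releases, so verify it first):

  # Find Docker's software ID in the DietPi catalogue
  dietpi-software list | grep -i docker

  # Install Docker non-interactively (162 was the Docker ID at the time of writing)
  dietpi-software install 162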

Network

Docker offers several network drivers that determine how a network is created and how containers connect to it. On a single host, the most common driver is "bridge", which creates a virtual network inside the Docker host and attaches containers to it. In Swarm mode, containers on different nodes talk to each other over "overlay" networks, which span the whole cluster.

When you publish a port for a service in Docker Swarm, the routing mesh lets any node in the swarm accept the connection and relay it to a container, wherever it runs. This is fine in general; however, for our convenience we can add Keepalived on top.
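Here is a rough sketch of how that fits together (the network and service names are placeholders); thanks to the routing mesh, the published port answers on every node:

  # Create a cluster-wide overlay network
  docker network create --driver overlay app-net

  # Publish port 8080 on every node in the swarm
  docker service create --name web \
    --network app-net \
    --publish published=8080,target=80 \
    --replicas 2 nginx

  # Any node answers, regardless of where the two containers actually run
  curl http://10.0.0.61:8080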

Keepalived

Keepalived is a tool that keeps a service reachable by detecting when a server has failed and redirecting traffic to one that is still working. Under the hood it uses the VRRP protocol to float a virtual IP address between machines, which helps ensure the service is always available to users.

In our case, we will create one virtual IP that serves as our single point of entry. This IP always points to one of the nodes; if that node fails, Keepalived automatically moves the IP to another swarm node.

Layout:

  • Node1 - 10.0.0.60
  • Node2 - 10.0.0.61
  • Node3 - 10.0.0.62
  • Node4 - 10.0.0.63
  • Ingress (virtual IP) - 10.0.0.70
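A minimal keepalived.conf for Node1 might look like the sketch below. The interface name, router ID, and password are assumptions; adjust them to your network, and give the remaining nodes state BACKUP with a lower priority:

  # /etc/keepalived/keepalived.conf on Node1 (10.0.0.60)
  vrrp_instance SWARM_VIP {
      state MASTER              # BACKUP on Node2-4
      interface eth0            # adjust to your NIC name
      virtual_router_id 51      # must match on every node
      priority 110              # use lower values (e.g. 100) on BACKUP nodes
      advert_int 1
      authentication {
          auth_type PASS
          auth_pass changeme    # pick your own secret
      }
      virtual_ipaddress {
          10.0.0.70/24          # the Ingress virtual IP from the layout above
      }
  }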

Storage

Alright, folks, Docker Swarm might not come with a built-in persistent storage feature to handle container migration between nodes, but don't fret! There's a nifty third-party solution called GlusterFS that's got our backs: a strong and flexible way to handle persistent storage in a Docker Swarm cluster.

GlusterFS

GlusterFS is a scalable, distributed file system that allows you to store and access files across multiple servers as if they were on a single server. It provides features such as automatic data replication and data distribution across multiple nodes, allowing you to store large amounts of data and ensure high availability and data redundancy.

Compared to NFS (Network File System), GlusterFS has several advantages. GlusterFS offers better scalability, as it can easily grow and accommodate increasing storage needs by adding more GlusterFS nodes to the cluster. GlusterFS also provides a higher level of data protection, as it automatically replicates data across multiple nodes, reducing the risk of data loss. Additionally, GlusterFS can distribute data across nodes for improved performance, making it well-suited for high-performance storage requirements.

OK, the ad for GlusterFS is over; now we can look at the more practical information.

While it is possible to configure GlusterFS with just one replica, which would result in data being stored on a single node, this approach is not recommended. To ensure high availability and data protection, it is advisable to have at least two replicas in a GlusterFS cluster.

I would recommend that at least two or three nodes in the cluster have a USB SSD attached and used for this purpose. It's important to note that SD cards do not hold up under heavy read/write workloads and should not be used as the primary storage backend for GlusterFS or for the cluster itself.
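As a preview of where we're headed, creating a replicated volume across three nodes looks roughly like this (hostnames and brick paths are placeholders; the full walkthrough comes later):

  # On one node, peer the others first:
  #   gluster peer probe node2 && gluster peer probe node3

  # Create a volume with one replica of the data on each of the three bricks
  gluster volume create swarm-data replica 3 \
    node1:/mnt/ssd/brick node2:/mnt/ssd/brick node3:/mnt/ssd/brick

  # Start the volume and mount it (repeat the mount on every node)
  gluster volume start swarm-data
  mount -t glusterfs localhost:/swarm-data /mnt/swarm-data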

Enough chit-chat, let's get to the good stuff! Grab some coffee and let's get started!