Migrating Kubernetes from Docker to containerd
On December 2nd, a surprise announcement made waves in the Kubernetes Twitter-sphere - that after the upcoming 1.20 release, Docker would be officially deprecated.
Oh no!#
Due to widespread confusion over what “Docker” means in specific contexts, many people panicked - myself included. Due to its popularity and ease of use, the Docker engine has become synonymous with “containers”. However, Docker is really an entire ecosystem of container tools and processes, including building and shipping container images. So what does this announcement mean, and what are the implications for everyone using it?
The only thing Kubernetes is deprecating is using Docker as a container runtime, and the reasoning is sound. Docker’s lack of support for the “Container Runtime Interface” API - or CRI, for short - forced Kubernetes to implement an abstraction layer called “dockershim” to allow Kubernetes to manage containers in Docker. The burden of maintaining dockershim was too great to bear, so they are deprecating dockershim in release 1.20, and will eventually remove it entirely in 1.22.
There are two other container runtimes featured in the Kubernetes quickstart guide as alternatives to Docker - containerd and CRI-O. containerd is the same runtime that Docker itself uses internally, just without the fancy Docker wrapping paper and tools.
Ugh.#
Annoyingly enough, I had finished migrating my entire homelab container infrastructure to Kubernetes just three months ago, with Docker as the container runtime. I initially thought, “Crap. Guess I’ll be rebuilding my cluster!” Then I began to think about what such a change would look like, and whether replacing Docker with containerd in the same cluster is doable.
Hmm…#
Turns out, it is!
I have a 3-node HA cluster which I created using kubeadm. Because I have multiple control plane nodes, I can remove them one at a time using kubeadm reset, rebuild them with containerd instead of Docker, and then rejoin using kubeadm join.
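Before touching anything, it’s worth noting which runtime each node currently reports - kubectl shows it in the CONTAINER-RUNTIME column, and checking it again later is a nice way to confirm each conversion took:
kubectl get nodes -o wide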
Here are the steps I came up with:
Uninstalling Docker#
- Using kubectl, drain and evict pods from the target node.
kubectl drain ${node}
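If the node runs DaemonSet-managed pods (CNI agents, log shippers, and so on), kubectl will refuse to drain it unless you tell it to ignore them:
kubectl drain ${node} --ignore-daemonsets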
- On the target node, use kubeadm to remove the node from the cluster.
kubeadm reset
- Once kubeadm reset is finished, stop Docker and finish cleaning up the node.
systemctl stop docker
rm -rf /etc/cni/net.d
iptables --flush
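The --flush above only clears the filter table. kube-proxy also programs the nat and mangle tables, so if you want a truly clean slate you can flush those as well - roughly:
iptables -t nat --flush
iptables -t mangle --flush
iptables -X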
- Uninstall the Docker CE suite and CLI.
apt-get -y remove docker-ce*
rm -rf /var/lib/docker/*
rm -rf /var/lib/dockershim
- Now’s a great time to update your kernel and OS packages…
apt-get update
apt-get -y dist-upgrade
- …and reboot!
shutdown -r now
Installing containerd#
(These steps are lifted straight from the fantastic k8s containerd docs!)
- Apply the module configs for containerd’s required kernel modules.
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
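You can confirm both modules actually loaded before moving on:
lsmod | grep -e overlay -e br_netfilter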
- Set sysctl tuning parameters for Kubernetes CRI
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
sysctl --system
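And verify the new values took effect:
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward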
- Install containerd if not already installed.
apt-get install -y containerd.io
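One note here: the containerd.io package comes from Docker’s apt repository, which these nodes presumably still have configured from their Docker days. On a fresh machine you’d need to add that repo first - something like this on Ubuntu (adjust for your distro):
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt-get update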
A note on filesystems#
Since these nodes were running Docker, all of the container data is stored in /var/lib/docker. With containerd, container data is now stored in /var/lib/containerd. If you had the Docker data directory on its own filesystem, you’ll need to remove it and create one for containerd. The exact steps depend on your system, so I won’t include them here.
Now back to the fun!#
- Generate a default configuration:
mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
- Modify the config.toml file generated above to enable the systemd cgroup driver:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
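If the generated config.toml already contains a SystemdCgroup = false line under that runc options table (newer containerd releases emit one by default), a quick sed will flip it - otherwise, just add the options block by hand as shown above:
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml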
- Now enable and start containerd!
systemctl enable --now containerd
- Make sure containerd is happy before proceeding.
systemctl status containerd
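Beyond a green systemctl status, the ctr client that ships with containerd will talk to the daemon over its default socket and print both client and server versions, which makes for a quick end-to-end check:
ctr version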
Your system is now fully configured with the containerd runtime - but before we rejoin the cluster, there’s one more step to get kubeadm to play nicely with it!
Updating the kubelet configuration#
Since this cluster was originally built with the Docker runtime, the default kubelet configuration does not explicitly set a cgroup driver. By default, kubeadm with Docker auto-detects the cgroup driver - but other runtimes like containerd don’t support that yet. As a result, when you kubeadm join a containerd node without a cgroup driver specified, the kubelet won’t start. You can ninja-edit the /var/lib/kubelet/config.yaml file when joining and then restart the kubelet, but that’s tedious and unnecessary.
Fortunately, we can update the baseline kubelet config at the cluster level to specify the right cgroup driver to use.
- Edit the baseline kubelet config for your Kubernetes version - 1.18, 1.19, etc.
kubectl edit cm -n kube-system kubelet-config-1.18
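If you’re not sure which suffix your cluster uses, the ConfigMap name includes the minor version, so a quick listing shows it:
kubectl get cm -n kube-system | grep kubelet-config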
- Add the following entry for cgroupDriver:
data:
  kubelet: |
    ...
    cgroupDriver: systemd
    ...
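To double-check that the edit stuck (assuming the kubelet-config-1.18 name from the example):
kubectl get cm -n kube-system kubelet-config-1.18 -o yaml | grep cgroupDriver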
Joining the cluster#
- Proceed to kubeadm join your node with the appropriate kubeadm command! You can run kubeadm token create --print-join-command to create a new token.
kubeadm join 123.45.67.89:6443 \
--token <...snip...> \
--discovery-token-ca-cert-hash sha256:<...snip...> \
[--control-plane --certificate-key <...snip...>]
For control plane nodes, be sure to include the --control-plane flag and --certificate-key for your cluster - otherwise the node will join as a worker! I made this mistake and had to re-reset and rejoin the first node I converted. Use kubeadm init phase upload-certs --upload-certs on another control plane node to reupload your certificates to the cluster, and then pass the provided certificate key to kubeadm join.
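Once the node has joined, you can also confirm that its kubelet picked up the systemd cgroup driver - kubeadm join writes the cluster-level kubelet config down to the node:
grep cgroupDriver /var/lib/kubelet/config.yaml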
Clean-up#
Once your new node has joined, wait a few minutes for your CNI plugin to reprovision the networking stack. When you’re satisfied and the node shows Ready in kubectl get nodes, you can uncordon it with kubectl uncordon.
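Using the same ${node} placeholder as before:
kubectl uncordon ${node}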
And finally, if necessary, don’t forget to remove the control plane taint from your new node again! When kubeadm rejoins the node, it re-applies the default node-role.kubernetes.io/master:NoSchedule taint to prevent control plane nodes from running worker pods. I find dedicated control plane nodes unnecessary for my homelab, so I remove the taint to allow pods to run anywhere.
kubectl taint nodes --all node-role.kubernetes.io/master-
Final thoughts#
Now, granted - this process is extremely unnecessary, and runs contrary to the cloud ethos that nodes should be treated like cattle. But for someone running a small bare-metal environment - where provisioning new nodes isn’t entirely automated - these steps save a lot of time otherwise spent rebuilding VMs from the ground up, assigning IP addresses, updating DNS, and potentially building a whole new cluster.
And as an added bonus, I now know more about Kubernetes and container runtimes than I did last week.