In the course of spreading my homelab over a couple more machines, I finally arrived at the Ceph cluster’s MON daemons, which had been running on three Ceph VMs on my main x86 server up to now. In this post, I will describe how I moved them to three Raspberry Pis. The plan was to do it while the cluster was up the entire time.
First, a couple of considerations:
- MON daemons use, on average, about 1 GB of memory in my cluster
- My cluster, and most of my services, went down during the migration, so please be cautious if you plan to do your own migration
The MON daemons are something of a control plane for Ceph clusters. They hold the cluster maps, including the MON map of monitor daemons and the OSD map of data locations. Every client which uses the Ceph cluster contacts them first to fetch the map of available OSDs to work with.
Please Note: Be cautious with this! If you lose all three of your Monitors, your cluster is broken.
Due to the centrality of the MON daemons for both the cluster itself and any clients, a lot of places potentially hold the IPs of your monitors. Most of the time, that will be in the form of a `mon_host` entry in a `ceph.conf` file. Clients generally do not receive new MON addresses automatically. They will need to be updated manually!
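For illustration, such an entry typically sits in the `[global]` section of a `ceph.conf`; the host names and fsid here are placeholders:

```ini
[global]
fsid = 00000000-0000-0000-0000-000000000000
mon_host = oldhost1,oldhost2,oldhost3
```

Every file or setting that carries a line like this needs to be touched during the migration.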
So how did I do it all? I started out with migrating a single daemon. My thinking here: I could migrate one daemon, then update all three MONs’ addresses to their new values everywhere, and then migrate the other two daemons as well.
For the sake of this article, let’s assume that the old MONs are located on `oldhost1`, `oldhost2` and `oldhost3`, and the new hosts are called `newhost1`, `newhost2` and `newhost3`. Also note that I’m running a cephadm-managed cluster, so the `ceph orch` commands are available.
So to begin with, a single daemon can be migrated using the `ceph orch apply` command:

```shell
ceph orch apply mon --placement "newhost1,oldhost1,oldhost2"
```
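To make the overall plan concrete, here is a sketch of the full rotation. The intermediate placement and the commands themselves are my assumptions about how the remaining steps would look; the commands are echoed rather than executed, so the sketch is safe to run:

```shell
# Sketch only: each placement keeps a quorum of three MONs.
# Between steps, wait for the new MON to join and update all
# client configurations before continuing.
step1="newhost1,oldhost1,oldhost2"   # migrate the first MON
step2="newhost1,newhost2,oldhost1"   # migrate the second MON
step3="newhost1,newhost2,newhost3"   # final placement

for p in "$step1" "$step2" "$step3"; do
    echo ceph orch apply mon --placement "$p"
done
```

Removing only one old MON per step means two of the three monitors stay reachable at any point in time.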
This will disable the MON on `oldhost3` and place a fresh one on `newhost1`. The MON daemons on `oldhost1` and `oldhost2` will not be touched at all and will keep running.
At this point, nothing much can go wrong in cluster operations. Any connected clients will automatically go searching for another MON daemon and find either `oldhost1` or `oldhost2`. But note: those clients will not automagically get `newhost1` added to their list of potential MONs. Many parts of the cluster, including the MON daemons on `oldhost1` and `oldhost2`, will be informed about the new MON daemon.
But other parts of the cluster will not. Among the daemons which will not
automatically get the new MON address are the OSDs and NFS daemons.
At this point, I was not yet aware that there was any kind of problem.
I then adapted all of the `ceph.conf` files and other places where the MON IPs are mentioned. These were:
- the Ceph CSI jobs running in my Nomad cluster
- the `ceph.conf` files on a number of unmanaged physical hosts
- the kernel command lines of my netbooting hosts, which contain the MON addresses
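Updating the plain `ceph.conf` files can be scripted. This is only a sketch under my assumptions: a `mon_host = ...` line in the usual `key = value` form, and made-up host names. It demonstrates the rewrite against a throwaway file rather than a real `/etc/ceph/ceph.conf`:

```shell
# Hypothetical helper: point mon_host at the new MONs in one conf file.
NEW_MONS="newhost1,newhost2,newhost3"

update_mon_host() {
    # Rewrite the mon_host line in place, leaving everything else alone.
    sed -i "s/^mon_host = .*/mon_host = ${NEW_MONS}/" "$1"
}

# Demonstration against a throwaway file:
tmpconf=$(mktemp)
printf '[global]\nfsid = 00000000-0000-0000-0000-000000000000\nmon_host = oldhost1,oldhost2,oldhost3\n' > "$tmpconf"
update_mon_host "$tmpconf"
result=$(grep '^mon_host' "$tmpconf")
echo "$result"
rm -f "$tmpconf"
```

The netboot kernel command lines and the Nomad CSI jobs are a different story, of course, and had to be edited in their respective places.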
This was where I diverged from my original plan. Instead of just replacing the address of `oldhost3` with that of `newhost1`, I went ahead and replaced all of them.
And here’s where the problems started. During reboots, my OSDs suddenly were no longer recognized in the `ceph -s` output. They were shown as down, even though I could see that they were up and running on their respective hosts.
The reason for this: the OSDs do not seem to be updated with new MON addresses automatically, and they also ignore their host’s `ceph.conf`. Instead, they have their own config file, located at `/var/lib/ceph/CLUSTER_ID/osd.1/config`. `CLUSTER_ID` here is the `id:` line in the `ceph -s` output, and `osd.1` is the name of the OSD daemon. That file seems to be a minimal `ceph.conf` used by the OSD.
Just manually changing the MON addresses in there and restarting the daemons
fixed the issue.
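The fix can be scripted per host. This is a sketch under the assumptions above: cephadm’s layout of `/var/lib/ceph/CLUSTER_ID/osd.N/config`, a `mon_host = ...` line in those files, and made-up host names and fsid. The `systemctl restart` is only echoed so the sketch does nothing destructive:

```shell
# Hypothetical helper: rewrite mon_host in every OSD config below a root
# directory and print the restart command for each affected daemon.
fix_osd_confs() {
    root="$1"; fsid="$2"; mons="$3"
    for conf in "$root/$fsid"/osd.*/config; do
        [ -e "$conf" ] || continue   # no OSDs on this host
        sed -i "s/^mon_host = .*/mon_host = ${mons}/" "$conf"
        # cephadm names the systemd units ceph-<fsid>@osd.N.service:
        osd=$(basename "$(dirname "$conf")")
        echo systemctl restart "ceph-${fsid}@${osd}.service"
    done
}

# Example invocation with a made-up fsid (prints the restarts, does not run them):
fix_osd_confs /var/lib/ceph 00000000-0000-0000-0000-000000000000 "newhost1,newhost2,newhost3"
```

Run something like this on every OSD host, then check `ceph -s` to confirm the OSDs report in again.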
I also observed that the NFS daemon I had running did not seem to be working anymore. It had the same problem, and the same solution worked.
A final comment on performance: It seems that Raspberry Pis manage the load of MON daemons just fine. I’ve got three of them hosting the MONs now, and they are also running Nomad, Consul and Vault servers. The CPU utilization seldom goes above 10%.