Wherein I shut down my Nomad cluster for good.

This is part 23 of my k8s migration series.

It is finally done: on the 13th of March I shut down my Nomad cluster. I had originally set it up sometime around 2021. The original trigger was that I had started to separate the Docker containers running public-facing services from the purely internal ones. Around that setup, I had constructed a bunch of Bash scripts and a couple of shared mounts. It wasn't pretty, but the Homelab had recently turned from a utility into a genuine hobby. In short, increased complexity was actually welcome. 😁

So when I started reading about workload schedulers, I naturally looked at Kubernetes first. I bounced off of it when I came to the "Now choose a Container Network Interface (CNI) plugin" stage of the install instructions. Not only did I not know which CNI plugin to choose - I didn't even know how to go about making that choice.

And that's how I came across Nomad. Together with Consul and Vault, it gave me a really enjoyable Homelab. Nomad, Consul, and Vault are all absolutely excellent tools. Nomad offers great flexibility in the drivers it can use for its jobs, ranging from Docker containers to plain exec jobs run in a simple chroot. Networking can be as simple or as complex as you like, and by default you don't need to worry about any kind of separate network: if you like, you can run everything on the network between your nodes, without any complicated CNIs.

And that's what initially drew me to Nomad. Taken on its own, it doesn't do much more than run workloads - and that's it. For secrets management or service discovery you can add Vault and Consul, or you can just leave those things out.

Since I started with Nomad, some service discovery and secrets management capabilities have been added to Nomad itself, but I never tried them because I already had Vault and Consul set up to my liking.

So let’s have a short look at an example job:

job "prometheus" {
  datacenters = ["homenet"]

  constraint {
    attribute = "${node.class}"
    value     = "internal"
  }

  group "prometheus" {

    network {
      mode = "bridge"
      port "health" {
        host_network = "local"
        to           = 9090
      }
    }

    service {
      name = "prometheus"
      port = 9090

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "snmp-exporter"
              local_bind_port  = 9116
            }
          }
        }
      }

      check {
        type     = "http"
        interval = "30s"
        path     = "/-/ready"
        timeout  = "2s"
        port     = "health"
      }
    }

    volume "vol-prometheus" {
      type            = "csi"
      source          = "vol-prometheus"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }

    task "prometheus" {
      driver = "docker"
      user = "962:962"

      config {
        image = "prom/prometheus:v2.50.0"

        mount {
          type = "bind"
          source = "secrets/prometheus.yml"
          target = "/etc/prometheus/prometheus.yml"
        }

        args = [
          "--config.file=/etc/prometheus/prometheus.yml",
          "--storage.tsdb.path=/prometheus",
          "--web.console.libraries=/usr/share/prometheus/console_libraries",
          "--web.console.templates=/usr/share/prometheus/consoles",
          "--web.page-title=Homenet Prometheus",
          "--storage.tsdb.retention.time=5y",
          "--log.format=json"
        ]
      }

      vault {
        policies = ["prometheus"]
      }

      volume_mount {
        volume      = "vol-prometheus"
        destination = "/prometheus"
      }

      template {
        data = file("prometheus/templates/prometheus.yml.templ")
        destination = "./secrets/prometheus.yml"
        change_mode = "restart"
      }

      resources {
        cpu = 400
        memory = 400
      }
    }
  }
}

This shows a pretty typical Nomad job setup in my Homelab. One of the interesting things compared to Kubernetes is that most of the configuration lives in a single file instead of a bunch of YAML files. Very roughly speaking, a group is the equivalent of a Kubernetes Pod, in that it provides a common networking and filesystem/volume namespace, and all tasks in a group get scheduled on the same node.
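For comparison, here is a rough sketch of how the same group might be expressed as a Kubernetes Pod. This is illustrative only, not one of my actual migration manifests - the PVC name and the exact probe settings are assumptions:

```yaml
# Illustrative sketch: a rough Kubernetes equivalent of the Nomad group above.
apiVersion: v1
kind: Pod
metadata:
  name: prometheus
spec:
  securityContext:        # matches user = "962:962" in the Nomad task
    runAsUser: 962
    runAsGroup: 962
  containers:
    - name: prometheus
      image: prom/prometheus:v2.50.0
      args:
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
      ports:
        - containerPort: 9090
      readinessProbe:     # equivalent of the Nomad service check
        httpGet:
          path: /-/ready
          port: 9090
        periodSeconds: 30
        timeoutSeconds: 2
      resources:
        requests:
          cpu: 400m
          memory: 400Mi
      volumeMounts:
        - name: vol-prometheus
          mountPath: /prometheus
  volumes:
    - name: vol-prometheus
      persistentVolumeClaim:   # stands in for the CSI volume stanza
        claimName: vol-prometheus
```

Note what the sketch does not cover: the Connect sidecar, the Vault policy, and the templated config file all need separate machinery (a service mesh, a secrets operator, ConfigMaps) on the Kubernetes side.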

The great Vault and Consul integrations are both visible here. First, there's the service stanza, which hooks Prometheus into Consul's Connect mesh for service discovery. How that looks from the consumer side can be seen in the connect stanza, which sets up an upstream for an SNMP exporter running in a separate job. Then there's the vault stanza, which configures the Vault policy the task's token receives. These policies can then be tuned so that each job only gets access to the secrets it actually needs.

Something else I learned to appreciate was the template stanza. It uses consul-template internally to render configuration files, complete with Vault integration. This made running apps that expect their secrets inside their configuration files a lot more convenient.
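As an illustration, here is a hypothetical fragment of what a file like prometheus.yml.templ could contain. The Vault KV v2 path, the field name, and the use of basic auth are all made up for the example - the point is just the templating syntax:

```yaml
# Hypothetical consul-template fragment; secret path and field are illustrative.
scrape_configs:
  - job_name: "snmp"
    # The Connect sidecar exposes the snmp-exporter upstream on localhost:
    static_configs:
      - targets: ["127.0.0.1:9116"]
    basic_auth:
      username: "prometheus"
      # Rendered from Vault at template time (assumed KV v2 path and field):
      password: '{{ with secret "secret/data/prometheus" }}{{ .Data.data.scrape_password }}{{ end }}'
```

With change_mode = "restart", Nomad re-renders the file and restarts the task whenever the secret changes, which is exactly the kind of plumbing I would otherwise have had to script myself.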

But I don’t want to go into too much detail here. I’m planning to write a series of Homelab history posts where I will go into a lot more detail on the setup and dredge up all manner of old configurations and notes.

In the end, the trigger for my decision to migrate my well-functioning Homelab to k8s was HashiCorp’s decision to relicense under a more restrictive license. But I could have survived that one as well. And then they went and changed the ToS for the Terraform provider registry to exclude the FOSS fork of Terraform. That looked very much like pure spite to me, and I no longer trusted HashiCorp enough to build my Homelab on their tools. More details can be found in this post.

So even though I liked (and still like) the tools, I’ve now moved away from them for the most part. Here is a screenshot of the cluster when it was in full swing:

A screenshot of Nomad's topology Web UI. It shows that the cluster had 9 clients running 56 allocations. It had 68.66 GiB of RAM, of which 41% was reserved by jobs. The cluster also had 58.24 GHz of compute, of which 59% was used. To the right, a list shows the nine hosts, each running anywhere between 2 and 10 allocations. Most of the hosts are Raspberry Pi CM4s, with 8 GiB of RAM and 6000 MHz of compute.

The Nomad cluster when it was in full use.

And then, on March 13th, it looked like this:

A screenshot of Nomad's topology Web UI. It shows the same cluster, but now with only five clients instead of nine. The list of allocations assigned to hosts now only shows 'Empty client' for the remaining clients.

The Nomad cluster right before shutdown.

At that point, a couple of hosts had already been migrated over to the k8s cluster.

It all ended with this:

Mar 13 20:42:52 nomad[657]: ==> Caught signal: interrupt
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.837+0100 [INFO]  agent: requesting shutdown
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.837+0100 [INFO]  nomad: shutting down server
Mar 13 20:42:52 systemd[1]: Stopping Nomad...
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.837+0100 [WARN]  nomad: serf: Shutdown without a Leave
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.861+0100 [ERROR] consul.sync: failed deregistering agent service: service_id=_nomad-server-znuisv3m75ywtkofhwsukx47zklaefe3 error="Unexpected response code: 403 (Permission denied: token with AccessorID 'eaab766d-7627-3cda-21fe-a3d5fb63dd7a' lacks permission 'service:write' on \"nomad\")"
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.863+0100 [ERROR] consul.sync: failed deregistering agent service: service_id=_nomad-server-fi2peeufsfjc6po3r6v3vrhwg2pcyymo error="Unexpected response code: 403 (Permission denied: token with AccessorID 'eaab766d-7627-3cda-21fe-a3d5fb63dd7a' lacks permission 'service:write' on \"nomad\")"
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.866+0100 [ERROR] consul.sync: failed deregistering agent service: service_id=_nomad-server-ppg65djoq2gktz3gnzojkqza4d4idkv4 error="Unexpected response code: 403 (Permission denied: token with AccessorID 'eaab766d-7627-3cda-21fe-a3d5fb63dd7a' lacks permission 'service:write' on \"nomad\")"
Mar 13 20:42:52 nomad[657]:     2025-03-13T20:42:52.869+0100 [INFO]  agent: shutdown complete
Mar 13 20:42:52 systemd[1]: nomad.service: Main process exited, code=exited, status=1/FAILURE
Mar 13 20:42:52 systemd[1]: nomad.service: Failed with result 'exit-code'.
Mar 13 20:42:52 systemd[1]: Stopped Nomad.
Mar 13 20:42:52 systemd[1]: nomad.service: Consumed 16h 25min 36.860s CPU time.

And with that it’s gone. 🙁

You will note the errors complaining about Consul. I had completely forgotten about the service registrations and removed the Consul tokens that allowed Nomad to deregister its own services before shutting it down. This was fixable by deregistering the Nomad services manually:

consul services deregister -id="_nomad-server-zytfvzuzuboej3ehgwdihrgykyfj46pp"

This command needs to be run against the Consul agent where the service was registered; it can't be executed against just any Consul agent.
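If there are several leftover registrations, they can be cleaned up in one go. A rough sketch of what I mean, to be run on each affected node - it assumes curl and jq are available locally, and that the Consul HTTP API is reachable on its default address:

```shell
# Hypothetical cleanup sketch. Must run on each former Nomad node, since
# agent-registered services can only be deregistered via that same agent.
# /v1/agent/services returns a map keyed by service ID, so we filter for
# the IDs Nomad generates ("_nomad-..." prefix) and deregister each one.
for id in $(curl -s http://127.0.0.1:8500/v1/agent/services | jq -r 'keys[]' | grep '^_nomad-'); do
  consul services deregister -id="$id"
done
```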

And with that, Nomad is gone. There's still a lot to do. I'm also already done shutting down my Ceph cluster; that will likely be the next post.