So, as I mentioned in my last article, I want to give Kubernetes another try after HashiCorp’s recent license change.

This also gives me a chance to put the lab back in Homelab, as it has mostly been a Homeprod environment - not much experimentation going on there, just slow, intentional incremental changes here and there. But my Homeprod env is not really suited for housing a Kubernetes cluster. It mostly consists of Raspberry Pis. Don’t get me wrong, it is serving me well - but running two parallel clusters with two different orchestrators on the same HW is probably not a good idea. 😉

So I decided to dig out my old Homeserver from the days when my Homelab was only a single machine. It is an x86 machine with a 10th gen Intel i7 CPU, two 500 GB SSDs and 64 GB of RAM, and it had been gathering dust in storage since I decommissioned it this spring. It still had all its innards, and after a quick once-over to make sure I hadn’t unplugged any important cables, I was able to boot it right up.

The first thing to go was the old Arch Linux install, as it was wholly unsuitable for what I needed to do. Instead, the machine got an Ubuntu install. Which brought the first hurdle. Not the actual install, but rather the setup of the damn USB stick for it, because I didn’t want to connect a monitor and keyboard. I wanted to do a headless install.

And the Ubuntu installer even sets up an SSH server. But the password to log in is randomized and - you guessed it - only shown on-screen after bootup. Which I find an interesting decision. Of course, that’s done for security reasons - having default installer passwords is frowned upon these days.

Next, I thought I could just unpack the ISO, set a password for the installer user and repackage it. I even found a couple of guides to do so, but I was not able to properly repackage the changed ISO to be bootable. So I finally gave up and just connected a monitor and keyboard.

Baremetal host setup

For the actual cluster machines, I went with LXD, for the simple reason that I had used it before I moved to a fleet of baremetal Raspberry Pis and had good experiences with it.

For storage, a 70 GB partition on one of the SSDs serves as the host’s root disk. The remaining 400 GB of that SSD became an LVM volume group to serve as an LXD storage pool:

    - name: create LXD storage volume group
      tags:
        - storage
      community.general.lvg:
        vg: vg-lxd
        pvs:
          - /dev/sdb3
        state: present

One more important change needed on the VM host is the creation of a bridge interface, so that the VMs can communicate with the host and with my wider network. A Linux bridge interface is rather similar to a software network switch. I set it up via Ubuntu’s netplan config. One very important point: You need to disable DHCP on your main interface if it is part of the bridge, and instead enable DHCP on the bridge interface. If you do it the other way around and leave DHCP enabled on the physical interface, the host sends out DHCP DISCOVER requests but never answers the DHCP OFFER sent back by your networking infrastructure. It says so, plain and clear, in the netplan docs (why exactly does a Google search for “netplan bridge” not show the netplan docs on the very first page?!). So guess who had to pick the server up from its corner and connect a monitor and keyboard again, because he thought he was smarter than the docs? 😒
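
For illustration, a netplan config for such a bridge might look roughly like this. It is only a sketch: the interface name enp1s0, the bridge name br0 and the file name are placeholders for whatever your system uses:

# /etc/netplan/99-bridge.yaml (hypothetical file name)
network:
  version: 2
  renderer: networkd
  ethernets:
    enp1s0:
      # DHCP stays off on the physical interface that is part of the bridge
      dhcp4: false
  bridges:
    br0:
      interfaces:
        - enp1s0
      # the bridge is the interface that actually requests the DHCP lease
      dhcp4: true

With something like this in place, a netplan apply should bring up the bridge and have it grab the DHCP lease instead of the physical interface.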

The VM OS

All of the setup finally done, the next decision was which OS to use for the VMs. At first, I was a bit fascinated with Talos Linux, which bills itself as a Linux for Kubernetes. It follows the new “immutable distro” paradigm, and I had not yet dipped my toes into that particular topic - so time to make this a double experiment? Alas, no. It looks like Talos isn’t just a distro which is “good for Kubernetes”, but also one that believes it knows better than I do. Namely, it disables SSH access. Completely. You don’t really need shell access, you know? In fact, it’s bad for you.

Let’s clear up one myth right away: This is certainly not for security reasons, because they still have an API with which you can supposedly do everything. So we have replaced a decades-old project, OpenSSH, which has had audits up the wazoo, with a hip new API. Yeah. Sure. I definitely trust your API way more than OpenSSH…

Another argument I heard, and found more believable than security, is saving Ops teams from themselves by removing the temptation to SSH into a machine and fix a problem directly, instead of, say, going through the GitOps process, code reviews and all. That one I buy a lot more readily. Although I will say: I’ve been working in ops for a while now, and I have been very happy to have access to the actual machines for debugging purposes. Because sometimes, you just need to attach strace to random processes.

Apart from that particular piece of opinionated design, it also has what is admittedly a bigger problem when it comes to my goal of experimenting with Kubernetes: It sets up the Kubernetes cluster itself, and automates a bit too much, at least for my initial, experimental cluster. So I’ve put it off for now, and might set up another experiment once I’ve become a bit more familiar with Kubernetes.

On the positive side, it has support for Raspberry Pis, so at least that’s not a blocker.

I ended up going with what I already knew: Ubuntu, which I also run on all the other machines in my Homelab.

LXD setup

To set up the VMs, I decided to go with Terraform, because it allows me to store the setup in config files instead of having a playbook with a series of LXD commands. I am using the terraform-lxd provider.

To initialize the provider, I first had to introduce it to my main Terraform config:

terraform {
  required_providers {
    lxd = {
      source = "terraform-lxd/lxd"
      version = "~> 1.10.1"
    }
  }
}

data "vault_generic_secret" "lxd-pw" {
  path = "secret/lxd-pw"
}

provider "lxd" {
  generate_client_certificates = true
  accept_remote_certificate = true

  lxd_remote {
    name = "server-name-here"
    scheme = "https"
    address = "server-fqdn-here"
    default = true
    password = data.vault_generic_secret.lxd-pw.data["pw"]
  }
}

To get at the password, I’m using my Vault instance again, where I pushed the secret with vault kv put secret/lxd-pw pw=- (the trailing - makes Vault read the value from stdin, so it doesn’t end up in my shell history). This is a bit of an anti-pattern, as the password still ends up in the Terraform state. But I’ve come to accept that sometimes, this happens. My state is pretty well secured. But keep this in mind when following along - your Terraform state should be kept secure!

The next step is configuring the LVM-based LXD storage pool I mentioned above. This is also done in Terraform:

resource "lxd_storage_pool" "lvm-pool" {
  remote = "server-name-here"
  name = "lvm-pool"
  driver = "lvm"
  config = {
    source = "vg-lxd"
    "lvm.thinpool_name" = "LXDThinPool"
    "lvm.vg_name" = "vg-lxd"
  }
}

Next come a couple of profiles: one for my controller nodes, with 4 CPUs and 4 GB of RAM, somewhat similar to the Raspberry Pi 4 4 GB which will ultimately run my control plane, and another one for my Ceph nodes with a bit more RAM. On top of that, there is a base profile for networking, which adds a NIC attached to the bridge interface created previously, and another profile for VMs with a local root disk from the LVM pool.

resource "lxd_profile" "profile-base" {
  name = "base"
  remote = "server-name-here"

  config = {
    "boot.autostart" = false
    "cloud-init.vendor-data" = file("${path.module}/lxd/vendor-data.yaml")
  }

  device {
    name = "network"
    type = "nic"

    properties = {
      nictype = "bridged"
      parent = "your-bridge-interface-name"
    }
  }
}

resource "lxd_profile" "profile-localdisk" {
  name = "localdisk"
  remote = "server-name-here"

  device {
    name = "root"
    type = "disk"

    properties = {
      pool = lxd_storage_pool.lvm-pool.name
      size = "50GB"
      path = "/"
    }
  }
}

resource "lxd_profile" "profile-controller" {
  name = "controller"
  remote = "server-name-here"

  config = {
    "limits.cpu" = 4
    "limits.memory" = "4GB"
  }
}

resource "lxd_profile" "profile-ceph" {
  name = "ceph"
  remote = "server-name-here"

  config = {
    "limits.cpu" = 4
    "limits.memory" = "8GB"
  }
}

With all of that created, I only needed the VMs themselves. But there was one problem: In the rest of my (baremetal) Homelab, I’m producing disk images with HashiCorp’s Packer, with my Ansible user and some other bits and pieces already baked in. But now, I needed another way to bake in the Ansible user, as the goal here is to learn Kubernetes - not LXD image creation. I didn’t really want yet another yak to shave.

VM images

As noted above, I had already decided to go with Ubuntu as the base OS for the VMs. Conveniently, the Ubuntu LXD images support cloud-init, and so does LXD itself.

After some digging, I found that I could relatively easily create my Ansible user and provide the SSH key for it. I could also adapt the sudoers files as I needed to make it all work.

But there was one problem remaining: I want my Ansible user to require a password for sudo. But I did not want to have my Ansible user’s password in the Terraform state, let alone just plainly written out in the Terraform config file. So what to do? In the end, the only thing I could come up with was to instead set a temporary password for my Ansible user, and run a short bootstrapping playbook to change it to the actual password. It does not feel very elegant, but keeps my user’s sudo password out of the Terraform state and configs.

This can all be achieved with cloud-init. My profile-base LXD profile adds the required cloud-config file:

  config = {
    "cloud-init.vendor-data" = file("${path.module}/lxd/vendor-data.yaml")
  }

LXD’s cloud-init.vendor-data config option is used here. The cloud-init config file looks like this:

#cloud-config
users:
  - name: your-ansible-user
    sudo: ALL=(ALL:ALL) ALL
    ssh_authorized_keys:
      - from="1.2.3.4" ssh-ed25519 abcdef12345 ssh-identifier
    shell: /bin/bash
packages:
  - sudo
  - python3
chpasswd:
  expire: false
  users:
    - name: your-ansible-user
      password: your-temporary-password
      type: text

This first creates the your-ansible-user user, with an appropriate sudoers entry. It also adds an SSH key, allowing access only from a single machine, which in my case is a dedicated Command & Control host. I also add the python3 and sudo packages, which are required by Ansible. Finally, I set the password for your-ansible-user to a pretty simple temporary value which I had no problem committing to git.
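
That temporary value then gets rotated by the bootstrapping playbook mentioned above. As a rough sketch (not my actual playbook; the user name and the prompted variable are placeholders), it could look something like this:

- name: rotate the temporary password of the Ansible user
  hosts: all
  become: true
  vars_prompt:
    # prompting keeps the real password out of the repo
    - name: new_password
      prompt: New password for the Ansible user
      private: true
  tasks:
    - name: set the real password
      ansible.builtin.user:
        name: your-ansible-user
        # the user module expects a password hash, not the cleartext value
        password: "{{ new_password | password_hash('sha512') }}"

The first run of such a playbook would use the temporary password for privilege escalation (ansible-playbook --ask-become-pass); after that, the real one applies.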

The experience of how well this worked also has me thinking about revamping my netbooting setup. At the moment, I’m generating one image per host, even though most things are the same across all hosts. I could instead have two base images (one amd64, one arm64) and then do the necessary per-host adaptations by running a cloud-init server in my network.
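
Roughly sketched, and purely hypothetical at this point: cloud-init’s NoCloud datasource can fetch its seed over HTTP when the kernel command line contains something like ds=nocloud-net;s=http://boot-server/node1/ (server name and path made up here), and the per-host seed then only needs a tiny meta-data file plus whatever user-data that host should get:

# meta-data, served at <seed URL>/meta-data (hypothetical host "node1")
instance-id: node1
local-hostname: node1

# user-data, served at <seed URL>/user-data
#cloud-config
hostname: node1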

Creating the VMs

The last part of the setup is creating the VMs themselves:

resource "lxd_container" "ceph-vm-1" {
  name = "vm-name"
  remote = "server-name-here"
  type = "virtual-machine"
  image = "ubuntu:22.04"
  profiles = [
    lxd_profile.profile-base.name,
    lxd_profile.profile-localdisk.name,
    lxd_profile.profile-ceph.name
  ]
  start_container = true

  device {
    name = "cephdisk"
    type = "disk"

    properties = {
      source = "/dev/sda2"
      path = "/dev/cephdisk"
    }
  }

  config = {
    "cloud-init.user-data" = <<-EOT
      #cloud-config
      hostname: vm-name
    EOT
  }
}

This resource remotely contacts the LXD server and creates a new VM. Don’t get confused by the lxd_container resource type; it is simply the resource type shared between LXD containers and VMs, and the type property determines what actually gets created. In the config section, I’m explicitly setting the hostname of the new machine with the cloud-init user-data config option. By default, the hostname is the same as the LXD VM name, which would be the name field in Terraform. But as I sometimes have the habit of naming my VMs something other than their actual hostname, I provided it explicitly here.

One very important point: The #cloud-config at the top of the file is not a comment - it is part of the cloud-init spec. It has to be there. Took me a while to realize that…

The above example is one of the two VMs which will end up serving as Ceph Rook hosts, so it also gets handed another disk for later use by Ceph.

And that’s it. After a final terraform apply, I’ve finally got a Homelab again.

Over the last week, I have been researching Kubernetes and cluster setups. I’ve got a couple of notes on the topic and will likely write another blog post with all the prep work rather soon. If I’m really lucky I might finally be ready to issue the kubeadm init command later today. 😉