A description of my lab setup for tinkering with Tinkerbell.

This is part 2 of my Tinkerbell series.

For my Tinkerbell tinkering lab Actually, no. Let’s start with: How did I not come up with “tinkering with Tinkerbell” until the second post of this series? You may tsk tsk tsk disapprovingly at your screen now.

For my Tinkerbell tinkering lab, I decided to run it on my desktop machine. This is because previous work on network booting has shown that I definitely want direct access to the netbooting machine’s TTY. And that’s easiest when it runs on my desktop. Also makes stuff like packet capturing easier. So I needed the following things in my lab setup:

  1. Fresh VLAN
  2. VM tooling on my desktop
  3. Ubuntu server VM for Tinkerbell
  4. k3s, to run Tinkerbell

In this post, I will go into a bit more detail on what that setup looks like.

New VLAN

In my Homelab VLAN, I’ve already got two DHCP servers. One is from my OPNsense router, providing the IPAM (IP Address Management) side of things. Then there’s a dnsmasq instance running in proxy DHCP mode, supplying the necessary info for netbooting and also serving as a TFTP server.

This is definitely something I will need to tackle during the labbing phase - what to do about diskless netbooting machines? For their first boot, they should go with Tinkerbell for initial provisioning. But all subsequent boots should then use the dnsmasq server and boot their normal kernel.

But for now, I’m avoiding having to think about this by creating a separate VLAN so Tinkerbell’s DHCP doesn’t disrupt the netbooting hosts. If you’re curious about the details, head to this post. For now, suffice it to say that I configured another fresh VLAN, let’s say with the ID 512, and added it as a trunk VLAN to the router’s main interface. Same for the rest of the network path to my desktop. There, the VLAN is also configured trunked, so that packets arrive on the host with their VLAN tag intact, allowing me to configure a special interface on the host for just those packets. Importantly, I did not set the desktop’s switch port to autotag incoming packets (coming from the desktop) with that VLAN ID. So all packets for this VLAN come into the host tagged, and they also have to leave the host tagged.

Because I intended to have the lab up only while actively working on it, I didn’t do any config file changes, but instead wrote a small bash script to set up the networking via ip commands:

#!/bin/bash

LAN=eth0
VLANID=512
VLAN=$LAN.$VLANID
BRIDGE=br
IP=203.0.113.1/32

# Create the VLAN interface on top of the physical NIC, attach it to a
# fresh bridge and assign the lab IP to the bridge (not the VLAN
# interface - see below).
function setup_net {
  ip link add link $LAN name $VLAN type vlan id $VLANID
  ip link add name $BRIDGE type bridge
  ip link set $VLAN master $BRIDGE
  ip link set $BRIDGE up
  ip link set $VLAN up
  ip addr add $IP dev $BRIDGE
}

# Tear everything down again once I'm done with the lab for the day.
function teardown_net {
  ip link set $BRIDGE down
  ip link set $VLAN down
  ip link delete $BRIDGE
  ip link delete $VLAN
}

while [[ $# -gt 0 ]]; do
  case $1 in
    up)
      setup_net
      shift
      ;;
    down)
      teardown_net
      shift
      ;;
    *)
      echo "Unknown argument"
      exit 1
      ;;
  esac
done

This script creates two new network devices. The first one, called eth0.512, will serve as the VLAN interface, sitting “on top” of eth0, which is my physical NIC. The PHYSICAL.VLAN naming is only a convention, not a requirement. Then there’s the br bridge, which can be imagined as a “virtual switch” simulated by the Linux kernel. Multiple interfaces can be connected to it, and because eth0.512 is one of them, every interface connected to the bridge has access to the rest of the network.

This bridge is of the simple, VLAN-unaware kind - packets sent between the hosts on the bridge are not tagged. But any packets going out into the wider network leave via the eth0.512 interface, and consequently get tagged with the 512 VLAN ID.

Now, one very important fact is that the IP address needs to be assigned to the bridge, not to the VLAN interface. I initially had it assigned to the VLAN interface, and it did not work at all: packets never arrived on the router from the newly created VLAN 512 interface, and packets sent from other hosts to the IP assigned to that interface never arrived either. I’m honestly not really able to explain why that was. Which tells me, yet again, that at some point I need to take a tour through the Linux kernel’s networking stack.
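
For sanity-checking a setup like this, the standard iproute2 tools are enough. A few read-only commands, using the device names from the script above:

# VLAN interface details - the "vlan protocol 802.1Q id 512" line confirms the tagging
ip -d link show eth0.512

# Ports attached to the bridge - eth0.512 should be listed here
bridge link show

# The IP address should show up on the bridge, not on the VLAN interface
ip addr show dev br
ip addr show dev eth0.512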

VM setup

I had to think a lot about this part, surprisingly. My normal go-to tool for VMs has always been LXD. I ran my VMs via it for a couple of years during the “one host, multiple VMs” phase of my Homelab. Then I pulled it out again to supply some VMs during the k8s migration. I’m pretty comfortable with it, and I like that it has a Terraform provider so I could put my VM configs under version control.

In some previous desktop VM’ing, I had opted to set up the VM directly with the qemu-system command. But I wanted a little bit more structure this time, because I expect this lab to last a bit longer.

These were the two extremes I was thinking about - LXD (or rather, Incus), requiring a daemon to run and some additional setup, or a bash script for launching the VM via qemu-system. I was looking for something in the middle - without a daemon, but a bit less DIY than a bash script.

Initially, Vagrant looked exactly like what I was looking for. I was a bit dismayed when I saw that it is written in Ruby, though. Nothing wrong with Ruby, but it’s not something I have installed on my desktop. But I went ahead and got right to writing a Vagrantfile - just to find this note on Ubuntu’s Vagrant page:

Vagrant has been dropped by Ubuntu due to the adoption of the Business Source License (BSL). Following this change, Canonical will no longer publish Vagrant images directly starting with Ubuntu 24.04 LTS (Noble Numbat).

So much for that idea. And I didn’t want to run any other distro, as the entire Homelab is based on Ubuntu, and at least for now I don’t intend to change that. I then looked into tools like virsh. But that turned out to also require a daemon, namely libvirtd. And at that point I decided that Incus really was the best choice - at least I was already experienced with it, so I could spend more time on setting up Tinkerbell and less on setting up the lab.

With that decision made, I ran the Incus install:

emerge -av incus

The Gentoo Wiki has a good page on Incus. Following it, I also added my user to the groups required to use Incus directly:

usermod --append --groups incus,incus-admin <MYUSER>

Then I could launch Incus like this:

rc-service incus start

As mentioned, I only want the lab to be up while I’m actually working on it, so I did not configure the service to autostart.
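
For completeness: autostarting on OpenRC would mean adding the service to the default runlevel, which is exactly the step I skipped:

# not run - the lab should only be up on demand
rc-update add incus default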

Finally, I initialized Incus with this command:

incus admin init

I basically said “no” to everything, so I could set up stuff like default networking and the default storage provider in OpenTofu later and put that config under version control.
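
Before moving on, a couple of quick checks confirm that the daemon is reachable for my user - none of these should be specific to my setup:

# General server info - fails if the socket isn't reachable for this user
incus info

# List storage pools and profiles - sparse at this point, since I answered
# "no" to everything during init
incus storage list
incus profile list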

Setting up the Master VM with OpenTofu

To configure Incus, I made use of the OpenTofu Incus provider. I didn’t use the Incus CLI because I wanted to put the config under source control. Even though I’m still on Terraform for my Homelab as a whole, I decided to go with OpenTofu for the lab. I intended to keep the two states, Home(prod)lab and actual lab, separate anyway. And I saw this as a good chance to kick the tires on OpenTofu.

My OpenTofu main.tf looks like this:

terraform {
  backend "local" {
    path = ".terraform/terraform-main.tfstate"
  }

  required_providers {
    incus = {
      source = "lxc/incus"
      version = "0.3.1"
    }
  }
}

provider "incus" {
  remote {
    name = "local"
    default = true
    scheme = "unix"
  }
}

Nothing special here at all. So next, setting up some defaults for the VMs. First step: Some storage. I just went with local storage - in the name of not overcomplicating the lab setup unnecessarily (yes, I can see that smirk on your face right now):

resource "incus_storage_pool" "local-dir" {
  name = "local-dir"
  description = "Local host storage pool"
  driver = "dir"
  config = {
    source = "/var/lib/incus/storage-pools/local-dir"
  }
}

Next comes the base profile for the VMs:

resource "incus_profile" "base" {
  name = "base"

  config = {
    "boot.autostart" = false
    "cloud-init.vendor-data" = <<-EOT
#cloud-config
users:
  - name: ansible-user
    sudo: ALL=(ALL:ALL) ALL
    ssh_authorized_keys:
      - from="192.0.2.100" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOaxn8l16GNyBEgYzWO0BAko9fw8kkIq9tbels3hXdUt user@foo
    shell: /bin/bash
packages:
  - sudo
  - python3
  - openssh-server
chpasswd:
  expire: false
  users:
    - name: ansible-user
      password: password123
      type: text
EOT
  }

  device {
    name = "network"
    type = "nic"

    properties = {
      nictype = "bridged"
      parent = "br"
    }
  }
}

Let’s start with the network config. Here, I’m configuring the VM to make use of the br bridge I created above. The bridged device type will create a NIC which is part of the given bridge device, meaning it is connected to that bridge and will be able to use it to communicate with other connected hosts. In my setup, this config also allows all connected devices to communicate with the outside world.

Then there’s also the vendor-data config. This is a cloud-init configuration file. Cloud-init was originally introduced for Ubuntu, but has been adopted by a number of other distributions as well. Its main use is the initial configuration of a generic OS image. On systems supporting cloud-init, several stages run during boot, typically as systemd services. Those can configure the network, create users, install packages, set passwords and do a whole host of other things. Generally, these configs are only executed once, during the initial boot of the machine. Switching to cloud-init is one of the goals of my Tinkerbell migration. Up to now, I’ve been creating individual images for each new host, which contained pretty much only the above configuration. That was a bit of a waste, considering that I really only needed some very light customization, with the sole goal that after first boot, the machine would be ready for my main Ansible playbook to run.

This particular cloud-init config does exactly that. It installs sudo, Python and the OpenSSH server - surprisingly, the Incus Ubuntu images don’t come with SSH set up by default. Then I’m creating the ansible-user user, which is the user all of my Ansible playbooks use for connecting to the hosts in my Homelab. The config adds the user itself, sets the shell and adds my Ansible SSH key to the authorized_keys, allowing access only from my Command and Control host. The user also gets full sudo access. Finally, I’m setting a simple initial password, which is then changed to the actual password during the first Ansible playbook run. This is probably a bit unsafe and I plan to look into doing it better, but for now it serves reasonably well, because I need a password for sudo access even on that first run.

I’ve also got a small second profile, for creating hosts with disks:

resource "incus_profile" "disk-vms" {
  name = "disk-vms"

  device {
    name = "root"
    type = "disk"
    properties = {
      pool = "${incus_storage_pool.local-dir.name}"
      size = "20GB"
      path = "/"
    }
  }
}

These two profiles are separate because I will also need to test how my diskless netboot setup works with Tinkerbell provisioning. And honestly, I’ve got a bad feeling about it. But that’s for the future. 😬

The last thing to do: Actually create the VM.

resource "incus_instance" "master" {
  name = "master"
  type = "virtual-machine"
  image = "images:ubuntu/24.04/cloud"
  running = true
  profiles = [
    "${incus_profile.base.name}"
  ]

  device {
    name = "root"
    type = "disk"
    properties = {
      pool = "${incus_storage_pool.local-dir.name}"
      size = "50GB"
      path = "/"
    }
  }

  config = {
    "limits.cpu" = 6
    "limits.memory" = "16GB"
    "cloud-init.user-data" = <<-EOT
#cloud-config
hostname: master
EOT
  }
}

Nothing special about this config: it uses the previously discussed base profile and adds a 50 GB disk. I’ve configured it with 16 GB of RAM, similar to the Pi 5 which will ultimately host the setup.

A single tofu apply later, I had the main VM up and running, ready for the k3s install.
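
Before pointing Ansible at the new VM, it’s worth making sure that cloud-init has actually finished. Something along these lines does the trick with Incus (instance name as in the OpenTofu config above):

# Show the instance and its state
incus list master

# Block until cloud-init is done, then confirm the Ansible user exists
incus exec master -- cloud-init status --wait
incus exec master -- id ansible-user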

Setting up k3s

Tinkerbell is very much a Kubernetes application. Plus, I had started thinking that standardizing on deploying everything possible in Kubernetes would be a good thing. So regardless of whether Tinkerbell ultimately gets deployed or not, I want a Kubernetes cluster on my cluster master host. After looking through the current offerings, I decided on k3s as the Kubernetes distro to use. Mostly because it seems to be the standard. While I normally instinctively reach for the “vanilla” version of everything, I already know that kubeadm is not exactly friendly to single-node deployments.

For the deployment on the test VM, I adapted this Ansible role. With my adaptations, the role’s tasks/main.yml looks like this:

- name: Populate service facts
  ansible.builtin.service_facts:
- name: get k3s installed version
  ansible.builtin.command: k3s --version
  register: k3s_version_output
  changed_when: false
  ignore_errors: true
- name: set k3s installed version
  when: not ansible_check_mode and k3s_version_output.rc == 0
  ansible.builtin.set_fact:
    installed_k3s_version: "{{ k3s_version_output.stdout_lines[0].split(' ')[2] }}"
- name: Download artifact only if needed
  when: not ansible_check_mode and ( k3s_version_output.rc != 0 or installed_k3s_version is version(k3s_version, '<') )
  block:
    - name: Download K3s install script
      ansible.builtin.get_url:
        url: https://get.k3s.io/
        timeout: 120
        dest: /usr/local/bin/k3s-install.sh
        owner: root
        group: root
        mode: "0755"
    - name: Download K3s binary
      ansible.builtin.command:
        cmd: /usr/local/bin/k3s-install.sh
      environment:
        INSTALL_K3S_SKIP_START: "true"
        INSTALL_K3S_VERSION: "{{ k3s_version }}"
      changed_when: true
- name: Make config directory
  ansible.builtin.file:
    path: "/etc/rancher/k3s"
    mode: "0755"
    owner: root
    group: root
    state: directory
- name: Copy config file
  ansible.builtin.template:
    src: "k3s-config.yaml"
    dest: "/etc/rancher/k3s/config.yaml"
    mode: "0644"
    owner: root
    group: root
  register: _server_config_result
- name: Make data directory
  ansible.builtin.file:
    path: "{{ data_dir }}"
    mode: "0755"
    owner: root
    group: root
    state: directory
- name: Make volume directory
  ansible.builtin.file:
    path: "{{ volume_dir }}"
    mode: "0755"
    owner: root
    group: root
    state: directory
- name: Copy K3s service file
  ansible.builtin.copy:
    src: "k3s.service"
    dest: "/etc/systemd/system/k3s.service"
    owner: root
    group: root
    mode: "0644"
  register: service_file_single
- name: Restart K3s service
  when:
    - ansible_facts.services['k3s.service'] is defined
    - ansible_facts.services['k3s.service'].state == 'running'
    - service_file_single.changed or _server_config_result.changed
  ansible.builtin.systemd:
    name: k3s
    daemon_reload: true
    state: restarted
- name: Enable and check K3s service
  when: ansible_facts.services['k3s.service'] is not defined or ansible_facts.services['k3s.service'].state != 'running'
  ansible.builtin.systemd:
    name: k3s
    daemon_reload: true
    state: started
    enabled: true

The nice thing about this role is that it can handle updates reasonably well. It still feels a bit weird to use a bash script as part of the process, but it looks like that’s really the intended approach for deploying k3s. Worth noting here is the very first task:

- name: Populate service facts
  ansible.builtin.service_facts:

Without this, at least in my setup, later tasks checking ansible_facts.services do not work, because Ansible does not gather service data by default.
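
If you’re curious what service_facts actually collects, an ad-hoc run shows the structure of ansible_facts.services (host pattern and inventory path are just placeholders here):

ansible master -i inventory.yml -m ansible.builtin.service_facts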

The role also needs some variables defined, which I do in defaults/main.yml:

k3s_version: v1.33.1+k3s1
data_dir: "/srv/k3s/state"
volume_dir: "/srv/k3s/volumes"

The k3s.service file is also taken from the role:

[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target

[Install]
WantedBy=multi-user.target

[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s server

And then finally, there’s the k3s config file:

tls-san:
  - "k3s.example.com"
  - "192.0.2.100"
data-dir: "{{ data_dir }}"
cluster-cidr: "10.42.0.0/16"
service-cidr: "10.43.0.0/16"
flannel-backend: "wireguard-native"
default-local-storage-path: "{{ volume_dir }}"
disable: "servicelb"

Nothing too special here either. I decided to keep k3s’ default local-storage provider. The reason being that I need this cluster to be as independent of any other services as possible, because it’s going to be the place where I deploy everything that’s serving as the bedrock for the rest of the Homelab.

Besides that, the last notable action is disabling the servicelb load balancer. In short, this is k3s’ built-in implementation of a simple handler for LoadBalancer type k8s Services. I couldn’t use it because with it enabled, DHCP packets never made it to the Tinkerbell Pod. I will go into more detail about this in the next post of the series.

And after an ansible-playbook deployment.yml --limit master, I had a fully functional k3s cluster. It started up without any issue, deployed Traefik and was ready for more workloads. I like how little hassle this was, and I find myself agreeing with k3s’ claim of being a simple Kubernetes distribution. As far as such things can be simple. 😏
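
For poking at the cluster directly, k3s writes an admin kubeconfig to /etc/rancher/k3s/k3s.yaml on the server. A quick check looks roughly like this, either on the VM itself or from the desktop after copying that file over and swapping the 127.0.0.1 server address for the VM’s:

# On the master VM itself
sudo k3s kubectl get nodes

# From the desktop, with the copied (and adjusted) kubeconfig
kubectl --kubeconfig ./k3s.yaml get nodes -o wide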

Cluster connection setups

Before I finish this post, I would like to talk a little bit about how I configured access to the new k3s cluster, as it would be accessed from the same host as my main cluster. I ended up going with the alias route, using kubectl’s --context parameter.

Let’s first have a look at the updated ~/.kube/config file:

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: <BASE64 encoded data here>
    server: https://k8s.example.com:6443
  name: main-cluster
- cluster:
    server: https://k3s.example.com:6443
    certificate-authority-data: <Different BASE64 encoded data here>
  name: management-cluster
contexts:
- context:
    cluster: main-cluster
    user: main-admin
  name: main-admin@main-cluster
- context:
    cluster: management-cluster
    user: mgm-admin
  name: mgm-admin@management-cluster
current-context: main-admin@main-cluster
kind: Config
preferences: {}
users:
- name: main-admin
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1
      command: pass
      args:
        - show
        - main-creds
      interactiveMode: IfAvailable
- name: mgm-admin
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1
      command: pass
      args:
        - show
        - mgm-creds
      interactiveMode: IfAvailable

For more details on this config, and why pass appears in it, have a look at this post. Each cluster gets its own context definition, and each cluster has a different user.
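
Independent of the aliases below, kubectl can work with these contexts directly, which is handy for checking that the file is parsed correctly:

# List all contexts - the current one is marked with an asterisk
kubectl config get-contexts

# Run a one-off command against the management cluster without switching
kubectl --context mgm-admin@management-cluster get nodes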

The aliases for kubectl then look like this:

alias k=kubectl\ --context=main-admin@main-cluster
alias k-master=kubectl\ --context=mgm-admin@management-cluster

So with k, I’m getting my main cluster. I decided to keep the alias I had originally created for that cluster, instead of renaming it to e.g. k-main. I’ve started to question this decision, and would suggest that anyone looking to replicate my setup not re-use an old alias like this, because inevitably you will end up using the main cluster’s alias when you actually meant to talk to the management cluster.

Using k when wanting to do something with the k8s cluster has become pretty ingrained over the last year+.

One random comment for when you’re using a similar setup with autocompletion: Don’t surround the alias definition with quotation marks, e.g. like this:

alias k="kubectl --context=main-admin@main-cluster"

The alias itself will work, but autocompletion won’t. That’s why I’m using the \ syntax instead. Speaking of autocompletion: you need to explicitly tell bash to complete on the aliases as well, for example like this:

source ~/.kube/kubectl-comp

if [[ $(type -t compopt) = "builtin" ]]; then
    complete -o default -F __start_kubectl k k-master
else
    complete -o default -o nospace -F __start_kubectl k k-master
fi

The __start_kubectl function is defined in the autocomplete script provided by kubectl when running kubectl completion bash.
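
The completion file sourced above only needs to be generated once (and again after kubectl updates), for example like this:

kubectl completion bash > ~/.kube/kubectl-comp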

In the last post, I wrote about how I’m using Helmfile to manage the deployments on my Kubernetes cluster. Luckily, Helmfile already has an option to set the context right in the Helmfile:

helmDefaults:
  kubeContext: main-admin@main-cluster

This removes any danger of deploying to the wrong cluster, although commands like destroy might still be dangerous when I’ve got entries with the same names in both files. 😬

Finale

And that completes this part of the setup. The next one will be about the setup of Tinkerbell itself, and I will likely combine it with the provisioning of the first VM with Tinkerbell.