Wherein I migrate several IoT services over to Kubernetes.

This is part 18 of my k8s migration series.

This is going to be a short one. This weekend, I finished the migration of several IoT-related services to k8s. Mosquitto is my MQTT broker, handling messages from several sources. For me, it’s only a listener; I do not have any actual home automations. mqtt2prometheus then subscribes to that mosquitto instance to get the data my smart plugs and thermometers produce into my Prometheus instance. Finally, I also migrated my Zigbee2MQTT instance over to the k8s cluster. It controls my Zigbee transceiver and forwards the data from my thermometers to mosquitto.

If you’d like some more details on the power plug data gathering setup, have a look here. The post on my thermometer setup is still on the large pile of blog posts I’d like to write at some point.

As mentioned, this will be a short(er) post, as I only want to talk about a couple of issues I encountered along the way.

Self-made Helm chart

I decided to write my own Helm chart for these tools and manage them all in the same namespace. That just makes the setup a bit simpler: they don’t really need to talk to many other services, and none of the apps needs a database, for example.

So what does a Helm chart look like when I write it myself? The Chart.yaml is kept extremely simple:

apiVersion: v2
name: iot
description: The Homelab IoT services
type: application
version: 0.1.0

I don’t need anything more. I don’t even bother to change the Chart’s version when I change things.

The values.yaml file is also pretty sparse. I mostly use it for cases where I need a value in multiple places:

commonLabels:
  homelab/part-of: iot
ports:
  mosquitto: "1883"
  pwr: "9641"
  temp: "9642"
  z2m: "8080"
mqttHost: mqtt.example.com

And that’s it already.
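
For completeness, the chart itself is just the standard Helm layout, roughly like this. Only the three config template file names also show up later in the checksum annotations; the rest of the names here are illustrative:

iot/
├── Chart.yaml
├── values.yaml
└── templates/
    ├── mosquitto.yaml
    ├── mosquitto-users.yaml
    ├── pwr-exp-conf.yaml
    ├── temp-exp-conf.yaml
    ├── exporters.yaml
    ├── z2m-config.yaml
    └── zigbee2mqtt.yaml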

Mosquitto

As I said, I won’t detail every single manifest here. But one interesting part was that MQTT isn’t HTTP; it’s a purely TCP-based protocol. I’m still using Ingress mechanisms though, because Traefik does support TCP routes. In k8s, these are configured with the IngressRouteTCP CRD. With such a router config, some things are not available. E.g. if you don’t configure TLS, you cannot do host-based routing, because a plain TCP connection simply doesn’t tell you which host it was meant for. So when you want to use unencrypted TCP (or UDP), you have to create a separate Traefik entrypoint with its own port just for this route. Here’s the route manifest:

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: mosquitto
  annotations:
    external-dns.alpha.kubernetes.io/hostname: "{{ .Values.mqttHost }}"
    external-dns.alpha.kubernetes.io/target: "ingress.example.com"
spec:
  entryPoints:
    - mqtt
  routes:
    - match: HostSNI(`*`)
      services:
        - name: mosquitto
          kind: Service
          port: 1883

This connects Traefik’s 1883 port to mosquitto’s Service. All connections arriving on the mqtt entrypoint will be forwarded to mosquitto.
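
The mqtt entrypoint itself is not part of this chart; it has to be defined in Traefik’s static configuration. With the Traefik Helm chart, that looks roughly like this (a sketch, not a copy of my actual Traefik values):

# In the Traefik chart's values.yaml, not in my iot chart:
# opens an additional port 1883 on the Traefik service for plain MQTT.
ports:
  mqtt:
    port: 1883
    exposedPort: 1883
    protocol: TCP
    # Newer versions of the Traefik chart expect a map here,
    # older ones take a plain `expose: true`.
    expose:
      default: true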

If you do require TLS, Traefik can make use of Server Name Indication via the HostSNI matcher. But SNI is an extension to TLS, so not all software implementing TLS will support it. With TLS enabled, you can even run pure TLS connections over the same port Traefik is using for HTTPS. Such an IngressRouteTCP would look like this:

apiVersion: traefik.io/v1alpha1
kind: IngressRouteTCP
metadata:
  name: mosquitto-tls
spec:
  entryPoints:
    - websecure
  routes:
    - match: HostSNI(`{{ .Values.mqttHost }}`)
      services:
        - name: mosquitto
          kind: Service
          port: 1883
  tls: {}

Here, the websecure entrypoint is my standard HTTPS entrypoint. This still works as expected, even for pure TLS connections: Traefik uses the SNI to forward connections arriving for mqtt.example.com to mosquitto. The tls key at the end is important, even though it is empty. It tells Traefik to enable TLS with its default configuration, which uses my wildcard cert.

The most interesting part of the mosquitto setup was the creation of users. It uses a passwd-like file format, and I got “creative” when setting up the Nomad job. All of the users (admin user, scrapers, Zigbee2MQTT, my smart plugs) are in a directory in Vault, looking like this:

my-secrets/iot/mqtt/users/username1
my-secrets/iot/mqtt/users/username2
[...]

Each of those then only has a single key, secret, which contains the user’s password, already hashed with mosquitto_passwd. The problem now is: how do I get all of those into a single passwd file for mosquitto to use? The resulting file should look something like this:

user1:$7$foo_encrypted==
user2:$7$bar_encrypted==

It turns out that external-secrets has a pretty good templating engine, so I was actually able to do this. The finished ExternalSecret looks like this:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: mosquitto-users
spec:
  refreshInterval: "1m"
  secretStoreRef:
    name: my-secrets
    kind: ClusterSecretStore
  target:
    name: passwd
    template:
      data:
        passwd: |
          {{ `{{ range $name, $pass := . }}
          {{ $name }}:{{ with $pass | fromJson }}{{ .secret }}{{ end }}
          {{ end }}` }}          
  dataFrom:
    - find:
        path: my-secrets/iot/mqtt/users
        name:
          regexp: ".*"
      rewrite:
        - regexp:
            source: "my-secrets/iot/mqtt/users/(.*)"
            target: "$1"

Let’s start with the data fetching in dataFrom. It fetches all secrets below the users/ path and returns them in a map, akin to this:

resultMap:
  my-secrets/iot/mqtt/users/username1: {"secret": "foo"}
  my-secrets/iot/mqtt/users/username2: {"secret": "bar"}

This is a bit unfortunate, because to produce the right format I need the username as well. That’s what the rewrite: object gives me. It does a regex match on the whole path and gives me back only the last element, which is the username. The template itself then just iterates over the map and writes out the username and password in the right format.
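
So after the rewrite, the map from above effectively becomes this, which is what the template then renders into the passwd format:

resultMap:
  username1: {"secret": "foo"}
  username2: {"secret": "bar"}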

I’m repeatedly impressed by how many tight situations external-secrets has already helped me out of. After some fiddling, this is a good enough result.

One thing I found rather unfortunate though: there’s no way of defining the owner of a Secret mounted into a Pod as a volume. This means that the passwd file sits in the container world-readable. Not great. The only potential solution I found was introducing an init container to run chmod on the file. I skipped that for now, but I will have to take care of it at some point, because mosquitto already complains that the passwd file is world-readable and notes that such a setup will be rejected in the future.
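
For reference, that workaround would look something like this: a minimal sketch that copies the Secret into an emptyDir via an init container before mosquitto starts. The mount paths and the UID are my assumptions here, not taken from an actual manifest:

initContainers:
  - name: fix-passwd-perms
    image: alpine:3.21.2
    command:
      - sh
      - -c
      # Copy the passwd file out of the read-only Secret mount and make it
      # readable only by the mosquitto user (UID 1883 in the official image,
      # as far as I know).
      - cp /secret/passwd /writable/passwd && chown 1883:1883 /writable/passwd && chmod 0600 /writable/passwd
    volumeMounts:
      - name: passwd-secret
        mountPath: /secret
      - name: passwd-writable
        mountPath: /writable
volumes:
  - name: passwd-secret
    secret:
      secretName: passwd
  - name: passwd-writable
    emptyDir: {}

The mosquitto container would then mount the passwd-writable emptyDir instead of the Secret itself.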

Scraping MQTT data with Prometheus

I greatly enjoy my Prometheus data. I like looking at all of the plots in Grafana. There’s a reason it gets to occupy 200 GB of disk space. So I need to get my MQTT data, meaning power consumption from the smart plugs and temperature and humidity from the thermometers, into Prometheus. For this, I’m using mqtt2prometheus. I’ve currently got two instances running, one for my power plugs’ energy measurements and one for my thermometers’ temperature and humidity. I put both of them into one Pod, because having separate Pods for each of them seemed unnecessary.

The configuration of the power measurements exporter looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: pwr-exporter
  labels:
    {{- range $label, $value := .Values.commonLabels }}
    {{ $label }}: {{ $value | quote }}
    {{- end }}
data:
  config.yaml: |
    mqtt:
      server: tcp://{{ .Values.mqttHost }}:{{ .Values.ports.mosquitto }}
      user: promexport
      client_id: pwr-exporter
      topic_path: "plugs/tasmota/tele/#"
      device_id_regex: "plugs/tasmota/tele/(?P<deviceid>.*)/.*"
    metrics:
      - prom_name: mqtt_total_power_kwh
        mqtt_name: ENERGY.Total
        help: "Total power consumption (kWh)"
        type: counter
      - prom_name: mqtt_power
        mqtt_name: ENERGY.Power
        help: "Current consumption (W)"
        type: gauge
      - prom_name: mqtt_current
        mqtt_name: ENERGY.ApparentPower
        help: "Current (A)"
        type: gauge
      - prom_name: mqtt_yesterday_pwr
        mqtt_name: ENERGY.Yesterday
        help: "Yesterdays Total Power Consumption (kWh)"
        type: counter
      - prom_name: mqtt_today_pwr
        mqtt_name: ENERGY.Today
        help: "Todays Total Power Consumption (kWh)"
        type: counter    

And the one for the thermometers looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: temp-exporter
  labels:
    {{- range $label, $value := .Values.commonLabels }}
    {{ $label }}: {{ $value | quote }}
    {{- end }}
data:
  config.yaml: |
    mqtt:
      server: tcp://{{ .Values.mqttHost }}:{{ .Values.ports.mosquitto }}
      user: promexport
      client_id: temp-exporter
      topic_path: "zigbee2mqtt/temp/sonoff/#"
      device_id_regex: "zigbee2mqtt/temp/sonoff/(?P<deviceid>.*)"
    cache:
      timeout: 24h
    metrics:
      - prom_name: mqtt_temp_battery_percent
        mqtt_name: battery
        help: "Current battery percentage (percent)"
        type: gauge
        omit_timestamp: true
      - prom_name: mqtt_temp_humidity
        mqtt_name: humidity
        help: "Current humidity (percent)"
        type: gauge
        omit_timestamp: true
      - prom_name: mqtt_temp_temperature
        mqtt_name: temperature
        help: "Current temperature (C)"
        type: gauge
        omit_timestamp: true    

The configurations mostly consist of mapping values from the MQTT messages onto Prometheus metrics.
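
To make that mapping a bit more concrete, here is roughly what one of the plugs publishes and what the exporter then serves to Prometheus. The payload is abbreviated, and the exact label names are from memory rather than copied from my running setup:

# Published by a plug (abbreviated Tasmota telemetry message):
# topic:   plugs/tasmota/tele/plug1/SENSOR
# payload: {"Time":"...","ENERGY":{"Total":123.4,"Yesterday":2.3,"Today":1.2,"Power":9,"ApparentPower":14}}

# What mqtt2prometheus then exposes on its metrics endpoint, roughly:
mqtt_power{sensor="plug1"} 9
mqtt_total_power_kwh{sensor="plug1"} 123.4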

Here’s the deployment for the Pod:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: exporters
spec:
  replicas: 1
  selector:
    matchLabels:
      homelab/app: exporters
  strategy:
    type: "Recreate"
  template:
    metadata:
      labels:
        homelab/app: exporters
      annotations:
        checksum/pwr-config: {{ include (print $.Template.BasePath "/pwr-exp-conf.yaml") . | sha256sum }}
        checksum/temp-config: {{ include (print $.Template.BasePath "/temp-exp-conf.yaml") . | sha256sum }}
    spec:
      containers:
        - name: pwr-exporter
          image: ghcr.io/hikhvar/mqtt2prometheus:{{ .Values.mqtt2promVersion }}
          args:
            - "-config"
            - "/etc/mqtt2prom/config.yaml"
            - "-listen-port"
            - "{{ .Values.ports.pwr }}"
            - "-log-format"
            - "json"
          volumeMounts:
            - name: config-pwr
              mountPath: /etc/mqtt2prom
              readOnly: true
          env:
            - name: MQTT2PROM_MQTT_USER
              value: "promexport"
          envFrom:
            - secretRef:
                name: exporter-mosquitto-user
                optional: false
          ports:
            - name: pwr-exporter
              containerPort: {{ .Values.ports.pwr }}
              protocol: TCP
        - name: temp-exporter
          image: ghcr.io/hikhvar/mqtt2prometheus:{{ .Values.mqtt2promVersion }}
          args:
            - "-config"
            - "/etc/mqtt2prom/config.yaml"
            - "-listen-port"
            - "{{ .Values.ports.temp }}"
            - "-log-format"
            - "json"
          volumeMounts:
            - name: config-temp
              mountPath: /etc/mqtt2prom
              readOnly: true
          env:
            - name: MQTT2PROM_MQTT_USER
              value: "promexport"
          envFrom:
            - secretRef:
                name: exporter-mosquitto-user
                optional: false
          ports:
            - name: temp-exporter
              containerPort: {{ .Values.ports.temp }}
              protocol: TCP
      volumes:
        - name: config-pwr
          configMap:
            name: pwr-exporter
        - name: config-temp
          configMap:
            name: temp-exporter

I have again cut out some unimportant pieces. Luckily, mqtt2prometheus supports providing the credentials for MQTT access via environment variables, so I didn’t have to template the entire configuration file to avoid putting the credentials into git.
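
The exporter-mosquitto-user Secret referenced in envFrom is not shown here. Below is a minimal sketch of how it could be built with external-secrets; the Vault path is purely illustrative (it cannot be one of the hashed users/ entries from above, as the client needs the plaintext password), and I’m assuming mqtt2prometheus picks the password up from MQTT2PROM_MQTT_PASSWORD:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: exporter-mosquitto-user
spec:
  refreshInterval: "1m"
  secretStoreRef:
    name: my-secrets
    kind: ClusterSecretStore
  target:
    name: exporter-mosquitto-user
    template:
      data:
        # Illustrative: plaintext MQTT password for the promexport user.
        MQTT2PROM_MQTT_PASSWORD: "{{ `{{ .password }}` }}"
  data:
    - secretKey: password
      remoteRef:
        key: my-secrets/iot/mqtt/clients/promexport
        property: secret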

Finally, I also had to set up a network policy to allow my Prometheus deployment access to the Pod and its ports for scraping:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "exporters"
spec:
  endpointSelector:
    matchLabels:
      homelab/app: exporters
  ingress:
    - fromEndpoints:
      - matchLabels:
          io.kubernetes.pod.namespace: monitoring
          app.kubernetes.io/name: prometheus
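
One thing the manifests above don’t show is the Service sitting in front of the two exporter ports. A minimal sketch of what that could look like; the name is illustrative, and how Prometheus actually discovers the targets is not part of this post:

apiVersion: v1
kind: Service
metadata:
  name: exporters
  labels:
    {{- range $label, $value := .Values.commonLabels }}
    {{ $label }}: {{ $value | quote }}
    {{- end }}
spec:
  selector:
    homelab/app: exporters
  ports:
    - name: pwr-exporter
      port: {{ .Values.ports.pwr }}
      targetPort: pwr-exporter
    - name: temp-exporter
      port: {{ .Values.ports.temp }}
      targetPort: temp-exporter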

The Zigbee manager

My thermometers are connected via Zigbee, so I needed some way to transform their data into MQTT messages and send them to my mosquitto instance. I don’t use HomeAssistant, because it looks very much like overkill; I don’t actually control anything, I just want to gather a bit of data. Instead, I’m using Zigbee2MQTT. My Zigbee transceiver is connected via LAN, so I didn’t have to muck about with mounting a USB device into the Pod. Again, Zigbee2MQTT is a good piece of software here, because it allows me to set the config keys containing secrets via environment variables, while still letting me provide the non-secret options in the configuration file. Zigbee2MQTT requires three secrets:

  1. The MQTT credentials for access to mosquitto
  2. An auth token for access to the web UI
  3. A network key

I’m providing all three from my Vault instance in an ExternalSecret again:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: zigbee2mqtt
spec:
  refreshInterval: "1m"
  secretStoreRef:
    name: my-secrets
    kind: ClusterSecretStore
  target:
    name: zigbee2mqtt
    template:
      data:
        ZIGBEE2MQTT_CONFIG_FRONTEND_AUTH_TOKEN: "{{ `{{ .auth }}` }}"
        ZIGBEE2MQTT_CONFIG_MQTT_PASSWORD: "{{ `{{ .mqtt }}` }}"
        ZIGBEE2MQTT_CONFIG_ADVANCED_NETWORK_KEY: "[{{ `{{ .network }}` }}]"
  data:
    - secretKey: auth
      remoteRef:
        key: my-secrets/iot/zigbee2mqtt/auth
        property: secret
    - secretKey: mqtt
      remoteRef:
        key: my-secrets/iot/zigbee2mqtt/mqtt
        property: secret
    - secretKey: network
      remoteRef:
        key: my-secrets/iot/zigbee2mqtt/network-key
        property: secret

The complicated part of the Zigbee2MQTT deployment is the configuration file. Sadly, Zigbee2MQTT is one of those applications that need write access to their configuration file, which makes using a ConfigMap complicated, because those are always mounted read-only. In the case of Zigbee2MQTT, I don’t really care about the changes it makes to the file; I can just deploy my original file over them without an issue. But Zigbee2MQTT won’t even start if it can’t write to the config file.

First, the config map itself:

apiVersion: v1
kind: ConfigMap
metadata:
  name: zigbee2mqtt
data:
  configuration.yaml: |
    version: 4
    homeassistant:
      enabled: false
    permit_join: false

    frontend:
      enabled: true

    mqtt:
      base_topic: zigbee2mqtt
      server: 'mqtts://{{ .Values.mqttHost }}:443'
      user: foo
      client_id: "foobar"

    # Serial settings
    serial:
      port: 'tcp://my-zigbee-bridge:1234'
      adapter: zstack

    advanced:
      channel: 23
      log_output:
        - console
    devices:
      '0x123':
        friendly_name: 'temp/sonoff/thermo1'
        icon: device_icons/bdc2692122548ad0f2b0fb6c9f10a93d.png
      '0x456':
        friendly_name: 'temp/sonoff/thermo2'
        icon: device_icons/bdc2692122548ad0f2b0fb6c9f10a93d.png    

When new devices are connected, Zigbee2MQTT adds them to the devices: map, and I then just add them to the ConfigMap manually.

But how do I handle the fact that this config file needs to be writable? Init containers. Up to now, I’ve been living in blissful ignorance of such hacks, but that streak of good fortune had to end at some point. I just find it so incredibly ugly. Look at it:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: zigbee2mqtt
spec:
  replicas: 1
  selector:
    matchLabels:
      homelab/app: zigbee2mqtt
  strategy:
    type: "Recreate"
  template:
    metadata:
      labels:
        homelab/app: zigbee2mqtt
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/z2m-config.yaml") . | sha256sum }}
    spec:
      securityContext:
        fsGroup: 1000
      initContainers:
        - name: zigbee2mqtt-init
          image: alpine:3.21.2
          volumeMounts:
            - name: data
              mountPath: /data
            - name: config
              mountPath: /config
          command: ["cp", "/config/configuration.yaml", "/data/configuration.yaml"]
      containers:
        - name: zigbee2mqtt
          image: koenkk/zigbee2mqtt:{{ .Values.zigbee2mqttVersion }}
          volumeMounts:
            - name: data
              mountPath: /app/data
          resources:
            requests:
              cpu: 200m
              memory: 200Mi
          envFrom:
            - secretRef:
                name: zigbee2mqtt
                optional: false
          ports:
            - name: web
              containerPort: {{ .Values.ports.z2m }}
              protocol: TCP
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: z2m-volume
        - name: config
          configMap:
            name: zigbee2mqtt

I’m launching a whole other container just to run a single cp command that copies the mounted ConfigMap into the data volume. I wish we had some better way to do something like this. But it seems we don’t.

And that’s it for this one. I think wherever possible, I will keep the future migration posts in this format, not explaining every single line of every single YAML file anymore, but only pointing out interesting things, like the issue with the mosquitto credentials in this one. It’s more interesting to write, and I hope more interesting to read, than the umpteenth re-explanation of my CNPG DB setup.

Next up will be my Jellyfin media server. The copying of my media collection is already done, and hopefully I will get the actual migration completed today. That one will contain a lot of Grafana plots and Ceph performance musings. 🤓