A screenshot of a Grafana dashboard. It shows a number of stat panels at the top, for example the number of users and buckets and the total bytes sent in the interval. Below that, there are a number of time series panels, like number of operations over time, bytes sent or bytes received by bucket. I will describe each individual panel and its content in detail in the main post.

Gathering Metrics from Ceph RGW S3

Wherein I set up some Prometheus metrics gathering from Ceph’s S3 RGW and build a dashboard to show the data. I like metrics. And dashboards. And plots. And one of the things I’ve been missing up to now was data from Ceph’s RadosGateway. That’s the Ceph daemon which provides an S3 (and Swift) compatible API for Ceph clusters. While Rook, the tool I’m using to deploy Ceph in my k8s cluster, already wires up Ceph’s own exporters to be scraped by a Prometheus Operator, that does not include S3 data. My main interest here is how bucket sizes develop over time, so I can spot early when something is misconfigured. Up to now, the only indicator I had was the size of the pool backing the RadosGW, which currently stands at 1.42 TB, making it the second-largest pool in my cluster. ...

October 10, 2025 · 15 min · Michael

Replacing a Broken HDD in my Ceph Cluster

Back in July, I was greeted by this error on my Ceph dashboard while visiting family: A Ceph error you generally don’t want to see while you’re 400 km away from your Homelab. This error meant that during the nightly scrub, Ceph detected an error that was not trivially resolvable. ...

September 29, 2025 · 14 min · Michael

Sammelsurium I

Wherein I write down things that don’t feel like they should be their own post. My blogging notes are starting to really fill up with small topics I’d like to write about, but which don’t feel like they warrant their own post. On the other hand, they also don’t feel ephemeral enough to just be a Fediverse post. So I decided to introduce the Sammelsurium, which is the German word for a random collection of things. ...

May 1, 2025 · 5 min · Michael

What's next after the K8s Migration?

Wherein I go over my future plans for the Homelab, now that the k8s migration is finally done. So it’s done. The k8s migration is finally complete, and I can now get started with some other projects. Or, well, I can once I’ve updated my control plane Pis to Pi 5 with NVMe SSDs. But what to do then? As it turns out, I’ve got a very full backlog. I’m decidedly not in danger of boredom. ...

April 29, 2025 · 18 min · Michael
The HashiCorp Nomad and Kubernetes logos, connected with an arrow pointing from Nomad to Kubernetes

Nomad to k8s, Part 25: Control Plane Migration

Wherein I migrate my control plane to the Raspberry Pi 4 nodes it is intended to run on. This is part 26 of my k8s migration series. This one did not go remotely as well as I thought it would. Initially, I wasn’t even sure that this was going to be worth a blog post. But my own impatience and the slowly aging Pi 4 conspired to ensure I’ve got something to write about. ...

April 9, 2025 · 17 min · Michael