Gathering SNMP Metrics with the SNMP Exporter

I have been gathering metrics from my DrayTek Vigor 165 modem for a while now, and finally got around to documenting the setup, so now you get to read about it. I’m using the Vigor 165 to connect to the Internet via a Deutsche Telekom 250 Mbit/s VDSL connection. That modem supports SNMP and can provide metrics like the line speed or quality. A couple of years back, I wanted to get that data into my Grafana dashboards. After some searching, I came across the SNMP Exporter. ...

May 25, 2025 · 11 min · Michael
The Thanos logo. It is a T in a square with some squares under the T. Below that is the 'Thanos' name.

Setting up Thanos for Metrics Storage

At the time of writing, I have 328 GiB of Prometheus data. When it all started, I had about 250 GiB. I could stop gathering more data whenever I like. 😅 So I’ve got a lot of Prometheus data. Especially since I started the Kubernetes cluster - or rather, since I started scraping it - I had to regularly increase the size of the storage volume for Prometheus. This might very well be due to my 5 year retention. But part of it, as it will turn out later, was because some of the things I was scraping had a 10s scrape interval configured. ...

May 18, 2025 · 30 min · Michael
A screenshot of a Grafana time series plot. It shows the time between 23:30 and 09:00 for the throughput of my Ceph cluster. It tops out at almost 100 MB, but is on average more around 65 MB. The high throughput happens between approximately 00:00 and 08:50.

Ceph: My Story of Copying 1.7 TB from one Cluster to Another

A couple of weeks ago, I migrated my Jellyfin instance to my Kubernetes cluster. This involved copying my approximately 1.7 TB worth of media from the baremetal Ceph cluster to the new Rook Ceph cluster. And I’d like to dig a bit into the metrics and try to read them like the entrails of a slain beast during a full moon at the top of a misty mountain. Just this much, the portents don’t look good for one of my HDDs. ...

March 4, 2025 · 17 min · Michael
A Grafana gauge panel. It is red, with the text 98.5% below it.

Prometheus Metrics Cleanup

I had to clean up my Prometheus data, and it got pretty darned close there. When it comes to my metrics, I’m very much a data hoarder. Metrics gathering was what got me into Homelabbing as a hobby, instead of just a means to an end. Telegraf/Influx/Grafana were the first new services on my Homeserver in about five years. And I really do like looking at my dashboards, including looking at past data. My retention period currently is five years. And I’m already pretty sure that when I come up to those five years for the initial data, I will just extend that to 10 years. 😅 ...

June 28, 2024 · 8 min · Michael
The HashiCorp Nomad and Kubernetes logos, connected with an arrow pointing from Nomad to Kubernetes

Nomad to k8s, Part 10: Grafana

Wherein I migrate my Grafana instance over to k8s. This is part 11 of my k8s migration series. I already wrote about my love for metrics in the companion post about the Prometheus setup, so I will spare you my excitement about pretty graphs this time. 😉 For the Grafana setup, I used the kube-prometheus-stack’s integration of the Grafana Helm Chart. Database setup: The first step is to set up the database for Grafana. You can also run it locally, without an external database. Then, Grafana uses an SQLite DB. But the Postgres database made more sense to me. This was the first deployment of a production database with CloudNativePG and looked like this: ...

April 6, 2024 · 11 min · Michael