I have been gathering metrics from my DrayTek Vigor 165 modem for a while now, and finally got around to documenting the setup, so now you get to read about it.

I’m using the Vigor 165 to connect to the Internet via a Deutsche Telekom 250 Mbit/s VDSL connection. That modem supports SNMP and can provide metrics like the line speed or quality. A couple of years back, I wanted to get that data into my Grafana dashboards. After some searching, I came across the SNMP Exporter.

The way the exporter works is by regularly making SNMP requests to the targets and providing the data in the standard Prometheus format. And because it involves SNMP, the setup is a bit more involved than your average exporter.

SNMP

SNMP is the Simple Network Management Protocol. As the name implies, the protocol is intended for managing networking devices, like modems or routers. I’ve never worked professionally in the networking area, so I don’t have any experience with actively managing network devices like switches via SNMP. And as far as I’m aware, my modem is the only device that supports SNMP in my Homelab. So I won’t be discussing the configuration part here, only the read-only part my modem uses for providing metrics.

Information in SNMP is organized according to management information base (MIB) files, which specify what variables are available, and their hierarchy. Here is one such file, defining the variables for VDSL information: VDSL2-LINE-MIB. While these files can be read by humans, I have always stuck to websites like Observium for browsing them.

I think for the purposes of understanding the format, an example result from querying my modem is going to be more illuminating than me trying to describe what SNMP queries look like:

snmpbulkwalk -v 2c -c example 203.0.113.1
IF-MIB::ifType.1 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.4 = INTEGER: vdsl2(251)
IF-MIB::ifType.5 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.6 = INTEGER: propVirtual(53)
IF-MIB::ifType.7 = INTEGER: propVirtual(53)
IF-MIB::ifType.8 = INTEGER: propVirtual(53)
IF-MIB::ifType.9 = INTEGER: propVirtual(53)
IF-MIB::ifType.10 = INTEGER: propVirtual(53)
IF-MIB::ifType.11 = INTEGER: propVirtual(53)
IF-MIB::ifType.12 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.13 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifMtu.1 = INTEGER: 1500
IF-MIB::ifMtu.4 = INTEGER: 1500
IF-MIB::ifMtu.5 = INTEGER: 1500
IF-MIB::ifMtu.6 = INTEGER: 1500
IF-MIB::ifMtu.7 = INTEGER: 1500
IF-MIB::ifMtu.8 = INTEGER: 1500
IF-MIB::ifMtu.9 = INTEGER: 1500
IF-MIB::ifMtu.10 = INTEGER: 1500
IF-MIB::ifMtu.11 = INTEGER: 1500
IF-MIB::ifMtu.12 = INTEGER: 1500
IF-MIB::ifMtu.13 = INTEGER: 1500
IF-MIB::ifSpeed.1 = Gauge32: 1000000000
IF-MIB::ifSpeed.4 = Gauge32: 292016000
IF-MIB::ifSpeed.5 = Gauge32: 1000000000
IF-MIB::ifSpeed.6 = Gauge32: 0
IF-MIB::ifSpeed.7 = Gauge32: 0
IF-MIB::ifSpeed.8 = Gauge32: 0
IF-MIB::ifSpeed.9 = Gauge32: 1000000000
IF-MIB::ifSpeed.10 = Gauge32: 1000000000
IF-MIB::ifSpeed.11 = Gauge32: 1000000000
IF-MIB::ifSpeed.12 = Gauge32: 1000000000
IF-MIB::ifSpeed.13 = Gauge32: 1000000000

The snmpbulkwalk command has the advantage over other SNMP commands that it simply walks everything the target has to offer, so the OIDs (Object Identifiers) to be queried don’t have to be provided explicitly.

The above output shows a couple of values from my modem, regarding the setup of its network interfaces. You can for example see that interface .4 is the VDSL interface, while .1 and .5 are Ethernet interfaces. The ifSpeed object then shows the speed of each interface. The -v 2c parameter in the command selects the SNMP version to be used, while -c example defines the community. The community is a sort of identifier that the client sends along with each request.
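
To make the line format a bit more concrete, here is a minimal Python sketch that splits such output lines into their parts. The names `Varbind`, `parse_line` and `vdsl_interfaces` are made up for this illustration, and the regular expression only covers the simple `MIB::object.index = TYPE: value` shape shown above, not every variation snmpbulkwalk can print.

```python
import re
from typing import NamedTuple, Optional


class Varbind(NamedTuple):
    mib: str     # e.g. "IF-MIB"
    obj: str     # e.g. "ifType"
    suffix: str  # index part after the object name, e.g. "4"
    vtype: str   # e.g. "INTEGER"
    value: str   # e.g. "vdsl2(251)"


# Shape of one output line: MIB::object.index = TYPE: value
LINE_RE = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<obj>\w+)\.(?P<suffix>[\w.]+)"
    r" = (?P<vtype>[\w-]+): (?P<value>.*)$"
)


def parse_line(line: str) -> Optional[Varbind]:
    """Return the parsed line, or None if it does not match the simple shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return Varbind(m["mib"], m["obj"], m["suffix"], m["vtype"], m["value"])


def vdsl_interfaces(output: str) -> list:
    """Collect the interface indexes whose ifType is vdsl2."""
    parsed = (parse_line(line) for line in output.splitlines())
    return [v.suffix for v in parsed
            if v and v.obj == "ifType" and v.value.startswith("vdsl2")]


sample = """\
IF-MIB::ifType.1 = INTEGER: ethernetCsmacd(6)
IF-MIB::ifType.4 = INTEGER: vdsl2(251)
IF-MIB::ifSpeed.4 = Gauge32: 292016000
"""
print(vdsl_interfaces(sample))  # ['4']
```

The index after the object name is what ties the different objects together: ifType.4, ifMtu.4 and ifSpeed.4 all describe the same interface.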

Until version 3, SNMP does not support any real authentication. The community string is sent in plain text and works as a shared password at best, so nothing else needs to be provided. As my modem only supports querying but not configuration, that’s okay.

The interesting information for me comes a bit later in the output of the same command:

[...]
SNMPv2-SMI::transmission.94.1.1.3.1.7.4 = INTEGER: -3
SNMPv2-SMI::transmission.94.1.1.3.1.8.4 = Gauge32: 51374000
SNMPv2-SMI::transmission.94.1.1.4.1.1.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.4.1.2.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.4.1.3.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.4.1.4.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.5.1.1.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.5.1.2.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.5.1.3.4 = Gauge32: 0
SNMPv2-SMI::transmission.94.1.1.5.1.4.4 = Gauge32: 0
SNMPv2-SMI::transmission.251.1.2.2.1.1.4.2 = INTEGER: 2
SNMPv2-SMI::transmission.251.1.2.2.1.2.4.1 = Gauge32: 292016000
SNMPv2-SMI::transmission.251.1.2.2.1.2.4.2 = Gauge32: 46718000
SNMPv2-SMI::transmission.251.1.2.2.1.3.4.1 = Gauge32: 0
SNMPv2-SMI::transmission.251.1.2.2.1.3.4.2 = Gauge32: 0
SNMPv2-SMI::transmission.251.1.2.2.1.4.4.1 = INTEGER: 9
SNMPv2-SMI::transmission.251.1.2.2.1.4.4.2 = INTEGER: 0
SNMPv2-SMI::transmission.251.1.2.2.1.5.4.1 = INTEGER: 305
SNMPv2-SMI::transmission.251.1.2.2.1.5.4.2 = INTEGER: 150
SNMPv2-SMI::transmission.251.1.2.2.1.6.4.1 = INTEGER: 0
SNMPv2-SMI::transmission.251.1.2.2.1.6.4.2 = INTEGER: 0
[...]

These outputs, on their own, are not really that useful. Note especially the 251.1.2.2.1 and 94.1.1.3.1 as part of the OIDs. That indicates that snmpbulkwalk did not have the necessary MIBs to properly decode the information received from my modem. This can be fixed by making those MIBs available. Helpfully, DrayTek provides the supported MIBs for the devices on their website. Namely, the following MIBs are supported:

All of the links go to mibbrowser, which I’ve found to be a useful site for downloading MIBs. To make snmpbulkwalk use those additional MIBs, download them into a local directory and start snmpbulkwalk like this:

snmpbulkwalk -M /path/to/downloaded/mibs/:/usr/share/snmp/mibs -m ALL -v 2c -c example 203.0.113.1

The /usr/share/snmp/mibs is the path to some general MIBs on my Gentoo system, so you don’t have to download the more common MIBs.

With that invocation, the above example becomes a lot clearer:

ADSL-LINE-MIB::adslAturCurrOutputPwr.4 = INTEGER: -3 tenth dBm
ADSL-LINE-MIB::adslAturCurrAttainableRate.4 = Gauge32: 51192000 bps
ADSL-LINE-MIB::adslAtucChanInterleaveDelay.4 = Gauge32: 0 milli-seconds
ADSL-LINE-MIB::adslAtucChanCurrTxRate.4 = Gauge32: 0 bps
ADSL-LINE-MIB::adslAtucChanPrevTxRate.4 = Gauge32: 0 bps
ADSL-LINE-MIB::adslAtucChanCrcBlockLength.4 = Gauge32: 0 byte
ADSL-LINE-MIB::adslAturChanInterleaveDelay.4 = Gauge32: 0 milli-seconds
ADSL-LINE-MIB::adslAturChanCurrTxRate.4 = Gauge32: 0 bps
ADSL-LINE-MIB::adslAturChanPrevTxRate.4 = Gauge32: 0 bps
ADSL-LINE-MIB::adslAturChanCrcBlockLength.4 = Gauge32: 0
VDSL2-LINE-MIB::xdsl2ChStatusUnit.4.xtur = INTEGER: xtur(2)
VDSL2-LINE-MIB::xdsl2ChStatusActDataRate.4.xtuc = Gauge32: 292016000 bits/second
VDSL2-LINE-MIB::xdsl2ChStatusActDataRate.4.xtur = Gauge32: 46718000 bits/second
VDSL2-LINE-MIB::xdsl2ChStatusPrevDataRate.4.xtuc = Gauge32: 0 bits/second
VDSL2-LINE-MIB::xdsl2ChStatusPrevDataRate.4.xtur = Gauge32: 0 bits/second
VDSL2-LINE-MIB::xdsl2ChStatusActDelay.4.xtuc = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 9
VDSL2-LINE-MIB::xdsl2ChStatusActDelay.4.xtur = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 0
VDSL2-LINE-MIB::xdsl2ChStatusActInp.4.xtuc = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 305
VDSL2-LINE-MIB::xdsl2ChStatusActInp.4.xtur = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 150
VDSL2-LINE-MIB::xdsl2ChStatusInpReport.4.xtuc = INTEGER: 0
VDSL2-LINE-MIB::xdsl2ChStatusInpReport.4.xtur = INTEGER: 0

The addition of the correct MIBs allows snmpbulkwalk to correctly interpret the values coming from the modem. I’m not 100% sure what the Wrong Type errors are about, but I assume that the modem just doesn’t implement the MIB quite correctly.
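
What the MIBs add is essentially a lookup from numeric OID prefixes to names. The following Python sketch imitates that with a hand-written table. The table only contains the handful of objects that appear in this post, and `KNOWN_OBJECTS` and `translate` are names invented for the illustration. A real MIB parser does far more, such as turning the trailing 1 and 2 into xtuc and xtur, or attaching units.

```python
# Numeric OID prefix -> symbolic name, taken from the outputs in this post.
# 1.3.6.1.2.1.10 is the "transmission" subtree seen in the undecoded output.
KNOWN_OBJECTS = {
    "1.3.6.1.2.1.10.94.1.1.3.1.4": "ADSL-LINE-MIB::adslAturCurrSnrMgn",
    "1.3.6.1.2.1.10.94.1.1.3.1.5": "ADSL-LINE-MIB::adslAturCurrAtn",
    "1.3.6.1.2.1.10.94.1.1.3.1.6": "ADSL-LINE-MIB::adslAturCurrStatus",
    "1.3.6.1.2.1.10.94.1.1.3.1.7": "ADSL-LINE-MIB::adslAturCurrOutputPwr",
    "1.3.6.1.2.1.10.94.1.1.3.1.8": "ADSL-LINE-MIB::adslAturCurrAttainableRate",
    "1.3.6.1.2.1.10.251.1.2.2.1.2": "VDSL2-LINE-MIB::xdsl2ChStatusActDataRate",
}


def translate(oid: str) -> str:
    """Replace the longest known prefix of a numeric OID with its name."""
    parts = oid.split(".")
    # Try the full OID first, then ever shorter prefixes.
    for cut in range(len(parts), 0, -1):
        prefix = ".".join(parts[:cut])
        if prefix in KNOWN_OBJECTS:
            suffix = ".".join(parts[cut:])
            name = KNOWN_OBJECTS[prefix]
            return f"{name}.{suffix}" if suffix else name
    return oid  # nothing known about this OID, leave it numeric


print(translate("1.3.6.1.2.1.10.94.1.1.3.1.7.4"))
print(translate("1.3.6.1.2.1.10.251.1.2.2.1.2.4.1"))
```

Whatever is left over after the known prefix is the index, here the interface number 4 and, for the VDSL2 object, the unit.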

Configuring the SNMP exporter

As I’ve noted above, the configuration of the SNMP exporter is a bit more involved.

First, a generator config needs to be created. In my case, it looked like this:

auths:
  draytek:
    community: example
    version: 2
modules:
  draytek:
    walk:
      - 1.3.6.1.2.1.10.94.1.1.3.1.6.4
      - 1.3.6.1.2.1.10.94.1.1.3.1.4.4
      - 1.3.6.1.2.1.10.94.1.1.3.1.5.4
      - 1.3.6.1.2.1.1.5
      - 1.3.6.1.2.1.10.251.1.2.2.1.2.4.1
      - 1.3.6.1.2.1.10.251.1.2.2.1.2.4.2

As you can see, I’m only declaring a handful of values I actually want to gather. I’m skipping most of the per-interface data, because I can already get that data from the PPPoE interface on my OPNsense router. I’ve restricted my gathering to the modem-specific data:

  • 1.3.6.1.2.1.10.94.1.1.3.1.6: Current status of the ADSL line
  • 1.3.6.1.2.1.10.94.1.1.3.1.4: Current noise margin on the line
  • 1.3.6.1.2.1.10.94.1.1.3.1.5: Current attenuation on the line
  • 1.3.6.1.2.1.1.5: System name
  • 1.3.6.1.2.1.10.251.1.2.2.1.2: Current actual data rate on the line, Up and Down

With that defined, the SNMP exporter config file can be generated. But first, I needed to build the generator. For this, I cloned the SNMP exporter repo and switched into the generator/ directory. There, I ran this command:

make generator

And then finally, I was able to run the command to generate the SNMP exporter config file:

./generator generate -m /PATH/TO/MIB/DIR -g /PATH/TO/generator.yml -o snmp.yaml

With the configuration above, the result looks like this:

# WARNING: This file was auto-generated using snmp_exporter generator, manual changes will be lost.
auths:
  draytek:
    community: example
    security_level: noAuthNoPriv
    auth_protocol: MD5
    priv_protocol: DES
    version: 2
modules:
  draytek:
    get:
    - 1.3.6.1.2.1.1.5.0
    - 1.3.6.1.2.1.10.251.1.2.2.1.2.4.1
    - 1.3.6.1.2.1.10.251.1.2.2.1.2.4.2
    - 1.3.6.1.2.1.10.94.1.1.3.1.4.4
    - 1.3.6.1.2.1.10.94.1.1.3.1.5.4
    - 1.3.6.1.2.1.10.94.1.1.3.1.6.4
    metrics:
    - name: sysName
      oid: 1.3.6.1.2.1.1.5
      type: DisplayString
      help: An administratively-assigned name for this managed node - 1.3.6.1.2.1.1.5
    - name: xdsl2ChStatusActDataRate
      oid: 1.3.6.1.2.1.10.251.1.2.2.1.2
      type: gauge
      help: The actual net data rate at which the bearer channel is operating, if
        in L0 power management state - 1.3.6.1.2.1.10.251.1.2.2.1.2
      indexes:
      - labelname: ifIndex
        type: gauge
      - labelname: xdsl2ChStatusUnit
        type: gauge
        enum_values:
          1: xtuc
          2: xtur
    - name: adslAturCurrSnrMgn
      oid: 1.3.6.1.2.1.10.94.1.1.3.1.4
      type: gauge
      help: Noise Margin as seen by this ATU with respect to its received signal in
        tenth dB. - 1.3.6.1.2.1.10.94.1.1.3.1.4
      indexes:
      - labelname: ifIndex
        type: gauge
    - name: adslAturCurrAtn
      oid: 1.3.6.1.2.1.10.94.1.1.3.1.5
      type: gauge
      help: Measured difference in the total power transmitted by the peer ATU and
        the total power received by this ATU. - 1.3.6.1.2.1.10.94.1.1.3.1.5
      indexes:
      - labelname: ifIndex
        type: gauge
    - name: adslAturCurrStatus
      oid: 1.3.6.1.2.1.10.94.1.1.3.1.6
      type: Bits
      help: Indicates current state of the ATUR line - 1.3.6.1.2.1.10.94.1.1.3.1.6
      indexes:
      - labelname: ifIndex
        type: gauge
      enum_values:
        0: noDefect
        1: lossOfFraming
        2: lossOfSignal
        3: lossOfPower
        4: lossOfSignalQuality

This file defines the translation of the OID values to Prometheus metrics. The output in Prometheus format ultimately looks like this:

# HELP adslAturCurrAtn Measured difference in the total power transmitted by the peer ATU and the total power received by this ATU. - 1.3.6.1.2.1.10.94.1.1.3.1.5
# TYPE adslAturCurrAtn gauge
adslAturCurrAtn{ifIndex="4"} 5
# HELP adslAturCurrSnrMgn Noise Margin as seen by this ATU with respect to its received signal in tenth dB. - 1.3.6.1.2.1.10.94.1.1.3.1.4
# TYPE adslAturCurrSnrMgn gauge
adslAturCurrSnrMgn{ifIndex="4"} 8
# HELP adslAturCurrStatus Indicates current state of the ATUR line - 1.3.6.1.2.1.10.94.1.1.3.1.6 (Bits)
# TYPE adslAturCurrStatus gauge
adslAturCurrStatus{adslAturCurrStatus="lossOfFraming",ifIndex="4"} 0
adslAturCurrStatus{adslAturCurrStatus="lossOfPower",ifIndex="4"} 0
adslAturCurrStatus{adslAturCurrStatus="lossOfSignal",ifIndex="4"} 0
adslAturCurrStatus{adslAturCurrStatus="lossOfSignalQuality",ifIndex="4"} 0
adslAturCurrStatus{adslAturCurrStatus="noDefect",ifIndex="4"} 0
# HELP sysName An administratively-assigned name for this managed node - 1.3.6.1.2.1.1.5
# TYPE sysName gauge
sysName{sysName="foobar"} 1
# HELP xdsl2ChStatusActDataRate The actual net data rate at which the bearer channel is operating, if in L0 power management state - 1.3.6.1.2.1.10.251.1.2.2.1.2
# TYPE xdsl2ChStatusActDataRate gauge
xdsl2ChStatusActDataRate{ifIndex="4",xdsl2ChStatusUnit="1"} 2.92016e+08
xdsl2ChStatusActDataRate{ifIndex="4",xdsl2ChStatusUnit="2"} 4.6718e+07
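
If you want to poke at this output outside of Prometheus, a few lines of Python are enough for the simple cases. This is only a rough sketch: `parse_samples` is a made-up helper, it ignores the HELP and TYPE comments, and it does not handle escaped quotes or timestamps, which a proper parser for the exposition format would.

```python
import re

# metric_name{label="value",...} number   (the label part is optional)
SAMPLE_RE = re.compile(r'^(?P<name>\w+)(?:\{(?P<labels>[^}]*)\})? (?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str) -> list:
    """Return (name, labels, value) tuples for every sample line in text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        m = SAMPLE_RE.match(line)
        if m is None:
            continue  # not a sample line in the simple shape handled here
        labels = dict(LABEL_RE.findall(m["labels"] or ""))
        samples.append((m["name"], labels, float(m["value"])))
    return samples


text = """\
# TYPE xdsl2ChStatusActDataRate gauge
xdsl2ChStatusActDataRate{ifIndex="4",xdsl2ChStatusUnit="1"} 2.92016e+08
xdsl2ChStatusActDataRate{ifIndex="4",xdsl2ChStatusUnit="2"} 4.6718e+07
"""
for name, labels, value in parse_samples(text):
    print(name, labels, value)
```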

The one thing that I could never really get to work is the adslAturCurrStatus value. It should be showing the current state of the line, indicating whether the DSL line itself is up or not. But I never got it to show anything. All values are always zero, even though I would have expected the noDefect value to be 1 when everything is alright. But it only ever gave me zeroes.
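
For context, this is roughly how a value of type Bits is turned into one 0/1 series per named flag. The sketch below follows the usual SNMP convention that bit 0 is the most significant bit of the first octet. It is not the exporter's actual code, `decode_bits` is a hypothetical helper, and the byte strings are made-up inputs, not values captured from the modem.

```python
from typing import Dict

# Bit position -> name, as listed under enum_values in the generated config.
STATUS_BITS = {
    0: "noDefect",
    1: "lossOfFraming",
    2: "lossOfSignal",
    3: "lossOfPower",
    4: "lossOfSignalQuality",
}


def decode_bits(raw: bytes, names: Dict[int, str]) -> Dict[str, int]:
    """Map each named bit to 0 or 1; bit 0 is the top bit of the first octet."""
    result = {}
    for bit, name in names.items():
        octet, offset = divmod(bit, 8)
        if octet < len(raw):
            result[name] = (raw[octet] >> (7 - offset)) & 1
        else:
            result[name] = 0  # value too short to contain this bit
    return result


print(decode_bits(b"\x80", STATUS_BITS))  # only noDefect is 1
print(decode_bits(b"\x00", STATUS_BITS))  # everything is 0
print(decode_bits(b"", STATUS_BITS))      # everything is 0
```

Under this convention, an empty or all-zero value yields zeroes for every flag, including noDefect. Whether that is what happens with this modem is a guess and would need a look at the raw value it returns.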

One nice thing to see: I’m actually getting slightly faster downstream service than what I’m paying for. I’ve got a 250/40 contract, but I’m getting 292 Mbit/s down and 46 Mbit/s up. I think that might be because there aren’t many other users on this connection overall, as most people in the neighborhood have cable Internet instead.
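
As a quick sanity check on those numbers, here is the arithmetic in Python. The function name `percent_over_contract` is made up for this example, and the inputs are simply the two xdsl2ChStatusActDataRate values from the output above, compared against the 250/40 contract.

```python
def percent_over_contract(actual_bps: float, contract_mbit: float) -> float:
    """How far the reported rate lies above the contract rate, in percent."""
    actual_mbit = actual_bps / 1_000_000  # bits/second -> Mbit/s
    return (actual_mbit / contract_mbit - 1) * 100


down = percent_over_contract(292_016_000, 250)
up = percent_over_contract(46_718_000, 40)
print(f"down: +{down:.1f}%, up: +{up:.1f}%")  # down: +16.8%, up: +16.8%
```

Both directions come out at about 17% above the contracted rate.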

With the configuration file in hand, here is the Deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: snmp-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      homelab/app: snmp-exporter
  strategy:
    type: "Recreate"
  template:
    metadata:
      labels:
        homelab/app: snmp-exporter
      annotations:
        checksum/config: {{ include (print $.Template.BasePath "/exporter-config.yaml") . | sha256sum }}
    spec:
      automountServiceAccountToken: false
      securityContext:
        fsGroup: 1000
      containers:
        - name: snmp-exporter
          image: prom/snmp-exporter:{{ .Values.appVersion }}
          args:
            - "--log.format=json"
            - "--config.file=/etc/snmp_exporter/snmp.yml"
          volumeMounts:
            - name: config
              mountPath: /etc/snmp_exporter
              readOnly: true
          resources:
            requests:
              cpu: 100m
              memory: 50Mi
          livenessProbe:
            httpGet:
              port: {{ .Values.port }}
              path: "/-/healthy"
            initialDelaySeconds: 15
            periodSeconds: 30
          ports:
            - name: snmp-scrape
              containerPort: {{ .Values.port }}
              protocol: TCP
      volumes:
        - name: config
          configMap:
            name: exporter-config

The more interesting configuration is the ScrapeConfig for the Prometheus operator. Because if you look back at the generated config file, you will find something missing: A declaration of a target. This is instead done as part of the scrape config, which in my case looks like this:

apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: scraping-modem
  labels:
    prometheus: scrape-modems
spec:
  staticConfigs:
    - labels:
        job: modemmetrics
      targets:
        - 203.0.113.1
  metricsPath: /snmp
  scrapeInterval: 1m
  params:
    module:
      - draytek
    auth:
      - draytek
  relabelings:
    - sourceLabels:
        - "__address__"
      targetLabel: __param_target
    - targetLabel: instance
      replacement: modemnamehere
    - targetLabel: __address__
      replacement: "snmp-exporter.snmp-exporter.svc.cluster.local:9116"
  metricRelabelings:
    - sourceLabels:
        - "__name__"
      action: drop
      regex: snmp_scrape_.*
    - sourceLabels:
        - "__name__"
      action: drop
      regex: sysName
    - sourceLabels:
        - "__name__"
      action: drop
      regex: scrape_.*

It’s important to note the IP under targets is the IP of the modem, not the IP of the SNMP exporter. This is different to how e.g. the node exporter works. There, the targets are the machines which run the exporter. Instead, the __address__ label needs to be replaced in a relabeling so that Prometheus contacts the exporter.
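
The effect of the three relabelings can be mimicked with a plain dictionary. This is only a sketch of what the rules above do to the labels of this one target, not how Prometheus implements relabeling, and `apply_relabelings` is a name made up for the example.

```python
EXPORTER = "snmp-exporter.snmp-exporter.svc.cluster.local:9116"


def apply_relabelings(labels: dict) -> dict:
    """Apply the three relabeling rules from the ScrapeConfig, in order."""
    out = dict(labels)
    # 1. Copy the configured target (the modem) into the "target" URL parameter.
    out["__param_target"] = out["__address__"]
    # 2. Give the resulting series a readable instance label.
    out["instance"] = "modemnamehere"
    # 3. Send the actual HTTP request to the exporter instead of the modem.
    out["__address__"] = EXPORTER
    return out


before = {"__address__": "203.0.113.1", "job": "modemmetrics"}
after = apply_relabelings(before)
print(after["__param_target"])  # 203.0.113.1
print(after["__address__"])     # snmp-exporter.snmp-exporter.svc.cluster.local:9116
```

The order matters: the modem's IP has to be copied into __param_target before __address__ is overwritten with the exporter's address.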

The params.module and params.auth parameters define the sections from the SNMP exporter’s config file to be used for this scrape job. This way, you can have multiple sections for different types of devices in one exporter’s config and control the target+module/auth combinations in the scrape config. To be honest, this way of configuring is a bit weird to me. I would have rather expected the different targets to be defined in the exporter’s config and then be supplied with an identifying label in the metrics.
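
Put together, each scrape boils down to a single HTTP request against the exporter, with the modem, module and auth section passed as query parameters. The following Python sketch builds that URL from the values in the scrape config above. The helper name `scrape_url` is invented for the illustration, and the order of the query parameters is arbitrary.

```python
from urllib.parse import urlencode


def scrape_url(exporter: str, target: str, module: str, auth: str,
               path: str = "/snmp") -> str:
    """Build the URL that is requested from the exporter for one target."""
    query = urlencode({"auth": auth, "module": module, "target": target})
    return f"http://{exporter}{path}?{query}"


url = scrape_url(
    exporter="snmp-exporter.snmp-exporter.svc.cluster.local:9116",
    target="203.0.113.1",  # the modem, not the exporter
    module="draytek",
    auth="draytek",
)
print(url)
```

Requesting such a URL by hand, for example with curl from inside the cluster, is a handy way to check the exporter config before involving Prometheus at all.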

Results

Sadly, I don’t really have any interesting plots to show here, save perhaps for this one, which shows that my line got a lot better towards the end of 2023:

A screenshot of a Grafana time series plot. It shows two series, the signal-to-noise ratio and the attenuation of the VDSL line which delivered you this blog post. The date is back in October 2023. Both values are very stable, with almost no fluctuation at all. The attenuation starts at 10 dB, and the SNR at 6 dB. On October 12th around 07:00, the line suddenly improves in both values. The attenuation is reduced to a value of 5 dB, while the SNR improves to 8. Both continue with these values until the end of the plot.

Attenuation and Signal-to-Noise ratio of my VDSL line. Attenuation in green starting at 10 dB, SNR in yellow starting at 6.

The quality of my line markedly improved around October 12th. I don’t know what might have changed here, but attenuation went down by 5 dB and Signal-to-Noise ratio improved by 2 dB. That didn’t come with any improvements for me, though.

And that’s it for today. Another small step in my quest to monitor absolutely everything. 😁