Wherein I talk about the Ingress setup for my Homelab’s k8s cluster with Traefik.
This is part four of my k8s migration series.
After the initial setup of some infrastructure like external-dns and external-secrets, I went to work on the Ingress implementation for my cluster.
I chose Traefik as my Ingress controller. This was mostly driven by the fact that I’m already using Traefik as the proxy in front of my current Nomad cluster, and I’ve become quite familiar with it.
One big advantage of Traefik is the extensive support for a wide array of what they call Configuration Providers. In my current Nomad setup, I’m making use of the Consul provider. In comparison to software like Nginx or Apache, I can put all proxy-related config into the service block of my Nomad jobs, as tags on the Consul service definition. This centralizes the entire config related to a specific service, instead of splitting it over two places: the config for the service’s deployment, and the proxy config.
Networking options
While planning my k8s cluster, I considered two different ways of doing networking for the Ingress. The first one is to simply have the proxy use the host’s networking. This is the setup that I’m currently working with in my Nomad setup: I’ve got the Traefik job pegged to a single host, and then I’ve got a hard-coded A entry in my DNS pointing to that machine. Traefik then listens on port 443 and so forth, and I add CNAME entries to DNS for the other services running through that proxy.
I set Traefik up the same way during my k8s experiments. But this has one large downside: no high availability. If the ingress host goes down, not only is Traefik down, but so are all services served through it. That doesn’t bother me too much, but with k8s, I had a different option: Services of type LoadBalancer. This has the advantage that I no longer have to restrict Traefik to a specific host to get a stable IP to point all the DNS entries at. Instead, the stable IP is now supplied by Cilium, which also announces routes to those IPs to my router.
The one downside of the LoadBalancer approach is that source IPs are not necessarily preserved. This makes functionality like IP allow lists in Traefik pretty useless.
The fix for this is to use externalTrafficPolicy: Local on the Service. This config ensures that Cilium only announces the IPs of the hosts which currently run a Traefik pod, so the cluster-internal, source-NAT’ed routing does not apply and source IPs are preserved.
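For completeness: the stable IPs themselves come out of a Cilium LB IPAM pool. A minimal sketch of such a pool might look like this (the pool name and CIDR are made up for illustration; depending on your Cilium version, the cidrs field may be called blocks):
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "homelab-pool"
spec:
  # CIDR from which Cilium assigns LoadBalancer IPs
  cidrs:
    - cidr: "10.86.55.0/24"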
Deployment
I’m using the official Helm chart for my deployment. Currently, I’m only running a single replica, but that might change in the future.
I will go through my values.yaml file piece by piece, to make the explanation a bit more manageable.
Let’s start with the values for the Deployment itself:
deployment:
  healthchecksPort: 4435
  podLabels:
    homelab/ingress: "true"
additionalArguments:
  - "--ping.entryPoint=health"
  - "--providers.kubernetesingress.ingressendpoint.hostname=ingress-k8s.mei-home.net"
commonLabels:
  homelab/part-of: traefik-ingress
logs:
  general:
    level: DEBUG
    format: json
  access:
    enabled: true
    format: json
metrics:
  prometheus: null
resources:
  requests:
    cpu: "250m"
    memory: "250M"
I’m not changing very much about the deployment, save for setting a specific health port. This is there just because it’s the same for my Nomad Traefik.
The homelab/ingress label is there to be used in NetworkPolicy manifests, to allow Traefik access to the services proxied through it.
The ingressendpoint option ensures that external-dns later just creates a CNAME entry for each Ingress resource, pointing to the given DNS name, which in turn points to the Traefik LoadBalancer Service IP.
I’m disabling metrics here because I have not yet set up Prometheus. The resource requests simply come from the metrics I’ve gathered from my Nomad Traefik deployment over the years.
Next, let’s define Traefik’s ports. I’m staying with the ports for HTTP and HTTPS here. There are a couple more, like the health port, but I’m leaving them out for the sake of brevity (yes, you are allowed to chuckle dryly now 😉).
ports:
  traefik: null
  websecure: null
  metrics: null
  secureweb:
    port: 8000
    exposedPort: 443
    expose: true
    protocol: TCP
    tls:
      enabled: true
    middlewares:
      - traefik-ingress-compression@kubernetescrd
      - traefik-ingress-headers-security@kubernetescrd
      - traefik-ingress-local-net@kubernetescrd
  web:
    port: 8081
    exposedPort: 80
    expose: true
    protocol: TCP
    redirectTo:
      port: secureweb
The traefik, websecure and metrics ports are enabled in the default values.yaml file of the chart, but I’m using my own nomenclature. I will also show the manifests for the middlewares later.
The port options impact two manifests generated by the chart. First, the pod template, which defines the entrypoints for all of them via CLI arguments for the Traefik pod:
[...]
--entrypoints.secureweb.address=:8000/tcp
--entrypoints.web.address=:8081/tcp
--entrypoints.secureweb.http.middlewares=traefik-ingress-compression@kubernetescrd,traefik-ingress-headers-security@kubernetescrd,traefik-ingress-local-net@kubernetescrd
--entrypoints.secureweb.http.tls=true
--entrypoints.web.http.redirections.entryPoint.to=:443
--entrypoints.web.http.redirections.entryPoint.scheme=https
[...]
Those ports are also used in the definition of the Service:
Port: secureweb 443/TCP
TargetPort: secureweb/TCP
NodePort: secureweb 31512/TCP
Endpoints: 10.8.4.116:8000
Port: web 80/TCP
TargetPort: web/TCP
NodePort: web 30208/TCP
Endpoints: 10.8.4.116:8081
Traefik also provides a nice read-only dashboard to see all the configured routes, services and so forth. It is supplied with an Ingress via the chart:
ingressRoute:
  dashboard:
    enabled: true
    entryPoints:
      - admin
    middlewares:
      - name: admin-basic-auth
        namespace: traefik-ingress
  healthcheck:
    enabled: true
    entryPoints:
      - health
As you can see, this is not a default Kubernetes Ingress, but instead Traefik’s own Ingress definition, the IngressRoute. Normal Kubernetes Ingress manifests also work fine, but they then need to supply Traefik-specific options via annotations.
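For comparison, a standalone IngressRoute manifest looks roughly like this (the names and hostname here are made up for illustration):
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: example-route
spec:
  entryPoints:
    - secureweb
  routes:
    # route all requests for this host to the given Service
    - match: Host(`example.mei-home.net`)
      kind: Rule
      services:
        - name: example-service
          port: 80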
Next comes the Service definition:
service:
  enabled: true
  single: true
  type: LoadBalancer
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ingress-k8s.mei-home.net
    io.cilium/lb-ipam-ips: "10.86.55.22"
  labels:
    homelab/public-service: "true"
  spec:
    externalTrafficPolicy: Local
With the single option, you can configure whether Traefik creates a single Service for both TCP and UDP, or a separate Service for each.
The external-dns.alpha.kubernetes.io/hostname annotation sets the DNS name automatically configured by external-dns. I’m also setting a fixed IP instead of letting Cilium assign one from the pool, so that I can properly configure firewall rules.
The homelab/public-service label is significant, because it denotes the services which Cilium announces. See my post on using the Cilium BGP load balancer.
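As a rough sketch of how that label-based announcement might look in a CiliumBGPPeeringPolicy (the ASNs and peer address here are made up; see the linked post for the real setup):
apiVersion: "cilium.io/v2alpha1"
kind: CiliumBGPPeeringPolicy
metadata:
  name: "homelab-bgp"
spec:
  virtualRouters:
    - localASN: 64512
      neighbors:
        - peerAddress: "10.1.1.1/32"
          peerASN: 64513
      # only announce LoadBalancer IPs of Services carrying this label
      serviceSelector:
        matchLabels:
          homelab/public-service: "true"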
As noted above, externalTrafficPolicy: Local gives me source IP preservation.
The last base configuration options are for TLS, but I will go into more details about how I manage the TLS cert later on.
Middlewares
In Traefik, Middlewares are part of the request handling pipeline. A request enters Traefik via one of the EntryPoints, and then the configured Middlewares are applied. These range from IP allow listing to URL rewriting. They can be assigned to EntryPoints, which means they get applied to every request, or to specific routes via Ingress or IngressRoute configs.
I’m using a couple of them, which I supply via the Helm chart’s extraObjects value:
extraObjects:
  - apiVersion: traefik.io/v1alpha1
    kind: Middleware
    metadata:
      name: compression
      labels:
        homelab/part-of: traefik-ingress
    spec:
      compress: {}
  - apiVersion: traefik.io/v1alpha1
    kind: Middleware
    metadata:
      name: headers-security
      labels:
        homelab/part-of: traefik-ingress
    spec:
      headers:
        stsSeconds: 63072000
        stsIncludeSubdomains: true
        customFrameOptionsValue: "sameorigin"
        contentTypeNosniff: true
        referrerPolicy: "same-origin"
        browserXssFilter: true
  - apiVersion: traefik.io/v1alpha1
    kind: Middleware
    metadata:
      name: local-net
      labels:
        homelab/part-of: traefik-ingress
    spec:
      ipWhiteList:
        sourceRange:
          - "10.1.1.0/24"
          - "192.168.1.0/24"
The first one, compression, just enables the compression middleware. headers-security adds a couple of best-practice headers to all requests for security’s sake. The last one, local-net, is an IP allow list for some of my Homelab subnets.
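These three are attached to the secureweb entrypoint and thus apply to every request. A Middleware can also be attached to a single route instead, for example via an annotation on an Ingress (the traefik-ingress prefix is the namespace the Middleware lives in):
metadata:
  annotations:
    # format: <namespace>-<middleware name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: traefik-ingress-local-net@kubernetescrd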
Securing the dashboard
Let’s look at the IngressRoute for the dashboard a second time, specifically its middlewares option:
middlewares:
  - name: admin-basic-auth
    namespace: traefik-ingress
This option enables the following middleware:
- apiVersion: traefik.io/v1alpha1
  kind: Middleware
  metadata:
    name: admin-basic-auth
    labels:
      homelab/part-of: traefik-ingress
  spec:
    basicAuth:
      secret: basic-auth-users
This is a BasicAuth middleware, adding HTTP basic auth to my dashboard, just as another layer of security.
This middleware expects the secret basic-auth-users to contain a key users, where the users are listed in the following format:
username:hashedpassword
myuser:$apr1$wpjd1k59$B5E9r2e8DUgmGWubIb/Bk/
The entries can for example be created with htpasswd.
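For example, with Apache’s htpasswd utility (Traefik’s BasicAuth accepts MD5, SHA1 and bcrypt hashes):
# print a bcrypt-hashed entry for user "myuser" to stdout
htpasswd -nB myuser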
In my setup, I’m handling secrets via my Vault instance with external-secrets. I’ve described the setup here. The secret definition for the basic auth secret looks like this:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: "basic-auth-users"
  labels:
    homelab/part-of: traefik-ingress
spec:
  secretStoreRef:
    name: my-vault-store
    kind: ClusterSecretStore
  refreshInterval: "15m"
  target:
    creationPolicy: 'Owner'
    template:
      metadata:
        labels:
          homelab/part-of: traefik-ingress
      data:
        users: |
          {{ printf "{{ `{{ .user1 }}` }}" }}
  data:
    - secretKey: user1
      remoteRef:
        key: "secret/my_kubernetes_secrets/cluster/ingress/auth/user1"
        property: val
What happens here is that external-secrets takes the JSON object returned by Vault for the secret/my_kubernetes_secrets/cluster/ingress/auth/user1 path, and then takes the val key in that object, putting it into user1. That’s then accessed in the template for the Kubernetes Secret.
The weird {{ printf "{{ `{{ .user1 }}` }}" }} syntax comes from the fact that I’m using Helmfile for my Helm charts management, and that puts value files through a round of Go templating. The outer printf escapes that pass. Then the value file goes through Helm’s templating, which is escaped by the {{ }} and the backticks. What remains, {{ .user1 }}, is the template that’s used by external-secrets.
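To make this concrete, here is the same value after each templating pass (my own illustration of the three stages):
# in the values file, before any templating:
{{ printf "{{ `{{ .user1 }}` }}" }}
# after Helmfile's Go templating pass:
{{ `{{ .user1 }}` }}
# after Helm's templating pass, i.e. what external-secrets finally sees:
{{ .user1 }}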
The TLS certificate
My TLS certificate is a wildcard certificate from Let’s Encrypt. Sadly, my domain registrar does not offer an API for the DNS entries, so for now I have to solve the DNS challenge manually. I’m using the LE cert for both internal and external services, mostly so that I don’t have to muck around with distributing a self-signed CA cert to all my end-user devices. After I’ve renewed the cert, I push it to Vault and use it from there.
The ExternalSecret for getting the certs into Kubernetes looks like this:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: "le-cert"
  labels:
    homelab/part-of: traefik-ingress
spec:
  secretStoreRef:
    name: my-vault-store
    kind: ClusterSecretStore
  refreshInterval: "15m"
  target:
    creationPolicy: 'Owner'
    template:
      type: kubernetes.io/tls
      metadata:
        labels:
          homelab/part-of: traefik-ingress
      data:
        tls.key: |
          {{ printf "{{ `{{ .privkey }}` }}" }}
        tls.crt: |
          {{ printf "{{ `{{ .fullchain }}` }}" }}
  dataFrom:
    - extract:
        key: secret/my_kubernetes_cluster/cluster/ingress/le-cert
The two-level escape of the {{ .privkey }} and {{ .fullchain }} templates is again there to make sure neither Helmfile nor Helm itself tries to interpret the templates.
Here, I’m using a slightly different format for fetching the secret. With dataFrom instead of data, as in the basic auth secret, I’m getting the entire JSON object from that path, instead of a specific key from that object.
When I push my cert to Vault, I store four keys: the private key, the cert itself, the cert chain, and the full chain. Here, I only need the private key and the full chain.
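For illustration, pushing the renewed cert to Vault could look roughly like this (the key names privkey and fullchain match the templates above; cert and chain are my assumed names for the other two keys):
# write all four PEM files as keys of one Vault secret
vault kv put secret/my_kubernetes_cluster/cluster/ingress/le-cert \
  privkey=@privkey.pem \
  cert=@cert.pem \
  chain=@chain.pem \
  fullchain=@fullchain.pem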
This secret is then used in Traefik’s TLSStore:
tlsStore:
  default:
    defaultCertificate:
      secretName: le-cert
Network policies
Before coming to an example, I also want to show the NetworkPolicy I’m using:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "traefik-allow-world-only"
spec:
endpointSelector: {}
ingress:
- fromEndpoints:
- {}
- fromEntities:
- world
With the {} endpointSelector, the policy is applied to all pods in the namespace the policy resides in. In this particular case, that’s only the Traefik pod.
The fromEndpoints setting in turn says that ingress should be allowed from all pods within the same namespace. Finally, the only really interesting setting here is fromEntities: [world]. This setting allows all external traffic from hosts which are not managed by Cilium, meaning the rest of my Homelab and especially my end-user devices.
Example Ingress
Last but not least, let’s have a look at a quick example. In my post about load balancers, I introduced a simple echo server and made it available via a LoadBalancer type Service. With Traefik up and running, I can now switch that Service to ClusterIP and introduce the following Ingress manifest:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: testsetup-ingress
  namespace: testsetup
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: secureweb
  labels:
    homelab/part-of: testsetup
spec:
  rules:
    - host: testsetup.mei-home.net
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: testsetup-service
                port:
                  name: http-port
The only Traefik-specific config here is the entrypoints annotation, telling Traefik to accept connections to the service on the secureweb entrypoint. One nice thing about external-dns is that I don’t have to provide an extra annotation to create a DNS entry; it is automatically created from the host: value.
Traefik will parse the Ingress and create a router that matches requests by the requested host. It then forwards those requests to the Kubernetes Service and executes all the Middlewares configured for the secureweb entrypoint.
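A quick way to check the whole chain, from the DNS entry via Traefik to the echo server (assuming the CNAME has propagated):
# should return the echo server's response via Traefik, over TLS
curl -v https://testsetup.mei-home.net/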
To ensure that Traefik can access the echo pod, I also needed another network policy:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "testsetup-deny-all-ingress"
spec:
endpointSelector: {}
ingress:
- fromEndpoints:
- {}
- matchLabels:
homelab/ingress: "true"
io.kubernetes.pod.namespace: traefik-ingress
Again, this policy is applied to all pods in the namespace of my testsetup pod, and it allows ingress from all pods in that namespace. But the Traefik pod lives in another namespace, so access needs to be granted explicitly. That’s what the matchLabels key is about: there I provide my ingress label and, importantly, also the namespace, as that’s part of Cilium’s security identity.
And with that, another piece of important cluster infrastructure is up. 🙂