This is the second post in my k8s migration series.

I will skip the cluster setup itself in this series, as I did not make many changes compared to my experimental setup.

Instead I will start with my very first deployed service, external-secrets.

Motivation

In my initial experimentation, I decided not to go with any secrets management and instead use Helmfile’s secret handling. But I’ve since come around: having a service that automatically pulls secrets from my Vault instance would be pretty nice to have. One trigger was that while setting up a number of services, I found Helmfile’s approach to fetching secrets wasn’t actually that great.

So what does external-secrets do? It is a connector between Kubernetes Secrets and an external secrets provider. In my case, that’s HashiCorp’s Vault. external-secrets deploys an operator that watches for new objects of type ExternalSecret. When one appears, the operator reads the object’s values and contacts Vault to fetch the referenced secrets. It then creates a new Kubernetes Secret with the secret material collected from the external provider, ready for use in the Kubernetes cluster.

Vault setup

Before I could deploy external-secrets, I had to do some reconfiguration of my Vault setup. I’m managing all of the setup for Vault in Terraform.

The first step was creating a rather restrictive policy for the external-secrets access, as my Vault doesn’t just provide secrets for my workloads, but also for my Ansible playbooks and image generation setup. For now, I’m planning to restrict access to just the Vault kv secrets store, and only particular paths therein. A policy for that might look like this:

path "secret/my_kubernetes_secrets/cluster/*" {
  capabilities = [ "read" ]
}

With that, if my k8s cluster ever gets breached, the attacker will at most have access to the Kubernetes specific secrets. This policy is then added to Vault via Terraform like this:

resource "vault_policy" "external-secrets" {
  name = "external-secrets"
  policy = file("path-to-file.hcl")
}

A policy as short as this could also be added inline instead of being loaded from a separate file, but I like keeping it in its own file.
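For illustration, this is roughly what the inline variant could look like, using a Terraform heredoc instead of file() (a sketch, not what I actually deployed):

resource "vault_policy" "external-secrets" {
  name   = "external-secrets"
  policy = <<-EOT
    path "secret/my_kubernetes_secrets/cluster/*" {
      capabilities = [ "read" ]
    }
  EOT
}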

The second part of the Vault setup is the authentication. For this I chose Vault’s AppRole, which is intended for use cases exactly like this. I did not actually have that auth backend configured yet, so I added it like this:

resource "vault_auth_backend" "approle" {
  type = "approle"
  path = "approle"
  local = false
}

I just kept the default mount path. In addition to mounting the backend, I also needed to create a role for external-secrets. For my setup, it looked like this:

resource "vault_approle_auth_backend_role" "external-secrets" {
  backend = vault_auth_backend.approle.path
  role_name = "external-secrets"
  token_policies = [vault_policy.external-secrets.name]
  secret_id_bound_cidrs = ["10.1.1.0/24"]
  token_bound_cidrs = ["10.1.1.0/24"]
  token_explicit_max_ttl = 86400
}

This creates an application role with the previously created access policy and the default policy attached. The default policy just allows things like looking up your own token, but doesn’t grant any secret access. For additional security, I also configured restricted CIDRs for both the secret-id, which is used to log in, and the tokens produced for the role after login. This restricts the IPs from which logins can happen, and after that, the IPs from which the generated tokens can be used. Purely as a best practice, I also restricted the max TTL for tokens created for this role to 24h.
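To double-check that the restrictions took effect, the role can simply be read back (output trimmed, values illustrative):

vault read auth/approle/role/external-secrets
Key                       Value
---                       -----
secret_id_bound_cidrs     [10.1.1.0/24]
token_bound_cidrs         [10.1.1.0/24]
token_explicit_max_ttl    24h
token_policies            [external-secrets]
[...]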

What I decided not to do here was to also set a TTL for the secret_id. That’s because while external-secrets can renew tokens if they become invalid, it cannot automatically get a new secret_id. So I’ve added the secret_id to my regular manual secrets rotation plan. I definitely need to write a playbook or script to do all of those rotations at some point. 😬
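Until that playbook exists, the manual rotation boils down to a few Vault CLI calls; a rough sketch, with the accessor as a placeholder:

# generate a fresh secret_id for the role
vault write -force auth/approle/role/external-secrets/secret-id

# list the accessors of all currently valid secret_ids
vault list auth/approle/role/external-secrets/secret-id

# once the new secret_id is rolled out, destroy the old one via its accessor
vault write auth/approle/role/external-secrets/secret-id-accessor/destroy \
    secret_id_accessor=<old-accessor>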

Once all of the above has been configured and Terraform has been executed, there are two pieces of information needed to configure external-secrets. The first one is the AppRole role-id. It can be collected via this command:

vault read auth/approle/role/external-secrets/role-id

The second piece is the secret_id. A fresh one is generated and shown every time the following Vault command is executed:

vault write -force auth/approle/role/external-secrets/secret-id

The -force is required here because normally Vault needs at least some input parameters, but in this case I didn’t need any.
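The output looks roughly like this (values shortened); the only field needed for the next step is secret_id:

vault write -force auth/approle/role/external-secrets/secret-id
Key                   Value
---                   -----
secret_id             xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
secret_id_accessor    xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
secret_id_num_uses    0
secret_id_ttl         0s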

Finally, I stored the secret_id in the Vault KV store for later access by my external-secrets deployment:

vault kv put secret/my_kubernetes_secrets/role-secret secret-id=-

In theory, I could also have gotten the secret_id via Terraform and then written it to the KV store also via Terraform. But that would have meant that the secret_id would have ended up in the Terraform state. Not optimal.
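Because the trailing - tells vault kv put to read the value from stdin, the generation and storage steps can also be chained, so the secret_id never shows up in the shell history; a sketch:

# grab only the secret_id field and pipe it straight into the KV store
vault write -force -field=secret_id auth/approle/role/external-secrets/secret-id \
    | vault kv put secret/my_kubernetes_secrets/role-secret secret-id=-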

Kubernetes deployment

With all of the Vault config now prepared, the next step was to actually deploy external-secrets, using the official Helm chart. This went relatively well.

I’m using Helmfile for managing the deployments in my Kubernetes cluster. I will not go into the details here, but I’ve got a draft for a post on my deployment setup almost done and will finish it after this post.

My values.yaml file for the Helm chart looks like this:

approleSecretId: {{ "ref+vault://secret/my_kubernetes_secrets/role-secret#/secret-id" | fetchSecretValue }}
approleId: {{ "ref+vault://auth/approle/role/external-secrets/role-id#/role_id" | fetchSecretValue }}
caBundle: |
  {{- exec "curl" (list "https://vault.example.com:/v1/my-ca/ca/pem") | nindent 2 }}  
external-secrets:
  commonLabels:
    homelab/part-of: external-secrets
  serviceMonitor:
    enabled: false
  webhook:
    certManager:
      enabled: false

The ref+vault syntax uses Helmfile’s secret management to get the AppRole credentials from my Vault instance during deployment. The caBundle value will later be used to supply the SecretStore with my internal CA so external-secrets can validate the TLS cert coming from my Vault instance. I will go over this in detail later.

The values under external-secrets are the actual values for the external-secrets Helm chart, as that chart is managed as a dependency. I’m not doing anything special here, just explicitly disabling the serviceMonitor. This is mostly so that I can later grep over my Homelab repo and find all apps providing service monitors once I’ve deployed Prometheus.
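Once Prometheus is in place, that search is just a grep over the repo; something along these lines, with the path depending on the repo layout:

# find every values file that touches a ServiceMonitor toggle
grep -R -n --include='values*.yaml*' 'serviceMonitor' .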

Enabling the Vault secrets store

In external-secrets, the different supported secrets providers can be enabled separately via SecretStore or ClusterSecretStore manifests. I decided to use the ClusterSecretStore, as providing per-namespace stores didn’t look like it would make much sense. My thinking here is that yes, I could provide one store per namespace, which would mean one store per deployed app. I could then create different roles for each of these stores in Vault and give them highly restrictive policies that only allow access to what they really need.

But in Kubernetes, it’s not the pods themselves that have access to the Secrets and the secret stores, it’s the admins and operators who create and write the manifests. In the case of this cluster, that’s only me. And I’ve already got all the permissions there are. So if somebody were to get into my Kubernetes account, they would have access to everything anyway. It didn’t make much sense to me to work with different secret stores.

Without further delay, here is my ClusterSecretStore manifest:

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: my-vault-store
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      caProvider:
        type: Secret
        name: my-internal-ca
        namespace: external-secrets
        key: caCert
      path: "secret"
      version: "v1"
      auth:
        appRole:
          path: "approle"
          # RoleID configured in the App Role authentication backend
          roleId: {{ .Values.approleId }}
          # Reference to a key in a K8s Secret that contains the App Role SecretId
          secretRef:
            name: "my-approle-secret"
            namespace: {{ .Release.Namespace }}
            key: "secretId"

In addition to this, I’m also deploying two more secrets, one for my internal CA and one with the AppRole secret_id:

apiVersion: v1
kind: Secret
metadata:
  name: my-approle-secret
  labels:
    homelab/part-of: external-secrets
data:
  secretId: {{ .Values.approleSecretId | b64enc }}
---
apiVersion: v1
kind: Secret
metadata:
  name: my-internal-ca
stringData:
  caCert: |
    {{- .Values.caBundle | nindent 6 }}    

These are getting their values from the following lines in the values.yaml.gotmpl file:

approleSecretId: {{ "ref+vault://secret/my_kubernetes_secrets/role-secret#/secret-id" | fetchSecretValue }}
caBundle: |
  {{- exec "curl" (list "https://vault.example.com:/v1/my-ca/ca/pem") | nindent 2 }}  

As mentioned before, I’m using Helmfile’s templating capabilities here. If you’re not using Helmfile, you will have to get the secrets created in a different way. For me, this approach has the advantage of having absolutely everything under version control while not exposing any secrets.

While trying to deploy the ClusterSecretStore, I hit two problems I will describe in detail in a later section.

For now, the above config works.
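A quick way to confirm the store is actually healthy is to ask kubectl for it; external-secrets reports the connection state on the resource (output columns may vary by chart version):

kubectl get clustersecretstore my-vault-store
NAME             AGE   STATUS   CAPABILITIES   READY
my-vault-store   1m    Valid    ReadWrite      True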

Deploying a secret

To test the setup, I created a fresh dummy secret. First, I pushed it to Vault:

vault kv put secret/my_kubernetes_secrets/cluster/testsecret secret=supersecretpw

Then I created an ExternalSecret manifest that uses the my-vault-store secret store set up above:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: testsecret
spec:
  refreshInterval: "1m"
  secretStoreRef:
    name: my-vault-store
    kind: ClusterSecretStore
  target:
    name: testsecret
    namespace: external-secrets
  data:
  - secretKey: mysecret
    remoteRef:
      key: secret/my_kubernetes_secrets/cluster/testsecret
      property: secret

Once that manifest has been deployed, external-secrets will create a Kubernetes Secret called testsecret in the namespace external-secrets:

kubectl get -n external-secrets secrets testsecret -o yaml
apiVersion: v1
data:
  mysecret: c3VwZXJzZWNyZXRwdw==
immutable: false
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: external-secrets
    meta.helm.sh/release-namespace: external-secrets
    reconcile.external-secrets.io/data-hash: 12345
  creationTimestamp: "2023-12-26T12:01:30Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    reconcile.external-secrets.io/created-by: 1235
  name: testsecret
  namespace: external-secrets
  ownerReferences:
  - apiVersion: external-secrets.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ExternalSecret
    name: testsecret
    uid: 12345
  resourceVersion: "1839454"
  uid: 12345
type: Opaque

Here, target.name is the name of the secret to be created, with target.namespace being the namespace to deploy to. Under data, the secretKey is the key under which the secret data will be stored in the newly created Secret, and remoteRef.key is the path to the secret in Vault, with remoteRef.property being the property of the resulting JSON object at that path which contains the value to be stored in secretKey.
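To confirm the round trip end to end, the value can be pulled back out of the generated Secret and decoded:

kubectl get secret testsecret -n external-secrets -o jsonpath='{.data.mysecret}' | base64 -d
supersecretpw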

Network policy problems

While working on deploying specifically the Vault ClusterSecretStore, I hit multiple errors. The first one was this:

Error: UPGRADE FAILED: cannot patch "vault-backend" with kind ClusterSecretStore: Internal error occurred: failed calling webhook "validate.clustersecretstore.external-secrets.io": failed to call webhook: Post "https://external-secrets-webhook.external-secrets.svc:443/validate-external-secrets-io-v1beta1-clustersecretstore?timeout=5s": context deadline exceeded

It appeared whenever I tried to deploy the new ClusterSecretStore. I finally twigged that this was likely due to my CiliumNetworkPolicy, which at that point looked like this:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "external-secrets-deny-all-ingress"
  namespace: {{ .Release.Namespace }}
spec:
  endpointSelector: {}
  ingress:
    - fromEndpoints:
      - {}

This is the canonical network policy for allowing all egress, while blocking all ingress to all pods inside the namespace, save for traffic from pods in the same namespace. So I was extremely confused when I saw that network requests were getting blocked. I removed the policy, and the deployment of the secret store worked fine.

First, I confirmed that the policy was actually applied correctly. This can be done with Cilium, the CNI plugin I’m using, as follows. Some documentation on troubleshooting can be found here.

First, get the host where the pod which blocks access is running:

kubectl get pods -n external-secrets -o wide
NAME                                               READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
external-secrets-7fcd5969c8-sltbl                  1/1     Running   0          25h   10.8.4.173   sait     <none>           <none>
external-secrets-cert-controller-fc578ccdd-mcksx   1/1     Running   0          25h   10.8.4.168   sait     <none>           <none>
external-secrets-webhook-68c99c7557-nrqpz          1/1     Running   0          25h   10.8.5.60    sehith   <none>           <none>

Next, check which Cilium pod runs on that specific host:

kubectl get pods -n kube-system -o wide
[...]
cilium-wcffs                       1/1     Running   0               5d22h   10.86.5.205   sehith   <none>           <none>

Then, I needed the correct Cilium endpoint for the pod I was interested in:

kubectl -n kube-system exec -ti cilium-wcffs -- cilium endpoint list
ENDPOINT   POLICY (ingress)   POLICY (egress)   IDENTITY   LABELS (source:key[=value])                                                       IPv6   IPv4         STATUS   
           ENFORCEMENT        ENFORCEMENT                                                                                                                        
1101       Enabled            Disabled          29452      k8s:app.kubernetes.io/instance=external-secrets                                          10.8.5.60    ready   
                                                           k8s:app.kubernetes.io/name=external-secrets-webhook                                                           

Finally armed with the ENDPOINT identifier, 1101 here, we can display the policy rules applied to it:

kubectl -n kube-system exec -ti cilium-wcffs -- cilium endpoint get -o yaml 1101
[...]
rule: '{"port":0,"protocol":"ANY","l7-rules":[{"\u0026LabelSelector{MatchLabels:map[string]string{k8s.io.kubernetes.pod.namespace: external-secrets,},MatchExpressions:[]LabelSelectorRequirement{},}":null},]}'

This was exactly the rule I was expecting - allowing all traffic from the external-secrets namespace. While digging through all of this, I also kept an eye on Cilium’s drop monitoring for the endpoint, which works like this:

kubectl -n kube-system exec -ti cilium-wcffs -- cilium monitor --type drop

It spat out lines like this whenever I tried to deploy the secret store:

xx drop (Policy denied) flow 0x0 to endpoint 1101, ifindex 6, file bpf_lxc.c:1968, , identity remote-node->29452: 10.8.0.17:59258 -> 10.8.5.60:10250 tcp SYN

What I did not realize for way too long: the source IP, 10.8.0.17, didn’t belong to any pod in my entire cluster. I just couldn’t figure out what that IP was. It’s in the CIDR for my cluster pods, but it doesn’t show up in the kubectl get -A pods -o wide output.

After an exceedingly long time spent searching for the root cause, I finally found it, through sheer dumb luck. I had switched to the terminal of my VM host, where the output of a previous lxc ls command was still visible. And lo and behold, there was the IP, assigned to the cilium_host network interface of one of my control plane nodes.

Some digging later, I found out that cilium_host is a network interface created by Cilium, used for the traffic of the static, host-networking pods on a host. This also explained why I never saw any errors in the logs of the external-secrets pods: the request wasn’t made by any of them. The webhook pod serves a validating webhook that is called when new secret stores are deployed, to verify them before the Kubernetes objects are created. So the hook isn’t triggered by an external-secrets pod, but by the kube-apiserver.
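That the apiserver really lives outside the pod network can be checked directly. Assuming a kubeadm-style cluster where it runs as a static pod labeled component=kube-apiserver, something like this shows it using the node’s host network, which is why it never gets a regular pod identity:

# static apiserver pods run with hostNetwork: true, so their traffic is not
# attributed to a normal pod identity by Cilium
kubectl get pods -n kube-system -l component=kube-apiserver \
    -o custom-columns=NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork,IP:.status.podIP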

Going a bit further, I just wanted to allow the kube-apiserver ingress into the webhook pod. This also did not work, because there wasn’t actually any identity for it, as the pod’s networking is not controlled by Cilium.

After a while, I looked back at the Cilium monitoring line:

xx drop (Policy denied) flow 0x0 to endpoint 1101, ifindex 6, file bpf_lxc.c:1968, , identity remote-node->29452: 10.8.0.17:59258 -> 10.8.5.60:10250 tcp SYN

Note the identity remote-node. Luckily, Cilium defines an entity for that, see the docs here. So what finally solved my problem was to add the following network policy:

apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: "external-secrets-allow-webhook-all"
  namespace: {{ .Release.Namespace }}
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: external-secrets-webhook
  ingress:
    - fromEntities:
        - remote-node

This allows ingress to the webhook pod from any remote node, meaning all nodes in the Kubernetes cluster other than the local node the pod is running on. It isn’t quite as secure as explicitly allowing only the kube-apiserver to reach the webhook pod, but it will have to do for now: the kube-apiserver isn’t under Cilium’s control, so it can’t be selected via, for example, its labels. I will have to return to this issue at a later point and see whether I can do better.

The CA cert formatting problem

After I had finally fixed the networking issue, I got another error, this time from the external-secrets pod itself. It was not able to connect to Vault:

[...]"error":"could not get provider client: unable to log in to auth met hod: unable to log in with app role auth: Put \"https://vault.example.com/v1/auth/approle/login\": tls: failed to verify certificate: x509: certificate signed by unknown authority"[...]

This was somewhat expected, because my Vault access does not go through my proxy, and uses my Homelab internal CA.

I thought this problem would be easy to fix, as external-secrets provides a caBundle setting in the ClusterSecretStore for supplying a CA cert for server certificate validation. See the docs here. But I had really rotten luck with the caBundle config. I’m getting the PEM-formatted CA cert directly from Vault, which runs my internal CA, but I couldn’t get it into a format external-secrets would accept. Whatever I tried, introducing newlines, running the cert through b64enc, nothing worked; I just kept getting CA cert parsing errors from external-secrets.
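One thing worth ruling out in a situation like this is the bundle itself; a quick sanity check that the PEM coming out of Vault’s CA endpoint is well-formed (URL as used in the values file above):

# fetch the CA cert and let openssl parse it; errors here would point at the
# bundle rather than at external-secrets
curl -s https://vault.example.com:/v1/my-ca/ca/pem | openssl x509 -noout -subject -issuer -dates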

What finally worked was to use the caProvider option instead. For this, I created an additional secret (even though the CA cert isn’t exactly a secret):

apiVersion: v1
kind: Secret
metadata:
  name: my-internal-ca
stringData:
  caCert: |
    {{- .Values.caBundle | nindent 6 }}    

And then I used that secret in the caProvider section, as seen above. This was an extremely frustrating journey. With the networking problem, I at least learned something about Cilium networking and how to debug it, and I finally found the root cause and an acceptable fix. In this case, the only thing I got out of it was a high dose of frustration and a workaround that switches to a completely different approach.

Conclusion

The first service is set up successfully on the production k8s cluster. 🎉 But it came with a lot of frustration. Finding the fix for the networking problem took far too long, but at least I learned a bit about Cilium debugging. The CA cert formatting problem, on the other hand, just got me riled up.

If anyone reading this has any good ideas about how to produce a Cilium network policy which only allows access from the kube-apiserver instead of the “allow all cluster nodes” setup I’ve got now, hit me up on the Fediverse.