This is the second post in my k8s migration series.
I will skip the cluster setup itself in this series, as I did not make many changes compared to my experimental setup.
Instead I will start with my very first deployed service, external-secrets.
Motivation
In my initial experimentation, I decided not to go with any secrets management and instead used Helmfile’s secret handling. But I’ve come around to the fact that having some sort of service which automatically pulls secrets from my Vault instance would be pretty nice to have. One trigger was that while setting up a number of services, I found that Helmfile’s approach to getting secrets was not actually that great.
So what does external-secrets do? It is a connector between Kubernetes Secrets and an external secrets provider. In my case, that’s HashiCorp’s Vault. With external-secrets, an operator is set up. This operator watches for new objects of type ExternalSecret. When one of those appears, it reads the object’s values and contacts Vault to download the secrets. Then, external-secrets creates a new Kubernetes Secret with the secret material collected from the external secrets provider, for use in the Kubernetes cluster.
Vault setup
Before I could deploy external-secrets, I had to do some reconfiguration of my Vault setup. I’m managing all of the setup for Vault in Terraform.
The first step was creating a rather restrictive policy for the external-secrets access, as my Vault doesn’t just provide secrets for my workloads, but also for my Ansible playbooks and image generation setup. For now, I’m planning to restrict access to just the Vault kv secrets store, and only particular paths therein. A policy for that might look like this:
path "secret/my_kubernetes_secrets/cluster/*" {
capabilities = [ "read" ]
}
With that, if my k8s cluster ever gets breached, the attacker will at most have access to the Kubernetes-specific secrets. This policy is then added to Vault via Terraform like this:
resource "vault_policy" "external-secrets" {
name = "external-secrets"
policy = file("path-to-file.hcl")
}
Policies as short as this could also be added verbatim instead of keeping them in a separate file and loading that, but I like it better this way.
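For reference, the inline variant would use a Terraform heredoc, roughly like this sketch (not what I actually run):
resource "vault_policy" "external-secrets" {
  name = "external-secrets"

  # Same read-only policy as above, just embedded directly in the Terraform code.
  policy = <<-EOT
    path "secret/my_kubernetes_secrets/cluster/*" {
      capabilities = ["read"]
    }
  EOT
}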
The second part of the Vault setup is the authentication. For this I chose Vault’s AppRole, which is intended for use cases exactly like this. I did not actually have that auth backend configured yet, so I added it like this:
resource "vault_auth_backend" "approle" {
type = "approle"
path = "approle"
local = false
}
I just kept the default mount path. In addition to mounting the backend, I also needed to create a role for external-secrets. For my setup, it looked like this:
resource "vault_approle_auth_backend_role" "external-secrets" {
backend = vault_auth_backend.approle.path
role_name = "external-secrets"
token_policies = [vault_policy.external-secrets.name]
secret_id_bound_cidrs = ["10.1.1.0/24"]
token_bound_cidrs = ["10.1.1.0/24"]
token_explicit_max_ttl = 86400
}
This creates an application role with the previously created access policy and
the default policy attached. The default policy just allows things like looking
up your own token but doesn’t grant any secret access.
For additional security, I also configured restricted CIDRs for both the secret-id, which is used to log in, and the tokens produced for the role after login. This restricts the IPs from which logins can happen, and after that restricts the IPs from which the generated tokens can be used.
Purely for best-practice reasons, I also restricted the max TTL for tokens created for this role to 24h (86400 seconds).
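For a quick sanity check, the role can be read back from Vault, which shows the bound CIDRs and token TTLs as they ended up configured:
vault read auth/approle/role/external-secrets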
What I decided not to do here was to also set a TTL for the secret_id. This is due to the fact that while external-secrets can renew tokens if they become invalid, it cannot automatically get a new secret_id. So I’ve added the secret_id to my regular manual secrets rotation plan. I definitely need to write a playbook or script to do all of those rotations at some point. 😬
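For reference, a rough sketch of what that rotation might look like, using the Vault paths that will show up later in this post:
# Generate a fresh secret_id and capture only that field...
NEW_SECRET_ID=$(vault write -f -field=secret_id auth/approle/role/external-secrets/secret-id)
# ...store it where my deployment reads it from (see further down)...
vault kv put secret/my_kubernetes_secrets/role-secret secret-id="$NEW_SECRET_ID"
# ...and then re-deploy external-secrets so the new secret_id ends up in the cluster.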
Once all of the above has been configured and Terraform has been executed, there are two pieces of information needed to configure external-secrets. The first one is the AppRole role-id. It can be collected via this command:
vault read auth/approle/role/external-secrets/role-id
The second piece is the secret_id. A fresh one is generated and shown every time the following Vault command is executed:
vault write -force auth/approle/role/external-secrets/secret-id
The -force flag is required here because Vault normally expects at least some input parameters, but in this case I didn’t need any.
Finally, I stored the secret_id in the Vault KV store for later access by my external-secrets deployment:
vault kv put secret/my_kubernetes_secrets/role-secret secret-id=-
The trailing - tells the Vault CLI to read the secret value from stdin instead of taking it from the command line. In theory, I could also have fetched the secret_id via Terraform and then written it to the KV store, also via Terraform. But that would have meant that the secret_id would have ended up in the Terraform state. Not optimal.
Kubernetes deployment
With all of the Vault config now prepared, the next step is to actually deploy external-secrets. And this went relatively well. I used the official Helm chart.
I’m using Helmfile for managing the deployments in my Kubernetes cluster. I will not go into the details here, but I’ve got a draft for a post on my deployment setup almost done and will finish it after this post.
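Just for context, a Helmfile release for this could look roughly like the following sketch. The chart path and values file location are placeholders, not my actual layout; the real setup will be covered in that post.
releases:
  - name: external-secrets
    namespace: external-secrets
    # Local wrapper chart which pulls in the upstream chart as a dependency.
    chart: ./charts/external-secrets
    values:
      - ./values/external-secrets/values.yaml.gotmpl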
My values.yaml.gotmpl file for the Helm chart looks like this:
approleSecretId: {{ "ref+vault://secret/my_kubernetes_secrets/role-secret#/secret-id" | fetchSecretValue }}
approleId: {{ "ref+vault://auth/approle/role/external-secrets/role-id#/role_id" | fetchSecretValue }}
caBundle: |
{{- exec "curl" (list "https://vault.example.com:/v1/my-ca/ca/pem") | nindent 2 }}
external-secrets:
  commonLabels:
    homelab/part-of: external-secrets
  serviceMonitor:
    enabled: false
  webhook:
    certManager:
      enabled: false
The ref+vault syntax uses Helmfile’s secret management to get the AppRole credentials from my Vault instance during deployment. The caBundle value will later be used to supply the SecretStore with my internal CA so external-secrets can validate the TLS cert coming from my Vault instance. I will go over this in detail later.
The values under external-secrets are the actual values for the external-secrets Helm chart, as that chart is managed as a dependency.
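The dependency wiring in the wrapper chart’s Chart.yaml would look roughly like this (the version constraint is just an example; the repository is the upstream chart repo):
apiVersion: v2
name: external-secrets
version: 0.1.0
dependencies:
  # Upstream external-secrets chart, pulled in as a subchart. Its values live
  # under the external-secrets key in the values file above.
  - name: external-secrets
    version: "0.9.x"
    repository: https://charts.external-secrets.io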
I’m not doing anything special here, just explicitly disabling the serviceMonitor. This is mostly so that I can later grep over my Homelab repo and find all apps providing service monitors once I’ve deployed Prometheus.
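That future search will probably be nothing fancier than something like:
grep -rl "serviceMonitor" .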
Enabling the Vault secrets store
In external-secrets, the different supported secrets providers can be enabled separately via SecretStore or ClusterSecretStore manifests. I decided to use the ClusterSecretStore, as providing per-namespace stores didn’t look like it would make much sense.
My thinking here is that yes, I could provide one store per namespace, which
would mean one store per deployed app. I could then create different roles for
each of these stores in Vault and give them highly restrictive policies to only
access what they really need.
But in Kubernetes, it’s not the pods themselves which have access to the Secrets and the secrets stores. It’s the admins and operators who create and write the manifests. In the case of this cluster, that’s only me. And I’ve already got all the permissions there are. So if somebody were to get into my Kubernetes account, they would have access to everything anyway, and it didn’t make much sense to me to work with different secret stores.
Without further delay, here is my ClusterSecretStore manifest:
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: my-vault-store
spec:
  provider:
    vault:
      server: "https://vault.example.com"
      caProvider:
        type: Secret
        name: my-internal-ca
        namespace: external-secrets
        key: caCert
      path: "secret"
      version: "v1"
      auth:
        appRole:
          path: "approle"
          # RoleID configured in the App Role authentication backend
          roleId: {{ .Values.approleId }}
          # Reference to a key in a K8s Secret that contains the App Role SecretId
          secretRef:
            name: "my-approle-secret"
            namespace: {{ .Release.Namespace }}
            key: "secretId"
In addition to this, I’m also deploying two more secrets, one for my internal CA and one with the AppRole secret_id:
apiVersion: v1
kind: Secret
metadata:
  name: my-approle-secret
  labels:
    homelab/part-of: external-secrets
data:
  secretId: {{ .Values.approleSecretId | b64enc }}
---
apiVersion: v1
kind: Secret
metadata:
  name: my-internal-ca
stringData:
  caCert: |
    {{- .Values.caBundle | nindent 6 }}
These are getting their values from the following lines in the values.yaml.gotmpl file:
approleSecretId: {{ "ref+vault://secret/my_kubernetes_secrets/role-secret#/secret-id" | fetchSecretValue }}
caBundle: |
{{- exec "curl" (list "https://vault.example.com:/v1/my-ca/ca/pem") | nindent 2 }}
As mentioned before, I’m using Helmfile’s templating capabilities here. If you’re not using Helmfile, you will have to get the secrets created in a different way. For me, this approach has the advantage of having absolutely everything under version control while not exposing any secrets.
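Without Helmfile, the AppRole secret could for example be created by hand, something along these lines (just a sketch, using the same Vault path as above):
kubectl -n external-secrets create secret generic my-approle-secret \
  --from-literal=secretId="$(vault kv get -field=secret-id secret/my_kubernetes_secrets/role-secret)"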
While trying to deploy the ClusterSecretStore, I hit two problems, which I will describe in detail in a later section. For now, the above config works.
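Whether the store is actually healthy can be checked on the deployed object, since external-secrets writes the validation result into its status; something like this should report the store as valid and ready:
kubectl get clustersecretstore my-vault-store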
Deploying a secret
To test the setup, I used a fresh dummy secret. First, I pushed it to Vault:
vault kv put secret/my_kubernetes_secrets/cluster/testsecret secret=supersecretpw
Then, an ExternalSecret manifest using the previously created my-vault-store secret store can be created:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: testsecret
spec:
  refreshInterval: "1m"
  secretStoreRef:
    name: my-vault-store
    kind: ClusterSecretStore
  target:
    name: testsecret
    namespace: external-secrets
  data:
    - secretKey: mysecret
      remoteRef:
        key: secret/my_kubernetes_secrets/cluster/testsecret
        property: secret
Once that manifest has been deployed, external-secrets will create a Kubernetes Secret called testsecret in the namespace external-secrets:
kubectl get -n external-secrets secrets testsecret -o yaml
apiVersion: v1
data:
  mysecret: c3VwZXJzZWNyZXRwdw==
immutable: false
kind: Secret
metadata:
  annotations:
    meta.helm.sh/release-name: external-secrets
    meta.helm.sh/release-namespace: external-secrets
    reconcile.external-secrets.io/data-hash: 12345
  creationTimestamp: "2023-12-26T12:01:30Z"
  labels:
    app.kubernetes.io/managed-by: Helm
    reconcile.external-secrets.io/created-by: 1235
  name: testsecret
  namespace: external-secrets
  ownerReferences:
  - apiVersion: external-secrets.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ExternalSecret
    name: testsecret
    uid: 12345
  resourceVersion: "1839454"
  uid: 12345
type: Opaque
Here, target.name is the name of the secret to be created, with target.namespace being the namespace to deploy to. Under data, the secretKey is the key under which the secret data will be stored in the newly created Secret, and remoteRef.key is the path to the secret in Vault, with remoteRef.property being the property of the resulting JSON object at that path which contains the value to be stored in secretKey.
Network policy problems
While working on deploying specifically the Vault ClusterSecretStore, I hit multiple errors. The first one was this:
Error: UPGRADE FAILED: cannot patch "vault-backend" with kind ClusterSecretStore: Internal error occurred: failed calling webhook "validate.clustersecretstore.external-secrets.io": failed to call webhook: Post "https://external-secrets-webhook.external-secrets.svc:443/validate-external-secrets-io-v1beta1-clustersecretstore?timeout=5s": context deadline exceeded
It appeared whenever I tried to deploy the new ClusterSecretStore. I finally realized that this was likely related to my CiliumNetworkPolicy, which at that point looked like this:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "external-secrets-deny-all-ingress"
namespace: {{ .Release.Namespace }}
spec:
endpointSelector: {}
ingress:
- fromEndpoints:
- {}
This is the canonical network policy for allowing all egress, while blocking all ingress to all pods inside the namespace, save for traffic from pods in the same namespace. So I was extremely confused when I saw that network requests were getting blocked. I removed the policy, and the deployment of the secret store worked fine.
First, I confirmed that the policy was actually applied correctly. This can be done with Cilium, the CNI plugin I’m using, as follows. Some documentation on troubleshooting can be found here.
First, get the host where the pod which blocks access is running:
kubectl get pods -n external-secrets -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
external-secrets-7fcd5969c8-sltbl 1/1 Running 0 25h 10.8.4.173 sait <none> <none>
external-secrets-cert-controller-fc578ccdd-mcksx 1/1 Running 0 25h 10.8.4.168 sait <none> <none>
external-secrets-webhook-68c99c7557-nrqpz 1/1 Running 0 25h 10.8.5.60 sehith <none> <none>
Next, check which Cilium pod runs on that specific host:
kubectl get pods -n kube-system -o wide
[...]
cilium-wcffs 1/1 Running 0 5d22h 10.86.5.205 sehith <none> <none>
Then, I needed the correct Cilium endpoint for the pod I was interested in:
kubectl -n kube-system exec -ti cilium-wcffs -- cilium endpoint list
ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS
ENFORCEMENT ENFORCEMENT
1101 Enabled Disabled 29452 k8s:app.kubernetes.io/instance=external-secrets 10.8.5.60 ready
k8s:app.kubernetes.io/name=external-secrets-webhook
Finally armed with the ENDPOINT identifier, 1101 here, we can display the policy rules applied to it:
kubectl -n kube-system exec -ti cilium-wcffs -- cilium endpoint get -o yaml 1101
[...]
rule: '{"port":0,"protocol":"ANY","l7-rules":[{"\u0026LabelSelector{MatchLabels:map[string]string{k8s.io.kubernetes.pod.namespace: external-secrets,},MatchExpressions:[]LabelSelectorRequirement{},}":null},]}'
This was exactly the rule I was expecting - allowing all traffic from the external-secrets namespace.
While looking all this up, I also checked the monitoring for the endpoint, which can be done like this:
kubectl -n kube-system exec -ti cilium-wcffs -- cilium monitor --type drop
It spat out lines like this whenever I tried to deploy the secret store:
xx drop (Policy denied) flow 0x0 to endpoint 1101, ifindex 6, file bpf_lxc.c:1968, , identity remote-node->29452: 10.8.0.17:59258 -> 10.8.5.60:10250 tcp SYN
What I did not realize for way too long: the source IP, 10.8.0.17, wasn’t coming from any pod in my entire cluster. I just couldn’t figure out what that IP was. It’s in the CIDR for my cluster pods, but it doesn’t show up in the kubectl get -A pods -o wide output.
After an exceedingly long time spent searching for the root cause, I finally found it, through sheer dumb luck. I had switched into the terminal of my VM host, where the output of a previous lxc ls command was still visible. And lo and behold, there was the IP, as the cilium_host network interface of one of my control plane nodes.
Some digging later, I found out that this is a network interface created by Cilium, and it is used for the traffic of all static, host-networking pods on a host.
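With that knowledge, the interface is also easy to spot directly on a node with plain iproute2:
ip -brief addr show cilium_host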
This also explained why I never saw any error in the logs of any of the
external-secrets pods. The request wasn’t made by any of them. The webhook
pod runs a webhook which is used when deploying new secret stores, to verify
them before the Kubernetes objects are created.
This means the hook isn’t triggered by any external-secrets pod, but by the
kube-apiserver.
Going a bit further, I just wanted to allow the kube-apiserver ingress into the webhook pod. This also did not work, because there wasn’t actually any identity for it, as the pod’s networking is not controlled by Cilium.
After a while, I looked back at the Cilium monitoring line:
xx drop (Policy denied) flow 0x0 to endpoint 1101, ifindex 6, file bpf_lxc.c:1968, , identity remote-node->29452: 10.8.0.17:59258 -> 10.8.5.60:10250 tcp SYN
Note the identity remote-node. Luckily, Cilium defines an entity for that, see the docs here.
So what finally solved my problem was to add the following network policy:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "external-secrets-allow-webhook-all"
namespace: {{ .Release.Namespace }}
spec:
endpointSelector:
matchLabels:
app.kubernetes.io/name: external-secrets-webhook
ingress:
- fromEntities:
- remote-node
This allows ingress to the webhook pod from any remote node. These remote nodes are all nodes in the Kubernetes cluster which are not the local node where the pod is running. It isn’t quite as secure as explicitly allowing only the kube-apiserver pods to access the webhook pod, but it will have to do for now: the kube-apiserver is not under Cilium’s control and hence cannot be matched via, for example, its labels. I will have to return to this issue at a later point and see whether I can do better.
The CA cert formatting problem
After I had finally fixed the networking issue, I got another error, this time from the external-secrets pod itself. It was not able to connect to Vault:
[...]"error":"could not get provider client: unable to log in to auth met hod: unable to log in with app role auth: Put \"https://vault.example.com/v1/auth/approle/login\": tls: failed to verify certificate: x509: certificate signed by unknown authority"[...]
This was somewhat expected, because my Vault access does not go through my proxy, and uses my Homelab internal CA.
I thought this problem would be easily fixable, as external-secrets does provide settings in the ClusterSecretStore for supplying a CA cert for server cert validation. See the docs here.
But I had really rotten luck with the caBundle config. I’m getting the PEM-formatted CA cert directly from Vault, which runs my internal CA. But I couldn’t get it into a format which external-secrets would accept. Whatever I tried, introducing newlines, putting the cert through b64enc, nothing worked. I was just getting CA cert parsing errors from external-secrets.
What finally worked was to use the caProvider option instead. For this, I created an additional secret (even though the CA cert isn’t exactly a secret):
apiVersion: v1
kind: Secret
metadata:
  name: my-internal-ca
stringData:
  caCert: |
    {{- .Values.caBundle | nindent 6 }}
And then I used that secret in the caProvider section as seen above. This was an extremely frustrating journey. With the networking problem, I at least learned something about Cilium networking and how to debug it, and finally found the root cause and an acceptable fix. But in this case, the only thing I got out of it was a high dose of frustration and a workaround switching to a completely different approach.
Conclusion
First service set up successfully on the production k8s cluster. 🎉 But also lots of frustration. Finding the fix for the networking problem took far too long, but at least I learned a bit about Cilium debugging along the way. The CA cert formatting problem, on the other hand, just got me riled up.
If anyone reading this has any good ideas about how to produce a Cilium network policy which only allows access from the kube-apiserver instead of the “allow all cluster nodes” setup I’ve got now, hit me up on the Fediverse.