Wherein I migrate my Drone CI setup on Nomad to a Woodpecker CI setup on k8s.
This is part 16 of my k8s migration series.
Finally, another migration blog post! I’m still rather happy that I’m getting into it again. For several years now, I’ve been running a CI setup to automate a number of tasks related to some personal projects. CI stands for Continuous Integration, and Wikipedia says this about it:
Continuous integration (CI) is the practice of integrating source code changes frequently and ensuring that the integrated codebase is in a workable state.
I’m pretty intimately familiar with the concept on a rather large scale, as I’m working in a CI team at a large company.
In the Homelab, I’m using CI for a variety of use cases, ranging from traditional automated test cases for software I’ve written to convenient automation for things like container image builds. I will go into details on a few of those use cases later on, when I describe how I’ve migrated some of my projects.
The basic principle of CI for me is: You push a commit to a Git repository, and a piece of software automatically launches a variety of test jobs. These can range from unit test jobs and automated linter runs to automated deploys of the updated software.
From Drone CI to Woodpecker CI
Since I started running a CI, I’ve been using Drone CI. It’s a relatively simple CI system, compared to what one could build e.g. with Zuul, Jenkins and Gerrit.
Drone CI consists of two components, the Drone CI server providing web hooks for the Git Forge to call and launching the jobs, and agents, which take the jobs and run them. In my deployment on Nomad, I was using the drone-runner-docker. It mounts the host’s Docker socket into the agent and uses it to launch Docker containers for each step of the CI pipeline.
It has always worked well for me and mostly got out of my way. So I didn’t switch to Woodpecker CI because of features - there aren’t that many differences anyway, because Woodpecker is a community fork of Drone CI. Rather, Drone CI started to have quite a bad smell. What bothered me the most was that their release notes were basically empty and said things like “integrated UI updates”. Then there’s whatever has been happening since they were bought by Harness. And then there’s the fact that the component which needs to mount your host’s Docker socket hasn’t been updated in over a year.
In contrast, Woodpecker is a community project and had a far nicer smell, so I decided that while I was at it, I would not just migrate Drone to k8s but also switch to Woodpecker.
One of the things I genuinely looked forward to was the Kubernetes backend. With the migration to k8s, I could finally make use of my entire cluster. With Drone’s Docker runner, I always had to reserve a lot of resources for CI job execution on the nodes where the agents were launched. Now, with the Kubernetes backend, it doesn’t matter (much, more on that later) where the agents are running - the only thing they do is launch Pods to run each step of the pipeline, and where those Pods are scheduled is left to Kubernetes.
I will go into more detail later, when talking about my CI job migrations, but let me still give a short example of what I’m actually talking about.
Here’s a slight variation of the example pipeline from the Woodpecker docs:
when:
- event: push
branch: master
steps:
- name: build
image: debian
commands:
- echo "This is the build step"
- echo "binary-data-123" > executable
- name: a-test-step
image: golang:1.16
commands:
- echo "Testing ..."
- ./executable
This pipeline tells Woodpecker that it should only be run when a Git push is done to the master branch of the repository. This file would be committed to the repository it’s used in, but there are also options to tell Woodpecker to listen on events for other repositories. So you could theoretically even have a separate “CI” repository with all the pipelines. But that’s generally not a good idea.
The pipeline itself will execute two separate steps, called “build” and “a-test-step”.
The image: parameter defines which container image each step is executed in, in this case Debian and the golang image. Then follows the list of commands to be run. In this case, they’re pretty nonsensical and will lead to a failed pipeline, but they’re only here for demonstration purposes anyway. In the Woodpecker web UI, this is what the pipeline looks like:
Database deployment
To begin with, Woodpecker needs a bit of infrastructure set up, namely a Postgres database. Smaller deployments can also run on SQLite; I’m using Postgres mostly out of habit.
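Just for illustration, here’s a minimal sketch of the SQLite variant in the server part of the Helm values, assuming sqlite3 is still the driver name the Woodpecker docs list. No external database is needed in that case, just some persistent storage for the database file:
server:
  env:
    # SQLite instead of an external Postgres instance
    WOODPECKER_DATABASE_DRIVER: "sqlite3"
  # The server's persistent volume then also holds the SQLite file
  persistentVolume:
    enabled: true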
As I’ve written about before, I’m using CloudNativePG for my Postgres DB needs. In the recent 1.25 release, CNPG introduced support for creating multiple databases in a single Cluster. But because I’ve already started with “one Cluster per app”, I decided to stay with that approach for the duration of the k8s migration and look into merging it all into one Cluster later.
Because I’ve written about it in detail before, here are just the basic options for the CNPG Cluster CRD I’m using:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: woodpecker-pg-cluster
labels:
homelab/part-of: woodpecker
spec:
instances: 2
imageName: "ghcr.io/cloudnative-pg/postgresql:16.2-10"
bootstrap:
initdb:
database: woodpecker
owner: woodpecker
resources:
requests:
memory: 200M
cpu: 150m
postgresql:
parameters:
max_connections: "200"
shared_buffers: "50MB"
effective_cache_size: "150MB"
maintenance_work_mem: "12800kB"
checkpoint_completion_target: "0.9"
wal_buffers: "1536kB"
default_statistics_target: "100"
random_page_cost: "1.1"
effective_io_concurrency: "300"
work_mem: "128kB"
huge_pages: "off"
max_wal_size: "128MB"
wal_keep_size: "512MB"
storage:
size: 1.5G
storageClass: rbd-fast
backup:
barmanObjectStore:
endpointURL: http://rook-ceph-rgw-rgw-bulk.rook-cluster.svc:80
destinationPath: "s3://backup-cnpg/"
s3Credentials:
accessKeyId:
name: rook-ceph-object-user-rgw-bulk-cnpg-backup-woodpecker
key: AccessKey
secretAccessKey:
name: rook-ceph-object-user-rgw-bulk-cnpg-backup-woodpecker
key: SecretKey
retentionPolicy: "30d"
---
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
name: woodpecker-pg-backup
spec:
method: barmanObjectStore
immediate: true
schedule: "0 30 1 * * *"
backupOwnerReference: self
cluster:
name: woodpecker-pg-cluster
As always, I’m configuring backups right away. For CNPG to work, the operator needs network access to the Postgres instance started up in the Woodpecker namespace, so a network policy is also needed:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "woodpecker-pg-cluster-allow-operator-ingress"
spec:
endpointSelector:
matchLabels:
cnpg.io/cluster: woodpecker-pg-cluster
ingress:
- fromEndpoints:
- matchLabels:
io.kubernetes.pod.namespace: cnpg-operator
app.kubernetes.io/name: cloudnative-pg
While we’re on the topic of network policies, here’s my generic deny-all policy I’m using in most namespaces:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "woodpecker-deny-all-ingress"
spec:
endpointSelector: {}
ingress:
- fromEndpoints:
- {}
This allows all intra-namespace access between Pods, but no ingress from any Pods in other namespaces.
And because Woodpecker provides a web UI, I also need to provide access to the server Pod from my Traefik ingress:
apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
name: "woodpecker-traefik-access"
spec:
endpointSelector:
matchExpressions:
- key: "app.kubernetes.io/name"
operator: In
values:
- "server"
ingress:
- fromEndpoints:
- matchLabels:
homelab/ingress: "true"
io.kubernetes.pod.namespace: traefik-ingress
Hm, writing all of this up I’m realizing that I completely forgot to write a post about some “standard things” I will be doing for most apps. I had planned to do that for the migration of my Audiobookshelf instance to k8s, but completely forgot to write any post about it at all. Will put it on the pile. 😄
Before getting to the Woodpecker Helm chart, we also need to do a bit of yak shaving with regards to the CNPG DB secrets. Helpfully, CNPG always creates a secret with the necessary credentials to access the database, in multiple formats. An example would look like this:
data:
dbname: woodpecker
host: woodpecker-pg-cluster-rw
jdbc-uri: jdbc:postgresql://woodpecker-pg-cluster-rw.woodpecker:5432/woodpecker?password=1234&user=woodpecker
password: 1234
pgpass: woodpecker-pg-cluster-rw:5432:woodpecker:woodpecker:1234
port: 5432
uri: postgresql://woodpecker:1234@woodpecker-pg-cluster-rw.woodpecker:5432/woodpecker
user: woodpecker
username: woodpecker
I would love to be able to use the values from that Secret verbatim, specifically the uri property, to set the WOODPECKER_DATABASE_DATASOURCE variable from it. But sadly, the Woodpecker Helm chart is one of those which do allow Secrets to be used to set environment variables - but only via envFrom.secretRef. That feeds all of the Secret’s keys in as env variables, but doesn’t allow setting specific env variables from specific keys of the Secret via env.valueFrom.secretKeyRef.
I think this is functionality every Helm chart should provide, specifically for cases like this. I’ve got two tools which automatically create Secrets in my cluster: CNPG for DB credentials and configs, and Rook, which creates Secrets and ConfigMaps for S3 buckets and Ceph users created through its CRDs. But every tool/Helm chart seems to have its own ideas about which env variables certain things should be stored in. The S3 credential env vars in the case of Rook’s S3 buckets should work in most cases because they’re pretty standardized, but everything else is pretty much hit-and-miss.
And with the env.valueFrom functionality for both Secrets and ConfigMaps, Kubernetes already provides the necessary utility to assign specific keys from them to specific env vars. A number of Helm charts just need to allow me to make use of that, instead of insisting on Secrets with a specific group of keys.
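Just to illustrate what I mean, this is the plain Kubernetes mechanism I’d like more charts to expose - a sketch using the CNPG Secret from above, not something the Woodpecker chart currently renders:
env:
  - name: WOODPECKER_DATABASE_DATASOURCE
    valueFrom:
      secretKeyRef:
        # The Secret CNPG creates automatically for the Cluster
        name: woodpecker-pg-cluster-app
        # ...and the key holding the full connection string
        key: uri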
Anyway, in the case of Secrets, I’ve found a pretty roundabout way to achieve what I want, namely being able to use automatically created credentials. And I’m using my External Secrets deployment for this, more specifically the ability to configure a Kubernetes namespace as a SecretStore:
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: secrets-store
labels:
homelab/part-of: woodpecker
spec:
provider:
kubernetes:
remoteNamespace: woodpecker
auth:
serviceAccount:
name: ext-secrets-woodpecker
server:
caProvider:
type: ConfigMap
name: kube-root-ca.crt
key: ca.crt
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: ext-secrets-woodpecker
labels:
homelab/part-of: woodpecker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ext-secrets-woodpecker-role
labels:
homelab/part-of: woodpecker
rules:
- apiGroups: [""]
resources:
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- authorization.k8s.io
resources:
- selfsubjectrulesreviews
verbs:
- create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
homelab/part-of: woodpecker
name: ext-secrets-woodpecker
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ext-secrets-woodpecker-role
subjects:
- kind: ServiceAccount
name: ext-secrets-woodpecker
namespace: woodpecker
This SecretStore then allows me to use External Secrets’ ExternalSecret templating to take the CNPG Secret created automatically and bring it into a format that makes it usable with the Woodpecker Helm chart. I decided that I would use the envFrom.secretRef method to turn all of the Secret’s keys into env variables:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: "woodpecker-db-secret"
labels:
homelab/part-of: woodpecker
spec:
secretStoreRef:
name: secrets-store
kind: SecretStore
refreshInterval: "1h"
target:
creationPolicy: 'Owner'
data:
- secretKey: WOODPECKER_DATABASE_DATASOURCE
remoteRef:
key: woodpecker-pg-cluster-app
property: uri
That ExternalSecret takes the uri key from the automatically created CNPG Secret and writes its content into a new Secret’s WOODPECKER_DATABASE_DATASOURCE key. And just like that, I have a Secret in the right format to use with Woodpecker’s Helm chart.
After I implemented the above, I had another thought on how I could do the same thing without taking the detour via an ExternalSecret. The Helm chart does provide options to add extra volume mounts. Furthermore, Woodpecker has the WOODPECKER_DATABASE_DATASOURCE_FILE variable, which allows reading the connection string from a file. So I could have mounted the CNPG DB Secret as a volume and then provided the path to the file with the uri key in this variable. Sadly, I found this a bit late, but I will keep this possibility in mind should I come across another Helm chart which lacks the ability to assign arbitrary Secret keys to env variables.
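Roughly - and untested on my side - that variant might have looked something like this in the server part of the Helm values. The names of the extra volume options and the mount path are assumptions on my part, so check the chart’s values.yaml:
server:
  env:
    # Read the connection string from a file in the mounted CNPG Secret
    # instead of passing it in via an env variable.
    WOODPECKER_DATABASE_DATASOURCE_FILE: "/etc/woodpecker/db/uri"
  # Assumed option names for mounting the Secret into the server Pod.
  extraVolumes:
    - name: db-secret
      secret:
        secretName: woodpecker-pg-cluster-app
  extraVolumeMounts:
    - name: db-secret
      mountPath: /etc/woodpecker/db
      readOnly: true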
Temporary StorageClass
Woodpecker needs some storage for every pipeline executed. That storage is shared between all steps and is used to clone the repository and share intermediate artifacts between steps.
With the Kubernetes backend, Woodpecker uses PersistentVolumeClaims, one per
pipeline run. It also automatically cleans those up after the pipeline has run
through.
The issue for me is that in my Rook Ceph setup, the StorageClasses all have their reclaim policy set to Retain. This is mostly because I’m not the smartest guy under the sun, and there’s a real chance that I might accidentally remove a PVC with data I would really like to keep.
But that’s a problem for these temporary PVCs, which are only relevant for the
duration of a single pipeline run. Using my standard StorageClasses would mean
ending up with a lot of unused PersistentVolumes.
So I had to create another StorageClass with the reclaim policy set to Delete:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: homelab-fs-temp
provisioner: rook-ceph.cephfs.csi.ceph.com
reclaimPolicy: Delete
parameters:
clusterID: rook-cluster
fsName: homelab-fs
pool: homelab-fs-bulk
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: "{{ .Release.Namespace }}"
csi.storage.k8s.io/controller-expand-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/controller-expand-secret-namespace: "{{ .Release.Namespace }}"
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: "{{ .Release.Namespace }}"
This uses CephFS as the provider, because I like those volumes to be RWX capable, which is not the case for RBD based volumes.
Using this StorageClass, the PersistentVolume is deleted when the PVC is deleted, freeing the space for the next pipeline run.
Gitea configuration
Because Woodpecker needs access to Gitea, there’s some configuration necessary as well, mainly related to the fact that Woodpecker doesn’t have its own authentication and instead relies on the forge it’s connected to.
To begin with, Woodpecker needs to be added as an OAuth2 application. This can be done by any user, under the https://gitea.example.com/user/settings/applications URL. The configuration is the same as for any other OAuth2 provider: Woodpecker needs a client ID and a client secret. The application can be given any name, and the redirect URL has to be https://<your-woodpecker-url>/authorize:
After clicking Create Application, Gitea creates the app and shows the necessary information:
I then copied the Client ID and Client Secret fields into my Vault instance and provided them to Kubernetes with another ExternalSecret:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: "gitea-secret"
labels:
homelab/part-of: woodpecker
spec:
secretStoreRef:
name: hashi-vault-store
kind: ClusterSecretStore
refreshInterval: "1h"
target:
creationPolicy: 'Owner'
data:
- secretKey: WOODPECKER_GITEA_CLIENT
remoteRef:
key: secret/gitea-oauth
property: clientid
- secretKey: WOODPECKER_GITEA_SECRET
remoteRef:
key: secret/gitea-oauth
property: clientSecret
That was all the Gitea config necessary. There’s going to be one more step when accessing Woodpecker for the first time. Because it uses OAuth2, it will redirect you to Gitea to log in, and Gitea will then need confirmation that Woodpecker can access your account info and repositories.
Deploying Woodpecker
For deploying Woodpecker itself, I’m using the official Helm chart.
It’s split into two subcharts, one for the agents which run the pipelines and one for the server. Let’s start with the server part of the values.yaml:
server:
enabled: true
metrics:
enabled: false
env:
WOODPECKER_OPEN: "false"
WOODPECKER_HOST: 'https://ci.example.com'
WOODPECKER_DISABLE_USER_AGENT_REGISTRATION: "true"
WOODPECKER_DATABASE_DRIVER: "postgres"
WOODPECKER_GITEA: "true"
WOODPECKER_GITEA_URL: "https://gitea.example.com"
WOODPECKER_PLUGINS_PRIVILEGED: "woodpeckerci/plugin-docker-buildx:latest-insecure"
extraSecretNamesForEnvFrom:
- gitea-secret
- woodpecker-db-secret
persistentVolume:
enabled: true
storageClass: rbd-bulk
ingress:
enabled: true
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: secureweb
hosts:
- host: ci.example.com
paths:
- path: /
resources:
requests:
cpu: 100m
limits:
memory: 128Mi
As I do so often, I explicitly set metrics.enabled to false, so that later I can go through my Homelab repo and slowly enable metrics for the apps I’m interested in, just by grepping for metrics.
Woodpecker is entirely configured through environment variables. I’ve configured those which don’t contain secrets right in the values.yaml, and the secrets are added via the extraSecretNamesForEnvFrom list. Those are the Gitea OAuth2 and CNPG DB Secrets. The server itself also needs some storage space, which I put on my bulk storage pool with the persistentVolume option. I’m also configuring the Ingress and resources.
A short comment on the resources: Make sure that you know what you’re doing. 😅 I initially had the cpu: 100m resource set under limits accidentally. And then I was wondering yesterday why the Woodpecker server was restarted so often due to failed liveness probes. Turns out that 100m is not enough CPU when the Pod happens to run on a Pi 4 and I’m also clicking around in the web UI. The liveness probe then doesn’t get a timely answer and starts failing, ultimately restarting the Pod.
The second part of a Woodpecker deployment are the agents. Those are the part of Woodpecker that runs the actual pipelines, launching the containers for each step. Woodpecker supports multiple backends. The first one is the traditional Docker backend, which needs the agent to have access to the Docker socket. That’s the config I’ve been running up to now with my Drone setup. It had two big downsides for me: first, a piece of software explicitly intended to execute arbitrary code had full access to the host’s Docker daemon. And second, the agent could only run pipelines on its own host, which meant that it couldn’t distribute the different steps across my entire Nomad cluster.
Now, with Woodpecker, I’m making use of the Kubernetes Backend. With this backend, the agents themselves only work as an interface to the k8s API, launching one Pod for each step and creating the PVC used as shared storage for all steps of a pipeline.
One quirk of the Kubernetes backend is that it adds a NodeSelector matching the architecture of the agent which picked up the pipeline. So when the agent executing a pipeline happens to run on an ARM64 machine, all Pods for that pipeline will also run on ARM64 machines. But this can also be controlled for individual steps.
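If I’m reading the docs right, that per-step control goes through the backend_options key of a step - here’s a sketch, with the nodeSelector using the same arch label Kubernetes sets on the nodes:
steps:
  - name: build
    image: debian
    commands:
      - echo "runs on an AMD64 node, regardless of the agent's architecture"
    # Kubernetes-backend specific options for this one step
    backend_options:
      kubernetes:
        nodeSelector:
          kubernetes.io/arch: amd64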
Here is the agent portion of the Woodpecker Helm values.yaml:
agent:
enabled: true
replicaCount: 2
env:
WOODPECKER_BACKEND: kubernetes
WOODPECKER_MAX_WORKFLOWS: 2
WOODPECKER_BACKEND_K8S_NAMESPACE: woodpecker
WOODPECKER_BACKEND_K8S_VOLUME_SIZE: 10G
WOODPECKER_BACKEND_K8S_STORAGE_CLASS: homelab-fs-temp
WOODPECKER_BACKEND_K8S_STORAGE_RWX: "true"
persistence:
enabled: true
storageClass: rbd-bulk
accessModes:
- ReadWriteOnce
serviceAccount:
create: true
rbac:
create: true
resources:
requests:
cpu: 100m
limits:
memory: 128Mi
topologySpreadConstraints:
- maxSkew: 1
topologyKey: "kubernetes.io/arch"
whenUnsatisfiable: "DoNotSchedule"
labelSelector:
matchLabels:
"app.kubernetes.io/name": agent
"app.kubernetes.io/instance": woodpecker
Here I’m configuring two agents to be run, each on a different architecture. In my cluster, this leads to one agent running on AMD64 and one running on ARM64, through the topologySpreadConstraints. I’m also telling the agents which StorageClass to use - as I explained above, I had to create a new one with retention disabled. I’m setting a default 10 GB size for the volume.
Before continuing with some CI pipeline configs, let’s have a short look at the Pods Woodpecker launches. I’ve captured the Pod for the following Woodpecker CI step:
- name: build
image: debian
commands:
- echo "This is the build step"
- echo "binary-data-123" > executable
- chmod u+x ./executable
- sleep 120
It looks like this:
apiVersion: v1
kind: Pod
metadata:
labels:
step: build
name: wp-01jhkac6pf4jyfywavjg6be5cq
namespace: woodpecker
spec:
containers:
- command:
- /bin/sh
- -c
- echo $CI_SCRIPT | base64 -d | /bin/sh -e
env:
- name: CI
value: woodpecker
- name: CI_COMMIT_AUTHOR
value: mmeier
- name: CI_COMMIT_AUTHOR_AVATAR
value: https://gitea.example.com/avatars/d941e68cc8aa38efdee91c3e3c97159e
- name: CI_COMMIT_AUTHOR_EMAIL
value: mmeier@noreply.gitea.example.com
- name: CI_COMMIT_BRANCH
value: master
- name: CI_COMMIT_MESSAGE
value: |
Add a sleep to inspect the Pod
- name: CI_COMMIT_REF
value: refs/heads/master
- name: CI_COMMIT_SHA
value: 353b9f67102ba120ffe9284aa711eb87c2542573
- name: CI_COMMIT_URL
value: https://gitea.example.com/adm/ci-tests/commit/353b9f67102ba120ffe9284aa711eb87c2542573
- name: CI_FORGE_TYPE
value: gitea
- name: CI_FORGE_URL
value: https://gitea.example.com
- name: CI_MACHINE
value: woodpecker-agent-1
- name: CI_PIPELINE_CREATED
value: "1736888948"
- name: CI_PIPELINE_EVENT
value: push
- name: CI_PIPELINE_FILES
value: '[".woodpecker/my-first-workflow.yaml"]'
- name: CI_PIPELINE_FINISHED
value: "1736888960"
- name: CI_PIPELINE_FORGE_URL
value: https://gitea.example.com/adm/ci-tests/commit/353b9f67102ba120ffe9284aa711eb87c2542573
- name: CI_PIPELINE_NUMBER
value: "3"
- name: CI_PIPELINE_PARENT
value: "0"
- name: CI_PIPELINE_STARTED
value: "1736888951"
- name: CI_PIPELINE_STATUS
value: success
- name: CI_PIPELINE_URL
value: https://ci.example.com/repos/1/pipeline/3
- name: CI_PREV_COMMIT_AUTHOR
value: mmeier
- name: CI_PREV_COMMIT_AUTHOR_AVATAR
value: https://gitea.example.com/avatars/d941e68cc8aa38efdee91c3e3c97159e
- name: CI_PREV_COMMIT_AUTHOR_EMAIL
value: mmeier@noreply.gitea.example.com
- name: CI_PREV_COMMIT_BRANCH
value: master
- name: CI_PREV_COMMIT_MESSAGE
value: |
Possibly fix permission error
- name: CI_PREV_COMMIT_REF
value: refs/heads/master
- name: CI_PREV_COMMIT_SHA
value: b680ab9b9a7aa300d80a43bd389de0e57f767e4f
- name: CI_PREV_COMMIT_URL
value: https://gitea.example.com/adm/ci-tests/commit/b680ab9b9a7aa300d80a43bd389de0e57f767e4f
- name: CI_PREV_PIPELINE_CREATED
value: "1736800786"
- name: CI_PREV_PIPELINE_EVENT
value: push
- name: CI_PREV_PIPELINE_FINISHED
value: "1736800827"
- name: CI_PREV_PIPELINE_FORGE_URL
value: https://gitea.example.com/adm/ci-tests/commit/b680ab9b9a7aa300d80a43bd389de0e57f767e4f
- name: CI_PREV_PIPELINE_NUMBER
value: "2"
- name: CI_PREV_PIPELINE_PARENT
value: "0"
- name: CI_PREV_PIPELINE_STARTED
value: "1736800790"
- name: CI_PREV_PIPELINE_STATUS
value: failure
- name: CI_PREV_PIPELINE_URL
value: https://ci.example.com/repos/1/pipeline/2
- name: CI_REPO
value: adm/ci-tests
- name: CI_REPO_CLONE_SSH_URL
value: ssh://gituser@git.example.com:1234/adm/ci-tests.git
- name: CI_REPO_CLONE_URL
value: https://gitea.example.com/adm/ci-tests.git
- name: CI_REPO_DEFAULT_BRANCH
value: master
- name: CI_REPO_NAME
value: ci-tests
- name: CI_REPO_OWNER
value: adm
- name: CI_REPO_PRIVATE
value: "true"
- name: CI_REPO_REMOTE_ID
value: "94"
- name: CI_REPO_SCM
value: git
- name: CI_REPO_TRUSTED
value: "false"
- name: CI_REPO_URL
value: https://gitea.example.com/adm/ci-tests
- name: CI_STEP_FINISHED
value: "1736888960"
- name: CI_STEP_NUMBER
value: "0"
- name: CI_STEP_STARTED
value: "1736888951"
- name: CI_STEP_STATUS
value: success
- name: CI_STEP_URL
value: https://ci.example.com/repos/1/pipeline/3
- name: CI_SYSTEM_HOST
value: ci.example.com
- name: CI_SYSTEM_NAME
value: woodpecker
- name: CI_SYSTEM_PLATFORM
value: linux/amd64
- name: CI_SYSTEM_URL
value: https://ci.example.com
- name: CI_SYSTEM_VERSION
value: 2.8.1
- name: CI_WORKFLOW_NAME
value: my-first-workflow
- name: CI_WORKFLOW_NUMBER
value: "1"
- name: CI_WORKSPACE
value: /woodpecker/src/gitea.example.com/adm/ci-tests
- name: HOME
value: /root
- name: CI_SCRIPT
value: CmlmIFsgLW4gIiRDSV9ORVRSQ19NQUNISU5FIiBdOyB0aGVuCmNhdCA8PEVPRiA+ICRIT01FLy5uZXRyYwptYWNoaW5lICRDSV9ORVRSQ19NQUNISU5FCmxvZ2luICRDSV9ORVRSQ19VU0VSTkFNRQpwYXNzd29yZCAkQ0lfTkVUUkNfUEFTU1dPUkQKRU9GCmNobW9kIDA2MDAgJEhPTUUvLm5ldHJjCmZpCnVuc2V0IENJX05FVFJDX1VTRVJOQU1FCnVuc2V0IENJX05FVFJDX1BBU1NXT1JECnVuc2V0IENJX1NDUklQVAoKZWNobyArICdlY2hvICJUaGlzIGlzIHRoZSBidWlsZCBzdGVwIicKZWNobyAiVGhpcyBpcyB0aGUgYnVpbGQgc3RlcCIKCmVjaG8gKyAnZWNobyAiYmluYXJ5LWRhdGEtMTIzIiA+IGV4ZWN1dGFibGUnCmVjaG8gImJpbmFyeS1kYXRhLTEyMyIgPiBleGVjdXRhYmxlCgplY2hvICsgJ2NobW9kIHUreCAuL2V4ZWN1dGFibGUnCmNobW9kIHUreCAuL2V4ZWN1dGFibGUKCmVjaG8gKyAnc2xlZXAgMTIwJwpzbGVlcCAxMjAK
- name: SHELL
value: /bin/sh
image: debian
imagePullPolicy: Always
name: wp-01jhkac6pf4jyfywavjg6be5cq
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /woodpecker
name: wp-01jhkac6pf4jyfywavjasgpcwn-0-default
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-n75dj
readOnly: true
workingDir: /woodpecker/src/gitea.example.com/adm/ci-tests
dnsPolicy: ClusterFirst
enableServiceLinks: true
imagePullSecrets:
- name: regcred
nodeName: sehith
nodeSelector:
kubernetes.io/arch: amd64
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- name: wp-01jhkac6pf4jyfywavjasgpcwn-0-default
persistentVolumeClaim:
claimName: wp-01jhkac6pf4jyfywavjasgpcwn-0-default
- name: kube-api-access-n75dj
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
There are a number of noteworthy things in here. First perhaps the handling of the script to execute for the job:
- command:
- /bin/sh
- -c
- echo $CI_SCRIPT | base64 -d | /bin/sh -e
env:
- name: CI_SCRIPT
value: CmlmIFsgLW4gIiRDSV9ORVRSQ19NQUNISU5FIiBdOyB0aGVuCmNhdCA8PEVPRiA+ICRIT01FLy5uZXRyYwptYWNoaW5lICRDSV9ORVRSQ19NQUNISU5FCmxvZ2luICRDSV9ORVRSQ19VU0VSTkFNRQpwYXNzd29yZCAkQ0lfTkVUUkNfUEFTU1dPUkQKRU9GCmNobW9kIDA2MDAgJEhPTUUvLm5ldHJjCmZpCnVuc2V0IENJX05FVFJDX1VTRVJOQU1FCnVuc2V0IENJX05FVFJDX1BBU1NXT1JECnVuc2V0IENJX1NDUklQVAoKZWNobyArICdlY2hvICJUaGlzIGlzIHRoZSBidWlsZCBzdGVwIicKZWNobyAiVGhpcyBpcyB0aGUgYnVpbGQgc3RlcCIKCmVjaG8gKyAnZWNobyAiYmluYXJ5LWRhdGEtMTIzIiA+IGV4ZWN1dGFibGUnCmVjaG8gImJpbmFyeS1kYXRhLTEyMyIgPiBleGVjdXRhYmxlCgplY2hvICsgJ2NobW9kIHUreCAuL2V4ZWN1dGFibGUnCmNobW9kIHUreCAuL2V4ZWN1dGFibGUKCmVjaG8gKyAnc2xlZXAgMTIwJwpzbGVlcCAxMjAK
- name: SHELL
value: /bin/sh
Running the CI_SCRIPT content through base64 -d results in this shell script:
if [ -n "$CI_NETRC_MACHINE" ]; then
cat <<EOF > $HOME/.netrc
machine $CI_NETRC_MACHINE
login $CI_NETRC_USERNAME
password $CI_NETRC_PASSWORD
EOF
chmod 0600 $HOME/.netrc
fi
unset CI_NETRC_USERNAME
unset CI_NETRC_PASSWORD
unset CI_SCRIPT
echo + 'echo "This is the build step"'
echo "This is the build step"
echo + 'echo "binary-data-123" > executable'
echo "binary-data-123" > executable
echo + 'chmod u+x ./executable'
chmod u+x ./executable
echo + 'sleep 120'
sleep 120
This shows how the commands: list from the step object in the Woodpecker file is converted into a shell script, by copying the commands into the script and adding an echo for each of them.
Looking at this and thinking about my own work on a large CI, I’m sometimes wondering what we’d do without the base64 command. 😅
Another aspect of the setup is all of the available environment variables, supplying a lot of information not just on the commit currently being CI tested, but also on the previous commit. Most of the CI_ variables also have equivalents prefixed with DRONE_, for backwards compatibility. I removed those in the output above to not make the snippet too long.
Finally there’s proof of what I said above about the agent’s architecture. This pipeline was run by the agent on my AMD64 node, resulting in the NodeSelector for AMD64 nodes:
nodeName: sehith
nodeSelector:
kubernetes.io/arch: amd64
Also nice to see that the Pod was running on sehith, which isn’t the node the agent ran on - showing that the Pods are just submitted to k8s for scheduling and can run on any (in this case AMD64) node.
Before ending the post, let’s have a look at some example CI configurations.
CI configurations
Each repository using Woodpecker needs to be enabled. This is done from Woodpecker’s web UI:

When clicking the Enable button, Woodpecker will contact Gitea and add a webhook configuration for the repository. With that, Gitea will call the webhook with information about the event which triggered it and the state of the repository.
The Woodpecker configuration files for a specific repository are expected in the .woodpecker/ directory at the repository root by default.
Blog repo example
Here’s the configuration I’m using to build and publish this blog:
when:
- event: push
steps:
- name: Hugo Site Build
image: "harbor.mei-home.net/homelab/hugo:0.125.4-r3"
commands:
- hugo
- name: Missing alt text check
image: python:3
commands:
- pip install lxml beautifulsoup4
- python3 scripts/alt_text.py ./public/posts/
- name: Hugo Site Upload
image: "harbor.mei-home.net/homelab/hugo:latest"
environment:
AWS_ACCESS_KEY_ID:
from_secret: access-key
AWS_SECRET_ACCESS_KEY:
from_secret: secret-key
commands:
- s3cmd -c /s3cmd.conf sync -r --delete-removed --delete-after --no-mime-magic ./public/ s3://blog/
To start with, the page needs to be built, using Hugo in an image I build myself, based on Alpine with a couple of tools installed. Then I’m running a short Python script which uses beautifulsoup4 to scan through the generated HTML and make sure that each image has alt text, and that there’s actually something in that alt text. Finally, I push the generated site up to an S3 bucket in my Ceph cluster, from where it is served.
The when: at the beginning is important: it determines under which conditions the pipeline is executed. This can be configured for specific branches or certain events, like a push or an update of a pull request. The different conditions can also be combined. In addition to configuring conditions on the entire pipeline, they can also be configured just on certain steps, as we will see later.
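As a small sketch of what combined conditions look like - the docs/ path filter here is just a made-up example, but branch, event and path are the filters I actually use later in this post:
when:
  - event: push
    branch: master

steps:
  - name: build
    image: debian
    commands:
      - echo "runs for every push to master"
  - name: docs check
    image: debian
    commands:
      - echo "additionally limited to changes under docs/"
    # Step-level condition, narrowing the pipeline-level one further
    when:
      - path: 'docs/*'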
One thing I find a little bit lacking at the moment, specifically for the Kubernetes use case, is the secrets management. It’s currently only possible via the web UI or the CLI. There’s no way to provide specific Kubernetes Secrets to certain steps in a certain pipeline. But there is an open issue on GitHub to implement support for Kubernetes Secrets. Until that is implemented, the UI needs to be used. It looks like this:

Secrets can be configured for specific repositories, for specific orgs where the forge supports them, and for all pipelines.
When looking at a specific repository, all of the pipelines which ran for it are listed:

This gives a nice overview of the pipelines which ran recently, here with the example of my blog repository, including the most recent run for publishing the post on the backup operator deployment.
Clicking on one of the pipeline runs then shows the overview of that pipeline’s steps and the step logs:
This pipeline is not very complex and runs through in about two minutes. So let’s have a look at another pipeline with a bit more complexity.
Docker repo example
Another repository where I’m making quite some use of CI is my Docker repository. In that repo, I’ve got a couple of Dockerfiles for cases where I’m adding something to upstream images or building my own where no upstream container is available.
This repository’s CI is a bit more complicated, mostly because it does the same thing for multiple different Dockerfiles, and because it needs to do different things for pull requests and commits pushed to the master branch.
And that’s where the problems begin, at least to a certain extent. As I’ve shown above, you can provide a when config to tell Woodpecker under which conditions to run the pipeline. And if you leave that out completely, you don’t end up with the pipeline being run for all commits. No. You end up with the pipeline being run twice for some commits.
Consider, for example, this configuration:
steps:
- name: build image
image: woodpeckerci/plugin-docker-buildx:latest-insecure
settings:
<<: *dockerx-config
dry-run: true
when:
- event: pull_request
Ignore the config under settings, and concentrate on the fact that there’s no when config on the pipeline level, only on the step level. And there’s only one step, which is supposed to run on pull requests. The result of this config is that two pipelines will be started - including Pod launches, PVC creation and so on:

Pipeline #1 was launched for the “push” event to the woodpecker-ci branch, and the other for the update of the pull request that push belonged to. The push event pipeline only launched the clone step, while the pull request pipeline launched the clone step and the build image step.
The root cause for this behavior is that Gitea always triggers the webhook for all fitting events, one for each event. And consequently, Woodpecker then launches one pipeline for each event.
A similar effect can be observed when combining pull request and push events in one when clause on the pipeline level.
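To make that concrete, this is the kind of pipeline-level trigger I mean - with a config like this, a push to a pull request branch makes Gitea fire two webhooks, and Woodpecker starts one pipeline per event:
when:
  - event: push
  - event: pull_request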
Now, you might be saying: Okay, then just configure the triggers only on the steps, not on the entire pipeline. But that also doesn’t really work. Without a when clause, as shown above, two pipelines are always started for commits in pull requests. And even though one of the pipelines won’t do much, it would still do something. In my case, it would launch a Pod for the clone step and also create a PVC and clone the repo - for nothing.
The next idea I came up with: Okay, then let’s set the pipeline’s when to trigger on push events, because that would trigger the pipeline for both - pushes to branches like master and pushes to pull requests. And then just add when clauses to each step with either the pull request or push event, depending on when it is supposed to run.
But that also won’t work - any given pipeline only ever sees one event. If I trigger on push events on the pipeline level, the steps triggering on the pull request event will never run.
I finally figured out a way to do this: I always trigger the pipeline on push events, and then I use Woodpecker’s evaluate clause to run individual steps only for certain branches.
With all of that said, this is what the config is looking like for the pipeline which builds my Hugo container:
when:
- event: push
path:
- '.woodpecker/hugo.yaml'
- 'hugo/*'
variables:
- &alpine-version '3.21.2'
- &app-version '0.139.0-r0'
- &dockerx-config
debug: true
repo: harbor.example.com/homelab/hugo
registry: harbor.example.com
username: ci
password:
from_secret: container-registry
dockerfile: hugo/Dockerfile
context: hugo/
mirror: https://harbor-mirror.example.com
buildkit_config: |
debug = true
[registry."docker.io"]
mirrors = ["harbor.example.com/dockerhub-cache"]
[registry."quay.io"]
mirrors = ["harbor.example.com/quay.io-cache"]
[registry."ghcr.io"]
mirrors = ["harbor.example.com/github-cache"]
tags:
- latest
- *app-version
build_args:
hugo_ver: *app-version
alpine_ver: *alpine-version
platforms:
- "linux/amd64"
- "linux/arm64"
steps:
- name: build image
image: woodpeckerci/plugin-docker-buildx:latest-insecure
settings:
<<: *dockerx-config
dry-run: true
when:
- evaluate: 'CI_COMMIT_BRANCH != CI_REPO_DEFAULT_BRANCH'
- name: release image
image: woodpeckerci/plugin-docker-buildx:latest-insecure
settings:
<<: *dockerx-config
dry-run: false
when:
- evaluate: 'CI_COMMIT_BRANCH == CI_REPO_DEFAULT_BRANCH'
First, what does this pipeline do for pull requests and main branch pushes?
For pull requests, it uses the buildx plugin to build a container image from hugo/Dockerfile in the repository. That’s what happens in the build image step. Notably, no push to a registry happens here.
In the case of pushes to the repo’s default branch, which is provided by Gitea in the webhook call, the same plugin and build is used, but this time the newly built images are pushed to my Harbor registry. For more details on that setup, see this post.
In the when clause for the pipeline, as I’ve explained above, I’m triggering on the push event, to circumvent the problem with multiple pipelines being executed for commits in pull requests.
In addition, I’m also making use of path-based triggers. Because I’ve got multiple container images defined in one repository, I’d like to avoid unnecessarily running builds for images which haven’t changed. That’s done by triggering the pipeline only on changes to its own config file and changes in the hugo/ directory. So if the Hugo image definition and CI config haven’t changed, the pipeline won’t be triggered.
As you can see, I’m building images for both AMD64 and ARM64. And before I close this section, I have to tell a slightly embarrassing story. I initially tried to run two pipelines - one for each architecture - so that they could both run in parallel on different nodes fitting their architecture. This would avoid the cost of emulating a foreign architecture, making the builds faster overall. This seemed like an excellent idea. And it worked really, really well. The pipelines got a couple of minutes faster. Until I had a look at my Harbor instance. And as some of you might have already figured out, I found that there was of course not one tag with images for both architectures. Instead, the tag contained whichever image was pushed last, because two Docker pushes to the same tag overwrite each other instead of merging. This is a problem I need to have another look at later. Someone on the Fediverse already showed me that there is a multistep way to do this manually.
Another point that I still need to improve is image caching. I think that there’s still some potential for optimization in my setup. But that’s also something for after the k8s migration is done.
Before I close out this section, I would like to point out a pretty nice feature Woodpecker has: A linter for the pipeline definitions, for example like this:
The configuration spitting out those warnings is this one for my blog:
steps:
- name: submodules
image: alpine/git
commands:
- git submodule update --init --recursive
- name: Hugo Site Build
image: "harbor.example.com/homelab/hugo:0.125.4-r3"
environment:
commands:
- hugo
- name: Missing alt text check
image: python:3
environment:
commands:
- pip install lxml beautifulsoup4
- python3 scripts/alt_text.py ./public/posts/
- name: Hugo Site Upload
image: "harbor.example.com/homelab/hugo:latest"
environment:
AWS_ACCESS_KEY_ID:
from_secret: access-key
AWS_SECRET_ACCESS_KEY:
from_secret: secret-key
commands:
- s3cmd -c /s3cmd.conf sync -r --delete-removed --delete-after --no-mime-magic ./public/ s3://blog/
The main issues are the empty environment keys, as well as the fact that I did not set any when clause.
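For reference, a version that should keep the linter quiet would just add a when clause at the top and drop the empty environment keys - here’s a sketch of the first two steps, with the remaining ones staying the same apart from the removed empty keys:
when:
  - event: push

steps:
  - name: submodules
    image: alpine/git
    commands:
      - git submodule update --init --recursive
  - name: Hugo Site Build
    image: "harbor.example.com/homelab/hugo:0.125.4-r3"
    commands:
      - hugo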
Conclusion
And that’s it. Again a pretty long one, but I had never written about my CI setup and wanted to take this chance to do so, also because I had gotten some questions on the Fediverse from people about what a CI actually does, and some interest in what Woodpecker looks like.
Oh and also, I just have a propensity for long-winded writing. 😅
With this post, the Woodpecker/CI migration to k8s is done, and I’m quite happy with it. Especially the fact that my CI pipeline steps now get distributed over the entire cluster instead of just running on the nodes with the agents.
For the next step I will likely take my Gitea instance and migrate it over, but as this blog post took longer than I thought it would, it might have to wait until next weekend.