After implementing a lot of helpers and basic infrastructure for my Smokeweb app (see here if you’re interested in what the project is about), I finally arrived at creating the database connection. In this post, I will show how I implemented some scripting for a local Postgres database using Podman, give an overview of the Golang test helpers I implemented for tests which need a database, and then explain how I got database tests running in WoodpeckerCI.

The main goal of this phase was to add a *sql.DB connection to my main App struct, and to extend the healthcheck API endpoint with a DB.Ping() check.
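Just to make that goal concrete, here is a minimal sketch of the shape this takes; the struct layout and handler name are my assumptions for illustration, not the actual Smokeweb code:

package smokeweb

import (
	"database/sql"
	"net/http"
)

// App carries the application's dependencies, among them the DB connection.
type App struct {
	db *sql.DB
}

// healthcheck pings the database and reports 503 when it is unreachable.
func (a *App) healthcheck(w http.ResponseWriter, r *http.Request) {
	if err := a.db.PingContext(r.Context()); err != nil {
		http.Error(w, "database unreachable", http.StatusServiceUnavailable)
		return
	}
	w.WriteHeader(http.StatusOK)
}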

Before I could go ahead with all of that though, I wanted to implement some scripting to launch a dev DB.

Development database setup

So let’s start with setting up a local dev DB. My intention was to have a database for local development which intentionally doesn’t get deleted automatically, so that I could easily experiment across multiple sessions without having to set everything up again and again. These were the goals for this development database setup:

  • Database is not deleted/reset automatically
  • Database can be set up easily, without complicated environment setup
  • Run the database containerized

For the containerization, I decided on Podman, just because I’m using Podman instead of Docker for most things these days. The script starts like this:

#! /usr/bin/env bash

SCRIPT_PATH=$(dirname "$(realpath "$0")")
DATADIR_PATH="${SCRIPT_PATH}/../db-data"
SMOKEWEB_DB_CONTAINER_NAME="smokeweb-db"
source "${SCRIPT_PATH}/dev_db_config"

The dev_db_config contains a number of environment variables with defaults for the database config, for easier sharing of those defaults between multiple scripts:

export DEVDB_HOST="${DEVDB_HOST:-localhost}"
export DEVDB_PORT="${DEVDB_PORT:-3307}"
export DEVDB_ROOT_PW="${DEVDB_ROOT_PW:-12345}"
export DEVDB_NAME="${DEVDB_NAME:-smokeweb_dev}"
export DEVDB_USER="${DEVDB_USER:-smokeweb_testuser}"
export DEVDB_USER_PW="${DEVDB_USER_PW:-7890}"

All of these set defaults, but use existing values if the variable is already set. This will later be used during the CI setup, for example. The passwords, usernames and database names are hardcoded intentionally, as this scripting is only intended for development. And for dev, convenience is king. I could have also run the Postgres instance completely without credentials, but then I would have lost the ability to test credential handling in my app. This file will also be sourced later on by my Makefile, for targets which require a running database.
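Overriding one of those defaults is then just a matter of setting the variable before calling a script. For example, to run the dev database on a different host port (the launch-if-stopped subcommand shows up again in the Makefile below):

# Use the defaults from dev_db_config:
./scripts/dev-db.bash launch-if-stopped

# Override the host port for this invocation only:
DEVDB_PORT=5433 ./scripts/dev-db.bash launch-if-stopped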

Next, let’s look at the start function, which launches the database:

function start() {
  if running; then
    echo "Container already running, stop it first."
    exit 1
  fi
  podman run -d --rm -v "${DATADIR_PATH}:/var/lib/postgresql:rw,Z" --userns=keep-id:uid=$(id -u),gid=$(id -g) \
         --name "${SMOKEWEB_DB_CONTAINER_NAME}" \
         -e POSTGRES_PASSWORD="${DEVDB_ROOT_PW}" \
         -e PGDATA="/var/lib/postgresql" \
         -p "127.0.0.1:${DEVDB_PORT}:5432" \
         docker.io/library/postgres:18.1 || { echo "Failed to launch podman, exiting."; exit 1; }

  wait_for_ready || exit 1
}

I will discuss the running and wait_for_ready functions in a moment. What’s happening here is the creation of a Podman container running the official Postgres image. The container is run in detached mode, so you can use the terminal for other things afterwards. The --userns=keep-id:uid=$(id -u),gid=$(id -g) option ensures that the files and directories created by the container are owned by your local user, not by e.g. the postgres user from the container or, even worse, root.

The running function is a small helper to check whether the container is already running:

function running() {
  if podman ps --filter status=running --filter name=${SMOKEWEB_DB_CONTAINER_NAME} --no-trunc --noheading | grep -q "${SMOKEWEB_DB_CONTAINER_NAME}"; then
    return 0
  fi
  return 1
}

Here I’m using the podman ps command to filter for the container’s name and check whether it’s running.

The wait_for_ready function is another helper, this time ensuring that the database container is done initializing, which can take a couple of seconds:

function wait_for_ready() {
  export PGPASSWORD="${DEVDB_USER_PW}"
  until pg_isready -d "${DEVDB_NAME}" -h "${DEVDB_HOST}" -p "${DEVDB_PORT}" -t 15 -U "${DEVDB_USER}"; do
    echo "Waiting for DB to come up"
    sleep 5
  done
}

Here I’m making use of the pg_isready tool, which is provided by the Postgres project. It checks whether the database server is ready to accept connections. Setting the password and the user is not strictly necessary; I just provided them reflexively when I saw the options in the --help output. The tool itself only checks whether connections are accepted, not whether a connection is actually allowed and authenticated.

To combine starting and waiting for readiness, I’ve also got the launch_if_stopped function:

function launch_if_stopped() {
  if ! running; then
    echo "Container not running, starting"
    start || exit 1
  fi

  echo "Waiting for DB to be ready..."
  wait_for_ready || exit 1
}

It launches the database container if it isn’t running yet, in contrast to the start function, which errors out when the container is already running.
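The stop function, used by the reset function below and by the Makefile later on, isn’t shown in this post. Because the container is started with --rm, stopping it also removes it, so a minimal sketch presumably only needs podman stop:

function stop() {
  if ! running; then
    echo "Container not running."
    return 0
  fi
  # The container was started with --rm, so stopping it also removes it.
  podman stop "${SMOKEWEB_DB_CONTAINER_NAME}"
}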

Next, there’s the reset function:

function reset() {
  if running; then
    stop
  fi

  rm -fr "${DATADIR_PATH:?}/"* || exit 1
}

It stops the container if it’s running, and then deletes the data directory. The ${DATADIR_PATH:?} expansion makes the script abort if the variable is unset, guarding against an accidental rm -fr of /*.

Then there’s the init function, which creates the necessary directories and the dev database itself:

function init() {
  if [[ ! -d "${DATADIR_PATH}" ]]; then
    mkdir -p "${DATADIR_PATH}"
  fi

  if ! running; then
    start
  fi

  export PGPASSWORD="${DEVDB_ROOT_PW}"
  pg_isready -h "${DEVDB_HOST}" -p "${DEVDB_PORT}" -t 120 -U postgres || return 1
  psql -U postgres -p "${DEVDB_PORT}" -h "${DEVDB_HOST}" <<<"${INIT_SQL}"
}

Here is the initial SQL for creating the dev database and the dev user:

read -r -d '' INIT_SQL <<EOF
CREATE DATABASE ${DEVDB_NAME};
\c ${DEVDB_NAME}
CREATE ROLE ${DEVDB_USER} WITH LOGIN PASSWORD '${DEVDB_USER_PW}';
ALTER DATABASE ${DEVDB_NAME} OWNER TO ${DEVDB_USER};
EOF

Last but not least, I’ve already got a convenience function to connect to a running database:

function connect() {
  launch_if_stopped || exit 1
  export PGPASSWORD="${DEVDB_ROOT_PW}"
  psql -U postgres -p "${DEVDB_PORT}" -h "${DEVDB_HOST}"
}

This scripting is then used by my main Makefile, in a couple of targets for handling the database:

.PHONY: dev/db/init
dev/db/init:
	@scripts/dev-db.bash init

.PHONY: dev/db/launch
dev/db/launch:
	@scripts/dev-db.bash launch-if-stopped

.PHONY: dev/db/stop
dev/db/stop:
	@scripts/dev-db.bash stop

.PHONY: dev/db/reset
dev/db/reset:
	@scripts/dev-db.bash reset

.PHONY: dev/db/connect
dev/db/connect:
	@scripts/dev-db.bash connect

Finally, there’s the question of how to get the database connection variables into the tests, and into the app’s default run configuration for dev purposes. The dev part was the easier one: I constructed the default DB URI string in such a way that it fits the default development DB. It looks like this:

postgres://smokeweb_testuser:7890@localhost:3307/smokeweb_dev?sslmode=disable

To make the developer’s life (meaning mine) a bit easier, I also introduced a Makefile target for launching the app, which also depends on the dev/db/launch target from above:

.PHONY: dev/run
dev/run: dev/db/launch
	@go run ./cmd server

For the test targets, I instead sourced the dev_db_config file with the exported variables from the beginning of this section:

.PHONY: ut
ut:
	@echo "Running UTs..."
	@. scripts/dev_db_config && go test ./...

I had initially used Make’s include directive. But the issue was that I wanted to configure defaults for the variables, and Make doesn’t support bash’s ${VAR:-default} syntax. So I ended up keeping dev_db_config in bash/shell syntax and sourcing it explicitly in the target’s command.

Golang test helpers

Next question: How to provide the test databases for unit tests? While the more complicated database access will be mocked, I wanted to have the ability to test the SQL statements, starting with the function to create the database connection.

My basic idea was to have a single Postgres instance which is kept running through all tests, without being restarted in between. The test helpers then have the ability to create databases and users with names derived from the test’s name. This way, I wouldn’t have to restart or reset a database instance for every test. Instead, I only needed to make sure that databases and users are properly cleaned up at the end of each test run.

I also decided to separate the control connection, which is used to manage the user and database for the test, and the test connection, which is used by the code under test.

My TestDB struct looks like this:

type TestDB struct {
	rootConn *sql.DB
	dbHost   string
	dbPort   string
	dbUser   string
	dbPass   string
	dbName   string
}

It holds the root connection, which is used for management, as well as all the database configuration variables. It’s created via this function:

func NewRootDBConn(t *testing.T) *TestDB {
	t.Helper()
	rootPW := os.Getenv("DEVDB_ROOT_PW")
	if rootPW == "" {
		t.Fatal("could not get DB root pw from DEVDB_ROOT_PW")
	}
	port := os.Getenv("DEVDB_PORT")
	if port == "" {
		t.Fatal("could not get DB port from DEVDB_PORT")
	}
	host := os.Getenv("DEVDB_HOST")
	if host == "" {
		t.Fatal("could not get DB host from DEVDB_HOST")
	}
	dsn := fmt.Sprintf("postgres://postgres:%s@%s:%s?sslmode=disable",
		rootPW,
		host,
		port,
	)
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		t.Fatalf("could not open postgres DB with DSN: %s\nError: %v", dsn, err)
	}
	if err := db.Ping(); err != nil {
		t.Fatalf("got error trying to ping DB: %v", err)
	}
	testdb := TestDB{
		rootConn: db,
		dbPort:   port,
		dbHost:   host,
	}
	t.Cleanup(testdb.finalizer(t))
	return &testdb
}

Note especially the use of the t.Cleanup method. It allows me to add a finalizer to the current test’s execution. That finalizer looks like this:

func (conn *TestDB) finalizer(t *testing.T) func() {
	return func() {
		if conn.dbName == "" {
			conn.dbName = conn.dbUser
		}
		if conn.dbName != "" {
			conn.DeleteDB(t)
		}
		if conn.dbUser != "" {
			conn.DeleteUser(t)
		}
		_ = conn.rootConn.Close()
	}
}

Its task is to clean up the database and user created for this TestDB struct. They are created in these functions:

func (conn *TestDB) AddUser(t *testing.T) {
	t.Helper()
	conn.dbUser = base64.URLEncoding.EncodeToString([]byte(t.Name()))
	conn.dbPass = base64.URLEncoding.EncodeToString([]byte(t.Name()))

	query := fmt.Sprintf(`CREATE ROLE "%s" WITH LOGIN PASSWORD '%s'`, conn.dbUser, conn.dbPass)

	_, err := conn.rootConn.Exec(query)
	if err != nil {
		t.Fatalf("failed to create new user: %s with password %s: \n%v", conn.dbUser, conn.dbPass, err)
	}
}

func (conn *TestDB) CreateDB(t *testing.T) {
	t.Helper()
	if conn.dbUser == "" {
		t.Fatal("user is empty during db creation")
	}
	conn.dbName = base64.URLEncoding.EncodeToString([]byte(t.Name()))
	query := fmt.Sprintf(`CREATE DATABASE "%s" OWNER "%s"`, conn.dbName, conn.dbUser)

	_, err := conn.rootConn.Exec(query)
	if err != nil {
		t.Fatal(err)
	}
}

Setting the username, password and database name to the base64 encoding of the test’s name makes sure I don’t have to come up with unique database names for each test: the test names are already unique. It also accounts for creating the database in the main test or in subtests, since the naming scheme works regardless of where the testing.T handed into the function comes from.
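To illustrate the naming scheme, here is a tiny, hypothetical test which just logs the names the helpers would derive:

func TestNaming(t *testing.T) {
	// t.Name() is "TestNaming" here.
	t.Logf("main: %s", base64.URLEncoding.EncodeToString([]byte(t.Name())))
	t.Run("sub test", func(tt *testing.T) {
		// Subtest names include the parent and replace spaces with
		// underscores: "TestNaming/sub_test". Still unique, and the
		// URL-safe base64 alphabet keeps the result usable as a quoted
		// Postgres identifier.
		tt.Logf("sub: %s", base64.URLEncoding.EncodeToString([]byte(tt.Name())))
	})
}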

One issue I found with this setup: During deletion, the database always needs to be deleted before the user. Otherwise, Postgres will throw an error, because the user (or role, in the correct parlance) can only be deleted when it doesn’t own anything anymore.

Here are the functions which do the deletions:

func (conn *TestDB) DeleteUser(t *testing.T) {
	t.Helper()

	query := fmt.Sprintf(`DROP ROLE IF EXISTS "%s"`, conn.dbUser)

	_, err := conn.rootConn.Exec(query)
	if err != nil {
		t.Error(err)
	}
}

func (conn *TestDB) DeleteDB(t *testing.T) {
	t.Helper()
	query := fmt.Sprintf(`DROP DATABASE IF EXISTS "%s" WITH (FORCE)`, conn.dbName)
	t.Logf("deleting DB %s", conn.dbName)

	_, err := conn.rootConn.Exec(query)
	if err != nil {
		t.Error(err)
	}
}

The IF EXISTS clauses in both queries are required because creating the user or database might itself have failed; in that case the UT fails, but the finalizer still runs. The WITH (FORCE) in the database deletion ensures that the drop doesn’t fail when there are still active connections. This is necessary for cases where a test fails without calling DB.Close().

In use, it looks something like this:

	t.Run("ensure connection works", func(tt *testing.T) {
		dbUri := testhelpers.GetDbUri(tt)

		db, err := CreateDbConnection(dbUri)
		if err != nil {
			tt.Fatal(err)
		}
		defer func() { _ = db.Close() }()

		err = db.Ping()
		if err != nil {
			tt.Error(err)
		}
	})

The GetDbUri function is another helper: it creates a management connection, adds the user and database, and then constructs a DB URI from all of it, to be used by the CreateDbConnection function being tested here.
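Its implementation isn’t shown in the post, but given the pieces above, a plausible sketch could look like this, assuming it lives in the same testhelpers package as the TestDB struct:

func GetDbUri(t *testing.T) string {
	t.Helper()
	// Management connection plus per-test user and database; cleanup
	// happens automatically via the finalizer registered in NewRootDBConn.
	conn := NewRootDBConn(t)
	conn.AddUser(t)
	conn.CreateDB(t)
	return fmt.Sprintf("postgres://%s:%s@%s:%s/%s?sslmode=disable",
		conn.dbUser,
		conn.dbPass,
		conn.dbHost,
		conn.dbPort,
		conn.dbName,
	)
}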

Before moving to the CI setup, let me make a confession: The database test helpers got so complicated that I felt the need to write tests for them. So I’ve got tests for my test helpers now. 🫣

Setting up a Postgres test database in WoodpeckerCI

I’m using WoodpeckerCI for my CI needs, so I’d need a database running there as well. Luckily for me, Woodpecker supports services. When using the Kubernetes backend for Woodpecker, these are Pods running for the entire duration of a pipeline run. My first attempt looked like this:

services:
  - name: ut-db
    image: postgres:18.1
    environment:
      POSTGRES_PASSWORD: 12345
      PGDATA: /var/lib/postgresql/

I was initially quite happy with this. I could see that a separate Pod got started, and it looked like everything besides the initial clone step waited for that service to start up. How nice!

Then I continued staring at the pipeline run, and nothing more happened. The other steps did not start running. When I checked, their Pods were all still stuck in the creation stage, with this error in their events:

Multi-Attach error for volume "pvc-86fab878-feb5-4726-bf77-2bf1de45378f" Volume is already used by pod(s) wp-svc-01kpy0nbkv410kmjtg5x87xgp7-ut-db

Which makes a lot of sense. In hindsight.

You see, my pipeline volumes come out of a Kubernetes StorageClass backed by Ceph RBDs. Those are block devices provided by my storage cluster. My Woodpecker instance uses them via PersistentVolumeClaims, where every pipeline gets a new claim and thus a new volume. But their speed, amongst other things, comes from the fact that they can only be mounted on a single host. Which means, in Kubernetes parlance, that the volumes are ReadWriteOnce.
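For illustration, the claims Woodpecker creates look roughly like this; the name, StorageClass and size here are assumptions:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wp-pipeline-volume   # per-pipeline name, assumed for illustration
spec:
  storageClassName: ceph-rbd # assumed StorageClass name
  accessModes:
    - ReadWriteOnce          # mountable on a single node at a time
  resources:
    requests:
      storage: 10Gi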

This setup had the consequence that the service, which starts relatively soon after the pipeline run gets going, claimed the volume. After that, none of the other steps’ Pods could launch, because they needed the volume too. And because the service runs until the pipeline is done, I had a nice and tidy deadlock.

If you’re wondering why I didn’t just switch to using my Ceph cluster’s CephFS to produce the volumes for my pipelines, the short version is: Performance for OCI container image builds. I do quite a few of those, and I’ve found that running them on CephFS is not very performant at all. See this post if you’d like to know the gory details.

After I had vented my frustration on the Fediverse, and while I was already halfway through Tekton’s getting started docs, @louis@social.louis-vallat.dev had the right idea here:

can you set the node preference to launch both the CI job and its service on the same node?

This was an excellent idea that ended up working without me having to entirely switch my CI system.

The first point, which I did not know before, is that Kubernetes is perfectly happy to mount RWO volumes into multiple Pods, as long as those Pods all run on the same host.

My first thought was to use a completely separate volume for the service, one used only for this service in my Smokeweb app’s CI. The ability to run multiple pipelines in parallel wasn’t required anyway. But sadly, it’s not possible to stop a service from mounting the shared pipeline volume. See this discussion.

So forcing all steps and the service to run on the same host, as Louis proposed, was the only possible solution. But I still don’t really like it. Without this, the steps could be distributed over the entire cluster, instead of all having to run on one host. Sure, due to using an RWO volume, I still wouldn’t be able to run them in parallel, but they’d still be distributed. So I’ll definitely look deeper into Tekton at some point in the future.

But of course this wasn’t the end of the Yak. The way to force all Pods from the pipeline to be scheduled on the same host is to make use of PodAffinity. Note that this is different from NodeAffinity: the former schedules Pods onto hosts where other Pods with certain labels are already running, while the latter schedules Pods onto nodes which carry certain labels. The former is what I wanted, because I didn’t want to force the Pods onto one particular node, just all onto the same node. Luckily, Woodpecker already supports configuring Pod affinity for each step and service, see the docs here. Note that this example configures the affinity for all pipelines on the instance, but the same syntax works on a per-step level.

An example config setting Pod affinity would look like this:

services:
  - name: ut-db
    image: postgres:18.1
    environment:
      POSTGRES_PASSWORD: 12345
      PGDATA: /var/lib/postgresql/
    backend_options:
      kubernetes:
        affinity:
          podAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector: {}
              matchLabelKeys:
                - woodpecker-ci.org/task-uuid
              topologyKey: "kubernetes.io/hostname"

The woodpecker-ci.org/task-uuid is a label set on the Pods created for a pipeline run, so it makes for the ideal label to base affinity on for getting all Pods scheduled on the same machine.

There was just one problem: This format throws an error in Woodpecker:

{"level":"debug",
  "repo":"mmeier/smokes.web",
  "error":"[compiler] max depth reached",
  "time":"2026-04-23T22:21:24Z",
  "caller":"/woodpecker/src/github.com/woodpecker-ci/woodpecker/server/pipeline/create.go:96",
  "message":"failed to parse yaml"
}

I spent a considerable amount of time with a variety of Yaml parsers trying to figure out what might be wrong. But they all said that the Yaml was perfectly fine. After digging through Woodpecker’s code, I finally discovered this Yaml lib, which Woodpecker makes use of. And it contained the error message, in this file. So alright, this was a defense against infinite recursion. I went ahead and created an issue, and I was also able to relatively easily create a UT which showed the problem, see here.

This spelunking around, and especially the fact that I was able to write a UT for the issue, gave me a rather warm and fuzzy feeling. Finally speaking the language of a lot of the projects I’m running in my Homelab feels quite nice.

But I also wanted to get on with my actual goal: writing my database UTs and running them in my CI. So I had a closer look at Woodpecker’s build instructions and figured I could probably put together a version with the xyaml lib adapted to a higher depth limit. I first looked for an explanation of how to temporarily replace a Go dependency with a local fork, and found it in this post.

Armed with that knowledge, I cloned the xyaml lib and adapted the maxDepth constant in this line:

const maxDepth uint8 = 10

I set it to 20. Then I added this line to the go.mod file:

replace codeberg.org/6543/xyaml => /home/me/projects/xyaml

Then I ran the build with these commands:

# Vendor the dependencies, which picks up the local xyaml replacement
make vendor
# Build the frontend assets, which get embedded into the server binary
make build-ui
# Build the server for amd64; the multiarch Dockerfile expects the
# binaries under dist/server/<os>_<arch>
export TARGETOS=linux TARGETARCH=amd64 CGO_ENABLED=0
make build-server
mkdir dist/server/linux_amd64
mv dist/woodpecker-server dist/server/linux_amd64
# Same again for arm64
export TARGETOS=linux TARGETARCH=arm64 CGO_ENABLED=0
make build-server
mkdir dist/server/linux_arm64
mv dist/woodpecker-server dist/server/linux_arm64
# Build and push the multiarch image to my own registry
docker login https://docker-registry.example -p "mypass" -u myuser
docker buildx build --platform linux/amd64,linux/arm64 -t docker-registry.example/homelab/woodpecker-server:deep -f docker/Dockerfile.server.multiarch.rootless --push .

That worked quite nicely, and I was finally able to return to my CI config, which no longer threw the max depth reached error. But then I ran into the next error, this time during Postgres startup. It started appearing after I moved the database directory in the service to /woodpecker/testdb, which places it on the shared pipeline volume:

performing post-bootstrap initialization ... 2026-04-24 14:11:49.647 UTC [36] FATAL:  data directory "/woodpecker/testdb" has invalid permissions

Turns out that the directory indeed had bad permissions, and bad ownership: it belonged to root. I wasn’t sure how to solve this one, but ended up moving the database from a service to a detached step. A detached step behaves similarly to a service, but allows the use of things like step dependencies. I made use of that like this:

  - name: prepare db dir
    image: alpine:3.23.4
    environment:
      PGDATA: *testdb-dir
    backend_options:
      kubernetes:
        <<: *kube-affinity-config
    commands:
      - mkdir $${PGDATA}
      - chmod 0750 $${PGDATA}
  - name: ut-db
    image: *db-image
    detach: true
    environment:
      POSTGRES_PASSWORD: 12345
      PGDATA: *testdb-dir
    backend_options:
      kubernetes:
        <<: *kube-affinity-config
    depends_on:
      - prepare db dir

The kube-affinity-config is a Yaml anchor with the affinity config I’ve already shown above. As you can see, I’m really just creating the directory and setting the right permissions. Then I’m using the depends_on feature to add a proper dependency on the preparation step.
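For completeness: the anchors referenced in these snippets are defined once at the top of the pipeline config, in Woodpecker’s variables section. Roughly like this, with the values taken from the snippets in this post:

variables:
  - &db-image postgres:18.1
  - &testdb-dir /woodpecker/testdb
  - &kube-affinity-config
    affinity:
      podAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector: {}
            matchLabelKeys:
              - woodpecker-ci.org/task-uuid
            topologyKey: "kubernetes.io/hostname"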

One note, because I had that big brain idea: You cannot add a dependency from the UT step onto this ut-db step, because detached steps, similar to services, run until the entire pipeline is through. So making the UT step depend on ut-db results in the UT step waiting indefinitely and the pipeline running into a timeout.

With the above config, I finally had what I wanted: A Postgres database running in parallel with my CI steps. The last thing remaining was to set the database config environment variables correctly in the UT step:

  - name: UTs
    image: *golang-image
    volumes:
      - gocache-volume:/ci-go-cache
    backend_options:
      kubernetes:
        <<: *kube-affinity-config
    environment:
      GOCACHE: *golang-build-cache
      GOMODCACHE: *golang-mod-cache
      DEVDB_HOST: ut-db
      DEVDB_PORT: 5432
    depends_on:
      - prepare mod cache
      - wait for db ready
    commands:
      - make ut
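
The wait for db ready step referenced in depends_on isn’t shown above either. It is presumably a small polling step along these lines; the exact commands are my assumption:

  - name: wait for db ready
    image: *db-image
    backend_options:
      kubernetes:
        <<: *kube-affinity-config
    depends_on:
      - prepare db dir
    commands:
      # pg_isready ships with the postgres image; the detached ut-db step
      # is reachable under its step name.
      - until pg_isready -h ut-db -p 5432 -t 15; do echo "Waiting for DB"; sleep 5; done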

Conclusion

Lots of Yaks needed shaving for this one, but I’m okay-ish with the solution for now. I’m very thankful to Louis for pointing out the solution with forcing all of the pipeline Pods onto the same machine and thus being able to use RWO volumes. Without him, I’d probably be elbow-deep in replacing Woodpecker with Tekton right now.

I think I will still do that. In fact, I’ve already created a task to dig into Tekton soon, to see whether I’d want to use it as my CI system. From a first look, it seems a lot more flexible than Woodpecker. And I think it would be fun to put together a CI system from the building blocks it provides. One feature I already find interesting is reusability. Because the Woodpecker config lives in the individual repos, reuse is not really a thing. But in Tekton, pipelines and their steps are Kubernetes CRDs, from which I can then instantiate different pipelines with different parameters for each of my repos which need CI.

But then I’m also wondering whether this is the correct approach. Perhaps, instead of jumping to another tool, I should see whether I can change Woodpecker to make it work the way I want, which is mostly about parallelism and storage volumes. I’d want to look at the following:

  1. Having the ability to skip attaching the shared pipeline volume to individual steps
  2. Adding the ability to autocreate additional, step-individual pipeline volumes
  3. Having more flexibility when it comes to the StorageClass used for the pipeline volumes, e.g. allowing individual pipelines to use different classes, instead of having one StorageClass defined for the entire Woodpecker instance

The thing is that I speak Golang now. And I happen to be a subject matter expert when it comes to CI systems. I’m starting to think that it would behoove me to give back to the open source community, instead of just switching tools whenever there’s something I don’t like.