Wherein I’m adding Bookwyrm to my Homelab.
I used to read novels. A lot. On school days, I would spend the approximately twenty minutes between the end of my morning routine and having to head off with a novel. Ditto for lazy Sunday evenings. During my service as a conscript, I would always find space for a book in my pack when we went on a training exercise. At University, the most difficult decision while packing for a trip home would be judging how many books I would need to pack to ensure I would not run out.
Getting my first Kindle in 2012 was a revolution. Suddenly, I didn’t need to think very hard anymore - I could take my entire library with me. 🎉
But for the last couple of years, my reading has slowly dwindled. So taking a break from my attempts to set up Tinkerbell, I decided to set up Bookwyrm, the Fediverse alternative to Goodreads.
Which, in hindsight, looks a bit weird: I want to read more novels. So first thing to do is more homelabbing. 😅
Bookwyrm
So, what does Bookwyrm look like? While I called it the Fediverse Goodreads alternative, I never actually used Goodreads, so I wasn’t sure exactly what I was getting myself into.
Here is what my home timeline looks like in Bookwyrm: My Home Timeline
This represents Bookwyrm pretty nicely. Its core function is socializing about books, so all interactions revolve around them. I believe there are private messages which can be sent directly to another user, but there is no generic, Mastodon-like microblogging. In the above example, you can see two of my posts. The top one represents me marking The Three-Body Problem as a book I want to read. The post below it is a boost from my Mastodon account, where I mark False Gods as finished.
On the left of the screenshot is the new post interface, which reinforces what I wrote above: Bookwyrm is all about books. The new post interface is not just a text box I can write anything in, but is instead made up of actions related to the selected book. For my English-speaking readers, the title roughly translates to “Fateful Hour of a Democracy”; it’s a book about the history of the Weimar Republic. That short period in German history that should receive a hell of a lot more emphasis in history lessons than what came before or after it, but sadly doesn’t.
Back to Bookwyrm: I can write a review of the book with a 0-5 star score, post a general comment, or share a quote from the book. So all actions I can take relate to the book itself.
Each book also gets its own page, which looks like this: An example of a book page
Scrolling further down shows the reviews for the book: Bottom of the book page, with a review
What I find a bit sad is that it only shows the related reviews and posts, but the automatically created post about me starting to read the book is nowhere to be found.
Another problem is finding the “instance” of a book. Here is a screenshot of searching for “On Basilisk Station” in Bookwyrm: Bookwyrm book search results
On better federated instances than mine, the book page for the same book looks a bit more lively: The page for the same book as before, but now from books.theunseen.city.
And that’s it for the Bookwyrm tour. I still haven’t dived deeply into it, and I’m currently following only one other person. But I already like it as a way for people to follow what I’m reading. Let’s see what the future holds.
Deploying Bookwyrm on Kubernetes
Let’s get on with the technical part. I of course wanted to deploy Bookwyrm in my Kubernetes cluster. But its official docs are geared towards deployment with docker-compose, and the instructions contain a few “please run this script…” steps which I had to integrate into my setup, so that I wouldn’t have to rely on remembering commands documented somewhere.
But the first step had to be to create a container image, as the Bookwyrm project itself does not supply one.
Image creation
I took the container build instructions from the official Dockerfile and added the image to my CI. In the process, I completely remade my container image build setup, see this post if you’re interested.
The ultimate version of the image build looks like this:
ARG python_ver
FROM python:${python_ver}
ENV PYTHONUNBUFFERED 1
RUN mkdir /app /app/static /app/images
WORKDIR /app
RUN apt-get update && apt-get install -y gettext libgettextpo-dev tidy libsass-dev && apt-get clean
COPY . /app
RUN env SYSTEM_SASS="true" pip install -r requirements.txt --no-cache-dir
I made two important changes compared to the official Dockerfile. First, the official docker-compose deployment just mounts the Bookwyrm source code into the image to make it available. I wanted the image to be self-contained, so instead of only copying the `requirements.txt` file, I copied the entire source code into the `/app` directory.
Another change is the addition of `libsass-dev` to the installed packages, and adding the `SYSTEM_SASS="true"` variable to the `pip` invocation installing the dependencies. I found this to be required because of the arm64 image build. During the amd64 build, a pre-built wheel is available for the `libsass` package. But no wheel seems to be available for arm64, so the C++ libsass library gets built as part of the `pip` invocation. This takes quite a while on a Pi4, especially as the compile appears to use only one core. The builds looked like this:
Image build times for Bookwyrm without the system libsass.
The improved image build times after I started using the system libsass instead of letting pip build it.
But there was one issue remaining: As you can see, I’m copying the Bookwyrm code into the image. But I had to get that code from somewhere first, and I wanted to have it in my Homelab, instead of fetching it from GitHub every time. So I created a mirror on my Forgejo instance. That brought a new question: How to fetch that repo from Forgejo from within a Woodpecker job? I could certainly have made it a public repo and just fetched it, but I figured I would try to do it properly and fetch it with credentials.
But where to get the credentials from? I didn’t want to manually add them to the repo config in Woodpecker, because I figured that Woodpecker already had credentials: it had to fetch the container image repo where I put the Containerfile for the Bookwyrm image. Reading up a bit, I found the environment variable docs for Woodpecker. These contain the `CI_NETRC_USERNAME` and `CI_NETRC_PASSWORD` variables, which are set to the credentials needed to fetch from the git forge configured for the repository in Woodpecker. Note that the docs say this:
Credentials for private repos to be able to clone data. (Only available for specific images)
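For context: a netrc file is the standard mechanism git’s HTTP transport (via curl) uses for non-interactive authentication. What a trusted step gets mounted should look roughly like this - the hostname and credentials here are placeholders, not values from my setup:

```
machine forgejo.example.com
login woodpecker
password <access-token>
```

With such a file in the build user’s home directory, a plain `git clone` against that host authenticates without prompting.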
Sadly, it doesn’t say which images get a netrc file with the credentials mounted. I found more docs here, mentioning trusted clone plugins. I tried to build a small Alpine image with git installed, but still didn’t manage to get the credentials into that image. The error message always read:
fatal: could not read Username for 'https://forgejo.example.com': No such device or address
I then dug through the code to find the check, to see what was wrong with my new Alpine image and why it didn’t get the netrc credentials. I found this function:
func (c *Container) IsTrustedCloneImage(trustedClonePlugins []string) bool {
return c.IsPlugin() && utils.MatchImageDynamic(c.Image, trustedClonePlugins...)
}
Note that it doesn’t just check the image, but also verifies that the step is a plugin, not just an image executing commands. Instead of building a plugin, I decided to try to work with the official clone plugin, which is also used to clone the initial repository for a Woodpecker pipeline run. This ultimately worked, and the step for fetching the Bookwyrm repo mirror from my Forgejo looks like this:
- name: clone bookwyrm repo
image: woodpeckerci/plugin-git
settings:
depth: 1
tags: true
branch: production
partial: false
remote: https://forgejo.example.com/mirrors/bookwyrm.git
ref: 'v0.7.5'
path: /woodpecker/bookwyrm
Note that the `/mirrors/` part of the URL is not necessary for using a repo as a mirror; I just put my Forgejo mirrors into a group called `mirrors`.
And with this, I ended up with the Bookwyrm repo, checked out at the tag `v0.7.5`, in `/woodpecker/bookwyrm` for the rest of the pipeline steps.
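Just to sketch how that checkout can be consumed afterwards (this step is illustrative - the plugin choice, image repo and tag are my assumptions, not a copy of my actual pipeline), a follow-up step can build the image directly from that path, for example with Woodpecker’s buildx plugin:

```yaml
# Illustrative follow-up step; repo, tag and Containerfile location
# are placeholders for whatever the pipeline actually uses.
- name: build bookwyrm image
  image: woodpeckerci/plugin-docker-buildx
  settings:
    repo: harbor.example.com/homelab/bookwyrm
    tags: v0.7.5
    dockerfile: Containerfile
    context: /woodpecker/bookwyrm
```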
Getting to the point of having the Bookwyrm image was quite a ride, but now it’s time for the actual Kubernetes deployment.
Kubernetes deployment
When it comes to dependencies, Bookwyrm requires a Postgres DB and Redis, plus it supports an S3 bucket for media and other static assets. I will not go into detail on those dependencies. If you’re curious about how I’m setting them up in my Homelab, here are the two relevant posts:
When looking at Bookwyrm’s setup docs, it requires executing a script during initial deployment.
Initialize the database by running ./bw-dev migrate
And:
Initialize the application with ./bw-dev setup, and copy the admin code to use when you create your admin account.
So I needed to somehow integrate that into my setup. Looking at the bw-dev script, it became pretty clear that Bookwyrm is really geared towards a docker-compose deployment. The script is intended to be run outside of the Bookwyrm container, as indicated by the fact that it calls docker-compose to achieve things:
[...]
function runweb {
$DOCKER_COMPOSE run --rm web "$@"
}
[...]
function initdb {
runweb python manage.py initdb "$@"
}
function migrate {
runweb python manage.py migrate "$@"
}
function admin_code {
runweb python manage.py admin_code
}
This of course won’t work in a Kubernetes deployment. To work around it, I wrote my own script, using the `manage.py` commands directly, without going through the `bw-dev` script. It ended up looking like this:
apiVersion: v1
kind: ConfigMap
metadata:
name: bookwyrm-script
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
data:
bookwyrm.sh: |
#! /bin/bash
migrate() {
python manage.py migrate "$@" || return 1
}
initdb() {
python manage.py initdb "$@" || return 1
}
init() {
echo "Running init function..."
migrate || return 1
migrate "django_celery_beat" || return 1
initdb || return 1
python manage.py compile_themes || return 1
python manage.py collectstatic --no-input || return 1
python manage.py admin_code || return 1
return 0
}
update() {
echo "Running update function..."
migrate || return 1
python manage.py compile_themes || return 1
python manage.py collectstatic --no-input || return 1
return 0
}
op="${1}"
if [[ "${op}" == "init" ]]; then
init || exit 1
elif [[ "${op}" == "update" ]]; then
update || exit 1
else
echo "Unknown operation ${op}, aborting."
exit 1
fi
exit 0
This script supports two operations: the first-deployment initialization when running `bookwyrm.sh init`, and the migrations possibly required during updates, with `bookwyrm.sh update`.
Next question: how to run the script? For that, I looked into Helm chart hooks. These are annotations put on a template in a Helm chart which make Helm instantiate that template only under certain circumstances. Hooks are available for all phases of the Helm chart lifecycle, from install through upgrade to delete.
I sadly couldn’t make use of the `post-install` hook for the `init` part of the Bookwyrm script, because I had already installed the chart: it also contains the CloudNativePG and S3 bucket templates, and that part was already deployed. So for the init step, I opted for a simple workaround. The Job’s manifest looks like this:
{{- if .Values.runInit }}
apiVersion: batch/v1
kind: Job
metadata:
name: bookwyrm-init
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
spec:
template:
metadata:
name: bookwyrm-init
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
spec:
restartPolicy: Never
containers:
- name: init-script
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["bash"]
args:
- /hl/bookwyrm.sh
- init
volumeMounts:
- name: bookwyrm-script
mountPath: /hl
readOnly: true
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
volumes:
- name: bookwyrm-script
configMap:
name: bookwyrm-script
{{- end }}
So it only gets created when the value `runInit` is `true` in the `values.yaml` file.
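For completeness, here is how I’d express that toggle - the default shown is an assumption about sensible usage, not a quote from my chart:

```yaml
# values.yaml (sketch): keep the init Job out of the rendered manifests
# by default; only enable it for the very first installation.
runInit: false
```

The first install can then enable it once, e.g. with `helm install bookwyrm ./chart --set runInit=true`, and every later sync renders the chart without the Job.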
But for the update Job, which runs the DB migrations and regenerates static assets, I was able to use the `pre-upgrade` hook. The manifest looks like this:
apiVersion: batch/v1
kind: Job
metadata:
name: bookwyrm-update
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
annotations:
"helm.sh/hook": pre-upgrade
spec:
template:
metadata:
name: bookwyrm-update
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
spec:
restartPolicy: Never
containers:
- name: update-script
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["bash"]
args:
- /hl/bookwyrm.sh
- update
volumeMounts:
- name: bookwyrm-script
mountPath: /hl
readOnly: true
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
volumes:
- name: bookwyrm-script
configMap:
name: bookwyrm-script
Note especially this part:
metadata:
annotations:
"helm.sh/hook": pre-upgrade
That is what marks the Job as a hook to be run before anything else is updated.
The upgrade hook has one unfortunate semantic though: it is launched whenever the Helm chart is updated, not just when the Bookwyrm version is incremented. That means any change to the chart, even just an added label, executes the Job. And it executes during the `helm upgrade` run, before anything else. So you run `helm upgrade`, and Helm won’t return immediately: it waits for the hook to finish running, and only then updates all of the other manifests, where necessary. These Helm runs will take a bit longer.
But that still seems a relatively small price compared to having the instructions sit on a documentation page I need to remember to execute whenever Bookwyrm is updated.
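One related Helm detail: Kubernetes Jobs are largely immutable, so a hook Job with a fixed name can only be re-created because Helm deletes the previous one first. This is controlled by the `helm.sh/hook-delete-policy` annotation, which since Helm 3 defaults to `before-hook-creation`. Making it explicit (a sketch, not part of my manifest above) would look like this:

```yaml
metadata:
  annotations:
    "helm.sh/hook": pre-upgrade
    # Helm 3's default: delete the old hook Job before creating the new
    # one. "hook-succeeded" would instead remove it right after it
    # finishes, keeping completed Jobs out of the cluster.
    "helm.sh/hook-delete-policy": before-hook-creation
```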
Here is some of the output of my run of the Bookwyrm initialization:
Running init function...
Operations to perform:
Apply all migrations: admin, auth, bookwyrm, contenttypes, django_celery_beat, oauth2_provider, sessions
Running migrations:
Applying contenttypes.0001_initial... OK
Applying contenttypes.0002_remove_content_type_name... OK
Applying auth.0001_initial... OK
Applying auth.0002_alter_permission_name_max_length... OK
Applying auth.0003_alter_user_email_max_length... OK
Applying auth.0004_alter_user_username_opts... OK
[...]
Operations to perform:
Apply all migrations: django_celery_beat
Running migrations:
No migrations to apply.
Your models in app(s): 'bookwyrm' have changes that are not yet reflected in a migration, and so won't be applied.
Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.
Compiled SASS/SCSS file: '/app/bookwyrm/static/css/themes/bookwyrm-dark.scss'
Compiled SASS/SCSS file: '/app/bookwyrm/static/css/themes/bookwyrm-light.scss'
257 static files copied.
*******************************************
Use this code to create your admin account:
1234-56-78-910-111213
*******************************************
Especially the last part is important, as that code is needed to create the initial admin account.
With that done, I was finally ready to write the Deployment. For that, I took the official docker-compose file as a blueprint:
services:
nginx:
image: nginx:1.25.2
restart: unless-stopped
ports:
- "1333:80"
depends_on:
- web
networks:
- main
volumes:
- ./nginx:/etc/nginx/conf.d
- static_volume:/app/static
- media_volume:/app/images
db:
image: postgres:13
env_file: .env
volumes:
- pgdata:/var/lib/postgresql/data
networks:
- main
web:
build: .
env_file: .env
command: python manage.py runserver 0.0.0.0:8000
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- db
- celery_worker
- redis_activity
networks:
- main
ports:
- "8000"
redis_activity:
image: redis:7.2.1
command: redis-server --requirepass ${REDIS_ACTIVITY_PASSWORD} --appendonly yes --port ${REDIS_ACTIVITY_PORT}
volumes:
- ./redis.conf:/etc/redis/redis.conf
- redis_activity_data:/data
env_file: .env
networks:
- main
restart: on-failure
redis_broker:
image: redis:7.2.1
command: redis-server --requirepass ${REDIS_BROKER_PASSWORD} --appendonly yes --port ${REDIS_BROKER_PORT}
volumes:
- ./redis.conf:/etc/redis/redis.conf
- redis_broker_data:/data
env_file: .env
networks:
- main
restart: on-failure
celery_worker:
env_file: .env
build: .
networks:
- main
command: celery -A celerywyrm worker -l info -Q high_priority,medium_priority,low_priority,streams,images,suggested_users,email,connectors,lists,inbox,imports,import_triggered,broadcast,misc
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- db
- redis_broker
restart: on-failure
celery_beat:
env_file: .env
build: .
networks:
- main
command: celery -A celerywyrm beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- celery_worker
restart: on-failure
flower:
build: .
command: celery -A celerywyrm flower --basic_auth=${FLOWER_USER}:${FLOWER_PASSWORD} --url_prefix=flower
env_file: .env
volumes:
- .:/app
- static_volume:/app/static
networks:
- main
depends_on:
- db
- redis_broker
restart: on-failure
dev-tools:
build: dev-tools
env_file: .env
volumes:
- /app/dev-tools/
- .:/app
profiles:
- tools
volumes:
pgdata:
static_volume:
media_volume:
exports_volume:
redis_broker_data:
redis_activity_data:
networks:
main:
It’s a pretty long one, so let’s go through it piece by piece. I skipped the Nginx deployment entirely, as I’m using Bookwyrm’s S3 support for static assets and images, and with that, Nginx doesn’t seem to be necessary. For the same reason, I also don’t have any volumes for `/app/static` and `/app/images`. I initially had volumes there, as the docs were not 100% clear on whether those directories might still be used even with S3, but after a couple of days of running Bookwyrm, I found them still empty and removed the volumes. I also ignored the `dev-tools` service, as it seemed unnecessary, and skipped the `redis_activity`, `redis_broker` and `db` services, as I had already created those with CloudNativePG and my existing Redis instance.
That left me with the following services to run:
services:
web:
build: .
env_file: .env
command: python manage.py runserver 0.0.0.0:8000
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- db
- celery_worker
- redis_activity
networks:
- main
ports:
- "8000"
celery_worker:
env_file: .env
build: .
networks:
- main
command: celery -A celerywyrm worker -l info -Q high_priority,medium_priority,low_priority,streams,images,suggested_users,email,connectors,lists,inbox,imports,import_triggered,broadcast,misc
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- db
- redis_broker
restart: on-failure
celery_beat:
env_file: .env
build: .
networks:
- main
command: celery -A celerywyrm beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler
volumes:
- .:/app
- static_volume:/app/static
- media_volume:/app/images
- exports_volume:/app/exports
depends_on:
- celery_worker
restart: on-failure
flower:
build: .
command: celery -A celerywyrm flower --basic_auth=${FLOWER_USER}:${FLOWER_PASSWORD} --url_prefix=flower
env_file: .env
volumes:
- .:/app
- static_volume:/app/static
networks:
- main
depends_on:
- db
- redis_broker
restart: on-failure
networks:
main:
One thing to note is that they all use the same `.env` file; Bookwyrm’s stack is mostly configured via environment variables, which I applaud. So, to avoid copying the env block for each container, I added this section to my `values.yaml` file:
env:
- name: POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: DEBUG
value: "false"
- name: ALLOWED_HOSTS
value: "bookwyrm.example.com,localhost,$(POD_IP)"
- name: SECRET_KEY
valueFrom:
secretKeyRef:
name: secret-key
key: key
- name: DOMAIN
value: "bookwyrm.example.com"
- name: USE_HTTPS
value: "true"
- name: PGPORT
valueFrom:
secretKeyRef:
name: bookwyrm-pg-cluster-app
key: port
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: bookwyrm-pg-cluster-app
key: password
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: bookwyrm-pg-cluster-app
key: user
- name: POSTGRES_DB
valueFrom:
secretKeyRef:
name: bookwyrm-pg-cluster-app
key: dbname
- name: POSTGRES_HOST
valueFrom:
secretKeyRef:
name: bookwyrm-pg-cluster-app
key: host
- name: REDIS_ACTIVITY_URL
value: "redis://redis.redis.svc.cluster.local:6379/0"
- name: REDIS_BROKER_URL
value: "redis://redis.redis.svc.cluster.local:6379/1"
- name: FLOWER_USER
valueFrom:
secretKeyRef:
name: flower
key: user
- name: FLOWER_PASSWORD
valueFrom:
secretKeyRef:
name: flower
key: pw
- name: FLOWER_BASIC_AUTH
value: "$(FLOWER_USER):$(FLOWER_PASSWORD)"
- name: FLOWER_PORT
value: "8888"
- name: EMAIL_HOST
value: "mail.example.com"
- name: EMAIL_PORT
value: "465"
- name: EMAIL_HOST_USER
value: "bookwyrm@example.com"
- name: EMAIL_HOST_PASSWORD
valueFrom:
secretKeyRef:
name: mail-pw
key: pw
- name: EMAIL_SENDER_NAME
value: "bookwyrm"
- name: EMAIL_SENDER_DOMAIN
value: "example.com"
- name: USE_S3
value: "true"
- name: AWS_ACCESS_KEY_ID
valueFrom:
secretKeyRef:
name: bookwyrm-bucket
key: AWS_ACCESS_KEY_ID
- name: AWS_SECRET_ACCESS_KEY
valueFrom:
secretKeyRef:
name: bookwyrm-bucket
key: AWS_SECRET_ACCESS_KEY
- name: AWS_STORAGE_BUCKET_NAME
valueFrom:
configMapKeyRef:
name: bookwyrm-bucket
key: BUCKET_NAME
- name: AWS_S3_CUSTOM_DOMAIN
value: "s3-bookwyrm.example.com"
- name: AWS_S3_ENDPOINT_URL
value: "http://rook-ceph-rgw-rgw-bulk.rook-cluster.svc"
- name: ENABLE_THUMBNAIL_GENERATION
value: "true"
I won’t go through all of the options, but there are a few I would like to highlight.
First, the `POD_IP` setting is important for Kubernetes probes to work. They will by default access the pod via its IP, and that IP needs to be specifically allowed for Django apps. I’ve had a similar issue with Paperless-ngx before, which is also a Django app.
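To illustrate the mechanism (a simplified sketch for illustration only - this is not Django’s actual implementation): Django rejects any request whose Host header doesn’t match an `ALLOWED_HOSTS` entry, and the kubelet’s HTTP probes use the pod IP as the host, which is why `$(POD_IP)` has to be in the list.

```shell
# Simplified sketch of an ALLOWED_HOSTS-style check: a host is accepted
# only if it appears verbatim in the comma-separated allow list.
is_allowed() {
  host="$1"; allowed="$2"
  case ",${allowed}," in
    *",${host},"*) echo "allowed" ;;
    *)             echo "rejected" ;;
  esac
}

# The kubelet probes the pod via its IP, so that IP must be listed:
is_allowed "10.42.0.17"  "bookwyrm.example.com,localhost,10.42.0.17"   # allowed
is_allowed "10.42.99.99" "bookwyrm.example.com,localhost,10.42.0.17"   # rejected
```

Django additionally supports wildcard patterns like `.example.com`; this sketch only checks exact matches.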
Another one is the flower auth:
- name: FLOWER_USER
valueFrom:
secretKeyRef:
name: flower
key: user
- name: FLOWER_PASSWORD
valueFrom:
secretKeyRef:
name: flower
key: pw
- name: FLOWER_BASIC_AUTH
value: "$(FLOWER_USER):$(FLOWER_PASSWORD)"
In the docker-compose example from Bookwyrm, the credentials are provided on the command line:
flower:
build: .
command: celery -A celerywyrm flower --basic_auth=${FLOWER_USER}:${FLOWER_PASSWORD} --url_prefix=flower
env_file: .env
volumes:
- .:/app
- static_volume:/app/static
networks:
- main
depends_on:
- db
- redis_broker
restart: on-failure
I was never really able to get this working: for reasons I’m unsure about, but which probably have something to do with string escaping, I was not able to log in with my credentials. So I moved them into the `FLOWER_BASIC_AUTH` environment variable, at which point they immediately started working.
With all of that out of the way, here is the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: bookwyrm
labels:
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
spec:
replicas: 1
selector:
matchLabels:
homelab/app: bookwyrm
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
strategy:
type: "Recreate"
template:
metadata:
labels:
homelab/app: bookwyrm
{{- range $label, $value := .Values.commonLabels }}
{{ $label }}: {{ $value | quote }}
{{- end }}
spec:
automountServiceAccountToken: false
securityContext:
fsGroup: 1000
containers:
- name: bookwyrm-web
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["python"]
args:
- "manage.py"
- "runserver"
- "0.0.0.0:{{ .Values.ports.web }}"
resources:
requests:
cpu: 200m
memory: 500Mi
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
livenessProbe:
httpGet:
port: {{ .Values.ports.web }}
path: "/"
initialDelaySeconds: 15
periodSeconds: 30
ports:
- name: bookwyrm-http
containerPort: {{ .Values.ports.web }}
protocol: TCP
- name: bookwyrm-celery-worker
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["celery"]
args:
- "-A"
- "celerywyrm"
- "worker"
- "-l"
- "info"
- "-Q"
- "high_priority,medium_priority,low_priority,streams,images,suggested_users,email,connectors,lists,inbox,imports,import_triggered,broadcast,misc"
resources:
requests:
cpu: 200m
memory: 200Mi
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
- name: bookwyrm-celery-beat
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["celery"]
args:
- "-A"
- "celerywyrm"
- "beat"
- "-l"
- "INFO"
- "--scheduler"
- "django_celery_beat.schedulers:DatabaseScheduler"
resources:
requests:
cpu: 200m
memory: 200Mi
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
- name: bookwyrm-flower
image: harbor.example.com/homelab/bookwyrm:{{ .Values.appVersion }}
command: ["celery"]
args:
- "-A"
- "celerywyrm"
- "flower"
- "--url_prefix=flower"
resources:
requests:
cpu: 200m
memory: 200Mi
{{- with .Values.env }}
env:
{{- toYaml . | nindent 11 }}
{{- end }}
ports:
- name: flower-http
containerPort: {{ .Values.ports.flower }}
protocol: TCP
Only one comment on the above: take the resource requests with a grain of salt. I haven’t gotten around to looking at the metrics for the first week of deployment, so these values are still the semi-random ones I drew out of a hat while writing the manifest.
At this point, I thought I was done. But that would have been too easy.
The power of CSS
The reason I was sure I wasn’t done yet is that the home page of Bookwyrm looked like this when I first opened it: There’s clearly something wrong.
Obviously, that’s not what it’s supposed to look like. Those of you a bit more familiar with webdev than I am will likely have immediately seen that there’s some problem with the CSS, but to me it was not quite that clear. A look into the browser console, with its messages about a file not being found, led me to the same conclusion. I saw the following when viewing the page source:
<link href="https://s3-bookwyrm.mei-home.net/css/themes/bookwyrm-light.css" rel="stylesheet" type="text/css" />
But when looking at the S3 bucket, I saw that the file was at `/static/...`.
Searching a bit, I found this bug.
It was already fixed in the newest release, `v0.7.5`, but I had started out with `v0.7.4`, as I wanted to have a chance to test my upgrade hook/script right away. After updating to `v0.7.5`, I at least got some proper styling, but it still looked like some things were missing: Finally styled, but still with some font glyphs clearly missing.
Note especially the missing glyphs for the symbols above “Dezentral”, “Freundlich” and “Nichtkommerziell” (“decentralized”, “friendly” and “non-commercial”). And please forgive the partial German; I hadn’t noticed the language mix when taking the screenshot.
Looking at the browser console again, I saw this error message.
Checking a bit further, I found that I had missed a part of Bookwyrm’s S3 setup docs.
I followed these docs from Hetzner to apply the necessary CORS config to my S3 bucket. I couldn’t directly apply the JSON config provided in the Bookwyrm docs, because `s3cmd`, my default S3 tool, doesn’t support JSON for the CORS config, only XML. So I translated it to this:
<CORSConfiguration>
<CORSRule>
<AllowedHeader>*</AllowedHeader>
<AllowedMethod>GET</AllowedMethod>
<AllowedMethod>HEAD</AllowedMethod>
<AllowedMethod>POST</AllowedMethod>
<AllowedMethod>PUT</AllowedMethod>
<AllowedMethod>DELETE</AllowedMethod>
<MaxAgeSeconds>3000</MaxAgeSeconds>
<ExposeHeader>Etag</ExposeHeader>
<AllowedOrigin>https://bookwyrm.example.com</AllowedOrigin>
</CORSRule>
</CORSConfiguration>
I stored the above XML config in a `cors.xml` file and applied it to my Bookwyrm bucket with this command:
s3cmd -c s3-conf setcors cors.xml s3://bookwyrm/
Here, `s3-conf` is the s3cmd config file for my Ceph S3 setup.
And after that, I was finally done: Bookwyrm looked like it was supposed to! 🎉
Initial network sync?
After I had finally set up my instance, I started to enter a few books, mostly
for testing purposes. Which was when I realized that I could hear a lot of disk
activity. And looking at my metrics, I found that the Bookwyrm container was
using a lot of CPU: CPU utilization of the bookwyrm-celery-worker container.
Looking around a bit more, I also found that there were a lot of new objects
created in my S3 pool on Ceph: Object changes in the S3 data pool, negative values are removed objects, positive values are numbers of added objects.
So it seemed that something was going on with Bookwyrm, but I had no idea what it might be. Checking the S3 bucket, I saw a lot more book covers appearing in there. But I hadn’t even done much at that point, just added a handful of books, so I was flailing a little bit, searching for what it might be. Then I had the idea of looking at flower, which the Bookwyrm docs advertise as a way to look at ongoing tasks.
This was the picture presented to me at the time: List of tasks in Flower
Noteworthy is that most of the tasks are related to `Work` objects, which, if I’m not mistaken, are books in Bookwyrm’s object model. So there seemed to be a lot of things being done with a lot of books, while I had only added two or three books myself at that point, and hadn’t followed a single person yet. Also note that the tasks all started in the same minute, 19:39. And it went on and on like this.
Then I saw that there’s a link to my instance in the `args` column, and I clicked one of the tasks to get to its details page: Example of task details.
I then checked which book the https://bookwyrm.mei-home.net/book/16858 URL shown in the `args` value points to: This was the book the flower task related to.
The thing is: I hadn’t interacted with that book at all. So I tried a few more books from other flower tasks, and they were the same: books I had not interacted with. The only conclusion I can draw for now is that Bookwyrm looks at all known instances, downloads their entire database of books, and adds them to my instance?
If you actually know what’s going on here, please contact me at my Mastodon account and tell me. I’m genuinely curious.
Final thoughts
I’m really curious what that initial database sync (?) was for.
The Bookwyrm setup also holds one last challenge: Resisting the temptation of entering all the books I’ve read in the last 32 years. 😅
Last but not least, if you’d like to follow my reading, I’m https://bookwyrm.mei-home.net/user/mmeier.