Wherein I migrate my Mastodon instance to the k8s cluster.
This is part 21 of my k8s migration series.
Mastodon is currently serving as my presence in the Fediverse. You can find me here, although I’m pretty sure that most of my readers are coming from there already. 😄
If you’re at all interested in joining a genuine community around Homelabbing, I can only recommend joining the fun by following the HomeLab or SelfHosted hashtags and wildly following everyone who appears on there. It’s a great community of rather friendly people enjoying everything from a lonely Pi to several 42U 19" racks full of equipment. If you’re interested in learning more about my own experience with the Fediverse and hosting my own single-user instance, have a look at these older posts.
Preparations
There were two things which needed to be migrated from my Nomad cluster to the k8s deployment: The S3 bucket holding all of the media, and the database.
The database is, by a very large margin, the biggest in my Homelab, clocking in at 2.5 GB. I think it could be a lot smaller, but I completely disabled cleanups for remote posts a while ago. That was because the automated cleanup also deletes posts I had bookmarked for reading later, and I’m not very good at actually keeping up with those - when I went through my bookmarks at some point, I was pretty convinced that some of the older ones had already been cleaned away. I will likely do some manual cleanups once the database really becomes too big to be manageable.
I will not describe the entire migration process here, because it is similar to previous migrations. If you’re interested, have a look at my post about the Gitea migration, where I describe the database migration with CNPG in detail. In short, it was very painless. I provided the database with a 15 GB volume, which seems a bit overboard in hindsight. At some point in the future I will have to figure out how to do database sizing and go through all of my CNPG clusters, because I’m pretty sure most of them are overprovisioned.
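For reference, the CNPG side of this boils down to a single Cluster resource with that (over-provisioned) volume attached. Here is a minimal sketch: the cluster name matches the mastodon-pg-cluster-rw service referenced in the values.yaml further down, while the instance count and the bootstrap block are assumptions, and the actual data import is the part described in the Gitea post:

# Sketch of the CNPG cluster backing Mastodon; instance count and bootstrap
# details are assumptions, the storage size is the 15 GB mentioned above.
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: mastodon-pg-cluster
spec:
  instances: 2
  storage:
    size: 15Gi
  bootstrap:
    initdb:
      database: mastodon
      owner: mastodon

CNPG then automatically creates the mastodon-pg-cluster-rw service and the mastodon-pg-cluster-app Secret used in the chart values below.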
Next came the S3 bucket. The first mistake I made here was forgetting to exclude the cache/ prefix, so I copied all of the currently cached media over instead of just letting Mastodon re-fetch whatever it actually needed. That prefix currently holds 56 GB out of 61 GB total, which reminds me that I need to check whether the automatic cleanup is working on the k8s setup or not.
But yeah, had I remembered to exclude that prefix, I could have saved a lot of time on the copy operation. As it stands, these are the stats for the copy, which I did with rclone:
Transferred: 61.786 GiB / 61.786 GiB, 100%, 6.279 MiB/s, ETA 0s
Transferred: 384921 / 384921, 100%
Elapsed time: 3h7m29.8s
Those 6.279 MiB/s are utterly abysmal. Those of you who read my previous post on my media library copy operation probably already know: It was the 4 TB Seagate HDD, which was fully slammed again. There’s definitely something bad about this disk. But anyway, three hours later I was done and had everything copied over.
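For future reference, skipping the cache would have been a single rclone flag. A sketch of what the copy should have looked like, with hypothetical remote names and only the bucket name taken from my actual setup:

# "old-s3" and "rook-s3" are placeholder remote names for the old and new S3 endpoints
rclone copy old-s3:masto-media rook-s3:masto-media \
  --exclude "cache/**" \
  --progress

That would have left only the roughly 5 GB of original media, which is all the new instance actually needs.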
Before I close the preparations, let’s have some fun and look at the CPU usage of the FluentD container in my k8s cluster:

[Screenshot: CPU usage of my FluentD log aggregation container]
[Screenshot: Log rate of my Traefik ingress container]
The Mastodon setup
I deployed my Mastodon instance with the official Mastodon chart. One important note: this chart is, at some point in the future, going to be replaced by a new one; see the relevant issue.
I won’t go through every single option I set, but there were a couple of things which tripped me up.
The first and perhaps most important one: the default appVersion of the current chart is 4.2.17, but I was already on 4.3.3. The main issue I encountered due to this version discrepancy is the split of the Mastodon container into two images, one for the streaming component and one for everything else, which happened with 4.3. To fix this, I had to explicitly set the streaming image in the values.yaml:
mastodon:
  streaming:
    image:
      repository: "ghcr.io/mastodon/mastodon-streaming"
With that, the chart seems to work for 4.3.3 and 4.3.4 without issues.
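On a related note, this should also be the place to pin the image version explicitly instead of relying on the chart’s appVersion. I haven’t verified the exact keys, so treat the following as an assumption rather than documented chart values:

# Assumed keys for pinning the version; the streaming block mirrors the override above.
image:
  repository: "ghcr.io/mastodon/mastodon"
  tag: "v4.3.4"
mastodon:
  streaming:
    image:
      repository: "ghcr.io/mastodon/mastodon-streaming"
      tag: "v4.3.4"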
Then there’s the Redis configuration. I’ve got a central Redis instance in my cluster, instead of running one for every app. The chart supports this, but unless I’ve overlooked something, it requires the Redis instance to have a password, which mine does not. This shows up in the fact that the mastodon-redis Secret is unconditionally added to each container’s env, for example in the mastodon-web deployment from here:
- name: "REDIS_PASSWORD"
  valueFrom:
    secretKeyRef:
      name: {{ template "mastodon.redis.secretName" . }}
      key: redis-password
There’s no condition around that checking whether Redis is actually configured with a password. I also tried to just set an empty password in redis.auth.password, but in that case the Secret is not created by the chart at all, and my containers were left in CreateContainerConfigError state because of the missing Secret. The only way around this I found was to create a dummy Secret with an empty data.redis-password:
apiVersion: v1
kind: Secret
metadata:
  name: masto-redis-mock
  labels:
    homelab/part-of: mastodon
type: Opaque
data:
  redis-password: ""
And then using that Secret in the Helm chart:
redis:
  auth:
    existingSecret: "masto-redis-mock"
With that, the Redis password env variable is set, but to an empty value, which seems to make Mastodon use Redis properly, without adding a password of any kind to the connection string.
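A quick way to confirm that the variable really ends up empty is to look at the environment of a running web Pod, something along these lines (the namespace is whatever you deployed into):

# Prints an empty line if REDIS_PASSWORD is set but empty
kubectl -n mastodon exec deploy/mastodon-web -- printenv REDIS_PASSWORD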
The next noteworthy option was the mastodon.trusted_proxy_ip variable. It needs to contain the source IP of my Traefik ingress, but that Pod doesn’t have a fixed IP, so I added the Pod CIDR instead:
mastodon:
  trusted_proxy_ip: "300.300.300.1,127.0.0.1,10.8.0.0/16"
Without this setting, I got the following error in the mastodon-web logs:
[05332434-d3d6-40b1-950d-ae73da0d4967] ActionDispatch::RemoteIp::IpSpoofAttackError (IP spoofing attack?! client 10.8.4.103 is not a trusted proxy HTTP_CLIENT_IP=nil HTTP_X_FORWARDED_FOR="67.241.47.40, 10.86.10.10")
I also decided to switch off the CronJob for media removal:
mastodon:
  cron:
    removeMedia:
      enabled: false
This is because I recently spent quite some time working on Mastodon’s internal media cleanup process. From what I can see, this CronJob uses the tootctl CLI, specifically the tootctl media remove command. I actually like that better than the internal Mastodon cleanup, because back when I looked at it, tootctl behaved a lot better, making separate DELETE requests. But the one thing which keeps me from using the CronJob is that I can’t configure the retention periods. I might still use it later and just live with the defaults.
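Should the defaults ever bother me too much, the same command could also run from a plain CronJob outside the chart, with an explicit retention period. A rough sketch, assuming the upstream image and a 30-day retention; the container would of course still need the same env and Secrets as the other Mastodon Pods, which I’ve left out here:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: mastodon-media-remove
  labels:
    homelab/part-of: mastodon
spec:
  schedule: "0 3 * * *"   # nightly at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: media-remove
              image: "ghcr.io/mastodon/mastodon:v4.3.4"
              # --days sets how long cached remote media is kept around
              command: ["bin/tootctl", "media", "remove", "--days=30"]
              # env/envFrom with the DB, Redis and S3 configuration omitted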
And that’s really all I have to say. For completeness’ sake, here is the full values.yaml content:
mastodon:
  labels:
    homelab/part-of: mastodon
  createAdmin:
    enabled: false
  cron:
    removeMedia:
      enabled: false
  local_domain: "social.mei-home.net"
  trusted_proxy_ip: "300.300.300.1,127.0.0.1,10.8.0.0/16"
  singleUserMode: true
  authorizedFetch: false
  limitedFederationMode: false
  s3:
    enabled: true
    existingSecret: "mastodon-bucket"
    bucket: masto-media
    endpoint: "http://rook.service:80"
    alias_host: "s3-mastodon.mei-home.net"
  deepl:
    enabled: false
  hcaptcha:
    enabled: false
  secrets:
    existingSecret: "mastodon-secrets"
  sidekiq:
    resources:
      limits:
        memory: 1024Mi
      requests:
        cpu: 400m
  smtp:
    auth_method: "plain"
    from_address: "Meiers Mastodon <mastodon@mei-home.net>"
    openssl_verify_mode: "peer"
    port: "465"
    server: "mail.example.com"
    tls: false
    existingSecret: "mastodon-mail"
  streaming:
    image:
      repository: "ghcr.io/mastodon/mastodon-streaming"
    resources:
      requests:
        cpu: 500m
      limits:
        memory: 2000Mi
  web:
    resources:
      requests:
        cpu: 500m
      limits:
        memory: 1000Mi
  cacheBuster:
    enabled: false
  metrics:
    statsd:
      exporter:
        enabled: false
  otel:
    enabled: false
  extraEnvVars:
    SMTP_SSL: true
    OIDC_CLIENT_ID: "mastodon"
    OIDC_DISPLAY_NAME: "Login with Keycloak"
    OIDC_ISSUER: "https://login.example.com/realms/example"
    OIDC_DISCOVERY: true
    OIDC_SCOPE: "openid,profile,email"
    OIDC_UID_FIELD: "preferred_username"
    OIDC_REDIRECT_URI: "https://social.mei-home.net/auth/auth/openid_connect/callback"
    OIDC_SECURITY_ASSUME_EMAIL_IS_VERIFIED: true
    OIDC_END_SESSION_ENDPOINT: "https://login.example.com/realms/example/protocol/openid-connect/logout"
    OIDC_ENABLED: true
    OMNIAUTH_ONLY: true
    RAILS_SERVE_STATIC_FILES: true
    S3_BATCH_DELETE_LIMIT: 1
    S3_READ_TIMEOUT: 60
    S3_BATCH_DELETE_RETRY: 10
    ALLOWED_PRIVATE_ADDRESSES: "300.300.300.1"
ingress:
  enabled: true
  annotations:
    external-dns.alpha.kubernetes.io/controller: "none"
  hosts:
    - host: social.mei-home.net
      paths:
        - path: "/"
  tls: null
  streaming:
    enabled: false
elasticsearch:
  enabled: false
postgresql:
  enabled: false
  postgresqlHostname: "mastodon-pg-cluster-rw"
  postgresqlPort: "5432"
  auth:
    database: "mastodon"
    username: "mastodon"
    existingSecret: "mastodon-pg-cluster-app"
redis:
  enabled: false
  hostname: "redis.example"
  port: "6379"
  auth:
    existingSecret: "masto-redis-mock"
  sidekiq:
    enabled: false
  cache:
    enabled: false
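For completeness: deploying with these values is then just the usual Helm workflow. A sketch, assuming a local checkout of the chart repository and a dedicated namespace:

# From a checkout of https://github.com/mastodon/chart
helm dependency update
helm upgrade --install mastodon . -f values.yaml --namespace mastodon --create-namespace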
Conclusion
To be honest, somewhere during that Sunday I started to think that kicking off the Mastodon migration on a Sunday morning might have been a mistake, but in the end it worked out well enough.
Now there are only a few services left to migrate over, chief amongst them my Keycloak instance. Let’s see whether I might even be able to clean out the entire Nomad cluster this weekend. There’s definitely a light at the end of the migration tunnel. I guess this weekend will show whether it’s a freight train. 😅