Wherein I migrate my Jellyfin instance to the k8s cluster.
This is part 19 of my k8s migration series.
I’m running a Jellyfin instance in my Homelab to play movies and TV shows. I don’t have a very fancy setup, no re-encoding or anything like that. I’m just using Direct Play, as I’m only watching things on my desktop computer.
Jellyfin doesn’t have any external dependencies at all, so there’s only the Jellyfin Pod itself to be configured. It also doesn’t have a proper configuration file. Instead, it’s configured through the web UI and a couple of command line options. For that reason, there won’t be any Secrets or ConfigMaps. Instead, I’ve just got a PVC holding the config and some space for Jellyfin’s cache, and another CephFS volume for the media collection.
Said media collection volume will be the main focus of this post, because everything else about the setup follows my standard k8s app setup pretty closely. I had originally planned to also dive a bit (okay, a lot 😅) into the metrics of the copy operation, but that rather quickly turned into a rabbit hole all its own, and so I decided to declare the beginning of operation “articles, not tomes” and split it out into another post that will follow shortly after this one.
## Setting up the media volume
For my media volume, I had been using a CephFS volume in the Nomad job setup. I had two reasons for this:
- I need to mount the volume twice and access it from two places: The Jellyfin job, and my main desktop
- Having “unlimited” space
Ceph RBD volumes were out of the question, because those always need to have a size set. They can’t just grow over the entire space available in their Ceph pool. CephFS volumes are different, though. By default, they don’t have any size restriction and can use the entire data pool of the CephFS they’ve been created on. This allows me to not have to worry about whether I need to extend the size at some point. At the same time, I also regularly copy new files onto the disk when expanding my media collection. This happens from my desktop. So I also need to have the ability to mount the volume on two machines at the same time, and write to it at the same time too.
These two points make CephFS the perfect fit for the media volume. But it left me with a problem: I needed a k8s PVC to mount into the Jellyfin Pod. But by default, PVCs always have to have a capacity set. In my initial tests, I tried just removing the size in the manifest for a test PVC, but k8s rejected it when I tried to apply it. The same thing happened when I instead set the size to 0.
So back to the drawing board it was. Luckily for me, @beyondwatts pointed me to static PVCs, which can be used to make manually created CephFS and RBD volumes available as PVCs in Kubernetes. This seems to be a feature of the Ceph CSI. The documentation for the feature can be found here.
I created my new media volume (technically a CephFS subvolume) with the following Ceph commands:
```shell
ceph fs subvolumegroup create homelab-fs static-pvcs
ceph fs subvolume create homelab-fs media static-pvcs
```
After creation, the output of `ceph fs subvolume info homelab-fs media static-pvcs` looks like this:
```json
{
    "atime": "2025-02-11 22:46:35",
    "bytes_pcent": "undefined",
    "bytes_quota": "infinite",
    "bytes_used": 0,
    "created_at": "2025-02-11 22:46:35",
    "ctime": "2025-02-11 22:46:35",
    "data_pool": "homelab-fs-bulk",
    "features": [
        "snapshot-clone",
        "snapshot-autoprotect",
        "snapshot-retention"
    ],
    "flavor": 2,
    "gid": 0,
    "mode": 16877,
    "mon_addrs": [
        "300.300.300.1:6789",
        "300.300.300.2:6789",
        "300.300.300.3:6789"
    ],
    "mtime": "2025-02-11 22:46:35",
    "path": "/volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca",
    "pool_namespace": "",
    "state": "complete",
    "type": "subvolume",
    "uid": 0
}
```
Note especially the `bytes_quota: infinite` part, which was what I was after.
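Should I ever want a cap after all, a quota can also be put on an existing subvolume later. A sketch, going by the Ceph `subvolume resize` docs (the 2 TiB figure is just an example):

```shell
# Put a 2 TiB quota on the subvolume (the size is given in bytes):
ceph fs subvolume resize homelab-fs media 2199023255552 --group_name static-pvcs
# And back to unlimited again:
ceph fs subvolume resize homelab-fs media inf --group_name static-pvcs
```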
Next, I created the PersistentVolume for it:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jellyfin-media
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Gi
  csi:
    driver: rook-ceph.cephfs.csi.ceph.com
    controllerExpandSecretRef:
      name: rook-csi-cephfs-provisioner
      namespace: rook-cluster
    nodeStageSecretRef:
      name: rook-csi-cephfs-node
      namespace: rook-cluster
    volumeAttributes:
      "fsName": "homelab-fs"
      "clusterID": "rook-cluster"
      "staticVolume": "true"
      "rootPath": /volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca
    volumeHandle: jellyfin-media
  persistentVolumeReclaimPolicy: Retain
  volumeMode: Filesystem
```
I mostly copied this from another CephFS volume I already had as scratch space for my backup setup. Important to note here are the `spec.csi.volumeAttributes.staticVolume: "true"` entry as well as the `rootPath`.
The value for the root path can be found with the following command:
```shell
ceph fs subvolume getpath homelab-fs media static-pvcs
```
The PersistentVolumeClaim then looks like this:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jellyfin-media
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: ""
  volumeMode: Filesystem
  volumeName: jellyfin-media
```
Because it’s a CephFS subvolume, I could use the ReadWriteMany access mode.
But when trying to launch a Pod using the PVC, I initially got this error message:

```text
MountVolume.MountDevice failed for volume "jellyfin-media" : rpc error: code = Internal desc = failed to get user credentials from node stage secrets: missing ID field 'userID' in secrets
```
This showed up in the Events of the Pod. The issue is mentioned in the Rook Docs.
It needs to be solved by manually creating another Secret. I’m not sure why the Ceph CSI driver doesn’t create that Secret automatically, as it’s just a copy of the `rook-csi-cephfs-node` Secret with different names for the data keys.
I did the copy by first fetching the `rook-csi-cephfs-node` Secret:
```shell
kubectl get -n rook-cluster secrets rook-csi-cephfs-node -o yaml > csi-secret.yaml
```
From that `csi-secret.yaml` I then removed all of the runtime information added by Kubernetes and renamed the keys like this:

- `adminID` -> `userID`
- `adminKey` -> `userKey`
After that, I applied the new Secret to the cluster and changed the `spec.csi.nodeStageSecretRef.name` property of the PersistentVolume to the newly created Secret. With that done, the Pod was able to mount the CephFS static volume without issue.
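The manual YAML surgery could also be scripted. A sketch with `jq`, where the new Secret’s name `rook-csi-cephfs-node-user` is my own invention, and I’m assuming the source Secret’s data keys are `adminID`/`adminKey` as described above:

```shell
# Clone rook-csi-cephfs-node into a new Secret whose data keys have the
# names the static-volume code path expects (userID/userKey).
kubectl get -n rook-cluster secret rook-csi-cephfs-node -o json | jq '{
    apiVersion, kind, type,
    metadata: {name: "rook-csi-cephfs-node-user", namespace: .metadata.namespace},
    data: {userID: .data.adminID, userKey: .data.adminKey}
  }' | kubectl apply -f -
```

This drops the Kubernetes-added runtime fields as a side effect, since only the listed fields are copied over.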
What I’m still wondering about is why these static PVCs need this special handling, even though dynamically created CephFS PVCs don’t.
The last step of the preparation was to make sure that I could also mount the
CephFS subvolume on my desktop machine.
This, quite honestly, involved a bit of silliness. In my current configuration, I just had the `name` option set for the mount, giving the Ceph user name to use for authentication. The mount helper then automatically reads the `/etc/ceph/ceph.conf` file to get the MON daemon IPs for initial cluster contact, and the `ceph.client.<username>.keyring` file from the same directory. I couldn’t reuse that approach, because I’ve got other mounts from the baremetal cluster that I need to keep for now.
But as per the ceph.mount man page, there is a `secretfile` option. In my naivete, I thought that this option takes the path to a keyring file. Which would make sense, because keyring files are how Ceph credentials are provided everywhere else. But no: the `secretfile` option expects a file which contains only the key, and nothing else. If you provide it with a full keyring file, the mount command will output an error like this:

```text
secret is not valid base64: Invalid argument.
adding ceph secret key to kernel failed: Invalid argument
couldn't append secret option: -22
```
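Getting the bare key out of a keyring file is a one-liner. A small sketch, with a made-up user name and key, and using `/tmp` paths so it's self-contained:

```shell
# A keyring file looks roughly like this (sample written here for illustration):
cat > /tmp/ceph.client.myuser.keyring <<'EOF'
[client.myuser]
    key = AQDExampleKeyOnlyForIllustration==
EOF

# secretfile wants ONLY the bare key, so extract just the value:
awk -F' = ' '/key/ {print $2}' /tmp/ceph.client.myuser.keyring > /tmp/ceph-rook.secret
chmod 600 /tmp/ceph-rook.secret
cat /tmp/ceph-rook.secret   # -> AQDExampleKeyOnlyForIllustration==
```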
With that finally figured out, I created the Ceph config file for the Rook cluster with this command:
```shell
ceph config generate-minimal-conf
```
Then I was able to mount the subvolume with this command:
```shell
mount -t ceph :/volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca /mnt/temp -o name=myuser,secretfile=/etc/ceph/ceph-rook.secret,conf=/etc/ceph/ceph-rook.conf
```
What I really like about working with Rook instead of baremetal Ceph is that I can create additional users with Kubernetes manifests so I can version control them, instead of having to document long sequences of commands in a runbook:
```yaml
apiVersion: ceph.rook.io/v1
kind: CephClient
metadata:
  name: myuser
spec:
  caps:
    mds: 'allow rw path=/volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca'
    mon: 'allow r'
    osd: 'allow rw tag cephfs data=homelab-fs'
```
This will allow the user to access only that specific static volume in the cluster.
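Once Rook has reconciled the CephClient, the generated key should land in a Secret named after the client. A sketch; the `rook-ceph-client-<name>` naming pattern and the data key are from my reading of the Rook docs, so verify them against your cluster:

```shell
# Pull the generated key for 'myuser' into the bare-key file that the
# secretfile mount option expects.
kubectl get -n rook-cluster secret rook-ceph-client-myuser \
  -o jsonpath='{.data.myuser}' | base64 -d > /etc/ceph/ceph-rook.secret
chmod 600 /etc/ceph/ceph-rook.secret
```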
## Copying the media collection
My media collection has a size of about 1.7 TiB. I knew that copying it over
would take quite a while, so I planned to do it from my Command&Control host.
But then I got a weird feeling and decided to check the networking diagram.
It looks something like this:

*(Figure: network diagram with the packet flow for the copy operation.)*
The issue here is the fact that my C&C host, called the Copy Host here, is in a different VLAN than the baremetal and Rook Ceph hosts. This means that some routing needs to happen for packets to get from the Ceph hosts to the copy host and back, which in turn means that all packets need to pass through the router. This would be fine if the packets only needed to pass through the router once. But in truth, they pass through the router twice, and they pass through the same NIC on the router no fewer than four times.
The packets go from the source, the baremetal Ceph cluster, up to the router via the link from the switch. Pass Nr. 1. Then they go down that same link again to reach the C&C host on its VLAN. Pass Nr. 2. The C&C host then sends them to the router again, now with the Rook Ceph host as the destination. Pass Nr. 3. And finally, the router sends the packets back down that link between router and switch to arrive at the Rook Ceph host. Pass Nr. 4.
So because each packet passes the link twice in each direction, my maximum copy speed is suddenly reduced to 500 Mbit/s, which is a mere 62 MByte/s, slower even than the HDDs involved in this copy process.
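The back-of-the-envelope math, assuming a 1 Gbit/s link between router and switch (adjust `link_mbit` for a different trunk speed):

```shell
link_mbit=1000
# Each direction of the full-duplex link carries the payload twice,
# halving the usable bandwidth:
effective_mbit=$((link_mbit / 2))
# 8 bits per byte:
effective_mbyte=$((effective_mbit / 8))
echo "${effective_mbit} Mbit/s, i.e. about ${effective_mbyte} MByte/s"
```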
I was contemplating which Homelab host to take out and install the necessary tools on when @badnetmask, rightly, asked why I don’t just launch a Pod somewhere. And that was what I finally went with.
I then remembered that there is a Rook Ceph Toolbox with all the necessary tools already installed and I decided to try that. After copying the credentials similar to what I explained above for my desktop mounts, I got an error:
```text
bash-5.1$ mount -t ceph :/volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca /mnt/rook -o name=admin
mount: drop permissions failed.
```
I then changed the Pod’s YAML a bit to run it as root. Which gave me an error again, but at least a different one:
```text
[root@rook-ceph-tools-584df95dcb-vdwqc /]# mount -t ceph :/volumes/static-pvcs/media/9a1f1581-6749-4146-a2aa-251fe2b58eca /mnt/rook -o name=admin
Unable to apply new capability set.
modprobe: FATAL: Module ceph not found in directory /lib/modules/5.15.0-131-generic
failed to load ceph kernel module (1)
Unable to apply new capability set.
unable to determine mon addresses
```
To get rid of the failed attempt to load the Ceph kernel module, I then also added the `/lib/modules` directory as a volume to the Pod. This worked and got rid of the fatal modprobe error, but still left me with the other errors.
So, throwing up my hands, I set `securityContext.privileged`. I’m still a bit surprised that Linux doesn’t have a specific capability to grant just for mounting. Perhaps the ability to run mount is so powerful that you’ve effectively got `CAP_SYS_ADMIN` anyway?
The final Deployment I used:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-tools
  namespace: rook-cluster # namespace:cluster
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rook-ceph-tools
  template:
    metadata:
      labels:
        app: rook-ceph-tools
    spec:
      dnsPolicy: ClusterFirstWithHostNet
      serviceAccountName: rook-ceph-default
      containers:
        - name: rook-ceph-tools
          image: quay.io/ceph/ceph:v18
          command:
            - /bin/bash
            - -c
            - |
              CEPH_CONFIG="/etc/ceph/ceph.conf"
              MON_CONFIG="/etc/rook/mon-endpoints"
              KEYRING_FILE="/etc/ceph/keyring"

              # Write a ceph.conf pointing at the current MON endpoints.
              write_endpoints() {
                endpoints=$(cat ${MON_CONFIG})
                mon_endpoints=$(echo "${endpoints}"| sed 's/[a-z0-9_-]\+=//g')
                DATE=$(date)
                echo "$DATE writing mon endpoints to ${CEPH_CONFIG}: ${endpoints}"
                cat <<EOF > ${CEPH_CONFIG}
              [global]
              mon_host = ${mon_endpoints}

              [client.admin]
              keyring = ${KEYRING_FILE}
              EOF
              }

              # Watch the mounted MON endpoint ConfigMap and rewrite
              # ceph.conf whenever it changes.
              watch_endpoints() {
                real_path=$(realpath ${MON_CONFIG})
                initial_time=$(stat -c %Z "${real_path}")
                while true; do
                  real_path=$(realpath ${MON_CONFIG})
                  latest_time=$(stat -c %Z "${real_path}")
                  if [[ "${latest_time}" != "${initial_time}" ]]; then
                    write_endpoints
                    initial_time=${latest_time}
                  fi
                  sleep 10
                done
              }

              ceph_secret=${ROOK_CEPH_SECRET}
              if [[ "$ceph_secret" == "" ]]; then
                ceph_secret=$(cat /var/lib/rook-ceph-mon/secret.keyring)
              fi

              # Create the admin keyring file.
              cat <<EOF > ${KEYRING_FILE}
              [${ROOK_CEPH_USERNAME}]
              key = ${ceph_secret}
              EOF

              write_endpoints
              watch_endpoints
          imagePullPolicy: IfNotPresent
          tty: true
          securityContext:
            runAsNonRoot: false
            privileged: true
          env:
            - name: ROOK_CEPH_USERNAME
              valueFrom:
                secretKeyRef:
                  name: rook-ceph-mon
                  key: ceph-username
          volumeMounts:
            - mountPath: /etc/ceph
              name: ceph-config
            - name: mon-endpoint-volume
              mountPath: /etc/rook
            - name: ceph-admin-secret
              mountPath: /var/lib/rook-ceph-mon
              readOnly: true
            - name: modules
              mountPath: /lib/modules
              readOnly: true
      volumes:
        - name: ceph-admin-secret
          secret:
            secretName: rook-ceph-mon
            optional: false
            items:
              - key: ceph-secret
                path: secret.keyring
        - name: mon-endpoint-volume
          configMap:
            name: rook-ceph-mon-endpoints
            items:
              - key: data
                path: mon-endpoints
        - name: ceph-config
          emptyDir: {}
        - name: modules
          hostPath:
            path: /lib/modules # directory location on host
      tolerations:
        - key: "node.kubernetes.io/unreachable"
          operator: "Exists"
          effect: "NoExecute"
          tolerationSeconds: 5
```
Anyway, with the `privileged` option set, I was finally able to mount. Wanting to use rsync, I installed it with `yum install rsync` and mounted the baremetal and Rook CephFS subvolumes.
I used this command to execute the copy operation:
```shell
rsync -av --info=progress2 --info=name0 /mnt/baremetal/* /mnt/rook/
```
Here is the final output:
```text
sent 1,748,055,479,314 bytes  received 155,334 bytes  54,890,039.24 bytes/sec
total size is 1,853,006,549,228  speedup is 1.06
```
The operation took a total of 9.5 h.
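One small caveat about the rsync invocation above: with `/mnt/baremetal/*`, the shell expands the glob before rsync ever runs, so any dotfiles in the top-level directory would be silently skipped. A trailing slash on the source (`/mnt/baremetal/`) copies everything. A quick demo of the difference, using `cp` as a stand-in (the behavior comes from shell glob expansion, not from the copy tool):

```shell
# Set up a source directory with one hidden and one visible file:
mkdir -p /tmp/demo-src /tmp/demo-dst1 /tmp/demo-dst2
touch /tmp/demo-src/.hidden /tmp/demo-src/visible

cp -r /tmp/demo-src/* /tmp/demo-dst1/   # glob: .hidden is NOT copied
cp -r /tmp/demo-src/. /tmp/demo-dst2/   # 'src/.' (like rsync's 'src/'): .hidden IS copied
```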
## Deploying Jellyfin
Just for completeness’ sake, here is the Jellyfin Deployment:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jellyfin
spec:
  replicas: 1
  selector:
    matchLabels:
      homelab/app: jellyfin
  strategy:
    type: "Recreate"
  template:
    metadata:
      labels:
        homelab/app: jellyfin
    spec:
      securityContext:
        fsGroup: 1006
        runAsUser: 1007
        runAsGroup: 1006
      containers:
        - name: jellyfin
          image: jellyfin/jellyfin:{{ .Values.appVersion }}
          command:
            - "/jellyfin/jellyfin"
            - "--datadir"
            - "{{ .Values.mounts.cacheAndConf }}/data"
            - "--cachedir"
            - "{{ .Values.mounts.cacheAndConf }}/cache"
            - "--ffmpeg"
            - "/usr/lib/jellyfin-ffmpeg/ffmpeg"
          volumeMounts:
            - name: cache-and-conf
              mountPath: {{ .Values.mounts.cacheAndConf }}
            - name: media
              mountPath: {{ .Values.mounts.media }}
          resources:
            requests:
              cpu: 1000m
              memory: 1000Mi
          livenessProbe:
            httpGet:
              port: {{ .Values.port }}
              path: "/health"
            initialDelaySeconds: 15
            periodSeconds: 30
          ports:
            - name: jellyfin-http
              containerPort: {{ .Values.port }}
              protocol: TCP
      volumes:
        - name: cache-and-conf
          persistentVolumeClaim:
            claimName: jellyfin-config-volume
        - name: media
          persistentVolumeClaim:
            claimName: jellyfin-media
```
Some specialties out of the ordinary here are the settings in `spec.securityContext`. These are there to ensure that I’m getting the right permissions on the files produced on the media collection subvolume. All files on there have the GID 1006, which is historically my group on the first desktop connected to my first Homeserver, and it’s still serving as the shared group for my media collection, because both Jellyfin and my desktop user need to access the media files. With this configuration, new files are written with the correct GID by Jellyfin.
Another somewhat interesting point about Jellyfin: it does allow changing the config and cache directories, as you can see in `containers[0].command`, but it does not allow the same for the location of the media libraries. Those locations are hardcoded.
I had pretty big problems with this fact back when I migrated from Docker Compose
to Nomad, but sadly that was before I took extensive notes or documented everything
in my internal wiki, so I can’t repeat the manual steps I used to migrate
the data location back then. 😔
And that’s it already for this one. As I noted above, I will pretty closely follow this post with another one looking at Ceph during the large copy operation.
My next migration this coming weekend will be my Nextcloud instance. I’ll need to look at some Helm charts, but at this point I’m pretty sure I will just write my own.