Wherein I migrate my internal container registry to Harbor.
This is part 12 of my k8s migration series.
Let’s start by answering the obvious question: Why even have an internal container registry? For me, there are two reasons:
- Some place to put my own container images
- A cache for external images
Most of my internal images are slightly changed external images. A prime example is my Fluentd image. I’ve extended the official image with a couple of additional plugins. And I needed some place to store them.
My main reason for point 2) is to avoid waste. Why reach out to the Internet and put additional unnecessary load on somebody else’s infrastructure by pulling the same image 12 times? It makes a lot more sense to me to only do that once and then use an internal cache. A secondary reason was of course the introduction of the DockerHub rate limit. I tended to hit that pretty regularly, especially when I was working on my CI.
A tertiary reason is Deutsche Telekom. My ISP. A couple of years ago, they tended to regularly get into peering battles with their tier 1 peering partners, and consequently, you had some days where the entire US was connected down a 512 Kbps pipe. Or at least that was what it felt like. Pulling an image from DockerHub ran with, I kid you not, 5 Kbps. Those days seem to be over, but I still like to at least be able to use previously pulled images.
Finally, there might also be a speed advantage when pulling from a local cache instead of reaching out to the Internet. But for me, that was never really a consideration. I’ve got a 1 Gbps LAN, and most of my storage runs off of a Ceph cluster, with the image cache running on my bulk storage HDDs. So there’s really not going to be that much gain.
In my Nomad cluster, I had set up two instances of Docker’s official registry. Hm, it is now called “distribution”? And seemingly under the CNCF? Ah:
Registry, the open source implementation for storing and distributing container images and other content, has been donated to the CNCF. Registry now goes under the name of Distribution, and the documentation has moved to…
From the official docs on the Docker page.
I chose registry back then because it looked like a pretty low-powered solution. For a GUI, I used docker-registry-ui, which I can warmly recommend.
But I also pretty much ran it as an open registry, which bothered me a bit. Plus, I had looked a lot at Harbor, but always found that it sounded a bit too much oriented towards deployment in Kubernetes. And now that I’m finally running my own Kubernetes cluster, I decided to replace my two registry instances with a single Harbor instance.
Another reason for wanting to look at Harbor was that I think at some point, registry could only serve as a pull-through cache for DockerHub, but not for other registries, e.g. Quay.io. But if I read the docs right, it’s now possible to mirror other registries with it as well.
There are other alternatives as well. The first one, Artifactory, is out, because while I know that it would fulfill my needs, it is also what we use at work. And there is no great love lost between me and Artifactory. It will only get deployed in my Homelab over my dead, cold, decomposing body.
Then there’s Sonatype Nexus. But quite frankly: That always gave off pretty strong “We’re going to go source available within the week” vibes.
Finally, there’s Gitea and their relatively recently introduced package management feature, which also includes a container registry. The main reason I did not go with this one is that it currently doesn’t support pull-through caches, although there’s a feature request. In addition, I’m still a big fan of running apps which do one thing well, instead of everything somewhat decently. (He says, looking at his Nextcloud file sharing/note taking/calendar/contacts/bookmarks moloch 😅)
So Harbor it is. Let’s dig into it.
Harbor setup
To set up Harbor, I used the official Helm chart. It is perfectly workable, but has some quirks when it comes to secrets handling, which I will go into in detail later.
Here is my values.yaml
file for the chart:
expose:
type: ingress
tls:
enabled: true
certSource: none
ingress:
hosts:
core: harbor.example.com
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: myentrypoint
harbor:
labels:
homelab/part-of: harbor
externalURL: https://harbor.example.com
ipFamily:
ipv6:
enabled: false
persistence:
enabled: false
imageChartStorage:
disableredirect: true
type: s3
s3:
existingSecret: my-harbor-rgw-secret
bucket: harbor-random-numbers-here
regionendpoint: http://rook-ceph-rgw-myobjectstorename.my-rook-cluster-namespace.svc:80
v4auth: true
rootdirectory: /harbor
encrypt: false
secure: false
logLevel: info
existingSecretAdminPassword: my-admin-secret
existingSecretAdminPasswordKey: mySecretsKey
existingSecretSecretKey: my-harbor-secret-key-secret
portal:
resources:
requests:
memory: 256Mi
cpu: 100m
podLabels:
homelab/part-of: harbor
core:
resources:
requests:
memory: 256Mi
cpu: 100m
podLabels:
homelab/part-of: harbor
jobservice:
jobLoggers:
- database
resources:
requests:
memory: 256Mi
cpu: 100m
podLabels:
homelab/part-of: harbor
registry:
registry:
resources:
requests:
memory: 256Mi
cpu: 100m
controller:
resources:
requests:
memory: 256Mi
cpu: 100m
podLabels:
homelab/part-of: harbor
credentials:
username: my-harbor-registry-user
existingSecret: my-harbor-registry-user-secret
trivy:
enabled: false
database:
type: external
external:
host: "harbor-pg-cluster-rw"
port: 5432
username: harbor
coreDatabase: harbor
existingSecret: harbor-pg-cluster-app
redis:
type: external
external:
addr: redis.redis.svc.cluster.local:6379
metrics:
enabled: false
serviceMonitor:
enabled: false
The above is only for completeness’ sake. Let’s go through the config bit-by-bit. The first part is the setup for external access:
expose:
type: ingress
tls:
enabled: true
certSource: none
ingress:
hosts:
core: harbor.example.com
annotations:
traefik.ingress.kubernetes.io/router.entrypoints: myentrypoint
harbor:
labels:
homelab/part-of: harbor
externalURL: https://harbor.example.com
ipFamily:
ipv6:
enabled: false
This uses my Traefik Ingress
to provide external connectivity. I’m disabling IPv6 because I don’t have it
set up in my Homelab. Please note the (perfectly normal!) spelling of externalURL
.
I spelled it wrong, and so all the pull commands which Harbor helpfully shows
in the web UI had the default URL in it. One of those things which can really
only be solved by staring very intently at the YAML for an extended period of
time. 😅
persistence:
enabled: false
imageChartStorage:
disableredirect: true
type: s3
s3:
existingSecret: my-harbor-rgw-secret
bucket: harbor-random-numbers-here
regionendpoint: http://rook-ceph-rgw-myobjectstorename.my-rook-cluster-namespace.svc:80
v4auth: true
rootdirectory: /harbor
encrypt: false
secure: false
Next up is persistence. Harbor has two approaches here. The first one, which is the default that I’m not using here, is using PersistentVolumeClaims to store the data, like container images. The second one is using S3, as I’m doing here. I disable the registry’s redirect feature here. It would normally redirect any requests directly to the S3 storage. But access to my S3 storage is very limited outside the cluster. And with my relatively low levels of activity, I don’t need to reduce the load on Harbor’s registry by enabling it. I’m using my Ceph Rook based S3 setup here. Again for completeness’ sake, here is the manifest for creating the bucket:
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
name: harbor
spec:
generateBucketName: harbor
storageClassName: rgw-bulk
I will talk about the secrets setup later in a separate section.
Another important thing when running without persistent volumes is where to store the job logs, e.g. from the automated security scans Harbor can conduct on the images:
jobservice:
jobLoggers:
- database
resources:
requests:
memory: 256Mi
cpu: 100m
podLabels:
homelab/part-of: harbor
The important part here is the jobservice.jobLoggers[0]=database
setting,
which configures the job service to write logs to the Postgres DB.
I’m also disabling all of this security scanning, by switching off trivy.enabled
.
The next somewhat interesting thing is the database setup:
database:
type: external
external:
host: "harbor-pg-cluster-rw"
port: 5432
username: harbor
coreDatabase: harbor
existingSecret: harbor-pg-cluster-app
To manage the database, I’m using my CloudNativePG setup. Here are some parts of the database config:
resources:
requests:
memory: 200M
cpu: 150m
postgresql:
parameters:
max_connections: "200"
shared_buffers: "50MB"
effective_cache_size: "150MB"
maintenance_work_mem: "12800kB"
checkpoint_completion_target: "0.9"
wal_buffers: "1536kB"
default_statistics_target: "100"
random_page_cost: "1.1"
effective_io_concurrency: "300"
work_mem: "128kB"
huge_pages: "off"
max_wal_size: "128MB"
wal_keep_size: "512MB"
storage:
size: 1.5G
storageClass: rbd-fast
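For orientation, these parameters sit under spec of a CloudNativePG Cluster resource, roughly like this (a sketch; the instance count is a placeholder, and the name matches the harbor-pg-cluster-rw Service and harbor-pg-cluster-app Secret referenced in the Harbor values above):

apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: harbor-pg-cluster
spec:
  instances: 2
  bootstrap:
    initdb:
      database: harbor
      owner: harbor
  # resources, postgresql.parameters and storage from the fragment above go here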
I hope this is a good compromise between dumping a long piece of YAML into every post about an app which needs Postgres, and not showing the database setup at all.
Finally, I’m using my Redis instance for caching and disabling metrics explicitly, so when I get around to gathering all the app level metrics and making dashboards, I’ve got something to grep for in the Homelab repo. 😉
Issues with secrets
I had a couple of issues with the different secrets which Harbor needs. Let’s start with the place where the chart gets it right, the admin credentials:
existingSecretAdminPassword: my-admin-secret
existingSecretAdminPasswordKey: mySecretsKey
The Helm chart doesn’t just allow setting the Secret to use, but also which key in that Secret contains the password. That’s how it should be done.
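For illustration, the referenced Secret is nothing fancier than this (the value is obviously a placeholder):

apiVersion: v1
kind: Secret
metadata:
  name: my-admin-secret
stringData:
  # key name matches existingSecretAdminPasswordKey from the values above
  mySecretsKey: my-admin-password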
The credentials for the database were also okay, because the key the Helm chart expected, password, happens to also be the key under which CloudNativePG stores the user password in the Secret it creates. What saddened me a bit is that I couldn’t set the host and port that way as well, even though CNPG puts those into that Secret, too.
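For reference, the harbor-pg-cluster-app Secret that CNPG generates contains roughly the following keys (listed from memory, values elided):

apiVersion: v1
kind: Secret
metadata:
  name: harbor-pg-cluster-app
stringData:
  username: harbor
  password: "<generated>"
  host: harbor-pg-cluster-rw
  port: "5432"
  dbname: harbor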
But a lot more annoying were the S3 credentials. For every bucket, Rook creates a Secret with the access key and the secret key, plus a ConfigMap with the semi-randomly generated bucket name and the correct endpoint. It would have been nice if I could have handed those Rook-created objects over to the Helm chart directly. Instead, I hardcoded the bucket name and endpoint in the values.yaml, which means some manual intervention if I ever have to recreate it all.
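For completeness, the two objects Rook creates from the ObjectBucketClaim look roughly like this (shortened; the ConfigMap key names are from memory):

apiVersion: v1
kind: Secret
metadata:
  name: harbor
data:
  AWS_ACCESS_KEY_ID: <base64>
  AWS_SECRET_ACCESS_KEY: <base64>
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: harbor
data:
  BUCKET_NAME: harbor-random-numbers-here
  BUCKET_HOST: rook-ceph-rgw-myobjectstorename.my-rook-cluster-namespace.svc
  BUCKET_PORT: "80"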
For the credentials, I could at least provide the name of an existing Secret.
But as per the values.yaml
comments, the access key and the secret key need
to be put into specific keys in the provided Secret. And those were not the
standard key names you would expect, e.g. AccessKey
and SecretKey
.
No, they have to be REGISTRY_STORAGE_S3_ACCESSKEY
and REGISTRY_STORAGE_S3_SECRETKEY
.
So what to do now? Manually extract the keys from Rook’s Secret and write a new Secret by hand? Luckily, no. The Fediverse came through, and somebody proposed using external-secrets’ Kubernetes provider. This provider allows me to automatically take a Kubernetes Secret and create a new Secret from it, with the same data under different keys. This is still a pretty roundabout way, but I decided it’s preferable to the other options, which would be writing a Secret by hand or forking the Helm chart.
First, we need to define some RBAC objects for use by the SecretStore for the Kubernetes provider.
Here is the ServiceAccount:
apiVersion: v1
kind: ServiceAccount
metadata:
name: ext-secrets-harbor
labels:
homelab/part-of: harbor
Next, we need a Role for that ServiceAccount to use:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: ext-secrets-harbor-role
labels:
homelab/part-of: harbor
rules:
- apiGroups: [""]
resources:
- secrets
verbs:
- get
- list
- watch
- apiGroups:
- authorization.k8s.io
resources:
- selfsubjectrulesreviews
verbs:
- create
This allows all accounts using the Role to view Secrets in the Namespace the Role is created in, which in this case is my Harbor Namespace.
Finally, we need a RoleBinding to bind the Role to the ServiceAccount:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
homelab/part-of: harbor
name: ext-secrets-harbor
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: ext-secrets-harbor-role
subjects:
- kind: ServiceAccount
name: ext-secrets-harbor
namespace: harbor
Once all of that has been created, we can define the SecretStore:
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
name: harbor-secrets-store
labels:
homelab/part-of: harbor
spec:
provider:
kubernetes:
remoteNamespace: harbor
auth:
serviceAccount:
name: ext-secrets-harbor
server:
caProvider:
type: ConfigMap
name: kube-root-ca.crt
key: ca.crt
One fascinating thing I learned is that Kubernetes puts the CA certs for the
kube-apiserver in every Namespace, under a ConfigMap called kube-root-ca.crt
.
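A quick way to convince yourself that it’s there:

kubectl get configmap kube-root-ca.crt -n harbor -o jsonpath='{.data.ca\.crt}' | head -n 3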
This SecretStore can then be used to take the Secret created by Rook for the S3 bucket and rewrite it to fit the expectations of the Harbor chart as follows:
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: "harbor-s3-secret"
labels:
homelab/part-of: harbor
spec:
secretStoreRef:
name: harbor-secrets-store
kind: SecretStore
refreshInterval: "1h"
target:
creationPolicy: 'Owner'
data:
- secretKey: REGISTRY_STORAGE_S3_ACCESSKEY
remoteRef:
key: harbor
property: AWS_ACCESS_KEY_ID
- secretKey: REGISTRY_STORAGE_S3_SECRETKEY
remoteRef:
key: harbor
property: AWS_SECRET_ACCESS_KEY
This will have external-secrets go to the kube-apiserver and get the
AWS_SECRET_ACCESS_KEY
and AWS_ACCESS_KEY_ID
keys from the harbor
Secret,
which was previously created automatically by Rook through the ObjectBucketClaim
I used to create the S3 bucket for Harbor.
And with these five simple manifests, I could use the Rook S3 Secret with the Harbor Helm chart. 😅
One last thing which tripped me during setup were the registry credentials. The values.yaml contains these comments on how to set up the credentials:
registry:
credentials:
username: "harbor_registry_user"
password: "harbor_registry_password"
# If using existingSecret, the key must be REGISTRY_PASSWD and REGISTRY_HTPASSWD
existingSecret: ""
# Login and password in htpasswd string format. Excludes `registry.credentials.username` and `registry.credentials.password`. May come in handy when integrating with tools like argocd or flux. This allows the same line to be generated each time the template is rendered, instead of the `htpasswd` function from helm, which generates different lines each time because of the salt.
# htpasswdString: $apr1$XLefHzeG$Xl4.s00sMSCCcMyJljSZb0 # example string
htpasswdString: ""
What I did not initially get from that comment was that when using an existing Secret, both the clear text password and the htpasswd string are required. This put me into an amusing conundrum: I did not have a single host with htpasswd available. 😂
I ended up using the Apache container just to generate the htpasswd string:
docker run -it httpd htpasswd -n -B my-harbor-registry-user
I then put that string into the Secret verbatim and was finally able to start the Harbor instance.
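The resulting Secret then looks something like this, with the bcrypt string from the htpasswd run above and placeholder values:

apiVersion: v1
kind: Secret
metadata:
  name: my-harbor-registry-user-secret
stringData:
  REGISTRY_PASSWD: my-clear-text-registry-password
  REGISTRY_HTPASSWD: "my-harbor-registry-user:$2y$05$<rest-of-the-bcrypt-hash>"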
Transferring my internal images to Harbor
The first step I took was to transfer all of my internal images over to Harbor by adapting the CI jobs which create them to push to Harbor instead.
I’ve currently got five internal images, most of them just copies of official images with some additions. I create them with Drone CI, which I will switch over to Woodpecker later as part of the migration.
The first step in transferring the images was to set up a user for the CI in Harbor. This can be done with the Harbor Terraform provider, but I did it manually for now. Then I created a “homelab” project for those Docker images.
For my image repository, which houses the Dockerfiles for most of my internal
images, I have a .drone.jsonnet
file which looks like this:
local alpine_ver = "3.19.1";
local Pipeline(img_name, version, pr, alpine=false, alpine_ver_int=alpine_ver) = {
kind: "pipeline",
name:
if pr then
"Build "+img_name
else
"Release "+img_name,
platform: {
arch: "arm64",
},
steps: [
{
name:
if pr then
"Build Image"
else
"Release Image",
image: "thegeeklab/drone-docker-buildx",
privileged: true,
settings: {
repo: "harbor.example.com/homelab/"+img_name,
registry: "harbor.example.com",
username: "myuser",
password: {
from_secret: "harbor-secret",
},
dockerfile: img_name+"/Dockerfile",
context: img_name+"/",
mirror: "https://harbor-mirror.example.com",
debug: true,
buildkit_config: 'debug = true\n[registry."docker.io"]\n mirrors = ["harbor.example.com/dockerhub-cache"]\n[registry."quay.io"]\n mirrors = ["harbor.example.com/quay.io-cache"]\n[registry."ghcr.io"]\n mirrors = ["harbor.example.com/github-cache"]',
tags: [version, "latest"],
custom_dns: ["10.0.0.1"],
build_args: std.prune([
img_name+"_ver="+version,
if alpine then
"alpine_ver="+alpine_ver_int
]),
platforms: [
"linux/amd64",
"linux/arm64",
],
dry_run:
if pr then
true
else
false
},
}
],
trigger:
if pr then
{
event: {
include: [
"pull_request"
]
}
}
else
{
branch: {
include: [
"master"
]
},
event: {
exclude: [
"pull_request"
]
}
}
};
local Image(img_name, version, alpine=false, alpine_ver_int=alpine_ver) = [
Pipeline(img_name, version, true, alpine, alpine_ver_int),
Pipeline(img_name, version, false, alpine, alpine_ver_int)
];
Image("gitea", "1.21.10")
This configuration uses buildkit via the drone-docker-buildx plugin, which is no longer actively developed. One of the reasons why I’m planning to migrate to Woodpecker. I’m creating images for both arm64 and amd64, as most of my Homelab consists of Raspberry Pis.
One snag I hit during this part of the setup was when I tried to switch the Fluentd image in my logging setup, already running on Kubernetes, over to Harbor. I got only pull failures, without any indication of what was going wrong. It turned out that this was the first time my Kubernetes nodes were trying to access something running in my cluster behind the Traefik ingress at example.com. And yet again, I had to adapt my NetworkPolicy for said Traefik Ingress.
Looking at the Cilium monitoring, I saw the following whenever one of my k8s
hosts tried to pull the image:
xx drop (Policy denied) flow 0x0 to endpoint 1868, ifindex 6, file bpf_lxc.c:2069, , identity remote-node->39413: 10.8.5.218:55064 -> 10.8.4.134:8000 tcp SYN
xx drop (Policy denied) flow 0x0 to endpoint 1868, ifindex 6, file bpf_lxc.c:2069, , identity remote-node->39413: 10.8.5.218:55064 -> 10.8.4.134:8000 tcp SYN
xx drop (Policy denied) flow 0x0 to endpoint 1868, ifindex 6, file bpf_lxc.c:2069, , identity remote-node->39413: 10.8.5.218:55064 -> 10.8.4.134:8000 tcp SYN
Here, endpoint 1868 is Traefik, and we can see that access from the remote-node identity is being dropped. This was due to the fact that while I had allowed access from world to Traefik, world in Cilium only covers hosts outside the Kubernetes cluster. Cluster nodes, including the local host, need to be allowed explicitly. So I had to add the following to my Traefik NetworkPolicy:
ingress:
- fromEntities:
- cluster
cluster includes both the local host and all other nodes in the cluster.
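For context, the relevant part of the resulting policy looks roughly like this (a sketch, with the name and the endpoint selector standing in for my actual Traefik labels):

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: traefik-ingress-allow
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: traefik
  ingress:
    # external clients hitting the Ingress from outside the cluster
    - fromEntities:
        - world
    # cluster nodes (including the local host) pulling images via the Ingress
    - fromEntities:
        - cluster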
With that fixed, my homelab project was able to provide images to both my Docker-based Nomad cluster and my cri-o-based Kubernetes cluster.
Setting Harbor up as a pull-through cache
With the handling of my own images finished and working, the last step remaining is the setup of pull-through caches for some public image registries. I wanted to set up an internal mirror for the following registries:
- Docker Hub
- GitHub’s ghcr.io
- registry.k8s.io
- Quay.io
In Harbor, each mirror needs to be set up as a separate project, and it needs to be accessed at “harbor.example.com/project-name”. This is an issue for Docker daemons, which I will go into detail about later.
Here is an example for setting up the quay.io cache. First, an endpoint for quay.io needs to be defined. After the endpoint is defined, a project needs to be created which uses that endpoint as its proxy cache.
After these steps are done, a mirror for quay.io will be available at https://harbor.example.com/quay.io-cache.
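For those who prefer the CLI over clicking through the UI, the same thing can in principle be done via Harbor’s v2.0 REST API. The following is only a sketch; the payload fields and type strings are assumptions on my part, so double-check them against the API docs of your Harbor version:

# 1. Create the registry endpoint for quay.io (using the admin account)
curl -u "admin:${HARBOR_ADMIN_PW}" -H "Content-Type: application/json" \
  -X POST "https://harbor.example.com/api/v2.0/registries" \
  -d '{"name": "quay.io", "type": "quay", "url": "https://quay.io"}'

# 2. Create the proxy cache project referencing that endpoint
#    (replace 1 with the registry ID Harbor assigned in step 1)
curl -u "admin:${HARBOR_ADMIN_PW}" -H "Content-Type: application/json" \
  -X POST "https://harbor.example.com/api/v2.0/projects" \
  -d '{"project_name": "quay.io-cache", "registry_id": 1, "metadata": {"public": "true"}}'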
Here is a table of the configs for my current mirrors:
| Name | Endpoint URL | Provider |
|---|---|---|
| dockerhub-cache | https://hub.docker.com | Docker Hub |
| github-cache | https://ghcr.io | Github GHCR |
| k8s-cache | https://registry.k8s.io | Docker Registry |
| quay.io-cache | https://quay.io | Quay |
But there is an issue with Harbor’s subpath approach to projects/mirrors:
Docker only supports the registry-mirror
option. This will only be used for DockerHub images, not any other registry.
And the main issue: It does not support paths in the mirror URL given. Docker
always expects the registry at /
. This obviously doesn’t work with Harbor’s
domain/projectName/
scheme.
At the same time, cri-o does not suffer from this issue at all. It follows the
OCI containers-registries spec.
With this spec, and the containers-registries.conf
file, it can be configured
to rewrite pulls to any registry URL you like.
I will explain this later, but let’s start with the more complicated Docker
daemon case.
What does Docker actually do when pulling?
While trying to figure out how to solve the issue with Docker’s registry-mirror
option, I found this blog post,
which had an excellent idea: Just rewrite Docker’s requests to point them to the
right Harbor URL. And it worked. 🙂
Let’s start by having a look at the HTTP requests Docker makes when issuing the following command:
docker pull postgres:10
As the command does not have a registry domain defined, Docker defaults to
DockerHub.
Let’s imagine the Docker daemon is configured with --registry-mirror https://harbor.example.com
.
The first request Docker would try to make is this:
GET https://harbor.example.com/v2/
It would expect a 401 return code, and a www-authenticate
header.
This header looks something like this in the case of Harbor:
www-authenticate: Bearer realm="https://harbor.example.com/service/token",service="harbor-registry"
Next, Docker tries to request a token:
https://harbor.example.com/service/token?scope=repository:library/postgres:pull&service=harbor-registry
Armed with that token, it would look for the manifest file for the postgres:10 image:
https://harbor.example.com/v2/library/postgres/manifests/10
This is where things start going wrong with Harbor, because this request, sent to Harbor, would look for the library project, which does exist by default, but is not a DockerHub mirror.
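This whole dance can also be replayed by hand with curl. A rough sketch, assuming curl and jq on the client and anonymous pull access on Harbor:

# 1. Initial ping - Harbor answers with a 401 and a www-authenticate header
curl -si https://harbor.example.com/v2/ | grep -i www-authenticate

# 2. Fetch a pull token for the repository named in the scope parameter
TOKEN=$(curl -s "https://harbor.example.com/service/token?service=harbor-registry&scope=repository:library/postgres:pull" | jq -r .token)

# 3. Request the manifest with that token - this is the request that ends up
#    in Harbor's default "library" project instead of a DockerHub mirror
curl -sI -H "Authorization: Bearer ${TOKEN}" \
  "https://harbor.example.com/v2/library/postgres/manifests/10"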
My first attempt to solve this issue was pretty simplistic: I configured an additional route for the harbor-core service in my Traefik ingress, with a path rewrite middleware that turns requests like /v2/library/postgres/manifests/10 into /v2/dockerhub-cache/library/postgres/manifests/10. It looked like this:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: harbor-docker-mirror
annotations:
external-dns.alpha.kubernetes.io/hostname: "harbor-mirror.example.com"
external-dns.alpha.kubernetes.io/target: "ingress-k8s.example.com"
spec:
entryPoints:
- secureweb
routes:
- kind: Rule
match: Host(`harbor-mirror.example.com`)
middlewares:
- name: project-rewrite
namespace: harbor
services:
- kind: Service
name: harbor-core
namespace: harbor
port: http-web
scheme: http
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
name: project-rewrite
spec:
replacePathRegex:
regex: ^\/v2\/(.+)$
replacement: /v2/dockerhub-cache/${1}
This worked somewhat. The initial request for /v2/ was rewritten. But then I did not see the /service/token request hit the new harbor-mirror domain at all. It went to the harbor domain instead. And that request succeeded: Docker got a token from that endpoint.
But: The request would have been for a token to access the /library/postgres
repository.
The next request then went through the harbor-mirror
again, which meant the
request was correctly rewritten:
/v2/dockerhub-cache/library/postgres/manifests/10
But Harbor would now return a 401, because the token fetched in the previous
step was for /library/postgres/
, while the request was now for /dockerhub-cache/library/postgres
.
To fix this issue, I did not just need to rewrite the query parameter of the /service/token request, but also the response to the request before it. The domain to contact for the /service/token request is taken from the www-authenticate header of the response to the initial /v2/ request. And Harbor would of course always answer with a fixed domain, the one from the externalURL parameter in the Helm chart. And that’s not the route with the rewrite.
So, in addition to rewriting paths under /v2/, I had to do two more things:
- Rewrite the www-authenticate header of the response to the initial /v2/ request, so that the realm points to the special mirror domain instead of Harbor’s domain
- Rewrite the scope=repository: query parameter of the /service/token request to prefix the repository with the name of the DockerHub mirror project in Harbor
It turned out that Traefik wasn’t really well equipped for that. It can of course
rewrite headers, but there’s no facility to work with regexes - I could only
replace the entire www-authenticate
header with a static value. And that seemed a bit too inflexible
to me.
So instead, I decided to set up another Pod, running the Caddy webserver, and using it to do the rewrites. I decided to use Caddy instead of Nginx, as the blog post I linked above did, because I’ve already got another Caddy serving as a webserver for my Nextcloud setup, but currently don’t have any Nginx in my Homelab.
I kept the Caddy setup pretty simple. Here’s the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: caddy-dockerhub-mirror
spec:
replicas: 1
selector:
matchLabels:
app: caddy
template:
metadata:
labels:
app: caddy
automountServiceAccountToken: false
spec:
containers:
- name: caddy
image: caddy:2.7.6
volumeMounts:
- name: config
mountPath: /etc/caddy/
readOnly: true
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- name: caddy-http
containerPort: 8080
protocol: TCP
volumes:
- name: config
configMap:
name: caddy-mirror-conf
Then there’s also a service required:
apiVersion: v1
kind: Service
metadata:
name: caddy-mirror
spec:
type: ClusterIP
selector:
app: caddy
ports:
- name: caddy-http
port: 8080
targetPort: caddy-http
protocol: TCP
And finally an IngressRoute for my Traefik ingress:
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
name: harbor-docker-mirror
annotations:
external-dns.alpha.kubernetes.io/hostname: "harbor-mirror.example.com"
external-dns.alpha.kubernetes.io/target: "ingress-k8s.example.com"
spec:
entryPoints:
- secureweb
routes:
- kind: Rule
match: Host(`harbor-mirror.example.com`)
services:
- kind: Service
name: caddy-mirror
namespace: harbor
port: caddy-http
scheme: http
The really interesting part is the Caddy config:
apiVersion: v1
kind: ConfigMap
metadata:
name: caddy-mirror-conf
data:
Caddyfile: |
{
admin off
auto_https off
log {
output stdout
level INFO
}
}
:8080 {
log {
output stdout
format filter {
wrap json
fields {
request>headers>Authorization delete
request>headers>Cookie delete
}
}
}
@v2-subpath {
path_regexp repo ^/v2/(.+)
}
map /service/token {query.scope} {new_scope} {
~(repository:)(.*) "${1}dockerhub-cache/${2}"
}
rewrite /service/token ?scope={new_scope}&service={query.service}
header >Www-Authenticate harbor.example.com harbor-mirror.example.com
rewrite @v2-subpath /v2/dockerhub-cache/{re.repo.1}
reverse_proxy http://harbor-core.namespace-of-harbor.svc.cluster.local {
header_up Host "harbor.example.com"
}
}
The first rewrite is for all requests which go to /v2/
. Because I don’t want
to append the dockerhub-cache/
to the URL for the initial Docker daemon request
for /v2/
, I went with the ^/v2/(.+)
regex for the matcher:
@v2-subpath {
path_regexp repo ^/v2/(.+)
}
rewrite @v2-subpath /v2/dockerhub-cache/{re.repo.1}
These two lines define a rewrite for all paths /v2/.+
to /v2/dockerhub-cache/...
,
so that any request going over this mirror automatically accesses the DockerHub
mirror project on my Harbor instance.
The next line just replaces the canonical Harbor domain with the specific mirror
domain in the www-authenticate
header, so that the subsequent request for the
token goes through the mirror as well, instead of directly going to Harbor:
header >Www-Authenticate harbor.example.com harbor-mirror.example.com
With this, the realm="https://harbor.example.com/service/token"
part of the
header is rewritten to realm="https://harbor-mirror.example.com/service/token"
.
Now, the request for the token also goes through the Caddy instance, and I can
rewrite the repository in the request’s scope
parameter:
map /service/token {query.scope} {new_scope} {
~(repository:)(.*) "${1}dockerhub-cache/${2}"
}
rewrite /service/token ?scope={new_scope}&service={query.service}
The map instruction matches only on requests to /service/token and maps the scope query parameter to a Caddy-internal variable new_scope, splitting the scope=repository:library/postgres:pull parameter and grafting the necessary dockerhub-cache/ prefix in front of the library/postgres repository. With this, the token request is made for the correct repository, and Harbor will accept requests for the image files accompanied by this token.
One note: I had also tried to rewrite the entire query part of the request in
one go, but I hit a weird issue. When operating on the whole query as one,
Caddy would urlencode more parts of the query, in particular the =
sign in the
scope
and service
parameters. And for some reason, Harbor did not like that.
It would only spit out a token when the =
signs were left as-is.
And with all of this combined, I could now set the registry-mirror
option for
my Docker agents to https://harbor-mirror.example.com
, and Docker pulls worked
as intended and used the dockerhub-cache mirror on my Harbor instance without
issue. 🎉
Configuring Docker and cri-o
Onto the last step: Configuring the Docker daemons in my Nomad cluster and the cri-o daemons in my Kubernetes cluster to use the new Harbor mirrors.
As noted above, Docker only supports mirrors for the DockerHub, nothing else.
So configuring those daemons is pretty simple, just adding this in the
/etc/docker/daemon.json
file:
{
"registry-mirrors": [
"https://harbor-mirror.example.com"
]
}
Luckily, registry-mirrors is one of the Docker config options which can be live-reloaded, so a pkill --signal SIGHUP dockerd is enough; no restart of the daemon or of the running containers is necessary.
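To verify that the reload took effect, the daemon’s current mirror list can be checked with something like:

docker info | grep -A 1 "Registry Mirrors"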
The cri-o config is a bit more involved, but it has the benefit of supporting mirrors for any external registry you like. Cri-o implements the containers-registries config files. These can also be reloaded with a pkill --signal SIGHUP crio, without restarting the daemon.
The mirror configs all have a similar format. As an example, the config for
registry.k8s.io
looks like this:
[[registry]]
prefix = "registry.k8s.io"
insecure = false
blocked = false
location = "registry.k8s.io"
[[registry.mirror]]
location = "harbor.example.com/k8s-cache"
I place that file into /etc/containers/registries.conf.d/k8s-mirror.conf
,
issue a SIGHUP, and cri-o will happily start pulling from the Harbor mirror
whenever an image from the official k8s registry is required. And like Docker,
it will pull from the original registry if the mirror is down.
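A quick way to check that the mirror is actually used is to pull an image by hand and then look for it in the corresponding cache project in Harbor, for example:

# Pull via cri-o; afterwards the image should show up in the k8s-cache project
sudo crictl pull registry.k8s.io/pause:3.9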
And with that, my container registry needs are fully migrated to Kubernetes with Harbor. Especially the request rewriting needed to get a DockerHub mirror for Docker daemons working against Harbor was interesting to figure out and very satisfying to see in action.