K3s - Upgrade
Prerequisites
Access to all nodes of the cluster through one of the following methods:
- Rancher
- SSH protocol
- AWS Session Manager
The K3s version tag you wish to upgrade to: https://github.com/k3s-io/k3s/releases
The system-upgrade-controller file that will be used to upgrade the K3s cluster: https://assets.master.k3s.getvisibility.com/system-upgrade-controller/v0.10.0/system-upgrade-controller.yaml
The Bundle file for the K3s upgrade in the Air-Gap Environment
Make sure you push all the new Docker images needed to install the new K3s version to the ECR gv-public Docker registry.
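Because the version tag is later copied verbatim into the upgrade plan, it can help to sanity-check its shape first. The sketch below is a minimal example, assuming stable release tags follow the pattern vMAJOR.MINOR.PATCH+k3sN (as in v1.24.9+k3s2). The function name is_k3s_release_tag is hypothetical, and pre-release tags such as release candidates are deliberately rejected.

```shell
# Hypothetical helper: succeed only for tags shaped like v1.24.9+k3s2.
# Assumption: stable K3s tags are vMAJOR.MINOR.PATCH+k3sN; anything else fails.
is_k3s_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+\+k3s[0-9]+$'
}

# Example usage:
#   is_k3s_release_tag "v1.24.9+k3s2" && echo "tag looks valid"
```

The check only validates the format; it does not confirm that the tag exists on the releases page.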
Focus/Synergy services
Updates and custom settings are automatically applied to all backend services using Fleet as long as the cluster has access to the public internet and can connect to the management server.
In case there’s no internet connection or the management server is down, the cluster agent will keep trying to reach the management server until a connection can be established.
Upgrading K3s to 1.24
Log in to Rancher or one of the master nodes of the cluster to use the kubectl CLI.
List the node name and the K3s version:
kubectl get nodes
Add the label k3s-upgrade=true to the nodes.
Note: In a multi-node cluster, every node receives this label.
kubectl label node --all k3s-upgrade=true
Deploy the system-upgrade-controller:
kubectl apply -f https://assets.master.k3s.getvisibility.com/system-upgrade-controller/v0.10.0/system-upgrade-controller.yaml
Create the upgrade-plan.yaml file.
Note: the version key holds the K3s version that the cluster will be upgraded to.
cat > upgrade-plan.yaml << EOF
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-latest
  namespace: system-upgrade
spec:
  concurrency: 1
  version: v1.24.9+k3s2
  nodeSelector:
    matchExpressions:
      - {key: k3s-upgrade, operator: Exists}
  serviceAccountName: system-upgrade
  upgrade:
    image: docker.io/rancher/k3s-upgrade
EOF
Apply the upgrade plan. The upgrade controller watches for this plan and runs the upgrade on the labeled nodes:
kubectl apply -f upgrade-plan.yaml
Once the plan is executed, all pods will restart and will take a few minutes to recover. Check the status of all the pods:
watch kubectl get pods -A
Check if the K3s version has been upgraded:
kubectl get nodes
Delete the system-upgrade-controller:
kubectl delete -f https://assets.master.k3s.getvisibility.com/system-upgrade-controller/v0.10.0/system-upgrade-controller.yaml
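The version check in the steps above can also be scripted rather than read by eye. The following is a rough sketch, assuming the default column layout of kubectl get nodes (NAME, STATUS, ROLES, AGE, VERSION). all_nodes_at_version is a hypothetical helper name, and it reads the node listing from standard input so it can be tried without a cluster.

```shell
# Hypothetical helper: succeed only if every node line on stdin reports the
# target version in its fifth column. Assumes default `kubectl get nodes` columns.
all_nodes_at_version() {
  awk -v target="$1" '
    NF >= 5 {
      seen = 1
      if ($5 != target) { print $1 " is at " $5; bad = 1 }
    }
    # Fail when a node is behind or when no node lines were read at all.
    END { if (bad || !seen) exit 1 }
  '
}

# Example usage (requires cluster access):
#   kubectl get nodes --no-headers | all_nodes_at_version "v1.24.9+k3s2"
```

Nodes that have not reached the target version are printed one per line, which makes it easy to see where the plan is still in progress.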
Demo Video
Here is the demo video that showcases the steps that need to be performed to upgrade K3s:
video
Upgrading K3s - AirGap (Manual Approach)
Open a shell session on each of the cluster nodes (VMs).
Download the bundle file to all the VMs and extract it:
tar -xf gv-platform-$VERSION.tar
Perform the following steps on each of the VMs to upgrade K3s:
$ mkdir -p /var/lib/rancher/k3s/agent/images/
$ gunzip -c assets/k3s-airgap-images-amd64.tar.gz > /var/lib/rancher/k3s/agent/images/airgap-images.tar
$ cp assets/k3s /usr/local/bin && chmod +x /usr/local/bin/k3s
Restart the K3s service on each of the nodes.
Master nodes:
$ systemctl restart k3s.service
Worker nodes:
$ systemctl restart k3s-agent.service
Wait for a few minutes for the pods to recover.
watch kubectl get pods -A
Check the K3s version across the nodes:
kubectl get nodes
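The per-node steps above can be collected into one function so the same commands run identically on every VM. This is a minimal sketch, assuming it runs as root from the directory where the bundle was extracted (the one containing assets/). k3s_unit_for_role and upgrade_node are hypothetical names, and the role argument (master or worker) is supplied by the operator.

```shell
# Hypothetical helper: map a node role to the systemd unit used in the steps above.
k3s_unit_for_role() {
  case "$1" in
    master) echo "k3s.service" ;;
    worker) echo "k3s-agent.service" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: stage the air-gap images and the new binary, then
# restart the matching service. Stops at the first command that fails.
upgrade_node() {
  unit=$(k3s_unit_for_role "$1") || return 1
  mkdir -p /var/lib/rancher/k3s/agent/images/ &&
  gunzip -c assets/k3s-airgap-images-amd64.tar.gz > /var/lib/rancher/k3s/agent/images/airgap-images.tar &&
  cp assets/k3s /usr/local/bin && chmod +x /usr/local/bin/k3s &&
  systemctl restart "$unit"
}

# Example usage on a master node:
#   upgrade_node master
```

Chaining the commands with && means the service is only restarted once the images and binary are in place.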
Demo Video
Here is the demo video that showcases the steps that need to be performed to upgrade K3s in the Air Gap environment:
video
Upgrading K3s to 1.26
For the Platform Team: Local Cluster K3s Upgrade
If you are upgrading K3s on the local cluster, you need to remove the existing PodSecurityPolicy (PSP) resources first, because the PodSecurityPolicy API was removed in Kubernetes 1.25.
The only one we have is under the aws-node-termination-handler chart.
Patch the HelmChart to disable the PSP resource:
kubectl patch helmchart aws-node-termination-handler -n kube-system --type='json' -p='[{"op": "add", "path": "/spec/set/rbac.pspEnabled", "value": "false"}]'
This triggers the removal of the PSP resource.
Traefik is deployed as a DaemonSet in the local clusters, so restart the DaemonSet rather than the Deployment when following the steps in Post Upgrade Patch.
Deploy the system-upgrade-controller:
kubectl apply -f https://assets.master.k3s.getvisibility.com/system-upgrade-controller/v0.13.1/system-upgrade-controller.yaml
Create the upgrade plan.
Note: the version key holds the K3s version that the cluster will be upgraded to.
cat > upgrade-plan-server.yaml << EOF
---
# Server plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: server-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  nodeSelector:
    matchExpressions:
    - key: node-role.kubernetes.io/control-plane
      operator: In
      values:
      - "true"
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.26.10+k3s1
EOF
If the cluster also has worker nodes, create the agent plan as well:
cat > upgrade-plan-agent.yaml << EOF
---
# Agent plan
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: agent-plan
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  nodeSelector:
    matchExpressions:
    - key: node-role.kubernetes.io/control-plane
      operator: DoesNotExist
  prepare:
    args:
    - prepare
    - server-plan
    image: rancher/k3s-upgrade
  serviceAccountName: system-upgrade
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.26.10+k3s1
EOF
Run the upgrade plan:
kubectl apply -f upgrade-plan-server.yaml
If the cluster has worker nodes, apply the agent plan too:
kubectl apply -f upgrade-plan-agent.yaml
Once the plan is executed, all pods will restart and take a few minutes to recover.
Check the status of all the pods:
watch kubectl get pods -A
Check if the K3s version has been upgraded:
kubectl get nodes
Delete the system-upgrade-controller:
kubectl delete -f https://assets.master.k3s.getvisibility.com/system-upgrade-controller/v0.13.1/system-upgrade-controller.yaml
Reference: Apply upgrade: https://docs.k3s.io/upgrades/automated#install-the-system-upgrade-controller
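While waiting for the pods to recover, the output of kubectl get pods -A can be filtered down to the pods that still need attention. The sketch below assumes the default column layout with -A (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE). unhealthy_pods is a hypothetical helper that reads the listing from standard input.

```shell
# Hypothetical helper: print pods that are neither Completed nor fully Running.
# Assumes default `kubectl get pods -A --no-headers` columns on stdin.
unhealthy_pods() {
  awk '
    NF < 4 { next }                     # skip blank or malformed lines
    $4 == "Completed" { next }          # finished jobs are expected
    {
      split($3, ready, "/")             # READY column, for example 1/1
      if ($4 != "Running" || ready[1] != ready[2])
        print $1 "/" $2 " " $3 " " $4
    }
  '
}

# Example usage (requires cluster access):
#   kubectl get pods -A --no-headers | unhealthy_pods
```

An empty result suggests the workloads have settled; any line printed names a pod that is still starting or failing.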
Post Upgrade Patch
We have seen an issue where Traefik cannot access any resources after the upgrade. Follow these steps to apply the fix.
Run this patch to add traefik.io to the apiGroups of the ClusterRole traefik-kube-system:
kubectl patch clusterrole traefik-kube-system -n kube-system --type='json' -p='[{"op": "add", "path": "/rules/-1/apiGroups/-", "value": "traefik.io"}]'
Add the missing CRDs:
kubectl apply -f https://assets.master.k3s.getvisibility.com/k3s/v1.26.10+k3s1/traefik-patch.yaml
Restart the Traefik deployment:
kubectl rollout restart deployment traefik -n kube-system
Reference: https://github.com/k3s-io/k3s/issues/8755#issuecomment-1789526830
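Because the JSON patch above appends to the apiGroups list, running it a second time would append another traefik.io entry. One way to guard against that, sketched here under the assumption that the ClusterRole is rendered as YAML with one list item per line, is to check for the entry first. clusterrole_has_api_group is a hypothetical helper that reads the YAML from standard input.

```shell
# Hypothetical helper: succeed if the YAML on stdin contains a list item that
# is exactly the given API group, for example "  - traefik.io".
# Limitation: it matches the item anywhere in the document, not only under apiGroups.
clusterrole_has_api_group() {
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -Fxq -- "- $1"
}

# Example usage (requires cluster access):
#   if kubectl get clusterrole traefik-kube-system -o yaml | clusterrole_has_api_group traefik.io; then
#     echo "traefik.io already present, skipping patch"
#   fi
```

The match is on the whole trimmed line, so an existing traefik.containo.us entry does not count as traefik.io.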
Upgrading K3s - AirGap (Manual Approach)
Follow the steps in the earlier section Upgrading K3s - AirGap (Manual Approach) to upgrade K3s.
Post Upgrade Patch
Run this patch to add traefik.io to the apiGroups of the ClusterRole traefik-kube-system:
kubectl patch clusterrole traefik-kube-system -n kube-system --type='json' -p='[{"op": "add", "path": "/rules/-1/apiGroups/-", "value": "traefik.io"}]'
Add the missing CRDs:
kubectl apply -f assets/traefik-patch.yaml
Restart the Traefik deployment:
kubectl rollout restart deployment traefik -n kube-system
Reference: https://github.com/k3s-io/k3s/issues/8755#issuecomment-1789526830
Certificates
By default, certificates in K3s expire in 12 months. If the certificates are expired or have fewer than 90 days remaining before they expire, the certificates are rotated when K3s is restarted.
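To see whether a restart would trigger rotation, the 90-day window can be checked against a certificate file with openssl. This is only a sketch: the example path under /var/lib/rancher/k3s/server/tls/ is an assumption about where a server node keeps its certificates, and the helper names are hypothetical.

```shell
# Rotation window taken from the paragraph above.
ROTATION_WINDOW_DAYS=90

# Convert a number of days to seconds for openssl's -checkend option.
days_to_seconds() {
  echo $(( $1 * 24 * 60 * 60 ))
}

# Hypothetical helper: return 0 if the certificate expires within the window
# (or has already expired), 1 if it is valid beyond it, 2 if it cannot be read.
cert_in_rotation_window() {
  [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
  if openssl x509 -noout -in "$1" \
       -checkend "$(days_to_seconds "$ROTATION_WINDOW_DAYS")" >/dev/null; then
    return 1
  fi
  return 0
}

# Example usage (path is an assumption, adjust to your node):
#   cert_in_rotation_window /var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt \
#     && echo "certificate would be rotated on the next K3s restart"
```

openssl's -checkend exits successfully when the certificate is still valid after the given number of seconds, which is why the helper inverts the result.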