Upgrading Kubernetes: 10 Learnings from the Trenches

Brannon Dorsey
July 13, 2023
A technical deep-dive exploring some of the lessons our engineers have learned performing multiple K8s upgrades.

At Runway, we've been using Kubernetes to run our production workloads since day one. Over nearly five years, we've operated, (re)created, and upgraded our K8s clusters at least a dozen times. Our last upgrade was the hardest, as we had to modernize some of our oldest resources and third-party software deployed in the cluster, without causing application downtime. This post presents a few tips and strategies we learned during the process.

High-level Learnings

Feel free to jump around... (Hint: the Helm sections are really good).

  1. Rely heavily on pre-production clusters for testing.
  2. Create a runbook for the production upgrade.
  3. Some version upgrades are bigger than others.
  4. Kube-no-trouble can help you prepare for the bigger ones.
  5. Understanding the Kubernetes Version Skew Policy can get you out of a sticky situation.
  6. If it has a Helm chart, use it. You'll be glad you did.
  7. Upgrading a helm chart isn't as tricky as it seems, so long as you understand a few key things.
  8. Helm chart versions introduce an unfamiliar pattern. It's worth recognizing.
  9. Version control your resource manifests and helm chart settings.
  10. The fewer things you deploy, the less work it is to upgrade.

Going Deeper

...Or review all of our learnings in order.

Rely heavily on pre-production clusters for testing

We hope this one goes without saying, but you should perform your upgrades in a staging cluster first. If you don't have a safe pre-production cluster to prepare and test your upgrade in, you run a greater risk of breaking something in production.

The approach we've taken at Runway is to maintain one staging cluster for each production cluster (e.g. stage-aws1, prod-aws1, stage-aws2, etc...). The production and staging cluster environments are intended to be as identical as reasonably and economically possible. Our staging clusters serve two purposes:

  • They house our engineers' in-progress application code changes. This environment isolation protects live user data and jobs from being impacted by non-production code.
  • They allow us to test experimental changes to our infrastructure — like upgrading Kubernetes for instance.

There is no shame if you've found yourself in a scenario where you don't have a dummy/non-production cluster to try an upgrade on first, but we do strongly suggest you do something to change that. We've found that managing our clusters via a Terraform module simplifies the process of creating new clusters and keeping existing ones in sync over time.

Create a runbook for the production upgrade

This is perhaps the most important learning in this post. Having a detailed document with checklists that outline what needs to be done, how to do it, and what to do if things go wrong is critical to ensuring the rollout goes smoothly. Updating the checklist as you go can also help keep your team informed throughout the process.

Here's an example of what the checklist might look like.

Pre Upgrade

  • Review the Kubernetes version release and upgrade documents.
  • Review the version upgrade document from our managed Kubernetes service provider (e.g. EKS).
  • Use kubent to verify that no removed K8s APIs are used in the cluster.
  • Verify that the chart versions of all helm releases deployed in the cluster are known to be compatible with the target version.
  • Perform this entire checklist in the staging cluster (including the upgrade and post-upgrade sections).
  • Verify all nodes are using the same kubelet version as your K8s API server, pre-upgrade (e.g. kubectl get nodes | grep v1.21).
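The last check above can be scripted. As a quick sketch, assuming your current kubeconfig context points at the cluster (the `minor_of` helper is our own illustration, not a kubectl feature):

```shell
# Hypothetical helper (not a kubectl feature): extract the minor version
# from a version string like "v1.21.14-eks-abc123" -> "21".
minor_of() {
  echo "$1" | sed -E 's/^v?[0-9]+\.([0-9]+).*/\1/'
}

# If kubectl is available, list each node alongside its kubelet version so
# you can confirm everything matches the API server before you start.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.kubeletVersion}{"\n"}{end}'
fi
```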


Phase I - Control plane upgrade

  • Upgrade the control plane via the AWS console.
  • Upgrade all managed add-ons.
  • Upgrade CoreDNS (check its version compatibility chart).
  • Upgrade kube-proxy (check its version compatibility chart).
  • Verify the environment is working as expected.
  • Verify there are no unexpected, non-transient errors in production logs or exception tooling (e.g. Sentry).
  • Verify nothing is paging.

Phase II - Node upgrades

  • Upgrade node groups to use the new kubelet version.
  • Upgrade managed node groups to use the latest AMI matching the target version.
  • Roll all node groups.
  • Manually cordon and drain nodes whose workloads are preventing scale-down.
  • Verify no nodes are running the old kubelet version.
  • Verify again that the environment is working as expected.

Post Upgrade

  • Upgrade helm charts and manual deployments which were previously held back due to incompatibility with the previous Kubernetes version.

There are lots of opinions about the differences between a "runbook" and a "checklist". Our thinking is: who cares, so long as you produce something practical and useful. You want something simple enough that people will actually use and update it, but detailed enough that it isn't missing any crucial details.

Some version upgrades are bigger than others

We've found that most Kubernetes upgrades are fairly routine and unremarkable. You upgrade the control plane and nodes in staging, do some QA, and if all looks good, you do the same in production. Aside from a few transient errors during the control plane upgrade, the whole process can be pretty smooth.

That is, unless you're upgrading to a version which removes deprecated APIs that are still in use in your cluster. If that's the case, your struggle may be in the preparation rather than the upgrade.

This was the situation we found ourselves in when upgrading to v1.22. Many of the third-party resources we had deployed were as old as our cluster, with some receiving no upgrades in four years. In fact, our NGINX ingress controller, Prometheus Operator, and cert-manager deployments were all well pre-1.0.

The K8s Deprecated API Migration Guide outlines every breaking API change in a way that humans can actually understand. Before upgrading, you must stop using any APIs that have been removed in the version you're targeting, and the migration guide can be a great first predictor of "how hard is this going to be?" We'll cover how to locate and remove the use of these APIs in the next section, but the takeaway here is that if there isn't much listed in the guide for your target version, you can expect a simpler upgrade.

Kube-no-trouble can help you prepare for the upgrade

The Kube-no-trouble project is the easiest way to detect what deprecated APIs are currently in use in your cluster before they cause a problem after an upgrade.

# Install in one line using the script from the kubent README
sh -c "$(curl -sSL <install-script-url>)"

# Uses your current Kubernetes context to detect and communicate 
# with the cluster
6:25PM INF >>> Kube No Trouble `kubent` <<<
6:25PM INF Initializing collectors and retrieving data
6:25PM INF Retrieved 103 resources from collector name=Cluster
6:25PM INF Retrieved 0 resources from collector name="Helm v3"
6:25PM INF Loaded ruleset name=deprecated-1-16.rego
6:25PM INF Loaded ruleset name=deprecated-1-20.rego
>>> 1.16 Deprecated APIs <<<
KIND         NAMESPACE     NAME                    API_VERSION
Deployment   default       nginx-deployment-old    apps/v1beta1
Deployment   kube-system   event-exporter-v0.2.5   apps/v1beta1
Deployment   kube-system   k8s-snapshots           extensions/v1beta1
Deployment   kube-system   kube-dns                extensions/v1beta1
>>> 1.20 Deprecated APIs <<<
Ingress   default     test-ingress   extensions/v1beta1

Knowing is half the battle. Removing the use of the APIs is the other half.

Kubernetes supports multiple versions of an API at once. It is common for the "preferred" version of an API to be the latest stable (or at least non-deprecated) version, but your use of it via CI or through a deployed operator can often reference an older, deprecated version.

You can run kubectl api-resources to view the preferredVersion of all APIs currently supported by your cluster. If the versions of a resource in source control differ from the preferred version, you'll need to upgrade them in your code. The good news is that the K8s API has probably already done the conversion for you: if you view a deployed resource, you may find it reflected back to you using the latest version and schema.
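As a sketch of the cleanup, the `replacement_for` lookup below is our own partial summary of the migration guide (it covers two common removals), and the `manifests/` directory is a hypothetical location for your source-controlled YAML:

```shell
# Our own partial lookup of the K8s Deprecated API Migration Guide:
# map a removed apiVersion + Kind pair to its replacement apiVersion.
replacement_for() {
  case "$1/$2" in
    extensions/v1beta1/Ingress|networking.k8s.io/v1beta1/Ingress)
      echo "networking.k8s.io/v1" ;;          # removed in v1.22
    apps/v1beta1/Deployment|apps/v1beta2/Deployment|extensions/v1beta1/Deployment)
      echo "apps/v1" ;;                       # removed in v1.16
    *)
      echo "unknown (check the migration guide)" ;;
  esac
}

# Grep your source-controlled manifests for the removed versions kubent
# flagged (the manifests/ path is hypothetical).
if [ -d manifests ]; then
  grep -rn 'apiVersion: extensions/v1beta1' manifests/ || true
fi
```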


Kubernetes also maintains a tool called kubectl-convert which you can use to convert manifests between different API versions; see the Kubernetes documentation for details.
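Here's a minimal sketch of using it; the Deployment name and image are our own example, and the conversion step only runs if the kubectl-convert plugin is installed:

```shell
# Hypothetical example manifest (name and image are ours) using an API
# version that was removed in K8s v1.16.
OLD_MANIFEST="${TMPDIR:-/tmp}/old-deployment.yaml"
cat > "$OLD_MANIFEST" <<'EOF'
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: example-deployment
spec:
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: example
          image: nginx
EOF

# If the kubectl-convert plugin is installed, rewrite the manifest to the
# current API version. Conversion happens client-side; no cluster needed.
if command -v kubectl-convert >/dev/null 2>&1; then
  kubectl-convert -f "$OLD_MANIFEST" --output-version apps/v1
fi
```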

Understanding the Kubernetes Version Skew Policy can get you out of a sticky situation

Kubernetes actively maintains the latest three releases and guarantees one year of patch support for every release. If you use a managed Kubernetes service like EKS, your cloud provider may maintain a different set of supported releases and release windows.

Beyond knowing which versions of K8s are supported, understanding how much version skew each cluster component tolerates helps you distinguish what should be upgraded from what must be upgraded when planning your upgrade.

The policy states that:

  • Nothing can be higher than the Kubernetes API server (e.g. kube-apiserver). This is why we always upgrade the control plane first.
  • kubelet can be up to two minor versions behind kube-apiserver (this saved us in a pinch once when a regression on an AMI forced us to hold kubelet back).
  • kube-controller-manager, kube-scheduler, and cloud-controller-manager may be up to one minor version behind kube-apiserver.
  • kubectl must be within one minor version of kube-apiserver (but we've gotten away with larger drifts 🤞).
  • We've found kube-proxy to tolerate being one version ahead of or behind kubelet, although this is not recommended.
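The "behind-only" rules above can be encoded in a tiny helper; `skew_ok` is our own illustration, not an official tool:

```shell
# Our own helper encoding the rules above: a component may be at most
# $3 minor versions behind kube-apiserver, and never ahead of it.
skew_ok() { # usage: skew_ok <apiserver_minor> <component_minor> <max_behind>
  [ "$2" -le "$1" ] && [ $(( $1 - $2 )) -le "$3" ]
}

# Examples: kubelet may lag by two minors, kube-controller-manager by one.
skew_ok 24 22 2 && echo "kubelet v1.22 vs apiserver v1.24: OK"
skew_ok 24 22 1 || echo "kube-controller-manager v1.22 vs apiserver v1.24: too far behind"
```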

If it has a Helm chart, use it

Many third-party tools and infrastructure packages you can deploy into K8s will offer you two options for doing so:

  • A simple kubectl command which deploys some static YAML files (sometimes called a "default static install")
  • A Helm chart

Always pick the Helm chart. We used to opt for static manifest files because they were easier to reason about (no magic YAML generation) and we could give them a simple eye test to understand everything that would be deployed in our cluster. But we paid the price for this decision in the long run.

The reason being: Upgrades. If you use a Helm chart, your upgrade path will be much easier and well-defined. The day will undoubtedly come when you must upgrade a third-party component deployed in your cluster. When it does, if it was deployed with a Helm chart, other people will have already solved your problems for you (that's what chart maintainers do). If you deployed with static manifest files, you'll be the one having to find some meaningful mapping between an arbitrary collection of manifests thrown into your cluster and the ones associated with the release you want to upgrade to. Spoiler: This is terrible, and a correct mapping/solution may not even exist.

Upgrading a helm chart isn't as tricky as it seems, so long as you understand a few key things

There is no magic to Helm, even if it feels that way. It takes the YAML templates published by chart maintainers, renders them using the default values.yaml file included in the published chart plus any settings you override, and generates YAML manifests which get applied to your cluster.

Chart maintainers release their charts with sane default values, and you get to narrowly change them in select ways. Helm remembers which values you override, so you don't have to worry about that next time you upgrade (although, as we explain below, you should version control them anyway).

Here's a few commands we lean on heavily when deploying or upgrading a helm chart release.

# Our normal upgrade process ----------------------------------------------------


# Get your currently deployed helm chart version and release name 👀
helm list -n $NAMESPACE


# List all remote versions of the chart that are available for install.
helm search repo $REPO/$CHART -l


# Preview the differences between your deployed chart and one you're upgrading to
# Requires the helm `diff` plugin to be installed:
#   helm plugin install
helm diff upgrade $RELEASE $REPO/$CHART -n $NAMESPACE --version $DESIRED_CHART_VERSION --reuse-values

# Don't like something you see there? Review the source.
open $CHART

# Deploy the upgrade. I'd suggest actually doing this with something like helmfile instead
# so that version and settings are source controlled. More on that later.
helm upgrade $RELEASE $REPO/$CHART -n $NAMESPACE --version $DESIRED_CHART_VERSION --reuse-values

# Rollback to the previous version if something went wrong
helm rollback $RELEASE -n $NAMESPACE

# A few additional helpers -----------------------------------------------------

# View all deployed YAML manifests associated with a release
helm get manifest $RELEASE -n $NAMESPACE

# View all YAML manifests a chart will apply **before** running it
helm upgrade --install $RELEASE $REPO/$CHART -n $NAMESPACE --dry-run

# See what configurable values a release is deployed with
helm get values $RELEASE -n $NAMESPACE # Only the values you overwrote
helm get values $RELEASE -n $NAMESPACE --all # All of the values (defaults + yours)

Now imagine trying to do all of that with some static YAML files tossed into your cluster at some point in the past. 😅

Another bonus is that Helm automatically removes manifests which are no longer present in subsequent release upgrades or new chart versions. In other words, it cleans up after itself.

# No need to remember to remove orphaned resources with...
kubectl delete serviceaccount permissive-serviceaccount-used-by-deleted-deployment --namespace you-totally-remember-to-do-this-every-time-you-delete-something-right

Helm chart versions introduce an unfamiliar pattern. It's worth recognizing.

When upgrading a helm chart, there are two versions you need to be aware of: the Helm chart version and one or more application versions. This is a concept you don't see every day in the software engineering world.

We like to think of the chart version as a higher-level meta-version: it applies a semantic versioning scheme to the infrastructure manifest files needed to deploy a well-behaved version of an application (which has its own version).

A helm chart version upgrade may, but does not have to, change the application code version. Similarly, an application code version bump may require no infrastructure changes and result in no new helm chart version (although the chart maintainers may consider it best practice to do a minor helm chart version bump which changes only the application code version in the values.yaml file).

+----------------+    +----------------+    +----------------+    +----------------+
| Helm Chart v1  |----| Helm Chart v2  |----| Helm Chart v2  |----| Helm Chart v3  |
+--------+-------+    +--------+-------+    +--------+-------+    +--------+-------+
         |                     |                     |                     |
         |                     |                     |                     |
         v                     v                     v                     v
+----------------+    +----------------+    +----------------+    +----------------+
| App Code v1.0  |----| App Code v1.0  |----| App Code v1.1  |----| App Code v1.1  |
+----------------+    +----------------+    +----------------+    +----------------+

When trying to decide which helm chart version to upgrade to, we ask ourselves: "Can we tolerate the application code version used by the latest helm chart version?"

  • If yes, upgrade to the latest helm chart version.
  • If not, upgrade to the highest helm chart version which ships with (or is tolerant to) our required application code version as its default application version.

# You can use a command like this to list all available chart
# versions and their corresponding application code versions
$ helm search repo ingress-nginx/ingress-nginx -l

NAME                       	CHART VERSION	APP VERSION	DESCRIPTION
ingress-nginx/ingress-nginx	4.6.0        	1.7.0      	...
ingress-nginx/ingress-nginx	4.5.2        	1.6.4      	...
ingress-nginx/ingress-nginx	4.5.0        	1.6.3      	...
ingress-nginx/ingress-nginx	4.4.2        	1.5.1      	...
ingress-nginx/ingress-nginx	4.4.1        	1.5.2      	...
ingress-nginx/ingress-nginx	4.4.0        	1.5.1      	...
ingress-nginx/ingress-nginx	4.3.0        	1.4.0      	...
ingress-nginx/ingress-nginx	4.2.5        	1.3.1      	...
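That decision rule can be scripted against the `helm search repo -l` output. The `chart_for_app` parser below is our own sketch: it assumes the newest-first ordering helm prints and reads the chart and app version columns:

```shell
# Our own sketch: given `helm search repo $REPO/$CHART -l` output on stdin
# (NAME, CHART VERSION, APP VERSION columns, sorted newest-first), print
# the first chart version whose app version starts with the version prefix
# we can tolerate.
chart_for_app() {
  want="$1"
  awk -v v="$want" 'index($3, v) == 1 { print $2; exit }'
}
```

For example, piping the output above through `chart_for_app 1.6` would select chart version 4.5.2, the highest chart shipping an app version we can tolerate.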

Version control your resource manifests and helm chart settings

Avoid scenarios where the only representation of what's deployed in your cluster are the deployed manifests and helm releases themselves. We highly recommend that you check all of the manifests you deploy into your cluster into a Git repo. Tools like kustomize (now built into kubectl) and helmfile can help here. Better yet, you can automatically deploy manifests and helm charts from a git repository using tools like Argo CD and Flux, ensuring your Git repo is the source of truth for what's deployed inside K8s.
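As a minimal sketch of what this looks like with helmfile: the release, namespace, and values-file path below are hypothetical examples (the chart repository URL is ingress-nginx's real one), and the `helmfile diff`/`apply` step runs only if helmfile is installed:

```shell
# A minimal, hypothetical helmfile.yaml pinning the chart version and
# values in source control (release name and values path are examples).
HF="${TMPDIR:-/tmp}/helmfile.yaml"
cat > "$HF" <<'EOF'
repositories:
  - name: ingress-nginx
    url: https://kubernetes.github.io/ingress-nginx
releases:
  - name: ingress-nginx
    namespace: ingress
    chart: ingress-nginx/ingress-nginx
    version: 4.6.0
    values:
      - ./values/ingress-nginx.yaml
EOF

# Preview, then apply, against the cluster if helmfile is installed.
if command -v helmfile >/dev/null 2>&1; then
  helmfile -f "$HF" diff
  # helmfile -f "$HF" apply
fi
```

Checking a file like this into Git means the chart version and your overridden values are peer-reviewed and history-tracked, rather than living only inside the cluster's helm release metadata.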

Good Infrastructure-as-Code (IaC) and GitOps practices can help you learn what needs to be upgraded, enable peer review for your suggested changes, and maintain a good history of past changes that can inform future ones.

The fewer things you deploy, the less work it is to upgrade

This sounds like a joke, but it's not. Each new component you deploy in your cluster is baggage that must be brought along for the long haul. Exercising mindfulness when deciding to add components and regularly removing outdated or frivolous deployments can leave you with fewer blockers the next time you prepare to upgrade Kubernetes.

Don't forget to apply

If you found this post interesting, we invite you to explore career opportunities at Runway. We're always on the lookout for talented individuals who share our passion for cloud infrastructure. We run big, complex things in far-off places.

May your Kubernetes upgrades be smooth and your downtime nonexistent. Until next time, happy upgrading!
