# Upgrading Kubernetes Clusters

## Prerequisites

All documentation in these guides assumes you have already downloaded both the Azure CLI and aks-engine. Follow the quickstart guide before continuing.

This guide also assumes you have already deployed a cluster using aks-engine. For more details on how to do that, see the deploy guide.

## Upgrade

This document provides guidance on how to upgrade the Kubernetes version for an existing AKS Engine cluster and recommendations for adopting aks-engine upgrade as a tool.

### Know before you go

In order to ensure that your aks-engine upgrade operation runs smoothly, there are a few things you should be aware of before getting started.

  1. You will need access to the API Model (apimodel.json) that was generated by aks-engine deploy or aks-engine generate (by default this file is placed into a relative directory that looks like _output/<clustername>/).

  2. aks-engine upgrade expects an API model that conforms to the current state of the cluster. In other words, the Azure resources inside the resource group deployed by aks-engine should be in the same state as when they were originally created by aks-engine. If you perform manual operations on your Azure IaaS resources (other than successful aks-engine scale, aks-engine update, or aks-engine upgrade operations) DO NOT use aks-engine upgrade, as the aks-engine-generated ARM template won't be reconcilable against the state of the Azure resources that reside in the resource group. Some examples of manual operations that will prevent upgrade from working successfully:

     - renaming resources
     - executing follow-up CustomScriptExtensions against VMs after a cluster has been created: a VM or VMSS instance may only have a single CustomScriptExtension attached to it. Follow-up CustomScriptExtension operations essentially "replace" the one defined by aks-engine at cluster creation time, after which aks-engine upgrade will not be able to recognize the VM resource.

aks-engine upgrade relies on some resources (such as VMs) being named in accordance with the original aks-engine deployment. In summary, the set of Azure resources in the resource group is reconcilable by aks-engine upgrade only if it has been exclusively created and managed by a series of successive ARM template deployments originating from AKS Engine commands that ran to completion successfully.

  3. aks-engine upgrade allows upgrading the Kubernetes version to any AKS Engine-supported patch release in the current minor release channel that is greater than the current version on the cluster (e.g., from 1.21.4 to 1.21.5), or to the next AKS Engine-supported minor version (e.g., from 1.21.5 to 1.22.2). (Or, see aks-engine upgrade --force if you want to bypass AKS Engine "supported version requirements".) In practice, the next AKS Engine-supported minor version will commonly be a single minor version ahead of the current cluster version. However, if the cluster has not been upgraded in a significant amount of time, the "next" minor version may no longer be supported by aks-engine. In such a case, your long-lived cluster will be upgradable to the nearest minor version that aks-engine supports at the time of upgrade (e.g., from 1.17.18 to 1.19.15).

     To get the list of all available Kubernetes versions and upgrades, run the get-versions command:

     ```
     aks-engine get-versions
     ```

     To get the versions of Kubernetes that your particular cluster version is upgradable to, provide its current Kubernetes version via the --version arg:

     ```
     aks-engine get-versions --version 1.19.14
     ```
  4. aks-engine upgrade relies upon a working connection to the cluster control plane during upgrade, both (1) to validate successful upgrade progress, and (2) to cordon and drain nodes before upgrading them, in order to minimize operational downtime of any running cluster workloads. If you are upgrading a private cluster, you must run aks-engine upgrade from a host VM that has network access to the control plane, for example a jumpbox VM that resides in the same VNET as the master VMs. For more information, refer to the private cluster documentation.

  5. If using aks-engine upgrade in production, it is recommended to stage an upgrade test on a cluster that was built to the same specifications as your production cluster (built with the same cluster configuration and the same version of the aks-engine command line tool) before performing the upgrade, especially if the cluster configuration is "interesting", i.e., differs significantly from defaults. AKS Engine supports many different cluster configurations, and the extent of E2E testing that the AKS Engine team runs cannot practically cover every possible combination. Therefore, it is recommended that you verify in a staging environment that your specific cluster configuration is upgradable using aks-engine upgrade before attempting this potentially destructive operation on your production cluster.

  6. aks-engine upgrade is backwards compatible. If you deployed with aks-engine version 0.27.x, you can run upgrade with version 0.29.y. In fact, it is recommended that you use the latest available aks-engine version when running an upgrade operation. This will ensure that you get the latest available software and bug fixes in your upgraded cluster.

  7. aks-engine upgrade will automatically re-generate your cluster configuration to best pair with the desired new version of Kubernetes, and/or the version of aks-engine that is used to execute aks-engine upgrade. To use an example of both:

     - When you upgrade to (for example) Kubernetes 1.21 from 1.20, AKS Engine will automatically change your cluster component configurations (e.g., coredns, metrics-server, kube-proxy) so that they have a close, known-working affinity with 1.21.
     - When you perform an upgrade with a newer version of aks-engine, even a Kubernetes patch release upgrade such as 1.21.4 to 1.21.5, a newer version of etcd (for example) may have been validated and configured as the default since the version of aks-engine used to build the cluster was released. So, without any explicit user direction, the newly upgraded cluster will now be running etcd v3.2.26 instead of v3.2.25. This is by design.

In summary, using aks-engine upgrade means you will freshen and re-pave the entire stack that underlies Kubernetes to reflect the best-known, recent implementation of Azure IaaS + OS + OS config + Kubernetes config.
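The upgrade path rules above can be sketched as a small version eligibility check. This is a hypothetical illustration only; the function names and the supported-version list are not aks-engine internals:

```python
# Sketch of the documented upgrade path rules: newer patches in the
# current minor channel, plus the nearest supported higher minor.
# Illustrative only; this is not aks-engine's actual implementation.

def parse(version):
    """Split a "major.minor.patch" string into a tuple of ints."""
    return tuple(int(p) for p in version.split("."))

def eligible_upgrades(current, supported):
    """Return the supported versions reachable from `current` without --force."""
    cur = parse(current)
    # Patch releases greater than the current version, same minor channel
    same_minor = [v for v in supported
                  if parse(v)[:2] == cur[:2] and parse(v) > cur]
    # The nearest supported minor channel above the current one; for a
    # long-lived cluster this may be more than one minor version ahead
    higher = sorted({parse(v)[:2] for v in supported if parse(v)[:2] > cur[:2]})
    next_minor = [v for v in supported if higher and parse(v)[:2] == higher[0]]
    return sorted(same_minor + next_minor, key=parse)

# A 1.17 cluster whose "next" minor (1.18) is no longer supported
# jumps to the nearest supported minor channel, 1.19:
print(eligible_upgrades("1.17.18", ["1.19.14", "1.19.15", "1.21.4", "1.21.5"]))
```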

## Parameters

| Parameter | Required | Description |
| --- | --- | --- |
| --api-model | yes | Relative path to the API model (cluster definition) that declares the desired cluster configuration. |
| --kubeconfig | no | Path to kubeconfig; if not provided, it will be generated on the fly from the API model data. |
| --upgrade-version | yes | Version of Kubernetes to upgrade to. |
| --force | no | Force upgrading the cluster to the desired version, regardless of version support. Allows same-version upgrades and downgrades. |
| --control-plane-only | no | Upgrade control plane VMs only; do not upgrade node pools (unsupported on air-gapped clouds). |
| --cordon-drain-timeout | no | How long to wait for each VM to be cordoned, in minutes (default -1, i.e., no timeout). |
| --vm-timeout | no | How long to wait for each VM to be upgraded, in minutes (default -1, i.e., no timeout). |
| --upgrade-windows-vhd | no | Upgrade the image reference of all Windows nodes to a new AKS Engine-validated image, if available (default is true). |
| --azure-env | no | The target Azure cloud to deploy to (default "AzurePublicCloud"). |
| --subscription-id | yes | The subscription id the cluster is deployed in. |
| --resource-group | yes | The resource group the cluster is deployed in. |
| --location | yes | The location the cluster is deployed in. |
| --client-id | depends | The Service Principal Client ID. Required if auth-method is set to client_secret or client_certificate. |
| --client-secret | depends | The Service Principal Client secret. Required if auth-method is set to client_secret. |
| --certificate-path | depends | The path to the file which contains the client certificate. Required if auth-method is set to client_certificate. |
| --identity-system | no | Identity system (default is azure_ad). |
| --auth-method | no | The authentication method used. Default value is client_secret. Other supported values are: cli, client_certificate, and device. |
| --private-key-path | no | Path to private key (used with --auth-method=client_certificate). |
| --language | no | Language to return error messages in. Default value is "en-us". |

## Under the hood

During the upgrade, aks-engine successively visits virtual machines that constitute the cluster (first the master nodes, then the agent nodes) and performs the following operations:

Control plane nodes:

- cordon the node and drain existing workloads
- delete the VM
- create a new VM and install the desired Kubernetes version
- add the new VM to the cluster (custom annotations, labels, taints, etc. are retained automatically)

Agent nodes:

- create a new VM and install the desired Kubernetes version
- add the new VM to the cluster
- evict any pods that might have been scheduled onto the new node by Kubernetes before its custom node properties are copied over
- copy the custom annotations, labels, and taints of the old node to the new node
- cordon the old node and drain existing workloads
- delete the VM

## Simple steps to run upgrade

Once you have read all the requirements, run aks-engine upgrade with the appropriate arguments:

```
./bin/aks-engine upgrade \
  --subscription-id <subscription id> \
  --api-model <generated apimodel.json> \
  --location <resource group location> \
  --resource-group <resource group name> \
  --upgrade-version <desired Kubernetes version>
```

For example,

```
./bin/aks-engine upgrade \
  --subscription-id xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
  --api-model _output/mycluster/apimodel.json \
  --location westus \
  --resource-group test-upgrade \
  --upgrade-version 1.21.5
```

## Steps to run when using Key Vault for secrets

If you use Key Vault for secrets, you must specify a local kubeconfig file to connect to the cluster because aks-engine is currently unable to read secrets from a Key Vault during an upgrade.

```
./bin/aks-engine upgrade \
  --api-model _output/mycluster/apimodel.json \
  --location westus \
  --resource-group test-upgrade \
  --upgrade-version 1.21.5 \
  --kubeconfig ./path/to/kubeconfig.json
```

## Known Limitations

### Manual reconciliation

The upgrade operation is a long-running, successive set of ARM deployments, and for large clusters it is more susceptible to one of those deployments failing. This follows from the design of upgrade, which enumerates through each node in the cluster one at a time. A transient Azure resource allocation error could thus interrupt the successful progression of the overall transaction. At present, the upgrade operation is implemented to "fail fast"; if a well-formed upgrade operation fails before completing, it can be manually retried by invoking the exact same command line arguments as were sent originally. The upgrade operation will enumerate through the cluster nodes, skipping any nodes that have already been upgraded to the desired Kubernetes version. Those nodes that match the original Kubernetes version will then, one at a time, be cordoned, drained, and upgraded to the desired version. Put another way, an upgrade command is designed to be idempotent across retry scenarios.
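The retry behavior can be sketched as a minimal illustration of these semantics (the function and node representation are hypothetical, not aks-engine code):

```python
# "Fail fast" + idempotent retry: on each run, nodes already at the
# target version are skipped, so re-running the same upgrade command
# resumes from the first node that still needs work.
# Illustrative only; not aks-engine's actual implementation.

def upgrade_remaining(nodes, target, upgrade_one):
    """Upgrade every node not yet at `target`; stop at the first failure."""
    for node in nodes:
        if node["version"] == target:
            continue  # already upgraded by a previous attempt
        if not upgrade_one(node):
            return False  # fail fast; a retry resumes right here
        node["version"] = target
    return True
```

Because each completed node ends up at the new Kubernetes version, invoking the same command line again after a transient failure finishes the remaining nodes without re-upgrading the ones that succeeded.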

### Cluster-autoscaler + Availability Set

At this time, we don't recommend using aks-engine upgrade on clusters running the cluster-autoscaler addon that have Availability Set (non-VMSS) node pools.

### Forcing an upgrade

The upgrade operation takes an optional --force argument:

```
-f, --force
force upgrading the cluster to desired version. Allows same version upgrades and downgrades.
```

In some situations, you might want to bypass AKS Engine's validation of your API model and cluster node versions. This is at your own risk, and you should assess the potential harm of using this flag.

The --force parameter instructs the upgrade process to:

- bypass the usual version validation
- include all of your cluster's nodes (masters and agents) in the upgrade process; nodes already on the target version will not be skipped
- allow any Kubernetes version, including non-supported or deprecated versions
- accept downgrade operations

Note: If you pass in a version that AKS Engine literally cannot install (e.g., a version of Kubernetes that does not exist), you may break your cluster.

For each node, the cluster will follow the same process described in the "Under the hood" section above.

## Frequently Asked Questions

### Can I use aks-engine upgrade to upgrade all possible cluster configurations in an existing cluster?

No! aks-engine upgrade was designed to exclusively update the Kubernetes version running on a cluster, without affecting any other cluster config (especially IaaS resources). Because under the hood aks-engine upgrade is actually removing and adding new VMs, various configuration changes may be delivered to the new VMs (such as the VM size), but these changes should be considered experimental and thoroughly tested in a staging environment before being integrated into a production workflow. Specifically, changes to the VNET, Load Balancer, and other network-related configuration are not supported as modifiable by aks-engine upgrade. If you need to change the Load Balancer config, for example, you will need to build a new cluster.

### When should I use aks-engine upgrade --control-plane-only?

We actually recommend that you only use aks-engine upgrade --control-plane-only. There are a few reasons:

- The aks-engine upgrade workflow is implemented in a way that assumes the underlying nodes are pets, not cattle. Each node is carefully accounted for during the operation, and every effort is made to "put the cluster back together" as if the nodes simply went away for a few minutes, but then came back. (This is in fact not what happens under the hood: the original VMs are destroyed and replaced with entirely new VMs; only the data disks are actually preserved.) Such an approach is appropriate for control plane VMs, because they are defined by AKS Engine as more or less static resources. However, individual worker nodes are not statically defined; the nodes participating in a cluster are designed to be ephemeral in response to changing operational realities.
- aks-engine upgrade does its best to minimize operational cluster downtime, but there will be some amount of interruption because VMs are deleted and then added behind a distributed control plane (we're assuming you're running 3 or 5 control plane VMs). Since a small amount of disruption is unavoidable given the architectural constraints of aks-engine upgrade, it is better to absorb that disruption in the control plane, which is probably not user-impacting (unless your users are Kubernetes cluster administrators!). You may be able to afford a small maintenance window to update your control plane while your existing production workloads continue to serve traffic reliably. Of course, production traffic is not static, and any temporary control plane unavailability will disrupt the dynamic attributes of your cluster that ultimately serve user traffic, so we recommend upgrading the control plane at a time when it is acceptable for your cluster to be put into a "static" mode.
- A Kubernetes cluster is likely to run a variety of production workloads, each with its own requirements for downtime maintenance. Running a cluster-wide operation like aks-engine upgrade forces you to schedule a maintenance window for your control plane and all production environments simultaneously.
- More flexible node pool-specific tooling is available to upgrade the various parts of your production-serving nodes. See the addpool, update, and scale documentation to help you develop cluster workflows for managing node pools distinct from the control plane.
- aks-engine upgrade --control-plane-only is not expected to work as intended on air-gapped clouds. Unless you are forcing an upgrade to the current orchestrator version, aks-engine upgrade --control-plane-only on air-gapped clouds is expected to break the kube-proxy daemonset, as agents will be required to pull the newer kube-proxy container image. A cluster can be recovered from this bad state by running a full cluster upgrade (control plane and agents).

### What should I upgrade first, my control plane nodes, or my worker nodes?

tl;dr Upgrade your control plane first!

If, following our guidance, you employ aks-engine upgrade --control-plane-only to upgrade your control plane distinctly from your worker nodes, and a combination of aks-engine addpool and aks-engine update to upgrade worker nodes, the natural question is: which should I do first?

The Kubernetes version skew policy states that the control plane may be up to 2 minor versions ahead of kubelet, but not vice versa. What this means is that you should not run a newer version of Kubernetes on a node than is running on the control plane.

As another example, the kubeadm community project's documented upgrade process also specifies upgrading the control plane first.
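The skew rule can be expressed as a small check (an illustrative sketch; it assumes simple "major.minor.patch" version strings):

```python
# Kubernetes version skew rule as described above: the control plane may
# be ahead of kubelet by up to two minor versions, never behind it.
# Illustrative sketch only.

def release(version):
    """Return the (major, minor) release channel of a version string."""
    major, minor, _patch = (int(p) for p in version.split("."))
    return (major, minor)

def skew_ok(control_plane, kubelet):
    cp, kl = release(control_plane), release(kubelet)
    # kubelet must not be newer than the control plane...
    if kl > cp:
        return False
    # ...and may trail it by at most two minor versions (same major)
    return cp[0] == kl[0] and cp[1] - kl[1] <= 2
```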

### Can I use aks-engine upgrade --control-plane-only to change the control plane configuration irrespective of updating the Kubernetes version?

Yes, but with caveats. Essentially, you may use the aks-engine upgrade --control-plane-only functionality to replace your control plane VMs, one at a time, with newer VMs rendered from updated API model configuration. You should always stage such changes, however, by building a staging cluster (reproducing, at a minimum, the version of aks-engine used to build your production cluster and the API model JSON used as input; ideally in the same location as well). Here are a few useful possibilities that will work:

- Updating the VM SKU by changing the properties.masterProfile.vmSize value
- Certain configurable/tunable kubelet properties in properties.masterProfile.kubernetesConfig.kubeletConfig, e.g.:
  - "--feature-gates"
  - "--node-status-update-frequency"
  - "--pod-max-pids"
  - "--register-with-taints"
  - "--image-gc-high-threshold" or "--image-gc-low-threshold"

  Generally, don't change any listening ports or filepaths, as those may have static dependencies elsewhere.
- Again, certain configurable/tunable properties in:
  - properties.orchestratorProfile.kubernetesConfig.controllerManagerConfig
  - properties.orchestratorProfile.kubernetesConfig.cloudControllerManagerConfig
  - properties.orchestratorProfile.kubernetesConfig.apiServerConfig
  - properties.orchestratorProfile.kubernetesConfig.schedulerConfig
- Control plane VM runtime kernel configuration via properties.masterProfile.kubernetesConfig.sysctldConfig

You may not change the following values; doing so may break your cluster!

- DO NOT CHANGE the number of VMs in your control plane via masterProfile.count
- DO NOT CHANGE the static IP address range of your control plane via masterProfile.firstConsecutiveStaticIP

These types of configuration changes are advanced; only make them if you're a confident, expert Kubernetes cluster administrator!
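As an example of the supported tweaks above, a partial API model fragment (abbreviated to the relevant properties; the values shown are hypothetical) that updates the control plane VM SKU and one tunable kubelet flag before running aks-engine upgrade --control-plane-only might look like:

```json
{
  "properties": {
    "masterProfile": {
      "vmSize": "Standard_D4s_v3",
      "kubernetesConfig": {
        "kubeletConfig": {
          "--node-status-update-frequency": "10s"
        }
      }
    }
  }
}
```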

### How can I recreate upgraded nodes that did not reach the Ready state?

If any of the upgraded nodes is not able to reach the Ready state, then aks-engine upgrade will exit with a message similar to the one below:

```
Error validating upgraded master VM: k8s-master-12345678-0
Error: upgrading cluster: Node was not ready within 20m0s
```

There are a variety of reasons why cluster nodes might not be able to come back online after an upgrade. Looking at the kubelet logs (sudo journalctl -u kubelet) may be a good first investigative step.

Once you identify the problem and update your API model, you can recreate the NotReady node by (1) changing the orchestrator tag of the virtual machine so it does not match the target upgrade version, and (2) retrying aks-engine upgrade. Failing to update the orchestrator tag will result in aks-engine upgrade treating the NotReady node as already upgraded, and consequently ignoring it and moving on to the next node.
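The skip decision can be sketched as a tag comparison (illustrative only; the exact "Kubernetes:&lt;version&gt;" tag value format shown is an assumption, not a documented contract):

```python
# aks-engine upgrade skips nodes whose orchestrator tag already matches
# the target version; editing that tag forces the node to be recreated.
# The tag value format below is an assumption for illustration.

def needs_upgrade(vm_tags, target_version):
    return vm_tags.get("orchestrator") != f"Kubernetes:{target_version}"
```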