:original_name: cce_10_0197.html
.. _cce_10_0197:
Upgrade Overview
================
To enable interoperability from one Kubernetes installation to the next, you must upgrade your Kubernetes clusters before the maintenance period ends.

After a new Kubernetes version becomes available in CCE, CCE describes the changes introduced in that version.
You can use the CCE console to upgrade the Kubernetes version of a cluster.
An upgrade flag will be displayed on the cluster card view if a new version is available for the cluster.
**How to check:**
Log in to the CCE console and check whether the message "New version available" is displayed in the lower left corner of the cluster card. If it is, the cluster can be upgraded. If it is not, the cluster cannot be upgraded.
.. figure:: /_static/images/en-us_image_0000001482796460.png
:alt: **Figure 1** Cluster with the upgrade flag
**Figure 1** Cluster with the upgrade flag
.. _cce_10_0197__section19981121648:
Cluster Upgrade
---------------
The following table describes the target version to which each cluster version can be upgraded, the supported upgrade modes, and upgrade impacts.
.. table:: **Table 1** Cluster upgrade paths and impacts
   +-----------------+-----------------+------------------+------------------------------------------------------------------------------------------+
   | Source Version  | Target Version  | Upgrade Modes    | Impacts                                                                                  |
   +=================+=================+==================+==========================================================================================+
   | v1.19           | v1.21           | In-place upgrade | You need to learn about the differences between versions. For details, see              |
   |                 |                 |                  | :ref:`Precautions for Major Version Upgrade <cce_10_0197__section191131551162610>`.     |
   +-----------------+-----------------+------------------+------------------------------------------------------------------------------------------+
   | v1.17           | v1.19           | In-place upgrade | You need to learn about the differences between versions. For details, see              |
   |                 |                 |                  | :ref:`Precautions for Major Version Upgrade <cce_10_0197__section191131551162610>`.     |
   | v1.15           |                 |                  |                                                                                          |
   +-----------------+-----------------+------------------+------------------------------------------------------------------------------------------+
   | v1.13           | v1.15           | Rolling upgrade  | - **proxy** in the coredns add-on cannot be configured and needs to be replaced with    |
   |                 |                 |                  |   **forward**, as shown in the Corefile example below this table.                       |
   |                 |                 | Replace upgrade  | - The storage add-on is changed from storage-driver to everest.                         |
   +-----------------+-----------------+------------------+------------------------------------------------------------------------------------------+
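
For the coredns change, the **proxy** plugin is no longer available and its function is provided by **forward**. The following Corefile snippet is a minimal sketch of the change; the listening port and the upstream resolver (**/etc/resolv.conf**) are only examples and should match your actual coredns add-on configuration:

.. code-block::

   # Before the upgrade: the "proxy" plugin forwards DNS queries to an upstream resolver.
   .:53 {
       cache 30
       proxy . /etc/resolv.conf
   }

   # After the upgrade: the "forward" plugin provides the equivalent behavior.
   .:53 {
       cache 30
       forward . /etc/resolv.conf
   }
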
Upgrade Modes
-------------
The upgrade process is the same for master nodes in all modes. The differences between the upgrade modes of worker nodes are described as follows:
.. table:: **Table 2** Differences between upgrade modes and their advantages and disadvantages
   +----------------------+----------------------------------------------------------------------------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------------------+
   | Upgrade Mode         | Method                                                                                             | Advantage                                   | Disadvantage                                                                                  |
   +======================+====================================================================================================+=============================================+===============================================================================================+
   | **In-place upgrade** | Kubernetes components, network components, and CCE management components are upgraded on the      | You do not need to migrate services,       | An in-place upgrade does not upgrade the node OS. If you want to upgrade the OS, clear the   |
   |                      | node. During the upgrade, service pods and networks are not affected. The **SchedulingDisabled**  | ensuring service continuity.               | data on the node after the node upgrade is complete and reset the node to a new OS version.  |
   |                      | label will be added to all existing nodes. After the upgrade is complete, you can use the         |                                             |                                                                                               |
   |                      | existing nodes as usual.                                                                          |                                             |                                                                                               |
   +----------------------+----------------------------------------------------------------------------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------------------+
   | **Rolling upgrade**  | Only the Kubernetes components and certain network components are upgraded on the node. The       | Services are not interrupted.              | - **After the upgrade is complete, you need to manually create nodes and gradually release   |
   |                      | **SchedulingDisabled** label will be added to all existing nodes so that running applications     |                                             |   the old nodes.** The new nodes are billed additionally. After services are migrated to     |
   |                      | are not affected.                                                                                 |                                             |   the new nodes, the old nodes can be deleted. For migration commands, see the example       |
   |                      |                                                                                                   |                                             |   below this table.                                                                           |
   |                      | .. important::                                                                                    |                                             |                                                                                               |
   |                      |                                                                                                   |                                             | - After the rolling upgrade is complete, if you want to upgrade to a later version, reset    |
   |                      |    **After the upgrade is complete, you need to manually create nodes and gradually release      |                                             |   the old nodes first. Otherwise, the pre-upgrade check cannot be passed. Services may be    |
   |                      |    the old nodes**, thereby migrating your applications to the new nodes. In this mode, you      |                                             |   interrupted during the upgrade.                                                             |
   |                      |    can control the upgrade process.                                                               |                                             |                                                                                               |
   +----------------------+----------------------------------------------------------------------------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------------------+
   | **Replace upgrade**  | The latest worker node image is used to reset the node OS.                                        | This is the fastest upgrade mode and       | Data or configurations on the node will be lost, and services will be interrupted for a      |
   |                      |                                                                                                   | requires few manual interventions.         | period of time.                                                                               |
   +----------------------+----------------------------------------------------------------------------------------------------+---------------------------------------------+-----------------------------------------------------------------------------------------------+
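
After a rolling upgrade, workloads have to be migrated from the old nodes to the newly created nodes before the old nodes are deleted. The following commands are a minimal sketch of such a migration; the node name **192.168.0.100** is only a placeholder, and older kubectl releases use **--delete-local-data** instead of **--delete-emptydir-data**:

.. code-block::

   # Mark the old node as unschedulable so that no new pods are scheduled to it.
   kubectl cordon 192.168.0.100

   # Evict the pods from the old node. Pods managed by Deployments or StatefulSets
   # are recreated on the remaining schedulable (new) nodes.
   kubectl drain 192.168.0.100 --ignore-daemonsets --delete-emptydir-data

   # After verifying that services run properly on the new nodes, delete the old node.
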
.. _cce_10_0197__section191131551162610:
Precautions for Major Version Upgrade
-------------------------------------
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| Upgrade Path    | Difference                                                                                         | Self-Check                                                                                         |
+=================+====================================================================================================+====================================================================================================+
| v1.19 to v1.21  | The bug of **exec probe timeouts** is fixed in Kubernetes 1.21. Before this fix, an exec probe     | Before the upgrade, check whether the timeout is properly set for the exec probes of your          |
|                 | ignored the **timeoutSeconds** field and ran indefinitely, even beyond its configured deadline,    | applications (see the probe example below this table).                                             |
|                 | stopping only when a result was returned. If this field is not specified, the default value        |                                                                                                    |
|                 | **1** is used. This field takes effect after the upgrade. If a probe takes longer than 1 second,   |                                                                                                    |
|                 | the application health check may fail and the application may restart frequently.                  |                                                                                                    |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
|                 | kube-apiserver of CCE 1.19 or later requires that the Subject Alternative Names (SANs) field be    | Before the upgrade, check whether the SAN field is configured in the certificate of your           |
|                 | configured in the certificate of your webhook server. Otherwise, kube-apiserver fails to call      | webhook server (see the openssl example below this table).                                         |
|                 | the webhook server after the upgrade, and containers cannot be started properly.                   |                                                                                                    |
|                 |                                                                                                    | - If you do not have your own webhook server, you can skip this check.                             |
|                 | Root cause: X.509 **CommonName** is discarded in Go 1.15. kube-apiserver of CCE 1.19 is            | - If the field is not set, you are advised to use the SAN field to specify the IP addresses        |
|                 | compiled using Go 1.15. If your webhook certificate does not have SANs, kube-apiserver no          |   and domain names supported by the certificate.                                                   |
|                 | longer processes the **CommonName** field of the X.509 certificate as the host name by             |                                                                                                    |
|                 | default. As a result, the authentication fails.                                                    |                                                                                                    |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| v1.15 to v1.19  | The control plane of CCE 1.19 is incompatible with kubelet 1.15. If the master nodes fail to be    | #. In normal cases, this scenario is not triggered.                                                |
|                 | upgraded, or a node to be upgraded restarts after the master nodes are upgraded, there is a        | #. After the master nodes are upgraded, do not suspend the upgrade. Upgrade the worker nodes       |
|                 | high probability that the node will be in the **NotReady** status.                                 |    quickly.                                                                                        |
|                 |                                                                                                    | #. If a node fails to be upgraded and cannot be restored, evict the applications on it as          |
|                 | This is because the kubelet on a node that fails to be upgraded is likely to restart, which        |    soon as possible, contact technical support to skip the node upgrade, and reset the node        |
|                 | triggers the node registration process. The default registration labels of kubelet 1.15           |    after the upgrade is complete.                                                                  |
|                 | (**failure-domain.beta.kubernetes.io/is-baremetal** and **kubernetes.io/availablezone**) are       |                                                                                                    |
|                 | regarded as invalid labels by kube-apiserver 1.19.                                                 |                                                                                                    |
|                 |                                                                                                    |                                                                                                    |
|                 | The valid labels in v1.19 are **node.kubernetes.io/baremetal** and                                 |                                                                                                    |
|                 | **failure-domain.beta.kubernetes.io/zone**.                                                        |                                                                                                    |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
|                 | In CCE 1.15 and 1.19 clusters, the file system of the Docker storage driver is switched from       | Before the upgrade, check the Docker configuration file **/etc/docker/daemon.json** on the         |
|                 | XFS to Ext4. As a result, the package import order in the pods of an upgraded Java application     | node and check whether the value of **dm.fs** is **xfs** (see the check commands below this        |
|                 | may change, causing pod exceptions.                                                                | table).                                                                                            |
|                 |                                                                                                    |                                                                                                    |
|                 |                                                                                                    | - If the value is **ext4** or the storage driver is Overlay, you can skip the next steps.          |
|                 |                                                                                                    | - If the value is **xfs**, you are advised to deploy applications in a cluster of the new          |
|                 |                                                                                                    |   version in advance to test whether the applications are compatible with the new cluster         |
|                 |                                                                                                    |   version.                                                                                         |
|                 |                                                                                                    |                                                                                                    |
|                 |                                                                                                    | .. code-block::                                                                                    |
|                 |                                                                                                    |                                                                                                    |
|                 |                                                                                                    |    {                                                                                               |
|                 |                                                                                                    |      "storage-driver": "devicemapper",                                                             |
|                 |                                                                                                    |      "storage-opts": [                                                                             |
|                 |                                                                                                    |        "dm.thinpooldev=/dev/mapper/vgpaas-thinpool",                                               |
|                 |                                                                                                    |        "dm.use_deferred_removal=true",                                                             |
|                 |                                                                                                    |        "dm.fs=xfs",                                                                                |
|                 |                                                                                                    |        "dm.use_deferred_deletion=true"                                                             |
|                 |                                                                                                    |      ]                                                                                             |
|                 |                                                                                                    |    }                                                                                               |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
|                 | kube-apiserver of CCE 1.19 or later requires that the Subject Alternative Names (SANs) field be    | Before the upgrade, check whether the SAN field is configured in the certificate of your           |
|                 | configured in the certificate of your webhook server. Otherwise, kube-apiserver fails to call      | webhook server.                                                                                    |
|                 | the webhook server after the upgrade, and containers cannot be started properly.                   |                                                                                                    |
|                 |                                                                                                    | - If you do not have your own webhook server, you can skip this check.                             |
|                 | Root cause: X.509 **CommonName** is discarded in Go 1.15. kube-apiserver of CCE 1.19 is            | - If the field is not set, you are advised to use the SAN field to specify the IP addresses        |
|                 | compiled using Go 1.15. The **CommonName** field is no longer processed as the host name. As a     |   and domain names supported by the certificate.                                                   |
|                 | result, the authentication fails.                                                                  |                                                                                                    |
|                 |                                                                                                    | .. important::                                                                                     |
|                 |                                                                                                    |                                                                                                    |
|                 |                                                                                                    |    To mitigate the impact of version differences on the cluster upgrade, CCE performs special     |
|                 |                                                                                                    |    processing during the upgrade from 1.15 to 1.19 and still supports certificates without        |
|                 |                                                                                                    |    SANs. However, no special processing is performed for subsequent upgrades. You are advised     |
|                 |                                                                                                    |    to rectify your certificate as soon as possible.                                                |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
|                 | In clusters of v1.17.17 and later, CCE automatically creates pod security policies (PSPs) for      | After the upgrade, you can allow insecure system configurations as required. For details, see      |
|                 | you, which restrict the creation of pods with insecure configurations, for example, pods for       | :ref:`Configuring a Pod Security Policy `.                                                         |
|                 | which **net.core.somaxconn** under a sysctl is configured in the security context.                 |                                                                                                    |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
| v1.13 to v1.15  | After a VPC network cluster is upgraded, the master nodes occupy an extra CIDR block due to        | Generally, this problem occurs only when the nodes in the cluster are about to use up the          |
|                 | the upgrade of network components. If no container CIDR block is available for a new node,         | container CIDR block. For example, if the container CIDR block is 10.0.0.0/16, the number of       |
|                 | the pods scheduled to that node cannot run.                                                        | available IP addresses is 65,536, and the VPC network allocates each node a CIDR block of a        |
|                 |                                                                                                    | fixed size (the mask determines the maximum number of container IP addresses on each node). If     |
|                 |                                                                                                    | the upper limit is 128, the cluster supports a maximum of 512 (65536/128) nodes, including the     |
|                 |                                                                                                    | three master nodes. After the upgrade, each of the three master nodes occupies one CIDR block.     |
|                 |                                                                                                    | As a result, 506 nodes are supported.                                                              |
+-----------------+----------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------+
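
For the **exec probe timeouts** difference in the v1.19 to v1.21 path, the following manifest is a minimal sketch of an exec probe with an explicit timeout; the pod name, image, and health check command are only examples:

.. code-block::

   apiVersion: v1
   kind: Pod
   metadata:
     name: liveness-exec-example
   spec:
     containers:
     - name: app
       image: nginx:alpine
       livenessProbe:
         exec:
           command: ["/bin/sh", "-c", "cat /tmp/healthy"]
         initialDelaySeconds: 5
         periodSeconds: 10
         # Ignored by exec probes before Kubernetes 1.21 and enforced after the upgrade.
         # Set this to a value longer than the command normally needs.
         timeoutSeconds: 5
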
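To check whether the certificate of your webhook server contains the SAN field (relevant to both the v1.19 to v1.21 and v1.15 to v1.19 paths), you can inspect the certificate with openssl. A minimal sketch, assuming the certificate file is named **webhook-server.crt**:

.. code-block::

   # Print the certificate details and look for the "X509v3 Subject Alternative Name"
   # extension. If the command prints nothing, the certificate has no SANs and should
   # be reissued before the upgrade.
   openssl x509 -noout -text -in webhook-server.crt | grep -A1 "Subject Alternative Name"
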
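To check the Docker storage driver file system before a v1.15 to v1.19 upgrade, you can query both the node configuration and the running daemon. A minimal sketch:

.. code-block::

   # Check the configured file system type in the Docker daemon configuration.
   grep "dm.fs" /etc/docker/daemon.json

   # Alternatively, query the running daemon. For the devicemapper storage driver,
   # "Backing Filesystem" shows xfs or ext4; for the Overlay driver, this check can
   # be skipped.
   docker info | grep -i "backing filesystem"
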