A canary upgrade can fail for several reasons. If a canary upgrade fails, use the following solutions to handle the failure.
Solution: The new Istio version does not support some CRDs, including ClusterRbacConfigs, ServiceRoles, ServiceRoleBindings, and Policies. If the current version contains resources of these types, delete them before the upgrade.
Solution: Configure Istio gateway labels (specified by matchLabels) in {app: istio-ingressgateway, istio: ingressgateway} format.
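As a rough illustration, the fragment below sketches how these labels might appear on a gateway workload. It assumes the labels are the ones in the gateway Deployment's selector (matchLabels) and pod template. The Deployment name and namespace are illustrative, and the rest of the manifest is omitted.

```yaml
# Hypothetical excerpt of a gateway Deployment; only the label fields are shown.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istio-ingressgateway      # illustrative name
  namespace: istio-system         # assumed namespace
spec:
  selector:
    matchLabels:
      app: istio-ingressgateway
      istio: ingressgateway
  template:
    metadata:
      labels:                     # pod labels must match the selector above
        app: istio-ingressgateway
        istio: ingressgateway
```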
Solution: ASM 1.8 and later versions do not support the tracing, kiali, grafana, and prometheus add-ons. Uninstall the add-ons before the upgrade. You can install open-source add-ons or use APM.
Solution: If the cluster is unavailable before the upgrade, do not perform the upgrade.
Solution: Prepare the required resources for the canary upgrade.
Solution: Use the cluster version listed in the following table.
| Service Mesh Version | Supported Cluster Version |
| --- | --- |
| 1.15 | 1.21, 1.23, 1.25, and 1.27 |
| 1.18 | 1.25, 1.27, 1.28, 1.29, and 1.30 |
Solution: If you upgrade Istio from a non-canary version to a canary version, ensure that the number of nodes labeled with istio:master is at least twice the number of istiod instances, and that the number of schedulable nodes is at least twice the number of istio-ingressgateway or istio-egressgateway instances, whichever is larger. If these conditions are not met, add nodes to meet the scheduling requirements, or set the anti-affinity of istiod, istio-ingressgateway, and istio-egressgateway to Preferred.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - istiod
        namespaces:
        - istio-system
        topologyKey: kubernetes.io/hostname

For the anti-affinity policy of istio-ingressgateway, replace istiod under values with istio-ingressgateway. For the anti-affinity policy of istio-egressgateway, replace it with istio-egressgateway.
Alternatively, change the anti-affinity from Required to Preferred on the CCE console.
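For orientation, the sketch below shows where a Preferred anti-affinity rule normally sits in a Kubernetes Deployment manifest, using istiod as the example. It assumes the standard pod spec layout (spec.template.spec.affinity.podAntiAffinity). The Deployment name is illustrative, and all unrelated fields are omitted.

```yaml
# Hypothetical excerpt of the istiod Deployment; unrelated fields are omitted.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: istiod                    # illustrative name
  namespace: istio-system
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # "Preferred": the scheduler tries to spread pods across nodes
          # but still schedules them when not enough nodes are available.
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - istiod
              namespaces:
              - istio-system
              topologyKey: kubernetes.io/hostname
```

If the Deployment still contains a requiredDuringSchedulingIgnoredDuringExecution entry with the same selector, the hard constraint remains in effect, so it would need to be removed for the Preferred rule to relax scheduling.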
Solution: If there are pods in the namespace when you migrate service mesh data from the Dedicated edition to the Basic edition, enable automatic injection for the namespace.
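In upstream Istio, automatic sidecar injection is commonly enabled by labeling the namespace. The fragment below is a minimal sketch under that assumption. The namespace name is hypothetical, and the mesh may expect a different label (for example, a revision label) or a console setting instead.

```yaml
# Hypothetical namespace manifest; assumes the upstream Istio injection label applies.
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace              # hypothetical namespace name
  labels:
    istio-injection: enabled      # upstream Istio label for automatic injection
```

In upstream Istio, pods that already exist receive the sidecar only after they are re-created.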