Failed to Schedule an Instance

Fault Locating

Viewing K8s Event Information

If the workload is in the Unready state and reports the "InstanceSchedulingFailed" event, view the workload's K8s events to identify the cause of the scheduling failure.

For example, the K8s event "0/163 nodes are available: 133 Insufficient memory" indicates that 133 of the 163 nodes in the cluster do not have enough memory to run the pod.
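The same events can also be viewed from the command line with kubectl. The sketch below is generic; <namespace> and <pod-name> are placeholders for your own values:

  # List the workload's pods; a pod that cannot be scheduled stays in the Pending state
  kubectl get pods -n <namespace> -o wide

  # The Events section at the end of the output shows the scheduling failure message
  kubectl describe pod <pod-name> -n <namespace>

  # Alternatively, list recent warning events in the namespace
  kubectl get events -n <namespace> --field-selector type=Warning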

A scheduling failure event may combine several causes (complex scheduling failure information). In that case, interpret each part of the message separately to see why each group of nodes was excluded.

The following is the fault locating procedure:

Check Item 1: Checking Whether a Node Is Available in the Cluster

Log in to the CCE console. In the navigation pane, choose Resource Management > Nodes to check whether the node where the workload runs is in the available state.

For example, the event "0/1 nodes are available: 1 node(s) were not ready, 1 node(s) were out of disk space" indicates that the pod cannot be scheduled because the only node in the cluster is not ready and has run out of disk space.
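Node status can also be checked with kubectl (a generic sketch; <node-name> is a placeholder):

  # A node must be in the Ready state to accept new pods
  kubectl get nodes

  # The Conditions section shows Ready, DiskPressure, MemoryPressure, and other states
  kubectl describe node <node-name>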

Solution

Restore the node to the available state (for example, by freeing up disk space or restarting the node), or add an available node to the cluster, and then redeploy the workload.

Check Item 2: Checking Whether Node Resources (CPU and Memory) Are Sufficient

If the resources requested by the workload exceed the available resources of the node where the workload runs, the node cannot provide the resources required to run new pods, and pod scheduling onto the node fails.

  1. On the CCE console, choose Workloads > Deployments or Workloads > StatefulSets in the navigation pane, click the workload name, and then click the Pods tab followed by the Events tab to view pod events.

    The event "0/1 nodes are available: 1 Insufficient cpu" indicates that the pod fails to be scheduled due to insufficient node resources.

  2. In the navigation pane, choose Resource Management > Nodes to view available CPUs and memory of the node where the workload runs.

    In this example, 0.88 vCPUs and 0.8 GiB memory are available for the node.

  3. In the navigation pane, choose Workloads and click the workload name to view the workload's CPU request and memory request.

    In this example, the CPU request is 2 vCPUs and the memory request is 0.5 GiB. The CPU request exceeds the available CPU resources, which causes pod scheduling to fail.
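The same comparison can be made with kubectl. In the sketch below, <node-name>, <workload-name>, and <namespace> are placeholders:

  # Allocatable resources of the node and the amount already requested by existing pods
  kubectl describe node <node-name>

  # CPU and memory requests declared in the workload's pod template
  kubectl get deployment <workload-name> -n <namespace> \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests}'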

Solution

On the ECS console, modify node specifications to expand node resources.
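Alternatively, if the request was set higher than the workload actually needs, lowering the request in the pod template lets the pod fit on the existing node. The following is a minimal sketch based on the 0.88 vCPU / 0.8 GiB node in the example above; the values are illustrative only:

  # Container resource requests must fit within the node's allocatable resources
  resources:
    requests:
      cpu: 500m       # 0.5 vCPUs, within the 0.88 vCPUs available
      memory: 512Mi   # within the 0.8 GiB memory available
    limits:
      cpu: "1"
      memory: 1Gi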

Check Item 3: Checking the Affinity and Anti-Affinity Configuration of the Workload

Inappropriately configured affinity or anti-affinity policies can prevent any node from matching, causing pod scheduling to fail. For example:

0/1 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't match pod anti-affinity rules.
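For reference, such an event is typically produced by rules like the following hard anti-affinity sketch (the app label value is a placeholder):

  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: my-app                      # placeholder label
          topologyKey: kubernetes.io/hostname
  # With this rule, no two pods labeled app: my-app can run on the same node.
  # In a single-node cluster, the second pod stays Pending and reports the event above.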

Solution

Adjust the workload's affinity and anti-affinity rules so that at least one node in the cluster can satisfy them, and then redeploy the workload.

Check Item 4: Checking Whether the Workload's Volume and Node Reside in the Same AZ

Pod scheduling fails if the workload's volume and the node reside in different AZs. For example:

0/1 nodes are available: 1 NoVolumeZoneConflict.
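To confirm the conflict, compare the AZ labels of the node and of the volume (a sketch; clusters running older Kubernetes versions may use the failure-domain.beta.kubernetes.io/zone label instead):

  # AZ of each node
  kubectl get nodes -L topology.kubernetes.io/zone

  # AZ of the persistent volume used by the workload
  kubectl get pv <pv-name> --show-labels

  # Scheduling can succeed only if the two AZs match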

Solution

In the AZ where the workload's node resides, create a new volume. Alternatively, create an identical workload and select an automatically assigned cloud storage volume.
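For dynamically provisioned volumes, a StorageClass with volumeBindingMode set to WaitForFirstConsumer avoids this conflict by delaying volume creation until the pod has been scheduled, so the volume is created in the pod's AZ. The sketch below is generic; the provisioner name depends on your cloud and is a placeholder:

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: disk-wait-for-consumer    # placeholder name
  provisioner: example.com/disk     # placeholder provisioner
  volumeBindingMode: WaitForFirstConsumer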