:original_name: cce_10_0193.html

.. _cce_10_0193:

Volcano
=======

Introduction
------------

`Volcano `__ is a batch processing platform based on Kubernetes. It provides a series of features required by machine learning, deep learning, bioinformatics, genomics, and other big data applications, serving as a powerful supplement to Kubernetes capabilities.

Volcano provides general-purpose, high-performance computing capabilities such as job scheduling, heterogeneous chip management, and job running management, serving end users through computing frameworks for different industries, such as AI, big data, gene sequencing, and rendering.

Volcano provides job scheduling, job management, and queue management for computing applications. Its main features are as follows:

- Diverse computing frameworks, such as TensorFlow, MPI, and Spark, can run on Kubernetes in containers. Common APIs for batch computing jobs through CRDs, various plug-ins, and advanced job lifecycle management are provided.
- Advanced scheduling capabilities are provided for batch computing and high-performance computing scenarios, including group scheduling, preemptive priority scheduling, packing, resource reservation, and task topology.
- Queues can be effectively managed for scheduling jobs. Complex job scheduling capabilities such as queue priority and multi-level queues are supported.

Volcano has been open-sourced on GitHub at https://github.com/volcano-sh/volcano.

Install and configure the Volcano add-on in CCE clusters. For details, see :ref:`Volcano Scheduling `.

.. note::

   When using Volcano as a scheduler, use it to schedule all workloads in the cluster. This prevents resource scheduling conflicts caused by multiple schedulers working at the same time.

Installing the Add-on
---------------------

#. Log in to the CCE console and access the cluster console. Choose **Add-ons** in the navigation pane, locate **volcano** on the right, and click **Install**.

#. On the **Install Add-on** page, configure the specifications.

   .. list-table:: **Table 1** Volcano specifications
      :header-rows: 1
      :widths: 1 4

      * - Parameter
        - Description
      * - Add-on Specifications
        - Select **Single**, **Custom**, or **HA** for **Add-on Specifications**.
      * - Pods
        - Number of pods that will be created to match the selected add-on specifications.

          If you select **Custom**, you can adjust the number of pods as required.
      * - Multi-AZ
        - - **Preferred**: Deployment pods of the add-on will be preferentially scheduled to nodes in different AZs. If all the nodes in the cluster are deployed in the same AZ, the pods will be scheduled to that AZ.
          - **Required**: Deployment pods of the add-on will be forcibly scheduled to nodes in different AZs. If there are fewer AZs than pods, the extra pods will fail to run.
      * - Containers
        - CPU and memory quotas of the container allowed for the selected add-on specifications.

          If you select **Custom**, the recommended values for **volcano-controller** and **volcano-scheduler** are as follows:

          - If there are fewer than 100 nodes, retain the default configuration: a CPU request of 500m with a limit of 2000m, and a memory request of 500 MiB with a limit of 2000 MiB.
          - If there are more than 100 nodes, increase the CPU request by 500m and the memory request by 1000 MiB each time 100 nodes (10,000 pods) are added. Increase the CPU limit by 1500m and the memory limit by 1000 MiB accordingly.

          .. note::

             Recommended formulas for calculating the request values:

             - CPU request: Multiply the number of target nodes by the number of target pods, find the closest "Nodes x Pods" product among the specifications in :ref:`Table 2 <cce_10_0193__table4742829185912>`, and round up to the request and limit values of that specification.

               For example, for 2,000 nodes and 20,000 pods, Number of target nodes x Number of target pods = 40 million, which is closest to the 700-node/70,000-pod specification (49 million). Based on that row, a CPU request of 4000m and a limit of 5500m are recommended.

             - Memory request: Allocate 2.4 GiB of memory for every 1,000 nodes and 1 GiB for every 10,000 pods, and add the two values. (The result may differ from the recommended value in :ref:`Table 2 <cce_10_0193__table4742829185912>`. You can use either of them.)

               Memory request = Number of target nodes/1000 x 2.4 GiB + Number of target pods/10000 x 1 GiB

               For example, for 2,000 nodes and 20,000 pods, the memory request is 2000/1000 x 2.4 GiB + 20000/10000 x 1 GiB = 6.8 GiB.
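The two sizing rules in the note above can be expressed as a short helper. This is an illustrative sketch only (the function names are ours, not part of the add-on); it hardcodes the per-specification CPU values from Table 2:

```python
def volcano_memory_request_gib(nodes: int, pods: int) -> float:
    """Memory request = nodes/1000 x 2.4 GiB + pods/10000 x 1 GiB."""
    return nodes / 1000 * 2.4 + pods / 10000 * 1.0

# Table 2 specifications: nodes*pods product -> (CPU request in m, CPU limit in m)
SPECS = [
    (50 * 5_000, (500, 2000)),
    (100 * 10_000, (1000, 2500)),
    (200 * 20_000, (1500, 3000)),
    (300 * 30_000, (2000, 3500)),
    (400 * 40_000, (2500, 4000)),
    (500 * 50_000, (3000, 4500)),
    (600 * 60_000, (3500, 5000)),
    (700 * 70_000, (4000, 5500)),
]

def volcano_cpu_request_m(nodes: int, pods: int) -> tuple[int, int]:
    """Round up: pick the smallest specification whose nodes*pods product covers the target."""
    target = nodes * pods
    for product, cpu in SPECS:
        if target <= product:
            return cpu
    return SPECS[-1][1]  # beyond the table: fall back to the largest specification

# 2,000 nodes and 20,000 pods -> 6.8 GiB memory, 4000m/5500m CPU
print(volcano_memory_request_gib(2000, 20000))  # 6.8
print(volcano_cpu_request_m(2000, 20000))       # (4000, 5500)
```

This reproduces the worked example in the note: 40 million rounds up to the 49-million specification, giving 4000m/5500m.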
   .. _cce_10_0193__table4742829185912:

   .. table:: **Table 2** Recommended values for volcano-controller and volcano-scheduler

      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | Nodes/Pods in a Cluster | CPU Request (m) | CPU Limit (m) | Memory Request (MiB) | Memory Limit (MiB) |
      +=========================+=================+===============+======================+====================+
      | 50/5,000                | 500             | 2000          | 500                  | 2000               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 100/10,000              | 1000            | 2500          | 1500                 | 2500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 200/20,000              | 1500            | 3000          | 2500                 | 3500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 300/30,000              | 2000            | 3500          | 3500                 | 4500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 400/40,000              | 2500            | 4000          | 4500                 | 5500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 500/50,000              | 3000            | 4500          | 5500                 | 6500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 600/60,000              | 3500            | 5000          | 6500                 | 7500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+
      | 700/70,000              | 4000            | 5500          | 7500                 | 8500               |
      +-------------------------+-----------------+---------------+----------------------+--------------------+

#. Configure the add-on parameters.

   Configure the parameters of the default Volcano scheduler. For details, see :ref:`Table 4 <cce_10_0193__table562185146>`.
   .. code-block::

      colocation_enable: ''
      default_scheduler_conf:
        actions: 'allocate, backfill'
        tiers:
          - plugins:
              - name: 'priority'
              - name: 'gang'
              - name: 'conformance'
          - plugins:
              - name: 'drf'
              - name: 'predicates'
              - name: 'nodeorder'
          - plugins:
              - name: 'cce-gpu-topology-predicate'
              - name: 'cce-gpu-topology-priority'
              - name: 'cce-gpu'
          - plugins:
              - name: 'nodelocalvolume'
              - name: 'nodeemptydirvolume'
              - name: 'nodeCSIscheduling'
              - name: 'networkresource'
      tolerations:
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 60
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 60

   .. list-table:: **Table 3** Advanced Volcano configuration parameters
      :header-rows: 1
      :widths: 2 4 4 3

      * - Parameter
        - Function
        - Description
        - Demonstration
      * - default_scheduler_conf
        - Used to schedule pods. It consists of a series of actions and plug-ins and features high scalability. You can specify and implement actions and plug-ins based on your requirements.
        - It consists of **actions** and **tiers**.

          - **actions**: defines the types and sequence of actions to be executed by the scheduler.
          - **tiers**: configures the plug-in list.
        - None
      * - actions
        - Actions to be executed in each scheduling phase. The configured action sequence is the scheduler execution sequence. For details, see `Actions `__.

          The scheduler traverses all jobs to be scheduled and performs actions such as enqueue, allocate, preempt, reclaim, and backfill in the configured sequence to find the most appropriate node for each job.
        - The following options are supported:

          - **enqueue**: uses a series of filtering algorithms to filter out tasks to be scheduled and sends them to the queue to wait for scheduling. After this action, the task status changes from **pending** to **inqueue**.
          - **allocate**: selects the most suitable node based on a series of pre-selection and selection algorithms.
          - **preempt**: performs preemption scheduling for tasks with higher priorities in the same queue based on priority rules.
          - **backfill**: schedules pending tasks as much as possible to maximize the utilization of node resources.

          .. note::

             When configuring **actions**, use either **preempt** or **enqueue**, but not both.
        - .. code-block::

             actions: 'allocate, backfill'
      * - plugins
        - Implementation details of the algorithms in actions based on different scenarios. For details, see `Plugins `__.
        - For details, see :ref:`Table 4 <cce_10_0193__table562185146>`.
        - None
      * - tolerations
        - Tolerance of the add-on to node taints.
        - By default, the add-on can run on nodes with the **node.kubernetes.io/not-ready** or **node.kubernetes.io/unreachable** taint whose effect is **NoExecute**, but it will be evicted from such nodes after 60 seconds.
        - .. code-block::

             tolerations:
               - effect: NoExecute
                 key: node.kubernetes.io/not-ready
                 operator: Exists
                 tolerationSeconds: 60
               - effect: NoExecute
                 key: node.kubernetes.io/unreachable
                 operator: Exists
                 tolerationSeconds: 60
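The constraint above (use either **preempt** or **enqueue**, never both) can be checked before applying a configuration. A minimal illustrative sketch; the helper name is ours, not part of Volcano:

```python
def check_actions(actions: str) -> list[str]:
    """Split a Volcano 'actions' string and flag the preempt/enqueue conflict."""
    parsed = [a.strip() for a in actions.split(',') if a.strip()]
    if 'preempt' in parsed and 'enqueue' in parsed:
        raise ValueError("use either 'preempt' or 'enqueue', not both")
    return parsed

print(check_actions('allocate, backfill'))           # ['allocate', 'backfill']
print(check_actions('enqueue, allocate, backfill'))  # ['enqueue', 'allocate', 'backfill']
```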
   .. _cce_10_0193__table562185146:

   .. list-table:: **Table 4** Supported plug-ins
      :header-rows: 1
      :widths: 2 4 4 3

      * - Plug-in
        - Function
        - Description
        - Demonstration
      * - binpack
        - Schedules pods to nodes with high resource usage (rather than to lightly loaded nodes) to reduce resource fragmentation.
        - **arguments**:

          - **binpack.weight**: weight of the binpack plug-in.
          - **binpack.cpu**: ratio of CPUs to all resources. Defaults to **1**.
          - **binpack.memory**: ratio of memory resources to all resources. Defaults to **1**.
          - **binpack.resources**: other custom resource types requested by the pod, for example, **nvidia.com/gpu**. Multiple types can be configured, separated by commas (,).
          - **binpack.resources.<your_resource>**: weight of your custom resource among all resources. Multiple resource types can be added. **<your_resource>** indicates a resource type defined in **binpack.resources**, for example, **binpack.resources.nvidia.com/gpu**.
        - .. code-block::

             - plugins:
                 - name: binpack
                   arguments:
                     binpack.weight: 10
                     binpack.cpu: 1
                     binpack.memory: 1
                     binpack.resources: nvidia.com/gpu, example.com/foo
                     binpack.resources.nvidia.com/gpu: 2
                     binpack.resources.example.com/foo: 3
      * - conformance
        - Prevents key pods, such as those in the **kube-system** namespace, from being preempted.
        - None
        - .. code-block::

             - plugins:
                 - name: 'priority'
                 - name: 'gang'
                   enablePreemptable: false
                 - name: 'conformance'
      * - gang
        - Considers a group of pods as a whole for resource allocation. This plug-in checks whether the number of scheduled pods in a job meets the minimum requirement for running the job. If yes, all pods in the job will be scheduled. If no, none of them will be.

          .. note::

             If gang scheduling is used and the remaining cluster resources are greater than or equal to half of the minimum resources required to run a job but less than that minimum, autoscaler scale-outs will not be triggered.
        - **enablePreemptable**:

          - **true**: preemption enabled
          - **false**: preemption disabled
        - .. code-block::

             - plugins:
                 - name: priority
                 - name: gang
                   enablePreemptable: false
                 - name: conformance
      * - priority
        - Schedules based on custom workload priorities.
        - None
        - .. code-block::

             - plugins:
                 - name: priority
                 - name: gang
                   enablePreemptable: false
                 - name: conformance
      * - overcommit
        - Cluster resources are inflated by a certain factor during scheduling to improve workload enqueuing efficiency. If all workloads are Deployments, remove this plug-in or set the inflation factor to **2.0**.

          .. note::

             This plug-in is supported in Volcano 1.6.5 and later versions.
        - **arguments**:

          - **overcommit-factor**: inflation factor, which defaults to **1.2**.
        - .. code-block::

             - plugins:
                 - name: overcommit
                   arguments:
                     overcommit-factor: 2.0
      * - drf
        - The Dominant Resource Fairness (DRF) scheduling algorithm, which schedules jobs based on their dominant resource share. Jobs with a smaller share are scheduled with a higher priority.
        - None
        - .. code-block::

             - plugins:
                 - name: 'drf'
                 - name: 'predicates'
                 - name: 'nodeorder'
      * - predicates
        - Determines whether a task is bound to a node using a series of evaluation algorithms, such as node/pod affinity, taint tolerance, node repetition, volume limits, and volume zone matching.
        - None
        - .. code-block::

             - plugins:
                 - name: 'drf'
                 - name: 'predicates'
                 - name: 'nodeorder'
      * - nodeorder
        - A common algorithm for selecting nodes. Nodes are scored in simulated resource allocation to find the most suitable node for the current job.
        - Scoring parameters:

          - **nodeaffinity.weight**: pods are scheduled based on node affinity. Defaults to **1**.
          - **podaffinity.weight**: pods are scheduled based on pod affinity. Defaults to **1**.
          - **leastrequested.weight**: pods are scheduled to the node with the least requested resources. Defaults to **1**.
          - **balancedresource.weight**: pods are scheduled to the node with balanced resource allocation. Defaults to **1**.
          - **mostrequested.weight**: pods are scheduled to the node with the most requested resources. Defaults to **0**.
          - **tainttoleration.weight**: pods are scheduled to the node with a high taint tolerance. Defaults to **1**.
          - **imagelocality.weight**: pods are scheduled to the node where the required images exist. Defaults to **1**.
          - **selectorspread.weight**: pods are evenly scheduled to different nodes. Defaults to **0**.
          - **volumebinding.weight**: pods are scheduled to the node with the local PV delayed binding policy. Defaults to **1**.
          - **podtopologyspread.weight**: pods are scheduled based on the pod topology. Defaults to **2**.
        - .. code-block::

             - plugins:
                 - name: nodeorder
                   arguments:
                     leastrequested.weight: 1
                     mostrequested.weight: 0
                     nodeaffinity.weight: 1
                     podaffinity.weight: 1
                     balancedresource.weight: 1
                     tainttoleration.weight: 1
                     imagelocality.weight: 1
                     volumebinding.weight: 1
                     podtopologyspread.weight: 2
      * - cce-gpu-topology-predicate
        - GPU topology scheduling preselection algorithm.
        - None
        - .. code-block::

             - plugins:
                 - name: 'cce-gpu-topology-predicate'
                 - name: 'cce-gpu-topology-priority'
                 - name: 'cce-gpu'
      * - cce-gpu-topology-priority
        - GPU topology scheduling priority algorithm.
        - None
        - .. code-block::

             - plugins:
                 - name: 'cce-gpu-topology-predicate'
                 - name: 'cce-gpu-topology-priority'
                 - name: 'cce-gpu'
      * - cce-gpu
        - GPU resource allocation that supports decimal GPU configurations by working with the gpu add-on.
        - None
        - .. code-block::

             - plugins:
                 - name: 'cce-gpu-topology-predicate'
                 - name: 'cce-gpu-topology-priority'
                 - name: 'cce-gpu'
      * - numa-aware
        - NUMA affinity scheduling.
        - **arguments**:

          - **weight**: weight of the numa-aware plug-in.
        - .. code-block::

             - plugins:
                 - name: 'nodelocalvolume'
                 - name: 'nodeemptydirvolume'
                 - name: 'nodeCSIscheduling'
                 - name: 'networkresource'
                   arguments:
                     NetworkType: vpc-router
                 - name: numa-aware
                   arguments:
                     weight: 10
      * - networkresource
        - Preselects and filters nodes based on ENI requirements. The parameters are transferred by CCE and do not need to be manually configured.
        - **arguments**:

          - **NetworkType**: network type (**eni** or **vpc-router**).
        - .. code-block::

             - plugins:
                 - name: 'nodelocalvolume'
                 - name: 'nodeemptydirvolume'
                 - name: 'nodeCSIscheduling'
                 - name: 'networkresource'
                   arguments:
                     NetworkType: vpc-router
      * - nodelocalvolume
        - Filters out nodes that do not meet local volume requirements.
        - None
        - .. code-block::

             - plugins:
                 - name: 'nodelocalvolume'
                 - name: 'nodeemptydirvolume'
                 - name: 'nodeCSIscheduling'
                 - name: 'networkresource'
      * - nodeemptydirvolume
        - Filters out nodes that do not meet emptyDir requirements.
        - None
        - .. code-block::

             - plugins:
                 - name: 'nodelocalvolume'
                 - name: 'nodeemptydirvolume'
                 - name: 'nodeCSIscheduling'
                 - name: 'networkresource'
      * - nodeCSIscheduling
        - Filters out nodes with a malfunctioning everest component.
        - None
        - .. code-block::

             - plugins:
                 - name: 'nodelocalvolume'
                 - name: 'nodeemptydirvolume'
                 - name: 'nodeCSIscheduling'
                 - name: 'networkresource'

#. Click **Install**.

Components
----------

.. table:: **Table 5** Volcano components

   +---------------------+-------------------+---------------+
   | Container Component | Description       | Resource Type |
   +=====================+===================+===============+
   | volcano-scheduler   | Schedule pods.    | Deployment    |
   +---------------------+-------------------+---------------+
   | volcano-controller  | Synchronize CRDs. | Deployment    |
   +---------------------+-------------------+---------------+
| Deployment | +---------------------+-------------------------------------------------------------------------------------------------------------------+---------------+ | volcano-admission | Webhook server, which verifies and modifies resources such as pods and jobs | Deployment | +---------------------+-------------------------------------------------------------------------------------------------------------------+---------------+ | volcano-agent | Cloud native hybrid agent, which is used for node QoS assurance, CPU burst, and dynamic resource oversubscription | DaemonSet | +---------------------+-------------------------------------------------------------------------------------------------------------------+---------------+ | resource-exporter | Report the NUMA topology information of nodes. | DaemonSet | +---------------------+-------------------------------------------------------------------------------------------------------------------+---------------+ Modifying the volcano-scheduler Configurations Using the Console ---------------------------------------------------------------- Volcano scheduler is the component responsible for pod scheduling. It consists of a series of actions and plug-ins. Actions should be executed in every step. Plugins provide the action algorithm details in different scenarios. volcano-scheduler is highly scalable. You can specify and implement actions and plug-ins based on your requirements. Volcano allows you to configure the scheduler during installation, upgrade, and editing. The configuration will be synchronized to volcano-scheduler-configmap. This section describes how to configure volcano-scheduler. .. note:: Only Volcano of v1.7.1 and later support this function. On the new plugin page, options such as **plugins.eas_service** and **resource_exporter_enable** are replaced by **default_scheduler_conf**. Log in to the CCE console and access the cluster console. Choose **Add-ons** in the navigation pane. 
On the right of the page, locate **volcano** and click **Install** or **Upgrade**. In the **Parameters** area, configure the volcano-scheduler parameters.

- Using **resource_exporter**:

  .. code-block::

     {
         "ca_cert": "",
         "default_scheduler_conf": {
             "actions": "allocate, backfill",
             "tiers": [
                 {
                     "plugins": [
                         { "name": "priority" },
                         { "name": "gang" },
                         { "name": "conformance" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "drf" },
                         { "name": "predicates" },
                         { "name": "nodeorder" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "cce-gpu-topology-predicate" },
                         { "name": "cce-gpu-topology-priority" },
                         { "name": "cce-gpu" },
                         { "name": "numa-aware" } # Add this line to also enable resource_exporter.
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "nodelocalvolume" },
                         { "name": "nodeemptydirvolume" },
                         { "name": "nodeCSIscheduling" },
                         { "name": "networkresource" }
                     ]
                 }
             ]
         },
         "server_cert": "",
         "server_key": ""
     }

  After this function is enabled, you can use the numa-aware plug-in and resource_exporter at the same time.

- Using **eas_service**:

  .. code-block::

     {
         "ca_cert": "",
         "default_scheduler_conf": {
             "actions": "allocate, backfill",
             "tiers": [
                 {
                     "plugins": [
                         { "name": "priority" },
                         { "name": "gang" },
                         { "name": "conformance" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "drf" },
                         { "name": "predicates" },
                         { "name": "nodeorder" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "cce-gpu-topology-predicate" },
                         { "name": "cce-gpu-topology-priority" },
                         { "name": "cce-gpu" },
                         {
                             "name": "eas",
                             "custom": {
                                 "availability_zone_id": "",
                                 "driver_id": "",
                                 "endpoint": "",
                                 "flavor_id": "",
                                 "network_type": "",
                                 "network_virtual_subnet_id": "",
                                 "pool_id": "",
                                 "project_id": "",
                                 "secret_name": "eas-service-secret"
                             }
                         }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "nodelocalvolume" },
                         { "name": "nodeemptydirvolume" },
                         { "name": "nodeCSIscheduling" },
                         { "name": "networkresource" }
                     ]
                 }
             ]
         },
         "server_cert": "",
         "server_key": ""
     }

- Using **ief**:
  .. code-block::

     {
         "ca_cert": "",
         "default_scheduler_conf": {
             "actions": "allocate, backfill",
             "tiers": [
                 {
                     "plugins": [
                         { "name": "priority" },
                         { "name": "gang" },
                         { "name": "conformance" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "drf" },
                         { "name": "predicates" },
                         { "name": "nodeorder" }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "cce-gpu-topology-predicate" },
                         { "name": "cce-gpu-topology-priority" },
                         { "name": "cce-gpu" },
                         { "name": "ief", "enableBestNode": true }
                     ]
                 },
                 {
                     "plugins": [
                         { "name": "nodelocalvolume" },
                         { "name": "nodeemptydirvolume" },
                         { "name": "nodeCSIscheduling" },
                         { "name": "networkresource" }
                     ]
                 }
             ]
         },
         "server_cert": "",
         "server_key": ""
     }

Retaining the Original volcano-scheduler-configmap Configurations
-----------------------------------------------------------------

If you want to use the original configuration after the plug-in is upgraded, perform the following steps:

#. Check and back up the original volcano-scheduler-configmap configuration.

   Example:

   .. code-block::

      # kubectl edit cm volcano-scheduler-configmap -n kube-system
      apiVersion: v1
      data:
        default-scheduler.conf: |-
          actions: "enqueue, allocate, backfill"
          tiers:
          - plugins:
            - name: priority
            - name: gang
            - name: conformance
          - plugins:
            - name: drf
            - name: predicates
            - name: nodeorder
            - name: binpack
              arguments:
                binpack.cpu: 100
                binpack.weight: 10
                binpack.resources: nvidia.com/gpu
                binpack.resources.nvidia.com/gpu: 10000
          - plugins:
            - name: cce-gpu-topology-predicate
            - name: cce-gpu-topology-priority
            - name: cce-gpu
          - plugins:
            - name: nodelocalvolume
            - name: nodeemptydirvolume
            - name: nodeCSIscheduling
            - name: networkresource

#. Enter the customized content in the **Parameters** area on the console.
   .. code-block::

      {
          "ca_cert": "",
          "default_scheduler_conf": {
              "actions": "enqueue, allocate, backfill",
              "tiers": [
                  {
                      "plugins": [
                          { "name": "priority" },
                          { "name": "gang" },
                          { "name": "conformance" }
                      ]
                  },
                  {
                      "plugins": [
                          { "name": "drf" },
                          { "name": "predicates" },
                          { "name": "nodeorder" },
                          {
                              "name": "binpack",
                              "arguments": {
                                  "binpack.cpu": 100,
                                  "binpack.weight": 10,
                                  "binpack.resources": "nvidia.com/gpu",
                                  "binpack.resources.nvidia.com/gpu": 10000
                              }
                          }
                      ]
                  },
                  {
                      "plugins": [
                          { "name": "cce-gpu-topology-predicate" },
                          { "name": "cce-gpu-topology-priority" },
                          { "name": "cce-gpu" }
                      ]
                  },
                  {
                      "plugins": [
                          { "name": "nodelocalvolume" },
                          { "name": "nodeemptydirvolume" },
                          { "name": "nodeCSIscheduling" },
                          { "name": "networkresource" }
                      ]
                  }
              ]
          },
          "server_cert": "",
          "server_key": ""
      }

   .. note::

      When this function is used, the original content in volcano-scheduler-configmap will be overwritten. Therefore, check whether volcano-scheduler-configmap has been modified during the upgrade. If it has, synchronize the modifications to the upgrade page.

Uninstalling the Volcano Add-on
-------------------------------

After the add-on is uninstalled, all custom Volcano resources (:ref:`Table 6 <cce_10_0193__table148801381540>`) will be deleted, including the resources you have created. Reinstalling the add-on will not inherit or restore the tasks that existed before the uninstallation. It is a good practice to uninstall the Volcano add-on only when no custom Volcano resources are being used in the cluster.

.. _cce_10_0193__table148801381540:

.. table:: **Table 6** Custom Volcano resources

   ============ ===================== =========== ==============
   Item         API Group             API Version Resource Level
   ============ ===================== =========== ==============
   Command      bus.volcano.sh        v1alpha1    Namespaced
   Job          batch.volcano.sh      v1alpha1    Namespaced
   Numatopology nodeinfo.volcano.sh   v1alpha1    Cluster
   PodGroup     scheduling.volcano.sh v1beta1     Namespaced
   Queue        scheduling.volcano.sh v1beta1     Cluster
   ============ ===================== =========== ==============
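To illustrate what such a custom resource looks like, the following is a minimal sketch of a Volcano Job (**batch.volcano.sh/v1alpha1**, namespaced). The job name, queue, and container image here are illustrative assumptions, not values from this document:

.. code-block:: yaml

   # Minimal Volcano Job sketch. Resources of this kind are deleted
   # when the Volcano add-on is uninstalled.
   apiVersion: batch.volcano.sh/v1alpha1
   kind: Job
   metadata:
     name: example-job              # illustrative name
   spec:
     schedulerName: volcano         # scheduled by volcano-scheduler
     minAvailable: 1                # gang scheduling: pods that must start together
     queue: default                 # target Queue (a cluster-level resource)
     tasks:
       - replicas: 1
         name: worker
         template:
           spec:
             restartPolicy: Never
             containers:
               - name: worker
                 image: busybox:latest    # illustrative image
                 command: ["sleep", "60"]

Before uninstalling, you can list resources of the types in Table 6, for example with ``kubectl get jobs.batch.volcano.sh,podgroups.scheduling.volcano.sh,queues.scheduling.volcano.sh -A``; if nothing application-specific remains (apart from built-in queues such as **default**), it is safe to proceed.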