Dynamic Resources

Overview

Yarn provides distributed resource management for a big data cluster. The total amount of resources allocated to Yarn is configurable, and Yarn in turn allocates and schedules computing resources for job queues. This applies to the job queues of MapReduce, Spark, Flink, and Hive.

Yarn queues are fundamental units of scheduling computing resources.

The resources obtained by tenants using Yarn queues are dynamic resources. Users can dynamically create and modify the queue quotas and view the status and statistics of the queues.

Resource Pools

Enterprise IT systems often face complex cluster environments and diverse upper-layer requirements, which require that the compute nodes be further partitioned.

Resource pools are used to specify the configuration of dynamic resources. Yarn queues are associated with resource pools for resource allocation and scheduling.

One tenant can have only one default resource pool. Users can be bound to the role of a tenant to use the resources in the resource pool of the tenant. To use resources in multiple resource pools, a user can be bound to roles of multiple tenants.
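The binding rule above can be illustrated with a minimal Python sketch. It is only a model of the relationship described in the text, not a FusionInsight Manager API: the `Tenant` and `User` classes, the `bind_role` and `usable_pools` methods, and the tenant and pool names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tenant:
    """Hypothetical model: a tenant has exactly one default resource pool."""
    name: str
    default_pool: str


@dataclass
class User:
    """Hypothetical model: a user reaches pools only through tenant roles."""
    name: str
    tenant_roles: list[Tenant] = field(default_factory=list)

    def bind_role(self, tenant: Tenant) -> None:
        # Binding the same tenant role twice has no additional effect.
        if tenant not in self.tenant_roles:
            self.tenant_roles.append(tenant)

    def usable_pools(self) -> set[str]:
        # One pool per bound tenant role.
        return {tenant.default_pool for tenant in self.tenant_roles}


tenant_a = Tenant("tenant_a", default_pool="pool_A")
tenant_b = Tenant("tenant_b", default_pool="pool_B")

alice = User("alice")
alice.bind_role(tenant_a)
pools_after_one_role = alice.usable_pools()

# To use a second resource pool, the user is bound to a second tenant's role.
alice.bind_role(tenant_b)
pools_after_two_roles = alice.usable_pools()
```

In this model a user never references a resource pool directly, which mirrors the statement that access to multiple pools requires roles of multiple tenants.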

Scheduling Mechanism

Yarn dynamic resources support label-based scheduling. This policy creates labels for compute nodes (Yarn NodeManagers) and adds the compute nodes that have the same label to the same resource pool. Yarn then dynamically associates the queues with resource pools based on the resource requirements of the queues.

For example, a cluster has more than 40 nodes, which are labeled Normal, HighCPU, HighMEM, or HighIO based on their hardware and network configurations and are added to four resource pools, respectively. Table 1 describes the performance of each node in the resource pool.

Table 1 Performance of each node in a resource pool

Label   | Number of Nodes | Hardware and Network Configuration | Added To        | Associated With
--------|-----------------|------------------------------------|-----------------|--------------------------
Normal  | 10              | General                            | Resource pool A | Common queue
HighCPU | 10              | High-performance CPU               | Resource pool B | Computing-intensive queue
HighMEM | 10              | Large memory                       | Resource pool C | Memory-intensive queue
HighIO  | 10              | High-performance network           | Resource pool D | I/O-intensive queue

A queue can use only the compute nodes in its associated resource pool.

Yarn queues are associated with specified resource pools to efficiently utilize resources in resource pools and maximize node performance.
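The label-based mechanism described in this section can be sketched in Python as follows. This is an illustrative model only, not Yarn or FusionInsight Manager code: the node hostnames and the helper functions `build_pools` and `nodes_for_queue` are assumptions, while the labels, pools, and queues are taken from Table 1.

```python
from collections import defaultdict

LABELS = ("Normal", "HighCPU", "HighMEM", "HighIO")

# Hypothetical inventory: 10 NodeManagers per label, hostname -> label.
NODE_LABELS = {
    f"nm-{label.lower()}-{i:02d}": label
    for label in LABELS
    for i in range(1, 11)
}

# Label -> resource pool and queue -> resource pool, as listed in Table 1.
LABEL_TO_POOL = {
    "Normal": "Resource pool A",
    "HighCPU": "Resource pool B",
    "HighMEM": "Resource pool C",
    "HighIO": "Resource pool D",
}
QUEUE_TO_POOL = {
    "Common queue": "Resource pool A",
    "Computing-intensive queue": "Resource pool B",
    "Memory-intensive queue": "Resource pool C",
    "I/O-intensive queue": "Resource pool D",
}


def build_pools(node_labels, label_to_pool):
    """Group nodes so that nodes with the same label share one pool."""
    pools = defaultdict(set)
    for node, label in node_labels.items():
        pools[label_to_pool[label]].add(node)
    return dict(pools)


def nodes_for_queue(queue, pools, queue_to_pool):
    """Return the only nodes a queue may use: those in its associated pool."""
    return pools[queue_to_pool[queue]]


pools = build_pools(NODE_LABELS, LABEL_TO_POOL)
cpu_nodes = nodes_for_queue("Computing-intensive queue", pools, QUEUE_TO_POOL)
```

Here `cpu_nodes` contains only the nodes labeled HighCPU, so a job submitted to the computing-intensive queue would never be placed on a Normal, HighMEM, or HighIO node in this model.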

FusionInsight Manager supports a maximum of 50 resource pools. The system has a default resource pool.
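As a rough illustration of the limit above, the following Python sketch models a pool registry that starts with a default pool and rejects additions beyond 50 pools. The `ResourcePoolRegistry` class and the pool name `default` are hypothetical, and the sketch assumes that the default pool counts toward the limit, which the text does not state.

```python
MAX_RESOURCE_POOLS = 50   # limit stated for FusionInsight Manager
DEFAULT_POOL = "default"  # hypothetical name for the system default pool


class ResourcePoolRegistry:
    """Hypothetical registry; not part of any FusionInsight Manager API."""

    def __init__(self):
        # The system always has a default resource pool.
        self._pools = {DEFAULT_POOL}

    def __len__(self):
        return len(self._pools)

    def add(self, name):
        if name in self._pools:
            raise ValueError(f"resource pool already exists: {name}")
        # Assumption: the default pool is included in the 50-pool limit.
        if len(self._pools) >= MAX_RESOURCE_POOLS:
            raise RuntimeError(
                f"cannot exceed {MAX_RESOURCE_POOLS} resource pools"
            )
        self._pools.add(name)


registry = ResourcePoolRegistry()
registry.add("pool_A")
pool_count = len(registry)
```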

Schedulers

By default, the Superior scheduler is enabled for the MRS cluster.

To meet enterprise requirements and tackle scheduling challenges faced by the Yarn community, the Superior scheduler makes the following enhancements: