The Yarn Restart feature includes ResourceManager Restart and NodeManager Restart.
Go to the All Configurations page of Yarn by referring to Modifying Cluster Service Configuration Parameters, and then enter a parameter name in the search box.
Configure ResourceManager Restart as follows:
Parameter | Description | Default Value |
---|---|---|
yarn.resourcemanager.recovery.enabled | Whether ResourceManager restores its state after it starts. If this parameter is set to true, yarn.resourcemanager.store.class must also be set. | true |
yarn.resourcemanager.store.class | State-store class used to store application and task statuses and certificate content. | org.apache.hadoop.yarn.server.resourcemanager.recovery.AsyncZKRMStateStore |
yarn.resourcemanager.zk-state-store.parent-path | Path in ZooKeeper under which ZKRMStateStore data is stored. | /rmstore |
yarn.resourcemanager.work-preserving-recovery.enabled | Whether to enable ResourceManager work-preserving recovery. This configuration is used only for Yarn feature verification. | true |
yarn.resourcemanager.state-store.async.load | Whether to restore completed applications asynchronously. | true |
yarn.resourcemanager.zk-state-store.num-fetch-threads | Number of worker threads used when asynchronous restoration is enabled. Increasing this value can speed up the restoration of task information stored in ZooKeeper. The value must be greater than 0. | 20 |
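The table states two constraints: a state-store class is required whenever recovery is enabled, and the fetch-thread count must be greater than 0. The following is a minimal sketch of a pre-flight check for those two rules. The helper `check_rm_restart_config` is hypothetical and is not part of Yarn or of the cluster management tooling. It assumes that the settings are available as a plain dictionary of strings and that the thread count only matters when asynchronous restoration is enabled.

```python
# Hypothetical pre-flight check; not part of Yarn or any cluster tooling.
RECOVERY = "yarn.resourcemanager.recovery.enabled"
STORE_CLASS = "yarn.resourcemanager.store.class"
ASYNC_LOAD = "yarn.resourcemanager.state-store.async.load"
FETCH_THREADS = "yarn.resourcemanager.zk-state-store.num-fetch-threads"

# Defaults copied from the ResourceManager Restart table.
RM_RESTART_DEFAULTS = {
    RECOVERY: "true",
    STORE_CLASS: "org.apache.hadoop.yarn.server.resourcemanager.recovery.AsyncZKRMStateStore",
    "yarn.resourcemanager.zk-state-store.parent-path": "/rmstore",
    "yarn.resourcemanager.work-preserving-recovery.enabled": "true",
    ASYNC_LOAD: "true",
    FETCH_THREADS: "20",
}


def is_true(value):
    """Interpret a configuration string as a boolean flag."""
    return value.strip().lower() == "true"


def check_rm_restart_config(overrides=None):
    """Return a list of problems in the merged settings (empty if none)."""
    conf = {**RM_RESTART_DEFAULTS, **(overrides or {})}
    problems = []

    # Rule from the table: recovery requires a state-store class.
    if is_true(conf[RECOVERY]) and not conf[STORE_CLASS].strip():
        problems.append(f"{STORE_CLASS} must be set when {RECOVERY} is true")

    # Rule from the table: the fetch-thread count must be greater than 0.
    # Assumption: the value is only checked when asynchronous restoration is on.
    if is_true(conf[ASYNC_LOAD]):
        try:
            threads = int(conf[FETCH_THREADS])
        except ValueError:
            threads = 0  # treat non-numeric input as invalid
        if threads <= 0:
            problems.append(f"{FETCH_THREADS} must be an integer greater than 0")

    return problems


if __name__ == "__main__":
    print(check_rm_restart_config())                      # defaults only
    print(check_rm_restart_config({FETCH_THREADS: "0"}))  # invalid thread count
```

Calling the function with no arguments checks the default values only. Passing a dictionary of overrides checks the values you intend to enter on the configuration page before you save them.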
Configure NodeManager Restart as follows:
Parameter | Description | Default Value |
---|---|---|
yarn.nodemanager.recovery.enabled | Whether NodeManager, after it restarts, retries log collection that previously failed and restores unfinished applications. | true |
yarn.nodemanager.recovery.dir | Local directory used by NodeManager to store container status. | ${SRV_HOME}/tmp/yarn-nm-recovery |
yarn.nodemanager.recovery.supervised | Whether NodeManager runs under supervision. If this parameter is enabled, NodeManager does not clean up containers when it exits, because it assumes that it will be restarted immediately and will restore the containers. | true |
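In Apache Hadoop, these parameters are properties of yarn-site.xml. To show how the three NodeManager settings look in that form, the following sketch renders them as `<property>` elements. The helper `to_hadoop_xml` is hypothetical. The sketch assumes that the values are the defaults listed in the table and that `${SRV_HOME}` is left unexpanded for the service to resolve. On a managed cluster, change the values on the All Configurations page rather than editing the file directly.

```python
import xml.etree.ElementTree as ET

# Defaults copied from the NodeManager Restart table.
# ${SRV_HOME} is kept as a literal placeholder; this sketch does not expand it.
NM_RESTART_PROPS = {
    "yarn.nodemanager.recovery.enabled": "true",
    "yarn.nodemanager.recovery.dir": "${SRV_HOME}/tmp/yarn-nm-recovery",
    "yarn.nodemanager.recovery.supervised": "true",
}


def to_hadoop_xml(props):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hadoop_xml(NM_RESTART_PROPS))
```

Building the document with `xml.etree.ElementTree` instead of string concatenation ensures that any special characters in a value are escaped correctly.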