Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1573"></a>
<h1 class="topictitle1">State Backend</h1>
<div id="body1596163871747"><div class="section" id="mrs_01_1573__s29366c56bc86409c88bacc21398c6efe"><h4 class="sectiontitle">Scenarios</h4><p id="mrs_01_1573__a6425e870ecda438697be2f788cf7e9b7">Flink supports high availability (HA), recovery from job exceptions, and pausing and resuming jobs during a version upgrade. To do so, Flink relies on the state backend to store job states and on the restart strategy to restart jobs. You can configure both the state backend and the restart strategy.</p>
</div>
<div class="section" id="mrs_01_1573__s6b93853175ba4d9db463b12da5552742"><h4 class="sectiontitle">Configuration Description</h4><p id="mrs_01_1573__a12765726d4214fcbb10dd5317859759e">Configuration items include the state backend type, storage path, and restart strategy.</p>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1573__t17a8ee223eb54ff996550c15d24c3fb8" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameters</caption><thead align="left"><tr id="mrs_01_1573__r6a92fa0b7ef943cc8853b37a469f4ae1"><th align="left" class="cellrowborder" valign="top" width="25.000000000000007%" id="mcps1.3.2.3.2.5.1.1"><p id="mrs_01_1573__ab41a16133fd7426995b77ca144bc2df7">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.730000000000004%" id="mcps1.3.2.3.2.5.1.2"><p id="mrs_01_1573__a81f48ca72f034a90a4e14ed3f69f91a3">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="25.420000000000005%" id="mcps1.3.2.3.2.5.1.3"><p id="mrs_01_1573__ac12766f2f64043f1ab086642479553ed">Default Value</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="15.850000000000003%" id="mcps1.3.2.3.2.5.1.4"><p id="mrs_01_1573__a063ac5b03bbb492c8fc2fb73e712eac4">Mandatory</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1573__r6057475916ab4f67918c4490b53a700a"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__af898101668db417ca99f8aff03e293b7">state.backend.fs.checkpointdir</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__aa4ff5b9ff8ec498c8172150f156fc2e2">Checkpoint storage path used when the state backend is set to <strong id="mrs_01_1573__b29745546285">filesystem</strong>. The path must be accessible to JobManager. A local path is supported only in local mode. In cluster mode, use an HDFS path.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__a995e07aa06504fa2a196ba580218d4fa">hdfs:///flink/checkpoints</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a479cf63deaa74cf4b81efc53966ba591">No</p>
</td>
</tr>
<tr id="mrs_01_1573__rc4655adf06244ef8af9b513ea0400a66"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__ae758b4470e2d499cb9ab94dc5cf13e5b">state.savepoints.dir</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__aef476b5698e54e3c9c4f0fe5eb12ad8e">Savepoint storage directory used by Flink to restore and update jobs. When a savepoint is triggered, the metadata of the savepoint is saved to this directory.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__a7826e6874cbd4deb85cbfe1357c911f5">hdfs:///flink/savepoint</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__adc27d906b10d44bfb59bb3189ad2abde">Mandatory in security mode</p>
</td>
</tr>
<tr id="mrs_01_1573__rc0c36aa680194f4f96dc82d1e7849f4d"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__afc3c28ddb822435894ce33c1566cb565">restart-strategy</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__a2571ae2447ec4dc6bb80385ea40ab749">Default restart strategy, which applies to jobs that do not specify a restart strategy of their own. The options are as follows:</p>
<ul id="mrs_01_1573__u7decb649f0404920b19d0fca2a81f729"><li id="mrs_01_1573__l8c2c2619a365422da9f32458dffca330">fixed-delay</li><li id="mrs_01_1573__lbd841dcb4c1c4befb4e24a18e85fde51">failure-rate</li><li id="mrs_01_1573__l2776b6f28318466bba96ff50cd08e858">none</li></ul>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__abbd2f275e11d4c709789a3486a79cec0">none</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a3bee5a816d0943419a5158160642bd4d">No</p>
</td>
</tr>
<tr id="mrs_01_1573__r4f0645ea41b943bfbecc4b68efd1ea86"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__a5318984f52bd4133b90432c3fd5d1717">restart-strategy.fixed-delay.attempts</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__a3711f4f9ad534b94a1cf396fa158f4e7">Number of restart attempts when the fixed-delay restart strategy is used.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><ul id="mrs_01_1573__u83e0a60931ee4d9da9c65a6600a9d2ab"><li id="mrs_01_1573__lca0c80492b4a4643bd3caff645bde934">If checkpointing is enabled, the default value is <strong id="mrs_01_1573__b12955923410312">Integer.MAX_VALUE</strong>.</li><li id="mrs_01_1573__l1190e3ea3e98417285aad18730fcfc2b">If checkpointing is disabled, the default value is 3.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a797bc7f00d9e4738be1c372f22c74c04">No</p>
</td>
</tr>
<tr id="mrs_01_1573__rf3c1f767f9c243b1bc7e0a576a311be9"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__a73d7c0c70b694f60b8db607b184a5112">restart-strategy.fixed-delay.delay</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__aaaf6d48598724d07864264d5f6b856c5">Delay between restart attempts when the fixed-delay strategy is used. The unit can be ms, s, m, h, or d.</p>
<p id="mrs_01_1573__a0bd4f134386e4920ba37e5390f0d5d10"></p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><ul id="mrs_01_1573__ud11d469b37d848b39be92b53312643d8"><li id="mrs_01_1573__le819ea29066b410aa1fe7326c7411fe3">If checkpointing is enabled, the default value is 10s.</li><li id="mrs_01_1573__l41735805ff0943379f6359e1f98326c4">If checkpointing is disabled, the default value is the value of <strong id="mrs_01_1573__b68312955710312">akka.ask.timeout</strong>.</li></ul>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a48b063ba362641659b6481f724f3f6a8">No</p>
</td>
</tr>
<tr id="mrs_01_1573__r889031ab69674652ad7fd5ee4dc6aaee"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__a6e5222c6e55d4974ac8d297a79f593c5">restart-strategy.failure-rate.max-failures-per-interval</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__ab23f4eb0f69e427298b4a0776c0977a2">Maximum number of restarts allowed within the specified interval before the job fails when the failure-rate strategy is used.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__a784fa6dd795b4ee3a344cfa0195d3048">1</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a9d803701a18c48a3aae80003c98a03e4">No</p>
</td>
</tr>
<tr id="mrs_01_1573__r8833d05c7cc7414bb16b79b38b6e2344"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__aa57e2a4fcbbe4eda9c474d814b018962">restart-strategy.failure-rate.failure-rate-interval</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__a4734602b35434c81919485e3afa1dc18">Time interval over which the failure rate is measured when the failure-rate strategy is used. The unit can be ms, s, m, h, or d.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__aab2f2abeb64d4b538c796ac5fbf0baf2">60 s</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__abea5e436515e4d0c91d85fd9f5006c09">No</p>
</td>
</tr>
<tr id="mrs_01_1573__r0019a28e7e8642f8b7c6eb63730a2ba5"><td class="cellrowborder" valign="top" width="25.000000000000007%" headers="mcps1.3.2.3.2.5.1.1 "><p id="mrs_01_1573__a5ee0629a677341a8878a591cbc849f89">restart-strategy.failure-rate.delay</p>
</td>
<td class="cellrowborder" valign="top" width="33.730000000000004%" headers="mcps1.3.2.3.2.5.1.2 "><p id="mrs_01_1573__a35ff1f0ef95f46438eafa3bff3118971">Delay between two consecutive restart attempts when the failure-rate strategy is used. The unit can be ms, s, m, h, or d.</p>
</td>
<td class="cellrowborder" valign="top" width="25.420000000000005%" headers="mcps1.3.2.3.2.5.1.3 "><p id="mrs_01_1573__a32d90b8a78aa4b8989252ea31b3adcea">The default value is the same as the value of <strong id="mrs_01_1573__b7393316010312">akka.ask.timeout</strong>. For details, see <a href="mrs_01_1568.html">Distributed Coordination (via Akka)</a>.</p>
</td>
<td class="cellrowborder" valign="top" width="15.850000000000003%" headers="mcps1.3.2.3.2.5.1.4 "><p id="mrs_01_1573__a9cebf6f0144f47bb84b7fcf6c7d7181c">No</p>
</td>
</tr>
</tbody>
</table>
</div>
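<p>As a minimal sketch, the parameters in Table 1 might be combined as follows in the Flink client configuration file. The file name <strong>flink-conf.yaml</strong>, the <strong>state.backend</strong> key used to select the backend type, and the choice of the fixed-delay strategy with 3 attempts and a 10-second delay are assumptions made for illustration. The two paths are the default values listed in the table. Adjust all values to your cluster.</p>
<pre># Illustrative fragment only; assumed to be placed in flink-conf.yaml.
# Assumed key for the state backend type (not listed in Table 1).
state.backend: filesystem
# Checkpoint path; use an HDFS path in cluster mode.
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
# Savepoint directory; mandatory in security mode.
state.savepoints.dir: hdfs:///flink/savepoint
# Example choice: restart a failed job up to 3 times, waiting 10 s between attempts.
restart-strategy: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s</pre>
<p>To use the failure-rate strategy instead, you would set <strong>restart-strategy</strong> to <strong>failure-rate</strong> and configure the three <strong>restart-strategy.failure-rate.*</strong> parameters from the table in place of the fixed-delay ones.</p>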
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0592.html">Flink Configuration Management</a></div>
</div>
</div>