Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_1702"></a><a name="mrs_01_1702"></a>
<h1 class="topictitle1">The HDFS Client Is Unresponsive When the NameNode Is Overloaded for a Long Time</h1>
<div id="body1597735020145"><div class="section" id="mrs_01_1702__s6f23c0269985441a9b49c1a8d29d9848"><h4 class="sectiontitle"><strong id="mrs_01_1702__ac5e298b1e661431980a8b5c49a72968e">Question</strong></h4><p id="mrs_01_1702__a0d92099a193543b8bdfa91026e946ffb">When the node that hosts the NameNode is overloaded (CPU usage reaches 100%), the NameNode becomes unresponsive. HDFS clients that are already connected to the overloaded NameNode stop responding. However, HDFS clients that connect afterwards are switched to a backup NameNode and run properly.</p>
</div>
<div class="section" id="mrs_01_1702__s9f3828a8da1949ef8489920f4a8311c2"><h4 class="sectiontitle"><strong id="mrs_01_1702__ab971a756b2624940827048c28925db51">Answer</strong></h4><p id="mrs_01_1702__a723e58adc5954f3d970612b8de5ce263">This problem occurs when the default configuration (described in <a href="#mrs_01_1702__tf99cac42ab7947b3bffe186b74e79d38">Table 1</a>) is used, because the <strong id="mrs_01_1702__a89402a848dc24f19a6610bfa88adf1f6">keep alive</strong> mechanism is enabled for the RPC connection between the HDFS client and the NameNode. The <strong id="mrs_01_1702__abcd8a7b104b24899befbe33d32306b22">keep alive</strong> mechanism keeps the HDFS client waiting for a response from the server and prevents the connection from timing out, so the HDFS client appears unresponsive.</p>
<p id="mrs_01_1702__a8c84dc27f0b94952a2b623f63d2278ad">Handle the unresponsive HDFS client in either of the following ways:</p>
<ul id="mrs_01_1702__u2a6ee0d37e8c41ca9ed2aead0d4db844"><li id="mrs_01_1702__l7e344d26c2804aa0babff5e71fd8ef5a">Leave the HDFS client waiting. Once the CPU usage of the node where the NameNode resides drops, the NameNode obtains CPU resources again and the HDFS client receives a response.</li><li id="mrs_01_1702__l37d7b8e86db2435cadb792f1e5947168">If you do not want to leave the HDFS client waiting, restart the application that hosts the HDFS client so that the client reconnects to another, idle NameNode.</li></ul>
<p id="mrs_01_1702__a127bf675df5b41b286469426de1f56c5">Procedure:</p>
<p id="mrs_01_1702__aedfc3d2d2b6446dba666275890c0e41d">Configure the following parameters in the <strong id="mrs_01_1702__a32523fd358a84a98a905db068935efc6">c</strong><span class="filepath" id="mrs_01_1702__f6ba7d36919f348a3a134c329107a6525"><b>ore-site.xml</b></span> file on the client.</p>
<div class="tablenoborder"><a name="mrs_01_1702__tf99cac42ab7947b3bffe186b74e79d38"></a><a name="tf99cac42ab7947b3bffe186b74e79d38"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1702__tf99cac42ab7947b3bffe186b74e79d38" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameter description</caption><thead align="left"><tr id="mrs_01_1702__rad1b05b07b104269902485c33d5e35f6"><th align="left" class="cellrowborder" valign="top" width="17%" id="mcps1.3.2.7.2.4.1.1"><p id="mrs_01_1702__a69a528b3927f44fe9ec634d2e8e2be20"><strong id="mrs_01_1702__ae4aa3f8a6a154b1f9ba0086526601807">Parameter</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="67%" id="mcps1.3.2.7.2.4.1.2"><p id="mrs_01_1702__aac40ee77b5534701a79fca92d13c68a1"><strong id="mrs_01_1702__affa17b983df34448afe0166ac4afa5bc">Description</strong></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="16%" id="mcps1.3.2.7.2.4.1.3"><p id="mrs_01_1702__aeeffd33deec84e71935c75ddc5efd9eb"><strong id="mrs_01_1702__afd961f27ae5743398245a00534965336">Default Value</strong></p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1702__ra7d68ac9d89145149bccb2578ca9c2fc"><td class="cellrowborder" valign="top" width="17%" headers="mcps1.3.2.7.2.4.1.1 "><p id="mrs_01_1702__a2b522333fd0540dfade644cec91f83e6">ipc.client.ping</p>
</td>
<td class="cellrowborder" valign="top" width="67%" headers="mcps1.3.2.7.2.4.1.2 "><p id="mrs_01_1702__a930ed8833899485590d2f8215335ad8d">If the <span class="parmname" id="mrs_01_1702__p833c8e6b5c144809b5df7dfae3fb86f3"><b>ipc.client.ping</b></span> parameter is set to <span class="parmvalue" id="mrs_01_1702__pab9177629fd14700a473ee7e327fd71a"><b>true</b></span>, the HDFS client waits for the response from the server and periodically sends a <strong id="mrs_01_1702__a5d8e09f7300a4a888a54219368c10cb9">ping</strong> message to avoid disconnection caused by a <strong id="mrs_01_1702__ac21e72c4d9a34443828de712d50f1075">TCP timeout</strong>.</p>
<p id="mrs_01_1702__a280af80aee174c9b8687f9dad1d04d88">If the <span class="parmname" id="mrs_01_1702__p391f143339e24f1187b60054925812cd"><b>ipc.client.ping</b></span> parameter is set to <span class="parmvalue" id="mrs_01_1702__pde96362814ff4c54b6b9c508187ff5d1"><b>false</b></span>, the HDFS client uses the value of <span class="parmname" id="mrs_01_1702__pfe9daa97322e473097b30eb6c41a937a"><b>ipc.ping.interval</b></span><strong id="mrs_01_1702__ac413e3397ff3477392193850f6ed6e1a"> </strong>as the timeout interval. If no response is received within that interval, the request times out.</p>
<p id="mrs_01_1702__a4ed8f59d0e37479da529dae9849324f2">To prevent the HDFS client from becoming unresponsive when the NameNode is overloaded for a long time, you are advised to set this parameter to <span class="parmvalue" id="mrs_01_1702__p79017f961e5046fd86206117c27fc056"><b>false</b></span>.</p>
</td>
<td class="cellrowborder" valign="top" width="16%" headers="mcps1.3.2.7.2.4.1.3 "><p id="mrs_01_1702__a8e870fc4a82943b98f39172d4fa19eb0">true</p>
</td>
</tr>
<tr id="mrs_01_1702__r250b5a537f7042128505f5176e139f5e"><td class="cellrowborder" valign="top" width="17%" headers="mcps1.3.2.7.2.4.1.1 "><p id="mrs_01_1702__adddaa0756fc349d381e4b47763899ce1">ipc.ping.interval</p>
</td>
<td class="cellrowborder" valign="top" width="67%" headers="mcps1.3.2.7.2.4.1.2 "><p id="mrs_01_1702__aebd48ca3ae2148d7846829db553b280a">If the value of <span class="parmname" id="mrs_01_1702__p25908c45480c44d28ee5ceb6ca4e8e65"><b>ipc.client.ping</b></span> is <span class="parmvalue" id="mrs_01_1702__p6eb0704a1775408dba4ef75d3dd34a2f"><b>true</b></span>, <span class="parmname" id="mrs_01_1702__pf2dfcdb0e3174725ae0a4855a9ecb5b8"><b>ipc.ping.interval</b></span> indicates the interval at which ping messages are sent.</p>
<p id="mrs_01_1702__af99967634770439db86dcf67ffc3caf1">If the value of <span class="parmname" id="mrs_01_1702__p5a52a4677ee049928d83547474534ff6"><b>ipc.client.ping</b></span> is <span class="parmvalue" id="mrs_01_1702__pf70998214ef442d9b1858207618f2d29"><b>false</b></span>, <span class="parmname" id="mrs_01_1702__pd6feaa02436f498fbd4f0dffd99352e4"><b>ipc.ping.interval</b></span> indicates the connection timeout interval.</p>
<p id="mrs_01_1702__a6d878cb2e35746299dd9c609ff131dbf">To prevent the HDFS client from becoming unresponsive when the NameNode is overloaded for a long time, you are advised to set this parameter to a large value, for example, <span class="parmvalue" id="mrs_01_1702__p98c1ded5d29b4090b27d9b3dbfc82563"><b>900000</b></span> (unit: ms), so that requests do not time out merely because the server is busy.</p>
</td>
<td class="cellrowborder" valign="top" width="16%" headers="mcps1.3.2.7.2.4.1.3 "><p id="mrs_01_1702__a5db00dcec7c147df9c83f3b8219f76da">60000</p>
</td>
</tr>
</tbody>
</table>
</div>
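<p>The recommended settings from Table 1 could be written in the client's <b>core-site.xml</b> roughly as follows. This is a minimal sketch: <b>900000</b> is only the example value given in the table, and the right timeout for a given cluster is an assumption to be tuned.</p>
<pre class="screen">&lt;!-- Disable the RPC ping (keep alive) so that the client can time out --&gt;
&lt;property&gt;
  &lt;name&gt;ipc.client.ping&lt;/name&gt;
  &lt;value&gt;false&lt;/value&gt;
&lt;/property&gt;
&lt;!-- With ping disabled, this value (in ms) acts as the timeout interval --&gt;
&lt;property&gt;
  &lt;name&gt;ipc.ping.interval&lt;/name&gt;
  &lt;value&gt;900000&lt;/value&gt;
&lt;/property&gt;</pre>
<p>These entries belong inside the file's existing <b>&lt;configuration&gt;</b> element. The application that hosts the HDFS client typically needs to be restarted before it picks up the new values.</p>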
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1690.html">FAQ</a></div>
</div>
</div>