Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1654"></a>
<h1 class="topictitle1">Why Does HMaster Time Out While Waiting for the Namespace Table to Be Assigned and Fail to Start After Meta Is Rebuilt Using the OfflineMetaRepair Tool?</h1>
<div id="body1596003895086"><div class="section" id="mrs_01_1654__s1af6c8e532f84f37bd71f83b18ee571f"><h4 class="sectiontitle">Question</h4><p id="mrs_01_1654__a744685012bd44ea5a92be0e6e5941b9f">Why does HMaster time out while waiting for the namespace table to be assigned, and then fail to start, after meta is rebuilt using the OfflineMetaRepair tool?</p>
<p id="mrs_01_1654__a5c0058d86fc448569c4cf3045c6f5445">HMaster aborts with the following FATAL message:</p>
<pre class="screen" id="mrs_01_1654__s0aaba10586d44d45bd7a4fa2633c190c">2017-06-15 15:11:07,582 FATAL [Hostname:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: Timedout 120000ms waiting for namespace table to be assigned
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1054)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:848)
at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:199)
at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:1871)
at java.lang.Thread.run(Thread.java:745)</pre>
</div>
<div class="section" id="mrs_01_1654__s6a99ab83e06c4fd193b8228a53388a41"><h4 class="sectiontitle">Answer</h4><p id="mrs_01_1654__adff3c9c3770c4c7d89331317d948a0de">After meta is rebuilt using the OfflineMetaRepair tool, HMaster waits during startup for the WALs of all region servers to be split, in order to avoid data inconsistency. HMaster triggers the assignment of user regions only after WAL splitting is complete. If the cluster is in an abnormal state, WAL splitting may take a long time, depending on factors such as a large number of WALs, slow I/O, or unstable region servers. If splitting does not finish in time, HMaster times out and aborts.</p>
<p id="mrs_01_1654__a77d11dde73b64bfb9c3b16353fc582c2">HMaster must be able to finish WAL splitting for all region servers. To ensure this, perform the following steps:</p>
<ol id="mrs_01_1654__o61459ccc17ed4d2db0e89d7df147a08b"><li id="mrs_01_1654__l648635d86bb34ac9b73fb6802e1d6d81">Make sure that the cluster is stable and that no other problems exist. If there are any problems, correct them first.</li><li id="mrs_01_1654__l1e569c37e02d489cb93c1b85b0b7a375">Set the <span class="parmname" id="mrs_01_1654__peab7d63f3a964c0784a6494a7515c3a2"><b>hbase.master.initializationmonitor.timeout</b></span> parameter to a larger value. The default value is <span class="parmvalue" id="mrs_01_1654__p0764395c0b934914b69bd74f2c9a441f"><b>3600000</b></span> milliseconds.</li><li id="mrs_01_1654__l6334469a383440a298a08ef01ccc565f">Restart the HBase service.</li></ol>
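<p>As a minimal sketch of step 2, assuming the parameter is set directly in the HMaster's <b>hbase-site.xml</b> file, the entry could look like the following. The value <b>7200000</b> (two hours) is only an illustrative assumption, not a recommended setting. Choose a value that covers the WAL splitting time expected in your cluster. If the cluster configuration is managed through a management console, change the parameter there instead of editing the file by hand.</p>
<pre class="screen">&lt;!-- Illustrative value only: 7200000 ms = 2 hours (assumed, tune for your cluster) --&gt;
&lt;property&gt;
  &lt;name&gt;hbase.master.initializationmonitor.timeout&lt;/name&gt;
  &lt;value&gt;7200000&lt;/value&gt;
&lt;/property&gt;</pre>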
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1638.html">Common Issues About HBase</a></div>
</div>
</div>