Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="ALM-19022"></a><a name="ALM-19022"></a>
<h1 class="topictitle1">ALM-19022 HBase Hotspot Detection Is Unavailable</h1>
<div id="body0000002007647317"><div class="section" id="ALM-19022__section42400121"><h4 class="sectiontitle"><span id="ALM-19022__text185357518384">Alarm Description</span></h4><p id="ALM-19022__p1779922">When the MetricController instance is installed for HBase, the alarm module checks the health status of the active HBase MetricController instance every 120 seconds. This alarm is generated when the active HBase MetricController instance does not exist or is unavailable, in which case the hotspot detection function is unavailable.</p>
<p id="ALM-19022__p16019298">This alarm is cleared when the active HBase MetricController instance recovers.</p>
<div class="note" id="ALM-19022__note9955955"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-19022__p10779958195019">This alarm applies only to MRS 3.3.0 or later.</p>
</div></div>
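<p>The generate-and-clear behavior described above can be pictured as a small state check that runs once per 120-second cycle. The following Python sketch is illustrative only: MetricControllerStatus, alarm_should_be_active, and next_alarm_event are hypothetical names, not part of MRS or FusionInsight Manager, and the real alarm module may be implemented differently.</p>

```python
from dataclasses import dataclass
from typing import Optional

# Interval stated in the alarm description.
CHECK_INTERVAL_SECONDS = 120


@dataclass
class MetricControllerStatus:
    """Hypothetical snapshot of the active HBase MetricController instance."""
    exists: bool
    available: bool


def alarm_should_be_active(status: MetricControllerStatus) -> bool:
    """Return True when the instance does not exist or is unavailable."""
    return not (status.exists and status.available)


def next_alarm_event(currently_raised: bool,
                     status: MetricControllerStatus) -> Optional[str]:
    """Return "generate", "clear", or None for one health-check cycle."""
    should_raise = alarm_should_be_active(status)
    if should_raise and not currently_raised:
        return "generate"
    if not should_raise and currently_raised:
        return "clear"
    # No state change: the alarm stays raised or stays cleared.
    return None
```

<p>In this sketch, the alarm is generated only on the transition from healthy to unhealthy and cleared only on the transition back, which matches the automatic clearance described for this alarm.</p>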
</div>
<div class="section" id="ALM-19022__section46056776"><h4 class="sectiontitle"><span id="ALM-19022__text1582805433817">Alarm Attributes</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19022__table3909558" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19022__row9358345"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-19022__p19828475"><span id="ALM-19022__text17999570388">Alarm ID</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-19022__p62602629"><span id="ALM-19022__text318901183916">Alarm Severity</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-19022__p37648208"><span id="ALM-19022__text1568825511215">Auto Cleared</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19022__row29606020"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-19022__p49277383">19022</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-19022__p13261740360">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-19022__p1825511402618">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19022__section11857806"><h4 class="sectiontitle"><span id="ALM-19022__text5781104153916">Alarm Parameters</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-19022__table10287189" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-19022__row45935908"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-19022__p29821069"><span id="ALM-19022__text171691577390">Parameter</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-19022__p66696423"><span id="ALM-19022__text1459201019396">Description</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-19022__row18190122316182"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19022__p13858113752316">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19022__p187931338134115">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19022__row33701210"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19022__p39123317">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19022__p57042344">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19022__row43619052"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19022__p37226997">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19022__p32410239">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-19022__row23256701"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-19022__p66118565">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-19022__p48772425">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-19022__section39611396"><h4 class="sectiontitle"><span id="ALM-19022__text8406151319394">Impact on the System</span></h4><p id="ALM-19022__p149952341174">The HBase hotspot detection function is unavailable.</p>
</div>
<div class="section" id="ALM-19022__section20958252"><h4 class="sectiontitle"><span id="ALM-19022__text941851614397">Possible Causes</span></h4><ul id="ALM-19022__ul20817398"><li id="ALM-19022__li1188327193">The ZooKeeper service is abnormal.</li><li id="ALM-19022__li9280164">The HBase service is abnormal.</li><li id="ALM-19022__li16412613">In the current HBase service, the MetricController instance on the same node as the active HMaster instance is not started.</li><li id="ALM-19022__li7418144120197">The network is abnormal.</li></ul>
</div>
<div class="section" id="ALM-19022__section118143257718"><h4 class="sectiontitle"><span id="ALM-19022__text7119112018395">Handling Procedure</span></h4><p class="tableheading" id="ALM-19022__p54353294"><strong id="ALM-19022__b15135086935">Check the ZooKeeper service status.</strong></p>
<ol id="ALM-19022__ol967113713192"><li id="ALM-19022__li116753791911"><span>In the service list on FusionInsight Manager, check whether <strong id="ALM-19022__b700098901114933">Running Status</strong> of ZooKeeper is <strong id="ALM-19022__b2002078094114933">Normal</strong>.</span><p><ul class="subitemlist" id="ALM-19022__ul167113714196"><li id="ALM-19022__li867163716190">If yes, go to <a href="#ALM-19022__li18661164216271">5</a>.</li><li id="ALM-19022__li14673371194">If no, go to <a href="#ALM-19022__li1267193701920">2</a>.</li></ul>
</p></li><li id="ALM-19022__li1267193701920"><a name="ALM-19022__li1267193701920"></a><a name="li1267193701920"></a><span>In the alarm list, check whether <strong id="ALM-19022__b1414187519114933">ALM-13000 ZooKeeper Service Unavailable</strong> exists.</span><p><ul class="subitemlist" id="ALM-19022__ul26783713195"><li id="ALM-19022__li76793711910">If yes, go to <a href="#ALM-19022__li667113714198">3</a>.</li><li id="ALM-19022__li14671437191915">If no, go to <a href="#ALM-19022__li18661164216271">5</a>.</li></ul>
</p></li><li id="ALM-19022__li667113714198"><a name="ALM-19022__li667113714198"></a><a name="li667113714198"></a><span>Rectify the fault by performing the operations provided for <strong id="ALM-19022__b836413547337">ALM-13000 ZooKeeper Service Unavailable</strong>.</span></li><li id="ALM-19022__li367113701911"><span>Wait for several minutes and check whether the alarm <strong id="ALM-19022__b147001165344">HBase Hotspot Detection Is Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-19022__ul76793751911"><li id="ALM-19022__li2671837191915">If yes, no further action is required.</li><li id="ALM-19022__li19671237151914">If no, go to <a href="#ALM-19022__li18661164216271">5</a>.</li></ul>
</p></li></ol>
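<p>As a supplementary check from a cluster node, a ZooKeeper server can also be probed with the standard four-letter command ruok, which a healthy server answers with imok. The Python sketch below assumes that the command is allowed on the server (recent ZooKeeper versions require it to be whitelisted) and that the client port is 2181. Both are assumptions to verify for your cluster, and the function name zookeeper_answers_ruok is hypothetical. This check does not replace the Running Status shown on FusionInsight Manager.</p>

```python
import socket


def zookeeper_answers_ruok(host: str, port: int = 2181,
                           timeout: float = 3.0) -> bool:
    """Send "ruok" to a ZooKeeper server and report whether it replies "imok".

    The default port 2181 is an assumption; use the client port configured
    for your cluster.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"ruok")
            reply = conn.recv(16)
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
    return reply.strip() == b"imok"
```

<p>A return value of False only indicates that no imok reply was received. It does not distinguish between a stopped server, a blocked port, and a disabled four-letter command.</p>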
<p id="ALM-19022__p865314531778"><strong id="ALM-19022__b1748012335616">Check the HBase service status.</strong></p>
<ol start="5" id="ALM-19022__ol466218426271"><li id="ALM-19022__li18661164216271"><a name="ALM-19022__li18661164216271"></a><a name="li18661164216271"></a><span>In the service list on FusionInsight Manager, check whether <strong id="ALM-19022__b18974248183410">Running Status</strong> of HBase is <strong id="ALM-19022__b1097474893418">Normal</strong>.</span><p><ul class="subitemlist" id="ALM-19022__ul146611042202714"><li id="ALM-19022__li4661842122717">If yes, go to <a href="#ALM-19022__li61381651152817">9</a>.</li><li id="ALM-19022__li566194262714">If no, go to <a href="#ALM-19022__li18662154292714">6</a>.</li></ul>
</p></li><li id="ALM-19022__li18662154292714"><a name="ALM-19022__li18662154292714"></a><a name="li18662154292714"></a><span>In the alarm list, check whether the alarm ALM-19000 HBase Service Unavailable exists.</span><p><ul class="subitemlist" id="ALM-19022__ul1366214211276"><li id="ALM-19022__li2066144217277">If yes, go to <a href="#ALM-19022__li66625429278">7</a>.</li><li id="ALM-19022__li126627425274">If no, go to <a href="#ALM-19022__li61381651152817">9</a>.</li></ul>
</p></li><li id="ALM-19022__li66625429278"><a name="ALM-19022__li66625429278"></a><a name="li66625429278"></a><span>Rectify the fault by following the steps provided for <strong id="ALM-19022__b6885292357">ALM-19000 HBase Service Unavailable</strong>.</span></li><li id="ALM-19022__li3662542162713"><span>Wait for several minutes and check whether the alarm <strong id="ALM-19022__b9249143393512">HBase Hotspot Detection Is Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-19022__ul11662144222718"><li id="ALM-19022__li166628421274">If yes, no further action is required.</li><li class="subitemlist" id="ALM-19022__li1266215424271">If no, go to <a href="#ALM-19022__li61381651152817">9</a>.</li></ul>
</p></li></ol>
<p id="ALM-19022__p868752102714"><strong id="ALM-19022__b1191412289287">Check whether the MetricController instance deployed on the same node as the active HMaster instance is started.</strong></p>
<ol start="9" id="ALM-19022__ol1113913517286"><li id="ALM-19022__li61381651152817"><a name="ALM-19022__li61381651152817"></a><a name="li61381651152817"></a><span>On FusionInsight Manager, choose <strong id="ALM-19022__b1855525619369">Cluster</strong> > <strong id="ALM-19022__b42663582366">Service</strong> > <strong id="ALM-19022__b1921119013715">HBase</strong>, and click <strong id="ALM-19022__b152267183717">Instances</strong> to check whether the <strong id="ALM-19022__b7728614153719">MetricController(Active)</strong> instance exists.</span><p><ul id="ALM-19022__ul18137551182817"><li id="ALM-19022__li213685102813">If yes, go to <a href="#ALM-19022__li182979395366">12</a>.</li><li id="ALM-19022__li1013719517283">If no, go to <a href="#ALM-19022__li12138165182818">10</a>.</li></ul>
</p></li><li id="ALM-19022__li12138165182818"><a name="ALM-19022__li12138165182818"></a><a name="li12138165182818"></a><span>Select the MetricController instance whose management IP address is the same as that of the active HMaster instance, and click <strong id="ALM-19022__b16234161112506">Start Instance</strong>.</span></li><li id="ALM-19022__li2139155152815"><span>After the MetricController instance is started, check whether the alarm <strong id="ALM-19022__b165503410387">HBase Hotspot Detection Is Unavailable</strong> is cleared.</span><p><ul id="ALM-19022__ul41391251132811"><li id="ALM-19022__li613819516284">If yes, no further action is required.</li><li id="ALM-19022__li813913519284">If no, go to <a href="#ALM-19022__li182979395366">12</a>.</li></ul>
</p></li></ol>
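<p>The selection rule in the steps above (pick the MetricController instance whose management IP address matches that of the active HMaster instance) can be sketched as follows. The Instance record, the role label HMaster(Active), and the function name are assumptions made for illustration. The data would have to be copied manually from the Instances page, because this sketch does not call any FusionInsight Manager interface.</p>

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Instance:
    """Hypothetical row copied from the HBase Instances page."""
    role: str            # for example "HMaster(Active)" or "MetricController"
    management_ip: str
    started: bool


def metric_controller_to_start(
        instances: Iterable[Instance]) -> Optional[Instance]:
    """Return the stopped MetricController co-located with the active HMaster."""
    rows = list(instances)
    active_hmaster = next(
        (row for row in rows if row.role == "HMaster(Active)"), None)
    if active_hmaster is None:
        # Without an active HMaster there is no node to match against.
        return None
    for row in rows:
        if (row.role.startswith("MetricController")
                and row.management_ip == active_hmaster.management_ip
                and not row.started):
            return row
    return None
```

<p>If the function returns None, either no active HMaster instance was listed or the co-located MetricController instance is already started. In the second case, continue with the network check.</p>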
<p id="ALM-19022__p69991826393"><strong id="ALM-19022__b34087649221">Check the network connectivity between the started MetricController instances and the active HMaster node.</strong></p>
<ol start="12" id="ALM-19022__ol14298143919367"><li id="ALM-19022__li182979395366"><a name="ALM-19022__li182979395366"></a><a name="li182979395366"></a><span>Log in to the node where the active HMaster instance is deployed and run <strong id="ALM-19022__b02971395367">ping</strong> <em id="ALM-19022__i165003820507">IP address of the node where the standby MetricController instance is deployed</em> to check whether the network connection between the started MetricController instances and the host where the active HMaster instance is deployed is normal.</span><p><ul class="subitemlist" id="ALM-19022__ul329718398364"><li id="ALM-19022__li1297139153613">If yes, go to <a href="#ALM-19022__li107641231103617">15</a>.</li><li class="subitemlist" id="ALM-19022__li1229719397368">If no, go to <a href="#ALM-19022__li929715395365">13</a>.</li></ul>
</p></li><li id="ALM-19022__li929715395365"><a name="ALM-19022__li929715395365"></a><a name="li929715395365"></a><span>Contact the network administrator to restore the network.</span></li><li id="ALM-19022__li6298193923611"><span>After the network recovers, check whether the alarm <strong id="ALM-19022__b4583132544015">HBase Hotspot Detection Is Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-19022__ul5298133993617"><li id="ALM-19022__li42981839123610">If yes, no further action is required.</li><li id="ALM-19022__li3298239123613">If no, go to <a href="#ALM-19022__li107641231103617">15</a>.</li></ul>
</p></li></ol>
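<p>When several MetricController nodes need to be checked, the ping test above can be scripted. The Python sketch below assumes a Linux node with the iputils ping command, where -c sets the number of packets and -W sets the reply timeout in seconds. The function names are hypothetical, and 192.0.2.10 in the usage note is a documentation-only placeholder address.</p>

```python
import subprocess
from typing import List


def build_ping_command(ip_address: str, count: int = 4,
                       wait_seconds: int = 2) -> List[str]:
    """Build a Linux ping command line for one target node."""
    return ["ping", "-c", str(count), "-W", str(wait_seconds), ip_address]


def node_is_reachable(ip_address: str) -> bool:
    """Run ping and treat exit status 0 as a normal network connection."""
    try:
        result = subprocess.run(
            build_ping_command(ip_address),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # ping is not installed or cannot be executed on this node.
        return False
    return result.returncode == 0
```

<p>For example, calling node_is_reachable("192.0.2.10") on the active HMaster node corresponds to running ping against that address manually. If the function returns False, contact the network administrator as described in the steps above.</p>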
<p id="ALM-19022__p15601739207"><strong id="ALM-19022__b3606332013">Collect fault information.</strong></p>
<ol start="15" id="ALM-19022__ol167651631113615"><li id="ALM-19022__li107641231103617"><a name="ALM-19022__li107641231103617"></a><a name="li107641231103617"></a><span>On FusionInsight Manager, choose <strong id="ALM-19022__b1985771627114933">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-19022__b1016475368114933">Log</strong> > <strong id="ALM-19022__b1811065636114933">Download</strong>.</span></li><li id="ALM-19022__li07645310363"><span>Expand the <strong id="ALM-19022__b1683209692114933">Service</strong> drop-down list, and select <strong id="ALM-19022__b1178843058114933">HBase</strong> for the target cluster.</span></li><li id="ALM-19022__li73388391699"><span>In the <strong id="ALM-19022__b109542059414">Host</strong> area, select the host where the HMaster instance is deployed.</span></li><li id="ALM-19022__li976593115360"><span>Click the edit icon in the upper right corner, and set <strong id="ALM-19022__b103081519194118">Start Date</strong> and <strong id="ALM-19022__b130917192419">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-19022__b23091319184112">Download</strong>.</span></li><li id="ALM-19022__li77651631163618"><span>Contact <span id="ALM-19022__text12765631133618">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
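<p>The log collection window in the steps above spans 10 minutes on each side of the alarm generation time. A minimal Python sketch of that arithmetic follows. The function name log_collection_window is hypothetical, and the alarm time must be read from the alarm list.</p>

```python
from datetime import datetime, timedelta
from typing import Tuple

# Margin on each side of the alarm generation time, as stated in the procedure.
LOG_WINDOW = timedelta(minutes=10)


def log_collection_window(alarm_time: datetime) -> Tuple[datetime, datetime]:
    """Return (Start Date, End Date) for log collection around an alarm."""
    return alarm_time - LOG_WINDOW, alarm_time + LOG_WINDOW
```

<p>For an alarm generated at 12:05:00, the sketch yields a Start Date of 11:55:00 and an End Date of 12:15:00 on the same day.</p>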
</div>
<div class="section" id="ALM-19022__section563505465818"><h4 class="sectiontitle"><span id="ALM-19022__text1761202610393">Alarm Clearance</span></h4><p id="ALM-19022__p715945811718">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-19022__section762211012599"><h4 class="sectiontitle"><span id="ALM-19022__text107101829133911">Related Information</span></h4><p id="ALM-19022__p1218816411811"><span id="ALM-19022__text61294221672">None.</span></p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>