Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="ALM-16004"></a><a name="ALM-16004"></a>
<h1 class="topictitle1">ALM-16004 Hive Service Unavailable</h1>
<div id="body3979303"><div class="section" id="ALM-16004__s03c109effc4547febe3e44d3f3e0924d"><h4 class="sectiontitle">Description</h4><p id="ALM-16004__en-us_topic_0070543661_p25305558">This alarm is generated when the HiveServer service is unavailable. The system checks the HiveServer service status every 60 seconds.</p>
<p id="ALM-16004__en-us_topic_0070543661_p26423431">This alarm is cleared when the HiveServer service is normal.</p>
<div class="note" id="ALM-16004__en-us_topic_0070543661_note36484292"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-16004__p10300195719420">MRS 3.x supports the multi-instance function. If the multi-instance function is enabled in the cluster and multiple Hive service instances are installed, use the value of <strong id="ALM-16004__en-us_topic_0070543661_b2437642">ServiceName</strong> in <strong id="ALM-16004__en-us_topic_0070543661_b21938782">Location</strong> to determine which Hive service instance generated the alarm. For example, if the Hive1 service is unavailable, <strong id="ALM-16004__en-us_topic_0070543661_b32210908">ServiceName=Hive1</strong> is displayed in <strong id="ALM-16004__en-us_topic_0070543661_b21462720">Location</strong>, and the operations in the procedure must be performed on Hive1 instead of Hive.</p>
</div></div>
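<p>Reading the affected instance from <strong>Location</strong> can be sketched as follows. This is an illustration only, not FusionInsight Manager code: it assumes that Location is available as a string of <code>key=value</code> pairs separated by semicolons, and the function name <code>service_from_location</code> is hypothetical.</p>

```python
def service_from_location(location: str, default: str = "Hive") -> str:
    """Return the ServiceName value found in an alarm Location string.

    Assumption: Location looks like "ServiceName=Hive1;RoleName=HiveServer".
    The separator and layout are illustrative, not taken from the product.
    """
    for part in location.split(";"):
        key, sep, value = part.strip().partition("=")
        if sep and key == "ServiceName" and value:
            return value
    # No ServiceName entry: fall back to the single-instance service name.
    return default


print(service_from_location("ServiceName=Hive1;RoleName=HiveServer"))  # prints Hive1
```

<p>The returned name is the operation object to use in the procedure in place of Hive.</p>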
</div>
<div class="section" id="ALM-16004__scde6e2ab4018442d8000740f6037d21b"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-16004__en-us_topic_0070543661_table60758784" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-16004__en-us_topic_0070543661_row10953395"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-16004__en-us_topic_0070543661_p14809797">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-16004__en-us_topic_0070543661_p58742895">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-16004__en-us_topic_0070543661_p60554074">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-16004__en-us_topic_0070543661_row5932974"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-16004__en-us_topic_0070543661_p10808849">16004</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-16004__en-us_topic_0070543661_p3101614">Critical</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-16004__en-us_topic_0070543661_p49904171">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-16004__s9e227f1798ee446a9bc44bc841034fd0"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-16004__en-us_topic_0070543661_table15706032" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-16004__en-us_topic_0070543661_row48131740"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-16004__en-us_topic_0070543661_p6356899">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-16004__en-us_topic_0070543661_p45146773">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-16004__row149546511273"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16004__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16004__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16004__en-us_topic_0070543661_row33010012"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16004__en-us_topic_0070543661_p56565344">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16004__en-us_topic_0070543661_p18390152">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16004__en-us_topic_0070543661_row31293645"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16004__en-us_topic_0070543661_p51757351">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16004__en-us_topic_0070543661_p31595930">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16004__en-us_topic_0070543661_row15927916"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16004__en-us_topic_0070543661_p15092790">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16004__en-us_topic_0070543661_p14556460">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-16004__s11430147cc2a43d2aa03c1608269a347"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-16004__en-us_topic_0070543661_p38222648">The system cannot provide data loading, query, and extraction services.</p>
</div>
<div class="section" id="ALM-16004__s1a6c405a93014cc28359219712e19af2"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-16004__en-us_topic_0070543661_ul9026751"><li id="ALM-16004__en-us_topic_0070543661_li14131901">The Hive service may be unavailable because of a fault in the Hive process or in one of the basic services it depends on, such as ZooKeeper, Hadoop Distributed File System (HDFS), Yarn, and DBService.<ul id="ALM-16004__en-us_topic_0070543661_ul60078251"><li id="ALM-16004__en-us_topic_0070543661_li3833355">The ZooKeeper service is abnormal.</li><li id="ALM-16004__en-us_topic_0070543661_li34500196">The HDFS service is abnormal.</li><li id="ALM-16004__en-us_topic_0070543661_li42066313">The Yarn service is abnormal.</li><li id="ALM-16004__en-us_topic_0070543661_li43052497">The DBService service is abnormal.</li><li id="ALM-16004__en-us_topic_0070543661_li51928159">The Hive service process is abnormal. If the alarm is caused by a Hive process fault, it is reported with a delay of about 5 minutes.</li></ul>
</li><li id="ALM-16004__en-us_topic_0070543661_li64700251">Network communication between Hive and the basic services is interrupted.</li></ul>
</div>
<div class="section" id="ALM-16004__see23f4d284e44eda86f3b04d76728c9a"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-16004__en-us_topic_0070543661_p6228969"><strong id="ALM-16004__b1585974214158">Check the HiveServer/MetaStore process status.</strong></p>
<ol id="ALM-16004__ol53219730141514"><li id="ALM-16004__li7610594141457"><span>On the FusionInsight Manager portal, click <strong id="ALM-16004__b103621040173617">Cluster > </strong><em id="ALM-16004__i14386340113610">Name of the desired cluster</em><strong id="ALM-16004__b1836354033618"> > Services</strong> > <strong id="ALM-16004__b17727898141457">Hive</strong> > <strong id="ALM-16004__b25333360141457">Instance</strong>. In the Hive instance list, check whether the HiveServer or MetaStore instances are in the Unknown state.</span><p><ul class="subitemlist" id="ALM-16004__ul53041404141457"><li id="ALM-16004__li38736302141457">If yes, go to <a href="#ALM-16004__li45196532141457">2</a>.</li><li id="ALM-16004__li50632732141457">If no, go to <a href="#ALM-16004__li31923589141457">4</a>.</li></ul>
</p></li><li id="ALM-16004__li45196532141457"><a name="ALM-16004__li45196532141457"></a><a name="li45196532141457"></a><span>In the Hive instance list, choose <strong id="ALM-16004__b1386486141457">More</strong> > <strong id="ALM-16004__b12478377141457">Restart Instance</strong> to restart the HiveServer/MetaStore process.</span></li><li id="ALM-16004__li58269034141457"><span>In the alarm list, check whether <strong id="ALM-16004__b4115608141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul24746001141457"><li id="ALM-16004__li37040479141457">If yes, no further action is required.</li><li id="ALM-16004__li47488798141457">If no, go to <a href="#ALM-16004__li31923589141457">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-16004__p21387418141457"><strong id="ALM-16004__b50242462141524">Check the ZooKeeper service status.</strong></p>
<ol start="4" id="ALM-16004__ol56280432141542"><li id="ALM-16004__li31923589141457"><a name="ALM-16004__li31923589141457"></a><a name="li31923589141457"></a><span>On FusionInsight Manager, check whether the alarm list contains <strong id="ALM-16004__b1773258125310">Process Fault</strong>.</span><p><ul class="subitemlist" id="ALM-16004__ul40829767141457"><li id="ALM-16004__li65323750141457">If yes, go to <a href="#ALM-16004__li58014365141457">5</a>.</li><li id="ALM-16004__li56732423141457">If no, go to <a href="#ALM-16004__li41412512141457">8</a>.</li></ul>
</p></li><li id="ALM-16004__li58014365141457"><a name="ALM-16004__li58014365141457"></a><a name="li58014365141457"></a><span>In the <strong id="ALM-16004__b19943181885510">Process Fault</strong> alarm, check whether <strong id="ALM-16004__b52629894141457">ServiceName</strong> is <strong id="ALM-16004__b46462361672">ZooKeeper</strong>.</span><p><ul class="subitemlist" id="ALM-16004__ul51185283141457"><li id="ALM-16004__li48031417141457">If yes, go to <a href="#ALM-16004__li1543118141457">6</a>.</li><li id="ALM-16004__li65339577141457">If no, go to <a href="#ALM-16004__li41412512141457">8</a>.</li></ul>
</p></li><li id="ALM-16004__li1543118141457"><a name="ALM-16004__li1543118141457"></a><a name="li1543118141457"></a><span>Rectify the fault by following the steps provided in "ALM-12007 Process Fault".</span></li><li id="ALM-16004__li53002899141457"><span>In the alarm list, check whether <strong id="ALM-16004__b13888062141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul52850139141457"><li id="ALM-16004__li57883699141457">If yes, no further action is required.</li><li id="ALM-16004__li58068020141457">If no, go to <a href="#ALM-16004__li41412512141457">8</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-16004__p5889211141457"><strong id="ALM-16004__b3864586141552">Check the HDFS service status.</strong></p>
<ol start="8" id="ALM-16004__ol53298187141614"><li id="ALM-16004__li41412512141457"><a name="ALM-16004__li41412512141457"></a><a name="li41412512141457"></a><span>On FusionInsight Manager, check whether the alarm list contains <strong id="ALM-16004__b269313387551">HDFS Service Unavailable</strong>.</span><p><ul class="subitemlist" id="ALM-16004__ul41884092141457"><li id="ALM-16004__li51516788141457">If yes, go to <a href="#ALM-16004__li66079189141457">9</a>.</li><li id="ALM-16004__li12110328141457">If no, go to <a href="#ALM-16004__li26828739141457">11</a>.</li></ul>
</p></li><li id="ALM-16004__li66079189141457"><a name="ALM-16004__li66079189141457"></a><a name="li66079189141457"></a><span>Rectify the fault by following the steps provided in "ALM-14000 HDFS Service Unavailable".</span></li><li id="ALM-16004__li16101312141457"><span>In the alarm list, check whether <strong id="ALM-16004__b57841791141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul66479141141457"><li id="ALM-16004__li50814078141457">If yes, no further action is required.</li><li id="ALM-16004__li22299652141457">If no, go to <a href="#ALM-16004__li26828739141457">11</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-16004__p61441358141457"><strong id="ALM-16004__b16955429141627">Check the Yarn service status.</strong></p>
<ol start="11" id="ALM-16004__ol46887731141635"><li id="ALM-16004__li26828739141457"><a name="ALM-16004__li26828739141457"></a><a name="li26828739141457"></a><span>In the FusionInsight Manager alarm list, check whether <strong id="ALM-16004__b1281858195516">Yarn Service Unavailable</strong> has been generated.</span><p><ul class="subitemlist" id="ALM-16004__ul47720213141457"><li id="ALM-16004__li60914512141457">If yes, go to <a href="#ALM-16004__li25644284141457">12</a>.</li><li id="ALM-16004__li35128407141457">If no, go to <a href="#ALM-16004__li53539591141457">14</a>.</li></ul>
</p></li><li id="ALM-16004__li25644284141457"><a name="ALM-16004__li25644284141457"></a><a name="li25644284141457"></a><span>Rectify the fault. For details, see "ALM-18000 Yarn Service Unavailable".</span></li><li id="ALM-16004__li7842061141457"><span>In the alarm list, check whether <strong id="ALM-16004__b29471969141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul24951950141457"><li id="ALM-16004__li63921129141457">If yes, no further action is required.</li><li id="ALM-16004__li10228979141457">If no, go to <a href="#ALM-16004__li53539591141457">14</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-16004__p23240961141457"><strong id="ALM-16004__b23952575141647">Check the DBService service status.</strong></p>
<ol start="14" id="ALM-16004__ol37731740141656"><li id="ALM-16004__li53539591141457"><a name="ALM-16004__li53539591141457"></a><a name="li53539591141457"></a><span>In the FusionInsight Manager alarm list, check whether <strong id="ALM-16004__b14372174110569">DBService Service Unavailable</strong> has been generated.</span><p><ul class="subitemlist" id="ALM-16004__ul65601167141457"><li id="ALM-16004__li12609611141457">If yes, go to <a href="#ALM-16004__li41739587141457">15</a>.</li><li id="ALM-16004__li14745559141457">If no, go to <a href="#ALM-16004__li44837990141457">17</a>.</li></ul>
</p></li><li id="ALM-16004__li41739587141457"><a name="ALM-16004__li41739587141457"></a><a name="li41739587141457"></a><span>Rectify the fault. For details, see "ALM-27001 DBService Service Unavailable".</span></li><li id="ALM-16004__li14974587141457"><span>In the alarm list, check whether <strong id="ALM-16004__b40111970141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul40781591141457"><li id="ALM-16004__li25463412141457">If yes, no further action is required.</li><li id="ALM-16004__li49270530141457">If no, go to <a href="#ALM-16004__li44837990141457">17</a>.</li></ul>
</p></li></ol>
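<p>The order in which steps 4 to 16 check the basic services can be summarized in a short sketch. The alarm names and reference topics are the ones given in the steps above; the function name <code>first_dependency_fault</code> and the shape of the input (a list of dictionaries with <code>name</code> and <code>ServiceName</code> keys) are hypothetical and do not represent a FusionInsight Manager API.</p>

```python
# (alarm name, required ServiceName or None, reference topic), in procedure order.
# The ServiceName condition applies only to Process Fault, as in step 5.
CHECKS = [
    ("Process Fault", "ZooKeeper", "ALM-12007 Process Fault"),
    ("HDFS Service Unavailable", None, "ALM-14000 HDFS Service Unavailable"),
    ("Yarn Service Unavailable", None, "ALM-18000 Yarn Service Unavailable"),
    ("DBService Service Unavailable", None, "ALM-27001 DBService Service Unavailable"),
]


def first_dependency_fault(active_alarms):
    """Return the first reference topic to follow, or None if no check matches.

    active_alarms is a hypothetical list of dicts such as
    {"name": "Process Fault", "ServiceName": "ZooKeeper"}.
    """
    for alarm_name, required_service, topic in CHECKS:
        for alarm in active_alarms:
            if alarm.get("name") != alarm_name:
                continue
            if required_service is None or alarm.get("ServiceName") == required_service:
                return topic
    return None


alarms = [
    {"name": "Process Fault", "ServiceName": "Kafka"},
    {"name": "Yarn Service Unavailable"},
]
print(first_dependency_fault(alarms))  # prints ALM-18000 Yarn Service Unavailable
```

<p>A result of <code>None</code> corresponds to reaching step 17, where the network connection is checked.</p>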
<p class="tableheading" id="ALM-16004__p31490004141457"><strong id="ALM-16004__b918677014178">Check the network connection between Hive and ZooKeeper, HDFS, Yarn, and DBService.</strong></p>
<ol start="17" id="ALM-16004__ol49936165141726"><li id="ALM-16004__li44837990141457"><a name="ALM-16004__li44837990141457"></a><a name="li44837990141457"></a><span>On FusionInsight Manager, choose <strong id="ALM-16004__b198811471382">Cluster > </strong><em id="ALM-16004__i6882076389">Name of the desired cluster</em><strong id="ALM-16004__b1588115716381"> > Services</strong> > <strong id="ALM-16004__b4981998141457">Hive</strong>.</span></li><li id="ALM-16004__li4878376141457"><span>Click <strong id="ALM-16004__b888731141457">Instance</strong>.</span><p><p class="litext" id="ALM-16004__p7998582141457">The HiveServer instance list is displayed.</p>
</p></li><li id="ALM-16004__li63207427141457"><span>Click <strong id="ALM-16004__b43905386141457">Host Name</strong> in the row of <strong id="ALM-16004__b59604157141457">HiveServer</strong>.</span><p><p class="litext" id="ALM-16004__p66675371141457">The active HiveServer host status page is displayed.</p>
</p></li><li id="ALM-16004__li19527969141457"><a name="ALM-16004__li19527969141457"></a><a name="li19527969141457"></a><span>Record the IP address under <strong id="ALM-16004__b188482023141917">Basic Information</strong>.</span></li><li id="ALM-16004__li12189704141457"><span>Use the IP address obtained in <a href="#ALM-16004__li19527969141457">20</a> to log in to the host where the active HiveServer runs as user <strong id="ALM-16004__b41534001141457">omm</strong>.</span></li></ol><ol start="22" id="ALM-16004__ol7074128141759"><li id="ALM-16004__li4696813141457"><span>Run the <strong id="ALM-16004__b42598474141457">ping</strong> command to check whether communication between the host that runs the active HiveServer and the hosts that run the ZooKeeper, HDFS, Yarn, and DBService services is normal. (Obtain the IP addresses of the hosts that run the ZooKeeper, HDFS, Yarn, and DBService services in the same way as that for obtaining the IP address of the active HiveServer.)</span><p><ul class="subitemlist" id="ALM-16004__ul22891489141457"><li id="ALM-16004__li27924385141457">If yes, go to <a href="#ALM-16004__li18695793141457">25</a>.</li><li id="ALM-16004__li47282741141457">If no, go to <a href="#ALM-16004__li42271322141457">23</a>.</li></ul>
</p></li><li id="ALM-16004__li42271322141457"><a name="ALM-16004__li42271322141457"></a><a name="li42271322141457"></a><span>Contact the administrator to restore the network.</span></li><li id="ALM-16004__li62899931141457"><span>In the alarm list, check whether <strong id="ALM-16004__b44897586141457">Hive Service Unavailable</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-16004__ul32259713141457"><li id="ALM-16004__li1425092141457">If yes, no further action is required.</li><li id="ALM-16004__li48323655141457">If no, go to <a href="#ALM-16004__li18695793141457">25</a>.</li></ul>
</p></li></ol>
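<p>The <strong>ping</strong> check in step 22 can be repeated for several hosts with a short script run on the active HiveServer host. This is a sketch under stated assumptions: the host runs Linux with the iputils <code>ping</code> command (options <code>-c</code> and <code>-W</code>), and the function names are hypothetical.</p>

```python
import subprocess


def build_ping_command(host: str, count: int = 3, timeout_s: int = 2) -> list:
    # Assumes Linux iputils ping: -c is the number of probes,
    # -W is the time in seconds to wait for each reply.
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def unreachable_hosts(hosts, run=subprocess.run):
    """Return the hosts for which ping exited with a non-zero status.

    The run parameter defaults to subprocess.run and can be replaced
    with a stub so that the logic can be checked without a network.
    """
    failed = []
    for host in hosts:
        result = run(
            build_ping_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed.append(host)
    return failed
```

<p>Call <code>unreachable_hosts</code> with the IP addresses recorded for the hosts that run ZooKeeper, HDFS, Yarn, and DBService. An empty list corresponds to the "If yes" branch of step 22. Any returned address corresponds to the "If no" branch.</p>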
<p class="tableheading" id="ALM-16004__p21901962141457"><strong id="ALM-16004__b62403998141811">Collect fault information.</strong></p>
<ol start="25" id="ALM-16004__ol15675769141823"><li id="ALM-16004__li18695793141457"><a name="ALM-16004__li18695793141457"></a><a name="li18695793141457"></a><span>On FusionInsight Manager, choose <strong id="ALM-16004__b39977366113627">O&M</strong> > <strong id="ALM-16004__b24251979113627">Log > Download</strong>.</span></li><li id="ALM-16004__li65556694141457"><span>In <strong id="ALM-16004__b34044413141457">Service</strong>, select the following services for the required cluster:</span><p><ul class="subitemlist" id="ALM-16004__ul22197158141457"><li id="ALM-16004__li6134036141457">ZooKeeper</li><li id="ALM-16004__li55206330141457">HDFS</li><li id="ALM-16004__li27094928141457">Yarn</li><li id="ALM-16004__li42527768141457">DBService</li><li id="ALM-16004__li47205593141457">Hive</li></ul>
</p></li><li id="ALM-16004__li1145664103113"><span>Click <span><img id="ALM-16004__image1945644173117" src="en-us_image_0269417380.png"></span> in the upper right corner, and set <strong id="ALM-16004__b6456941173117">Start Date</strong> and <strong id="ALM-16004__b11456154113318">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-16004__b13456164113319">Download</strong>.</span></li><li id="ALM-16004__li770493295814"><span>Contact <span id="ALM-16004__text4614151421417">O&M personnel</span> and send them the collected logs.</span></li></ol>
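<p>The log collection window in step 27 is simple arithmetic on the alarm generation time. A minimal sketch follows. The function name <code>log_window</code> and the sample timestamp are hypothetical.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes on each side of the alarm time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generated at 09:30 on 15 January 2024.
start, end = log_window(datetime(2024, 1, 15, 9, 30))
print(start, "to", end)  # prints 2024-01-15 09:20:00 to 2024-01-15 09:40:00
```

<p>Enter the two values as <strong>Start Date</strong> and <strong>End Date</strong>.</p>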
</div>
<div class="section" id="ALM-16004__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-16004__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-16004__s6465779a2c6641d3914e291d73ef6b38"><h4 class="sectiontitle">Related Information</h4><p id="ALM-16004__en-us_topic_0070543661_p14153363">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>