Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="ALM-16006"></a><a name="ALM-16006"></a>
<h1 class="topictitle1">ALM-16006 The Direct Memory Usage of the Hive Process Exceeds the Threshold</h1>
<div id="body17462133"><div class="section" id="ALM-16006__sc94429378c7f4b6aa34128a93d6b055d"><h4 class="sectiontitle">Description</h4><p id="ALM-16006__en-us_topic_0070543663_p31453894">The system checks the Hive service status every 30 seconds. This alarm is generated when the direct memory usage of a Hive service exceeds the threshold (95% of the maximum direct memory by default).</p>
<p id="ALM-16006__en-us_topic_0070543663_p14649592">Users can choose <strong id="ALM-16006__b78171431184518"><strong id="ALM-16006__b9817731104514">O&M > Alarm > Thresholds ></strong></strong> <em id="ALM-16006__i88201231164519">Name of the desired cluster</em> <strong id="ALM-16006__b681993113452"><strong id="ALM-16006__b14819133118456">> Hive</strong></strong> to change the threshold.</p>
<p id="ALM-16006__en-us_topic_0070543663_p10516848">The alarm is cleared when the direct memory usage is less than or equal to the threshold.</p>
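<p>The generation and clearing conditions described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the way FusionInsight Manager implements the check; the function name <code>direct_memory_alarm_active</code> and the use of MB as the unit are assumptions made for this example.</p>

```python
def direct_memory_alarm_active(used_mb, max_mb, threshold_pct=95):
    """Return True if the alarm condition holds for one 30-second check.

    Hypothetical helper for illustration only. The alarm is active when
    usage exceeds the threshold and cleared when usage is less than or
    equal to the threshold.
    """
    if max_mb <= 0:
        raise ValueError("max_mb must be positive")
    # Compare without dividing so that the boundary case
    # (usage exactly equal to the threshold) is evaluated exactly.
    return used_mb * 100 > threshold_pct * max_mb


if __name__ == "__main__":
    # 980 MB used out of 1024 MB is above 95%, so the alarm is active.
    print(direct_memory_alarm_active(980, 1024))
    # Exactly 95% does not exceed the threshold, so the alarm is cleared.
    print(direct_memory_alarm_active(950, 1000))
```

<p>Note that a usage value exactly equal to the threshold does not trigger the alarm, which matches the clearing condition stated above.</p>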
</div>
<div class="section" id="ALM-16006__s7b79fbece59141689cce2accd1e531ce"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-16006__en-us_topic_0070543663_table46558337" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-16006__en-us_topic_0070543663_row38591729"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-16006__en-us_topic_0070543663_p38922373">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-16006__en-us_topic_0070543663_p65704509">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-16006__en-us_topic_0070543663_p20465026">Automatically Cleared</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-16006__en-us_topic_0070543663_row47054395"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-16006__en-us_topic_0070543663_p53309649">16006</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-16006__en-us_topic_0070543663_p23114283">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-16006__en-us_topic_0070543663_p60317617">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-16006__s0667798c2c07490ebcac9951416a4483"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-16006__en-us_topic_0070543663_table53888845" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-16006__en-us_topic_0070543663_row48518632"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-16006__en-us_topic_0070543663_p37695094">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-16006__en-us_topic_0070543663_p33403740">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-16006__row123354162717"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16006__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16006__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16006__en-us_topic_0070543663_row21348419"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16006__en-us_topic_0070543663_p51500370">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16006__en-us_topic_0070543663_p10780466">Specifies the service name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16006__en-us_topic_0070543663_row29915336"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16006__en-us_topic_0070543663_p7223140">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16006__en-us_topic_0070543663_p48203428">Specifies the role name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16006__en-us_topic_0070543663_row31177669"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16006__en-us_topic_0070543663_p42363255">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16006__en-us_topic_0070543663_p8871617">Specifies the object (host ID) for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-16006__row87321750165319"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-16006__p4733195035315">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-16006__p1373310504532">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-16006__s3b9a348491844ab7966f4a2e5830ba2d"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-16006__en-us_topic_0070543663_p47512396">When the direct memory usage of Hive is too high, the performance of Hive tasks is affected. In addition, a memory overflow may occur, making the Hive service unavailable.</p>
</div>
<div class="section" id="ALM-16006__sc1220d9763434d1696d2640dfe1860a4"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-16006__en-us_topic_0070543663_p23298880">The direct memory of the Hive instance on the node is overused or the direct memory is inappropriately allocated. As a result, the usage exceeds the threshold.</p>
</div>
<div class="section" id="ALM-16006__s6011eea13fc74f13b8f8bd2784a1e87b"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-16006__en-us_topic_0070543663_p8161091"><strong id="ALM-16006__b16044232143424">Check direct memory usage.</strong></p>
<ol id="ALM-16006__ol23768507143434"><li id="ALM-16006__li54530635143419"><span>On the FusionInsight Manager portal, click <strong id="ALM-16006__b28662750155624">O&M > Alarm > Alarms</strong> and select the alarm whose <strong id="ALM-16006__b30872886143419">Alarm ID</strong> is <strong id="ALM-16006__b9420526143419">16006</strong>. Then check the role name in <strong id="ALM-16006__b14790172183618">Location</strong> and confirm the IP address of the instance.</span><p><ul class="subitemlist" id="ALM-16006__ul65711283143419"><li id="ALM-16006__li22459760143419">If the role for which the alarm is generated is HiveServer, go to <a href="#ALM-16006__li31510133143419">2</a>.</li><li id="ALM-16006__li7301253143419">If the role for which the alarm is generated is MetaStore, go to <a href="#ALM-16006__li39131309143419">3</a>.</li></ul>
</p></li><li id="ALM-16006__li31510133143419"><a name="ALM-16006__li31510133143419"></a><a name="li31510133143419"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-16006__b18292121334716"><strong id="ALM-16006__b1629214132476">Cluster</strong></strong> > <em id="ALM-16006__i142971413184717">Name of the desired cluster</em> > <strong id="ALM-16006__b14294151317477"><strong id="ALM-16006__b1294191313474">Services</strong> > <strong id="ALM-16006__b829461324713">Hive</strong> > <strong id="ALM-16006__b62944130472">Instance</strong></strong> and click the HiveServer instance for which the alarm is generated to go to the <strong id="ALM-16006__b14303164441516">Dashboard</strong> page. Click the drop-down menu in the <strong id="ALM-16006__b880413313516">Chart</strong> area and choose <strong id="ALM-16006__b173086361652">Customize</strong> > <strong id="ALM-16006__b15702441192211">CPU and Memory</strong>. Select <strong id="ALM-16006__b60544464143419">HiveServer Memory Usage Statistics</strong> and click <strong id="ALM-16006__b8029266143419">OK</strong>. Then check whether the used direct memory of the HiveServer service reaches the threshold (default value: 95%) of the maximum direct memory specified for HiveServer.</span><p><ul class="subitemlist" id="ALM-16006__ul63153449143419"><li id="ALM-16006__li46390825143419">If yes, go to <a href="#ALM-16006__li4911009143419">4</a>.</li><li id="ALM-16006__li66669373143419">If no, go to <a href="#ALM-16006__li32472303143419">7</a>.</li></ul>
</p></li><li id="ALM-16006__li39131309143419"><a name="ALM-16006__li39131309143419"></a><a name="li39131309143419"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-16006__b23611818478">Cluster</strong> > <em id="ALM-16006__i12364010472">Name of the desired cluster</em> > <strong id="ALM-16006__b123623115473">Services</strong> > <strong id="ALM-16006__b698122119566">Hive</strong> > <strong id="ALM-16006__b11983721125619">Instance</strong> and click the MetaStore instance for which the alarm is generated to go to the <strong id="ALM-16006__b56351250171618">Dashboard</strong> page. Click the drop-down menu in the <strong id="ALM-16006__b9966756057">Chart</strong> area and choose <strong id="ALM-16006__b996617562515">Customize</strong> > <strong id="ALM-16006__b1258719494918">CPU and Memory</strong>. Select <strong id="ALM-16006__b34998665143419">MetaStore Memory Usage Statistics</strong> and click <strong id="ALM-16006__b46552530143419">OK</strong>. Then check whether the used direct memory of the MetaStore service reaches the threshold (default value: 95%) of the maximum direct memory specified for MetaStore.</span><p><ul class="subitemlist" id="ALM-16006__ul34174085143419"><li id="ALM-16006__li12658557143419">If yes, go to <a href="#ALM-16006__li4911009143419">4</a>.</li><li id="ALM-16006__li18710201143419">If no, go to <a href="#ALM-16006__li32472303143419">7</a>.</li></ul>
</p></li><li id="ALM-16006__li4911009143419"><a name="ALM-16006__li4911009143419"></a><a name="li4911009143419"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-16006__b730753764818">Cluster</strong> > <em id="ALM-16006__i27321640194812">Name of the desired cluster</em> > <strong id="ALM-16006__b133085378489">Services</strong> > <strong id="ALM-16006__b40561099143018">Hive</strong> > <strong id="ALM-16006__b29505572143018">Configurations > All Configurations</strong>. Choose <strong id="ALM-16006__b19232238143419">HiveServer/MetaStore</strong> > <strong id="ALM-16006__b38872414143419">JVM</strong>. Adjust the value of <strong id="ALM-16006__b14307412143419">-XX:MaxDirectMemorySize</strong> in <strong id="ALM-16006__b61657848143419">HIVE_GC_OPTS/METASTORE_GC_OPTS</strong> according to the following rules. Then click <strong id="ALM-16006__b18049723143419">Save</strong>.</span><p><div class="note" id="ALM-16006__note638551412512"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><div class="p" id="ALM-16006__p863882712519">Suggestions for GC parameter settings for HiveServer:<ul id="ALM-16006__ul15827113432817"><li id="ALM-16006__li188271534122816">It is recommended that you set the value of <strong id="ALM-16006__b664011715320">-XX:MaxDirectMemorySize</strong> to 1/8 of the value of <strong id="ALM-16006__b136401174324">-Xmx</strong>. For example, if <strong id="ALM-16006__b964015715323">-Xmx</strong> is set to 8 GB, set <strong id="ALM-16006__b964017713216">-XX:MaxDirectMemorySize</strong> to 1024 MB. If <strong id="ALM-16006__b864011719328">-Xmx</strong> is set to 4 GB, set <strong id="ALM-16006__b4640147173219">-XX:MaxDirectMemorySize</strong> to 512 MB. It is recommended that the value of <strong id="ALM-16006__b764037133217">-XX:MaxDirectMemorySize</strong> be greater than or equal to 512 MB.</li></ul>
</div>
<div class="p" id="ALM-16006__p141314122620">Suggestions for GC parameter settings for MetaStore:<ul id="ALM-16006__ul13161155662810"><li id="ALM-16006__li19161165672820">It is recommended that you set the value of <strong id="ALM-16006__b1370018213329">-XX:MaxDirectMemorySize</strong> to 1/8 of the value of <strong id="ALM-16006__b1170013212325">-Xmx</strong>. For example, if <strong id="ALM-16006__b170012216329">-Xmx</strong> is set to 8 GB, set <strong id="ALM-16006__b57001921113211">-XX:MaxDirectMemorySize</strong> to 1024 MB. If <strong id="ALM-16006__b137001821163211">-Xmx</strong> is set to 4 GB, set <strong id="ALM-16006__b12700621163211">-XX:MaxDirectMemorySize</strong> to 512 MB. It is recommended that the value of <strong id="ALM-16006__b47001721123214">-XX:MaxDirectMemorySize</strong> be greater than or equal to 512 MB.</li></ul>
</div>
</div></div>
</p></li><li id="ALM-16006__li7977646317"><span>Click <strong id="ALM-16006__b193941191220">More > Restart Service </strong>to restart the service.</span></li><li id="ALM-16006__li47913692143419"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-16006__ul8876572143419"><li id="ALM-16006__li44199082143419">If yes, no further action is required.</li><li id="ALM-16006__li23355907143419">If no, go to <a href="#ALM-16006__li32472303143419">7</a>.</li></ul>
</p></li></ol>
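<p>The sizing suggestion in step 4 (1/8 of <strong>-Xmx</strong>, with a recommended minimum of 512 MB) can be worked out as in the following Python sketch. The helper names <code>recommended_max_direct_memory_mb</code> and <code>max_direct_memory_flag</code> are hypothetical and exist only for this example; treat the result as a starting point and adjust it for the actual workload.</p>

```python
MIN_DIRECT_MEMORY_MB = 512  # recommended lower bound from the note in step 4


def recommended_max_direct_memory_mb(xmx_mb):
    """Return 1/8 of -Xmx in MB, but never less than 512 MB.

    Hypothetical helper for illustration only.
    """
    if xmx_mb <= 0:
        raise ValueError("xmx_mb must be positive")
    return max(xmx_mb // 8, MIN_DIRECT_MEMORY_MB)


def max_direct_memory_flag(xmx_mb):
    """Format the JVM option to place in HIVE_GC_OPTS or METASTORE_GC_OPTS."""
    return f"-XX:MaxDirectMemorySize={recommended_max_direct_memory_mb(xmx_mb)}M"


if __name__ == "__main__":
    for xmx_gb in (8, 4, 2):
        print(f"-Xmx{xmx_gb}G ->", max_direct_memory_flag(xmx_gb * 1024))
```

<p>For an <strong>-Xmx</strong> value below 4 GB, 1/8 of the heap would fall under 512 MB, so the sketch returns the recommended minimum instead.</p>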
<p class="tableheading" id="ALM-16006__p12780284143419"><strong id="ALM-16006__b28956683143450">Collect fault information.</strong></p>
<ol start="7" id="ALM-16006__ol18414172143455"><li id="ALM-16006__li32472303143419"><a name="ALM-16006__li32472303143419"></a><a name="li32472303143419"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-16006__b39977366113627">O&M</strong> > <strong id="ALM-16006__b24251979113627">Log > Download</strong>.</span></li><li id="ALM-16006__li22577129143419"><span>Select <strong id="ALM-16006__b1536613131126">Hive</strong> in the required cluster for <strong id="ALM-16006__b23815273143419">Service</strong>.</span></li><li id="ALM-16006__li1145664103113"><span>Click <span><img id="ALM-16006__image1945644173117" src="en-us_image_0269417382.png"></span> in the upper right corner, and set <strong id="ALM-16006__b6456941173117">Start Date</strong> and <strong id="ALM-16006__b11456154113318">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then click <strong id="ALM-16006__b13456164113319">Download</strong>.</span></li><li id="ALM-16006__li39353208143419"><span>Contact <span id="ALM-16006__text4614151421417">O&M personnel</span> and send them the collected fault logs.</span></li></ol>
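<p>The log collection window in step 9 is simply the alarm generation time plus and minus 10 minutes. The following Python sketch shows the arithmetic; the function name <code>log_collection_window</code> and the sample timestamp are assumptions for illustration and are not taken from any real alarm.</p>

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time, margin_minutes=10):
    """Return (start, end) for log collection around an alarm.

    Hypothetical helper for illustration only: the start is
    margin_minutes before the alarm generation time and the end is
    margin_minutes after it.
    """
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Sample alarm generation time, made up for this example.
    generated_at = datetime(2024, 1, 1, 12, 0, 0)
    start, end = log_collection_window(generated_at)
    print("Start Date:", start)
    print("End Date:  ", end)
```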
</div>
<div class="section" id="ALM-16006__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-16006__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-16006__s9235bada32d04091b094d13ba7a38c2b"><h4 class="sectiontitle">Related Information</h4><p id="ALM-16006__en-us_topic_0070543663_p51048117">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>