doc-exports/docs/mrs/umn/ALM-43008.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-43008"></a><a name="ALM-43008"></a>
<h1 class="topictitle1">ALM-43008 The Direct Memory Usage of the JobHistory2x Process Exceeds the Threshold</h1>
<div id="body8662426"><div class="section" id="ALM-43008__s93d73d2856d343bfaf728e145ab021a1"><h4 class="sectiontitle">Description</h4><p id="ALM-43008__a5796a802465749ad8cf05a485c87e940">The system checks the status of the JobHistory2x process every 30 seconds. This alarm is generated when the direct memory usage of a JobHistory2x process exceeds the threshold (95% of the maximum direct memory by default).</p>
</div>
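<p>The threshold check described above can be sketched as follows. This is only an illustration of the arithmetic, not the monitoring code used by FusionInsight Manager; the function names and the sample values are hypothetical, and the 95% default is taken from this topic.</p>

```python
def direct_memory_usage_percent(used_bytes: int, max_bytes: int) -> float:
    """Return used direct memory as a percentage of the configured maximum."""
    if not max_bytes > 0:
        raise ValueError("max_bytes must be positive")
    return 100.0 * used_bytes / max_bytes


def alarm_would_fire(used_bytes: int, max_bytes: int,
                     threshold_percent: float = 95.0) -> bool:
    """True when usage is strictly above the threshold ("exceeds" in the text)."""
    return direct_memory_usage_percent(used_bytes, max_bytes) > threshold_percent


if __name__ == "__main__":
    # Hypothetical sample: 512 MB maximum direct memory, 490 MB in use.
    max_direct = 512 * 1024 * 1024
    used = 490 * 1024 * 1024
    print(round(direct_memory_usage_percent(used, max_direct), 1))
    print(alarm_would_fire(used, max_direct))
```

<p>With the sample values, usage is about 95.7% of the maximum, which is above the default threshold.</p>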
<div class="section" id="ALM-43008__s152d5bb9469b4a1e8b60f47907c929c8"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-43008__t046e0772cb0741cea020d2553fd99b54" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-43008__r59946452423f4b8caf9e464e089809ea"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-43008__ab508117634994b9793b9d132723c12c2">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-43008__a13ac10d1167e4fb693db45774355951a">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-43008__a7b18d1147e3d4b77b6cc42923db2787c">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-43008__r62c6122473f44825b8c31935000edaeb"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-43008__a2641111c961748a5923380bf757bc9fc">43008</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-43008__a826ac6059f6442bab964d5f2ad2f68cc">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-43008__a007483882c00449ca7a992f134e7a8e9">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-43008__sf6c62ba518b148e49bc1c75577ea0ff6"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-43008__t77900a157a094fc289fe19f0d8a4f5dc" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-43008__rce70d18506f447c493def36b8ca5eb6a"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-43008__a3071368ea61047e9892f370f0cff486e">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-43008__a8570341d1487448894e2da743b8ece72">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-43008__row1719612716546"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-43008__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-43008__p692551319435">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-43008__r4f2f7f0c0a5a4eaf9caef359c7665462"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-43008__a2b4add4059714a61bc446ff5347c8151">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-43008__aa92f5ecdc93f45e28f69cef2035dc2ce">Specifies the service name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-43008__rd76fa6c7060d4eeda356735ca421ab0d"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-43008__a35a1ed2eb2be4ed28739e60dea60dc18">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-43008__a08e2273772f34b78b4859f266a45aa0c">Specifies the role name for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-43008__r222154fdc0494cffb4413bb824c47e1d"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-43008__adfbe6911ca7c49d78f2e45dc5f776385">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-43008__afddb243187634ff5a125aeabb688b048">Specifies the object (host ID) for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-43008__rb037ee72ac1444619dcc00682cf011ef"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-43008__acf6c5d0ef43540408a32381f48903024">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-43008__a0e8557df2a054d93b63851e660e865b1">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-43008__s556fc6164e494aa3a4b30e14c9ef346f"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-43008__ad062cfa22c5d4254a9e2ac8c3b74546f">If the direct memory available to the JobHistory2x process is insufficient, a memory overflow occurs and the service breaks down.</p>
</div>
<div class="section" id="ALM-43008__sa2928ebc352a48f0aa75a4cf61cb59e4"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-43008__a86b8c06fcca94d3c87eb96adca4d00d5">The direct memory of the JobHistory2x Process is overused or the direct memory is inappropriately allocated.</p>
</div>
<div class="section" id="ALM-43008__s75a0b022561c402eabc006b8a354b07d"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-43008__adc7551465013411b86e8c3dc66eb9ba8"><strong id="ALM-43008__b148907348310">Check direct memory usage.</strong></p>
<ol id="ALM-43008__ol531344203114"><li id="ALM-43008__li12313104210317"><span>On the FusionInsight Manager portal, choose <strong id="ALM-43008__b13206122711566">O&amp;M &gt; Alarm</strong><strong id="ALM-43008__b27872374104950"> &gt; Alarms</strong> and select the alarm whose <strong id="ALM-43008__a42fc45a85d464996b0a40048c0ea2d75">ID</strong> is <strong id="ALM-43008__a38e511aeecf24adc8891eaeecdaf2f58">43008</strong>. Check the <strong id="ALM-43008__b1955573445015">RoleName</strong> in <strong id="ALM-43008__b052583712505">Location</strong> and confirm the IP address of <strong id="ALM-43008__b1241513413507">HostName</strong>.</span></li><li id="ALM-43008__li1331318425316"><span>On the FusionInsight Manager portal, choose <strong id="ALM-43008__b193784291194">Cluster &gt; </strong><em id="ALM-43008__i17381129399">N</em><em id="ALM-43008__i538262918910">ame of the desired cluster</em><strong id="ALM-43008__b133795294913"> &gt; Services</strong> &gt; <strong id="ALM-43008__a2cec9e0cef3e474cbbe558fd3f28153e">Spark2x</strong> &gt; <strong id="ALM-43008__abfdb6ddc20cd43be946dda2c16cfe1c9">Instance</strong> and click the JobHistory2x instance for which the alarm is generated to go to the<strong id="ALM-43008__b14303164441516"> Dashboard </strong>page. Click the drop-down list in the upper right corner of the chart area, choose<strong id="ALM-43008__b17292123023312"> Customize</strong> &gt; <strong id="ALM-43008__b1598312518349">Memory</strong> &gt; <strong id="ALM-43008__a3378449ff7d74a56b1973f1e2d2ad413">JobHistory2x Memory Usage Statistics</strong>, and click <strong id="ALM-43008__a63bca1a4530b4a33a94ba0d4d7a0bb0a">OK</strong>. 
Check whether the used direct memory of the JobHistory2x process reaches the threshold (95% by default) of the maximum direct memory specified for JobHistory2x.</span><p><ul class="subitemlist" id="ALM-43008__u3d199806522b4554bbf715713a759b37"><li id="ALM-43008__le5db207e80dc4b2a857b84f321c74af9">If yes, go to <a href="#ALM-43008__li1385194210583">3</a>.</li><li id="ALM-43008__l59994ef153e040bba858607999833bbf">If no, go to <a href="#ALM-43008__li1088894514319">7</a>.</li></ul>
</p></li><li id="ALM-43008__li1385194210583"><a name="ALM-43008__li1385194210583"></a><a name="li1385194210583"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-43008__b338564214589">Cluster</strong> &gt; <em id="ALM-43008__i14385134265812">Name of the desired cluster</em> &gt; <strong id="ALM-43008__b9385242135815">Service</strong><strong id="ALM-43008__b9201638153513">s</strong> &gt; <strong id="ALM-43008__b11385134275818">Spark2x</strong> &gt; <strong id="ALM-43008__b438634218583">Instance</strong>. Click<strong id="ALM-43008__b338654215588"> </strong><strong id="ALM-43008__b938634215582">JobHistory2x </strong>for which the alarm is reported to go to the<strong id="ALM-43008__b032418519217"> Dashboard </strong>page. Click the drop-down list in the upper right corner of the chart area, choose <strong id="ALM-43008__b183861842165810">Customize</strong> &gt; <strong id="ALM-43008__b738654211586"><strong id="ALM-43008__b19386442145813">Memory </strong>&gt; </strong><strong id="ALM-43008__b1238454295811">Direct Memory of JobHistory2x</strong>, and click <strong id="ALM-43008__b88991336318">OK</strong>. Based on the alarm generation time, check the used direct memory of the JobHistory2x process in the corresponding period and obtain the maximum value.</span></li><li id="ALM-43008__li83131742163118"><span>On the FusionInsight Manager portal, choose <strong id="ALM-43008__b16524133819911">Cluster &gt; <em id="ALM-43008__i55248386910">N</em></strong><em id="ALM-43008__i852653813913">ame of the desired cluster</em> <strong id="ALM-43008__b115248381791">&gt; Services</strong> &gt; <strong id="ALM-43008__a70f2f821c9674cfeadecd3041d74ecce">Spark2x</strong> &gt; <strong id="ALM-43008__abc71228be8cb431db95ddc7c9b402b83">Configurations</strong>, and click <strong id="ALM-43008__aaf1d2077da134348b041cfca27039249">All Configurations</strong>. 
Choose <strong id="ALM-43008__a032b86d87e9f4299a06fe9b9803cb44f">JobHistory2x</strong> &gt; <strong id="ALM-43008__ab14c8248260b41f6891e84a5f3b5e5fa">Default</strong>. The default value of <strong id="ALM-43008__a6757227f42ed4ac894de6401cad8ce23">-XX:MaxDirectMemorySize</strong> in <strong id="ALM-43008__a5357f2a44a3047c0b620e2dcff8f3f6e">SPARK_DAEMON_JAVA_OPTS</strong> is 512 MB. Adjust the value as follows: set it to the ratio of the maximum direct memory usage of JobHistory2x in the alarm period to the <strong id="ALM-43008__b1574611014215">Threshold </strong>of <strong id="ALM-43008__b155801393214">JobHistory2x </strong><strong id="ALM-43008__b1580109725">Direct </strong><strong id="ALM-43008__b95801491821">Memory Usage Statistics (JobHistory2x)</strong>. If the alarm is still generated occasionally after the adjustment, increase the value by 50%. If the alarm is still generated frequently after the adjustment, increase the value by 100%. 
It is recommended that the value be less than or equal to the value of <strong id="ALM-43008__b112097581040">SPARK_DAEMON_MEMORY</strong>.</span><p><div class="note" id="ALM-43008__note720841116452"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-43008__p134681747706">On the FusionInsight Manager home page, choose <strong id="ALM-43008__b204686471804">O&amp;M</strong> &gt; <strong id="ALM-43008__b16468747902">Alarm</strong> &gt; <strong id="ALM-43008__b18468947906">Thresholds </strong>&gt; <em id="ALM-43008__i1446811471107">Name of the desired cluster</em> <strong id="ALM-43008__b746814472009">&gt; </strong><strong id="ALM-43008__b546818474011">Spark2x</strong> &gt; <strong id="ALM-43008__b64681147905">Memory </strong>&gt;<strong id="ALM-43008__b1346814471107">JobHistory2x </strong><strong id="ALM-43008__b546719474010">Direct </strong><strong id="ALM-43008__b194684479020">Memory Usage Statistics (JobHistory2x)</strong> to view<strong id="ALM-43008__b952271223218"> Threshold</strong>.</p>
</div></div>
</p></li><li id="ALM-43008__li656515289295"><span>Restart all JobHistory2x instances.</span></li><li id="ALM-43008__li131334243110"><span>After 10 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-43008__u7bd7c0e4579e48148bb63b1c2dc1738e"><li id="ALM-43008__lb5fd8dc2a7a74fd58022ec4cd9b80fe2">If yes, no further action is required.</li><li id="ALM-43008__ld86b52b25b5f43daa7cc14268cdc8285">If no, go to <a href="#ALM-43008__li1088894514319">7</a>.</li></ul>
</p></li></ol>
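<p>The sizing rule for <strong>-XX:MaxDirectMemorySize</strong> in the steps above can be sketched as follows. This is one reading of the rule, not an official formula: the function names, the <strong>recurrence</strong> labels, and the sample values are hypothetical, and the scaling factors assume that the two increases described in the procedure mean multiplying the value by 1.5 and by 2.</p>

```python
import math
from typing import Optional


def recommended_max_direct_memory_mb(peak_used_mb: float,
                                     threshold_percent: float = 95.0,
                                     recurrence: str = "none",
                                     daemon_memory_mb: Optional[int] = None) -> int:
    """Suggest a -XX:MaxDirectMemorySize value in MB.

    peak_used_mb      maximum used direct memory observed in the alarm period
    threshold_percent alarm threshold (95 by default in this topic)
    recurrence        "none", "occasional" (x1.5) or "frequent" (x2); assumed labels
    daemon_memory_mb  optional upper bound, the value of SPARK_DAEMON_MEMORY in MB
    """
    factors = {"none": 1.0, "occasional": 1.5, "frequent": 2.0}
    if recurrence not in factors:
        raise ValueError("recurrence must be one of: " + ", ".join(factors))
    # Base value: peak usage divided by the threshold ratio.
    base = peak_used_mb / (threshold_percent / 100.0)
    size = math.ceil(base * factors[recurrence])
    # The procedure recommends staying at or below SPARK_DAEMON_MEMORY.
    if daemon_memory_mb is not None:
        size = min(size, daemon_memory_mb)
    return size


def jvm_flag(size_mb: int) -> str:
    """Format the JVM option as it would appear in SPARK_DAEMON_JAVA_OPTS."""
    return f"-XX:MaxDirectMemorySize={size_mb}M"


if __name__ == "__main__":
    # Hypothetical sample: the peak in the alarm period was 500 MB.
    size = recommended_max_direct_memory_mb(500)
    print(jvm_flag(size))
```

<p>For a peak of 500 MB and the default threshold, the sketch suggests 527 MB, because 500 divided by 0.95 is rounded up to the next whole megabyte.</p>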
<p class="tableheading" id="ALM-43008__a9bb559224c8f4dcea78c768abfa10188"><strong id="ALM-43008__b218783903112">Collect fault information.</strong></p>
<ol start="7" id="ALM-43008__ol1188824512317"><li id="ALM-43008__li1088894514319"><a name="ALM-43008__li1088894514319"></a><a name="li1088894514319"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-43008__b1828645519573">O&amp;M</strong> &gt; <strong id="ALM-43008__a1861fcf3e7834270a6c572676343ab7e">Log &gt; Download</strong>.</span></li><li id="ALM-43008__li168888451311"><span>Select <strong id="ALM-43008__ab8d0acdda7eb467e96a1067b4a57f730">Spark2x</strong> in the required cluster from the <strong id="ALM-43008__ae6c81981914f480fa3e66d78ed8534a2">Service</strong>.</span></li><li id="ALM-43008__li1988811452314"><span>Click <span><img id="ALM-43008__image1945644173117" src="en-us_image_0269417537.png"></span> in the upper right corner, and set <strong id="ALM-43008__b6456941173117">Start Date</strong> to 10 minutes before the alarm generation time and <strong id="ALM-43008__b11456154113318">End Date</strong> to 10 minutes after it. Then, click <strong id="ALM-43008__b13456164113319">Download</strong>.</span></li><li id="ALM-43008__li15888114543111"><span>Contact the <span id="ALM-43008__text4614151421417">O&amp;M personnel</span> and send the collected logs.</span></li></ol>
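<p>The log collection window in the steps above (10 minutes before and after the alarm generation time) can be computed as in the following sketch. The function name and the sample timestamp are hypothetical; the values would still be entered manually as <strong>Start Date</strong> and <strong>End Date</strong> on the download page.</p>

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time read from the alarm list.
    alarm_time = datetime(2022, 12, 13, 12, 3, 34)
    start, end = log_collection_window(alarm_time)
    print(start.strftime("%Y-%m-%d %H:%M:%S"))
    print(end.strftime("%Y-%m-%d %H:%M:%S"))
```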
</div>
<div class="section" id="ALM-43008__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-43008__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-43008__s7edaf55b289b4956b7397fee36fe6781"><h4 class="sectiontitle">Related Information</h4><p id="ALM-43008__ab67015de54bf4dd49283e3892f6ddffc">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>