doc-exports/docs/mrs/umn/ALM-25008.html
Yang, Tong 5914b67d13 MRS UMN Doc 20240802 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2024-09-28 19:04:58 +00:00


<a name="ALM-25008"></a>
<h1 class="topictitle1">ALM-25008 SlapdServer CPU Usage Exceeds the Threshold</h1>
<div id="body0000001971816512"><div class="section" id="ALM-25008__section6427584"><h4 class="sectiontitle"><span id="ALM-25008__text8925301575">Alarm Description</span></h4><p id="ALM-25008__p649614519412">The system checks the CPU usage of the SlapdServer node every 30 seconds and compares the actual usage with the threshold. This alarm is generated when the SlapdServer CPU usage exceeds the threshold multiple times (<strong id="ALM-25008__b96508419288">5</strong> by default).</p>
<p id="ALM-25008__p7890154011523">Its <strong id="ALM-25008__b1697251618268">Trigger Count</strong> is configurable. If <strong id="ALM-25008__b5972171619264">Trigger Count</strong> is set to <strong id="ALM-25008__b169726167268">1</strong>, this alarm is cleared when the SlapdServer CPU usage is less than or equal to the threshold. If <strong id="ALM-25008__b1853416135271">Trigger Count</strong> is greater than <strong id="ALM-25008__b353411312718">1</strong>, this alarm is cleared when the SlapdServer CPU usage is less than or equal to 90% of the threshold.</p>
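<p>The trigger and clearance rule described above can be sketched as follows. This is only an illustrative model, not the FusionInsight Manager implementation: the class name <strong>CpuAlarm</strong> and its fields are hypothetical, and the sketch assumes that the threshold must be exceeded in consecutive checks, which the description does not state explicitly.</p>

```python
from dataclasses import dataclass


@dataclass
class CpuAlarm:
    """Hypothetical model of the trigger/clear rule described above."""

    threshold: float          # alarm threshold in percent, e.g. 75.0
    trigger_count: int = 5    # default Trigger Count from the description
    breaches: int = 0         # breaches counted so far (assumed consecutive)
    active: bool = False

    def clear_level(self) -> float:
        # Trigger Count 1: clear at the threshold itself.
        # Trigger Count > 1: clear at 90% of the threshold.
        if self.trigger_count == 1:
            return self.threshold
        return 0.9 * self.threshold

    def observe(self, cpu_percent: float) -> bool:
        """Feed one 30-second sample and return whether the alarm is active."""
        if cpu_percent > self.threshold:
            self.breaches += 1
            if self.breaches >= self.trigger_count:
                self.active = True
        else:
            # Assumption: a sample at or below the threshold resets the count.
            self.breaches = 0
            if self.active and cpu_percent <= self.clear_level():
                self.active = False
        return self.active
```

<p>With a threshold of 75% and the default Trigger Count, this model raises the alarm on the fifth sample above 75% and keeps it active until a sample drops to 67.5% or lower.</p>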
</div>
<div class="section" id="ALM-25008__section57848263"><h4 class="sectiontitle"><span id="ALM-25008__text38748475555">Alarm Attributes</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-25008__table53988588" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-25008__row25963404"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-25008__p57710042"><span id="ALM-25008__text17980150175619">Alarm ID</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-25008__p44001849"><span id="ALM-25008__text199471335614">Alarm Severity</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-25008__p7380012"><span id="ALM-25008__text152400388563">Auto Cleared</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-25008__row14760880"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-25008__p14887165972519">25008</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-25008__p660834585110">Critical (default threshold: 85%)</p>
<p id="ALM-25008__p51431020">Major (default threshold: 75%)</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-25008__p14881165912515">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
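<p>As a rough illustration of how the two default thresholds in the table relate to the alarm severity, the following sketch maps a CPU usage value to a severity. The function name <strong>severity</strong> and the constants are hypothetical, and the real thresholds are configurable on FusionInsight Manager.</p>

```python
from typing import Optional

CRITICAL_THRESHOLD = 85.0  # default Critical threshold from the table
MAJOR_THRESHOLD = 75.0     # default Major threshold from the table


def severity(cpu_percent: float) -> Optional[str]:
    """Return the severity for a CPU usage value, or None if no threshold is exceeded."""
    if cpu_percent > CRITICAL_THRESHOLD:
        return "Critical"
    if cpu_percent > MAJOR_THRESHOLD:
        return "Major"
    return None
```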
</div>
<div class="section" id="ALM-25008__section50872323"><h4 class="sectiontitle"><span id="ALM-25008__text155061195577">Alarm Parameters</span></h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-25008__table22167579" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-25008__row15017071"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-25008__p21975462"><span id="ALM-25008__text776142495720">Parameter</span></p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-25008__p35182007"><span id="ALM-25008__text632018391572">Description</span></p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-25008__row1756114464143"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-25008__p3820532611">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-25008__p6810518268">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-25008__row34521134"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-25008__p171352261">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-25008__p1461254261">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-25008__row6737354"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-25008__p15512518268">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-25008__p7112518267">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-25008__row1028801444414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-25008__p9288161412441">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-25008__p13288214194417">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-25008__row19401162054415"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-25008__p15402102044415">Trigger Condition</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-25008__p9402122064414">Specifies the threshold for triggering the alarm.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-25008__section55197725"><h4 class="sectiontitle"><span id="ALM-25008__text2266192715582">Impact on the System</span></h4><p id="ALM-25008__p47392170">Processes respond slowly or do not work.</p>
</div>
<div class="section" id="ALM-25008__section27017478"><h4 class="sectiontitle"><span id="ALM-25008__text12656240135813">Possible Causes</span></h4><ul id="ALM-25008__ul460131185210"><li id="ALM-25008__li1373752155210">The alarm threshold or alarm trigger count is improperly configured.</li><li id="ALM-25008__li1760201165215">The CPU configuration cannot meet service requirements, and the CPU usage reaches the upper limit.</li></ul>
</div>
<div class="section" id="ALM-25008__section535785120256"><h4 class="sectiontitle"><span id="ALM-25008__text19569135285811">Handling Procedure</span></h4><p id="ALM-25008__p18319915115316"><strong id="ALM-25008__b1692240193419">Check whether the alarm threshold or alarm trigger count is properly configured.</strong></p>
<ol id="ALM-25008__ol12485153614462"><li id="ALM-25008__li124853366461"><span>Log in to FusionInsight Manager, choose <strong id="ALM-25008__b10959185319342">O&amp;M</strong> &gt; <strong id="ALM-25008__b8960125312349">Alarm</strong> &gt; <strong id="ALM-25008__b896118537342">Thresholds</strong>, click the name of the desired cluster, choose <strong id="ALM-25008__b3962253113412">LdapServer</strong> &gt; <strong id="ALM-25008__b142712140111">Other</strong> &gt; <strong id="ALM-25008__b8963253153419">SlapdServer Service Total CPU Percentage</strong>, and check whether the alarm trigger count and alarm threshold are set properly.</span><p><ul id="ALM-25008__ul17485136134617"><li id="ALM-25008__li1748553654613">If yes, go to <a href="#ALM-25008__li848412361466">4</a>.</li><li id="ALM-25008__li19485153694610">If no, go to <a href="#ALM-25008__li174859361464">2</a>.</li></ul>
</p></li><li id="ALM-25008__li174859361464"><a name="ALM-25008__li174859361464"></a><a name="li174859361464"></a><span>Change the trigger count and alarm threshold based on the actual CPU usage, and apply the changes.</span></li><li id="ALM-25008__li1148563612460"><span>Wait 2 minutes and check whether the alarm is automatically cleared.</span><p><ul id="ALM-25008__ul9485143618462"><li id="ALM-25008__li748513618463">If yes, no further action is required.</li><li id="ALM-25008__li548563615468">If no, go to <a href="#ALM-25008__li848412361466">4</a>.</li></ul>
</p></li></ol>
<p id="ALM-25008__p832011512539"><strong id="ALM-25008__b45360489359">Check whether the CPU usage reaches the upper limit.</strong></p>
<ol start="4" id="ALM-25008__ol5485203614469"><li id="ALM-25008__li848412361466"><a name="ALM-25008__li848412361466"></a><a name="li848412361466"></a><span>On FusionInsight Manager, choose <strong id="ALM-25008__b971913559358">O&amp;M</strong> &gt; <strong id="ALM-25008__b972118552354">Alarm</strong> &gt; <strong id="ALM-25008__b1272255523516">Alarms</strong>. In the right pane, click this alarm and obtain the host name in <strong id="ALM-25008__b972310555352">Location</strong>.</span></li><li id="ALM-25008__li1248417366465"><a name="ALM-25008__li1248417366465"></a><a name="li1248417366465"></a><span>Choose <strong id="ALM-25008__b3925118345">Cluster</strong> &gt; <strong id="ALM-25008__b199251915349">Services</strong> &gt; <strong id="ALM-25008__b69256113343">LdapServer</strong>, click the <strong id="ALM-25008__b8916032173415">Instance</strong> tab, and click the SlapdServer instance corresponding to the host name obtained in <a href="#ALM-25008__li848412361466">4</a>.</span></li><li id="ALM-25008__li133258517208"><a name="ALM-25008__li133258517208"></a><a name="li133258517208"></a><span>On the dashboard of the instance, observe the real-time data of the <strong id="ALM-25008__b159977196486">CPU Usage of a Single SlapdServer Instance</strong> chart for about 5 minutes and check whether the CPU usage exceeds the threshold (<strong id="ALM-25008__b128032916541">75%</strong> by default) multiple times.</span><p><ul id="ALM-25008__ul1846911369207"><li id="ALM-25008__li1246923622018">If yes, go to <a href="#ALM-25008__li14826210161714">7</a>.</li><li id="ALM-25008__li0145124915202">If no, go to <a href="#ALM-25008__li89991152124618">9</a>.</li></ul>
</p></li><li id="ALM-25008__li14826210161714"><a name="ALM-25008__li14826210161714"></a><a name="li14826210161714"></a><span>Check whether the status of other SlapdServer instances is normal. For details, see <a href="#ALM-25008__li1248417366465">5</a> to <a href="#ALM-25008__li133258517208">6</a>.</span><p><ul id="ALM-25008__ul53828202177"><li id="ALM-25008__li1298672511175">If yes, contact the MRS cluster administrator to evaluate whether to expand the capacity of SlapdServer instances. Then, go to <a href="#ALM-25008__li12485203614616">8</a>.</li><li id="ALM-25008__li4382920191715">If no, repair the faulty SlapdServer instance and go to <a href="#ALM-25008__li12485203614616">8</a>.</li></ul>
</p></li><li id="ALM-25008__li12485203614616"><a name="ALM-25008__li12485203614616"></a><a name="li12485203614616"></a><span>Check whether the alarm is cleared.</span><p><ul id="ALM-25008__ul16484153654614"><li id="ALM-25008__li15484163634617">If yes, no further action is required.</li><li id="ALM-25008__li184842368460">If no, go to <a href="#ALM-25008__li89991152124618">9</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-25008__p22707215144835"><strong id="ALM-25008__b17539175141012">Collect fault information.</strong></p>
<ol start="9" id="ALM-25008__ol14015319462"><li id="ALM-25008__li89991152124618"><a name="ALM-25008__li89991152124618"></a><a name="li89991152124618"></a><span>On FusionInsight Manager, choose <strong id="ALM-25008__b18375135361015">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-25008__b1637515331015">Log</strong> &gt; <strong id="ALM-25008__b3375553171015">Download</strong>.</span></li><li id="ALM-25008__li15999175218461"><span>Expand the <strong id="ALM-25008__b2060812549107">Service</strong> drop-down list, and select <strong id="ALM-25008__b176086541100">LdapServer</strong> for the target cluster.</span></li><li id="ALM-25008__li1799955234619"><span>Click <span><img id="ALM-25008__image1299965219461" src="en-us_image_0000002008258989.png"></span> in the upper right corner, and set <strong id="ALM-25008__b9290115818109">Start Date</strong> to 10 minutes before the alarm generation time and <strong id="ALM-25008__b02911158101019">End Date</strong> to 10 minutes after it. Then, click <strong id="ALM-25008__b4291185881017">Download</strong>.</span></li><li id="ALM-25008__li1602535462"><span>Contact <span id="ALM-25008__text176166613113">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
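<p>The log collection window used in the procedure above (10 minutes on either side of the alarm generation time) can be computed as in the following sketch. The helper name <strong>log_window</strong> is hypothetical and is not part of any FusionInsight Manager interface; the resulting values would still be entered manually as <strong>Start Date</strong> and <strong>End Date</strong>.</p>

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(alarm_time: datetime, margin_minutes: int = 10) -> Tuple[datetime, datetime]:
    """Return (start, end) covering margin_minutes before and after alarm_time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin
```

<p>For example, an alarm generated at 12:00 gives a window from 11:50 to 12:10 on the same day.</p>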
</div>
<div class="section" id="ALM-25008__section169311343318"><h4 class="sectiontitle"><span id="ALM-25008__text367020138593">Alarm Clearance</span></h4><p id="ALM-25008__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-25008__section53362350"><h4 class="sectiontitle"><span id="ALM-25008__text1246242445916">Related Information</span></h4><p id="ALM-25008__p7522741"><span id="ALM-25008__text1881919412591">None.</span></p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>