forked from docs/doc-exports
MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
parent
8682f10aa4
commit
3b1f73dece
11952
docs/mrs/umn/ALL_META.TXT.json
Normal file
File diff suppressed because it is too large
90
docs/mrs/umn/ALM-12001.html
Normal file
@ -0,0 +1,90 @@
|
||||
<a name="ALM-12001"></a><a name="ALM-12001"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12001 Audit Log Dumping Failure</h1>
|
||||
<div id="body31263779"><div class="section" id="ALM-12001__s5d20f08da0194ecda3909cfdb87e180c"><h4 class="sectiontitle">Description</h4><p id="ALM-12001__en-us_topic_0070543614_p23642491">Under the local historical data backup policy, cluster audit logs must be dumped to a third-party server. The system checks the dump server at 3 a.m. every day. If the dump server meets the configuration conditions, the audit logs are dumped successfully. This alarm is generated when the dump fails, for example because the disk space of the dump directory on the third-party server is insufficient or a user has changed the username, password, or dump directory of the dump server.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__s57fc45b8d80b415e8c5441559e485ac6"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12001__en-us_topic_0070543614_table35993632" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12001__en-us_topic_0070543614_row49791721"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12001__en-us_topic_0070543614_p6597583">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12001__en-us_topic_0070543614_p64642251">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12001__en-us_topic_0070543614_p1530946">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12001__en-us_topic_0070543614_row56897822"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12001__en-us_topic_0070543614_p45320903">12001</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12001__en-us_topic_0070543614_p47114532">Minor</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12001__en-us_topic_0070543614_p58180766">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__sefa6f724d78140acb906fa126d25870e"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12001__en-us_topic_0070543614_table15021596" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12001__en-us_topic_0070543614_row32398201"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12001__en-us_topic_0070543614_p7008594">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12001__en-us_topic_0070543614_p30825203">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12001__row1792471320439"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12001__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12001__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12001__en-us_topic_0070543614_row13813509"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12001__en-us_topic_0070543614_p45152431">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12001__en-us_topic_0070543614_p33468314">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12001__en-us_topic_0070543614_row32779375"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12001__en-us_topic_0070543614_p37883742">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12001__en-us_topic_0070543614_p48684257">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12001__en-us_topic_0070543614_row35505134"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12001__en-us_topic_0070543614_p57343595">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12001__en-us_topic_0070543614_p14319600">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__sf7026af787cb41dc8c68b2e6f1e00d43"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12001__en-us_topic_0070543614_p19036966">The system can store a maximum of 50 dump files locally. If the fault persists on the dump server, local audit logs may be lost.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__s32c81d1592824ca0b8dbb3a21428f59d"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12001__en-us_topic_0070543614_ul65599284"><li id="ALM-12001__en-us_topic_0070543614_li53522652">The network connection is abnormal.</li><li id="ALM-12001__en-us_topic_0070543614_li11941827">The username, password, or dump directory of the dump server does not meet the configuration conditions.</li><li id="ALM-12001__en-us_topic_0070543614_li40367587">The disk space of the dump directory is insufficient.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__s3a2cd89f53084ce98c69427e4cf85a18"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12001__en-us_topic_0070543614_p48549077"><strong id="ALM-12001__b2009504815379">Check whether the network connection is normal.</strong></p>
|
||||
<ol id="ALM-12001__ol38120521153659"><li id="ALM-12001__li28892250153659"><span>On the FusionInsight Manager home page, choose <strong id="ALM-12001__b40492952153659">Audit > Configurations</strong>.</span></li><li id="ALM-12001__li58703659153659"><span>Check whether the SFTP IP on the dump configuration page is valid.</span><p><div class="litext" id="ALM-12001__p44686402153726">Log in to the node where Manager is located as user <strong id="ALM-12001__b58570884153659">root</strong> and run the <strong id="ALM-12001__b57375910153659">ping</strong> command to check whether the network connection between the SFTP server and the cluster is normal. <span id="ALM-12001__text187511520308"></span><span id="ALM-12001__text325002212305"></span><ul class="subitemlist" id="ALM-12001__ul66251821153659"><li id="ALM-12001__li16937138153659">If yes, go to <a href="#ALM-12001__li33093593154533">5</a>.</li><li id="ALM-12001__li29730934153659">If no, go to <a href="#ALM-12001__li64797305153659">3</a>.</li></ul>
|
||||
</div>
|
||||
</p></li><li id="ALM-12001__li64797305153659"><a name="ALM-12001__li64797305153659"></a><a name="li64797305153659"></a><span>Repair the network connection, reset the SFTP password, and click <strong id="ALM-12001__b59395483153659">OK</strong>.</span></li><li id="ALM-12001__li4235613153659"><span>Wait for 2 minutes and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12001__ul470623153659"><li id="ALM-12001__li46304841153659">If yes, no further action is required.</li><li id="ALM-12001__li59704615153659">If no, go to <a href="#ALM-12001__li33093593154533">5</a>.</li></ul>
|
||||
</p></li></ol>
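<p>The connectivity check in step 2 can be sketched as a small shell helper. This is an illustration only: <strong>check_reachable</strong> and <strong>SFTP_HOST</strong> are hypothetical names, the address is a documentation placeholder, and the <strong>-W</strong> option assumes the Linux iputils version of <strong>ping</strong>.</p>

```shell
# SFTP_HOST is a placeholder (192.0.2.10 is a documentation-only address);
# replace it with the SFTP IP shown on the dump configuration page.
SFTP_HOST="${SFTP_HOST:-192.0.2.10}"

# Print "reachable" if ping reports success for host $1, else "unreachable".
# -c 3 sends three echo requests; -W 2 waits up to 2 seconds for each reply
# (assumption: Linux iputils ping).
check_reachable() {
    if ping -c 3 -W 2 "$1" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Example call, run as root on the node where Manager is located:
#   check_reachable "$SFTP_HOST"
```

<p>A result of "unreachable" corresponds to the "If no" branch of step 2.</p>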
|
||||
<p id="ALM-12001__p27591857153526"><strong id="ALM-12001__b64207308154529">Check whether the username, password, and dump directory are correct.</strong></p>
|
||||
<ol start="5" id="ALM-12001__ol23141570154614"><li id="ALM-12001__li33093593154533"><a name="ALM-12001__li33093593154533"></a><a name="li33093593154533"></a><span>On the dump configuration page, check whether the username, password, and dump directory of the third-party server are correct.</span><p><ul class="subitemlist" id="ALM-12001__ul33503227154533"><li id="ALM-12001__li48743610154533">If yes, go to <a href="#ALM-12001__li56273719154547">8</a>.</li><li id="ALM-12001__li55918363154533">If no, go to <a href="#ALM-12001__li63335387154533">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12001__li63335387154533"><a name="ALM-12001__li63335387154533"></a><a name="li63335387154533"></a><span>Correct the username, password, or dump directory, reset the SFTP password, and click <strong id="ALM-12001__b29406886154533">OK</strong>.</span></li><li id="ALM-12001__li48540218154533"><span>Wait for 2 minutes and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12001__ul5393357154533"><li id="ALM-12001__li33147578154533">If yes, no further action is required.</li><li id="ALM-12001__li599261154533">If no, go to <a href="#ALM-12001__li56273719154547">8</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12001__p13175067153526"><strong id="ALM-12001__b13850359154544">Check whether the disk space of the dump directory is sufficient.</strong></p>
|
||||
<ol start="8" id="ALM-12001__ol5407662154617"><li id="ALM-12001__li56273719154547"><a name="ALM-12001__li56273719154547"></a><a name="li56273719154547"></a><span>Log in to the third-party server as user <strong id="ALM-12001__b14607066154547">root</strong> and run the <strong id="ALM-12001__b64354738154547">df</strong> command to check whether the disk space of the dump directory of the third-party server exceeds 100 MB.</span><p><ul class="subitemlist" id="ALM-12001__ul43535337154547"><li id="ALM-12001__li45351298154547">If yes, go to <a href="#ALM-12001__li37575023154554">11</a>.</li><li id="ALM-12001__li49576502154547">If no, go to <a href="#ALM-12001__li61877356154547">9</a>.</li></ul>
|
||||
</p></li><li id="ALM-12001__li61877356154547"><a name="ALM-12001__li61877356154547"></a><a name="li61877356154547"></a><span>Expand the disk capacity of the third-party server, reset the SFTP password, and click <strong id="ALM-12001__b36701423154547">OK</strong>.</span></li><li id="ALM-12001__li53906996154547"><span>Wait for 2 minutes, view real-time alarms, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12001__ul35815828154547"><li id="ALM-12001__li20025293154547">If yes, no further action is required.</li><li id="ALM-12001__li11436076154547">If no, go to <a href="#ALM-12001__li37575023154554">11</a>.</li></ul>
|
||||
</p></li></ol>
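<p>The disk space check in step 8 can be sketched as follows. This is a minimal illustration, assuming that the 100 MB figure refers to the available space reported by <strong>df</strong>; <strong>avail_mb</strong>, <strong>space_status</strong>, and <strong>DUMP_DIR</strong> are hypothetical names that do not come from the product.</p>

```shell
# Available space, in whole megabytes, on the file system that holds directory $1.
# -P forces the one-line-per-file-system POSIX format; -k reports 1024-byte blocks.
avail_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Compare an available-space figure (in MB) with the 100 MB threshold from step 8.
# Assumption: "exceeds 100 MB" means strictly greater than 100.
space_status() {
    if [ "$1" -gt 100 ]; then
        echo "sufficient"
    else
        echo "insufficient"
    fi
}

# DUMP_DIR is a placeholder; use the dump directory configured for the SFTP server.
DUMP_DIR="${DUMP_DIR:-/tmp}"
space_status "$(avail_mb "$DUMP_DIR")"
```

<p>Run the sketch on the third-party server. A result of "insufficient" corresponds to the "If no" branch of step 8.</p>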
|
||||
<p class="tableheading" id="ALM-12001__en-us_topic_0070543614_p43960067"><strong id="ALM-12001__b4787357154551">Reset the dump rule.</strong></p>
|
||||
<ol start="11" id="ALM-12001__ol38224750154621"><li id="ALM-12001__li37575023154554"><a name="ALM-12001__li37575023154554"></a><a name="li37575023154554"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12001__b41457704154554">Audit > Configurations</strong>.</span></li><li id="ALM-12001__li23678021154554"><span>Reset dump rules, set the parameters properly, and click <strong id="ALM-12001__b2630891154554">OK</strong>.</span></li><li id="ALM-12001__li17396949154554"><span>Wait for 2 minutes, view real-time alarms and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12001__ul61585317154554"><li id="ALM-12001__li11775598154554">If yes, no further action is required.</li><li id="ALM-12001__li14299353154554">If no, go to <a href="#ALM-12001__li5991045915463">14</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12001__p28445835153631"><strong id="ALM-12001__b57966164154559">Collect fault information.</strong></p>
|
||||
<ol start="14" id="ALM-12001__ol17392131154624"><li id="ALM-12001__li5991045915463"><a name="ALM-12001__li5991045915463"></a><a name="li5991045915463"></a><span>On FusionInsight Manager, choose <strong id="ALM-12001__b5263123115415">O&M</strong> > <strong id="ALM-12001__b2156979815463">Log > Download</strong>.</span></li><li id="ALM-12001__li5396317115463"><span>Select <strong id="ALM-12001__b20461631242">OmmServer</strong> for <strong id="ALM-12001__b63941092411">Service</strong> and click <strong id="ALM-12001__b3991118545">OK</strong>.</span></li><li id="ALM-12001__li1145664103113"><span>Click <span><img id="ALM-12001__image1945644173117" src="en-us_image_0269383808.png"></span> in the upper right corner, and set <strong id="ALM-12001__b6456941173117">Start Date</strong> and <strong id="ALM-12001__b11456154113318">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12001__b13456164113319">Download</strong>.</span></li><li id="ALM-12001__li495644512588"><span>Contact the <span id="ALM-12001__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12001__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12001__s092f76f3a5334f47bf56d692c95eb040"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12001__en-us_topic_0070543614_p53782959">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
83
docs/mrs/umn/ALM-12004.html
Normal file
@ -0,0 +1,83 @@
|
||||
<a name="ALM-12004"></a><a name="ALM-12004"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12004 OLdap Resource Abnormal</h1>
|
||||
<div id="body1502172312905"><div class="section" id="ALM-12004__sa9183dffe6ae4531831efe0aeaadea33"><h4 class="sectiontitle">Description</h4><p id="ALM-12004__a2a423974896f489ea2f4464a5573f1a4">The system checks LDAP resources every 60 seconds. This alarm is generated when the system detects that the LDAP resources in Manager are abnormal six consecutive times.</p>
|
||||
<p id="ALM-12004__a211c50bebca14a49b11fafdbb0479218">This alarm is cleared when the LDAP resource in Manager recovers and the alarm handling is complete.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__s8661b0e1026e495dbebcb6fad4eea718"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12004__en-us_topic_0070546150_table66387117" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12004__en-us_topic_0070546150_row28652385"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12004__en-us_topic_0070546150_p39141857">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12004__en-us_topic_0070546150_p16373829">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12004__en-us_topic_0070546150_p51211766">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12004__en-us_topic_0070546150_row54512347"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12004__affdc687d14bc4040860b019da5217158">12004</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12004__en-us_topic_0070546150_p32373859">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12004__en-us_topic_0070546150_p5036885">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__sac319a63273b44db99a6c0209549f2dc"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12004__en-us_topic_0070546150_table5334556" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12004__en-us_topic_0070546150_row58348429"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12004__en-us_topic_0070546150_p28602343">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12004__en-us_topic_0070546150_p35088474">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12004__row3641114765516"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12004__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12004__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12004__en-us_topic_0070546150_row23594125"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12004__en-us_topic_0070546150_p32075944">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12004__en-us_topic_0070546150_p48014676">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12004__en-us_topic_0070546150_row29478904"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12004__en-us_topic_0070546150_p38980991">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12004__en-us_topic_0070546150_p3343723">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12004__en-us_topic_0070546150_row30093515"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12004__en-us_topic_0070546150_p21655625">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12004__en-us_topic_0070546150_p9275167">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__sca288cf049d84ba98a87595fcab5ceb0"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12004__a593429f6412e48dca10748a44325751f">The Manager and component WebUI authentication services are unavailable and cannot provide security authentication and user management functions for web upper-layer services. Users may be unable to log in to the WebUIs of Manager and components.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__s2055fade2c7e40f2a441dbfc0e19bfd1"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12004__a09b163110d984d42bbb5cfbfcf38a2ea">The LdapServer process in the Manager is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__s7f2ec925ce1940fbabe11f029654bc7b"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12004__en-us_topic_0070546150_p58223301"><strong id="ALM-12004__a09a6fcf99d72473783393ce6884de55d">Check whether the LdapServer process in the Manager is normal.</strong></p>
|
||||
<ol id="ALM-12004__oeee7008874234273afbf9f549eb46324"><li id="ALM-12004__l2b9a8ef7e6084db985420f5c44c16922"><span>Log in to the Manager node in the cluster as user <strong id="ALM-12004__ada87ef7d3aaa449aa0eed17ccc419567">omm</strong>.</span><p><p class="litext" id="ALM-12004__a0db40b4c1cf54723b74339e6ea159597">Log in to FusionInsight Manager using the floating IP address, and run the <strong id="ALM-12004__a19c8a973f73044ff98879fae4c7a74b8">sh ${BIGDATA_HOME}/om-server/om/sbin/status-oms.sh</strong> command to check the information about the current Manager two-node cluster.</p>
|
||||
</p></li><li id="ALM-12004__lbc17079424494ad69f2dd0257acee2cd"><span>Run the <strong id="ALM-12004__aecdb043853624e46b48efdd0222e0784">ps -ef | grep slapd</strong> command to check whether the LdapServer resource process, whose configuration file is in <strong id="ALM-12004__accfc52122caf4e08aadc9544e71c1b2b">${BIGDATA_HOME}/om-server/om/</strong>, is running properly.</span><p><div class="note" id="ALM-12004__ndfac406ca6354edebc30d3508da5c0d4"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12004__acd7b61dbeba848bd8fc6e70ca411da8f">The resource is normal if the following conditions are met:</p>
|
||||
<ol type="a" id="ALM-12004__ol6348204419565"><li id="ALM-12004__li1234914413567">After the <strong id="ALM-12004__b7350134413568">sh ${BIGDATA_HOME}/om-server/om/sbin/status-oms.sh</strong> command runs, <strong id="ALM-12004__b193511144175615">ResHAStatus</strong> of OLdap is <strong id="ALM-12004__b7351144455613">Normal</strong>.</li><li id="ALM-12004__li1035119448564">After the <strong id="ALM-12004__b10352114418567">ps -ef | grep slapd</strong> command runs, the slapd process on port 21750 is displayed.<ul id="ALM-12004__ul1735384414561"><li id="ALM-12004__li43551544175617">If yes, go to <a href="#ALM-12004__l6ef892f9c8f749aa9e6871e1a63797b1">3</a>.</li><li id="ALM-12004__li935754420569">If no, go to <a href="#ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde">4</a>.</li></ul>
|
||||
</li></ol>
|
||||
</div></div>
|
||||
</p></li><li id="ALM-12004__l6ef892f9c8f749aa9e6871e1a63797b1"><a name="ALM-12004__l6ef892f9c8f749aa9e6871e1a63797b1"></a><a name="l6ef892f9c8f749aa9e6871e1a63797b1"></a><span>Run the <strong id="ALM-12004__ac81351982ea44a3080848652eb80641f">kill -2</strong> <em id="ALM-12004__adf6ccee5cb6e4773b82ca5f68a8d4218">ldap pid</em> command to restart the LdapServer process and wait for 20 seconds. The HA starts the OLdap process automatically. Check whether the current OLdap resource is in normal state.</span><p><ul id="ALM-12004__u8057658d3505467190171bde28259d37"><li id="ALM-12004__lf80ef17bc2cc40138da0188b47a8b323">If yes, the operation is complete.</li><li id="ALM-12004__l0cc9afd9cc8b4222b41bcf9983d15d1e">If no, go to <a href="#ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde">4</a>.</li></ul>
|
||||
</p></li></ol>
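<p>The process check in step 2 can be sketched as a filter over the <strong>ps -ef</strong> listing. This is an illustration under one stated assumption: that the port number 21750 appears in the slapd command line, as the note in step 2 suggests. <strong>slapd_on_port</strong> is a hypothetical helper name.</p>

```shell
# Succeed if the `ps -ef` listing on stdin contains a slapd line that mentions
# port $1 as a whole word. The `grep -v grep` stage drops the grep process itself.
# Assumption: the port is visible in the slapd command line. A PID or other
# number equal to the port on the same line would also match.
slapd_on_port() {
    grep 'slapd' | grep -v 'grep' | grep -qw -- "$1"
}

# Run as user omm on the Manager node.
if ps -ef | slapd_on_port 21750; then
    echo "slapd is running on port 21750"
else
    echo "slapd was not found on port 21750"
fi
```

<p>The sketch only covers condition b of the note; <strong>ResHAStatus</strong> still has to be read from the <strong>status-oms.sh</strong> output.</p>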
|
||||
<p id="ALM-12004__abb5516fb7b8647a3942c4c5b7f74fded"><strong id="ALM-12004__a760add342117469495c4fbe7e3daf04f">Collect fault information.</strong></p>
|
||||
<ol start="4" id="ALM-12004__o9661752a744349fba78569b7f04fcbcf"><li id="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"><a name="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><a name="l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12004__b76841116134212">O&M</strong> > <strong id="ALM-12004__abd8fe9ab79df48fdb7b8bfe92c7768bc">Log > Download</strong>.</span></li><li id="ALM-12004__l19f3de8474a147ef88ac2d40f27fe72e"><span>Select <strong id="ALM-12004__a5ee1ffd31e954215a608adc09390aabe">OmsLdapServer</strong> and <strong id="ALM-12004__afed03600c0b1449aa46a036940dae621">OmmServer</strong> for <strong id="ALM-12004__a6cf5036ea700402980e42d73cf308a63">Service</strong> and click <strong id="ALM-12004__b3991118545">OK</strong>.</span></li><li id="ALM-12004__li1145664103113"><span>Click <span><img id="ALM-12004__image1945644173117" src="en-us_image_0269383809.png"></span> in the upper right corner, and set <strong id="ALM-12004__b6456941173117">Start Date</strong> and <strong id="ALM-12004__b11456154113318">End Date</strong> for log collection to 1 hour before and after the alarm generation time, respectively. Then, click <strong id="ALM-12004__b13456164113319">Download</strong>.</span></li><li id="ALM-12004__li495644512588"><span>Contact the <span id="ALM-12004__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12004__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12004__s333443260e4842bbb3edccaecd83225c"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12004__en-us_topic_0070546150_p37687133">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12005.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12005"></a><a name="ALM-12005"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12005 OKerberos Resource Abnormal</h1>
|
||||
<div id="body46505447"><div class="section" id="ALM-12005__sb4494fc0a9724a098acc051bfcf6d8d3"><h4 class="sectiontitle">Description</h4><p id="ALM-12005__en-us_topic_0070543646_p45934981">The alarm module checks the status of the Kerberos resource in Manager every 80 seconds. This alarm is generated when the alarm module detects that the Kerberos resource is abnormal six consecutive times.</p>
|
||||
<p id="ALM-12005__en-us_topic_0070543646_p10761647">This alarm is cleared when the Kerberos resource recovers and the alarm handling is complete.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12005__sa6018773c1564645bf238f9af626b80f"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12005__en-us_topic_0070543646_table66387117" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12005__en-us_topic_0070543646_row28652385"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12005__en-us_topic_0070543646_p39141857">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12005__en-us_topic_0070543646_p16373829">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12005__en-us_topic_0070543646_p51211766">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12005__en-us_topic_0070543646_row54512347"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12005__en-us_topic_0070543646_p53423964">12005</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12005__en-us_topic_0070543646_p32373859">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12005__en-us_topic_0070543646_p5036885">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12005__s4483679f8b044c238c5174acefbc8975"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12005__en-us_topic_0070543646_table5334556" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12005__en-us_topic_0070543646_row58348429"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12005__en-us_topic_0070543646_p28602343">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12005__en-us_topic_0070543646_p35088474">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12005__row2341643135510"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12005__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12005__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12005__en-us_topic_0070543646_row23594125"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12005__en-us_topic_0070543646_p32075944">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12005__en-us_topic_0070543646_p48014676">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12005__en-us_topic_0070543646_row29478904"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12005__en-us_topic_0070543646_p38980991">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12005__en-us_topic_0070543646_p3343723">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12005__en-us_topic_0070543646_row30093515"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12005__en-us_topic_0070543646_p21655625">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12005__en-us_topic_0070543646_p9275167">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12005__sb62541a2e6e943b684e2619714ec9325"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12005__en-us_topic_0070543646_p13091062">The component WebUI authentication services are unavailable and cannot provide security authentication functions for web upper-layer services. Users may be unable to log in to FusionInsight Manager and the WebUIs of components.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12005__s9e07b149b27f429cb5b27b19fec75063"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12005__en-us_topic_0070543646_p53743093">The OLdap resource on which OKerberos depends is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12005__sd77225bf1fcd431089a828a7a4601dd6"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12005__en-us_topic_0070543646_p58223301"><strong id="ALM-12005__b65926177164951">Check whether the OLdap resource on which OKerberos depends is abnormal in Manager.</strong></p>
|
||||
<ol id="ALM-12005__ol2732064816486"><li id="ALM-12005__li2258678016486"><span>Log in to the Manager node in the cluster as user <strong id="ALM-12005__b2487926316486">omm</strong>.</span><p><p class="litext" id="ALM-12005__p144011943114415">Log in to FusionInsight Manager using the floating IP address, and run the <strong id="ALM-12005__b195443516486">sh ${BIGDATA_HOME}/om-server/om/sbin/status-oms.sh</strong> command to check the information about the current Manager two-node cluster.</p>
</p></li><li id="ALM-12005__li593131416486"><span>Run the <strong id="ALM-12005__b1758991516486">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the OLdap resource status managed by HA is normal. (In single-node mode, the OLdap resource is in the Active_normal state; in the two-node mode, the OLdap resource is in the Active_normal state on the active node and in the Standby_normal state on the standby node.)</span><p><ul class="subitemlist" id="ALM-12005__ul2302865616486"><li id="ALM-12005__li1549700616486">If yes, go to <a href="#ALM-12005__li34421516164820">4</a>.</li><li id="ALM-12005__li4729798216486">If no, go to <a href="#ALM-12005__li4031832916486">3</a>.</li></ul>
</p></li><li id="ALM-12005__li4031832916486"><a name="ALM-12005__li4031832916486"></a><a name="li4031832916486"></a><span>See the procedure in <a href="ALM-12004.html">ALM-12004 OLdap Resource Abnormal</a> to resolve the problem. After the OLdap resource status recovers, check whether the OKerberos resource status is normal.</span><p><ul class="subitemlist" id="ALM-12005__ul6413213716486"><li id="ALM-12005__li1067441916486">If yes, the operation is complete.</li><li id="ALM-12005__li5932157616486">If no, go to <a href="#ALM-12005__li34421516164820">4</a>.</li></ul>
</p></li></ol>
<p id="ALM-12005__p59418417164755"><strong id="ALM-12005__b21602359164826">Collect fault information.</strong></p>
<ol start="4" id="ALM-12005__ol49138498164822"><li id="ALM-12005__li34421516164820"><a name="ALM-12005__li34421516164820"></a><a name="li34421516164820"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12005__b87862548435">O&M</strong> > <strong id="ALM-12005__b11281153164820">Log > Download</strong>.</span></li><li id="ALM-12005__li29990712164820"><span>Select <strong id="ALM-12005__b41358196164820">OmsKerberos</strong> and <strong id="ALM-12005__b36679449164820">OmmServer</strong> from the <strong id="ALM-12005__b18615181618813">Service</strong> and click <strong id="ALM-12005__b627792117815">OK</strong>.</span></li><li id="ALM-12005__li1145664103113"><span>Click <span><img id="ALM-12005__image1945644173117" src="en-us_image_0269383810.png"></span> in the upper right corner, and set <strong id="ALM-12005__b6456941173117">Start Date</strong> and <strong id="ALM-12005__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12005__b13456164113319">Download</strong>.</span></li><li id="ALM-12005__li495644512588"><span>Contact the <span id="ALM-12005__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
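<p>The collection window in the preceding step (1 hour before and after the alarm generation time) can be computed instead of worked out by hand. The following is a minimal sketch, assuming GNU <strong>date</strong> is available on the node. The function name <strong>log_window</strong> and the sample timestamp are hypothetical.</p>

```shell
#!/usr/bin/env bash
# Print the start and end of a log collection window centered on an alarm time.
# Usage: log_window "<alarm time>" <minutes before and after>
# Assumes GNU date (the -d option); log_window is a hypothetical helper name.
log_window() {
    local alarm_time="$1" minutes="$2" epoch
    # Convert the alarm time to seconds since the epoch so the margin is plain arithmetic.
    epoch=$(date -d "$alarm_time" +%s) || return 1
    date -d "@$((epoch - minutes * 60))" '+Start: %Y-%m-%d %H:%M:%S'
    date -d "@$((epoch + minutes * 60))" '+End:   %Y-%m-%d %H:%M:%S'
}

# Example: a sample alarm time with the 60-minute margin used in this procedure.
log_window "2024-05-01 03:00:00" 60
```

<p>Enter the two printed values as <strong>Start Date</strong> and <strong>End Date</strong>. For alarms whose procedure specifies a different margin, pass that number of minutes as the second argument.</p>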
</div>
<div class="section" id="ALM-12005__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12005__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12005__s0c09e68ea4404ad18d1e184da7dccee5"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12005__en-us_topic_0070543646_p37687133">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>
108
docs/mrs/umn/ALM-12006.html
Normal file
File diff suppressed because it is too large
90
docs/mrs/umn/ALM-12007.html
Normal file
@ -0,0 +1,90 @@
<a name="ALM-12007"></a><a name="ALM-12007"></a>
<h1 class="topictitle1">ALM-12007 Process Fault</h1>
<div id="body62162350"><div class="section" id="ALM-12007__s3c7152bd1bd648aea0a18beede86237d"><h4 class="sectiontitle">Description</h4><p id="ALM-12007__en-us_topic_0070543667_p46722268">This alarm is generated when the process health check module detects that the process connection status is <strong id="ALM-12007__en-us_topic_0070543667_b17847232">Bad</strong> for three consecutive times. The process health check module checks the process status every 5 seconds.</p>
<p id="ALM-12007__en-us_topic_0070543667_p26407365">This alarm is cleared when the process can be connected.</p>
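<p>The detection rule above means a single failed check does not raise the alarm: three checks spaced 5 seconds apart must all fail, so the process must be unreachable for at least 10 seconds. The counting logic can be sketched as follows. This is an illustration only, not the health check module itself. The function name <strong>detect_process_fault</strong> and the one-status-per-line input are assumptions.</p>

```shell
#!/usr/bin/env bash
# Report "ALARM" once a health check has failed a given number of times in a row.
# Hypothetical sketch: reads one status per line (Good or Bad) from stdin instead
# of probing a real process, so the counting logic can be tried in isolation.
detect_process_fault() {
    local threshold="${1:-3}" bad=0 status
    while read -r status; do
        if [ "$status" = "Bad" ]; then
            bad=$((bad + 1))
        else
            bad=0            # any successful check resets the streak
        fi
        if [ "$bad" -ge "$threshold" ]; then
            echo "ALARM"
            return 0
        fi
    done
    echo "OK"
}

# Two failures, a recovery, then three failures in a row: the streak reaches 3.
printf 'Bad\nBad\nGood\nBad\nBad\nBad\n' | detect_process_fault 3
```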
</div>
<div class="section" id="ALM-12007__sb0d2e518431d4334b799e4fe2360d334"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12007__en-us_topic_0070543667_table58621855" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12007__en-us_topic_0070543667_row42640608"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12007__en-us_topic_0070543667_p31337228">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12007__en-us_topic_0070543667_p55287536">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12007__en-us_topic_0070543667_p49105461">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12007__en-us_topic_0070543667_row18119427"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p58387457">12007</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p31763543">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12007__en-us_topic_0070543667_p22710190">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12007__s83d0197c6d984834b79b3a1f2a44d5e4"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12007__en-us_topic_0070543667_table27586093" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12007__en-us_topic_0070543667_row64905719"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12007__en-us_topic_0070543667_p22871847">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12007__en-us_topic_0070543667_p40680291">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12007__row165443324551"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row6769300"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p11442466">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p54424564">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row20059035"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p14169102">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p6846651">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row61619862"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p25152906">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p24119499">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12007__sdd4b61f1ce0c4c3382bbfb0b51833241"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12007__en-us_topic_0070543667_p7522368">The service provided by the process is unavailable.</p>
</div>
<div class="section" id="ALM-12007__secbf87c6acc5443cb118200b72612df2"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12007__en-us_topic_0070543667_ul5332049"><li id="ALM-12007__en-us_topic_0070543667_li47988448">The instance process is abnormal.</li><li id="ALM-12007__en-us_topic_0070543667_li29242856">The disk space is insufficient.</li></ul>
<div class="note" id="ALM-12007__en-us_topic_0070543667_note61859112"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12007__en-us_topic_0070543667_p19861098">If a large number of process fault alarms exist in a time segment, files in the installation directory may be deleted mistakenly or permission on the directory may be modified.</p>
</div></div>
</div>
<div class="section" id="ALM-12007__sad734a42f8ef40529fb21b797d8b41e9"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12007__en-us_topic_0070543667_p65245121"><strong id="ALM-12007__b73856891719">Check whether the instance process is abnormal.</strong></p>
<ol id="ALM-12007__ol5390063317638"><li id="ALM-12007__li42005517036"><a name="ALM-12007__li42005517036"></a><a name="li42005517036"></a><span>In the FusionInsight Manager portal, click <strong id="ALM-12007__b3064793094522">O&M > Alarm<strong id="ALM-12007__b27872374104950"> > Alarms</strong></strong>, click <span><img id="ALM-12007__image14626452517" src="en-us_image_0000001080201158.png"></span> in the row where the alarm is located, and click the host name to view the address of the host for which the alarm is generated.</span></li><li id="ALM-12007__li911601917036"><span>On the <strong id="ALM-12007__b378050117036">Alarms</strong> page, check whether <a href="ALM-12006.html">ALM-12006 Node Fault</a> is generated.</span><p><ul class="subitemlist" id="ALM-12007__ul846943117036"><li id="ALM-12007__li452236417036">If yes, go to <a href="#ALM-12007__li20006517036">3</a>.</li><li id="ALM-12007__li3076720917036">If no, go to <a href="#ALM-12007__li195150317036">4</a>.</li></ul>
</p></li><li id="ALM-12007__li20006517036"><a name="ALM-12007__li20006517036"></a><a name="li20006517036"></a><span>Handle the alarm according to <a href="ALM-12006.html">ALM-12006 Node Fault</a>.</span></li><li id="ALM-12007__li195150317036"><a name="ALM-12007__li195150317036"></a><a name="li195150317036"></a><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12007__b8307212154711">root</strong>. <span id="ALM-12007__text43649449460"></span>Check whether the installation directory user, user group, and permission of the alarm role are correct. The user, user group, and the permission must be <strong id="ALM-12007__b180058917036">omm:ficommon 750</strong>.</span><p><p class="subitemlist" id="ALM-12007__p7190141912118">For example, the NameNode installation directory is<strong id="ALM-12007__b16534123110112"> </strong><em id="ALM-12007__i677216419119">${BIGDATA_HOME}</em><strong id="ALM-12007__b177174617112">/FusionInsight_Current/</strong><em id="ALM-12007__i137264460113">1_8_NameNode</em><strong id="ALM-12007__b13731846191113">/etc</strong>.</p>
<ul class="subitemlist" id="ALM-12007__ul2258645517036"><li id="ALM-12007__li1163004517036">If yes, go to <a href="#ALM-12007__li3396349817036">6</a>.</li><li id="ALM-12007__li250960617036">If no, go to <a href="#ALM-12007__li3247692317036">5</a>.</li></ul>
</p></li><li id="ALM-12007__li3247692317036"><a name="ALM-12007__li3247692317036"></a><a name="li3247692317036"></a><span>Run the following commands to set the permission to <strong id="ALM-12007__b1756352717036">750</strong> and <strong id="ALM-12007__b2385401617036">User:Group</strong> to <strong id="ALM-12007__b1335955517036">omm:ficommon</strong>:</span><p><p class="litext" id="ALM-12007__p833090817036"><strong id="ALM-12007__b5312713817036">chmod 750 </strong><em id="ALM-12007__i838219617036">&lt;folder_name&gt;</em></p>
<p class="litext" id="ALM-12007__p3343470817036"><strong id="ALM-12007__b786931417036">chown omm:ficommon </strong><em id="ALM-12007__i371496717036">&lt;folder_name&gt;</em></p>
</p></li><li id="ALM-12007__li3396349817036"><a name="ALM-12007__li3396349817036"></a><a name="li3396349817036"></a><span>Wait for 5 minutes. In the alarm list, check whether <strong id="ALM-12007__b2385685117036">ALM-12007 Process Fault</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul2693144617036"><li id="ALM-12007__li1338507017036">If yes, no further action is required.</li><li id="ALM-12007__li1044892317036">If no, go to <a href="#ALM-12007__li2657388817036">7</a>.</li></ul>
</p></li></ol>
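<p>The owner, user group, and permission check in steps 4 and 5 can be done with one command instead of reading <strong>ls -l</strong> output. The following is a minimal sketch, assuming GNU <strong>stat</strong> is available. The function name <strong>check_dir_perm</strong> is hypothetical, and the example path is the NameNode directory named in step 4.</p>

```shell
#!/usr/bin/env bash
# Compare a directory's owner, group, and mode with the expected values.
# Assumes GNU stat (the -c option); check_dir_perm is a hypothetical helper name.
# Defaults match the values required by this procedure: omm:ficommon and 750.
check_dir_perm() {
    local dir="$1" want_owner="${2:-omm:ficommon}" want_mode="${3:-750}" actual
    # %U:%G prints owner:group and %a prints the octal mode, for example "omm:ficommon 750".
    actual=$(stat -c '%U:%G %a' "$dir") || return 2
    if [ "$actual" = "$want_owner $want_mode" ]; then
        echo "OK: $dir is $actual"
    else
        echo "MISMATCH: $dir is $actual, expected $want_owner $want_mode"
        return 1
    fi
}

# Example (adjust the path to the installation directory of the alarm role):
# check_dir_perm "${BIGDATA_HOME}/FusionInsight_Current/1_8_NameNode/etc"
```

<p>If the function reports a mismatch, correct the directory with the <strong>chmod</strong> and <strong>chown</strong> commands in step 5, and then run the check again.</p>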
<p class="tableheading" id="ALM-12007__p5673574217645"><strong id="ALM-12007__b1353742417650">Check whether disk space is sufficient.</strong></p>
<ol start="7" id="ALM-12007__ol2289926317658"><li id="ALM-12007__li2657388817036"><a name="ALM-12007__li2657388817036"></a><a name="li2657388817036"></a><span>On the FusionInsight Manager, check whether the alarm list contains <strong id="ALM-12007__b3723602917036">ALM-12017 Insufficient Disk Capacity</strong>.</span><p><ul class="subitemlist" id="ALM-12007__ul6260497717036"><li id="ALM-12007__li6332838717036">If yes, go to <a href="#ALM-12007__li500135217036">8</a>.</li><li id="ALM-12007__li2932572917036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li><li id="ALM-12007__li500135217036"><a name="ALM-12007__li500135217036"></a><a name="li500135217036"></a><span>Rectify the fault by following the steps provided in <a href="ALM-12017.html">ALM-12017 Insufficient Disk Capacity</a>.</span></li><li id="ALM-12007__li2288625317036"><span>Wait for 5 minutes. In the alarm list, check whether <strong id="ALM-12007__b4501217017036">ALM-12017 Insufficient Disk Capacity</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul999945717036"><li id="ALM-12007__li2210716917036">If yes, go to <a href="#ALM-12007__li1723673717036">10</a>.</li><li id="ALM-12007__li4585029317036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li><li id="ALM-12007__li1723673717036"><a name="ALM-12007__li1723673717036"></a><a name="li1723673717036"></a><span>Wait for 5 minutes. In the alarm list, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul3418148317036"><li id="ALM-12007__li464969017036">If yes, no further action is required.</li><li id="ALM-12007__li4108064417036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li></ol>
<p id="ALM-12007__p3392472417052"><strong id="ALM-12007__b2313861417057">Collect fault information.</strong></p>
<ol start="11" id="ALM-12007__ol481086251710"><li id="ALM-12007__li1622379717036"><a name="ALM-12007__li1622379717036"></a><a name="li1622379717036"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12007__b2091290617036">O&M</strong> > <strong id="ALM-12007__b5399842717036">Log > Download</strong>.</span></li><li id="ALM-12007__li1598834917036"><span>According to the service name obtained in <a href="#ALM-12007__li42005517036">1</a>, select the component and <strong id="ALM-12007__b68821814172417">NodeAgent</strong> from the <strong id="ALM-12007__b15959191911544">Service</strong> and click <strong id="ALM-12007__b3991118545">OK</strong>.</span></li><li id="ALM-12007__li1145664103113"><span>Click <span><img id="ALM-12007__image1945644173117" src="en-us_image_0269383814.png"></span> in the upper right corner, and set <strong id="ALM-12007__b6456941173117">Start Date</strong> and <strong id="ALM-12007__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12007__b13456164113319">Download</strong>.</span></li><li id="ALM-12007__li495644512588"><span>Contact the <span id="ALM-12007__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12007__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12007__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12007__sb81c90a530914c14b08552a98ff5c8d0"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12007__en-us_topic_0070543667_p23735849">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>
92
docs/mrs/umn/ALM-12010.html
Normal file
@ -0,0 +1,92 @@
<a name="ALM-12010"></a><a name="ALM-12010"></a>
<h1 class="topictitle1">ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes</h1>
<div id="body52794692"><div class="section" id="ALM-12010__s280ef4e111974c26b59b1fff047f7699"><h4 class="sectiontitle">Description</h4><p id="ALM-12010__en-us_topic_0070543674_p63605632">This alarm is generated when the active Manager does not receive the heartbeat signal from the standby Manager within 7 seconds.</p>
<p id="ALM-12010__en-us_topic_0070543674_p35579781">This alarm is cleared when the active Manager receives heartbeat signals from the standby Manager.</p>
</div>
<div class="section" id="ALM-12010__s499887f79aa24499a2d2e7e398da0453"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12010__en-us_topic_0070543674_table63390002" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12010__en-us_topic_0070543674_row446859"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12010__en-us_topic_0070543674_p36195658">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12010__en-us_topic_0070543674_p46167218">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12010__en-us_topic_0070543674_p48557162">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12010__en-us_topic_0070543674_row40816026"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12010__en-us_topic_0070543674_p17763819">12010</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12010__en-us_topic_0070543674_p29583236">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12010__en-us_topic_0070543674_p47431950">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12010__s69e79a64c37f4996a9e6280d78e16d58"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12010__en-us_topic_0070543674_table16782769" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12010__en-us_topic_0070543674_row9145947"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12010__en-us_topic_0070543674_p2624279">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12010__en-us_topic_0070543674_p11240014">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12010__row113122810557"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12010__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12010__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12010__en-us_topic_0070543674_row38025962"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12010__en-us_topic_0070543674_p60204115">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12010__en-us_topic_0070543674_p44695152">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12010__en-us_topic_0070543674_row66712054"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12010__en-us_topic_0070543674_p34967293">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12010__en-us_topic_0070543674_p13778476">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12010__en-us_topic_0070543674_row56897427"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12010__en-us_topic_0070543674_p45288850">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12010__en-us_topic_0070543674_p44518222">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12010__s51cab4675b644be49bc4ff774ddbd51c"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12010__en-us_topic_0070543674_p9588202">When the active Manager process is abnormal, an active/standby failover cannot be performed, and services are affected.</p>
</div>
<div class="section" id="ALM-12010__s7843db533b38470ea902ef6788b89a22"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12010__en-us_topic_0070543674_p38446893"></p>
<ul id="ALM-12010__ul11347112011510"><li id="ALM-12010__li17347132014154">The link between the active and standby Manager is abnormal.</li><li id="ALM-12010__li127451022151512">The node name configuration is incorrect.</li><li id="ALM-12010__li15347620181517">The port is disabled by the firewall.</li></ul>
</div>
<div class="section" id="ALM-12010__s8af1753e22d647b9b1328244e85fc0a1"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12010__en-us_topic_0070543674_p27190637"><strong id="ALM-12010__b5350194613159">Check whether the network between the active and standby Manager servers is normal.</strong></p>
<ol id="ALM-12010__ol20655039202014"><li id="ALM-12010__li3649153912014"><span>In the FusionInsight Manager portal, click <strong id="ALM-12010__b3064793094522">O&M > Alarm<strong id="ALM-12010__b27872374104950"> > Alarms</strong></strong>, click <span><img id="ALM-12010__image4649163910207" src="en-us_image_0269383815.png"></span> in the row containing the alarm and view the IP address of the standby Manager (Peer Manager) server in the alarm details.</span></li><li id="ALM-12010__li665018399204"><span>Log in to the active Manager server as user <strong id="ALM-12010__b16650193982017">root</strong>. <span id="ALM-12010__text13862037144910"></span><span id="ALM-12010__text077751144915"></span></span></li><li id="ALM-12010__li86511539112014"><span>Run the <strong id="ALM-12010__b14650439102018">ping</strong> <em id="ALM-12010__i96503394205">standby Manager heartbeat IP address</em> command to check whether the standby Manager server is reachable.</span><p><ul class="subitemlist" id="ALM-12010__ul565043917209"><li id="ALM-12010__li665012399202">If yes, go to <a href="#ALM-12010__li206521339172011">6</a>.</li><li id="ALM-12010__li36504394207">If no, go to <a href="#ALM-12010__li18651103915205">4</a>.</li></ul>
</p></li><li id="ALM-12010__li18651103915205"><a name="ALM-12010__li18651103915205"></a><a name="li18651103915205"></a><span>Contact the network administrator to check whether the network is faulty.</span><p><ul class="subitemlist" id="ALM-12010__ul1465123917207"><li id="ALM-12010__li7651539162019">If yes, go to <a href="#ALM-12010__li166511739102017">5</a>.</li><li id="ALM-12010__li12651153932016">If no, go to <a href="#ALM-12010__li206521339172011">6</a>.</li></ul>
</p></li><li id="ALM-12010__li166511739102017"><a name="ALM-12010__li166511739102017"></a><a name="li166511739102017"></a><span>Rectify the network fault and check whether the alarm is cleared from the alarm list.</span><p><ul class="subitemlist" id="ALM-12010__ul12651143992015"><li id="ALM-12010__li66510391204">If yes, no further action is required.</li><li id="ALM-12010__li165193912202">If no, go to <a href="#ALM-12010__li206521339172011">6</a>.</li></ul>
</p></li><li class="subitemlist" id="ALM-12010__li206521339172011"><a name="ALM-12010__li206521339172011"></a><a name="li206521339172011"></a><span>Run the following command to go to the software installation directory:</span><p><p id="ALM-12010__p1652939182013"><strong id="ALM-12010__b136521139172015">cd /opt</strong></p>
</p></li><li id="ALM-12010__li206524391203"><span>Run the following command to find the configuration file directory of the active and standby nodes:</span><p><p id="ALM-12010__p8652153962016"><strong id="ALM-12010__b16652173917208">find -name hacom_local.xml</strong></p>
</p></li><li id="ALM-12010__li9652143912209"><span>Run the following command to go to the <strong id="ALM-12010__b1265243992012">workspace</strong> directory:</span><p><p id="ALM-12010__p36527396208"><strong id="ALM-12010__b1765203982016">cd ${BIGDATA_HOME}/om-server/OMS/workspace0</strong><strong id="ALM-12010__b1564419127399">/ha/local/hacom/conf/</strong></p>
</p></li><li id="ALM-12010__li1065213914202"><span>Run the <strong id="ALM-12010__b11213458183417">vim</strong> command to open the <strong id="ALM-12010__b521318586344">hacom_local.xml</strong> file. Check whether the local and peer nodes are correctly configured. The local node is configured as the active node, and the peer node is configured as the standby node.</span><p><ul id="ALM-12010__ul1365263916206"><li id="ALM-12010__li13652123919204">If yes, go to <a href="#ALM-12010__li56481639112012">12</a>.</li><li id="ALM-12010__li126521439182014">If no, go to <a href="#ALM-12010__li18655163992011">10</a>.</li></ul>
</p></li><li id="ALM-12010__li18655163992011"><a name="ALM-12010__li18655163992011"></a><a name="li18655163992011"></a><span>Modify the configuration of the active and standby nodes in the <strong id="ALM-12010__b8957024133513">hacom_local.xml</strong> file and press <strong id="ALM-12010__b59571324153518">Esc</strong> to return to the command mode. Run the <strong id="ALM-12010__b69571524173512">:wq</strong> command to save the modification and exit.</span></li><li id="ALM-12010__li1265563992014"><span>Check whether the alarm is cleared automatically.</span><p><ul id="ALM-12010__ul116551239192019"><li id="ALM-12010__li11655123992018">If yes, no further action is required.</li><li id="ALM-12010__li665543992012">If no, go to <a href="#ALM-12010__li56481639112012">12</a>.</li></ul>
</p></li></ol>
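<p>The reachability check in step 3 and the configuration file search in steps 6 and 7 can be sketched as two small helpers. This is an illustration under stated assumptions: both function names are hypothetical, and the <strong>ping</strong> options shown (<strong>-c</strong> for the packet count, <strong>-W</strong> for the reply timeout in seconds) are those of the Linux iputils <strong>ping</strong>.</p>

```shell
#!/usr/bin/env bash
# Check reachability with a bounded ping; a plain "ping <address>" on Linux
# keeps running until it is interrupted. Both function names are hypothetical.
peer_reachable() {
    ping -c 3 -W 2 "$1" >/dev/null 2>&1
}

# Print the directory of every hacom_local.xml found below a root (default: /opt).
find_ha_conf_dir() {
    find "${1:-/opt}" -name hacom_local.xml -exec dirname {} \; 2>/dev/null
}

# Example (replace the placeholder with the standby Manager heartbeat IP address):
# peer_reachable "<standby heartbeat IP>" && echo reachable || echo unreachable
# find_ha_conf_dir /opt
```

<p>Passing the search root as an argument avoids depending on the current directory, which the <strong>cd /opt</strong> step otherwise sets up for <strong>find</strong>.</p>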
<p id="ALM-12010__p151791650141914"><strong id="ALM-12010__b193901952171915">Check whether the port is disabled by the firewall.</strong></p>
<ol start="12" id="ALM-12010__ol1264983932018"><li id="ALM-12010__li56481639112012"><a name="ALM-12010__li56481639112012"></a><a name="li56481639112012"></a><span>Run the <strong id="ALM-12010__b193834425356">lsof -i :20012</strong> command to check whether the heartbeat ports of the active and standby nodes are enabled. If the command output is displayed, the ports are enabled. Otherwise, the ports are disabled by the firewall.</span><p><ul id="ALM-12010__ul20648143982016"><li id="ALM-12010__li2064816399204">If yes, go to <a href="#ALM-12010__li8648153982010">13</a>.</li><li id="ALM-12010__li116484391209">If no, go to <a href="#ALM-12010__li41244883171443">16</a>.</li></ul>
</p></li><li id="ALM-12010__li8648153982010"><a name="ALM-12010__li8648153982010"></a><a name="li8648153982010"></a><span>Run the <strong id="ALM-12010__b064853911204">iptables -P INPUT ACCEPT</strong> command to prevent the server from being disconnected.</span></li><li id="ALM-12010__li8648113917204"><span>Run the following command to clear the firewall rules:</span><p><p id="ALM-12010__p1564893915206"><strong id="ALM-12010__b3648539112020">iptables -F</strong></p>
</p></li><li id="ALM-12010__li5649163982013"><span>Check whether the alarm is cleared from the alarm list.</span><p><ul id="ALM-12010__ul12649143919207"><li id="ALM-12010__li76481939182016">If yes, no further action is required.</li><li id="ALM-12010__li6649839152018">If no, go to <a href="#ALM-12010__li41244883171443">16</a>.</li></ul>
</p></li></ol>
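<p>Because <strong>iptables -F</strong> removes all rules in the filter table, it can be useful to save the current rules first so that they can be restored later. The following sketch wraps steps 13 and 14 with such a backup. The backup step is an addition for illustration and is not part of the documented procedure. The function names, the <strong>DRY_RUN</strong> switch, and the backup path are assumptions. <strong>iptables-save</strong> and <strong>iptables-restore</strong> are the standard iptables companion tools.</p>

```shell
#!/usr/bin/env bash
# Back up the current rules before flushing them, so the flush can be undone.
# With DRY_RUN=1 the commands are only printed. The backup path and function
# names are assumptions for illustration; run as root on the affected node.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

flush_firewall_with_backup() {
    local backup="${1:-/tmp/iptables.backup}"
    run sh -c "iptables-save > '$backup'"   # restore later with: iptables-restore < "$backup"
    run iptables -P INPUT ACCEPT            # keep the current session from being cut off
    run iptables -F
}

# Print the commands without changing any rule.
DRY_RUN=1 flush_firewall_with_backup
```

<p>Run the function without <strong>DRY_RUN=1</strong> only after reviewing the printed commands.</p>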
<p id="ALM-12010__p66076255171453"><strong id="ALM-12010__b56103124171459">Collect fault information.</strong></p>
<ol start="16" id="ALM-12010__ol4742499917152"><li id="ALM-12010__li41244883171443"><a name="ALM-12010__li41244883171443"></a><a name="li41244883171443"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12010__b2091290617036">O&M</strong> > <strong id="ALM-12010__b4582764171443">Log > Download</strong>.</span></li><li id="ALM-12010__li52887856171443"><span>Select the following nodes from the <strong id="ALM-12010__b1114195518811">Service</strong> and click<strong id="ALM-12010__b11411559819"> OK</strong>:</span><p><ul class="subitemlist" id="ALM-12010__ul58072211171443"><li id="ALM-12010__li2749285171443">OmmServer</li><li id="ALM-12010__li24743571171443">Controller</li><li id="ALM-12010__li21365548171443">NodeAgent</li></ul>
</p></li><li id="ALM-12010__li1145664103113"><span>Click <span><img id="ALM-12010__image1945644173117" src="en-us_image_0269383816.png"></span> in the upper right corner, and set <strong id="ALM-12010__b6456941173117">Start Date</strong> and <strong id="ALM-12010__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12010__b13456164113319">Download</strong>.</span></li><li id="ALM-12010__li495644512588"><span>Contact the <span id="ALM-12010__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12010__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12010__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12010__s785de8080aae450dbd0d37da4f9f95ef"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12010__en-us_topic_0070543674_p25816034">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
109
docs/mrs/umn/ALM-12011.html
Normal file
File diff suppressed because it is too large
89
docs/mrs/umn/ALM-12014.html
Normal file
@ -0,0 +1,89 @@
|
||||
<a name="ALM-12014"></a><a name="ALM-12014"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12014 Partition Lost</h1>
|
||||
<div id="body1841371"><div class="section" id="ALM-12014__s305a8061b9134145a1a1e3f83ea9bfc4"><h4 class="sectiontitle">Description</h4><p id="ALM-12014__en-us_topic_0070543526_p19713524">The system checks the partition status every 60 seconds. This alarm is generated when the system detects that a partition to which service directories are mounted is lost (because the device is removed or goes offline, or the partition is deleted).</p>
|
||||
<p id="ALM-12014__en-us_topic_0070543526_p43203995">This alarm must be manually cleared.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__s9888b5efac804e36a1257629159c863d"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12014__en-us_topic_0070543526_table9862716" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12014__en-us_topic_0070543526_row48323826"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12014__en-us_topic_0070543526_p21915813">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12014__en-us_topic_0070543526_p30350400">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12014__en-us_topic_0070543526_p42463340">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12014__en-us_topic_0070543526_row16978512"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p33082261">12014</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p62417507">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12014__en-us_topic_0070543526_p22653281">No</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__sdc60e2f5e8e54a56b097fb6639e617f1"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12014__en-us_topic_0070543526_table22976439" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12014__en-us_topic_0070543526_row62980604"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12014__en-us_topic_0070543526_p1155275">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12014__en-us_topic_0070543526_p26468439">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12014__row165391015115518"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12014__en-us_topic_0070543526_row63568775"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p48797123">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p60252916">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12014__en-us_topic_0070543526_row5405340"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p35179415">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p30960379">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12014__en-us_topic_0070543526_row10207955"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p21538022">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p66858230">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12014__en-us_topic_0070543526_row64853160"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p18614589">DirName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p31386755">Specifies the directory for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12014__en-us_topic_0070543526_row14045344"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12014__en-us_topic_0070543526_p63931057">PartitionName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12014__en-us_topic_0070543526_p11033160">Specifies the device partition for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__s643c7b7b793f43bc844587047e233b94"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12014__en-us_topic_0070543526_p21270730">Service data fails to be written into the partition, and the service system runs abnormally.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__s86d97a0503184bfd9d0e267312170d65"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12014__en-us_topic_0070543526_ul45207585"><li id="ALM-12014__en-us_topic_0070543526_li4215088">The hard disk is removed.</li><li id="ALM-12014__en-us_topic_0070543526_li37935797">The hard disk is offline, or a bad sector exists on the hard disk.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__sb1a1ee7b7a444d5dbe8388e9c9e8bba9"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12014__ol43371064173421"><li id="ALM-12014__li30640494173421"><span>On FusionInsight Manager, choose <strong id="ALM-12014__b18317580173421">O&M > Alarm > Alarms</strong>, and click <span><img id="ALM-12014__image10408151910137" src="en-us_image_0269383822.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12014__li51941841173421"><span>Obtain <strong id="ALM-12014__b65960965173421">HostName</strong>, <strong id="ALM-12014__b56777780173421">PartitionName</strong>, and <strong id="ALM-12014__b41237977173421">DirName</strong> from <strong id="ALM-12014__b645062473115">Location</strong>.</span></li><li id="ALM-12014__li15983295173421"><span>Check whether the disk of <strong id="ALM-12014__b64823390173421">PartitionName</strong> on <strong id="ALM-12014__b46539606173421">HostName</strong> is inserted into the correct server slot.</span><p><ul class="subitemlist" id="ALM-12014__ul9232462173421"><li id="ALM-12014__li11611727173421">If yes, go to <a href="#ALM-12014__li9631929173421">4</a>.</li><li id="ALM-12014__li1025829173421">If no, go to <a href="#ALM-12014__li18162941173421">5</a>.</li></ul>
|
||||
</p></li><li id="ALM-12014__li9631929173421"><a name="ALM-12014__li9631929173421"></a><a name="li9631929173421"></a><span>Contact hardware engineers to remove the faulty disk.</span></li><li id="ALM-12014__li18162941173421"><a name="ALM-12014__li18162941173421"></a><a name="li18162941173421"></a><span>Log in as user <strong id="ALM-12014__b37365710490">root</strong> to the <strong id="ALM-12014__b19578501173421">HostName</strong> node where the alarm is reported and check whether the <strong id="ALM-12014__b42354785173421">/etc/fstab</strong> file contains a line with <strong id="ALM-12014__b41988789173421">DirName</strong>. <span id="ALM-12014__text43649449460"></span></span><p><ul class="subitemlist" id="ALM-12014__ul61670428173421"><li id="ALM-12014__li8185528173421">If yes, go to <a href="#ALM-12014__li20338192173421">6</a>.</li><li id="ALM-12014__li59048052173421">If no, go to <a href="#ALM-12014__li48826004173421">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12014__li20338192173421"><a name="ALM-12014__li20338192173421"></a><a name="li20338192173421"></a><span>Run the <strong id="ALM-12014__b29248746173421">vi /etc/fstab</strong> command to edit the file and delete the line containing <strong id="ALM-12014__b61912122173421">DirName</strong>.</span></li><li id="ALM-12014__li48826004173421"><a name="ALM-12014__li48826004173421"></a><a name="li48826004173421"></a><span>Contact hardware engineers to insert a new disk. For details, see the hardware product documentation for the relevant model. If the faulty disk is in a RAID group, configure the RAID group. For details, see the configuration methods of the relevant RAID controller card.</span></li><li id="ALM-12014__li55753407173421"><span>Wait 20 to 30 minutes (the waiting time depends on the disk size), and then run the <strong id="ALM-12014__b36780855173421">mount</strong> command to check whether the disk has been mounted to the <strong id="ALM-12014__b62592242173421">DirName</strong> directory.</span><p><ul class="subitemlist" id="ALM-12014__ul28564444173421"><li id="ALM-12014__li26459270173421">If yes, manually clear the alarm. No further action is required.</li><li id="ALM-12014__li62826150173421">If no, go to <a href="#ALM-12014__li1607193817587">9</a>.</li></ul>
|
||||
</p></li></ol>
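<p>The <strong>/etc/fstab</strong> and <strong>mount</strong> checks in the preceding steps can be sketched in Python. This is only an illustration, not part of FusionInsight Manager: the function names and the sample path <strong>/srv/BigData/data1</strong> are hypothetical, and the sketch matches the mount point field exactly rather than searching for any line that contains <strong>DirName</strong>. On a real node, the text would typically come from <strong>/etc/fstab</strong> and <strong>/proc/mounts</strong>.</p>

```python
def fstab_has_dir(fstab_text: str, dir_name: str) -> bool:
    """Return True if an active (non-comment) fstab line uses dir_name as mount point."""
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # The second field of an fstab entry is the mount point.
        if len(fields) >= 2 and fields[1] == dir_name:
            return True
    return False


def is_mounted(mounts_text: str, dir_name: str) -> bool:
    """Return True if dir_name is a mount point in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == dir_name:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical sample data; replace with the contents of the real files.
    dir_name = "/srv/BigData/data1"
    sample_fstab = "# data disk\n/dev/sdb1 /srv/BigData/data1 ext4 defaults 0 0\n"
    sample_mounts = "/dev/sda1 / ext4 rw,relatime 0 0\n"
    print("fstab entry present:", fstab_has_dir(sample_fstab, dir_name))
    print("currently mounted:", is_mounted(sample_mounts, dir_name))
```

<p>In this sample, the directory still has an entry in the file but is not mounted, which corresponds to the situation where the stale line is deleted before a new disk is inserted.</p>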
|
||||
<p id="ALM-12014__p0392542185819"><strong id="ALM-12014__b59246063204559">Collect fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12014__ol36071038115815"><li id="ALM-12014__li1607193817587"><a name="ALM-12014__li1607193817587"></a><a name="li1607193817587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12014__b87862548435">O&M</strong> > <strong id="ALM-12014__b11281153164820">Log > Download</strong>.</span></li><li id="ALM-12014__li1560793895812"><span>Select <strong id="ALM-12014__b486612581809">OmmServer</strong> from the Services drop-down list and click <strong id="ALM-12014__b20607238175815">OK</strong>.</span></li><li id="ALM-12014__li660723815584"><span>Set Start Date for log collection to 10 minutes before the alarm generation time and End Date to 10 minutes after the alarm generation time, and then click <strong id="ALM-12014__b15452018112">Download</strong>.</span></li><li id="ALM-12014__li495644512588"><span>Contact <span id="ALM-12014__text4614151421417">O&M personnel</span> and send them the collected log information.</span></li></ol>
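<p>The log collection window used in the preceding step is a simple offset around the alarm generation time. The following Python sketch shows the arithmetic. The function name <strong>log_window</strong> and the sample timestamp are hypothetical and are not taken from the product.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after alarm_time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time.
    start, end = log_window(datetime(2023, 5, 17, 3, 0, 0))
    print("Start Date:", start.isoformat(sep=" "))
    print("End Date:", end.isoformat(sep=" "))
```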
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12014__p697913319401">After the fault is rectified, the system does not automatically clear this alarm. You must clear it manually.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12014__s30aa982d9de44f9d918fba0190750058"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12014__en-us_topic_0070543526_p21391728">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
84
docs/mrs/umn/ALM-12015.html
Normal file
@ -0,0 +1,84 @@
|
||||
<a name="ALM-12015"></a><a name="ALM-12015"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12015 Partition Filesystem Readonly</h1>
|
||||
<div id="body9381957"><div class="section" id="ALM-12015__s41b60b34b63e454baf9aef68dfc8832e"><h4 class="sectiontitle">Description</h4><p id="ALM-12015__en-us_topic_0070543537_p17307650">The system checks the partition status every 60 seconds. This alarm is generated when the system detects that a partition to which service directories are mounted enters the read-only mode (due to a bad sector or a faulty file system).</p>
|
||||
<p id="ALM-12015__en-us_topic_0070543537_p21551128">This alarm is cleared when the system detects that the partition to which service directories are mounted exits from the read-only mode (because the file system is restored to read/write mode, the device is removed, or the device is formatted).</p>
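<p>The document does not describe how the system detects the read-only mode. As a rough illustration only, on Linux a read-only mount carries the <strong>ro</strong> option in its <strong>/proc/mounts</strong> entry, so a manual check could look like the following Python sketch. The function name and the sample mount entries are hypothetical.</p>

```python
def read_only_mount_points(mounts_text: str) -> list:
    """Return mount points whose options include 'ro' in /proc/mounts-style text."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete mount entry
        options = fields[3].split(",")
        if "ro" in options:
            result.append(fields[1])
    return result


if __name__ == "__main__":
    # Hypothetical sample; on a node, read the text from /proc/mounts instead.
    sample = (
        "/dev/sda1 / ext4 rw,relatime 0 0\n"
        "/dev/sdb1 /srv/BigData/data1 ext4 ro,relatime 0 0\n"
    )
    print(read_only_mount_points(sample))
```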
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__sffa8391d8f1c4f9e86277fa0559d6f9c"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12015__en-us_topic_0070543537_table810977" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12015__en-us_topic_0070543537_row13603222"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12015__en-us_topic_0070543537_p28119201">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12015__en-us_topic_0070543537_p63062784">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12015__en-us_topic_0070543537_p7811868">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12015__en-us_topic_0070543537_row28781532"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p49602728">12015</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p58398057">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12015__en-us_topic_0070543537_p32622215">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__sd94045db4cc74b8eb1b6042c7e810316"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12015__en-us_topic_0070543537_table25153769" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12015__en-us_topic_0070543537_row51573880"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12015__en-us_topic_0070543537_p16734741">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12015__en-us_topic_0070543537_p13336762">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12015__row1345119112556"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12015__en-us_topic_0070543537_row6535952"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p59650110">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p66929645">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12015__en-us_topic_0070543537_row65495901"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p3567727">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p20550486">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12015__en-us_topic_0070543537_row50736653"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p16028238">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p23218903">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12015__en-us_topic_0070543537_row7643543"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p15147232">DirName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p18966278">Specifies the directory for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12015__en-us_topic_0070543537_row36478775"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12015__en-us_topic_0070543537_p1990770">PartitionName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12015__en-us_topic_0070543537_p27034686">Specifies the device partition for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__s7a7c66de27a0476297f75d662c4fdd37"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12015__en-us_topic_0070543537_p42325954">Service data fails to be written into the partition, and the service system runs abnormally.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__sc5211f0c333e491987141617bb9cc5d2"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12015__en-us_topic_0070543537_p5850279">The hard disk is faulty, for example, a bad sector exists.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__s2082e61748a44109ae22b65edd6caf4f"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12015__en-us_topic_0070543537_ol4110613"><li id="ALM-12015__en-us_topic_0070543537_li36995518"><span>On FusionInsight Manager, choose <strong id="ALM-12015__b87862548435">O&M</strong> > <strong id="ALM-12015__b10296131615319">Alarm > Alarms</strong>, and click<strong id="ALM-12015__b142969161035"> </strong><span><img id="ALM-12015__image10408151910137" src="en-us_image_0269383823.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12015__en-us_topic_0070543537_li64524211"><span>Obtain <strong id="ALM-12015__en-us_topic_0070543537_b59078569">HostName</strong> and <strong id="ALM-12015__en-us_topic_0070543537_b61945077">PartitionName</strong> from <strong id="ALM-12015__b196121357184515">Location</strong>. <strong id="ALM-12015__en-us_topic_0070543537_b51495331">HostName</strong> is the node where the alarm is reported, and <strong id="ALM-12015__en-us_topic_0070543537_b60804799">PartitionName</strong> is the partition of the faulty disk.</span></li><li id="ALM-12015__en-us_topic_0070543537_li10372286"><span>Contact hardware engineers to check whether the disk is faulty. If the disk is faulty, remove it from the server.</span></li><li id="ALM-12015__en-us_topic_0070543537_li26241711"><span>After the disk is removed, alarm <strong id="ALM-12015__en-us_topic_0070543537_b34848813">ALM-12014 Partition Lost</strong> is reported. Handle that alarm by following <a href="ALM-12014.html">ALM-12014 Partition Lost</a>. After alarm <strong id="ALM-12015__en-us_topic_0070543537_b4181593">ALM-12014 Partition Lost</strong> is cleared, alarm <strong id="ALM-12015__en-us_topic_0070543537_b37634337">ALM-12015 Partition Filesystem Readonly</strong> is automatically cleared.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12015__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12015__sf5e2cb9b038e4d10ba12d3a10354e8c4"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12015__en-us_topic_0070543537_p28482418">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
92
docs/mrs/umn/ALM-12016.html
Normal file
@ -0,0 +1,92 @@
|
||||
<a name="ALM-12016"></a><a name="ALM-12016"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12016 CPU Usage Exceeds the Threshold</h1>
|
||||
<div id="body3068798"><div class="section" id="ALM-12016__sd2aa377cedd7428ab43926bcd0571371"><h4 class="sectiontitle">Description</h4><p id="ALM-12016__en-us_topic_0070543548_p27177474">The system checks the CPU usage every 30 seconds and compares the actual CPU usage with the threshold. The CPU usage has a default threshold. This alarm is generated when the CPU usage exceeds the threshold several consecutive times (configurable; 10 times by default).</p>
|
||||
<p id="ALM-12016__p20853383104938">The alarm is cleared in either of the following scenarios: The value of <strong id="ALM-12016__b6894114712255">Trigger Count</strong> is 1 and the CPU usage is less than or equal to the threshold; the value of <strong id="ALM-12016__b44134084101639"><strong id="ALM-12016__b041615559258">Trigger Count</strong></strong> is greater than 1 and the CPU usage is less than or equal to 90% of the threshold.</p>
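<p>The generation and clearing rules above can be sketched as a small state machine. This is a minimal Python illustration under stated assumptions, not the product implementation: the class name <strong>CpuAlarm</strong> is hypothetical, each call to <strong>check</strong> stands for one 30-second check, and the sketch assumes that any sample at or below the threshold resets the consecutive counter.</p>

```python
class CpuAlarm:
    """Illustrative model of the Trigger Count and 90% clearing rules."""

    def __init__(self, threshold: float, trigger_count: int = 10):
        self.threshold = threshold          # CPU usage threshold, in percent
        self.trigger_count = trigger_count  # consecutive checks needed to raise the alarm
        self.consecutive = 0                # consecutive checks above the threshold so far
        self.active = False                 # whether the alarm is currently raised

    def check(self, usage: float) -> bool:
        """Process one periodic sample and return the resulting alarm state."""
        if usage > self.threshold:
            self.consecutive += 1
            if self.consecutive >= self.trigger_count:
                self.active = True
        else:
            # Assumption: a sample at or below the threshold resets the counter.
            self.consecutive = 0
            if self.trigger_count == 1:
                clear_level = self.threshold
            else:
                clear_level = 0.9 * self.threshold
            if self.active and usage <= clear_level:
                self.active = False
        return self.active


if __name__ == "__main__":
    # Hypothetical threshold of 80% with Trigger Count set to 3.
    alarm = CpuAlarm(threshold=80, trigger_count=3)
    for usage in [85, 85, 85, 75, 70]:
        print(usage, alarm.check(usage))
```

<p>With <strong>Trigger Count</strong> greater than 1, the gap between the threshold and 90% of the threshold keeps the alarm from toggling when the CPU usage hovers just below the threshold.</p>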
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__sed5654e0fb4744e6b4d40addf988ce76"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12016__en-us_topic_0070543548_table15263883" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12016__en-us_topic_0070543548_row2649980"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12016__en-us_topic_0070543548_p13321813">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12016__en-us_topic_0070543548_p5325067">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12016__en-us_topic_0070543548_p28677253">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12016__en-us_topic_0070543548_row41156139"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p45312999">12016</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p46474270">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12016__en-us_topic_0070543548_p6319546">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__sba368bce011f4d36800cdf21f0be3bb8"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12016__en-us_topic_0070543548_table42121251" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12016__en-us_topic_0070543548_row29066061"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12016__en-us_topic_0070543548_p5540732">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12016__en-us_topic_0070543548_p46146172">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12016__row17737167175520"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12016__en-us_topic_0070543548_row46852469"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p36953614">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p40452719">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12016__en-us_topic_0070543548_row28530154"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p29241163">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p19724041">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12016__en-us_topic_0070543548_row43298646"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p17529450">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p10599309">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12016__en-us_topic_0070543548_row28284925"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12016__en-us_topic_0070543548_p9377597">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12016__en-us_topic_0070543548_p21387873">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__s5446085a2a0441728a92a541f5eb95ae"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12016__en-us_topic_0070543548_p54696146">Service processes respond slowly or become unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__s23c7881992f44efb95893912e391c0c0"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12016__en-us_topic_0070543548_ul1202807"><li id="ALM-12016__en-us_topic_0070543548_li10825264">The alarm threshold or the alarm smoothing times (Trigger Count) are set improperly.</li><li id="ALM-12016__en-us_topic_0070543548_li30318520">The CPU configuration cannot meet service requirements, and the CPU usage reaches the upper limit.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__s43e4003b37294857a410ff23763ad2ef"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12016__en-us_topic_0070543548_p39881087"><strong id="ALM-12016__b58386659173930">Check whether the alarm threshold or alarm <strong id="ALM-12016__b18142175243719">Trigger Count</strong> is correct.</strong></p>
|
||||
<ol id="ALM-12016__ol1362745417400"><li id="ALM-12016__li24816170173938"><span>Change the alarm threshold and alarm <strong id="ALM-12016__b13281711203813">Trigger Count</strong> based on CPU usage.</span><p><p class="litext" id="ALM-12016__p6523306173938">On FusionInsight Manager, choose <strong id="ALM-12016__b73164535166">O&M</strong> > <strong id="ALM-12016__b1366935516171">Alarm</strong> > <strong id="ALM-12016__b14318131145112">Thresholds > </strong><em id="ALM-12016__i193217112515">Name of the desired cluster</em> > <strong id="ALM-12016__b16357675173938">Host</strong> > <strong id="ALM-12016__b13001354173938">CPU</strong> > <strong id="ALM-12016__b49903330173938">Host CPU Usage</strong> and change the alarm smoothing times based on CPU usage, as shown in <a href="#ALM-12016__fig42676420173938">Figure 1</a>.</p>
|
||||
<div class="note" id="ALM-12016__note57869743173938"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12016__p58625754173938">This option defines the alarm check phase. <strong id="ALM-12016__b74612137375">Trigger Count</strong> is the number of consecutive checks in which the CPU usage must exceed the threshold before the alarm is generated.</p>
|
||||
</div></div>
|
||||
<div class="fignone" id="ALM-12016__fig42676420173938"><a name="ALM-12016__fig42676420173938"></a><a name="fig42676420173938"></a><span class="figcap"><b>Figure 1 </b>Setting alarm smoothing times</span><br><span><img id="ALM-12016__image122911304588" src="en-us_image_0269383824.png"></span></div>
|
||||
<p class="litext" id="ALM-12016__p21675643173938">On the <strong id="ALM-12016__b66954485173938">Host CPU Usage</strong> page, click <strong id="ALM-12016__b511919416293">Modify</strong> in the <strong id="ALM-12016__b19162174615296">Operation</strong> column to change the alarm threshold, as shown in <a href="#ALM-12016__fig30961038173938">Figure 2</a>.</p>
|
||||
<div class="fignone" id="ALM-12016__fig30961038173938"><a name="ALM-12016__fig30961038173938"></a><a name="fig30961038173938"></a><span class="figcap"><b>Figure 2 </b>Setting an alarm threshold</span><br><span><img id="ALM-12016__image1615410501365" src="en-us_image_0000001440977805.png"></span></div>
|
||||
</p></li><li id="ALM-12016__li29621482173938"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul12793264173938"><li id="ALM-12016__li22018946173938">If yes, no further action is required.</li><li id="ALM-12016__li38704176173938">If no, go to <a href="#ALM-12016__li65266749173938">3</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12016__p48030518173938"><strong id="ALM-12016__b1326250617406">Check whether the CPU usage reaches the upper limit.</strong></p>
|
||||
<ol start="3" id="ALM-12016__ol44225396174015"><li id="ALM-12016__li65266749173938"><a name="ALM-12016__li65266749173938"></a><a name="li65266749173938"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12016__image168221113135319" src="en-us_image_0269383826.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12016__li52115308173938"><span>On the <strong id="ALM-12016__b51685932101729">Hosts</strong> page, click the node on which the alarm is reported.</span></li><li id="ALM-12016__li60590444173938"><span>View the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.</span></li><li id="ALM-12016__li38620506173938"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul30302958173938"><li id="ALM-12016__li8878949173938">If yes, no further action is required.</li><li id="ALM-12016__li48106238173938">If no, go to <a href="#ALM-12016__li35735451173938">7</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12016__p51491657174016"><strong id="ALM-12016__b42921091174020">Collect fault information.</strong></p>
|
||||
<ol start="7" id="ALM-12016__ol57964469174025"><li id="ALM-12016__li35735451173938"><a name="ALM-12016__li35735451173938"></a><a name="li35735451173938"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12016__b12040241173938">O&M</strong> > <strong id="ALM-12016__b41253307173938">Log > Download</strong>.</span></li><li id="ALM-12016__li49036890173938"><span>Select <strong id="ALM-12016__b53183609173938">OmmServer</strong> from the <strong id="ALM-12016__b477010478910">Service</strong> and click <strong id="ALM-12016__b1577112471895">OK</strong>.</span></li><li id="ALM-12016__li11141594173938"><span>Set <strong id="ALM-12016__b38678826173938">Start Date</strong> for log collection to 10 minutes before the alarm generation time and <strong id="ALM-12016__b12565117173938">End Date</strong> to 10 minutes after the alarm generation time in <strong id="ALM-12016__b20155417195615">Time Range</strong>, and then click <strong id="ALM-12016__b45977197173938">Download</strong>.</span></li><li id="ALM-12016__li495644512588"><span>Contact the <span id="ALM-12016__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12016__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12016__s8c5dd7b3b5ce47dfabf1d96c699ad06c"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12016__en-us_topic_0070543548_p58361484">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
99
docs/mrs/umn/ALM-12017.html
Normal file
@ -0,0 +1,99 @@
|
||||
<a name="ALM-12017"></a><a name="ALM-12017"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12017 Insufficient Disk Capacity</h1>
|
||||
<div id="body66975920"><div class="section" id="ALM-12017__s7a756a3074824ff29f40824ccac74790"><h4 class="sectiontitle">Description</h4><p id="ALM-12017__en-us_topic_0070543559_p25104075">The system checks the host disk usage every 30 seconds and compares the actual disk usage with the threshold. The disk usage has a default threshold. This alarm is generated when the host disk usage exceeds the specified threshold.</p>
|
||||
<p id="ALM-12017__p2082647611242">When the <strong id="ALM-12017__b44134084101639"><strong id="ALM-12017__b041615559258">Trigger Count</strong></strong> is 1, this alarm is cleared when the usage of a host disk partition is less than or equal to the threshold. When the <strong id="ALM-12017__b153891654103012"><strong id="ALM-12017__b53896541301">Trigger Count</strong></strong> is greater than 1, this alarm is cleared when the usage of a host disk partition is less than or equal to 90% of the threshold.</p>
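As a worked illustration of the clearing rule above, the following shell sketch computes the usage level at or below which the alarm is cleared. The function name clear_level is hypothetical and is not an MRS command; it only restates the arithmetic described in the paragraph.

```shell
# Hypothetical helper, not an MRS command: prints the usage level (in percent)
# at or below which the alarm is cleared.
#   $1 = alarm threshold in percent
#   $2 = Trigger Count
clear_level() {
  awk -v t="$1" -v n="$2" 'BEGIN {
    if (n > 1) { printf "%.1f\n", t * 0.9 }   # Trigger Count above 1: 90% of the threshold
    else       { printf "%.1f\n", t }         # Trigger Count of 1: the threshold itself
  }'
}

clear_level 90 1
clear_level 90 3
```

With a 90% threshold, the first call prints 90.0 and the second prints 81.0, so with a Trigger Count of 3 the alarm is cleared only after the partition usage drops to 81% or lower.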
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__s7eaf36ea595e48c7ad5d731ce280ebd9"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12017__en-us_topic_0070543559_table47260226" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12017__en-us_topic_0070543559_row59951591"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12017__en-us_topic_0070543559_p24240706">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12017__en-us_topic_0070543559_p17340145">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12017__en-us_topic_0070543559_p62374493">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12017__en-us_topic_0070543559_row19169133"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p9195913">12017</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p6671514">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12017__en-us_topic_0070543559_p3521800">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__s349818e9d8e9413dbca3219347d41604"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12017__en-us_topic_0070543559_table16830394" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12017__en-us_topic_0070543559_row2814577"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12017__en-us_topic_0070543559_p26654174">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12017__en-us_topic_0070543559_p11504467">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12017__row73657316554"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12017__en-us_topic_0070543559_row59446649"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p50449271">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p59859190">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12017__en-us_topic_0070543559_row1861806"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p16588560">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p1496135">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12017__en-us_topic_0070543559_row13465222"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p16941207">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p30060528">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12017__en-us_topic_0070543559_row2109304"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p36635921">PartitionName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p14719623">Specifies the device partition for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12017__en-us_topic_0070543559_row65367744"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12017__en-us_topic_0070543559_p60295883">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12017__en-us_topic_0070543559_p52128359">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__s44d29551fde24303a025841fbafd5684"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12017__en-us_topic_0070543559_p61647547">Service processes become unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__sd53668685806495fb8d456ba9e2c2c11"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12017__en-us_topic_0070543559_ul27395440"><li id="ALM-12017__en-us_topic_0070543559_li45232374">The alarm threshold is incorrect.</li><li id="ALM-12017__en-us_topic_0070543559_li4438190">Disk configuration of the server cannot meet service requirements.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__s6fd2395d167c4db4814624ea702a37ac"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12017__en-us_topic_0070543559_p23949084"><strong id="ALM-12017__b457009885739">Check whether the alarm threshold is appropriate.</strong></p>
|
||||
<ol id="ALM-12017__ol229057318582"><li id="ALM-12017__li3269990385745"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12017__b126241333219">O&M</strong> > <strong id="ALM-12017__b156241435323">Alarm ></strong> <strong id="ALM-12017__b1562412314328">Thresholds</strong><strong id="ALM-12017__b1962413373216"> > </strong><em id="ALM-12017__i1162415315324">Name of the desired cluster</em> > <strong id="ALM-12017__b962416314323">Host</strong> > <strong id="ALM-12017__b106241931323">Disk</strong> > <strong id="ALM-12017__b4624163203210">Disk Usage</strong> and check whether the threshold (configurable, 90% by default) is appropriate.</span><p><ul class="subitemlist" id="ALM-12017__ul1854640385745"><li id="ALM-12017__li1687169885745">If yes, go to <a href="#ALM-12017__li1280611085745">2</a>.</li><li id="ALM-12017__li2443033285745">If no, go to <a href="#ALM-12017__li2782670585745">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12017__li1280611085745"><a name="ALM-12017__li1280611085745"></a><a name="li1280611085745"></a><span>Choose <strong id="ALM-12017__b2586367385745">O&M</strong> > <strong id="ALM-12017__b1379910713499">Alarm ></strong> <strong id="ALM-12017__b2887114614242">Thresholds</strong><strong id="ALM-12017__b29831221166"> > </strong><em id="ALM-12017__i9983102101619">Name of the desired cluster</em> > <strong id="ALM-12017__b6413578985745">Host</strong> > <strong id="ALM-12017__b4035119385745">Disk</strong> > <strong id="ALM-12017__b2761642585745">Disk Usage</strong> and click <strong id="ALM-12017__b6659180133310">Modify</strong> in the <strong id="ALM-12017__b1374719315332">Operation</strong> column to change the alarm threshold based on site requirements, as shown in <a href="#ALM-12017__fig6063892885745">Figure 1</a>.</span><p><div class="fignone" id="ALM-12017__fig6063892885745"><a name="ALM-12017__fig6063892885745"></a><a name="fig6063892885745"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12017__image1615410501365" src="en-us_image_0000001440977873.png"></span></div>
|
||||
</p></li><li id="ALM-12017__li4783109885745"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul59050785745"><li id="ALM-12017__li4814612685745">If yes, no further action is required.</li><li id="ALM-12017__li752215285745">If no, go to <a href="#ALM-12017__li2782670585745">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12017__p531456685745"><strong id="ALM-12017__b98862278588">Check whether the disk usage reaches the upper limit.</strong></p>
|
||||
<ol start="4" id="ALM-12017__ol1005390085829"><li id="ALM-12017__li2782670585745"><a name="ALM-12017__li2782670585745"></a><a name="li2782670585745"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12017__image168221113135319" src="en-us_image_0269383828.png"></span> in the row where the alarm is located to view the alarm host name and disk partition information in the alarm details.</span></li><li id="ALM-12017__li3937060885745"><span>Log in to the node where the alarm is generated as user <strong id="ALM-12017__b4911375485745">root</strong>. <span id="ALM-12017__text43649449460"></span></span></li><li id="ALM-12017__li1529764085745"><span>Run the <strong id="ALM-12017__b5391142133919">df -lmPT | awk '$2 != "iso9660"' | grep '^/dev/' | awk '{"readlink -m "$1 | getline real }{$1=real; print $0}' | sort -u -k 1,1</strong> command to check the system disk partition usage. Check whether the disk is mounted to the following directories based on the disk partition name obtained in <a href="#ALM-12017__li2782670585745">4</a>: <strong id="ALM-12017__b4568855685745">/</strong>, <strong id="ALM-12017__b2096079285745">/opt</strong>, <strong id="ALM-12017__b5442940785745">/tmp</strong>, <strong id="ALM-12017__b2010261785745">/var</strong>, <strong id="ALM-12017__b4670583385745">/var/log</strong>, and <strong id="ALM-12017__b2507614885745">/srv/BigData</strong> (can be customized).</span><p><ul class="subitemlist" id="ALM-12017__ul3152589985745"><li id="ALM-12017__li1790212085745">If yes, the disk is a system disk. Then go to <a href="#ALM-12017__li6170195385745">10</a>.</li><li id="ALM-12017__li4078557985745">If no, the disk is not a system disk. Then go to <a href="#ALM-12017__li1190839985745">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12017__li1190839985745"><a name="ALM-12017__li1190839985745"></a><a name="li1190839985745"></a><span>Run the <strong id="ALM-12017__b10661194925219">df -lmPT | awk '$2 != "iso9660"' | grep '^/dev/' | awk '{"readlink -m "$1 | getline real }{$1=real; print $0}' | sort -u -k 1,1</strong> command to check the system disk partition usage. Determine the role of the disk based on the disk partition name obtained in <a href="#ALM-12017__li2782670585745">4</a>.</span></li><li id="ALM-12017__li11884059152614"><span>Check the disk service.</span><p><div class="p" id="ALM-12017__p0769162644910">In <span id="ALM-12017__text13624174411515">MRS</span>, check whether the disk service is HDFS, Yarn, Kafka, or Supervisor.<ul id="ALM-12017__ul148852372297"><li id="ALM-12017__li10740174317299">If yes, adjust the capacity. Then go to <a href="#ALM-12017__li1354951085745">9</a>.</li><li id="ALM-12017__li1159152152914">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
|
||||
</div>
|
||||
</p></li><li id="ALM-12017__li1354951085745"><a name="ALM-12017__li1354951085745"></a><a name="li1354951085745"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul150550185745"><li id="ALM-12017__li4676654185745">If yes, no further action is required.</li><li id="ALM-12017__li2999343985745">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
|
||||
</p></li><li id="ALM-12017__li6170195385745"><a name="ALM-12017__li6170195385745"></a><a name="li6170195385745"></a><span>Run the <strong id="ALM-12017__b5483673385745">find / -xdev -size +500M -exec ls -l {} \;</strong> command to check whether a file larger than 500 MB exists on the node and disk.</span><p><ul class="subitemlist" id="ALM-12017__ul5159501585745"><li id="ALM-12017__li1259039885745">If yes, go to <a href="#ALM-12017__li3133628885745">11</a>.</li><li id="ALM-12017__li1318931985745">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
|
||||
</p></li><li id="ALM-12017__li3133628885745"><a name="ALM-12017__li3133628885745"></a><a name="li3133628885745"></a><span>Handle the large file and check whether the alarm is cleared 2 minutes later.</span><p><ul class="subitemlist" id="ALM-12017__ul2585143185745"><li id="ALM-12017__li1844667285745">If yes, no further action is required.</li><li id="ALM-12017__li1778546285745">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
|
||||
</p></li><li id="ALM-12017__li1359113885745"><a name="ALM-12017__li1359113885745"></a><a name="li1359113885745"></a><span>Contact the system administrator to expand the disk capacity.</span></li><li id="ALM-12017__li2833807185745"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul5088862685745"><li id="ALM-12017__li5521138285745">If yes, no further action is required.</li><li id="ALM-12017__li4293699485745">If no, go to <a href="#ALM-12017__li5603307085745">14</a>.</li></ul>
|
||||
</p></li></ol>
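The partition check in the steps above can be sketched as a small filter. This is an illustration only: over_threshold is a hypothetical helper name, it reads plain df -P output (not the longer df -lmPT pipeline from the procedure), and it assumes mount points contain no spaces.

```shell
# Hypothetical helper, not an MRS command: reads `df -P` style output on stdin
# and prints the mount point and usage of every /dev/ file system whose usage
# exceeds the percentage given as $1.
over_threshold() {
  awk -v t="$1" '$1 ~ /^\/dev\// {
    use = $5                 # Capacity column, for example "95%"
    sub(/%/, "", use)        # strip the percent sign
    if (use + 0 > t) print $6, use "%"
  }'
}
```

For example, running df -lP | over_threshold 90 on the alarmed node lists the local partitions that are above the default 90% threshold.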
|
||||
<p class="tableheading" id="ALM-12017__p5534445785745"><strong id="ALM-12017__b657764185839">Collect fault information.</strong></p>
|
||||
<ol start="14" id="ALM-12017__ol4750985985842"><li id="ALM-12017__li5603307085745"><a name="ALM-12017__li5603307085745"></a><a name="li5603307085745"></a><span>On FusionInsight Manager, choose <strong id="ALM-12017__b13819155015320">O&M</strong> > <strong id="ALM-12017__b1368243785745">Log > Download</strong>.</span></li><li id="ALM-12017__li1061898185745"><span>Select <strong id="ALM-12017__b1352831932712">OMS</strong> from the <strong id="ALM-12017__b13893145519916">Service</strong> and click <strong id="ALM-12017__b20893115513911">OK</strong>.</span></li><li id="ALM-12017__li1145664103113"><span>Click <span><img id="ALM-12017__image1945644173117" src="en-us_image_0269383829.png"></span> in the upper right corner, and set <strong id="ALM-12017__b6456941173117">Start Date</strong> and <strong id="ALM-12017__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12017__b13456164113319">Download</strong>.</span></li><li id="ALM-12017__li495644512588"><span>Contact the <span id="ALM-12017__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
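The log collection window described above (10 minutes before and after the alarm generation time) can be computed as follows. This is a sketch under stated assumptions: log_window is a hypothetical helper, it relies on GNU date, and it treats the alarm time as UTC for simplicity.

```shell
# Hypothetical helper, not an MRS command: prints Start Date and End Date for
# log collection, 10 minutes before and after the alarm generation time.
# Assumes GNU date; $1 is the alarm time, interpreted as UTC.
log_window() {
  alarm_epoch=$(date -u -d "$1" +%s) || return 1
  date -u -d "@$((alarm_epoch - 600))" '+%Y-%m-%d %H:%M:%S'   # Start Date
  date -u -d "@$((alarm_epoch + 600))" '+%Y-%m-%d %H:%M:%S'   # End Date
}

log_window '2024-01-01 03:00:00'
```

For an alarm generated at 03:00:00, the sketch prints 02:50:00 as the start and 03:10:00 as the end of the window on the same date.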
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12017__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12017__sdc198514f48e40f5bccbcac7d37c39b0"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12017__en-us_topic_0070543559_p22957827">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
90
docs/mrs/umn/ALM-12018.html
Normal file
@ -0,0 +1,90 @@
|
||||
<a name="ALM-12018"></a><a name="ALM-12018"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12018 Memory Usage Exceeds the Threshold</h1>
|
||||
<div id="body45808846"><div class="section" id="ALM-12018__s50ccc31ef9e641198251dfc6a33c09c5"><h4 class="sectiontitle">Description</h4><p id="ALM-12018__en-us_topic_0070543570_p24703169">The system checks the memory usage every 30 seconds and compares the actual memory usage with the threshold. The memory usage has a default threshold. This alarm is generated when the memory usage exceeds the threshold.</p>
|
||||
<p id="ALM-12018__p21825693105317">When the <strong id="ALM-12018__b44134084101639">Trigger Count</strong> is 1, this alarm is cleared when the host memory usage is less than or equal to the threshold. When the <strong id="ALM-12018__b15541182813412">Trigger Count</strong> is greater than 1, this alarm is cleared when the host memory usage is less than or equal to 90% of the threshold.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__s1994b6f8627642d88be08bcd2548de1d"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12018__en-us_topic_0070543570_table23435083" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12018__en-us_topic_0070543570_row32670228"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12018__en-us_topic_0070543570_p29042851">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12018__en-us_topic_0070543570_p3660763">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12018__en-us_topic_0070543570_p28086371">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12018__en-us_topic_0070543570_row60403555"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12018__en-us_topic_0070543570_p60849796">12018</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12018__en-us_topic_0070543570_p29886445">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12018__en-us_topic_0070543570_p4883015">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__sa1829ad9cd3a42e4b31bc79a098f3b74"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12018__en-us_topic_0070543570_table59979962" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12018__en-us_topic_0070543570_row19383628"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12018__en-us_topic_0070543570_p26570072">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12018__en-us_topic_0070543570_p4692191">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12018__row20753958155419"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12018__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12018__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12018__en-us_topic_0070543570_row44523211"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12018__en-us_topic_0070543570_p49610372">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12018__en-us_topic_0070543570_p59017201">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12018__en-us_topic_0070543570_row61392767"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12018__en-us_topic_0070543570_p6758230">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12018__en-us_topic_0070543570_p10545749">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12018__en-us_topic_0070543570_row27802884"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12018__en-us_topic_0070543570_p37441107">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12018__en-us_topic_0070543570_p12830815">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12018__en-us_topic_0070543570_row48368471"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12018__en-us_topic_0070543570_p25532040">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12018__en-us_topic_0070543570_p54829354">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__sa10379c3153f4b9f8b17bd5b143a38bb"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12018__en-us_topic_0070543570_p11992676">Service processes respond slowly or become unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__s9916d08120424cfda84ba922161f084e"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12018__en-us_topic_0070543570_ul31882719"><li id="ALM-12018__en-us_topic_0070543570_li18509020">Memory configuration cannot meet service requirements. The memory usage reaches the upper limit.</li><li id="ALM-12018__en-us_topic_0070543570_li32363459">The SUSE 12.X OS has an earlier <strong id="ALM-12018__en-us_topic_0070543570_b22835675">free</strong> command. The calculated memory usage cannot reflect the real-world memory usage.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__se4f7b617334646a38cb923c5f374e0ea"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12018__en-us_topic_0070543570_p37750382"><strong id="ALM-12018__b155002419247">Perform the following operations if SUSE 12.X is used.</strong></p>
|
||||
<ol id="ALM-12018__ol15347461930"><li id="ALM-12018__li470884489252"><span>Log in to any node in the cluster as user <strong id="ALM-12018__b679353816149">root</strong> and run the <strong id="ALM-12018__b151226639252">cat /etc/*-release</strong> command to check whether the OS is SUSE 12.X. <span id="ALM-12018__text43649449460"></span></span><p><ul class="subitemlist" id="ALM-12018__ul276016719252"><li id="ALM-12018__li169761769252">If yes, go to <a href="#ALM-12018__li348492949252">2</a>.</li><li id="ALM-12018__li328930149252">If no, go to <a href="#ALM-12018__li5861159252">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12018__li348492949252"><a name="ALM-12018__li348492949252"></a><a name="li348492949252"></a><span>Run the <strong id="ALM-12018__b211428509252">cat /proc/meminfo | grep Mem</strong> command to check the real-world memory usage of the OS.</span><p><pre class="screen" id="ALM-12018__screen560679269252">MemTotal: 263576192 kB
MemFree: 198283116 kB
MemAvailable: 227641452 kB</pre>
|
||||
</p></li><li id="ALM-12018__li448043669252"><span>Calculate the real-world memory usage: Memory usage = 1 - (Memory available/Memory total)</span><p><ul class="subitemlist" id="ALM-12018__ul568914459252"><li id="ALM-12018__li42205629252">If the memory usage is lower than 90%, manually disable transferring from monitoring indicators to alarms.</li><li id="ALM-12018__li63212719252">If the memory usage is higher than 90%, go to <a href="#ALM-12018__li5861159252">4</a>.</li></ul>
|
||||
</p></li></ol>
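The calculation in the steps above can be sketched in shell. The helper name mem_usage_percent is hypothetical; it applies the formula Memory usage = 1 - (Memory available/Memory total) to /proc/meminfo style input and prints the result as a percentage.

```shell
# Hypothetical helper, not an MRS command: reads /proc/meminfo style lines on
# stdin and prints the memory usage in percent, computed as
# (1 - MemAvailable / MemTotal) * 100.
mem_usage_percent() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%.1f\n", (1 - avail / total) * 100 }'
}

# Sample values taken from the command output shown in the procedure.
printf '%s\n' 'MemTotal: 263576192 kB' 'MemFree: 198283116 kB' 'MemAvailable: 227641452 kB' | mem_usage_percent
```

For the sample values the sketch prints 13.6, which is well below 90%. On a live node the same function can be fed directly with mem_usage_percent < /proc/meminfo.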
|
||||
<p class="tableheading" id="ALM-12018__p422609659252"><strong id="ALM-12018__b40915938935">Expand the system.</strong></p>
|
||||
<ol start="4" id="ALM-12018__ol28552339317"><li id="ALM-12018__li5861159252"><a name="ALM-12018__li5861159252"></a><a name="li5861159252"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12018__image168221113135319" src="en-us_image_0269383830.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12018__li474753219252"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12018__b52750359252">root</strong>. <span id="ALM-12018__text5966104516217"></span></span></li><li id="ALM-12018__li242002745617"><span>If the memory usage exceeds the threshold, perform memory capacity expansion.</span></li><li id="ALM-12018__li202957929252"><span>Run the command <strong id="ALM-12018__b246247099252">free -m | grep Mem\: | awk '{printf("%s,", $3 * 100 / $2)}'</strong> to check the system memory usage.</span></li><li id="ALM-12018__li305215859252"><span>Wait for 5 minutes and then check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12018__ul111473689252"><li id="ALM-12018__li316825749252">If yes, no further action is required.</li><li id="ALM-12018__li161516779252">If no, go to <a href="#ALM-12018__li372014939252">9</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12018__p332174499252"><strong id="ALM-12018__b165019069321">Collect fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12018__ol300682989324"><li id="ALM-12018__li372014939252"><a name="ALM-12018__li372014939252"></a><a name="li372014939252"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12018__b57841710145614">O&M</strong> > <strong id="ALM-12018__b563292829252">Log > Download</strong>.</span></li><li id="ALM-12018__li40625489252"><span>Select <strong id="ALM-12018__b663779889252">OmmServer</strong> from the <strong id="ALM-12018__b1099120531019">Service</strong> and click <strong id="ALM-12018__b999117511012">OK</strong>.</span></li><li id="ALM-12018__li1145664103113"><span>Click <span><img id="ALM-12018__image1945644173117" src="en-us_image_0269383831.png"></span> in the upper right corner, and set <strong id="ALM-12018__b6456941173117">Start Date</strong> and <strong id="ALM-12018__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12018__b13456164113319">Download</strong>.</span></li><li id="ALM-12018__li495644512588"><span>Contact the <span id="ALM-12018__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12018__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12018__s2b9d04a0d3d24da4a369d9ba061b65a6"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12018__en-us_topic_0070543570_p26386209">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
87
docs/mrs/umn/ALM-12027.html
Normal file
@ -0,0 +1,87 @@
|
||||
<a name="ALM-12027"></a><a name="ALM-12027"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12027 Host PID Usage Exceeds the Threshold</h1>
|
||||
<div id="body15885728"><div class="section" id="ALM-12027__s4ff73d9b6e5e4103a3820abbc876532e"><h4 class="sectiontitle">Description</h4><p id="ALM-12027__en-us_topic_0070543581_p5836323">The system checks the PID usage every 30 seconds and compares the actual PID usage with the default PID usage threshold. This alarm is generated when the system detects that the PID usage exceeds the threshold.</p>
|
||||
<p id="ALM-12027__p4934657911347">When the <strong id="ALM-12027__b44134084101639">Trigger Count</strong> is 1, this alarm is cleared when the PID usage is less than or equal to the threshold. When the <strong id="ALM-12027__b1741410113352">Trigger Count</strong> is greater than 1, this alarm is cleared when the PID usage is less than or equal to 90% of the threshold.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s100ea51423104e978209f1955534fa27"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12027__en-us_topic_0070543581_table26821252" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12027__en-us_topic_0070543581_row57828837"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12027__en-us_topic_0070543581_p53624206">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12027__en-us_topic_0070543581_p48593428">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12027__en-us_topic_0070543581_p43753566">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12027__en-us_topic_0070543581_row54377940"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12027__en-us_topic_0070543581_p42537056">12027</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12027__en-us_topic_0070543581_p22949479">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12027__en-us_topic_0070543581_p46968531">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s0e56b478a67a4be0bc1ff52da93ed720"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12027__en-us_topic_0070543581_table46354700" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12027__en-us_topic_0070543581_row29477662"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12027__en-us_topic_0070543581_p38880413">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12027__en-us_topic_0070543581_p62305773">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12027__row4837155215414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12027__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12027__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12027__en-us_topic_0070543581_row13602870"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12027__en-us_topic_0070543581_p28090655">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12027__en-us_topic_0070543581_p60750544">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12027__en-us_topic_0070543581_row9883987"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12027__en-us_topic_0070543581_p62405508">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12027__en-us_topic_0070543581_p21681426">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12027__en-us_topic_0070543581_row60915111"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12027__en-us_topic_0070543581_p35176971">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12027__en-us_topic_0070543581_p30762366">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12027__en-us_topic_0070543581_row8425846"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12027__en-us_topic_0070543581_p11404934">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12027__en-us_topic_0070543581_p51384442">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s0fe82127f2e84450a24be46e715835ca"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12027__en-us_topic_0070543581_p1390267">No PID is available for new processes and service processes are unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s3ddd6cfc758a404a82adc3dfe898bd66"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12027__p681753145417">Too many processes are running on the node. You need to increase the value of <strong id="ALM-12027__en-us_topic_0070543581_b61845569">pid_max</strong>.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s9445b6fc399a470295ea751769713fde"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12027__en-us_topic_0070543581_p55372696"><strong id="ALM-12027__b360029529747">Increase the value of pid_max.</strong></p>
|
||||
<ol id="ALM-12027__ol240915109757"><li id="ALM-12027__li639798269750"><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12027__image168221113135319" src="en-us_image_0269383832.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12027__li149834549750"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12027__b389475309750">root</strong>. <span id="ALM-12027__text43649449460"></span></span></li><li id="ALM-12027__li513020679750"><span>Run the <strong id="ALM-12027__b6333589750">cat /proc/sys/kernel/pid_max</strong> command to check the value of <strong id="ALM-12027__b57002299750">pid_max</strong>.</span></li><li id="ALM-12027__li205272659750"><span>If the PID usage exceeds the threshold, run the command <strong id="ALM-12027__b590654259750">echo </strong><em id="ALM-12027__i618267859750">new value </em><strong id="ALM-12027__b195701549750">> /proc/sys/kernel/pid_max</strong> to increase the value of <strong id="ALM-12027__b419136639750">pid_max</strong>.</span><p><p class="litext" id="ALM-12027__p395635099750">Example: <strong id="ALM-12027__b416786479750">echo 65536 > /proc/sys/kernel/pid_max</strong></p>
|
||||
<div class="note" id="ALM-12027__note163571615102916"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12027__p10664145203015">The maximum value of <span class="parmname" id="ALM-12027__parmname1566455103015"><b>pid_max</b></span> is as follows:</p>
|
||||
<ul id="ALM-12027__ul13990143413014"><li id="ALM-12027__li7990034173015">On 32-bit systems: 32768</li><li id="ALM-12027__li799018345307">On 64-bit systems: 4194304 (2^22)</li></ul>
|
||||
</div></div>
|
||||
</p></li><li id="ALM-12027__li148339459750"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12027__ul590069549750"><li id="ALM-12027__li505276609750">If yes, no further action is required.</li><li id="ALM-12027__li662086519750">If no, go to <a href="#ALM-12027__li377225729750">6</a>.</li></ul>
|
||||
</p></li></ol>
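<p>The check in the preceding steps can be sketched as follows. This is a minimal sketch under stated assumptions: it estimates PID usage as the number of running threads divided by <strong>pid_max</strong>, which may differ from how the monitoring system counts, and the function names are hypothetical.</p>

```shell
# Integer percentage of PIDs in use, given a used count and pid_max.
pid_usage_percent() {
  used=$1; max=$2
  echo $(( used * 100 / max ))
}

# Assumption: every thread listed by ps occupies one PID.
show_pid_usage() {
  pid_max=$(cat /proc/sys/kernel/pid_max)
  used=$(ps -eLf --no-headers | wc -l)
  echo "PID usage: $(pid_usage_percent "$used" "$pid_max")% ($used of $pid_max)"
}

# A value written to /proc/sys/kernel/pid_max is a runtime setting and does
# not survive a reboot. To keep it, the usual approach is:
#   sysctl -w kernel.pid_max=65536
#   echo "kernel.pid_max = 65536" >> /etc/sysctl.conf
```

<p>Run <strong>show_pid_usage</strong> as user <strong>root</strong> on the host for which the alarm is generated to print the estimate.</p>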
|
||||
<p class="tableheading" id="ALM-12027__p61837339750"><strong id="ALM-12027__b361001479817">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12027__ol116595289821"><li id="ALM-12027__li377225729750"><a name="ALM-12027__li377225729750"></a><a name="li377225729750"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12027__b311203779750">O&M</strong> > <strong id="ALM-12027__b116479379750">Log > Download</strong>.</span></li><li id="ALM-12027__li3107269750"><span>Select all services from the <strong id="ALM-12027__b356295299750">Service</strong> and click <strong id="ALM-12027__b3991118545">OK</strong>.</span></li><li id="ALM-12027__li1145664103113"><span>Click <span><img id="ALM-12027__image1945644173117" src="en-us_image_0269383834.png"></span> in the upper right corner, and set <strong id="ALM-12027__b6456941173117">Start Date</strong> and <strong id="ALM-12027__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12027__b13456164113319">Download</strong>.</span></li><li id="ALM-12027__li495644512588"><span>Contact the <span id="ALM-12027__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
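<p>The log collection window in the steps above (30 minutes before and after the alarm generation time) can be worked out with a small helper. The function name <strong>log_window</strong> is hypothetical, and the sketch works in epoch seconds to avoid date-parsing differences.</p>

```shell
# Hypothetical helper: print "start end" of the collection window in epoch seconds.
# Arguments: alarm generation time (epoch seconds), margin in minutes.
log_window() {
  alarm_epoch=$1; minutes=$2
  echo "$(( alarm_epoch - minutes * 60 )) $(( alarm_epoch + minutes * 60 ))"
}
```

<p>With GNU <strong>date</strong>, an alarm time can be converted using <strong>date -d '2024-05-01 03:00:00' +%s</strong> (the timestamp is only an example), and each result can be converted back using <strong>date -d @</strong><em>epoch</em>.</p>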
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12027__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12027__s99f69a6c05834c85bf47a731f55376c2"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12027__en-us_topic_0070543581_p32793969">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
85
docs/mrs/umn/ALM-12028.html
Normal file
@ -0,0 +1,85 @@
|
||||
<a name="ALM-12028"></a><a name="ALM-12028"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12028 Number of Processes in the D State on a Host Exceeds the Threshold</h1>
|
||||
<div id="body14709652"><div class="section" id="ALM-12028__section23718688"><h4 class="sectiontitle">Description</h4><p id="ALM-12028__p50631172">The system checks the number of processes in the D state of user <strong id="ALM-12028__b16253141134213">omm</strong> on the host every 30 seconds and compares the actual number with the threshold. The number of processes in the D state on the host has a default threshold range. This alarm is generated when the number of processes exceeds the threshold.</p>
|
||||
<p id="ALM-12028__p53027366">This alarm is cleared when the <strong id="ALM-12028__b1896274320598">Trigger Count</strong> is <strong id="ALM-12028__b15669123210464">1</strong> and the total number of processes in the D state of user <strong id="ALM-12028__b19867204318485">omm</strong> on the host does not exceed the threshold. This alarm is cleared when the <strong id="ALM-12028__b134171188010">Trigger Count</strong> is greater than <strong id="ALM-12028__b466017588499">1</strong> and the total number of processes in the D state of user <strong id="ALM-12028__b1986717812518">omm</strong> on the host is less than or equal to 90% of the threshold.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section12141602"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12028__table249371" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12028__row53434174"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12028__p33200870">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12028__p4915934">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12028__p62646350">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12028__row41189599"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12028__p48023224">12028</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12028__p64675970">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12028__p4262245">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section42165562"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12028__table9697544" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12028__row57456427"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12028__p23458978">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12028__p21129086">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12028__row538612136417"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12028__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12028__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12028__row33734439"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12028__p48135044">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12028__p6624510">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12028__row59620593"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12028__p64538720">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12028__p60253856">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12028__row5413798"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12028__p35864482">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12028__p19341890">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12028__row4565373514855"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12028__p696502014855">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12028__p2729572514855">Specifies the threshold for triggering the alarm.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section43945744"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12028__p23189230">Excessive system resources are used and service processes respond slowly.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section59967381"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12028__p66388367">The host responds slowly to I/O (disk I/O and network I/O) requests and some processes are in the D state and Z state.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section2835522"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12028__p8748685"><strong id="ALM-12028__b820613226166">Check the processes in the D state</strong><strong id="ALM-12028__b10206112220164"></strong><strong id="ALM-12028__b15207322181616">.</strong></p>
|
||||
<ol id="ALM-12028__ol5802802991057"><li id="ALM-12028__li6390942091049"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12028__image168221113135319" src="en-us_image_0263895749.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12028__li1641579391049"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12028__b1426064751813">root</strong>. <span id="ALM-12028__text995114020554"></span>Then run the <strong id="ALM-12028__b3831387091049">su - omm</strong> command to switch to user <strong id="ALM-12028__b1288412448360">omm</strong>.</span></li><li id="ALM-12028__li2173547791049"><span>Run the following command as user <strong id="ALM-12028__b547373343910">omm</strong> to view the PID of the process that is in the D state:</span><p><p class="litext" id="ALM-12028__p5461083691049"><strong id="ALM-12028__b1352441191049">ps -elf | grep -v "\[thread_checkio\]" | awk 'NR!=1 {print $2, $3, $4}' | grep omm | awk -F' ' '{print $1, $3}' | grep -E "Z|D" | awk '{print $2}'</strong></p>
|
||||
</p></li><li id="ALM-12028__li2799290091049"><span>Check whether the command output is empty.</span><p><ul class="subitemlist" id="ALM-12028__ul1056686291049"><li id="ALM-12028__li747103591049">If yes, the service process is running properly. Then go to <a href="#ALM-12028__li2701143291049">6</a>.</li><li id="ALM-12028__li117409591049">If no, go to <a href="#ALM-12028__li573000391049">5</a>.</li></ul>
|
||||
</p></li><li id="ALM-12028__li573000391049"><a name="ALM-12028__li573000391049"></a><a name="li573000391049"></a><span>Switch to user <strong id="ALM-12028__b1281511314404">root</strong> and run the <strong id="ALM-12028__b8712438134020">reboot</strong> command to restart the host for which the alarm is generated. (Restarting a host is risky. Ensure that the service process is normal after the restart.)</span></li><li id="ALM-12028__li2701143291049"><a name="ALM-12028__li2701143291049"></a><a name="li2701143291049"></a><span>Check whether the alarm is cleared 5 minutes later.</span><p><ul class="subitemlist" id="ALM-12028__ul1358954691049"><li id="ALM-12028__li5157003291049">If yes, no further action is required.</li><li id="ALM-12028__li1642303091049">If no, go to <a href="#ALM-12028__li4177630091049">7</a>.</li></ul>
|
||||
</p></li></ol>
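<p>As an alternative view of the check above, the following sketch filters a process listing by state. It is not the documented command: the function name <strong>filter_dz</strong> is hypothetical, and unlike the command in the procedure it does not exclude <strong>thread_checkio</strong>.</p>

```shell
# Print the PIDs of processes whose state starts with D or Z.
# Input: lines of "PID STAT COMMAND" on standard input.
filter_dz() {
  awk '$2 ~ /^[DZ]/ {print $1}'
}

# Usage on the host for which the alarm is generated:
#   ps -u omm -o pid=,stat=,comm= | filter_dz
```

<p>An empty result corresponds to the "command output is empty" branch of the procedure.</p>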
|
||||
<p class="tableheading" id="ALM-12028__p5519705391049"><strong id="ALM-12028__b89637239112">Collect the fault information.</strong></p>
|
||||
<ol start="7" id="ALM-12028__ol128225129115"><li id="ALM-12028__li4177630091049"><a name="ALM-12028__li4177630091049"></a><a name="li4177630091049"></a><span>On FusionInsight Manager, choose <strong id="ALM-12028__b20291725145313">O&M</strong> > <strong id="ALM-12028__b64322519539">Log</strong> > <strong id="ALM-12028__b5431625185320">Download</strong>.</span></li><li id="ALM-12028__li4044238791049"><span>Select <strong id="ALM-12028__b884279457112956">OMS</strong> for <strong id="ALM-12028__b811590828112956">Service</strong> and click <strong id="ALM-12028__b1060353008112956">OK</strong>.</span></li><li id="ALM-12028__li2843716491049"><span>Click <span><img id="ALM-12028__image104601319175315" src="en-us_image_0263895796.png"></span> in the upper right corner, and set <strong id="ALM-12028__b522882672112956">Start Date</strong> and <strong id="ALM-12028__b2029904650112956">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12028__b449569331112956">Download</strong>.</span></li><li id="ALM-12028__li2170896591049"><span>Contact <span id="ALM-12028__text02161454416">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12028__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12028__section25519705"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12028__p14021434">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
149
docs/mrs/umn/ALM-12033.html
Normal file
File diff suppressed because it is too large
88
docs/mrs/umn/ALM-12034.html
Normal file
@ -0,0 +1,88 @@
|
||||
<a name="ALM-12034"></a><a name="ALM-12034"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12034 Periodical Backup Failure</h1>
|
||||
<div id="body51434987"><div class="section" id="ALM-12034__s00e65d9123db49e4a8231d70a887346f"><h4 class="sectiontitle">Description</h4><p id="ALM-12034__en-us_topic_0070543608_p47284016">The system executes the periodic backup task every 60 minutes. This alarm is generated when a periodical backup task fails to be executed. This alarm is cleared when the next backup task is executed successfully.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__sc655bbebee5e46b682b6059f1ba1a704"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12034__en-us_topic_0070543608_table4800092" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12034__en-us_topic_0070543608_row23796051"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12034__en-us_topic_0070543608_p48431953">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12034__en-us_topic_0070543608_p30674087">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12034__en-us_topic_0070543608_p1573118">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12034__en-us_topic_0070543608_row60313725"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12034__en-us_topic_0070543608_p53573581">12034</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12034__en-us_topic_0070543608_p44492822">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12034__en-us_topic_0070543608_p47148799">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__s920be18fd12b455cbc418d97cb6104c4"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12034__en-us_topic_0070543608_table60956336" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12034__en-us_topic_0070543608_row26788850"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12034__en-us_topic_0070543608_p22413252">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12034__en-us_topic_0070543608_p3534147">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12034__row115423179549"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12034__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12034__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12034__en-us_topic_0070543608_row17830486"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12034__en-us_topic_0070543608_p34983270">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12034__en-us_topic_0070543608_p15072602">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12034__en-us_topic_0070543608_row1435693"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12034__en-us_topic_0070543608_p49182323">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12034__en-us_topic_0070543608_p24345233">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12034__en-us_topic_0070543608_row17780513"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12034__en-us_topic_0070543608_p30935480">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12034__en-us_topic_0070543608_p22745914">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12034__en-us_topic_0070543608_row3386640"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12034__en-us_topic_0070543608_p5882400">TaskName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12034__en-us_topic_0070543608_p6712401">Specifies the task.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__s440963233f4d4750885cb986a9cad031"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12034__en-us_topic_0070543608_p6833610">If no backup package is available for a long time, the system cannot be restored when an exception occurs.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__s263b5f2875944e7b9df856ae80d2a053"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12034__en-us_topic_0070543608_p16651572">The alarm cause depends on the task details. Handle the alarm according to the logs and alarm details.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__s1ca44cb0f88942d591bb071c656d4ccc"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12034__en-us_topic_0070543608_p6600119"><strong id="ALM-12034__b11327931485">Check whether the disk space is sufficient.</strong></p>
|
||||
<ol id="ALM-12034__ol947516194522"><li id="ALM-12034__li739591494522"><span>On FusionInsight Manager, choose <strong id="ALM-12034__b3064793094522">O&M > Alarm<strong id="ALM-12034__b27872374104950"> > Alarms</strong></strong>.</span></li><li id="ALM-12034__li488781094522"><span>In the alarm list, click <span><img id="ALM-12034__image168221113135319" src="en-us_image_0269383843.png"></span> in the row where the alarm is located and obtain <strong id="ALM-12034__b6656323294522">TaskName</strong> from <strong id="ALM-12034__b9723191310467">Location</strong>.</span></li><li id="ALM-12034__li644408694522"><span>Choose <strong id="ALM-12034__b4399029494522">O&M</strong> > <strong id="ALM-12034__b6036833394522">Backup and Restoration > Backup Management</strong>.</span></li><li id="ALM-12034__li5220897594522"><span>Search for the backup task based on <strong id="ALM-12034__b0347912913">TaskName</strong> and click <strong id="ALM-12034__b20551318102819">More</strong><strong id="ALM-12034__b185711882811"> </strong>in the <strong id="ALM-12034__b43471515919">Operation</strong> column. In the displayed dialog box, click <strong id="ALM-12034__b63471511997">View History</strong> and view the task details.</span></li><li id="ALM-12034__li20896327494"><span>In the displayed dialog box, click <span><img id="ALM-12034__image5943924184912" src="en-us_image_0000001127057881.png"></span> and check whether the following message is displayed: Failed to backup xx due to insufficient disk space, move the data in the xx directory to other directories.</span><p><ul class="subitemlist" id="ALM-12034__ul450817218102"><li id="ALM-12034__li75085211107">If yes, go to <a href="#ALM-12034__li8265923133114">6</a>.</li><li id="ALM-12034__li2510182101010">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
|
||||
</p></li><li id="ALM-12034__li8265923133114"><a name="ALM-12034__li8265923133114"></a><a name="li8265923133114"></a><span>Choose <strong id="ALM-12034__b10266192333118">Backup Path</strong> > <strong id="ALM-12034__b1226611237319">View </strong>and obtain the <strong id="ALM-12034__b1626622314312">Backup Path</strong>.</span></li><li id="ALM-12034__li11760165519910"><span>Log in to the node as user <strong id="ALM-12034__b142279511119">root</strong> and run the following command to check the node mounting details:</span><p><p id="ALM-12034__p177811011105319"><span id="ALM-12034__text16214101716530"></span></p>
|
||||
<p id="ALM-12034__p1730510253112"><strong id="ALM-12034__b13233131210719">df -h</strong></p>
|
||||
</p></li><li id="ALM-12034__li75106309133"><span>Check whether the available space of the node to which the backup path is mounted is less than 20 GB.</span><p><ul class="subitemlist" id="ALM-12034__ul93921250101319"><li id="ALM-12034__li16393950171317">If yes, go to <a href="#ALM-12034__li181154133220">9</a>.</li><li id="ALM-12034__li1439665061320">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
|
||||
</p></li><li id="ALM-12034__li181154133220"><a name="ALM-12034__li181154133220"></a><a name="li181154133220"></a><span>Check whether there are many backup packages in the backup directory.</span><p><ul class="subitemlist" id="ALM-12034__ul83621143131416"><li id="ALM-12034__li13623437142">If yes, go to <a href="#ALM-12034__li3795101373317">10</a>.</li><li id="ALM-12034__li10364154313146">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
|
||||
</p></li><li id="ALM-12034__li3795101373317"><a name="ALM-12034__li3795101373317"></a><a name="li3795101373317"></a><span>Move backup packages out of the backup directory or delete them so that the available space of the node to which the backup directory is mounted is greater than 20 GB.</span></li><li id="ALM-12034__li2833949294522"><span>After the problem is resolved, run the backup task again and check whether it succeeds.</span><p><ul class="subitemlist" id="ALM-12034__ul6280115694522"><li id="ALM-12034__li4141931594522">If yes, go to <a href="#ALM-12034__li5916521794522">12</a>.</li><li id="ALM-12034__li6663022994522">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
|
||||
</p></li><li id="ALM-12034__li5916521794522"><a name="ALM-12034__li5916521794522"></a><a name="li5916521794522"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12034__ul4385661594522"><li id="ALM-12034__li5372883994522">If yes, no further action is required.</li><li id="ALM-12034__li5706874094522">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
|
||||
</p></li></ol>
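<p>The 20 GB check in the steps above can be sketched as follows. This is a minimal sketch, assuming that 20 GB means 20 x 1024 x 1024 KB; the function names are hypothetical, and the backup path must be the one obtained from <strong>Backup Path</strong>.</p>

```shell
# Assumption: 20 GB expressed in 1 KB blocks, as reported by "df -Pk".
MIN_FREE_KB=$(( 20 * 1024 * 1024 ))

# Succeeds when the given available space (in KB) is not less than 20 GB.
has_enough_space() {
  [ "$1" -ge "$MIN_FREE_KB" ]
}

# Print the available space, in KB, of the file system that holds the given path.
available_kb() {
  df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Usage, with the backup path obtained in the procedure as the argument:
#   has_enough_space "$(available_kb /path/to/backup)" || echo "Less than 20 GB available"
```

<p>The <strong>-P</strong> option keeps each file system on one output line, so the fourth column of the second line is the available space.</p>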
|
||||
<p class="subitemlist" id="ALM-12034__p4445114113355"><strong id="ALM-12034__b1570250993141">Collect fault information.</strong></p>
|
||||
<ol start="13" id="ALM-12034__ol135018418359"><li id="ALM-12034__li115006411351"><a name="ALM-12034__li115006411351"></a><a name="li115006411351"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12034__b8500174113511">O&M</strong> > <strong id="ALM-12034__b4500941173512">Log > Download</strong>.</span></li><li id="ALM-12034__li13500174119354"><span>Select <strong id="ALM-12034__b450034113518">Controller</strong> from the <strong id="ALM-12034__b150044112358">Service</strong> and click <strong id="ALM-12034__b3991118545">OK</strong>.</span></li><li id="ALM-12034__li2501144119351"><span>Click <span><img id="ALM-12034__image13500184111355" src="en-us_image_0269383844.png"></span> in the upper right corner, and set <strong id="ALM-12034__b450010417354">Start Date</strong> and <strong id="ALM-12034__b1250124110357">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12034__b1950164118356">Download</strong>.</span></li><li id="ALM-12034__li495644512588"><span>Contact the <span id="ALM-12034__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
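<p>The 20 GB free-space requirement in the procedure above can be checked with a short script before the backup task is rerun. This is a minimal sketch and not part of the product: the function names are hypothetical, the path passed in is whichever local backup directory applies on the node, and 1 GB is assumed to mean 2**30 bytes.</p>

```python
import shutil

REQUIRED_FREE_GB = 20  # threshold taken from the procedure above


def free_space_gb(path: str) -> float:
    """Return the free space, in GB (assumed 2**30 bytes), of the file system holding path."""
    return shutil.disk_usage(path).free / 2**30


def has_enough_space(path: str, required_gb: float = REQUIRED_FREE_GB) -> bool:
    """Return True if the file system holding path has more than required_gb free."""
    return free_space_gb(path) > required_gb


if __name__ == "__main__":
    # "." is a placeholder; replace it with the actual backup directory.
    print(f"free: {free_space_gb('.'):.1f} GB, enough: {has_enough_space('.')}")
```

<p>If the check returns False, move or delete backup packages as described in the procedure, then rerun the check.</p>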
|
||||
<div class="section" id="ALM-12034__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12034__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12034__se2fbd74633544c99bb6e569541c957f0"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12034__en-us_topic_0070543608_p19444344">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
84
docs/mrs/umn/ALM-12035.html
Normal file
@ -0,0 +1,84 @@
|
||||
<a name="ALM-12035"></a><a name="ALM-12035"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12035 Unknown Data Status After Recovery Task Failure</h1>
|
||||
<div id="body47891496"><div class="section" id="ALM-12035__sb48ec6ce00d143b1b6fb7ce7774732e6"><h4 class="sectiontitle">Description</h4><p id="ALM-12035__en-us_topic_0070543609_p54112093">After a recovery task fails, the system automatically rolls back every 60 minutes. If the rollback fails, data may be lost, and this alarm is generated. This alarm is cleared when the next recovery task succeeds.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__saf7fb5d773874236a4f047fe49281308"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12035__en-us_topic_0070543609_table21003421" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12035__en-us_topic_0070543609_row63171031"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12035__en-us_topic_0070543609_p16579864">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12035__en-us_topic_0070543609_p791775">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12035__en-us_topic_0070543609_p64133777">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12035__en-us_topic_0070543609_row27453464"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12035__en-us_topic_0070543609_p9138140">12035</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12035__en-us_topic_0070543609_p1991876">Critical</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12035__en-us_topic_0070543609_p27124281">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__s27c794d7f2c14e7aa89b4a425e71bebf"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12035__en-us_topic_0070543609_table49583138" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12035__en-us_topic_0070543609_row9224964"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12035__en-us_topic_0070543609_p9024624">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12035__en-us_topic_0070543609_p59905940">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12035__row7361613185419"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12035__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12035__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12035__en-us_topic_0070543609_row20543001"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12035__en-us_topic_0070543609_p53370407">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12035__en-us_topic_0070543609_p28035685">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12035__en-us_topic_0070543609_row50994573"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12035__en-us_topic_0070543609_p36919767">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12035__en-us_topic_0070543609_p37711141">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12035__en-us_topic_0070543609_row3855953"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12035__en-us_topic_0070543609_p43896801">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12035__en-us_topic_0070543609_p65980010">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12035__en-us_topic_0070543609_row56949179"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12035__en-us_topic_0070543609_p49480747">TaskName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12035__en-us_topic_0070543609_p48517543">Specifies the task for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__sd2732db61cd24c008c2b028c1dade437"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12035__en-us_topic_0070543609_p37606936">After the recovery task fails, the system automatically rolls back. If the rollback fails, data may be lost or the data status may be unknown, which may affect services.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__sacfd7ee8334740b0ba21d1763037c632"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12035__en-us_topic_0070543609_p26263014">The alarm cause depends on the task details. Handle the alarm according to the logs and alarm details.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__s9bf9cfe815d64aefa40fafcd22fe46e5"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12035__en-us_topic_0070543609_p46929400"><strong id="ALM-12035__b3404416894635">Collect fault information.</strong></p>
|
||||
<ol id="ALM-12035__ol8912728101615"><li id="ALM-12035__li1191262861614"><span>On FusionInsight Manager, choose <strong id="ALM-12035__b14623152119812">Cluster > </strong><em id="ALM-12035__i56519211481">Name of the desired cluster</em><strong id="ALM-12035__b1162417211489"> > Services</strong>, and check whether the running status of each component meets the requirements. (OMS and DBService must be in the normal state, and all other components must be stopped.)</span><p><ul id="ALM-12035__ul18912142801616"><li id="ALM-12035__li39121528141617">If yes, go to <a href="#ALM-12035__li18912172820165">9</a>.</li><li id="ALM-12035__li69121828141614">If no, go to <a href="#ALM-12035__li16912228111613">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12035__li16912228111613"><a name="ALM-12035__li16912228111613"></a><a name="li16912228111613"></a><span>Restore the component status as required and start the recovery task again.</span></li><li id="ALM-12035__li49121828171617"><span>Log in to the FusionInsight Manager portal and choose <strong id="ALM-12035__b0912162814167">O&M > Alarm<strong id="ALM-12035__b19912728151615"> > Alarms</strong></strong>.</span></li><li id="ALM-12035__li591222818167"><span>In the alarm list, click <span><img id="ALM-12035__image159128280169" src="en-us_image_0269383845.png"></span> in the row of the alarm to obtain <strong id="ALM-12035__b59121128141611">TaskName</strong> from <strong id="ALM-12035__b2912162815161">Location</strong>.</span></li><li id="ALM-12035__li18912152891616"><span>Choose <strong id="ALM-12035__b12912162812167">O&M</strong> > <strong id="ALM-12035__b79123288163"><strong id="ALM-12035__b2912132812163">Backup and Restoration > </strong>Restoration Management</strong>.</span></li><li id="ALM-12035__li1912142813165"><span>Find the restoration task by <strong id="ALM-12035__b15912528101611">Task Name</strong> and view the task details.</span></li><li id="ALM-12035__li1991218288166"><span>Perform the recovery task again and check whether it succeeds.</span><p><ul class="subitemlist" id="ALM-12035__ul491292812164"><li id="ALM-12035__li10912122819168">If yes, go to <a href="#ALM-12035__li691272812168">8</a>.</li><li id="ALM-12035__li10912192811612">If no, go to <a href="#ALM-12035__li18912172820165">9</a>.</li></ul>
|
||||
</p></li><li id="ALM-12035__li691272812168"><a name="ALM-12035__li691272812168"></a><a name="li691272812168"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12035__ul191292851612"><li id="ALM-12035__li18912628181613">If yes, no further action is required.</li><li id="ALM-12035__li1991218285168">If no, go to <a href="#ALM-12035__li18912172820165">9</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12035__en-us_topic_0070543610_p36865955"><strong id="ALM-12035__b5671597695034">Collect fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12035__ol17912928131615"><li id="ALM-12035__li18912172820165"><a name="ALM-12035__li18912172820165"></a><a name="li18912172820165"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12035__b11912192811618">O&M</strong> > <strong id="ALM-12035__b4912112871618">Log > Download</strong>.</span></li><li id="ALM-12035__li29127284164"><span>Select <strong id="ALM-12035__b1491242841616">Controller</strong> for <strong id="ALM-12035__b9912928131617">Service</strong> and click <strong id="ALM-12035__b3991118545">OK</strong>.</span></li><li id="ALM-12035__li16912132810167"><span>Click <span><img id="ALM-12035__image119122281161" src="en-us_image_0269383846.png"></span> in the upper right corner, and set <strong id="ALM-12035__b4912228191616">Start Date</strong> and <strong id="ALM-12035__b19122028151619">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-12035__b891272814169">Download</strong>.</span></li><li id="ALM-12035__li495644512588"><span>Contact the <span id="ALM-12035__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
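<p>The log collection window in the last procedure above is simple date arithmetic around the alarm generation time. The sketch below shows how the Start Date and End Date values could be derived. It is an illustration only: the function name and the sample timestamp are hypothetical, and the 10-minute margin is the one stated in the procedure.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 10):
    """Return (start, end) covering margin_minutes before and after alarm_time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time, as read from the alarm details.
    start, end = log_window(datetime(2024, 1, 1, 3, 0, 0))
    print("Start Date:", start.isoformat(sep=" "))
    print("End Date:  ", end.isoformat(sep=" "))
```

<p>For alarms whose procedure asks for a wider window, pass a different margin_minutes value, for example 60 for one hour.</p>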
|
||||
<div class="section" id="ALM-12035__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12035__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12035__s8e7bd6abf271476484274c81c6be7153"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12035__en-us_topic_0070543609_p47386872">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
91
docs/mrs/umn/ALM-12038.html
Normal file
@ -0,0 +1,91 @@
|
||||
<a name="ALM-12038"></a><a name="ALM-12038"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12038 Monitoring Indicator Dumping Failure</h1>
|
||||
<div id="body63636060"><div class="section" id="ALM-12038__s8e9121c2c414434483ea97a53f56b6a3"><h4 class="sectiontitle">Description</h4><p id="ALM-12038__en-us_topic_0070543612_p1601797">After monitoring indicator dumping is configured on FusionInsight Manager, the system checks the monitoring indicator dumping result at the dumping interval (60 seconds by default). This alarm is generated when the dumping fails.</p>
|
||||
<p id="ALM-12038__en-us_topic_0070543612_p14416173">This alarm is cleared when dumping is successful.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__s31d8d809c03f4781ab23dff587f0e76c"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12038__en-us_topic_0070543612_table26859337" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12038__en-us_topic_0070543612_row30865786"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12038__en-us_topic_0070543612_p17100761">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12038__en-us_topic_0070543612_p42984396">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12038__en-us_topic_0070543612_p59184060">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12038__en-us_topic_0070543612_row29179536"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12038__en-us_topic_0070543612_p14732209">12038</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12038__en-us_topic_0070543612_p52458246">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12038__en-us_topic_0070543612_p21259511">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__s4c187bb4cd7440c38de973462970a402"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12038__en-us_topic_0070543612_table44298862" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12038__en-us_topic_0070543612_row34203206"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12038__en-us_topic_0070543612_p18996311">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12038__en-us_topic_0070543612_p62306224">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12038__row775017375524"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12038__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12038__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12038__en-us_topic_0070543612_row13639418"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12038__en-us_topic_0070543612_p31051101">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12038__en-us_topic_0070543612_p32111232">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12038__en-us_topic_0070543612_row20565633"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12038__en-us_topic_0070543612_p55203559">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12038__en-us_topic_0070543612_p42303331">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12038__en-us_topic_0070543612_row45185659"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12038__en-us_topic_0070543612_p36159792">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12038__en-us_topic_0070543612_p43262062">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__s00b9cb8c5c10409681288b82523f4a66"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12038__en-us_topic_0070543612_p14566108">The upper-layer management system cannot obtain monitoring indicators from the FusionInsight Manager system.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__s4e59ea22202b4f69831fdaa7a30f2974"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12038__en-us_topic_0070543612_ul39004066"><li id="ALM-12038__en-us_topic_0070543612_li15492278">The server cannot be connected.</li><li id="ALM-12038__en-us_topic_0070543612_li5212777">The save path on the server cannot be accessed.</li><li id="ALM-12038__en-us_topic_0070543612_li46914996">The monitoring indicator file fails to be uploaded.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__s78696e4fd8994f578e59819068f88bd9"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12038__en-us_topic_0070543612_p42018335"><strong id="ALM-12038__b39938613103615">Check whether the server connection is normal.</strong></p>
|
||||
<ol id="ALM-12038__ol614335103629"><li id="ALM-12038__li55118711103617"><span>Check whether the network between the FusionInsight Manager system and the server is normal.</span><p><ul class="subitemlist" id="ALM-12038__ul50863543103617"><li id="ALM-12038__li35971633103617">If yes, go to <a href="#ALM-12038__li44378490103617">3</a>.</li><li id="ALM-12038__li28021126103617">If no, go to <a href="#ALM-12038__li59131350103617">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li59131350103617"><a name="ALM-12038__li59131350103617"></a><a name="li59131350103617"></a><span>Contact the network administrator to recover the network and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12038__ul51309392103617"><li id="ALM-12038__li26306358103617">If yes, no further action is required.</li><li id="ALM-12038__li50440286103617">If no, go to <a href="#ALM-12038__li44378490103617">3</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li44378490103617"><a name="ALM-12038__li44378490103617"></a><a name="li44378490103617"></a><span>Choose <strong id="ALM-12038__b62420103103617">System</strong> > <strong id="ALM-12038__b24910022103617"><strong id="ALM-12038__b1861155518585">Interconnection</strong> > Upload Performance Data</strong> and check whether the FTP username, password, port, dump mode, and public key configured on the upload performance data page are consistent with the configuration on the server.</span><p><ul class="subitemlist" id="ALM-12038__ul19844024103617"><li id="ALM-12038__li4445911103617">If yes, go to <a href="#ALM-12038__li31439394103617">5</a>.</li><li id="ALM-12038__li24574512103617">If no, go to <a href="#ALM-12038__li38260071103617">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li38260071103617"><a name="ALM-12038__li38260071103617"></a><a name="li38260071103617"></a><span>Enter the correct configuration information, click <strong id="ALM-12038__b63862097103617">OK</strong>, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12038__ul38583553103617"><li id="ALM-12038__li37887965103617">If yes, no further action is required.</li><li id="ALM-12038__li49026304103617">If no, go to <a href="#ALM-12038__li31439394103617">5</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12038__p11707659103617"><strong id="ALM-12038__b17484166103639">Check whether the permission of the save path on the server is correct.</strong></p>
|
||||
<ol start="5" id="ALM-12038__ol54871352103655"><li id="ALM-12038__li31439394103617"><a name="ALM-12038__li31439394103617"></a><a name="li31439394103617"></a><span>Choose <strong id="ALM-12038__b5975520704">System</strong> > <strong id="ALM-12038__b29779208015"><strong id="ALM-12038__b1997716201808">Interconnection</strong> > Upload Performance Data</strong> and check the configuration items <strong id="ALM-12038__b41413883103617">FTP Username</strong>, <strong id="ALM-12038__b37180634103617">Save Path</strong>, and <strong id="ALM-12038__b66190256103617">Dump Mode</strong>.</span><p><ul class="subitemlist" id="ALM-12038__ul48232508103617"><li id="ALM-12038__li59810542103617">If the dump mode is FTP, go to <a href="#ALM-12038__li58736977103617">6</a>.</li><li id="ALM-12038__li12815708103617">If the dump mode is SFTP, go to <a href="#ALM-12038__li38059143103617">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li58736977103617"><a name="ALM-12038__li58736977103617"></a><a name="li58736977103617"></a><span>Log in to the server in FTP mode. In the default path, check whether <strong id="ALM-12038__b14519094103617">FTP Username</strong> has the read and write permission of the relative path <strong id="ALM-12038__b63562985103617">Save Path</strong>.</span><p><ul class="subitemlist" id="ALM-12038__ul66178654103617"><li id="ALM-12038__li48328152103617">If yes, go to <a href="#ALM-12038__li35446984103617">9</a>.</li><li id="ALM-12038__li22266264103617">If no, go to <a href="#ALM-12038__li47558825103617">8</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li38059143103617"><a name="ALM-12038__li38059143103617"></a><a name="li38059143103617"></a><span>Log in to the server in SFTP mode and check whether <strong id="ALM-12038__b58870746103617">FTP Username</strong> has the read and write permission of the absolute path <strong id="ALM-12038__b60074666103617">Save Path</strong>.</span><p><ul class="subitemlist" id="ALM-12038__ul41511495103617"><li id="ALM-12038__li34209739103617">If yes, go to <a href="#ALM-12038__li35446984103617">9</a>.</li><li id="ALM-12038__li19525469103617">If no, go to <a href="#ALM-12038__li47558825103617">8</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li47558825103617"><a name="ALM-12038__li47558825103617"></a><a name="li47558825103617"></a><span>Add the read and write permission and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12038__ul61067974103617"><li id="ALM-12038__li6987973103617">If yes, no further action is required.</li><li id="ALM-12038__li29154951103617">If no, go to <a href="#ALM-12038__li35446984103617">9</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12038__p12740854103617"><strong id="ALM-12038__b5271163310375">Check whether the save path on the server has sufficient disk space.</strong></p>
|
||||
<ol start="9" id="ALM-12038__ol17378994103715"><li id="ALM-12038__li35446984103617"><a name="ALM-12038__li35446984103617"></a><a name="li35446984103617"></a><span>Log in to the server and check whether the save path has sufficient disk space.</span><p><ul class="subitemlist" id="ALM-12038__ul63590877103617"><li id="ALM-12038__li27059654103617">If yes, go to <a href="#ALM-12038__li51692141103617">11</a>.</li><li id="ALM-12038__li44348355103617">If no, go to <a href="#ALM-12038__li53095195103617">10</a>.</li></ul>
|
||||
</p></li><li id="ALM-12038__li53095195103617"><a name="ALM-12038__li53095195103617"></a><a name="li53095195103617"></a><span>Delete unnecessary files or go to the monitoring indicator dumping configuration page to change the save path. Then, check whether the save path has sufficient disk space.</span><p><ul class="subitemlist" id="ALM-12038__ul35452684103617"><li id="ALM-12038__li50587406103617">If yes, no further action is required.</li><li id="ALM-12038__li3939187103617">If no, go to <a href="#ALM-12038__li51692141103617">11</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12038__p50638708103617"><strong id="ALM-12038__b6018239103721">Collect fault information.</strong></p>
|
||||
<ol start="11" id="ALM-12038__ol22765086103724"><li id="ALM-12038__li51692141103617"><a name="ALM-12038__li51692141103617"></a><a name="li51692141103617"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12038__b2056231918912">O&M</strong> > <strong id="ALM-12038__b5743571103617">Log > Download</strong>.</span></li><li id="ALM-12038__li51051832103617"><span>Select <strong id="ALM-12038__b1352831932712">OMS</strong> for <strong id="ALM-12038__b26313908103617">Service</strong> and click <strong id="ALM-12038__b3991118545">OK</strong>.</span></li><li id="ALM-12038__li1145664103113"><span>Click <span><img id="ALM-12038__image1945644173117" src="en-us_image_0269383850.png"></span> in the upper right corner, and set <strong id="ALM-12038__b6456941173117">Start Date</strong> and <strong id="ALM-12038__b11456154113318">End Date</strong> for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click <strong id="ALM-12038__b13456164113319">Download</strong>.</span></li><li id="ALM-12038__li495644512588"><span>Contact the <span id="ALM-12038__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
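<p>The first check in the procedure above, whether the network between FusionInsight Manager and the server is normal, can be approximated by attempting a TCP connection to the dump server. The sketch below is an assumption-laden illustration: the function name and host name are hypothetical, and ports 21 (FTP) and 22 (SFTP) are only the conventional defaults. Use the port actually configured on the Upload Performance Data page.</p>

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # "dump-server.example.com" is a placeholder for the configured server address.
    for name, port in (("FTP", 21), ("SFTP", 22)):
        print(name, port, can_connect("dump-server.example.com", port))
```

<p>A successful connection only shows that the port is reachable. The username, password, and public key still need to be compared with the server configuration as described in the procedure.</p>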
|
||||
<div class="section" id="ALM-12038__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12038__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12038__sa06d89749189409e92e39d8915030685"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12038__en-us_topic_0070543612_p20827005">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
104
docs/mrs/umn/ALM-12039.html
Normal file
File diff suppressed because it is too large
103
docs/mrs/umn/ALM-12040.html
Normal file
File diff suppressed because it is too large
88
docs/mrs/umn/ALM-12041.html
Normal file
@ -0,0 +1,88 @@
|
||||
<a name="ALM-12041"></a><a name="ALM-12041"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12041 Incorrect Permission on Key Files</h1>
|
||||
<div id="body34348483"><div class="section" id="ALM-12041__s19cc37a409aa4bd694d71622ab60001a"><h4 class="sectiontitle">Description</h4><p id="ALM-12041__en-us_topic_0070543616_p31363207">Every 5 minutes, the system checks whether the permission, user, and user group information of critical directories and files is normal. This alarm is generated when the information is abnormal.</p>
|
||||
<p id="ALM-12041__en-us_topic_0070543616_p13833415">This alarm is cleared when the information becomes normal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__s31a94c53b2eb49e491db76428ba8487c"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12041__en-us_topic_0070543616_table46764862" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12041__en-us_topic_0070543616_row3894347"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12041__en-us_topic_0070543616_p47006691">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12041__en-us_topic_0070543616_p49445640">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12041__en-us_topic_0070543616_p45673889">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12041__en-us_topic_0070543616_row8597527"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12041__en-us_topic_0070543616_p25311057">12041</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12041__en-us_topic_0070543616_p36929719">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12041__en-us_topic_0070543616_p38517299">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__s2c0e3cea5a70449bac745c4f6f3c5835"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12041__en-us_topic_0070543616_table32893482" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12041__en-us_topic_0070543616_row62521132"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12041__en-us_topic_0070543616_p31046898">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12041__en-us_topic_0070543616_p31770809">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12041__row104888561512"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12041__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12041__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12041__en-us_topic_0070543616_row23298728"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12041__en-us_topic_0070543616_p8148840">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12041__en-us_topic_0070543616_p56076321">Specifies the service name for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12041__en-us_topic_0070543616_row34924842"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12041__en-us_topic_0070543616_p10339914">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12041__en-us_topic_0070543616_p32226678">Specifies the role name for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12041__en-us_topic_0070543616_row21604653"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12041__en-us_topic_0070543616_p5146455">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12041__en-us_topic_0070543616_p14209730">Specifies the object (host ID) for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12041__en-us_topic_0070543616_row60778711"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12041__en-us_topic_0070543616_p24128582">PathName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12041__en-us_topic_0070543616_p8258121">Specifies the path or name of the abnormal file.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__sa7ca1b2b65694ea7bc5f899bfb251f8f"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12041__en-us_topic_0070543616_p64928073">System functions are unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__s40c63dc25cc84bfc9e3241365ab0f0bd"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12041__p1177141835411">The file permission is abnormal or the file is lost because a user manually modified information such as the file permission, user, or user group, or because the system was powered off unexpectedly.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__s97497fe5175042fc8f02531ea6f82aa1"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12041__en-us_topic_0070543616_p53128798"><strong id="ALM-12041__b5076943711011">Check whether the abnormal file exists and whether the permission on the abnormal file is correct.</strong></p>
|
||||
<ol id="ALM-12041__ol5738633911023"><li id="ALM-12041__li4564192111014"><span>On the FusionInsight Manager portal, choose <strong id="ALM-12041__b2744094511014">O&M > Alarm<strong id="ALM-12041__b27872374104950"> > Alarms</strong></strong>.</span></li><li id="ALM-12041__li5407306011014"><span>Check the value of <strong id="ALM-12041__b812410911014">HostName</strong> to obtain the host name involved in this alarm. Check the value of <strong id="ALM-12041__b600811711014">PathName</strong> to obtain the path or name of the abnormal file.</span></li><li id="ALM-12041__li1784176011014"><span>Log in to the node for which the alarm is generated as user <strong id="ALM-12041__b1689549811014">root</strong>. <span id="ALM-12041__text43649449460"></span></span></li><li id="ALM-12041__li2193450211014"><span>Run the <strong id="ALM-12041__b2635812011014">ll </strong><em id="ALM-12041__i3589648911014">pathName</em> command to obtain the user, permission, and user group information about the file or directory, where <em id="ALM-12041__i5463295011014">pathName</em> is the path or name of the abnormal file.</span></li><li id="ALM-12041__li1834285111014"><a name="ALM-12041__li1834285111014"></a><a name="li1834285111014"></a><span>Go to the <strong id="ALM-12041__b6319279611014">${BIGDATA_HOME}/om-agent/nodeagent/etc/agent/autocheck</strong> directory.
Then run the <strong id="ALM-12041__b3186425611014">vi keyfile</strong> command, search for the name of the abnormal file, and check the expected permission of the file.</span><p><div class="note" id="ALM-12041__note21303849111810"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12041__p32947381111823">To ensure proper configuration synchronization between the active and standby OMS servers, the files and directories configured in <strong id="ALM-12041__b28090979111823">$OMS_RUN_PATH/workspace/ha/module/hasync/plugin/conf/filesync.xml </strong>, including the files and sub-directories in those directories, are monitored in addition to the files and directories listed in <strong id="ALM-12041__b51492227111823">keyfile</strong>. User <strong id="ALM-12041__b60776860111823">omm </strong>must have read and write permissions on the files and read and execute permissions on the directories.</p>
|
||||
</div></div>
|
||||
</p></li><li id="ALM-12041__li937595411014"><span>Compare the actual permission of the file with the expected permission obtained in <a href="#ALM-12041__li1834285111014">5</a> and correct the permission, user, and user group information for the file.</span></li><li id="ALM-12041__li75110811014"><span>Wait an hour and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12041__ul4392001111014"><li id="ALM-12041__li1727472911014">If yes, no further action is required.</li><li id="ALM-12041__li5707578411014">If no, go to <a href="#ALM-12041__li1068683211014">8</a>.</li></ul>
|
||||
<div class="note" id="ALM-12041__note50974664111832"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12041__p22323065111841">If the disk partition where the cluster installation directory resides is full, a failed <strong id="ALM-12041__b66689858111841">sed</strong> command leaves temporary files in the program installation directory. Users do not have read, write, or execute permissions on these temporary files. If the files are within the monitoring range of this alarm, the system reports that their permissions are abnormal. Perform the preceding alarm handling procedure to clear the alarm. Alternatively, after confirming that the files with abnormal permissions are temporary files, delete them directly. The temporary file generated after a <strong id="ALM-12041__b63337813111841">sed</strong> command execution failure is similar to the following.</p>
|
||||
</div></div>
|
||||
<p class="subitemlist" id="ALM-12041__p132194544418"><span><img id="ALM-12041__image13221252114113" src="en-us_image_0269383855.jpg"></span></p>
|
||||
</p></li></ol>
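<p>The comparison in the steps above can also be scripted. The following is a minimal sketch, not a product tool: <strong>check_perm</strong> is a hypothetical helper name, the expected mode, user, and user group are passed in by hand after you look them up in <strong>keyfile</strong> (the format of <strong>keyfile</strong> is not assumed here), and <strong>stat -c</strong> assumes GNU coreutils.</p>

```shell
#!/bin/sh
# check_perm PATH MODE OWNER GROUP
# Hypothetical helper: compares the actual mode, owner, and group of PATH
# with the expected values that you looked up manually in keyfile.
check_perm() {
    path=$1
    expected="$2 $3 $4"
    # %a = octal mode, %U = owner name, %G = group name (GNU stat)
    actual=$(stat -c '%a %U %G' "$path") || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $path ($actual)"
    else
        echo "MISMATCH: $path is '$actual', expected '$expected'"
        return 1
    fi
}
```

<p>For example, <strong>check_perm </strong><em>pathName</em><strong> 600 omm wheel</strong> compares the file against a mode of 600, user omm, and user group wheel; these values are placeholders only. If the helper reports a mismatch, correct the file with <strong>chmod</strong> and <strong>chown</strong> as described in the procedure.</p>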
|
||||
<p class="tableheading" id="ALM-12041__p5973578011014"><strong id="ALM-12041__b120539411028">Collect fault information.</strong></p>
|
||||
<ol start="8" id="ALM-12041__ol6667694311030"><li id="ALM-12041__li1068683211014"><a name="ALM-12041__li1068683211014"></a><a name="li1068683211014"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12041__b675997211014">O&M</strong> > <strong id="ALM-12041__b6083974911014">Log > Download</strong>.</span></li><li id="ALM-12041__li5465607911014"><span>Select <strong id="ALM-12041__b2907263111014">NodeAgent</strong> for <strong id="ALM-12041__b6032708911014">Service</strong> and click <strong id="ALM-12041__b3991118545">OK</strong>.</span></li><li id="ALM-12041__li1145664103113"><span>Click <span><img id="ALM-12041__image1945644173117" src="en-us_image_0269383856.png"></span> in the upper right corner, and set <strong id="ALM-12041__b6456941173117">Start Date</strong> and <strong id="ALM-12041__b11456154113318">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12041__b13456164113319">Download</strong>.</span></li><li id="ALM-12041__li495644512588"><span>Contact the <span id="ALM-12041__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12041__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12041__se412130f5804478682e4033a07e24342"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12041__en-us_topic_0070543616_p23767697">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
93
docs/mrs/umn/ALM-12042.html
Normal file
@ -0,0 +1,93 @@
|
||||
<a name="ALM-12042"></a><a name="ALM-12042"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12042 Incorrect Configuration of Key Files</h1>
|
||||
<div id="body906005"><div class="section" id="ALM-12042__sc3cca1c900634e64b9bcb9236c56da57"><h4 class="sectiontitle">Description</h4><p id="ALM-12042__en-us_topic_0070543617_p6246726">The system checks whether critical configurations are correct every 5 minutes. This alarm is generated when the configurations are abnormal.</p>
|
||||
<p id="ALM-12042__en-us_topic_0070543617_p56220539">This alarm is cleared when the configurations become normal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__sa04974fe3bf94125b8210b81160d194e"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12042__en-us_topic_0070543617_table57569777" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12042__en-us_topic_0070543617_row9019511"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12042__en-us_topic_0070543617_p59491808">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12042__en-us_topic_0070543617_p54107113">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12042__en-us_topic_0070543617_p20599994">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12042__en-us_topic_0070543617_row57986812"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12042__en-us_topic_0070543617_p66420182">12042</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12042__en-us_topic_0070543617_p11325681">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12042__en-us_topic_0070543617_p44964981">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__s7d7aa6df80b94fc798bdb00904628c42"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12042__en-us_topic_0070543617_table18284808" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12042__en-us_topic_0070543617_row52279956"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12042__en-us_topic_0070543617_p6818037">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12042__en-us_topic_0070543617_p15390090">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12042__row0134145115516"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12042__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12042__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12042__en-us_topic_0070543617_row38637755"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12042__en-us_topic_0070543617_p42650482">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12042__en-us_topic_0070543617_p32136980">Specifies the service name for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12042__en-us_topic_0070543617_row20797370"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12042__en-us_topic_0070543617_p6865386">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12042__en-us_topic_0070543617_p19225403">Specifies the role name for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12042__en-us_topic_0070543617_row38810903"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12042__en-us_topic_0070543617_p56675452">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12042__en-us_topic_0070543617_p27308885">Specifies the object (host ID) for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12042__en-us_topic_0070543617_row44453377"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12042__en-us_topic_0070543617_p43953764">PathName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12042__en-us_topic_0070543617_p3485103">Specifies the path or name of the abnormal file.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__scc749176d57847ee9dbe5ce2057bf3bd"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12042__en-us_topic_0070543617_p13857931">Functions related to the file are abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__s0ed39bd436594cd2af2414af2dd189c3"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12042__en-us_topic_0070543617_p48750623">The file configuration is modified manually or the system is powered off unexpectedly.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__sa72cb081ce9546069c49fa0a37a80746"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12042__en-us_topic_0070543617_p56486420"><strong id="ALM-12042__b364765131137">Check abnormal file configuration.</strong></p>
|
||||
<ol id="ALM-12042__ol5061051711317"><li id="ALM-12042__li181336611310"><span>On the FusionInsight Manager portal, choose <strong id="ALM-12042__b5239726811310">O&M > Alarm<strong id="ALM-12042__b27872374104950"> > Alarms</strong></strong>.</span></li><li id="ALM-12042__li4687569511310"><span>Check the value of <strong id="ALM-12042__b1632029711310">HostName</strong> to obtain the host name involved in this alarm. Check the value of <strong id="ALM-12042__b1266495111310">PathName</strong> to obtain the path or name of the abnormal file.</span></li><li id="ALM-12042__li3883495711310"><span>Log in to the node for which the alarm is generated as user <strong id="ALM-12042__b1922807611310">root</strong>. <span id="ALM-12042__text43649449460"></span></span></li><li id="ALM-12042__li5862385211310"><span>View the $BIGDATA_LOG_HOME/nodeagent/scriptlog/checkfileconfig.log file and analyze the cause based on the error log. Locate the check standards of the file in the <a href="#ALM-12042__en-us_topic_0070543617_cab">Related Information</a> and manually check and modify the file based on the standards.</span><p><p id="ALM-12042__p181921481219">Run the <strong id="ALM-12042__b18909133892210">vi </strong><em id="ALM-12042__i14910133813227">file name</em> command to enter the editing mode, and then press <strong id="ALM-12042__b25755449228">Insert</strong> to start editing.</p>
|
||||
<p id="ALM-12042__p16192108152119">After the modification is complete, press <strong id="ALM-12042__b976905012226">Esc</strong> to exit the editing mode and enter<strong id="ALM-12042__b161181354142215"> :wq</strong> to save the settings and exit.</p>
|
||||
<p id="ALM-12042__p830792372214">For example:</p>
|
||||
<p id="ALM-12042__p1819218813219"><strong id="ALM-12042__b0943142682220">vi /etc/ssh/sshd_config</strong></p>
|
||||
</p></li><li id="ALM-12042__li3021967611310"><span>Wait an hour and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12042__ul3019924411310"><li id="ALM-12042__li5785262811310">If yes, no further action is required.</li><li id="ALM-12042__li5555125411310">If no, go to <a href="#ALM-12042__li1843685711310">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12042__p335774111310"><strong id="ALM-12042__b2285498211323">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12042__ol3443027411326"><li id="ALM-12042__li1843685711310"><a name="ALM-12042__li1843685711310"></a><a name="li1843685711310"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12042__b354163311310">O&M</strong> > <strong id="ALM-12042__b3187470111310">Log > Download</strong>.</span></li><li id="ALM-12042__li3405016711310"><span>Select <strong id="ALM-12042__b3171399011310">NodeAgent</strong> for <strong id="ALM-12042__b1699046211310">Service</strong> and click <strong id="ALM-12042__b3991118545">OK</strong>.</span></li><li id="ALM-12042__li1145664103113"><span>Click <span><img id="ALM-12042__image1945644173117" src="en-us_image_0269383857.png"></span> in the upper right corner, and set <strong id="ALM-12042__b6456941173117">Start Date</strong> and <strong id="ALM-12042__b11456154113318">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12042__b13456164113319">Download</strong>.</span></li><li id="ALM-12042__li495644512588"><span>Contact the <span id="ALM-12042__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12042__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12042__en-us_topic_0070543617_cab"><a name="ALM-12042__en-us_topic_0070543617_cab"></a><a name="en-us_topic_0070543617_cab"></a><h4 class="sectiontitle">Related Information</h4><ul id="ALM-12042__en-us_topic_0070543617_ul41538621"><li id="ALM-12042__en-us_topic_0070543617_li38303271"><strong id="ALM-12042__en-us_topic_0070543617_b9185121">Check standards of /etc/fstab</strong><p id="ALM-12042__en-us_topic_0070543617_p15557228">Check whether the partitions configured in the <strong id="ALM-12042__en-us_topic_0070543617_b5797330">/etc/fstab</strong> file can be found in <strong id="ALM-12042__en-us_topic_0070543617_b52175970">/proc/mounts</strong>.</p>
|
||||
<p id="ALM-12042__p7844577162652">Check whether the swap partitions configured in fstab correspond to those in /proc/swaps.</p>
|
||||
</li><li id="ALM-12042__en-us_topic_0070543617_li52665528"><strong id="ALM-12042__en-us_topic_0070543617_b4227711">Check standards of /etc/hosts</strong><p id="ALM-12042__en-us_topic_0070543617_p38049407">Run <strong id="ALM-12042__en-us_topic_0070543617_b6900343">cat /etc/hosts</strong>. The <strong id="ALM-12042__en-us_topic_0070543617_b62103089">/etc/hosts</strong> configuration file is abnormal if any of the following conditions is met:</p>
|
||||
<ol id="ALM-12042__en-us_topic_0070543617_ol22056895"><li id="ALM-12042__en-us_topic_0070543617_li64294328">The <strong id="ALM-12042__en-us_topic_0070543617_b41778047">/etc/hosts</strong> file does not exist.</li><li id="ALM-12042__en-us_topic_0070543617_li40458108">The host name is not configured in the file.</li><li id="ALM-12042__en-us_topic_0070543617_li28578656">The host name maps to multiple IP addresses in the file.</li><li id="ALM-12042__en-us_topic_0070543617_li55881316">The IP address corresponding to the host name does not appear in the output of the <strong id="ALM-12042__b482923119409">ifconfig </strong>command.</li><li id="ALM-12042__en-us_topic_0070543617_li33169801">One IP address maps to multiple host names in the file.</li></ol>
|
||||
</li><li id="ALM-12042__en-us_topic_0070543617_li30092761"><strong id="ALM-12042__en-us_topic_0070543617_b2399395">Check standards of /etc/ssh/sshd_config</strong><p id="ALM-12042__en-us_topic_0070543617_p21594562">Run the <strong id="ALM-12042__en-us_topic_0070543617_b60133334">vi /etc/ssh/sshd_config</strong> command to check whether configuration items are configured as follows:</p>
|
||||
<ol id="ALM-12042__en-us_topic_0070543617_ol4329096"><li id="ALM-12042__en-us_topic_0070543617_li38961865">The value of <strong id="ALM-12042__en-us_topic_0070543617_b15112466">UseDNS</strong> must be set to <strong id="ALM-12042__en-us_topic_0070543617_b1794468">no</strong>.</li><li id="ALM-12042__en-us_topic_0070543617_li16150216">The value of <strong id="ALM-12042__en-us_topic_0070543617_b11134224">MaxStartups</strong> must be greater than or equal to 1000.</li><li id="ALM-12042__en-us_topic_0070543617_li33099156">At least one of the <strong id="ALM-12042__en-us_topic_0070543617_b29456948">PasswordAuthentication</strong> and <strong id="ALM-12042__en-us_topic_0070543617_b63785947">ChallengeResponseAuthentication</strong> parameters must be left blank or set to <strong id="ALM-12042__en-us_topic_0070543617_b37202617">yes</strong>.</li></ol>
|
||||
</li></ul>
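<p>The two mapping rules for <strong>/etc/hosts</strong> listed above (a host name that maps to multiple IP addresses, and an IP address that maps to multiple host names) can be checked with a short script. The following is a minimal sketch under stated assumptions: <strong>check_hosts</strong> is a hypothetical helper name, not a product command, and it applies the two rules literally to every entry. Loopback entries with aliases are therefore reported as well; how the system's own checker treats such entries is not stated in this topic.</p>

```shell
#!/bin/sh
# check_hosts FILE
# Hypothetical helper: reports host names mapped to more than one IP address
# and IP addresses mapped to more than one host name in a hosts-format file.
check_hosts() {
    awk '
        { sub(/#.*/, "") }      # strip comments; awk re-splits the fields
        NF < 2 { next }         # skip blank or incomplete lines
        {
            ip = $1
            for (i = 2; i <= NF; i++) {
                name = $i
                if (!((name, ip) in seen)) {   # count each pair only once
                    seen[name, ip] = 1
                    ips_per_name[name]++
                    names_per_ip[ip]++
                }
            }
        }
        END {
            for (n in ips_per_name)
                if (ips_per_name[n] > 1) print "host name with multiple IPs: " n
            for (a in names_per_ip)
                if (names_per_ip[a] > 1) print "IP with multiple host names: " a
        }
    ' "$1" | LC_ALL=C sort
}
```

<p>Run <strong>check_hosts /etc/hosts</strong> on the node for which the alarm is generated. No output means that neither of the two mapping rules is violated; the remaining check standards still need to be verified manually.</p>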
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
161
docs/mrs/umn/ALM-12045.html
Normal file
File diff suppressed because it is too large
96
docs/mrs/umn/ALM-12046.html
Normal file
@ -0,0 +1,96 @@
|
||||
<a name="ALM-12046"></a><a name="ALM-12046"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12046 Write Packet Dropped Rate Exceeds the Threshold</h1>
|
||||
<div id="body35044409"><div class="section" id="ALM-12046__section63708257"><h4 class="sectiontitle">Description</h4><p id="ALM-12046__p20024890">The system checks the write packet dropped rate every 30 seconds. This alarm is generated when the write packet dropped rate exceeds the threshold (0.5% by default) multiple times (<strong id="ALM-12046__b143131152516">5</strong> by default).</p>
|
||||
<p id="ALM-12046__p46006286">To change the threshold, choose <strong id="ALM-12046__b671126165910">O&M</strong> > <strong id="ALM-12046__b373718611599">Alarm</strong> > <strong id="ALM-12046__b1674813614593">Thresholds</strong> > <em id="ALM-12046__i17776645912">Name of the desired cluster</em> > <strong id="ALM-12046__b6790176115911">Host</strong> > <strong id="ALM-12046__b18803263596">Network Writing</strong> > <strong id="ALM-12046__b1482086135918">Write Packet Dropped Rate</strong>.</p>
|
||||
<p id="ALM-12046__p11403395">If <strong id="ALM-12046__b184431148607">Trigger Count</strong> is <strong id="ALM-12046__b9468989256">1</strong>, this alarm is cleared when the network write packet dropped rate is less than or equal to the threshold. If <strong id="ALM-12046__b538013571803">Trigger Count</strong> is greater than <strong id="ALM-12046__b1946915882515">1</strong>, this alarm is cleared when the network write packet dropped rate is less than or equal to 90% of the threshold.</p>
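<p>The generation and clearing rules described above can be illustrated with a small simulation. This is a sketch only, not the system's implementation: <strong>alarm_states</strong> is a hypothetical helper name, the samples stand for one check every 30 seconds, and the sketch assumes that the threshold must be exceeded in consecutive checks, which this topic does not state explicitly.</p>

```shell
#!/bin/sh
# alarm_states THRESHOLD TRIGGER_COUNT
# Hypothetical helper: reads one rate sample (in percent) per line on stdin
# and prints the sample followed by the alarm state after that sample.
alarm_states() {
    awk -v thr="$1" -v cnt="$2" '
        BEGIN {
            # Trigger Count = 1: clear at the threshold; otherwise at 90% of it
            clear_at = (cnt > 1) ? thr * 0.9 : thr
            hits = 0; raised = 0
        }
        {
            rate = $1 + 0
            # Assumption: only consecutive samples above the threshold count
            if (rate > thr) hits++; else hits = 0
            if (!raised && hits >= cnt) raised = 1
            else if (raised && rate <= clear_at) raised = 0
            print rate, (raised ? "ALARM" : "ok")
        }
    '
}
```

<p>With the default values, <strong>printf '%s\n' 0.6 0.6 0.6 0.6 0.6 0.48 0.4 | alarm_states 0.5 5</strong> raises the alarm on the fifth sample, keeps it raised at 0.48 (below the threshold but above 90% of it), and clears it at 0.4.</p>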
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section36503402"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12046__table51259837" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12046__row11203870"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12046__p35098282">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12046__p24388590">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12046__p29318803">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12046__row26012865"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12046__p26667295">12046</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12046__p12567253">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12046__p11314572">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section60095170"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12046__table44065131" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12046__row57092581"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12046__p61096309">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12046__p49853969">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12046__row153946297383"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12046__row11639714"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p3292816">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p65391554">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12046__row51653081"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p23149996">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p63210425">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12046__row32022916"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p43719384">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p51609209">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12046__row61829697"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p42149567">Port Name</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p58671806">Specifies the network port for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12046__row58284214"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12046__p23400856">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12046__p16421199">Specifies the threshold for triggering the alarm.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section3985625"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12046__p55048771">The service performance deteriorates or some services time out.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section35870633"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12046__ul29765467"><li id="ALM-12046__li66562616">The alarm threshold is improperly configured.</li><li id="ALM-12046__li62192640">The network quality is poor.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section54400241"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12046__p4439065"><strong id="ALM-12046__b488114212259">Check whether the threshold is set properly.</strong></p>
|
||||
<ol id="ALM-12046__ol51757082114518"><li id="ALM-12046__li5429491411450"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12046__b530513292591">O&M</strong> > <strong id="ALM-12046__b73111296591">Alarm</strong> > <strong id="ALM-12046__b132072925911">Thresholds</strong> > <em id="ALM-12046__i1532816298599">Name of the desired cluster</em> > <strong id="ALM-12046__b534062916596">Host</strong> > <strong id="ALM-12046__b14347529165911">Network Writing</strong> > <strong id="ALM-12046__b13361529195913">Write Packet Dropped Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12046__b7369102916591">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12046__ul603276811450"><li id="ALM-12046__li1878771011450">If yes, go to <a href="#ALM-12046__li4369794811450">4</a>.</li><li id="ALM-12046__li4540955011450">If no, go to <a href="#ALM-12046__li5699560811450">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12046__li5699560811450"><a name="ALM-12046__li5699560811450"></a><a name="li5699560811450"></a><span>Choose <strong id="ALM-12046__b86275584598">O&M</strong> > <strong id="ALM-12046__b863815815596">Alarm</strong> > <strong id="ALM-12046__b46391158155914">Thresholds</strong> > <em id="ALM-12046__i17639135845918">Name of the desired cluster</em> > <strong id="ALM-12046__b1639175845912">Host</strong> > <strong id="ALM-12046__b1964015811598">Network Writing</strong> > <strong id="ALM-12046__b17640175865912">Write Packet Dropped Rate</strong>. Click <strong id="ALM-12046__b564014589596">Modify</strong> in the <strong id="ALM-12046__b1664135816596">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12046__p3581190711450">See <a href="#ALM-12046__fig153215311450">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12046__fig153215311450"><a name="ALM-12046__fig153215311450"></a><a name="fig153215311450"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12046__image1482785044213" src="en-us_image_0000001390459444.png"></span></div>
|
||||
</p></li><li id="ALM-12046__li1629248811450"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12046__ul1759973611450"><li id="ALM-12046__li4319843211450">If yes, no further action is required.</li><li id="ALM-12046__li941206611450">If no, go to <a href="#ALM-12046__li4369794811450">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12046__p2417989711450"><strong id="ALM-12046__b284519296260">Check whether the network connection is normal.</strong></p>
|
||||
<ol start="4" id="ALM-12046__ol50526507114556"><li id="ALM-12046__li4369794811450"><a name="ALM-12046__li4369794811450"></a><a name="li4369794811450"></a><span>Contact the network administrator to check whether the network is abnormal.</span><p><ul class="subitemlist" id="ALM-12046__ul4959457011450"><li id="ALM-12046__li4462316111450">If yes, rectify the fault and go to <a href="#ALM-12046__li6056359711450">5</a>.</li><li id="ALM-12046__li5770629011450">If no, go to <a href="#ALM-12046__li820146511450">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12046__li6056359711450"><a name="ALM-12046__li6056359711450"></a><a name="li6056359711450"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12046__ul1317526611450"><li id="ALM-12046__li5773721911450">If yes, no further action is required.</li><li id="ALM-12046__li4620316111450">If no, go to <a href="#ALM-12046__li820146511450">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12046__p5146853111450"><strong id="ALM-12046__b6696662511465">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12046__ol4187815011462"><li id="ALM-12046__li820146511450"><a name="ALM-12046__li820146511450"></a><a name="li820146511450"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12046__b82519710275">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12046__b92521274278">Log</strong> > <strong id="ALM-12046__b122521742710">Download</strong>.</span></li><li id="ALM-12046__li670432911450"><span>Select <strong id="ALM-12046__b73620916276">OMS</strong> for <strong id="ALM-12046__b53624992712">Service</strong> and click <strong id="ALM-12046__b17362129182711">OK</strong>.</span></li><li id="ALM-12046__li6033896511450"><span>Expand the <strong id="ALM-12046__b1511218191705">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12046__li617977311450"><span>Click <span><img id="ALM-12046__image92961342720" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12046__b12391113112719">Start Date</strong> and <strong id="ALM-12046__b53961311278">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12046__b23914137276">Download</strong>.</span></li><li id="ALM-12046__li3079963411450"><span>Contact <span id="ALM-12046__text26871216142711">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12046__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12046__section19840123"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12046__p40881857">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
96
docs/mrs/umn/ALM-12047.html
Normal file
@ -0,0 +1,96 @@
|
||||
<a name="ALM-12047"></a><a name="ALM-12047"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12047 Read Packet Error Rate Exceeds the Threshold</h1>
|
||||
<div id="body45628126"><div class="section" id="ALM-12047__section664161"><h4 class="sectiontitle">Description</h4><p id="ALM-12047__p4890695">The system checks the read packet error rate every 30 seconds. This alarm is generated when the read packet error rate exceeds the threshold (<strong id="ALM-12047__b19977647102715">0.5%</strong> by default) a specified number of times (<strong id="ALM-12047__b18988647172717">5</strong> by default).</p>
|
||||
<p id="ALM-12047__p44016255">To change the threshold, choose <strong id="ALM-12047__b1031528304">O&M</strong> > <strong id="ALM-12047__b1318428807">Alarm</strong> > <strong id="ALM-12047__b1619528101">Thresholds</strong> > <em id="ALM-12047__i151917281903">Name of the desired cluster</em> > <strong id="ALM-12047__b11209289016">Host</strong> > <strong id="ALM-12047__b102114282015">Network Reading</strong> > <strong id="ALM-12047__b8217281019">Read Packet Error Rate</strong>.</p>
|
||||
<p id="ALM-12047__p60601976">If <strong id="ALM-12047__b676194810111">Trigger Count</strong> is <strong id="ALM-12047__b167621155132713">1</strong>, this alarm is cleared when the read packet error rate is less than or equal to the threshold. If <strong id="ALM-12047__b47189568116">Trigger Count</strong> is greater than <strong id="ALM-12047__b17762755172711">1</strong>, this alarm is cleared when the read packet error rate is less than or equal to 90% of the threshold.</p>
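<p>The trigger and clearing rule above can be read as a small state machine with hysteresis. The following Python sketch is illustrative only and is not FusionInsight Manager code: the class name <code>ThresholdAlarm</code> and its fields are hypothetical, and it assumes that breaches are counted on consecutive 30-second samples, which this topic does not state explicitly.</p>

```python
from dataclasses import dataclass


@dataclass
class ThresholdAlarm:
    """Hypothetical model of the trigger/clear rule described in this topic."""

    threshold: float = 0.5   # percent; default threshold from this topic
    trigger_count: int = 5   # default Trigger Count from this topic
    active: bool = False
    _breaches: int = 0       # consecutive breaching samples (assumption)

    def clear_level(self) -> float:
        # Trigger Count 1 clears at the threshold; otherwise at 90% of it.
        if self.trigger_count == 1:
            return self.threshold
        return 0.9 * self.threshold

    def observe(self, rate: float) -> bool:
        """Feed one 30-second sample (in percent) and return the alarm state."""
        if not self.active:
            # Count breaches in a row; any non-breaching sample resets the count.
            self._breaches = self._breaches + 1 if rate > self.threshold else 0
            if self._breaches >= self.trigger_count:
                self.active = True
        elif rate <= self.clear_level():
            self.active = False
            self._breaches = 0
        return self.active
```

<p>With the defaults, five samples above 0.5% raise the alarm, and it stays raised until a sample falls to 0.45% or lower. The gap between the trigger level and the clearing level keeps the alarm from flapping when the rate hovers near the threshold.</p>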
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section5977455"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12047__table9813018" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12047__row10452915"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12047__p41379750">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12047__p63425491">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12047__p37191136">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12047__row59692021"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12047__p3215547">12047</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12047__p59132761">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12047__p25024376">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section53797099"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12047__table13708608" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12047__row12493869"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12047__p5370469">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12047__p32354858">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12047__row242219184387"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12047__row3497827"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p14888569">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p65123411">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12047__row49239789"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p28999977">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p187933">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12047__row1691404"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p2786056">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p24344017">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12047__row17769566"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p30048708">Port Name</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p18026283">Specifies the network port for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12047__row28018822"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12047__p54932113">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12047__p20316144">Specifies the threshold for triggering the alarm.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section14411846"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12047__p34994956">The communication is intermittently interrupted, and services time out.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section62597753"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12047__ul16019148"><li id="ALM-12047__li9954605">The alarm threshold is improperly configured.</li><li id="ALM-12047__li22482584">The network quality is poor.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section26508869"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12047__p9150041"><strong id="ALM-12047__b48301864144321">Check whether the threshold is set properly.</strong></p>
|
||||
<ol id="ALM-12047__ol5621610514492"><li id="ALM-12047__li16991200144325"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12047__b1096642210512">O&M</strong> > <strong id="ALM-12047__b2977102211513">Alarm</strong> > <strong id="ALM-12047__b10994192210512">Thresholds</strong> > <em id="ALM-12047__i1822231959">Name of the desired cluster</em> > <strong id="ALM-12047__b614102317519">Host</strong> > <strong id="ALM-12047__b121811235515">Network Reading</strong> > <strong id="ALM-12047__b13271323151">Read Packet Error Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12047__b173611231556">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12047__ul54083694144325"><li id="ALM-12047__li61199409144325">If yes, go to <a href="#ALM-12047__li47122569144325">4</a>.</li><li id="ALM-12047__li58205082144325">If no, go to <a href="#ALM-12047__li18938060144325">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12047__li18938060144325"><a name="ALM-12047__li18938060144325"></a><a name="li18938060144325"></a><span>Choose <strong id="ALM-12047__b11895317762">O&M</strong> > <strong id="ALM-12047__b789714171965">Alarm</strong> > <strong id="ALM-12047__b9898141713613">Thresholds</strong> > <em id="ALM-12047__i389981710618">Name of the desired cluster</em> > <strong id="ALM-12047__b179008171767">Host</strong> > <strong id="ALM-12047__b109007171611">Network Reading</strong> > <strong id="ALM-12047__b790117174615">Read Packet Error Rate</strong>. Click <strong id="ALM-12047__b139032017464">Modify</strong> in the <strong id="ALM-12047__b169038177610">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12047__p34109930144325">See <a href="#ALM-12047__fig35859496144325">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12047__fig35859496144325"><a name="ALM-12047__fig35859496144325"></a><a name="fig35859496144325"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12047__image777621374319" src="en-us_image_0000001441218249.png"></span></div>
|
||||
</p></li><li id="ALM-12047__li11450397144325"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12047__ul34110047144325"><li id="ALM-12047__li36224819144325">If yes, no further action is required.</li><li id="ALM-12047__li48529247144325">If no, go to <a href="#ALM-12047__li47122569144325">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12047__p38554968144325"><strong id="ALM-12047__b1663413103111">Check whether the network connection is normal.</strong></p>
|
||||
<ol start="4" id="ALM-12047__ol41298640144920"><li id="ALM-12047__li47122569144325"><a name="ALM-12047__li47122569144325"></a><a name="li47122569144325"></a><span>Contact the network administrator to check whether the network is abnormal.</span><p><ul class="subitemlist" id="ALM-12047__ul12692381144325"><li id="ALM-12047__li55066931144325">If yes, rectify the fault and go to <a href="#ALM-12047__li52164171144325">5</a>.</li><li id="ALM-12047__li31236426144325">If no, go to <a href="#ALM-12047__li66824355144325">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12047__li52164171144325"><a name="ALM-12047__li52164171144325"></a><a name="li52164171144325"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12047__ul644002144325"><li id="ALM-12047__li21449944144325">If yes, no further action is required.</li><li id="ALM-12047__li59723879144325">If no, go to <a href="#ALM-12047__li66824355144325">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12047__p37260279144922"><strong id="ALM-12047__b41163092144926">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12047__ol4946431144932"><li id="ALM-12047__li66824355144325"><a name="ALM-12047__li66824355144325"></a><a name="li66824355144325"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12047__b114185633111">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12047__b912756123118">Log</strong> > <strong id="ALM-12047__b313125673116">Download</strong>.</span></li><li id="ALM-12047__li64548284144325"><span>Select <strong id="ALM-12047__b13721135814311">OMS</strong> for <strong id="ALM-12047__b8721758153120">Service</strong> and click <strong id="ALM-12047__b187221358143114">OK</strong>.</span></li><li id="ALM-12047__li44063647144325"><span>Expand the <strong id="ALM-12047__b1780712356614">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12047__li61028510144325"><span>Click <span><img id="ALM-12047__image1171914283214" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12047__b672772103210">Start Date</strong> and <strong id="ALM-12047__b9727729327">End Date</strong> for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-12047__b127281226321">Download</strong>.</span></li><li id="ALM-12047__li44362264144325"><span>Contact <span id="ALM-12047__text5904144183214">O&M personnel</span> and provide the collected logs.</span></li></ol>
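<p>The log collection window in the steps above is the alarm generation time plus and minus 30 minutes. As a minimal sketch, the values to enter for <strong>Start Date</strong> and <strong>End Date</strong> could be worked out as follows. The function name <code>log_window</code> and the sample timestamp are hypothetical and chosen only for illustration.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time: datetime, margin_minutes: int = 30):
    """Return the (start, end) pair surrounding the alarm generation time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, for illustration only.
start, end = log_window(datetime(2023, 1, 15, 3, 10))
print(start.isoformat(sep=" "), "->", end.isoformat(sep=" "))
# prints: 2023-01-15 02:40:00 -> 2023-01-15 03:40:00
```

<p>Using date arithmetic rather than editing the minutes by hand avoids mistakes when the window crosses an hour or day boundary.</p>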
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12047__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12047__section37253236"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12047__p42609442">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
96
docs/mrs/umn/ALM-12048.html
Normal file
@ -0,0 +1,96 @@
|
||||
<a name="ALM-12048"></a><a name="ALM-12048"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12048 Write Packet Error Rate Exceeds the Threshold</h1>
|
||||
<div id="body54650370"><div class="section" id="ALM-12048__section7565538"><h4 class="sectiontitle">Description</h4><p id="ALM-12048__p64603859">The system checks the write packet error rate every 30 seconds. This alarm is generated when the write packet error rate exceeds the threshold (<strong id="ALM-12048__b5386112713322">0.5%</strong> by default) a specified number of times (<strong id="ALM-12048__b183961527113215">5</strong> by default).</p>
|
||||
<p id="ALM-12048__p44563823">To change the threshold, choose <strong id="ALM-12048__b1763564319613">O&M</strong> > <strong id="ALM-12048__b1651194316612">Alarm</strong> > <strong id="ALM-12048__b865218437611">Thresholds</strong> > <em id="ALM-12048__i15653543160">Name of the desired cluster</em> > <strong id="ALM-12048__b1765418431268">Host</strong> > <strong id="ALM-12048__b176575431464">Network Writing</strong> > <strong id="ALM-12048__b7660343367">Write Packet Error Rate</strong>.</p>
|
||||
<p id="ALM-12048__p13165359111345">If <strong id="ALM-12048__b79521459210">Trigger Count</strong> is <strong id="ALM-12048__b203739349325">1</strong>, this alarm is cleared when the write packet error rate is less than or equal to the threshold. If <strong id="ALM-12048__b1684714149218">Trigger Count</strong> is greater than <strong id="ALM-12048__b1137463443211">1</strong>, this alarm is cleared when the write packet error rate is less than or equal to 90% of the threshold.</p>
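<p>This topic does not define how the write packet error rate is computed. One plausible reading on a Linux host, shown here purely as an assumption, is the share of transmit errors among transmitted packets between two snapshots of <code>/proc/net/dev</code>. The helper names <code>tx_counters</code> and <code>tx_error_rate</code> are hypothetical. The sketch takes the snapshot text as an argument so that it can be tried without access to a cluster node.</p>

```python
def tx_counters(proc_net_dev: str, iface: str):
    """Return (tx_packets, tx_errs) for one interface from /proc/net/dev text."""
    for line in proc_net_dev.splitlines():
        name, sep, counters = line.partition(":")
        if sep and name.strip() == iface:
            fields = counters.split()
            # Fields 0-7 are receive counters; 8-15 are transmit counters
            # (bytes, packets, errs, drop, fifo, colls, carrier, compressed).
            return int(fields[9]), int(fields[10])
    raise KeyError(iface)


def tx_error_rate(before: str, after: str, iface: str) -> float:
    """Percentage of transmit errors among packets sent between two snapshots."""
    packets_0, errs_0 = tx_counters(before, iface)
    packets_1, errs_1 = tx_counters(after, iface)
    sent = packets_1 - packets_0
    return 100.0 * (errs_1 - errs_0) / sent if sent > 0 else 0.0
```

<p>Because the kernel counters are cumulative, the rate is taken from the difference between two snapshots, for example 30 seconds apart to match the check interval described above. A value computed this way can help judge whether the configured threshold is reasonable for the host.</p>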
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section980979"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12048__table6337270" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12048__row27207367"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12048__p56313123">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12048__p65069111">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12048__p36106668">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12048__row38959018"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12048__p1563864">12048</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12048__p59564126">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12048__p59964901">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section8828819"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12048__table25318804" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12048__row65941238"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12048__p39640022">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12048__p56725186">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12048__row7619201273812"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12048__row31337379"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p55299809">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p50099555">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12048__row48242818"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p15354149">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p35726594">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12048__row53103893"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p6448057">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p52530602">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12048__row3013370"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p42756424">Port Name</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p40718359">Specifies the network port for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12048__row30920919"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12048__p21566473">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12048__p2053904">Specifies the threshold for triggering the alarm.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section12350514"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12048__p32148564">The communication is intermittently interrupted, and services time out.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section44045770"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12048__ul53896892"><li id="ALM-12048__li15309985">The alarm threshold is improperly configured.</li><li id="ALM-12048__li3572145">The network quality is poor.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section60867610"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12048__p20908314"><strong id="ALM-12048__b538516311339">Check whether the threshold is set properly.</strong></p>
|
||||
<ol id="ALM-12048__ol6406395014549"><li id="ALM-12048__li11357890145357"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12048__b11238448717">O&M</strong> > <strong id="ALM-12048__b42571744714">Alarm</strong> > <strong id="ALM-12048__b132639411718">Thresholds</strong> > <em id="ALM-12048__i11266204978">Name of the desired cluster</em> > <strong id="ALM-12048__b1126810415718">Host</strong> > <strong id="ALM-12048__b427064876">Network Writing</strong> > <strong id="ALM-12048__b527544074">Write Packet Error Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12048__b152781042714">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12048__ul1261987145357"><li id="ALM-12048__li57812933145357">If yes, go to <a href="#ALM-12048__li12888339145357">4</a>.</li><li id="ALM-12048__li52336003145357">If no, go to <a href="#ALM-12048__li15963175145357">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12048__li15963175145357"><a name="ALM-12048__li15963175145357"></a><a name="li15963175145357"></a><span>Choose <strong id="ALM-12048__b143281531670">O&M</strong> > <strong id="ALM-12048__b10334143112710">Alarm</strong> > <strong id="ALM-12048__b835115312714">Thresholds</strong> > <em id="ALM-12048__i9353231677">Name of the desired cluster</em> > <strong id="ALM-12048__b33577318710">Host</strong> > <strong id="ALM-12048__b5359531876">Network Writing</strong> > <strong id="ALM-12048__b11361143118716">Write Packet Error Rate</strong>. Click <strong id="ALM-12048__b236373115712">Modify</strong> in the <strong id="ALM-12048__b136519311171">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12048__p47573930145357">See <a href="#ALM-12048__fig53221363145357">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12048__fig53221363145357"><a name="ALM-12048__fig53221363145357"></a><a name="fig53221363145357"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12048__image71961316435" src="en-us_image_0000001390619040.png"></span></div>
|
||||
</p></li><li id="ALM-12048__li53127101145357"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12048__ul44566628145357"><li id="ALM-12048__li9450851145357">If yes, no further action is required.</li><li id="ALM-12048__li27321468145357">If no, go to <a href="#ALM-12048__li12888339145357">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12048__p65555334145357"><strong id="ALM-12048__b389343463617">Check whether the network connection is normal.</strong></p>
|
||||
<ol start="4" id="ALM-12048__ol7862559145428"><li id="ALM-12048__li12888339145357"><a name="ALM-12048__li12888339145357"></a><a name="li12888339145357"></a><span>Contact the network administrator to check whether the network is abnormal.</span><p><ul class="subitemlist" id="ALM-12048__ul31258199145357"><li id="ALM-12048__li8327923145357">If yes, rectify the fault and go to <a href="#ALM-12048__li60279330145357">5</a>.</li><li id="ALM-12048__li3473133145357">If no, go to <a href="#ALM-12048__li5643066145357">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12048__li60279330145357"><a name="ALM-12048__li60279330145357"></a><a name="li60279330145357"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12048__ul3229702145357"><li id="ALM-12048__li48886195145357">If yes, no further action is required.</li><li id="ALM-12048__li358855145357">If no, go to <a href="#ALM-12048__li5643066145357">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12048__p29067324145357"><strong id="ALM-12048__b10082732145437">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12048__ol65647935145434"><li id="ALM-12048__li5643066145357"><a name="ALM-12048__li5643066145357"></a><a name="li5643066145357"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12048__b4624406810">O&M</strong> > <strong id="ALM-12048__b1867213013818">Log</strong> > <strong id="ALM-12048__b1867914017813">Download</strong>.</span></li><li id="ALM-12048__li50787595145357"><span>Select <strong id="ALM-12048__b8263126183">OMS</strong> for <strong id="ALM-12048__b9277766818">Service</strong> and click <strong id="ALM-12048__b142791561189">OK</strong>.</span></li><li id="ALM-12048__li54435176145357"><span>Expand the <strong id="ALM-12048__b192997101388">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12048__li20154536145357"><span>Click <span><img id="ALM-12048__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12048__b154102151382">Start Date</strong> and <strong id="ALM-12048__b24171815687">End Date</strong> for log collection to 30 minutes before and 30 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-12048__b1742012153810">Download</strong>.</span></li><li id="ALM-12048__li21904738145357"><span>Contact <span id="ALM-12048__text1165617231785">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12048__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12048__section10937580"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12048__p58470402">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
97
docs/mrs/umn/ALM-12049.html
Normal file
@ -0,0 +1,97 @@
|
||||
<a name="ALM-12049"></a><a name="ALM-12049"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12049 Network Read Throughput Rate Exceeds the Threshold</h1>
|
||||
<div id="body31997351"><div class="section" id="ALM-12049__se9b9c6d800884a09a0f40e98b3430599"><h4 class="sectiontitle">Description</h4><p id="ALM-12049__en-us_topic_0070543623_p63310742">The system checks the network read throughput rate every 30 seconds and compares it with the threshold (80% by default). This alarm is generated when the network read throughput rate exceeds the threshold several consecutive times (5 by default).</p>
|
||||
<p id="ALM-12049__en-us_topic_0070543623_p32925773">To change the threshold, choose <strong id="ALM-12049__en-us_topic_0070543619_b28886228">O&M > Alarm</strong> > <strong id="ALM-12049__b16230357155011">Thresholds</strong> > <em id="ALM-12049__i55868155117">Name of the desired cluster</em> > <strong id="ALM-12049__en-us_topic_0070543619_b52985952">Host</strong> > <strong id="ALM-12049__en-us_topic_0070543623_b23107031">Network Reading</strong> > <strong id="ALM-12049__en-us_topic_0070543623_b59730208">Read Throughput Rate</strong>.</p>
|
||||
<p id="ALM-12049__p46310248111743">If <strong id="ALM-12049__b48421890111935">Trigger Count</strong> is 1, this alarm is cleared when the network read throughput rate is less than or equal to the threshold. If <strong id="ALM-12049__b57536198499">Trigger Count</strong> is greater than 1, this alarm is cleared when the network read throughput rate is less than or equal to 90% of the threshold.</p>
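<p>Because the threshold is expressed as a percentage, the read throughput rate is presumably measured relative to the bandwidth of the network port. This topic does not give the formula, so the following sketch is an assumption: it divides the bits received during one check interval by the link capacity. The function name <code>read_throughput_percent</code> and its parameters are hypothetical.</p>

```python
def read_throughput_percent(rx_bytes_before: int, rx_bytes_after: int,
                            interval_s: float, link_mbps: float) -> float:
    """Received traffic over one interval as a percentage of link capacity."""
    bits_per_second = (rx_bytes_after - rx_bytes_before) * 8 / interval_s
    return 100.0 * bits_per_second / (link_mbps * 1_000_000)


# Hypothetical figures: 3 GB received in 30 seconds on a 1000 Mbit/s port.
print(read_throughput_percent(0, 3_000_000_000, 30, 1000))
```

<p>Under this assumption, a port that sustains such a rate for several consecutive checks sits right at the default 80% threshold, which points to insufficient port bandwidth rather than a misconfigured threshold.</p>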
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__s26fa47946e3940d78985b345eb89af5d"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12049__en-us_topic_0070543623_table56778103" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12049__en-us_topic_0070543623_row53391874"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12049__en-us_topic_0070543623_p29774567">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12049__en-us_topic_0070543623_p62929749">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12049__en-us_topic_0070543623_p64144883">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12049__en-us_topic_0070543623_row28353001"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p14891744">12049</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p65380607">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12049__en-us_topic_0070543623_p61337839">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__s8138b6fe67704cdc9e175b0cea25960c"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12049__en-us_topic_0070543623_table2309080" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12049__en-us_topic_0070543623_row50768709"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12049__en-us_topic_0070543623_p18624790">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12049__en-us_topic_0070543623_p32213060">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12049__row59191019515"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12049__en-us_topic_0070543623_row59121078"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p24078037">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p4163976">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12049__en-us_topic_0070543623_row37475789"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p15640058">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p58885167">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12049__en-us_topic_0070543623_row60204461"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p44723138">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p65804403">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12049__en-us_topic_0070543623_row55368722"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p55681530">NetworkCardName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p13910066">Specifies the network port for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12049__en-us_topic_0070543623_row58081731"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12049__en-us_topic_0070543623_p6999799">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12049__en-us_topic_0070543623_p30112847">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__s2833644255e043af93275a32d5a5bae9"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12049__en-us_topic_0070543623_p23221556">The service system runs improperly or is unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__sa03918d9e6754c80bef107ff31e23284"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12049__en-us_topic_0070543623_ul1897869"><li id="ALM-12049__en-us_topic_0070543623_li17080825">The alarm threshold is set improperly.</li><li id="ALM-12049__en-us_topic_0070543623_li19509699">The network port rate cannot meet the current service requirements.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__s0f9f5ec0a021434b9928f5bf4c940044"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12049__en-us_topic_0070543623_p36781757"><strong id="ALM-12049__b4092164015127">Check whether the threshold is set properly.</strong></p>
|
||||
<ol id="ALM-12049__ol6452245415148"><li id="ALM-12049__li4670351415131"><span>On FusionInsight Manager, choose <strong id="ALM-12049__b15915337194818">O&M > Alarm</strong> > <strong id="ALM-12049__b191503711486">Thresholds</strong> > <em id="ALM-12049__i189151337174819">Name of the desired cluster</em> > <strong id="ALM-12049__b16915237154813">Host</strong> > <strong id="ALM-12049__b1691573754820">Network Reading</strong> > <strong id="ALM-12049__b10915143734811">Read Throughput Rate</strong> and check whether the alarm threshold is set properly. (The default value, 80%, is generally appropriate, but you can change it as required.)</span><p><ul class="subitemlist" id="ALM-12049__ul5738506215131"><li id="ALM-12049__li2521002015131">If yes, go to <a href="#ALM-12049__li3065917315131">4</a>.</li><li id="ALM-12049__li2874573915131">If no, go to <a href="#ALM-12049__li5611086815131">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12049__li5611086815131"><a name="ALM-12049__li5611086815131"></a><a name="li5611086815131"></a><span>Based on the actual usage, choose <strong id="ALM-12049__b07081191469">O&M > Alarm</strong> > <strong id="ALM-12049__b20106143125110">Thresholds</strong> > <em id="ALM-12049__i11541848175114">Name of the desired cluster</em> > <strong id="ALM-12049__b47111192469">Host</strong> > <strong id="ALM-12049__b2418814015131">Network Reading</strong> > <strong id="ALM-12049__b1308228415131">Read Throughput Rate</strong> and click <strong id="ALM-12049__b84051320104416">Modify</strong> in the <strong id="ALM-12049__b18538823144410">Operation</strong> column to modify the alarm threshold.</span><p><p class="litext" id="ALM-12049__p5303205615131">For details, see <a href="#ALM-12049__fig566375315131">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12049__fig566375315131"><a name="ALM-12049__fig566375315131"></a><a name="fig566375315131"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12049__image1615410501365" src="en-us_image_0000001440858201.png"></span></div>
|
||||
</p></li><li id="ALM-12049__li6085933615131"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12049__ul5129012315131"><li id="ALM-12049__li3523576915131">If yes, no further action is required.</li><li id="ALM-12049__li3552506415131">If no, go to <a href="#ALM-12049__li3065917315131">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12049__p5895793115131"><strong id="ALM-12049__b5562659915153">Check whether the network port rate can meet the service requirements.</strong></p>
|
||||
<ol start="4" id="ALM-12049__ol665573431527"><li id="ALM-12049__li3065917315131"><a name="ALM-12049__li3065917315131"></a><a name="li3065917315131"></a><span>On FusionInsight Manager, click <span><img id="ALM-12049__image168221113135319" src="en-us_image_0269383872.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12049__li36506615131"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12049__b749710315131">root</strong>. <span id="ALM-12049__text43649449460"></span></span></li><li id="ALM-12049__li1487667815131"><span>Run the <strong id="ALM-12049__b328560015131">ethtool </strong><em id="ALM-12049__i2957040015131">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12049__note4639220615131"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12049__p6480701315131">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
|
||||
</div></div>
|
||||
</p></li><li id="ALM-12049__li6678124515131"><span>If the network read throughput rate exceeds the threshold, contact the system administrator to increase the network port rate.</span></li><li id="ALM-12049__li3780745315131"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12049__ul6509010915131"><li id="ALM-12049__li6416030115131">If yes, no further action is required.</li><li id="ALM-12049__li2960185515131">If no, go to <a href="#ALM-12049__li4699944215131">9</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12049__p4894007015131"><strong id="ALM-12049__b3536737115217">Collect fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12049__ol2826703815214"><li id="ALM-12049__li4699944215131"><a name="ALM-12049__li4699944215131"></a><a name="li4699944215131"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12049__b39977366113627">O&M</strong> > <strong id="ALM-12049__b24251979113627">Log > Download</strong>.</span></li><li id="ALM-12049__li6522206415131"><span>Select <strong id="ALM-12049__b1352831932712">OMS</strong> from <strong id="ALM-12049__b4885847115131">Service</strong> and click <strong id="ALM-12049__b3991118545">OK</strong>.</span></li><li id="ALM-12049__li4849583015131"><span>Set <strong id="ALM-12049__b5012766815131">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12049__li1145664103113"><span>Click <span><img id="ALM-12049__image1945644173117" src="en-us_image_0269383873.png"></span> in the upper right corner, and set <strong id="ALM-12049__b6456941173117">Start Date</strong> and <strong id="ALM-12049__b11456154113318">End Date</strong> for log collection to 30 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12049__b13456164113319">Download</strong>.</span></li><li id="ALM-12049__li495644512588"><span>Contact the <span id="ALM-12049__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
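<p>The port speed check in the procedure above can also be scripted. The following is a minimal sketch, assuming the usual <strong>ethtool</strong> output format in which the link speed appears on a line such as <code>Speed: 10000Mb/s</code>. The function name <code>parse_ethtool_speed</code> is hypothetical and is not part of FusionInsight Manager.</p>

```python
import re
from typing import Optional


def parse_ethtool_speed(output: str) -> Optional[int]:
    """Return the link speed in Mb/s found in `ethtool <port>` output.

    Returns None when no numeric speed is reported, which is what the note
    about VM environments above leads one to expect (for example, a line
    such as "Speed: Unknown!").
    """
    match = re.search(r"^\s*Speed:\s*(\d+)\s*Mb/s", output, re.MULTILINE)
    return int(match.group(1)) if match else None


# Hypothetical sample output, shortened for illustration.
sample = "Settings for eth0:\n\tSpeed: 10000Mb/s\n\tDuplex: Full\n"
print(parse_ethtool_speed(sample))
```

<p>A return value of <code>None</code> is a cue to confirm the port rate with the system administrator instead of relying on the command output.</p>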
|
||||
<div class="section" id="ALM-12049__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12049__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12049__sb05d12c7f86745d595325a3df0353b1f"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12049__en-us_topic_0070543623_p10777594">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
97
docs/mrs/umn/ALM-12050.html
Normal file
@ -0,0 +1,97 @@
|
||||
<a name="ALM-12050"></a><a name="ALM-12050"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12050 Network Write Throughput Rate Exceeds the Threshold</h1>
|
||||
<div id="body28237008"><div class="section" id="ALM-12050__s4e0ce6b64884459bb9d9e6a5a48a93b4"><h4 class="sectiontitle">Description</h4><p id="ALM-12050__en-us_topic_0070543624_p36743086">The system checks the network write throughput rate every 30 seconds and compares the actual throughput rate with the threshold (the default threshold is 80%). This alarm is generated when the system detects that the network write throughput rate exceeds the threshold several consecutive times (5 by default).</p>
|
||||
<p id="ALM-12050__en-us_topic_0070543624_p62252322">To change the threshold, choose <strong id="ALM-12050__en-us_topic_0070543619_b28886228">O&M > Alarm</strong> > <strong id="ALM-12050__b16230357155011">Thresholds</strong> > <em id="ALM-12050__i55868155117">Name of the desired cluster</em> > <strong id="ALM-12050__en-us_topic_0070543619_b52985952">Host</strong> > <strong id="ALM-12050__en-us_topic_0070543624_b49366740">Network Writing</strong> > <strong id="ALM-12050__en-us_topic_0070543624_b39283002">Write Throughput Rate</strong>.</p>
|
||||
<p id="ALM-12050__p8312662111839">When the <strong id="ALM-12050__b48421890111935">Trigger Count</strong> is 1, this alarm is cleared when the network write throughput rate is less than or equal to the threshold. When the <strong id="ALM-12050__b4893144644919">Trigger Count</strong> is greater than 1, this alarm is cleared when the network write throughput rate is less than or equal to 90% of the threshold.</p>
|
||||
</div>
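<p>The trigger and clear rules described above amount to a counter with hysteresis. The following is an illustrative sketch only, not the FusionInsight Manager implementation. The class name <code>ThresholdAlarm</code> and its fields are hypothetical, and each call to <code>observe</code> stands for one 30-second check.</p>

```python
from dataclasses import dataclass


@dataclass
class ThresholdAlarm:
    """Hypothetical model of the trigger/clear rule described above."""

    threshold: float = 80.0  # percent, the documented default
    trigger_count: int = 5   # consecutive breaches, the documented default
    active: bool = False
    _breaches: int = 0       # current run of consecutive breaches

    def observe(self, value: float) -> bool:
        """Feed one sample and return whether the alarm is active."""
        if value > self.threshold:
            self._breaches += 1
        else:
            self._breaches = 0

        if not self.active:
            # Raise the alarm after enough consecutive breaches.
            if self._breaches >= self.trigger_count:
                self.active = True
        else:
            # Trigger Count 1: clear at or below the threshold.
            # Trigger Count > 1: clear at or below 90% of the threshold.
            if self.trigger_count == 1:
                clear_at = self.threshold
            else:
                clear_at = 0.9 * self.threshold
            if value <= clear_at:
                self.active = False
        return self.active
```

<p>With the defaults, five consecutive samples above 80% raise the alarm, and it stays raised until a sample falls to 72% or lower.</p>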
|
||||
<div class="section" id="ALM-12050__s321d438edad94eb5b71f07ef1875200b"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12050__en-us_topic_0070543624_table48932486" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12050__en-us_topic_0070543624_row44783995"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12050__en-us_topic_0070543624_p3625001">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12050__en-us_topic_0070543624_p25189639">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12050__en-us_topic_0070543624_p27094845">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12050__en-us_topic_0070543624_row47198836"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p65009363">12050</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p31267075">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12050__en-us_topic_0070543624_p49605164">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12050__s14ad62dd58b44b258ec1b3102859a75e"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12050__en-us_topic_0070543624_table58595317" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12050__en-us_topic_0070543624_row63173322"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12050__en-us_topic_0070543624_p16765496">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12050__en-us_topic_0070543624_p15827946">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12050__row13771148154919"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12050__en-us_topic_0070543624_row6995273"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p29746236">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p60634924">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12050__en-us_topic_0070543624_row8843406"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p45227267">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p39530025">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12050__en-us_topic_0070543624_row20225910"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p27685991">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p27972830">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12050__en-us_topic_0070543624_row50428878"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p58207302">NetworkCardName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p17170998">Specifies the network port for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12050__en-us_topic_0070543624_row20321259"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12050__en-us_topic_0070543624_p35409283">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12050__en-us_topic_0070543624_p49579688">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12050__s2a5aa4c54d5043f78f3531f7f1779cbe"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12050__en-us_topic_0070543624_p56531777">The service system runs improperly or is unavailable.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12050__sf1f91024377049c1863cc8ab14993c9d"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12050__en-us_topic_0070543624_ul15671223"><li id="ALM-12050__en-us_topic_0070543624_li6823280">The alarm threshold is set improperly.</li><li id="ALM-12050__en-us_topic_0070543624_li61409521">The network port rate cannot meet the current service requirements.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12050__s288b004e523b4795aa832a7ef214236d"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12050__en-us_topic_0070543624_p8115287"><strong id="ALM-12050__b4779452715650">Check whether the threshold is set properly.</strong></p>
|
||||
<ol id="ALM-12050__ol626009901578"><li id="ALM-12050__li3381340415653"><span>On FusionInsight Manager, choose <strong id="ALM-12050__b034142294917">O&M > Alarm</strong> > <strong id="ALM-12050__b1334142294919">Thresholds</strong> > <em id="ALM-12050__i63492294910">Name of the desired cluster</em> > <strong id="ALM-12050__b134152244914">Host</strong> > <strong id="ALM-12050__b1349229499">Network Writing</strong> > <strong id="ALM-12050__b1434722204913">Write Throughput Rate</strong> and check whether the alarm threshold is set properly. (The default value, 80%, is generally appropriate, but you can change it as required.)</span><p><ul class="subitemlist" id="ALM-12050__ul1867012515653"><li id="ALM-12050__li683775815653">If yes, go to <a href="#ALM-12050__li3034361015653">4</a>.</li><li id="ALM-12050__li1698753915653">If no, go to <a href="#ALM-12050__li2386220215653">2</a>.</li></ul>
|
||||
</p></li><li id="ALM-12050__li2386220215653"><a name="ALM-12050__li2386220215653"></a><a name="li2386220215653"></a><span>Based on the actual usage, choose <strong id="ALM-12050__b972065414613">O&M > Alarm</strong> > <strong id="ALM-12050__b17713333531">Thresholds</strong> > <em id="ALM-12050__i277113336535">Name of the desired cluster</em> > <strong id="ALM-12050__b12724175415463">Host</strong> > <strong id="ALM-12050__b2479886215653">Network Writing</strong> > <strong id="ALM-12050__b6255081015653">Write Throughput Rate</strong> and click <strong id="ALM-12050__b84051320104416">Modify</strong> in the <strong id="ALM-12050__b18538823144410">Operation</strong> column to modify the alarm threshold.</span><p><p class="litext" id="ALM-12050__p3345084615653">For details, see <a href="#ALM-12050__fig2514972915653">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12050__fig2514972915653"><a name="ALM-12050__fig2514972915653"></a><a name="fig2514972915653"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12050__image1615410501365" src="en-us_image_0000001440978021.png"></span></div>
|
||||
</p></li><li id="ALM-12050__li5919843115653"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12050__ul6204017715653"><li id="ALM-12050__li1343323115653">If yes, no further action is required.</li><li id="ALM-12050__li1434989315653">If no, go to <a href="#ALM-12050__li3034361015653">4</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12050__p2149068415653"><strong id="ALM-12050__b828937615716">Check whether the network port rate can meet the service requirements.</strong></p>
|
||||
<ol start="4" id="ALM-12050__ol3843532615729"><li id="ALM-12050__li3034361015653"><a name="ALM-12050__li3034361015653"></a><a name="li3034361015653"></a><span>On FusionInsight Manager, click <span><img id="ALM-12050__image168221113135319" src="en-us_image_0269383875.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12050__li4191332115653"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12050__b465703515653">root</strong>. <span id="ALM-12050__text43649449460"></span></span></li><li id="ALM-12050__li3191668615653"><span>Run the <strong id="ALM-12050__b4167557215653">ethtool </strong><em id="ALM-12050__i3953582915653">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12050__note4828554115653"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12050__p2027814115653">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
|
||||
</div></div>
|
||||
</p></li><li id="ALM-12050__li1881472115653"><span>If the network write throughput rate exceeds the threshold, contact the system administrator to increase the network port rate.</span></li><li id="ALM-12050__li2938411415653"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12050__ul3018892815653"><li id="ALM-12050__li3511476815653">If yes, no further action is required.</li><li id="ALM-12050__li2572394615653">If no, go to <a href="#ALM-12050__li1329206015653">9</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12050__p326490115653"><strong id="ALM-12050__b6410918015740">Collect fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12050__ol1685979415736"><li id="ALM-12050__li1329206015653"><a name="ALM-12050__li1329206015653"></a><a name="li1329206015653"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12050__b39977366113627">O&M</strong> > <strong id="ALM-12050__b24251979113627">Log > Download</strong>.</span></li><li id="ALM-12050__li3479759015653"><span>Select <strong id="ALM-12050__b1352831932712">OMS</strong> from <strong id="ALM-12050__b291511315653">Service</strong> and click <strong id="ALM-12050__b3991118545">OK</strong>.</span></li><li id="ALM-12050__li3252315653"><span>Set <strong id="ALM-12050__b4474285615653">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12050__li1145664103113"><span>Click <span><img id="ALM-12050__image1945644173117" src="en-us_image_0269383876.png"></span> in the upper right corner, and set <strong id="ALM-12050__b6456941173117">Start Date</strong> and <strong id="ALM-12050__b11456154113318">End Date</strong> for log collection to 30 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12050__b13456164113319">Download</strong>.</span></li><li id="ALM-12050__li495644512588"><span>Contact the <span id="ALM-12050__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
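<p>To judge whether the port rate meets the service requirements, it can help to estimate the write throughput rate by hand. This document does not state how the system computes the rate. The sketch below assumes that it is the transmitted bit rate expressed as a percentage of the port's maximum speed, and that byte counters are sampled twice, for example from the Linux counter file <code>/sys/class/net/&lt;port&gt;/statistics/tx_bytes</code>. The function name is hypothetical.</p>

```python
def throughput_rate_percent(
    bytes_before: int,
    bytes_after: int,
    interval_s: float,
    link_speed_mbps: int,
) -> float:
    """Estimate port utilization as a percentage of the link speed.

    bytes_before / bytes_after: two samples of a cumulative byte counter.
    interval_s: seconds between the two samples.
    link_speed_mbps: port speed in Mb/s, as reported by ethtool.
    """
    bits_per_second = (bytes_after - bytes_before) * 8 / interval_s
    return 100.0 * bits_per_second / (link_speed_mbps * 1_000_000)


# 3 GB written in 30 seconds on a 1000 Mb/s port.
print(throughput_rate_percent(0, 3_000_000_000, 30, 1000))
```

<p>If the estimate stays near or above the configured threshold under normal load, the port rate is probably too low for the workload. Adjusting the threshold alone would then only hide the symptom.</p>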
|
||||
<div class="section" id="ALM-12050__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12050__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12050__s11075209b34046419a709e10361cf809"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12050__en-us_topic_0070543624_p65824820">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
104
docs/mrs/umn/ALM-12051.html
Normal file
File diff suppressed because it is too large
Load Diff
101
docs/mrs/umn/ALM-12052.html
Normal file
File diff suppressed because it is too large
Load Diff
95
docs/mrs/umn/ALM-12053.html
Normal file
@ -0,0 +1,95 @@
|
||||
<a name="ALM-12053"></a><a name="ALM-12053"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12053 Host File Handle Usage Exceeds the Threshold</h1>
|
||||
<div id="body56195330"><div class="section" id="ALM-12053__s588d6593be7b4b38bd936d6f0eb49578"><h4 class="sectiontitle">Description</h4><p id="ALM-12053__en-us_topic_0070543628_p20719228">The system checks the file handle usage every 30 seconds and compares the actual usage with the threshold (the default threshold is 80%). This alarm is generated when the host file handle usage exceeds the threshold several consecutive times (5 by default).</p>
|
||||
<p id="ALM-12053__en-us_topic_0070543628_p52255327">To change the threshold, choose <strong id="ALM-12053__en-us_topic_0070543619_b28886228">O&M > Alarm</strong> > <strong id="ALM-12053__b15474819115615">Thresholds</strong> > <em id="ALM-12053__i13216192517564">Name of the desired cluster</em> > <strong id="ALM-12053__en-us_topic_0070543619_b52985952">Host</strong> > <strong id="ALM-12053__en-us_topic_0070543628_b26402563">Host Status</strong> > <strong id="ALM-12053__en-us_topic_0070543628_b58232852">Host File Handle Usage</strong>.</p>
|
||||
<p id="ALM-12053__p29570505105948">When the <strong id="ALM-12053__b48421890111935">Trigger Count</strong> is 1, this alarm is cleared when the host file handle usage is less than or equal to the threshold. When the <strong id="ALM-12053__b158664714507">Trigger Count</strong> is greater than 1, this alarm is cleared when the host file handle usage is less than or equal to 90% of the threshold.</p>
|
||||
</div>
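<p>The file handle usage can be checked by hand from <code>/proc/sys/fs/file-nr</code>. The sketch below assumes that usage is the first value of that file (used handles) divided by the third value (maximum), expressed as a percentage. The document does not confirm that the system uses exactly this formula, and the function name is hypothetical.</p>

```python
def file_handle_usage(file_nr_text: str) -> float:
    """Return file handle usage in percent from the text of file-nr.

    The file holds three whitespace-separated integers: used handles,
    allocated-but-unused handles, and the system-wide maximum.
    """
    used, _unused, maximum = (int(field) for field in file_nr_text.split())
    return 100.0 * used / maximum


# Hypothetical file content, for illustration only.
print(file_handle_usage("12800\t0\t640000\n"))
```

<p>On a live host, the input would be the output of <strong>cat /proc/sys/fs/file-nr</strong>. The result can then be compared with the configured threshold.</p>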
|
||||
<div class="section" id="ALM-12053__s62666dbd819b448db3e666064e5ea1d8"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12053__en-us_topic_0070543628_table38947106" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12053__en-us_topic_0070543628_row38104721"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12053__en-us_topic_0070543628_p66583537">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12053__en-us_topic_0070543628_p24557426">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12053__en-us_topic_0070543628_p42994491">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12053__en-us_topic_0070543628_row60001781"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12053__en-us_topic_0070543628_p28306120">12053</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12053__en-us_topic_0070543628_p11094363">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12053__en-us_topic_0070543628_p26228182">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__s88cffe211633496b86ac1bd0da586ac2"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12053__en-us_topic_0070543628_table44108002" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12053__en-us_topic_0070543628_row38425129"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12053__en-us_topic_0070543628_p25427778">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12053__en-us_topic_0070543628_p46384143">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12053__row159784544819"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12053__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12053__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12053__en-us_topic_0070543628_row66128121"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12053__en-us_topic_0070543628_p54777575">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12053__en-us_topic_0070543628_p7798594">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12053__en-us_topic_0070543628_row3078483"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12053__en-us_topic_0070543628_p48030588">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12053__en-us_topic_0070543628_p65272389">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12053__en-us_topic_0070543628_row50580589"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12053__en-us_topic_0070543628_p3387017">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12053__en-us_topic_0070543628_p5912986">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12053__en-us_topic_0070543628_row53216882"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12053__en-us_topic_0070543628_p15600205">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12053__en-us_topic_0070543628_p55657105">Specifies the threshold triggering the alarm. If the current indicator value exceeds this threshold, the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__s482c3188f93a4caeabe5b3c86faf3dce"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12053__en-us_topic_0070543628_p11931620">I/O operations, such as opening a file or connecting to a network, cannot be performed, and programs run abnormally.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__en-us_topic_0070543628_section373139"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12053__en-us_topic_0070543628_ul26937201"><li id="ALM-12053__li184022012102816">The application process is abnormal. For example, the opened file or socket is not closed.</li><li id="ALM-12053__en-us_topic_0070543628_li41108220">The number of file handles cannot meet the current service requirements.</li><li id="ALM-12053__en-us_topic_0070543628_li34429662">The system is abnormal.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__se041063f671f4371a7e0bb7c4da04f29"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12053__p9858548184015"><strong id="ALM-12053__b11685182963818">Check information about files opened in processes.</strong></p>
|
||||
<ol id="ALM-12053__ol2107954134014"><li id="ALM-12053__li142191911124120"><span>On FusionInsight Manager, click <span><img id="ALM-12053__image1219131174117" src="en-us_image_0269383882.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li184472141416"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b294641818419">root</strong>. <span id="ALM-12053__text18701027134116"></span></span></li><li id="ALM-12053__li1762124184114"><span>Run the <strong id="ALM-12053__b06214124117">lsof -n|awk '{print $2}'|sort|uniq -c|sort -nr|more</strong> command to identify the processes that hold the most file handles.</span></li><li id="ALM-12053__li264144244316"><span>Check whether any process that has a large number of files open is abnormal, for example, whether it has files or sockets that were never closed.</span><p><ul id="ALM-12053__ul192411041445"><li id="ALM-12053__li10241144134412">If yes, go to <a href="#ALM-12053__li698311306446">5</a>.</li><li id="ALM-12053__li125435134444">If no, go to <a href="#ALM-12053__li50842733151924">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12053__li698311306446"><a name="ALM-12053__li698311306446"></a><a name="li698311306446"></a><span>Release the abnormal processes that occupy too many file handles.</span></li><li id="ALM-12053__li137485054416"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul19374750194414"><li id="ALM-12053__li33741750154420">If yes, no further action is required.</li><li id="ALM-12053__li537418505442">If no, go to <a href="#ALM-12053__li50842733151924">7</a>.</li></ul>
|
||||
</p></li></ol>
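<p>The per-process count in the steps above can be sketched as a small shell function. This is a minimal illustration, not part of the product: the function name <strong>count_handles_by_pid</strong> is hypothetical, and it assumes <strong>lsof</strong>-style input whose first line is a header and whose second column is the PID.</p>

```shell
# Hypothetical helper: reads lsof-style output on standard input and prints
# "<open entries> <PID>" pairs, busiest process first.
count_handles_by_pid() {
    # Skip the header line, keep the PID column, then count and rank the PIDs.
    awk 'NR > 1 {print $2}' | sort | uniq -c | sort -nr
}

# Example with a small inline sample instead of live lsof output.
printf 'COMMAND PID USER\njava 100 omm\njava 100 omm\nsshd 200 root\n' | count_handles_by_pid

# On a live host (as root), the same ranking would be:
# lsof -n | count_handles_by_pid | head -n 10
```

<p>Because <strong>lsof</strong> also lists entries such as the working directory and memory-mapped files, the counts are best read as a ranking rather than as exact handle totals.</p>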
|
||||
<p class="tableheading" id="ALM-12053__en-us_topic_0070543628_p37339219"><strong id="ALM-12053__b50291933151922">Increase the number of file handles.</strong></p>
|
||||
<ol start="7" id="ALM-12053__ol66890550151936"><li id="ALM-12053__li50842733151924"><a name="ALM-12053__li50842733151924"></a><a name="li50842733151924"></a><span>On FusionInsight Manager, click <span><img id="ALM-12053__image168221113135319" src="en-us_image_0269383883.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li24620726151924"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b54931419151924">root</strong>.</span></li><li id="ALM-12053__li103121715194518"><a name="ALM-12053__li103121715194518"></a><a name="li103121715194518"></a><span>Contact the system administrator to increase the number of system file handles.</span></li><li id="ALM-12053__li37165512528"><span>Run the <strong id="ALM-12053__b1690117451482">cat /proc/sys/fs/file-nr</strong> command to view the number of used handles and the maximum number of file handles. The first value is the number of used handles, and the third value is the maximum number. Check whether the usage exceeds the threshold.</span><p><ul class="subitemlist" id="ALM-12053__ul198522013534"><li class="subitemlist" id="ALM-12053__li816519713539">If yes, go to <a href="#ALM-12053__li103121715194518">9</a>.</li><li id="ALM-12053__li885215017534">If no, go to <a href="#ALM-12053__li133010151924">11</a>.<pre class="screen" id="ALM-12053__screen3672717115216"># cat /proc/sys/fs/file-nr
|
||||
12704 0 640000</pre>
|
||||
</li></ul>
|
||||
</p></li><li id="ALM-12053__li133010151924"><a name="ALM-12053__li133010151924"></a><a name="li133010151924"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul18228740151924"><li id="ALM-12053__li5548368151924">If yes, no further action is required.</li><li id="ALM-12053__li46764658151924">If no, go to <a href="#ALM-12053__li21666806151924">12</a>.</li></ul>
|
||||
</p></li></ol>
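<p>The usage check on <strong>/proc/sys/fs/file-nr</strong> can be sketched as follows. This is a minimal illustration under stated assumptions: the function name <strong>handle_usage_pct</strong> is hypothetical, the result is truncated to a whole percentage, and the value still has to be compared manually with the threshold configured for this alarm.</p>

```shell
# Hypothetical helper: prints the system-wide file handle usage as a whole
# percentage. Input: one line in the format of /proc/sys/fs/file-nr
# (used handles, unused handles, maximum number of handles).
handle_usage_pct() {
    # Split the line into positional parameters: $1=used, $2=unused, $3=maximum.
    set -- $1
    echo $(( $1 * 100 / $3 ))
}

# Example with the sample output shown in the procedure.
handle_usage_pct "12704 0 640000"

# On a live host, read the real values instead:
# handle_usage_pct "$(cat /proc/sys/fs/file-nr)"
```

<p>For the sample output <strong>12704 0 640000</strong>, the function prints <strong>1</strong>, that is, roughly 1% of the available handles are in use.</p>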
|
||||
<p class="tableheading" id="ALM-12053__p29840940151924"><strong id="ALM-12053__b63945205151942">Check whether the system environment is abnormal.</strong></p>
|
||||
<ol start="12" id="ALM-12053__ol62901314151953"><li id="ALM-12053__li21666806151924"><a name="ALM-12053__li21666806151924"></a><a name="li21666806151924"></a><span>Contact the system administrator to check whether the operating system is abnormal.</span><p><ul class="subitemlist" id="ALM-12053__ul2407422151924"><li id="ALM-12053__li10773860151924">If yes, go to <a href="#ALM-12053__li23370043151924">13</a> to rectify the fault.</li><li id="ALM-12053__li267491151924">If no, go to <a href="#ALM-12053__li58218801151924">14</a>.</li></ul>
|
||||
</p></li><li id="ALM-12053__li23370043151924"><a name="ALM-12053__li23370043151924"></a><a name="li23370043151924"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul19344122151924"><li id="ALM-12053__li60783531151924">If yes, no further action is required.</li><li id="ALM-12053__li24518968151924">If no, go to <a href="#ALM-12053__li58218801151924">14</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p class="tableheading" id="ALM-12053__p39879373151924"><strong id="ALM-12053__b60486860151959">Collect fault information.</strong></p>
|
||||
<ol start="14" id="ALM-12053__ol4489551315202"><li id="ALM-12053__li58218801151924"><a name="ALM-12053__li58218801151924"></a><a name="li58218801151924"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12053__b39977366113627">O&M</strong> > <strong id="ALM-12053__b24251979113627">Log > Download</strong>.</span></li><li id="ALM-12053__li57014808151924"><span>In the <strong id="ALM-12053__b18102480151924">Service</strong> area, select <strong id="ALM-12053__b1352831932712">OMS</strong> and click <strong id="ALM-12053__b3991118545">OK</strong>.</span></li><li id="ALM-12053__li54796720151924"><span>Set <strong id="ALM-12053__b43371226151924">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12053__li1145664103113"><span>Click <span><img id="ALM-12053__image1945644173117" src="en-us_image_0269383884.png"></span> in the upper right corner, and set <strong id="ALM-12053__b6456941173117">Start Date</strong> and <strong id="ALM-12053__b11456154113318">End Date</strong> for log collection to 30 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12053__b13456164113319">Download</strong>.</span></li><li id="ALM-12053__li495644512588"><span>Contact the <span id="ALM-12053__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
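<p>The log collection window used in the steps above is simple arithmetic on the alarm generation time. The following sketch assumes the alarm time is available as epoch seconds; the function name <strong>log_window</strong> and the sample timestamp are hypothetical.</p>

```shell
# Hypothetical helper: prints the start and end of a log collection window
# as epoch seconds, given the alarm time (epoch seconds) and a margin in minutes.
log_window() {
    alarm_epoch=$1
    margin_min=$2
    echo "$(( alarm_epoch - margin_min * 60 )) $(( alarm_epoch + margin_min * 60 ))"
}

# Example: a 30-minute margin around a sample alarm time.
log_window 1700000000 30

# With GNU date (an assumption about the host), an epoch value can be made readable:
# date -d @1699998200
```

<p>The two printed values correspond to <strong>Start Date</strong> and <strong>End Date</strong> in the download dialog box.</p>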
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12053__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12053__s493b241850bc4762b8217e2687a42795"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12053__en-us_topic_0070543628_p26286878">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
109
docs/mrs/umn/ALM-12054.html
Normal file
File diff suppressed because it is too large
109
docs/mrs/umn/ALM-12055.html
Normal file
File diff suppressed because it is too large
77
docs/mrs/umn/ALM-12057.html
Normal file
@ -0,0 +1,77 @@
|
||||
<a name="ALM-12057"></a><a name="ALM-12057"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12057 Metadata Not Configured with the Task to Periodically Back Up Data to a Third-Party Server</h1>
|
||||
<div id="body1522740021438"><div class="section" id="ALM-12057__section242494205216"><h4 class="sectiontitle">Description</h4><p id="ALM-12057__p58603131142152">After the system is installed, it checks whether the task for periodically backing up metadata to a third-party server is configured, and then repeats the check hourly. If the task is not configured, a critical alarm is generated.</p>
|
||||
<p id="ALM-12057__p168212378415">This alarm is cleared when a user creates such a backup task.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section5275193475511"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12057__table41587473" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12057__row58005028"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12057__en-us_topic_0070543632_p6470829">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12057__en-us_topic_0070543632_p54375137">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12057__en-us_topic_0070543632_p42310006">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12057__row51334123"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12057__p64423324">12057</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12057__en-us_topic_0070543632_p44770184">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12057__en-us_topic_0070543632_p2506287">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section51641626125613"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12057__table65772283" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12057__row63387408"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12057__en-us_topic_0070543632_p6953712">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12057__en-us_topic_0070543632_p26379772">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12057__row586792910417"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12057__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12057__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12057__row31175049"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12057__p42151080">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12057__en-us_topic_0070543632_p49828093">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12057__row59387122"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12057__p45627593">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12057__en-us_topic_0070543632_p45185452">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12057__row43628233"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12057__p44225986">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12057__en-us_topic_0070543632_p41561572">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section174127432566"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12057__p136871128012">If metadata is not backed up to a third-party server, metadata cannot be restored if both the active and standby management nodes of the cluster are faulty and local backup data is lost.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section42966593568"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12057__p240915442254">Metadata is not configured with the task to periodically back up data to a third-party server.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section1525571619574"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12057__ol449617567348"><li id="ALM-12057__li1611744911013"><span>On the FusionInsight Manager portal, choose <strong id="ALM-12057__b188358153113">O&M > Alarm > Alarms</strong>.</span></li><li id="ALM-12057__li169585911117"><span>In the alarm list, click <span><img id="ALM-12057__image168221113135319" src="en-us_image_0269383889.png"></span> in the row where the alarm is located and identify the data module for which the alarm is generated based on <strong id="ALM-12057__b4668102723111">Additional Information</strong>.</span></li><li id="ALM-12057__li11496856143419"><span>Choose <strong id="ALM-12057__b721210326">O&M</strong> > <strong id="ALM-12057__b1488442514323">Backup and Restoration > Backup Management</strong> > <strong id="ALM-12057__b55459305323">Create</strong>.</span></li><li id="ALM-12057__li144225714510"><span>Configure a backup task. The data to be backed up must be consistent with the data in Additional Information of the alarm.</span></li><li id="ALM-12057__li1133644161218"><span>After the backup task is created successfully, wait for two minutes and check whether the alarm is cleared.</span><p><ul id="ALM-12057__ul643195154411"><li id="ALM-12057__li5431451134410">If yes, no further action is required.</li><li id="ALM-12057__li1843551124416">If no, go to <a href="#ALM-12057__li1185962516113">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12057__p1284212519115"><strong id="ALM-12057__b1432912914719">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12057__ol8860142514111"><li id="ALM-12057__li1185962516113"><a name="ALM-12057__li1185962516113"></a><a name="li1185962516113"></a><span>On FusionInsight Manager, choose <strong id="ALM-12057__b2068611561668">O&M</strong> > <strong id="ALM-12057__b19686105610610">Log > Download</strong>.</span></li><li id="ALM-12057__li13859112516110"><span>In the <strong id="ALM-12057__b8859172516114">Service</strong> area, select <strong id="ALM-12057__b285913251016">Controller</strong> and click <strong id="ALM-12057__b3991118545">OK</strong>.</span></li><li id="ALM-12057__li4859182515115"><span>Click <span><img id="ALM-12057__image185919251512" src="en-us_image_0269383890.png"></span> in the upper right corner, and set <strong id="ALM-12057__b198594252011">Start Date</strong> and <strong id="ALM-12057__b58593251114">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12057__b11859025919">Download</strong>.</span></li><li id="ALM-12057__li495644512588"><span>Contact the <span id="ALM-12057__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12057__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12057__section8679102916579"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12057__p115781141349">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
95
docs/mrs/umn/ALM-12061.html
Normal file
@ -0,0 +1,95 @@
|
||||
<a name="ALM-12061"></a><a name="ALM-12061"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12061 Process Usage Exceeds the Threshold</h1>
|
||||
<div id="body1546852608752"><div class="section" id="ALM-12061__section45251551191910"><h4 class="sectiontitle">Description</h4><p id="ALM-12061__p8690451111916">The system checks the process usage of user omm every 30 seconds. Users can run the <strong id="ALM-12061__b8690125131915">ps -o nlwp,pid,args -u omm | awk '{sum+=$1} END {print "", sum}'</strong> command to obtain the number of concurrent processes of user <strong id="ALM-12061__b1969014511198">omm</strong>. Run the <strong id="ALM-12061__b166906516196">ulimit -u</strong> command to obtain the maximum number of processes that can be simultaneously opened by user <strong id="ALM-12061__b19690175115197">omm</strong>. Divide the number of concurrent processes by the maximum number to obtain the process usage of user <strong id="ALM-12061__b196901551201915">omm</strong>. The process usage has a default threshold. This alarm is generated when the process usage exceeds the threshold.</p>
|
||||
<p id="ALM-12061__p96908512194">If <strong id="ALM-12061__b2690155141916">Trigger Count</strong> is <strong id="ALM-12061__b069095113193">1</strong> and the process usage is less than or equal to the threshold, this alarm is cleared. If <strong id="ALM-12061__b8690185151913">Trigger Count</strong> is greater than <strong id="ALM-12061__b0690551141910">1</strong> and the process usage is less than or equal to 90% of the threshold, this alarm is cleared.</p>
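<p>The usage calculation described above can be sketched as two small shell functions. This is an illustration only: the function names <strong>sum_nlwp</strong> and <strong>process_usage_pct</strong> are hypothetical, the result is truncated to a whole percentage, and the sample figures are invented for the example.</p>

```shell
# Hypothetical helper: sums the NLWP (thread count) column read from standard input.
sum_nlwp() {
    awk '{sum += $1} END {print sum + 0}'
}

# Hypothetical helper: whole-percentage usage, given the number of concurrent
# processes (threads) and the maximum number allowed.
process_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Example with inline sample data instead of live ps output.
used=$(printf '3\n5\n12\n' | sum_nlwp)
process_usage_pct "$used" 100

# On a live host, as user omm (assumes "ulimit -u" prints a number, not "unlimited"):
# process_usage_pct "$(ps -o nlwp= -u omm | sum_nlwp)" "$(ulimit -u)"
```

<p>In the sample, three processes with 3, 5, and 12 threads give 20 concurrent threads, which is 20% of an assumed limit of 100.</p>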
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section75265516199"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12061__table11528115171913" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12061__row13691351131919"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12061__p1169145119198">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12061__p206914516195">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12061__p126911651151918">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12061__row1669113514192"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12061__p1269145151915">12061</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12061__p8691175121917">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12061__p4691145115196">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section115319514194"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12061__table105321951141912" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12061__row1269219516194"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12061__p15692105115190">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12061__p469295120198">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12061__row759218834110"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12061__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12061__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12061__row4692145112197"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12061__p1969235120195">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12061__p1969215513194">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12061__row9692165112196"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12061__p1692175110197">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12061__p9692105141912">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12061__row669212515194"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12061__p16692951181919">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12061__p6692251101911">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12061__row569215101914"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12061__p1569215131917">Trigger Condition</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12061__p569318516195">Specifies the threshold for triggering the alarm.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section554019510195"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12061__ul10693185117194"><li id="ALM-12061__li13693185111915">Switching to user <strong id="ALM-12061__b36932051161915">omm</strong> fails.</li><li id="ALM-12061__li186937513198">New omm processes cannot be created.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section19542851121912"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12061__ul116931351161910"><li id="ALM-12061__li369312514191">The alarm threshold is improperly configured.</li><li id="ALM-12061__li176935515190">The maximum number of processes (including threads) that can be concurrently opened by user <strong id="ALM-12061__b116937517193">omm</strong> is inappropriate.</li><li id="ALM-12061__li669355121914">An excessive number of threads are opened at the same time.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section145451851131917"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12061__p12693135116199"><strong id="ALM-12061__b166935517198">Check whether the alarm threshold or Trigger Count is properly configured.</strong></p>
|
||||
<ol id="ALM-12061__ol1937419236218"><li id="ALM-12061__li63741123102117"><span>On FusionInsight Manager, change the alarm threshold and <strong id="ALM-12061__b1936942319210">Trigger Count</strong> based on the actual process usage.</span><p><p id="ALM-12061__p53741023132117">Specifically, choose <strong id="ALM-12061__b12369223182120">O&M</strong> > <strong id="ALM-12061__b7369182362114">Alarm</strong> > <strong id="ALM-12061__b183696238213">Thresholds</strong> > <em id="ALM-12061__i2811143010409">Name of the desired cluster</em> > <strong id="ALM-12061__b1736902316215">Host</strong> > <strong id="ALM-12061__b1369122314213">Process</strong> > <strong id="ALM-12061__b1736992318217">omm Process Usage</strong> to change Trigger Count.</p>
|
||||
<div class="note" id="ALM-12061__note1837419235216"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12061__p6374102312216">The alarm is generated when the process usage exceeds the threshold for the times specified by <strong id="ALM-12061__b1237411237214">Trigger Count</strong>.</p>
|
||||
</div></div>
|
||||
<p id="ALM-12061__p1737417236213">Set the alarm threshold based on the actual process usage. To check the process usage, choose <strong id="ALM-12061__b4374172315215">O&M</strong> > <strong id="ALM-12061__b11374192352114">Alarm</strong> > <strong id="ALM-12061__b183741423162110">Thresholds</strong> > <em id="ALM-12061__i18450436164420">Name of the desired cluster</em> > <strong id="ALM-12061__b2374102311219">Host</strong> > <strong id="ALM-12061__b51371152474">Process</strong> > <strong id="ALM-12061__b1693614974714">omm Process Usage</strong>, as shown in <a href="#ALM-12061__fig437414238216">Figure 1</a>.</p>
|
||||
<div class="fignone" id="ALM-12061__fig437414238216"><a name="ALM-12061__fig437414238216"></a><a name="fig437414238216"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12061__image1615410501365" src="en-us_image_0000001440858217.png"></span></div>
|
||||
</p></li><li id="ALM-12061__li33745237217"><span>Two minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12061__ul1437412317219"><li id="ALM-12061__li2374182312217">If it is, no further action is required.</li><li id="ALM-12061__li2374112315211">If it is not, go to <a href="#ALM-12061__li936717234216">3</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12061__p630219198214"><strong id="ALM-12061__b6695451191916">Check whether the maximum number of processes (including threads) opened by user omm is appropriate.</strong></p>
|
||||
<ol start="3" id="ALM-12061__ol13367112317219"><li id="ALM-12061__li936717234216"><a name="ALM-12061__li936717234216"></a><a name="li936717234216"></a><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12061__li1136752311217"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12061__b1136717231212">root</strong>. <span id="ALM-12061__text985593916354"></span></span></li><li id="ALM-12061__li15367523112112"><span>Run the <strong id="ALM-12061__b5367122302115">su - omm</strong> command to switch to user <strong id="ALM-12061__b193671623132111">omm</strong>.</span></li><li id="ALM-12061__li8367112332111"><span>Run the <strong id="ALM-12061__b14367122392112">ulimit -u</strong> command to obtain the maximum number of threads that can be concurrently opened by user <strong id="ALM-12061__b1236732392116">omm</strong> and check whether the number is greater than or equal to 60000.</span><p><ul id="ALM-12061__ul136710230215"><li id="ALM-12061__li13367423122115">If it is, go to <a href="#ALM-12061__li293443912213">8</a>.</li><li id="ALM-12061__li2367102320214">If it is not, go to <a href="#ALM-12061__li8367152314217">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12061__li8367152314217"><a name="ALM-12061__li8367152314217"></a><a name="li8367152314217"></a><span>Run the <strong id="ALM-12061__b53671823112118">ulimit -u 60000</strong> command to change the maximum number to 60000. Two minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12061__ul19367423152119"><li id="ALM-12061__li93671123122113">If it is, no further action is required.</li><li id="ALM-12061__li836702332116">If it is not, go to <a href="#ALM-12061__li1668345092117">12</a>.</li></ul>
|
||||
</p></li></ol>
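<p>The limit check in the steps above can be sketched as follows. This is a minimal illustration: the function name <strong>limit_ok</strong> is hypothetical, and the default of 60000 is taken from the procedure.</p>

```shell
# Hypothetical helper: succeeds if a "ulimit -u" value is at least the
# required minimum (default: 60000, the value used in the procedure).
limit_ok() {
    limit=$1
    required=${2:-60000}
    # "unlimited" means that no cap is applied.
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$required" ]
}

# Example with sample values instead of the live limit.
for value in 4096 60000 unlimited; do
    if limit_ok "$value"; then
        echo "$value: sufficient"
    else
        echo "$value: too low"
    fi
done

# On a live host, as user omm:
# limit_ok "$(ulimit -u)" || echo "raise the limit"
```

<p>Note that a limit set with <strong>ulimit -u</strong> applies to the current shell session and the processes started from it; processes that are already running keep their existing limit.</p>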
|
||||
<p id="ALM-12061__p7839436162117"><strong id="ALM-12061__b1836742382111">Check whether an excessive number of processes are opened at the same time.</strong></p>
|
||||
<ol start="8" id="ALM-12061__ol1093673902112"><li id="ALM-12061__li293443912213"><a name="ALM-12061__li293443912213"></a><a name="li293443912213"></a><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12061__li3934143952119"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12061__b209341539202116">root</strong>.</span></li><li id="ALM-12061__li893473922118"><span>Run the <strong id="ALM-12061__b199341039112112">ps -o nlwp,pid,lwp,args -u omm|sort -n</strong> command to check the number of threads used by each process of user omm. The result is sorted by thread count in ascending order. Analyze the five processes with the most threads and check whether the threads are incorrectly used. If they are, contact maintenance personnel to rectify the fault. If they are not, run the <strong id="ALM-12061__b209343391212">ulimit -u</strong> command to set the maximum number to a value greater than 60000.</span></li><li id="ALM-12061__li119349396211"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12061__ul11934203918217"><li id="ALM-12061__li29341139172111">If it is, no further action is required.</li><li id="ALM-12061__li10934539102120">If it is not, go to <a href="#ALM-12061__li1668345092117">12</a>.</li></ul>
|
||||
</p></li></ol>
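<p>The thread analysis in the steps above can be sketched as a small filter that keeps only the busiest processes. This is an illustration only: the function name <strong>top_thread_users</strong> is hypothetical, and it assumes input rows that start with the thread count (NLWP).</p>

```shell
# Hypothetical helper: reads "NLWP PID ARGS" rows on standard input and prints
# the N rows with the highest thread count (default: 5).
top_thread_users() {
    sort -rn | head -n "${1:-5}"
}

# Example with inline sample rows instead of live ps output.
printf '  4 101 sshd\n250 102 java\n 17 103 python\n' | top_thread_users 2

# On a live host, without header lines:
# ps -o nlwp= -o pid= -o args= -u omm | top_thread_users
```

<p>Sorting in descending order places the processes with the most threads first, so the rows to analyze appear at the top of the output.</p>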
|
||||
<p id="ALM-12061__p56917471218"><strong id="ALM-12061__b1493463982113">Collect fault information.</strong></p>
|
||||
<ol start="12" id="ALM-12061__ol18685115014216"><li id="ALM-12061__li1668345092117"><a name="ALM-12061__li1668345092117"></a><a name="li1668345092117"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12061__b968317505217">O&M</strong> > <strong id="ALM-12061__b156836505210">Log</strong> > <strong id="ALM-12061__b7683135018213">Download</strong>.</span></li><li id="ALM-12061__li868355022113"><span>In the <strong id="ALM-12061__b33411729132615">Service</strong> area, select <strong id="ALM-12061__b6683950172114">OmmServer</strong> and <strong id="ALM-12061__b468318504214">NodeAgent</strong> and click <strong id="ALM-12061__b3991118545">OK</strong>.</span></li><li id="ALM-12061__li8685135062120"><span>Click <span><img id="ALM-12061__image12683135092120" src="en-us_image_0269383906.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12061__b136837501219">Start Date</strong> and <strong id="ALM-12061__b86832508216">End Date</strong> to 10 minutes before and after the alarm generation time, respectively, and click <strong id="ALM-12061__b1168545014219">OK</strong>. Then, click <strong id="ALM-12061__b13685125042113">Download</strong>.</span></li><li id="ALM-12061__li495644512588"><span>Contact the <span id="ALM-12061__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section10584175161919"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12061__p6698105111191">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12061__section8584185131911"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12061__p11698651141916">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
97
docs/mrs/umn/ALM-12062.html
Normal file
@ -0,0 +1,97 @@
|
||||
<a name="ALM-12062"></a><a name="ALM-12062"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12062 OMS Parameter Configurations Mismatch with the Cluster Scale</h1>
|
||||
<div id="body1546915438900"><div class="section" id="ALM-12062__section2747821101717"><h4 class="sectiontitle">Description</h4><p id="ALM-12062__p53271255205214">The system checks every hour, on the hour, whether the OMS parameter configurations match the cluster scale. If the OMS parameter configurations do not meet the cluster scale requirements, the system generates this alarm. This alarm is automatically cleared when the OMS parameter configurations are modified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section127478213171"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12062__table7749721191719" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12062__row6867152161714"><th align="left" class="cellrowborder" valign="top" width="34.37343734373437%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12062__p03908133538">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="34.31343134313431%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12062__p239001375320">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="31.313131313131308%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12062__p1939041395319">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12062__row586722117171"><td class="cellrowborder" valign="top" width="34.37343734373437%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12062__p33906131535">12062</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="34.31343134313431%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12062__p73902013155315">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="31.313131313131308%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12062__p1539021315312">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section14755172115173"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12062__table17756521131714" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12062__row18671421131712"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12062__p786772121719">Parameter</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12062__p286742191711">Description</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12062__row15959022415"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12062__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12062__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12062__row786710211177"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12062__p58673218178">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12062__p4868821191713">Specifies the name of the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12062__row286819215174"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12062__p186818216176">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12062__p7868721131716">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12062__row14868221161713"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12062__p1986842116171">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12062__p10868132118175">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section1776462111715"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12062__p13799111525419">The OMS configuration is not modified when the cluster is installed or the system capacity is expanded.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section6765152119174"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12062__p12567152916541">The OMS parameter configurations do not match the cluster scale.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section87667210173"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12062__p456793710548"><strong id="ALM-12062__b356720377542">Check whether the OMS parameter configurations match the cluster scale.</strong></p>
|
||||
<ol id="ALM-12062__ol87012317557"><li id="ALM-12062__li489962395514"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12062__li152261503555"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12062__b022675065516">root</strong>. <span id="ALM-12062__text985593916354"></span></span></li><li id="ALM-12062__li95861858185515"><span>Run the <strong id="ALM-12062__b19586105865511">su - omm</strong> command to switch to user <strong id="ALM-12062__b6602115865515">omm</strong>.</span></li><li id="ALM-12062__li960214583555"><span>Run the <strong id="ALM-12062__b660235865514">vi $BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log</strong> command to open the log file and search for the entry containing the following information: Current oms configurations cannot support <em id="ALM-12062__i260210581552">xx</em> nodes. In this entry, <em id="ALM-12062__i1760210587558">xx</em> indicates the number of nodes in the cluster.</span></li><li id="ALM-12062__li1895714113811"><span>Optimize the current cluster configuration by following the instructions in <a href="#ALM-12062__section117861721171717">Optimizing Manager Configurations Based on the Number of Cluster Nodes</a>.</span></li><li id="ALM-12062__li199275175618"><span>One hour later, check whether the alarm is cleared.</span><p><ul id="ALM-12062__ul65231712185619"><li id="ALM-12062__li4861118105614">If it is, no further action is required.</li><li id="ALM-12062__li152720248562">If it is not, go to <a href="#ALM-12062__li8140111212587">7</a>.</li></ul>
|
||||
</p></li></ol>
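<p>The log search in the procedure above can also be done without an editor. The following is a minimal sketch, not part of the product tooling: the function name <strong>find_oms_mismatch</strong> is hypothetical, and the default path assumes that <strong>BIGDATA_LOG_HOME</strong> is set in the environment of user <strong>omm</strong>.</p>

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product): print, with line numbers,
# every log entry that reports an OMS/cluster-scale mismatch.
# The default path is the one named in the procedure and assumes that
# BIGDATA_LOG_HOME is set for the current user.
find_oms_mismatch() {
    log_file="${1:-$BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log}"
    # The search string is the message text quoted in the procedure.
    grep -n "Current oms configurations cannot support" "$log_file"
}
```

<p>Calling <strong>find_oms_mismatch</strong> without arguments reads the default log file. Passing a path as the first argument reads that file instead. The function returns a non-zero status if no matching entry exists.</p>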
|
||||
<p id="ALM-12062__p13421113195811"><strong id="ALM-12062__b204218131586">Collect fault information.</strong></p>
|
||||
<ol start="7" id="ALM-12062__ol1514001219584"><li id="ALM-12062__li8140111212587"><a name="ALM-12062__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12062__b12140112175816">O&M</strong> > <strong id="ALM-12062__b114011127584">Log</strong> > <strong id="ALM-12062__b141404121585">Download</strong>.</span></li><li id="ALM-12062__li9140101216585"><span>Select <strong id="ALM-12062__b15140101214581">Controller</strong> for <strong id="ALM-12062__b214071255817">Service</strong> and click <strong id="ALM-12062__b3991118545">OK</strong>.</span></li><li id="ALM-12062__li121401712195814"><span>Click <span><img id="ALM-12062__image1914021213589" src="en-us_image_0269383907.png"></span> in the upper right corner, and set <strong id="ALM-12062__b15140101215811">Start Date</strong> and <strong id="ALM-12062__b121408123588">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12062__b1214091210583">Download</strong>.</span></li><li id="ALM-12062__li495644512588"><span>Contact the <span id="ALM-12062__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12062__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12062__section117861721171717"><a name="ALM-12062__section117861721171717"></a><a name="section117861721171717"></a><h4 class="sectiontitle">Related Information</h4><p id="ALM-12062__p6786101413374"><strong id="ALM-12062__b1539194154415">Optimizing Manager Configurations Based on the Number of Cluster Nodes</strong></p>
|
||||
<ol id="ALM-12062__ol05141233717"><li id="ALM-12062__en-us_topic_0165590374_li26979450111823"><span>Log in to the active Manager node as user <strong id="ALM-12062__en-us_topic_0165590374_b51895181112128">omm</strong>.</span></li><li id="ALM-12062__en-us_topic_0165590374_li37226470112023"><span>Run the following command to switch the directory:</span><p><p id="ALM-12062__en-us_topic_0165590374_p57368764112211"><strong id="ALM-12062__en-us_topic_0165590374_b25553026112214">cd ${BIGDATA_HOME}/om-server/om/sbin</strong></p>
|
||||
</p></li><li id="ALM-12062__en-us_topic_0165590374_li42402569112040"><span>Run the following command to view the current Manager configurations.</span><p><p id="ALM-12062__en-us_topic_0165590374_p49977307112647"><strong id="ALM-12062__en-us_topic_0165590374_b52915491112650">sh oms_config_info.sh -q</strong></p>
|
||||
</p></li><li id="ALM-12062__en-us_topic_0165590374_li49167719112555"><span>Run the following command to specify the number of nodes in the current cluster.</span><p><p id="ALM-12062__en-us_topic_0165590374_p64566987112853">Command format: <strong id="ALM-12062__en-us_topic_0165590374_b7323796112918">sh oms_config_info.sh -s </strong><em id="ALM-12062__en-us_topic_0165590374_i45810750112920">number of nodes</em></p>
|
||||
<p id="ALM-12062__en-us_topic_0165590374_p13336332113026">Example:</p>
|
||||
<p id="ALM-12062__en-us_topic_0165590374_p28514502112923"><strong id="ALM-12062__en-us_topic_0165590374_b34554882153757">sh oms_config_info.sh -s 10</strong><strong id="ALM-12062__en-us_topic_0165590374_b42558486153757">00</strong></p>
|
||||
<p id="ALM-12062__en-us_topic_0165590374_p56151975113352">Enter <span class="parmname" id="ALM-12062__en-us_topic_0165590374_parmname54856661113358"><b>y</b></span> as prompted.</p>
|
||||
<pre class="screen" id="ALM-12062__en-us_topic_0165590374_screen1838109215412">The following configurations will be modified:
|
||||
Module Parameter Current Target
|
||||
Controller controller.Xmx 4096m => 16384m
|
||||
Controller controller.Xms 1024m => 8192m
|
||||
Controller controller.node.heartbeat.error.threshold 30000 => 60000
|
||||
Pms pms.mem 8192m => 10240m
|
||||
Do you really want to do this operation? (y/n):</pre>
|
||||
<p id="ALM-12062__en-us_topic_0165590374_p33978317113511">The configurations are updated successfully if the following information is displayed:</p>
|
||||
<pre class="screen" id="ALM-12062__en-us_topic_0165590374_screen66711405113653">...
|
||||
Operation has been completed. Now restarting OMS server. [done]
|
||||
Restarted oms server successfully.</pre>
|
||||
<div class="note" id="ALM-12062__en-us_topic_0165590374_note26248943114621"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="ALM-12062__en-us_topic_0165590374_ul56308272114644"><li id="ALM-12062__en-us_topic_0165590374_li30858860114644">OMS is automatically restarted during the configuration update process.</li><li id="ALM-12062__en-us_topic_0165590374_li28603951114646">Clusters with similar numbers of nodes have the same Manager configurations. For example, when the number of nodes is changed from 100 to 101, no configuration item needs to be updated.</li></ul>
|
||||
</div></div>
|
||||
</p></li></ol>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
86
docs/mrs/umn/ALM-12063.html
Normal file
@ -0,0 +1,86 @@
|
||||
<a name="ALM-12063"></a><a name="ALM-12063"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12063 Unavailable Disk</h1>
|
||||
<div id="body1546933104148"><div class="section" id="ALM-12063__section2747821101717"><h4 class="sectiontitle">Description</h4><p id="ALM-12063__p640318111024">The system checks whether the data disk of the current host is available at the top of each hour. The system creates files, writes files, and deletes files in the mount directory of the disk. If the operations fail, the alarm is generated. If the operations succeed, the disk is available, and the alarm is cleared.</p>
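<p>The availability check described above can be approximated by hand when you troubleshoot a disk. The following is a minimal sketch under stated assumptions: the function name <strong>probe_dir</strong> and the probe file name are hypothetical, and the product's own check may differ in detail.</p>

```shell
#!/bin/sh
# Hypothetical probe (not the product's implementation): create, write,
# read back, and delete a file in the given directory.
# Returns 0 if every operation succeeds, non-zero otherwise.
probe_dir() {
    dir="$1"
    # Create a uniquely named probe file inside the directory.
    probe_file=$(mktemp "$dir/.disk_probe.XXXXXX" 2>/dev/null) || return 1
    # Write a known marker to the file.
    if ! printf 'probe\n' > "$probe_file" 2>/dev/null; then
        rm -f "$probe_file"
        return 1
    fi
    # Read the marker back, then delete the file.
    content=$(cat "$probe_file" 2>/dev/null)
    rm -f "$probe_file" || return 1
    [ "$content" = "probe" ]
}
```

<p>Run the function against the mount directory of the affected disk as the user that owns the service data. A non-zero status indicates that at least one of the create, write, read, or delete operations failed.</p>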
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section127478213171"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12063__table7749721191719" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12063__row6867152161714"><th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12063__p03908133538">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12063__p239001375320">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="31.313131313131308%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12063__p1939041395319">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12063__row586722117171"><td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12063__p33906131535">12063</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12063__p73902013155315">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="31.313131313131308%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12063__p1539021315312">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section14755172115173"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12063__table17756521131714" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12063__row18671421131712"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12063__p786772121719">Parameter</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12063__p286742191711">Description</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12063__row14976557204013"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12063__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12063__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12063__row786710211177"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12063__p58673218178">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12063__p4868821191713">Specifies the name of the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12063__row286819215174"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12063__p186818216176">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12063__p7868721131716">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12063__row14868221161713"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12063__p1986842116171">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12063__p10868132118175">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12063__row1790721212314"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12063__p129071912831">DiskName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12063__p690719129311">Specifies the disk for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section1776462111715"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12063__p17812163912213">Data read or write on the data disk fails, and services are abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section6765152119174"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12063__ul161755171417"><li id="ALM-12063__li10175717546">The permission of the disk mount directory is abnormal.</li><li id="ALM-12063__li101751172415">The disk has bad sectors.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section87667210173"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12063__p982615013436"><strong id="ALM-12063__b148266084312">Check whether the permission of the disk mount directory is normal.</strong></p>
|
||||
<ol id="ALM-12063__ol153513712451"><li id="ALM-12063__li053519784510"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host and the <strong id="ALM-12063__b9535779453">DiskName</strong> of the disk for which the alarm is generated.</span></li><li id="ALM-12063__li8535167194513"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12063__b1053519711454">root</strong>. <span id="ALM-12063__text985593916354"></span></span></li><li id="ALM-12063__li135352074456"><span>Run the <strong id="ALM-12063__b165354764512">df -h | grep DiskName</strong> command to obtain the mount point and check whether the mount directory is unreadable or unwritable.</span><p><ul id="ALM-12063__ul25357764510"><li id="ALM-12063__li753517711453">If it is, go to <a href="#ALM-12063__li1053537184512">4</a>.</li><li id="ALM-12063__li2535107184515">If it is not, go to <a href="#ALM-12063__li8140111212587">8</a>.<div class="note" id="ALM-12063__note483271124518"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12063__p681582324511">If the permission of the mount directory is 000 or the owner is <strong id="ALM-12063__b5815192354515">root</strong>, the mount directory is unreadable and unwritable.</p>
|
||||
</div></div>
|
||||
</li></ul>
|
||||
</p></li></ol><ol start="4" id="ALM-12063__ol1053510744517"><li id="ALM-12063__li1053537184512"><a name="ALM-12063__li1053537184512"></a><a name="li1053537184512"></a><span>Modify the directory permission.</span></li><li id="ALM-12063__li13535977455"><span>One hour later, check whether this alarm is cleared.</span><p><ul id="ALM-12063__ul1453518794514"><li id="ALM-12063__li453514774516">If it is, no further action is required.</li><li id="ALM-12063__li135357784518">If it is not, go to <a href="#ALM-12063__li4535871458">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12063__li4535871458"><a name="ALM-12063__li4535871458"></a><a name="li4535871458"></a><span>Contact hardware engineers to rectify the disk fault.</span></li><li id="ALM-12063__li1353518719457"><span>One hour later, check whether this alarm is cleared.</span><p><ul id="ALM-12063__ul6535167124514"><li id="ALM-12063__li05355711456">If it is, no further action is required.</li><li id="ALM-12063__li65354717453">If it is not, go to <a href="#ALM-12063__li8140111212587">8</a>.</li></ul>
|
||||
</p></li></ol>
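<p>To inspect the permission and owner of the mount directory in the check above, a one-line helper is enough. The following is a minimal sketch: the function name <strong>mount_dir_perms</strong> is hypothetical, and it assumes GNU coreutils <strong>stat</strong>, which is typical on Linux hosts.</p>

```shell
#!/bin/sh
# Hypothetical helper: print "<octal mode> <owner>" for a directory.
# Assumes GNU coreutils stat (-c); BSD and macOS stat use different options.
mount_dir_perms() {
    stat -c '%a %U' "$1"
}
```

<p>Pass the mount point obtained from the <strong>df</strong> output as the argument. Per the note in the procedure, a mode of 000 or an owner of <strong>root</strong> means that the mount directory is unreadable and unwritable.</p>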
|
||||
<p id="ALM-12063__p18256224611"><strong id="ALM-12063__b42515254610">Collect fault information.</strong></p>
|
||||
<ol start="8" id="ALM-12063__ol1996717458377"><li id="ALM-12063__li8140111212587"><a name="ALM-12063__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12063__b12140112175816">O&M</strong> > <strong id="ALM-12063__b114011127584">Log</strong> > <strong id="ALM-12063__b141404121585">Download</strong>.</span></li><li id="ALM-12063__li9140101216585"><span>Select <strong id="ALM-12063__b069717155404">NodeAgent</strong> for <strong id="ALM-12063__b214071255817">Service</strong> and click <strong id="ALM-12063__b3991118545">OK</strong>.</span></li><li id="ALM-12063__li296716454377"><span>Click <span><img id="ALM-12063__image109671245153716" src="en-us_image_0269383908.png"></span> in the upper right corner, and set <strong id="ALM-12063__b99671445103719">Start Date</strong> and <strong id="ALM-12063__b3967114563711">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12063__b2967194513374">Download</strong>.</span></li><li id="ALM-12063__li495644512588"><span>Contact the <span id="ALM-12063__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12063__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12063__section117861721171717"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12063__p3869621161713">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
78
docs/mrs/umn/ALM-12064.html
Normal file
@ -0,0 +1,78 @@
|
||||
<a name="ALM-12064"></a><a name="ALM-12064"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12064 Host Random Port Range Conflicts with Cluster Used Port</h1>
|
||||
<div id="body1547165364811"><div class="section" id="ALM-12064__section332351094219"><h4 class="sectiontitle">Alarm Description</h4><p id="ALM-12064__p429252484216">Every hour, the system checks whether the random port range of the host conflicts with the range of ports used by the Cluster system. The alarm is generated if they conflict. The alarm is automatically cleared when the random port range of the host is changed to the normal range.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section15323151010426"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12064__table5323710144210" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12064__row133231010144218"><th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12064__p4323910104212">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12064__p18323171020428">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="31.313131313131308%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12064__p1532301018426">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12064__row7338310164219"><td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12064__p533810101427">12064</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12064__p733813103420">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="31.313131313131308%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12064__p113383103423">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section93382010114217"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12064__table1633810106423" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12064__row1833818109426"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12064__p1733810102429">Parameter</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12064__p533819103428">Description</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12064__row1350085224012"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12064__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12064__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12064__row3338210134211"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12064__p1933801016429">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12064__p5338101011423">Specifies the name of the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12064__row19338121016427"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12064__p17338141024217">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12064__p9338161010425">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12064__row143381810154214"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12064__p735410104425">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12064__p8354191018422">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section73541103423"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12064__p17812163912213">The default port of the Cluster system is occupied. As a result, some processes fail to be started.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section1735461018427"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12064__p52213894213">The random port range configuration is modified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section693292174218"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12064__p158581330123713"><strong id="ALM-12064__b1585853015379">Check the random port range of the system.</strong></p>
|
||||
<ol id="ALM-12064__ol13967134518379"><li id="ALM-12064__li199671454371"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12064__li69671245143710"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12064__b1396794515374">root</strong>. <span id="ALM-12064__text985593916354"></span></span></li><li id="ALM-12064__li1996794533711"><span>Run the <strong id="ALM-12064__b69671845103713">cat /proc/sys/net/ipv4/ip_local_port_range</strong> command to obtain the random port range of the host and check whether the minimum value is smaller than 32768.</span><p><ul id="ALM-12064__ul1496716452370"><li id="ALM-12064__li209671245173712">If it is, go to <a href="#ALM-12064__li1796713455375">4</a>.</li><li id="ALM-12064__li89671745183711">If it is not, go to <a href="#ALM-12064__li1396704514377">7</a>.</li></ul>
|
||||
</p></li><li id="ALM-12064__li1796713455375"><a name="ALM-12064__li1796713455375"></a><a name="li1796713455375"></a><span>Run the <strong id="ALM-12064__b1296734510372">vim /etc/sysctl.conf</strong> command to change the value of <strong id="ALM-12064__b1296711459372">net.ipv4.ip_local_port_range</strong> to <strong id="ALM-12064__b496794523715">32768 61000</strong>. If this parameter does not exist, add the following configuration: <strong id="ALM-12064__b129678452378">net.ipv4.ip_local_port_range = 32768 61000</strong>.</span></li><li id="ALM-12064__li79678451371"><span>Run the <strong id="ALM-12064__b11967445133718">sysctl -p /etc/sysctl.conf</strong> command for the modification to take effect.</span></li><li id="ALM-12064__li496704563711"><span>One hour later, check whether the alarm is cleared.</span><p><ul id="ALM-12064__ul16967445203711"><li id="ALM-12064__li1596784553710">If it is, no further action is required.</li><li id="ALM-12064__li1796784514375">If it is not, go to <a href="#ALM-12064__li1396704514377">7</a>.</li></ul>
|
||||
</p></li></ol>
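<p>The range check in the procedure above can be scripted. The following is a minimal sketch: the function name <strong>port_range_conflicts</strong> is hypothetical, and the optional file argument exists only so that the check can be exercised against a sample file instead of the live kernel setting.</p>

```shell
#!/bin/sh
# Hypothetical check: succeed (status 0) when the lower bound of the random
# port range is below 32768, the threshold named in the procedure.
# Status 1 means the lower bound is 32768 or higher; status 2 means the
# range could not be read.
port_range_conflicts() {
    range_file="${1:-/proc/sys/net/ipv4/ip_local_port_range}"
    low=""
    # The file holds two whitespace-separated numbers: lower and upper bound.
    read -r low _rest < "$range_file" || [ -n "$low" ] || return 2
    [ "$low" -lt 32768 ]
}
```

<p>When the function succeeds, the minimum value is smaller than 32768, and the <strong>net.ipv4.ip_local_port_range</strong> setting needs to be changed as described in the procedure.</p>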
|
||||
<p id="ALM-12064__p23701710174214"><strong id="ALM-12064__b123701110164218">Collect fault information.</strong></p>
|
||||
<ol start="7" id="ALM-12064__ol1996717458377"><li id="ALM-12064__li1396704514377"><a name="ALM-12064__li1396704514377"></a><a name="li1396704514377"></a><span>On FusionInsight Manager, choose <strong id="ALM-12064__b1996754543712">O&M</strong> > <strong id="ALM-12064__b20967645173714">Log</strong> > <strong id="ALM-12064__b1496734511372">Download</strong>.</span></li><li id="ALM-12064__li1596764533717"><span>Select <strong id="ALM-12064__b13967174519376">NodeAgent</strong> for <strong id="ALM-12064__b196744553714">Service</strong> and click <strong id="ALM-12064__b3991118545">OK</strong>.</span></li><li id="ALM-12064__li296716454377"><span>Click <span><img id="ALM-12064__image109671245153716" src="en-us_image_0269383909.png"></span> in the upper right corner, and set <strong id="ALM-12064__b99671445103719">Start Date</strong> and <strong id="ALM-12064__b3967114563711">End Date</strong> for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click <strong id="ALM-12064__b2967194513374">Download</strong>.</span></li><li id="ALM-12064__li495644512588"><span>Contact the <span id="ALM-12064__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section14385121020422"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12064__p2038591034212">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12064__section113853101423"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12064__p133851310194211">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
91
docs/mrs/umn/ALM-12066.html
Normal file
@ -0,0 +1,91 @@
|
||||
<a name="ALM-12066"></a><a name="ALM-12066"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12066 Trust Relationships Between Nodes Become Invalid</h1>
|
||||
<div id="body1547168128796"><div class="section" id="ALM-12066__section10369415133116"><h4 class="sectiontitle">Description</h4><p id="ALM-12066__p324232317301">Every hour, the system checks whether the trust relationship between the active OMS node and other Agent nodes is normal. This alarm is generated if mutual trust fails. This alarm is automatically cleared after the problem is resolved.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section8323192410322"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12066__table1479793583212" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12066__row107991735133210"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12066__p18799183583212">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12066__p1680123511326">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12066__p1980173523217">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12066__row880183517329"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12066__p108014356328">12066</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12066__p19802163593213">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12066__p880215356323">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section652875914327"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12066__table1090459143316" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12066__row190429173313"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12066__p129062911339">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12066__p10906093332">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12066__row1035763317362"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12066__row18907109203311"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p99095916333">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p4909159173310">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12066__row4910691332"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p39101953320">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p5911189173310">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12066__row59118923315"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12066__p0912169123319">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12066__p169131916332">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section2990133614335"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12066__p531812513564">Some operations on the management plane may be abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12066__ul913288183510"><li id="ALM-12066__li713414815352">The <strong id="ALM-12066__b22461400518">/etc/ssh/sshd_config</strong> configuration file is damaged.</li><li id="ALM-12066__li131351185357">The password of user <strong id="ALM-12066__b10643161513517">omm</strong> has expired.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section071212121445"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12066__p14212204913111"><strong id="ALM-12066__b4515327657">Check the status of the /etc/ssh/sshd_config configuration file.</strong></p>
|
||||
<ol id="ALM-12066__ol363257182811"><li id="ALM-12066__li263016792816"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm and click <span><img id="ALM-12066__image1663017722814" src="en-us_image_0263895789.png"></span> to view the host list in the alarm details.</span></li><li id="ALM-12066__li17631167192814"><span>Log in to the active OMS node as user <strong id="ALM-12066__b173458362104930">omm</strong>. <span id="ALM-12066__text38540585518"></span></span></li><li id="ALM-12066__li17631374283"><span>Run the <strong id="ALM-12066__b8591193761511">ssh</strong> command, for example, <strong id="ALM-12066__b1611013111616">ssh</strong> <strong id="ALM-12066__b461113131618"><em id="ALM-12066__i8702204181616">host2</em></strong>, on each node in the alarm details to check whether the connection fails. (<em id="ALM-12066__i1032492071610"><strong id="ALM-12066__b18558144131812">host2</strong></em> is a node other than the OMS node in the alarm details.)</span><p><ul id="ALM-12066__ul1963111718289"><li id="ALM-12066__li363117710285">If yes, go to <a href="#ALM-12066__li176321676280">4</a>.</li><li id="ALM-12066__li136319782815">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12066__li176321676280"><a name="ALM-12066__li176321676280"></a><a name="li176321676280"></a><span>Open the <strong id="ALM-12066__b19350203172016">/etc/ssh/sshd_config</strong> configuration file on host2 and check whether <strong id="ALM-12066__b497416449207">AllowUsers</strong> or <strong id="ALM-12066__b683084712203">DenyUsers</strong> is configured for other nodes.</span><p><ul id="ALM-12066__ul263219711285"><li id="ALM-12066__li66323716289">If yes, go to <a href="#ALM-12066__li846318425575">5</a>.</li><li id="ALM-12066__li1763211732817">If no, contact OS experts.</li></ul>
|
||||
</p></li><li id="ALM-12066__li846318425575"><a name="ALM-12066__li846318425575"></a><a name="li846318425575"></a><span>Modify the whitelist or blacklist to ensure that user <strong id="ALM-12066__b5862624122211">omm</strong> is in the whitelist or not in the blacklist. Check whether the alarm is cleared.</span><p><ul id="ALM-12066__ul111918318587"><li id="ALM-12066__li17191331165814">If yes, no further action is required.</li><li id="ALM-12066__li15858237195817">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
|
||||
</p></li></ol>
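<p>As a rough sketch of steps 3 and 4 above, the two checks can be scripted. The function names <strong>check_trust</strong> and <strong>list_user_rules</strong> are hypothetical helpers, not MRS tools. The sketch assumes a Bash shell and the OpenSSH client, whose <strong>BatchMode</strong> option makes <strong>ssh</strong> fail instead of prompting.</p>

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; they are not part of MRS.

# Return 0 if login to the given host succeeds without any prompt.
# BatchMode=yes makes ssh exit with an error instead of asking for a
# password or passphrase, so the check never hangs.
check_trust() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Print the active (uncommented) AllowUsers and DenyUsers lines of an
# sshd configuration file. The default path is the one named in step 4.
list_user_rules() {
    local config="${1:-/etc/ssh/sshd_config}"
    grep -Ei '^[[:space:]]*(AllowUsers|DenyUsers)[[:space:]]' "$config"
}
```

<p>For example, running <strong>check_trust host2</strong> as user <strong>omm</strong> returns a non-zero status if passwordless login fails. Because <strong>BatchMode</strong> suppresses prompts, run the plain <strong>ssh host2</strong> command to see the interaction information that step 6 relies on.</p>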
|
||||
<p id="ALM-12066__p31281710101318"><strong id="ALM-12066__b580872612411">Check the status of the password of user omm.</strong></p>
|
||||
<ol start="6" id="ALM-12066__ol19148181010138"><li id="ALM-12066__li9148131091317"><a name="ALM-12066__li9148131091317"></a><a name="li9148131091317"></a><span>Check the interaction information of the <strong id="ALM-12066__b17968171562512">ssh</strong> command.</span><p><ul class="subitemlist" id="ALM-12066__ul181481910161315"><li id="ALM-12066__li13148310111319">If the password of user <strong id="ALM-12066__b1022313330252">omm</strong> is required, go to <a href="#ALM-12066__li81482101138">7</a>.</li><li id="ALM-12066__li121483102136">If message "Enter passphrase for key '/home/omm/.ssh/id_rsa':" is displayed, go to <a href="#ALM-12066__li106306742813">9</a>.</li></ul>
|
||||
</p></li><li class="subitemlist" id="ALM-12066__li81482101138"><a name="ALM-12066__li81482101138"></a><a name="li81482101138"></a><span>Check the trust list (<strong id="ALM-12066__b75785322610">/home/omm/.ssh/authorized_keys</strong>) of user <strong id="ALM-12066__b730655672611">omm</strong> on the OMS node and on the host2 node. Check whether each trust list contains the content of the public key file (<strong id="ALM-12066__b1756913136278">/home/omm/.ssh/id_rsa.pub</strong>) of user <strong id="ALM-12066__b34871732719">omm</strong> on the peer host.</span><p><ul id="ALM-12066__ul6148151021318"><li id="ALM-12066__li614861061316">If yes, contact OS experts.</li><li id="ALM-12066__li11482010131312">If no, go to <a href="#ALM-12066__li19341633125911">8</a> to add the public key of user <strong id="ALM-12066__b663884152710">omm</strong> of the peer host to the trust list of the local host.</li></ul>
|
||||
</p></li><li id="ALM-12066__li19341633125911"><span>Add the public key of user <strong id="ALM-12066__b0377113310287">omm</strong> of the peer host to the trust list of the local host. Run the <strong id="ALM-12066__b1737092382919">ssh</strong> command, for example, <strong id="ALM-12066__b6889113012290">ssh host2</strong>, on each node in the alarm details to check whether the connection fails. (<em id="ALM-12066__i81833373014"><strong id="ALM-12066__b0720241270">host2</strong></em> is a node other than the OMS node in the alarm details.)</span><p><ul id="ALM-12066__ul137211213508"><li id="ALM-12066__li153121714307">If yes, go to <a href="#ALM-12066__li106306742813">9</a>.</li><li id="ALM-12066__li7313414402">If no, check whether the alarm is cleared. If the alarm is cleared, no further action is required; otherwise, go to <a href="#ALM-12066__li106306742813">9</a>.</li></ul>
|
||||
</p></li></ol>
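<p>The trust-list check in step 7 can be sketched as follows. <strong>key_is_trusted</strong> is a hypothetical helper name. The sketch assumes that a copy of the peer host's <strong>id_rsa.pub</strong> file is available locally, and it compares only the key field because the trailing comment can differ between copies.</p>

```shell
#!/bin/bash
# Hypothetical helper for illustration only; it is not part of MRS.

# Return 0 if the key in a public key file appears in a trust list.
#   $1: public key file, for example a copy of the peer's id_rsa.pub
#   $2: trust list, for example /home/omm/.ssh/authorized_keys
key_is_trusted() {
    local pub_file="$1" auth_file="$2" key_body
    # Field 2 of a public key line is the base64-encoded key itself.
    key_body=$(awk '{print $2; exit}' "$pub_file")
    # Fixed-string match, so characters such as + and / are taken literally.
    [ -n "$key_body" ] && grep -qF -- "$key_body" "$auth_file"
}
```

<p>A non-zero return value means that the key is missing from the trust list and needs to be added as described in step 8.</p>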
|
||||
<p id="ALM-12066__p124132216288"><strong id="ALM-12066__b1967293410811">Collect the fault information.</strong></p>
|
||||
<ol start="9" id="ALM-12066__ol146302742816"><li class="subitemlist" id="ALM-12066__li106306742813"><a name="ALM-12066__li106306742813"></a><a name="li106306742813"></a><span>On FusionInsight Manager, choose <strong id="ALM-12066__b140942549104930">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12066__b180541324104930">Log</strong> > <strong id="ALM-12066__b1225148528104930">Download</strong>.</span></li><li id="ALM-12066__li06301476283"><span>Select <strong id="ALM-12066__b192996136104930">Controller</strong> for <strong id="ALM-12066__b345013368916">Service</strong> and click <strong id="ALM-12066__b1962404791104930">OK</strong>.</span></li><li id="ALM-12066__li126301173286"><span>Click <span><img id="ALM-12066__image863057122812" src="en-us_image_0263895540.png"></span> in the upper right corner to set the log collection time range. Generally, the time range is 10 minutes before and after the alarm generation time. Click <strong id="ALM-12066__b575409479104930">Download</strong>.</span></li><li id="ALM-12066__li2630274284"><span>Contact <span id="ALM-12066__text1793615574113">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12066__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12066__section8222143110380"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12066__p4686124105919">Perform the following steps to handle abnormal trust relationships between nodes:</p>
|
||||
<div class="notice" id="ALM-12066__note64991413518"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><ul id="ALM-12066__ul19616958163514"><li id="ALM-12066__li1616145863512">Perform this operation as user <strong id="ALM-12066__b2165161015166">omm</strong>.</li><li id="ALM-12066__li861655833518">If the network between nodes is disconnected, rectify the network fault first. Check whether the two nodes are connected to the same security group and whether <strong id="ALM-12066__b196521759121612">hosts.deny</strong> and <strong id="ALM-12066__b1613616201712">hosts.allow</strong> are set.</li></ul>
|
||||
</div></div>
|
||||
<ol id="ALM-12066__ol1978732155814"><li id="ALM-12066__li597853215581">Run the <strong id="ALM-12066__b186632016173">ssh-add -l</strong> command on both nodes to check whether any identities exist.<p id="ALM-12066__p392110588248"><span><img id="ALM-12066__image8432143962413" src="en-us_image_0000001226576418.png"></span></p>
|
||||
<ul id="ALM-12066__ul122791263414"><li id="ALM-12066__li122797214348">If yes, go to <a href="#ALM-12066__li09782325586">4</a>.</li><li id="ALM-12066__li14378713415">If no, go to <a href="#ALM-12066__li16978123275815">2</a>.</li></ul>
|
||||
</li><li id="ALM-12066__li16978123275815"><a name="ALM-12066__li16978123275815"></a><a name="li16978123275815"></a>If no identities are displayed, run the <strong id="ALM-12066__b6267121682419">ps -ef|grep ssh-agent</strong> command to find the <strong id="ALM-12066__b1666702220243">ssh-agent</strong> process, stop the process, and wait for the process to automatically restart.<p id="ALM-12066__p629941492510"><span><img id="ALM-12066__image138828117259" src="en-us_image_0000001227056330.png"></span></p>
|
||||
</li><li id="ALM-12066__li1997863215584">Run the <strong id="ALM-12066__b18989588253">ssh-add -l</strong> command to check whether the identities have been added. If yes, manually run the <strong id="ALM-12066__b559031413264">ssh</strong> command to check whether the trust relationship is normal.<p id="ALM-12066__p492712369259"><span><img id="ALM-12066__image1579143210257" src="en-us_image_0000001271536445.png"></span></p>
|
||||
</li><li id="ALM-12066__li09782325586"><a name="ALM-12066__li09782325586"></a><a name="li09782325586"></a>If identities exist, check whether the <span class="filepath" id="ALM-12066__filepath1443720119218"><b>/home/omm/.ssh/authorized_keys</b></span> file contains the information in the <span class="filepath" id="ALM-12066__filepath693611119214"><b>/home/omm/.ssh/id_rsa.pub</b></span> file of the peer node. If it does not, manually add the information.</li><li id="ALM-12066__li497914322582">Check whether the permissions on the files in the <strong id="ALM-12066__b152771124143011">/home/omm/.ssh</strong> directory are modified.</li><li id="ALM-12066__li8979193218587">Check the <strong id="ALM-12066__b2982446153018">/var/log/Bigdata/nodeagent/scriptlog/ssh-agent-monitor.log</strong> file.</li><li id="ALM-12066__li3979632105814">If the <strong id="ALM-12066__b09816214325">/home</strong> directory of user <strong id="ALM-12066__b1171105173213">omm</strong> is deleted, contact MRS support personnel for assistance.</li></ol>
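<p>The <strong>ssh-add -l</strong> checks in the steps above can also be read from the command's exit status instead of its printed output. The sketch below assumes the OpenSSH behavior in which <strong>ssh-add</strong> exits with 0 on success, 1 if the command fails (for <strong>-l</strong>, when the agent holds no identities), and 2 if the agent cannot be contacted. <strong>describe_agent_status</strong> is a hypothetical helper name.</p>

```shell
#!/bin/bash
# Hypothetical helper for illustration only; it is not part of MRS.

# Translate the exit status of "ssh-add -l" into a short description.
describe_agent_status() {
    case "$1" in
        0) echo "identities loaded" ;;
        1) echo "agent reachable, no identities" ;;
        2) echo "cannot contact ssh-agent" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Typical use as user omm (not executed here):
#   ssh-add -l; describe_agent_status $?
```

<p>A status of 1 or 2 corresponds to the case in which no identities are displayed, where the <strong>ssh-agent</strong> process is stopped so that it restarts automatically.</p>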
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
79
docs/mrs/umn/ALM-12067.html
Normal file
@ -0,0 +1,79 @@
|
||||
<a name="ALM-12067"></a><a name="ALM-12067"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12067 Tomcat Resource Is Abnormal</h1>
|
||||
<div id="body1547168282144"><div class="section" id="ALM-12067__section10369415133116"><h4 class="sectiontitle">Description</h4><p id="ALM-12067__p50249318">HA checks the Tomcat resources of Manager every 85 seconds. This alarm is generated when HA detects that the Tomcat resources are abnormal in two consecutive checks.</p>
|
||||
<p id="ALM-12067__p49590684">This alarm is cleared when HA detects that the Tomcat resources become normal.</p>
|
||||
<p id="ALM-12067__p79241142103811"><strong id="ALM-12067__b7477489024496">Resource Type</strong> of Tomcat is <strong id="ALM-12067__b14760897284496">Single-active</strong>. An active/standby switchover is triggered when a resource exception occurs. By the time this alarm is generated, the switchover is complete and new Tomcat resources have been started on the new active Manager, so the alarm is cleared. The alarm serves to notify users of the cause of the active/standby Manager switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section8323192410322"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12067__table1479793583212" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12067__row107991735133210"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12067__p18799183583212">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12067__p1680123511326">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12067__p1980173523217">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12067__row880183517329"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12067__p108014356328">12067</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12067__p19802163593213">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12067__p880215356323">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section652875914327"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12067__table1090459143316" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12067__row190429173313"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12067__p129062911339">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12067__p10906093332">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12067__row1287182713614"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12067__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12067__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12067__row18907109203311"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12067__p99095916333">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12067__p4909159173310">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12067__row4910691332"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12067__p39101953320">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12067__p5911189173310">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12067__row59118923315"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12067__p0912169123319">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12067__p169131916332">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section2990133614335"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12067__ul25260697"><li id="ALM-12067__li26019688">The active/standby Manager switchover occurs.</li><li id="ALM-12067__li32850608">The Tomcat process repeatedly restarts.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12067__ul12589142315014"><li id="ALM-12067__li3591142315501">The Tomcat directory permission is abnormal, and the Tomcat process is abnormal.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section071212121445"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12067__p3197164020479"><strong id="ALM-12067__b64575930152820">Check whether the permission on the Tomcat directory is normal.</strong></p>
|
||||
<ol id="ALM-12067__ol01141266283"><li id="ALM-12067__li111412602820"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12067__image10114162611289" src="en-us_image_0263895412.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12067__li2011432610283"><span>Log in to the alarm host as user <strong id="ALM-12067__b2011452617286">root</strong>. <span id="ALM-12067__text65184518511"></span></span></li><li id="ALM-12067__li6114182682819"><span>Run the <strong id="ALM-12067__b101141226122818">su - omm</strong> command to switch to user <strong id="ALM-12067__b1740514446548">omm</strong>.</span></li><li id="ALM-12067__li181141726192815"><span>Run the <strong id="ALM-12067__b19114226192818">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/tomcat.log</strong> command to check whether the Tomcat resource log contains keyword <strong id="ALM-12067__b61141926122811">Cannot find <em id="ALM-12067__i1163833916240">XXX</em></strong> and rectify the file permission based on the keyword.</span></li><li id="ALM-12067__li51141626202816"><span>After 5 minutes, check whether the alarm is automatically cleared. </span><p><ul class="subitemlist" id="ALM-12067__ul911415261288"><li id="ALM-12067__li911492612811">If yes, no further action is required.</li><li id="ALM-12067__li1711402612820">If no, go to <a href="#ALM-12067__li711211264288">6</a>.</li></ul>
|
||||
</p></li></ol>
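<p>Instead of opening the log in <strong>vi</strong>, the keyword search in step 4 can be sketched with <strong>grep</strong>. <strong>find_missing_paths</strong> is a hypothetical helper name. The default log path is the one given in step 4 and assumes that <strong>BIGDATA_LOG_HOME</strong> is set in the environment of user <strong>omm</strong>.</p>

```shell
#!/bin/bash
# Hypothetical helper for illustration only; it is not part of MRS.

# Print every "Cannot find ..." line of the Tomcat resource log,
# prefixed with its line number.
find_missing_paths() {
    local log_file="${1:-$BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/tomcat.log}"
    grep -n 'Cannot find' "$log_file"
}
```

<p>Each matching line names a file or directory whose permission needs to be checked. If the function prints nothing, the log does not contain the keyword.</p>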
|
||||
<p id="ALM-12067__p124132216288"><strong id="ALM-12067__b1967293410811">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12067__ol7112102616281"><li class="subitemlist" id="ALM-12067__li711211264288"><a name="ALM-12067__li711211264288"></a><a name="li711211264288"></a><span>On FusionInsight Manager, choose <strong id="ALM-12067__b8360182718578">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12067__b536002785718">Log</strong> > <strong id="ALM-12067__b33611827205714">Download</strong>.</span></li><li id="ALM-12067__li31126266289"><span>In the <strong id="ALM-12067__b1071163118573">Services</strong> area, select <strong id="ALM-12067__b3821031155716">OmmServer</strong> and <strong id="ALM-12067__b68263135711">Tomcat</strong>, and click <strong id="ALM-12067__b682931185716">OK</strong>.</span></li><li id="ALM-12067__li2011292612815"><span>Click <span><img id="ALM-12067__image51121126122816" src="en-us_image_0263895407.png"></span> in the upper right corner, and set <strong id="ALM-12067__b55511722583">Start Date</strong> and <strong id="ALM-12067__b17552923588">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12067__b8552127586">Download</strong>.</span></li><li id="ALM-12067__li15112192672816"><span>Contact <span id="ALM-12067__text1694528635">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12067__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12067__section8222143110380"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12067__p3223173183819">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12068.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12068"></a><a name="ALM-12068"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12068 ACS Resource Exception</h1>
|
||||
<div id="body1547192669430"><div class="section" id="ALM-12068__section10369415133116"><h4 class="sectiontitle">Description</h4><p id="ALM-12068__p50249318">HA checks the ACS resources of Manager every 80 seconds. This alarm is generated when HA detects that the ACS resources are abnormal in two consecutive checks.</p>
|
||||
<p id="ALM-12068__p49590684">This alarm is cleared when HA detects that the ACS resources are normal.</p>
|
||||
<p id="ALM-12068__p79241142103811"><strong id="ALM-12068__b20074010264505">Resource Type</strong> of ACS is <strong id="ALM-12068__b9447829754505">Single-active</strong>. An active/standby switchover is triggered when a resource exception occurs. By the time this alarm is generated, the switchover is complete and new ACS resources have been started on the new active Manager, so the alarm is cleared. The alarm serves to notify users of the cause of the active/standby Manager switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section8323192410322"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12068__table1479793583212" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12068__row107991735133210"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12068__p18799183583212">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12068__p1680123511326">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12068__p1980173523217">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12068__row880183517329"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12068__p108014356328">12068</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12068__p19802163593213">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12068__p880215356323">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section652875914327"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12068__table1090459143316" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12068__row190429173313"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12068__p129062911339">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12068__p10906093332">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12068__row1399511218366"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12068__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12068__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12068__row18907109203311"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12068__p99095916333">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12068__p4909159173310">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12068__row4910691332"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12068__p39101953320">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12068__p5911189173310">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12068__row59118923315"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12068__p0912169123319">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12068__p169131916332">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section2990133614335"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12068__ul25260697"><li id="ALM-12068__li26019688">The active/standby Manager switchover occurs.</li><li id="ALM-12068__li32850608">The ACS process repeatedly restarts, which may cause the FusionInsight Manager login failure.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12068__p610083015544">The ACS process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section5440125035617"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12068__p8324186"><strong id="ALM-12068__b15118501163833">Check whether the ACS process is normal.</strong></p>
|
||||
<ol id="ALM-12068__ol5558276163811"><li id="ALM-12068__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12068__image168221113135319" src="en-us_image_0263895733.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12068__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12068__b1241211221169">root</strong>. <span id="ALM-12068__text1942962220620"></span></span></li><li id="ALM-12068__li17626636132716"><span>Run the <strong id="ALM-12068__b8588144553112">su - omm</strong> command and then <strong id="ALM-12068__b32015537163811">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> to check whether the status of the ACS resources managed by the HA is normal. In the single-node system, the ACS resource is in the normal state. In the dual-node system, the ACS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12068__ul66289368274"><li id="ALM-12068__li1062811360271">If yes, go to <a href="#ALM-12068__li6152360163635">6</a>.</li><li id="ALM-12068__li46281436112719">If no, go to <a href="#ALM-12068__li139657016249">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12068__li139657016249"><a name="ALM-12068__li139657016249"></a><a name="li139657016249"></a><span>Run the <strong id="ALM-12068__b20158102319162">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/acs.log</strong> command to check whether the ACS resource log of HA contains the keyword <strong id="ALM-12068__b12635154014714">ERROR</strong>. If yes, analyze the logs to locate the resource exception cause and fix the exception.</span></li><li id="ALM-12068__li14736019164314"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12068__ul473671984320"><li id="ALM-12068__li9736151912432">If yes, no further action is required.</li><li id="ALM-12068__li4736141910439">If no, go to <a href="#ALM-12068__li6152360163635">6</a>.</li></ul>
|
||||
</p></li></ol>
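<p>For step 4, a long ACS resource log can be narrowed down to its most recent error entries before analysis. The sketch below is only one possible approach. <strong>recent_errors</strong> is a hypothetical helper name, and the default path is the one given in step 4, which assumes that <strong>BIGDATA_LOG_HOME</strong> is set.</p>

```shell
#!/bin/bash
# Hypothetical helper for illustration only; it is not part of MRS.

# Print the last N lines that contain ERROR, with their line numbers.
#   $1: log file (defaults to the ACS resource log of HA)
#   $2: number of entries to show (defaults to 20)
recent_errors() {
    local log_file="${1:-$BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/acs.log}"
    local count="${2:-20}"
    grep -n 'ERROR' "$log_file" | tail -n "$count"
}
```

<p>The line numbers in the output make it easier to jump to the surrounding context in the full log, for example with <strong>vi +<em>N</em></strong>.</p>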
|
||||
<p id="ALM-12068__p3652216163758"><strong id="ALM-12068__b26858758163828">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12068__ol26111342163819"><li id="ALM-12068__li6152360163635"><a name="ALM-12068__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12068__b198926401682">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12068__b16892134019819">Log</strong> > <strong id="ALM-12068__b1789318401185">Download</strong>.</span></li><li id="ALM-12068__li55371246163635"><span>In the <strong id="ALM-12068__b8713343188">Services</strong> area, select <strong id="ALM-12068__b1272114432815">Controller</strong> and <strong id="ALM-12068__b1872120439817">OmmServer</strong>, and click <strong id="ALM-12068__b177222431087">OK</strong>.</span></li><li id="ALM-12068__li28579174163635"><span>Click <span><img id="ALM-12068__image69691781225" src="en-us_image_0263895594.png"></span> in the upper right corner, and set <strong id="ALM-12068__b1482814481884">Start Date</strong> and <strong id="ALM-12068__b68291648584">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12068__b1382914812818">Download</strong>.</span></li><li id="ALM-12068__li33211732163635"><span>Contact <span id="ALM-12068__text21221703916">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section129720811223"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12068__p19973168152211">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12068__section3193699"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12068__p23142944">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12069.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12069"></a><a name="ALM-12069"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12069 AOS Resource Exception</h1>
|
||||
<div id="body1547192891145"><div class="section" id="ALM-12069__section10369415133116"><h4 class="sectiontitle">Description</h4><p id="ALM-12069__p50249318">HA checks the AOS resources of Manager every 81 seconds. This alarm is generated when HA detects that the AOS resources are abnormal two consecutive times.</p>
|
||||
<p id="ALM-12069__p49590684">This alarm is cleared when HA detects that the AOS resources become normal.</p>
|
||||
<p id="ALM-12069__p79241142103811"><strong id="ALM-12069__b146243299544538">Resource Type</strong> of AOS is <strong id="ALM-12069__b188142279344538">Single-active</strong>. An active/standby switchover is triggered upon resource exceptions. When this alarm is generated, the active/standby switchover is complete and new AOS resources have been enabled on the new active Manager. In this case, this alarm is cleared. This alarm is used to notify users of the cause of the active/standby Manager switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section8323192410322"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12069__table1479793583212" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12069__row107991735133210"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12069__p18799183583212">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12069__p1680123511326">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12069__p1980173523217">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12069__row880183517329"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12069__p108014356328">12069</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12069__p19802163593213">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12069__p880215356323">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section652875914327"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12069__table1090459143316" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12069__row190429173313"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12069__p129062911339">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12069__p10906093332">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12069__row17451710103612"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12069__p17935380415">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12069__p187931338134115">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12069__row18907109203311"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12069__p99095916333">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12069__p4909159173310">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12069__row4910691332"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12069__p39101953320">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12069__p5911189173310">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12069__row59118923315"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12069__p0912169123319">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12069__p169131916332">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section2990133614335"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12069__ul25260697"><li id="ALM-12069__li26019688">The active/standby Manager switchover occurs.</li><li id="ALM-12069__li32850608">The AOS process repeatedly restarts, which may cause the FusionInsight Manager login failure.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12069__p14940123162411">The AOS process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section1541443812244"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12069__p8324186"><strong id="ALM-12069__b15118501163833">Check whether the AOS process is normal.</strong></p>
|
||||
<ol id="ALM-12069__ol5558276163811"><li id="ALM-12069__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12069__image168221113135319" src="en-us_image_0263895369.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12069__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12069__b96866141813">root</strong>. <span id="ALM-12069__text116882111811"></span></span></li><li id="ALM-12069__li17626636132716"><span>Run the <strong id="ALM-12069__b199545565144538">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the AOS resources managed by the HA is normal. In the single-node system, the AOS resource is in the normal state. In the dual-node system, the AOS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12069__ul66289368274"><li id="ALM-12069__li1062811360271">If yes, go to <a href="#ALM-12069__li6152360163635">6</a>.</li><li id="ALM-12069__li46281436112719">If no, go to <a href="#ALM-12069__li139657016249">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12069__li139657016249"><a name="ALM-12069__li139657016249"></a><a name="li139657016249"></a><span>Run the <strong id="ALM-12069__b15175108193211">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/aos.log</strong> command to check whether the AOS resource log of HA contains the keyword <strong id="ALM-12069__b1918314817326">ERROR</strong>. If yes, analyze the logs to locate the resource exception cause and fix the exception.</span></li><li id="ALM-12069__li14736019164314"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12069__ul473671984320"><li id="ALM-12069__li9736151912432">If yes, no further action is required.</li><li id="ALM-12069__li4736141910439">If no, go to <a href="#ALM-12069__li6152360163635">6</a>.</li></ul>
|
||||
</p></li></ol>
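<p>The log check in step 4 can also be scripted. The following is a minimal sketch, not part of the product tooling: the function name <strong>find_error_lines</strong> is hypothetical, and the script assumes that <strong>BIGDATA_LOG_HOME</strong> is set in the shell environment on the alarm host, as the <strong>vi</strong> command above implies.</p>

```python
import os
from pathlib import Path


def find_error_lines(log_path, keyword="ERROR"):
    """Return (line_number, text) pairs for every line that contains the keyword."""
    matches = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if keyword in line:
                matches.append((number, line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    # Assumption: BIGDATA_LOG_HOME is exported in the login shell on the alarm host.
    log_home = os.environ.get("BIGDATA_LOG_HOME")
    if log_home is None:
        print("BIGDATA_LOG_HOME is not set; run this on the alarm host.")
    else:
        log_file = Path(log_home) / "omm" / "oms" / "ha" / "scriptlog" / "aos.log"
        if log_file.is_file():
            for number, text in find_error_lines(log_file):
                print(f"{number}: {text}")
        else:
            print(f"Log file not found: {log_file}")
```

<p>The script only lists the matching lines. Analyzing them to locate the cause of the resource exception remains a manual step.</p>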
|
||||
<p id="ALM-12069__p3652216163758"><strong id="ALM-12069__b26858758163828">Collect the fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12069__ol26111342163819"><li id="ALM-12069__li6152360163635"><a name="ALM-12069__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12069__b4651852193219">O&M</strong>. In the navigation pane on the left, choose <strong id="ALM-12069__b76526528326">Log</strong> > <strong id="ALM-12069__b46521552153219">Download</strong>.</span></li><li id="ALM-12069__li55371246163635"><span>In the <strong id="ALM-12069__b118685519325">Services</strong> area, select <strong id="ALM-12069__b1586165523216">Controller</strong> and <strong id="ALM-12069__b1686155512326">OmmServer</strong>, and click <strong id="ALM-12069__b5861955163217">OK</strong>.</span></li><li id="ALM-12069__li28579174163635"><span>Click <span><img id="ALM-12069__image69691781225" src="en-us_image_0263895883.png"></span> in the upper right corner, and set <strong id="ALM-12069__b182615123314">Start Date</strong> and <strong id="ALM-12069__b102629118330">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12069__b62621211335">Download</strong>.</span></li><li id="ALM-12069__li33211732163635"><span>Contact <span id="ALM-12069__text5719151393316">O&M personnel</span> and provide the collected logs.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section129720811223"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12069__p19973168152211">This alarm is automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12069__section3193699"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12069__p23142944">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12070.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12070"></a><a name="ALM-12070"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12070 Controller Resource Is Abnormal</h1>
|
||||
<div id="body1547192931355"><div class="section" id="ALM-12070__section2747821101717"><h4 class="sectiontitle">Alarm Description</h4><p id="ALM-12070__p868415518212">HA checks the Controller resources of Manager every 80 seconds. This alarm is generated when HA detects that the Controller resources are abnormal two consecutive times.</p>
|
||||
<p id="ALM-12070__p06843510216">This alarm is cleared when the Controller resource is normal.</p>
|
||||
<p id="ALM-12070__p14684251525"><strong id="ALM-12070__b06841253216">Resource Type</strong> of Controller is <strong id="ALM-12070__b1068413515211">Single-active</strong>. An active/standby switchover is triggered upon resource exceptions. When this alarm is generated, the active/standby switchover is complete and new Controller resources have been enabled on the new active FusionInsight Manager. In this case, this alarm is cleared. This alarm is used to notify users of the cause of the active/standby switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section127478213171"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12070__table7749721191719" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12070__row6867152161714"><th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12070__p03908133538">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="34.34343434343434%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12070__p239001375320">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="31.313131313131308%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12070__p1939041395319">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12070__row586722117171"><td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12070__p33906131535">12070</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="34.34343434343434%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12070__p73902013155315">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="31.313131313131308%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12070__p1539021315312">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section14755172115173"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12070__table17756521131714" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12070__row18671421131712"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12070__p786772121719">Parameter</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12070__p286742191711">Description</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12070__row1561019154012"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12070__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12070__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12070__row786710211177"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12070__p58673218178">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12070__p4868821191713">Specifies the name of the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12070__row286819215174"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12070__p186818216176">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12070__p7868721131716">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12070__row14868221161713"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12070__p1986842116171">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12070__p10868132118175">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section1776462111715"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12070__ul57451821824"><li id="ALM-12070__li6745132115216">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12070__li1274512211725">The Controller process repeatedly restarts, which may cause the FusionInsight Manager login failure.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section6765152119174"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12070__p3608142911216">The Controller process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section87667210173"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12070__p8711155114555"><strong id="ALM-12070__b598112553552">Check whether the controller process is normal.</strong></p>
|
||||
<ol id="ALM-12070__ol12903923231"><li id="ALM-12070__li1890320236317"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the name of the host for which the alarm is generated.</span></li><li id="ALM-12070__li2903122315315"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12070__b1190319231738">root</strong>. <span id="ALM-12070__text985593916354"></span></span></li><li id="ALM-12070__li47901049125519"><span>Run the <strong id="ALM-12070__b171639418567">su - omm</strong> command to switch to user <strong id="ALM-12070__b1482687105615">omm</strong>. Run the <strong id="ALM-12070__b590314231316">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the Controller resources managed by HA is normal. In the single-node system, the Controller resource is in the normal state. In the dual-node system, the Controller resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul id="ALM-12070__ul1490314236318"><li id="ALM-12070__li090332320310">If it is, go to <a href="#ALM-12070__li69038231234">6</a>.</li><li id="ALM-12070__li9903192317316">If it is not, go to <a href="#ALM-12070__li6903202312318">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12070__li6903202312318"><a name="ALM-12070__li6903202312318"></a><a name="li6903202312318"></a><span>Run the <strong id="ALM-12070__b16903112313312">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/controller.log</strong> command to view the Controller resource logs, and run the <strong id="ALM-12070__b290310231836">vi $BIGDATA_LOG_HOME/controller/controller.log</strong> command to view the Controller running logs. Check whether the keyword <strong id="ALM-12070__b9187145311439">ERROR</strong> exists. If it does, analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12070__li1590310231933"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12070__ul209032231431"><li id="ALM-12070__li199031823835">If it is, no further action is required.</li><li id="ALM-12070__li159039231338">If it is not, go to <a href="#ALM-12070__li69038231234">6</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12070__p13421113195811"><strong id="ALM-12070__b204218131586">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12070__ol39031423835"><li id="ALM-12070__li69038231234"><a name="ALM-12070__li69038231234"></a><a name="li69038231234"></a><span>On FusionInsight Manager, choose <strong id="ALM-12070__b590352315317">O&M</strong> > <strong id="ALM-12070__b59030233320">Log</strong> > <strong id="ALM-12070__b1290362318319">Download</strong>.</span></li><li id="ALM-12070__li18903202318317"><span>Select <strong id="ALM-12070__b6883925124310">Controller</strong> and <strong id="ALM-12070__b1588372554312">OmmServer</strong> for <strong id="ALM-12070__b890312231830">Service</strong> and click <strong id="ALM-12070__b3991118545">OK</strong>.</span></li><li id="ALM-12070__li18903523531"><span>Click <span><img id="ALM-12070__image18903132310317" src="en-us_image_0269383915.png"></span> in the upper right corner, and set <strong id="ALM-12070__b129031823137">Start Date</strong> and <strong id="ALM-12070__b1990322312314">End Date</strong> for log collection to 1 hour before and after the alarm generation time, respectively. Then, click <strong id="ALM-12070__b4903132312320">Download</strong>.</span></li><li id="ALM-12070__li495644512588"><span>Contact the <span id="ALM-12070__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
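<p>The log collection window in step 8 is simple date arithmetic. The sketch below works it out for a sample alarm time. The function name <strong>log_collection_window</strong> and the sample timestamp are illustrative assumptions only and are not part of FusionInsight Manager.</p>

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time, margin=timedelta(hours=1)):
    """Return (start, end) covering the margin before and after the alarm time."""
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time, used only to illustrate the calculation.
    alarm_time = datetime(2024, 1, 1, 0, 30)
    start, end = log_collection_window(alarm_time)
    print("Start Date:", start.isoformat(sep=" "))
    print("End Date:  ", end.isoformat(sep=" "))
```

<p>Note that an alarm generated shortly after midnight yields a start time on the previous day, which is easy to overlook when the dates are entered by hand.</p>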
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12070__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12070__section117861721171717"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12070__p3869621161713">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12071.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12071"></a><a name="ALM-12071"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12071 Httpd Resource Is Abnormal</h1>
|
||||
<div id="body1547193420658"><div class="section" id="ALM-12071__section1873012221819"><h4 class="sectiontitle">Description</h4><p id="ALM-12071__p1795612211819">HA checks the httpd resources of Manager every 120 seconds. This alarm is generated when HA detects that the httpd resources are abnormal 10 consecutive times.</p>
|
||||
<p id="ALM-12071__p14956102261815">This alarm is cleared when the httpd resource is normal.</p>
|
||||
<p id="ALM-12071__p1495610221182"><strong id="ALM-12071__b9956132220185">Resource Type</strong> of httpd is <strong id="ALM-12071__b1195615225187">Single-active</strong>. An active/standby switchover is triggered upon resource exceptions. When this alarm is generated, the active/standby switchover is complete and new httpd resources have been enabled on the new active FusionInsight Manager. In this case, this alarm is cleared. This alarm is used to notify users of the cause of the active/standby switchover.</p>
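<p>The "abnormal N consecutive times" rule can be modeled as a counter. The sketch below is only an illustration of that rule, not the HA implementation. It assumes that a single normal check result resets the counter, and both function names are hypothetical.</p>

```python
def first_alarm_check(results, threshold):
    """Return the 1-based index of the check that raises the alarm, or None.

    results is a sequence of booleans, where True means the check was normal.
    Assumption: any normal result resets the count of consecutive failures.
    """
    streak = 0
    for index, healthy in enumerate(results, start=1):
        streak = 0 if healthy else streak + 1
        if streak >= threshold:
            return index
    return None


def seconds_from_first_failure(interval_seconds, threshold):
    """Time between the first abnormal check and the check that raises the alarm."""
    return (threshold - 1) * interval_seconds


if __name__ == "__main__":
    # Three normal checks followed by ten abnormal ones, with a threshold of 10.
    checks = [True, True, True] + [False] * 10
    print(first_alarm_check(checks, threshold=10))   # 13
    print(seconds_from_first_failure(120, 10))       # 1080
```

<p>Under this simplified model, with a 120-second interval and a threshold of 10, the alarm would be raised about 18 minutes after the first abnormal check. The actual delay depends on how HA schedules its checks.</p>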
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section17732172241819"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12071__table3734112271815" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12071__row1195613222180"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12071__p1695610220189">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12071__p7956182213188">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12071__p14956112221819">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12071__row595613228188"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12071__p1595672217183">12071</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12071__p1695612201813">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12071__p8956162218181">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section1174682271815"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12071__table1374822251819" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12071__row19576225184"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12071__p1695792221820">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12071__p6957132291818">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12071__row3135141314405"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12071__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12071__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12071__row169571122121811"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12071__p795702218183">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12071__p1795716222185">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12071__row16957192271810"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12071__p16957122161813">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12071__p14957162221810">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12071__row8957132241814"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12071__p29571822111811">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12071__p12957192213186">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section167631322161811"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12071__ul16957122212187"><li id="ALM-12071__li9957322141814">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12071__li2957162220182">The httpd process repeatedly restarts, which may make the native service UI inaccessible.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section17770132211810"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12071__p480785161917">The httpd process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section17774192220180"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12071__p5957822201812"><strong id="ALM-12071__b1595714229188">Check whether the httpd process is abnormal.</strong></p>
|
||||
<ol id="ALM-12071__ol108431951181816"><li id="ALM-12071__li11843165119182"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the name of the host for which the alarm is generated.</span></li><li id="ALM-12071__li8843105111811"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12071__b584335112186">root</strong>. <span id="ALM-12071__text985593916354"></span></span></li><li id="ALM-12071__li214051915720"><span>Run the <strong id="ALM-12071__b43699226576">su - omm</strong> command to switch to user <strong id="ALM-12071__b9147527105718">omm</strong>.</span></li><li id="ALM-12071__li4843951151819"><span>Run the <strong id="ALM-12071__b3843205114186">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the httpd resources managed by the HA is normal. In the single-node system, the httpd resource is in the normal state. In the dual-node system, the httpd resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul id="ALM-12071__ul14843105113181"><li id="ALM-12071__li88432515181">If it is, go to <a href="#ALM-12071__li384145118188">7</a>.</li><li id="ALM-12071__li158431517188">If it is not, go to <a href="#ALM-12071__li584395101819">5</a>.</li></ul>
|
||||
</p></li><li id="ALM-12071__li584395101819"><a name="ALM-12071__li584395101819"></a><a name="li584395101819"></a><span>Run the <strong id="ALM-12071__b6843951201818">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/httpd.log</strong> command to view the httpd resource logs, and check whether the keyword <strong id="ALM-12071__b9187145311439">ERROR</strong> exists. If it does, analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12071__li118438511180"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12071__ul1484315115185"><li id="ALM-12071__li3843175115182">If it is, no further action is required.</li><li id="ALM-12071__li1184355116180">If it is not, go to <a href="#ALM-12071__li384145118188">7</a>.</li></ul>
|
||||
</p></li></ol>
|
||||
<p id="ALM-12071__p1674954751819"><strong id="ALM-12071__b149571522171815">Collect fault information.</strong></p>
|
||||
<ol start="7" id="ALM-12071__ol118431551101813"><li id="ALM-12071__li384145118188"><a name="ALM-12071__li384145118188"></a><a name="li384145118188"></a><span>On FusionInsight Manager, choose <strong id="ALM-12071__b884013510187">O&M</strong> > <strong id="ALM-12071__b384045118183">Log</strong> > <strong id="ALM-12071__b8841155115188">Download</strong>.</span></li><li id="ALM-12071__li5841351151811"><span>Select <strong id="ALM-12071__b78412516184">Controller</strong> and <strong id="ALM-12071__b2841175116185">OmmServer</strong> for <strong id="ALM-12071__b18841451201818">Service</strong> and click <strong id="ALM-12071__b3991118545">OK</strong>.</span></li><li id="ALM-12071__li1684175131820"><span>Click <span><img id="ALM-12071__image1084185120186" src="en-us_image_0269383916.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12071__b684175111183">Start Date</strong> and <strong id="ALM-12071__b14841185112187">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12071__b8841551191812">OK</strong>. Then, click <strong id="ALM-12071__b10841155112188">Download</strong>.</span></li><li id="ALM-12071__li495644512588"><span>Contact the <span id="ALM-12071__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section17816122101811"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12071__p1395992212185">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12071__section081882241814"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12071__p1295982214187">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
85
docs/mrs/umn/ALM-12072.html
Normal file
@ -0,0 +1,85 @@
|
||||
<a name="ALM-12072"></a><a name="ALM-12072"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12072 FloatIP Resource Is Abnormal</h1>
|
||||
<div id="body1547193420658"><div class="section" id="ALM-12072__section626017484164"><h4 class="sectiontitle">Description</h4><p id="ALM-12072__p15433448121614">HA checks the FloatIP resources of Manager every 9 seconds. This alarm is generated when HA detects that the FloatIP resources are abnormal three consecutive times.</p>
|
||||
<p id="ALM-12072__p15433154881618">This alarm is cleared when the FloatIP resource is normal.</p>
|
||||
<p id="ALM-12072__p7433164891618"><strong id="ALM-12072__b4433648141611">Resource Type</strong> of FloatIP is <strong id="ALM-12072__b243317487168">Single-active</strong>. An active/standby switchover is triggered upon resource exceptions. When this alarm is generated, the active/standby switchover is complete and new FloatIP resources have been enabled on the new active FusionInsight Manager. In this case, this alarm is cleared. This alarm is used to notify users of the cause of the active/standby switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section626118482164"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12072__table12262104816161" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12072__row2433548161618"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12072__p44341548171620">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12072__p16434148131610">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12072__p643434814161">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12072__row24341648101615"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12072__p1434104841611">12072</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12072__p1143444817168">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12072__p1143414485168">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section1527020486169"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12072__table627119487161" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12072__row44341548111616"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12072__p1543434811620">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12072__p1743434814169">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12072__row1053616815401"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12072__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12072__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12072__row443414481160"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12072__p2434124810167">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12072__p13434048151613">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12072__row343414482164"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12072__p1643434815162">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12072__p1243411487164">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12072__row643494819168"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12072__p18434114851619">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12072__p15434134812161">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section182791448171620"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12072__ul74355481165"><li id="ALM-12072__li1435184812166">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12072__li104351748121610">The FloatIP process repeatedly restarts, which may make the native service UI inaccessible.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section15284948161619"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12072__ul11435134871614"><li id="ALM-12072__li134353483164">The floating IP address is abnormal.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section182871248141613"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12072__p14354485169"><strong id="ALM-12072__b154352487167">Check the floating IP address status of the active management node.</strong></p>
|
||||
<ol id="ALM-12072__ol1726941101718"><li id="ALM-12072__li13268519176"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the address of the host for which the alarm is generated and the resource name.</span></li><li id="ALM-12072__li326811112176"><span>Log in to the active management node as user <strong id="ALM-12072__b122688191718">root</strong>. <span id="ALM-12072__text985593916354"></span><span id="ALM-12072__text3230164484916"></span></span></li><li id="ALM-12072__li67021344840"><span>Run the following commands to switch to user omm and go to the <strong id="ALM-12072__b19704174415416">${BIGDATA_HOME}/om-server/om/sbin/</strong> directory.</span><p><p id="ALM-12072__p6644753175416"><strong id="ALM-12072__b1043540105513">su - omm</strong></p>
|
||||
<p id="ALM-12072__p1435805011578"><strong id="ALM-12072__b18358125075711">cd </strong><strong id="ALM-12072__b113580509579">${BIGDATA_HOME}/om-server/om/sbin/</strong></p>
|
||||
</p></li><li id="ALM-12072__li62687118178"><span>Run the <strong id="ALM-12072__b14833124916718">sh status-oms.sh</strong> command to execute the <strong id="ALM-12072__b12681519174">status-oms.sh</strong> script and check whether the floating IP address of the active FusionInsight Manager is normal. In the command output, locate the row where <strong id="ALM-12072__b202681611179">ResName</strong> is <strong id="ALM-12072__b17268111111713">floatip</strong> and check whether information similar to the following is displayed.</span><p><p id="ALM-12072__p202681101719">For example:</p>
|
||||
<pre class="screen" id="ALM-12072__screen826841131713">10-10-10-160 floatip Normal Normal Single_active</pre>
|
||||
<ul id="ALM-12072__ul1326814118178"><li id="ALM-12072__li42686101713">If it is, go to <a href="#ALM-12072__li726861151715">8</a>.</li><li id="ALM-12072__li1726812111720">If it is not, go to <a href="#ALM-12072__li162681212172">5</a>.</li></ul>
|
||||
</p></li><li id="ALM-12072__li162681212172"><a name="ALM-12072__li162681212172"></a><a name="li162681212172"></a><span>Run the <strong id="ALM-12072__b132681110175">ifconfig </strong>command to check whether the NIC with the floating IP address exists.</span><p><ul id="ALM-12072__ul1268171101715"><li id="ALM-12072__li192680141716">If it does, go to <a href="#ALM-12072__li726861151715">8</a>.</li><li id="ALM-12072__li226812111171">If it does not, go to <a href="#ALM-12072__li19269111111714">6</a>.</li></ul>
|
||||
</p></li><li id="ALM-12072__li19269111111714"><a name="ALM-12072__li19269111111714"></a><a name="li19269111111714"></a><span>Run the <strong id="ALM-12072__b1426819113173">ifconfig</strong> <em id="ALM-12072__i192695111177">NIC name Floating IP address</em> netmask <em id="ALM-12072__i2269181181716">Subnet mask</em> command to reconfigure the NIC with the floating IP address (for example, <strong id="ALM-12072__b2026991161713">ifconfig eth0 10.10.10.102 netmask 255.255.255.0</strong>).</span></li><li id="ALM-12072__li1426917141719"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12072__ul3269113174"><li id="ALM-12072__li5269101141717">If it is, no further action is required.</li><li id="ALM-12072__li152691214173">If it is not, go to <a href="#ALM-12072__li726861151715">8</a>.</li></ul>
|
||||
</p></li></ol>
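<p>The check of the <strong>floatip</strong> row in the <strong>status-oms.sh</strong> output can be sketched as a small script. This is a minimal illustration only: it assumes each resource row has five whitespace-separated columns in the order node name, ResName, resource status, HA status, and resource type, as the sample row above suggests. The function name <strong>floatip_is_normal</strong> is hypothetical and is not part of the product.</p>

```python
def floatip_is_normal(status_output: str) -> bool:
    """Return True if the row whose ResName is 'floatip' reports Normal twice.

    Assumed column layout (inferred from the sample row, not confirmed):
    node name, ResName, resource status, HA status, resource type.
    """
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[1] == "floatip":
            return fields[2] == "Normal" and fields[3] == "Normal"
    # No floatip row was found, so the resource cannot be confirmed as normal.
    return False


sample = "10-10-10-160 floatip Normal Normal Single_active"
print(floatip_is_normal(sample))
```

<p>A result of True corresponds to the "If it is" branch of the status check, and False to the "If it is not" branch.</p>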
|
||||
<p id="ALM-12072__p194344582164"><strong id="ALM-12072__b11436748171614">Collect fault information.</strong></p>
|
||||
<ol start="8" id="ALM-12072__ol1326817181717"><li id="ALM-12072__li726861151715"><a name="ALM-12072__li726861151715"></a><a name="li726861151715"></a><span>On FusionInsight Manager, choose <strong id="ALM-12072__b026812121711">O&M</strong> > <strong id="ALM-12072__b726811111719">Log</strong> > <strong id="ALM-12072__b926841131719">Download</strong>.</span></li><li id="ALM-12072__li162681171713"><span>Select <strong id="ALM-12072__b17268191151713">Controller</strong> and <strong id="ALM-12072__b42681516170">OmmServer</strong> for <strong id="ALM-12072__b112681114179">Service</strong> and click <strong id="ALM-12072__b3991118545">OK</strong>.</span></li><li id="ALM-12072__li1326812151712"><span>Click <span><img id="ALM-12072__image1626812113177" src="en-us_image_0269383917.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12072__b1726819191714">Start Date</strong> and <strong id="ALM-12072__b182681113175">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12072__b2268161191713">OK</strong>. Then, click <strong id="ALM-12072__b1326891101719">Download</strong>.</span></li><li id="ALM-12072__li495644512588"><span>Contact the <span id="ALM-12072__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
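<p>The log collection window (1 hour before and after the alarm generation time) is simple date arithmetic. The sketch below shows one way to compute it; the helper name <strong>log_window</strong> and the alarm time used are made up for illustration.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin=timedelta(hours=1)):
    """Return (start, end) covering one margin before and after the alarm time."""
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, used only for the demonstration.
start, end = log_window(datetime(2024, 1, 15, 3, 20))
print("Start Date:", start)
print("End Date:  ", end)
```

<p>The two values map to the <strong>Start Date</strong> and <strong>End Date</strong> fields in the download dialog box.</p>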
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section1132214841620"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12072__p134361483167">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12072__section18323104816168"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12072__p44361248161610">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12073.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12073"></a><a name="ALM-12073"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12073 CEP Resource Is Abnormal</h1>
|
||||
<div id="body1547193420658"><div class="section" id="ALM-12073__section24601758201512"><h4 class="sectiontitle">Description</h4><p id="ALM-12073__p99041958181518">HA checks the cep resources of Manager every 60 seconds. This alarm is generated when HA detects that the cep resources are abnormal 2 consecutive times.</p>
|
||||
<p id="ALM-12073__p69041558161518">This alarm is cleared when the CEP resource is normal.</p>
|
||||
<p id="ALM-12073__p1490411584158"><strong id="ALM-12073__b79043583158">Resource Type</strong> of CEP is <strong id="ALM-12073__b1904175821515">Single-active</strong>. A resource exception triggers an active/standby switchover. By the time this alarm is generated, the active/standby switchover is complete and new CEP resources have been enabled on the new active FusionInsight Manager, so the alarm is cleared. The alarm serves to notify users of the cause of the active/standby switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section3467175816152"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12073__table547145861512" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12073__row139041258201517"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12073__p1090465816153">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12073__p990410583158">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12073__p13904175818159">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12073__row1090645820151"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12073__p8906958191515">12073</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12073__p29064580155">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12073__p690616582159">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section1495125801510"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12073__table84982058131516" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12073__row1190635871513"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12073__p20906185812155">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12073__p4906558101516">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12073__row20656733401"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12073__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12073__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12073__row1906758141517"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12073__p9906358171514">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12073__p99061358151511">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12073__row09071558191517"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12073__p18907175815155">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12073__p8907125891519">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12073__row7907135891519"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12073__p3907105841512">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12073__p39071658181515">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section752805841514"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12073__ul109071458171515"><li id="ALM-12073__li2907195814159">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12073__li12907145815158">The CEP process repeatedly restarts, causing monitoring data to be abnormal.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section154117580159"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12073__p13955124019163">The CEP process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section355120589155"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12073__p1090845841520"><strong id="ALM-12073__b15908558141519">Check whether the CEP process is abnormal.</strong></p>
|
||||
<ol id="ALM-12073__ol3262531161613"><li id="ALM-12073__li162612031121620"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the name of the host for which the alarm is generated.</span></li><li id="ALM-12073__li8261133171620"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12073__b1026143111610">root</strong>. <span id="ALM-12073__text985593916354"></span></span></li><li id="ALM-12073__li14261163118165"><span>Run the <strong id="ALM-12073__b2261131171610">su - omm</strong> command and then the <strong id="ALM-12073__b426119312164">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the CEP resources managed by HA is normal. In a single-node system, the CEP resource is in the normal state. In a dual-node system, the CEP resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul id="ALM-12073__ul426133112169"><li id="ALM-12073__li172611231161613">If it is, go to <a href="#ALM-12073__li9258163110165">6</a>.</li><li id="ALM-12073__li12613317162">If it is not, go to <a href="#ALM-12073__li8262123151618">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12073__li8262123151618"><a name="ALM-12073__li8262123151618"></a><a name="li8262123151618"></a><span>Run the <strong id="ALM-12073__b1026193171612">vi $BIGDATA_LOG_HOME/omm/oms/cep/cep.log </strong>and <strong id="ALM-12073__b1226213316168">vi $BIGDATA_LOG_HOME/omm/oms/cep/scriptlog/cep_ha.log </strong>commands to view the CEP resource logs and check whether the keyword <strong id="ALM-12073__b9187145311439">ERROR</strong> exists. If it does, analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12073__li132629311160"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12073__ul6262831171619"><li id="ALM-12073__li16262153141620">If it is, no further action is required.</li><li id="ALM-12073__li826216312163">If it is not, go to <a href="#ALM-12073__li9258163110165">6</a>.</li></ul>
|
||||
</p></li></ol>
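<p>Searching the resource logs for the keyword <strong>ERROR</strong> can also be scripted instead of done by hand in <strong>vi</strong>. The following is a minimal sketch, not a product tool: the helper name <strong>find_keyword_lines</strong> is hypothetical, and the demonstration writes an invented three-line log to a temporary file because the real log line format is not described here.</p>

```python
import os
import tempfile


def find_keyword_lines(log_path, keyword="ERROR"):
    """Return a list of (line_number, text) for every line containing keyword."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        for number, text in enumerate(log_file, start=1):
            if keyword in text:
                hits.append((number, text.rstrip("\n")))
    return hits


# Demonstration on a throwaway file; the log lines below are invented.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_log = os.path.join(tmp_dir, "cep.log")
    with open(demo_log, "w", encoding="utf-8") as handle:
        handle.write("INFO start\nERROR sample failure\nINFO done\n")
    demo_hits = find_keyword_lines(demo_log)

print(demo_hits)
```

<p>On a real node, the path passed to the function would be one of the log files named in the step above, with <strong>$BIGDATA_LOG_HOME</strong> expanded, for example by using <strong>os.path.expandvars</strong>.</p>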
|
||||
<p id="ALM-12073__p10254192814164"><strong id="ALM-12073__b3909105810152">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12073__ol526063113163"><li id="ALM-12073__li9258163110165"><a name="ALM-12073__li9258163110165"></a><a name="li9258163110165"></a><span>On FusionInsight Manager, choose <strong id="ALM-12073__b1125815315166">O&M</strong> > <strong id="ALM-12073__b1525823113164">Log</strong> > <strong id="ALM-12073__b625823114166">Download</strong>.</span></li><li id="ALM-12073__li18258163151613"><span>Select <strong id="ALM-12073__b9258163115162">Controller</strong> and <strong id="ALM-12073__b22584311164">OmmServer</strong> for <strong id="ALM-12073__b2258831111615">Service</strong> and click <strong id="ALM-12073__b3991118545">OK</strong>.</span></li><li id="ALM-12073__li12260531161614"><span>Click <span><img id="ALM-12073__image126014312167" src="en-us_image_0269383918.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12073__b19260123119168">Start Date</strong> and <strong id="ALM-12073__b32609319168">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12073__b626015316169">OK</strong>. Then, click <strong id="ALM-12073__b1426043161612">Download</strong>.</span></li><li id="ALM-12073__li495644512588"><span>Contact the <span id="ALM-12073__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section9650125851520"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12073__p1909195801515">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12073__section56541158121513"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12073__p3909175816157">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
81
docs/mrs/umn/ALM-12074.html
Normal file
@ -0,0 +1,81 @@
|
||||
<a name="ALM-12074"></a><a name="ALM-12074"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12074 FMS Resource Is Abnormal</h1>
|
||||
<div id="body1547193420658"><div class="section" id="ALM-12074__section1025315248149"><h4 class="sectiontitle">Description</h4><p id="ALM-12074__p12471152414146">HA checks the fms resources of Manager every 60 seconds. This alarm is generated when HA detects that the fms resources are abnormal 2 consecutive times.</p>
|
||||
<p id="ALM-12074__p9471162418140">This alarm is cleared when the FMS resource is normal.</p>
|
||||
<p id="ALM-12074__p1471142491415"><strong id="ALM-12074__b134712246140">Resource Type</strong> of FMS is <strong id="ALM-12074__b1947192491410">Single-active</strong>. A resource exception triggers an active/standby switchover. By the time this alarm is generated, the active/standby switchover is complete and new FMS resources have been enabled on the new active FusionInsight Manager, so the alarm is cleared. The alarm serves to notify users of the cause of the active/standby switchover.</p>
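<p>The check interval and the number of consecutive abnormal results give a rough idea of how long a fault lasts before the alarm is raised. The sketch below is an estimate under stated assumptions, not documented product behavior: it assumes the checks are evenly spaced and that each check completes instantly. The function name <strong>detection_window</strong> is hypothetical.</p>

```python
def detection_window(interval_s, consecutive):
    """Return (span, worst_case) in seconds.

    span:       time between the first and the last abnormal check.
    worst_case: longest delay from fault onset to the last abnormal check,
                assuming the fault starts just after a successful check.
    """
    span = interval_s * (consecutive - 1)
    worst_case = interval_s * consecutive
    return span, worst_case


# FMS values from the description: checked every 60 seconds, 2 consecutive times.
print(detection_window(60, 2))
```

<p>Under these assumptions, an FMS fault is reported between about 60 and 120 seconds after it begins.</p>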
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section1925572441410"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12074__table12256142420143" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12074__row7471124161414"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12074__p104711624121419">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12074__p7471152413145">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12074__p647112412141">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12074__row14471122471411"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12074__p84711924121412">12074</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12074__p1471192414140">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12074__p7471162411146">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section152685245145"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12074__table326913248146" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12074__row84712024201410"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12074__p74731224171416">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12074__p147302411144">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12074__row132191557113913"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12074__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12074__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12074__row1047362410144"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12074__p184734246148">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12074__p147316247147">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12074__row947352441414"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12074__p647382418147">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12074__p1473122414147">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12074__row347313246144"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12074__p64737245144">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12074__p104739246146">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section182821024171411"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12074__ul147362419144"><li id="ALM-12074__li1847362418142">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12074__li20473132411147">The FMS process repeatedly restarts. As a result, alarm information may fail to be reported.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section192899247141"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12074__p360513492551">The FMS process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section11292132412149"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12074__p3473172411416"><strong id="ALM-12074__b94731524131416">Check whether the FMS process is abnormal.</strong></p>
|
||||
<ol id="ALM-12074__ol7833539131412"><li id="ALM-12074__li2083323971413"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the name of the host for which the alarm is generated.</span></li><li id="ALM-12074__li683353971413"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12074__b78338399141">root</strong>. <span id="ALM-12074__text985593916354"></span></span></li><li id="ALM-12074__li3833339151414"><span>Run the <strong id="ALM-12074__b38330393148">su - omm</strong> command and then the <strong id="ALM-12074__b78339393148">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the FMS resources managed by HA is normal. In a single-node system, the FMS resource is in the normal state. In a dual-node system, the FMS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul id="ALM-12074__ul178331739111416"><li id="ALM-12074__li198331739171419">If it is, go to <a href="#ALM-12074__li5828173931412">6</a>.</li><li id="ALM-12074__li12833193919144">If it is not, go to <a href="#ALM-12074__li1183383931416">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12074__li1183383931416"><a name="ALM-12074__li1183383931416"></a><a name="li1183383931416"></a><span>Run the <strong id="ALM-12074__b783323918148">vi $BIGDATA_LOG_HOME/omm/oms/fms/fms.log </strong>and <strong id="ALM-12074__b108331539101416">vi $BIGDATA_LOG_HOME/omm/oms/fms/scriptlog/fms_ha.log </strong>commands to view the FMS resource logs and check whether the keyword <strong id="ALM-12074__b9187145311439">ERROR</strong> exists. If it does, analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12074__li4833133971410"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12074__ul1983383991412"><li id="ALM-12074__li983311395149">If it is, no further action is required.</li><li id="ALM-12074__li0833103914141">If it is not, go to <a href="#ALM-12074__li5828173931412">6</a>.</li></ul>
|
||||
</p></li></ol>
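<p>The rule for judging the resource state reported by <strong>status_ha.sh</strong> depends on the deployment mode and on the node role. It can be written down as a short sketch. The state strings "normal" and "stopped" and both function names are assumptions made for illustration; the exact wording in the script output may differ and should be matched to what the script actually prints.</p>

```python
def expected_resource_state(dual_node, is_active):
    """Return the state a single-active resource should be in on this node.

    Single-node system: normal.
    Dual-node system:   normal on the active node, stopped on the standby node.
    """
    if not dual_node or is_active:
        return "normal"
    return "stopped"


def state_is_expected(observed, dual_node, is_active):
    """Compare an observed state string (case-insensitive) with the expected one."""
    return observed.strip().lower() == expected_resource_state(dual_node, is_active)


# A stopped resource on the standby node of a dual-node system is healthy.
print(state_is_expected("Stopped", dual_node=True, is_active=False))
```

<p>A stopped resource on the active node, or on the only node of a single-node system, is the case that calls for the log analysis in the next step.</p>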
|
||||
<p id="ALM-12074__p590913362141"><strong id="ALM-12074__b2474172420144">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12074__ol1683343918146"><li id="ALM-12074__li5828173931412"><a name="ALM-12074__li5828173931412"></a><a name="li5828173931412"></a><span>On FusionInsight Manager, choose <strong id="ALM-12074__b18828113913148">O&M</strong> > <strong id="ALM-12074__b98286392144">Log</strong> > <strong id="ALM-12074__b13828203912147">Download</strong>.</span></li><li id="ALM-12074__li383393912140"><span>Select <strong id="ALM-12074__b3828039191417">Controller</strong> and <strong id="ALM-12074__b1583363981411">OmmServer</strong> for <strong id="ALM-12074__b17833639111417">Service</strong> and click <strong id="ALM-12074__b3991118545">OK</strong>.</span></li><li id="ALM-12074__li18833339101411"><span>Click <span><img id="ALM-12074__image1383383917144" src="en-us_image_0269383919.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12074__b783323981410">Start Date</strong> and <strong id="ALM-12074__b1683314393142">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12074__b08331339131417">OK</strong>. Then, click <strong id="ALM-12074__b11833103913145">Download</strong>.</span></li><li id="ALM-12074__li495644512588"><span>Contact the <span id="ALM-12074__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section13393241148"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12074__p34742024121418">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12074__section113411224171413"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12074__p247562413143">None</p>
|
||||
</div>
|
||||
<p id="ALM-12074__p8060118"></p>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
80
docs/mrs/umn/ALM-12075.html
Normal file
@ -0,0 +1,80 @@
|
||||
<a name="ALM-12075"></a><a name="ALM-12075"></a>
|
||||
|
||||
<h1 class="topictitle1">ALM-12075 PMS Resource Is Abnormal</h1>
|
||||
<div id="body1547193420658"><div class="section" id="ALM-12075__section1944518918126"><h4 class="sectiontitle">Description</h4><p id="ALM-12075__p1056011916128">HA checks the pms resources of Manager every 55 seconds. This alarm is generated when HA detects that the pms resources are abnormal three consecutive times.</p>
|
||||
<p id="ALM-12075__p1656012919127">This alarm is cleared when the PMS resource is normal.</p>
|
||||
<p id="ALM-12075__p165609911129"><strong id="ALM-12075__b956029161214">Resource Type</strong> of PMS is <strong id="ALM-12075__b155608991210">Single-active</strong>. A resource exception triggers an active/standby switchover. By the time this alarm is generated, the active/standby switchover is complete and new PMS resources have been enabled on the new active FusionInsight Manager, so the alarm is cleared. The alarm serves to notify users of the cause of the active/standby switchover.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section1744712910128"><h4 class="sectiontitle">Attribute</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12075__table54475911124" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12075__row256039141211"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12075__p656099121220">Alarm ID</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12075__p55611920128">Alarm Severity</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12075__p2056115961218">Auto Clear</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12075__row156189101213"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12075__p185611296125">12075</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12075__p195611798121">Major</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12075__p105612961220">Yes</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section1245369191215"><h4 class="sectiontitle">Parameters</h4>
|
||||
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12075__table1245418918120" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12075__row1856115917125"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12075__p756112916126">Name</p>
|
||||
</th>
|
||||
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12075__p185611194128">Meaning</p>
|
||||
</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody><tr id="ALM-12075__row83772527399"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12075__p192431315431">Source</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12075__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12075__row656110912128"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12075__p195611951210">ServiceName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12075__p105611292121">Specifies the service for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12075__row175610991210"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12075__p5561159101211">RoleName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12075__p25612951213">Specifies the role for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
<tr id="ALM-12075__row956114921215"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12075__p756139201217">HostName</p>
|
||||
</td>
|
||||
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12075__p1556115941220">Specifies the host for which the alarm is generated.</p>
|
||||
</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section94599971213"><h4 class="sectiontitle">Impact on the System</h4><ul id="ALM-12075__ul1756211913120"><li id="ALM-12075__li14562391129">The active/standby FusionInsight Manager switchover occurs.</li><li id="ALM-12075__li75628981213">The PMS process repeatedly restarts, causing monitoring information to be abnormal.</li></ul>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section646214911216"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12075__p562201075915">The PMS process is abnormal.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section44646913126"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12075__p17562169131217"><strong id="ALM-12075__b2056239121213">Check whether the PMS process is abnormal.</strong></p>
|
||||
<ol id="ALM-12075__ol6884419101214"><li id="ALM-12075__li11884121991216"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the name of the host for which the alarm is generated.</span></li><li id="ALM-12075__li18884171901214"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12075__b1884121916127">root</strong>. <span id="ALM-12075__text985593916354"></span></span></li><li id="ALM-12075__li3884131913125"><span>Run the <strong id="ALM-12075__b388491912124">su - omm</strong> command and then the <strong id="ALM-12075__b198847192127">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the PMS resources managed by HA is normal. In a single-node system, the PMS resource is in the normal state. In a dual-node system, the PMS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul id="ALM-12075__ul8884121991210"><li id="ALM-12075__li1188421915127">If it is, go to <a href="#ALM-12075__li11878219121215">6</a>.</li><li id="ALM-12075__li4884161961212">If it is not, go to <a href="#ALM-12075__li1288412199129">4</a>.</li></ul>
|
||||
</p></li><li id="ALM-12075__li1288412199129"><a name="ALM-12075__li1288412199129"></a><a name="li1288412199129"></a><span>Run the <strong id="ALM-12075__b158841519181218">vi $BIGDATA_LOG_HOME/omm/oms/pms/pms.log</strong> and <strong id="ALM-12075__b788491910125">vi $BIGDATA_LOG_HOME/omm/oms/pms/scriptlog/pms_ha.log</strong> commands to view the PMS resource logs and check whether the keyword <strong id="ALM-12075__b9187145311439">ERROR</strong> exists. Analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12075__li78841319171220"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12075__ul3884161911217"><li id="ALM-12075__li1188471921218">If it is, no further action is required.</li><li id="ALM-12075__li1588410192121">If it is not, go to <a href="#ALM-12075__li11878219121215">6</a>.</li></ul>
|
||||
</p></li></ol>
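<p>As an alternative to opening each log in <strong>vi</strong>, the keyword check in step 4 can be sketched as a small shell helper. This is a minimal sketch, not part of the product: the function name <strong>scan_pms_errors</strong> and the 20-line limit are hypothetical, and it assumes the two log paths given in step 4 and that <strong>BIGDATA_LOG_HOME</strong> is already set for user <strong>omm</strong>.</p>

```shell
# Hypothetical helper (not shipped with MRS): print recent ERROR lines from
# the two PMS logs named in step 4 of the procedure.
scan_pms_errors() {
    # $1: PMS log directory; assumed default is the path used in step 4.
    log_dir="${1:-${BIGDATA_LOG_HOME}/omm/oms/pms}"
    for f in "$log_dir/pms.log" "$log_dir/scriptlog/pms_ha.log"; do
        if [ -r "$f" ]; then
            # Passing /dev/null as a second file makes grep prefix each match
            # with the file name; -n adds the line number. tail keeps only the
            # 20 most recent matches per file.
            grep -n "ERROR" "$f" /dev/null | tail -n 20
        else
            echo "skip: $f is not readable" >&2
        fi
    done
}
```

<p>Running <strong>scan_pms_errors</strong> with no argument as user <strong>omm</strong> would scan the default directory. Empty output only means that no line contains the keyword; the logs may still need to be analyzed manually.</p>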
|
||||
<p id="ALM-12075__p15407181618128"><strong id="ALM-12075__b18564169141217">Collect fault information.</strong></p>
|
||||
<ol start="6" id="ALM-12075__ol12884201951213"><li id="ALM-12075__li11878219121215"><a name="ALM-12075__li11878219121215"></a><a name="li11878219121215"></a><span>On FusionInsight Manager, choose <strong id="ALM-12075__b10878419161212">O&M</strong> > <strong id="ALM-12075__b14878719191219">Log</strong> > <strong id="ALM-12075__b287818193127">Download</strong>.</span></li><li id="ALM-12075__li168821919121211"><span>Select <strong id="ALM-12075__b28789195123">Controller</strong> and <strong id="ALM-12075__b2878111915128">OmmServer</strong> for <strong id="ALM-12075__b12878161916123">Service</strong> and click <strong id="ALM-12075__b3991118545">OK</strong>.</span></li><li id="ALM-12075__li1488411198127"><span>Click <span><img id="ALM-12075__image7882319161212" src="en-us_image_0269383920.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12075__b7882151981218">Start Date</strong> and <strong id="ALM-12075__b4882201914126">End Date</strong> to 1 hour before and 1 hour after the alarm generation time, respectively, and click <strong id="ALM-12075__b1388215197122">OK</strong>. Then, click <strong id="ALM-12075__b19882181917126">Download</strong>.</span></li><li id="ALM-12075__li495644512588"><span>Contact the <span id="ALM-12075__text4614151421417">O&M personnel</span> and send the collected log information.</span></li></ol>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section1848218920128"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12075__p25641990124">This alarm will be automatically cleared after the fault is rectified.</p>
|
||||
</div>
|
||||
<div class="section" id="ALM-12075__section1148319911210"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12075__p856418914127">None</p>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<div class="familylinks">
|
||||
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
|
||||
</div>
|
||||
</div>
|
||||
|
102
docs/mrs/umn/ALM-12076.html
Normal file
File diff suppressed because it is too large