MRS UMN 320-lts.1 version

Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
Yang, Tong 2023-08-13 18:25:47 +00:00 committed by zuul
parent 139ab2d266
commit b7a42db732
950 changed files with 13246 additions and 7048 deletions



@@ -75,7 +75,7 @@
<ol start="11" id="ALM-12001__ol38224750154621"><li id="ALM-12001__li37575023154554"><a name="ALM-12001__li37575023154554"></a><a name="li37575023154554"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12001__b41457704154554">Audit &gt; Configurations</strong>.</span></li><li id="ALM-12001__li23678021154554"><span>Reset dump rules, set the parameters properly, and click <strong id="ALM-12001__b2630891154554">OK</strong>.</span></li><li id="ALM-12001__li17396949154554"><span>Wait for 2 minutes, view real-time alarms and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12001__ul61585317154554"><li id="ALM-12001__li11775598154554">If yes, no further action is required.</li><li id="ALM-12001__li14299353154554">If no, go to <a href="#ALM-12001__li5991045915463">14</a>.</li></ul>
</p></li></ol>
<p id="ALM-12001__p28445835153631"><strong id="ALM-12001__b57966164154559">Collect fault information.</strong></p>
<ol start="14" id="ALM-12001__ol17392131154624"><li id="ALM-12001__li5991045915463"><a name="ALM-12001__li5991045915463"></a><a name="li5991045915463"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12001__b5263123115415">O&amp;M</strong> &gt; <strong id="ALM-12001__b2156979815463">Log &gt; Download</strong>.</span></li><li id="ALM-12001__li5396317115463"><span>Select <strong id="ALM-12001__b20461631242">OmmServer</strong> from the <strong id="ALM-12001__b63941092411">Service</strong> and click <strong id="ALM-12001__b3991118545">OK</strong>.</span></li><li id="ALM-12001__li1145664103113"><span>Click <span><img id="ALM-12001__image1945644173117" src="en-us_image_0269383808.png"></span> in the upper right corner, and set <strong id="ALM-12001__b6456941173117">Start Date</strong> and <strong id="ALM-12001__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12001__b13456164113319">Download</strong>.</span></li><li id="ALM-12001__li495644512588"><span>Contact the <span id="ALM-12001__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="14" id="ALM-12001__ol17392131154624"><li id="ALM-12001__li5991045915463"><a name="ALM-12001__li5991045915463"></a><a name="li5991045915463"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12001__b5263123115415">O&amp;M</strong> &gt; <strong id="ALM-12001__b2156979815463">Log &gt; Download</strong>.</span></li><li id="ALM-12001__li5396317115463"><span>Select <strong id="ALM-12001__b20461631242">OmmServer</strong> from the <strong id="ALM-12001__b63941092411">Service</strong> and click <strong id="ALM-12001__b3991118545">OK</strong>.</span></li><li id="ALM-12001__li1145664103113"><span>Click <span><img id="ALM-12001__image1945644173117" src="en-us_image_0000001582807597.png"></span> in the upper right corner, and set <strong id="ALM-12001__b6456941173117">Start Date</strong> and <strong id="ALM-12001__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12001__b13456164113319">Download</strong>.</span></li><li id="ALM-12001__li495644512588"><span>Contact the <span id="ALM-12001__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12001__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12001__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>


@@ -68,7 +68,7 @@
</p></li><li id="ALM-12004__l6ef892f9c8f749aa9e6871e1a63797b1"><a name="ALM-12004__l6ef892f9c8f749aa9e6871e1a63797b1"></a><a name="l6ef892f9c8f749aa9e6871e1a63797b1"></a><span>Run the <strong id="ALM-12004__ac81351982ea44a3080848652eb80641f">kill -2</strong> <em id="ALM-12004__adf6ccee5cb6e4773b82ca5f68a8d4218">ldap pid</em> command to restart the LdapServer process and wait for 20 seconds. The HA starts the OLdap process automatically. Check whether the current OLdap resource is in normal state.</span><p><ul id="ALM-12004__u8057658d3505467190171bde28259d37"><li id="ALM-12004__lf80ef17bc2cc40138da0188b47a8b323">If yes, the operation is complete.</li><li id="ALM-12004__l0cc9afd9cc8b4222b41bcf9983d15d1e">If no, go to <a href="#ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde">4</a>.</li></ul>
</p></li></ol>
<p id="ALM-12004__abb5516fb7b8647a3942c4c5b7f74fded"><strong id="ALM-12004__a760add342117469495c4fbe7e3daf04f">Collect fault information.</strong></p>
<ol start="4" id="ALM-12004__o9661752a744349fba78569b7f04fcbcf"><li id="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"><a name="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><a name="l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12004__b76841116134212">O&amp;M</strong> &gt; <strong id="ALM-12004__abd8fe9ab79df48fdb7b8bfe92c7768bc">Log &gt; Download</strong>.</span></li><li id="ALM-12004__l19f3de8474a147ef88ac2d40f27fe72e"><span>Select <strong id="ALM-12004__a5ee1ffd31e954215a608adc09390aabe">OmsLdapServer</strong> and <strong id="ALM-12004__afed03600c0b1449aa46a036940dae621">OmmServer</strong> from the <strong id="ALM-12004__a6cf5036ea700402980e42d73cf308a63">Service</strong> and click <strong id="ALM-12004__b3991118545">OK</strong>.</span></li><li id="ALM-12004__li1145664103113"><span>Click <span><img id="ALM-12004__image1945644173117" src="en-us_image_0269383809.png"></span> in the upper right corner, and set <strong id="ALM-12004__b6456941173117">Start Date</strong> and <strong id="ALM-12004__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12004__b13456164113319">Download</strong>.</span></li><li id="ALM-12004__li495644512588"><span>Contact the <span id="ALM-12004__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="4" id="ALM-12004__o9661752a744349fba78569b7f04fcbcf"><li id="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"><a name="ALM-12004__l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><a name="l4b1abbc809ee41c28ade2b2c4cfa6fde"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12004__b76841116134212">O&amp;M</strong> &gt; <strong id="ALM-12004__abd8fe9ab79df48fdb7b8bfe92c7768bc">Log &gt; Download</strong>.</span></li><li id="ALM-12004__l19f3de8474a147ef88ac2d40f27fe72e"><span>Select <strong id="ALM-12004__a5ee1ffd31e954215a608adc09390aabe">OmsLdapServer</strong> and <strong id="ALM-12004__afed03600c0b1449aa46a036940dae621">OmmServer</strong> from the <strong id="ALM-12004__a6cf5036ea700402980e42d73cf308a63">Service</strong> and click <strong id="ALM-12004__b3991118545">OK</strong>.</span></li><li id="ALM-12004__li1145664103113"><span>Click <span><img id="ALM-12004__image1945644173117" src="en-us_image_0000001532767626.png"></span> in the upper right corner, and set <strong id="ALM-12004__b6456941173117">Start Date</strong> and <strong id="ALM-12004__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12004__b13456164113319">Download</strong>.</span></li><li id="ALM-12004__li495644512588"><span>Contact the <span id="ALM-12004__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12004__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12004__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>


@@ -65,7 +65,7 @@
</p></li><li id="ALM-12005__li4031832916486"><a name="ALM-12005__li4031832916486"></a><a name="li4031832916486"></a><span>See the procedure in <a href="ALM-12004.html">ALM-12004 OLdap Resource Abnormal</a> to resolve the problem. After the OLdap resource status recovers, check whether the OKerberos resource status is normal.</span><p><ul class="subitemlist" id="ALM-12005__ul6413213716486"><li id="ALM-12005__li1067441916486">If yes, the operation is complete.</li><li id="ALM-12005__li5932157616486">If no, go to <a href="#ALM-12005__li34421516164820">4</a>.</li></ul>
</p></li></ol>
<p id="ALM-12005__p59418417164755"><strong id="ALM-12005__b21602359164826">Collect fault information.</strong></p>
<ol start="4" id="ALM-12005__ol49138498164822"><li id="ALM-12005__li34421516164820"><a name="ALM-12005__li34421516164820"></a><a name="li34421516164820"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12005__b87862548435">O&amp;M</strong> &gt; <strong id="ALM-12005__b11281153164820">Log &gt; Download</strong>.</span></li><li id="ALM-12005__li29990712164820"><span>Select <strong id="ALM-12005__b41358196164820">OmsKerberos</strong> and <strong id="ALM-12005__b36679449164820">OmmServer</strong> from the <strong id="ALM-12005__b18615181618813">Service</strong> and click <strong id="ALM-12005__b627792117815">OK</strong>.</span></li><li id="ALM-12005__li1145664103113"><span>Click <span><img id="ALM-12005__image1945644173117" src="en-us_image_0269383810.png"></span> in the upper right corner, and set <strong id="ALM-12005__b6456941173117">Start Date</strong> and <strong id="ALM-12005__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12005__b13456164113319">Download</strong>.</span></li><li id="ALM-12005__li495644512588"><span>Contact the <span id="ALM-12005__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="4" id="ALM-12005__ol49138498164822"><li id="ALM-12005__li34421516164820"><a name="ALM-12005__li34421516164820"></a><a name="li34421516164820"></a><span>On the FusionInsight Manager home page, choose <strong id="ALM-12005__b87862548435">O&amp;M</strong> &gt; <strong id="ALM-12005__b11281153164820">Log &gt; Download</strong>.</span></li><li id="ALM-12005__li29990712164820"><span>Select <strong id="ALM-12005__b41358196164820">OmsKerberos</strong> and <strong id="ALM-12005__b36679449164820">OmmServer</strong> from the <strong id="ALM-12005__b18615181618813">Service</strong> and click <strong id="ALM-12005__b627792117815">OK</strong>.</span></li><li id="ALM-12005__li1145664103113"><span>Click <span><img id="ALM-12005__image1945644173117" src="en-us_image_0000001532607838.png"></span> in the upper right corner, and set <strong id="ALM-12005__b6456941173117">Start Date</strong> and <strong id="ALM-12005__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12005__b13456164113319">Download</strong>.</span></li><li id="ALM-12005__li495644512588"><span>Contact the <span id="ALM-12005__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12005__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12005__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>


@@ -57,25 +57,27 @@
</div>
<div class="section" id="ALM-12006__section6597800"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12006__p28294521">Services on the node are unavailable.</p>
</div>
<div class="section" id="ALM-12006__section59380201"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12006__p3196339692532">The network is disconnected, the hardware is faulty, or the operating system runs slowly.</p>
<div class="section" id="ALM-12006__section59380201"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12006__ul652118533718"><li id="ALM-12006__li1852116517378">The network is disconnected, the hardware is faulty, or the operating system runs slowly.</li><li id="ALM-12006__li1473614616373">The memory of the NodeAgent process is insufficient.</li></ul>
</div>
<div class="section" id="ALM-12006__section64659764"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12006__p17236725"><strong id="ALM-12006__b662616519645">Check whether the network is disconnected, whether the hardware is faulty, or whether the operating system runs commands slowly.</strong></p>
<ol id="ALM-12006__ol25386555165047"><li id="ALM-12006__li14747189165028"><span>On FusionInsight Manager, choose <strong id="ALM-12006__b147455436444647">O&amp;M</strong> &gt; <strong id="ALM-12006__b123126530744647">Alarm</strong> &gt; <strong id="ALM-12006__b61870647444647">Alarms</strong>. On the page that is displayed, click <span><img id="ALM-12006__image186131198418" src="en-us_image_0263895827.png"></span> in the row containing the alarm, click the host name, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12006__li13283100165028"><span>Log in to the active management node as user <strong id="ALM-12006__b6368294144647">root</strong>. <span id="ALM-12006__text18300131824619"></span></span></li><li id="ALM-12006__li59218045165028"><span>Run the <strong id="ALM-12006__b12346761832">ping </strong><em id="ALM-12006__i22903501421">IP address of the faulty host</em> command to check whether the faulty node is reachable.</span><p><ul class="subitemlist" id="ALM-12006__ul28949404165028"><li id="ALM-12006__li52511662165028">If yes, go to <a href="#ALM-12006__li6096449165028">12</a>.</li><li id="ALM-12006__li25586221165028">If no, go to <a href="#ALM-12006__li61437024165028">4</a>.</li></ul>
<ol id="ALM-12006__ol25386555165047"><li id="ALM-12006__li14747189165028"><span>On FusionInsight Manager, choose <strong id="ALM-12006__b147455436444647">O&amp;M</strong> &gt; <strong id="ALM-12006__b123126530744647">Alarm</strong> &gt; <strong id="ALM-12006__b61870647444647">Alarms</strong>. On the page that is displayed, click <span><img id="ALM-12006__image186131198418" src="en-us_image_0000001583127417.png"></span> in the row containing the alarm, click the host name, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12006__li13283100165028"><span>Log in to the active management node as user <strong id="ALM-12006__b6368294144647">root</strong>. <span id="ALM-12006__text1460138164615"></span> <span id="ALM-12006__text18300131824619"></span></span><p><div class="note" id="ALM-12006__note17203152312217"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12006__p920518235220">If the faulty node is the active management node and the login fails, the network of the active management node may be faulty. In this case, go to <a href="#ALM-12006__li61437024165028">4</a>.</p>
</div></div>
</p></li><li id="ALM-12006__li59218045165028"><span>Run the <strong id="ALM-12006__b12346761832">ping </strong><em id="ALM-12006__i22903501421">IP address of the faulty host</em> command to check whether the faulty node is reachable.</span><p><ul class="subitemlist" id="ALM-12006__ul28949404165028"><li id="ALM-12006__li52511662165028">If yes, go to <a href="#ALM-12006__li5888111210353">12</a>.</li><li id="ALM-12006__li25586221165028">If no, go to <a href="#ALM-12006__li61437024165028">4</a>.</li></ul>
</p></li><li id="ALM-12006__li61437024165028"><a name="ALM-12006__li61437024165028"></a><a name="li61437024165028"></a><span>Contact the network administrator to check whether the network is faulty.</span><p><ul class="subitemlist" id="ALM-12006__ul59022119165028"><li id="ALM-12006__li31932358165028">If yes, go to <a href="#ALM-12006__li23885090165028">5</a>.</li><li id="ALM-12006__li36384175165028">If no, go to <a href="#ALM-12006__li9040006165028">6</a>.</li></ul>
</p></li><li id="ALM-12006__li23885090165028"><a name="ALM-12006__li23885090165028"></a><a name="li23885090165028"></a><span>Rectify the network fault and check whether the alarm is cleared from the alarm list.</span><p><ul class="subitemlist" id="ALM-12006__ul32480060165028"><li id="ALM-12006__li16062307165028">If yes, no further action is required.</li><li id="ALM-12006__li25978516165028">If no, go to <a href="#ALM-12006__li9040006165028">6</a>.</li></ul>
</p></li><li id="ALM-12006__li9040006165028"><a name="ALM-12006__li9040006165028"></a><a name="li9040006165028"></a><span>Contact the hardware administrator to check whether the hardware (CPU or memory) of the node is faulty.</span><p><ul class="subitemlist" id="ALM-12006__ul30830606165028"><li id="ALM-12006__li55644148165028">If yes, go to <a href="#ALM-12006__li15590464165028">7</a>.</li><li id="ALM-12006__li10882163165028">If no, go to <a href="#ALM-12006__li6096449165028">12</a>.</li></ul>
</p></li><li id="ALM-12006__li9040006165028"><a name="ALM-12006__li9040006165028"></a><a name="li9040006165028"></a><span>Contact the hardware administrator to check whether the hardware (CPU or memory) of the node is faulty.</span><p><ul class="subitemlist" id="ALM-12006__ul30830606165028"><li id="ALM-12006__li55644148165028">If yes, go to <a href="#ALM-12006__li15590464165028">7</a>.</li><li id="ALM-12006__li10882163165028">If no, go to <a href="#ALM-12006__li5888111210353">12</a>.</li></ul>
</p></li><li id="ALM-12006__li15590464165028"><a name="ALM-12006__li15590464165028"></a><a name="li15590464165028"></a><span>Repair or replace faulty components and restart the node. Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12006__ul40789195165028"><li id="ALM-12006__li13496136165028">If yes, no further action is required.</li><li id="ALM-12006__li19445213165028">If no, go to <a href="#ALM-12006__li4828856593250">8</a>.</li></ul>
</p></li><li id="ALM-12006__li4828856593250"><a name="ALM-12006__li4828856593250"></a><a name="li4828856593250"></a><span>If a large number of node faults are reported in the cluster, the floating IP addresses may be abnormal. As a result, Controller cannot detect the NodeAgent heartbeat.</span><p><p id="ALM-12006__p3491733393224">Log in to any management node and view the <strong id="ALM-12006__b732851717384">/var/log/Bigdata/omm/oms/ha/scriptlog/floatip.log</strong> log to check whether the logs generated one to two minutes before and after the faults occur are complete.</p>
<p id="ALM-12006__p4582054693224">For example, a complete log is in the following format:</p>
<pre class="screen" id="ALM-12006__screen6326427393538">2017-12-09 04:10:51,000 INFO (floatip) Read from ${BIGDATA_HOME}/om-server_<span id="ALM-12006__text469562161513">8.1.0.1</span>/om/etc/om/routeSetConf.ini,value is : yes
<pre class="screen" id="ALM-12006__screen6326427393538">2017-12-09 04:10:51,000 INFO (floatip) Read from ${BIGDATA_HOME}/om-server_*/om/etc/om/routeSetConf.ini,value is : yes
2017-12-09 04:10:51,000 INFO (floatip) check wsNetExport : eth0 is up.
2017-12-09 04:10:51,000 INFO (floatip) check omNetExport : eth0 is up.
2017-12-09 04:10:51,000 INFO (floatip) check wsInterface : eth0:oms, wsFloatIp: XXX.XXX.XXX.XXX.
2017-12-09 04:10:51,000 INFO (floatip) check omInterface : eth0:oms, omFloatIp: XXX.XXX.XXX.XXX.
2017-12-09 04:10:51,000 INFO (floatip) check wsFloatIp : XXX.XXX.XXX.XXX is reachable.
2017-12-09 04:10:52,000 INFO (floatip) check omFloatIp : XXX.XXX.XXX.XXX is reachable.</pre>
<ul id="ALM-12006__ul5461745114114"><li id="ALM-12006__li6461124584119">If yes, go to <a href="#ALM-12006__li6096449165028">12</a>.</li><li id="ALM-12006__li164619453417">If no, go to <a href="#ALM-12006__li3216108493510">9</a>.</li></ul>
</p></li><li id="ALM-12006__li3216108493510"><a name="ALM-12006__li3216108493510"></a><a name="li3216108493510"></a><span>Check whether the omNetExport log is printed after the wsNetExport is detected or whether the interval between the two logs is 10 seconds or longer.</span><p><ul id="ALM-12006__ul782416216425"><li id="ALM-12006__li7824825421">If yes, go to <a href="#ALM-12006__li1419227193519">10</a>.</li><li id="ALM-12006__li182418254210">If no, go to <a href="#ALM-12006__li6096449165028">12</a>.</li></ul>
<ul id="ALM-12006__ul5461745114114"><li id="ALM-12006__li6461124584119">If yes, go to <a href="#ALM-12006__li5888111210353">12</a>.</li><li id="ALM-12006__li164619453417">If no, go to <a href="#ALM-12006__li3216108493510">9</a>.</li></ul>
</p></li><li id="ALM-12006__li3216108493510"><a name="ALM-12006__li3216108493510"></a><a name="li3216108493510"></a><span>Check whether the omNetExport log is printed after the wsNetExport is detected or whether the interval between the two logs is 10 seconds or longer.</span><p><ul id="ALM-12006__ul782416216425"><li id="ALM-12006__li7824825421">If yes, go to <a href="#ALM-12006__li1419227193519">10</a>.</li><li id="ALM-12006__li182418254210">If no, go to <a href="#ALM-12006__li5888111210353">12</a>.</li></ul>
</p></li><li id="ALM-12006__li1419227193519"><a name="ALM-12006__li1419227193519"></a><a name="li1419227193519"></a><span>View the <strong id="ALM-12006__b1245811118516">/var/log/messages</strong> file of the OS to check whether sssd frequently restarts or nscd exception information is displayed when the fault occurs. For Red Hat, check sssd information. For SUSE, check nscd information.</span><p><p id="ALM-12006__p3553051916202">sssd restart example</p>
<pre class="screen" id="ALM-12006__screen1465770316215">Feb 7 11:38:16 10-132-190-105 sssd[pam]: Shutting down
Feb 7 11:38:16 10-132-190-105 sssd[nss]: Shutting down
@@ -89,11 +91,15 @@ Feb 7 11:38:16 10-132-190-105 sssd[pam]: Starting up</pre>
<pre class="screen" id="ALM-12006__screen24115840162055">Feb 11 11:44:42 10-120-205-33 nscd: nss_ldap: failed to bind to LDAP server ldaps://10.120.205.55:21780: Can't contact LDAP server
Feb 11 11:44:43 10-120-205-33 ntpq: nss_ldap: failed to bind to LDAP server ldaps://10.120.205.55:21780: Can't contact LDAP server
Feb 11 11:44:44 10-120-205-33 ntpq: nss_ldap: failed to bind to LDAP server ldaps://10.120.205.92:21780: Can't contact LDAP server</pre>
<ul id="ALM-12006__ul324281844210"><li id="ALM-12006__li172421218184210">If yes, go to <a href="#ALM-12006__li5998962193529">11</a>.</li><li id="ALM-12006__li9242918184211">If no, go to <a href="#ALM-12006__li6096449165028">12</a>.</li></ul>
<ul id="ALM-12006__ul324281844210"><li id="ALM-12006__li172421218184210">If yes, go to <a href="#ALM-12006__li5998962193529">11</a>.</li><li id="ALM-12006__li9242918184211">If no, go to <a href="#ALM-12006__li6096449165028">14</a>.</li></ul>
</p></li><li id="ALM-12006__li5998962193529"><a name="ALM-12006__li5998962193529"></a><a name="li5998962193529"></a><span>Check whether the LdapServer node is faulty, for example, the service IP address is unreachable or the network latency is too high. If the fault occurs periodically, locate and eliminate it and run the <strong id="ALM-12006__b95641746101015">top</strong> command to check whether abnormal software exists.</span></li></ol>
<p class="tableheading" id="ALM-12006__p20324852165055"><strong id="ALM-12006__b38076319165058">Collect the fault information.</strong></p>
<ol start="12" id="ALM-12006__ol4338365616513"><li id="ALM-12006__li6096449165028"><a name="ALM-12006__li6096449165028"></a><a name="li6096449165028"></a><span>On FusionInsight Manager, choose <strong id="ALM-12006__b101119310920">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12006__b3161731891">Log</strong> &gt; <strong id="ALM-12006__b1818636918">Download</strong>.</span></li><li id="ALM-12006__li17328746165028"><span>Select the following nodes from <strong id="ALM-12006__b74792121696">Services</strong> and click <strong id="ALM-12006__b24914121699">OK</strong>.</span><p><ul class="subitemlist" id="ALM-12006__ul1925416165028"><li id="ALM-12006__li54868049165028">NodeAgent</li><li id="ALM-12006__li24050400165028">Controller</li><li id="ALM-12006__li15127016165028">OS</li></ul>
</p></li><li id="ALM-12006__li21740992165028"><span>Click <span><img id="ALM-12006__image104601319175315" src="en-us_image_0263895607.png"></span> in the upper right corner, and set <strong id="ALM-12006__b54651160144647">Start Date</strong> and <strong id="ALM-12006__b209720056144647">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12006__b190140052244647">Download</strong>.</span></li><li id="ALM-12006__li16189904165028"><span>Contact <span id="ALM-12006__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<p id="ALM-12006__p6200111143517"><strong id="ALM-12006__b511447204519">Check whether the memory of the NodeAgent process is insufficient.</strong></p>
<ol start="12" id="ALM-12006__ol5887312103520"><li id="ALM-12006__li5888111210353"><a name="ALM-12006__li5888111210353"></a><a name="li5888111210353"></a><span>Log in to the faulty node as user <strong id="ALM-12006__b23661558399">root</strong> and run the following command to view the NodeAgent process logs:</span><p><p id="ALM-12006__p815935944018"><strong id="ALM-12006__b125692104112">vi /var/log/Bigdata/nodeagent/scriptlog/agent_gc.log.*.current</strong></p>
</p></li><li id="ALM-12006__li428520391437"><span>Check whether the log file contains an error indicating that the metaspace size or heap memory size is insufficient.</span><p><ul id="ALM-12006__ul3488183864410"><li id="ALM-12006__li12488173894411">If yes, contact <span id="ALM-12006__text11646204144519">O&amp;M personnel</span> to change the memory size.</li><li id="ALM-12006__li136386250454">If no, go to <a href="#ALM-12006__li6096449165028">14</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12006__p20324852165055"><strong id="ALM-12006__b38076319165058">Collect fault information.</strong></p>
<ol start="14" id="ALM-12006__ol4338365616513"><li id="ALM-12006__li6096449165028"><a name="ALM-12006__li6096449165028"></a><a name="li6096449165028"></a><span>On FusionInsight Manager, choose <strong id="ALM-12006__b101119310920">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12006__b3161731891">Log</strong> &gt; <strong id="ALM-12006__b1818636918">Download</strong>.</span></li><li id="ALM-12006__li17328746165028"><span>Select the following nodes from <strong id="ALM-12006__b74792121696">Services</strong> and click <strong id="ALM-12006__b24914121699">OK</strong>.</span><p><ul class="subitemlist" id="ALM-12006__ul1925416165028"><li id="ALM-12006__li54868049165028">NodeAgent</li><li id="ALM-12006__li24050400165028">Controller</li><li id="ALM-12006__li15127016165028">OS</li></ul>
</p></li><li id="ALM-12006__li21740992165028"><span>Click <span><img id="ALM-12006__image104601319175315" src="en-us_image_0000001532767474.png"></span> in the upper right corner, and set <strong id="ALM-12006__b54651160144647">Start Date</strong> and <strong id="ALM-12006__b209720056144647">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12006__b190140052244647">Download</strong>.</span></li><li id="ALM-12006__li16189904165028"><span>Contact <span id="ALM-12006__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12006__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12006__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
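The floatip.log check in steps 8 and 9 of the ALM-12006 procedure can be sketched in Python. This is a minimal illustration, not part of the product: the function name `find_export_gaps` is hypothetical, the timestamp layout is taken from the sample log above, and the sketch assumes that a problem means either a `wsNetExport` line with no following `omNetExport` line or a gap of more than 10 seconds between the two.

```python
from datetime import datetime

# Timestamp layout assumed from the sample log, e.g. "2017-12-09 04:10:51,000".
TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TS_LENGTH = 23


def find_export_gaps(lines, max_gap_seconds=10):
    """Return (timestamp, problem) tuples for suspicious wsNetExport/omNetExport pairs."""
    problems = []
    pending_ws = None  # time of a wsNetExport line still waiting for its omNetExport line
    for line in lines:
        if "(floatip)" not in line:
            continue
        try:
            ts = datetime.strptime(line[:TS_LENGTH], TS_FORMAT)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if "check wsNetExport" in line:
            if pending_ws is not None:
                problems.append((pending_ws, "omNetExport line missing"))
            pending_ws = ts
        elif "check omNetExport" in line and pending_ws is not None:
            gap = (ts - pending_ws).total_seconds()
            if gap > max_gap_seconds:
                problems.append((pending_ws, f"gap of {gap:.0f}s before omNetExport"))
            pending_ws = None
    if pending_ws is not None:
        problems.append((pending_ws, "omNetExport line missing"))
    return problems


if __name__ == "__main__":
    sample = [
        "2017-12-09 04:10:51,000 INFO (floatip) check wsNetExport : eth0 is up.",
        "2017-12-09 04:11:06,000 INFO (floatip) check omNetExport : eth0 is up.",
    ]
    for when, problem in find_export_gaps(sample):
        print(when, problem)
```

To apply it to a real node, the lines would be read from the floatip.log path given in step 8 and restricted to the one to two minutes around the fault time.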


@@ -62,7 +62,7 @@
</div></div>
</div>
<div class="section" id="ALM-12007__sad734a42f8ef40529fb21b797d8b41e9"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12007__en-us_topic_0070543667_p65245121"><strong id="ALM-12007__b73856891719">Check whether the instance process is abnormal.</strong></p>
<ol id="ALM-12007__ol5390063317638"><li id="ALM-12007__li42005517036"><a name="ALM-12007__li42005517036"></a><a name="li42005517036"></a><span>In the FusionInsight Manager portal, choose <strong id="ALM-12007__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12007__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12007__image14626452517" src="en-us_image_0000001080201158.png"></span> in the row containing the alarm, and click the host name to view the address of the host for which the alarm is generated.</span></li><li id="ALM-12007__li911601917036"><span>On the <strong id="ALM-12007__b378050117036">Alarms</strong> page, check whether the <a href="ALM-12006.html">ALM-12006 Node Fault</a> is generated.</span><p><ul class="subitemlist" id="ALM-12007__ul846943117036"><li id="ALM-12007__li452236417036">If yes, go to <a href="#ALM-12007__li20006517036">3</a>.</li><li id="ALM-12007__li3076720917036">If no, go to <a href="#ALM-12007__li195150317036">4</a>.</li></ul>
<ol id="ALM-12007__ol5390063317638"><li id="ALM-12007__li42005517036"><a name="ALM-12007__li42005517036"></a><a name="li42005517036"></a><span>In the FusionInsight Manager portal, choose <strong id="ALM-12007__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12007__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12007__image14626452517" src="en-us_image_0000001582807817.png"></span> in the row containing the alarm, and click the host name to view the address of the host for which the alarm is generated.</span></li><li id="ALM-12007__li911601917036"><span>On the <strong id="ALM-12007__b378050117036">Alarms</strong> page, check whether the <a href="ALM-12006.html">ALM-12006 Node Fault</a> is generated.</span><p><ul class="subitemlist" id="ALM-12007__ul846943117036"><li id="ALM-12007__li452236417036">If yes, go to <a href="#ALM-12007__li20006517036">3</a>.</li><li id="ALM-12007__li3076720917036">If no, go to <a href="#ALM-12007__li195150317036">4</a>.</li></ul>
</p></li><li id="ALM-12007__li20006517036"><a name="ALM-12007__li20006517036"></a><a name="li20006517036"></a><span>Handle the alarm according to <a href="ALM-12006.html">ALM-12006 Node Fault</a>.</span></li><li id="ALM-12007__li195150317036"><a name="ALM-12007__li195150317036"></a><a name="li195150317036"></a><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12007__b8307212154711">root</strong>. <span id="ALM-12007__text43649449460"></span>Check whether the installation directory user, user group, and permission of the alarm role are correct. The user, user group, and the permission must be <strong id="ALM-12007__b180058917036">omm:ficommon 750</strong>.</span><p><p class="subitemlist" id="ALM-12007__p7190141912118">For example, the NameNode installation directory is<strong id="ALM-12007__b16534123110112"> </strong><em id="ALM-12007__i677216419119">${BIGDATA_HOME}</em><strong id="ALM-12007__b177174617112">/FusionInsight_Current/</strong><em id="ALM-12007__i137264460113">1_8_NameNode</em><strong id="ALM-12007__b13731846191113">/etc</strong>.</p>
<ul class="subitemlist" id="ALM-12007__ul2258645517036"><li id="ALM-12007__li1163004517036">If yes, go to <a href="#ALM-12007__li3396349817036">6</a>.</li><li id="ALM-12007__li250960617036">If no, go to <a href="#ALM-12007__li3247692317036">5</a>.</li></ul>
</p></li><li id="ALM-12007__li3247692317036"><a name="ALM-12007__li3247692317036"></a><a name="li3247692317036"></a><span>Run the following command to set the permission to <strong id="ALM-12007__b1756352717036">750</strong> and <strong id="ALM-12007__b2385401617036">User:Group</strong> to <strong id="ALM-12007__b1335955517036">omm:ficommon</strong>:</span><p><p class="litext" id="ALM-12007__p833090817036"><strong id="ALM-12007__b5312713817036">chmod 750 </strong><em id="ALM-12007__i838219617036">&lt;folder_name&gt;</em></p>
@ -75,7 +75,7 @@
</p></li><li id="ALM-12007__li1723673717036"><a name="ALM-12007__li1723673717036"></a><a name="li1723673717036"></a><span>Wait for 5 minutes. In the alarm list, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul3418148317036"><li id="ALM-12007__li464969017036">If yes, no further action is required.</li><li id="ALM-12007__li4108064417036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li></ol>
<p id="ALM-12007__p3392472417052"><strong id="ALM-12007__b2313861417057">Collect fault information.</strong></p>
<ol start="11" id="ALM-12007__ol481086251710"><li id="ALM-12007__li1622379717036"><a name="ALM-12007__li1622379717036"></a><a name="li1622379717036"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12007__b2091290617036">O&amp;M</strong> &gt; <strong id="ALM-12007__b5399842717036">Log &gt; Download</strong>.</span></li><li id="ALM-12007__li1598834917036"><span>According to the service name obtained in <a href="#ALM-12007__li42005517036">1</a>, select the component and <strong id="ALM-12007__b68821814172417">NodeAgent</strong> from the <strong id="ALM-12007__b15959191911544">Service</strong> and click <strong id="ALM-12007__b3991118545">OK</strong>.</span></li><li id="ALM-12007__li1145664103113"><span>Click <span><img id="ALM-12007__image1945644173117" src="en-us_image_0269383814.png"></span> in the upper right corner, and set <strong id="ALM-12007__b6456941173117">Start Date</strong> and <strong id="ALM-12007__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12007__b13456164113319">Download</strong>.</span></li><li id="ALM-12007__li495644512588"><span>Contact the <span id="ALM-12007__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="11" id="ALM-12007__ol481086251710"><li id="ALM-12007__li1622379717036"><a name="ALM-12007__li1622379717036"></a><a name="li1622379717036"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12007__b2091290617036">O&amp;M</strong> &gt; <strong id="ALM-12007__b5399842717036">Log &gt; Download</strong>.</span></li><li id="ALM-12007__li1598834917036"><span>According to the service name obtained in <a href="#ALM-12007__li42005517036">1</a>, select the component and <strong id="ALM-12007__b68821814172417">NodeAgent</strong> from the <strong id="ALM-12007__b15959191911544">Service</strong> and click <strong id="ALM-12007__b3991118545">OK</strong>.</span></li><li id="ALM-12007__li1145664103113"><span>Click <span><img id="ALM-12007__image1945644173117" src="en-us_image_0000001583127509.png"></span> in the upper right corner, and set <strong id="ALM-12007__b6456941173117">Start Date</strong> and <strong id="ALM-12007__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12007__b13456164113319">Download</strong>.</span></li><li id="ALM-12007__li495644512588"><span>Contact the <span id="ALM-12007__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12007__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12007__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -61,7 +61,7 @@
<ul id="ALM-12010__ul11347112011510"><li id="ALM-12010__li17347132014154">The link between the active and standby Managers is abnormal.</li><li id="ALM-12010__li127451022151512">The node name configuration is incorrect.</li><li id="ALM-12010__li15347620181517">The port is disabled by the firewall.</li></ul>
</div>
<div class="section" id="ALM-12010__s8af1753e22d647b9b1328244e85fc0a1"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12010__en-us_topic_0070543674_p27190637"><strong id="ALM-12010__b5350194613159">Check whether the network between the active and standby Manager servers is normal.</strong></p>
<ol id="ALM-12010__ol20655039202014"><li id="ALM-12010__li3649153912014"><span>In the FusionInsight Manager portal, click <strong id="ALM-12010__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12010__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12010__image4649163910207" src="en-us_image_0269383815.png"></span> in the row containing the alarm and view the IP address of the standby Manager (Peer Manager) server in the alarm details.</span></li><li id="ALM-12010__li665018399204"><span>Log in to the active Manager server as user <strong id="ALM-12010__b16650193982017">root</strong>. <span id="ALM-12010__text13862037144910"></span><span id="ALM-12010__text077751144915"></span></span></li><li id="ALM-12010__li86511539112014"><span>Run the <strong id="ALM-12010__b14650439102018">ping</strong> <em id="ALM-12010__i96503394205">standby Manager heartbeat IP address</em> command to check whether the standby Manager server is reachable.</span><p><ul class="subitemlist" id="ALM-12010__ul565043917209"><li id="ALM-12010__li665012399202">If yes, go to <a href="#ALM-12010__li206521339172011">6</a>.</li><li id="ALM-12010__li36504394207">If no, go to <a href="#ALM-12010__li18651103915205">4</a>.</li></ul>
<ol id="ALM-12010__ol20655039202014"><li id="ALM-12010__li3649153912014"><span>In the FusionInsight Manager portal, click <strong id="ALM-12010__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12010__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12010__image4649163910207" src="en-us_image_0000001583127401.png"></span> in the row containing the alarm and view the IP address of the standby Manager (Peer Manager) server in the alarm details.</span></li><li id="ALM-12010__li665018399204"><span>Log in to the active Manager server as user <strong id="ALM-12010__b16650193982017">root</strong>. <span id="ALM-12010__text13862037144910"></span><span id="ALM-12010__text077751144915"></span></span></li><li id="ALM-12010__li86511539112014"><span>Run the <strong id="ALM-12010__b14650439102018">ping</strong> <em id="ALM-12010__i96503394205">standby Manager heartbeat IP address</em> command to check whether the standby Manager server is reachable.</span><p><ul class="subitemlist" id="ALM-12010__ul565043917209"><li id="ALM-12010__li665012399202">If yes, go to <a href="#ALM-12010__li206521339172011">6</a>.</li><li id="ALM-12010__li36504394207">If no, go to <a href="#ALM-12010__li18651103915205">4</a>.</li></ul>
</p></li><li id="ALM-12010__li18651103915205"><a name="ALM-12010__li18651103915205"></a><a name="li18651103915205"></a><span>Contact the network administrator to check whether the network is faulty.</span><p><ul class="subitemlist" id="ALM-12010__ul1465123917207"><li id="ALM-12010__li7651539162019">If yes, go to <a href="#ALM-12010__li166511739102017">5</a>.</li><li id="ALM-12010__li12651153932016">If no, go to <a href="#ALM-12010__li206521339172011">6</a>.</li></ul>
</p></li><li id="ALM-12010__li166511739102017"><a name="ALM-12010__li166511739102017"></a><a name="li166511739102017"></a><span>Rectify the network fault and check whether the alarm is cleared from the alarm list.</span><p><ul class="subitemlist" id="ALM-12010__ul12651143992015"><li id="ALM-12010__li66510391204">If yes, no further action is required.</li><li id="ALM-12010__li165193912202">If no, go to <a href="#ALM-12010__li206521339172011">6</a>.</li></ul>
</p></li><li class="subitemlist" id="ALM-12010__li206521339172011"><a name="ALM-12010__li206521339172011"></a><a name="li206521339172011"></a><span>Run the following command to go to the software installation directory:</span><p><p id="ALM-12010__p1652939182013"><strong id="ALM-12010__b136521139172015">cd /opt</strong></p>
@ -77,7 +77,7 @@
</p></li></ol>
<p id="ALM-12010__p66076255171453"><strong id="ALM-12010__b56103124171459">Collect fault information.</strong></p>
<ol start="16" id="ALM-12010__ol4742499917152"><li id="ALM-12010__li41244883171443"><a name="ALM-12010__li41244883171443"></a><a name="li41244883171443"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12010__b2091290617036">O&amp;M</strong> &gt; <strong id="ALM-12010__b4582764171443">Log &gt; Download</strong>.</span></li><li id="ALM-12010__li52887856171443"><span>Select the following nodes from the <strong id="ALM-12010__b1114195518811">Service</strong> and click<strong id="ALM-12010__b11411559819"> OK</strong>:</span><p><ul class="subitemlist" id="ALM-12010__ul58072211171443"><li id="ALM-12010__li2749285171443">OmmServer</li><li id="ALM-12010__li24743571171443">Controller</li><li id="ALM-12010__li21365548171443">NodeAgent</li></ul>
</p></li><li id="ALM-12010__li1145664103113"><span>Click <span><img id="ALM-12010__image1945644173117" src="en-us_image_0269383816.png"></span> in the upper right corner, and set <strong id="ALM-12010__b6456941173117">Start Date</strong> and <strong id="ALM-12010__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12010__b13456164113319">Download</strong>.</span></li><li id="ALM-12010__li495644512588"><span>Contact the <span id="ALM-12010__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</p></li><li id="ALM-12010__li1145664103113"><span>Click <span><img id="ALM-12010__image1945644173117" src="en-us_image_0000001532767502.png"></span> in the upper right corner, and set <strong id="ALM-12010__b6456941173117">Start Date</strong> and <strong id="ALM-12010__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12010__b13456164113319">Download</strong>.</span></li><li id="ALM-12010__li495644512588"><span>Contact the <span id="ALM-12010__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12010__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12010__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
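<p>The log collection step asks for a time range that starts 10 minutes before and ends 10 minutes after the alarm generation time. As a small worked example, the range can be computed as follows. The function name <strong>log_window</strong> and the sample timestamp are hypothetical; only the 10-minute margin comes from the procedure.</p>

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin_minutes=10):
    """Return (start, end) for log collection around the alarm generation time."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time, used only for illustration.
start, end = log_window(datetime(2023, 8, 13, 18, 25))
print(start.isoformat(sep=" "), end.isoformat(sep=" "))
# prints: 2023-08-13 18:15:00 2023-08-13 18:35:00
```

<p>The two values map to <strong>Start Date</strong> and <strong>End Date</strong> in the log download dialog.</p>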

@ -60,7 +60,7 @@
<div class="section" id="ALM-12011__s77f5924161444716a130206f2960adf2"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12011__ul14145165617812"><li id="ALM-12011__li414545618819">The link between the active and standby Managers is interrupted, or the storage space of the <strong id="ALM-12011__b239214252236">/srv/BigData/LocalBackup</strong> directory is full.</li><li id="ALM-12011__li42413581788">The synchronization file does not exist or the file permission is incorrect.</li></ul>
</div>
<div class="section" id="ALM-12011__s229a984fc400445ab382e100b6b3e00c"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12011__en-us_topic_0070543504_p8274172"><strong id="ALM-12011__b65333561171814">Check whether the network between the active Manager server and the standby Manager server is normal.</strong></p>
<ol id="ALM-12011__ol53693817171754"><li id="ALM-12011__li61677856171750"><span>In the FusionInsight Manager portal, click <strong id="ALM-12011__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12011__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12011__image168221113135319" src="en-us_image_0269383817.png"></span> in the row where the alarm is located and obtain the standby Manager server IP address (Peer Manager IP address) in the alarm details.</span></li><li id="ALM-12011__li218225171750"><span>Log in to the active Manager server as user <strong id="ALM-12011__b18229793171750">root</strong>. <span id="ALM-12011__text43649449460"></span></span></li><li id="ALM-12011__li45826494171750"><span>Run the <strong id="ALM-12011__b1964026171750">ping </strong><em id="ALM-12011__i17676237171750">standby Manager IP address</em> command to check whether the standby Manager server is reachable.</span><p><ul class="subitemlist" id="ALM-12011__ul20004913171750"><li id="ALM-12011__li22489118171750">If yes, go to <a href="#ALM-12011__li983315367129">6</a>.</li><li id="ALM-12011__li9679308171750">If no, go to <a href="#ALM-12011__li3033024171750">4</a>.</li></ul>
<ol id="ALM-12011__ol53693817171754"><li id="ALM-12011__li61677856171750"><span>In the FusionInsight Manager portal, click <strong id="ALM-12011__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12011__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12011__image168221113135319" src="en-us_image_0000001532607938.png"></span> in the row where the alarm is located and obtain the standby Manager server IP address (Peer Manager IP address) in the alarm details.</span></li><li id="ALM-12011__li218225171750"><span>Log in to the active Manager server as user <strong id="ALM-12011__b18229793171750">root</strong>. <span id="ALM-12011__text43649449460"></span></span></li><li id="ALM-12011__li45826494171750"><span>Run the <strong id="ALM-12011__b1964026171750">ping </strong><em id="ALM-12011__i17676237171750">standby Manager IP address</em> command to check whether the standby Manager server is reachable.</span><p><ul class="subitemlist" id="ALM-12011__ul20004913171750"><li id="ALM-12011__li22489118171750">If yes, go to <a href="#ALM-12011__li983315367129">6</a>.</li><li id="ALM-12011__li9679308171750">If no, go to <a href="#ALM-12011__li3033024171750">4</a>.</li></ul>
</p></li><li id="ALM-12011__li3033024171750"><a name="ALM-12011__li3033024171750"></a><a name="li3033024171750"></a><span>Contact the network administrator to check whether the network is faulty.</span><p><ul class="subitemlist" id="ALM-12011__ul45076245171750"><li id="ALM-12011__li20958557171750">If yes, go to <a href="#ALM-12011__li52745930171750">5</a>.</li><li id="ALM-12011__li19921552171750">If no, go to <a href="#ALM-12011__li983315367129">6</a>.</li></ul>
</p></li><li id="ALM-12011__li52745930171750"><a name="ALM-12011__li52745930171750"></a><a name="li52745930171750"></a><span>Rectify the network fault and check whether the alarm is cleared from the alarm list.</span><p><ul class="subitemlist" id="ALM-12011__ul35448373171750"><li id="ALM-12011__li27297218171750">If yes, no further action is required.</li><li id="ALM-12011__li63591031171750">If no, go to <a href="#ALM-12011__li983315367129">6</a>.</li></ul>
</p></li></ol>
@ -87,14 +87,14 @@
<p id="ALM-12011__p726310487918"><strong id="ALM-12011__b12631484919">vi </strong><em id="ALM-12011__i1026319487919">ha.log.2021-03-22_12-00-07</em></p>
<p id="ALM-12011__p1526310489912">Check whether error information is reported before and after the alarm generation time.</p>
<ul id="ALM-12011__ul42631481197"><li id="ALM-12011__li132632482916">If yes, rectify the fault based on the error information. Then go to <a href="#ALM-12011__li985632952514">13</a>.<p id="ALM-12011__p10535704244">For example, if the following error information is displayed, the directory permission is insufficient. In this case, change the directory permission to be the same as that on the normal node.</p>
<p id="ALM-12011__p2263148298"><span><img id="ALM-12011__image026311483918" src="en-us_image_0000001271157721.png"></span></p>
<p id="ALM-12011__p2263148298"><span><img id="ALM-12011__image026311483918" src="en-us_image_0000001583087593.png"></span></p>
</li><li id="ALM-12011__li82631648997">If no, go to <a href="#ALM-12011__li65512922171750">14</a>.</li></ul>
</li></ol>
</p></li><li id="ALM-12011__li985632952514"><a name="ALM-12011__li985632952514"></a><a name="li985632952514"></a><span>Wait about 10 minutes and check whether the alarm is cleared.</span><p><ul id="ALM-12011__ul118561229142514"><li id="ALM-12011__li1085622915256">If yes, no further action is required.</li><li id="ALM-12011__li20856429172515">If no, go to <a href="#ALM-12011__li65512922171750">14</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12011__p3197719917181"><strong id="ALM-12011__b5988844117185">Collect fault information.</strong></p>
<ol start="14" id="ALM-12011__ol2384130317188"><li id="ALM-12011__li65512922171750"><a name="ALM-12011__li65512922171750"></a><a name="li65512922171750"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12011__b173565410011">O&amp;M</strong> &gt; <strong id="ALM-12011__b44561915171750">Log &gt; Download</strong>.</span></li><li id="ALM-12011__li28001919171750"><span>Select the following nodes from the <strong id="ALM-12011__b18740191794">Service</strong> and click <strong id="ALM-12011__b15740101993">OK</strong>:</span><p><ul class="subitemlist" id="ALM-12011__ul40394026171750"><li id="ALM-12011__li44518484171750">OmmServer</li><li id="ALM-12011__li65122042171750">Controller</li><li id="ALM-12011__li49227467171750">NodeAgent</li></ul>
</p></li><li id="ALM-12011__li1145664103113"><span>Click <span><img id="ALM-12011__image1945644173117" src="en-us_image_0269383818.png"></span> in the upper right corner, and set <strong id="ALM-12011__b6456941173117">Start Date</strong> and <strong id="ALM-12011__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12011__b13456164113319">Download</strong>.</span></li><li id="ALM-12011__li495644512588"><span>Contact the <span id="ALM-12011__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</p></li><li id="ALM-12011__li1145664103113"><span>Click <span><img id="ALM-12011__image1945644173117" src="en-us_image_0000001582927829.png"></span> in the upper right corner, and set <strong id="ALM-12011__b6456941173117">Start Date</strong> and <strong id="ALM-12011__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12011__b13456164113319">Download</strong>.</span></li><li id="ALM-12011__li495644512588"><span>Contact the <span id="ALM-12011__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12011__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12011__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
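<p>One possible cause listed for ALM-12011 is that the storage space of the <strong>/srv/BigData/LocalBackup</strong> directory is full. The sketch below shows one way to check this on the Manager node. The function names and the 0.95 limit are assumptions made for illustration; the procedure does not define a numeric limit.</p>

```python
import shutil


def used_fraction(path):
    """Return the used share (0.0 to 1.0) of the file system that holds path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path, limit=0.95):
    """Report whether the used share reaches the limit (hypothetical default)."""
    return used_fraction(path) >= limit


if __name__ == "__main__":
    # Directory named in the Possible Causes list of ALM-12011.
    backup_dir = "/srv/BigData/LocalBackup"
    try:
        print(backup_dir, "nearly full:", is_nearly_full(backup_dir))
    except FileNotFoundError:
        print(backup_dir, "does not exist on this host")
```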

169
docs/mrs/umn/ALM-12012.html Normal file

File diff suppressed because it is too large Load Diff

@ -69,7 +69,7 @@
</div>
<div class="section" id="ALM-12014__s86d97a0503184bfd9d0e267312170d65"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12014__en-us_topic_0070543526_ul45207585"><li id="ALM-12014__en-us_topic_0070543526_li4215088">The hard disk is removed.</li><li id="ALM-12014__en-us_topic_0070543526_li37935797">The hard disk is offline, or a bad sector exists on the hard disk.</li></ul>
</div>
<div class="section" id="ALM-12014__sb1a1ee7b7a444d5dbe8388e9c9e8bba9"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12014__ol43371064173421"><li id="ALM-12014__li30640494173421"><span>On FusionInsight Manager, click <strong id="ALM-12014__b18317580173421">O&amp;M &gt; Alarm &gt; Alarms</strong>, and click <span><img id="ALM-12014__image10408151910137" src="en-us_image_0269383822.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12014__li51941841173421"><span>Obtain <strong id="ALM-12014__b65960965173421">HostName</strong>, <strong id="ALM-12014__b56777780173421">PartitionName</strong> and <strong id="ALM-12014__b41237977173421">DirName</strong> from <strong id="ALM-12014__b645062473115">Location</strong>.</span></li><li id="ALM-12014__li15983295173421"><span>Check whether the disk of <strong id="ALM-12014__b64823390173421">PartitionName</strong> on <strong id="ALM-12014__b46539606173421">HostName</strong> is inserted to the correct server slot.</span><p><ul class="subitemlist" id="ALM-12014__ul9232462173421"><li id="ALM-12014__li11611727173421">If yes, go to <a href="#ALM-12014__li9631929173421">4</a>.</li><li id="ALM-12014__li1025829173421">If no, go to <a href="#ALM-12014__li18162941173421">5</a>.</li></ul>
<div class="section" id="ALM-12014__sb1a1ee7b7a444d5dbe8388e9c9e8bba9"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12014__ol43371064173421"><li id="ALM-12014__li30640494173421"><span>On FusionInsight Manager, click <strong id="ALM-12014__b18317580173421">O&amp;M &gt; Alarm &gt; Alarms</strong>, and click <span><img id="ALM-12014__image10408151910137" src="en-us_image_0000001532767638.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12014__li51941841173421"><span>Obtain <strong id="ALM-12014__b65960965173421">HostName</strong>, <strong id="ALM-12014__b56777780173421">PartitionName</strong> and <strong id="ALM-12014__b41237977173421">DirName</strong> from <strong id="ALM-12014__b645062473115">Location</strong>.</span></li><li id="ALM-12014__li15983295173421"><span>Check whether the disk of <strong id="ALM-12014__b64823390173421">PartitionName</strong> on <strong id="ALM-12014__b46539606173421">HostName</strong> is inserted to the correct server slot.</span><p><ul class="subitemlist" id="ALM-12014__ul9232462173421"><li id="ALM-12014__li11611727173421">If yes, go to <a href="#ALM-12014__li9631929173421">4</a>.</li><li id="ALM-12014__li1025829173421">If no, go to <a href="#ALM-12014__li18162941173421">5</a>.</li></ul>
</p></li><li id="ALM-12014__li9631929173421"><a name="ALM-12014__li9631929173421"></a><a name="li9631929173421"></a><span>Contact hardware engineers to remove the faulty disk.</span></li><li id="ALM-12014__li18162941173421"><a name="ALM-12014__li18162941173421"></a><a name="li18162941173421"></a><span>Log in to the <strong id="ALM-12014__b19578501173421">HostName</strong> node where the alarm is reported as user <strong id="ALM-12014__b37365710490">root</strong> and check whether the <strong id="ALM-12014__b42354785173421">/etc/fstab</strong> file contains a line with <strong id="ALM-12014__b41988789173421">DirName</strong>. <span id="ALM-12014__text43649449460"></span></span><p><ul class="subitemlist" id="ALM-12014__ul61670428173421"><li id="ALM-12014__li8185528173421">If yes, go to <a href="#ALM-12014__li20338192173421">6</a>.</li><li id="ALM-12014__li59048052173421">If no, go to <a href="#ALM-12014__li48826004173421">7</a>.</li></ul>
</p></li><li id="ALM-12014__li20338192173421"><a name="ALM-12014__li20338192173421"></a><a name="li20338192173421"></a><span>Run the <strong id="ALM-12014__b29248746173421">vi /etc/fstab</strong> command to edit the file and delete the line containing <strong id="ALM-12014__b61912122173421">DirName</strong>.</span></li><li id="ALM-12014__li48826004173421"><a name="ALM-12014__li48826004173421"></a><a name="li48826004173421"></a><span>Contact hardware engineers to insert a new disk. For details, see the hardware product document of the relevant model. If the faulty disk is in a RAID group, configure the RAID group. For details, see the configuration methods of the relevant RAID controller card.</span></li><li id="ALM-12014__li55753407173421"><span>Wait 20 to 30 minutes (the waiting time depends on the disk size), then run the <strong id="ALM-12014__b36780855173421">mount</strong> command to check whether the disk has been mounted to the <strong id="ALM-12014__b62592242173421">DirName</strong> directory.</span><p><ul class="subitemlist" id="ALM-12014__ul28564444173421"><li id="ALM-12014__li26459270173421">If yes, manually clear the alarm. No further operation is required.</li><li id="ALM-12014__li62826150173421">If no, go to <a href="#ALM-12014__li1607193817587">9</a>.</li></ul>
</p></li></ol>
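<p>The <strong>/etc/fstab</strong> check in step 5 of the ALM-12014 procedure can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical, the sample entries and the <strong>/srv/BigData/data1</strong> path are invented for the example, and the sketch matches <strong>DirName</strong> against the mount point field (the second field of an fstab entry) rather than searching the whole line.</p>

```python
def fstab_entries_for(fstab_text, dir_name):
    """Return the fstab lines whose mount point equals dir_name."""
    wanted = dir_name.rstrip("/")
    matches = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        fields = stripped.split()
        # Field 1 is the device, field 2 is the mount point.
        if len(fields) >= 2 and fields[1].rstrip("/") == wanted:
            matches.append(line)
    return matches


# Hypothetical fstab content, used only for illustration.
SAMPLE = """\
# /etc/fstab
/dev/sda1  /                   ext4  defaults  0 1
/dev/sdb1  /srv/BigData/data1  ext4  defaults  0 0
"""

print(fstab_entries_for(SAMPLE, "/srv/BigData/data1"))
```

<p>A non-empty result corresponds to the "If yes" branch of step 5, where the matching line is deleted from the file in step 6.</p>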

@ -69,7 +69,7 @@
</div>
<div class="section" id="ALM-12015__sc5211f0c333e491987141617bb9cc5d2"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12015__en-us_topic_0070543537_p5850279">The hard disk is faulty, for example, a bad sector exists.</p>
</div>
<div class="section" id="ALM-12015__s2082e61748a44109ae22b65edd6caf4f"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12015__en-us_topic_0070543537_ol4110613"><li id="ALM-12015__en-us_topic_0070543537_li36995518"><span>On FusionInsight Manager, choose <strong id="ALM-12015__b87862548435">O&amp;M</strong> &gt; <strong id="ALM-12015__b10296131615319">Alarm &gt; Alarms</strong>, click<strong id="ALM-12015__b142969161035"> </strong><span><img id="ALM-12015__image10408151910137" src="en-us_image_0269383823.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12015__en-us_topic_0070543537_li64524211"><span>Obtain <strong id="ALM-12015__en-us_topic_0070543537_b59078569">HostName</strong> and <strong id="ALM-12015__en-us_topic_0070543537_b61945077">PartitionName</strong> from <strong id="ALM-12015__b196121357184515">Location</strong>. <strong id="ALM-12015__en-us_topic_0070543537_b51495331">HostName</strong> is the node where the alarm is reported, and <strong id="ALM-12015__en-us_topic_0070543537_b60804799">PartitionName</strong> is the partition of the faulty disk.</span></li><li id="ALM-12015__en-us_topic_0070543537_li10372286"><span>Contact hardware engineers to check whether the disk is faulty. If the disk is faulty, remove it from the server.</span></li><li id="ALM-12015__en-us_topic_0070543537_li26241711"><span>After the disk is removed, alarm <strong id="ALM-12015__en-us_topic_0070543537_b34848813">ALM-12014 Partition Lost</strong> is reported. Handle the alarm. For details, see <a href="ALM-12014.html">ALM-12014 Partition Lost</a>. After the alarm <strong id="ALM-12015__en-us_topic_0070543537_b4181593">ALM-12014 Partition Lost</strong> is cleared, alarm <strong id="ALM-12015__en-us_topic_0070543537_b37634337">ALM-12015 Partition Filesystem Readonly</strong> is automatically cleared.</span></li></ol>
<div class="section" id="ALM-12015__s2082e61748a44109ae22b65edd6caf4f"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12015__en-us_topic_0070543537_ol4110613"><li id="ALM-12015__en-us_topic_0070543537_li36995518"><span>On FusionInsight Manager, choose <strong id="ALM-12015__b87862548435">O&amp;M</strong> &gt; <strong id="ALM-12015__b10296131615319">Alarm &gt; Alarms</strong>, click<strong id="ALM-12015__b142969161035"> </strong><span><img id="ALM-12015__image10408151910137" src="en-us_image_0000001582807717.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12015__en-us_topic_0070543537_li64524211"><span>Obtain <strong id="ALM-12015__en-us_topic_0070543537_b59078569">HostName</strong> and <strong id="ALM-12015__en-us_topic_0070543537_b61945077">PartitionName</strong> from <strong id="ALM-12015__b196121357184515">Location</strong>. <strong id="ALM-12015__en-us_topic_0070543537_b51495331">HostName</strong> is the node where the alarm is reported, and <strong id="ALM-12015__en-us_topic_0070543537_b60804799">PartitionName</strong> is the partition of the faulty disk.</span></li><li id="ALM-12015__en-us_topic_0070543537_li10372286"><span>Contact hardware engineers to check whether the disk is faulty. If the disk is faulty, remove it from the server.</span></li><li id="ALM-12015__en-us_topic_0070543537_li26241711"><span>After the disk is removed, alarm <strong id="ALM-12015__en-us_topic_0070543537_b34848813">ALM-12014 Partition Lost</strong> is reported. Handle the alarm. For details, see <a href="ALM-12014.html">ALM-12014 Partition Lost</a>. After the alarm <strong id="ALM-12015__en-us_topic_0070543537_b4181593">ALM-12014 Partition Lost</strong> is cleared, alarm <strong id="ALM-12015__en-us_topic_0070543537_b37634337">ALM-12015 Partition Filesystem Readonly</strong> is automatically cleared.</span></li></ol>
</div>
<div class="section" id="ALM-12015__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12015__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -68,13 +68,13 @@
<ol id="ALM-12016__ol1362745417400"><li id="ALM-12016__li24816170173938"><span>Change the alarm threshold and alarm <strong id="ALM-12016__b13281711203813">Trigger Count</strong> based on CPU usage.</span><p><p class="litext" id="ALM-12016__p6523306173938">On FusionInsight Manager, choose <strong id="ALM-12016__b73164535166">O&amp;M</strong> &gt; <strong id="ALM-12016__b1366935516171">Alarm</strong> &gt; <strong id="ALM-12016__b14318131145112">Thresholds &gt; </strong><em id="ALM-12016__i193217112515">Name of the desired cluster</em> &gt; <strong id="ALM-12016__b16357675173938">Host</strong> &gt; <strong id="ALM-12016__b13001354173938">CPU</strong> &gt; <strong id="ALM-12016__b49903330173938">Host CPU Usage</strong> and change the alarm smoothing times based on CPU usage, as shown in <a href="#ALM-12016__fig42676420173938">Figure 1</a>.</p>
<div class="note" id="ALM-12016__note57869743173938"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12016__p58625754173938">This option defines the alarm check phase. <strong id="ALM-12016__b74612137375">Trigger Count</strong> indicates the alarm check threshold. An alarm is generated when the number of check times exceeds the threshold.</p>
</div></div>
<div class="fignone" id="ALM-12016__fig42676420173938"><a name="ALM-12016__fig42676420173938"></a><a name="fig42676420173938"></a><span class="figcap"><b>Figure 1 </b>Setting alarm smoothing times</span><br><span><img id="ALM-12016__image122911304588" src="en-us_image_0269383824.png"></span></div>
<div class="fignone" id="ALM-12016__fig42676420173938"><a name="ALM-12016__fig42676420173938"></a><a name="fig42676420173938"></a><span class="figcap"><b>Figure 1 </b>Setting alarm smoothing times</span><br><span><img id="ALM-12016__image122911304588" src="en-us_image_0000001583087533.png"></span></div>
<p class="litext" id="ALM-12016__p21675643173938">On the <strong id="ALM-12016__b66954485173938">Host CPU Usage</strong> page, click <strong id="ALM-12016__b511919416293">Modify</strong> in the <strong id="ALM-12016__b19162174615296">Operation</strong> column to change the alarm threshold, as shown in <a href="#ALM-12016__fig30961038173938">Figure 2</a>.</p>
<div class="fignone" id="ALM-12016__fig30961038173938"><a name="ALM-12016__fig30961038173938"></a><a name="fig30961038173938"></a><span class="figcap"><b>Figure 2 </b>Setting an alarm threshold</span><br><span><img id="ALM-12016__image1615410501365" src="en-us_image_0000001440977805.png"></span></div>
<div class="fignone" id="ALM-12016__fig30961038173938"><a name="ALM-12016__fig30961038173938"></a><a name="fig30961038173938"></a><span class="figcap"><b>Figure 2 </b>Setting an alarm threshold</span><br><span><img id="ALM-12016__image1615410501365" src="en-us_image_0000001583127513.png"></span></div>
</p></li><li id="ALM-12016__li29621482173938"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul12793264173938"><li id="ALM-12016__li22018946173938">If yes, no further action is required.</li><li id="ALM-12016__li38704176173938">If no, go to <a href="#ALM-12016__li65266749173938">3</a>.</li></ul>
</p></li></ol>
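<p>The interaction between the alarm threshold and <strong>Trigger Count</strong> described in step 1 can be illustrated with a short sketch. This is not the Manager implementation. It assumes that an alarm is raised once the CPU usage has exceeded the threshold in <strong>Trigger Count</strong> consecutive checks, which is one plausible reading of the note; the function name and sample values are hypothetical.</p>

```python
def alarm_raised(samples, threshold, trigger_count):
    """Return True if usage exceeds threshold in trigger_count consecutive checks.

    Assumption: checks must be consecutive, and a check at or below the
    threshold resets the counter.
    """
    consecutive = 0
    for value in samples:
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= trigger_count:
            return True
    return False


# Hypothetical CPU usage samples in percent, one per check period.
print(alarm_raised([95, 50, 95, 95, 95], threshold=90, trigger_count=3))
print(alarm_raised([95, 95, 50, 95, 95], threshold=90, trigger_count=3))
```

<p>Under this assumption, a larger <strong>Trigger Count</strong> filters out short CPU spikes, while a higher threshold reduces how often any single check counts toward the alarm.</p>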
<p class="tableheading" id="ALM-12016__p48030518173938"><strong id="ALM-12016__b1326250617406">Check whether the CPU usage reaches the upper limit.</strong></p>
<ol start="3" id="ALM-12016__ol44225396174015"><li id="ALM-12016__li65266749173938"><a name="ALM-12016__li65266749173938"></a><a name="li65266749173938"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12016__image168221113135319" src="en-us_image_0269383826.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12016__li52115308173938"><span>On the <strong id="ALM-12016__b51685932101729">Hosts</strong> page, click the node on which the alarm is reported.</span></li><li id="ALM-12016__li60590444173938"><span>View the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.</span></li><li id="ALM-12016__li38620506173938"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul30302958173938"><li id="ALM-12016__li8878949173938">If yes, no further action is required.</li><li id="ALM-12016__li48106238173938">If no, go to <a href="#ALM-12016__li35735451173938">7</a>.</li></ul>
<ol start="3" id="ALM-12016__ol44225396174015"><li id="ALM-12016__li65266749173938"><a name="ALM-12016__li65266749173938"></a><a name="li65266749173938"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12016__image168221113135319" src="en-us_image_0000001582927773.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12016__li52115308173938"><span>On the <strong id="ALM-12016__b51685932101729">Hosts</strong> page, click the node on which the alarm is reported.</span></li><li id="ALM-12016__li60590444173938"><span>View the CPU usage for 5 minutes. If the CPU usage exceeds the threshold multiple times, contact the system administrator to add more CPUs.</span></li><li id="ALM-12016__li38620506173938"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12016__ul30302958173938"><li id="ALM-12016__li8878949173938">If yes, no further action is required.</li><li id="ALM-12016__li48106238173938">If no, go to <a href="#ALM-12016__li35735451173938">7</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12016__p51491657174016"><strong id="ALM-12016__b42921091174020">Collect fault information.</strong></p>
<ol start="7" id="ALM-12016__ol57964469174025"><li id="ALM-12016__li35735451173938"><a name="ALM-12016__li35735451173938"></a><a name="li35735451173938"></a><span>On the FusionInsight Manager in the active cluster, choose <strong id="ALM-12016__b12040241173938">O&amp;M</strong> &gt; <strong id="ALM-12016__b41253307173938">Log &gt; Download</strong>.</span></li><li id="ALM-12016__li49036890173938"><span>Select <strong id="ALM-12016__b53183609173938">OmmServer</strong> from the <strong id="ALM-12016__b477010478910">Service</strong> and click <strong id="ALM-12016__b1577112471895">OK</strong>.</span></li><li id="ALM-12016__li11141594173938"><span>In <strong id="ALM-12016__b20155417195615">Time Range</strong>, set <strong id="ALM-12016__b38678826173938">Start Date</strong> for log collection to 10 minutes before the alarm generation time and <strong id="ALM-12016__b12565117173938">End Date</strong> to 10 minutes after the alarm generation time. Then, click <strong id="ALM-12016__b45977197173938">Download</strong>.</span></li><li id="ALM-12016__li495644512588"><span>Contact the <span id="ALM-12016__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>


@ -71,11 +71,11 @@
</div>
<div class="section" id="ALM-12017__s6fd2395d167c4db4814624ea702a37ac"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12017__en-us_topic_0070543559_p23949084"><strong id="ALM-12017__b457009885739">Check whether the alarm threshold is appropriate.</strong></p>
<ol id="ALM-12017__ol229057318582"><li id="ALM-12017__li3269990385745"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12017__b126241333219">O&amp;M</strong> &gt; <strong id="ALM-12017__b156241435323">Alarm &gt;</strong> <strong id="ALM-12017__b1562412314328">Thresholds</strong><strong id="ALM-12017__b1962413373216"> &gt; </strong><em id="ALM-12017__i1162415315324">Name of the desired cluster</em> &gt; <strong id="ALM-12017__b962416314323">Host</strong> &gt; <strong id="ALM-12017__b106241931323">Disk</strong> &gt; <strong id="ALM-12017__b4624163203210">Disk Usage</strong> and check whether the threshold (configurable, 90% by default) is appropriate.</span><p><ul class="subitemlist" id="ALM-12017__ul1854640385745"><li id="ALM-12017__li1687169885745">If yes, go to <a href="#ALM-12017__li1280611085745">2</a>.</li><li id="ALM-12017__li2443033285745">If no, go to <a href="#ALM-12017__li2782670585745">4</a>.</li></ul>
</p></li><li id="ALM-12017__li1280611085745"><a name="ALM-12017__li1280611085745"></a><a name="li1280611085745"></a><span>Choose <strong id="ALM-12017__b2586367385745">O&amp;M</strong> &gt; <strong id="ALM-12017__b1379910713499">Alarm &gt;</strong> <strong id="ALM-12017__b2887114614242">Thresholds</strong><strong id="ALM-12017__b29831221166"> &gt; </strong><em id="ALM-12017__i9983102101619">Name of the desired cluster</em> &gt; <strong id="ALM-12017__b6413578985745">Host</strong> &gt; <strong id="ALM-12017__b4035119385745">Disk</strong> &gt; <strong id="ALM-12017__b2761642585745">Disk Usage</strong> and click <strong id="ALM-12017__b6659180133310">Modify</strong> in the <strong id="ALM-12017__b1374719315332">Operation</strong> column to change the alarm threshold based on site requirements. As shown in <a href="#ALM-12017__fig6063892885745">Figure 1</a>:</span><p><div class="fignone" id="ALM-12017__fig6063892885745"><a name="ALM-12017__fig6063892885745"></a><a name="fig6063892885745"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12017__image1615410501365" src="en-us_image_0000001440977873.png"></span></div>
</p></li><li id="ALM-12017__li1280611085745"><a name="ALM-12017__li1280611085745"></a><a name="li1280611085745"></a><span>Choose <strong id="ALM-12017__b2586367385745">O&amp;M</strong> &gt; <strong id="ALM-12017__b1379910713499">Alarm &gt;</strong> <strong id="ALM-12017__b2887114614242">Thresholds</strong><strong id="ALM-12017__b29831221166"> &gt; </strong><em id="ALM-12017__i9983102101619">Name of the desired cluster</em> &gt; <strong id="ALM-12017__b6413578985745">Host</strong> &gt; <strong id="ALM-12017__b4035119385745">Disk</strong> &gt; <strong id="ALM-12017__b2761642585745">Disk Usage</strong> and click <strong id="ALM-12017__b6659180133310">Modify</strong> in the <strong id="ALM-12017__b1374719315332">Operation</strong> column to change the alarm threshold based on site requirements. As shown in <a href="#ALM-12017__fig6063892885745">Figure 1</a>:</span><p><div class="fignone" id="ALM-12017__fig6063892885745"><a name="ALM-12017__fig6063892885745"></a><a name="fig6063892885745"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12017__image1615410501365" src="en-us_image_0000001582927861.png"></span></div>
</p></li><li id="ALM-12017__li4783109885745"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul59050785745"><li id="ALM-12017__li4814612685745">If yes, no further action is required.</li><li id="ALM-12017__li752215285745">If no, go to <a href="#ALM-12017__li2782670585745">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12017__p531456685745"><strong id="ALM-12017__b98862278588">Check whether the disk usage reaches the upper limit.</strong></p>
<ol start="4" id="ALM-12017__ol1005390085829"><li id="ALM-12017__li2782670585745"><a name="ALM-12017__li2782670585745"></a><a name="li2782670585745"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12017__image168221113135319" src="en-us_image_0269383828.png"></span> in the row where the alarm is located to view the alarm host name and disk partition information in the alarm details.</span></li><li id="ALM-12017__li3937060885745"><span>Log in to the node where the alarm is generated as user <strong id="ALM-12017__b4911375485745">root</strong>. <span id="ALM-12017__text43649449460"></span></span></li><li id="ALM-12017__li1529764085745"><span>Run the <strong id="ALM-12017__b5391142133919">df -lmPT | awk '$2 != "iso9660"' | grep '^/dev/' | awk '{"readlink -m "$1 | getline real }{$1=real; print $0}' | sort -u -k 1,1</strong> command to check the system disk partition usage. Check whether the disk is mounted to the following directories based on the disk partition name obtained in <a href="#ALM-12017__li2782670585745">4</a>: <strong id="ALM-12017__b4568855685745">/</strong>, <strong id="ALM-12017__b2096079285745">/opt</strong>, <strong id="ALM-12017__b5442940785745">/tmp</strong>, <strong id="ALM-12017__b2010261785745">/var</strong>, <strong id="ALM-12017__b4670583385745">/var/log</strong>, and <strong id="ALM-12017__b2507614885745">/srv/BigData</strong> (can be customized).</span><p><ul class="subitemlist" id="ALM-12017__ul3152589985745"><li id="ALM-12017__li1790212085745">If yes, the disk is a system disk. Then go to <a href="#ALM-12017__li6170195385745">10</a>.</li><li id="ALM-12017__li4078557985745">If no, the disk is not a system disk. Then go to <a href="#ALM-12017__li1190839985745">7</a>.</li></ul>
<ol start="4" id="ALM-12017__ol1005390085829"><li id="ALM-12017__li2782670585745"><a name="ALM-12017__li2782670585745"></a><a name="li2782670585745"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12017__image168221113135319" src="en-us_image_0000001582807909.png"></span> in the row where the alarm is located to view the alarm host name and disk partition information in the alarm details.</span></li><li id="ALM-12017__li3937060885745"><span>Log in to the node where the alarm is generated as user <strong id="ALM-12017__b4911375485745">root</strong>. <span id="ALM-12017__text43649449460"></span></span></li><li id="ALM-12017__li1529764085745"><span>Run the <strong id="ALM-12017__b5391142133919">df -lmPT | awk '$2 != "iso9660"' | grep '^/dev/' | awk '{"readlink -m "$1 | getline real }{$1=real; print $0}' | sort -u -k 1,1</strong> command to check the system disk partition usage. Check whether the disk is mounted to the following directories based on the disk partition name obtained in <a href="#ALM-12017__li2782670585745">4</a>: <strong id="ALM-12017__b4568855685745">/</strong>, <strong id="ALM-12017__b2096079285745">/opt</strong>, <strong id="ALM-12017__b5442940785745">/tmp</strong>, <strong id="ALM-12017__b2010261785745">/var</strong>, <strong id="ALM-12017__b4670583385745">/var/log</strong>, and <strong id="ALM-12017__b2507614885745">/srv/BigData</strong> (can be customized).</span><p><ul class="subitemlist" id="ALM-12017__ul3152589985745"><li id="ALM-12017__li1790212085745">If yes, the disk is a system disk. Then go to <a href="#ALM-12017__li6170195385745">10</a>.</li><li id="ALM-12017__li4078557985745">If no, the disk is not a system disk. Then go to <a href="#ALM-12017__li1190839985745">7</a>.</li></ul>
</p></li><li id="ALM-12017__li1190839985745"><a name="ALM-12017__li1190839985745"></a><a name="li1190839985745"></a><span>Run the <strong id="ALM-12017__b10661194925219">df -lmPT | awk '$2 != "iso9660"' | grep '^/dev/' | awk '{"readlink -m "$1 | getline real }{$1=real; print $0}' | sort -u -k 1,1</strong> command to check the system disk partition usage. Determine the role of the disk based on the disk partition name obtained in <a href="#ALM-12017__li2782670585745">4</a>.</span></li><li id="ALM-12017__li11884059152614"><span>Check the disk service.</span><p><div class="p" id="ALM-12017__p0769162644910">In <span id="ALM-12017__text13624174411515">MRS</span>, check whether the disk service is HDFS, Yarn, Kafka, or Supervisor.<ul id="ALM-12017__ul148852372297"><li id="ALM-12017__li10740174317299">If yes, adjust the capacity. Then go to <a href="#ALM-12017__li1354951085745">9</a>.</li><li id="ALM-12017__li1159152152914">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
</div>
</p></li><li id="ALM-12017__li1354951085745"><a name="ALM-12017__li1354951085745"></a><a name="li1354951085745"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul150550185745"><li id="ALM-12017__li4676654185745">If yes, no further action is required.</li><li id="ALM-12017__li2999343985745">If no, go to <a href="#ALM-12017__li1359113885745">12</a>.</li></ul>
@ -84,7 +84,7 @@
</p></li><li id="ALM-12017__li1359113885745"><a name="ALM-12017__li1359113885745"></a><a name="li1359113885745"></a><span>Contact the system administrator to expand the disk capacity.</span></li><li id="ALM-12017__li2833807185745"><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12017__ul5088862685745"><li id="ALM-12017__li5521138285745">If yes, no further action is required.</li><li id="ALM-12017__li4293699485745">If no, go to <a href="#ALM-12017__li5603307085745">14</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12017__p5534445785745"><strong id="ALM-12017__b657764185839">Collect fault information.</strong></p>
<ol start="14" id="ALM-12017__ol4750985985842"><li id="ALM-12017__li5603307085745"><a name="ALM-12017__li5603307085745"></a><a name="li5603307085745"></a><span>On FusionInsight Manager, choose <strong id="ALM-12017__b13819155015320">O&amp;M</strong> &gt; <strong id="ALM-12017__b1368243785745">Log &gt; Download</strong>.</span></li><li id="ALM-12017__li1061898185745"><span>Select <strong id="ALM-12017__b1352831932712">OMS</strong> from the <strong id="ALM-12017__b13893145519916">Service</strong> and click <strong id="ALM-12017__b20893115513911">OK</strong>.</span></li><li id="ALM-12017__li1145664103113"><span>Click <span><img id="ALM-12017__image1945644173117" src="en-us_image_0269383829.png"></span> in the upper right corner, and set <strong id="ALM-12017__b6456941173117">Start Date</strong> and <strong id="ALM-12017__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12017__b13456164113319">Download</strong>.</span></li><li id="ALM-12017__li495644512588"><span>Contact the <span id="ALM-12017__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="14" id="ALM-12017__ol4750985985842"><li id="ALM-12017__li5603307085745"><a name="ALM-12017__li5603307085745"></a><a name="li5603307085745"></a><span>On FusionInsight Manager, choose <strong id="ALM-12017__b13819155015320">O&amp;M</strong> &gt; <strong id="ALM-12017__b1368243785745">Log &gt; Download</strong>.</span></li><li id="ALM-12017__li1061898185745"><span>Select <strong id="ALM-12017__b1352831932712">OMS</strong> from the <strong id="ALM-12017__b13893145519916">Service</strong> and click <strong id="ALM-12017__b20893115513911">OK</strong>.</span></li><li id="ALM-12017__li1145664103113"><span>Click <span><img id="ALM-12017__image1945644173117" src="en-us_image_0000001583127613.png"></span> in the upper right corner, and set <strong id="ALM-12017__b6456941173117">Start Date</strong> and <strong id="ALM-12017__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12017__b13456164113319">Download</strong>.</span></li><li id="ALM-12017__li495644512588"><span>Contact the <span id="ALM-12017__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12017__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12017__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
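The disk usage check in this procedure can also be scripted. The following is a minimal sketch, not part of MRS: the function name `over_threshold` is hypothetical, it assumes POSIX-format `df -P` output on standard input, and it uses the default threshold of 90% mentioned above. It does not handle mount points that contain spaces.

```shell
#!/bin/sh
# Hypothetical helper (not an MRS tool): print "<mount point> <usage>" for
# every file system whose usage is at or above the given limit.
# Input: `df -P` style output on stdin. $1 = limit in percent.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {                  # skip the df header line
            use = $5
            sub(/%$/, "", use)    # "95%" -> "95"
            if (use + 0 >= limit) print $6, use
        }'
}
```

For example, `df -lP | over_threshold 90` lists the local partitions that have reached 90% usage. Compare the result with the partition named in the alarm details.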


@ -72,10 +72,10 @@ MemAvailable: 227641452 kB</pre>
</p></li><li id="ALM-12018__li448043669252"><span>Calculate the real-world memory usage: Memory usage = 1 - (Memory available/Memory total)</span><p><ul class="subitemlist" id="ALM-12018__ul568914459252"><li id="ALM-12018__li42205629252">If the memory usage is lower than 90%, manually disable transferring from monitoring indicators to alarms.</li><li id="ALM-12018__li63212719252">If the memory usage is higher than 90%, go to <a href="#ALM-12018__li5861159252">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12018__p422609659252"><strong id="ALM-12018__b40915938935">Expand the system.</strong></p>
<ol start="4" id="ALM-12018__ol28552339317"><li id="ALM-12018__li5861159252"><a name="ALM-12018__li5861159252"></a><a name="li5861159252"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12018__image168221113135319" src="en-us_image_0269383830.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12018__li474753219252"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12018__b52750359252">root</strong>. <span id="ALM-12018__text5966104516217"></span></span></li><li id="ALM-12018__li242002745617"><span>If the memory usage exceeds the threshold, perform memory capacity expansion.</span></li><li id="ALM-12018__li202957929252"><span>Run the command <strong id="ALM-12018__b246247099252">free -m | grep Mem\: | awk '{printf("%s,", $3 * 100 / $2)}'</strong> to check the system memory usage.</span></li><li id="ALM-12018__li305215859252"><span>Wait for 5 minutes and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12018__ul111473689252"><li id="ALM-12018__li316825749252">If yes, no further action is required.</li><li id="ALM-12018__li161516779252">If no, go to <a href="#ALM-12018__li372014939252">9</a>.</li></ul>
<ol start="4" id="ALM-12018__ol28552339317"><li id="ALM-12018__li5861159252"><a name="ALM-12018__li5861159252"></a><a name="li5861159252"></a><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12018__image168221113135319" src="en-us_image_0000001582927669.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12018__li474753219252"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12018__b52750359252">root</strong>. <span id="ALM-12018__text5966104516217"></span></span></li><li id="ALM-12018__li242002745617"><span>If the memory usage exceeds the threshold, perform memory capacity expansion.</span></li><li id="ALM-12018__li202957929252"><span>Run the command <strong id="ALM-12018__b246247099252">free -m | grep Mem\: | awk '{printf("%s,", $3 * 100 / $2)}'</strong> to check the system memory usage.</span></li><li id="ALM-12018__li305215859252"><span>Wait for 5 minutes and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12018__ul111473689252"><li id="ALM-12018__li316825749252">If yes, no further action is required.</li><li id="ALM-12018__li161516779252">If no, go to <a href="#ALM-12018__li372014939252">9</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12018__p332174499252"><strong id="ALM-12018__b165019069321">Collect fault information.</strong></p>
<ol start="9" id="ALM-12018__ol300682989324"><li id="ALM-12018__li372014939252"><a name="ALM-12018__li372014939252"></a><a name="li372014939252"></a><span>On the FusionInsight Manager in the active cluster, choose <strong id="ALM-12018__b57841710145614">O&amp;M</strong> &gt; <strong id="ALM-12018__b563292829252">Log &gt; Download</strong>.</span></li><li id="ALM-12018__li40625489252"><span>Select <strong id="ALM-12018__b663779889252">OmmServer</strong> from the <strong id="ALM-12018__b1099120531019">Service</strong> and click <strong id="ALM-12018__b999117511012">OK</strong>.</span></li><li id="ALM-12018__li1145664103113"><span>Click <span><img id="ALM-12018__image1945644173117" src="en-us_image_0269383831.png"></span> in the upper right corner, and set <strong id="ALM-12018__b6456941173117">Start Date</strong> and <strong id="ALM-12018__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12018__b13456164113319">Download</strong>.</span></li><li id="ALM-12018__li495644512588"><span>Contact the <span id="ALM-12018__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="9" id="ALM-12018__ol300682989324"><li id="ALM-12018__li372014939252"><a name="ALM-12018__li372014939252"></a><a name="li372014939252"></a><span>On the FusionInsight Manager in the active cluster, choose <strong id="ALM-12018__b57841710145614">O&amp;M</strong> &gt; <strong id="ALM-12018__b563292829252">Log &gt; Download</strong>.</span></li><li id="ALM-12018__li40625489252"><span>Select <strong id="ALM-12018__b663779889252">OmmServer</strong> from the <strong id="ALM-12018__b1099120531019">Service</strong> and click <strong id="ALM-12018__b999117511012">OK</strong>.</span></li><li id="ALM-12018__li1145664103113"><span>Click <span><img id="ALM-12018__image1945644173117" src="en-us_image_0000001583127413.png"></span> in the upper right corner, and set <strong id="ALM-12018__b6456941173117">Start Date</strong> and <strong id="ALM-12018__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12018__b13456164113319">Download</strong>.</span></li><li id="ALM-12018__li495644512588"><span>Contact the <span id="ALM-12018__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12018__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12018__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
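The memory usage formula in this procedure (Memory usage = 1 - Memory available/Memory total) can be evaluated directly from <strong>/proc/meminfo</strong>. The following is a minimal sketch under that assumption. The function name `mem_usage_percent` is hypothetical and is not an MRS command. The optional argument names an alternative file in the same format.

```shell
#!/bin/sh
# Hypothetical helper (not an MRS tool): print the memory usage in percent,
# computed as (1 - MemAvailable / MemTotal) * 100.
# $1 = meminfo-format file to read (defaults to /proc/meminfo).
mem_usage_percent() {
    awk '
        /^MemTotal:/     { total = $2 }   # value in kB
        /^MemAvailable:/ { avail = $2 }   # value in kB
        END {
            if (total > 0) printf "%.1f\n", (1 - avail / total) * 100
        }' "${1:-/proc/meminfo}"
}
```

Run `mem_usage_percent` on the host named in the alarm and compare the result with the 90% value used in this procedure. The `free -m` command in the later step divides used memory by total memory instead, so the two figures can differ.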


@ -65,14 +65,14 @@
<div class="section" id="ALM-12027__s3ddd6cfc758a404a82adc3dfe898bd66"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12027__p681753145417">Too many processes are running on the node. You need to increase the value of <strong id="ALM-12027__en-us_topic_0070543581_b61845569">pid_max</strong>.</p>
</div>
<div class="section" id="ALM-12027__s9445b6fc399a470295ea751769713fde"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12027__en-us_topic_0070543581_p55372696"><strong id="ALM-12027__b360029529747">Increase the value of pid_max.</strong></p>
<ol id="ALM-12027__ol240915109757"><li id="ALM-12027__li639798269750"><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12027__image168221113135319" src="en-us_image_0269383832.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12027__li149834549750"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12027__b389475309750">root</strong>. <span id="ALM-12027__text43649449460"></span></span></li><li id="ALM-12027__li513020679750"><span>Run the <strong id="ALM-12027__b6333589750">cat /proc/sys/kernel/pid_max</strong> command to check the value of <strong id="ALM-12027__b57002299750">pid_max</strong>.</span></li><li id="ALM-12027__li205272659750"><span>If the PID usage exceeds the threshold, run the command <strong id="ALM-12027__b590654259750">echo </strong><em id="ALM-12027__i618267859750">new value </em><strong id="ALM-12027__b195701549750">&gt; /proc/sys/kernel/pid_max</strong> to enlarge the value of <strong id="ALM-12027__b419136639750">pid_max</strong>.</span><p><p class="litext" id="ALM-12027__p395635099750">Example: <strong id="ALM-12027__b416786479750">echo 65536 &gt; /proc/sys/kernel/pid_max</strong></p>
<ol id="ALM-12027__ol240915109757"><li id="ALM-12027__li639798269750"><span>In the alarm list on FusionInsight Manager, click <span><img id="ALM-12027__image168221113135319" src="en-us_image_0000001532607906.png"></span> in the row where the alarm is located to view the alarm host address in the alarm details.</span></li><li id="ALM-12027__li149834549750"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12027__b389475309750">root</strong>. <span id="ALM-12027__text43649449460"></span></span></li><li id="ALM-12027__li513020679750"><span>Run the <strong id="ALM-12027__b6333589750">cat /proc/sys/kernel/pid_max</strong> command to check the value of <strong id="ALM-12027__b57002299750">pid_max</strong>.</span></li><li id="ALM-12027__li205272659750"><span>If the PID usage exceeds the threshold, run the command <strong id="ALM-12027__b590654259750">echo </strong><em id="ALM-12027__i618267859750">new value </em><strong id="ALM-12027__b195701549750">&gt; /proc/sys/kernel/pid_max</strong> to enlarge the value of <strong id="ALM-12027__b419136639750">pid_max</strong>.</span><p><p class="litext" id="ALM-12027__p395635099750">Example: <strong id="ALM-12027__b416786479750">echo 65536 &gt; /proc/sys/kernel/pid_max</strong></p>
<div class="note" id="ALM-12027__note163571615102916"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12027__p10664145203015">The maximum value of <span class="parmname" id="ALM-12027__parmname1566455103015"><b>pid_max</b></span> is as follows:</p>
<ul id="ALM-12027__ul13990143413014"><li id="ALM-12027__li7990034173015">On 32-bit systems: 32768</li><li id="ALM-12027__li799018345307">On 64-bit systems: 4194304 (2^22)</li></ul>
</div></div>
</p></li><li id="ALM-12027__li148339459750"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12027__ul590069549750"><li id="ALM-12027__li505276609750">If yes, no further action is required.</li><li id="ALM-12027__li662086519750">If no, go to <a href="#ALM-12027__li377225729750">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12027__p61837339750"><strong id="ALM-12027__b361001479817">Collect fault information.</strong></p>
<ol start="6" id="ALM-12027__ol116595289821"><li id="ALM-12027__li377225729750"><a name="ALM-12027__li377225729750"></a><a name="li377225729750"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12027__b311203779750">O&amp;M</strong> &gt; <strong id="ALM-12027__b116479379750">Log &gt; Download</strong>.</span></li><li id="ALM-12027__li3107269750"><span>Select all services from the <strong id="ALM-12027__b356295299750">Service</strong> and click <strong id="ALM-12027__b3991118545">OK</strong>.</span></li><li id="ALM-12027__li1145664103113"><span>Click <span><img id="ALM-12027__image1945644173117" src="en-us_image_0269383834.png"></span> in the upper right corner, and set <strong id="ALM-12027__b6456941173117">Start Date</strong> and <strong id="ALM-12027__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12027__b13456164113319">Download</strong>.</span></li><li id="ALM-12027__li495644512588"><span>Contact the <span id="ALM-12027__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="6" id="ALM-12027__ol116595289821"><li id="ALM-12027__li377225729750"><a name="ALM-12027__li377225729750"></a><a name="li377225729750"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12027__b311203779750">O&amp;M</strong> &gt; <strong id="ALM-12027__b116479379750">Log &gt; Download</strong>.</span></li><li id="ALM-12027__li3107269750"><span>Select all services from the <strong id="ALM-12027__b356295299750">Service</strong> and click <strong id="ALM-12027__b3991118545">OK</strong>.</span></li><li id="ALM-12027__li1145664103113"><span>Click <span><img id="ALM-12027__image1945644173117" src="en-us_image_0000001582927797.png"></span> in the upper right corner, and set <strong id="ALM-12027__b6456941173117">Start Date</strong> and <strong id="ALM-12027__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12027__b13456164113319">Download</strong>.</span></li><li id="ALM-12027__li495644512588"><span>Contact the <span id="ALM-12027__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12027__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12027__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
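Before increasing <strong>pid_max</strong>, it can help to estimate how much of the PID space is in use. The following is a minimal sketch, not part of MRS. The function name `pid_usage_percent` is hypothetical, and the assumption that PID usage is the number of PIDs in use divided by <strong>pid_max</strong> is ours; the exact formula used by the alarm is not stated in this procedure.

```shell
#!/bin/sh
# Hypothetical helper (not an MRS tool): print PID usage in percent.
# $1 = number of PIDs currently in use, $2 = value of pid_max.
pid_usage_percent() {
    awk -v used="$1" -v max="$2" 'BEGIN {
        if (max > 0) printf "%.1f\n", used * 100 / max
    }'
}
```

One possible invocation on a Linux host with procps is `pid_usage_percent "$(ps -eL --no-headers | wc -l)" "$(cat /proc/sys/kernel/pid_max)"`, which counts threads because each thread occupies an entry in the PID space. A value written to <strong>/proc/sys/kernel/pid_max</strong> with <strong>echo</strong> applies only until the next reboot. To keep it across reboots, set <strong>kernel.pid_max</strong> in the sysctl configuration, for example in <strong>/etc/sysctl.conf</strong>.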


@ -1,8 +1,10 @@
<a name="ALM-12028"></a><a name="ALM-12028"></a>
<h1 class="topictitle1">ALM-12028 Number of Processes in the D State on a Host Exceeds the Threshold</h1>
<div id="body14709652"><div class="section" id="ALM-12028__section23718688"><h4 class="sectiontitle">Description</h4><p id="ALM-12028__p50631172">The system checks the number of processes in the D state of user <strong id="ALM-12028__b16253141134213">omm</strong> on the host every 30 seconds and compares the actual number with the threshold. The number of processes in the D state on the host has a default threshold range. This alarm is generated when the number of processes exceeds the threshold.</p>
<p id="ALM-12028__p53027366">This alarm is cleared when the <strong id="ALM-12028__b1896274320598">Trigger Count</strong> is <strong id="ALM-12028__b15669123210464">1</strong> and the total number of processes in the D state of user <strong id="ALM-12028__b19867204318485">omm</strong> on the host does not exceed the threshold. This alarm is cleared when the <strong id="ALM-12028__b134171188010">Trigger Count</strong> is greater than <strong id="ALM-12028__b466017588499">1</strong> and the total number of processes in the D state of user <strong id="ALM-12028__b1986717812518">omm</strong> on the host is less than or equal to 90% of the threshold.</p>
<h1 class="topictitle1">ALM-12028 Number of Processes in the D State and Z State on a Host Exceeds the Threshold</h1>
<div id="body14709652"><div class="section" id="ALM-12028__section23718688"><h4 class="sectiontitle">Description</h4><p id="ALM-12028__p50631172">The system checks the number of processes in the D state and Z state of user <strong id="ALM-12028__b16253141134213">omm</strong> on the host every 30 seconds and compares the actual number with the threshold. The number of processes in the D state and Z state on the host has a default threshold range. This alarm is generated when the number of processes exceeds the threshold.</p>
<p id="ALM-12028__p53027366">This alarm is cleared when the <strong id="ALM-12028__b1896274320598">Trigger Count</strong> is <strong id="ALM-12028__b15669123210464">1</strong> and the total number of processes in the D state and Z state of user <strong id="ALM-12028__b19867204318485">omm</strong> on the host does not exceed the threshold. This alarm is cleared when the <strong id="ALM-12028__b134171188010">Trigger Count</strong> is greater than <strong id="ALM-12028__b466017588499">1</strong> and the total number of processes in the D state and Z state of user <strong id="ALM-12028__b1986717812518">omm</strong> on the host is less than or equal to 90% of the threshold.</p>
<div class="note" id="ALM-12028__note13991618131016"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12028__p84028186101">The function of checking the number of processes in the Z state on the host applies to MRS 3.2.0 or later.</p>
</div></div>
</div>
<div class="section" id="ALM-12028__section12141602"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12028__table249371" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12028__row53434174"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12028__p33200870">Alarm ID</p>
@ -64,13 +66,13 @@
</div>
<div class="section" id="ALM-12028__section59967381"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12028__p66388367">The host responds slowly to I/O (disk I/O and network I/O) requests and some processes are in the D state and Z state.</p>
</div>
<div class="section" id="ALM-12028__section2835522"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12028__p8748685"><strong id="ALM-12028__b820613226166">Check the processes in the D state</strong><strong id="ALM-12028__b10206112220164"></strong><strong id="ALM-12028__b15207322181616">.</strong></p>
<ol id="ALM-12028__ol5802802991057"><li id="ALM-12028__li6390942091049"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12028__image168221113135319" src="en-us_image_0263895749.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12028__li1641579391049"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12028__b1426064751813">root</strong>. (<span id="ALM-12028__text995114020554"></span>) Then run the <strong id="ALM-12028__b3831387091049">su - omm</strong> command to switch to user <strong id="ALM-12028__b1288412448360">omm</strong>.</span></li><li id="ALM-12028__li2173547791049"><span>Run the following command as user <strong id="ALM-12028__b547373343910">omm</strong> to view the PID of the process that is in the D state:</span><p><p class="litext" id="ALM-12028__p5461083691049"><strong id="ALM-12028__b1352441191049">ps -elf | grep -v "\[thread_checkio\]" | awk 'NR!=1 {print $2, $3, $4}' | grep omm | awk -F' ' '{print $1, $3}' | grep -E "Z|D" | awk '{print $2}'</strong></p>
<div class="section" id="ALM-12028__section2835522"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12028__p8748685"><strong id="ALM-12028__b168151162515">Check the processes in the D state</strong><strong id="ALM-12028__b1581161112520"> and Z state</strong><strong id="ALM-12028__b19812114253">.</strong></p>
<ol id="ALM-12028__ol5802802991057"><li id="ALM-12028__li6390942091049"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12028__image168221113135319" src="en-us_image_0000001532448262.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12028__li1641579391049"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12028__b1426064751813">root</strong>. (<span id="ALM-12028__text995114020554"></span>) Then run the <strong id="ALM-12028__b3831387091049">su - omm</strong> command to switch to user <strong id="ALM-12028__b1288412448360">omm</strong>.</span></li><li id="ALM-12028__li12129135691210"><span>Run the following command as user <strong id="ALM-12028__b1667813740112956">omm</strong> to view the PIDs of the processes that are in the D state or Z state:</span><p><p class="litext" id="ALM-12028__p91301556161211"><strong id="ALM-12028__b613095661213">ps -elf | grep -v "\[thread_checkio\]" | awk 'NR!=1 {print $2, $3, $4}' | grep omm | awk -F' ' '{print $1, $3}' | grep -E "Z|D" | awk '{print $2}'</strong></p>
</p></li><li id="ALM-12028__li2799290091049"><span>Check whether the command output is empty.</span><p><ul class="subitemlist" id="ALM-12028__ul1056686291049"><li id="ALM-12028__li747103591049">If yes, the service process is running properly. Then go to <a href="#ALM-12028__li2701143291049">6</a>.</li><li id="ALM-12028__li117409591049">If no, go to <a href="#ALM-12028__li573000391049">5</a>.</li></ul>
</p></li><li id="ALM-12028__li573000391049"><a name="ALM-12028__li573000391049"></a><a name="li573000391049"></a><span>Switch to user <strong id="ALM-12028__b1281511314404">root</strong> and run the <strong id="ALM-12028__b8712438134020">reboot</strong> command to restart the host for which the alarm is generated. (Restarting a host is risky. Ensure that the service process is normal after the restart.)</span></li><li id="ALM-12028__li2701143291049"><a name="ALM-12028__li2701143291049"></a><a name="li2701143291049"></a><span>Check whether the alarm is cleared 5 minutes later.</span><p><ul class="subitemlist" id="ALM-12028__ul1358954691049"><li id="ALM-12028__li5157003291049">If yes, no further action is required.</li><li id="ALM-12028__li1642303091049">If no, go to <a href="#ALM-12028__li4177630091049">7</a>.</li></ul>
</p></li></ol>
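<p>The filtering done by the <strong>ps</strong> pipeline above can be sketched in Python as follows. This is only an illustration, not the check that FusionInsight Manager itself runs: the function name <strong>blocked_pids</strong> and its parameters are hypothetical, and the sketch assumes the usual <strong>ps -elf</strong> column order (F, S, UID, PID, ...). Unlike the <strong>grep omm</strong> step, it matches the UID column exactly.</p>

```python
def blocked_pids(ps_output, user="omm", skip="[thread_checkio]"):
    """Return the PIDs of processes owned by `user` whose state is D or Z.

    `ps_output` is the text printed by `ps -elf`; the first line is the header.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        if skip in line:  # same exclusion as grep -v "\[thread_checkio\]"
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        state, uid, pid = fields[1], fields[2], fields[3]
        # D = uninterruptible sleep (usually waiting for I/O), Z = zombie
        if uid == user and state in ("D", "Z"):
            pids.append(pid)
    return pids
```

<p>An empty list corresponds to an empty command output, which is the case in which the procedure skips the host restart.</p>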
<p class="tableheading" id="ALM-12028__p5519705391049"><strong id="ALM-12028__b89637239112">Collect the fault information.</strong></p>
<ol start="7" id="ALM-12028__ol128225129115"><li id="ALM-12028__li4177630091049"><a name="ALM-12028__li4177630091049"></a><a name="li4177630091049"></a><span>On FusionInsight Manager, choose <strong id="ALM-12028__b20291725145313">O&amp;M</strong> &gt; <strong id="ALM-12028__b64322519539">Log</strong> &gt; <strong id="ALM-12028__b5431625185320">Download</strong>.</span></li><li id="ALM-12028__li4044238791049"><span>Select <strong id="ALM-12028__b884279457112956">OMS</strong> for <strong id="ALM-12028__b811590828112956">Service</strong> and click <strong id="ALM-12028__b1060353008112956">OK</strong>.</span></li><li id="ALM-12028__li2843716491049"><span>Click <span><img id="ALM-12028__image104601319175315" src="en-us_image_0263895796.png"></span> in the upper right corner, and set <strong id="ALM-12028__b522882672112956">Start Date</strong> and <strong id="ALM-12028__b2029904650112956">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12028__b449569331112956">Download</strong>.</span></li><li id="ALM-12028__li2170896591049"><span>Contact <span id="ALM-12028__text02161454416">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="7" id="ALM-12028__ol128225129115"><li id="ALM-12028__li4177630091049"><a name="ALM-12028__li4177630091049"></a><a name="li4177630091049"></a><span>On FusionInsight Manager, choose <strong id="ALM-12028__b750820372495">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12028__b550820377491">Log</strong> &gt; <strong id="ALM-12028__b185081037134914">Download</strong>.</span></li><li id="ALM-12028__li4044238791049"><span>Select <strong id="ALM-12028__b884279457112956">OMS</strong> for <strong id="ALM-12028__b811590828112956">Service</strong> and click <strong id="ALM-12028__b1060353008112956">OK</strong>.</span></li><li id="ALM-12028__li2843716491049"><span>Click <span><img id="ALM-12028__image104601319175315" src="en-us_image_0000001583087581.png"></span> in the upper right corner, and set <strong id="ALM-12028__b522882672112956">Start Date</strong> and <strong id="ALM-12028__b2029904650112956">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12028__b449569331112956">Download</strong>.</span></li><li id="ALM-12028__li2170896591049"><span>Contact <span id="ALM-12028__text02161454416">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12028__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12028__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>

View File

@ -1,16 +1,19 @@
<a name="ALM-12033"></a><a name="ALM-12033"></a>
<h1 class="topictitle1">ALM-12033 Slow Disk Fault</h1>
<div id="body16648012"><div class="section" id="ALM-12033__section37461388"><h4 class="sectiontitle">Description</h4><ul id="ALM-12033__ul58461341018"><li id="ALM-12033__li9495543441">For HDDs, the alarm is triggered when any of the following conditions is met:<ul id="ALM-12033__ul14596172044610"><li id="ALM-12033__li185963201462">The system runs the <strong id="ALM-12033__b1527164612239">iostat</strong> command every 3 seconds, and detects that the <strong id="ALM-12033__b9269442241">svctm</strong> value exceeds 1000 ms for 10 consecutive periods within 30 seconds.</li><li id="ALM-12033__li1959692084618">The system runs the <strong id="ALM-12033__b36725295510">iostat</strong> command every 3 seconds, and detects that more than 60% of I/O exceeds 150 ms within 300 seconds.</li></ul>
</li><li id="ALM-12033__li88478345118">For SSDs, the alarm is triggered when any of the following conditions is met:<ul id="ALM-12033__ul1697514491912"><li id="ALM-12033__li20184348616">The system runs the <strong id="ALM-12033__b15668702267">iostat</strong> command every 3 seconds, and detects that the <strong id="ALM-12033__b266920112620">svctm</strong> value exceeds 1000 ms for 10 consecutive periods within 30 seconds.</li><li id="ALM-12033__li818514818112">The system runs the <strong id="ALM-12033__b549820317266">iostat</strong> command every 3 seconds, and detects that more than 60% of I/O exceeds 20 ms within 300 seconds.</li></ul>
<div id="body16648012"><div class="section" id="ALM-12033__section37461388"><h4 class="sectiontitle">Description</h4><ul id="ALM-12033__ul58461341018"><li id="ALM-12033__li9495543441">For HDDs, the alarm is triggered when any of the following conditions is met:<ul id="ALM-12033__ul12610161595313"><li id="ALM-12033__li5610201585311">The system runs the <strong id="ALM-12033__b1417353444611">iostat</strong> command every 3 seconds, and detects that the <strong id="ALM-12033__b15173183464614">svctm</strong> value exceeds 1000 ms for 7 consecutive periods within 30 seconds.</li><li id="ALM-12033__li9610111545314">The system runs the <strong id="ALM-12033__b46619613475">iostat</strong> command every 3 seconds, and detects that more than 50% of I/Os take more than 150 ms within 300 seconds.</li></ul>
</li><li id="ALM-12033__li88478345118">For SSDs, the alarm is triggered when any of the following conditions is met:<ul id="ALM-12033__ul1697514491912"><li id="ALM-12033__li20184348616">The system runs the <strong id="ALM-12033__b15668702267">iostat</strong> command every 3 seconds, and detects that the <strong id="ALM-12033__b266920112620">svctm</strong> value exceeds 1000 ms for 10 consecutive periods within 30 seconds.</li><li id="ALM-12033__li818514818112">The system runs the <strong id="ALM-12033__b549820317266">iostat</strong> command every 3 seconds, and detects that more than 60% of I/Os take more than 20 ms within 300 seconds.</li></ul>
</li></ul>
<p id="ALM-12033__p1147865811515">This alarm is automatically cleared when the preceding conditions have not been met for 15 minutes.</p>
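<p>The two kinds of trigger condition listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's detection code: the function names are hypothetical, the inputs are assumed to be per-period samples in milliseconds taken every 3 seconds, and the defaults follow the HDD thresholds of 7 consecutive periods and 50% (for SSDs, the text gives 10 periods, 60%, and 20 ms).</p>

```python
def has_consecutive_run(svctm_ms, limit_ms=1000.0, run_length=7):
    """True if `run_length` consecutive samples all exceed `limit_ms`."""
    run = 0
    for value in svctm_ms:
        run = run + 1 if value > limit_ms else 0
        if run >= run_length:
            return True
    return False


def slow_io_ratio_exceeded(latencies_ms, limit_ms=150.0, ratio=0.5):
    """True if more than `ratio` of the samples exceed `limit_ms`."""
    if not latencies_ms:
        return False
    slow = sum(1 for value in latencies_ms if value > limit_ms)
    return slow / len(latencies_ms) > ratio
```

<p>With 3-second sampling, a 30-second window holds 10 samples and a 300-second window holds 100, so the caller would pass the most recent 10 or 100 samples to the respective function.</p>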
<div class="note" id="ALM-12033__note146121953385"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12033__p10787194912146">The formula for calculating <strong id="ALM-12033__b289596163110">svctm</strong> is as follows:</p>
<div class="note" id="ALM-12033__note146121953385"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12033__p10787194912146">The <strong id="ALM-12033__b4851941125114">svctm</strong> value can be obtained as follows:</p>
<ul id="ALM-12033__ul12122227541"><li id="ALM-12033__li1775013885414">MRS 3.1.0:<p id="ALM-12033__p6761124418546"><a name="ALM-12033__li1775013885414"></a><a name="li1775013885414"></a>Run the <strong id="ALM-12033__b6647834165418">iostat -x -t</strong> command in the OS.</p>
<p id="ALM-12033__p29371953145511"><span><img id="ALM-12033__image1950415575516" src="en-us_image_0000001583087321.png"></span></p>
</li><li id="ALM-12033__li023264515117">Versions later than MRS 3.1.0:</li></ul>
<p id="ALM-12033__p332417335118">svctm = (tot_ticks_new - tot_ticks_old)/(rd_ios_new + wr_ios_new - rd_ios_old - wr_ios_old)</p>
<p id="ALM-12033__p4167121643616">If <strong id="ALM-12033__b15597134712416">rd_ios_new + wr_ios_new - rd_ios_old - wr_ios_old</strong> is <strong id="ALM-12033__b8379165513414">0</strong>, then <strong id="ALM-12033__b7311200253">svctm</strong> is <strong id="ALM-12033__b245516612518">0</strong>.</p>
<p id="ALM-12033__p1268752201517">The parameters can be obtained as follows:</p>
<p id="ALM-12033__p5648122416463">The system runs the <strong id="ALM-12033__b3375154216449">cat /proc/diskstats</strong> command every 3 seconds to collect data. For example:</p>
<p id="ALM-12033__p1657515122539"><span><img id="ALM-12033__image1675110291273" src="en-us_image_0000001410107141.png"></span></p>
<p id="ALM-12033__p1657515122539"><span><img id="ALM-12033__image1675110291273" src="en-us_image_0000001582807613.png"></span></p>
<p id="ALM-12033__p146243408539">In these two command outputs:</p>
<p id="ALM-12033__p1264621195310">In the data collected for the first time, the number in the fourth column is the <strong id="ALM-12033__b1798092372110">rd_ios_old</strong> value, the number in the eighth column is the <strong id="ALM-12033__b13685518220">wr_ios_old</strong> value, and the number in the thirteenth column is the <strong id="ALM-12033__b1637011119229">tot_ticks_old</strong> value.</p>
<p id="ALM-12033__p415119825410">In the data collected for the second time, the number in the fourth column is the <strong id="ALM-12033__b14962368224">rd_ios_new</strong> value, the number in the eighth column is the <strong id="ALM-12033__b13102936112218">wr_ios_new</strong> value, and the number in the thirteenth column is the <strong id="ALM-12033__b410817369224">tot_ticks_new</strong> value.</p>
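<p>The <strong>svctm</strong> formula above can be worked through with a short sketch. The helper names <strong>parse_diskstats_line</strong> and <strong>svctm</strong> are hypothetical, and the sample lines in the usage note are made-up values, not real measurements. The sketch assumes each input is one whitespace-separated line of <strong>/proc/diskstats</strong> for the same disk, with the fourth, eighth, and thirteenth columns used as described above.</p>

```python
def parse_diskstats_line(line):
    """Extract rd_ios, wr_ios and tot_ticks from one /proc/diskstats line."""
    fields = line.split()
    return {
        "rd_ios": int(fields[3]),      # 4th column: reads completed
        "wr_ios": int(fields[7]),      # 8th column: writes completed
        "tot_ticks": int(fields[12]),  # 13th column: time spent doing I/O (ms)
    }


def svctm(old_line, new_line):
    """Average service time in ms per I/O between two samples of one disk."""
    old = parse_diskstats_line(old_line)
    new = parse_diskstats_line(new_line)
    ios = new["rd_ios"] + new["wr_ios"] - old["rd_ios"] - old["wr_ios"]
    if ios == 0:
        return 0.0  # no I/O completed in the interval, so svctm is 0
    return (new["tot_ticks"] - old["tot_ticks"]) / ios
```

<p>For example, if the first sample is <strong>8 0 sda 100 0 0 0 50 0 0 0 0 1000 0</strong> and the second is <strong>8 0 sda 110 0 0 0 60 0 0 0 0 1400 0</strong>, then 20 I/Os completed while <strong>tot_ticks</strong> grew by 400 ms, so <strong>svctm</strong> is 400/20 = 20 ms.</p>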
@ -29,7 +32,7 @@
</thead>
<tbody><tr id="ALM-12033__row65888257"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12033__p35348568">12033</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12033__p44661780">Major</p>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12033__p44661780">Minor</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12033__p60834461">Yes</p>
</td>
@ -81,7 +84,7 @@
<div class="section" id="ALM-12033__section15140644"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12033__p1856018391458"><strong id="ALM-12033__b3282392191458">Check the disk status.</strong></p>
<ol id="ALM-12033__ol4468641992138"><li id="ALM-12033__li4149191591458"><span>On FusionInsight Manager, choose <strong id="ALM-12033__b54488846544512">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12033__b20884598144512">Alarm</strong> &gt; <strong id="ALM-12033__b27619653444512">Alarms</strong>.</span></li><li id="ALM-12033__li3788291791458"><a name="ALM-12033__li3788291791458"></a><a name="li3788291791458"></a><span>View the detailed information about the alarm. Check the values of <strong id="ALM-12033__b1216047996">HostName</strong> and <strong id="ALM-12033__b43072495916">DiskName</strong> in the location information to obtain the information about the faulty disk for which the alarm is generated.</span></li><li id="ALM-12033__li540193791458"><span>Check whether the node for which the alarm is generated is in a virtualization environment. </span><p><ul id="ALM-12033__ul4861744191458"><li id="ALM-12033__li3490378991458">If yes, go to <a href="#ALM-12033__li2831628891458">4</a>.</li><li id="ALM-12033__li863462891458">If no, go to <a href="#ALM-12033__li2583597491458">7</a>.</li></ul>
</p></li><li id="ALM-12033__li2831628891458"><a name="ALM-12033__li2831628891458"></a><a name="li2831628891458"></a><span>Check whether the storage performance provided by the virtualization environment meets the hardware requirements. Then, go to <a href="#ALM-12033__li1205527419227">5</a>.</span></li><li id="ALM-12033__li1205527419227"><a name="ALM-12033__li1205527419227"></a><a name="li1205527419227"></a><span>Log in to the alarm node as user <strong id="ALM-12033__b19653192618269">root</strong>, run the <strong id="ALM-12033__b13449155511259">df -h</strong> command, and check whether the command output contains the value of the <strong id="ALM-12033__b1074473672410">DiskName</strong> field. <span id="ALM-12033__text23715444267"></span></span><p><ul id="ALM-12033__ul12100362193111"><li id="ALM-12033__li56917037201355">If yes, go to <a href="#ALM-12033__li2583597491458">7</a>.</li><li id="ALM-12033__li8577348201455">If no, go to <a href="#ALM-12033__li2325719119312">6</a>.</li></ul>
</p></li><li id="ALM-12033__li2325719119312"><a name="ALM-12033__li2325719119312"></a><a name="li2325719119312"></a><span>Run the <strong id="ALM-12033__b1673412214263">lsblk</strong> command to check whether the mapping between the value of <strong id="ALM-12033__b1388164762511">DiskName</strong> and the disk has been created.</span><p><div class="p" id="ALM-12033__p55380970201120"><span><img id="ALM-12033__image94412418324" src="en-us_image_0263895818.jpg"></span><ul id="ALM-12033__ul245583919286"><li id="ALM-12033__li40945773201617">If yes, go to <a href="#ALM-12033__li2583597491458">7</a>. .</li><li id="ALM-12033__li4547636219286">If no, go to <a href="#ALM-12033__li4518231891458">22</a>.</li></ul>
</p></li><li id="ALM-12033__li2325719119312"><a name="ALM-12033__li2325719119312"></a><a name="li2325719119312"></a><span>Run the <strong id="ALM-12033__b1673412214263">lsblk</strong> command to check whether the mapping between the value of <strong id="ALM-12033__b1388164762511">DiskName</strong> and the disk has been created.</span><p><div class="p" id="ALM-12033__p55380970201120"><span><img id="ALM-12033__image94412418324" src="en-us_image_0000001583127305.jpg"></span><ul id="ALM-12033__ul245583919286"><li id="ALM-12033__li40945773201617">If yes, go to <a href="#ALM-12033__li2583597491458">7</a>.</li><li id="ALM-12033__li4547636219286">If no, go to <a href="#ALM-12033__li4518231891458">22</a>.</li></ul>
</div>
</p></li><li id="ALM-12033__li2583597491458"><a name="ALM-12033__li2583597491458"></a><a name="li2583597491458"></a><span>Log in to the alarm node as user <strong id="ALM-12033__b3119718091458">root</strong>, run the <strong id="ALM-12033__b1233916891458">lsscsi | grep "/dev/sd[x]"</strong> command to view the disk information, and check whether RAID has been set up. <span id="ALM-12033__text1899265513266"></span></span><p><div class="note" id="ALM-12033__note4394365191458"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12033__p5994854291458">In the command, <strong id="ALM-12033__b470511334308">/dev/sd[x]</strong> indicates the disk name obtained in <a href="#ALM-12033__li3788291791458">2</a>.</p>
</div></div>
@ -100,7 +103,7 @@
</p></li><li id="ALM-12033__li1145378391458"><a name="ALM-12033__li1145378391458"></a><a name="li1145378391458"></a><span>Run the <strong id="ALM-12033__b3597518991458">smartctl -l error -H /dev/sd[x]</strong> command to check the Glist of the disk and determine whether the disk is normal.</span><p><p id="ALM-12033__p5534124691458">Example:</p>
<p id="ALM-12033__p2830917391458"><strong id="ALM-12033__b5345597091458">smartctl -l error -H /dev/sda</strong></p>
<p id="ALM-12033__p1134168691458">Check the <strong id="ALM-12033__b1453654720449">Command/Feature_name</strong> column in the command output. If <strong id="ALM-12033__b23561756164415">READ SECTOR(S)</strong> or <strong id="ALM-12033__b8635195920442">WRITE SECTOR(S)</strong> is displayed, the disk has bad sectors. If other errors occur, the disk circuit board is faulty. Both errors indicate that the disk is abnormal and needs to be replaced.</p>
<p id="ALM-12033__p3496631491458">If "No Errors Logged" is displayed, no error log exists. You can perform step 9 to trigger the disk SMART self-check. </p>
<p id="ALM-12033__p3496631491458">If "No Errors Logged" is displayed, no error log exists. You can trigger the disk SMART self-check.</p>
<ul id="ALM-12033__ul4626137591458"><li id="ALM-12033__li1369919991458">If yes, go to <a href="#ALM-12033__li2167780691458">11</a>.</li><li id="ALM-12033__li3589332091458">If no, go to <a href="#ALM-12033__li6235920691458">18</a>.</li></ul>
</p></li><li id="ALM-12033__li2167780691458"><a name="ALM-12033__li2167780691458"></a><a name="li2167780691458"></a><span>Run the <strong id="ALM-12033__b6088252791458">smartctl -t long /dev/sd[x]</strong> command to trigger the disk SMART self-check. After the command is executed, the time when the self-check is to be completed is displayed. After the self-check is completed, repeat <a href="#ALM-12033__li3483730991458">9</a> and <a href="#ALM-12033__li1145378391458">10</a> to check whether the disk is working properly. </span><p><p id="ALM-12033__p2440318291458">Example:</p>
<p id="ALM-12033__p1830205491458"><strong id="ALM-12033__b3050076291458">smartctl -t long /dev/sda</strong></p>
@ -120,8 +123,8 @@
<ul id="ALM-12033__ul4302853391458"><li id="ALM-12033__li5171248091458">If yes, go to <a href="#ALM-12033__li5027541391458">14</a>.</li><li id="ALM-12033__li2796133291458">If no, go to <a href="#ALM-12033__li6235920691458">18</a>.</li></ul>
</p></li><li id="ALM-12033__li5027541391458"><a name="ALM-12033__li5027541391458"></a><a name="li5027541391458"></a><span>Run the <strong id="ALM-12033__b4982553391458">smartctl -d [sat|scsi]+megaraid,[DID] -l error -H /dev/sd[x]</strong> command to check the Glist of the disk and determine whether the hard disk is working properly.</span><p><p id="ALM-12033__p4577661891458">Example:</p>
<p id="ALM-12033__p933637991458"><strong id="ALM-12033__b1691855591458">smartctl -d sat+megaraid,2 -l error -H /dev/sda</strong></p>
<p id="ALM-12033__p1804927491458">Check the <strong id="ALM-12033__b112875461773">Command/Featrue_name</strong> column in the command output. If <strong id="ALM-12033__b228717461573">READ SECTOR(S)</strong> or <strong id="ALM-12033__b228734613715">WRITE SECTOR(S)</strong> is displayed, the disk has bad sectors. If other errors occur, the disk circuit board is faulty. Both errors indicate that the disk is abnormal and needs to be replaced.</p>
<p id="ALM-12033__p2822574691458">If "No Errors Logged" is displayed, no error log exists. You can perform step 9 to trigger the disk SMART self-check. </p>
<p id="ALM-12033__p1804927491458">Check the <strong id="ALM-12033__b112875461773">Command/Feature_name</strong> column in the command output. If <strong id="ALM-12033__b228717461573">READ SECTOR(S)</strong> or <strong id="ALM-12033__b228734613715">WRITE SECTOR(S)</strong> is displayed, the disk has bad sectors. If other errors occur, the disk circuit board is faulty. Both errors indicate that the disk is abnormal and needs to be replaced.</p>
<p id="ALM-12033__p2822574691458">If "No Errors Logged" is displayed, no error log exists. You can trigger the disk SMART self-check.</p>
<ul id="ALM-12033__ul5270512291458"><li id="ALM-12033__li458405291458">If yes, go to <a href="#ALM-12033__li1119862391458">15</a>.</li><li id="ALM-12033__li3576394791458">If no, go to <a href="#ALM-12033__li6235920691458">18</a>.</li></ul>
</p></li><li id="ALM-12033__li1119862391458"><a name="ALM-12033__li1119862391458"></a><a name="li1119862391458"></a><span>Run the <strong id="ALM-12033__b3367874791458">smartctl -d [sat|scsi]+megaraid,[DID] -t long /dev/sd[x]</strong> command to trigger the disk SMART self-check. After the command is executed, the time when the self-check is to be completed is displayed. After the self-check is completed, repeat <a href="#ALM-12033__li4568369291458">13</a> and <a href="#ALM-12033__li5027541391458">14</a> to check whether the disk is working properly. </span><p><p id="ALM-12033__p5707158091458">Example:</p>
<p id="ALM-12033__p4388217491458"><strong id="ALM-12033__b5939525391458">smartctl -d sat+megaraid,2 -t long /dev/sda</strong></p>
@ -134,7 +137,7 @@
<ol start="18" id="ALM-12033__ol5110722692159"><li id="ALM-12033__li6235920691458"><a name="ALM-12033__li6235920691458"></a><a name="li6235920691458"></a><span>On FusionInsight Manager, choose <strong id="ALM-12033__b10218142171210">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12033__b19228521101217">Alarm</strong> &gt; <strong id="ALM-12033__b1922813217128">Alarms</strong>.</span></li><li id="ALM-12033__li2436194691458"><span>View the detailed information about the alarm. Check the values of <strong id="ALM-12033__b8430753201817">HostName</strong> and <strong id="ALM-12033__b19444145316187">DiskName</strong> in the location information to obtain the information about the faulty disk for which the alarm is reported.</span></li><li id="ALM-12033__li1793092591458"><span>Replace the disk.</span></li><li id="ALM-12033__li2716060091458"><span>Check whether the alarm is cleared.</span><p><ul id="ALM-12033__ul4311881291458"><li id="ALM-12033__li5252499591458">If yes, no further action is required.</li><li id="ALM-12033__li296290891458">If no, go to <a href="#ALM-12033__li4518231891458">22</a>.</li></ul>
</p></li></ol>
<p id="ALM-12033__p98841749221"><strong id="ALM-12033__b218487059221">Collect the fault information.</strong></p>
<ol start="22" id="ALM-12033__ol24355139224"><li id="ALM-12033__li4518231891458"><a name="ALM-12033__li4518231891458"></a><a name="li4518231891458"></a><span>On FusionInsight Manager, choose <strong id="ALM-12033__b1657416563128">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12033__b657418563122">Log</strong> &gt; <strong id="ALM-12033__b657435661218">Download</strong>.</span></li><li id="ALM-12033__li398767891458"><span>Select <strong id="ALM-12033__b01051101315">OMS</strong> for <strong id="ALM-12033__b192116161319">Service</strong> and click <strong id="ALM-12033__b16219111132">OK</strong>.</span></li><li id="ALM-12033__li3588910391458"><span>Click <span><img id="ALM-12033__image104601319175315" src="en-us_image_0263895453.png"></span> in the upper right corner, and set <strong id="ALM-12033__b209140148644512">Start Date</strong> and <strong id="ALM-12033__b31843827044512">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12033__b148916401344512">Download</strong>.</span></li><li id="ALM-12033__li5456647491458"><span>Contact <span id="ALM-12033__text10816191314136">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="22" id="ALM-12033__ol24355139224"><li id="ALM-12033__li4518231891458"><a name="ALM-12033__li4518231891458"></a><a name="li4518231891458"></a><span>On FusionInsight Manager, choose <strong id="ALM-12033__b1657416563128">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12033__b657418563122">Log</strong> &gt; <strong id="ALM-12033__b657435661218">Download</strong>.</span></li><li id="ALM-12033__li398767891458"><span>Select <strong id="ALM-12033__b01051101315">OMS</strong> for <strong id="ALM-12033__b192116161319">Service</strong> and click <strong id="ALM-12033__b16219111132">OK</strong>.</span></li><li id="ALM-12033__li3588910391458"><span>Click <span><img id="ALM-12033__image104601319175315" src="en-us_image_0000001532927338.png"></span> in the upper right corner, and set <strong id="ALM-12033__b209140148644512">Start Date</strong> and <strong id="ALM-12033__b31843827044512">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12033__b148916401344512">Download</strong>.</span></li><li id="ALM-12033__li5456647491458"><span>Contact <span id="ALM-12033__text10816191314136">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12033__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12033__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>

View File

@ -64,7 +64,7 @@
<div class="section" id="ALM-12034__s263b5f2875944e7b9df856ae80d2a053"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12034__en-us_topic_0070543608_p16651572">The alarm cause depends on the task details. Handle the alarm according to the logs and alarm details.</p>
</div>
<div class="section" id="ALM-12034__s1ca44cb0f88942d591bb071c656d4ccc"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12034__en-us_topic_0070543608_p6600119"><strong id="ALM-12034__b11327931485">Check whether the disk space is sufficient.</strong></p>
<ol id="ALM-12034__ol947516194522"><li id="ALM-12034__li739591494522"><span>In the FusionInsight Manager portal, click <strong id="ALM-12034__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12034__b27872374104950"> &gt; Alarms</strong></strong>.</span></li><li id="ALM-12034__li488781094522"><span>In the alarm list, click <span><img id="ALM-12034__image168221113135319" src="en-us_image_0269383843.png"></span> in the row where the alarm is located and obtain <strong id="ALM-12034__b6656323294522">TaskName</strong> from <strong id="ALM-12034__b9723191310467">Location</strong>.</span></li><li id="ALM-12034__li644408694522"><span>Choose <strong id="ALM-12034__b4399029494522">O&amp;M</strong> &gt; <strong id="ALM-12034__b6036833394522">Backup and Restoration &gt; Backup Management</strong>.</span></li><li id="ALM-12034__li5220897594522"><span>Search for the backup task based on <strong id="ALM-12034__b0347912913">TaskName</strong> and click <strong id="ALM-12034__b20551318102819">More</strong><strong id="ALM-12034__b185711882811"> </strong>in the <strong id="ALM-12034__b43471515919">Operation</strong> column. In the displayed dialog box, click <strong id="ALM-12034__b63471511997">View History</strong> and view the task details.</span></li><li id="ALM-12034__li20896327494"><span>In the displayed dialog box and click <span><img id="ALM-12034__image5943924184912" src="en-us_image_0000001127057881.png"></span> to check whether the following message is displayed: Failed to backup xx due to insufficient disk space, move the data in the xx directory to other directories.</span><p><ul class="subitemlist" id="ALM-12034__ul450817218102"><li id="ALM-12034__li75085211107">If yes, go to <a href="#ALM-12034__li8265923133114">6</a>.</li><li id="ALM-12034__li2510182101010">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
<ol id="ALM-12034__ol947516194522"><li id="ALM-12034__li739591494522"><span>In the FusionInsight Manager portal, click <strong id="ALM-12034__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12034__b27872374104950"> &gt; Alarms</strong></strong>.</span></li><li id="ALM-12034__li488781094522"><span>In the alarm list, click <span><img id="ALM-12034__image168221113135319" src="en-us_image_0000001582807645.png"></span> in the row where the alarm is located and obtain <strong id="ALM-12034__b6656323294522">TaskName</strong> from <strong id="ALM-12034__b9723191310467">Location</strong>.</span></li><li id="ALM-12034__li644408694522"><span>Choose <strong id="ALM-12034__b4399029494522">O&amp;M</strong> &gt; <strong id="ALM-12034__b6036833394522">Backup and Restoration &gt; Backup Management</strong>.</span></li><li id="ALM-12034__li5220897594522"><span>Search for the backup task based on <strong id="ALM-12034__b0347912913">TaskName</strong> and click <strong id="ALM-12034__b20551318102819">More</strong><strong id="ALM-12034__b185711882811"> </strong>in the <strong id="ALM-12034__b43471515919">Operation</strong> column. In the displayed dialog box, click <strong id="ALM-12034__b63471511997">View History</strong> and view the task details.</span></li><li id="ALM-12034__li20896327494"><span>In the displayed dialog box, click <span><img id="ALM-12034__image5943924184912" src="en-us_image_0000001532927370.png"></span> to check whether the following message is displayed: Failed to backup xx due to insufficient disk space, move the data in the xx directory to other directories.</span><p><ul class="subitemlist" id="ALM-12034__ul450817218102"><li id="ALM-12034__li75085211107">If yes, go to <a href="#ALM-12034__li8265923133114">6</a>.</li><li id="ALM-12034__li2510182101010">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
</p></li><li id="ALM-12034__li8265923133114"><a name="ALM-12034__li8265923133114"></a><a name="li8265923133114"></a><span>Choose <strong id="ALM-12034__b10266192333118">Backup Path</strong> &gt; <strong id="ALM-12034__b1226611237319">View </strong>and obtain the <strong id="ALM-12034__b1626622314312">Backup Path</strong>.</span></li><li id="ALM-12034__li11760165519910"><span>Log in to the node as user <strong id="ALM-12034__b142279511119">root</strong> and run the following command to check the node mounting details:</span><p><p id="ALM-12034__p177811011105319"><span id="ALM-12034__text16214101716530"></span></p>
<p id="ALM-12034__p1730510253112"><strong id="ALM-12034__b13233131210719">df -h</strong></p>
</p></li><li id="ALM-12034__li75106309133"><span>Check whether the available space of the node to which the backup path is mounted is less than 20 GB.</span><p><ul class="subitemlist" id="ALM-12034__ul93921250101319"><li id="ALM-12034__li16393950171317">If yes, go to <a href="#ALM-12034__li181154133220">9</a>.</li><li id="ALM-12034__li1439665061320">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
@ -73,7 +73,7 @@
</p></li><li id="ALM-12034__li5916521794522"><a name="ALM-12034__li5916521794522"></a><a name="li5916521794522"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12034__ul4385661594522"><li id="ALM-12034__li5372883994522">If yes, no further action is required.</li><li id="ALM-12034__li5706874094522">If no, go to <a href="#ALM-12034__li115006411351">13</a>.</li></ul>
</p></li></ol>
<p class="subitemlist" id="ALM-12034__p4445114113355"><strong id="ALM-12034__b1570250993141">Collect fault information.</strong></p>
<ol start="13" id="ALM-12034__ol135018418359"><li id="ALM-12034__li115006411351"><a name="ALM-12034__li115006411351"></a><a name="li115006411351"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12034__b8500174113511">O&amp;M</strong> &gt; <strong id="ALM-12034__b4500941173512">Log &gt; Download</strong>.</span></li><li id="ALM-12034__li13500174119354"><span>Select <strong id="ALM-12034__b450034113518">Controller</strong> from the <strong id="ALM-12034__b150044112358">Service</strong> and click <strong id="ALM-12034__b3991118545">OK</strong>.</span></li><li id="ALM-12034__li2501144119351"><span>Click <span><img id="ALM-12034__image13500184111355" src="en-us_image_0269383844.png"></span> in the upper right corner, and set <strong id="ALM-12034__b450010417354">Start Date</strong> and <strong id="ALM-12034__b1250124110357">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12034__b1950164118356">Download</strong>.</span></li><li id="ALM-12034__li495644512588"><span>Contact the <span id="ALM-12034__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="13" id="ALM-12034__ol135018418359"><li id="ALM-12034__li115006411351"><a name="ALM-12034__li115006411351"></a><a name="li115006411351"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12034__b8500174113511">O&amp;M</strong> &gt; <strong id="ALM-12034__b4500941173512">Log &gt; Download</strong>.</span></li><li id="ALM-12034__li13500174119354"><span>Select <strong id="ALM-12034__b450034113518">Controller</strong> from the <strong id="ALM-12034__b150044112358">Service</strong> and click <strong id="ALM-12034__b3991118545">OK</strong>.</span></li><li id="ALM-12034__li2501144119351"><span>Click <span><img id="ALM-12034__image13500184111355" src="en-us_image_0000001532448214.png"></span> in the upper right corner, and set <strong id="ALM-12034__b450010417354">Start Date</strong> and <strong id="ALM-12034__b1250124110357">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12034__b1950164118356">Download</strong>.</span></li><li id="ALM-12034__li495644512588"><span>Contact the <span id="ALM-12034__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12034__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12034__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -65,11 +65,11 @@
</div>
<div class="section" id="ALM-12035__s9bf9cfe815d64aefa40fafcd22fe46e5"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12035__en-us_topic_0070543609_p46929400"><strong id="ALM-12035__b3404416894635">Collect fault information.</strong></p>
<ol id="ALM-12035__ol8912728101615"><li id="ALM-12035__li1191262861614"><span>In the FusionInsight Manager, choose <strong id="ALM-12035__b14623152119812">Cluster &gt; </strong><em id="ALM-12035__i56519211481">Name of the desired cluster</em><strong id="ALM-12035__b1162417211489"> &gt; Services</strong>, and check whether the running status of the component meets the requirements. (The OMS and DBService must be in the normal state, and other components must be stopped.)</span><p><ul id="ALM-12035__ul18912142801616"><li id="ALM-12035__li39121528141617">If yes, go to <a href="#ALM-12035__li18912172820165">9</a>.</li><li id="ALM-12035__li69121828141614">If no, go to <a href="#ALM-12035__li16912228111613">2</a>.</li></ul>
</p></li><li id="ALM-12035__li16912228111613"><a name="ALM-12035__li16912228111613"></a><a name="li16912228111613"></a><span>Restore the component status as required and start the recovery task again.</span></li><li id="ALM-12035__li49121828171617"><span>Log in to the FusionInsight Manager portal and click <strong id="ALM-12035__b0912162814167">O&amp;M &gt; Alarm<strong id="ALM-12035__b19912728151615"> &gt; Alarms</strong></strong>.</span></li><li id="ALM-12035__li591222818167"><span>In the alarm list, click <span><img id="ALM-12035__image159128280169" src="en-us_image_0269383845.png"></span> in the row where the alarm is located to obtain <strong id="ALM-12035__b59121128141611">TaskName</strong> from <strong id="ALM-12035__b2912162815161">Location</strong>.</span></li><li id="ALM-12035__li18912152891616"><span>Choose <strong id="ALM-12035__b12912162812167">O&amp;M</strong> &gt; <strong id="ALM-12035__b79123288163"><strong id="ALM-12035__b2912132812163">Backup and Restoration &gt; </strong>Restoration Management</strong>.</span></li><li id="ALM-12035__li1912142813165"><span>Find the restoration task by <strong id="ALM-12035__b15912528101611">Task Name</strong> and view the task details.</span></li><li id="ALM-12035__li1991218288166"><span>Perform the recovery task again and check whether the recovery task execution is successful.</span><p><ul class="subitemlist" id="ALM-12035__ul491292812164"><li id="ALM-12035__li10912122819168">If yes, go to <a href="#ALM-12035__li691272812168">8</a>.</li><li id="ALM-12035__li10912192811612">If no, go to <a href="#ALM-12035__li18912172820165">9</a>.</li></ul>
</p></li><li id="ALM-12035__li16912228111613"><a name="ALM-12035__li16912228111613"></a><a name="li16912228111613"></a><span>Restore the component status as required and start the recovery task again.</span></li><li id="ALM-12035__li49121828171617"><span>Log in to the FusionInsight Manager portal and click <strong id="ALM-12035__b0912162814167">O&amp;M &gt; Alarm<strong id="ALM-12035__b19912728151615"> &gt; Alarms</strong></strong>.</span></li><li id="ALM-12035__li591222818167"><span>In the alarm list, click <span><img id="ALM-12035__image159128280169" src="en-us_image_0000001532448366.png"></span> in the row where the alarm is located to obtain <strong id="ALM-12035__b59121128141611">TaskName</strong> from <strong id="ALM-12035__b2912162815161">Location</strong>.</span></li><li id="ALM-12035__li18912152891616"><span>Choose <strong id="ALM-12035__b12912162812167">O&amp;M</strong> &gt; <strong id="ALM-12035__b79123288163"><strong id="ALM-12035__b2912132812163">Backup and Restoration &gt; </strong>Restoration Management</strong>.</span></li><li id="ALM-12035__li1912142813165"><span>Find the restoration task by <strong id="ALM-12035__b15912528101611">Task Name</strong> and view the task details.</span></li><li id="ALM-12035__li1991218288166"><span>Perform the recovery task again and check whether the recovery task execution is successful.</span><p><ul class="subitemlist" id="ALM-12035__ul491292812164"><li id="ALM-12035__li10912122819168">If yes, go to <a href="#ALM-12035__li691272812168">8</a>.</li><li id="ALM-12035__li10912192811612">If no, go to <a href="#ALM-12035__li18912172820165">9</a>.</li></ul>
</p></li><li id="ALM-12035__li691272812168"><a name="ALM-12035__li691272812168"></a><a name="li691272812168"></a><span>After 2 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12035__ul191292851612"><li id="ALM-12035__li18912628181613">If yes, no further action is required.</li><li id="ALM-12035__li1991218285168">If no, go to <a href="#ALM-12035__li18912172820165">9</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12035__en-us_topic_0070543610_p36865955"><strong id="ALM-12035__b5671597695034">Collect fault information.</strong></p>
<ol start="9" id="ALM-12035__ol17912928131615"><li id="ALM-12035__li18912172820165"><a name="ALM-12035__li18912172820165"></a><a name="li18912172820165"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12035__b11912192811618">O&amp;M</strong> &gt; <strong id="ALM-12035__b4912112871618">Log &gt; Download</strong>.</span></li><li id="ALM-12035__li29127284164"><span>Select <strong id="ALM-12035__b1491242841616">Controller</strong> from the <strong id="ALM-12035__b9912928131617">Service</strong> and click <strong id="ALM-12035__b3991118545">OK</strong>.</span></li><li id="ALM-12035__li16912132810167"><span>Click <span><img id="ALM-12035__image119122281161" src="en-us_image_0269383846.png"></span> in the upper right corner, and set <strong id="ALM-12035__b4912228191616">Start Date</strong> and <strong id="ALM-12035__b19122028151619">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12035__b891272814169">Download</strong>.</span></li><li id="ALM-12035__li495644512588"><span>Contact the <span id="ALM-12035__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="9" id="ALM-12035__ol17912928131615"><li id="ALM-12035__li18912172820165"><a name="ALM-12035__li18912172820165"></a><a name="li18912172820165"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12035__b11912192811618">O&amp;M</strong> &gt; <strong id="ALM-12035__b4912112871618">Log &gt; Download</strong>.</span></li><li id="ALM-12035__li29127284164"><span>Select <strong id="ALM-12035__b1491242841616">Controller</strong> from the <strong id="ALM-12035__b9912928131617">Service</strong> and click <strong id="ALM-12035__b3991118545">OK</strong>.</span></li><li id="ALM-12035__li16912132810167"><span>Click <span><img id="ALM-12035__image119122281161" src="en-us_image_0000001583127489.png"></span> in the upper right corner, and set <strong id="ALM-12035__b4912228191616">Start Date</strong> and <strong id="ALM-12035__b19122028151619">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12035__b891272814169">Download</strong>.</span></li><li id="ALM-12035__li495644512588"><span>Contact the <span id="ALM-12035__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12035__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12035__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -0,0 +1,98 @@
<a name="ALM-12037"></a><a name="ALM-12037"></a>
<h1 class="topictitle1">ALM-12037 NTP Server Abnormal</h1>
<div id="body17797340"><div class="section" id="ALM-12037__sa107da5851c34446a7ddc485e2be0e4e"><h4 class="sectiontitle">Description</h4><p id="ALM-12037__en-us_topic_0070543611_p13746654">The system checks the NTP server status every 60 seconds. This alarm is generated when the system detects that the NTP server is abnormal 10 consecutive times.</p>
<p id="ALM-12037__en-us_topic_0070543611_p56611028">This alarm is cleared when the NTP server recovers.</p>
</div>
<div class="section" id="ALM-12037__sa194c00f1135471fa223b098d0eee867"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12037__en-us_topic_0070543611_table22090566" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12037__en-us_topic_0070543611_row9560264"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12037__en-us_topic_0070543611_p36183900">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12037__en-us_topic_0070543611_p45214781">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12037__en-us_topic_0070543611_p38518610">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12037__en-us_topic_0070543611_row32999683"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12037__en-us_topic_0070543611_p55728698">12037</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12037__en-us_topic_0070543611_p17730652">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12037__en-us_topic_0070543611_p26896723">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12037__s9eb0bf01153b4666998c7817b199c8d4"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12037__en-us_topic_0070543611_table31150983" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12037__en-us_topic_0070543611_row47571430"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12037__en-us_topic_0070543611_p28080649">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12037__en-us_topic_0070543611_p59940076">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12037__row1339154925218"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12037__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12037__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12037__en-us_topic_0070543611_row23307959"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12037__en-us_topic_0070543611_p8896531">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12037__en-us_topic_0070543611_p49530407">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12037__en-us_topic_0070543611_row43120486"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12037__en-us_topic_0070543611_p3098498">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12037__en-us_topic_0070543611_p49651819">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12037__en-us_topic_0070543611_row44213192"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12037__en-us_topic_0070543611_p24498766">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12037__en-us_topic_0070543611_p38243029">Specifies the IP address of the NTP server for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12037__s6154f3349ab347ed997d03b086956284"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12037__en-us_topic_0070543611_p10677636">The NTP server configured on the active OMS node is abnormal. In this case, the active OMS node cannot synchronize time with the NTP server and a time offset may be generated in the cluster.</p>
</div>
<div class="section" id="ALM-12037__s0c12362544fb484a832aad2e1306c715"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12037__en-us_topic_0070543611_ul59582212"><li id="ALM-12037__en-us_topic_0070543611_li66477866">The NTP server network is abnormal.</li><li id="ALM-12037__en-us_topic_0070543611_li61429885">The NTP server authentication fails.</li><li id="ALM-12037__en-us_topic_0070543611_li15998054">The NTP server time cannot be obtained.</li><li id="ALM-12037__en-us_topic_0070543611_li9764758">The time obtained from the NTP server is not continuously updated.</li></ul>
</div>
<div class="section" id="ALM-12037__s10907d0f1dcf40acb84507bb13294ade"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12037__en-us_topic_0070543611_p52747895"><strong id="ALM-12037__b154553399520">Check the NTP server network.</strong></p>
<ol id="ALM-12037__ol954788095212"><li id="ALM-12037__li472667049523"><span>On the FusionInsight Manager portal, click <strong id="ALM-12037__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12037__b27872374104950"> &gt; Alarms</strong></strong> and click <span><img id="ALM-12037__image168221113135319" src="en-us_image_0000001532607750.png"></span> in the row where the alarm is located.</span></li><li id="ALM-12037__li477866769523"><span>View the alarm additional information to check whether the NTP server fails to be pinged.</span><p><ul class="subitemlist" id="ALM-12037__ul127661719523"><li id="ALM-12037__li305801229523">If yes, go to <a href="#ALM-12037__li601372919523">3</a>.</li><li id="ALM-12037__li610707879523">If no, go to <a href="#ALM-12037__li392824159523">4</a>.</li></ul>
</p></li><li id="ALM-12037__li601372919523"><a name="ALM-12037__li601372919523"></a><a name="li601372919523"></a><span>Contact the network administrator to check the network configuration and ensure that the network between the NTP server and the active OMS node is normal. Then, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12037__ul628802729523"><li id="ALM-12037__li274269039523">If yes, no further action is required.</li><li id="ALM-12037__li69866969523">If no, go to <a href="#ALM-12037__li392824159523">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12037__p290515429523"><strong id="ALM-12037__b4031175895218">Check whether the NTP server authentication fails.</strong></p>
<ol start="4" id="ALM-12037__ol3089634495234"><li id="ALM-12037__li392824159523"><a name="ALM-12037__li392824159523"></a><a name="li392824159523"></a><span>Log in to the active OMS node as user <strong id="ALM-12037__b43647129523">root</strong>. <span id="ALM-12037__text43649449460"></span><span id="ALM-12037__text14706453185317"></span></span></li><li id="ALM-12037__li24978126120"><a name="ALM-12037__li24978126120"></a><a name="li24978126120"></a><span>Run the following commands to check the status of the resources on the active and standby nodes:</span><p><p id="ALM-12037__p2725124754013"><strong id="ALM-12037__b7837101712463">su - omm</strong></p>
<p id="ALM-12037__p125816103218"><strong id="ALM-12037__b060934913117">sh ${BIGDATA_HOME}/om-server/om/sbin/status-oms.sh</strong></p>
<ul id="ALM-12037__ul1738514151618"><li id="ALM-12037__li738191451610">If "chrony" is displayed in the <strong id="ALM-12037__b843385082412">ResName</strong> column of the command output, go to <a href="#ALM-12037__li179951620123417">6</a>.</li><li id="ALM-12037__li470235981613">If "ntp" is displayed in the <strong id="ALM-12037__b89131117258">ResName</strong> column, go to <a href="#ALM-12037__li650965649523">7</a>.</li></ul>
<div class="note" id="ALM-12037__note29074434286"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12037__p9513171811497">If both "chrony" and "ntp" are displayed in the <strong id="ALM-12037__b143061127132719">ResName</strong> column of the command output, the NTP service mode is being switched. Wait for 10 minutes and perform <a href="#ALM-12037__li24978126120">5</a> again. If both "chrony" and "ntp" still exist in the <strong id="ALM-12037__b12306112710272">ResName</strong> column, contact <span id="ALM-12037__text4614151421417">O&amp;M personnel</span>.</p>
</div></div>
</p></li><li id="ALM-12037__li179951620123417"><a name="ALM-12037__li179951620123417"></a><a name="li179951620123417"></a><span>Run the command <strong id="ALM-12037__b16104134383813">chronyc sources</strong> to check whether the NTP server authentication fails.</span><p><p id="ALM-12037__p1015014320354">If the value of <strong id="ALM-12037__b621325373815">Reach</strong> for chrony is <strong id="ALM-12037__b102132053193816">0</strong>, the connection or authentication fails.</p>
<ul class="subitemlist" id="ALM-12037__ul54119014395"><li id="ALM-12037__li2411802390">If yes, go to <a href="#ALM-12037__li654599509523">12</a>.</li><li id="ALM-12037__li184110113912">If no, go to <a href="#ALM-12037__li282496949523">8</a>.</li></ul>
</p></li><li id="ALM-12037__li650965649523"><a name="ALM-12037__li650965649523"></a><a name="li650965649523"></a><span>Run the command <strong id="ALM-12037__b179974159523">ntpq -np</strong> to check whether the NTP server authentication fails.</span><p><p class="litext" id="ALM-12037__p338873759523">If <strong id="ALM-12037__b277590109523">refid</strong> of the NTP server is <strong id="ALM-12037__b485045069523">.AUTH.</strong>, the authentication fails.</p>
<ul class="subitemlist" id="ALM-12037__ul306298239523"><li id="ALM-12037__li605228449523">If yes, go to <a href="#ALM-12037__li654599509523">12</a>.</li><li id="ALM-12037__li34033139523">If no, go to <a href="#ALM-12037__li282496949523">8</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12037__p72329519523"><strong id="ALM-12037__b6048972995240">Check whether the time can be obtained from the NTP server.</strong></p>
<ol start="8" id="ALM-12037__ol2184198095256"><li id="ALM-12037__li282496949523"><a name="ALM-12037__li282496949523"></a><a name="li282496949523"></a><span>View the alarm additional information to check whether the time cannot be obtained from the NTP server.</span><p><ul class="subitemlist" id="ALM-12037__ul255084769523"><li id="ALM-12037__li383303099523">If yes, go to <a href="#ALM-12037__li549211539523">9</a>.</li><li id="ALM-12037__li177473569523">If no, go to <a href="#ALM-12037__li301460699523">10</a>.</li></ul>
</p></li><li id="ALM-12037__li549211539523"><a name="ALM-12037__li549211539523"></a><a name="li549211539523"></a><span>Contact the provider of the NTP server to rectify the NTP server fault. After the NTP server is normal, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12037__ul586733539523"><li id="ALM-12037__li529206609523">If yes, no further action is required.</li><li id="ALM-12037__li587150449523">If no, go to <a href="#ALM-12037__li301460699523">10</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12037__p582981339523"><strong id="ALM-12037__b366631259537">Check whether the time obtained from the NTP server is not continuously updated.</strong></p>
<ol start="10" id="ALM-12037__ol5240653195320"><li id="ALM-12037__li301460699523"><a name="ALM-12037__li301460699523"></a><a name="li301460699523"></a><span>View the alarm additional information to check whether the time obtained from the NTP server is not continuously updated.</span><p><ul class="subitemlist" id="ALM-12037__ul33495639523"><li id="ALM-12037__li194284229523">If yes, go to <a href="#ALM-12037__li251290419523">11</a>.</li><li id="ALM-12037__li301983359523">If no, go to <a href="#ALM-12037__li654599509523">12</a>.</li></ul>
</p></li><li id="ALM-12037__li251290419523"><a name="ALM-12037__li251290419523"></a><a name="li251290419523"></a><span>Contact the provider of the NTP server to rectify the NTP server fault. After the NTP server is normal, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12037__ul185373339523"><li id="ALM-12037__li28791669523">If yes, no further action is required.</li><li id="ALM-12037__li318858659523">If no, go to <a href="#ALM-12037__li654599509523">12</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12037__p326182779523"><strong id="ALM-12037__b4063632295325">Collect fault information.</strong></p>
<ol start="12" id="ALM-12037__ol2743408695328"><li id="ALM-12037__li654599509523"><a name="ALM-12037__li654599509523"></a><a name="li654599509523"></a><span>On the FusionInsight Manager, choose <strong id="ALM-12037__b248347779523">O&amp;M</strong> &gt; <strong id="ALM-12037__b221864089523">Log &gt; Download</strong>.</span></li><li id="ALM-12037__li82842029523"><span>Select <strong id="ALM-12037__b522686449523">NodeAgent</strong> and <strong id="ALM-12037__b6557569523">OmmServer</strong> from the <strong id="ALM-12037__b59018059523">Service</strong> and click <strong id="ALM-12037__b3991118545">OK</strong>.</span></li><li id="ALM-12037__li1145664103113"><span>Click <span><img id="ALM-12037__image1945644173117" src="en-us_image_0000001582927645.png"></span> in the upper right corner, and set <strong id="ALM-12037__b6456941173117">Start Date</strong> and <strong id="ALM-12037__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12037__b13456164113319">Download</strong>.</span></li><li id="ALM-12037__li495644512588"><span>Contact the <span id="ALM-12037__text106761111141812">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12037__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12037__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12037__s2019f7f9464943979dfebc0db92b7520"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12037__en-us_topic_0070543611_p33875336">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>

@ -76,7 +76,7 @@
</p></li><li id="ALM-12038__li53095195103617"><a name="ALM-12038__li53095195103617"></a><a name="li53095195103617"></a><span>Delete unnecessary files or go to the monitoring indicator dumping configuration page to change the save path. Then, check whether the save path has sufficient disk space.</span><p><ul class="subitemlist" id="ALM-12038__ul35452684103617"><li id="ALM-12038__li50587406103617">If yes, no further action is required.</li><li id="ALM-12038__li3939187103617">If no, go to <a href="#ALM-12038__li51692141103617">11</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12038__p50638708103617"><strong id="ALM-12038__b6018239103721">Collect fault information.</strong></p>
<ol start="11" id="ALM-12038__ol22765086103724"><li id="ALM-12038__li51692141103617"><a name="ALM-12038__li51692141103617"></a><a name="li51692141103617"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12038__b2056231918912">O&amp;M</strong> &gt; <strong id="ALM-12038__b5743571103617">Log &gt; Download</strong>.</span></li><li id="ALM-12038__li51051832103617"><span>Select <strong id="ALM-12038__b1352831932712">OMS</strong> from the <strong id="ALM-12038__b26313908103617">Service</strong> and click <strong id="ALM-12038__b3991118545">OK</strong>.</span></li><li id="ALM-12038__li1145664103113"><span>Click <span><img id="ALM-12038__image1945644173117" src="en-us_image_0269383850.png"></span> in the upper right corner, and set <strong id="ALM-12038__b6456941173117">Start Date</strong> and <strong id="ALM-12038__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12038__b13456164113319">Download</strong>.</span></li><li id="ALM-12038__li495644512588"><span>Contact the <span id="ALM-12038__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="11" id="ALM-12038__ol22765086103724"><li id="ALM-12038__li51692141103617"><a name="ALM-12038__li51692141103617"></a><a name="li51692141103617"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12038__b2056231918912">O&amp;M</strong> &gt; <strong id="ALM-12038__b5743571103617">Log &gt; Download</strong>.</span></li><li id="ALM-12038__li51051832103617"><span>Select <strong id="ALM-12038__b1352831932712">OMS</strong> from the <strong id="ALM-12038__b26313908103617">Service</strong> and click <strong id="ALM-12038__b3991118545">OK</strong>.</span></li><li id="ALM-12038__li1145664103113"><span>Click <span><img id="ALM-12038__image1945644173117" src="en-us_image_0000001532927442.png"></span> in the upper right corner, and set <strong id="ALM-12038__b6456941173117">Start Date</strong> and <strong id="ALM-12038__b11456154113318">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12038__b13456164113319">Download</strong>.</span></li><li id="ALM-12038__li495644512588"><span>Contact the <span id="ALM-12038__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12038__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12038__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -75,7 +75,7 @@
<div class="section" id="ALM-12039__s2aec9ff7cd804900af8457f84b365f70"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12039__en-us_topic_0070543613_ul18926225"><li id="ALM-12039__en-us_topic_0070543613_li36118300">The network between the active and standby nodes is unstable.</li><li id="ALM-12039__en-us_topic_0070543613_li56629250">The standby OMS Database is abnormal.</li><li id="ALM-12039__en-us_topic_0070543613_li39901202">The standby node disk space is full.</li></ul>
</div>
<div class="section" id="ALM-12039__s3db2913b091445d59edc8bff2fa84546"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12039__en-us_topic_0070543613_p10771936"><strong id="ALM-12039__b4973408104948">Check whether the network between the active and standby nodes is normal.</strong></p>
<ol id="ALM-12039__ol3742838310508"><li id="ALM-12039__li49524776104950"><span>Log in to FusionInsight Manager, click <strong id="ALM-12039__b27872374104950">O&amp;M &gt; Alarm<strong id="ALM-12039__b2084142661316"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12039__image168221113135319" src="en-us_image_0269383851.png"></span> in the row where the alarm is located, and query the standby OMS Database IP address.</span></li><li id="ALM-12039__li52083901104950"><span>Log in to the active OMS Database node as user <strong id="ALM-12039__b43069802104950">root</strong>. <span id="ALM-12039__text43649449460"></span></span></li><li id="ALM-12039__li5718024104950"><span>Run the <strong id="ALM-12039__b66101931104950">ping </strong><em id="ALM-12039__i58046467104950">Standby OMS Database heartbeat IP address</em> command to check whether the standby OMS Database node is reachable.</span><p><ul class="subitemlist" id="ALM-12039__ul635336104950"><li id="ALM-12039__li4143393104950">If yes, go to <a href="#ALM-12039__li19362442104950">6</a>.</li><li id="ALM-12039__li70592104950">If no, go to <a href="#ALM-12039__li36080609104950">4</a>.</li></ul>
<ol id="ALM-12039__ol3742838310508"><li id="ALM-12039__li49524776104950"><span>Log in to FusionInsight Manager, click <strong id="ALM-12039__b27872374104950">O&amp;M &gt; Alarm<strong id="ALM-12039__b2084142661316"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12039__image168221113135319" src="en-us_image_0000001582927757.png"></span> in the row where the alarm is located, and query the standby OMS Database IP address.</span></li><li id="ALM-12039__li52083901104950"><span>Log in to the active OMS Database node as user <strong id="ALM-12039__b43069802104950">root</strong>. <span id="ALM-12039__text43649449460"></span></span></li><li id="ALM-12039__li5718024104950"><span>Run the <strong id="ALM-12039__b66101931104950">ping </strong><em id="ALM-12039__i58046467104950">Standby OMS Database heartbeat IP address</em> command to check whether the standby OMS Database node is reachable.</span><p><ul class="subitemlist" id="ALM-12039__ul635336104950"><li id="ALM-12039__li4143393104950">If yes, go to <a href="#ALM-12039__li19362442104950">6</a>.</li><li id="ALM-12039__li70592104950">If no, go to <a href="#ALM-12039__li36080609104950">4</a>.</li></ul>
</p></li><li id="ALM-12039__li36080609104950"><a name="ALM-12039__li36080609104950"></a><a name="li36080609104950"></a><span>Contact the network administrator to check whether the network is faulty.</span><p><ul class="subitemlist" id="ALM-12039__ul18922037104950"><li id="ALM-12039__li60506784104950">If yes, go to <a href="#ALM-12039__li35036231104950">5</a>.</li><li id="ALM-12039__li2102448104950">If no, go to <a href="#ALM-12039__li19362442104950">6</a>.</li></ul>
</p></li><li id="ALM-12039__li35036231104950"><a name="ALM-12039__li35036231104950"></a><a name="li35036231104950"></a><span>Rectify the network fault and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12039__ul31915716104950"><li id="ALM-12039__li56290029104950">If yes, no further action is required.</li><li id="ALM-12039__li63198514104950">If no, go to <a href="#ALM-12039__li19362442104950">6</a>.</li></ul>
</p></li></ol>
@ -89,7 +89,7 @@
</p></li><li id="ALM-12039__li27597409104950"><a name="ALM-12039__li27597409104950"></a><a name="li27597409104950"></a><span>Expand the disk capacity.</span></li><li id="ALM-12039__li21260851104950"><span>After the disk capacity is expanded, wait 2 minutes and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12039__ul6890515104950"><li id="ALM-12039__li47050096104950">If yes, no further action is required.</li><li id="ALM-12039__li52961395104950">If no, go to <a href="#ALM-12039__li64121842104950">16</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12039__p62014640104950"><strong id="ALM-12039__b2323110410512">Collect fault information.</strong></p>
<ol start="16" id="ALM-12039__ol6204213010516"><li id="ALM-12039__li64121842104950"><a name="ALM-12039__li64121842104950"></a><a name="li64121842104950"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12039__b57129933104950">O&amp;M</strong> &gt; <strong id="ALM-12039__b44407351104950">Log &gt; Download</strong>.</span></li><li id="ALM-12039__li65050230104950"><span>Select <strong id="ALM-12039__b40225672104950">OMMServer</strong> from the <strong id="ALM-12039__b26486728104950">Service</strong> and click <strong id="ALM-12039__b3991118545">OK</strong>.</span></li><li id="ALM-12039__li1145664103113"><span>Click <span><img id="ALM-12039__image1945644173117" src="en-us_image_0269383852.png"></span> in the upper right corner, and set <strong id="ALM-12039__b6456941173117">Start Date</strong> and <strong id="ALM-12039__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12039__b13456164113319">Download</strong>.</span></li><li id="ALM-12039__li495644512588"><span>Contact the <span id="ALM-12039__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="16" id="ALM-12039__ol6204213010516"><li id="ALM-12039__li64121842104950"><a name="ALM-12039__li64121842104950"></a><a name="li64121842104950"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12039__b57129933104950">O&amp;M</strong> &gt; <strong id="ALM-12039__b44407351104950">Log &gt; Download</strong>.</span></li><li id="ALM-12039__li65050230104950"><span>Select <strong id="ALM-12039__b40225672104950">OMMServer</strong> from the <strong id="ALM-12039__b26486728104950">Service</strong> and click <strong id="ALM-12039__b3991118545">OK</strong>.</span></li><li id="ALM-12039__li1145664103113"><span>Click <span><img id="ALM-12039__image1945644173117" src="en-us_image_0000001532448378.png"></span> in the upper right corner, and set <strong id="ALM-12039__b6456941173117">Start Date</strong> and <strong id="ALM-12039__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12039__b13456164113319">Download</strong>.</span></li><li id="ALM-12039__li495644512588"><span>Contact the <span id="ALM-12039__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12039__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12039__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -1,7 +1,11 @@
<a name="ALM-12040"></a><a name="ALM-12040"></a>
<h1 class="topictitle1">ALM-12040 Insufficient System Entropy</h1>
<div id="body50087568"><div class="section" id="ALM-12040__section49858936"><h4 class="sectiontitle">Description</h4><p id="ALM-12040__p144638408307">The system checks the entropy five consecutive times at 00:00 every day. Specifically, the system checks whether rng-tools or haveged has been enabled and correctly configured. If neither is configured, the system continues to check the entropy. If the entropy is less than 100 in all five consecutive checks, this alarm is reported.</p>
<div id="body50087568"><div class="section" id="ALM-12040__section49858936"><h4 class="sectiontitle">Description</h4><p id="ALM-12040__p2074190332">MRS 3.2.0 or later:</p>
<p id="ALM-12040__p6122155912253">Every 5 minutes, the system checks whether rng-tools or haveged has been enabled and correctly configured. If neither tool is configured, this alarm is generated. If either is configured, the system continues to check the entropy. If the entropy is less than 100 in all five consecutive checks, this alarm is generated.</p>
<p id="ALM-12040__p512217596259">This alarm is cleared when rng-tools or haveged has been installed and enabled on the target node and the entropy of the OS is greater than or equal to 100 in at least one of five entropy checks.</p>
<p id="ALM-12040__p117941134143317">MRS 3.1.2 or earlier:</p>
<p id="ALM-12040__p144638408307">The system checks the entropy five consecutive times at 00:00 every day. Specifically, the system checks whether rng-tools or haveged has been enabled and correctly configured. If neither is configured, the system continues to check the entropy. If the entropy is less than 100 in all five consecutive checks, this alarm is reported.</p>
<p id="ALM-12040__p1146314023010">This alarm is cleared when the system detects that the true random number mode has been configured, the random number parameters have been configured in the pseudo-random number mode, or neither mode is configured but the entropy of the OS is greater than or equal to 100 in at least one of five entropy checks.</p>
</div>
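<p>The alarm rule described above for MRS 3.2.0 or later can be sketched as follows. This is a minimal illustration, not the product's implementation: the function names are hypothetical, and reading <strong>/proc/sys/kernel/random/entropy_avail</strong> (the standard Linux interface for the kernel's entropy estimate) is an assumption about how a node could sample entropy. Only the threshold of 100 and the five-check rule come from the description.</p>

```python
from typing import Sequence

ENTROPY_THRESHOLD = 100  # threshold stated in the alarm description
CHECKS_PER_ROUND = 5     # number of consecutive checks stated in the description


def read_entropy_avail(path: str = "/proc/sys/kernel/random/entropy_avail") -> int:
    """Return the kernel's available-entropy estimate (Linux only).

    Assumption: the source does not say which interface the system reads.
    """
    with open(path) as f:
        return int(f.read().strip())


def alarm_should_fire(tool_configured: bool, samples: Sequence[int]) -> bool:
    """Apply the MRS 3.2.0+ rule to one round of entropy samples.

    tool_configured: True if rng-tools or haveged is enabled and configured.
    samples: the entropy values observed in five consecutive checks.
    """
    if not tool_configured:
        # Neither tool is configured: the alarm is generated immediately.
        return True
    if len(samples) != CHECKS_PER_ROUND:
        raise ValueError(f"expected {CHECKS_PER_ROUND} samples, got {len(samples)}")
    # The alarm is generated only if every check is below the threshold;
    # one value >= 100 is enough to keep the alarm cleared.
    return all(s < ENTROPY_THRESHOLD for s in samples)
```

<p>For example, <strong>alarm_should_fire(True, [10, 20, 100, 40, 50])</strong> returns <strong>False</strong>, because one of the five checks reaches the threshold.</p>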
<div class="section" id="ALM-12040__section46077243"><h4 class="sectiontitle">Attribute</h4>
@ -90,7 +94,7 @@ Restart=always</pre>
</p></li><li id="ALM-12040__li20231214524"><a name="ALM-12040__li20231214524"></a><a name="li20231214524"></a><span>Wait until the system checks the entropy at 00:00 on the following day, and then check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12040__ul17214121526"><li id="ALM-12040__li172812165218">If yes, no further action is required.</li><li id="ALM-12040__li10211245210">If no, go to <a href="#ALM-12040__li5962839105655">10</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12040__p39013326105655"><strong id="ALM-12040__b15098459105711">Collect fault information.</strong></p>
<ol start="10" id="ALM-12040__ol3438675910577"><li id="ALM-12040__li5962839105655"><a name="ALM-12040__li5962839105655"></a><a name="li5962839105655"></a><span>On FusionInsight Manager, choose <strong id="ALM-12040__b15129118135012">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12040__b913828115012">Log</strong> &gt; <strong id="ALM-12040__b131389811500">Download</strong>.</span></li><li id="ALM-12040__li53665559105655"><span>Select <strong id="ALM-12040__b168670067183456">NodeAgent</strong> for <strong id="ALM-12040__b77671734683456">Service</strong> and click <strong id="ALM-12040__b26186472983456">OK</strong>.</span></li><li id="ALM-12040__li13227985105655"><span>Click <span><img id="ALM-12040__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12040__b357114351501">Start Date</strong> and <strong id="ALM-12040__b1572183555014">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12040__b18573163555012">Download</strong>.</span></li><li id="ALM-12040__li64833892105655"><span>Contact <span id="ALM-12040__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="10" id="ALM-12040__ol3438675910577"><li id="ALM-12040__li5962839105655"><a name="ALM-12040__li5962839105655"></a><a name="li5962839105655"></a><span>On FusionInsight Manager, choose <strong id="ALM-12040__b15129118135012">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12040__b913828115012">Log</strong> &gt; <strong id="ALM-12040__b131389811500">Download</strong>.</span></li><li id="ALM-12040__li53665559105655"><span>Select <strong id="ALM-12040__b168670067183456">NodeAgent</strong> for <strong id="ALM-12040__b77671734683456">Service</strong> and click <strong id="ALM-12040__b26186472983456">OK</strong>.</span></li><li id="ALM-12040__li13227985105655"><span>Click <span><img id="ALM-12040__image104601319175315" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12040__b357114351501">Start Date</strong> and <strong id="ALM-12040__b1572183555014">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12040__b18573163555012">Download</strong>.</span></li><li id="ALM-12040__li64833892105655"><span>Contact <span id="ALM-12040__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12040__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12040__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>

@ -70,10 +70,10 @@
</p></li><li id="ALM-12041__li937595411014"><span>Compare the actual permission of the file with the expected permission obtained in <a href="#ALM-12041__li1834285111014">5</a> and correct the permission, user, and user group information for the file.</span></li><li id="ALM-12041__li75110811014"><span>Wait an hour and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12041__ul4392001111014"><li id="ALM-12041__li1727472911014">If yes, no further action is required.</li><li id="ALM-12041__li5707578411014">If no, go to <a href="#ALM-12041__li1068683211014">8</a>.</li></ul>
<div class="note" id="ALM-12041__note50974664111832"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12041__p22323065111841">If the disk partition where the cluster installation directory resides is full, a failed <strong id="ALM-12041__b66689858111841">sed</strong> command leaves temporary files in the program installation directory. Users do not have read, write, or execute permissions on these temporary files. If these files are within the monitoring range of the alarm, the system reports an alarm indicating that their permissions are abnormal. Perform the preceding alarm handling procedure to clear the alarm. Alternatively, after confirming that the files with abnormal permissions are temporary, delete them directly. A temporary file generated after a failed <strong id="ALM-12041__b63337813111841">sed</strong> command looks similar to the following.</p>
</div></div>
<p class="subitemlist" id="ALM-12041__p132194544418"><span><img id="ALM-12041__image13221252114113" src="en-us_image_0269383855.jpg"></span></p>
<p class="subitemlist" id="ALM-12041__p132194544418"><span><img id="ALM-12041__image13221252114113" src="en-us_image_0000001532927558.jpg"></span></p>
</p></li></ol>
<p class="tableheading" id="ALM-12041__p5973578011014"><strong id="ALM-12041__b120539411028">Collect fault information.</strong></p>
<ol start="8" id="ALM-12041__ol6667694311030"><li id="ALM-12041__li1068683211014"><a name="ALM-12041__li1068683211014"></a><a name="li1068683211014"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12041__b675997211014">O&amp;M</strong> &gt; <strong id="ALM-12041__b6083974911014">Log &gt; Download</strong>.</span></li><li id="ALM-12041__li5465607911014"><span>Select <strong id="ALM-12041__b2907263111014">NodeAgent</strong> from the <strong id="ALM-12041__b6032708911014">Service</strong> and click <strong id="ALM-12041__b3991118545">OK</strong>.</span></li><li id="ALM-12041__li1145664103113"><span>Click <span><img id="ALM-12041__image1945644173117" src="en-us_image_0269383856.png"></span> in the upper right corner, and set <strong id="ALM-12041__b6456941173117">Start Date</strong> and <strong id="ALM-12041__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12041__b13456164113319">Download</strong>.</span></li><li id="ALM-12041__li495644512588"><span>Contact the <span id="ALM-12041__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="8" id="ALM-12041__ol6667694311030"><li id="ALM-12041__li1068683211014"><a name="ALM-12041__li1068683211014"></a><a name="li1068683211014"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12041__b675997211014">O&amp;M</strong> &gt; <strong id="ALM-12041__b6083974911014">Log &gt; Download</strong>.</span></li><li id="ALM-12041__li5465607911014"><span>Select <strong id="ALM-12041__b2907263111014">NodeAgent</strong> from the <strong id="ALM-12041__b6032708911014">Service</strong> and click <strong id="ALM-12041__b3991118545">OK</strong>.</span></li><li id="ALM-12041__li1145664103113"><span>Click <span><img id="ALM-12041__image1945644173117" src="en-us_image_0000001532607890.png"></span> in the upper right corner, and set <strong id="ALM-12041__b6456941173117">Start Date</strong> and <strong id="ALM-12041__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12041__b13456164113319">Download</strong>.</span></li><li id="ALM-12041__li495644512588"><span>Contact the <span id="ALM-12041__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12041__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12041__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -72,7 +72,7 @@
</p></li><li id="ALM-12042__li3021967611310"><span>Wait an hour and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12042__ul3019924411310"><li id="ALM-12042__li5785262811310">If yes, no further action is required.</li><li id="ALM-12042__li5555125411310">If no, go to <a href="#ALM-12042__li1843685711310">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12042__p335774111310"><strong id="ALM-12042__b2285498211323">Collect fault information.</strong></p>
<ol start="6" id="ALM-12042__ol3443027411326"><li id="ALM-12042__li1843685711310"><a name="ALM-12042__li1843685711310"></a><a name="li1843685711310"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12042__b354163311310">O&amp;M</strong> &gt; <strong id="ALM-12042__b3187470111310">Log &gt; Download</strong>.</span></li><li id="ALM-12042__li3405016711310"><span>Select <strong id="ALM-12042__b3171399011310">NodeAgent</strong> from the <strong id="ALM-12042__b1699046211310">Service</strong> and click <strong id="ALM-12042__b3991118545">OK</strong>.</span></li><li id="ALM-12042__li1145664103113"><span>Click <span><img id="ALM-12042__image1945644173117" src="en-us_image_0269383857.png"></span> in the upper right corner, and set <strong id="ALM-12042__b6456941173117">Start Date</strong> and <strong id="ALM-12042__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12042__b13456164113319">Download</strong>.</span></li><li id="ALM-12042__li495644512588"><span>Contact the <span id="ALM-12042__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="6" id="ALM-12042__ol3443027411326"><li id="ALM-12042__li1843685711310"><a name="ALM-12042__li1843685711310"></a><a name="li1843685711310"></a><span>On the FusionInsight Manager portal, choose <strong id="ALM-12042__b354163311310">O&amp;M</strong> &gt; <strong id="ALM-12042__b3187470111310">Log &gt; Download</strong>.</span></li><li id="ALM-12042__li3405016711310"><span>Select <strong id="ALM-12042__b3171399011310">NodeAgent</strong> from the <strong id="ALM-12042__b1699046211310">Service</strong> and click <strong id="ALM-12042__b3991118545">OK</strong>.</span></li><li id="ALM-12042__li1145664103113"><span>Click <span><img id="ALM-12042__image1945644173117" src="en-us_image_0000001532927502.png"></span> in the upper right corner, and set <strong id="ALM-12042__b6456941173117">Start Date</strong> and <strong id="ALM-12042__b11456154113318">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12042__b13456164113319">Download</strong>.</span></li><li id="ALM-12042__li495644512588"><span>Contact the <span id="ALM-12042__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12042__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12042__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -73,7 +73,7 @@
<div class="section" id="ALM-12045__section56798701"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12045__ul27994219"><li id="ALM-12045__li37441695101640">An OS exception occurs.</li><li id="ALM-12045__li60731574192851">The NICs are bonded in active/standby mode.</li><li id="ALM-12045__li50621380">The alarm threshold is improperly configured.</li><li id="ALM-12045__li52939239">The network quality is poor.</li></ul>
</div>
<div class="section" id="ALM-12045__section41426264"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12045__p60550233154039"><strong id="ALM-12045__b20378211155946">View the network packet dropped rate.</strong></p>
<ol id="ALM-12045__ol54177744154120"><li id="ALM-12045__li34357272165726"><span>On FusionInsight Manager, choose <strong id="ALM-12045__b8597763200">O&amp;M</strong> &gt; <strong id="ALM-12045__b1760510652010">Alarm</strong> &gt; <strong id="ALM-12045__b19605968206">Alarms</strong>. On the page that is displayed, click <span><img id="ALM-12045__image168221113135319" src="en-us_image_0263895776.png"></span> in the row containing the alarm, and view the name of the host for which the alarm is generated and the NIC name.</span></li><li id="ALM-12045__li17837656154120"><span>Log in to the alarm node as user <strong id="ALM-12045__b35564051154120">omm</strong>, and run the <strong id="ALM-12045__b143893142219">/sbin/ifconfig </strong><em id="ALM-12045__i07461520122210">NIC name</em> command to check whether packet loss occurs on the network.</span><p><p id="ALM-12045__p897517249258"><span><img id="ALM-12045__image14835549449" src="en-us_image_0000001390459688.png"></span></p>
<ol id="ALM-12045__ol54177744154120"><li id="ALM-12045__li34357272165726"><span>On FusionInsight Manager, choose <strong id="ALM-12045__b8597763200">O&amp;M</strong> &gt; <strong id="ALM-12045__b1760510652010">Alarm</strong> &gt; <strong id="ALM-12045__b19605968206">Alarms</strong>. On the page that is displayed, click <span><img id="ALM-12045__image168221113135319" src="en-us_image_0000001583087417.png"></span> in the row containing the alarm, and view the name of the host for which the alarm is generated and the NIC name.</span></li><li id="ALM-12045__li17837656154120"><span>Log in to the alarm node as user <strong id="ALM-12045__b35564051154120">omm</strong>, and run the <strong id="ALM-12045__b143893142219">/sbin/ifconfig </strong><em id="ALM-12045__i07461520122210">NIC name</em> command to check whether packet loss occurs on the network.</span><p><p id="ALM-12045__p897517249258"><span><img id="ALM-12045__image14835549449" src="en-us_image_0000001532767498.png"></span></p>
<div class="note" id="ALM-12045__note5975624192520"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="ALM-12045__ul29750248258"><li class="text" id="ALM-12045__li13975142415259"><em id="ALM-12045__i182770239944413">IP address of the node for which the alarm is generated</em>: Query the IP address of the node for which the alarm is generated on the <strong id="ALM-12045__b76887363443">Hosts</strong> page of FusionInsight Manager based on the value of <strong id="ALM-12045__b37108310144413">HostName</strong> in the alarm location information. Check both the IP addresses of the management plane and service plane.</li><li id="ALM-12045__li19975124182513">Packet loss rate = (Number of dropped packets/Total number of received packets) x 100%. If the packet loss rate is greater than the system threshold (0.5% by default), read packets are dropped.</li></ul>
</div></div>
<ul id="ALM-12045__ul12976132492510"><li id="ALM-12045__li1097522414255">If yes, go to <a href="#ALM-12045__li4196511811134">11</a>.</li><li id="ALM-12045__li297652462516">If no, go to <a href="#ALM-12045__li6542838717657">3</a>.</li></ul>
@ -93,8 +93,8 @@ Red Hat Enterprise Linux Server release<strong id="ALM-12045__b26880224102544">
Linux version <strong id="ALM-12045__b37899196102550">3.0.101-63-default</strong> (geeko@buildhost) (gcc version 4.3.4 [gcc-4_3-branch revision 152973] (SUSE Linux) ) #1 SMP Tue Jun 23 16:02:31 UTC 2015 (4b89d0c)</pre>
<ul id="ALM-12045__ul62847380172126"><li id="ALM-12045__li9858303195115">If yes, the alarm sending function cannot be enabled. Go to <a href="#ALM-12045__li43950618195120">7</a>.</li><li id="ALM-12045__li5930366195117">If no, go to <a href="#ALM-12045__li4196511811134">11</a>.</li></ul>
</p></li><li id="ALM-12045__li43950618195120"><a name="ALM-12045__li43950618195120"></a><a name="li43950618195120"></a><span>Log in to FusionInsight Manager and choose <strong id="ALM-12045__b167130161044413">O&amp;M</strong> &gt; <strong id="ALM-12045__b157735138144413">Alarm</strong> &gt; <strong id="ALM-12045__b156701955944413">Threshold Configuration</strong>.</span></li></ol><ol start="8" id="ALM-12045__ol26457910172340"><li id="ALM-12045__li26465420174815"><span>In the navigation tree of the <strong id="ALM-12045__b478911510483">Thresholds</strong> page, choose <em id="ALM-12045__i594510221489">Name of the desired cluster</em> &gt; <strong id="ALM-12045__b167061042124815">Host</strong> &gt; <strong id="ALM-12045__b6981184610481">Network Reading</strong> &gt; <strong id="ALM-12045__b9341172144913">Read Packet Dropped Rate</strong>. In the area on the right, check whether the <strong id="ALM-12045__b1278202219498">Switch</strong> is toggled on.</span><p><ul id="ALM-12045__ul20313429174820"><li id="ALM-12045__li9894347174820">If yes, the alarm sending function is enabled. Go to <a href="#ALM-12045__li38517503111027">9</a>.</li><li id="ALM-12045__li56297179194352">If no, the alarm sending function is disabled. Go to <a href="#ALM-12045__li16613085112024">10</a>.</li></ul>
</p></li><li id="ALM-12045__li38517503111027"><a name="ALM-12045__li38517503111027"></a><a name="li38517503111027"></a><span>In the area on the right, toggle <strong id="ALM-12045__b1172691785113">Switch</strong> off to disable the checking of <strong id="ALM-12045__b1523917125216">Network Read Packet Dropped Rate Exceeds the Threshold</strong>.</span><p><p id="ALM-12045__p11736263111027"><span><img id="ALM-12045__image828012285713" src="en-us_image_0263895526.png"></span></p>
</p></li><li id="ALM-12045__li16613085112024"><a name="ALM-12045__li16613085112024"></a><a name="li16613085112024"></a><span>On the <strong id="ALM-12045__b16749813195314">Alarm</strong> page of FusionInsight Manager, search for alarm <strong id="ALM-12045__b444015317534">12045</strong> and manually clear the alarm if it is not automatically cleared. No further action is required.</span><p><p id="ALM-12045__p1861091166"><span><img id="ALM-12045__image11618931616" src="en-us_image_0263895376.png"></span></p>
</p></li><li id="ALM-12045__li38517503111027"><a name="ALM-12045__li38517503111027"></a><a name="li38517503111027"></a><span>In the area on the right, toggle <strong id="ALM-12045__b1172691785113">Switch</strong> off to disable the checking of <strong id="ALM-12045__b1523917125216">Network Read Packet Dropped Rate Exceeds the Threshold</strong>.</span><p><p id="ALM-12045__p11736263111027"><span><img id="ALM-12045__image828012285713" src="en-us_image_0000001532607762.png"></span></p>
</p></li><li id="ALM-12045__li16613085112024"><a name="ALM-12045__li16613085112024"></a><a name="li16613085112024"></a><span>On the <strong id="ALM-12045__b16749813195314">Alarm</strong> page of FusionInsight Manager, search for alarm <strong id="ALM-12045__b444015317534">12045</strong> and manually clear the alarm if it is not automatically cleared. No further action is required.</span><p><p id="ALM-12045__p1861091166"><span><img id="ALM-12045__image11618931616" src="en-us_image_0000001532448274.png"></span></p>
<div class="note" id="ALM-12045__note60160766112035"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12045__p4575985112035">ID of the Network Read Packet Dropped Rate Exceeds the Threshold alarm is <strong id="ALM-12045__b1878673214337">12045</strong>.</p>
</div></div>
</p></li></ol>
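<p>The packet loss rate formula given in the note for step 2 (number of dropped packets divided by total number of received packets, times 100%) can be worked through with a short sketch. The function names are hypothetical and the counter values are supplied by hand; in practice they would be read from the RX line of the <strong>/sbin/ifconfig</strong> output. Only the formula and the 0.5% default threshold come from the procedure.</p>

```python
DEFAULT_THRESHOLD_PERCENT = 0.5  # default system threshold stated in the procedure


def packet_loss_rate(dropped: int, received: int) -> float:
    """Return the read packet loss rate as a percentage.

    Packet loss rate = (dropped packets / total received packets) x 100%.
    Returns 0.0 when no packets were received, to avoid dividing by zero
    (an assumption; the procedure does not cover this case).
    """
    if received == 0:
        return 0.0
    return dropped / received * 100.0


def exceeds_threshold(dropped: int, received: int,
                      threshold_percent: float = DEFAULT_THRESHOLD_PERCENT) -> bool:
    """Return True if the loss rate is greater than the alarm threshold."""
    return packet_loss_rate(dropped, received) > threshold_percent
```

<p>For example, 6 dropped packets out of 1000 received gives a rate of about 0.6%, which is above the default threshold, while 4 out of 1000 (about 0.4%) is not.</p>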
@ -138,7 +138,7 @@ Slave queue ID: 0</pre>
</p></li></ol>
<p class="tableheading" id="ALM-12045__p60219992"><strong id="ALM-12045__b47522666112832">Check whether the threshold is set properly.</strong></p>
<ol start="14" id="ALM-12045__ol16493433173222"><li id="ALM-12045__li61276131112834"><a name="ALM-12045__li61276131112834"></a><a name="li61276131112834"></a><span>Log in to FusionInsight Manager, choose <strong id="ALM-12045__b659184595419">O&amp;M</strong> &gt; <strong id="ALM-12045__b57512047155419">Alarm</strong> &gt; <strong id="ALM-12045__b23011451552">Thresholds</strong> &gt; <em id="ALM-12045__i18305310145510">Name of the desired cluster</em> &gt; <strong id="ALM-12045__b6129181516557">Host</strong> &gt; <strong id="ALM-12045__b882618236551">Network Reading</strong> &gt; <strong id="ALM-12045__b135892028105514">Read Packet Dropped Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12045__b144531924155615">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12045__ul36634620112834"><li id="ALM-12045__li23616603112834">If yes, go to <a href="#ALM-12045__li56023883112834">17</a>.</li><li id="ALM-12045__li33896675112834">If no, go to <a href="#ALM-12045__li47653126112834">15</a>.</li></ul>
</p></li></ol><ol start="15" id="ALM-12045__ol13032980174025"><li id="ALM-12045__li47653126112834"><a name="ALM-12045__li47653126112834"></a><a name="li47653126112834"></a><span>Choose <strong id="ALM-12045__b66788575566">O&amp;M</strong> &gt; <strong id="ALM-12045__b9758759195615">Alarm</strong> &gt; <strong id="ALM-12045__b53403618572">Thresholds</strong> &gt; <em id="ALM-12045__i666599145719">Name of the desired cluster</em> &gt; <strong id="ALM-12045__b16662161395714">Host</strong> &gt; <strong id="ALM-12045__b182811922155712">Network Reading</strong> &gt; <strong id="ALM-12045__b489452945716">Read Packet Dropped Rate</strong>. Click <strong id="ALM-12045__b204309457574">Modify</strong> in the <strong id="ALM-12045__b320605515717">Operation</strong> column to change the threshold. See <a href="#ALM-12045__fig52784093112834">Figure 1</a>.</span><p><div class="fignone" id="ALM-12045__fig52784093112834"><a name="ALM-12045__fig52784093112834"></a><a name="fig52784093112834"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12045__image956695784115" src="en-us_image_0000001390618884.png"></span></div>
</p></li></ol><ol start="15" id="ALM-12045__ol13032980174025"><li id="ALM-12045__li47653126112834"><a name="ALM-12045__li47653126112834"></a><a name="li47653126112834"></a><span>Choose <strong id="ALM-12045__b66788575566">O&amp;M</strong> &gt; <strong id="ALM-12045__b9758759195615">Alarm</strong> &gt; <strong id="ALM-12045__b53403618572">Thresholds</strong> &gt; <em id="ALM-12045__i666599145719">Name of the desired cluster</em> &gt; <strong id="ALM-12045__b16662161395714">Host</strong> &gt; <strong id="ALM-12045__b182811922155712">Network Reading</strong> &gt; <strong id="ALM-12045__b489452945716">Read Packet Dropped Rate</strong>. Click <strong id="ALM-12045__b204309457574">Modify</strong> in the <strong id="ALM-12045__b320605515717">Operation</strong> column to change the threshold. See <a href="#ALM-12045__fig52784093112834">Figure 1</a>.</span><p><div class="fignone" id="ALM-12045__fig52784093112834"><a name="ALM-12045__fig52784093112834"></a><a name="fig52784093112834"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12045__image956695784115" src="en-us_image_0000001582927657.png"></span></div>
</p></li><li id="ALM-12045__li20285900112834"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12045__ul59074262112834"><li id="ALM-12045__li26224954112834">If yes, no further action is required.</li><li id="ALM-12045__li43846509112834">If no, go to <a href="#ALM-12045__li56023883112834">17</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12045__p2385731319369"><strong id="ALM-12045__b1338922719369">Check whether the network connection is normal.</strong></p>
@ -146,7 +146,7 @@ Slave queue ID: 0</pre>
</p></li><li id="ALM-12045__li4503547112834"><a name="ALM-12045__li4503547112834"></a><a name="li4503547112834"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12045__ul17454193112834"><li id="ALM-12045__li34452907112834">If yes, no further action is required.</li><li id="ALM-12045__li39222057112834">If no, go to <a href="#ALM-12045__li40531926112834">19</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12045__p22870015112834"><strong id="ALM-12045__b58378062112918">Collect the fault information.</strong></p>
<ol start="19" id="ALM-12045__ol57529826112922"><li id="ALM-12045__li40531926112834"><a name="ALM-12045__li40531926112834"></a><a name="li40531926112834"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12045__b15490161312420">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12045__b1349021382415">Log</strong> &gt; <strong id="ALM-12045__b7491161320243">Download</strong>.</span></li><li id="ALM-12045__li29243017112834"><span>Select <strong id="ALM-12045__b1473121611242">OMS</strong> for <strong id="ALM-12045__b20731016152419">Service</strong> and click <strong id="ALM-12045__b8746168242">OK</strong>.</span></li><li id="ALM-12045__li61860565112834"><span>Expand the <strong id="ALM-12045__b168351953175820">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12045__li19874180112834"><span>Click <span><img id="ALM-12045__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12045__b198664245246">Start Date</strong> and <strong id="ALM-12045__b12867324152414">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12045__b1286718244241">Download</strong>.</span></li><li id="ALM-12045__li66304723112834"><span>Contact <span id="ALM-12045__text14546632162412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="19" id="ALM-12045__ol57529826112922"><li id="ALM-12045__li40531926112834"><a name="ALM-12045__li40531926112834"></a><a name="li40531926112834"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12045__b15490161312420">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12045__b1349021382415">Log</strong> &gt; <strong id="ALM-12045__b7491161320243">Download</strong>.</span></li><li id="ALM-12045__li29243017112834"><span>Select <strong id="ALM-12045__b1473121611242">OMS</strong> for <strong id="ALM-12045__b20731016152419">Service</strong> and click <strong id="ALM-12045__b8746168242">OK</strong>.</span></li><li id="ALM-12045__li61860565112834"><span>Expand the <strong id="ALM-12045__b168351953175820">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12045__li19874180112834"><span>Click <span><img id="ALM-12045__image104601319175315" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12045__b198664245246">Start Date</strong> and <strong id="ALM-12045__b12867324152414">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12045__b1286718244241">Download</strong>.</span></li><li id="ALM-12045__li66304723112834"><span>Contact <span id="ALM-12045__text14546632162412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12045__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12045__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>


@ -73,7 +73,7 @@
<div class="section" id="ALM-12046__section54400241"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12046__p4439065"><strong id="ALM-12046__b488114212259">Check whether the threshold is set properly.</strong></p>
<ol id="ALM-12046__ol51757082114518"><li id="ALM-12046__li5429491411450"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12046__b530513292591">O&amp;M</strong> &gt; <strong id="ALM-12046__b73111296591">Alarm</strong> &gt; <strong id="ALM-12046__b132072925911">Thresholds</strong> &gt; <em id="ALM-12046__i1532816298599">Name of the desired cluster</em> &gt; <strong id="ALM-12046__b534062916596">Host</strong> &gt; <strong id="ALM-12046__b14347529165911">Network Writing</strong> &gt; <strong id="ALM-12046__b13361529195913">Write Packet Dropped Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12046__b7369102916591">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12046__ul603276811450"><li id="ALM-12046__li1878771011450">If yes, go to <a href="#ALM-12046__li4369794811450">4</a>.</li><li id="ALM-12046__li4540955011450">If no, go to <a href="#ALM-12046__li5699560811450">2</a>.</li></ul>
</p></li><li id="ALM-12046__li5699560811450"><a name="ALM-12046__li5699560811450"></a><a name="li5699560811450"></a><span>Choose <strong id="ALM-12046__b86275584598">O&amp;M</strong> &gt; <strong id="ALM-12046__b863815815596">Alarm</strong> &gt; <strong id="ALM-12046__b46391158155914">Thresholds</strong> &gt; <em id="ALM-12046__i17639135845918">Name of the desired cluster</em> &gt; <strong id="ALM-12046__b1639175845912">Host</strong> &gt; <strong id="ALM-12046__b1964015811598">Network Writing</strong> &gt; <strong id="ALM-12046__b17640175865912">Write Packet Dropped Rate</strong>. Click <strong id="ALM-12046__b564014589596">Modify</strong> in the <strong id="ALM-12046__b1664135816596">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12046__p3581190711450">See <a href="#ALM-12046__fig153215311450">Figure 1</a>.</p>
<div class="fignone" id="ALM-12046__fig153215311450"><a name="ALM-12046__fig153215311450"></a><a name="fig153215311450"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12046__image1482785044213" src="en-us_image_0000001390459444.png"></span></div>
<div class="fignone" id="ALM-12046__fig153215311450"><a name="ALM-12046__fig153215311450"></a><a name="fig153215311450"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12046__image1482785044213" src="en-us_image_0000001582807837.png"></span></div>
</p></li><li id="ALM-12046__li1629248811450"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12046__ul1759973611450"><li id="ALM-12046__li4319843211450">If yes, no further action is required.</li><li id="ALM-12046__li941206611450">If no, go to <a href="#ALM-12046__li4369794811450">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12046__p2417989711450"><strong id="ALM-12046__b284519296260">Check whether the network connection is normal.</strong></p>
@ -81,7 +81,7 @@
</p></li><li id="ALM-12046__li6056359711450"><a name="ALM-12046__li6056359711450"></a><a name="li6056359711450"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12046__ul1317526611450"><li id="ALM-12046__li5773721911450">If yes, no further action is required.</li><li id="ALM-12046__li4620316111450">If no, go to <a href="#ALM-12046__li820146511450">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12046__p5146853111450"><strong id="ALM-12046__b6696662511465">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12046__ol4187815011462"><li id="ALM-12046__li820146511450"><a name="ALM-12046__li820146511450"></a><a name="li820146511450"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12046__b82519710275">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12046__b92521274278">Log</strong> &gt; <strong id="ALM-12046__b122521742710">Download</strong>.</span></li><li id="ALM-12046__li670432911450"><span>Select <strong id="ALM-12046__b73620916276">OMS</strong> for <strong id="ALM-12046__b53624992712">Service</strong> and click <strong id="ALM-12046__b17362129182711">OK</strong>.</span></li><li id="ALM-12046__li6033896511450"><span>Expand the <strong id="ALM-12046__b1511218191705">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12046__li617977311450"><span>Click <span><img id="ALM-12046__image92961342720" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12046__b12391113112719">Start Date</strong> and <strong id="ALM-12046__b53961311278">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12046__b23914137276">Download</strong>.</span></li><li id="ALM-12046__li3079963411450"><span>Contact <span id="ALM-12046__text26871216142711">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12046__ol4187815011462"><li id="ALM-12046__li820146511450"><a name="ALM-12046__li820146511450"></a><a name="li820146511450"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12046__b82519710275">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12046__b92521274278">Log</strong> &gt; <strong id="ALM-12046__b122521742710">Download</strong>.</span></li><li id="ALM-12046__li670432911450"><span>Select <strong id="ALM-12046__b73620916276">OMS</strong> for <strong id="ALM-12046__b53624992712">Service</strong> and click <strong id="ALM-12046__b17362129182711">OK</strong>.</span></li><li id="ALM-12046__li6033896511450"><span>Expand the <strong id="ALM-12046__b1511218191705">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12046__li617977311450"><span>Click <span><img id="ALM-12046__image92961342720" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12046__b12391113112719">Start Date</strong> and <strong id="ALM-12046__b53961311278">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12046__b23914137276">Download</strong>.</span></li><li id="ALM-12046__li3079963411450"><span>Contact <span id="ALM-12046__text26871216142711">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12046__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12046__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>


@ -73,7 +73,7 @@
<div class="section" id="ALM-12047__section26508869"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12047__p9150041"><strong id="ALM-12047__b48301864144321">Check whether the threshold is set properly.</strong></p>
<ol id="ALM-12047__ol5621610514492"><li id="ALM-12047__li16991200144325"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12047__b1096642210512">O&amp;M</strong> &gt; <strong id="ALM-12047__b2977102211513">Alarm</strong> &gt; <strong id="ALM-12047__b10994192210512">Thresholds</strong> &gt; <em id="ALM-12047__i1822231959">Name of the desired cluster</em> &gt; <strong id="ALM-12047__b614102317519">Host</strong> &gt; <strong id="ALM-12047__b121811235515">Network Reading</strong> &gt; <strong id="ALM-12047__b13271323151">Read Packet Error Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12047__b173611231556">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12047__ul54083694144325"><li id="ALM-12047__li61199409144325">If yes, go to <a href="#ALM-12047__li47122569144325">4</a>.</li><li id="ALM-12047__li58205082144325">If no, go to <a href="#ALM-12047__li18938060144325">2</a>.</li></ul>
</p></li><li id="ALM-12047__li18938060144325"><a name="ALM-12047__li18938060144325"></a><a name="li18938060144325"></a><span>Choose <strong id="ALM-12047__b11895317762">O&amp;M</strong> &gt; <strong id="ALM-12047__b789714171965">Alarm</strong> &gt; <strong id="ALM-12047__b9898141713613">Thresholds</strong> &gt; <em id="ALM-12047__i389981710618">Name of the desired cluster</em> &gt; <strong id="ALM-12047__b179008171767">Host</strong> &gt; <strong id="ALM-12047__b109007171611">Network Reading</strong> &gt; <strong id="ALM-12047__b790117174615">Read Packet Error Rate</strong>. Click <strong id="ALM-12047__b139032017464">Modify</strong> in the <strong id="ALM-12047__b169038177610">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12047__p34109930144325">See <a href="#ALM-12047__fig35859496144325">Figure 1</a>.</p>
<div class="fignone" id="ALM-12047__fig35859496144325"><a name="ALM-12047__fig35859496144325"></a><a name="fig35859496144325"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12047__image777621374319" src="en-us_image_0000001441218249.png"></span></div>
<div class="fignone" id="ALM-12047__fig35859496144325"><a name="ALM-12047__fig35859496144325"></a><a name="fig35859496144325"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12047__image777621374319" src="en-us_image_0000001532767698.png"></span></div>
</p></li><li id="ALM-12047__li11450397144325"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12047__ul34110047144325"><li id="ALM-12047__li36224819144325">If yes, no further action is required.</li><li id="ALM-12047__li48529247144325">If no, go to <a href="#ALM-12047__li47122569144325">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12047__p38554968144325"><strong id="ALM-12047__b1663413103111">Check whether the network connection is normal.</strong></p>
@ -81,7 +81,7 @@
</p></li><li id="ALM-12047__li52164171144325"><a name="ALM-12047__li52164171144325"></a><a name="li52164171144325"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12047__ul644002144325"><li id="ALM-12047__li21449944144325">If yes, no further action is required.</li><li id="ALM-12047__li59723879144325">If no, go to <a href="#ALM-12047__li66824355144325">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12047__p37260279144922"><strong id="ALM-12047__b41163092144926">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12047__ol4946431144932"><li id="ALM-12047__li66824355144325"><a name="ALM-12047__li66824355144325"></a><a name="li66824355144325"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12047__b114185633111">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12047__b912756123118">Log</strong> &gt; <strong id="ALM-12047__b313125673116">Download</strong>.</span></li><li id="ALM-12047__li64548284144325"><span>Select <strong id="ALM-12047__b13721135814311">OMS</strong> for <strong id="ALM-12047__b8721758153120">Service</strong> and click <strong id="ALM-12047__b187221358143114">OK</strong>.</span></li><li id="ALM-12047__li44063647144325"><span>Expand the <strong id="ALM-12047__b1780712356614">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12047__li61028510144325"><span>Click <span><img id="ALM-12047__image1171914283214" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12047__b672772103210">Start Date</strong> and <strong id="ALM-12047__b9727729327">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12047__b127281226321">Download</strong>.</span></li><li id="ALM-12047__li44362264144325"><span>Contact <span id="ALM-12047__text5904144183214">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12047__ol4946431144932"><li id="ALM-12047__li66824355144325"><a name="ALM-12047__li66824355144325"></a><a name="li66824355144325"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12047__b114185633111">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12047__b912756123118">Log</strong> &gt; <strong id="ALM-12047__b313125673116">Download</strong>.</span></li><li id="ALM-12047__li64548284144325"><span>Select <strong id="ALM-12047__b13721135814311">OMS</strong> for <strong id="ALM-12047__b8721758153120">Service</strong> and click <strong id="ALM-12047__b187221358143114">OK</strong>.</span></li><li id="ALM-12047__li44063647144325"><span>Expand the <strong id="ALM-12047__b1780712356614">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12047__li61028510144325"><span>Click <span><img id="ALM-12047__image1171914283214" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12047__b672772103210">Start Date</strong> and <strong id="ALM-12047__b9727729327">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12047__b127281226321">Download</strong>.</span></li><li id="ALM-12047__li44362264144325"><span>Contact <span id="ALM-12047__text5904144183214">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12047__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12047__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>


@ -73,7 +73,7 @@
<div class="section" id="ALM-12048__section60867610"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12048__p20908314"><strong id="ALM-12048__b538516311339">Check whether the threshold is set properly.</strong></p>
<ol id="ALM-12048__ol6406395014549"><li id="ALM-12048__li11357890145357"><span>Log in to FusionInsight Manager, choose <strong id="ALM-12048__b11238448717">O&amp;M</strong> &gt; <strong id="ALM-12048__b42571744714">Alarm</strong> &gt; <strong id="ALM-12048__b132639411718">Thresholds</strong> &gt; <em id="ALM-12048__i11266204978">Name of the desired cluster</em> &gt; <strong id="ALM-12048__b1126810415718">Host</strong> &gt; <strong id="ALM-12048__b427064876">Network Writing</strong> &gt; <strong id="ALM-12048__b527544074">Write Packet Error Rate</strong>, and check whether the alarm threshold is configured properly. The default value is <strong id="ALM-12048__b152781042714">0.5%</strong>. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12048__ul1261987145357"><li id="ALM-12048__li57812933145357">If yes, go to <a href="#ALM-12048__li12888339145357">4</a>.</li><li id="ALM-12048__li52336003145357">If no, go to <a href="#ALM-12048__li15963175145357">2</a>.</li></ul>
</p></li><li id="ALM-12048__li15963175145357"><a name="ALM-12048__li15963175145357"></a><a name="li15963175145357"></a><span>Choose <strong id="ALM-12048__b143281531670">O&amp;M</strong> &gt; <strong id="ALM-12048__b10334143112710">Alarm</strong> &gt; <strong id="ALM-12048__b835115312714">Thresholds</strong> &gt; <em id="ALM-12048__i9353231677">Name of the desired cluster</em> &gt; <strong id="ALM-12048__b33577318710">Host</strong> &gt; <strong id="ALM-12048__b5359531876">Network Writing</strong> &gt; <strong id="ALM-12048__b11361143118716">Write Packet Error Rate</strong>. Click <strong id="ALM-12048__b236373115712">Modify</strong> in the <strong id="ALM-12048__b136519311171">Operation</strong> column to change the threshold.</span><p><p class="litext" id="ALM-12048__p47573930145357">See <a href="#ALM-12048__fig53221363145357">Figure 1</a>.</p>
<div class="fignone" id="ALM-12048__fig53221363145357"><a name="ALM-12048__fig53221363145357"></a><a name="fig53221363145357"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12048__image71961316435" src="en-us_image_0000001390619040.png"></span></div>
<div class="fignone" id="ALM-12048__fig53221363145357"><a name="ALM-12048__fig53221363145357"></a><a name="fig53221363145357"></a><span class="figcap"><b>Figure 1 </b>Configuring the alarm threshold</span><br><span><img id="ALM-12048__image71961316435" src="en-us_image_0000001532767658.png"></span></div>
</p></li><li id="ALM-12048__li53127101145357"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12048__ul44566628145357"><li id="ALM-12048__li9450851145357">If yes, no further action is required.</li><li id="ALM-12048__li27321468145357">If no, go to <a href="#ALM-12048__li12888339145357">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12048__p65555334145357"><strong id="ALM-12048__b389343463617">Check whether the network connection is normal.</strong></p>
@ -81,7 +81,7 @@
</p></li><li id="ALM-12048__li60279330145357"><a name="ALM-12048__li60279330145357"></a><a name="li60279330145357"></a><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12048__ul3229702145357"><li id="ALM-12048__li48886195145357">If yes, no further action is required.</li><li id="ALM-12048__li358855145357">If no, go to <a href="#ALM-12048__li5643066145357">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12048__p29067324145357"><strong id="ALM-12048__b10082732145437">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12048__ol65647935145434"><li id="ALM-12048__li5643066145357"><a name="ALM-12048__li5643066145357"></a><a name="li5643066145357"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12048__b4624406810">O&amp;M</strong> &gt; <strong id="ALM-12048__b1867213013818">Log</strong> &gt; <strong id="ALM-12048__b1867914017813">Download</strong>.</span></li><li id="ALM-12048__li50787595145357"><span>Select <strong id="ALM-12048__b8263126183">OMS</strong> for <strong id="ALM-12048__b9277766818">Service</strong> and click <strong id="ALM-12048__b142791561189">OK</strong>.</span></li><li id="ALM-12048__li54435176145357"><span>Expand the <strong id="ALM-12048__b192997101388">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12048__li20154536145357"><span>Click <span><img id="ALM-12048__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12048__b154102151382">Start Date</strong> and <strong id="ALM-12048__b24171815687">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12048__b1742012153810">Download</strong>.</span></li><li id="ALM-12048__li21904738145357"><span>Contact <span id="ALM-12048__text1165617231785">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12048__ol65647935145434"><li id="ALM-12048__li5643066145357"><a name="ALM-12048__li5643066145357"></a><a name="li5643066145357"></a><span>On FusionInsight Manager of the active cluster, choose <strong id="ALM-12048__b4624406810">O&amp;M</strong> &gt; <strong id="ALM-12048__b1867213013818">Log</strong> &gt; <strong id="ALM-12048__b1867914017813">Download</strong>.</span></li><li id="ALM-12048__li50787595145357"><span>Select <strong id="ALM-12048__b8263126183">OMS</strong> for <strong id="ALM-12048__b9277766818">Service</strong> and click <strong id="ALM-12048__b142791561189">OK</strong>.</span></li><li id="ALM-12048__li54435176145357"><span>Expand the <strong id="ALM-12048__b192997101388">Hosts</strong> dialog box and select the alarm node and the active OMS node.</span></li><li id="ALM-12048__li20154536145357"><span>Click <span><img id="ALM-12048__image104601319175315" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12048__b154102151382">Start Date</strong> and <strong id="ALM-12048__b24171815687">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time respectively. Then, click <strong id="ALM-12048__b1742012153810">Download</strong>.</span></li><li id="ALM-12048__li21904738145357"><span>Contact <span id="ALM-12048__text1165617231785">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12048__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12048__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>


@ -73,16 +73,16 @@
<div class="section" id="ALM-12049__s0f9f5ec0a021434b9928f5bf4c940044"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12049__en-us_topic_0070543623_p36781757"><strong id="ALM-12049__b4092164015127">Check whether the threshold is set properly.</strong></p>
<ol id="ALM-12049__ol6452245415148"><li id="ALM-12049__li4670351415131"><span>On FusionInsight Manager, choose <strong id="ALM-12049__b15915337194818">O&amp;M &gt; Alarm</strong> &gt; <strong id="ALM-12049__b191503711486">Thresholds</strong> &gt; <em id="ALM-12049__i189151337174819">Name of the desired cluster</em> &gt; <strong id="ALM-12049__b16915237154813">Host</strong> &gt; <strong id="ALM-12049__b1691573754820">Network Reading</strong> &gt; <strong id="ALM-12049__b10915143734811">Read Throughput Rate</strong>, and check whether the alarm threshold is set properly. The default value is 80%. You can adjust the threshold as needed.</span><p><ul class="subitemlist" id="ALM-12049__ul5738506215131"><li id="ALM-12049__li2521002015131">If yes, go to <a href="#ALM-12049__li3065917315131">4</a>.</li><li id="ALM-12049__li2874573915131">If no, go to <a href="#ALM-12049__li5611086815131">2</a>.</li></ul>
</p></li><li id="ALM-12049__li5611086815131"><a name="ALM-12049__li5611086815131"></a><a name="li5611086815131"></a><span>Based on actual usage, choose <strong id="ALM-12049__b07081191469">O&amp;M &gt; Alarm</strong> &gt; <strong id="ALM-12049__b20106143125110">Thresholds</strong> &gt; <em id="ALM-12049__i11541848175114">Name of the desired cluster</em> &gt; <strong id="ALM-12049__b47111192469">Host</strong> &gt; <strong id="ALM-12049__b2418814015131">Network Reading</strong> &gt; <strong id="ALM-12049__b1308228415131">Read Throughput Rate</strong>, and click <strong id="ALM-12049__b84051320104416">Modify</strong> in the <strong id="ALM-12049__b18538823144410">Operation</strong> column to change the alarm threshold.</span><p><p class="litext" id="ALM-12049__p5303205615131">For details, see <a href="#ALM-12049__fig566375315131">Figure 1</a>.</p>
<div class="fignone" id="ALM-12049__fig566375315131"><a name="ALM-12049__fig566375315131"></a><a name="fig566375315131"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12049__image1615410501365" src="en-us_image_0000001440858201.png"></span></div>
<div class="fignone" id="ALM-12049__fig566375315131"><a name="ALM-12049__fig566375315131"></a><a name="fig566375315131"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12049__image1615410501365" src="en-us_image_0000001532448486.png"></span></div>
</p></li><li id="ALM-12049__li6085933615131"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12049__ul5129012315131"><li id="ALM-12049__li3523576915131">If yes, no further action is required.</li><li id="ALM-12049__li3552506415131">If no, go to <a href="#ALM-12049__li3065917315131">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12049__p5895793115131"><strong id="ALM-12049__b5562659915153">Check whether the network port rate can meet the service requirements.</strong></p>
<ol start="4" id="ALM-12049__ol665573431527"><li id="ALM-12049__li3065917315131"><a name="ALM-12049__li3065917315131"></a><a name="li3065917315131"></a><span>On FusionInsight Manager, click <span><img id="ALM-12049__image168221113135319" src="en-us_image_0269383872.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12049__li36506615131"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12049__b749710315131">root</strong>. <span id="ALM-12049__text43649449460"></span></span></li><li id="ALM-12049__li1487667815131"><span>Run the <strong id="ALM-12049__b328560015131">ethtool </strong><em id="ALM-12049__i2957040015131">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12049__note4639220615131"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12049__p6480701315131">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
<ol start="4" id="ALM-12049__ol665573431527"><li id="ALM-12049__li3065917315131"><a name="ALM-12049__li3065917315131"></a><a name="li3065917315131"></a><span>On FusionInsight Manager, click <span><img id="ALM-12049__image168221113135319" src="en-us_image_0000001582927869.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12049__li36506615131"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12049__b749710315131">root</strong>. <span id="ALM-12049__text43649449460"></span></span></li><li id="ALM-12049__li1487667815131"><span>Run the <strong id="ALM-12049__b328560015131">ethtool </strong><em id="ALM-12049__i2957040015131">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12049__note4639220615131"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12049__p6480701315131">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
</div></div>
</p></li><li id="ALM-12049__li6678124515131"><span>If the network read throughput rate exceeds the threshold, contact the system administrator to increase the network port rate.</span></li><li id="ALM-12049__li3780745315131"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12049__ul6509010915131"><li id="ALM-12049__li6416030115131">If yes, no further action is required.</li><li id="ALM-12049__li2960185515131">If no, go to <a href="#ALM-12049__li4699944215131">9</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12049__p4894007015131"><strong id="ALM-12049__b3536737115217">Collect fault information.</strong></p>
<ol start="9" id="ALM-12049__ol2826703815214"><li id="ALM-12049__li4699944215131"><a name="ALM-12049__li4699944215131"></a><a name="li4699944215131"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12049__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12049__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12049__li6522206415131"><span>Select <strong id="ALM-12049__b1352831932712">OMS</strong> from the <strong id="ALM-12049__b4885847115131">Service</strong> and click <strong id="ALM-12049__b3991118545">OK</strong>.</span></li><li id="ALM-12049__li4849583015131"><span>Set <strong id="ALM-12049__b5012766815131">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12049__li1145664103113"><span>Click <span><img id="ALM-12049__image1945644173117" src="en-us_image_0269383873.png"></span> in the upper right corner, and set <strong id="ALM-12049__b6456941173117">Start Date</strong> and <strong id="ALM-12049__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12049__b13456164113319">Download</strong>.</span></li><li id="ALM-12049__li495644512588"><span>Contact the <span id="ALM-12049__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="9" id="ALM-12049__ol2826703815214"><li id="ALM-12049__li4699944215131"><a name="ALM-12049__li4699944215131"></a><a name="li4699944215131"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12049__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12049__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12049__li6522206415131"><span>Select <strong id="ALM-12049__b1352831932712">OMS</strong> from the <strong id="ALM-12049__b4885847115131">Service</strong> and click <strong id="ALM-12049__b3991118545">OK</strong>.</span></li><li id="ALM-12049__li4849583015131"><span>Set <strong id="ALM-12049__b5012766815131">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12049__li1145664103113"><span>Click <span><img id="ALM-12049__image1945644173117" src="en-us_image_0000001582807921.png"></span> in the upper right corner, and set <strong id="ALM-12049__b6456941173117">Start Date</strong> and <strong id="ALM-12049__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12049__b13456164113319">Download</strong>.</span></li><li id="ALM-12049__li495644512588"><span>Contact the <span id="ALM-12049__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12049__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12049__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>


@ -73,16 +73,16 @@
<div class="section" id="ALM-12050__s288b004e523b4795aa832a7ef214236d"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12050__en-us_topic_0070543624_p8115287"><strong id="ALM-12050__b4779452715650">Check whether the threshold is set properly.</strong></p>
<ol id="ALM-12050__ol626009901578"><li id="ALM-12050__li3381340415653"><span>On the FusionInsight Manager, choose <strong id="ALM-12050__b034142294917">O&amp;M &gt; Alarm</strong> &gt; <strong id="ALM-12050__b1334142294919">Thresholds</strong> &gt; <em id="ALM-12050__i63492294910">Name of the desired cluster</em> &gt; <strong id="ALM-12050__b134152244914">Host</strong> &gt; <strong id="ALM-12050__b1349229499">Network Writing</strong> &gt; <strong id="ALM-12050__b1434722204913">Write Throughput Rate</strong> and check whether the alarm threshold is set properly. (By default, 80% is a proper value. However, users can configure the value as required.)</span><p><ul class="subitemlist" id="ALM-12050__ul1867012515653"><li id="ALM-12050__li683775815653">If yes, go to <a href="#ALM-12050__li3034361015653">4</a>.</li><li id="ALM-12050__li1698753915653">If no, go to <a href="#ALM-12050__li2386220215653">2</a>.</li></ul>
</p></li><li id="ALM-12050__li2386220215653"><a name="ALM-12050__li2386220215653"></a><a name="li2386220215653"></a><span>Based on actual usage condition, choose <strong id="ALM-12050__b972065414613">O&amp;M &gt; Alarm</strong> &gt; <strong id="ALM-12050__b17713333531">Thresholds</strong> &gt; <em id="ALM-12050__i277113336535">Name of the desired cluster</em> &gt; <strong id="ALM-12050__b12724175415463">Host</strong> &gt; <strong id="ALM-12050__b2479886215653">Network Writing</strong> &gt; <strong id="ALM-12050__b6255081015653">Write Throughput Rate</strong> and click <strong id="ALM-12050__b84051320104416">Modify</strong> in the<strong id="ALM-12050__b18538823144410"> Operation</strong> column to modify the alarm threshold.</span><p><p class="litext" id="ALM-12050__p3345084615653">For details, see <a href="#ALM-12050__fig2514972915653">Figure 1</a>.</p>
<div class="fignone" id="ALM-12050__fig2514972915653"><a name="ALM-12050__fig2514972915653"></a><a name="fig2514972915653"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12050__image1615410501365" src="en-us_image_0000001440978021.png"></span></div>
<div class="fignone" id="ALM-12050__fig2514972915653"><a name="ALM-12050__fig2514972915653"></a><a name="fig2514972915653"></a><span class="figcap"><b>Figure 1 </b>Setting alarm thresholds</span><br><span><img id="ALM-12050__image1615410501365" src="en-us_image_0000001532448282.png"></span></div>
</p></li><li id="ALM-12050__li5919843115653"><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12050__ul6204017715653"><li id="ALM-12050__li1343323115653">If yes, no further action is required.</li><li id="ALM-12050__li1434989315653">If no, go to <a href="#ALM-12050__li3034361015653">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12050__p2149068415653"><strong id="ALM-12050__b828937615716">Check whether the network port rate can meet the service requirements.</strong></p>
<ol start="4" id="ALM-12050__ol3843532615729"><li id="ALM-12050__li3034361015653"><a name="ALM-12050__li3034361015653"></a><a name="li3034361015653"></a><span>On FusionInsight Manager, click <span><img id="ALM-12050__image168221113135319" src="en-us_image_0269383875.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12050__li4191332115653"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12050__b465703515653">root</strong>. <span id="ALM-12050__text43649449460"></span></span></li><li id="ALM-12050__li3191668615653"><span>Run the <strong id="ALM-12050__b4167557215653">ethtool</strong> <em id="ALM-12050__i3953582915653">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12050__note4828554115653"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12050__p2027814115653">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
<ol start="4" id="ALM-12050__ol3843532615729"><li id="ALM-12050__li3034361015653"><a name="ALM-12050__li3034361015653"></a><a name="li3034361015653"></a><span>On FusionInsight Manager, click <span><img id="ALM-12050__image168221113135319" src="en-us_image_0000001532767506.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the network port name for which the alarm is generated.</span></li><li id="ALM-12050__li4191332115653"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12050__b465703515653">root</strong>. <span id="ALM-12050__text43649449460"></span></span></li><li id="ALM-12050__li3191668615653"><span>Run the <strong id="ALM-12050__b4167557215653">ethtool</strong> <em id="ALM-12050__i3953582915653">network port name</em> command to check the maximum speed of the current network port.</span><p><div class="note" id="ALM-12050__note4828554115653"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12050__p2027814115653">In the VM environment, you cannot run a command to query the network port rate. It is recommended that you contact the system administrator to confirm whether the network port rate meets the requirements.</p>
</div></div>
</p></li><li id="ALM-12050__li1881472115653"><span>If the network write throughput rate exceeds the threshold, contact the system administrator to increase the network port rate.</span></li><li id="ALM-12050__li2938411415653"><span>Check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12050__ul3018892815653"><li id="ALM-12050__li3511476815653">If yes, no further action is required.</li><li id="ALM-12050__li2572394615653">If no, go to <a href="#ALM-12050__li1329206015653">9</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12050__p326490115653"><strong id="ALM-12050__b6410918015740">Collect fault information.</strong></p>
<ol start="9" id="ALM-12050__ol1685979415736"><li id="ALM-12050__li1329206015653"><a name="ALM-12050__li1329206015653"></a><a name="li1329206015653"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12050__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12050__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12050__li3479759015653"><span>Select <strong id="ALM-12050__b1352831932712">OMS</strong> from the <strong id="ALM-12050__b291511315653">Service</strong> and click <strong id="ALM-12050__b3991118545">OK</strong>.</span></li><li id="ALM-12050__li3252315653"><span>Set <strong id="ALM-12050__b4474285615653">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12050__li1145664103113"><span>Click <span><img id="ALM-12050__image1945644173117" src="en-us_image_0269383876.png"></span> in the upper right corner, and set <strong id="ALM-12050__b6456941173117">Start Date</strong> and <strong id="ALM-12050__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12050__b13456164113319">Download</strong>.</span></li><li id="ALM-12050__li495644512588"><span>Contact the <span id="ALM-12050__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="9" id="ALM-12050__ol1685979415736"><li id="ALM-12050__li1329206015653"><a name="ALM-12050__li1329206015653"></a><a name="li1329206015653"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12050__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12050__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12050__li3479759015653"><span>Select <strong id="ALM-12050__b1352831932712">OMS</strong> from the <strong id="ALM-12050__b291511315653">Service</strong> and click <strong id="ALM-12050__b3991118545">OK</strong>.</span></li><li id="ALM-12050__li3252315653"><span>Set <strong id="ALM-12050__b4474285615653">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12050__li1145664103113"><span>Click <span><img id="ALM-12050__image1945644173117" src="en-us_image_0000001583087425.png"></span> in the upper right corner, and set <strong id="ALM-12050__b6456941173117">Start Date</strong> and <strong id="ALM-12050__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12050__b13456164113319">Download</strong>.</span></li><li id="ALM-12050__li495644512588"><span>Contact the <span id="ALM-12050__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12050__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12050__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
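As a rough illustration of the ALM-12050 step that checks the maximum speed of a network port, the sketch below pulls the speed value out of ethtool output. The function name port_speed_mbps is hypothetical and not part of the product, and the sketch assumes the usual "Speed: 10000Mb/s" line that ethtool prints for a physical port. On a VM the value may be reported as unknown, in which case the procedure's advice to contact the system administrator still applies.

```shell
#!/bin/sh
# Hypothetical helper: read ethtool output on stdin and print the
# value of the "Speed:" line with the Mb/s unit removed.
port_speed_mbps() {
    awk -F': ' '/Speed:/ { sub(/Mb\/s/, "", $2); print $2 }'
}

# Example use on the alarmed host (eth0 is a placeholder port name):
#   ethtool eth0 | port_speed_mbps
```

The printed number is in Mbit/s, so it can be compared directly with the write throughput observed on the port.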


@ -71,7 +71,7 @@
<div class="section" id="ALM-12051__s0cead5bbc9184838988c86d15e691059"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12051__p842912019417">Massive small files are stored in the disk.</p>
</div>
<div class="section" id="ALM-12051__s693cda05471c449fbbb49adf59fe9622"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12051__en-us_topic_0070543626_p60322871"><strong id="ALM-12051__b50867683151048">Massive small files are stored in the disk.</strong></p>
<ol id="ALM-12051__ol7307055151059"><li id="ALM-12051__li366824151050"><span>On FusionInsight Manager, choose <strong id="ALM-12051__b3405855155015">O&amp;M &gt; Alarm &gt; Alarms</strong> and click <span><img id="ALM-12051__image168221113135319" src="en-us_image_0269383877.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the disk partition for which the alarm is generated.</span></li><li id="ALM-12051__li29712783151050"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12051__b3301420151050">root</strong>. <span id="ALM-12051__text43649449460"></span></span></li><li id="ALM-12051__li51564885151050"><span>Run the <strong id="ALM-12051__b62048598192327">df -i | grep -iE "</strong><em id="ALM-12051__i45978739192327">partition name|</em>Filesystem" command to check the current disk Inode usage.</span><p><pre class="screen" id="ALM-12051__screen3233016319146"># df -i | grep -iE "<em id="ALM-12051__i85483581988"><strong id="ALM-12051__b67335711988">xvda2</strong></em>|Filesystem"
<ol id="ALM-12051__ol7307055151059"><li id="ALM-12051__li366824151050"><span>On FusionInsight Manager, choose <strong id="ALM-12051__b3405855155015">O&amp;M &gt; Alarm &gt; Alarms</strong> and click <span><img id="ALM-12051__image168221113135319" src="en-us_image_0000001583127373.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host and the disk partition for which the alarm is generated.</span></li><li id="ALM-12051__li29712783151050"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12051__b3301420151050">root</strong>. <span id="ALM-12051__text43649449460"></span></span></li><li id="ALM-12051__li51564885151050"><span>Run the <strong id="ALM-12051__b62048598192327">df -i | grep -iE "</strong><em id="ALM-12051__i45978739192327">partition name|</em>Filesystem" command to check the current disk Inode usage.</span><p><pre class="screen" id="ALM-12051__screen3233016319146"># df -i | grep -iE "<em id="ALM-12051__i85483581988"><strong id="ALM-12051__b67335711988">xvda2</strong></em>|Filesystem"
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/xvda2 2359296 207420 2151876 <strong id="ALM-12051__b4380300819328">9%</strong> /</pre>
</p></li><li id="ALM-12051__li14711322338"><span>If the Inode usage exceeds the threshold, manually check small files stored in the disk partition and confirm whether these small files can be deleted.</span><p><div class="note" id="ALM-12051__note187031221315"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12051__p17058221139">Run the <strong id="ALM-12051__b767393015319">for i in /*; do echo $i; find $i|wc -l;</strong> <strong id="ALM-12051__b176739309311">done</strong> command to query the number of files in a partition. Replace <strong id="ALM-12051__b1367310309311">/*</strong> with the specified partition.</p>
@ -89,7 +89,7 @@ Filesystem Inodes IUsed IFree IUse% Mounted on
</p></li><li id="ALM-12051__li52275864151050"><a name="ALM-12051__li52275864151050"></a><a name="li52275864151050"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12051__ul42070605151050"><li id="ALM-12051__li65141340151050">If yes, no further action is required.</li><li class="subitemlist" id="ALM-12051__li14419336439">If no, go to <a href="#ALM-12051__li1819875814203">6</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12051__p32707140151050"><strong id="ALM-12051__b22914379151125">Collect fault information.</strong></p>
<ol start="6" id="ALM-12051__ol40476616151128"><li id="ALM-12051__li1819875814203"><a name="ALM-12051__li1819875814203"></a><a name="li1819875814203"></a><span>On the FusionInsight Manager home page of the active cluster, choose<strong id="ALM-12051__b32032649151050"> </strong><strong id="ALM-12051__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12051__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12051__li24712483151050"><span>Select <strong id="ALM-12051__b1352831932712">OMS</strong> from the <strong id="ALM-12051__b48358353151050">Service</strong> and click <strong id="ALM-12051__b3991118545">OK</strong>.</span></li><li id="ALM-12051__li55554122151050"><span>Set <strong id="ALM-12051__b21085761151050">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12051__li1145664103113"><span>Click <span><img id="ALM-12051__image1945644173117" src="en-us_image_0269383878.png"></span> in the upper right corner, and set <strong id="ALM-12051__b6456941173117">Start Date</strong> and <strong id="ALM-12051__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12051__b13456164113319">Download</strong>.</span></li><li id="ALM-12051__li495644512588"><span>Contact the <span id="ALM-12051__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="6" id="ALM-12051__ol40476616151128"><li id="ALM-12051__li1819875814203"><a name="ALM-12051__li1819875814203"></a><a name="li1819875814203"></a><span>On the FusionInsight Manager home page of the active cluster, choose<strong id="ALM-12051__b32032649151050"> </strong><strong id="ALM-12051__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12051__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12051__li24712483151050"><span>Select <strong id="ALM-12051__b1352831932712">OMS</strong> from the <strong id="ALM-12051__b48358353151050">Service</strong> and click <strong id="ALM-12051__b3991118545">OK</strong>.</span></li><li id="ALM-12051__li55554122151050"><span>Set <strong id="ALM-12051__b21085761151050">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12051__li1145664103113"><span>Click <span><img id="ALM-12051__image1945644173117" src="en-us_image_0000001532927410.png"></span> in the upper right corner, and set <strong id="ALM-12051__b6456941173117">Start Date</strong> and <strong id="ALM-12051__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12051__b13456164113319">Download</strong>.</span></li><li id="ALM-12051__li495644512588"><span>Contact the <span id="ALM-12051__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12051__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12051__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
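The Inode check in the ALM-12051 procedure can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names inode_usage_pct and count_entries are hypothetical, the first expects `df -i` output in the column layout shown in the procedure, and the second is a variant of the `find | wc -l` loop from the note that sorts directories by entry count.

```shell
#!/bin/sh
# Hypothetical helper: read `df -i` output on stdin and print the IUse%
# value (without the % sign) for the given mount point.
inode_usage_pct() {
    # $1: mount point, for example /
    awk -v mp="$1" 'NR > 1 && $NF == mp { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: count entries under each subdirectory of a
# directory and list the largest first, staying on one file system.
count_entries() {
    # $1: directory to inspect, for example the partition mount point
    for d in "$1"/*; do
        [ -d "$d" ] || continue
        n=$(find "$d" -xdev | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$d"
    done | sort -rn
}

# Example use on the alarmed host:
#   df -i | inode_usage_pct /
#   count_entries /srv | head
```

Each count includes the directory itself, so an empty directory is reported as 1. Directories at the top of the list are the first places to look for small files that may be safe to delete.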


@ -66,7 +66,7 @@
<div class="section" id="ALM-12052__sb7b610c6de7745eb88b799c8579eadf1"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12052__en-us_topic_0070543627_ul48073244"><li id="ALM-12052__en-us_topic_0070543627_li30006020">The temporary port cannot meet the current service requirements.</li><li id="ALM-12052__en-us_topic_0070543627_li1618726">The system is abnormal.</li></ul>
</div>
<div class="section" id="ALM-12052__s6decbfe8b04e489d9cf8766a9aa9271f"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12052__en-us_topic_0070543627_p64008009"><strong id="ALM-12052__b36299953151424">Expand the temporary port number range.</strong></p>
<ol id="ALM-12052__ol4904735151436"><li id="ALM-12052__li53454689151427"><span>On FusionInsight Manager, click <span><img id="ALM-12052__image168221113135319" src="en-us_image_0269383880.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12052__li34862525151427"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12052__b6057410214588">omm</strong>.</span></li><li id="ALM-12052__li5292302151427"><span>Run<strong id="ALM-12052__b4455986911048"> </strong>the<strong id="ALM-12052__b6549450511048"> cat /proc/sys/net/ipv4/ip_local_port_range |cut -f 1 </strong>command to obtain the value of the start port and run the <strong id="ALM-12052__b1596735623311"><strong id="ALM-12052__b179672056113317">cat /proc/sys/net/ipv4/ip_local_port_range</strong> |cut -f 2 </strong>command to obtain the value of the end port. The total number of temporary ports is the value of the end port minus the value of the start port. If the total number of temporary ports is smaller than 28,232, the random port range of the OS is narrow. Contact the system administrator to increase the port range.</span></li><li id="ALM-12052__li235192813711"><span>Run the <strong id="ALM-12052__b1571566811118">ss -ant 2&gt;/dev/null | grep -v LISTEN | awk 'NR &gt; 2 {print $4}'|cut -d ':' -f 2 | awk '$1 &gt;"</strong><i><span class="varname" id="ALM-12052__varname1665926611118">Value of the start port</span></i><strong id="ALM-12052__b722328511118">" {print $1}' | sort -u | wc -l</strong> command to calculate the number of used temporary ports.</span></li><li id="ALM-12052__li47630726151427"><span>The formula for calculating the usage of the temporary ports is: Usage of the temporary ports = (Number of used temporary ports/Total number of temporary ports) x 100%. 
Check whether the temporary port usage exceeds the threshold.</span><p><ul id="ALM-12052__ul22547539165328"><li id="ALM-12052__li56893717165328">If yes, go to <a href="#ALM-12052__li39311997145458">7</a>.</li><li id="ALM-12052__li20178347165328">If no, go to <a href="#ALM-12052__li61526456151427">6</a>.</li></ul>
<ol id="ALM-12052__ol4904735151436"><li id="ALM-12052__li53454689151427"><span>On FusionInsight Manager, click <span><img id="ALM-12052__image168221113135319" src="en-us_image_0000001532607970.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12052__li34862525151427"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12052__b6057410214588">omm</strong>.</span></li><li id="ALM-12052__li5292302151427"><span>Run<strong id="ALM-12052__b4455986911048"> </strong>the<strong id="ALM-12052__b6549450511048"> cat /proc/sys/net/ipv4/ip_local_port_range |cut -f 1 </strong>command to obtain the value of the start port and run the <strong id="ALM-12052__b1596735623311"><strong id="ALM-12052__b179672056113317">cat /proc/sys/net/ipv4/ip_local_port_range</strong> |cut -f 2 </strong>command to obtain the value of the end port. The total number of temporary ports is the value of the end port minus the value of the start port. If the total number of temporary ports is smaller than 28,232, the random port range of the OS is narrow. Contact the system administrator to increase the port range.</span></li><li id="ALM-12052__li235192813711"><span>Run the <strong id="ALM-12052__b1571566811118">ss -ant 2&gt;/dev/null | grep -v LISTEN | awk 'NR &gt; 2 {print $4}'|cut -d ':' -f 2 | awk '$1 &gt;"</strong><i><span class="varname" id="ALM-12052__varname1665926611118">Value of the start port</span></i><strong id="ALM-12052__b722328511118">" {print $1}' | sort -u | wc -l</strong> command to calculate the number of used temporary ports.</span></li><li id="ALM-12052__li47630726151427"><span>The formula for calculating the usage of the temporary ports is: Usage of the temporary ports = (Number of used temporary ports/Total number of temporary ports) x 100%. 
Check whether the temporary port usage exceeds the threshold.</span><p><ul id="ALM-12052__ul22547539165328"><li id="ALM-12052__li56893717165328">If yes, go to <a href="#ALM-12052__li39311997145458">7</a>.</li><li id="ALM-12052__li20178347165328">If no, go to <a href="#ALM-12052__li61526456151427">6</a>.</li></ul>
</p></li><li id="ALM-12052__li61526456151427"><a name="ALM-12052__li61526456151427"></a><a name="li61526456151427"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12052__ul46327333151427"><li id="ALM-12052__li26023356151427">If yes, no further action is required.</li><li id="ALM-12052__li27517102151427">If no, go to <a href="#ALM-12052__li39311997145458">7</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12052__p14292813151427"><strong id="ALM-12052__b49945765151444">Check whether the system environment is abnormal.</strong></p>
@ -86,7 +86,7 @@ tcp 0 0 10-120-85-154:45435 10-120-85-154:9866 CLOSE_WAIT 94237/java
</p></li><li id="ALM-12052__li785710172156"><span>After obtaining the administrator's approval, clear the processes that occupy a large number of ports. Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12052__ul45958539151427"><li id="ALM-12052__li56769573151427">If yes, no further action is required.</li><li id="ALM-12052__li34932666151427">If no, go to <a href="#ALM-12052__li57585220151427">10</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12052__p10973675151427"><strong id="ALM-12052__b3641674915155">Collect fault information.</strong></p>
<ol start="10" id="ALM-12052__ol5485290715150"><li id="ALM-12052__li57585220151427"><a name="ALM-12052__li57585220151427"></a><a name="li57585220151427"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12052__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12052__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12052__li60837487151427"><span>Select <strong id="ALM-12052__b1352831932712">OMS</strong> from the <strong id="ALM-12052__b33891259151427">Service</strong> and click <strong id="ALM-12052__b3991118545">OK</strong>.</span></li><li id="ALM-12052__li28889415151427"><span>Set <strong id="ALM-12052__b10666475151427">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12052__li1145664103113"><span>Click <span><img id="ALM-12052__image1945644173117" src="en-us_image_0269383881.png"></span> in the upper right corner, and set <strong id="ALM-12052__b6456941173117">Start Date</strong> and <strong id="ALM-12052__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12052__b13456164113319">Download</strong>.</span></li><li id="ALM-12052__li495644512588"><span>Contact the <span id="ALM-12052__text4614151421417">O&amp;M personnel</span> and send the collected log information and files <strong id="ALM-12052__b201061554424">port_result.txt</strong> and <strong id="ALM-12052__b1210685412211">ps_result.txt</strong>. Then, delete the two residual temporary files from the environment.</span></li></ol>
<ol start="10" id="ALM-12052__ol5485290715150"><li id="ALM-12052__li57585220151427"><a name="ALM-12052__li57585220151427"></a><a name="li57585220151427"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12052__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12052__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12052__li60837487151427"><span>Select <strong id="ALM-12052__b1352831932712">OMS</strong> from the <strong id="ALM-12052__b33891259151427">Service</strong> and click <strong id="ALM-12052__b3991118545">OK</strong>.</span></li><li id="ALM-12052__li28889415151427"><span>Set <strong id="ALM-12052__b10666475151427">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12052__li1145664103113"><span>Click <span><img id="ALM-12052__image1945644173117" src="en-us_image_0000001582807913.png"></span> in the upper right corner, and set <strong id="ALM-12052__b6456941173117">Start Date</strong> and <strong id="ALM-12052__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12052__b13456164113319">Download</strong>.</span></li><li id="ALM-12052__li495644512588"><span>Contact the <span id="ALM-12052__text4614151421417">O&amp;M personnel</span> and send the collected log information and files <strong id="ALM-12052__b201061554424">port_result.txt</strong> and <strong id="ALM-12052__b1210685412211">ps_result.txt</strong>. Then, delete the two residual temporary files from the environment.</span></li></ol>
</div>
<div class="section" id="ALM-12052__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12052__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
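The usage formula in the ALM-12052 procedure, (number of used temporary ports / total number of temporary ports) x 100%, can be worked through with a short shell sketch. The function name port_usage_pct is hypothetical, and the result is rounded down because the shell only does integer arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: compute temporary port usage as a whole percentage.
port_usage_pct() {
    # $1: start port, $2: end port, $3: number of used temporary ports
    total=$(( $2 - $1 ))   # total number of temporary ports
    echo $(( $3 * 100 / total ))
}

# Example use on the alarmed host (Linux only; "used" is the count
# printed by the ss pipeline in the procedure):
#   read start end < /proc/sys/net/ipv4/ip_local_port_range
#   port_usage_pct "$start" "$end" "$used"
```

With a range of 32768 to 61000, the total is 28,232 ports, the same figure the procedure uses as the lower bound for an acceptable range. If 14,116 of those ports are in use, the usage is 50%.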


@ -66,11 +66,11 @@
<div class="section" id="ALM-12053__en-us_topic_0070543628_section373139"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12053__en-us_topic_0070543628_ul26937201"><li id="ALM-12053__li184022012102816">The application process is abnormal. For example, the opened file or socket is not closed.</li><li id="ALM-12053__en-us_topic_0070543628_li41108220">The number of file handles cannot meet the current service requirements.</li><li id="ALM-12053__en-us_topic_0070543628_li34429662">The system is abnormal.</li></ul>
</div>
<div class="section" id="ALM-12053__se041063f671f4371a7e0bb7c4da04f29"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12053__p9858548184015"><strong id="ALM-12053__b11685182963818">Check information about files opened in processes.</strong></p>
<ol id="ALM-12053__ol2107954134014"><li id="ALM-12053__li142191911124120"><span>On FusionInsight Manager, click <span><img id="ALM-12053__image1219131174117" src="en-us_image_0269383882.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li184472141416"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b294641818419">root</strong>. <span id="ALM-12053__text18701027134116"></span></span></li><li id="ALM-12053__li1762124184114"><span>Run the <strong id="ALM-12053__b06214124117">lsof -n|awk '{print $2}'|sort|uniq -c|sort -nr|more</strong> command to check the process that occupies excessive file handles.</span></li><li id="ALM-12053__li264144244316"><span>Check whether the processes in which a large number of files are opened are normal. For example, check whether there are files or sockets not closed.</span><p><ul id="ALM-12053__ul192411041445"><li id="ALM-12053__li10241144134412">If yes, go to <a href="#ALM-12053__li698311306446">5</a>.</li><li id="ALM-12053__li125435134444">If no, go to <a href="#ALM-12053__li50842733151924">7</a>.</li></ul>
<ol id="ALM-12053__ol2107954134014"><li id="ALM-12053__li142191911124120"><span>On FusionInsight Manager, click <span><img id="ALM-12053__image1219131174117" src="en-us_image_0000001532927418.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li184472141416"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b294641818419">root</strong>. <span id="ALM-12053__text18701027134116"></span></span></li><li id="ALM-12053__li1762124184114"><span>Run the <strong id="ALM-12053__b06214124117">lsof -n|awk '{print $2}'|sort|uniq -c|sort -nr|more</strong> command to check the process that occupies excessive file handles.</span></li><li id="ALM-12053__li264144244316"><span>Check whether the processes in which a large number of files are opened are normal. For example, check whether there are files or sockets not closed.</span><p><ul id="ALM-12053__ul192411041445"><li id="ALM-12053__li10241144134412">If yes, go to <a href="#ALM-12053__li698311306446">5</a>.</li><li id="ALM-12053__li125435134444">If no, go to <a href="#ALM-12053__li50842733151924">7</a>.</li></ul>
</p></li><li id="ALM-12053__li698311306446"><a name="ALM-12053__li698311306446"></a><a name="li698311306446"></a><span>Release the abnormal processes that occupy too many file handles.</span></li><li id="ALM-12053__li137485054416"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul19374750194414"><li id="ALM-12053__li33741750154420">If yes, no further action is required.</li><li id="ALM-12053__li537418505442">If no, go to <a href="#ALM-12053__li50842733151924">7</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12053__en-us_topic_0070543628_p37339219"><strong id="ALM-12053__b50291933151922">Increase the number of file handles.</strong></p>
<ol start="7" id="ALM-12053__ol66890550151936"><li id="ALM-12053__li50842733151924"><a name="ALM-12053__li50842733151924"></a><a name="li50842733151924"></a><span>On FusionInsight Manager, click <span><img id="ALM-12053__image168221113135319" src="en-us_image_0269383883.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li24620726151924"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b54931419151924">root</strong>.</span></li><li id="ALM-12053__li103121715194518"><a name="ALM-12053__li103121715194518"></a><a name="li103121715194518"></a><span>Contact the system administrator to increase the number of system file handles.</span></li><li id="ALM-12053__li37165512528"><span>Run the <strong id="ALM-12053__b1690117451482">cat /proc/sys/fs/file-nr</strong> command to view the used handles and the maximum number of file handles. The first value is the number of used handles, the third value is the maximum number. Please check whether the usage exceeds the threshold.</span><p><ul class="subitemlist" id="ALM-12053__ul198522013534"><li class="subitemlist" id="ALM-12053__li816519713539">If yes, go to <a href="#ALM-12053__li103121715194518">9</a>.</li><li id="ALM-12053__li885215017534">If no, go to <a href="#ALM-12053__li133010151924">11</a>.<pre class="screen" id="ALM-12053__screen3672717115216"># cat /proc/sys/fs/file-nr
<ol start="7" id="ALM-12053__ol66890550151936"><li id="ALM-12053__li50842733151924"><a name="ALM-12053__li50842733151924"></a><a name="li50842733151924"></a><span>On FusionInsight Manager, click <span><img id="ALM-12053__image168221113135319" src="en-us_image_0000001532607746.png"></span> in the row where the alarm is located in the real-time alarm list and obtain the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12053__li24620726151924"><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12053__b54931419151924">root</strong>.</span></li><li id="ALM-12053__li103121715194518"><a name="ALM-12053__li103121715194518"></a><a name="li103121715194518"></a><span>Contact the system administrator to increase the number of system file handles.</span></li><li id="ALM-12053__li37165512528"><span>Run the <strong id="ALM-12053__b1690117451482">cat /proc/sys/fs/file-nr</strong> command to view the number of used file handles and the maximum number of file handles. The first value is the number of used handles, and the third value is the maximum number. Check whether the usage exceeds the threshold.</span><p><ul class="subitemlist" id="ALM-12053__ul198522013534"><li class="subitemlist" id="ALM-12053__li816519713539">If yes, go to <a href="#ALM-12053__li103121715194518">9</a>.</li><li id="ALM-12053__li885215017534">If no, go to <a href="#ALM-12053__li133010151924">11</a>.<pre class="screen" id="ALM-12053__screen3672717115216"># cat /proc/sys/fs/file-nr
12704 0 640000</pre>
</li></ul>
</p></li><li id="ALM-12053__li133010151924"><a name="ALM-12053__li133010151924"></a><a name="li133010151924"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul18228740151924"><li id="ALM-12053__li5548368151924">If yes, no further action is required.</li><li id="ALM-12053__li46764658151924">If no, go to <a href="#ALM-12053__li21666806151924">12</a>.</li></ul>
@ -80,7 +80,7 @@
</p></li><li id="ALM-12053__li23370043151924"><a name="ALM-12053__li23370043151924"></a><a name="li23370043151924"></a><span>Wait for 5 minutes, and check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12053__ul19344122151924"><li id="ALM-12053__li60783531151924">If yes, no further action is required.</li><li id="ALM-12053__li24518968151924">If no, go to <a href="#ALM-12053__li58218801151924">14</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12053__p39879373151924"><strong id="ALM-12053__b60486860151959">Collect fault information.</strong></p>
<ol start="14" id="ALM-12053__ol4489551315202"><li id="ALM-12053__li58218801151924"><a name="ALM-12053__li58218801151924"></a><a name="li58218801151924"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12053__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12053__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12053__li57014808151924"><span>Select <strong id="ALM-12053__b1352831932712">OMS</strong> from the <strong id="ALM-12053__b18102480151924">Service</strong> and click <strong id="ALM-12053__b3991118545">OK</strong>.</span></li><li id="ALM-12053__li54796720151924"><span>Set <strong id="ALM-12053__b43371226151924">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12053__li1145664103113"><span>Click <span><img id="ALM-12053__image1945644173117" src="en-us_image_0269383884.png"></span> in the upper right corner, and set <strong id="ALM-12053__b6456941173117">Start Date</strong> and <strong id="ALM-12053__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12053__b13456164113319">Download</strong>.</span></li><li id="ALM-12053__li495644512588"><span>Contact the <span id="ALM-12053__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="14" id="ALM-12053__ol4489551315202"><li id="ALM-12053__li58218801151924"><a name="ALM-12053__li58218801151924"></a><a name="li58218801151924"></a><span>On the FusionInsight Manager home page of the active cluster, choose <strong id="ALM-12053__b39977366113627">O&amp;M</strong> &gt; <strong id="ALM-12053__b24251979113627">Log &gt; Download</strong>.</span></li><li id="ALM-12053__li57014808151924"><span>Select <strong id="ALM-12053__b1352831932712">OMS</strong> from the <strong id="ALM-12053__b18102480151924">Service</strong> and click <strong id="ALM-12053__b3991118545">OK</strong>.</span></li><li id="ALM-12053__li54796720151924"><span>Set <strong id="ALM-12053__b43371226151924">Host</strong> to the node for which the alarm is generated and the active OMS node.</span></li><li id="ALM-12053__li1145664103113"><span>Click <span><img id="ALM-12053__image1945644173117" src="en-us_image_0000001582927641.png"></span> in the upper right corner, and set <strong id="ALM-12053__b6456941173117">Start Date</strong> and <strong id="ALM-12053__b11456154113318">End Date</strong> for log collection to 30 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12053__b13456164113319">Download</strong>.</span></li><li id="ALM-12053__li495644512588"><span>Contact the <span id="ALM-12053__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12053__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12053__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
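<p>The usage check in the <strong>cat /proc/sys/fs/file-nr</strong> step can be sketched in Python as follows. This is an illustrative sketch only: the function names and the <strong>threshold_percent</strong> parameter are assumptions, and the actual alarm threshold is whatever is configured on FusionInsight Manager.</p>

```python
def parse_file_nr(text):
    """Parse the three fields of /proc/sys/fs/file-nr.

    Returns (used, unused, maximum) as integers. The first field is the
    number of used handles and the third is the system-wide maximum.
    """
    used, unused, maximum = (int(field) for field in text.split())
    return used, unused, maximum


def handle_usage_percent(used, maximum):
    """Return file handle usage as a percentage of the maximum."""
    return used / maximum * 100.0


def exceeds_threshold(text, threshold_percent):
    """Return True if usage is above threshold_percent (a hypothetical value)."""
    used, _unused, maximum = parse_file_nr(text)
    return handle_usage_percent(used, maximum) > threshold_percent


if __name__ == "__main__":
    # Sample output taken from the procedure above.
    sample = "12704 0 640000"
    used, _unused, maximum = parse_file_nr(sample)
    print(f"{handle_usage_percent(used, maximum):.3f}% of file handles in use")
```

<p>With the sample output shown in the procedure, usage is about 2% of the maximum, so any realistic threshold is not exceeded.</p>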


@ -2,7 +2,10 @@
<h1 class="topictitle1">ALM-12054 Invalid Certificate File</h1>
<div id="body4761603"><div class="section" id="ALM-12054__section14878122"><h4 class="sectiontitle">Description</h4><p id="ALM-12054__p50145535">The system checks whether the certificate file is invalid (has expired or is not valid yet) at 23:00 every day. This alarm is generated when the certificate file is invalid.</p>
<p id="ALM-12054__p48656638">This alarm is cleared when a valid certificate is imported.</p>
<p id="ALM-12054__p48656638">This alarm is cleared when a valid certificate is imported and the alarm detection mechanism is triggered at the next hour.</p>
<div class="note" id="ALM-12054__note1958711114167"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12054__p175881411121613">For MRS 3.2.0 or later, the certificate file is checked at the beginning of each hour.</p>
<p id="ALM-12054__p186151039192511">For versions earlier than MRS 3.2.0, the certificate file is checked at 23:00 every day.</p>
</div></div>
</div>
<div class="section" id="ALM-12054__section66794237"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12054__table48873592" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12054__row18928010"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12054__p56773879">Alarm ID</p>
@ -65,8 +68,8 @@
<div class="section" id="ALM-12054__section39072761"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12054__p51542282">No certificate (CA certificate, HA root certificate, HA user certificate, Gaussdb root certificate, or Gaussdb user certificate) is imported to the system, the certificate fails to be imported, or the certificate file is invalid.</p>
</div>
<div class="section" id="ALM-12054__section16110535"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12054__p14175317"><strong id="ALM-12054__b17561093983012">Check the alarm cause.</strong></p>
<ol id="ALM-12054__ol15202643152315"><li id="ALM-12054__li3518787615237"><span>On FusionInsight Manager, locate the target alarm in the real-time alarm list and click <span><img id="ALM-12054__image168221113135319" src="en-us_image_0263895749.png"></span>.</span><p><p class="litext" id="ALM-12054__p4827967915237">View <strong id="ALM-12054__b1735515248158">Additional Information</strong> to obtain the additional information about the alarm.</p>
<ul class="subitemlist" id="ALM-12054__ul6505776815237"><li id="ALM-12054__li3084159815237">If <strong id="ALM-12054__b10712888143012">CA Certificate</strong> is displayed in the additional alarm information, log in to the active OMS management node as user <strong id="ALM-12054__b2115831533012">omm</strong> and go to <a href="#ALM-12054__li2768003415237">2</a>.</li><li id="ALM-12054__li205560515237">If <strong id="ALM-12054__b18381976313012">HA root Certificate</strong> is displayed in the additional information, view <strong id="ALM-12054__b14553259973012">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12054__b10513863303012">omm</strong> and go to <a href="#ALM-12054__li6628516015237">3</a>.</li><li id="ALM-12054__li2214172115237">If <strong id="ALM-12054__b18169878033012">HA server Certificate</strong> is displayed in the additional information, view <strong id="ALM-12054__b6062742563012">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12054__b10652669033012">omm</strong> and go to <a href="#ALM-12054__li64457371511">4</a>.</li><li id="ALM-12054__li5926131164116">If <strong id="ALM-12054__b10765181319597">Certificate has expired</strong> is displayed in the additional information, view <strong id="ALM-12054__b1161010381398">Location</strong> to obtain the host name of the node for which the alarm is generated. Then, log in to the host as user <strong id="ALM-12054__b425162014104">omm</strong> and perform <a href="#ALM-12054__li2768003415237">2</a> to <a href="#ALM-12054__li64457371511">4</a> to check whether the certificates have expired. If these certificates have not expired, check whether other certificates have been imported. If yes, import the certificate files again.</li></ul>
<ol id="ALM-12054__ol15202643152315"><li id="ALM-12054__li3518787615237"><span>On FusionInsight Manager, locate the target alarm in the real-time alarm list and click <span><img id="ALM-12054__image168221113135319" src="en-us_image_0000001532448262.png"></span>.</span><p><p class="litext" id="ALM-12054__p4827967915237">View <strong id="ALM-12054__b1735515248158">Additional Information</strong> to obtain the additional information about the alarm.</p>
<ul class="subitemlist" id="ALM-12054__ul6505776815237"><li id="ALM-12054__li3084159815237">If <strong id="ALM-12054__b10712888143012">CA Certificate</strong> is displayed in the additional alarm information, log in to the active OMS management node as user <strong id="ALM-12054__b2115831533012">omm</strong> and go to <a href="#ALM-12054__li2768003415237">2</a>.</li><li id="ALM-12054__li205560515237">If <strong id="ALM-12054__b18381976313012">HA root Certificate</strong> is displayed in the additional information, view <strong id="ALM-12054__b14553259973012">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12054__b10513863303012">omm</strong> and go to <a href="#ALM-12054__li6628516015237">3</a>.</li><li id="ALM-12054__li2214172115237">If <strong id="ALM-12054__b18169878033012">HA server Certificate</strong> is displayed in the additional information, view <strong id="ALM-12054__b6062742563012">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12054__b10652669033012">omm</strong> and go to <a href="#ALM-12054__li64457371511">4</a>.</li><li id="ALM-12054__li5926131164116">If <strong id="ALM-12054__b2672141219591">Certificate has expired</strong> is displayed in the additional information, view <strong id="ALM-12054__b105451432204">Location</strong> to obtain the name of the host for which the alarm is generated. Then, log in to the host as user <strong id="ALM-12054__b1133110114812">omm</strong> and perform <a href="#ALM-12054__li2768003415237">2</a> to <a href="#ALM-12054__li64457371511">4</a> in sequence to check whether the certificates have expired. If these certificates have not expired, check whether other certificates have been imported. If yes, import the certificate files again.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12054__p4864900615237"><strong id="ALM-12054__b58840153152325">Check the validity period of the certificate files in the system.</strong></p>
<ol start="2" id="ALM-12054__ol32858266152358"><li id="ALM-12054__li2768003415237"><a name="ALM-12054__li2768003415237"></a><a name="li2768003415237"></a><span>Check whether the current system time is in the validity period of the CA certificate. </span><p><p class="litext" id="ALM-12054__p47089478143530">Run the <strong id="ALM-12054__b02413395020">bash ${CONTROLLER_HOME}/security/cert/conf/querycertvalidity.sh</strong> command to check the effective time and due time of the CA root certificate.</p>
@ -94,7 +97,7 @@
<ul class="subitemlist" id="ALM-12054__ul6583370515237"><li id="ALM-12054__li5632256215237">If yes, go to <a href="#ALM-12054__li993320915237">7</a>.</li><li id="ALM-12054__li3714101715237">If no, no further action is required.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12054__p5563243315237"><strong id="ALM-12054__b30164211152420">Collect the fault information.</strong></p>
<ol start="7" id="ALM-12054__ol55826366152424"><li id="ALM-12054__li993320915237"><a name="ALM-12054__li993320915237"></a><a name="li993320915237"></a><span>On FusionInsight Manager, choose <strong id="ALM-12054__b4151203863012">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12054__b12295014603012">Log</strong> &gt; <strong id="ALM-12054__b7905950833012">Download</strong>.</span></li><li id="ALM-12054__li2229001815237"><span>In the <strong id="ALM-12054__b4980650983012">Services</strong> area, select <strong id="ALM-12054__b1450275463012">Controller</strong>, <strong id="ALM-12054__b5024522853012">OmmServer</strong>, <strong id="ALM-12054__b545910514410">OmmCore</strong>, and <strong id="ALM-12054__b16912131010417">Tomcat</strong>, and click <strong id="ALM-12054__b18481545533012">OK</strong>.</span></li><li id="ALM-12054__li6639244115237"><span>Click <span><img id="ALM-12054__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12054__b9743421579">Start Date</strong> and <strong id="ALM-12054__b157431721876">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12054__b57438212715">Download</strong>.</span></li><li id="ALM-12054__li907865015237"><span>Contact <span id="ALM-12054__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="7" id="ALM-12054__ol55826366152424"><li id="ALM-12054__li993320915237"><a name="ALM-12054__li993320915237"></a><a name="li993320915237"></a><span>On FusionInsight Manager, choose <strong id="ALM-12054__b4151203863012">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12054__b12295014603012">Log</strong> &gt; <strong id="ALM-12054__b7905950833012">Download</strong>.</span></li><li id="ALM-12054__li2229001815237"><span>In the <strong id="ALM-12054__b4980650983012">Services</strong> area, select <strong id="ALM-12054__b1450275463012">Controller</strong>, <strong id="ALM-12054__b5024522853012">OmmServer</strong>, <strong id="ALM-12054__b545910514410">OmmCore</strong>, and <strong id="ALM-12054__b16912131010417">Tomcat</strong>, and click <strong id="ALM-12054__b18481545533012">OK</strong>.</span></li><li id="ALM-12054__li6639244115237"><span>Click <span><img id="ALM-12054__image104601319175315" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12054__b9743421579">Start Date</strong> and <strong id="ALM-12054__b157431721876">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12054__b57438212715">Download</strong>.</span></li><li id="ALM-12054__li907865015237"><span>Contact <span id="ALM-12054__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12054__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12054__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
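<p>The check schedule and the validity condition described for this alarm can be illustrated with the short Python sketch below. It is not the product's implementation: the function names, the version tuple format, and the returned state strings are assumptions made for illustration.</p>

```python
from datetime import datetime, timedelta

# Per the note above: hourly checks apply to MRS 3.2.0 or later.
HOURLY_CHECK_SINCE = (3, 2, 0)


def next_certificate_check(now, mrs_version):
    """Return the next scheduled certificate check strictly after `now`.

    mrs_version is a hypothetical (major, minor, patch) tuple.
    """
    if tuple(mrs_version) >= HOURLY_CHECK_SINCE:
        # Checked at the beginning of each hour.
        return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    # Earlier versions: checked at 23:00 every day.
    candidate = now.replace(hour=23, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def certificate_state(not_before, not_after, now):
    """Classify a certificate by comparing `now` with its validity period."""
    if now < not_before:
        return "not valid yet"
    if now > not_after:
        return "expired"
    return "valid"
```

<p>For example, a certificate imported at 18:25 on an MRS 3.2.0 cluster would be rechecked at 19:00, whereas on an earlier version the recheck would wait until 23:00.</p>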


@ -2,7 +2,10 @@
<h1 class="topictitle1">ALM-12055 The Certificate File Is About to Expire</h1>
<div id="body59468999"><div class="section" id="ALM-12055__section39779984"><h4 class="sectiontitle">Description</h4><p id="ALM-12055__p52259651">The system checks the certificate file at 23:00 every day. This alarm is generated if the certificate file will expire within 30 days.</p>
<p id="ALM-12055__p574811">This alarm is cleared when a certificate that is not about to expire is imported.</p>
<p id="ALM-12055__p574811">This alarm is cleared when a certificate that is not about to expire is imported and the alarm detection mechanism is triggered at the next hour.</p>
<div class="note" id="ALM-12055__note1958711114167"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12055__p175881411121613">For MRS 3.2.0 or later, the certificate file is checked at the beginning of each hour.</p>
<p id="ALM-12055__p1243264711298">For versions earlier than MRS 3.2.0, the certificate file is checked at 23:00 every day.</p>
</div></div>
</div>
<div class="section" id="ALM-12055__section22475544"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12055__table46559760" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12055__row56576642"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12055__p19305289">Alarm ID</p>
@ -65,7 +68,7 @@
<div class="section" id="ALM-12055__section10108989"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12055__p63941749">The remaining validity period of a system certificate (CA certificate, HA root certificate, HA user certificate, Gaussdb root certificate, or Gaussdb user certificate) is less than 30 days.</p>
</div>
<div class="section" id="ALM-12055__section23872039"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12055__p11899166"><strong id="ALM-12055__b1887045508312">Check the alarm cause.</strong></p>
<ol id="ALM-12055__ol43542277152959"><li id="ALM-12055__li17570723152950"><span>On FusionInsight Manager, locate the target alarm in the real-time alarm list and click <span><img id="ALM-12055__image168221113135319" src="en-us_image_0263895749.png"></span>.</span><p><p class="litext" id="ALM-12055__p14741576152950">View <strong id="ALM-12055__b2570155932812">Additional Information</strong> to obtain the additional information about the alarm.</p>
<ol id="ALM-12055__ol43542277152959"><li id="ALM-12055__li17570723152950"><span>On FusionInsight Manager, locate the target alarm in the real-time alarm list and click <span><img id="ALM-12055__image168221113135319" src="en-us_image_0000001532448262.png"></span>.</span><p><p class="litext" id="ALM-12055__p14741576152950">View <strong id="ALM-12055__b2570155932812">Additional Information</strong> to obtain the additional information about the alarm.</p>
<ul class="subitemlist" id="ALM-12055__ul7673462152950"><li id="ALM-12055__li9190972152950">If <strong id="ALM-12055__b5620522910">CA Certificate</strong> is displayed in the additional alarm information, log in to the active OMS management node as user <strong id="ALM-12055__b3677516299">omm</strong> and go to <a href="#ALM-12055__li31866665152950">2</a>.</li><li id="ALM-12055__li56441759152950">If <strong id="ALM-12055__b0156041182917">HA root Certificate</strong> is displayed in the additional information, view <strong id="ALM-12055__b121574411299">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12055__b1215714419290">omm</strong> and go to <a href="#ALM-12055__li35214520152950">3</a>.</li><li id="ALM-12055__li8309147152950">If <strong id="ALM-12055__b17901512193020">HA server Certificate</strong> is displayed in the additional information, view <strong id="ALM-12055__b1879191210300">Location</strong> to obtain the name of the host involved in this alarm. Then, log in to the host as user <strong id="ALM-12055__b879181223010">omm</strong> and go to <a href="#ALM-12055__li089064874420">4</a>.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12055__p1952302152950"><strong id="ALM-12055__b53159753412">Check the validity period of the certificate files in the system.</strong></p>
@ -94,7 +97,7 @@
<ul class="subitemlist" id="ALM-12055__ul176911540125116"><li id="ALM-12055__li669119403510">If yes, go to <a href="#ALM-12055__li48423894152950">7</a>.</li><li id="ALM-12055__li1869114010511">If no, no further action is required.</li></ul>
</p></li></ol>
<p class="tableheading" id="ALM-12055__p65221176152950"><strong id="ALM-12055__b29152840153038">Collect the fault information.</strong></p>
<ol start="7" id="ALM-12055__ol35401324153041"><li id="ALM-12055__li48423894152950"><a name="ALM-12055__li48423894152950"></a><a name="li48423894152950"></a><span>On FusionInsight Manager, choose <strong id="ALM-12055__b151432717503">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12055__b11520152710504">Log</strong> &gt; <strong id="ALM-12055__b10521172713504">Download</strong>.</span></li><li id="ALM-12055__li33161866152950"><span>In the <strong id="ALM-12055__b19231630205011">Services</strong> area, select <strong id="ALM-12055__b2923183065012">Controller</strong>, <strong id="ALM-12055__b169231304504">OmmServer</strong>, <strong id="ALM-12055__b14923430145017">OmmCore</strong>, and <strong id="ALM-12055__b179231830165014">Tomcat</strong>, and click <strong id="ALM-12055__b1923143012505">OK</strong>.</span></li><li id="ALM-12055__li30021345152950"><span>Click <span><img id="ALM-12055__image104601319175315" src="en-us_image_0263895382.png"></span> in the upper right corner, and set <strong id="ALM-12055__b16249143625013">Start Date</strong> and <strong id="ALM-12055__b824993611507">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12055__b13249123612501">Download</strong>.</span></li><li id="ALM-12055__li15809856152950"><span>Contact <span id="ALM-12055__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="7" id="ALM-12055__ol35401324153041"><li id="ALM-12055__li48423894152950"><a name="ALM-12055__li48423894152950"></a><a name="li48423894152950"></a><span>On FusionInsight Manager, choose <strong id="ALM-12055__b151432717503">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12055__b11520152710504">Log</strong> &gt; <strong id="ALM-12055__b10521172713504">Download</strong>.</span></li><li id="ALM-12055__li33161866152950"><span>In the <strong id="ALM-12055__b19231630205011">Services</strong> area, select <strong id="ALM-12055__b2923183065012">Controller</strong>, <strong id="ALM-12055__b169231304504">OmmServer</strong>, <strong id="ALM-12055__b14923430145017">OmmCore</strong>, and <strong id="ALM-12055__b179231830165014">Tomcat</strong>, and click <strong id="ALM-12055__b1923143012505">OK</strong>.</span></li><li id="ALM-12055__li30021345152950"><span>Click <span><img id="ALM-12055__image104601319175315" src="en-us_image_0000001532927350.png"></span> in the upper right corner, and set <strong id="ALM-12055__b16249143625013">Start Date</strong> and <strong id="ALM-12055__b824993611507">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12055__b13249123612501">Download</strong>.</span></li><li id="ALM-12055__li15809856152950"><span>Contact <span id="ALM-12055__text126301214142412">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12055__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12055__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
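<p>The alarm condition (remaining validity period of less than 30 days) can be expressed as a small Python check. This is a minimal sketch under stated assumptions: the function name and the <strong>window_days</strong> parameter are hypothetical, and a certificate that has already expired is treated as outside this alarm's scope because ALM-12054 covers it.</p>

```python
from datetime import datetime, timedelta


def is_about_to_expire(not_after, now, window_days=30):
    """Return True if the certificate is still valid but expires within window_days."""
    remaining = not_after - now
    return timedelta(0) <= remaining < timedelta(days=window_days)
```

<p>A certificate with 19 days of validity left would trigger the condition, while one with 49 days left would not.</p>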


@ -59,10 +59,10 @@
</div>
<div class="section" id="ALM-12057__section42966593568"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12057__p240915442254">Metadata is not configured with the task to periodically back up data to a third-party server.</p>
</div>
<div class="section" id="ALM-12057__section1525571619574"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12057__ol449617567348"><li id="ALM-12057__li1611744911013"><span>On the FusionInsight Manager portal, choose <strong id="ALM-12057__b188358153113">O&amp;M &gt; Alarm &gt; Alarms</strong>.</span></li><li id="ALM-12057__li169585911117"><span>In the alarm list, click <span><img id="ALM-12057__image168221113135319" src="en-us_image_0269383889.png"></span> in the row where the alarm is located and identify the data module from which the alarm is generated based on <strong id="ALM-12057__b4668102723111">Additional Information</strong>.</span></li><li id="ALM-12057__li11496856143419"><span>Choose <strong id="ALM-12057__b721210326">O&amp;M</strong> &gt; <strong id="ALM-12057__b1488442514323">Backup and Restoration &gt; Backup Management</strong> &gt; <strong id="ALM-12057__b55459305323">Create</strong>.</span></li><li id="ALM-12057__li144225714510"><span>Configure a backup task. The data to be backed up must be consistent with the data in Additional Information of the alarm.</span></li><li id="ALM-12057__li1133644161218"><span>After the backup task is created successfully, wait for two minutes and check whether the alarm is cleared.</span><p><ul id="ALM-12057__ul643195154411"><li id="ALM-12057__li5431451134410">If yes, no further action is required.</li><li id="ALM-12057__li1843551124416">If no, go to <a href="#ALM-12057__li1185962516113">6</a>.</li></ul>
<div class="section" id="ALM-12057__section1525571619574"><h4 class="sectiontitle">Procedure</h4><ol id="ALM-12057__ol449617567348"><li id="ALM-12057__li1611744911013"><span>On the FusionInsight Manager portal, choose <strong id="ALM-12057__b188358153113">O&amp;M &gt; Alarm &gt; Alarms</strong>.</span></li><li id="ALM-12057__li169585911117"><span>In the alarm list, click <span><img id="ALM-12057__image168221113135319" src="en-us_image_0000001532927570.png"></span> in the row where the alarm is located and identify the data module from which the alarm is generated based on <strong id="ALM-12057__b4668102723111">Additional Information</strong>.</span></li><li id="ALM-12057__li11496856143419"><span>Choose <strong id="ALM-12057__b721210326">O&amp;M</strong> &gt; <strong id="ALM-12057__b1488442514323">Backup and Restoration &gt; Backup Management</strong> &gt; <strong id="ALM-12057__b55459305323">Create</strong>.</span></li><li id="ALM-12057__li144225714510"><span>Configure a backup task. The data to be backed up must be consistent with the data in Additional Information of the alarm.</span></li><li id="ALM-12057__li1133644161218"><span>After the backup task is created successfully, wait for two minutes and check whether the alarm is cleared.</span><p><ul id="ALM-12057__ul643195154411"><li id="ALM-12057__li5431451134410">If yes, no further action is required.</li><li id="ALM-12057__li1843551124416">If no, go to <a href="#ALM-12057__li1185962516113">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12057__p1284212519115"><strong id="ALM-12057__b1432912914719">Collect fault information</strong></p>
<ol start="6" id="ALM-12057__ol8860142514111"><li id="ALM-12057__li1185962516113"><a name="ALM-12057__li1185962516113"></a><a name="li1185962516113"></a><span>On FusionInsight Manager, choose <strong id="ALM-12057__b2068611561668">O&amp;M</strong> &gt; <strong id="ALM-12057__b19686105610610">Log &gt; Download</strong>.</span></li><li id="ALM-12057__li13859112516110"><span>In the <strong id="ALM-12057__b8859172516114">Service</strong> area, select <strong id="ALM-12057__b285913251016">Controller</strong> and click <strong id="ALM-12057__b3991118545">OK</strong>.</span></li><li id="ALM-12057__li4859182515115"><span>Click <span><img id="ALM-12057__image185919251512" src="en-us_image_0269383890.png"></span> in the upper right corner, and set <strong id="ALM-12057__b198594252011">Start Date</strong> and <strong id="ALM-12057__b58593251114">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12057__b11859025919">Download</strong>.</span></li><li id="ALM-12057__li495644512588"><span>Contact the <span id="ALM-12057__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="6" id="ALM-12057__ol8860142514111"><li id="ALM-12057__li1185962516113"><a name="ALM-12057__li1185962516113"></a><a name="li1185962516113"></a><span>On FusionInsight Manager, choose <strong id="ALM-12057__b2068611561668">O&amp;M</strong> &gt; <strong id="ALM-12057__b19686105610610">Log &gt; Download</strong>.</span></li><li id="ALM-12057__li13859112516110"><span>In the <strong id="ALM-12057__b8859172516114">Service</strong> area, select <strong id="ALM-12057__b285913251016">Controller</strong> and click <strong id="ALM-12057__b3991118545">OK</strong>.</span></li><li id="ALM-12057__li4859182515115"><span>Click <span><img id="ALM-12057__image185919251512" src="en-us_image_0000001583087553.png"></span> in the upper right corner, and set <strong id="ALM-12057__b198594252011">Start Date</strong> and <strong id="ALM-12057__b58593251114">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12057__b11859025919">Download</strong>.</span></li><li id="ALM-12057__li495644512588"><span>Contact the <span id="ALM-12057__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12057__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12057__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
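<p>The <strong>Start Date</strong> and <strong>End Date</strong> values used for log collection can be derived from the alarm generation time as sketched below. The function name and the <strong>margin_minutes</strong> parameter are assumptions for illustration. This procedure uses a 10-minute margin, while some other alarm procedures use 30 minutes.</p>

```python
from datetime import datetime, timedelta


def log_collection_window(alarm_time, margin_minutes=10):
    """Return (start, end) covering margin_minutes before and after the alarm."""
    margin = timedelta(minutes=margin_minutes)
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time.
    start, end = log_collection_window(datetime(2023, 8, 13, 18, 25))
    print(start.isoformat(), end.isoformat())
```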


@ -69,7 +69,7 @@
<div class="note" id="ALM-12061__note1837419235216"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-12061__p6374102312216">The alarm is generated when the process usage exceeds the threshold for the times specified by <strong id="ALM-12061__b1237411237214">Trigger Count</strong>.</p>
</div></div>
<p id="ALM-12061__p1737417236213">Set the alarm threshold based on the actual process usage. To check the process usage, choose <strong id="ALM-12061__b4374172315215">O&amp;M</strong> &gt; <strong id="ALM-12061__b11374192352114">Alarm</strong> &gt; <strong id="ALM-12061__b183741423162110">Thresholds</strong> &gt; <em id="ALM-12061__i18450436164420">Name of the desired cluster</em> &gt; <strong id="ALM-12061__b2374102311219">Host</strong>&gt; <strong id="ALM-12061__b51371152474">Process</strong> &gt; <strong id="ALM-12061__b1693614974714">omm Process Usage</strong>, as shown in <a href="#ALM-12061__fig437414238216">Figure 1</a>.</p>
<div class="fignone" id="ALM-12061__fig437414238216"><a name="ALM-12061__fig437414238216"></a><a name="fig437414238216"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12061__image1615410501365" src="en-us_image_0000001440858217.png"></span></div>
<div class="fignone" id="ALM-12061__fig437414238216"><a name="ALM-12061__fig437414238216"></a><a name="fig437414238216"></a><span class="figcap"><b>Figure 1 </b>Setting an alarm threshold</span><br><span><img id="ALM-12061__image1615410501365" src="en-us_image_0000001583127621.png"></span></div>
</p></li><li id="ALM-12061__li33745237217"><span>2 minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12061__ul1437412317219"><li id="ALM-12061__li2374182312217">If it is, no further action is required.</li><li id="ALM-12061__li2374112315211">If it is not, go to <a href="#ALM-12061__li936717234216">3</a>.</li></ul>
</p></li></ol>
<p id="ALM-12061__p630219198214"><strong id="ALM-12061__b6695451191916">Check whether the maximum number of processes (including threads) opened by user omm is appropriate.</strong></p>
@ -80,7 +80,7 @@
<ol start="8" id="ALM-12061__ol1093673902112"><li id="ALM-12061__li293443912213"><a name="ALM-12061__li293443912213"></a><a name="li293443912213"></a><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12061__li3934143952119"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12061__b209341539202116">root</strong>.</span></li><li id="ALM-12061__li893473922118"><span>Run the <strong id="ALM-12061__b199341039112112">ps -o nlwp,pid,lwp,args -u omm|sort -n</strong> command to check the number of threads used by each process of user <strong>omm</strong>. The result is sorted by thread count in ascending order. Analyze the five processes with the highest thread counts and check whether the threads are incorrectly used. If they are, contact maintenance personnel to rectify the fault. If they are not, run the <strong id="ALM-12061__b209343391212">ulimit -u</strong> command to change the maximum number to be greater than 60000.</span></li><li id="ALM-12061__li119349396211"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12061__ul11934203918217"><li id="ALM-12061__li29341139172111">If it is, no further action is required.</li><li id="ALM-12061__li10934539102120">If it is not, go to <a href="#ALM-12061__li1668345092117">12</a>.</li></ul>
</p></li></ol>
<p id="ALM-12061__p56917471218"><strong id="ALM-12061__b1493463982113">Collect fault information.</strong></p>
<ol start="12" id="ALM-12061__ol18685115014216"><li id="ALM-12061__li1668345092117"><a name="ALM-12061__li1668345092117"></a><a name="li1668345092117"></a><span>On the FusionInsight Manager home page of the active clusters, choose <strong id="ALM-12061__b968317505217">O&amp;M </strong>&gt; <strong id="ALM-12061__b156836505210">Log</strong> &gt; <strong id="ALM-12061__b7683135018213">Download</strong>.</span></li><li id="ALM-12061__li868355022113"><span>Select <strong id="ALM-12061__b6683950172114">OmmServer</strong> and <strong id="ALM-12061__b468318504214">NodeAgent</strong> from the <strong id="ALM-12061__b33411729132615">Service</strong> and click <strong id="ALM-12061__b3991118545">OK</strong>.</span></li><li id="ALM-12061__li8685135062120"><span>Click <span><img id="ALM-12061__image12683135092120" src="en-us_image_0269383906.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12061__b136837501219">Start Date</strong> and <strong id="ALM-12061__b86832508216">End Date</strong> to 10 minutes before and after the alarm generation time respectively and click <strong id="ALM-12061__b1168545014219">OK</strong>. Then, click <strong id="ALM-12061__b13685125042113">Download</strong>.</span></li><li id="ALM-12061__li495644512588"><span>Contact the <span id="ALM-12061__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="12" id="ALM-12061__ol18685115014216"><li id="ALM-12061__li1668345092117"><a name="ALM-12061__li1668345092117"></a><a name="li1668345092117"></a><span>On the FusionInsight Manager home page of the active clusters, choose <strong id="ALM-12061__b968317505217">O&amp;M </strong>&gt; <strong id="ALM-12061__b156836505210">Log</strong> &gt; <strong id="ALM-12061__b7683135018213">Download</strong>.</span></li><li id="ALM-12061__li868355022113"><span>Select <strong id="ALM-12061__b6683950172114">OmmServer</strong> and <strong id="ALM-12061__b468318504214">NodeAgent</strong> from the <strong id="ALM-12061__b33411729132615">Service</strong> and click <strong id="ALM-12061__b3991118545">OK</strong>.</span></li><li id="ALM-12061__li8685135062120"><span>Click <span><img id="ALM-12061__image12683135092120" src="en-us_image_0000001532927646.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12061__b136837501219">Start Date</strong> and <strong id="ALM-12061__b86832508216">End Date</strong> to 10 minutes before and after the alarm generation time respectively and click <strong id="ALM-12061__b1168545014219">OK</strong>. Then, click <strong id="ALM-12061__b13685125042113">Download</strong>.</span></li><li id="ALM-12061__li495644512588"><span>Contact the <span id="ALM-12061__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12061__section10584175161919"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12061__p6698105111191">This alarm will be automatically cleared after the fault is rectified.</p>
</div>
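The thread check in this procedure comes down to ranking the processes of user omm by their thread count (the NLWP column of ps). The sketch below is one possible way to do that ranking on captured output. It assumes output in the shape produced by `ps -o nlwp,pid,args -u omm`; the function name and the sample rows are hypothetical.

```python
def top_thread_counts(ps_output, limit=5):
    """Parse `ps -o nlwp,pid,args` output and return the `limit` rows
    with the highest thread counts as (nlwp, pid, args) tuples."""
    rows = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and anything that does not start with two integers.
        if len(fields) < 2 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        args = fields[2] if len(fields) > 2 else ""
        rows.append((int(fields[0]), int(fields[1]), args))
    return sorted(rows, reverse=True)[:limit]


# Hypothetical sample output, for illustration only.
sample = """NLWP   PID COMMAND
   3  1201 /bin/bash example.sh
 812  2045 java -Dexample=controller
  45  3310 java -Dexample=nodeagent
"""
for nlwp, pid, args in top_thread_counts(sample):
    print(nlwp, pid, args)
```

Processes at the top of this list are the first candidates to examine for incorrect thread usage before the limit is raised with ulimit -u.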

@ -62,7 +62,7 @@
<ol id="ALM-12062__ol87012317557"><li id="ALM-12062__li489962395514"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12062__li152261503555"><span>Log in to the host where the alarm is generated as user <strong id="ALM-12062__b022675065516">root</strong>. <span id="ALM-12062__text985593916354"></span></span></li><li id="ALM-12062__li95861858185515"><span>Run the <strong id="ALM-12062__b19586105865511">su - omm</strong> command to switch to user <strong id="ALM-12062__b6602115865515">omm</strong>.</span></li><li id="ALM-12062__li960214583555"><span>Run the <strong id="ALM-12062__b660235865514">vi $BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log</strong> command to open the log file and search it for the following information: Current oms configurations cannot support <em id="ALM-12062__i260210581552">xx</em> nodes. In the information, <em id="ALM-12062__i1760210587558">xx</em> indicates the number of nodes in the cluster.</span></li><li id="ALM-12062__li1895714113811"><span>Optimize the current cluster configuration by following the instructions in <a href="#ALM-12062__section117861721171717">Optimizing Manager Configurations Based on the Number of Cluster Nodes</a>.</span></li><li id="ALM-12062__li199275175618"><span>One hour later, check whether the alarm is cleared.</span><p><ul id="ALM-12062__ul65231712185619"><li id="ALM-12062__li4861118105614">If it is, no further action is required.</li><li id="ALM-12062__li152720248562">If it is not, go to <a href="#ALM-12062__li8140111212587">7</a>.</li></ul>
</p></li></ol>
<p id="ALM-12062__p13421113195811"><strong id="ALM-12062__b204218131586">Collect fault information.</strong></p>
<ol start="7" id="ALM-12062__ol1514001219584"><li id="ALM-12062__li8140111212587"><a name="ALM-12062__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12062__b12140112175816">O&amp;M</strong> &gt; <strong id="ALM-12062__b114011127584">Log</strong> &gt; <strong id="ALM-12062__b141404121585">Download</strong>.</span></li><li id="ALM-12062__li9140101216585"><span>Select <strong id="ALM-12062__b15140101214581">Controller</strong> from the <strong id="ALM-12062__b214071255817">Service</strong> and click <strong id="ALM-12062__b3991118545">OK</strong>.</span></li><li id="ALM-12062__li121401712195814"><span>Click <span><img id="ALM-12062__image1914021213589" src="en-us_image_0269383907.png"></span> in the upper right corner, and set <strong id="ALM-12062__b15140101215811">Start Date</strong> and <strong id="ALM-12062__b121408123588">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12062__b1214091210583">Download</strong>.</span></li><li id="ALM-12062__li495644512588"><span>Contact the <span id="ALM-12062__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="7" id="ALM-12062__ol1514001219584"><li id="ALM-12062__li8140111212587"><a name="ALM-12062__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12062__b12140112175816">O&amp;M</strong> &gt; <strong id="ALM-12062__b114011127584">Log</strong> &gt; <strong id="ALM-12062__b141404121585">Download</strong>.</span></li><li id="ALM-12062__li9140101216585"><span>Select <strong id="ALM-12062__b15140101214581">Controller</strong> from the <strong id="ALM-12062__b214071255817">Service</strong> and click <strong id="ALM-12062__b3991118545">OK</strong>.</span></li><li id="ALM-12062__li121401712195814"><span>Click <span><img id="ALM-12062__image1914021213589" src="en-us_image_0000001532607874.png"></span> in the upper right corner, and set <strong id="ALM-12062__b15140101215811">Start Date</strong> and <strong id="ALM-12062__b121408123588">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12062__b1214091210583">Download</strong>.</span></li><li id="ALM-12062__li495644512588"><span>Contact the <span id="ALM-12062__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12062__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12062__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -71,7 +71,7 @@
</p></li><li id="ALM-12063__li4535871458"><a name="ALM-12063__li4535871458"></a><a name="li4535871458"></a><span>Contact hardware engineers to rectify the disk fault.</span></li><li id="ALM-12063__li1353518719457"><span>One hour later, check whether this alarm is cleared.</span><p><ul id="ALM-12063__ul6535167124514"><li id="ALM-12063__li05355711456">If it is, no further action is required.</li><li id="ALM-12063__li65354717453">If it is not, go to <a href="#ALM-12063__li8140111212587">8</a>.</li></ul>
</p></li></ol>
<p id="ALM-12063__p18256224611"><strong id="ALM-12063__b42515254610">Collect fault information.</strong></p>
<ol start="8" id="ALM-12063__ol1996717458377"><li id="ALM-12063__li8140111212587"><a name="ALM-12063__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12063__b12140112175816">O&amp;M</strong> &gt; <strong id="ALM-12063__b114011127584">Log</strong> &gt; <strong id="ALM-12063__b141404121585">Download</strong>.</span></li><li id="ALM-12063__li9140101216585"><span>Select <strong id="ALM-12063__b069717155404">NodeAgent</strong> from the <strong id="ALM-12063__b214071255817">Service</strong> and click <strong id="ALM-12063__b3991118545">OK</strong>.</span></li><li id="ALM-12063__li296716454377"><span>Click <span><img id="ALM-12063__image109671245153716" src="en-us_image_0269383908.png"></span> in the upper right corner, and set <strong id="ALM-12063__b99671445103719">Start Date</strong> and <strong id="ALM-12063__b3967114563711">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12063__b2967194513374">Download</strong>.</span></li><li id="ALM-12063__li495644512588"><span>Contact the <span id="ALM-12063__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="8" id="ALM-12063__ol1996717458377"><li id="ALM-12063__li8140111212587"><a name="ALM-12063__li8140111212587"></a><a name="li8140111212587"></a><span>On FusionInsight Manager, choose <strong id="ALM-12063__b12140112175816">O&amp;M</strong> &gt; <strong id="ALM-12063__b114011127584">Log</strong> &gt; <strong id="ALM-12063__b141404121585">Download</strong>.</span></li><li id="ALM-12063__li9140101216585"><span>Select <strong id="ALM-12063__b069717155404">NodeAgent</strong> from the <strong id="ALM-12063__b214071255817">Service</strong> and click <strong id="ALM-12063__b3991118545">OK</strong>.</span></li><li id="ALM-12063__li296716454377"><span>Click <span><img id="ALM-12063__image109671245153716" src="en-us_image_0000001583087405.png"></span> in the upper right corner, and set <strong id="ALM-12063__b99671445103719">Start Date</strong> and <strong id="ALM-12063__b3967114563711">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12063__b2967194513374">Download</strong>.</span></li><li id="ALM-12063__li495644512588"><span>Contact the <span id="ALM-12063__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12063__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12063__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -63,7 +63,7 @@
</p></li><li id="ALM-12064__li1796713455375"><a name="ALM-12064__li1796713455375"></a><a name="li1796713455375"></a><span>Run the <strong id="ALM-12064__b1296734510372">vim /etc/sysctl.conf</strong> command to change the value of <strong id="ALM-12064__b1296711459372">net.ipv4.ip_local_port_range</strong> to <strong id="ALM-12064__b496794523715">32768 61000</strong>. If this parameter does not exist, add the following configuration: <strong id="ALM-12064__b129678452378">net.ipv4.ip_local_port_range = 32768 61000</strong>.</span></li><li id="ALM-12064__li79678451371"><span>Run the <strong id="ALM-12064__b11967445133718">sysctl -p /etc/sysctl.conf</strong> command for the modification to take effect.</span></li><li id="ALM-12064__li496704563711"><span>One hour later, check whether the alarm is cleared.</span><p><ul id="ALM-12064__ul16967445203711"><li id="ALM-12064__li1596784553710">If it is, no further action is required.</li><li id="ALM-12064__li1796784514375">If it is not, go to <a href="#ALM-12064__li1396704514377">7</a>.</li></ul>
</p></li></ol>
<p id="ALM-12064__p23701710174214"><strong id="ALM-12064__b123701110164218">Collect fault information.</strong></p>
<ol start="7" id="ALM-12064__ol1996717458377"><li id="ALM-12064__li1396704514377"><a name="ALM-12064__li1396704514377"></a><a name="li1396704514377"></a><span>On FusionInsight Manager, choose <strong id="ALM-12064__b1996754543712">O&amp;M</strong> &gt; <strong id="ALM-12064__b20967645173714">Log</strong> &gt; <strong id="ALM-12064__b1496734511372">Download</strong>.</span></li><li id="ALM-12064__li1596764533717"><span>Select <strong id="ALM-12064__b13967174519376">NodeAgent</strong> for <strong id="ALM-12064__b196744553714">Service</strong> and click <strong id="ALM-12064__b3991118545">OK</strong>.</span></li><li id="ALM-12064__li296716454377"><span>Click <span><img id="ALM-12064__image109671245153716" src="en-us_image_0269383909.png"></span> in the upper right corner, and set <strong id="ALM-12064__b99671445103719">Start Date</strong> and <strong id="ALM-12064__b3967114563711">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12064__b2967194513374">Download</strong>.</span></li><li id="ALM-12064__li495644512588"><span>Contact the <span id="ALM-12064__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="7" id="ALM-12064__ol1996717458377"><li id="ALM-12064__li1396704514377"><a name="ALM-12064__li1396704514377"></a><a name="li1396704514377"></a><span>On FusionInsight Manager, choose <strong id="ALM-12064__b1996754543712">O&amp;M</strong> &gt; <strong id="ALM-12064__b20967645173714">Log</strong> &gt; <strong id="ALM-12064__b1496734511372">Download</strong>.</span></li><li id="ALM-12064__li1596764533717"><span>Select <strong id="ALM-12064__b13967174519376">NodeAgent</strong> for <strong id="ALM-12064__b196744553714">Service</strong> and click <strong id="ALM-12064__b3991118545">OK</strong>.</span></li><li id="ALM-12064__li296716454377"><span>Click <span><img id="ALM-12064__image109671245153716" src="en-us_image_0000001532767522.png"></span> in the upper right corner, and set <strong id="ALM-12064__b99671445103719">Start Date</strong> and <strong id="ALM-12064__b3967114563711">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12064__b2967194513374">Download</strong>.</span></li><li id="ALM-12064__li495644512588"><span>Contact the <span id="ALM-12064__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12064__section14385121020422"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12064__p2038591034212">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
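The sysctl.conf change in this procedure has two cases: replace the existing net.ipv4.ip_local_port_range entry, or add the entry if it does not exist. The sketch below shows that logic on the file content as text. The function name is hypothetical, and the code only transforms a string; it does not write /etc/sysctl.conf or run sysctl -p.

```python
KEY = "net.ipv4.ip_local_port_range"


def set_port_range(conf_text, value="32768 61000"):
    """Return conf_text with KEY set to value, appending the entry if KEY is absent."""
    new_line = f"{KEY} = {value}"
    lines = conf_text.splitlines()
    found = False
    for index, line in enumerate(lines):
        stripped = line.strip()
        # Leave comment lines untouched; only active "key = value" entries count.
        if stripped.startswith(("#", ";")):
            continue
        if stripped.split("=", 1)[0].strip() == KEY:
            lines[index] = new_line
            found = True
    if not found:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Hypothetical existing content, for illustration only.
print(set_port_range("net.ipv4.ip_local_port_range = 1024 65535\n"), end="")
```

After the real file is updated, the sysctl -p /etc/sysctl.conf step in the procedure is still required for the change to take effect.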

@ -59,7 +59,7 @@
<div class="section" id="ALM-12066__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12066__ul913288183510"><li id="ALM-12066__li713414815352">The <strong id="ALM-12066__b22461400518">/etc/ssh/sshd_config</strong> configuration file is damaged.</li><li id="ALM-12066__li131351185357">The password of user <strong id="ALM-12066__b10643161513517">omm</strong> has expired.</li></ul>
</div>
<div class="section" id="ALM-12066__section071212121445"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12066__p14212204913111"><strong id="ALM-12066__b4515327657">Check the status of the /etc/ssh/sshd_config configuration file.</strong></p>
<ol id="ALM-12066__ol363257182811"><li id="ALM-12066__li263016792816"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm and click <span><img id="ALM-12066__image1663017722814" src="en-us_image_0263895789.png"></span> to view the host list in the alarm details.</span></li><li id="ALM-12066__li17631167192814"><span>Log in to the active OMS node as user <strong id="ALM-12066__b173458362104930">omm</strong>. <span id="ALM-12066__text38540585518"></span></span></li><li id="ALM-12066__li17631374283"><span>Run the <strong id="ALM-12066__b8591193761511">ssh</strong> command, for example, <strong id="ALM-12066__b1611013111616">ssh</strong> <strong id="ALM-12066__b461113131618"><em id="ALM-12066__i8702204181616">host2</em></strong>, on each node in the alarm details to check whether the connection fails. (<em id="ALM-12066__i1032492071610"><strong id="ALM-12066__b18558144131812">host2</strong></em> is a node other than the OMS node in the alarm details.)</span><p><ul id="ALM-12066__ul1963111718289"><li id="ALM-12066__li363117710285">If yes, go to <a href="#ALM-12066__li176321676280">4</a>.</li><li id="ALM-12066__li136319782815">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
<ol id="ALM-12066__ol363257182811"><li id="ALM-12066__li263016792816"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm and click <span><img id="ALM-12066__image1663017722814" src="en-us_image_0000001532448306.png"></span> to view the host list in the alarm details.</span></li><li id="ALM-12066__li17631167192814"><span>Log in to the active OMS node as user <strong id="ALM-12066__b173458362104930">omm</strong>. <span id="ALM-12066__text38540585518"></span></span></li><li id="ALM-12066__li17631374283"><span>Run the <strong id="ALM-12066__b8591193761511">ssh</strong> command, for example, <strong id="ALM-12066__b1611013111616">ssh</strong> <strong id="ALM-12066__b461113131618"><em id="ALM-12066__i8702204181616">host2</em></strong>, on each node in the alarm details to check whether the connection fails. (<em id="ALM-12066__i1032492071610"><strong id="ALM-12066__b18558144131812">host2</strong></em> is a node other than the OMS node in the alarm details.)</span><p><ul id="ALM-12066__ul1963111718289"><li id="ALM-12066__li363117710285">If yes, go to <a href="#ALM-12066__li176321676280">4</a>.</li><li id="ALM-12066__li136319782815">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
</p></li><li id="ALM-12066__li176321676280"><a name="ALM-12066__li176321676280"></a><a name="li176321676280"></a><span>Open the <strong id="ALM-12066__b19350203172016">/etc/ssh/sshd_config</strong> configuration file on host2 and check whether <strong id="ALM-12066__b497416449207">AllowUsers</strong> or <strong id="ALM-12066__b683084712203">DenyUsers</strong> is configured for other nodes.</span><p><ul id="ALM-12066__ul263219711285"><li id="ALM-12066__li66323716289">If yes, go to <a href="#ALM-12066__li846318425575">5</a>.</li><li id="ALM-12066__li1763211732817">If no, contact OS experts.</li></ul>
</p></li><li id="ALM-12066__li846318425575"><a name="ALM-12066__li846318425575"></a><a name="li846318425575"></a><span>Modify the whitelist or blacklist to ensure that user <strong id="ALM-12066__b5862624122211">omm</strong> is in the whitelist or not in the blacklist. Check whether the alarm is cleared.</span><p><ul id="ALM-12066__ul111918318587"><li id="ALM-12066__li17191331165814">If yes, no further action is required.</li><li id="ALM-12066__li15858237195817">If no, go to <a href="#ALM-12066__li9148131091317">6</a>.</li></ul>
</p></li></ol>
@ -69,17 +69,17 @@
</p></li><li id="ALM-12066__li19341633125911"><span>Add the public key of user <strong id="ALM-12066__b0377113310287">omm</strong> of the peer host to the trust list of the local host. Run the <strong id="ALM-12066__b1737092382919">ssh</strong> command, for example, <strong id="ALM-12066__b6889113012290">ssh host2</strong>, on each node in the alarm details to check whether the connection fails. (<em id="ALM-12066__i81833373014"><strong id="ALM-12066__b0720241270">host2</strong></em> is a node other than the OMS node in the alarm details.)</span><p><ul id="ALM-12066__ul137211213508"><li id="ALM-12066__li153121714307">If yes, go to <a href="#ALM-12066__li106306742813">9</a>.</li><li id="ALM-12066__li7313414402">If no, check whether the alarm is cleared. If the alarm is cleared, no further action is required; otherwise, go to <a href="#ALM-12066__li106306742813">9</a>.</li></ul>
</p></li></ol>
<p id="ALM-12066__p124132216288"><strong id="ALM-12066__b1967293410811">Collect the fault information.</strong></p>
<ol start="9" id="ALM-12066__ol146302742816"><li class="subitemlist" id="ALM-12066__li106306742813"><a name="ALM-12066__li106306742813"></a><a name="li106306742813"></a><span>On FusionInsight Manager, choose <strong id="ALM-12066__b140942549104930">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12066__b180541324104930">Log</strong> &gt; <strong id="ALM-12066__b1225148528104930">Download</strong>.</span></li><li id="ALM-12066__li06301476283"><span>Select <strong id="ALM-12066__b192996136104930">Controller</strong> for <strong id="ALM-12066__b345013368916">Service</strong> and click <strong id="ALM-12066__b1962404791104930">OK</strong>.</span></li><li id="ALM-12066__li126301173286"><span>Click <span><img id="ALM-12066__image863057122812" src="en-us_image_0263895540.png"></span> in the upper right corner to set the log collection time range. Generally, the time range is 10 minutes before and after the alarm generation time. Click <strong id="ALM-12066__b575409479104930">Download</strong>.</span></li><li id="ALM-12066__li2630274284"><span>Contact <span id="ALM-12066__text1793615574113">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="9" id="ALM-12066__ol146302742816"><li class="subitemlist" id="ALM-12066__li106306742813"><a name="ALM-12066__li106306742813"></a><a name="li106306742813"></a><span>On FusionInsight Manager, choose <strong id="ALM-12066__b140942549104930">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12066__b180541324104930">Log</strong> &gt; <strong id="ALM-12066__b1225148528104930">Download</strong>.</span></li><li id="ALM-12066__li06301476283"><span>Select <strong id="ALM-12066__b192996136104930">Controller</strong> for <strong id="ALM-12066__b345013368916">Service</strong> and click <strong id="ALM-12066__b1962404791104930">OK</strong>.</span></li><li id="ALM-12066__li126301173286"><span>Click <span><img id="ALM-12066__image863057122812" src="en-us_image_0000001583087445.png"></span> in the upper right corner to set the log collection time range. Generally, the time range is 10 minutes before and after the alarm generation time. Click <strong id="ALM-12066__b575409479104930">Download</strong>.</span></li><li id="ALM-12066__li2630274284"><span>Contact <span id="ALM-12066__text1793615574113">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12066__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12066__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-12066__section8222143110380"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12066__p4686124105919">Perform the following steps to handle abnormal trust relationships between nodes:</p>
<div class="notice" id="ALM-12066__note64991413518"><span class="noticetitle"><img src="public_sys-resources/notice_3.0-en-us.png"> </span><div class="noticebody"><ul id="ALM-12066__ul19616958163514"><li id="ALM-12066__li1616145863512">Perform this operation as user <strong id="ALM-12066__b2165161015166">omm</strong>.</li><li id="ALM-12066__li861655833518">If the network between nodes is disconnected, rectify the network fault first. Check whether the two nodes are connected to the same security group and whether <strong id="ALM-12066__b196521759121612">hosts.deny</strong> and <strong id="ALM-12066__b1613616201712">hosts.allow</strong> are set.</li></ul>
</div></div>
<ol id="ALM-12066__ol1978732155814"><li id="ALM-12066__li597853215581">Run the <strong id="ALM-12066__b186632016173">ssh-add -l</strong> command on both nodes to check whether any identities exist.<p id="ALM-12066__p392110588248"><span><img id="ALM-12066__image8432143962413" src="en-us_image_0000001226576418.png"></span></p>
<ol id="ALM-12066__ol1978732155814"><li id="ALM-12066__li597853215581">Run the <strong id="ALM-12066__b186632016173">ssh-add -l</strong> command on both nodes to check whether any identities exist.<p id="ALM-12066__p392110588248"><span><img id="ALM-12066__image8432143962413" src="en-us_image_0000001582927685.png"></span></p>
<ul id="ALM-12066__ul122791263414"><li id="ALM-12066__li122797214348">If yes, go to <a href="#ALM-12066__li09782325586">4</a>.</li><li id="ALM-12066__li14378713415">If no, go to <a href="#ALM-12066__li16978123275815">2</a>.</li></ul>
</li><li id="ALM-12066__li16978123275815"><a name="ALM-12066__li16978123275815"></a><a name="li16978123275815"></a>If no identities are displayed, run the <strong id="ALM-12066__b6267121682419">ps -ef|grep ssh-agent</strong> command to find the <strong id="ALM-12066__b1666702220243">ssh-agent</strong> process, stop the process, and wait for the process to automatically restart.<p id="ALM-12066__p629941492510"><span><img id="ALM-12066__image138828117259" src="en-us_image_0000001227056330.png"></span></p>
</li><li id="ALM-12066__li1997863215584">Run the <strong id="ALM-12066__b18989588253">ssh-add -l</strong> command to check whether the identities have been added. If yes, manually run the <strong id="ALM-12066__b559031413264">ssh</strong> command to check whether the trust relationship is normal.<p id="ALM-12066__p492712369259"><span><img id="ALM-12066__image1579143210257" src="en-us_image_0000001271536445.png"></span></p>
</li><li id="ALM-12066__li16978123275815"><a name="ALM-12066__li16978123275815"></a><a name="li16978123275815"></a>If no identities are displayed, run the <strong id="ALM-12066__b6267121682419">ps -ef|grep ssh-agent</strong> command to find the <strong id="ALM-12066__b1666702220243">ssh-agent</strong> process, stop the process, and wait for the process to automatically restart.<p id="ALM-12066__p629941492510"><span><img id="ALM-12066__image138828117259" src="en-us_image_0000001532767530.png"></span></p>
</li><li id="ALM-12066__li1997863215584">Run the <strong id="ALM-12066__b18989588253">ssh-add -l</strong> command to check whether the identities have been added. If yes, manually run the <strong id="ALM-12066__b559031413264">ssh</strong> command to check whether the trust relationship is normal.<p id="ALM-12066__p492712369259"><span><img id="ALM-12066__image1579143210257" src="en-us_image_0000001582807737.png"></span></p>
</li><li id="ALM-12066__li09782325586"><a name="ALM-12066__li09782325586"></a><a name="li09782325586"></a>If identities exist, check whether the <span class="filepath" id="ALM-12066__filepath1443720119218"><b>/home/omm/.ssh/authorized_keys</b></span> file contains the information in the <span class="filepath" id="ALM-12066__filepath693611119214"><b>/home/omm/.ssh/id_rsa.pub</b></span> file of the peer node. If it does not, manually add the information.</li><li id="ALM-12066__li497914322582">Check whether the permissions on the files in the <strong id="ALM-12066__b152771124143011">/home/omm/.ssh</strong> directory are modified.</li><li id="ALM-12066__li8979193218587">Check the <strong id="ALM-12066__b2982446153018">/var/log/Bigdata/nodeagent/scriptlog/ssh-agent-monitor.log</strong> file.</li><li id="ALM-12066__li3979632105814">If the <strong id="ALM-12066__b09816214325">/home</strong> directory of user <strong id="ALM-12066__b1171105173213">omm</strong> is deleted, contact MRS support personnel for assistance.</li></ol>
</div>
</div>
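One of the trust relationship checks above is whether /home/omm/.ssh/authorized_keys contains the content of the peer node's /home/omm/.ssh/id_rsa.pub. A minimal sketch of that comparison follows, assuming both files have already been read into strings. The function name and the key material are placeholders, not real keys.

```python
def key_is_authorized(authorized_keys_text, public_key_text):
    """Return True if the key type and key data from public_key_text
    appear together on a line of authorized_keys_text."""
    wanted = public_key_text.split()[:2]  # [key type, base64 data]; comment ignored
    if len(wanted) < 2:
        return False
    for line in authorized_keys_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Options may precede the key type, so look for the adjacent pair anywhere.
        for index in range(len(fields) - 1):
            if fields[index:index + 2] == wanted:
                return True
    return False


# Placeholder key strings, for illustration only.
peer_pub = "ssh-rsa AAAAB3EXAMPLEKEYDATA omm@host2"
local_auth = "# trusted peers\nssh-rsa AAAAB3EXAMPLEKEYDATA omm@host2\n"
print(key_is_authorized(local_auth, peer_pub))
```

If the function returns False, the peer's public key is missing and would need to be added manually, as described in the step above.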

@ -61,10 +61,10 @@
<div class="section" id="ALM-12067__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12067__ul12589142315014"><li id="ALM-12067__li3591142315501">The Tomcat directory permission is abnormal, and the Tomcat process is abnormal.</li></ul>
</div>
<div class="section" id="ALM-12067__section071212121445"><h4 class="sectiontitle">Procedure</h4><p id="ALM-12067__p3197164020479"><strong id="ALM-12067__b64575930152820">Check whether the permission on the Tomcat directory is normal.</strong></p>
<ol id="ALM-12067__ol01141266283"><li id="ALM-12067__li111412602820"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12067__image10114162611289" src="en-us_image_0263895412.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12067__li2011432610283"><span>Log in to the alarm host as user <strong id="ALM-12067__b2011452617286">root</strong>. <span id="ALM-12067__text65184518511"></span></span></li><li id="ALM-12067__li6114182682819"><span>Run the <strong id="ALM-12067__b101141226122818">su - omm</strong> command to switch to user <strong id="ALM-12067__b1740514446548">omm</strong>.</span></li><li id="ALM-12067__li181141726192815"><span>Run the <strong id="ALM-12067__b19114226192818">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/tomcat.log</strong> command to check whether the Tomcat resource log contains keyword <strong id="ALM-12067__b61141926122811">Cannot find <em id="ALM-12067__i1163833916240">XXX</em></strong> and rectify the file permission based on the keyword.</span></li><li id="ALM-12067__li51141626202816"><span>After 5 minutes, check whether the alarm is automatically cleared. </span><p><ul class="subitemlist" id="ALM-12067__ul911415261288"><li id="ALM-12067__li911492612811">If yes, no further action is required.</li><li id="ALM-12067__li1711402612820">If no, go to <a href="#ALM-12067__li711211264288">6</a>.</li></ul>
<ol id="ALM-12067__ol01141266283"><li id="ALM-12067__li111412602820"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12067__image10114162611289" src="en-us_image_0000001583127457.png"></span> to view the IP address of the host for which the alarm is generated.</span></li><li id="ALM-12067__li2011432610283"><span>Log in to the alarm host as user <strong id="ALM-12067__b2011452617286">root</strong>. <span id="ALM-12067__text65184518511"></span></span></li><li id="ALM-12067__li6114182682819"><span>Run the <strong id="ALM-12067__b101141226122818">su - omm</strong> command to switch to user <strong id="ALM-12067__b1740514446548">omm</strong>.</span></li><li id="ALM-12067__li181141726192815"><span>Run the <strong id="ALM-12067__b19114226192818">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/tomcat.log</strong> command to check whether the Tomcat resource log contains keyword <strong id="ALM-12067__b61141926122811">Cannot find <em id="ALM-12067__i1163833916240">XXX</em></strong> and rectify the file permission based on the keyword.</span></li><li id="ALM-12067__li51141626202816"><span>After 5 minutes, check whether the alarm is automatically cleared. </span><p><ul class="subitemlist" id="ALM-12067__ul911415261288"><li id="ALM-12067__li911492612811">If yes, no further action is required.</li><li id="ALM-12067__li1711402612820">If no, go to <a href="#ALM-12067__li711211264288">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12067__p124132216288"><strong id="ALM-12067__b1967293410811">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12067__ol7112102616281"><li class="subitemlist" id="ALM-12067__li711211264288"><a name="ALM-12067__li711211264288"></a><a name="li711211264288"></a><span>On FusionInsight Manager, choose <strong id="ALM-12067__b8360182718578">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12067__b536002785718">Log</strong> &gt; <strong id="ALM-12067__b33611827205714">Download</strong>.</span></li><li id="ALM-12067__li31126266289"><span>In the <strong id="ALM-12067__b1071163118573">Services</strong> area, select <strong id="ALM-12067__b3821031155716">OmmServer</strong> and <strong id="ALM-12067__b68263135711">Tomcat</strong>, and click <strong id="ALM-12067__b682931185716">OK</strong>.</span></li><li id="ALM-12067__li2011292612815"><span>Click <span><img id="ALM-12067__image51121126122816" src="en-us_image_0263895407.png"></span> in the upper right corner, and set <strong id="ALM-12067__b55511722583">Start Date</strong> and <strong id="ALM-12067__b17552923588">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12067__b8552127586">Download</strong>.</span></li><li id="ALM-12067__li15112192672816"><span>Contact <span id="ALM-12067__text1694528635">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12067__ol7112102616281"><li class="subitemlist" id="ALM-12067__li711211264288"><a name="ALM-12067__li711211264288"></a><a name="li711211264288"></a><span>On FusionInsight Manager, choose <strong id="ALM-12067__b8360182718578">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12067__b536002785718">Log</strong> &gt; <strong id="ALM-12067__b33611827205714">Download</strong>.</span></li><li id="ALM-12067__li31126266289"><span>In the <strong id="ALM-12067__b1071163118573">Services</strong> area, select <strong id="ALM-12067__b3821031155716">OmmServer</strong> and <strong id="ALM-12067__b68263135711">Tomcat</strong>, and click <strong id="ALM-12067__b682931185716">OK</strong>.</span></li><li id="ALM-12067__li2011292612815"><span>Click <span><img id="ALM-12067__image51121126122816" src="en-us_image_0000001532767558.png"></span> in the upper right corner, and set <strong id="ALM-12067__b55511722583">Start Date</strong> and <strong id="ALM-12067__b17552923588">End Date</strong> for log collection to 10 minutes ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12067__b8552127586">Download</strong>.</span></li><li id="ALM-12067__li15112192672816"><span>Contact <span id="ALM-12067__text1694528635">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12067__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12067__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
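The ALM-12067 log check above (looking for the keyword "Cannot find XXX" in tomcat.log) can be sketched in Python as follows. This is a minimal illustration, not part of the product: the function name and the sample log lines are hypothetical, and the real layout of tomcat.log may differ.

```python
from typing import Iterable, List


def find_missing_paths(lines: Iterable[str], keyword: str = "Cannot find") -> List[str]:
    """Return the text that follows `keyword` on every matching log line."""
    hits = []
    for line in lines:
        idx = line.find(keyword)
        if idx != -1:
            # Everything after the keyword is treated as the missing item.
            hits.append(line[idx + len(keyword):].strip())
    return hits


# Hypothetical sample lines; the actual log format is an assumption.
sample = [
    "2023-08-13 18:00:01 INFO tomcat status check",
    "2023-08-13 18:00:02 ERROR Cannot find /hypothetical/path/catalina.sh",
]
print(find_missing_paths(sample))
```

Each returned entry names an item whose permission would then be checked manually as described in the procedure.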

@ -61,11 +61,11 @@
<div class="section" id="ALM-12068__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12068__p610083015544">The ACS process is abnormal.</p>
</div>
<div class="section" id="ALM-12068__section5440125035617"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12068__p8324186"><strong id="ALM-12068__b15118501163833">Check whether the ACS process is normal.</strong></p>
<ol id="ALM-12068__ol5558276163811"><li id="ALM-12068__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12068__image168221113135319" src="en-us_image_0263895733.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12068__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12068__b1241211221169">root</strong>. <span id="ALM-12068__text1942962220620"></span></span></li><li id="ALM-12068__li17626636132716"><span>Run the <strong id="ALM-12068__b8588144553112">su - omm</strong> command and then <strong id="ALM-12068__b32015537163811">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> to check whether the status of the ACS resources managed by the HA is normal. In the single-node system, the ACS resource is in the normal state. In the dual-node system, the ACS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12068__ul66289368274"><li id="ALM-12068__li1062811360271">If yes, go to <a href="#ALM-12068__li6152360163635">6</a>.</li><li id="ALM-12068__li46281436112719">If no, go to <a href="#ALM-12068__li139657016249">4</a>.</li></ul>
<ol id="ALM-12068__ol5558276163811"><li id="ALM-12068__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12068__image168221113135319" src="en-us_image_0000001582927805.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12068__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12068__b1241211221169">root</strong>. <span id="ALM-12068__text1942962220620"></span></span></li><li id="ALM-12068__li17626636132716"><span>Run the <strong id="ALM-12068__b8588144553112">su - omm</strong> command and then <strong id="ALM-12068__b32015537163811">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> to check whether the status of the ACS resources managed by the HA is normal. In the single-node system, the ACS resource is in the normal state. In the dual-node system, the ACS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12068__ul66289368274"><li id="ALM-12068__li1062811360271">If yes, go to <a href="#ALM-12068__li6152360163635">6</a>.</li><li id="ALM-12068__li46281436112719">If no, go to <a href="#ALM-12068__li139657016249">4</a>.</li></ul>
</p></li><li id="ALM-12068__li139657016249"><a name="ALM-12068__li139657016249"></a><a name="li139657016249"></a><span>Run the <strong id="ALM-12068__b20158102319162">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/acs.log</strong> command to check whether the ACS resource log of HA contains the keyword <strong id="ALM-12068__b12635154014714">ERROR</strong>. If yes, analyze the logs to locate the resource exception cause and fix the exception.</span></li><li id="ALM-12068__li14736019164314"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12068__ul473671984320"><li id="ALM-12068__li9736151912432">If yes, no further action is required.</li><li id="ALM-12068__li4736141910439">If no, go to <a href="#ALM-12068__li6152360163635">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12068__p3652216163758"><strong id="ALM-12068__b26858758163828">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12068__ol26111342163819"><li id="ALM-12068__li6152360163635"><a name="ALM-12068__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12068__b198926401682">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12068__b16892134019819">Log</strong> &gt; <strong id="ALM-12068__b1789318401185">Download</strong>.</span></li><li id="ALM-12068__li55371246163635"><span>In the <strong id="ALM-12068__b8713343188">Services</strong> area, select <strong id="ALM-12068__b1272114432815">Controller</strong> and <strong id="ALM-12068__b1872120439817">OmmServer</strong>, and click <strong id="ALM-12068__b177222431087">OK</strong>.</span></li><li id="ALM-12068__li28579174163635"><span>Click <span><img id="ALM-12068__image69691781225" src="en-us_image_0263895594.png"></span> in the upper right corner, and set <strong id="ALM-12068__b1482814481884">Start Date</strong> and <strong id="ALM-12068__b68291648584">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12068__b1382914812818">Download</strong>.</span></li><li id="ALM-12068__li33211732163635"><span>Contact <span id="ALM-12068__text21221703916">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12068__ol26111342163819"><li id="ALM-12068__li6152360163635"><a name="ALM-12068__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12068__b198926401682">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12068__b16892134019819">Log</strong> &gt; <strong id="ALM-12068__b1789318401185">Download</strong>.</span></li><li id="ALM-12068__li55371246163635"><span>In the <strong id="ALM-12068__b8713343188">Services</strong> area, select <strong id="ALM-12068__b1272114432815">Controller</strong> and <strong id="ALM-12068__b1872120439817">OmmServer</strong>, and click <strong id="ALM-12068__b177222431087">OK</strong>.</span></li><li id="ALM-12068__li28579174163635"><span>Click <span><img id="ALM-12068__image69691781225" src="en-us_image_0000001532607914.png"></span> in the upper right corner, and set <strong id="ALM-12068__b1482814481884">Start Date</strong> and <strong id="ALM-12068__b68291648584">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12068__b1382914812818">Download</strong>.</span></li><li id="ALM-12068__li33211732163635"><span>Contact <span id="ALM-12068__text21221703916">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12068__section129720811223"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12068__p19973168152211">This alarm is automatically cleared after the fault is rectified.</p>
</div>
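The log collection window used in the ALM-12068 procedure above (Start Date and End Date set to 1 hour before and after the alarm generation time) can be computed as in the following sketch. The function name and the example timestamp are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta
from typing import Tuple


def log_window(alarm_time: datetime, margin: timedelta) -> Tuple[datetime, datetime]:
    """Return (start, end) so that the window spans `margin` on each side of the alarm."""
    return alarm_time - margin, alarm_time + margin


# Hypothetical alarm generation time.
alarm = datetime(2023, 8, 13, 18, 25)
start, end = log_window(alarm, timedelta(hours=1))
print(start.isoformat(), end.isoformat())
```

For procedures that use a 10-minute margin, pass `timedelta(minutes=10)` instead.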

@ -61,11 +61,11 @@
<div class="section" id="ALM-12069__section950130153414"><h4 class="sectiontitle">Possible Causes</h4><p id="ALM-12069__p14940123162411">The AOS process is abnormal.</p>
</div>
<div class="section" id="ALM-12069__section1541443812244"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12069__p8324186"><strong id="ALM-12069__b15118501163833">Check whether the AOS process is normal.</strong></p>
<ol id="ALM-12069__ol5558276163811"><li id="ALM-12069__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12069__image168221113135319" src="en-us_image_0263895369.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12069__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12069__b96866141813">root</strong>. <span id="ALM-12069__text116882111811"></span></span></li><li id="ALM-12069__li17626636132716"><span>Run the <strong id="ALM-12069__b199545565144538">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the AOS resources managed by the HA is normal. In the single-node system, the AOS resource is in the normal state. In the dual-node system, the AOS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12069__ul66289368274"><li id="ALM-12069__li1062811360271">If yes, go to <a href="#ALM-12069__li6152360163635">6</a>.</li><li id="ALM-12069__li46281436112719">If no, go to <a href="#ALM-12069__li139657016249">4</a>.</li></ul>
<ol id="ALM-12069__ol5558276163811"><li id="ALM-12069__li34357272165726"><span>In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and click <span><img id="ALM-12069__image168221113135319" src="en-us_image_0000001532448286.png"></span> to view the name of the host for which the alarm is generated.</span></li><li id="ALM-12069__li50024484163811"><span>Log in to the alarm host as user <strong id="ALM-12069__b96866141813">root</strong>. <span id="ALM-12069__text116882111811"></span></span></li><li id="ALM-12069__li17626636132716"><span>Run the <strong id="ALM-12069__b199545565144538">sh ${BIGDATA_HOME}/om-server/OMS/workspace0/ha/module/hacom/script/status_ha.sh</strong> command to check whether the status of the AOS resources managed by the HA is normal. In the single-node system, the AOS resource is in the normal state. In the dual-node system, the AOS resource is in the normal state on the active node and in the stopped state on the standby node.</span><p><ul class="subitemlist" id="ALM-12069__ul66289368274"><li id="ALM-12069__li1062811360271">If yes, go to <a href="#ALM-12069__li6152360163635">6</a>.</li><li id="ALM-12069__li46281436112719">If no, go to <a href="#ALM-12069__li139657016249">4</a>.</li></ul>
</p></li><li id="ALM-12069__li139657016249"><a name="ALM-12069__li139657016249"></a><a name="li139657016249"></a><span>Run the <strong id="ALM-12069__b15175108193211">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/aos.log</strong> command to check whether the AOS resource log of HA contains the keyword <strong id="ALM-12069__b1918314817326">ERROR</strong>. If yes, analyze the logs to locate the resource exception cause and fix the exception.</span></li><li id="ALM-12069__li14736019164314"><span>After 5 minutes, check whether the alarm is cleared.</span><p><ul class="subitemlist" id="ALM-12069__ul473671984320"><li id="ALM-12069__li9736151912432">If yes, no further action is required.</li><li id="ALM-12069__li4736141910439">If no, go to <a href="#ALM-12069__li6152360163635">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12069__p3652216163758"><strong id="ALM-12069__b26858758163828">Collect the fault information.</strong></p>
<ol start="6" id="ALM-12069__ol26111342163819"><li id="ALM-12069__li6152360163635"><a name="ALM-12069__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12069__b4651852193219">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12069__b76526528326">Log</strong> &gt; <strong id="ALM-12069__b46521552153219">Download</strong>.</span></li><li id="ALM-12069__li55371246163635"><span>In the <strong id="ALM-12069__b118685519325">Services</strong> area, select <strong id="ALM-12069__b1586165523216">Controller</strong> and <strong id="ALM-12069__b1686155512326">OmmServer</strong>, and click <strong id="ALM-12069__b5861955163217">OK</strong>.</span></li><li id="ALM-12069__li28579174163635"><span>Click <span><img id="ALM-12069__image69691781225" src="en-us_image_0263895883.png"></span> in the upper right corner, and set <strong id="ALM-12069__b182615123314">Start Date</strong> and <strong id="ALM-12069__b102629118330">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12069__b62621211335">Download</strong>.</span></li><li id="ALM-12069__li33211732163635"><span>Contact <span id="ALM-12069__text5719151393316">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
<ol start="6" id="ALM-12069__ol26111342163819"><li id="ALM-12069__li6152360163635"><a name="ALM-12069__li6152360163635"></a><a name="li6152360163635"></a><span>On FusionInsight Manager, choose <strong id="ALM-12069__b4651852193219">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-12069__b76526528326">Log</strong> &gt; <strong id="ALM-12069__b46521552153219">Download</strong>.</span></li><li id="ALM-12069__li55371246163635"><span>In the <strong id="ALM-12069__b118685519325">Services</strong> area, select <strong id="ALM-12069__b1586165523216">Controller</strong> and <strong id="ALM-12069__b1686155512326">OmmServer</strong>, and click <strong id="ALM-12069__b5861955163217">OK</strong>.</span></li><li id="ALM-12069__li28579174163635"><span>Click <span><img id="ALM-12069__image69691781225" src="en-us_image_0000001582927665.png"></span> in the upper right corner, and set <strong id="ALM-12069__b182615123314">Start Date</strong> and <strong id="ALM-12069__b102629118330">End Date</strong> for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click <strong id="ALM-12069__b62621211335">Download</strong>.</span></li><li id="ALM-12069__li33211732163635"><span>Contact <span id="ALM-12069__text5719151393316">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12069__section129720811223"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12069__p19973168152211">This alarm is automatically cleared after the fault is rectified.</p>
</div>

@ -65,7 +65,7 @@
</p></li><li id="ALM-12070__li6903202312318"><a name="ALM-12070__li6903202312318"></a><a name="li6903202312318"></a><span>Run the <strong id="ALM-12070__b16903112313312">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/controller.log</strong> command to view the Controller resource logs, and run the <strong id="ALM-12070__b290310231836">vi $BIGDATA_LOG_HOME/controller/controller.log </strong>command to view the Controller running logs. Check whether the keyword <strong id="ALM-12070__b9187145311439">ERROR</strong> exists, and analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12070__li1590310231933"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12070__ul209032231431"><li id="ALM-12070__li199031823835">If it is, no further action is required.</li><li id="ALM-12070__li159039231338">If it is not, go to <a href="#ALM-12070__li69038231234">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12070__p13421113195811"><strong id="ALM-12070__b204218131586">Collect fault information.</strong></p>
<ol start="6" id="ALM-12070__ol39031423835"><li id="ALM-12070__li69038231234"><a name="ALM-12070__li69038231234"></a><a name="li69038231234"></a><span>On FusionInsight Manager, choose <strong id="ALM-12070__b590352315317">O&amp;M</strong> &gt; <strong id="ALM-12070__b59030233320">Log</strong> &gt; <strong id="ALM-12070__b1290362318319">Download</strong>.</span></li><li id="ALM-12070__li18903202318317"><span>Select <strong id="ALM-12070__b6883925124310">Controller</strong> and <strong id="ALM-12070__b1588372554312">OmmServer</strong> for <strong id="ALM-12070__b890312231830">Service</strong> and click <strong id="ALM-12070__b3991118545">OK</strong>.</span></li><li id="ALM-12070__li18903523531"><span>Click <span><img id="ALM-12070__image18903132310317" src="en-us_image_0269383915.png"></span> in the upper right corner, and set <strong id="ALM-12070__b129031823137">Start Date</strong> and <strong id="ALM-12070__b1990322312314">End Date</strong> for log collection to 1 hour before and after the alarm generation time, respectively. Then, click <strong id="ALM-12070__b4903132312320">Download</strong>.</span></li><li id="ALM-12070__li495644512588"><span>Contact the <span id="ALM-12070__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="6" id="ALM-12070__ol39031423835"><li id="ALM-12070__li69038231234"><a name="ALM-12070__li69038231234"></a><a name="li69038231234"></a><span>On FusionInsight Manager, choose <strong id="ALM-12070__b590352315317">O&amp;M</strong> &gt; <strong id="ALM-12070__b59030233320">Log</strong> &gt; <strong id="ALM-12070__b1290362318319">Download</strong>.</span></li><li id="ALM-12070__li18903202318317"><span>Select <strong id="ALM-12070__b6883925124310">Controller</strong> and <strong id="ALM-12070__b1588372554312">OmmServer</strong> for <strong id="ALM-12070__b890312231830">Service</strong> and click <strong id="ALM-12070__b3991118545">OK</strong>.</span></li><li id="ALM-12070__li18903523531"><span>Click <span><img id="ALM-12070__image18903132310317" src="en-us_image_0000001582927629.png"></span> in the upper right corner, and set <strong id="ALM-12070__b129031823137">Start Date</strong> and <strong id="ALM-12070__b1990322312314">End Date</strong> for log collection to 1 hour before and after the alarm generation time, respectively. Then, click <strong id="ALM-12070__b4903132312320">Download</strong>.</span></li><li id="ALM-12070__li495644512588"><span>Contact the <span id="ALM-12070__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12070__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12070__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>

@ -65,7 +65,7 @@
</p></li><li id="ALM-12071__li584395101819"><a name="ALM-12071__li584395101819"></a><a name="li584395101819"></a><span>Run the <strong id="ALM-12071__b6843951201818">vi $BIGDATA_LOG_HOME/omm/oms/ha/scriptlog/httpd.log</strong> command to view the httpd resource logs and check whether the keyword <strong id="ALM-12071__b9187145311439">ERROR</strong> exists. Analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12071__li118438511180"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12071__ul1484315115185"><li id="ALM-12071__li3843175115182">If it is, no further action is required.</li><li id="ALM-12071__li1184355116180">If it is not, go to <a href="#ALM-12071__li384145118188">7</a>.</li></ul>
</p></li></ol>
<p id="ALM-12071__p1674954751819"><strong id="ALM-12071__b149571522171815">Collect fault information.</strong></p>
<ol start="7" id="ALM-12071__ol118431551101813"><li id="ALM-12071__li384145118188"><a name="ALM-12071__li384145118188"></a><a name="li384145118188"></a><span>On FusionInsight Manager, choose <strong id="ALM-12071__b884013510187">O&amp;M</strong> &gt; <strong id="ALM-12071__b384045118183">Log</strong> &gt; <strong id="ALM-12071__b8841155115188">Download</strong>.</span></li><li id="ALM-12071__li5841351151811"><span>Select <strong id="ALM-12071__b78412516184">Controller</strong> and <strong id="ALM-12071__b2841175116185">OmmServer</strong> for <strong id="ALM-12071__b18841451201818">Service</strong> and click <strong id="ALM-12071__b3991118545">OK</strong>.</span></li><li id="ALM-12071__li1684175131820"><span>Click <span><img id="ALM-12071__image1084185120186" src="en-us_image_0269383916.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12071__b684175111183">Start Date</strong> and <strong id="ALM-12071__b14841185112187">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12071__b8841551191812">OK</strong>. Then, click <strong id="ALM-12071__b10841155112188">Download</strong>.</span></li><li id="ALM-12071__li495644512588"><span>Contact the <span id="ALM-12071__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="7" id="ALM-12071__ol118431551101813"><li id="ALM-12071__li384145118188"><a name="ALM-12071__li384145118188"></a><a name="li384145118188"></a><span>On FusionInsight Manager, choose <strong id="ALM-12071__b884013510187">O&amp;M</strong> &gt; <strong id="ALM-12071__b384045118183">Log</strong> &gt; <strong id="ALM-12071__b8841155115188">Download</strong>.</span></li><li id="ALM-12071__li5841351151811"><span>Select <strong id="ALM-12071__b78412516184">Controller</strong> and <strong id="ALM-12071__b2841175116185">OmmServer</strong> for <strong id="ALM-12071__b18841451201818">Service</strong> and click <strong id="ALM-12071__b3991118545">OK</strong>.</span></li><li id="ALM-12071__li1684175131820"><span>Click <span><img id="ALM-12071__image1084185120186" src="en-us_image_0000001582927741.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12071__b684175111183">Start Date</strong> and <strong id="ALM-12071__b14841185112187">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12071__b8841551191812">OK</strong>. Then, click <strong id="ALM-12071__b10841155112188">Download</strong>.</span></li><li id="ALM-12071__li495644512588"><span>Contact the <span id="ALM-12071__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12071__section17816122101811"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12071__p1395992212185">This alarm will be automatically cleared after the fault is rectified.</p>
</div>

@ -70,7 +70,7 @@
</p></li><li id="ALM-12072__li19269111111714"><a name="ALM-12072__li19269111111714"></a><a name="li19269111111714"></a><span>Run the <strong id="ALM-12072__b1426819113173">ifconfig</strong> <em id="ALM-12072__i192695111177">NIC name Floating IP address</em> netmask <em id="ALM-12072__i2269181181716">Subnet mask</em> command to reconfigure the NIC with the floating IP address, for example, <strong id="ALM-12072__b2026991161713">ifconfig eth0 10.10.10.102 netmask 255.255.255.0</strong>.</span></li><li id="ALM-12072__li1426917141719"><span>Five minutes later, check whether the alarm is cleared.</span><p><ul id="ALM-12072__ul3269113174"><li id="ALM-12072__li5269101141717">If it is, no further action is required.</li><li id="ALM-12072__li152691214173">If it is not, go to <a href="#ALM-12072__li726861151715">8</a>.</li></ul>
</p></li></ol>
<p id="ALM-12072__p194344582164"><strong id="ALM-12072__b11436748171614">Collect fault information.</strong></p>
<ol start="8" id="ALM-12072__ol1326817181717"><li id="ALM-12072__li726861151715"><a name="ALM-12072__li726861151715"></a><a name="li726861151715"></a><span>On FusionInsight Manager, choose <strong id="ALM-12072__b026812121711">O&amp;M</strong> &gt; <strong id="ALM-12072__b726811111719">Log</strong> &gt; <strong id="ALM-12072__b926841131719">Download</strong>.</span></li><li id="ALM-12072__li162681171713"><span>Select <strong id="ALM-12072__b17268191151713">Controller</strong> and <strong id="ALM-12072__b42681516170">OmmServer</strong> for <strong id="ALM-12072__b112681114179">Service</strong> and click <strong id="ALM-12072__b3991118545">OK</strong>.</span></li><li id="ALM-12072__li1326812151712"><span>Click <span><img id="ALM-12072__image1626812113177" src="en-us_image_0269383917.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12072__b1726819191714">Start Date</strong> and <strong id="ALM-12072__b182681113175">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12072__b2268161191713">OK</strong>. Then, click <strong id="ALM-12072__b1326891101719">Download</strong>.</span></li><li id="ALM-12072__li495644512588"><span>Contact the <span id="ALM-12072__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
<ol start="8" id="ALM-12072__ol1326817181717"><li id="ALM-12072__li726861151715"><a name="ALM-12072__li726861151715"></a><a name="li726861151715"></a><span>On FusionInsight Manager, choose <strong id="ALM-12072__b026812121711">O&amp;M</strong> &gt; <strong id="ALM-12072__b726811111719">Log</strong> &gt; <strong id="ALM-12072__b926841131719">Download</strong>.</span></li><li id="ALM-12072__li162681171713"><span>Select <strong id="ALM-12072__b17268191151713">Controller</strong> and <strong id="ALM-12072__b42681516170">OmmServer</strong> for <strong id="ALM-12072__b112681114179">Service</strong> and click <strong id="ALM-12072__b3991118545">OK</strong>.</span></li><li id="ALM-12072__li1326812151712"><span>Click <span><img id="ALM-12072__image1626812113177" src="en-us_image_0000001582807857.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12072__b1726819191714">Start Date</strong> and <strong id="ALM-12072__b182681113175">End Date</strong> to 1 hour before and after the alarm generation time respectively and click <strong id="ALM-12072__b2268161191713">OK</strong>. Then, click <strong id="ALM-12072__b1326891101719">Download</strong>.</span></li><li id="ALM-12072__li495644512588"><span>Contact the <span id="ALM-12072__text4614151421417">O&amp;M personnel</span> and send the collected log information.</span></li></ol>
</div>
<div class="section" id="ALM-12072__section1132214841620"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12072__p134361483167">This alarm will be automatically cleared after the fault is rectified.</p>
</div>
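Before running the ifconfig command from the ALM-12072 procedure above, the floating IP address and subnet mask can be sanity-checked. The following sketch uses Python's standard `ipaddress` module; the helper name is hypothetical, and the values are the example ones from the procedure, not values from a real cluster.

```python
import ipaddress


def build_ifconfig_command(nic: str, floating_ip: str, netmask: str) -> str:
    """Validate the address and mask, then return the ifconfig command string.

    Raises ValueError if the IP address or the subnet mask is malformed.
    """
    iface = ipaddress.IPv4Interface(f"{floating_ip}/{netmask}")
    return f"ifconfig {nic} {iface.ip} netmask {iface.netmask}"


command = build_ifconfig_command("eth0", "10.10.10.102", "255.255.255.0")
print(command)
```

The sketch only builds and prints the command string; it does not change any NIC configuration.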

@ -65,7 +65,7 @@
</p></li><li id="ALM-12073__li8262123151618"><a name="ALM-12073__li8262123151618"></a><a name="li8262123151618"></a><span>Run the <strong id="ALM-12073__b1026193171612">vi $BIGDATA_LOG_HOME/omm/oms/cep/cep.log </strong>and <strong id="ALM-12073__b1226213316168">vi $BIGDATA_LOG_HOME/omm/oms/cep/scriptlog/cep_ha.log </strong>commands to view the CEP resource logs and check whether the keyword <strong id="ALM-12073__b9187145311439">ERROR</strong> exists. Analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12073__li132629311160"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12073__ul6262831171619"><li id="ALM-12073__li16262153141620">If it is, no further action is required.</li><li id="ALM-12073__li826216312163">If it is not, go to <a href="#ALM-12073__li9258163110165">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12073__p10254192814164"><strong id="ALM-12073__b3909105810152">Collect fault information.</strong></p>
<ol start="6" id="ALM-12073__ol526063113163"><li id="ALM-12073__li9258163110165"><a name="ALM-12073__li9258163110165"></a><a name="li9258163110165"></a><span>On FusionInsight Manager, choose <strong id="ALM-12073__b1125815315166">O&amp;M</strong> &gt; <strong id="ALM-12073__b1525823113164">Log</strong> &gt; <strong id="ALM-12073__b625823114166">Download</strong>.</span></li><li id="ALM-12073__li18258163151613"><span>Select <strong id="ALM-12073__b9258163115162">Controller</strong> and <strong id="ALM-12073__b22584311164">OmmServer</strong> for <strong id="ALM-12073__b2258831111615">Service</strong> and click <strong id="ALM-12073__b3991118545">OK</strong>.</span></li><li id="ALM-12073__li12260531161614"><span>Click <span><img id="ALM-12073__image126014312167" src="en-us_image_0269383918.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12073__b19260123119168">Start Date</strong> and <strong id="ALM-12073__b32609319168">End Date</strong> to 1 hour before and 1 hour after the alarm generation time, respectively, and click <strong id="ALM-12073__b626015316169">OK</strong>. Then, click <strong id="ALM-12073__b1426043161612">Download</strong>.</span></li><li id="ALM-12073__li495644512588"><span>Contact <span id="ALM-12073__text4614151421417">O&amp;M personnel</span> and send them the collected logs.</span></li></ol>
<ol start="6" id="ALM-12073__ol526063113163"><li id="ALM-12073__li9258163110165"><a name="ALM-12073__li9258163110165"></a><a name="li9258163110165"></a><span>On FusionInsight Manager, choose <strong id="ALM-12073__b1125815315166">O&amp;M</strong> &gt; <strong id="ALM-12073__b1525823113164">Log</strong> &gt; <strong id="ALM-12073__b625823114166">Download</strong>.</span></li><li id="ALM-12073__li18258163151613"><span>Select <strong id="ALM-12073__b9258163115162">Controller</strong> and <strong id="ALM-12073__b22584311164">OmmServer</strong> for <strong id="ALM-12073__b2258831111615">Service</strong> and click <strong id="ALM-12073__b3991118545">OK</strong>.</span></li><li id="ALM-12073__li12260531161614"><span>Click <span><img id="ALM-12073__image126014312167" src="en-us_image_0000001532927546.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12073__b19260123119168">Start Date</strong> and <strong id="ALM-12073__b32609319168">End Date</strong> to 1 hour before and 1 hour after the alarm generation time, respectively, and click <strong id="ALM-12073__b626015316169">OK</strong>. Then, click <strong id="ALM-12073__b1426043161612">Download</strong>.</span></li><li id="ALM-12073__li495644512588"><span>Contact <span id="ALM-12073__text4614151421417">O&amp;M personnel</span> and send them the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12073__section9650125851520"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12073__p1909195801515">This alarm will be automatically cleared after the fault is rectified.</p>
</div>


@ -65,7 +65,7 @@
</p></li><li id="ALM-12074__li1183383931416"><a name="ALM-12074__li1183383931416"></a><a name="li1183383931416"></a><span>Run the <strong id="ALM-12074__b783323918148">vi $BIGDATA_LOG_HOME/omm/oms/fms/fms.log</strong> and <strong id="ALM-12074__b108331539101416">vi $BIGDATA_LOG_HOME/omm/oms/fms/scriptlog/fms_ha.log</strong> commands to view the FMS resource logs and check whether they contain the keyword <strong id="ALM-12074__b9187145311439">ERROR</strong>. Analyze the logs to locate and rectify the fault.</span></li><li id="ALM-12074__li4833133971410"><span>Five minutes later, check whether this alarm is cleared.</span><p><ul id="ALM-12074__ul1983383991412"><li id="ALM-12074__li983311395149">If it is, no further action is required.</li><li id="ALM-12074__li0833103914141">If it is not, go to <a href="#ALM-12074__li5828173931412">6</a>.</li></ul>
</p></li></ol>
<p id="ALM-12074__p590913362141"><strong id="ALM-12074__b2474172420144">Collect fault information.</strong></p>
<ol start="6" id="ALM-12074__ol1683343918146"><li id="ALM-12074__li5828173931412"><a name="ALM-12074__li5828173931412"></a><a name="li5828173931412"></a><span>On FusionInsight Manager, choose <strong id="ALM-12074__b18828113913148">O&amp;M</strong> &gt; <strong id="ALM-12074__b98286392144">Log</strong> &gt; <strong id="ALM-12074__b13828203912147">Download</strong>.</span></li><li id="ALM-12074__li383393912140"><span>Select <strong id="ALM-12074__b3828039191417">Controller</strong> and <strong id="ALM-12074__b1583363981411">OmmServer</strong> for <strong id="ALM-12074__b17833639111417">Service</strong> and click <strong id="ALM-12074__b3991118545">OK</strong>.</span></li><li id="ALM-12074__li18833339101411"><span>Click <span><img id="ALM-12074__image1383383917144" src="en-us_image_0269383919.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12074__b783323981410">Start Date</strong> and <strong id="ALM-12074__b1683314393142">End Date</strong> to 1 hour before and 1 hour after the alarm generation time, respectively, and click <strong id="ALM-12074__b08331339131417">OK</strong>. Then, click <strong id="ALM-12074__b11833103913145">Download</strong>.</span></li><li id="ALM-12074__li495644512588"><span>Contact <span id="ALM-12074__text4614151421417">O&amp;M personnel</span> and send them the collected logs.</span></li></ol>
<ol start="6" id="ALM-12074__ol1683343918146"><li id="ALM-12074__li5828173931412"><a name="ALM-12074__li5828173931412"></a><a name="li5828173931412"></a><span>On FusionInsight Manager, choose <strong id="ALM-12074__b18828113913148">O&amp;M</strong> &gt; <strong id="ALM-12074__b98286392144">Log</strong> &gt; <strong id="ALM-12074__b13828203912147">Download</strong>.</span></li><li id="ALM-12074__li383393912140"><span>Select <strong id="ALM-12074__b3828039191417">Controller</strong> and <strong id="ALM-12074__b1583363981411">OmmServer</strong> for <strong id="ALM-12074__b17833639111417">Service</strong> and click <strong id="ALM-12074__b3991118545">OK</strong>.</span></li><li id="ALM-12074__li18833339101411"><span>Click <span><img id="ALM-12074__image1383383917144" src="en-us_image_0000001532767442.png"></span> in the upper right corner. In the displayed dialog box, set <strong id="ALM-12074__b783323981410">Start Date</strong> and <strong id="ALM-12074__b1683314393142">End Date</strong> to 1 hour before and 1 hour after the alarm generation time, respectively, and click <strong id="ALM-12074__b08331339131417">OK</strong>. Then, click <strong id="ALM-12074__b11833103913145">Download</strong>.</span></li><li id="ALM-12074__li495644512588"><span>Contact <span id="ALM-12074__text4614151421417">O&amp;M personnel</span> and send them the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-12074__section13393241148"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12074__p34742024121418">This alarm will be automatically cleared after the fault is rectified.</p>
</div>

Some files were not shown because too many files have changed in this diff.