doc-exports/docs/mrs/umn/ALM-12007.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-12007"></a><a name="ALM-12007"></a>
<h1 class="topictitle1">ALM-12007 Process Fault</h1>
<div id="body62162350"><div class="section" id="ALM-12007__s3c7152bd1bd648aea0a18beede86237d"><h4 class="sectiontitle">Description</h4><p id="ALM-12007__en-us_topic_0070543667_p46722268">This alarm is generated when the process health check module detects that the process connection status is <strong id="ALM-12007__en-us_topic_0070543667_b17847232">Bad</strong> three consecutive times. The process health check module checks the process status every 5 seconds.</p>
<p id="ALM-12007__en-us_topic_0070543667_p26407365">This alarm is cleared when a connection to the process can be established again.</p>
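<p>The trigger rule described above (three consecutive Bad results) can be sketched as follows. This is an illustration only, not the FusionInsight implementation: the function name <strong>evaluate_checks</strong> and the Good status string are assumptions made for the example, and a real checker would run one check every 5 seconds instead of reading a prepared list.</p>

```shell
#!/usr/bin/env bash
# Hypothetical sketch: report a fault after three consecutive "Bad" results.
# THRESHOLD is taken from the description; the function name and the
# "Good" status string are illustrative assumptions.
THRESHOLD=3

# Prints "alarm" if the status sequence passed as arguments contains
# THRESHOLD consecutive "Bad" entries, otherwise prints "ok".
evaluate_checks() {
    local consecutive=0 status
    for status in "$@"; do
        if [ "$status" = "Bad" ]; then
            consecutive=$((consecutive + 1))
        else
            consecutive=0   # any non-Bad result resets the counter
        fi
        if [ "$consecutive" -ge "$THRESHOLD" ]; then
            echo "alarm"
            return 0
        fi
    done
    echo "ok"
}

evaluate_checks Bad Bad Good Bad    # prints "ok"
evaluate_checks Good Bad Bad Bad    # prints "alarm"
```

<p>Resetting the counter on every non-Bad result is what makes the rule require consecutive failures, so an isolated failed check does not raise the alarm.</p>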
</div>
<div class="section" id="ALM-12007__sb0d2e518431d4334b799e4fe2360d334"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12007__en-us_topic_0070543667_table58621855" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12007__en-us_topic_0070543667_row42640608"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-12007__en-us_topic_0070543667_p31337228">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-12007__en-us_topic_0070543667_p55287536">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-12007__en-us_topic_0070543667_p49105461">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12007__en-us_topic_0070543667_row18119427"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p58387457">12007</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p31763543">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-12007__en-us_topic_0070543667_p22710190">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12007__s83d0197c6d984834b79b3a1f2a44d5e4"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-12007__en-us_topic_0070543667_table27586093" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-12007__en-us_topic_0070543667_row64905719"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-12007__en-us_topic_0070543667_p22871847">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-12007__en-us_topic_0070543667_p40680291">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-12007__row165443324551"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__p192431315431">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__p692551319435">Specifies the cluster or system for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row6769300"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p11442466">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p54424564">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row20059035"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p14169102">RoleName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p6846651">Specifies the role for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-12007__en-us_topic_0070543667_row61619862"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-12007__en-us_topic_0070543667_p25152906">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-12007__en-us_topic_0070543667_p24119499">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-12007__sdd4b61f1ce0c4c3382bbfb0b51833241"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-12007__en-us_topic_0070543667_p7522368">The service provided by the process is unavailable.</p>
</div>
<div class="section" id="ALM-12007__secbf87c6acc5443cb118200b72612df2"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-12007__en-us_topic_0070543667_ul5332049"><li id="ALM-12007__en-us_topic_0070543667_li47988448">The instance process is abnormal.</li><li id="ALM-12007__en-us_topic_0070543667_li29242856">The disk space is insufficient.</li></ul>
<div class="note" id="ALM-12007__en-us_topic_0070543667_note61859112"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="ALM-12007__en-us_topic_0070543667_p19861098">If a large number of process fault alarms are generated within the same time period, files in the installation directory may have been deleted by mistake or the permissions on the directory may have been modified.</p>
</div></div>
</div>
<div class="section" id="ALM-12007__sad734a42f8ef40529fb21b797d8b41e9"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-12007__en-us_topic_0070543667_p65245121"><strong id="ALM-12007__b73856891719">Check whether the instance process is abnormal.</strong></p>
<ol id="ALM-12007__ol5390063317638"><li id="ALM-12007__li42005517036"><a name="ALM-12007__li42005517036"></a><a name="li42005517036"></a><span>In the FusionInsight Manager portal, choose <strong id="ALM-12007__b3064793094522">O&amp;M &gt; Alarm<strong id="ALM-12007__b27872374104950"> &gt; Alarms</strong></strong>, click <span><img id="ALM-12007__image14626452517" src="en-us_image_0000001080201158.png"></span> in the row where the alarm is located, and click the host name to view the address of the host for which the alarm is generated.</span></li><li id="ALM-12007__li911601917036"><span>On the <strong id="ALM-12007__b378050117036">Alarms</strong> page, check whether <a href="ALM-12006.html">ALM-12006 Node Fault</a> is generated.</span><p><ul class="subitemlist" id="ALM-12007__ul846943117036"><li id="ALM-12007__li452236417036">If yes, go to <a href="#ALM-12007__li20006517036">3</a>.</li><li id="ALM-12007__li3076720917036">If no, go to <a href="#ALM-12007__li195150317036">4</a>.</li></ul>
</p></li><li id="ALM-12007__li20006517036"><a name="ALM-12007__li20006517036"></a><a name="li20006517036"></a><span>Handle the alarm according to <a href="ALM-12006.html">ALM-12006 Node Fault</a>.</span></li><li id="ALM-12007__li195150317036"><a name="ALM-12007__li195150317036"></a><a name="li195150317036"></a><span>Log in to the host for which the alarm is generated as user <strong id="ALM-12007__b8307212154711">root</strong>. <span id="ALM-12007__text43649449460"></span>Check whether the user, user group, and permission of the installation directory of the role for which the alarm is generated are correct. They must be <strong id="ALM-12007__b180058917036">omm:ficommon 750</strong>.</span><p><p class="subitemlist" id="ALM-12007__p7190141912118">For example, the NameNode installation directory is<strong id="ALM-12007__b16534123110112"> </strong><em id="ALM-12007__i677216419119">${BIGDATA_HOME}</em><strong id="ALM-12007__b177174617112">/FusionInsight_Current/</strong><em id="ALM-12007__i137264460113">1_8_NameNode</em><strong id="ALM-12007__b13731846191113">/etc</strong>.</p>
<ul class="subitemlist" id="ALM-12007__ul2258645517036"><li id="ALM-12007__li1163004517036">If yes, go to <a href="#ALM-12007__li3396349817036">6</a>.</li><li id="ALM-12007__li250960617036">If no, go to <a href="#ALM-12007__li3247692317036">5</a>.</li></ul>
</p></li><li id="ALM-12007__li3247692317036"><a name="ALM-12007__li3247692317036"></a><a name="li3247692317036"></a><span>Run the following commands to set the permission to <strong id="ALM-12007__b1756352717036">750</strong> and <strong id="ALM-12007__b2385401617036">User:Group</strong> to <strong id="ALM-12007__b1335955517036">omm:ficommon</strong>:</span><p><p class="litext" id="ALM-12007__p833090817036"><strong id="ALM-12007__b5312713817036">chmod 750 </strong><em id="ALM-12007__i838219617036">&lt;folder_name&gt;</em></p>
<p class="litext" id="ALM-12007__p3343470817036"><strong id="ALM-12007__b786931417036">chown omm:ficommon </strong><em id="ALM-12007__i371496717036">&lt;folder_name&gt;</em></p>
</p></li><li id="ALM-12007__li3396349817036"><a name="ALM-12007__li3396349817036"></a><a name="li3396349817036"></a><span>Wait for 5 minutes. In the alarm list, check whether <strong id="ALM-12007__b2385685117036">ALM-12007 Process Fault</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul2693144617036"><li id="ALM-12007__li1338507017036">If yes, no further action is required.</li><li id="ALM-12007__li1044892317036">If no, go to <a href="#ALM-12007__li2657388817036">7</a>.</li></ul>
</p></li></ol>
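<p>The ownership and permission check in this part of the procedure can be sketched with a small helper, assuming a Linux host with GNU coreutils (<strong>stat -c</strong>). The function name <strong>check_install_dir</strong> is hypothetical and only illustrates the comparison. It does not change anything on disk.</p>

```shell
#!/usr/bin/env bash
# Sketch: compare a directory's owner, group, and mode with an expected value.
# check_install_dir is a hypothetical helper; stat -c is the GNU coreutils form.
# Returns 0 on match, 1 on mismatch, 2 if the directory cannot be read.
check_install_dir() {
    local dir="$1" expected="$2"
    local actual
    # %U = owner name, %G = group name, %a = permission bits in octal
    actual="$(stat -c '%U:%G %a' "$dir")" || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK: $dir is $actual"
    else
        echo "MISMATCH: $dir is $actual, expected $expected"
        return 1
    fi
}

# Example call (commented out). The path is a placeholder for the
# installation directory of the role for which the alarm is generated.
# check_install_dir "/path/to/role/install/dir" "omm:ficommon 750"
```

<p>If the helper reports a mismatch, the chmod and chown commands from the procedure restore the expected values.</p>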
<p class="tableheading" id="ALM-12007__p5673574217645"><strong id="ALM-12007__b1353742417650">Check whether disk space is sufficient.</strong></p>
<ol start="7" id="ALM-12007__ol2289926317658"><li id="ALM-12007__li2657388817036"><a name="ALM-12007__li2657388817036"></a><a name="li2657388817036"></a><span>On FusionInsight Manager, check whether the alarm list contains <strong id="ALM-12007__b3723602917036">ALM-12017 Insufficient Disk Capacity</strong>.</span><p><ul class="subitemlist" id="ALM-12007__ul6260497717036"><li id="ALM-12007__li6332838717036">If yes, go to <a href="#ALM-12007__li500135217036">8</a>.</li><li id="ALM-12007__li2932572917036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li><li id="ALM-12007__li500135217036"><a name="ALM-12007__li500135217036"></a><a name="li500135217036"></a><span>Rectify the fault by following the steps provided in <a href="ALM-12017.html">ALM-12017 Insufficient Disk Capacity</a>.</span></li><li id="ALM-12007__li2288625317036"><span>Wait for 5 minutes. In the alarm list, check whether <strong id="ALM-12007__b4501217017036">ALM-12017 Insufficient Disk Capacity</strong> is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul999945717036"><li id="ALM-12007__li2210716917036">If yes, go to <a href="#ALM-12007__li1723673717036">10</a>.</li><li id="ALM-12007__li4585029317036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li><li id="ALM-12007__li1723673717036"><a name="ALM-12007__li1723673717036"></a><a name="li1723673717036"></a><span>Wait for 5 minutes. In the alarm list, check whether this alarm (ALM-12007 Process Fault) is cleared.</span><p><ul class="subitemlist" id="ALM-12007__ul3418148317036"><li id="ALM-12007__li464969017036">If yes, no further action is required.</li><li id="ALM-12007__li4108064417036">If no, go to <a href="#ALM-12007__li1622379717036">11</a>.</li></ul>
</p></li></ol>
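<p>As a supplement to checking for ALM-12017 on FusionInsight Manager, disk usage can also be inspected directly on the host. The sketch below filters <strong>df -P</strong> output for file systems at or above a usage limit. The function name <strong>filter_full_filesystems</strong> and the limit of 90 are assumptions for illustration and are not the threshold used by ALM-12017.</p>

```shell
#!/usr/bin/env bash
# Sketch: list mount points whose usage is at or above a given percentage.
# Reads `df -P` formatted text on stdin, so it can be tried on sample data.
# filter_full_filesystems is a hypothetical helper. Mount points that
# contain spaces are not handled by this simple field split.
filter_full_filesystems() {
    local limit="$1"
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# Example call (commented out); 90 is an illustrative limit.
# df -P | filter_full_filesystems 90
```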
<p id="ALM-12007__p3392472417052"><strong id="ALM-12007__b2313861417057">Collect fault information.</strong></p>
<ol start="11" id="ALM-12007__ol481086251710"><li id="ALM-12007__li1622379717036"><a name="ALM-12007__li1622379717036"></a><a name="li1622379717036"></a><span>On FusionInsight Manager, choose <strong id="ALM-12007__b2091290617036">O&amp;M</strong> &gt; <strong id="ALM-12007__b5399842717036">Log &gt; Download</strong>.</span></li><li id="ALM-12007__li1598834917036"><span>Based on the service name obtained in <a href="#ALM-12007__li42005517036">1</a>, select the component and <strong id="ALM-12007__b68821814172417">NodeAgent</strong> from <strong id="ALM-12007__b15959191911544">Service</strong>, and click <strong id="ALM-12007__b3991118545">OK</strong>.</span></li><li id="ALM-12007__li1145664103113"><span>Click <span><img id="ALM-12007__image1945644173117" src="en-us_image_0269383814.png"></span> in the upper right corner, and set <strong id="ALM-12007__b6456941173117">Start Date</strong> and <strong id="ALM-12007__b11456154113318">End Date</strong> for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click <strong id="ALM-12007__b13456164113319">Download</strong>.</span></li><li id="ALM-12007__li495644512588"><span>Contact <span id="ALM-12007__text4614151421417">O&amp;M personnel</span> and send them the collected log information.</span></li></ol>
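<p>The log collection window used in the steps above (10 minutes on either side of the alarm generation time) can be computed as in the following sketch. It assumes GNU <strong>date</strong> and an alarm time written as YYYY-MM-DD HH:MM:SS. The function name <strong>log_window</strong> is hypothetical, and the time is treated as a plain wall-clock value without time zone conversion.</p>

```shell
#!/usr/bin/env bash
# Sketch: print the start and end of the log collection window,
# 10 minutes (600 seconds) before and after the alarm generation time.
# Requires GNU date (-d). log_window is a hypothetical helper, and the
# "YYYY-MM-DD HH:MM:SS" input format is an assumption.
log_window() {
    local alarm_time="$1"
    local epoch
    # Interpret the value as UTC so that the arithmetic is free of DST effects.
    epoch="$(date -u -d "$alarm_time UTC" +%s)" || return 1
    date -u -d "@$((epoch - 600))" '+%Y-%m-%d %H:%M:%S'   # Start Date
    date -u -d "@$((epoch + 600))" '+%Y-%m-%d %H:%M:%S'   # End Date
}

log_window "2024-01-01 00:05:00"
# prints:
# 2023-12-31 23:55:00
# 2024-01-01 00:15:00
```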
</div>
<div class="section" id="ALM-12007__section1529716184534"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-12007__p4677152685316">After the fault is rectified, the system automatically clears this alarm.</p>
</div>
<div class="section" id="ALM-12007__sb81c90a530914c14b08552a98ff5c8d0"><h4 class="sectiontitle">Related Information</h4><p id="ALM-12007__en-us_topic_0070543667_p23735849">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>