doc-exports/docs/mrs/umn/ALM-24004.html
Yang, Tong 3b1f73dece MRS UMN 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-13 12:03:34 +00:00


<a name="ALM-24004"></a>
<h1 class="topictitle1">ALM-24004 Exception Occurs When Flume Reads Data</h1>
<div id="body7813744"><div class="section" id="ALM-24004__section64226993"><h4 class="sectiontitle">Description</h4><p id="ALM-24004__p28933522">The alarm module monitors the status of Flume Source. This alarm is generated as soon as the period during which Source fails to read data exceeds the threshold.</p>
<p id="ALM-24004__p59075108">The default threshold is <strong id="ALM-24004__b04002030173211">0</strong>, indicating that the threshold is disabled. You can change the threshold by modifying the <strong id="ALM-24004__b146476571732">properties.properties</strong> file in the <strong id="ALM-24004__b12444192512416">conf</strong> directory. Specifically, modify the <strong id="ALM-24004__b364714577315">NoDatatime</strong> parameter of the required source.</p>
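<p>The threshold setting described above might look like the following fragment. This is only a sketch that assumes the standard Flume key layout <em>agent</em>.sources.<em>source</em>.<em>parameter</em>. The agent name <strong>server</strong>, the source name <strong>spool_source</strong>, the directory, and the value <strong>300</strong> are hypothetical, and this section does not state the unit of <strong>NoDatatime</strong>, so check the parameter description for your version before applying a value.</p>
<pre class="screen"># Hypothetical fragment of conf/properties.properties (names and values are illustrative only)
server.sources = spool_source
server.sources.spool_source.type = spooldir
server.sources.spool_source.spoolDir = /tmp/flume/spool
# 0 (the default) disables the threshold; 300 is an arbitrary example value
server.sources.spool_source.NoDatatime = 300</pre>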
<p id="ALM-24004__p61913932">The alarm is cleared when Source reads the data and the alarm handling is complete.</p>
</div>
<div class="section" id="ALM-24004__section41172029"><h4 class="sectiontitle">Attribute</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-24004__table48972615" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-24004__row30410955"><th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.1"><p id="ALM-24004__p47368280">Alarm ID</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.2"><p id="ALM-24004__p11625453">Alarm Severity</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="33.33333333333333%" id="mcps1.3.2.2.1.4.1.3"><p id="ALM-24004__p2137651">Auto Clear</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-24004__row38932049"><td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.1 "><p id="ALM-24004__p66488304">24004</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.2 "><p id="ALM-24004__p16843576">Major</p>
</td>
<td class="cellrowborder" valign="top" width="33.33333333333333%" headers="mcps1.3.2.2.1.4.1.3 "><p id="ALM-24004__p22152400">Yes</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-24004__section35003948"><h4 class="sectiontitle">Parameters</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="ALM-24004__table49513952" frame="border" border="1" rules="all"><thead align="left"><tr id="ALM-24004__row17509858"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.1"><p id="ALM-24004__p9012379">Name</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.3.2.1.3.1.2"><p id="ALM-24004__p58914097">Meaning</p>
</th>
</tr>
</thead>
<tbody><tr id="ALM-24004__row16441185817153"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p17935380415">Source</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p187931338134115">Specifies the cluster for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-24004__row7312523"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p55443473">ServiceName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p61736365">Specifies the service for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-24004__row18756373"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p42871274">HostName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p50021192">Specifies the host for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-24004__row910544383116"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p122238832019">AgentId</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p1710634323116">Specifies the ID of the agent for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-24004__row47537545"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p1259352420200">ComponentType</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p38945440">Specifies the component type for which the alarm is generated.</p>
</td>
</tr>
<tr id="ALM-24004__row14964643"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.1 "><p id="ALM-24004__p29641231152015">ComponentName</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.3.2.1.3.1.2 "><p id="ALM-24004__p2757266">Specifies the component name for which the alarm is generated.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<div class="section" id="ALM-24004__section46600077"><h4 class="sectiontitle">Impact on the System</h4><p id="ALM-24004__p22012011">If data is found in the data source and Flume Source continuously fails to read data, the data collection is stopped.</p>
</div>
<div class="section" id="ALM-24004__section16747515"><h4 class="sectiontitle">Possible Causes</h4><ul id="ALM-24004__ul38142436"><li id="ALM-24004__li7737604">Flume Source is faulty, so data cannot be sent.</li><li id="ALM-24004__li2529573">The network is faulty, so data cannot be sent.</li></ul>
</div>
<div class="section" id="ALM-24004__section16509910"><h4 class="sectiontitle">Procedure</h4><p class="tableheading" id="ALM-24004__p3568855"><strong id="ALM-24004__b6441639685910">Check whether Flume Source is faulty.</strong></p>
<ol id="ALM-24004__ol242739611730"><li id="ALM-24004__li4903238511655"><span>Open the <strong id="ALM-24004__b7113192120714">properties.properties</strong> configuration file on the local PC, search for the keyword <strong id="ALM-24004__b1940214451174">type = spooldir</strong> in the file, and check whether the Flume source type is spoolDir.</span><p><ul class="subitemlist" id="ALM-24004__ul544804211655"><li id="ALM-24004__li5763456311655">If yes, go to <a href="#ALM-24004__li3561010711655">2</a>.</li><li id="ALM-24004__li3788804011655">If no, go to <a href="#ALM-24004__li3862672011655">3</a>.</li></ul>
</p></li><li id="ALM-24004__li3561010711655"><a name="ALM-24004__li3561010711655"></a><a name="li3561010711655"></a><span>View the spoolDir directory to check whether all files have been transferred.</span><p><ul class="subitemlist" id="ALM-24004__ul4869592111655"><li id="ALM-24004__li3863828511655">If yes, no further action is required.</li><li id="ALM-24004__li4269336011655">If no, go to <a href="#ALM-24004__li2692021011655">5</a>.<div class="note" id="ALM-24004__note1844521105010"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="ALM-24004__p1544519185017">The monitoring directory of spoolDir is specified by the <strong id="ALM-24004__b34541730983">.spoolDir</strong> parameter in the <strong id="ALM-24004__b737414341080">properties.properties</strong> configuration file. If all files in the monitoring directory have been transferred, every file in the directory has the file name extension <strong id="ALM-24004__b372035211317">.COMPLETED</strong>.</p>
</div></div>
</li></ul>
</p></li><li id="ALM-24004__li3862672011655"><a name="ALM-24004__li3862672011655"></a><a name="li3862672011655"></a><span>Open the <strong id="ALM-24004__b53823131298">properties.properties</strong> configuration file on the local PC, search for <strong id="ALM-24004__b123822138913">org.apache.flume.source.kafka.KafkaSource</strong> in the file, and check whether the Flume source type is Kafka.</span><p><ul class="subitemlist" id="ALM-24004__ul1920493811655"><li id="ALM-24004__li6584642311655">If yes, go to <a href="#ALM-24004__li4027383611655">4</a>.</li><li id="ALM-24004__li3196004311655">If no, go to <a href="#ALM-24004__li5944850711655">7</a>.</li></ul>
</p></li><li id="ALM-24004__li4027383611655"><a name="ALM-24004__li4027383611655"></a><a name="li4027383611655"></a><span>Check whether all data in the topic configured for Kafka Source has been consumed.</span><p><ul class="subitemlist" id="ALM-24004__ul2684449211655"><li id="ALM-24004__li1209616611655">If yes, no further action is required.</li><li id="ALM-24004__li4026542311655">If no, go to <a href="#ALM-24004__li2692021011655">5</a>.</li></ul>
</p></li><li id="ALM-24004__li2692021011655"><a name="ALM-24004__li2692021011655"></a><a name="li2692021011655"></a><span>On FusionInsight Manager, choose <strong id="ALM-24004__b847352521713">Cluster</strong> &gt; <em id="ALM-24004__i184771925191713">Name of the desired cluster</em> &gt; <strong id="ALM-24004__b295011515287">Services</strong> &gt; <strong id="ALM-24004__b258342808595">Flume</strong> &gt; <strong id="ALM-24004__b311819338595">Instance</strong>.</span></li><li id="ALM-24004__li3462695211655"><span>Go to the Flume instance page of the faulty node to check whether the indicator <strong id="ALM-24004__b427086228595">Source Speed Metrics</strong> in the alarm is 0.</span><p><ul class="subitemlist" id="ALM-24004__ul6007981611655"><li id="ALM-24004__li3305340111655">If yes, go to <a href="#ALM-24004__li1313046711655">11</a>.</li><li id="ALM-24004__li2904515611655">If no, go to <a href="#ALM-24004__li5944850711655">7</a>.</li></ul>
</p></li></ol>
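<p>The file checks in the steps above can be approximated with a short script. The following Python sketch is an illustration rather than an official tool. It assumes that <strong>properties.properties</strong> uses the standard Flume key layout <em>agent</em>.sources.<em>source</em>.<em>parameter</em>, and the function names <strong>parse_properties</strong>, <strong>source_types</strong>, and <strong>pending_files</strong> are hypothetical helpers introduced only for this example.</p>
<pre class="screen">import os


def parse_properties(text):
    """Parse 'key = value' lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def source_types(props):
    """Map each 'agent.sources.name' prefix to its configured type value."""
    types = {}
    for key, value in props.items():
        parts = key.split(".")
        # Only keys shaped like agent.sources.name.type are considered.
        if len(parts) == 4 and parts[1] == "sources" and parts[3] == "type":
            types[".".join(parts[:3])] = value
    return types


def pending_files(spool_dir, suffix=".COMPLETED"):
    """List regular files in spool_dir that do not end with the completed suffix."""
    names = sorted(os.listdir(spool_dir))
    return [name for name in names
            if os.path.isfile(os.path.join(spool_dir, name))
            and not name.endswith(suffix)]</pre>
<p>For example, pass the content of <strong>properties.properties</strong> to <strong>parse_properties</strong>, use <strong>source_types</strong> to see whether a source is of type <strong>spooldir</strong>, <strong>org.apache.flume.source.kafka.KafkaSource</strong>, or <strong>avro</strong>, and then call <strong>pending_files</strong> on the directory given by the <strong>.spoolDir</strong> parameter. An empty result suggests that all files have been transferred.</p>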
<p class="tableheading" id="ALM-24004__p384743911655"><strong id="ALM-24004__b563006349020">Check the network connection between the faulty node and the node that corresponds to the Flume Source IP address.</strong></p>
<ol start="7" id="ALM-24004__ol2443014511812"><li id="ALM-24004__li5944850711655"><a name="ALM-24004__li5944850711655"></a><a name="li5944850711655"></a><span>Open the <strong id="ALM-24004__b3100339599">properties.properties</strong> configuration file on the local PC, search for <strong id="ALM-24004__b188905491399">type = avro</strong> in the file, and check whether the Flume source type is Avro.</span><p><ul class="subitemlist" id="ALM-24004__ul1406193011655"><li id="ALM-24004__li5331974511655">If yes, go to <a href="#ALM-24004__li6550564111655">8</a>.</li><li id="ALM-24004__li2393205811655">If no, go to <a href="#ALM-24004__li1313046711655">11</a>.</li></ul>
</p></li><li id="ALM-24004__li6550564111655"><a name="ALM-24004__li6550564111655"></a><a name="li6550564111655"></a><span>Log in to the faulty node as user <strong id="ALM-24004__b64231458595">root</strong>, and run the <strong id="ALM-24004__b578083128595">ping </strong><em id="ALM-24004__i505127648595">IP address of the Flume source</em> command to check whether the peer host can be pinged successfully. <span id="ALM-24004__text149987167524"></span></span><p><ul class="subitemlist" id="ALM-24004__ul2964802511655"><li id="ALM-24004__li4422757511655">If yes, go to <a href="#ALM-24004__li1313046711655">11</a>.</li><li id="ALM-24004__li2566384611655">If no, go to <a href="#ALM-24004__li5267986211655">9</a>.</li></ul>
</p></li><li id="ALM-24004__li5267986211655"><a name="ALM-24004__li5267986211655"></a><a name="li5267986211655"></a><span>Contact the network administrator to restore the network.</span></li><li id="ALM-24004__li3128510211655"><span>In the alarm list, check whether the alarm is cleared after a period.</span><p><ul class="subitemlist" id="ALM-24004__ul2192735211655"><li id="ALM-24004__li435671311655">If yes, no further action is required.</li><li id="ALM-24004__li1734945311655">If no, go to <a href="#ALM-24004__li1313046711655">11</a>.</li></ul>
</p></li></ol>
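<p>The connectivity check in the steps above can also be scripted. The following Python sketch wraps the system <strong>ping</strong> command and is an illustration only. The function names <strong>ping_command</strong> and <strong>is_reachable</strong> are hypothetical, and the sketch assumes that a <strong>ping</strong> executable is available on the faulty node.</p>
<pre class="screen">import platform
import subprocess


def ping_command(host, count=3):
    """Build the ping command; Windows uses -n for the count, other systems use -c."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def is_reachable(host, count=3, timeout=15):
    """Return True when ping exits with status 0 before the timeout expires."""
    try:
        result = subprocess.run(ping_command(host, count),
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # A timeout or a missing ping executable is treated as unreachable.
        return False
    return result.returncode == 0</pre>
<p>Call <strong>is_reachable</strong> with the IP address of the Flume source. A result of <strong>False</strong> corresponds to the case in which the peer host cannot be pinged and the network administrator needs to be contacted.</p>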
<p class="tableheading" id="ALM-24004__p6312844611655"><strong id="ALM-24004__b92212632">Collect the fault information.</strong></p>
<ol start="11" id="ALM-24004__ol3198166911835"><li id="ALM-24004__li1313046711655"><a name="ALM-24004__li1313046711655"></a><a name="li1313046711655"></a><span>On FusionInsight Manager, choose <strong id="ALM-24004__b39958216542">O&amp;M</strong>. In the navigation pane on the left, choose <strong id="ALM-24004__b59952217544">Log</strong> &gt; <strong id="ALM-24004__b699511219547">Download</strong>.</span></li><li id="ALM-24004__li5106534011655"><span>Expand the <strong id="ALM-24004__b19232419182614">Service</strong> drop-down list, and select <strong id="ALM-24004__b623891913262">Flume</strong> for the target cluster.</span></li><li id="ALM-24004__li5693488411655"><span>Click <span><img id="ALM-24004__image1945644173117" src="en-us_image_0269417449.png"></span> in the upper right corner, and set <strong id="ALM-24004__b6456941173117">Start Date</strong> and <strong id="ALM-24004__b11456154113318">End Date</strong> for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click <strong id="ALM-24004__b13456164113319">Download</strong>.</span></li><li id="ALM-24004__li4832292311655"><span>Contact <span id="ALM-24004__text496473924320">O&amp;M personnel</span> and provide the collected logs.</span></li></ol>
</div>
<div class="section" id="ALM-24004__section169311343318"><h4 class="sectiontitle">Alarm Clearing</h4><p id="ALM-24004__p754913417333">This alarm is automatically cleared after the fault is rectified.</p>
</div>
<div class="section" id="ALM-24004__section14371463"><h4 class="sectiontitle">Related Information</h4><p id="ALM-24004__p51658577">None</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1298.html">Alarm Reference (Applicable to MRS 3.x)</a></div>
</div>
</div>