Yang, Tong 6182f91ba8 MRS component operation guide_normal 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2022-12-09 14:55:21 +00:00


<a name="mrs_01_1065"></a><a name="mrs_01_1065"></a>
<h1 class="topictitle1">Typical Scenario: Collecting Logs from Kafka and Uploading Them to HDFS</h1>
<div id="body1590374514342"><div class="section" id="mrs_01_1065__sa06487045b2a46959003725608759034"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_1065__ad2f1b0536d754b7985d8044694d2b0cd">This section describes how to use the Flume client to collect logs from the topic list (test1) of Kafka and save them to the <span class="filepath" id="mrs_01_1065__filepath436743518476"><b>/flume/test</b></span> directory on HDFS.</p>
<p id="mrs_01_1065__p1190153615238">This section applies to MRS 3.<em id="mrs_01_1065__i4635135135616">x</em> or later clusters.</p>
<div class="note" id="mrs_01_1065__nc300101343a147f6a176ea799c4a4c57"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p class="text" id="mrs_01_1065__acb5b871afc9c48c69dd7f66475dbba5d">By default, the cluster network environment is secure and SSL authentication is not enabled for data transmission. To use encrypted transmission instead, see <a href="mrs_01_1069.html">Configuring the Encrypted Transmission</a>. This configuration applies to scenarios where only Flume is configured, for example, Kafka Source + Memory Channel + HDFS Sink.</p>
</div></div>
</div>
<div class="section" id="mrs_01_1065__s04430e3bb5244658b9b6a46f0f0fd94b"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_1065__u1feede73da194b919e884e199fe992cb"><li id="mrs_01_1065__l1b1790aa95f0401ba4a902cd2b115feb">The cluster has been installed, including the HDFS, Kafka, and Flume services.</li><li id="mrs_01_1065__li3520154014471">The Flume client has been installed. For details, see <span id="mrs_01_1065__ph56371413174818"><a href="https://docs.otc.t-systems.com/cmpntguide/mrs/mrs_01_0392.html" target="_blank" rel="noopener noreferrer">Installing the Flume Client</a></span>.</li><li id="mrs_01_1065__l1e83c3cfd1ab42de873e171eb6faffab">The network environment of the cluster is secure.</li><li id="mrs_01_1065__l41ce6de757c84926a5de5cb39ebf1c54">You have created user <strong id="mrs_01_1065__b9306122217568">flume_hdfs</strong> and granted it permissions on the HDFS directory and data used during log verification.</li></ul>
</div>
<div class="section" id="mrs_01_1065__sc3b5cac3e20b49f8929cba66c9b90493"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_1065__o540af6194cdd4007ae77e6d9ee5b9fdf"><li id="mrs_01_1065__ldba310a477e84e65bb39d8422f3f9664"><span>On FusionInsight Manager, choose <span class="menucascade" id="mrs_01_1065__menucascade22091716255739"><b><span class="uicontrol" id="mrs_01_1065__uicontrol210808680055739">System &gt; User</span></b></span> and choose <span class="menucascade" id="mrs_01_1065__menucascade127193121655739"><b><span class="uicontrol" id="mrs_01_1065__uicontrol77877507355739">More &gt; Download Authentication Credential</span></b></span> to download the Kerberos certificate file of user <strong id="mrs_01_1065__b195394508355739">flume_hdfs</strong> and save it to the local host.</span></li><li id="mrs_01_1065__l16a8a754e9654e03adf7ca038acb5b2b"><span>Configure the client parameters of the Flume role.</span><p><div class="p" id="mrs_01_1065__p59360377157">Use the Flume configuration tool on FusionInsight Manager to configure the Flume role client parameters and generate a configuration file.<ol type="a" id="mrs_01_1065__oa430f9283d074f839f46beb10194944a"><li id="mrs_01_1065__l815a6e6ed66f447a8bbacb4cb4abda77">Log in to FusionInsight Manager and choose <strong id="mrs_01_1065__b8259171211011">Cluster</strong> &gt; <strong id="mrs_01_1065__b14260191211106">Services</strong>. On the page that is displayed, choose <strong id="mrs_01_1065__b32601512131014">Flume</strong>. On the displayed page, click the <strong id="mrs_01_1065__b92601512121016">Configuration Tool</strong> tab.</li><li id="mrs_01_1065__l766f1033789447cc9d824597e5687aa6">Set <strong id="mrs_01_1065__b21834960655739">Agent Name</strong> to <strong id="mrs_01_1065__b211605281855739">client</strong>. 
Select the source, channel, and sink to be used, drag them to the GUI on the right, and connect them.<p id="mrs_01_1065__a9cd223e3f6984c09ab85ca3280c9bd45">For example, use Kafka Source, Memory Channel, and HDFS Sink.</p>
</li><li id="mrs_01_1065__lf82319791edf4cd1a6af4ac3d2b17ece">Double-click the source, channel, and sink. Set corresponding configuration parameters by referring to <a href="#mrs_01_1065__table2029895217498">Table 1</a> based on the actual environment.<div class="note" id="mrs_01_1065__n12c0807eb21342be8b372535b9673cbe"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_1065__u318b45ac8d8341578c77bb498420decc"><li id="mrs_01_1065__l0a2de2d5bf434353ba558a76d4c635ab">To keep using an existing <strong id="mrs_01_1065__b433055013103">properties.properties</strong> file after modifying it, log in to FusionInsight Manager and choose <strong id="mrs_01_1065__b733205021016">Cluster</strong> &gt; <em id="mrs_01_1065__i43331950181018">Name of the desired cluster</em> &gt; <strong id="mrs_01_1065__b633445016103">Services</strong>. On the page that is displayed, choose <strong id="mrs_01_1065__b5335125061012">Flume</strong>. On the displayed page, click the <strong id="mrs_01_1065__b1233535020106">Configuration Tool</strong> tab, click <strong id="mrs_01_1065__b16336175051015">Import</strong>, import the file, and modify the configuration items related to non-encrypted transmission.</li><li id="mrs_01_1065__l503aff2a2d7748ba861a9fc8eaa40c9c">When importing a configuration file, it is recommended that the numbers of sources, channels, and sinks each do not exceed 40. Otherwise, the response time may be very long.</li></ul>
</div></div>
<div class="tablenoborder"><a name="mrs_01_1065__table2029895217498"></a><a name="table2029895217498"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_1065__table2029895217498" frame="border" border="1" rules="all"><caption><b>Table 1 </b>Parameters to be modified for the Flume role client</caption><thead align="left"><tr id="mrs_01_1065__row4298052104916"><th align="left" class="cellrowborder" valign="top" width="28.83%" id="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1"><p id="mrs_01_1065__p6298165210495">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="37.169999999999995%" id="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2"><p id="mrs_01_1065__p329865224917">Description</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="34%" id="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3"><p id="mrs_01_1065__p1298652184917">Example Value</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_1065__row1129816529499"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p92991852154918">Name</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p32991452144916">The value must be unique and cannot be left blank.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p1629915524499">test</p>
</td>
</tr>
<tr id="mrs_01_1065__row1029925216496"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p15299175284917">kafka.topics</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p102991152184915">Specifies the subscribed Kafka topic list, in which topics are separated by commas (,). This parameter cannot be left blank.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p72991552194914">test1</p>
</td>
</tr>
<tr id="mrs_01_1065__row0299195284911"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p202998526498">kafka.consumer.group.id</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p132998528495">Specifies the ID of the Kafka consumer group that Flume uses to obtain data from Kafka. This parameter cannot be left blank.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p13299752164910">flume</p>
</td>
</tr>
<tr id="mrs_01_1065__row62991552164917"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p1629919521497">kafka.bootstrap.servers</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p11299252144910">Specifies the list of Kafka bootstrap addresses, each in <em>IP address</em>:<em>port</em> format. The default value is the list of all Kafka brokers in the Kafka cluster. If Kafka has been installed in the cluster and its configurations have been synchronized, this parameter can be left blank.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p12991352114920">192.168.101.10:9092</p>
</td>
</tr>
<tr id="mrs_01_1065__row329911527494"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p8299125274917">batchSize</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p929985244911">Specifies the number of events that Flume sends in a batch (number of data pieces).</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p1529975218492">61200</p>
</td>
</tr>
<tr id="mrs_01_1065__row330025215495"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p6300145212491">hdfs.path</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p13300185219492">Specifies the HDFS data write directory. This parameter cannot be left blank.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p5300185217497">hdfs://hacluster/flume/test</p>
</td>
</tr>
<tr id="mrs_01_1065__row163001152154919"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p230045220494">hdfs.filePrefix</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p23001352124911">Specifies the prefix of the names of files written to HDFS.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p430005274912">TMP_</p>
</td>
</tr>
<tr id="mrs_01_1065__row8300185213494"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p1030025214490">hdfs.batchSize</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p1430018529491">Specifies the maximum number of events that can be written to HDFS in one batch.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p43001852154919">61200</p>
</td>
</tr>
<tr id="mrs_01_1065__row630055274915"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p153001452104917">hdfs.kerberosPrincipal</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p1530095215491">Specifies the user for Kerberos authentication. This parameter is mandatory in security clusters and is not required in other clusters.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p12300115264914">flume_hdfs</p>
</td>
</tr>
<tr id="mrs_01_1065__row2300165204911"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p13300155220495">hdfs.kerberosKeytab</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p13001452184918">Specifies the path of the keytab file for Kerberos authentication. This parameter is mandatory in security clusters and is not required in other clusters.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p193001352134918">/opt/test/conf/user.keytab</p>
<div class="note" id="mrs_01_1065__note23001552174916"><span class="notetitle"> NOTE: </span><div class="notebody"><p id="mrs_01_1065__p10301165214492">Obtain the <strong id="mrs_01_1065__b100480318155739">user.keytab</strong> file from the Kerberos certificate file of the user <strong id="mrs_01_1065__b27405678355739">flume_hdfs</strong>. In addition, ensure that the user who installs and runs the Flume client has the read and write permissions on the <strong id="mrs_01_1065__b64857193355739">user.keytab</strong> file.</p>
</div></div>
</td>
</tr>
<tr id="mrs_01_1065__row5301195284912"><td class="cellrowborder" valign="top" width="28.83%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.1 "><p id="mrs_01_1065__p8301155244914">hdfs.useLocalTimeStamp</p>
</td>
<td class="cellrowborder" valign="top" width="37.169999999999995%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.2 "><p id="mrs_01_1065__p103011052134911">Specifies whether to use the local time. Possible values are <strong id="mrs_01_1065__b33130898155739">true</strong> and <strong id="mrs_01_1065__b182695029255739">false</strong>.</p>
</td>
<td class="cellrowborder" valign="top" width="34%" headers="mcps1.3.3.2.2.2.1.1.3.3.2.4.1.3 "><p id="mrs_01_1065__p16301105218499">true</p>
</td>
</tr>
</tbody>
</table>
</div>
</li><li id="mrs_01_1065__l92b924df515f493daa8ec019ca9fcec4"><a name="mrs_01_1065__l92b924df515f493daa8ec019ca9fcec4"></a><a name="l92b924df515f493daa8ec019ca9fcec4"></a>Click <strong id="mrs_01_1065__b137094315571">Export</strong> to save the <strong id="mrs_01_1065__b1437074365710">properties.properties</strong> configuration file to the local host.</li></ol>
</div>
</p></li><li id="mrs_01_1065__laf28e4a6809f4c3c884dd080c2d2e41a"><span>Upload the configuration file.</span><p><p id="mrs_01_1065__p1512635181619">Upload the file exported in <a href="#mrs_01_1065__l92b924df515f493daa8ec019ca9fcec4">2.d</a> to the <em id="mrs_01_1065__i713731519482">Flume client installation directory</em><strong id="mrs_01_1065__b31371915164817">/fusioninsight-flume-</strong><span id="mrs_01_1065__text913841574815"><em id="mrs_01_1065__i31384158486">Flume component version number</em></span><strong id="mrs_01_1065__b3139515134811">/conf</strong> directory of the cluster.</p>
</p></li></ol><ol start="4" id="mrs_01_1065__ofa622de0520745d5ad045abc022b9cf6"><li id="mrs_01_1065__le7801192c8734a32a4208d6292de2d4a"><span>Verify log transmission.</span><p><ol type="a" id="mrs_01_1065__o71eb2167b0894dc0acbd61933ab793cf"><li id="mrs_01_1065__lc066c3f5cd3540ef838a483303d05abf">Log in to FusionInsight Manager as a user who has the management permission on HDFS. For details, see <a href="mrs_01_2124.html">Accessing FusionInsight Manager (MRS 3.x or Later)</a>. Choose <strong id="mrs_01_1065__b524525131113">Cluster</strong> &gt; <strong id="mrs_01_1065__b12461551111">Services</strong> &gt; <strong id="mrs_01_1065__b102467511115">HDFS</strong>. On the page that is displayed, click the <strong id="mrs_01_1065__b1324714511119">NameNode(</strong><em id="mrs_01_1065__i1024712512115">Node name</em><strong id="mrs_01_1065__b72483511115">,Active)</strong> link next to <strong id="mrs_01_1065__b192491452114">NameNode WebUI</strong> to go to the HDFS web UI. On the displayed page, choose <strong id="mrs_01_1065__b1524918517117">Utilities</strong> &gt; <strong id="mrs_01_1065__b22506541114">Browse the file system</strong>.</li><li id="mrs_01_1065__l25c000ab134c44b396bbdb8c181c0979">Check whether data has been generated in the <strong id="mrs_01_1065__b110274047455739">/flume/test</strong> directory on HDFS.</li></ol>
</p></li></ol>
</div>
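<p>The following Python sketch illustrates roughly what a configuration built from the example values in Table 1 could look like in Apache Flume properties syntax. It is not the output of the Configuration Tool: the component names <strong>kafka_src</strong>, <strong>mem_ch</strong>, and <strong>hdfs_sink</strong> and the helper functions are hypothetical, the <strong>type</strong> values are the standard Apache Flume ones and are assumed to apply here, and channel settings are omitted because Table 1 lists none. The file exported from the tool may use different names and contain additional parameters.</p>

```python
# Example values are taken from Table 1. The component names used below
# ("kafka_src", "mem_ch", "hdfs_sink") are hypothetical placeholders.
AGENT = "client"

SOURCE = {
    "type": "org.apache.flume.source.kafka.KafkaSource",  # assumed standard type
    "kafka.topics": "test1",
    "kafka.consumer.group.id": "flume",
    "kafka.bootstrap.servers": "192.168.101.10:9092",
    "batchSize": "61200",
}
CHANNEL = {"type": "memory"}  # Table 1 lists no channel parameters
SINK = {
    "type": "hdfs",
    "hdfs.path": "hdfs://hacluster/flume/test",
    "hdfs.filePrefix": "TMP_",
    "hdfs.batchSize": "61200",
    "hdfs.kerberosPrincipal": "flume_hdfs",  # security clusters only
    "hdfs.kerberosKeytab": "/opt/test/conf/user.keytab",  # security clusters only
    "hdfs.useLocalTimeStamp": "true",
}

# Parameters that Table 1 says cannot be left blank.
REQUIRED = {
    "source": ["kafka.topics", "kafka.consumer.group.id"],
    "sink": ["hdfs.path"],
}


def missing_required(source, sink):
    """Return the required parameter names that are absent or blank."""
    missing = [k for k in REQUIRED["source"] if not source.get(k, "").strip()]
    missing += [k for k in REQUIRED["sink"] if not sink.get(k, "").strip()]
    return missing


def render(agent, src_name, src, ch_name, ch, sink_name, sink):
    """Render one source, channel, and sink chain as Flume properties text."""
    lines = [
        f"{agent}.sources = {src_name}",
        f"{agent}.channels = {ch_name}",
        f"{agent}.sinks = {sink_name}",
    ]
    lines += [f"{agent}.sources.{src_name}.{k} = {v}" for k, v in src.items()]
    # A source is bound with "channels" (plural), a sink with "channel".
    lines.append(f"{agent}.sources.{src_name}.channels = {ch_name}")
    lines += [f"{agent}.channels.{ch_name}.{k} = {v}" for k, v in ch.items()]
    lines += [f"{agent}.sinks.{sink_name}.{k} = {v}" for k, v in sink.items()]
    lines.append(f"{agent}.sinks.{sink_name}.channel = {ch_name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    problems = missing_required(SOURCE, SINK)
    if problems:
        raise SystemExit("Blank required parameters: " + ", ".join(problems))
    print(render(AGENT, "kafka_src", SOURCE, "mem_ch", CHANNEL,
                 "hdfs_sink", SINK), end="")
```

<p>Running the script prints the properties text to standard output, which can be compared with the file exported from the Configuration Tool when checking that the values from Table 1 were entered as intended. In a non-security cluster, the two Kerberos entries would be left out.</p>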
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1059.html">Non-Encrypted Transmission</a></div>
</div>
</div>