Yang, Tong 3f5759eed2 MRS comp-lts 2.0.38.SP20 version
Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2023-01-19 17:08:45 +00:00


<a name="mrs_01_0397"></a><a name="mrs_01_0397"></a>
<h1 class="topictitle1">Using Flume from Scratch</h1>
<div id="body8662426"><div class="section" id="mrs_01_0397__en-us_topic_0000001173789216_s6020f8e1de5644d2becca6a1c9dd7b98"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_0397__en-us_topic_0000001173789216_a78492c95b3d84580991937af4802369a">You can use Flume to import collected log information to Kafka.</p>
</div>
<div class="section" id="mrs_01_0397__en-us_topic_0000001173789216_sdd14a34b7dc44c2ab9cabb19599a033a"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_0397__en-us_topic_0000001173789216_u0c52e5bb70f64537997c99cf4de32541"><li id="mrs_01_0397__en-us_topic_0000001173789216_l11118dd36a974dfc8762539ad916ddf6">A streaming cluster with Kerberos authentication enabled has been created.</li><li id="mrs_01_0397__en-us_topic_0000001173789216_l45d4730e083c4b1489a9858a4b8c838d">The Flume client has been installed on the node where logs are generated, for example, in <strong id="mrs_01_0397__en-us_topic_0000001173789216_b126304525485">/opt/FlumeClient</strong>. The client directory used in the following operations is only an example. Replace it with the actual installation directory.</li><li id="mrs_01_0397__en-us_topic_0000001173789216_l9d613f4f7c21477aa8e688973b1ad35c">The streaming cluster can properly communicate with the node where logs are generated.</li></ul>
</div>
<div class="section" id="mrs_01_0397__en-us_topic_0000001173789216_section197021472420"><h4 class="sectiontitle">Using the Flume Client</h4><div class="note" id="mrs_01_0397__en-us_topic_0000001173789216_note6127174918413"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_0397__en-us_topic_0000001173789216_p1012744917411">You do not need to perform steps <a href="#mrs_01_0397__en-us_topic_0000001173789216_li81278495417">2</a> to <a href="#mrs_01_0397__en-us_topic_0000001173789216_li31329494415">6</a> for a normal cluster (a cluster with Kerberos authentication disabled).</p>
</div></div>
<ol id="mrs_01_0397__en-us_topic_0000001173789216_ol1712704919420"><li id="mrs_01_0397__en-us_topic_0000001173789216_li53491454112820"><span>Install the client.</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p4349155462816">For details, see <span id="mrs_01_0397__en-us_topic_0000001173789216_ph1991801659"><span id="mrs_01_0397__ph11815123102819"><a href="mrs_01_1595.html">Installing the Flume Client on Clusters</a></span></span>.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li81278495417"><a name="mrs_01_0397__en-us_topic_0000001173789216_li81278495417"></a><a name="en-us_topic_0000001173789216_li81278495417"></a><span>Copy the configuration file of the authentication server from the Master1 node to the <em id="mrs_01_0397__en-us_topic_0000001173789216_i1849326261">Flume client installation directory</em><strong id="mrs_01_0397__en-us_topic_0000001173789216_b141800497">/fusioninsight-flume-</strong><em id="mrs_01_0397__en-us_topic_0000001173789216_i714315541">Flume component version number</em><strong id="mrs_01_0397__en-us_topic_0000001173789216_b5011982">/conf</strong> directory on the node where the Flume client resides.</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p1912818491043">The full file path is ${BIGDATA_HOME}/FusionInsight_BASE_<em id="mrs_01_0397__i163311813193115">XXX</em>/1_X_KerberosClient/etc/kdc.conf. In the preceding path, <span class="parmname" id="mrs_01_0397__en-us_topic_0000001173789216_parmname137921140165312"><b>XXX</b></span> indicates the product version number. <span class="parmname" id="mrs_01_0397__en-us_topic_0000001173789216_parmname8792140175319"><b>X</b></span> indicates a random number. Change it based on the site requirements. The file must be saved by the user who installs the Flume client, for example, user <strong id="mrs_01_0397__en-us_topic_0000001173789216_b819058741">root</strong>.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li6128114911418"><span>Check the service IP address of any node where the Flume role is deployed.</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p109301819113114">Log in to FusionInsight Manager. For details, see <a href="mrs_01_2124.html">Accessing FusionInsight Manager</a>. Choose <strong id="mrs_01_0397__en-us_topic_0000001173789216_b2032411289519">Cluster &gt; Services &gt; Flume &gt; Instance</strong> and note the service IP address of any node where the Flume role is deployed.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li4130849748"><a name="mrs_01_0397__en-us_topic_0000001173789216_li4130849748"></a><a name="en-us_topic_0000001173789216_li4130849748"></a><span>Copy the user authentication file from the node found in the previous step to the <em id="mrs_01_0397__en-us_topic_0000001173789216_i395801387">Flume client installation directory</em><span class="filepath" id="mrs_01_0397__en-us_topic_0000001173789216_filepath1426015500"><b>/fusioninsight-flume-<em>Flume component version number</em>/conf</b></span> directory on the Flume client node.</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p15131104920419">The full file path is ${BIGDATA_HOME}/FusionInsight_Porter_<em id="mrs_01_0397__i8832171118">XXX</em>/install/FusionInsight-Flume-<em>Flume component version number</em>/flume/conf/flume.keytab.</p>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p101311749743">In the preceding paths, <span class="parmname" id="mrs_01_0397__en-us_topic_0000001173789216_parmname517252515"><b>XXX</b></span> indicates the product version number. Change it based on the site requirements. The file must be saved by the user who installs the Flume client, for example, user <strong id="mrs_01_0397__en-us_topic_0000001173789216_b1548943322">root</strong>.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li313120496411"><span>Copy the <span class="filepath" id="mrs_01_0397__en-us_topic_0000001173789216_filepath1754647099"><b>jaas.conf</b></span> file from this node to the <span class="filepath" id="mrs_01_0397__en-us_topic_0000001173789216_filepath946068778"><b>conf</b></span> directory on the Flume client node.</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p108276123317">The full file path is ${BIGDATA_HOME}/FusionInsight_Current/1_<em id="mrs_01_0397__i235661817114">X</em>_Flume/etc/jaas.conf.</p>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p121320491049">In the preceding path, <span class="parmname" id="mrs_01_0397__en-us_topic_0000001173789216_parmname5517144455217"><b>X</b></span> indicates a random number. Change it based on the site requirements. The file must be saved by the user who installs the Flume client, for example, user <strong id="mrs_01_0397__en-us_topic_0000001173789216_b1405501687">root</strong>.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li31329494415"><a name="mrs_01_0397__en-us_topic_0000001173789216_li31329494415"></a><a name="en-us_topic_0000001173789216_li31329494415"></a><span>Log in to the Flume client node and go to the client installation directory. Run the following command to edit the <strong>jaas.conf</strong> file:</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p1513234914415"><strong id="mrs_01_0397__en-us_topic_0000001173789216_b11132114911416">vi conf/jaas.conf</strong></p>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p8132134917416">Set <span class="parmname" id="mrs_01_0397__en-us_topic_0000001173789216_parmname286447899"><b>keyTab</b></span> to the full path of the user authentication file saved in step <a href="#mrs_01_0397__en-us_topic_0000001173789216_li4130849748">4</a>, that is, the file in the <span class="filepath" id="mrs_01_0397__en-us_topic_0000001173789216_filepath446199138"><b>Flume client installation directory/fusioninsight-flume-<span id="mrs_01_0397__en-us_topic_0000001173789216_text521219258"><em id="mrs_01_0397__en-us_topic_0000001173789216_i2067946796">Flume component version number</em></span>/conf</b></span> directory. Then save the file and exit.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li01322491414"><span>Run the following command to modify the <span class="filepath" id="mrs_01_0397__en-us_topic_0000001173789216_filepath1178145290"><b>flume-env.sh</b></span> configuration file of the Flume client:</span><p><p id="mrs_01_0397__en-us_topic_0000001173789216_p51325495417"><strong id="mrs_01_0397__en-us_topic_0000001173789216_b1638000051">vi </strong><em id="mrs_01_0397__en-us_topic_0000001173789216_i141461625">Flume client installation directory</em><strong id="mrs_01_0397__en-us_topic_0000001173789216_b961361807">/fusioninsight-flume-</strong><span id="mrs_01_0397__en-us_topic_0000001173789216_text1116938555"><em id="mrs_01_0397__en-us_topic_0000001173789216_i1810898737">Flume component version number</em></span><strong id="mrs_01_0397__en-us_topic_0000001173789216_b400665764">/conf/flume-env.sh</strong></p>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p10133164913412">Add the following information after <span class="parmvalue" id="mrs_01_0397__en-us_topic_0000001173789216_parmvalue190788891"><b>-XX:+UseCMSCompactAtFullCollection</b></span>:</p>
<pre class="screen" id="mrs_01_0397__en-us_topic_0000001173789216_screen9133249544">-Djava.security.krb5.conf=<em id="mrs_01_0397__en-us_topic_0000001173789216_i346267409">Flume client installation directory</em>/fusioninsight-flume-1.9.0/conf/kdc.conf -Djava.security.auth.login.config=<em id="mrs_01_0397__en-us_topic_0000001173789216_i1704244165">Flume client installation directory</em>/fusioninsight-flume-1.9.0/conf/jaas.conf -Dzookeeper.request.timeout=120000</pre>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p181335491640">For example, <strong id="mrs_01_0397__en-us_topic_0000001173789216_b1517084102">"-XX:+UseCMSCompactAtFullCollection -Djava.security.krb5.conf=<em id="mrs_01_0397__en-us_topic_0000001173789216_i549608619">Flume client installation directory</em>/fusioninsight-flume-<span id="mrs_01_0397__en-us_topic_0000001173789216_text1188351798"><em id="mrs_01_0397__en-us_topic_0000001173789216_i1958868035">Flume component version number</em></span>/conf/kdc.conf -Djava.security.auth.login.config=</strong><em id="mrs_01_0397__en-us_topic_0000001173789216_i1530597254">Flume client installation directory</em><strong id="mrs_01_0397__en-us_topic_0000001173789216_b445827098">/fusioninsight-flume-</strong><span id="mrs_01_0397__en-us_topic_0000001173789216_text351850741"><em id="mrs_01_0397__en-us_topic_0000001173789216_i683528415">Flume component version number</em></span><strong id="mrs_01_0397__en-us_topic_0000001173789216_b1450704927">/conf/jaas.conf -Dzookeeper.request.timeout=120000"</strong></p>
<p id="mrs_01_0397__en-us_topic_0000001173789216_p7133549541">Change <em id="mrs_01_0397__en-us_topic_0000001173789216_i1759988694">Flume client installation directory</em> to the actual installation directory. Then save and exit.</p>
</p></li><li id="mrs_01_0397__en-us_topic_0000001173789216_li15133949946"><span>Run the following command to restart the Flume client:</span><p><p id="mrs_01_0397__p9863101120393"><strong id="mrs_01_0397__b229517591370">cd </strong><em id="mrs_01_0397__i629605915378">Flume client installation directory</em><strong id="mrs_01_0397__b19296959153719">/fusioninsight-flume-</strong><span id="mrs_01_0397__text6297185983711"><em id="mrs_01_0397__i18296195923712">Flume component version number</em></span><strong id="mrs_01_0397__b11297135913711">/bin</strong></p>
<p id="mrs_01_0397__p15863111123914"><strong id="mrs_01_0397__b9863911143918">./flume-manage.sh restart</strong></p>
<p id="mrs_01_0397__p2063491363912">Example:</p>
<p id="mrs_01_0397__p513313491046"><strong id="mrs_01_0397__b687227199">cd /opt/FlumeClient/fusioninsight-flume-</strong><span id="mrs_01_0397__text1715239672"><em id="mrs_01_0397__i71844062">Flume component version number</em></span><strong id="mrs_01_0397__b985798386">/bin</strong></p>
<p id="mrs_01_0397__p21334491644"><strong id="mrs_01_0397__b013313491647">./flume-manage.sh restart</strong></p>
</p></li><li id="mrs_01_0397__li1391318335141"><span>Configure jobs based on actual service scenarios.</span><p><ul id="mrs_01_0397__ul143551339101419"><li id="mrs_01_0397__li1355339151418">For MRS 3.<em id="mrs_01_0397__i334763984312">x</em> or later, some parameters can be configured on Manager. For details, see <a href="mrs_01_1059.html">Non-Encrypted Transmission</a> or <a href="mrs_01_1068.html">Encrypted Transmission</a>.</li><li id="mrs_01_0397__li18902144615149">Set the parameters in the <strong id="mrs_01_0397__b1683775314417">properties.properties</strong> file. The following uses SpoolDir Source+File Channel+Kafka Sink as an example.<p id="mrs_01_0397__p7419182722014">On the node where the Flume client is installed, run the following command to edit <strong id="mrs_01_0397__b113872031143811">properties.properties</strong> (the Flume client configuration file). Configure and save the job based on service requirements by referring to <a href="mrs_01_1057.html">Flume Service Configuration Guide</a>:</p>
<p id="mrs_01_0397__p1126102042013"><strong id="mrs_01_0397__b457646830">vi </strong><em id="mrs_01_0397__i1478186813">Flume client installation directory</em><strong id="mrs_01_0397__b1598827978">/fusioninsight-flume-</strong><span id="mrs_01_0397__text1419081776"><em id="mrs_01_0397__i1478582987">Flume component version number</em></span><strong id="mrs_01_0397__b462334852">/conf/properties.properties</strong></p>
<pre class="screen" id="mrs_01_0397__screen152611320142014">#########################################################################################
client.sources = static_log_source
client.channels = static_log_channel
client.sinks = kafka_sink
#########################################################################################
# SpoolDir Source + File Channel + Kafka Sink
client.sources.static_log_source.type = spooldir
client.sources.static_log_source.spoolDir = <em id="mrs_01_0397__i184511510103910">Monitoring directory</em>
client.sources.static_log_source.fileSuffix = .COMPLETED
client.sources.static_log_source.ignorePattern = ^$
client.sources.static_log_source.trackerDir = <em id="mrs_01_0397__i8881213153915">Metadata storage path during transmission</em>
client.sources.static_log_source.maxBlobLength = 16384
client.sources.static_log_source.batchSize = 51200
client.sources.static_log_source.inputCharset = UTF-8
client.sources.static_log_source.deserializer = LINE
client.sources.static_log_source.selector.type = replicating
client.sources.static_log_source.fileHeaderKey = file
client.sources.static_log_source.fileHeader = false
client.sources.static_log_source.basenameHeader = true
client.sources.static_log_source.basenameHeaderKey = basename
client.sources.static_log_source.deletePolicy = never
client.channels.static_log_channel.type = file
client.channels.static_log_channel.dataDirs = <em id="mrs_01_0397__i1884141673911">Data cache path. Multiple paths, separated by commas (,), can be configured to improve performance.</em>
client.channels.static_log_channel.checkpointDir = <em id="mrs_01_0397__i3486121918396">Checkpoint storage path</em>
client.channels.static_log_channel.maxFileSize = 2146435071
client.channels.static_log_channel.capacity = 1000000
client.channels.static_log_channel.transactionCapacity = 612000
client.channels.static_log_channel.minimumRequiredSpace = 524288000
client.sinks.kafka_sink.type = org.apache.flume.sink.kafka.KafkaSink
client.sinks.kafka_sink.kafka.topic = <em id="mrs_01_0397__i965311225399">Topic to which data is written, for example, <strong id="mrs_01_0397__b365332215391">flume_test</strong></em>
client.sinks.kafka_sink.kafka.bootstrap.servers = <em id="mrs_01_0397__i1735106852">XXX</em>.<em id="mrs_01_0397__i1497225976">XXX</em>.<em id="mrs_01_0397__i1009870524">XXX</em>.<em id="mrs_01_0397__i564045343">XXX</em>:<em id="mrs_01_0397__i948966207">Kafka port number</em>,<em id="mrs_01_0397__i1852779803">XXX</em>.<em id="mrs_01_0397__i954400655">XXX</em>.<em id="mrs_01_0397__i101714588">XXX</em>.<em id="mrs_01_0397__i2035731186">XXX</em>:<em id="mrs_01_0397__i854776376">Kafka port number</em>,<em id="mrs_01_0397__i743407972">XXX</em>.<em id="mrs_01_0397__i2059898547">XXX</em>.<em id="mrs_01_0397__i1756083143">XXX</em>.<em id="mrs_01_0397__i1007989609">XXX</em>:<em id="mrs_01_0397__i593635174">Kafka port number</em>
client.sinks.kafka_sink.flumeBatchSize = 1000
client.sinks.kafka_sink.kafka.producer.type = sync
client.sinks.kafka_sink.kafka.security.protocol = SASL_PLAINTEXT
client.sinks.kafka_sink.kafka.kerberos.domain.name = <em id="mrs_01_0397__i16432172518395">Kafka domain name. This parameter is mandatory for a security cluster, for example, <strong id="mrs_01_0397__b943211252390">hadoop.xxx.com</strong>.</em>
client.sinks.kafka_sink.requiredAcks = 0
client.sources.static_log_source.channels = static_log_channel
client.sinks.kafka_sink.channel = static_log_channel</pre>
<div class="note" id="mrs_01_0397__note1927092016207"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_0397__ul1927062072014"><li id="mrs_01_0397__li122711120182014"><strong id="mrs_01_0397__b1439928183913">client.sinks.kafka_sink.kafka.topic</strong>: Topic to which data is written. If the topic does not exist in Kafka, it is automatically created by default.</li><li id="mrs_01_0397__li12718205208"><strong id="mrs_01_0397__b27451633123910">client.sinks.kafka_sink.kafka.bootstrap.servers</strong>: List of Kafka Brokers, which are separated by commas (,). By default, the port is <strong id="mrs_01_0397__b690314364391">21007</strong> for a security cluster and <strong id="mrs_01_0397__b13903173623917">9092</strong> for a normal cluster.</li><li id="mrs_01_0397__li6271192012204"><strong id="mrs_01_0397__b85364018393">client.sinks.kafka_sink.kafka.security.protocol</strong>: The value is <strong id="mrs_01_0397__b12538406393">SASL_PLAINTEXT</strong> for a security cluster and <strong id="mrs_01_0397__b35394018398">PLAINTEXT</strong> for a normal cluster.</li><li id="mrs_01_0397__li17271132010203"><strong id="mrs_01_0397__b2331194323914">client.sinks.kafka_sink.kafka.kerberos.domain.name</strong>:<p id="mrs_01_0397__p5271720102016">You do not need to set this parameter for a normal cluster. For a security cluster, the value of this parameter is the value of <strong id="mrs_01_0397__b8881165017396">kerberos.domain.name</strong> in the Kafka cluster.</p>
<p id="mrs_01_0397__p1427202012203">The <strong>properties.properties</strong> file must be saved by the user who installs the Flume client, for example, user <strong id="mrs_01_0397__b2129712476">root</strong>.</p>
</li></ul>
</div></div>
</li></ul>
</p></li><li id="mrs_01_0397__li1313611498420"><span>After the parameters are set and saved, the Flume client automatically loads the content configured in <strong id="mrs_01_0397__b177471138174013">properties.properties</strong>. When new log files appear in the directory specified by <strong>spoolDir</strong>, their content is sent to the configured Kafka topic and can be consumed by Kafka consumers. For details, see <a href="mrs_01_0379.html">Managing Messages in Kafka Topics</a>.</span></li></ol>
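<p>The file copies in steps 2, 4, and 5 can be sketched as follows. This is an illustration only: it assumes that <strong>scp</strong> is available on the cluster nodes, that the Flume client node is reachable under the hypothetical name <em>flume-client-node</em>, and that the client is installed in <strong>/opt/FlumeClient</strong> by user <strong>root</strong>. The <em>XXX</em>, <em>X</em>, and version placeholders are kept as in the steps above and must be replaced with the values found on your cluster.</p>
<pre class="screen"># Step 2, run on the Master1 node: copy the authentication server configuration file.
scp ${BIGDATA_HOME}/FusionInsight_BASE_<em>XXX</em>/1_<em>X</em>_KerberosClient/etc/kdc.conf root@<em>flume-client-node</em>:/opt/FlumeClient/fusioninsight-flume-<em>Flume component version number</em>/conf/

# Step 4, run on the node where the Flume role is deployed: copy the user authentication file.
scp ${BIGDATA_HOME}/FusionInsight_Porter_<em>XXX</em>/install/FusionInsight-Flume-<em>Flume component version number</em>/flume/conf/flume.keytab root@<em>flume-client-node</em>:/opt/FlumeClient/fusioninsight-flume-<em>Flume component version number</em>/conf/

# Step 5, run on the same node as step 4: copy the JAAS configuration file.
scp ${BIGDATA_HOME}/FusionInsight_Current/1_<em>X</em>_Flume/etc/jaas.conf root@<em>flume-client-node</em>:/opt/FlumeClient/fusioninsight-flume-<em>Flume component version number</em>/conf/</pre>
<p>After these copies, <strong>kdc.conf</strong>, <strong>flume.keytab</strong>, and <strong>jaas.conf</strong> are all in the client <strong>conf</strong> directory, which is the location that the <strong>keyTab</strong> setting and the <strong>-Djava.security.*</strong> options in the later steps point to.</p>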
</div>
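<div class="section"><h4 class="sectiontitle">Example: Checking the Data in Kafka</h4><p>One way to confirm that the job works is to read the topic with the console consumer shipped with Kafka. The following sketch assumes that a Kafka client is installed on the node where you run the command, that you have already authenticated as a user with permission to consume the topic, and that the topic is <strong>flume_test</strong> as in the example configuration. The path to <strong>consumer.properties</strong> is a placeholder, and the broker address and port must match the values used in <strong>kafka.bootstrap.servers</strong>.</p>
<pre class="screen"># Read the topic from the beginning. 21007 is the default port for a security cluster.
kafka-console-consumer.sh --bootstrap-server <em>XXX.XXX.XXX.XXX</em>:21007 --topic flume_test --from-beginning --consumer.config <em>Path to consumer.properties</em></pre>
<p>For a normal cluster, use port <strong>9092</strong> instead; the security-related settings in the consumer configuration are then not needed. If lines from newly placed log files appear in the consumer output, the source, channel, and sink are working together.</p>
</div>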
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_0390.html">Using Flume</a></div>
</div>
</div>