Yang, Tong 48706b7552 MRS COMP-LTS 320-lts.1 version
Reviewed-by: Kacur, Michal <michal.kacur@t-systems.com>
Co-authored-by: Yang, Tong <yangtong2@huawei.com>
Co-committed-by: Yang, Tong <yangtong2@huawei.com>
2024-04-12 12:51:10 +00:00


<a name="mrs_01_2392"></a>
<h1 class="topictitle1">Submitting a DistCp Job with Oozie Client</h1>
<div id="body32001227"><div class="section" id="mrs_01_2392__en-us_topic_0000001219230961_sc367440a9bee4ae9a15baede2902cc54"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_2392__en-us_topic_0000001219230961_ae05864fc43454b69888e41725935e74b">This section describes how to submit a DistCp job using the Oozie client.</p>
<div class="note" id="mrs_01_2392__en-us_topic_0000001219230961_n55f8e36fffd341878eb6390e2a633e8f"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2392__en-us_topic_0000001219230961_a5d75fccba647482fb315a28ee027d87a">You are advised to download the latest client.</p>
</div></div>
</div>
<div class="section" id="mrs_01_2392__en-us_topic_0000001219230961_s03008bf64de9424c83937093f6557919"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_2392__en-us_topic_0000001219230961_u6075e1bead804d118aeb83a52bc084ca"><li id="mrs_01_2392__en-us_topic_0000001219230961_l59e2fa74571d485bb4861d104ade63ea">The HDFS and Oozie components and clients have been installed and are running properly.<p id="mrs_01_2392__en-us_topic_0000001219230961_a809de3689b214dd0bdc79d7e35fbae9c"><a name="mrs_01_2392__en-us_topic_0000001219230961_l59e2fa74571d485bb4861d104ade63ea"></a><a name="en-us_topic_0000001219230961_l59e2fa74571d485bb4861d104ade63ea"></a>If the current client is an earlier version, you need to download and install the client again.</p>
</li><li id="mrs_01_2392__en-us_topic_0000001219230961_l8ff319393ec949d9ae45feea7e19d290">You have created or obtained the human-machine account and password for accessing the Oozie service.<div class="note" id="mrs_01_2392__en-us_topic_0000001219230961_note315568028334"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_2392__en-us_topic_0000001219230961_ul155757648334"><li id="mrs_01_2392__en-us_topic_0000001219230961_li59641568334">This user must belong to the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b12537165924011">hadoop</strong>, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b55428599407">supergroup</strong>, and <strong id="mrs_01_2392__en-us_topic_0000001219230961_b55431759124011">hive</strong> groups and be assigned the operation permission for the Oozie role.</li><li id="mrs_01_2392__en-us_topic_0000001219230961_li536774108334">This user must also be assigned at least the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b13380175114119">manager_viewer</strong> role.</li></ul>
</div></div>
</li></ul>
</div>
<ul id="mrs_01_2392__en-us_topic_0000001219230961_ul589917804210"><li id="mrs_01_2392__en-us_topic_0000001219230961_l67e29a0db3d64a01ba3c8f8a242fffe2">You have obtained the URL of the Oozie server (any instance) in the running state, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b1687199144114">https://10.1.130.10:21003/oozie</strong>.</li><li id="mrs_01_2392__en-us_topic_0000001219230961_l99b45eddd6bd47d3bb30ccde42dd5c80">You have obtained the name of the Oozie server, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b362317157416">10-1-130-10</strong>.</li><li id="mrs_01_2392__en-us_topic_0000001219230961_l33cbb5dabf1d4b61ac772fc3ec3c3e4a">You have obtained the IP address of the active Yarn ResourceManager, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b18868182011411">10.1.130.11</strong>.</li></ul>
<div class="section" id="mrs_01_2392__en-us_topic_0000001219230961_section440189144115"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_2392__en-us_topic_0000001219230961_oc881cfc28d414306af1adbbbf4a00b1b"><li id="mrs_01_2392__en-us_topic_0000001219230961_l7935c910b03545e0a5867ca4d3b2c006"><span>Log in to the node where the Oozie client is installed as the client installation user.</span></li><li id="mrs_01_2392__en-us_topic_0000001219230961_lfce2496a67db4306afc7110e3058abea"><span>Run the following command to set up the client environment variables. In the command, <span class="filepath" id="mrs_01_2392__en-us_topic_0000001219230961_f1995adbdd35741e6a3a9f62f7fb7a90b"><b>/opt/client/</b></span> indicates the client installation path.</span><p><p id="mrs_01_2392__en-us_topic_0000001219230961_a9fe0ea3f05954692816d1ed270ff6e25"><strong id="mrs_01_2392__en-us_topic_0000001219230961_ad8409f5d475a4974b1a119c7e03644bc">source /opt/client/bigdata_env</strong></p>
</p></li><li id="mrs_01_2392__en-us_topic_0000001219230961_l1875b9398af340e1844211cfd2f8272d"><span>Check the cluster authentication mode.</span><p><ul id="mrs_01_2392__en-us_topic_0000001219230961_u553e435dd8f84be8a18ff92d6d24725a"><li id="mrs_01_2392__en-us_topic_0000001219230961_ld507b00926e44383b9343a4bc2adbeb3">If the cluster is in security mode, run the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b27365575417">kinit</strong> command to authenticate users.<p id="mrs_01_2392__en-us_topic_0000001219230961_a6f9517958c83424ea524bf7781acc911">For example, the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b621215144217">oozieuser</strong> user is authenticated using the following command:</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_ae4d53a149513466187a4bf7a3651dda3"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b218349103537">kinit oozieuser</strong></p>
</li><li id="mrs_01_2392__en-us_topic_0000001219230961_l4c6e11eac05d4977a9b756054fd46659">If the cluster is in normal mode, go to <a href="#mrs_01_2392__en-us_topic_0000001219230961_lcc62479277a945d99a305c3a8402a40d">4</a>.</li></ul>
</p></li><li id="mrs_01_2392__en-us_topic_0000001219230961_lcc62479277a945d99a305c3a8402a40d"><a name="mrs_01_2392__en-us_topic_0000001219230961_lcc62479277a945d99a305c3a8402a40d"></a><a name="en-us_topic_0000001219230961_lcc62479277a945d99a305c3a8402a40d"></a><span>Run the following command to go to the example directory:</span><p><p id="mrs_01_2392__en-us_topic_0000001219230961_a6a361a015c424467b72a12dcbb292aaf"><strong id="mrs_01_2392__en-us_topic_0000001219230961_a2ec98494ecb742d3aa186b2d18928c32">cd /opt/client/Oozie/oozie-client-*/examples/apps/distcp/</strong></p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_aa736f6a3b96b40c0b60acd1d8626c64f"><a href="#mrs_01_2392__en-us_topic_0000001219230961_ta1113e97d79c4106a91f5e20da6899e0">Table 1</a> lists the key files in the directory.</p>
<div class="tablenoborder"><a name="mrs_01_2392__en-us_topic_0000001219230961_ta1113e97d79c4106a91f5e20da6899e0"></a><a name="en-us_topic_0000001219230961_ta1113e97d79c4106a91f5e20da6899e0"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_2392__en-us_topic_0000001219230961_ta1113e97d79c4106a91f5e20da6899e0" frame="border" border="1" rules="all"><caption><b>Table 1 </b>File description</caption><thead align="left"><tr id="mrs_01_2392__en-us_topic_0000001219230961_r4aa1a6ae8d4346328f92540884cca876"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.2.4.2.3.2.3.1.1"><p id="mrs_01_2392__en-us_topic_0000001219230961_a16a9e33c6cbd4b898c5205ffd33d6036">File</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.2.4.2.3.2.3.1.2"><p id="mrs_01_2392__en-us_topic_0000001219230961_ab8b5f25780904c488a54fa26e79a8446">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_2392__en-us_topic_0000001219230961_rdb5a7205478d49d7a5ff58bb41798174"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.1 "><p id="mrs_01_2392__en-us_topic_0000001219230961_afb57a440373e48b8b06b50fa052385c7">job.properties</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.2 "><p id="mrs_01_2392__en-us_topic_0000001219230961_aa99f6c5fc08a4b558f4f5b658f73af02">Parameter definition file of a workflow</p>
</td>
</tr>
<tr id="mrs_01_2392__en-us_topic_0000001219230961_r280e2c0ce768422d970506fb7530754e"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.1 "><p id="mrs_01_2392__en-us_topic_0000001219230961_a11a9d0dfd7c34ae2bbef3a5513508fa0">workflow.xml</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.2 "><p id="mrs_01_2392__en-us_topic_0000001219230961_a6b5bf5dcae684e0c878e0c5c4afa0832">Rule definition file of a workflow</p>
</td>
</tr>
</tbody>
</table>
</div>
</p></li><li id="mrs_01_2392__en-us_topic_0000001219230961_lc81b755786704225844019d5c86d4966"><span>Run the following command to edit the <span class="filepath" id="mrs_01_2392__en-us_topic_0000001219230961_f10654170ccf1449792b1ffc3ab2861cf"><b>job.properties</b></span> file:</span><p><p id="mrs_01_2392__en-us_topic_0000001219230961_aeb6f21eba3514b47a2fa841358454342"><strong id="mrs_01_2392__en-us_topic_0000001219230961_aba9fb7bb776346d8a3aeeeccee1188bb">vi job.properties</strong></p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_a8f8cfb46c2c24c45b7a75afcc79efba6">Perform the following modifications:</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p934881516612">Change the value of <strong id="mrs_01_2392__en-us_topic_0000001219230961_b5638131924313">userName</strong> to the name of the human-machine user who submits the job, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b187325334431">userName=oozieuser</strong>.</p>
</p></li><li id="mrs_01_2392__en-us_topic_0000001219230961_li99573433720"><span>If the DistCp job does not run across clusters in security mode, go to <a href="#mrs_01_2392__en-us_topic_0000001219230961_lcabd43b3fb314898bc19ded29c90a2b3">9</a>. Otherwise, go to <a href="#mrs_01_2392__en-us_topic_0000001219230961_li57541420123918">7</a>.</span></li><li id="mrs_01_2392__en-us_topic_0000001219230961_li57541420123918"><a name="mrs_01_2392__en-us_topic_0000001219230961_li57541420123918"></a><a name="en-us_topic_0000001219230961_li57541420123918"></a><span>Establish cross-Manager mutual trust between the two clusters.</span></li><li id="mrs_01_2392__en-us_topic_0000001219230961_li15561103713183"><span>Run the following commands to back up and modify the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b1782718619453">workflow.xml</strong> file:</span><p><p id="mrs_01_2392__en-us_topic_0000001219230961_p9958613132012"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b20125123211202">cp workflow.xml workflow.xml.bak</strong></p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p118059188193"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b13126832122013">vi workflow.xml</strong></p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p24567342208">Modify the following content:</p>
<pre class="screen" id="mrs_01_2392__en-us_topic_0000001219230961_screen19550552163312">&lt;workflow-app xmlns="uri:oozie:workflow:1.0" name="distcp-wf"&gt;
    &lt;start to="distcp-node"/&gt;
    &lt;action name="distcp-node"&gt;
        &lt;distcp xmlns="uri:oozie:distcp-action:1.0"&gt;
            &lt;resource-manager&gt;${resourceManager}&lt;/resource-manager&gt;
            &lt;name-node&gt;${nameNode}&lt;/name-node&gt;
            &lt;prepare&gt;
                &lt;delete path="hdfs://<strong id="mrs_01_2392__en-us_topic_0000001219230961_b565134011916">target_ip:target_port</strong>/user/${userName}/${examplesRoot}/output-data/${outputDir}"/&gt;
            &lt;/prepare&gt;
            &lt;configuration&gt;
                &lt;property&gt;
                    &lt;name&gt;mapred.job.queue.name&lt;/name&gt;
                    &lt;value&gt;${queueName}&lt;/value&gt;
                &lt;/property&gt;
                &lt;property&gt;
                    &lt;name&gt;oozie.launcher.mapreduce.job.hdfs-servers&lt;/name&gt;
                    &lt;value&gt;hdfs://<strong id="mrs_01_2392__en-us_topic_0000001219230961_b198406192012">source_ip:source_port</strong>,hdfs://target_ip:target_port&lt;/value&gt;
                &lt;/property&gt;
            &lt;/configuration&gt;
            &lt;arg&gt;${nameNode}/user/${userName}/${examplesRoot}/input-data/text/data.txt&lt;/arg&gt;
            &lt;arg&gt;hdfs://target_ip:target_port/user/${userName}/${examplesRoot}/output-data/${outputDir}/data.txt&lt;/arg&gt;
        &lt;/distcp&gt;
        &lt;ok to="end"/&gt;
        &lt;error to="fail"/&gt;
    &lt;/action&gt;
    &lt;kill name="fail"&gt;
        &lt;message&gt;DistCP failed, error message[${wf:errorMessage(wf:lastErrorNode())}]&lt;/message&gt;
    &lt;/kill&gt;
    &lt;end name="end"/&gt;
&lt;/workflow-app&gt;</pre>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p188674216218"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b1875910372454">target_ip:target_port</strong> indicates the active HDFS NameNode address of the other trusted cluster, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b285205254513">10.10.10.233:25000</strong>.</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p1028213313355"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b2892101019463">source_ip:source_port</strong> indicates the active HDFS NameNode address of the source cluster, for example, <strong id="mrs_01_2392__en-us_topic_0000001219230961_b1864041954619">10.10.10.223:25000</strong>.</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_p374916116378">Replace both placeholders with the actual IP addresses and port numbers of your clusters.</p>
</p></li><li id="mrs_01_2392__en-us_topic_0000001219230961_lcabd43b3fb314898bc19ded29c90a2b3"><a name="mrs_01_2392__en-us_topic_0000001219230961_lcabd43b3fb314898bc19ded29c90a2b3"></a><a name="en-us_topic_0000001219230961_lcabd43b3fb314898bc19ded29c90a2b3"></a><span>Run the <strong id="mrs_01_2392__en-us_topic_0000001219230961_b1540422619469">oozie job</strong> command to run the workflow file:</span><p><p id="mrs_01_2392__en-us_topic_0000001219230961_p54529342154411"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b86452293465">oozie job -oozie https://</strong><em id="mrs_01_2392__en-us_topic_0000001219230961_i064682974614">Host name of the Oozie role</em><strong id="mrs_01_2392__en-us_topic_0000001219230961_b46461729124613">:21003/oozie/ -config job.properties -run</strong></p>
<div class="note" id="mrs_01_2392__en-us_topic_0000001219230961_nab1e7bedb6514357b2d857cdd437f4c6"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_2392__en-us_topic_0000001219230961_u505e120ab7fc4d7dada573bbfaf0fb3a"><li id="mrs_01_2392__en-us_topic_0000001219230961_lb21c9804740942e0b20f979562530dfc">The command parameters are described as follows:<p id="mrs_01_2392__en-us_topic_0000001219230961_add0178af9cb74a7c8bc43ccde17e8056"><a name="mrs_01_2392__en-us_topic_0000001219230961_lb21c9804740942e0b20f979562530dfc"></a><a name="en-us_topic_0000001219230961_lb21c9804740942e0b20f979562530dfc"></a><strong id="mrs_01_2392__en-us_topic_0000001219230961_b13124939144615">-oozie</strong>: URL of the Oozie server that executes the job</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_ac71440e8b53147bcb42b441a9d6f02cf"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b111018443465">-config</strong>: Workflow property file</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_a80ac946821e648c8b67f4d19e7a48dc0"><strong id="mrs_01_2392__en-us_topic_0000001219230961_b123914612464">-run</strong>: Runs the workflow</p>
</li><li id="mrs_01_2392__en-us_topic_0000001219230961_l8c662c9385fb4673b33ebe1261d2f945">If a job ID, for example, <span class="parmvalue" id="mrs_01_2392__en-us_topic_0000001219230961_parmvalue19188104312445"><b>job: 0000021-140222101051722-oozie-omm-W</b></span>, is displayed after the command is run, the job has been submitted successfully. You can view the execution results on the Oozie management page.<p id="mrs_01_2392__en-us_topic_0000001219230961_a980a1632d7b44c7a819da7bd820778e9">Log in to the Oozie web UI at <strong id="mrs_01_2392__en-us_topic_0000001219230961_b3743774711">https://</strong><em id="mrs_01_2392__en-us_topic_0000001219230961_i197957114710">IP address of the Oozie role</em><strong id="mrs_01_2392__en-us_topic_0000001219230961_b1791476475">:21003/oozie</strong> as user <strong id="mrs_01_2392__en-us_topic_0000001219230961_b280147124719">oozieuser</strong>.</p>
<p id="mrs_01_2392__en-us_topic_0000001219230961_afb508a57fb504357a1e2a1483d95a2e0">On the Oozie web UI, locate the submitted workflow by its job ID in the table on the page to view its information.</p>
</li></ul>
</div></div>
</p></li></ol>
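<p>The edits in steps 5 and 8 can also be scripted instead of being made in <strong>vi</strong>. The following is a minimal sketch, not part of the client tooling. The function names <strong>set_user_name</strong> and <strong>fill_cluster_addresses</strong> are hypothetical. The sketch assumes that <strong>job.properties</strong> contains a line starting with <strong>userName=</strong>, and that <strong>workflow.xml</strong> already contains the literal placeholders <strong>source_ip:source_port</strong> and <strong>target_ip:target_port</strong> shown in step 8.</p>

```shell
#!/bin/sh
# Hypothetical helpers; names and behavior are illustrative assumptions,
# not part of the Oozie client.

# set_user_name FILE USER
# Rewrites the userName= line of a properties file (step 5).
set_user_name() {
    file=$1
    user=$2
    tmp="$file.tmp.$$"
    # Replace only lines that start with "userName="; copy all others unchanged.
    sed "s|^userName=.*|userName=$user|" "$file" > "$tmp" || return 1
    mv "$tmp" "$file"
}

# fill_cluster_addresses FILE SOURCE_ADDR TARGET_ADDR
# Backs up FILE to FILE.bak, then replaces both placeholders (step 8).
# Assumes FILE still contains the literal placeholder strings.
fill_cluster_addresses() {
    file=$1
    source_addr=$2
    target_addr=$3
    tmp="$file.tmp.$$"
    cp "$file" "$file.bak" || return 1
    sed -e "s|source_ip:source_port|$source_addr|g" \
        -e "s|target_ip:target_port|$target_addr|g" \
        "$file" > "$tmp" || return 1
    mv "$tmp" "$file"
}
```

<p>For example, running <strong>set_user_name job.properties oozieuser</strong> and then <strong>fill_cluster_addresses workflow.xml 10.10.10.223:25000 10.10.10.233:25000</strong> applies the example values used in this section. Values that contain the <strong>|</strong> character would need a different <strong>sed</strong> delimiter.</p>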
</div>
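<p>The job ID printed in step 9 can be captured and used to query the job from the command line as well as from the web UI. The following is a minimal sketch, assuming that the submit command prints a line of the form <strong>job: ID</strong> as in the example above. The function names <strong>extract_job_id</strong> and <strong>show_job_info</strong> and the variable <strong>OOZIE_URL</strong> are hypothetical. The <strong>-info</strong> option is the Oozie command-line option for querying a job by its ID.</p>

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative assumptions.

# extract_job_id
# Reads the submit output on stdin and prints the first job ID found.
# Assumes the output contains a line of the form "job: ID".
extract_job_id() {
    sed -n 's/^job:[[:space:]]*\([^[:space:]]*\).*$/\1/p' | head -n 1
}

# show_job_info OOZIE_URL JOB_ID
# Queries the job with the Oozie CLI "job -info" option.
show_job_info() {
    oozie job -oozie "$1" -info "$2"
}

# Example usage (requires a configured Oozie client; not run here):
#   job_id=$(oozie job -oozie "$OOZIE_URL" -config job.properties -run | extract_job_id)
#   show_job_info "$OOZIE_URL" "$job_id"
```

<p>Keeping the parsing in its own function makes it possible to check the extraction against sample output without contacting the Oozie server.</p>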
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1812.html">Using Oozie Client to Submit an Oozie Job</a></div>
</div>
</div>