Reviewed-by: Hasko, Vladimir <vladimir.hasko@t-systems.com> Co-authored-by: Yang, Tong <yangtong2@huawei.com> Co-committed-by: Yang, Tong <yangtong2@huawei.com>
<a name="mrs_01_2392"></a><a name="mrs_01_2392"></a>
<h1 class="topictitle1">Submitting a DistCp Job</h1>
<div id="body0000001124440649"><div class="section" id="mrs_01_2392__sc367440a9bee4ae9a15baede2902cc54"><h4 class="sectiontitle">Scenario</h4><p id="mrs_01_2392__ae05864fc43454b69888e41725935e74b">This section describes how to submit a DistCp job using the Oozie client.</p>
<div class="note" id="mrs_01_2392__n55f8e36fffd341878eb6390e2a633e8f"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><p id="mrs_01_2392__a5d75fccba647482fb315a28ee027d87a">You are advised to download the latest client.</p>
</div></div>
</div>
<div class="section" id="mrs_01_2392__s03008bf64de9424c83937093f6557919"><h4 class="sectiontitle">Prerequisites</h4><ul id="mrs_01_2392__u6075e1bead804d118aeb83a52bc084ca"><li id="mrs_01_2392__l59e2fa74571d485bb4861d104ade63ea">The HDFS and Oozie components and clients have been installed and are running properly.<p id="mrs_01_2392__a809de3689b214dd0bdc79d7e35fbae9c"><a name="mrs_01_2392__l59e2fa74571d485bb4861d104ade63ea"></a><a name="l59e2fa74571d485bb4861d104ade63ea"></a>If the current client is an earlier version, you need to download and install the client again.</p>
</li><li id="mrs_01_2392__l8ff319393ec949d9ae45feea7e19d290">You have created or obtained a human-machine account and its password for accessing the Oozie service.<div class="note" id="mrs_01_2392__note315568028334"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_2392__ul155757648334"><li id="mrs_01_2392__li59641568334">This user must belong to the <strong id="mrs_01_2392__b12537165924011">hadoop</strong>, <strong id="mrs_01_2392__b55428599407">supergroup</strong>, and <strong id="mrs_01_2392__b55431759124011">hive</strong> groups and be assigned the Oozie role operation permission. If the multi-instance function is enabled for Hive, the user must belong to a specific Hive instance group, for example, <strong id="mrs_01_2392__b5621264117">hive3</strong>.</li><li id="mrs_01_2392__li536774108334">This user must also be assigned at least the <strong id="mrs_01_2392__b13380175114119">manager_viewer</strong> role.</li></ul>
</div></div>
</li></ul>
</div>
<ul id="mrs_01_2392__ul589917804210"><li id="mrs_01_2392__l67e29a0db3d64a01ba3c8f8a242fffe2">You have obtained the URL of the Oozie server (any instance) in the running state, for example, <strong id="mrs_01_2392__b1687199144114">https://10.1.130.10:21003/oozie</strong>.</li><li id="mrs_01_2392__l99b45eddd6bd47d3bb30ccde42dd5c80">You have obtained the name of the Oozie server, for example, <strong id="mrs_01_2392__b362317157416">10-1-130-10</strong>.</li><li id="mrs_01_2392__l33cbb5dabf1d4b61ac772fc3ec3c3e4a">You have obtained the IP address of the active Yarn ResourceManager, for example, <strong id="mrs_01_2392__b18868182011411">10.1.130.11</strong>.</li></ul>
<div class="section" id="mrs_01_2392__section440189144115"><h4 class="sectiontitle">Procedure</h4><ol id="mrs_01_2392__oc881cfc28d414306af1adbbbf4a00b1b"><li id="mrs_01_2392__l7935c910b03545e0a5867ca4d3b2c006"><span>Log in to the node where the Oozie client is installed as the client installation user.</span></li><li id="mrs_01_2392__lfce2496a67db4306afc7110e3058abea"><span>Run the following command to configure environment variables. <span class="filepath" id="mrs_01_2392__filepath533613231948"><b>/opt/client/</b></span> is an example client installation path.</span><p><p id="mrs_01_2392__a9fe0ea3f05954692816d1ed270ff6e25"><strong id="mrs_01_2392__ad8409f5d475a4974b1a119c7e03644bc">source /opt/client/bigdata_env</strong></p>
</p></li><li id="mrs_01_2392__l1875b9398af340e1844211cfd2f8272d"><span>Check the cluster authentication mode.</span><p><ul id="mrs_01_2392__u553e435dd8f84be8a18ff92d6d24725a"><li id="mrs_01_2392__ld507b00926e44383b9343a4bc2adbeb3">If the cluster is in security mode, run the <strong id="mrs_01_2392__b27365575417">kinit</strong> command to authenticate the user.<p id="mrs_01_2392__a6f9517958c83424ea524bf7781acc911">For example, to authenticate as user <strong id="mrs_01_2392__b621215144217">oozieuser</strong>, run the following command:</p>
<p id="mrs_01_2392__ae4d53a149513466187a4bf7a3651dda3"><strong id="mrs_01_2392__b218349103537">kinit oozieuser</strong></p>
</li><li id="mrs_01_2392__l4c6e11eac05d4977a9b756054fd46659">If the cluster is in normal mode, go to <a href="#mrs_01_2392__lcc62479277a945d99a305c3a8402a40d">4</a>.</li></ul>
</p></li><li id="mrs_01_2392__lcc62479277a945d99a305c3a8402a40d"><a name="mrs_01_2392__lcc62479277a945d99a305c3a8402a40d"></a><a name="lcc62479277a945d99a305c3a8402a40d"></a><span>Run the following command to go to the example directory:</span><p><p id="mrs_01_2392__a6a361a015c424467b72a12dcbb292aaf"><strong id="mrs_01_2392__a2ec98494ecb742d3aa186b2d18928c32">cd /opt/client/Oozie/oozie-client-*/examples/apps/distcp/</strong></p>
<p id="mrs_01_2392__aa736f6a3b96b40c0b60acd1d8626c64f"><a href="#mrs_01_2392__ta1113e97d79c4106a91f5e20da6899e0">Table 1</a> describes the key files in the directory.</p>
<div class="tablenoborder"><a name="mrs_01_2392__ta1113e97d79c4106a91f5e20da6899e0"></a><a name="ta1113e97d79c4106a91f5e20da6899e0"></a><table cellpadding="4" cellspacing="0" summary="" id="mrs_01_2392__ta1113e97d79c4106a91f5e20da6899e0" frame="border" border="1" rules="all"><caption><b>Table 1 </b>File description</caption><thead align="left"><tr id="mrs_01_2392__r4aa1a6ae8d4346328f92540884cca876"><th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.2.4.2.3.2.3.1.1"><p id="mrs_01_2392__a16a9e33c6cbd4b898c5205ffd33d6036">File</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="50%" id="mcps1.3.4.2.4.2.3.2.3.1.2"><p id="mrs_01_2392__ab8b5f25780904c488a54fa26e79a8446">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="mrs_01_2392__rdb5a7205478d49d7a5ff58bb41798174"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.1 "><p id="mrs_01_2392__afb57a440373e48b8b06b50fa052385c7">job.properties</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.2 "><p id="mrs_01_2392__aa99f6c5fc08a4b558f4f5b658f73af02">Parameter definition file of a workflow</p>
</td>
</tr>
<tr id="mrs_01_2392__r280e2c0ce768422d970506fb7530754e"><td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.1 "><p id="mrs_01_2392__a11a9d0dfd7c34ae2bbef3a5513508fa0">workflow.xml</p>
</td>
<td class="cellrowborder" valign="top" width="50%" headers="mcps1.3.4.2.4.2.3.2.3.1.2 "><p id="mrs_01_2392__a6b5bf5dcae684e0c878e0c5c4afa0832">Rule definition file of a workflow</p>
</td>
</tr>
</tbody>
</table>
</div>
</p></li><li id="mrs_01_2392__lc81b755786704225844019d5c86d4966"><span>Run the following command to edit the <span class="filepath" id="mrs_01_2392__f10654170ccf1449792b1ffc3ab2861cf"><b>job.properties</b></span> file:</span><p><p id="mrs_01_2392__aeb6f21eba3514b47a2fa841358454342"><strong id="mrs_01_2392__aba9fb7bb776346d8a3aeeeccee1188bb">vi job.properties</strong></p>
<p id="mrs_01_2392__a8f8cfb46c2c24c45b7a75afcc79efba6">Make the following change:</p>
<p id="mrs_01_2392__p934881516612">Change the value of <strong id="mrs_01_2392__b5638131924313">userName</strong> to the name of the human-machine user who submits the job, for example, <strong id="mrs_01_2392__b187325334431">userName=oozieuser</strong>.</p>
</p></li><li id="mrs_01_2392__li99573433720"><span>Check whether DistCp is used across security clusters.</span><p><ul id="mrs_01_2392__ul1358116157274"><li id="mrs_01_2392__li258120156277">If yes, go to <a href="#mrs_01_2392__li57541420123918">7</a>.</li><li id="mrs_01_2392__li5581315132715">If no, go to <a href="#mrs_01_2392__lcabd43b3fb314898bc19ded29c90a2b3">9</a>.</li></ul>
</p></li><li id="mrs_01_2392__li57541420123918"><a name="mrs_01_2392__li57541420123918"></a><a name="li57541420123918"></a><span>Establish cross-Manager mutual trust between the two clusters.</span></li><li id="mrs_01_2392__li15561103713183"><span>Run the following commands to back up and then modify the <strong id="mrs_01_2392__b1782718619453">workflow.xml</strong> file:</span><p><p id="mrs_01_2392__p9958613132012"><strong id="mrs_01_2392__b20125123211202">cp workflow.xml workflow.xml.bak</strong></p>
<p id="mrs_01_2392__p118059188193"><strong id="mrs_01_2392__b13126832122013">vi workflow.xml</strong></p>
<p id="mrs_01_2392__p24567342208">Modify the following content:</p>
<pre class="screen" id="mrs_01_2392__screen19550552163312"><workflow-app xmlns="uri:oozie:workflow:1.0" name="distcp-wf">
    <start to="distcp-node"/>
    <action name="distcp-node">
        <distcp xmlns="uri:oozie:distcp-action:1.0">
            <resource-manager>${resourceManager}</resource-manager>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="hdfs://<strong id="mrs_01_2392__b565134011916">target_ip:target_port</strong>/user/${userName}/${examplesRoot}/output-data/${outputDir}"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.job.hdfs-servers</name>
                    <value>hdfs://<strong id="mrs_01_2392__b198406192012">source_ip:source_port</strong>,hdfs://target_ip:target_port</value>
                </property>
            </configuration>
            <arg>${nameNode}/user/${userName}/${examplesRoot}/input-data/text/data.txt</arg>
            <arg>hdfs://target_ip:target_port/user/${userName}/${examplesRoot}/output-data/${outputDir}/data.txt</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>DistCP failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app></pre>
<p id="mrs_01_2392__p188674216218"><strong id="mrs_01_2392__b1875910372454">target_ip:target_port</strong> is the address of the active HDFS NameNode in the other trusted cluster (the target cluster), for example, <strong id="mrs_01_2392__b285205254513">10.10.10.233:25000</strong>.</p>
<p id="mrs_01_2392__p1028213313355"><strong id="mrs_01_2392__b2892101019463">source_ip:source_port</strong> is the address of the active HDFS NameNode in the source cluster, for example, <strong id="mrs_01_2392__b1864041954619">10.10.10.223:25000</strong>.</p>
<p id="mrs_01_2392__p374916116378">Replace both IP addresses and port numbers with the values for your site.</p>
</p></li><li id="mrs_01_2392__lcabd43b3fb314898bc19ded29c90a2b3"><a name="mrs_01_2392__lcabd43b3fb314898bc19ded29c90a2b3"></a><a name="lcabd43b3fb314898bc19ded29c90a2b3"></a><span>Run the <strong id="mrs_01_2392__b1540422619469">oozie job</strong> command to run the workflow file:</span><p><p id="mrs_01_2392__p54529342154411"><strong id="mrs_01_2392__b86452293465">oozie job -oozie https://</strong><em id="mrs_01_2392__i064682974614">Host name of the Oozie role</em><strong id="mrs_01_2392__b46461729124613">:21003/oozie/ -config job.properties -run</strong></p>
<div class="note" id="mrs_01_2392__nab1e7bedb6514357b2d857cdd437f4c6"><img src="public_sys-resources/note_3.0-en-us.png"><span class="notetitle"> </span><div class="notebody"><ul id="mrs_01_2392__u505e120ab7fc4d7dada573bbfaf0fb3a"><li id="mrs_01_2392__lb21c9804740942e0b20f979562530dfc">The command parameters are described as follows:<p id="mrs_01_2392__add0178af9cb74a7c8bc43ccde17e8056"><a name="mrs_01_2392__lb21c9804740942e0b20f979562530dfc"></a><a name="lb21c9804740942e0b20f979562530dfc"></a><strong id="mrs_01_2392__b13124939144615">-oozie</strong> URL of the Oozie server that executes a job</p>
<p id="mrs_01_2392__ac71440e8b53147bcb42b441a9d6f02cf"><strong id="mrs_01_2392__b111018443465">-config</strong> Workflow property file</p>
<p id="mrs_01_2392__a80ac946821e648c8b67f4d19e7a48dc0"><strong id="mrs_01_2392__b123914612464">-run</strong> Runs the workflow</p>
</li><li id="mrs_01_2392__l8c662c9385fb4673b33ebe1261d2f945">If a job ID, for example, <span class="parmvalue" id="mrs_01_2392__parmvalue19188104312445"><b>job: 0000021-140222101051722-oozie-omm-W</b></span>, is displayed after the command is run, the job has been submitted successfully. You can view the execution results on the Oozie web UI.<p id="mrs_01_2392__a980a1632d7b44c7a819da7bd820778e9">Log in to the Oozie web UI at <strong id="mrs_01_2392__b3743774711">https://</strong><em id="mrs_01_2392__i197957114710">IP address of the Oozie role</em><strong id="mrs_01_2392__b1791476475">:21003/oozie</strong> as user <strong id="mrs_01_2392__b280147124719">oozieuser</strong>.</p>
<p id="mrs_01_2392__afb508a57fb504357a1e2a1483d95a2e0">On the Oozie web UI, locate the submitted workflow by its job ID in the table on the page to view its information.</p>
</li></ul>
</div></div>
</p></li></ol>
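<p>The environment and authentication steps above can also be scripted. The following is a minimal sketch, not part of the product: the function names <strong>load_client_env</strong> and <strong>ensure_ticket</strong> and the <strong>CLIENT_DIR</strong> variable are hypothetical, and it assumes that the <strong>klist</strong> on the client path supports the <strong>-s</strong> (silent check) option as MIT Kerberos does.</p>

```shell
# Sketch only: helper names and CLIENT_DIR are illustrative assumptions.
# /opt/client is the example installation path used in the procedure.
CLIENT_DIR="${CLIENT_DIR:-/opt/client}"

# Load the client environment variables (equivalent to: source /opt/client/bigdata_env).
load_client_env() {
  . "${CLIENT_DIR}/bigdata_env"
}

# In security mode, run kinit only when no valid ticket is cached.
# "klist -s" prints nothing and returns non-zero if the cache is missing or expired.
ensure_ticket() {
  if ! klist -s; then
    kinit "$1"
  fi
}
```

<p>In a security-mode cluster you might call <strong>load_client_env</strong> followed by <strong>ensure_ticket oozieuser</strong>; in normal mode only <strong>load_client_env</strong> is needed.</p>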
</div>
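<p>If you prefer not to edit the files interactively with <strong>vi</strong>, the changes to <strong>job.properties</strong> and <strong>workflow.xml</strong> described in the procedure could be applied with <strong>sed</strong>. This is a sketch under stated assumptions: the function names are hypothetical, <strong>sed -i</strong> is assumed to behave as in GNU sed, <strong>job.properties</strong> is assumed to already contain a <strong>userName=</strong> line, and <strong>workflow.xml</strong> is assumed to contain the literal placeholders <strong>source_ip:source_port</strong> and <strong>target_ip:target_port</strong> as shown in the listing.</p>

```shell
# Sketch only: function names are illustrative; assumes GNU-style "sed -i".

# Replace the existing userName= line in a job.properties file.
set_user_name() {
  sed -i "s|^userName=.*|userName=$2|" "$1"
}

# Replace the literal NameNode placeholders in a workflow.xml file.
#   $1 = file, $2 = source ip:port, $3 = target ip:port
set_namenodes() {
  sed -i \
    -e "s|source_ip:source_port|$2|g" \
    -e "s|target_ip:target_port|$3|g" \
    "$1"
}
```

<p>For example, after backing up <strong>workflow.xml</strong>, you could run <strong>set_user_name job.properties oozieuser</strong> and <strong>set_namenodes workflow.xml 10.10.10.223:25000 10.10.10.233:25000</strong>, then review both files before submitting the job.</p>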
</div>
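<p>The submission step can be wrapped so that the job ID is captured for later queries. The sketch below is illustrative only: the function names are hypothetical, port <strong>21003</strong> is taken from the examples in this section, and the output is assumed to contain a line of the form <strong>job: </strong><em>job ID</em> as described above. It does not check whether the <strong>oozie</strong> command itself failed.</p>

```shell
# Sketch only: function names are illustrative assumptions.

# Build the Oozie server URL from the host name of the Oozie role.
oozie_url() {
  printf 'https://%s:21003/oozie/\n' "$1"
}

# Print the job ID from output lines of the form "job: <id>".
extract_job_id() {
  sed -n 's/^job: *//p'
}

# Submit the workflow from the current directory and print the job ID.
submit_distcp() {
  oozie job -oozie "$(oozie_url "$1")" -config job.properties -run | extract_job_id
}
```

<p>For example, <strong>job_id="$(submit_distcp 10-1-130-10)"</strong> stores the ID, and <strong>oozie job -oozie "$(oozie_url 10-1-130-10)" -info "$job_id"</strong> then shows the job status from the command line as an alternative to the web UI.</p>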
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="mrs_01_1812.html">Using Oozie Client to Submit an Oozie Job</a></div>
</div>
</div>