doc-exports/docs/dataartsstudio/api-ref/dataartsstudio_02_0298.html
Xiong, Chen Xiao 14a6d65e8c DataArts API 20240130 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Xiong, Chen Xiao <chenxiaoxiong@huawei.com>
Co-committed-by: Xiong, Chen Xiao <chenxiaoxiong@huawei.com>
2024-03-01 11:46:15 +00:00

<a name="dataartsstudio_02_0298"></a>
<h1 class="topictitle1">To HDFS</h1>
<div id="body8662426"><div class="section" id="dataartsstudio_02_0298__en-us_topic_0108272844_section33401108172339"><h4 class="sectiontitle">Sample JSON File</h4><pre class="screen" id="dataartsstudio_02_0298__en-us_topic_0108272844_screen18184332112558">"to-config-values": {
"configs": [
{
"inputs": [
{
"name": "toJobConfig.outputDirectory",
"value": "/hdfsto"
},
{
"name": "toJobConfig.outputFormat",
"value": "BINARY_FILE"
},
{
"name": "toJobConfig.writeToTempFile",
"value": "false"
},
{
"name": "toJobConfig.duplicateFileOpType",
"value": "REPLACE"
},
{
"name": "toJobConfig.compression",
"value": "NONE"
},
{
"name": "toJobConfig.appendMode",
"value": "true"
}
],
"name": "toJobConfig"
}
]
}</pre>
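<p>The sample above transfers binary files as they are. As a hedged sketch of a CSV write, the following variant sets <b>toJobConfig.outputFormat</b> to <b>CSV_FILE</b> and adds <b>toJobConfig.fieldSeparator</b>, which is valid only for that format. The directory <b>/csv_out</b>, the <b>;</b> delimiter, and the <b>SKIP</b> and <b>GZIP</b> choices are illustrative assumptions, not recommended values. <b>toJobConfig.lineSeparator</b> is omitted, so its default applies.</p>
<pre class="screen">"to-config-values": {
"configs": [
{
"inputs": [
{
"name": "toJobConfig.outputDirectory",
"value": "/csv_out"
},
{
"name": "toJobConfig.outputFormat",
"value": "CSV_FILE"
},
{
"name": "toJobConfig.fieldSeparator",
"value": ";"
},
{
"name": "toJobConfig.duplicateFileOpType",
"value": "SKIP"
},
{
"name": "toJobConfig.compression",
"value": "GZIP"
},
{
"name": "toJobConfig.appendMode",
"value": "true"
}
],
"name": "toJobConfig"
}
]
}</pre>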
</div>
<div class="section" id="dataartsstudio_02_0298__en-us_topic_0108272844_section64036202102948"><h4 class="sectiontitle">Parameter Description</h4>
<div class="tablenoborder"><table cellpadding="4" cellspacing="0" summary="" id="dataartsstudio_02_0298__en-us_topic_0108272844_table56137241103023" frame="border" border="1" rules="all"><thead align="left"><tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row52082877103023"><th align="left" class="cellrowborder" valign="top" width="22.657734226577343%" id="mcps1.3.2.2.1.5.1.1"><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p57963538103023">Parameter</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="19.408059194080593%" id="mcps1.3.2.2.1.5.1.2"><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p64535024103023">Mandatory</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="16.858314168583142%" id="mcps1.3.2.2.1.5.1.3"><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p59954430103023">Type</p>
</th>
<th align="left" class="cellrowborder" valign="top" width="41.075892410758925%" id="mcps1.3.2.2.1.5.1.4"><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p24470628103023">Description</p>
</th>
</tr>
</thead>
<tbody><tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row35963851103023"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p27390810103023">toJobConfig.outputDirectory</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p4063162103023">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p60680723103023">String</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p16191500103023">Path to which data is written. For example, <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmvalue49709579111924"><b>/data_dir</b></span>.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row36443085103023"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p66208767103023">toJobConfig.outputFormat</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p61309897103023">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p45780103023">Enumeration</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><div class="p" id="dataartsstudio_02_0298__en-us_topic_0108272844_p66316516111924">File format in which data is written. The following file formats are supported:<ul id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_ul6294852210148"><li id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_li5932618510148"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_b3100712193118">CSV_FILE</strong>: Write data in CSV format.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_li23398872101411"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_b623610203312">BINARY_FILE</strong>: Transfer files directly without parsing their content. CDM writes each file without changing its format.</li></ul>
</div>
<p id="dataartsstudio_02_0298__en-us_topic_0108272844_p35897407111924">If you select <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmvalue875914221240"><b>BINARY_FILE</b></span>, the migration source must also be a file system.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row14202830103433"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p33526379103458">toJobConfig.lineSeparator</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p31282148103458">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p50826021103458">String</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p4596973311298">Line feed character. This parameter is valid only when <span class="parmname" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmname16832101781710"><b>toJobConfig.outputFormat</b></span> is <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmvalue161635236514106"><b>CSV_FILE</b></span>. The default value is <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmvalue1798040172141020_3"><b>\r\n</b></span>.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row51268928103438"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p41226208103458">toJobConfig.fieldSeparator</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p50988541103458">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p36431138103458">String</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p4984816111298">Column delimiter. This parameter is valid only when <span class="parmname" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmname11724115615167"><b>toJobConfig.outputFormat</b></span> is <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmvalue911113917141932"><b>CSV_FILE</b></span>. The default value is <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_parmvalue1798040172141020_1"><b>,</b></span>.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row165374963415"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p19531149103416">toJobConfig.writeToTempFile</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p145319497341">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p8182172215354">Boolean</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p169603823818">Whether to write the binary file to a <span class="uicontrol" id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272836_uicontrol575920575265"><b>.tmp</b></span> file first. After the migration succeeds, run the <strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272836_b1675975719261">rename</strong> or <strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272836_b12759145722617">move</strong> command at the migration destination to restore the file.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row54756901103023"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p6123978103023">toJobConfig.duplicateFileOpType</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p26280224103023">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p48323406103023">Enumeration</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><div class="p" id="dataartsstudio_02_0298__en-us_topic_0108272844_p21881853103023">Method for processing duplicate files. If the name and size of a file are the same as those of another file, the file is regarded as a duplicate file. Duplicate files can be processed in the following ways:<ul id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_ul33108444155527"><li id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_li29540541155527"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_b842352706151043">REPLACE</strong>: Replace duplicate files.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_li64538279155527"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_b842352706151050">SKIP</strong>: Skip duplicate files.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_li43973603155527"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_en-us_topic_0108272820_b84235270615111">ABANDON</strong>: Stop the job when any duplicate file is found.</li></ul>
</div>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row60754053103023"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p22131251103023">toJobConfig.compression</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p47800910103023">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p46668477103023">Enumeration</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><div class="p" id="dataartsstudio_02_0298__en-us_topic_0108272844_p22050281103023">Compression format applied to the file after it is written. The following compression formats are supported:<ul id="dataartsstudio_02_0298__en-us_topic_0108272844_ul64234801103023"><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li41242297103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b842352706151425">NONE</strong>: Do not compress the file.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li35636354103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b842352706151440_1">DEFLATE</strong>: Compress the file in DEFLATE format.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li52291733103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b118912124220">GZIP</strong>: Compress the file in gzip format.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li863556103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b842352706151440_3">BZIP2</strong>: Compress the file in bzip2 format.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li7772006103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b84235270615156_3">LZ4</strong>: Compress the file in LZ4 format.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li2839198103023"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b842352706151440_5">SNAPPY</strong>: Compress the file in Snappy format.</li></ul>
</div>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row25552788103023"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p56509958103023">toJobConfig.appendMode</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p13903909103023">Yes</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p52474840103023">Boolean</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p22603656103023">Whether to write data when one or more files exist in the loading path. The default value is <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmvalue1344025609174957"><b>false</b></span>.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row4274153852517"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p5622115931110">toJobConfig.encryption</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p10622125919118">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p1762214595111">Enumeration</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><div class="p" id="dataartsstudio_02_0298__en-us_topic_0108272844_p268631112373">This parameter is available only when <span class="parmname" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmname199870284216"><b>toJobConfig.outputFormat</b></span> is set to <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmvalue810073420432"><b>BINARY_FILE</b></span>. It specifies whether to encrypt the uploaded data, and the encryption method. The options are as follows:<ul id="dataartsstudio_02_0298__en-us_topic_0108272844_ul193603764111"><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li443953725020"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b126231953184410">NONE</strong>: Directly write data without encryption.</li><li id="dataartsstudio_02_0298__en-us_topic_0108272844_li599585720411"><strong id="dataartsstudio_02_0298__en-us_topic_0108272844_b159731439122419">AES-256-GCM</strong>: Use the AES 256-bit encryption algorithm to encrypt data. Currently, only the AES-256-GCM (NoPadding) encryption algorithm is supported.</li></ul>
</div>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row183541438142514"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p167122217132">toJobConfig.dek</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p74741127142115">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p2474627102111">String</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p23281415172211">Data encryption key. This parameter is available when <span class="parmname" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmname1414917538256"><b>toJobConfig.encryption</b></span> is set to <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmvalue18149145319251"><b>AES-256-GCM</b></span>. The key is a string of 64 hexadecimal characters (256 bits).</p>
<p id="dataartsstudio_02_0298__en-us_topic_0108272844_p18430442724">Remember the key configured here because the decryption key must be the same as that configured here. If the encryption and decryption keys are inconsistent, the system does not report an exception, but the decrypted data is incorrect.</p>
</td>
</tr>
<tr id="dataartsstudio_02_0298__en-us_topic_0108272844_row150113917255"><td class="cellrowborder" valign="top" width="22.657734226577343%" headers="mcps1.3.2.2.1.5.1.1 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p26851918151311">toJobConfig.iv</p>
</td>
<td class="cellrowborder" valign="top" width="19.408059194080593%" headers="mcps1.3.2.2.1.5.1.2 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p1114152632113">No</p>
</td>
<td class="cellrowborder" valign="top" width="16.858314168583142%" headers="mcps1.3.2.2.1.5.1.3 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p1414162617212">String</p>
</td>
<td class="cellrowborder" valign="top" width="41.075892410758925%" headers="mcps1.3.2.2.1.5.1.4 "><p id="dataartsstudio_02_0298__en-us_topic_0108272844_p24771354512">Initialization vector. This parameter is available when <span class="parmname" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmname16420145902519"><b>toJobConfig.encryption</b></span> is set to <span class="parmvalue" id="dataartsstudio_02_0298__en-us_topic_0108272844_parmvalue84211759142512"><b>AES-256-GCM</b></span>. The initialization vector is a string of 32 hexadecimal characters (128 bits).</p>
<p id="dataartsstudio_02_0298__en-us_topic_0108272844_p51641322122914">Remember the initialization vector configured here because the initialization vector used for decryption must be the same as that configured here. If the initialization vectors used for encryption and decryption are inconsistent, the system does not report an exception, but the decrypted data is incorrect.</p>
</td>
</tr>
</tbody>
</table>
</div>
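<p>As a hedged sketch of how the encryption parameters in the table combine, the following fragment writes binary files with <b>AES-256-GCM</b> encryption. It assumes that <b>toJobConfig.dek</b> takes 64 hexadecimal characters and <b>toJobConfig.iv</b> takes 32 hexadecimal characters. Both values shown are obvious placeholders and must be replaced with your own securely generated values. The directory <b>/hdfsto_encrypted</b> is also illustrative.</p>
<pre class="screen">"to-config-values": {
"configs": [
{
"inputs": [
{
"name": "toJobConfig.outputDirectory",
"value": "/hdfsto_encrypted"
},
{
"name": "toJobConfig.outputFormat",
"value": "BINARY_FILE"
},
{
"name": "toJobConfig.appendMode",
"value": "true"
},
{
"name": "toJobConfig.encryption",
"value": "AES-256-GCM"
},
{
"name": "toJobConfig.dek",
"value": "0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF"
},
{
"name": "toJobConfig.iv",
"value": "0123456789ABCDEF0123456789ABCDEF"
}
],
"name": "toJobConfig"
}
]
}</pre>
<p>Store the key and the initialization vector together with the job definition records, because the same pair is required to decrypt the files later.</p>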
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="destination_job_parameters.html">Destination Job Parameters</a></div>
</div>
</div>