This section describes how to use Loader to export data from HDFS/OBS to an SFTP server.
Setting Basic Job Information
Parameter | Description | Example Value |
---|---|---|
Name | Specifies the name of the SFTP server connection. | sftpName |
SFTP Server IP | Specifies the IP address of the SFTP server. | 10.16.0.1 |
SFTP Server Port | Specifies the port number of the SFTP server. | 22 |
SFTP Username | Specifies the user name for accessing the SFTP server. | root |
SFTP Password | Specifies the password for accessing the SFTP server. | xxxx |
SFTP Public Key | Specifies the public key of the SFTP server. | OdDt/yn...etM |
When multiple SFTP servers are configured, the HDFS/OBS data is divided into multiple parts, and the parts are exported to the SFTP servers at random.
Setting Data Source Information
Parameter | Description | Example Value |
---|---|---|
Input directory | Specifies the input path when data is exported from HDFS/OBS. NOTE: You can use macros to define path parameters. For details, see Using Macro Definitions in Configuration Items. | /user/test |
Path filter | Specifies the wildcard for filtering the directories in the input paths of the source files. The input directory itself is not used in filtering. If there are multiple filter conditions, use commas (,) to separate them. If the parameter is empty, the directories are not filtered. Regular expression filtering is not supported. | * |
File filter | Specifies the wildcard for filtering the file names of the source files. If there are multiple filter conditions, use commas (,) to separate them. The value cannot be left blank. Regular expression filtering is not supported. | * |
File Type | Specifies the file import type. NOTE: When the file import type is set to TEXT_FILE or SEQUENCE_FILE, Loader automatically selects a decompression method based on the file name extension. | TEXT_FILE |
File Split Type | Indicates whether to split source files by file name or size. The files obtained after splitting are used as the input files of each map in the MapReduce task for data export. | FILE |
Extractors | Specifies the number of maps that are started at the same time in a MapReduce job of a data configuration operation. This parameter cannot be set when Extractor size is set. The value must be less than or equal to 3000. You are advised to set the parameter to the number of CPU cores on the SFTP server. NOTE: To improve the data import speed, ensure that the following conditions are met: | 20 |
Extractor size | Specifies the size of data processed by each map started in a MapReduce job of a data configuration operation. The unit is MB. The value must be greater than or equal to 100. The recommended value is 1000. This parameter cannot be set when Extractors is set. | - |
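The filter and extractor rules in the table above can be checked before a job is submitted. The following is a minimal sketch, not part of Loader: the function names `matches_any` and `check_extractor_settings` are hypothetical, and the wildcard semantics are assumed to be close to those of Python's `fnmatch` module, which may differ from what Loader actually implements.

```python
from fnmatch import fnmatchcase


def matches_any(name, filter_value):
    """Return True if name matches at least one comma-separated wildcard.

    Assumption: Loader wildcards behave like fnmatch patterns (*, ?).
    """
    patterns = [p.strip() for p in filter_value.split(",") if p.strip()]
    return any(fnmatchcase(name, p) for p in patterns)


def check_extractor_settings(extractors=None, extractor_size_mb=None):
    """Return a list of problems based on the limits stated in the table."""
    problems = []
    # The two parameters are mutually exclusive.
    if extractors is not None and extractor_size_mb is not None:
        problems.append("Extractors and Extractor size cannot both be set")
    # Extractors must be less than or equal to 3000.
    if extractors is not None and extractors > 3000:
        problems.append("Extractors must be less than or equal to 3000")
    # Extractor size is in MB and must be greater than or equal to 100.
    if extractor_size_mb is not None and extractor_size_mb < 100:
        problems.append("Extractor size must be at least 100 MB")
    return problems


if __name__ == "__main__":
    # Hypothetical file names and filter value, for illustration only.
    for file_name in ("data_01.csv", "data_01.log"):
        print(file_name, matches_any(file_name, "*.csv,*.txt"))
    print(check_extractor_settings(extractors=20, extractor_size_mb=1000))
```

An empty list from `check_extractor_settings` means that none of the documented limits is violated. It does not guarantee that Loader will accept the job.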
Setting Data Transformation
Input Type | Export Type |
---|---|
CSV file input | File output |
HTML input | File output |
Fixed-width file input | File output |
Setting Data Storage Information and Executing the Job
Parameter | Description | Example Value |
---|---|---|
Output path | Specifies the path or file name of the exported file on an SFTP server. If multiple SFTP server IP addresses are configured for the connector, you can set this parameter to multiple paths or file names separated by semicolons (;). Ensure that the number of paths or file names is the same as the number of SFTP servers configured for the connector. NOTE: You can use macros to define path parameters. For details, see Using Macro Definitions in Configuration Items. | /opt/tempfile |
Operation | Specifies the action during data import. When all data is to be imported from the input path to the destination path, the data is first stored in a temporary directory and then copied from the temporary directory to the destination path. After the data is imported successfully, it is deleted from the temporary directory. One of the following actions can be taken when duplicate file names exist during data transfer: | OVERRIDE |
Encode type | Specifies the encoding format of the exported file, for example, UTF-8. This parameter can be set only for text file export. | UTF-8 |
Compression | Indicates whether to enable compressed transmission when SFTP is used to export data. | true |
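The Output path rule for multiple SFTP servers can be illustrated with a short check. This is a sketch under stated assumptions: the helper `split_output_paths` is hypothetical, the second server address `10.16.0.2` is invented for the example, and pairing paths with servers in the order listed is an assumption that this section does not confirm.

```python
def split_output_paths(output_path, sftp_servers):
    """Split a semicolon-separated Output path and pair it with the servers.

    Raises ValueError if the number of paths differs from the number of
    SFTP servers, as required by the Output path description.
    """
    paths = [p.strip() for p in output_path.split(";") if p.strip()]
    if len(paths) != len(sftp_servers):
        raise ValueError(
            f"{len(paths)} output path(s) given for {len(sftp_servers)} SFTP server(s)"
        )
    # Assumption: the first path belongs to the first server, and so on.
    return dict(zip(sftp_servers, paths))


if __name__ == "__main__":
    servers = ["10.16.0.1", "10.16.0.2"]  # the second address is hypothetical
    print(split_output_paths("/opt/tempfile;/opt/tempfile2", servers))
```

Running the check before saving the job catches a mismatched path count early, rather than at export time.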
Checking the Job Execution Result