You can use Loader to import data from an SFTP server to HDFS.
This section applies to MRS clusters earlier than 3.x.
The job management tab page is displayed by default on the Loader page.
For details, see ftp-connector or sftp-connector.
For details, see hdfs-connector.
| Parameter | Description |
|---|---|
| Extractors | Number of Map tasks. |
| Loaders | Number of Reduce tasks. This parameter is displayed only when the destination field is HBase or Hive. |
| Max. Error Records in a Single Shard | Error record threshold. If the number of error records of a single Map task exceeds the threshold, the task automatically stops and the obtained data is not returned. NOTE: By default, data is read and written in batches for MySQL and MPPDB of generic-jdbc-connector. Errors are recorded at most once for each batch of data. |
| Dirty Data Directory | Directory for saving dirty data. If you leave this parameter blank, dirty data will not be saved. |
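
The behavior of the error record threshold and the dirty data setting can be illustrated with a minimal Python sketch. This is not Loader's implementation. The names `run_map_task`, `ErrorThresholdExceeded`, and `dirty_sink` are hypothetical, and an in-memory list stands in for the dirty data directory. The sketch assumes that a record counts as an error when parsing raises `ValueError`.

```python
from typing import Callable, Iterable, List, Optional


class ErrorThresholdExceeded(Exception):
    """Hypothetical error: one task saw more error records than allowed."""


def run_map_task(
    records: Iterable[str],
    parse: Callable[[str], object],
    max_error_records: int,
    dirty_sink: Optional[List[str]] = None,
) -> List[object]:
    """Process one shard of records (illustrative model only)."""
    results: List[object] = []
    errors = 0
    for record in records:
        try:
            results.append(parse(record))
        except ValueError:
            errors += 1
            # Dirty records are kept only when a sink is configured,
            # mirroring a non-blank Dirty Data Directory.
            if dirty_sink is not None:
                dirty_sink.append(record)
            # Stop as soon as the error count exceeds the threshold;
            # the results collected so far are not returned.
            if errors > max_error_records:
                raise ErrorThresholdExceeded(
                    f"{errors} error records exceed threshold {max_error_records}"
                )
    return results
```

For example, with a threshold of 1, `run_map_task(["1", "x", "3"], int, 1, dirty)` completes and stores `"x"` in `dirty`, while a shard that contains two unparsable records raises `ErrorThresholdExceeded` and returns no data.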