This task enables you to export data from MRS to external data sources.
Generally, users can manually manage data import and export jobs on the Loader UI. To use shell scripts to update and run Loader jobs, you must configure the installed Loader client.
```shell
cd ${BIGDATA_HOME}/FusionInsight_Porter_8.1.2.2/install/FusionInsight-Sqoop-1.99.3/FusionInsight-Sqoop-1.99.3/server/webapps/loader/WEB-INF/ext-lib
chown omm:wheel <JAR package name>
chmod 600 <JAR package name>
```
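As a rough sketch of the ownership and permission step, the sequence below stages a hypothetical driver JAR in a throwaway directory (the JAR name `mysql-connector-java-5.1.21.jar` and the temporary directory are assumptions for illustration; on a real node you would work inside the ext-lib directory above and also `chown` the file to `omm:wheel`):

```shell
# Sketch in a temporary directory; the JAR name is hypothetical.
dir=$(mktemp -d)
jar="$dir/mysql-connector-java-5.1.21.jar"
touch "$jar"         # stands in for copying the real driver JAR
chmod 600 "$jar"     # owner read/write only, as required for ext-lib JARs
stat -c '%a' "$jar"  # prints 600
```

Mode 600 keeps the driver readable only by its owner, which on a real node is the omm service user.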
| Connector Type | Parameter | Description |
|---|---|---|
| generic-jdbc-connector | JDBC Driver Class | Specifies the name of a JDBC driver class. |
| | JDBC Connection String | Specifies the JDBC connection string. |
| | Username | Specifies the username for connecting to the database. |
| | Password | Specifies the password for connecting to the database. |
| | JDBC Connection Properties | Specifies the JDBC connection attributes. Click Add to manually add connection attributes. |
| hdfs-connector | - | - |
| oracle-connector | JDBC Connection String | Specifies the connection string for connecting to the database. |
| | Username | Specifies the username for connecting to the database. |
| | Password | Specifies the password for connecting to the database. |
| | Connection Properties | Specifies the connection attributes. Click Add to manually add connection attributes. |
| mysql-fastpath-connector | JDBC Connection String | Specifies the JDBC connection string. |
| | Username | Specifies the username for connecting to the database. |
| | Password | Specifies the password for connecting to the database. |
| | Connection Properties | Specifies the connection attributes. Click Add to manually add connection attributes. |
| sftp-connector | SFTP Server IP | Specifies the IP address of the SFTP server. |
| | SFTP Server Port | Specifies the port number of the SFTP server. |
| | SFTP Username | Specifies the username for accessing the SFTP server. |
| | SFTP Password | Specifies the password for accessing the SFTP server. |
| | SFTP Public Key | Specifies the public key of the SFTP server. |
| oracle-partition-connector | JDBC Driver Class | Specifies the name of a Java Database Connectivity (JDBC) driver class. |
| | JDBC Connection String | Specifies the JDBC connection string. |
| | Username | Specifies the username for connecting to the database. |
| | Password | Specifies the password for connecting to the database. |
| | Connection Properties | Specifies the connection attributes. Click Add to manually add connection attributes. |
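For orientation, hypothetical values for a generic-jdbc-connector link might look like the following. The host, port, database name, and account are made-up placeholders; only the `jdbc:mysql://host:port/db` URL shape and the `com.mysql.jdbc.Driver` class name are standard:

```
JDBC Driver Class:       com.mysql.jdbc.Driver
JDBC Connection String:  jdbc:mysql://192.168.0.10:3306/exportdb
Username:                loader_user
Password:                ********
```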
When creating or editing a Loader job, you can use macro definitions when configuring parameters such as the SFTP path, HDFS/OBS path, and Where condition of SQL. For details, see Using Macro Definitions in Configuration Items.
| Source File Type | Parameter | Description |
|---|---|---|
| HDFS/OBS | Input Directory | Specifies the input path when data is exported from HDFS/OBS. |
| | Path Filter | Specifies the wildcard for filtering the directories in the input paths of the source files. Input Directory is not used in filtering. If there are multiple filter conditions, use commas (,) to separate them. If the value is empty, the directory is not filtered. Regular expression filtering is not supported. |
| | File Filter | Specifies the wildcard for filtering the file names of the source files. If there are multiple filter conditions, use commas (,) to separate them. The value cannot be left blank. Regular expression filtering is not supported. |
| | File Type | Specifies the file import type. |
| | File Split Type | Specifies whether to split source files by file name or size. The files obtained after the splitting are used as the input files of each map in the MapReduce task for data export. |
| | Extractors | Specifies the number of maps that are started at the same time in a MapReduce job of a data configuration operation. This parameter cannot be set when Extractor Size is set. The value must be less than or equal to 3000. |
| | Extractor Size | Specifies the size of data processed by each map that is started in a MapReduce job of a data configuration operation, in MB. The value must be greater than or equal to 100. The recommended value is 1000. This parameter cannot be set when Extractors is set. When a relational database connector is used, Extractor Size is unavailable and you need to set Extractors. |
| HBASE | HBase Instance | Specifies the HBase service instance that Loader selects from all available HBase service instances in the cluster. If the selected HBase service instance is not added to the cluster, the HBase job cannot run properly. |
| | Quantity | Specifies the number of maps that are started at the same time in a MapReduce job of a data configuration operation. The value must be less than or equal to 3000. |
| HIVE | Hive Instance | Specifies the Hive service instance that Loader selects from all available Hive service instances in the cluster. If the selected Hive service instance is not added to the cluster, the Hive job cannot run properly. |
| | Quantity | Specifies the number of maps that are started at the same time in a MapReduce job of a data configuration operation. The value must be less than or equal to 3000. |
| SPARK | Spark Instance | Specifies the SparkSQL service instance that Loader selects from all available SparkSQL service instances in the cluster. Only SparkSQL can access Hive data. If the selected Spark service instance is not added to the cluster, the Spark job cannot run properly. |
| | Quantity | Specifies the number of maps that are started at the same time in a MapReduce job of a data configuration operation. The value must be less than or equal to 3000. |
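The Path Filter and File Filter parameters above take comma-separated wildcards rather than regular expressions. The sketch below mimics that matching with shell glob patterns (the `matches` helper and the sample file names are illustrative; Loader's internal matching may differ in detail):

```shell
set -f                       # keep the patterns from expanding against local files
# Return 0 if $1 matches any pattern in the comma-separated glob list $2.
matches() {
  name=$1 filters=$2
  old_ifs=$IFS; IFS=','
  for pat in $filters; do
    case $name in ($pat) IFS=$old_ifs; return 0 ;; esac
  done
  IFS=$old_ifs
  return 1
}
matches "part-00000.txt" "*.txt,*.csv" && echo match     # prints match
matches "part-00000.orc" "*.txt,*.csv" || echo no-match  # prints no-match
```

Note that an empty Path Filter disables directory filtering, while File Filter must always be set.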
Check whether the source data values in the data operation job created by Loader can be used directly without conversion; conversions include uppercase/lowercase conversion, cutting, merging, and splitting.
| Type | Description |
|---|---|
| Input Type | |
| Conversion Type | |
| Output Type | |
The edit box allows you to perform the following tasks:
You can also use the shortcut key Del to delete the file.
For details about how to set parameters in the step conversion information, see Operator Help.
If the conversion step is incorrectly configured, the source data cannot be converted and becomes dirty data. The dirty data marking rules are as follows:
| Data Connection Type | Parameter | Description |
|---|---|---|
| sftp-connector | Output Path | Specifies the path or name of the export file on the SFTP server. If multiple SFTP server IP addresses are configured for the connector, you can set this parameter to multiple paths or file names separated by semicolons (;). Ensure that the number of paths or file names is the same as the number of SFTP servers configured for the connector. |
| | Operation | Specifies the action during data export. When all data is to be exported from the input path to the destination path, the data is first stored in a temporary directory and then copied from the temporary directory to the destination path. After the data is exported successfully, the data is deleted from the temporary directory. One of the following actions can be taken when duplicate file names exist during data transfer: |
| | Encode Type | Specifies the encoding format of the exported file, for example, UTF-8. This parameter can be set only when text files are exported. |
| | Compression | Specifies whether to enable compressed transmission when SFTP is used to export data. The value true indicates that compression is enabled, and false indicates that compression is disabled. |
| hdfs-connector | Output Path | Specifies the output directory or file name of the export file in HDFS/OBS. |
| | File Format | Specifies the file export type. |
| | Compression codec | Specifies the compression format of files exported to HDFS/OBS. Select a format from the drop-down list. If you select NONE or do not set this parameter, data is not compressed. |
| | User-defined compression format | Specifies the name of a user-defined compression format. |
| generic-jdbc-connector | Schema name | Specifies the name of the database schema. |
| | Table name | Specifies the name of the database table that is used to save the final data of the transmission. |
| | Temporary table | Specifies the name of a temporary database table that is used to save temporary data during the transmission. The fields in the table must be the same as those in the table specified by Table name. |
| oracle-partition-connector | Schema Name | Specifies the name of the database schema. |
| | Table Name | Specifies the name of the database table that is used to save the final data of the transmission. |
| | Temporary Table | Specifies the name of a temporary database table that is used to save temporary data during the transmission. The fields in the table must be the same as those in the table specified by Table Name. |
| oracle-connector | Table Name | Specifies the name of the destination table that stores the data. |
| | Column Name | Specifies the name of the column to be written. Columns that are not specified are set to null or the default value. |
| mysql-fastpath-connector | Schema Name | Specifies the name of the database schema. |
| | Table Name | Specifies the name of the database table that is used to save the final data of the transmission. |
| | Temporary Table Name | Specifies the name of the temporary table used to store the data. After the job is executed successfully, the data is transferred to the final table. |
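As a hedged example, a generic-jdbc-connector export destination might be filled in as follows. The schema, table, and temporary table names are invented for illustration; the only constraint from the table above is that the temporary table must have the same fields as the final table:

```
Schema name:      loader_schema
Table name:       export_final
Temporary table:  export_final_tmp
```

Staging rows in the temporary table first means a failed job leaves the final table untouched.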