This section describes how to use Loader to export data from HDFS/OBS to a relational database.
Before creating the job, place the JDBC driver JAR package of the relational database in the Loader ext-lib directory and set its owner and permissions:

```
cd ${BIGDATA_HOME}/FusionInsight_Porter_8.1.2.2/install/FusionInsight-Sqoop-1.99.3/FusionInsight-Sqoop-1.99.3/server/webapps/loader/WEB-INF/ext-lib
chown omm:wheel <JAR package name>
chmod 600 <JAR package name>
```
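For example, if the driver package is ojdbc6.jar (a hypothetical file name; use the actual JAR of your database driver), the commands would be:

```
chown omm:wheel ojdbc6.jar
chmod 600 ojdbc6.jar
```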
Setting Basic Job Information
| Parameter | Description | Example Value |
|---|---|---|
| Name | Specifies the name of a relational database connection. | dbName |
| JDBC Driver Class | Specifies the name of a Java Database Connectivity (JDBC) driver class. | oracle.jdbc.driver.OracleDriver |
| JDBC Connection String | Specifies the JDBC connection string. | jdbc:oracle:thin:@//10.16.0.1:1521/oradb |
| Username | Specifies the username for connecting to the database. | omm |
| Password | Specifies the password for connecting to the database. | xxxx |
| JDBC Connection Properties | Specifies JDBC connection attributes. Click Add to manually add attributes. | - |
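Before running the job, you can optionally confirm that the driver JAR contains the configured driver class and that the database host in the connection string is reachable from the Loader node. A minimal sketch, assuming the hypothetical ojdbc6.jar above and the example values from the table:

```
# The driver class name maps to a file path inside the JAR archive.
unzip -l ojdbc6.jar | grep "oracle/jdbc/driver/OracleDriver.class"

# Check that the host and port from the JDBC connection string are reachable.
timeout 5 bash -c 'cat < /dev/null > /dev/tcp/10.16.0.1/1521' && echo "port 1521 reachable"
```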
Setting Data Source Information
| Parameter | Description | Example Value |
|---|---|---|
| Input directory | Specifies the input path when data is exported from HDFS/OBS. NOTE: You can use macros to define path parameters. For details, see Using Macro Definitions in Configuration Items. | /user/test |
| Path filter | Specifies the wildcard for filtering the directories in the input paths of the source files. The directory specified by Input directory itself is not filtered. If there are multiple filter conditions, separate them with commas (,). If this parameter is empty, directories are not filtered. Regular expression filtering is not supported. | * |
| File filter | Specifies the wildcard for filtering the file names of the source files. If there are multiple filter conditions, separate them with commas (,). The value cannot be left blank. Regular expression filtering is not supported. | * |
| File Type | Specifies the file import type. NOTE: When the file import type is TEXT_FILE or SEQUENCE_FILE, Loader automatically selects a decompression method based on the file name extension to decompress the files. | TEXT_FILE |
| File split type | Specifies whether to split source files by file name (FILE) or by size (SIZE). The files obtained after splitting are used as the input files of each map in the MapReduce job for data export. | FILE |
| Extractors | Specifies the number of maps that are started at the same time in the MapReduce job of a data configuration operation. The value must be less than or equal to 3000. This parameter cannot be set when Extractor size is set. | 20 |
| Extractor size | Specifies the size, in MB, of the data processed by each map started in the MapReduce job of a data configuration operation. The value must be greater than or equal to 100; the recommended value is 1000. This parameter cannot be set when Extractors is set. When a relational database connector is used, Extractor size is unavailable and you must set Extractors instead. | - |
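To preview which source files the job will read, you can list the input directory with the HDFS client. A minimal sketch using the example values above (the glob pattern only approximates the combined effect of Path filter and File filter):

```
# List the contents of the example input directory.
hdfs dfs -ls /user/test

# Approximate Path filter "*" plus File filter "*": all files in the
# first-level subdirectories of the input directory.
hdfs dfs -ls "/user/test/*/*"
```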
Setting Data Transformation
| Input Type | Export Type |
|---|---|
| CSV file input | Table output |
| HTML input | Table output |
| Fixed-width file input | Table output |
Setting Data Storage Information and Executing the Job
| Parameter | Description | Example Value |
|---|---|---|
| Schema name | Specifies the database schema name. | dbo |
| Table name | Specifies the name of the database table that stores the final data of the transmission. NOTE: Table names can be defined using macros. For details, see Using Macro Definitions in Configuration Items. | test |
| Temporary table | Specifies the name of a temporary database table that stores temporary data during the transmission. The fields in this table must be the same as those in the table specified by Table name. NOTE: A temporary table is used to prevent dirty data from being generated in the destination table when data is exported to the database. Data is migrated from the temporary table to the destination table only after all data is successfully written to the temporary table. Using a temporary table increases the job execution time. | tmp_test |
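The temporary table must have the same fields as the destination table. One way to create such a table in Oracle, shown as a sketch with placeholder credentials (the user, password, and connection details are assumptions), is to copy the destination table's structure without its rows:

```
# WHERE 1=0 copies the column definitions of test but no data.
sqlplus omm/password@//10.16.0.1:1521/oradb <<'EOF'
CREATE TABLE tmp_test AS SELECT * FROM test WHERE 1=0;
EOF
```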
Checking the Job Execution Result
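After the job completes, one simple sanity check, sketched here with the example values and placeholder credentials, is to compare the number of rows in the destination table with the number of records in the source files:

```
# Count the rows exported into the destination table.
sqlplus -s omm/password@//10.16.0.1:1521/oradb <<'EOF'
SELECT COUNT(*) FROM test;
EOF

# Count the records in the source text files for comparison.
hdfs dfs -cat "/user/test/*" | wc -l
```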