Use Loader to import data from a relational database to Hive.
cd ${BIGDATA_HOME}/FusionInsight_Porter_8.1.2.2/install/FusionInsight-Sqoop-1.99.3/FusionInsight-Sqoop-1.99.3/server/webapps/loader/WEB-INF/ext-lib
chown omm:wheel JAR package name
chmod 600 JAR package name
Setting Basic Job Information
Parameter |
Description |
Example Value |
---|---|---|
Name |
Name of a relational database connection |
dbName |
JDBC Driver Class |
Name of a JDBC driver class |
oracle.jdbc.driver.OracleDriver |
JDBC Connection String |
JDBC connection string |
jdbc:oracle:thin:@//10.16.0.1:1521/oradb |
Username |
Username for connecting to the database |
omm |
Password |
Password for connecting to the database |
xxxx |
JDBC Connection Properties |
JDBC connection attribute. Click Add to manually add the attribute.
|
|
Setting Data Source Information
Parameter |
Description |
Example Value |
---|---|---|
Schema Name |
Database schema name. This parameter exists in the Table name schema. |
dbo |
Table Name |
Database table name. This parameter exists in the Table name schema. |
test |
SQL Statement |
SQL statement for Loader to query data to be imported in Table SQL statement mode. The SQL statement requires the query condition WHERE ${CONDITIONS}. Without this condition, the SQL statement cannot be run properly. An example SQL statement is as follows: select * from TABLE WHERE A>B and ${CONDITIONS}. If Table column names is set, the column specified by Table column names will replace the column queried in the SQL statement. This parameter cannot be set when Schema name or Table name is set. NOTE:
You can use macros to define SQL Where statements. For details, see Using Macro Definitions in Configuration Items. |
select * from test where ${CONDITIONS} |
Table Column Names |
Table columns whose content is to be imported by Loader. Use commas (,) to separate multiple fields. If the parameter is not set, all the columns are imported and the Select * order is used as the column location. |
- |
Partition Column Name |
Database table column based on which to-be-imported data is determined. This parameter is used for partitioning in a Map job. You are advised to configure the primary key field. NOTE:
|
id |
Nulls in Partition Column |
Indicates whether to process records whose values are null in database table columns.
|
true |
Whether to Specify a Partition Column |
Indicates whether to specify a partition column. |
true |
Setting Data Transformation
Setting Data Storage Information and Executing the Job
Parameter |
Description |
Example Value |
---|---|---|
Output Directory |
Directory for storing data imported into Hive. NOTE:
You can use macros to define path parameters. For details, see Using Macro Definitions in Configuration Items. |
/opt/tempfile |
Extractors |
Number of Maps that are started at the same time in a MapReduce task of a data configuration operation. The value must be less than or equal to 3000. You are advised to set the parameter to the maximum number of connections on the SFTP server. |
20 |
Extractor Size |
Hive does not support this parameter. Please set Extractors. |
- |
Checking the Job Execution Result