This section describes how to use the CDLService web UI to import data from ThirdKafka to Hudi in a cluster with Kerberos authentication enabled.

Parameter | Example |
---|---|
Name | opengausslink |
Link Type | thirdparty-kafka |
Bootstrap Servers | 10.10.10.10:9093 |
Security Protocol | SASL_SSL |
Username | testuser |
Password | Password of the testuser user |
SSL Truststore Location | Click Upload to upload the authentication file. |
SSL Truststore Password | - |
Datastore Type | opengauss |
Host | 11.11.xxx.xxx,12.12.xxx.xxx |
Port | 8000 |
DB Name | opengaussdb |
User | opengaussuser |
DB Password | Password of the opengaussuser user |
Description | - |

MRS Kafka can also be used as the source of a thirdparty-kafka link. If a username and password are used for login authentication, first enable the PLAIN mechanism: log in to FusionInsight Manager, choose Cluster > Services > Kafka, click Configuration, search for the sasl.enabled.mechanisms parameter, add PLAIN to the parameter value, and click Save. Then restart the Kafka service for the configuration to take effect.
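
For reference, the link settings above (SASL_SSL with a username and password, which requires the PLAIN mechanism on the broker) correspond roughly to the following standard Kafka client properties. This is a minimal sketch, not what CDL generates internally: the helper names and the truststore path are hypothetical, and the password is a placeholder.

```python
def build_sasl_plain_properties(bootstrap_servers, username, password,
                                truststore_location, truststore_password=None):
    """Return standard Kafka client properties for SASL_SSL with PLAIN."""
    # JAAS entry for username/password (PLAIN) authentication.
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    props = {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
        "ssl.truststore.location": truststore_location,
    }
    # The truststore password is optional ("-" in the example table).
    if truststore_password:
        props["ssl.truststore.password"] = truststore_password
    return props


def to_properties_text(props):
    """Render the properties as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    example = build_sasl_plain_properties(
        "10.10.10.10:9093",
        "testuser",
        "<password of testuser>",       # placeholder, not a real password
        "/path/to/truststore.jks",      # hypothetical path
    )
    print(to_properties_text(example))
```

A client configured this way can only authenticate if PLAIN is listed in sasl.enabled.mechanisms on the broker, which is why the Kafka service has to be restarted after the change.
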
On the CDL web UI, configure the thirdparty-kafka link that uses MRS Kafka as the source. For example, the data link configuration is as follows:

Parameter | Example |
---|---|
Link Type | hudi |
Name | hudilink |
Storage Type | hdfs |
Auth KeytabFile | /opt/Bigdata/third_lib/CDL/user_libs/cdluser.keytab |
Principal | cdluser |
Description | - |

After the test is successful, click OK.
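
If the link test fails, it can help to confirm on the node that the keytab and principal actually match. The sketch below builds a standard `kinit -kt` command from the example values; the helper names are hypothetical, and running it assumes the Kerberos client tools are on the PATH. Note that a successful `kinit` replaces the current user's ticket cache.

```python
import subprocess


def kinit_command(keytab, principal):
    """Build the kinit command that logs in with a keytab file."""
    return ["kinit", "-kt", keytab, principal]


def keytab_login_ok(keytab, principal):
    """Return True if kinit succeeds with the given keytab and principal."""
    try:
        result = subprocess.run(
            kinit_command(keytab, principal),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        # kinit is not installed or not on the PATH.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    ok = keytab_login_ok(
        "/opt/Bigdata/third_lib/CDL/user_libs/cdluser.keytab", "cdluser"
    )
    print("keytab login succeeded" if ok else "keytab login failed")
```
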

Parameter | Example Value |
---|---|
Name | test-env |
Driver Memory | 1 GB |
Type | spark |
Executor Memory | 1 GB |
Executor Cores | 1 |
Number Executors | 1 |
Queue | - |
Description | - |

Click OK.
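
To relate the ENV fields to familiar Spark settings, the sketch below translates the example values into standard Spark configuration keys. The field-to-key mapping is an assumption made for illustration; the source does not state how CDL passes these values to Spark, and the function names are hypothetical.

```python
# Example ENV values as entered on the web UI.
TEST_ENV = {
    "Name": "test-env",
    "Type": "spark",
    "Driver Memory": "1 GB",
    "Executor Memory": "1 GB",
    "Executor Cores": 1,
    "Number Executors": 1,
    "Queue": "-",
}

# Spark memory strings use single-letter unit suffixes.
_UNITS = {"GB": "g", "MB": "m"}


def to_spark_memory(value):
    """Convert a UI value such as '1 GB' into a Spark size such as '1g'."""
    amount, unit = value.split()
    return f"{int(amount)}{_UNITS[unit.upper()]}"


def env_to_spark_conf(env):
    """Map ENV fields to Spark configuration keys (assumed mapping)."""
    conf = {
        "spark.driver.memory": to_spark_memory(env["Driver Memory"]),
        "spark.executor.memory": to_spark_memory(env["Executor Memory"]),
        "spark.executor.cores": str(env["Executor Cores"]),
        "spark.executor.instances": str(env["Number Executors"]),
    }
    # "-" means the field was left empty, so no queue is set.
    queue = env.get("Queue", "-")
    if queue and queue != "-":
        conf["spark.yarn.queue"] = queue
    return conf


if __name__ == "__main__":
    for key, value in env_to_spark_conf(TEST_ENV).items():
        print(f"--conf {key}={value}")
```
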

Parameter | Example |
---|---|
Name | job_opengausstohudi |
Desc | New CDL Job |


Parameter | Example |
---|---|
Link | opengausslink |
DB Name | opengaussdb |
Schema | opengaussschema |
Datastore Type | opengauss |
Source Topics | source_topic |
Tasks Max | 1 |
Tolerance | none |
Start Time | - |
Multi Partition | No |
Topic Table Mapping | test/hudi_topic |


Parameter | Example Value |
---|---|
Link | hudilink |
Path | /cdl/test |
Interval | 10 |
Max Rate Per Partition | 0 |
Parallelism | 10 |
Target Hive Database | default |
Configuring Hudi Table Attributes | Visual View |
Global Configuration of Hudi Table Attributes | - |
Configuring the Attributes of the Hudi Table: Table Name | test |
Configuring the Attributes of the Hudi Table: Table Type Opt Key | COPY_ON_WRITE |
Configuring the Attributes of the Hudi Table: Hudi TableName Mapping | - |
Configuring the Attributes of the Hudi Table: Hive TableName Mapping | - |
Configuring the Attributes of the Hudi Table: Table Primarykey Mapping | id |
Configuring the Attributes of the Hudi Table: Table Hudi Partition Type | - |
Configuring the Attributes of the Hudi Table: Custom Config | - |

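
The Hudi table attributes above resemble standard Hudi write options. As a rough illustration, the sketch below maps the three filled-in attributes (table name, table type, primary key) to the corresponding Hudi option keys. The mapping from CDL fields to these keys is an assumption, and the function name is hypothetical.

```python
# Table types supported by Hudi.
VALID_TABLE_TYPES = ("COPY_ON_WRITE", "MERGE_ON_READ")


def hudi_write_options(table_name, table_type, record_key):
    """Return Hudi write options for the given table attributes."""
    if table_type not in VALID_TABLE_TYPES:
        raise ValueError(f"unsupported table type: {table_type}")
    return {
        # "Table Name" on the web UI.
        "hoodie.table.name": table_name,
        # "Table Type Opt Key" on the web UI.
        "hoodie.datasource.write.table.type": table_type,
        # "Table Primarykey Mapping" on the web UI.
        "hoodie.datasource.write.recordkey.field": record_key,
    }


if __name__ == "__main__":
    options = hudi_write_options("test", "COPY_ON_WRITE", "id")
    for key, value in options.items():
        print(f"{key}={value}")
```

Fields left as "-" in the table (partition type, name mappings, custom configuration) are not represented here.
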
Verify that data is being transferred. For example, insert data into the table in the openGauss database, and then view the content of the files imported to Hudi.
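
One simple way to make this check systematic is to compare primary keys on both sides. The sketch below assumes that the rows inserted into openGauss and the rows read back from the Hudi table have already been fetched as lists of dictionaries by whatever client is available; the function name and the `name` column are hypothetical, and `id` is the primary key from the example configuration.

```python
def missing_keys(source_rows, hudi_rows, key="id"):
    """Return primary keys present in the source but not yet in Hudi."""
    synced = {row[key] for row in hudi_rows}
    return [row[key] for row in source_rows if row[key] not in synced]


if __name__ == "__main__":
    # Rows inserted into the openGauss table (illustrative data).
    inserted = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    # Rows read back from the Hudi table (illustrative data).
    imported = [{"id": 1, "name": "a"}]

    pending = missing_keys(inserted, imported)
    if pending:
        print("not yet imported:", pending)
    else:
        print("all inserted rows are present in Hudi")
```

Because the job writes in intervals, a key that is missing right after the insert may appear on a later check.
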