Java Example Code
Development Description
Redis supports only enhanced datasource connections.
- Prerequisites
An enhanced datasource connection has been created on the DLI management console and bound to a queue.
- Code implementation
- Import dependencies.
- Maven dependency involved
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.3.2</version>
</dependency>
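Only spark-sql is needed at compile time: the Redis data source implementation itself is supplied by DLI at run time, through the module selection or Spark parameters described under "Submitting a Spark job" below.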
- Import dependency packages.
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.*;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import java.util.*;
- Create a session.
SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("datasource-redis")
         .set("spark.redis.host", "192.168.4.199")
         .set("spark.redis.port", "6379")
         .set("spark.redis.auth", "******")
         .set("spark.driver.allowMultipleContexts", "true");
JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
SQLContext sqlContext = new SQLContext(javaSparkContext);
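SQLContext is the legacy Spark SQL entry point used throughout this example. On Spark 2.x and later, the same configuration can equivalently be expressed through SparkSession; a minimal sketch, reusing the same placeholder host and credentials:
SparkSession spark = SparkSession.builder()
        .appName("datasource-redis")
        .config("spark.redis.host", "192.168.4.199")
        .config("spark.redis.port", "6379")
        .config("spark.redis.auth", "******")
        .getOrCreate();
// spark.read() and DataFrame writes then work the same way as with sqlContext below.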
- Connecting to data sources through DataFrame APIs
- Read JSON data as DataFrames.
JavaRDD<String> javaRDD = javaSparkContext.parallelize(Arrays.asList(
        "{\"id\":\"1\",\"name\":\"Ann\",\"age\":\"18\"}",
        "{\"id\":\"2\",\"name\":\"lisi\",\"age\":\"21\"}"));
Dataset<Row> dataFrame = sqlContext.read().json(javaRDD);
- Construct the Redis connection parameters.
Map<String, String> map = new HashMap<>();
map.put("table", "person");
map.put("key.column", "id");
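With these options, each row is stored as a Redis hash whose key combines the table name and the value of the key column; the row with id 1, for example, should end up under the key person:1. (This reflects the spark-redis connector's documented key scheme; exact behavior may vary by connector version.)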
- Save data to Redis.
dataFrame.write().format("redis").options(map).mode(SaveMode.Overwrite).save();
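SaveMode.Overwrite replaces any data previously written under the same table name, which keeps repeated runs idempotent; Spark's other save modes (such as SaveMode.Append) can be used instead where the connector supports them.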
- Read data from Redis.
sqlContext.read().format("redis").options(map).load().show();
- View the operation result.
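For the two sample rows above, the output of show() should resemble the following (column order may differ, since the schema is inferred from the stored fields):
+---+---+----+
|age| id|name|
+---+---+----+
| 18|  1| Ann|
| 21|  2|lisi|
+---+---+----+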
- Submitting a Spark job
- Upload the Java code file to DLI.
- In the Spark job editor, select the corresponding dependency module and execute the Spark job.
- If the Spark version is 2.3.2 (about to be taken offline) or 2.4.5, set Module to sys.datasource.redis when you submit the job.
- If the Spark version is 3.1.1, do not select a module. Instead, configure the following Spark parameters (--conf):
spark.driver.extraClassPath=/usr/share/extension/dli/spark-jar/datasource/redis/*
spark.executor.extraClassPath=/usr/share/extension/dli/spark-jar/datasource/redis/*
Complete Example Code
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.*;
import java.util.*;

public class Test_Redis_DataFrame {
    public static void main(String[] args) {
        // Create a Spark context and SQL context configured for the Redis data source.
        SparkConf sparkConf = new SparkConf();
        sparkConf.setAppName("datasource-redis")
                 .set("spark.redis.host", "192.168.4.199")
                 .set("spark.redis.port", "6379")
                 .set("spark.redis.auth", "******")
                 .set("spark.driver.allowMultipleContexts", "true");
        JavaSparkContext javaSparkContext = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new SQLContext(javaSparkContext);

        // Read an RDD of JSON strings into a DataFrame.
        JavaRDD<String> javaRDD = javaSparkContext.parallelize(Arrays.asList(
                "{\"id\":\"1\",\"name\":\"Ann\",\"age\":\"18\"}",
                "{\"id\":\"2\",\"name\":\"lisi\",\"age\":\"21\"}"));
        Dataset<Row> dataFrame = sqlContext.read().json(javaRDD);

        // Redis connection parameters: key prefix ("table") and the column used as key.
        Map<String, String> map = new HashMap<>();
        map.put("table", "person");
        map.put("key.column", "id");

        // Write the DataFrame to Redis, then read it back and print it.
        dataFrame.write().format("redis").options(map).mode(SaveMode.Overwrite).save();
        sqlContext.read().format("redis").options(map).load().show();
    }
}
Parent topic: Connecting to Redis