Spark on HBase V2 allows users to query HBase tables in Spark SQL and to store data in HBase tables by using the Beeline tool. You can also use HBase APIs to create tables, read data from them, and insert data into them.
Parameter | Default Value | Changed To |
---|---|---|
spark.yarn.security.credentials.hbase.enabled | false | true |
To ensure that Spark2x can access HBase over long periods, do not modify the following parameters of the HBase and HDFS services:
- dfs.namenode.delegation.token.renew-interval
- dfs.namenode.delegation.token.max-lifetime
- hbase.auth.key.update.interval
- hbase.auth.token.max.lifetime
If these parameters must be modified based on service requirements, ensure that the value of the HDFS parameter dfs.namenode.delegation.token.renew-interval is not greater than the values of the HBase parameters hbase.auth.key.update.interval and hbase.auth.token.max.lifetime and of the HDFS parameter dfs.namenode.delegation.token.max-lifetime.
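The constraint above can be checked mechanically before a configuration change is rolled out. The following is a minimal sketch, not part of any Spark, HBase, or HDFS tooling: the function name `violated_bounds` and the `example` values are hypothetical, and it assumes all four parameters are expressed in the same unit.

```python
# Sketch: verify that the HDFS renew interval does not exceed the other
# three token-related parameters. Names and values here are illustrative.
RENEW = "dfs.namenode.delegation.token.renew-interval"
UPPER_BOUNDS = (
    "hbase.auth.key.update.interval",
    "hbase.auth.token.max.lifetime",
    "dfs.namenode.delegation.token.max-lifetime",
)


def violated_bounds(conf):
    """Return the parameters whose value is smaller than the renew interval."""
    renew = int(conf[RENEW])
    return [name for name in UPPER_BOUNDS if renew > int(conf[name])]


# Hypothetical values, assumed to share one unit (for example milliseconds).
example = {
    "dfs.namenode.delegation.token.renew-interval": 86400000,
    "dfs.namenode.delegation.token.max-lifetime": 604800000,
    "hbase.auth.key.update.interval": 86400000,
    "hbase.auth.token.max.lifetime": 604800000,
}

if __name__ == "__main__":
    problems = violated_bounds(example)
    print("OK" if not problems else "Renew interval exceeds: " + ", ".join(problems))
```

An empty list means the renew interval is within all three bounds; any names returned are the parameters that would need to be raised (or the renew interval lowered).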
If you need to use the Spark on HBase function on the Spark2x client, download and install the Spark2x client again.
create table hbaseTable1
(id string, name string, age int)
using org.apache.spark.sql.hbase.HBaseSourceV2
options(
  hbaseTableName "table2",
  keyCols "id",
  colsMapping "name=cf1.cq1,age=cf1.cq2");
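To make the colsMapping option in the statement above easier to read, the sketch below splits the mapping string into its parts. It is an illustration only, not the connector's own parser: the function `parse_cols_mapping` is hypothetical, and it assumes each entry has the form `sparkColumn=columnFamily.qualifier`, with the keyCols column (`id`) stored as the HBase row key rather than in a column.

```python
# Sketch: interpret a colsMapping string as
# Spark column -> (HBase column family, column qualifier).
def parse_cols_mapping(cols_mapping):
    mapping = {}
    for pair in cols_mapping.split(","):
        # Left of "=" is the Spark SQL column, right is family.qualifier.
        column, target = pair.strip().split("=", 1)
        family, qualifier = target.split(".", 1)
        mapping[column.strip()] = (family, qualifier)
    return mapping


if __name__ == "__main__":
    for column, (family, qualifier) in parse_cols_mapping(
        "name=cf1.cq1,age=cf1.cq2"
    ).items():
        print(f"{column} -> family {family}, qualifier {qualifier}")
```

Under these assumptions, `name` is stored in `cf1:cq1` and `age` in `cf1:cq2` of the HBase table `table2`.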
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator="," -Dimporttsv.columns=HBASE_ROW_KEY,cf1:cq1,cf1:cq2,cf1:cq3,cf1:cq4,cf1:cq5 table2 /hperson
In this command, table2 is the name of the HBase table and /hperson is the path where the CSV file is stored.
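The -Dimporttsv.columns value in the command above assigns each field of a CSV line, by position, to either the row key (HBASE_ROW_KEY) or a family:qualifier column. The sketch below mimics that positional mapping for a single line. It is a simplified illustration rather than ImportTsv itself: the function `map_line` and the sample line are hypothetical, and the sketch raises an error on a field-count mismatch purely for clarity.

```python
# Sketch: map one separated line onto the importtsv.columns specification.
COLUMNS = "HBASE_ROW_KEY,cf1:cq1,cf1:cq2,cf1:cq3,cf1:cq4,cf1:cq5".split(",")


def map_line(line, columns, separator=","):
    """Return (row_key, {family:qualifier: value}) for one input line."""
    fields = line.rstrip("\n").split(separator)
    if len(fields) != len(columns):
        raise ValueError(
            f"expected {len(columns)} fields, got {len(fields)}"
        )
    row_key = None
    cells = {}
    for spec, value in zip(columns, fields):
        if spec == "HBASE_ROW_KEY":
            row_key = value  # this position supplies the HBase row key
        else:
            cells[spec] = value  # remaining positions become column values
    return row_key, cells


if __name__ == "__main__":
    # Hypothetical sample line with six comma-separated fields.
    key, cells = map_line("1,a,b,c,d,e", COLUMNS)
    print(key, cells)
```

Each line of the CSV file therefore needs as many fields as there are entries in -Dimporttsv.columns, in the same order.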
select * from hbaseTable1;