To migrate CarbonData data quickly from one cluster to another, you can use the CarbonData backup and restoration commands. This method does not require importing the data again in the target cluster, which reduces the migration time.
The Spark2x client has been installed in a directory, for example, /opt/client, in both clusters. The source cluster is cluster A, and the target cluster is cluster B.
source /opt/client/bigdata_env
source /opt/client/Spark2x/component_env
kinit carbondatauser
carbondatauser indicates the user that owns the original data, that is, a user that has read and write permissions on the tables.
You must add the user to the hadoop (primary group) and hive groups, and associate it with the System_administrator role.
spark-beeline
desc formatted <name of the table containing the original data>;
Location in the displayed information indicates the directory where the data file resides.
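As a minimal sketch, the Location value can be pulled out of saved desc formatted output with a small shell helper. The function name extract_location is hypothetical, and the sketch assumes the output uses the usual pipe-delimited beeline table layout (| col_name | data_type | comment |); adjust it if your client prints a different format.

```shell
# Hypothetical helper: read "desc formatted" output on stdin and print
# the value of the Location row.
# Assumption: rows look like "| Location | <path> | |".
extract_location() {
  awk -F'|' '
    {
      key = $2; val = $3
      # Trim surrounding spaces and tabs from both columns.
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == "Location" || key == "Location:") { print val; exit }
    }'
}
```

For example, if the output was saved to a file named desc.txt (a placeholder name), running extract_location < desc.txt prints only the directory path.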
source /opt/client/bigdata_env
source /opt/client/Spark2x/component_env
kinit carbondatauser2
carbondatauser2 indicates the user that uploads data.
You must add the user to the hadoop (primary group) and hive groups, and associate it with the System_administrator role.
When uploading data to cluster B, ensure that the upload directory contains directories with the same names as the database and table in the original directory, and that the upload user has write permission on the upload directory. After the data is uploaded, the user has read and write permissions on the data.
For example, if the original data is stored in /user/carbondatauser/warehouse/db1/tb1, the data can be stored in /user/carbondatauser2/warehouse/db1/tb1 in the new cluster.
hdfs dfs -get /user/carbondatauser/warehouse/db1/tb1 /opt/backup
scp -r /opt/backup root@<IP address of the client node of cluster B>:/opt/backup
hdfs dfs -put /opt/backup /user/carbondatauser2/warehouse/db1/tb1
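The directory mapping and a basic check of the upload can be sketched in shell as follows. Both function names are hypothetical and neither calls HDFS itself. The second helper assumes the standard column order of hdfs dfs -count output (DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME) and compares only the file count and the total size.

```shell
# Hypothetical helper: map a source table directory to the upload
# directory in cluster B, keeping the database and table names.
# $1: source table directory, $2: warehouse root of the upload user
target_table_dir() {
  local db tb
  db=$(basename "$(dirname "$1")")   # second-to-last path component
  tb=$(basename "$1")                # last path component
  printf '%s/%s/%s\n' "${2%/}" "$db" "$tb"
}

# Hypothetical check: compare one line of "hdfs dfs -count" output from
# each cluster. Succeeds when file count and total size both match.
same_counts() {
  [ -n "$1" ] && [ -n "$2" ] || return 1   # missing output counts as a mismatch
  printf '%s\n%s\n' "$1" "$2" | awk '
    NR == 1 { files = $2; bytes = $3 }
    NR == 2 { exit !(files == $2 && bytes == $3) }'
}
```

Here, target_table_dir /user/carbondatauser/warehouse/db1/tb1 /user/carbondatauser2/warehouse prints /user/carbondatauser2/warehouse/db1/tb1. To use same_counts, run hdfs dfs -count on the table directory in each cluster and pass the two output lines as arguments. A mismatch suggests that the upload is incomplete or that the directory was nested one level too deep.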
REFRESH TABLE $dbName.$tbName;
$dbName indicates the database name, and $tbName indicates the table name.
REGISTER INDEX TABLE $tableName ON $maintable;
$tableName indicates the index table name, and $maintable indicates the name of the main table on which the index was created.
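When a table has several index tables, the statements above can be generated rather than typed by hand. The following is a minimal sketch. The function name registration_statements is hypothetical, the statements follow the forms shown above, and the REGISTER INDEX TABLE lines assume that the beeline session is already using the target database.

```shell
# Hypothetical helper: print the statements for one migrated table and
# its index tables (if any), in the order given in this procedure.
# Usage: registration_statements <db> <main_table> [index_table ...]
registration_statements() {
  local db="$1" main="$2" idx
  shift 2
  printf 'REFRESH TABLE %s.%s;\n' "$db" "$main"
  for idx in "$@"; do
    # Assumes the current database is already the target database.
    printf 'REGISTER INDEX TABLE %s ON %s;\n' "$idx" "$main"
  done
}
```

For example, registration_statements db1 tb1 idx1 prints one REFRESH TABLE statement for db1.tb1 followed by one REGISTER INDEX TABLE statement for idx1. The output can then be pasted into spark-beeline.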