COPY is a cqlsh command with two forms, COPY TO and COPY FROM. COPY TO exports data from Cassandra, and COPY FROM imports data into Cassandra.
You can run COPY TO to export data from an existing Cassandra instance and then run COPY FROM to import the data to an RDBMS instance or a new Cassandra instance. Currently, data can be copied to or from CSV and JSON files.
You are advised to import and export data during off-peak hours to minimize the impact on your services.
You have connected to a DB instance. For details, see Connecting to a GaussDB(for Cassandra) Instance Over Private Networks.
./cqlsh <DB_HOST> -u <user_name>
COPY cycling.cyclist_name TO '/home/cas/copydata';
./cqlsh <DB_HOST> -u <user_name> -e "COPY cycling.cyclist_name TO '/home/cas/copydata'";
COPY <table name> [(<column>, ...)] TO <file name> WITH <copy option> [AND <copy option> ...]
nohup ./cqlsh <DB_HOST> --request-timeout=3600 --debug -e "COPY nihao.sz_user TO '/home/cas/copydata' with WHERECONDITION='update_timestamp=1' AND NUMPROCESSES=12 AND RATEFILE='rate.txt' AND RESULTFILE='export_result' AND dataformats='json';" >export.log 2>&1 &
Parameter description:
The common parameters are as follows: NUMPROCESSES, RATEFILE, PAGESIZE, BEGINTOKEN, ENDTOKEN, MAXATTEMPTS, and MAXOUTPUTSIZE.
The newly added parameters are as follows: RESULTFILE, DATAFORMATS, and WHERECONDITION.
For details about other COPY TO parameters, see the Cassandra official documentation.
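Because every option after WITH must be joined with AND, long export statements are easy to get wrong by hand. The following is a minimal sketch of a shell helper that assembles the statement from a list of options. The function name build_copy_to is hypothetical and is not part of cqlsh. The table, path, and option values are taken from the example above.

```shell
# build_copy_to is a hypothetical helper, not a cqlsh feature.
# Usage: build_copy_to <table> <file> [<option>=<value> ...]
# Prints a COPY TO statement whose options are joined with AND.
build_copy_to() {
    table="$1"
    file="$2"
    shift 2
    stmt="COPY $table TO '$file'"
    sep=" WITH "                 # the first option is introduced by WITH
    for opt in "$@"; do
        stmt="$stmt$sep$opt"
        sep=" AND "              # every later option is introduced by AND
    done
    printf '%s;\n' "$stmt"
}

build_copy_to nihao.sz_user /home/cas/copydata \
    "NUMPROCESSES=12" "RATEFILE='rate.txt'" "DATAFORMATS='json'"
```

The printed statement can then be passed to cqlsh with the -e option, for example ./cqlsh <DB_HOST> -e "$(build_copy_to ...)".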
./cqlsh <DB_HOST> -e "COPY cycling.cyclist_name TO '/home/cas/copydata'"
./cqlsh <DB_HOST> -e "COPY cycling.cyclist_name TO '/home/cas/copydata/cycling.cyclist_name'"
The default number of threads is calculated as follows: <Number of vCPUs>-1
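To see what this default would be on the client host, the formula can be evaluated in the shell. This is a rough sketch that assumes a Linux host where the nproc utility is available. The function name default_numprocesses is hypothetical, and the floor of 1 on a single-vCPU host is an assumption that the formula above does not state.

```shell
# default_numprocesses is a hypothetical helper, not a cqlsh feature.
# Usage: default_numprocesses [<vcpu count>]
# Without an argument, the vCPU count is read from nproc (Linux).
default_numprocesses() {
    vcpus="${1:-$(nproc)}"
    n=$((vcpus - 1))             # <Number of vCPUs>-1
    # Assumption: keep at least one worker on a single-vCPU host.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

default_numprocesses 8
```

If the computed value is too high for a host that also runs other workloads, a smaller value can be set explicitly with NUMPROCESSES.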
./cqlsh <DB_HOST> -e "COPY cycling.cyclist_name TO '/home/cas/copydata/cycling.cyclist_name' with MAXOUTPUTSIZE=1"
COPY <table name> [(<column>, ...)] FROM <file name> WITH <copy option> [AND <copy option> ...]
nohup ./cqlsh <DB_HOST> --request-timeout=3600 --debug -e "COPY nihao.sz_user FROM '/home/cas/copydata' with NUMPROCESSES=12 AND RATEFILE='rate.txt' AND dataformats='json';" >import.log 2>&1 &
Parameter description:
The common parameters are as follows: NUMPROCESSES, MAXROWS, INGESTRATE, ERRFILE, MAXBATCHSIZE, MINBATCHSIZE, CHUNKSIZE, MAXPARSEERRORS, MAXINSERTERRORS, SKIPROWS, and SKIPCOLS.
The newly added parameter is DATAFORMATS.
For details about other COPY FROM parameters, see the Cassandra official documentation.