In data processing based on foreign tables, GDS is used to transfer and synchronize data between multiple clusters.
When starting GDS, you can specify any directory as the data transit directory, for example, /opt. An example of the startup command is as follows:
/opt/gds/bin/gds -d /opt -p 192.168.0.2:5000 -H 192.168.0.1/24 -l /opt/gds/bin/gds_log.txt -D -t 2
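The options in this command can be read as in the following sketch. The variable names are hypothetical and are introduced only for illustration. The option descriptions reflect the commonly documented meanings of the gds options and should be checked against the GDS reference for your version. The script only assembles and prints the command (a dry run); it does not start GDS.

```shell
# Hypothetical variable names; the values mirror the example command above.
GDS_BIN=/opt/gds/bin/gds
DATA_DIR=/opt                       # -d: data transit directory
LISTEN_ADDR=192.168.0.2:5000        # -p: IP address and port that GDS listens on
ALLOWED_HOSTS=192.168.0.1/24        # -H: hosts allowed to connect to GDS (CIDR)
LOG_FILE=/opt/gds/bin/gds_log.txt   # -l: log file
WORKERS=2                           # -t: number of concurrent worker threads

# -D runs GDS as a daemon (background) process.
GDS_CMD="$GDS_BIN -d $DATA_DIR -p $LISTEN_ADDR -H $ALLOWED_HOSTS -l $LOG_FILE -D -t $WORKERS"

# Dry run: print the assembled command for review.
echo "$GDS_CMD"
```

Keeping the values in variables makes it easier to confirm that the listening address given to -p is the same address that is later referenced in syncsrv.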
Assume that the table tbl_remote in the remote cluster is to be synchronized with the table tbl_local in the local cluster and the user performing the synchronization is user_remote. Note that the user must have the permission to access the tbl_remote table.
CREATE SERVER server_remote FOREIGN DATA WRAPPER GC_FDW OPTIONS(
    address '192.168.178.207:8000',
    dbname 'db_remote',
    username 'user_remote',
    password 'xxxxxxxx',
    syncsrv 'gsfs://192.168.178.129:5000|gsfs://192.168.178.129:5000'
);
GaussDB(DWS) tests network connectivity to the GDS addresses specified by syncsrv.
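GaussDB(DWS) performs this check itself. If you want to check the addresses manually beforehand, the syncsrv value can be split into host and port pairs as in the following minimal sketch. The parse_syncsrv function is a hypothetical helper, not part of GaussDB(DWS) or GDS, and it assumes the format used in the example above: entries of the form gsfs://host:port separated by |.

```shell
# Hypothetical helper: print "host port" for each GDS address in a syncsrv value.
parse_syncsrv() {
    # Entries are separated by '|'; each entry has the form gsfs://host:port.
    printf '%s\n' "$1" | tr '|' '\n' | while IFS= read -r entry; do
        addr=${entry#gsfs://}      # strip the scheme prefix
        host=${addr%:*}            # text before the last ':'
        port=${addr##*:}           # text after the last ':'
        printf '%s %s\n' "$host" "$port"
    done
}

SYNCSRV='gsfs://192.168.178.129:5000|gsfs://192.168.178.129:5000'
parse_syncsrv "$SYNCSRV"
# Each printed pair can then be probed with a tool available on the host,
# for example: nc -z -w 3 <host> <port>   (assumes netcat is installed)
```

The probe itself is left as a comment because the available tool depends on the host environment.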
CREATE FOREIGN TABLE ft_tbl(
    col_1 type_name,
    col_2 type_name,
    ...
) SERVER server_remote OPTIONS (
    schema_name 'schema_remote',
    table_name 'tbl_remote',
    encoding 'utf8'
);
Full data synchronization of all columns:
INSERT INTO tbl_local SELECT * FROM ft_tbl;
Data synchronization of all columns based on filter criteria:
INSERT INTO tbl_local SELECT * FROM ft_tbl WHERE col_2 = XX;
Full data synchronization of some columns:
INSERT INTO tbl_local (col_1) SELECT col_1 FROM ft_tbl;
Data synchronization of some columns based on filter criteria:
INSERT INTO tbl_local (col_1) SELECT col_1 FROM ft_tbl WHERE col_2 = XX;
Synchronization of unsharded tables (data is written from the local table to the remote table through the foreign table):
INSERT INTO ft_tbl SELECT * FROM tbl_local;
Data synchronization of the join results:
INSERT INTO ft_tbl SELECT * FROM tbl_local1 JOIN tbl_local2 ON XXX;
DROP FOREIGN TABLE ft_tbl;