After performing steps in Creating a Foreign Server, create an OBS foreign table in the GaussDB(DWS) database to access the data stored in OBS. An OBS foreign table is read-only. It can only be queried using SELECT.
The syntax for creating a foreign table is as follows. For details, see the syntax CREATE FOREIGN TABLE (SQL on Hadoop or OBS).
1 2 3 4 5 6 7 8 9 10 | CREATE FOREIGN TABLE [ IF NOT EXISTS ] table_name ( [ { column_name type_name [ { [CONSTRAINT constraint_name] NULL | [CONSTRAINT constraint_name] NOT NULL | column_constraint [...]} ] | table_constraint [, ...]} [, ...] ] ) SERVER dfs_server OPTIONS ( { option_name ' value ' } [, ...] ) DISTRIBUTE BY {ROUNDROBIN | REPLICATION} [ PARTITION BY ( column_name ) [ AUTOMAPPED ] ] ; |
For example, when creating a foreign table named product_info_ext_obs, set parameters in the syntax as follows:
Specifies the name of the foreign table to be created.
Multiple columns are separate by commas (,).
The number of fields and field types in the foreign table must be the same as those in the data stored on OBS.
This parameter specifies the foreign server name of the foreign table. This server must exist. The foreign server connects to OBS to read data by setting its foreign server.
Enter the name of the foreign server created by following steps in Creating a Foreign Server.
These are parameters associated with the foreign table. The key parameters are as follows:
You can perform 2 in Preparing Data on OBS to obtain the complete OBS path of the data source file. The path is the endpoint of the OBS service.
This clause is mandatory. Currently, OBS foreign tables support only the ROUNDROBIN distribution mode.
It indicates that when a foreign table reads data from the data source, each node in the GaussDB(DWS) cluster randomly reads some data and integrates the random data to a complete data set.
Other parameters are optional. You can set them as required. In this example, you do not need to set these parameters.
Based on the preceding settings, the command for creating the foreign table is as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | DROP FOREIGN TABLE IF EXISTS product_info_ext_obs; CREATE FOREIGN TABLE product_info_ext_obs ( product_price integer not null, product_id char(30) not null, product_time date , product_level char(10) , product_name varchar(200) , product_type1 varchar(20) , product_type2 char(10) , product_monthly_sales_cnt integer , product_comment_time date , product_comment_num integer , product_comment_content varchar(200) ) SERVER obs_server OPTIONS ( format 'orc', foldername '/mybucket/demo.db/product_info_orc/', encoding 'utf8', totalrows '10' ) DISTRIBUTE BY ROUNDROBIN; |
Create an OBS foreign table that contains partition columns. The product_info_ext_obs foreign table uses the product_manufacturer column as the partition key. The following partition directories exist in obs/mybucket/demo.db/product_info_orc/:
Partition directory 1: product_manufacturer=10001
Partition directory 2: product_manufacturer=10010
Partition directory 3: product_manufacturer=10086
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | DROP FOREIGN TABLE IF EXISTS product_info_ext_obs; CREATE FOREIGN TABLE product_info_ext_obs ( product_price integer not null, product_id char(30) not null, product_time date , product_level char(10) , product_name varchar(200) , product_type1 varchar(20) , product_type2 char(10) , product_monthly_sales_cnt integer , product_comment_time date , product_comment_num integer , product_comment_content varchar(200) , product_manufacturer integer ) SERVER obs_server OPTIONS ( format 'orc', foldername '/mybucket/demo.db/product_info_orc/', encoding 'utf8', totalrows '10' ) DISTRIBUTE BY ROUNDROBIN PARTITION BY (product_manufacturer) AUTOMAPPED; |