This statement is used to create an OBS table using the Hive syntax. The main differences between the DataSource and the Hive syntax lie in the supported data formats and the number of supported partitions. For details, see syntax and precautions.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name [(col_name1 col_type1 [COMMENT col_comment1], ...)] [COMMENT table_comment] [PARTITIONED BY (col_name2 col_type2, [COMMENT col_comment2], ...)] [ROW FORMAT row_format] [STORED AS file_format] LOCATION 'obs_path' [TBLPROPERTIES (key = value)] [AS select_statement]; row_format: : SERDE serde_cls [WITH SERDEPROPERTIES (key1=val1, key2=val2, ...)] | DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char] [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char] [NULL DEFINED AS char] |
For example, you can use this statement to enable the multiversion function to back up and restore table data. After the multiversion function is enabled, the system automatically backs up table data when you delete or modify the data using insert overwrite or truncate, and retains the data for a certain period. You can quickly restore data within the retention period. For details about SQL syntax related to the multiversion function, see Enabling or Disabling Multiversion Backup and Backing Up and Restoring Data of Multiple Versions.
When creating an OBS table, you can use TBLPROPERTIES ("dli.multi.version.enable"="true") to enable multiversion. For details, see the following example.
Key |
Value |
---|---|
dli.multi.version.enable |
|
comment |
Description of the table |
orc.compress |
An attribute of the ORC table, which specifies the compression mode of the ORC storage. Available values are as follows:
|
auto.purge |
If this parameter is set to true, the deleted or overwritten data is removed and will not be dumped to the recycle bin. |
Parameter |
Description |
---|---|
db_name |
Database name that contains letters, digits, and underscores (_). The value cannot contain only digits and cannot start with a digit or underscore (_). |
table_name |
Table name of a database that contains letters, digits, and underscores (_). The value cannot contain only digits and cannot start with a digit or underscore (_). The matching rule is ^(?!_)(?![0-9]+$)[A-Za-z0-9_$]*$. If special characters are required, use single quotation marks ('') to enclose them. |
col_name |
Field name |
col_type |
Field type |
col_comment |
Field description |
row_format |
Line data format |
file_format |
OBS table storage format. TEXTFILE, AVRO, ORC, SEQUENCEFILE, RCFILE, and PARQUET are supported. |
table_comment |
Table description |
obs_path |
OBS path |
key = value |
Set table properties and values. For example, if you want to enable multiversion, you can set "dli.multi.version.enable"="true". |
select_statement |
The CREATE TABLE AS statement is used to insert the SELECT query result of the source table or a data record to a new table in OBS bucket. |
1 | CREATE TABLE student (id INT, name STRING, score FLOAT) STORED AS PARQUET LOCATION 'obs://bucketName/filePath'; |
1 | CREATE TABLE IF NOT EXISTS student(name STRING, score DOUBLE) PARTITIONED BY (classNo INT) STORED AS PARQUET LOCATION 'obs://bucketName/filePath'; |
classNo is a partition field and must be specified after the PARTITIONED BY keyword, that is, PARTITIONED BY (classNo INT). It cannot be specified after the table name as a table field.
1 | CREATE TABLE t1 STORED AS parquet LOCATION 'obs://bucketName/filePath' as select * from t2; |
1 | CREATE TABLE student (id INT, name STRING, score FLOAT) STORED AS PARQUET LOCATION 'obs://bucketName/filePath' TBLPROPERTIES ("dli.multi.version.enable"="true"); |