Partitioning refers to splitting what is logically one large table into smaller physical pieces based on specific schemes. The table based on the logic is called a partition cable, and a physical piece is called a partition. Data is stored on these smaller physical pieces, namely, partitions, instead of the larger logical partitioned table. During conditional query, the system scans only the partitions that meet the conditions rather than scanning the entire table improving query performance.
Advantages of partitioned tables:
You can choose to partition a table when the table has the following characteristics:
A table is partitioned based on obvious rangeable fields. Generally, columns such as date, area, and value are used for partitioning. The time column is most commonly used.
If the queried data fall into specific ranges, its better tables are partitioned so that through partition pruning, only the queried partition needs to be scanned, improving data scanning efficiency and reducing the I/O overhead of data scanning.
Scanning small tables does not take much time, therefore the performance benefits of partitioning are not significant. Therefore, you are advised to partition only large tables. In column-store table, each column is an independent file storage unit, and the minimum storage unit CU can store 60,000 rows of data. Therefore, for column-store partitioned tables, it is recommended that the data volume in each partition be greater than or equal to the number of DNs multiplied by 60,000.
Example: Create a table web_returns_p1 partitioned by the range wr_returned_date_sk.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | CREATE TABLE web_returns_p1 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE (wr_returned_date_sk) ( PARTITION p2016 VALUES LESS THAN(20161231), PARTITION p2017 VALUES LESS THAN(20171231), PARTITION p2018 VALUES LESS THAN(20181231), PARTITION p2019 VALUES LESS THAN(20191231), PARTITION pxxxx VALUES LESS THAN(maxvalue) ); |
Create partitions in batches, with fixed partition ranges. The following example can be used:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | CREATE TABLE web_returns_p2 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE(wr_returned_date_sk) ( PARTITION p2016 START(20161231) END(20191231) EVERY(10000), PARTITION p0 END(maxvalue) ); |
A list partitioned table can use any column that allows value comparison as the partition key column. When creating a list partitioned table, you must declare the value partition for each partition.
Example: Create a list partitioned table sales_info.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | CREATE TABLE sales_info ( sale_time timestamptz, period int, city text, price numeric(10,2), remark varchar2(100) ) DISTRIBUTE BY HASH(sale_time) PARTITION BY LIST (period, city) ( PARTITION province1_202201 VALUES (('202201', 'city1'), ('202201', 'city2')), PARTITION province2_202201 VALUES (('202201', 'city3'), ('202201', 'city4'), ('202201', 'city5')), PARTITION rest VALUES (DEFAULT) ); |
A table can be partitioned only when it is created. If you want to partition a table, you must create a partitioned table, load the data in the original table to the partitioned table, delete the original table, and rename the partitioned table as the name of the original table. You must also re-grant permissions on the table to users. For example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | CREATE TABLE web_returns_p2 ( wr_returned_date_sk integer, wr_returned_time_sk integer, wr_item_sk integer NOT NULL, wr_refunded_customer_sk integer ) WITH (orientation = column) DISTRIBUTE BY HASH (wr_item_sk) PARTITION BY RANGE(wr_returned_date_sk) ( PARTITION p2016 START(20161231) END(20191231) EVERY(10000), PARTITION p0 END(maxvalue) ); |
1 2 3 4 5 | INSERT INTO web_returns_p2 SELECT * FROM web_returns_p1; DROP TABLE web_returns_p1; ALTER TABLE web_returns_p2 RENAME TO web_returns_p1; GRANT ALL PRIVILEGES ON web_returns_p1 TO dbadmin; GRANT SELECT ON web_returns_p1 TO jack; |
Run the ALTER TABLE statement to add a partition to a partitioned table. For example, to add partition P2020 to the web_returns_p1 table, run the following command:
1 | ALTER TABLE web_returns_p1 ADD PARTITION P2020 VALUES LESS THAN (20201231); |
The syntax for splitting a partition varies between a range partitioned table and a list partitioned table.
1 | ALTER TABLE web_returns_p1 SPLIT PARTITION pxxxx AT(20201231) INTO (PARTITION p2020,PARTITION p20xx); |
1 | ALTER TABLE sales_info SPLIT PARTITION province2_202201 VALUES(('202201', 'city5')) INTO (PARTITION province3_202201,PARTITION province4_202201); |
Run the ALTER TABLE statement to merge two partitions in a partitioned table. For example, merge partitions p2016 and p2017 of table web_returns_p1 into one partition p20162017.
1 | ALTER TABLE web_returns_p1 MERGE PARTITIONS p2016,p2017 INTO PARTITION p20162017; |
1 | ALTER TABLE web_returns_p1 DROP PARTITION P2020; |
1 2 | SELECT * FROM web_returns_p1 PARTITION (p2019); SELECT * FROM web_returns_p1 PARTITION FOR (20201231); |
1 | SELECT * FROM dba_tab_partitions where table_name='web_returns_p1'; |
Run the DROP TABLE statement to delete a partitioned table.
1 | DROP TABLE web_returns_p1; |