forked from docs/doc-exports
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Lu, Huayi <luhuayi@huawei.com> Co-committed-by: Lu, Huayi <luhuayi@huawei.com>
12 KiB
12 KiB
Stream Operation Hints
Function
Specifies the stream method, which can be broadcast, redistribute, or specifying the distribution key for Agg redistribution.
Syntax
1 | [no] broadcast | redistribute(table_list) | redistribute ((*) (columns)) |
Parameter Description
- no indicates that the hinted stream method is not used. When the hint is specified for the distribution columns in the Agg redistribution, no is invalid.
- table_list specifies the tables to be joined. For details, see Parameter Description.
- When hints are specified for distribution columns, the asterisk (*) is fixed and the table name cannot be specified.
- columns specifies one or more columns in the GROUP BY clause. When there are no GROUP BY clauses, it can specify the columns in the DISTINCT clause.'

- The specified distribution column must be represented by the column number in GROUP BY or DISTINCT. The column name cannot be specified.
- For a multi-layer query, you can specify the distribution column hint at each layer. The hint takes effect only at the corresponding layer.
- If the optimizer finds that redistribution is not required after estimation, the specified distribution column is invalid.
Tips
- Generally, the optimizer selects a group of non-skew distribution keys for data redistribution based on statistics. If the default distribution keys have data skew, you can manually specify the distribution columns to avoid data skew.
- When selecting a distribution key, select a group of columns with high distinct values as the distribution key based on data distribution features. In this way, data can be evenly distributed to each DN after redistribution.
- After writing hints, you can run explain verbose to print the execution plan and check whether the specified distribution key is valid. If the specified distribution key is invalid, a warning is displayed.
Example
- Hint the query plan in Examples as follows:
1 2
explain select /*+ no redistribute(store_sales store_returns item store) leading(((store_sales store_returns item store) customer)) */ i_product_name product_name ...
In the original plan, the join result of store_sales, store_returns, item, and store is redistributed before it is joined with customer. After the hinting, the redistribution is disabled and the join order is retained. The optimized plan is as follows:
- Specifies the distribution columns for Agg redistribution.
1 2
explain (verbose on, costs off, nodes off) select /*+ redistribute ((*) (2 3)) */ a1, b1, c1, count(c1) from t1 group by a1, b1, c1 having count(c1) > 10 and sum(d1) > 100
In the following example, the last two columns of the specified GROUP BY columns are used as distribution keys.
- If the statement does not contain the GROUP BY clause, specify the distinct column as the distribution columns.
1 2
explain (verbose on, costs off, nodes off) select /*+ redistribute ((*) (3 1)) */ distinct a1, b1, c1 from t1;
Parent topic: Hint-based Tuning