Regular aggregate functions return a single value calculated from values in a row, or group all rows into a single output row. Window functions perform a calculation across a set of rows and return a value for each row.
1 | function_name ([expression [, expression ... ]]) OVER ( window_definition ) function_name ([expression [, expression ... ]]) OVER window_namefunction_name ( * ) OVER ( window_definition ) function_name ( * ) OVER window_name |
window_definition is defined as follows:
1 | [ existing_window_name ] [ PARTITION BY expression [, ...] ] [ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ] [ frame_clause ] |
frame_clause is defined as follows:
1 | [ RANGE | ROWS ] frame_start [ RANGE | ROWS ] BETWEEN frame_start AND frame_end |
You can use RANGE and ROWS to specify the window frame. ROWS specifies the window in physical units (rows). RANGE specifies the window as a logical offset.
In RANGE and ROWS, you can use BETWEEN frame_start AND frame_end to specify the window's first and last rows. If frame_end is left blank, it defaults to CURRENT ROW.
The value options of BETWEEN frame_start AND frame_end are as follows:
frame_start cannot be UNBOUNDED FOLLOWING, frame_end cannot be UNBOUNDED PRECEDING, and frame_end cannot be earlier than frame_start. For example, RANGE BETWEEN CURRENT ROW AND value PRECEDING is not allowed.
Description: The RANK function is used for generating non-consecutive sequence numbers for the values in each group. The same values have the same sequence number.
Return type: bigint
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, rank() OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | rank -------+------------+------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 3 1 | 2 | 3 2 | 3 | 1 2 | 3 | 1 (6 rows) |
Description: The ROW_NUMBER function is used for generating consecutive sequence numbers for the values in each group. The same values have different sequence numbers.
Return type: bigint
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, Row_number() OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | row_number -------+------------+------------ 1 | 1 | 1 1 | 1 | 2 1 | 2 | 3 1 | 2 | 4 2 | 3 | 1 2 | 3 | 2 (6 rows) |
Description: The DENSE_RANK function is used for generating consecutive sequence numbers for the values in each group. The same values have the same sequence number.
Return type: bigint
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, dense_rank() OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | dense_rank -------+------------+------------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 2 1 | 2 | 2 2 | 3 | 1 2 | 3 | 1 (6 rows) |
Description: The PERCENT_RANK function is used for generating corresponding sequence numbers for the values in each group. That is, the function calculates the value according to the formula Sequence number = (Rank – 1)/(Total rows – 1). Rank is the corresponding sequence number generated based on the RANK function for the value and Total rows is the total number of elements in a group.
Return type: double precision
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, percent_rank() OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | percent_rank -------+------------+------------------ 1 | 1 | 0 1 | 1 | 0 1 | 2 | .666666666666667 1 | 2 | .666666666666667 2 | 3 | 0 2 | 3 | 0 (6 rows) |
Description: The CUME_DIST function is used for generating accumulative distribution sequence numbers for the values in each group. That is, the function calculates the value according to the following formula: Sequence number = Number of rows preceding or peer with current row/Total rows.
Return type: double precision
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, cume_dist() OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date e_dim WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | cume_dist -------+------------+----------- 1 | 1 | .5 1 | 1 | .5 1 | 2 | 1 1 | 2 | 1 2 | 3 | 1 2 | 3 | 1 (6 rows) |
Description: The NTILE function is used for equally allocating sequential data sets to the buckets whose quantity is specified by num_buckets according to num_buckets integer and allocating the bucket number to each row. Divide the partition as equally as possible.
Return type: integer
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, ntile(3) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | ntile -------+------------+------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 2 1 | 2 | 3 2 | 3 | 1 2 | 3 | 2 (6 rows) |
Description: The LAG function is used for generating lag values for the corresponding values in each group. That is, the value of the row obtained by moving forward the row corresponding to the current value by offset (integer) is the sequence number. If the row does not exist after the moving, the result value is the default value. If omitted, offset defaults to 1 and default to null.
Return type: same as the parameter type
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, lag(d_mon,3,null) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | lag -------+------------+----- 1 | 1 | 1 | 1 | 1 | 2 | 1 | 2 | 1 2 | 3 | 2 | 3 | (6 rows) |
Description: The LEAD function is used for generating leading values for the corresponding values in each group. That is, the value of the row obtained by moving backward the row corresponding to the current value by offset (integer) is the sequence number. If the number of rows after the moving exceeds the total number for the current group, the result value is the default value. If omitted, offset defaults to 1 and default to null.
Return type: same as the parameter type
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, lead(d_week_seq,2) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | lead -------+------------+------ 1 | 1 | 2 1 | 1 | 2 1 | 2 | 1 | 2 | 2 | 3 | 2 | 3 | (6 rows) |
Description: The FIRST_VALUE function is used for returning the first value of each group.
Return type: same as the parameter type
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, first_value(d_week_seq) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 7 ORDER BY 1,2; d_mon | d_week_seq | first_value -------+------------+------------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 1 1 | 2 | 1 2 | 3 | 3 2 | 3 | 3 (6 rows) |
Description: Returns the last value of each group.
Return type: same as the parameter type
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, last_value(d_mon) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 6 ORDER BY 1,2; d_mon | d_week_seq | last_value -------+------------+------------ 1 | 1 | 1 1 | 1 | 1 1 | 2 | 1 1 | 2 | 1 2 | 3 | 2 2 | 3 | 2 (6 rows) |
Description: The nth row for a group is the returned value. If the row does not exist, NULL is returned by default.
Return type: same as the parameter type
For example:
1 2 3 4 5 6 7 8 9 10 | SELECT d_mon, d_week_seq, nth_value(d_week_seq,2) OVER(PARTITION BY d_mon ORDER BY d_week_seq) FROM reason_date WHERE d_mon < 4 AND d_week_seq < 6 ORDER BY 1,2; d_mon | d_week_seq | nth_value -------+------------+----------- 1 | 1 | 1 1 | 1 | 1 1 | 2 | 1 1 | 2 | 1 2 | 3 | 3 2 | 3 | 3 (6 rows) |