Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com> Co-authored-by: Su, Xiaomeng <suxiaomeng1@huawei.com> Co-committed-by: Su, Xiaomeng <suxiaomeng1@huawei.com>
15 KiB
Overview
Table 1 lists the aggregate functions supported by DLI.
Syntax |
Value Type |
Description |
---|---|---|
avg(col), avg(DISTINCT col) |
DOUBLE |
Returns the average value. |
corr(col1, col2) |
DOUBLE |
Returns the coefficient of correlation of a pair of numeric columns. |
count([distinct|all] <colname>) |
BIGINT |
Returns the number of records. |
covar_pop(col1, col2) |
DOUBLE |
Returns the covariance of a pair of numeric columns. |
covar_samp(col1, col2) |
DOUBLE |
Returns the sample covariance of a pair of numeric columns. |
max(col) |
DOUBLE |
Returns the maximum value. |
min(col) |
DOUBLE |
Returns the minimum value. |
percentile(BIGINT col, p) |
DOUBLE |
Returns the percentage value point of the value area. The value of p must be between 0 and 1. Otherwise, NULL is returned. The value cannot be a float. |
percentile_approx(DOUBLE col, p [, B]) |
DOUBLE |
Returns the approximate pth percentile of a numerical column within the group, including floating-point numbers. The value of p should be between 0 and 1. The parameter B controls the accuracy of the approximation, with a higher value of B resulting in a higher level of approximation. The default value is 10000. If the number of non-repeating values in the column is less than B, an exact percentile is returned. |
stddev_pop(col) |
DOUBLE |
Returns the deviation of a specified column. |
stddev_samp(col) |
DOUBLE |
Returns the sample deviation of a specified column. |
sum(col), sum(DISTINCT col) |
DOUBLE |
Returns the sum of the values in a column. |
variance(col), var_pop(col) |
DOUBLE |
Returns the variance of a column. |
var_samp(col) |
DOUBLE |
Returns the sample variance of a specified column. |