Edit this page

Aggregate Functions

Examples

Produce a single row containing the sum of the amount column:

Query

SELECT sum(amount)FROM sales;

Result

 sum----- 170

Produce one row per unique region, containing the sum of amount for each group:

Query

SELECT region, sum(amount)FROM salesGROUP BY region;

Result

 region | sum--------+----- east   | 110 north  |  40 south  |  20

Return only the regions that have a sum of amount higher than 100:

Query

SELECT regionFROM salesGROUP BY regionHAVING sum(amount) > 100;

Result

 region-------- east

Return the number of unique values in the region column:

Query

SELECT count(DISTINCT region)FROM sales;

Result

 count-------     3

Return two values, the total sum of amount and the sum of amount minus columns where the region is north using the FILTER clause:

Query

SELECT sum(amount), sum(amount) FILTER (region != 'north')FROM sales;

Result

 sum | sum-----+----- 170 | 130

Returns a list of all regions in order of the amount column:

Query

SELECT list(region ORDER BY amount DESC)FROM sales;

Result

 list-------------------------- {east,north,south,north}

Returns the amount of the first sale using the first() aggregate function:

Query

SELECT first(amount ORDER BY date ASC)FROM sales;

Result

 first-------    10

Syntax

Aggregates are functions that combine multiple rows into a single value. Aggregates are different from scalar functions and window functions because they change the cardinality of the result. As such, aggregates can only be used in the SELECT and HAVING clauses of a SQL query.

`DISTINCT` Clause in Aggregate Functions

When the DISTINCT clause is provided, only distinct values are considered in the computation of the aggregate. This is typically used in combination with the count aggregate to get the number of distinct elements; but it can be used together with any aggregate function in the system. There are some aggregates that are insensitive to duplicate values (e.g., min and max) and for them this clause is parsed and ignored.

`ORDER BY` Clause in Aggregate Functions

An ORDER BY clause can be provided after the last argument of the function call. Note the lack of the comma separator before the clause.

Query

SELECT string_agg(region, ', ' ORDER BY amount DESC) AS string_aggFROM sales;

Result

 string_agg--------------------------- east, north, south, north

This clause ensures that the values being aggregated are sorted before applying the function. Most aggregate functions are order-insensitive, and for them this clause is parsed and discarded. However, there are some order-sensitive aggregates that can have non-deterministic results without ordering, e.g., first, last, list and string_agg / group_concat / listagg. These can be made deterministic by ordering the arguments.

For example:

Query

CREATE TABLE tbl AS    SELECT s FROM range(1, 4) r(s);
SELECT string_agg(s, ', ' ORDER BY s DESC) AS countdownFROM tbl;

Result

 countdown----------- 3, 2, 1

Handling `NULL` Values

All general aggregate functions ignore NULLs, except for list (array_agg), first (arbitrary) and last. To exclude NULLs from list, you can use a FILTER clause. To ignore NULLs from first, you can use the any_value aggregate.

All general aggregate functions except count return NULL on empty groups. In particular, list does not return an empty list, sum does not return zero, and string_agg does not return an empty string in this case.

General Aggregate Functions

The examples in this section run against the following table:

Query

CREATE TABLE agg_demo (a INTEGER, b INTEGER, s VARCHAR, w DOUBLE, bl BOOLEAN);
INSERT INTO agg_demo VALUES    (10, 1, 'a', 1.0, true),    (20, 2, 'b', 2.0, false),    (30, 3, 'c', 3.0, true),    (40, 4, 'd', 4.0, true);

The table below shows the available general aggregate functions.

Function	Description
`any_value(arg)`	Returns the first non-null value from `arg`. This function is affected by ordering.
`arg_max(arg, val)`	Finds the row with the maximum `val` and calculates the `arg` expression at that row. Rows where the value of the `arg` or `val` expression is `NULL` are ignored. This function is affected by ordering.
`arg_max(arg, val, n)`	The generalized case of `arg_max` for `n` values: returns a `LIST` containing the `arg` expressions for the top `n` rows ordered by `val` descending. This function is affected by ordering.
`arg_max_null(arg, val)`	Finds the row with the maximum `val` and calculates the `arg` expression at that row. Rows where the `val` expression evaluates to `NULL` are ignored. This function is affected by ordering.
`arg_min(arg, val)`	Finds the row with the minimum `val` and calculates the `arg` expression at that row. Rows where the value of the `arg` or `val` expression is `NULL` are ignored. This function is affected by ordering.
`arg_min(arg, val, n)`	Returns a `LIST` containing the `arg` expressions for the "bottom" `n` rows ordered by `val` ascending. This function is affected by ordering.
`arg_min_null(arg, val)`	Finds the row with the minimum `val` and calculates the `arg` expression at that row. Rows where the `val` expression evaluates to `NULL` are ignored. This function is affected by ordering.
`avg(arg)`	Calculates the average of all non-null values in `arg`. This function is affected by ordering.
`bit_and(arg)`	Returns the bitwise AND of all bits in a given expression.
`bit_or(arg)`	Returns the bitwise OR of all bits in a given expression.
`bit_xor(arg)`	Returns the bitwise XOR of all bits in a given expression.
`bitstring_agg(arg)`	Returns a bitstring whose length corresponds to the range of the non-null (integer) values, with bits set at the location of each (distinct) value.
`bool_and(arg)`	Returns `true` if every input value is `true`, otherwise `false`.
`bool_or(arg)`	Returns `true` if any input value is `true`, otherwise `false`.
`count()`	Returns the number of rows.
`count(arg)`	Returns the number of rows where `arg` is not `NULL`.
`countif(arg)`	Returns the number of rows where `arg` is `true`.
`favg(arg)`	Calculates the average using a more accurate floating point summation (Kahan Sum). This function is affected by ordering.
`first(arg)`	Returns the first value (null or non-null) from `arg`. This function is affected by ordering.
`fsum(arg)`	Calculates the sum using a more accurate floating point summation (Kahan Sum). This function is affected by ordering.
`geometric_mean(arg)`	Calculates the geometric mean of all non-null values in `arg`. This function is affected by ordering.
`histogram(arg)`	Returns a `MAP` of key-value pairs representing buckets and counts.
`histogram(arg, boundaries)`	Returns a `MAP` of key-value pairs representing the provided upper `boundaries` and counts of elements in the corresponding bins (left-open and right-closed partitions) of the datatype. A boundary at the largest value of the datatype is automatically added when elements larger than all provided `boundaries` appear, see `is_histogram_other_bin`. Boundaries may be provided, e.g., via `equi_width_bins`.
`histogram_exact(arg, elements)`	Returns a `MAP` of key-value pairs representing the requested elements and their counts. A catch-all element specific to the data-type is automatically added to count other elements when they appear, see `is_histogram_other_bin`.
`histogram_values(source, boundaries)`	Returns the upper boundaries of the bins and their counts.
`last(arg)`	Returns the last value of a column. This function is affected by ordering.
`list(arg)`	Returns a `LIST` containing all the values of a column. This function is affected by ordering.
`max(arg)`	Returns the maximum value present in `arg`. This function is unaffected by distinctness.
`max(arg, n)`	Returns a `LIST` containing the `arg` values for the "top" `n` rows ordered by `arg` descending.
`min(arg)`	Returns the minimum value present in `arg`. This function is unaffected by distinctness.
`min(arg, n)`	Returns a `LIST` containing the `arg` values for the "bottom" `n` rows ordered by `arg` ascending.
`product(arg)`	Calculates the product of all non-null values in `arg`. This function is affected by ordering.
`string_agg(arg)`	Concatenates the column string values with a comma separator (`,`). This function is affected by ordering.
`string_agg(arg, sep)`	Concatenates the column string values with a separator. This function is affected by ordering.
`sum(arg)`	Calculates the sum of all non-null values in `arg` / counts `true` values when `arg` is boolean. The floating-point versions of this function are affected by ordering.
`weighted_avg(arg, weight)`	Calculates the weighted average of all non-null values in `arg`, where each value is scaled by its corresponding `weight`. If `weight` is `NULL`, the corresponding `arg` value will be skipped. This function is affected by ordering.

`any_value(arg)`

Returns the first non-NULL value from arg. This function is affected by ordering.

Query

SELECT any_value(a ORDER BY a) AS any_value FROM agg_demo;

Result

 any_value-----------        10

`arg_max(arg, val)`

Finds the row with the maximum val and calculates the arg expression at that row. Rows where the value of the arg or val expression is NULL are ignored. This function is affected by ordering. Aliases: argmax(arg, val), max_by(arg, val).

Query

SELECT arg_max(a, b) AS arg_max FROM agg_demo;

Result

 arg_max---------      40

`arg_max(arg, val, n)`

The generalized case of arg_max for n values: returns a LIST containing the arg expressions for the top n rows ordered by val descending. This function is affected by ordering. Aliases: argmax(arg, val, n), max_by(arg, val, n).

Query

SELECT arg_max(a, b, 2) AS arg_max FROM agg_demo;

Result

 arg_max--------- {40,30}

`arg_max_null(arg, val)`

Finds the row with the maximum val and calculates the arg expression at that row. Rows where the val expression evaluates to NULL are ignored. This function is affected by ordering.

Query

SELECT arg_max_null(a, b) AS arg_max_null FROM agg_demo;

Result

 arg_max_null--------------           40

`arg_min(arg, val)`

Finds the row with the minimum val and calculates the arg expression at that row. Rows where the value of the arg or val expression is NULL are ignored. This function is affected by ordering. Aliases: argmin(arg, val), min_by(arg, val).

Query

SELECT arg_min(a, b) AS arg_min FROM agg_demo;

Result

 arg_min---------      10

`arg_min(arg, val, n)`

The generalized case of arg_min for n values: returns a LIST containing the arg expressions for the bottom n rows ordered by val ascending. This function is affected by ordering. Aliases: argmin(arg, val, n), min_by(arg, val, n).

Query

SELECT arg_min(a, b, 2) AS arg_min FROM agg_demo;

Result

 arg_min--------- {10,20}

`arg_min_null(arg, val)`

Finds the row with the minimum val and calculates the arg expression at that row. Rows where the val expression evaluates to NULL are ignored. This function is affected by ordering.

Query

SELECT arg_min_null(a, b) AS arg_min_null FROM agg_demo;

Result

 arg_min_null--------------           10

`avg(arg)`

Calculates the average of all non-null values in arg. This function is affected by ordering. Alias: mean.

Query

SELECT avg(a) AS avg FROM agg_demo;

Result

 avg-----  25

`bit_and(arg)`

Returns the bitwise AND of all bits in a given expression.

Query

SELECT bit_and(a) AS bit_and FROM agg_demo;

Result

 bit_and---------       0

`bit_or(arg)`

Returns the bitwise OR of all bits in a given expression.

Query

SELECT bit_or(a) AS bit_or FROM agg_demo;

Result

 bit_or--------     62

`bit_xor(arg)`

Returns the bitwise XOR of all bits in a given expression.

Query

SELECT bit_xor(a) AS bit_xor FROM agg_demo;

Result

 bit_xor---------      40

`bitstring_agg(arg)`

Returns a bitstring whose length corresponds to the range of the non-null (integer) values, with bits set at the location of each (distinct) value.

Query

SELECT bitstring_agg(a) AS bitstring_agg FROM agg_demo;

Result

 bitstring_agg--------------------------------- 1000000000100000000010000000001

`bool_and(arg)`

Returns true if every input value is true, otherwise false.

Query

SELECT bool_and(bl) AS bool_and FROM agg_demo;

Result

 bool_and---------- f

`bool_or(arg)`

Returns true if any input value is true, otherwise false.

Query

SELECT bool_or(bl) AS bool_or FROM agg_demo;

Result

 bool_or--------- t

`count()`

Returns the number of rows. Alias: count(*).

Query

SELECT count() AS count FROM agg_demo;

Result

 count-------     4

`count(arg)`

Returns the number of rows where arg is not NULL.

Query

SELECT count(a) AS count FROM agg_demo;

Result

 count-------     4

`countif(arg)`

Returns the number of rows where arg is true.

Query

SELECT countif(bl) AS countif FROM agg_demo;

Result

 countif---------       3

`favg(arg)`

Calculates the average using a more accurate floating point summation (Kahan Sum). This function is affected by ordering.

Query

SELECT favg(a) AS favg FROM agg_demo;

Result

 favg------   25

`first(arg)`

Returns the first value (null or non-null) from arg. This function is affected by ordering. Alias: arbitrary(A).

Query

SELECT first(a ORDER BY a) AS first FROM agg_demo;

Result

 first-------    10

`fsum(arg)`

Calculates the sum using a more accurate floating point summation (Kahan Sum). This function is affected by ordering. Aliases: sumkahan, kahan_sum.

Query

SELECT fsum(a) AS fsum FROM agg_demo;

Result

 fsum------  100

`geometric_mean(arg)`

Calculates the geometric mean of all non-null values in arg. This function is affected by ordering. Alias: geomean(A).

Query

SELECT geometric_mean(a) AS geometric_mean FROM agg_demo;

Result

 geometric_mean-------------------- 22.133638394006436

`histogram(arg)`

Returns a MAP of key-value pairs representing buckets and counts.

Query

SELECT histogram(a) AS histogram FROM agg_demo;

Result

 histogram--------------------------------------- {"(10,1)","(20,1)","(30,1)","(40,1)"}

`histogram(arg, boundaries)`

Returns a MAP of key-value pairs representing the provided upper boundaries and counts of elements in the corresponding bins (left-open and right-closed partitions) of the datatype. A boundary at the largest value of the datatype is automatically added when elements larger than all provided boundaries appear, see is_histogram_other_bin. Boundaries may be provided, e.g., via equi_width_bins.

Query

SELECT histogram(a, [0, 1, 10]) AS histogram FROM agg_demo;

Result

 histogram--------------------------------------------- {"(0,0)","(1,0)","(10,1)","(2147483647,3)"}

`histogram_exact(arg, elements)`

Returns a MAP of key-value pairs representing the requested elements and their counts. A catch-all element specific to the data-type is automatically added to count other elements when they appear, see is_histogram_other_bin.

Query

SELECT histogram_exact(s, ['a', 'b', 'c']) AS histogram_exact FROM agg_demo;

Result

 histogram_exact---------------------------------------- {"(a,1)","(b,1)","(c,1)","(\\"\\",1)"}

`histogram_values(source, col_name, technique, bin_count)`

Description	Returns the upper boundaries of the bins and their counts.
Example	`histogram_values(integers, i, bin_count := 2)`

Note: this function is not yet supported in SereneDB.

`last(arg)`

Returns the last value of a column. This function is affected by ordering.

Query

SELECT last(a ORDER BY a) AS last FROM agg_demo;

Result

 last------   40

`list(arg)`

Returns a LIST containing all the values of a column. This function is affected by ordering. Alias: array_agg.

Query

SELECT list(a ORDER BY a) AS list FROM agg_demo;

Result

 list--------------- {10,20,30,40}

`max(arg)`

Returns the maximum value present in arg. This function is unaffected by distinctness.

Query

SELECT max(a) AS max FROM agg_demo;

Result

 max-----  40

`max(arg, n)`

Returns a LIST containing the arg values for the "top" n rows ordered by arg descending.

Query

SELECT max(a, 2) AS max FROM agg_demo;

Result

 max--------- {40,30}

`min(arg)`

Returns the minimum value present in arg. This function is unaffected by distinctness.

Query

SELECT min(a) AS min FROM agg_demo;

Result

 min-----  10

`min(arg, n)`

Returns a LIST containing the arg values for the "bottom" n rows ordered by arg ascending.

Query

SELECT min(a, 2) AS min FROM agg_demo;

Result

 min--------- {10,20}

`product(arg)`

Calculates the product of all non-null values in arg. This function is affected by ordering.

Query

SELECT product(a) AS product FROM agg_demo;

Result

 product---------  240000

`string_agg(arg)`

Concatenates the column string values with a comma separator (,). This function is affected by ordering. Aliases: group_concat(arg), listagg(arg).

Query

SELECT string_agg(s ORDER BY s) AS string_agg FROM agg_demo;

Result

 string_agg------------ a,b,c,d

`string_agg(arg, sep)`

Concatenates the column string values with a separator. This function is affected by ordering. Aliases: group_concat(arg, sep), listagg(arg, sep).

Query

SELECT string_agg(s, '-' ORDER BY s) AS string_agg FROM agg_demo;

Result

 string_agg------------ a-b-c-d

`sum(arg)`

Calculates the sum of all non-null values in arg / counts true values when arg is boolean. The floating-point versions of this function are affected by ordering.

Query

SELECT sum(a) AS sum FROM agg_demo;

Result

 sum----- 100

`weighted_avg(arg, weight)`

Calculates the weighted average of all non-null values in arg, where each value is scaled by its corresponding weight. If weight is NULL, the value will be skipped. This function is affected by ordering. Alias: wavg(arg, weight).

Query

SELECT weighted_avg(a, w) AS weighted_avg FROM agg_demo;

Result

 weighted_avg--------------           30

Approximate Aggregates

The table below shows the available approximate aggregate functions.

Function	Description	Example
`approx_count_distinct(x)`	Calculates the approximate count of distinct elements using HyperLogLog.	`approx_count_distinct(A)`
`approx_quantile(x, pos)`	Calculates the approximate quantile using T-Digest.	`approx_quantile(A, 0.5)`
`approx_top_k(arg, k)`	Calculates a `LIST` of the `k` approximately most frequent values of `arg` using Filtered Space-Saving.
`reservoir_quantile(x, quantile, sample_size = 8192)`	Calculates the approximate quantile using reservoir sampling, the sample size is optional and uses 8192 as a default size.	`reservoir_quantile(A, 0.5, 1024)`

Statistical Aggregates

The table below shows the available statistical aggregate functions. They all ignore NULL values (in the case of a single input column x), or pairs where either input is NULL (in the case of two input columns y and x).

Function	Description
`corr(y, x)`	The correlation coefficient.
`covar_pop(y, x)`	The population covariance, which does not include bias correction.
`covar_samp(y, x)`	The sample covariance, which includes Bessel's bias correction.
`entropy(x)`	The log-2 entropy of count input-values.
`kurtosis_pop(x)`	The excess kurtosis (Fisher’s definition) without bias correction.
`kurtosis(x)`	The excess kurtosis (Fisher's definition) with bias correction according to the sample size.
`mad(x)`	The median absolute deviation. Temporal types return a positive `INTERVAL`.
`median(x)`	The middle value of the set. For even value counts, quantitative values are averaged and ordinal values return the lower value.
`mode(x)`	The most frequent value. This function is affected by ordering.
`quantile_cont(x, pos)`	The interpolated `pos`-quantile of `x` for `-1 <= pos <= 1`. Returns the `pos * (n_nonnull_values - 1)`th (zero-indexed, in the specified order) value of `x` or an interpolation between the adjacent values if the index is not an integer. Values of `pos` between `-1` and `0` correspond to counting backwards from `1`. More precisely, `quantile_cont(x, -y) = quantile_cont(x, 1 - y)`. Intuitively, arranges the values of `x` as equispaced points on a line, starting at 0 and ending at 1, and returns the (interpolated) value at `pos`. This is Type 7 in Hyndman & Fan (1996). If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding interpolated quantiles.
`quantile_disc(x, pos)`	The discrete `pos`-quantile of `x` for `0 <= pos <= 1`. Returns the `greatest(ceil(pos * n_nonnull_values) - 1, 0)`th (zero-indexed, in the specified order) value of `x`. Intuitively, assigns to each value of `x` an equisized sub-interval (left-open and right-closed except for the initial interval) of the interval `[0, 1]`, and picks the value of the sub-interval that contains `pos`. This is Type 1 in Hyndman & Fan (1996). If `pos` is a `LIST` of `FLOAT`s, then the result is a `LIST` of the corresponding discrete quantiles.
`regr_avgx(y, x)`	The average of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
`regr_avgy(y, x)`	The average of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
`regr_count(y, x)`	The number of non-`NULL` pairs.
`regr_intercept(y, x)`	The intercept of the univariate linear regression line, where x is the independent variable and y is the dependent variable.
`regr_r2(y, x)`	The squared Pearson correlation coefficient between y and x. Also: The coefficient of determination in a linear regression, where x is the independent variable and y is the dependent variable.
`regr_slope(y, x)`	The slope of the linear regression line, where x is the independent variable and y is the dependent variable.
`regr_sxx(y, x)`	The sample variance, which includes Bessel's bias correction, of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
`regr_sxy(y, x)`	The sample covariance, which includes Bessel's bias correction.
`regr_syy(y, x)`	The sample variance, which includes Bessel's bias correction, of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
`skewness(x)`	The skewness.
`sem(x)`	The standard error of the mean.
`stddev_pop(x)`	The population standard deviation.
`stddev_samp(x)`	The sample standard deviation.
`var_pop(x)`	The population variance, which does not include bias correction.
`var_samp(x)`	The sample variance, which includes Bessel's bias correction.

`corr(y, x)`

Description	The correlation coefficient.
Formula	`covar_pop(y, x) / (stddev_pop(x) * stddev_pop(y))`

`covar_pop(y, x)`

Description	The population covariance, which does not include bias correction.
Formula	`(sum(xy) - sum(x) sum(y) / regr_count(y, x)) / regr_count(y, x)`, `covar_samp(y, x) * (1 - 1 / regr_count(y, x))`

`covar_samp(y, x)`

Description	The sample covariance, which includes Bessel's bias correction.
Formula	`(sum(xy) - sum(x) sum(y) / regr_count(y, x)) / (regr_count(y, x) - 1)`, `covar_pop(y, x) / (1 - 1 / regr_count(y, x))`
Alias(es)	`regr_sxy(y, x)`

`entropy(x)`

Description	The log-2 entropy of count input-values.
Formula	-

`kurtosis_pop(x)`

Description	The excess kurtosis (Fisher’s definition) without bias correction.
Formula	-

`kurtosis(x)`

Description	The excess kurtosis (Fisher's definition) with bias correction according to the sample size.
Formula	-

`mad(x)`

Description	The median absolute deviation. Temporal types return a positive `INTERVAL`.
Formula	`median(abs(x - median(x)))`

`median(x)`

Description	The middle value of the set. For even value counts, quantitative values are averaged and ordinal values return the lower value.
Formula	`quantile_cont(x, 0.5)`

`mode(x)`

Description	The most frequent value. This function is affected by ordering.
Formula	-

`quantile_cont(x, pos)`

Description The interpolated pos-quantile of x for 0 <= pos <= 1. Returns the pos * (n_nonnull_values - 1)th (zero-indexed, in the specified order) value of x or an interpolation between the adjacent values if the index is not an integer. Intuitively, arranges the values of x as equispaced points on a line, starting at 0 and ending at 1, and returns the (interpolated) value at pos. This is Type 7 in Hyndman & Fan (1996). If pos is a LIST of FLOATs, then the result is a LIST of the corresponding interpolated quantiles.

Formula -

`quantile_disc(x, pos)`

Description The discrete pos-quantile of x for 0 <= pos <= 1. Returns the greatest(ceil(pos * n_nonnull_values) - 1, 0)th (zero-indexed, in the specified order) value of x. Intuitively, assigns to each value of x an equisized sub-interval (left-open and right-closed except for the initial interval) of the interval [0, 1], and picks the value of the sub-interval that contains pos. This is Type 1 in Hyndman & Fan (1996). If pos is a LIST of FLOATs, then the result is a LIST of the corresponding discrete quantiles.

Formula -

Alias(es) quantile

`regr_avgx(y, x)`

Description	The average of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
Formula	-

`regr_avgy(y, x)`

Description	The average of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
Formula	-

`regr_count(y, x)`

Description	The number of non-`NULL` pairs.
Formula	-

`regr_intercept(y, x)`

Description	The intercept of the univariate linear regression line, where x is the independent variable and y is the dependent variable.
Formula	`regr_avgy(y, x) - regr_slope(y, x) * regr_avgx(y, x)`

`regr_r2(y, x)`

Description	The squared Pearson correlation coefficient between y and x. Also: The coefficient of determination in a linear regression, where x is the independent variable and y is the dependent variable.
Formula	-

`regr_slope(y, x)`

Description	Returns the slope of the linear regression line, where x is the independent variable and y is the dependent variable.
Formula	`regr_sxy(y, x) / regr_sxx(y, x)`
Alias(es)	-

`regr_sxx(y, x)`

Description	The sample variance, which includes Bessel's bias correction, of the independent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
Formula	-

`regr_sxy(y, x)`

Description	The sample covariance, which includes Bessel's bias correction.
Formula	`(sum(xy) - sum(x) sum(y) / regr_count(y, x)) / (regr_count(y, x) - 1)`, `covar_pop(y, x) / (1 - 1 / regr_count(y, x))`
Alias(es)	`covar_samp(y, x)`

`regr_syy(y, x)`

Description	The sample variance, which includes Bessel's bias correction, of the dependent variable for non-`NULL` pairs, where x is the independent variable and y is the dependent variable.
Formula	-

`sem(x)`

Description	The standard error of the mean.
Formula	-

`skewness(x)`

Description	The skewness.
Formula	-

`stddev_pop(x)`

Description	The population standard deviation.
Formula	`sqrt(var_pop(x))`

`stddev_samp(x)`

Description	The sample standard deviation.
Formula	`sqrt(var_samp(x))`
Alias(es)	`stddev(x)`

`var_pop(x)`

Description	The population variance, which does not include bias correction.
Formula	`(sum(x^2) - sum(x)^2 / count(x)) / count(x)`, `var_samp(y, x) * (1 - 1 / count(x))`

`var_samp(x)`

Description	The sample variance, which includes Bessel's bias correction.
Formula	`(sum(x^2) - sum(x)^2 / count(x)) / (count(x) - 1)`, `var_pop(y, x) / (1 - 1 / count(x))`
Alias(es)	`variance(arg, val)`

Ordered Set Aggregate Functions

The table below shows the available “ordered set” aggregate functions. These functions are specified using the WITHIN GROUP (ORDER BY sort_expression) syntax, and they are converted to an equivalent aggregate function that takes the ordering expression as the first argument.

Function	Equivalent
`mode() WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`mode(column ORDER BY column [(ASC\|DESC)])`
`percentile_cont(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_cont(column, fraction ORDER BY column [(ASC\|DESC)])`
`percentile_cont(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_cont(column, fractions ORDER BY column [(ASC\|DESC)])`
`percentile_disc(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_disc(column, fraction ORDER BY column [(ASC\|DESC)])`
`percentile_disc(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_disc(column, fractions ORDER BY column [(ASC\|DESC)])`

Miscellaneous Aggregate Functions

Function	Description	Alias
`grouping()`	For queries with `GROUP BY` and either `ROLLUP` or `GROUPING SETS`: Returns an integer identifying which of the argument expressions were used to group on to create the current super-aggregate row.	`grouping_id()`

Function	Equivalent
`mode() WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`mode(column ORDER BY column [(ASC\|DESC)])`
`percentile_cont(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_cont(column, fraction ORDER BY column [(ASC\|DESC)])`
`percentile_cont(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_cont(column, fractions ORDER BY column [(ASC\|DESC)])`
`percentile_disc(fraction) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_disc(column, fraction ORDER BY column [(ASC\|DESC)])`
`percentile_disc(fractions) WITHIN GROUP (ORDER BY column [(ASC\|DESC)])`	`quantile_disc(column, fractions ORDER BY column [(ASC\|DESC)])`

Examples​

Syntax​

DISTINCT Clause in Aggregate Functions​

ORDER BY Clause in Aggregate Functions​

Handling NULL Values​

General Aggregate Functions​

any_value(arg)​

arg_max(arg, val)​

arg_max(arg, val, n)​

arg_max_null(arg, val)​

arg_min(arg, val)​

arg_min(arg, val, n)​

arg_min_null(arg, val)​

avg(arg)​

bit_and(arg)​

bit_or(arg)​

bit_xor(arg)​

bitstring_agg(arg)​

bool_and(arg)​

bool_or(arg)​

count()​

count(arg)​

countif(arg)​

favg(arg)​

first(arg)​

fsum(arg)​

geometric_mean(arg)​

histogram(arg)​

histogram(arg, boundaries)​

histogram_exact(arg, elements)​

histogram_values(source, col_name, technique, bin_count)​

last(arg)​

list(arg)​

max(arg)​

max(arg, n)​

min(arg)​

min(arg, n)​

product(arg)​

string_agg(arg)​

string_agg(arg, sep)​

sum(arg)​

weighted_avg(arg, weight)​

Approximate Aggregates​

Statistical Aggregates​

corr(y, x)​

covar_pop(y, x)​

covar_samp(y, x)​

entropy(x)​

kurtosis_pop(x)​

kurtosis(x)​

mad(x)​

median(x)​

mode(x)​

quantile_cont(x, pos)​

quantile_disc(x, pos)​

regr_avgx(y, x)​

regr_avgy(y, x)​

regr_count(y, x)​

regr_intercept(y, x)​

regr_r2(y, x)​

regr_slope(y, x)​

regr_sxx(y, x)​

regr_sxy(y, x)​

regr_syy(y, x)​

sem(x)​

skewness(x)​

stddev_pop(x)​

stddev_samp(x)​

var_pop(x)​

var_samp(x)​

Ordered Set Aggregate Functions​

Miscellaneous Aggregate Functions​

Examples

Syntax

`DISTINCT` Clause in Aggregate Functions

`ORDER BY` Clause in Aggregate Functions

Handling `NULL` Values

General Aggregate Functions

`any_value(arg)`

`arg_max(arg, val)`

`arg_max(arg, val, n)`

`arg_max_null(arg, val)`

`arg_min(arg, val)`

`arg_min(arg, val, n)`

`arg_min_null(arg, val)`

`avg(arg)`

`bit_and(arg)`

`bit_or(arg)`

`bit_xor(arg)`

`bitstring_agg(arg)`

`bool_and(arg)`

`bool_or(arg)`

`count()`

`count(arg)`

`countif(arg)`

`favg(arg)`

`first(arg)`

`fsum(arg)`

`geometric_mean(arg)`

`histogram(arg)`

`histogram(arg, boundaries)`

`histogram_exact(arg, elements)`

`histogram_values(source, col_name, technique, bin_count)`

`last(arg)`

`list(arg)`

`max(arg)`

`max(arg, n)`

`min(arg)`

`min(arg, n)`

`product(arg)`

`string_agg(arg)`

`string_agg(arg, sep)`

`sum(arg)`

`weighted_avg(arg, weight)`

Approximate Aggregates

Statistical Aggregates

`corr(y, x)`

`covar_pop(y, x)`

`covar_samp(y, x)`

`entropy(x)`

`kurtosis_pop(x)`

`kurtosis(x)`

`mad(x)`

`median(x)`

`mode(x)`

`quantile_cont(x, pos)`

`quantile_disc(x, pos)`

`regr_avgx(y, x)`

`regr_avgy(y, x)`

`regr_count(y, x)`

`regr_intercept(y, x)`

`regr_r2(y, x)`

`regr_slope(y, x)`

`regr_sxx(y, x)`

`regr_sxy(y, x)`

`regr_syy(y, x)`

`sem(x)`

`skewness(x)`

`stddev_pop(x)`

`stddev_samp(x)`

`var_pop(x)`

`var_samp(x)`

Ordered Set Aggregate Functions

Miscellaneous Aggregate Functions