Edit this page

PIVOT

The PIVOT statement allows distinct values within a column to be separated into their own columns. The values within those new columns are calculated using an aggregate function on the subset of rows that match each distinct value.

SereneDB implements both the SQL Standard PIVOT syntax and a simplified PIVOT syntax that automatically detects the columns to create while pivoting. PIVOT_WIDER may also be used in place of the PIVOT keyword.

Simplified `PIVOT` Syntax

The full syntax diagram is below, but the simplified PIVOT syntax can be summarized using spreadsheet pivot table naming conventions as:

Query

PIVOT ⟨dataset⟩ON ⟨columns⟩USING ⟨values⟩GROUP BY ⟨rows⟩ORDER BY ⟨columns_with_order_directions⟩LIMIT ⟨number_of_rows⟩;

The ON, USING, and GROUP BY clauses are each optional, but they may not all be omitted.

Example Data

All examples use the dataset produced by the queries below:

Query

CREATE TABLE cities (    country VARCHAR, name VARCHAR, year INTEGER, population INTEGER);
INSERT INTO cities VALUES    ('NL', 'Amsterdam', 2000, 1005),    ('NL', 'Amsterdam', 2010, 1065),    ('NL', 'Amsterdam', 2020, 1158),    ('US', 'Seattle', 2000, 564),    ('US', 'Seattle', 2010, 608),    ('US', 'Seattle', 2020, 738),    ('US', 'New York City', 2000, 8015),    ('US', 'New York City', 2010, 8175),    ('US', 'New York City', 2020, 8772);

Query

SELECT *FROM cities;

Result

 country | name          | year | population---------+---------------+------+------------ NL      | Amsterdam     | 2000 |       1005 NL      | Amsterdam     | 2010 |       1065 NL      | Amsterdam     | 2020 |       1158 US      | Seattle       | 2000 |        564 US      | Seattle       | 2010 |        608 US      | Seattle       | 2020 |        738 US      | New York City | 2000 |       8015 US      | New York City | 2010 |       8175 US      | New York City | 2020 |       8772

`PIVOT ON` and `USING`

Use the PIVOT statement below to create a separate column for each year and calculate the total population in each. The ON clause specifies which column(s) to split into separate columns. It is equivalent to the columns parameter in a spreadsheet pivot table.

The USING clause determines how to aggregate the values that are split into separate columns. This is equivalent to the values parameter in a spreadsheet pivot table. If the USING clause is not included, it defaults to count(*).

Query

PIVOT citiesON yearUSING sum(population);

Result

 country | name          | 2000 | 2010 | 2020---------+---------------+------+------+------ NL      | Amsterdam     | 1005 | 1065 | 1158 US      | New York City | 8015 | 8175 | 8772 US      | Seattle       |  564 |  608 |  738

In the above example, the sum aggregate is always operating on a single value. If we only want to change the orientation of how the data is displayed without aggregating, use the first aggregate function. In this example, we are pivoting numeric values, but the first function works very well for pivoting out a text column. (This is something that is difficult to do in a spreadsheet pivot table, but easy in SereneDB!)

This query produces a result that is identical to the one above:

Query

PIVOT citiesON yearUSING first(population);

Result

 country | name          | 2000 | 2010 | 2020---------+---------------+------+------+------ NL      | Amsterdam     | 1005 | 1065 | 1158 US      | New York City | 8015 | 8175 | 8772 US      | Seattle       |  564 |  608 |  738

Expressions in the `USING` clause

The USING clause accepts expressions over aggregate functions, not only a bare aggregate. For example, to report the population in people rather than thousands, multiply the aggregate by 1000:

Query

PIVOT citiesON yearUSING sum(population) * 1000;

Result

 country | name          | 2000    | 2010    | 2020---------+---------------+---------+---------+--------- NL      | Amsterdam     | 1005000 | 1065000 | 1158000 US      | New York City | 8015000 | 8175000 | 8772000 US      | Seattle       |  564000 |  608000 |  738000

`PIVOT ON`, `USING`, and `GROUP BY`

By default, the PIVOT statement retains all columns not specified in the ON or USING clauses. To include only certain columns and further aggregate, specify columns in the GROUP BY clause. This is equivalent to the rows parameter of a spreadsheet pivot table.

In the below example, the name column is no longer included in the output, and the data is aggregated up to the country level.

Query

PIVOT citiesON yearUSING sum(population)GROUP BY country;

Result

 country | 2000 | 2010 | 2020---------+------+------+------ NL      | 1005 | 1065 | 1158 US      | 8579 | 8783 | 9510

`IN` Filter for `ON` Clause

To only create a separate column for specific values within a column in the ON clause, use an optional IN expression. Let's say for example that we wanted to forget about the year 2020 for no particular reason...

Query

PIVOT citiesON year IN (2000, 2010)USING sum(population)GROUP BY country;

Result

 country | 2000 | 2010---------+------+------ NL      | 1005 | 1065 US      | 8579 | 8783

Multiple Expressions per Clause

Multiple columns can be specified in the ON and GROUP BY clauses, and multiple aggregate expressions can be included in the USING clause.

Multiple `ON` Columns and `ON` Expressions

Multiple columns can be pivoted out into their own columns. SereneDB will find the distinct values in each ON clause column and create one new column for all combinations of those values (a Cartesian product).

In the below example, all combinations of unique countries and unique cities receive their own column. Some combinations may not be present in the underlying data, so those columns are populated with NULL values.

Query

PIVOT citiesON country, nameUSING sum(population);

Result

 year | nl_amsterdam | nl_new york city | nl_seattle | us_amsterdam | us_new york city | us_seattle------+--------------+------------------+------------+--------------+------------------+------------ 2000 |         1005 | NULL             | NULL       | NULL         |             8015 |        564 2010 |         1065 | NULL             | NULL       | NULL         |             8175 |        608 2020 |         1158 | NULL             | NULL       | NULL         |             8772 |        738

To pivot only the combinations of values that are present in the underlying data, use an expression in the ON clause. Multiple expressions and/or columns may be provided.

Here, country and name are concatenated together and the resulting concatenations each receive their own column. Any arbitrary non-aggregating expression may be used. In this case, concatenating with an underscore is used to imitate the naming convention the PIVOT clause uses when multiple ON columns are provided (like in the prior example).

Query

PIVOT citiesON country || '_' || nameUSING sum(population);

Result

 year | nl_amsterdam | us_new york city | us_seattle------+--------------+------------------+------------ 2000 |         1005 |             8015 |        564 2010 |         1065 |             8175 |        608 2020 |         1158 |             8772 |        738

Multiple `USING` Expressions

An alias may also be included for each expression in the USING clause. It will be appended to the generated column names after an underscore (_). This makes the column naming convention much cleaner when multiple expressions are included in the USING clause.

In this example, both the sum and max of the population column are calculated for each year and are split into separate columns.

Query

PIVOT citiesON yearUSING sum(population) AS total, max(population) AS maxGROUP BY country;

Result

 country | 2000_total | 2000_max | 2010_total | 2010_max | 2020_total | 2020_max---------+------------+----------+------------+----------+------------+---------- NL      |       1005 |     1005 |       1065 |     1065 |       1158 |     1158 US      |       8579 |     8015 |       8783 |     8175 |       9510 |     8772

Multiple `GROUP BY` Columns

Multiple GROUP BY columns may also be provided. Note that column names must be used rather than column positions (1, 2, etc.), and that expressions are not supported in the GROUP BY clause.

Query

PIVOT citiesON yearUSING sum(population)GROUP BY country, name;

Result

 country | name          | 2000 | 2010 | 2020---------+---------------+------+------+------ NL      | Amsterdam     | 1005 | 1065 | 1158 US      | New York City | 8015 | 8175 | 8772 US      | Seattle       |  564 |  608 |  738

Using `PIVOT` within a `SELECT` Statement

The PIVOT statement may be included within a SELECT statement as a CTE (a Common Table Expression, or WITH clause), or a subquery. This allows for a PIVOT to be used alongside other SQL logic, as well as for multiple PIVOTs to be used in one query.

No SELECT is needed within the CTE, the PIVOT keyword can be thought of as taking its place.

Query

WITH pivot_alias AS (    PIVOT cities    ON year    USING sum(population)    GROUP BY country)SELECT * FROM pivot_alias;

Result

 country | 2000 | 2010 | 2020---------+------+------+------ NL      | 1005 | 1065 | 1158 US      | 8579 | 8783 | 9510

A PIVOT may be used in a subquery and must be wrapped in parentheses. Note that this behavior is different than the SQL Standard Pivot, as illustrated in subsequent examples.

Query

SELECT *FROM (    PIVOT cities    ON year    USING sum(population)    GROUP BY country) pivot_alias;

Result

 country | 2000 | 2010 | 2020---------+------+------+------ NL      | 1005 | 1065 | 1158 US      | 8579 | 8783 | 9510

Multiple `PIVOT` Statements

Each PIVOT can be treated as if it were a SELECT node, so they can be joined together or manipulated in other ways.

For example, if two PIVOT statements share the same GROUP BY expression, they can be joined together using the columns in the GROUP BY clause into a wider pivot.

Query

SELECT *FROM (PIVOT cities ON year USING sum(population) GROUP BY country) year_pivotJOIN (PIVOT cities ON name USING sum(population) GROUP BY country) name_pivotUSING (country);

Result

 country | 2000 | 2010 | 2020 | amsterdam | new york city | seattle---------+------+------+------+-----------+---------------+--------- NL      | 1005 | 1065 | 1158 |      3228 |          NULL |    NULL US      | 8579 | 8783 | 9510 |      NULL |         24962 |    1910

Simplified `PIVOT` Full Syntax Diagram

Below is the full syntax diagram of the PIVOT statement.

SQL Standard `PIVOT` Syntax

The full syntax diagram is below, but the SQL Standard PIVOT syntax can be summarized as:

Query

SELECT *FROM ⟨dataset⟩PIVOT (    ⟨values⟩    FOR        ⟨column_1⟩ IN (⟨in_list⟩)        ⟨column_2⟩ IN (⟨in_list⟩)        ...    GROUP BY ⟨rows⟩);

Unlike the simplified syntax, the IN clause must be specified for each column to be pivoted. If you are interested in dynamic pivoting, the simplified syntax is recommended.

Note that no commas separate the expressions in the FOR clause, but that value and GROUP BY expressions must be comma-separated!

Examples

This example uses a single value expression, a single column expression, and a single row expression:

Query

SELECT *FROM citiesPIVOT (    sum(population)    FOR        year IN (2000, 2010, 2020)    GROUP BY country);

Result

 country | 2000 | 2010 | 2020---------+------+------+------ NL      | 1005 | 1065 | 1158 US      | 8579 | 8783 | 9510

This example is somewhat contrived, but serves as an example of using multiple value expressions and multiple columns in the FOR clause.

Query

SELECT *FROM citiesPIVOT (    sum(population) AS total,    count(population) AS count    FOR        year IN (2000, 2010)        country IN ('NL', 'US'));

Result

 name          | 2000_nl_total | 2000_nl_count | 2000_us_total | 2000_us_count | 2010_nl_total | 2010_nl_count | 2010_us_total | 2010_us_count---------------+---------------+---------------+---------------+---------------+---------------+---------------+---------------+--------------- Amsterdam     |          1005 |             1 |          NULL |             0 |          1065 |             1 |          NULL |             0 New York City |          NULL |             0 |          8015 |             1 |          NULL |             0 |          8175 |             1 Seattle       |          NULL |             0 |           564 |             1 |          NULL |             0 |           608 |             1

SQL Standard `PIVOT` Full Syntax Diagram

Below is the full syntax diagram of the SQL Standard version of the PIVOT statement.

Simplified PIVOT Syntax​

Example Data​

PIVOT ON and USING​

Expressions in the USING clause​

PIVOT ON, USING, and GROUP BY​

IN Filter for ON Clause​

Multiple Expressions per Clause​

Multiple ON Columns and ON Expressions​

Multiple USING Expressions​

Multiple GROUP BY Columns​

Using PIVOT within a SELECT Statement​

Multiple PIVOT Statements​

Simplified PIVOT Full Syntax Diagram​

SQL Standard PIVOT Syntax​

Examples​

SQL Standard PIVOT Full Syntax Diagram​

Simplified `PIVOT` Syntax

Example Data

`PIVOT ON` and `USING`

Expressions in the `USING` clause

`PIVOT ON`, `USING`, and `GROUP BY`

`IN` Filter for `ON` Clause

Multiple Expressions per Clause

Multiple `ON` Columns and `ON` Expressions

Multiple `USING` Expressions

Multiple `GROUP BY` Columns

Using `PIVOT` within a `SELECT` Statement

Multiple `PIVOT` Statements

Simplified `PIVOT` Full Syntax Diagram

SQL Standard `PIVOT` Syntax

Examples

SQL Standard `PIVOT` Full Syntax Diagram