SQL Extensions
On top of standard PostgreSQL syntax, SereneDB offers several SQL extensions and syntactic sugar that make queries more concise and readable.
Clauses
- Creating tables and inserting data:
  - CREATE OR REPLACE TABLE: avoid DROP TABLE IF EXISTS statements in scripts.
  - CREATE TABLE ... AS SELECT (CTAS): create a new table from the output of a query without manually defining a schema.
  - INSERT INTO ... BY NAME: this variant of the INSERT statement matches columns by name instead of by position.
  - INSERT OR IGNORE INTO ...: insert only the rows that do not conflict with a UNIQUE or PRIMARY KEY constraint.
  - INSERT OR REPLACE INTO ...: insert the rows that do not conflict with a UNIQUE or PRIMARY KEY constraint. For rows that do conflict, replace the columns of the existing row with the values of the to-be-inserted row.
- Describing tables and computing statistics:
- Making SQL clauses more compact and readable:
  - FROM-first syntax with an optional SELECT clause: SereneDB allows queries in the form of FROM tbl, which selects all columns (equivalent to a SELECT * statement).
  - GROUP BY ALL: omit the group-by columns by inferring them from the list of attributes in the SELECT clause.
  - ORDER BY ALL: shorthand to order on all columns (e.g., to ensure deterministic results).
  - SELECT * EXCLUDE: the EXCLUDE option allows excluding specific columns from the * expression.
  - SELECT * REPLACE: the REPLACE option allows replacing specific columns with different expressions in a * expression.
  - UNION BY NAME: perform the UNION operation along the names of columns (instead of relying on positions).
  - Prefix aliases in the SELECT and FROM clauses: write x: 42 instead of 42 AS x for improved readability.
  - Specifying a percentage of the table size for the LIMIT clause: write LIMIT 10% to return 10% of the query results.
- Transforming tables:
- Defining SQL-level variables:
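The clauses listed above can be combined in a short script. The following is a minimal sketch, not taken from the SereneDB documentation: the tables cities and big_cities and their columns are hypothetical, and the parenthesized forms of EXCLUDE and REPLACE are an assumption based on how these options are commonly written.

```sql
-- Hypothetical table; re-running the script needs no DROP TABLE IF EXISTS.
CREATE OR REPLACE TABLE cities (
    id INTEGER PRIMARY KEY,
    name VARCHAR,
    population INTEGER
);

-- BY NAME: the SELECT output is matched to table columns by name, not position.
INSERT INTO cities BY NAME
    SELECT 'Amsterdam' AS name, 1 AS id, 900000 AS population;

-- id = 1 already exists, so this row is skipped instead of raising an error.
INSERT OR IGNORE INTO cities VALUES (1, 'Berlin', 3600000);

-- id = 1 already exists, so the existing row is overwritten with the new values.
INSERT OR REPLACE INTO cities VALUES (1, 'Berlin', 3600000);

-- CTAS: the schema of big_cities is derived from the query output.
CREATE TABLE big_cities AS
    SELECT * FROM cities WHERE population > 1000000;

-- FROM-first query without a SELECT clause (selects all columns).
FROM cities;

-- Prefix alias: equivalent to SELECT 42 AS x.
SELECT x: 42;

-- GROUP BY ALL groups by name; ORDER BY ALL orders on every output column.
SELECT name, sum(population) AS total
FROM cities
GROUP BY ALL
ORDER BY ALL;

-- All columns except population (parenthesized form assumed).
SELECT * EXCLUDE (population) FROM cities;

-- All columns, with name replaced by its upper-case form (form assumed).
SELECT * REPLACE (upper(name) AS name) FROM cities;

-- Columns are aligned by name even though their positions differ.
SELECT 1 AS id, 'x' AS name
UNION BY NAME
SELECT 'y' AS name, 2 AS id;

-- Return 10% of the query results.
SELECT * FROM cities LIMIT 10%;
```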
Query Features
- Column aliases in WHERE, GROUP BY, and HAVING. (Note that column aliases cannot be used in the ON clause of JOIN clauses.)
- COLUMNS() expression can be used to execute the same expression on multiple columns:
  - with regular expressions
  - with EXCLUDE and REPLACE
  - with lambda functions
- Reusable column aliases (also known as “lateral column aliases”), e.g.:
SELECT i + 1 AS j, j + 2 AS k FROM range(0, 3) t(i)
- Advanced aggregation features for analytical (OLAP) queries:
- count() shorthand for count(*)
- IN operator for lists and maps
- Specifying column names for common table expressions (WITH)
- Specifying column names in the JOIN clause
- Using VALUES in the JOIN clause
- Using VALUES in the anchor part of common table expressions
- SWITCH statements as syntactic sugar for the CASE expression
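A few of the query features above are easier to judge in context. The sketch below uses a hypothetical orders table (its name and columns are assumptions for illustration). Passing a regular expression to COLUMNS() as a string argument and writing list literals with square brackets are also assumptions about the exact syntax.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE orders (
    customer_id INTEGER,
    amount INTEGER,
    net_price INTEGER,
    gross_price INTEGER
);
INSERT INTO orders VALUES (1, 4, 10, 12), (1, 8, 20, 24), (2, 8, 30, 36);

-- The alias "doubled" is reused in WHERE and GROUP BY;
-- count() is shorthand for count(*).
SELECT amount * 2 AS doubled, count() AS n
FROM orders
WHERE doubled > 10
GROUP BY doubled;

-- Reusable (lateral) column alias: k refers to j from the same SELECT list.
SELECT i + 1 AS j, j + 2 AS k FROM range(0, 3) t(i);

-- COLUMNS() with a regular expression: apply max() to every column
-- whose name ends in _price (string-argument form assumed).
SELECT max(COLUMNS('.*_price')) FROM orders;

-- IN operator applied to a list (square-bracket literal assumed).
SELECT customer_id FROM orders WHERE customer_id IN [1, 3];

-- Column names specified for a common table expression.
WITH totals(customer, total) AS (
    SELECT customer_id, sum(amount) FROM orders GROUP BY customer_id
)
SELECT customer, total FROM totals;
```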
Literals and Identifiers
- Case-insensitivity while maintaining case of entities in the catalog
- Underscores as digit separators in numeric literals
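Both points can be illustrated with a short sketch. The table Measurements and its column SensorId are hypothetical names chosen for this example.

```sql
-- Underscores only group digits visually; both literals denote the same number.
SELECT 1_000_000 AS one_million, 1_000_000 = 1000000 AS same_value;

-- The catalog keeps the spelling used at creation time ...
CREATE TABLE Measurements (SensorId INTEGER);

-- ... while lookups ignore case, so this query resolves to the same table and column.
SELECT sensorid FROM measurements;
```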
Data Types
Data Import
- Auto-detecting the headers and schema of CSV files
- Directly querying CSV files and Parquet files
- Filename expansion (globbing), e.g.:
FROM 'my-data/part-*.parquet'
Functions and Expressions
- Dot operator for function chaining:
SELECT ('hello').upper()
- String formatters: the format() function with the fmt syntax and the printf() function
- List comprehensions
- List slicing and indexing from the back ([-1])
- String slicing
- STRUCT.* notation
- Creating LIST using square brackets
- Simple LIST and STRUCT creation
- Updating the schema of STRUCTs
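The items above might look as follows in practice. This is a sketch under stated assumptions: the Python-style list comprehension, the [start:end] slice notation, and the curly-brace STRUCT literal are assumed syntax that the list above does not spell out, and the literal values are made up.

```sql
-- Dot operator: equivalent to upper('hello').
SELECT ('hello').upper();

-- format() with fmt-style {} placeholders, and printf() with % specifiers.
SELECT format('{} has {} rows', 'tbl', 3);
SELECT printf('%s has %d rows', 'tbl', 3);

-- LIST created with square brackets; trailing index -1 refers to the last element.
SELECT ([10, 20, 30])[-1];

-- List comprehension (Python-style syntax assumed).
SELECT [x * 2 FOR x IN [1, 2, 3]];

-- List and string slicing (slice notation assumed).
SELECT ([10, 20, 30])[1:2];
SELECT ('SereneDB')[1:6];

-- Simple STRUCT creation (curly-brace literal assumed), expanded with STRUCT.* notation.
SELECT s.* FROM (SELECT {'a': 1, 'b': 'x'} AS s);
```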
Join Types
Trailing Commas
SereneDB allows trailing commas, both when listing entities (e.g., column and table names) and when constructing LIST items.
For example, the following query works:
SELECT 42 AS x, ['a', 'b', 'c',] AS y, 'hello world' AS z,;

 x  |    y    |      z
----+---------+-------------
 42 | {a,b,c} | hello world

"Top-N in Group" Queries
Computing the "top-N rows in a group" ordered by some criteria is a common task in SQL that unfortunately often requires a complex query involving window functions and/or subqueries.
To aid in this, SereneDB provides the aggregate functions max(arg, n), min(arg, n), arg_max(arg, val, n), arg_min(arg, val, n), max_by(arg, val, n) and min_by(arg, val, n) to efficiently return the "top" n rows in a group based on a specific column in either ascending or descending order.
For example, let's use the following table:
SELECT * FROM t1;

 grp | val
-----+-----
 a   |   2
 a   |   1
 b   |   5
 b   |   4
 a   |   3
 b   |   6

We want to get a list of the top-3 val values in each group grp. The conventional way to do this is to use a window function in a subquery:
SELECT array_agg(rs.val), rs.grp
FROM (
    SELECT val, grp, row_number() OVER (PARTITION BY grp ORDER BY val DESC) AS rid
    FROM t1
    ORDER BY val DESC
) AS rs
WHERE rid < 4
GROUP BY rs.grp;

 array_agg | grp
-----------+-----
 {3,2,1}   | a
 {6,5,4}   | b

But in SereneDB, we can do this much more concisely (and efficiently!):
SELECT max(val, 3) FROM t1 GROUP BY grp;

   max
---------
 {3,2,1}
 {6,5,4}
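The other functions in this family follow the same pattern. As a sketch, the first query below reuses t1 and also selects grp so that each list can be attributed to its group. The second query uses a hypothetical sales table (its name and columns are assumptions) to show arg_max(arg, val, n), which returns the arg values of the n rows with the largest val.

```sql
-- Top-3 and bottom-3 values per group, labelled with the group key.
SELECT grp, max(val, 3) AS top3, min(val, 3) AS bottom3
FROM t1
GROUP BY grp;

-- Hypothetical table used only for illustration.
CREATE TABLE sales (region VARCHAR, product VARCHAR, revenue INTEGER);
INSERT INTO sales VALUES
    ('north', 'apples', 30),
    ('north', 'pears', 10),
    ('north', 'plums', 20),
    ('south', 'apples', 5),
    ('south', 'pears', 15);

-- For each region, the two products with the highest revenue.
SELECT region, arg_max(product, revenue, 2) AS top_products
FROM sales
GROUP BY region;
```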