Skip to main content

Text Functions

Text Functions and Operators

This section describes functions and operators for examining and manipulating STRING values.

FunctionDescription
string[index]Extracts a single character using a (1-based) index.
string[begin:end]Extracts a string using slice conventions similar to Python. Missing begin or end arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted.
string LIKE targetReturns true if the string matches the like specifier (see Pattern Matching).
string SIMILAR TO regexReturns true if the string matches the regex (see Pattern Matching).
string ^@ search_stringAlias for starts_with.
arg1 || arg2Concatenates two strings, lists, or blobs. Any NULL input results in NULL. See also concat(arg1, arg2, ...) and list_concat(list1, list2, ...).
array_extract(string, index)Extracts a single character from a string using a (1-based) index.
array_slice(list, begin, end)Extracts a sublist or substring using slice conventions. Negative values are accepted.
ascii(string)Returns an integer that represents the Unicode code point of the first character of the string.
bar(x, min, max[, width])Draws a band whose width is proportional to (x - min) and equal to width characters when x = max. width defaults to 80.
base64(blob)Alias for to_base64.
bin(string)Converts the string to binary representation.
bit_length(string)Number of bits in a string.
char_length(string)Alias for length.
character_length(string)Alias for length.
chr(code_point)Returns a character which is corresponding the ASCII code value or Unicode code point.
concat(value, ...)Concatenates multiple strings or lists. NULL inputs are skipped. See also operator ||.
concat_ws(separator, string, ...)Concatenates many strings, separated by separator. NULL inputs are skipped.
contains(string, search_string)Returns true if search_string is found within string. Note that collations are not supported.
ends_with(string, search_string)Alias for suffix.
format(format, ...)Formats a string using PostgreSQL format specifiers (%s, %I, %L).
formatReadableDecimalSize(integer)Converts integer to a human-readable representation using units based on powers of 10 (KB, MB, GB, etc.).
format_bytes(integer)Converts integer to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.).
from_base64(string)Converts a base64 encoded string to a character string (BLOB).
from_binary(value)Alias for unbin.
from_hex(value)Alias for unhex.
greatest(arg1, ...)Returns the largest value in lexicographical order. Note that lowercase characters are considered larger than uppercase characters and collations are not supported.
hash(value, ...)Returns a UBIGINT with the hash of the value. Note that this is not a cryptographic hash.
hex(string)Converts the string to hexadecimal representation.
ilike_escape(string, like_specifier, escape_character)Returns true if the string matches the like_specifier (see Pattern Matching) using case-insensitive matching. escape_character is used to search for wildcard characters in the string.
instr(string, search_string)Returns location of first occurrence of search_string in string, counting from 1. Returns 0 if no match found.
lcase(string)Alias for lower.
least(arg1, ...)Returns the smallest value in lexicographical order. Note that uppercase characters are considered smaller than lowercase characters and collations are not supported.
left(string, count)Extracts the left-most count characters.
left_grapheme(string, count)Extracts the left-most count grapheme clusters.
len(string)Alias for length.
length(string)Number of characters in string.
length_grapheme(string)Number of grapheme clusters in string.
like_escape(string, like_specifier, escape_character)Returns true if the string matches the like_specifier (see Pattern Matching) using case-sensitive matching. escape_character is used to search for wildcard characters in the string.
lower(string)Converts string to lower case.
lpad(string, count, character)Pads the string with the character on the left until it has count characters. Truncates the string on the right if it has more than count characters.
ltrim(string[, characters])Removes any occurrences of any of the characters from the left side of the string. characters defaults to space.
md5(string)Returns the MD5 hash of the string as a VARCHAR.
md5_number(string)Returns the MD5 hash of the string as a HUGEINT.
md5_number_lower(string)Returns the lower 64-bit segment of the MD5 hash of the string as a UBIGINT.
md5_number_upper(string)Returns the upper 64-bit segment of the MD5 hash of the string as a UBIGINT.
nfc_normalize(string)Converts string to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not.
not_ilike_escape(string, like_specifier, escape_character)Returns false if the string matches the like_specifier (see Pattern Matching) using case-insensitive matching. escape_character is used to search for wildcard characters in the string.
not_like_escape(string, like_specifier, escape_character)Returns false if the string matches the like_specifier (see Pattern Matching) using case-sensitive matching. escape_character is used to search for wildcard characters in the string.
ord(string)Alias for unicode.
parse_dirname(path[, separator])Returns the top-level directory name from the given path. separator options: system, both_slash (default), forward_slash, backslash.
parse_dirpath(path[, separator])Returns the head of the path (the pathname until the last slash) similarly to Python's os.path.dirname. separator options: system, both_slash (default), forward_slash, backslash.
parse_filename(string[, trim_extension][, separator])Returns the last component of the path similarly to Python's os.path.basename function. If trim_extension is true, the file extension will be removed (defaults to false). separator options: system, both_slash (default), forward_slash, backslash.
parse_path(path[, separator])Returns a list of the components (directories and filename) in the path similarly to Python's pathlib.parts function. separator options: system, both_slash (default), forward_slash, backslash.
position(search_string IN string)Return location of first occurrence of search_string in string, counting from 1. Returns 0 if no match found.
position(string, search_string)Alias for instr.
prefix(string, search_string)Returns true if string starts with search_string.
printf(format, ...)Formats a string using printf syntax.
read_text(source)Returns the content from source (a filename, a list of filenames, or a glob pattern) as a VARCHAR. The file content is first validated to be valid UTF-8. If read_text attempts to read a file with invalid UTF-8 an error is thrown suggesting to use read_blob instead. See the read_text guide for more details.
regexp_escape(string)Escapes special patterns to turn string into a regular expression similarly to Python's re.escape function.
regexp_extract(string, regex[, group][, options])If string contains the regex pattern, returns the capturing group specified by optional parameter group; otherwise, returns the empty string. The group must be a constant value. If no group is given, it defaults to 0. A set of optional regex options can be set.
regexp_extract(string, regex, name_list[, options])If string contains the regex pattern, returns the capturing groups as a struct with corresponding names from name_list; otherwise, returns a struct with the same keys and empty strings as values. A set of optional regex options can be set.
regexp_extract_all(string, regex[, group][, options])Finds non-overlapping occurrences of the regex in the string and returns the corresponding values of the capturing group. A set of optional regex options can be set.
regexp_extract_all(string, regex, name_list[, options])Finds non-overlapping occurrences of regex in string and returns the capturing groups as a list of structs with corresponding names from name_list. A set of optional regex options can be set.
regexp_full_match(string, regex[, col2])Returns true if the entire string matches the regex. A set of optional regex options can be set.
regexp_matches(string, regex[, options])Returns true if string contains the regex, false otherwise. A set of optional regex options can be set.
regexp_replace(string, regex, replacement[, options])If string contains the regex, replaces the matching part with replacement. A set of optional regex options can be set.
regexp_split_to_array(string, regex[, options])Alias for string_split_regex.
regexp_split_to_table(string, regex)Splits the string along the regex and returns a row for each part.
repeat(string, count)Repeats the string count number of times.
replace(string, source, target)Replaces any occurrences of the source with target in string.
reverse(string)Reverses the string.
right(string, count)Extract the right-most count characters.
right_grapheme(string, count)Extracts the right-most count grapheme clusters.
rpad(string, count, character)Pads the string with the character on the right until it has count characters. Truncates the string on the right if it has more than count characters.
rtrim(string[, characters])Removes any occurrences of any of the characters from the right side of the string. characters defaults to space.
sha1(value)Returns a VARCHAR with the SHA-1 hash of the value.
sha256(value)Returns a VARCHAR with the SHA-256 hash of the value.
split(string, separator)Alias for string_split.
split_part(string, separator, index)Splits the string along the separator and returns the data at the (1-based) index of the list. If the index is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior).
starts_with(string, search_string)Returns true if string begins with search_string.
str_split(string, separator)Alias for string_split.
str_split_regex(string, regex[, options])Alias for string_split_regex.
string_split(string, separator)Splits the string along the separator.
string_split_regex(string, regex[, options])Splits the string along the regex. A set of optional regex options can be set.
string_to_array(string, separator)Alias for string_split.
strip_accents(string)Strips accents from string.
strlen(string)Number of bytes in string.
strpos(string, search_string)Alias for instr.
substr(string, start[, length])Alias for substring.
substring(string, start[, length])Extracts substring starting from character start up to the end of the string. If optional argument length is set, extracts a substring of length characters instead. Note that a start value of 1 refers to the first character of the string.
substring_grapheme(string, start[, length])Extracts substring starting from grapheme clusters start up to the end of the string. If optional argument length is set, extracts a substring of length grapheme clusters instead. Note that a start value of 1 refers to the first character of the string.
suffix(string, search_string)Returns true if string ends with search_string. Note that collations are not supported.
to_base(number, radix[, min_length])Converts number to a string in the given base radix, optionally padding with leading zeros to min_length.
to_base64(blob)Converts a blob to a base64 encoded string.
to_binary(string)Alias for bin.
to_hex(string)Alias for hex.
translate(string, from, to)Replaces each character in string that matches a character in the from set with the corresponding character in the to set. If from is longer than to, occurrences of the extra characters in from are deleted.
trim(string[, characters])Removes any occurrences of any of the characters from either side of the string. characters defaults to space.
ucase(string)Alias for upper.
unbin(value)Converts a value from binary representation to a blob.
unhex(value)Converts a value from hexadecimal representation to a blob.
unicode(string)Returns an INTEGER representing the unicode codepoint of the first character in the string.
upper(string)Converts string to upper case.
url_decode(string)Decodes a URL from a representation using Percent-Encoding.
url_encode(string)Encodes a URL to a representation using Percent-Encoding.

string[index]

Extracts a single character using a (1-based) index. Alias: array_extract.

Query
SELECT 'SereneDB'[4] AS char;
Result
 char------ e

string[begin:end]

Extracts a string using slice conventions similar to Python. Missing begin or end arguments are interpreted as the beginning or end of the list respectively. Negative values are accepted. Alias: array_slice.

Query
SELECT 'SereneDB'[:4] AS slice;
Result
 slice------- Sere

string LIKE target

Returns true if the string matches the like specifier (see Pattern Matching).

Query
SELECT 'hello' LIKE '%lo' AS matches;
Result
 matches--------- t

string SIMILAR TO regex

Returns true if the string matches the regex (see Pattern Matching). Alias: regexp_full_match.

Query
SELECT 'hello' SIMILAR TO 'l+' AS matches;
Result
 matches--------- f

arg1 || arg2

Concatenates two strings, lists, or blobs. Any NULL input results in NULL. See also concat(arg1, arg2, ...) and list_concat(list1, list2, ...).

Query
SELECT 'Serene' || 'DB' AS strings;
SELECT [1, 2, 3] || [4, 5, 6] AS lists;
SELECT '\xAA'::BLOB || '\xBB'::BLOB AS blobs;
Result
 strings---------- SereneDB
 lists--------------- {1,2,3,4,5,6}
 blobs--------- \\xaabb

array_extract(string, index)

Extracts a single character from a string using a (1-based) index.

Query
SELECT array_extract('SereneDB', 2) AS char;
Result
 char------ e

array_slice(list, begin, end)

Extracts a sublist or substring using slice conventions. Negative values are accepted. Alias: list_slice.

Query
SELECT array_slice('SereneDB', 3, 4) AS slice;
SELECT array_slice('SereneDB', 3, NULL) AS slice;
SELECT array_slice('SereneDB', 0, -3) AS slice;
Result
 slice------- re
 slice------- NULL
 slice-------- Serene

ascii(string)

Returns an integer that represents the Unicode code point of the first character of the string.

Query
SELECT ascii('Ω') AS code_point;
Result
 code_point------------        937

bar(x, min, max[, width])

Draws a band whose width is proportional to (x - min) and equal to width characters when x = max. width defaults to 80.

Query
SELECT bar(5, 0, 20, 10) AS band;
Result
 band------ ██▌

bin(string)

Converts the string to binary representation. Alias: to_binary.

Query
SELECT bin('Aa') AS binary;
Result
 binary------------------ 0100000101100001

bit_length(string)

Number of bits in a string.

Query
SELECT bit_length('abc') AS bits;
Result
 bits------   24

chr(code_point)

Returns a character which is corresponding the ASCII code value or Unicode code point.

Query
SELECT chr(65) AS char;
Result
 char------ A

concat(value, ...)

Concatenates multiple strings or lists. NULL inputs are skipped. See also operator ||.

Query
SELECT concat('Hello', ' ', 'World') AS result;
SELECT concat([1, 2, 3], NULL, [4, 5, 6]) AS result;
Result
 result------------- Hello World
 result--------------- {1,2,3,4,5,6}

concat_ws(separator, string, ...)

Concatenates many strings, separated by separator. NULL inputs are skipped.

Query
SELECT concat_ws(', ', 'Banana', 'Apple', 'Melon') AS result;
Result
 result---------------------- Banana, Apple, Melon

contains(string, search_string)

Returns true if search_string is found within string.

Query
SELECT contains('abc', 'a') AS result;
Result
 result-------- t

format(format, ...)

Formats a string using PostgreSQL-style format specifiers (%s, %I, %L, %%).

Query
SELECT format('Hello %s, you are %s', 'World', 42) AS greeting;
Result
 greeting------------------------- Hello World, you are 42

formatReadableDecimalSize(integer)

Converts integer to a human-readable representation using units based on powers of 10 (KB, MB, GB, etc.).

Query
SELECT formatReadableDecimalSize(16000) AS result;
Result
 result--------- 16.0 kB

format_bytes(integer)

Converts integer to a human-readable representation using units based on powers of 2 (KiB, MiB, GiB, etc.). Aliases: formatReadableSize, pg_size_pretty.

Query
SELECT format_bytes(16_000) AS result;
Result
 result---------- 15.6 KiB

from_base64(string)

Converts a base64 encoded string to a character string (BLOB).

Query
SELECT from_base64('QQ==') AS result;
Result
 result-------- \\x41

greatest(arg1, ...)

Returns the largest value in lexicographical order. Note that lowercase characters are considered larger than uppercase characters and collations are not supported.

Query
SELECT greatest(42, 84) AS result;
SELECT greatest('abc', 'bcd', 'cde', 'EFG') AS result;
Result
 result--------     84
 result-------- cde

hash(value, ...)

Returns a UBIGINT with the hash of the value. Note that this is not a cryptographic hash.

Query
SELECT hash('🌊') AS result;
Result
 result---------------------- 12255790945742965276

hex(string)

Converts the string to hexadecimal representation. Alias: to_hex.

Query
SELECT hex('Hello') AS result;
Result
 result------------ 48656C6C6F

ilike_escape(string, like_specifier, escape_character)

Returns true if the string matches the like_specifier (see Pattern Matching) using case-insensitive matching. escape_character is used to search for wildcard characters in the string.

Query
SELECT ilike_escape('A%c', 'a$%C', '$') AS result;
Result
 result-------- t

instr(string, search_string)

Returns location of first occurrence of search_string in string, counting from 1. Returns 0 if no match found. Aliases: position, strpos.

Query
SELECT instr('test test', 'es') AS result;
Result
 result--------      2

least(arg1, ...)

Returns the smallest value in lexicographical order. Note that uppercase characters are considered smaller than lowercase characters and collations are not supported.

Query
SELECT least(42, 84) AS result;
SELECT least('abc', 'bcd', 'cde', 'EFG') AS result;
Result
 result--------     42
 result-------- EFG

left(string, count)

Extracts the left-most count characters.

Query
SELECT left('Hello🌊', 2) AS result;
Result
 result-------- He

left_grapheme(string, count)

Extracts the left-most count grapheme clusters.

Query
SELECT left_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1) AS result;
Result
 result--------- 🤦🏼‍♂️

length(string)

Number of characters in string. Aliases: char_length, character_length, len.

Query
SELECT length('Hello🌊') AS result;
Result
 result--------      6

length_grapheme(string)

Number of grapheme clusters in string.

Query
SELECT length_grapheme('🤦🏼‍♂️🤦🏽‍♀️') AS result;
Result
 result--------      2

like_escape(string, like_specifier, escape_character)

Returns true if the string matches the like_specifier (see Pattern Matching) using case-sensitive matching. escape_character is used to search for wildcard characters in the string.

Query
SELECT like_escape('a%c', 'a$%c', '$') AS result;
Result
 result-------- t

lower(string)

Converts string to lower case. Alias: lcase.

Query
SELECT lower('Hello') AS result;
Result
 result-------- hello

lpad(string, count, character)

Pads the string with the character on the left until it has count characters. Truncates the string on the right if it has more than count characters.

Query
SELECT lpad('hello', 8, '>') AS result;
Result
 result---------- >>>hello

ltrim(string[, characters])

Removes any occurrences of any of the characters from the left side of the string. characters defaults to space.

Query
SELECT ltrim('    test  ') AS result;
SELECT ltrim('>>>>test<<', '><') AS result;
Result
 result-------- test
 result-------- test<<

md5(string)

Returns the MD5 hash of the string as a VARCHAR.

Query
SELECT md5('abc') AS result;
Result
 result---------------------------------- 900150983cd24fb0d6963f7d28e17f72

md5_number(string)

Returns the MD5 hash of the string as a HUGEINT.

Query

md5_number_lower(string)

Returns the lower 64-bit segment of the MD5 hash of the string as a UBIGINT.

Query
SELECT md5_number_lower('abc') AS result;
Result
 result--------------------- 8250560606382298838

md5_number_upper(string)

Returns the upper 64-bit segment of the MD5 hash of the string as a UBIGINT.

Query
SELECT md5_number_upper('abc') AS result;
Result
 result---------------------- 12704604231530709392

nfc_normalize(string)

Converts string to Unicode NFC normalized string. Useful for comparisons and ordering if text data is mixed between NFC normalized and not.

Query
SELECT nfc_normalize('ardèch') AS result;
Result
 result-------- ardèch

not_ilike_escape(string, like_specifier, escape_character)

Returns false if the string matches the like_specifier (see Pattern Matching) using case-insensitive matching. escape_character is used to search for wildcard characters in the string.

Query
SELECT not_ilike_escape('A%c', 'a$%C', '$') AS result;
Result
 result-------- f

not_like_escape(string, like_specifier, escape_character)

Returns false if the string matches the like_specifier (see Pattern Matching) using case-sensitive matching. escape_character is used to search for wildcard characters in the string.

Query
SELECT not_like_escape('a%c', 'a$%c', '$') AS result;
Result
 result-------- f

parse_dirname(path[, separator])

Returns the top-level directory name from the given path. separator options: system, both_slash (default), forward_slash, backslash.

Query
SELECT parse_dirname('path/to/file.csv', 'system') AS result;
Result
 result-------- path

parse_dirpath(path[, separator])

Returns the head of the path (the pathname until the last slash) similarly to Python's os.path.dirname. separator options: system, both_slash (default), forward_slash, backslash.

Query
SELECT parse_dirpath('path/to/file.csv', 'forward_slash') AS result;
Result
 result--------- path/to

parse_filename(string[, trim_extension][, separator])

Returns the last component of the path similarly to Python's os.path.basename function. If trim_extension is true, the file extension will be removed (defaults to false). separator options: system, both_slash (default), forward_slash, backslash.

Query
SELECT parse_filename('path/to/file.csv', true, 'forward_slash') AS result;
Result
 result-------- file

parse_path(path[, separator])

Returns a list of the components (directories and filename) in the path similarly to Python's pathlib.parts function. separator options: system, both_slash (default), forward_slash, backslash.

Query
SELECT parse_path('path/to/file.csv', 'system') AS result;
Result
 result-------------------- {path,to,file.csv}

position(search_string IN string)

Return location of first occurrence of search_string in string, counting from 1. Returns 0 if no match found. Aliases: instr, strpos.

Query
SELECT position('b' IN 'abc') AS result;
Result
 result--------      2

prefix(string, search_string)

Returns true if string starts with search_string.

Query
SELECT prefix('abc', 'ab') AS result;
Result
 result-------- t

printf(format, ...)

Formats a string using printf syntax.

Query
SELECT printf('Benchmark "%s" took %d seconds', 'CSV', 42) AS result;
Result
 result--------------------------------- Benchmark "CSV" took 42 seconds

read_text(source)

A table function that returns the content from source (a filename, a list of filenames, or a glob pattern); query it with FROM. It yields filename, content, size and last_modified columns, with content as a VARCHAR. The file content is first validated to be valid UTF-8. If read_text attempts to read a file with invalid UTF-8 an error is thrown suggesting to use read_blob instead. See the read_text guide for more details.

Query

regexp_escape(string)

Escapes special patterns to turn string into a regular expression similarly to Python's re.escape function.

Query
SELECT regexp_escape('https://serenedb.com') AS result;
Result
 result------------------------------ https\\:\\/\\/serenedb\\.com

regexp_extract(string, regex[, group][, options])

If string contains the regex pattern, returns the capturing group specified by optional parameter group; otherwise, returns the empty string. The group must be a constant value. If no group is given, it defaults to 0. A set of optional regex options can be set.

Query
SELECT regexp_extract('ABC', '([a-z])(b)', 1, 'i') AS result;
Result
 result-------- A

regexp_extract(string, regex, name_list[, options])

If string contains the regex pattern, returns the capturing groups as a struct with corresponding names from name_list; otherwise, returns a struct with the same keys and empty strings as values. A set of optional regex options can be set.

Query
SELECT regexp_extract('John Doe', '([a-z]+) ([a-z]+)', ['first_name', 'last_name'], 'i') AS result;
Result
 result------------ (John,Doe)

regexp_extract_all(string, regex[, group][, options])

Finds non-overlapping occurrences of the regex in the string and returns the corresponding values of the capturing group. A set of optional regex options can be set.

Query
SELECT regexp_extract_all('Peter: 33, Paul:14', '(\w+):\s*(\d+)', 2) AS result;
Result
 result--------- {33,14}

regexp_extract_all(string, regex, name_list[, options])

Finds non-overlapping occurrences of regex in string and returns the capturing groups as a list of structs with corresponding names from name_list. A set of optional regex options can be set.

Query
SELECT regexp_extract_all('Peter: 33, Paul: 14', '(\w+):\s*(\d+)', ['name', 'age']) AS result;
Result
 result---------------------------- {"(Peter,33)","(Paul,14)"}

regexp_full_match(string, regex[, col2])

Returns true if the entire string matches the regex. A set of optional regex options can be set.

Query
SELECT regexp_full_match('anabanana', '(an)*') AS result;
Result
 result-------- f

regexp_matches(string, regex[, options])

Returns true if string contains the regex, false otherwise. A set of optional regex options can be set.

Query
SELECT regexp_matches('anabanana', '(an)*') AS result;
Result
 result-------- t

regexp_replace(string, regex, replacement[, options])

If string contains the regex, replaces the matching part with replacement. A set of optional regex options can be set.

Query
SELECT regexp_replace('hello', '[lo]', '-') AS result;
Result
 result-------- he-lo

regexp_split_to_table(string, regex)

Splits the string along the regex and returns a row for each part.

Query
SELECT * FROM regexp_split_to_table('hello world; 42', ';? ');
Result
 regexp_split_to_table----------------------- hello world 42

repeat(string, count)

Repeats the string count number of times.

Query
SELECT repeat('A', 5) AS result;
Result
 result-------- AAAAA

replace(string, source, target)

Replaces any occurrences of the source with target in string.

Query
SELECT replace('hello', 'l', '-') AS result;
Result
 result-------- he--o

reverse(string)

Reverses the string.

Query
SELECT reverse('hello') AS result;
Result
 result-------- olleh

right(string, count)

Extract the right-most count characters.

Query
SELECT right('Hello🌊', 3) AS result;
Result
 result-------- lo🌊

right_grapheme(string, count)

Extracts the right-most count grapheme clusters.

Query
SELECT right_grapheme('🤦🏼‍♂️🤦🏽‍♀️', 1) AS result;
Result
 result--------- 🤦🏽‍♀️

rpad(string, count, character)

Pads the string with the character on the right until it has count characters. Truncates the string on the right if it has more than count characters.

Query
SELECT rpad('hello', 10, '<') AS result;
Result
 result------------ hello<<<<<

rtrim(string[, characters])

Removes any occurrences of any of the characters from the right side of the string. characters defaults to space.

Query
SELECT rtrim('    test  ') AS result;
SELECT rtrim('>>>>test<<', '><') AS result;
Result
 result----------     test
 result---------- >>>>test

sha1(value)

Returns a VARCHAR with the SHA-1 hash of the value.

Query
SELECT sha1('🌊') AS result;
Result
 result------------------------------------------ be25fe9cf30cf5c01ae2054b60a5e5511e966d98

sha256(value)

Returns a VARCHAR with the SHA-256 hash of the value

Query
SELECT sha256('🌊') AS result;
Result
 result------------------------------------------------------------------ 8db6bed4743d5a6159e0742df0fded8377d1250420aeda292e41ffa20eba806e

split_part(string, separator, index)

Splits the string along the separator and returns the data at the (1-based) index of the list. If the index is outside the bounds of the list, return an empty string (to match PostgreSQL's behavior).

Query
SELECT split_part('a;b;c', ';', 2) AS result;
Result
 result-------- b

starts_with(string, search_string)

Returns true if string begins with search_string. Alias: ^@.

Query
SELECT starts_with('abc', 'a') AS result;
Result
 result-------- t

string_split(string, separator)

Splits the string along the separator. Aliases: split, str_split, string_to_array.

Query
SELECT string_split('hello-world', '-') AS result;
Result
 result--------------- {hello,world}

string_split_regex(string, regex[, options])

Splits the string along the regex. A set of optional regex options can be set. Aliases: regexp_split_to_array, str_split_regex.

Query
SELECT string_split_regex('hello world; 42', ';? ') AS result;
Result
 result------------------ {hello,world,42}

strip_accents(string)

Strips accents from string.

Query
SELECT strip_accents('mühleisen') AS result;
Result
 result----------- muhleisen

strlen(string)

Number of bytes in string.

Query
SELECT strlen('🌊') AS result;
Result
 result--------      4

substring(string, start[, length])

Extracts substring starting from character start up to the end of the string. If optional argument length is set, extracts a substring of length characters instead. Note that a start value of 1 refers to the first character of the string. Alias: substr.

Query
SELECT substring('Hello', 2) AS result;
SELECT substring('Hello', 2, 2) AS result;
Result
 result-------- ello
 result-------- el

substring_grapheme(string, start[, length])

Extracts substring starting from grapheme clusters start up to the end of the string. If optional argument length is set, extracts a substring of length grapheme clusters instead. Note that a start value of 1 refers to the first character of the string.

Query
SELECT substring_grapheme('🌊🤦🏼‍♂️🤦🏽‍♀️🌊', 3) AS result;
SELECT substring_grapheme('🌊🤦🏼‍♂️🤦🏽‍♀️🌊', 3, 2) AS result;
Result
 result----------- 🤦🏽‍♀️🌊
 result----------- 🤦🏽‍♀️🌊

suffix(string, search_string)

Returns true if string ends with search_string. Note that collations are not supported. Alias: ends_with.

Query
SELECT suffix('abc', 'bc') AS result;
Result
 result-------- t

to_base(number, radix[, min_length])

Converts number to a string in the given base radix, optionally padding with leading zeros to min_length.

Query
SELECT to_base(42, 16, 5) AS result;
Result
 result-------- 0002A

to_base64(blob)

Converts a blob to a base64 encoded string. Alias: base64.

Query
SELECT to_base64('A'::BLOB) AS result;
Result
 result-------- QQ==

translate(string, from, to)

Replaces each character in string that matches a character in the from set with the corresponding character in the to set. If from is longer than to, occurrences of the extra characters in from are deleted.

Query
SELECT translate('12345', '143', 'ax') AS result;
Result
 result-------- a2x5

trim(string[, characters])

Removes any occurrences of any of the characters from either side of the string. characters defaults to space.

Query
SELECT trim('    test  ') AS result;
SELECT trim('>>>>test<<', '><') AS result;
Result
 result-------- test
 result-------- test

unbin(value)

Converts a value from binary representation to a blob. Alias: from_binary.

Query
SELECT unbin('0110') AS result;
Result
 result-------- \\x06

unhex(value)

Converts a value from hexadecimal representation to a blob. Alias: from_hex.

Query
SELECT unhex('2A') AS result;
Result
 result-------- \\x2a

unicode(string)

Returns an INTEGER representing the unicode codepoint of the first character in the string. Alias: ord.

Query
SELECT [unicode('âbcd'), unicode('â'), unicode(''), unicode(NULL)] AS result;
Result
 result------------------- {226,226,-1,NULL}

upper(string)

Converts string to upper case. Alias: ucase.

Query
SELECT upper('Hello') AS result;
Result
 result-------- HELLO

url_decode(string)

Decodes a URL from a representation using Percent-Encoding.

Query
SELECT url_decode('https%3A%2F%2Fserenedb.com%2Fwhy_serenedb%23portable') AS result;
Result
 result-------------------------------------------- https://serenedb.com/why_serenedb#portable

url_encode(string)

Encodes a URL to a representation using Percent-Encoding.

Query
SELECT url_encode('this string has/ special+ characters>') AS result;
Result
 result----------------------------------------------------- this%20string%20has%2F%20special%2B%20characters%3E

Text Similarity Functions

These functions are used to measure the similarity of two strings using various similarity measures.

FunctionDescription
damerau_levenshtein(s1, s2)Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., a and A) are considered different.
editdist3(s1, s2)Alias for levenshtein.
hamming(s1, s2)The Hamming distance between two strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., a and A) are considered different.
jaccard(s1, s2)The Jaccard similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1.
jaro_similarity(s1, s2[, score_cutoff])The Jaro similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1. For similarity < score_cutoff, 0 is returned instead. score_cutoff defaults to 0.
jaro_winkler_similarity(s1, s2[, score_cutoff])The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1. For similarity < score_cutoff, 0 is returned instead. score_cutoff defaults to 0.
levenshtein(s1, s2)The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., a and A) are considered different.
mismatches(s1, s2)Alias for hamming.

damerau_levenshtein(s1, s2)

Extension of Levenshtein distance to also include transposition of adjacent characters as an allowed edit operation. In other words, the minimum number of edit operations (insertions, deletions, substitutions or transpositions) required to change one string to another. Characters of different cases (e.g., a and A) are considered different.

Query
SELECT damerau_levenshtein('serene', 'esrnee') AS result;
Result
 result--------      2

hamming(s1, s2)

The Hamming distance between two strings, i.e., the number of positions with different characters for two strings of equal length. Strings must be of equal length. Characters of different cases (e.g., a and A) are considered different. Alias: mismatches.

Query
SELECT hamming('serene', 'serine') AS result;
Result
 result--------      1

jaccard(s1, s2)

The Jaccard similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1.

Query
SELECT jaccard('serene', 'serine') AS result;
Result
 result--------    0.8

jaro_similarity(s1, s2[, score_cutoff])

The Jaro similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1. For similarity < score_cutoff, 0 is returned instead. score_cutoff defaults to 0.

Query
SELECT jaro_similarity('serene', 'serenedb') AS result;
Result
 result-------------------- 0.9166666666666666

jaro_winkler_similarity(s1, s2[, score_cutoff])

The Jaro-Winkler similarity between two strings. Characters of different cases (e.g., a and A) are considered different. Returns a number between 0 and 1. For similarity < score_cutoff, 0 is returned instead. score_cutoff defaults to 0.

Query
SELECT jaro_winkler_similarity('serene', 'serenedb') AS result;
Result
 result--------   0.95

levenshtein(s1, s2)

The minimum number of single-character edits (insertions, deletions or substitutions) required to change one string to the other. Characters of different cases (e.g., a and A) are considered different. Alias: editdist3.

Query
SELECT levenshtein('serene', 'serenedb') AS result;
Result
 result--------      2

Formatters

format Syntax

The format(format, ...) function formats a string using PostgreSQL-style format specifiers — each specifier in format is replaced by the corresponding argument. (This is PostgreSQL's format, not the {}-placeholder syntax of the {fmt} library.)

A specifier has the form %[position$][-][width]type, where type is one of:

TypeDescription
%sThe argument as a string.
%IThe argument as a quoted SQL identifier (double-quoted when needed).
%LThe argument as a quoted SQL literal (single-quoted, embedded quotes escaped; NULL becomes NULL).
%%A literal % (consumes no argument).

Only s, I and L are valid types — any other (e.g. %d) raises unrecognized format() type specifier.

Basic substitution with %s:

Query
SELECT format('Hello %s, you are %s', 'World', 42) AS greeting;
Result
 greeting------------------------- Hello World, you are 42

%I and %L safely quote identifiers and literals — useful for building SQL dynamically:

Query
SELECT format('INSERT INTO %I VALUES (%L)', 'my table', 'a''b') AS stmt;
Result
 stmt---------------------------------------- INSERT INTO "my table" VALUES ('a''b')

Arguments are consumed in order; an explicit position can be given as %n$ (e.g. %1$s), which also lets you reuse and reorder them:

Query
SELECT format('%1$s %1$s %2$s', 'a', 'b') AS repeated;
Result
 repeated---------- a a b

An optional minimum field width right-justifies the value (left-justifies with a leading -), padding with spaces:

Query
SELECT format('|%-10s|%10s|', 'left', 'right') AS padded;
Result
 padded------------------------- |left      |     right|

printf Syntax

The printf(format, parameters...) function formats strings using the printf syntax.

Format without additional parameters:

Query
SELECT printf('Hello world');
Result
 printf------------- Hello world

Format a string using arguments in a given order:

Query
SELECT printf('The answer to %s is %d', 'life', 42);
Result
 printf-------------------------- The answer to life is 42

Format a string using positional arguments %position$formatter, e.g., the second parameter as a string is encoded as %2$s:

Query
SELECT printf('I''d rather be %2$s than %1$s.', 'right', 'happy');
Result
 printf--------------------------------- I'd rather be happy than right.

Format Specifiers

SpecifierDescriptionExample
%ccharacter code to charactera
%dinteger654321
%Escientific notation3.141593E+00
%ffloat4.560000
%hdinteger truncated to a C short-1039
%hhdinteger truncated to a C signed char-15
%lldinteger654321
%ooctal2375761
%sstringasd
%xhexadecimal9fbf1

Formatting Types

Integers:

Query
SELECT printf('%d + %d = %d', 3, 5, 3 + 5);
Result
 printf----------- 3 + 5 = 8

Booleans are not accepted as %s arguments and raise invalid format specifier; cast them to text first:

Query
SELECT printf('%s != %s', true, false);
Result
db error: ERROR: invalid format specifier

Format datetime values:

Query
SELECT printf('%s', DATE '1992-01-01');
SELECT printf('%s', TIME '12:01:00');
SELECT printf('%s', TIMESTAMP '1992-01-01 12:01:00');
Result
 printf------------ 1992-01-01
 printf---------- 12:01:00
 printf--------------------- 1992-01-01 12:01:00

Format a BLOB (rendered as a hex string):

Query
SELECT printf('%s', 'hello'::BLOB) AS printf;
Result
 printf--------------- \\x68656c6c6f

Pad integers with 0s:

Query
SELECT printf('%04d', 33);
Result
 printf--------   0033

Create timestamps from integers:

Query
SELECT printf('%02d:%02d:%02d %s', 12, 3, 16, 'AM');
Result
 printf------------- 12:03:16 AM

Convert to hexadecimal:

Query
SELECT printf('%x', 123_456_789);
Result
 printf--------- 75bcd15

There is no binary (%b) specifier; it raises invalid format specifier. Use bin to obtain a binary representation:

Query
SELECT printf('%b', 123_456_789);
Result
db error: ERROR: invalid format specifier

Thousand Separators

printf does not support thousand-separator specifiers (such as %,d, %_d or %''d); they raise invalid format specifier:

Query
SELECT printf('%,d',  123_456_789);
SELECT printf('%.d',  123_456_789);
SELECT printf('%''d', 123_456_789);
SELECT printf('%_d',  123_456_789);
Result
db error: ERROR: invalid format specifier
 printf----------- 123456789
db error: ERROR: invalid format specifier
db error: ERROR: invalid format specifier

This also applies to floats and doubles:

Query
SELECT printf('%,f',   123456.789);
SELECT printf('%,.2f', 123456.789);
Result
db error: ERROR: invalid format specifier
db error: ERROR: invalid format specifier