Full-Text Search Functions
The @@ match operator and the TSQUERY constructors, operators and PostgreSQL-compatible parsers used to build full-text queries against an inverted index. Every example below shares the dataset created in Setup; see Full-Text Search for a task-oriented guide.
Setup
The examples on this page share one dataset. Expand to see the schema and sample data.
CREATE TABLE docs (id INTEGER PRIMARY KEY, body VARCHAR, category VARCHAR, dual VARCHAR);
CREATE TEXT SEARCH DICTIONARY en ( template = 'text', locale = 'en_US.UTF-8', case = 'lower', stemming = false, accent = false, frequency = true, position = true, offset = true);
CREATE INDEX docs_idx ON docs USING inverted (id, body en, category, dual en) INCLUDE (dual);
INSERT INTO docs VALUES (1, 'quick brown fox jumps over lazy dog', 'sci-fi', 'x'), (2, 'quick red fox jumps over lazy dog', 'sci-fi', NULL), (3, 'slow grey turtle sleeps', 'drama', 'y'), (4, 'something else entirely', 'comedy', 'z');
VACUUM (REFRESH_TABLE) docs;
CREATE TABLE titles (id INTEGER PRIMARY KEY, title VARCHAR);
CREATE TEXT SEARCH DICTIONARY bigram (template = 'ngram', mingram = 2, maxgram = 2, frequency = true, position = true);
CREATE INDEX titles_idx ON titles USING inverted (id, title bigram);
INSERT INTO titles VALUES (1, 'hello'), (2, 'help'), (3, 'world'), (4, 'held');
VACUUM (REFRESH_TABLE) titles;
CREATE TABLE vecs (id INTEGER, emb FLOAT[3]);
CREATE INDEX vecs_l2 ON vecs USING inverted (id, emb hnsw (metric = 'l2'));
INSERT INTO vecs VALUES (1, [1, 0, 0]::FLOAT[3]), (2, [0, 5, 0]::FLOAT[3]), (3, [0, 0, 1]::FLOAT[3]);
VACUUM (REFRESH_TABLE) vecs;
CREATE TABLE vecs_c (id INTEGER, emb FLOAT[3]);
CREATE INDEX vecs_cos ON vecs_c USING inverted (id, emb hnsw (metric = 'cosine'));
INSERT INTO vecs_c VALUES (1, [1, 0, 0]::FLOAT[3]), (2, [0, 1, 0]::FLOAT[3]), (3, [1, 1, 0]::FLOAT[3]);
VACUUM (REFRESH_TABLE) vecs_c;
CREATE TABLE vecs_l (id INTEGER, emb FLOAT[3]);
CREATE INDEX vecs_l1 ON vecs_l USING inverted (id, emb hnsw (metric = 'l1'));
INSERT INTO vecs_l VALUES (1, [1, 0, 0]::FLOAT[3]), (2, [0, 5, 0]::FLOAT[3]), (3, [0, 0, 1]::FLOAT[3]);
VACUUM (REFRESH_TABLE) vecs_l;
CREATE TABLE vecs_i (id INTEGER, emb FLOAT[3]);
CREATE INDEX vecs_ip ON vecs_i USING inverted (id, emb hnsw (metric = 'ip'));
INSERT INTO vecs_i VALUES (1, [1, 0, 0]::FLOAT[3]), (2, [0, 5, 0]::FLOAT[3]), (3, [2, 2, 2]::FLOAT[3]);
VACUUM (REFRESH_TABLE) vecs_i;
CREATE TABLE geo (id INTEGER PRIMARY KEY, geo GEOMETRY('OGC:CRS84'));
CREATE TEXT SEARCH DICTIONARY gj (template = 'geojson', coding = 's2point');
CREATE INDEX geo_idx ON geo USING inverted (id, geo gj);
INSERT INTO geo VALUES (1, 'POINT(37.6 55.7)'::GEOMETRY('OGC:CRS84')), (2, 'POINT(37.7 55.8)'::GEOMETRY('OGC:CRS84'));
VACUUM (REFRESH_TABLE) geo;
CREATE TABLE scored (id INTEGER PRIMARY KEY, body VARCHAR);
CREATE TEXT SEARCH DICTIONARY scored_en ( template = 'text', locale = 'en_US.UTF-8', case = 'lower', stemming = false, accent = false, frequency = true, position = true, norm = true);
CREATE INDEX scored_idx ON scored USING inverted (id, body scored_en);
INSERT INTO scored VALUES (1, 'the quick brown fox'), (2, 'fox fox fox in the henhouse'), (3, 'a clever fox outwits two hounds in the meadow'), (4, 'quick brown dog runs'), (5, 'the lazy dog sleeps all day');
VACUUM (REFRESH_TABLE) scored;
CREATE TABLE geo_demo (id INTEGER PRIMARY KEY, name VARCHAR, shape JSON);
CREATE TEXT SEARCH DICTIONARY geo_demo_dict (template = 'geojson');
CREATE INDEX geo_demo_idx ON geo_demo USING inverted (id, name, shape geo_demo_dict);
INSERT INTO geo_demo VALUES (1, 'red_square', '{"type":"Polygon","coordinates":[[[0,0],[10,0],[10,10],[0,10],[0,0]]]}'), (2, 'inner_point', '{"type":"Point","coordinates":[5,5]}'), (3, 'edge_point', '{"type":"Point","coordinates":[10,5]}'), (4, 'far_point', '{"type":"Point","coordinates":[50,50]}'), (5, 'overlap_box', '{"type":"Polygon","coordinates":[[[5,5],[15,5],[15,15],[5,15],[5,5]]]}');
VACUUM (REFRESH_TABLE) geo_demo;
CREATE TABLE cities (id INTEGER PRIMARY KEY, name VARCHAR, loc GEOMETRY('OGC:CRS84'));
CREATE TEXT SEARCH DICTIONARY cities_dict (template = 'geojson', coding = 's2point');
CREATE INDEX cities_idx ON cities USING inverted (id, name, loc cities_dict);
INSERT INTO cities VALUES (1, 'kremlin', 'POINT(37.617499 55.752023)'::GEOMETRY('OGC:CRS84')), (2, 'red_square', 'POINT(37.620795 55.753930)'::GEOMETRY('OGC:CRS84')), (3, 'gorky_park', 'POINT(37.601111 55.731389)'::GEOMETRY('OGC:CRS84')), (4, 'spb_palace', 'POINT(30.314611 59.939095)'::GEOMETRY('OGC:CRS84'));
VACUUM (REFRESH_TABLE) cities;
CREATE TABLE passages (id INTEGER PRIMARY KEY, body VARCHAR);
CREATE TEXT SEARCH DICTIONARY passages_en ( template = 'text', locale = 'en_US.UTF-8', case = 'lower', stemming = false, accent = false, frequency = true, position = true, offset = true);
CREATE INDEX passages_idx ON passages USING inverted (id, body passages_en);
INSERT INTO passages VALUES (1, 'the quick brown fox jumps over the lazy dog'), (2, 'a quick red fox runs fast'), (3, 'First sentence here. The quick brown fox is in this sentence. Third one.');
VACUUM (REFRESH_TABLE) passages;Match Operator
| Function | Description |
|---|---|
column @@ tsquery | Match predicate: rows where the indexed column satisfies the query. |
column @@ tsquery
Filters to rows where the indexed column satisfies a TSQUERY.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | any indexed column | — | A column covered by an inverted index. |
tsquery | TSQUERY | — | The query to test against the column. A bare string literal is accepted and tokenized by the column's dictionary. |
How it works. @@ is the single entry point that turns a TSQUERY into an inverted-index scan. The whole expression on the right of @@ is claimed by the index at bind time and evaluated by the index scan, not row by row. The operator is commutative — tsquery @@ column is identical. Although it is typed BOOLEAN, it is only valid in a WHERE clause against an inverted-indexed column; using it as a standalone expression (for example in the SELECT list) raises an error, so it is not a general-purpose boolean. Likewise the TSQUERY constructors below only have meaning inside an @@ match — evaluating one on its own raises an error.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_phrase('fox') | 1, 2 | Both bodies contain fox. |
body @@ 'fox' | 1, 2 | A bare string is a valid TSQUERY. |
body @@ ts_phrase('unicorn') | (none) | No body contains unicorn. |
SELECT id FROM docs_idx WHERE body @@ ts_phrase('fox') ORDER BY id; id---- 1 2TSQUERY Constructors
Each returns a TSQUERY. A bare string literal is also a valid TSQUERY — it is tokenized by the column's dictionary (multi-token input uses OR semantics).
| Function | Description |
|---|---|
ts_phrase(text[, gap, text, ...]) | Match the tokens of text as a phrase, with optional gaps. |
ts_tokenize(text[, dictionary]) | Tokenize text into a query, optionally with a named dictionary. |
ts_like(pattern) | Match tokens against a SQL LIKE pattern. |
ts_starts_with(prefix) | Match tokens beginning with prefix. |
ts_regexp(pattern[, syntax]) | Match tokens against a regular expression. |
ts_levenshtein(text[, distance[, transpositions[, prefix]]]) | Match tokens within an edit distance of text. |
ts_ngram(text[, threshold]) | Match by n-gram similarity. |
ts_between(min, max, min_incl, max_incl) | Range match between two bounds. |
ts_lt(value) | Match values less than value. |
ts_le(value) | Match values less than or equal to value. |
ts_gt(value) | Match values greater than value. |
ts_ge(value) | Match values greater than or equal to value. |
ts_any(list[, min_match]) | OR over a list; at least min_match must match. |
ts_all(list) | AND over a list. |
ts_compound(must, must_not, should[, min_should_match]) | Boolean query (the bool analog). |
ts_is_null() | Match rows whose column value is absent. |
ts_is_not_null() | Match rows whose column value is present. |
ts_phrase(text[, gap, text, ...])
Match a run of tokens in their indexed order, optionally separated by token gaps.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR or BLOB | — | A phrase segment. It is tokenized by the column's dictionary; multiple tokens within one segment must be strictly adjacent. |
gap | INTEGER or INTEGER[] | 0 between successive segments | Number of tokens allowed between the two surrounding segments. An integer N means exactly N tokens between; a two-element array [min, max] allows a range. 0 means adjacent. |
text, ... | VARCHAR/BLOB | — | Further segments, each preceded by its own gap. |
How it works. ts_phrase matches positions, so the column's dictionary must have position enabled. The tokens of each text segment must appear adjacent and in order; the optional gap arguments control how far apart consecutive segments may sit. The gap counts the tokens between the two segments — 0 is immediate adjacency, 2 means exactly two intervening tokens. A [min, max] array accepts any gap in that inclusive range. Order is always preserved: ts_phrase('a', 0, 'b') does not match b a.
| Query | Matches id | Why |
|---|---|---|
ts_phrase('quick brown fox') | 1 | Three adjacent tokens in order; only id 1 has brown. |
ts_phrase('quick', 1, 'fox') | 1, 2 | Exactly one token (brown/red) sits between quick and fox. |
ts_phrase('quick', 3, 'over') | 1, 2 | brown/red fox jumps are the three tokens between quick and over. |
ts_phrase('fox', [0, 2], 'dog') | (none) | jumps over lazy are three tokens apart — outside the 0..2 range. |
SELECT id FROM docs_idx WHERE body @@ ts_phrase('quick brown fox') ORDER BY id; id---- 1The gap form expresses proximity directly. To require quick within three tokens before over:
SELECT idFROM docs_idxWHERE body @@ ts_phrase('quick', [0, 3], 'over')ORDER BY id; id---- 1 2ts_tokenize(text[, dictionary])
Analyze text into a query using a chosen dictionary, overriding the column's default analysis.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR/BLOB (or a LIST of them) | — | The text to tokenize. A list yields a LIST(TSQUERY), one per element, for use inside ts_any / ts_all. |
dictionary | VARCHAR | the @@ column's dictionary | Name of the text-search dictionary to analyze with. The special value 'keyword' bypasses analysis and treats text as a single raw token. |
How it works. The one-argument form analyzes text with the same dictionary as the column it is matched against, so the query and the index agree on casing, stemming and stop-words. Naming a dictionary forces a specific analyzer — useful when you want, say, exact 'keyword' matching against a column that is otherwise stemmed. Multi-token output is combined with OR.
::tokenize cast. The cast 'text'::tokenize('dictionary') is exactly equivalent to ts_tokenize('text', 'dictionary') and reads naturally inline — for example WHERE body @@ 'Running'::tokenize('exact'). The cast requires a dictionary name (use 'keyword' to bypass analysis); there is no no-argument cast form.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_tokenize('quick', 'keyword') | 1, 2 | Raw token quick matched verbatim. |
body @@ ts_tokenize('quick grey') | 1, 2, 3 | Two tokens combined with OR. |
body @@ ts_tokenize('QUICK', 'keyword') | (none) | 'keyword' skips lower-casing, so QUICK ≠ indexed quick. |
SELECT id FROM docs_idx WHERE body @@ ts_tokenize('quick', 'keyword') ORDER BY id; id---- 1 2ts_like(pattern)
Match indexed tokens against a SQL LIKE pattern.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
pattern | VARCHAR/BLOB | — | A LIKE pattern. % matches any run of characters (including none), _ matches exactly one character. The pattern is matched against terms as they are stored in the index. |
How it works. ts_like is applied to the indexed term form, not the raw text, so it is not tokenized further — match it against a column whose dictionary stores whole values (such as a keyword-style category column). It is the inverted-index analogue of SQL LIKE: a row matches when any of its indexed tokens satisfies the pattern.
| Query | Matches id | Why |
|---|---|---|
category @@ ts_like('sci%') | 1, 2 | sci-fi starts with sci. |
category @@ ts_like('dram_') | 3 | drama is dram plus exactly one character. |
category @@ ts_like('%edy') | 4 | comedy ends with edy. |
category @@ ts_like('thriller') | (none) | No category equals thriller. |
SELECT id, category FROM docs_idx WHERE category @@ ts_like('sci%') ORDER BY id; id | category----+---------- 1 | sci-fi 2 | sci-fits_starts_with(prefix)
Match any indexed token beginning with prefix.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
prefix | VARCHAR/BLOB | — | The literal leading substring a term must start with. Matched against the indexed term form, with no further tokenization. |
How it works. A row matches when any of its indexed tokens begins with prefix. This is the classic autocomplete primitive — ts_starts_with('app') matches apple, application, app. It is equivalent to ts_like(prefix || '%') but expresses intent more clearly and is the building block behind the a* syntax of to_tsquery.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_starts_with('turt') | 3 | turtle begins with turt. |
body @@ ts_starts_with('qu') | 1, 2 | quick begins with qu. |
body @@ ts_starts_with('zzz') | (none) | No token starts with zzz. |
SELECT id FROM docs_idx WHERE body @@ ts_starts_with('turt') ORDER BY id; id---- 3ts_regexp(pattern[, syntax])
Match indexed tokens against a regular expression.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
pattern | VARCHAR/BLOB | — | The regular expression. It must match a whole indexed term (anchored implicitly, not a substring search). |
syntax | VARCHAR | 'perl' | Dialect: 'perl' (RE2 / Perl-compatible) or 'posix' (POSIX ERE). Case-insensitive. |
How it works. The pattern is applied to each indexed term and a row matches when any term matches. Because terms are stored in their analyzed form (lower-cased by the dictionary in our setup), write the pattern against that form — ts_regexp('QUICK') finds nothing, but the inline flag ts_regexp('(?i)QUICK') does (in 'perl' mode). The 'posix' dialect is handy for bracket-class patterns such as gr[ae]y.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_regexp('qu.*ck') | 1, 2 | quick matches the Perl pattern. |
body @@ ts_regexp('gr[ae]y', 'posix') | 3 | grey matches the POSIX class. |
body @@ ts_regexp('(?i)QUICK') | 1, 2 | The inline (?i) flag makes the match case-insensitive. |
body @@ ts_regexp('z.*') | (none) | No term begins with z. |
SELECT id FROM docs_idx WHERE body @@ ts_regexp('qu.*ck') ORDER BY id; id---- 1 2Use the 'posix' dialect for a POSIX ERE bracket expression:
SELECT idFROM docs_idxWHERE body @@ ts_regexp('gr[ae]y', 'posix')ORDER BY id; id---- 3ts_levenshtein(text[, distance[, transpositions[, prefix]]])
Fuzzy match: find tokens within a bounded edit distance of text — the standard tolerance for typos.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR/BLOB | — | The term to match approximately. When prefix is given, this is only the suffix that is fuzzy-matched (see below). |
distance | INTEGER | auto by length: 0 for ≤ 2 chars, 1 for 3–5, 2 for ≥ 6 | Maximum edit distance (insertions, deletions, substitutions). Allowed range 0–4 (capped at 3 when transpositions is true). |
transpositions | BOOLEAN | true | When true, swapping two adjacent characters counts as one edit (Damerau–Levenshtein); when false, it counts as two. |
prefix | VARCHAR | '' (empty) | A literal leading substring that must match exactly. The term actually matched is prefix || text; only the text portion spends the edit budget. |
How it works. A term matches when its edit distance from the query is at most distance. The one-argument form auto-picks a distance from the query length, mirroring Elasticsearch's AUTO fuzziness: short words tolerate fewer edits. transpositions = true makes a single adjacent-character swap cost one edit instead of two — so quikc reaches quick at distance 1 with transpositions, but needs distance 2 without. The prefix argument anchors an exact leading substring and fuzzy-matches only the rest, which both narrows the candidate set and speeds the scan: ts_levenshtein('X', 1, true, 'quic') requires the literal prefix quic then allows one edit on X, reaching quick. Every indexed term within the edit distance matches — there is no expansion cap (Elasticsearch's max_expansions); narrow the candidate set with prefix instead.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_levenshtein('quikc', 2) | 1, 2 | quikc → quick is within 2 edits. |
body @@ ts_levenshtein('quikc', 1, true) | 1, 2 | The kc↔ck transposition is a single edit. |
body @@ ts_levenshtein('quikc', 1, false) | (none) | Without transpositions, quikc → quick is 2 edits. |
body @@ ts_levenshtein('X', 1, true, 'quic') | 1, 2 | Exact prefix quic plus one edit (X → k) reaches quick. |
SELECT id FROM docs_idx WHERE body @@ ts_levenshtein('quikc', 2) ORDER BY id; id---- 1 2Combine a fixed prefix with a small fuzzy suffix to keep typo-tolerance fast and focused — only terms beginning with quic are considered:
SELECT idFROM docs_idxWHERE body @@ ts_levenshtein('X', 1, true, 'quic')ORDER BY id; id---- 1 2ts_ngram(text[, threshold])
Match by n-gram similarity — fuzzy matching that scores on shared character sequences rather than edit distance.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR/BLOB | — | The term to match approximately. |
threshold | DOUBLE | 0.7 | Minimum similarity, in 0.0–1.0. A term matches when the fraction of n-grams it shares with text is at least threshold. Lower values widen the match (higher recall); higher values tighten it (higher precision). |
How it works. This requires a column tokenized with an n-gram dictionary (our bigram dictionary splits hello into he, el, ll, lo). The query string is split the same way and a term matches when enough of its n-grams overlap. Because it compares sub-sequences, n-gram similarity tolerates insertions, deletions and reorderings and is well suited to short strings and approximate matching where edit distance is too rigid. 1.0 demands an exact n-gram set; 0.3 is permissive.
| Query | Matches id, title | Why |
|---|---|---|
title @@ ts_ngram('hello') | 1 hello | At the default 0.7, only hello shares enough bigrams. |
title @@ ts_ngram('hello', 0.3) | 1 hello, 2 help, 4 held | The lower threshold also admits help and held. |
title @@ ts_ngram('hello', 0.99) | 1 hello | Near-exact n-gram overlap required. |
title @@ ts_ngram('zzzz', 0.3) | (none) | No title shares bigrams with zzzz. |
SELECT id, title FROM titles_idx WHERE title @@ ts_ngram('hello', 0.3) ORDER BY id; id | title----+------- 1 | hello 2 | help 4 | heldN-gram similarity is recall-oriented; SereneDB does not expose a
max_expansions-style cap on the number of candidate terms, so very low thresholds on large vocabularies can be broad. Raisethresholdto tighten results.
ts_between(min, max, min_incl, max_incl)
Range match between two bounds, with explicit control over each end's inclusivity.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
min | any (matches the column) | — (required) | Lower bound, or NULL for unbounded below. |
max | any (matches the column) | — (required) | Upper bound, or NULL for unbounded above. |
min_incl | BOOLEAN | — (required) | true includes min (>=), false excludes it (>). |
max_incl | BOOLEAN | — (required) | true includes max (<=), false excludes it (<). |
How it works. ts_between performs a range scan over a numeric, temporal or verbatim-text (keyword-analyzed) column. Unlike BETWEEN in plain SQL, both inclusivity flags are explicit and required — there is no default. Either bound may be NULL to make that side unbounded; both NULL matches every indexed value. (For a one-sided range you can also use the shorthand ts_lt/ts_le/ts_gt/ts_ge, which take a single non-NULL bound.)
| Query | Matches id | Why |
|---|---|---|
id @@ ts_between(2, 3, true, true) | 2, 3 | Closed interval [2, 3]. |
id @@ ts_between(2, 3, false, false) | (none) | Open interval (2, 3) contains no integer. |
id @@ ts_between(NULL, 2, true, true) | 1, 2 | Unbounded below, up to and including 2. |
id @@ ts_between(3, NULL, true, true) | 3, 4 | From 3 (inclusive) upward, unbounded above. |
SELECT id FROM docs_idx WHERE id @@ ts_between(2, 3, true, true) ORDER BY id; id---- 2 3A NULL bound makes one side unbounded — here, everything up to and including id = 2:
SELECT idFROM docs_idxWHERE id @@ ts_between(NULL, 2, true, true)ORDER BY id; id---- 1 2ts_lt(value)
Match values strictly less than value — the one-sided shorthand for ts_between(NULL, value, ?, false).
| Parameter | Type | Default | Meaning |
|---|---|---|---|
value | any (matches the column) | — | Exclusive upper bound. Must be non-NULL (use ts_between for an unbounded side). |
How it works. Matches every indexed value v with v < value. Inclusivity is fixed by the function name: ts_lt is strict (<).
| Query | Matches id | Why |
|---|---|---|
id @@ ts_lt(3) | 1, 2 | Values strictly below 3. |
id @@ ts_lt(1) | (none) | Nothing is below the minimum id. |
SELECT id FROM docs_idx WHERE id @@ ts_lt(3) ORDER BY id; id---- 1 2ts_le(value)
Match values less than or equal to value.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
value | any (matches the column) | — | Inclusive upper bound. Must be non-NULL. |
How it works. Matches every indexed value v with v <= value; value itself is included.
| Query | Matches id | Why |
|---|---|---|
id @@ ts_le(2) | 1, 2 | Values up to and including 2. |
id @@ ts_le(0) | (none) | Nothing is at or below 0. |
SELECT id FROM docs_idx WHERE id @@ ts_le(2) ORDER BY id; id---- 1 2ts_gt(value)
Match values strictly greater than value.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
value | any (matches the column) | — | Exclusive lower bound. Must be non-NULL. |
How it works. Matches every indexed value v with v > value.
| Query | Matches id | Why |
|---|---|---|
id @@ ts_gt(3) | 4 | Values strictly above 3. |
id @@ ts_gt(4) | (none) | Nothing is above the maximum id. |
SELECT id FROM docs_idx WHERE id @@ ts_gt(3) ORDER BY id; id---- 4ts_ge(value)
Match values greater than or equal to value.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
value | any (matches the column) | — | Inclusive lower bound. Must be non-NULL. |
How it works. Matches every indexed value v with v >= value; value itself is included.
| Query | Matches id | Why |
|---|---|---|
id @@ ts_ge(3) | 3, 4 | Values from 3 upward. |
id @@ ts_ge(5) | (none) | Nothing is at or above 5. |
SELECT id FROM docs_idx WHERE id @@ ts_ge(3) ORDER BY id; id---- 3 4ts_any(list[, min_match])
OR over a list of sub-queries, with an optional "match at least N" threshold.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
list | LIST(TSQUERY) (bare strings allowed) | — | The alternatives. Each element is a TSQUERY; a plain string is tokenized by the column dictionary. |
min_match | INTEGER | 1 | How many alternatives a row must satisfy. Must be between 1 and the list length. 1 is a plain OR; raising it demands more of the alternatives. |
How it works. ts_any is a disjunction with a tunable floor. At min_match = 1 it is a straight OR — match any alternative. Raising min_match turns it into an "N of M" query: with three alternatives and min_match = 2, a row must contain at least two of them. This is the equivalent of Elasticsearch's minimum_should_match (integer form) and the terms_set query. SereneDB takes an integer count only — it does not accept percentage or negative minimum_should_match formats, nor a per-document min-match field.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_any(['quick', 'grey']) | 1, 2, 3 | Any one of the terms is enough. |
body @@ ts_any(['quick', 'grey', 'red'], 2) | 2 | Only id 2 (quick + red) contains two of the three. |
body @@ ts_any(['unicorn', 'dragon']) | (none) | Neither term appears. |
SELECT id FROM docs_idx WHERE body @@ ts_any(['quick', 'grey'], 1) ORDER BY id; id---- 1 2 3Set min_match above 1 for an "N of M" query — here, rows containing at least two of the three terms:
SELECT idFROM docs_idxWHERE body @@ ts_any(['quick', 'grey', 'red'], 2)ORDER BY id; id---- 2ts_all(list)
AND over a list of sub-queries — every element must match.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
list | LIST(TSQUERY) (bare strings allowed) | — | The conjuncts. A row matches only when it satisfies all of them. |
How it works. ts_all is the conjunction (AND) of every element — equivalent to chaining the elements with &&, or to ts_any(list, len(list)). Use it to require that several tokens or sub-queries all appear in the same row.
| Query | Matches id | Why |
|---|---|---|
body @@ ts_all([ts_phrase('quick'), ts_phrase('brown')]) | 1 | Only id 1 has both quick and brown. |
body @@ ts_all([ts_phrase('quick'), ts_phrase('red')]) | 2 | Only id 2 has both quick and red. |
body @@ ts_all([ts_phrase('quick'), ts_phrase('grey')]) | (none) | No row has both quick and grey. |
SELECT id FROM docs_idx WHERE body @@ ts_all([ts_phrase('quick'), ts_phrase('brown')]) ORDER BY id; id---- 1ts_compound(must, must_not, should[, min_should_match])
Boolean query combining required, forbidden and optional clauses in one call — the SereneDB analog of the Elasticsearch bool query.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
must | TSQUERY, LIST(TSQUERY) or NULL | — | Clauses that must all match (AND). NULL is an empty bucket. |
must_not | TSQUERY, LIST(TSQUERY) or NULL | — | Clauses that must not match (exclusion). NULL is an empty bucket. |
should | TSQUERY, LIST(TSQUERY) or NULL | — | Optional clauses; at least min_should_match of them must match. NULL is an empty bucket. |
min_should_match | INTEGER | 1 | How many should clauses are required. Must be between 1 and the number of should clauses; supplying it with no should clauses is an error. |
How it works. ts_compound mirrors Elasticsearch's bool query: must ≈ AND, must_not ≈ negation, should ≈ a tunable OR. A row matches when every must clause matches, no must_not clause matches, and at least min_should_match of the should clauses match. Each bucket accepts a single TSQUERY, a list of them, or NULL for "no clauses here". This packs the work of &&, !! and ts_any into one expression.
| Query | Matches id | Why |
|---|---|---|
ts_compound(ts_phrase('quick'), ts_phrase('grey'), [ts_phrase('lazy')]) | 1, 2 | Must have quick, must not have grey, should have lazy. |
ts_compound(ts_phrase('fox'), NULL, [ts_phrase('brown'), ts_phrase('jumps'), ts_phrase('lazy')], 2) | 1, 2 | Both fox rows have at least two of brown/jumps/lazy. |
ts_compound(ts_phrase('fox'), NULL, [ts_phrase('brown'), ts_phrase('jumps'), ts_phrase('lazy')], 3) | 1 | Only id 1 has all three optional terms. |
SELECT id FROM docs_idx WHERE body @@ ts_compound(ts_phrase('quick'), ts_phrase('grey'), [ts_phrase('lazy')]) ORDER BY id; id---- 1 2Tighten the should bucket with min_should_match — require a fox row that also has all three optional terms brown, jumps and lazy:
SELECT idFROM docs_idxWHERE body @@ ts_compound( ts_phrase('fox'), NULL, [ts_phrase('brown'), ts_phrase('jumps'), ts_phrase('lazy')], 3)ORDER BY id; id---- 1ts_is_null()
Match rows whose indexed column value is absent (NULL) — the index-side counterpart of IS NULL.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| (none) | Takes no arguments; the column is supplied by the @@ left-hand side. |
How it works. ts_is_null emits a column-existence test (negated) over the inverted index, matching rows where the column has no stored value. The column needs columnstore storage and an analyzer — it must be indexed, or indexed and listed in INCLUDE. In the setup, dual is an INCLUDE column with one NULL row.
| Query | Matches id | Why |
|---|---|---|
dual @@ ts_is_null() | 2 | id 2 is the only row whose dual is NULL. |
SELECT id FROM docs_idx WHERE dual @@ ts_is_null() ORDER BY id; id---- 2ts_is_not_null()
Match rows whose indexed column value is present — the index-side counterpart of IS NOT NULL, and the complement of ts_is_null.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
| (none) | Takes no arguments; the column is supplied by the @@ left-hand side. |
How it works. Emits a column-existence test, matching rows where the column holds a value. Same column requirement as ts_is_null (indexed or INCLUDE).
| Query | Matches id | Why |
|---|---|---|
dual @@ ts_is_not_null() | 1, 3, 4 | Every row except id 2 has a non-NULL dual. |
SELECT id FROM docs_idx WHERE dual @@ ts_is_not_null() ORDER BY id; id---- 1 3 4TSQUERY Operators
| Operator | Description |
|---|---|
a || b | OR — match either sub-query. |
a && b | AND — match both sub-queries. |
!! a | NOT — exclude matches of a (within a conjunction). |
a ## b | Phrase adjacency, with optional gap or gap range. |
a ^ factor | Boost a's contribution to the relevance score. |
a || b
Disjunction (OR) of two sub-queries.
| Operand | Type | Meaning |
|---|---|---|
a, b | TSQUERY | Sub-queries. A row matches when either matches. |
How it works. || combines two TSQUERY values into one that matches if at least one side does. It is the pairwise form of ts_any; chain it (a || b || c) for more alternatives.
| Query | Matches id | Why |
|---|---|---|
body @@ (ts_phrase('brown') || ts_phrase('grey')) | 1, 3 | brown is in id 1, grey in id 3. |
body @@ (ts_phrase('unicorn') || ts_phrase('grey')) | 3 | Only the grey side matches. |
SELECT id FROM docs_idx WHERE body @@ (ts_phrase('brown') || ts_phrase('grey')) ORDER BY id; id---- 1 3a && b
Conjunction (AND) of two sub-queries.
| Operand | Type | Meaning |
|---|---|---|
a, b | TSQUERY | Sub-queries. A row matches only when both match. |
How it works. && matches only rows satisfying both operands — the pairwise form of ts_all. It pairs naturally with !! to express "has X but not Y".
| Query | Matches id | Why |
|---|---|---|
body @@ (ts_phrase('quick') && ts_phrase('brown')) | 1 | Only id 1 has both. |
body @@ (ts_phrase('quick') && ts_phrase('grey')) | (none) | No row has both. |
SELECT id FROM docs_idx WHERE body @@ (ts_phrase('quick') && ts_phrase('brown')) ORDER BY id; id---- 1!! a
Unary negation (NOT) — excludes the matches of a.
| Operand | Type | Meaning |
|---|---|---|
a | TSQUERY | The sub-query to negate. |
How it works. !!a is a hard exclusion (Elasticsearch must_not): it removes rows that match a. It is meaningful only inside a conjunction — something && !!a means "matches something but not a". A standalone negation has no positive clause to filter, so always combine it. SereneDB has no soft down-weighting query (Elasticsearch's boosting with a negative_boost); use ^ to raise a positive clause instead.
| Query | Matches id | Why |
|---|---|---|
body @@ (ts_phrase('quick') && !!ts_phrase('brown')) | 2 | Has quick, excludes the brown row. |
body @@ (ts_phrase('fox') && !!ts_phrase('red')) | 1 | Has fox, excludes the red row. |
SELECT id FROM docs_idx WHERE body @@ (ts_phrase('quick') && !!ts_phrase('brown')) ORDER BY id; id---- 2a ## b
Ordered proximity: require the sub-queries to appear close together, in order.
| Operand | Type | Meaning |
|---|---|---|
a, b | TSQUERY or VARCHAR | The two ends of the phrase. |
| gap (between) | INTEGER or INTEGER[] | Optional. a ## b is strict adjacency; a ## N ## b requires exactly N tokens between; a ## [min, max] ## b allows a gap range. |
How it works. ## is an ordered proximity operator: a must precede b. The integer counts the tokens between the two ends — a ## b (no integer) and a ## 0 ## b both mean immediate adjacency, a ## 2 ## b means exactly two intervening tokens. Order matters: 'quick' ## 'brown' matches quick brown but 'brown' ## 'quick' does not.
The integer in
##counts the tokens between the operands (0= adjacent). Thetsquery_phrasefunction and PostgreSQL's<->use the opposite convention, wheredistance = 1means adjacent. Seetsquery_phrase.
| Query | Matches id | Why |
|---|---|---|
body @@ ('quick' ## 0 ## 'brown') | 1 | quick brown are adjacent in order. |
body @@ ('brown' ## 0 ## 'quick') | (none) | Reversed order — ## is ordered. |
body @@ ('quick' ## 1 ## 'fox') | 1, 2 | Exactly one token (brown/red) between them. |
body @@ ('fox' ## 3 ## 'dog') | 1, 2 | jumps over lazy are the three tokens between. |
SELECT id FROM docs_idx WHERE body @@ ('quick' ## 1 ## 'fox') ORDER BY id; id---- 1 2a ^ factor
Boost: scale a sub-query's contribution to the relevance score.
| Operand | Type | Meaning |
|---|---|---|
a | TSQUERY | The sub-query to boost. |
factor | DOUBLE (≥ 0) | Multiplier applied to a's score contribution. > 1 raises it, < 1 lowers it, 0 zeroes it. |
How it works. ^ reweights a clause for relevance scoring without changing which rows match — the result set is identical, only the order differs once you ORDER BY a score. The factor multiplies that clause's contribution to the score: with ^ 3.0, a row that matches the boosted clause earns three times its normal BM25 weight from it (the worked example below verifies the boosted rows score exactly 3× the unboosted). factor > 1 promotes a clause, 0 < factor < 1 demotes it and 0 neutralizes its score contribution (the clause still matches). Factors compose multiplicatively when nested — (a ^ 2) ^ 3 weights a by 6. Typical use is favouring one alternative in a disjunction, e.g. title_match ^ 3 || body_match. The factor must be non-negative, and the cast form a::boost(factor) is equivalent.
| Query | Matches id | Effect |
|---|---|---|
body @@ (ts_phrase('fox') || ts_phrase('quick') ^ 2.0) | 1, 2 | Same rows as the un-boosted OR; quick matches simply score higher. |
SELECT id FROM docs_idx WHERE body @@ (ts_phrase('fox') || ts_phrase('quick') ^ 2.0) ORDER BY id; id---- 1 2To see the effect, boost one alternative and order by BM25:
SELECT id, BM25(scored_idx.tableoid) AS scoreFROM scored_idxWHERE body @@ (ts_phrase('fox') || ts_phrase('quick') ^ 5.0)ORDER BY score DESC, id; id | score----+------------ 1 | 5.6312833 4 | 5.013905 2 | 0.8407818 3 | 0.43974406PostgreSQL-Compatible Parsers
Each accepts a single string and returns a TSQUERY. These are SereneDB inverted-index queries, not PostgreSQL tsvector/tsquery.
| Function | Description |
|---|---|
to_tsquery(text) | Parse a Lucene-style query (AND/OR, +/-, *, ~, phrases, grouping, boost). |
plainto_tsquery(text) | Tokenize and AND the terms. |
phraseto_tsquery(text) | Treat text as a phrase. |
websearch_to_tsquery(text) | Web-search syntax: quoted substrings are phrases, OR separates alternatives. |
tsquery_phrase(a, b[, distance]) | Function form of the ## phrase operator. |
to_tsquery(text)
Parse a single Lucene-style query string into a TSQUERY.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR | — | A Lucene / Elasticsearch query_string-style expression (see the grammar below). Parsing is strict: a malformed expression raises an error. For lenient, user-facing input use websearch_to_tsquery. |
How it works. to_tsquery runs the full query parser — the richest of the parsers. Beyond the AND / OR / NOT boolean keywords it supports the common operators of the Lucene classic query-parser grammar:
| Syntax | Meaning |
|---|---|
a AND b · a OR b · a NOT b | Boolean conjunction / disjunction / exclusion. |
+a | a is required. |
-a | a is excluded. |
a* | Prefix (wildcard) match. |
a~N | Fuzzy match within edit distance N. |
"a b" | Phrase match. |
"a b"~N | Proximity phrase: a and b within N positions. |
(a b) | Grouping. |
a^N | Boost a's relevance contribution by factor N. |
These combine freely. Despite the PostgreSQL-compatible name, this builds a SereneDB inverted-index query, not a PostgreSQL tsquery; the queries operate on the single column on the left of @@ (there is no field:term scoping).
| Query | Matches id | Why |
|---|---|---|
body @@ to_tsquery('quick AND brown') | 1 | Both terms required. |
body @@ to_tsquery('+fox -red') | 1 | Has fox, excludes red. |
body @@ to_tsquery('qui*') | 1, 2 | Prefix qui matches quick. |
body @@ to_tsquery('quikc~2') | 1, 2 | Fuzzy match within 2 edits of quick. |
body @@ to_tsquery('"quick brown"~3') | 1 | quick and brown within three positions. |
body @@ to_tsquery('xyzzy') | (none) | Term not present. |
SELECT id FROM docs_idx WHERE body @@ to_tsquery('quick AND brown') ORDER BY id; id---- 1The + (required) and - (excluded) operators, grouping and the others combine freely — for example, match rows containing fox but not red:
SELECT id FROM docs_idx WHERE body @@ to_tsquery('+fox -red') ORDER BY id; id---- 1Wildcard, fuzzy and proximity operators compose too — a prefix on one term plus a fuzzy match on another:
SELECT idFROM docs_idxWHERE body @@ to_tsquery('qui* AND f0x~1')ORDER BY id; id---- 1 2plainto_tsquery(text)
Tokenize text and combine the terms with AND.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR | — | Free text. It is tokenized by the column dictionary and the resulting terms are joined with AND. Operators are not interpreted — +, -, quotes and * are treated as ordinary characters. |
How it works. plainto_tsquery is the "all words must appear" parser: it splits text into terms and requires every term, with no order constraint. It is the conjunctive counterpart to a bare string literal (which uses OR).
| Query | Matches id | Why |
|---|---|---|
body @@ plainto_tsquery('quick brown') | 1 | Both quick and brown required. |
body @@ plainto_tsquery('quick fox') | 1, 2 | Both rows have quick and fox. |
body @@ plainto_tsquery('quick grey') | (none) | No row has both. |
SELECT id FROM docs_idx WHERE body @@ plainto_tsquery('quick brown') ORDER BY id; id---- 1phraseto_tsquery(text)
Treat text as an exact ordered phrase.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR | — | Free text tokenized and matched as a contiguous, ordered phrase. |
How it works. phraseto_tsquery tokenizes text and requires the tokens to appear adjacent and in order — equivalent to ts_phrase for a single string. Use it when word order matters.
| Query | Matches id | Why |
|---|---|---|
body @@ phraseto_tsquery('over lazy dog') | 1, 2 | The three words are adjacent and in order in both. |
body @@ phraseto_tsquery('dog lazy over') | (none) | Same words, wrong order. |
SELECT id FROM docs_idx WHERE body @@ phraseto_tsquery('over lazy dog') ORDER BY id; id---- 1 2websearch_to_tsquery(text)
Parse forgiving, web-search-bar syntax into a TSQUERY.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
text | VARCHAR | — | A search-engine-style string. Quoted substrings become phrases, the OR keyword separates alternatives, a leading - excludes a term and unquoted words are otherwise combined with AND. |
How it works. This is the parser for untrusted, user-facing input: unlike to_tsquery it never raises on malformed syntax — stray operators are simply treated as text. It recognizes "quoted phrases", the literal OR keyword, and a leading - for exclusion; everything else is AND-ed.
| Query | Matches id | Why |
|---|---|---|
body @@ websearch_to_tsquery('"quick brown" OR "grey turtle"') | 1, 3 | Either phrase. |
body @@ websearch_to_tsquery('fox -red') | 1 | Has fox, excludes red. |
body @@ websearch_to_tsquery('quick grey') | (none) | Unquoted words are AND-ed; no row has both. |
SELECT id FROM docs_idx WHERE body @@ websearch_to_tsquery('"quick brown" OR "grey turtle"') ORDER BY id; id---- 1 3tsquery_phrase(a, b[, distance])
Function form of a two-term proximity phrase, using PostgreSQL's distance convention.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
a | TSQUERY | — | First (left) sub-query. |
b | TSQUERY | — | Second (right) sub-query, required to follow a. |
distance | INTEGER | 1 (adjacent) | Number of lexemes between the start of a and the start of b, PostgreSQL-style: 1 means adjacent, 2 means one token in between. Must be ≥ 1. |
How it works. tsquery_phrase is the function spelling of an ordered proximity phrase, matching PostgreSQL's <N> / tsquery_phrase semantics where distance = 1 is adjacency. This is the opposite counting convention from the ## operator, where the integer counts the tokens between the operands (0 = adjacent). Pick whichever reads more clearly — they target the same positions.
| Query | Matches id | Why |
|---|---|---|
body @@ tsquery_phrase('quick'::TSQUERY, 'brown'::TSQUERY) | 1 | Default distance = 1: quick brown adjacent. |
body @@ tsquery_phrase('quick'::TSQUERY, 'fox'::TSQUERY, 2) | 1, 2 | distance = 2: one token (brown/red) between. |
body @@ tsquery_phrase('quick'::TSQUERY, 'fox'::TSQUERY, 1) | (none) | distance = 1 demands adjacency, but a token sits between them. |
SELECT id FROM docs_idx WHERE body @@ tsquery_phrase('quick'::TSQUERY, 'fox'::TSQUERY, 2) ORDER BY id; id---- 1 2Convenience Predicates
Sugar that rewrites to @@ at bind time. Each takes the indexed column as its first argument and returns BOOLEAN.
| Function | Description |
|---|---|
phrase_matches(column, text[, text, ...]) | Equivalent to column @@ ts_phrase(...). |
ngram_matches(column, text[, threshold]) | Equivalent to column @@ ts_ngram(...). |
levenshtein_matches(column, text, distance[, transpositions[, prefix]]) | Equivalent to column @@ ts_levenshtein(...). |
has_all_tokens(column, list) | Equivalent to column @@ ts_all(list). |
has_any_tokens(column, list[, min_match]) | Equivalent to column @@ ts_any(list[, min_match]). |
phrase_matches(column, text[, text, ...])
Phrase-match sugar for column @@ ts_phrase(text, ...).
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | any indexed column | — | The inverted-indexed column to match. |
text | VARCHAR/BLOB | — | A phrase segment, as in ts_phrase. |
text, ... | INTEGER/INTEGER[] + VARCHAR | — | Optional gap, text pairs, identical to ts_phrase's gap grammar. |
How it works. phrase_matches(column, ...) rewrites at bind time to column @@ ts_phrase(...), inheriting the full gap grammar. It reads as a predicate, which is convenient when the @@ form feels verbose.
| Query | Matches id | Why |
|---|---|---|
phrase_matches(body, 'quick brown') | 1 | Adjacent phrase, only in id 1. |
phrase_matches(body, 'quick', 3, 'over') | 1, 2 | Gap form: three tokens between quick and over. |
phrase_matches(body, 'brown grey') | (none) | Those words never appear adjacent. |
SELECT id FROM docs_idx WHERE phrase_matches(body, 'quick brown') ORDER BY id; id---- 1ngram_matches(column, text[, threshold])
N-gram-match sugar for column @@ ts_ngram(text[, threshold]).
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | n-gram-indexed column | — | The column to match. |
text | VARCHAR/BLOB | — | Term to match by similarity. |
threshold | DOUBLE | 0.7 | Minimum n-gram similarity, as in ts_ngram. |
How it works. Rewrites to column @@ ts_ngram(text[, threshold]); see ts_ngram for the similarity model.
| Query | Matches id, title | Why |
|---|---|---|
ngram_matches(title, 'hello', 0.3) | 1 hello, 2 help, 4 held | All share enough bigrams at 0.3. |
ngram_matches(title, 'hello') | 1 hello | At the default 0.7, only hello qualifies. |
SELECT id, title FROM titles_idx WHERE ngram_matches(title, 'hello', 0.3) ORDER BY id; id | title----+------- 1 | hello 2 | help 4 | heldlevenshtein_matches(column, text, distance[, transpositions[, prefix]])
Fuzzy-match sugar for column @@ ts_levenshtein(...).
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | any indexed column | — | The column to match. |
text | VARCHAR/BLOB | — | Term to match approximately. |
distance | INTEGER | — (required here) | Maximum edit distance. Unlike ts_levenshtein, the predicate has no auto-distance form, so distance is required. |
transpositions | BOOLEAN | true | Whether an adjacent-character swap counts as one edit. |
prefix | VARCHAR | '' | Exact leading substring; only the suffix spends the edit budget. |
How it works. Rewrites to column @@ ts_levenshtein(text, distance[, transpositions[, prefix]]). See ts_levenshtein for the edit-distance and prefix semantics.
| Query | Matches id | Why |
|---|---|---|
levenshtein_matches(body, 'quikc', 2) | 1, 2 | Within 2 edits of quick. |
levenshtein_matches(body, 'quikc', 1, false) | (none) | Without transpositions, quikc → quick is 2 edits. |
SELECT id FROM docs_idx WHERE levenshtein_matches(body, 'quikc', 2) ORDER BY id; id---- 1 2has_all_tokens(column, list)
True when every token in list is present.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | any indexed column | — | The column to match. |
list | LIST(VARCHAR) | — | Tokens that must all appear. Each element is tokenized by the column dictionary. |
How it works. Tokenizes each element and requires all of them — sugar for column @@ ts_all(ts_tokenize(list)). Use it for "contains every one of these words" filters.
| Query | Matches id | Why |
|---|---|---|
has_all_tokens(body, ['quick', 'brown']) | 1 | Only id 1 has both. |
has_all_tokens(body, ['quick', 'fox']) | 1, 2 | Both have quick and fox. |
has_all_tokens(body, ['quick', 'grey']) | (none) | No row has both. |
SELECT id FROM docs_idx WHERE has_all_tokens(body, ['quick', 'brown']) ORDER BY id; id---- 1has_any_tokens(column, list[, min_match])
True when at least min_match tokens in list are present.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
column | any indexed column | — | The column to match. |
list | LIST(VARCHAR) (or a single VARCHAR) | — | Candidate tokens. |
min_match | INTEGER | 1 | Minimum number of tokens that must appear. 1 is a plain OR. |
How it works. Tokenizes the candidates and requires at least min_match of them — sugar for column @@ ts_any(ts_tokenize(list)[, min_match]). Raise min_match for an "N of M" filter, exactly as in ts_any.
| Query | Matches id | Why |
|---|---|---|
has_any_tokens(body, ['grey', 'red']) | 2, 3 | red in id 2, grey in id 3. |
has_any_tokens(body, ['quick', 'grey', 'red'], 2) | 2 | Only id 2 has two of the three. |
has_any_tokens(body, ['unicorn', 'dragon']) | (none) | Neither appears. |
SELECT id FROM docs_idx WHERE has_any_tokens(body, ['grey', 'red'], 1) ORDER BY id; id---- 2 3Utility Functions
| Function | Description |
|---|---|
ts_lexize(dictionary, text) | Return the tokens a dictionary produces for text. |
ts_lexize(dictionary, text)
Return the tokens a dictionary produces for text — the tool for inspecting analysis.
| Parameter | Type | Default | Meaning |
|---|---|---|---|
dictionary | VARCHAR | — | Name of an existing text-search dictionary. It must exist in the catalog ('keyword' is not a real dictionary here). |
text | VARCHAR or LIST(VARCHAR) | — | The text to analyze. A list analyzes each element and concatenates the results. |
How it works. ts_lexize is the only function on this page that runs on its own (not inside @@): it applies a named dictionary's analysis pipeline — lower-casing, stemming, stop-word removal, n-gram splitting — and returns the resulting lexemes as a LIST(VARCHAR). Use it to see exactly how a query string or a document will be tokenized when tuning an index: if your search misses, lexize both the query and the source text and compare.
| Input | Tokens | Why |
|---|---|---|
ts_lexize('en', 'Quick BROWN') | {quick, brown} | The en text dictionary lower-cases and splits on whitespace. |
ts_lexize('bigram', 'help') | {he, el, lp} | The bigram n-gram dictionary emits overlapping 2-grams. |
SELECT ts_lexize('en', 'Quick BROWN') AS ts_lexize; ts_lexize--------------- {quick,brown}Lexize against an n-gram dictionary to see how a term is split for ts_ngram matching:
SELECT ts_lexize('bigram', 'help') AS ts_lexize; ts_lexize------------ {he,el,lp}Coming from Elasticsearch
The functions on this page cover most of the Elasticsearch / OpenSearch query DSL. This table maps each DSL query to its SereneDB equivalent (the left column links to the Elasticsearch reference).
| Elasticsearch / OpenSearch query | SereneDB |
|---|---|
match (analyzed terms, OR) | bare string, or ts_tokenize |
match with operator: and | plainto_tsquery |
match_phrase | ts_phrase, phraseto_tsquery |
match_phrase with slop | ts_phrase gaps, ##, tsquery_phrase |
term / terms | token literal, has_any_tokens |
terms_set (match N of M) | ts_any with min_match |
prefix | ts_starts_with |
wildcard | ts_like or ts_regexp |
regexp | ts_regexp |
fuzzy | ts_levenshtein |
range | ts_between, ts_lt/ts_le/ts_gt/ts_ge |
exists | ts_is_not_null / ts_is_null |
bool (must/must_not/should) | ts_compound, or &&/!!/ts_any |
query_string / simple_query_string | to_tsquery (strict), websearch_to_tsquery (lenient) |
Notable differences
Elasticsearch features without a direct SereneDB equivalent, and what to use instead:
| Elasticsearch / OpenSearch | SereneDB |
|---|---|
minimum_should_match percentage / negative / combination forms | integer count only (ts_any, ts_compound) |
fuzziness: AUTO | one-argument ts_levenshtein auto-picks a distance by term length |
max_expansions (fuzzy / prefix expansion cap) | no expansion cap |
multi_match / combined_fields / field:term scoping | single-column @@; compose multiple predicates with OR |
constant_score | none; ORDER BY a literal, or raw_boost (see Ranking) |
boosting (negative_boost) | none; raise a clause with ^ or exclude with !! |
match_phrase_prefix / match_bool_prefix | combine ts_phrase with ts_starts_with |
more_like_this | none; use vector similarity (Vector Search) |