Skip to main content

CREATE INDEX … USING inverted

Adding USING inverted to CREATE INDEX builds an inverted index, which powers full-text, vector and geospatial search. This page is the syntax reference; for a conceptual guide with query examples, see Inverted Index.

Column specification

Each entry in the USING inverted (...) list is an indexed column.

  • Text and scalar columns may name a text search dictionary (the analyzer). With no dictionary the column is indexed verbatim — one token per value — giving exact, case-sensitive matching suitable for identifiers, tags and categories. Numeric and temporal columns are indexed for exact and range queries.
  • Vector columns use hnsw (...) to build an HNSW index. metric is required; m and ef_construction tune the graph.
  • Expressions — an entry may be a parenthesized expression over one or more columns, e.g. (lower(name)) my_dict or (price * 110 / 100). The query must use the identical expression. Aggregates, subqueries and volatile functions (e.g. random()) are rejected.
  • JSON — index a JSON extraction expression such as (doc ->> 'host') my_dict, optionally with a dictionary; this indexes exactly the sub-field you search. (See also the VARIANT type for storing typed payloads.)
  • Generated columns — index a generated column the same as any other column (index the column itself, not its defining expression).
  • Array columns — a TEXT[] / VARCHAR[] column is indexed element-by-element; a row matches if any element matches.

HUGEINT, DECIMAL, UUID and INTERVAL columns cannot be indexed. Partial indexes (CREATE INDEX … WHERE …) and composite ROW(...) columns are not supported. For a guided treatment of these choices, see What to Index.

Operator-class options (feature flags)

A column's trailing WITH (...) sets per-column feature flags controlling what the index records. They are off by default; enable only what your queries need.

FlagDefaultEnables
frequencyfalseTerm frequency — required for relevance scoring
positionfalseTerm positions — required for phrase / proximity queries
offsetfalseCharacter offsets — required for highlighting
normfalseThe length-normalization factor used by some scorers

The same flags can be set on the dictionary itself, in which case every column using it inherits them. See the feature flags reference.

INCLUDE columns

Columns in INCLUDE (...) are stored but not indexed: they cannot be searched, but a query that selects from the index can return them without a separate base-table lookup. Each may set a storage compression codec — one of uncompressed, bitpacking, alp, rle or fsst — for example INCLUDE (payload included (compression = 'alp')).

Index options

The trailing WITH (...) clause sets index-level options.

OptionDefaultDescription
refresh_interval1000Background refresh interval in milliseconds; 0 disables it
compaction_interval1000Background compaction interval in milliseconds; 0 disables it
cleanup_interval_step1Commit ticks between cleanup passes; 0 disables it
row_group_size122880Row-group size for stored (INCLUDEd) columns
norm_row_group_size122880Row-group size for norm columns when norm is enabled
optimize_top_kScorer expression enabling top-K (WAND) pruning, e.g. 'bm25(1.2, 0.75)'
pkautoPrimary-key column to use as row identity when indexing a view

Indexing tables and views

An inverted index can be built over a base table or a view (including a view over read_parquet/read_csv on local disk or S3). A view has no primary key, so the index materializes its columns at build time and resolves a row identity — automatically for base-table and fast-path-reader views, or via WITH (pk = '...') for generic views. See Indexing a table vs. a view.

Examples

Create a dictionary, index two text columns with it, then query the index:

Query
CREATE TEXT SEARCH DICTIONARY english_dict (    template = 'text',    locale = 'en_US.UTF-8',    case = 'lower',    stemming = false,    accent = false,    frequency = true,    position = true);
CREATE INDEX articles_idx ON articles    USING inverted (id, title english_dict, body english_dict);
VACUUM (REFRESH_TABLE) articles;
SELECT id, title FROM articles_idx WHERE body @@ 'search' ORDER BY id;
Result
 id | title----+------------------  1 | Inverted indexes  2 | Vector search

Store an extra column alongside the index with INCLUDE so it can be returned without touching the base table:

Query
CREATE INDEX pages_idx ON pages    USING inverted (id, body english_dict)    INCLUDE (url);
VACUUM (REFRESH_TABLE) pages;
SELECT id, url FROM pages_idx WHERE body @@ 'tuning' ORDER BY id;
Result
 id | url----+----------------------------  2 | https://example.com/tuning

See also

Syntax

A column_spec is an indexed column with an optional analyzer (dictionary) or, for vectors, an hnsw configuration:

An include_spec is a stored-only column with an optional compression codec: