What to Index
Deciding what goes into an inverted index — which columns, whether to index a raw column or an expression over it, what to merely store for retrieval, and how big the result will be — is the main modeling decision. This page covers those choices. For the exact CREATE INDEX grammar, see the statement reference.
Indexed vs. INCLUDEd columns
A column can play one of two roles:
- Indexed (
USING inverted (col …)) — searchable with@@. The column's tokens go into the inverted structure. INCLUDEd (INCLUDE (col)) — stored but not searchable. The raw value is kept in the index's columnstore so a query against the index can return it without a second lookup against the base table.
Use INCLUDE for columns you frequently return or filter on but never full-text search — ids, URLs, prices, timestamps, JSON payloads. Numeric and temporal indexed columns are stored in a columnar form too, so they support fast range filtering and sorting alongside search.
Indexing expressions
You can index an expression over one or more columns, not just a bare column. The query must use the same expression. For example, index lower(s) to get case-insensitive exact matching on a verbatim column:
CREATE INDEX notes_idx ON notes USING inverted (id, (lower(s)) kw);
SELECT id, s FROM notes_idx WHERE lower(s) @@ 'hello' ORDER BY id; id | s----+------- 1 | HELLOIndexed expressions must be deterministic and reference at least one column. Aggregates, subqueries and volatile functions are rejected at CREATE INDEX:
CREATE INDEX metrics_idx ON metrics USING inverted (id, (sum(n)));db error: ERROR: aggregate functions are not allowed in index expressionsCREATE INDEX metrics_idx2 ON metrics USING inverted (id, (n + random()));db error: ERROR: Index keys cannot contain expressions with side effects.Generated columns
A generated column is indexed like any other column — index the column itself rather than repeating its expression. A STORED generated column works directly:
CREATE INDEX line_items_idx ON line_items USING inverted (id, price_with_tax);
SELECT id, price_with_tax FROM line_items_idxWHERE price_with_tax @@ ts_le(150)ORDER BY id; id | price_with_tax----+---------------- 1 | 110Indexing JSON
Semi-structured JSON is indexed by its sub-fields. Prefer the shredded VARIANT type — it is the fastest way to store, search and return JSON — and index typed sub-field extractions such as (doc['title']::VARCHAR).
The dedicated Indexing JSON page covers the full treatment:
- converting JSON to
VARIANTand how each sub-field type is indexed, - nested fields and array fields,
- assigning a tokenizer per sub-field, and
- indexing a
JSONcolumn asVARIANTwithout changing the column type.
Indexing arrays
A TEXT[] / VARCHAR[] column is indexed element-by-element: a row matches if any element matches. Tokenization and the operator class apply to each element just as for a scalar text column:
CREATE INDEX articles_tags_idx ON articles_tags USING inverted (id, tags kw);
SELECT id, tags FROM articles_tags_idx WHERE tags @@ 'blue' ORDER BY id; id | tags----+------------ 1 | {red,blue}Sizing and cost
The index only stores what you ask it to — keep it lean:
- Feature flags (
frequency,position,offset,norm) each enlarge the index. Enable only what your queries need — see Text Analysis. INCLUDEcodecs: eachINCLUDEd column can set acompressioncodec (uncompressed,bitpacking,alp,rle,fsst).- Vector columns: HNSW
mandef_constructiontrade recall against index size and build time — see Vector Search. - Prefer
INCLUDEover post-hoc materialization for columns you routinely return.
Limitations
- Partial indexes (
CREATE INDEX … WHERE …) are not supported. - Composite
ROW(...)/STRUCTcolumns cannot be indexed; list the columns individually instead. - The same indexed expression cannot be listed twice with different dictionaries.
HUGEINT,DECIMAL,UUIDandINTERVALcolumns cannot be indexed.
See also
- Inverted Index · Text Analysis
CREATE INDEX … USING inverted— full syntaxVARIANTtype · JSON overview- Indexing Views — indexing expressions over external/view data