Vector Search

The inverted index also indexes vector embeddings for approximate nearest neighbor (ANN) search, using an HNSW graph. This powers semantic search, recommendations and other similarity workloads over FLOAT vectors.

HNSW (Hierarchical Navigable Small World) builds a layered proximity graph over the vectors and navigates it toward the query, so a search touches only a small fraction of the vectors instead of scanning them all. That is what makes it approximate: it trades a little recall for a large speed-up, and the m and ef_construction parameters tune that trade-off.
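To make the navigation idea concrete, here is a minimal single-layer sketch in Python. It is not the index's actual implementation: real HNSW stacks several layers and keeps a candidate list during the walk. The helper names (build_knn_graph, greedy_search) and the random data are assumptions made for illustration only.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_knn_graph(points, m):
    """Link every point to its m nearest neighbours (brute force, illustration only)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: l2(p, points[j]))
        graph[i] = others[:m]
    return graph


def greedy_search(points, graph, query, entry=0):
    """Walk from the entry node toward the query until no neighbour is closer.

    Returns (node index, its distance to the query, number of nodes touched).
    """
    current = entry
    current_dist = l2(points[current], query)
    visited = {current}
    while True:
        best, best_dist = current, current_dist
        for nb in graph[current]:
            visited.add(nb)
            d = l2(points[nb], query)
            if d < best_dist:
                best, best_dist = nb, d
        if best == current:
            # Local minimum: may or may not be the true nearest neighbour.
            return current, current_dist, len(visited)
        current, current_dist = best, best_dist


random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
graph = build_knn_graph(points, m=8)
node, dist, touched = greedy_search(points, graph, query=(0.5, 0.5))
print(f"reached node {node} after touching {touched} of {len(points)} points")
```

The walk can stop at a local minimum rather than the true nearest neighbour, which is the source of the recall loss. In this sketch a larger m gives each node more ways out of such a minimum, at the cost of storing more links per node.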

Creating a vector index

A vector column uses the hnsw (...) operator class. The column must be a fixed-size FLOAT array (FLOAT[N]) — every row shares the same dimension N:

Query
CREATE TABLE vectors (id INTEGER, emb FLOAT[3]);
CREATE INDEX l2_index ON vectors USING inverted (id, emb hnsw (metric = 'l2'));

The metric is required. The graph-tuning parameters are optional:

 Parameter       | Description
-----------------+----------------------------------------------------------------
 metric          | Distance metric: l2 (Euclidean), cosine, ip (inner product) or l1 (Manhattan)
 m               | Graph connectivity (recommended 2–128). Higher values improve recall at the cost of index size
 ef_construction | Candidate-list size during build (recommended 10–500). Higher values improve recall at the cost of build time

There are two ways to query a vector index: k-nearest-neighbor search (the closest k vectors) and range search (every vector within a distance threshold).

k-nearest-neighbor (kNN)

Order by the distance to a query vector and use LIMIT to set the number of neighbors you want. Each distance operator computes a fixed metric: <-> is L2, <=> is cosine, <+> is L1 and <#> is inner product. Use the operator that matches the metric your index was built with; the optimizer then routes the query through the HNSW index:

SELECT id FROM index_name ORDER BY emb <-> $query_vector LIMIT k;
Query
SELECT id FROM l2_index ORDER BY emb <-> [0, 0, 0]::FLOAT[3] LIMIT 2;
Result
 id
----
  3
  2
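For reference, the ORDER BY ... LIMIT k form asks for the same thing as the exact brute-force search sketched below in Python; the HNSW index approximates this answer without scanning every row. The rows are made up for illustration and are not the contents of the vectors table, and knn is a hypothetical helper name.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(rows, query, k):
    """Exact kNN: ids of the k rows closest to query, nearest first.

    rows is a list of (id, vector) pairs.
    """
    ranked = sorted(rows, key=lambda row: l2_distance(row[1], query))
    return [row_id for row_id, _ in ranked[:k]]


# Made-up rows: distances to the origin are 5, 1 and 2 respectively.
rows = [
    (10, [0.0, 3.0, 4.0]),
    (20, [1.0, 0.0, 0.0]),
    (30, [0.0, 2.0, 0.0]),
]
print(knn(rows, [0.0, 0.0, 0.0], k=2))  # [20, 30]
```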

The named distance functions l2_distance, cosine_distance, l1_distance and negative_inner_product are each equivalent to the matching operator and can be used explicitly. The example below assumes an index named cos_index built with metric = 'cosine':

Query
SELECT id FROM cos_index ORDER BY cosine_distance(emb, [1, 0, 0]::FLOAT[3]) LIMIT 2;
Result
 id
----
  1
  3
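As a reading aid, the four metrics can be written out in a few lines of Python. The formulas are the conventional definitions and are an assumption here, since this page names the metrics but does not spell them out. In particular, cosine distance is taken as 1 minus cosine similarity.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def l1_distance(a, b):
    """Manhattan distance: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity (assumed definition); undefined for zero vectors."""
    norm_a = math.sqrt(_dot(a, a))
    norm_b = math.sqrt(_dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - _dot(a, b) / (norm_a * norm_b)


def negative_inner_product(a, b):
    """Negated dot product, so that a smaller value means a closer match."""
    return -_dot(a, b)


a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
print(l1_distance(a, b))             # 9.0
print(negative_inner_product(a, b))  # -32.0
```

Negating the inner product keeps the ordering consistent with the other metrics: an ascending ORDER BY puts the best match first in every case.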

Range search

Instead of a fixed number of neighbors, return every vector within a distance threshold (a radius) by comparing the distance in a WHERE clause:

SELECT id FROM index_name WHERE emb <-> $query_vector < radius;
Query
SELECT id FROM l2_index WHERE emb <-> [0, 0, 0]::FLOAT[3] < 100 ORDER BY id;
Result
 id
----
  2
  3

The two forms combine: add ORDER BY emb <-> $query_vector LIMIT k to a range query to take the closest k within the radius.
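The semantics of range search and of the combined form can be sketched as exact Python equivalents. This is only a model of what the queries return, not of how the index executes them. The rows and the helper names (range_search, closest_k_within) are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_search(rows, query, radius):
    """Ids of every row strictly closer than radius, in id order."""
    return sorted(row_id for row_id, vec in rows
                  if l2_distance(vec, query) < radius)


def closest_k_within(rows, query, radius, k):
    """Ids of the k closest rows among those inside the radius, nearest first."""
    hits = []
    for row_id, vec in rows:
        d = l2_distance(vec, query)
        if d < radius:
            hits.append((d, row_id))
    hits.sort()
    return [row_id for _, row_id in hits[:k]]


# Made-up rows: distances to the origin are 5, 1, 2 and about 15.6.
rows = [
    (10, [0.0, 3.0, 4.0]),
    (20, [1.0, 0.0, 0.0]),
    (30, [0.0, 2.0, 0.0]),
    (40, [9.0, 9.0, 9.0]),
]
origin = [0.0, 0.0, 0.0]
print(range_search(rows, origin, radius=6))           # [10, 20, 30]
print(closest_k_within(rows, origin, radius=6, k=2))  # [20, 30]
```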

Column types and compression

A vector column must be a fixed-size FLOAT[N] array — all rows share dimension N (an unsized FLOAT[] is rejected). As with text columns, a vector column can set a storage compression codec, e.g. emb hnsw (metric = 'l2', compression = 'uncompressed').

See also