Vector Functions

Vector-Distance Functions

These functions measure how close two equal-length FLOAT vectors are — the basis of vector search, where a query embedding is compared against the embeddings stored in a column. They come in two flavours that compute the same metrics but serve different roles.

Named functions (l2_distance, cosine_distance, …) are plain scalars: pass any two vectors and get the distance back. They work in any expression, with or without an index, and are the right tool for ad-hoc scoring, re-ranking a candidate set or comparing two specific rows.
Operators (<->, <+>, <=>, <#>) are the indexed form. When written as ORDER BY embedding <-> :query LIMIT k, the planner can route the query through the column's HNSW index and return the approximate k nearest neighbours — far faster than scoring every row. Each operator computes one fixed metric and only accelerates a query when that metric matches the index's configured metric (l2, l1, cosine or ip).

Lower distance means more similar. cosine_similarity and inner_product are the exceptions — there, higher means more similar. negative_inner_product flips the sign so that lower is again more similar, which is what the ip index metric needs.

Fixed-size FLOAT[N] is required

Every vector argument must be a fixed-size float array — FLOAT[3], FLOAT[768], etc. — and both operands of a distance must share the same dimension N. An unsized FLOAT[] is rejected, both as a function argument and as an indexed column. Write a literal as [1, 0, 0]::FLOAT[3].

Function / operator	Metric	Direction	Description
`l2_distance(a, b)`	L2	lower = closer	Euclidean (straight-line) distance.
`l2_sqr_distance(a, b)`	L2²	lower = closer	Squared Euclidean distance (skips the square root).
`l1_distance(a, b)`	L1	lower = closer	Manhattan distance.
`cosine_distance(a, b)`	cosine	lower = closer	`1 − cosine_similarity`.
`cosine_similarity(a, b)`	cosine	higher = closer	Cosine of the angle between the vectors.
`inner_product(a, b)`	dot	higher = closer	Dot product.
`negative_inner_product(a, b)`	ip	lower = closer	`−inner_product`; backs the `ip` index metric.
`a <-> b`	L2	lower = closer	Indexed distance, equivalent to `l2_distance`.
`a <+> b`	L1	lower = closer	Indexed distance, equivalent to `l1_distance`.
`a <=> b`	cosine	lower = closer	Indexed distance, equivalent to `cosine_distance`.
`a <#> b`	ip	lower = closer	Indexed distance, equivalent to `negative_inner_product`.
`l2_norm(a)` · `l1_norm(a)`	—	—	Vector magnitude (L2 / L1).
`l2_normalize(a)` · `l1_normalize(a)`	—	—	Scale to a unit vector (L2 / L1).

Choosing a metric

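The metrics answer different questions about two vectors x and y of length n. The list has five entries because squared Euclidean is a cheaper variant of L2 rather than a separate index metric: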

Euclidean / L2 — sqrt(Σ (xᵢ − yᵢ)²). Straight-line distance through space. It accounts for both direction and magnitude, so it is the natural default when the absolute size of the components is meaningful (e.g. raw feature vectors, coordinates).
Squared Euclidean / L2² — Σ (xᵢ − yᵢ)². The same ordering as L2 with the square root dropped, so it is cheaper whenever you only rank or threshold and never need the true distance value.
Manhattan / L1 — Σ |xᵢ − yᵢ|. Sums the per-axis differences instead of combining them with Pythagoras. Less sensitive to a single large-deviation dimension than L2, and a common choice for sparse or high-dimensional data.
Cosine — 1 − (x · y) / (‖x‖ · ‖y‖). Compares only the direction of the vectors and ignores their length. This is the usual choice for text and embedding models, where two documents about the same topic point the same way regardless of length. A zero vector has no direction, so cosine of a zero vector is undefined.
Inner product (dot) — Σ xᵢ yᵢ. Rewards vectors that are both aligned and large. For vectors that are already L2-normalized to unit length, inner product equals cosine similarity, so many embedding pipelines normalize once and then use the cheaper dot product. Because higher means more similar, the index uses negative inner product (negative_inner_product) so that, like every other metric, a smaller value sorts first.
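The formulas above can be checked outside the database with a few lines of Python. This is a minimal reference sketch, not SereneDB's implementation: the function names mirror the SQL ones for readability, and raising ValueError for mismatched lengths or a zero vector is an assumption of the sketch (the text only says cosine is undefined there).

```python
import math

def _check(x, y):
    # Mirrors the documented rule that both operands share the same dimension N.
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

def l2_sqr_distance(x, y):
    _check(x, y)
    return sum((a - b) ** 2 for a, b in zip(x, y))

def l2_distance(x, y):
    return math.sqrt(l2_sqr_distance(x, y))

def l1_distance(x, y):
    _check(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))

def inner_product(x, y):
    _check(x, y)
    return sum(a * b for a, b in zip(x, y))

def negative_inner_product(x, y):
    return -inner_product(x, y)

def cosine_distance(x, y):
    norm_x = math.sqrt(inner_product(x, x))
    norm_y = math.sqrt(inner_product(y, y))
    if norm_x == 0 or norm_y == 0:
        # Sketch-only choice: a zero vector has no direction.
        raise ValueError("cosine is undefined for a zero vector")
    return 1 - inner_product(x, y) / (norm_x * norm_y)

print(l2_distance([3, 4, 0], [0, 0, 0]))                    # 5.0
print(l2_sqr_distance([3, 4, 0], [0, 0, 0]))                # 25
print(l1_distance([3, 4, 0], [0, 0, 0]))                    # 7
print(inner_product([1, 2, 3], [4, 5, 6]))                  # 32
print(round(cosine_distance([1, 0, 0], [1, 1, 0]), 6))      # 0.292893
```

The inputs are the same literals used in the SQL examples on this page, so the printed values can be compared with the documented results (the database reports single-precision values such as 0.29289323).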

Operators, metrics and the HNSW index

Each operator is bound to exactly one metric and is independent of how a column happens to be indexed — a <-> b always computes L2 even if a comes from a cosine-indexed column. To get acceleration, the operator's metric must match the index's metric:

Operator	Metric	Equivalent function	Matching index `metric`
`a <-> b`	L2	`l2_distance`	`l2`
`a <+> b`	L1	`l1_distance`	`l1`
`a <=> b`	cosine	`cosine_distance`	`cosine`
`a <#> b`	ip	`negative_inner_product`	`ip`

A query of the shape

SELECT id FROM index_name ORDER BY emb <-> $query LIMIT k;

lets the planner walk the HNSW graph toward $query and return the approximate k nearest neighbours, touching only a small fraction of the rows. Use the operator whose metric matches the index you built; mixing them (for example <=> against an l2 index) still returns correct distances but cannot use the graph, so it falls back to scanning every row. For tiny tables the planner may scan anyway because a sequential pass is cheaper than graph traversal — the speed-up matters at scale.
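For intuition about the fallback path, "scanning every row" amounts to an exact brute-force search: score each vector, then keep the k smallest distances. The Python sketch below illustrates that idea only; the rows are hypothetical and the code says nothing about how SereneDB's planner or HNSW traversal is actually implemented.

```python
import heapq
import math

def l2_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def exact_knn(rows, query, k, distance=l2_distance):
    """Score every (id, vector) row and return the k nearest as (distance, id)."""
    scored = ((distance(vec, query), row_id) for row_id, vec in rows)
    # Tuples compare by distance first, then id, like ORDER BY distance, id.
    return heapq.nsmallest(k, scored)

# Hypothetical table contents, for illustration only.
rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [3.0, 4.0, 0.0]),
    (3, [0.0, 1.0, 0.0]),
]
print(exact_knn(rows, [0.0, 0.0, 0.0], k=2))  # [(1.0, 1), (1.0, 3)]
```

The cost of this scan grows linearly with the number of rows, which is why a matching index matters at scale and matters little for tiny tables.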

Distance functions

`l2_distance(a, b)`

Euclidean (L2) distance between two equal-length FLOAT vectors — the straight-line distance in space.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`)

Query

SELECT l2_distance([3, 4, 0]::FLOAT[3], [0, 0, 0]::FLOAT[3]) AS distance;

Result

 distance
----------
        5

`l2_sqr_distance(a, b)`

The squared L2 distance. It preserves the same ordering as l2_distance while skipping the square root, so it is cheaper when you only need to rank or threshold.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`)

Query

SELECT l2_sqr_distance([3, 4, 0]::FLOAT[3], [0, 0, 0]::FLOAT[3]) AS distance;

Result

 distance
----------
       25
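The claim that the squared distance preserves ordering follows from the square root being monotonic. A small Python sketch with made-up vectors shows the ranking is identical, and that a threshold must be squared as well when filtering on the squared value:

```python
import math

def l2_sqr_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

query = [0.0, 0.0, 0.0]
# Hypothetical vectors, for illustration only.
vectors = {"a": [3.0, 4.0, 0.0], "b": [1.0, 1.0, 1.0], "c": [0.0, 2.0, 0.0]}

by_squared = sorted(vectors, key=lambda k: l2_sqr_distance(vectors[k], query))
by_true = sorted(vectors, key=lambda k: math.sqrt(l2_sqr_distance(vectors[k], query)))
print(by_squared, by_true)  # ['b', 'c', 'a'] ['b', 'c', 'a']

# "distance < 2" is the same filter as "squared distance < 2 ** 2".
radius = 2.0
within_true = [k for k in vectors if math.sqrt(l2_sqr_distance(vectors[k], query)) < radius]
within_squared = [k for k in vectors if l2_sqr_distance(vectors[k], query) < radius ** 2]
print(within_true, within_squared)  # ['b'] ['b']
```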

`l1_distance(a, b)`

Manhattan (L1) distance — the sum of the absolute per-component differences.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`)

Query

SELECT l1_distance([3, 4, 0]::FLOAT[3], [0, 0, 0]::FLOAT[3]) AS distance;

Result

 distance
----------
        7

`cosine_distance(a, b)`

Cosine distance, defined as 1 − cosine_similarity. It ignores vector magnitude and depends only on direction, which makes it the usual choice for text and embedding models. The result lies in [0, 2]: 0 for vectors pointing the same way, 1 for orthogonal vectors and 2 for opposite vectors.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`, non-zero)

Query

SELECT cosine_distance([1, 0, 0]::FLOAT[3], [1, 1, 0]::FLOAT[3]) AS distance;

Result

  distance
------------
 0.29289323

`cosine_similarity(a, b)`

The cosine of the angle between the vectors, in [−1, 1]. Unlike the distance functions, higher values mean more similar. For unit-length vectors it equals inner_product.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`, non-zero)

Query

SELECT cosine_similarity([1, 0, 0]::FLOAT[3], [1, 1, 0]::FLOAT[3]) AS similarity;

Result

 similarity
------------
 0.70710677

`inner_product(a, b)`

The dot product of the two vectors. Higher means more similar; for unit-length vectors it equals cosine_similarity.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`)

Query

SELECT inner_product([1, 2, 3]::FLOAT[3], [4, 5, 6]::FLOAT[3]) AS dot;

Result

 dot
-----
  32

`negative_inner_product(a, b)`

−inner_product, so that lower values mean more similar — the form used by the ip HNSW metric, where the index orders neighbours by ascending distance. It is the function behind the <#> operator.

Operand	Type
`a`, `b`	`FLOAT[N]` (same `N`)

Query

SELECT negative_inner_product([1, 2, 3]::FLOAT[3], [4, 5, 6]::FLOAT[3]) AS neg_dot;

Result

 neg_dot
---------
     -32
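Why negate at all? An index that always returns the smallest distance first needs "best match" to mean "smallest value". The Python sketch below, using hypothetical rows, shows that sorting ascending by the negative dot product gives the same order as sorting descending by the dot product:

```python
def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y))

def negative_inner_product(x, y):
    return -inner_product(x, y)

query = [1, 1, 1]
# Hypothetical rows, for illustration only.
rows = {1: [1, 0, 0], 2: [2, 2, 1], 3: [1, 2, 3]}

ascending_by_negative = sorted(rows, key=lambda i: negative_inner_product(rows[i], query))
descending_by_dot = sorted(rows, key=lambda i: inner_product(rows[i], query), reverse=True)

print(ascending_by_negative)  # [3, 2, 1]
print(descending_by_dot)      # [3, 2, 1]
print([negative_inner_product(rows[i], query) for i in ascending_by_negative])  # [-6, -5, -1]
```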

Distance operators: `<->`, `<+>`, `<=>`, `<#>`

The operator forms are what drive an indexed nearest-neighbour search. Each maps to one metric and to a named function, and only accelerates a query when its metric matches the index (see Operators, metrics and the HNSW index).

Operator	Metric	Equivalent function
`a <-> b`	`l2`	`l2_distance`
`a <+> b`	`l1`	`l1_distance`
`a <=> b`	`cosine`	`cosine_distance`
`a <#> b`	`ip`	`negative_inner_product`

kNN: the `k` closest vectors

Order by the operator and LIMIT to the number of neighbours you want. The query below asks the l2-metric index for the two nearest rows to the origin and projects the distance alongside the id:

Query

SELECT id, emb <-> [0, 0, 0]::FLOAT[3] AS distance
FROM vecs_l2
ORDER BY distance, id
LIMIT 2;

Result

 id | distance
----+----------
  1 |        1
  3 |        1

Range search: every vector within a radius

Compare the same operator in a WHERE clause to return all vectors closer than a threshold, rather than a fixed count:

Query

SELECT id, emb <-> [0, 0, 0]::FLOAT[3] AS distance
FROM vecs_l2
WHERE emb <-> [0, 0, 0]::FLOAT[3] < 2
ORDER BY distance, id;

Result

 id | distance
----+----------
  1 |        1
  3 |        1

The two forms combine: add ORDER BY emb <-> $query LIMIT k to a range query to take the closest k within the radius.
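The combined form is a radius filter followed by a top-k cut. As a rough Python sketch of the semantics (hypothetical rows and helper name; this says nothing about how the database executes the query):

```python
import math

def l2_distance(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_within_radius(rows, query, radius, k):
    """Keep rows closer than radius, then return the k nearest as (distance, id)."""
    hits = [(l2_distance(vec, query), row_id) for row_id, vec in rows]
    hits = [hit for hit in hits if hit[0] < radius]   # the WHERE clause
    return sorted(hits)[:k]                           # ORDER BY distance, id LIMIT k

# Hypothetical table contents, for illustration only.
rows = [
    (1, [1.0, 0.0, 0.0]),
    (2, [3.0, 4.0, 0.0]),
    (3, [0.0, 1.0, 0.0]),
    (4, [0.0, 0.0, 1.5]),
]
print(knn_within_radius(rows, [0.0, 0.0, 0.0], radius=2.0, k=2))
# [(1.0, 1), (1.0, 3)]
```

Row 4 lies inside the radius but is dropped by the limit, and row 2 is dropped by the radius regardless of k.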

The other metrics

The same pattern works with each metric's operator against an index built for that metric.

Manhattan (<+>, l1 index):

Query

SELECT id, emb <+> [0, 0, 0]::FLOAT[3] AS distance
FROM vecs_l
ORDER BY distance, id
LIMIT 2;

Result

 id | distance
----+----------
  1 |        1
  3 |        1

Cosine (<=>, cosine index) — id 1 is the query direction itself, so its distance is 0:

Query

SELECT id, emb <=> [1, 0, 0]::FLOAT[3] AS distance
FROM vecs_c
ORDER BY distance, id
LIMIT 2;

Result

 id |  distance
----+------------
  1 |          0
  3 | 0.29289323

Inner product (<#>, ip index) — the operator returns the negative dot product, so the largest dot product (id 3, the longest aligned vector) sorts first:

Query

SELECT id, emb <#> [1, 1, 1]::FLOAT[3] AS distance
FROM vecs_i
ORDER BY distance, id
LIMIT 3;

Result

 id | distance
----+----------
  3 |       -6
  2 |       -5
  1 |       -1

Norms and normalization

l2_norm and l1_norm return a vector's magnitude; l2_normalize and l1_normalize scale it to a unit vector under the corresponding norm. Normalizing before indexing with cosine (or comparing with inner_product) keeps comparisons magnitude-independent — once vectors are unit-length, inner_product and cosine_similarity coincide.

Function	Operand	Returns
`l2_norm(a)` · `l1_norm(a)`	`FLOAT[N]`	scalar magnitude
`l2_normalize(a)` · `l1_normalize(a)`	`FLOAT[N]`	`FLOAT[N]` unit vector

Query

SELECT l2_norm([3, 4, 0]::FLOAT[3]) AS norm;

Result

 norm
------
    5

Query

SELECT l1_norm([3, 4, 0]::FLOAT[3]) AS norm;

Result

 norm
------
    7

Query

SELECT l2_normalize([3, 4, 0]::FLOAT[3]) AS unit;

Result

    unit
-------------
 {0.6,0.8,0}

Query

SELECT l1_normalize([3, 4, 0]::FLOAT[3]) AS unit;

Result

           unit
--------------------------
 {0.42857146,0.5714286,0}
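The statement that inner_product and cosine_similarity coincide for unit-length vectors can be verified numerically. The Python sketch below re-implements the norms and normalizers from their definitions; it is not SereneDB's code, and rejecting the zero vector with ValueError is an assumption of the sketch.

```python
import math

def l2_norm(x):
    return math.sqrt(sum(a * a for a in x))

def l1_norm(x):
    return sum(abs(a) for a in x)

def _scale(x, norm):
    if norm == 0:
        # Sketch-only choice: a zero vector cannot be scaled to unit length.
        raise ValueError("cannot normalize a zero vector")
    return [a / norm for a in x]

def l2_normalize(x):
    return _scale(x, l2_norm(x))

def l1_normalize(x):
    return _scale(x, l1_norm(x))

def inner_product(x, y):
    return sum(a * b for a, b in zip(x, y))

def cosine_similarity(x, y):
    return inner_product(x, y) / (l2_norm(x) * l2_norm(y))

a, b = [3.0, 4.0, 0.0], [1.0, 1.0, 0.0]
print(l2_norm(a), l1_norm(a))   # 5.0 7.0
print(l2_normalize(a))          # [0.6, 0.8, 0.0]

# After L2 normalization the plain dot product equals the cosine similarity.
dot_of_units = inner_product(l2_normalize(a), l2_normalize(b))
print(math.isclose(dot_of_units, cosine_similarity(a, b)))  # True
```

This is why a pipeline can normalize embeddings once at write time and then rank with the cheaper dot product.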

For array-typed distance helpers (array_distance, array_cosine_distance, …) see Array Functions.

Coming from Elasticsearch

Elasticsearch kNN search over a dense_vector field maps onto a SereneDB hnsw index queried with a distance operator:

Elasticsearch / OpenSearch	SereneDB
`dense_vector` field + `index: hnsw`	`FLOAT[N]` column with an `hnsw` operator class
`similarity: l2_norm`	`<->` / `l2_distance` (metric `l2`)
`similarity: cosine`	`<=>` / `cosine_distance` (metric `cosine`)
`similarity: dot_product` / `max_inner_product`	`<#>` / `negative_inner_product` (metric `ip`)
`knn` query (`k`, `num_candidates`)	`ORDER BY emb <-> $q LIMIT k`
`knn` with `filter`	a `WHERE` clause beside the `ORDER BY` (Hybrid Search)

Notable differences. SereneDB adds a Manhattan metric (<+> / l1_distance) that Elasticsearch lacks, while the l_inf and Hamming (binary-vector) similarities offered on the Elasticsearch / OpenSearch side have no SereneDB equivalent. Tuning happens at index build time (m, ef_construction) rather than through a per-query num_candidates, although sdb_ef_search overrides the search beam per session.
