Edit this page

Ranking

Matching answers a yes/no question — does a row match? Ranking answers a different one — how well does it match? — and orders the results accordingly. The two are separate steps: a WHERE col @@ query filter selects the matching rows, and a scorer in ORDER BY sorts them by relevance.

Scoring with BM25

Okapi BM25 is the standard relevance scorer. Use it in ORDER BY, highest score first:

Query

SELECT id, title, BM25(movies_idx.tableoid) AS relevanceFROM movies_idxWHERE description @@ ts_phrase('alien')ORDER BY relevance DESC, id;

Result

 id | title                         | relevance----+-------------------------------+-----------  7 | Star Trek: The Motion Picture | 1.9693236

Other scorers

SereneDB ships several scorers; TFIDF is the classic alternative:

Query

SELECT id, title, TFIDF(movies_idx.tableoid) AS relevanceFROM movies_idxWHERE description @@ ts_phrase('alien')ORDER BY relevance DESC, id;

Result

 id | title                         | relevance----+-------------------------------+-----------  7 | Star Trek: The Motion Picture | 1.2527629

Scorer	Parameters (defaults)	Notes
`BM25`	`k1` (1.2), `b` (0.75)	Okapi BM25; `b = 0` disables length normalization (BM15)
`TFIDF`	`with_norms` (false)	Classic TF-IDF
`lm_jm`	`lambda` (0.1)	Language model, Jelinek-Mercer smoothing
`lm_dirichlet`	`mu` (2000)	Language model, Dirichlet smoothing
`indri_dirichlet`	`mu` (2000)	Indri-style Dirichlet (no floor clamp)
`dfi`	`measure` (`'standardized'`)	Divergence-from-independence; also `'saturated'`, `'chi_squared'`
`raw_tf` / `raw_boost` / `raw_dl`	—	Raw term frequency, boost and document length

Boosting

The ^ operator multiplies a sub-query's contribution to the score, so you can weight some clauses above others. Here a title match is boosted so it outranks a description-only match:

Query

SELECT id, titleFROM movies_idxWHERE title @@ ('alien'::tsquery ^ 5.0) OR description @@ 'galaxy'ORDER BY BM25(movies_idx.tableoid) DESC, id;

Result

 id | title----+-------------------------------  8 | Alien  7 | Star Trek: The Motion Picture

Top-K queries and WAND pruning

The common shape ORDER BY <scorer> DESC LIMIT k returns the best k matches. Building the index with the optimize_top_k index option enables WAND pruning, which skips candidates that provably cannot reach the top k:

CREATE INDEX movies_idx ON movies
    USING inverted (id, description ranking_dict)
    WITH (optimize_top_k = 'bm25(1.2, 0.75)');

Query

SELECT id, titleFROM movies_idxWHERE description @@ 'galaxy'ORDER BY BM25(movies_idx.tableoid) DESC, idLIMIT 1;

Result

 id | title----+-------------------------------  7 | Star Trek: The Motion Picture

Pruning engages only when all of the following hold; otherwise the query still runs correctly, just without the optimization:

the query is ORDER BY <scorer>(idx.tableoid) DESC with a LIMIT;
the scorer matches the one named in optimize_top_k exactly (a different scorer falls back to a full scan);
the filter is a single term or an OR of terms (not a phrase, AND or NOT).

You can confirm pruning is active in the query plan — EXPLAIN shows Top: k, optimized on the scan. The sdb_disable_top_k_optimization and sdb_scored_terms_limit session settings tune this at query time.

Tie-breaking

Scores can tie. Add further ORDER BY columns after the scorer for a deterministic order — typically the primary key:

SELECT id, title
FROM movies_idx
WHERE description @@ 'galaxy'
ORDER BY BM25(movies_idx.tableoid) DESC, id;

Combining ranked queries

To blend a lexical (BM25) ranking with a vector ranking — or any two ranked result sets — use Reciprocal Rank Fusion, which combines ranks rather than incomparable scores. See Hybrid Search and the Reciprocal Rank Fusion recipe.

Scoring with BM25​

Other scorers​

Boosting​

Top-K queries and WAND pruning​

Tie-breaking​

Combining ranked queries​

See also​

Scoring with BM25

Other scorers

Boosting

Top-K queries and WAND pruning

Tie-breaking

Combining ranked queries

See also