Quick Start

SereneDB is a search-OLAP database: one engine that runs full-text search, vector (semantic) search and analytical queries over the same data. It speaks the PostgreSQL wire protocol and SQL, so your existing clients and drivers just work. It ships as a single binary with nothing else to run.

Install

Docker:

curl https://install.serenedb.com | sh

Or run the image directly:

docker run -d --name serenedb -p 7890:7890 serenedb/serenedb

Linux:

curl https://linux.serenedb.com | sh

Direct downloads are available on the GitHub releases page.

Connect

SereneDB speaks the PostgreSQL wire protocol. Connect with psql or any PostgreSQL-compatible client:

psql -h localhost -p 7890

No credentials are required by default.

AI search in 60 seconds

Everything from here runs in that one session. Load a real dataset, generate embeddings, index it for full-text and vector search, then query it — lexical, semantic and analytical — over the same data, with no second system and no data copies.

Load data from Hugging Face

SereneDB reads Parquet, CSV and JSON directly from object storage, HTTP and the Hugging Face Hub — no import job and no schema to define up front. Point a CREATE TABLE at a remote dataset and it lands as an ordinary table:

Query

CREATE TABLE movies AS
    SELECT id, title, overview,
           split_part(genres, '-', 1) AS genre,
           vote_average, vote_count
    FROM read_parquet('hf://datasets/wykonos/movies@~parquet/default/train/*.parquet')
    WHERE overview IS NOT NULL AND vote_count > 1000;

That is a real public dataset of films, read over the network in one statement. You can also skip loading entirely and index files in place on S3 or disk — see zero-ETL search over external data.
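The genre column in the statement above keeps only the first segment of a dash-separated genres string. As a rough illustration of what split_part(genres, '-', 1) does, here is a small Python sketch that mirrors PostgreSQL's documented behaviour for positive field numbers. The sample string is made up, and SereneDB's own implementation may differ in edge cases.

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) field of text split on delimiter.

    Mirrors PostgreSQL's split_part for positive n: an out-of-range
    field yields an empty string rather than an error.
    """
    if n < 1:
        raise ValueError("this sketch only handles positive field numbers")
    fields = text.split(delimiter)
    return fields[n - 1] if n <= len(fields) else ""


# Hypothetical value of the genres column, not taken from the dataset.
print(split_part("Science Fiction-Adventure-Drama", "-", 1))
```

Taking only the first segment gives each film a single genre, which is what makes the later GROUP BY genre queries straightforward.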

Ask an analytical question

movies behaves like any SQL table. Aggregate it the way you would in any analytical database:

Query

SELECT genre, COUNT(*) AS films, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies
GROUP BY genre
ORDER BY avg_rating DESC, genre;

Result

 genre           | films | avg_rating
-----------------+-------+------------
 Action          |     1 |        8.2
 Science Fiction |     4 |       8.13
 Horror          |     1 |        8.1
 Animation       |     2 |       8.05
 Drama           |     2 |       7.85
 Romance         |     2 |       7.85

Generate embeddings

To search by meaning and not just keywords, turn each overview into an embedding vector. ai_embed calls an embedding model straight from SQL and returns a vector you store in a fixed-size FLOAT[N] column — no pipeline, no separate vector service:

Query

ALTER TABLE movies ADD COLUMN embedding FLOAT[4];

UPDATE movies
SET embedding = ai_embed(overview, 'text-embedding-3-small', 'openai');

The provider is configured once with CREATE SECRET; any OpenAI-compatible endpoint works, including a local Ollama server.

Index for full-text and vector search

A single inverted index covers both worlds: a text analyzer over the text columns and an hnsw index over the embedding. Lexical and semantic search share one index, beside the columnar data:

Query

CREATE TEXT SEARCH DICTIONARY english (
    template = 'text',
    locale = 'en_US.UTF-8',
    case = 'lower',
    stemming = false,
    frequency = true,
    position = true,
    offset = true
);

CREATE INDEX movies_idx ON movies
    USING inverted(id, title english, overview english, embedding hnsw (metric = 'cosine'));

VACUUM (REFRESH_TABLE) movies;
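To make the dictionary settings above more concrete, here is a toy positional inverted index in Python. It is only a conceptual sketch, not SereneDB's implementation: the analyzer lowercases and does no stemming (loosely matching case = 'lower' and stemming = false), and storing token positions (as position = true suggests) is what allows a phrase to be matched. The function names and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, split on non-alphanumerics, no stemming."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [positions]} (a positional postings list)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_match(index, phrase):
    """Return ids of docs where the phrase terms occur at consecutive positions."""
    terms = analyze(phrase)
    if not terms or any(t not in index for t in terms):
        return set()
    hits = set()
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, ()) for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


# Made-up documents for illustration only.
docs = {
    1: "A poor artist falls in love with a wealthy passenger.",
    2: "He falls, and in time finds love again.",
    3: "Love falls in unexpected places.",
}
index = build_index(docs)
print(sorted(phrase_match(index, "falls in love")))
```

All three documents contain every term, but only the first has them at consecutive positions. Without stemming, a query for "fall in love" would match nothing in this toy index, because "fall" and "falls" are stored as different terms.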

Full-text search

Query the index by name, match text with the @@ operator and a query constructor like ts_phrase, then rank by BM25:

Query

SELECT title, genre, ROUND(BM25(movies_idx.tableoid)::NUMERIC, 3) AS score
FROM movies_idx
WHERE overview @@ ts_phrase('falls in love')
ORDER BY score DESC, title;

Result

 title        | genre   | score
--------------+---------+-------
 The Notebook | Romance | 6.768
 Titanic      | Romance | 6.768
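For readers unfamiliar with BM25, the following Python sketch implements the textbook Okapi BM25 formula over a few made-up documents. It is an illustration of the general scoring idea only: the parameter values k1 = 1.2 and b = 0.75 are common defaults assumed here, the idf variant is one of several in use, and SereneDB's BM25 function may compute its scores differently.

```python
import math
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score every document against the query with textbook Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    scores = [0.0] * n_docs
    for term in set(tokenize(query)):
        # Document frequency: how many documents contain the term.
        df = sum(1 for t in tokenized if term in t)
        if df == 0:
            continue
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for i, tokens in enumerate(tokenized):
            tf = tokens.count(term)
            # Length normalisation: longer documents need more hits.
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            scores[i] += idf * tf * (k1 + 1) / (tf + norm)
    return scores


# Made-up documents for illustration only.
docs = [
    "a rover explores space",
    "a love story in paris",
    "space crews race through space",
]
for doc, score in zip(docs, bm25_scores(docs, "space")):
    print(f"{score:.3f}  {doc}")
```

The score grows with term frequency but saturates, and it is damped for documents longer than average, so a document that repeats the term scores higher without growing in proportion to the repeat count.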

Hybrid search: lexical and semantic

This is where SereneDB pulls ahead. A full-text predicate narrows to the rows that mention a term; the <=> distance to a query embedding ranks them by what they actually mean — both in one statement, against one index. Embed the query text at search time with the same ai_embed:

Query

SELECT title, genre
FROM   movies_idx
WHERE  overview @@ 'space'                                   -- lexical recall
ORDER  BY embedding <=> ai_embed('space exploration',
                                 'text-embedding-3-small', 'openai')
LIMIT  3;                                                    -- semantic ranking

Result

 title          | genre
----------------+-----------------
 Interstellar   | Science Fiction
 Gravity        | Science Fiction
 Hidden Figures | Drama

Of the four films whose overview mentions "space", semantic ranking surfaces the three that are really about space exploration — and drops WALL-E, which is set in space but is, at heart, a kids' animation.
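The filter-then-rank pattern can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions: it treats <=> as cosine distance (suggested by metric = 'cosine' in the index definition), uses a brute-force scan instead of an HNSW index, and works on hand-picked 4-dimensional toy vectors in place of real model embeddings. The function names and rows are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, larger is less similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_search(rows, term, query_vec, limit):
    """Keep rows whose text contains the term, then rank by vector distance."""
    candidates = [r for r in rows if term in r["text"].lower().split()]
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [r["id"] for r in candidates[:limit]]


# Toy rows with hand-picked vectors, not real model embeddings.
rows = [
    {"id": "a", "text": "astronauts travel through space", "embedding": [0.9, 0.1, 0.0, 0.1]},
    {"id": "b", "text": "a robot cleans up in space", "embedding": [0.2, 0.9, 0.1, 0.0]},
    {"id": "c", "text": "engineers compute space trajectories", "embedding": [0.7, 0.2, 0.3, 0.1]},
    {"id": "d", "text": "a detective story in the city", "embedding": [0.9, 0.0, 0.1, 0.0]},
]
query_vec = [1.0, 0.0, 0.1, 0.0]  # stand-in for the embedded query text
print(hybrid_search(rows, "space", query_vec, limit=2))
```

Row "d" has the vector closest to the query but never appears in the result, because the lexical filter removes it before ranking. Row "b" passes the filter but ranks last among the candidates, which is the same effect described for WALL-E above.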

Search and analytics, one query

The same index feeds aggregation directly. The search predicate selects a candidate set; the GROUP BY runs over it — in a single statement, on one engine:

Query

SELECT genre, COUNT(*) AS matches, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies_idx
WHERE overview @@ 'space'
GROUP BY genre
ORDER BY matches DESC, genre;

Result

 genre           | matches | avg_rating
-----------------+---------+------------
 Science Fiction |       2 |       8.05
 Animation       |       1 |        8.1
 Drama           |       1 |        8.1
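The shape of that query (search predicate first, aggregation over the surviving rows) can be mimicked in plain Python. The sketch below is only a mental model with made-up rows and a hypothetical function name. In SereneDB both steps run inside the engine against the index, which this code does not attempt to reproduce.

```python
from collections import defaultdict


def search_and_aggregate(rows, term):
    """Filter rows by a term, then count and average ratings per genre."""
    groups = defaultdict(list)
    for row in rows:
        # Search predicate: keep rows whose overview contains the term.
        if term in row["overview"].lower().split():
            groups[row["genre"]].append(row["rating"])
    result = [
        (genre, len(ratings), round(sum(ratings) / len(ratings), 2))
        for genre, ratings in groups.items()
    ]
    # Equivalent of ORDER BY matches DESC, genre.
    return sorted(result, key=lambda r: (-r[1], r[0]))


# Made-up rows for illustration only.
rows = [
    {"genre": "Sci-Fi", "rating": 8.0, "overview": "lost in space"},
    {"genre": "Sci-Fi", "rating": 7.0, "overview": "space station crisis"},
    {"genre": "Drama", "rating": 6.5, "overview": "a space race biography"},
    {"genre": "Drama", "rating": 9.0, "overview": "family secrets"},
]
for genre, matches, avg_rating in search_and_aggregate(rows, "space"):
    print(genre, matches, avg_rating)
```

Only rows that pass the search predicate contribute to the averages, so the second Drama row, which does not mention the term, has no effect on the Drama group.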

Next steps

Start with the concepts, then dive into the search type you need:

Inverted Index — how it works, the query model, mixing field types in one index
Text Analysis — dictionaries, tokenizers and the index-time = query-time rule
What to Index — columns, expressions, generated columns, JSON and VARIANT

Query types:

Full-Text Search — the @@ operator, phrases, fuzzy, boolean queries
Ranking — BM25 and other scorers, boosting, top-K / WAND
Vector Search — HNSW indexing, kNN and range search
Hybrid Search — combine full-text filters with vector ranking
Geospatial Search — ST_* predicates over GeoJSON / GEOMETRY

Going further:

Indexing External Data — zero-ETL search over a Parquet / CSV / JSON / Iceberg data lake
Maintenance & Introspection — refresh, compaction and inspecting indexes
Migrating from Elasticsearch — feature mapping, including aggregations
AI Functions — generate embeddings with ai_embed
