Quick Start

SereneDB is a search-OLAP database: one engine that runs full-text search, vector (semantic) search and analytical queries over the same data. It speaks the PostgreSQL wire protocol and SQL, so your existing clients and drivers just work. It ships as a single binary with nothing else to run.

Install

curl https://install.serenedb.com | sh

Or run the Docker image directly:

docker run -d --name serenedb -p 7890:7890 serenedb/serenedb

Connect

SereneDB speaks the PostgreSQL wire protocol. Connect with psql or any PostgreSQL-compatible client:

psql -h localhost -p 7890

No credentials are required by default.
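
The same endpoint can be reached from application code. The sketch below assumes the third-party psycopg (version 3) driver, which this guide does not name; the helper names `conninfo` and `ping` are hypothetical, and the host and port simply mirror the psql command above.

```python
def conninfo(host: str = "localhost", port: int = 7890) -> str:
    """Build a libpq keyword/value connection string (no credentials)."""
    return f"host={host} port={port}"


def ping(dsn: str) -> int:
    """Open a session, run a trivial query, and return its single value."""
    # Assumption: psycopg 3 is installed. It is imported lazily so the
    # connection-string helper above works without the driver.
    import psycopg

    with psycopg.connect(dsn) as conn:
        row = conn.execute("SELECT 1").fetchone()
        return row[0]
```

Calling `ping(conninfo())` against a running server should return 1 if the driver and server interoperate as described above.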

AI search in 60 seconds

Everything from here runs in that one session. Load a real dataset, generate embeddings, index it for full-text and vector search, then query it — lexical, semantic and analytical — over the same data, with no second system and no data copies.

Load data from Hugging Face

SereneDB reads Parquet, CSV and JSON directly from object storage, HTTP and the Hugging Face Hub — no import job and no schema to define up front. Point a CREATE TABLE at a remote dataset and it lands as an ordinary table:

Query
CREATE TABLE movies AS
    SELECT id, title, overview,
           split_part(genres, '-', 1) AS genre,
           vote_average, vote_count
    FROM read_parquet('hf://datasets/wykonos/movies@~parquet/default/train/*.parquet')
    WHERE overview IS NOT NULL AND vote_count > 1000;

That is a real public dataset of films, read over the network in one statement. You can also skip loading entirely and index files in place on S3 or disk — see zero-ETL search over external data.
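
The `genre` column in the statement above comes from `split_part(genres, '-', 1)`, which keeps only the first hyphen-separated field. As a rough Python equivalent, assuming PostgreSQL-style semantics for positive field numbers (1-based, empty string when the field does not exist); the sample input string is illustrative, not taken from the dataset:

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th field (1-based) of text split on delimiter.

    Mirrors PostgreSQL's split_part for positive n only: a field number
    past the last field yields an empty string.
    """
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


# Illustrative input: a hyphen-separated genre list keeps its first entry.
first_genre = split_part("Action-Adventure-Science Fiction", "-", 1)
```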

Ask an analytical question

movies behaves like any SQL table. Aggregate it the way you would in any analytical database:

Query
SELECT genre, COUNT(*) AS films, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies
GROUP BY genre
ORDER BY avg_rating DESC, genre;
Result
 genre           | films | avg_rating
-----------------+-------+------------
 Action          |     1 |        8.2
 Science Fiction |     4 |       8.13
 Horror          |     1 |        8.1
 Animation       |     2 |       8.05
 Drama           |     2 |       7.85
 Romance         |     2 |       7.85

Generate embeddings

To search by meaning and not just keywords, turn each overview into an embedding vector. ai_embed calls an embedding model straight from SQL and returns a vector you store in a fixed-size FLOAT[N] column — no pipeline, no separate vector service:

Query
ALTER TABLE movies ADD COLUMN embedding FLOAT[4];

UPDATE movies
SET embedding = ai_embed(overview, 'text-embedding-3-small', 'openai');

The provider is configured once with CREATE SECRET; any OpenAI-compatible endpoint works, including a local Ollama server.

A single inverted index covers both worlds: a text analyzer over the text columns and an hnsw index over the embedding. Lexical and semantic search share one index, beside the columnar data:

Query
CREATE TEXT SEARCH DICTIONARY english (
    template = 'text',
    locale = 'en_US.UTF-8',
    case = 'lower',
    stemming = false,
    frequency = true,
    position = true,
    offset = true
);

CREATE INDEX movies_idx ON movies
    USING inverted(id, title english, overview english, embedding hnsw (metric = 'cosine'));

VACUUM (REFRESH_TABLE) movies;
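
To make the dictionary options more concrete, here is a minimal Python sketch of a positional inverted index. It lowercases tokens and does no stemming, in the spirit of `case = 'lower'` and `stemming = false`, and it stores token positions, which is what makes phrase matching possible. This is a conceptual illustration only, not SereneDB's implementation: the regex tokenizer, the function names and the sample documents are all assumptions.

```python
import re
from collections import defaultdict


def analyze(text):
    """Lowercase the text and split it into word tokens; no stemming."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(analyze(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_match(index, phrase):
    """Return sorted ids of documents that contain the phrase's terms consecutively."""
    terms = analyze(phrase)
    if not terms:
        return []
    hits = []
    for doc_id, starts in index.get(terms[0], {}).items():
        for start in starts:
            # Every later term must sit exactly i positions after the first one.
            if all(start + i in index.get(term, {}).get(doc_id, ())
                   for i, term in enumerate(terms[1:], start=1)):
                hits.append(doc_id)
                break
    return sorted(hits)


# Made-up documents for illustration.
docs = {
    1: "She falls in love with a sailor",
    2: "In love, he falls apart",
    3: "A robot falls in Love",
}
index = build_index(docs)
matches = phrase_match(index, "falls in love")
```

Document 2 contains all three words but not in sequence, so only a position-aware index can exclude it from a phrase query.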

Query the index by name, match text with the @@ operator and a query constructor like ts_phrase, then rank by BM25:

Query
SELECT title, genre, ROUND(BM25(movies_idx.tableoid)::NUMERIC, 3) AS score
FROM movies_idx
WHERE overview @@ ts_phrase('falls in love')
ORDER BY score DESC, title;
Result
 title        | genre   | score
--------------+---------+-------
 The Notebook | Romance | 6.768
 Titanic      | Romance | 6.768
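
For readers unfamiliar with BM25, the sketch below shows the textbook Okapi BM25 formula in Python. The exact variant and parameters behind SereneDB's `BM25()` are not stated in this guide, so the defaults `k1 = 1.2` and `b = 0.75`, the Lucene-style IDF and the toy corpus are assumptions chosen for illustration; the scores it produces are not comparable to the ones in the result above.

```python
import math


def bm25(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of one tokenized document against a list of query terms.

    corpus is a list of tokenized documents used for document frequencies
    and the average document length.
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # documents containing the term
        if df == 0:
            continue
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        tf = doc_tokens.count(term)  # term frequency in this document
        # Length normalization: longer-than-average documents are penalized.
        norm = tf + k1 * (1 - b + b * len(doc_tokens) / avg_len)
        score += idf * tf * (k1 + 1) / norm
    return score


# Made-up tokenized corpus for illustration.
corpus = [
    ["falls", "in", "love"],
    ["lost", "in", "space"],
    ["love", "story"],
]
scores = [bm25(["love"], doc, corpus) for doc in corpus]
```

Rarer terms contribute more through the IDF factor, repeated terms give diminishing returns, and a match in a shorter document scores higher than the same match in a longer one.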

Hybrid search: lexical and semantic

This is where SereneDB pulls ahead. A full-text predicate narrows to the rows that mention a term; the <=> distance to a query embedding ranks them by what they actually mean — both in one statement, against one index. Embed the query text at search time with the same ai_embed:

Query
SELECT title, genre
FROM   movies_idx
WHERE  overview @@ 'space'                                   -- lexical recall
ORDER  BY embedding <=> ai_embed('space exploration',
                                 'text-embedding-3-small', 'openai')
LIMIT  3;                                                    -- semantic ranking
Result
 title          | genre
----------------+-----------------
 Interstellar   | Science Fiction
 Gravity        | Science Fiction
 Hidden Figures | Drama

Of the four films whose overview mentions "space", semantic ranking surfaces the three that are really about space exploration — and drops WALL-E, which is set in space but is, at heart, a kids' animation.
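
The two-stage shape of that query, a lexical filter followed by a semantic sort, can be sketched in plain Python. This assumes `<=>` returns cosine distance, which is consistent with `metric = 'cosine'` in the index definition but is not spelled out in this guide. The rows, the two-dimensional vectors and the function names are made up for illustration; real embeddings would come from `ai_embed`.

```python
import math


def cosine_distance(a, b):
    """1 minus cosine similarity: 0 for same direction, 1 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_search(rows, term, query_vec, limit=3):
    """Keep rows whose overview contains term, then rank them by vector distance."""
    # Stage 1: lexical recall (exact lowercase token match).
    candidates = [r for r in rows if term in r["overview"].lower().split()]
    # Stage 2: semantic ranking, nearest embedding first.
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [r["title"] for r in candidates[:limit]]


# Toy rows with hypothetical 2-d embeddings.
rows = [
    {"title": "A", "overview": "a mission into deep space", "embedding": [1.0, 0.0]},
    {"title": "B", "overview": "a robot alone in space", "embedding": [0.0, 1.0]},
    {"title": "C", "overview": "love on a sinking ship", "embedding": [1.0, 0.1]},
    {"title": "D", "overview": "engineers race to reach space", "embedding": [0.8, 0.6]},
]
top = hybrid_search(rows, "space", [1.0, 0.0], limit=2)
```

Row C sits close to the query vector but never enters the ranking because it fails the text predicate, while row B passes the predicate and is cut by the limit because its vector is far from the query.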

Search and analytics, one query

The same index feeds aggregation directly. The search predicate selects a candidate set; the GROUP BY runs over it — in a single statement, on one engine:

Query
SELECT genre, COUNT(*) AS matches, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies_idx
WHERE overview @@ 'space'
GROUP BY genre
ORDER BY matches DESC, genre;
Result
 genre           | matches | avg_rating
-----------------+---------+------------
 Science Fiction |       2 |       8.05
 Animation       |       1 |        8.1
 Drama           |       1 |        8.1
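
Conceptually, the statement above filters first and aggregates second. A minimal Python sketch of that flow follows, with made-up rows and a hypothetical function name; it mirrors the query's `GROUP BY` and `ORDER BY matches DESC, genre` but says nothing about how the engine executes it.

```python
from collections import defaultdict


def search_and_aggregate(rows, term):
    """Group rows matching term by genre; return (genre, matches, avg_rating) tuples."""
    ratings_by_genre = defaultdict(list)
    for r in rows:
        # Search predicate: exact lowercase token match on the overview.
        if term in r["overview"].lower().split():
            ratings_by_genre[r["genre"]].append(r["vote_average"])
    result = [
        (genre, len(ratings), round(sum(ratings) / len(ratings), 2))
        for genre, ratings in ratings_by_genre.items()
    ]
    # Most matches first, ties broken alphabetically by genre.
    result.sort(key=lambda t: (-t[1], t[0]))
    return result


# Toy rows; genres and ratings are invented for illustration.
rows = [
    {"genre": "Sci-Fi", "overview": "lost in space", "vote_average": 8.0},
    {"genre": "Sci-Fi", "overview": "space station crisis", "vote_average": 7.0},
    {"genre": "Drama", "overview": "a space race story", "vote_average": 8.0},
    {"genre": "Drama", "overview": "a quiet farm", "vote_average": 9.0},
]
summary = search_and_aggregate(rows, "space")
```

The non-matching Drama row is excluded before averaging, so it does not affect that genre's rating.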

Next steps

Start with the concepts, then dive into the search type you need:

  • Inverted Index — how it works, the query model, mixing field types in one index
  • Text Analysis — dictionaries, tokenizers and the index-time = query-time rule
  • What to Index — columns, expressions, generated columns, JSON and VARIANT
