Quick Start
SereneDB is a search-OLAP database: one engine that runs full-text search, vector (semantic) search and analytical queries over the same data. It speaks the PostgreSQL wire protocol and SQL, so your existing clients and drivers just work. It ships as a single binary with nothing else to run.
Install
Docker:
curl https://install.serenedb.com | sh
Or run the image directly:
docker run -d --name serenedb -p 7890:7890 serenedb/serenedb
Linux:
curl https://linux.serenedb.com | sh
Direct downloads are available on the GitHub releases page.
Connect
SereneDB speaks the PostgreSQL wire protocol. Connect with psql or any PostgreSQL-compatible client:
psql -h localhost -p 7890
No credentials are required by default.
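The same connection can be made from application code. The following is a minimal Python sketch, assuming the third-party psycopg (version 3) driver is installed; the helper names `serenedb_conninfo` and `ping` are hypothetical, and depending on the server's defaults you may also need to pass a database name or user.

```python
def serenedb_conninfo(host: str = "localhost", port: int = 7890) -> str:
    """Build a libpq keyword/value connection string with no credentials."""
    return f"host={host} port={port}"


def ping(conninfo: str) -> int:
    """Open a connection, run SELECT 1, and return the single value."""
    # Imported lazily so the helper above works without the driver installed.
    import psycopg  # third-party: pip install "psycopg[binary]"

    with psycopg.connect(conninfo) as conn:
        row = conn.execute("SELECT 1").fetchone()
        return row[0]
```

With a server running locally, `ping(serenedb_conninfo())` should round-trip one query over the PostgreSQL wire protocol.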
AI search in 60 seconds
Everything from here runs in that one session. Load a real dataset, generate embeddings, index it for full-text and vector search, then query it — lexical, semantic and analytical — over the same data, with no second system and no data copies.
Load data from Hugging Face
SereneDB reads Parquet, CSV and JSON directly from object storage, HTTP and the Hugging Face Hub — no import job and no schema to define up front. Point a CREATE TABLE at a remote dataset and it lands as an ordinary table:
CREATE TABLE movies AS
SELECT id, title, overview,
       split_part(genres, '-', 1) AS genre,
       vote_average, vote_count
FROM read_parquet('hf://datasets/wykonos/movies@~parquet/default/train/*.parquet')
WHERE overview IS NOT NULL AND vote_count > 1000;
That is a real public dataset of films, read over the network in one statement. You can also skip loading entirely and index files in place on S3 or disk; see zero-ETL search over external data.
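The `genre` column in that statement keeps only the first entry of a hyphen-delimited `genres` string. As a rough illustration of what `split_part(genres, '-', 1)` computes, here is a Python sketch that assumes PostgreSQL-style semantics (1-based positions, empty string when the field does not exist). It handles positive positions only, and the sample values in the usage note are made up.

```python
def split_part(text: str, delimiter: str, n: int) -> str:
    """Return the n-th field (1-based) of text split on delimiter, or '' if absent."""
    if n < 1:
        raise ValueError("this sketch only handles positive field positions")
    # An empty delimiter means the whole string is a single field.
    fields = [text] if delimiter == "" else text.split(delimiter)
    return fields[n - 1] if n <= len(fields) else ""
```

For example, `split_part("Action-Science Fiction", "-", 1)` yields `"Action"`, so a film tagged with several genres is grouped under its first one.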
Ask an analytical question
movies behaves like any SQL table. Aggregate it the way you would in any analytical database:
SELECT genre, COUNT(*) AS films, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies
GROUP BY genre
ORDER BY avg_rating DESC, genre;

      genre      | films | avg_rating
-----------------+-------+------------
 Action          |     1 |        8.2
 Science Fiction |     4 |       8.13
 Horror          |     1 |        8.1
 Animation       |     2 |       8.05
 Drama           |     2 |       7.85
 Romance         |     2 |       7.85
Generate embeddings
To search by meaning and not just keywords, turn each overview into an embedding vector. ai_embed calls an embedding model straight from SQL and returns a vector you store in a fixed-size FLOAT[N] column — no pipeline, no separate vector service:
ALTER TABLE movies ADD COLUMN embedding FLOAT[4];
UPDATE movies
SET embedding = ai_embed(overview, 'text-embedding-3-small', 'openai');
The provider is configured once with CREATE SECRET; any OpenAI-compatible endpoint works, including a local Ollama server.
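For readers unfamiliar with what an "OpenAI-compatible endpoint" looks like on the wire, the sketch below builds the kind of HTTP request such an endpoint accepts and parses its reply. It illustrates the public embeddings request shape only and is not a description of how `ai_embed` is implemented; the helper names are hypothetical, and the Ollama base URL in the usage note is an assumption about a default local install.

```python
import json
import urllib.request


def embedding_request(text, model, base_url="https://api.openai.com/v1", api_key=""):
    """Build (but do not send) a POST request to an OpenAI-style /embeddings route."""
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embedding(response_body: bytes) -> list[float]:
    """Extract the first embedding vector from an OpenAI-style JSON response."""
    return json.loads(response_body)["data"][0]["embedding"]
```

Pointing `base_url` at a local server (for instance `http://localhost:11434/v1`, assumed here for Ollama) changes only the URL. Whichever model is used, the length of the returned vector has to match the N of the `FLOAT[N]` column it is stored in.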
Index for full-text and vector search
A single inverted index covers both worlds: a text analyzer over the text columns and an hnsw index over the embedding. Lexical and semantic search share one index, beside the columnar data:
CREATE TEXT SEARCH DICTIONARY english (
    template = 'text',
    locale = 'en_US.UTF-8',
    case = 'lower',
    stemming = false,
    frequency = true,
    position = true,
    offset = true
);
CREATE INDEX movies_idx ON movies USING inverted(
    id,
    title english,
    overview english,
    embedding hnsw (metric = 'cosine')
);
VACUUM (REFRESH_TABLE) movies;
Full-text search
Query the index by name, match text with the @@ operator and a query constructor like ts_phrase, then rank by BM25:
SELECT title, genre, ROUND(BM25(movies_idx.tableoid)::NUMERIC, 3) AS score
FROM movies_idx
WHERE overview @@ ts_phrase('falls in love')
ORDER BY score DESC, title;

    title     |  genre  | score
--------------+---------+-------
 The Notebook | Romance | 6.768
 Titanic      | Romance | 6.768
Hybrid search: lexical and semantic
This is where SereneDB pulls ahead. A full-text predicate narrows to the rows that mention a term; the <=> distance to a query embedding ranks them by what they actually mean — both in one statement, against one index. Embed the query text at search time with the same ai_embed:
SELECT title, genre
FROM movies_idx
WHERE overview @@ 'space'   -- lexical recall
ORDER BY embedding <=> ai_embed('space exploration', 'text-embedding-3-small', 'openai')
LIMIT 3;                    -- semantic ranking

     title      |      genre
----------------+-----------------
 Interstellar   | Science Fiction
 Gravity        | Science Fiction
 Hidden Figures | Drama
Of the four films whose overview mentions "space", semantic ranking surfaces the three that are really about space exploration and drops WALL-E, which is set in space but is, at heart, a kids' animation.
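The filter-then-rank pattern in that query can be sketched in a few lines of Python. This is a conceptual illustration only, not how the engine executes it: whitespace tokenization stands in for the text analyzer, `<=>` is assumed to be cosine distance (as the index's `metric = 'cosine'` suggests), and the rows and two-dimensional vectors are made up.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; smaller means closer in meaning."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_search(rows, term, query_vec, limit):
    """Keep rows whose overview contains term, then rank them by vector distance."""
    # Lexical recall: naive whitespace tokens stand in for a real analyzer.
    candidates = [r for r in rows if term in r["overview"].lower().split()]
    # Semantic ranking: order the survivors by distance to the query embedding.
    candidates.sort(key=lambda r: cosine_distance(r["embedding"], query_vec))
    return [r["title"] for r in candidates[:limit]]


# Made-up rows and vectors, for illustration only.
rows = [
    {"title": "A", "overview": "a crew travels through space", "embedding": [1.0, 0.0]},
    {"title": "B", "overview": "a robot alone in space", "embedding": [0.0, 1.0]},
    {"title": "C", "overview": "a love story at sea", "embedding": [1.0, 0.1]},
    {"title": "D", "overview": "engineers race to reach space", "embedding": [0.8, 0.6]},
]
print(hybrid_search(rows, "space", [1.0, 0.0], 2))  # prints ['A', 'D']
```

Row C sits very close to the query vector but never reaches the ranking step because it fails the text predicate, while row B passes the predicate and is ranked last. The lexical filter decides what is eligible; the vector distance decides the order.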
Search and analytics, one query
The same index feeds aggregation directly. The search predicate selects a candidate set; the GROUP BY runs over it — in a single statement, on one engine:
SELECT genre, COUNT(*) AS matches, ROUND(AVG(vote_average), 2) AS avg_rating
FROM movies_idx
WHERE overview @@ 'space'
GROUP BY genre
ORDER BY matches DESC, genre;

      genre      | matches | avg_rating
-----------------+---------+------------
 Science Fiction |       2 |       8.05
 Animation       |       1 |        8.1
 Drama           |       1 |        8.1
Next steps
Start with the concepts, then dive into the search type you need:
- Inverted Index — how it works, the query model, mixing field types in one index
- Text Analysis — dictionaries, tokenizers and the index-time = query-time rule
- What to Index — columns, expressions, generated columns, JSON and VARIANT
Query types:
- Full-Text Search — the @@ operator, phrases, fuzzy, boolean queries
- Ranking — BM25 and other scorers, boosting, top-K / WAND
- Vector Search — HNSW indexing, kNN and range search
- Hybrid Search — combine full-text filters with vector ranking
- Geospatial Search — ST_* predicates over GeoJSON / GEOMETRY
Going further:
- Indexing External Data — zero-ETL search over a Parquet / CSV / JSON / Iceberg data lake
- Maintenance & Introspection — refresh, compaction and inspecting indexes
- Migrating from Elasticsearch — feature mapping, including aggregations
- AI Functions — generate embeddings with ai_embed