What makes SereneDB different

Search engines are built for retrieval. Analytical databases are built for aggregation. SereneDB is a search-OLAP database that combines a high-performance search engine with a vectorized analytical execution engine, accessible through PostgreSQL's SQL syntax and wire protocol.

The fastest open-source search engine

At the foundation is IResearch, a C++ information retrieval library that outperforms every major open-source search engine across all query types. IResearch sits at the core of SereneDB and shares the same data and memory with the rest of the system.

  • Full-text search — tokenization, stemming and normalization across 30+ languages via configurable text search dictionaries.
  • Phrase and proximity queries — exact phrases or terms within a specified distance.
  • Fuzzy search — typo-tolerant matching via Levenshtein distance and n-gram similarity.
  • Wildcard and prefix search — partial term matching without full table scans.
  • BM25 and TF-IDF ranking — industry-standard relevance scoring, tunable per query.
  • Search highlighting — return matching fragments with highlighted terms. (coming soon)
  • Vector and hybrid search — approximate nearest neighbor search for AI-powered applications.
  • Geospatial indexing — S2-based spatial queries.
  • Nested search — search within arrays of objects, preserving field relationships per array entry. (coming soon)
  • JSON indexing — all of the above search capabilities applied directly to JSON columns, without extracting fields into separate columns. (coming soon)
SELECT title, BM25() AS score
FROM articles_idx
WHERE PHRASE(body, 'distributed database')
ORDER BY BM25() DESC
LIMIT 10;
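
To make the relevance score in the query above more concrete, here is a minimal Python sketch of the textbook Okapi BM25 formula. It is an illustration only, not SereneDB's or IResearch's implementation: the function name `bm25_scores`, the whitespace tokenization, and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(docs, query, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25.

    k1 and b are common textbook defaults, assumed here for illustration.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "distributed database for search".split(),
    "a database is a database".split(),
    "cooking with cast iron".split(),
]
scores = bm25_scores(docs, ["distributed", "database"])
# Document indices ordered from most to least relevant.
ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

The first document ranks highest because it contains both query terms, and the rarer term ("distributed") contributes more than the common one. A document with no query terms scores zero.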

Analytical execution engine

Traditional databases are row-oriented — optimized for transactional workloads that read and write individual rows. Analytical queries are different: they scan millions of rows but only touch a few columns, computing aggregations, joins and rankings over large datasets.

SereneDB's query engine is built on Velox — a vectorized execution engine that stores and processes data in columnar format. Columns are compressed independently and scanned with SIMD instructions, so analytical queries read only the data they need. Aggregations, joins and window functions operate on search results at analytical speed.

SELECT category, COUNT(*) AS articles, AVG(BM25()) AS avg_relevance
FROM articles_idx
WHERE PHRASE(body, 'machine learning')
GROUP BY category
ORDER BY avg_relevance DESC;
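
The idea behind columnar processing can be sketched in a few lines of Python. This is a conceptual illustration, not how Velox is implemented: pure Python has no SIMD or compression, and the sample rows, column names and the helper `avg_by_group` are made up for the example. It only shows that a grouped average needs two columns and never touches the wide text column.

```python
from array import array
from collections import defaultdict

# Row layout: every field of an article is stored together.
rows = [
    ("databases", 2.0, "long article text"),
    ("ml", 1.0, "long article text"),
    ("databases", 4.0, "long article text"),
    ("ml", 3.0, "long article text"),
]

# Columnar layout: one contiguous sequence per column.
category = [r[0] for r in rows]
score = array("d", (r[1] for r in rows))  # packed 64-bit floats
body = [r[2] for r in rows]  # present in storage, never read by the query below


def avg_by_group(keys, values):
    """Return {key: (count, average)} by scanning only the two given columns."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for k, v in zip(keys, values):
        sums[k] += v
        counts[k] += 1
    return {k: (counts[k], sums[k] / counts[k]) for k in sums}


result = avg_by_group(category, score)
```

In a row layout the same aggregation would have to step over the article body of every row, which is the cost that a columnar engine avoids.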

Query and index remote data

Define external tables over Parquet, ORC or CSV files — on local disk or S3-compatible storage — and query them with the full power of the SQL engine.

You can also create inverted indexes over external tables — full-text search, phrase queries and relevance ranking over data that lives remotely, without importing a single row.

CREATE TABLE logs (
    timestamp TIMESTAMP,
    level TEXT,
    message TEXT
) USING EXTERNAL WITH (
    PATH = 's3://my-bucket/logs/*.parquet',
    FORMAT = 'parquet',
    S3_REGION = 'us-east-1'
);

CREATE INDEX logs_idx ON logs
USING inverted (timestamp, level, message log_dict);

SELECT timestamp, message, BM25() AS score
FROM logs_idx
WHERE PHRASE(message, 'connection timeout')
ORDER BY BM25() DESC;

See the Parquet cookbook for more details.
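
For readers unfamiliar with inverted indexes, the following Python sketch shows how a phrase query can be answered from a positional index instead of scanning every message. It is a simplified illustration under stated assumptions, not SereneDB's index format: the helpers `build_index` and `phrase_search`, the lowercase whitespace tokenizer, and the sample log messages are all hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents containing the phrase terms at consecutive positions."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return []
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = []
    for doc_id in sorted(candidates):
        later = [set(index[t][doc_id]) for t in terms[1:]]
        # The i-th following term must sit exactly i + 1 positions after the first.
        if any(
            all(start + i + 1 in later[i] for i in range(len(later)))
            for start in index[terms[0]][doc_id]
        ):
            hits.append(doc_id)
    return hits


messages = [
    "connection timeout while reading from upstream",
    "timeout reached, closing connection",
    "upstream connection timeout after 30s",
]
index = build_index(messages)
hits = phrase_search(index, "connection timeout")
```

The second message contains both words but not as an adjacent, ordered pair, so it is excluded. Only the index is consulted at query time; the original messages are not rescanned.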

PostgreSQL compatible

SereneDB uses PostgreSQL's SQL parser and speaks the PostgreSQL wire protocol. Connect with psql, DBeaver, DataGrip, Grafana or any PostgreSQL driver. If you know SQL, you know SereneDB.
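
As a sketch of what connecting through a standard driver might look like, the snippet below uses psycopg 3 (a third-party PostgreSQL driver for Python) to run the phrase query shown earlier. The connection settings in the usage comment are placeholders, and passing the phrase as a bound parameter to `PHRASE` is an assumption that has not been checked against SereneDB. The helpers `top_articles_query` and `run_search` are hypothetical names.

```python
def top_articles_query(phrase):
    """Return the SQL text and bound parameters for a ranked phrase search."""
    sql = (
        "SELECT title, BM25() AS score "
        "FROM articles_idx "
        "WHERE PHRASE(body, %s) "
        "ORDER BY BM25() DESC "
        "LIMIT 10"
    )
    return sql, (phrase,)


def run_search(conninfo, phrase):
    """Execute the query against a running server and return all rows."""
    # Imported lazily so the query builder works without the driver installed.
    import psycopg  # third-party: pip install "psycopg[binary]"

    sql, params = top_articles_query(phrase)
    with psycopg.connect(conninfo) as conn:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            return cur.fetchall()


# Usage, assuming a server is reachable with these placeholder settings:
# rows = run_search(
#     "host=localhost port=5432 dbname=postgres user=postgres",
#     "distributed database",
# )
```

Keeping the query text separate from execution makes it easy to swap in a different PostgreSQL driver.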

Single binary

SereneDB ships as serened — one binary, one process. Search indexes and columnar storage share the same data. Writes are immediately visible to both search and analytical queries.