Working with Parquet Files

SereneDB supports reading and writing Apache Parquet files — a columnar storage format widely used in data engineering and analytics pipelines.

Exporting to Parquet

Export a table to a Parquet file:

COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);

Export a filtered subset:

COPY (SELECT title, year, rating FROM movies WHERE rating > 8.0)
TO '/path/to/top_rated.parquet' WITH (FORMAT PARQUET);

Export specific columns:

COPY movies(title, year) TO '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);

Importing from Parquet

Load a Parquet file into an existing table:

COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET);

Filter rows during import:

COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET) WHERE year > 2010;

Import with column reordering:

COPY movies(year, title) FROM '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);

External tables

Instead of loading data into SereneDB, you can query Parquet files directly as external tables. The data stays in the file and is read on demand:

CREATE TABLE movies_external (
    title TEXT,
    year INTEGER,
    rating FLOAT
) USING EXTERNAL WITH (
    PATH = '/path/to/movies.parquet'
);

Query it like any other table:

SELECT title, rating
FROM movies_external
WHERE year > 2015
ORDER BY rating DESC
LIMIT 10;

SereneDB pushes filter predicates down to the Parquet file scan, so only matching data is read from disk.
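
To make the pushdown idea concrete, here is a minimal conceptual sketch in Python. It is not SereneDB's implementation, which this page does not describe. It only illustrates one common technique: Parquet files can store per-row-group min/max statistics in their metadata, and a scanner can use them to skip row groups that cannot satisfy a filter such as year > 2015. The RowGroup class and the scan_year_greater_than function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[str, int, float]  # (title, year, rating)


@dataclass
class RowGroup:
    """Hypothetical stand-in for one Parquet row group."""
    year_min: int   # min/max statistics, kept separately from the rows
    year_max: int
    rows: List[Row]

    @classmethod
    def from_rows(cls, rows: List[Row]) -> "RowGroup":
        years = [year for _, year, _ in rows]
        return cls(min(years), max(years), rows)


def scan_year_greater_than(groups: List[RowGroup], threshold: int) -> Tuple[List[Row], int]:
    """Return rows with year > threshold and the number of row groups actually read."""
    matches: List[Row] = []
    groups_read = 0
    for group in groups:
        # Pushdown step: if even the largest year fails the filter,
        # no row in this group can match, so its rows are never touched.
        if group.year_max <= threshold:
            continue
        groups_read += 1
        matches.extend(row for row in group.rows if row[1] > threshold)
    return matches, groups_read


groups = [
    RowGroup.from_rows([("A", 1999, 7.1), ("B", 2004, 8.2)]),
    RowGroup.from_rows([("C", 2012, 6.9), ("D", 2016, 8.8)]),
    RowGroup.from_rows([("E", 2018, 7.5), ("F", 2021, 9.0)]),
]
rows, groups_read = scan_year_greater_than(groups, 2015)
print([title for title, _, _ in rows])  # ['D', 'E', 'F']
print(groups_read)                      # 2
```

In this sketch the first row group is skipped using its statistics alone, while the second is read in full even though only one of its rows matches. Skipping of this kind works at row-group granularity.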

Reading from S3

SereneDB can read and write Parquet (and CSV) files directly from S3-compatible storage:

COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);

Export to S3:

COPY movies TO 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);

S3 authentication options

Option                       Description
S3_ACCESS_KEY                AWS access key ID (use together with S3_SECRET_KEY)
S3_SECRET_KEY                AWS secret access key (use together with S3_ACCESS_KEY)
S3_IAM_ROLE                  IAM role ARN (alternative to access/secret keys)
S3_USE_INSTANCE_CREDENTIALS  Use EC2 instance credentials (default: false)
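
If you generate COPY statements from application code, you may prefer not to hard-code the access key and secret key. The Python sketch below shows one possible way to assemble the statement from environment variables. The helper names quote_literal and build_s3_copy_from are hypothetical, the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variable names follow the usual AWS convention, and the sketch assumes that SereneDB accepts standard SQL escaping of single quotes (doubling them) inside option values.

```python
import os
from typing import Mapping


def quote_literal(value: str) -> str:
    """Wrap a value in single quotes, doubling embedded quotes (assumed standard SQL escaping)."""
    return "'" + value.replace("'", "''") + "'"


def build_s3_copy_from(
    table: str,
    s3_uri: str,
    region: str,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Render a COPY ... FROM statement using the option names documented above."""
    # Keep the table name to a plain identifier rather than attempting to quote it.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    options = [
        "FORMAT PARQUET",
        f"S3_ACCESS_KEY {quote_literal(env['AWS_ACCESS_KEY_ID'])}",
        f"S3_SECRET_KEY {quote_literal(env['AWS_SECRET_ACCESS_KEY'])}",
        f"S3_REGION {quote_literal(region)}",
    ]
    return (
        f"COPY {table} FROM {quote_literal(s3_uri)} WITH (\n    "
        + ",\n    ".join(options)
        + "\n);"
    )
```

Calling build_s3_copy_from('movies', 's3://my-bucket/data/movies.parquet', 'us-east-1') produces a statement in the same shape as the import example above, with the credentials taken from the environment.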

S3 connection options

Option                Description
S3_ENDPOINT           Custom endpoint for S3-compatible services (e.g., MinIO)
S3_REGION             AWS region
S3_PATH_STYLE_ACCESS  Use path-style URLs (default: true)
S3_SSL_ENABLED        Enable SSL (default: false)

S3-compatible services

For MinIO or other S3-compatible services, specify the endpoint:

COPY movies FROM 's3://my-bucket/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'minio-key',
    S3_SECRET_KEY 'minio-secret',
    S3_ENDPOINT 'localhost:9000'
);
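
The endpoint, path-style, and SSL options together determine which URL a request is sent to. The Python sketch below illustrates the general S3 addressing conventions (path-style versus virtual-hosted-style). It is an assumption about how these options combine, not SereneDB's actual code, and resolve_s3_url is a hypothetical helper. Its defaults mirror the documented defaults: path-style on, SSL off.

```python
from urllib.parse import urlsplit


def resolve_s3_url(
    s3_uri: str,
    endpoint: str,
    path_style: bool = True,
    ssl_enabled: bool = False,
) -> str:
    """Map an s3://bucket/key URI to the HTTP(S) URL implied by the connection options."""
    parts = urlsplit(s3_uri)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an s3:// URI: {s3_uri!r}")
    bucket = parts.netloc
    key = parts.path.lstrip("/")
    scheme = "https" if ssl_enabled else "http"
    if path_style:
        # Path-style: the bucket is the first path segment.
        return f"{scheme}://{endpoint}/{bucket}/{key}"
    # Virtual-hosted-style: the bucket becomes part of the host name.
    return f"{scheme}://{bucket}.{endpoint}/{key}"


print(resolve_s3_url("s3://my-bucket/movies.parquet", "localhost:9000"))
# http://localhost:9000/my-bucket/movies.parquet
```

Path-style addressing is the usual choice for a local MinIO instance, because virtual-hosted-style would require my-bucket.localhost to resolve as a host name.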

Search over Parquet files

Coming Soon

Inverted indexes over external Parquet tables are planned for a future release. This will allow you to run full-text search queries directly on Parquet files without importing them first.

Converting between formats

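COPY supports both Parquet and CSV, so a table can serve as the bridge between them. Import a file in one format, then export the table in the other:

-- CSV to Parquet: once the CSV data is loaded into movies, export it as Parquet
COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);

-- Parquet to CSV: once the Parquet data is loaded into movies, export it as CSV
COPY movies TO '/path/to/movies.csv' WITH (FORMAT CSV, HEADER TRUE);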