Working with Parquet Files
SereneDB supports reading and writing Apache Parquet files — a columnar storage format widely used in data engineering and analytics pipelines.
Exporting to Parquet
Export a table to a Parquet file:
COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);
Export a filtered subset:
COPY (SELECT title, year, rating FROM movies WHERE rating > 8.0)
TO '/path/to/top_rated.parquet' WITH (FORMAT PARQUET);
Export specific columns:
COPY movies(title, year) TO '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);
Importing from Parquet
Load a Parquet file into an existing table:
COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET);
Filter rows during import:
COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET) WHERE year > 2010;
Import with column reordering:
COPY movies(year, title) FROM '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);
External tables
Instead of loading data into SereneDB, you can query Parquet files directly as external tables. The data stays in the file and is read on demand:
CREATE TABLE movies_external (
    title TEXT,
    year INTEGER,
    rating FLOAT
) USING EXTERNAL WITH (
    PATH = '/path/to/movies.parquet'
);
Query it like any other table:
SELECT title, rating
FROM movies_external
WHERE year > 2015
ORDER BY rating DESC
LIMIT 10;
SereneDB pushes filter predicates down to the Parquet file scan, so only matching data is read from disk.
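One way to observe the pushdown is to look at the query plan. A minimal sketch, assuming SereneDB supports a conventional EXPLAIN statement (this page does not document its exact syntax or output):

```sql
-- Hypothetical: inspect the plan for an external-table query.
-- If predicate pushdown applies, the filter on year should appear
-- as part of the Parquet scan rather than a separate filter step.
EXPLAIN SELECT title, rating
FROM movies_external
WHERE year > 2015;
```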
Reading from and writing to S3
SereneDB can read Parquet (and CSV) files directly from S3-compatible storage, and write them back:
COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);
Export to S3:
COPY movies TO 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);
S3 authentication options
| Option | Description |
|---|---|
| S3_ACCESS_KEY | AWS access key ID (use together with S3_SECRET_KEY) |
| S3_SECRET_KEY | AWS secret access key (use together with S3_ACCESS_KEY) |
| S3_IAM_ROLE | IAM role ARN (alternative to access/secret keys) |
| S3_USE_INSTANCE_CREDENTIALS | Use EC2 instance credentials (default: false) |
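The table above lists S3_IAM_ROLE as an alternative to static keys. A sketch of how it would plug into the same WITH clause, assuming the option follows the key/value syntax of the key-based examples (the role ARN below is a placeholder):

```sql
-- Hypothetical: authenticate with an IAM role instead of static keys.
-- Replace the ARN with your own role.
COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_IAM_ROLE 'arn:aws:iam::123456789012:role/serenedb-reader',
    S3_REGION 'us-east-1'
);
```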
S3 connection options
| Option | Description |
|---|---|
| S3_ENDPOINT | Custom endpoint for S3-compatible services (e.g., MinIO) |
| S3_REGION | AWS region |
| S3_PATH_STYLE_ACCESS | Use path-style URLs (default: true) |
| S3_SSL_ENABLED | Enable SSL (default: false) |
S3-compatible services
For MinIO or other S3-compatible services, specify the endpoint:
COPY movies FROM 's3://my-bucket/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'minio-key',
    S3_SECRET_KEY 'minio-secret',
    S3_ENDPOINT 'localhost:9000'
);
Search over Parquet files
Inverted indexes over external Parquet tables are planned for a future release. This will allow you to run full-text search queries directly on Parquet files without importing them first.
Converting between formats
SereneDB uses the same COPY syntax for Parquet and CSV, so converting between formats is a matter of changing the FORMAT option:
-- CSV to Parquet
COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);
-- Parquet to CSV
COPY movies TO '/path/to/movies.csv' WITH (FORMAT CSV, HEADER TRUE);
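To convert a file on disk (rather than a table) from one format to the other, one approach is to stage it through a temporary table. A sketch, assuming CSV import accepts the same HEADER option shown for export, and that the file has the three columns below (movies_staging is a throwaway name):

```sql
-- Convert an on-disk CSV file to Parquet via a staging table.
-- Assumes the CSV has a header row and these three columns.
CREATE TABLE movies_staging (title TEXT, year INTEGER, rating FLOAT);
COPY movies_staging FROM '/path/to/movies.csv' WITH (FORMAT CSV, HEADER TRUE);
COPY movies_staging TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);
DROP TABLE movies_staging;
```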