Working with Parquet Files

SereneDB supports reading and writing Apache Parquet files — a columnar storage format widely used in data engineering and analytics pipelines.

Exporting to Parquet

Export a table to a Parquet file:

COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);

Export a filtered subset:

COPY (SELECT title, year, rating FROM movies WHERE rating > 8.0)
    TO '/path/to/top_rated.parquet' WITH (FORMAT PARQUET);

Export specific columns:

COPY movies(title, year) TO '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);

Importing from Parquet

Load a Parquet file into an existing table:

COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET);

Filter rows during import:

COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET) WHERE year > 2010;

Import with column reordering:

COPY movies(year, title) FROM '/path/to/movies_slim.parquet' WITH (FORMAT PARQUET);

External tables

Instead of loading data into SereneDB, you can query Parquet files directly as external tables. The data stays in the file and is read on demand:

CREATE TABLE movies_external (
    title TEXT,
    year INTEGER,
    rating FLOAT
) USING EXTERNAL WITH (
    PATH = '/path/to/movies.parquet'
);

Query it like any other table:

SELECT title, rating
FROM movies_external
WHERE year > 2015
ORDER BY rating DESC
LIMIT 10;

SereneDB pushes filter predicates down to the Parquet file scan, so the scan can skip data that cannot match the filter instead of reading the whole file from disk.
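
Because an external table can be queried like any other table, it can also serve as the source for loading a subset of a Parquet file into a regular table. A minimal sketch, assuming `INSERT ... SELECT` accepts an external table as its source and that `movies` has matching `title`, `year`, and `rating` columns:

```sql
-- Copy only recent titles from the Parquet-backed table into a regular table.
-- Assumption: INSERT ... SELECT works with an external table as the source.
INSERT INTO movies (title, year, rating)
SELECT title, year, rating
FROM movies_external
WHERE year > 2015;
```

The `WHERE` clause is the same kind of filter used in the query above, so it should be eligible for pushdown to the file scan.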

Reading from S3

SereneDB can read and write Parquet (and CSV) files directly on S3-compatible storage. Import from S3:

COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);

Export to S3:

COPY movies TO 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);
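
S3 storage is described above as working for CSV files as well. A sketch of a CSV export to S3, assuming `FORMAT CSV` and `HEADER TRUE` can be combined with the S3 options in a single `WITH` list (the bucket name and keys are placeholders):

```sql
-- Export the table to S3 as CSV with a header row.
-- Assumption: CSV options and S3 options can share one WITH list.
COPY movies TO 's3://my-bucket/data/movies.csv' WITH (
    FORMAT CSV,
    HEADER TRUE,
    S3_ACCESS_KEY 'your-access-key',
    S3_SECRET_KEY 'your-secret-key',
    S3_REGION 'us-east-1'
);
```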

S3 authentication options

Option	Description
`S3_ACCESS_KEY`	AWS access key ID (use together with `S3_SECRET_KEY`)
`S3_SECRET_KEY`	AWS secret access key (use together with `S3_ACCESS_KEY`)
`S3_IAM_ROLE`	IAM role ARN (alternative to access/secret keys)
`S3_USE_INSTANCE_CREDENTIALS`	Use EC2 instance credentials (default: `false`)
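
As a sketch of the alternatives to static keys listed in the table, the statements below use `S3_IAM_ROLE` and `S3_USE_INSTANCE_CREDENTIALS`. The role ARN is a placeholder, and the boolean value syntax (`TRUE`) is an assumption modeled on `HEADER TRUE`, so verify both against your deployment:

```sql
-- Assume an IAM role instead of passing access/secret keys.
-- The ARN below is a placeholder, not a real role.
COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_IAM_ROLE 'arn:aws:iam::123456789012:role/example-role',
    S3_REGION 'us-east-1'
);

-- Use the credentials attached to the EC2 instance running SereneDB.
-- Assumption: boolean options are written as TRUE/FALSE, like HEADER TRUE.
COPY movies FROM 's3://my-bucket/data/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_USE_INSTANCE_CREDENTIALS TRUE,
    S3_REGION 'us-east-1'
);
```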

S3 connection options

Option	Description
`S3_ENDPOINT`	Custom endpoint for S3-compatible services (e.g., MinIO)
`S3_REGION`	AWS region
`S3_PATH_STYLE_ACCESS`	Use path-style URLs (default: `true`)
`S3_SSL_ENABLED`	Enable SSL (default: `false`)

S3-compatible services

For MinIO or other S3-compatible services, specify the endpoint:

COPY movies FROM 's3://my-bucket/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'minio-key',
    S3_SECRET_KEY 'minio-secret',
    S3_ENDPOINT 'localhost:9000'
);
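
For a local MinIO instance without TLS, it may help to state the connection options explicitly rather than rely on defaults. A sketch, assuming boolean options take `TRUE`/`FALSE` in the same way `HEADER` does:

```sql
-- Same import as above, with the connection options spelled out.
COPY movies FROM 's3://my-bucket/movies.parquet' WITH (
    FORMAT PARQUET,
    S3_ACCESS_KEY 'minio-key',
    S3_SECRET_KEY 'minio-secret',
    S3_ENDPOINT 'localhost:9000',
    S3_PATH_STYLE_ACCESS TRUE,  -- bucket goes in the URL path, not the hostname
    S3_SSL_ENABLED FALSE        -- plain HTTP; set to TRUE if the endpoint uses TLS
);
```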

Search over Parquet files

Coming Soon

Inverted indexes over external Parquet tables are planned for a future release. This will allow you to run full-text search queries directly on Parquet files without importing them first.

Converting between formats

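Parquet and CSV are interchangeable as import and export formats. To convert a file, load it into a table from one format and export the table in the other:

-- CSV to Parquet: load the CSV file, then export as Parquet
COPY movies FROM '/path/to/movies.csv' WITH (FORMAT CSV, HEADER TRUE);
COPY movies TO '/path/to/movies.parquet' WITH (FORMAT PARQUET);

-- Parquet to CSV: load the Parquet file, then export as CSV
COPY movies FROM '/path/to/movies.parquet' WITH (FORMAT PARQUET);
COPY movies TO '/path/to/movies.csv' WITH (FORMAT CSV, HEADER TRUE);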
