Encryption
SereneDB supports reading and writing encrypted Parquet files. Its implementation broadly follows the Parquet Modular Encryption specification, with the limitations listed below.
Reading and Writing Encrypted Files
The PRAGMA add_parquet_key function adds a named encryption key of 128, 192, or 256 bits to the session. These keys are stored in memory:
PRAGMA add_parquet_key('key128', '0123456789112345');
PRAGMA add_parquet_key('key192', '012345678911234501234567');
PRAGMA add_parquet_key('key256', '01234567891123450123456789112345');
PRAGMA add_parquet_key('key256base64', 'MDEyMzQ1Njc4OTExMjM0NTAxMjM0NTY3ODkxMTIzNDU=');
Success
Success
Success
Success
Writing Encrypted Parquet Files
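The example keys above are readable strings, which is convenient for documentation but not suitable for real data. As a minimal sketch in Python (not part of SereneDB), the helper below generates a random key of a supported size and formats the corresponding PRAGMA statement. The names new_key_b64 and add_key_statement are hypothetical, and the sketch assumes that add_parquet_key accepts a base64-encoded key, as the key256base64 example suggests.

```python
import base64
import secrets

# Key sizes named in the text above, in bits.
SUPPORTED_BITS = (128, 192, 256)


def new_key_b64(bits: int = 256) -> str:
    """Return a cryptographically random key of `bits` bits, base64-encoded."""
    if bits not in SUPPORTED_BITS:
        raise ValueError(f"key size must be one of {SUPPORTED_BITS}, got {bits}")
    return base64.b64encode(secrets.token_bytes(bits // 8)).decode("ascii")


def add_key_statement(name: str, key_b64: str) -> str:
    """Build a PRAGMA add_parquet_key statement (hypothetical helper).

    Assumption: the key may be passed base64-encoded, as in the
    key256base64 example above.
    """
    # Restrict the name to identifier characters so it cannot break the quoting.
    if not name.isidentifier():
        raise ValueError(f"unsafe key name: {name!r}")
    # Decode to check that the key really has a supported length.
    raw = base64.b64decode(key_b64, validate=True)
    if len(raw) * 8 not in SUPPORTED_BITS:
        raise ValueError(f"decoded key is {len(raw) * 8} bits")
    return f"PRAGMA add_parquet_key('{name}', '{key_b64}');"


if __name__ == "__main__":
    print(add_key_statement("key256", new_key_b64(256)))
```

The printed statement can be run in the session to register the key under the chosen name.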
After specifying the key (e.g., key256), files can be encrypted as follows:
COPY tbl TO 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'});
Reading Encrypted Parquet Files
A Parquet file encrypted with a specific key (e.g., key256) can then be read as follows:
COPY tbl FROM 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'});
Or:
SELECT * FROM read_parquet('tbl.parquet', encryption_config = {footer_key: 'key256'});
 id | name  | value
----+-------+-------
  1 | alpha |    10
  2 | beta  |    20
Interoperability
SereneDB can read uniformly encrypted Parquet files written by the Arrow C++ API (e.g., via PyArrow), as long as the same encryption key is used for both the footer and all columns.
Limitations
SereneDB's Parquet encryption currently has the following limitations.
SereneDB encrypts the footer and all columns using the footer_key. The Parquet specification allows encryption of individual columns with different keys, e.g.:
COPY tbl TO 'tbl.parquet' (ENCRYPTION_CONFIG { footer_key: 'key256', column_keys: {key256: ['col0', 'col1']} });
db error: ERROR: Parquet encryption_config column_keys not yet implemented
However, per-column keys are not supported at the moment, and specifying column_keys raises the error shown above.
Performance Implications
Encryption adds overhead: reading and writing encrypted Parquet files is slower than reading and writing their unencrypted equivalents.