Skip to content

DuckDB

DuckDB is the right connection when you want a full SQL surface over local files — joins across CSVs, materialised intermediate tables, custom views — without running a server.

analytics:
driver: duckdb
dsn: ./data/analytics.duckdb # or :memory: for ephemeral
allowed_users: [analysts]
read_only: true # default
FormMeaning
./path/to/file.duckdbA persistent DuckDB database file (created on first use).
:memory:In-memory database. Resets on restart; useful for purely file-driven dashboards.

Relative paths resolve from the dashboard folder.

DuckDB has no notion of users; PlotPress’s user-scoping is enforced at the connection level via allowed_users. Restrict access to the database file with OS permissions — typically the PlotPress process owns it and nothing else can read it.

Define views once via DuckDB SQL:

-- one-time setup, e.g. via `duckdb data/analytics.duckdb`
CREATE VIEW monthly_revenue AS
SELECT
date_trunc('month', invoice_date) AS month,
sum(amount) AS revenue,
currency
FROM read_parquet('./invoices/*.parquet')
GROUP BY 1, 3;

PlotPress then runs SELECT * FROM monthly_revenue WHERE year = ? against the connection. Plot blocks reference the view name:

```plot view=monthly_revenue
Plot.barY(data, { x: "month", y: "revenue" })
```

DuckDB’s read_csv, read_parquet, read_json_auto, and read_ndjson functions work in views directly:

CREATE VIEW orders AS
SELECT * FROM read_csv('./data/orders.csv', header=true);
CREATE VIEW customers AS
SELECT * FROM read_parquet('./data/customers.parquet');
CREATE VIEW joined AS
SELECT o.*, c.segment
FROM orders o LEFT JOIN customers c USING (customer_id);

This is what makes DuckDB different from the Files connection: you get a SQL layer on top.

DuckDB’s extensions install on first use; PlotPress preloads the common ones:

ExtensionWhat it adds
httpfsread_parquet('s3://...'), read_csv('https://...')
sqliteATTACH 'app.sqlite' AS app (TYPE SQLITE); then query app tables
postgresATTACH '...' AS pg (TYPE POSTGRES); for live Postgres reads
parquetAlways available; column projection + predicate pushdown
jsonAlways available; JSON path expressions
excelOptional; reads .xlsx directly

To enable an extension at boot, add it to the connection’s extensions list:

analytics:
driver: duckdb
dsn: ./data/analytics.duckdb
extensions: [httpfs, sqlite]
allowed_users: [analysts]

Drop a parameterised SQL file in queries/ for one-off slices:

-- queries/top_customers.sql
-- @param limit int default 20
SELECT customer_id, sum(amount) AS revenue
FROM orders
GROUP BY customer_id
ORDER BY revenue DESC
LIMIT :limit;

:name rewrites to DuckDB’s ? placeholder.

  • Single-writer. read_only: true is the default and the strongly recommended mode. If you must write (materialised aggregates), do it from a separate ingest job, not from inside PlotPress.
  • Memory. DuckDB is fast because it does work in memory. Large joins against multi-GB Parquet files want memory_limit set: extensions: [httpfs]; memory_limit: 4GB in the connection.
  • File locking. A persistent .duckdb opened by PlotPress can be opened simultaneously only in read-only mode by other processes. Don’t share the same file with a running ingest job.
  • Versions. PlotPress pins a DuckDB version; databases written by a newer DuckDB CLI can fail to open. Keep the CLI you use for setup at the same major version as the embedded one (plotpress version).