DuckDB in your SaaS stack: what it replaces, what it doesn't, and the multi-tenant pattern that holds up
Why per-customer Parquet on S3 beats shared DuckDB instances for tenant isolation — and what breaks when you scale it
Your cloud data warehouse line item grows every quarter, even when your query volume holds steady. For most B2B SaaS companies, the growth is predictable: more customers, more data, more storage. But the larger cost driver is how cloud warehouses price compute, and how much of it you pay for without using. DuckDB for SaaS analytics changes this calculation. Not by being faster at everything, and not in the way most tutorials describe.
This piece covers what DuckDB actually is, where the cost advantage is real, and the one deployment pattern that holds up when you have thousands of customers whose data cannot touch each other.
What DuckDB actually is (and the single constraint that shapes everything)
DuckDB is an in-process analytical database. It runs inside your application process, not as a separate server. There is no socket to connect to, no authentication layer to configure, no connection pool to manage. You open a file or an in-memory instance, run SQL, read results, close it. The latency profile is fundamentally different from a network-connected database, even one running on localhost.
It uses columnar storage and vectorised query execution. These two properties make it fast for analytical workloads: aggregations over millions of rows, joins across wide tables, groupings on high-cardinality columns. They also make it slower than Postgres for point lookups and transactional writes. This is not a replacement for your OLTP database. It is a replacement for what you would otherwise send to BigQuery or Snowflake.
The constraint that shapes every deployment decision: DuckDB has a single-writer model. Two processes cannot open the same .duckdb file in read-write mode simultaneously. The second gets a lock error. Understanding this upfront prevents the most common architecture mistake, which is treating DuckDB like Postgres and discovering the constraint under production load.
Where the cost advantage actually comes from
Cloud data warehouses charge for compute and storage separately. BigQuery on-demand pricing is $6.25 per terabyte scanned. A query that touches ten 1-GB tables scans 10 GB regardless of which rows it needs. Snowflake bills per-second for virtual warehouse uptime, so a warehouse that processes one query per hour is still billing between queries unless you configure aggressive auto-suspend.
DuckDB queries Parquet files directly. Storage costs are S3 rates, roughly $0.023 per GB per month. Compute is whatever is already running your query handler: a Lambda function, an EC2 instance, a container handling other requests. There is no separate warehouse sitting idle.
At 20 GB of Parquet data per customer, S3 storage runs approximately $0.46 per customer per month. A hundred-customer SaaS at that data size spends under $50 a month on analytics storage. The same dataset in a Snowflake XS warehouse, refreshed nightly, costs substantially more once you account for warehouse compute time and storage overhead.
The crossover point is not fixed. It depends on query complexity, refresh frequency, and what compute you already provision. Below roughly 100 GB per customer, with batch or near-real-time refresh measured in minutes, the per-customer Parquet pattern reliably undercuts managed warehouse pricing. Above 500 GB per customer, or where cross-customer aggregate queries are primary, cloud warehouses earn back the advantage through optimised storage formats and query planning at scale.
Three deployment patterns for SaaS analytics
Teams reach for three architectures when adding DuckDB to a SaaS product. They differ in how customer data is stored, where the DuckDB process runs, and how tenant isolation is enforced.
| Pattern | Storage | Isolation strength | Single-writer impact | Practical ceiling |
|---|---|---|---|---|
| A — per-customer .duckdb file on disk | Local disk | File-level | Per-file; manageable | ~500 active customers before file-handle pressure |
| B — shared instance, schema per tenant | Single .duckdb file | Schema-level (soft) | Global; one tenant blocks all | Prototyping only |
| C — per-customer Parquet on S3 | S3 prefixes | Prefix-level (hard) | Not applicable | Tens of thousands of customers |
Pattern A uses one .duckdb file per customer on local disk. Isolation is solid: each customer's database is a separate file, and backup is a file copy. The problems emerge at scale. You are pinning each customer to a specific machine, file-handle counts become a concern at several thousand active customers, and there is no natural path to horizontal scaling.
Pattern B puts all customers in one shared DuckDB instance, separated by schema. This looks familiar from Postgres but behaves differently. DuckDB has no row-level security. A query missing a tenant filter reads across all schemas. The single-writer limitation applies globally: one customer's ETL job can block another customer's writes. This pattern is useful for local development and almost never appropriate for production.
Pattern C uses per-customer Parquet files on S3, with an ephemeral in-process DuckDB instance created and destroyed per query. Customer data never coexists in the same DuckDB instance. The single-writer constraint becomes irrelevant because each query creates its own instance and reads read-only from S3. This is the pattern that holds up.
The multi-tenant pattern in practice
In the per-customer Parquet model, each customer's data lives as Parquet files under an isolated S3 prefix: s3://your-bucket/tenant-{id}/. When a customer runs a report, you instantiate a DuckDB connection in memory, point it at that customer's prefix, run the query, return the result, and discard the connection. No state persists between requests.
Tenant isolation is enforced at the file-system level. A DuckDB instance pointed at one customer's S3 prefix cannot read another's prefix unless your code explicitly attaches it. There is no access control policy to misconfigure and no security rule to forget. The isolation is structural.
Data residency follows naturally. EU customers' Parquet files go in an eu-west-1 bucket. Indian customers' files go in ap-south-1. The DuckDB instance running the query runs in the same region. Data does not cross geographical boundaries to answer a query.
import duckdb
def run_customer_query(tenant_id: str, query: str) -> list[dict]:
"""
Each call creates and destroys an in-memory DuckDB instance.
No state persists between requests; no tenant data can bleed across.
"""
s3_prefix = f"s3://analytics-data/{tenant_id}/"
con = duckdb.connect(database=":memory:")
con.execute("INSTALL httpfs; LOAD httpfs;")
con.execute("SET s3_region = 'ap-south-1';")
# Credentials via environment or IAM role — not hardcoded.
# This instance only sees this customer's files.
con.execute(
f"CREATE VIEW events AS "
f"SELECT * FROM read_parquet('{s3_prefix}events/*.parquet');"
)
con.execute(
f"CREATE VIEW orders AS "
f"SELECT * FROM read_parquet('{s3_prefix}orders/*.parquet');"
)
result = con.execute(query).fetchdf()
con.close()
return result.to_dict(orient="records")The write side requires serialisation. Because you write to Parquet files rather than to a live database, the standard approach is a scheduled ETL step: pull new records from your OLTP database, write or append to Parquet, make the updated files available for query. If multiple services produce data for the same customer simultaneously, route writes through a queue to serialise them at the Parquet layer. Concurrent Parquet writes from two processes corrupt the file, which is a Parquet constraint rather than a DuckDB one.
What breaks
Real-time freshness. DuckDB is batch-oriented. Data is available for query once written to Parquet, typically after a refresh interval of minutes. If your product promises sub-second analytics freshness, the toolchain is different: a streaming database such as Materialize or RisingWave, or a real-time OLAP system such as ClickHouse or Apache Druid. DuckDB is not competing with those systems.
Cross-tenant aggregates. The per-customer Parquet isolation model is an asset for data residency and an obstacle if your product requires aggregate views across all customers. Computing these requires querying every customer's prefix in sequence or in parallel. This is feasible at small customer counts and a scheduling challenge at tens of thousands.
Schema evolution at scale. Parquet encodes schema in the file. Changing a column type or adding a non-nullable column requires a migration that reads old files and writes new ones. At 5 GB per customer this is fast. At 50 GB per customer with 5,000 customers it is a multi-hour background job. Plan schema changes before your dataset grows large.
Very large per-customer datasets. The cost advantage depends on DuckDB scanning files from S3 efficiently. Above roughly 500 GB per customer, network transfer costs and scan time over remote storage start to erode the advantage. Cloud warehouses with co-located compute earn back their premium at that scale.
The operational surface you take on
Cloud data warehouses include backup, schema versioning, query history, and audit logging. DuckDB on S3 does not. What you take on:
- Backup. With the per-customer Parquet pattern, enable S3 versioning on the bucket. It handles point-in-time recovery automatically at a fraction of additional storage cost.
- Schema migrations. There is no built-in migration system. Write timestamped migration scripts that read a customer's current Parquet, apply the transformation, write the new version, and record completion in a metadata table. Run across customers in parallel; track failures per customer.
- Version upgrades. DuckDB minor versions are generally backward-compatible; major versions have occasionally changed the on-disk format. Pin the version in your package file, test upgrades on a copy of production data before deploying, and treat version migrations with the same care as database schema migrations.
- Monitoring. DuckDB has no built-in metrics endpoint. Wrap queries with wall-clock timing and log slow queries. Track p50 and p99 query latency per query type per customer, and alert when a customer's queries slow down persistently: it usually signals Parquet files that need recompaction.
DuckDB for SaaS analytics: when it fits, and when to wait
DuckDB with per-customer Parquet is worth reaching for when: per-customer data is under 100 GB; analytics freshness can be measured in minutes rather than milliseconds; multi-tenant data isolation is a compliance or contractual requirement; and your current cloud warehouse spend is driven primarily by headroom rather than compute actually used.
It is less suited when: your product requires sub-second analytics freshness; cross-tenant aggregate queries are a primary use case; per-customer datasets are large enough that S3 scan costs become material; or your team is not prepared to own backup, schema migration, and monitoring for an additional data layer. Taking on less operational complexity in exchange for higher managed-service costs is a reasonable trade early on.
Where DuckDB earns its place is the analytics tier that most B2B SaaS products end up building: per-customer dashboards, export features, scheduled reports, embedded analytics on customer-facing pages. At data sizes where cloud warehouse pricing is mostly headroom, the per-customer Parquet pattern cuts the bill significantly and gives stronger tenant isolation than schema-level separation provides.
Frequently asked questions
Related reading
Rate limiting in production: the four algorithms and their failure modes
Most services reach for a token bucket and never look further. Rate limiting is four distinct algorithms with different burst behaviours and failure modes — here is what each one actually protects against.
Distributed service timeouts: the three production failure modes your 30-second default doesn't prevent
Most service timeouts are arbitrary numbers set once and never revisited. Here are the three failure modes that escape a misconfigured timeout — and how to calculate values that account for them.
Zero-downtime Postgres schema migrations: what every DDL operation does under the hood
Most Postgres schema changes that cause outages aren't dangerous by nature — they're dangerous because the table is large. Here's what lock each DDL operation takes and the exact patterns to make them safe.