# DuckDB vs BigQuery: When Local Wins
DuckDB is faster and cheaper than BigQuery for most workloads that fit on one machine. Here's an honest comparison with real cost calculations.
BigQuery is excellent at what it does — serverless distributed SQL at petabyte scale. But most businesses don't have petabyte-scale data, and they're paying Google cloud prices for a workload that DuckDB could handle for free on a laptop.
This is an honest comparison. BigQuery wins in specific scenarios. DuckDB wins in many others. Here's how to think about the decision.
## The Cost Question
Let's start with money, because it drives most technology decisions.
BigQuery pricing:
- Storage: $0.02/GB/month
- Queries: $6.25 per TB scanned (on-demand)
- Flat-rate: $2,000+/month minimum for committed slots
A team scanning 500GB per day pays roughly $94/month in on-demand query charges on BigQuery (500GB × 30 days = 15TB, × $6.25/TB ≈ $94).
DuckDB pricing:
- $0. It's open source.
The compute cost is your hardware (already owned) or a small cloud VM ($20-100/month if you need one).
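To make the on-demand math concrete, here is a small sketch of a cost estimator. The rates are the article's figures ($0.02/GB/month storage, $6.25/TB scanned); the function name and the 500GB/day scenario are illustrative, and Google's current pricing page is the authority:

```python
# Hypothetical helper to sanity-check BigQuery on-demand pricing.
# Rates taken from the figures above; they change, so treat as an estimate.

def bigquery_monthly_cost(storage_gb: float, scanned_gb_per_day: float,
                          days: int = 30) -> float:
    """Estimated monthly USD cost under the on-demand model."""
    storage_cost = storage_gb * 0.02                        # $0.02 per GB/month
    query_cost = scanned_gb_per_day * days / 1000 * 6.25    # $6.25 per TB scanned
    return storage_cost + query_cost

# 50GB stored, 500GB scanned per day:
print(round(bigquery_monthly_cost(50, 500), 2))  # → 94.75
```

Plugging in your own scan volume is usually more revealing than any benchmark: query cost scales linearly with bytes scanned, so a single `SELECT *` habit across the team can dominate the bill.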
## The Performance Question
BigQuery is a distributed system: each query fans out across many workers. This is fast for large queries but adds startup latency for small ones.
DuckDB runs on a single machine. It uses all available CPU cores and RAM. For datasets under a few hundred GB, this is faster than BigQuery.
Approximate query times on a 50GB dataset:
| Query Type | BigQuery | DuckDB (local) |
|---|---|---|
| Simple aggregation | 2-5s | 0.5-2s |
| Complex join | 5-15s | 1-5s |
| Full table scan | 3-8s | 2-6s |
| Small filtered query | 1-3s | 0.1-0.5s |
BigQuery's distributed architecture helps when your data is terabytes. Under a few hundred GB, DuckDB is consistently faster.
## Feature Comparison
| Feature | DuckDB | BigQuery |
|---|---|---|
| Setup | 1 command | Google Cloud account + project |
| Data location | Your machine | Google's servers |
| Max dataset size | Single machine | Effectively unlimited |
| SQL dialect | PostgreSQL-compatible | GoogleSQL (mostly standard) |
| Concurrent users | Limited (single-writer) | Many |
| Streaming ingest | No | Yes (Streaming API) |
| ML integration | Via Python | BigQuery ML (SQL-based) |
| IAM/access control | File permissions | Google IAM |
| Compliance | You control | Google's certifications |
| Versioning/time travel | No (manual) | 7-day time travel |
| Cost | $0 | Pay-per-query or committed |
## When BigQuery Wins
1. Your data is in Google Cloud already
If your application runs on GCP and your data is in Cloud Storage or Pub/Sub, BigQuery is the natural choice. The transfer costs and latency of pulling that data down to a local DuckDB instance usually aren't worth it.
2. You need multi-user concurrent access
BigQuery handles hundreds of concurrent analysts without any configuration. DuckDB's single-writer limitation makes it awkward for large teams.
3. Your dataset is genuinely large (hundreds of GB or more)
Once your compressed data exceeds what fits comfortably on one machine, BigQuery's distributed architecture earns its cost.
4. You need point-in-time recovery
BigQuery's 7-day time travel lets you query historical snapshots. DuckDB requires manual backup discipline.
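That "manual backup discipline" can be as simple as timestamped copies of the database file. A hypothetical snapshot routine (the paths are illustrative; it assumes a single-file database with no writer running, since copying mid-write can capture an inconsistent file):

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path

def snapshot(db_path: str, backup_dir: str = "backups") -> Path:
    """Copy a DuckDB database file to backups/<name>-<UTC timestamp>.duckdb."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = Path(backup_dir) / f"{Path(db_path).stem}-{stamp}.duckdb"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(db_path, dest)  # preserves mtime for auditability
    return dest
```

Run it from cron before your nightly refresh and you get crude point-in-time recovery; it is not BigQuery's queryable time travel, but it covers the "restore yesterday's state" case.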
5. Compliance requires managed cloud storage
If your compliance requirements mandate data residency in a certified cloud environment, BigQuery's SOC2/ISO27001 certifications may be required.
## When DuckDB Wins
1. Your data fits on one machine
Most business analytics — CRM data, product metrics, financial models — fits easily in a few GB. DuckDB handles this faster than BigQuery and for free.
2. You need offline or local-first analytics
DuckDB works without internet. You can query your data on a plane. BigQuery requires a network connection.
3. Privacy matters
Your data stays on your machine with DuckDB. It never leaves. With BigQuery, you're trusting Google with your most sensitive business data.
4. You want to query files directly
DuckDB can query CSV and Parquet files directly without loading them:
```sql
SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet');
```

BigQuery can define external tables over files, but the common path is loading data into managed tables first.
5. Startup economics
$0/month vs $500-3,000/month makes a real difference when you're pre-revenue.
## The Hybrid Architecture
Many teams use both:
```
BigQuery  (master data store, compliance, shared access)
    ↓
DuckDB    (local analysis, ML feature engineering, dashboards)
```
Export subsets from BigQuery to Parquet, analyze locally with DuckDB:
```bash
# Export from BigQuery to GCS as Parquet
bq extract --destination_format=PARQUET my-project:dataset.table 'gs://bucket/export/*.parquet'

# Download and query with DuckDB
gsutil cp -r gs://bucket/export/ ./data/
duckdb -c "SELECT * FROM read_parquet('data/*.parquet') LIMIT 10"
```

## DenchClaw's Approach
DenchClaw uses DuckDB for all data storage. The philosophy is explicit: your data belongs on your machine, not in Google's cloud.
For a personal CRM, a startup's pipeline, or a team's internal analytics, the BigQuery use case simply doesn't apply. You don't need distributed query execution across 1,000 nodes to answer "how many deals did we close this month?"
DuckDB answers that in 10ms from a local file. BigQuery answers it in 2 seconds and charges you for the privilege.
## Real Cost Analysis for Startups
A typical seed-stage startup's analytics workload:
- 50GB of event data
- 500 queries/day across 5 analysts
- Data refreshed nightly from production
BigQuery monthly cost:
- Storage: 50GB × $0.02 = $1/month
- Queries: 500 queries × avg 5GB scanned = 2.5TB/day × $6.25/TB ≈ $15.60/day ≈ $469/month
- Total: ~$470/month
DuckDB monthly cost:
- $0 (runs on existing hardware)
- Optional: one EC2 t3.medium (~$30/month) if you need a shared server
Savings: $440-470/month, roughly $5,300-5,600/year
That's real money for a startup. And for most startups' data sizes, DuckDB is actually faster.
## Frequently Asked Questions
### Can DuckDB query BigQuery data directly?
Not natively. You can export BigQuery data to GCS as Parquet, then query it with DuckDB's httpfs extension.
### Does DuckDB support streaming data like BigQuery?
No. DuckDB is designed for batch analytics. For streaming data, use a streaming system (Kafka, Pub/Sub) and then query historical data with DuckDB.
### What happens when my data outgrows DuckDB?
Migrate to ClickHouse (self-hosted, cheaper) or BigQuery (managed, familiar SQL). Parquet format makes migration straightforward.
### Is BigQuery HIPAA-compliant?
Yes, with a BAA. DuckDB's compliance posture depends on where you deploy it. For regulated industries, consult your compliance team.
### Can I use DuckDB in Google Cloud?
Yes. Run DuckDB on a GCP Compute Engine instance. Use httpfs to access GCS files directly.
Ready to try DenchClaw? Install in one command: `npx denchclaw`. Full setup guide →
