
DuckDB for Business Intelligence Without a Data Warehouse

DuckDB can replace a full data warehouse for most BI workloads. Here's how to build dashboards, reports, and analytics without Snowflake or BigQuery.

Mark Rachapoom
·6 min read

Most companies don't need Snowflake. They need fast answers to business questions, and DuckDB can provide them from a file on your laptop — no data warehouse, no monthly bill, no infrastructure team.

This guide shows you how to use DuckDB as your BI layer: connecting it to your data sources, building the queries that power dashboards, and serving results to visualization tools.

The Problem with Traditional BI Stacks#

The classic BI stack looks like this:

Source Systems → ETL Pipeline → Data Warehouse → BI Tool → Dashboards

Every component has a cost:

  • Data warehouse (Snowflake/BigQuery): $300-3,000+/month
  • ETL tool (Fivetran/dbt Cloud): $500-2,000+/month
  • BI tool (Looker/Tableau): $1,500-6,000+/seat/year
  • Total: $5,000-20,000+/month for a mid-sized team

For startups and small teams, this is prohibitive. And for most analytical workloads, it's also unnecessary.

The DuckDB BI Stack#

Source Systems → DuckDB → Dashboards
  • DuckDB: Free (open source)
  • Visualization: Grafana (free), Evidence (free), Metabase (free self-hosted), or custom HTML/JS
  • ETL: SQL queries in DuckDB, scheduled with cron
  • Total cost: $0 (beyond compute)

Connecting DuckDB to Your Data Sources#

CSV and Parquet Files#

-- Query CSV directly
SELECT * FROM read_csv_auto('sales_data.csv');
 
-- Query multiple Parquet files (s3:// paths require the httpfs extension)
SELECT * FROM read_parquet('s3://my-bucket/events/2026/*.parquet');
 
-- Create a view over a file
CREATE VIEW monthly_sales AS
SELECT 
    DATE_TRUNC('month', sale_date) AS month,
    region,
    SUM(amount) AS revenue
FROM read_csv_auto('sales_data.csv')
GROUP BY month, region;

PostgreSQL (Production Database)#

INSTALL postgres;
LOAD postgres;
 
ATTACH 'host=prod-db.internal dbname=app user=analytics' AS prod (TYPE POSTGRES);
 
-- Create a local cache table
CREATE TABLE customers_cache AS SELECT * FROM prod.customers;
 
-- Or query live (slower but always fresh)
SELECT * FROM prod.customers WHERE created_at > '2026-01-01';

REST APIs via httpfs#

INSTALL httpfs;
LOAD httpfs;
 
-- Query a JSON API
SELECT * FROM read_json_auto('https://api.example.com/v1/metrics?period=monthly');

Google Sheets (via CSV export URL)#

-- Google Sheets published as CSV
SELECT * FROM read_csv_auto(
    'https://docs.google.com/spreadsheets/d/YOUR_ID/export?format=csv'
);

Building a BI Schema#

Structure your DuckDB BI database with a simple star schema:

-- Fact table
CREATE TABLE fact_revenue (
    date DATE,
    product_id VARCHAR,
    customer_id VARCHAR,
    region VARCHAR,
    amount DECIMAL(10,2),
    units INTEGER
);
 
-- Dimension tables
CREATE TABLE dim_product (
    product_id VARCHAR PRIMARY KEY,
    name VARCHAR,
    category VARCHAR,
    unit_price DECIMAL(10,2)
);
 
CREATE TABLE dim_customer (
    customer_id VARCHAR PRIMARY KEY,
    name VARCHAR,
    segment VARCHAR,
    country VARCHAR
);
 
CREATE TABLE dim_date (
    date DATE PRIMARY KEY,
    year INTEGER,
    quarter INTEGER,
    month INTEGER,
    week INTEGER,
    day_of_week INTEGER,
    is_weekend BOOLEAN
);

Populate dim_date once:

INSERT INTO dim_date
SELECT 
    d::DATE AS date,  -- generate_series yields timestamps; cast back to DATE
    YEAR(d) AS year,
    QUARTER(d) AS quarter,
    MONTH(d) AS month,
    WEEK(d) AS week,
    DAYOFWEEK(d) AS day_of_week,
    DAYOFWEEK(d) IN (0, 6) AS is_weekend  -- DuckDB: Sunday = 0, Saturday = 6
FROM generate_series(DATE '2020-01-01', DATE '2030-12-31', INTERVAL '1 day') AS t(d);

Core BI Queries#

Revenue Dashboard#

-- Current month KPIs
SELECT 
    SUM(amount) AS total_revenue,
    COUNT(*) AS transactions,
    COUNT(DISTINCT customer_id) AS unique_customers,
    AVG(amount) AS avg_transaction_value
FROM fact_revenue
WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE);
 
-- Month-over-month comparison
WITH current_month AS (
    SELECT SUM(amount) AS revenue
    FROM fact_revenue
    WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE)
),
prior_month AS (
    SELECT SUM(amount) AS revenue
    FROM fact_revenue
    WHERE DATE_TRUNC('month', date) = DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
)
SELECT 
    current_month.revenue AS current_revenue,
    prior_month.revenue AS prior_revenue,
    ROUND((current_month.revenue - prior_month.revenue) / NULLIF(prior_month.revenue, 0) * 100, 1) AS mom_growth_pct
FROM current_month, prior_month;
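The two-CTE pattern above only compares the latest two months. As a sketch against the same fact_revenue table, LAG computes the growth figure for every month in a single pass:

```sql
-- Month-over-month growth for the full history via LAG
WITH monthly AS (
    SELECT DATE_TRUNC('month', date) AS month, SUM(amount) AS revenue
    FROM fact_revenue
    GROUP BY month
)
SELECT
    month,
    revenue,
    ROUND((revenue - LAG(revenue) OVER (ORDER BY month))
          / NULLIF(LAG(revenue) OVER (ORDER BY month), 0) * 100, 1) AS mom_growth_pct
FROM monthly
ORDER BY month;
```

The NULLIF guard returns NULL instead of erroring for the first month, where there is no prior revenue to divide by.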

Product Performance Report#

SELECT 
    p.category,
    p.name AS product,
    SUM(f.amount) AS revenue,
    SUM(f.units) AS units_sold,
    COUNT(DISTINCT f.customer_id) AS unique_buyers,
    ROUND(SUM(f.amount) / SUM(SUM(f.amount)) OVER () * 100, 1) AS revenue_share_pct,
    RANK() OVER (PARTITION BY p.category ORDER BY SUM(f.amount) DESC) AS rank_in_category
FROM fact_revenue f
JOIN dim_product p ON f.product_id = p.product_id
WHERE YEAR(f.date) = 2026
GROUP BY p.category, p.name
ORDER BY revenue DESC;

Customer Cohort Retention#

WITH first_purchase AS (
    SELECT customer_id, MIN(date) AS cohort_date
    FROM fact_revenue
    GROUP BY customer_id
),
monthly_activity AS (
    SELECT 
        fp.customer_id,
        DATE_TRUNC('month', fp.cohort_date) AS cohort_month,
        DATE_TRUNC('month', f.date) AS activity_month,
        DATEDIFF('month', fp.cohort_date, f.date) AS months_after_signup
    FROM fact_revenue f
    JOIN first_purchase fp ON f.customer_id = fp.customer_id
)
SELECT 
    cohort_month,
    months_after_signup,
    COUNT(DISTINCT customer_id) AS retained_customers
FROM monthly_activity
GROUP BY cohort_month, months_after_signup
ORDER BY cohort_month, months_after_signup;

Connecting to Visualization Tools#

Metabase (Free, Self-Hosted)#

Metabase connects to DuckDB through a community JDBC driver. To add DuckDB as a data source:

  1. Install Metabase: docker run -p 3000:3000 metabase/metabase
  2. Drop the DuckDB driver .jar into Metabase's plugins directory
  3. Add the connection: Settings → Databases → Add database → DuckDB, and point it at your .duckdb file path

Evidence (Free, Code-Based)#

Evidence is a Markdown-based BI tool that queries DuckDB natively. Queries live in named SQL fences inside the page, and chart components reference them by name:

---
title: Revenue Dashboard
---
 
```sql revenue
SELECT DATE_TRUNC('month', date) AS month, SUM(amount) AS revenue
FROM fact_revenue
GROUP BY month
ORDER BY month
```
 
<LineChart data={revenue} x=month y=revenue title="Monthly Revenue"/>

Grafana#

Install the DuckDB plugin, then query with SQL directly in Grafana dashboards.
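Any query that returns a time column plus numeric columns can drive a panel. A minimal sketch, using the fact_revenue table from the schema above:

```sql
-- Daily revenue series for a Grafana time-series panel
SELECT
    date AS time,       -- Grafana uses the time column for the x-axis
    SUM(amount) AS revenue
FROM fact_revenue
GROUP BY date
ORDER BY date;
```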

Custom HTML Dashboard with DuckDB-WASM#

Run DuckDB entirely in the browser:

<script type="module">
import * as duckdb from 'https://cdn.jsdelivr.net/npm/@duckdb/duckdb-wasm/+esm';
 
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();
const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);
// Workers can't be loaded cross-origin, so wrap the CDN script in a blob URL
const worker_url = URL.createObjectURL(
    new Blob([`importScripts("${bundle.mainWorker}");`], { type: 'text/javascript' })
);
const worker = new Worker(worker_url);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
 
const conn = await db.connect();
await conn.query(`SELECT 1 AS test`);
</script>

DenchClaw as a DuckDB BI Platform#

DenchClaw is built on top of DuckDB and includes an App Builder that lets you create dashboards directly against your CRM data. No external BI tool needed.

// Inside a DenchClaw app
const revenue = await dench.db.query(`
    SELECT 
        DATE_TRUNC('month', "Close Date") AS month,
        SUM("Deal Value") AS revenue,
        COUNT(*) AS deals_closed
    FROM v_deals
    WHERE "Stage" = 'Closed Won'
    GROUP BY month
    ORDER BY month
`);
 
// Render with Chart.js
new Chart(ctx, {
    type: 'bar',
    data: {
        labels: revenue.map(r => r.month),
        datasets: [{ data: revenue.map(r => r.revenue) }]
    }
});

The entire BI stack runs locally: DuckDB handles the queries, the DenchClaw app renders the charts, and you own all the data.

Frequently Asked Questions#

Can DuckDB replace a data warehouse for my startup?#

For most startups under $10M ARR, yes. DuckDB handles hundreds of millions of rows analytically on a modern laptop. You need a proper data warehouse only for multi-terabyte datasets, real-time streaming ingestion, or dozens of concurrent analysts.

How do I schedule DuckDB ETL jobs?#

Use cron (Linux/macOS) or Task Scheduler (Windows) to run SQL scripts on a schedule, or use dbt with the dbt-duckdb adapter.
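For example, a crontab entry can rebuild the cache tables every night. A minimal sketch, assuming the duckdb CLI is installed and your transformations live in a refresh.sql script (both paths are illustrative):

```shell
# Nightly ETL at 02:00 — bi.duckdb and refresh.sql are illustrative paths
0 2 * * * /usr/local/bin/duckdb /data/bi.duckdb -c ".read /data/refresh.sql"
```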

Can multiple analysts query the same DuckDB database?#

In read-only mode, yes — multiple processes can open the same file simultaneously. For writes (ETL), only one process can write at a time.
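For example, each analyst can open the shared file with the CLI's read-only flag, which never takes the write lock:

```shell
# Open the shared database without taking the write lock (path is illustrative)
duckdb -readonly /data/bi.duckdb
```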

What BI tools have native DuckDB support?#

Evidence, Rill, Metabase (via JDBC), and several others have native DuckDB support. Grafana requires a plugin.

How do I handle DuckDB schema migrations?#

DuckDB supports ALTER TABLE ADD COLUMN, RENAME, and most standard DDL. For complex migrations, create a new table and migrate data, then drop the old table.
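A sketch of both paths, using illustrative column names against the fact_revenue table from earlier:

```sql
-- Simple change: in-place DDL
ALTER TABLE fact_revenue ADD COLUMN currency VARCHAR DEFAULT 'USD';

-- Complex change (e.g. widening a type): rebuild with the new shape, then swap
CREATE TABLE fact_revenue_v2 AS
SELECT date, product_id, customer_id, region,
       amount::DECIMAL(12,2) AS amount, units
FROM fact_revenue;
DROP TABLE fact_revenue;
ALTER TABLE fact_revenue_v2 RENAME TO fact_revenue;
```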

Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →


Written by

Mark Rachapoom

Building the future of AI CRM software.



© 2026 DenchHQ · San Francisco, CA