DuckDB for Time-Series Analysis

DuckDB handles time-series data natively with powerful date functions, window functions, and fast columnar storage. Here's how to use it for business analytics.

Mark Rachapoom · 7 min read

DuckDB handles time-series data exceptionally well. Its columnar storage format is ideal for the kind of sequential, timestamp-ordered data that drives business metrics — revenue trends, user activity, pipeline velocity, churn rates. And it does it all without requiring a separate time-series database.

This guide covers the practical patterns for time-series analysis with DuckDB: date arithmetic, window functions, aggregations, and visualization-ready output.

Why DuckDB Works Well for Time-Series

Most OLTP databases (PostgreSQL, MySQL) store rows together on disk. DuckDB stores columns together. For time-series queries — "give me daily revenue for the last 90 days" — this means DuckDB only reads the timestamp and value columns, skipping everything else. On large datasets, this can be 10-100x faster than a row-oriented database.

DuckDB also ships with a rich set of date and interval functions, and its window function support is excellent. There's no need for a specialized time-series database for most business analytics workloads.

Data Setup

Let's start with a realistic events table:

CREATE TABLE events (
    event_id VARCHAR,
    user_id VARCHAR,
    event_type VARCHAR,
    revenue DECIMAL(10,2),
    occurred_at TIMESTAMP
);

And a deals table as you'd find in a DenchClaw CRM:

CREATE TABLE deals (
    deal_id VARCHAR,
    company_id VARCHAR,
    stage VARCHAR,
    value DECIMAL(10,2),
    created_at DATE,
    closed_at DATE
);

Basic Date Operations

Extracting Date Parts

SELECT 
    YEAR(occurred_at) AS year,
    MONTH(occurred_at) AS month,
    DAY(occurred_at) AS day,
    DAYOFWEEK(occurred_at) AS day_of_week,
    HOUR(occurred_at) AS hour
FROM events
LIMIT 10;

Truncating to Periods

The DATE_TRUNC function is essential for grouping by time periods:

SELECT 
    DATE_TRUNC('day', occurred_at) AS day,
    COUNT(*) AS event_count,
    SUM(revenue) AS daily_revenue
FROM events
WHERE occurred_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY day
ORDER BY day;

Supported truncation periods: microsecond, millisecond, second, minute, hour, day, week, month, quarter, year.

Date Arithmetic

-- Events in the last 7 days
SELECT * FROM events
WHERE occurred_at > NOW() - INTERVAL '7 days';
 
-- Events between two dates
SELECT * FROM events
WHERE occurred_at BETWEEN '2026-01-01' AND '2026-03-31';
 
-- Days since each deal was created
SELECT deal_id, DATEDIFF('day', created_at, CURRENT_DATE) AS age_days
FROM deals;

Time-Series Aggregations

Daily Revenue Trend

SELECT 
    DATE_TRUNC('day', occurred_at) AS date,
    COUNT(*) AS transactions,
    SUM(revenue) AS revenue,
    AVG(revenue) AS avg_transaction_value
FROM events
WHERE event_type = 'purchase'
    AND occurred_at >= CURRENT_DATE - INTERVAL '90 days'
GROUP BY date
ORDER BY date;

Weekly Cohort Analysis

SELECT 
    DATE_TRUNC('week', first_seen) AS cohort_week,
    DATE_TRUNC('week', occurred_at) AS activity_week,
    DATEDIFF('week', first_seen, occurred_at) AS weeks_since_signup,
    COUNT(DISTINCT user_id) AS active_users
FROM events
JOIN (
    SELECT user_id, MIN(occurred_at) AS first_seen FROM events GROUP BY user_id
) AS cohorts USING (user_id)
GROUP BY cohort_week, activity_week, weeks_since_signup
ORDER BY cohort_week, weeks_since_signup;

Month-over-Month Growth

WITH monthly AS (
    SELECT 
        DATE_TRUNC('month', occurred_at) AS month,
        SUM(revenue) AS revenue
    FROM events
    WHERE event_type = 'purchase'
    GROUP BY month
)
SELECT 
    month,
    revenue,
    LAG(revenue) OVER (ORDER BY month) AS prev_month_revenue,
    ROUND(
        (revenue - LAG(revenue) OVER (ORDER BY month)) / 
        LAG(revenue) OVER (ORDER BY month) * 100, 
        1
    ) AS mom_growth_pct
FROM monthly
ORDER BY month;

Window Functions for Time-Series

Window functions let you compute running totals, moving averages, and rankings across ordered rows.

Running Total

SELECT 
    DATE_TRUNC('day', occurred_at) AS date,
    SUM(revenue) AS daily_revenue,
    SUM(SUM(revenue)) OVER (ORDER BY DATE_TRUNC('day', occurred_at)) AS cumulative_revenue
FROM events
GROUP BY date
ORDER BY date;

7-Day Moving Average

WITH daily AS (
    SELECT 
        DATE_TRUNC('day', occurred_at) AS date,
        SUM(revenue) AS daily_revenue
    FROM events
    GROUP BY date
)
SELECT 
    date,
    daily_revenue,
    ROUND(AVG(daily_revenue) OVER (
        ORDER BY date 
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ), 2) AS rolling_7d_avg
FROM daily
ORDER BY date;

Percentile Over Time

SELECT 
    DATE_TRUNC('week', occurred_at) AS week,
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_deal_value,
    PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY revenue) AS p90_deal_value
FROM events
WHERE event_type = 'purchase'
GROUP BY week
ORDER BY week;

Filling Gaps in Time-Series

A common problem: some days have no data, creating gaps in your charts. Here's how to fill them:

-- Generate a series of dates
WITH date_series AS (
    -- call generate_series as a table function so each date is its own row
    SELECT generate_series::DATE AS date
    FROM generate_series(
        DATE '2026-01-01',
        DATE '2026-03-31',
        INTERVAL '1 day'
    )
),
daily_revenue AS (
    SELECT 
        DATE_TRUNC('day', occurred_at)::DATE AS date,
        SUM(revenue) AS revenue
    FROM events
    GROUP BY date
)
SELECT 
    ds.date,
    COALESCE(dr.revenue, 0) AS revenue
FROM date_series ds
LEFT JOIN daily_revenue dr ON ds.date = dr.date
ORDER BY ds.date;

generate_series is a DuckDB built-in that produces a sequence of values — integers, dates, or timestamps.

Time-Series for Pipeline Analytics

In a sales CRM, time-series analysis on your deal pipeline is one of the most valuable things you can do. Here's a pattern for pipeline snapshot analysis:

-- Pipeline value by stage over time (snapshot-based)
WITH pipeline_snapshots AS (
    SELECT 
        DATE_TRUNC('week', created_at) AS snapshot_week,
        stage,
        COUNT(*) AS deal_count,
        SUM(value) AS pipeline_value
    FROM deals
    WHERE closed_at IS NULL OR closed_at > snapshot_week
    GROUP BY snapshot_week, stage
)
SELECT 
    snapshot_week,
    stage,
    deal_count,
    pipeline_value,
    SUM(pipeline_value) OVER (PARTITION BY snapshot_week) AS total_pipeline
FROM pipeline_snapshots
ORDER BY snapshot_week, stage;

Deal Velocity

-- How long deals spend in each stage
SELECT 
    stage,
    AVG(DATEDIFF('day', stage_entered_at, COALESCE(stage_exited_at, CURRENT_DATE))) AS avg_days_in_stage,
    PERCENTILE_CONT(0.5) WITHIN GROUP (
        ORDER BY DATEDIFF('day', stage_entered_at, COALESCE(stage_exited_at, CURRENT_DATE))
    ) AS median_days
FROM deal_stage_history
GROUP BY stage
ORDER BY avg_days_in_stage DESC;

Comparing DuckDB to TimescaleDB and InfluxDB

| Feature | DuckDB | TimescaleDB | InfluxDB |
| --- | --- | --- | --- |
| Setup | Embedded | PostgreSQL extension | Separate server |
| Query language | SQL | SQL + hypertable functions | Flux / InfluxQL |
| Compression | Columnar | Time-series chunks | Time-series chunks |
| Streaming ingest | No | Yes | Yes |
| SQL compatibility | Full | Full | Limited |
| Best for | Analytical queries | High-frequency writes | IoT/metrics |
| Cost | Free | Free/paid tiers | Free/paid tiers |

DuckDB wins for batch analytics on historical data. TimescaleDB and InfluxDB win when you need high-frequency writes (>10K rows/sec) and streaming ingest.

For business analytics — daily/weekly aggregations on sales, marketing, and product data — DuckDB is the right choice and requires zero infrastructure.

Exporting for Visualization

-- Export time-series data to CSV for Grafana or a BI tool
COPY (
    SELECT 
        DATE_TRUNC('day', occurred_at) AS date,
        SUM(revenue) AS revenue
    FROM events
    GROUP BY date
    ORDER BY date
) TO 'daily_revenue.csv' (FORMAT CSV, HEADER TRUE);

Or skip the file export entirely and power dashboards directly from DuckDB query results.

Frequently Asked Questions

Can DuckDB handle real-time time-series data?

DuckDB is designed for analytical (OLAP) workloads, not high-frequency writes. It's not suitable as a streaming time-series store. Use it for batch analytics on data you've already collected.

How do I handle timezones in DuckDB?

Use AT TIME ZONE for conversions: SELECT occurred_at AT TIME ZONE 'America/Los_Angeles' FROM events. DuckDB uses the IANA timezone database.

Is DuckDB fast enough for large time-series datasets?

On a modern laptop, DuckDB handles 100-500 million row aggregations in seconds. For datasets that don't fit in RAM, DuckDB spills to disk but remains fast due to columnar compression.

How do I do lag/lead comparisons?

Use the LAG() and LEAD() window functions with an ORDER BY clause on your timestamp column. See the month-over-month example above.

Can I connect DuckDB time-series data to Grafana?

Yes, via the DuckDB JDBC driver or by exporting to Parquet/CSV and loading into Grafana's file data source. For live dashboards, expose DuckDB via a REST API wrapper.

Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →

Written by Mark Rachapoom, building the future of AI CRM software.

© 2026 DenchHQ · San Francisco, CA