Back to The Times of Claw

DuckDB for Marketing Analytics

Run marketing attribution, campaign performance, CAC, and channel ROI analysis with DuckDB. No GA4 export needed — query your raw data directly.

Mark Rachapoom
Mark Rachapoom
·5 min read
DuckDB for Marketing Analytics

DuckDB for Marketing Analytics

Marketing analytics with DuckDB means querying your raw event and conversion data directly — no waiting for GA4 reports, no sampling, no third-party attribution black boxes. You write the SQL, you define the attribution model, you own the results.

Here's how to set it up and the queries that matter.

Schema Setup#

CREATE TABLE sessions (
    session_id VARCHAR PRIMARY KEY,
    user_id VARCHAR,
    started_at TIMESTAMP,
    source VARCHAR,      -- 'google', 'twitter', 'direct', 'email'
    medium VARCHAR,      -- 'cpc', 'organic', 'social', 'email'
    campaign VARCHAR,
    landing_page VARCHAR,
    country VARCHAR,
    device_type VARCHAR  -- 'desktop', 'mobile', 'tablet'
);
 
CREATE TABLE conversions (
    conversion_id VARCHAR PRIMARY KEY,
    user_id VARCHAR,
    session_id VARCHAR,
    event_type VARCHAR,  -- 'signup', 'trial_start', 'purchase'
    revenue DECIMAL(10,2),
    converted_at TIMESTAMP
);
 
CREATE TABLE ad_spend (
    date DATE,
    channel VARCHAR,
    campaign VARCHAR,
    spend DECIMAL(10,2),
    impressions INTEGER,
    clicks INTEGER
);

Channel Performance#

-- Channel-level summary: traffic, conversions, CAC
SELECT 
    s.source,
    s.medium,
    COUNT(DISTINCT s.session_id) AS sessions,
    COUNT(DISTINCT s.user_id) AS unique_visitors,
    COUNT(DISTINCT c.user_id) AS conversions,
    SUM(c.revenue) AS revenue,
    ROUND(COUNT(DISTINCT c.user_id) * 100.0 / COUNT(DISTINCT s.user_id), 2) AS conversion_rate_pct,
    ROUND(SUM(c.revenue) / NULLIF(COUNT(DISTINCT c.user_id), 0), 2) AS revenue_per_conversion
FROM sessions s
LEFT JOIN conversions c ON s.session_id = c.session_id
WHERE s.started_at >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY s.source, s.medium
ORDER BY revenue DESC NULLS LAST;

Marketing Attribution#

First-Touch Attribution#

WITH first_touch AS (
    SELECT DISTINCT ON (user_id)
        user_id,
        source,
        medium,
        campaign,
        started_at
    FROM sessions
    ORDER BY user_id, started_at ASC
)
SELECT 
    ft.source,
    ft.medium,
    COUNT(DISTINCT c.user_id) AS attributed_conversions,
    SUM(c.revenue) AS attributed_revenue
FROM first_touch ft
JOIN conversions c ON ft.user_id = c.user_id
GROUP BY ft.source, ft.medium
ORDER BY attributed_revenue DESC;

Last-Touch Attribution#

WITH last_touch AS (
    SELECT DISTINCT ON (user_id)
        user_id,
        source,
        medium,
        campaign
    FROM sessions
    ORDER BY user_id, started_at DESC
)
SELECT 
    lt.source,
    lt.medium,
    COUNT(DISTINCT c.user_id) AS attributed_conversions,
    SUM(c.revenue) AS attributed_revenue
FROM last_touch lt
JOIN conversions c ON lt.user_id = c.user_id
GROUP BY lt.source, lt.medium
ORDER BY attributed_revenue DESC;

Linear Attribution (Equal Credit to All Touchpoints)#

WITH touchpoints AS (
    SELECT 
        s.user_id,
        s.source,
        s.medium,
        s.campaign,
        COUNT(*) OVER (PARTITION BY s.user_id) AS total_touchpoints
    FROM sessions s
    WHERE EXISTS (
        SELECT 1 FROM conversions c WHERE c.user_id = s.user_id
    )
),
attributed AS (
    SELECT 
        t.source,
        t.medium,
        t.campaign,
        t.user_id,
        SUM(c.revenue) / t.total_touchpoints AS attributed_revenue
    FROM touchpoints t
    JOIN conversions c ON t.user_id = c.user_id
    GROUP BY t.source, t.medium, t.campaign, t.user_id, t.total_touchpoints
)
SELECT 
    source,
    medium,
    SUM(attributed_revenue) AS total_attributed_revenue
FROM attributed
GROUP BY source, medium
ORDER BY total_attributed_revenue DESC;

Campaign ROI#

-- ROI by campaign: spend vs. revenue
SELECT 
    a.campaign,
    SUM(a.spend) AS total_spend,
    SUM(c.revenue) AS total_revenue,
    COUNT(DISTINCT c.user_id) AS conversions,
    ROUND(SUM(a.spend) / NULLIF(COUNT(DISTINCT c.user_id), 0), 2) AS cac,
    ROUND((SUM(c.revenue) - SUM(a.spend)) / NULLIF(SUM(a.spend), 0) * 100, 1) AS roi_pct
FROM ad_spend a
LEFT JOIN sessions s ON s.campaign = a.campaign 
    AND DATE_TRUNC('day', s.started_at) = a.date
LEFT JOIN conversions c ON c.session_id = s.session_id
WHERE a.date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY a.campaign
ORDER BY roi_pct DESC NULLS LAST;

CAC (Customer Acquisition Cost)#

-- Overall CAC
SELECT 
    DATE_TRUNC('month', a.date) AS month,
    SUM(a.spend) AS total_spend,
    COUNT(DISTINCT c.user_id) AS new_customers,
    ROUND(SUM(a.spend) / NULLIF(COUNT(DISTINCT c.user_id), 0), 2) AS cac
FROM ad_spend a
JOIN sessions s ON s.campaign = a.campaign
JOIN conversions c ON c.session_id = s.session_id
    AND c.event_type = 'purchase'
WHERE a.date >= CURRENT_DATE - INTERVAL '6 months'
GROUP BY month
ORDER BY month;

Email Marketing Analytics#

CREATE TABLE email_events (
    email_id VARCHAR,
    user_id VARCHAR,
    campaign_id VARCHAR,
    event_type VARCHAR,  -- 'sent', 'opened', 'clicked', 'unsubscribed'
    occurred_at TIMESTAMP
);
 
-- Email campaign performance
SELECT 
    campaign_id,
    COUNT(CASE WHEN event_type = 'sent' THEN 1 END) AS sent,
    COUNT(CASE WHEN event_type = 'opened' THEN 1 END) AS opened,
    COUNT(CASE WHEN event_type = 'clicked' THEN 1 END) AS clicked,
    COUNT(CASE WHEN event_type = 'unsubscribed' THEN 1 END) AS unsubscribed,
    ROUND(COUNT(CASE WHEN event_type = 'opened' THEN 1 END) * 100.0 / 
          NULLIF(COUNT(CASE WHEN event_type = 'sent' THEN 1 END), 0), 1) AS open_rate_pct,
    ROUND(COUNT(CASE WHEN event_type = 'clicked' THEN 1 END) * 100.0 / 
          NULLIF(COUNT(CASE WHEN event_type = 'opened' THEN 1 END), 0), 1) AS ctor_pct
FROM email_events
GROUP BY campaign_id
ORDER BY open_rate_pct DESC;

DenchClaw for Marketing Analytics#

DenchClaw can be your marketing analytics workbench. Store session, conversion, and campaign data in DuckDB, then use the App Builder to create dashboards:

// DenchClaw app: marketing overview
const channelData = await dench.db.query(`
    SELECT source, SUM(revenue) AS revenue
    FROM sessions s JOIN conversions c ON s.session_id = c.session_id
    WHERE s.started_at >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY source ORDER BY revenue DESC
`);

You can also use the browser agent to pull campaign performance data from Google Ads or Meta Ads Manager and merge it with your conversion data for true attribution.

Frequently Asked Questions#

How does DuckDB compare to Google Analytics for marketing analytics?#

GA4 provides pre-built reports but samples data and doesn't let you run custom SQL. DuckDB requires more setup but gives you complete data and unlimited query flexibility.

How do I handle UTM parameter parsing in DuckDB?#

Store UTM parameters as separate columns (utm_source, utm_medium, utm_campaign) when collecting sessions. Parse them at ingestion time, not query time.

What's the best attribution model for B2B SaaS?#

Multi-touch attribution (linear or time-decay) generally gives more useful insights than first/last touch for B2B with long sales cycles.

Can DuckDB handle billions of event rows for marketing analysis?#

On a server with ample RAM, yes. For truly large-scale web analytics (billions of sessions), consider ClickHouse, which handles high-cardinality, high-volume event data better.

How do I combine DuckDB marketing data with CRM data?#

Join on user_id or email. DuckDB handles joins between tables in the same database — marketing attribution + CRM pipeline data in one query.

Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →

Mark Rachapoom

Written by

Mark Rachapoom

Building the future of AI CRM software.

Continue reading

DENCH

© 2026 DenchHQ · San Francisco, CA