Your CRM as a .duckdb File: What That Actually Means
DenchClaw stores all your CRM data in a single .duckdb file. Here's what that means for ownership, portability, performance, and privacy — and why it's the right choice.
There's a single file at ~/.openclaw-dench/workspace/workspace.duckdb. Every contact, every company, every deal, every note in your DenchClaw CRM is in that file. All of it.
That's a deliberate architectural choice, and it has significant implications for how your data works — implications that most CRM users have never had to think about because their data has always lived in someone else's cloud.
Let me walk through what having your CRM as a .duckdb file actually means in practice.
What DuckDB Is
DuckDB is an embedded analytical database. "Embedded" means it runs in-process — no separate server, no daemon, no Docker container. It's like SQLite, but optimized for analytical workloads rather than transactional ones.
DuckDB is:
- Free and open source (MIT license)
- Single-file: the entire database is one .duckdb file
- Full SQL: window functions, CTEs, PIVOT, JSON functions, array operations
- Fast for analytics: columnar storage, vectorized execution, optimized aggregations
- Cross-platform: macOS, Linux, Windows, ARM, x86
- Widely adopted: used at companies like Meta, Amazon, Snowflake
For a CRM workload — millions of rows, complex analytical queries, frequent reads, moderate writes — DuckDB is nearly ideal.
The Schema
DenchClaw stores your CRM data in DuckDB using an EAV (Entity-Attribute-Value) pattern:
-- Every "table" in your CRM is an object
CREATE TABLE objects (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR NOT NULL, -- "people", "companies", "deals"
icon VARCHAR,
entry_count INTEGER DEFAULT 0,
created_at TIMESTAMP
);
-- Every "column" is a field
CREATE TABLE fields (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
object_id UUID REFERENCES objects(id),
name VARCHAR NOT NULL, -- "Full Name", "Email", "Stage"
type VARCHAR NOT NULL, -- "text", "email", "enum", "relation"
config JSON, -- type-specific config
position INTEGER,
created_at TIMESTAMP
);
-- Every "row" is an entry
CREATE TABLE entries (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
object_id UUID REFERENCES objects(id),
created_at TIMESTAMP,
updated_at TIMESTAMP
);
-- Every "cell value" is an entry_field
CREATE TABLE entry_fields (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entry_id UUID REFERENCES entries(id),
field_id UUID REFERENCES fields(id),
value VARCHAR,
created_at TIMESTAMP
);
-- Linked markdown documents
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
entry_id UUID REFERENCES entries(id),
path VARCHAR NOT NULL,
created_at TIMESTAMP
);
On top of this, DenchClaw generates PIVOT views — flat, readable tables that look like a traditional CRM:
-- Auto-generated view for the "people" object
CREATE VIEW v_people AS
PIVOT (
SELECT e.id, ef.field_name, ef.value
FROM entries e
JOIN entry_fields_with_names ef ON ef.entry_id = e.id
WHERE e.object_id = (SELECT id FROM objects WHERE name = 'people')
) ON field_name
USING FIRST(value);
The result: v_people looks exactly like a traditional SQL table, with one column per field.
What You Can Do With a .duckdb File
This is where it gets powerful. Your CRM data is now queryable by anything that speaks DuckDB or SQL.
Direct Terminal Access
# Open the DuckDB REPL
duckdb ~/.openclaw-dench/workspace/workspace.duckdb
-- Count your leads
D SELECT COUNT(*) FROM v_people WHERE "Status" = 'Lead';
-- Top deals by value
D SELECT "Title", "Value", "Stage", "Close Date"
  FROM v_deals
  ORDER BY CAST("Value" AS NUMERIC) DESC
  LIMIT 10;
-- Contacts not reached in 30 days
D SELECT "Full Name", "Email", "Last Contacted"
  FROM v_people
  WHERE "Status" = 'Active'
    AND ("Last Contacted" < current_date - INTERVAL 30 DAYS
         OR "Last Contacted" IS NULL)
  ORDER BY "Last Contacted" ASC;
Python Analytics
import os
import duckdb
import pandas as pd

# connect() does not expand "~", so expand it explicitly
conn = duckdb.connect(os.path.expanduser("~/.openclaw-dench/workspace/workspace.duckdb"), read_only=True)
# Load into pandas
df = conn.execute("SELECT * FROM v_deals").df()
# Monthly deal velocity
df['close_date'] = pd.to_datetime(df['Close Date'])
df['value'] = pd.to_numeric(df['Value'], errors='coerce')  # EAV values are stored as VARCHAR
monthly = df.groupby(df['close_date'].dt.to_period('M'))['value'].agg(['count', 'sum'])
print(monthly)
Jupyter Notebooks
Run your entire CRM analytics in Jupyter:
# %pip install duckdb
import os
import duckdb

# expand "~" explicitly; read_only avoids contending with the gateway's writer
conn = duckdb.connect(os.path.expanduser("~/.openclaw-dench/workspace/workspace.duckdb"), read_only=True)
# Pipeline funnel analysis
funnel = conn.execute("""
SELECT "Stage", COUNT(*) as count, SUM(CAST("Value" AS NUMERIC)) as total
FROM v_deals
GROUP BY "Stage"
ORDER BY CASE "Stage"
WHEN 'Prospect' THEN 1
WHEN 'Qualified' THEN 2
WHEN 'Proposal Sent' THEN 3
WHEN 'Closed Won' THEN 4
ELSE 5 END
""").df()
import plotly.express as px
px.bar(funnel, x='Stage', y='total', title='Pipeline by Stage')
Export to Any Format
-- Export contacts to CSV
COPY v_people TO 'people.csv' (HEADER, DELIMITER ',');
-- Export deals to Parquet
COPY v_deals TO 'deals.parquet' (FORMAT PARQUET);
-- Export everything to JSON
COPY (SELECT * FROM v_companies) TO 'companies.json' (FORMAT JSON);
You can take these files and import them into any system: Salesforce, HubSpot, Airtable, Google Sheets, whatever.
Performance
DuckDB is fast. Surprisingly fast.
For typical CRM workloads:
- 10,000 contacts: queries run in single-digit milliseconds
- 100,000 contacts: queries run in tens of milliseconds
- 1,000,000 entries: analytical aggregations still under 1 second
The AI agent's natural language → SQL → response loop typically completes in 2-4 seconds total — 1-2s for the AI model call, sub-100ms for the SQL execution.
DenchClaw runs PIVOT views on demand for smaller objects and pre-generates them for larger ones. You can force regeneration:
# Regenerate all PIVOT views
openclaw db refresh-views --profile dench
Backup Strategy
One file = one backup target. It's simple:
# Time Machine backs it up automatically (if workspace is on your main drive)
# Manual backup
cp ~/.openclaw-dench/workspace/workspace.duckdb ~/Desktop/crm-backup-$(date +%Y%m%d).duckdb
# Automated daily backup
# Add to crontab: 0 2 * * * cp ~/.openclaw-dench/workspace/workspace.duckdb /backups/crm-$(date +\%Y\%m\%d).duckdb
# Git-based backup (includes schema + docs)
cd ~/.openclaw-dench/workspace
git add workspace.duckdb && git commit -m "crm backup $(date)"
git push origin main
For recovery: copy the .duckdb file back, start DenchClaw, everything is there.
The Portability Guarantee
The .duckdb file is fully self-contained. To move DenchClaw to a new machine:
# On old machine
tar czf denchclaw-workspace.tar.gz ~/.openclaw-dench/workspace/
# Transfer to new machine
scp denchclaw-workspace.tar.gz newmachine:~
# On new machine
npx denchclaw # installs fresh
tar xzf denchclaw-workspace.tar.gz # restore workspace
# DenchClaw is back exactly as you left it
The Vendor Risk Argument
I've thought a lot about vendor risk since building DenchClaw. Here's the core argument for local-first data:
HubSpot could change their pricing tomorrow. (They have, repeatedly.) If your data is in their cloud and they raise prices 3x, you have no leverage. You can try to export, but exports are often incomplete — no activity history, no custom field types, lossy formatting.
SaaS companies get acquired. When the acquirer kills the product (it happens), you have a 90-day notice window to scramble your export. If the export tools are broken (they often are), you lose data.
SaaS companies get breached. Your contacts, your deal notes, your email history — all exposed. Not your fault, not your decision.
With a .duckdb file:
- You control pricing: it's your file, there is no subscription
- Acquisitions don't matter: the software is MIT licensed
- Breaches affect only your machine: you control the security perimeter
Frequently Asked Questions
Is a single .duckdb file a single point of failure?
Yes — back it up. But "single point of failure" in a local-first context means your local machine, which is already a single point of failure for your other local data. Backups solve this.
What if workspace.duckdb gets corrupted?
DuckDB has a WAL (Write-Ahead Log) that protects against corruption from crashes. For catastrophic corruption, restore from backup. The markdown files (skills, configs, docs) are unaffected.
Can multiple processes write to workspace.duckdb simultaneously?
DuckDB allows multiple readers but only one writer at a time. The DenchClaw gateway manages concurrent access safely. Don't open the file in a second DuckDB process while the gateway is writing.
How large can the .duckdb file get?
DuckDB handles terabytes in production. For a CRM, you're unlikely to exceed 1GB even with millions of entries. The file compresses extremely well.
Can I store the .duckdb file in Dropbox or iCloud Drive?
Yes, but with caution: avoid running DenchClaw when the cloud sync is mid-sync. The safest approach is to put the workspace in a directory that syncs only when DenchClaw is not running, or to use git for sync instead.
Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →
