Back to The Times of Claw

DuckDB Extensions: What They Are and How to Use Them

DuckDB extensions add capabilities like full-text search, spatial data, Excel export, and more. Here's a complete guide to the most useful extensions.

Mark Rachapoom
Mark Rachapoom
·6 min read
DuckDB Extensions: What They Are and How to Use Them

DuckDB Extensions: What They Are and How to Use Them

DuckDB ships with a lean core and extends through a plugin system called extensions. These add capabilities that aren't in the core engine — full-text search, spatial queries, HTTP requests, Excel file support, JSON parsing, and more.

Extensions are loaded on demand, follow a consistent pattern, and in most cases install with a single SQL command.

How Extensions Work#

Every DuckDB extension follows the same two-step pattern:

INSTALL extension_name;
LOAD extension_name;

INSTALL downloads the extension binary from the DuckDB extension repository and stores it locally. LOAD activates it in the current session. Once installed, you only need LOAD in future sessions.

To auto-load extensions, you can configure DuckDB to load them at startup, or set autoload_known_extensions = true in your config.

Core Extensions (Built-In)#

These extensions ship with DuckDB and don't require a separate install:

json — JSON Parsing and Generation#

LOAD json;
 
-- Parse JSON
SELECT json_extract('{"name": "DenchClaw", "version": 1}', '$.name');
-- Returns: "DenchClaw"
 
-- Read JSON files directly
SELECT * FROM read_json_auto('data.json');
 
-- Read newline-delimited JSON
SELECT * FROM read_ndjson_auto('events.ndjson');

parquet — Parquet File Support#

-- Read Parquet files (no LOAD needed — built into DuckDB)
SELECT * FROM 'data.parquet';
SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet');
 
-- Write Parquet
COPY (SELECT * FROM deals) TO 'deals.parquet' (FORMAT PARQUET);

Parquet is DuckDB's native exchange format. Reading/writing Parquet is typically the fastest way to move data in and out.

icu — Unicode Collation#

LOAD icu;
 
-- Case-insensitive sort
SELECT name FROM contacts ORDER BY name COLLATE NOCASE;
 
-- Locale-aware sort
SELECT name FROM contacts ORDER BY name COLLATE de;
INSTALL fts;
LOAD fts;
 
PRAGMA create_fts_index('notes', 'id', 'body', stemmer='porter');
 
SELECT id, fts_main_notes.match_bm25(id, 'quarterly revenue') AS score
FROM notes
WHERE score IS NOT NULL
ORDER BY score DESC;

See the DuckDB full-text search guide for a complete walkthrough.

httpfs — HTTP and S3 File Access#

INSTALL httpfs;
LOAD httpfs;
 
-- Read CSV from HTTP
SELECT * FROM read_csv_auto('https://example.com/data.csv');
 
-- Read from S3
SET s3_region='us-east-1';
SET s3_access_key_id='YOUR_KEY';
SET s3_secret_access_key='YOUR_SECRET';
 
SELECT * FROM read_parquet('s3://my-bucket/analytics/*.parquet');
 
-- Write to S3
COPY (SELECT * FROM deals) TO 's3://my-bucket/deals.parquet' (FORMAT PARQUET);

This turns DuckDB into a lightweight ETL engine — read from S3, transform with SQL, write back.

spatial — Geographic Data#

INSTALL spatial;
LOAD spatial;
 
-- Create geometric points
SELECT ST_Point(-122.4194, 37.7749) AS san_francisco;
 
-- Distance in meters
SELECT ST_Distance_Spheroid(
    ST_Point(-122.4194, 37.7749),
    ST_Point(-118.2437, 34.0522)
) AS distance_sf_to_la;
 
-- Read GeoJSON
SELECT * FROM ST_Read('boundaries.geojson');
 
-- Read Shapefiles
SELECT * FROM ST_Read('counties.shp');

Useful for sales territory analysis, customer location mapping, and any geo-enriched analytics.

excel — Excel File Import/Export#

INSTALL excel;
LOAD excel;
 
-- Read Excel
SELECT * FROM read_xlsx('contacts.xlsx');
 
-- Write Excel
COPY (SELECT * FROM v_people) TO 'people_export.xlsx' (FORMAT XLSX);

This is a big deal for business users — no pandas, no Python, just SQL to Excel.

sqlite — SQLite Database Access#

INSTALL sqlite;
LOAD sqlite;
 
-- Attach SQLite database
ATTACH 'legacy.db' AS sqlite_db (TYPE SQLITE);
 
-- Query it like a regular table
SELECT * FROM sqlite_db.contacts;
 
-- Copy data into DuckDB
INSERT INTO my_table SELECT * FROM sqlite_db.old_table;

Useful for migrating data from SQLite-based apps or querying mobile app databases.

postgres — PostgreSQL Access#

INSTALL postgres;
LOAD postgres;
 
ATTACH 'host=localhost dbname=production user=app password=secret' AS pg (TYPE POSTGRES);
 
-- Query Postgres tables
SELECT * FROM pg.customers LIMIT 100;
 
-- Join across databases
SELECT d.*, c.email 
FROM deals d
JOIN pg.customers c ON d.customer_id = c.id;

Run analytical queries against your production Postgres database without loading data — DuckDB executes the scan on Postgres and pulls results locally.

mysql — MySQL/MariaDB Access#

INSTALL mysql;
LOAD mysql;
 
ATTACH 'host=localhost user=root database=crm' AS mysql_db (TYPE MYSQL);
SELECT * FROM mysql_db.contacts;

aws — AWS Credential Integration#

INSTALL aws;
LOAD aws;
 
-- Load AWS credentials from ~/.aws/credentials
CALL load_aws_credentials();
 
-- Now httpfs can use those credentials automatically
SELECT * FROM read_parquet('s3://my-bucket/data.parquet');

tpch and tpcds — Benchmark Data Generation#

INSTALL tpch;
LOAD tpch;
 
-- Generate TPC-H benchmark data at scale factor 1 (~1GB)
CALL dbgen(sf=1);
 
-- Access generated tables
SELECT * FROM orders LIMIT 10;

Useful for testing query performance without real data.

Extension Management#

List Installed Extensions#

SELECT * FROM duckdb_extensions();

This shows all extensions, their status (loaded/installed/available), and their version.

Check Extension Version#

SELECT extension_name, extension_version, loaded, installed
FROM duckdb_extensions()
WHERE loaded = true;

Uninstall an Extension#

-- Remove from local storage
SELECT * FROM duckdb_extensions() WHERE extension_name = 'fts';
-- Then manually delete from ~/.duckdb/extensions/

Extensions in DenchClaw#

DenchClaw uses DuckDB as its core database and leverages several extensions by default:

  • json — for storing and querying structured data in the EAV schema
  • parquet — for data import/export workflows
  • fts — for search across CRM notes and documents
  • httpfs — for the browser agent's data import features

When you build a DuckDB-powered app inside DenchClaw, you can use any installed extension in your SQL queries.

Writing Custom Extensions#

DuckDB has a C++ extension API. You can write extensions that:

  • Add new scalar functions
  • Add new aggregate functions
  • Add new table functions (custom data sources)
  • Add new file format readers

The DuckDB community has published dozens of open-source extensions on GitHub. Check the official extension repository for the full list.

Frequently Asked Questions#

Do extensions persist between DuckDB sessions?#

INSTALL is permanent (stored on disk). LOAD is per-session. You need to LOAD an extension each session, or configure auto-loading.

Are extensions available in DuckDB WASM (browser)?#

Most extensions are not available in DuckDB WASM due to browser sandbox restrictions. The json and parquet extensions are available. httpfs, spatial, and others are not.

Can I use extensions in Python with duckdb package?#

Yes. The Python API supports INSTALL and LOAD:

import duckdb
con = duckdb.connect()
con.execute("INSTALL httpfs; LOAD httpfs;")

Are extensions verified/signed?#

Official extensions from the DuckDB repository are signed. Community extensions may not be. Check the source before loading third-party extensions.

How do I use extensions offline?#

After running INSTALL, the extension binary is cached locally. Future LOAD calls don't require internet access.

Ready to try DenchClaw? Install in one command: npx denchclaw. Full setup guide →

Mark Rachapoom

Written by

Mark Rachapoom

Building the future of AI CRM software.

Continue reading

DENCH

© 2026 DenchHQ · San Francisco, CA