DuckDB is SQLite for analytics: in-process, columnar, fast on a single machine, no server. It looks like a curiosity until you actually use it. Then it takes over the large aggregations that pandas struggles with, makes Parquet workflows trivial, and embeds in apps for analytical features. By 2026 it's a daily tool for many engineers.
In-process columnar SQL
A single library: the core is written in C++, with client APIs for Python, R, Java, Node.js and others. Open files directly: SELECT * FROM 'data.parquet'. Joins, aggregations, window functions. No server, no schema declaration (column names and types are read from the file), no setup. Fast: often competitive with Spark on workloads that fit on a single machine.
Replacing Pandas for big data
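pandas loads the whole dataset into memory and struggles once the data approaches available RAM, which can happen at just a few GB. DuckDB streams through the file in chunks and reads only the columns the query touches. duckdb.sql("SELECT category, AVG(price) FROM 'huge.parquet' GROUP BY category").df() returns a small DataFrame for further work. Aggregations are often 10-100x faster, though the gain depends on the data and the hardware.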
Cloud storage support
Read directly from S3, GCS, or R2 with the httpfs extension: SELECT * FROM 's3://bucket/data.parquet'. DuckDB pushes filters and column selection down into the Parquet scan, so it fetches only the bytes it needs. Useful for cheap ad-hoc analytics without a data warehouse.
Embedded analytics in apps
Ship DuckDB inside your app for in-app analytics. User uploads a CSV → DuckDB queries it client-side. Far less to build and operate than spinning up a warehouse for a single feature. A Wasm build covers browser deployment.
Where it doesn't fit
OLTP (use Postgres). Concurrent writers from multiple processes (only one process can open a database for writing at a time). Distributed scale (single machine). Production OLAP with many concurrent users (use a warehouse). DuckDB is a tool, not the warehouse.