DuckDB

DuckDB is an open-source, high-performance, in-process relational database management system (RDBMS) that speaks SQL and targets analytics:

  • Designed for OLAP - DuckDB targets online analytical processing (OLAP) workloads rather than transactional (OLTP) applications.
  • Embedded - DuckDB operates within the same process as your application or notebook, eliminating network overhead.
  • Versatile - DuckDB can handle diverse data formats, such as CSV, JSON, Parquet, and Apache Arrow. It also integrates with databases like MySQL, SQLite, and Postgres.
  • Easy to use - DuckDB provides a rich SQL dialect, with support for arbitrary and nested correlated subqueries, window functions, collations, and complex types.
  • Fast - DuckDB is designed to be fast, reliable, and portable. It can efficiently process and query gigabytes of data from various sources.
  • Embeddable - DuckDB lets users analyze data at the edge, which can improve response times and save bandwidth.

Commands

brew install duckdb

Performance Optimization

Appender

If you're streaming data into DuckDB, INSERT statements become a bottleneck fast.

DuckDB's Appender API bypasses the SQL layer entirely: no parsing, no query planning. Rows go straight into DuckDB's columnar storage format, so you can ingest in real time without choosing between many slow single-row INSERTs and holding data back to build large batches.

You stream rows through a low-level API, and the Appender buffers them in batches before writing to disk. In effect, you hand DuckDB typed values through function calls instead of SQL strings.
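
The buffering idea can be illustrated with a toy model. This is not DuckDB's Appender API: `ToyAppender`, `batch_size`, and the in-memory `storage` list are hypothetical names invented for this sketch, standing in for the real appender and the on-disk table. It only shows why buffering rows and writing them in batches takes fewer write operations than one write per row.

```python
from typing import Any


class ToyAppender:
    """Toy stand-in for an appender: buffers rows, writes them in batches.

    Hypothetical illustration only; not DuckDB's actual API.
    """

    def __init__(self, storage: list[tuple[Any, ...]], batch_size: int = 3) -> None:
        self.storage = storage        # pretend this is the table on disk
        self.batch_size = batch_size  # rows to buffer before each write
        self.buffer: list[tuple[Any, ...]] = []
        self.writes = 0               # number of batch writes performed

    def append_row(self, *values: Any) -> None:
        # Rows accumulate in memory; a write happens only when the buffer fills.
        self.buffer.append(values)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        # Write every buffered row in one operation, then empty the buffer.
        if not self.buffer:
            return
        self.storage.extend(self.buffer)
        self.buffer.clear()
        self.writes += 1

    def close(self) -> None:
        # Flush whatever is left so no rows are lost.
        self.flush()


table: list[tuple[Any, ...]] = []
appender = ToyAppender(table, batch_size=3)
for sensor_id in range(7):
    appender.append_row(sensor_id, 20.5 + sensor_id)
appender.close()

print(len(table), appender.writes)  # 7 rows stored using 3 batch writes
```

Seven rows land in the table with three writes instead of seven. The real Appender's buffer size and flush behavior differ from this sketch, so check the client documentation for the language you use.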

Good for:

  • Kafka consumers or message queue ingestion
  • Log aggregation pipelines
  • IoT sensor data collection
  • Any scenario where data arrives continuously

A few things to watch out for. The Appender is order- and type-sensitive: values must match the table's columns exactly, with no inference. One constraint violation fails the entire batch, with no partial inserts. And each Appender instance writes to a single table.
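
These caveats can be sketched as a toy model. This is not DuckDB's API: `StrictBatch`, `column_types`, and `unique_column` are hypothetical names made up for the illustration, and the uniqueness check stands in for whatever constraints the real table has. The sketch models the behavior as described above: values are matched by position, types are not inferred, and one violation discards the whole pending batch.

```python
from typing import Any


class StrictBatch:
    """Toy model of a positional, type-checked, all-or-nothing batch.

    Hypothetical illustration only; not DuckDB's actual API.
    """

    def __init__(self, column_types: list[type], unique_column: int = 0) -> None:
        self.column_types = column_types    # expected type per column position
        self.unique_column = unique_column  # column treated as a primary key
        self.committed: list[tuple[Any, ...]] = []
        self.pending: list[tuple[Any, ...]] = []

    def append_row(self, *values: Any) -> None:
        # Values are matched to columns by position; nothing is inferred or cast.
        if len(values) != len(self.column_types):
            raise ValueError("wrong number of columns")
        for value, expected in zip(values, self.column_types):
            if type(value) is not expected:
                raise TypeError(
                    f"expected {expected.__name__}, got {type(value).__name__}"
                )
        self.pending.append(values)

    def flush(self) -> None:
        # Check the key constraint across the whole batch before committing.
        seen = {row[self.unique_column] for row in self.committed}
        for row in self.pending:
            key = row[self.unique_column]
            if key in seen:
                self.pending.clear()  # one violation discards the entire batch
                raise ValueError(f"duplicate key {key!r}")
            seen.add(key)
        self.committed.extend(self.pending)
        self.pending.clear()


batch = StrictBatch([int, str])
batch.append_row(1, "a")
batch.append_row(2, "b")
batch.flush()

try:
    batch.append_row("3", "c")  # string where an int column is expected
except TypeError as exc:
    type_error = str(exc)

batch.append_row(3, "c")    # valid on its own
batch.append_row(1, "dup")  # duplicates an already committed key
try:
    batch.flush()
except ValueError as exc:
    constraint_error = str(exc)

print(batch.committed)  # [(1, 'a'), (2, 'b')]
```

The valid row `(3, "c")` is lost along with the duplicate because both sat in the same batch. That is the practical cost of all-or-nothing semantics: validate or deduplicate upstream if you cannot afford to replay a batch.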

Available in C, C++, Go, Java, and Rust. For batch ETL or small datasets, regular INSERT is simpler and fine. But for streaming? This is the tool.

Appender – DuckDB

Tutorials