#duckdb

2 posts

24 May 2026

DuckDB: the single-node engine eating the warehouse

Most companies' data is not big enough to justify a distributed warehouse. A single fat box running DuckDB reads Parquet and Iceberg off S3 directly and answers the median analytics query in under a second, for a fixed bill and no cold start. The big-data era was mostly oversizing.

#infrastructure

24 May 2026

DuckLake: metadata belongs in a database, not a pile of files

Iceberg and Delta reimplemented a transactional catalog as JSON and Avro files in object storage, and then needed a real database catalog on top anyway. DuckLake's heresy is to skip the metadata file layer entirely: put all the metadata in SQL, keep the data in Parquet. It is both obvious and a little rude.
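
To make "metadata in SQL, data in Parquet" concrete, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in catalog database. The schema (`snapshot`, `data_file`), the helper names (`open_catalog`, `commit`, `files_at`), and the `s3://bucket/...` paths are all hypothetical and chosen for illustration; this is not DuckLake's actual table layout or API. It only shows the general idea that a commit becomes one SQL transaction and a table's state at any snapshot becomes one query.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open a toy catalog database (hypothetical schema, not DuckLake's)."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS snapshot (
            snapshot_id INTEGER PRIMARY KEY AUTOINCREMENT,
            note        TEXT
        );
        CREATE TABLE IF NOT EXISTS data_file (
            path             TEXT    NOT NULL,  -- location of a Parquet file
            table_name       TEXT    NOT NULL,
            added_snapshot   INTEGER NOT NULL REFERENCES snapshot(snapshot_id),
            removed_snapshot INTEGER REFERENCES snapshot(snapshot_id)
        );
        """
    )
    return conn


def commit(conn, table_name, add=(), remove=(), note=""):
    """Record a new snapshot that adds and/or retires Parquet files."""
    with conn:  # one SQL transaction: every row lands, or none do
        snap = conn.execute(
            "INSERT INTO snapshot (note) VALUES (?)", (note,)
        ).lastrowid
        conn.executemany(
            "INSERT INTO data_file (path, table_name, added_snapshot) "
            "VALUES (?, ?, ?)",
            [(p, table_name, snap) for p in add],
        )
        conn.executemany(
            "UPDATE data_file SET removed_snapshot = ? "
            "WHERE path = ? AND table_name = ? AND removed_snapshot IS NULL",
            [(snap, p, table_name) for p in remove],
        )
    return snap


def files_at(conn, table_name, snapshot_id):
    """List the Parquet files that make up a table as of a given snapshot."""
    rows = conn.execute(
        "SELECT path FROM data_file "
        "WHERE table_name = ? AND added_snapshot <= ? "
        "AND (removed_snapshot IS NULL OR removed_snapshot > ?) "
        "ORDER BY path",
        (table_name, snapshot_id, snapshot_id),
    )
    return [r[0] for r in rows]


# Usage: two small files are written, then compacted into one.
conn = open_catalog()
small = ["s3://bucket/events/a.parquet", "s3://bucket/events/b.parquet"]
s1 = commit(conn, "events", add=small, note="initial load")
s2 = commit(conn, "events", add=["s3://bucket/events/ab.parquet"],
            remove=small, note="compaction")
```

In this sketch, `files_at(conn, "events", s1)` still returns the two original files after the compaction, which is the time-travel behaviour that file-based formats get from chains of manifest files; here it falls out of a `WHERE` clause.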