#duckdb
2 posts
5 min read
DuckDB: the single-node engine eating the warehouse
Most companies' data is not big enough to justify a distributed warehouse. A single fat box running DuckDB reads Parquet and Iceberg tables directly off S3 and answers the median analytics query in under a second, for a fixed bill and with no cold start. The big-data era was mostly oversizing.
#data
#duckdb
#databases
#warehouse
#infrastructure
#opinion
#ai-assisted
6 min read
DuckLake: metadata belongs in a database, not a pile of files
Iceberg and Delta reimplemented a transactional catalog as JSON and Avro files in object storage, and then needed a real database catalog on top anyway. DuckLake's heresy is to skip the metadata file layer entirely: put all the metadata in SQL, keep the data in Parquet. It is both obvious and a little rude.
#data
#ducklake
#duckdb
#iceberg
#lakehouse
#opinion
#ai-assisted