Foundations
9 articles in this category.
2 min read
How data is stored: databases, warehouses, lakes, and lakehouses
A map of where data actually lives — from one table in a database to data warehouses, lakes, and lakehouses — and which layer fits which job. The overview that ties the whole storage stack together.
#data
#databases
#architecture
#ai-assisted
3 min read
OLTP vs OLAP: two opposite jobs
OLTP runs your application: many tiny reads and writes of whole rows. OLAP runs your analytics: huge scans of a few columns. They pull database design in opposite directions, which is why you end up with two systems and a pipeline between them.
#data
#oltp
#olap
#databases
#ai-assisted
2 min read
Schema-on-read vs schema-on-write
Schema-on-write enforces structure at ingest time (warehouses); schema-on-read stores raw data and applies structure at query time (lakes). The trade-off is flexibility vs guarantees.
#data
#architecture
#ai-assisted
2 min read
SQL vs NoSQL
SQL databases are relational, schema-enforced, and typically strongly consistent. NoSQL is a grab-bag of non-relational stores (key-value, document, wide-column, graph) built for scale and flexible schemas. Pick by access pattern.
#data
#databases
#ai-assisted
5 min read
What is a data lakehouse?
A lakehouse puts a database-style table layer on top of cheap object storage — warehouse guarantees (ACID, schema, time travel) over data-lake files. How the layers stack, and why the architecture exists at all.
#data
#lakehouse
#architecture
#ai-assisted
3 min read
What is a database?
A database is software that stores data and answers questions about it safely and quickly, using schemas, indexes, transactions, and a query language. Why it beats a pile of files, and the OLTP/OLAP split that divides the whole field.
#data
#databases
#sql
#ai-assisted
3 min read
What is a data warehouse?
A data warehouse is a database built the opposite way from your app's — columnar, massively parallel, tuned for huge scans and SQL over the whole company's history. What separates it from an OLTP database, and what you pay for the speed.
#data
#warehouse
#olap
#ai-assisted
3 min read
What is a data lake?
A data lake is your data as open-format files in object storage: cheap to keep and readable by any engine that understands the format. Done right, it is organized and governed, not a swamp. How it differs from a warehouse, and where a table format fits in.
#data
#lake
#object-storage
#ai-assisted
4 min read
What is ACID (database transactions)?
ACID is the set of four guarantees a database transaction makes: Atomicity, Consistency, Isolation, and Durability. Why they get hard to uphold in distributed systems.
#data
#databases
#transactions
#ai-assisted