OLTP vs OLAP: two opposite jobs
OLTP runs your application — many tiny reads and writes of whole rows. OLAP runs your analytics — huge scans of a few columns. They pull database design in opposite directions, which is why you end up with two systems and a pipe between them.
In this series
How data is storedOLTP (Online Transaction Processing) is the database behind your application. OLAP (Online Analytical Processing) is the database behind your dashboards and reports. They sound like two flavours of the same thing, but they want opposite designs at almost every level — and understanding why explains the shape of nearly every data platform, including why companies run two separate systems with a pipeline between them.
The two access patterns
The split comes down to one question: does a query touch whole rows or parts of many rows?
- OLTP serves an app: "fetch order #4821", "insert this payment", "set user 99's email". Many small operations per second, each touching a few complete rows, mostly by primary key. Correctness and low latency per operation are everything.
- OLAP serves analysis: "average order value by country last quarter." One query scans millions of rows but reads only a few columns, and aggregates. Throughput over huge scans matters; per-query latency of seconds is fine.
Everything else is downstream of this one difference.
| OLTP | OLAP | |
|---|---|---|
| Workload | Many small reads/writes | Few large scans |
| Touches | Whole rows, by key | Few columns, many rows |
| Storage | Row-oriented | Columnar (Parquet1) |
| Optimized for | Low-latency writes | High-throughput reads |
| Normalization | Highly normalized | Denormalized / star schema |
| Examples | Postgres, MySQL | BigQuery, Snowflake, a lakehouse |
| Sized in | Rows/sec, ms latency | Rows scanned, GB/sec |
🔗 Learn more — 1 How Parquet works: columnar storage explained
Why storage layout splits them
The deepest consequence is on-disk layout. OLTP stores data row by row, because fetching one whole order should be a single contiguous read. OLAP stores data column by column, because scanning one column across ten million rows should not drag the other 200 columns off disk with it. That is exactly the row-vs-columnar tradeoff that Parquet exists to exploit — and it is physically the opposite choice. A storage engine tuned for one is bad at the other; you cannot have both in the same file.
This is the real reason OLTP and OLAP are separate systems. It is not organizational habit — it is that the optimal byte layout for an app database is the pessimal layout for analytics.
The pipe between them
So you end up with two databases and a need to move data from the transactional one to the analytical one:
flowchart TD
APP["Application"] --> OLTP["OLTP database (Postgres) — row store, normalized"]
OLTP --> PIPE["Pipeline: CDC stream or batch ETL"]
PIPE --> OLAP["OLAP store (lakehouse / warehouse) — columnar"]
OLAP --> BI["Dashboards, reports, ML"]
classDef plain stroke:#7b88a1,stroke-width:2.5px
classDef key stroke:#a3be8c,stroke-width:2.5px
class PIPE key
class APP,OLTP,OLAP,BI plain
That pipe is the entire field of data engineering in one arrow. It is either a periodic batch ETL (dump and reload nightly) or a continuous change data capture stream that turns every row change in the OLTP database into an event — often onto Kafka2 — and lands it in columnar form. You run analytics on the copy specifically so a heavy report never competes with live transactions for the app database's resources.
🔗 Learn more — 2 What is Apache Kafka?
Common confusions
- "HTAP" / hybrid systems claim to do both in one engine. They exist and are useful, but they are managing the row-vs-columnar tension internally (often by keeping two copies), not abolishing it.
- OLAP is not "big data." It is a workload shape (scan-and-aggregate), independent of size. A 10 GB analytical query is still OLAP — and on data that size a single open-source node (DuckDB, ClickHouse) serves it instantly, with no distributed cluster and no managed-warehouse bill. The reflex to spin up a paid cloud warehouse for every analytical workload is mostly marketing; the databases are already fast.
- A data warehouse3 and an OLAP database are not quite synonyms. "Warehouse" names the system and its role; "OLAP" names the workload it is built for. The warehouse is one way to serve OLAP; a lakehouse is another.
🔗 Learn more — 3 What is a data warehouse?
The short version: OLTP = whole rows, written fast, row storage; OLAP = few columns, scanned wide, columnar storage. They are opposite enough that you keep two systems and spend real engineering effort on the pipeline that copies data from one to the other.