OLTP vs OLAP: two opposite jobs

OLTP (Online Transaction Processing) is the database behind your application. OLAP (Online Analytical Processing) is the database behind your dashboards and reports. They sound like two flavours of the same thing, but they want opposite designs at almost every level — and understanding why explains the shape of nearly every data platform, including why companies run two separate systems with a pipeline between them.

The two access patterns

The split comes down to one question: does a query touch whole rows or parts of many rows?

OLTP serves an app: "fetch order #4821", "insert this payment", "set user 99's email". Many small operations per second, each touching a few complete rows, mostly by primary key. Correctness and low latency per operation are everything.
OLAP serves analysis: "average order value by country last quarter." One query scans millions of rows but reads only a few columns, and aggregates. Throughput over huge scans matters; per-query latency of seconds is fine.

Everything else is downstream of this one difference.

	OLTP	OLAP
Workload	Many small reads/writes	Few large scans
Touches	Whole rows, by key	Few columns, many rows
Storage	Row-oriented	Columnar (Parquet¹)
Optimized for	Low-latency writes	High-throughput reads
Normalization²	Highly normalized	Denormalized / star schema³
Examples	Postgres, MySQL	BigQuery⁴, Snowflake, a lakehouse
Sized in	Rows/sec, ms latency	Rows scanned, GB/sec

🔗 Learn more — ¹ How Parquet works: columnar storage explained

🔗 Learn more — ² Normalization vs denormalization

🔗 Learn more — ³ Star schema and dimensional modeling

🔗 Learn more — ⁴ What is BigQuery?

Why storage layout splits them

The deepest consequence is on-disk layout. OLTP stores data row by row, because fetching one whole order should be a single contiguous read. OLAP stores data column by column, because scanning one column across ten million rows should not drag the other 200 columns off disk with it. That is exactly the row-vs-columnar tradeoff that Parquet exists to exploit — and it is physically the opposite choice. A storage engine tuned for one is bad at the other; you cannot have both in the same file.

This is the real reason OLTP and OLAP are separate systems. It is not organizational habit — it is that the optimal byte layout for an app database is the pessimal layout for analytics.

The pipe between them

So you end up with two databases and a need to move data from the transactional one to the analytical one:

flowchart TD
    APP["Application"] --> OLTP["OLTP database (Postgres) — row store, normalized"]
    OLTP --> PIPE["Pipeline: CDC stream or batch ETL"]
    PIPE --> OLAP["OLAP store (lakehouse / warehouse) — columnar"]
    OLAP --> BI["Dashboards, reports, ML"]

    classDef plain stroke:#7b88a1,stroke-width:2.5px
    classDef key stroke:#a3be8c,stroke-width:2.5px
    class PIPE key
    class APP,OLTP,OLAP,BI plain

That pipe is the entire field of data engineering in one arrow. It is either a periodic batch ETL⁵ (dump and reload nightly) or a continuous change data capture⁶ stream that turns every row change in the OLTP database into an event — often onto Kafka⁷ — and lands it in columnar form. You run analytics on the copy specifically so a heavy report never competes with live transactions for the app database's resources.

🔗 Learn more — ⁵ What is ETL (and how is ELT different)?

🔗 Learn more — ⁶ What is Change Data Capture (CDC)?

🔗 Learn more — ⁷ What is Apache Kafka?

Common confusions

"HTAP" / hybrid systems claim to do both in one engine. They exist and are useful, but they are managing the row-vs-columnar tension internally (often by keeping two copies), not abolishing it.
OLAP is not "big data." It is a workload shape (scan-and-aggregate), independent of size. A 10 GB analytical query is still OLAP — and on data that size a single open-source node (DuckDB, ClickHouse⁸) serves it instantly, with no distributed cluster and no managed-warehouse bill. The reflex to spin up a paid cloud warehouse for every analytical workload is mostly marketing; the databases are already fast.
A data warehouse⁹ and an OLAP database are not quite synonyms. "Warehouse" names the system and its role; "OLAP" names the workload it is built for. The warehouse is one way to serve OLAP; a lakehouse is another.

🔗 Learn more — ⁸ What is ClickHouse?

🔗 Learn more — ⁹ What is a data warehouse?

The short version: OLTP = whole rows, written fast, row storage; OLAP = few columns, scanned wide, columnar storage. They are opposite enough that you keep two systems and spend real engineering effort on the pipeline that copies data from one to the other.