Updated 21 May 2026 · 4 min read

What is a Jupyter notebook (and how does the runtime work)?

A short reference on Jupyter notebooks — the two-process architecture, the five ZeroMQ sockets between frontend and kernel, the .ipynb file format, and the hidden-state pitfall that catches almost everyone.

The one-line definition

A Jupyter notebook is a JSON document of mixed code, prose, and rendered output, paired at runtime with a separate language process called a kernel that actually executes the code. The notebook UI is the editor and renderer; the kernel is the language runtime. Almost every confusion users have about notebooks comes from forgetting these are two processes.

Two-process architecture

When you click Run cell in your browser, here is what is actually running:

flowchart LR
    UI["Frontend<br/>(browser tab, VS Code,<br/>Qt console, Colab)"]
    SRV["jupyter_server<br/>(HTTP + WebSocket bridge)"]
    K["Kernel process<br/>(ipykernel for Python,<br/>IRkernel for R, …)"]
    UI <-->|HTTP / WebSocket| SRV
    SRV <-->|ZeroMQ sockets,<br/>JSON messages| K

The frontend is whatever you are looking at — a browser tab, JupyterLab, VS Code, or Google Colab's web UI. It does not run your code. It sends messages to jupyter_server, which forwards them over ZeroMQ to the kernel — a separate OS process that imports your modules, allocates objects, and holds your variables.

The kernel does not know or care which frontend is connected. The same kernel can be driven by a browser, a VS Code window, and a CLI client at the same time. They each see the same kernel state because the state lives in the kernel process, not in the UI.

The five ZeroMQ sockets

The Jupyter messaging protocol defines five sockets between frontend and kernel. None of them go through a broker; ZeroMQ is just a sockets library with named patterns.

| Socket | Pattern | What flows |
| --- | --- | --- |
| Shell | request/reply | Code execution requests, introspection, completion |
| IOPub | publish/subscribe | Kernel broadcasts: stdout, stderr, rich displays, status |
| Stdin | request/reply | Kernel asks the user for input (e.g. input() in Python) |
| Control | request/reply | Out-of-band: interrupt, shutdown, debug |
| Heartbeat | request/reply | Periodic liveness check |

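When a kernel is launched, a JSON connection file is written (normally by the process that launches it, such as jupyter_server via jupyter_client) containing the transport, the IP address, the port number for each of these sockets, and an HMAC signing key. A client reads that file and opens the connections. A browser frontend never sees the ports itself, because jupyter_server holds the ZeroMQ connections on its behalf.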
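
As a rough illustration of what a client does with such a file, the sketch below parses a connection file and derives one ZeroMQ address per socket. The field names follow the connection-file layout used by jupyter_client; the port numbers, the key, and the helper name socket_addresses are invented for the example.

```python
import json

# Example connection file contents. The field names match the layout used by
# jupyter_client; the ports and the key are invented for illustration.
CONNECTION_FILE = """
{
  "transport": "tcp",
  "ip": "127.0.0.1",
  "shell_port": 53794,
  "iopub_port": 53795,
  "stdin_port": 53796,
  "control_port": 53797,
  "hb_port": 53798,
  "signature_scheme": "hmac-sha256",
  "key": "a0436f6c-1916-498b-8eb9-e81ab9368e84"
}
"""

# Connection-file field that holds the port of each of the five sockets.
PORT_FIELDS = {
    "shell": "shell_port",
    "iopub": "iopub_port",
    "stdin": "stdin_port",
    "control": "control_port",
    "heartbeat": "hb_port",
}


def socket_addresses(info: dict) -> dict:
    """Return a ZeroMQ-style address per socket (hypothetical helper)."""
    base = f"{info['transport']}://{info['ip']}"
    return {name: f"{base}:{info[field]}" for name, field in PORT_FIELDS.items()}


info = json.loads(CONNECTION_FILE)
addresses = socket_addresses(info)
for name, address in addresses.items():
    print(f"{name:<10}{address}")
```

The key is not part of any address: it is used to sign each message so the kernel can reject traffic from processes that did not read the connection file.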

How executing a cell actually works

Concretely, when you press Shift+Enter:

sequenceDiagram
    participant UI as Frontend
    participant Shell as Shell socket
    participant IOPub as IOPub socket
    participant K as Kernel

    UI->>Shell: execute_request<br/>{ code: "x = 1 + 1" }
    Shell->>K: deliver request
    K-->>IOPub: status: busy
    K-->>IOPub: execute_input (the code being run)
    Note over K: kernel executes the code
    K-->>IOPub: stream / display_data / execute_result
    K-->>IOPub: status: idle
    K->>Shell: execute_reply<br/>{ status: "ok",<br/>execution_count: 17 }
    Shell-->>UI: deliver reply

The frontend subscribes to IOPub and renders every published message: prints go to a stream output, the last expression goes to an execute_result with its mime bundle, errors arrive as error messages with the traceback. The execute_reply on the Shell socket carries the cell's execution_count (the number you see in the [17]: prompt) and the overall status.
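
To make the rendering step concrete, here is a minimal sketch of how a frontend might fold a run of IOPub messages into the outputs it stores for a cell. The messages are reduced to just msg_type and content (real messages also carry headers and a signature), and the function name iopub_to_outputs is hypothetical.

```python
def iopub_to_outputs(messages: list) -> list:
    """Convert simplified IOPub messages into notebook-style cell outputs."""
    outputs = []
    for msg in messages:
        msg_type = msg["msg_type"]
        content = msg["content"]
        if msg_type == "stream":
            # Text written to stdout or stderr.
            outputs.append({
                "output_type": "stream",
                "name": content["name"],
                "text": content["text"],
            })
        elif msg_type in ("execute_result", "display_data"):
            # Rich output: a MIME bundle plus optional metadata.
            output = {
                "output_type": msg_type,
                "data": content["data"],
                "metadata": content.get("metadata", {}),
            }
            if msg_type == "execute_result":
                output["execution_count"] = content["execution_count"]
            outputs.append(output)
        elif msg_type == "error":
            outputs.append({
                "output_type": "error",
                "ename": content["ename"],
                "evalue": content["evalue"],
                "traceback": content["traceback"],
            })
        # status and execute_input produce nothing to render, so they are skipped.
    return outputs


# What the kernel might publish for a cell containing: print("hi") and then 1 + 1
messages = [
    {"msg_type": "status", "content": {"execution_state": "busy"}},
    {"msg_type": "execute_input", "content": {"code": "print('hi')\n1 + 1", "execution_count": 17}},
    {"msg_type": "stream", "content": {"name": "stdout", "text": "hi\n"}},
    {"msg_type": "execute_result", "content": {"execution_count": 17, "data": {"text/plain": "2"}, "metadata": {}}},
    {"msg_type": "status", "content": {"execution_state": "idle"}},
]

outputs = iopub_to_outputs(messages)
for output in outputs:
    print(output["output_type"])
```

The resulting dictionaries have the same shape as the entries in a code cell's outputs list in the .ipynb file, which is why saved outputs survive a kernel shutdown.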

The .ipynb file format

A notebook file is straightforward JSON:

{
  "nbformat": 4,
  "nbformat_minor": 5,
  "metadata": { "kernelspec": { "name": "python3", "display_name": "Python 3" } },
  "cells": [
    {
      "cell_type": "code",
      "id": "a1b2c3d4",
      "metadata": {},
      "execution_count": 3,
      "source": ["x = 1 + 1\n", "x"],
      "outputs": [{ "output_type": "execute_result", "execution_count": 3, "metadata": {}, "data": { "text/plain": ["2"] } }]
    }
  ]
}

There are three cell types: code, markdown, and raw. Each cell carries its source and (for code cells) its rendered outputs. Output data is a MIME bundle, which is how a single cell can carry both text/plain and image/png for the same result and have the frontend pick the richest representation it can render.
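
The selection step can be sketched in a few lines. The preference order below is an assumption made for illustration (real frontends keep their own ordered list of renderers), and pick_representation is a hypothetical helper, not a Jupyter API. The base64 string is a truncated placeholder, not a real image.

```python
# Hypothetical preference order, richest first. Real frontends maintain their
# own renderer registry; this list exists only for the example.
PREFERENCE = ["text/html", "image/png", "text/plain"]


def pick_representation(bundle: dict, renderable: list = PREFERENCE) -> tuple:
    """Return (mime_type, data) for the first renderable type in the bundle."""
    for mime in renderable:
        if mime in bundle:
            return mime, bundle[mime]
    raise ValueError("no renderable representation in bundle")


# One result, two representations of it.
bundle = {
    "text/plain": "<Figure size 640x480 with 1 Axes>",
    "image/png": "iVBORw0KGgo...",  # placeholder for base64 image data
}

browser_choice = pick_representation(bundle)
terminal_choice = pick_representation(bundle, ["text/plain"])
print(browser_choice[0], terminal_choice[0])
```

A graphical frontend ends up with the image while a text-only client falls back to the plain-text form of the same result.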

The consequence engineers run into: the file embeds outputs. A notebook with a 10 MB rendered chart in it is a 10 MB file. Tools like nbstripout and jupytext exist to strip outputs at commit time or to round-trip notebooks through a paired .py file, both to keep diffs reviewable and to keep repos slim.
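
Because the file is plain JSON, the core of output stripping fits in a few lines. This is a simplified sketch of the idea, not how nbstripout is implemented; the function name strip_outputs and the sample notebook are made up for the example.

```python
import copy
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of the notebook with code-cell outputs and counts cleared."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped["cells"]:
        if cell["cell_type"] == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# A tiny in-memory notebook; a real one would be loaded with json.load().
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "id": "intro", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "id": "calc",
            "metadata": {},
            "execution_count": 3,
            "source": ["x = 1 + 1\n", "x"],
            "outputs": [
                {
                    "output_type": "execute_result",
                    "execution_count": 3,
                    "metadata": {},
                    "data": {"text/plain": ["2"]},
                }
            ],
        },
    ],
}

stripped = strip_outputs(notebook)
print(len(json.dumps(notebook)), "->", len(json.dumps(stripped)), "characters")
```

Source and cell order are untouched, so the stripped file diffs cleanly; the outputs come back the next time the notebook is run.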

Kernels and kernelspecs

A kernel is any program that speaks the Jupyter messaging protocol. The three most common:

  • ipykernel: the Python kernel installed alongside Jupyter by default, and the one most setups use.
  • IRkernel: the R kernel.
  • IJulia: the Julia kernel.

Beyond those, the community kernels list has well over a hundred entries: Bash, Go, Rust, TypeScript, Scala, MATLAB, Haskell, and many more.

Kernels are registered with Jupyter through a kernel.json file, one directory per kernel, in one of Jupyter's data paths (on Linux, ~/.local/share/jupyter/kernels, /usr/share/jupyter/kernels, etc.). List the ones installed on a machine with:

jupyter kernelspec list

A kernelspec is small — it specifies the kernel's display name, the command to launch the kernel process, and the language metadata. Everything else (which Python, which packages, which CUDA version) lives in the environment the command starts.
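
For a sense of scale, the sketch below shows a kernel.json of the shape ipykernel typically installs, and how a launcher might turn it into a command line. The {connection_file} placeholder is part of the kernelspec format; the helper launch_command and the /tmp path are hypothetical.

```python
import json

# A kernel.json of the shape ipykernel typically installs. The argv entry
# "{connection_file}" is a placeholder the launcher fills in at start time.
KERNEL_JSON = """
{
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3 (ipykernel)",
  "language": "python"
}
"""


def launch_command(spec: dict, connection_file: str) -> list:
    """Substitute the connection-file path into argv (hypothetical helper)."""
    return [arg.replace("{connection_file}", connection_file) for arg in spec["argv"]]


spec = json.loads(KERNEL_JSON)
# The path is made up; a real launcher picks a file in Jupyter's runtime directory.
command = launch_command(spec, "/tmp/kernel-1234.json")
print(spec["display_name"])
print(" ".join(command))
```

Which interpreter "python" resolves to is decided by argv, which is why kernelspecs for virtual environments usually spell out the full path to that environment's Python.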

The hidden-state pitfall

The kernel's variables persist between cell executions, in whatever order you ran the cells. That sounds obvious in the abstract and bites everyone in practice:

  • Run cell 3 (defines x = 5), then cell 1 (uses x). Cell 1 sees x = 5 even though it appears before cell 3 in the file.
  • Delete cell 3 from the file. x is still defined in the kernel. The notebook appears to work because the variable is still in memory.
  • Save and share. The next person opens the notebook on a fresh kernel and gets a NameError.

The fix is to make a habit of Kernel → Restart & Run All before committing or publishing. The headless equivalent for CI pipelines is jupyter nbconvert --execute or papermill, which both run a notebook end-to-end against a clean kernel and, by default, fail with an error if any cell raises.
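
The three steps above can be reproduced without Jupyter at all. In this toy model a dictionary stands in for the kernel's global namespace and exec() stands in for running a cell; it is an illustration of the mechanism, not how ipykernel is implemented.

```python
# Each "cell" is a string of source keyed by its position in the file.
cells = {
    1: "y = x * 2",
    3: "x = 5",
}

# The long-lived kernel: one namespace shared by every cell execution.
kernel_namespace = {}
exec(cells[3], kernel_namespace)  # run cell 3 first
exec(cells[1], kernel_namespace)  # cell 1 works because x already exists
result_out_of_order = kernel_namespace["y"]

# Delete cell 3 from the "file". The live namespace still holds x.
del cells[3]
still_defined = "x" in kernel_namespace

# Restart and run all: a fresh namespace, cells executed in file order.
fresh_namespace = {}
fresh_error = None
try:
    for number in sorted(cells):
        exec(cells[number], fresh_namespace)
except NameError as exc:
    fresh_error = exc

print(result_out_of_order, still_defined, type(fresh_error).__name__)
# prints: 10 True NameError
```

The notebook looked healthy right up to the restart, which is exactly the situation a clean end-to-end run is meant to expose.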

Where you actually use Jupyter in 2026

  • JupyterLab: the modern Jupyter UI; default in most fresh installations.
  • Classic Notebook: the original UI; still surprisingly common, still maintained.
  • VS Code notebooks: VS Code can host a Jupyter kernel directly with its own renderer. Same protocol underneath.
  • Google Colab: managed Jupyter with GPUs and a heavily customised frontend; the kernels are still IPython-based.
  • Databricks notebooks: similar idea, custom UI on top of Spark/Photon, with a Python kernel that talks IPython-style.
  • JupyterHub: multi-user Jupyter for teams or universities; spawns a separate single-user Jupyter server per user, which in turn manages that user's kernels.

All of them speak the same kernel protocol underneath. Pick whichever frontend fits the workflow; the kernel is what makes the cell actually run.

The whole point

The notebook file is the cells. The kernel is the runtime. Hold those two ideas as separate, and the rest of Jupyter's quirks — out-of-order execution, kernel restarts, why outputs persist when the source has been deleted — stop being mysterious and start being mechanical.