JSON: discovered, not invented
Douglas Crockford has said for twenty years that he did not invent JSON, he discovered it. The format was sitting inside JavaScript the whole time, waiting for someone to extract it. The story of how a 2001 footnote in a browser scripting language ate XML's lunch is shorter than most people think.
AI-assisted post. Drafted with help from Claude, edited and fact-checked by Mart.
Douglas Crockford has been careful for twenty years about the verb he uses. He did not invent JSON. He discovered it. The format was already inside JavaScript (the object and array literal syntax that was standardised in the third edition of ECMAScript in December 1999), and all Crockford did was pull it out, write a tiny grammar around it, and put up a one-page website. The story of how a footnote in a browser scripting language replaced XML as the lingua franca of the web is shorter than most people think it is, and the structural reason JSON won is the same reason Crockford keeps insisting on "discovered".
Two thousand and one, State Software
Crockford was at State Software in 2001, building a tool to push data between a browser and a server without forcing the browser to reload. The standard way to do that in 2001 was either an <iframe> trick (load a hidden frame, parse the resulting HTML, hope nothing escapes) or XMLHttpRequest (introduced as IXMLHTTPRequest in IE5 in March 1999, mostly used to ferry XML around). Both worked. Both were heavy. Both required either a parser written specifically for the document shape or a full XML parser, which in 2001 was a non-trivial dependency for any application code.
The breakthrough was small: JavaScript already had a syntax for nested objects and arrays of primitives, and JavaScript was already running in the browser. There was no need to ship anything to parse the data the browser was about to receive — eval() was a parser. Crockford realised that the same syntax could be sent the other direction too, generated server-side, and consumed on either end of the wire by any language that had a small parser written for the grammar. The grammar was already there, embedded in the language. What was missing was someone to name the thing and write the grammar down.
Crockford registered JSON.org in 2001 and put up a one-page spec. The page is still there, mostly unchanged. It describes the entire format in five railroad diagrams. The name expanded to JavaScript Object Notation — the only sense in which the term is an invention rather than a discovery.
The bloat JSON was reacting against
XML in 2001 was not the verbose villain it later became. It was the consensus interchange format. SOAP, WSDL, RSS, Maven, Spring config, Office Open XML, and most enterprise integration projects of the 2000s were XML applications. The industry had ratified XML as the way two systems talk. The cost of that ratification was complexity. A SOAP envelope to send {"id": 7} was a multi-line document with namespaces, schema references, and at least one boilerplate header per round trip.
Crockford's framing line was that JSON was "the fat-free alternative to XML". The fat being shed was the document-tree apparatus that XML inherited from SGML and that almost nobody using XML-as-config actually needed. Most XML in 2001 was being used to express a tree of key-value pairs, which is exactly what JavaScript's object literal syntax already represented in three or four fewer characters. The category mismatch was the cost.
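To make the size difference concrete, here is a rough sketch of the same one-field message in both shapes. The SOAP envelope is a hand-written illustration rather than a capture from a real service: the envelope namespace is the SOAP 1.1 one, but the GetItem element and the example.com namespace are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ITEM_NS = "http://example.com/items"                   # hypothetical application namespace

# Illustrative envelope carrying a single field, id = 7.
soap_message = f"""<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="{SOAP_NS}">
  <soap:Header/>
  <soap:Body>
    <m:GetItem xmlns:m="{ITEM_NS}">
      <m:id>7</m:id>
    </m:GetItem>
  </soap:Body>
</soap:Envelope>"""

# The same payload as a JSON text.
json_message = '{"id": 7}'


def id_from_soap(text: str) -> int:
    """Walk Envelope -> Body -> GetItem -> id using namespace-qualified tags."""
    root = ET.fromstring(text.encode("utf-8"))
    node = root.find(f"{{{SOAP_NS}}}Body/{{{ITEM_NS}}}GetItem/{{{ITEM_NS}}}id")
    return int(node.text)


def id_from_json(text: str) -> int:
    """The JSON side is a dictionary lookup after one parse call."""
    return json.loads(text)["id"]
```

Both functions recover the same integer. The difference is how much of the message, and how much of the extraction code, is spent on namespaces and wrappers rather than on the value.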
The minimal spec at JSON.org
The discipline of the original JSON spec is the part that made it travel. Crockford documented six value types: string, number, object, array, true/false, null. He did not include comments. He did not include dates. He did not include trailing commas. He did not include integer types separate from floats. The spec fit on one page because every decision that could be deferred was deferred. The result was that any language with a string parser and a recursive function could write a JSON parser in an afternoon, and a JSON parser in any of the popular languages of 2002 — Perl, Python, Ruby, PHP, Java — fit in a few hundred lines of source. That portability was not an accident; it was what the small spec was for.
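The "afternoon parser" claim is easy to check by writing one. What follows is a minimal sketch in Python, not a reconstruction of any historical library: the function names are made up for this post, error messages are crude, and \u escapes for surrogate pairs are not recombined. It covers all six value types and, because the grammar has no production for them, rejects comments and trailing commas without any special handling.

```python
import re

# Number grammar as drawn on JSON.org; [0-9] keeps the match ASCII-only.
NUMBER = re.compile(r"-?(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?")
STRING = re.compile(r'"((?:[^"\\\x00-\x1f]|\\(?:["\\/bfnrt]|u[0-9a-fA-F]{4}))*)"')
ESCAPE = re.compile(r'\\(u[0-9a-fA-F]{4}|.)')
SIMPLE = {'"': '"', "\\": "\\", "/": "/", "b": "\b", "f": "\f",
          "n": "\n", "r": "\r", "t": "\t"}
LITERALS = {"true": True, "false": False, "null": None}


def parse(text):
    """Parse one JSON text and return the equivalent Python value."""
    value, pos = _value(text, _skip(text, 0))
    if _skip(text, pos) != len(text):
        raise ValueError(f"unexpected data after value at offset {pos}")
    return value


def _skip(text, pos):
    # Advance past the four whitespace characters the grammar allows.
    while pos < len(text) and text[pos] in " \t\n\r":
        pos += 1
    return pos


def _expect(text, pos, char):
    if text[pos:pos + 1] != char:
        raise ValueError(f"expected {char!r} at offset {pos}")
    return pos + 1


def _unescape(match):
    code = match.group(1)
    # \uXXXX maps to one UTF-16 code unit; surrogate pairs stay unpaired here.
    return chr(int(code[1:], 16)) if len(code) == 5 else SIMPLE[code]


def _string(text, pos):
    match = STRING.match(text, pos)
    if match is None:
        raise ValueError(f"expected a string at offset {pos}")
    return ESCAPE.sub(_unescape, match.group(1)), match.end()


def _value(text, pos):
    head = text[pos:pos + 1]
    if head == "{":
        return _object(text, pos + 1)
    if head == "[":
        return _array(text, pos + 1)
    if head == '"':
        return _string(text, pos)
    for word, literal in LITERALS.items():
        if text.startswith(word, pos):
            return literal, pos + len(word)
    match = NUMBER.match(text, pos)
    if match is None:
        raise ValueError(f"unexpected input at offset {pos}")
    token = match.group()
    # JSON has one number type; use a Python int when the token allows it.
    number = float(token) if any(c in token for c in ".eE") else int(token)
    return number, match.end()


def _array(text, pos):
    items = []
    pos = _skip(text, pos)
    if text[pos:pos + 1] == "]":
        return items, pos + 1
    while True:
        item, pos = _value(text, _skip(text, pos))
        items.append(item)
        pos = _skip(text, pos)
        if text[pos:pos + 1] == "]":
            return items, pos + 1
        pos = _expect(text, pos, ",")


def _object(text, pos):
    members = {}
    pos = _skip(text, pos)
    if text[pos:pos + 1] == "}":
        return members, pos + 1
    while True:
        key, pos = _string(text, _skip(text, pos))
        pos = _expect(text, _skip(text, pos), ":")
        members[key], pos = _value(text, _skip(text, pos))
        pos = _skip(text, pos)
        if text[pos:pos + 1] == "}":
            return members, pos + 1
        pos = _expect(text, pos, ",")
```

Calling parse('{"id": 7}') returns the dictionary {'id': 7}, and parse("[1, 2,]") raises ValueError. The whole thing is three regular expressions and two mutually recursive container functions, which is roughly the size the one-page spec implies.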
Importantly, JSON sits in the same category as YAML and TOML — data serialisation, not markup language. The artifact JSON describes is a data structure being projected to text, not a document being annotated.
Standardisation, in three waves
The format went into the wild in 2001 and propagated by word of mouth and library port for five years before anybody formalised it. The first IETF standardisation came in July 2006 as RFC 4627, a short Informational RFC that mostly codified what the JSON.org page already said. It left enough ambiguity (duplicate object keys, top-level scalars, UTF-8 handling, integer precision) that implementations diverged in small ways for the next eight years.
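Two of those ambiguities are easy to see from inside one implementation. The sketch below uses Python's standard json module as the example parser; reject_duplicates and strict_loads are hypothetical helper names for this post, and other libraries may resolve the same inputs differently, which is exactly the divergence in question.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def strict_loads(text):
    """Parse JSON, treating a repeated key as an error instead of a silent overwrite."""
    return json.loads(text, object_pairs_hook=reject_duplicates)


ambiguous = '{"id": 7, "id": 8}'

# Python's json module keeps the last value for a repeated key.
last_wins = json.loads(ambiguous)

# A bare scalar at the top level: outside RFC 4627's grammar, accepted here.
top_level_scalar = json.loads("7")
```

Here last_wins is {'id': 8}, while strict_loads(ambiguous) raises ValueError. Both behaviours are defensible readings of the same text, so two services that both "speak JSON" can still disagree about what a message says.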
The second wave was ECMA's. ECMA-404 was published in October 2013 as a short, grammar-only standard with no normative parsing behaviour. It was deliberately narrower than RFC 4627, describing only what a JSON text is and not how to parse one, and was intended to be the authoritative reference for the syntax. ECMA-404 has been revised once since: the second edition appeared in December 2017, alongside RFC 8259.
The third wave was IETF cleanup. RFC 7159 replaced RFC 4627 in March 2014, allowing top-level scalars; then RFC 8259 in December 2017 made UTF-8 mandatory for JSON exchanged between systems and clarified duplicate-key handling. The same year, ISO ratified ISO/IEC 21778:2017 as the international standard. By the end of 2017, JSON had three current standards documents (ECMA-404, RFC 8259, ISO/IEC 21778) and two obsoleted RFCs behind them, all describing the same syntax in slightly different words. Crockford wrote RFC 4627 himself; the later RFCs were edited by Tim Bray.
The Google Trends crossover
The visible cultural moment came in 2009–2010, when Google Trends shows JSON queries overtaking XML queries for the first time. The crossover was the public acknowledgement of a shift that had already happened in API design. Twitter's REST API had gone JSON-by-default in 2009. Facebook's Graph API launched JSON-only in 2010. Stripe in 2011 was JSON throughout. By the time the trend lines crossed, the major web platforms had already migrated and the rest of the industry was catching up.
What XML did not lose was the document-markup half of its job. XHTML, EPUB, DocBook, SVG, OOXML, and the rest of the XML-as-document-tree family are still XML in 2026, precisely because XML actually is a markup language and the alternative formats are not. JSON eats XML's data-interchange lunch; it leaves XML's document-markup plate alone. The category split is the one post 10 walked through.
What JSON still cannot do
The minimalism that made JSON portable left five recurring complaints that the broader ecosystem keeps trying to patch:
- No comments. Crockford explicitly removed them in 2002 because he saw parsers using comments as directives, which would create vendor incompatibilities. The JSON5 extension (2012) puts them back.
- No trailing commas. Strict JSON rejects them. JSON5 and JSONC (Microsoft's superset, used in VS Code config) allow them.
- No date type. Dates are strings by convention, usually ISO 8601. Every system layers its own date parsing on top.
- Integer precision. The grammar itself puts no limit on the size of a number, but JavaScript and many other parsers read every number as an IEEE 754 double, which means integers above 2^53 lose precision. Bigint-aware libraries handle this case-by-case.
- No newline-delimited streaming. NDJSON added that as a side spec; one JSON object per line.
Each of these has a partial fix in a downstream format (JSON5, JSONC, NDJSON, JSON-LD, JSON Schema). None of them changed the base JSON grammar. The exclusions are part of why the grammar held.
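Most of the complaints in the list can be reproduced in a few lines. This is a rough sketch against Python's standard json module; the helper names are made up for illustration, the timestamp is an arbitrary value, and passing parse_int=float is only a way of imitating a parser that stores every number as a double.

```python
import json
from datetime import datetime, timezone


def is_strict_json(text):
    """True if the standard parser accepts the text as plain JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Comments and trailing commas: fine in JSON5 or JSONC, rejected by strict JSON.
with_trailing_comma = '{"a": 1,}'
with_comment = '// settings\n{"a": 1}'

# Dates: the format has no date type, so the value travels as an ISO 8601 string
# and each side converts it back by convention.
sent = datetime(2026, 1, 1, tzinfo=timezone.utc)
encoded = json.dumps({"sent": sent.isoformat()})
decoded = datetime.fromisoformat(json.loads(encoded)["sent"])

# Integer precision: 2^53 + 1 is a valid JSON number, but not a valid double.
big = 2**53 + 1
as_python_int = json.loads(str(big))                # exact, Python ints are unbounded
as_double = json.loads(str(big), parse_int=float)   # imitates a double-only parser


def read_ndjson(text):
    """NDJSON: one complete JSON value per line, blank lines skipped."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]
```

In this sketch as_python_int keeps the full value while as_double comes back as 2^53, one less than what was sent. None of these workarounds touch the grammar: the date is still a string, the big integer is still a number token, and NDJSON is still ordinary JSON with a framing rule around it.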
A short close
Crockford's discovery framing is precise: the syntax was already in the language, the use case was already in the air, and the only thing missing was a name and a spec page. Crockford supplied both, refused to extend the grammar, and let the format propagate on its own minimalism. Twenty-five years later JSON is the default API payload format on the web and the default config format in roughly half the ecosystems that have not landed on YAML or TOML. The success was not in the design; it was in the restraint. Every decision Crockford did not make in 2001 is a decision the ecosystem has been free to make differently since.