How I Got Luhmann's 73,715-Card Archive in a Couple of Minutes
March 28, 2026 · Martijn Aslander

Niklas Luhmann's Zettelkasten is legendary. Over 45 years, the German sociologist built a system of roughly 90,000 handwritten note cards that powered more than 50 books and 550 articles. It's the most famous example of networked note-taking in history.
Bielefeld University digitized the entire thing. Every card is scanned. Most are transcribed. They put it online at niklas-luhmann-archiv.de — and then made the exact mistake Luhmann would have predicted.
The irony
They built a JavaScript-heavy web application. You click through cards one by one. There's a search page. It renders beautifully. But there's no download button. No export. No "here's the dataset."
Luhmann wrote about exactly this kind of thing — systems that create access rituals instead of enabling genuine communication. The archive website is a presentation layer pretending to be an access layer.
The hidden API
Behind the website sits a BaseX XML database with a simple search API:
https://v0.api.niklas-luhmann-archiv.de/ZK/search
It takes a single q parameter containing a JSON query:
{
  "page": 1,
  "rows": 10000,
  "fulltext": "",
  "fuzzy": false,
  "FTSearchMode": "and",
  "zettelnummer": "",
  "zettelnummerSearchMode": "starts-with",
  "areas": [],
  "ref": "",
  "zks": ["1"]
}
Set zks to ["1"] for the first Zettelkasten (1951–1962) or ["2"] for the second (1963–1996). Crank rows up to 10,000 and paginate.
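Building the request in Python makes the shape concrete. This sketch only constructs the URL; the query fields mirror the JSON above, and nothing here goes beyond the standard library:

```python
import json
from urllib.parse import urlencode

API = "https://v0.api.niklas-luhmann-archiv.de/ZK/search"

def search_url(zk="1", page=1, rows=10000):
    # The entire JSON query travels in the single q parameter
    query = {
        "page": page, "rows": rows, "fulltext": "", "fuzzy": False,
        "FTSearchMode": "and", "zettelnummer": "",
        "zettelnummerSearchMode": "starts-with",
        "areas": [], "ref": "", "zks": [zk],
    }
    return API + "?" + urlencode({"q": json.dumps(query)})
```

From there, `requests.get(search_url("2", page=3)).json()` would fetch page 3 of ZK II (`requests` being an assumed third-party dependency).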
What you get
Each card comes back as a JSON object with:
- transcriptionPreview — a preview of the transcribed text (truncated for longer cards; full transcriptions require individual TEI-XML requests)
- luhmann_number — Luhmann's original numbering system
- meta — card type, position in the physical archive, section
- file — pointer to the original XML and scan images
- flags — whether a branch visualization exists
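Assuming that shape, a downloaded page can be skimmed in a few lines of Python (the results key and field names follow the description above; the rest is a sketch):

```python
import json

def preview_cards(path, limit=5):
    # Return (luhmann_number, first 60 chars of preview) for the first few cards
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return [
        (card.get("luhmann_number"), (card.get("transcriptionPreview") or "")[:60])
        for card in data.get("results", [])[:limit]
    ]
```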
The download
Nine HTTP requests. Three for ZK I (22,079 cards), six for ZK II (51,636 cards). Three seconds of politeness delay between each. Total wall time: under 30 seconds.

ZK I: 22,079 cards
ZK II: 51,636 cards — 63 MB
TOTAL: 73,715 cards — 100 MB
The script is 80 lines of Bash and Python. It resumes if interrupted, merges chunks into single files, and reports totals.
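The merge step is the simplest part. A minimal Python sketch, assuming the zk{n}_p{page}.json naming used by the download script and a results array per page:

```python
import glob
import json
import re

def merge_chunks(zk_id):
    # Concatenate per-page downloads (zk1_p1.json, zk1_p2.json, ...) in page order
    def page_no(path):
        return int(re.search(r"_p(\d+)\.json$", path).group(1))

    cards = []
    for path in sorted(glob.glob(f"zk{zk_id}_p*.json"), key=page_no):
        with open(path, encoding="utf-8") as f:
            cards.extend(json.load(f).get("results", []))
    return cards
```

Writing `merge_chunks(1)` back out with `json.dump` gives the single-file version; sorting numerically (rather than lexically) keeps page 10 after page 9.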
The missing ~16,000
The archive reports around 90,000 physical cards, but the API returns 73,715. The difference is the bibliographic sections (Auszüge 15–20 of ZK II), keyword registers (Schlagwortregister), and person registers — six sections of approximately 2,500–3,700 cards each that are catalogued separately. Sections 18–20 have not yet been published online. (Corrected thanks to Ton Zijlstra, who pointed to the archive's own documentation.)
What this enables
With the full dataset as JSON, you can:
- Build a local graph of Luhmann's cross-references
- Run full-text search across 45 years of thinking
- Analyze his numbering patterns and branching structure
- Train models on what actual long-term knowledge work looks like
- Study how his vocabulary and references evolved over time
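For the graph idea, the numbering itself encodes the tree: a card's parent is its number minus the last alternating segment. A sketch, assuming luhmann_number values shaped like 21/3d7a6 (the archive's actual notation may use additional separators):

```python
import re

def parent_of(number):
    # "21/3d7a6" -> "21/3d7a": drop the last numeric-or-alphabetic run.
    # Luhmann's numbers alternate digits and letters, one run per branching level.
    if "/" not in number:
        return None  # top-level section card has no parent
    section, path = number.split("/", 1)
    runs = re.findall(r"\d+|[a-zA-Z]+", path)
    if len(runs) <= 1:
        return section
    return section + "/" + "".join(runs[:-1])
```

Mapping `parent_of` over every luhmann_number yields (child, parent) edges ready for any graph library.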
The data was always there. It just needed someone to skip the presentation layer.
The script
#!/bin/bash
# Download Niklas Luhmann's complete Zettelkasten via the search API
API="https://v0.api.niklas-luhmann-archiv.de/ZK/search"

for zk_id in 1 2; do
  page=1
  while true; do
    # The whole JSON query travels in a single q parameter
    q="{\"page\":${page},\"rows\":10000,\"fulltext\":\"\",\"fuzzy\":false,\"FTSearchMode\":\"and\",\"zettelnummer\":\"\",\"zettelnummerSearchMode\":\"starts-with\",\"areas\":[],\"ref\":\"\",\"zks\":[\"${zk_id}\"]}"
    # -G --data-urlencode lets curl handle the URL-encoding directly
    curl -s -G --data-urlencode "q=${q}" -o "zk${zk_id}_p${page}.json" "$API"
    count=$(python3 -c "import json; print(len(json.load(open('zk${zk_id}_p${page}.json')).get('results',[])))")
    total=$(python3 -c "import json; print(json.load(open('zk${zk_id}_p${page}.json')).get('numberOfResults',0))")
    echo "ZK $zk_id page $page: $count cards (of $total)"
    # Stop when a page comes back empty or we've paged past the reported total
    { [ "$count" -eq 0 ] || [ $((page * 10000)) -ge "$total" ]; } && break
    page=$((page + 1))
    sleep 3   # politeness delay between requests
  done
done
That's it. Luhmann's life's work, in a curl loop.