How I Got Luhmann's 73,715-Card Archive in a Couple of Minutes
March 28, 2026 · Martijn Aslander

Niklas Luhmann's Zettelkasten is legendary. Over 45 years, the German sociologist built a system of roughly 90,000 handwritten note cards that powered more than 50 books and 550 articles. It's the most famous example of networked note-taking in history.
Bielefeld University digitized the entire thing. Every card is scanned. Most are transcribed. They put it online at niklas-luhmann-archiv.de — and then made the exact mistake Luhmann would have predicted.
The irony
They built a JavaScript-heavy web application. You click through cards one by one. There's a search page. It renders beautifully. But there's no download button. No export. No "here's the dataset."
Luhmann wrote about exactly this kind of thing — systems that create access rituals instead of enabling genuine communication. The archive website is a presentation layer pretending to be an access layer.
The hidden API
Behind the website sits a BaseX XML database with a simple search API:
https://v0.api.niklas-luhmann-archiv.de/ZK/search
It takes a single q parameter containing a JSON query:
{
  "page": 1,
  "rows": 10000,
  "fulltext": "",
  "fuzzy": false,
  "FTSearchMode": "and",
  "zettelnummer": "",
  "zettelnummerSearchMode": "starts-with",
  "areas": [],
  "ref": "",
  "zks": ["1"]
}
Set zks to ["1"] for the first Zettelkasten (1951–1962) or ["2"] for the second (1963–1996). Crank rows up to 10,000 and paginate.
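Building the request in Python makes the shape concrete. This sketch only constructs the URL; the query fields mirror the JSON above, and nothing here goes beyond the standard library:

```python
import json
from urllib.parse import urlencode

API = "https://v0.api.niklas-luhmann-archiv.de/ZK/search"

def search_url(zk="1", page=1, rows=10000):
    # The entire JSON query travels in the single q parameter
    query = {
        "page": page, "rows": rows, "fulltext": "", "fuzzy": False,
        "FTSearchMode": "and", "zettelnummer": "",
        "zettelnummerSearchMode": "starts-with",
        "areas": [], "ref": "", "zks": [zk],
    }
    return API + "?" + urlencode({"q": json.dumps(query)})
```

From there, `requests.get(search_url("2", page=3)).json()` would fetch page 3 of ZK II (`requests` being an assumed third-party dependency).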
What you get
Each card comes back as a JSON object with:
- transcriptionPreview — a preview of the transcribed text (truncated for longer cards; full transcriptions require individual TEI-XML requests)
- luhmann_number — Luhmann's original numbering system
- meta — card type, position in the physical archive, section
- file — pointer to the original XML and scan images
- flags — whether a branch visualization exists
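Assuming that shape, a downloaded page can be skimmed in a few lines of Python (the results key and field names follow the description above; the rest is a sketch):

```python
import json

def preview_cards(path, limit=5):
    # Return (luhmann_number, first 60 chars of preview) for the first few cards
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return [
        (card.get("luhmann_number"), (card.get("transcriptionPreview") or "")[:60])
        for card in data.get("results", [])[:limit]
    ]
```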
The download
Nine HTTP requests. Three for ZK I (22,079 cards), six for ZK II (51,636 cards). Three seconds of politeness delay between each. Total wall time: under 30 seconds.

ZK I: 22,079 cards
ZK II: 51,636 cards — 63 MB
TOTAL: 73,715 cards — 100 MB
The script is 80 lines of Bash and Python. It resumes if interrupted, merges chunks into single files, and reports totals.
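The merge step is the simplest part. A minimal Python sketch, assuming the zk{n}_p{page}.json naming used by the download script and a results array per page:

```python
import glob
import json
import re

def merge_chunks(zk_id):
    # Concatenate per-page downloads (zk1_p1.json, zk1_p2.json, ...) in page order
    def page_no(path):
        return int(re.search(r"_p(\d+)\.json$", path).group(1))

    cards = []
    for path in sorted(glob.glob(f"zk{zk_id}_p*.json"), key=page_no):
        with open(path, encoding="utf-8") as f:
            cards.extend(json.load(f).get("results", []))
    return cards
```

Writing `merge_chunks(1)` back out with `json.dump` gives the single-file version; sorting numerically (rather than lexically) keeps page 10 after page 9.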
The missing ~16,000
The archive reports around 90,000 physical cards, but the API returns 73,715. The difference is the bibliographic sections (Auszüge 15–20 of ZK II), keyword registers (Schlagwortregister), and person registers — six sections of approximately 2,500–3,700 cards each that are catalogued separately. Sections 18–20 have not yet been published online. (Corrected thanks to Ton Zijlstra, who pointed to the archive's own documentation.)
What this enables
With the full dataset as JSON, you can:
- Build a local graph of Luhmann's cross-references
- Run full-text search across 45 years of thinking
- Analyze his numbering patterns and branching structure
- Train models on what actual long-term knowledge work looks like
- Study how his vocabulary and references evolved over time
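For the graph idea, the numbering itself encodes the tree: a card's parent is its number minus the last alternating segment. A sketch, assuming luhmann_number values shaped like 21/3d7a6 (the archive's actual notation may use additional separators):

```python
import re

def parent_of(number):
    # "21/3d7a6" -> "21/3d7a": drop the last numeric-or-alphabetic run.
    # Luhmann's numbers alternate digits and letters, one run per branching level.
    if "/" not in number:
        return None  # top-level section card has no parent
    section, path = number.split("/", 1)
    runs = re.findall(r"\d+|[a-zA-Z]+", path)
    if len(runs) <= 1:
        return section
    return section + "/" + "".join(runs[:-1])
```

Mapping `parent_of` over every luhmann_number yields (child, parent) edges ready for any graph library.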
The data was always there. It just needed someone to skip the presentation layer.
The script
#!/bin/bash
# Download Niklas Luhmann's complete Zettelkasten via the search API
API="https://v0.api.niklas-luhmann-archiv.de/ZK/search"

for zk_id in 1 2; do
  page=1
  while true; do
    # The whole JSON query travels in a single q parameter
    q="{\"page\":${page},\"rows\":10000,\"fulltext\":\"\",\"fuzzy\":false,\"FTSearchMode\":\"and\",\"zettelnummer\":\"\",\"zettelnummerSearchMode\":\"starts-with\",\"areas\":[],\"ref\":\"\",\"zks\":[\"${zk_id}\"]}"
    # -G --data-urlencode lets curl handle the URL-encoding directly
    curl -s -G --data-urlencode "q=${q}" -o "zk${zk_id}_p${page}.json" "$API"
    count=$(python3 -c "import json; print(len(json.load(open('zk${zk_id}_p${page}.json')).get('results',[])))")
    total=$(python3 -c "import json; print(json.load(open('zk${zk_id}_p${page}.json')).get('numberOfResults',0))")
    echo "ZK $zk_id page $page: $count cards (of $total)"
    # Stop when a page comes back empty or we've paged past the reported total
    { [ "$count" -eq 0 ] || [ $((page * 10000)) -ge "$total" ]; } && break
    page=$((page + 1))
    sleep 3   # politeness delay between requests
  done
done
That's it. Luhmann's life's work, in a curl loop.