The technical barrier between digitising and making accessible — a bottleneck the archive sector has faced for twenty years — may have been resolved in recent months without most people noticing. This experiment shows why.
The archive world has been working with three steps for decades:

1. Collect and preserve the physical material.
2. Digitise: scan the material into images.
3. Make accessible: describe the material so it can be found.
Step 3 is where things stall — and the situation is more nuanced than you might think. Making accessible in the archive world does not mean transcribing every card; it means describing what is there so someone can find it. And at the collection level, that has often been done: cards are listed in inventories, organised by name, and retrievable physically or digitally. But making the contents of those cards searchable — reading each card, structuring it into fields, making it linkable and countable — is a fundamentally different step. And that step is left undone for most collections, simply because doing it by hand is too much work.
The consequence is that enormous collections are digitised and findable at the item level, but the content remains inaccessible. You can look up a card if you already know the person — but you cannot ask how many cards mention Neuengamme, or who was arrested on 13 November 1943, or which names keep appearing as traitors.
The resistance cards of the OVCG in Groningen are a striking example. The digital archive was first published in 2004, and in 2021 the Mondriaan Fund awarded the project "OVCG Visible Online" — a two-year programme in collaboration with the Groningen Archives to describe the collection and improve findability. That work was done professionally and carefully, and the result is impressive: 4,272 of 4,615 cards are digitised and findable by name. But the contents of those cards — what is written on them, who is mentioned, what connections exist between them — had never been made searchable. Not because the work was inadequate, but because that step was simply not feasible with the available resources. Until April 2026.
What happened here resembles OCR, but it is something fundamentally different. Traditional OCR reads typed text and turns it into plain text — character by character, with no understanding of what it says. The result is searchable text, but nothing more.
What I used is a Vision LLM: an AI model that can look at and understand images. It does not just read the handwriting — it recognises that "geb. 12-3-1918" is a date of birth, that "KP" is a resistance group, that "Neuengamme" is a concentration camp, and that "gearr. 4-12-1944" means someone was arrested on that date. The difference becomes concrete when you compare the two outputs for the same card.
That is not transcription — that is reading with comprehension and structuring in a single step. The machine does in one operation what traditionally requires two separate processes: first reading, then cataloguing. And that is where the breakthrough lies. Not in the speed, but in the fact that digitising and making accessible collapse into one. The gap between them disappears.
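A minimal sketch of that difference, under stated assumptions: the field names, abbreviation patterns and vocabulary lists below are illustrative stand-ins, and regular expressions stand in for what the Vision LLM does with context. The point is the shape of the output — labelled fields instead of a flat string.

```python
import re

# Illustrative abbreviation conventions seen on Dutch index cards
# (hypothetical schema; the real pipeline uses an LLM, not regexes).
FIELD_PATTERNS = {
    "date_of_birth": re.compile(r"geb\.\s*(\d{1,2}-\d{1,2}-\d{4})"),
    "arrest_date": re.compile(r"gearr\.\s*(\d{1,2}-\d{1,2}-\d{4})"),
}
KNOWN_CAMPS = {"Neuengamme", "Vught", "Amersfoort"}   # illustrative list
KNOWN_GROUPS = {"KP", "LO", "OD"}                     # illustrative list

def structure_card(plain_text: str) -> dict:
    """Turn a flat OCR-style transcription into labelled fields."""
    record = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(plain_text)
        if match:
            record[field] = match.group(1)
    tokens = set(re.findall(r"[A-Za-z]+", plain_text))
    record["camps"] = sorted(tokens & KNOWN_CAMPS)
    record["groups"] = sorted(tokens & KNOWN_GROUPS)
    return record

# Flat OCR output (example text, not an actual card)...
ocr_output = "Jansen, geb. 12-3-1918, KP, gearr. 4-12-1944, Neuengamme"
# ...becomes a structured, queryable record:
print(structure_card(ocr_output))
```

The flat string is only searchable; the record is countable and linkable, which is what the rest of this piece depends on.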
What about Transkribus?
Transkribus is an established platform for handwriting recognition widely used in the European archive world, and it is excellent at what it does: converting handwriting into searchable text. But it is and remains transcription — it produces plain text, not structured data. You can search for the word "Neuengamme" but you do not get a field concentration camp: Neuengamme, deportation date: 12-11-1944. Transkribus makes text searchable. What was done here makes meaning searchable.
The same applies to crowdsourcing platforms like VeleHanden.nl, where thousands of volunteers transcribe handwritten documents — valuable work, but the goal is and remains transcription, not informatisation.
How we approach handwritten texts differently — and how you can verify whether what AI finds is actually correct — is described in How do you know if what AI finds is actually correct, when searching an old archive? (Dutch)
Transcribing is essentially a single operation: transferring text from one medium to another. From handwriting to digital text. It is one discipline — palaeography — and the result is a readable copy of what was written.
Informatising is something fundamentally different, and the distinction is crucial. It means: applying multiple disciplines simultaneously to the same source. That requires not one skill but six:

1. Reading — palaeography, deciphering the handwriting.
2. Understanding — interpreting abbreviations and terms in context.
3. Structuring — turning what is read into labelled fields, as an archivist would.
4. Cross-referencing — linking names, dates and places across cards, as a data analyst would.
5. Contextualising — placing what is found in its historical setting, as a historian would.
6. Validating — checking the result for internal consistency.
None of these steps is new in itself. They have been performed for decades by excellent professionals in their respective fields. But they are normally carried out by different people, in different months, with different tools — the palaeographer reads, the archivist structures, the historian contextualises, the data analyst cross-references. They rarely work simultaneously on the same material, simply because that is logistically impossible and financially unfeasible.
What Agentic AI makes possible is applying those disciplines simultaneously, to the same material, in the same session. Not because AI is smarter than the individual specialist — it is not — but because it can combine all six steps in a single pass at a scale that would be impossible for a team of specialists.
I previously applied the same method to genealogical research on Merovingian kings, where the simultaneous cross-referencing of onomastics, hagiography, prosopography and eight other disciplines led to hypotheses that the individual fields had not formulated — not because the specialists could not, but because nobody was standing in all those fields at once. (Read: AI as a brilliant genealogist, Dutch)
With the resistance cards it is exactly the same. The building blocks were all already there — the cards, the sources, the expertise. What was missing was someone or something that could apply all six disciplines simultaneously to 3,150 cards in a single night. The difference between transcribing and informatising is not a difference in speed but a difference in category: one discipline versus six disciplines at once.
On top of that, the pipeline improves as it processes more cards. Not because the model learns in the technical sense, but because previously transcribed cards are fed back in as context: the model recognises handwriting patterns, Groningen place names and frequently used abbreviations better because it has already seen the earlier cards.

I previously applied this technique to the Zettelkasten archive of Niklas Luhmann — 90,000 handwritten note cards by the most famous German sociologist of the twentieth century, where an academic team with a five-million-euro budget has been working for ten years and is a third of the way through. Technically, this resistance archive is the second project in two weeks that I have informatised in this way.
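A sketch of how that context feeding might look when assembling the prompt for the next card. Everything here is hypothetical — the function name, the record shape and the wording are illustrative, not the project's actual code — but it shows the mechanism: recent transcriptions travel along as few-shot examples.

```python
def build_prompt(previous_cards: list[dict], max_examples: int = 5) -> str:
    """Assemble a transcription prompt that includes recently processed
    cards as few-shot context, so the model reuses handwriting patterns,
    place names and abbreviations it has already resolved."""
    lines = [
        "Transcribe the attached card image into the same fields as the examples below.",
    ]
    # Only the most recent cards, to keep the prompt within budget.
    for card in previous_cards[-max_examples:]:
        lines.append(f"Example — raw: {card['raw']!r} → fields: {card['fields']!r}")
    lines.append("New card:")
    return "\n".join(lines)

# Hypothetical history of already-transcribed fragments:
history = [
    {"raw": "geb. 12-3-1918", "fields": {"date_of_birth": "12-3-1918"}},
    {"raw": "gearr. 4-12-1944", "fields": {"arrest_date": "4-12-1944"}},
]
print(build_prompt(history))
```

The model itself is unchanged between cards; only the prompt grows richer, which is why this is context reuse rather than learning in the technical sense.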
The real power lies not in reading one card — it lies in reading all cards at once. An archivist who reads one card sees one story. But a traitor who is mentioned on five different cards, by five different people, over a period of two months — that pattern is completely invisible when you read one card at a time. It only becomes visible when you lay all 3,150 side by side.
And that is precisely what happened here. Geesje Bleeker appears on five independent cards — no single card describes her as a serial betrayer, but five cards together do. Eleven cards mention amounts for which the SD released people, and no single card describes a "system," but together they sketch an informal corruption system with tariffs. Dozens of cards mention fragments of the same route — Groningen, Scholtenhuis, Neuengamme, Lübeck Bay — and no single card describes the complete deportation route, but together they trace it.
These are what you might call emergent insights: they do not exist on the individual cards; they only emerge when you treat the entire archive as a dataset. No human can hold 3,150 cards in their head simultaneously. A machine can.
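Treating the archive as a dataset can be sketched in a few lines. The records and names below are invented placeholders (not actual card data), and the field names are illustrative, but the aggregation is the real mechanism: a pattern invisible on any single card becomes countable across all of them.

```python
from collections import Counter

# Hypothetical structured records; field names and contents are illustrative.
cards = [
    {"id": 1, "mentions": ["A. de Boer"], "camps": ["Neuengamme"]},
    {"id": 2, "mentions": ["A. de Boer", "B. Smit"], "camps": []},
    {"id": 3, "mentions": ["A. de Boer"], "camps": ["Neuengamme"]},
]

def cross_card_counts(cards: list[dict], field: str) -> Counter:
    """Count on how many cards each value of `field` appears."""
    counts = Counter()
    for card in cards:
        counts.update(set(card[field]))  # each card counts once per value
    return counts

# A name that recurs on independent cards — visible only in aggregate:
name_counts = cross_card_counts(cards, "mentions")
recurring = [name for name, n in name_counts.items() if n >= 2]

# "How many cards mention Neuengamme?" becomes a one-line query:
neuengamme_cards = cross_card_counts(cards, "camps")["Neuengamme"]
```

The same pattern scales from three toy records to 3,150 real ones; the question types from the article ("who recurs as a traitor?", "how many cards mention this camp?") are all variations on this aggregation.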
The first reaction from the archive world will be: "But you don't know if the transcription is correct." And that is true. The AI transcription has not been independently validated and undoubtedly contains errors, from misread handwriting to misinterpreted abbreviations and incorrectly linked names. I do not dispute that; it is a given with any form of automated processing.
But it is worth recognising the implicit assumption behind that criticism: that the alternative is a flawless human transcription. It does not exist. Human transcription also contains errors, interpretation and inconsistencies, with the difference that it takes a hundred times as long and costs a hundred times as much. The honest comparison is therefore not AI transcription versus a flawless human transcription, but AI transcription versus no transcription at all.
The question is not whether this is perfect, but whether it is useful enough to find things that would otherwise not be found. The answer is on this site — patterns that were invisible for eighty years, connections that were technically impossible to make, research questions that nobody could ask because the data was not searchable.
It is also worth considering that errors do not begin with the AI. The cards themselves were written years to decades after the war, based on testimonies and memories — not by the resistance fighters themselves, but by researchers who reconstructed after the fact what had happened. Memories shift, details fade, some things are emphasised and others omitted. That is not a reproach to the archive's creators, but it does mean there are three sources of error in this data, not one: the original recording, the selection of who received a card in the first place, and our AI reading of it. Of those three, the AI transcription is the only one you can check and correct after the fact — and that is precisely what we invite.
The errors must be found and corrected, and that is the next step — a step that requires precisely the expertise that archivists and historians possess. But that step can only be taken after someone has taken the first one.
This shifts the role of the archive. Until now, the archive was the gatekeeper — only what they had made accessible was findable, and the rest existed in theory but was invisible in practice.
If Agentic AI (not to be confused with general-purpose chatbots, which can be highly unreliable) can do the initial informatisation — fast, cheap, imperfect but useful — then the role of the professional changes. Not disappearing, but shifting: from "we read the cards and make them findable" to "we validate what the machine has read and correct where necessary".
That is not a disaster for the profession but a meaningful refinement of it. The professional no longer spends their time retyping names and dates, but on what they were actually trained for: assessing connections, testing hypotheses, placing findings in their historical context. The craft does not become redundant — it is finally deployed for its intended purpose.
The invitation
This project does not claim definitive answers — it delivers research questions. Every hypothesis on this site is an open question waiting for someone with the knowledge to confirm, refute or nuance it. The inbox is open.
3,150 cards is a relatively small archive. But the method used here scales to virtually any collection available as scans. The National Archives of the Netherlands holds millions of digitised pages, and at the Amsterdam City Archives, NIOD (Institute for War, Holocaust and Genocide Studies) and the regional archives, there are collections everywhere whose contents are not searchable — handwritten documents, deeds, forms, cards and letters that were once photographed and have been waiting ever since.
The technology used here — a Vision LLM that reads and structures handwriting — works in principle on all those collections. Not perfectly, but well enough to start. The cost of this project was a few euros in API fees and one night's work — not by an archivist, but by someone who knows how to call an API.
This means the familiar argument "we don't have enough money and we don't have enough people" becomes a fundamentally different conversation. The technology is available and the costs have become negligible — the question is no longer whether we can do it, but whether we want to do it this way.
The archive world has a culture of rigour, and rightly so — you do not treat historical sources lightly, every word counts and every interpretation must be justified. But that rigour has a downside, because it can also become a reason for inaction. We will wait until the technology is better, we will wait until there is a budget, we will wait until there is a protocol. And meanwhile the cards lie in drawers, the people who can still confirm the stories pass away, and knowledge is lost that is not in the archive but only in the minds of descendants.
This project is not perfect and undoubtedly contains errors that need to be found and corrected. But it exists, and because it exists, people can now search for their family name, researchers can find patterns that were previously invisible, and archivists can begin validating what the machine has read. Something that exists with errors is ultimately more useful than something that does not exist because it was not yet perfect enough.
The stories on these cards are eighty years old. The cards themselves were written in the decades that followed. They have been online as scans since 2004. And now, in April 2026, they are, as far as we know, searchable, linkable and countable for the first time. I am genuinely curious whether there are people in the archive world who want to work with this.
This experiment is a sub-project of the Pilot Information Autonomy, an ongoing laboratory where I work with a small group of motivated people, supported by an advisory board, on precisely these kinds of questions — questions that are alive in government and public administration, and that we try to make transparent and practically manageable. If you work in government or in a government-affiliated organisation and want to explore experiments like this: please get in touch.