Self-Calibrating Claim Lens
Reading a source the way the brain reads a speaker — learning the patterns before judging the content.
What it is
When you meet someone for the first time, you don't immediately evaluate what they say. You listen for a while. You calibrate. You learn their speech patterns, their hedging words, their habits of exaggeration or understatement. After ten minutes, you know how to read them. "I'm fine" from one person means they're fine. From another, it means they're in crisis.
You do the same thing when you read a book. After twenty pages of a historian's work, you know how they write negation, how they hedge, when they cite others versus when they assert their own view. You've calibrated your reading to this specific author. You don't read every word equally — you read through the lens of what you've learned about how this particular person uses language.
This is a self-calibrating claim lens. Not a fixed set of rules applied to any text, but a reading strategy that adapts itself to the specific corpus it encounters.
What it does in the brain
Predictive coding as calibration. The brain does not passively receive language. It predicts the next word, the next phrase, the next rhetorical move. When a prediction fails — when the author says something unexpected — the brain updates its model of this specific author. After enough updates, the model is calibrated. You read faster, you understand more, and you catch deviations that would be invisible to someone reading their first page.
Negation processing. Neurolinguistic research shows that the brain processes negation not as a simple flip of meaning, but as a two-step operation: first activate the affirmative ("he was arrested"), then suppress it ("he was NOT arrested"). This means negation is cognitively expensive and error-prone — readers sometimes remember the affirmative when they should remember the negation. A calibrated reader knows where to expect negation from this specific author and processes it more reliably.
Source monitoring. The prefrontal cortex continuously tracks not just what was said, but who said it and how confident they sounded. "According to Somer, X was arrested" activates a different neural circuit than "X was arrested." The first tags the claim with a source; the second presents it as the author's own assertion. A calibrated reader makes this distinction automatically.
The problem it solves
You want to fact-check an 8.3-million-word historical standard work against primary sources. You need a machine that can read those 8.3 million words and extract every falsifiable claim: who did what, when, where. But the machine must also understand what the author is NOT claiming — negation, hedging, citation, hypothesis. Without that understanding, the fact-checker will check claims the author never made.
How it works
The self-calibrating claim lens is a method, not a tool. For any corpus, the process is the same:
Step 1: Measure. Do not assume what patterns the author uses. Scan the entire text. Count every negation word, every date format, every action verb. In the Dutch standard work on World War II, we found 38,000 date instances across 12 different formats, and 14,700 action verbs across 10 categories. These numbers were not guessed — they were measured.
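A minimal sketch of what such a measurement pass could look like. The word list, the three date patterns, and the sample sentence are assumptions made up for illustration; they are not the inventory used for the actual corpus, which the text says covered 12 date formats.

```python
import re
from collections import Counter

# Hypothetical inventories for illustration; a real run would derive these from the corpus.
NEGATION_WORDS = {"niet", "geen", "nooit", "niemand", "zonder"}
MONTHS = ("januari|februari|maart|april|mei|juni|juli|augustus|"
          "september|oktober|november|december")
# Ordered from most to least specific so a full date is not also counted as a bare year.
DATE_PATTERNS = {
    "day_month_year": re.compile(rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b", re.IGNORECASE),
    "numeric": re.compile(r"\b\d{1,2}-\d{1,2}-\d{4}\b"),
    "year_only": re.compile(r"\b(?:19|20)\d{2}\b"),
}

def measure(text):
    """Count negation words and date formats in a text."""
    tokens = re.findall(r"\w+", text.lower())
    negations = Counter(t for t in tokens if t in NEGATION_WORDS)
    dates = Counter()
    remaining = text
    for name, pattern in DATE_PATTERNS.items():
        # Blank out each match so a less specific pattern cannot count it again.
        remaining, hits = pattern.subn(" ", remaining)
        dates[name] = hits
    return negations, dates

sample = ("Op 10 mei 1940 werd hij niet gearresteerd. In 1943 werd geen "
          "gevangene vrijgelaten, ook niet op 5-9-1944.")
negations, dates = measure(sample)
print(negations)
print(dates)
```

The point of the sketch is the order of work: count first, then decide which patterns deserve their own handling.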
Step 2: Test the simple approach. Start with distance-based negation detection: if a negation word appears within N words of an action verb, flag it. Measure precision and recall on the actual corpus. We found that a window of 3 words caught 71% of real negations and a window of 5 words caught 88%. Even at 5 words, 12% of real negations were still missed, and the detector also produced some false positives.
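The distance-based baseline can be sketched roughly as follows. The word lists and the two labeled sentences are invented for illustration, so the recall values this prints say nothing about the 71% and 88% measured on the real corpus.

```python
import re

# Assumed word lists; the real ones would come out of the measurement step.
NEGATION_WORDS = {"niet", "geen", "nooit", "niemand"}
ACTION_VERBS = {"gearresteerd", "vrijgelaten", "weggevoerd", "verhoord"}

def is_negated_by_distance(sentence, window=5):
    """Flag a sentence if a negation word lies within `window` tokens of an action verb."""
    tokens = re.findall(r"\w+", sentence.lower())
    negs = [i for i, t in enumerate(tokens) if t in NEGATION_WORDS]
    verbs = [i for i, t in enumerate(tokens) if t in ACTION_VERBS]
    return any(abs(n - v) <= window for n in negs for v in verbs)

def recall_at(window, labeled):
    """Share of truly negated sentences that the window detector catches."""
    negated = [s for s, gold in labeled if gold]
    caught = sum(is_negated_by_distance(s, window) for s in negated)
    return caught / len(negated)

# Constructed examples: (sentence, is the action really negated?)
labeled = [
    ("Hij werd niet gearresteerd", True),
    ("Hij is niet, zoals later in het rapport van de commandant werd beweerd, "
     "gearresteerd", True),
]

for window in (3, 5, 12):
    print(window, recall_at(window, labeled))
```

Sweeping the window size like this shows the trade-off directly: a larger window recovers distant negations but will also start flagging sentences that are not negated.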
Step 3: Find where it fails. Examine every failure case. In Dutch, the verb often sits at the end of the clause while the negation sits at the beginning, with an entire subordinate clause in between, so a fixed window misses the negation. Distance-based approaches fail structurally on this language pattern. They also fail in the opposite direction: in "Niet alleen de mannen werden weggevoerd" ("Not only the men were deported"), distance says negation, syntax says affirmation. The men WERE deported.
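One way to make the failure review concrete is to sort the baseline's mistakes into false positives and false negatives. This is a sketch with a compact window detector and three hand-labeled sentences; the long sentence is a constructed example, and the word lists are assumptions.

```python
import re

NEGATION_WORDS = {"niet", "geen", "nooit"}      # assumed list
ACTION_VERBS = {"gearresteerd", "weggevoerd"}   # assumed list

def window_detector(sentence, window=5):
    """Distance-based baseline: negation word within `window` tokens of an action verb."""
    tokens = re.findall(r"\w+", sentence.lower())
    negs = [i for i, t in enumerate(tokens) if t in NEGATION_WORDS]
    verbs = [i for i, t in enumerate(tokens) if t in ACTION_VERBS]
    return any(abs(n - v) <= window for n in negs for v in verbs)

def find_failures(detector, labeled):
    """Split the detector's mistakes into false positives and false negatives."""
    false_positives = [s for s, gold in labeled if detector(s) and not gold]
    false_negatives = [s for s, gold in labeled if not detector(s) and gold]
    return false_positives, false_negatives

labeled = [
    ("Hij werd niet gearresteerd", True),
    # "Not only the men were deported": the action is affirmed.
    ("Niet alleen de mannen werden weggevoerd", False),
    # Negation and verb separated by an inserted clause.
    ("Hij is niet, zoals later in het rapport van de commandant werd beweerd, "
     "gearresteerd", True),
]

false_positives, false_negatives = find_failures(window_detector, labeled)
print("false positives:", false_positives)
print("false negatives:", false_negatives)
```

Reading the two lists side by side is what suggests the next step: both kinds of error come from ignoring syntax, not from a badly chosen window size.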
Step 4: Build the paths. Use dependency parsing to follow the syntactic tree instead of counting words. Then add corpus-specific paths for constructions the parser misses. For the Dutch standard work, we built five paths:
| Path | What it catches | Example |
|---|---|---|
| Direct dependency | Negation word attached to action verb | "werd niet gearresteerd" (was not arrested) |
| Via auxiliary | Negation on helper verb | "zou niet worden vrijgelaten" (would not be released) |
| Subject negation | "geen" (no) on the subject | "geen gevangene werd vrijgelaten" (no prisoner was released) |
| Without-clause | "zonder dat" (without) | "zonder dat hij werd verhoord" (without being interrogated) |
| Pronoun negation | "niemand" (nobody) as subject | "niemand werd vrijgelaten" (nobody was released) |
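The five paths in the table can be sketched as checks over a dependency tree. To stay self-contained, the example below uses hand-built parses in place of a real parser. The `Token` class, the dependency labels (loosely UD-style), and the attachment of "niet" to the auxiliary in the second example are all assumptions; a real parser may label and attach these differently.

```python
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    head: int   # index of the syntactic head; the root points to itself
    dep: str    # dependency label (hypothetical, loosely UD-style)

NEG_ADVERBS = {"niet", "nooit"}
NEG_PRONOUNS = {"niemand", "niets"}
SUBJECT_DEPS = {"nsubj", "nsubj:pass"}

def negation_path(tokens, verb):
    """Return the name of the first path that negates tokens[verb], or None."""
    for tok in tokens:
        word = tok.text.lower()
        head = tokens[tok.head]
        if word in NEG_ADVERBS and tok.head == verb:
            return "direct"
        if word in NEG_ADVERBS and head.dep.startswith("aux") and head.head == verb:
            return "via_auxiliary"
        if word == "geen" and head.dep in SUBJECT_DEPS and head.head == verb:
            return "subject_negation"
        if word == "zonder" and tok.dep == "mark" and tok.head == verb:
            return "without_clause"
        if word in NEG_PRONOUNS and tok.dep in SUBJECT_DEPS and tok.head == verb:
            return "pronoun_negation"
    return None

# "geen gevangene werd vrijgelaten" (no prisoner was released)
no_prisoner = [Token("geen", 1, "det"), Token("gevangene", 3, "nsubj:pass"),
               Token("werd", 3, "aux:pass"), Token("vrijgelaten", 3, "root")]

# "Hij zou niet worden vrijgelaten", with "niet" attached to the auxiliary "zou"
would_not = [Token("Hij", 4, "nsubj:pass"), Token("zou", 4, "aux"),
             Token("niet", 1, "advmod"), Token("worden", 4, "aux:pass"),
             Token("vrijgelaten", 4, "root")]

# "Niet alleen de mannen werden weggevoerd": "niet" modifies "alleen", not the verb
not_only = [Token("Niet", 1, "advmod"), Token("alleen", 3, "advmod"),
            Token("de", 3, "det"), Token("mannen", 5, "nsubj:pass"),
            Token("werden", 5, "aux:pass"), Token("weggevoerd", 5, "root")]

print(negation_path(no_prisoner, 3))
print(negation_path(would_not, 4))
print(negation_path(not_only, 5))
```

Because the check follows head links instead of counting words, the "not only" sentence is no longer flagged: its negation word never reaches the verb through any of the five paths.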
Step 5: Validate. Test on the corpus itself, not on example sentences. The five paths together achieve 96% accuracy on the actual text. The remaining 4% are constructions that are genuinely ambiguous even for human readers.
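Validation comes down to comparing the detector's flags with hand-labeled sentences sampled from the corpus. A minimal scoring sketch follows; the five prediction and gold values are toy data, not results from the actual text.

```python
def score(predicted, gold):
    """Compute precision, recall and accuracy for boolean negation flags."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p and g)
    fp = sum(1 for p, g in pairs if p and not g)
    fn = sum(1 for p, g in pairs if not p and g)
    tn = sum(1 for p, g in pairs if not p and not g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy

# Toy data: detector output versus human labels for five sampled sentences.
predicted = [True, True, False, False, True]
gold = [True, False, False, True, True]

precision, recall, accuracy = score(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

Reporting precision and recall next to accuracy matters here because negated sentences are a minority: a detector that never flags anything can still score a high accuracy.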
The key insight: the five paths are not the discovery. The discovery is the method for finding them. A legal corpus would produce different paths. A medical corpus would produce others. But the method — measure, test, fail, parse, validate — is the same for any language and any corpus. The lens calibrates itself to what it reads.
What it does in ThetaOS
In WOZ (the WO-II Operating System), the self-calibrating claim lens turns the 8.3-million-word standard work by Loe de Jong into a structured set of falsifiable claims. Each claim has: who (entity), what (action), when (date), where (location), and a negation flag. These claims are then checked against 172 declassified British intelligence files and 62 Dutch resistance reports.
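The claim record described above could be represented roughly like this. The class name, field types, and example values are assumptions for illustration; only the five fields themselves (who, what, when, where, negation flag) come from the text.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class Claim:
    who: str                 # entity
    what: str                # action
    when: Optional[date]     # None when the text gives no date
    where: Optional[str]     # None when the text gives no location
    negated: bool = False    # True when the author says the action did NOT happen

# Placeholder values, not taken from the corpus.
claim = Claim(who="X", what="arrested", when=date(1942, 3, 1),
              where=None, negated=True)
row = asdict(claim)
print(row)
```

Keeping the negation flag on the record, instead of dropping negated sentences, lets the fact-checker test "X was not arrested" against the sources as a claim in its own right.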
The machine is reusable. When a new archive arrives, such as Arolsen (557,000 records), the Dutch domestic intelligence files, or the war crimes tribunal archives (opening January 2027), the fact-check table is recalculated automatically. Coverage grows with every new source. Silence becomes louder: events that the primary sources record but De Jong does not mention become visible and measurable.
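A rough sketch of how recalculation, coverage, and "silence" might be computed. It assumes claims and source records have already been normalized to comparable keys, which in practice is the hard part; the function name, the source names, and all keys below are hypothetical.

```python
def build_fact_check_table(claims, sources):
    """Map each claim to the sources that speak to it; report coverage and silences.

    claims:  list of normalized claim keys extracted from the reference work
    sources: dict of source name -> set of claim keys found in that source
    """
    table = {c: sorted(name for name, keys in sources.items() if c in keys)
             for c in claims}
    covered = sum(1 for hits in table.values() if hits)
    coverage = covered / len(claims) if claims else 0.0
    # Silences: things the sources record that the reference work never claims.
    silences = set().union(*sources.values()) - set(claims)
    return table, coverage, silences

# Placeholder keys: (who, what, when)
claims = [("X", "arrested", "1942-03-01"), ("Y", "released", "1943-05-10")]
sources = {"british_files": {("X", "arrested", "1942-03-01")}}

_, coverage_before, _ = build_fact_check_table(claims, sources)

# A new archive arrives: rebuild the table from scratch with the extra source.
sources["new_archive"] = {("Y", "released", "1943-05-10"),
                          ("Z", "deported", "1944-09-05")}
table, coverage_after, silences = build_fact_check_table(claims, sources)

print(coverage_before, coverage_after)
print(silences)
```

Rebuilding the whole table on every new source keeps the logic simple; an incremental update would be an optimization, not a change in what is measured.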
Beyond WOZ, the claim lens applies to any reference work. Any standard text can be decomposed into falsifiable claims and tested against primary sources. The lens is domain-agnostic — it calibrates itself to whatever it reads.
The phenomenological connection
Husserl's intentionality: consciousness is always consciousness-of-something. The claim lens is always claim-lens-of-a-specific-corpus. There is no generic claim lens, just as there is no generic consciousness. The lens shapes itself to its object — and in doing so, reveals structures that were invisible to both the uncalibrated reader and the uncalibrated machine.
Designed — April 15, 2026