The Eleven-Minute Bug


The Northumbrian Equinox

Around the year 725, at the monastery in Jarrow, the monk known as the Venerable Bede completed De temporum ratione—"On the Reckoning of Time." It stood as the most sophisticated technical manual of its age, a guide to the computus: the complex mathematical system used to calculate the date of Easter.

To the modern eye, the medieval obsession with the date of Easter looks like pedantry. In reality, the computus functioned as the operating system of Western civilization. Because Easter was a “movable feast”—calculated as the first Sunday after the first full moon following the spring equinox—its date dictated the entire liturgical calendar.
Historical note.
While primarily religious, the liturgical calendar often served as the de facto coordination layer for secular administration, influencing when taxes were collected, when soldiers were mustered, and when legal contracts expired.

The computus was the temporal stack upon which the Middle Ages ran. And it contained a bug.

Bede’s system relied on the Julian calendar, which assumed the solar year was exactly 365.25 days long. But the tropical solar year is approximately 365.2422 days. The discrepancy—about eleven minutes—is a ghost. It is an anomaly so minuscule that it remains invisible within the span of a single human life. If you lived in the eighth century, your tables matched the sky perfectly.

But the computus was not a local tool; it was a synchronized infrastructure. From the fjords of Scandinavia to the plains of Lombardy, every monastery copied the same tables and rang their bells at the same intervals. Because the system was perfectly coordinated, the error did not cancel out. It compounded.

$$\Delta_{\text{cumulative}} = \sum_{t=1}^{n} \left(365.25 - 365.2422\right) = 0.0078\,n\ \text{days}$$

By the early thirteenth century, those eleven minutes had added up to roughly seven days. By 1582, the divergence was ten days. The “Spring Equinox” of the official tables occurred while the actual sun was still deep in its winter transit. The monks were looking at their books and seeing a reality that no longer existed.
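
The arithmetic behind these figures is easy to check. A minimal sketch, assuming the Easter tables were anchored around the Council of Nicaea (325), the reference point the Gregorian reform later restored:

```python
JULIAN_YEAR = 365.25        # days per year assumed by the computus
TROPICAL_YEAR = 365.2422    # approximate length of the actual solar year

drift_per_year_days = JULIAN_YEAR - TROPICAL_YEAR
drift_per_year_minutes = drift_per_year_days * 24 * 60

def cumulative_drift_days(start_year: int, end_year: int) -> float:
    """Total calendar drift accumulated between two years, in days."""
    return (end_year - start_year) * drift_per_year_days

print(f"per-year drift: {drift_per_year_minutes:.1f} minutes")      # ~11.2
print(f"drift 325-1582: {cumulative_drift_days(325, 1582):.1f} days")  # ~9.8
```

Eleven minutes a year, invisible in a lifetime, becomes roughly ten days across the span Gregory XIII eventually had to delete.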

The danger was not that the monks were incompetent, nor that they were ignorant of the shift. By the later Middle Ages, scholars were well aware of the discrepancy. The problem was the impossibility of correcting it without fracturing a synchronized Christendom. The danger was that they were all perfectly, harmoniously wrong.

The Synchronization Multiplier

The parable is not really about grain or Easter mass. It is about the architecture of consensus. We assume that we have traded monks for engineers and vellum for CI/CD pipelines, believing our modern infrastructure—built on the logic of silicon—is immune to the slow decoupling of the Middle Ages.

But the computus crisis demonstrates a fundamental law of systems: The Synchronization Multiplier.

In a fragmented system, errors function as noise. If one monastery in Gaul miscalculates the moon, it might feast a week early, but the rest of the world remains aligned with the stars. The mistake is contained by the lack of coordination. But in a synchronized system, an error is not noise; it is a signal. When the entire world uses the same tables, there is no local corrective mechanism.

Synchronization suppresses correction because the cost of local deviation is immediate and high. If a single node corrects its own table to match reality, it becomes incompatible with the network. Therefore, rational actors will defer correction even when the error is detected, preferring to be “wrong with the group” rather than “right alone.”
Falsifiable Hypothesis.
Organizations using a single dominant foundation model will detect systematic failures later than those using enforced model disagreement thresholds.

We see the “loud” version of this today in global IT outages. When a provider like CrowdStrike pushes a flawed update, it propagates instantly. We also see it in the “quiet ubiquity” of vulnerabilities like Log4Shell, where a standardized dependency creates a universal fault line waiting to be triggered. The system behaves as designed—instant synchronization—causing total failure.

However, the “loud” failure is the safer one. It invites immediate repair. The more dangerous failure is the “quiet” one—the Eleven-Minute Bug. This is a “correct” output that invisibly diverges from the territory it maps.

Intelligence as Infrastructure

We are currently crossing a threshold where Large Language Models (LLMs) are moving from assistive tools to judgment infrastructure. Until recently, our technical abstractions encoded rules. A compiler does not have an opinion on your logic; it follows a deterministic grammar. These are the “hard” tables of our era.

But as we integrate AI into the core of our development stacks—through agentic workflows and automated code refactoring—we delegate the layer of Judgment to the models. This is not merely a matter of suggestion; it is a matter of defaults. When judgment is embedded in the default settings of CI/CD pipelines and code review tools, opting out becomes an act of resistance rather than a neutral choice.

When millions of developers use the same underlying model to decide how a system should be architected, they are aligning the judgment of a global industry. We are building a new, global computus.

The Mechanism of Systemic Divergence

Here we must leave the metaphor and look at the mechanism. To understand the risk, we must distinguish between “hallucinations” and Systemic Divergence.

  • Hallucination: A loud error (e.g., $2+2=5$). These are bugs to be squashed.
  • Systemic Divergence: A statistical mean. The model suggests a pattern that is plausible, standard, and helpful, but contains a tiny, systematic skew away from hard technical reality.
Consider Cryptographic Divergence. Even in domains with specialized tooling, the gradient toward readable abstraction creates a steady pull away from hardware-faithful reasoning. An LLM, prioritizing patterns found in general-purpose software, may suggest “clean” variable-time comparison functions.
Technical note.
In a timing attack, an adversary deduces the key by measuring how long the CPU takes to process it. “Optimized” or “readable” code is often less secure than “constant-time” code for this reason.
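
The contrast is concrete. Below, a minimal Python illustration: a "clean" early-exit comparison whose running time leaks how many leading bytes matched, beside the standard library's constant-time alternative. Both return the same answers; only their timing profiles differ.

```python
import hmac

def naive_compare(a: bytes, b: bytes) -> bool:
    """'Readable' comparison: returns at the first mismatch, so its
    running time reveals how much of the secret an attacker guessed."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_compare(a: bytes, b: bytes) -> bool:
    """Examines every byte regardless of where mismatches occur."""
    return hmac.compare_digest(a, b)

secret = b"correct-mac-value"
print(naive_compare(secret, b"correct-mac-value"))          # True
print(constant_time_compare(secret, b"correct-mac-valuX"))  # False
```

A reviewer, human or model, who scores these functions on clarity alone will prefer the first one every time. That preference is exactly the eleven-minute skew.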

If a generation of engineers relies on the same model to audit their security layers, the model’s preference for “clean” logic becomes the industry standard. The systems pass all unit tests. They look flawless to human reviewers. But the underlying hardware—the actual “stars”—remains bound by the physics of voltage and clock cycles. We build a world of code that is eleven minutes disconnected from how silicon actually processes instructions.

The Benchmark Trap

In the thirteenth century, to challenge the table was to challenge the infrastructure of Christendom. We are building our own circular validation loop: The Benchmark Trap.

We evaluate LLMs based on their performance on benchmarks like MMLU or HumanEval. These benchmarks are now part of the training data. Furthermore, as AI generates more of the world’s content, models are increasingly trained on their own previous judgments—a phenomenon researchers call “Model Collapse.”

This is not a buzzword; it is a specific statistical failure mode where synthetic outputs increasingly dominate the training data, narrowing variance and reinforcing prior statistical biases. We are checking the tables by looking at other copies of the tables. Synchronization creates an epistemic monoculture where a single “eleven-minute” error does not just survive; it becomes the new Spring Equinox.
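
The variance-narrowing dynamic can be seen in a toy simulation: fit a Gaussian to one generation's output, sample the next generation entirely from that fit, and repeat. This is a caricature, not a model of real training dynamics, but the collapse of spread across generations is the same statistical failure mode.

```python
import random
import statistics

random.seed(0)

def next_generation(samples, n):
    """Fit a Gaussian to the previous generation's output, then
    draw the next generation entirely from that fitted model."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)  # MLE estimate, biased low
    return [random.gauss(mu, sigma) for _ in range(n)]

gen = [random.gauss(0.0, 1.0) for _ in range(50)]  # "real" data
initial_spread = statistics.pstdev(gen)
for _ in range(200):                               # recursive self-training
    gen = next_generation(gen, 50)
final_spread = statistics.pstdev(gen)

print(f"spread: {initial_spread:.3f} -> {final_spread:.3f}")
```

Each generation's slightly-too-narrow estimate becomes the next generation's ground truth, and the distribution quietly contracts toward its own mean.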

The Moral Misdiagnosis

The most fascinating element of the medieval computus crisis was the reaction to the divergence. They did not say, “We need to update our solar year constant.” Instead, they looked for moral reasons. They blamed the corruption of the papacy or the decay of the universities.

We see this pattern emerging in the AI discourse. When synchronized systems show signs of divergence—brittleness, bureaucracy, supply chain failure—our first instinct is to blame “lazy developers” or “corporate greed.” Moral explanations are cognitively cheaper than structural ones. They misdiagnose a systematic divergence in judgment as a failure of individual character. We will spend decades arguing over the heresy of the models while the eleven-minute bug continues to pull our infrastructure away from the stars.

The Gregorian Option: Forced Desynchronization

In 1582, Pope Gregory XIII patched the calendar with a hard fork: he deleted ten days. It was a brilliant technical solution that caused a social catastrophe. What would a “Gregorian Reform” for synchronized intelligence look like?

It requires a deliberate Desynchronization Strategy. These strategies are expensive, slow, and locally irrational—just as the Gregorian reform was. But they are necessary to introduce adversarial friction:

  • Forced Heterogeneity: We must resist converging on a single “best” model. Organizations should employ “N-version modeling,” where distinct models suggest architectural decisions, and human review is required if they disagree.
  • Epistemic Diversity as Redundancy: Universal agreement between models is not a sign of truth, but a risk signal. If every AI agent agrees a refactor is “best,” that is precisely when a human should look for the error.
  • The Hardware Equinox: We must maintain “monks” who look at the stars without the tables—engineers who write code and audit logic without AI assistance, serving as a control group for reality.
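
The first of these strategies can be sketched mechanically. The function below, a hypothetical illustration (the name `n_version_review` and the stub models are my own, not an existing API), polls several independent models and escalates to a human whenever they fail to reach quorum:

```python
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple

def n_version_review(prompt: str,
                     models: Sequence[Callable[[str], str]],
                     quorum: float = 1.0) -> Tuple[Optional[str], bool]:
    """Ask several independent models the same question.
    Returns (consensus_answer, needs_human_review)."""
    answers = Counter(model(prompt) for model in models)
    top_answer, votes = answers.most_common(1)[0]
    if votes / len(models) >= quorum:
        return top_answer, False   # quorum reached: accept the answer
    return None, True              # disagreement: escalate to a human

# Stub "models" standing in for real, heterogeneous providers.
model_a = lambda p: "use a queue"
model_b = lambda p: "use a queue"
model_c = lambda p: "use shared memory"

answer, escalate = n_version_review("How should these services talk?",
                                    [model_a, model_b, model_c])
print(answer, escalate)  # None True -- the models disagree
```

Note the inversion of the usual incentive: disagreement is the valuable output here, because it marks exactly the decisions where a single synchronized table would have silently won.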

The Stars and the Tables

The monks of Jarrow were not sloppy; they were the most disciplined technicians of their era. Their tragedy was their success. They built a system so coherent and universal that it silenced the stars for a thousand years.

We are currently building our own tables. We are exhilarated by the alignment AI offers. But the solar year does not care about our tables. Reality—whether the physics of a semiconductor or the limits of resources—remains indifferent to our consensus.

The question is not whether the models are smart. The question is: What are they wrong about by eleven minutes?
