Bias or Dynamics? — EigenTrace

What We Found

EigenTrace measures semantic displacement in AI text summarization. When five frontier models summarize the same news article, certain words get dropped and others get added. The pattern is not random — it is systematic, content-dependent, and consistent across all models tested.
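To make "dropped" and "added" concrete, here is a minimal sketch that compares the word sets of a source text and a summary. It is an illustration only and assumes nothing about how EigenTrace actually computes displacement: the function names, the crude regex tokenizer, and the example sentences are all invented for this sketch.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word set; a deliberately crude stand-in for real tokenization."""
    return set(re.findall(r"[a-z']+", text.lower()))


def displacement(source: str, summary: str) -> tuple[set[str], set[str]]:
    """Return (dropped, added): words only in the source, words only in the summary."""
    src, out = tokens(source), tokens(summary)
    return src - out, out - src


# Invented example sentences, not taken from any EigenTrace data.
source = "The regulator fined the company after the leak exposed user data."
summary = "The company faced scrutiny after a data incident."

dropped, added = displacement(source, summary)
print(sorted(dropped))
print(sorted(added))
```

A set difference ignores word order, frequency, and synonyms, so it only shows the bookkeeping behind "dropped" and "added", not a semantic measure.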

The displacement is 74% stronger when the story involves an AI developer. The entity swap test confirms this: when the company name is swapped, the displacement follows the entity (p = 0.0085, d = 0.471).
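For readers unfamiliar with the two reported quantities, the sketch below shows one common way a paired effect size (Cohen's d) and a permutation p-value can be computed. The text above does not say which variant of d or which test was used, so the paired formulation and the sign-flip test here are assumptions, and the numbers in `diffs` are made-up placeholders rather than EigenTrace measurements.

```python
import random
import statistics


def cohens_d_paired(diffs):
    """Paired-samples effect size: mean difference over the SD of the differences."""
    return statistics.mean(diffs) / statistics.stdev(diffs)


def sign_flip_p_value(diffs, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis the sign of each difference is arbitrary,
    so we flip signs at random and count how often the resampled mean
    is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.mean(diffs))
    hits = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Placeholder data: displacement(swapped entity) minus displacement(original),
# one value per hypothetical article. Not real results.
diffs = [0.12, 0.05, -0.03, 0.20, 0.08, 0.01, 0.15, -0.02, 0.09, 0.11]

print("d =", round(cohens_d_paired(diffs), 3))
print("p =", sign_flip_p_value(diffs))
```

The sign-flip test needs no normality assumption, which is why it is used for the sketch; a paired t-test would be an equally plausible choice.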

This raises a question we cannot yet answer: is this a bias that better training would fix, or a stable attractor that the system returns to even under perturbation?

The distinction matters. A bias implies a solution: retrain with better data. A dynamics implies a constraint: the system has a shape, and that shape has consequences regardless of training choices.

The Lyapunov Question

In dynamical systems, an equilibrium is Lyapunov stable if trajectories that start near it stay near it, and asymptotically stable if they also converge back to it after being pushed away. Picture a ball at the bottom of a bowl: push it, and it rolls back. The bowl is the basin. The ball's resting position is the attractor.
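The bowl picture can be simulated in a few lines. The sketch below assumes the simplest possible case: a one-dimensional overdamped ball whose motion follows dx/dt = -dV/dx with V(x) = x²/2. It checks the two properties a Lyapunov function needs along a trajectory: V never increases, and the state ends up at the minimum. The function names, step size, and starting point are choices made for this illustration.

```python
def V(x: float) -> float:
    """Bowl-shaped candidate Lyapunov function with its minimum at x = 0."""
    return 0.5 * x * x


def simulate(x0: float, dt: float = 0.01, steps: int = 1000) -> list[float]:
    """Forward-Euler integration of dx/dt = -dV/dx = -x."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - dt * xs[-1])
    return xs


trajectory = simulate(x0=2.0)  # push the ball to x = 2, then let go
energies = [V(x) for x in trajectory]

# V should never increase along the trajectory, and the ball should
# end up close to the resting position at x = 0.
never_increases = all(b <= a for a, b in zip(energies, energies[1:]))
returned = abs(trajectory[-1]) < 1e-3
print(never_increases, returned)
```

A language model's output space is nothing like one-dimensional, so this only fixes the vocabulary (basin, attractor, V) used in the rest of the section.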

If alignment training creates something like this — a basin in the model's output space where low-volatility, institutionally-smooth language is the attractor — then the systematic softening EigenTrace measures might not be a bug to fix. It might be the shape of the bowl.

We do not claim this is the case. We describe three experimental frameworks that would test it, and note what our own data already constrains.

The Test That Matters

The real test is not whether a Lyapunov function can be retrofitted — it always can. The test is whether a function V, constructed from first principles (the reward model, the KL term, a measurable output statistic), actually predicts things you don't already know.

Predictions that would count:

Which prompts will resist attenuation. V should identify prompt categories where the basin is shallow — where small perturbations escape the attractor and produce unattenuated output. We should be able to predict these categories before testing them.

Where basin boundaries fall. There should be measurable phase transitions — not gradual degradation — at specific perturbation scales in activation space. The boundary locations should be predictable from V, not discovered empirically and then explained retroactively.

Cross-model transfer. If V captures genuine dynamics rather than corpus-specific statistics, a stability measure constructed from one model should predict basin geometry, jailbreak susceptibility, or attenuation patterns in a different model with different architecture and training data. Within-model prediction sits too close to retrofitting. Cross-model prediction requires the dynamics to be architecture-independent.

Training progression. V should predict how attenuation scales with training compute. At what point during RLHF does the basin form? Does it deepen monotonically, or does it crystallize at a phase transition?
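The basin-boundary prediction in the list above can be illustrated with a toy system that has two attractors. The sketch assumes a one-dimensional double-well potential V(x) = x⁴/4 - x²/2, with attractors at x = -1 and x = +1 separated by a ridge at x = 0. Because the ridge sits at the local maximum of V, the critical kick size (1.0 when starting from the left well) can be read off V before any perturbation is run. Everything here is a stand-in for illustration: no such V has been constructed for a language model.

```python
def dV(x: float) -> float:
    """Gradient of the double-well potential V(x) = x**4 / 4 - x**2 / 2."""
    return x**3 - x


def settle(x0: float, dt: float = 0.01, steps: int = 5000) -> float:
    """Follow dx/dt = -dV/dx from x0 and return the final position."""
    x = x0
    for _ in range(steps):
        x -= dt * dV(x)
    return x


# Start at the left attractor (x = -1) and push right by increasing amounts.
# Kicks smaller than 1.0 stay left of the ridge at x = 0; larger kicks cross it.
for kick in (0.5, 0.9, 0.99, 1.01, 1.1, 1.5):
    print(kick, round(settle(-1.0 + kick), 3))
```

The final position does not drift gradually as the kick grows. It flips from one attractor to the other at a single threshold, which is the signature the prediction asks for.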

Why This Matters Beyond Theory

A bias is a tendency you can correct. A dynamics is terrain.

If the displacement EigenTrace measures is a bias, the solution is straightforward: better training data, better objectives, better evaluation. The problem has a fix.

If it is a dynamics — a stable basin that the system returns to when perturbed — the implications are different. Better prompts don't move mountains. Any organization using language models for financial analysis, legal review, due diligence, intelligence synthesis — anywhere precise causal language matters — would have a quantifiable blind spot that cannot be engineered around with prompting tricks.

And if five frontier models share roughly the same basin geometry — which the cross-model consistency in EigenTrace data suggests — then thousands of organizations running inference on those models inherit the same softening, in the same direction, on the same topics. That is a monoculture risk in information infrastructure.

Whether that's an existential concern or a manageable engineering problem depends on what you're using the models for. But you cannot pretend the terrain isn't there once you've measured it.

What We Cannot Yet Do

We have not constructed V. We have not run perturbation experiments. We have not demonstrated phase transitions in activation space. We have not tested cross-model transfer of stability measures.

What we have is a systematic displacement pattern that is consistent across five models, survives eight statistical robustness tests, and appears to be inherited from the training corpus rather than produced by alignment training. The Lyapunov framework is a hypothesis about why the pattern has the shape it does — why it's stable, why it's consistent, why it resists perturbation.

The hypothesis earns its keep only if it generates predictions that simpler accounts ("the models learned consistent stylistic patterns") do not. Recovery times, basin widths, phase-transition locations, cross-model transfer coefficients — these are the numerical predictions a stability-theoretic account would produce and a distributional-regularity account would not.

Until those predictions exist and can be tested, "Lyapunov" is a lens, not a finding. We state this because stating it is what separates methodology from rhetoric.