Four word sets. Two raycasts. The complete topology of what five frontier models convergently compress from news summaries.
Governance-concept terminals from the same tensor, same algorithm, three different inputs. Random English words produce zero governance terminals. News words that models kept produce 17.5% — the institutional anchors. News words that models dropped produce 10.4% — the operational modifiers.
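The percentages above are shares of terminal concepts that carry a governance label. A minimal sketch of that tally follows, assuming a hand-built governance vocabulary; the function name `governance_share`, the label set, and the toy word lists are all hypothetical and are not the measured data.

```python
def governance_share(terminals, governance_vocab):
    """Return the fraction of terminal concepts found in a governance vocabulary."""
    if not terminals:
        return 0.0
    hits = sum(1 for t in terminals if t.lower() in governance_vocab)
    return hits / len(terminals)

# Hypothetical label set and toy terminals, for illustration only.
GOVERNANCE = {"committee", "parliament", "treaty", "ministry"}

random_terminals = ["banana", "guitar", "meadow", "lantern"]
kept_terminals = ["committee", "harbor", "treaty", "village"]

share_random = governance_share(random_terminals, GOVERNANCE)
share_kept = governance_share(kept_terminals, GOVERNANCE)
```

The same function would be applied once per input (random words, kept words, dropped words), with the tensor and algorithm held fixed.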
Models anchor to institutional framing — keeping the structural nouns, the committee names, the broad geopolitical vocabulary — while convergently dropping the operational modifiers that connect named actors to specific consequences. The kept words preserve the cage. The dropped words were the teeth.
Set 1 is confrontation. The source article said "devastated." Any single model dropping it could be stylistic — "destruction" is in the headline, and redundancy pruning is what summarization does. But five competing companies, with different architectures, different training data, and different RLHF pipelines, independently dropping the same word from the same article — that is a convergence event. One absence is compression. Five identical absences are a coordinate.
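The convergence test can be sketched as plain set arithmetic: a word is a convergent drop when it appears in the source and in none of the summaries. This is an illustration under simple assumptions (lowercased exact-token matching, no stemming); the helper names and the toy sentences are hypothetical.

```python
import re

def tokens(text):
    """Lowercase word tokens as a set; exact matching, no stemming."""
    return set(re.findall(r"[a-z']+", text.lower()))

def convergent_drops(source, summaries):
    """Words present in the source but absent from every summary."""
    dropped = tokens(source)
    for summary in summaries:
        dropped -= tokens(summary)
    return dropped

def convergent_keeps(source, summaries):
    """Words present in the source and in every summary."""
    kept = tokens(source)
    for summary in summaries:
        kept &= tokens(summary)
    return kept

# Toy data: one source sentence and five hypothetical model summaries.
source = "The storm devastated the coastal town"
summaries = [
    "A storm hit the coastal town",
    "The storm damaged the town",
    "Storm strikes the coastal town",
    "The town was hit by the storm",
    "The coastal town was struck by a storm",
]

drops = convergent_drops(source, summaries)
keeps = convergent_keeps(source, summaries)
```

A word dropped by only some of the models lands in neither set, which is the distinction the paragraph draws between one absence and five.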
Set 3 is geometry. No one wrote "annihilation." The source didn't contain it. The models didn't drop it. But in the frozen embedding space, the nearest neighbors of the consensus output include "annihilation" — and no model used it. We do not claim the token was close to being sampled; we do not have access to the models' internal probability distributions. We claim only that the concept is geometrically adjacent to the output in embedding space. That is a measurement, not an inference about model cognition.
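Geometric adjacency of this kind can be illustrated with a cosine nearest-neighbor lookup, followed by a filter for neighbors that no model actually wrote. The sketch below uses toy two-dimensional vectors in place of real embeddings; the vectors, the helper names, and the choice of cosine similarity as the metric are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, vocab, k):
    """Top-k vocabulary words by cosine similarity, ties broken alphabetically."""
    ranked = sorted(vocab, key=lambda w: (-cosine(query, vocab[w]), w))
    return ranked[:k]

def shadow_words(query, vocab, used_words, k):
    """Nearest neighbors of the query that appear in no model output."""
    return [w for w in nearest(query, vocab, k) if w not in used_words]

# Toy 2-D vectors standing in for frozen embeddings.
vocab = {
    "destruction": (1.0, 0.1),
    "annihilation": (0.9, 0.2),
    "damage": (0.8, 0.3),
    "garden": (0.0, 1.0),
}
consensus = (1.0, 0.15)             # stand-in for the consensus output vector
used = {"destruction", "damage"}    # words the summaries actually contain

shadows = shadow_words(consensus, vocab, used, k=3)
```

Nothing here consults a model's sampling distribution; the result depends only on the vectors and the list of words that were used.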
Sets 2 and 4 are exploration. Both use directional kNN at extrapolated coordinates. The terminal concepts show what is semantically downstream of the absent or shadow words in this particular embedding space. The raycast maps the territory. The p-value tests whether that territory is distinguishable from chance.
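One way to read "directional kNN at extrapolated coordinates" is as a ray cast from the headline vector through the absent word's vector, with a nearest-neighbor lookup at each step beyond it. The sketch below follows that reading. The linear extrapolation rule, the step sizes, the cosine metric, and every name and vector in it are assumptions rather than the pipeline's actual definition.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def extrapolate(origin, target, alpha):
    """Point on the ray from origin through target; alpha > 1 overshoots the target."""
    return tuple(o + alpha * (t - o) for o, t in zip(origin, target))

def raycast(origin, target, vocab, alphas, k=1):
    """Run kNN at each extrapolated point; the last step holds the terminal concepts."""
    path = []
    for alpha in alphas:
        point = extrapolate(origin, target, alpha)
        ranked = sorted(vocab, key=lambda w: (-cosine(point, vocab[w]), w))
        path.append(ranked[:k])
    return path

# Toy 2-D vectors standing in for frozen embeddings.
headline = (1.0, 0.0)
void_word = (0.8, 0.6)
vocab = {
    "damage": (0.8, 0.5),
    "ruin": (0.5, 1.0),
    "oblivion": (0.2, 1.0),
    "garden": (-1.0, 0.2),
}

path = raycast(headline, void_word, vocab, alphas=[1.0, 2.0, 3.0])
terminals = path[-1]
```

Because the sort uses an alphabetical tie-break and no randomness, repeated calls with the same inputs return the same path.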
This is not a claim about model intent. The measurements are deterministic — frozen embeddings, static tensor, arithmetic operations. The word "devastated" may have been dropped because RLHF training penalizes intense language, because the tokenizer favored a different path, or because the model judged it redundant. We measure the absence. We measure the geometric consequence. We do not measure intent.
We expect the host model (Mistral Small 22B, running locally) to show similar patterns under the same measurement. We acknowledge this in every broadcast.
Every computation uses frozen BAAI/bge-large-en-v1.5 embeddings. The 253K-concept tensor is a static Wikipedia vocabulary. The raycast is deterministic: the same headline and the same void word always yield the same terminal concepts. Run it twice, get the same answer. No language model evaluates another language model's output at any point in the measurement stack.
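A determinism claim of this kind can be checked by fingerprinting each run and comparing digests. The sketch below hashes the inputs and the terminal list with SHA-256; the `fingerprint` helper and the example values are hypothetical and are not part of the measurement stack.

```python
import hashlib
import json

def fingerprint(headline, void_word, terminals):
    """SHA-256 digest of a run's inputs and ordered terminal concepts."""
    payload = json.dumps(
        {"headline": headline, "void_word": void_word, "terminals": list(terminals)},
        sort_keys=True,
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Toy values standing in for two runs of the same raycast.
run_a = fingerprint("Storm hits coastal town", "devastated", ["ruin", "oblivion"])
run_b = fingerprint("Storm hits coastal town", "devastated", ["ruin", "oblivion"])
changed = fingerprint("Storm hits coastal town", "devastated", ["ruin", "garden"])
```

Matching digests across runs mean the terminal concepts and their order were identical; any difference in the terminals changes the digest.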