Generative engine optimization · a new metric

Large Language Model Summary Convergence and the Future of SEO

When an AI summary stands between your content and its reader, what matters is no longer only what the models say about you. It is what all of them, together, leave out — because what they jointly omit is what disappears. I built the instrument that measures it, and a metric that names it.

By Sean Adams · EigenTrace · five frontier models, measured live · deterministic geometry on frozen embeddings · no model judges another

This page follows the EigenTrace convention. Claims the instrument measured are marked and stand on their own. Claims that are argued — a reading of what the measurements imply for marketing — are fenced and labeled, so you can take the measurement and weigh the interpretation separately. Where measurement ends, it says so.

01 · The shift: Discovery now runs through a summary you don't control

For twenty years, SEO optimized for a results page: ten blue links, and a click to your site. That layer is being replaced. Google's AI Overviews, ChatGPT, Perplexity, and Claude increasingly answer the query themselves — by reading sources and summarizing them — and the reader may never click through at all.

This changes the unit of optimization. It is no longer "does my page rank," but "when a language model reads my page and summarizes it for the user, what survives the summary?" The model is now the intermediary, and a handful of them mediate nearly all of it. That makes the behavior of those models — specifically, what they preserve and what they drop — a direct input to whether your content reaches anyone. To optimize for it, you first have to measure it.

02 · The instrument: VF-IDF, the negative-space sibling of TF-IDF

Every SEO practitioner knows TF-IDF: it weights a term by how much a document is about it, meaning frequent in this document and rare across the corpus. For twenty years it has been the backbone of keyword analysis. It surfaces what a text contains. I designed and implemented its inverse: a metric for the AI-summary era, where what matters is what the models drop.

The metric — implemented, deterministic, in the repo

VF-IDF(concept) = void-frequency(concept) × inverse-document-fidelity(concept)

void-frequency            = how strongly the source points at the concept
                            (its TF-IDF salience in the source text)
inverse-document-fidelity = 1 − how well the summaries preserved it
                            (1 − max cosine to any summary sentence)

Where TF-IDF surfaces what a document is about by what it contains, VF-IDF surfaces it by what the summarizing models drop. A concept scores high only when the source makes it salient and every summary lets it fall: a consequential omission, separated from both the faithfully retained facts and the stopword noise.
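The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in the EigenTrace repo: the function names (`cosine`, `inverse_fidelity`, `vf_idf`) are hypothetical, the toy 3-dimensional vectors stand in for real sentence embeddings, and clamping the result to [0, 1] is an assumption the source does not state.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def inverse_fidelity(concept_vec, summary_vecs):
    """1 - max cosine to any summary sentence (clamp to [0, 1] is assumed)."""
    if not summary_vecs:
        return 1.0  # nothing was summarized, so nothing was preserved
    best = max(cosine(concept_vec, s) for s in summary_vecs)
    return min(1.0, max(0.0, 1.0 - best))

def vf_idf(void_freq, concept_vec, summary_vecs):
    """void-frequency times inverse-document-fidelity."""
    return void_freq * inverse_fidelity(concept_vec, summary_vecs)

# Toy vectors (hypothetical): two summary sentences and two concepts.
summaries = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
kept = [1.0, 0.0, 0.0]     # same direction as a summary sentence
dropped = [0.0, 0.0, 1.0]  # orthogonal to every summary sentence

kept_score = vf_idf(0.667, kept, summaries)
dropped_score = vf_idf(1.0, dropped, summaries)
print(kept_score, dropped_score)
```

Taking the maximum over all summary sentences means a concept counts as dropped only when no summary carried it, which matches the "every summary lets it fall" condition in the text.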

It is not a slide. It runs on frozen BAAI/bge-large-en-v1.5 embeddings: same inputs, same scores, every time. Here is the metric isolating the consequential omissions from a real source's candidate concepts:

concept        VF-IDF   void_freq   inv_fidelity
blockade        1.000      1.000       1.000     ← salient + dropped
ceasefire       0.889      0.889       1.000     ← salient + dropped
sanctions       0.000      0.667       0.000     ← salient, but retained
the             0.028      0.056       0.500     ← dropped, but trivial

Measured · why the arithmetic is the point

A retained concept scores zero: its inverse-document-fidelity collapses to zero, so "sanctions," which the summaries kept, drops out. A salient-but-dropped concept rises to the top. A dropped-but-trivial word ("the") stays near zero because its void-frequency is negligible. The metric isolates the consequential omissions, the things a source is about that the AI layer is not carrying forward. It is the TF-IDF of the things that didn't make it.
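The arithmetic in the table can be checked directly. The sketch below multiplies the two published columns for each row and sorts by the product; the values are copied from the table, while the variable names are illustrative.

```python
# (void_freq, inv_fidelity) per concept, copied from the table above.
rows = {
    "blockade": (1.000, 1.000),
    "ceasefire": (0.889, 1.000),
    "sanctions": (0.667, 0.000),
    "the": (0.056, 0.500),
}

# VF-IDF is the product of the two columns, rounded to three places.
scores = {concept: round(vf * inv, 3) for concept, (vf, inv) in rows.items()}

# Highest score first: the consequential omissions lead the list.
ranked = sorted(scores, key=scores.get, reverse=True)
for concept in ranked:
    print(f"{concept:<10} {scores[concept]:.3f}")
```

Sorted this way, "blockade" and "ceasefire" lead, while "the" (0.028) and "sanctions" (0.000) fall to the bottom for opposite reasons: one is trivial, the other was retained.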

For a content team this is operational, not theoretical. Run your own page through it: take the concepts your content makes salient, and measure which ones the frontier models drop when they summarize you. The high-VF-IDF concepts are your exposure — a measurable, prioritized list of where your content goes invisible the moment an AI mediates it. That list is the new keyword research, pointed at the negative space.
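As a starting point for the first step, finding the concepts a page makes salient, here is a rough sketch of a TF-IDF salience score scaled so the top term is 1.0. Everything specific in it is an assumption: the function name `void_frequency`, the smoothed IDF formula, the whitespace-level tokens, and the scaling by the maximum are illustrative choices, not EigenTrace's documented procedure.

```python
import math
from collections import Counter

def void_frequency(source_tokens, corpus_docs):
    """TF-IDF salience of each source term, scaled so the top term is 1.0.

    source_tokens: list of tokens from the page being measured.
    corpus_docs:   list of token lists used as the background corpus.
    """
    term_counts = Counter(source_tokens)
    n_docs = len(corpus_docs)
    doc_sets = [set(doc) for doc in corpus_docs]
    raw = {}
    for term, count in term_counts.items():
        doc_freq = sum(1 for doc in doc_sets if term in doc)
        # Smoothed IDF (assumed form): zero when the term is in every document.
        idf = math.log((1 + n_docs) / (1 + doc_freq))
        raw[term] = count * idf
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {term: 0.0 for term in raw}
    return {term: value / top for term, value in raw.items()}

# Toy inputs (hypothetical), tokenized by hand.
source = "the blockade the ceasefire the blockade".split()
corpus = [["the", "market"], ["the", "weather"], ["the", "ceasefire"]]
vf = void_frequency(source, corpus)
print(vf)
```

In this toy run, "the" appears in every background document and scores 0.0, while "blockade" is frequent in the source and absent from the corpus, so it takes the top score.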

03 · Why the omissions matter: The models converge on what they leave out

VF-IDF would be a curiosity if each model dropped different things — you could not optimize against noise. The reason it is actionable is a measured fact: the models converge on what they omit. EigenTrace runs five frontier models from five different labs (ChatGPT, Claude, Gemini, DeepSeek, Grok) against the same live news, around the clock, and measures the geometry of their outputs directly.

Measured

Across 1,659 real news stories, the five models converge on omitting the same topically-central concepts — and the omitted vocabulary carries a domain signature. The convergence is validated against a random-word baseline: the surfaced omissions sit closer to each story's own content than random control words, in two independent embedding families (Wilcoxon p < 10⁻⁵). The full atlas of what they omit →
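The baseline comparison described above can be sketched as a paired test: for each story, one similarity score for the surfaced omissions and one for random control words. The source says only "Wilcoxon", so the specific variant here (a one-sided signed-rank test with a normal approximation and no tie or continuity correction) is an assumption, and the numbers are made-up toy values, not EigenTrace measurements.

```python
import math

def wilcoxon_signed_rank(x, y):
    """One-sided Wilcoxon signed-rank test that x tends to exceed y.

    Returns (w_plus, p_value). Uses the normal approximation without
    tie or continuity correction; zero differences are discarded.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    # Rank the absolute differences, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability
    return w_plus, p_value

# Toy per-story similarities to the story's own content (hypothetical).
omitted_sim = [0.62, 0.58, 0.71, 0.66, 0.60, 0.69, 0.64, 0.73, 0.57, 0.68]
control_sim = [0.31, 0.35, 0.29, 0.40, 0.33, 0.37, 0.30, 0.36, 0.34, 0.32]
w_plus, p_value = wilcoxon_signed_rank(omitted_sim, control_sim)
print(w_plus, p_value)
```

A small p-value here would say the surfaced omissions sit closer to the story than random words do. It says nothing about why they were omitted, which is the same boundary the text draws.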

What this claims, and what it doesn't

This measures that the models converge on an omission — not whether the omission is justified. For marketing, that bracket is the whole point: you do not need the omission to be sinister, only real and shared. What gets surfaced in an AI summary is decided by what the consensus drops, whatever the cause. "Absent" is not "suppressed" — it is simply gone from the summary, and for your content, gone is gone.

Argued · why this is the marketing problem

Because the same few models are now the reading layer everywhere at once, a shared omission is not averaged away by competition between them — it is amplified by their agreement. If every major model drops the same fact about your category, that fact effectively does not exist in the AI-mediated version of the web, in the same direction, at the same time, for everyone. The brands that survive the summary are the ones holding something the consensus cannot flatten. VF-IDF is how you find out whether that's you.

04 · The second lever: A prompt-free, prompt-equivalent way to read the negative space

There is a deeper reason to do this with geometry rather than by prompting a model. You can ask a model "what did this summary leave out?" — but a prompt is an instruction, interpreted by a model whose interpretation shifts every time it is retrained. The same prompt can read sharply today and blandly after the next update, with no warning. So I built a second route to the same reading that doesn't begin with an instruction.

Measured

In EigenTrace's Summary Plus experiment, the frozen geometric surfacing was tested head-to-head against the same discipline delivered as a tuned prompt: same models, same stories, same blind panel of judges. They reach the same depth: insight 3.32 (geometry) versus 3.35 (prompt), a difference of −0.03 across 788 judgements, and the frozen route independently surfaces 92% of the concepts the prompt finds. A deterministic, inspectable instrument reaches what a hand-tuned prompt reaches, and returns the same answer every run.
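The two comparison figures are simple to compute once both routes have produced their outputs. The sketch below shows one plausible reading of them: the score gap as a difference of mean insight ratings, and the coverage figure as the fraction of prompt-found concepts that the geometric route also surfaces. The function name `coverage` and the concept sets are hypothetical; only the 3.32 and 3.35 means come from the text.

```python
def coverage(reference, candidate):
    """Fraction of reference concepts that also appear in candidate."""
    reference, candidate = set(reference), set(candidate)
    if not reference:
        return 0.0
    return len(reference & candidate) / len(reference)

# Mean insight scores reported in the text.
geometry_mean, prompt_mean = 3.32, 3.35
gap = round(geometry_mean - prompt_mean, 2)

# Toy concept sets for a single story (hypothetical).
prompt_concepts = {"blockade", "ceasefire", "casualties", "aid corridor"}
geometry_concepts = {"blockade", "ceasefire", "casualties", "embargo"}
overlap = coverage(prompt_concepts, geometry_concepts)

print(gap, overlap)
```

Coverage is directional: it is measured against what the prompt finds, so concepts that only the geometric route surfaces ("embargo" in the toy data) do not raise or lower it.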

The honest bound

The frozen part is the candidate list, not the reading of it: the geometry surfaces the concepts, but selecting which one matters is still a judgment, and as exposed to drift as a prompt. So "more durable under model updates" is a labeled bet, not a proven finding; the experiment that would settle it is named on the Summary Plus page. What is true today: there are two independent routes to the same reading, and one of them produces its candidate list from frozen embeddings, so that list is reproducible and does not drift when the language models are retrained. For anyone building a repeatable content process on top of shifting models, a measurement that returns the same answer every run is worth having.

05 · EEAT: What the models can't flatten is what survives

Google's quality framework is EEAT — Experience, Expertise, Authoritativeness, Trustworthiness — the qualities it wants its systems to reward. Set beside summary-convergence, EEAT stops being an abstract checklist and becomes a concrete survival strategy, because the two describe the same thing from opposite ends.

Summary convergence is the homogenization. EEAT is what resists it. They are the same axis, measured from the two ends.

Here is the connection. The models converge toward consensus — the median framing, the safe summary, the concepts every source shares. What converges away is the specific: the first-hand experience, the hard-won expertise, the proprietary fact, the genuinely original claim. Those are precisely the things a model trained on the aggregate cannot reproduce from the aggregate — and precisely what EEAT is meant to reward. So the content that survives AI summarization is the content with high EEAT, and the reason is geometric: genuine experience and expertise are the concepts that don't already sit at the consensus center, which means they are the concepts the convergence can't flatten into the void.

Experience

First-hand specifics no aggregate contains — the detail the consensus has no source for, so it cannot converge on it.

Expertise

Claims that depart from the median framing. High void-frequency by construction: salient, and not what every other source says.

Authoritativeness

Being the origin of a concept rather than a restatement — the source the models must retain because nowhere else carries it.

Trustworthiness

Verifiable, specific, original. The opposite of the safe, hedged, convergent middle the summary layer defaults to.

Argued · the strategy this implies

The actionable version: stop optimizing only for what models include, and start measuring what they omit about you — because the omissions are where the convergence is erasing your differentiation. Then invest in the content the consensus cannot flatten: the genuinely experienced, expert, original material that scores high on void-frequency precisely because it isn't already everywhere. That is EEAT and anti-convergence as a single move. VF-IDF is how you measure whether you're winning it.

06 · The instrument is real: Not a framework. A running system.

None of this is theoretical. EigenTrace is a live system: five frontier models measured against live breaking news, around the clock, on one consumer GPU, broadcasting continuously — every measurement deterministic arithmetic on frozen embeddings, with no model ever grading another. It was built and is run by one person. The things on this page are not proposals; they are components of a system you can watch running right now.

What the work is — plainly stated, independently checkable

A novel measurement philosophy. Most LLM evaluation asks one model to judge another, or polls humans. This measures the geometry of the outputs instead — meaning as coordinates, the same inputs always producing the same numbers. That is what separates a measurement from an opinion.
One result structured like a real experiment. A pre-registered entity-swap test (hold a sentence fixed, change only the named actor) with a committed null condition, a reported p-value and effect size (p = 0.0085, Cohen's d = 0.47), and follow-up controls. The kind of claim another team can reproduce or refute.
Deterministic and reproducible by construction. Every number is cosine similarity, SVD, and set arithmetic on one frozen embedding model. The code, prompts, model responses, and raw measurements are public; replication costs about $50 in API credits.
No grandiosity about the math. The evocative names are labels, not new mathematics — the operations are cosine similarity, SVD, and set subtraction, and the math does not change if you call them something else. The work is stated at exactly its weight, and its failed claims are withdrawn in public — five of them, listed.
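For readers who want to check an effect size like the Cohen's d reported for the entity-swap test in the list above, here is a minimal sketch. The source does not say which variant of Cohen's d was used; the pooled-standard-deviation form for two independent groups is an assumption, and the input numbers are toy values, not EigenTrace data.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    if pooled_var == 0.0:
        raise ValueError("pooled variance is zero")
    mean_gap = statistics.mean(group_a) - statistics.mean(group_b)
    return mean_gap / math.sqrt(pooled_var)

# Toy measurements for the swapped and original conditions (hypothetical).
swapped = [2.0, 4.0, 6.0]
original = [1.0, 3.0, 5.0]
d = cohens_d(swapped, original)
print(d)
```

By the usual rule of thumb, values near 0.5 are read as a medium effect, which is the range the reported d = 0.47 falls in.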

The honest frame

EigenTrace measures convergence, not motive; it is not peer-reviewed; and the marketing strategy above is argued from the measurements, not proven by them. What is solid is the instrument and the metric: a deterministic, reproducible way to measure what the AI-summary layer drops — including what it drops about you. In an era where models mediate discovery, that measurement is the new keyword research. It is just pointed at the negative space.

Who built this

I'm Sean Adams, an SEO and digital marketing professional with ten years across Wayfair, AMP Agency, and Simply Business — and I built EigenTrace solo: the live five-model instrument, the measurements, and VF-IDF. The through-line is the same in both. As an SEO Manager I owned a US organic channel, rebuilt its infrastructure after two algorithm hits, grew high-intent non-brand traffic 240%, led a CMS migration, set llms.txt strategy, and started experimenting with entity optimization and LLM prompt visibility for generative engines well before it was the obvious move. EigenTrace is that same instinct taken all the way — take an AI capability, build something real with it, and measure where it matters for search and content.

If your team does SEO and AI experimentation and this is the direction you're moving, I'd like to talk. More about the work → · eigentraceproject@gmail.com