When an AI system answers a question with current information, it does not recall the web from memory. It runs a search, pulls in pages, reads short extracts from them, and writes its answer from those extracts. Those extracts are grounding snippets, and they are the atomic unit of visibility in AI search. You can rank first in traditional search and still be invisible here, because the model, not the user, is now the reader, and a selective one.
Every platform runs the same basic pipeline, tuned differently: search query → pages received → pages with readable content → pages cited. Received is every URL the search step returned; readable is the subset the model actually obtained text for; cited is the handful that appear as sources in the answer. The gap between received and cited is where each system shows its character.
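The received → readable → cited funnel can be expressed as a simple count. This is a minimal sketch: the `Page` record and its fields are hypothetical names chosen for illustration, not a structure any platform exposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    """Hypothetical record for one URL returned by the search step."""
    url: str
    text: Optional[str] = None  # None or empty when no readable content was obtained
    cited: bool = False         # True when the page appears as a source in the answer


def funnel(pages: list[Page]) -> dict[str, int]:
    """Count how many pages survive each stage of the pipeline."""
    received = len(pages)
    readable = sum(1 for p in pages if p.text)
    cited = sum(1 for p in pages if p.text and p.cited)
    return {"received": received, "readable": readable, "cited": cited}
```

Comparing the three counts per platform is one way to quantify the gap between received and cited.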
A grounding snippet is built by extractive summarization, not abstractive: the system pulls exact sentences from your page rather than paraphrasing. The unit of extraction is the individual sentence, scored against the query, and the top sentences are stitched together. Where the chosen sentences are not next to each other on the page, they are joined by an ellipsis, producing the familiar segment … segment … segment shape. This is not unique to Google: testing Claude shows it returns the same ellipsis-joined, sentence-stitched format, so the pattern appears to be a shared convention across assistants.
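The sentence-stitching behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the method any assistant actually uses: the word-overlap `score` function is a toy stand-in for a learned relevance model, and restoring the chosen sentences to page order before joining them is our assumption.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score(sentence: str, query: str) -> float:
    """Toy relevance score: fraction of query words present in the sentence."""
    q = set(re.findall(r"\w+", query.lower()))
    s = set(re.findall(r"\w+", sentence.lower()))
    return len(q & s) / len(q) if q else 0.0


def build_snippet(page_text: str, query: str, top_k: int = 3) -> str:
    """Pick the top_k sentences and stitch them, using an ellipsis across gaps."""
    sentences = split_sentences(page_text)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: score(sentences[i], query),
        reverse=True,
    )
    chosen = sorted(ranked[:top_k])  # assumption: keep original page order
    parts: list[str] = []
    prev = None
    for i in chosen:
        if prev is not None:
            # Adjacent sentences are joined by a space, non-adjacent by an ellipsis.
            parts.append(" " if i == prev + 1 else " … ")
        parts.append(sentences[i])
        prev = i
    return "".join(parts)
```

Because every sentence is scored on its own, a sentence that only makes sense alongside its neighbours is less likely to survive extraction intact.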
The pipeline runs prompt → query fanout → retrieval → extractive summarization → context assembly → synthesis and attribution. The observed traits of Google's extraction:
We replicate this behaviour closely by fine-tuning the open cross-encoder model microsoft/deberta-v3-large.
Before retrieval, the model breaks one prompt into several single-intent sub-queries, a separation of concerns where a multi-faceted question is split into individual dimensions of intent. Each sub-query retrieves its own set of sources, typically five to twenty. Because of fanout, a page can be grounded for one angle of a question and absent for another.
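To see how fanout makes a page visible for one angle and absent for another, it helps to invert the retrieval results. The sketch below assumes you already have a mapping from each sub-query to the URLs it retrieved; the function name and the example data are hypothetical, and the decomposition into sub-queries is done by the model, not by this code.

```python
def grounding_by_page(results: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert sub-query -> URLs into URL -> set of sub-queries that retrieved it."""
    by_page: dict[str, set[str]] = {}
    for sub_query, urls in results.items():
        for url in urls:
            by_page.setdefault(url, set()).add(sub_query)
    return by_page


# Hypothetical fanout for a prompt about choosing and pricing a service.
fanout_results = {
    "service pricing": ["https://example.com/pricing", "https://example.com/home"],
    "service process steps": ["https://example.com/how-it-works"],
}
coverage_map = grounding_by_page(fanout_results)
```

A page missing from a sub-query's set was simply never retrieved for that dimension of intent, regardless of how well it performs on the others.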
Most of your page never reaches the model. In one sample analysis, the system cited about 32% of the available characters, with per-source coverage ranging from roughly 21% to 65%. What gets kept is core service information, process steps, pricing and examples; what gets dropped is navigation, boilerplate, time-sensitive promotions, off-topic sections, and verbatim customer quotes.
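Coverage of this kind can be approximated as the share of a page's characters that reappear in its extracted sentences. The function below is a rough sketch and an assumption about how such a figure might be computed; the analysis quoted above may define coverage differently.

```python
def coverage(page_text: str, extracted_sentences: list[str]) -> float:
    """Fraction of page characters found verbatim in the extracted sentences.

    Simplification: each extracted sentence is counted once, and overlapping
    or repeated sentences are not deduplicated.
    """
    if not page_text:
        return 0.0
    cited_chars = sum(len(s) for s in extracted_sentences if s in page_text)
    return cited_chars / len(page_text)
```

A result of 0.32 would mean that roughly a third of the page's characters made it into the snippet.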
Grounding behaves like a fixed pie, not an expanding one. From an analysis of 7,060 queries, 2,275 pages and 883,262 snippets:
The lesson is blunt: density beats length. More content dilutes your coverage without increasing what gets selected; you are competing for share of a fixed pie.
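The dilution effect is simple arithmetic. In the sketch below, `selected_budget` is a hypothetical fixed number of characters the system will extract from a page; it is an illustrative parameter, not a measured value.

```python
def coverage_share(page_chars: int, selected_budget: int) -> float:
    """Share of a page that is selected when the selected amount is capped."""
    if page_chars <= 0:
        return 0.0
    selected = min(page_chars, selected_budget)
    return selected / page_chars


# Doubling the page length halves coverage while the selected amount stays the same.
short_page = coverage_share(page_chars=2000, selected_budget=1000)
long_page = coverage_share(page_chars=4000, selected_budget=1000)
```

Under this assumption, adding text beyond the budget only lowers the ratio; it does not add to the amount that is selected.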
The snippets do not stick around. AI search is single-turn transient: the raw extracts are injected into the context for one turn, then purged the moment the answer is finished, to save token space. Ask a follow-up and the model is working from its own earlier summary, not the original page. What persists of you is whatever was captured in that first snippet, filtered through the model's reading of it, not the broader page.
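The single-turn lifecycle can be modelled as a conversation that injects snippets for one call and keeps only the prompt and the answer afterwards. This is a simplified sketch: the `Conversation` class and the `synthesize` callback are hypothetical, and real systems may run a fresh search on a follow-up.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversation:
    # Persists across turns: (role, text) pairs for prompts and answers only.
    history: list[tuple[str, str]] = field(default_factory=list)

    def turn(
        self,
        prompt: str,
        snippets: list[str],
        synthesize: Callable[[list[tuple[str, str]]], str],
    ) -> str:
        # Snippets are part of the context for this one call only.
        context = (
            self.history
            + [("snippets", s) for s in snippets]
            + [("user", prompt)]
        )
        answer = synthesize(context)
        # The raw snippets are dropped; only the prompt and answer are stored.
        self.history += [("user", prompt), ("assistant", answer)]
        return answer
```

On the second turn, the only trace of the original page is whatever the first answer said about it.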
The same query, asked the same day, produced very different evidence on each platform:
The snippets a model exposes are not always reliable, even about themselves. In one case Gemini recited a grounding citation for a paper that does not exist, hallucinating while reporting its own grounding context. And the quantitative work above comes from our own measurements: we did not control for confounders such as authority and freshness, and we keep the raw data private for client confidentiality. Treat the numbers as strong directional signal rather than settled fact.
You can see this for yourself: our free grounding snippet tool at snippets.dejan.ai runs a live grounded search and shows which URLs Gemini pulls and the exact sentences it extracts. It is the practical groundwork behind Selection Rate Optimization.