On-Site Makes You Readable. Off-Site Makes You Cited.

Most AI-search strategies are SEO with a new label. The work that actually earns an AI citation happens off your site — on the sources the model already trusts. Here is the honest division of labor, and the five questions to ask before you hire a GEO agency.

Elizabeth S., Founder and Managing Partner of Citable

In this article
  1. Where do AI engines actually get their citations?
  2. What does on-site work actually get you?
  3. Why is the citation decided off your site?
  4. Which off-site sources do AI engines trust?
  5. So why does most “AEO” look exactly like SEO?
  6. The honest division of labor: structure, earn, cite
  7. Five questions to ask before you hire a GEO agency
  8. What this means for your roadmap

There is a version of AI-search strategy that is just SEO with a new acronym on the slide. Same keyword work, same blog cadence, same schema checklist, now labeled AEO (answer engine optimization) or GEO (generative engine optimization). It is not wrong, exactly. It is half the job, sold as the whole one.

The half it skips is the half that decides whether you get cited.

Where do AI engines actually get their citations?

When a buyer asks ChatGPT, Perplexity, Gemini, or Google’s AI Overviews “which tool should we use,” the engine does not read your homepage and render a verdict. It reads the intent behind the question, fans it out into sub-questions, retrieves candidate passages from across the open web, and then quotes the sources it trusts most.

For most commercial questions, those trusted sources are not your site. They are the places the rest of the internet talks about you: review platforms, community threads, comparison pages, listicles, press. Your own domain is one input among many — and on the questions that drive revenue, it is rarely the deciding one.

That is the uncomfortable structural fact underneath every honest GEO conversation. The answer is assembled somewhere you do not control, from sources you did not write.

What does on-site work actually get you?

It gets you readable. That is not nothing — it is the precondition for everything else.

The on-site layer is the work most agencies are comfortable selling: Organization and author schema, clean and crawlable architecture, clear entity signals, and content structured so an engine can lift a passage cleanly. Done well, it makes you eligible. A brand that fails this layer gets skipped, or worse, misdescribed with stale information it never gets to correct.
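To make the schema part of that layer concrete, here is a minimal sketch of Organization and author markup built as schema.org JSON-LD. The company name, URLs, headline, and author are placeholders invented for illustration, and the helper function names are ours, not part of any library. The output is the JSON text that would sit inside a script element of type application/ld+json.

```python
import json


def organization_schema(name, url, profiles):
    """Return a schema.org Organization object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs links the entity to its profiles on other sites.
        "sameAs": list(profiles),
    }


def article_schema(headline, author_name, author_title, org):
    """Return a schema.org Article object with author and publisher markup."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {
            "@type": "Person",
            "name": author_name,
            "jobTitle": author_title,
        },
        "publisher": {
            "@type": "Organization",
            "name": org["name"],
            "url": org["url"],
        },
    }


def to_json_ld(data):
    """Serialize a schema dict to the JSON text for a JSON-LD script element."""
    return json.dumps(data, indent=2, ensure_ascii=False)


# Placeholder values for illustration only (assumptions, not real entities).
org = organization_schema(
    "Example Co",
    "https://example.com",
    ["https://reviews.example.org/example-co"],
)
article = article_schema(
    "How Example Co handles onboarding", "Jane Doe", "Founder", org
)
print(to_json_ld(article))
```

Markup like this only states who you are in a form a parser can read. It is part of the floor described here, not a reason for an engine to name you.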

But eligibility is not selection. A perfectly structured, schema-rich, lightning-fast site tells an engine you can be read. It does not tell the engine you should be named. On-site is the floor. You can finish it — and then discover the citation is decided somewhere you have not touched.

Why is the citation decided off your site?

Because trust, to a retrieval system, is corroboration. An engine answering “which tool is best” is not looking for your claim that you are best — every vendor claims that. It is looking for independent signals that converge on the same answer. The more places the open web names you, in the contexts the question implies, the more confidently a model can cite you without hedging.

Your homepage is a single, self-interested source. A G2 page with hundreds of structured reviews, a Reddit thread where practitioners argue your category, a comparison page that lists you against the alternatives — those are corroboration. They are what an engine reaches for precisely because you did not write them.
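One rough way to picture corroboration is to count how many independent domains name you, ignoring your own. The sketch below is a toy illustration under that assumption. It is not how any engine actually scores trust, and the mention list, brand name, and domains are invented placeholders.

```python
from urllib.parse import urlparse


def corroborating_domains(mentions, brand, own_domain):
    """Return the set of third-party domains whose text names `brand`.

    mentions: iterable of (url, text) pairs, assumed to be collected already.
    """
    domains = set()
    for url, text in mentions:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        # Skip the brand's own site and its subdomains: a self-interested source.
        if host == own_domain or host.endswith("." + own_domain):
            continue
        if brand.lower() in text.lower():
            domains.add(host)
    return domains


# Hypothetical sample data for illustration only.
mentions = [
    ("https://www.example.com/", "Example Co is the best tool."),
    ("https://blog.example.com/launch", "Example Co ships a new feature."),
    ("https://reviews.example.org/example-co", "Example Co: 4.5 stars from 200 reviews."),
    ("https://forum.example.net/t/1", "We switched to Example Co last year."),
    ("https://forum.example.net/t/2", "Example Co vs the alternatives."),
    ("https://news.example.net/story", "A story about another vendor."),
]
found = corroborating_domains(mentions, "Example Co", "example.com")
print(sorted(found))  # ['forum.example.net', 'reviews.example.org']
```

Two pages on your own domain contribute nothing here, and two threads on the same forum count once. The count only moves when a new independent source names you.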

This is why off-site is not an add-on to GEO. It is the part of GEO that does the deciding.

Which off-site sources do AI engines trust?

Four categories carry most of the weight for commercial answers:

  • Review sites. G2, Capterra, Trustpilot, and their equivalents — especially when reviews are structured and numerous. For SaaS, this is often the single largest input to “best tool for X.”
  • Forums and communities. Reddit threads, Stack Overflow, Hacker News, niche communities. Models lean on these heavily because they read as candid and human — which is exactly why platforms like Reddit now license their content for AI retrieval.
  • Comparison pages and listicles. “Best X for Y” roundups and head-to-head pages. These exist to answer the same question the buyer is asking the AI, so the AI quotes them directly.
  • Press and podcasts. Earned mentions in publications and shows — but only if they are machine-readable. A podcast with no transcript carries far less weight than the same conversation published as text.

The pattern across all four: they are earned, not published. You cannot ship them from your own CMS on Friday afternoon. That difficulty is the entire reason most programs skip them.

So why does most “AEO” look exactly like SEO?

Because the on-site half is the comfortable half. It is a known checklist, it is fully under the agency’s control, and it produces a deliverable that looks like progress. Adding “AEO” to a deck of schema audits and blog posts is a relabel, not a strategy.

The off-site half is harder. It means digital PR, getting you into the comparison pages and roundups, making your reviews and press readable to a machine, and building authority on platforms you do not own. It is slower, less predictable, and harder to package. So it gets quietly dropped — and the client is told that “more content” is the path to AI visibility. It is not.

The honest version of this work names the floor as the floor and spends its real effort above it.

The honest division of labor: structure, earn, cite

Three moves, in order, and none of them optional:

  1. Structure. Make the site readable: schema, entity signals, extractable content, the on-site signals that get you eligible. Lead with the answer so a passage can be lifted cleanly. This is finite work; do it once, then maintain it.
  2. Earn. Build the off-site authority an engine actually trusts — reviews, comparison pages, forums, earned press with transcripts. This is the part that decides citation, and it never finishes.
  3. Cite. Measure where you are named across engines, against competitors, prompt by prompt — and feed what you learn back into structure and earn. The metric is Share of Answer, not Google rank.
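The measurement in step 3 can be sketched in a few lines. This assumes one simple definition of Share of Answer: the fraction of collected AI answers in which a brand is named. That definition is our working assumption for the example, and the engines, prompts, and brands below are placeholders. Collecting the answers themselves is outside the sketch.

```python
from collections import defaultdict


def share_of_answer(answers, brands):
    """Fraction of answers that name each brand (assumed definition)."""
    total = len(answers)
    shares = {}
    for brand in brands:
        named = sum(1 for a in answers if brand in a["named"])
        shares[brand] = named / total if total else 0.0
    return shares


def share_by_prompt(answers, brands):
    """Same metric, broken down prompt by prompt."""
    grouped = defaultdict(list)
    for a in answers:
        grouped[a["prompt"]].append(a)
    return {p: share_of_answer(rows, brands) for p, rows in grouped.items()}


# Hypothetical observations: which brands each engine named for each prompt.
answers = [
    {"engine": "engine-a", "prompt": "best tool for onboarding",
     "named": {"Example Co", "Rival Inc"}},
    {"engine": "engine-b", "prompt": "best tool for onboarding",
     "named": {"Rival Inc"}},
    {"engine": "engine-a", "prompt": "example co alternatives",
     "named": {"Example Co"}},
    {"engine": "engine-b", "prompt": "example co alternatives",
     "named": {"Example Co"}},
]
brands = ["Example Co", "Rival Inc"]

overall = share_of_answer(answers, brands)
per_prompt = share_by_prompt(answers, brands)
print(overall)     # {'Example Co': 0.75, 'Rival Inc': 0.5}
print(per_prompt)
```

The per-prompt view is the useful one: in this made-up data, Example Co leads overall but is named in only half the answers to the category question, which is where a competitor is being cited instead.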

Underneath all three sits one discipline that bridges on-site and off: be hard to substitute. Our published floor for that is concrete — pages carrying 19 or more sourced data points average meaningfully more AI citations than thin pages, and first-party data outperforms third-party because no competing source can claim it. The same property that makes your own pages citable makes you worth quoting when a third-party source describes you. That is what a Context Hub is built to compound.

Five questions to ask before you hire a GEO agency

Whether you are evaluating us or anyone else, the questions are the same. They separate a measurement-first program from a content mill with a new acronym:

  1. Are they tracking where you are cited in AI answers — or only Google rankings?
  2. Are they measuring your visibility against named competitors, prompt by prompt?
  3. Are they building off-site authority, or just shipping more blog posts?
  4. Are they making your reviews, press, and podcast mentions machine-readable for AI?
  5. Can they show you which third-party sources actually shape answers in your category?

If most of the answers come back as “Google rankings” and “more content,” you are being sold SEO with a new label.

What this means for your roadmap

Do the on-site work — it is real, and skipping it leaves you unreadable. But do not mistake it for the destination. A schema-perfect site that no third-party source corroborates is a brand that AI engines can read and still decline to name.

Get cited, and you become the default answer in your category: the one the model reaches for, with pipeline that comes to you instead of you chasing it. That is decided off your site, on the sources the engines trust. It is the harder half, and it is the half worth paying for.

If you want to see which of your pages are readable, which third-party sources shape your category, and where you stand against competitors prompt by prompt, that is exactly what our AI Visibility Audit measures — across the engines that decide.

Readable is necessary. Cited is decided elsewhere.

Source: Citable GEO methodology

Two halves of getting cited

Layer | What it is | What it gets you
On-site: readable | Schema, clean architecture, entity signals, extractable content | Eligibility: the engine can read and parse you
Off-site: cited | Reviews, forums, comparison pages, listicles, earned press and podcasts | Selection: the engine trusts you enough to name you

On-site is necessary and finite: you can finish it. Off-site is where the citation is decided, and it never finishes.

Frequently asked

Questions buyers ask before booking

Where do AI engines get the sources they cite?

Generative engines like ChatGPT, Perplexity, Gemini, and Google AI Overviews assemble answers by retrieving passages from across the open web, then quoting the sources they trust most. For most commercial questions — which tool to use, which provider is best — those trusted sources are third-party: review sites, forums, comparison pages, listicles, and press. Your own website is one input among many, and rarely the deciding one.

Is on-site SEO useless for AI search?

No. On-site work — schema, clean architecture, entity signals, extractable content — is necessary. It is what makes an engine able to read and parse you, and a brand that fails it can be skipped or misdescribed. But being readable only makes you eligible. It does not, on its own, make an engine choose to name you over a competitor. On-site is the floor, not the ceiling.

What counts as off-site authority for GEO?

Off-site authority is everything the rest of the internet says about you that an AI engine can read: your presence and rating on review sites like G2 and Capterra, mentions in Reddit threads and community forums, inclusion in comparison pages and listicles, and earned press and podcast appearances — ideally with transcripts so they are machine-readable. These are the sources models lean on when they decide who to cite.

How is real GEO different from relabeled SEO?

Relabeled SEO does the same on-page work — keywords, schema, blog posts — and calls it AEO or GEO. Real GEO measures where you are cited across AI engines versus competitors, repairs the on-site signals that make you readable, and then does the harder off-site work of earning authority on the third-party sources that shape answers. The tell is measurement: a real GEO program reports Share of Answer, not just Google rankings.

How do I know if my agency is doing real GEO or relabeled SEO?

Ask whether they track where you are cited in AI answers, whether they measure you against named competitors prompt by prompt, whether they build off-site authority or just ship more blogs, whether they make your reviews and press machine-readable, and whether they can name the third-party sources that shape answers in your category. If most answers come back as “Google rankings” and “more content,” it is SEO with a new label.

Ready to be cited by AI?

Two paths in. The free check tells you where you stand in 10 seconds. The paid audit (€1,200) tells you exactly what to fix, with a baseline you can measure forward from.