The CITE Framework. How brands earn citations from AI.
Four pillars — Crawl access, Identity, Trust signals, Extractability — that determine whether ChatGPT, Perplexity, Gemini, and Google AI Overviews cite your brand inside their answers. The methodology behind every Citable engagement, published openly.
CITE
Four pillars. Every engagement. Every month.
Detail
The four pillars in depth.
Each pillar has its own definition, its own measurement framework, and its own implementation playbook. The order matters: C gates everything; E is the highest-leverage but slowest pillar.
Pillar 01
Crawl access
AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider) can reach your content without robots.txt blocks, soft paywalls, or JavaScript-only rendering the crawler cannot resolve.
If the crawler cannot fetch you, you do not exist to the engine. This is the first eligibility gate. Most failures here are accidental: a robots.txt copied from a CMS template, a Cloudflare rule blocking unknown agents, a JS-only site without static fallback.
What we measure
- robots.txt directives (per-crawler allow/disallow)
- llms.txt presence and structure
- Server reachability across geographies
- Crawl-error patterns (5xx, timeouts, soft 404s)
- JS-rendered content audit (does the crawler see what the user sees?)
What we ship
- Per-crawler allow-list in robots.txt with documented rationale
- llms.txt with priority paths and brand summary (both sketched below)
- JS-render fallbacks where SSR is missing
- WAF / Cloudflare bot-rule audit
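A minimal sketch of the first two deliverables. The user-agent tokens are real crawler names; every path, comment, and summary line is a placeholder to adapt, not a recommendation for any specific site.

```
# robots.txt: explicit per-crawler policy, rationale documented next to each rule
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# Deliberate block example: keep the reason in the file so it survives handoffs
User-agent: Bytespider
Disallow: /
```

```
# Example Brand
> One-sentence summary of what the brand does and for whom.

## Priority paths
- [/framework](https://example.com/framework): published methodology
- [/pricing](https://example.com/pricing): plans and audit scope
```

The llms.txt sketch follows the draft llmstxt.org convention: an H1 title, a blockquote summary, then H2 sections of annotated links. The convention is still informal, so structure varies in the wild.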
Pillar 02
Identity
AI engines unambiguously identify your brand as a distinct entity — separate from competitors with similar names, separate from generic terms, with verifiable structural attributes.
Without entity identity, AI engines err on the side of not citing. Two brands named 'Apex' both lose. The brand with a clean schema graph, a Wikidata Q-ID, and a Google Knowledge Graph entry wins the toss-up because there is no ambiguity to resolve.
What we measure
- Schema.org Organization, Service, Person, Article markup coverage
- Wikidata Q-ID presence and accuracy
- Google Knowledge Graph entity presence
- Schema sameAs density (LinkedIn, Crunchbase, GitHub, official socials)
- Disambiguation against competing entities in your category
What we ship
- Full Schema.org JSON-LD deployment as a global @graph (sketched below)
- Wikidata entity creation or cleanup (we do not author Wikipedia articles ourselves)
- Founder Person schema with verified sameAs
- Knowledge Graph entity submission via official channels
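A minimal sketch of that @graph, reusing the hypothetical 'Apex' brand from above. Every URL, Q-ID, and name is a placeholder; only the Schema.org types and properties are real.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://apex.example/#org",
      "name": "Apex",
      "url": "https://apex.example/",
      "sameAs": [
        "https://www.wikidata.org/wiki/Q00000000",
        "https://www.linkedin.com/company/apex-example",
        "https://github.com/apex-example"
      ],
      "founder": { "@id": "https://apex.example/#founder" }
    },
    {
      "@type": "Person",
      "@id": "https://apex.example/#founder",
      "name": "Jane Founder",
      "sameAs": ["https://www.linkedin.com/in/jane-founder-example"]
    }
  ]
}
```

The @id cross-references are the point: the Organization and its founder resolve to each other as one graph, which is exactly the ambiguity-free structure engines reward.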
Pillar 03
Trust signals
Independent third-party authorities reinforce your entity claims. AI engines weight third-party signals heavily because self-claimed signals can be manipulated.
Self-attested authority is cheap. Third-party authority is expensive — and that scarcity is what makes it credible. A brand cited in Wikipedia, indexed in Wikidata, profiled by reputable press, and linked from the open knowledge graph compounds trust the way a good credit history compounds borrowing power.
What we measure
- Wikipedia presence (any language)
- Wikidata sameAs density and link integrity
- Reputable press mentions in training corpora
- Industry-directory and analyst-firm listings
- Citations from .gov, .edu, and high-DR domains
What we ship
- Wikidata sameAs expansion to maximize linked-data graph
- Digital PR for citation building (we earn placements; we do not buy them)
- Partnership and directory listings in your category
- Open-knowledge contributions where editorial guidelines permit
Pillar 04
Extractability
Your content is shaped so AI engines can lift it directly into answers. Reachable, identified, and trusted content still gets skipped if it is not extractable.
AI engines do not paraphrase well. They lift. A page that bundles a clean definition into a single paragraph beats a page that buries the same definition inside marketing language. FAQ schema, HowTo schema, semantic HTML, density of definitional sentences — these are what make a page liftable.
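To make the contrast concrete, here is the shape of a liftable definition block, with illustrative wording: one question-shaped heading, then a self-contained definitional sentence an engine can quote verbatim.

```html
<h2>What is extractability?</h2>
<p>Extractability is the degree to which a page's content can be lifted
directly into an AI-generated answer without rewriting.</p>
```

The same information spread across a hero banner and three marketing paragraphs means the same thing to a human and close to nothing to an engine.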
What we measure
- Definitional sentence density per page (see the sketch after this list)
- FAQ, HowTo, and Article schema coverage
- Semantic HTML structure (h1 → h6 hierarchy, lists, tables)
- Reading-level and chunk-size analysis
- JS-rendered content vs. static-extractable content ratio
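As a toy illustration of the first metric on this list, a definitional-density pass can be as simple as a sentence-level heuristic. This TypeScript sketch is illustrative only; real scoring also weighs placement, chunk size, and reading level.

```ts
// Toy heuristic: what fraction of a page's sentences are definitional?
// The regex is a stand-in; production scoring is richer than pattern-matching.
const DEFINITIONAL = /\b(is|are|means|refers to|is defined as)\b/i;

function definitionalDensity(text: string): number {
  const sentences = text
    .split(/(?<=[.!?])\s+/)       // naive sentence-boundary split
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (sentences.length === 0) return 0;
  const hits = sentences.filter((s) => DEFINITIONAL.test(s)).length;
  return hits / sentences.length;
}
```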
What we ship
- Top-20 page rewrites with definitional density and chunk structure
- FAQPage + HowTo + Article schema deployment (example below)
- Semantic HTML cleanup of legacy templates
- Cornerstone citable assets — pages designed from the start to be lifted
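A minimal FAQPage sketch with a single question, drawn from this page's own FAQ; the markup should always mirror the visible on-page Q&A.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How is CITE different from regular SEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "SEO ranks links in a results list. CITE optimizes for being cited inside an AI-synthesized answer."
      }
    }
  ]
}
```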
Mapping
How CITE maps to the free checker.
The free heuristic checker scores six structural signals. Each maps to a CITE pillar. The seventh dimension — Extractability — requires running real prompts and is the core of the paid audit.
| Checker dimension | CITE pillar | Coverage |
|---|---|---|
| AI crawler access | C | Direct |
| llms.txt presence | C | Direct |
| Schema markup | I | Direct |
| Google Knowledge Graph | I | Direct |
| Wikipedia presence | T | Direct |
| Wikidata sameAs | T | Identity + Trust |
| Extractability | E | Paid audit only; requires running real prompts at scale |
Methodology
CITE drives Measure → Repair → Compound.
The framework defines what we fix; the methodology sets the cadence. Every Citable engagement runs the same three phases, and CITE pillars are how we score, prioritize, and report inside each phase.
Measure
We run 50 prompts × 4 AI engines and score each against all four CITE pillars. The output is a baseline matrix: which pillars are weakest, which prompts you are missing, which competitors win the toss-up.
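One way to picture the baseline matrix as data, in an illustrative TypeScript shape (not our production schema; names and fields are assumptions for the sketch):

```ts
// Illustrative shape of the Measure-phase baseline: 50 prompts x 4 engines.
type Engine = "chatgpt" | "perplexity" | "gemini" | "ai-overviews";
type Pillar = "C" | "I" | "T" | "E";

interface PromptResult {
  prompt: string;          // one of the 50 tracked prompts
  engine: Engine;
  cited: boolean;          // did this engine's answer cite the brand?
  winningDomain?: string;  // who took the citation instead, if anyone
  weakestPillar?: Pillar;  // best-guess pillar failure behind a miss
}

// Share-of-Answer: the fraction of prompt-engine cells where the brand is cited.
function shareOfAnswer(results: PromptResult[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r.cited).length / results.length;
}
```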
Repair
Implementation sequenced by pillar weight × effort. C and I are usually shippable inside a 3-month sprint. T compounds over 6–12 months. E is iterative and continues for as long as new content ships.
Compound
Monthly re-checks track CITE delta. Every shipped fix is attributable to a pillar score change. No vanity metrics, no SEO theater — every percentage point of Share-of-Answer growth is mapped back to a concrete CITE intervention.
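As a sketch of what 'attributable' means in practice, each shipped fix can be logged against a pillar and read against the monthly deltas (again an illustrative shape, not our production schema):

```ts
// Illustrative Compound-phase ledger entry: one shipped fix, one pillar, one delta.
interface Intervention {
  shippedAt: string;   // ISO date the fix went live
  pillar: "C" | "I" | "T" | "E";
  description: string; // e.g. "per-crawler allow-list in robots.txt"
  soaBefore: number;   // Share-of-Answer at the prior monthly re-check
  soaAfter: number;    // Share-of-Answer at the following re-check
}

const soaDelta = (i: Intervention): number => i.soaAfter - i.soaBefore;
```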
FAQ
Framework questions answered.
Is the CITE Framework proprietary to Citable?
We coined the name and the structure. The four dimensions themselves emerged from observing how ChatGPT, Perplexity, Gemini, and AI Overviews select citations across 180+ engagements. The framework is published openly here — anyone can use it. We just ask you to credit Citable when you do, and link back to /framework if you reference it in your own work.
How is CITE different from regular SEO?
SEO ranks links in a results list. CITE optimizes for being cited inside an AI-synthesized answer. The technical primitives overlap (schema, crawl, content quality), but the success metric is fundamentally different: SEO measures position; CITE measures Share-of-Answer per prompt across engines. Many sites that win SEO lose CITE because they optimize for keywords rather than entity identity and extractability.
Why four pillars and not five or six?
We tested eleven candidate dimensions across 180+ engagements. Four clustered cleanly with no significant overlap. The others (page authority, content freshness, internal linking, etc.) turned out to be either subsets of an existing pillar or downstream effects of getting CITE right. Occam's razor — when in doubt, fewer pillars.
Can I run a CITE audit myself?
Partially. C, and most of I and T, are visible in our free heuristic checker (six structural checks, runs in 10 seconds, no email). The E pillar, extractability, requires running real prompts against AI engines at scale, observing which content gets lifted, and scoring extractability per page. That is what the paid audit does.
What happens to CITE pillars when AI models update?
The mechanics inside each pillar evolve. The pillars themselves do not. C, I, T, E are first-principles requirements — any AI engine that retrieves and synthesizes information needs all four. New retrieval architectures will change which signals matter inside each pillar, but the pillar structure has held up across two years of model updates.
Where can I learn more or cite the framework in my work?
Reference this page (/framework) and the methodology page (/methodology). For deeper engagements, the paid audit produces a per-pillar scorecard for your specific domain. For category-defining or analyst work, contact us — we share aggregate data on CITE distributions across SaaS, fintech, e-commerce, and prosumer verticals.
Ready to be cited by AI?
Two paths in. Free check tells you where you stand in 10 seconds. Paid audit tells you exactly what to fix, with a baseline you can measure forward from.
Prefer to talk first? Get in touch