We built an agents.json manifest for our agency site — what it does, why it matters, and the full spec we shipped

agents.json is a machine-readable site manifest designed for AI agents — a typed companion to llms.txt that exposes every page, capability, and endpoint. We shipped one for citable.agency. Here is what is in it, why we built it, and the exact format other teams can copy.

Elizabeth S.

Founder 5 min read

In this article
  1. What agents.json actually is
  2. Why we built it before there is a ratified standard
  3. What is inside ours
  4. How to ship one for your own site
  5. The compound effect with the rest of the GEO stack

In May 2026 we shipped citable.agency/agents.json — a machine-readable manifest that lets an AI agent discover the entire site without crawling a single HTML page. As far as we can tell, it is the first published agents.json manifest for an agency website. The spec is self-describing, the format is open, and the implementation took an afternoon. This post documents what is in it, why we shipped it, and the exact pattern any other team can copy.

What agents.json actually is

agents.json is a typed JSON file at the root of your website that exposes:

  • Site metadata — name, description, supported languages, contact, manifest version.
  • A typed page index — every canonical page (homepage, services, methodology, pricing, glossary, articles, case studies, tools) with a type tag, EN + ES URLs, and a one-line summary written for agent consumption.
  • Service definitions — structured offers an agent can match against a buyer’s stated need (e.g. “find a GEO audit under €1,500 delivered in under two weeks”).
  • Glossary entries — definitions an agent can quote directly.
  • Machine endpoints — sitemap, llms.txt, RSS, JSON-LD index — every other structured surface the agent might want to consume.

It is not a replacement for llms.txt. It is the typed companion. llms.txt is markdown an LLM reads in-context to prime an answer. agents.json is structured data an autonomous agent parses programmatically to drive a decision.
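
To make "parses programmatically to drive a decision" concrete, here is a minimal sketch of how an agent might match the services list against a buyer's constraints. The field names follow the trimmed manifest shown later in this post. The `matchServices` helper, the `BuyerNeed` type, and the way an upper bound is read out of the free-text `duration` string are assumptions for illustration, not part of any spec.

```typescript
// Hypothetical consumer-side types; field names mirror the manifest's "services" entries.
interface ServiceOffer { price: string; currency: string; duration: string }
interface ServiceEntry { id: string; name: string; summary: string; offer: ServiceOffer }

// Hypothetical description of what the buyer asked for.
interface BuyerNeed { maxPrice: number; currency: string; maxBusinessDays: number }

// Read the largest number out of a duration such as "7-10 business days".
// Assumption: durations are expressed in business days. Returns null if no number is found.
function upperBoundDays(duration: string): number | null {
  const numbers = duration.match(/\d+/g);
  if (!numbers) return null;
  return Math.max(...numbers.map(Number));
}

// Keep only the services that satisfy every constraint the buyer stated.
function matchServices(services: ServiceEntry[], need: BuyerNeed): ServiceEntry[] {
  return services.filter((service) => {
    const price = Number(service.offer.price);
    const days = upperBoundDays(service.offer.duration);
    return (
      service.offer.currency === need.currency &&
      Number.isFinite(price) &&
      price <= need.maxPrice &&
      days !== null &&
      days <= need.maxBusinessDays
    );
  });
}

// Example: a GEO audit under 1,500 EUR delivered within two weeks (10 business days).
const services: ServiceEntry[] = [
  {
    id: "geo-audit",
    name: "AI Visibility Audit",
    summary: "50-prompt Share of Answer baseline across 5 AI surfaces + 90-day roadmap.",
    offer: { price: "1200", currency: "EUR", duration: "7-10 business days" },
  },
];
const shortlist = matchServices(services, { maxPrice: 1500, currency: "EUR", maxBusinessDays: 10 });
console.log(shortlist.map((service) => service.id));
```

The free-text `duration` is the weak point here: a consumer has to guess at its meaning, which is one argument for publishing durations as structured numbers.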

Why we built it before there is a ratified standard

There is no W3C or IETF spec for agents.json yet. The closest active proposals are:

  • agents.json (Stripe-originated community effort) — a manifest of API actions and capabilities an agent can invoke.
  • /.well-known/agent — under W3C discussion, similar in spirit, broader in scope.

Waiting for the standard would be the safe move. We did not wait, for three reasons:

  1. The upside is asymmetric. If our format is wrong, we update it — spec_version lets us version-pin. If our format is roughly right and agents start parsing manifests at scale, we are legibly first in our category. The downside is a 200-line file we maintain. The upside is being the brand an agent shortlists when its buyer asks for a boutique GEO agency.

  2. Some agents already parse manifests of this shape. Several open-source agent toolkits and at least one major retrieval-augmented system read agents.json-style manifests today. The share of agentic traffic that prefers manifests is small but growing every quarter — the same curve llms.txt was on a year ago.

  3. Publishing forces clarity. The act of writing a typed inventory of every canonical page surfaces every place where our own taxonomy was fuzzy. We tightened our glossary, our service descriptions, and our homepage summary as a direct result of writing the manifest.
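
The version-pinning idea in the first reason can be sketched from the consumer side. This assumes `spec_version` is a `major.minor` string in which only a major bump is breaking. That convention and the `isSupportedSpec` helper are illustrative assumptions rather than anything the manifest guarantees.

```typescript
// Hypothetical helper: accept any manifest whose major version matches the one
// this consumer was written against.
// Assumption: spec_version is "major.minor" and only a major bump is breaking.
function isSupportedSpec(specVersion: string, supportedMajor: number): boolean {
  const match = /^(\d+)\.(\d+)$/.exec(specVersion);
  if (!match) return false; // unknown format: refuse rather than guess
  return Number(match[1]) === supportedMajor;
}

const pinnedManifest = { spec_version: "1.0" };
if (!isSupportedSpec(pinnedManifest.spec_version, 1)) {
  throw new Error(`Unsupported agents.json spec_version: ${pinnedManifest.spec_version}`);
}
```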

What is inside ours

Here is the actual top-level shape of citable.agency/agents.json, trimmed for readability:

{
  "$schema": "https://citable.agency/agents.json",
  "spec_version": "1.0",
  "generated_at": "2026-05-27T...Z",
  "site": {
    "name": "Citable",
    "url": "https://citable.agency",
    "description": "Boutique GEO + Technical SEO + Web Development + Infrastructure agency.",
    "languages": ["en", "es"]
  },
  "pages": [
    { "id": "homepage", "url": "/", "es_url": "/es/", "type": "Homepage", ... },
    { "id": "audit", "url": "/audit/", "es_url": "/es/auditoria/", "type": "Page", "title": "AI Visibility Audit · 1,200 EUR", ... },
    ...
  ],
  "services": [
    {
      "id": "geo-audit",
      "name": "AI Visibility Audit",
      "summary": "50-prompt Share of Answer baseline across 5 AI surfaces + 90-day roadmap.",
      "offer": { "price": "1200", "currency": "EUR", "duration": "7-10 business days" }
    },
    ...
  ],
  "glossary": [
    { "term": "Share of Answer", "definition": "The percentage of relevant AI responses in which a brand appears as a cited source.", "url": "/glossary#share-of-answer" },
    ...
  ],
  "endpoints": {
    "sitemap": "/sitemap.xml",
    "llms_txt": "/llms.txt",
    "rss": "/journal/rss.xml"
  }
}

The full manifest is live at /agents.json and updates on every deploy.
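
For consumers, the shape above maps onto a small set of TypeScript types. The sketch below is inferred from the trimmed example only, so fields hidden behind the `...` elisions are left out or marked optional, and the `resolveEndpoints` helper is hypothetical. It shows one thing an agent would likely do first: turn the relative endpoint paths into absolute URLs against `site.url`.

```typescript
// Types inferred from the trimmed example; the live manifest may carry more fields.
interface ManifestSite { name: string; url: string; description: string; languages: string[] }
interface ManifestPage { id: string; url: string; es_url?: string; type: string; title?: string }
interface AgentsManifest {
  spec_version: string;
  generated_at: string;
  site: ManifestSite;
  pages: ManifestPage[];
  endpoints: Record<string, string>;
}

// Hypothetical helper: resolve each relative endpoint path against the site's base URL.
function resolveEndpoints(manifest: AgentsManifest): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const [key, path] of Object.entries(manifest.endpoints)) {
    resolved[key] = new URL(path, manifest.site.url).href;
  }
  return resolved;
}

const exampleManifest: AgentsManifest = {
  spec_version: "1.0",
  generated_at: new Date().toISOString(), // placeholder timestamp, not the published value
  site: {
    name: "Citable",
    url: "https://citable.agency",
    description: "Boutique GEO + Technical SEO + Web Development + Infrastructure agency.",
    languages: ["en", "es"],
  },
  pages: [{ id: "homepage", url: "/", es_url: "/es/", type: "Homepage" }],
  endpoints: { sitemap: "/sitemap.xml", llms_txt: "/llms.txt", rss: "/journal/rss.xml" },
};
console.log(resolveEndpoints(exampleManifest));
```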

How to ship one for your own site

The implementation cost is small if you already have an Astro site with content collections, or a Next.js site with an equivalent typed content layer. The four steps:

  1. Define your PageType vocabulary. Pick a small, fixed set — ours is Homepage | Service | Methodology | Framework | Glossary | Article | CaseStudy | Tool | Page. Resist the urge to invent twenty.

  2. Inventory canonical pages by hand. Write a STATIC_PAGES array — every entry point an agent should know about, with EN + ES URLs and a one-line summary. Curated beats exhaustive: an agent wants the canonical entry points, not every blog post.

  3. Generate the rest from content collections. Use getCollection('journal') to emit articles. Use your glossary data to emit terms with definitions. Filter drafts and skip non-canonical language duplicates.

  4. Expose at /agents.json and link from llms.txt. In Astro, a single API route returning JSON.stringify(manifest, null, 2) with Content-Type: application/json is enough. Add a one-line pointer to agents.json inside llms.txt so any consumer reading the markdown summary discovers the richer manifest.
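
The four steps can be sketched end to end as below. To keep the example self-contained it does not import Astro: `journalEntries` is a hypothetical stand-in for what `getCollection('journal')` would return, the `slug`, `draft`, and `lang` fields are assumed, the `/journal/<slug>/` URL pattern is a guess, and the page summaries are placeholders. In an Astro project, the `GET` function would be exported from an endpoint file such as `src/pages/agents.json.ts`.

```typescript
// Step 1: a small, fixed PageType vocabulary.
type PageType =
  | "Homepage" | "Service" | "Methodology" | "Framework" | "Glossary"
  | "Article" | "CaseStudy" | "Tool" | "Page";

interface ManifestPage { id: string; url: string; es_url?: string; type: PageType; summary: string }

// Step 2: curated entry points, written by hand (summaries are placeholders).
const STATIC_PAGES: ManifestPage[] = [
  { id: "homepage", url: "/", es_url: "/es/", type: "Homepage", summary: "Placeholder summary." },
  { id: "audit", url: "/audit/", es_url: "/es/auditoria/", type: "Page", summary: "Placeholder summary." },
];

// Step 3: stand-in for getCollection('journal'); the field names are assumptions.
interface JournalEntry { slug: string; data: { description: string; draft: boolean; lang: string } }
const journalEntries: JournalEntry[] = [
  { slug: "example-post", data: { description: "Published English post.", draft: false, lang: "en" } },
  { slug: "draft-post", data: { description: "Unfinished post.", draft: true, lang: "en" } },
  { slug: "ejemplo-es", data: { description: "Spanish duplicate.", draft: false, lang: "es" } },
];

// Drop drafts and non-canonical language duplicates, then map to typed pages.
function articlePages(entries: JournalEntry[]): ManifestPage[] {
  return entries
    .filter((entry) => !entry.data.draft && entry.data.lang === "en")
    .map((entry) => ({
      id: entry.slug,
      url: `/journal/${entry.slug}/`, // assumed URL pattern
      type: "Article" as const,
      summary: entry.data.description,
    }));
}

function buildManifest(entries: JournalEntry[]) {
  return {
    spec_version: "1.0",
    generated_at: new Date().toISOString(),
    site: {
      name: "Citable",
      url: "https://citable.agency",
      description: "Boutique GEO + Technical SEO + Web Development + Infrastructure agency.",
      languages: ["en", "es"],
    },
    pages: [...STATIC_PAGES, ...articlePages(entries)],
    endpoints: { sitemap: "/sitemap.xml", llms_txt: "/llms.txt", rss: "/journal/rss.xml" },
  };
}

// Step 4: in Astro this would be `export function GET()` inside the endpoint file.
function GET(): Response {
  return new Response(JSON.stringify(buildManifest(journalEntries), null, 2), {
    headers: { "Content-Type": "application/json" },
  });
}
```

Because the manifest is assembled at build or request time, `generated_at` refreshes on every deploy without manual edits.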

The compound effect with the rest of the GEO stack

agents.json is not a current input to ChatGPT or Perplexity retrieval. It will not move your Share of Answer score next month. What it does is position you to be legible to the next layer of consumers — autonomous agents — the same way Schema.org positioned brands to be legible to current AI assistants two years ago.

The brands that shipped clean Organization + Service schema in 2024 are the brands ChatGPT cites confidently in 2026. The brands that ship clean agents.json manifests in 2026 will be the brands agentic systems shortlist in 2028. The cost to ship is an afternoon. The cost to ship two years late, when the standard is ratified and every competitor has one, is being structurally invisible to the layer above search.

If you want a second opinion on whether your stack is ready for agents — current and emerging — start with the AI Visibility Audit. It is the diagnostic every Citable engagement begins with: 50 prompts across 5 surfaces, documented Share of Answer baseline, structural gap analysis, and a 90-day roadmap. €1,200, 7–10 business days, no discovery call required.

Three machine-readable surfaces compared

Source: Citable Agency working spec

agents.json vs llms.txt vs sitemap.xml

Surface     | Format     | Consumer                 | Primary purpose
sitemap.xml | XML        | Search engine crawlers   | URL discovery + last-modified hints
llms.txt    | Markdown   | LLM in-context grounding | Prose summary an LLM can read at answer time
agents.json | Typed JSON | Autonomous AI agents     | Programmatic site discovery: pages, services, endpoints, capabilities

How we built ours

Source: Citable Agency, May 2026

Shipping agents.json on Astro in one afternoon

  1. Define the typed shape (30 min). Decide your PageType vocabulary (Homepage, Service, Methodology, Framework, Glossary, Article, CaseStudy, Tool, Page) and your site-level fields (name, description, languages, spec_version, generated_at).

  2. Inventory the canonical pages by hand (45 min). Write a STATIC_PAGES array: every entry point an agent should know about, with EN + ES URLs, a one-line summary, and a type tag. Resist generating from the filesystem; curated beats exhaustive for agent consumption.

  3. Generate the rest from content collections (1 hour). Use getCollection('journal') and your glossary data to emit articles and glossary terms with definitions. Filter out drafts and non-canonical languages.

  4. Expose at /agents.json and link from llms.txt (20 min). An Astro API route returns JSON.stringify(manifest, null, 2). Add a one-line pointer in llms.txt so any consumer that reads llms.txt discovers the richer manifest too.
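
The llms.txt pointer from the last step can be a single appended link line. The wording and placement of that line, and the `withAgentsPointer` helper, are assumptions; the sketch only shows how to add the pointer idempotently at build time.

```typescript
// Hypothetical pointer line, written as a markdown link so it fits a typical llms.txt.
const AGENTS_POINTER =
  "- [agents.json](/agents.json): typed JSON manifest of pages, services, and endpoints";

// Append the pointer once; leave the text untouched if it already mentions /agents.json.
function withAgentsPointer(llmsTxt: string): string {
  if (llmsTxt.includes("/agents.json")) return llmsTxt;
  const trimmed = llmsTxt.replace(/\s+$/, ""); // drop trailing whitespace before appending
  return `${trimmed}\n\n${AGENTS_POINTER}\n`;
}

const llmsTxtBefore = "# Citable\n\n> Boutique GEO agency.\n";
console.log(withAgentsPointer(llmsTxtBefore));
```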

Frequently asked

Questions buyers ask before booking

Is agents.json the same as llms.txt?

No. llms.txt is a markdown summary designed to be read in-context by a language model — prose that primes a chat answer. agents.json is a typed JSON manifest designed to be parsed programmatically by an autonomous agent — structure that drives a decision tree. They serve different stages of the agent workflow and a serious GEO baseline includes both.

Is there a ratified standard for agents.json?

Not yet. The closest active proposals are agents.json (Stripe/community) and /.well-known/agent (under W3C discussion). We published a self-describing manifest with a spec_version field so consumers can pin to a known shape, and we will track standards convergence as it happens. Shipping early beats waiting for a final spec — the upside is being discoverable to the agents that ship before the standard does.

What goes inside agents.json?

At minimum: site metadata (name, description, languages), a typed list of canonical pages (homepage, services, methodology, pricing, about), and pointers to machine endpoints (sitemap.xml, llms.txt, RSS, JSON-LD). At maximum: glossary entries with definitions, services with structured offers, contact endpoints, and a capabilities list (what the site can do for the agent). Our published manifest includes all of the above.
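
The "at minimum" list translates into a small structural check a consumer could run before trusting a manifest. This is a sketch: the required keys are taken from the answer above and from the example earlier in the post, and the `hasMinimumShape` type guard is hypothetical.

```typescript
// The minimum a consumer might insist on: site metadata, typed pages, endpoint pointers.
interface MinimalManifest {
  site: { name: string; description: string; languages: string[] };
  pages: { url: string; type: string }[];
  endpoints: Record<string, string>;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical type guard: true only if all three minimum sections are present and well-typed.
function hasMinimumShape(value: unknown): value is MinimalManifest {
  if (!isRecord(value)) return false;
  const site = value.site;
  const pages = value.pages;
  const endpoints = value.endpoints;
  if (!isRecord(site) || !Array.isArray(pages) || !isRecord(endpoints)) return false;
  const siteOk =
    typeof site.name === "string" &&
    typeof site.description === "string" &&
    Array.isArray(site.languages);
  const pagesOk = pages.every(
    (page) => isRecord(page) && typeof page.url === "string" && typeof page.type === "string",
  );
  const endpointsOk = Object.values(endpoints).every((path) => typeof path === "string");
  return siteOk && pagesOk && endpointsOk;
}

const rawManifest: unknown = JSON.parse(
  '{"site":{"name":"Citable","description":"Boutique GEO + Technical SEO + Web Development + Infrastructure agency.","languages":["en","es"]},' +
    '"pages":[{"url":"/","type":"Homepage"}],"endpoints":{"sitemap":"/sitemap.xml"}}',
);
console.log(hasMinimumShape(rawManifest));
```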

Will any AI agent actually read this in 2026?

Some already do. Agentic frameworks like Stripe's agents.json reader, several open-source agent toolkits, and at least one major retrieval-augmented system parse manifests of this shape today. Most agents still crawl HTML, but agents that parse manifests get typed, structured data instead of having to infer it from markup, and the share of agentic traffic that prefers manifests is growing every quarter.

Does publishing agents.json help with ChatGPT or Perplexity citation?

Indirectly. agents.json is not a current input to ChatGPT or Perplexity retrieval. The direct citation lift comes from schema, entity disambiguation, content extractability, and crawler access. agents.json is an agentic-web hedge — it positions you to be legible to autonomous agents the same way schema positions you to be legible to current AI assistants. Both compound.
