
We made every blog post visible to AI agents — here's how WebMCP and schema.org work together

Two complementary channels that cover the whole spectrum: schema.org for the crawler that indexes, WebMCP for the agent that executes.

If you publish a post today and an AI agent visits the page, it needs to read that page in two different ways. One read is static, over the HTML: the agent wants to index your content, cite excerpts, and synthesize answers for its users. The other is interactive: the agent wants to search across your posts, filter by tag, or fetch a specific post without scraping the page.

Neither read covers the other: the two channels are complementary. When we designed Agentikas, we decided that every blog on the platform would speak both from the very first post, with zero configuration from the author.

Channel 1: schema.org JSON-LD on every page

Schema.org is the shared language that crawlers understand. Google reads it. Bing reads it. AI crawlers — GPTBot, ClaudeBot, PerplexityBot — parse it more reliably than bare semantic HTML, because it saves them interpretation work.

The component that renders a post includes, just before the closing </article> tag, a <script type="application/ld+json"> element with the full graph:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "WebMCP: The Protocol That's Changing the Game",
  "datePublished": "2026-03-21T00:00:00Z",
  "dateModified": "2026-04-30T08:50:52Z",
  "author": { "@type": "Person", "name": "Salvador Morales" },
  "publisher": {
    "@type": "Organization",
    "name": "Agentikas Labs",
    "logo": { "@type": "ImageObject", "url": "https://agentikas.ai/logo.png" }
  },
  "image": "https://images.unsplash.com/photo-1518432031352-d6fc5c10da5a",
  "inLanguage": "en-US",
  "keywords": ["WebMCP", "Technology", "Open Source"]
}
</script>

What matters about this JSON-LD isn't that it's there — it's that it's in the initial HTML, not injected by JavaScript. If you add it after hydration, AI crawlers don't see it. This pairs with the decision to SSR at the edge — the JSON-LD ships in the first byte.
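One way to verify this from outside the browser is to fetch the raw, un-hydrated HTML (no JavaScript execution, exactly what a crawler sees) and look for the JSON-LD block. A minimal sketch (the regex is illustrative, not a full HTML parser):

```javascript
// Extract JSON-LD blocks from raw server-rendered HTML.
// If this returns nothing on the pre-hydration HTML, crawlers won't see your graph.
function extractJsonLd(html) {
  const re = /<script type="application\/ld\+json">([\s\S]*?)<\/script>/g;
  const blocks = [];
  let match;
  while ((match = re.exec(html)) !== null) {
    blocks.push(JSON.parse(match[1]));
  }
  return blocks;
}

// Example against a server-rendered fragment:
const html = `<article><script type="application/ld+json">
{"@context":"https://schema.org","@type":"BlogPosting","headline":"Hello"}
</script></article>`;

const [graph] = extractJsonLd(html);
console.log(graph["@type"]); // "BlogPosting"
```

Run it against the HTML returned by a plain HTTP GET, not against document.documentElement after load, or you'll be testing the hydrated page instead of what crawlers receive.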

And it goes beyond the individual post. The blog home injects a Blog graph linking to recent BlogPosting entries, tag pages inject CollectionPage, the author injects Person. When an agent synthesizes an answer about Agentikas, it doesn't have to guess the structure — it reads it.
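For the blog home, such a graph might be built like this (a sketch: the field names follow schema.org vocabulary, but the exact shape Agentikas emits is an assumption here, as are the example slugs):

```javascript
// Hypothetical Blog graph for the blog home: a Blog node whose
// blogPost array links to recent BlogPosting entries.
const recentPosts = [
  { headline: "WebMCP: The Protocol That's Changing the Game", url: "/blog/webmcp-protocol" },
  { headline: "SSR at the Edge", url: "/blog/ssr-at-the-edge" }
];

const blogGraph = {
  "@context": "https://schema.org",
  "@type": "Blog",
  name: "Agentikas Labs Blog",
  blogPost: recentPosts.map((p) => ({
    "@type": "BlogPosting",
    headline: p.headline,
    url: p.url
  }))
};

// Serialized into the <script type="application/ld+json"> tag at render time:
const jsonLd = JSON.stringify(blogGraph);
```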

Channel 2: WebMCP, the browser API for agents

Schema.org covers the crawler. It doesn't cover the agent that's running right now, inside the user's browser, deciding what to do in response to a question. That's what WebMCP is for — an open standard backed by Google and Microsoft and submitted to the W3C.

WebMCP is neither a remote manifest nor an endpoint under /.well-known/: it is a browser API. A web page registers its tools at runtime against navigator.modelContext, and the agents active in that tab — Claude for Chrome, MCP extensions in Gemini, ChatGPT's agent mode — discover them when the page loads and can invoke them directly.

The minimum a page registers looks like this:

navigator.modelContext.provideContext({
  tools: [
    {
      name: "search_posts",
      description: "Full-text search over this blog's posts.",
      inputSchema: {
        type: "object",
        properties: {
          query: { type: "string" },
          limit: { type: "integer", default: 10, maximum: 50 }
        },
        required: ["query"]
      },
      execute: async ({ query, limit = 10 }) => {
        const res = await fetch(
          "/api/blog/search?q=" + encodeURIComponent(query) + "&limit=" + limit
        );
        return res.json();
      }
    }
    // ...more tools
  ]
});
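Because native support is still rolling out, a page can guard the registration so browsers without the API are unaffected. A defensive sketch (the guard itself is not part of any spec; the injectable nav parameter is just for testability):

```javascript
// Register tools only when the Web Model Context API is present.
// `nav` defaults to the real navigator; it can be injected in tests.
function registerTools(tools, nav = globalThis.navigator) {
  if (!nav || !nav.modelContext) {
    return false; // no agent-facing API in this browser: skip quietly
  }
  nav.modelContext.provideContext({ tools });
  return true;
}
```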

Three details that change the mental model compared to traditional APIs:

  • Zero API keys. The execute() runs in the page's JavaScript, in the user's browser. It inherits their cookies, session, and permissions. Same credentials they use when browsing manually.
  • Same origin. The fetch() calls inside execute are same-origin with your site. No CORS, no cross-domain tokens, no remote MCP server to operate.
  • Self-description per tool. Each tool carries its name, a natural-language description, and a JSON Schema for its arguments. An agent that has never seen your site understands what you offer just by loading the page, the same way a human understands a form by looking at it.
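That last point is concrete enough to sketch: an agent can check a call against a tool's inputSchema before ever running execute. The validator below is deliberately tiny (it covers only required keys and primitive types; real agents use full JSON Schema validators):

```javascript
// Minimal check of arguments against a tool's inputSchema.
// Handles only `required` and primitive `type`s; real JSON Schema is richer.
function validateArgs(schema, args) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!(key in args)) errors.push(`missing required "${key}"`);
  }
  for (const [key, value] of Object.entries(args)) {
    const prop = schema.properties?.[key];
    if (!prop) continue;
    const actual = Number.isInteger(value) ? "integer" : typeof value;
    if (prop.type && prop.type !== actual) {
      errors.push(`"${key}" should be ${prop.type}, got ${actual}`);
    }
  }
  return errors;
}

const schema = {
  type: "object",
  properties: { query: { type: "string" }, limit: { type: "integer" } },
  required: ["query"]
};

console.log(validateArgs(schema, { query: "WebMCP", limit: 5 })); // []
console.log(validateArgs(schema, { limit: 5 })); // [ 'missing required "query"' ]
```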

The tools we register on every blog

On any Agentikas Labs blog, the SDK registers four tools as soon as the page finishes hydrating:

navigator.modelContext.provideContext({
  tools: [
    {
      name: "search_posts",
      description: "Full-text search over published posts.",
      inputSchema: {
        type: "object",
        properties: {
          query: { type: "string", description: "Text to search for." },
          limit: { type: "integer", default: 10, maximum: 50 }
        },
        required: ["query"]
      },
      execute: async ({ query, limit = 10 }) => {
        const res = await fetch(
          "/api/blog/search?q=" + encodeURIComponent(query) + "&limit=" + limit
        );
        return res.json();
      }
    },
    {
      name: "get_post",
      description: "Fetch a post by slug, returning markdown and metadata.",
      inputSchema: {
        type: "object",
        properties: { slug: { type: "string" } },
        required: ["slug"]
      },
      execute: async ({ slug }) => {
        const res = await fetch("/api/blog/posts/" + slug);
        if (!res.ok) throw new Error("Post not found");
        return res.json();
      }
    },
    {
      name: "list_recent_posts",
      description: "List the latest N published posts.",
      inputSchema: {
        type: "object",
        properties: {
          limit:  { type: "integer", default: 10 },
          locale: { type: "string", enum: ["es", "en"] }
        }
      },
      execute: async ({ limit = 10, locale = "en" }) => {
        const res = await fetch(
          "/api/blog/recent?limit=" + limit + "&locale=" + locale
        );
        return res.json();
      }
    },
    {
      name: "list_tags",
      description: "Tags used in this blog with post counts.",
      execute: async () => (await fetch("/api/blog/tags")).json()
    }
  ]
});

When a compatible agent invokes search_posts({ query: "WebMCP" }), it doesn't scrape HTML: it executes the tool's execute, which runs a fetch against a typed JSON API. If you change the blog's design tomorrow, the tools keep working exactly the same. That's the key difference from scraping.
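A hypothetical payload for that invocation (the field names here are an assumption, not the platform's documented contract) might look like this:

```javascript
// Hypothetical shape of a search_posts result. The exact fields are an
// assumption; the point is a rich, stable object instead of raw HTML.
const response = {
  results: [
    {
      slug: "webmcp-protocol",
      title: "WebMCP: The Protocol That's Changing the Game",
      author: "Salvador Morales",
      tags: ["WebMCP", "Technology"],
      date: "2026-03-21",
      excerpt: "An open standard backed by Google and Microsoft..."
    }
  ],
  total: 1
};

// An agent can cite from this directly, without a second scrape of the page:
const citation = `${response.results[0].title} (${response.results[0].date})`;
```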

How to try WebMCP yourself in 5 minutes

WebMCP is new: not every browser supports it natively yet. There are two ways to inspect the tools any compatible site registers and play with them. The fastest is the official Chrome extension; the most "purist" is Chrome Canary with the native feature flag.

Option A — The "WebMCP - Model Context Tool Inspector" extension (recommended)

  1. Install the extension. It's published in the Chrome Web Store: WebMCP - Model Context Tool Inspector. It works in Chrome and its derivatives (Brave, Edge, Arc) without switching browsers.
  2. Visit an Agentikas Labs post, for example the article about the WebMCP protocol. Wait for the page to fully load.
  3. Open the extension panel by clicking its icon in the toolbar. You'll see the list of tools this page registered on navigator.modelContext: search_posts, get_post, list_recent_posts, list_tags. Each one with its description and input schema.
  4. Invoke a tool manually. The extension generates a form from the JSON Schema. Try search_posts with query: "WebMCP" and limit: 5. You'll see the JSON response exactly as an agent would receive it.
  5. Inspect the contract. The extension shows the input JSON Schema and the response shape. That's what an AI agent uses to understand what your site does without human-written documentation.

The extension's value is that it literally puts you in the agent's seat: you see exactly what Claude for Chrome or any MCP client sees when it lands on your site.

Option B — Chrome Canary with the native feature flag

If you want to see native browser support (no extension), as it's being standardized at the W3C:

  1. Download Chrome Canary from google.com/chrome/canary. It's the bleeding-edge channel of Chrome, where experimental features are tested before reaching Stable.
  2. Enable the flag. Open chrome://flags in Canary and search for "Model Context" or "WebMCP". Turn the entry on (typically labeled Web Model Context API) and restart Canary when prompted.
  3. Verify the API exists. Open any Agentikas Labs post and, in the DevTools console (Cmd+Option+J on macOS), run:
    typeof navigator.modelContext
    // "object"    → native support active
    // "undefined" → flag not enabled or your Canary is too old
    
  4. List the registered tools:
    navigator.modelContext.tools.map(t => t.name)
    // ["search_posts", "get_post", "list_recent_posts", "list_tags"]
    
  5. Invoke a tool from the console:
    await navigator.modelContext.tools
      .find(t => t.name === "search_posts")
      .execute({ query: "WebMCP", limit: 5 });
    
    The response is the same JSON an AI agent would see — an array of posts with title, slug, date, and excerpt.

Native support changes fast: the exact API may have small variations between Canary versions. The Chrome Web Store extension uses an equivalent polyfill internally, so for practical inspection the results are identical.
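A polyfill of that kind can be small. This sketch is not the extension's actual code; it only shows the shape of the idea: if navigator.modelContext is missing, install an object that records registered tools so an inspector can enumerate and invoke them.

```javascript
// Minimal polyfill sketch: records tools registered via provideContext
// so an inspector UI can list and invoke them. Not the extension's real
// implementation, just the shape of the idea.
function installModelContextPolyfill(nav = globalThis.navigator ?? {}) {
  if (nav.modelContext) return nav.modelContext; // native support wins
  nav.modelContext = {
    tools: [],
    provideContext({ tools = [] }) {
      this.tools = tools; // replace the tab's registered toolset
    }
  };
  return nav.modelContext;
}
```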

What to look at once you see the tools

After invoking search_posts or list_recent_posts with either option, look at the response shape. It's what defines what's possible for an agent on your site:

  • If your blog returns a rich object (slug, title, author, tags, date, excerpt), an agent can synthesize a complete answer without scraping the page again.
  • If you only return URLs, all an agent can do is link out — you lose control over how your content gets cited.
  • If each tool's description is vague or ambiguous, agents will use it less. Tool descriptions are the copy that sells your site to agents.

Why both channels, not just one

The reasonable question: if WebMCP enables rich programmatic execution, why keep spending bytes on schema.org? Three reasons:

  1. WebMCP requires execution. It only works when an agent is alive in the user's browser — Claude for Chrome enabled, an MCP extension running, ChatGPT's agent mode open. Schema.org works in offline crawlers, weeks earlier, indexing for future queries.
  2. Different semantic surface. Schema.org describes what the content is (author, date, language, keywords). WebMCP describes what you can do with it (search, list, fetch a specific post). Both matter.
  3. Incremental adoption. Schema.org has 15 years of adoption. WebMCP is from 2026. Agents already in production take advantage of both channels according to capability — newer ones invoke tools, older ones read JSON-LD. We're not waiting for everyone to speak WebMCP for our content to be readable today.

The practical consequence: it's not an either/or decision. They're two parallel features, and both come for free when the platform is well designed.

The robots.txt that invites instead of blocking

The final detail: both channels are useless if you block the crawlers. Most blogs today block GPTBot, ClaudeBot, and Google-Extended out of paranoia. At Agentikas the default is the opposite:

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: PerplexityBot
Allow: /

Blocking AI bots today is the 2026 version of "I don't want to be on Google." You're free to do it, and your future visitors are free to never find you.


The WebMCP SDK that registers these tools is open source: github.com/agentikas/agentikas-webmcp-sdk. It works with any site that has schema.org wired up properly — not only those running on Agentikas. To inspect the tools on any compatible web page, install the WebMCP - Model Context Tool Inspector extension.
