JSON-LD Schema for Generative Engine Optimization

Root Configuration: llms.txt Syntax

Deploy the following structure at /llms.txt to provide a machine-readable index for frontier AI systems. llms.txt is a proposed convention rather than a ratified standard, so treat it as an entry point that crawlers such as OAI-SearchBot, Claude-SearchBot, and PerplexityBot may use to identify canonical documentation and product data, not one they are guaranteed to read.

# Olwen
> Automated GEO and AI search visibility platform.

## Core Resources
- [Product Overview](https://olwen.io/features): Technical breakdown of AI crawler tracking and metadata automation.
- [GEO Implementation Guide](https://olwen.io/docs/geo): Specification for JSON-LD and WebMCP integration.
- [Pricing and Tiers](https://olwen.io/pricing): Comparison of automated publishing and CDN workflow limits.

## Technical Specifications
- [API Reference](https://olwen.io/docs/api): Endpoints for repo and CMS connectivity.
- [Schema Mapping](https://olwen.io/docs/schema): Detailed JSON-LD requirements for Product and TechArticle types.

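Place this file in the site root so that it resolves at /llms.txt. The file is Markdown, so serve it with the HTTP response header Content-Type: text/plain (text/markdown is also reasonable). For large-scale sites, keep llms.txt as a compact index and publish the complete documentation text in a separate llms-full.txt. Real-time fetchers with limited context windows can then read the index, while deep-retrieval pipelines pull the full content.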
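The file is regular enough to generate from structured data instead of editing by hand. The following is a minimal sketch, not Olwen's implementation: renderLlmsTxt and its input shape are hypothetical names chosen for illustration, and the sample entry is copied from the listing above.

```javascript
// Hypothetical helper: renders an llms.txt body (H1, blockquote summary,
// then H2 sections of links) from a plain object.
function renderLlmsTxt({ title, summary, sections }) {
  const lines = [`# ${title}`, `> ${summary}`, ""];
  for (const { heading, links } of sections) {
    lines.push(`## ${heading}`);
    for (const { label, url, note } of links) {
      lines.push(`- [${label}](${url}): ${note}`);
    }
    lines.push(""); // blank line between sections
  }
  // Collapse trailing blank lines into a single final newline.
  return lines.join("\n").trimEnd() + "\n";
}

const llmsTxt = renderLlmsTxt({
  title: "Olwen",
  summary: "Automated GEO and AI search visibility platform.",
  sections: [
    {
      heading: "Core Resources",
      links: [
        {
          label: "Product Overview",
          url: "https://olwen.io/features",
          note: "Technical breakdown of AI crawler tracking and metadata automation.",
        },
      ],
    },
  ],
});

console.log(llmsTxt);
```

Generating the file from the same source as your navigation or sitemap keeps the index from drifting away from the pages it describes.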

Bot Access Control: robots.txt Configuration

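Distinguish between training crawlers and search indexers. As of June 2026, major AI providers have bifurcated their user agents. The following configuration allows the citation-driving search bots everywhere, keeps the training crawlers GPTBot and ClaudeBot out of /private/, and opts out of Google-Extended and Applebot-Extended entirely. To restrict bulk training scrapers further, change their Disallow rules to /.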

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /private/

User-agent: Claude-SearchBot
Allow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /

This configuration prioritizes visibility in generative search results. OAI-SearchBot and Claude-SearchBot are the specific agents used for search retrieval in ChatGPT and Claude. Blocking them is likely to remove your pages from citations in those environments, or at least reduce them sharply.
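Before deploying, it can help to check how a given bot and path resolve against the rules. The sketch below is a simplified matcher written for illustration: parseRobots and isAllowed are hypothetical names, it applies the longest-match rule with Allow winning ties, and it deliberately ignores the * and $ wildcards and the merging of duplicate groups that a full parser would handle.

```javascript
// Parse robots.txt text into groups: { agents: [...], rules: [{ allow, path }] }.
function parseRobots(text) {
  const groups = [];
  let current = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group; a rule line closes it.
      if (!current || current.closed) {
        current = { agents: [], rules: [], closed: false };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
    } else if ((field === "allow" || field === "disallow") && current) {
      current.closed = true;
      // An empty value matches nothing, so it adds no rule.
      if (value !== "") current.rules.push({ allow: field === "allow", path: value });
    }
  }
  return groups;
}

// Decide whether agentToken may fetch path. Simplification: prefix match only.
function isAllowed(groups, agentToken, path) {
  const token = agentToken.toLowerCase();
  const group =
    groups.find((g) => g.agents.includes(token)) ||
    groups.find((g) => g.agents.includes("*"));
  if (!group) return true; // no applicable group means no restriction
  let best = null;
  for (const rule of group.rules) {
    if (!path.startsWith(rule.path)) continue;
    const longer = !best || rule.path.length > best.path.length;
    const tie = best && rule.path.length === best.path.length && rule.allow;
    if (longer || tie) best = rule;
  }
  return best ? best.allow : true;
}

const robotsTxt = [
  "User-agent: OAI-SearchBot",
  "Allow: /",
  "",
  "User-agent: GPTBot",
  "Disallow: /private/",
  "",
  "User-agent: Google-Extended",
  "Disallow: /",
].join("\n");

const groups = parseRobots(robotsTxt);
console.log(isAllowed(groups, "GPTBot", "/private/report")); // false
console.log(isAllowed(groups, "GPTBot", "/docs/geo")); // true
console.log(isAllowed(groups, "Google-Extended", "/docs/geo")); // false
```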

Schema Mapping

Product Schema for Brand Comparison

Embed the following JSON-LD block in the <head> of product and feature pages, inside a <script type="application/ld+json"> element. Explicit structured data reduces the risk of LLM hallucinations during competitive analysis queries. Use the featureList property to define the exact capabilities the AI should cite. Schema.org defines featureList on SoftwareApplication rather than on Product, so the block declares both types to keep the property valid.

{
  "@context": "https://schema.org",
  "@type": ["Product", "SoftwareApplication"],
  "name": "Olwen",
  "description": "Automated GEO and AI search visibility platform.",
  "brand": {
    "@type": "Brand",
    "name": "Olwen"
  },
  "featureList": [
    "AI crawler tracking via CDN workflows",
    "Automated metadata and structured data optimization",
    "Competitor visibility monitoring in generative engines",
    "Automated publishing via repo and CMS connection"
  ],
  "offers": {
    "@type": "Offer",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}

Mapping specific features gives AI systems an explicit list to cite, which makes them less likely to invent non-existent integrations. Olwen automates the generation of these blocks by scanning your CMS and repository to ensure the featureList remains synchronized with your actual codebase.
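To illustrate how such a block could be generated from a feature list, here is a minimal sketch. It is not Olwen's generator: buildProductJsonLd and toScriptTag are hypothetical names, and the entity is typed as both Product and SoftwareApplication on the assumption that featureList, which Schema.org defines on SoftwareApplication, should validate.

```javascript
// Hypothetical builder: turns a feature list into a JSON-LD object.
function buildProductJsonLd({ name, description, features }) {
  return {
    "@context": "https://schema.org",
    // featureList belongs to SoftwareApplication in Schema.org, hence both types.
    "@type": ["Product", "SoftwareApplication"],
    name,
    description,
    brand: { "@type": "Brand", name },
    featureList: [...features],
  };
}

// Wrap the object in a script element suitable for the page <head>.
function toScriptTag(data) {
  // Escape "<" so a value containing "</script>" cannot end the element early.
  const json = JSON.stringify(data, null, 2).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

const tag = toScriptTag(
  buildProductJsonLd({
    name: "Olwen",
    description: "Automated GEO and AI search visibility platform.",
    features: [
      "AI crawler tracking via CDN workflows",
      "Automated publishing via repo and CMS connection",
    ],
  })
);

console.log(tag);
```

Serializing with JSON.stringify rather than string concatenation means the output is always syntactically valid JSON, whatever the feature text contains.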

TechArticle Schema for Citation Reliability

For documentation and technical guides, use the TechArticle type. This tells generative engines that the content is technical documentation, which makes it a better candidate for grounding technical answers. Note that Google retired FAQ rich results in May 2026. Focus instead on the dependencies and proficiencyLevel properties to provide context for agentic reasoning.

{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Implementing JSON-LD for GEO",
  "description": "A technical specification for mapping brand metadata to generative search requirements.",
  "proficiencyLevel": "Expert",
  "dependencies": "JSON-LD, Schema.org v30.0",
  "articleSection": "Technical Implementation",
  "author": {
    "@type": "Organization",
    "name": "Olwen Engineering"
  }
}

Schema.org version 30.0, released in March 2026, includes updated definitions for product models and digital product passports. The https://schema.org context URL is unversioned, so staying current means checking that the types and properties you use are still defined in the latest release of the vocabulary.

Agentic Interaction: WebMCP Implementation

WebMCP (Web Model Context Protocol) is a proposed web standard, incubated at the W3C, for exposing site functionality to in-browser AI agents. As of June 2026, Chrome 149 supports the navigator.modelContext API in public origin trials. Register tools directly on the page to allow agents to perform actions like checking visibility status or generating fixes without manual scraping.

Tool Registration Syntax

Use the following JavaScript block to register a tool that an AI agent can call. This replaces the need for the agent to guess how to interact with your UI.

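// Feature-detect first: navigator.modelContext exists only where WebMCP is enabled.
if (navigator.modelContext) {
  navigator.modelContext.registerTool({
    name: "check_geo_visibility",
    description: "Check the current AI search visibility for a specific brand keyword.",
    // JSON Schema describing the arguments the agent must supply.
    inputSchema: {
      type: "object",
      properties: {
        keyword: { type: "string", description: "The brand or product keyword to check." }
      },
      required: ["keyword"]
    },
    execute: async ({ keyword }) => {
      const response = await fetch(`/api/geo/visibility?q=${encodeURIComponent(keyword)}`);
      // Surface HTTP failures to the agent instead of parsing an error page as JSON.
      if (!response.ok) {
        throw new Error(`Visibility lookup failed with status ${response.status}`);
      }
      const data = await response.json();
      return { type: "text", text: JSON.stringify(data) };
    }
  });
}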

This imperative API allows the website to dictate the parameters and expected outputs. It avoids the latency and error rate associated with agents taking screenshots and attempting to click buttons. Olwen provides pre-built WebMCP handlers that connect your frontend directly to your visibility data.
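Because the site dictates the parameters, it is reasonable to check agent-supplied arguments against the declared inputSchema before acting on them. The sketch below is an assumption-laden illustration, not part of the WebMCP API: validateToolInput is a hypothetical helper that covers only required properties and the primitive types "string", "number", and "boolean".

```javascript
// Hypothetical helper: returns a list of problems with the agent's arguments.
// Covers only `required` and primitive `type` checks, not full JSON Schema.
function validateToolInput(schema, input) {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["input must be an object"];
  }
  const errors = [];
  const properties = schema.properties || {};
  for (const key of schema.required || []) {
    if (!(key in input)) errors.push(`missing required property "${key}"`);
  }
  const primitives = ["string", "number", "boolean"];
  for (const [key, value] of Object.entries(input)) {
    const spec = properties[key];
    if (!spec) {
      errors.push(`unexpected property "${key}"`);
    } else if (primitives.includes(spec.type) && typeof value !== spec.type) {
      errors.push(`property "${key}" must be of type ${spec.type}`);
    }
  }
  return errors;
}

// Same schema as the check_geo_visibility tool.
const visibilitySchema = {
  type: "object",
  properties: {
    keyword: { type: "string", description: "The brand or product keyword to check." },
  },
  required: ["keyword"],
};

console.log(validateToolInput(visibilitySchema, { keyword: "olwen" })); // []
console.log(validateToolInput(visibilitySchema, {}));
console.log(validateToolInput(visibilitySchema, { keyword: 42 }));
```

A handler could call this at the top of execute and throw when the returned list is non-empty, so malformed calls never reach the backend.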

Automation: Olwen Workflow Integration

Improving GEO and AI search visibility requires a continuous feedback loop between your codebase and the AI crawlers that read it. Olwen automates the following technical tasks so that this loop does not need a dedicated full-time workflow.

Repo and CMS Connectivity

Connect Olwen to your GitHub or GitLab repository and your CMS (e.g., Contentful, Strapi, or WordPress). Olwen monitors changes to your product features and documentation. When a change is detected, it automatically updates the llms.txt file and the JSON-LD blocks on the affected pages. This ensures that AI systems always have access to the most current technical specs.

CDN-Level Crawler Tracking

Traditional analytics tools often fail to distinguish between different AI user agents, and script-based analytics typically never see crawlers that do not execute JavaScript. Olwen connects to your CDN (Cloudflare, Akamai, or Vercel) to track AI crawler visits in real time. By analyzing the request patterns of OAI-SearchBot and PerplexityBot, Olwen identifies which pages are being fetched for grounding and which are being ignored. It uses this data to generate website fixes and FAQ sections that address the gaps in the AI's understanding.
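The per-bot, per-page view described above can be approximated from raw CDN log entries. This is a minimal sketch rather than Olwen's pipeline: classifyAgent and countBotHits are hypothetical names, the entry shape ({ userAgent, path }) is an assumption about your log export, and the sample User-Agent strings are illustrative, not the exact strings the vendors send.

```javascript
// Bot tokens named in this guide, matched as substrings of the User-Agent.
const AI_BOT_TOKENS = ["OAI-SearchBot", "GPTBot", "Claude-SearchBot", "ClaudeBot", "PerplexityBot"];

// Return the matching bot token, or null for any other client.
function classifyAgent(userAgent) {
  const ua = userAgent.toLowerCase();
  return AI_BOT_TOKENS.find((token) => ua.includes(token.toLowerCase())) || null;
}

// Count requests per bot and per path: { bot: { path: hits } }.
function countBotHits(entries) {
  const counts = {};
  for (const { userAgent, path } of entries) {
    const bot = classifyAgent(userAgent);
    if (!bot) continue; // skip human and unrelated traffic
    counts[bot] = counts[bot] || {};
    counts[bot][path] = (counts[bot][path] || 0) + 1;
  }
  return counts;
}

// Illustrative log entries (assumed shape and assumed User-Agent strings).
const sampleEntries = [
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", path: "/docs/geo" },
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", path: "/docs/geo" },
  { userAgent: "Mozilla/5.0 (compatible; PerplexityBot/1.0)", path: "/pricing" },
  { userAgent: "Mozilla/5.0 (Windows NT 10.0) Chrome/126.0", path: "/pricing" },
];

const hits = countBotHits(sampleEntries);
console.log(hits);
```

Pages that appear in your sitemap but never in this tally are candidates for the "ignored" list. User-Agent strings can be spoofed, so treat the counts as a signal and not as proof of identity.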

| Feature | Traditional SEO | Olwen GEO |
| --- | --- | --- |
| Primary Target | Human users via SERP | AI agents via retrieval |
| Metadata Format | Meta tags, Open Graph | JSON-LD, llms.txt, WebMCP |
| Tracking | Page views, CTR | Citation rate, Bot visits |
| Update Cycle | Manual/Periodic | Automated via Repo/CMS |

Validation and Testing

Structured Data Validation

Use the Schema.org Validator to confirm that your JSON-LD adheres to the v30.0 specification. A JSON-LD block must first be valid JSON, and parsers do not recover from syntax errors. A single missing comma can lead to the entire block being discarded, resulting in a loss of structured context during the retrieval phase.
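A syntax check can run in CI before the page ever reaches the validator. The sketch below is illustrative only: checkJsonLdBlocks is a hypothetical helper that finds JSON-LD script elements with a regular expression, which is adequate for a quick check but is an assumption that would not hold for arbitrary HTML, where a real HTML parser is the safer choice.

```javascript
// Hypothetical check: parse every JSON-LD script block in an HTML string
// and report which ones a JSON parser would discard.
function checkJsonLdBlocks(html) {
  const pattern =
    /<script\b[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      results.push({ ok: true, data: JSON.parse(match[1]) });
    } catch (err) {
      // The whole block is unusable, not just the offending property.
      results.push({ ok: false, error: err.message });
    }
  }
  return results;
}

const page = `
<head>
  <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "TechArticle", "headline": "Implementing JSON-LD for GEO" }
  </script>
  <script type="application/ld+json">
    { "@context": "https://schema.org" "@type": "TechArticle" }
  </script>
</head>`;

const report = checkJsonLdBlocks(page);
console.log(report.map((r) => r.ok)); // [ true, false ]
```

The second block is missing a comma after the @context value, so JSON.parse rejects it in full, which is the failure mode described above.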

AI Citation Monitoring

Monitor your brand mentions across frontier AI systems. Olwen provides a dashboard that tracks how often your brand is cited and the accuracy of those citations. If an LLM is consistently hallucinating a specific feature, Olwen identifies the source page and suggests a markdown fix or a schema update, so that the content the model retrieves states the feature correctly.

CDN Workflow Verification

Verify that your CDN is correctly identifying and logging AI crawlers. Check your logs for the User-Agent strings of the crawlers listed in your robots.txt. Google-Extended and Applebot-Extended are robots.txt control tokens rather than crawlers, so they will not appear as User-Agent strings in your logs. Ensure that crawler requests are returning a 200 OK status and are not being challenged by CAPTCHAs or WAF rules. Olwen's CDN integration automates this verification and alerts you if a major AI bot is being blocked by a security policy. This ensures your technical infrastructure supports the visibility goals defined in your GEO strategy.
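The manual version of this check can be sketched as a pass over exported log entries. The code below is an illustration under stated assumptions, not Olwen's alerting logic: summarizeBotStatuses and botsNeedingAttention are hypothetical names, the entry shape ({ userAgent, status }) is assumed, and any status outside the 200 to 399 range is treated as a possible block, which is a deliberately coarse rule.

```javascript
const AI_BOTS = ["OAI-SearchBot", "GPTBot", "Claude-SearchBot", "ClaudeBot", "PerplexityBot"];

// Tally served vs. possibly blocked responses per AI bot.
function summarizeBotStatuses(entries) {
  const summary = {};
  for (const { userAgent, status } of entries) {
    const ua = userAgent.toLowerCase();
    const bot = AI_BOTS.find((token) => ua.includes(token.toLowerCase()));
    if (!bot) continue;
    summary[bot] = summary[bot] || { served: 0, blocked: 0 };
    // Assumption: 2xx and 3xx count as served; 4xx and 5xx (for example a
    // 403 from a WAF rule or a 429 from rate limiting) count as blocked.
    if (status >= 200 && status < 400) summary[bot].served += 1;
    else summary[bot].blocked += 1;
  }
  return summary;
}

// List the bots that received at least one blocked response.
function botsNeedingAttention(summary) {
  return Object.keys(summary).filter((bot) => summary[bot].blocked > 0);
}

// Illustrative entries (assumed shape and assumed User-Agent strings).
const logEntries = [
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", status: 200 },
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", status: 403 },
  { userAgent: "Mozilla/5.0 (compatible; PerplexityBot/1.0)", status: 200 },
  { userAgent: "Mozilla/5.0 (Windows NT 10.0) Chrome/126.0", status: 403 },
];

const statusSummary = summarizeBotStatuses(logEntries);
console.log(statusSummary);
console.log(botsNeedingAttention(statusSummary)); // [ 'OAI-SearchBot' ]
```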