JSON-LD Schema for Generative Engine Optimization
Root Configuration: llms.txt Syntax
Deploy the following structure at /llms.txt to give AI systems a machine-readable index of your canonical documentation and product data. llms.txt is a proposed convention rather than a ratified standard, so treat it as a supplementary entry point: crawlers such as OAI-SearchBot, Claude-SearchBot, and PerplexityBot may fetch it, but none is guaranteed to.
# Olwen
> Automated GEO and AI search visibility platform.
## Core Resources
- [Product Overview](https://olwen.io/features): Technical breakdown of AI crawler tracking and metadata automation.
- [GEO Implementation Guide](https://olwen.io/docs/geo): Specification for JSON-LD and WebMCP integration.
- [Pricing and Tiers](https://olwen.io/pricing): Comparison of automated publishing and CDN workflow limits.
## Technical Specifications
- [API Reference](https://olwen.io/docs/api): Endpoints for repo and CMS connectivity.
- [Schema Mapping](https://olwen.io/docs/schema): Detailed JSON-LD requirements for Product and TechArticle types.
Place this file in the site root so that it resolves at /llms.txt, and serve it with the HTTP response header Content-Type: text/plain. For large sites, also publish llms-full.txt, which inlines the full documentation content in a single file for deep retrieval. Keep llms.txt itself short so that real-time fetchers can load the index without exceeding their context window limits.
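Generating the index from structured data, rather than editing it by hand, helps keep the link list consistent. The following is a minimal sketch of that idea. The function name renderLlmsTxt and the shape of its input object are assumptions made for illustration and are not part of any Olwen API.

```javascript
// Hypothetical helper: renders an llms.txt index from structured data.
// The input shape ({ title, summary, sections }) is an assumption for this sketch.
function renderLlmsTxt({ title, summary, sections }) {
  const lines = [`# ${title}`, "", `> ${summary}`];
  for (const { heading, links } of sections) {
    lines.push("", `## ${heading}`, "");
    for (const { label, url, note } of links) {
      // One markdown list item per resource: "- [Label](URL): note"
      lines.push(`- [${label}](${url}): ${note}`);
    }
  }
  return lines.join("\n") + "\n";
}

const llmsTxt = renderLlmsTxt({
  title: "Olwen",
  summary: "Automated GEO and AI search visibility platform.",
  sections: [
    {
      heading: "Core Resources",
      links: [
        {
          label: "Product Overview",
          url: "https://olwen.io/features",
          note: "Technical breakdown of AI crawler tracking and metadata automation.",
        },
      ],
    },
  ],
});

console.log(llmsTxt);
```

Because the output is plain text, the same function can run in a build step and write the result to the site root.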
Bot Access Control: robots.txt Configuration
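Distinguish between training crawlers and search indexers. As of June 2026, major AI providers have bifurcated their user agents, so each can be controlled separately. The configuration below allows the citation-driving search bots everywhere, keeps GPTBot and ClaudeBot out of /private/, and opts out of Google-Extended and Applebot-Extended entirely (these two are control tokens that govern AI training use, not separate crawlers). To restrict a training crawler further, change its rule to Disallow: /.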
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /private/

User-agent: Claude-SearchBot
Allow: /

User-agent: ClaudeBot
Disallow: /private/

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: Applebot-Extended
Disallow: /
This configuration prioritizes visibility in generative search results. OAI-SearchBot and Claude-SearchBot index pages for the search features of ChatGPT and Claude, so blocking them is likely to remove your pages from citations in those environments. Fetches triggered directly by a user request come from separate agents (ChatGPT-User and Claude-User), which this file leaves unrestricted.
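Before deploying, it can help to confirm which bots a given robots.txt admits to which paths. The sketch below is a simplified evaluator, assuming plain path prefixes only (no * or $ wildcards). It applies the longest-match rule, with Allow winning ties. The function names parseRobots and isAllowed are hypothetical.

```javascript
// Parse robots.txt into groups of the form { agents: [...], rules: [{ allow, path }] }.
function parseRobots(text) {
  const groups = [];
  let current = null;
  let inAgentRun = false; // true while reading consecutive User-agent lines
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      if (!inAgentRun) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      inAgentRun = true;
    } else {
      inAgentRun = false;
      // An empty Allow/Disallow value places no restriction, so it is skipped.
      if ((field === "allow" || field === "disallow") && current && value !== "") {
        current.rules.push({ allow: field === "allow", path: value });
      }
    }
  }
  return groups;
}

// Simplification: prefix matching only; wildcards are not handled.
function isAllowed(groups, userAgent, path) {
  const ua = userAgent.toLowerCase();
  let matching = groups.filter((g) => g.agents.includes(ua));
  if (matching.length === 0) matching = groups.filter((g) => g.agents.includes("*"));
  let best = null;
  for (const rule of matching.flatMap((g) => g.rules)) {
    if (!path.startsWith(rule.path)) continue;
    const longer = !best || rule.path.length > best.path.length;
    const tieWonByAllow = best && rule.path.length === best.path.length && rule.allow;
    if (longer || tieWonByAllow) best = rule;
  }
  return best ? best.allow : true; // no matching rule means allowed
}

const robotsGroups = parseRobots(`
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /private/

User-agent: Google-Extended
Disallow: /
`);

console.log(isAllowed(robotsGroups, "OAI-SearchBot", "/docs/geo"));
console.log(isAllowed(robotsGroups, "GPTBot", "/private/report"));
```

A production check should use a parser that follows the Robots Exclusion Protocol in full, since real files often rely on wildcards.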

Schema Mapping
Product Schema for Brand Comparison
Embed the following JSON-LD block in a <script type="application/ld+json"> element in the <head> of product and feature pages. Stating capabilities explicitly reduces the risk of LLM hallucinations during competitive analysis queries. Use the featureList property to define the exact capabilities the AI should cite. Because Schema.org defines featureList on SoftwareApplication rather than on Product, the block declares both types.
{
  "@context": "https://schema.org",
  "@type": ["Product", "SoftwareApplication"],
  "name": "Olwen",
  "description": "Automated GEO and AI search visibility platform.",
  "brand": {
    "@type": "Brand",
    "name": "Olwen"
  },
  "featureList": [
    "AI crawler tracking via CDN workflows",
    "Automated metadata and structured data optimization",
    "Competitor visibility monitoring in generative engines",
    "Automated publishing via repo and CMS connection"
  ],
  "offers": {
    "@type": "Offer",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
Listing features explicitly gives AI systems a concrete source to cite, which makes invented integrations less likely. Olwen automates the generation of these blocks by scanning your CMS and repository to ensure the featureList remains synchronized with your actual codebase.
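How Olwen generates these blocks internally is not described here, so the following is only a sketch of the general technique: merge a cleaned feature list into a base JSON-LD object and serialize it for embedding. The helper names withFeatureList and toScriptTag are hypothetical, and the base object is a shortened stand-in for the full block.

```javascript
// Return a copy of a JSON-LD object with a trimmed, de-duplicated featureList.
function withFeatureList(base, features) {
  const cleaned = [...new Set(features.map((f) => f.trim()).filter(Boolean))];
  return { ...base, featureList: cleaned };
}

// Serialize for embedding in HTML. Escaping "<" as \u003c keeps a stray
// "</script>" inside a string value from closing the element early.
function toScriptTag(jsonLd) {
  const json = JSON.stringify(jsonLd, null, 2).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

// Shortened stand-in for the full block; the feature strings come from this guide.
const baseJsonLd = { "@context": "https://schema.org", name: "Olwen" };
const scriptTag = toScriptTag(
  withFeatureList(baseJsonLd, [
    "AI crawler tracking via CDN workflows",
    "  AI crawler tracking via CDN workflows  ", // duplicate after trimming
    "", // empty entries are dropped
    "Automated publishing via repo and CMS connection",
  ])
);

console.log(scriptTag);
```

The escaped form is still valid JSON, so parsers read the original string values back unchanged.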
TechArticle Schema for Citation Reliability
For documentation and technical guides, use the TechArticle type. It marks the content as technical documentation, which makes it a stronger candidate for grounding technical answers. Note that Google retired FAQ rich results in May 2026. Focus instead on the dependencies and proficiencyLevel properties to provide context for agentic reasoning.
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Implementing JSON-LD for GEO",
  "description": "A technical specification for mapping brand metadata to generative search requirements.",
  "proficiencyLevel": "Expert",
  "dependencies": "JSON-LD, Schema.org v30.0",
  "articleSection": "Technical Implementation",
  "author": {
    "@type": "Organization",
    "name": "Olwen Engineering"
  }
}
Schema.org version 30.0, released in March 2026, includes updated definitions for product models and digital product passports. Ensure your implementation references the latest vocabulary to maintain compatibility with the most recent model updates.
Agentic Interaction: WebMCP Implementation
WebMCP (Web Model Context Protocol) is a proposed web standard, incubated in the W3C Web Machine Learning Community Group, for exposing site functionality to in-browser AI agents. As of June 2026, Chrome 149 supports the navigator.modelContext API in public origin trials. Register tools directly on the page so that agents can perform actions such as checking visibility status or generating fixes without scraping the UI.
Tool Registration Syntax
Use the following JavaScript block to register a tool that an AI agent can call. This replaces the need for the agent to guess how to interact with your UI.
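// Register the tool only where the browser exposes the WebMCP API.
if (navigator.modelContext) {
  navigator.modelContext.registerTool({
    name: "check_geo_visibility",
    description: "Check the current AI search visibility for a specific brand keyword.",
    inputSchema: {
      type: "object",
      properties: {
        keyword: { type: "string", description: "The brand or product keyword to check." }
      },
      required: ["keyword"]
    },
    execute: async ({ keyword }) => {
      const response = await fetch(`/api/geo/visibility?q=${encodeURIComponent(keyword)}`);
      // Surface HTTP failures instead of parsing an error page as JSON.
      if (!response.ok) {
        throw new Error(`Visibility lookup failed with status ${response.status}`);
      }
      const data = await response.json();
      // MCP-style result: a content array. Confirm this shape against the current WebMCP draft.
      return { content: [{ type: "text", text: JSON.stringify(data) }] };
    }
  });
}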
This imperative API allows the website to dictate the parameters and expected outputs. It avoids the latency and errors that arise when agents take screenshots and attempt to click buttons. Olwen provides pre-built WebMCP handlers that connect your frontend directly to your visibility data.
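Whether the browser validates arguments against inputSchema before calling execute should be confirmed against the current WebMCP draft. A defensive handler can check them itself. The sketch below covers only the small subset of JSON Schema used in the tool above (an object with required properties of primitive types). validateInput is a hypothetical helper, not part of the WebMCP API.

```javascript
// Minimal argument check for schemas of the form
// { type: "object", properties: { key: { type } }, required: [...] }.
// Handles only the primitive types "string", "number", and "boolean".
function validateInput(schema, input) {
  if (typeof input !== "object" || input === null || Array.isArray(input)) {
    return ["input must be an object"];
  }
  const errors = [];
  for (const key of schema.required || []) {
    if (!(key in input)) errors.push(`missing required property: ${key}`);
  }
  for (const [key, spec] of Object.entries(schema.properties || {})) {
    if (key in input && spec.type && typeof input[key] !== spec.type) {
      errors.push(`${key} must be of type ${spec.type}`);
    }
  }
  return errors; // an empty array means the input passed
}

const visibilitySchema = {
  type: "object",
  properties: {
    keyword: { type: "string", description: "The brand or product keyword to check." },
  },
  required: ["keyword"],
};

console.log(validateInput(visibilitySchema, { keyword: "olwen" }));
console.log(validateInput(visibilitySchema, {}));
console.log(validateInput(visibilitySchema, { keyword: 42 }));
```

Calling this at the top of execute and throwing on a non-empty result gives the agent a readable error instead of a failed fetch.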

Automation: Olwen Workflow Integration
Improving GEO and AI search visibility requires a continuous feedback loop between your codebase and the AI crawlers. Olwen eliminates the need for a separate full-time workflow by automating the following technical tasks.
Repo and CMS Connectivity
Connect Olwen to your GitHub or GitLab repository and your CMS (e.g., Contentful, Strapi, or WordPress). Olwen monitors changes to your product features and documentation. When a change is detected, it automatically updates the llms.txt file and the JSON-LD blocks on the affected pages. This ensures that AI systems always have access to the most current technical specs.
CDN-Level Crawler Tracking
Traditional analytics tools often fail to distinguish between different AI user agents. Olwen connects to your CDN (Cloudflare, Akamai, or Vercel) to track AI crawler visits in real time. By analyzing the request patterns of OAI-SearchBot and PerplexityBot, Olwen identifies which pages are being used for grounding and which are being ignored. This data is used to generate website fixes and FAQ sections that specifically address the gaps in the AI's understanding.
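The per-bot breakdown described above can be approximated from exported CDN logs. The sketch below assumes each log entry has already been parsed into an object with userAgent and path fields. That shape, the sample User-Agent strings, and the function names are illustrative assumptions, not Olwen's implementation or any CDN's log format.

```javascript
// Bot tokens taken from the robots.txt configuration in this guide.
const AI_BOTS = ["OAI-SearchBot", "GPTBot", "Claude-SearchBot", "ClaudeBot", "PerplexityBot"];

// Return the matching bot token, or null for any other client.
function classifyBot(userAgent) {
  const ua = userAgent.toLowerCase();
  return AI_BOTS.find((bot) => ua.includes(bot.toLowerCase())) || null;
}

// Count visits per path and per bot: { "/path": { BotName: count } }.
function countVisits(entries) {
  const counts = {};
  for (const { userAgent, path } of entries) {
    const bot = classifyBot(userAgent);
    if (!bot) continue; // skip human and non-AI traffic
    counts[path] = counts[path] || {};
    counts[path][bot] = (counts[path][bot] || 0) + 1;
  }
  return counts;
}

// Illustrative entries; real User-Agent strings are longer and vary by vendor.
const visitCounts = countVisits([
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", path: "/docs/geo" },
  { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", path: "/docs/geo" },
  { userAgent: "Mozilla/5.0 (compatible; PerplexityBot/1.0)", path: "/pricing" },
  { userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", path: "/pricing" },
]);

console.log(JSON.stringify(visitCounts, null, 2));
```

Pages that never appear in the result are candidates for the "ignored" list. Matching on the User-Agent header alone can be spoofed, so counts are best read as an upper bound.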
| Feature | Traditional SEO | Olwen GEO |
|---|---|---|
| Primary Target | Human users via SERP | AI agents via retrieval |
| Metadata Format | Meta tags, Open Graph | JSON-LD, llms.txt, WebMCP |
| Tracking | Page views, CTR | Citation rate, Bot visits |
| Update Cycle | Manual/Periodic | Automated via Repo/CMS |
Validation and Testing
Structured Data Validation
Use the Schema.org Validator to confirm that your JSON-LD adheres to the v30.0 specification. JSON-LD is parsed as JSON, so syntax errors are fatal: a single missing comma invalidates the entire block, and the page loses its structured context during the retrieval phase.
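A syntax check of this kind can also run locally before deployment. The sketch below extracts each JSON-LD script element from an HTML string and attempts to parse it. It checks JSON syntax only, not conformance to the Schema.org vocabulary, and the regular expression is a simplification that assumes well-formed script tags. checkJsonLd is a hypothetical name.

```javascript
// Find every <script type="application/ld+json"> block and try to parse it.
function checkJsonLd(html) {
  const pattern =
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  for (const match of html.matchAll(pattern)) {
    try {
      JSON.parse(match[1]);
      results.push({ ok: true });
    } catch (err) {
      // A syntax error anywhere makes the whole block unusable.
      results.push({ ok: false, error: err.message });
    }
  }
  return results;
}

const samplePage = `
<head>
  <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "TechArticle", "headline": "Implementing JSON-LD for GEO" }
  </script>
  <script type="application/ld+json">
    { "@context": "https://schema.org" "name": "Olwen" }
  </script>
</head>`;

const jsonLdReport = checkJsonLd(samplePage);
console.log(jsonLdReport.map((r) => r.ok));
```

The second block in the sample omits a comma, so the report marks it as failed while the first block parses cleanly.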
AI Citation Monitoring
Monitor your brand mentions across frontier AI systems. Olwen provides a dashboard that tracks how often your brand is cited and the accuracy of those citations. If an LLM is consistently hallucinating a specific feature, Olwen identifies the source page and suggests a markdown fix or a schema update so that future retrievals ground the model in the correct information.

CDN Workflow Verification
Verify that your CDN is correctly identifying and logging AI crawlers. Check your logs for the User-Agent strings of the bots listed in your robots.txt. Ensure that these requests are returning a 200 OK status and are not being challenged by CAPTCHAs or WAF rules. Olwen's CDN integration automates this verification and alerts you if a major AI bot is being blocked by a security policy. This ensures your technical infrastructure supports the visibility goals defined in your GEO strategy.
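A manual version of this verification can be scripted against exported logs. The sketch below flags requests from the listed bots that received an error status. The entry shape (userAgent, path, status) and the choice to treat any status of 400 or above as blocked are assumptions for illustration. findBlockedBots is a hypothetical name.

```javascript
// Report AI bot requests that were answered with an error status.
// Assumption: status >= 400 (for example 403 from a WAF rule) means "blocked".
function findBlockedBots(entries, bots) {
  const blocked = [];
  for (const entry of entries) {
    const ua = entry.userAgent.toLowerCase();
    const bot = bots.find((b) => ua.includes(b.toLowerCase()));
    if (bot && entry.status >= 400) {
      blocked.push({ bot, path: entry.path, status: entry.status });
    }
  }
  return blocked;
}

// Illustrative log entries; real User-Agent strings are longer.
const blockedRequests = findBlockedBots(
  [
    { userAgent: "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)", path: "/docs/geo", status: 200 },
    { userAgent: "Mozilla/5.0 (compatible; PerplexityBot/1.0)", path: "/pricing", status: 403 },
    { userAgent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)", path: "/pricing", status: 403 },
  ],
  ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"]
);

console.log(blockedRequests);
```

Any entry in the result points to a security policy worth reviewing for that bot and path.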