Measuring AI Share of Voice: Citation Density Metrics

/llms.txt

# Olwen
> Marketing technology for GEO and AI visibility.

## Core Metrics
- [Citation Density](/docs/metrics/citation-density): Brand links per 1,000 tokens.
- [Attribution Accuracy](/docs/metrics/attribution-accuracy): Feature-to-brand mapping precision.
- [Share of Voice](/docs/metrics/sov): Percentage of category-relevant responses that mention the brand.

## Technical Specs
- [JSON-LD Patterns](/docs/schema): Structured data for RAG systems.
- [WebMCP Tools](/docs/webmcp): Browser-level agent actuation.

Root Configuration: robots.txt and llms.txt

Deploy the following robots.txt configuration to distinguish training crawlers from real-time search indexers. This separation keeps your pages eligible for citation in ChatGPT Search and Perplexity while opting out of foundation-model training if desired.

# Allow search and retrieval for citations
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Optional: Disallow foundational training
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
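
Before deploying, the rules above can be sanity-checked locally. The following is a minimal sketch using Python's standard `urllib.robotparser`; the helper name `crawler_access` is hypothetical, and the result reflects how Python's parser reads the file, which may differ in edge cases from how each crawler interprets it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this section, embedded for an offline check.
ROBOTS_TXT = """\
# Allow search and retrieval for citations
User-agent: OAI-SearchBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

# Optional: Disallow foundational training
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
"""


def crawler_access(robots_txt: str, agents: list[str], url: str = "/") -> dict[str, bool]:
    """Return {user_agent: allowed} for the given robots.txt text (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    agents = ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot",
              "GPTBot", "ClaudeBot", "Google-Extended"]
    for agent, allowed in crawler_access(ROBOTS_TXT, agents).items():
        print(f"{agent:18} {'allowed' if allowed else 'blocked'}")
```

Running the same check against the deployed file (by reading it from your own domain) is a reasonable pre-release step, since a typo in a user-agent token fails silently.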

Place the llms.txt file at the root directory. llms.txt is a proposed convention rather than a ratified standard, so support varies by agent. Where it is supported, it gives Model Context Protocol (MCP) servers and agentic browsers, such as those running on Chrome 150+, a single entry point to your documentation. Use H2 headings to categorize documentation and link directly to markdown-formatted subpages. This reduces the tokens an agent spends mapping your site architecture.
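
To illustrate why this layout is cheap for an agent to map, here is a minimal sketch of a parser that turns an llms.txt file into a name, a summary, and link lists grouped by H2 heading. The function name `parse_llms_txt` and the returned dictionary shape are assumptions for illustration, not part of any llms.txt tooling.

```python
import re

# Matches "- [Title](/path): optional note" list items.
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Parse llms.txt text into {"name", "summary", "sections"} (hypothetical shape)."""
    site = {"name": None, "summary": None, "sections": {}}
    current = None  # H2 section currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            site["sections"][current] = []
        elif line.startswith("# ") and site["name"] is None:
            site["name"] = line[2:].strip()
        elif line.startswith(">") and site["summary"] is None:
            site["summary"] = line[1:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                site["sections"][current].append(match.groupdict())
    return site


SAMPLE = """\
# Olwen
> Marketing technology for GEO and AI visibility.

## Core Metrics
- [Citation Density](/docs/metrics/citation-density): Brand links per 1,000 tokens.

## Technical Specs
- [JSON-LD Patterns](/docs/schema): Structured data for RAG systems.
"""

if __name__ == "__main__":
    parsed = parse_llms_txt(SAMPLE)
    for section, links in parsed["sections"].items():
        print(section, [link["url"] for link in links])
```

A single pass over a few hundred bytes yields the full documentation map, which is the token saving the H2-plus-links structure is meant to provide.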

Metric Definitions

Traditional SEO metrics like click-through rate (CTR) and keyword rank are insufficient for generative environments. Use the following technical benchmarks to quantify brand visibility within Large Language Model (LLM) outputs.

Citation Density (CD)

Citation Density measures the frequency of brand-specific links relative to the total length of the generated response. High density suggests that the model treats your brand as a primary source for the query.

Formula: CD = (Total Brand Citations / Total Tokens in Response) * 1000

Benchmark values for 2026:

High Visibility: > 5.0 CD
Moderate Visibility: 2.0 to 5.0 CD
Low Visibility: < 2.0 CD
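
The formula and tiers above can be expressed as a small helper. This is a minimal sketch: the function names are hypothetical, and the token count is assumed to come from whichever tokenizer you use consistently across responses.

```python
def citation_density(brand_citations: int, response_tokens: int) -> float:
    """CD = brand citations per 1,000 response tokens."""
    if response_tokens <= 0:
        raise ValueError("response_tokens must be positive")
    # Multiply before dividing to keep round inputs exact in floating point.
    return brand_citations * 1000 / response_tokens


def visibility_tier(cd: float) -> str:
    """Map a CD value onto the benchmark tiers listed above."""
    if cd > 5.0:
        return "High"
    if cd >= 2.0:
        return "Moderate"
    return "Low"


if __name__ == "__main__":
    cd = citation_density(brand_citations=3, response_tokens=500)
    print(cd, visibility_tier(cd))
```

For example, 3 brand citations in a 500-token response gives a CD of 6.0, which falls in the High Visibility tier.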

Attribution Accuracy (AA)

Attribution Accuracy tracks the precision of feature-to-brand mapping in Retrieval-Augmented Generation (RAG) systems. It identifies how often a model correctly assigns a specific technical capability or product feature to your brand rather than a competitor.

Formula: AA = (Verified Correct Attributions / Total Brand Attributions) * 100

Monitor this metric to detect "hallucination drift", in which models misattribute your proprietary features to legacy competitors. Olwen automates this by running recursive prompts against frontier models like GPT-5.5 and Claude 4.6 Sonnet to verify claim consistency.
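
As a rough sketch of the AA calculation, assuming each brand attribution in a response has already been labeled correct or incorrect by a reviewer or a verification prompt (the labeling step is not shown, and the function name is hypothetical):

```python
from typing import Optional


def attribution_accuracy(verified_correct: int, total_attributions: int) -> Optional[float]:
    """AA as a percentage, or None when the brand was never attributed anything."""
    if verified_correct < 0 or total_attributions < 0:
        raise ValueError("counts must be non-negative")
    if verified_correct > total_attributions:
        raise ValueError("verified_correct cannot exceed total_attributions")
    if total_attributions == 0:
        return None  # undefined: no attributions to score
    return verified_correct * 100 / total_attributions


if __name__ == "__main__":
    print(attribution_accuracy(45, 50))
```

Returning `None` for zero attributions keeps "no data" distinct from a true 0% score, which matters when tracking drift over time.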

Position-Weighted Share of Voice (PW-SoV)

Generative responses often list multiple solutions, and the first-mentioned entity typically captures the most reader attention. PW-SoV applies a decay function to brand mentions based on their ordinal position in the response.

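Formula: PW-SoV = (Σ_i 1 / Rank_i) / N

Where Rank_i is the ordinal position of your brand's mention in response i (1 for first, 2 for second, 3 for third), the sum runs over the responses that mention the brand, and N is the total number of responses analyzed. Responses that omit the brand contribute 0 to the sum. A score of 1.0 indicates your brand is the first recommendation in every response.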
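
A minimal sketch of the PW-SoV calculation follows. It assumes one rank per analyzed response, with `None` marking responses that do not mention the brand (treated as contributing 0); the function name is hypothetical.

```python
from typing import Optional, Sequence


def pw_sov(ranks: Sequence[Optional[int]]) -> float:
    """Position-weighted share of voice over a set of responses.

    ranks[i] is the brand's ordinal position in response i, or None if absent.
    """
    if not ranks:
        raise ValueError("at least one response is required")
    total = 0.0
    for rank in ranks:
        if rank is None:
            continue  # brand not mentioned: contributes 0
        if rank < 1:
            raise ValueError("ranks start at 1")
        total += 1.0 / rank  # reciprocal-rank decay
    return total / len(ranks)


if __name__ == "__main__":
    # Four responses: first, second, absent, first.
    print(pw_sov([1, 2, None, 1]))
```

With ranks of 1, 2, absent, and 1 across four responses, the score is (1 + 0.5 + 0 + 1) / 4 = 0.625.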

Technical Implementation: JSON-LD for RAG

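Structured data gives retrieval pipelines explicit entity relationships that are harder to recover from unstructured HTML. Implement the following JSON-LD patterns to improve the retrieval probability of your brand entities. Use the knowsAbout property on your Organization to establish semantic relationships between your brand and specific industry keywords, and the mentions property on article-type (CreativeWork) pages to reference related entities.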

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Olwen",
  "url": "https://www.olwen.io",
  "logo": "https://www.olwen.io/logo.png",
  "knowsAbout": [
    "Generative Engine Optimization",
    "AI Share of Voice",
    "RAG Analytics"
  ],
  "description": "Olwen provides automated GEO and AI visibility tracking for marketing technology teams.",
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+1-555-0123",
    "contactType": "customer service"
  }
}
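
If the markup is generated from a CMS rather than written by hand, a sketch like the following can build the Organization object and wrap it in a script tag. The function names are hypothetical, and only the fields shown in the example above are covered.

```python
import json


def organization_jsonld(name: str, url: str, knows_about: list[str], description: str) -> dict:
    """Build a schema.org Organization object with knowsAbout topics."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "knowsAbout": list(knows_about),
        "description": description,
    }


def to_script_tag(data: dict) -> str:
    """Serialize JSON-LD into a script element ready to embed in a page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    org = organization_jsonld(
        "Olwen",
        "https://www.olwen.io",
        ["Generative Engine Optimization", "AI Share of Voice", "RAG Analytics"],
        "Olwen provides automated GEO and AI visibility tracking for marketing technology teams.",
    )
    print(to_script_tag(org))
```

Generating the block from one source of truth avoids the drift that tends to appear when the same entity data is hand-edited across many templates.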

For product pages, include the isRelatedTo property to link your product to a related offering in the broader category. schema.org expects a Product or Service as the value of isRelatedTo, so the category is modeled here as a Service. This can increase the likelihood of inclusion in "Top 10" or "Comparison" style generative responses.

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Olwen GEO Dashboard",
  "brand": {
    "@type": "Brand",
    "name": "Olwen"
  },
  "isRelatedTo": {
    "@type": "Service",
    "name": "Marketing Technology"
  },
  "offers": {
    "@type": "Offer",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}

WebMCP: Exposing Tools to Agents

WebMCP (Web Model Context Protocol) is an emerging browser proposal that allows your website to expose structured tools directly to in-browser AI agents. This removes the need for agents to parse the DOM or take screenshots. Register tools using the navigator.modelContext API to enable agents to perform actions like "Check Pricing" or "Generate Audit Report" autonomously.

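// WebMCP is an evolving proposal: feature-detect before registering.
if ('modelContext' in navigator) {
  navigator.modelContext.registerTool({
    name: "get_geo_score",
    description: "Calculate the GEO visibility score for a specific URL.",
    // JSON Schema for the tool arguments (named inputSchema in the WebMCP
    // explainer; verify against the draft your target browser implements).
    inputSchema: {
      type: "object",
      properties: {
        url: { type: "string", format: "uri" }
      },
      required: ["url"]
    },
    execute: async ({ url }) => {
      const response = await fetch(`/api/geo-score?url=${encodeURIComponent(url)}`);
      // Surface HTTP failures to the agent instead of parsing an error page.
      if (!response.ok) {
        throw new Error(`geo-score request failed: ${response.status}`);
      }
      return await response.json();
    }
  });
}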

Exposing these tools makes your site more useful to agents, which may translate into higher citation frequency in agent-led research sessions. Structured interaction is generally easier for an agent to use reliably than complex visual navigation.

Tracking Protocol via Olwen

Connect your CDN (Cloudflare, Akamai, or Vercel Edge) to the Olwen dashboard to monitor AI crawler visits in real time. Traditional analytics packages often misclassify AI bots as generic "Direct" traffic or "Other" bots. Olwen uses signature-based detection to identify specific agents.
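
To give a sense of what signature matching involves, here is a simplified sketch that classifies user-agent strings from CDN logs by crawler token. It is not Olwen's detection logic. The token table and function names are assumptions, and the sample strings are illustrative rather than real log lines. User-agent strings can be spoofed, so a production check would typically also verify source IP ranges published by each vendor.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Crawler token -> purpose, taken from the robots.txt rules in this guide.
# Google-Extended is omitted: it is a robots.txt control token and does not
# appear as a request user agent.
AI_CRAWLER_TOKENS = {
    "OAI-SearchBot": "search",
    "Claude-SearchBot": "search",
    "PerplexityBot": "search",
    "GPTBot": "training",
    "ClaudeBot": "training",
}


def classify_user_agent(user_agent: str) -> Optional[Tuple[str, str]]:
    """Return (token, purpose) for a known AI crawler, else None."""
    ua = user_agent.lower()
    for token, purpose in AI_CRAWLER_TOKENS.items():
        if token.lower() in ua:
            return token, purpose
    return None


def count_crawler_hits(user_agents: Iterable[str]) -> Counter:
    """Count requests per AI crawler token, ignoring unrecognized agents."""
    hits: Counter = Counter()
    for ua in user_agents:
        match = classify_user_agent(ua)
        if match is not None:
            hits[match[0]] += 1
    return hits


if __name__ == "__main__":
    sample = [
        "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)",  # illustrative
        "Mozilla/5.0 (compatible; GPTBot/1.0)",         # illustrative
        "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0",   # ordinary browser
    ]
    print(count_crawler_hits(sample))
```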

Monitoring Crawler Frequency

Track the request rate of OAI-SearchBot and PerplexityBot. A sudden drop in crawl frequency can precede a decline in AI Share of Voice. Use the Olwen CDN workflow to trigger an automatic re-index request when content is updated. Set up the workflow as follows:

Connect Repository: Link your GitHub or GitLab repo to Olwen.
Map CMS: Connect your headless CMS (Contentful, Sanity, or Strapi).
Automate Metadata: Olwen generates and pushes JSON-LD and FAQ updates based on the latest frontier model requirements.
Verify Deployment: Use the Olwen validation tool to ensure the llms.txt and robots.txt files are correctly formatted and accessible.
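
For the crawl-frequency monitoring described above, a sudden drop can be flagged with a simple trailing-average comparison. This is a minimal sketch; the 7-day window and 50% threshold are assumed defaults, not values from Olwen, and the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def crawl_drop_alert(daily_hits: Sequence[int], window: int = 7, drop_ratio: float = 0.5) -> bool:
    """Return True when the latest day falls below drop_ratio * trailing mean.

    daily_hits is ordered oldest to newest; the last entry is the day under test.
    """
    if len(daily_hits) < window + 1:
        return False  # not enough history to form a baseline
    baseline = mean(daily_hits[-(window + 1):-1])
    if baseline == 0:
        return False  # crawler was already absent; nothing to compare against
    return daily_hits[-1] < drop_ratio * baseline


if __name__ == "__main__":
    perplexity_hits = [40, 42, 38, 41, 39, 40, 43, 12]
    print(crawl_drop_alert(perplexity_hits))
```

Tune the window and ratio per crawler, since bots with bursty schedules will trigger false alerts under a tight threshold.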

Sentiment Polarity in RAG (SPR)

Beyond simple mentions, track the sentiment polarity of the context surrounding your brand. Models like Gemini 3.1 Pro and GPT-5.2 may weigh the "trustworthiness" of a source partly by the sentiment of the text surrounding it in training data and retrieved chunks.

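Formula: SPR = (Positive Context Tokens - Negative Context Tokens) / Total Context Tokens

SPR ranges from -1.0 (all context tokens negative) to 1.0 (all context tokens positive); neutral tokens count only toward the denominator.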

Olwen provides a sentiment heatmap that identifies which sections of your site are generating negative or neutral citations. Use this data to rewrite FAQ sections and product descriptions to use more authoritative, objective language that aligns with LLM preference for neutral, high-information-density content.
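
A sketch of the SPR calculation, assuming the context tokens around each brand mention have already been labeled positive, negative, or neutral by a sentiment classifier of your choice (the classifier is out of scope here, and the function name is hypothetical):

```python
def sentiment_polarity(positive_tokens: int, negative_tokens: int, total_tokens: int) -> float:
    """SPR = (positive - negative) / total context tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if positive_tokens < 0 or negative_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if positive_tokens + negative_tokens > total_tokens:
        raise ValueError("labeled tokens cannot exceed total_tokens")
    return (positive_tokens - negative_tokens) / total_tokens


if __name__ == "__main__":
    # 120 positive, 30 negative, 150 neutral tokens around brand mentions.
    print(sentiment_polarity(120, 30, 300))
```

In this example, 120 positive and 30 negative tokens out of 300 give an SPR of 0.3.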

Context Window Optimization

Large context windows in models like Gemini 3.1 Pro (supporting up to 2 million tokens) do not eliminate the need for concise data. Models still exhibit the "lost in the middle" effect, in which information placed in the center of a long prompt is recalled less reliably than information near the start or end.

Optimize your markdown files for retrieval by placing the most critical brand claims and technical specs at the very top and bottom of the document. This structure takes advantage of the primacy and recency biases observed in long-context language models.
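
One way to audit an existing markdown export is to check where each key claim sits in the document. The sketch below uses character offsets as a rough proxy for token position and flags claims that fall outside the top and bottom 20% of the file. The 20% edge, the exact-substring matching, and the function names are all assumptions for illustration.

```python
from typing import Optional


def claim_positions(document: str, claims: list[str]) -> dict[str, Optional[float]]:
    """Relative position (0.0 = start, 1.0 = end) of each claim's midpoint.

    Claims that do not appear verbatim map to None.
    """
    positions: dict[str, Optional[float]] = {}
    for claim in claims:
        index = document.find(claim)
        if index < 0 or not document:
            positions[claim] = None
        else:
            positions[claim] = (index + len(claim) / 2) / len(document)
    return positions


def buried_claims(document: str, claims: list[str], edge: float = 0.2) -> list[str]:
    """Claims whose midpoint lies in the middle band of the document."""
    positions = claim_positions(document, claims)
    return [c for c, p in positions.items() if p is not None and edge < p < 1 - edge]


if __name__ == "__main__":
    doc = (
        "Olwen tracks citations. "
        + "filler " * 50
        + "Olwen exposes WebMCP tools. "
        + "filler " * 50
        + "Olwen supports llms.txt."
    )
    claims = ["Olwen tracks citations.", "Olwen exposes WebMCP tools.", "Olwen supports llms.txt."]
    print(buried_claims(doc, claims))
```

Any claim the audit flags is a candidate for moving into the opening summary or a closing recap.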

Root-Level Markdown Structure

For every page linked in your llms.txt, provide a corresponding .md version. This version should be stripped of all navigation, site headers, footers, and tracking scripts. Use the following hierarchy:

H1 Title: The primary entity name.
Summary Block: A 2 to 3 sentence description of the page content.
Key Specs: A bulleted list of technical data points.
Main Content: Semantic, well-structured markdown.
Related Entities: Links to other relevant pages on your site.

# Olwen GEO Tracking

> Automated monitoring of brand citations across frontier AI models.

- **Accuracy**: 99.2% attribution precision.
- **Latency**: Real-time CDN integration.
- **Compatibility**: GPT-5.5, Claude 4.6, Gemini 3.1.

## Implementation
To track Citation Density, connect your CDN logs to the Olwen API. The system will automatically categorize incoming requests from OAI-SearchBot and PerplexityBot.
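
The hierarchy above can also be rendered from structured fields, which helps keep every exported page consistent. This is a minimal sketch with a hypothetical function name; the "Related" heading used for the Related Entities block is an assumption, since the example above does not show one.

```python
def render_page_markdown(
    title: str,
    summary: str,
    key_specs: dict[str, str],
    main_content: str,
    related: dict[str, str],
) -> str:
    """Render H1 title, summary block, key specs, main content, related links."""
    lines = [f"# {title}", "", f"> {summary}", ""]
    lines += [f"- **{label}**: {value}" for label, value in key_specs.items()]
    lines += ["", main_content.strip(), ""]
    if related:
        lines.append("## Related")  # assumed heading for the Related Entities block
        lines += [f"- [{name}]({url})" for name, url in related.items()]
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    print(render_page_markdown(
        title="Olwen GEO Tracking",
        summary="Automated monitoring of brand citations across frontier AI models.",
        key_specs={"Latency": "Real-time CDN integration."},
        main_content="## Implementation\nTo track Citation Density, connect your CDN logs to the Olwen API.",
        related={"Citation Density": "/docs/metrics/citation-density"},
    ))
```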

Validation and Benchmarking

Validate your implementation using the WebMCP Inspector. This tool simulates an agentic browser session to verify that your registered tools and structured data are correctly parsed.

Open Chrome DevTools: Navigate to the 'AI Agent' or 'Model Context' tab.
Run Discovery: Trigger a tool discovery scan to confirm that tools registered through navigator.modelContext.registerTool are discoverable.
Test Retrieval: Use a local LLM to query your llms.txt and verify that the returned context is relevant and concise.
Benchmark SoV: Run a baseline set of 50 category-relevant prompts. Record the initial Citation Density and PW-SoV.
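
The baseline from the final step can be summarized with a short script. This sketch assumes one record per prompt with the brand's citation count, the response token count, and the brand's rank (`None` when absent). Averaging Citation Density per response is an assumed aggregation, since the metric is defined above for a single response only. The record type and function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PromptResult:
    brand_citations: int        # brand links in the response
    response_tokens: int        # total tokens in the response
    brand_rank: Optional[int]   # ordinal position of the brand, None if absent


def baseline(results: Sequence[PromptResult]) -> dict:
    """Mean Citation Density and PW-SoV over a set of benchmark prompts."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to summarize")
    mean_cd = sum(r.brand_citations * 1000 / r.response_tokens for r in results) / n
    pw_sov = sum(1 / r.brand_rank for r in results if r.brand_rank is not None) / n
    return {"prompts": n, "mean_citation_density": mean_cd, "pw_sov": pw_sov}


if __name__ == "__main__":
    runs = [
        PromptResult(2, 500, 1),
        PromptResult(0, 400, None),
        PromptResult(1, 250, 2),
        PromptResult(3, 1000, 1),
    ]
    print(baseline(runs))
```

Storing the raw per-prompt records alongside the summary makes later comparisons reproducible when the prompt set or tokenizer changes.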

Compare these metrics against your top three competitors. If a competitor has a higher Citation Density despite having less content, audit their llms.txt and JSON-LD for superior entity mapping. Use Olwen to generate website fixes that close the visibility gap by improving structured data coverage and reducing token noise in your markdown exports. Monitor the Olwen dashboard weekly to track the impact of these changes on your overall AI Share of Voice.