Traditional search traffic is projected to decline 25% by 2026, while AI-referred sessions grew 527% in early 2025. When someone asks ChatGPT, Perplexity, or Claude about a topic you’ve written about, your content needs to be in a format they can actually read. On most websites, it isn’t.

Jekyll-AEO fixes that in a single bundle install. It hooks into Jekyll’s build lifecycle and produces everything AI systems need to discover, parse, and cite your content — clean markdown copies, a site-wide llms.txt index, structured JSON-LD, crawler policies, and domain identity metadata. No external services, no API calls, no runtime dependencies.

Why This Matters Now

No AI crawler except Google renders JavaScript. Most AI systems see your raw HTML source — navigation menus, tracking scripts, layout divs, and framework boilerplate — not your rendered page. A typical web page is 85-95% markup and 5-15% actual content.

The math is brutal: AI systems have finite context windows, and every token wasted on HTML noise is a token that can’t be used to understand your content. Markdown reduces that overhead by 20-30%. A well-structured llms-full.txt file achieves 90%+ token reduction overall.

Sites with schema.org structured data get 2.5x more AI citations. FAQPage markup specifically gives a 3.2x higher chance of appearing in AI Overviews. And only ~10% of sites have adopted llms.txt — the early-mover window is open.


Features

Jekyll-AEO ships eight features. Three are on by default, one is available via a Liquid tag, and four are opt-in.

Generate .md Files

Every HTML page gets a companion .md file — your content stripped of Liquid tags, kramdown annotations, and layout noise. Just clean, structured markdown that LLMs can ingest directly.

/about/index.html  →  /about.md
/blog/my-post/     →  /blog/my-post.md
/products/widget/  →  /products/widget.md

Developer comments are stripped as well, and the content itself is preserved. Title, description, and last-modified date are prepended from front matter.
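The URL-to-markdown mapping shown above can be sketched in a few lines of Ruby. This is a minimal illustration, not the plugin's actual code: the helper name `markdown_path_for` is hypothetical, and the homepage mapping to `/index.md` is an assumption the page does not state.

```ruby
# Hypothetical helper: map a page's output URL to its markdown companion path.
def markdown_path_for(url)
  path = url.sub(%r{/index\.html\z}, "/") # "/about/index.html" -> "/about/"
  path = path.sub(/\.html\z/, "")         # "/pricing.html"     -> "/pricing"
  path = path.chomp("/")                  # "/about/"           -> "/about"
  # Assumption: the homepage ("/") maps to "/index.md".
  path.empty? ? "/index.md" : "#{path}.md"
end

["/about/index.html", "/blog/my-post/", "/products/widget/"].each do |url|
  puts "#{url} -> #{markdown_path_for(url)}"
end
```

Keeping the mapping in one pure function means the same rule can be reused wherever a markdown copy's path is needed.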

Every HTML page automatically gets a <link rel="alternate"> tag in the <head> pointing to its markdown copy. AI crawlers discover the machine-readable version of each page without needing to know your URL scheme. Works in auto-inject mode (default) or data mode for manual template placement.
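As a rough sketch of what auto-inject mode does conceptually, the snippet below builds the tag and places it before the closing `</head>`. The helper names are hypothetical, and the `type="text/markdown"` attribute is an assumption; the plugin's real output may differ.

```ruby
require "erb"

# Hypothetical helper. The type attribute is an assumption, not confirmed output.
def alternate_link_tag(markdown_url)
  href = ERB::Util.html_escape(markdown_url)
  %(<link rel="alternate" type="text/markdown" href="#{href}">)
end

# Insert the tag just before the first closing </head>.
# HTML without a </head> is returned unchanged.
def inject_alternate_link(html, markdown_url)
  tag = alternate_link_tag(markdown_url)
  html.sub(%r{</head>}i) { "  #{tag}\n</head>" }
end

page = "<html><head><title>About</title></head><body></body></html>"
puts inject_alternate_link(page, "/about.md")
```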

Generate llms.txt

Following the llms.txt specification by Jeremy Howard (Answer.AI), Jekyll-AEO generates:

  • /llms.txt — A structured index of your entire site, organized by collection, with titles, descriptions, and links to markdown copies
  • /llms-full.txt — Every page’s markdown content concatenated into a single file

A generated /llms.txt looks like this:

# My Website

> A website about building great products

## Pages

- [About](/about.md): Learn about our company and mission
- [Pricing](/pricing.md): Plans and pricing for all tiers

## Blog Posts

- [Launching v2.0](/blog/launching-v2.md): Our biggest release yet

Sections auto-generate from your Jekyll collections, or you can define custom sections. Adoption is accelerating — Mintlify has rolled it out across thousands of documentation sites including Anthropic and Cursor.
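To make the file layout concrete, here is a minimal Ruby sketch that renders the example above from plain data. The method name and the hash shape are hypothetical illustrations, not Jekyll-AEO's internals.

```ruby
# Hypothetical data shape: each section heading maps to a list of entries.
def render_llms_txt(title:, description:, sections:)
  lines = ["# #{title}", "", "> #{description}"]
  sections.each do |heading, entries|
    lines << "" << "## #{heading}" << ""
    entries.each do |entry|
      lines << "- [#{entry[:title]}](#{entry[:url]}): #{entry[:description]}"
    end
  end
  lines.join("\n") + "\n"
end

puts render_llms_txt(
  title: "My Website",
  description: "A website about building great products",
  sections: {
    "Pages" => [
      { title: "About", url: "/about.md", description: "Learn about our company and mission" }
    ],
    "Blog Posts" => [
      { title: "Launching v2.0", url: "/blog/launching-v2.md", description: "Our biggest release yet" }
    ]
  }
)
```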

Generate JSON-LD

Add {% aeo_json_ld %} to your layout and get schema.org structured data automatically:

| Schema | Trigger | Impact |
|---|---|---|
| BreadcrumbList | Every page except homepage | Helps AI understand site hierarchy |
| Organization | Homepage | Brand identity signal |
| FAQPage | faq: array in front matter | 3.2x higher AI Overview appearance |
| HowTo | howto: object in front matter | Step-by-step content extraction |
| Speakable | speakable: true in front matter | Voice assistant discovery |
| Article | Dated pages (skips when jekyll-seo-tag is installed) | Authorship and date signals |

Add FAQ or HowTo structured data to any page through simple front matter arrays — no manual JSON-LD required.
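For a sense of what that conversion involves, the sketch below turns a front matter `faq` array into a schema.org FAQPage object. The `question` and `answer` key names are assumptions (check the README for the real front matter shape), and the helper name is hypothetical.

```ruby
require "json"

# Assumption: each FAQ item in front matter has "question" and "answer" keys.
def faq_page_json_ld(faq_items)
  {
    "@context" => "https://schema.org",
    "@type" => "FAQPage",
    "mainEntity" => faq_items.map { |item|
      {
        "@type" => "Question",
        "name" => item["question"],
        "acceptedAnswer" => { "@type" => "Answer", "text" => item["answer"] }
      }
    }
  }
end

faq = [{ "question" => "Does it need an API key?", "answer" => "No. Everything runs at build time." }]
puts JSON.pretty_generate(faq_page_json_ld(faq))
```

The resulting JSON would normally be wrapped in a `<script type="application/ld+json">` element in the page head.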

Generate robots.txt

Control which AI bots can access your site. Jekyll-AEO generates a robots.txt that separates search bots (allowed) from training bots (blocked):

User-agent: OAI-SearchBot    # Allow search
Allow: /

User-agent: GPTBot            # Block training
Disallow: /

Llms-txt: https://yoursite.com/llms.txt

Automatically includes Sitemap: and Llms-txt: directives, covers all major AI companies, and steps aside if you already have a robots.txt in your source directory.
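The allow/block split can be sketched as a small generator. This is an illustration under stated assumptions rather than the plugin's implementation: the method and parameter names are hypothetical, and the `/sitemap.xml` path is assumed.

```ruby
# Hypothetical generator. Bot names are taken from the excerpt above.
def render_robots_txt(allow:, block:, site_url:)
  groups = allow.map { |bot| "User-agent: #{bot}\nAllow: /" } +
           block.map { |bot| "User-agent: #{bot}\nDisallow: /" }
  # Assumption: the sitemap lives at /sitemap.xml.
  directives = ["Sitemap: #{site_url}/sitemap.xml", "Llms-txt: #{site_url}/llms.txt"]
  (groups + [directives.join("\n")]).join("\n\n") + "\n"
end

puts render_robots_txt(
  allow: ["OAI-SearchBot"],
  block: ["GPTBot"],
  site_url: "https://yoursite.com"
)
```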

Generate Domain Profile

Publish a /.well-known/domain-profile.json file following the AI Domain Data specification. This gives AI assistants authoritative metadata about your site’s identity — reducing hallucination and improving how your brand appears in AI-generated answers. Auto-populated from your _config.yml site settings.

Generate URL Map

A structured markdown table of every page on your site — with URLs, layouts, redirect mappings, markdown copy paths, and skip reasons. Written to your source directory so it can be committed to version control as a content audit tool.

Validate Output

Verify your AEO output after every build:

bundle exec jekyll aeo:validate

Checks that llms.txt exists and is properly formatted, all referenced markdown files exist, and domain-profile.json (if present) has valid structure with all required fields.
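If you want a similar check in your own CI script, the idea can be approximated as below. This is a simplified sketch, not the `aeo:validate` command's actual logic: the function name is hypothetical and it only covers the llms.txt existence, title, and link-target checks.

```ruby
require "tmpdir"

# Returns a list of problems found in a built site directory (empty when clean).
def validate_llms_txt(site_dir)
  index = File.join(site_dir, "llms.txt")
  return ["llms.txt is missing"] unless File.file?(index)

  content = File.read(index)
  problems = []
  problems << "llms.txt does not start with an H1 title" unless content.start_with?("# ")
  # Collect link targets of the form [Title](/path.md) and check each file exists.
  content.scan(%r{\]\((/[^)\s]+\.md)\)}).flatten.each do |url|
    problems << "missing markdown file: #{url}" unless File.file?(File.join(site_dir, url))
  end
  problems
end

# Usage against a throwaway directory standing in for the built site.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "llms.txt"),
             "# My Website\n\n- [About](/about.md): About us\n- [Pricing](/pricing.md): Plans\n")
  File.write(File.join(dir, "about.md"), "# About\n")
  puts validate_llms_txt(dir)
end
```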


AEO GEO Deep Dive


How AI Platforms Find Your Content

Each AI platform has different content preferences — only 21-25% of cited domains overlap between platforms. Optimizing for ChatGPT alone misses 75-79% of the opportunity.

| Platform | Primary Signals | Top Source | Search Index |
|---|---|---|---|
| ChatGPT | Referring domains, domain traffic | Wikipedia (47.9% of top-10 citations) | Bing (87% match) |
| Perplexity | Credibility, recency, semantic relevance | Reddit (47% of responses) | Bing + own index |
| Claude | Brand authority, content quality | Mentions brands in 97.3% of responses | Brave Search (86.7% match) |
| Google AI Overviews | Schema markup, E-E-A-T | Most conservative (48.5% brand mention rate) | Google Search |

A cross-platform technical foundation — llms.txt, clean markdown, domain profiles, proper robots.txt — is the highest-leverage investment because it works across all platforms simultaneously.


AI Crawler Landscape

Understanding which bots are crawling your site — and why — is essential. Jekyll-AEO’s robots.txt generator gives you granular control.

| Company | Search/Retrieval Bot | Training Bot | Notes |
|---|---|---|---|
| OpenAI | OAI-SearchBot, ChatGPT-User | GPTBot | ChatGPT-User no longer respects robots.txt |
| Anthropic | Claude-SearchBot, Claude-User | ClaudeBot | Respects robots.txt + Crawl-delay |
| Perplexity | PerplexityBot, Perplexity-User | | Perplexity-User ignores robots.txt |
| Google | Googlebot | Google-Extended | Only platform that renders JavaScript |
| Microsoft | Bingbot | | Powers ChatGPT citations (87% match) |
| Apple | Applebot | Applebot-Extended | Apple Intelligence |
| Meta | | Meta-ExternalAgent | Training only |
| Amazon | | Amazonbot | Training / Alexa |

The strategy: Allow search/retrieval bots so your content appears in AI answers. Block training bots so it doesn’t end up in training datasets. You get the citation benefits without contributing to model training.


Content Optimization Tips

Jekyll-AEO makes your content technically accessible to AI. How often that content gets cited depends on the content itself, and the research points to a handful of signals.

What the data shows:

| Signal | Impact |
|---|---|
| Pages with 19+ statistics | 5.4 avg citations vs. 2.8 without |
| Pages with expert quotes | 4.1 avg citations vs. 2.4 without |
| FAQPage schema markup | 3.2x higher chance of appearing in AI Overviews |
| Schema.org structured data | 2.5x higher citation chance overall |
| Brand presence on 4+ channels | Significantly more AI citations |
| Brand authority | 0.334 correlation, the single strongest predictor |

Actionable principles:

  • Answer-first structure — Lead every section with a direct, concise answer. AI systems extract the first 50-150 words of a topically relevant section most frequently.
  • Fact density — Include concrete statistics, data points, and cited sources. This is the highest-impact content optimization.
  • Clear hierarchy — Use H2/H3 headings as semantic boundaries. Keep sections at 120-180 words — the optimal chunk size for RAG retrieval.
  • Tables and lists — AI systems excel at extracting structured tabular data and numbered steps.
  • Content freshness — Visibility drops 2-3 days after publication without updates. A 30-90 day refresh cadence is recommended. Jekyll-AEO adds last-modified dates automatically.
  • Neutral tone — Encyclopedic, authoritative prose is 30% more likely to appear in AI answers than opinion-heavy content.

The foundational academic paper “GEO: Generative Engine Optimization” (Princeton, Georgia Tech, Allen AI, IIT Delhi; ACM SIGKDD 2024) tested 9 optimization methods and found that citing sources, adding quotations, and adding statistics each improve visibility by 30-40%. Keyword stuffing, by contrast, is explicitly ineffective for GEO.


How It Works

Jekyll-AEO hooks into Jekyll’s standard build lifecycle. Add the gem, run jekyll build, and all outputs are generated alongside your existing site — no separate build step, no external services, no post-processing scripts.

Every feature shares the same skip logic: redirect pages, non-HTML outputs, and excluded paths are handled consistently. Per-page outputs (markdown copies, link tags) are generated during the build. Site-wide outputs (llms.txt, robots.txt, domain profile) are generated in a single pass after all pages are written.
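One way to picture shared skip logic is a single predicate that every feature consults. The sketch below is purely illustrative: the page hash shape, the reason symbols, and the `excluded_prefixes` parameter are all hypothetical, and the README remains the authority on how the real pipeline decides.

```ruby
# Hypothetical page shape: a hash with :url, :output_ext, and :redirect keys.
# Returns a symbol naming why the page is skipped, or nil when it is processed.
def skip_reason(page, excluded_prefixes: [])
  return :redirect if page[:redirect]
  return :non_html unless page[:output_ext] == ".html"
  return :excluded if excluded_prefixes.any? { |prefix| page[:url].start_with?(prefix) }

  nil
end

pages = [
  { url: "/about/", output_ext: ".html", redirect: false },
  { url: "/old-about/", output_ext: ".html", redirect: true },
  { url: "/feed.xml", output_ext: ".xml", redirect: false },
  { url: "/drafts/idea/", output_ext: ".html", redirect: false }
]

pages.each do |page|
  reason = skip_reason(page, excluded_prefixes: ["/drafts/"])
  puts "#{page[:url]}: #{reason || 'processed'}"
end
```

Returning a reason instead of a bare boolean lets the same function feed a report of why each page was skipped.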

For a technical deep dive into the build pipeline, see the README.


Proven Results

Companies investing in AEO and GEO are seeing measurable returns:

| Company | Result | Method |
|---|---|---|
| LS Building Products | 540% boost in AI Overview mentions | Content restructuring + schema |
| Go Fish Digital | AI traffic converts at 25x traditional search | GEO optimization |
| Ramp | Citation share 8.1% to 12.2% in one month | Monitoring + optimization |
| Runpod | 4x customer growth in 90 days | Content architecture redesign |
| NerdWallet | 35% revenue growth despite 20% traffic decrease | AI-first content strategy |

B2B SaaS companies report 6-27x higher conversion rates from AI-referred traffic vs. traditional search. The traffic numbers are smaller, but visitors who arrive through AI recommendations arrive with higher intent, better context, and more confidence — because an AI system already evaluated the options and recommended you specifically.


Get Started


Quick Start

Three steps, under five minutes:

1. Install

# Gemfile
gem "jekyll-aeo"

# Then, from your terminal
bundle install

2. Build

bundle exec jekyll build

Check your output directory. You’ll find .md companions for every page, plus llms.txt and llms-full.txt at the root.

3. Validate

bundle exec jekyll aeo:validate

That’s it. Your site is now AI-readable. Enable advanced features like robots.txt, domain profiles, and URL maps through _config.yml when you’re ready — see the README for all configuration options.

Designed to Coexist

Jekyll-AEO works alongside the plugins you already use. It complements jekyll-seo-tag (different layer — AI outputs vs. HTML meta tags), cooperates with jekyll-sitemap (priority ordering prevents robots.txt conflicts), and respects jekyll-redirect-from (redirect pages are automatically skipped).

Nothing to configure. Install them side by side and they cooperate automatically.

Ruby >= 3.0. Jekyll >= 4.0. MIT licensed. Built and maintained by ZAAI.