ZAAI Code Lab
Jekyll Plugin

Your Jekyll site,
ready for every AI search.

Make your Jekyll site visible to AI search engines, assistants, and LLMs. One gem, zero config, eight features.

Install the Jekyll-AEO gem and follow the setup instructions at https://zaai.com/jekyll-aeo/setup-prompt/

Traditional search traffic is projected to decline 25% by 2026. AI-referred sessions grew 527% in early 2025. When someone asks ChatGPT, Perplexity, or Claude about a topic you’ve written about, your content needs to be in a format they can actually read. On most websites, it isn’t.

Jekyll-AEO fixes that in a single bundle install. It hooks into Jekyll’s build lifecycle and produces everything AI systems need to discover, parse, and cite your content: clean markdown copies, a site-wide llms.txt index, structured JSON-LD, crawler policies, and domain identity metadata. No external services, no API calls, no runtime dependencies.

Why This Matters Now

No AI crawler except Google renders JavaScript. Most AI systems see your raw HTML source (navigation menus, tracking scripts, layout divs, and framework boilerplate), not your rendered page. A typical web page is 85-95% markup and 5-15% actual content.

The math is brutal: AI systems have finite context windows, and every token wasted on HTML noise is a token that can’t be used to understand your content. Markdown reduces that overhead by 20-30%. A well-structured llms-full.txt file achieves 90%+ token reduction overall.

Sites with schema.org structured data get 2.5x more AI citations. FAQPage markup specifically gives a 3.2x higher chance of appearing in AI Overviews. And only ~10% of sites have adopted llms.txt, so the early-mover window is open.


Features

Jekyll-AEO ships eight features. Three are on by default, one is available via a Liquid tag, and four are opt-in.

Generate .md Files

Every HTML page gets a companion .md file: your content stripped of Liquid tags, kramdown annotations, and layout noise. Just clean, structured markdown that LLMs can ingest directly.

/about/index.html  →  /about.md
/blog/my-post/     →  /blog/my-post.md
/products/widget/  →  /products/widget.md

Developer comments are stripped as well, and the content itself is preserved. Title, description, and last-modified date are prepended from front matter.
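The URL mapping shown above can be sketched as a small Ruby function. This is an illustrative helper, not the gem's actual code: the name `markdown_path` is hypothetical, and the handling of the site root and of bare `.html` URLs is an assumption, since the examples do not cover those cases.

```ruby
# Hypothetical helper illustrating the URL-to-markdown mapping shown above.
# It is not part of the Jekyll-AEO API.
def markdown_path(url)
  path = url.sub(%r{/index\.html\z}, "/") # "/about/index.html" -> "/about/"
  path = path.sub(/\.html\z/, "")         # assumption: "/contact.html" -> "/contact"
  path = path.chomp("/")                  # "/about/" -> "/about"
  # Assumption: the site root maps to "/index.md" (not stated in the source).
  path.empty? ? "/index.md" : "#{path}.md"
end

puts markdown_path("/about/index.html") # /about.md
puts markdown_path("/blog/my-post/")    # /blog/my-post.md
```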

Every HTML page automatically gets a <link rel="alternate"> tag in the <head> pointing to its markdown copy. AI crawlers discover the machine-readable version of each page without needing to know your URL scheme. Works in auto-inject mode (default) or data mode for manual template placement.
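As a rough sketch of what the injected tag could look like, the helper below builds one from a markdown URL. The method name is hypothetical, and the `type="text/markdown"` attribute is taken from the plugin's highlights list; the exact attributes the gem emits may differ.

```ruby
require "erb"

# Hypothetical helper: builds the discovery tag for one page.
# Not the gem's API; the attribute set is an assumption based on the
# text/markdown type mentioned in the plugin's highlights.
def markdown_alternate_tag(markdown_url)
  href = ERB::Util.html_escape(markdown_url) # escape &, <, >, and quotes
  %(<link rel="alternate" type="text/markdown" href="#{href}">)
end

puts markdown_alternate_tag("/about.md")
```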

Generate llms.txt

Following the llms.txt specification by Jeremy Howard (Answer.AI), Jekyll-AEO generates:

  • /llms.txt: A structured index of your entire site, organized by collection, with titles, descriptions, and links to markdown copies
  • /llms-full.txt: Every page’s markdown content concatenated into a single file

A generated llms.txt looks like this:

# My Website

> A website about building great products

## Pages

- [About](/about.md): Learn about our company and mission
- [Pricing](/pricing.md): Plans and pricing for all tiers

## Blog Posts

- [Launching v2.0](/blog/launching-v2.md): Our biggest release yet

Sections auto-generate from your Jekyll collections, or you can define custom sections. Adoption is accelerating: Mintlify has rolled it out across thousands of documentation sites including Anthropic and Cursor.
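To make the index format concrete, here is a minimal sketch that renders the same structure as the sample llms.txt above. The method name and the hash keys (`:title`, `:url`, `:description`) are assumptions for illustration; they are not the gem's configuration format.

```ruby
# Hypothetical renderer for the llms.txt layout shown above:
# an H1 title, a blockquote summary, then one H2 section per group of links.
def render_llms_txt(title:, description:, sections:)
  lines = ["# #{title}", "", "> #{description}"]
  sections.each do |heading, entries|
    lines.push("", "## #{heading}", "")
    entries.each do |entry|
      lines << "- [#{entry[:title]}](#{entry[:url]}): #{entry[:description]}"
    end
  end
  lines.join("\n") + "\n"
end

puts render_llms_txt(
  title: "My Website",
  description: "A website about building great products",
  sections: {
    "Pages" => [
      { title: "About", url: "/about.md", description: "Learn about our company and mission" }
    ],
    "Blog Posts" => [
      { title: "Launching v2.0", url: "/blog/launching-v2.md", description: "Our biggest release yet" }
    ]
  }
)
```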

Generate JSON-LD

Add {% aeo_json_ld %} to your layout and get schema.org structured data automatically:

| Schema | Trigger | Impact |
|---|---|---|
| BreadcrumbList | Every page except homepage | Helps AI understand site hierarchy |
| Organization | Homepage | Brand identity signal |
| FAQPage | faq: array in front matter | 3.2x higher AI Overview appearance |
| HowTo | howto: object in front matter | Step-by-step content extraction |
| Speakable | speakable: true in front matter | Voice assistant discovery |
| Article | Dated pages (skips when jekyll-seo-tag is installed) | Authorship and date signals |

Add FAQ or HowTo structured data to any page through simple front matter entries. No manual JSON-LD required.
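The sketch below shows roughly how a front matter `faq` array could be turned into schema.org FAQPage JSON-LD. The `question` and `answer` keys are an assumption, since the plugin's actual front matter shape is not shown here, and `faq_json_ld` is a hypothetical name. The output structure (FAQPage, Question, acceptedAnswer, Answer) follows the schema.org vocabulary.

```ruby
require "json"

# Hypothetical converter from a front matter `faq` array to FAQPage JSON-LD.
# Assumption: each item is a hash with "question" and "answer" keys.
def faq_json_ld(faq_items)
  questions = faq_items.map do |item|
    {
      "@type" => "Question",
      "name" => item["question"],
      "acceptedAnswer" => { "@type" => "Answer", "text" => item["answer"] }
    }
  end

  {
    "@context" => "https://schema.org",
    "@type" => "FAQPage",
    "mainEntity" => questions
  }
end

faq = [
  { "question" => "Does it need configuration?", "answer" => "No, it works after install." }
]
puts JSON.pretty_generate(faq_json_ld(faq))
```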

Generate robots.txt

Control which AI bots can access your site. Jekyll-AEO generates a robots.txt that separates search bots (allowed) from training bots (blocked):

User-agent: OAI-SearchBot    # Allow search
Allow: /
User-agent: GPTBot            # Block training
Disallow: /
Llms-txt: https://yoursite.com/llms.txt

Automatically includes Sitemap: and Llms-txt: directives, covers all major AI companies, and steps aside if you already have a robots.txt in your source directory.
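A minimal sketch of the allow/block policy described above, assuming the bot lists are supplied by the caller. The method name and keyword arguments are hypothetical and do not reflect the gem's configuration keys.

```ruby
# Hypothetical generator for the search-allowed / training-blocked policy.
# Each bot gets its own group, followed by the Sitemap and Llms-txt lines.
def render_robots_txt(allow:, block:, llms_txt_url:, sitemap_url: nil)
  lines = []
  allow.each { |bot| lines.push("User-agent: #{bot}", "Allow: /", "") }
  block.each { |bot| lines.push("User-agent: #{bot}", "Disallow: /", "") }
  lines << "Sitemap: #{sitemap_url}" if sitemap_url
  lines << "Llms-txt: #{llms_txt_url}"
  lines.join("\n") + "\n"
end

puts render_robots_txt(
  allow: ["OAI-SearchBot"],
  block: ["GPTBot"],
  llms_txt_url: "https://yoursite.com/llms.txt"
)
```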

Generate Domain Profile

Publish a /.well-known/domain-profile.json file following the AI Domain Data specification. This gives AI assistants authoritative metadata about your site’s identity, reducing hallucination and improving how your brand appears in AI-generated answers. Auto-populated from your _config.yml site settings.

Generate URL Map

A structured markdown table of every page on your site, with URLs, layouts, redirect mappings, markdown copy paths, and skip reasons. Written to your source directory so it can be committed to version control as a content audit tool.

Validate Output

Verify your AEO output after every build:

bundle exec jekyll aeo:validate

Checks that llms.txt exists and is properly formatted, all referenced markdown files exist, and domain-profile.json (if present) has valid structure with all required fields.
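One of these checks, that every markdown file referenced in llms.txt exists, can be approximated in a few lines. This is a sketch of the idea only, with hypothetical method names; it is not how the validator is implemented.

```ruby
# Hypothetical check: every markdown link listed in llms.txt should exist
# in the built site directory.
LINK_PATTERN = /\[[^\]]*\]\(([^)\s]+\.md)\)/

# Returns the .md targets of all markdown links in the index text.
def linked_markdown_paths(llms_txt)
  llms_txt.scan(LINK_PATTERN).flatten
end

# Returns the linked paths that have no matching file under site_dir.
def missing_markdown_files(llms_txt, site_dir)
  linked_markdown_paths(llms_txt).reject do |path|
    File.exist?(File.join(site_dir, path))
  end
end

index = "- [About](/about.md): Learn about our company and mission\n"
p linked_markdown_paths(index)

# Against a real build (Jekyll writes to _site by default):
#   missing_markdown_files(File.read("_site/llms.txt"), "_site")
```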


AEO GEO Deep Dive


How AI Platforms Find Your Content

Each AI platform has different content preferences. Only 21-25% of cited domains overlap between platforms, so optimizing for ChatGPT alone misses 75-79% of the opportunity.

| Platform | Primary Signals | Top Source | Search Index |
|---|---|---|---|
| ChatGPT | Referring domains, domain traffic | Wikipedia (47.9% of top-10 citations) | Bing (87% match) |
| Perplexity | Credibility, recency, semantic relevance | Reddit (47% of responses) | Bing + own index |
| Claude | Brand authority, content quality | Mentions brands in 97.3% of responses | Brave Search (86.7% match) |
| Google AI Overviews | Schema markup, E-E-A-T | Most conservative (48.5% brand mention rate) | Google Search |

A cross-platform technical foundation (llms.txt, clean markdown, domain profiles, proper robots.txt) is the highest-leverage investment because it works across all platforms simultaneously.


AI Crawler Landscape

Understanding which bots are crawling your site, and why, is essential. Jekyll-AEO’s robots.txt generator gives you granular control.

| Company | Search/Retrieval Bot | Training Bot | Notes |
|---|---|---|---|
| OpenAI | OAI-SearchBot, ChatGPT-User | GPTBot | ChatGPT-User no longer respects robots.txt |
| Anthropic | Claude-SearchBot, Claude-User | ClaudeBot | Respects robots.txt + Crawl-delay |
| Perplexity | PerplexityBot, Perplexity-User | N/A | Perplexity-User ignores robots.txt |
| Google | Googlebot | Google-Extended | Only platform that renders JavaScript |
| Microsoft | Bingbot | N/A | Powers ChatGPT citations (87% match) |
| Apple | Applebot | Applebot-Extended | Apple Intelligence |
| Meta | N/A | Meta-ExternalAgent | Training only |
| Amazon | N/A | Amazonbot | Training / Alexa |

The strategy: Allow search/retrieval bots so your content appears in AI answers. Block training bots so it doesn’t end up in training datasets. You get the citation benefits without contributing to model training.


Content Optimization Tips

Jekyll-AEO makes your content technically accessible to AI. How often that content gets cited also depends on how it is written, and the research points to a few clear signals.

What the data shows:

| Signal | Impact |
|---|---|
| Pages with 19+ statistics | 5.4 avg citations vs. 2.8 without |
| Pages with expert quotes | 4.1 avg citations vs. 2.4 without |
| FAQPage schema markup | 3.2x higher chance of appearing in AI Overviews |
| Schema.org structured data | 2.5x higher citation chance overall |
| Brand presence on 4+ channels | Significantly more AI citations |
| Brand authority | 0.334 correlation, the single strongest predictor |

Actionable principles:

  • Answer-first structure: Lead every section with a direct, concise answer. AI systems extract the first 50-150 words of a topically relevant section most frequently.
  • Fact density: Include concrete statistics, data points, and cited sources. This is the highest-impact content optimization.
  • Clear hierarchy: Use H2/H3 headings as semantic boundaries. Keep sections at 120-180 words, the optimal chunk size for RAG retrieval.
  • Tables and lists: AI systems excel at extracting structured tabular data and numbered steps.
  • Content freshness: Visibility drops 2-3 days after publication without updates. A 30-90 day refresh cadence is recommended. Jekyll-AEO adds last-modified dates automatically.
  • Neutral tone: Encyclopedic, authoritative prose is 30% more likely to appear in AI answers than opinion-heavy content.
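The 120-180 word guideline from the list above is easy to audit mechanically. The following is a minimal sketch, with hypothetical method names, that splits a markdown document on H2/H3 headings and reports sections outside that range. It assumes heading titles are unique within a page; repeated titles would overwrite each other in the result.

```ruby
# Hypothetical audit for the 120-180 word section guideline.
# Counts the words under each H2/H3 heading of a markdown string.
def section_word_counts(markdown)
  counts = {}
  current = nil
  markdown.each_line do |line|
    if (match = line.match(/\A[#]{2,3}\s+(.+?)\s*\z/))
      current = match[1]          # start a new section at each H2/H3
      counts[current] = 0
    elsif current
      counts[current] += line.split.size
    end
  end
  counts
end

# Returns only the sections whose word count falls outside the range.
def sections_outside_range(markdown, range = 120..180)
  section_word_counts(markdown).reject { |_, words| range.cover?(words) }
end

sample = "## Pricing\nPlans start small.\n"
p sections_outside_range(sample)
```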

The foundational academic paper “GEO: Generative Engine Optimization” (Princeton, Georgia Tech, Allen AI, IIT Delhi; ACM SIGKDD 2024) tested 9 optimization methods and found that citing sources, adding quotations, and adding statistics each improve visibility by 30-40%. The paper explicitly found keyword stuffing ineffective for GEO.


How It Works

Jekyll-AEO hooks into Jekyll’s standard build lifecycle. Add the gem, run jekyll build, and all outputs are generated alongside your existing site. No separate build step, no external services, no post-processing scripts.

Every feature shares the same skip logic: redirect pages, non-HTML outputs, and excluded paths are handled consistently. Per-page outputs (markdown copies, link tags) are generated during the build. Site-wide outputs (llms.txt, robots.txt, domain profile) are generated in a single pass after all pages are written.
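The two-phase flow with shared skip logic can be modeled in plain Ruby. This is an illustrative model only, with no Jekyll dependency: `Page`, `skip?`, and `build_outputs` are hypothetical names, and the way Jekyll-AEO wires its passes into Jekyll's build is not shown here.

```ruby
# Illustrative model of the build flow described above (not the gem's code).
Page = Struct.new(:url, :output_ext, :redirect, keyword_init: true)

# Shared skip logic: every feature consults the same predicate, so redirect
# pages, non-HTML outputs, and excluded paths are treated consistently.
def skip?(page, excluded_prefixes)
  page.redirect ||
    page.output_ext != ".html" ||
    excluded_prefixes.any? { |prefix| page.url.start_with?(prefix) }
end

def build_outputs(pages, excluded_prefixes: [])
  kept = pages.reject { |page| skip?(page, excluded_prefixes) }

  # Phase 1: per-page outputs, produced as each page is processed.
  per_page = kept.map { |page| "#{page.url.chomp('/')}.md" }

  # Phase 2: site-wide outputs, produced once after all pages are known.
  site_wide = ["/llms.txt", "/llms-full.txt"]

  { per_page: per_page, site_wide: site_wide }
end

pages = [
  Page.new(url: "/about/", output_ext: ".html", redirect: false),
  Page.new(url: "/feed.xml", output_ext: ".xml", redirect: false),
  Page.new(url: "/old-about/", output_ext: ".html", redirect: true)
]
p build_outputs(pages)[:per_page]
```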

For a technical deep dive into the build pipeline, see the README.


Proven Results

Companies investing in AEO and GEO are seeing measurable returns:

| Company | Result | Method |
|---|---|---|
| LS Building Products | 540% boost in AI Overview mentions | Content restructuring + schema |
| Go Fish Digital | AI traffic converts at 25x traditional search | GEO optimization |
| Ramp | Citation share 8.1% to 12.2% in one month | Monitoring + optimization |
| Runpod | 4x customer growth in 90 days | Content architecture redesign |
| NerdWallet | 35% revenue growth despite 20% traffic decrease | AI-first content strategy |

B2B SaaS companies report 6-27x higher conversion rates from AI-referred traffic vs. traditional search. The traffic numbers are smaller, but visitors who arrive through AI recommendations arrive with higher intent, better context, and more confidence, because an AI system already evaluated the options and recommended you specifically.


Get Started


Quick Start

Three steps, under five minutes:

1. Install

# Gemfile
gem "jekyll-aeo"

# Terminal
bundle install

2. Build

bundle exec jekyll build

Check your output directory. You’ll find .md companions for every page, plus llms.txt and llms-full.txt at the root.

3. Validate

bundle exec jekyll aeo:validate

That’s it. Your site is now AI-readable. Enable advanced features like robots.txt, domain profiles, and URL maps through _config.yml when you’re ready. See the README for all configuration options.

Designed to Coexist

Jekyll-AEO works alongside the plugins you already use. It complements jekyll-seo-tag (a different layer: AI outputs vs. HTML meta tags), cooperates with jekyll-sitemap (priority ordering prevents robots.txt conflicts), and respects jekyll-redirect-from (redirect pages are automatically skipped).

Nothing to configure. Install them side by side and they cooperate automatically.

Ruby >= 3.0. Jekyll >= 4.0. MIT licensed. Built and maintained by ZAAI.

Highlights

  • Generates a clean .md copy of every page, 20-30% fewer tokens than HTML

  • Produces llms.txt and llms-full.txt following the llmstxt.org spec, 90%+ token reduction

  • Injects <link rel="alternate" type="text/markdown"> tags for AI crawler discovery

  • Outputs schema.org JSON-LD: FAQPage, HowTo, BreadcrumbList, Organization, Speakable, Article

  • Generates a robots.txt that allows search bots and blocks training bots

  • Publishes /.well-known/domain-profile.json for authoritative site identity

  • Ships a CLI validator: bundle exec jekyll aeo:validate

  • Zero config to start. Works with jekyll-seo-tag, jekyll-sitemap, jekyll-feed, and jekyll-redirect-from

Tech Stack

  • Ruby
  • Jekyll
  • Liquid
  • Schema.org JSON-LD