
llm-d Website Documentation System - Current State Analysis #220

@petecheslock

In advance of our push to improve the public-facing documentation and guides, I wanted an (AI-assisted) analysis of how docs currently get to the public llm-d.ai website. Since I was the one who got the documentation onto the site in the first place, I can say this doc does a good job of summarizing how it works today, what requirements we were (and still are) working within, and why what we currently have is probably not the solution we want going forward.

Executive Summary

The llm-d website (llm-d.ai, built from the llm-d.github.io repository) currently uses a distributed documentation model where content is pulled from multiple upstream repositories and transformed at build time using Docusaurus. While this approach keeps docs close to code, it has become increasingly complex and fragile due to:

  1. Complex transformation pipeline with hacky markdown-to-MDX conversions
  2. Limited versioning support - all content syncs from main branch only
  3. Scattered documentation across 8+ repositories
  4. Build-time dependencies on external GitHub repositories
  5. Difficult local development - changes require rebuilding the entire site when testing different branches
  6. No single source of truth for documentation

Current Architecture

System Overview

┌─────────────────────────────────────────────────────────────┐
│                    Build Process (GitHub Actions)           │
│  - Triggers: Push to main, nightly cron, manual             │
│  - Runs: npm install → npm run build → deploy to GH Pages   │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│               Docusaurus Build with Remote Content          │
│                                                              │
│  1. Reads components-data.yaml (release metadata)           │
│  2. Downloads README.md from 8+ GitHub repos                │
│  3. Applies transformations (fix links, images, MDX)        │
│  4. Generates static site with navigation                   │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                    Output: llm-d.ai                         │
│  - Architecture docs from llm-d/llm-d                       │
│  - Component docs from individual repos                     │
│  - Guides from llm-d/llm-d/guides/                         │
│  - Community docs from llm-d/llm-d                         │
│  - Generated release page from YAML                         │
└─────────────────────────────────────────────────────────────┘

Technology Stack

  • Site Generator: Docusaurus 3.9.2
  • Remote Content Plugin: docusaurus-plugin-remote-content v4.0.0
  • Build Automation: GitHub Actions (nightly + on-push)
  • Hosting: GitHub Pages
  • Content Sources: 8+ GitHub repositories

File Structure

llm-d.github.io/
├── docusaurus.config.js              # Main Docusaurus config
├── sidebars.js                       # Sidebar definitions (auto-generated)
├── package.json                      # Dependencies
│
├── remote-content/                   # Remote content system
│   ├── remote-content.js            # Plugin aggregator
│   └── remote-sources/
│       ├── components-data.yaml     # ⭐ Central config (release metadata)
│       ├── sync-release.mjs         # Script to update YAML from GitHub API
│       ├── component-configs.js     # YAML loader and repo URL generator
│       ├── repo-transforms.js       # ⚠️ Complex transformation logic
│       ├── utils.js                 # Helper functions
│       │
│       ├── architecture/            # Architecture docs config
│       │   ├── architecture-main.js
│       │   └── components-generator.js  # Auto-generates component pages
│       │
│       ├── guide/                   # Guide docs config
│       │   └── guide-generator.js   # Auto-generates guide pages
│       │
│       ├── community/               # Community docs config
│       │   ├── contribute.js
│       │   ├── code-of-conduct.js
│       │   ├── security.js
│       │   └── sigs.js
│       │
│       ├── usage/                   # Usage docs config
│       │   └── usage-generator.js
│       │
│       └── infra-providers/         # Infra provider docs config
│           └── infra-providers-generator.js
│
├── docs/                            # ⚠️ Build output (not source!)
│   ├── architecture/                # Generated from llm-d/llm-d
│   │   ├── architecture.mdx         # Main README
│   │   ├── latest-release.md        # Generated from YAML
│   │   └── Components/              # Component READMEs
│   │       ├── inference-scheduler.md
│   │       ├── modelservice.md
│   │       ├── kv-cache.md
│   │       └── ... (8 components)
│   │
│   ├── guide/                       # Generated from llm-d/llm-d/guides/
│   │   ├── guide.md                 # guides/README.md
│   │   └── Installation/
│   │       ├── prerequisites.md
│   │       ├── quickstart.md
│   │       ├── inference-scheduling.md
│   │       ├── pd-disaggregation.md
│   │       └── ... (12 guides)
│   │
│   ├── community/                   # Generated from llm-d/llm-d
│   │   ├── contribute.md
│   │   ├── code-of-conduct.md
│   │   ├── security.md
│   │   └── sigs.md
│   │
│   └── usage/                       # Generated from component repos
│       └── ...
│
├── blog/                            # ✅ Local content (not synced)
├── src/                             # ✅ Local React components
└── static/                          # ✅ Local static assets

How Remote Content Syncing Works

1. Configuration (components-data.yaml)

The single source of truth for what gets synced:

release:
  version: v0.5.1
  releaseDate: '2026-03-05'
  releaseDateFormatted: March 5, 2026
  releaseUrl: https://github.com/llm-d/llm-d/releases/tag/v0.5.1
  releaseName: Release v0.5.1

components:
  - name: llm-d-inference-scheduler
    org: llm-d
    sidebarLabel: Inference Scheduler
    description: The scheduler that makes optimized routing decisions...
    sidebarPosition: 1
    version: v0.6.0
  # ... 7 more components

Key Point: Version tags are only for display on the Latest Release page. All content syncs from main branch.
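
As a rough illustration of the point above, the sketch below derives a raw GitHub base URL from a component entry. The helper name `rawBaseUrl` and its `ref` parameter are hypothetical and not taken from component-configs.js; the default of `main` simply mirrors the behavior described here.

```javascript
// Hypothetical helper: derive the raw-content base URL for a component entry.
// `component` mirrors the shape of an item under `components:` in the YAML.
function rawBaseUrl(component, ref = 'main') {
  return `https://raw.githubusercontent.com/${component.org}/${component.name}/${ref}/`;
}

const scheduler = {
  name: 'llm-d-inference-scheduler',
  org: 'llm-d',
  version: 'v0.6.0', // display-only: it never reaches the URL
};

console.log(rawBaseUrl(scheduler));
```

Note that `version` plays no part in the URL, which is why the published docs can drift ahead of the release shown on the Latest Release page.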

2. Content Download (Build Time)

For each configured source, the docusaurus-plugin-remote-content plugin:

  1. Downloads content from GitHub raw URL: https://raw.githubusercontent.com/{org}/{repo}/main/{file}
  2. Passes content through modifyContent() function
  3. Applies transformations (see next section)
  4. Writes transformed content to docs/ directory

Example: Component README sync

{
  name: 'component-llm-d-inference-scheduler',
  sourceBaseUrl: 'https://raw.githubusercontent.com/llm-d/llm-d-inference-scheduler/main/',
  outDir: 'docs/architecture/Components',
  documents: ['README.md'],
  modifyContent(filename, content) {
    // Called by the plugin after it has downloaded README.md from GitHub
    // Apply transformations to `content`
    // Return the result so it is written to docs/architecture/Components/inference-scheduler.md
  }
}
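
A slightly fuller sketch of such a source entry is shown below. It assumes the plugin's `modifyContent` hook returns an object with the new `filename` and `content` (or `undefined` to leave a file unchanged); `transformReadme` is a hypothetical stand-in for the real pipeline, not the site's actual function.

```javascript
// Hypothetical stand-in for the real transformation pipeline.
function transformReadme(content) {
  return content.replace(/<br>/g, '<br />');
}

const source = {
  name: 'component-llm-d-inference-scheduler',
  sourceBaseUrl: 'https://raw.githubusercontent.com/llm-d/llm-d-inference-scheduler/main/',
  outDir: 'docs/architecture/Components',
  documents: ['README.md'],
  // Invoked once per downloaded document.
  modifyContent(filename, content) {
    if (filename !== 'README.md') return undefined; // leave other files untouched
    return {
      filename: 'inference-scheduler.md', // rename the output file
      content: transformReadme(content),
    };
  },
};

const result = source.modifyContent('README.md', 'line one<br>line two');
console.log(result.filename, result.content);
```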

3. Content Transformation Pipeline

This is the most complex and fragile part of the system. It lives in repo-transforms.js:

Phase 1: Basic MDX Fixes (applyBasicMdxFixes)

Problem: GitHub-flavored Markdown ≠ MDX (Docusaurus uses MDX)

Transformations:

  • Convert GitHub callouts → Docusaurus admonitions
    > [!NOTE]           →    :::note
    > This is a note         This is a note
                             :::
  • Convert custom tab markers → Docusaurus Tabs components
    <!-- TABS:START -->     →    <Tabs>
    <!-- TAB:GKE -->              <TabItem value="gke" label="GKE">
    content                       content
    <!-- TABS:END -->             </TabItem></Tabs>
  • Fix HTML tags for MDX compatibility
    • <br> → <br />
    • Self-closing tags must have />
    • Attributes must be quoted
  • Convert HTML comments → JSX comments
    • <!-- comment --> → {/* comment */}
    • Multi-line comments removed entirely
  • Escape curly braces in code blocks
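
To make the first transformation concrete, here is a minimal line-based sketch of a callout-to-admonition conversion. It is not the code in repo-transforms.js: the function name and the mapping from GitHub callout types to Docusaurus admonition types are assumptions, and it deliberately ignores code fences and nested blockquotes.

```javascript
// Assumed mapping from GitHub callout types to Docusaurus admonition types.
const ADMONITION_TYPES = {
  NOTE: 'note',
  TIP: 'tip',
  IMPORTANT: 'info',
  WARNING: 'warning',
  CAUTION: 'danger',
};

// Convert "> [!NOTE]" blocks into ":::note ... :::" blocks.
function convertCallouts(markdown) {
  const lines = markdown.split('\n');
  const out = [];
  let i = 0;
  while (i < lines.length) {
    const start = lines[i].match(/^>\s*\[!(NOTE|TIP|IMPORTANT|WARNING|CAUTION)\]\s*$/);
    if (!start) {
      out.push(lines[i]);
      i += 1;
      continue;
    }
    out.push(`:::${ADMONITION_TYPES[start[1]]}`);
    i += 1;
    // Consume the quoted lines that form the callout body.
    while (i < lines.length && lines[i].startsWith('>')) {
      out.push(lines[i].replace(/^>\s?/, ''));
      i += 1;
    }
    out.push(':::');
  }
  return out.join('\n');
}

console.log(convertCallouts('> [!NOTE]\n> This is a note\n\nAfter'));
```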

Phase 2: Link Fixing

Problem: Relative links in GitHub READMEs break in Docusaurus

Strategy: Rewrite ALL relative links to point back to GitHub

// Relative markdown link
[Some Doc](./guides/example.md)

// Gets rewritten to:
[Some Doc](https://github.com/llm-d/llm-d/blob/main/guides/example.md)

Exception: Internal guide links

  • Maintains mapping of GitHub paths → Docusaurus paths
  • Example: guides/quickstart/README.md → /docs/guide/Installation/quickstart
  • Only works for explicitly listed paths in INTERNAL_GUIDE_MAPPINGS
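
The strategy and its exception can be sketched roughly as follows. The function `rewriteLinks`, its parameters, and the single mapping entry are illustrative assumptions; the real INTERNAL_GUIDE_MAPPINGS and regexes may differ. The sketch also drops any `#fragment` when an internal mapping applies, which a real implementation would need to handle.

```javascript
// Hypothetical mapping entry; the real INTERNAL_GUIDE_MAPPINGS may differ.
const INTERNAL_GUIDE_MAPPINGS = {
  'guides/quickstart/README.md': '/docs/guide/Installation/quickstart',
};

// Rewrite relative markdown links. `sourcePath` is the file's path in the repo,
// `repoBlobBase` is e.g. "https://github.com/llm-d/llm-d/blob/main/".
function rewriteLinks(markdown, sourcePath, repoBlobBase) {
  const basePathLength = new URL(repoBlobBase).pathname.length;
  // (?<!!) skips image syntax; the href group stops at whitespace or ")".
  return markdown.replace(/(?<!!)\[([^\]]+)\]\(([^)\s]+)\)/g, (match, text, href) => {
    // Leave absolute URLs, in-page anchors, and root-relative links alone.
    if (/^(?:[a-z][a-z0-9+.-]*:|#|\/)/i.test(href)) return match;
    const resolved = new URL(href, new URL(sourcePath, repoBlobBase));
    const repoPath = resolved.pathname.slice(basePathLength);
    const internal = INTERNAL_GUIDE_MAPPINGS[repoPath];
    return `[${text}](${internal ?? resolved.href})`;
  });
}

console.log(
  rewriteLinks('[Some Doc](./guides/example.md)', 'README.md', 'https://github.com/llm-d/llm-d/blob/main/')
);
```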

Phase 3: Image Fixing

Problem: Relative image paths break

Strategy: Rewrite to GitHub raw URLs

// Relative image
![Diagram](./images/architecture.png)

// Gets rewritten to:
![Diagram](https://github.com/llm-d/llm-d/raw/main/guides/images/architecture.png)
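
One way to express this rewrite is to resolve each image path against the raw URL of the markdown file that contains it, as in the sketch below. `rewriteImages` and its parameter are hypothetical names; the example assumes the image reference lives in guides/README.md, which is what the rewritten URL above implies.

```javascript
// Resolve relative image paths against the raw URL of the containing file.
function rewriteImages(markdown, sourceFileRawUrl) {
  return markdown.replace(/!\[([^\]]*)\]\(([^)\s]+)\)/g, (match, alt, src) => {
    if (/^[a-z][a-z0-9+.-]*:/i.test(src)) return match; // already absolute
    return `![${alt}](${new URL(src, sourceFileRawUrl).href})`;
  });
}

console.log(
  rewriteImages(
    '![Diagram](./images/architecture.png)',
    'https://github.com/llm-d/llm-d/raw/main/guides/README.md'
  )
);
```

Using the URL constructor rather than string concatenation means `../` segments are normalized without extra special cases.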

Phase 4: Known Broken Link Fixes

Hardcoded fixes for upstream issues:

// Fix broken 'dev' branch references
.replace(/github\.com\/llm-d\/llm-d\/tree\/dev\//g,
         'github.com/llm-d/llm-d/tree/main/')

TODO comments indicate these are temporary hacks

Phase 5: Frontmatter Injection

Adds YAML frontmatter for Docusaurus:

---
title: Inference Scheduler
description: "The scheduler that makes optimized routing decisions..."
sidebar_label: Inference Scheduler
sidebar_position: 1
keywords: [llm-d, inference scheduler, request routing]
---
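
A minimal sketch of how such frontmatter could be generated from component metadata follows. The function names and the shape of the `meta` object are assumptions for illustration. Only the description is quoted, matching the example above, so a title containing a colon would need quoting in a real implementation.

```javascript
// Build a YAML frontmatter block from (assumed) component metadata fields.
function buildFrontmatter({ title, description, sidebarLabel, sidebarPosition, keywords }) {
  return [
    '---',
    `title: ${title}`,
    `description: ${JSON.stringify(description)}`, // JSON string doubles as a quoted YAML scalar
    `sidebar_label: ${sidebarLabel}`,
    `sidebar_position: ${sidebarPosition}`,
    `keywords: [${keywords.join(', ')}]`,
    '---',
    '',
  ].join('\n');
}

// Prepend the frontmatter, separated from the body by a blank line.
function injectFrontmatter(content, meta) {
  return `${buildFrontmatter(meta)}\n${content}`;
}

console.log(
  injectFrontmatter('# README', {
    title: 'Inference Scheduler',
    description: 'The scheduler that makes optimized routing decisions...',
    sidebarLabel: 'Inference Scheduler',
    sidebarPosition: 1,
    keywords: ['llm-d', 'inference scheduler', 'request routing'],
  })
);
```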

Phase 6: Source Attribution

Adds callout banner to bottom of page:

:::info Content Source
This content is automatically synced from [README.md](link) on the `main` branch.

📝 To suggest changes, please [edit the source file](link) or [create an issue](link).
:::
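
The banner could be assembled along these lines. The links in the example above are elided, so the blob, edit, and new-issue URL patterns used here are assumptions based on standard GitHub URL layouts, and `appendSourceBanner` is a hypothetical name.

```javascript
// Append a source-attribution admonition to the end of a page.
function appendSourceBanner(content, { org, repo, path, ref = 'main' }) {
  const repoUrl = `https://github.com/${org}/${repo}`;
  const banner = [
    ':::info Content Source',
    `This content is automatically synced from [${path}](${repoUrl}/blob/${ref}/${path}) on the \`${ref}\` branch.`,
    '',
    `📝 To suggest changes, please [edit the source file](${repoUrl}/edit/${ref}/${path}) or [create an issue](${repoUrl}/issues/new).`,
    ':::',
  ].join('\n');
  // Normalize trailing whitespace so exactly one blank line precedes the banner.
  return `${content.trimEnd()}\n\n${banner}\n`;
}

console.log(appendSourceBanner('# Title\n', { org: 'llm-d', repo: 'llm-d', path: 'README.md' }));
```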

Content Sources (8+ Repositories)

Main Repository

  • Repo: llm-d/llm-d
  • Content:
    • Architecture overview (README.md → docs/architecture/architecture.mdx)
    • User guides (guides/**/*.md → docs/guide/)
    • Community docs (CONTRIBUTING.md, CODE_OF_CONDUCT.md, etc.)
  • Branch: Always main

Component Repositories

  1. llm-d-inference-scheduler (llm-d/llm-d-inference-scheduler)
  2. llm-d-modelservice (llm-d-incubation/llm-d-modelservice)
  3. llm-d-inference-sim (llm-d/llm-d-inference-sim)
  4. llm-d-infra (llm-d-incubation/llm-d-infra)
  5. llm-d-kv-cache (llm-d/llm-d-kv-cache)
  6. llm-d-benchmark (llm-d/llm-d-benchmark)
  7. workload-variant-autoscaler (llm-d-incubation/workload-variant-autoscaler)
  8. gateway-api-inference-extension (kubernetes-sigs/gateway-api-inference-extension) - skipSync: true

Each component's README.md is synced to docs/architecture/Components/{name}.md


Build and Deployment Process

GitHub Actions Workflow (.github/workflows/deploy.yml)

Triggers:

  • Push to main branch
  • Nightly cron at midnight UTC (0 0 * * *)
  • Manual trigger (workflow_dispatch)

Steps:

  1. Checkout code
  2. Setup Node.js 20.18.1
  3. npm install
  4. npm run build (Downloads remote content + builds Docusaurus site)
  5. Upload build artifact
  6. Deploy to GitHub Pages

Build Time: ~3-5 minutes (includes downloading from 8+ repos)

Release Update Process

When a new llm-d release is published:

  1. Run sync script:

    cd remote-content/remote-sources
    node sync-release.mjs
  2. Script actions:

    • Queries GitHub Releases API for latest release
    • Parses "LLM-D Component Summary" table from release notes
    • Updates components-data.yaml:
      • Release version, date, URL
      • Component versions
      • New/re-enabled container images
  3. Manual review and commit:

    git diff components-data.yaml
    git add components-data.yaml
    git commit -m "Update to llm-d v0.5.1"
    git push
  4. Automatic deployment:

    • GitHub Actions triggers on push
    • Builds site with updated metadata
    • Deploys to llm-d.ai

Important: Content (READMEs, guides) still syncs from main branch, not the release tag. Version numbers are only used for display on the Latest Release page.
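
The table-parsing step of the script can be pictured with the sketch below. It is not the code in sync-release.mjs: the function name is hypothetical, and it assumes the summary is a standard markdown table whose first two columns are component name and version. The sample release body is made up for the example.

```javascript
// Parse the markdown table that follows a heading containing `heading`.
// Assumes column 1 is the component name and column 2 is its version.
function parseComponentSummary(releaseBody, heading = 'LLM-D Component Summary') {
  const lines = releaseBody.split(/\r?\n/);
  const start = lines.findIndex((line) => line.includes(heading));
  if (start === -1) return {};

  const rows = [];
  for (const line of lines.slice(start + 1)) {
    const trimmed = line.trim();
    if (trimmed.startsWith('|')) {
      rows.push(trimmed.split('|').slice(1, -1).map((cell) => cell.trim()));
    } else if (rows.length > 0) {
      break; // the first non-table line after the table ends it
    }
  }

  const versions = {};
  // rows[0] is the header row and rows[1] the |---|---| separator.
  for (const [name, version] of rows.slice(2)) {
    if (name && version) versions[name] = version;
  }
  return versions;
}

// Made-up sample body, not real release notes.
const sampleBody = [
  '## LLM-D Component Summary',
  '',
  '| Component | Version |',
  '| --- | --- |',
  '| llm-d-inference-scheduler | v0.6.0 |',
  '| example-component | v0.0.0 |',
  '',
  '## Other notes',
].join('\n');

console.log(parseComponentSummary(sampleBody));
```

Because the parser depends on the heading text and column order of free-form release notes, any change to the release template would silently break it, which is one reason the manual review step matters.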


Pain Points and Limitations

1. Complex and Fragile Transformations

Problem: The transformation pipeline in repo-transforms.js is 300+ lines of regex-based text manipulation.

Examples of brittleness:

  • Regex parsing of markdown syntax (tabs, callouts, links, images)
  • Special case handling for different link types (relative, root-relative, complex ../ paths)
  • Hardcoded fixes for known broken upstream links (with TODO comments)
  • Manual mapping of internal guide links
  • Edge cases around curly braces, HTML tags, comments

Why it's hacky:

  • Regex cannot properly parse markdown (markdown is context-dependent)
  • Transformations are order-dependent (must run in specific sequence)
  • Each new markdown feature requires new regex
  • Difficult to test comprehensively
  • Easy to introduce regressions

Example of complexity:

// Convert tabs (60+ lines of code)
.replace(/<!-- TABS:START -->\n([\s\S]*?)<!-- TABS:END -->/g, (match, tabsContent) => {
  const tabSections = tabsContent.split(/<!-- TAB:/);
  const tabs = [];
  for (let i = 1; i < tabSections.length; i++) {
    const section = tabSections[i];
    const labelMatch = section.match(/^([^:]+?)(?::default)?\s*-->\n([\s\S]*?)$/);
    // ... more parsing ...
  }
  // Generate Docusaurus Tabs component
  // Add imports at top of file
})
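
The context-dependence problem is easy to reproduce. In the sketch below (all names are illustrative, not from repo-transforms.js), a naive global replace also rewrites a `<br>` that appears inside a fenced code sample, while a small fence-tracking wrapper leaves it alone. Even the wrapper is only an approximation: it does not handle tilde fences, longer fences, or indented code.

```javascript
const FENCE = '`'.repeat(3); // built at runtime to keep literal fences out of this sample

// Naive transformation: blind to whether the text is prose or code.
const naiveFixBr = (markdown) => markdown.replace(/<br>/g, '<br />');

// Apply `transform` line by line, skipping lines inside fenced code blocks.
function outsideCodeFences(markdown, transform) {
  let inFence = false;
  return markdown
    .split('\n')
    .map((line) => {
      if (line.trimStart().startsWith(FENCE)) {
        inFence = !inFence; // opening or closing fence line
        return line;
      }
      return inFence ? line : transform(line);
    })
    .join('\n');
}

const doc = ['Line<br>break', `${FENCE}html`, '<br>', FENCE, ''].join('\n');

console.log(naiveFixBr(doc)); // also rewrites the code sample
console.log(outsideCodeFences(doc, naiveFixBr)); // code sample preserved
```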

2. No Versioning Support

Problem: All content syncs from main branch only.

Impact:

  • Documentation always shows latest development state
  • No way to view docs for specific releases (v0.4.x, v0.5.x, etc.)
  • Users on older versions see docs for features they don't have
  • Breaking changes in docs can confuse users
  • Cannot maintain separate docs for LTS versions

Current workaround:

  • Version tags in YAML only displayed on "Latest Release" page
  • To test content from feature branch, must manually edit code:
    // Temporary hack to test feature branch
    const ref = 'feature-branch'; // Change back to 'main' before committing!
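
For comparison, the same branch switch could be driven by an environment variable instead of a source edit. This is only a sketch of the idea, not something the site does today; the variable name `LLMD_DOCS_REF` and the function are invented for the example.

```javascript
// Hypothetical: read the git ref from the environment, defaulting to 'main',
// so testing a feature branch needs no code change that must be reverted.
function resolveContentRef(env = process.env) {
  const ref = (env.LLMD_DOCS_REF || '').trim();
  return ref || 'main';
}

console.log(resolveContentRef({ LLMD_DOCS_REF: 'feature-branch' }));
console.log(resolveContentRef({}));
```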

3. Build-Time External Dependencies

Problem: Site build depends on availability and content of 8+ external GitHub repositories.

Risks:

  • Build breaks if upstream repo is deleted/moved
  • Build breaks if upstream repo is temporarily unavailable
  • Build breaks if upstream content has syntax errors
  • Cannot build offline
  • Cannot guarantee reproducible builds (upstream can change)
  • Slow builds (must download from multiple repos)

Example failure scenario:

  1. Component repo merges breaking markdown change
  2. Nightly build runs at midnight
  3. Transformation fails on new markdown syntax
  4. Site deploy fails
  5. Website stops receiving updates (GitHub Pages keeps serving the last successful deploy) until someone fixes the transformation code

4. Difficult Local Development

Problem: Testing doc changes requires complex workflow.

For synced content:

  1. Fork the source repository (e.g., llm-d/llm-d)
  2. Make changes to README or guide
  3. Commit and push to fork
  4. Temporarily modify component-configs.js to point to the fork/branch instead of the upstream repo
  5. Run npm start to build site locally
  6. Remember to revert config changes before committing

Pain points:

  • Cannot see doc changes without rebuilding entire site
  • Must modify website code to test upstream changes
  • Easy to accidentally commit temporary config changes
  • Slow iteration cycle (build takes minutes)

5. Scattered Documentation

Problem: Documentation is spread across multiple repositories with different structures.

Current locations:

  • Architecture: llm-d/llm-d/README.md
  • Guides: llm-d/llm-d/guides/*/README.md
  • Component docs: {org}/{repo}/README.md (8 repos)
  • Community: llm-d/llm-d/CONTRIBUTING.md, etc.
  • API reference: Not currently documented
  • Infrastructure: llm-d-incubation/llm-d-infra/README.md

Impact:

  • No single source of truth
  • Difficult to maintain consistency
  • Hard to search across all docs
  • No unified navigation
  • Unclear where to add new documentation
  • Contributors don't know which repo to edit

6. Limited SEO and Discoverability

Problem: Docusaurus is primarily designed for single-repo documentation.

Issues:

  • Documentation and marketing site mixed together
  • Harder to optimize for different audiences (users vs. marketers)
  • Search results point to generic llm-d.ai (not docs.llm-d.ai)
  • Cannot track documentation analytics separately

7. Maintenance Burden

Code to maintain:

  • 12 remote content source files (.js configs)
  • 1 YAML data file (manually updated)
  • 1 sync script (430 lines)
  • 1 transformation system (300+ lines of regex)
  • 1 utility library
  • Custom Docusaurus configuration

Every time:

  • New component is added → Update YAML + regenerate
  • New guide is added → Update generator config
  • GitHub changes markdown syntax → Update transformations
  • Docusaurus updates → May break transformations
  • Component repo restructures → Update source paths

Why the Current Approach Was Chosen

Original goals:

  1. Keep docs close to code (docs live with component)
  2. Allow component teams to own their docs
  3. Automatic updates (docs deploy when code changes)
  4. Single website for everything (docs + marketing)

This made sense when:

  • Small number of components (3-4)
  • Simple markdown (no advanced features)
  • No versioning requirements
  • Rapid iteration phase

What changed:

  • Now 8+ components (growing)
  • Need versioning for stable releases
  • Complex markdown features (tabs, callouts, etc.)
  • Users on different versions (v0.4, v0.5, etc.)
  • Transformation edge cases accumulating
  • More emphasis on documentation quality

Appendix: Key Files Reference

Current System

Configuration:

  • remote-content/remote-sources/components-data.yaml - Central configuration
  • remote-content/remote-content.js - Plugin aggregator
  • docusaurus.config.js - Docusaurus configuration

Transformation:

  • remote-content/remote-sources/repo-transforms.js - Transformation pipeline (300+ lines)
  • remote-content/remote-sources/utils.js - Helper functions

Generators:

  • remote-content/remote-sources/architecture/components-generator.js - Component docs
  • remote-content/remote-sources/guide/guide-generator.js - Guide docs
  • remote-content/remote-sources/sync-release.mjs - Release sync script (430 lines)

Workflows:

  • .github/workflows/deploy.yml - Build and deployment

Dependencies:

  • docusaurus-plugin-remote-content v4.0.0
  • @docusaurus/core 3.9.2
  • js-yaml 4.1.0 (for YAML parsing)

Generated Output (Do Not Edit!)

  • docs/architecture/ - Architecture and component docs
  • docs/guide/ - User guides
  • docs/community/ - Community docs
  • docs/usage/ - Usage docs
