Living Wiki

jstuart0 edited this page Apr 28, 2026 · 1 revision

The living wiki generates and maintains a coherent, citation-grounded wiki directly from indexed code. When enabled for a repository, SourceBridge opens a PR within 90 seconds, appends additive commits on subsequent pushes without force-pushing or overwriting reviewer edits, and can publish to Confluence or Notion in addition to git.

What it is

A living wiki is not documentation you write — it is documentation the system derives from the code graph, symbol summaries, and subsystem clusters. Each page carries (path:start-end) citations so readers (and the VS Code plugin) can trace any claim back to the exact source line.
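
As an illustration of how a tool might consume these citations, here is a minimal sketch that pulls them out of page text. It assumes the textual form is exactly `(path:start-end)` with 1-based, inclusive line numbers; the `Citation` type, the regular expression, and the example paths are hypothetical and not taken from SourceBridge.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    path: str
    start: int
    end: int


# Assumed form: "(path:start-end)", e.g. "(internal/auth/token.go:41-58)".
CITATION_RE = re.compile(r"\(([^\s():]+):(\d+)-(\d+)\)")


def extract_citations(text: str) -> list[Citation]:
    """Return every well-formed citation found in text, in document order."""
    found = []
    for path, start, end in CITATION_RE.findall(text):
        start_line, end_line = int(start), int(end)
        # Skip inverted or zero-based ranges (assumes 1-based line numbers).
        if 1 <= start_line <= end_line:
            found.append(Citation(path, start_line, end_line))
    return found
```

A caller could feed each generated paragraph through `extract_citations` and open the returned path at the `start` line.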

The wiki stays current because the scheduler wakes periodically, detects changed code, and regenerates only the affected pages. Human edits made in a PR review or directly in Confluence/Notion are preserved across regeneration cycles via block-level reconciliation.

Enabling the feature

Living wiki is off by default. Enable it globally:

SOURCEBRIDGE_LIVING_WIKI_ENABLED=true

Or via config.toml:

[living_wiki]
enabled = true

Then enable it per-repository from the repository Settings → Living Wiki panel in the web UI, or via GraphQL:

mutation {
  enableLivingWikiForRepo(repositoryId: "your-repo-id") {
    ok
  }
}
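
For scripted enablement, the same mutation can be sent as a standard GraphQL-over-HTTP POST. The sketch below only builds the request. The endpoint URL, the bearer-token header, and the `ID!` variable type are assumptions, since this page does not state them; check them against your server's schema before use.

```python
import json
import urllib.request

# Same mutation as above, parameterized with a variable.
# Assumption: repositoryId is declared as ID! in the schema.
MUTATION = """
mutation Enable($repositoryId: ID!) {
  enableLivingWikiForRepo(repositoryId: $repositoryId) {
    ok
  }
}
"""


def build_enable_request(endpoint: str, token: str, repository_id: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that enables the living wiki."""
    body = json.dumps({"query": MUTATION, "variables": {"repositoryId": repository_id}})
    return urllib.request.Request(
        endpoint,  # hypothetical, e.g. "https://sourcebridge.example.com/graphql"
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; the response body should contain the `ok` field selected by the mutation.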

Per-repo settings panel

The settings panel has six visual states based on the repo's current state:

  1. Globally disabled — living wiki is not enabled on this server
  2. Kill switch — SOURCEBRIDGE_LIVING_WIKI_KILL_SWITCH=true is set
  3. Activation gate — feature is enabled globally but not yet configured for this repo
  4. Corrupt — settings row has unexpected data; shows a repair CTA
  5. Cold-start in progress / enabled idle — wiki is generating or up to date
  6. Refinement panel — advanced settings exposed after first successful generation

Stage A (first-time setup) shows only:

  • Enabled toggle
  • Audience selection (engineer / product / operator)
  • Sink selection

Stage B (after first generation) exposes:

  • Edit policy per sink
  • PR review vs. direct-publish toggle
  • Custom scheduler interval

Sinks

A sink is a publish target. Each repo can publish to multiple sinks simultaneously.

| Sink kind | Status | Notes |
| --- | --- | --- |
| git_repo | Wired | Opens a PR against the repository. Cold-start opens a PR titled wiki: initial generation (sourcebridge). Incremental runs append commits to the open PR branch. |
| confluence | Wired | Publishes to a Confluence space. Pages are reconciled by external_id. Block-level reconciliation preserves human edits. |
| notion | Wired | Publishes to a Notion database. Blocks are tracked by an external_id property. |
| github_wiki | Partial | Pushes to the repo's built-in GitHub wiki (no PR gate). |
| gitlab_wiki | Partial | Pushes to the repo's built-in GitLab wiki. |
| backstage_techdocs | Stubbed | Generates MkDocs-compatible output in the TechDocs layout. |
| mkdocs | Stubbed | Generates a docs/ tree with mkdocs.yml. |
| docusaurus | Stubbed | Generates docs/ with Docusaurus frontmatter. |
| vitepress | Stubbed | Generates docs/ with VitePress structure. |

"Stubbed" means the sink kind is defined in the data model and settings UI but the writer implementation is not yet complete. Attempting to use a stubbed sink returns ErrSinkNotImplemented and the job result will show this failure.

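The stubbed-sink behavior described above can be sketched as follows. This is a Python stand-in, not the actual writer code: `SinkNotImplementedError` mirrors the `ErrSinkNotImplemented` error by name only, and the assumption that other sinks in the same job still run after a stubbed one fails is ours, not the page's.

```python
from typing import Callable


class SinkNotImplementedError(Exception):
    """Python stand-in for the ErrSinkNotImplemented error named above."""


# Status column of the sink table, keyed by sink kind.
SINK_STATUS = {
    "git_repo": "wired", "confluence": "wired", "notion": "wired",
    "github_wiki": "partial", "gitlab_wiki": "partial",
    "backstage_techdocs": "stubbed", "mkdocs": "stubbed",
    "docusaurus": "stubbed", "vitepress": "stubbed",
}


def publish(kind: str, writer: Callable[[str], None]) -> None:
    """Invoke the writer for one sink, refusing unknown and stubbed kinds."""
    if kind not in SINK_STATUS:
        raise ValueError(f"unknown sink kind: {kind}")
    if SINK_STATUS[kind] == "stubbed":
        raise SinkNotImplementedError(kind)
    writer(kind)


def run_job(kinds: list[str], writer: Callable[[str], None]) -> dict[str, str]:
    """Publish to each sink and record a per-sink outcome in the job result."""
    results = {}
    for kind in kinds:
        try:
            publish(kind, writer)
            results[kind] = "ok"
        except SinkNotImplementedError:
            # Assumption: a stubbed sink fails on its own without aborting the job.
            results[kind] = "failed: sink not implemented"
    return results
```
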
Cold-start flow

  1. Scheduler wakes, acquires leader lease for the repo.
  2. Cold-start runner fetches subsystem clusters and graph metrics.
  3. Taxonomy resolver maps clusters to wiki "areas" (falling back to package-path heuristics when clusters are absent).
  4. LLM generates a proposed_ast for each area, plus auto-extracted Glossary, Activity Log, and Decision Record pages.
  5. Quality validators run per page. Gate failures trigger one retry with the rejection reason injected into the prompt. Second failure excludes the page.
  6. Sink writers push to each configured sink.
  7. For git_repo: SourceBridge opens a PR titled wiki: initial generation (sourcebridge).
  8. On PR merge, proposed_ast promotes to canonical_ast.
  9. On PR rejection, proposed_ast is discarded and cold-start retries on the next push.

The direct_publish per-repo option skips the PR gate and writes directly to the default branch.
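
Steps 8 and 9 amount to a small state transition on the two ASTs. The sketch below models only that transition; the `WikiState` type, the `cold_start_pending` flag, and clearing `proposed_ast` after promotion are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WikiState:
    canonical_ast: Optional[dict] = None
    proposed_ast: Optional[dict] = None
    cold_start_pending: bool = True  # hypothetical flag: retry on the next push


def on_pr_merged(state: WikiState) -> None:
    """Step 8: the proposal becomes the canonical wiki."""
    if state.proposed_ast is None:
        raise ValueError("no proposed_ast to promote")
    state.canonical_ast = state.proposed_ast
    state.proposed_ast = None  # assumption: the proposal is cleared once promoted
    state.cold_start_pending = False


def on_pr_rejected(state: WikiState) -> None:
    """Step 9: drop the proposal; cold-start runs again on the next push."""
    state.proposed_ast = None
    state.cold_start_pending = True
```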

Incremental updates

After the first cold-start, SourceBridge tracks two watermarks per repo:

  • source_processed_sha — last commit the generator ran for
  • wiki_published_sha — last merged-wiki baseline

Incremental runs diff against the published baseline, not the unmerged PR head. Subsequent pushes while a PR is open append a new commit (wiki: incremental update (<sha>)) to the existing PR branch — no force-push, no orphaned inline comments, no overwritten reviewer tweaks.

Reviewer commits to the PR branch mark affected blocks human-edited-on-pr-branch in proposed_ast; subsequent bot commits leave those blocks alone.
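
The watermark rule can be made concrete with a short sketch: the diff base is always `wiki_published_sha`, whether or not a PR is open. The `IncrementalPlan` type and the `action` values are hypothetical, and opening a fresh PR when none is open is an assumption the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Watermarks:
    source_processed_sha: str  # last commit the generator ran for
    wiki_published_sha: str    # last merged-wiki baseline


@dataclass(frozen=True)
class IncrementalPlan:
    diff_base: str
    commit_message: str
    action: str  # "append_commit" or "open_pr" (hypothetical labels)


def plan_incremental(marks: Watermarks, pushed_sha: str,
                     open_pr_branch: Optional[str]) -> IncrementalPlan:
    """Plan one incremental run for a newly pushed commit."""
    return IncrementalPlan(
        # Diff against the published baseline, never the unmerged PR head.
        diff_base=marks.wiki_published_sha,
        commit_message=f"wiki: incremental update ({pushed_sha})",
        # With a PR open, add a commit to its branch; no force-push is involved.
        action="append_commit" if open_pr_branch else "open_pr",
    )
```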

Block-level reconciliation

Every generated page is internally a tree of typed blocks with stable IDs. Block IDs are sticky to logical position, not derived from content.

Ownership states:

  • generated — created and maintained by SourceBridge
  • human-edited — modified by a human in a PR or directly in the sink
  • human-edited-on-pr-branch — modified in the open PR; bot commits skip these
  • human-only — created entirely by a human; SourceBridge never touches it

When writing to Confluence or Notion, only changed generated blocks are updated. Human-edited blocks are left alone.
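
A minimal sketch of that rule, assuming blocks are matched by ID and compared by content. The `Block` type and the `reconcile` signature are illustrative, and the handling of newly generated or removed blocks is left out.

```python
from dataclasses import dataclass, replace

GENERATED = "generated"


@dataclass(frozen=True)
class Block:
    id: str         # stable ID, sticky to logical position
    ownership: str  # one of the ownership states listed above
    content: str


def reconcile(current: list[Block],
              regenerated: dict[str, str]) -> tuple[list[Block], list[str]]:
    """Apply regenerated content to generated blocks only.

    Returns the merged block list and the IDs that need a write to the sink.
    """
    merged, changed = [], []
    for block in current:
        new_content = regenerated.get(block.id)
        if (block.ownership == GENERATED
                and new_content is not None
                and new_content != block.content):
            merged.append(replace(block, content=new_content))
            changed.append(block.id)
        else:
            # Human-owned or unchanged blocks pass through untouched.
            merged.append(block)
    return merged, changed
```

Only the IDs in the second return value would be sent to Confluence or Notion, which keeps writes proportional to what actually changed.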

Edit policies per sink:

  • local_to_sink — edit stays in that sink's overlay; canonical AST unchanged (default for Confluence and Notion)
  • promote_to_canonical — edit syncs back to canonical and propagates to all sinks (default for git_repo sinks)
  • require_review_before_promote — edit opens a sync-PR (default for GitHub/GitLab built-in wikis)
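
The defaults above can be written as a lookup table. In this sketch the `effective_policy` helper and its `override` parameter are hypothetical; sink kinds with no documented default (the stubbed ones) are rejected rather than guessed.

```python
from typing import Optional

EDIT_POLICIES = {"local_to_sink", "promote_to_canonical", "require_review_before_promote"}

# Defaults as stated in the list above.
DEFAULT_EDIT_POLICY = {
    "confluence": "local_to_sink",
    "notion": "local_to_sink",
    "git_repo": "promote_to_canonical",
    "github_wiki": "require_review_before_promote",
    "gitlab_wiki": "require_review_before_promote",
}


def effective_policy(sink_kind: str, override: Optional[str] = None) -> str:
    """Return the per-sink override if set, else the documented default."""
    if override is not None:
        if override not in EDIT_POLICIES:
            raise ValueError(f"unknown edit policy: {override}")
        return override
    if sink_kind not in DEFAULT_EDIT_POLICY:
        raise ValueError(f"no documented default for sink kind: {sink_kind}")
    return DEFAULT_EDIT_POLICY[sink_kind]
```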

Credential broker

Confluence and Notion credentials are stored field-level encrypted in SurrealDB. The Broker interface provides per-job Snapshot objects. HTTP clients for sinks receive a Snapshot per call — not credentials baked in at construction time. This means credential rotation takes effect on the next job without restarting any long-lived client.

The invariant: at most one credential rotation per job. Rotations are recorded in the governance audit log.
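
The snapshot-per-call pattern can be sketched like this. The names echo the Broker and Snapshot concepts above, but the fields, the methods, and the bearer header are illustrative assumptions, not the real interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    token: str
    version: int


class Broker:
    """Holds the current credential and hands out immutable snapshots."""

    def __init__(self, token: str) -> None:
        self._current = Snapshot(token, 1)

    def rotate(self, new_token: str) -> None:
        self._current = Snapshot(new_token, self._current.version + 1)

    def snapshot(self) -> Snapshot:
        # Called once at the start of each job in this sketch.
        return self._current


class SinkClient:
    """Long-lived client that stores no credentials of its own."""

    def headers(self, snap: Snapshot) -> dict[str, str]:
        # Hypothetical auth scheme; real sinks differ.
        return {"Authorization": f"Bearer {snap.token}"}
```

Because the client receives a `Snapshot` on every call, a job started after `rotate` picks up the new token without the client being rebuilt, while a job already in flight keeps the snapshot it started with.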

Scheduler

The scheduler runs inside the API process with:

  • Per-repo FNV-32a jitter — deterministic across restarts; prevents thundering-herd on large deployments
  • Leader election — reuses the trash_sweep lease pattern so only one API replica runs the scheduler
  • Per-tenant concurrency cap — max_concurrent_jobs_per_tenant (default 5)
  • Per-sink rate limiters — separate limiters for Confluence, Notion, GitHub, and GitLab
  • Default interval — 15 minutes (configurable via SOURCEBRIDGE_LIVING_WIKI_SCHEDULER_INTERVAL)
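
To show why hash-based jitter is deterministic across restarts, here is a sketch of 32-bit FNV-1a with a per-repo offset derived from it. The hash follows the published FNV-1a constants; mapping the hash to an offset by taking it modulo the interval is an assumption about how the scheduler applies it.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv32a(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte in, then multiply by the FNV prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


def jitter_seconds(repo_id: str, interval_seconds: int = 15 * 60) -> int:
    """Stable per-repo offset within one scheduler interval (assumed mapping)."""
    return fnv32a(repo_id.encode("utf-8")) % interval_seconds
```

The same repository ID always yields the same offset, so wake times stay spread out after a restart without any stored state.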

Quality validators

Eight validators run per generated page:

| Validator | Description |
| --- | --- |
| vagueness | Flags pages that state nothing concrete |
| empty_headline | Flags pages with no meaningful heading content |
| code_example_present | Checks for at least one code block where appropriate |
| citation_density | Ensures sufficient (path:start-end) citations |
| reading_level | Checks prose complexity for the target audience |
| architectural_relevance | Verifies the page is about the repository being documented |
| factual_grounding | Cross-checks claims against the code graph |
| block_count | Enforces minimum structural complexity |

Validators are configured per template and per audience, each with either gate or warning severity. For example, ADRs don't need the same citation density as API reference pages.

Retry policy: gate failure triggers one retry with the rejection reason injected into the prompt. Second failure excludes the page and surfaces the rejection reason in the PR description.
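
The retry policy is a two-attempt loop. In this sketch, `generate` and `validate` are hypothetical callables standing in for the LLM call and the gate validators; `validate` returns a rejection reason, or `None` when the page passes.

```python
from typing import Callable, Optional, Tuple


def generate_with_retry(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (page, None) on success or (None, rejection_reason) on exclusion."""
    reason: Optional[str] = None
    for _attempt in range(2):  # the first try plus exactly one retry
        # On the retry, the previous rejection reason is passed to the generator
        # so it can be injected into the prompt.
        page = generate(reason)
        reason = validate(page)
        if reason is None:
            return page, None
    # Second gate failure: the page is excluded and the reason is surfaced.
    return None, reason
```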

Failure categories:

  • transient — network or LLM timeout; job is retried on next scheduler wake
  • auth — credential failure; UI shows a "Fix credentials" deep-link
  • partial_content — some pages succeeded, some failed; retryLivingWikiJob mutation retries only the excluded pages

Metrics (Prometheus)

Five series exposed when the living wiki scheduler is running:

| Metric | Description |
| --- | --- |
| livingwiki_jobs_total | Counter by status (success / failed / skipped) |
| livingwiki_pages_generated_total | Pages generated per job |
| livingwiki_validation_failures_total | Validator failures by name and page template |
| livingwiki_job_duration_seconds | Histogram of job wall time |
| livingwiki_sink_write_duration_seconds | Histogram of per-sink write time |

Global settings

The /settings/living-wiki page in the web UI configures global integration credentials (GitHub PAT, GitLab PAT, Confluence site + email + API token, Notion integration token). Each secret field is write-only from the API side — the server stores it encrypted and returns "********" on reads.

Test-connection buttons validate credentials against the live external APIs before you enable any sink.

Precedence: UI-stored value > environment variable > default.
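
The precedence chain can be sketched as a three-step lookup. The function, its parameters, and the setting names in the usage note are placeholders rather than real SourceBridge keys, and treating `None` as "not set in the UI" is an assumption.

```python
from typing import Mapping, Optional


def resolve_setting(
    ui_value: Optional[str],
    env: Mapping[str, str],
    env_key: str,
    default: str,
) -> str:
    """Resolve one setting: UI-stored value > environment variable > default."""
    if ui_value is not None:  # assumption: None means nothing stored in the UI
        return ui_value
    if env_key in env:
        return env[env_key]
    return default
```

For example, `resolve_setting(None, os.environ, "EXAMPLE_SETTING", "fallback")` returns the environment value when `EXAMPLE_SETTING` is set and `"fallback"` otherwise.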

Limitations and known stubs

  • backstage_techdocs, mkdocs, docusaurus, vitepress sink writers are not yet implemented. They return ErrSinkNotImplemented.
  • The Notion webhook model is not yet rich enough for real-time triggering; notion-poll is used as a workaround.
  • Block migrations (moved, split, merged, renamed) appear in PR descriptions but are not yet surfaced in the web UI diff view.
  • Per-page progress in the activity feed shows at a granularity of "page started/completed" — not token-level streaming.
  • The human-only block ownership type is tracked in the model but the web UI does not yet have a first-class creation flow for it.

Testing tiers

Three testing tiers exist for the living wiki:

  1. Unit-integration — make test-livingwiki-integration (build tag livingwiki_integration). Runs an in-process E2E test.
  2. Real-Confluence smoke — a Kubernetes CronJob at deploy/kubernetes/base/cronjobs/livingwiki-smoke.yaml.example that runs weekly against a real Confluence instance.
  3. Manual release runbook — checklist in RELEASING.md that must be completed before any release touching internal/livingwiki/.
