refactor(citation-manager): move session cache from extractor hook into CLI tool #92

@WesleyMFrederick

Problem

The citation extractor hook (.claude/hooks/citation-manager/extractor.sh) owns the session-based caching logic that prevents re-extraction of the same file within a Claude Code session. This logic is implemented in bash (~145 lines), which ties a cross-cutting concern (session caching) to the hook layer rather than to the tool itself. The tool (citation-manager extract links) is stateless and has no concept of sessions or caching.

This means:

  • Cache logic lives in bash, not in the tool's Node.js codebase (untestable with Vitest)
  • The tool cannot be used independently with caching (only works through the hook)
  • Cache behavior is invisible to the tool's architecture and component guides

Reproduction Steps

  1. Run citation-manager extract links <file.md> directly — always extracts, no caching
  2. Read a .md file in a Claude Code session — hook fires, checks cache, conditionally extracts
  3. Read the same .md file again in the same session — hook silently skips (cache hit)
  4. Observe: caching only works through the hook, never through direct CLI usage

Root Cause

The cache was implemented at the hook level as a pragmatic shortcut during initial development. The hook script manages file-based cache markers in .citation-manager/claude-cache/ keyed by session_id + md5(file_content). The citation-manager CLI was designed as a stateless tool with no session awareness.
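As a rough illustration of the keying scheme described above, a Node.js port might derive the marker path as sketched below. The function names (contentHash, markerPath) and the `<session_id>_<hash>` file naming are assumptions made for this sketch; the issue only states that markers are keyed by session_id + md5(file_content) under .citation-manager/claude-cache/.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Directory named in the issue; the file naming inside it is an assumption.
const CACHE_DIR = path.join(".citation-manager", "claude-cache");

// md5 of the file content, hex-encoded (mirrors the hook's md5(file_content)).
function contentHash(content) {
  return crypto.createHash("md5").update(content).digest("hex");
}

// Hypothetical marker layout: one empty file per (session, content) pair.
function markerPath(sessionId, content, cacheDir = CACHE_DIR) {
  return path.join(cacheDir, `${sessionId}_${contentHash(content)}`);
}
```

Because the hash is part of the key, editing a file produces a new marker path, which is what keeps the current invalidation behavior intact.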

Expected Behavior

  • citation-manager extract links <file> --session <id> caches by default, returns empty on cache hit
  • citation-manager extract links <file> --no-cache opts out of caching
  • citation-manager extract links <file> --clear-cache clears cache markers
  • The hook becomes a thin passthrough (~15 lines): parse stdin, pass session_id to the tool, format output for Claude
  • Hook retains output formatting responsibility (converting JSON to ## Citation: markdown)
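The flag behavior listed above could be sketched roughly as follows. This is a minimal sketch, not the tool's actual implementation: extractWithCache, the options object, and the injected extract callback are hypothetical names standing in for whatever the CLI orchestrator really exposes, and the marker file naming is an assumption.

```javascript
const crypto = require("node:crypto");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical wrapper around the existing extraction step.
// `extract` stands in for the tool's real "extract links" logic.
function extractWithCache(filePath, opts, extract) {
  const {
    sessionId,                 // value of --session <id>, if given
    noCache = false,           // true when --no-cache is passed
    cacheDir = path.join(".citation-manager", "claude-cache"),
  } = opts;

  // Without a session, or with --no-cache, behave exactly as today.
  if (!sessionId || noCache) return extract(filePath);

  const hash = crypto
    .createHash("md5")
    .update(fs.readFileSync(filePath))
    .digest("hex");
  const marker = path.join(cacheDir, `${sessionId}_${hash}`);

  // Cache hit: same session and same content were already extracted.
  if (fs.existsSync(marker)) return null;

  // Cache miss: extract first, then record the marker so that a failed
  // extraction is not remembered as done.
  const result = extract(filePath);
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(marker, "");
  return result;
}
```

Writing the marker only after extraction succeeds is a design choice of this sketch; the existing bash hook may order these steps differently.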

Related

  • Broken hook paths fixed in settings.json (validator + extractor paths updated to citation-manager/ subdirectory after commit 966253d refactor)
  • Architecture: /WesleyMFrederick/cc-workflows/blob/main/tools/citation-manager/design-docs/ARCHITECTURE-Citation-Manager.md
  • Hook file: /WesleyMFrederick/cc-workflows/blob/main/.claude/hooks/citation-manager/extractor.sh

Note

Architecture decisions made during exploration:

  • File-based cache — tool is CLI (process exits between calls), cache must persist on disk
  • Flags on existing extract command, NOT a separate cache subcommand — per Simplicity First (^simplicity-first) and Implement When Needed (^implement-when-needed) principles
  • Manual cache cleanup only — no TTL/auto-cleanup; --clear-cache flag for manual use
  • Formatting stays in hook — tool returns data JSON, hook formats for Claude's hookSpecificOutput
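For the manual cleanup decision above, a --clear-cache handler can stay very small. The sketch below assumes markers are plain entries directly inside the cache directory; clearCache and its return value (the number of entries removed) are hypothetical and not part of the existing tool.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical handler for --clear-cache: removes every marker in cacheDir
// and reports how many entries were deleted. No TTL or automatic cleanup.
function clearCache(cacheDir = path.join(".citation-manager", "claude-cache")) {
  if (!fs.existsSync(cacheDir)) return 0; // nothing cached yet

  const entries = fs.readdirSync(cacheDir);
  for (const entry of entries) {
    fs.rmSync(path.join(cacheDir, entry), { recursive: true, force: true });
  }
  return entries.length;
}
```

Keeping the directory itself in place means later cache writes do not depend on it being recreated, though the tool could just as well remove it.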

Acceptance Criteria

  • citation-manager extract links <file> --session <id> returns normal JSON on first call, empty/null on subsequent calls with same session+content
  • --no-cache flag bypasses cache and always extracts
  • --clear-cache flag removes all cache markers from .citation-manager/claude-cache/
  • Cache key uses session_id + content_hash (preserving current invalidation behavior)
  • Extractor hook reduced to stdin parsing + session_id passthrough + output formatting
  • Hook no longer contains any cache check/write logic
  • All existing extract links behavior unchanged when --session not provided (backward compatible)
  • No user-facing behavior changes in Claude Code sessions (hook + tool produce same result)

Definition of Done

  • Failing tests written (RED phase)
  • Implementation complete (GREEN phase)
  • All tests pass
  • Build succeeds
  • Hook updated and tested manually in Claude Code session
  • Component guide updated for ContentExtractor/CLI Orchestrator
  • Architecture doc updated with cache component description
  • Committed with conventional commit
