1 files changed, 426 insertions, 0 deletions
diff --git a/docs/design/2026-05-28-rulesets-enhancement-backlog.org b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
new file mode 100644
index 0000000..577edb6
--- /dev/null
+++ b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
@@ -0,0 +1,426 @@
+#+TITLE: Suggested rulesets enhancements
+#+DATE: 2026-05-28
+#+SOURCE: Codex review of /home/cjennings/code/rulesets
+
+* Purpose
+
+This file captures improvement ideas from a broad review of the =rulesets=
+project. The goals are:
+
+- make agents more efficient;
+- increase agent knowledge and effectiveness;
+- reduce token usage;
+- improve user experience;
+- make the system easier to operate across machines, projects, and runtimes.
+
+* Enhancement Backlog
+
+** TODO Runtime-neutral core
+
+Complete the migration from Claude-Code-specific structure to a generic agent
+runtime contract.
+
+Current state:
+
+- The reusable core is already mostly runtime-neutral: =.ai/protocols.org=,
+  =.ai/workflows/=, =.ai/scripts/=, =inbox/=, cross-agent comms, session
+  archives, and task workflows are conceptually usable by any capable agent.
+- The install paths, language bundles, launcher, hook wiring, and naming are
+  still Claude-specific: =~/.claude/=, =.claude/=, =CLAUDE.md=, Claude hook
+  payloads, and the =claude= binary.
+
+Suggested shape:
+
+- Define a small runtime contract: where rules live, where workflows live, how
+  live session state is named, how startup instructions are found, and which
+  capabilities a runtime adapter may expose.
+- Treat Claude Code as one adapter, not the project identity.
+- Add adapter directories or manifests for Claude, Codex/OpenAI-compatible
+  agents, and local OpenAI-compatible agents.
+
+Why it helps:
+
+- Makes the system usable offline with local models after setup.
+- Reduces vendor lock-in while preserving the high-value =.ai/= workflow layer.
+- Lets multiple agents share project conventions without rewriting the project
+  memory and workflows for each tool.
+
+** TODO Per-agent live session files
+
+Replace the singleton live session file =.ai/session-context.org= with
+agent-scoped session context files.
+
+Problem:
+
+- The current live file is a single shared path.
+- If two agents operate in one project at the same time, they can overwrite or
+  confuse each other's active session state.
+- This matters more as the project moves toward generic runtime support.
+
+Suggested shape:
+
+- Use a directory such as =.ai/live-sessions/=.
+- Name live files with runtime, host, pid or session id, and timestamp, e.g.
+  =codex-strix-48213-2026-05-28-0830.org=, where =48213= stands in for the pid.
+- At wrap-up, move the live file into =.ai/sessions/= using the existing archive
+  naming convention.
+- Keep a lightweight pointer file only if needed, e.g. =.ai/current-session= for
+  single-agent clients.
+
+Why it helps:
+
+- Enables concurrent agents safely.
+- Makes crash recovery more precise.
+- Prevents one runtime's live context from polluting another runtime's session.
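+
+A minimal sketch of the naming scheme, assuming the field order runtime, host,
+session id, timestamp. The function names and the slug rule are illustrative
+assumptions, not existing rulesets code.
+

```python
import re
from datetime import datetime
from pathlib import Path


def live_session_name(runtime: str, host: str, session_id: str, started: datetime) -> str:
    """Build a live session filename from runtime, host, session id, and start time.

    The field order and the minute-resolution timestamp are assumptions.
    """
    def slug(text: str) -> str:
        # Keep names filesystem-safe: lowercase alphanumerics separated by dashes.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    stamp = started.strftime("%Y-%m-%d-%H%M")
    return f"{slug(runtime)}-{slug(host)}-{slug(session_id)}-{stamp}.org"


def live_session_path(project_root: Path, name: str) -> Path:
    """Place the live file under the suggested .ai/live-sessions/ directory."""
    return project_root / ".ai" / "live-sessions" / name
```

+
+Including the pid or session id keeps two agents of the same runtime on the
+same host from colliding when they start within the same minute.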
+
+** TODO Workflow routing index compression
+
+Convert =.ai/workflows/INDEX.org= into a compact machine-readable routing
+manifest, while keeping prose workflow documentation separate.
+
+Problem:
+
+- The index is useful, but long.
+- Agents need only a small routing table most of the time: trigger phrase,
+  workflow file, purpose, and plugin ownership.
+- Reading prose-heavy routing text costs tokens before the agent knows which
+  workflow matters.
+
+Suggested shape:
+
+- Add =.ai/workflows/catalog.json= or =catalog.edn= with entries:
+  - =id=
+  - =file=
+  - =purpose=
+  - =triggers=
+  - =source_plugins=
+  - =loads=
+  - =token_tier=
+- Generate the human =INDEX.org= from the manifest, or generate the manifest
+  from a canonical structured block in each workflow.
+
+Why it helps:
+
+- Faster workflow routing.
+- Lower context load at startup.
+- Easier drift detection: missing files, stale triggers, orphan plugins, and
+  duplicate triggers can be checked mechanically.
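+
+The mechanical checks could look roughly like this. It is a sketch that assumes
+catalog entries are dicts carrying the =id=, =file=, and =triggers= fields
+listed above; =check_catalog= and its message wording are hypothetical.
+

```python
from collections import defaultdict


def check_catalog(entries, existing_files):
    """Return human-readable problems found in a routing catalog.

    entries: list of dicts with "id", "file", and optional "triggers".
    existing_files: set of workflow file names actually present on disk.
    """
    problems = []
    owners = defaultdict(list)
    for entry in entries:
        if entry["file"] not in existing_files:
            problems.append(f"missing file: {entry['file']} (id={entry['id']})")
        for trigger in entry.get("triggers", []):
            # Compare triggers case-insensitively so "Wrap up" and "wrap up" clash.
            owners[trigger.strip().lower()].append(entry["id"])
    for trigger, ids in sorted(owners.items()):
        if len(ids) > 1:
            problems.append(f"duplicate trigger {trigger!r}: {', '.join(ids)}")
    return problems
```

+
+A caller would pass the file names found in =.ai/workflows/= as
+=existing_files= and fail the check when the returned list is non-empty.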
+
+** TODO Skill, command, rule, hook, and workflow catalog
+
+Generate a top-level machine-readable catalog of all agent-facing artifacts.
+
+Problem:
+
+- Agents currently infer system shape from directory scans and long files.
+- Important capabilities are spread across top-level skill directories,
+  =.claude/commands/=, =claude-rules/=, =hooks/=, =scripts/=, =languages/=,
+  =teams/=, =mcp/=, and =.ai/workflows/=.
+- This makes first-pass orientation expensive.
+
+Suggested shape:
+
+- Generate =catalog.json= at repo root.
+- Include each artifact's:
+  - kind: skill, command, rule, hook, script, workflow, language bundle, team
+    overlay, MCP server;
+  - name;
+  - summary;
+  - trigger or invocation;
+  - source path;
+  - install target;
+  - dependencies;
+  - whether it is user-invoked, model-triggered, startup-triggered, or internal.
+
+Why it helps:
+
+- Agents can load one compact file before deciding what else to read.
+- Improves routing accuracy.
+- Makes docs, install checks, and audit output consistent.
+- Reduces repeated expensive exploration in every session.
+
+** TODO Token-tiered workflow and rule files
+
+Split long workflow and rule documents into explicit token tiers.
+
+Problem:
+
+- Some files serve multiple audiences: routing, quick execution, deep reference,
+  and historical rationale.
+- Agents often need the execution steps but not the whole rationale.
+- Long documents are good for maintainability but expensive in active context.
+
+Suggested shape:
+
+- Standardize top-level sections:
+  - =Summary= or =Quick Contract=: one-screen purpose and outputs.
+  - =Execution=: steps an agent must follow.
+  - =Reference=: examples, edge cases, rationale, old decisions.
+  - =History= or =Design Notes=: durable context not needed every run.
+- Teach startup/routing to read only =Summary= first, then =Execution= only for
+  the selected workflow.
+- Keep long rationale available but opt-in.
+
+Why it helps:
+
+- Cuts token usage without deleting useful knowledge.
+- Makes behavior more predictable because every workflow exposes its contract
+  in the same place.
+- Helps smaller/local models follow the system with less context pressure.
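+
+Tier-aware reading can be sketched as below, assuming each tier is a top-level
+Org heading inside the workflow file. =read_section= is a hypothetical helper;
+it does not handle TODO keywords or tags on the heading line.
+

```python
def read_section(org_text: str, heading: str) -> str:
    """Return the body of the first top-level Org heading named `heading`.

    Subheadings ("** ...") stay inside the body; the next top-level
    heading ("* ...") ends it. Returns "" when the heading is absent.
    """
    body, capturing = [], False
    for line in org_text.splitlines():
        if line.startswith("* "):
            if capturing:
                break  # reached the next top-level section
            capturing = line[2:].strip().lower() == heading.lower()
            continue
        if capturing:
            body.append(line)
    return "\n".join(body).strip()
```

+
+Startup routing would call this with =Summary= for every candidate workflow and
+with =Execution= only for the one it selects.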
+
+** TODO Generated install and audit manifest
+
+Make installable artifacts explicit in a manifest instead of inferred primarily
+from directory globs and Makefile variables.
+
+Problem:
+
+- The Makefile and scripts infer many behaviors from filesystem shape.
+- Globs are convenient but hide intent: default hooks vs opt-in hooks, user
+  commands vs skills, project-owned files vs rulesets-owned files.
+- Drift checks duplicate some of this logic.
+
+Suggested shape:
+
+- Add =install-manifest.json= or fold install targets into the top-level
+  =catalog.json=.
+- For each installed artifact, specify:
+  - source;
+  - target;
+  - install mode: symlink, copy, seed-only, generated, ignored;
+  - overwrite policy;
+  - ownership: rulesets-owned, project-owned, machine-owned;
+  - check policy.
+
+Why it helps:
+
+- Reduces Makefile complexity.
+- Makes =doctor= and =audit= simpler and more reliable.
+- Documents ownership rules in data instead of prose plus shell logic.
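+
+One possible encoding of a manifest entry, with validation against the install
+modes and ownership values listed above. The =Artifact= class, the boolean
+=overwrite= field, and the rule that seed-only artifacts never overwrite are
+assumptions made for illustration.
+

```python
from dataclasses import dataclass

INSTALL_MODES = {"symlink", "copy", "seed-only", "generated", "ignored"}
OWNERS = {"rulesets-owned", "project-owned", "machine-owned"}


@dataclass(frozen=True)
class Artifact:
    source: str
    target: str
    mode: str
    ownership: str
    overwrite: bool = False  # hypothetical stand-in for "overwrite policy"


def validate(artifact: Artifact) -> list:
    """Return a list of validation errors; an empty list means the entry is valid."""
    errors = []
    if artifact.mode not in INSTALL_MODES:
        errors.append(f"unknown install mode: {artifact.mode}")
    if artifact.ownership not in OWNERS:
        errors.append(f"unknown ownership: {artifact.ownership}")
    # Assumed rule: seed-only files are written once and never replaced.
    if artifact.mode == "seed-only" and artifact.overwrite:
        errors.append("seed-only artifacts must not overwrite existing files")
    return errors
```

+
+With entries validated this way, =doctor= and =audit= could share one loader
+rather than each re-deriving intent from globs.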
+
+** TODO Deduplicate =.ai/= template source
+
+Clarify the canonical source for =.ai/= template files and reduce duplication
+between repo-local =.ai/= and =claude-templates/.ai/=.
+
+Problem:
+
+- The repo contains both live project =.ai/= files and template
+  =claude-templates/.ai/= files.
+- Much of the content appears duplicated.
+- Agents and humans have to know which one is canonical for a given edit.
+
+Suggested shape:
+
+- Choose one canonical template source.
+- Treat project-local =.ai/= as this repo's own working copy, not the template
+  source, or vice versa.
+- Add a manifest/doctor check that says which paths are generated, copied, or
+  project-owned.
+- Consider excluding live session state from template sync completely.
+
+Why it helps:
+
+- Prevents edits landing in the wrong copy.
+- Reduces files agents must inspect.
+- Makes audit output easier to trust.
+
+** TODO Tighten generated/cache exclusions
+
+Make default inventory, audit, and review commands ignore generated/vendor/cache
+content more aggressively.
+
+Problem:
+
+- Directory scans can include =node_modules=, =__pycache__=, =.pytest_cache=,
+  package locks, generated OAuth artifacts, and test caches.
+- Even when ignored by git, these files are visible to naïve filesystem reads.
+
+Suggested shape:
+
+- Add a shared ignore file for agent inventory, e.g. =.aiignore= or
+  =rulesets-ignore.json=.
+- Teach scripts and instructions to use it when summarizing or reviewing.
+- Make the lockfile policy explicit: ignore lockfiles that belong to a local
+  skill dependency cache, and track them where reproducibility matters.
+
+Why it helps:
+
+- Reduces token waste during exploration.
+- Prevents vendor files from distorting project summaries.
+- Makes agent review output focus on authored source.
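+
+A sketch of how inventory scripts might apply a shared ignore list. It assumes
+one glob pattern per entry, matched against the whole path or any single path
+component, which is simpler than full gitignore semantics. The function names
+are hypothetical.
+

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def is_ignored(path: str, patterns) -> bool:
    """True if the whole path or any one of its components matches a pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatch(path, pattern) or any(fnmatch(part, pattern) for part in parts)
        for pattern in patterns
    )


def filter_inventory(paths, patterns):
    """Drop ignored paths, keeping the original order of the rest."""
    return [p for p in paths if not is_ignored(p, patterns)]
```

+
+Matching on components means a single =node_modules= entry excludes that
+directory at any depth without listing each location.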
+
+** TODO Generated project facts snapshot
+
+Maintain a compact =project-facts= file for high-value context.
+
+Problem:
+
+- Important facts exist, but are scattered across README, notes, sessions,
+  design docs, Makefile, and task files.
+- Each new agent session repeats discovery work.
+
+Suggested shape:
+
+- Generate or maintain =.ai/project-facts.org= or =.ai/project-facts.json=.
+- Include:
+  - project purpose;
+  - install modes;
+  - active language bundles;
+  - canonical template source;
+  - active migrations;
+  - recent durable decisions;
+  - key commands;
+  - known hazards;
+  - remote location;
+  - current open design direction.
+
+Why it helps:
+
+- Gives agents high-signal context immediately.
+- Reduces repeated README/Makefile/design-doc scanning.
+- Helps small/local models perform better with limited context.
+
+** TODO Workflow test harness
+
+Add lightweight tests for workflow documentation integrity.
+
+Problem:
+
+- Many workflows are prose specs.
+- Prose can drift: triggers can disappear from the index, referenced scripts can
+  be renamed, expected output files can change, plugin ownership can be unclear.
+
+Suggested shape:
+
+- Add tests that verify:
+  - every workflow file is indexed or classified as a plugin;
+  - every indexed workflow exists;
+  - every referenced script path exists;
+  - every source plugin maps to a parent workflow;
+  - required sections exist;
+  - workflow names and triggers are distinct enough to route unambiguously.
+
+Why it helps:
+
+- Catches documentation drift before runtime.
+- Makes workflow changes safer.
+- Increases agent trust in the routing layer.
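+
+As one example, the "referenced script path exists" check might be sketched as
+follows. It assumes workflows cite scripts in Org verbatim markup under
+=.ai/scripts/=; the regular expression and function names are illustrative.
+

```python
import re

# Matches Org verbatim spans such as =.ai/scripts/name.sh= (assumed citation style).
SCRIPT_REF = re.compile(r"=(\.ai/scripts/[^=\s]+)=")


def referenced_scripts(workflow_text: str) -> list:
    """Return the sorted, de-duplicated script paths cited in a workflow."""
    return sorted(set(SCRIPT_REF.findall(workflow_text)))


def missing_scripts(workflow_text: str, existing_paths) -> list:
    """Return cited script paths that are not present in `existing_paths`."""
    return [p for p in referenced_scripts(workflow_text) if p not in existing_paths]
```

+
+A test would run =missing_scripts= over every workflow file and assert that the
+result is empty.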
+
+** TODO Normalize script interfaces
+
+Standardize helper script command-line conventions.
+
+Problem:
+
+- Scripts are useful, but their interfaces vary.
+- Agents compose scripts more safely when flags and exit codes are predictable.
+
+Suggested shape:
+
+- For every script where it makes sense, support:
+  - =--help=;
+  - =--check= or dry-run mode;
+  - =--json= for structured agent consumption;
+  - stable exit codes;
+  - no-op success when there is nothing to do;
+  - clear, human-readable failure messages on stderr.
+- Document the shared convention once.
+
+Why it helps:
+
+- Improves automation reliability.
+- Reduces brittle text parsing.
+- Lets agents gather facts with lower token usage by reading JSON summaries.
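+
+For scripts written in Python, the convention could be sketched with the
+standard =argparse= module, which supplies =--help= automatically and exits
+with status 2 on usage errors. The JSON field names and the specific exit code
+values are assumptions, and the work list is a placeholder.
+

```python
import argparse
import json

EXIT_OK = 0       # success, including "nothing to do"
EXIT_FAILURE = 1  # assumed value for operational failures


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Example helper following the shared convention.")
    parser.add_argument("--check", action="store_true",
                        help="report what would change without changing anything")
    parser.add_argument("--json", action="store_true",
                        help="emit a machine-readable summary on stdout")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    pending = []  # placeholder: a real script would compute its pending work here
    summary = {
        "mode": "check" if args.check else "apply",
        "pending": pending,
        "changed": 0,
    }
    if args.json:
        print(json.dumps(summary))
    elif not pending:
        print("nothing to do")
    return EXIT_OK
```

+
+A script file would finish with =sys.exit(main())= so the return value becomes
+the process exit code.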
+
+** TODO User-facing command simplification
+
+Add a few high-level commands for common operator intent.
+
+Problem:
+
+- The Makefile is capable but broad.
+- A user or agent may need to remember whether to run =install=, =audit=,
+  =doctor=, =install-ai=, =install-lang=, or =catchup-machine=.
+
+Suggested shape:
+
+- Add or polish high-level targets such as:
+  - =make status=: summarize install state, dirty state, audit status, and open
+    inbox/task counts;
+  - =make sync=: safe machine/project sync path;
+  - =make health=: doctor + lint + relevant tests;
+  - =make bootstrap-project PROJECT=...=: install =.ai/= plus optional language
+    bundle in one guided path.
+
+Why it helps:
+
+- Reduces user friction.
+- Makes common workflows memorable.
+- Gives agents safer entry points than assembling many lower-level commands.
+
+** TODO Durable decision log for rulesets itself
+
+Promote durable project decisions into a concise decision log.
+
+Problem:
+
+- Some important decisions live in sessions, inbox files, or design notes.
+- Session archives are good history but not the best place for current truth.
+
+Suggested shape:
+
+- Add =docs/decisions/= or =docs/adr/= for rulesets itself.
+- Keep entries short:
+  - context;
+  - decision;
+  - consequences;
+  - supersedes/superseded-by links.
+- Link design docs and session summaries as supporting material.
+
+Why it helps:
+
+- Makes the current architecture easier to understand.
+- Reduces need to reread long historical sessions.
+- Helps agents avoid reopening settled questions.
+
+** TODO Local/offline model profile support
+
+Encode model profiles and capability assumptions for hosted and local runtimes.
+
+Problem:
+
+- The generic runtime spec identifies local/offline use as a goal.
+- Different agents have different context windows, tool support, speed, and
+  reliability.
+- Workflows do not yet adapt to those differences.
+
+Suggested shape:
+
+- Add runtime/model profiles such as:
+  - hosted high-context coding agent;
+  - hosted low-cost/fast agent;
+  - local 30B coding model;
+  - local 8B fallback.
+- For each profile, specify:
+  - context budget;
+  - preferred summary tiers;
+  - tool assumptions;
+  - maximum recommended workflow depth;
+  - when to ask for human confirmation;
+  - when to avoid huge file reads.
+
+Why it helps:
+
+- Makes offline operation practical, not just possible.
+- Helps smaller models avoid context overload.
+- Lets workflows degrade gracefully by capability.
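+
+A profile table might be encoded as below. Every number is an illustrative
+placeholder rather than a measured limit, and the class, field, and profile
+names are hypothetical.
+

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeProfile:
    name: str
    context_budget: int   # tokens the workflow layer may assume (placeholder)
    summary_tiers: tuple  # tiers to load by default, in order
    max_file_read: int    # largest single file read, in tokens (placeholder)


PROFILES = {
    "hosted-high-context": RuntimeProfile(
        "hosted-high-context", 200_000, ("Summary", "Execution", "Reference"), 50_000
    ),
    "local-8b-fallback": RuntimeProfile(
        "local-8b-fallback", 8_000, ("Summary",), 2_000
    ),
}


def plan_read(profile: RuntimeProfile, file_tokens: int) -> str:
    """Return "full" when the file fits the profile's read limit, else "summary-only"."""
    return "full" if file_tokens <= profile.max_file_read else "summary-only"
```

+
+Workflows could consult =plan_read= before opening a long file, so a small
+local model falls back to the summary tier instead of overflowing its context.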
+
+* Likely Highest-Impact First Steps
+
+1. Generate a compact catalog for workflows, skills, commands, rules, hooks, and
+   scripts.
+2. Add token-tiered summaries to the highest-traffic workflow/rule files.
+3. Replace singleton live session state with per-agent live session files.
+4. Clarify and enforce the canonical =.ai/= template source.
+5. Add a generated project facts snapshot for fast agent orientation.