3 files changed, 448 insertions, 80 deletions
diff --git a/docs/design/2026-05-28-rulesets-enhancement-backlog.org b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
new file mode 100644
index 0000000..577edb6
--- /dev/null
+++ b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
@@ -0,0 +1,426 @@
+#+TITLE: Suggested rulesets enhancements
+#+DATE: 2026-05-28
+#+SOURCE: Codex review of /home/cjennings/code/rulesets
+
+* Purpose
+
+This file captures improvement ideas from a broad review of the =rulesets=
+project. The goals are:
+
+- make agents more efficient;
+- increase agent knowledge and effectiveness;
+- reduce token usage;
+- improve user experience;
+- make the system easier to operate across machines, projects, and runtimes.
+
+* Enhancement Backlog
+
+** TODO Runtime-neutral core
+
+Complete the migration from Claude-Code-specific structure to a generic agent
+runtime contract.
+
+Current state:
+
+- The reusable core is already mostly runtime-neutral: =.ai/protocols.org=,
+  =.ai/workflows/=, =.ai/scripts/=, =inbox/=, cross-agent comms, session
+  archives, and task workflows are conceptually usable by any capable agent.
+- The install paths, language bundles, launcher, hook wiring, and naming are
+  still Claude-specific: =~/.claude/=, =.claude/=, =CLAUDE.md=, Claude hook
+  payloads, and the =claude= binary.
+
+Suggested shape:
+
+- Define a small runtime contract: where rules live, where workflows live, how
+  live session state is named, how startup instructions are found, and which
+  capabilities a runtime adapter may expose.
+- Treat Claude Code as one adapter, not the project identity.
+- Add adapter directories or manifests for Claude, Codex and other hosted
+  OpenAI-compatible agents, and local OpenAI-compatible agents.
+
+Why it helps:
+
+- Makes the system usable offline with local models after setup.
+- Reduces vendor lock-in while preserving the high-value =.ai/= workflow layer.
+- Lets multiple agents share project conventions without rewriting the project
+  memory and workflows for each tool.
+
+** TODO Per-agent live session files
+
+Replace the singleton live session file =.ai/session-context.org= with
+agent-scoped session context files.
+
+Problem:
+
+- The current live file is a single shared path.
+- If two agents operate in one project at the same time, they can overwrite or
+  confuse each other's active session state.
+- This matters more as the project moves toward generic runtime support.
+
+Suggested shape:
+
+- Use a directory such as =.ai/live-sessions/=.
+- Name live files with runtime, host, and timestamp (plus pid/session id when
+  needed for uniqueness), e.g. =codex-strix-2026-05-28-0830.org=.
+- At wrap-up, move the live file into =.ai/sessions/= using the existing archive
+  naming convention.
+- Keep a lightweight pointer file only if needed, e.g. =.ai/current-session= for
+  single-agent clients.
+
+Why it helps:
+
+- Enables concurrent agents safely.
+- Makes crash recovery more precise.
+- Prevents one runtime's live context from polluting another runtime's session.
+
+** TODO Workflow routing index compression
+
+Convert =.ai/workflows/INDEX.org= into a compact machine-readable routing
+manifest, while keeping prose workflow documentation separate.
+
+Problem:
+
+- The index is useful, but long.
+- Agents need only a small routing table most of the time: trigger phrase,
+  workflow file, purpose, and plugin ownership.
+- Reading prose-heavy routing text costs tokens before the agent knows which
+  workflow matters.
+
+Suggested shape:
+
+- Add =.ai/workflows/catalog.json= or =catalog.edn= with entries:
+  - =id=
+  - =file=
+  - =purpose=
+  - =triggers=
+  - =source_plugins=
+  - =loads=
+  - =token_tier=
+- Generate the human =INDEX.org= from the manifest, or generate the manifest
+  from a canonical structured block in each workflow.
+
+Why it helps:
+
+- Faster workflow routing.
+- Lower context load at startup.
+- Easier drift detection: missing files, stale triggers, orphan plugins, and
+  duplicate triggers can be checked mechanically.
+
+** TODO Skill, command, rule, hook, and workflow catalog
+
+Generate a top-level machine-readable catalog of all agent-facing artifacts.
+
+Problem:
+
+- Agents currently infer system shape from directory scans and long files.
+- Important capabilities are spread across top-level skill directories,
+  =.claude/commands/=, =claude-rules/=, =hooks/=, =scripts/=, =languages/=,
+  =teams/=, =mcp/=, and =.ai/workflows/=.
+- This makes first-pass orientation expensive.
+
+Suggested shape:
+
+- Generate =catalog.json= at repo root.
+- Include each artifact's:
+  - kind: skill, command, rule, hook, script, workflow, language bundle, team
+    overlay, MCP server;
+  - name;
+  - summary;
+  - trigger or invocation;
+  - source path;
+  - install target;
+  - dependencies;
+  - whether it is user-invoked, model-triggered, startup-triggered, or internal.
+
+Why it helps:
+
+- Agents can load one compact file before deciding what else to read.
+- Improves routing accuracy.
+- Makes docs, install checks, and audit output consistent.
+- Reduces repeated expensive exploration in every session.
+
+** TODO Token-tiered workflow and rule files
+
+Split long workflow and rule documents into explicit token tiers.
+
+Problem:
+
+- Some files serve multiple audiences: routing, quick execution, deep reference,
+  and historical rationale.
+- Agents often need the execution steps but not the whole rationale.
+- Long documents are good for maintainability but expensive in active context.
+
+Suggested shape:
+
+- Standardize top-level sections:
+  - =Summary= or =Quick Contract=: one-screen purpose and outputs.
+  - =Execution=: steps an agent must follow.
+  - =Reference=: examples, edge cases, rationale, old decisions.
+  - =History= or =Design Notes=: durable context not needed every run.
+- Teach startup/routing to read only =Summary= first, then =Execution= only for
+  the selected workflow.
+- Keep long rationale available but opt-in.
+
+Why it helps:
+
+- Cuts token usage without deleting useful knowledge.
+- Makes behavior more predictable because every workflow exposes its contract
+  in the same place.
+- Helps smaller/local models follow the system with less context pressure.
+
+** TODO Generated install and audit manifest
+
+Make installable artifacts explicit in a manifest instead of inferred primarily
+from directory globs and Makefile variables.
+
+Problem:
+
+- The Makefile and scripts infer many behaviors from filesystem shape.
+- Globs are convenient but hide intent: default hooks vs opt-in hooks, user
+  commands vs skills, project-owned files vs rulesets-owned files.
+- Drift checks duplicate some of this logic.
+
+Suggested shape:
+
+- Add =install-manifest.json= or fold install targets into the top-level
+  =catalog.json=.
+- For each installed artifact, specify:
+  - source;
+  - target;
+  - install mode: symlink, copy, seed-only, generated, ignored;
+  - overwrite policy;
+  - ownership: rulesets-owned, project-owned, machine-owned;
+  - check policy.
+
+Why it helps:
+
+- Reduces Makefile complexity.
+- Makes =doctor= and =audit= simpler and more reliable.
+- Documents ownership rules in data instead of prose plus shell logic.
+
+** TODO Deduplicate =.ai/= template source
+
+Clarify the canonical source for =.ai/= template files and reduce duplication
+between repo-local =.ai/= and =claude-templates/.ai/=.
+
+Problem:
+
+- The repo contains both live project =.ai/= files and template
+  =claude-templates/.ai/= files.
+- Much of the content appears duplicated.
+- Agents and humans have to know which one is canonical for a given edit.
+
+Suggested shape:
+
+- Choose one canonical template source.
+- Treat project-local =.ai/= as this repo's own working copy, not the template
+  source, or vice versa.
+- Add a manifest/doctor check that says which paths are generated, copied, or
+  project-owned.
+- Consider excluding live session state from template sync completely.
+
+Why it helps:
+
+- Prevents edits landing in the wrong copy.
+- Reduces files agents must inspect.
+- Makes audit output easier to trust.
+
+** TODO Tighten generated/cache exclusions
+
+Make default inventory, audit, and review commands ignore generated/vendor/cache
+content more aggressively.
+
+Problem:
+
+- Directory scans can include =node_modules=, =__pycache__=, =.pytest_cache=,
+  package locks, generated OAuth artifacts, and test caches.
+- Even when ignored by git, these files are visible to naïve filesystem reads.
+
+Suggested shape:
+
+- Add a shared ignore file for agent inventory, e.g. =.aiignore= or
+  =rulesets-ignore.json=.
+- Teach scripts and instructions to use it when summarizing or reviewing.
+- Make the lockfile policy explicit: ignore lockfiles that only back a local
+  skill dependency cache; track them where reproducibility matters.
+
+Why it helps:
+
+- Reduces token waste during exploration.
+- Prevents vendor files from distorting project summaries.
+- Makes agent review output focus on authored source.
+
+** TODO Generated project facts snapshot
+
+Maintain a compact =project-facts= file for high-value context.
+
+Problem:
+
+- Important facts exist, but are scattered across README, notes, sessions,
+  design docs, Makefile, and task files.
+- Each new agent session repeats discovery work.
+
+Suggested shape:
+
+- Generate or maintain =.ai/project-facts.org= or =.ai/project-facts.json=.
+- Include:
+  - project purpose;
+  - install modes;
+  - active language bundles;
+  - canonical template source;
+  - active migrations;
+  - recent durable decisions;
+  - key commands;
+  - known hazards;
+  - remote location;
+  - current open design direction.
+
+Why it helps:
+
+- Gives agents high-signal context immediately.
+- Reduces repeated README/Makefile/design-doc scanning.
+- Helps small/local models perform better with limited context.
+
+** TODO Workflow test harness
+
+Add lightweight tests for workflow documentation integrity.
+
+Problem:
+
+- Many workflows are prose specs.
+- Prose can drift: triggers can disappear from the index, referenced scripts can
+  be renamed, expected output files can change, plugin ownership can be unclear.
+
+Suggested shape:
+
+- Add tests that verify:
+  - every workflow file is indexed or classified as a plugin;
+  - every indexed workflow exists;
+  - every referenced script path exists;
+  - every source plugin maps to a parent workflow;
+  - required sections exist;
+  - workflow names and triggers are unique enough to route.
+
+Why it helps:
+
+- Catches documentation drift before runtime.
+- Makes workflow changes safer.
+- Increases agent trust in the routing layer.
+
+** TODO Normalize script interfaces
+
+Standardize helper script command-line conventions.
+
+Problem:
+
+- Scripts are useful but vary by interface.
+- Agents compose scripts more safely when flags and exit codes are predictable.
+
+Suggested shape:
+
+- For every script where it makes sense, support:
+  - =--help=;
+  - =--check= or dry-run mode;
+  - =--json= for structured agent consumption;
+  - stable exit codes;
+  - no-op success when there is nothing to do;
+  - clear stderr for human-readable failures.
+- Document the shared convention once.
+
+Why it helps:
+
+- Improves automation reliability.
+- Reduces brittle text parsing.
+- Lets agents gather facts with lower token usage by reading JSON summaries.
+
+** TODO User-facing command simplification
+
+Add a few high-level commands for common operator intent.
+
+Problem:
+
+- The Makefile is capable but broad.
+- A user or agent may need to remember whether to run =install=, =audit=,
+  =doctor=, =install-ai=, =install-lang=, or =catchup-machine=.
+
+Suggested shape:
+
+- Add or polish high-level targets such as:
+  - =make status=: summarize install state, dirty state, audit status, and open
+    inbox/task counts;
+  - =make sync=: safe machine/project sync path;
+  - =make health=: doctor + lint + relevant tests;
+  - =make bootstrap-project PROJECT=...=: install =.ai/= plus optional language
+    bundle in one guided path.
+
+Why it helps:
+
+- Reduces user friction.
+- Makes common workflows memorable.
+- Gives agents safer entry points than assembling many lower-level commands.
+
+** TODO Durable decision log for rulesets itself
+
+Promote durable project decisions into a concise decision log.
+
+Problem:
+
+- Some important decisions live in sessions, inbox files, or design notes.
+- Session archives are good history but not the best place for current truth.
+
+Suggested shape:
+
+- Add =docs/decisions/= or =docs/adr/= for rulesets itself.
+- Keep entries short:
+  - context;
+  - decision;
+  - consequences;
+  - supersedes/superseded-by links.
+- Link design docs and session summaries as supporting material.
+
+Why it helps:
+
+- Makes the current architecture easier to understand.
+- Reduces need to reread long historical sessions.
+- Helps agents avoid reopening settled questions.
+
+** TODO Local/offline model profile support
+
+Encode model profiles and capability assumptions for hosted and local runtimes.
+
+Problem:
+
+- The generic runtime spec identifies local/offline use as a goal.
+- Different agents have different context windows, tool support, speed, and
+  reliability.
+- Workflows do not yet adapt to those differences.
+
+Suggested shape:
+
+- Add runtime/model profiles such as:
+  - hosted high-context coding agent;
+  - hosted low-cost/fast agent;
+  - local 30B coding model;
+  - local 8B fallback.
+- For each profile, specify:
+  - context budget;
+  - preferred summary tiers;
+  - tool assumptions;
+  - maximum recommended workflow depth;
+  - when to ask for human confirmation;
+  - when to avoid huge file reads.
+
+Why it helps:
+
+- Makes offline operation practical, not just possible.
+- Helps smaller models avoid context overload.
+- Lets workflows degrade gracefully by capability.
+
+* Likely Highest-Impact First Steps
+
+1. Generate a compact catalog for workflows, skills, commands, rules, hooks, and
+   scripts.
+2. Add token-tiered summaries to the highest-traffic workflow/rule files.
+3. Replace singleton live session state with per-agent live session files.
+4. Clarify and enforce the canonical =.ai/= template source.
+5. Add a generated project facts snapshot for fast agent orientation.
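Review sketch for the drift checks proposed under "Workflow routing index compression" and "Workflow test harness" above. This is a minimal illustration, not the project's implementation. It assumes =catalog.json= is a JSON list of objects carrying the =id=, =file=, and =triggers= fields named in the backlog, that workflow files are =*.org= files sitting next to =INDEX.org=, and that triggers are compared case-insensitively. The function name =check_catalog= is hypothetical, and plugin classification (=source_plugins=) is left out.

```python
import json
from collections import Counter
from pathlib import Path


def check_catalog(catalog_path: Path, workflows_dir: Path) -> list[str]:
    """Return human-readable drift problems; an empty list means clean."""
    # Assumption: the catalog is a JSON list of {"id", "file", "triggers"} objects.
    entries = json.loads(catalog_path.read_text(encoding="utf-8"))
    problems: list[str] = []

    # Check 1: every indexed workflow must exist on disk.
    indexed: set[str] = set()
    for entry in entries:
        indexed.add(entry["file"])
        if not (workflows_dir / entry["file"]).is_file():
            problems.append(f"missing file: {entry['file']} (id={entry['id']})")

    # Check 2: every workflow file on disk must be indexed.
    # INDEX.org is the human index itself, so it is skipped.
    for path in sorted(workflows_dir.glob("*.org")):
        if path.name != "INDEX.org" and path.name not in indexed:
            problems.append(f"unindexed workflow: {path.name}")

    # Check 3: a trigger phrase should route to exactly one workflow.
    counts = Counter(
        trigger.strip().lower()
        for entry in entries
        for trigger in entry.get("triggers", [])
    )
    for trigger, count in sorted(counts.items()):
        if count > 1:
            problems.append(f"duplicate trigger: {trigger!r} used {count} times")

    return problems
```

A =make health= or =doctor= style target could call this and exit non-zero when the returned list is non-empty, which would keep the check usable by both humans and agents.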
diff --git a/inbox/PROCESSED-2026-05-28-0117-from-.emacs.d-suggestion-open-tasks-hybrid-friction-cascade.org b/inbox/PROCESSED-2026-05-28-0117-from-.emacs.d-suggestion-open-tasks-hybrid-friction-cascade.org
deleted file mode 100644
index 43bb0ba..0000000
--- a/inbox/PROCESSED-2026-05-28-0117-from-.emacs.d-suggestion-open-tasks-hybrid-friction-cascade.org
+++ /dev/null
@@ -1,80 +0,0 @@
-#+TITLE: Suggestion — hybrid output shape for open-tasks.org Phase C Next Mode
-#+FROM: dotemacs (~/.emacs.d)
-#+DATE: 2026-05-28
-
-* Suggestion
-
-Update =claude-templates/.ai/workflows/open-tasks.org= Phase C → Next Mode to present two outputs together: the existing importance/urgency cascade recommendation up top, plus a 3-option friction filter underneath ranked by =:quick:solo:= > =:quick:= > =:solo:=. The user picks the row that matches their current state.
-
-The current cascade picks one task by importance/urgency (DOING > Active Reminders > Deadlines > priority order). That works when Craig is sharp and has time. It fails when the winner is a partially-blocked decision, hardware-dependent verify, or large refactor — the recommendation gets declined and the cascade falls through. The friction filter sidesteps that mismatch by surfacing tasks Craig can actually finish given a 20-minute window or a flagging-energy moment.
-
-Replacing the cascade entirely (which was the literal first proposal in the dotemacs session) drops the cascade's teeth on deadlines and priority — an =[#A]= deadline-tomorrow task that lacks =:quick:= or =:solo:= would go invisible. The hybrid preserves the cascade's importance/urgency forcing while adding the friction-based override path.
-
-* Why
-
-Two reinforcing effects, both surfaced in the dotemacs session 2026-05-28 (taskaudit + task-review):
-
-1. =:quick:= and =:solo:= tagging coverage is growing. The =task-review.org= workflow already assesses both tags on every review pass, so the friction filter degrades gracefully today (small set, growing over time) rather than catastrophically (empty list if no tags exist yet).
-
-2. The cascade and the friction filter answer different questions. Cascade = "what matters most?" Friction filter = "what shape of task can I actually finish right now?" The hybrid lets Craig pick the question to answer based on the state of his day.
-
-The proposed shape also pairs cleanly with the just-suggested "recommendation at item 1" convention (separate inbox drop, =2026-05-28-0014-from-.emacs.d-suggestion-numbered-options-recommendation-first.org=) — the friction block is exactly the kind of inline-numbered choice list that convention is meant for.
-
-* Proposed Phase C → Next Mode rewrite
-
-Replace the existing "Apply the prioritization cascade in order. Stop at the first matching step:" through the cascade's six steps with two sections:
-
-** Step 1 — Cascade recommendation (importance/urgency)
-
-Apply the prioritization cascade in order. Stop at the first matching step. This is the importance/urgency answer — what matters most right now.
-
-(Steps 1–6 unchanged from current workflow: DOING, Active Reminders, Deadline-driven, V2MOM, Simple priority, All done.)
-
-Present as a single task with the matched cascade step in the reason line.
-
-** Step 2 — Friction filter (effort + autonomy)
-
-Independently of the cascade, scan the open-task set for tasks tagged =:quick:= and =:solo:=. Build three ranked picks:
-
-- Quick + solo: the top task carrying both tags
-- Quick: the top task carrying =:quick:= only
-- Solo: the top task carrying =:solo:= only
-
-Within each row, pick the single task per the same-level tie-breakers already defined (blocks-other-work, recently-discussed, most-foundational). If a row has no tasks, omit it rather than padding.
-
-** Output shape
-
-#+begin_example
-Cascade recommendation (importance/urgency):
-- <task>  — [#priority], <cascade-reason>
-
-If you want lower friction instead:
-1. Quick + solo: <task> — [#priority], ~<est>
-2. Quick:       <task> — [#priority], ~<est>
-3. Solo:        <task> — [#priority]
-#+end_example
-
-The cascade recommendation reads as the "answer" — what Craig probably wants if his day is going well. The friction block reads as the override — what he picks when the cascade winner isn't actionable in this moment. The 3 rows are numbered with item 1 (=:quick:solo:=) as the strongest friction pick, per the numbered-options-with-recommendation-first convention.
-
-* Edge cases
-
-- If the friction filter has zero rows (no =:quick:= or =:solo:= tasks in the open set), omit the friction block entirely and present only the cascade recommendation.
-- If the cascade recommendation and the =:quick:solo:= row are the same task, dedupe — show it once at the top with both labels.
-- If Craig declines the cascade recommendation, drop to the friction block as the natural next prompt (rather than continuing through the cascade) — the friction filter IS the override path.
-
-* Update to "Common Mistakes" section
-
-Add:
-
-- *Skipping the friction block when the cascade recommendation isn't actionable.* The friction block is the override path; don't fall through to lower-cascade-tier tasks if a =:quick:= or =:solo:= task is what's actually needed.
-- *Recommending more than one task in the friction block.* One task per row (quick+solo / quick / solo), not a per-row shortlist.
-
-The existing "Recommending more than one task in next mode" mistake should be tightened to mean the cascade recommendation only — the friction block is by design a 3-row choice.
-
-* Adoption shape
-
-Additive to the existing workflow — cascade logic survives unchanged in Step 1; the friction filter is a new Step 2 with bounded scope. Every project's existing =todo.org= continues to work; tag coverage grows naturally via =task-review.org=.
-
-* Source
-
-Drafted from the dotemacs session 2026-05-28. Adjudicated decision: hybrid (option 1) chosen over literal-replace (option 2) and separate-trigger (option 3). Discussion captured in the session log.
diff --git a/todo.org b/todo.org
index 24bc9ae..b2a0995 100644
--- a/todo.org
+++ b/todo.org
@@ -1263,6 +1263,28 @@ Three wins: handoff is one paste not a re-read; forces specs to be implementable
 
 If the spec lacks an =Implementation phases= section, the step is the prompt to ask the author to add one before =Ready=.
 
+** TODO [#C] Triage Codex enhancement backlog :spec:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
+
+Codex left a 14-item enhancement backlog for rulesets, filed to [[file:docs/design/2026-05-28-rulesets-enhancement-backlog.org][docs/design/2026-05-28-rulesets-enhancement-backlog.org]]. The doc groups improvements under five goals (agent efficiency, knowledge/effectiveness, token reduction, user experience, easier operation across machines, projects, and runtimes) and recommends a five-item "highest-impact first" ordering.
+
+Two of the 14 already have homes and don't need re-filing:
+- Item #1 *Runtime-neutral core* = the broader arc captured in =Generic agent runtime support — Codex spec v0= (#16 above).
+- Item #2 *Per-agent live session files* = the immediate-correctness slice already filed as [[#16-1 Codex Phase 1][Codex Phase 1 — AI_AGENT_ID + session-context.d/<id>.org]] above ([#B]).
+
+Triage scope: walk the remaining 12 items, decide for each whether to (a) accept and file as its own TODO with priority + tags, (b) fold into an adjacent existing TODO, (c) defer / park as a someday-maybe, or (d) reject with rationale. The triage output is a one-page disposition table — same shape as a spec-response disposition list — so the decisions land in =git log= and aren't re-litigated.
+
+Items worth flagging up front:
+- Items #3, #4, #6 (workflow routing index, catalog of agent-facing artifacts, install manifest) all propose generated machine-readable indices. They compose well as a single =catalog.json= initiative.
+- Item #5 (token-tiered files) is a workflow-shape change that touches every existing workflow. Big surface area, real payoff.
+- Item #13 (decision log for rulesets itself) overlaps with the existing session-context + docs/design pattern. Worth deciding whether to formalize or keep ad-hoc.
+- Items #11 (script interface normalization) and #12 (user-facing command simplification) could land incrementally as scripts and Make targets are touched.
+
+No implementation before triage. The backlog doc is the canonical reference for the proposals; this task tracks the decision.
+
 ** DONE [#C] Iteration-history backfill for spec-review and spec-response :docs:followup:
 CLOSED: [2026-05-28 Thu]
 Source: org-drill inbox 2026-05-28.