aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--docs/design/2026-05-28-rulesets-enhancement-backlog.org30
-rw-r--r--todo.org91
2 files changed, 109 insertions, 12 deletions
diff --git a/docs/design/2026-05-28-rulesets-enhancement-backlog.org b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
index 577edb6..76880eb 100644
--- a/docs/design/2026-05-28-rulesets-enhancement-backlog.org
+++ b/docs/design/2026-05-28-rulesets-enhancement-backlog.org
@@ -13,6 +13,36 @@ project. The goals are:
- improve user experience;
- make the system easier to operate across machines, projects, and runtimes.
+* 2026-05-28 Triage Dispositions
+
+Each item below was triaged interactively with Craig on 2026-05-28. Disposition recorded here so the decision survives in git.
+
+** Accept (filed as TODOs)
+
+- *Item #2 Per-agent live session files* — filed [#B] :feature: as "Codex Phase 1 — AI_AGENT_ID + session-context.d/<id>.org" earlier on 2026-05-28. Multi-agent is plausible; the singleton session-context.org races for real. Strong yes.
+- *Item #8 Tighten generated/cache exclusions (=.aiignore=)* — small surface area, real payoff for filesystem scans. Filed [#C] :chore:.
+- *Item #10 Workflow test harness* — startup drift check catches index mismatches but not deeper integrity (broken script references, orphan plugins). A handful of bats/pytest tests catches this at CI time. Filed [#C] :feature:.
+
+** Pilot or scope-limit (filed as TODOs)
+
+- *Item #5 Token-tiered workflow and rule files* — pilot the Summary / Execution / Reference / History split on the largest workflows (=startup.org=, =triage-intake.org=). Don't templatize universally; shorter workflows don't need tiering. Filed [#C] :feature:.
+- *Item #7 Deduplicate =.ai/= template source* — reject the wholesale dedupe. The dual source (=claude-templates/= canonical + =.ai/= mirror) is a *feature*: canonical lives where other projects can rsync from it, mirror lives so rulesets-as-a-project has a working copy. Real pain is the sync-discipline overhead. Filed [#C] :feature:quick:solo: for "canonical/mirror drift detection via pre-commit hook or =make sync-check=".
+- *Item #12 User-facing command simplification* — accept =make status= only (composes audit + doctor + open-task count cleanly). Reject =make sync= / =make health= / =make bootstrap-project= — they either duplicate existing flows or wrap for the sake of wrapping. Filed [#C] :feature:quick:solo:.
+
+** Convention, not a tracked task
+
+- *Item #11 Normalize script interfaces* (=--help=, =--check=, =--json=, stable exit codes) — adopt opportunistically as scripts are touched. The pattern is partially in place already. Don't sweep; let convention spread organically. No TODO; document the convention if it firms up further.
+- *Item #13 Durable decision log (ADR layer)* — watch-and-wait. If rulesets keeps growing, formalize ADRs under =docs/decisions/=. If scope stabilizes, the existing session-archive + docs/design pattern is sufficient. No TODO until the growth signal is clear.
+
+** Reject
+
+- *Item #1 Broader runtime-neutral arc* — adapter manifests + runtime profiles is heavy infrastructure for a marginal payoff in the actual use case (Claude Code primary, Codex co-reviewer). The broader spec stays parked at #16 [#C]; only Phase 1 (item #2 above) gets prioritized. The full arc reopens when there's a real local-model workflow.
+- *Item #3 Workflow routing index compression (catalog.json/edn for INDEX.org)* — =INDEX.org= is already one-line-per-entry and token-efficient. A parallel machine-readable layer adds drift risk and a generator step for savings that are small at current size. Keep =INDEX.org= terse instead.
+- *Item #4 Universal =catalog.json= for skills/commands/rules/hooks/workflows* — looks valuable on paper, rots fast in practice. Every artifact change requires the catalog updated; prose docstrings already serve the same purpose; Claude Code's harness handles skill discovery natively. Maintenance burden far exceeds the "first-pass orientation" savings.
+- *Item #6 Generated install and audit manifest (=install-manifest.json=)* — the current Makefile + =install-*.sh= pattern is shell-based but refined over many sessions and works correctly. A data layer adds indirection without changing what executes. Better docs in the existing scripts solve the "globs hide intent" complaint at lower cost.
+- *Item #9 Generated project facts snapshot (=.ai/project-facts.org=)* — duplicates =notes.org='s *Project-Specific Context* role, which is already loaded at every session start. Invites drift between two parallel sources. Keep =notes.org= terse and high-signal.
+- *Item #14 Local/offline model profile support* — premature. No actual local-model workflow yet. Building profile infrastructure ahead of a real use case is speculative. Reopens with item #1 when there's a concrete trigger.
+
* Enhancement Backlog
** TODO Runtime-neutral core
diff --git a/todo.org b/todo.org
index b2a0995..2e5357c 100644
--- a/todo.org
+++ b/todo.org
@@ -1263,27 +1263,94 @@ Three wins: handoff is one paste not a re-read; forces specs to be implementable
If the spec lacks an =Implementation phases= section, the step is the prompt to ask the author to add one before =Ready=.
-** TODO [#C] Triage Codex enhancement backlog :spec:
+** DONE [#C] Triage Codex enhancement backlog :spec:
+CLOSED: [2026-05-28 Thu]
:PROPERTIES:
:CREATED: [2026-05-28 Thu]
:LAST_REVIEWED: 2026-05-28
:END:
-Codex left a 14-item enhancement backlog for rulesets, filed to [[file:docs/design/2026-05-28-rulesets-enhancement-backlog.org][docs/design/2026-05-28-rulesets-enhancement-backlog.org]]. The doc groups improvements under four goals (agent efficiency, knowledge/effectiveness, token reduction, user experience) and recommends a five-item "highest-impact first" ordering.
+Triaged interactively 2026-05-28. Disposition table for all 14 items lives at [[file:docs/design/2026-05-28-rulesets-enhancement-backlog.org][2026-05-28-rulesets-enhancement-backlog.org]] under "Triage Dispositions": 3 accepted (filed below as TODOs), 3 pilot/scope-limited (filed below), 2 marked as conventions rather than tracked tasks, 6 rejected with rationale. Items #1 and #2 already had homes (#16 and the Phase-1 codex TODO).
+
+** TODO [#C] Add =.aiignore= for agent inventory exclusions :chore:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
+
+From the codex enhancement backlog (item #8). Filesystem scans by agents and helper scripts pick up =node_modules=, =__pycache__=, =.pytest_cache=, lockfiles, generated OAuth artifacts, and test caches, even when those are gitignored. Token waste during exploration and skewed project summaries.
+
+Scope: add a shared =.aiignore= file (or =rulesets-ignore.json= if a more structured format helps) listing default exclusions. Teach the scripts that walk the project (=audit.sh=, =diff-lang.sh=, =sync-language-bundle.sh=, future =catalog= work if any) to honor it. Document in =protocols.org= so agents know to consult it before naive recursive reads.
+
+Keep the lockfile policy explicit: ignored when a local skill dependency cache, tracked when reproducibility matters.
+
+** TODO [#C] Workflow test harness — drift + integrity tests :feature:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
+
+From the codex enhancement backlog (item #10). Startup's drift check catches index-vs-directory mismatches but not deeper integrity: a workflow that references a script that's been renamed, a plugin whose parent engine has been deleted, a required section missing from a newly-added workflow.
+
+Scope: add =scripts/tests/workflow-integrity.bats= (or pytest equivalent) verifying:
+
+- Every =.org= file in =.ai/workflows/= is either indexed in =INDEX.org= or classifiable as a source plugin under an indexed engine.
+- Every indexed workflow file actually exists.
+- Every =file:= or shell-command reference inside a workflow to a script under =.ai/scripts/= or =scripts/= resolves to an existing file.
+- Every source plugin maps to a parent workflow that exists and is indexed.
+- Required sections (Overview, When to Use, the workflow's main phases) are present in each workflow.
+- Workflow trigger phrases are unique enough to route — no two workflows claim the same exact trigger.
+
+Wire into =make test=. Run on the canonical =claude-templates/.ai/workflows/= as the source of truth.
+
+** TODO [#C] Token-tier pilot on largest workflows :feature:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
+
+From the codex enhancement backlog (item #5), scope-limited to a pilot rather than a universal template change.
+
+Apply a standardized section structure to the largest workflow files first — =startup.org= and =triage-intake.org= are the prime candidates. Sections:
+
+- *Summary* / *Quick Contract* — one-screen purpose and outputs.
+- *Execution* — the steps an agent must follow.
+- *Reference* — examples, edge cases, rationale, old decisions.
+- *History* / *Design Notes* — durable context not needed every run.
+
+Teach startup/routing to read =Summary= only at routing time, then =Execution= only for the selected workflow. Other sections become opt-in.
+
+After the pilot, evaluate: did the savings show up in real session token use? Did the structure constrain the workflow expressiveness too much? If yes to savings and no to constraint, expand to the next-largest workflows. If not, document why and stop. Don't templatize universally — shorter workflows don't need tiering.
+
+** TODO [#C] Canonical/mirror drift detection via pre-commit hook or =make sync-check= :feature:quick:solo:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
+
+From the codex enhancement backlog (item #7), reframed: don't dedupe the dual source — the canonical-in-=claude-templates/= + mirror-in-=.ai/= pattern is a feature (other projects rsync from the canonical; the mirror lets rulesets-as-a-project have a working copy). The real pain is sync-discipline overhead — every workflow edit needs both copies updated, and forgetting one leaves the next startup's rsync to surface the drift.
+
+Scope: write a small =scripts/sync-check.sh= (or fold into the existing Makefile) that diffs =claude-templates/.ai/workflows/= against =.ai/workflows/=, exits non-zero on drift. Wire as a pre-commit hook (=githooks/pre-commit= or equivalent) so the discipline is enforced before publish, not at the next startup. =make sync-check= as a manual entry point.
+
+Verification: introduce a deliberate diff, commit, hook should block. Restore parity, hook should pass.
+
+** TODO [#C] Add =make status= — compose audit + doctor + open-task count :feature:quick:solo:
+:PROPERTIES:
+:CREATED: [2026-05-28 Thu]
+:LAST_REVIEWED: 2026-05-28
+:END:
-Two of the 14 already have homes and don't need re-filing:
-- Item #1 *Runtime-neutral core* = the broader arc captured in =Generic agent runtime support — Codex spec v0= (#16 above).
-- Item #2 *Per-agent live session files* = the immediate-correctness slice already filed as [[#16-1 Codex Phase 1][Codex Phase 1 — AI_AGENT_ID + session-context.d/<id>.org]] above ([#B]).
+From the codex enhancement backlog (item #12), scope-limited: =make status= only. Reject the rest of #12 (=make sync= duplicates the existing sync flow; =make health= wraps existing checks without adding signal; =make bootstrap-project= duplicates =install-ai= + =install-lang=).
-Triage scope: walk the remaining 12 items, decide for each whether to (a) accept and file as its own TODO with priority + tags, (b) fold into an adjacent existing TODO, (c) defer / park as a someday-maybe, or (d) reject with rationale. The triage output is a one-page disposition table — same shape as a spec-response disposition list — so the decisions land in =git log= and aren't re-litigated.
+Scope: one Makefile target that prints a compact summary of:
-Items worth flagging up front:
-- Items #3, #4, #6 (workflow routing index, catalog of agent-facing artifacts, install manifest) propose generated machine-readable indices. Compose well as a single =catalog.json= initiative.
-- Item #5 (token-tiered files) is a workflow-shape change that touches every existing workflow. Big surface area, real payoff.
-- Item #13 (decision log for rulesets itself) overlaps with the existing session-context + docs/design pattern. Worth deciding whether to formalize or keep ad-hoc.
-- Items #11 (script interface normalization) and #12 (user-facing command simplification) could land incrementally as scripts and Make targets are touched.
+- Install audit state (clean / drift, calling =make audit=).
+- Machine-global doctor state (calling =make doctor=).
+- Open-task count (top-level entries in =todo.org= under =* Rulesets Open Work=).
+- Inbox count (files in =inbox/= excluding =.gitkeep= and =PROCESSED-= prefixes).
+- Git working-tree status (clean / dirty, ahead/behind upstream).
-No implementation before triage. The backlog doc is the canonical reference for the proposals; this task tracks the decision.
+Output should be roughly 10 lines, scannable in one glance. Composes the existing checks; no new logic except the summary formatting.
** DONE [#C] Iteration-history backfill for spec-review and spec-response :docs:followup:
CLOSED: [2026-05-28 Thu]