author: Craig Jennings <c@cjennings.net>, 2026-06-12 02:24:01 -0500
committer: Craig Jennings <c@cjennings.net>, 2026-06-12 02:24:01 -0500
commit: 22e19c21e6aabe0319d4b09a862f4a3705c92509
tree: 5210f7ac7bba64b7b1b0b23f56d14f4dafefaf9a
parent: c6fd73441ef0b683abb859863dcd0d48377a4838
docs(spec): fold the Codex review into the agent-runtime spec
The review's top finding was that one Not-ready label hid an implementable slice. Status now splits by arc: Phase 1.5 helper instances are READY WITH CAVEATS (the three-ring gate and the manual drills are binding, and the ai-term.el work is a coordinated .emacs.d handoff with an exact artifact), while phases 2-5 stay NOT READY behind a decisions-required section and a Phase 5 reverification prerequisite that demotes the model table to a recommendation. The remaining findings hardened the slice: per-ring rollback actions including the half-propagated-sync case, the review's test inventory adopted as normative, a message contract for stale helper files, and explicit roster-unavailable behavior on unsupported platforms. All recommendations accepted except the document split, modified to a dual rubric in one document. The review file and dispositions table ride along.
-rw-r--r-- docs/design/2026-05-28-generic-agent-runtime-spec-review.org | 178
-rw-r--r-- docs/design/2026-05-28-generic-agent-runtime-spec.org | 156
-rw-r--r-- todo.org | 7
3 files changed, 308 insertions, 33 deletions
diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec-review.org b/docs/design/2026-05-28-generic-agent-runtime-spec-review.org
new file mode 100644
index 0000000..90d7030
--- /dev/null
+++ b/docs/design/2026-05-28-generic-agent-runtime-spec-review.org
@@ -0,0 +1,178 @@
+#+TITLE: Review: Generic Agent Runtime Support for rulesets
+#+AUTHOR: Codex
+#+DATE: 2026-06-12
+#+STARTUP: showall
+
+* Scope reviewed
+
+Reviewed the target spec at [[file:2026-05-28-generic-agent-runtime-spec.org][2026-05-28-generic-agent-runtime-spec.org]], the spec-review workflow, the current launcher/install/template implementation, existing tests, and existing task tracking.
+
+Code and docs read:
+
+- [[file:../../Makefile][Makefile]] — global install/deps targets still install Claude-only roots and the Claude launcher.
+- [[file:../../claude-templates/bin/ai][claude-templates/bin/ai]] — tmux launcher still hard-codes =CLAUDE_CMD=claude=, project detection via =.ai/protocols.org=, and one window per project name.
+- [[file:../../scripts/install-lang.sh][scripts/install-lang.sh]] and [[file:../../scripts/sync-language-bundle.sh][scripts/sync-language-bundle.sh]] — language bundle install/sync still writes =.claude/= and =CLAUDE.md=.
+- [[file:../../.ai/scripts/session-context-path][.ai/scripts/session-context-path]] and [[file:../../.ai/scripts/tests/session-context-path.bats][its bats tests]] — Phase 1 resolver exists and covers unset, empty, distinct, and sanitized =AI_AGENT_ID= values.
+- [[file:../../.ai/protocols.org][.ai/protocols.org]], [[file:../../.ai/workflows/startup.org][startup.org]], and [[file:../../.ai/workflows/wrap-it-up.org][wrap-it-up.org]] — protocols documents the agent-scoped path; startup/wrap-up resolve that path but do not yet implement roster-first helper branching or live-helper gates.
+- [[file:../../.ai/scripts/todo-cleanup.el][todo-cleanup.el]], [[file:../../.ai/scripts/lint-org.el][lint-org.el]], and [[file:../../.ai/scripts/wrap-org-table.el][wrap-org-table.el]] — =lint-org= and =wrap-org-table= already take =/tmp= backups; =todo-cleanup= does not.
+- [[file:../../todo.org][todo.org]] — existing Phase 1.5 helper task and broader generic-runtime parent task.
+
+I did not re-verify the time-sensitive Hugging Face model recommendations online during this review. Treat the local-model picks as stale-until-checked before Phase 5.
+
+* Implementation-readiness
+
+Rubric for the whole spec: =Not ready=.
+
+The spec is strong enough to implement the Phase 1.5 helper-instance slice if Craig accepts the caveats already captured in tracking: the synced-template rollout must stay gated, the live sandbox drills must pass, and the Emacs =ai-term.el= work must land as a coordinated cross-project change. The broader runtime-neutral refactor in phases 2-5 is not implementation-ready because it still has unresolved product choices and time-sensitive external assumptions.
+
+* Overall assessment
+
+The spec correctly identifies the current architecture: the reusable project core lives under =.ai/=, while the install surface, bundle layout, launcher, hooks, and user-facing docs remain Claude-specific. Phase 1 is already done in the repo: =session-context-path= is present, tested, and wired into startup/wrap-up path resolution.
+
+The helper-instance amendment is the most actionable part. It names the concurrency risk, defines a role contract, narrows helper writes, and gives concrete tests and manual drills. The main risk is rollout, not design: =startup.org= and =.ai/scripts/= sync broadly, so a partially validated helper branch could affect every project.
+
+The generic runtime arc is still a product decision package, not an implementation plan. It names plausible phases, but the open decisions determine file names, local runtime UX, adapter scope, and support burden.
+
+* High-priority findings
+
+** Split the ready helper slice from the not-ready runtime-neutral arc
+
+Blocking status: blocks =Ready= for the whole spec; does not block a scoped Phase 1.5 implementation if tracked as its own accepted slice.
+
+Why it matters: the spec now contains two different projects. Phase 1.5 solves a near-term same-runtime concurrency problem. Phases 2-5 rename and generalize the entire distribution. The current Status section says the spec is not implementation-ready, while the Phase 1.5 section is detailed enough for implementation. Without an explicit split, an implementer has to decide whether "start implementation" means helper support only or the full runtime-neutral refactor.
+
+What to change: in the spec Status or Recommended next step, state two rubrics separately:
+
+- Phase 1.5 helper instances: =Ready with caveats= once Craig accepts the pre-live gating and cross-project Emacs handoff.
+- Phases 2-5 generic runtime refactor: =Not ready= until the open decisions are answered and model/runtime assumptions are reverified.
+
+** Resolve the phase 2-5 product choices before implementation
+
+Blocking status: blocks =Ready= for phases 2-5.
+
+Why it matters: the spec still asks which generic instruction file to use, whether to standardize on =llama.cpp= or =ollama=, and which local agent CLI is first-supported. Those choices affect manifest schema, install paths, docs, doctor checks, runtime command templates, and test fixtures. If implementation starts now, those decisions will be made inside code.
+
+What to change: add a short "Decisions required before phases 2-5" section with accepted answers for:
+
+- generic instruction file strategy;
+- default local runtime manager/server;
+- first supported local editing CLI;
+- whether phase 2 should support only Claude + one local runtime or also Codex immediately;
+- compatibility behavior for existing =CLAUDE.md= and =.claude/= projects during the transition.
+
+** Reverify external runtime and model assumptions before the local-model phase
+
+Blocking status: blocks =Ready= for Phase 5, not for Phase 1.5.
+
+Why it matters: the local-model recommendations are inherently time-sensitive. Model availability, quant files, serving backends, GPU support, context behavior, and practical latency can change. The spec cites sources, but implementation of =rulesets doctor= and archsetup handoff should not bake in stale assumptions.
+
+What to change: make Phase 5 start with a research/verification task that records current model URLs, file sizes, license, backend support, smoke command, memory fit, and fallback behavior. Keep the spec's current model table as a recommendation, not as an implementation constant.
+
+* Medium-priority findings
+
+** Add the exact cross-project handoff artifact for ai-term.el
+
+Blocking status: not blocking for the rulesets side if the handoff is created before or during implementation.
+
+The spec correctly says =ai-term.el= lives in =~/.emacs.d= and is not a rulesets edit. That means the implementation plan should say exactly how the rulesets task hands off the required Emacs change: an inbox file, a linked task, or a commit in that repo. Otherwise the rulesets implementation can finish with shell helpers working while the F9 path remains unsafe.
+
+** Name the rollback point for template-wide helper rollout
+
+Blocking status: not blocking if the three-ring gate is accepted; it is a release-safety improvement.
+
+The pre-live gate is good: bats, sandbox, pilot, then template-wide release. Add the rollback action for each ring: remove =agent-roster= from the pilot project, revert the =startup.org= helper branch, or disable helper detection when =agent-roster= is absent. This matters because startup template sync has broad blast radius.
+
+* UX observations
+
+The helper UX is concrete enough for v1: launcher path, raw-launch safety net, helper opener, helper workflow, and wrap-up behavior are all named. The no-trigger-phrase decision for =helper-mode.org= is good because humans should not have to remember a workflow incantation when the launcher can route.
+
+For phases 2-5, the user mental model is not yet settled. A user will need to know whether they are installing a runtime, choosing a model profile, or creating project instructions. That should be decided before docs or commands ship.
+
+* Architecture observations
+
+The spec fits the current repo boundaries. =.ai/= remains core; runtime adapters can sit beside existing Claude-specific layout; the launcher and install scripts are the right integration points. The current implementation confirms the major refactor points: =Makefile=, =install-lang.sh=, =sync-language-bundle.sh=, and =claude-templates/bin/ai= all hard-code Claude assumptions today.
+
+The helper slice is intentionally smaller than the runtime manifest work and should stay that way. Do not introduce TOML manifests, local model service config, or bundle splitting while implementing =agent-roster= and helper startup/wrap-up.
+
+* Robustness and performance observations
+
+The helper data-integrity rules cover the important local lost-update shapes: scoped org edits, no helper memory writes, git mutation primary-only while concurrent, and log-before-write. The remaining robustness question is how stale helper files are surfaced without permanently blocking hygiene. The spec says this is a judgment call; implementation should make the message explicit and include file path, timestamp, and suggested actions.
+
+The =agent-roster= scan is cheap enough for startup on Linux, but it is Linux =/proc= specific. That is fine for v1 if the script reports a clear unsupported-platform result and startup treats "roster unavailable" as the no-op path described in the pre-live gate.
+
+* Test strategy recommendations
+
+Specific tests to add for Phase 1.5:
+
+- =agent-roster= returns alone when no other matching process has cwd under the project.
+- =agent-roster= excludes its own process ancestry.
+- =agent-roster= reports not-alone when a spawned sleeper or test helper has cwd at the project root or below it.
+- Startup helper branch is byte-identical to today's path when =agent-roster= is missing or reports alone.
+- Startup routes to =helper-mode.org= and skips pulls/rsync/inbox processing when =agent-roster= reports another live agent.
+- =ai --helper= assigns a sanitized helper id, exports =AI_AGENT_ID= and =AI_HELPER=, and uses the helper opener.
+- Primary and helper resolve distinct context paths.
+- Helper-originated inbox send includes the helper id in same-minute slug generation.
+- Wrap-up with live helpers pauses before hygiene/commit.
+- Orphaned-helper wrap-up runs the full closing path only when the roster reports alone.
+- =todo-cleanup.el= copies a =/tmp= backup before any mutating mode.
+
+Manual drills from the spec are necessary and should remain gates: live helper + primary scoped edit, primary wrap-up while helper is mid-task, orphaned helper closes the tree, and raw =claude= launch gets caught by startup.
+
+* Documentation and tooling recommendations
+
+For Phase 1.5, update:
+
+- =protocols.org= with a short pointer to =helper-mode.org= and the helper write tiers.
+- =startup.org= with the roster-first branch and the no-op guarantee when unavailable.
+- =wrap-it-up.org= with helper/primary wrap-up ordering and live-helper pause messages.
+- =INDEX.org= with =helper-mode.org= marked auto-routed, not user-triggered.
+- README only if the user-facing launcher gains =ai --helper= before broader runtime docs exist.
+
+For phases 2-5, defer broad README renames until the runtime choices are resolved.
+
+* Suggested spec edits
+
+- Add separate readiness labels for "Phase 1.5 helper instances" and "Phases 2-5 runtime-neutral refactor."
+- Add a "Decisions required before phases 2-5" section listing the instruction-file, local runtime, first CLI, and adapter-scope choices.
+- Add a Phase 5 prerequisite to reverify model/backend assumptions against current sources before implementation.
+- Add the exact =.emacs.d= handoff artifact for the =ai-term.el= helper path.
+- Add rollback actions for the bats/sandbox/pilot/template rollout rings.
+
+* Agreed decisions
+
+None newly agreed during this review. Existing decisions in the spec stand: primary keeps the singleton path for Phase 1.5, helpers use =helper-<rand4>= and =session-context.d/=, =helper-mode.org= is canonical, =agent-roster= is the shared detection primitive, and =ai-term.el= owns its own tmux naming/wiring on top of the shared roster.
+
+* Open questions
+
+- Are phases 2-5 still desired near-term, or should the spec be split so the helper-instance slice can ship independently and the runtime-neutral arc remains parked?
+- Which local runtime stack and editing CLI should become the first supported v1 target?
+- What exact handoff path should be used for the =~/.emacs.d= =ai-term.el= changes?
+
+* vNext candidates
+
+- Runtime manifests for Claude plus one local runtime.
+- Generic bundle layout split into common + runtime adapters.
+- Local-model doctor checks once archsetup owns install and model cache setup.
+- Codex adapter support after the first non-Claude runtime proves the manifest shape.
+
+* Implementation tasks (drop-in for todo.org)
+
+These tasks are already represented in [[file:../../todo.org][todo.org]] as parent tasks. If copied elsewhere, keep Phase 1.5 separate from the parked runtime-neutral arc.
+
+** TODO [#B] Helper instances — concurrent same-project Claude :feature:
+Implement Phase 1.5: =agent-roster=, =ai --helper=, =helper-mode.org=, startup/wrap-up helper branches, live-helper gates, and helper write-safety docs. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 1.5).
+
+** TODO [#C] Runtime manifests and generic install commands :feature:
+Resolve the phase 2 decisions, then add =runtimes/claude.toml=, one local runtime manifest, and =make install-runtime= while keeping =make install= Claude-compatible. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 2).
+
+** TODO [#C] Runtime-aware language bundles :feature:
+Split common language material from runtime-specific adapters and add at least elisp support for the first local runtime. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 3).
+
+** TODO [#D] Runtime-neutral user-facing docs and aliases :chore:
+After compatibility aliases exist, rename Claude-specific public docs and source directories where the behavior is actually runtime-neutral. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 4).
+
+** TODO [#D] Local model install handoff and doctor checks :feature:
+After current model/backend assumptions are reverified and archsetup owns install/cache setup, add doctor checks for server availability, model files, and smoke prompts. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 5).
+
+** TODO [#B] Generic agent runtime — test surface :test:
+Unit: launcher id/runtime selection, session-context path/archive names, roster detection, helper id sanitization. Integration: two fake runtimes or primary/helper sessions writing distinct contexts; install-lang/sync-language-bundle legacy compatibility. Manual: live helper scoped edit, corruption drill, orphaned-helper close, raw-launch safety net, and Emacs F9 helper path. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Test strategy and Phase 1.5 pre-live gating).
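
Editor's sketch (not part of the commit): the roster cases in the review's Phase 1.5 test list (alone, ancestry exclusion, not-alone, and an explicit unavailable result on platforms without a Linux =/proc=) can be illustrated roughly as below. This is a minimal Python illustration under stated assumptions, not the repository's =agent-roster= script. The function names, the =agent_names= match list, the result strings, and the route labels are all hypothetical; the real script's matching rules and output format are not shown in this diff.

```python
import os
import sys
from pathlib import Path


def ancestry(pid, proc=Path("/proc")):
    """Return pid plus its ancestor pids, following PPid in /proc/<pid>/status."""
    seen = set()
    while pid > 0 and pid not in seen:
        seen.add(pid)
        try:
            status = (proc / str(pid) / "status").read_text()
        except OSError:
            break  # process vanished or is unreadable: stop walking
        ppid = 0
        for line in status.splitlines():
            if line.startswith("PPid:"):
                ppid = int(line.split()[1])
                break
        pid = ppid
    return seen


def roster(project_root, agent_names=("claude",), proc=Path("/proc"),
           self_pid=None, platform=sys.platform):
    """Classify the project as 'alone', 'not-alone', or 'unavailable'.

    agent_names is a hypothetical match list; the real matching rule
    is an assumption of this sketch.
    """
    # Unsupported platform: say so explicitly, never a silent "alone".
    if not platform.startswith("linux") or not proc.is_dir():
        return "unavailable", []
    root = Path(project_root).resolve()
    skip = ancestry(self_pid or os.getpid(), proc)  # exclude our own chain
    others = []
    for entry in proc.iterdir():
        if not entry.name.isdigit() or int(entry.name) in skip:
            continue
        try:
            name = (entry / "comm").read_text().strip()
            cwd = Path(os.readlink(entry / "cwd"))
        except OSError:
            continue  # other users' processes, or a process that just exited
        # Match on cwd at the project root or anywhere below it.
        if name in agent_names and (cwd == root or root in cwd.parents):
            others.append(int(entry.name))
    return ("not-alone" if others else "alone"), sorted(others)


def startup_route(result):
    """Only a positive 'not-alone' leaves today's startup path.

    'alone', 'unavailable', and a missing script (modelled here as any
    other value) all map to the unchanged path.
    """
    return "helper-mode" if result == "not-alone" else "normal-startup"
```

The =proc=, =self_pid=, and =platform= parameters exist only so the sketch can be exercised against a fake process tree; a real script would presumably read the live =/proc= directly.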
diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec.org b/docs/design/2026-05-28-generic-agent-runtime-spec.org
index 40a97b4..0b37814 100644
--- a/docs/design/2026-05-28-generic-agent-runtime-spec.org
+++ b/docs/design/2026-05-28-generic-agent-runtime-spec.org
@@ -65,30 +65,38 @@ under-specified — spawning a second Claude in the same project to look things
up or update tasks safely — and a new Phase 1.5 sequences that slice ahead of
the runtime-neutral phases 2-6, which remain pending a go/no-go.
-*NOT IMPLEMENTATION-READY* (Craig, 2026-06-11, after the fourth design
-revision). The helper-instance design iterated four times in one evening;
-holding it open until the known gaps close. Readiness checklist — all of
-these before any build starts:
-
-- [X] Emacs launch surface designed (see the open-issue subsection in the
- helper section): every place a session can be born routes through, or is
- caught by, the deterministic path. /Closed 2026-06-12: mechanics verified
+*Readiness is split by arc* (per the 2026-06-12 Codex review's top finding —
+the spec contains two different projects and one label misled):
+
+- *Phase 1.5 — helper instances: READY WITH CAVEATS* (2026-06-12). The
+ caveats are binding, not advisory: the three-ring pre-live gate governs
+ every merge into synced template paths; the manual drills are gates, not
+ suggestions; and the =ai-term.el= work lands as a coordinated
+ cross-project handoff to =~/.emacs.d= (the exact artifact is named in
+ Phase 1.5), so the rulesets side isn't "done" while the F9 path is
+ still unsafe.
+- *Phases 2-5 — runtime-neutral refactor: NOT READY.* Blocked on the
+ /Decisions required before phases 2-5/ section under Open decisions, and
+ on Phase 5's reverification prerequisite (the local-model table is a
+ recommendation, not an implementation constant). Parked pending Craig's
+ go/no-go on the arc.
+
+The original readiness checklist, resolved:
+
+- [X] Emacs launch surface designed. /Closed 2026-06-12: mechanics verified
in ai-term.el's code, integration design written, the three open calls
confirmed by Craig (roster-only sharing, singleton primary,
helper-mode.org as canonical home)./
-- [ ] Pre-live test strategy agreed (see Test strategy): sandbox drills
- pass, and the rollout is gated so nothing reaches live projects via
- template sync until validated — startup.org edits propagate to every
- project on their next session, so "accidentally live everywhere" is the
- default failure mode, not an edge case. /The three-ring gating is
- written; "agreed" lands with the independent review below./
-- [X] A re-read of the whole helper section after the dust settles, since
- four same-day revisions usually leave a seam somewhere. /Done 2026-06-12:
- the coherence pass unified the churned subsections and verified the
- ai-term.el claims against code./
-- [ ] Independent spec review (the =spec-review= cycle, as the KB and
- consolidation specs got) comes back Ready or Ready-with-caveats, and its
- dispositions are folded in via =spec-response=.
+- [X] Pre-live test strategy agreed. /The review accepted the three-ring
+ gate as the release-safety mechanism and asked for per-ring rollback
+ actions — added to the gating section./
+- [X] A re-read of the whole helper section after the dust settles. /Done
+ 2026-06-12: the coherence pass unified the churned subsections and
+ verified the ai-term.el claims against code./
+- [X] Independent spec review. /Codex, 2026-06-12: Not-ready for the
+ combined spec, Phase 1.5 implementable as a scoped slice — which the
+ split rubric above now states directly. Dispositions folded in the same
+ day; see Review dispositions./
* Problem
@@ -423,7 +431,11 @@ Known limits, accepted for v1: an agent session not running as a local
process on this machine (a cloud session against the same checkout) is
invisible to the scan; and the match is on process cwd, so an agent started
from outside the project tree wouldn't be seen. Both are edge shapes the
-operator created deliberately and can manage manually.
+operator created deliberately and can manage manually. The scan is also
+Linux-=/proc=-specific: on an unsupported platform the script reports
+"roster unavailable" explicitly (never a silent "alone"), and startup
+treats that result as the no-op path from the pre-live gate — same behavior
+as the script being absent.
*** Spawn paths: deterministic launcher, startup safety net
@@ -534,7 +546,10 @@ every personal task and corruption has maximal blast radius.
stale file would block hygiene forever, so staleness is surfaced as a
judgment call — the file's own content and timestamps show whether the
helper is really gone — never silently skipped past and never silently
- honored indefinitely.
+ honored indefinitely. The surfaced message is contractual (review
+ finding): it names the file path, its timestamps, and the suggested
+ actions (treat as stale and proceed / wait / abort), so the judgment is
+ made on evidence rather than a bare "helper detected" warning.
2. /A new primary starting while a helper runs./ The previous primary may
wrap and exit while a helper keeps working; the next =ai= launch becomes
primary and runs full startup. The existing guards already do the right
@@ -625,7 +640,12 @@ What remains to design — the integration, not a new surface:
- The =emacs.md= live-reload discipline applies to the ai-term.el changes,
and the change lands in the =~/.emacs.d= project (its own repo and
session scope — a cross-project handoff from rulesets, not a rulesets
- edit).
+ edit). The handoff artifact is exact (review finding, 2026-06-12):
+ implementation step one sends an =inbox-send .emacs.d= handoff carrying
+ this subsection's integration contract plus the recommendations below,
+ and the rulesets task does not close until =.emacs.d= confirms its task
+ is filed or landed — otherwise the shell path ships safe while F9 stays
+ unsafe and nothing tracks the gap.
Recommendations for ai-term.el beyond the helper feature (Craig asked for
these 2026-06-12; they ride the same handoff):
@@ -799,6 +819,12 @@ Independent of the phases 2-6 go/no-go; same-runtime only.
** Phase 5: Local model install handoff
+- Prerequisite (review finding, 2026-06-12): a reverification task runs
+ first — record /current/ model URLs, file sizes, licenses, backend
+ support, a smoke command, memory fit, and fallback behavior against live
+ sources. The model table in the Introductory note is a recommendation
+ frozen at 2026-05-28, not an implementation constant; nothing in doctor
+ checks or the archsetup handoff bakes it in unverified.
- Send archsetup an inbox note requesting local model runtime support.
- After archsetup lands it, teach =rulesets doctor= to verify:
- =llama-server= or =ollama= installed.
@@ -832,24 +858,43 @@ three rings:
the self-ancestry exclusion against the test's own process chain. The
startup hook tested for the no-op guarantee: when =agent-roster= is
absent or reports alone, behavior is byte-identical to today.
+ /Rollback: revert the commit; nothing here touches synced paths yet./
2. /Sandbox ring./ A disposable project (its own git repo, never
template-synced back) runs the live drills before any real project sees
the feature: primary + helper concurrent edits on one org file; the
corruption drill (primary wrap-up pauses on a live helper); the
orphaned-helper drill (primary wraps first, helper closes the door,
tree ends clean); the raw-launch drill (helper started without the
- launcher gets caught by the startup roster); and an Emacs-surface drill
- once that design lands.
+ launcher gets caught by the startup roster); and the Emacs F9 drill
+ (helper spawned via ai-term once its handoff lands).
+ /Rollback: delete the sandbox project; no other surface was touched./
3. /Pilot ring./ The startup detection ships dormant-by-construction —
the hook is a no-op wherever =agent-roster= is missing, and the script
ships first to one pilot project only (copied into its
=.ai/project-scripts/=, which the sync never touches) before the
template-wide release puts it everywhere. Rulesets itself is the
natural pilot: it's where a broken sweep is noticed fastest.
+ /Rollback: delete =agent-roster= from the pilot's project-scripts; the
+ hook reverts to its no-op path on the next session./
+4. /Template-wide release./ The startup branch and the script land in the
+ synced template paths only after the pilot soaks.
+ /Rollback: revert the startup.org commit and remove the script from
+ =claude-templates/.ai/scripts/=; the next sync's =--delete= clears every
+ project's copy, and the no-op guarantee means a half-propagated state
+ (some projects synced, some not) is safe in both directions./
+
+Ring-1 test inventory (the review's list, normative): roster alone /
+ancestry-exclusion / not-alone-on-sleeper cases; startup no-op
+byte-identity when roster is missing or alone; startup routes to
+helper-mode and skips pulls/rsync/inbox when not alone; =ai --helper=
+assigns a sanitized id, exports both vars, uses the helper opener; primary
+and helper resolve distinct context paths; helper-originated =inbox-send=
+slugs carry the id; wrap-up pauses on live helpers before hygiene and
+commit; orphaned-helper close runs only when the roster reports alone;
+=todo-cleanup.el= takes a =/tmp= backup before any mutating mode.
Nothing merges past ring 1 into the synced template paths until ring 2's
-drills pass, and the spec's NOT-IMPLEMENTATION-READY marker clears only
-when all three rings are written into the implementation plan.
+drills pass.
* Open decisions
@@ -866,8 +911,57 @@ when all three rings are written into the implementation plan.
- Which local agent CLI should be the first supported offline editor:
=aider=, =opencode=, a simple custom wrapper, or something else?
+** Decisions required before phases 2-5 — added 2026-06-12 (review finding)
+
+These are the blocker subset of the open decisions above, plus two the
+review added. Phases 2-5 stay NOT READY until each has an accepted answer;
+deciding them inside code is the failure mode this section prevents.
+
+1. Generic instruction-file strategy (=AGENTS.md= / =AI.md= /
+ runtime-specific only).
+2. Default local runtime manager/server (=llama.cpp= only vs =ollama=
+ as the beginner default).
+3. First supported local editing CLI.
+4. Phase-2 adapter scope: Claude + one local runtime only, or Codex
+ support immediately.
+5. Compatibility behavior for existing =CLAUDE.md= / =.claude/= projects
+ during the transition.
+
* Recommended next step
-Start with Phase 1 only. The singleton session-context file is the immediate
-correctness issue for simultaneous agents, and it can be fixed without renaming
-the whole repository or disrupting current Claude installs.
+Updated 2026-06-12: implement Phase 1.5 under its READY-WITH-CAVEATS rubric
+(the helper task in todo.org carries the plan). Phases 2-5 stay parked until
+the decisions section above is answered and Craig calls the go/no-go on the
+arc. The original recommendation — start with Phase 1 only — is complete:
+Phase 1 shipped.
+
+* Review dispositions — 2026-06-12 Codex review
+
+Every recommendation from [[file:2026-05-28-generic-agent-runtime-spec-review.org][the review]], dispositioned:
+
+| Recommendation | Disposition | Where it landed |
+|----------------+-------------+-----------------|
+| Split readiness labels by arc | Accept | Status: dual rubric (1.5 READY WITH CAVEATS, 2-5 NOT READY) |
+| "Decisions required before phases 2-5" section | Accept | Open decisions, new subsection (5 items) |
+| Phase 5 reverification prerequisite | Accept | Phase 5, first bullet; model table marked recommendation-only |
+| Exact =.emacs.d= handoff artifact | Accept | Emacs subsection: =inbox-send= handoff is implementation step one; task closes on =.emacs.d= confirmation |
+| Per-ring rollback actions | Accept | Pre-live gating: rollback line per ring, incl. the half-propagated-sync case |
+| Stale-helper message contract | Accept | Data-integrity rule 1: path + timestamps + suggested actions |
+| Roster unsupported-platform behavior | Accept | Roster subsection: explicit "roster unavailable" result, no-op path |
+| Ring-1 test inventory | Accept | Pre-live gating: the review's list adopted as normative |
+| Docs-update list for 1.5 | Accept | Already in Phase 1.5 items; INDEX auto-routed note included |
+| Physically split the spec (open question) | Modify | Dual rubric in one document; phases 2-5 stay parked here pending Craig's go/no-go, matching the standing task framing |
+
+Rejections: none.
+
+* Review and iteration history
+
+** 2026-06-12 Fri @ 02:09:10 -0500 — Codex — reviewer
+- What changed or was recommended: ran the spec-review workflow and wrote a formal review. Rubric for the whole spec: =Not ready=. Phase 1 is already shipped; Phase 1.5 helper instances are implementable as a scoped slice with the existing rollout/manual-validation caveats; phases 2-5 remain blocked on product choices and time-sensitive local-runtime/model verification.
+- Why: the spec now combines a concrete same-runtime helper implementation with a broader runtime-neutral refactor whose instruction-file, local-runtime, first-CLI, and adapter-scope decisions are still open.
+- Artifacts: [[file:2026-05-28-generic-agent-runtime-spec-review.org][2026-05-28-generic-agent-runtime-spec-review.org]]; existing [[file:../../todo.org][todo.org]] entries for "Helper-instance support" and "Generic agent runtime support — Codex spec v0" updated with the review outcome.
+
+** 2026-06-12 Fri @ 02:23:04 -0500 — Claude — author (spec-response)
+- What changed: folded the review in. All recommendations accepted except the document-split open question, modified to a dual rubric in one document (see Review dispositions). Status now labels Phase 1.5 READY WITH CAVEATS and phases 2-5 NOT READY; the original readiness checklist is fully resolved.
+- Why: the review's top finding was that one Not-ready label hid an implementable slice; the rest hardened the slice's rollout (per-ring rollbacks, normative test inventory, exact .emacs.d handoff artifact, stale-helper message contract, roster platform behavior) and fenced the parked arc (decisions-required section, Phase 5 reverification prerequisite).
+- Artifacts: this spec's Status, Pre-live gating, Phase 5, Open decisions, and Review dispositions sections; the helper task in todo.org carries the same caveats.
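
Editor's sketch (not part of the commit): the stale-helper message contract added to data-integrity rule 1 above requires the surfaced message to name the file path, its timestamps, and the suggested actions. One way such a message could be assembled is shown below. This is a hedged Python illustration only: the function name, the message wording, and the choice to report just the modification time are assumptions, and the spec does not prescribe an implementation language or exact text.

```python
from datetime import datetime, timezone
from pathlib import Path

# The three suggested actions named in the spec's contract.
ACTIONS = ("treat as stale and proceed", "wait", "abort")


def stale_helper_message(path, now=None):
    """Build a message that names the file, its mtime, and the actions.

    Hypothetical helper: only the three required ingredients come from
    the spec; the layout and wording are illustrative.
    """
    path = Path(path)
    mtime = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    age_min = int((now - mtime).total_seconds() // 60)
    lines = [
        f"Helper context file found: {path}",
        f"  last modified: {mtime.isoformat(timespec='seconds')} ({age_min} min ago)",
        "  suggested actions: " + " / ".join(ACTIONS),
    ]
    return "\n".join(lines)
```

Reporting the age alongside the raw timestamp is a design choice of the sketch: it lets the operator judge staleness on evidence, which is the point of the contract, without doing clock arithmetic by hand.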
diff --git a/todo.org b/todo.org
index fb71dab..5e1f17f 100644
--- a/todo.org
+++ b/todo.org
@@ -45,7 +45,7 @@ Cancelled 2026-06-11: Craig confirmed the decision — one todo queue with a sin
:CREATED: [2026-06-11 Thu]
:LAST_REVIEWED: 2026-06-11
:END:
-BLOCKED ON SPEC READINESS (Craig, 2026-06-11): the spec is marked NOT IMPLEMENTATION-READY after four same-day design revisions. Before any build — finish the Emacs integration design (corrected 2026-06-12: the surface is ghostel + ai-term.el's F9 flow, which already creates project-named aiv- tmux sessions; integration = teach ai-term.el's session-create the roster→export→opener steps and add a [helper] picker badge; the ai-term.el change is a cross-project handoff to ~/.emacs.d, not a rulesets edit), write the three-ring test gating into the implementation plan (bats → sandbox drills → pilot project, dormant-by-construction so template sync can't put it live early), and re-read the whole helper section for seams. Then implement.
+SPEC REVIEWED 2026-06-12: [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org][Codex review]] keeps the whole generic-runtime spec at =Not ready=, but treats Phase 1.5 as an implementable scoped slice if the caveats below are accepted. Before any build, keep the Emacs integration as a cross-project handoff to =~/.emacs.d=, preserve the three-ring gate (bats → sandbox drills → pilot project), and do not let startup/helper changes reach synced template paths until the live drills pass.
Implement Phase 1.5 of the generic-agent-runtime spec ([[file:docs/design/2026-05-28-generic-agent-runtime-spec.org][spec]], amended 2026-06-11 with the "Concurrent same-project agents" section). Craig's case: spawn a second Claude in the same project to look things up or update tasks safely while the primary works. The session-context split (AI_AGENT_ID + session-context.d/) already shipped; this builds the rest:
@@ -1139,7 +1139,7 @@ Immediate correctness issue Codex flagged: the singleton .ai/session-context.org
Broader refactor proposes runtimes/ adapter manifests, generic install commands, language-bundle split (common/ + runtimes/<runtime>/), launcher refactor, local model service via llama.cpp/ollama. Big surface area, six phases.
-Before any implementation: needs a real review pass on the spec, and a decision on whether to commit to the larger arc (phases 2-6).
+2026-06-12 spec review complete: the [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org][Codex review]] rubric rates the whole spec =Not ready=. Phase 1 has already shipped, and Phase 1.5 is tracked separately as the helper-instance task. Before implementing any of phases 2-5, decide whether to commit to the larger arc and answer the blocker decisions: generic instruction-file strategy, default local runtime/server, first supported local editing CLI, adapter scope, and compatibility behavior for existing =CLAUDE.md= / =.claude/= projects.
*** 2026-06-10 Wed @ 14:13:55 -0500 Noted Phase 1 already shipped; narrowed scope to the phases 2-6 decision
Phase 1 (the correctness fix) is live: protocols.org documents the AI_AGENT_ID-scoped session-context path (=.ai/session-context.d/<id>.org=) and =.ai/scripts/session-context-path= resolves it. The singleton race Codex flagged is closed. What remains is the spec review plus a go/no-go on the broader runtime-neutral refactor: runtimes/ adapter manifests, generic install commands, language-bundle split, launcher refactor, local model service.
@@ -1147,6 +1147,9 @@ Phase 1 (the correctness fix) is live: protocols.org documents the AI_AGENT_ID-s
*** 2026-06-11 Thu @ 19:26:26 -0500 Spec amended with the helper-instance slice; implementation split out
Craig's motivating case (a second Claude in the same project for lookups and safe task updates) was under-specified in v0 — it had identity and message targeting but no spawn mechanics and no write-safety contract for the shared files the session-context split doesn't isolate. Added the "Concurrent same-project agents (helper instances)" section (subagent boundary, identity/spawn via =ai --helper=, the tiered read/write contract, light startup, helper wrap-up) and Phase 1.5 to the migration plan. Implementation filed as its own [#B] task ("Helper-instance support"); this task stays scoped to the phases 2-6 go/no-go.
+*** 2026-06-12 Fri @ 02:09:10 -0500 Independent spec review complete
+Codex ran the spec-review workflow. Outcome: the combined spec is =Not ready= because phases 2-5 still require product decisions and current external-runtime/model verification. Phase 1.5 can proceed only as the already-split helper task, with rollout/manual-validation caveats accepted and no accidental template-wide release before sandbox/pilot drills pass. Review file: [[file:docs/design/2026-05-28-generic-agent-runtime-spec-review.org]].
+
* Rulesets Resolved
** DONE [#C] Fix =cj-scan= false positives on cj fences nested inside other =#+begin_*= blocks :bug:
CLOSED: [2026-05-15 Fri]