| author | Craig Jennings <c@cjennings.net> | 2026-06-12 02:24:01 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-06-12 02:24:01 -0500 |
| commit | 22e19c21e6aabe0319d4b09a862f4a3705c92509 (patch) | |
| tree | 5210f7ac7bba64b7b1b0b23f56d14f4dafefaf9a /docs | |
| parent | c6fd73441ef0b683abb859863dcd0d48377a4838 (diff) | |
docs(spec): fold the Codex review into the agent-runtime spec
The review's top finding was that one Not-ready label hid an implementable slice. Status now splits by arc: Phase 1.5 helper instances are READY WITH CAVEATS (the three-ring gate and the manual drills are binding, and the ai-term.el work is a coordinated .emacs.d handoff with an exact artifact), while phases 2-5 stay NOT READY behind a decisions-required section and a Phase 5 reverification prerequisite that demotes the model table to a recommendation.
The remaining findings hardened the slice: per-ring rollback actions including the half-propagated-sync case, the review's test inventory adopted as normative, a message contract for stale helper files, and explicit roster-unavailable behavior on unsupported platforms. All recommendations accepted except the document split, modified to a dual rubric in one document. The review file and dispositions table ride along.
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/design/2026-05-28-generic-agent-runtime-spec-review.org | 178 |
| -rw-r--r-- | docs/design/2026-05-28-generic-agent-runtime-spec.org | 156 |
2 files changed, 303 insertions, 31 deletions
diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec-review.org b/docs/design/2026-05-28-generic-agent-runtime-spec-review.org new file mode 100644 index 0000000..90d7030 --- /dev/null +++ b/docs/design/2026-05-28-generic-agent-runtime-spec-review.org @@ -0,0 +1,178 @@ +#+TITLE: Review: Generic Agent Runtime Support for rulesets +#+AUTHOR: Codex +#+DATE: 2026-06-12 +#+STARTUP: showall + +* Scope reviewed + +Reviewed the target spec at [[file:2026-05-28-generic-agent-runtime-spec.org][2026-05-28-generic-agent-runtime-spec.org]], the spec-review workflow, the current launcher/install/template implementation, existing tests, and existing task tracking. + +Code and docs read: + +- [[file:../../Makefile][Makefile]] — global install/deps targets still install Claude-only roots and the Claude launcher. +- [[file:../../claude-templates/bin/ai][claude-templates/bin/ai]] — tmux launcher still hard-codes =CLAUDE_CMD=claude=, project detection via =.ai/protocols.org=, and one window per project name. +- [[file:../../scripts/install-lang.sh][scripts/install-lang.sh]] and [[file:../../scripts/sync-language-bundle.sh][scripts/sync-language-bundle.sh]] — language bundle install/sync still writes =.claude/= and =CLAUDE.md=. +- [[file:../../.ai/scripts/session-context-path][.ai/scripts/session-context-path]] and [[file:../../.ai/scripts/tests/session-context-path.bats][its bats tests]] — Phase 1 resolver exists and covers unset, empty, distinct, and sanitized =AI_AGENT_ID= values. +- [[file:../../.ai/protocols.org][.ai/protocols.org]], [[file:../../.ai/workflows/startup.org][startup.org]], and [[file:../../.ai/workflows/wrap-it-up.org][wrap-it-up.org]] — protocols documents the agent-scoped path; startup/wrap-up resolve that path but do not yet implement roster-first helper branching or live-helper gates. 
+- [[file:../../.ai/scripts/todo-cleanup.el][todo-cleanup.el]], [[file:../../.ai/scripts/lint-org.el][lint-org.el]], and [[file:../../.ai/scripts/wrap-org-table.el][wrap-org-table.el]] — =lint-org= and =wrap-org-table= already take =/tmp= backups; =todo-cleanup= does not. +- [[file:../../todo.org][todo.org]] — existing Phase 1.5 helper task and broader generic-runtime parent task. + +I did not re-verify the time-sensitive Hugging Face model recommendations online during this review. Treat the local-model picks as stale-until-checked before Phase 5. + +* Implementation-readiness + +Rubric for the whole spec: =Not ready=. + +The spec is strong enough to implement the Phase 1.5 helper-instance slice if Craig accepts the caveats already captured in tracking: the synced-template rollout must stay gated, the live sandbox drills must pass, and the Emacs =ai-term.el= work must land as a coordinated cross-project change. The broader runtime-neutral refactor in phases 2-5 is not implementation-ready because it still has unresolved product choices and time-sensitive external assumptions. + +* Overall assessment + +The spec correctly identifies the current architecture: the reusable project core lives under =.ai/=, while the install surface, bundle layout, launcher, hooks, and user-facing docs remain Claude-specific. Phase 1 is already done in the repo: =session-context-path= is present, tested, and wired into startup/wrap-up path resolution. + +The helper-instance amendment is the most actionable part. It names the concurrency risk, defines a role contract, narrows helper writes, and gives concrete tests and manual drills. The main risk is rollout, not design: =startup.org= and =.ai/scripts/= sync broadly, so a partially validated helper branch could affect every project. + +The generic runtime arc is still a product decision package, not an implementation plan. 
It names plausible phases, but the open decisions determine file names, local runtime UX, adapter scope, and support burden. + +* High-priority findings + +** Split the ready helper slice from the not-ready runtime-neutral arc + +Blocking status: blocks =Ready= for the whole spec; does not block a scoped Phase 1.5 implementation if tracked as its own accepted slice. + +Why it matters: the spec now contains two different projects. Phase 1.5 solves a near-term same-runtime concurrency problem. Phases 2-5 rename and generalize the entire distribution. The current Status section says the spec is not implementation-ready, while the Phase 1.5 section is detailed enough for implementation. Without an explicit split, an implementer has to decide whether "start implementation" means helper support only or the full runtime-neutral refactor. + +What to change: in the spec Status or Recommended next step, state two rubrics separately: + +- Phase 1.5 helper instances: =Ready with caveats= once Craig accepts the pre-live gating and cross-project Emacs handoff. +- Phases 2-5 generic runtime refactor: =Not ready= until the open decisions are answered and model/runtime assumptions are reverified. + +** Resolve the phase 2-5 product choices before implementation + +Blocking status: blocks =Ready= for phases 2-5. + +Why it matters: the spec still asks which generic instruction file to use, whether to standardize on =llama.cpp= or =ollama=, and which local agent CLI is first-supported. Those choices affect manifest schema, install paths, docs, doctor checks, runtime command templates, and test fixtures. If implementation starts now, those decisions will be made inside code. 
+ +What to change: add a short "Decisions required before phases 2-5" section with accepted answers for: + +- generic instruction file strategy; +- default local runtime manager/server; +- first supported local editing CLI; +- whether phase 2 should support only Claude + one local runtime or also Codex immediately; +- compatibility behavior for existing =CLAUDE.md= and =.claude/= projects during the transition. + +** Reverify external runtime and model assumptions before the local-model phase + +Blocking status: blocks =Ready= for Phase 5, not for Phase 1.5. + +Why it matters: the local-model recommendations are inherently time-sensitive. Model availability, quant files, serving backends, GPU support, context behavior, and practical latency can change. The spec cites sources, but implementation of =rulesets doctor= and archsetup handoff should not bake in stale assumptions. + +What to change: make Phase 5 start with a research/verification task that records current model URLs, file sizes, license, backend support, smoke command, memory fit, and fallback behavior. Keep the spec's current model table as a recommendation, not as an implementation constant. + +* Medium-priority findings + +** Add the exact cross-project handoff artifact for ai-term.el + +Blocking status: not blocking for the rulesets side if the handoff is created before or during implementation. + +The spec correctly says =ai-term.el= lives in =~/.emacs.d= and is not a rulesets edit. That means the implementation plan should say exactly how the rulesets task hands off the required Emacs change: an inbox file, a linked task, or a commit in that repo. Otherwise the rulesets implementation can finish with shell helpers working while the F9 path remains unsafe. + +** Name the rollback point for template-wide helper rollout + +Blocking status: not blocking if the three-ring gate is accepted; it is a release-safety improvement. + +The pre-live gate is good: bats, sandbox, pilot, then template-wide release. 
Add the rollback action for each ring: remove =agent-roster= from the pilot project, revert the =startup.org= helper branch, or disable helper detection when =agent-roster= is absent. This matters because startup template sync has broad blast radius. + +* UX observations + +The helper UX is concrete enough for v1: launcher path, raw-launch safety net, helper opener, helper workflow, and wrap-up behavior are all named. The no-trigger-phrase decision for =helper-mode.org= is good because humans should not have to remember a workflow incantation when the launcher can route. + +For phases 2-5, the user mental model is not yet settled. A user will need to know whether they are installing a runtime, choosing a model profile, or creating project instructions. That should be decided before docs or commands ship. + +* Architecture observations + +The spec fits the current repo boundaries. =.ai/= remains core; runtime adapters can sit beside existing Claude-specific layout; the launcher and install scripts are the right integration points. The current implementation confirms the major refactor points: =Makefile=, =install-lang.sh=, =sync-language-bundle.sh=, and =claude-templates/bin/ai= all hard-code Claude assumptions today. + +The helper slice is intentionally smaller than the runtime manifest work and should stay that way. Do not introduce TOML manifests, local model service config, or bundle splitting while implementing =agent-roster= and helper startup/wrap-up. + +* Robustness and performance observations + +The helper data-integrity rules cover the important local lost-update shapes: scoped org edits, no helper memory writes, git mutation primary-only while concurrent, and log-before-write. The remaining robustness question is how stale helper files are surfaced without permanently blocking hygiene. The spec says this is a judgment call; implementation should make the message explicit and include file path, timestamp, and suggested actions. 
+ +The =agent-roster= scan is cheap enough for startup on Linux, but it is Linux =/proc= specific. That is fine for v1 if the script reports a clear unsupported-platform result and startup treats "roster unavailable" as the no-op path described in the pre-live gate. + +* Test strategy recommendations + +Specific tests to add for Phase 1.5: + +- =agent-roster= returns alone when no other matching process has cwd under the project. +- =agent-roster= excludes its own process ancestry. +- =agent-roster= reports not-alone when a spawned sleeper or test helper has cwd at the project root or below it. +- Startup helper branch is byte-identical to today's path when =agent-roster= is missing or reports alone. +- Startup routes to =helper-mode.org= and skips pulls/rsync/inbox processing when =agent-roster= reports another live agent. +- =ai --helper= assigns a sanitized helper id, exports =AI_AGENT_ID= and =AI_HELPER=, and uses the helper opener. +- Primary and helper resolve distinct context paths. +- Helper-originated inbox send includes the helper id in same-minute slug generation. +- Wrap-up with live helpers pauses before hygiene/commit. +- Orphaned-helper wrap-up runs the full closing path only when the roster reports alone. +- =todo-cleanup.el= copies a =/tmp= backup before any mutating mode. + +Manual drills from the spec are necessary and should remain gates: live helper + primary scoped edit, primary wrap-up while helper is mid-task, orphaned helper closes the tree, and raw =claude= launch gets caught by startup. + +* Documentation and tooling recommendations + +For Phase 1.5, update: + +- =protocols.org= with a short pointer to =helper-mode.org= and the helper write tiers. +- =startup.org= with the roster-first branch and the no-op guarantee when unavailable. +- =wrap-it-up.org= with helper/primary wrap-up ordering and live-helper pause messages. +- =INDEX.org= with =helper-mode.org= marked auto-routed, not user-triggered. 
+- README only if the user-facing launcher gains =ai --helper= before broader runtime docs exist. + +For phases 2-5, defer broad README renames until the runtime choices are resolved. + +* Suggested spec edits + +- Add separate readiness labels for "Phase 1.5 helper instances" and "Phases 2-5 runtime-neutral refactor." +- Add a "Decisions required before phases 2-5" section listing the instruction-file, local runtime, first CLI, and adapter-scope choices. +- Add a Phase 5 prerequisite to reverify model/backend assumptions against current sources before implementation. +- Add the exact =.emacs.d= handoff artifact for the =ai-term.el= helper path. +- Add rollback actions for the bats/sandbox/pilot/template rollout rings. + +* Agreed decisions + +None newly agreed during this review. Existing decisions in the spec stand: primary keeps the singleton path for Phase 1.5, helpers use =helper-<rand4>= and =session-context.d/=, =helper-mode.org= is canonical, =agent-roster= is the shared detection primitive, and =ai-term.el= owns its own tmux naming/wiring on top of the shared roster. + +* Open questions + +- Are phases 2-5 still desired near-term, or should the spec be split so the helper-instance slice can ship independently and the runtime-neutral arc remains parked? +- Which local runtime stack and editing CLI should become the first supported v1 target? +- What exact handoff path should be used for the =~/.emacs.d= =ai-term.el= changes? + +* vNext candidates + +- Runtime manifests for Claude plus one local runtime. +- Generic bundle layout split into common + runtime adapters. +- Local-model doctor checks once archsetup owns install and model cache setup. +- Codex adapter support after the first non-Claude runtime proves the manifest shape. + +* Implementation tasks (drop-in for todo.org) + +These tasks are already represented in [[file:../../todo.org][todo.org]] as parent tasks. If copied elsewhere, keep Phase 1.5 separate from the parked runtime-neutral arc. 
+ +** TODO [#B] Helper instances — concurrent same-project Claude :feature: +Implement Phase 1.5: =agent-roster=, =ai --helper=, =helper-mode.org=, startup/wrap-up helper branches, live-helper gates, and helper write-safety docs. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 1.5). + +** TODO [#C] Runtime manifests and generic install commands :feature: +Resolve the phase 2 decisions, then add =runtimes/claude.toml=, one local runtime manifest, and =make install-runtime= while keeping =make install= Claude-compatible. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 2). + +** TODO [#C] Runtime-aware language bundles :feature: +Split common language material from runtime-specific adapters and add at least elisp support for the first local runtime. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 3). + +** TODO [#D] Runtime-neutral user-facing docs and aliases :chore: +After compatibility aliases exist, rename Claude-specific public docs and source directories where the behavior is actually runtime-neutral. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 4). + +** TODO [#D] Local model install handoff and doctor checks :feature: +After current model/backend assumptions are reverified and archsetup owns install/cache setup, add doctor checks for server availability, model files, and smoke prompts. Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Migration plan, Phase 5). + +** TODO [#B] Generic agent runtime — test surface :test: +Unit: launcher id/runtime selection, session-context path/archive names, roster detection, helper id sanitization. Integration: two fake runtimes or primary/helper sessions writing distinct contexts; install-lang/sync-language-bundle legacy compatibility. Manual: live helper scoped edit, corruption drill, orphaned-helper close, raw-launch safety net, and Emacs F9 helper path. 
Spec: [[file:2026-05-28-generic-agent-runtime-spec.org]] (Test strategy and Phase 1.5 pre-live gating). diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec.org b/docs/design/2026-05-28-generic-agent-runtime-spec.org index 40a97b4..0b37814 100644 --- a/docs/design/2026-05-28-generic-agent-runtime-spec.org +++ b/docs/design/2026-05-28-generic-agent-runtime-spec.org @@ -65,30 +65,38 @@ under-specified — spawning a second Claude in the same project to look things up or update tasks safely — and a new Phase 1.5 sequences that slice ahead of the runtime-neutral phases 2-6, which remain pending a go/no-go. -*NOT IMPLEMENTATION-READY* (Craig, 2026-06-11, after the fourth design -revision). The helper-instance design iterated four times in one evening; -holding it open until the known gaps close. Readiness checklist — all of -these before any build starts: - -- [X] Emacs launch surface designed (see the open-issue subsection in the - helper section): every place a session can be born routes through, or is - caught by, the deterministic path. /Closed 2026-06-12: mechanics verified +*Readiness is split by arc* (per the 2026-06-12 Codex review's top finding — +the spec contains two different projects and one label misled): + +- *Phase 1.5 — helper instances: READY WITH CAVEATS* (2026-06-12). The + caveats are binding, not advisory: the three-ring pre-live gate governs + every merge into synced template paths; the manual drills are gates, not + suggestions; and the =ai-term.el= work lands as a coordinated + cross-project handoff to =~/.emacs.d= (the exact artifact is named in + Phase 1.5), so the rulesets side isn't "done" while the F9 path is + still unsafe. +- *Phases 2-5 — runtime-neutral refactor: NOT READY.* Blocked on the + /Decisions required before phases 2-5/ section under Open decisions, and + on Phase 5's reverification prerequisite (the local-model table is a + recommendation, not an implementation constant). Parked pending Craig's + go/no-go on the arc. 
+ +The original readiness checklist, resolved: + +- [X] Emacs launch surface designed. /Closed 2026-06-12: mechanics verified in ai-term.el's code, integration design written, the three open calls confirmed by Craig (roster-only sharing, singleton primary, helper-mode.org as canonical home)./ -- [ ] Pre-live test strategy agreed (see Test strategy): sandbox drills - pass, and the rollout is gated so nothing reaches live projects via - template sync until validated — startup.org edits propagate to every - project on their next session, so "accidentally live everywhere" is the - default failure mode, not an edge case. /The three-ring gating is - written; "agreed" lands with the independent review below./ -- [X] A re-read of the whole helper section after the dust settles, since - four same-day revisions usually leave a seam somewhere. /Done 2026-06-12: - the coherence pass unified the churned subsections and verified the - ai-term.el claims against code./ -- [ ] Independent spec review (the =spec-review= cycle, as the KB and - consolidation specs got) comes back Ready or Ready-with-caveats, and its - dispositions are folded in via =spec-response=. +- [X] Pre-live test strategy agreed. /The review accepted the three-ring + gate as the release-safety mechanism and asked for per-ring rollback + actions — added to the gating section./ +- [X] A re-read of the whole helper section after the dust settles. /Done + 2026-06-12: the coherence pass unified the churned subsections and + verified the ai-term.el claims against code./ +- [X] Independent spec review. /Codex, 2026-06-12: Not-ready for the + combined spec, Phase 1.5 implementable as a scoped slice — which the + split rubric above now states directly. 
Dispositions folded in the same + day; see Review dispositions./ * Problem @@ -423,7 +431,11 @@ Known limits, accepted for v1: an agent session not running as a local process on this machine (a cloud session against the same checkout) is invisible to the scan; and the match is on process cwd, so an agent started from outside the project tree wouldn't be seen. Both are edge shapes the -operator created deliberately and can manage manually. +operator created deliberately and can manage manually. The scan is also +Linux-=/proc=-specific: on an unsupported platform the script reports +"roster unavailable" explicitly (never a silent "alone"), and startup +treats that result as the no-op path from the pre-live gate — same behavior +as the script being absent. *** Spawn paths: deterministic launcher, startup safety net @@ -534,7 +546,10 @@ every personal task and corruption has maximal blast radius. stale file would block hygiene forever, so staleness is surfaced as a judgment call — the file's own content and timestamps show whether the helper is really gone — never silently skipped past and never silently - honored indefinitely. + honored indefinitely. The surfaced message is contractual (review + finding): it names the file path, its timestamps, and the suggested + actions (treat as stale and proceed / wait / abort), so the judgment is + made on evidence rather than a bare "helper detected" warning. 2. /A new primary starting while a helper runs./ The previous primary may wrap and exit while a helper keeps working; the next =ai= launch becomes primary and runs full startup. The existing guards already do the right @@ -625,7 +640,12 @@ What remains to design — the integration, not a new surface: - The =emacs.md= live-reload discipline applies to the ai-term.el changes, and the change lands in the =~/.emacs.d= project (its own repo and session scope — a cross-project handoff from rulesets, not a rulesets - edit). + edit). 
The handoff artifact is exact (review finding, 2026-06-12): + implementation step one sends an =inbox-send .emacs.d= handoff carrying + this subsection's integration contract plus the recommendations below, + and the rulesets task does not close until =.emacs.d= confirms its task + is filed or landed — otherwise the shell path ships safe while F9 stays + unsafe and nothing tracks the gap. Recommendations for ai-term.el beyond the helper feature (Craig asked for these 2026-06-12; they ride the same handoff): @@ -799,6 +819,12 @@ Independent of the phases 2-6 go/no-go; same-runtime only. ** Phase 5: Local model install handoff +- Prerequisite (review finding, 2026-06-12): a reverification task runs + first — record /current/ model URLs, file sizes, licenses, backend + support, a smoke command, memory fit, and fallback behavior against live + sources. The model table in the Introductory note is a recommendation + frozen at 2026-05-28, not an implementation constant; nothing in doctor + checks or the archsetup handoff bakes it in unverified. - Send archsetup an inbox note requesting local model runtime support. - After archsetup lands it, teach =rulesets doctor= to verify: - =llama-server= or =ollama= installed. @@ -832,24 +858,43 @@ three rings: the self-ancestry exclusion against the test's own process chain. The startup hook tested for the no-op guarantee: when =agent-roster= is absent or reports alone, behavior is byte-identical to today. + /Rollback: revert the commit; nothing here touches synced paths yet./ 2. 
/Sandbox ring./ A disposable project (its own git repo, never template-synced back) runs the live drills before any real project sees the feature: primary + helper concurrent edits on one org file; the corruption drill (primary wrap-up pauses on a live helper); the orphaned-helper drill (primary wraps first, helper closes the door, tree ends clean); the raw-launch drill (helper started without the - launcher gets caught by the startup roster); and an Emacs-surface drill - once that design lands. + launcher gets caught by the startup roster); and the Emacs F9 drill + (helper spawned via ai-term once its handoff lands). + /Rollback: delete the sandbox project; no other surface was touched./ 3. /Pilot ring./ The startup detection ships dormant-by-construction — the hook is a no-op wherever =agent-roster= is missing, and the script ships first to one pilot project only (copied into its =.ai/project-scripts/=, which the sync never touches) before the template-wide release puts it everywhere. Rulesets itself is the natural pilot: it's where a broken sweep is noticed fastest. + /Rollback: delete =agent-roster= from the pilot's project-scripts; the + hook reverts to its no-op path on the next session./ +4. /Template-wide release./ The startup branch and the script land in the + synced template paths only after the pilot soaks. 
+ /Rollback: revert the startup.org commit and remove the script from + =claude-templates/.ai/scripts/=; the next sync's =--delete= clears every + project's copy, and the no-op guarantee means a half-propagated state + (some projects synced, some not) is safe in both directions./ + +Ring-1 test inventory (the review's list, normative): roster alone / +ancestry-exclusion / not-alone-on-sleeper cases; startup no-op +byte-identity when roster is missing or alone; startup routes to +helper-mode and skips pulls/rsync/inbox when not alone; =ai --helper= +assigns a sanitized id, exports both vars, uses the helper opener; primary +and helper resolve distinct context paths; helper-originated =inbox-send= +slugs carry the id; wrap-up pauses on live helpers before hygiene and +commit; orphaned-helper close runs only when the roster reports alone; +=todo-cleanup.el= takes a =/tmp= backup before any mutating mode. Nothing merges past ring 1 into the synced template paths until ring 2's -drills pass, and the spec's NOT-IMPLEMENTATION-READY marker clears only -when all three rings are written into the implementation plan. +drills pass. * Open decisions @@ -866,8 +911,57 @@ when all three rings are written into the implementation plan. - Which local agent CLI should be the first supported offline editor: =aider=, =opencode=, a simple custom wrapper, or something else? +** Decisions required before phases 2-5 — added 2026-06-12 (review finding) + +These are the blocker subset of the open decisions above, plus two the +review added. Phases 2-5 stay NOT READY until each has an accepted answer; +deciding them inside code is the failure mode this section prevents. + +1. Generic instruction-file strategy (=AGENTS.md= / =AI.md= / + runtime-specific only). +2. Default local runtime manager/server (=llama.cpp= only vs =ollama= + as the beginner default). +3. First supported local editing CLI. +4. Phase-2 adapter scope: Claude + one local runtime only, or Codex + support immediately. +5. 
Compatibility behavior for existing =CLAUDE.md= / =.claude/= projects + during the transition. + * Recommended next step -Start with Phase 1 only. The singleton session-context file is the immediate -correctness issue for simultaneous agents, and it can be fixed without renaming -the whole repository or disrupting current Claude installs. +Updated 2026-06-12: implement Phase 1.5 under its READY-WITH-CAVEATS rubric +(the helper task in todo.org carries the plan). Phases 2-5 stay parked until +the decisions section above is answered and Craig calls the go/no-go on the +arc. The original recommendation — start with Phase 1 only — is complete: +Phase 1 shipped. + +* Review dispositions — 2026-06-12 Codex review + +Every recommendation from [[file:2026-05-28-generic-agent-runtime-spec-review.org][the review]], dispositioned: + +| Recommendation | Disposition | Where it landed | +|----------------+-------------+-----------------| +| Split readiness labels by arc | Accept | Status: dual rubric (1.5 READY WITH CAVEATS, 2-5 NOT READY) | +| "Decisions required before phases 2-5" section | Accept | Open decisions, new subsection (5 items) | +| Phase 5 reverification prerequisite | Accept | Phase 5, first bullet; model table marked recommendation-only | +| Exact =.emacs.d= handoff artifact | Accept | Emacs subsection: =inbox-send= handoff is implementation step one; task closes on =.emacs.d= confirmation | +| Per-ring rollback actions | Accept | Pre-live gating: rollback line per ring, incl. 
the half-propagated-sync case | +| Stale-helper message contract | Accept | Data-integrity rule 1: path + timestamps + suggested actions | +| Roster unsupported-platform behavior | Accept | Roster subsection: explicit "roster unavailable" result, no-op path | +| Ring-1 test inventory | Accept | Pre-live gating: the review's list adopted as normative | +| Docs-update list for 1.5 | Accept | Already in Phase 1.5 items; INDEX auto-routed note included | +| Physically split the spec (open question) | Modify | Dual rubric in one document; phases 2-5 stay parked here pending Craig's go/no-go, matching the standing task framing | + +Rejections: none. + +* Review and iteration history + +** 2026-06-12 Fri @ 02:09:10 -0500 — Codex — reviewer +- What changed or was recommended: ran the spec-review workflow and wrote a formal review. Rubric for the whole spec: =Not ready=. Phase 1 is already shipped; Phase 1.5 helper instances are implementable as a scoped slice with the existing rollout/manual-validation caveats; phases 2-5 remain blocked on product choices and time-sensitive local-runtime/model verification. +- Why: the spec now combines a concrete same-runtime helper implementation with a broader runtime-neutral refactor whose instruction-file, local-runtime, first-CLI, and adapter-scope decisions are still open. +- Artifacts: [[file:2026-05-28-generic-agent-runtime-spec-review.org][2026-05-28-generic-agent-runtime-spec-review.org]]; existing [[file:../../todo.org][todo.org]] entries for "Helper-instance support" and "Generic agent runtime support — Codex spec v0" updated with the review outcome. + +** 2026-06-12 Fri @ 02:23:04 -0500 — Claude — author (spec-response) +- What changed: folded the review in. All recommendations accepted except the document-split open question, modified to a dual rubric in one document (see Review dispositions). Status now labels Phase 1.5 READY WITH CAVEATS and phases 2-5 NOT READY; the original readiness checklist is fully resolved. 
+- Why: the review's top finding was that one Not-ready label hid an implementable slice; the rest hardened the slice's rollout (per-ring rollbacks, normative test inventory, exact .emacs.d handoff artifact, stale-helper message contract, roster platform behavior) and fenced the parked arc (decisions-required section, Phase 5 reverification prerequisite). +- Artifacts: this spec's Status, Pre-live gating, Phase 5, Open decisions, and Review dispositions sections; the helper task in todo.org carries the same caveats.
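
The roster behavior this commit specifies (alone, not-alone, self-ancestry exclusion, and an explicit "roster unavailable" result where =/proc= is missing) can be sketched roughly as below. This is an illustration in Python only, not the repository's =agent-roster= script, whose source does not appear in this diff. The function names, the snapshot shape, and matching agents by the process name =claude= are all assumptions made for the example.

```python
import os
import sys


def ancestry(pid, parent_of):
    """Return pid plus every ancestor reachable through parent_of."""
    seen = set()
    while pid in parent_of and pid not in seen:
        seen.add(pid)
        pid = parent_of[pid]
    seen.add(pid)  # the topmost ancestor (e.g. pid 0) has no entry of its own
    return seen


def roster(project_root, procs, self_pid, agent_names=("claude",)):
    """Classify the project as "alone" or "not-alone".

    procs maps pid -> (ppid, name, cwd). Matching on process name is an
    assumption of this sketch; the spec only says "matching process".
    """
    root = os.path.abspath(project_root)
    # Exclude the caller and its whole parent chain, per the spec's
    # self-ancestry exclusion.
    skip = ancestry(self_pid, {pid: p[0] for pid, p in procs.items()})
    others = sorted(
        pid
        for pid, (_ppid, name, cwd) in procs.items()
        if pid not in skip
        and name in agent_names
        # cwd at the project root or anywhere below it counts.
        and (cwd == root or cwd.startswith(root + os.sep))
    )
    return ("not-alone" if others else "alone"), others


def read_proc():
    """Snapshot /proc, or return None so callers can report
    "roster unavailable" instead of a silent "alone"."""
    if not sys.platform.startswith("linux") or not os.path.isdir("/proc"):
        return None
    procs = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat") as f:
                stat = f.read()
            cwd = os.readlink(f"/proc/{entry}/cwd")
            # stat looks like "pid (comm) state ppid ..."; comm may contain
            # spaces or parentheses, so split on the last ")".
            name = stat[stat.index("(") + 1 : stat.rindex(")")]
            ppid = int(stat[stat.rindex(")") + 2 :].split()[1])
        except (OSError, ValueError, IndexError):
            continue  # process exited, or belongs to another user
        procs[int(entry)] = (ppid, name, cwd)
    return procs
```

A caller would take the snapshot once, print "roster unavailable" when =read_proc()= returns =None=, and otherwise pass the snapshot and =os.getpid()= to =roster=. Keeping the classification separate from the =/proc= read means the alone, ancestry-exclusion, and not-alone cases can be tested against a hand-written snapshot without spawning real processes.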
