1 files changed, 117 insertions, 60 deletions
diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec.org b/docs/design/2026-05-28-generic-agent-runtime-spec.org
index 2172ef3..e3651f9 100644
--- a/docs/design/2026-05-28-generic-agent-runtime-spec.org
+++ b/docs/design/2026-05-28-generic-agent-runtime-spec.org
@@ -394,22 +394,49 @@ subagent (the Agent tool), use that — no second session exists and nothing
 here applies. A helper instance is for interactive, long-lived parallel work
 Craig drives himself in a second terminal.
 
-*** Detection: an agent discovers it isn't alone (no operator action)
+*** The roster: one detection primitive
+
+=agent-roster= is a small script and the single source of "who else is live
+in this project." Every other piece — both launchers and the in-session
+startup check — calls it rather than reimplementing the scan.
+
+The scan is stateless, verified live on 2026-06-11 with four concurrent
+agents: enumerate running agent processes (=pgrep -x claude=), read each
+one's working directory from =/proc/<pid>/cwd=, keep those whose cwd is the
+project root or inside it, and exclude the scanner's own process ancestry
+(walk parent pids from =/proc/self=). What remains is the set of /other/
+live agents in this project. Output: one line per other agent (pid + cwd),
+exit 0 when alone. The process-name match is deliberately Claude-scoped for
+Phase 1.5; generalizing it (a process name per runtime, from the phase-2
+runtime manifests) belongs to the phases 2-6 arc.
 
-Third revision (Craig, 2026-06-11): the agent detects concurrency itself, so
-the operator can open a plain terminal, run the agent, and say nothing
-special — no env var, no special opener, no launcher flag required.
+Known limits, accepted for v1: an agent session not running as a local
+process on this machine (a cloud session against the same checkout) is
+invisible to the scan; and the match is on process cwd, so an agent started
+from outside the project tree wouldn't be seen. Both are edge shapes the
+operator created deliberately and can manage manually.
+
+*** Spawn paths: deterministic launcher, startup safety net
+
+Two paths produce a correctly-identified helper. They share the roster and
+the same three steps — check, assign identity, route to the role contract —
+differing only in /who/ executes the steps.
 
-The signal is a stateless process scan, verified live on 2026-06-11 with four
-concurrent agents: enumerate running agent processes (=pgrep -x claude=),
-read each one's working directory from =/proc/<pid>/cwd=, keep those whose
-cwd is the project root or inside it, and exclude the scanner's own process
-ancestry (walk parent pids from =/proc/self=). What remains is the set of
-/other/ live agents in this project. A small script, =agent-roster=, wraps
-this and prints the others (pid + cwd), exit 0 when alone.
+The *launcher* is the preferred path because a shell script can't skip a
+step the way a model-followed instruction can (Craig, 2026-06-11).
+=ai --helper [project]= runs the three steps in order: (a) =agent-roster=
+against the target directory; (b) when the roster shows a live agent,
+assign and export the id (=AI_AGENT_ID=helper-<rand4>=, =AI_HELPER=1=);
+(c) launch the agent in that directory with the helper opening instruction
+(read and follow =helper-mode.org=) already in the prompt, in a tmux window
+named =<project>:helper-<id>=. Roster empty → warn and launch a normal
+primary session instead. The Emacs surface gets the same three steps via
+=ai-term.el= (next subsection).
 
-The check is the *first action of every session* — before any pull, rsync,
-or anchor read, because everything startup does next forks on it:
+The *startup check* is the safety net for sessions born without a launcher
+(a raw =claude= in some terminal). The roster runs as the first action of
+every session — before any pull, rsync, or anchor read — because everything
+startup does next forks on it:
 
 - /Alone, no anchor:/ fresh session. Normal startup.
 - /Alone, anchor exists:/ the previous session crashed. Recover — exactly
@@ -417,49 +444,37 @@ or anchor read, because everything startup does next forks on it:
 - /Not alone:/ a live primary (or primary + helpers) exists. Skip startup
   entirely and execute =helper-mode.org= instead.
 
-This also resolves the standing ambiguity in the singleton check: a live
-anchor used to mean only "crashed session"; with the roster it splits into
-crashed (no live process) vs concurrent (live process).
+The not-alone branch also covers identity when the launcher didn't provide
+one: the session self-assigns =helper-<rand4>=, records the id as the first
+line of its context file, and uses the resolved path
+(=.ai/session-context.d/<id>.org=) literally from then on. Where a script
+consumes =AI_AGENT_ID= (the =session-context-path= resolver), the helper
+prefixes that invocation explicitly: =AI_AGENT_ID=<id> .ai/scripts/session-context-path=.
+The id lives in the context file, not in ambient shell state, so a helper
+session can't lose it between tool calls.
 
-Known limits, accepted for v1: an agent session not running as a local
-process on this machine (a cloud session against the same checkout) is
-invisible to the scan; and the match is on process cwd, so an agent started
-from outside the project tree wouldn't be seen. Both are edge shapes the
-operator created deliberately and can manage manually.
+The roster also resolves the standing ambiguity in the singleton check: a
+live anchor used to mean only "crashed session"; it now splits into crashed
+(no live process) vs concurrent (live process).
 
-*** Identity and role files
+*** Identity and the role contract
 
 - The primary keeps the unset-id singleton (=.ai/session-context.org=), per
   Phase 1's compatibility rule. Zero friction for the overwhelmingly common
   one-agent case, and the asymmetry is harmless: the helper's writes land in
-  =.ai/session-context.d/=, away from the singleton.
-- A helper self-assigns its identity when detection fires: pick
-  =helper-<rand4>=, record it as the first line of its own context file at
-  =.ai/session-context.d/<id>.org=, and use that path explicitly for the
-  rest of the session (exporting =AI_AGENT_ID= per Bash call where the
-  =session-context-path= resolver is involved). No launcher cooperation
-  needed.
+  =.ai/session-context.d/=, away from the singleton. (This deliberately
+  diverges from the v0 launcher model, where every agent gets an id; revisit
+  only if the asymmetry bites in practice.)
+- Helpers are =helper-<rand4>=, launcher-assigned or self-assigned per the
+  spawn paths above, sanitized by =session-context-path=, archived with the
+  id in the filename.
 - =helper-mode.org= (a template workflow, Craig's "helper.org") is the role
-  contract the detection routes to: self-assignment above, the read/write
-  tiers from this section, the light startup, and the helper wrap-up. It is
-  not operator-triggerable by phrase; startup's detection is its entry
-  point, plus an explicit "you are a helper" instruction as the manual
-  fallback.
-- The launcher is the deterministic spawn path (Craig, 2026-06-11, fourth
-  revision): a shell script can't skip a step the way a model-followed
-  instruction can. =ai --helper [project]= does three things in order —
-  (a) runs =agent-roster= for the target directory, (b) assigns and exports
-  the id (=AI_AGENT_ID=helper-<rand4>=, =AI_HELPER=1=) when the roster shows
-  a live agent, and (c) launches the agent in that directory with the
-  helper opening instruction (read and follow =helper-mode.org=) already in
-  the prompt. Window named =<project>:helper-<id>=. Roster empty → it warns
-  and launches a normal primary session instead.
-- The in-session startup check (previous subsection) stays as the safety
-  net, not the mechanism: it catches agents started raw (=claude= in a
-  terminal, no launcher) and is still what splits a live anchor into
-  crashed-vs-concurrent. Belt and suspenders: the launcher makes the common
-  path deterministic; the startup roster keeps the uncommon path safe.
-  Both call the same =agent-roster= script.
+  contract and the *single canonical home* of the helper rules: the
+  read/write tiers, the data-integrity rules, the light startup, and the
+  helper wrap-up. protocols.org carries a one-paragraph pointer; startup.org
+  and wrap-it-up.org reference it rather than restating it. It has no
+  operator trigger phrase — the spawn paths route to it, and an explicit
+  "you are a helper" instruction is the manual fallback.
 
 *** Read/write contract for shared files
 
@@ -485,7 +500,9 @@ silently (both read, both write, last write wins). The contract, by tier:
   a targeted cross-agent message. The helper also never runs startup's
   Phase A.0 pulls or the =.ai/= rsync — the primary already did, and a
   concurrent pull-under-edit is exactly the race the startup guards exist to
-  prevent.
+  prevent. The git ban is concurrency-scoped, and /Helper startup and
+  wrap-up/ below lifts it for exactly one case: an orphaned helper that
+  finds itself alone at wrap-up.
 - /Escalation:/ anything the contract blocks routes through the targeted
   cross-agent form already specced above (=machine.project.agent-id=), or
   just gets reported to Craig.
@@ -562,27 +579,63 @@ uniform across both surfaces (claude is a child of the tmux server either
 way, and the scan matches on cwd regardless). The earlier idea of an
 =ai --no-tmux= mode is retired: there is no tmux-less Emacs path to serve.
 
+Mechanics verified in the code (2026-06-12, =cj/--ai-term-launch-command=):
+the launch builds one shell string — =tmux new-session -A -s aiv-<name>
+-n ai -c <dir> '<agent-command>; exec bash'= — with the startup instruction
+embedded in =cj/ai-term-agent-command= (default: =claude "Read
+.ai/protocols.org and follow all instructions."=). Two consequences:
+
+- Env injection is trivial: prefix the assignments into that same command
+  string (=AI_AGENT_ID=... AI_HELPER=1 claude "..."=). No
+  =tmux set-environment= dance, no post-attach injection.
+- =-A= (attach-or-create) means *one session per project*: F9 on a project
+  with a live session reattaches — ai-term as it stands /cannot create a
+  second agent for the same project/. The helper path therefore needs its
+  own session name (=aiv-<name>-helper-<id>=, plain =new-session=, no =-A=)
+  and its own buffer name (=agent [<name>:helper-<id>]=), and the existing
+  one-session assumptions (DWIM toggle, crash-recovery matching, M-F9
+  close) need to handle the extra sessions.
+
 What remains to design — the integration, not a new surface:
 
-- =ai-term.el='s session-create step learns the same three-step spawn:
-  run =agent-roster= for the picked project, export =AI_AGENT_ID= /
-  =AI_HELPER= into the tmux session when a live agent exists, and send the
-  helper-mode opener instead of the primary startup instruction.
+- =ai-term.el='s session-create learns the three-step spawn: run
+  =agent-roster= for the picked project; when a live agent exists, build
+  the helper variant of the launch command (env prefix + helper-mode opener
+  + helper session/buffer names per above) instead of the primary one.
 - The picker badges grow a =[helper]= state alongside =[running]= /
   =[detached]=, so picking a project that already has a live agent shows
   what the new session will become before it's created.
 - Division of labor: =agent-roster= (the shared script) is the single
   detection implementation; =ai-term.el= and =bin/ai= each do their own
-  id/export/opener wiring on top of it. Whether =ai-term.el= shells out to
-  =bin/ai= wholesale or just to =agent-roster= is the remaining call —
-  ai-term owns tmux session naming (=aiv-= prefix) and window placement
-  that =bin/ai= knows nothing about, which argues for sharing only the
-  roster.
+  id/export/opener wiring on top of it. ai-term owns tmux session naming
+  (=aiv-= prefix) and window placement that =bin/ai= knows nothing about,
+  which argues for sharing only the roster rather than shelling out to
+  =bin/ai= wholesale.
 - The =emacs.md= live-reload discipline applies to the ai-term.el changes,
   and the change lands in the =~/.emacs.d= project (its own repo and
   session scope — a cross-project handoff from rulesets, not a rulesets
   edit).
 
+Recommendations for ai-term.el beyond the helper feature (Craig asked for
+these 2026-06-12; they ride the same handoff):
+
+- /Roster-truth over name-truth./ The picker's =[running]= / =[detached]=
+  badges derive from tmux session names; a raw-shell agent in the same
+  project is invisible to them. Once =agent-roster= exists, badge from the
+  roster (any live agent, however launched) and keep tmux matching only for
+  the reattach/recovery feature.
+- /Survive the agent-command exit consistently./ The =; exec bash= tail
+  keeps the window alive after the agent exits, but the buffer then shows a
+  bare prompt with no hint the agent ended. A small status line (window
+  rename to =ai:done=, or a =tmux display-message=) makes a finished or
+  crashed agent visible from the picker before reattaching.
+- /One source for the opening instruction./ =cj/ai-term-agent-command=
+  hardcodes the same "Read .ai/protocols.org..." sentence the =ai= script
+  carries. When the helper opener lands, both launchers will have /two/
+  openers each. Move the opener strings to one place both read (the =ai=
+  script can emit them: =ai --print-opener [primary|helper]=), so a future
+  wording change can't drift the surfaces apart.
+
 *** Helper startup and wrap-up
 
 Helper startup is deliberately light: resolve the context path, read
@@ -792,7 +845,11 @@ when all three rings are written into the implementation plan.
 - What should the generic project instruction file be: =AGENTS.md=,
   =AI.md=, or runtime-specific only?
 - Should =.ai/session-context.org= become a symlink to the current agent's file,
-  or should it disappear after migration?
+  or should it disappear after migration? /Partially answered by the helper
+  design (2026-06-11): for Phase 1.5 the singleton stays as the primary's
+  real file — neither symlink nor removed — and only helpers use
+  =session-context.d/=. The symlink-or-disappear question is live again only
+  if phases 2-6 move every agent onto ids./
 - Should =rulesets= standardize on =llama.cpp= only, or support =ollama= as the
   default beginner-friendly local runtime?
 - Which local agent CLI should be the first supported offline editor: