#+TITLE: Spec: Generic Agent Runtime Support for rulesets
#+AUTHOR: Codex
#+DATE: 2026-05-28
#+STARTUP: showall

* Introductory note

Craig asked for a design pass on making =rulesets= generic rather than
Claude-Code-specific. The motivating case is offline operation: if he is on a
laptop without network, a local LLM should still be able to use the same project
structure, workflows, memory, and cross-agent conventions. The design also needs
to support two different LLMs running in the same project at the same time,
without trampling each other's live session state.

I read the current =rulesets= checkout and found that the reusable core is
already there: =.ai/= workflows, scripts, cross-agent comms, inboxes, and
project startup structure are not inherently Claude-specific. The Claude
assumptions live mostly in naming, install destinations, launcher behavior,
per-language bundle layout, hook APIs, and a single active
=.ai/session-context.org= file.

Hardware notes:

- This machine is the high-end local-LLM target: AMD Ryzen AI Max+ 395, 128 GiB
  RAM, Radeon 8060S / Strix Halo unified memory. For offline agentic coding, I
  recommend installing =Qwen3-Coder-30B-A3B-Instruct-GGUF= as the default local
  coding model, preferably =Q6_K= on this machine and =Q4_K_M= as the compatibility
  quant. It is code-specialized, Apache-2.0, and its GGUF files fit comfortably.
  For a stronger general fallback on this machine, also install
  =Qwen3-Next-80B-A3B-Instruct-GGUF= =Q4_K_M=; it is not as code-specialized but
  gives a much larger model with long context and still fits the 128 GiB system.
- =velox= hardware from =ssh velox inxi -C -G -m -S --filter=: Intel i7-1370P,
  64 GiB DDR4, Intel Iris Xe integrated graphics. For that machine, the strongest
  model I would recommend as normal offline coding stock is
  =Qwen3-Coder-30B-A3B-Instruct-GGUF= =Q4_K_M=. It should fit in RAM with room for
  context, but expect CPU-class latency. Also install an 8B fallback for quick
  edits and low-latency triage.

Suggested archsetup handoff: ask =archsetup= to install the runtime stack
(=llama.cpp= with Vulkan/CPU support, optionally =ollama= as a simple manager),
create a shared model cache, and prefetch the model set above during normal
machine setup when network is available.

Sources checked:

- [[https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct-GGUF][Qwen3-Next-80B-A3B-Instruct-GGUF model card]]: Q4_K_M is 48.4 GB, native
  context length is 262,144 tokens, Apache-2.0.
- [[https://huggingface.co/tensorblock/Qwen_Qwen3-Coder-30B-A3B-Instruct-GGUF][Qwen3-Coder-30B-A3B-Instruct GGUF quant listing]]: Q4_K_M is 18.557 GB,
  Q5_K_M is 21.726 GB, Q6_K is 25.093 GB.
- [[https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct-GGUF][Qwen3-Next model overview]]: 80B total parameters, 3B active, GGUF support via
  =llama.cpp= / =llama-cpp-python=.
- [[https://en.wikipedia.org/wiki/Llama.cpp][llama.cpp overview]]: supports Vulkan, HIP/ROCm, OpenCL, CPU, and other
  backends. For this hardware class, keep the implementation backend-swappable.

* Status

Draft v0. This is not an implementation plan yet; it is a product/architecture
spec for the next =rulesets= refactor.

Amended 2026-06-11: the resolver half of Phase 1 shipped
(=.ai/scripts/session-context-path= + the protocols.org agent-scoped path
contract); the launcher half has not. A new section, /Concurrent same-project
agents (helper instances)/, covers Craig's motivating case the v0 draft left
under-specified — spawning a second Claude in the same project to look things
up or update tasks safely — and a new Phase 1.5 sequences that slice ahead of
the runtime-neutral phases 2-6, which remain pending a go/no-go.

*Readiness is split by arc* (per the 2026-06-12 Codex review's top finding —
the spec contains two different projects and one label misled):

- *Phase 1.5 — helper instances: READY WITH CAVEATS* (2026-06-12). The
  caveats are binding, not advisory: the three-ring pre-live gate governs
  every merge into synced template paths; the manual drills are gates, not
  suggestions; and the =ai-term.el= work lands as a coordinated
  cross-project handoff to =~/.emacs.d= (the exact artifact is named in
  Phase 1.5), so the rulesets side isn't "done" while the F9 path is
  still unsafe.
- *Phases 2-5 — runtime-neutral refactor: NOT READY.* Blocked on the
  /Decisions required before phases 2-5/ section under Open decisions, and
  on Phase 5's reverification prerequisite (the local-model table is a
  recommendation, not an implementation constant). Parked pending Craig's
  go/no-go on the arc.

The original readiness checklist, resolved:

- [X] Emacs launch surface designed. /Closed 2026-06-12: mechanics verified
  in ai-term.el's code, integration design written, the three open calls
  confirmed by Craig (roster-only sharing, singleton primary,
  helper-mode.org as canonical home)./
- [X] Pre-live test strategy agreed. /The review accepted the three-ring
  gate as the release-safety mechanism and asked for per-ring rollback
  actions — added to the gating section./
- [X] A re-read of the whole helper section after the dust settles. /Done
  2026-06-12: the coherence pass unified the churned subsections and
  verified the ai-term.el claims against code./
- [X] Independent spec review. /Codex, 2026-06-12: Not-ready for the
  combined spec, Phase 1.5 implementable as a scoped slice — which the
  split rubric above now states directly. Dispositions folded in the same
  day; see Review dispositions./

* Problem

=rulesets= is named and wired as a Claude Code rules distribution:

- Global install targets =~/.claude/skills=, =~/.claude/rules=,
  =~/.claude/hooks=, and =~/.claude/settings.json=.
- Per-project language bundles copy into =.claude/= and seed =CLAUDE.md=.
- The launcher =claude-templates/bin/ai= hard-codes =CLAUDE_CMD=claude= and
  requires the =claude= binary.
- Template documentation says "Claude" throughout =protocols.org=,
  =startup.org=, and the README.
- Hook scripts and settings assume Claude Code's hook protocol and
  =$CLAUDE_PROJECT_DIR=.
- The active session file is a singleton =.ai/session-context.org=, which is
  unsafe when two agents operate in the same project simultaneously.

The result: the good project structure is portable in principle but not in
practice. A local offline model can read files, but there is no generic runtime
contract that tells it where to load rules from, where to record live state, how
to avoid another agent's context file, or how to use the same launcher and
project discovery flow.

* Goals

- Preserve =.ai/= as the project-neutral workflow, memory, scripts, inbox, and
  cross-agent layer.
- Support multiple runtimes:
  - Claude Code as the existing adapter.
  - Codex/OpenAI-compatible hosted agents.
  - Local OpenAI-compatible agents backed by =llama.cpp= / =ollama= / LM Studio.
- Allow two or more agents to work in the same project concurrently without
  sharing a live session-context file.
- Keep current Claude workflows working during migration.
- Make language bundles and team overlays installable for more than one runtime.
- Make offline use a first-class path: rules, workflows, launcher, model cache,
  and local endpoint all work with no network after setup.

* Non-goals for v1

- No attempt to make every Claude hook feature work identically in every runtime.
  Runtimes expose different hook/event APIs.
- No automatic prompt translation that rewrites every rule into every vendor's
  preferred style. V1 should install common rules plus small runtime adapters.
- No local model benchmarking harness. Pick sensible defaults and make the model
  inventory configurable.
- No forced rename of existing =.claude/= installations in existing projects.
  Compatibility matters.

* Current-state findings

** Project-neutral pieces

These can remain conceptually unchanged:

- =.ai/protocols.org= as the behavioral entry point.
- =.ai/workflows/= and =.ai/scripts/= as synced canonical project tooling.
- =.ai/project-workflows/= and =.ai/project-scripts/= as project-owned extension
  points.
- =inbox/= and =inbox/from-agents/= as human and agent inboxes.
- Cross-agent message protocol and scripts. They say "agent" already and are
  mostly model-neutral.

** Claude-specific pieces

Observed files and assumptions:

- =README.org= describes "Claude Code skills, rules, and per-language project
  bundles."
- =Makefile= uses =SKILLS_DIR=$(HOME)/.claude/skills=,
  =RULES_DIR=$(HOME)/.claude/rules=, =HOOKS_DIR=$(HOME)/.claude/hooks=, and
  installs =.claude= config.
- =Makefile deps= installs =@anthropic-ai/claude-code= and checks =claude=.
- =scripts/install-lang.sh= copies common rules into =PROJECT/.claude/rules=,
  copies language-specific =claude/= directories, and seeds =CLAUDE.md=.
- =scripts/sync-language-bundle.sh= fingerprints bundles by
  =PROJECT/.claude/rules= files.
- =scripts/install-team.sh= installs team overlays into =PROJECT/.claude/rules=.
- =scripts/audit.sh= calls the canonical source =claude-templates/.ai=.
- =claude-templates/bin/ai= requires =claude= and launches
  =claude "<project instructions>"= in tmux.
- =languages/elisp/CLAUDE.md= is the project instruction template.
- =languages/elisp/claude/settings.json= uses Claude Code hooks and
  =$CLAUDE_PROJECT_DIR=.

* Proposed model

** Vocabulary

- *Core* — runtime-neutral rules, workflows, scripts, and project conventions.
- *Runtime* — an agent implementation: =claude=, =codex=, =local-openai=,
  =aider-local=, etc.
- *Runtime adapter* — install paths, hook wiring, command template, instruction
  filename, and limitations for one runtime.
- *Agent instance* — one live process/session in one project, identified by
  runtime + host + project + unique suffix.

** Directory model

Keep =.ai/= as the stable project-local core.

Change active session state from a singleton:

#+begin_example
.ai/session-context.org
#+end_example

to an active-session directory:

#+begin_example
.ai/session-context.d/
  <agent-id>.org
.ai/sessions/
  YYYY-MM-DD-HH-MM-<agent-id>-<description>.org
#+end_example

Recommended =agent-id= shape:

#+begin_example
<host>.<project>.<runtime>.<short-id>
#+end_example

Examples:

#+begin_example
pearl.org-drill.claude.a83f
pearl.org-drill.local-qwen30b.19ca
velox.archsetup.local-qwen30b.7712
#+end_example

Compatibility rule: if exactly one active context exists, tools may expose a
temporary =.ai/session-context.org= symlink or legacy copy for old workflows.
New workflows should read/write by =AI_AGENT_ID=.

** Runtime manifest

Add a repository-level runtime manifest:

#+begin_example
runtimes/
  claude.toml
  codex.toml
  local-openai.toml
#+end_example

Each runtime defines:

#+begin_src toml
id = "local-openai"
display_name = "Local OpenAI-compatible agent"
command = "aider"
args = ["--model", "openai/qwen-local", "--openai-api-base", "http://127.0.0.1:11434/v1"]
requires_network = false
project_instruction_files = ["AGENTS.md", ".ai/protocols.org"]
global_install_root = "~/.config/rulesets/runtimes/local-openai"
project_install_dir = ".agents/local-openai"
supports_hooks = "wrapper"
supports_mcp = false
supports_subagents = false
#+end_src

The manifest lets the launcher and install scripts reason about a runtime
without hard-coding Claude paths.

** Source layout

Refactor source directories toward:

#+begin_example
agent-rules/                 # former claude-rules; runtime-neutral where possible
skills/                      # skills with runtime support metadata
ai-templates/.ai/            # former claude-templates/.ai
runtimes/claude/             # Claude adapter
runtimes/codex/              # Codex adapter
runtimes/local-openai/       # local model adapter
languages/elisp/common/      # common language bundle material
languages/elisp/runtimes/claude/
languages/elisp/runtimes/local-openai/
teams/deepsat/common/
teams/deepsat/runtimes/claude/
#+end_example

Do not require a big-bang rename. V1 can support aliases:

- =claude-rules/= remains as a compatibility symlink or wrapper around
  =agent-rules/=.
- =claude-templates/= remains as an alias for =ai-templates/= until all startup
  workflows are updated.
- =languages/<lang>/claude/= remains supported by the Claude adapter.

** Install behavior

Replace "install Claude tooling" with "install runtime adapter":

#+begin_example
make install-runtime RUNTIME=claude
make install-runtime RUNTIME=local-openai
make install-lang LANG=elisp PROJECT=~/code/foo RUNTIME=claude
make install-lang LANG=elisp PROJECT=~/code/foo RUNTIME=local-openai
#+end_example

Claude adapter:

- Global: =~/.claude/skills=, =~/.claude/rules=, =~/.claude/hooks=.
- Project: =.claude/= and =CLAUDE.md=.
- Hook API: Claude Code =settings.json=.

Local OpenAI adapter:

- Global: =~/.config/rulesets/local-openai/= and model server config.
- Project: =.agents/local-openai/= plus =AGENTS.md= or
  =.ai/runtime/local-openai/instructions.md=.
- Hook API: wrapper-level checks only. If the local CLI has no hook protocol,
  hooks become documented commands or wrapper pre/post actions.

Codex adapter:

- Project instruction file should be =AGENTS.md= where supported.
- Runtime-specific config lives under =.agents/codex/= or the tool's native
  config path.

** Launcher behavior

Refactor =claude-templates/bin/ai= into a generic launcher, still named =ai=:

#+begin_example
ai                         # choose project and default runtime
ai --runtime claude .
ai --runtime local-openai .
ai --runtime local-qwen30b ~/code/org-drill
ai --attach
ai --list-runtimes
#+end_example

Launcher responsibilities:

- Discover projects by =.ai/protocols.org=, not by "Claude-template project."
- Select runtime from:
  - explicit =--runtime=,
  - project default in =.ai/runtime.toml=,
  - host default in =~/.config/rulesets/runtime.toml=.
- Create =AI_AGENT_ID= before launch.
- Export:
  - =AI_AGENT_ID=
  - =AI_RUNTIME=
  - =AI_PROJECT_DIR=
  - =AI_SESSION_CONTEXT=.ai/session-context.d/$AI_AGENT_ID.org=
- Use tmux window names that include runtime when needed:
  - =org-drill= if only one agent for the project.
  - =org-drill:claude= and =org-drill:local-qwen30b= if multiple agents exist.
- Pass a runtime-appropriate opening instruction:
  - Claude: current command-line prompt.
  - Local agent: prompt file or initial message that says to read
    =.ai/protocols.org= and use =AI_SESSION_CONTEXT=.

** Session-context contract

Every runtime must obey:

- Never write the legacy singleton when =AI_SESSION_CONTEXT= is set.
- Create the context file lazily on the first state-mutating turn.
- Archive to =.ai/sessions/= with the =agent-id= in the filename.
- Include runtime and model metadata in frontmatter:

#+begin_example
#+TITLE: Session context
#+AGENT_ID: pearl.org-drill.local-qwen30b.19ca
#+RUNTIME: local-openai
#+MODEL: Qwen3-Coder-30B-A3B-Instruct-Q6_K
#+HOST: pearl
#+STARTED: 2026-05-28T...
#+end_example

Startup workflow changes:

- Check =.ai/session-context.d/*.org=, not only =.ai/session-context.org=.
- If the current =AI_AGENT_ID= has a live file, recover it.
- If other active files exist, surface them as "other active agents" but do not
  read them wholesale unless needed. This prevents context contamination.

** Cross-agent updates

The existing cross-agent protocol can stay, but add optional fields:

#+begin_example
#+SENDER_AGENT_ID: pearl.org-drill.claude.a83f
#+SENDER_RUNTIME: claude
#+TARGET_AGENT_ID: pearl.org-drill.local-qwen30b.19ca
#+TARGET_RUNTIME: local-openai
#+MODEL: Qwen3-Coder-30B-A3B-Instruct-Q6_K
#+end_example

Destination syntax can remain =machine.project= for project-level delivery.
Add =machine.project.agent-id= as an optional targeted form when two agents in
the same project are both active.

Receivers should ignore messages targeted at another =TARGET_AGENT_ID= unless
the user explicitly asks them to take over.

** Concurrent same-project agents (helper instances) — 2026-06-11 amendment

The motivating case (Craig, 2026-06-11): spawn a second Claude in the same
project to look things up or update tasks safely while the primary session
works. This is same-runtime concurrency — it needs none of the runtime
manifests or local-model machinery from phases 2-6. It sits on the shipped
session-context split plus the contracts below, which the v0 draft didn't
cover: identity assignment at spawn, and write-safety for the files the
session-context split does /not/ isolate.

First, the boundary with subagents: when the lookup fits a dispatched
subagent (the Agent tool), use that — no second session exists and nothing
here applies. A helper instance is for interactive, long-lived parallel work
Craig drives himself in a second terminal.

*** The roster: one detection primitive

=agent-roster= is a small script and the single source of "who else is live
in this project." Every other piece — both launchers and the in-session
startup check — calls it rather than reimplementing the scan.

The scan is stateless, verified live on 2026-06-11 with four concurrent
agents: enumerate running agent processes (=pgrep -x claude=), read each
one's working directory from =/proc/<pid>/cwd=, keep those whose cwd is the
project root or inside it, and exclude the scanner's own process ancestry
(walk parent pids from =/proc/self=). What remains is the set of /other/
live agents in this project. Output: one line per other agent (pid + cwd),
exit 0 when alone. The process-name match is deliberately Claude-scoped for
Phase 1.5; generalizing it (a process name per runtime, from the phase-2
runtime manifests) belongs to the phases 2-6 arc.

Known limits, accepted for v1: an agent session not running as a local
process on this machine (a cloud session against the same checkout) is
invisible to the scan; and the match is on process cwd, so an agent started
from outside the project tree wouldn't be seen. Both are edge shapes the
operator created deliberately and can manage manually. The scan is also
Linux-=/proc=-specific: on an unsupported platform the script reports
"roster unavailable" explicitly (never a silent "alone"), and startup
treats that result as the no-op path from the pre-live gate — same behavior
as the script being absent.

*** Spawn paths: deterministic launcher, startup safety net

Two paths produce a correctly-identified helper. They share the roster and
the same three steps — check, assign identity, route to the role contract —
differing only in /who/ executes the steps.

The *launcher* is the preferred path because a shell script can't skip a
step the way a model-followed instruction can (Craig, 2026-06-11).
=ai --helper [project]= runs the three steps in order: (a) =agent-roster=
against the target directory; (b) when the roster shows a live agent,
assign and export the id (=AI_AGENT_ID=helper-<rand4>=, =AI_HELPER=1=);
(c) launch the agent in that directory with the helper opening instruction
(read and follow =helper-mode.org=) already in the prompt, in a tmux window
named =<project>:helper-<id>=. Roster empty → warn and launch a normal
primary session instead. The Emacs surface gets the same three steps via
=ai-term.el= (next subsection).

The *startup check* is the safety net for sessions born without a launcher
(a raw =claude= in some terminal). The roster runs as the first action of
every session — before any pull, rsync, or anchor read — because everything
startup does next forks on it:

- /Alone, no anchor:/ fresh session. Normal startup.
- /Alone, anchor exists:/ the previous session crashed. Recover — exactly
  today's behavior.
- /Not alone:/ a live primary (or primary + helpers) exists. Skip startup
  entirely and execute =helper-mode.org= instead.

The not-alone branch also covers identity when the launcher didn't provide
one: the session self-assigns =helper-<rand4>=, records the id as the first
line of its context file, and uses the resolved path
(=.ai/session-context.d/<id>.org=) literally from then on. Where a script
consumes =AI_AGENT_ID= (the =session-context-path= resolver), the helper
prefixes that invocation explicitly: =AI_AGENT_ID=<id> .ai/scripts/session-context-path=.
The id lives in the context file, not in ambient shell state, so a helper
session can't lose it between tool calls.

The roster also resolves the standing ambiguity in the singleton check: a
live anchor used to mean only "crashed session"; it now splits into crashed
(no live process) vs concurrent (live process).

*** Identity and the role contract

- The primary keeps the unset-id singleton (=.ai/session-context.org=), per
  Phase 1's compatibility rule — /decided (Craig, 2026-06-12)/. Zero
  friction for the overwhelmingly common one-agent case, and the asymmetry
  is harmless: the helper's writes land in =.ai/session-context.d/=, away
  from the singleton. (This deliberately diverges from the v0 launcher
  model, where every agent gets an id; revisit only if the asymmetry bites
  in practice.)
- Helpers are =helper-<rand4>=, launcher-assigned or self-assigned per the
  spawn paths above, sanitized by =session-context-path=, archived with the
  id in the filename.
- =helper-mode.org= (a template workflow, Craig's "helper.org") is the role
  contract and the *single canonical home* of the helper rules — /decided
  (Craig, 2026-06-12)/: the read/write tiers, the data-integrity rules, the
  light startup, and the helper wrap-up. protocols.org carries a
  one-paragraph pointer; startup.org and wrap-it-up.org reference it rather
  than restating it. It has no operator trigger phrase — the spawn paths
  route to it, and an explicit "you are a helper" instruction is the manual
  fallback.

*** Read/write contract for shared files

The session-context split isolates per-agent session state. Everything else
in the project is shared mutable state — =todo.org=, =.ai/notes.org=,
=inbox/=, docs — and two Edit-tool writers on one org file lose updates
silently (both read, both write, last write wins). The contract, by tier:

- /Always safe:/ any read; writes to the helper's own
  =session-context.d/<id>.org=.
- /Safe by discipline (task updates — the case Craig named):/ scoped writes to
  shared org files under four rules: re-read the file region immediately
  before each edit; anchor the edit on a unique heading; scope each edit to a
  single heading's subtree; never reflow, restructure, or sweep the file.
  Appending a new =**= task at a section end and editing one task's body or
  state both qualify. The race window collapses to seconds, and a collision
  corrupts one heading rather than the file.
- /Primary-only:/ file-wide passes (=todo-cleanup.el=, =lint-org.el=,
  =wrap-org-table.el=, archive sweeps), inbox processing, template sync, and
  ALL git mutation — commit, push, pull, stash. Two committers in one
  worktree contend on the index lock and interleave staging; the helper
  leaves its tree changes for the primary's next commit, or describes them in
  a targeted cross-agent message. The helper also never runs startup's
  Phase A.0 pulls or the =.ai/= rsync — the primary already did, and a
  concurrent pull-under-edit is exactly the race the startup guards exist to
  prevent. The git ban is concurrency-scoped, and /Helper startup and
  wrap-up/ below lifts it for exactly one case: an orphaned helper that
  finds itself alone at wrap-up.
- /Escalation:/ anything the contract blocks routes through the targeted
  cross-agent form already specced above (=machine.project.agent-id=), or
  just gets reported to Craig.

*** Data-integrity rules — 2026-06-11 second pass

The scoped-edit discipline covers helper-vs-primary edits on /different/
headings. Four loss/corruption windows remain, each with a rule. They matter
doubly in the consolidated home project, where a single =todo.org= carries
every personal task and corruption has maximal blast radius.

1. /Primary file-wide passes vs a live helper./ =todo-cleanup.el=,
   =lint-org.el=, and =wrap-org-table.el= rewrite whole files; run while a
   helper is mid-edit, they clobber the helper's just-written change (last
   write wins, silently). Rule: before any file-wide pass, the primary checks
   =session-context.d/= for live helper files. If any exist, it pauses the
   pass and asks Craig (or skips hygiene for that wrap). A crashed helper's
   stale file would block hygiene forever, so staleness is surfaced as a
   judgment call — the file's own content and timestamps show whether the
   helper is really gone — never silently skipped past and never silently
   honored indefinitely. The surfaced message is contractual (review
   finding): it names the file path, its timestamps, and the suggested
   actions (treat as stale and proceed / wait / abort), so the judgment is
   made on evidence rather than a bare "helper detected" warning.
2. /A new primary starting while a helper runs./ The previous primary may
   wrap and exit while a helper keeps working; the next =ai= launch becomes
   primary and runs full startup. The existing guards already do the right
   thing — the helper's uncommitted edits make the tree dirty, and Phase A.0
   skips pulls on a dirty tree. Rule: startup additionally surfaces live
   =session-context.d/= files as active agents (already in the
   session-context contract above), so the new primary knows /why/ the tree
   is dirty instead of treating it as leftover mess to resolve.
3. /Write-ahead edit journal./ The helper logs each intended shared-file
   edit — file, heading, one-line intent — to its own session-context file
   /before/ applying it. The Session Log discipline already requires logging
   state-mutating turns; this tightens it to log-before-write for shared
   files specifically. After a crash, the journal shows which edits landed,
   and nothing about the shared file has to be reconstructed from memory.
4. /The memory dir./ =MEMORY.md= is a shared read-modify-write index with no
   heading anchors — the same lost-update shape as =todo.org= with none of
   the scoped-edit protection. Rule: helpers don't write memory at all.
   Candidate memories go into the helper's session log; the primary (or its
   wrap-up promotion check) writes them.

Backstop, independent of all four: every file-wide pass snapshots its target
to =/tmp= before modifying. =lint-org.el= and =wrap-org-table.el= already do;
=todo-cleanup.el= — the pass that moves whole subtrees — does not, and gets
brought up to the invariant in Phase 1.5. Beyond that, the primary's normal
commit cadence keeps git the recovery layer: a corrupted =todo.org= is one
checkout away from its last committed state, which is exactly why all git
mutation stays primary-only and frequent.

One small collision nit: =inbox-send= filenames carry minute-resolution
timestamps (=YYYY-MM-DD-HHMM-from-<project>-<slug>=). A helper and primary
sending to the same target in the same minute with the same slug would
collide; helper-originated sends include the agent id in the slug.

*** Emacs launch surface — integration contract

Corrected 2026-06-12 after reading the actual config (the first draft of
this subsection assumed eat/vterm and was wrong). The facts, verified in
=~/.emacs.d/modules/=:

- The Emacs terminal is *ghostel* (a native Emacs emulator over
  libghostty-vt, the Ghostty engine) — =term-config.el=. Generic ghostel
  buffers auto-launch tmux inside themselves (=cj/term-launch-tmux=).
- The Emacs AI launch surface already exists: *ai-term.el* (the F9 flow).
  It picks a project by the same =.ai/protocols.org= fingerprint, creates a
  project-named tmux session (=aiv-<basename>=), attaches a ghostel buffer
  to it, and sends the agent's startup instruction. Its picker already
  tracks session liveness (=[running]= / =[detached]= badges, crash
  recovery by matching surviving =aiv-= tmux sessions back to projects).

So Emacs-born agents are /tmux/ sessions viewed through ghostel — the same
process shape as shell-launched =ai= sessions, which makes roster detection
uniform across both surfaces (claude is a child of the tmux server either
way, and the scan matches on cwd regardless). The earlier idea of an
=ai --no-tmux= mode is retired: there is no tmux-less Emacs path to serve.

Mechanics verified in the code (2026-06-12, =cj/--ai-term-launch-command=):
the launch builds one shell string — =tmux new-session -A -s aiv-<name>
-n ai -c <dir> '<agent-command>; exec bash'= — with the startup instruction
embedded in =cj/ai-term-agent-command= (default: =claude "Read
.ai/protocols.org and follow all instructions."=). Two consequences:

- Env injection is trivial: prefix the assignments into that same command
  string (=AI_AGENT_ID=... AI_HELPER=1 claude "..."=). No
  =tmux set-environment= dance, no post-attach injection.
- =-A= (attach-or-create) means *one session per project*: F9 on a project
  with a live session reattaches — ai-term as it stands /cannot create a
  second agent for the same project/. The helper path therefore needs its
  own session name (=aiv-<name>-helper-<id>=, plain =new-session=, no =-A=)
  and its own buffer name (=agent [<name>:helper-<id>]=), and the existing
  one-session assumptions (DWIM toggle, crash-recovery matching, M-F9
  close) need to handle the extra sessions.

What remains to design — the integration, not a new surface:

- =ai-term.el='s session-create learns the three-step spawn: run
  =agent-roster= for the picked project; when a live agent exists, build
  the helper variant of the launch command (env prefix + helper-mode opener
  + helper session/buffer names per above) instead of the primary one.
- The picker badges grow a =[helper]= state alongside =[running]= /
  =[detached]=, so picking a project that already has a live agent shows
  what the new session will become before it's created.
- Division of labor — /decided (Craig, 2026-06-12)/: =agent-roster= (the
  shared script) is the single detection implementation; =ai-term.el= and
  =bin/ai= each do their own id/export/opener wiring on top of it. ai-term
  owns tmux session naming (=aiv-= prefix) and window placement that
  =bin/ai= knows nothing about, so the surfaces share only the roster, not
  the whole launcher.
- The =emacs.md= live-reload discipline applies to the ai-term.el changes,
  and the change lands in the =~/.emacs.d= project (its own repo and
  session scope — a cross-project handoff from rulesets, not a rulesets
  edit). The handoff artifact is exact (review finding, 2026-06-12):
  implementation step one sends an =inbox-send .emacs.d= handoff carrying
  this subsection's integration contract plus the recommendations below,
  and the rulesets task does not close until =.emacs.d= confirms its task
  is filed or landed — otherwise the shell path ships safe while F9 stays
  unsafe and nothing tracks the gap.

Recommendations for ai-term.el beyond the helper feature (Craig asked for
these 2026-06-12; they ride the same handoff):

- /Roster-truth over name-truth./ The picker's =[running]= / =[detached]=
  badges derive from tmux session names; a raw-shell agent in the same
  project is invisible to them. Once =agent-roster= exists, badge from the
  roster (any live agent, however launched) and keep tmux matching only for
  the reattach/recovery feature.
- /Survive the agent-command exit consistently./ The =; exec bash= tail
  keeps the window alive after the agent exits, but the buffer then shows a
  bare prompt with no hint the agent ended. A small status line (window
  rename to =ai:done=, or a =tmux display-message=) makes a finished or
  crashed agent visible from the picker before reattaching.
- /One source for the opening instruction./ =cj/ai-term-agent-command=
  hardcodes the same "Read .ai/protocols.org..." sentence the =ai= script
  carries. When the helper opener lands, both launchers will have /two/
  openers each. Move the opener strings to one place both read (the =ai=
  script can emit them: =ai --print-opener [primary|helper]=), so a future
  wording change can't drift the surfaces apart.

*** Helper startup and wrap-up

Helper startup is deliberately light: resolve the context path, read
=protocols.org=, optionally read the primary anchor's =* Summary= only (never
its Session Log — the context-contamination rule above), and start working.
No pulls, no rsync, no inbox processing, no staleness nudges.

Helper wrap-up forks on the roster, because session-end /ordering/ is not
guaranteed — the primary may wrap and leave while helpers keep working
(Craig, 2026-06-11):

- /Others still live:/ archive its own file to
  =.ai/sessions/YYYY-MM-DD-HH-MM-<id>-<desc>.org=, skip the hygiene passes
  and the commit/push steps, and surface any uncommitted tree changes so
  the remaining agents or Craig pick them up. Today's helper contract.
- /Alone (orphaned helper — the primary already wrapped):/ last agent out
  closes the door. Without this, the helper's edits would strand as a dirty
  worktree with nobody allowed to commit them. The git ban on helpers
  exists for /concurrency/ (index-lock contention, interleaved staging),
  and an orphaned helper has no concurrency — so the ban lifts, and the
  helper assumes closing duties: the full wrap-up, including hygiene passes
  (it is alone; the gate is satisfied), commit through the normal
  review/voice flow, and push.

The mirror case — the /primary/ wrapping while helpers still live — extends
the hygiene gate to the wrap-up commit itself: a wrap-up commit sweeps the
whole tree, including a helper's in-progress edits. With live helpers on the
roster, the primary's wrap-up pauses and asks Craig: commit the helper's
work-in-progress along with its own (the helper's log-before-write journal
shows what was in flight), wait for the helper to finish, or wrap without
the shared-file hygiene and leave the helper to close the door as above.

A helper that dies uncleanly leaves its file in =session-context.d/= — the
next session's roster-aware startup surfaces it as an interrupted agent,
same as the singleton today.

** Hook and validation strategy

V1 should not pretend all runtimes have Claude's hooks.

Define hook levels:

| Level | Meaning |
|-------+---------|
| =native= | Runtime has an event/hook API; install native config. |
| =wrapper= | =ai= launcher or helper scripts run checks around common actions. |
| =manual= | Rules document the verification commands; no enforcement. |

Language bundles should declare which hooks are required and which are advisory.
For local runtimes, start with =manual= plus project-level test commands. Add
=wrapper= only where the local agent CLI can route edits through a known command.

** Local model runtime

Install a host-level local model service:

- Preferred low-level runtime: =llama.cpp= server with OpenAI-compatible API.
- Optional manager: =ollama= for simpler model lifecycle where its model catalog
  is enough.
- Model cache: =~/.local/share/llm/models= or =/srv/models/llm=.
- Ports:
  - =127.0.0.1:11434= for =ollama= if installed.
  - =127.0.0.1:8081= for =llama-server= default coding model.
  - =127.0.0.1:8082= for larger/general model when running simultaneously.

Host model recommendations:

| Host | Hardware | Default offline coding model | Larger/secondary model |
|------+----------+------------------------------+------------------------|
| current high-end machine | Ryzen AI Max+ 395, 128 GiB unified RAM, Radeon 8060S | =Qwen3-Coder-30B-A3B-Instruct-GGUF Q6_K= | =Qwen3-Next-80B-A3B-Instruct-GGUF Q4_K_M= |
| velox | i7-1370P, 64 GiB RAM, Intel Iris Xe | =Qwen3-Coder-30B-A3B-Instruct-GGUF Q4_K_M= | 8B fallback for speed |

Rationale:

- The Qwen3-Coder 30B GGUF sizes leave enough headroom for context and a second
  agent on both machines.
- The high-end machine can also carry Qwen3-Next 80B Q4_K_M at 48.4 GB, useful
  for long-context planning or general reasoning offline.
- =velox= is memory-capable but GPU-limited; Qwen3-Coder 30B Q4_K_M is the
  strongest practical coding default before latency becomes the dominant pain.

* Migration plan

** Phase 1: Add runtime identity without renaming everything

- Teach =ai= launcher to set =AI_AGENT_ID=, =AI_RUNTIME=, =AI_PROJECT_DIR=, and
  =AI_SESSION_CONTEXT=.
- Update startup/wrap-up workflows to prefer =AI_SESSION_CONTEXT=.
- Keep legacy =.ai/session-context.org= fallback.
- Add tests for two simultaneous session-context files.

** Phase 1.5: Helper instances (concurrent same-project Claude) — added 2026-06-11

Independent of the phases 2-6 go/no-go; same-runtime only.

- =agent-roster= script: process-scan detection (cwd match within project
  root, self-ancestry exclusion), exit 0 alone / 1 not-alone, others listed
  pid + cwd. Bats-tested with fake =/proc=-style fixtures or spawned sleeper
  processes.
- Startup gains the detection as its first action, before Phase A.0's pulls:
  not-alone routes to =helper-mode.org= and runs nothing else; alone keeps
  today's crashed-vs-fresh anchor logic.
- =helper-mode.org= template workflow — the role contract: identity
  self-assignment, the read/write tiers, light start, helper wrap-up.
  INDEX entry marked auto-routed (no operator trigger phrase).
- =ai --helper= as the deterministic spawn path: roster check, id
  assignment + export, launch with the helper-mode opening instruction,
  tmux window naming, warn-and-run-primary when the roster is empty.
- Wrap-up ordering rules: helper wrap-up re-runs the roster and assumes
  full closing duties (hygiene + commit + push) when orphaned; primary
  wrap-up with live helpers pauses at the commit and asks (commit helper
  WIP / wait / leave closing to the helper).
- The shared-file read/write contract documented where agents will obey it
  (protocols.org pointing at =helper-mode.org=, with startup.org and
  wrap-it-up.org carrying the helper branches).
- Bats tests: id assignment and sanitization through the launcher, helper vs
  primary path resolution, two simultaneous context files (extends the
  existing =session-context-path= suite).
- Data-integrity items (the second-pass rules): the live-helper gate before
  any file-wide hygiene pass (with stale-file surfacing); =todo-cleanup.el=
  brought up to the backup-to-=/tmp= invariant the other two passes already
  meet; log-before-write journaling for helper shared-file edits in
  protocols.org; memory writes primary-only (helpers log candidates); agent
  id in helper-originated =inbox-send= slugs.
- Manual validation: Craig runs a real helper session against a live primary
  — a task lookup, one scoped =todo.org= edit, wrap-up — and the primary's
  next commit picks up the helper's edit cleanly. Then the corruption drill:
  with the helper mid-task, the primary attempts a wrap-up and the hygiene
  gate visibly pauses on the live helper.

** Phase 2: Introduce runtime manifests and generic install commands

- Add =runtimes/claude.toml= and make current install behavior data-driven.
- Add =runtimes/local-openai.toml= with command templates.
- Add =make install-runtime= and keep =make install= as Claude-compatible alias.

** Phase 3: Split common language bundles from runtime adapters

- Move runtime-neutral language rules into =languages/<lang>/common=.
- Keep Claude-specific settings/hooks under =languages/<lang>/runtimes/claude=.
- Add local-openai adapter docs/instructions for at least elisp.

** Phase 4: Rename user-facing docs

- Rename =claude-templates= to =ai-templates= after compatibility aliases exist.
- Rename =claude-rules= to =agent-rules= after scripts no longer hard-code it.
- Update docs from "Claude should" to "the active agent should" where the rule is
  runtime-neutral.
- Keep a short Claude adapter README for Claude-only behavior.

** Phase 5: Local model install handoff

- Prerequisite (review finding, 2026-06-12): a reverification task runs
  first — record /current/ model URLs, file sizes, licenses, backend
  support, a smoke command, memory fit, and fallback behavior against live
  sources. The model table in the Introductory note is a recommendation
  frozen at 2026-05-28, not an implementation constant; nothing in doctor
  checks or the archsetup handoff bakes it in unverified.
- Send archsetup an inbox note requesting local model runtime support.
- After archsetup lands it, teach =rulesets doctor= to verify:
  - =llama-server= or =ollama= installed.
  - configured model files exist.
  - configured OpenAI-compatible endpoint can answer a smoke prompt.

* Test strategy

- Unit-test launcher runtime selection and =AI_AGENT_ID= generation.
- Unit-test session-context path generation and archival names.
- Integration-test two fake runtimes launching the same project into distinct
  context files.
- Test =sync-language-bundle.sh= compatibility for legacy Claude bundles.
- Test install-lang for:
  - =RUNTIME=claude= writes =.claude/= and =CLAUDE.md=.
  - =RUNTIME=local-openai= writes =.agents/local-openai/= and does not touch
    =.claude/=.
- Test startup workflow examples or scripts so they look for
  =session-context.d= without breaking old projects.
- Test cross-agent targeted messages with =TARGET_AGENT_ID=.

** Phase 1.5 pre-live gating — added 2026-06-11, blocks readiness

The structural risk is the template sync: =startup.org= and =.ai/scripts/=
propagate to /every/ project on its next session, so a half-validated
detection hook goes live everywhere by default. The rollout is gated in
three rings:

1. /Bats ring./ =agent-roster= tested with spawned sleeper processes
   standing in for agents (real =/proc=, real cwds, no fakes needed) plus
   the self-ancestry exclusion against the test's own process chain. The
   startup hook tested for the no-op guarantee: when =agent-roster= is
   absent or reports alone, behavior is byte-identical to today.
   /Rollback: revert the commit; nothing here touches synced paths yet./
2. /Sandbox ring./ A disposable project (its own git repo, never
   template-synced back) runs the live drills before any real project sees
   the feature: primary + helper concurrent edits on one org file; the
   corruption drill (primary wrap-up pauses on a live helper); the
   orphaned-helper drill (primary wraps first, helper closes the door,
   tree ends clean); the raw-launch drill (helper started without the
   launcher gets caught by the startup roster); and the Emacs F9 drill
   (helper spawned via ai-term once its handoff lands).
   /Rollback: delete the sandbox project; no other surface was touched./
3. /Pilot ring./ The startup detection ships dormant-by-construction —
   the hook is a no-op wherever =agent-roster= is missing, and the script
   ships first to one pilot project only (copied into its
   =.ai/project-scripts/=, which the sync never touches) before the
   template-wide release puts it everywhere. Rulesets itself is the
   natural pilot: it's where a broken sweep is noticed fastest.
   /Rollback: delete =agent-roster= from the pilot's project-scripts; the
   hook reverts to its no-op path on the next session./
4. /Template-wide release./ The startup branch and the script land in the
   synced template paths only after the pilot soaks.
   /Rollback: revert the startup.org commit and remove the script from
   =claude-templates/.ai/scripts/=; the next sync's =--delete= clears every
   project's copy, and the no-op guarantee means a half-propagated state
   (some projects synced, some not) is safe in both directions./

Ring-1 test inventory (the review's list, normative): roster alone /
ancestry-exclusion / not-alone-on-sleeper cases; startup no-op
byte-identity when roster is missing or alone; startup routes to
helper-mode and skips pulls/rsync/inbox when not alone; =ai --helper=
assigns a sanitized id, exports both vars, uses the helper opener; primary
and helper resolve distinct context paths; helper-originated =inbox-send=
slugs carry the id; wrap-up pauses on live helpers before hygiene and
commit; orphaned-helper close runs only when the roster reports alone;
=todo-cleanup.el= takes a =/tmp= backup before any mutating mode.

Nothing merges past ring 1 into the synced template paths until ring 2's
drills pass.

* Open decisions

- What should the generic project instruction file be: =AGENTS.md=,
  =AI.md=, or runtime-specific only?
- Should =.ai/session-context.org= become a symlink to the current agent's file,
  or should it disappear after migration? /Partially answered by the helper
  design (2026-06-11): for Phase 1.5 the singleton stays as the primary's
  real file — neither symlink nor removed — and only helpers use
  =session-context.d/=. The symlink-or-disappear question is live again only
  if phases 2-6 move every agent onto ids./
- Should =rulesets= standardize on =llama.cpp= only, or support =ollama= as the
  default beginner-friendly local runtime?
- Which local agent CLI should be the first supported offline editor:
  =aider=, =opencode=, a simple custom wrapper, or something else?

** Decisions required before phases 2-5 — added 2026-06-12 (review finding)

These are the blocker subset of the open decisions above, plus two the
review added. Phases 2-5 stay NOT READY until each has an accepted answer;
deciding them inside code is the failure mode this section prevents.

1. Generic instruction-file strategy (=AGENTS.md= / =AI.md= /
   runtime-specific only).
2. Default local runtime manager/server (=llama.cpp= only vs =ollama=
   as the beginner default).
3. First supported local editing CLI.
4. Phase-2 adapter scope: Claude + one local runtime only, or Codex
   support immediately.
5. Compatibility behavior for existing =CLAUDE.md= / =.claude/= projects
   during the transition.

* Recommended next step

Updated 2026-06-12: implement Phase 1.5 under its READY-WITH-CAVEATS rubric
(the helper task in todo.org carries the plan). Phases 2-5 stay parked until
the decisions section above is answered and Craig calls the go/no-go on the
arc. The original recommendation — start with Phase 1 only — is complete:
Phase 1 shipped.

* Review dispositions — 2026-06-12 Codex review

Every recommendation from [[file:2026-05-28-generic-agent-runtime-spec-review.org][the review]], dispositioned:

| Recommendation | Disposition | Where it landed |
|----------------+-------------+-----------------|
| Split readiness labels by arc | Accept | Status: dual rubric (1.5 READY WITH CAVEATS, 2-5 NOT READY) |
| "Decisions required before phases 2-5" section | Accept | Open decisions, new subsection (5 items) |
| Phase 5 reverification prerequisite | Accept | Phase 5, first bullet; model table marked recommendation-only |
| Exact =.emacs.d= handoff artifact | Accept | Emacs subsection: =inbox-send= handoff is implementation step one; task closes on =.emacs.d= confirmation |
| Per-ring rollback actions | Accept | Pre-live gating: rollback line per ring, incl. the half-propagated-sync case |
| Stale-helper message contract | Accept | Data-integrity rule 1: path + timestamps + suggested actions |
| Roster unsupported-platform behavior | Accept | Roster subsection: explicit "roster unavailable" result, no-op path |
| Ring-1 test inventory | Accept | Pre-live gating: the review's list adopted as normative |
| Docs-update list for 1.5 | Accept | Already in Phase 1.5 items; INDEX auto-routed note included |
| Physically split the spec (open question) | Modify | Dual rubric in one document; phases 2-5 stay parked here pending Craig's go/no-go, matching the standing task framing |

Rejections: none.

** Second pass — 2026-06-12

The re-review after the fold-in confirmed every disposition (its Agreed
decisions section accepts the split rubric, the caveats, and the
one-document structure) and found no new high- or medium-priority issues.
One optional editorial edit was suggested and accepted: the Emacs
subsection heading no longer says "open issue / blocks readiness", since
its body and the Status checklist had already closed it — it now reads
/Emacs launch surface — integration contract/. The review's drop-in task
shapes are already represented in todo.org per the review itself; no new
tasks were filed.

* Review and iteration history

** 2026-06-12 Fri @ 02:09:10 -0500 — Codex — reviewer
- What changed or was recommended: ran the spec-review workflow and wrote a formal review. Rubric for the whole spec: =Not ready=. Phase 1 is already shipped; Phase 1.5 helper instances are implementable as a scoped slice with the existing rollout/manual-validation caveats; phases 2-5 remain blocked on product choices and time-sensitive local-runtime/model verification.
- Why: the spec now combines a concrete same-runtime helper implementation with a broader runtime-neutral refactor whose instruction-file, local-runtime, first-CLI, and adapter-scope decisions are still open.
- Artifacts: [[file:2026-05-28-generic-agent-runtime-spec-review.org][2026-05-28-generic-agent-runtime-spec-review.org]]; existing [[file:../../todo.org][todo.org]] entries for "Helper-instance support" and "Generic agent runtime support — Codex spec v0" updated with the review outcome.

** 2026-06-12 Fri @ 02:23:04 -0500 — Claude — author (spec-response)
- What changed: folded the review in. All recommendations accepted except the document-split open question, modified to a dual rubric in one document (see Review dispositions). Status now labels Phase 1.5 READY WITH CAVEATS and phases 2-5 NOT READY; the original readiness checklist is fully resolved.
- Why: the review's top finding was that one Not-ready label hid an implementable slice; the rest hardened the slice's rollout (per-ring rollbacks, normative test inventory, exact .emacs.d handoff artifact, stale-helper message contract, roster platform behavior) and fenced the parked arc (decisions-required section, Phase 5 reverification prerequisite).
- Artifacts: this spec's Status, Pre-live gating, Phase 5, Open decisions, and Review dispositions sections; the helper task in todo.org carries the same caveats.

** 2026-06-12 Fri @ 02:39:38 -0500 — Codex — reviewer
- What changed or was recommended: re-ran the spec-review workflow after the response pass. Rubric by arc: Phase 1.5 helper instances =Ready with caveats=; phases 2-5 runtime-neutral refactor =Not ready=; combined document =ready only for the scoped helper slice. No new blocking findings for Phase 1.5. Optional editorial note: rename the stale "Open issue: the Emacs launch surface" heading now that the body is an integration contract.
- Why: the response pass accepted or modified every prior recommendation into the spec, so the remaining implementation risk is controlled by the caveats/gates already written. The broader runtime-neutral arc still correctly waits on explicit product decisions and current-source local-runtime verification.
- Artifacts: [[file:2026-05-28-generic-agent-runtime-spec-review.org][2026-05-28-generic-agent-runtime-spec-review.org]] updated with this second-pass review; existing [[file:../../todo.org][todo.org]] helper/runtime tasks remain the tracking surface.

** 2026-06-12 Fri @ 02:47:07 -0500 — Claude — author (spec-response, second pass)
- What changed: folded the second review in. Nothing substantive remained to change — the review confirmed all dispositions with zero new high/medium findings — so the response is the accepted editorial heading rename (the Emacs subsection is now /Emacs launch surface — integration contract/) and the second-pass note under Review dispositions.
- Why: the review cycle has converged. Phase 1.5 stands at Ready-with-caveats with both reviewer and author aligned on the caveats; phases 2-5 stay parked behind the decisions fence.
- Artifacts: the renamed Emacs subsection; the Review dispositions second-pass note; Codex's second-pass review file and its todo.org/history edits ride in the same commit.