| author | Craig Jennings <c@cjennings.net> | 2026-05-28 01:29:15 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-05-28 01:29:15 -0500 |
| commit | 96f0a5f19672a4ed0eeba3a511a4ff30bcfbd61b | |
| tree | 36446025424c7c1b580ae4cd5f75757639978308 /docs/design/2026-05-28-generic-agent-runtime-spec.org | |
| parent | 1777773d382edf592006ee6d3a0feef78ca25690 | |
chore(intake): file pearl pattern-catalog and codex runtime spec as TODOs
Moved three inbox notes into docs/design/ so the task body links survive: pearl's two pattern-catalog handoffs and codex's v0 generic-agent-runtime spec. Added two corresponding TODOs under Rulesets Open Work, both [#C].
Diffstat (limited to 'docs/design/2026-05-28-generic-agent-runtime-spec.org')
| -rw-r--r-- | docs/design/2026-05-28-generic-agent-runtime-spec.org | 471 |
1 files changed, 471 insertions, 0 deletions
diff --git a/docs/design/2026-05-28-generic-agent-runtime-spec.org b/docs/design/2026-05-28-generic-agent-runtime-spec.org
new file mode 100644
index 0000000..8c16043
--- /dev/null
+++ b/docs/design/2026-05-28-generic-agent-runtime-spec.org
@@ -0,0 +1,471 @@
+#+TITLE: Spec: Generic Agent Runtime Support for rulesets
+#+AUTHOR: Codex
+#+DATE: 2026-05-28
+#+STARTUP: showall
+
+* Introductory note
+
+Craig asked for a design pass on making =rulesets= generic rather than
+Claude-Code-specific. The motivating case is offline operation: if he is on a
+laptop without network, a local LLM should still be able to use the same project
+structure, workflows, memory, and cross-agent conventions. The design also needs
+to support two different LLMs running in the same project at the same time,
+without trampling each other's live session state.
+
+I read the current =rulesets= checkout and found that the reusable core is
+already there: =.ai/= workflows, scripts, cross-agent comms, inboxes, and
+project startup structure are not inherently Claude-specific. The Claude
+assumptions live mostly in naming, install destinations, launcher behavior,
+per-language bundle layout, hook APIs, and a single active
+=.ai/session-context.org= file.
+
+Hardware notes:
+
+- This machine is the high-end local-LLM target: AMD Ryzen AI Max+ 395, 128 GiB
+  RAM, Radeon 8060S / Strix Halo unified memory. For offline agentic coding, I
+  recommend installing =Qwen3-Coder-30B-A3B-Instruct-GGUF= as the default local
+  coding model, preferably =Q6_K= on this machine and =Q4_K_M= as the compatibility
+  quant. It is code-specialized, Apache-2.0, and its GGUF files fit comfortably.
+  For a stronger general fallback on this machine, also install
+  =Qwen3-Next-80B-A3B-Instruct-GGUF= =Q4_K_M=; it is not as code-specialized but
+  gives a much larger model with long context and still fits the 128 GiB system.
+- =velox= hardware from =ssh velox inxi -C -G -m -S --filter=: Intel i7-1370P,
+  64 GiB DDR4, Intel Iris Xe integrated graphics. For that machine, the strongest
+  model I would recommend as normal offline coding stock is
+  =Qwen3-Coder-30B-A3B-Instruct-GGUF= =Q4_K_M=. It should fit in RAM with room for
+  context, but expect CPU-class latency. Also install an 8B fallback for quick
+  edits and low-latency triage.
+
+Suggested archsetup handoff: ask =archsetup= to install the runtime stack
+(=llama.cpp= with Vulkan/CPU support, optionally =ollama= as a simple manager),
+create a shared model cache, and prefetch the model set above during normal
+machine setup when network is available.
+
+Sources checked:
+
+- [[https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct-GGUF][Qwen3-Next-80B-A3B-Instruct-GGUF model card]]: Q4_K_M is 48.4 GB, native
+  context length is 262,144 tokens, Apache-2.0.
+- [[https://huggingface.co/tensorblock/Qwen_Qwen3-Coder-30B-A3B-Instruct-GGUF][Qwen3-Coder-30B-A3B-Instruct GGUF quant listing]]: Q4_K_M is 18.557 GB,
+  Q5_K_M is 21.726 GB, Q6_K is 25.093 GB.
+- [[https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct-GGUF][Qwen3-Next model overview]]: 80B total parameters, 3B active, GGUF support via
+  =llama.cpp= / =llama-cpp-python=.
+- [[https://en.wikipedia.org/wiki/Llama.cpp][llama.cpp overview]]: supports Vulkan, HIP/ROCm, OpenCL, CPU, and other
+  backends. For this hardware class, keep the implementation backend-swappable.
+
+* Status
+
+Draft v0. This is not an implementation plan yet; it is a product/architecture
+spec for the next =rulesets= refactor.
+
+* Problem
+
+=rulesets= is named and wired as a Claude Code rules distribution:
+
+- Global install targets =~/.claude/skills=, =~/.claude/rules=,
+  =~/.claude/hooks=, and =~/.claude/settings.json=.
+- Per-project language bundles copy into =.claude/= and seed =CLAUDE.md=.
+- The launcher =claude-templates/bin/ai= hard-codes =CLAUDE_CMD=claude= and
+  requires the =claude= binary.
+- Template documentation says "Claude" throughout =protocols.org=,
+  =startup.org=, and the README.
+- Hook scripts and settings assume Claude Code's hook protocol and
+  =$CLAUDE_PROJECT_DIR=.
+- The active session file is a singleton =.ai/session-context.org=, which is
+  unsafe when two agents operate in the same project simultaneously.
+
+The result: the good project structure is portable in principle but not in
+practice. A local offline model can read files, but there is no generic runtime
+contract that tells it where to load rules from, where to record live state, how
+to avoid another agent's context file, or how to use the same launcher and
+project discovery flow.
+
+* Goals
+
+- Preserve =.ai/= as the project-neutral workflow, memory, scripts, inbox, and
+  cross-agent layer.
+- Support multiple runtimes:
+  - Claude Code as the existing adapter.
+  - Codex/OpenAI-compatible hosted agents.
+  - Local OpenAI-compatible agents backed by =llama.cpp= / =ollama= / LM Studio.
+- Allow two or more agents to work in the same project concurrently without
+  sharing a live session-context file.
+- Keep current Claude workflows working during migration.
+- Make language bundles and team overlays installable for more than one runtime.
+- Make offline use a first-class path: rules, workflows, launcher, model cache,
+  and local endpoint all work with no network after setup.
+
+* Non-goals for v1
+
+- No attempt to make every Claude hook feature work identically in every runtime.
+  Runtimes expose different hook/event APIs.
+- No automatic prompt translation that rewrites every rule into every vendor's
+  preferred style. V1 should install common rules plus small runtime adapters.
+- No local model benchmarking harness. Pick sensible defaults and make the model
+  inventory configurable.
+- No forced rename of existing =.claude/= installations in existing projects.
+  Compatibility matters.
+
+* Current-state findings
+
+** Project-neutral pieces
+
+These can remain conceptually unchanged:
+
+- =.ai/protocols.org= as the behavioral entry point.
+- =.ai/workflows/= and =.ai/scripts/= as synced canonical project tooling.
+- =.ai/project-workflows/= and =.ai/project-scripts/= as project-owned extension
+  points.
+- =inbox/= and =inbox/from-agents/= as human and agent inboxes.
+- Cross-agent message protocol and scripts. They say "agent" already and are
+  mostly model-neutral.
+
+** Claude-specific pieces
+
+Observed files and assumptions:
+
+- =README.org= describes "Claude Code skills, rules, and per-language project
+  bundles."
+- =Makefile= uses =SKILLS_DIR=$(HOME)/.claude/skills=,
+  =RULES_DIR=$(HOME)/.claude/rules=, =HOOKS_DIR=$(HOME)/.claude/hooks=, and
+  installs =.claude= config.
+- =Makefile deps= installs =@anthropic-ai/claude-code= and checks =claude=.
+- =scripts/install-lang.sh= copies common rules into =PROJECT/.claude/rules=,
+  copies language-specific =claude/= directories, and seeds =CLAUDE.md=.
+- =scripts/sync-language-bundle.sh= fingerprints bundles by
+  =PROJECT/.claude/rules= files.
+- =scripts/install-team.sh= installs team overlays into =PROJECT/.claude/rules=.
+- =scripts/audit.sh= calls the canonical source =claude-templates/.ai=.
+- =claude-templates/bin/ai= requires =claude= and launches
+  =claude "<project instructions>"= in tmux.
+- =languages/elisp/CLAUDE.md= is the project instruction template.
+- =languages/elisp/claude/settings.json= uses Claude Code hooks and
+  =$CLAUDE_PROJECT_DIR=.
+
+* Proposed model
+
+** Vocabulary
+
+- *Core* — runtime-neutral rules, workflows, scripts, and project conventions.
+- *Runtime* — an agent implementation: =claude=, =codex=, =local-openai=,
+  =aider-local=, etc.
+- *Runtime adapter* — install paths, hook wiring, command template, instruction
+  filename, and limitations for one runtime.
+- *Agent instance* — one live process/session in one project, identified by
+  runtime + host + project + unique suffix.
+
+** Directory model
+
+Keep =.ai/= as the stable project-local core.
+
+Change active session state from a singleton:
+
+#+begin_example
+.ai/session-context.org
+#+end_example
+
+to an active-session directory:
+
+#+begin_example
+.ai/session-context.d/
+  <agent-id>.org
+.ai/sessions/
+  YYYY-MM-DD-HH-MM-<agent-id>-<description>.org
+#+end_example
+
+Recommended =agent-id= shape:
+
+#+begin_example
+<host>.<project>.<runtime>.<short-id>
+#+end_example
+
+Examples:
+
+#+begin_example
+pearl.org-drill.claude.a83f
+pearl.org-drill.local-qwen30b.19ca
+velox.archsetup.local-qwen30b.7712
+#+end_example
+
+Compatibility rule: if exactly one active context exists, tools may expose a
+temporary =.ai/session-context.org= symlink or legacy copy for old workflows.
+New workflows should read/write by =AI_AGENT_ID=.
+
+** Runtime manifest
+
+Add a repository-level runtime manifest:
+
+#+begin_example
+runtimes/
+  claude.toml
+  codex.toml
+  local-openai.toml
+#+end_example
+
+Each runtime defines:
+
+#+begin_src toml
+id = "local-openai"
+display_name = "Local OpenAI-compatible agent"
+command = "aider"
+args = ["--model", "openai/qwen-local", "--openai-api-base", "http://127.0.0.1:11434/v1"]
+requires_network = false
+project_instruction_files = ["AGENTS.md", ".ai/protocols.org"]
+global_install_root = "~/.config/rulesets/runtimes/local-openai"
+project_install_dir = ".agents/local-openai"
+supports_hooks = "wrapper"
+supports_mcp = false
+supports_subagents = false
+#+end_src
+
+The manifest lets the launcher and install scripts reason about a runtime
+without hard-coding Claude paths.
+
+** Source layout
+
+Refactor source directories toward:
+
+#+begin_example
+agent-rules/                # former claude-rules; runtime-neutral where possible
+skills/                     # skills with runtime support metadata
+ai-templates/.ai/           # former claude-templates/.ai
+runtimes/claude/            # Claude adapter
+runtimes/codex/             # Codex adapter
+runtimes/local-openai/      # local model adapter
+languages/elisp/common/     # common language bundle material
+languages/elisp/runtimes/claude/
+languages/elisp/runtimes/local-openai/
+teams/deepsat/common/
+teams/deepsat/runtimes/claude/
+#+end_example
+
+Do not require a big-bang rename. V1 can support aliases:
+
+- =claude-rules/= remains as a compatibility symlink or wrapper around
+  =agent-rules/=.
+- =claude-templates/= remains as an alias for =ai-templates/= until all startup
+  workflows are updated.
+- =languages/<lang>/claude/= remains supported by the Claude adapter.
+
+** Install behavior
+
+Replace "install Claude tooling" with "install runtime adapter":
+
+#+begin_example
+make install-runtime RUNTIME=claude
+make install-runtime RUNTIME=local-openai
+make install-lang LANG=elisp PROJECT=~/code/foo RUNTIME=claude
+make install-lang LANG=elisp PROJECT=~/code/foo RUNTIME=local-openai
+#+end_example
+
+Claude adapter:
+
+- Global: =~/.claude/skills=, =~/.claude/rules=, =~/.claude/hooks=.
+- Project: =.claude/= and =CLAUDE.md=.
+- Hook API: Claude Code =settings.json=.
+
+Local OpenAI adapter:
+
+- Global: =~/.config/rulesets/local-openai/= and model server config.
+- Project: =.agents/local-openai/= plus =AGENTS.md= or
+  =.ai/runtime/local-openai/instructions.md=.
+- Hook API: wrapper-level checks only. If the local CLI has no hook protocol,
+  hooks become documented commands or wrapper pre/post actions.
+
+Codex adapter:
+
+- Project instruction file should be =AGENTS.md= where supported.
+- Runtime-specific config lives under =.agents/codex/= or the tool's native
+  config path.
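The per-runtime project targets listed above could be sketched in Python as a small lookup plus an installer that only touches its own runtime's paths. The table values come from the adapter notes; the function name, the choice of =AGENTS.md= for the local adapter, and the placeholder instruction text are assumptions (the real scripts seed instruction files from templates).

```python
from pathlib import Path

# Project-level targets per runtime, taken from the adapter notes in the spec.
# Picking AGENTS.md for local-openai is an assumption; the spec also allows
# .ai/runtime/local-openai/instructions.md.
PROJECT_TARGETS = {
    "claude": {"dir": ".claude", "instructions": "CLAUDE.md"},
    "local-openai": {"dir": ".agents/local-openai", "instructions": "AGENTS.md"},
    "codex": {"dir": ".agents/codex", "instructions": "AGENTS.md"},
}


def install_project_adapter(project: Path, runtime: str) -> list[Path]:
    """Create the adapter directory and instruction file for one runtime.

    Only paths belonging to the selected runtime are created, so installing
    local-openai never writes into .claude/.
    """
    if runtime not in PROJECT_TARGETS:
        raise ValueError(f"unknown runtime: {runtime}")
    target = PROJECT_TARGETS[runtime]
    adapter_dir = project / target["dir"]
    adapter_dir.mkdir(parents=True, exist_ok=True)
    instructions = project / target["instructions"]
    if not instructions.exists():
        # Placeholder content for the sketch only.
        instructions.write_text("Read .ai/protocols.org before starting.\n")
    return [adapter_dir, instructions]


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for path in install_project_adapter(Path(tmp), "local-openai"):
            print(path.relative_to(tmp))
```

An existing instruction file is left alone, which matches the spec's compatibility goal of not disturbing current installs.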
+
+** Launcher behavior
+
+Refactor =claude-templates/bin/ai= into a generic launcher, still named =ai=:
+
+#+begin_example
+ai                                  # choose project and default runtime
+ai --runtime claude .
+ai --runtime local-openai .
+ai --runtime local-qwen30b ~/code/org-drill
+ai --attach
+ai --list-runtimes
+#+end_example
+
+Launcher responsibilities:
+
+- Discover projects by =.ai/protocols.org=, not by "Claude-template project."
+- Select runtime from:
+  - explicit =--runtime=,
+  - project default in =.ai/runtime.toml=,
+  - host default in =~/.config/rulesets/runtime.toml=.
+- Create =AI_AGENT_ID= before launch.
+- Export:
+  - =AI_AGENT_ID=
+  - =AI_RUNTIME=
+  - =AI_PROJECT_DIR=
+  - =AI_SESSION_CONTEXT=.ai/session-context.d/$AI_AGENT_ID.org=
+- Use tmux window names that include runtime when needed:
+  - =org-drill= if only one agent for the project.
+  - =org-drill:claude= and =org-drill:local-qwen30b= if multiple agents exist.
+- Pass a runtime-appropriate opening instruction:
+  - Claude: current command-line prompt.
+  - Local agent: prompt file or initial message that says to read
+    =.ai/protocols.org= and use =AI_SESSION_CONTEXT=.
+
+** Session-context contract
+
+Every runtime must obey:
+
+- Never write the legacy singleton when =AI_SESSION_CONTEXT= is set.
+- Create the context file lazily on the first state-mutating turn.
+- Archive to =.ai/sessions/= with the =agent-id= in the filename.
+- Include runtime and model metadata in frontmatter:
+
+#+begin_example
+#+TITLE: Session context
+#+AGENT_ID: pearl.org-drill.local-qwen30b.19ca
+#+RUNTIME: local-openai
+#+MODEL: Qwen3-Coder-30B-A3B-Instruct-Q6_K
+#+HOST: pearl
+#+STARTED: 2026-05-28T...
+#+end_example
+
+Startup workflow changes:
+
+- Check =.ai/session-context.d/*.org=, not only =.ai/session-context.org=.
+- If the current =AI_AGENT_ID= has a live file, recover it.
+- If other active files exist, surface them as "other active agents" but do not
+  read them wholesale unless needed. This prevents context contamination.
+
+** Cross-agent updates
+
+The existing cross-agent protocol can stay, but add optional fields:
+
+#+begin_example
+#+SENDER_AGENT_ID: pearl.org-drill.claude.a83f
+#+SENDER_RUNTIME: claude
+#+TARGET_AGENT_ID: pearl.org-drill.local-qwen30b.19ca
+#+TARGET_RUNTIME: local-openai
+#+MODEL: Qwen3-Coder-30B-A3B-Instruct-Q6_K
+#+end_example
+
+Destination syntax can remain =machine.project= for project-level delivery.
+Add =machine.project.agent-id= as an optional targeted form when two agents in
+the same project are both active.
+
+Receivers should ignore messages targeted at another =TARGET_AGENT_ID= unless
+the user explicitly asks them to take over.
+
+** Hook and validation strategy
+
+V1 should not pretend all runtimes have Claude's hooks.
+
+Define hook levels:
+
+| Level | Meaning |
+|-------+---------|
+| =native= | Runtime has an event/hook API; install native config. |
+| =wrapper= | =ai= launcher or helper scripts run checks around common actions. |
+| =manual= | Rules document the verification commands; no enforcement. |
+
+Language bundles should declare which hooks are required and which are advisory.
+For local runtimes, start with =manual= plus project-level test commands. Add
+=wrapper= only where the local agent CLI can route edits through a known command.
+
+** Local model runtime
+
+Install a host-level local model service:
+
+- Preferred low-level runtime: =llama.cpp= server with OpenAI-compatible API.
+- Optional manager: =ollama= for simpler model lifecycle where its model catalog
+  is enough.
+- Model cache: =~/.local/share/llm/models= or =/srv/models/llm=.
+- Ports:
+  - =127.0.0.1:11434= for =ollama= if installed.
+  - =127.0.0.1:8081= for =llama-server= default coding model.
+  - =127.0.0.1:8082= for larger/general model when running simultaneously.
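As a rough illustration of how a launcher or health check might use the port plan above, the following Python sketch reports which of the planned local endpoints currently accept TCP connections. The endpoint labels and function names are made up for the example, and an open port only shows that something is listening, not that an OpenAI-compatible API answers there.

```python
import socket

# Port plan from the list above; the dictionary keys are illustrative labels.
ENDPOINTS = {
    "ollama": ("127.0.0.1", 11434),
    "llama-server-coding": ("127.0.0.1", 8081),
    "llama-server-general": ("127.0.0.1", 8082),
}


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def available_endpoints(endpoints: dict[str, tuple[str, int]]) -> list[str]:
    """Names of the endpoints that currently accept connections."""
    return [name for name, (host, port) in endpoints.items() if is_listening(host, port)]


if __name__ == "__main__":
    print(available_endpoints(ENDPOINTS))
```

A fuller check would send a small prompt to the endpoint; this sketch stops at the connection test to stay dependency-free.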
+
+Host model recommendations:
+
+| Host | Hardware | Default offline coding model | Larger/secondary model |
+|------+----------+------------------------------+------------------------|
+| current high-end machine | Ryzen AI Max+ 395, 128 GiB unified RAM, Radeon 8060S | =Qwen3-Coder-30B-A3B-Instruct-GGUF Q6_K= | =Qwen3-Next-80B-A3B-Instruct-GGUF Q4_K_M= |
+| velox | i7-1370P, 64 GiB RAM, Intel Iris Xe | =Qwen3-Coder-30B-A3B-Instruct-GGUF Q4_K_M= | 8B fallback for speed |
+
+Rationale:
+
+- The Qwen3-Coder 30B GGUF sizes leave enough headroom for context and a second
+  agent on both machines.
+- The high-end machine can also carry Qwen3-Next 80B Q4_K_M at 48.4 GB, useful
+  for long-context planning or general reasoning offline.
+- =velox= is memory-capable but GPU-limited; Qwen3-Coder 30B Q4_K_M is the
+  strongest practical coding default before latency becomes the dominant pain.
+
+* Migration plan
+
+** Phase 1: Add runtime identity without renaming everything
+
+- Teach =ai= launcher to set =AI_AGENT_ID=, =AI_RUNTIME=, =AI_PROJECT_DIR=, and
+  =AI_SESSION_CONTEXT=.
+- Update startup/wrap-up workflows to prefer =AI_SESSION_CONTEXT=.
+- Keep legacy =.ai/session-context.org= fallback.
+- Add tests for two simultaneous session-context files.
+
+** Phase 2: Introduce runtime manifests and generic install commands
+
+- Add =runtimes/claude.toml= and make current install behavior data-driven.
+- Add =runtimes/local-openai.toml= with command templates.
+- Add =make install-runtime= and keep =make install= as Claude-compatible alias.
+
+** Phase 3: Split common language bundles from runtime adapters
+
+- Move runtime-neutral language rules into =languages/<lang>/common=.
+- Keep Claude-specific settings/hooks under =languages/<lang>/runtimes/claude=.
+- Add local-openai adapter docs/instructions for at least elisp.
+
+** Phase 4: Rename user-facing docs
+
+- Rename =claude-templates= to =ai-templates= after compatibility aliases exist.
+- Rename =claude-rules= to =agent-rules= after scripts no longer hard-code it.
+- Update docs from "Claude should" to "the active agent should" where the rule is
+  runtime-neutral.
+- Keep a short Claude adapter README for Claude-only behavior.
+
+** Phase 5: Local model install handoff
+
+- Send archsetup an inbox note requesting local model runtime support.
+- After archsetup lands it, teach =rulesets doctor= to verify:
+  - =llama-server= or =ollama= installed.
+  - configured model files exist.
+  - configured OpenAI-compatible endpoint can answer a smoke prompt.
+
+* Test strategy
+
+- Unit-test launcher runtime selection and =AI_AGENT_ID= generation.
+- Unit-test session-context path generation and archival names.
+- Integration-test two fake runtimes launching the same project into distinct
+  context files.
+- Test =sync-language-bundle.sh= compatibility for legacy Claude bundles.
+- Test install-lang for:
+  - =RUNTIME=claude= writes =.claude/= and =CLAUDE.md=.
+  - =RUNTIME=local-openai= writes =.agents/local-openai/= and does not touch
+    =.claude/=.
+- Test startup workflow examples or scripts so they look for
+  =session-context.d= without breaking old projects.
+- Test cross-agent targeted messages with =TARGET_AGENT_ID=.
+
+* Open decisions
+
+- What should the generic project instruction file be: =AGENTS.md=,
+  =AI.md=, or runtime-specific only?
+- Should =.ai/session-context.org= become a symlink to the current agent's file,
+  or should it disappear after migration?
+- Should =rulesets= standardize on =llama.cpp= only, or support =ollama= as the
+  default beginner-friendly local runtime?
+- Which local agent CLI should be the first supported offline editor:
+  =aider=, =opencode=, a simple custom wrapper, or something else?
+
+* Recommended next step
+
+Start with Phase 1 only. The singleton session-context file is the immediate
+correctness issue for simultaneous agents, and it can be fixed without renaming
+the whole repository or disrupting current Claude installs.
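The test strategy in the spec asks for coverage of targeted cross-agent messages. As a starting point, the receiver rule from the cross-agent section (ignore messages addressed to another =TARGET_AGENT_ID= unless the user asks for a takeover) might look like the Python sketch below. The header parsing is a simplification of org keyword lines, and the function names and the =takeover= flag are assumptions.

```python
import re

# Matches org-style keyword lines such as "#+TARGET_AGENT_ID: value".
HEADER = re.compile(r"^#\+([A-Z_]+):\s*(.*)$")


def parse_headers(text: str) -> dict[str, str]:
    """Collect #+KEY: value lines from a message into a dictionary."""
    headers = {}
    for line in text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            headers[match.group(1)] = match.group(2).strip()
    return headers


def should_handle(message: str, my_agent_id: str, takeover: bool = False) -> bool:
    """Decide whether this agent should act on a cross-agent message.

    Messages without a target are project-level and handled by any agent.
    Targeted messages are handled only by the named agent, unless the user
    has explicitly asked this agent to take over.
    """
    target = parse_headers(message).get("TARGET_AGENT_ID")
    if not target:
        return True
    return target == my_agent_id or takeover


if __name__ == "__main__":
    msg = "#+SENDER_RUNTIME: claude\n#+TARGET_AGENT_ID: pearl.org-drill.local-qwen30b.19ca\n\nPlease rerun the tests."
    print(should_handle(msg, "pearl.org-drill.claude.a83f"))
```

Treating an absent target as project-level delivery keeps existing =machine.project= messages working unchanged.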
