aboutsummaryrefslogtreecommitdiff
path: root/.ai/scripts/cross-agent-comms/cross-agent-resume.md
diff options
context:
space:
mode:
authorCraig Jennings <c@cjennings.net>2026-05-06 21:59:52 -0500
committerCraig Jennings <c@cjennings.net>2026-05-06 21:59:52 -0500
commitd81b23ad6b6e437dfe3c338a00a4be39bc555146 (patch)
tree2d4b0d7890fd1fc70d81282b81fed2808c28a106 /.ai/scripts/cross-agent-comms/cross-agent-resume.md
parent201377f57430ef28d02e703a2191434bbee55c75 (diff)
downloadrulesets-d81b23ad6b6e437dfe3c338a00a4be39bc555146.tar.gz
rulesets-d81b23ad6b6e437dfe3c338a00a4be39bc555146.zip
chore(ai): initialize project notes and Claude tooling surfaces
Replace the seed notes.org with project-specific context (layout, install modes, task tracker location, recent inflection point). Bring in the synced template surfaces (protocols, workflows, scripts, references, retrospectives, someday-maybe) as tracked content for this content/documentation project.
Diffstat (limited to '.ai/scripts/cross-agent-comms/cross-agent-resume.md')
-rw-r--r--.ai/scripts/cross-agent-comms/cross-agent-resume.md117
1 files changed, 117 insertions, 0 deletions
diff --git a/.ai/scripts/cross-agent-comms/cross-agent-resume.md b/.ai/scripts/cross-agent-comms/cross-agent-resume.md
new file mode 100644
index 0000000..8aa8357
--- /dev/null
+++ b/.ai/scripts/cross-agent-comms/cross-agent-resume.md
@@ -0,0 +1,117 @@
+# cross-agent-resume
+
+**Purpose.** Clear the HALT file and restart the watcher service. Counterpart
+to `cross-agent-halt`. Resuming agent polling is **explicit per-session** —
+this script doesn't auto-revive halted polling loops; you tell each session
+to re-engage.
+
+## Usage
+
+```
+cross-agent-resume [--tailnet]
+```
+
+### Flags
+
+| Flag | Default | Purpose |
+|---|---|---|
+| `--tailnet` | local only | Clear HALT on every peer in `peers.toml` via SSH over Tailscale. |
+
+## Behavior
+
+### Local resume (default)
+
+1. Remove the HALT file: `rm -f ~/.config/cross-agent-comms/HALT`. (Use `-f`
+ so a missing file isn't an error — running resume when not halted is safe.)
+2. Restart the watcher service: `systemctl --user start cross-agent-watch.path`.
+3. Print a summary:
+ ```
+ ✓ HALT file removed
+ ✓ Watcher service started (cross-agent-watch.path)
+ - cross-agent-send and cross-agent-recv will accept new operations.
+ - Inbound messages held during halt will be picked up by the watcher.
+ - Agent polling does NOT auto-resume. To re-engage polling in a paused
+ session, open that Claude session and tell the agent to resume.
+ ```
+4. Exit 0.
+
+### Cross-tailnet resume (`--tailnet`)
+
+1. Apply local resume steps 1-2 first.
+2. Read `peers.toml` for the list of remote machines.
+3. For each peer, SSH:
+ ```
+ ssh <user>@<host> "rm -f ~/.config/cross-agent-comms/HALT && \
+ systemctl --user start cross-agent-watch.path"
+ ```
+4. Track per-peer success/failure:
+ ```
+ Resuming velox.local ✓ (HALT cleared, watcher started)
+ Resuming bastion.local ✗ (ssh exit 255: no route to host)
+ Resuming locally ✓
+
+ PARTIAL RESUME: 2/3 machines resumed. bastion.local still halted.
+ ```
+5. Exit 0 if all peers resumed; exit 1 on any failure.
+
+## Why agent polling doesn't auto-resume
+
+Two reasons the asymmetry is deliberate:
+
+1. *Auto-resume could silently invert intentional kills.* If you halted
+ because a session was misbehaving, removing HALT shouldn't quietly revive
+ that session's polling. You re-engage explicitly so you're aware of which
+ sessions came back online.
+
+2. *You may want to inspect before resuming.* After a halt, you might want to
+ read pending messages, fix configuration, or kill a particular Claude
+ session entirely. Per-session resume forces that pause.
+
+## Re-engaging polling in a Claude session
+
+After `cross-agent-resume`, open the relevant Claude session and say something
+like:
+
+```
+HALT is cleared; resume polling.
+```
+
+The agent will check the HALT file (now absent), re-create its polling
+schedule, and continue the in-flight conversation from wherever it left off.
+The conversation file is intact; the receiver will pick up any new messages
+that arrived during the halt window.
+
+## Failure modes
+
+| Symptom | Cause | Fix |
+|---|---|---|
+| HALT file doesn't exist | Already resumed (or never halted) | OK — `-f` makes this a no-op. |
+| `systemctl --user start` fails | Watcher service not installed | Install per `cross-agent-watch.md`'s systemd recipe. |
+| `--tailnet` resumes some peers but not others | Same as halt: peer unreachable | Per-peer status reported; resolve manually for unreachable peers. |
+| Permission denied removing HALT file | File owned by another user | Check ownership; HALT files should be owned by the running user. |
+
+## Examples
+
+```bash
+# Local resume after a halt
+cross-agent-resume
+
+# Resume all tailnet peers + local
+cross-agent-resume --tailnet
+```
+
+## Recovery flow
+
+After a halt:
+
+1. Investigate whatever caused the halt (runaway loop, bad config, etc.).
+2. Fix the underlying issue.
+3. Run `cross-agent-resume`.
+4. Open each Claude session that was polling and tell its agent to re-engage.
+5. Confirm operation with `cross-agent-status`.
+
+## See also
+
+- `cross-agent-halt` — counterpart that creates the HALT file.
+- `cross-agent-status` — verify HALT cleared and see pending messages.
+- `cross-agent-comms.org` — protocol spec, `* Halt mechanism` section.