#+TITLE: Cross-Agent Communication Workflow (v5)
#+AUTHOR: Craig Jennings & Claude (homelab + career sessions)
#+DATE: 2026-04-27
#+VERSION: 5

* Status

Draft. Iterating between the homelab and career sessions through a multi-round design discussion. Awaiting Craig's review for promotion to =~/code/rulesets/claude-templates/.ai/workflows/=.

v5 changes from v4:
- *Script absorption.* Seven operational scripts (=cross-agent-send=, =cross-agent-recv=, =cross-agent-watch=, =cross-agent-status=, =cross-agent-discover=, =cross-agent-halt=, =cross-agent-resume=) now own most implementation detail. Their READMEs are the operational source of truth. The spec stays declarative.
- *Failsafe halt.* Layered HALT-file mechanism stops all cross-agent activity on a machine within ~5 min, without visiting individual sessions or restarting Claude Code. =cross-agent-halt= and =cross-agent-resume= are the convenience entry points; every other component checks the HALT file independently.
- *Identity.* Messages are GPG-signed by sender and verified by receiver. Combined with POSIX permissions on =from-agents/= and Tailscale-level network auth, identity becomes a three-layer story.
- *Atomic writes.* Writers MUST use temp-file + rename. =cross-agent-send= handles this; the spec just states the contract.
- *Dedup.* Sequence-collision dedup is now binary SHA-256 equality, not a fuzzy ">90% match" threshold.
- *Cold-start handling.* Layered: =cross-agent-watch= (push notifications via =inotifywait=) is the primary mechanism; startup-workflow check and user-direct-injection are coverage layers.
- *Spec stays roughly the same length but does more protocol work.* Operational detail (rsync retry numbers, inotifywait recipes, peers.toml schema, GPG flags, dedup mechanics) moved to the script READMEs. The spec adds new protocol elements (identity layer, atomic-writes contract, SHA-256 dedup, =escalate= type, =RELEASE_STATUS= values, =REQUIRES_TOOLS= optional field) in the freed space. Total documentation surface (spec + seven READMEs ≈ 1000 lines) is larger than v4's 259 lines, but the spec and the READMEs serve different audiences — protocol-thinkers and CLI-users — and a reader of just the spec can comprehend the protocol without consulting any README.

* When to use

When two Claude sessions in different projects (same machine or different machines on the same Tailscale tailnet) need to coordinate on a shared task that one session can't complete alone — typically because one has tooling, context, or MCP access the other doesn't.

Examples that fit:
- Session A asks session B to apply a workflow patch in B's project, then verify it.
- Session A runs a long task and needs session B to monitor results in B's domain.
- Two sessions co-design a workflow.

Examples that don't fit:
- A simple file handoff that doesn't require iteration.
- A task one session can do alone.
- Cross-tailnet or cross-organization. The protocol is local-tailnet-scoped.

* Protocol

** File location

Each project has =inbox/from-agents/= as its agent-comms mailbox. Create the directory if it doesn't exist; set permissions =chmod 700= and ownership to the user.

- Sender writes to receiver's =inbox/from-agents/=.
- Receiver polls (or watches) =inbox/from-agents/=, *not* the parent =inbox/=.
- The parent =inbox/= stays reserved for human-triage items.
- Out-of-band artifacts (PDFs, datasets) live at =inbox/from-agents/artifacts/=. Reference by relative path in the message body.

The user does NOT write directly to =from-agents/=. To inject input into a running conversation, the user tells one of the agents in that agent's session; the agent writes the input as a normal message attributed to the user.

** File naming

=YYYYMMDDTHHMMSSZ-from-<sender>-<short-conv-id>.org=

- Timestamp is UTC ISO 8601 compact. The trailing =Z= is mandatory.
- =from-<sender>= prefix.
- =<short-conv-id>= is a stable kebab-case slug across the back-and-forth. Reusable across time; ordering relies on filename timestamps.

Frontmatter =#+TIMESTAMP= carries the same instant in local time with explicit offset. The two MUST refer to the same instant.

The implementation (=cross-agent-send=) generates the canonical filename from the message's frontmatter (=CONVERSATION_ID=, current UTC time) and the sender's project context. Senders supply only the message body file; the script handles naming. Senders MUST NOT pre-name files in this format and pass them through; the script overwrites with its own canonical name to ensure consistency and enable the sender-side max-seen sequence-collision-reduction scan.

GPG signatures live in a sibling file =YYYYMMDDTHHMMSSZ-from-<sender>-<short-conv-id>.org.asc=. Receivers verify before processing. See =** Writes are atomic= for the two-file delivery ordering rule.
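
The naming rule can be sketched in a few lines of Python. This is a minimal illustration, not the =cross-agent-send= implementation: =canonical_name=, =signature_name=, and the kebab-case regular expression are hypothetical names and checks introduced here.

#+begin_src python
import re
from datetime import datetime, timedelta, timezone

# Assumption: a slug is lowercase alphanumerics separated by single hyphens.
KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def canonical_name(sender: str, conv_id: str, now: datetime) -> str:
    """Build YYYYMMDDTHHMMSSZ-from-<sender>-<short-conv-id>.org from an aware datetime."""
    if now.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    if not KEBAB.match(conv_id):
        raise ValueError(f"conversation id is not kebab-case: {conv_id!r}")
    # Convert to UTC first so the mandatory trailing Z is truthful.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stamp}-from-{sender}-{conv_id}.org"

def signature_name(message_name: str) -> str:
    """The detached signature is a sibling file with .asc appended."""
    return message_name + ".asc"

# A local time with an explicit offset (as in #+TIMESTAMP) and the filename
# timestamp describe the same instant.
local = datetime(2026, 4, 27, 1, 2, 3, tzinfo=timezone(timedelta(hours=-5)))
name = canonical_name("homelab", "prep-fixup", local)
# name == "20260427T060203Z-from-homelab-prep-fixup.org"
#+end_src

Deriving both forms from one aware =datetime= is one way to satisfy the rule that the filename and =#+TIMESTAMP= refer to the same instant.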

** Frontmatter

Required:

#+begin_example
#+TITLE: <human-readable subject>
#+CONVERSATION_ID: <stable across the thread>
#+MESSAGE_TYPE: <see types below>
#+SEQUENCE: <integer hint>
#+TIMESTAMP: <ISO 8601 with explicit offset>
#+PROTOCOL_VERSION: 5
#+end_example

Optional:

#+begin_example
#+REQUIRES_TOOLS: <comma-separated tool/MCP slugs, e.g. gmail-mcp, slack-mcp>
#+RELEASE_STATUS: <see release-statuses; valid only on MESSAGE_TYPE: release>
#+WORKFLOW_VERSION: <sender's version of cross-agent-comms.org; informational only in v5 — no enforcement>
#+end_example

Receiver sanity-checks frontmatter before acting. Missing or malformed frontmatter → surface to user, don't proceed. Mismatched =PROTOCOL_VERSION= → receiver writes a =query= asking the originator to upgrade.
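
A rough sketch of the receiver-side sanity check, assuming frontmatter is a run of =#+KEY: value= lines at the top of the file. The function names are hypothetical, and mapping "surface to user, don't proceed" onto a =reject= decision is an assumption about how =cross-agent-recv= might label that outcome.

#+begin_src python
REQUIRED = ("TITLE", "CONVERSATION_ID", "MESSAGE_TYPE",
            "SEQUENCE", "TIMESTAMP", "PROTOCOL_VERSION")
SUPPORTED_VERSION = "5"

def parse_frontmatter(text: str) -> dict:
    """Collect leading '#+KEY: value' lines; stop at the first line that is not one."""
    fields = {}
    for line in text.splitlines():
        if not line.startswith("#+") or ":" not in line:
            break
        # Split on the first colon only, so timestamps keep their colons.
        key, _, value = line[2:].partition(":")
        fields[key.strip().upper()] = value.strip()
    return fields

def sanity_check(fields: dict) -> str:
    """Return 'reject' (surface to user, do not act), 'query', or 'process'."""
    missing = [key for key in REQUIRED if not fields.get(key)]
    if missing or not fields["SEQUENCE"].isdigit():
        return "reject"
    if fields["PROTOCOL_VERSION"] != SUPPORTED_VERSION:
        return "query"  # ask the originator to upgrade
    return "process"
#+end_src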

** Identity

Messages are GPG-signed by the sender. Receivers verify the detached signature before processing the message body.

The implementation (=cross-agent-send=) signs automatically with the sender's configured key (the user's primary GPG key by default; configurable via =--key= flag or environment). Receivers verify automatically against the keys in their GPG keyring.

Identity is a three-layer story:

1. *Tailscale layer.* Only tailnet members can reach the rsync-over-SSH endpoint at all.
2. *POSIX layer.* =chmod 700= on =from-agents/= means only processes running as the directory's owner (or as root) can write.
3. *GPG layer.* Sender's signature on each message proves the message originated from a process holding the key.

Three independent layers. Per-user GPG (using existing keys) gives a correctness check more than a security boundary — unsigned messages are almost certainly bugs, not attackers. That's still load-bearing.

** Writes are atomic

Writers MUST use a temp-file + rename pattern (=mktemp= + =mv= within the same filesystem) so receivers never see partial files. The implementation script (=cross-agent-send=) handles this.

Receivers ignore =.tmp.*= files, processing only the final renamed name.

*Two-file ordering.* When a message has a sibling GPG signature file (=.org.asc=), the writer MUST rename the =.asc= to its final name *before* renaming the =.org=. Two =mv= operations are not atomic together — without this ordering, a receiver could read the =.org= in the window between the two renames and fail GPG verify because the =.asc= hasn't landed yet. The rule: receiver only acts on =.org= files, and a =.org= without a corresponding =.asc= means the signature is genuinely missing (not still in flight).
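
The temp-file + rename contract and the two-file ordering can be sketched as follows. This is an illustration under assumptions, not the =cross-agent-send= code: =deliver= and =visible_messages= are hypothetical helpers, and the sketch writes locally rather than over rsync.

#+begin_src python
import os
import tempfile

def _atomic_write(directory: str, final_name: str, data: bytes) -> None:
    """Write to a .tmp.* file in the same directory, then rename into place."""
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp.", dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # Same directory means same filesystem, so the rename is atomic on POSIX.
        os.replace(tmp_path, os.path.join(directory, final_name))
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

def deliver(directory: str, name: str, body: bytes, signature: bytes) -> None:
    """Land the .asc first so a visible .org always has its signature."""
    _atomic_write(directory, name + ".asc", signature)
    _atomic_write(directory, name, body)

def visible_messages(directory: str) -> list:
    """Receiver view: only final .org names, never .tmp.* or .asc files."""
    return sorted(entry for entry in os.listdir(directory)
                  if entry.endswith(".org") and not entry.startswith(".tmp."))
#+end_src

Creating the temp file inside the destination directory keeps the rename on one filesystem, which is what makes it atomic.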

** Sequence numbering

=#+SEQUENCE= is a *hint*, not a strict counter. Canonical order is =#+TIMESTAMP=. Sequences may collide under rapid back-and-forth (both sides write what they think is sequence N near-simultaneously). Treat collision as a normal protocol event.

*Receiver-side dedup rule.* When a new file shares =CONVERSATION_ID= + =SEQUENCE= with an already-processed message, compare SHA-256 hashes. Identical hashes → silent dedup, treat as a retry. Different hashes → process both, ordered by =#+TIMESTAMP=.

*Sender-side collision-reduction (best-effort).* Before picking a sequence number, scan the receiver's =from-agents/= for the highest existing sequence in this conversation across both sender prefixes. Use =max(seen) + 1=.
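
Both rules are small enough to sketch directly. The helper names and the in-memory =processed= map are assumptions made for illustration; a real receiver would need to persist what it has already handled.

#+begin_src python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def dedup_decision(new_body: bytes, processed: dict, conv_id: str, sequence: int) -> str:
    """processed maps (conversation id, sequence) to the SHA-256 digests already handled."""
    seen = processed.setdefault((conv_id, sequence), set())
    digest = sha256_hex(new_body)
    if digest in seen:
        return "dedup"    # identical bytes: a retry, drop silently
    seen.add(digest)
    return "process"      # first sight, or a genuine collision: process it

def next_sequence(seen_sequences) -> int:
    """Sender-side hint: one past the highest sequence seen, or 1 for a new thread."""
    return max(seen_sequences, default=0) + 1
#+end_src

When two different bodies share a sequence number, both come back as =process=; ordering them by =#+TIMESTAMP= is left to the caller.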

** Message types

- *request* — a side asks for work, input, or a decision. Sequence 1 is always =request=.
- *progress* — work-in-progress checkpoint. "Here's where I am, no action needed from you, more coming." Originator's poll loop should NOT page the user on progress messages.
- *query* — either side asks a clarifying question that blocks further work. Originator's poll loop SHOULD surface this immediately. Originator answers and work continues.
- *pushback* — receiver formally disagrees with the request and has *not* started the work. Carries reasoning. Distinct from =query= because the originator's response path differs.
- *complete* — receiver signals the requested work is done. Triggers verification.
- *release* — terminal type. Originator writes after verifying =complete=. Carries =RELEASE_STATUS= to disambiguate the closure mode.
- *escalate* — punts the conversation to the user for adjudication. Both sides pause polling on =escalate=; the user resolves.

Reply expectation is implied by type: =request=, =query=, =pushback=, =escalate= expect a reply; =progress=, =complete=, =release= don't.
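
One way to encode the seven types and the implied reply expectation, as a sketch. The enum and function names are illustrative and not part of the protocol.

#+begin_src python
from enum import Enum

class MessageType(Enum):
    REQUEST = "request"
    PROGRESS = "progress"
    QUERY = "query"
    PUSHBACK = "pushback"
    COMPLETE = "complete"
    RELEASE = "release"
    ESCALATE = "escalate"

# Types whose sender waits for an answer; the other three are fire-and-forget.
EXPECTS_REPLY = frozenset({
    MessageType.REQUEST,
    MessageType.QUERY,
    MessageType.PUSHBACK,
    MessageType.ESCALATE,
})

def expects_reply(raw_type: str) -> bool:
    """Raises ValueError for a type the protocol does not define."""
    return MessageType(raw_type) in EXPECTS_REPLY
#+end_src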

** Conversation lifecycle

A conversation is a directed loop between an originator (the side that issued sequence 1) and a receiver:

1. Originator writes =request= (sequence 1). Begins polling for replies.
2. *Optional acknowledgment.* Receiver may write a =progress= at sequence 2 to acknowledge receipt and set expectations. Required if work will take >5 minutes (so the originator's poll loop doesn't waste wakes).
3. *Optional echo-back.* For ambiguous or large requests, receiver writes a =progress= that restates work items and announces "starting now unless you push back within N minutes."
4. Receiver works. May write =progress= updates. =query= mid-work if blocked. =pushback= if the request is wrong.
5. Receiver writes =complete=. Begins polling for =release=.
6. Originator reads, *verifies the deliverable directly*. For subjective deliverables, verification is the originator's editorial accept.
7. If verified: =release= with =RELEASE_STATUS: complete=. If problems: new =request= (next sequence number).
8. Receiver sees =release=, stops polling.

The verification step is load-bearing. =complete= is a *claim*; =release= is *verification*.

** Pushback path

On receiving a =pushback=, the originator chooses:

1. *Revise* — new =request= with adjusted scope.
2. *Insist* — new =request= addressing the pushback's reasoning, standing by direction.
3. *Withdraw* — =release= with =RELEASE_STATUS: withdrawn-after-pushback=.

*Deadlock cap.* After two pushback-insist exchanges, the next message MUST be =MESSAGE_TYPE: escalate=. Both agents pause polling; the user resolves.

** =RELEASE_STATUS= values

| Status | Meaning |
|---+---|
| =complete= | Goal achieved, originator verified |
| =cancelled= | Originator changed their mind mid-conversation |
| =withdrawn-after-pushback= | Originator chose option 3 on receiver's =pushback= |
| =abandoned-after-escalation= | User adjudicated and chose to close the conversation |
| =abandoned-after-timeout= | Originator never returned to verify and the receiver stopped polling; set on a late =release= to close the orphaned thread |

** Async fallback

If the originator session ends between =request= and =complete=, the receiver's =complete= goes unverified. Receiver behavior:

- Polls for =release= up to ~24 hours of cycles (implementation default).
- After timeout, writes a final =progress= message ("treating as terminal-without-verification; originator never returned to release") and stops polling. Receiver does NOT write =release= itself — that would contradict the lifecycle rule that =release= is the originator's terminal action.
- Next time the originator project starts, the unreleased =complete= is surfaced as a startup item. The user can issue a late =release= (with whichever =RELEASE_STATUS= fits) or open a fresh conversation to revisit. =RELEASE_STATUS: abandoned-after-timeout= is used at that point if the user wants to formally close the orphaned thread.

** Escalation

A side writes =escalate= when:
- Pushback-insist deadlock cap reached.
- Conversation has stalled (no productive movement in N exchanges).
- A reply-expecting message has gone unanswered past timeout.

The body summarizes both sides' positions briefly enough to read in about 60 seconds. Both agents pause polling; the user resolves.

* Implementation notes

This section describes how to operate the protocol. Operational detail lives in the seven scripts' READMEs.

** Recommended scripts

| Script | Replaces user action | README |
|---+---+---|
| =cross-agent-send <dest> <msg>= | Filename generation, GPG sign, atomic write, peer lookup, rsync push, retry+backoff, failure surfacing — seven mechanical sender-side steps. Frontmatter and message body are still author-supplied. | =cross-agent-send.md= |
| =cross-agent-recv <msg>= | Frontmatter sanity-check, =PROTOCOL_VERSION= verify, GPG verify, SHA-256 dedup, =REQUIRES_TOOLS= check — five mechanical receiver-side steps. Output is a structured decision (=process= / =dedup= / =query= / =reject=) the agent acts on. | =cross-agent-recv.md= |
| =cross-agent-watch= | Manually checking inboxes; "did I get a message?" | =cross-agent-watch.md= |
| =cross-agent-status= | Walking each project to count pending messages | =cross-agent-status.md= |
| =cross-agent-discover= | Remembering project topology and reachability | =cross-agent-discover.md= |
| =cross-agent-halt [reason] [--tailnet]= | Visiting each session to stop polling, restarting Claude Code, or hand-killing processes when comms go runaway. =--tailnet= propagates HALT to all peers. | =cross-agent-halt.md= |
| =cross-agent-resume [--tailnet]= | Manually clearing the HALT state and restarting the watcher. Per-session polling does NOT auto-resume — the user re-engages each session explicitly. | =cross-agent-resume.md= |

The scripts are tools the user runs from any terminal. They do not depend on agent context — =cross-agent-status= run from a fresh shell works.

A reader can comprehend this protocol from this spec alone. Script READMEs add operational detail that makes the protocol practical to use, but understanding the protocol's semantics requires only this document.

** Polling

Default cadence: 270 seconds (4.5 min), which sits just under the 5-minute prompt-cache TTL.

If a side needs to slow down (heads-down work, idle wait), it writes a =progress= message saying so in prose. The other side adapts. There are no named polling modes.

After ~12 empty polls in a row, the poll loop surfaces the silence to the user.
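
The cadence numbers can be checked with a little arithmetic. The constant names are illustrative, and the 12-poll and 24-hour figures are the approximate defaults quoted in this spec.

#+begin_src python
POLL_SECONDS = 270
CACHE_TTL_SECONDS = 5 * 60
EMPTY_POLLS_BEFORE_SURFACING = 12
RELEASE_WAIT_SECONDS = 24 * 60 * 60

# Margin between one poll interval and the prompt-cache TTL.
headroom_seconds = CACHE_TTL_SECONDS - POLL_SECONDS                  # 30

# How long the loop stays quiet before surfacing silence to the user.
silence_minutes = EMPTY_POLLS_BEFORE_SURFACING * POLL_SECONDS / 60   # 54.0

# Number of polls a receiver spends waiting for release over about 24 hours.
release_polls = RELEASE_WAIT_SECONDS // POLL_SECONDS                 # 320
#+end_src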

A future runtime with native filesystem-event support could replace polling for active sessions; =cross-agent-watch= already provides event-driven notifications outside active sessions.

** User multi-tasking

- *Deferral.* If the user's last message in the agent's session was less than 60 seconds ago AND a poll fires, queue the inbox check until either the user sends another message OR 5 minutes pass without further input.
- *Surfacing.* On the next user-facing response: "While we were working on X, a cross-agent message landed from <project>. It's a =<type>= — want me to handle it now or after we finish?"
- *Mid-question.* Answer the user first.
- *Project switch.* If the user moves to the receiver project mid-conversation, the receiver agent surfaces the in-flight thread on first user prompt.
- *Conversation state.* Always include in any response that mentions a cross-agent thread: "<conv-id> at sequence N, awaiting <event>."
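
The deferral rule can be sketched as a small piece of state. This is one reading of the rule, under a stated assumption: the 5-minute window is measured from the user's last message. =DeferralState=, =on_poll=, and =deferred_check_due= are hypothetical names.

#+begin_src python
from dataclasses import dataclass
from typing import Optional

RECENT_INPUT_SECONDS = 60
MAX_DEFER_SECONDS = 5 * 60

@dataclass
class DeferralState:
    last_user_message_at: float          # seconds on any monotonic clock
    deferred_at: Optional[float] = None  # when an inbox check was queued, if any

def on_poll(state: DeferralState, now: float) -> str:
    """Return 'deferred' if the user was active in the last 60 s, else 'check-now'."""
    if now - state.last_user_message_at < RECENT_INPUT_SECONDS:
        if state.deferred_at is None:
            state.deferred_at = now
        return "deferred"
    return "check-now"

def deferred_check_due(state: DeferralState, now: float, user_just_spoke: bool) -> bool:
    """A queued check runs on the next user message or after 5 idle minutes."""
    if state.deferred_at is None:
        return False
    idle = now - state.last_user_message_at
    return user_just_spoke or idle >= MAX_DEFER_SECONDS
#+end_src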

** Failure modes

The seven scripts surface most failures with concrete error messages. Spec-level failure modes:

- *Malformed frontmatter on a received file.* Surface to user; do not act.
- *Mismatched =PROTOCOL_VERSION=.* Receiver writes =query= asking originator to upgrade.
- *Missing or invalid GPG signature.* Receiver surfaces "unsigned/unverified message"; refuses to act.
- *Sequence collision* with non-matching SHA-256. Process both, ordered by timestamp.
- *Required tool unavailable.* Receiver checks =REQUIRES_TOOLS= during frontmatter-sanity-check (before any work begins). On a missing tool, receiver writes =query= asking the originator to reframe the request to avoid the unavailable tool. Originator may revise (new =request=) or withdraw (=release= with =RELEASE_STATUS: cancelled=). =query= is the right type rather than =pushback= because missing-tool is a capability gap, not disagreement.
- *Runaway resource usage.* User invokes =cross-agent-halt= globally (or =cross-agent-halt --tailnet= for cross-machine). HALT file stops all components within one polling cycle (~5 min). See =* Halt mechanism= for the layered checks.
- *User halts mid-conversation.* Both sides write a final =progress= note ("HALT fired; pausing"); polling stops within one cadence; conversations resume on explicit per-session re-engage after HALT clears.
- *HALT file accidentally created* (typo, errant =touch=). =cross-agent-status= prominently flags HALT active; user clears with =cross-agent-resume=. Cost: no messages are sent during the typo window.
- *HALT file unreadable* (perms wrong, partial write). Each component fails-closed (treats as halted) and reports "HALT file present but unreadable; treat as halted." Safer than fail-open.

Operational failures (rsync push fails, watcher dies, peer unreachable) live in the script READMEs' failure-mode tables.

* Halt mechanism

A failsafe to stop all cross-agent activity on a machine without visiting individual sessions or restarting Claude Code. Designed for the runaway-polling case: an agent has spun up conversations with N other agents, polling is eating CPU, and the user needs to stop everything *now*.

** The HALT file

Path: =~/.config/cross-agent-comms/HALT=.

Existence triggers halt across all components on the machine. The file's body may carry an optional human-readable reason (reviewed by the user later when deciding to resume).

User commands:

#+begin_example
$ touch ~/.config/cross-agent-comms/HALT      # halt
$ rm ~/.config/cross-agent-comms/HALT         # resume
#+end_example

Or via convenience scripts (=cross-agent-halt= / =cross-agent-resume=) that also handle the watcher service and cross-machine propagation.

** Layered checks (the failsafe property)

Every component MUST check the HALT file, and each one stops on its own when the file is present. That independence is what makes the mechanism failsafe: the system doesn't depend on a single point doing the right thing.

| Component | Check timing | Behavior on HALT |
|---+---+---|
| =cross-agent-send= | At start of send + between =.asc= and =.org= rsync + between retry iterations | Refuse to start new send; complete current step then exit. Worst case: one in-flight send finishes within a few seconds. |
| =cross-agent-recv= | Before any verify or dedup | Leave inbound message in place — do NOT dedup, reject, or move. Resume picks it up via cold-start handling. |
| =cross-agent-watch= | At iteration start | Suppress notifications; log only. Continues running, no-op until HALT clears. |
| =cross-agent-status= | At start | Print prominent "⚠ HALT ACTIVE" banner before normal output. Read-only, continues. |
| =cross-agent-discover= | At start | Print HALT banner; continue read-only enumeration. |
| Agent polling loop | First action on every wake | Write a final =progress= note to any active conversation ("HALT fired; pausing"), do NOT reschedule, surface "halt active" to user. Polling decays within one cadence (~5 min). |
| Agent user-facing responses | Every response while HALT is set | Append "(HALT active; cross-agent comms paused)" to the response. On HALT clear, the next response says "(HALT cleared; cross-agent comms ready to resume — say so to re-engage polling)." Persistent, not just first-response — keeps awareness alive. |
| Conversation initiator | Before writing sequence 1 of any new conversation | Refuse and surface to user. |
| Startup workflow | Phase A on session start | If HALT exists, surface immediately and skip cross-agent inbox checks. |

The agent polling-loop check is the load-bearing one for "stops eating CPU." Wake-ups that are already scheduled still fire, but under HALT each wake does nothing beyond the final =progress= note and schedules no successor. All polling therefore stops within one polling cadence (~5 min).

*Fail-closed on unreadable HALT.* If the HALT file exists but is unreadable (wrong permissions, partial write), components MUST treat as halted. Safer than fail-open.
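
A minimal sketch of a fail-closed HALT check that any component could call first. =halt_status= is a hypothetical helper, not code taken from the scripts.

#+begin_src python
import os

HALT_PATH = os.path.expanduser("~/.config/cross-agent-comms/HALT")

def halt_status(path: str = HALT_PATH):
    """Return (halted, reason). Only a missing file counts as not halted."""
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            return True, handle.read().strip()
    except FileNotFoundError:
        return False, ""
    except OSError:
        # Wrong permissions or any other read failure: fail closed.
        return True, "HALT file present but unreadable; treat as halted"
#+end_src

=FileNotFoundError= is caught before the broader =OSError=, so only a missing file reads as "not halted"; every other failure is treated as a halt.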

** Resume asymmetry (deliberate)

Halt is automatic everywhere. Resume requires explicit user intent per-session.

When the user removes HALT (or runs =cross-agent-resume=), components stop refusing to act, but agent polling does NOT auto-resume. The user must open each session and tell that agent to resume polling for its conversations.

The asymmetry exists because:

1. Auto-resume could silently invert intentional kills. If the user halted because a session was misbehaving, removing HALT shouldn't quietly revive it.
2. Per-session resume forces the user to look at each session and confirm the situation is resolved before re-engaging.

** Cross-machine halt

=cross-agent-halt --tailnet= iterates =peers.toml= and SSH-touches HALT on each peer. Same shape for resume.

Reports per-peer status with non-zero exit on partial halt:

#+begin_example
$ cross-agent-halt --tailnet
Halting velox.local      ✓ (HALT file written)
Halting bastion.local    ✗ (ssh exit 255: no route to host)
Halting locally          ✓ (HALT file written)

PARTIAL HALT: 2/3 machines halted. bastion.local needs manual halt.
Exit 1.
#+end_example

Scripting can detect partial halt via the exit code. Same pattern for =--tailnet= on resume.
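
The per-peer reporting and exit-code behaviour can be sketched like this. =halt_all= is hypothetical, and =halt_peer= is a hypothetical callback standing in for whatever actually touches HALT on a machine; the real =cross-agent-halt= may be structured quite differently.

#+begin_src python
from typing import Callable, Iterable, List, Tuple

def halt_all(peers: Iterable[str],
             halt_peer: Callable[[str], Tuple[bool, str]]) -> Tuple[int, List[str]]:
    """Attempt every peer, never stopping at the first failure.

    Returns (exit_code, report_lines); exit_code is 1 on a partial halt.
    """
    lines: List[str] = []
    failed: List[str] = []
    total = 0
    for peer in peers:
        total += 1
        ok, detail = halt_peer(peer)
        mark = "✓" if ok else "✗"
        lines.append(f"Halting {peer:<16} {mark} ({detail})")
        if not ok:
            failed.append(peer)
    if failed:
        halted = total - len(failed)
        lines.append(f"PARTIAL HALT: {halted}/{total} machines halted. "
                     f"{', '.join(failed)} needs manual halt.")
        return 1, lines
    return 0, lines
#+end_src

Continuing past a failed peer matters here: one unreachable machine should not prevent the others from being halted.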

* Limitations

- *Local-tailnet only.* Filesystem IPC + rsync over SSH. Cross-tailnet or cross-organization is out of scope.
- *Identity has three layers (Tailscale + POSIX + GPG)* but no message-content encryption. Confidentiality is not the goal; signing is correctness, not secrecy.
- *Single-receiver per conversation.* Fan-out to multiple receivers requires manually orchestrating multiple parallel conversations.
- *Polling is best-effort.* A wake may be delayed by an in-flight tool call until the runtime is idle. =cross-agent-watch= mitigates by offering event-driven notifications.
- *Project-extension drift.* If two projects' =.ai/project-workflows/= modify shared workflow definitions in incompatible ways, cross-agent assumptions can diverge silently. The optional =#+WORKFLOW_VERSION= advisory field is informational only in v5 — no implementation reads or acts on it. A future version may add enforcement on mismatch (e.g. receiver writes =query= asking which side is stale). Today, alignment is verified manually before high-stakes conversations.

* Persistence after release

Conversation files persist by default. The conversation log is the audit trail.

Manual archival is fine if the inbox grows unmanageable. Suggested cadence: once the conversation has been =release='d AND the work it produced has shipped, archive both projects' message files into =.ai/sessions/cross-agent/= as a flat directory — no per-conversation subdirectories. Rename each archived file to lead with the conversation-id so messages from the same conversation cluster on =ls=: =<conv-id>-<TIMESTAMP>-from-<sender>.org= (and the matching =.asc= sibling, if present). Inbox filenames lead with the timestamp because chronological arrival is what matters in =from-agents/=; archives invert that because grouping by conversation is what matters when reading history. Keep the =.asc= signatures alongside the =.org= files in archive — they're small and document the GPG verification chain.
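
The archive rename can be sketched as a pure string transformation. =archive_name= is a hypothetical helper; it takes the conversation id as an argument because both the sender and the slug may contain hyphens, so the filename alone cannot be split unambiguously.

#+begin_src python
def archive_name(inbox_name: str, conv_id: str) -> str:
    """Map <TIMESTAMP>-from-<sender>-<conv-id>.org[.asc]
    to <conv-id>-<TIMESTAMP>-from-<sender>.org[.asc]."""
    # Try the longer extension first so .org.asc is not mistaken for an unknown suffix.
    for ext in (".org.asc", ".org"):
        if inbox_name.endswith(ext):
            stem = inbox_name[: -len(ext)]
            break
    else:
        raise ValueError(f"not a message or signature file: {inbox_name!r}")
    suffix = "-" + conv_id
    if not stem.endswith(suffix):
        raise ValueError(f"{inbox_name!r} does not belong to {conv_id!r}")
    prefix = stem[: -len(suffix)]  # <TIMESTAMP>-from-<sender>
    return f"{conv_id}-{prefix}{ext}"
#+end_src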

Old messages don't affect protocol behavior (=cross-agent-status='s pending semantics correctly ignore released messages) but the =from-agents/= directory grows indefinitely without manual archival. =cross-agent-status= performance degrades noticeably when a project's =from-agents/= exceeds a few hundred files. =cross-agent-init= (deferred to v6) would include an archival sub-command.

* Open questions

- *=cross-agent-init= and =cross-agent-compose= helper scripts.* =-init= would be one-command project bootstrap (creates =inbox/from-agents/= with =chmod 700=, installs the =cross-agent-watch= systemd path unit, validates peer config, runs a discovery probe). =-compose= would be interactive frontmatter authoring (prompts for required fields, produces a draft message file). Both deferred to v6. Current onboarding requires manual =mkdir= + systemd setup per =cross-agent-watch.md='s install recipe; current message authoring requires writing the file by hand or via a small in-agent template.
- *Hard conversation timeout.* The async-fallback timeout is implementation-default ~24 hours. Right number depends on use case; tighten as patterns emerge.
- *=paused= polling state.* Today there's no clean signal for "pause without ending." Add when first user complaint surfaces.
- *Multi-LLM context.* If we ever bring in a non-Claude agent, the protocol's natural-language framing may need formalization.

* Examples

** =prep-fixup= conversation (2026-04-26 → 2026-04-27)

Eleven exchanges between homelab and career produced the v4 spec by iterative critique-and-simplification. Three real-time sequence collisions during the conversation drove the sequence-as-hint rule that landed in v4 and persists in v5.

Files at =~/projects/{homelab,career}/inbox/from-agents/= named =*-prep-fixup.org=. Worth re-reading when designing future cross-agent flows.

** =comms-cold-start-discovery= conversation (2026-04-27)

The follow-up that produced this v5 spec. Cold-start, watcher tooling, agent discovery, GPG identity, SHA-256 dedup, atomic writes, POSIX perms, script absorption, and process-vs-text simplification. Tonight's first cold-start in real time (career session went dormant after =prep-fixup= release; Craig's user-injection re-engaged it) is the worked demonstration of the v5 user-injection rule.

Files at =~/projects/{homelab,career}/inbox/from-agents/= named =*-comms-cold-start-discovery.org=.