diff options
Diffstat (limited to '.ai/workflows')
| -rw-r--r-- | .ai/workflows/INDEX.org | 20 | ||||
| -rw-r--r-- | .ai/workflows/broadcast.org | 4 | ||||
| -rw-r--r-- | .ai/workflows/create-workflow.org | 8 | ||||
| -rw-r--r-- | .ai/workflows/cross-agent-comms.org | 334 | ||||
| -rw-r--r-- | .ai/workflows/helper-mode.org | 2 | ||||
| -rw-r--r-- | .ai/workflows/inbox-zero.org | 97 | ||||
| -rw-r--r-- | .ai/workflows/inbox.org | 488 | ||||
| -rw-r--r-- | .ai/workflows/monitor-inbox.org | 122 | ||||
| -rw-r--r-- | .ai/workflows/process-inbox.org | 220 | ||||
| -rw-r--r-- | .ai/workflows/spec-create.org | 4 | ||||
| -rw-r--r-- | .ai/workflows/spec-response.org | 61 | ||||
| -rw-r--r-- | .ai/workflows/spec-review.org | 104 | ||||
| -rw-r--r-- | .ai/workflows/startup.org | 37 | ||||
| -rw-r--r-- | .ai/workflows/triage-intake.org | 6 | ||||
| -rw-r--r-- | .ai/workflows/wrap-it-up.org | 76 |
15 files changed, 674 insertions, 909 deletions
diff --git a/.ai/workflows/INDEX.org b/.ai/workflows/INDEX.org index 42119b4..eef81df 100644 --- a/.ai/workflows/INDEX.org +++ b/.ai/workflows/INDEX.org @@ -18,8 +18,10 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - =helper-mode.org= — role contract for a helper instance (a second Claude in the same project as a live primary). No manual trigger; the spawn paths route to it, "you are a helper" is the manual fallback. - =first-session.org= — initialize =.ai/= for a brand-new project. - Triggers: "this is a new project", "let's set this project up". Auto-runs if =.ai/sessions/= is empty. -- =wrap-it-up.org= — end-of-session: write summary, archive, commit, push. +- =wrap-it-up.org= — end-of-session: write summary, archive, commit, push, then a phrase-dependent Step 6 teardown. Bare "wrap it up" tears the session down (kills the ai-term buffer + =aiv-<project>= tmux session via a =Stop=-hook sentinel, after the valediction flushes); a "with summary" / "and summarize" wrap keeps the buffer; "and shutdown" gates on being the only live ai-term session, then powers the machine off via an abort-able Emacs countdown. - Triggers: "wrap it up", "that's a wrap", "let's call it a wrap" + - No-teardown triggers: "wrap it up with summary", "wrap it up and summarize" + - Shutdown trigger: "wrap it up and shutdown" - =retrospective.org= — post-mortem after a tough session. - Triggers: "let's do a retrospective", "retrospective time" @@ -44,12 +46,11 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - Triggers: "let's do a journal entry", "create a journal entry" - =clean-todo.org= — tidy =todo.org=: hygiene pass + =--archive-done=, then summarize. Wrap-up does this automatically; this is the manual entry point. - Triggers: "clean up todo.org", "clean-todo", "tidy the todo file", "archive the done items in todo.org", "run the todo cleanup" -- =process-inbox.org= — evaluate each inbox item against a three-question value gate (advances an existing TODO / improves the project / serves the mission), then implement, fold, file, defer, or reject per source (Craig / project handoff / script). Auto-invoked by startup when inbox is non-empty. Source-aware rejection flow: handoff rejections write a response back via =inbox-send= naming the failed gate question and any reconsideration condition. - - Triggers: "process inbox", "process the inbox", "handle the inbox", "what's in inbox", "what's in the inbox", "let's clear the inbox", "let's process the inbox items" -- =monitor-inbox.org= — the cadence + act-vs-file + reply layer over process-inbox: "monitor the inbox" runs a pass now then loops process-inbox every 15 min; gates on a clean tree + green suite at both ends; in no-approvals mode auto-executes only agreed + quick + solo items (else files or parks); also the ambient =inbox-status= task-boundary check and the reply-to-sender discipline. - - Triggers: "monitor the inbox", "watch the inbox", "respond to the handoffs", "handle the handoffs" -- =inbox-zero.org= — route the *global roam inbox* (=~/org/roam/inbox.org=) to owning projects by =<project>:= heading prefix. Distinct from =process-inbox.org= (the project's own =inbox/= dir). The current session claims only its own prefixed items, files them into =todo.org=, removes them from the shared inbox, and leaves foreign/unowned items. Every scan reports the total item count plus how many appear related to this project. v1 is single-destination (prefix-claim only); domain-aware whole-inbox routing is deferred. Called read-only from startup (count + offer) and as a wrap-up Step 3 sub-step. - - Triggers: "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox" +- =inbox.org= — one engine for the project's inbox surfaces, with the shared value gate / skeptical review / disposition ladder / reply discipline / capture-guard / priority-scheme check in one place, plus thin per-surface modes. *Process mode* evaluates each project-local =inbox/= item against the three-question value gate, then implements / folds / files / defers / rejects per source (auto-invoked by startup when inbox is non-empty). *Monitor mode* runs process mode now then loops it every 15 min, gating on a clean tree + green suite and adding the act-vs-file + no-approvals-execute + reply discipline. *Roam mode* routes the global roam inbox (=~/org/roam/inbox.org=) to owning projects by =<project>:= prefix (read-only nudge at startup, sweep at wrap-up). *Auto inbox zero* runs roam mode on an interactive =/loop= at a Craig-chosen interval. Distinct from =triage-intake.org= (external accounts), which stays separate. + - Process-mode triggers: "process inbox", "process the inbox", "handle the inbox", "what's in inbox", "what's in the inbox", "let's clear the inbox", "let's process the inbox items" + - Monitor-mode triggers: "monitor the inbox", "watch the inbox", "respond to the handoffs", "handle the handoffs" + - Roam-mode triggers: "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox" + - Auto-mode trigger: "auto inbox zero" (match before "inbox zero") ** Calendar @@ -81,9 +82,9 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - =spec-create.org= — author a design/feature spec before non-trivial work (more than ~6 hours, or real trade-offs): a when-to-spec gate, then problem-first framing, design + alternatives + inline mini-ADR decisions, implementation phases + acceptance criteria + a readiness-dimensions menu, a terseness pass, and a hand-off self-check against the review rubric. The *author* side that starts the trio; its output feeds =spec-review.org=. - Triggers: "let's write a spec", "spec this out", "create a spec for X", "spec-create workflow" -- =spec-review.org= — review a design/feature spec for implementation-readiness: run the readiness gate, read the code first, evaluate across dimensions, assign a rubric (Ready / Ready-with-caveats / Not-ready / Needs-research), and write a =<spec>-review.org= file when not ready. The *reviewer* side; its output feeds =spec-response.org=. +- =spec-review.org= — review a design/feature spec for implementation-readiness: run the readiness gate, read the code first, evaluate across dimensions, assign a rubric (Ready / Ready-with-caveats / Not-ready / Needs-research), and record findings as =TODO= tasks in the spec's =* Review findings= section when not ready. The *reviewer* side; its output feeds =spec-response.org=. - Triggers: "review the spec", "is this spec implementation-ready?", "spec-review workflow", "review the design" -- =spec-response.org= — fold an external spec review back in: decide accept / modify / reject for every recommendation, weave accepts into the spec body, document modifies and rejects in a "Review dispositions" section, reconcile cross-spec tensions, iterate to implementation-ready. The *author* side; consumes the =<spec>-review.org= file =spec-review.org= produces. +- =spec-response.org= — fold a spec review back in: decide accept / modify / reject for every finding, weave accepts into the spec body, complete each finding task in place (the reason recorded on modifies and rejects), reconcile cross-spec tensions, iterate to implementation-ready. The *author* side; consumes the =* Review findings= =spec-review.org= produces. - Triggers: "respond to the review", "process the spec reviews", "spec-response workflow", "fold in the review" ** Tools and meta @@ -107,7 +108,6 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - Triggers: "session harvest", "harvest the sessions", "let's run the session-harvest workflow", "monthly harvest", "mine the sessions" - =no-approvals.org= — drop the interaction-level approval gates for a pre-agreed batch while keeping engineering-discipline gates (=/review-code=, =/voice personal=, tests, session-log updates, subagent reviews, destructive-action consent). Mode stays on until Craig turns it off, a real question arises, the queue empties, or the conversation switches topics. - Triggers: "no-approvals mode", "no approvals", "no-approval", "no need for approval gates", "stop asking, just keep going", "I'll check back in when you're done or stuck", "do all =<selector>= with no-approval" -- =cross-agent-comms.org= — protocol for cross-project agent coordination via =inbox/from-agents/= (file-based IPC, GPG-signed, supports cross-machine over Tailscale). Auto: when =cross-agent-watch= detects a new inbound message, or when an agent decides to initiate a cross-project conversation. Operational scripts (=cross-agent-send=, =-recv=, =-watch=, =-status=, =-discover=, =-halt=, =-resume=) and their READMEs live at =.ai/scripts/cross-agent-comms/=. * Living Document diff --git a/.ai/workflows/broadcast.org b/.ai/workflows/broadcast.org index 1be07d2..60e9ed1 100644 --- a/.ai/workflows/broadcast.org +++ b/.ai/workflows/broadcast.org @@ -159,11 +159,11 @@ broadcast, not a task and not tailored to this project. - Ask Craig any follow-up questions then — this message is deliberately general. #+end_example -The "For the receiving agent" block is fixed text — it travels with every situational broadcast so the message is self-describing. A receiving project's =process-inbox= reads it and acts on those instructions without needing any special-casing; the value gate accepts it as situational awareness that improves how the project works. +The "For the receiving agent" block is fixed text — it travels with every situational broadcast so the message is self-describing. A receiving project's =inbox.org= process mode reads it and acts on those instructions without needing any special-casing; the value gate accepts it as situational awareness that improves how the project works. ** Receiving behavior (what a project does with an incoming situational broadcast) -When =process-inbox= encounters a =Broadcast:= item, the disposition is *record-and-hold*, not file-as-task: +When =inbox.org= process mode encounters a =Broadcast:= item, the disposition is *record-and-hold*, not file-as-task: 1. Add a dated entry to =notes.org= Active Reminders capturing the situation and its end date (if any). 2. If the event bears on an open task, note the connection in that task's body. diff --git a/.ai/workflows/create-workflow.org b/.ai/workflows/create-workflow.org index 6060df1..393fce5 100644 --- a/.ai/workflows/create-workflow.org +++ b/.ai/workflows/create-workflow.org @@ -195,7 +195,7 @@ After the Q&A, ask together: Decide on a name for this workflow. *Naming convention:* Action-oriented (verb form) -- Examples: "refactor", "inbox-zero", "create-workflow", "review-code" +- Examples: "refactor", "clean-todo", "create-workflow", "review-code" - Why: Shorter, natural when saying "let's do a [name] workflow" - Filename: =.ai/workflows/[name].org= @@ -240,10 +240,10 @@ Update =notes.org=: Example entry: #+begin_src org -,** inbox-zero -File: =.ai/workflows/inbox-zero.org= +,** journal-entry +File: =.ai/workflows/journal-entry.org= -Workflow for processing inbox to zero: +Workflow for capturing a daily journal entry: 1. [Brief workflow summary] 2. [Key steps] diff --git a/.ai/workflows/cross-agent-comms.org b/.ai/workflows/cross-agent-comms.org deleted file mode 100644 index 430b4b0..0000000 --- a/.ai/workflows/cross-agent-comms.org +++ /dev/null @@ -1,334 +0,0 @@ -#+TITLE: Cross-Agent Communication Workflow (v5) -#+AUTHOR: Craig Jennings & Claude (homelab + career sessions) -#+DATE: 2026-04-27 -#+VERSION: 5 - -* Status - -Draft. Iterating between the homelab and career sessions through a multi-round design discussion. Awaiting Craig's review for promotion to =~/code/rulesets/claude-templates/.ai/workflows/=. - -v5 changes from v4: -- *Script absorption.* Seven operational scripts (=cross-agent-send=, =cross-agent-recv=, =cross-agent-watch=, =cross-agent-status=, =cross-agent-discover=, =cross-agent-halt=, =cross-agent-resume=) now own most implementation detail. Their READMEs are the operational source of truth. The spec stays declarative. -- *Failsafe halt.* Layered HALT-file mechanism stops all cross-agent activity on a machine within ~5 min, without visiting individual sessions or restarting Claude Code. =cross-agent-halt= and =cross-agent-resume= are the convenience entry points; every other component checks the HALT file independently. -- *Identity.* Messages are GPG-signed by sender and verified by receiver. Combined with POSIX permissions on =from-agents/= and Tailscale-level network auth, identity becomes a three-layer story. -- *Atomic writes.* Writers MUST use temp-file + rename. =cross-agent-send= handles this; the spec just states the contract. -- *Dedup.* Sequence-collision dedup is now binary SHA-256 equality, not a fuzzy ">90% match" threshold. -- *Cold-start handling.* Layered: =cross-agent-watch= (push notifications via =inotifywait=) is the primary mechanism; startup-workflow check and user-direct-injection are coverage layers. -- *Spec stays roughly the same length but does more protocol work.* Operational detail (rsync retry numbers, inotifywait recipes, peers.toml schema, GPG flags, dedup mechanics) moved to the script READMEs. The spec adds new protocol elements (identity layer, atomic-writes contract, SHA-256 dedup, =escalate= type, =RELEASE_STATUS= values, =REQUIRES_TOOLS= optional field) in the freed space. Total documentation surface (spec + seven READMEs ≈ 1000 lines) is larger than v4's 259 lines, but the spec and the READMEs serve different audiences — protocol-thinkers and CLI-users — and a reader of just the spec can comprehend the protocol without consulting any README. - -* When to use - -When two Claude sessions in different projects (same machine or different machines on the same Tailscale tailnet) need to coordinate on a shared task that one session can't complete alone — typically because one has tooling, context, or MCP access the other doesn't. - -Examples that fit: -- Session A asks session B to apply a workflow patch in B's project, then verify it. -- Session A runs a long task and needs session B to monitor results in B's domain. -- Two sessions co-design a workflow. - -Examples that don't fit: -- A simple file handoff that doesn't require iteration. -- A task one session can do alone. -- Cross-tailnet or cross-organization. The protocol is local-tailnet-scoped. - -* Protocol - -** File location - -Each project has =inbox/from-agents/= as its agent-comms mailbox. Create the directory if it doesn't exist; set permissions =chmod 700= and ownership to the user. - -- Sender writes to receiver's =inbox/from-agents/=. -- Receiver polls (or watches) =inbox/from-agents/=, *not* the parent =inbox/=. -- The parent =inbox/= stays reserved for human-triage items. -- Out-of-band artifacts (PDFs, datasets) live at =inbox/from-agents/artifacts/=. Reference by relative path in the message body. - -The user does NOT write directly to =from-agents/=. To inject input into a running conversation, the user tells one of the agents in that agent's session; the agent writes the input as a normal message attributed to the user. - -** File naming - -=YYYYMMDDTHHMMSSZ-from-<sender>-<short-conv-id>.org= - -- Timestamp is UTC ISO 8601 compact. The trailing =Z= is mandatory. -- =from-<sender>= prefix. -- =<short-conv-id>= is a stable kebab-case slug across the back-and-forth. Reusable across time; ordering relies on filename timestamps. - -Frontmatter =#+TIMESTAMP= carries the same instant in local time with explicit offset. The two MUST refer to the same instant. - -The implementation (=cross-agent-send=) generates the canonical filename from the message's frontmatter (=CONVERSATION_ID=, current UTC time) and the sender's project context. Senders supply only the message body file; the script handles naming. Senders MUST NOT pre-name files in this format and pass them through; the script overwrites with its own canonical name to ensure consistency and enable the sender-side max-seen sequence-collision-reduction scan. - -GPG signatures live in a sibling file =YYYYMMDDTHHMMSSZ-from-<sender>-<short-conv-id>.org.asc=. Receivers verify before processing. See =* Writes are atomic= for the two-file delivery ordering rule. - -** Frontmatter - -Required: - -#+begin_example -#+TITLE: <human-readable subject> -#+CONVERSATION_ID: <stable across the thread> -#+MESSAGE_TYPE: <see types below> -#+SEQUENCE: <integer hint> -#+TIMESTAMP: <ISO 8601 with explicit offset> -#+PROTOCOL_VERSION: 5 -#+end_example - -Optional: - -#+begin_example -#+REQUIRES_TOOLS: <comma-separated tool/MCP slugs, e.g. gmail-mcp, slack-mcp> -#+RELEASE_STATUS: <see release-statuses; valid only on MESSAGE_TYPE: release> -#+WORKFLOW_VERSION: <sender's version of cross-agent-comms.org; informational only in v5 — no enforcement> -#+end_example - -Receiver sanity-checks frontmatter before acting. Missing or malformed frontmatter → surface to user, don't proceed. Mismatched =PROTOCOL_VERSION= → receiver writes a =query= asking the originator to upgrade. - -** Identity - -Messages are GPG-signed by the sender. Receivers verify the detached signature before processing the message body. - -The implementation (=cross-agent-send=) signs automatically with the sender's configured key (the user's primary GPG key by default; configurable via =--key= flag or environment). Receivers verify automatically against the keys in their GPG keyring. - -Identity is a three-layer story: - -1. *Tailscale layer.* Only tailnet members can reach the rsync-over-SSH endpoint at all. -2. *POSIX layer.* =chmod 700= on =from-agents/= means only processes running as the directory's owner can write. -3. *GPG layer.* Sender's signature on each message proves the message originated from a process holding the key. - -Three independent layers. Per-user GPG (using existing keys) gives a correctness check more than a security boundary — unsigned messages are almost certainly bugs, not attackers. That's still load-bearing. - -** Writes are atomic - -Writers MUST use a temp-file + rename pattern (=mktemp= + =mv= within the same filesystem) so receivers never see partial files. The implementation script (=cross-agent-send=) handles this. - -Receivers ignore =.tmp.*= files, processing only the final renamed name. - -*Two-file ordering.* When a message has a sibling GPG signature file (=.org.asc=), the writer MUST rename the =.asc= to its final name *before* renaming the =.org=. Two =mv= operations are not atomic together — without this ordering, a receiver could read the =.org= in the window between the two renames and fail GPG verify because the =.asc= hasn't landed yet. The rule: receiver only acts on =.org= files, and a =.org= without a corresponding =.asc= means the signature is genuinely missing (not still in flight). - -** Sequence numbering - -=#+SEQUENCE= is a *hint*, not a strict counter. Canonical order is =#+TIMESTAMP=. Sequences may collide under rapid back-and-forth (both sides write what they think is sequence N near-simultaneously). Treat collision as a normal protocol event. - -*Receiver-side dedup rule.* When a new file shares =CONVERSATION_ID= + =SEQUENCE= with an already-processed message, compare SHA-256 hashes. Identical hashes → silent dedup, treat as a retry. Different hashes → process both, ordered by =#+TIMESTAMP=. - -*Sender-side collision-reduction (best-effort).* Before picking sequence, scan the receiver's =from-agents/= for the highest existing sequence in this conversation across both sender prefixes. Use =max(seen) + 1=. - -** Message types - -- *request* — a side asks for work, input, or a decision. Sequence 1 is always =request=. -- *progress* — work-in-progress checkpoint. "Here's where I am, no action needed from you, more coming." Originator's poll loop should NOT page the user on progress messages. -- *query* — either side asks a clarifying question that blocks further work. Originator's poll loop SHOULD surface this immediately. Originator answers and work continues. -- *pushback* — receiver formally disagrees with the request and has *not* started the work. Carries reasoning. Distinct from =query= because the originator's response path differs. -- *complete* — receiver signals the requested work is done. Triggers verification. -- *release* — terminal type. Originator writes after verifying =complete=. Carries =RELEASE_STATUS= to disambiguate the closure mode. -- *escalate* — punts the conversation to the user for adjudication. Both sides pause polling on =escalate=; the user resolves. - -Reply expectation is implied by type: =request=, =query=, =pushback=, =escalate= expect a reply; =progress=, =complete=, =release= don't. - -** Conversation lifecycle - -A conversation is a directed loop between an originator (issued sequence 1) and a receiver: - -1. Originator writes =request= (sequence 1). Begins polling for replies. -2. *Optional acknowledgment.* Receiver may write a =progress= at sequence 2 to acknowledge receipt and set expectations. Required if work will take >5 minutes (so the originator's poll loop doesn't waste wakes). -3. *Optional echo-back.* For ambiguous or large requests, receiver writes a =progress= that restates work items and announces "starting now unless you push back within N minutes." -4. Receiver works. May write =progress= updates. =query= mid-work if blocked. =pushback= if the request is wrong. -5. Receiver writes =complete=. Begins polling for =release=. -6. Originator reads, *verifies the deliverable directly*. For subjective deliverables, verification is the originator's editorial accept. -7. If verified: =release= with =RELEASE_STATUS: complete=. If problems: new =request= (next sequence number). -8. Receiver sees =release=, stops polling. - -The verification step is load-bearing. =complete= is a *claim*; =release= is *verification*. - -** Pushback path - -On receiving a =pushback=, the originator chooses: - -1. *Revise* — new =request= with adjusted scope. -2. *Insist* — new =request= addressing the pushback's reasoning, standing by direction. -3. *Withdraw* — =release= with =RELEASE_STATUS: withdrawn-after-pushback=. - -*Deadlock cap.* After two pushback-insist exchanges, the next message MUST be =MESSAGE_TYPE: escalate=. Both agents pause polling; the user resolves. - -** =RELEASE_STATUS= values - -| Status | Meaning | -|---+---| -| =complete= | Goal achieved, originator verified | -| =cancelled= | Originator changed their mind mid-conversation | -| =withdrawn-after-pushback= | Originator chose option 3 on receiver's =pushback= | -| =abandoned-after-escalation= | User adjudicated and chose to close the conversation | -| =abandoned-after-timeout= | Receiver auto-closed after originator never returned to verify | - -** Async fallback - -If the originator session ends between =request= and =complete=, the receiver's =complete= goes unverified. Receiver behavior: - -- Polls for =release= up to ~24 hours of cycles (implementation default). -- After timeout, writes a final =progress= message ("treating as terminal-without-verification; originator never returned to release") and stops polling. Receiver does NOT write =release= itself — that would contradict the lifecycle rule that =release= is the originator's terminal action. -- Next time the originator project starts, the unreleased =complete= is surfaced as a startup item. The user can issue a late =release= (with whichever =RELEASE_STATUS= fits) or open a fresh conversation to revisit. =RELEASE_STATUS: abandoned-after-timeout= is used at that point if the user wants to formally close the orphaned thread. - -** Escalation - -A side writes =escalate= when: -- Pushback-insist deadlock cap reached. -- Conversation has stalled (no productive movement in N exchanges). -- A reply-expecting message has gone unanswered past timeout. - -Body summarizes both sides' positions in 60 seconds of reading. Both agents pause polling; the user resolves. - -* Implementation notes - -This sub-section describes how to operate the protocol. Operational detail lives in the seven scripts' READMEs. - -** Recommended scripts - -| Script | Replaces user action | README | -|---+---+---| -| =cross-agent-send <dest> <msg>= | Filename generation, GPG sign, atomic write, peer lookup, rsync push, retry+backoff, failure surfacing — seven mechanical sender-side steps. Frontmatter and message body are still author-supplied. | =cross-agent-send.md= | -| =cross-agent-recv <msg>= | Frontmatter sanity-check, =PROTOCOL_VERSION= verify, GPG verify, SHA-256 dedup, =REQUIRES_TOOLS= check — five mechanical receiver-side steps. Output is a structured decision (=process= / =dedup= / =query= / =reject=) the agent acts on. | =cross-agent-recv.md= | -| =cross-agent-watch= | Manually checking inboxes; "did I get a message?" | =cross-agent-watch.md= | -| =cross-agent-status= | Walking each project to count pending messages | =cross-agent-status.md= | -| =cross-agent-discover= | Remembering project topology and reachability | =cross-agent-discover.md= | -| =cross-agent-halt [reason] [--tailnet]= | Visiting each session to stop polling, restarting Claude Code, or hand-killing processes when comms go runaway. =--tailnet= propagates HALT to all peers. | =cross-agent-halt.md= | -| =cross-agent-resume [--tailnet]= | Manually clearing the HALT state and restarting the watcher. Per-session polling does NOT auto-resume — the user re-engages each session explicitly. | =cross-agent-resume.md= | - -The scripts are tools the user runs from any terminal. They do not depend on agent context — =cross-agent-status= run from a fresh shell works. - -A reader can comprehend this protocol from this spec alone. Script READMEs add operational detail that makes the protocol practical to use, but understanding the protocol's semantics requires only this document. - -** Polling - -Default cadence: 270 seconds (≈4.5 min). Sits just under the 5-minute prompt-cache TTL. - -If a side needs to slow down (heads-down work, idle wait), it writes a =progress= message saying so in prose. The other side adapts. There are no named polling modes. - -After ~12 empty polls in a row, the poll loop surfaces the silence to the user. - -A future runtime with native filesystem-event support could replace polling for active sessions; =cross-agent-watch= already provides event-driven notifications outside active sessions. - -** User multi-tasking - -- *Deferral.* If the user's last message in the agent's session was less than 60 seconds ago AND a poll fires, queue the inbox check until either the user sends another message OR 5 minutes pass without further input. -- *Surfacing.* On the next user-facing response: "While we were working on X, a cross-agent message landed from <project>. It's a =<type>= — want me to handle it now or after we finish?" -- *Mid-question.* Answer the user first. -- *Project switch.* If the user moves to the receiver project mid-conversation, the receiver agent surfaces the in-flight thread on first user prompt. -- *Conversation state.* Always include in any response that mentions a cross-agent thread: "<conv-id> at sequence N, awaiting <event>." - -** Failure modes - -The seven scripts surface most failures with concrete error messages. Spec-level failure modes: - -- *Malformed frontmatter on a received file.* Surface to user; do not act. -- *Mismatched =PROTOCOL_VERSION=.* Receiver writes =query= asking originator to upgrade. -- *Missing or invalid GPG signature.* Receiver surfaces "unsigned/unverified message"; refuses to act. -- *Sequence collision* with non-matching SHA-256. Process both, ordered by timestamp. -- *Required tool unavailable.* Receiver checks =REQUIRES_TOOLS= during frontmatter-sanity-check (before any work begins). On a missing tool, receiver writes =query= asking the originator to reframe the request to avoid the unavailable tool. Originator may revise (new =request=) or withdraw (=release= with =RELEASE_STATUS: cancelled=). =query= is the right type rather than =pushback= because missing-tool is a capability gap, not disagreement. -- *Runaway resource usage.* User invokes =cross-agent-halt= globally (or =cross-agent-halt --tailnet= for cross-machine). HALT file stops all components within one polling cycle (~5 min). See =* Halt mechanism= for the layered checks. -- *User halts mid-conversation.* Both sides write a final =progress= note ("HALT fired; pausing"); polling stops within one cadence; conversations resume on explicit per-session re-engage after HALT clears. -- *HALT file accidentally created* (typo, errant =touch=). =cross-agent-status= prominently flags HALT active; user clears with =cross-agent-resume=. Cost: no messages send during the typo window. -- *HALT file unreadable* (perms wrong, partial write). Each component fails-closed (treats as halted) and reports "HALT file present but unreadable; treat as halted." Safer than fail-open. - -Operational failures (rsync push fails, watcher dies, peer unreachable) live in the script READMEs' failure-mode tables. - -* Halt mechanism - -A failsafe to stop all cross-agent activity on a machine without visiting individual sessions or restarting Claude Code. Designed for the runaway-polling case: an agent has spun up conversations with N other agents, polling is eating CPU, and the user needs to stop everything *now*. - -** The HALT file - -Path: =~/.config/cross-agent-comms/HALT=. - -Existence triggers halt across all components on the machine. The file's body may carry an optional human-readable reason (reviewed by the user later when deciding to resume). - -User commands: - -#+begin_example -$ touch ~/.config/cross-agent-comms/HALT # halt -$ rm ~/.config/cross-agent-comms/HALT # resume -#+end_example - -Or via convenience scripts (=cross-agent-halt= / =cross-agent-resume=) that also handle the watcher service and cross-machine propagation. - -** Layered checks (the failsafe property) - -Every component MUST check the HALT file. The "any one component stops the system independently" property is what makes this failsafe — the system doesn't depend on a single point doing the right thing. - -| Component | Check timing | Behavior on HALT | -|---+---+---| -| =cross-agent-send= | At start of send + between =.asc= and =.org= rsync + between retry iterations | Refuse to start new send; complete current step then exit. Worst case: one in-flight send finishes within a few seconds. | -| =cross-agent-recv= | Before any verify or dedup | Leave inbound message in place — do NOT dedup, reject, or move. Resume picks it up via cold-start handling. | -| =cross-agent-watch= | At iteration start | Suppress notifications; log only. Continues running, no-op until HALT clears. | -| =cross-agent-status= | At start | Print prominent "⚠ HALT ACTIVE" banner before normal output. Read-only, continues. | -| =cross-agent-discover= | At start | Print HALT banner; continue read-only enumeration. | -| Agent polling loop | First action on every wake | Write a final =progress= note to any active conversation ("HALT fired; pausing"), do NOT reschedule, surface "halt active" to user. Polling decays within one cadence (~5 min). | -| Agent user-facing responses | Every response while HALT is set | Append "(HALT active; cross-agent comms paused)" to the response. On HALT clear, the next response says "(HALT cleared; cross-agent comms ready to resume — say so to re-engage polling)." Persistent, not just first-response — keeps awareness alive. | -| Conversation initiator | Before writing sequence 1 of any new conversation | Refuse and surface to user. | -| Startup workflow | Phase A on session start | If HALT exists, surface immediately and skip cross-agent inbox checks. | - -The agent polling-loop check is the load-bearing one for "stops eating CPU." Wake-ups already scheduled fire, but each wake on-HALT is a no-op + reschedule-prevention. Within one polling cadence (~5 min) all polling stops. - -*Fail-closed on unreadable HALT.* If the HALT file exists but is unreadable (wrong permissions, partial write), components MUST treat as halted. Safer than fail-open. - -** Resume asymmetry (deliberate) - -Halt is automatic everywhere. Resume requires explicit user intent per-session. - -When the user removes HALT (or runs =cross-agent-resume=), components stop refusing to act, but agent polling does NOT auto-resume. The user must open each session and tell that agent to resume polling for its conversations. - -The asymmetry exists because: - -1. Auto-resume could silently invert intentional kills. If the user halted because a session was misbehaving, removing HALT shouldn't quietly revive it. -2. Per-session resume forces the user to look at each session and confirm the situation is resolved before re-engaging. - -** Cross-machine halt - -=cross-agent-halt --tailnet= iterates =peers.toml= and SSH-touches HALT on each peer. Same shape for resume. - -Reports per-peer status with non-zero exit on partial halt: - -#+begin_example -$ cross-agent-halt --tailnet -Halting velox.local ✓ (HALT file written) -Halting bastion.local ✗ (ssh exit 255: no route to host) -Halting locally ✓ (HALT file written) - -PARTIAL HALT: 2/3 machines halted. bastion.local needs manual halt. -Exit 1. -#+end_example - -Scripting can detect partial halt via the exit code. Same pattern for =--tailnet= on resume. - -* Limitations - -- *Local-tailnet only.* Filesystem IPC + rsync over SSH. Cross-tailnet or cross-organization is out of scope. -- *Identity has three layers (Tailscale + POSIX + GPG)* but no message-content encryption. Confidentiality is not the goal; signing is correctness, not secrecy. -- *Single-receiver per conversation.* Fan-out to multiple receivers requires manually orchestrating multiple parallel conversations. -- *Polling is best-effort.* A wake may be delayed by an in-flight tool call until the runtime is idle. =cross-agent-watch= mitigates by offering event-driven notifications. -- *Project-extension drift.* If two projects' =.ai/project-workflows/= modify shared workflow definitions in incompatible ways, cross-agent assumptions can diverge silently. The optional =#+WORKFLOW_VERSION= advisory field is informational only in v5 — no implementation reads or acts on it. A future version may add enforcement on mismatch (e.g. receiver writes =query= asking which side is stale). Today, alignment is verified manually before high-stakes conversations. - -* Persistence after release - -Conversation files persist by default. The conversation log is the audit trail. - -Manual archival is fine if the inbox grows unmanageable. Suggested cadence: once the conversation has been =release='d AND the work it produced has shipped, archive both projects' message files into =.ai/sessions/cross-agent/= as a flat directory — no per-conversation subdirectories. Rename each archived file to lead with the conversation-id so messages from the same conversation cluster on =ls=: =<conv-id>-<TIMESTAMP>-from-<sender>.org= (and the matching =.asc= sibling, if present). Inbox filenames lead with the timestamp because chronological arrival is what matters in =from-agents/=; archives invert that because grouping by conversation is what matters when reading history. Keep the =.asc= signatures alongside the =.org= files in archive — they're small and document the GPG verification chain. - -Old messages don't affect protocol behavior (=cross-agent-status='s pending semantics correctly ignore released messages) but the =from-agents/= directory grows indefinitely without manual archival. =cross-agent-status= performance degrades noticeably when a project's =from-agents/= exceeds a few hundred files. =cross-agent-init= (deferred to v6) would include an archival sub-command. - -* Open questions - -- *=cross-agent-init= and =cross-agent-compose= helper scripts.* =-init= would be one-command project bootstrap (creates =inbox/from-agents/= with =chmod 700=, installs the =cross-agent-watch= systemd path unit, validates peer config, runs a discovery probe). =-compose= would be interactive frontmatter authoring (prompts for required fields, produces a draft message file). Both deferred to v6. Current onboarding requires manual =mkdir= + systemd setup per =cross-agent-watch.md='s install recipe; current message authoring requires writing the file by hand or via a small in-agent template. -- *Hard conversation timeout.* The async-fallback timeout is implementation-default ~24 hours. Right number depends on use case; tighten as patterns emerge. -- *=paused= polling state.* Today there's no clean signal for "pause without ending." Add when first user complaint surfaces. -- *Multi-LLM context.* If we ever bring in a non-Claude agent, the protocol's natural-language framing may need formalization. - -* Examples - -** =prep-fixup= conversation (2026-04-26 → 2026-04-27) - -Eleven exchanges between homelab and career produced the v4 spec by iterative critique-and-simplification. Three real-time sequence collisions during the conversation drove the sequence-as-hint rule that landed in v4 and persists in v5. - -Files at =~/projects/{homelab,career}/inbox/from-agents/= named =*-prep-fixup.org=. Worth re-reading when designing future cross-agent flows. - -** =comms-cold-start-discovery= conversation (2026-04-27) - -The follow-up that produced this v5 spec. Cold-start, watcher tooling, agent discovery, GPG identity, sha256 dedup, atomic writes, POSIX perms, script absorption, and process-vs-text simplification. Tonight's first cold-start in real time (career session went dormant after =prep-fixup= release; Craig's user-injection re-engaged it) is the worked demonstration of the v5 user-injection rule. - -Files at =~/projects/{homelab,career}/inbox/from-agents/= named =*-comms-cold-start-discovery.org=. diff --git a/.ai/workflows/helper-mode.org b/.ai/workflows/helper-mode.org index 8ead37b..cdec200 100644 --- a/.ai/workflows/helper-mode.org +++ b/.ai/workflows/helper-mode.org @@ -65,7 +65,7 @@ The git ban is concurrency-scoped. /Helper wrap-up/ below lifts it for exactly o ** Escalation -Anything the contract blocks routes through the cross-agent message form (=machine.project.agent-id=), or just gets reported to Craig. The helper leaves its tree changes for the primary's next commit, or describes them in a targeted message. +Anything the contract blocks gets reported to Craig, or — for a cross-project handoff — routed through =inbox-send= to the owning project's =inbox/=. The helper leaves its tree changes for the primary's next commit, or describes them in a note to Craig. * Data-Integrity Rules diff --git a/.ai/workflows/inbox-zero.org b/.ai/workflows/inbox-zero.org deleted file mode 100644 index aa7c273..0000000 --- a/.ai/workflows/inbox-zero.org +++ /dev/null @@ -1,97 +0,0 @@ -#+TITLE: Inbox Zero Workflow -#+AUTHOR: Craig Jennings & Claude -#+DATE: 2026-06-13 - -* Overview - -The roam global inbox (=~/org/roam/inbox.org=) is Craig's cross-project GTD capture: one shared file every project can see. This workflow routes each inbox item to the project that owns it. The current session claims only the items belonging to THIS project, files them into the project's =todo.org=, and removes them from the shared inbox. Everything it doesn't own stays. The aspiration is inbox zero: over time, every item lands in its owning project. - -This is NOT =process-inbox.org=. That workflow handles the project's own =inbox/= directory (handoffs from other projects, scripts, Craig). This one handles the single global =roam/inbox.org= and the cross-project routing a shared file creates. - -This is also distinct from the wrap-up inbox/transcript routing feature (which moves session-filed keepers between projects). This routes the shared roam capture file by ownership prefix. - -** Scope: single-destination (v1) - -This version routes each item to its one owning project, identified by an explicit =<project>:= heading prefix. The multi-project domain-aware mode, which would guess the owner of every unprefixed item and empty the whole inbox in one run, is deferred (see "Deferred: domain-aware routing" at the end). v1 claims only what's prefixed for the current project, surfaces the rest, and never guesses. - -** Three callers - -Reused from three callers so the steps live in one place: -- *Startup* (read-only nudge) — count the items, identify which appear related to this project, surface both numbers, offer processing as one of the startup options. Never auto-files. -- *Wrap-up* (Step 3 sub-step) — sweep items that belong here before the cleanup scripts, so imported tasks lint and ride the wrap commit. -- *On demand* — "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox". - -Each project touches the roam inbox at least twice a session this way: once at startup, once at wrap-up. - -* The ownership rule (the coordination primitive) - -The inbox is shared, so the workflow must never let two projects fight over an item or let one project grab another's. Ownership is by explicit prefix: - -- =<project>: ...= heading → owned by that project. The current project claims only items prefixed with its own identifier. -- Prefixed for *another* project → leave untouched (cross-project boundary, =protocols.org=). -- *No prefix* → unowned. Never auto-claim. Surface as candidates a human can claim or prefix. - -The prefix partition is what makes concurrent triage across projects safe: each project only ever removes its own items, so two sessions editing the inbox touch disjoint lines. - -** Resolving this project's identifier (v1) - -Use the project root basename plus its common aliases (=.emacs.d= ↔ =emacs=, and the obvious ones: =rulesets=, =work=, =home=). A project may override the inferred set with an =:INBOX_PREFIX:= line in =notes.org='s *Workflow State* section when the basename is fragile (a dot in the name, an alias the inference misses). The explicit override is optional in v1; the durable multi-project resolution is part of the deferred domain-aware mode. - -* Phase A — Identify, count, and match - -1. Resolve the current project's identifier and aliases (above). -2. Read =~/org/roam/inbox.org=. If absent, silent no-op (the file lives only on machines with the roam clone). -3. Bucket every item under the inbox heading: - - *claimed* — prefixed for this project - - *foreign* — prefixed for another project → leave - - *unowned* — no project prefix -4. *Summarize the scan* (Craig's requirement, every scan): report the total item count in the inbox, then the count that appears related to this project. "Appears related" is the union of claimed items (exact prefix) and any unowned item whose topic plainly concerns this project's domain (a content judgment, surfaced as a candidate, never auto-claimed). Foreign-prefixed items are not "related" — they belong to their owner. -5. If both claimed and related-unowned are empty, report the total and stop (the common case for most wraps). - -* Phase B — File each claimed item into todo.org - -Apply =process-inbox.org='s discipline against the project's =todo.org=; don't reinvent it: - -1. *Status check first.* Already done, or already a task in =todo.org=? → drop it, or fold into the existing task (dated sub-entry per =todo-format.md=). Don't duplicate. -2. *Rewrite* to terse-heading + body per =todo-format.md=. -3. *Priority + tags from THIS project's scheme* — the legend at the top of its =todo.org=, tags from that scheme's allowed set only. The project expresses someday-maybe with =[#D]=; there's no special someday-maybe routing. -4. *File* under the project's Open Work section. - -* Phase C — Reconcile the shared inbox - -The roam inbox lives in a git repo (=~/org/roam=, auto-synced by the =roam-sync= timer). Edit it carefully: - -1. *Pull first* (=git -C ~/org/roam pull --ff-only=). If it can't fast-forward (dirty tree, divergence), surface and stop. Don't auto-stash, auto-merge, or force. Resolve before removing items. -2. *Remove only the claimed items.* Never touch foreign or unowned items. -3. *Commit the roam repo as its own commit* (separate from any project wrap commit): =chore(inbox): route <project> tasks to <project>/todo.org=. Push, or leave for the =roam-sync= timer. Surface a blocked push; don't force. - -* Phase D — Surface - -Report: moved (with their new priorities and tags), folded, dropped-as-done. Then the residue: foreign items (left for their owners, count only) and unowned items (count plus the headings that appear related to this project, for manual claim or prefix). Same "summarize what we kept" shape. - -* Skip conditions - -- No =~/org/roam/inbox.org= → silent no-op. -- No claimed and no related-unowned items → report the total, stop. -- Roam pull blocked → surface, stop before editing. - -* Caller integration - -** Startup (read-only nudge) - -Phase A of =startup.org= reads =~/org/roam/inbox.org= and produces the scan summary; Phase C surfaces one line: "Roam inbox: N items total, M appear related to this project — say 'inbox zero' to file them." Offered as one of the priority options. Startup never auto-files; it counts and offers. - -** Wrap-up (Step 3 sub-step) - -A sub-step at the start of wrap-up Step 3 (before the cleanup scripts, so imported tasks get linted and ride the wrap commit) delegates here for the claimed set. Skip-fast when nothing matches. - -* Deferred: domain-aware routing (future work, multi-project) - -v1 handles the single-destination case via the prefix rule. The multi-project parts are deferred until the need is real: - -1. *Domain-aware empty-it-all mode.* If rulesets held a description of each project's domain, one run could guess the owner of every item (prefixed or not) and empty the whole inbox at once, delivering each item to its owning project's =inbox/= via =inbox-send= (where that project's =process-inbox= gate still decides whether to file it). This turns "inbox zero" from a per-project aspiration into a single command. Open: where the domain map lives (central registry vs each project's =notes.org=), how confident a guess must be before auto-routing vs surfacing, and whether a low-confidence item stays put. -2. *Explicit per-project =:INBOX_PREFIX:= as the durable resolver*, replacing basename inference. -3. *Unowned-item lifecycle* once domain-aware routing exists (no item stays unrouted indefinitely). -4. *Concurrent push contention* on the shared roam repo: the pull-before-edit + ff-only + surface-on-conflict floor may want a retry-once-after-pull. - -Take these up when the single-destination version is in use and the multi-project pain is concrete. diff --git a/.ai/workflows/inbox.org b/.ai/workflows/inbox.org new file mode 100644 index 0000000..67f0dde --- /dev/null +++ b/.ai/workflows/inbox.org @@ -0,0 +1,488 @@ +#+TITLE: Inbox Workflow (Engine) +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-06-23 + +* Overview + +One engine for the project's inbox surfaces. Inbox items are *ideas to evaluate*, not orders to execute — each is a proposal that earns a place in =todo.org= or git history only when it passes the value gate. The engine holds the shared disposition machinery once: the three-question value gate, the skeptical review, the disposition ladder, the reply-to-sender discipline, the capture-guard before a roam write, and the priority-scheme check. Each *mode* is a thin section that names which surface it reads, how it enters and exits, and which core steps it runs. + +Two surfaces feed a project, and there is a recurring check over the second: + +1. *Project-local =inbox/= dir* — handoffs from other projects (via =inbox-send=), from scripts, and from Craig (typed directives saved as files). Handled by *process mode*; watched on a cadence by *monitor mode*. +2. *Global roam inbox* (=~/org/roam/inbox.org=) — Craig's cross-project GTD capture, one shared file every project can see. Handled by *roam mode*, which claims only the items this project owns. *Auto inbox zero* runs roam mode on a recurring interactive loop. + +A *third* surface — external accounts (email / calendar / PRs) — is a different domain and stays in its own engine: =triage-intake.org= and its source plugins are *not* part of this engine. "Deal with my inbox dirs" is here; "what's new across my accounts" is there. + +*Two altitudes.* For the user, the trigger phrase picks the mode and the phrases are unchanged (see When to Use). For the implementer, this is one file: the core sections are written once, and each mode references them by name ("run the value gate (core §1) on each item") rather than restating them. + +* When to Use This Workflow + +The trigger phrase selects the mode. Every phrase below still works; it now routes to a mode of this engine. + +** Process mode — the local =inbox/= dir + +- "process inbox" / "process the inbox" +- "handle the inbox" +- "what's in inbox" / "what's in the inbox" +- "let's clear the inbox" / "let's process the inbox items" + +Auto-invocation: startup Phase C delegates here when the local inbox is non-empty — don't ask, just run it. + +** Monitor mode — process mode on a cadence + +- "monitor the inbox" / "watch the inbox" — *the defined meaning:* one process pass now, then loop every 15 minutes (see the Monitor mode Cadence section). The phrase *is* the loop, not an opt-in extra. +- "respond to the handoffs" / "handle the handoffs" — a single pass now, no loop. + +Ambient (always on, even with no loop running): the =inbox-status= task-boundary check (Monitor mode). + +** Roam mode — the global roam inbox + +- "inbox zero" / "empty the inbox" / "process the roam inbox" / "triage my roam inbox" + +Called read-only from startup (count + offer) and as a wrap-up Step 3 sub-step. + +** Auto inbox zero — recurring interactive roam check + +- "auto inbox zero" + +Match this before "inbox zero" — the auto phrase contains the roam phrase as a substring, so the longer match wins. Starts a recurring =/loop=-driven roam-mode pass; see the Auto inbox zero mode. + +** Boundary + +Do *not* invoke this engine for an inbox item that is clearly out-of-scope for the project — that is a cross-project routing problem, handled per the cross-project boundary rule in =protocols.org=. And do not invoke it for external-account triage ("what's new in email/cal/PRs") — that is =triage-intake.org=. + +* Core §1 — The value gate + +Every inbox item (local or roam) passes through three questions. One *yes* is enough to accept. + +1. *Does it advance an existing TODO?* Look up by topic in =todo.org='s open work. If the item extends a filed task, fold it in. If it implements a filed task, do the work. +2. *Does it improve how the project works?* Architecture cleanup, workflow refinement, tooling, rule hygiene, drift detection — anything that makes the project itself more effective. +3. *Does it serve the project's stated mission?* Read =notes.org= *Project-Specific Context* if the mission isn't obvious from the working directory and current task. The item should advance that mission, not orbit it. + +Three *no*s means reject. The rejection isn't lazy — an idea that doesn't help any current task, doesn't improve the system, and doesn't serve the mission is genuine noise, and accepting it inflates =todo.org= without payoff. + +* Core §2 — The skeptical review + +The value gate decides whether an item is worth taking. This review decides whether what it proposes is *right*, *complete*, and *as simple as it should be*. Run it on every task and file that arrives — not only shared-asset change proposals. Pure FYIs and replies that ask for nothing skip it. + +Approach the file with curiosity and skepticism. Work through, in writing — the core pass on every item: + +1. Is the request actually right — does it do what it claims, and is the claim correct for this project? +2. Is it complete, or does it leave a gap — an unhandled case, a missing step, an untested path? +3. Should it be simpler? +4. Can it be enhanced to be more effective than as proposed? +5. Does it conflict with any existing instruction — workflows, skills, rules, protocols, CLAUDE.md? + +When the item proposes a change to *shared assets* — template workflows, rules, skills, scripts, anything synced to consuming projects — or to a substantive convention, add the cross-project battery. It arrived from one project's context; you're evaluating it for all of them: + +6. Does this make sense for *all* consuming projects, or just the sender's situation? +7. How does it change a common activity Craig performs — better, worse, or differently than the sender assumed? +8. Plus at least three more questions specific to this change — what breaks for artifacts already using the old shape, what tooling interacts with it, what's underspecified, what the sender's worked example doesn't exercise. + +Output: a short summary of the thinking and a recommendation (do it / do it with named changes / file / reject). For shared-asset and convention changes the recommendation is surfaced to Craig for approval before applying; for ordinary tasks and files it feeds the act-vs-file and no-approvals-execute decision (Monitor mode). + +** In a no-approvals session: shared-asset changes defer and stage + +Shared-asset and convention changes still don't self-apply when Craig has put the session in no-approvals mode — they need his decision, so they fail the *solo* test in Monitor mode's executing-in-no-approvals criteria. Ordinary tasks and files that pass the review and are quick + solo execute under that criteria instead; this defer-and-stage path is for the shared-asset and convention changes that don't qualify. Run the review, prepare the edits in =working/<task-slug>/= (a patch file or the worked-out diff), file a =[#B]= VERIFY carrying the decision package, and reply to the sender that it's parked. The sender's local stopgap (per =cross-project.md='s propagation process) means the delay costs nothing — the canonical update is about durability, not speed. + +Wording-only fixes — no consuming project acts differently — may proceed even then, logged in the session log. + +The VERIFY shape (top-level, =[#B]= so startup's A/B surfacing catches it; no =SCHEDULED= unless the proposal names a real deadline): + +#+begin_example +** VERIFY [#B] Parked: <proposal topic> (from <sender>) +What arrived: <one line — what the handoff proposes>. +Recommendation: <accept as-is / accept with changes / reject> — <2-3 line +skeptical-review summary: what's right, what to change, what was checked>. +Prepared diff: [[file:working/<slug>/proposed.diff]] — apply is mechanical on +your go. +Say "approve the parked <topic>" (or adjust / reject) and it gets applied. +#+end_example + +The full question-battery answers live in the session log and the =working/= dir, not the task body — the body carries the conclusion, with the trail one link away. + +* Core §3 — The disposition ladder + +Every item that clears the value gate gets one disposition. The first six are the per-item outcomes; *park* is the no-approvals shared-asset path from core §2. + +** Implement now +Small, scoped, clear, no design call required. The work is the disposition. Do the work, commit per the project's commit flow, delete the inbox file. The commit message references the inbox item by filename so the provenance lands in =git log=. + +** Fold into existing TODO +The item extends a task already filed. Update the parent TODO's body with a dated reconciliation sub-entry per =todo-format.md= (=*** YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <what landed>=). Move substantive content to =docs/design/<date>-<topic>.<ext>= if it's worth keeping; reference from the TODO body. Delete the inbox file. + +** File as TODO +Substantive but waits, or needs design/triage before implementation. Add the TODO under =* <Project> Open Work= with priority + tags per the priority-scheme check (core §6). Body summarizes the proposal and links the inbox content if it's been moved to =docs/design/=. Delete the inbox file (or move it to =docs/design/= first if the content survives). + +** Defer +Rename in place to =inbox/PROCESSED-<original-filename>= and add a brief comment line at the top: =# Deferred YYYY-MM-DD: <condition>=. Don't accumulate deferred items indefinitely — sweep them on a future process pass when the condition is met or the deferral has aged out. + +** Reject — by source +- *From Craig* — push back honestly in chat. State why you won't implement; offer the conditions under which you would, if any. The inbox file stays until Craig confirms — override re-enters as accept, acknowledgment deletes the file. Don't theatre the pushback: if you don't genuinely think Craig is wrong, just do the work. +- *From another project (handoff)* — write a response file at =/tmp/inbox-response-<topic>.org=: a heading naming the original handoff and date, one paragraph on the rejection rationale (*which* value-gate question failed and why), one paragraph on the condition under which you'd reconsider (or "never, this misreads the project's mission" if that's the truth). Deliver via =inbox-send <sender> --file /tmp/inbox-response-<topic>.org=. Delete the local inbox file after the response lands. Silent rejection on a handoff trains the sender to escalate around the channel — always close the loop. +- *From a script or automated system* — just delete. No notification. + +** Park (skeptical review in a no-approvals session) +Move the proposal file into =working/<task-slug>/= alongside the prepared diff, file the =[#B]= VERIFY per core §2, reply to the sender that it's parked for Craig's review, and delete the inbox file. On Craig's approval the apply is mechanical: apply the prepared edits, run the normal verify-and-publish flow, close the parked =**= VERIFY per =todo-format.md= (a top-level VERIFY resolves to =DONE= + =CLOSED:=, not a dated header), and send the acceptance reply. On rejection, the reject-from-another-project flow above runs unchanged. + +* Core §4 — Reply-to-sender discipline + +A handoff came from another project's agent (or the user). Close the loop: + +- *Accepted and acted on* — send a confirmation to the sender via =inbox-send <sender> --text "..."=, naming what landed and the commit, so they're not left guessing (they can't see this project's git log). =inbox-send= excludes the current project as a target, so a self-sourced item is handled in-session, not sent. +- *Accepted and filed* — a short confirmation that it's filed and where, so the sender knows it wasn't dropped. +- *Rejected* — always state the why (which value-gate question failed), per the reject-by-source ladder (core §3). + +Cross-project boundary: never act on a file under another project's =.ai/= scope from here — route it back as a handoff (see =cross-project.md=). + +* Core §5 — Capture-guard before a roam write + +Before *any* read-modify-write of =~/org/roam/inbox.org=, run =.ai/scripts/capture-guard "$HOME/org/roam/inbox.org"=. This runs first because the Phase D edit rewrites the file on disk, and editing underneath a live capture wedges it just as a stray hand edit would. + +- *Exit 0* → no live capture (or no reachable Emacs). Proceed. +- *Exit 1* → an indirect org-capture buffer is cloned from the roam inbox (the script prints the offending buffer name). Editing the file underneath it would leave the capture pointing at stale state and unable to finalize with =C-c C-c= (see =emacs.md=). Behavior depends on the caller: + - *On-demand / interactive run* → stop and surface: "You have a live org-capture session open against the roam inbox (=<buffer>=) — finalize it (=C-c C-c=) or abort it (=C-c C-k=) and I'll continue." Re-run the guard and resume once it returns clean. + - *Wrap-up sub-step* → don't block the wrap. Skip the roam reconcile for this run and surface one line: "Skipped roam-inbox reconcile — a live org-capture is open against it; claimed items stay and get caught next run." The items were already filed into =todo.org= in roam mode Phase C, so the next roam run's Phase C status-check drops the duplicates and its Phase D removes them — the skip self-heals. + +* Core §6 — Priority-scheme check + +This gates filing whenever there are accept-and-file items. Check whether =todo.org= has a top-of-file priority scheme (an explicit legend defining =[#A]= through =[#D]= semantics and mandatory/optional tag conventions — a =* <Project> Priority Scheme= section or similar). + +- *Scheme present* — file new TODOs per the scheme. Every TODO gets a priority cookie matching the legend's rules, the mandatory type tag, and any applicable effort/autonomy tags. +- *Scheme absent* — surface one sentence: "This project has no priority scheme. We should adopt one before filing the new TODOs from this inbox pass — want me to propose one based on the rulesets scheme?" If Craig says yes, do that first (the =/research-priority-scheme= research subagent pattern in rulesets is the reference). If Craig says no, file the TODOs without grading but flag in the commit message that they're un-prioritized pending a scheme. + +The point is to avoid adding ungraded =TODO= entries to a project that's never agreed on what =[#A]= means. + +* Mode: process + +Reads the project-local =inbox/= dir. Entry: a trigger phrase, or startup Phase C on a non-empty inbox. Exit: inbox empty (excluding =.gitkeep= and intentional =PROCESSED-*=), session log updated, =:LAST_INBOX_PROCESS:= stamped. + +** Phase A — Inventory (one parallel batch) + +Issue these reads in one parallel batch: + +1. List =inbox/= excluding =.gitkeep= and =PROCESSED-*= prefixes (use =\ls -la inbox/= per the protocols.org exa-alias note). +2. Read =notes.org= *Project-Specific Context* if mission isn't already loaded in the session. +3. Read =todo.org='s top-of-file priority scheme if present. + +For each inbox file, parse the filename for sender. Two common patterns: + +- =YYYY-MM-DD-HHMM-from-<sender>-<topic>.<ext>= — from another project via =inbox-send=. +- =<topic>.org= — typically from Craig directly, or from a script. + +Note the file type. =.eml= files need the extract script (not raw =Read=): + +#+begin_src bash +# View mode +python3 .ai/scripts/eml-view-and-extract-attachments.py inbox/<file>.eml + +# Pipeline mode (extract attachments to a directory) +python3 .ai/scripts/eml-view-and-extract-attachments.py inbox/<file>.eml --output-dir assets/<target>/ +#+end_src + +Everything else, read directly. + +** Phase B — Evaluate each item + +For each inbox file: + +1. *Read it.* Full read for substantive proposals (org files with TODO entries, design notes, multi-section docs); skim short FYIs and one-liner asks. +2. *Identify the shape.* Instruction, question, proposal, FYI, or handoff — shapes guide disposition. +3. *Apply the value gate* (core §1). One yes → candidate accept. Three nos → candidate reject. +4. *Run the skeptical review* (core §2) on the item before classifying — the core pass on every accepted task and file, plus the cross-project battery when it proposes a shared-asset or convention change. Its summary + recommendation rides along to Phase C; in a no-approvals session it gates whether the item self-applies (quick + solo + agreed, per Monitor mode) or, for shared-asset and convention changes, defers and stages. +5. *Within accept, classify* by the disposition ladder (core §3): implement now / fold into existing TODO / file as TODO. +6. *Within reject, classify by source* (core §3): from Craig / from another project / from a script. + +** Phase B.1 — Priority-scheme check + +Run core §6. This gates Phase C filing when there are accept-and-file items. + +** Phase C — Surface dispositions + +Numbered options inline per =interaction.md= (no popup). Recommendation at item 1. + +Batch trivial items (one-line rejections of script noise, obvious file-as-TODO accepts where the scheme is already settled) into a single confirm-all prompt. Walk substantive items one at a time so the decision is visible. + +Per-item template: + +#+begin_example +<filename> from <sender>: <one-line summary> +Value-gate read: <yes/no on each of the three questions, one phrase each> +Disposition recommendation: <implement / fold into <TODO> / file [#X] :tags: / reject> + +1. <recommendation as item 1> +2. <alternative> +3. Defer — leave in inbox under PROCESSED-<topic>.<ext> until <condition> +4. Something else +#+end_example + +For items that went through the skeptical review, the surfaced disposition includes its summary + recommendation, and approval here is what authorizes the apply. In a no-approvals session those items are reported as parked (the =[#B]= VERIFY) rather than surfaced for live approval. + +For pure FYIs that need no action, surface as a single line and recommend delete-with-acknowledgment. + +** Phase D — Apply + +Apply each disposition per the ladder (core §3). The flow is autonomous past Craig's Phase C approval. + +** Phase E — Close out + +Verify =inbox/= is empty (excluding =.gitkeep= and any intentional =PROCESSED-*= files). Run =\ls -la inbox/= and confirm. + +Update the session log per =protocols.org= with one short paragraph: count processed, count accepted (implement/fold/file split), count rejected (Craig/handoff/script split), and the commit SHA if a commit landed. + +Stamp =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section if it exists, so future workflows that gate on freshness can read it. Same format as =:LAST_AUDIT:= (=YYYY-MM-DD=). + +* Mode: monitor + +Process mode on a cadence. This is the *when, how-often, and act-vs-file* layer; the per-item disposition mechanics are the core sections, run via process mode — not restated here. Monitor decides *that* an item gets handled and *how I respond*; the core decides *what disposition* each item gets. + +The gap it closes: handoffs that arrive mid-session used to sit unseen until the user asked or the next startup ran. A handoff the sender can't see being handled trains them to escalate around the inbox channel. + +** Preconditions — before starting + +Never begin monitoring on a dirty worktree or a failing test suite. A dirty tree means the auto-commit at the end of an executed item sweeps up unrelated changes; a red suite means you can't tell whether the monitor broke something. At the start: + +1. =git status --porcelain= is empty (clean worktree). +2. A full test run is all green (=make test= here, or the project's full-suite command). + +If *dirty*: offer to commit the pending changes in discrete, logical batches before starting. If *red*: offer to investigate the failures first. Surface the blocker with inline numbered options per =interaction.md= and wait — monitoring does not start until the tree is clean and the suite is green. + +** Cadence — how often to check + +*"Monitor the inbox" = run now, then loop every 15 minutes.* Do one process pass over any pending handoffs immediately, then start the loop: + +#+begin_src +/loop 15m check the inbox with inbox-status and run inbox.org process mode over any pending handoffs +#+end_src + +Each firing runs the cheap =inbox-status= check first and only does a full process pass when items are pending. The loop is the monitoring; it runs until Craig stops it or the session ends. Honor the Preconditions gate before the first pass and the Close-out gate when the loop stops. + +*Ambient task-boundary check (always on, even without a loop).* After finishing a unit of work, before reporting back or asking "what's next," run the cheap status check: + +#+begin_src bash +.ai/scripts/inbox-status -q +#+end_src + +Exit 1 means handoffs are pending — list them (drop =-q=) and run process mode. Exit 0 means clean; say nothing. This is one =find=; it costs nothing to run often, and it's the fix for handoffs piling up unseen during long sessions. + +*Startup and wrap-up already cover their ends.* Startup Phase C processes a non-empty inbox; the wrap-up sanity check refuses to wrap with unprocessed handoffs. The task-boundary cadence fills the middle. + +*Mid-task arrivals.* If a handoff lands while you're mid-task and it's urgent (blocks the current work, or is time-sensitive), surface it right away. Otherwise batch it to the next task boundary so the current work isn't thrashed. + +** The act-vs-file decision + +Every accepted handoff (one that clears the value gate) is then either acted on now or filed as a task. + +*Act immediately — and just do it, no asking — when all of these hold:* +- *Clear* — the action is unambiguous; no design decision or option-choice is needed. +- *Bounded* — small, finishable this session, ideally a tight file set. +- *Low-risk and verifiable now* — not a risky change to load-bearing infra (or trivially revertible), and testable/lintable this session. +- *In-scope and safe* — within this project, not destructive or outward-facing without confirmation, not across a project boundary. +- *Cheaper than deferring* — doing it now costs less than filing plus re-triaging later. + +When you decide to act, queue the work and do it. Don't ask first. + +*Exception:* a proposal to change a shared asset (template workflow, rule, skill, synced script) or a substantive convention never qualifies for silent act-now, however clear and bounded it looks — it routes through the skeptical review (core §2), which carries its own approval (or, in a no-approvals session, park) step. + +*File a task when any of these hold:* +- It needs a judgment call, a design decision, or an option the user would pick. +- It's large, multi-session, or sprawls across many files. +- It's blocked (a dependency, an external thing, the user is away). +- It's risky enough to want the user's eyes before it lands. +- It's off the session's active goal and acting now would derail it (file and keep going, unless it's urgent). + +When you decide to file, *ask first* — inline numbered options per =interaction.md=, with *filing as option 1 (the recommendation)* and *"do it now" as option 2*: + +#+begin_example +<handoff> wants <X>. My read: file it (needs <reason>). + +1. File as a TODO ([#?] :tags:) — Recommended +2. Do it now instead +3. Something else + +Pick a number. +#+end_example + +*Always ask if you're unsure* which side of the line an item falls on. Decisiveness on clear act-now items is the point of the rule; the ask is for genuine ambiguity and for filing. + +** Executing in no-approvals mode + +When Craig has put the session in no-approvals mode, an accepted item may be implemented automatically — but only when all three of these hold: + +1. *Agreed* — you've run the value gate and the full skeptical review and concluded the change should be done, not merely that it's harmless. +2. *Quick* — the whole implementation, including verification, is under ~15 minutes. +3. *Solo* — you can carry it end to end without a decision from Craig. Manual verification you perform yourself is fine; needing Craig to choose an option, approve a design, or resolve an ambiguity is not. + +All three → implement it, verify, then commit and push at the end of that item (the Step 0 reconcile and pre-push check from =commits.md= still run). Miss any one and it doesn't self-apply: a shared-asset or convention change needs Craig's decision, so it fails *solo* and routes to the defer-and-stage park (core §2 / core §3); an oversized item fails *quick* and gets filed. + +** Replying to handoffs + +Close the loop per the reply-to-sender discipline (core §4): confirm what landed (accepted-and-acted), confirm where it's filed (accepted-and-filed), or state the why (rejected). + +** The inbox-status script + +=.ai/scripts/inbox-status= lists unprocessed handoffs and exits nonzero when any are pending. Exclusions match the wrap-up sanity check (=.gitkeep=, =lint-followups.org=, =PROCESSED-*=). Exit 0 = clean, 1 = pending, 2 = no inbox/ or bad usage. Use =-q= for the count-only form the cadence check calls. + +** Close out — before finishing + +End the way it started: clean worktree, green suite. Before stopping the loop or reporting the pass done: + +1. Commit or revert everything left in the worktree — nothing uncommitted remains. +2. Run the full test suite once more and confirm all green. + +If either can't be satisfied — a half-done item, a failure introduced during the pass — surface it rather than leaving it. The next monitor run assumes a clean, green starting state (the Preconditions gate). + +* Mode: roam + +Reads the *global roam inbox* (=~/org/roam/inbox.org=), Craig's cross-project GTD capture: one shared file every project can see. This mode routes each roam item to the project that owns it. The current session claims only the items belonging to THIS project, files them into the project's =todo.org=, and removes them from the shared inbox. Everything it doesn't own stays. + +The aspiration is inbox zero: after this mode runs, the current project's local handoff inbox has been processed (Phase A delegates to process mode) and the shared roam inbox no longer contains items explicitly owned by this project. + +This is distinct from the wrap-up inbox/transcript routing feature (which moves session-filed keepers between projects). This routes the shared roam capture file by ownership prefix. + +** Scope: single-destination (v1) + +Routes each item to its one owning project, identified by an explicit =<project>:= heading prefix. The multi-project domain-aware mode (guess the owner of every unprefixed item and empty the whole inbox in one run) is deferred — see "Deferred: domain-aware routing" at the end. v1 claims only what's prefixed for the current project, surfaces the rest, and never guesses. + +** Callers + +The steps live here so three callers reuse them: +- *Startup* (read-only nudge) — count the items, identify which appear related to this project, surface both numbers, offer processing as one of the startup options. Never auto-files. +- *Wrap-up* (Step 3 sub-step) — sweep items that belong here before the cleanup scripts, so imported tasks lint and ride the wrap commit. +- *On demand* — the roam-mode trigger phrases. + +Each project touches the roam inbox at least twice a session this way: once at startup, once at wrap-up. + +** The ownership rule (the coordination primitive) + +The inbox is shared, so the mode must never let two projects fight over an item or let one grab another's. Ownership is by explicit prefix: + +- =<project>: ...= heading → owned by that project. The current project claims only items prefixed with its own identifier. +- Prefixed for *another* project → leave untouched (cross-project boundary, =protocols.org=). +- *No prefix* → unowned. Never auto-claim. Surface as candidates a human can claim or prefix. + +The prefix partition is what makes concurrent triage across projects safe: each project only ever removes its own items, so two sessions editing the inbox touch disjoint lines. + +*Resolving this project's identifier (v1).* Use the project root basename plus its common aliases (=.emacs.d= ↔ =emacs=, and the obvious ones: =rulesets=, =work=, =home=). A project may override the inferred set with an =:INBOX_PREFIX:= line in =notes.org='s *Workflow State* section when the basename is fragile. The explicit override is optional in v1; the durable multi-project resolution is part of the deferred domain-aware mode. + +** Phase A — Process the project-local inbox + +1. Check the project-local =inbox/= with =.ai/scripts/inbox-status -q=. +2. If pending handoffs exist, run *process mode* before touching the roam inbox. Project handoffs are already addressed to this project, so they are higher-confidence and cheaper to clear than shared roam captures. +3. If =inbox-status= reports no =inbox/= directory, note it and continue to the roam inbox. Some projects only participate in the shared roam capture flow. +4. If process mode cannot finish because it needs Craig's decision, stop after surfacing that decision. Do not remove roam items in the same run; the project still does not have a clean inbox. + +** Phase B — Identify, count, and match roam items + +1. Resolve the current project's identifier and aliases (above). +2. Read =~/org/roam/inbox.org=. If absent, silent no-op (the file lives only on machines with the roam clone). +3. Bucket every item under the inbox heading: + - *claimed* — prefixed for this project + - *foreign* — prefixed for another project → leave + - *unowned* — no project prefix + - *empty* — a heading with no title and no body: just stars, optionally a =TODO=/keyword, and whitespace (e.g. =** =, =** TODO =, =*** TODO =). These are aborted or accidental captures, owned by nobody, and safe to delete regardless of project. A heading with any title text or any body content is never empty. +4. *Summarize the scan* (Craig's requirement, every scan): report the total item count in the inbox, then the count that appears related to this project. "Appears related" is the union of claimed items (exact prefix) and any unowned item whose topic plainly concerns this project's domain (a content judgment, surfaced as a candidate, never auto-claimed). Foreign-prefixed items are not "related" — they belong to their owner. Note the empty count separately. +5. If claimed, related-unowned, *and* empty are all absent, report the total and stop (the common case for most wraps). Empty entries on their own are enough to enter Phase D — the cleanup runs even when this project owns nothing else, since empties belong to nobody and removing them is what "check the inbox" should always do. + +** Phase C — File each claimed roam item into todo.org + +Apply the core disposition discipline against the project's =todo.org=; don't reinvent it: + +1. *Status check first.* Already done, or already a task in =todo.org=? → drop it, or fold into the existing task (dated sub-entry per =todo-format.md=). Don't duplicate. +2. *Rewrite* to terse-heading + body per =todo-format.md=. +3. *Priority + tags from THIS project's scheme* (core §6) — the legend at the top of its =todo.org=, tags from that scheme's allowed set only. The project expresses someday-maybe with =[#D]=; there's no special someday-maybe routing. +4. *File* under the project's Open Work section. + +** Phase D — Reconcile the shared roam inbox + +The roam inbox lives in a git repo (=~/org/roam=, auto-synced every 15 minutes by the =roam-sync= timer). Craig captures into it constantly, so its working tree is dirty most of the time — which is exactly why this mode never runs =git pull= itself. A pull on a dirty tree fails, and that would block triage on nearly every run. Instead, edit the file and hand the git work to =roam-sync=, which already commits-first-then-rebases and so handles the dirty tree correctly. + +1. *Guard against a live org-capture session* — run the capture-guard (core §5) before the edit. On exit 1 the caller-specific behavior (interactive stop-and-surface vs wrap-up skip-and-self-heal) is in core §5. +2. *Remove the claimed items and the empty entries* from the working-tree file. Never touch foreign or unowned (titled) items. Empty entries (Phase B's =empty= bucket) are removed on every triage regardless of who would own a titled version, since an aborted capture belongs to nobody. The claimed-item removal and the empty sweep happen in the same edit. +3. *Hand the commit + push to =roam-sync=.* Don't =git pull=, =git commit=, or =git push= here. Trigger the sync so the edit lands promptly rather than waiting up to 15 minutes for the next timer tick: + + #+begin_src bash + systemctl --user start roam-sync.service + #+end_src + + =roam-sync= commits the edit (under its generic auto-sync message), rebases onto the remote, and pushes. The removal is safe to land without a pull-first because only this project ever touches =<project>:=-prefixed lines (the ownership partition), so =roam-sync='s rebase can't conflict on the edit. Provenance for the routed tasks lives in the project's =todo.org= and session log, not the roam commit message. If =systemctl= isn't available, leave the edit for the next timer tick — it still lands. + + Don't pull or stash the roam tree to "clean" it first: that fights =roam-sync= for ownership of the repo's git state. The edit-then-sync handoff is the whole point. + +** Phase E — Surface + +Report: local project inbox disposition first (processed count and whether it is clear), then roam disposition: moved (with their new priorities and tags), folded, dropped-as-done, and empty entries swept (count). Then the residue: foreign items (left for their owners, count only) and unowned items (count plus the headings that appear related to this project, for manual claim or prefix). Same "summarize what we kept" shape. + +If triaging this batch surfaced a durable, cross-project fact (a reference pointer worth keeping, a pattern worth recording), consider writing it to the agent KB as one =:agent:= node (see =knowledge-base.md=; personal projects only). Skip silently when nothing durable came up — never pad an empty run with a KB line. + +** Skip conditions + +- No project-local =inbox/= and no =~/org/roam/inbox.org= → silent no-op. +- Project-local =inbox/= exists but has no pending handoffs → continue to roam scan. +- No =~/org/roam/inbox.org= after the local inbox check → report the local inbox disposition and stop. +- No claimed, no related-unowned, and no empty roam entries → report the total, stop. +- Live org-capture against the roam inbox (capture-guard exit 1) → surface (interactive) or skip-and-self-heal (wrap-up), per core §5. + +** Caller integration + +*Startup (read-only nudge).* Startup already checks the project-local =inbox/= via =inbox-status= and processes it through process mode when needed. It also reads =~/org/roam/inbox.org= and produces the roam scan summary; one line surfaces: "Roam inbox: N items total, M appear related to this project (K empty entries to sweep) — say 'inbox zero' to file them." Offered as one of the priority options. The empty count rides along so a clean-up-only run still gets offered. Startup never auto-files or auto-sweeps roam items; it counts and offers (the read-only nudge never edits, so empties are reported, not removed, until a real triage runs). + +*Wrap-up (Step 3 sub-step).* A sub-step at the start of wrap-up Step 3 (before the cleanup scripts, so imported tasks get linted and ride the wrap commit) delegates here for the claimed set. Skip-fast when nothing matches. + +** Deferred: domain-aware routing (future work, multi-project) + +v1 handles the single-destination case via the prefix rule. The multi-project parts are deferred until the need is real: + +1. *Domain-aware empty-it-all mode.* If rulesets held a description of each project's domain, one run could guess the owner of every item (prefixed or not) and empty the whole inbox at once, delivering each item to its owning project's =inbox/= via =inbox-send= (where that project's process-mode gate still decides whether to file it). Open: where the domain map lives, how confident a guess must be before auto-routing vs surfacing, and whether a low-confidence item stays put. +2. *Explicit per-project =:INBOX_PREFIX:= as the durable resolver*, replacing basename inference. +3. *Unowned-item lifecycle* once domain-aware routing exists (no item stays unrouted indefinitely). +4. *Concurrent push contention* on the shared roam repo: triage now hands its commit + push to =roam-sync=, which already aborts-and-surfaces on a rebase conflict. If multi-machine contention ever makes that abort frequent, =roam-sync= may want a retry-once-after-rebase. + +Take these up when the single-destination version is in use and the multi-project pain is concrete. + +* Mode: auto inbox zero + +A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens. + +** Per cycle + +1. Run roam mode's scan (Phase A local check + Phase B roam scan), read-only — no =git pull=. The capture-guard (core §5) still gates any write, and the rare write hands its git to =roam-sync= (roam Phase D). +2. *Nothing found* → no inbox summary. One acknowledgement line: =ran at HH:MM, nothing found=. Nothing else. The acknowledge-only-on-empty rule keeps a quiet inbox quiet. +3. *Items found* → summarize the found items, file them as tasks (roam Phase C), and *append them to a displayed queue* — the harness task list, via =TaskCreate= — so the queue accumulates across cycles. Then ask: "run this batch next?" + - *Yes* → launch into implementing the found items, each through the normal disposition ladder (core §3) + verify flow. + - *No* → they stay queued for a later go. +4. *Cross-cycle dedup.* Subsequent cycles add only *newly-found* items to the same displayed queue, never re-surfacing what's already there. Dedup against the queue (the =TaskCreate= list), not against what's already been implemented — a find that was queued-but-not-yet-run must not reappear, and one already filed into =todo.org= is dropped by roam Phase C's status check. + +A find is always surfaced and gated on Craig's yes; a quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its execute step waits for a yes. + +** Fully-unattended pass (=/schedule=) — vNext, not v1 + +A fully-unattended cron pass (firing while Craig is away) is a *different contract* and is deferred. It can't wait for a yes, so it has to decide up front whether it may mutate =todo.org= and the roam inbox or stays read-only, how a find reaches Craig asynchronously, how dedup state survives across runs that don't share a session, and what session/auth context a cron run carries. + +The =/schedule= recipe, once that contract is designed, would look like: + +#+begin_src +/schedule <cron-expression> run inbox.org roam mode read-only, and <surface-mechanism> any finds +#+end_src + +v1 ships only the interactive =/loop= shape above; the unattended contract is logged to =todo.org= for its own design pass. Don't invent the unattended behavior here — route a request for it to that task. + +* Common Mistakes + +1. *Treating items as orders.* Inbox content is a proposal. The value gate is the rule. Implementing every item without evaluation inflates =todo.org= and trains senders to keep sending noise. +2. *Filing without applying the value gate.* "File as TODO" is not a default — it's the disposition for proposals that pass the gate but wait. A reject is also a valid answer. +3. *Filing raw TODOs when the project has a priority scheme.* Core §6 is mandatory when the scheme exists. An un-graded TODO in a project with a legend is a defect. +4. *Silently deleting a project handoff.* Send a response naming which value-gate question failed. Silent rejection trains the sender to escalate to Craig instead of through the inbox channel. +5. *Pushing back on a Craig directive only to immediately implement it anyway.* If you genuinely think Craig is wrong, say so and wait. If you don't, just do the work — don't theatre the pushback. +6. *Skipping the implement-vs-fold-vs-file classification.* Defaulting every accept to "file as TODO" turns the inbox into a queue that flows into =todo.org= without filtering. +7. *Not propagating value-gate failure to the response.* When you reject a handoff, name *which* gate question failed so the sender can recalibrate, not just resend. +8. *Forgetting to delete the inbox file after acting.* The local inbox should be empty when process mode ends. Files left behind become noise on the next startup. +9. *Applying a shared-asset change proposal without the skeptical review.* The value gate alone asks whether to take the change, never whether the change is right, complete, or as simple as it should be. (Worked example: the 2026-06-12 spec-decisions handoff was applied as-is and the after-the-fact review surfaced a lost state, a vacuous gate pass, and an enhancement — all catchable up front.) +10. *Editing the roam inbox without the capture-guard.* A disk write under a live org-capture wedges the capture (core §5). Guard first, every roam write. +11. *Auto inbox zero re-surfacing queued items.* The loop must dedup against the displayed queue, not just against what's been implemented — or every cycle re-lists the same un-run finds. + +* Living Document + +Refine the value gate's three questions if the project's mission sharpens. Tune the per-source rejection-response template if =inbox-send= response loops surface a pattern. Tune the monitor cadence if task-boundary checking proves too frequent or too sparse. Capture the auto-loop interval that worked once the pattern recurs. + +If a mode wants real depth — enough that it bloats the core — it can become an =inbox.<mode>.org= plugin under this engine's namespace (the pattern =triage-intake= uses) rather than swelling this file. The principle that inbox items are *ideas to evaluate* is the part that doesn't change. diff --git a/.ai/workflows/monitor-inbox.org b/.ai/workflows/monitor-inbox.org deleted file mode 100644 index 4912a2b..0000000 --- a/.ai/workflows/monitor-inbox.org +++ /dev/null @@ -1,122 +0,0 @@ -#+TITLE: Monitor Inbox Workflow -#+AUTHOR: Craig Jennings & Claude -#+DATE: 2026-05-31 - -* Overview - -Keep the project's =inbox/= responsive: notice handoffs on a cadence, triage each one, decide whether to act now or file it, and reply to the sender. This workflow is the /when, how-often, and act-vs-file/ layer. The per-item disposition mechanics — the value gate, the implement/fold/file classification, the per-source rejection flow — live in [[file:process-inbox.org][process-inbox.org]] and are not duplicated here. Think of it as: monitor-inbox decides /that/ an item gets handled and /how I respond/; process-inbox decides /what disposition/ each item gets. - -The gap this closes: handoffs that arrive mid-session used to sit unseen until the user asked or the next startup ran. A handoff the sender can't see being handled trains them to escalate around the inbox channel. - -* Preconditions — before starting - -Never begin inbox monitoring on a dirty worktree or with a failing test suite. A dirty tree means the auto-commit at the end of an executed item sweeps up unrelated changes; a red suite means you can't tell whether the monitor broke something. At the start of the workflow, check both: - -1. =git status --porcelain= is empty (clean worktree). -2. A full test run is all green (=make test= here, or the project's full-suite command). - -If the worktree is *dirty*: offer to commit the pending changes in discrete, logical batches before starting. If the suite is *red*: offer to investigate the failures first. Surface the blocker with inline numbered options per =interaction.md= and wait — monitoring does not start until the tree is clean and the suite is green. - -* When to Use This Workflow - -Trigger phrases: - -- "monitor the inbox" / "watch the inbox" — *the defined meaning:* run one process-inbox pass now, then loop process-inbox every 15 minutes (see Cadence). Not an opt-in extra — the phrase *is* the loop. -- "respond to the handoffs" / "handle the handoffs" — a single pass now, no loop. - -Cadence auto-trigger (ambient, always on): check at every task boundary during a session, even when no loop is running — see Cadence below. - -* Cadence — how often to check - -*"Monitor the inbox" = run now, then loop every 15 minutes.* When Craig says monitor or watch the inbox, do one process-inbox pass over any pending handoffs immediately, then start the loop: - -#+begin_src -/loop 15m check the inbox with inbox-status and run process-inbox.org over any pending handoffs -#+end_src - -Each firing runs the cheap =inbox-status= check first and only does a full process-inbox pass when items are pending. The loop is the monitoring; it runs until Craig stops it or the session ends. Honor the Preconditions gate before the first pass and the Close-out gate when the loop stops. - -*Ambient task-boundary check (always on, even without a loop).* After finishing a unit of work, before reporting back or asking "what's next," run the cheap status check: - -#+begin_src bash -.ai/scripts/inbox-status -q -#+end_src - -Exit 1 means handoffs are pending — list them (drop =-q=) and process per process-inbox.org. Exit 0 means clean; say nothing. This is one =find=; it costs nothing to run often, and it's the fix for handoffs piling up unseen during long sessions. - -*Startup and wrap-up already cover their ends.* Startup Phase C processes a non-empty inbox; the wrap-up sanity check refuses to wrap with unprocessed handoffs. The task-boundary cadence fills the middle. - -*Mid-task arrivals.* If a handoff lands while you're mid-task and it's urgent (blocks the current work, or is time-sensitive), surface it right away. Otherwise batch it to the next task boundary so the current work isn't thrashed. - -* The act-vs-file decision - -Every accepted handoff (one that clears process-inbox's value gate) is then either acted on now or filed as a task. The rule, and how to surface it: - -*Act immediately — and just do it, no asking — when all of these hold:* -- *Clear* — the action is unambiguous; no design decision or option-choice is needed. -- *Bounded* — small, finishable this session, ideally a tight file set. -- *Low-risk and verifiable now* — not a risky change to load-bearing infra (or trivially revertible), and testable/lintable this session. -- *In-scope and safe* — within this project, not destructive or outward-facing without confirmation, not across a project boundary. -- *Cheaper than deferring* — doing it now costs less than filing plus re-triaging later. - -When you decide to act, queue the work and do it. Don't ask first. - -*Exception:* a proposal to change a shared asset (template workflow, rule, skill, synced script) or a substantive convention never qualifies for silent act-now, however clear and bounded it looks — it routes through process-inbox's *Skeptical Review*, which carries its own approval (or, in a no-approvals session, park) step. - -*File a task when any of these hold:* -- It needs a judgment call, a design decision, or an option the user would pick. -- It's large, multi-session, or sprawls across many files. -- It's blocked (a dependency, an external thing, the user is away). -- It's risky enough to want the user's eyes before it lands. -- It's off the session's active goal and acting now would derail it (file and keep going, unless it's urgent). - -When you decide to file, *ask first* — inline numbered options per =interaction.md=, with *filing as option 1 (the recommendation)* and *"do it now" as option 2*: - -#+begin_example -<handoff> wants <X>. My read: file it (needs <reason>). - -1. File as a TODO ([#?] :tags:) — Recommended -2. Do it now instead -3. Something else - -Pick a number. -#+end_example - -*Always ask if you're unsure* which side of the line an item falls on. Decisiveness on clear act-now items is the point of the rule; the ask is for genuine ambiguity and for filing. - -** Executing in no-approvals mode - -When Craig has put the session in no-approvals mode, an accepted item may be implemented automatically — but only when all three of these hold: - -1. *Agreed* — you've run the value gate and the full Skeptical Review and concluded the change should be done, not merely that it's harmless. -2. *Quick* — the whole implementation, including verification, is under ~15 minutes. -3. *Solo* — you can carry it end to end without a decision from Craig. Manual verification you perform yourself is fine; needing Craig to choose an option, approve a design, or resolve an ambiguity is not. - -All three → implement it, verify, then commit and push at the end of that item (the Step 0 reconcile and pre-push check from =commits.md= still run). Miss any one and it doesn't self-apply: a shared-asset or convention change needs Craig's decision, so it fails *solo* and routes to process-inbox's defer-and-stage park; an oversized item fails *quick* and gets filed. - -* Replying to handoffs - -A handoff came from another project's agent (or the user). Close the loop: - -- *Accepted and acted on* — send a confirmation to the sender via =inbox-send <sender> --text "..."=, naming what landed and the commit, so they're not left guessing (they can't see this project's git log). =inbox-send= excludes the current project as a target, so a self-sourced item is handled in-session, not sent. -- *Accepted and filed* — a short confirmation that it's filed and where, so the sender knows it wasn't dropped. -- *Rejected* — always state the why (which value-gate question failed), per process-inbox's per-source rejection flow. - -Cross-project boundary: never act on a file under another project's =.ai/= scope from here — route it back as a handoff (see =cross-project.md=). - -* The inbox-status script - -=.ai/scripts/inbox-status= lists unprocessed handoffs and exits nonzero when any are pending. Exclusions match the wrap-up sanity check (=.gitkeep=, =lint-followups.org=, =PROCESSED-*=). Exit 0 = clean, 1 = pending, 2 = no inbox/ or bad usage. Use =-q= for the count-only form the cadence check calls. - -* Close out — before finishing - -End the workflow the way it started: clean worktree, green suite. Before stopping the loop or reporting the pass done: - -1. Commit or revert everything left in the worktree — nothing uncommitted remains. -2. Run the full test suite once more and confirm all green. - -If either can't be satisfied — a half-done item, a failure introduced during the pass — surface it rather than leaving it. The next monitor run assumes a clean, green starting state (the Preconditions gate), so handing it a dirty tree or a red suite breaks the next run before it begins. - -* Living Document - -Tune the cadence if task-boundary checking proves too frequent or too sparse in practice. Refine the act-vs-file criteria as edge cases recur. If the background-monitor loop becomes a common pattern, capture the interval that worked. The decision rule itself — act-now is silent, filing asks with file-as-option-1, ambiguity asks — is the stable core (set by Craig, 2026-05-30). diff --git a/.ai/workflows/process-inbox.org b/.ai/workflows/process-inbox.org deleted file mode 100644 index af406ee..0000000 --- a/.ai/workflows/process-inbox.org +++ /dev/null @@ -1,220 +0,0 @@ -#+TITLE: Process Inbox Workflow -#+AUTHOR: Craig Jennings & Claude -#+DATE: 2026-05-28 - -* Overview - -Inbox items are *ideas to evaluate*, not orders to execute. They arrive from Craig (typed directives saved as files), from other projects (handoffs via =inbox-send=), and from scripts/automated systems. Each is a proposal. An item earns a place in =todo.org= or git history only when it passes the value gate: it advances an existing task, improves how the project works, or serves the project's stated mission. - -The workflow is the disposition discipline. Read each item, evaluate honestly, apply the decision, then notify the sender if it was a project handoff and you're rejecting. Silent rejection on a handoff is worse than no reply. - -* When to Use This Workflow - -User triggers: - -- "process inbox" / "process the inbox" -- "handle the inbox" -- "what's in inbox" / "what's in the inbox" -- "let's clear the inbox" / "let's process the inbox items" - -Auto-invocation: - -- Startup =Phase C step 2= delegates here when the inbox is non-empty. Don't ask Craig — just run it. - -Do *not* invoke this for inbox items that are clearly out-of-scope for the project — those are cross-project routing problems, handled per the cross-project boundary rule in =protocols.org=. - -* The Value Gate - -Every inbox item passes through three questions. One *yes* is enough to accept. - -1. *Does it advance an existing TODO?* Look up by topic in =todo.org='s open work. If the item extends a filed task, fold it in. If it implements a filed task, do the work. -2. *Does it improve how the project works?* Architecture cleanup, workflow refinement, tooling, rule hygiene, drift detection — anything that makes the project itself more effective. -3. *Does it serve the project's stated mission?* Read =notes.org= *Project-Specific Context* if the mission isn't obvious from the working directory and current task. The item should advance that mission, not orbit it. - -Three *no*s means reject. The rejection isn't lazy — an idea that doesn't help any current task, doesn't improve the system, and doesn't serve the mission is genuine noise, and accepting it inflates =todo.org= without payoff. - -* The Skeptical Review (every arriving task and file) - -The value gate decides whether an item is worth taking. This review decides whether what it proposes is *right*, *complete*, and *as simple as it should be*. Run it on every task and file that arrives in the inbox — not only shared-asset change proposals. Pure FYIs and replies that ask for nothing skip it. - -Approach the file with curiosity and skepticism. Work through, in writing — the core pass on every item: - -1. Is the request actually right — does it do what it claims, and is the claim correct for this project? -2. Is it complete, or does it leave a gap — an unhandled case, a missing step, an untested path? -3. Should it be simpler? -4. Can it be enhanced to be more effective than as proposed? -5. Does it conflict with any existing instruction — workflows, skills, rules, protocols, CLAUDE.md? - -When the item proposes a change to shared assets — template workflows, rules, skills, scripts, anything synced to consuming projects — or to a substantive convention, add the cross-project battery. It arrived from one project's context; you're evaluating it for all of them: - -6. Does this make sense for *all* consuming projects, or just the sender's situation? -7. How does it change a common activity Craig performs — better, worse, or differently than the sender assumed? -8. Plus at least three more questions specific to this change — what breaks for artifacts already using the old shape, what tooling interacts with it, what's underspecified, what the sender's worked example doesn't exercise. - -Output: a short summary of the thinking and a recommendation (do it / do it with named changes / file / reject). For shared-asset and convention changes the recommendation is surfaced to Craig for approval before applying; for ordinary tasks and files it feeds the act-vs-file and no-approvals-execute decision (=monitor-inbox.org=). - -** In a no-approvals session: shared-asset changes defer and stage - -Shared-asset and convention changes still don't self-apply when Craig has put the session in no-approvals mode — they need his decision, so they fail the *solo* test in monitor-inbox's executing-in-no-approvals criteria. Ordinary tasks and files that pass the review and are quick + solo execute under that criteria instead; this defer-and-stage path is for the shared-asset and convention changes that don't qualify. Run the review, prepare the edits in =working/<task-slug>/= (a patch file or the worked-out diff), file a =[#B]= VERIFY carrying the decision package, and reply to the sender that it's parked. The sender's local stopgap (per =cross-project.md='s propagation process) means the delay costs nothing — the canonical update is about durability, not speed. - -Wording-only fixes — no consuming project acts differently — may proceed even then, logged in the session log. - -The VERIFY shape (top-level, =[#B]= so startup's A/B surfacing catches it; no =SCHEDULED= unless the proposal names a real deadline): - -#+begin_example -** VERIFY [#B] Parked: <proposal topic> (from <sender>) -What arrived: <one line — what the handoff proposes>. -Recommendation: <accept as-is / accept with changes / reject> — <2-3 line -skeptical-review summary: what's right, what to change, what was checked>. -Prepared diff: [[file:working/<slug>/proposed.diff]] — apply is mechanical on -your go. -Say "approve the parked <topic>" (or adjust / reject) and it gets applied. -#+end_example - -The full question-battery answers live in the session log and the =working/= dir, not the task body — the body carries the conclusion, with the trail one link away. - -* Phase A — Inventory (one parallel batch) - -Issue these reads in one parallel batch: - -1. List =inbox/= excluding =.gitkeep= and =PROCESSED-*= prefixes (use =\ls -la inbox/= per the protocols.org exa-alias note). -2. Read =notes.org= *Project-Specific Context* if mission isn't already loaded in the session. -3. Read =todo.org='s top-of-file priority scheme if present (look for a =* Priority and Tag Scheme= section or similar between the intro and the first =* <Project> Open Work= header). - -For each inbox file, parse the filename for sender. Two common patterns: - -- =YYYY-MM-DD-HHMM-from-<sender>-<topic>.<ext>= — from another project via =inbox-send=. -- =<topic>.org= — typically from Craig directly, or from a script. - -Note the file type. =.eml= files need the extract script (not raw =Read=): - -#+begin_src bash -# View mode -python3 .ai/scripts/eml-view-and-extract-attachments.py inbox/<file>.eml - -# Pipeline mode (extract attachments to a directory) -python3 .ai/scripts/eml-view-and-extract-attachments.py inbox/<file>.eml --output-dir assets/<target>/ -#+end_src - -Everything else, read directly. - -* Phase B — Evaluate each item - -For each inbox file: - -1. *Read it.* For substantive proposals (org files with TODO entries, design notes, multi-section docs), the full read is the right move. For short FYIs and one-liner asks, skim. -2. *Identify the shape.* Is it an instruction, a question, a proposal, an FYI, or a handoff? Shapes guide disposition. -3. *Apply the value gate.* Three questions above. One yes → candidate accept. Three nos → candidate reject. -4. *Run the Skeptical Review* (section above) on the item before classifying — the core pass on every accepted task and file, plus the cross-project battery when it proposes a shared-asset or convention change. Its summary + recommendation rides along to Phase C; in a no-approvals session it gates whether the item self-applies (quick + solo + agreed, per =monitor-inbox.org=) or, for shared-asset and convention changes, defers and stages. -5. *Within accept, classify:* - - *Implement now* — small, scoped, clear, no design call required. The work is the disposition. - - *Fold into existing TODO* — the item extends a task already filed; update the TODO body and link the inbox content if substantive. - - *File as TODO* — substantive but waits, or needs design/triage before implementation. -6. *Within reject, classify by source:* - - *From Craig* — push back honestly in chat. State why you won't implement. Offer the conditions under which you would, if any. Wait for Craig to override or accept. - - *From another project* — write a response file naming the rejection rationale and (optionally) the condition under which you'd reconsider. Deliver via =inbox-send <sender> --file <response>= per the cross-project handoff convention. - - *From a script or automated system* — just delete; no notification needed. - -* Phase B.1 — Priority-scheme check - -This gates Phase C filing when there are accept-and-file items. - -Check whether =todo.org= has a top-of-file priority scheme (an explicit legend defining =[#A]= through =[#D]= semantics and mandatory/optional tag conventions). - -- *Scheme present* — file new TODOs per the scheme. Every TODO gets a priority cookie matching the legend's rules, the mandatory type tag, and any applicable effort/autonomy tags. -- *Scheme absent* — surface one sentence: "This project has no priority scheme. We should adopt one before filing the new TODOs from this inbox pass — want me to propose one based on the rulesets scheme?" If Craig says yes, do that first (the =/research-priority-scheme= research subagent pattern in rulesets is the reference). If Craig says no, file the TODOs without grading but flag in the commit message that they're un-prioritized pending a scheme. - -The point is to avoid adding ungraded =TODO= entries to a project that's never agreed on what =[#A]= means. - -* Phase C — Surface dispositions - -Numbered options inline per =interaction.md= (no popup). Recommendation at item 1. - -Batch trivial items (one-line rejections of script noise, obvious file-as-TODO accepts where the scheme is already settled) into a single confirm-all prompt. Walk substantive items one at a time so the decision is visible. - -Per-item template: - -#+begin_example -<filename> from <sender>: <one-line summary> -Value-gate read: <yes/no on each of the three questions, one phrase each> -Disposition recommendation: <implement / fold into <TODO> / file [#X] :tags: / reject> - -1. <recommendation as item 1> -2. <alternative> -3. Defer — leave in inbox under PROCESSED-<topic>.<ext> until <condition> -4. Something else -#+end_example - -For items that went through the Skeptical Review, the surfaced disposition includes its summary + recommendation, and approval here is what authorizes the apply. In a no-approvals session those items are reported as parked (the =[#B]= VERIFY) rather than surfaced for live approval. - -For pure FYIs that need no action, surface as a single line and recommend delete-with-acknowledgment. - -* Phase D — Apply - -Apply each disposition. The flow is autonomous past Craig's Phase C approval. - -** Implement-now - -Do the work. Commit per the project's commit flow. Delete the inbox file. The commit message references the inbox item by filename so the provenance lands in =git log=. - -** Fold into existing TODO - -Update the parent TODO's body with a dated reconciliation sub-entry per =todo-format.md= (=*** YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <what landed>=). Move substantive content to =docs/design/<date>-<topic>.<ext>= if it's worth keeping; reference from the TODO body. Delete the inbox file. - -** File as TODO - -Add the TODO under =* <Project> Open Work= with priority + tags per Phase B.1. Body summarizes the proposal and links the inbox content if it's been moved to =docs/design/=. Delete the inbox file (or move it to =docs/design/= first if the content survives). - -** Reject from Craig - -State the rejection in chat clearly: what you won't implement, why, and the conditions (if any) under which you would. Wait for Craig's override or acknowledgment. The inbox file stays until Craig confirms — if he overrides, re-enter Phase D as accept; if he acknowledges the rejection, delete the file. - -** Reject from another project (handoff) - -Write the response file at =/tmp/inbox-response-<topic>.org=. Contents: - -- Heading naming the original handoff and date -- One paragraph: the rejection rationale (which value-gate question failed and why) -- One paragraph: the condition under which you'd reconsider, if such a condition exists. If the answer is "never, this misreads the project's mission," say so directly. - -Deliver via =inbox-send <sender> --file /tmp/inbox-response-<topic>.org=. The =inbox-send= script (per =cross-project.md=) handles the from-prefix, date stamp, and target inbox path. - -Delete the local inbox file after the response lands in the sender's inbox. - -** Reject from script or automated system - -Just delete. No notification. - -** Defer - -Rename in place to =inbox/PROCESSED-<original-filename>= and add a brief comment line at the top: =# Deferred YYYY-MM-DD: <condition>=. Don't accumulate deferred items indefinitely — sweep them on a future =process-inbox= run when the condition is met or the deferral has aged out. - -** Park (Skeptical Review in a no-approvals session) - -Move the proposal file into =working/<task-slug>/= alongside the prepared diff, file the =[#B]= VERIFY per the Skeptical Review section, reply to the sender that it's parked for Craig's review, and delete the inbox file. On Craig's approval the apply is mechanical: apply the prepared edits, run the normal verify-and-publish flow, rewrite the VERIFY to a dated log entry per =todo-format.md=, and send the sender the acceptance reply. On rejection, the reject-from-another-project flow above runs unchanged. - -* Phase E — Close out - -Verify =inbox/= is empty (excluding =.gitkeep= and any intentional =PROCESSED-*= files). Run =\ls -la inbox/= and confirm. - -Update the session log per =protocols.org= with one short paragraph summarizing this pass: count processed, count accepted (implement/fold/file split), count rejected (Craig/handoff/script split), and the commit SHA if a commit landed. - -Stamp =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section if it exists, so future workflows that gate on freshness can read it. Same format as =:LAST_AUDIT:= (=YYYY-MM-DD=). - -* Common Mistakes - -1. *Treating items as orders.* Inbox content is a proposal. The value gate is the rule. Implementing every item without evaluation inflates =todo.org= and trains senders to keep sending noise. -2. *Filing without applying the value gate.* "File as TODO" is not a default — it's the disposition for proposals that pass the gate but wait. A reject is also a valid file-as-TODO answer to nothing. -3. *Filing raw TODOs when the project has a priority scheme.* Phase B.1 is mandatory when the scheme exists. An un-graded TODO in a project with a legend is a defect. -4. *Silently deleting a project handoff.* Send a response. The sender's next session sees the response in their inbox and learns the rejection rationale. Silent rejection trains the sender to escalate to Craig instead of through the inbox channel. -5. *Pushing back on a Craig directive only to immediately implement it anyway.* If you genuinely think Craig is wrong, say so and wait for his call. If you don't, just do the work — don't theatre the pushback. -6. *Skipping the implement-vs-fold-vs-file classification.* Defaulting every accept to "file as TODO" turns the inbox into a queue that flows into =todo.org= without filtering. Small, scoped, clear items get implemented now; substantive proposals get filed; extensions to existing work get folded. -7. *Not propagating value-gate failure to the response.* When you reject a handoff, the response should name *which* gate question failed (advances no current task / doesn't improve the project / doesn't serve the mission) so the sender can recalibrate, not just resend. -8. *Forgetting to delete the inbox file after acting.* The inbox should be empty when this workflow ends. Files left behind become noise on the next startup. -9. *Applying a shared-asset change proposal without the Skeptical Review.* The value gate alone asks whether to take the change, never whether the change is right, complete, or as simple as it should be. A proposal that's clear and bounded can still carry a design gap — the review is where that surfaces, before the change syncs to every consuming project. (Worked example: the 2026-06-12 spec-decisions handoff was applied as-is and the after-the-fact review surfaced a lost state, a vacuous gate pass, and an enhancement — all catchable up front.) - -* Living Document - -Refine the value gate's three questions if the project's mission sharpens. Tune the per-source rejection-response template if =inbox-send= response loops surface a pattern. Add new auto-classification shortcuts if certain item shapes (e.g. routine FYIs from a script) become common. - -The workflow is shaped by use. The principle that inbox items are *ideas to evaluate* is the part that doesn't change. diff --git a/.ai/workflows/spec-create.org b/.ai/workflows/spec-create.org index f90c511..508b969 100644 --- a/.ai/workflows/spec-create.org +++ b/.ai/workflows/spec-create.org @@ -10,7 +10,7 @@ The guiding principle, drawn from how Google, Oxide, Amazon, Basecamp, and the A It is the front of a trio: - =spec-create.org= (this one) — author writes the spec. -- =spec-review.org= — a reviewer gates the spec for implementation-readiness and writes =<spec-basename>-review.org=. +- =spec-review.org= — a reviewer gates the spec for implementation-readiness and records findings in the spec's =* Review findings= section. - =spec-response.org= — the author folds the review back in. The spec this workflow produces has to *pass spec-review's gate* — that gate is the definition of done. So the structure below is built to answer the reviewer's questions up front. Keep it lightweight anyway: a short required spine plus a *readiness-dimensions menu* where each item is either answered or explicitly marked "N/A because…". The best spec is the shortest one that still lets an engineer build it, test it, and ship behavior that matches the user's mental model. @@ -59,7 +59,7 @@ Capture, in this order: This is where the spec earns a "Ready" from review: an engineer must be able to build it in steps, know when it's done, and never have to invent product behavior mid-implementation. -1. *Implementation phases* — decompose the work into phases each small enough to finish in one focused session and each leaving the tree in a working (not half-broken) state. =spec-review= lifts this section straight into =todo.org= tasks, so a spec that can't be phased fails the gate — the absence is itself a finding. +1. *Implementation phases* — decompose the work into phases each small enough to finish in one focused session and each leaving the tree in a working (not half-broken) state. =spec-review= checks this section decomposes cleanly and =spec-response= lifts it into =todo.org= tasks, so a spec that can't be phased fails the gate — the absence is itself a finding. 2. *Acceptance criteria* — the observable conditions that mean the feature works, written as checkable items. The review's test-surface task mirrors these. 3. *Readiness dimensions* — walk this menu and, for each, either define the behavior or write "N/A because…". The escape hatch keeps a simple spec short; the prompt keeps a hidden decision from slipping into implementation: - *Data model & ownership* — what's user-authored / generated / cached / remote; who owns each editable region; what persists vs refreshes. diff --git a/.ai/workflows/spec-response.org b/.ai/workflows/spec-response.org index 2686cf8..de5b1c8 100644 --- a/.ai/workflows/spec-response.org +++ b/.ai/workflows/spec-response.org @@ -5,9 +5,9 @@ * Overview -The spec-response workflow processes external reviews of a design spec and folds them into the spec until it is implementation-ready. A reviewer (human or another agent) leaves a review file next to the spec — typically produced by its counterpart, the *spec-review* workflow, which writes =<spec-basename>-review.org= and assigns a readiness rubric. Claude works through every recommendation, deciding accept / modify / reject for each, updates the spec for the accepted ones, documents the modified and rejected ones with reasons, then deletes the review file. Repeat for each spec under review. +The spec-response workflow processes a review's findings and folds them into the spec until it is implementation-ready. A reviewer (human or another agent) records findings in the spec's =* Review findings= section — typically via its counterpart, the *spec-review* workflow, which writes one =TODO= task per finding (=[/]= cookie on the heading) and assigns a readiness rubric. Claude works through every finding, deciding accept / modify / reject, updates the spec body for the accepted ones, and completes each finding task in place: accept and modify finish =DONE= (the modify's change noted in the body), reject finishes =CANCELLED= with the reason. Repeat for each spec under review. -The output is a spec a reader could implement from, plus a durable record — inside the spec — of why any recommendation was changed or declined, so the reviewer can find the reasoning later without re-litigating. +The output is a spec a reader could implement from, plus a durable record — inside the spec — of why any finding was changed or declined, so the reviewer can find the reasoning later without re-litigating. The reasoning lives on the completed finding task, not a separate file. This workflow was first run on 2026-05-23 against the linear-emacs =issue-query-spec.org= and =issue-representation-spec.org= reviews, and written up from that run. @@ -27,25 +27,25 @@ A review is only useful if every point in it gets a decision. Without a defined A spec's review is fully processed when: -1. *Every recommendation has an explicit disposition* — accepted, modified, or rejected. None dropped. -2. *Accepted recommendations are woven into the spec body* — the spec reads as if they were always there, not appended as a changelog. -3. *Modified and rejected recommendations are documented in a bottom section* (e.g. "Review dispositions") with a one-paragraph reason each, so the reviewer can find the reasoning. +1. *Every finding has an explicit disposition* — accepted, modified, or rejected, and its task completed (=DONE= or =CANCELLED=). None dropped. +2. *Accepted findings are woven into the spec body* — the spec reads as if they were always there, not appended as a changelog. +3. *Modified and rejected findings carry a one-paragraph reason in the completed task's body*, so the reviewer can find the reasoning. 4. *Review/response provenance is documented in the spec* — iteration count/date, contributor, role, what changed, and why. 5. *Pre-agreed decisions are flipped to =DONE=* — each settled decision's =TODO= becomes =DONE=, and the =[/]= cookie on the spec's =* Decisions= heading reflects the tally. 6. *Cross-spec tensions are reconciled in writing* when related specs were reviewed together. -7. *The review file is deleted* once 1-6 hold. +7. *Every finding task is completed* — the =* Review findings= =[/]= cookie reads complete (each finding =DONE= or =CANCELLED=). 8. *Tracking is updated* — the spec's VERIFY/task body notes "review incorporated" and whether it's implementation-ready. 9. *Implementation tasks exist* — once the author confirms the spec is Ready, the project's =todo.org= carries the full implementation-task breakdown (Phase 6), reviewed for completeness, with =:solo:= marked and a Manual-testing task for everything else. -The whole run is done when no =*-review.org= files remain and each spec is judged implementation-ready (or its remaining blockers are named). +The whole run is done when every spec's =* Review findings= cookie reads complete and each spec is judged implementation-ready (or its remaining blockers are named). -*Measurable validation:* a reader scanning the review against the revised spec can find, for every review point, either the change in the body or its disposition at the bottom. Nothing is unaccounted for. +*Measurable validation:* a reader scanning the spec can find, for every finding, either the change in the body or its disposition on the completed finding task. Nothing is unaccounted for. * When to Use This Workflow Trigger when: -- A reviewer drops a review file alongside a spec — convention: same basename with a =-review.org= suffix (=foo.org= → =foo-review.org=). +- A reviewer records findings in the spec's =* Review findings= section — typically via the spec-review workflow. - Craig says "respond to the review" / "let's run the spec-response workflow" / "process the spec reviews." - Any time a spec needs to absorb structured external feedback and converge to implementation-ready. @@ -63,7 +63,7 @@ the file should be renamed first. Spec workflows require the -spec.org suffix as guard against pointing the workflow at tutorial, inventory, or setup docs. #+end_example -The review file the response consumes follows the convention =<spec-basename>-review.org=, so a misnamed spec produces a mis-pointed review file too. Fix the spec name first. +The findings the response consumes live in the spec's own =* Review findings= section, so the spec filename is the handle the workflow keys on — point it at the right =-spec.org= file. Fix the spec name first. The user resolves the mismatch and re-invokes the workflow. Do not proceed with the response against a misnamed spec. @@ -71,7 +71,7 @@ The user resolves the mismatch and re-invokes the workflow. Do not proceed with ** Phase 0: Orient -1. List the review files (=ls docs/*-review.org= or wherever they live). Process them one at a time in whatever order; the user may name an order. +1. Find specs with open findings — a =* Review findings= section whose =[/]= cookie isn't complete (=TODO= findings remain). Process them one at a time in whatever order; the user may name an order. 2. Re-read the *current* spec, not your memory of it — it may have changed since you wrote it (the user or a linter may have edited it, and pre-agreed decisions may already be encoded in the tracking file). 3. Note any *pre-agreed decisions* the reviewer or user has already settled — in the review's own "Agreed decisions" section, or in the spec's tracking task. These are settled inputs. Don't reopen them; bake them in. @@ -79,15 +79,15 @@ The user resolves the mismatch and re-invokes the workflow. Do not proceed with Read the entire review first. Recommendations interact — an early "medium" finding may be subsumed by a "high" one, or two findings may point at the same edit. Decide dispositions with the whole picture in view, not finding-by-finding as you scroll. -** Phase 2: Decide a disposition for every recommendation +** Phase 2: Decide a disposition for every finding -For each recommendation, choose one: +For each finding, choose one: -- *Accept* — the recommendation is right as written. Plan the edit. -- *Modify* — the recommendation is right in spirit but wrong in detail or scope. Adjust it, and record what you changed and why. -- *Reject* — the recommendation doesn't fit. Record why. +- *Accept* — the finding is right as written. Plan the edit. +- *Modify* — the finding is right in spirit but wrong in detail or scope. Adjust it, and record what you changed and why. +- *Reject* — the finding doesn't fit. Record why. -*Engage critically.* Rubber-stamping is a failure mode. On a strong review most points are accepts, but actively look for the genuine modify/reject cases — they are where your judgment earns its place. Examples from the first run: +*Engage critically.* Rubber-stamping is a failure mode. On a strong review most findings are accepts, but actively look for the genuine modify/reject cases — they are where your judgment earns its place. Examples from the first run: - *Modify:* the review proposed an automatic cache-TTL defcustom. Accepted the goal (fresh data) but deferred TTL to vNext because for a single-user tool an explicit force-refresh + clear-cache command covers it without invalidation complexity. - *Reject:* the review floated a separate =default-issue-filter= defcustom. Rejected as redundant — a fixed default command plus a default-view preference already covered it. @@ -101,8 +101,8 @@ When related specs were reviewed together, two reviews can recommend opposite th ** Phase 4: Update the spec -1. *Weave accepted recommendations into the body.* The spec should read naturally — a new "Selector semantics" section, a revised phase plan, an added test-strategy section — not a list of "review said X so I did Y." The body reflects the decisions; it doesn't narrate them. -2. *Add a bottom "Review dispositions" section* listing only the *modified* and *rejected* recommendations, each with a short reason. Close it with a one-line "everything else accepted as written" so the reader knows the omissions from this section are accepts, not gaps. +1. *Weave accepted findings into the body.* The spec should read naturally — a new "Selector semantics" section, a revised phase plan, an added test-strategy section — not a list of "review said X so I did Y." The body reflects the decisions; it doesn't narrate them. +2. *Complete each finding task in place.* Accept → =DONE=, body noting where it was folded; modify → =DONE=, body noting what you changed and why; reject → =CANCELLED=, body giving the reason. The =[/]= cookie on =* Review findings= tracks progress. The reason on a modified or rejected finding is the durable record — accepted findings are recorded by the body change itself, so they need no separate note. The asymmetry is deliberate. 3. *Update or add a bottom "Review and iteration history" section.* Every response pass gets an entry, even when all findings are accepted. Each entry is an org subheading with a compound id followed by three body fields: Heading format: =YYYY-MM-DD Day @ HH:MM:SS -ZZZZ — Contributor — Role= @@ -113,17 +113,17 @@ When related specs were reviewed together, two reviews can recommend opposite th - *What changed:* compact summary of accepted, modified, and rejected work. - *Why:* the rationale or decision pressure behind the changes. - - *Artifacts:* review filename, disposition section, task IDs, source checks, or commits when useful. + - *Artifacts:* the relevant findings, task IDs, source checks, or commits when useful. 4. *Flip settled decisions to =DONE=.* Each decision the decision-maker has agreed flips its =TODO= to =DONE=; the =[/]= cookie on the =* Decisions= heading tracks the tally. A contested decision stays =TODO= with the back-and-forth under its =*** Discussion= child header. Decisions still =TODO= should be only what genuinely still blocks, each with an owner and a by-when. -5. *Raise the spec to implementation-ready:* consolidate decisions up front, add any implementation prerequisites the review surfaced (e.g. a schema-verification checklist), a consolidated test strategy, and a phased plan ordered so dependencies (like an output model everything depends on) come early. *Gate:* the spec Status cannot move past =draft= to implementation-ready while any decision is still =TODO= — the =[/]= cookie must read complete, or the author consciously accepts and records the risk of building with one open. +5. *Raise the spec to implementation-ready:* consolidate decisions up front, add any implementation prerequisites the review surfaced (e.g. a schema-verification checklist), a consolidated test strategy, and a phased plan ordered so dependencies (like an output model everything depends on) come early. *Gate:* the spec Status cannot move past =draft= to implementation-ready while any decision or any =:blocking:= finding is still =TODO= — both =[/]= cookies must read complete, or the author consciously accepts and records the risk of building with one open. *If this response expanded scope* — folding a finding in added new phases, decisions, or external-dependency assumptions — re-run spec-review's readiness rubric against the *expanded* spec, and file any new gap as a finding or decision before claiming =Ready=. Disposition-completeness gates the *review*; the readiness rubric gates the *spec*. A response can resolve every finding and still be less ready than before, because the answers introduced unproven obligations — the cookies only protect you if the new obligation is actually filed. 6. *Update the status line* to note "review incorporated (<reviewer>, <date>)." ** Phase 5: Close out and iterate -1. *Delete the review file* — only after every recommendation has a disposition. Its deletion is the signal the review is fully processed. -2. *Update tracking* — the spec's VERIFY/task body gets a line noting review incorporated, what changed at a high level, which recommendations were modified (pointing at Review dispositions), and whether it's now implementation-ready pending final go. +1. *Confirm every finding is completed* — the =* Review findings= =[/]= cookie reads complete (every finding =DONE= or =CANCELLED=). The complete cookie is the signal the review is fully processed; there is no file to delete. +2. *Update tracking* — the spec's VERIFY/task body gets a line noting review incorporated, what changed at a high level, which findings were modified or rejected (pointing at the completed findings), and whether it's now implementation-ready pending final go. 3. *Update the session log* (state changed this turn). -4. *Move to the next review file.* Repeat Phases 1-5 until none remain. +4. *Move to the next spec with open findings.* Repeat Phases 1-5 until none remain. 5. *Report* what was accepted-wholesale, what was modified/rejected and why, any cross-spec reconciliations, and the implementation-ready verdict per spec. ** Phase 6: On Ready, build the implementation-task breakdown @@ -150,7 +150,7 @@ The workflow is complete when these tasks exist, the completeness pass confirms Accept, modify, or reject — but never silently drop. The reviewer must be able to account for every point. ** Document the no's, not the yes's -Accepted recommendations live in the spec body (the change *is* the record). Modified and rejected ones need an explicit written reason at the bottom, because the change is invisible and the reasoning would otherwise be lost. The asymmetry is deliberate. +Accepted findings live in the spec body (the change *is* the record). Modified and rejected ones need an explicit written reason on the completed finding task, because the change is invisible and the reasoning would otherwise be lost. The asymmetry is deliberate. ** Critique, don't rubber-stamp A review you accept entirely without finding a single thing to push on probably wasn't read critically. Your judgment — including a well-reasoned no — is the value you add. @@ -164,11 +164,11 @@ When reviews conflict, find the framing where both are right. Silently honoring ** A reject goes back to the reviewer, not just into the file Recording a reasoned reject is the floor, not the close. Communicate the rejection and its reason to the reviewer — a reject is a two-party event, not a unilateral call. If the reviewer disagrees, that's a discussion: weigh the counter, and if you still can't agree, escalate to whoever owns the decision rather than letting the author's "no" stand by default. "I'm not doing that" with no reason the reviewer can engage is the failure mode. (For a tight solo author-reviewer loop this is lightweight; for a team it's the difference between a review and a rubber-stamp-in-reverse.) -** The spec reads forward, the dispositions read backward -The body is written for the implementer (no review archaeology). The dispositions section is written for the reviewer (the reasoning trail). Keep the two audiences separate. +** The spec reads forward, the findings read backward +The body is written for the implementer (no review archaeology). A completed finding's reason is written for the reviewer (the reasoning trail). Keep the two audiences separate. ** The history explains provenance, not implementation behavior -The spec body should still be the implementation contract. The bottom =Review and iteration history= section is for provenance: number of iterations, dates, contributors (including agents), roles, what each pass contributed, and why. Keep it short enough that future readers can understand how decisions evolved without rereading chats, deleted review files, or session logs. +The spec body should still be the implementation contract. The bottom =Review and iteration history= section is for provenance: number of iterations, dates, contributors (including agents), roles, what each pass contributed, and why. Keep it short enough that future readers can understand how decisions evolved without rereading chats or session logs. ** Re-read before editing The spec may have changed since you last saw it. Edit the current file, reconcile against the latest tracking state. @@ -213,3 +213,8 @@ Update this workflow as we learn what works. Capture new disposition patterns, b - *What:* Reconciled this workflow to spec-create's new Decisions convention (each decision is an org =TODO= task that flips to =DONE= on agreement, with a =[/]= cookie on the =* Decisions= heading and a =*** Discussion= child for disputes). Exit Criterion 5, Phase 2's pre-agreed-decisions step, and Phase 4 steps 4-5 now speak in flip-to-=DONE= terms, and the implementation-ready step gates on the all-=DONE= cookie. - *Why:* The convention change landed in spec-create.org via an .emacs.d handoff (originated in its keymap-consolidation spec); this workflow still described the retired =State: proposed | accepted | superseded= model. - *Artifacts:* Handoff =inbox/2026-06-12-1906-from-.emacs.d-spec-create-decisions-todo-note.org=. Paired spec-create.org and spec-review.org edits in the same commit. + +** 2026-06-21 Sun @ 23:16:06 -0400 — Claude Code (rulesets) — responder +- *What:* Folded the review into the spec. Findings are now =* Review findings= =TODO= tasks the responder completes in place (accept/modify → =DONE=, reject → =CANCELLED= with the reason) instead of a "Review dispositions" section; the response is done when the =[/]= cookie reads complete, not when a review file is deleted. Phase 0 finds open work by an incomplete findings cookie; the Phase 4 implementation-ready gate now also requires the findings cookie, and rerun-the-readiness-rubric-on-expanded-scope is folded into that gate (a scope-expanding response must file new obligations as findings or decisions before claiming =Ready=). +- *Why:* Deleting the review file left the iteration-history =Artifacts= line dangling and lost the verbatim review; keeping the file collided with this workflow's file discovery and its "no review files remain" done-condition. Craig's call: incorporate the review into the document, reusing the decisions machinery so the readiness signal is a cookie. The scope-expansion rerun closes a real gap — a response can resolve every finding and still introduce unreviewed obligations. +- *Artifacts:* Paired spec-review.org edits in the same commit. Inbox handoffs =2026-06-20-2339-from-home-spec-response-readiness-gate-proposal.org= and =2026-06-21-0156-from-home-companion-to-tonight-s-spec-response.org=. diff --git a/.ai/workflows/spec-review.org b/.ai/workflows/spec-review.org index d956f00..833dfc9 100644 --- a/.ai/workflows/spec-review.org +++ b/.ai/workflows/spec-review.org @@ -5,9 +5,9 @@ * Overview -The spec-review workflow evaluates a feature/specification document before implementation and decides one thing: can an engineer implement it confidently, test it thoroughly, and ship behavior that matches the user's mental model? If yes, say so and stop. If no, write a review file next to the spec naming every blocking gap and the concrete change that closes it. +The spec-review workflow evaluates a feature/specification document before implementation and decides one thing: can an engineer implement it confidently, test it thoroughly, and ship behavior that matches the user's mental model? If yes, say so and stop. If no, record every blocking gap and the concrete change that closes it as findings in the spec's own =* Review findings= section. -This is the *reviewer* side of a pair. Its counterpart is the spec-response workflow, which the spec's author runs to fold a review back in. The contract between them is the review file: =<spec-basename>-review.org= (e.g. =docs/issue-query-spec.org= → =docs/issue-query-spec-review.org=). spec-review produces it; spec-response consumes it. +This is the *reviewer* side of a pair. Its counterpart is the spec-response workflow, which the spec's author runs to disposition the findings. The contract between them lives *in the spec*: a =* Review findings= section carrying one =TODO= task per finding, with a =[/]= cookie — the same shape the spec's =* Decisions= section already uses. spec-review writes the findings; spec-response completes them. No separate review file is written, so nothing dangles when a review is processed and the full review/response trail stays in the spec. The goal is not to prove the spec is clever. It is to leave the implementer with *fewer* hidden decisions, not more prose. @@ -28,7 +28,7 @@ A review is complete when: 1. *The implementation-readiness gate has been evaluated* and a rubric label assigned (=Ready= / =Ready with caveats= / =Not ready= / =Needs research=). 2. *If ready:* the user is told plainly ("This spec is implementation-ready. I have no further blocking review notes."), and the review stops — no churn for its own sake. -3. *If not ready:* a =<spec>-review.org= file is written next to the spec, in the standard structure, with every finding specific and actionable (current behavior named, risk explained, change recommended, blocking-or-not stated). +3. *If not ready:* findings are recorded in the spec's =* Review findings= section as =TODO= tasks (one per finding, =[/]= cookie on the heading), each specific and actionable (current behavior named, risk explained, change recommended, blocking-or-not stated). 4. *The spec's review history is updated* with who reviewed it, when, which iteration it was, what changed or was recommended, and why. 5. *Deferred work is logged* to =todo.org= (v1 = =[#B]=, vNext/someday = =[#D]=), not left only in chat. 6. *Implementation tasks are enumerated* — the spec's =Implementation phases= section is lifted into a drop-in =todo.org= block (one entry per phase plus a test-surface entry), or, if the spec has no phase decomposition, that gap is raised as a finding. @@ -93,19 +93,19 @@ Mark the spec implementation-ready only if *all* of these hold: - The plan can be phased without shipping broken intermediate states, and phases are small enough to reach a clean stopping point in one focused work session. - External API assumptions are verified or explicitly listed as prerequisites. -If all true → tell the user it's ready and stop unless they ask for more. If any false → continue and write the review file. A "ready" at this phase is provisional; confirm it at Phase 3 after the code read. +If all true → tell the user it's ready and stop unless they ask for more. If any false → continue and record findings (Phase 5). A "ready" at this phase is provisional; confirm it at Phase 3 after the code read. ** Phase 2: Required reading order Never review a spec in isolation. 1. *Read the existing implementation first.* The code paths the spec would touch: public commands and entry points, internal helpers/boundaries, current data representation, persistence/write-back, async/sync, caching, error handling, existing tests, naming/style. Capture current-state facts with function names and file paths. Don't recommend designs that ignore how the package works today. -2. *Read related specs and task tracking.* Companion specs, relevant =todo.org= tasks, README/testing docs, prior review files. Record which tasks the spec absorbs, which stay separate, which decisions are already made, which are still open. +2. *Read related specs and task tracking.* Companion specs, relevant =todo.org= tasks, README/testing docs, prior reviews (in each spec's =* Review findings= and =Review and iteration history=). Record which tasks the spec absorbs, which stay separate, which decisions are already made, which are still open. 3. *Read the target spec end to end — twice.* First for its problem/behavior/phases/assumptions; second looking only for gaps. The second read asks: "What would an implementer still have to invent?" ** Phase 3: Re-run the gate (authoritative) -After reading code and spec, re-run the Phase 1 gate — this is the pass that counts, because now you can actually judge the items that needed the code: architecture fit, API verification, integration points. If now ready, don't manufacture churn. If not, write the review file. +After reading code and spec, re-run the Phase 1 gate — this is the pass that counts, because now you can actually judge the items that needed the code: architecture fit, API verification, integration points. If now ready, don't manufacture churn. If not, record findings (Phase 5). ** Phase 4: Evaluate across dimensions @@ -136,57 +136,30 @@ Work the spec against these. Each is a source of concrete findings, not a box to - *Development tooling.* Does the repo give contributors obvious commands for setup, fast tests, specific tests, compile, lint, coverage, cleanup, slow/manual tests, and release checks? Are optional/live tests gated by explicit environment variables? Is the Makefile/script surface consistent with sibling projects? - *Small enhancement radar.* Are there low-complexity, high-value affordances already provided by the platform that should be surfaced now or explicitly deferred? Examples: archive/compress commands in file managers, built-in history, previews, diagnostics, or doctor commands. Keep the hot path simple; capture the opportunity rather than accidentally losing it. -** Phase 5: Write the review file +** Phase 5: Record findings in the spec -Use this structure for =<spec-basename>-review.org= unless the spec calls for something different: +Findings live in the spec, not a sibling file. Add (or append to) a =* Review findings= section near the spec's =* Decisions= section, with a =[/]= cookie on the heading. Each finding is a =** TODO= task: the heading is the smallest noun phrase naming the gap; the body names current behavior, the risk, and the recommended change. Tag a blocking (high-priority) finding =:blocking:= — it holds the rubric at =Not ready= until dispositioned; leave non-blocking findings untagged. Findings accumulate across review rounds the way decisions do, and the responder completes each one in place (Phase 4 of spec-response), so the section becomes the full review/response trail. #+begin_src org -,#+TITLE: Review: <Spec Title> -,#+AUTHOR: <reviewer> -,#+DATE: <date> -,#+STARTUP: showall - -,* Scope reviewed -What code, tests, docs, and specs you read. - -,* Implementation-readiness -Whether the spec is ready. If not, summarize the blockers. - -,* Overall assessment -The short senior-engineering read: what's right, what's risky, what must be clarified. - -,* High-priority findings -Concrete headings. Each: why it matters and what to change. - -,* Medium-priority findings -Important improvements that shouldn't block all progress. - -,* UX observations -,* Architecture observations -,* Robustness and performance observations -,* Test strategy recommendations -Specific test cases, not generic "add tests". -,* Documentation and tooling recommendations -README/user/developer docs, Makefile/package scripts, coverage, debug tools, and customization surface. - -,* Suggested spec edits -Concrete edits to make the spec implementation-ready. - -,* Agreed decisions -Answers reached during review. Omit if none. - -,* Open questions -Only questions that truly block or materially affect implementation. - -,* vNext candidates -Deferred features to capture in task tracking. +,* Review findings [/] +,** TODO Comment edit-back is undefined :blocking: +The spec says fetched comments render as subheadings but doesn't define whether +editing one syncs back. Linear only lets users edit their own comments. V1 should +treat fetched comments as remote-owned display content and support only adding new +comments; editing own comments can be vNext. (blocking) +,** TODO Empty result and fetch error render identically +A failed fetch and a successful-but-empty fetch produce the same buffer, so the +user can't tell "no issues" from "the query broke." Define a distinct empty-state +message. (non-blocking) #+end_src +Where the old review-file sub-sections go now: the scope-reviewed and overall-assessment narrative goes in the =Review and iteration history= entry (Phase 6); suggested spec edits are the recommended-change line in each finding's body; agreed decisions flip the spec's own =* Decisions= tasks; open questions are =:blocking:= findings or open decisions; vNext candidates are logged to =todo.org= as =[#D]= (Phase 6). The Phase 4 review dimensions are where findings come *from* — not headings to reproduce in the spec. + ** Phase 6: Assign the rubric and update tracking Assign one label consistently: -- =Ready= — no blocking open questions; implementation can start. Requires no decision in the spec's =* Decisions= section to still be =TODO= (the =[/]= cookie reads complete; =SUPERSEDED= and =CANCELLED= count as resolved) — a decision still =TODO= holds the rubric at =Not ready=, or =Ready with caveats= if the author consciously accepts and records the risk. +- =Ready= — no blocking open questions; implementation can start. Requires both cookies complete: no decision in =* Decisions= and no =:blocking:= finding in =* Review findings= still =TODO= (the =[/]= cookies read complete; =SUPERSEDED=/=CANCELLED= and a completed or rejected finding count as resolved) — a still-=TODO= decision or =:blocking:= finding holds the rubric at =Not ready=, or =Ready with caveats= if the author consciously accepts and records the risk. A non-blocking finding left =TODO= is author's discretion and does not hold the rubric. - =Ready with caveats= — can start if the caveats are accepted and tracked. - =Not ready= — blocking ambiguity / missing decisions would force implementers to invent product behavior. - =Needs research= — external API/library/platform assumptions must be verified first. @@ -195,7 +168,7 @@ The most useful reviews move a spec from =Not ready= to =Ready with caveats= or Finding severity maps to blocking power: *high-priority findings block =Ready=* — they hold the rubric at =Not ready= (or =Ready with caveats= if the author accepts and tracks them) until dispositioned; *medium-priority findings are the author's discretion* and don't block. State the blocking status on each finding so the author running spec-response knows which ones gate the rubric. -Then update the spec's review history. Specs should carry a bottom section named =Review and iteration history= (or the nearest existing equivalent) that tracks each material author/reviewer pass. Add a concise entry for this review even when the spec is ready and no review file is written. +Then update the spec's review history. Specs should carry a bottom section named =Review and iteration history= (or the nearest existing equivalent) that tracks each material author/reviewer pass. Add a concise entry for this review even when the spec is ready and no findings are recorded. Each entry is an org subheading with a compound id followed by three body fields. @@ -207,27 +180,11 @@ Body fields: - *What changed or was recommended:* high-signal summary, not a duplicate of the whole review. - *Why:* the decision pressure or rationale that caused the contribution. -- *Artifacts:* links to the review file, response/disposition section, commits, task IDs, or source checks when useful. +- *Artifacts:* links to the relevant findings, commits, task IDs, or source checks when useful. If the spec has no such section, add it at the bottom. Keep the history short and cumulative; it is provenance for future readers, not a session transcript. -*Emit implementation tasks (drop-in for =todo.org=).* Read the spec's =Implementation phases= section and turn it into a paste-ready block in the review file, under a heading =Implementation tasks (drop-in for todo.org)=. One =** TODO= entry per phase, plus a final entry for the test surface. The point: the handoff to whoever implements is one paste, not a re-read of the spec, and a spec that can't be decomposed into phases fails this step, surfacing a shape problem before =Ready=. - -Per-phase entry, following =todo-format.md= (terse heading names the phase; body holds the one-line deliverable plus a pointer back to the spec; tags on the heading): - -#+begin_example -** TODO [#B] <phase name — smallest noun phrase> :feature: -<what this phase delivers, one line>. Spec: [[file:<spec path>]] (Implementation phases, phase N). -#+end_example - -Final test-surface entry, mirroring the spec's =Acceptance criteria= when present: - -#+begin_example -** TODO [#B] <feature> — test surface :test: -Unit: <...>. Integration: <...>. E2e / manual-verify: <acceptance criteria as checkable items>. Spec: [[file:<spec path>]] (Acceptance criteria). -#+end_example - -Priority and tags follow the deferred-work rule below. Emit the block in the review file; the author pastes it into =todo.org= during spec-response, or you log it directly when you're also closing the loop. If the spec has no =Implementation phases= section, don't invent one — that absence is the finding, and the step becomes the prompt to ask the author to add a phase decomposition before the spec can be =Ready=. +*Check the spec decomposes into phases.* A =Ready= spec needs an =Implementation phases= section an implementer can turn into one task per phase plus a test surface. Confirm it's present and decomposable — each phase small enough to reach a clean stopping point in one focused session, with no broken intermediate states. If it's missing or can't be phased, file that as a =:blocking:= finding; don't invent the phases. The phase-to-task breakdown itself is spec-response's job (its Phase 6 reads =Implementation phases= directly once the author confirms =Ready=); the reviewer only verifies the section exists and is sound. Then log deferred work to =todo.org=: v1 implementation = =[#B]= (unless urgent or speculative); vNext/someday = =[#D]=. Tag =:feature:= / =:bug:= / =:refactor:= / =:test:= / =:quick:= / =:solo:= only when accurate. Don't leave important deferred decisions only in chat. @@ -256,8 +213,11 @@ Every material comment should be tagged by force: blocking, should-fix, or optio ** Make feedback author-usable Review comments should be specific, neutral, and actionable: quote or name the spec behavior, explain the risk, recommend the smallest concrete change, and say how the author can verify the fix. Avoid personal language, rhetorical questions, vague "this needs work" comments, and comments that require the author to infer the desired edit. +** Keep review and response roles explicit +If the user asks for review plus "enhance the spec" in the same turn, produce the findings first. Make only low-risk provenance and tracking edits unless the user clearly wants the reviewer to respond too. Don't silently resolve product decisions on the author's behalf — a proposed default belongs in a finding until it's accepted, modified, or rejected. + ** Preserve iteration provenance -Future reviewers and implementers need to know not just the current decision, but how the spec got there: how many review/response loops happened, who contributed, what they changed or recommended, and why. Keep that record in the spec itself under =Review and iteration history= so the trail survives deleted review files, chat loss, and agent handoffs. +Future reviewers and implementers need to know not just the current decision, but how the spec got there: how many review/response loops happened, who contributed, what they changed or recommended, and why. Keep that record in the spec itself under =Review and iteration history= so the trail survives chat loss and agent handoffs. ** Be strict about ownership Especially for org-mode features: a user treats visible text as editable unless the representation says otherwise. Make generated-vs-editable explicit. @@ -265,6 +225,9 @@ Especially for org-mode features: a user treats visible text as editable unless ** Never depend on an unverified API shape If the spec assumes fields/mutations/enums, they're verified against current schema/docs/live responses, or listed as a research prerequisite. =Needs research= is a real, useful verdict. +** Source external-dependency checks in the finding +When a finding turns on a current external-dependency fact (release version, API capability, platform behavior, package availability, hosted-service terms), cite the checked source in the finding body. Stale dependency assumptions are common, and the next reviewer needs to tell "verified this pass" from "remembered from prior context." + ** Favor small pure cores and thin IO layers Push findings toward separable, unit-testable pure functions surrounded by thin command/transport layers. @@ -354,3 +317,8 @@ Sources: - *What:* Two refinements to the same-day decisions convention after Craig's review: the gate item and =Ready= rubric now read "no decision is still =TODO=" with =SUPERSEDED= and =CANCELLED= counting as resolved (spec-create's template defines them as done-class keywords via a =#+TODO:= header), and a spec still on the retired =State:= field model explicitly fails the gate item until converted — closing the vacuous-pass hole on old specs. - *Why:* Review of the freshly-landed convention flagged that TODO/DONE alone lost the old model's superseded state and that the gate as written would silently pass a spec with no decision tasks at all. Craig chose the two done-class keywords and the auto-added =#+TODO:= header (the in-file header is what makes custom keywords portable). - *Artifacts:* Paired spec-create.org edits (keyword scheme + template header) in the same commit. + +** 2026-06-21 Sun @ 23:16:06 -0400 — Claude Code (rulesets) — responder +- *What:* Moved findings from a sibling =<spec>-review.org= file into the spec itself. Findings are now =** TODO= tasks under a =* Review findings= section with a =[/]= cookie, mirroring =* Decisions=; =:blocking:= marks high-priority. Phase 5 records findings in the spec instead of writing a review file; the Phase 6 =Ready= rubric gates on both the decisions and the findings cookie; the implementation-task drop-in (which lived in the review file) is gone, leaving the reviewer to verify the spec decomposes into phases and spec-response to build the breakdown. Also added two reviewer-practice principles harvested from a home spec-review: keep review and response roles explicit, and source external-dependency checks in the finding. +- *Why:* The delete-the-review-file convention left the iteration-history =Artifacts= line dangling and dropped the verbatim review; keeping the file instead collided with spec-response's file discovery and its "no review files remain" done-condition. Craig's call: incorporate the review into the document, reusing the decisions machinery so the readiness signal is a cookie, not a file's presence or absence. The role-explicit and source-checking practices came in from the home finance-report spec via inbox handoffs. +- *Artifacts:* Paired spec-response.org edits in the same commit. Inbox handoffs =2026-06-20-2339-from-home-spec-response-readiness-gate-proposal.org=, =2026-06-21-0156-from-home-companion-to-tonight-s-spec-response.org=, and the home-edited =2026-06-21-0156-from-home-spec-review.org=. diff --git a/.ai/workflows/startup.org b/.ai/workflows/startup.org index 59c9c54..5e8f61e 100644 --- a/.ai/workflows/startup.org +++ b/.ai/workflows/startup.org @@ -10,8 +10,8 @@ The workflow is structured into four phases. *Phase A.0* is a sequential pre-fli Quick contract — runs / produces: - *Phase A.0* (sequential): refresh rulesets, then the project repo. -- *Phase A* (parallel batch): timestamp, session-context check, guarded =.ai/= sync, recent sessions, inbox-status, cross-agent status, notes.org, staleness, language-bundle freshness. -- *Phase B* (parallel batch): read the crash-recovery anchor if present, the recent session summaries, new inbox items, pending cross-agent messages. +- *Phase A* (parallel batch): timestamp, session-context check, guarded =.ai/= sync, recent sessions, inbox-status, notes.org, staleness, language-bundle freshness. +- *Phase B* (parallel batch): read the crash-recovery anchor if present, the recent session summaries, new inbox items. - *Phase C* (interactive): surface findings, process the inbox, run project startup-extras, ask priorities. * Execution @@ -146,12 +146,25 @@ These calls have no dependencies on each other. Issue them all together in one m 4. =\ls -t .ai/sessions/ 2>/dev/null | head -5= — list 5 most recent session files. The backslash bypasses any =ls= alias in the user's profile. Without it, bare =ls -t= silently returns no output under =exa= (a common =ls= replacement) — which makes a sessions directory full of files look empty, and the agent then skips Phase B step 2. 5. =\ls -la inbox/ 2>/dev/null= — inventory the inbox. Same reason for the backslash escape, applied uniformly across the Phase A =ls= calls. -6. =cross-agent-status 2>/dev/null || true= — snapshot of pending cross-agent messages across local projects. This is layer A of the cold-start design from =cross-agent-comms.org=: pending messages from other agents (delivered while no session was active here) get surfaced on session start. The =|| true= keeps Phase A from failing if =cross-agent-status= isn't installed yet — older projects without the script still boot cleanly. If HALT is active, =cross-agent-status= prints a banner; surface that prominently in Phase C. -7. Read =.ai/notes.org= — Project-Specific Context, Active Reminders, Pending Decisions sections (skip About This File). -8. Read =.ai/project-workflows/startup-extras.org= if it exists. -9. =[ -f todo.org ] && .ai/scripts/task-review-staleness.sh todo.org 7 || true= — count top-level tasks overdue for review (the daily task-review habit's startup nudge). The =[ -f todo.org ]= guard skips projects without a root todo.org; =|| true= keeps Phase A from failing if the script isn't synced yet. Threshold 7 days is one review cycle of slack — softer than the wrap-up health check's 30-day alarm. -10. =bash ~/code/rulesets/scripts/sync-language-bundle.sh "$PWD" 2>/dev/null || true= — language-bundle freshness for the current project. Fingerprint-detects which bundle (if any) the project has, auto-fixes drifted rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), and surfaces drift in =settings.json= without writing it (a project may have customized it). =CLAUDE.md= is deliberately left untracked — it's seed-only in =install-lang= and project-owned afterward, mirroring how =diff-lang= skips it. Quiet when there's no bundle or everything's clean. Hardcodes the rulesets path because =languages/= is the canonical source and lives only there — the same absolute-path dependency the rsyncs already carry. =|| true= keeps Phase A from failing on older checkouts where the script isn't present yet. The =.ai/= rsyncs and this call write to disjoint paths (=.ai/= vs =.claude/=/=githooks/=), so the batch stays parallel-safe. -11. =[ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true= — count items in the roam global inbox (=~/org/roam/inbox.org=), the inbox-zero startup nudge. Silent if the roam clone isn't on this machine. Phase C reads the file when the count is non-zero, splits total vs items related to this project, and surfaces the offer (see =inbox-zero.org=). Read-only; never files at startup. +6. Read =.ai/notes.org= — Project-Specific Context, Active Reminders, Pending Decisions sections (skip About This File). +7. Read =.ai/project-workflows/startup-extras.org= if it exists. +8. =[ -f todo.org ] && .ai/scripts/task-review-staleness.sh todo.org 7 || true= — count top-level tasks overdue for review (the daily task-review habit's startup nudge). The =[ -f todo.org ]= guard skips projects without a root todo.org; =|| true= keeps Phase A from failing if the script isn't synced yet. Threshold 7 days is one review cycle of slack — softer than the wrap-up health check's 30-day alarm. +9. =bash ~/code/rulesets/scripts/sync-language-bundle.sh "$PWD" 2>/dev/null || true= — language-bundle freshness for the current project. Fingerprint-detects which bundle (if any) the project has, auto-fixes drifted rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), and surfaces drift in =settings.json= without writing it (a project may have customized it). =CLAUDE.md= is deliberately left untracked — it's seed-only in =install-lang= and project-owned afterward, mirroring how =diff-lang= skips it. Quiet when there's no bundle or everything's clean. Hardcodes the rulesets path because =languages/= is the canonical source and lives only there — the same absolute-path dependency the rsyncs already carry. =|| true= keeps Phase A from failing on older checkouts where the script isn't present yet. The =.ai/= rsyncs and this call write to disjoint paths (=.ai/= vs =.claude/=/=githooks/=), so the batch stays parallel-safe. +10. =[ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true= — count items in the roam global inbox (=~/org/roam/inbox.org=), the roam-mode startup nudge. Silent if the roam clone isn't on this machine. Phase C reads the file when the count is non-zero, splits total vs items related to this project, and surfaces the offer (see =inbox.org= roam mode). Read-only; never files at startup. +11. KB surface prep (the read + contribute startup nudges; see =docs/design/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute). + + #+begin_src bash + ra="$HOME/org/roam/agents" + if [ -d "$ra" ]; then + proj=$(basename "$PWD") + echo "kb-total: $(rg -l '#\+filetags:.*:agent:' "$ra" 2>/dev/null | wc -l)" + echo "kb-bestpractices: $(rg -l 'agent-kb-best-practices' "$ra" 2>/dev/null | head -1)" + matches=$(rg -il "$proj" "$ra" 2>/dev/null | head -5) + [ -z "$matches" ] && matches=$(\ls -t "$ra"/*.org 2>/dev/null | head -3) + echo "kb-relevant-titles:" + for f in $matches; do rg -m1 '^#\+title:' "$f" 2>/dev/null | sed 's/^#+title:/ -/'; done + fi + #+end_src Notes on the rsync commands: - Trailing slashes on both source and destination matter — they tell rsync to sync /contents/ rather than nest a directory inside. @@ -170,7 +183,6 @@ These calls depend on Phase A outputs, but are independent of each other. Issue 1. *Read =.ai/session-context.org= if Phase A reported it exists.* The file is the crash-recovery anchor — if it's there, the previous session was interrupted and the context lives only in this file. 2. *Read each of the 5 most recent session files* from Phase A's =\ls -t .ai/sessions/= output. Read just the =* Summary= section of each — not the full file. The Summary gives Active Goal / Decisions / Data Collected / Findings / Files Modified / Next Steps. That's enough to pick up where things left off. Drill into a specific =* Session Log= later only if you need the /why/ or sequence on something. *If Phase A's listing came back empty, sanity-check with =\ls -la .ai/sessions/= before treating empty as definitive — sessions/ should normally be populated, and an empty result usually means the listing got swallowed somewhere, not that the directory is genuinely empty.* 3. *Read each new inbox file* from Phase A's =\ls -la inbox/= output. For =.eml= files, defer to Phase C — those need the extract script (below) rather than a raw Read. -4. *Process pending cross-agent messages.* For each project with a pending count >0 in Phase A's =cross-agent-status= output (typically the current project; cross-project pending is surfaced too but only acted on if the user asks), run =cross-agent-recv <message-file>= on the file path =cross-agent-status= named. The script returns a structured decision (=process= / =dedup= / =query= / =reject=) per the protocol. For =process=, read the message body to determine the action. For =query=, prepare a clarifying reply. For =reject=, surface to user with the reason. For =dedup=, no action — silent retry already handled. Surface all decisions in Phase C alongside other findings. Rationale: Reads are independent and benign. Batching them means the whole session-history view + inbox view lands in one round-trip instead of one per file. @@ -184,7 +196,9 @@ This phase touches the user and runs sequentially: - Mention Pending Decisions from notes.org. - Briefly note significant template updates noticed during sync (new workflows, protocol changes). - *Task-review nudge.* If the Phase A staleness count (step 11) is greater than zero, surface one line: "=<N>= top-level tasks unreviewed for >7 days — say 'let's do a task review' to run a cycle." If zero, say nothing. - - *Roam inbox nudge.* If the Phase A roam-inbox count is greater than zero, read =~/org/roam/inbox.org=, split total vs items related to this project (claimed by the =<project>:= prefix, plus any unprefixed item whose topic plainly concerns this project), and surface one line: "Roam inbox: =<N>= total, =<M>= appear related to this project — say 'inbox zero' to file them." Offer it as a priority option; never auto-file. If the count is zero or the file is absent, say nothing. See =inbox-zero.org=. + - *Roam inbox nudge.* If the Phase A roam-inbox count is greater than zero, read =~/org/roam/inbox.org=, split total vs items related to this project (claimed by the =<project>:= prefix, plus any unprefixed item whose topic plainly concerns this project), and surface one line: "Roam inbox: =<N>= total, =<M>= appear related to this project — say 'inbox zero' to file them." Offer it as a priority option; never auto-file. If the count is zero or the file is absent, say nothing. See =inbox.org= roam mode. + - *KB consult nudge (read side).* If the Phase A KB-surface prep returned any =kb-relevant-titles=, surface one line listing them (capped 5): "KB lessons that may be relevant: =<title>=; =<title>=… — open the node before related work." The titles are declarative, so the list alone tells you whether to open one. Gated on the roam clone; silent when the clone is absent or nothing relevant surfaced. See the best-practices node and =knowledge-base.md=. + - *KB contribute nudge (write side).* Once per session, surface one line pointing at the best-practices node (the =kb-bestpractices= path from Phase A): "Learned something durable? See =<path>= for how to write a KB node — contributing cross-project facts is welcome (personal projects only; work/unknown projects never write per =knowledge-base.md=)." Light encouragement, never a gate. Gated on the roam clone; silent when absent. - *Language-bundle sync.* If the Phase A step-12 call (=sync-language-bundle.sh=) printed anything, surface it. =fixed= lines are informational — the drift was already repaired (note that =.claude/= is now dirty if the project commits it). A =drift= line on =settings.json= is surface-only and needs the printed =make install-<lang> PROJECT=.= to reconcile; flag it so the user can decide. If the call was silent, say nothing. - *Newly-installed symlinks.* If the Phase A.0 =make install= step printed any =link= / =relink= / =WARN= line, surface it. A =link= line means a skill, rule, hook, or script added to rulesets is now linked into =~/.claude= for the first time on this machine. For a newly-linked *skill*, check the agent's available-skills list: if the harness already registered it mid-session, note it's available and move on; if it's absent, stop and tell Craig to restart the agent so it loads (whether a mid-session reload works is harness-version-dependent). For a newly-linked *hook*, note that the harness reads hooks at session start — it fires from the next session (or after Craig opens =/hooks= once); its settings.json wiring travels with the tracked file, so the link is usually the only missing piece. A =WARN ... not a symlink= line is a real collision at the target path — surface it; it needs a human. If the step printed only "nothing new to link", say nothing. - *Template-sync churn (safety net).* Check whether Phase A's rsync left uncommitted churn in the synced =.ai/= paths — accumulated from a prior session that crashed before wrap-up, or freshly added this session when rulesets advanced. Without surfacing, it builds up silently until it blocks Phase A.0's auto-ff (git won't ff a dirty tree). Skip in the rulesets repo itself (there =.ai/= is a committed mirror, kept honest by the pre-commit hook). The check is sequential here, after the rsync has finished — not a Phase A step, to keep that batch race-free. @@ -197,8 +211,7 @@ This phase touches the user and runs sequentially: #+end_src If it reports a count, surface one line: wrap-up's Step 4.0 will commit it as =chore: sync .ai tooling from templates=, or offer to commit it now. If silent, say nothing. This is the crashed-session counterpart to the wrap-up commit step (the primary fix). From the 2026-05-31 jr-estate + work handoffs. - - *Surface pending cross-agent messages.* If =cross-agent-status= reported any pending messages, list them with their =cross-agent-recv= decision (process / query / reject) per file. For =process= messages in this project's inbox, propose handling now or after the current task. For pending in other projects, mention the count so the user knows to switch projects when ready. If HALT was active, surface that prominently — cross-agent activity is paused until =cross-agent-resume= clears it. -2. *Process inbox if non-empty.* Mandatory — don't ask, just delegate to [[file:process-inbox.org][process-inbox.org]]. That workflow owns the value gate (advances an existing TODO / improves the project / serves the mission), the per-source rejection flow (Craig / project handoff / script), the priority-scheme check before filing, and the =.eml= extraction path. Single source of truth for the discipline. +2. *Process inbox if non-empty.* Mandatory — don't ask, just delegate to [[file:inbox.org][inbox.org]] process mode. That mode owns the value gate (advances an existing TODO / improves the project / serves the mission), the per-source rejection flow (Craig / project handoff / script), the priority-scheme check before filing, and the =.eml= extraction path. Single source of truth for the discipline. 3. *Execute project-specific startup extras* (the contents of =.ai/project-workflows/startup-extras.org= read in Phase A). If the file didn't exist, skip. 4. *Ask about priorities.* "What would you like to work on, or is there something urgent you need?" - If urgent: proceed immediately. diff --git a/.ai/workflows/triage-intake.org b/.ai/workflows/triage-intake.org index 9e9e3dd..a5a3bda 100644 --- a/.ai/workflows/triage-intake.org +++ b/.ai/workflows/triage-intake.org @@ -167,6 +167,10 @@ If Craig has been silent for a while after Phase D and the surface looks closed- This rule prevents the failure mode where the workflow self-declares done and the next exchange has to relitigate what state things are in. +*** KB capture (only if the sweep surfaced something durable) + +If this sweep surfaced a durable, cross-project fact — a recurring pattern across sources, a reference pointer worth keeping, an environment gotcha — consider writing it to the agent KB as one =:agent:= node (see the best-practices node and =knowledge-base.md=; personal projects only, work never writes). One line of judgment, not a step: an all-quiet sweep surfaces nothing and writes nothing. Never blocking, never padded onto a no-signal run. + * Auto mode (unattended monitoring) @@ -187,7 +191,7 @@ Running in the live session means MCP auth (Slack, Gmail, Linear) is inherited f ** Preconditions and Close-out -Auto mode borrows the inbox-monitor gates (=monitor-inbox.org=): do not start on a dirty worktree or a red test suite — a close's batch commit would otherwise sweep up unrelated changes — and leave the tree clean and green when the loop stops. Surface a blocker with inline numbered options per =interaction.md= and wait. +Auto mode borrows the inbox monitor-mode gates (=inbox.org= monitor mode): do not start on a dirty worktree or a red test suite — a close's batch commit would otherwise sweep up unrelated changes — and leave the tree clean and green when the loop stops. Surface a blocker with inline numbered options per =interaction.md= and wait. ** A sweep: accumulate, don't mutate diff --git a/.ai/workflows/wrap-it-up.org b/.ai/workflows/wrap-it-up.org index 139d612..5d2cdd2 100644 --- a/.ai/workflows/wrap-it-up.org +++ b/.ai/workflows/wrap-it-up.org @@ -4,7 +4,7 @@ * Overview -This workflow defines the process for ending a Claude Code session cleanly. It finalizes the session record, commits + pushes all work, and provides a warm handoff. +This workflow defines the process for ending a Claude Code session cleanly. It finalizes the session record, commits + pushes all work, and provides a warm handoff. A bare wrap also tears the session down (kills the ai-term buffer + tmux session, restoring geometry); a qualified wrap keeps the buffer, and a shutdown wrap powers the machine off. The teardown variants are set by the trigger phrase (see Teardown mode below) and act only at the very end, in Step 6. Triggered by Craig saying "wrap it up," "that's a wrap," "let's call it a wrap," or similar. @@ -25,14 +25,30 @@ The wrap-up is complete when: 3. *todo.org is clean.* Cleanup script ran. Any auto-fixes are staged for the wrap-up commit. Orphan planning lines surfaced for manual fix if there are any. 4. *Linear board is honest* (skip if project doesn't use Linear). Any Dev-Review ticket whose PR has merged was moved to Done or PM Acceptance per the classification rule. 5. *Git state is clean.* All changes committed + pushed to all remotes. Working tree clean. -6. *Valediction delivered.* Brief, warm closing with key accomplishments and reminders. +6. *Valediction delivered.* Brief, warm closing with key accomplishments and reminders, ending with =session wrapped.= on its own line as the signoff marker. The absence of =.ai/session-context.org= is the signal that the last session wrapped up cleanly. Its presence at session start means the previous session was interrupted. +* Teardown mode (set from the trigger phrase) + +The wrap itself — Steps 1 through 5 — is identical in every mode. The trigger phrase only decides what Step 6 does once commit + push and the valediction are done. Resolve the mode from the phrase before starting: + +- *Teardown* (the default) — bare "wrap it up", "that's a wrap", "let's call it a wrap". The full wrap, then Step 6 kills the ai-term buffer + the =aiv-<project>= tmux session (which takes =claude= with it) and restores the saved window geometry. This is Craig's typical end-of-day case. +- *No-teardown* — "wrap it up with summary" or "wrap it up and summarize". The full wrap, but Step 6 leaves the buffer and session intact so the summary stays readable. The explicit qualifier is what opts out of teardown. +- *Shutdown* — "wrap it up and shutdown". The full wrap, then Step 6 gates on this being the only live ai-term session and powers the machine off. Shutdown supersedes teardown (killing the buffer is moot if the box is going down). + +Why teardown waits for Step 6 and runs through a hook, never inline: teardown kills the very tmux session =claude= runs in, so an inline kill would cut the valediction off before it renders. Step 6 instead drops a sentinel after everything else is verified, and the =Stop= hook (=ai-wrap-teardown.sh=) does the actual teardown when this response ends — by which point the valediction has already been delivered. + +This depends on three functions in =.emacs.d/modules/ai-term.el= (=cj/ai-term-quit=, =cj/ai-term-live-count=, =cj/ai-term-shutdown-countdown=) and on the =Stop= hook being wired in =settings.json= (=hooks/settings-snippet.json=). If =emacsclient= or the daemon is unreachable, the sentinel is cleared and the session simply stays up — teardown degrades to a no-op, never a wedge. + * The Workflow ** Step 1: Finalize the Summary +*** Early KB reflection (capture while fresh, before the Summary) + +Before distilling the Summary, while the session is still fresh, ask: what did this session learn worth remembering, for yourself or a future agent? Reflect and stage any candidate durable facts — a decision and its why, an environment gotcha, a reference pointer, a transferable lesson. Self-answer silently; this adds no interactive turn (Craig already authorized the wrap). The candidates flow straight into the KB promotion check below, which does the actual writing and the receipt — this is the capture half, that is the commit half, one pipeline, one receipt. Reflecting here rather than reconstructing learnings after the Summary is the point: the early ask is what keeps the receipt from defaulting to "promoted 0" out of fatigue. + Read through the =* Session Log= in =.ai/session-context.org=. Populate (or refine) the =* Summary= section: - *Active Goal* — one or two sentences describing the session's focus @@ -90,15 +106,15 @@ Replace =DESCRIPTION= with your picked slug. (=AI_AGENT_ID= should be filename-s If the project has a =todo.org= at its root, run the cleanup script before committing. Two passes, both fast and idempotent: a hygiene pass and an archive pass. -*** Roam inbox sweep (inbox-zero) +*** Roam inbox sweep (inbox roam mode) -Before the cleanup scripts, sweep the roam global inbox (=~/org/roam/inbox.org=) for items that belong to this project, so any imported tasks get linted and ride the wrap commit. Delegate to [[file:inbox-zero.org][inbox-zero.org]] for the claimed set. +Before the cleanup scripts, sweep the roam global inbox (=~/org/roam/inbox.org=) for items that belong to this project, so any imported tasks get linted and ride the wrap commit. Delegate to [[file:inbox.org][inbox.org]] roam mode for the claimed set. #+begin_src bash [ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true #+end_src -Skip-fast when nothing matches: if the roam clone isn't on this machine, or no item is prefixed for this project, this is a silent no-op. When claimed items exist, run inbox-zero's Phase B–C (file each into =todo.org=, then remove them from the shared inbox in a separate roam commit). Report the total count and how many appeared related to this project, per inbox-zero's scan-summary rule. +Skip-fast when nothing matches: if the roam clone isn't on this machine, or no item is prefixed for this project, this is a silent no-op. When claimed items exist, run roam mode's Phase B–D (file each into =todo.org=, then remove them from the shared inbox and let =roam-sync= commit + push the edit). Report the total count and how many appeared related to this project, per roam mode's scan-summary rule. *** Hygiene pass @@ -204,7 +220,7 @@ For an interactive walk of the judgments mid-day, run =/lint-org todo.org=. *** Inbox sanity check (surface unprocessed handoffs) -If the project has an =inbox/= directory, verify it holds nothing but =.gitkeep=, =lint-followups.org= (the lint-org pipeline file the next morning's daily-prep consumes), and any explicitly-deferred =PROCESSED-*= files before the wrap completes. An inbox that arrived at session start with handoffs from other projects, or that received handoffs mid-session, needs the =process-inbox.org= workflow to run and apply its value-gate dispositions. Wrapping with a dirty inbox silently defers the work to next session and accumulates handoff debt that the sender can't see. +If the project has an =inbox/= directory, verify it holds nothing but =.gitkeep=, =lint-followups.org= (the lint-org pipeline file the next morning's daily-prep consumes), and any explicitly-deferred =PROCESSED-*= files before the wrap completes. An inbox that arrived at session start with handoffs from other projects, or that received handoffs mid-session, needs =inbox.org= process mode to run and apply its value-gate dispositions. Wrapping with a dirty inbox silently defers the work to next session and accumulates handoff debt that the sender can't see. #+begin_src bash unprocessed=$(find inbox -maxdepth 1 -type f \ @@ -213,7 +229,7 @@ unprocessed=$(find inbox -maxdepth 1 -type f \ ! -name 'PROCESSED-*' \ 2>/dev/null | wc -l) if [ "$unprocessed" -gt 0 ]; then - echo "wrap-up: inbox/ has $unprocessed unprocessed item(s). Run process-inbox.org before wrapping, or explicitly defer each item with a one-line reason in the valediction." + echo "wrap-up: inbox/ has $unprocessed unprocessed item(s). Run inbox.org process mode before wrapping, or explicitly defer each item with a one-line reason in the valediction." find inbox -maxdepth 1 -type f \ ! -name '.gitkeep' \ ! -name 'lint-followups.org' \ @@ -226,7 +242,7 @@ If the count is zero or the project has no =inbox/= directory, the check is a si The check exempts =lint-followups.org= explicitly because lint-org runs earlier in the same wrap-up workflow and writes its judgment items to that file in =inbox/= by design. The file is a pipeline artifact for the next morning's =daily-prep=, not a handoff that needs the value gate. -This integrates with =process-inbox.org=, which stamps =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section on completion. Wrap-up doesn't double-stamp. It only ensures the inbox carries nothing but the expected pipeline artifacts at session end. +This integrates with =inbox.org= process mode, which stamps =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section on completion. Wrap-up doesn't double-stamp. It only ensures the inbox carries nothing but the expected pipeline artifacts at session end. *** Review-habit health check (surface a slipped daily task-review) @@ -444,6 +460,8 @@ Include: Tone: warm but professional. No emoji unless Craig has explicitly requested. Acknowledge effort when session was long or difficult. +End on a clear signoff: the *last* line of the valediction is always =session wrapped.= on its own line (lowercase, with the period, nothing after it). It's the unmistakable end-of-session marker, so don't trail it with another sentence. This is the last user-facing output — Step 6's teardown is silent. + Example: #+begin_example That's a wrap. Today we restructured the entire claude-templates @@ -456,8 +474,47 @@ from earlier) and archsetup's layout-navigate tests. Both are ratio-local uncommitted state. Good session. Talk tomorrow. + +session wrapped. #+end_example +** Step 6: Session teardown (mode-dependent) + +The last action of the wrap, and only after Step 4's commit + push is verified and the Step 5 valediction is composed. The teardown itself happens when this response ends (via the =Stop= hook), so the valediction always renders first. Act by the mode resolved up front: + +*** No-teardown mode + +Do nothing. The buffer, the =aiv-<project>= tmux session, and =claude= all stay up so the summary stays readable. The wrap is complete. + +*** Teardown mode (default) + +Confirm commit + push succeeded (Exit Criteria 5 — never tear down over unpushed work), then drop the sentinel: + +#+begin_src bash +touch "/tmp/ai-wrap-teardown-$(basename "$PWD")" +#+end_src + +That is the whole step. Don't run any =tmux kill-session=, =emacsclient=, or buffer kill inline — the =Stop= hook reads the sentinel when this response ends and runs =cj/ai-term-quit=, which kills the =aiv-<project>= session (taking =claude= with it), kills the vterm buffer, and restores geometry. The basename of =$PWD= is the key the hook matches, so the sentinel names the session it tears down. + +*** Shutdown mode + +Confirm commit + push succeeded, then evaluate the safety gate *before* committing to the shutdown — never power the box off out from under another live session: + +#+begin_src bash +emacsclient -e '(cj/ai-term-live-count)' +#+end_src + +- *Count > 1* — another ai-term session is alive. ABORT the shutdown. List the other live =aiv-*= sessions, drop *no* sentinel, and tell Craig in the valediction that it fell back to a normal wrap (no poweroff, no teardown). This gate is the load-bearing safety of the whole feature. +- *Count = 1* — this session is the only one. Drop the shutdown sentinel: + + #+begin_src bash + touch "/tmp/ai-wrap-shutdown-$(basename "$PWD")" + #+end_src + + The =Stop= hook fires =cj/ai-term-shutdown-countdown= when this response ends: it re-checks the gate, runs an abort-able 10→1 countdown in the Emacs echo area (=C-g= cancels), then =sudo shutdown now=. Shutdown supersedes teardown — do *not* also drop the teardown sentinel. + +If =emacsclient= isn't resolvable or the daemon is down, the gate can't run — abort the shutdown, fall back to a normal wrap, and say so. Don't power off on an unverifiable gate. + * Common Mistakes to Avoid 1. *Skipping Step 1 (Summary)* — the file becomes the record; an empty Summary makes it hard to scan at catch-up @@ -493,5 +550,8 @@ Before considering wrap-up complete: - [ ] Current branch pushed to ALL remotes (verified with =git remote -v=) - [ ] All other local branches with a tracking upstream pushed to their remote - [ ] Any untracked-upstream branches surfaced for manual =git push -u= +- [ ] Step 6 teardown matches the trigger phrase: no-teardown leaves the buffer; teardown drops only =/tmp/ai-wrap-teardown-<project>=; shutdown gates on =cj/ai-term-live-count= = 1 and drops only =/tmp/ai-wrap-shutdown-<project>= +- [ ] No teardown/shutdown sentinel was dropped before commit + push was verified +- [ ] Shutdown aborted (fell back to normal wrap, logged in the valediction) when another =aiv-*= session was live or the gate couldn't run - [ ] Commit message follows format (no =session:=, no Claude attribution) - [ ] Valediction delivered (brief, specific, warm) |
