1 files changed, 192 insertions, 0 deletions
diff --git a/docs/inbox-workflow-consolidation-spec.org b/docs/inbox-workflow-consolidation-spec.org
new file mode 100644
index 0000000..2e158b6
--- /dev/null
+++ b/docs/inbox-workflow-consolidation-spec.org
@@ -0,0 +1,192 @@
+#+TITLE: Inbox Workflow Consolidation — Spec
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-06-23
+#+TODO: TODO | DONE SUPERSEDED CANCELLED
+
+* Metadata
+| Status   | Ready — review incorporated (Codex, 2026-06-23)             |
+|----------+-------------------------------------------------------------|
+| Owner    | Craig                                                       |
+|----------+-------------------------------------------------------------|
+| Reviewer | Craig                                                       |
+|----------+-------------------------------------------------------------|
+| Related  | [[file:../todo.org][Consolidate inbox/triage workflows + scheduled inbox check]] |
+|----------+-------------------------------------------------------------|
+
+* Summary
+
+Four inbox-named workflows (=inbox-zero=, =process-inbox=, =monitor-inbox=, plus the startup/wrap-up nudges) circle the same disposition logic across three different surfaces. This spec consolidates them into one =inbox= engine with explicit modes, keeps =triage-intake= (external accounts) and =no-approvals= (session mode) separate, and adds an interactive recurring roam check (=auto inbox zero=). The fully-unattended cron pass is named but deferred to vNext.
+
+* Problem / Context
+
+"Too many inbox related workflows" (Craig, roam capture 2026-06-23). The word "inbox" is overloaded onto three genuinely different surfaces, and the workflows that serve them have grown to circle the same logic:
+
+- *Project-local =inbox/= dir* (handoffs from other projects/scripts/Craig) → =process-inbox.org= owns the value gate and disposition; =monitor-inbox.org= is a thin cadence layer on top of it ("loop process-inbox every 15 min + act-vs-file + reply discipline").
+- *Global roam inbox* (=~/org/roam/inbox.org=, GTD capture) → =inbox-zero.org=, whose Phase A already *calls* =process-inbox= for the local dir before doing the roam-routing part.
+- *External accounts* (email / calendar / PRs) → =triage-intake.org= + six source plugins.
+
+So a reader (or a non-Claude agent) facing "deal with my inbox" has to know which of four files to invoke, and the shared concepts — the three-question value gate, the skeptical review, the implement/fold/file/defer/reject disposition, the reply-to-sender discipline, the capture-guard before a roam write, the priority-scheme check before filing — are spread across and cross-referenced between them. The duplication is real (=monitor-inbox= and =inbox-zero= both lean on =process-inbox='s machinery) and the count is the symptom Craig named.
+
+A second gap surfaced in the same capture: there's no documented way to run a *recurring* inbox check, and Craig wants a keyword trigger for it. v1 answers this with an interactive in-session loop (=auto inbox zero=); a fully-unattended cron pass that fires while Craig is away is a larger contract (mutation safety, surfacing-when-away, cross-run state) and is deferred.
+
+* Goals and Non-Goals
+
+** Goals
+- One engine is the single entry point for the inbox surfaces, with the shared value-gate / disposition / reply / capture-guard / priority-scheme logic living in exactly one place.
+- Mode selection is unambiguous from the trigger phrase and the caller (startup, wrap-up, on-demand).
+- Every existing trigger phrase still works, routing to the right mode — no relearning.
+- A documented interactive recurring check (=auto inbox zero=, =/loop=-based). The fully-unattended cron pass (=/schedule=) is vNext, not v1.
+- INDEX.org, protocols.org, and the startup/wrap-up callers reconciled to the new shape with no dangling references.
+- No behavior regression: the value gate, disposition rules, capture-guard, and reply discipline behave exactly as today.
+
+** Non-Goals
+- *Not* merging =triage-intake.org=. External-account triage ("what's new across my email/cal/PRs") is a different domain from "my inbox dirs"; keeping it distinct is correct, not redundancy.
+- *Not* merging =no-approvals.org=. It's a session mode, not an inbox workflow (it's referenced by the monitor cadence, not part of it).
+- *Not* changing value-gate semantics or disposition rules. This is a structural merge, behavior-preserving.
+- *Not* the domain-aware whole-roam-inbox routing (still deferred, unchanged).
+- *Not* the agent-neutral language sweep over these files — that is the parked half of the agent-source task and runs *after* this merge, over fewer files.
+- *Not* renaming =CLAUDE.md=, =.claude/=, or other structural paths.
+
+** Scope tiers
+- v1: merge =process-inbox= + =monitor-inbox= + =inbox-zero= into one =inbox.org= engine with =process= / =monitor= / =roam= modes; preserve all trigger phrases; reconcile INDEX + protocols + startup + wrap-up; add the interactive =auto inbox zero= recurring check (=/loop=).
+- Out of scope: =triage-intake= merge, =no-approvals= merge, domain-aware roam routing, the agent-neutrality sweep.
+- vNext (log to todo.org): the fully-unattended =/schedule= cron pass — needs its own contract (read-only vs may-mutate =todo.org= / =~/org/roam/inbox.org=, how a find surfaces when Craig is away, how dedup state survives across runs, auth/session constraints); a later umbrella unifying =triage-intake='s "what's new" with the inbox engine; the agent-neutrality pass over the consolidated =inbox.org=.
+
+* Design
+
+The consolidation produces one engine file, =inbox.org=, structured as a shared core plus three thin modes. The core holds every concept that today is duplicated or cross-referenced: the three-question value gate, the skeptical review (with the cross-project battery for shared-asset proposals), the disposition ladder (implement-now / fold / file / defer / reject-by-source / park), the reply-to-sender discipline, the capture-guard before any roam-inbox disk write, and the priority-scheme check before filing. A mode is a short front section that says which surface it reads, how it enters and exits, and which core steps it runs.
+
+*Two altitudes.*
+
+For the *user*: the trigger phrase picks the mode, and the phrases are unchanged. "process inbox" / "handle the inbox" → process mode (the local =inbox/= dir). "monitor the inbox" / "watch the inbox" → monitor mode (process mode on a loop, with the act-vs-file and reply discipline and the clean-tree/green-suite gates). "inbox zero" / "process the roam inbox" → roam mode (route the global roam inbox by =<project>:= prefix, sweep empties, capture-guard the write). Startup calls process mode for the local dir and the read-only roam nudge; wrap-up calls process mode then the roam sweep.
+
+For the *implementer*: =inbox.org= is one file. The core sections are written once. Each mode is a section that references core steps by name rather than restating them ("run the value gate (core §X) on each item", "guard and reconcile the roam write (core §Y)"). The old three files are deleted; their content is absorbed, not copied. The =triage-intake= engine and its plugins are untouched and keep their own namespace.
+
+*Routing and callers.* protocols.org's terminology section and the startup workflow's INDEX-driven routing both key off trigger phrases, so the phrase→mode map is the contract. Each caller that today names =process-inbox.org= / =monitor-inbox.org= / =inbox-zero.org= (startup Phase C, wrap-up Step 3, protocols, INDEX) is repointed at =inbox.org= and the relevant mode. INDEX gets one entry for =inbox.org= listing every trigger phrase, grouped by mode.
+
+*Auto inbox zero (the scheduled mode).* The trigger phrase =auto inbox zero= starts a recurring roam-mode pass. On invocation the engine *asks Craig for the interval* (e.g. 30 min, 2 hours), then drives the loop with =/loop <interval>= running roam mode. It's in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens. Per cycle:
+
+- *Nothing found* → no inbox summary. A single acknowledgement line: ran at =HH:MM=, nothing found. Nothing else.
+- *Items found* → summarize the found items, file them as tasks, and *append them to a displayed queue* (the harness task list, =TaskCreate=) so the queue accumulates across cycles. Then ask: "run this batch next?" If Craig says yes, the engine launches into implementing the found items (each through the normal disposition + verify flow); if no, they stay queued for a later go. Subsequent cycles add only newly-found items to the same displayed queue, never re-surfacing what's already there.
+
+The acknowledge-only-on-empty rule keeps a quiet inbox quiet — no noise when there's nothing to do — while a find is always surfaced and gated on Craig's yes. =auto inbox zero= is the interactive =/loop= shape because its execute step waits for a yes, so it is inherently in-session.
+
+A fully-unattended =/schedule= cron pass (firing while Craig is away) is a different contract and is *vNext, not v1*: it can't wait for a yes, so it has to decide up front whether it may mutate =todo.org= and the roam inbox or stays read-only, how a find reaches Craig asynchronously, how dedup state persists between runs that don't share a session, and what session/auth context a cron run carries. v1 ships only the interactive loop; the unattended contract is logged to =todo.org= for its own design pass.
+
+* Alternatives Considered
+
+** Option A — One engine with modes (chosen)
+- Good, because it cuts four inbox-named files to one and puts the shared logic in a single authoritative place, which is exactly the "too many" complaint.
+- Good, because every trigger phrase can re-home to a mode with no user relearning.
+- Bad, because =inbox.org= becomes a larger file with internal mode branching.
+- Neutral, because =triage-intake= and =no-approvals= stay separate either way.
+
+** Option B — Keep three files, extract a shared include
+- Good, because the diffs are smaller and the per-surface entry points stay familiar.
+- Bad, because it does not reduce the file count — Craig's actual complaint is the number of files, and this keeps three plus adds an include.
+- Neutral, because the dedup of logic happens, just without the count reduction.
+
+** Option C — Merge only process-inbox + monitor-inbox, leave inbox-zero
+- Good, because it fixes the tightest, least-ambiguous redundancy (monitor is literally a loop over process) at the lowest risk.
+- Bad, because roam vs local stays two files; the consolidation is partial (4→3, not 4→2).
+- Neutral, because it could be a first phase of Option A rather than a competing end state.
+
+** Option D — Do nothing, just document which file is which
+- Good, because zero risk to load-bearing synced workflows.
+- Bad, because it doesn't reduce the count at all; the complaint stands.
+
+* Decisions [4/4]
+
+** DONE Engine shape — one file with modes vs partial merge
+- Context: Option A (one =inbox.org=, 4→1) maximally addresses "too many" but is the biggest single change to load-bearing synced files. Option C (merge the process/monitor pair only, 4→3) is lower-risk and could be A's first phase.
+- Decision: We will build Option A — one =inbox.org= engine with =process= / =monitor= / =roam= modes. (Craig, 2026-06-23.)
+- Consequences: easier discovery and one home for the logic; harder single-file size and a bigger, higher-blast-radius diff, mitigated by the shared-core + thin-mode structure and the plugin-namespace escape hatch for a mode that wants depth.
+
+** DONE Trigger-phrase routing — preserve all existing phrases
+- Context: protocols + startup route by phrase; users have these in muscle memory.
+- Decision: We will keep every existing trigger phrase, re-homing each to its mode on the one engine, adding only the new =auto inbox zero= phrase. (Craig, 2026-06-23 — accepted as recommended.)
+- Consequences: easier — no relearning, no broken muscle memory; harder — the engine must document a longer phrase→mode table and guard against collisions.
+
+** DONE triage-intake stays separate
+- Context: external-account triage is a different surface; folding it in would re-bloat the engine.
+- Decision: We will leave =triage-intake.org= and its plugins untouched, out of this consolidation. (Craig, 2026-06-23 — accepted as recommended.)
+- Consequences: easier — smaller, coherent inbox engine; harder — two "what's arriving" entry points remain (inbox engine vs triage-intake), documented so the boundary is clear.
+
+** DONE Scheduled-check mechanism + behavior + keyword
+- Context: Craig wants a recurring inbox check with a keyword, an interactive find-then-execute flow, and a running queue.
+- Decision: The trigger phrase is =auto inbox zero=. On invocation it asks Craig for the interval, then runs roam-mode on =/loop <interval>=. Empty cycle → one acknowledgement line (ran at HH:MM, nothing found), no inbox summary. Find → summarize, file as tasks, append to the displayed task queue (=TaskCreate=), and ask "run this batch next?"; on yes, implement the found items; subsequent cycles append only new finds to the same queue. =/schedule= stays available for a fully-unattended pass. (Craig, 2026-06-23.)
+- Consequences: easier — a quiet inbox stays quiet, a find is always gated on a yes, and the queue is one accumulating view; harder — the loop must dedup against already-queued items so it doesn't re-surface them, and the in-session =/loop= shape means the unattended case still needs =/schedule=.
+
+* Review findings [2/2]
+
+** DONE Fully unattended scheduled behavior is not specified :blocking:
+Disposition: accepted via the narrow option. v1 ships only the interactive =auto inbox zero= (=/loop=); the fully-unattended =/schedule= pass is deferred to vNext with its open contract questions named (read-only vs may-mutate, surface-when-away, cross-run dedup state, auth/session). Folded into Summary, Goals, Problem/Context, Scope tiers (vNext), and the Design "Auto inbox zero" subsection; the vNext contract is logged to todo.org. This sequences the unattended pass rather than dropping it, preserving Decision 4's intent.
+The Summary and Goals promise a scheduled unattended inbox check with trigger keywords, but the concrete =auto inbox zero= design is intentionally interactive: it asks for an interval, runs =/loop <interval>= in the live session, and waits for Craig before executing found work. The only unattended behavior is the sentence that =/schedule= remains available for a fully-unattended cloud-cron pass. That leaves an implementer to invent the actual scheduled contract: trigger phrase(s), whether the pass is read-only or may mutate =todo.org= / =~/org/roam/inbox.org=, how findings are surfaced when Craig is away, how dedup state survives across runs, and what auth/session constraints apply. Add a distinct =/schedule= subsection and acceptance criteria for the fully unattended mode, or narrow the Summary/Goals to say v1 ships only the interactive =/loop= mode and log the unattended cron shape as vNext. (blocking)
+
+** DONE Stale-reference verification relies on a checker that does not check workflow links
+Phase 2 and the Risks section rely on the workflow-integrity / INDEX-drift check as the backstop for missed references to =process-inbox.org=, =monitor-inbox.org=, and =inbox-zero.org=. Current =scripts/workflow-integrity.py= checks INDEX coverage, script references, plugin parentage, orientation sections, and duplicate trigger phrases; it does not validate arbitrary =[[file:...org]]= workflow links or prose references in workflows/protocols/rules. That means a deleted-workflow link in =startup.org=, =wrap-it-up.org=, or =protocols.org= can survive the named checker. Keep the grep requirement, but make it an explicit acceptance item with the exact scope: at minimum =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= after caller rewrites, allowing only intentional historical/spec/todo mentions. Optionally extend =workflow-integrity.py= to validate local workflow links, but do not imply it already catches this class. (non-blocking)
+
+Disposition: accepted. Added the exact grep as an acceptance item and a Phase 2 step, reworded the Risks "missed caller reference" dodge and the Dev-tooling readiness line so the integrity checker is no longer implied to validate workflow links, and noted the optional =workflow-integrity.py= extension as not-required-for-v1.
+
+* Implementation phases
+
+** Phase 1 — Author the inbox engine
+Write =inbox.org= (canonical =claude-templates/.ai/workflows/=): the shared core (value gate, skeptical review, disposition ladder, reply discipline, capture-guard, priority-scheme check) plus the three mode sections, absorbing the content of the three source files. No caller changes and no deletions yet — the tree still works with the old files in place, the new engine sits alongside for review.
+
+** Phase 2 — Reconcile callers and retire the old files
+Repoint INDEX.org (one =inbox.org= entry, phrases grouped by mode), protocols.org terminology, startup.org Phase C, and wrap-it-up.org Step 3 at =inbox.org= + mode. Delete =process-inbox.org=, =monitor-inbox.org=, =inbox-zero.org=. Grep for stale references — =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= — and clear every live caller (the integrity checker covers INDEX coverage and trigger duplication, not workflow links). Run the workflow-integrity / INDEX-drift check. Sync the mirror.
+
+** Phase 3 — Auto inbox zero + scheduled check
+Add the =auto inbox zero= mode to =inbox.org=: ask-for-interval, =/loop <interval>= over roam mode, the empty-cycle acknowledgement, and the find → summarize → file → queue → ask-to-execute flow with cross-cycle dedup against the displayed queue. Document the =/schedule= recipe for the fully-unattended pass alongside it.
+
+** Phase 4 — Verify
+Trigger-phrase coverage (every old phrase resolves to a mode), startup + wrap-up dry-run against the new engine, capture-guard still gates the roam write, INDEX drift clean, mirror in sync.
+
+* Acceptance criteria
+- [ ] Every trigger phrase that today routes to =process-inbox= / =monitor-inbox= / =inbox-zero= resolves to a mode of =inbox.org=.
+- [ ] The three old workflow files are deleted and absent from INDEX; the integrity check reports no orphan or stale entry.
+- [ ] After caller rewrites, =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= returns only intentional historical / spec / todo mentions — no live caller reference to a deleted file. (The integrity checker validates INDEX coverage, not arbitrary workflow links, so this grep is the real backstop.)
+- [ ] Startup still processes the local inbox and produces the read-only roam nudge; wrap-up still sweeps the project's roam items.
+- [ ] The capture-guard runs before any roam-inbox disk write in the consolidated engine.
+- [ ] The value gate, disposition ladder, and reply-to-sender discipline are present once and unchanged in behavior.
+- [ ] =auto inbox zero= asks for an interval, then runs roam mode on =/loop <interval>=.
+- [ ] An empty auto cycle emits only a timestamped acknowledgement (ran at HH:MM, nothing found) — no inbox summary.
+- [ ] A find summarizes the items, files them as tasks, appends them to the displayed queue, and asks before executing; "yes" runs the batch; later cycles append only newly-found items, never re-surfacing queued ones.
+- [ ] Canonical and mirror copies are in sync (=sync-check.sh=).
+
+* Readiness dimensions
+- Data model & ownership: the engine reads two files it doesn't own (project =inbox/= dir contents, =~/org/roam/inbox.org=) and writes =todo.org= + the roam file. Ownership unchanged from today; the merge moves no data.
+- Errors, empty states & failure: empty inbox → report and stop (preserved per surface); roam pull blocked or dirty → surface and stop, never auto-stash (preserved); live org-capture on the roam file → capture-guard blocks the write (preserved).
+- Security & privacy: N/A because no credentials or sensitive data; the engine moves task text between local files.
+- Observability: the user sees which mode ran and its disposition summary; INDEX drift check surfaces a mis-wired routing.
+- Performance & scale: N/A because the inputs are small text files triaged by hand-scale counts.
+- Reuse & lost opportunities: the whole point — the shared core is written once instead of three times; =triage-intake='s plugin pattern is intentionally not reused here (different surface).
+- Architecture fit & weak points: integration points are INDEX.org, protocols.org terminology, startup Phase C, wrap-up Step 3. Weak point: a missed caller reference to an old filename breaks routing — mitigated by the explicit stale-reference grep (acceptance item), since the integrity check covers INDEX coverage, not workflow links.
+- Config surface: trigger phrases (the phrase→mode table) and the scheduled-check keyword set + cron expression.
+- Documentation plan: the engine file is the doc; INDEX entry updated; protocols terminology updated. No separate user doc needed.
+- Dev tooling: the workflow-integrity check + startup INDEX-drift check cover INDEX coverage and trigger-phrase duplication; they do *not* validate workflow file links, so the stale-reference grep (acceptance item) is a manual step. Optionally extend =workflow-integrity.py= to validate local =[[file:...org]]= workflow links — not required for v1.
+- Rollout, compatibility & rollback: the merge lands via the template sync; =rsync --delete= removes the three retired files from every consuming project on its next startup, and the new =inbox.org= arrives the same pass. Rollback = git revert of the rulesets commit, then the next sync restores the old files. Trigger-phrase preservation is the compatibility guarantee.
+- External APIs & deps: N/A because no external API; =/schedule= and =/loop= are harness features, not deps.
+
+* Risks, Rabbit Holes, and Drawbacks
+- *Missed caller reference.* A lingering mention of =process-inbox.org= / =monitor-inbox.org= / =inbox-zero.org= in a workflow, protocol, skill, or the INDEX would break routing after the files are deleted. Dodge: =rg 'process-inbox|monitor-inbox|inbox-zero' claude-templates/.ai .ai claude-rules= and clear every live caller before deleting. The workflow-integrity checker validates INDEX coverage and trigger-phrase duplication, *not* arbitrary =[[file:...]]= links, so the grep — not the checker — is the real backstop here.
+- *Single-file sprawl.* One engine with three modes risks becoming the wall-of-text the workflows were split to avoid. Dodge: the shared-core + thin-mode structure and the terseness pass; if a mode wants real depth, it can become a =inbox.<mode>.org= plugin under the engine namespace (the same pattern =triage-intake= uses) rather than bloating the core.
+- *Sequencing with the agent-neutrality sweep.* If the neutrality sweep runs first, it edits three files about to be deleted. Dodge: this consolidation lands first by construction (it's why the sweep was parked).
+
+* Review and iteration history
+** 2026-06-23 Tue @ 21:51:51 -0400 — Claude — author
+- What: initial draft.
+- Why: Craig chose to spec the inbox-workflow consolidation before building a load-bearing 3-to-1 merge of synced workflows.
+- Artifacts: docs/inbox-workflow-consolidation-spec.org; todo.org "Consolidate inbox/triage workflows + scheduled inbox check".
+** 2026-06-23 Tue @ 22:05:00 -0400 — Craig — decision-maker
+- What: resolved all four decisions; added the =auto inbox zero= scheduled mode (ask-for-interval, empty-cycle acknowledgement only, find → summarize → file → displayed queue → ask-to-execute, cross-cycle dedup). Status → ready for review.
+- Why: chose Option A (4→1 engine) and specified the recurring-check behavior in full.
+- Artifacts: Decisions [4/4]; Design "Auto inbox zero" subsection.
+** 2026-06-23 Tue @ 22:15:58 -0400 — Codex — reviewer
+- What: spec-review pass rated the spec =Not ready= and added two findings: the fully unattended =/schedule= behavior is not specified, and stale-reference verification leans on a checker that does not validate workflow links.
+- Why: the current design is strong enough for the consolidation and interactive =/loop= mode, but the stated scheduled/unattended goal would force implementers to invent behavior before shipping.
+- Artifacts: Review findings [0/2].
+** 2026-06-23 Tue @ 22:28:00 -0400 — Claude — responder
+- What: both findings accepted and folded. Finding 1 (blocking) resolved by narrowing v1 to the interactive =auto inbox zero= (=/loop=) and deferring the fully-unattended =/schedule= contract to vNext with its open questions named — Summary, Goals, Problem/Context, Scope tiers, and Design updated. Finding 2 resolved by adding the exact stale-reference grep as an acceptance item + Phase 2 step and dropping the over-claim that the integrity checker validates workflow links. Findings [2/2], Decisions [4/4]; scope narrowed (not expanded), so no readiness-rubric rerun needed. Status → Ready.
+- Why: the scheduled/unattended promise outran what Craig actually specced (he detailed the interactive loop); sequencing the cron pass to vNext keeps v1 honest. The checker genuinely doesn't catch stale workflow links.
+- Artifacts: Decisions [4/4]; Review findings [2/2]; vNext task to be logged for the unattended cron contract.