diff options
Diffstat (limited to 'claude-templates/.ai/workflows')
| -rw-r--r-- | claude-templates/.ai/workflows/INDEX.org | 15 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/clean-todo.org | 19 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/code-quality.org | 83 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/inbox.org | 21 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/no-approvals.org | 2 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/open-tasks.org | 32 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/page-me.org | 26 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/readability-audit.org | 242 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/spec-create.org | 14 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/spec-response.org | 6 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/spec-review.org | 8 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/startup.org | 25 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/suspend.org | 112 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/task-audit.org | 38 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/task-review.org | 8 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/work-the-backlog.org | 263 | ||||
| -rw-r--r-- | claude-templates/.ai/workflows/wrap-it-up.org | 44 |
17 files changed, 922 insertions, 36 deletions
diff --git a/claude-templates/.ai/workflows/INDEX.org b/claude-templates/.ai/workflows/INDEX.org index eef81df..88721ed 100644 --- a/claude-templates/.ai/workflows/INDEX.org +++ b/claude-templates/.ai/workflows/INDEX.org @@ -22,6 +22,8 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - Triggers: "wrap it up", "that's a wrap", "let's call it a wrap" - No-teardown triggers: "wrap it up with summary", "wrap it up and summarize" - Shutdown trigger: "wrap it up and shutdown" +- =suspend.org= — capture-only mid-session pause for an abrupt departure: append a resume-weighted =SUSPENDED= entry to the Session Log, note uncommitted work, and LEAVE =.ai/session-context.org= in place so the next startup resumes from it. The capture-only counterpart to =wrap-it-up= (which archives + tears down) and to =flush= (=/flush=, which prompts =/clear= and resumes the same session). Provides only the capture half; startup's interrupted-session path is the resume half. + - Triggers: "suspend the session", "suspend", "I need to go", "stick a pin in everything" - =retrospective.org= — post-mortem after a tough session. - Triggers: "let's do a retrospective", "retrospective time" @@ -52,6 +54,11 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - Roam-mode triggers: "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox" - Auto-mode trigger: "auto inbox zero" (match before "inbox zero") +- =work-the-backlog.org= — the autonomous task-execution loop, the single home for working a batch of marked tasks unattended: takes an ordered task set (explicit list or tag query) + session mode (=file-only= default / =autonomous-commit= + paging) + a hard run cap; each candidate passes the mechanical eligibility gate (status =TODO= + =:solo:= per the project's scheme header) and the four-item defer checklist, then is implemented to the full quality bar (TDD, =/review-code=, =/voice=) as its own logical commits. Fed by the inbox auto-loop's chain step (yes-gated, file-only, cap 1) and the no-approvals speedrun preset (pre-flight Q&A → autonomous-commit + always-push + end-of-set page over an explicit ordered list). + - Speedrun triggers: "speedrun", "no approvals speedrun", "speedrun these: <task set>" — any phrase containing "speedrun" routes here (the preset), never to =no-approvals.org= + - Manual triggers: "work the backlog", "work the backlog with <task set>" (file-only defaults) + - Synthesis trigger: "synthesize backlog metrics" — read the per-project metrics logs, compute trends + the corrections signal, write one =:agent:metrics:= KB node (personal projects only) + ** Calendar - =add-calendar-event.org= — create a calendar event. @@ -87,6 +94,13 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - =spec-response.org= — fold a spec review back in: decide accept / modify / reject for every finding, weave accepts into the spec body, complete each finding task in place (the reason recorded on modifies and rejects), reconcile cross-spec tensions, iterate to implementation-ready. The *author* side; consumes the =* Review findings= =spec-review.org= produces. - Triggers: "respond to the review", "process the spec reviews", "spec-response workflow", "fold in the review" +** Code quality + +- =code-quality.org= — one trigger that sequences every behavior-preserving quality pass over a scope of existing code: =/refactor= (complexity, duplication, dead-code, simplification) then =readability-audit= (comments, headers, names, organization), then surfaces the =:refactor:= tasks readability filed and any deferred =/refactor= findings. A thin orchestrator — each pass keeps its own gate. Excludes =/simplify= (that's for the current diff, not existing code). + - Triggers: "code quality sweep", "quality sweep", "run every quality pass on <scope>", "give me every pass on <scope>" +- =readability-audit.org= — make code readable to a future maintainer: audit file-top commentary, inline comments (why-not-what), names (intention-revealing), and organization (co-location / stepdown / cohesion). The cheap comment- and name-only fixes (dimensions A/B/C) land inline, verified by a green suite; the structural findings (dimension D — split a module, rename a public symbol) are *filed* as =:refactor:= tasks, not done here. Language-agnostic. Feeds =/refactor= (which executes the filed structural work); distinct from =/refactor='s metric scans and =/simplify='s diff cleanup. + - Triggers: "let's run the readability-audit workflow", "audit the comments and commentary in <area>", "clean up the structure/organization of <module>", "readability audit" + ** Tools and meta - =process-meeting-transcript.org= — record → transcript → labeled archive. @@ -108,6 +122,7 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e - Triggers: "session harvest", "harvest the sessions", "let's run the session-harvest workflow", "monthly harvest", "mine the sessions" - =no-approvals.org= — drop the interaction-level approval gates for a pre-agreed batch while keeping engineering-discipline gates (=/review-code=, =/voice personal=, tests, session-log updates, subagent reviews, destructive-action consent). Mode stays on until Craig turns it off, a real question arises, the queue empties, or the conversation switches topics. - Triggers: "no-approvals mode", "no approvals", "no-approval", "no need for approval gates", "stop asking, just keep going", "I'll check back in when you're done or stuck", "do all =<selector>= with no-approval" + - Exception: any phrase containing "speedrun" routes to =work-the-backlog.org='s no-approvals speedrun preset instead * Living Document diff --git a/claude-templates/.ai/workflows/clean-todo.org b/claude-templates/.ai/workflows/clean-todo.org index dd33056..a1b2af5 100644 --- a/claude-templates/.ai/workflows/clean-todo.org +++ b/claude-templates/.ai/workflows/clean-todo.org @@ -27,7 +27,17 @@ Deletes bogus =- State "X" from "X" [date]= log lines (state didn't actually cha To preview without writing, run =--check= first: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org=. -** Step 2: Archive completed work +** Step 2: Convert done sub-tasks to dated entries + +#+begin_src bash +emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org +#+end_src + +Rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the =CLOSED:= line. Enforces the depth rule that a completed sub-task becomes dated history — a shape interactive org closes and =--archive-done= (level-2 only) leave unapplied. Timestamp comes from each entry's =CLOSED= cookie; heading text kept verbatim; idempotent; a done sub-task with no parseable =CLOSED= is flagged and left alone. Run before archiving so a parent's sub-tasks are already dated when it moves. Capture the output. + +To preview without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org=. + +** Step 3: Archive completed work #+begin_src bash emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org @@ -37,10 +47,11 @@ Moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Op To preview the moves without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done --check todo.org=. -** Step 3: Summarize +** Step 4: Summarize -Report to Craig from the two captured outputs: +Report to Craig from the three captured outputs: - Hygiene: how many bogus state-log lines were deleted; any orphan-planning warnings (file:line + heading), or "none". +- Convert: how many done sub-tasks were rewritten to dated entries (heading + line), any flagged for no =CLOSED= date, or "nothing to convert". - Archive: how many subtrees moved and which (heading + line), or "nothing to move" / the skip reason if a section was missing or ambiguous. - If the file changed, note that =todo.org= now has an uncommitted edit — review =git diff -- todo.org= and commit it (in this repo's commit style) if it looks right. If nothing changed, say so and stop. @@ -49,7 +60,7 @@ Don't auto-commit. The summary is the review point; Craig decides whether the di * Principles - *Both passes apply, not just preview.* The workflow is invoked because cleanup is wanted. Use the =--check= variants only when Craig asks for a dry run. -- *Two passes, two invocations.* =--archive-done= is its own mode and does not run the hygiene pass; run both. +- *Separate modes, separate invocations.* =--convert-subtasks=, =--archive-done=, and the hygiene pass are each their own mode and don't run the others; run all three. - *Never auto-commit todo.org.* Surface the diff and let Craig commit it. The cleanup is a working-tree change, fully reversible until committed. - *Trust the script.* It's fast and idempotent; if there's nothing to do, it reports zero and exits clean. No pre-checks. diff --git a/claude-templates/.ai/workflows/code-quality.org b/claude-templates/.ai/workflows/code-quality.org new file mode 100644 index 0000000..2406f4c --- /dev/null +++ b/claude-templates/.ai/workflows/code-quality.org @@ -0,0 +1,83 @@ +#+TITLE: Code-Quality Sweep Workflow +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-06-28 + +* Overview + +One trigger that runs every behavior-preserving quality pass over a scope of +*existing* code, in order, then surfaces what got filed for later. It's a thin +orchestrator — each pass keeps its own discipline and its own confirm gate; this +workflow only sequences them and collects the residue. + +The passes it chains: + +1. =/refactor= — structural and logic cleanup on measurable metrics (complexity, + duplication, dead-code) plus the simplification lens. +2. =readability-audit= ([[file:readability-audit.org][readability-audit.org]]) — prose and human-reader clarity + (comments, file headers, names, organization). + +It deliberately does *not* run =/simplify=: that works the current uncommitted +diff, not existing committed code, so it belongs to the moment you've just made a +change, not to a sweep of code already in the tree (see "The /simplify boundary" +below). + +* When to Use This Workflow + +- "code quality sweep" / "quality sweep" +- "run every quality pass on <scope>" / "full quality pass on <scope>" +- "give me every pass on <file/module/tree>" + +Do NOT use it for: +- *In-flight diff cleanup* — that's =/simplify= on the change you just made. +- *Bug hunting* — these passes are behavior-preserving; for defects use =debug= + or =/review-code=. +- *Performing the structural refactors it files* — those become =:refactor:= + tasks; work them later via =/refactor rename= / =/refactor simplification= or + =/start-work=. + +* Steps + +** 1. Scope + +Pick the target: one file, a named module set, or the whole tree (honor +=.aiignore=). The same scope is passed to both passes so they cover the same +code. + +** 2. /refactor <scope> + +Run =/refactor= on the scope. Its default full scan covers complexity, +duplication, dead-code, and simplification. It presents findings and applies +only what's approved (its own gate) — structure and logic first, so the +readability pass audits the cleaned-up code. + +** 3. readability-audit on <scope> + +Run the readability-audit workflow on the same scope. Its cheap comment- and +name-only fixes (dimensions A/B/C) land inline and are verified by a green +suite; its structural findings (dimension D — split a module, rename a public +symbol) are *filed* as =:refactor:= tasks rather than done here. + +** 4. Surface the residue + +Collect and report what the sweep left behind for later work: + +- The =:refactor:= tasks readability-audit filed (the structural backlog). +- Any =/refactor= findings deferred rather than applied in step 2. + +That residue is the "do this next" list the sweep produces; it's not a failure +to finish, it's the structural work that needs its own design and test pass. + +* The /simplify boundary + +=/simplify= and this sweep don't overlap: =/simplify= cleans the *current diff* +and applies its fixes directly, so reach for it right after making a change, +before committing. This sweep works *existing committed code* and runs the +scan-and-present passes. One trigger can't sensibly do both — a diff you're +holding and a tree you're auditing are different inputs. + +* Verification + +Each pass owns its verification (=/refactor= runs the suite after applying; +readability-audit verifies inline fixes against a green suite). The umbrella +adds nothing beyond sequencing, so when both passes report green, the sweep is +clean — confirm that before reporting done rather than assuming it. diff --git a/claude-templates/.ai/workflows/inbox.org b/claude-templates/.ai/workflows/inbox.org index c442d17..acfd11d 100644 --- a/claude-templates/.ai/workflows/inbox.org +++ b/claude-templates/.ai/workflows/inbox.org @@ -114,6 +114,16 @@ The item extends a task already filed. Update the parent TODO's body with a date ** File as TODO Substantive but waits, or needs design/triage before implementation. Add the TODO under =* <Project> Open Work= with priority + tags per the priority-scheme check (core §6). Body summarizes the proposal and links the inbox content if it's been moved to =docs/design/=. Delete the inbox file (or move it to =docs/design/= first if the content survives). +*Route-candidate marking (feeds the wrap-up router).* After filing, check whether the keeper's inferred home is a different project: + +#+begin_src bash +python3 .ai/scripts/route_recommend.py --item "<the keeper's heading + body text>" --exclude "$(basename "$PWD")" +#+end_src + +On a =<destination>\tstrong= or =<destination>\tweak= result, stamp the new TODO's property drawer with =:ROUTE_CANDIDATE: <destination>= (create the drawer if the task has none). A =none= result stamps nothing, and a local keeper stays unstamped. The marker is the wrap-up router's entire candidate set — =wrap-it-up.org= Step 3 surfaces exactly the =:ROUTE_CANDIDATE:=-tagged tasks and offers to deliver each to its destination's inbox, never scanning the standing backlog. Stamping is cheap and reversible (the router's skip leaves the task in place; a wrong marker is one property line to delete), so prefer stamping on any plausible match — the human reviews the batch at wrap time. + +*Blocking-dependency handoff.* A special shape: another project sends a note that *this* project's work is blocking one of theirs ("your task X is blocked on us — we need Y"). File or link the owning task, tag it =:blocker:=, and name the requesting project in the body (see the cross-project dependency convention in =todo-format.md=). The =:blocker:= tag makes =open-tasks.org= surface that task *first*, since clearing it unblocks the other project. Dedup against an existing task rather than filing a duplicate. When the work later lands, drop =:blocker:= and notify the waiting project (=inbox-send <their-project> --text "Delivered: <what> — you're unblocked."=) so it can lift its own =:blocked:=. + ** Defer Rename in place to =inbox/PROCESSED-<original-filename>= and add a brief comment line at the top: =# Deferred YYYY-MM-DD: <condition>=. Don't accumulate deferred items indefinitely — sweep them on a future process pass when the condition is met or the deferral has aged out. @@ -251,7 +261,7 @@ The gap it closes: handoffs that arrive mid-session used to sit unseen until the Never begin monitoring on a dirty worktree or a failing test suite. A dirty tree means the auto-commit at the end of an executed item sweeps up unrelated changes; a red suite means you can't tell whether the monitor broke something. At the start: -1. =git status --porcelain= is empty (clean worktree). +1. =git status --porcelain --untracked-files=no= is empty (no tracked modifications). Untracked and gitignored files never block — an inbox drop is exactly what this mode processes, and a scratch file is none of its business (the template-freshness policy in =startup.org= Phase A.0). The tracked-only gate is safe because the per-item commit stages its files explicitly (=commits.md=: only intended changes staged) — never =git add -A=, which would sweep untracked files and is the failure this gate guards against. 2. A full test run is all green (=make test= here, or the project's full-suite command). If *dirty*: offer to commit the pending changes in discrete, logical batches before starting. If *red*: offer to investigate the failures first. Surface the blocker with inline numbered options per =interaction.md= and wait — monitoring does not start until the tree is clean and the suite is green. @@ -336,7 +346,7 @@ Close the loop per the reply-to-sender discipline (core §4): confirm what lande End the way it started: clean worktree, green suite. Before stopping the loop or reporting the pass done: -1. Commit or revert everything left in the worktree — nothing uncommitted remains. +1. Commit or revert every tracked modification left in the worktree — no tracked change remains uncommitted. Untracked files (unprocessed inbox drops, scratch) are not the monitor's to sweep. 2. Run the full test suite once more and confirm all green. If either can't be satisfied — a half-done item, a failure introduced during the pass — surface it rather than leaving it. The next monitor run assumes a clean, green starting state (the Preconditions gate). @@ -451,18 +461,19 @@ Take these up when the single-destination version is in use and the multi-projec * Mode: auto inbox zero -A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens. +A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports what it found and filed. ** Per cycle 1. Run roam mode's scan (Phase A local check + Phase B roam scan), read-only — no =git pull=. The capture-guard still gates any write: use =capture-guard --wait= (core §5) so a transient capture clears itself; if it's still open after the wait, *defer this cycle's roam reconcile to the next cycle* rather than surfacing — the loop cadence is the retry, and the filed items get swept next time. The rare write hands its git to =roam-sync= (roam Phase D). 2. *Nothing found* → no inbox summary. One acknowledgement line: =ran at HH:MM, nothing found=. Nothing else. The acknowledge-only-on-empty rule keeps a quiet inbox quiet. 3. *Items found* → summarize the found items, file them as tasks (roam Phase C), and *append them to a displayed queue* — the harness task list, via =TaskCreate= — so the queue accumulates across cycles. Then ask: "run this batch next?" - - *Yes* → launch into implementing the found items, each through the normal disposition ladder (core §3) + verify flow. + - *Yes* → chain into =work-the-backlog.org= as an explicit second step after routing completes: pass it the eligibility query over the queued items (status =TODO= + =:solo:= per the scheme header, priority-ordered), =file-only= mode, paging off, cap 1. The highest-priority eligible candidate runs; the rest wait for the next tick or a later yes. - *No* → they stay queued for a later go. + This mode never implements anything itself — routing ends here, and the execution loop lives in =work-the-backlog.org=, its one home. 4. *Cross-cycle dedup.* Subsequent cycles add only *newly-found* items to the same displayed queue, never re-surfacing what's already there. Dedup against the queue (the =TaskCreate= list), not against what's already been implemented — a find that was queued-but-not-yet-run must not reappear, and one already filed into =todo.org= is dropped by roam Phase C's status check. -A find is always surfaced and gated on Craig's yes; a quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its execute step waits for a yes. +A find is always surfaced and filed; execution happens only through the =work-the-backlog.org= chain and waits for Craig's yes. A quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its chain step waits for that yes. ** Fully-unattended pass (=/schedule=) — vNext, not v1 diff --git a/claude-templates/.ai/workflows/no-approvals.org b/claude-templates/.ai/workflows/no-approvals.org index 1efce82..9e1c894 100644 --- a/claude-templates/.ai/workflows/no-approvals.org +++ b/claude-templates/.ai/workflows/no-approvals.org @@ -22,6 +22,8 @@ Craig activates the mode with any of: - Queuing several tasks in =todo.org= followed by any phrase above - Any equivalent phrasing that signals he doesn't want to be re-asked between items +*Not this mode:* any phrase containing "speedrun" ("speedrun", "no approvals speedrun") routes to =work-the-backlog.org='s no-approvals speedrun preset — an autonomous batch over an explicit ordered task set, with a pre-flight Q&A, autonomous commits, always-push, and an end-of-set page. This mode is the general interaction-gate suspension for whatever work is already underway; the speedrun is the dedicated backlog-batch workflow. + Mode resets when: - Craig says approvals are back on diff --git a/claude-templates/.ai/workflows/open-tasks.org b/claude-templates/.ai/workflows/open-tasks.org index fe782d6..02a0847 100644 --- a/claude-templates/.ai/workflows/open-tasks.org +++ b/claude-templates/.ai/workflows/open-tasks.org @@ -23,15 +23,16 @@ Don't route "task review" / "review tasks" here — those trigger the hygiene ha * Phase A: Data Gathering (both modes) -** Phase A pre-step — archive any freshly-DONE tasks +** Phase A pre-step — normalize freshly-closed tasks -Before reading =todo.org=, run the cleanup script's archive-done sweep so completed level-2 subtrees move from =* $Project Open Work= to =* $Project Resolved=: +Before reading =todo.org=, run two cleanup sweeps so the read reflects current state. First convert any done sub-tasks to dated entries, then archive completed level-2 subtrees from =* $Project Open Work= to =* $Project Resolved=: #+begin_src bash +emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org #+end_src -Costs a few hundred milliseconds. Without it, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The sweep makes Phase A's read of =todo.org= reflect current state. +Costs a few hundred milliseconds. Without the archive sweep, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The convert sweep runs first so a completed parent's sub-tasks are already dated when it archives; it also keeps interactive level-3 closes from lingering as DONE keywords. Together they make Phase A's read of =todo.org= reflect current state. Skip the sweep if the workflow is invoked in an explicit read-only or dry-run context. Default is to run it. @@ -176,6 +177,10 @@ Next Mode answers two questions in one output: "what matters most right now?" (t Apply the prioritization cascade in order. Stop at the first matching step. This is the importance/urgency answer. +*Exclude blocked tasks.* A task tagged =:blocked:= has an unmet cross-project dependency (its body names the project and the work owed, per =todo-format.md=). It can't be worked until that other project delivers, so it is *never* the cascade recommendation — skip it at every cascade step below. Blocked tasks are surfaced on their own in Step 3 so the stalled dependency stays visible instead of silently dropping out of view. + +*Surface blocking tasks first.* The mirror of the above: a task tagged =:blocker:= is holding up work in *another* project (its body names which project and what's owed, per =todo-format.md=). Clearing it unblocks that project, so it carries borrowed urgency — surface it at the *top* of the cascade recommendation regardless of its own priority cookie, ahead of the normal In-Progress / deadline / priority order. When several =:blocker:= tasks exist, lead with the one blocking the most, or the longest. This is the "do the thing that unblocks someone else first" rule; a =:blocker:= task left at its own low priority is exactly how a cross-project dependency stalls. + **** 1. In-Progress Tasks - Look for tasks marked =DOING= or partially complete. - *If found:* Recommend that task (always finish what's started). @@ -228,11 +233,22 @@ Within each row, pick a single task per the same-level tie-breakers above (block The friction filter is the override path. When the cascade winner is partially blocked, hardware-dependent, or simply too large for the user's current state, one of the friction rows is what they pick instead. +*** Step 3 — Blocked-on-other-projects surface + +Independently of the cascade and the friction filter, collect every open task tagged =:blocked:=. These are tasks this project can't advance until another project delivers; surfacing them keeps a cross-project dependency from rotting at low priority on the other side — the exact failure the tag exists to prevent (a blocked task whose blocker is a =[#D]= in another project sits forever otherwise). + +For each blocked task, read its body for the blocking project and what's owed, and present one line: the task, the blocking project, and what that project owes. Then offer — per blocked task — to nudge the blocker: an =inbox-send <project> --text= note naming what's needed and why it's blocking, so the dependency gets attention in the project that owns it. Don't send without the user's go. + +If no =:blocked:= tasks exist, omit this surface entirely (the common case). + *** Output Format -Pair the cascade recommendation with the friction block beneath it. Recommendation-at-item-1 convention applies to the friction rows — quick+solo first, since it's the strongest low-friction pick. +Pair the cascade recommendation with the friction block beneath it, and the blocked-on-other-projects surface (Step 3) beneath that when any blocked task exists. Recommendation-at-item-1 convention applies to the friction rows — quick+solo first, since it's the strongest low-friction pick. #+begin_example +Unblocks other projects (do these first): +- ai-term wrap-teardown companion — :blocker:, unblocks rulesets (the three ai-term functions) + Cascade recommendation (importance/urgency): - Fix org-noter reliability — [#A], Method 1, 8/18 complete, blocks daily reading/annotation @@ -240,17 +256,25 @@ If you want lower friction instead: 1. Quick + solo: Bump linter config — [#C] :quick:solo:, ~15 min 2. Quick: Confirm new dirvish setup — [#B] :quick:, needs your eye 3. Solo: Refactor config-utilities — [#B] :solo:, bounded but multi-hour + +Blocked on other projects (can't advance until the blocker delivers): +- Wrap-teardown feature — blocked by emacsd: ai-term companion functions — nudge? #+end_example +The =:blocker:= surface sits at the very top — clearing one of those is the highest-leverage thing on the list, since it frees work in another project. Omit it when no =:blocker:= task exists (the common case). + Include for each row: - Task name / description. - Priority + tag cluster. - One-line reasoning. For the cascade row, name which cascade step matched. For friction rows, an effort hint when one is obvious. - Progress indicator (for V2MOM-structured todos) on the cascade row only. +- For a =:blocker:= row: the project it unblocks and what's owed (from the task body). +- For a blocked row: the blocking project and what it owes (from the task body), plus the nudge offer. **** Edge cases - *Empty friction block.* If no =:quick:= or =:solo:= tagged tasks exist in the open set, omit the friction block entirely. Present only the cascade recommendation. +- *No =:blocker:= tasks.* Omit the "Unblocks other projects" surface entirely (the common case) — show it only when a task carries the =:blocker:= tag. - *Dedupe.* If the cascade recommendation IS the same task as one of the friction rows (e.g. it's =:quick:solo:= and also won the cascade), show it once at the top with both labels. Don't list it twice. - *Decline behavior.* If the user declines the cascade recommendation, drop straight to the friction block as the natural next prompt. Do not fall through to lower-cascade-tier tasks; the friction filter IS the override. diff --git a/claude-templates/.ai/workflows/page-me.org b/claude-templates/.ai/workflows/page-me.org index 607ed51..8069830 100644 --- a/claude-templates/.ai/workflows/page-me.org +++ b/claude-templates/.ai/workflows/page-me.org @@ -5,9 +5,9 @@ * Overview -This workflow enables Claude to set timers and alarms that reliably notify Craig, even if the terminal session ends or is accidentally closed. Notifications are distinctive (audible + visual with alarm icon) and persist until manually dismissed. +This workflow enables Claude to set timers and alarms that reliably notify Craig, even if the terminal session ends or is accidentally closed. Notifications are distinctive (audible + visual with the blue info icon) and persist until manually dismissed. -Uses the =notify= command (alarm type) for consistent notifications across all AI workflows. +Uses the =notify= command (info type) for consistent notifications across all AI workflows. Info-level on purpose: the earlier alarm styling read as all-red urgency, and Craig's verdict was that a page "should be a persistent info notification" — noticeable, never crash-scary (2026-07-02). * Trigger Phrase @@ -63,8 +63,8 @@ Craig tells Claude when and why: Claude schedules the alarm using the =at= daemon with =notify=: #+begin_src bash -echo "notify alarm 'Page' 'Time to call the dentist' --persist" | at 3:30pm -echo "notify alarm 'Page' 'Meeting starts' --persist" | at now + 45 minutes +echo "notify info 'Page' 'Time to call the dentist' --persist" | at 3:30pm +echo "notify info 'Page' 'Meeting starts' --persist" | at now + 45 minutes #+end_src The =at= daemon: @@ -89,26 +89,26 @@ Craig dismisses the notification and acts on it. ** Setting Alarms -Use the =at= daemon to schedule a =notify alarm= command: +Use the =at= daemon to schedule a =notify info= command: #+begin_src bash # Schedule for specific time -echo "notify alarm 'Page' 'Meeting starts' --persist" | at 3:30pm +echo "notify info 'Page' 'Meeting starts' --persist" | at 3:30pm # Schedule for relative time -echo "notify alarm 'Page' 'Check the build' --persist" | at now + 30 minutes +echo "notify info 'Page' 'Check the build' --persist" | at now + 30 minutes # Schedule for tomorrow -echo "notify alarm 'Page' 'Call the dentist' --persist" | at 3:30pm tomorrow +echo "notify info 'Page' 'Call the dentist' --persist" | at 3:30pm tomorrow #+end_src ** Notification System -Uses the =notify= command with the =alarm= type. The =notify= command provides 8 notification types with matching icons and sounds. +Uses the =notify= command with the =info= type. The =notify= command provides 8 notification types with matching icons and sounds. #+begin_src bash -# Immediate alarm notification (for testing) -notify alarm "Page" "Your message here" --persist +# Immediate page notification (for testing) +notify info "Page" "Your message here" --persist #+end_src The =--persist= flag keeps the notification on screen until manually dismissed. All page-me notifications should use =--persist= by default. @@ -139,10 +139,10 @@ The alarm must fire. Use the =at= daemon which is designed for exactly this purp Simple invocation - Claude runs one command. No complex setup required per alarm. ** Fail Audibly -If the alarm fails to schedule, report the error clearly. Don't fail silently. +If the page fails to schedule, report the error clearly. Don't fail silently. ** Testable -The =notify alarm= command can be called directly to verify notifications work without waiting for a timer. +The =notify info= command can be called directly to verify notifications work without waiting for a timer. ** Non-Alarming Use normal urgency, not critical. The notification should be noticeable but not imply something has gone horribly wrong. diff --git a/claude-templates/.ai/workflows/readability-audit.org b/claude-templates/.ai/workflows/readability-audit.org new file mode 100644 index 0000000..8223a03 --- /dev/null +++ b/claude-templates/.ai/workflows/readability-audit.org @@ -0,0 +1,242 @@ +#+TITLE: Readability Audit Workflow +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-06-28 + +* Overview + +A pass over one file, a set of modules, or the whole tree that makes the code +*readable to a future maintainer*. It checks four things and fixes the cheap +ones in place: the file-top commentary, the inline comments, the names, and the +physical organization of the code. Structural changes that need a real refactor +(splitting a module, renaming a public symbol) are not done here — they are +filed as =:refactor:= tasks so they get their own design and test pass. + +This is language-agnostic. Where a step names a language-specific tool or +convention, it's stated as "the project's <X>, if it has one" — read the +project's =CLAUDE.md= / =notes.org= and the language bundle to resolve the +concrete tool. + +* Where it sits among the code-quality tools + +These tools are a pipeline, not duplicates. Knowing which to reach for: + +- *readability-audit* (this workflow) — prose and human-reader clarity: + comments, file headers, names, and physical organization. Judgment-driven + (does this comment lie? does this name reveal intent? can a newcomer place + this file in a minute?). +- =/refactor= — structure on measurable metrics: complexity, duplication, + dead-code, the =simplification= lens (behavior-preserving logic/size + reduction), and =rename= (executes a codebase-wide symbol rename). +- =/simplify= — behavior-preserving cleanup of the current diff, applied + directly. + +The link that keeps them from overlapping: when this audit finds a structural +problem too big for a comment/name fix — a module to split, a *public* symbol to +rename across call sites — it *files* a =:refactor:= task rather than doing it +here. =/refactor= (rename, simplification) or =/start-work= then executes that +filed task with a proper design and test plan. Readability finds and files; +=/refactor= transforms. + +* Problem We're Solving + +Source files drift toward two opposite failure modes, and both hurt the next +person to open the file: + +- *Documentation rot and noise.* Headers carry stale user-manual content + (quick-starts, full option matrices, setup walkthroughs) that belongs in user + docs; comments restate what the next line already says; comments go out of + date and start lying; placeholder =TODO=/=FIXME= stubs and conversational + asides accumulate. A blank summary or a missing file-top description leaves a + reader with no map. +- *Structural fog.* Names that don't reveal intent force the reader to decode + them; related functions scatter; a public entry point sits far from the + private helpers it calls; a file grows to hold several unrelated + responsibilities. + +Left alone, opening a file costs more every month. The fix is a repeatable audit +with a clear, checkable standard, run on demand or as files are touched. + +* Exit Criteria + +For the audited scope: + +1. *Every file has an accurate top section* that states what the file does and + how it fits the rest of the codebase — terse, no user-manual content, and + carrying the project's file-header convention where it has one. +2. *Every surviving comment earns its place* — it explains a *why* the code + can't (a constraint, a workaround and its reason, an ordering dependency, a + warning), it is accurate against the current code, and it is terse. Obvious + "describe the next line" comments are gone. +3. *Names reveal intent* — no cryptic abbreviations; the project's + public/private visibility convention is applied consistently. +4. *Related code is co-located* — a public function's private helpers sit right + after it; the file reads top-to-bottom by descending abstraction; sections + group what belongs together. +5. *Structural problems too big to fix in a comment pass are filed* as + =:refactor:= tasks, not left as a vague note and not half-done inline. +6. *Nothing broke* — the build is clean and the test suite is green + (comment/name edits are behavior-preserving, so this should always hold; it + is the proof, not a hope). See "Graceful degradation" for projects without a + suite. + +* When to Use This Workflow + +- "Let's run the readability-audit workflow." +- "Audit the comments and commentary in <file/area>." +- "Clean up the structure/organization of <module>." +- After landing a feature, on the files it touched, before moving on. +- On a single file you just found hard to read. +- As a tree-wide sweep: inventory all the source files, audit each, batch the + fixes. + +Do NOT use this to *perform* the structural refactors themselves (use +=/refactor= or =/start-work= against a filed task) or to hunt for bugs / +complexity / duplication (that is =/refactor=, not a readability pass). + +* Approach: How We Work Together + +** Phase 1 — Scope and inventory + +Pick the target: one file, a named module set, or the whole tree. For a sweep, +list the source files (honor =.aiignore=) and decide coverage. Lean on the +language's own doc linters as a first filter where they exist — many flag a +missing or blank file summary and malformed headers; run the project's lint +target first. + +** Phase 2 — Audit each file against the four dimensions + +Record findings as =file:line — issue — proposed fix=. The four dimensions: + +*** A. File-top commentary (the map) + +- Present, and *accurate* against what the file now does. +- States purpose, the file's role/architecture, and key entry points — + *tersely*. A reader should learn what this is and how it connects in a few + lines. +- Carries the project's file-header convention where it has one (a metadata + block, a module docstring, a standard header comment). If the project has no + header convention, skip this sub-check — don't invent one. +- Does *not* carry user-manual content — quick-starts, full option matrices, + step-by-step setup. That belongs in user docs; move it, don't keep it in the + source header. +- Mechanics are correct for the language: a filled summary line (not blank), the + expected section markers, the expected footer. + +*** B. Inline comments (why, not what) + +- Explains a *why* the code cannot: a workaround *and its reason*, an ordering + or load dependency, business-logic rationale, a real warning ("do not reorder + these — deadlock"). +- Is *accurate* — matches the current code. A wrong comment is worse than none; + fix or delete on sight. +- Is *terse and useful*. Delete the obvious "describe the next line" comment + unless it names a non-obvious constraint. Replace a stale placeholder or a + rambling aside with the real one-line reason, or remove it. +- Convert a comment that's only restating the code into a better *name* instead + (see C). + +*** C. Names (carry the what/how so comments don't have to) + +- Intention-revealing variable and function names; no cryptic single letters or + abbreviations outside tight local scopes. +- The project's public/private convention is applied consistently and correctly: + a helper only called within the file is private; a user-facing or + intentionally-reusable symbol is public. (Resolve the concrete convention from + the language and the project — a naming prefix, an export list, an + access modifier.) +- When a comment exists only to explain a name, rename instead. + +*** D. Organization (co-location and ordering) + +- Related functions sit together. A public function's private helpers come + *right after* it (stepdown / proximity / "reads like a newspaper"). +- The file reads top-to-bottom by descending abstraction. +- Sections group what belongs together. +- *Cohesion check:* if the file holds several unrelated responsibilities, or has + grown large enough that the top no longer describes one coherent thing, flag a + split into layered owners — but see Phase 4: that's a filed refactor, not an + inline fix. + +** Phase 3 — Apply the cheap, safe fixes inline + +Dimensions A, B, and C are *comment- and name-only* and *solo* (no design or +preference call): apply them directly. After each file (or a batch), verify with +the project's gates: parse/syntax check, a clean build (no new warnings), and a +green test suite. Comment/name edits can't change behavior, so green is the proof +the edit was clean, not a behavior check. + +For a tree-wide sweep, drive the uniform rewrites mechanically and verify the +whole batch at once: a *mechanical applier with a boundary assertion* that +replaces a well-defined header span is reliable and fast, then one suite run +covers the batch. Keep the varied cases (header-line fixes, summary fixes that +must preserve surrounding metadata, inline-comment surgery, generated-file +headers) as careful per-file edits. (The boundary markers are language-specific; +the principle — mechanical applier + assert + one suite run for uniform +rewrites, per-file judgment for varied cases — is not.) + +** Phase 4 — File the structural refactors, don't do them here + +Dimension D's bigger findings — split a module, rename a *public* symbol across +call sites, move a function to a different file — are real refactors with their +own risk and test surface. Do *not* slip them into a readability pass. File each +as a =:refactor:= task in =todo.org= with the specific finding, so it gets +=/refactor= or =/start-work= with a proper design and test plan. This is the +line between the cheap clarity win and the structural change; keeping it sharp is +what lets the audit stay safe and fast. + +** Phase 5 — Verify and commit in logical batches + +Full suite green, build clean. Commit the doc/comment changes as =docs:= (or +=refactor:= where a header/structure normalized) in cohesive batches — one +commit per coherent slice (a set of condensed commentaries, the +generated-file-header fixes, the obvious-comment prune), not one mega-commit and +not one-per-file. Generated files are fixed *in their generator* and then +regenerated, so the next regen stays compliant. + +* Graceful degradation + +The audit adapts to what the project provides: + +- *No file-header convention* → skip dimension A's metadata sub-check; still + check the summary/description for accuracy and terseness. +- *No test suite* → the green-suite proof in Phases 3 and 5 is unavailable. Fall + back to the strongest gate the project has (compile/byte-compile, parse check, + linters) and *flag the weaker proof as a known limit* — a behavior-preserving + edit is lower-risk, but say plainly that there's no suite to confirm it. +- *No doc linter* → do the Phase 1 first-filter by reading instead; the audit + still runs, just without the cheap pre-pass. + +* Principles to Follow + +- *Comments explain why; code explains what.* If a comment restates the code, + delete it or turn it into a better name. +- *Accuracy beats completeness.* A wrong or stale comment is worse than no + comment. When in doubt, delete. +- *Terse and useful.* Every comment and every header line earns its place. The + source header is not the user manual — move manuals to user docs. +- *Readable means the next person, fast.* The test of the top-section and the + organization is whether a maintainer who has never seen the file can place it + and navigate it in under a minute. +- *Keep the cheap pass cheap.* Comment/name fixes are solo and land inline. + Structural splits and public renames are not — they get filed, designed, and + tested separately. +- *Preserve legal and attribution headers verbatim.* Vendored / GPL / copyright + notices are never condensed away by a readability pass. +- *Manual validation is still Craig's.* Solo means no input is needed to *do* + the work; visual/behavior confirmation afterward is expected where relevant. + +* Living Document + +Update this with what real runs teach. Lessons worth keeping as the standard +sharpens: + +- *Interpretation default for "fix blank summary":* when a rewrite shows only a + header + summary and omits a metadata block the file already has, keep the + existing metadata and replace only the header line and the summary. Its + absence from the rewrite means "leave it," not "delete it." +- *Generated files:* fix the *generator*, then regenerate. Editing the generated + file directly is reverted on the next regen. +- *Vendored files:* preserve the copyright/attribution; do not auto-condense a + licensed header. +- *Mechanical applier + assert + one suite run* is the safe way to do a + many-file uniform rewrite; per-file judgment is for the varied cases. diff --git a/claude-templates/.ai/workflows/spec-create.org b/claude-templates/.ai/workflows/spec-create.org index 508b969..1249181 100644 --- a/claude-templates/.ai/workflows/spec-create.org +++ b/claude-templates/.ai/workflows/spec-create.org @@ -82,8 +82,9 @@ This is where the spec earns a "Ready" from review: an engineer must be able to ** Phase 5 — Wire it up (conventions) -- *Filename + location:* =docs/<problem-slug>-spec.org=. Org-mode. The slug names the *problem/feature*, not a date. Must end in =-spec.org=. -- *Metadata header:* a small table at the top — Status, Owner, Reviewer(s), Date, Related (link to the task/ticket). +- *Filename + location:* =docs/specs/YYYY-MM-DD-<problem-slug>-spec.org= — formal specs live in =docs/specs/=, never =docs/design/= (that's for notes, brainstorms, inventories; see =claude-rules/docs-lifecycle.md=). Org-mode. The slug names the *problem/feature*; no status suffixes ever — status lives in the file. Must end in =-spec.org=. +- *Status heading (first element after the file header):* a top-level heading carrying the lifecycle keyword, stamped =DRAFT= at authoring — spec-create owns this flip. It holds an =:ID:= UUID (generate with =uuidgen=) and dated history lines, newest first. The keyword is authoritative; the Metadata =Status= field mirrors it in lowercase. Transitions are three lines in one file (keyword + history line + mirror): spec-review flips =READY=, spec-response flips =DOING= at decomposition, the final build task flips =IMPLEMENTED=. Terminal states always record a reason. +- *Metadata header:* a small table at the top — Status (the lowercase mirror), Owner, Reviewer(s), Date, Related (link to the task/ticket). - *Review-and-iteration-history stub:* add a =Review and iteration history= section at the bottom and seed it with the author's first entry. =spec-review= and =spec-response= append provenance entries here, so the heading shape is a contract: =YYYY-MM-DD Day @ HH:MM:SS -ZZZZ — Contributor — Role=, body fields What / Why / Artifacts. - *Cross-link both ways:* the spec links its task; the task links the spec (replace the task's inline plan with a terse description + a =file:= link to the spec). @@ -103,7 +104,14 @@ Then it's ready for =spec-review.org=. Snapshot-vs-living rule: keep the spec li ,#+TITLE: <Feature> — Spec ,#+AUTHOR: <author> ,#+DATE: <YYYY-MM-DD> -,#+TODO: TODO | DONE SUPERSEDED CANCELLED +,#+TODO: TODO | DONE +,#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED + +,* DRAFT <spec short name> +:PROPERTIES: +:ID: <uuid — generate with uuidgen> +:END: +- <YYYY-MM-DD Day @ HH:MM:SS -ZZZZ> — drafted. ,* Metadata | Status | draft | diff --git a/claude-templates/.ai/workflows/spec-response.org b/claude-templates/.ai/workflows/spec-response.org index de5b1c8..7628e49 100644 --- a/claude-templates/.ai/workflows/spec-response.org +++ b/claude-templates/.ai/workflows/spec-response.org @@ -130,9 +130,11 @@ When related specs were reviewed together, two reviews can recommend opposite th This is the *last* step of the workflow, and it runs *only after the author confirms the spec is Ready* — never during review iterations. A Ready spec nobody can act on is unfinished; this phase turns it into tracked work. It applies to every project type (library, application, service, docs set). -1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it. +*This phase owns the =READY= → =DOING= lifecycle flip* (docs-lifecycle convention): when the decomposition below lands, update the spec's top-level status heading keyword to =DOING=, add a dated history line, and set the Metadata =Status= mirror to =doing= — three lines, one file. -2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks. +1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it. *Stamp the binding:* the parent task's =:PROPERTIES:= drawer gets a =:SPEC_ID:= line holding the spec's status-heading UUID. That property is the durable join task-audit uses to police =DOING= specs (a =DOING= spec whose bound parent is closed, archived, or missing gets flagged). + +2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks. *Always end the set with the flip task:* a final "flip the spec to IMPLEMENTED (+ dated history line + mirror)" task under the same parent — the tracked obligation that closes the lifecycle loop when the build finishes. Never skip it; "a human remembers" is the failure mode this exists to prevent. 3. *Turn a critical eye on completeness.* Re-read the spec — every phase, every acceptance criterion, every named deliverable, every data-safety/principle rule — and confirm each has a home in a task. The work is not done when the tasks merely exist; it is done when nothing in the spec is left untracked. This completeness pass is mandatory regardless of project type. diff --git a/claude-templates/.ai/workflows/spec-review.org b/claude-templates/.ai/workflows/spec-review.org index 833dfc9..d4998eb 100644 --- a/claude-templates/.ai/workflows/spec-review.org +++ b/claude-templates/.ai/workflows/spec-review.org @@ -50,6 +50,11 @@ Run it *early* — design review exists to catch viability problems and costly m Before Phase 1, verify the file under review ends with =-spec.org=. Every design, decision, or planning document under a project's =docs/= directory carries that suffix as its identifier. The =.org= extension alone is not enough because =docs/= holds non-spec org files too (tutorials, frozen inventories, reference material). +*Location expectation (docs-lifecycle convention).* Formal specs live in =docs/specs/=. Whether that's enforced depends on whether the project has run its one-time =spec-sort= retrofit: + +- =:LAST_SPEC_SORT:= present in =.ai/notes.org= Workflow State → the project has sorted; a =-spec.org= file outside =docs/specs/= fails this precondition. Surface it: "this spec sits outside docs/specs/ — move it (and update inbound links) before review." +- Marker absent → legacy locations (=docs/= root, =docs/design/=) stay reviewable; add one nudge line to the review output ("this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it") and proceed. No legacy spec is ever unreviewable during the transition. + If the file does not end with =-spec.org=, stop immediately and surface the mismatch: #+begin_example @@ -128,6 +133,7 @@ Work the spec against these. Each is a source of concrete findings, not a box to - *Performance & scale.* Expected counts (issues/comments/labels/teams/projects/views)? Server-side filtering where possible? Bounded, visible pagination? Cached name→ID lookups? Sync calls in the command path acceptable? Could a save hook or whole-file scan make N network calls? Rendering linear? Full-file rewrites avoided? Long-running operations async/cancellable/observable? Is concurrency/queueing/backpressure defined? Are high-output process filters throttled and cheap? Is progress/ETA exposed only when defensible, and are hung/stalled operations detectable and killable? Identify UI freezes, repeated network calls, unbounded pagination — without premature optimization. - *Security & privacy.* API keys safe? Debug logs leaking secrets or private issue text? Confirmations before mutating shared workspace objects? Personal vs shared distinguished? Local files holding sensitive descriptions/comments? Anything to redact from messages/logs? Any work-tracker integration may handle private company data. - *UX & accessibility.* Discoverable commands? Recoverable mistakes? Prompts ordered to the task? Safe, useful defaults? Informative-not-noisy status messages? Does the UI avoid implying unsupported actions are supported? Match the upstream product's permissions/concepts? Are customizations named in user language, with clear defaults and docstrings? For Emacs packages, command names, completion candidates, buffer layout, defcustom names, and message wording *are* the UX. +- *Operational-panel UI traps.* Applies when the spec covers a user-facing panel, dialog, or control surface; skip otherwise. Lists that mix saved, current, and generated items must name each item's source. Refresh or scan actions must not gate data that could be shown immediately. Add-forms must not ask the user to retype values the system already discovered. Destructive confirmations read in future tense before the action and verified-result tense after it. Diagnostics, performance, logging, and repair affordances are reviewed as one coherent flow before extra pages or buttons are added. A popup launched from a bar, tray, or tool surface should visually belong to that launcher. (Promoted from archsetup's Waybar network-panel review, 2026-06-30.) - *Test strategy and coverage.* Characterization tests before behavior changes? Pure functions to unit-test? API responses needing fixtures? Command flows needing stubs? Regression tests for prior bugs? Boundary/error cases? What's covered elsewhere and shouldn't be re-tested? Which existing tests must change? How is coverage generated, summarized, and used to find untested/refactor-worthy code? Prefer tests that lock contracts: representation shape, query compilation, sync no-op, conflict refusal, pagination, dirty-buffer protection, log redaction, and long-running/slow-operation behavior via fakes rather than flaky live dependencies. - *Observability & operations.* How does a user see what the package is doing? Progress messages for long ops? Useful, safe debug logging? Are logs structured enough to isolate issues from a bug report? Are commands provided to inspect/clear caches, test connectivity, diagnose backends/tools, copy redacted debug info, or reproduce command invocations? How are terminal states discovered: completion, failure, partial success, stalled/hung, cancelled, cleanup-unverified, and "needs user action"? Does the product notify only when useful, avoid noisy success spam, and keep non-success states visible until acknowledged? For generated org files, headers should often carry source, filter/view name, refresh time, count, truncation state. - *Comparable-product sentiment.* When there are obvious adjacent products, research what users love and hate about them from official docs plus current community reports. Do not cargo-cult their feature set; translate findings into the spec's scope. For each loved behavior, say whether the spec provides it, intentionally omits it, or defers it. For each hated behavior, say whether the spec avoids, resolves, inherits, or accepts it. @@ -166,6 +172,8 @@ Assign one label consistently: The most useful reviews move a spec from =Not ready= to =Ready with caveats= or =Ready= once decisions are captured. +*The =Ready= verdict flips the spec's lifecycle status.* spec-review owns the =DRAFT= → =READY= transition (docs-lifecycle convention): on assigning =Ready= (or =Ready with caveats= the author accepts), update the spec's top-level status heading keyword to =READY=, add a dated history line under it naming the review that passed, and set the Metadata =Status= mirror to =ready= — three lines, one file. Any other rubric label leaves the keyword where it stands (a re-review that finds new blockers on a =READY= spec demotes it back to =DRAFT= the same three-line way, with the reason in the history line). + Finding severity maps to blocking power: *high-priority findings block =Ready=* — they hold the rubric at =Not ready= (or =Ready with caveats= if the author accepts and tracks them) until dispositioned; *medium-priority findings are the author's discretion* and don't block. State the blocking status on each finding so the author running spec-response knows which ones gate the rubric. Then update the spec's review history. Specs should carry a bottom section named =Review and iteration history= (or the nearest existing equivalent) that tracks each material author/reviewer pass. Add a concise entry for this review even when the spec is ready and no findings are recorded. diff --git a/claude-templates/.ai/workflows/startup.org b/claude-templates/.ai/workflows/startup.org index 5e8f61e..47a77c8 100644 --- a/claude-templates/.ai/workflows/startup.org +++ b/claude-templates/.ai/workflows/startup.org @@ -44,6 +44,8 @@ Behavior: - *Dirty working tree* → skip the pull. Don't auto-stash and don't auto-merge — those would either lose work or invite conflicts at the worst possible moment (session start). - *Non-fast-forward history* → =--ff-only= aborts with an error. Surface that to the user; the rsync still proceeds against the working tree as-is. +*Template-freshness policy (applies to every dirty-check in the synced workflows).* "Dirty" means *tracked modifications only*. Untracked and gitignored files — an inbox drop, a file left in the tree to read, scratch output — never block a template pull, a fast-forward, or a monitoring gate. Projects were falling behind on templates because somebody sent them a task; that's the failure this policy closes. The checks here already comply (=git diff --quiet HEAD= sees only tracked changes; the ff gate uses =--untracked-files=no=), and any dirty-check added to a synced workflow follows the same rule. One deliberate exception: the rsync WIP-guard below counts untracked files *within rulesets' own synced source paths*, because an untracked half-written template is exactly the WIP it exists to hold back — that guard is about rulesets' outbound content, not the consuming project's local state. + *** Install rulesets symlinks into ~/.claude (idempotent) A skill, rule, or bin script added to rulesets and pushed reaches each machine's *files* on the next pull, but not its =~/.claude= *symlink* — =make install= only links what isn't already linked, and =git pull= doesn't run it. So a newly-added skill stays silently uninstalled until someone re-runs =make install= by hand. The flush skill sat in that gap from 2026-06-02 until a manual install on 2026-06-05. Running =make install= here, right after the rulesets pull, closes it: "add a skill, commit, push" becomes enough for it to reach every machine on the next session. @@ -151,7 +153,7 @@ These calls have no dependencies on each other. Issue them all together in one m 8. =[ -f todo.org ] && .ai/scripts/task-review-staleness.sh todo.org 7 || true= — count top-level tasks overdue for review (the daily task-review habit's startup nudge). The =[ -f todo.org ]= guard skips projects without a root todo.org; =|| true= keeps Phase A from failing if the script isn't synced yet. Threshold 7 days is one review cycle of slack — softer than the wrap-up health check's 30-day alarm. 9. =bash ~/code/rulesets/scripts/sync-language-bundle.sh "$PWD" 2>/dev/null || true= — language-bundle freshness for the current project. Fingerprint-detects which bundle (if any) the project has, auto-fixes drifted rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), and surfaces drift in =settings.json= without writing it (a project may have customized it). =CLAUDE.md= is deliberately left untracked — it's seed-only in =install-lang= and project-owned afterward, mirroring how =diff-lang= skips it. Quiet when there's no bundle or everything's clean. Hardcodes the rulesets path because =languages/= is the canonical source and lives only there — the same absolute-path dependency the rsyncs already carry. =|| true= keeps Phase A from failing on older checkouts where the script isn't present yet. The =.ai/= rsyncs and this call write to disjoint paths (=.ai/= vs =.claude/=/=githooks/=), so the batch stays parallel-safe. 10. =[ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true= — count items in the roam global inbox (=~/org/roam/inbox.org=), the roam-mode startup nudge. Silent if the roam clone isn't on this machine. Phase C reads the file when the count is non-zero, splits total vs items related to this project, and surfaces the offer (see =inbox.org= roam mode). Read-only; never files at startup. -11. KB surface prep (the read + contribute startup nudges; see =docs/design/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute). +11. KB surface prep (the read + contribute startup nudges; see =docs/specs/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute). #+begin_src bash ra="$HOME/org/roam/agents" @@ -166,6 +168,25 @@ These calls have no dependencies on each other. Issue them all together in one m fi #+end_src +12. Spec-sort probe (the docs-lifecycle retrofit nudge; see the docs-lifecycle spec in =docs/specs/=). Read-only; prints one line when the project has an unsorted docs pile — a =docs/design/= directory or stray =docs/*-spec.org= root files — and no =:LAST_SPEC_SORT:= marker in =.ai/notes.org=. Silent for projects with nothing to sort or an already-stamped marker (the marker permanently clears it). + + #+begin_src bash + { [ -d docs/design ] || [ -n "$(find docs -maxdepth 1 -name '*-spec.org' -print -quit 2>/dev/null)" ]; } \ + && ! grep -qs ':LAST_SPEC_SORT:' .ai/notes.org \ + && echo "spec-sort: unsorted docs present" || true + #+end_src + + The stray-root check uses =find= rather than a glob so the probe behaves identically under bash and zsh (=compgen= is bash-only, and zsh aborts on an unmatched glob). + +13. Host-identity probe (see the host-identity rule in =claude-rules/=). Read-only; flags fixed machine-identity claims in the project's tracked/synced docs — the "This machine is ratio" trap, false on every machine but the one that wrote it. Silent when nothing matches. + + #+begin_src bash + grep -inE '\b(this|the current) (machine|host|box|laptop|workstation) is ' \ + CLAUDE.md .ai/notes.org 2>/dev/null | head -3 || true + #+end_src + + Fleet descriptions ("the fleet is ratio and velox") and runtime derivations ("run =uname -n= to find the hostname") don't match — only current-identity assertions do. Fixture-verified under bash and zsh. + Notes on the rsync commands: - Trailing slashes on both source and destination matter — they tell rsync to sync /contents/ rather than nest a directory inside. - =--delete= on the directory syncs lets retired template files actually disappear from each project on next startup. @@ -199,6 +220,8 @@ This phase touches the user and runs sequentially: - *Roam inbox nudge.* If the Phase A roam-inbox count is greater than zero, read =~/org/roam/inbox.org=, split total vs items related to this project (claimed by the =<project>:= prefix, plus any unprefixed item whose topic plainly concerns this project), and surface one line: "Roam inbox: =<N>= total, =<M>= appear related to this project — say 'inbox zero' to file them." Offer it as a priority option; never auto-file. If the count is zero or the file is absent, say nothing. See =inbox.org= roam mode. - *KB consult nudge (read side).* If the Phase A KB-surface prep returned any =kb-relevant-titles=, surface one line listing them (capped 5): "KB lessons that may be relevant: =<title>=; =<title>=… — open the node before related work." The titles are declarative, so the list alone tells you whether to open one. Gated on the roam clone; silent when the clone is absent or nothing relevant surfaced. See the best-practices node and =knowledge-base.md=. - *KB contribute nudge (write side).* Once per session, surface one line pointing at the best-practices node (the =kb-bestpractices= path from Phase A): "Learned something durable? See =<path>= for how to write a KB node — contributing cross-project facts is welcome (personal projects only; work/unknown projects never write per =knowledge-base.md=)." Light encouragement, never a gate. Gated on the roam clone; silent when absent. + - *Spec-sort nudge.* If the Phase A spec-sort probe printed =spec-sort: unsorted docs present=, surface one line: "this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it." If the probe was silent, say nothing. A project with nothing to sort never sees the line; a stamped =:LAST_SPEC_SORT:= marker permanently clears it. See the docs-lifecycle rule and the spec in =docs/specs/=. + - *Host-identity flag.* If the Phase A host-identity probe printed any match, surface it with the file:line and the fix: "this doc asserts a fixed machine identity — false on every other machine; replace with a runtime derivation (run =uname -n=), per the host-identity rule." The probe flags for judgment, never blocks. Silent when the probe is silent. - *Language-bundle sync.* If the Phase A step-12 call (=sync-language-bundle.sh=) printed anything, surface it. =fixed= lines are informational — the drift was already repaired (note that =.claude/= is now dirty if the project commits it). A =drift= line on =settings.json= is surface-only and needs the printed =make install-<lang> PROJECT=.= to reconcile; flag it so the user can decide. If the call was silent, say nothing. - *Newly-installed symlinks.* If the Phase A.0 =make install= step printed any =link= / =relink= / =WARN= line, surface it. A =link= line means a skill, rule, hook, or script added to rulesets is now linked into =~/.claude= for the first time on this machine. For a newly-linked *skill*, check the agent's available-skills list: if the harness already registered it mid-session, note it's available and move on; if it's absent, stop and tell Craig to restart the agent so it loads (whether a mid-session reload works is harness-version-dependent). For a newly-linked *hook*, note that the harness reads hooks at session start — it fires from the next session (or after Craig opens =/hooks= once); its settings.json wiring travels with the tracked file, so the link is usually the only missing piece. A =WARN ... not a symlink= line is a real collision at the target path — surface it; it needs a human. If the step printed only "nothing new to link", say nothing. - *Template-sync churn (safety net).* Check whether Phase A's rsync left uncommitted churn in the synced =.ai/= paths — accumulated from a prior session that crashed before wrap-up, or freshly added this session when rulesets advanced. Without surfacing, it builds up silently until it blocks Phase A.0's auto-ff (git won't ff a dirty tree). Skip in the rulesets repo itself (there =.ai/= is a committed mirror, kept honest by the pre-commit hook). The check is sequential here, after the rsync has finished — not a Phase A step, to keep that batch race-free. diff --git a/claude-templates/.ai/workflows/suspend.org b/claude-templates/.ai/workflows/suspend.org new file mode 100644 index 0000000..1c16bb9 --- /dev/null +++ b/claude-templates/.ai/workflows/suspend.org @@ -0,0 +1,112 @@ +#+TITLE: Session Suspend Workflow +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-06-28 + +* Overview + +This workflow captures the live state of a session when Craig must leave +abruptly, so a future session resumes with nothing lost. It is the fast, +capture-only workflow for departure: it writes down where every thread stands, +notes any uncommitted work, then STOPS — no cleanup, no archive, no teardown. + +Triggered by Craig saying "suspend the session," "suspend," "I need to go," +"stick a pin in everything," or similar. "I need to go" is broad — if it reads +as a conversational aside rather than a request to suspend, confirm before +running. + +* Where suspend sits among its neighbors + +Three workflows touch the session anchor (=.ai/session-context.org=); keep them +straight: + +- =flush= ([[file:../../flush/SKILL.md]] / =/flush=) — *stay and sharpen.* + Refreshes the anchor in place, prompts Craig to type =/clear=, and a hook + resumes the *same* logical session in a fresh context. Craig is still here. +- *suspend* (this workflow) — *leave.* Captures richly into the anchor, leaves + the file in place, and Craig walks away. The next session is a cold startup + that detects the present anchor and resumes from it. +- =wrap-it-up= ([[file:wrap-it-up.org][wrap-it-up.org]]) — *end.* Writes the + Summary, archives the anchor into =.ai/sessions/=, commits + pushes, and runs + the phrase-dependent teardown. + +Suspend and flush share one core — capture into the anchor, leave it in place. +They differ in the exit (leave vs clear-and-continue) and the resume path +(startup vs the =/clear= hook). Suspend reuses flush's capture discipline (its +Phase 1 anchor-refresh) rather than restating it, and adds a richer, +resume-weighted Session Log entry because it's written for a cold resume after a +gap, not a same-session reset. + +* Suspend vs wrap-up — the one structural difference + +=wrap-it-up= ARCHIVES =.ai/session-context.org= (renames it into +=.ai/sessions/=); its absence at the next startup is the signal that the last +session ended cleanly. + +Suspend does the opposite: it LEAVES =.ai/session-context.org= in place. Its +presence at startup is exactly the signal that the previous session was +interrupted, so the startup workflow reads it and resumes. Suspend provides only +the *capture* half — startup's existing interrupted-session path (Phase A checks +for the anchor, Phase B reads it, Phase C offers to resume) is the *resume* half, +already built. + +So: never archive, never rename the context file in a suspend. Capture into it +and leave it. + +* What gets captured + +The point is zero lost information, weighted toward RESUME. Into the +=* Session Log= of =.ai/session-context.org=, append one dated +=** YYYY-MM-DD ... — SUSPENDED= entry holding: + +1. *Open threads — resume here.* For each active or pending thread: the topic, + its status (ACTIVE / PINNED / SET ASIDE / DEFERRED), the immediate next + step, and the pointers needed to act on it cold (files + line numbers, + commit SHAs, the specific finding or decision). This is the core; spend the + most words here. Order newest / most-active first. +2. *Pending decisions / open questions* awaiting Craig — anything blocked on + his input, with enough context that the answer is actionable. +3. *Shipped this session* — a terse list of what landed, each with its commit + SHA, so the resume knows what is already done and need not re-derive it. +4. *Uncommitted work* — anything modified on disk but not committed, named + file by file, so the resume knows what state the tree is in. +5. *Key findings not yet recorded elsewhere* — anything learned this session + that isn't already in a commit, a file, or memory, so it survives. +6. *Background work* — any running task, agent, or job, and how to check it. +7. *Resume hint* — the single most likely "start here" next action. + +Also update the top of =* Summary= (Active Goal) with a one-line SUSPENDED +pointer to the entry, so startup reading the top sees the current state even +when the Summary body is from an earlier thread. + +* Steps + +1. *Write the SUSPENDED entry* into the Session Log, per "What gets captured" + above. Timestamp with =date "+%Y-%m-%d %a @ %H:%M:%S %z"=. +2. *Update the Active Goal pointer* at the top of =* Summary=. +3. *Record uncommitted work, don't force-commit it.* A suspend records state, it + does not tidy it. Name every uncommitted change in the SUSPENDED entry and + leave the tree as it is — on an abrupt departure, a dirty tree (like any + crash) is safer than a blind commit of arbitrary mid-work state. (If a + project defines a standing always-commit set in its own workflow, commit only + that set — but the default shared behavior is to leave the tree alone.) +4. *Leave =.ai/session-context.org= in place.* Do not archive it. +5. *Brief handoff* — one or two lines: what was captured, where the resume + pointer is, the most-active thread. End and let Craig go. + +* What suspend does NOT do + +Speed over completeness. A suspend deliberately skips everything wrap-it-up +does beyond capture: + +- No =* Summary= rewrite beyond the one-line Active Goal pointer. +- No todo.org cleanup / archive-done. +- No KB / memory promotion sweep. +- No Linear / board reconciliation. +- No session-record archive (the file stays live). +- No teardown (the ai-term buffer + tmux session stay up). It drops no + =Stop=-hook teardown sentinel, so the wrap-teardown hook stays dormant. +- No blind commit of working files (step 3). +- No valediction. A suspend is a pause, not a goodbye. + +If Craig later wants the clean end, he runs wrap-it-up, which picks up the +captured state and finishes the job. diff --git a/claude-templates/.ai/workflows/task-audit.org b/claude-templates/.ai/workflows/task-audit.org index 67ce496..7d2b758 100644 --- a/claude-templates/.ai/workflows/task-audit.org +++ b/claude-templates/.ai/workflows/task-audit.org @@ -61,6 +61,8 @@ For each open task, read its body and cross-check its claims against the actual - *Calendar* — did a scheduled event happen; is a SCHEDULED/DEADLINE date now past. - *Meeting recordings* — if a task hinges on "did this conversation happen / what was said," check the recording queue (e.g. =~/sync/recordings/=) and transcribe via =process-meeting-transcript.org= if the answer lives in an un-transcribed recording. (This is exactly how a "did the interview happen?" task gets resolved instead of guessed.) +*Spec lifecycle reconcile (docs-lifecycle convention).* If the project has a =docs/specs/=, run the =:SPEC_ID:= query as part of this phase: for each spec whose top-level status heading reads =DOING=, find the =todo.org= task whose =:SPEC_ID:= property matches the spec's =:ID:=. Flag the spec NEEDS-USER when that bound parent is =DONE=/=CANCELLED=, archived, or missing — the build finished (or evaporated) without the =IMPLEMENTED= flip, exactly the drift this check exists to catch. Check the parent's own keyword, not its children (completed children become dated entries and the final flip task is a child, so child-counting misleads). + Assign each task a bucket (CURRENT / STALE / NEEDS-USER) and, for STALE, the specific factual update. *Scale tactic.* For a large open-task set, dispatch read-only investigation sub-agents over batches of tasks (parallel-safe per =subagents.md= — independent read-only domains). Each returns a per-task bucket + suggested update. *Never* let sub-agents write to =todo.org= concurrently — apply all edits serially in the main thread (concurrent writes to one file race and lose work). @@ -79,11 +81,41 @@ For every STALE task, edit it in the main thread: - *Ensure priority is set per the project scheme.* The top of the project's =todo.org= should carry the priority legend (=[#A]= through =[#D]=). Every task should carry an explicit priority cookie. If a cookie is missing, or no longer matches the reconciled facts, assign the right level per the legend. If the level is unambiguous from the body, do it autonomously; if it's a judgment call (especially the [#A] / [#B] line for important-but-not-urgent work), flag NEEDS-USER. Also enforce the [#A]-discipline rule from the legend — an [#A] task without a =SCHEDULED:= or =DEADLINE:= line is mis-graded and is either down-graded to [#B] (when reconciled facts say "important but not urgent") or surfaced as NEEDS-USER for the user to date. - *Ensure a type tag is set.* Every task carries one type tag from the project's tag legend (typically =:feature:= / =:chore:= / =:spec:= / =:bug:=). If missing or wrong, assign or correct it from the body when the type is unambiguous. If two tags fit (a refactor that also fixes a bug; a spec that's also a chore), flag NEEDS-USER rather than picking one silently. - *Enforce the project's declared tag vocabulary.* If the project's tag legend declares an *exhaustive* set of allowed tags, strip from each task any tag outside that set — the heading and parent section already carry topic/scope context, so ad-hoc tags only fragment the vocabulary and defeat tag-based filtering. Normalize near-duplicate spellings to the canonical tag (a plural to its singular, say). Where the legend does not declare the set closed, leave existing tags alone; this step applies only where the allowed set is exhaustive by design. -- *Re-assess the =:quick:= and =:solo:= tags* — reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the definitions in the project's tag legend (and [[file:task-review.org][task-review.org]]) when the reconciled facts make the call clear. When they don't — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess. +- *Re-assess the =:quick:= and =:solo:= tags (mandatory — an audit that skips this is incomplete).* Reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the hard definitions in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"; task-review carries the same three-gate walk). Autonomous execution reads =:solo:= as its eligibility gate and trusts the tag, so a stale one is a run-time hazard, not cosmetic drift. When the call isn't clear — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess. - Bump =:LAST_REVIEWED:= on each edited task. Follow =todo-format.md= for completion mechanics (depth-based DONE vs dated-rewrite) and the working-files / link-hygiene rules when moving artifacts. +** Phase C.5 — Consolidate related tasks (interactive) + +Phase C's *Consolidate duplicates* bullet folds tasks that track the *same* thing. This step is the broader case: tasks that aren't duplicates but are really *one effort* fragmented across the list. A spread-out effort — several tasks all circling "make the tooling agent-agnostic," say — is harder to see, plan, and finish as a whole than one task, or one parent with the pieces as children. + +After the Phase C edits, read the open-task set as a whole and look for *clusters*: tasks that share a goal, a subsystem, or an obvious sequence. Use judgment over the task bodies, not a keyword heuristic — adjacency is a semantic call, and a brittle title-match both misses real clusters and invents false ones. + +For each cluster, surface it to Craig (inline numbered options per =interaction.md=, no popup) with a recommendation, offering the two shapes: + +- *Merge* — fold the cluster into one task when the members are genuinely the same work split up (near-duplicates, or steps with no independent value). The merged task keeps the strongest priority, unions the type tags, and absorbs each member's body as a dated note or a short list; the absorbed tasks close per =todo-format.md= (a =**= task → =CANCELLED= + =CLOSED:= with a one-line "merged into <task>", or deletion if it carried nothing unique). +- *Parent with children* — when the members are related but distinct (each ships independently or has its own value), promote a parent task and re-home the members beneath it as sub-tasks, so the list shows the effort as a unit without losing the individual pieces. + +Never merge or re-parent autonomously — which tasks belong together, and whether they're one-work or related-distinct, is a judgment only Craig ratifies. Propose, don't apply, until he picks. A cluster he declines stays as separate tasks; don't re-surface it every audit (note the decline in the session log). + +When no clear cluster exists, say so in one line and move on — most audits won't find one, and forcing a merge fragments worse than it consolidates. + +** Phase C.6 — Retire completed parents and promote stragglers (interactive) + +Phase C.5 consolidates related *open* tasks. This step retires parent tasks whose work is *finished*, so completed containers don't linger in Open Work as scaffolding. + +Run =todo-cleanup.el --convert-subtasks= first (it's part of the =clean-todo= / wrap-up cleanup, and =open-tasks.org= runs it too) so every completed sub-task is a dated event-log entry rather than a lingering =DONE= keyword. The closure logic below reads "open child" as a child heading still carrying a task keyword (=TODO=/=DOING=/=WAITING=/=VERIFY=/=NEXT=/=PROJECT=/=STALLED=/=DELEGATED=); a dated entry is correctly not open. + +Two shapes, both proposed to Craig (inline numbered options per =interaction.md=, no popup) before applying: + +- *Zero open children → close the parent.* A parent whose child *tasks* are all resolved (now dated) and that carries no open child task is finished: close it per =todo-format.md= (=**= parent → =DONE=/=CANCELLED= + =CLOSED:=), and it moves to Resolved on the next =--archive-done=. If the work resurfaces later, a fresh task is created then; a completed container shouldn't sit open as a placeholder. +- *One or two open children → promote, then close.* When a parent has only one or two open children, pull them out and rewrite them as standalone =**= level-2 tasks — give each a priority per the project scheme, and make the heading stand alone without the parent's context — then close the now-childless parent and let it move to Resolved. The former children become first-class Open Work tasks; the retired parent stops being scaffolding for one or two stragglers. + +*The leaf-with-notes carve-out (important).* "Zero open children" is not the same as "done." A =**= leaf task whose only descendants are dated *notes* — a captured "Ideas", "Goals", or "Current State" entry, not a real completed sub-task — is unstarted work with a note attached, not a finished container. Do not close it. Tell the two apart by intent: a container reads as a grouping (a =PROJECT= keyword, an explicit "parent grouping ..." line, or several dated entries that were genuinely separate sub-tasks that shipped); a leaf-with-notes is a single feature/bug task whose title names unstarted work and whose lone dated child is a design note. When the call is ambiguous, flag it NEEDS-USER rather than closing. + +Never close or promote autonomously past the ambiguous line — surface the candidates with a recommendation and let Craig ratify, the same interactive stance as Phase C.5. Clear container completions (a =PROJECT= whose every child is dated) can be proposed as a batch; leaf-with-notes ambiguities are flagged individually. Verify open-vs-done counts against the actual headings (a real scan of the subtree), not a fragile regex that a shell's =\b= support can silently break — a miscount here closes live work. + ** Phase D — Flag the judgment calls (interactive) Present the NEEDS-USER bucket as a short, scannable list — one line per task, naming the decision or the fact required. Adjudicate with the user one item at a time (inline numbered options per =interaction.md=, no popup). Apply the user's calls as they come (which may itself produce more autonomous updates, or new tasks). @@ -132,3 +164,7 @@ Two Phase C behaviors added, both surfaced by an Emacs-config =todo.org= audit: - *Tag-vocabulary enforcement.* That project declares a closed tag set (=bug=, =feature=, =refactor=, =test=, =quick=, =solo=); the audit had to strip ~44 ad-hoc tags that had accumulated across the file. The prior workflow only checked that a type tag was *present* — it had no concept of an exhaustive allowed set. The new bullet enforces a declared closed vocabulary and leaves open-vocabulary projects untouched. - *Code-complete-but-unverified closing.* Many tasks had shipped (tests green, live in the daemon) but stayed open awaiting a manual or visual verification, so they accumulated as half-open. Leaving them open is noise; auto-closing them would violate "never claim a fix verified before the user confirms." The fix routes the pending human check into the project's =Manual testing and validation= parent (dedup-checked) per =verification.md='s manual-verification hand-off, then closes the implementation task. The work is done and the check is tracked; a failed check promotes to a bug. + +** 2026-07-01 — Retire completed parents (Phase C.6) + +Added Phase C.6: retire a parent task once its child *tasks* are all done. Zero open children → close the parent; one or two open children → promote them to standalone level-2 tasks, then close. Surfaced by an Emacs-config =todo.org= audit where several PROJECT containers had all children complete. Depends on =todo-cleanup.el --convert-subtasks= running first so completed sub-tasks are dated (not lingering =DONE= keywords) and the open-child count is accurate. Carries a leaf-with-notes carve-out: a =**= leaf task whose only descendant is a dated design note ("Ideas"/"Goals") is unstarted work, not a finished container, and must not be closed — the ambiguous case is flagged NEEDS-USER. The step also warns against counting open-vs-done with a fragile regex (a =\b= that a given shell/awk silently drops miscounts and closes live work). diff --git a/claude-templates/.ai/workflows/task-review.org b/claude-templates/.ai/workflows/task-review.org index 69e172d..ba1571a 100644 --- a/claude-templates/.ai/workflows/task-review.org +++ b/claude-templates/.ai/workflows/task-review.org @@ -57,7 +57,9 @@ Keep is the common case — most tasks are still right and just need re-stamping *** Tagging =:quick:= — small tasks -While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment. +The =:quick:= and =:solo:= assessments (this section and the next) are *mandatory* for every reviewed task except a Kill — a review that skips them is incomplete. The hard definitions live in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"); autonomous execution (work-the-backlog / the no-approvals speedrun) reads =:solo:= as its eligibility gate and trusts the author's tag, so the run-time gate is only as trustworthy as this pass. + +While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment. =:quick:= is an effort hint only, never an eligibility gate — size does not decide what runs autonomously. This is orthogonal to the action chosen — a task can be kept (or re-graded, or marked DOING) *and* tagged =:quick:= in the same pass. Skip the assessment on a Kill, since it's leaving the pool. Tags go on the heading line per [[file:../../claude-rules/todo-format.md][todo-format.md]], sharing one =:tag1:tag2:= cluster. @@ -67,7 +69,7 @@ While reviewing each task, judge whether Claude could build *and* verify it with 1. *Buildable* — Claude has the capability and access to do the work. 2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all — when the success criterion is only judgeable by Craig's eyes or subjective taste. -3. *No upfront decision* — no design or preference call Craig must make before Claude can begin. +3. *No deliberation* — no open design question and no "weigh these approaches" with real tradeoffs. At most one or two *quick, upfront-answerable* factual decisions are allowed — the speedrun preset batches those into its pre-flight Q&A, so they don't break the hands-off run. A genuine design or preference call disqualifies. If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing. @@ -96,6 +98,8 @@ The exact date string matters: =task-review-staleness.sh= and the wrap-up health Follow the completion rules in [[file:../../claude-rules/todo-format.md][todo-format.md]]. A killed top-level =**= task stays task-shaped: change the keyword to =CANCELLED=, add a =CLOSED: [YYYY-MM-DD Day]= line under the heading (generate with =date "+%Y-%m-%d %a"=), and leave the priority and tags intact. It's then a candidate for =--archive-done= at the next cleanup. Don't stamp =:LAST_REVIEWED:= on a kill — it's leaving the review pool anyway. +A killed *sub-task* (=***= or deeper, under a parent task) instead becomes a dated event-log entry per the depth rule — but you don't have to hand-format it here. =todo-cleanup.el --convert-subtasks= (run in the =clean-todo= and wrap-up cleanup passes) rewrites any level-3+ DONE/CANCELLED/FAILED heading into its dated form mechanically from the =CLOSED= cookie, so a keyword-plus-=CLOSED= close at depth gets normalized on the next cleanup rather than lingering. =lint-org.el= flags any that slip through (checker =subtask-done-not-dated=). + * Phase D: Close out When the batch is done (or Craig calls it early): diff --git a/claude-templates/.ai/workflows/work-the-backlog.org b/claude-templates/.ai/workflows/work-the-backlog.org new file mode 100644 index 0000000..b0666e7 --- /dev/null +++ b/claude-templates/.ai/workflows/work-the-backlog.org @@ -0,0 +1,263 @@ +#+TITLE: Work the Backlog +#+AUTHOR: Craig Jennings & Claude +#+DATE: 2026-07-02 + +* Overview + +The single home for the autonomous task-execution loop: take a set of marked, solo-doable tasks from the project's =todo.org= and work them unattended, each held to the full quality bar, under a fixed safety contract. Spec: =rulesets/docs/specs/2026-06-16-autonomous-batch-execution-spec.org=. + +Two callers feed it, differing only in how they build the task set and which session mode they pass: + +- The *inbox auto-loop* (=inbox.org= auto mode) chains here after its routing completes, with a tag/priority query, file-only mode, cap 1. +- The *no-approvals speedrun* preset feeds an explicit ordered list with autonomous-commit + always-push + paging-on, after a pre-flight Q&A that front-loads every decision. + +This workflow owns the execution logic — eligibility gate, defer checklist, quality bar, run cap. Callers own input assembly and mode selection. Capture-routing (inbox surfaces) stays entirely in =inbox.org=; this file never reads an inbox. + +* When to Use This Workflow + +Invoked by its two callers, or directly by phrase: + +- *Speedrun triggers:* "speedrun", "no approvals speedrun", "speedrun these: <task set>" — run the no-approvals speedrun preset (below). The word "speedrun" always routes here, even when the phrase also says "no approvals": plain =no-approvals.org= is the general session mode; the speedrun is this workflow's preset over an explicit task set. +- *Loop caller:* =inbox.org= auto mode chains here after its routing (below). Not phrase-triggered. + +Manual fallback: "work the backlog" / "work the backlog with <task set>" — gather the three inputs below (ask for whichever are missing, defaulting to file-only mode; default cap is the list length for an explicit set, 1 for a query) and run the loop. + +* Inputs — the caller contract + +A caller hands this workflow three things: + +1. *A task set* — an ordered list of candidate task headings from the project's =todo.org=. Either an explicit ordered list (speedrun) or the result of a tag/priority query (the loop). The loop does not care how the set was assembled; it receives an ordered list of candidates. +2. *A session mode* — two orthogonal flags: + - *Commit autonomy:* =file-only= (default) or =autonomous-commit=. See "Commit autonomy" below. + - *Paging:* on or off. End-of-set only. +3. *A run cap* — the hard maximum number of tasks to complete this run. + +It returns a per-task outcome and a run summary. + +* Outcomes — the per-task vocabulary + +Every task in the set ends in exactly one of: + +- =implemented-committed= — implemented, committed (and pushed per the project's flow) under =autonomous-commit=. +- =implemented-diff-surfaced= — implemented, diff surfaced, *not* committed (=file-only=). +- =deferred-VERIFY= — a defer-checklist hit; a =VERIFY= filed naming what's missing or risky. +- =dropped-by-craig= — removed from the run at the speedrun pre-flight Q&A ("skip this"). +- =skipped-ineligible= — failed the mechanical eligibility gate. +- =failed= — implementation was attempted and abandoned: the tree is left working (never commit a broken state), the failure is surfaced in the run summary, and the run continues to the next task. + +The run summary lists each task with its outcome, plus the remaining set when the cap stopped the run. + +* The loop + +For the task set, in order, until the run cap is hit: + +1. *Eligibility gate* (below). Ineligible → record =skipped-ineligible=, next task. +2. *Scope read* of the relevant code. Cheap; just enough to run the defer checklist. +3. *Defer checklist* (below). Any hit → defer: file the =VERIFY= naming the gap and record =deferred-VERIFY= (or, under the speedrun preset, route a quick-question gap to the pre-flight Q&A), next task. +4. *Implement* under the project's commit discipline: TDD red→green→refactor, then =/review-code --staged=, fix all Critical/Important findings, then close the task per =todo-format.md='s completion rules. Decompose into as many logical commits as the change needs — size is not capped. If implementation fails partway, leave the tree working, record =failed=, surface it, and continue to the next task. +5. *Commit autonomy branch:* + - =file-only= → surface the diff, do *not* commit. Record =implemented-diff-surfaced=. + - =autonomous-commit= → =/voice personal= on the message, commit individually, push per the project's flow. Record =implemented-committed=. +6. *Record metrics* for the task (the JSONL append — see Metrics below). +7. Decrement the cap. At zero, stop. + +After the set: if the paging flag is set, fire the end-of-set page (below). Surface the run summary either way. + +* Eligibility gate — mechanical, no judgment + +A task is autonomous-safe when *both* hold. This layer is a lookup, not a judgment; all the judgment lives in the defer checklist. + +1. *Status is =TODO=* — never =VERIFY=, =DOING=, =DONE=, or =CANCELLED=. =VERIFY= marks "awaiting Craig's input"; auto-implementing one defeats the check it represents. The do-not-implement set is safe-by-omission: anything not plainly =TODO= (plus any project-declared "hold" marker) is out. +2. *Tagged =:solo:=* — the autonomy tag, resolved against the project's priority/tag scheme header in =todo.org= (never hardcoded). =:solo:= carries the hard definition in =todo-format.md=: completable and verifiable without Craig beyond at most one or two quick decisions answerable up front, no design deliberation. A project whose scheme declares a different autonomous-safe tag set overrides the default. + +Priority and =:next:= drive *ordering* within the eligible set, not eligibility ([#A] before [#B] before [#C], then the author's ordering). =:quick:= is an effort hint for batching and duration estimates — never a gate. + +Task *size* is deliberately absent from this gate. A large but well-specified, decision-free task is in scope and gets decomposed into per-logical-commit chunks during implementation. Size never sends a task away; only *deliberation* or *risk* does (the checklist below). + +*No scheme header → don't run.* The gate reads =:solo:= semantics from the project's scheme header; a =todo.org= without one leaves the tag undefined (=todo-format.md= makes the header mandatory). Surface that the header is missing and stop rather than guessing eligibility. + +* The defer checklist — act vs file + +After the scope read, run each eligible candidate through the checklist. Each item is a concrete, answerable question, not an adjective. *Any* hit — or any "unsure" — defers the task. Only a task that clears every item is implemented. + +1. *Test-writability (the keystone).* Can I write the failing test from the task text — plus any decisions gathered up front — without inventing a requirement? *No / unsure* → underspecified. Under the speedrun preset, if the gap is one or two quick answerable questions, route it to the pre-flight Q&A; otherwise file a =VERIFY= noting what's missing. Under the unattended loop, file the =VERIFY= (no one to ask). +2. *Data-loss / irreversible / external operation.* Does implementing it require any of: =rm= of non-scratch data, =git reset --hard= / force-push, =DROP= / =DELETE= / =TRUNCATE=, file truncate/overwrite of persisted content, a schema or data migration, any external or shared-state mutation, any credential touch? *Yes* → do NOT implement; file a =VERIFY= naming the risk. This is the hard safety gate; an upfront answer never overrides it without an explicit checkpoint. +3. *Already-satisfied.* Does the scope read show the desired end-state already holds? *Yes* → file a =VERIFY= noting it and move on. Don't make a no-op change. +4. *Design deliberation.* Does the task carry an unresolved design question, a "weigh these approaches" with real tradeoffs, or a TBD that isn't a quick factual answer? *Yes* → under the speedrun preset, if it collapses to one or two quick questions, route to the pre-flight Q&A; otherwise file and surface as a =/start-work= candidate. Under the loop, file. The discriminator is *quick-answerable question* vs *deliberation* — never task size. + +When genuinely unsure which side a task falls on, defer — a wrong auto-implement costs a revert *and* the next-session correction. + +** Filing the deferral =VERIFY= + +Every checklist hit files a =VERIFY= in the project's =todo.org=, per =todo-format.md='s VERIFY rules: + +- *Dedup first.* If a =VERIFY= sibling for this deferral already exists (a prior run filed it), don't file another — record the outcome as =deferred-VERIFY= with a "previously filed" note and move on. The deferred task keeps its =TODO= status and tags, so without this check every subsequent run would re-defer and re-file. +- *Placement:* sibling of the deferred task (the deferred task is the trigger) — a =**= task gets its =VERIFY= at =**=, a =***= sub-task gets it at =***= under the same parent, never deeper. +- *Heading:* carries the question or risk on its own ("VERIFY <topic> — migration touches persisted rows"). +- *Body:* which checklist item hit, what's missing or risky, and what answer or action would make the task runnable. For an already-satisfied hit, the evidence that the end-state already holds. + +** Routing a quick-question gap (speedrun only) + +Under the speedrun preset, a checklist-1 or checklist-4 hit that collapses to one or two quick answerable questions routes to the pre-flight Q&A instead of deferring (see the preset section below). The discriminator: a *quick question* is a factual or preference pick answerable in one line without weighing tradeoffs ("cap at 5 or 8?", "which config key name?"); *deliberation* is anything that needs tradeoffs weighed, options explored, or code read by Craig. A task needing three or more questions isn't quick-question-gapped — it's underspecified; file the =VERIFY=. Checklist item 2 (data-loss / irreversible) never routes to the Q&A: an upfront answer doesn't override the hard safety gate. + +The unattended loop has no one to ask — every hit defers there. + +* Per-task quality bar + +Autonomy changes who approves, not what quality means. Per task, non-negotiable: + +- *TDD* per =testing.md=: red first, green, refactor. The keystone checklist item already proved the failing test is writable. +- *Verification* per =verification.md=: fresh evidence, full suite green before any commit. +- *=/review-code --staged=* before every commit; Critical and Important findings block until fixed. +- *=/voice personal=* on every commit message on the =autonomous-commit= path (or the patterns walked inline if the skill is unavailable), message printed inline so the log shows what landed. +- *Task closure* per =todo-format.md=: depth-based completion (keyword + =CLOSED:= at level 2, dated rewrite at level 3+). +- *One logical change per commit.* A large task becomes several commits, not one omnibus. + +* Commit autonomy + +=file-only= is the default: surface the diff, never commit. =autonomous-commit= is honored only when the project carries the commit-autonomy waiver, read fresh each run — never from memory of past runs or "this project usually allows it." + +The waiver lives in the project's =.ai/notes.org= *Workflow State* section as marker lines, the same shape as the workflow markers already there: + +#+begin_example +:COMMIT_AUTONOMY: yes +:LOOP_MAY_COMMIT: yes +#+end_example + +- =:COMMIT_AUTONOMY: yes= — the project has the waiver. An =autonomous-commit= request (the speedrun preset, or a manual run asking for it) is honored. +- =:LOOP_MAY_COMMIT: yes= — the *unattended loop caller* may also commit. It requires =:COMMIT_AUTONOMY:= alongside it; the split exists because "Craig-initiated speedrun may commit" and "the recurring loop may commit unattended" are different levels of trust. Without this flag the loop stays =file-only= even when the project holds the waiver. + +An absent marker means no. Anything other than a plain =yes= value also means no. The read is one grep of the Workflow State section — a lookup, not a judgment. + +*The degrade contract.* When a caller requests =autonomous-commit= and the required marker is missing, degrade to =file-only= and surface it in both the run intro and the run summary: "autonomous-commit requested, no :COMMIT_AUTONOMY: waiver in notes.org — running file-only." Never honor the request without the marker, and never drop to file-only silently — the first commits into a project that didn't opt in, the second hides why nothing got committed. + +* Bounding the run + +The cap is a hard per-run task ceiling passed by the caller — the kill switch a runaway can't exceed: + +- *Loop caller default: 1.* Implement the highest-priority eligible candidate, record, stop; the next tick continues. +- *Speedrun: the length of the explicit list*, capped at a ceiling — the human bounded the set by naming it. + +Even the speedrun stops at the cap and surfaces (and, with paging on, pages) the remainder. The cap bounds task *count*, not cost; a token budget is logged as vNext. + +* Context hygiene — auto-flush between tasks + +Task boundaries are clean boundaries by construction: the previous task is closed and committed (or filed), nothing is half-edited. When the context window grows heavy mid-run, run the flush skill's *auto mode* between tasks: checkpoint the session anchor with the remaining task set, session mode, and cap in Next Steps (so the resumed context continues the run blind), arm the self-injection (=.ai/scripts/self-inject.sh= via =tmux run-shell -b=), and end the turn. The fresh context resumes from the anchor and works on. Unattended runs only — the keystroke-collision hazard and the full mechanism live in the flush skill. + +* End-of-set page + +With paging on, fire one page when the set is done or the cap is hit — end-of-set only, never per-task: + +#+begin_src sh +notify info "Page" "<project>: <N> done, <M> remaining — <one-line summary>" --persist +#+end_src + +=--persist= keeps it on screen until dismissed, and =info= is the page-me urgency convention (persistent but never crash-scary). The page fires when the set completes *or* the cap stops the run — either way exactly once. The message carries the project name, the completed count, and the remaining count (with skipped tasks noted in the run summary) so Craig can confirm ready and name the next project in one reply. There is no separate page-signal call — =notify= is the paging surface. + +* Metrics + +Each task outcome appends one JSON line to the project's =.ai/metrics/work-the-backlog.jsonl= — git-tracked, append-only, =jq=-queryable. Create the directory and file on the first append. Logging is a side effect only: a failed append surfaces a warning in the run summary but never blocks, reorders, or aborts execution. + +One record per task, written at the moment its outcome is decided: + +| Field | Meaning | +|--------------------+-------------------------------------------------------------------------------------------------| +| =ts= | ISO-8601 timestamp of the task outcome | +|--------------------+-------------------------------------------------------------------------------------------------| +| =run_id= | UUID shared by every record in one run (=uuidgen= at run start) | +|--------------------+-------------------------------------------------------------------------------------------------| +| =project= | project basename | +|--------------------+-------------------------------------------------------------------------------------------------| +| =caller= | =loop= / =speedrun= / =manual= | +|--------------------+-------------------------------------------------------------------------------------------------| +| =task= | the task heading (slug) | +|--------------------+-------------------------------------------------------------------------------------------------| +| =outcome= | =implemented-committed= / =implemented-diff= / =deferred-verify= / =skipped-ineligible= / | +| | =dropped-by-craig= / =failed= | +|--------------------+-------------------------------------------------------------------------------------------------| +| =defer_reason= | =underspecified= / =data-loss= / =already-satisfied= / =needs-deliberation= — set on | +| | =deferred-verify= records only | +|--------------------+-------------------------------------------------------------------------------------------------| +| =upfront_decision= | =true= when a pre-flight answer was recorded and used for this task | +|--------------------+-------------------------------------------------------------------------------------------------| +| =wall_clock_s= | seconds from task start to outcome | +|--------------------+-------------------------------------------------------------------------------------------------| +| =commit_sha= | committed tasks: the commit SHA (comma-separated when the task decomposed into several); empty | +| | otherwise | +|--------------------+-------------------------------------------------------------------------------------------------| +| =review_findings= | count of =/review-code= Critical + Important findings on this task | +|--------------------+-------------------------------------------------------------------------------------------------| + +The =outcome= slugs map one-to-one onto the outcome vocabulary above (=implemented-diff= is =implemented-diff-surfaced=; =deferred-verify= is =deferred-VERIFY=). Per-run rollups (attempted / completed / deferred / dropped, wall-clock total, findings per commit) are computed at synthesis, not stored per record. The =commit_sha= field is what the synthesis step's corrections signal keys on — whether a later commit reverted or hand-fixed an autonomous one — so never omit it on a committed task. + +* Caller: the inbox auto-loop + +=inbox.org= auto mode chains here as an explicit second step *after* its routing completes — never as a phase inside inbox processing. When a cycle files new items and Craig answers "run this batch next?" with yes, auto mode invokes this workflow with: + +- *Task set:* the eligibility query over the queued/filed items — status =TODO= + =:solo:= per the scheme header, priority-ordered. +- *Session mode:* =file-only=, paging off. (A project carrying both =:COMMIT_AUTONOMY:= and =:LOOP_MAY_COMMIT:= markers opts the loop into commits — see Commit autonomy above.) +- *Cap: 1.* The highest-priority eligible candidate runs, gets recorded, and the loop's next tick (or the next yes) continues from there. + +The loop has no human at kickoff of each task, so a needs-quick-decisions task defers with a =VERIFY= — the pre-flight Q&A is a speedrun capability, not a loop one. Startup and wrap-up never invoke this workflow. + +* Preset: the no-approvals speedrun + +The named preset is a label for one flag combination, not a second code path: *explicit ordered list + =autonomous-commit= + always-push + paging-on*, with every approval front-loaded into a single pre-flight step. "No approvals" means all input first, then hands-off — not no input ever. =autonomous-commit= still requires the =:COMMIT_AUTONOMY:= waiver (Commit autonomy above); without it the preset degrades to =file-only= and says so in the pre-flight intro. + +When Craig names a task set and says "speedrun": + +1. *Gather* the named task set. +2. *Scope-read and classify* each task against the eligibility gate + defer checklist: *ready* (clears everything), *needs-quick-decisions* (one or two upfront-answerable questions — checklist item 1 or 4), or *drop* (data-loss/irreversible, or deliberation that isn't a quick question). +3. *Order* the list — priority, then the author's ordering / =:next:=. +4. *Intro the work* — present the ordered plan: what will run, what was dropped and why, and the batched questions for the needs-quick-decisions tasks. +5. *Craig answers each question or says "skip this"* — a skip removes the task (recorded =dropped-by-craig=; the task itself stays =TODO=); an answer is recorded so implementation works from the decision, not a guess. +6. *Run the finalized list autonomously* — no further approvals until done. Cap = the list length (the human bounded the set by naming it), still one commit per logical change, always-push per the project's flow, auto-flushing between tasks when the context grows heavy (see Context hygiene above). +7. *End-of-set page* with completed + remaining + skipped. + +The batch-ask (step 4-5) is one message: each question names its task, puts the recommended answer at item 1 when there is one (per =interaction.md= — inline numbered, no popup), and offers "skip this" as the last option. Before the run starts, write each answer into its task's body in =todo.org= as a dated line — the implementation works from the recorded decision, and the record survives the session. The Q&A fires only under this preset; the loop caller never asks (its decision-needing tasks defer). + +*** Per-item disposition rule + +For every item the run picks up (this holds for any executing caller, including an auto-inbox-zero run given a standing yes): + +- *Feature-level task* → write a spec first (=spec-create=), don't implement directly. The spec is the run's deliverable for that item. +- *Needs decisions you can't confidently guess* → file it as a =VERIFY= carrying the question (under this preset, one or two quick questions route to the pre-flight Q&A instead). +- *Well-defined* → implement it, taking the time it needs. + +This extends the defer checklist: the checklist decides *act vs file*; this rule decides the *shape* of the act. + +* Synthesis: metrics → org-roam KB + +Trigger: "synthesize backlog metrics" (optionally a weekly scheduled run). This is the read side of the metrics log — Craig's ask was "gather data and create org-roam articles we can look at later," and this step is the second half. It is read-only over the logs plus exactly one KB write. + +1. *Gather the JSONL union.* Discover =.ai/metrics/work-the-backlog.jsonl= across the project roots (dirs carrying =.ai/protocols.org= under =~/code=, =~/projects=, =~/.emacs.d=). Classify each project per =knowledge-base.md= (work-root denylist, never inference) before reading it into the union. +2. *Enforce personal-only.* A work-classified or unknown project's metrics never enter the KB write — they stay in that project's own log. Report the exclusion per the KB refusal contract: the classification, a one-line redacted summary, and where the data stayed. +3. *Compute the rollups and trends.* Per run: attempted / completed / deferred (by reason) / dropped / failed, wall-clock total, commits landed, review findings per commit. Trends across runs: completion rate over time, defer-reason distribution, findings-per-commit trend. +4. *Compute the corrections signal* — the key metric. For each =commit_sha= in the window, check that project's history for a later commit (within ~14 days) that reverts it or carries a fix touching the same files. A clean run is one whose autonomous commits survive untouched; a flagged run is what Craig reviews by hand. This is a cheap proxy, not proof — it flags candidates, it doesn't convict. +5. *Write one KB node* at =~/org/roam/agents/YYYYMMDDHHMMSS-backlog-metrics-<window>.org= per =knowledge-base.md=: =:agent:metrics:= filetags, a concise title, the rollup table, the trend narrative, and =[[id:...]]= links to prior synthesis nodes so the series is traceable. Pull before writing, commit and push after — the normal KB session discipline. + +The KB node is the artifact Craig reads later: "are the runs completing more and getting corrected less?" should read off the trend table without touching raw logs. Synthesis never mutates the JSONL, todo.org, or any project tree. + +* Common Mistakes + +1. *Implementing a =VERIFY= or =DOING= task.* The gate is status =TODO= only — a =VERIFY= exists precisely because Craig's input is pending. +2. *Treating =:quick:= as eligibility.* It's an effort hint. =:solo:= is the gate. +3. *Deferring on size.* A large, well-specified, decision-free task runs — decomposed into logical commits. Size is not a checklist item. +4. *Guessing past the keystone.* If the failing test isn't writable from the task text, the task isn't ready. Inventing the requirement is the failure the checklist exists to stop. +5. *Rationalizing through the data-loss list.* "The migration is small" doesn't clear checklist item 2. Enumerated operations defer, full stop. +6. *Committing in =file-only= mode.* The diff is the deliverable; the commit is Craig's. +7. *One omnibus commit for the whole run.* Every logical change is its own reviewed commit. +8. *Skipping =/review-code= or =/voice= because nobody's watching.* Autonomy removes interaction gates, never engineering-discipline gates (same contract as =no-approvals.org=). +9. *Running past the cap.* The cap is the kill switch; hitting it means stop and surface, even mid-set. +10. *Paging per-task.* One page, end of set. +11. *Honoring =autonomous-commit= from memory.* The waiver is the marker line in =notes.org=, read fresh each run. "This project usually allows it" isn't a read. +12. *Re-filing the same deferral =VERIFY= every run.* The deferred task stays =TODO=, so a run that skips the existing-sibling check spams =todo.org= with duplicates. +13. *Routing a data-loss hit to the pre-flight Q&A.* Checklist item 2 is the hard gate — an upfront answer never clears it without an explicit checkpoint. + +* Living Document + +Refine as the dogfooding signal arrives — the metrics log and the corrections-in-next-session signal are the feedback loop. Fold recurring adjustments in rather than accumulating caller-side workarounds. + +* History + +Created 2026-07-02 as Phase 1 of the autonomous-batch execution spec, reconciling the inbox-zero "Phase E" proposal and the =.emacs.d= speedrun proposal into one execution loop. The auto-inbox-zero execute step in =inbox.org= reverted to routing-only in the same change so this file is the loop's only home. Phases 2-6 (same day) wired both callers, pinned the commit-autonomy waiver markers, fleshed the defer/Q&A/page mechanics, and added the metrics record + KB synthesis step. diff --git a/claude-templates/.ai/workflows/wrap-it-up.org b/claude-templates/.ai/workflows/wrap-it-up.org index 5d2cdd2..d0c4e75 100644 --- a/claude-templates/.ai/workflows/wrap-it-up.org +++ b/claude-templates/.ai/workflows/wrap-it-up.org @@ -137,6 +137,22 @@ Run the report-only variant first if you want to see what would change without w emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org #+end_src +*** Convert done sub-tasks to dated entries + +#+begin_src bash +[ -f todo.org ] && emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org +#+end_src + +=--convert-subtasks= rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the now-redundant =CLOSED:= line. This enforces the =todo-format.md= depth rule that a completed *sub-task* (a heading under a parent task) becomes dated history, not a lingering DONE keyword — a shape an interactive org close (=org-log-done= → DONE + CLOSED) never applies and =--archive-done= (level-2 only) never reaches. The timestamp comes from each entry's own =CLOSED= cookie; a date-only close yields =00:00:00=. Heading text is kept verbatim. Idempotent (an already-dated heading has no keyword to match), and a done sub-task with no parseable =CLOSED= is flagged and left alone rather than stamped with a fabricated date. + +Run this *before* =--archive-done= so that when a completed level-2 parent is archived, its sub-tasks already carry their dated form. Any rewrites show up in the wrap-up commit's diff for review before push. + +Preview without writing: + +#+begin_src bash +emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org +#+end_src + *** Archive completed work #+begin_src bash @@ -244,6 +260,32 @@ The check exempts =lint-followups.org= explicitly because lint-org runs earlier This integrates with =inbox.org= process mode, which stamps =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section on completion. Wrap-up doesn't double-stamp. It only ensures the inbox carries nothing but the expected pipeline artifacts at session end. +*** Cross-project router (optional — route filed keepers to their home projects) + +Runs directly after the inbox sanity check. The split between the two: the sanity check *gates* the wrap (a dirty inbox blocks until resolved); the router is *optional* (skipping it never blocks anything — the candidates just stay local until a future wrap). Spec: =docs/specs/wrapup-routing-spec.org= (D7/D8/D9). + +The candidate set is exactly the local tasks carrying a =:ROUTE_CANDIDATE:= property — keepers that inbox process mode filed this session whose inferred home is another project. Never scan the standing backlog. + +#+begin_src bash +.ai/scripts/route-batch --list +#+end_src + +*Empty set = zero interaction.* =--list= prints nothing when there are no candidates; continue the wrap silently — no prompt, no "0 items" line. + +When candidates exist, surface the batch as one line per task — the task heading, the destination project, the delivery mode (=inbox-send= file handoff), and the engine's confidence — then offer exactly two options: *go* (route the whole batch) or *skip* (leave everything local). Derive each confidence label by running the engine on the task's heading + body (=python3 .ai/scripts/route_recommend.py --item "..." --exclude "$(basename "$PWD")"=); label weak matches visibly ("weak — verify the destination") so a low-confidence route gets a human glance before the keystroke. + +On *go*: + +#+begin_src bash +.ai/scripts/route-batch --go +#+end_src + +Per candidate, the helper writes the task's subtree (children ride along; =:ROUTE_CANDIDATE:= stripped, headings promoted to top level) to a one-task handoff, delivers it via =inbox-send <destination> --file= (so the =from-<this-project>= provenance is stamped and the destination's inbox process mode dispositions it as a single item), and only after a successful send removes the subtree from the local =todo.org= — a single-file local edit the wrap is already committing. A failed send leaves that task in place and exits non-zero; report it and continue the wrap. Never write the destination's =todo.org= directly; its own inbox processing files the task per its conventions. + +On *skip*, leave every candidate in place, marker included — they resurface next wrap. + +Mis-routes are recoverable: the receiving project rejects via inbox process mode's reject-from-another-project flow, which returns the item to this project's inbox with the rationale. That reject path is why removing the local source on send is safe. + *** Review-habit health check (surface a slipped daily task-review) The daily task-review habit walks the open top-level tasks on a rotating cycle, stamping =:LAST_REVIEWED:= as it goes (see =task-review.org=). This check is the watchdog for that habit. When tasks have gone too long unreviewed, the habit has slipped, and the wrap-up says so in one line — it does not re-list the tasks. @@ -536,7 +578,7 @@ Before considering wrap-up complete: - [ ] The Summary ends with the =KB: promoted N / consulted yes-no= line (promotion check ran) - [ ] File renamed to =.ai/sessions/YYYY-MM-DD-HH-MM-description.org= - [ ] =.ai/session-context.org= no longer exists -- [ ] =todo-cleanup.el= ran — hygiene pass + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root) +- [ ] =todo-cleanup.el= ran — hygiene pass + =--convert-subtasks= + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root) - [ ] =lint-org.el= ran on =todo.org= — mechanical fixes applied, judgments appended to follow-ups file (if =todo.org= exists) - [ ] Any orphan-planning-line warnings reviewed (fix or accept) - [ ] Inbox carries nothing but expected pipeline artifacts (=.gitkeep=, =lint-followups.org=, =PROCESSED-*= prefixes), OR each remaining handoff has an explicit deferral logged in the valediction |
