aboutsummaryrefslogtreecommitdiff
path: root/claude-templates
diff options
context:
space:
mode:
Diffstat (limited to 'claude-templates')
-rw-r--r--claude-templates/.ai/protocols.org2
-rwxr-xr-xclaude-templates/.ai/scripts/inbox-send.py21
-rw-r--r--claude-templates/.ai/scripts/lint-org.el28
-rwxr-xr-xclaude-templates/.ai/scripts/route-batch175
-rwxr-xr-xclaude-templates/.ai/scripts/self-inject.sh68
-rwxr-xr-xclaude-templates/.ai/scripts/spec-sort715
-rw-r--r--claude-templates/.ai/scripts/tests/route-batch.bats202
-rw-r--r--claude-templates/.ai/scripts/tests/self-inject.bats78
-rw-r--r--claude-templates/.ai/scripts/tests/spec-sort.bats453
-rw-r--r--claude-templates/.ai/scripts/tests/test-lint-org.el31
-rw-r--r--claude-templates/.ai/scripts/tests/test-todo-cleanup.el171
-rw-r--r--claude-templates/.ai/scripts/tests/test_inbox_send.py75
-rw-r--r--claude-templates/.ai/scripts/todo-cleanup.el192
-rw-r--r--claude-templates/.ai/workflows/INDEX.org6
-rw-r--r--claude-templates/.ai/workflows/clean-todo.org19
-rw-r--r--claude-templates/.ai/workflows/inbox.org19
-rw-r--r--claude-templates/.ai/workflows/no-approvals.org2
-rw-r--r--claude-templates/.ai/workflows/open-tasks.org7
-rw-r--r--claude-templates/.ai/workflows/page-me.org26
-rw-r--r--claude-templates/.ai/workflows/spec-create.org14
-rw-r--r--claude-templates/.ai/workflows/spec-response.org6
-rw-r--r--claude-templates/.ai/workflows/spec-review.org8
-rw-r--r--claude-templates/.ai/workflows/startup.org25
-rw-r--r--claude-templates/.ai/workflows/task-audit.org23
-rw-r--r--claude-templates/.ai/workflows/task-review.org8
-rw-r--r--claude-templates/.ai/workflows/work-the-backlog.org263
-rw-r--r--claude-templates/.ai/workflows/wrap-it-up.org44
27 files changed, 2641 insertions, 40 deletions
diff --git a/claude-templates/.ai/protocols.org b/claude-templates/.ai/protocols.org
index ed07c0e..5e18ab9 100644
--- a/claude-templates/.ai/protocols.org
+++ b/claude-templates/.ai/protocols.org
@@ -552,6 +552,8 @@ Claude needs to add information to =.ai/notes.org=. For large amounts of informa
**The gitignore set follows that same decision.** A project that gitignores =.ai/= (the code-project case) gitignores the whole personal-tooling set: =.ai/=, =.claude/=, =CLAUDE.md=, =AGENTS.md=. =.claude/= is rulesets-owned — copies of =claude-rules/*.md= plus the language bundle's rules, hooks, and settings — and re-synced from rulesets on every startup, so git isn't how it travels between machines; ignoring it also keeps those private rule copies out of the repo, which ignoring =CLAUDE.md= alone would miss. A track-mode project (personal/doc repos, or a team repo that shares config with teammates who don't run rulesets) tracks the set instead. =install-ai.sh= writes the full set at bootstrap in gitignore mode; =scripts/sweep-gitignore-tooling.sh= backfills it idempotently across existing gitignore-mode projects when the set grows.
+**Public reachability decides harder than project type.** Any repo whose remotes include a non-cjennings.net host gitignores the tooling set, whatever kind of project it is — the only exception is a team repo that deliberately shares the config, decided explicitly, never by default. And a private remote is not proof of privacy: a server-side =post-receive --mirror= hook republishes invisibly from the client (the 2026-06-30 =.emacs.d= exposure rode exactly that — a cjennings.net remote mirroring to public GitHub). The sweep recognizes both the anchored (=/.ai/=) and unanchored (=.ai/=) ignore styles — an anchored-style project used to be misread as track-mode and silently skipped — and warns when tracked tooling can reach a non-cjennings.net remote.
+
**Credential-leak concern: gate it on project type, not on the credential itself.** A tracked secret, token, or credentials doc is only a public-leak risk where the repo can reach a public remote — that is, *code projects pushed to public GitHub*, which is exactly why those gitignore =.ai/= and =.claude/=. For *personal / documentation projects* (the =~/projects/= set: elibrary, home, finances, health, philosophy, etc.), the git remote is a private single-user repo on =cjennings.net=, so tracked credentials inside =.ai/= files are fine — that's the design, the project history IS the project. Do NOT raise a leak warning or suggest gitignoring a secret for these. When the question "is this a leak / should we gitignore this secret?" comes up, decide it on *which kind of project and remote* this is, never on the mere presence of a credential in a tracked file.
**When to break out documents:**
diff --git a/claude-templates/.ai/scripts/inbox-send.py b/claude-templates/.ai/scripts/inbox-send.py
index 1362a1f..1ebb636 100755
--- a/claude-templates/.ai/scripts/inbox-send.py
+++ b/claude-templates/.ai/scripts/inbox-send.py
@@ -177,6 +177,23 @@ def build_text_org(message: str, source_name: str, timestamp: str) -> str:
)
+def uniquify(dest: Path) -> Path:
+ """Return dest, or dest with a -2/-3/... stem suffix when it already exists.
+
+ Two sends in the same minute whose text starts with the same phrase
+ derive identical filenames, and the second silently overwrote the
+ first (a message was lost this way, 2026-07-02). Never overwrite.
+ """
+ if not dest.exists():
+ return dest
+ n = 2
+ while True:
+ candidate = dest.with_name(f"{dest.stem}-{n}{dest.suffix}")
+ if not candidate.exists():
+ return candidate
+ n += 1
+
+
def send_text(
target_inbox: Path,
message: str,
@@ -191,7 +208,7 @@ def send_text(
if not slug:
raise ValueError(f"could not derive a slug from text: {message!r}")
filename = f"{now.strftime(TS_FILENAME_FMT)}-from-{source_name}-{slug}.org"
- dest = target_inbox / filename
+ dest = uniquify(target_inbox / filename)
dest.write_text(build_text_org(message, source_name, now.strftime(TS_DOC_FMT)))
return dest
@@ -211,7 +228,7 @@ def send_file(
raise ValueError(f"could not derive a slug from file: {src_path}")
ext = src_path.suffix
filename = f"{now.strftime(TS_FILENAME_FMT)}-from-{source_name}-{slug}{ext}"
- dest = target_inbox / filename
+ dest = uniquify(target_inbox / filename)
shutil.copy2(src_path, dest)
return dest
diff --git a/claude-templates/.ai/scripts/lint-org.el b/claude-templates/.ai/scripts/lint-org.el
index 5447cb3..90b1b1d 100644
--- a/claude-templates/.ai/scripts/lint-org.el
+++ b/claude-templates/.ai/scripts/lint-org.el
@@ -35,6 +35,7 @@
;; empty-heading bare stars with no title
;; malformed-priority-cookie [#x]-shaped token org rejected
;; level2-done-without-closed completed level-2 task with no CLOSED
+;; subtask-done-not-dated level-3+ done sub-task still a DONE keyword
;; (anything else) surfaced as judgment with checker name
;;
;; Output format on stdout:
@@ -503,6 +504,32 @@ the live file on the next `task-sorted'."
"level-2 DONE/CANCELLED has no CLOSED date — add CLOSED: [YYYY-MM-DD Day]; task-sorted's aging step archives an undated completed task immediately"))))))))
;;; ---------------------------------------------------------------------------
+;;; level-3+ dated-header check (claude-rules/todo-format.md)
+;;
+;; The inverse of the level-2 check above. A completed sub-task — a heading at
+;; level 3 or deeper, under a parent task — becomes a dated event-log entry, not
+;; a DONE keyword, so the parent's subtree grows a chronological history instead
+;; of a long tail of nested DONE lines. An interactive org close
+;; (`org-log-done' → DONE + CLOSED) leaves the keyword in place, and
+;; `--archive-done' only touches level 2, so these accumulate. Flag them for
+;; conversion. Judgment-only and regex-based (independent of which TODO keywords
+;; the batch Emacs recognizes); todo-cleanup.el --convert-subtasks does the fix.
+
+(defun lo--check-subtask-done-not-dated ()
+ "Flag level-3+ headings carrying a done keyword (DONE/CANCELLED/FAILED).
+Emits one judgment item per offending heading (checker
+`subtask-done-not-dated')."
+ (save-excursion
+ (goto-char (point-min))
+ ;; Case-sensitive: the keywords are uppercase, not the words in a title.
+ (let ((case-fold-search nil))
+ (while (re-search-forward
+ "^\\*\\{3,\\} \\(DONE\\|CANCELLED\\|FAILED\\) " nil t)
+ (lo--emit-judgment
+ 'subtask-done-not-dated (line-number-at-pos)
+ "level-3+ done sub-task should be a dated event-log entry (todo-format.md): run todo-cleanup.el --convert-subtasks to rewrite it")))))
+
+;;; ---------------------------------------------------------------------------
;;; File processing
(defun lo--backup (file)
@@ -543,6 +570,7 @@ left unmodified and mechanical entries are recorded with :preview t."
(lo--check-empty-headings)
(lo--check-malformed-priority-cookies)
(lo--check-level2-done-without-closed)
+ (lo--check-subtask-done-not-dated)
(when (and (not lo-check-only) (buffer-modified-p))
(save-buffer)))
(with-current-buffer buf (set-buffer-modified-p nil))
diff --git a/claude-templates/.ai/scripts/route-batch b/claude-templates/.ai/scripts/route-batch
new file mode 100755
index 0000000..8f27d19
--- /dev/null
+++ b/claude-templates/.ai/scripts/route-batch
@@ -0,0 +1,175 @@
+#!/usr/bin/env python3
+"""route-batch — the wrap-up router's mechanical go path.
+
+The wrap-up cross-project router (wrap-it-up.org Step 3; wrapup-routing spec
+D7/D8/D9) surfaces the local tasks that inbox process mode stamped with
+:ROUTE_CANDIDATE: <destination> at file time, and on "go" delivers each to its
+destination project's inbox. This script does the mechanical half so the
+subtree surgery is deterministic:
+
+ route-batch --list [--todo todo.org]
+ One "<destination>\t<heading>" line per :ROUTE_CANDIDATE:-tagged task.
+ Silent with exit 0 when there are no candidates (the workflow's
+ empty-set-equals-zero-interaction rule). Read-only.
+
+ route-batch --go [--todo todo.org]
+ For each candidate, bottom-up: extract the task's whole subtree
+ (children ride along), drop the :ROUTE_CANDIDATE: line (and the
+ property drawer if that leaves it empty), promote the subtree so its
+ top heading is level 1, write it to a temp file, and deliver it via
+ the sibling inbox-send.py to the destination's inbox/ (one file per
+ task, from-<source> provenance stamped by inbox-send). Only after a
+ successful send is the subtree removed from the local todo.org — a
+ failed send leaves that task in place, is reported, and the run exits
+ non-zero after attempting the rest.
+
+The candidate set is exactly the tagged tasks — never the standing backlog.
+Discovery, roots, and the source-project name all come from inbox-send.py
+(INBOX_SEND_ROOTS sandboxes it in tests). The reject-from-another-project
+flow in inbox process mode is the mis-route recovery; that path is why
+removing the local source after a successful send is safe.
+"""
+
+import argparse
+import os
+import re
+import subprocess
+import sys
+import tempfile
+from pathlib import Path
+
+HEADING_RE = re.compile(r"^(\*+)\s+(.*)$")
+MARKER_RE = re.compile(r"^\s*:ROUTE_CANDIDATE:\s+(\S+)\s*$")
+
+
+def find_candidates(lines):
+ """[(heading_idx, end_idx, marker_idx, destination, heading_text)] —
+ end_idx is one past the subtree's last line."""
+ candidates = []
+ for i, line in enumerate(lines):
+ m = MARKER_RE.match(line)
+ if not m:
+ continue
+ head_idx = None
+ for j in range(i, -1, -1):
+ hm = HEADING_RE.match(lines[j])
+ if hm:
+ head_idx = j
+ level = len(hm.group(1))
+ heading = hm.group(2)
+ break
+ if head_idx is None:
+ continue
+ end = len(lines)
+ for k in range(head_idx + 1, len(lines)):
+ km = HEADING_RE.match(lines[k])
+ if km and len(km.group(1)) <= level:
+ end = k
+ break
+ candidates.append((head_idx, end, i, m.group(1), heading))
+ return candidates
+
+
+def extract_handoff(lines, head_idx, end):
+ """The subtree as handoff text: every :ROUTE_CANDIDATE: line dropped
+ (a marker is meaningless at the destination), empty drawers pruned,
+ headings promoted so the task is level 1."""
+ sub = [l for l in lines[head_idx:end] if not MARKER_RE.match(l)]
+
+ pruned = []
+ i = 0
+ while i < len(sub):
+ if sub[i].strip() == ":PROPERTIES:" and i + 1 < len(sub) and sub[i + 1].strip() == ":END:":
+ i += 2
+ continue
+ pruned.append(sub[i])
+ i += 1
+
+ shift = len(HEADING_RE.match(pruned[0]).group(1)) - 1
+ if shift > 0:
+ pruned = [l[shift:] if HEADING_RE.match(l) else l for l in pruned]
+ return "\n".join(pruned).rstrip() + "\n"
+
+
+def send(destination, handoff_text, slug):
+ inbox_send = Path(__file__).with_name("inbox-send.py")
+ with tempfile.NamedTemporaryFile(
+ "w", suffix=".org", prefix=f"route-{slug}-", delete=False, encoding="utf-8"
+ ) as tf:
+ tf.write(handoff_text)
+ tmp = tf.name
+ try:
+ result = subprocess.run(
+ [sys.executable, str(inbox_send), destination, "--file", tmp],
+ capture_output=True, text=True,
+ )
+ return result.returncode == 0, (result.stderr or result.stdout).strip()
+ finally:
+ os.unlink(tmp)
+
+
+def main():
+ ap = argparse.ArgumentParser(prog="route-batch")
+ mode = ap.add_mutually_exclusive_group(required=True)
+ mode.add_argument("--list", action="store_true", dest="list_mode")
+ mode.add_argument("--go", action="store_true")
+ ap.add_argument("--todo", default="todo.org")
+ args = ap.parse_args()
+
+ todo_path = Path(args.todo)
+ if not todo_path.is_file():
+ return 0 # no todo file, no candidates
+ lines = todo_path.read_text(encoding="utf-8").splitlines()
+ candidates = find_candidates(lines)
+
+ # Two markers in one task's drawer are one candidate, not two: same span +
+ # same destination dedupes. Everything else that overlaps — a tagged child
+ # inside a tagged parent, one task tagged for two destinations — is a
+ # conflict: routing either span would silently take the other (or, with a
+ # stale end index, a bystander task) along. Conflicts are left in place
+ # and reported; the human untangles which project the pieces belong to.
+ deduped = []
+ for cand in candidates:
+ if not any(c[0] == cand[0] and c[1] == cand[1] and c[3] == cand[3] for c in deduped):
+ deduped.append(cand)
+ conflicted = set()
+ for a in deduped:
+ for b in deduped:
+ if a is not b and a[0] <= b[0] and b[1] <= a[1]:
+ conflicted.add(a)
+ conflicted.add(b)
+ routable = [c for c in deduped if c not in conflicted]
+
+ if not deduped:
+ return 0
+
+ if args.list_mode:
+ for _h, _e, _m, dest, heading in deduped:
+ flag = "\tCONFLICT (overlapping candidates — resolve by hand)" if (_h, _e, _m, dest, heading) in conflicted else ""
+ print(f"{dest}\t{heading}{flag}")
+ return 0
+
+ failures = 0
+ for _h, _e, _m, dest, heading in sorted(conflicted):
+ failures += 1
+ print(f"CONFLICT: {dest}\t{heading}\t(overlapping candidate subtrees — left in place, resolve by hand)")
+
+ # Bottom-up so earlier indices stay valid as subtrees are removed; the
+ # file is rewritten after every successful send so a crash mid-run never
+ # leaves an already-sent task still present locally.
+ for head_idx, end, _marker_idx, dest, heading in sorted(routable, reverse=True):
+ handoff = extract_handoff(lines, head_idx, end)
+ slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")[:40] or "task"
+ ok, detail = send(dest, handoff, slug)
+ if ok:
+ del lines[head_idx:end]
+ todo_path.write_text("\n".join(lines).rstrip("\n") + "\n", encoding="utf-8")
+ print(f"routed: {dest}\t{heading}")
+ else:
+ failures += 1
+ print(f"FAILED: {dest}\t{heading}\t({detail})")
+ return 1 if failures else 0
+
+
+if __name__ == "__main__":
+ sys.exit(main())
diff --git a/claude-templates/.ai/scripts/self-inject.sh b/claude-templates/.ai/scripts/self-inject.sh
new file mode 100755
index 0000000..e7340c1
--- /dev/null
+++ b/claude-templates/.ai/scripts/self-inject.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+# self-inject.sh — type text into the tmux pane running this agent session.
+#
+# The building block for AUTO-FLUSH: an agent checkpoints its session-context,
+# then has tmux type "/clear" and a resume prompt at its own idle prompt, so a
+# session flushes with no human at the keyboard.
+#
+# Usage:
+# self-inject.sh -t %PANE <delay> <text> [<delay2> <text2> ...]
+# self-inject.sh <delay> <text> [...] # derive pane from ancestry
+# self-inject.sh [-t %PANE] # no pairs: report the pane
+#
+# Each pair: sleep <delay> seconds, then type <text> literally and press Enter.
+#
+# TWO HARD-WON GOTCHAS (2026-07-02, archsetup session):
+# 1. A detached child (setsid/nohup/&) of an agent tool call DIES when the
+# tool call ends — the harness cleans up the process group. The arm step
+# must run under the tmux SERVER instead:
+# tmux run-shell -b "self-inject.sh -t %1 25 '/clear' 15 'go — resume...'"
+# 2. Under tmux run-shell the process is a child of the tmux server, so
+# ancestry-based pane detection CANNOT work there. Derive the pane FIRST,
+# synchronously from the agent's own shell (no -t), then pass it
+# explicitly with -t when arming.
+#
+# Collision hazard: if the user happens to be typing when the send fires, the
+# injected text merges into their input line (a real /clear became "/clearto"
+# mid-word). Auto-flush is for sessions running unattended; warn the user to
+# keep hands off for the armed window if they're present.
+
+PANE=""
+if [ "$1" = "-t" ]; then
+ PANE=$2; shift 2
+fi
+
+ppid_of() {
+ # /proc/<pid>/stat: pid (comm) state ppid ... — comm may contain spaces,
+ # so take the 2nd field after the LAST ')'.
+ stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 1
+ # shellcheck disable=SC2086 # word-splitting the stat tail is the point
+ set -- ${stat##*) }
+ echo "$2"
+}
+
+find_pane() {
+ anc=" "
+ pid=$$
+ while [ -n "$pid" ] && [ "$pid" -gt 1 ] 2>/dev/null; do
+ anc="$anc$pid "
+ pid=$(ppid_of "$pid") || break
+ done
+ tmux list-panes -a -F "#{pane_pid} #{pane_id}" 2>/dev/null | \
+ while read -r ppid pane; do
+ case "$anc" in *" $ppid "*) echo "$pane"; break;; esac
+ done
+}
+
+[ -n "$PANE" ] || PANE=$(find_pane)
+[ -n "$PANE" ] || { echo "self-inject: no owning pane found (pass -t %PANE)" >&2; exit 1; }
+
+# With no delay/text pairs, just report the pane (the derive-first step).
+[ $# -ge 2 ] || { echo "$PANE"; exit 0; }
+
+while [ $# -ge 2 ]; do
+ sleep "$1"
+ tmux send-keys -t "$PANE" -l "$2"
+ tmux send-keys -t "$PANE" Enter
+ shift 2
+done
diff --git a/claude-templates/.ai/scripts/spec-sort b/claude-templates/.ai/scripts/spec-sort
new file mode 100755
index 0000000..ebfef82
--- /dev/null
+++ b/claude-templates/.ai/scripts/spec-sort
@@ -0,0 +1,715 @@
+#!/usr/bin/env python3
+"""spec-sort — one-time docs-pile retrofit for the docs-lifecycle convention.
+
+Classifies every docs/**/*.org outside docs/specs/ by one predicate: a doc
+carrying BOTH a "Decisions" heading AND an "Implementation phases" heading is
+a spec candidate; everything else is a note. For each candidate it shows an
+evidence panel (Status field, decision/finding cookies, the linking todo.org
+task, recent dated history, cheap existence checks on phase-named artifacts)
+and proposes a lifecycle keyword the evidence supports — conservative
+non-terminal (DRAFT) when inconclusive. The helper proposes; a human confirms
+every move.
+
+Dry-run report is the default. --apply executes under the fail-safe contract:
+
+ - Clean-worktree preflight: refuses on a dirty git tree (exit 2) unless
+ --allow-dirty, which prints exactly what recovery loses.
+ - Every candidate must be addressed with --confirm REL=KEYWORD or
+ --skip REL; terminal keywords (IMPLEMENTED SUPERSEDED CANCELLED) also
+ need --reason REL=TEXT, recorded in the status-history line.
+ - The full move + relink plan is computed and validated first (every
+ destination free, every link resolvable), written to a plan file, and
+ only then executed from that recorded plan.
+ - Bare-path mentions of a moving doc inside the rewritten roots are
+ reported, never rewritten; they block --apply until --acknowledge-bare
+ explicitly waives them.
+ - Mid-apply failure stops the run, names what was and wasn't applied, and
+ prints the git-restore recovery recipe (plus deletion of newly created
+ destination copies, which git restore can't remove).
+ - After a successful apply, a residue scan across the rewritten roots must
+ find no link still resolving to an old path, or spec-sort exits non-zero
+ naming the residue.
+
+Per move: rename to carry the -spec.org suffix, prepend the status heading
+(:ID: UUID + dated history line), rewrite the keyword header to the
+two-sequence form, mirror the keyword into the Metadata Status field, and
+recompute every affected file: link (inbound links to the moved doc AND the
+moved doc's own outbound relative links). Rewritten roots: todo.org,
+.ai/notes.org, docs/**, .ai/project-workflows/, .ai/project-scripts/.
+Reported-never-rewritten: .ai/sessions/ (frozen history) and synced template
+paths (.ai/workflows/, .ai/scripts/, .ai/protocols.org — the report names
+the canonical claude-templates file instead).
+
+Finally stamps :LAST_SPEC_SORT: YYYY-MM-DD in .ai/notes.org's
+* Workflow State section (created idempotently), which permanently clears
+the startup nudge. A run with zero candidates still stamps.
+
+Exit codes: 0 done (or clean report), 1 blocked (confirm gate, validation,
+bare mentions, residue, mid-apply failure), 2 usage / preflight refusal.
+
+Test hook: SPEC_SORT_INJECT_FAIL_AFTER=N aborts the apply after N write
+operations, exercising the recovery path in the bats suite.
+"""
+
+import argparse
+import json
+import os
+import re
+import subprocess
+import sys
+import tempfile
+import uuid
+from datetime import datetime
+
+LIFECYCLE = ("DRAFT", "READY", "DOING", "IMPLEMENTED", "SUPERSEDED", "CANCELLED")
+TERMINAL = {"IMPLEMENTED", "SUPERSEDED", "CANCELLED"}
+TODO_HEADER = [
+ "#+TODO: TODO | DONE",
+ "#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED",
+]
+
+# Project-owned surfaces whose file: links get rewritten.
+REWRITE_ROOTS = ("todo.org", ".ai/notes.org", "docs", ".ai/project-workflows", ".ai/project-scripts")
+# Frozen or synced surfaces: occurrences are reported, never rewritten.
+REPORT_ROOTS = (".ai/sessions", ".ai/workflows", ".ai/scripts", ".ai/protocols.org")
+# Synced template paths map to their canonical rulesets file for the report.
+SYNCED_PREFIX = (".ai/workflows", ".ai/scripts", ".ai/protocols.org")
+
+LINK_RE = re.compile(r"\[\[file:([^\]\[]+)\](?:\[([^\]\[]*)\])?\]")
+HEADING_RE = re.compile(r"^(\*+)\s+(.*)$")
+COOKIE_RE = re.compile(r"\[\d+/\d+\]")
+DATED_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
+
+
+def read_text(path):
+ try:
+ with open(path, encoding="utf-8") as f:
+ return f.read()
+ except (UnicodeDecodeError, OSError):
+ return None
+
+
+def heading_text(line):
+ """Heading text with the org keyword and priority cookie stripped."""
+ m = HEADING_RE.match(line)
+ if not m:
+ return None
+ text = re.sub(r"^[A-Z]+\s+", "", m.group(2))
+ text = re.sub(r"^\[#[A-Z]\]\s+", "", text)
+ return text.strip()
+
+
+def has_spine(content):
+ """The classification predicate: Decisions AND Implementation phases."""
+ dec = imp = False
+ for line in content.splitlines():
+ t = heading_text(line)
+ if t is None:
+ continue
+ tl = t.lower()
+ if tl.startswith("decisions"):
+ dec = True
+ elif tl.startswith("implementation phases"):
+ imp = True
+ return dec and imp
+
+
+def walk_files(root, rel_base):
+ """Yield project-relative paths of files under rel_base (file or dir)."""
+ abs_base = os.path.join(root, rel_base)
+ if os.path.isfile(abs_base):
+ yield rel_base
+ return
+ for dirpath, dirs, files in os.walk(abs_base):
+ dirs.sort()
+ for name in sorted(files):
+ yield os.path.relpath(os.path.join(dirpath, name), root)
+
+
+def classify(root):
+ """Split docs/**/*.org outside docs/specs/ into candidates / anomalies / notes."""
+ candidates, anomalies, notes = [], [], []
+ docs = os.path.join(root, "docs")
+ if not os.path.isdir(docs):
+ return candidates, anomalies, notes
+ for rel in walk_files(root, "docs"):
+ if not rel.endswith(".org"):
+ continue
+ parts = rel.split(os.sep)
+ if len(parts) > 1 and parts[1] == "specs":
+ continue
+ content = read_text(os.path.join(root, rel))
+ if content is None:
+ continue
+ if has_spine(content):
+ candidates.append(rel)
+ elif os.path.basename(rel).endswith("-spec.org"):
+ anomalies.append(rel)
+ else:
+ notes.append(rel)
+ return candidates, anomalies, notes
+
+
+def dest_for(rel):
+ base = os.path.basename(rel)
+ if not base.endswith("-spec.org"):
+ base = base[: -len(".org")] + "-spec.org"
+ return os.path.join("docs", "specs", base)
+
+
+# ---- Evidence panel ---------------------------------------------------
+
+
+def todo_task_for(root, rel):
+ """Heading of the first todo.org task whose subtree mentions the doc."""
+ content = read_text(os.path.join(root, "todo.org"))
+ if content is None:
+ return None
+ lines = content.splitlines()
+ basename = os.path.basename(rel)
+ for i, line in enumerate(lines):
+ if basename in line or rel in line:
+ for j in range(i, -1, -1):
+ if HEADING_RE.match(lines[j]):
+ return lines[j].lstrip("* ").strip()
+ return None
+ return None
+
+
+def gather_evidence(root, rel, content):
+ ev = {}
+ m = re.search(r"^\|\s*Status\s*\|\s*([^|]*)\|", content, re.MULTILINE | re.IGNORECASE)
+ ev["status"] = m.group(1).strip() if m else None
+
+ cookies = []
+ for line in content.splitlines():
+ t = heading_text(line)
+ if t and COOKIE_RE.search(t) and (
+ t.lower().startswith("decisions") or t.lower().startswith("review findings")
+ ):
+ cookies.append(t)
+ ev["cookies"] = cookies
+
+ ev["todo"] = todo_task_for(root, rel)
+ kw = None
+ if ev["todo"]:
+ m = re.match(r"([A-Z]+)\s", ev["todo"])
+ kw = m.group(1) if m else None
+ ev["todo_keyword"] = kw
+
+ dated = [ln.strip() for ln in content.splitlines() if DATED_RE.search(ln)]
+ ev["history"] = dated[-1][:100] if dated else None
+
+ # Cheap artifact check: =path= tokens inside the Implementation phases section.
+ artifacts, exists = [], 0
+ section = re.split(r"^\*+\s+.*implementation phases.*$", content, maxsplit=1, flags=re.MULTILINE | re.IGNORECASE)
+ if len(section) > 1:
+ for tok in re.findall(r"=([^=\s]+)=", section[1]):
+ if "/" in tok:
+ artifacts.append(tok)
+ if os.path.exists(os.path.join(root, tok)):
+ exists += 1
+ ev["artifacts"] = (exists, artifacts)
+ return ev
+
+
+def propose_keyword(ev):
+ s = (ev["status"] or "").lower()
+ words = set(re.findall(r"[a-z]+", s))
+ if words & {"implemented", "shipped", "complete", "completed", "done"}:
+ return "IMPLEMENTED"
+ if words & {"superseded"}:
+ return "SUPERSEDED"
+ if words & {"cancelled", "canceled", "dead", "abandoned"}:
+ return "CANCELLED"
+ if words & {"doing", "implementing"} or "in progress" in s or "in-progress" in s:
+ return "DOING"
+ if ev["todo_keyword"] == "DOING":
+ return "DOING"
+ if words & {"ready", "approved", "accepted"}:
+ return "READY"
+ return "DRAFT" # conservative non-terminal default
+
+
+# ---- Link scanning ----------------------------------------------------
+
+
+def rewrite_files(root):
+ """Project-relative *.org files under the rewritten roots."""
+ seen = []
+ for base in REWRITE_ROOTS:
+ if not os.path.exists(os.path.join(root, base)):
+ continue
+ for rel in walk_files(root, base):
+ if rel.endswith(".org") and rel not in seen:
+ seen.append(rel)
+ return seen
+
+
+def resolve_target(root, linker_rel, raw_target, moved):
+ """Resolve a file: link target to a project-relative path (org semantics
+ first — relative to the linking file's directory — then project-root
+ anchoring as a fallback for root-anchored links)."""
+ if raw_target.startswith(("/", "~", "http:", "https:")):
+ return None
+ rel_a = os.path.normpath(os.path.join(os.path.dirname(linker_rel), raw_target))
+ if rel_a in moved or os.path.exists(os.path.join(root, rel_a)):
+ return rel_a
+ rel_b = os.path.normpath(raw_target)
+ if rel_b in moved or os.path.exists(os.path.join(root, rel_b)):
+ return rel_b
+ return rel_a
+
+
+def plan_link_edits(root, moved):
+ """Compute every link rewrite: inbound links to moved docs and moved
+ docs' own outbound relative links. Returns ({linker_rel: [(old, new)]},
+ [ambiguity descriptions]) — a link whose file-relative and root-anchored
+ readings are both live and disagree about a moving doc blocks validation
+ rather than being rewritten against a guess."""
+ edits = {}
+ ambiguous = []
+ for linker in rewrite_files(root):
+ content = read_text(os.path.join(root, linker))
+ if content is None:
+ continue
+ linker_post = moved.get(linker, linker)
+ for m in LINK_RE.finditer(content):
+ raw = m.group(1)
+ desc = m.group(2)
+ target_path, sep, anchor = raw.partition("::")
+ target = resolve_target(root, linker, target_path, moved)
+ if target is None:
+ continue
+ rel_a = os.path.normpath(os.path.join(os.path.dirname(linker), target_path))
+ rel_b = os.path.normpath(target_path)
+ if rel_a != rel_b:
+ live_a = rel_a in moved or os.path.exists(os.path.join(root, rel_a))
+ live_b = rel_b in moved or os.path.exists(os.path.join(root, rel_b))
+ if live_a and live_b and (rel_a in moved or rel_b in moved):
+ ambiguous.append(
+ "%s: [[file:%s]] reads as %s (file-relative) or %s (root-anchored) "
+ "and a moving doc is involved — resolve the link by hand" % (linker, raw, rel_a, rel_b))
+ continue
+ if target not in moved and linker not in moved:
+ continue
+ if target not in moved and not os.path.exists(os.path.join(root, target)):
+ continue # already broken before this run; not ours to guess
+ target_post = moved.get(target, target)
+ new_path = os.path.relpath(target_post, os.path.dirname(linker_post) or ".")
+ new_raw = new_path + (sep + anchor if sep else "")
+ if new_raw == raw:
+ continue
+ new_link = "[[file:%s]%s]" % (new_raw, "[%s]" % desc if desc is not None else "")
+ if m.group(0) != new_link:
+ edits.setdefault(linker, []).append((m.group(0), new_link))
+ return edits, ambiguous
+
+
+def scan_bare_mentions(root, moved):
+ """Bare-path mentions of moving docs in the rewritten roots — text
+ occurrences outside any [[...]] link. Reported, never rewritten."""
+ found = []
+ for base in REWRITE_ROOTS:
+ if not os.path.exists(os.path.join(root, base)):
+ continue
+ for rel in walk_files(root, base):
+ content = read_text(os.path.join(root, rel))
+ if content is None:
+ continue
+ for i, line in enumerate(content.splitlines(), 1):
+ stripped = re.sub(r"\[\[[^\]]*\](?:\[[^\]]*\])?\]", "", line)
+ for src in moved:
+ if src in stripped:
+ found.append((rel, i, src))
+ return found
+
+
+def scan_report_only(root, moved):
+ """Occurrences of moving docs in frozen/synced surfaces."""
+ reports = []
+ for base in REPORT_ROOTS:
+ if not os.path.exists(os.path.join(root, base)):
+ continue
+ for rel in walk_files(root, base):
+ content = read_text(os.path.join(root, rel))
+ if content is None:
+ continue
+ for src in moved:
+ if src in content:
+ if rel.startswith(SYNCED_PREFIX):
+ note = ("synced template, not rewritten — a local edit is reverted by the "
+ "next sync; edit the canonical claude-templates/%s instead" % rel)
+ else:
+ note = "frozen history; not rewritten"
+ reports.append((rel, src, note))
+ return reports
+
+
+# ---- Content transforms -----------------------------------------------
+
+
+def transform_spec(content, keyword, reason, title, doc_id, link_edits):
+ """Apply the retrofit rewrite to a moving spec's content: two-sequence
+ keyword header, prepended status heading, Status-field mirror, and the
+ doc's own link edits."""
+ for old, new in link_edits:
+ content = content.replace(old, new)
+ lines = content.splitlines()
+
+ todo_idx = None
+ kept = []
+ for line in lines:
+ if line.startswith("#+TODO:"):
+ if todo_idx is None:
+ todo_idx = len(kept)
+ continue
+ kept.append(line)
+ lines = kept
+ if todo_idx is None:
+ todo_idx = 0
+ while todo_idx < len(lines) and lines[todo_idx].startswith("#+"):
+ todo_idx += 1
+ lines[todo_idx:todo_idx] = TODO_HEADER
+
+ head_end = 0
+ while head_end < len(lines) and (lines[head_end].startswith("#+") or not lines[head_end].strip()):
+ head_end += 1
+ ts = datetime.now().astimezone().strftime("%Y-%m-%d %a @ %H:%M:%S %z")
+ provenance = "reason: %s" % reason if reason else "evidence-based, human-confirmed"
+ block = [
+ "* %s %s" % (keyword, title),
+ ":PROPERTIES:",
+ ":ID: %s" % doc_id,
+ ":END:",
+ "- %s — retrofitted by spec-sort; status set to %s (%s)" % (ts, keyword, provenance),
+ "",
+ ]
+ lines[head_end:head_end] = block
+
+ out = []
+ mirrored = False
+ for line in lines:
+ m = re.match(r"^(\|\s*Status\s*\|)([^|]*)(\|.*)$", line, re.IGNORECASE)
+ if m and not mirrored:
+ value = " %s" % keyword.lower()
+ width = len(m.group(2))
+ line = m.group(1) + (value.ljust(width) if len(value) <= width else value + " ") + m.group(3)
+ mirrored = True
+ out.append(line)
+ return "\n".join(out) + "\n"
+
+
+def title_for(content, rel):
+ m = re.search(r"^#\+TITLE:\s*(.+)$", content, re.MULTILINE | re.IGNORECASE)
+ if m:
+ return m.group(1).strip()
+ base = os.path.basename(rel)[: -len(".org")]
+ return base[: -len("-spec")] if base.endswith("-spec") else base
+
+
+# ---- Marker ------------------------------------------------------------
+
+
+def stamp_marker(root, date):
+ path = os.path.join(root, ".ai", "notes.org")
+ os.makedirs(os.path.dirname(path), exist_ok=True)
+ content = read_text(path) or ""
+ line = ":LAST_SPEC_SORT: %s" % date
+ if ":LAST_SPEC_SORT:" in content:
+ content = re.sub(r":LAST_SPEC_SORT:.*", line, content, count=1)
+ elif re.search(r"^\* Workflow State\s*$", content, re.MULTILINE):
+ content = re.sub(r"(^\* Workflow State\s*$)", r"\1\n" + line, content, count=1, flags=re.MULTILINE)
+ else:
+ if content and not content.endswith("\n"):
+ content += "\n"
+ content += "\n* Workflow State\n\n%s\n" % line
+ with open(path, "w", encoding="utf-8") as f:
+ f.write(content)
+
+
+# ---- Apply -------------------------------------------------------------
+
+
+class ApplyFailure(Exception):
+ """Mid-apply failure: args are (applied_labels, remaining_ops, cause)."""
+
+
+def apply_plan(root, plan, fail_after):
+ """Execute the recorded plan. Returns the applied-op labels; raises
+ ApplyFailure mid-way on a write error or when the test hook fires."""
+ ops = []
+ for mv in plan["moves"]:
+ ops.append(("move", mv))
+ for linker, edits in plan["link_edits"].items():
+ if linker in {mv["src"] for mv in plan["moves"]}:
+ continue # a moving doc's own edits ride along in its transform
+ ops.append(("relink", (linker, edits)))
+
+ applied = []
+ specs_dir = os.path.join(root, "docs", "specs")
+ if plan["moves"] and not os.path.isdir(specs_dir):
+ os.makedirs(specs_dir)
+ plan["created_dirs"].append(os.path.join("docs", "specs"))
+
+ for n, (kind, payload) in enumerate(ops, 1):
+ if fail_after and n > fail_after:
+ raise ApplyFailure(applied, ops[n - 1:], "injected test failure")
+ try:
+ if kind == "move":
+ mv = payload
+ content = read_text(os.path.join(root, mv["src"]))
+ new = transform_spec(content, mv["keyword"], mv["reason"], mv["title"], mv["id"],
+ plan["link_edits"].get(mv["src"], []))
+ with open(os.path.join(root, mv["dest"]), "w", encoding="utf-8") as f:
+ f.write(new)
+ os.remove(os.path.join(root, mv["src"]))
+ applied.append("move %s -> %s" % (mv["src"], mv["dest"]))
+ else:
+ linker, edits = payload
+ path = os.path.join(root, linker)
+ content = read_text(path)
+ for old, new in edits:
+ content = content.replace(old, new)
+ with open(path, "w", encoding="utf-8") as f:
+ f.write(content)
+ applied.append("relink %s (%d link%s)" % (linker, len(edits), "s" if len(edits) != 1 else ""))
+ except OSError as exc:
+ raise ApplyFailure(applied, ops[n - 1:], str(exc))
+ return applied
+
+
+def residue_check(root, plan):
+ """Post-apply: no link in the rewritten roots may still resolve to an
+ old path; bare mentions beyond the acknowledged set fail too."""
+ moved = {mv["src"]: mv["dest"] for mv in plan["moves"]}
+ residue = []
+ for linker in rewrite_files(root):
+ content = read_text(os.path.join(root, linker))
+ if content is None:
+ continue
+ for m in LINK_RE.finditer(content):
+ target_path = m.group(1).partition("::")[0]
+ target = resolve_target(root, linker, target_path, {})
+ if target in moved:
+ residue.append("%s: link still resolves to %s" % (linker, target))
+ # Acknowledged mentions were recorded pre-apply; a mention inside a moved
+ # doc now lives at the doc's destination, so map the file side through the
+ # moves before comparing.
+ acknowledged = {(moved.get(f, f), src) for f, _ln, src in plan["bare"]}
+ for f, ln, src in scan_bare_mentions(root, moved):
+ if (f, src) not in acknowledged:
+ residue.append("%s:%d: bare mention of %s" % (f, ln, src))
+ return residue
+
+
+def print_recovery(plan, applied, not_applied):
+ print("FAILURE — the apply did not complete.")
+ print(" applied:")
+ for a in applied or ["(nothing)"]:
+ print(" %s" % a)
+ print(" not applied:")
+ for kind, payload in not_applied:
+ if kind == "move":
+ print(" move %s -> %s" % (payload["src"], payload["dest"]))
+ else:
+ print(" relink %s" % payload[0])
+ print("RECOVERY — restore the pre-run state (safe: preflight required a clean tree):")
+ touched = [mv["src"] for mv in plan["moves"]] + [l for l in plan["link_edits"] if l not in {mv["src"] for mv in plan["moves"]}]
+ print(" git restore -- %s" % " ".join(touched))
+ created = [mv["dest"] for mv in plan["moves"]]
+ print(" rm -f -- %s # git restore can't remove the created copies" % " ".join(created))
+ for d in plan.get("created_dirs", []):
+ print(" rmdir --ignore-fail-on-non-empty -- %s" % d)
+
+
+# ---- Main ---------------------------------------------------------------
+
+
+def parse_kv(pairs, label):
+ out = {}
+ for item in pairs or []:
+ if "=" not in item:
+ sys.exit("spec-sort: %s expects REL=VALUE, got %r" % (label, item))
+ k, v = item.split("=", 1)
+ out[os.path.normpath(k)] = v
+ return out
+
+
+def main():
+ ap = argparse.ArgumentParser(prog="spec-sort", add_help=True)
+ ap.add_argument("--project-root", default=".")
+ ap.add_argument("--apply", action="store_true")
+ ap.add_argument("--allow-dirty", action="store_true")
+ ap.add_argument("--acknowledge-bare", action="store_true")
+ ap.add_argument("--confirm", action="append", metavar="REL=KEYWORD")
+ ap.add_argument("--reason", action="append", metavar="REL=TEXT")
+ ap.add_argument("--skip", action="append", metavar="REL")
+ ap.add_argument("--plan-file")
+ args = ap.parse_args()
+
+ root = os.path.abspath(args.project_root)
+ confirms = parse_kv(args.confirm, "--confirm")
+ reasons = parse_kv(args.reason, "--reason")
+ skips = {os.path.normpath(s) for s in (args.skip or [])}
+
+ candidates, anomalies, notes = classify(root)
+ if not candidates and not anomalies and not notes and not os.path.isdir(os.path.join(root, "docs")):
+ return 0 # no docs pile at all — silent no-op
+
+ for named in list(confirms) + list(skips) + list(reasons):
+ if named not in candidates:
+ print("spec-sort: %s is not a spec candidate" % named)
+ return 1
+ for rel, kw in confirms.items():
+ if kw not in LIFECYCLE:
+ print("spec-sort: %r is not a lifecycle keyword (%s)" % (kw, " ".join(LIFECYCLE)))
+ return 1
+
+ # ---- Build the plan (shared by report and apply) ----
+ moves = []
+ for rel in candidates:
+ if rel in skips:
+ continue
+ if args.apply and rel not in confirms:
+ continue # gate failure reported below
+ content = read_text(os.path.join(root, rel))
+ moves.append({
+ "src": rel,
+ "dest": dest_for(rel),
+ "keyword": confirms.get(rel, None),
+ "reason": reasons.get(rel),
+ "title": title_for(content, rel),
+ "id": str(uuid.uuid4()),
+ })
+ moved_map = {mv["src"]: mv["dest"] for mv in moves}
+ link_edits, ambiguous = plan_link_edits(root, moved_map)
+ bare = scan_bare_mentions(root, moved_map)
+ reports = scan_report_only(root, moved_map)
+
+ # ---- Report ----
+ for rel in candidates:
+ content = read_text(os.path.join(root, rel))
+ ev = gather_evidence(root, rel, content)
+ proposed = propose_keyword(ev)
+ print("CANDIDATE %s -> %s" % (rel, dest_for(rel)))
+ suffix = " (terminal — requires --reason to apply)" if proposed in TERMINAL else ""
+ print(" proposed keyword: %s%s" % (proposed, suffix))
+ print(" evidence:")
+ print(" status field: %s" % (ev["status"] or "(none)"))
+ print(" cookies: %s" % ("; ".join(ev["cookies"]) or "(none)"))
+ print(" todo.org: %s" % (ev["todo"] or "(no linking task)"))
+ print(" history: %s" % (ev["history"] or "(none)"))
+ n_exist, artifacts = ev["artifacts"]
+ if artifacts:
+ print(" artifacts: %d/%d named paths exist (%s)" % (n_exist, len(artifacts), ", ".join(artifacts)))
+ else:
+ print(" artifacts: (none named)")
+ for rel in anomalies:
+ print("ANOMALY %s: named -spec.org but lacks the spec spine (Decisions + Implementation phases); surfaced, not moved" % rel)
+ for rel in notes:
+ print("NOTE %s" % rel)
+ for linker, edits in sorted(link_edits.items()):
+ for old, new in edits:
+ print("RELINK %s: %s -> %s" % (linker, old, new))
+ for a in ambiguous:
+ print("AMBIGUOUS %s" % a)
+ for f, ln, src in bare:
+ print("BARE-PATH %s:%d: %s (reported for manual handling, never rewritten)" % (f, ln, src))
+ for rel, src, note in reports:
+ print("REPORT %s: reference to %s (%s)" % (rel, src, note))
+
+ if not args.apply:
+ if candidates or anomalies or notes:
+ print("DRY RUN — no changes written. Pass --apply with per-candidate --confirm/--skip to execute.")
+ return 0
+
+ # ---- Apply: preflight ----
+ try:
+ porcelain = subprocess.run(
+ ["git", "status", "--porcelain"], cwd=root,
+ capture_output=True, text=True, check=True,
+ ).stdout
+ except (subprocess.CalledProcessError, FileNotFoundError):
+ print("spec-sort: --apply needs a git worktree (recovery depends on git restore)")
+ return 2
+ if porcelain.strip():
+ dirty = [ln[3:] for ln in porcelain.splitlines()]
+ if not args.allow_dirty:
+ print("spec-sort: refusing --apply on a dirty worktree (%d path%s). Commit or stash first, or pass --allow-dirty."
+ % (len(dirty), "s" if len(dirty) != 1 else ""))
+ return 2
+ print("WARNING --allow-dirty: recovery via git restore would also revert your pre-existing uncommitted changes:")
+ for p in dirty:
+ print(" %s" % p)
+
+ # ---- Apply: confirm gate ----
+ unaddressed = [rel for rel in candidates if rel not in confirms and rel not in skips]
+ if unaddressed:
+ print("spec-sort: unconfirmed candidate(s) — pass --confirm REL=KEYWORD or --skip REL for each:")
+ for rel in unaddressed:
+ print(" %s" % rel)
+ return 1
+ for mv in moves:
+ if mv["keyword"] in TERMINAL and not mv["reason"]:
+ print("spec-sort: %s -> %s is a terminal state and requires an explicit --reason %s=TEXT"
+ % (mv["src"], mv["keyword"], mv["src"]))
+ return 1
+
+ # ---- Apply: validation ----
+ problems = []
+ dests = {}
+ for mv in moves:
+ if os.path.exists(os.path.join(root, mv["dest"])):
+ problems.append("%s: destination exists (%s)" % (mv["src"], mv["dest"]))
+ if mv["dest"] in dests:
+ problems.append("%s and %s: destination exists twice (%s)" % (mv["src"], dests[mv["dest"]], mv["dest"]))
+ dests[mv["dest"]] = mv["src"]
+ for a in ambiguous:
+ problems.append("ambiguous link: %s" % a)
+ if bare and not args.acknowledge_bare:
+ problems.append("bare-path mention(s) listed above need manual handling — re-run with --acknowledge-bare to proceed without rewriting them")
+ if problems:
+ print("spec-sort: validation blocked — nothing written:")
+ for p in problems:
+ print(" %s" % p)
+ return 1
+
+ # ---- Apply: record the plan, then execute from it ----
+ today = datetime.now().astimezone().strftime("%Y-%m-%d")
+ plan = {
+ "root": root, "date": today, "moves": moves,
+ "link_edits": link_edits, "bare": bare,
+ "reports": [list(r) for r in reports], "created_dirs": [],
+ }
+ plan_path = args.plan_file or os.path.join(
+ tempfile.gettempdir(), "spec-sort-plan-%s.json" % os.path.basename(root))
+ with open(plan_path, "w", encoding="utf-8") as f:
+ json.dump(plan, f, indent=2)
+ print("plan written: %s" % plan_path)
+
+ fail_after = int(os.environ.get("SPEC_SORT_INJECT_FAIL_AFTER", "0") or 0)
+ try:
+ applied = apply_plan(root, plan, fail_after)
+ except ApplyFailure as exc:
+ print("write failed: %s" % exc.args[2])
+ print_recovery(plan, exc.args[0], exc.args[1])
+ return 1
+
+ residue = residue_check(root, plan)
+ if residue:
+ print("spec-sort: residue after apply — old paths still referenced:")
+ for r in residue:
+ print(" %s" % r)
+ print_recovery(plan, applied, [])
+ return 1
+
+ stamp_marker(root, today)
+ for a in applied:
+ print("applied: %s" % a)
+ print("spec-sort: done — %d spec(s) sorted, :LAST_SPEC_SORT: %s stamped" % (len(moves), today))
+ return 0
+
+
+if __name__ == "__main__":
+ sys.exit(main())
diff --git a/claude-templates/.ai/scripts/tests/route-batch.bats b/claude-templates/.ai/scripts/tests/route-batch.bats
new file mode 100644
index 0000000..84ded5f
--- /dev/null
+++ b/claude-templates/.ai/scripts/tests/route-batch.bats
@@ -0,0 +1,202 @@
+#!/usr/bin/env bats
+#
+# Tests for claude-templates/.ai/scripts/route-batch — the wrap-up router's
+# mechanical go path (wrapup-routing spec, Phase 4 / D7 / D9).
+#
+# Contract under test:
+# route-batch --list one "<destination>\t<heading>" line per task
+# carrying :ROUTE_CANDIDATE:; silent when none;
+# never modifies anything
+# route-batch --go per candidate: write the subtree (minus the
+# :ROUTE_CANDIDATE: line) as a one-task handoff,
+# deliver via inbox-send to the destination's
+# inbox/, then remove the subtree from the local
+# todo.org. Send failure leaves the task in
+# place and exits non-zero. Empty set: no-op.
+#
+# Strategy: fixture roots under $TEST_DIR hold a source project and two
+# destination projects; INBOX_SEND_ROOTS sandboxes inbox-send's discovery to
+# them (the same hook inbox-send's own tests use).
+
+SCRIPT="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)/route-batch"
+
+setup() {
+ TEST_DIR="$(mktemp -d -t route-batch-bats.XXXXXX)"
+ ROOTS="$TEST_DIR/roots"
+ SRC="$ROOTS/srcproj"
+ mkdir -p "$SRC/.ai" "$SRC/inbox" \
+ "$ROOTS/alpha/.ai" "$ROOTS/alpha/inbox" \
+ "$ROOTS/beta/.ai" "$ROOTS/beta/inbox"
+ touch "$ROOTS/alpha/todo.org" # alpha has a todo.org; beta deliberately not
+
+ cat > "$SRC/todo.org" <<'EOF'
+* Srcproj Open Work
+** TODO [#B] Alpha-bound task :feature:
+:PROPERTIES:
+:ROUTE_CANDIDATE: alpha
+:END:
+Body line about the alpha work.
+*** TODO Sub-task that rides along
+** TODO [#C] Purely local task
+Local body stays put.
+** TODO [#C] Beta-bound task :quick:
+:PROPERTIES:
+:CREATED: [2026-07-01 Tue]
+:ROUTE_CANDIDATE: beta
+:END:
+Beta body.
+EOF
+
+ export INBOX_SEND_ROOTS="$ROOTS"
+ cd "$SRC"
+}
+
+teardown() {
+ rm -rf "$TEST_DIR"
+}
+
+# ---- --list ------------------------------------------------------------
+
+@test "route-batch --list: one destination+heading line per candidate, backlog excluded" {
+ run "$SCRIPT" --list
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"alpha"*"Alpha-bound task"* ]]
+ [[ "$output" == *"beta"*"Beta-bound task"* ]]
+ [[ "$output" != *"Purely local task"* ]]
+}
+
+@test "route-batch --list: empty candidate set is silent (exit 0)" {
+ sed -i '/:ROUTE_CANDIDATE:/d' todo.org
+ run "$SCRIPT" --list
+ [ "$status" -eq 0 ]
+ [ -z "$output" ]
+}
+
+@test "route-batch --list: modifies nothing (skip leaves all in place)" {
+ before="$(cat todo.org)"
+ run "$SCRIPT" --list
+ [ "$status" -eq 0 ]
+ [ "$(cat todo.org)" = "$before" ]
+ [ -z "$(ls "$ROOTS/alpha/inbox" "$ROOTS/beta/inbox" 2>/dev/null | grep -v ':')" ]
+}
+
+# ---- --go --------------------------------------------------------------
+
+@test "route-batch --go: delivers each candidate to its destination inbox with provenance" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ alpha_file=$(find "$ROOTS/alpha/inbox" -name '*from-srcproj*' -type f)
+ beta_file=$(find "$ROOTS/beta/inbox" -name '*from-srcproj*' -type f)
+ [ -n "$alpha_file" ]
+ [ -n "$beta_file" ]
+ grep -q 'Alpha-bound task' "$alpha_file"
+ grep -q 'Sub-task that rides along' "$alpha_file" # children ride along
+ grep -q 'Beta-bound task' "$beta_file"
+ ! grep -q ':ROUTE_CANDIDATE:' "$alpha_file"
+ ! grep -q ':ROUTE_CANDIDATE:' "$beta_file"
+}
+
+@test "route-batch --go: removes routed subtrees from todo.org, leaves local tasks" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ ! grep -q 'Alpha-bound task' todo.org
+ ! grep -q 'Sub-task that rides along' todo.org
+ ! grep -q 'Beta-bound task' todo.org
+ grep -q 'Purely local task' todo.org
+ grep -q 'Local body stays put' todo.org
+}
+
+@test "route-batch --go: a kept property drawer survives minus the marker" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ beta_file=$(find "$ROOTS/beta/inbox" -name '*from-srcproj*' -type f)
+ grep -q ':CREATED: \[2026-07-01 Tue\]' "$beta_file"
+}
+
+@test "route-batch --go: destination with inbox/ but no todo.org still delivers" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ [ ! -f "$ROOTS/beta/todo.org" ]
+ [ -n "$(find "$ROOTS/beta/inbox" -name '*from-srcproj*' -type f)" ]
+}
+
+@test "route-batch --go: empty candidate set is a silent no-op (exit 0)" {
+ sed -i '/:ROUTE_CANDIDATE:/d' todo.org
+ before="$(cat todo.org)"
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ [ -z "$output" ]
+ [ "$(cat todo.org)" = "$before" ]
+}
+
+@test "route-batch --go: a failed send leaves that task in place, marker intact, and exits non-zero" {
+ sed -i 's/:ROUTE_CANDIDATE: beta/:ROUTE_CANDIDATE: ghost/' todo.org
+ run "$SCRIPT" --go
+ [ "$status" -ne 0 ]
+ grep -q 'Beta-bound task' todo.org # failed route stays local
+ grep -q ':ROUTE_CANDIDATE: ghost' todo.org # marker survives so it resurfaces next wrap
+ ! grep -q 'Alpha-bound task' todo.org # the good route still landed
+ [ -n "$(find "$ROOTS/alpha/inbox" -name '*from-srcproj*' -type f)" ]
+}
+
+@test "route-batch --go: handoff headings are promoted to top level" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ alpha_file=$(find "$ROOTS/alpha/inbox" -name '*from-srcproj*' -type f)
+ grep -q '^\* TODO \[#B\] Alpha-bound task' "$alpha_file"
+ grep -q '^\*\* TODO Sub-task that rides along' "$alpha_file"
+}
+
+@test "route-batch --go: a drawer emptied by the marker strip is pruned from the handoff" {
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ alpha_file=$(find "$ROOTS/alpha/inbox" -name '*from-srcproj*' -type f)
+ ! grep -q ':PROPERTIES:' "$alpha_file"
+}
+
+# ---- Overlapping candidates (nested marker data-loss regression) --------
+
+@test "route-batch --go: nested candidates conflict — both stay, bystander survives, exit non-zero" {
+ cat > todo.org <<'EOF'
+* Srcproj Open Work
+** TODO [#B] Parent bound for alpha
+:PROPERTIES:
+:ROUTE_CANDIDATE: alpha
+:END:
+Parent body.
+*** TODO Child bound for beta
+:PROPERTIES:
+:ROUTE_CANDIDATE: beta
+:END:
+Child body.
+** TODO [#C] Innocent bystander task
+Bystander body.
+EOF
+ run "$SCRIPT" --go
+ [ "$status" -ne 0 ]
+ [[ "$output" == *"CONFLICT"* ]]
+ grep -q 'Parent bound for alpha' todo.org
+ grep -q 'Child bound for beta' todo.org
+ grep -q 'Innocent bystander task' todo.org
+ grep -q 'Bystander body' todo.org
+ [ -z "$(find "$ROOTS/alpha/inbox" "$ROOTS/beta/inbox" -name '*from-srcproj*' -type f)" ]
+}
+
+@test "route-batch: duplicate identical markers in one drawer dedupe to a single route" {
+ cat > todo.org <<'EOF'
+* Srcproj Open Work
+** TODO [#B] Double-tagged for alpha
+:PROPERTIES:
+:ROUTE_CANDIDATE: alpha
+:ROUTE_CANDIDATE: alpha
+:END:
+Body.
+EOF
+ run "$SCRIPT" --list
+ [ "$status" -eq 0 ]
+ [ "$(echo "$output" | grep -c 'Double-tagged')" -eq 1 ]
+ [[ "$output" != *"CONFLICT"* ]]
+ run "$SCRIPT" --go
+ [ "$status" -eq 0 ]
+ [ "$(find "$ROOTS/alpha/inbox" -name '*from-srcproj*' -type f | wc -l)" -eq 1 ]
+}
diff --git a/claude-templates/.ai/scripts/tests/self-inject.bats b/claude-templates/.ai/scripts/tests/self-inject.bats
new file mode 100644
index 0000000..482f61d
--- /dev/null
+++ b/claude-templates/.ai/scripts/tests/self-inject.bats
@@ -0,0 +1,78 @@
+#!/usr/bin/env bats
+# Tests for self-inject.sh — tmux is the external boundary, stubbed with a
+# recording fake so no real server is needed.
+
+setup() {
+ SCRIPT="$BATS_TEST_DIRNAME/../self-inject.sh"
+ STUB_DIR="$BATS_TEST_TMPDIR/bin"
+ LOG="$BATS_TEST_TMPDIR/tmux.log"
+ mkdir -p "$STUB_DIR"
+}
+
+# A tmux stub that records every invocation and answers list-panes from
+# $STUB_PANES (empty by default, so pane derivation fails unless a test
+# provides ancestry-matching output).
+make_stub() {
+ cat > "$STUB_DIR/tmux" <<'EOF'
+#!/bin/sh
+echo "$@" >> "$LOG"
+case "$1" in
+ list-panes) printf '%s\n' "$STUB_PANES" ;;
+esac
+EOF
+ chmod +x "$STUB_DIR/tmux"
+}
+
+@test "self-inject: -t pane with no pairs echoes the pane and exits 0" {
+ make_stub
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="" sh "$SCRIPT" -t %42
+ [ "$status" -eq 0 ]
+ [ "$output" = "%42" ]
+ # Pane was supplied, nothing sent: tmux must not have been called.
+ [ ! -e "$LOG" ]
+}
+
+@test "self-inject: no pane derivable and no -t exits 1 with an error" {
+ make_stub
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="" sh "$SCRIPT" 0 "hello"
+ [ "$status" -eq 1 ]
+ case "$output" in *"no owning pane"*) : ;; *) false ;; esac
+}
+
+@test "self-inject: derives the pane from process ancestry via list-panes" {
+ make_stub
+ # The stub reports the bats test process itself as a pane's pane_pid;
+ # the script runs as our child, so that pid is in its ancestry.
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="$$ %7" sh "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [ "$output" = "%7" ]
+}
+
+@test "self-inject: one delay/text pair sends literal text then Enter" {
+ make_stub
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="" sh "$SCRIPT" -t %3 0 "/clear"
+ [ "$status" -eq 0 ]
+ run cat "$LOG"
+ [ "${lines[0]}" = "send-keys -t %3 -l /clear" ]
+ [ "${lines[1]}" = "send-keys -t %3 Enter" ]
+}
+
+@test "self-inject: multiple pairs send in order" {
+ make_stub
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="" \
+ sh "$SCRIPT" -t %3 0 "/clear" 0 "go — resume"
+ [ "$status" -eq 0 ]
+ run cat "$LOG"
+ [ "${lines[0]}" = "send-keys -t %3 -l /clear" ]
+ [ "${lines[1]}" = "send-keys -t %3 Enter" ]
+ [ "${lines[2]}" = "send-keys -t %3 -l go — resume" ]
+ [ "${lines[3]}" = "send-keys -t %3 Enter" ]
+}
+
+@test "self-inject: dangling odd argument after pairs is ignored" {
+ make_stub
+ run env PATH="$STUB_DIR:$PATH" LOG="$LOG" STUB_PANES="" sh "$SCRIPT" -t %3 0 "one" 99
+ [ "$status" -eq 0 ]
+ run cat "$LOG"
+ [ "${#lines[@]}" -eq 2 ]
+}
diff --git a/claude-templates/.ai/scripts/tests/spec-sort.bats b/claude-templates/.ai/scripts/tests/spec-sort.bats
new file mode 100644
index 0000000..583e458
--- /dev/null
+++ b/claude-templates/.ai/scripts/tests/spec-sort.bats
@@ -0,0 +1,453 @@
+#!/usr/bin/env bats
+#
+# Tests for claude-templates/.ai/scripts/spec-sort — the one-time docs-pile
+# retrofit from the docs-lifecycle spec: classify docs/**/*.org outside
+# docs/specs/ (spec candidate iff it carries BOTH a Decisions heading AND an
+# Implementation phases heading), show an evidence panel, and on --apply
+# move + rename confirmed candidates to docs/specs/*-spec.org, prepend the
+# status heading (:ID:, dated history line), rewrite the keyword header to
+# the two-sequence form, relink file: links across the rewritten roots,
+# stamp :LAST_SPEC_SORT: in .ai/notes.org.
+#
+# Contract under test (docs/specs/2026-07-01-docs-lifecycle-spec.org,
+# "The retrofit"):
+# - dry-run report is the default; --apply writes
+# - --apply refuses on a dirty worktree (exit 2) unless --allow-dirty
+# - every candidate needs --confirm REL=KEYWORD or --skip REL (exit 1
+# otherwise); terminal keywords need --reason REL=TEXT
+# - plan validated before the first write; destination collisions block
+# - bare-path mentions in rewritten roots block --apply until
+# --acknowledge-bare waives them (reported, never rewritten)
+# - mid-apply failure names applied/not-applied + git restore recovery
+# - idempotent: a sorted project yields no candidates, no changes
+#
+# Strategy: each test builds a throwaway git project fixture and runs the
+# real script against it. Mid-apply failure is forced via the test-only
+# SPEC_SORT_INJECT_FAIL_AFTER env hook.
+
+SCRIPT="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)/spec-sort"
+
+setup() {
+ TEST_DIR="$(mktemp -d -t spec-sort-bats.XXXXXX)"
+ PROJ="$TEST_DIR/proj"
+ mkdir -p "$PROJ"
+}
+
+teardown() {
+ rm -rf "$TEST_DIR"
+}
+
+# Standard fixture: one spec candidate, one note, a stray root spec with a
+# spine, an anomaly (-spec.org name, no spine), inbound links from todo.org,
+# a sibling note, a session archive (report-only surface), and .ai/notes.org
+# with a Workflow State section.
+make_project() {
+ cd "$PROJ"
+ git init -q
+ git config user.email test@test
+ git config user.name test
+ mkdir -p docs/design .ai/sessions
+
+ cat > docs/design/widget.org <<'EOF'
+#+TITLE: Widget Feature
+#+DATE: 2026-05-01
+#+TODO: DRAFT REVIEW | SHIPPED
+
+* Metadata
+| Status | draft |
+| Owner | Craig |
+
+* Summary
+The widget feature. See [[file:scratch-note.org][the note]].
+
+* Decisions [1/2]
+** DONE Pick the widget shape
+** TODO Pick the color
+
+* Implementation phases
+** Phase 1 — build =src/widget.py=
+EOF
+
+ cat > docs/design/scratch-note.org <<'EOF'
+#+TITLE: Scratch Note
+
+* Metadata
+| Status | n/a |
+
+* Thoughts
+See [[file:widget.org][the widget spec]].
+EOF
+
+ cat > docs/rooty-spec.org <<'EOF'
+#+TITLE: Rooty
+
+* Decisions
+** DONE Only decision
+
+* Implementation phases
+** Phase 1 — nothing
+EOF
+
+ cat > docs/lonely-spec.org <<'EOF'
+#+TITLE: Lonely
+Just prose, no spine.
+EOF
+
+ cat > todo.org <<'EOF'
+* Open Work
+** DOING [#B] Widget feature
+Spec: [[file:docs/design/widget.org][widget spec]].
+Summary anchor: [[file:docs/design/widget.org::*Summary][the summary]].
+EOF
+
+ cat > .ai/notes.org <<'EOF'
+* Active Reminders
+
+* Workflow State
+:LAST_AUDIT: 2026-06-28
+EOF
+
+ cat > .ai/sessions/2026-06-01-old.org <<'EOF'
+Old log: [[file:../../docs/design/widget.org][widget]]
+EOF
+
+ git add -A
+ git commit -qm init
+}
+
+# Confirm flags that satisfy the gate for the standard fixture's candidates.
+CONFIRM_ALL=(--confirm docs/design/widget.org=DRAFT --confirm docs/rooty-spec.org=DRAFT)
+
+# ---- Classification (dry-run) ----------------------------------------
+
+@test "spec-sort: dry-run classifies the spine-carrying doc as a candidate" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"CANDIDATE docs/design/widget.org -> docs/specs/widget-spec.org"* ]]
+}
+
+@test "spec-sort: a Metadata table alone does not qualify — note stays a note" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"NOTE docs/design/scratch-note.org"* ]]
+ [[ "$output" != *"CANDIDATE docs/design/scratch-note.org"* ]]
+}
+
+@test "spec-sort: stray root spec with a spine is a candidate, suffix not doubled" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"CANDIDATE docs/rooty-spec.org -> docs/specs/rooty-spec.org"* ]]
+ [[ "$output" != *"rooty-spec-spec.org"* ]]
+}
+
+@test "spec-sort: -spec.org name without a spine is an anomaly, never auto-moved" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"ANOMALY docs/lonely-spec.org"* ]]
+ [[ "$output" != *"CANDIDATE docs/lonely-spec.org"* ]]
+}
+
+@test "spec-sort: docs/specs/ contents are excluded from classification" {
+ make_project
+ mkdir -p docs/specs
+ cp docs/design/widget.org docs/specs/sorted-spec.org
+ git add -A && git commit -qm more
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" != *"CANDIDATE docs/specs/sorted-spec.org"* ]]
+}
+
+@test "spec-sort: no docs/ directory is a silent no-op" {
+ cd "$PROJ"
+ git init -q
+ git config user.email test@test
+ git config user.name test
+ echo x > README.md
+ git add -A && git commit -qm init
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [ -z "$output" ]
+}
+
+# ---- Evidence panel ---------------------------------------------------
+
+@test "spec-sort: evidence panel shows status field, cookies, and todo.org task" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"status field: draft"* ]]
+ [[ "$output" == *"Decisions [1/2]"* ]]
+ [[ "$output" == *"todo.org:"*"DOING"*"Widget feature"* ]]
+}
+
+@test "spec-sort: keyword proposal follows the evidence — DOING from the linked DOING task" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ # status field says draft, but the linking todo.org task is DOING — the
+ # panel proposes the state the strongest evidence supports
+ [[ "$output" == *"proposed keyword: DOING"* ]]
+}
+
+@test "spec-sort: an 'incomplete' status field never proposes the terminal IMPLEMENTED" {
+ make_project
+ sed -i 's/| Status | draft |/| Status | incomplete |/' docs/design/widget.org
+ git add -A && git commit -qm status
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" != *"proposed keyword: IMPLEMENTED"* ]]
+}
+
+# ---- Confirm gate -----------------------------------------------------
+
+@test "spec-sort --apply: refuses when a candidate is neither confirmed nor skipped" {
+ make_project
+ run "$SCRIPT" --apply --confirm docs/design/widget.org=DRAFT
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"unconfirmed"* ]]
+ [[ "$output" == *"docs/rooty-spec.org"* ]]
+ [ -f docs/design/widget.org ] # nothing moved
+}
+
+@test "spec-sort --apply: a terminal keyword without --reason refuses" {
+ make_project
+ run "$SCRIPT" --apply --confirm docs/design/widget.org=IMPLEMENTED --skip docs/rooty-spec.org
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"--reason"* ]]
+ [ -f docs/design/widget.org ]
+}
+
+@test "spec-sort --apply: a terminal keyword with --reason records it in the history line" {
+ make_project
+ run "$SCRIPT" --apply --confirm docs/design/widget.org=IMPLEMENTED \
+ --reason "docs/design/widget.org=shipped in v2, confirmed against src" \
+ --skip docs/rooty-spec.org
+ [ "$status" -eq 0 ]
+ grep -q '^\* IMPLEMENTED Widget Feature' docs/specs/widget-spec.org
+ grep -q 'shipped in v2, confirmed against src' docs/specs/widget-spec.org
+}
+
+@test "spec-sort --apply: --skip leaves the candidate in place and still stamps the marker" {
+ make_project
+ run "$SCRIPT" --apply --skip docs/design/widget.org --skip docs/rooty-spec.org
+ [ "$status" -eq 0 ]
+ [ -f docs/design/widget.org ]
+ grep -q ':LAST_SPEC_SORT:' .ai/notes.org
+}
+
+# ---- Preflight --------------------------------------------------------
+
+@test "spec-sort --apply: refuses on a dirty worktree (exit 2)" {
+ make_project
+ echo "drift" >> todo.org
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 2 ]
+ [[ "$output" == *"dirty"* ]]
+ [ -f docs/design/widget.org ]
+}
+
+@test "spec-sort --apply --allow-dirty: proceeds and names what recovery loses" {
+ make_project
+ echo "drift" >> todo.org
+ git add todo.org && git commit -qm drift # keep the link intact; dirty a different file
+ echo "scratch" > untracked-note.txt
+ echo "local edit" >> .ai/notes.org
+ run "$SCRIPT" --apply --allow-dirty "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"pre-existing"* ]]
+ [[ "$output" == *".ai/notes.org"* ]]
+ [ -f docs/specs/widget-spec.org ]
+}
+
+# ---- Move + rename + rewrite ------------------------------------------
+
+@test "spec-sort --apply: moves, renames to -spec.org, prepends status heading with :ID: and history" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ [ -f docs/specs/widget-spec.org ]
+ [ ! -f docs/design/widget.org ]
+ grep -q '^\* DRAFT Widget Feature' docs/specs/widget-spec.org
+ grep -q ':ID:' docs/specs/widget-spec.org
+ grep -q 'retrofitted by spec-sort' docs/specs/widget-spec.org
+}
+
+@test "spec-sort --apply: keyword header rewritten to the two-sequence form" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '^#+TODO: TODO | DONE$' docs/specs/widget-spec.org
+ grep -q '^#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED$' docs/specs/widget-spec.org
+ ! grep -q 'DRAFT REVIEW | SHIPPED' docs/specs/widget-spec.org
+}
+
+@test "spec-sort --apply: Metadata Status field mirrors the confirmed keyword in lowercase" {
+ make_project
+ run "$SCRIPT" --apply --confirm docs/design/widget.org=READY --skip docs/rooty-spec.org
+ [ "$status" -eq 0 ]
+ grep -q '^\* READY Widget Feature' docs/specs/widget-spec.org
+ grep -Eq '^\| Status[[:space:]]*\|[[:space:]]*ready' docs/specs/widget-spec.org
+}
+
+# ---- Relink -----------------------------------------------------------
+
+@test "spec-sort --apply: rewrites the todo.org link, preserving the description" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '\[\[file:docs/specs/widget-spec.org\]\[widget spec\]\]' todo.org
+ ! grep -q 'docs/design/widget.org' todo.org
+}
+
+@test "spec-sort --apply: preserves a ::anchor suffix through the rewrite" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '\[\[file:docs/specs/widget-spec.org::\*Summary\]\[the summary\]\]' todo.org
+}
+
+@test "spec-sort --apply: recomputes a sibling note's relative link to the moved spec" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '\[\[file:../specs/widget-spec.org\]\[the widget spec\]\]' docs/design/scratch-note.org
+}
+
+@test "spec-sort --apply: recomputes the moved spec's own outbound link to an unmoved note" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '\[\[file:../design/scratch-note.org\]\[the note\]\]' docs/specs/widget-spec.org
+}
+
+@test "spec-sort: session archives are reported, never rewritten" {
+ make_project
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"REPORT .ai/sessions/2026-06-01-old.org"* ]]
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q 'docs/design/widget.org' .ai/sessions/2026-06-01-old.org
+}
+
+@test "spec-sort: a synced template path report names the canonical rulesets file" {
+ make_project
+ mkdir -p .ai/workflows
+ echo 'See [[file:../../docs/design/widget.org][widget]]' > .ai/workflows/startup.org
+ git add -A && git commit -qm wf
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" == *"REPORT .ai/workflows/startup.org"* ]]
+ [[ "$output" == *"claude-templates/.ai/workflows/startup.org"* ]]
+}
+
+# ---- Bare-path mentions -----------------------------------------------
+
+@test "spec-sort --apply: a bare-path mention in a rewritten root blocks until acknowledged" {
+ make_project
+ echo "raw mention: docs/design/widget.org needs review" >> todo.org
+ git add -A && git commit -qm bare
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"BARE"* ]]
+ [ -f docs/design/widget.org ] # nothing moved
+ run "$SCRIPT" --apply --acknowledge-bare "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q 'raw mention: docs/design/widget.org' todo.org # reported, never rewritten
+}
+
+@test "spec-sort --apply: a moving doc's bare mention of its own old path is acknowledgeable, not post-apply residue" {
+ make_project
+ echo "History: docs/design/widget.org was drafted in May." >> docs/design/widget.org
+ git add -A && git commit -qm selfmention
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"BARE"* ]]
+ run "$SCRIPT" --apply --acknowledge-bare "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ] # the acknowledged mention rides along to docs/specs/; not residue
+ grep -q ':LAST_SPEC_SORT:' .ai/notes.org
+}
+
+# ---- Plan validation ---------------------------------------------------
+
+@test "spec-sort --apply: a destination collision blocks validation, nothing moved" {
+ make_project
+ mkdir -p docs/specs
+ echo "occupied" > docs/specs/widget-spec.org
+ git add -A && git commit -qm occupy
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"destination exists"* ]]
+ [ -f docs/design/widget.org ]
+ [ "$(cat docs/specs/widget-spec.org)" = "occupied" ]
+}
+
+@test "spec-sort --apply: writes the plan file before executing" {
+ make_project
+ run "$SCRIPT" --apply --plan-file "$TEST_DIR/plan.json" "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ [ -f "$TEST_DIR/plan.json" ]
+ grep -q 'widget-spec.org' "$TEST_DIR/plan.json"
+}
+
+# ---- Mid-apply failure recovery ----------------------------------------
+
+@test "spec-sort --apply: forced mid-apply failure yields named recovery, not a half-migrated shrug" {
+ make_project
+ run env SPEC_SORT_INJECT_FAIL_AFTER=1 "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 1 ]
+ [[ "$output" == *"RECOVERY"* ]]
+ [[ "$output" == *"git restore"* ]]
+ [[ "$output" == *"applied"* ]]
+ [[ "$output" == *"not applied"* ]]
+ ! grep -q ':LAST_SPEC_SORT:' .ai/notes.org # no stamp on a failed apply
+}
+
+# ---- Idempotence + marker ----------------------------------------------
+
+@test "spec-sort --apply: stamps :LAST_SPEC_SORT: in the Workflow State section" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q ':LAST_SPEC_SORT: ' .ai/notes.org
+ # lands inside the Workflow State section, alongside the existing marker
+ awk '/^\* Workflow State/{ws=1} ws && /:LAST_SPEC_SORT:/{found=1} END{exit !found}' .ai/notes.org
+}
+
+@test "spec-sort --apply: creates the Workflow State section when notes.org lacks it" {
+ make_project
+ printf '* Active Reminders\n' > .ai/notes.org
+ git add -A && git commit -qm notes
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ grep -q '^\* Workflow State' .ai/notes.org
+ grep -q ':LAST_SPEC_SORT: ' .ai/notes.org
+}
+
+@test "spec-sort --apply: zero candidates still stamps the marker (clears the nudge)" {
+ make_project
+ rm docs/design/widget.org docs/rooty-spec.org docs/lonely-spec.org
+ git add -A && git commit -qm notes-only
+ run "$SCRIPT" --apply
+ [ "$status" -eq 0 ]
+ grep -q ':LAST_SPEC_SORT:' .ai/notes.org
+}
+
+@test "spec-sort: a second run after a successful apply finds nothing to do" {
+ make_project
+ run "$SCRIPT" --apply "${CONFIRM_ALL[@]}"
+ [ "$status" -eq 0 ]
+ git add -A && git commit -qm sorted
+ run "$SCRIPT"
+ [ "$status" -eq 0 ]
+ [[ "$output" != *"CANDIDATE"* ]]
+ run "$SCRIPT" --apply
+ [ "$status" -eq 0 ]
+ run git status --porcelain
+ # only the re-stamped marker (same date) may differ — tree stays clean
+ [ -z "$(git status --porcelain -- docs todo.org)" ]
+}
diff --git a/claude-templates/.ai/scripts/tests/test-lint-org.el b/claude-templates/.ai/scripts/tests/test-lint-org.el
index 3b8a9bb..d14879f 100644
--- a/claude-templates/.ai/scripts/tests/test-lint-org.el
+++ b/claude-templates/.ai/scripts/tests/test-lint-org.el
@@ -685,6 +685,37 @@ missing-rules violation."
(judgments (lo-test--judgments (plist-get out :issues))))
(should-not (member 'level-2-dated-header (lo-test--checkers judgments)))))
+;;; subtask-done-not-dated check (the inverse: level-3+ done keyword)
+
+(ert-deftest lo-subtask-done-not-dated-flags-level3 ()
+ "A level-3 DONE sub-task still carrying the keyword is flagged for conversion."
+ (let* ((out (lo-test--run
+ "* Open Work\n\n** TODO [#B] Parent\n*** DONE [#C] Sub-task done\nCLOSED: [2026-06-20 Sat 10:00]\nBody.\n"))
+ (judgments (lo-test--judgments (plist-get out :issues))))
+ (should (= 0 (plist-get out :fixes))) ; judgment-only, never auto-fixed
+ (should (member 'subtask-done-not-dated (lo-test--checkers judgments)))))
+
+(ert-deftest lo-subtask-done-not-dated-flags-level4-cancelled ()
+ "A level-4 CANCELLED sub-task is flagged too."
+ (let* ((out (lo-test--run
+ "* Open Work\n\n** PROJECT [#B] Parent\n*** TODO Mid\n**** CANCELLED Deep abandoned\nCLOSED: [2026-06-20 Sat]\n"))
+ (judgments (lo-test--judgments (plist-get out :issues))))
+ (should (member 'subtask-done-not-dated (lo-test--checkers judgments)))))
+
+(ert-deftest lo-subtask-done-not-dated-ignores-level2 ()
+ "A level-2 DONE task is a top-level task, not a sub-task — this checker skips it."
+ (let* ((out (lo-test--run
+ "* Open Work\n\n** DONE [#B] Top-level\nCLOSED: [2026-06-20 Sat]\nBody.\n"))
+ (judgments (lo-test--judgments (plist-get out :issues))))
+ (should-not (member 'subtask-done-not-dated (lo-test--checkers judgments)))))
+
+(ert-deftest lo-subtask-done-not-dated-ignores-dated-and-lowercase ()
+ "An already-dated level-3 entry, and the word done in a title, are not flagged."
+ (let* ((out (lo-test--run
+ "* Open Work\n\n** TODO [#B] Parent\n*** 2026-06-20 Sat @ 10:00:00 -0400 landed\n*** TODO wrap the done cleanup\n"))
+ (judgments (lo-test--judgments (plist-get out :issues))))
+ (should-not (member 'subtask-done-not-dated (lo-test--checkers judgments)))))
+
;;; ---------------------------------------------------------------------------
;;; structural heading checks (org-lint gaps)
diff --git a/claude-templates/.ai/scripts/tests/test-todo-cleanup.el b/claude-templates/.ai/scripts/tests/test-todo-cleanup.el
index e569d9a..ffbf2fb 100644
--- a/claude-templates/.ai/scripts/tests/test-todo-cleanup.el
+++ b/claude-templates/.ai/scripts/tests/test-todo-cleanup.el
@@ -768,5 +768,176 @@ in ISSUES, in document order."
(should (= 2 (plist-get once :bumped)))
(should (= 2 (plist-get twice :bumped)))))
+;;; ---------------------------------------------------------------------------
+;;; --convert-subtasks harness + tests
+
+(defun tc-test--reset-convert (&optional check)
+ (setq tc-fixes 0 tc-archived 0 tc-bumped 0 tc-converted 0 tc-archived-to-file 0
+ tc-issues nil
+ tc-check-only (and check t)
+ tc-archive-done nil tc-sync-child-priority nil tc-convert-subtasks t
+ tc-current-file nil
+ tc-archive-retain-days nil tc-archive-reference-date nil tc-archive-file nil))
+
+(defun tc-test--convert (content &optional runs check)
+ "Write CONTENT to a temp .org file, run `--convert-subtasks' RUNS times (default 1).
+Return a plist: :result final file contents, :converted count from the last run,
+:issues from the last run. CHECK non-nil ⇒ --check (preview, no writes)."
+ (let ((file (make-temp-file "tc-test-" nil ".org"))
+ last-converted last-issues)
+ (unwind-protect
+ (progn
+ (with-temp-file file (insert content))
+ (dotimes (_ (or runs 1))
+ (tc-test--reset-convert check)
+ (tc-process-file file)
+ (setq last-converted tc-converted last-issues tc-issues)
+ (tc-test--drop-buffer file))
+ (list :result (with-temp-buffer (insert-file-contents file)
+ (buffer-string))
+ :converted last-converted
+ :issues last-issues))
+ (tc-test--drop-buffer file)
+ (delete-file file))))
+
+;; The UTC offset in a converted header is the test machine's local offset for
+;; that date, so assertions match it as `[-+]NNNN' rather than a fixed value —
+;; the mode's job is to emit a well-formed offset, not to run in one timezone.
+
+(defconst tc-test--convert-timed
+ "* Project Open Work
+** TODO [#B] Parent task
+*** DONE [#C] F12 opens the terminal :feature:quick:
+CLOSED: [2026-06-27 Sat 12:50]
+Verified live: docks, toggles, colors clean.
+")
+
+(ert-deftest tc-convert-timed-subtask-normal ()
+ "Normal: a timed CLOSED close becomes a dated header, keyword/priority/tags/CLOSED gone."
+ (let* ((out (tc-test--convert tc-test--convert-timed))
+ (res (plist-get out :result)))
+ (should (= 1 (plist-get out :converted)))
+ (should (string-match-p
+ "^\\*\\*\\* 2026-06-27 Sat @ 12:50:00 [-+][0-9]\\{4\\} F12 opens the terminal$"
+ res))
+ (should-not (string-match-p "CLOSED:" res))
+ (should-not (string-match-p "DONE" res))
+ (should (string-match-p "Verified live: docks, toggles, colors clean\\." res))
+ (should (string-match-p "^\\*\\* TODO \\[#B\\] Parent task$" res))))
+
+(defconst tc-test--convert-dateonly
+ "* Project Open Work
+** PROJECT [#B] Parent
+**** DONE [#B] Write full spec :refactor:
+CLOSED: [2026-05-04 Mon]
+Body.
+")
+
+(ert-deftest tc-convert-dateonly-boundary-midnight ()
+ "Boundary: a date-only CLOSED (no time) yields 00:00:00, at level 4."
+ (let ((res (plist-get (tc-test--convert tc-test--convert-dateonly) :result)))
+ (should (string-match-p
+ "^\\*\\*\\*\\* 2026-05-04 Mon @ 00:00:00 [-+][0-9]\\{4\\} Write full spec$"
+ res))
+ (should-not (string-match-p "CLOSED:" res))))
+
+(defconst tc-test--convert-level2
+ "* Project Open Work
+** DONE [#B] Top-level task
+CLOSED: [2026-06-01 Mon 09:00]
+Body.
+")
+
+(ert-deftest tc-convert-leaves-level-2-alone-boundary ()
+ "Boundary: a level-2 DONE task is a top-level task, not a sub-task — untouched."
+ (let ((out (tc-test--convert tc-test--convert-level2)))
+ (should (= 0 (plist-get out :converted)))
+ (should (equal tc-test--convert-level2 (plist-get out :result)))))
+
+(ert-deftest tc-convert-idempotent-boundary ()
+ "Boundary: a second run over an already-dated entry converts nothing new."
+ (let ((once (tc-test--convert tc-test--convert-timed 1))
+ (twice (tc-test--convert tc-test--convert-timed 2)))
+ (should (equal (plist-get once :result) (plist-get twice :result)))
+ (should (= 0 (plist-get twice :converted)))))
+
+(defconst tc-test--convert-nested
+ "* Project Open Work
+** TODO [#B] Parent
+*** DONE Outer sub :feature:
+CLOSED: [2026-06-10 Wed 08:15]
+**** DONE Inner sub
+CLOSED: [2026-06-09 Tue 07:00]
+Inner body.
+")
+
+(ert-deftest tc-convert-nested-done-subtasks-boundary ()
+ "Boundary: a done sub-task nested under a done sub-task — both convert."
+ (let* ((out (tc-test--convert tc-test--convert-nested))
+ (res (plist-get out :result)))
+ (should (= 2 (plist-get out :converted)))
+ (should (string-match-p
+ "^\\*\\*\\* 2026-06-10 Wed @ 08:15:00 [-+][0-9]\\{4\\} Outer sub$" res))
+ (should (string-match-p
+ "^\\*\\*\\*\\* 2026-06-09 Tue @ 07:00:00 [-+][0-9]\\{4\\} Inner sub$" res))
+ (should-not (string-match-p "CLOSED:" res))))
+
+(defconst tc-test--convert-cancelled
+ "* Project Open Work
+** TODO [#B] Parent
+*** CANCELLED [#C] Abandoned idea :feature:
+CLOSED: [2026-06-15 Mon 10:00]
+")
+
+(ert-deftest tc-convert-cancelled-subtask-boundary ()
+ "Boundary: a CANCELLED sub-task converts too (terminal state)."
+ (let ((res (plist-get (tc-test--convert tc-test--convert-cancelled) :result)))
+ (should (string-match-p
+ "^\\*\\*\\* 2026-06-15 Mon @ 10:00:00 [-+][0-9]\\{4\\} Abandoned idea$" res))
+ (should-not (string-match-p "CANCELLED" res))))
+
+(defconst tc-test--convert-noclosed
+ "* Project Open Work
+** TODO [#B] Parent
+*** DONE Orphan with no closed date
+Body only.
+")
+
+(ert-deftest tc-convert-skips-subtask-without-closed-error ()
+ "Error: a done sub-task with no parseable CLOSED is flagged and left unchanged."
+ (let ((out (tc-test--convert tc-test--convert-noclosed)))
+ (should (= 0 (plist-get out :converted)))
+ (should (equal tc-test--convert-noclosed (plist-get out :result)))
+ (should (cl-some (lambda (i) (eq (plist-get i :kind) 'convert-skip))
+ (plist-get out :issues)))))
+
+(ert-deftest tc-convert-check-mode-previews-without-writing ()
+ "Check mode reports the conversion but writes nothing."
+ (let ((out (tc-test--convert tc-test--convert-timed 1 t)))
+ (should (= 1 (plist-get out :converted)))
+ (should (equal tc-test--convert-timed (plist-get out :result)))
+ (should (cl-some (lambda (i) (eq (plist-get i :kind) 'convert-would))
+ (plist-get out :issues)))))
+
+(defconst tc-test--convert-closed-with-deadline
+ "* Project Open Work
+** TODO [#B] Parent task
+*** DONE [#C] Ship the panel :feature:
+CLOSED: [2026-06-27 Sat 12:50] DEADLINE: <2026-06-30 Tue>
+Body line.
+")
+
+(ert-deftest tc-convert-preserves-deadline-on-shared-planning-line-boundary ()
+ "Boundary: removing the CLOSED cookie keeps a DEADLINE sharing its planning line."
+ (let* ((out (tc-test--convert tc-test--convert-closed-with-deadline))
+ (res (plist-get out :result)))
+ (should (= 1 (plist-get out :converted)))
+ (should (string-match-p
+ "^\\*\\*\\* 2026-06-27 Sat @ 12:50:00 [-+][0-9]\\{4\\} Ship the panel$"
+ res))
+ (should-not (string-match-p "CLOSED:" res))
+ (should (string-match-p "^DEADLINE: <2026-06-30 Tue>$" res))
+ (should (string-match-p "^Body line\\.$" res))))
+
(provide 'test-todo-cleanup)
;;; test-todo-cleanup.el ends here
diff --git a/claude-templates/.ai/scripts/tests/test_inbox_send.py b/claude-templates/.ai/scripts/tests/test_inbox_send.py
index cb60e63..f75d7a1 100644
--- a/claude-templates/.ai/scripts/tests/test_inbox_send.py
+++ b/claude-templates/.ai/scripts/tests/test_inbox_send.py
@@ -401,3 +401,78 @@ class TestInboxSendErrors:
assert result.returncode != 0
files = list((tmp_path / "projects" / "target" / "inbox").iterdir())
assert files == []
+
+
+# ----------------------------------------------------------------------
+# Filename collisions (two sends deriving the same name must not overwrite)
+# ----------------------------------------------------------------------
+
+def _load_module():
+ import importlib.util
+ spec = importlib.util.spec_from_file_location("inbox_send", SCRIPT)
+ mod = importlib.util.module_from_spec(spec)
+ spec.loader.exec_module(mod)
+ return mod
+
+
+class TestFilenameCollisions:
+ """Two sends in the same minute with the same leading phrase derived
+ identical filenames and the second silently overwrote the first
+ (a message was lost this way, 2026-07-02)."""
+
+ def test_send_text_same_minute_same_phrase_keeps_both(self, tmp_path):
+ from datetime import datetime
+ mod = _load_module()
+ inbox = tmp_path / "inbox"
+ inbox.mkdir()
+ now = datetime(2026, 7, 2, 5, 42, 0)
+ prefix = "identical leading phrase long enough to fill the whole slug budget entirely"
+ first = mod.send_text(inbox, prefix + " tail one", "archsetup", None, now)
+ second = mod.send_text(inbox, prefix + " tail two", "archsetup", None, now)
+ assert first != second
+ assert first.exists() and second.exists()
+ assert first.name != second.name
+ assert "tail one" in first.read_text()
+ assert "tail two" in second.read_text()
+
+ def test_send_text_collision_suffix_increments(self, tmp_path):
+ from datetime import datetime
+ mod = _load_module()
+ inbox = tmp_path / "inbox"
+ inbox.mkdir()
+ now = datetime(2026, 7, 2, 5, 42, 0)
+ paths = [mod.send_text(inbox, "same lead phrase differs later A", "src", "fixed-slug", now)
+ for _ in range(3)]
+ names = [p.name for p in paths]
+ assert names[0].endswith("fixed-slug.org")
+ assert names[1].endswith("fixed-slug-2.org")
+ assert names[2].endswith("fixed-slug-3.org")
+
+ def test_send_file_collision_preserves_extension(self, tmp_path):
+ from datetime import datetime
+ mod = _load_module()
+ inbox = tmp_path / "inbox"
+ inbox.mkdir()
+ src = tmp_path / "note.org"
+ src.write_text("body one")
+ now = datetime(2026, 7, 2, 5, 42, 0)
+ first = mod.send_file(inbox, src, "src", None, now)
+ src.write_text("body two")
+ second = mod.send_file(inbox, src, "src", None, now)
+ assert second.name.endswith("note-2.org")
+ assert first.read_text() == "body one"
+ assert second.read_text() == "body two"
+
+ def test_cli_two_rapid_sends_lose_nothing(self, project_root, run_script, tmp_path):
+ project_root("sender")
+ target = project_root("receiver")
+ roots = [tmp_path / "projects"]
+ prefix = "identical leading phrase long enough to fill the whole slug budget entirely"
+ run_script(["receiver", "--text", prefix + " message one"],
+ cwd=tmp_path / "projects" / "sender", roots=roots)
+ run_script(["receiver", "--text", prefix + " message two"],
+ cwd=tmp_path / "projects" / "sender", roots=roots)
+ files = list((target / "inbox").iterdir())
+ assert len(files) == 2
+ bodies = "".join(f.read_text() for f in files)
+ assert "message one" in bodies and "message two" in bodies
diff --git a/claude-templates/.ai/scripts/todo-cleanup.el b/claude-templates/.ai/scripts/todo-cleanup.el
index 541d106..bd8166d 100644
--- a/claude-templates/.ai/scripts/todo-cleanup.el
+++ b/claude-templates/.ai/scripts/todo-cleanup.el
@@ -5,10 +5,12 @@
;; emacs --batch -q -l todo-cleanup.el --check todo.org # hygiene report only
;; emacs --batch -q -l todo-cleanup.el --archive-done todo.org # archive completed subtrees
;; emacs --batch -q -l todo-cleanup.el --archive-done --check todo.org # preview the archive
+;; emacs --batch -q -l todo-cleanup.el --convert-subtasks todo.org # dated-rewrite done level-3+ sub-tasks
+;; emacs --batch -q -l todo-cleanup.el --convert-subtasks --check todo.org # preview the conversion
;; emacs --batch -q -l todo-cleanup.el --sync-child-priority todo.org # bump children whose priority drifted below the parent's
;; emacs --batch -q -l todo-cleanup.el --check-child-priority todo.org # preview the sync (same as --sync-child-priority --check)
;;
-;; Three independent modes:
+;; Four independent modes:
;;
;; * Default (hygiene). Designed for the wrap-it-up workflow: cheap, idempotent,
;; safe to run every session.
@@ -52,6 +54,20 @@
;; Archiving is consequential, so it's never run by default; it does *not*
;; also run the hygiene passes.
;;
+;; * --convert-subtasks (opt-in). Rewrites every level-3-and-deeper heading whose
+;; TODO state is DONE/CANCELLED/FAILED into a dated event-log entry
+;; (`<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>'), dropping the keyword,
+;; priority cookie, and tags, and removing the now-redundant CLOSED line. The
+;; date and time come from that entry's own CLOSED cookie; a date-only close
+;; yields 00:00:00, and the UTC offset is computed DST-aware for that date.
+;; This enforces the todo-format depth rule that interactive closes
+;; (`org-log-done' → DONE + CLOSED) and `--archive-done' (level-2 only) leave
+;; unapplied. The heading text is preserved verbatim — a batch tool can't
+;; past-tense an imperative title reliably. Idempotent (an already-dated
+;; heading has no done keyword); a done sub-task with no parseable CLOSED date
+;; is flagged and left alone, never stamped with a fabricated date. Like
+;; --archive-done it does not also run the hygiene passes.
+;;
;; * --sync-child-priority (opt-in). Walks every heading with a priority cookie
;; ([#A]-[#D]) and, for each of its direct child headings whose own priority
;; is lower (later in the alphabet — D is lower than A), bumps the child's
@@ -73,11 +89,16 @@
(require 'calendar)
(setq org-todo-keywords
- '((sequence "TODO" "DOING" "WAITING" "NEXT" "|" "DONE" "CANCELLED")))
+ '((sequence "TODO" "DOING" "WAITING" "NEXT" "|" "DONE" "CANCELLED" "FAILED")))
(defconst tc-done-states '("DONE" "CANCELLED")
"TODO keywords that mark an entry as completed for `--archive-done'.")
+(defconst tc--convert-done-states '("DONE" "CANCELLED" "FAILED")
+ "TODO keywords whose level-3-and-deeper entries `--convert-subtasks' rewrites
+to dated event-log entries. Broader than `tc-done-states' because a FAILED
+sub-task is terminal too and belongs in the parent's dated history.")
+
(defconst tc--priority-cookie-regexp "\\[#\\([A-Z]\\)\\]"
"Regexp matching an org priority cookie. Match group 1 is the letter.")
@@ -89,10 +110,12 @@ every heading below it.")
(defvar tc-fixes 0)
(defvar tc-archived 0)
(defvar tc-bumped 0)
+(defvar tc-converted 0)
(defvar tc-issues nil)
(defvar tc-check-only nil)
(defvar tc-archive-done nil)
(defvar tc-sync-child-priority nil)
+(defvar tc-convert-subtasks nil)
(defvar tc-current-file nil)
(defvar tc-current-dir nil)
(defvar tc-archived-to-file 0)
@@ -578,6 +601,138 @@ before their descendants — a [#A] → [#B] → [#D] chain collapses in one pas
(org-map-entries #'tc-sync-child-priority-at-heading nil 'file))
;;; ---------------------------------------------------------------------------
+;;; --convert-subtasks mode
+;;
+;; A sub-task (a heading at level 3 or deeper, i.e. under a parent task) that is
+;; marked DONE/CANCELLED/FAILED should become a dated event-log entry per the
+;; todo-format depth rule: drop the keyword, priority cookie, and tags, and
+;; rewrite the heading to `<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>' so the
+;; parent's subtree grows a chronological history instead of a long tail of
+;; nested DONE lines. Nothing enforced this before: `org-log-done' just flips an
+;; interactive close to DONE + CLOSED, and `--archive-done' only touches level 2.
+;; So level-3+ closes piled up as DONE keywords. This mode converts them
+;; mechanically, pulling the timestamp from each entry's own CLOSED cookie. The
+;; heading text is kept verbatim (a batch tool can't reliably past-tense an
+;; imperative title, and guessing prose in the task file is worse than leaving it
+;; as written). Idempotent: an already-dated heading has no done keyword, so it
+;; is skipped. A done sub-task with no parseable CLOSED cookie can't be dated, so
+;; it is flagged and left alone rather than stamped with a fabricated date.
+
+(defun tc--closed-parts-in-entry ()
+ "Return a plist (:year :month :day :dow :hour :minute) from the CLOSED cookie
+of the entry at point, or nil when the entry has no parseable CLOSED line.
+:hour and :minute are nil when the cookie carries only a date. The CLOSED line
+sits in canonical position directly under the heading, so the first match within
+the entry is the task's own close."
+ (save-excursion
+ (org-back-to-heading t)
+ (let ((end (save-excursion
+ (or (outline-next-heading) (goto-char (point-max)))
+ (point))))
+ (when (re-search-forward
+ (concat "CLOSED:[ \t]*\\[\\([0-9]\\{4\\}\\)-\\([0-9]\\{2\\}\\)-\\([0-9]\\{2\\}\\)"
+ "[ \t]+\\([A-Za-z]+\\)"
+ "\\(?:[ \t]+\\([0-9]\\{2\\}\\):\\([0-9]\\{2\\}\\)\\)?\\]")
+ end t)
+ (list :year (match-string 1) :month (match-string 2) :day (match-string 3)
+ :dow (match-string 4)
+ :hour (match-string 5) :minute (match-string 6))))))
+
+(defun tc--tz-offset-string (year month day hour minute)
+ "Return the local UTC offset (e.g. \"-0500\") for the given wall-clock instant.
+DST-aware: `encode-time' with an unknown-DST field lets the system pick the
+correct offset for that date, so a summer close reads -0400 and a winter one
+-0500 without hardcoding either."
+ (format-time-string
+ "%z" (encode-time (list 0 minute hour day month year nil -1 nil))))
+
+(defun tc--dated-header-line (level parts title)
+ "Build the dated event-log heading string from LEVEL, CLOSED PARTS, and TITLE.
+Missing time in PARTS defaults to 00:00:00 (the close logged only a date)."
+ (let* ((year (plist-get parts :year))
+ (month (plist-get parts :month))
+ (day (plist-get parts :day))
+ (dow (plist-get parts :dow))
+ (hh (or (plist-get parts :hour) "00"))
+ (mm (or (plist-get parts :minute) "00"))
+ (tz (tc--tz-offset-string (string-to-number year)
+ (string-to-number month)
+ (string-to-number day)
+ (string-to-number hh)
+ (string-to-number mm))))
+ (format "%s %s-%s-%s %s @ %s:%s:00 %s %s"
+ (make-string level ?*) year month day dow hh mm tz title)))
+
+(defun tc--convert-collect-targets ()
+ "Markers at every heading at level >= 3 whose TODO state is a done state.
+Collected up front so the rewrite loop can edit the buffer without disturbing an
+in-progress `org-map-entries' walk; markers track their headings across edits."
+ (let (targets)
+ (org-map-entries
+ (lambda ()
+ (when (and (>= (org-current-level) 3)
+ (member (org-get-todo-state) tc--convert-done-states))
+ (push (copy-marker (point)) targets)))
+ nil 'file)
+ (nreverse targets)))
+
+(defun tc--convert-one-subtask (marker)
+ "Convert the done sub-task heading at MARKER to a dated event-log entry.
+Under `tc-check-only' the conversion is reported but not performed."
+ (goto-char marker)
+ (org-back-to-heading t)
+ (let* ((level (org-current-level))
+ (title (org-get-heading t t t t))
+ (line (line-number-at-pos))
+ (parts (tc--closed-parts-in-entry)))
+ (cond
+ ((null parts)
+ (push (list :kind 'convert-skip :file tc-current-file
+ :line line :heading title
+ :detail "no CLOSED date to derive the timestamp")
+ tc-issues))
+ (t
+ (let ((new (tc--dated-header-line level parts title)))
+ (cl-incf tc-converted)
+ (if tc-check-only
+ (push (list :kind 'convert-would :file tc-current-file
+ :line line :heading title :new new)
+ tc-issues)
+ ;; Replace the heading line, then drop the now-redundant CLOSED
+ ;; cookie from the entry (its date now lives in the header). Only
+ ;; the cookie goes: a planning line can also carry DEADLINE: or
+ ;; SCHEDULED: beside it, and those survive on their line. A line
+ ;; left blank by the removal is deleted whole.
+ (delete-region (line-beginning-position) (line-end-position))
+ (insert new)
+ (let ((end (save-excursion
+ (or (outline-next-heading) (goto-char (point-max)))
+ (point))))
+ (save-excursion
+ (when (re-search-forward "CLOSED:[ \t]*\\[[^]]*\\][ \t]*" end t)
+ (replace-match "")
+ (let ((bol (line-beginning-position))
+ (eol (line-end-position)))
+ (if (string-match-p "\\`[ \t]*\\'"
+ (buffer-substring bol eol))
+ (delete-region bol (min (1+ eol) (point-max)))
+ (goto-char bol)
+ (when (looking-at "[ \t]+")
+ (replace-match "")))))))
+ (push (list :kind 'convert-done :file tc-current-file
+ :line line :heading title :new new)
+ tc-issues)))))))
+
+(defun tc-convert-subtasks-in-file ()
+ "Rewrite every level-3-and-deeper DONE/CANCELLED/FAILED heading to a dated
+event-log entry, pulling the timestamp from its CLOSED cookie. Honors
+`tc-check-only'."
+ (let ((targets (tc--convert-collect-targets)))
+ (dolist (m targets)
+ (tc--convert-one-subtask m)
+ (set-marker m nil))))
+
+;;; ---------------------------------------------------------------------------
;;; Driver + reporting
(defun tc-process-file (file)
@@ -590,6 +745,8 @@ before their descendants — a [#A] → [#B] → [#D] chain collapses in one pas
(tc-archive-done-in-file))
(tc-sync-child-priority
(tc-sync-child-priority-in-file))
+ (tc-convert-subtasks
+ (tc-convert-subtasks-in-file))
(t
;; Pass 1: auto-fix bogus state logs (or report under --check).
(org-map-entries #'tc-fix-bogus-state-log-in-entry nil 'file)
@@ -684,9 +841,34 @@ before their descendants — a [#A] → [#B] → [#D] chain collapses in one pas
(plist-get i :child-heading)
(plist-get i :parent-heading)))))))
+(defun tc--emit-convert-report ()
+ ;; Silent on a real-mode no-op (nothing to convert and nothing skipped), for
+ ;; the same reason as the archive report: the wrap runs cleanup passes more
+ ;; than once, and a vocal \"0 converted\" reads as noise. Check mode always
+ ;; reports (the preview is what the caller asked for), and a skip always
+ ;; reports (a done sub-task with no CLOSED date is a real condition to see).
+ (let ((has-skip (cl-some (lambda (i) (eq (plist-get i :kind) 'convert-skip))
+ tc-issues)))
+ (when (or tc-check-only (> tc-converted 0) has-skip)
+ (princ (format "todo-cleanup --convert-subtasks: %d sub-task(s) %s%s\n"
+ tc-converted
+ (if tc-check-only "would convert" "converted")
+ (if tc-check-only " — CHECK MODE (no writes)" "")))
+ (dolist (i (reverse tc-issues))
+ (pcase (plist-get i :kind)
+ ((or 'convert-done 'convert-would)
+ (princ (format " %s:%d: %s\n → %s\n"
+ (plist-get i :file) (plist-get i :line)
+ (plist-get i :heading) (plist-get i :new))))
+ ('convert-skip
+ (princ (format " skipped %s:%d: %s — %s\n"
+ (plist-get i :file) (plist-get i :line)
+ (plist-get i :heading) (plist-get i :detail)))))))))
+
(defun tc-emit-report ()
(cond (tc-archive-done (tc--emit-archive-report))
(tc-sync-child-priority (tc--emit-sync-report))
+ (tc-convert-subtasks (tc--emit-convert-report))
(t (tc--emit-hygiene-report))))
(defun tc-main ()
@@ -701,6 +883,9 @@ before their descendants — a [#A] → [#B] → [#D] chain collapses in one pas
(when (member "--sync-child-priority" command-line-args-left)
(setq tc-sync-child-priority t)
(setq command-line-args-left (delete "--sync-child-priority" command-line-args-left)))
+ (when (member "--convert-subtasks" command-line-args-left)
+ (setq tc-convert-subtasks t)
+ (setq command-line-args-left (delete "--convert-subtasks" command-line-args-left)))
;; --check-child-priority is the report-only alias for
;; `--sync-child-priority --check'.
(when (member "--check-child-priority" command-line-args-left)
@@ -708,7 +893,7 @@ before their descendants — a [#A] → [#B] → [#D] chain collapses in one pas
(setq command-line-args-left (delete "--check-child-priority" command-line-args-left)))
(if (null command-line-args-left)
(progn
- (princ "Usage: emacs --batch -q -l todo-cleanup.el [--check] [--archive-done | --sync-child-priority | --check-child-priority] FILE...\n")
+ (princ "Usage: emacs --batch -q -l todo-cleanup.el [--check] [--archive-done | --convert-subtasks | --sync-child-priority | --check-child-priority] FILE...\n")
(kill-emacs 1))
(let ((files command-line-args-left))
(setq command-line-args-left nil)
@@ -727,6 +912,7 @@ ert-run-tests-batch-and-exit'."
(cl-every (lambda (a)
(cond ((member a '("--check"
"--archive-done"
+ "--convert-subtasks"
"--sync-child-priority"
"--check-child-priority"))
t)
diff --git a/claude-templates/.ai/workflows/INDEX.org b/claude-templates/.ai/workflows/INDEX.org
index a474b29..88721ed 100644
--- a/claude-templates/.ai/workflows/INDEX.org
+++ b/claude-templates/.ai/workflows/INDEX.org
@@ -54,6 +54,11 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e
- Roam-mode triggers: "inbox zero", "empty the inbox", "process the roam inbox", "triage my roam inbox"
- Auto-mode trigger: "auto inbox zero" (match before "inbox zero")
+- =work-the-backlog.org= — the autonomous task-execution loop, the single home for working a batch of marked tasks unattended: takes an ordered task set (explicit list or tag query) + session mode (=file-only= default / =autonomous-commit= + paging) + a hard run cap; each candidate passes the mechanical eligibility gate (status =TODO= + =:solo:= per the project's scheme header) and the four-item defer checklist, then is implemented to the full quality bar (TDD, =/review-code=, =/voice=) as its own logical commits. Fed by the inbox auto-loop's chain step (yes-gated, file-only, cap 1) and the no-approvals speedrun preset (pre-flight Q&A → autonomous-commit + always-push + end-of-set page over an explicit ordered list).
+ - Speedrun triggers: "speedrun", "no approvals speedrun", "speedrun these: <task set>" — any phrase containing "speedrun" routes here (the preset), never to =no-approvals.org=
+ - Manual triggers: "work the backlog", "work the backlog with <task set>" (file-only defaults)
+ - Synthesis trigger: "synthesize backlog metrics" — read the per-project metrics logs, compute trends + the corrections signal, write one =:agent:metrics:= KB node (personal projects only)
+
** Calendar
- =add-calendar-event.org= — create a calendar event.
@@ -117,6 +122,7 @@ This index must list every =.org= file in =.ai/workflows/= except this one and e
- Triggers: "session harvest", "harvest the sessions", "let's run the session-harvest workflow", "monthly harvest", "mine the sessions"
- =no-approvals.org= — drop the interaction-level approval gates for a pre-agreed batch while keeping engineering-discipline gates (=/review-code=, =/voice personal=, tests, session-log updates, subagent reviews, destructive-action consent). Mode stays on until Craig turns it off, a real question arises, the queue empties, or the conversation switches topics.
- Triggers: "no-approvals mode", "no approvals", "no-approval", "no need for approval gates", "stop asking, just keep going", "I'll check back in when you're done or stuck", "do all =<selector>= with no-approval"
+ - Exception: any phrase containing "speedrun" routes to =work-the-backlog.org='s no-approvals speedrun preset instead
* Living Document
diff --git a/claude-templates/.ai/workflows/clean-todo.org b/claude-templates/.ai/workflows/clean-todo.org
index dd33056..a1b2af5 100644
--- a/claude-templates/.ai/workflows/clean-todo.org
+++ b/claude-templates/.ai/workflows/clean-todo.org
@@ -27,7 +27,17 @@ Deletes bogus =- State "X" from "X" [date]= log lines (state didn't actually cha
To preview without writing, run =--check= first: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org=.
-** Step 2: Archive completed work
+** Step 2: Convert done sub-tasks to dated entries
+
+#+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
+#+end_src
+
+Rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the =CLOSED:= line. Enforces the depth rule that a completed sub-task becomes dated history — a shape interactive org closes and =--archive-done= (level-2 only) leave unapplied. Timestamp comes from each entry's =CLOSED= cookie; heading text kept verbatim; idempotent; a done sub-task with no parseable =CLOSED= is flagged and left alone. Run before archiving so a parent's sub-tasks are already dated when it moves. Capture the output.
+
+To preview without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org=.
+
+** Step 3: Archive completed work
#+begin_src bash
emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org
@@ -37,10 +47,11 @@ Moves every level-2 subtree whose TODO state is DONE or CANCELLED out of the "Op
To preview the moves without writing: =emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done --check todo.org=.
-** Step 3: Summarize
+** Step 4: Summarize
-Report to Craig from the two captured outputs:
+Report to Craig from the three captured outputs:
- Hygiene: how many bogus state-log lines were deleted; any orphan-planning warnings (file:line + heading), or "none".
+- Convert: how many done sub-tasks were rewritten to dated entries (heading + line), any flagged for no =CLOSED= date, or "nothing to convert".
- Archive: how many subtrees moved and which (heading + line), or "nothing to move" / the skip reason if a section was missing or ambiguous.
- If the file changed, note that =todo.org= now has an uncommitted edit — review =git diff -- todo.org= and commit it (in this repo's commit style) if it looks right. If nothing changed, say so and stop.
@@ -49,7 +60,7 @@ Don't auto-commit. The summary is the review point; Craig decides whether the di
* Principles
- *Both passes apply, not just preview.* The workflow is invoked because cleanup is wanted. Use the =--check= variants only when Craig asks for a dry run.
-- *Two passes, two invocations.* =--archive-done= is its own mode and does not run the hygiene pass; run both.
+- *Separate modes, separate invocations.* =--convert-subtasks=, =--archive-done=, and the hygiene pass are each their own mode and don't run the others; run all three.
- *Never auto-commit todo.org.* Surface the diff and let Craig commit it. The cleanup is a working-tree change, fully reversible until committed.
- *Trust the script.* It's fast and idempotent; if there's nothing to do, it reports zero and exits clean. No pre-checks.
diff --git a/claude-templates/.ai/workflows/inbox.org b/claude-templates/.ai/workflows/inbox.org
index 5fc855f..acfd11d 100644
--- a/claude-templates/.ai/workflows/inbox.org
+++ b/claude-templates/.ai/workflows/inbox.org
@@ -114,6 +114,14 @@ The item extends a task already filed. Update the parent TODO's body with a date
** File as TODO
Substantive but waits, or needs design/triage before implementation. Add the TODO under =* <Project> Open Work= with priority + tags per the priority-scheme check (core §6). Body summarizes the proposal and links the inbox content if it's been moved to =docs/design/=. Delete the inbox file (or move it to =docs/design/= first if the content survives).
+*Route-candidate marking (feeds the wrap-up router).* After filing, check whether the keeper's inferred home is a different project:
+
+#+begin_src bash
+python3 .ai/scripts/route_recommend.py --item "<the keeper's heading + body text>" --exclude "$(basename "$PWD")"
+#+end_src
+
+On a =<destination>\tstrong= or =<destination>\tweak= result, stamp the new TODO's property drawer with =:ROUTE_CANDIDATE: <destination>= (create the drawer if the task has none). A =none= result stamps nothing, and a local keeper stays unstamped. The marker is the wrap-up router's entire candidate set — =wrap-it-up.org= Step 3 surfaces exactly the =:ROUTE_CANDIDATE:=-tagged tasks and offers to deliver each to its destination's inbox, never scanning the standing backlog. Stamping is cheap and reversible (the router's skip leaves the task in place; a wrong marker is one property line to delete), so prefer stamping on any plausible match — the human reviews the batch at wrap time.
+
*Blocking-dependency handoff.* A special shape: another project sends a note that *this* project's work is blocking one of theirs ("your task X is blocked on us — we need Y"). File or link the owning task, tag it =:blocker:=, and name the requesting project in the body (see the cross-project dependency convention in =todo-format.md=). The =:blocker:= tag makes =open-tasks.org= surface that task *first*, since clearing it unblocks the other project. Dedup against an existing task rather than filing a duplicate. When the work later lands, drop =:blocker:= and notify the waiting project (=inbox-send <their-project> --text "Delivered: <what> — you're unblocked."=) so it can lift its own =:blocked:=.
** Defer
@@ -253,7 +261,7 @@ The gap it closes: handoffs that arrive mid-session used to sit unseen until the
Never begin monitoring on a dirty worktree or a failing test suite. A dirty tree means the auto-commit at the end of an executed item sweeps up unrelated changes; a red suite means you can't tell whether the monitor broke something. At the start:
-1. =git status --porcelain= is empty (clean worktree).
+1. =git status --porcelain --untracked-files=no= is empty (no tracked modifications). Untracked and gitignored files never block — an inbox drop is exactly what this mode processes, and a scratch file is none of its business (the template-freshness policy in =startup.org= Phase A.0). The tracked-only gate is safe because the per-item commit stages its files explicitly (=commits.md=: only intended changes staged) — never =git add -A=, which would sweep untracked files and is the failure this gate guards against.
2. A full test run is all green (=make test= here, or the project's full-suite command).
If *dirty*: offer to commit the pending changes in discrete, logical batches before starting. If *red*: offer to investigate the failures first. Surface the blocker with inline numbered options per =interaction.md= and wait — monitoring does not start until the tree is clean and the suite is green.
@@ -338,7 +346,7 @@ Close the loop per the reply-to-sender discipline (core §4): confirm what lande
End the way it started: clean worktree, green suite. Before stopping the loop or reporting the pass done:
-1. Commit or revert everything left in the worktree — nothing uncommitted remains.
+1. Commit or revert every tracked modification left in the worktree — no tracked change remains uncommitted. Untracked files (unprocessed inbox drops, scratch) are not the monitor's to sweep.
2. Run the full test suite once more and confirm all green.
If either can't be satisfied — a half-done item, a failure introduced during the pass — surface it rather than leaving it. The next monitor run assumes a clean, green starting state (the Preconditions gate).
@@ -453,18 +461,19 @@ Take these up when the single-destination version is in use and the multi-projec
* Mode: auto inbox zero
-A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports, and a find waits for Craig's go before any work happens.
+A recurring, *interactive* roam check. Trigger phrase: "auto inbox zero" (match before "inbox zero" — the longer phrase wins). On invocation, *ask Craig for the interval* (e.g. 30 min, 2 hours), then drive the loop with =/loop <interval>= running roam mode. It is in-session and interactive by design — each cycle reports what it found and filed.
** Per cycle
1. Run roam mode's scan (Phase A local check + Phase B roam scan), read-only — no =git pull=. The capture-guard still gates any write: use =capture-guard --wait= (core §5) so a transient capture clears itself; if it's still open after the wait, *defer this cycle's roam reconcile to the next cycle* rather than surfacing — the loop cadence is the retry, and the filed items get swept next time. The rare write hands its git to =roam-sync= (roam Phase D).
2. *Nothing found* → no inbox summary. One acknowledgement line: =ran at HH:MM, nothing found=. Nothing else. The acknowledge-only-on-empty rule keeps a quiet inbox quiet.
3. *Items found* → summarize the found items, file them as tasks (roam Phase C), and *append them to a displayed queue* — the harness task list, via =TaskCreate= — so the queue accumulates across cycles. Then ask: "run this batch next?"
- - *Yes* → launch into implementing the found items, each through the normal disposition ladder (core §3) + verify flow.
+ - *Yes* → chain into =work-the-backlog.org= as an explicit second step after routing completes: pass it the eligibility query over the queued items (status =TODO= + =:solo:= per the scheme header, priority-ordered), =file-only= mode, paging off, cap 1. The highest-priority eligible candidate runs; the rest wait for the next tick or a later yes.
- *No* → they stay queued for a later go.
+ This mode never implements anything itself — routing ends here, and the execution loop lives in =work-the-backlog.org=, its one home.
4. *Cross-cycle dedup.* Subsequent cycles add only *newly-found* items to the same displayed queue, never re-surfacing what's already there. Dedup against the queue (the =TaskCreate= list), not against what's already been implemented — a find that was queued-but-not-yet-run must not reappear, and one already filed into =todo.org= is dropped by roam Phase C's status check.
-A find is always surfaced and gated on Craig's yes; a quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its execute step waits for a yes.
+A find is always surfaced and filed; execution happens only through the =work-the-backlog.org= chain and waits for Craig's yes. A quiet inbox produces only the timestamped acknowledgement. =auto inbox zero= is inherently in-session because its chain step waits for that yes.
** Fully-unattended pass (=/schedule=) — vNext, not v1
diff --git a/claude-templates/.ai/workflows/no-approvals.org b/claude-templates/.ai/workflows/no-approvals.org
index 1efce82..9e1c894 100644
--- a/claude-templates/.ai/workflows/no-approvals.org
+++ b/claude-templates/.ai/workflows/no-approvals.org
@@ -22,6 +22,8 @@ Craig activates the mode with any of:
- Queuing several tasks in =todo.org= followed by any phrase above
- Any equivalent phrasing that signals he doesn't want to be re-asked between items
+*Not this mode:* any phrase containing "speedrun" ("speedrun", "no approvals speedrun") routes to =work-the-backlog.org='s no-approvals speedrun preset — an autonomous batch over an explicit ordered task set, with a pre-flight Q&A, autonomous commits, always-push, and an end-of-set page. This mode is the general interaction-gate suspension for whatever work is already underway; the speedrun is the dedicated backlog-batch workflow.
+
Mode resets when:
- Craig says approvals are back on
diff --git a/claude-templates/.ai/workflows/open-tasks.org b/claude-templates/.ai/workflows/open-tasks.org
index 4ba29dd..02a0847 100644
--- a/claude-templates/.ai/workflows/open-tasks.org
+++ b/claude-templates/.ai/workflows/open-tasks.org
@@ -23,15 +23,16 @@ Don't route "task review" / "review tasks" here — those trigger the hygiene ha
* Phase A: Data Gathering (both modes)
-** Phase A pre-step — archive any freshly-DONE tasks
+** Phase A pre-step — normalize freshly-closed tasks
-Before reading =todo.org=, run the cleanup script's archive-done sweep so completed level-2 subtrees move from =* $Project Open Work= to =* $Project Resolved=:
+Before reading =todo.org=, run two cleanup sweeps so the read reflects current state. First convert any done sub-tasks to dated entries, then archive completed level-2 subtrees from =* $Project Open Work= to =* $Project Resolved=:
#+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
emacs --batch -q -l .ai/scripts/todo-cleanup.el --archive-done todo.org
#+end_src
-Costs a few hundred milliseconds. Without it, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The sweep makes Phase A's read of =todo.org= reflect current state.
+Costs a few hundred milliseconds. Without the archive sweep, a task that completed earlier in the session sits as =** DONE= under Open Work until the next =clean-todo= or wrap-up pass, and Next Mode would surface it as a "what's next" candidate. The convert sweep runs first so a completed parent's sub-tasks are already dated when it archives; it also keeps interactive level-3 closes from lingering as DONE keywords. Together they make Phase A's read of =todo.org= reflect current state.
Skip the sweep if the workflow is invoked in an explicit read-only or dry-run context. Default is to run it.
diff --git a/claude-templates/.ai/workflows/page-me.org b/claude-templates/.ai/workflows/page-me.org
index 607ed51..8069830 100644
--- a/claude-templates/.ai/workflows/page-me.org
+++ b/claude-templates/.ai/workflows/page-me.org
@@ -5,9 +5,9 @@
* Overview
-This workflow enables Claude to set timers and alarms that reliably notify Craig, even if the terminal session ends or is accidentally closed. Notifications are distinctive (audible + visual with alarm icon) and persist until manually dismissed.
+This workflow enables Claude to set timers and alarms that reliably notify Craig, even if the terminal session ends or is accidentally closed. Notifications are distinctive (audible + visual with the blue info icon) and persist until manually dismissed.
-Uses the =notify= command (alarm type) for consistent notifications across all AI workflows.
+Uses the =notify= command (info type) for consistent notifications across all AI workflows. Info-level on purpose: the earlier alarm styling read as all-red urgency, and Craig's verdict was that a page "should be a persistent info notification" — noticeable, never crash-scary (2026-07-02).
* Trigger Phrase
@@ -63,8 +63,8 @@ Craig tells Claude when and why:
Claude schedules the alarm using the =at= daemon with =notify=:
#+begin_src bash
-echo "notify alarm 'Page' 'Time to call the dentist' --persist" | at 3:30pm
-echo "notify alarm 'Page' 'Meeting starts' --persist" | at now + 45 minutes
+echo "notify info 'Page' 'Time to call the dentist' --persist" | at 3:30pm
+echo "notify info 'Page' 'Meeting starts' --persist" | at now + 45 minutes
#+end_src
The =at= daemon:
@@ -89,26 +89,26 @@ Craig dismisses the notification and acts on it.
** Setting Alarms
-Use the =at= daemon to schedule a =notify alarm= command:
+Use the =at= daemon to schedule a =notify info= command:
#+begin_src bash
# Schedule for specific time
-echo "notify alarm 'Page' 'Meeting starts' --persist" | at 3:30pm
+echo "notify info 'Page' 'Meeting starts' --persist" | at 3:30pm
# Schedule for relative time
-echo "notify alarm 'Page' 'Check the build' --persist" | at now + 30 minutes
+echo "notify info 'Page' 'Check the build' --persist" | at now + 30 minutes
# Schedule for tomorrow
-echo "notify alarm 'Page' 'Call the dentist' --persist" | at 3:30pm tomorrow
+echo "notify info 'Page' 'Call the dentist' --persist" | at 3:30pm tomorrow
#+end_src
** Notification System
-Uses the =notify= command with the =alarm= type. The =notify= command provides 8 notification types with matching icons and sounds.
+Uses the =notify= command with the =info= type. The =notify= command provides 8 notification types with matching icons and sounds.
#+begin_src bash
-# Immediate alarm notification (for testing)
-notify alarm "Page" "Your message here" --persist
+# Immediate page notification (for testing)
+notify info "Page" "Your message here" --persist
#+end_src
The =--persist= flag keeps the notification on screen until manually dismissed. All page-me notifications should use =--persist= by default.
@@ -139,10 +139,10 @@ The alarm must fire. Use the =at= daemon which is designed for exactly this purp
Simple invocation - Claude runs one command. No complex setup required per alarm.
** Fail Audibly
-If the alarm fails to schedule, report the error clearly. Don't fail silently.
+If the page fails to schedule, report the error clearly. Don't fail silently.
** Testable
-The =notify alarm= command can be called directly to verify notifications work without waiting for a timer.
+The =notify info= command can be called directly to verify notifications work without waiting for a timer.
** Non-Alarming
Use normal urgency, not critical. The notification should be noticeable but not imply something has gone horribly wrong.
diff --git a/claude-templates/.ai/workflows/spec-create.org b/claude-templates/.ai/workflows/spec-create.org
index 508b969..1249181 100644
--- a/claude-templates/.ai/workflows/spec-create.org
+++ b/claude-templates/.ai/workflows/spec-create.org
@@ -82,8 +82,9 @@ This is where the spec earns a "Ready" from review: an engineer must be able to
** Phase 5 — Wire it up (conventions)
-- *Filename + location:* =docs/<problem-slug>-spec.org=. Org-mode. The slug names the *problem/feature*, not a date. Must end in =-spec.org=.
-- *Metadata header:* a small table at the top — Status, Owner, Reviewer(s), Date, Related (link to the task/ticket).
+- *Filename + location:* =docs/specs/YYYY-MM-DD-<problem-slug>-spec.org= — formal specs live in =docs/specs/=, never =docs/design/= (that's for notes, brainstorms, inventories; see =claude-rules/docs-lifecycle.md=). Org-mode. The slug names the *problem/feature*; no status suffixes ever — status lives in the file. Must end in =-spec.org=.
+- *Status heading (first element after the file header):* a top-level heading carrying the lifecycle keyword, stamped =DRAFT= at authoring — spec-create owns this flip. It holds an =:ID:= UUID (generate with =uuidgen=) and dated history lines, newest first. The keyword is authoritative; the Metadata =Status= field mirrors it in lowercase. Transitions are three lines in one file (keyword + history line + mirror): spec-review flips =READY=, spec-response flips =DOING= at decomposition, the final build task flips =IMPLEMENTED=. Terminal states always record a reason.
+- *Metadata header:* a small table at the top — Status (the lowercase mirror), Owner, Reviewer(s), Date, Related (link to the task/ticket).
- *Review-and-iteration-history stub:* add a =Review and iteration history= section at the bottom and seed it with the author's first entry. =spec-review= and =spec-response= append provenance entries here, so the heading shape is a contract: =YYYY-MM-DD Day @ HH:MM:SS -ZZZZ — Contributor — Role=, body fields What / Why / Artifacts.
- *Cross-link both ways:* the spec links its task; the task links the spec (replace the task's inline plan with a terse description + a =file:= link to the spec).
@@ -103,7 +104,14 @@ Then it's ready for =spec-review.org=. Snapshot-vs-living rule: keep the spec li
,#+TITLE: <Feature> — Spec
,#+AUTHOR: <author>
,#+DATE: <YYYY-MM-DD>
-,#+TODO: TODO | DONE SUPERSEDED CANCELLED
+,#+TODO: TODO | DONE
+,#+TODO: DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED
+
+,* DRAFT <spec short name>
+:PROPERTIES:
+:ID: <uuid — generate with uuidgen>
+:END:
+- <YYYY-MM-DD Day @ HH:MM:SS -ZZZZ> — drafted.
,* Metadata
| Status | draft |
diff --git a/claude-templates/.ai/workflows/spec-response.org b/claude-templates/.ai/workflows/spec-response.org
index de5b1c8..7628e49 100644
--- a/claude-templates/.ai/workflows/spec-response.org
+++ b/claude-templates/.ai/workflows/spec-response.org
@@ -130,9 +130,11 @@ When related specs were reviewed together, two reviews can recommend opposite th
This is the *last* step of the workflow, and it runs *only after the author confirms the spec is Ready* — never during review iterations. A Ready spec nobody can act on is unfinished; this phase turns it into tracked work. It applies to every project type (library, application, service, docs set).
-1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it.
+*This phase owns the =READY= → =DOING= lifecycle flip* (docs-lifecycle convention): when the decomposition below lands, update the spec's top-level status heading keyword to =DOING=, add a dated history line, and set the Metadata =Status= mirror to =doing= — three lines, one file.
-2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks.
+1. *Decide where the tasks live.* If the work is spinning off into its own project/repo, move the parent task into that project's =todo.org= (and relocate the spec with it); otherwise use the current project's =todo.org=. One parent task owns the effort; the phase tasks hang under it. *Stamp the binding:* the parent task's =:PROPERTIES:= drawer gets a =:SPEC_ID:= line holding the spec's status-heading UUID. That property is the durable join task-audit uses to police =DOING= specs (a =DOING= spec whose bound parent is closed, archived, or missing gets flagged).
+
+2. *Create one task per implementation phase* from the spec's =Implementation phases=, in dependency order, so the task set as a whole describes the *full* milestone (e.g. v1) with no gaps. Each task body names the deliverable, its tests, and how it is verified. Carry over deferred/vNext work and any publish/release steps as their own tasks. *Always end the set with the flip task:* a final "flip the spec to IMPLEMENTED (+ dated history line + mirror)" task under the same parent — the tracked obligation that closes the lifecycle loop when the build finishes. Never skip it; "a human remembers" is the failure mode this exists to prevent.
3. *Turn a critical eye on completeness.* Re-read the spec — every phase, every acceptance criterion, every named deliverable, every data-safety/principle rule — and confirm each has a home in a task. The work is not done when the tasks merely exist; it is done when nothing in the spec is left untracked. This completeness pass is mandatory regardless of project type.
diff --git a/claude-templates/.ai/workflows/spec-review.org b/claude-templates/.ai/workflows/spec-review.org
index 833dfc9..d4998eb 100644
--- a/claude-templates/.ai/workflows/spec-review.org
+++ b/claude-templates/.ai/workflows/spec-review.org
@@ -50,6 +50,11 @@ Run it *early* — design review exists to catch viability problems and costly m
Before Phase 1, verify the file under review ends with =-spec.org=. Every design, decision, or planning document under a project's =docs/= directory carries that suffix as its identifier. The =.org= extension alone is not enough because =docs/= holds non-spec org files too (tutorials, frozen inventories, reference material).
+*Location expectation (docs-lifecycle convention).* Formal specs live in =docs/specs/=. Whether that's enforced depends on whether the project has run its one-time =spec-sort= retrofit:
+
+- =:LAST_SPEC_SORT:= present in =.ai/notes.org= Workflow State → the project has sorted; a =-spec.org= file outside =docs/specs/= fails this precondition. Surface it: "this spec sits outside docs/specs/ — move it (and update inbound links) before review."
+- Marker absent → legacy locations (=docs/= root, =docs/design/=) stay reviewable; add one nudge line to the review output ("this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it") and proceed. No legacy spec is ever unreviewable during the transition.
+
If the file does not end with =-spec.org=, stop immediately and surface the mismatch:
#+begin_example
@@ -128,6 +133,7 @@ Work the spec against these. Each is a source of concrete findings, not a box to
- *Performance & scale.* Expected counts (issues/comments/labels/teams/projects/views)? Server-side filtering where possible? Bounded, visible pagination? Cached name→ID lookups? Sync calls in the command path acceptable? Could a save hook or whole-file scan make N network calls? Rendering linear? Full-file rewrites avoided? Long-running operations async/cancellable/observable? Is concurrency/queueing/backpressure defined? Are high-output process filters throttled and cheap? Is progress/ETA exposed only when defensible, and are hung/stalled operations detectable and killable? Identify UI freezes, repeated network calls, unbounded pagination — without premature optimization.
- *Security & privacy.* API keys safe? Debug logs leaking secrets or private issue text? Confirmations before mutating shared workspace objects? Personal vs shared distinguished? Local files holding sensitive descriptions/comments? Anything to redact from messages/logs? Any work-tracker integration may handle private company data.
- *UX & accessibility.* Discoverable commands? Recoverable mistakes? Prompts ordered to the task? Safe, useful defaults? Informative-not-noisy status messages? Does the UI avoid implying unsupported actions are supported? Match the upstream product's permissions/concepts? Are customizations named in user language, with clear defaults and docstrings? For Emacs packages, command names, completion candidates, buffer layout, defcustom names, and message wording *are* the UX.
+- *Operational-panel UI traps.* Applies when the spec covers a user-facing panel, dialog, or control surface; skip otherwise. Lists that mix saved, current, and generated items must name each item's source. Refresh or scan actions must not gate data that could be shown immediately. Add-forms must not ask the user to retype values the system already discovered. Destructive confirmations read in future tense before the action and verified-result tense after it. Diagnostics, performance, logging, and repair affordances are reviewed as one coherent flow before extra pages or buttons are added. A popup launched from a bar, tray, or tool surface should visually belong to that launcher. (Promoted from archsetup's Waybar network-panel review, 2026-06-30.)
- *Test strategy and coverage.* Characterization tests before behavior changes? Pure functions to unit-test? API responses needing fixtures? Command flows needing stubs? Regression tests for prior bugs? Boundary/error cases? What's covered elsewhere and shouldn't be re-tested? Which existing tests must change? How is coverage generated, summarized, and used to find untested/refactor-worthy code? Prefer tests that lock contracts: representation shape, query compilation, sync no-op, conflict refusal, pagination, dirty-buffer protection, log redaction, and long-running/slow-operation behavior via fakes rather than flaky live dependencies.
- *Observability & operations.* How does a user see what the package is doing? Progress messages for long ops? Useful, safe debug logging? Are logs structured enough to isolate issues from a bug report? Are commands provided to inspect/clear caches, test connectivity, diagnose backends/tools, copy redacted debug info, or reproduce command invocations? How are terminal states discovered: completion, failure, partial success, stalled/hung, cancelled, cleanup-unverified, and "needs user action"? Does the product notify only when useful, avoid noisy success spam, and keep non-success states visible until acknowledged? For generated org files, headers should often carry source, filter/view name, refresh time, count, truncation state.
- *Comparable-product sentiment.* When there are obvious adjacent products, research what users love and hate about them from official docs plus current community reports. Do not cargo-cult their feature set; translate findings into the spec's scope. For each loved behavior, say whether the spec provides it, intentionally omits it, or defers it. For each hated behavior, say whether the spec avoids, resolves, inherits, or accepts it.
@@ -166,6 +172,8 @@ Assign one label consistently:
The most useful reviews move a spec from =Not ready= to =Ready with caveats= or =Ready= once decisions are captured.
+*The =Ready= verdict flips the spec's lifecycle status.* spec-review owns the =DRAFT= → =READY= transition (docs-lifecycle convention): on assigning =Ready= (or =Ready with caveats= the author accepts), update the spec's top-level status heading keyword to =READY=, add a dated history line under it naming the review that passed, and set the Metadata =Status= mirror to =ready= — three lines, one file. Any other rubric label leaves the keyword where it stands (a re-review that finds new blockers on a =READY= spec demotes it back to =DRAFT= the same three-line way, with the reason in the history line).
+
Finding severity maps to blocking power: *high-priority findings block =Ready=* — they hold the rubric at =Not ready= (or =Ready with caveats= if the author accepts and tracks them) until dispositioned; *medium-priority findings are the author's discretion* and don't block. State the blocking status on each finding so the author running spec-response knows which ones gate the rubric.
Then update the spec's review history. Specs should carry a bottom section named =Review and iteration history= (or the nearest existing equivalent) that tracks each material author/reviewer pass. Add a concise entry for this review even when the spec is ready and no findings are recorded.
diff --git a/claude-templates/.ai/workflows/startup.org b/claude-templates/.ai/workflows/startup.org
index 5e8f61e..47a77c8 100644
--- a/claude-templates/.ai/workflows/startup.org
+++ b/claude-templates/.ai/workflows/startup.org
@@ -44,6 +44,8 @@ Behavior:
- *Dirty working tree* → skip the pull. Don't auto-stash and don't auto-merge — those would either lose work or invite conflicts at the worst possible moment (session start).
- *Non-fast-forward history* → =--ff-only= aborts with an error. Surface that to the user; the rsync still proceeds against the working tree as-is.
+*Template-freshness policy (applies to every dirty-check in the synced workflows).* "Dirty" means *tracked modifications only*. Untracked and gitignored files — an inbox drop, a file left in the tree to read, scratch output — never block a template pull, a fast-forward, or a monitoring gate. Projects were falling behind on templates because somebody sent them a task; that's the failure this policy closes. The checks here already comply (=git diff --quiet HEAD= sees only tracked changes; the ff gate uses =--untracked-files=no=), and any dirty-check added to a synced workflow follows the same rule. One deliberate exception: the rsync WIP-guard below counts untracked files *within rulesets' own synced source paths*, because an untracked half-written template is exactly the WIP it exists to hold back — that guard is about rulesets' outbound content, not the consuming project's local state.
+
*** Install rulesets symlinks into ~/.claude (idempotent)
A skill, rule, or bin script added to rulesets and pushed reaches each machine's *files* on the next pull, but not its =~/.claude= *symlink* — =make install= only links what isn't already linked, and =git pull= doesn't run it. So a newly-added skill stays silently uninstalled until someone re-runs =make install= by hand. The flush skill sat in that gap from 2026-06-02 until a manual install on 2026-06-05. Running =make install= here, right after the rulesets pull, closes it: "add a skill, commit, push" becomes enough for it to reach every machine on the next session.
@@ -151,7 +153,7 @@ These calls have no dependencies on each other. Issue them all together in one m
8. =[ -f todo.org ] && .ai/scripts/task-review-staleness.sh todo.org 7 || true= — count top-level tasks overdue for review (the daily task-review habit's startup nudge). The =[ -f todo.org ]= guard skips projects without a root todo.org; =|| true= keeps Phase A from failing if the script isn't synced yet. Threshold 7 days is one review cycle of slack — softer than the wrap-up health check's 30-day alarm.
9. =bash ~/code/rulesets/scripts/sync-language-bundle.sh "$PWD" 2>/dev/null || true= — language-bundle freshness for the current project. Fingerprint-detects which bundle (if any) the project has, auto-fixes drifted rulesets-owned files (=.claude/rules/*.md=, =.claude/hooks/*=, =githooks/*=), and surfaces drift in =settings.json= without writing it (a project may have customized it). =CLAUDE.md= is deliberately left untracked — it's seed-only in =install-lang= and project-owned afterward, mirroring how =diff-lang= skips it. Quiet when there's no bundle or everything's clean. Hardcodes the rulesets path because =languages/= is the canonical source and lives only there — the same absolute-path dependency the rsyncs already carry. =|| true= keeps Phase A from failing on older checkouts where the script isn't present yet. The =.ai/= rsyncs and this call write to disjoint paths (=.ai/= vs =.claude/=/=githooks/=), so the batch stays parallel-safe.
10. =[ -f "$HOME/org/roam/inbox.org" ] && grep -cE '^\*\* ' "$HOME/org/roam/inbox.org" || true= — count items in the roam global inbox (=~/org/roam/inbox.org=), the roam-mode startup nudge. Silent if the roam clone isn't on this machine. Phase C reads the file when the count is non-zero, splits total vs items related to this project, and surfaces the offer (see =inbox.org= roam mode). Read-only; never files at startup.
-11. KB surface prep (the read + contribute startup nudges; see =docs/design/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute).
+11. KB surface prep (the read + contribute startup nudges; see =docs/specs/2026-06-16-encourage-kb-contribution-spec.org=). Gated on the agent KB clone. Counts =:agent:= nodes, lists up to 5 whose content matches the current project basename (titles only; a few most-recent nodes as a fallback when nothing matches), and resolves the best-practices node path. Read-only; silent when the clone is absent. Phase C surfaces the relevant titles (consult) and the best-practices link (contribute).
#+begin_src bash
ra="$HOME/org/roam/agents"
@@ -166,6 +168,25 @@ These calls have no dependencies on each other. Issue them all together in one m
fi
#+end_src
+12. Spec-sort probe (the docs-lifecycle retrofit nudge; see the docs-lifecycle spec in =docs/specs/=). Read-only; prints one line when the project has an unsorted docs pile — a =docs/design/= directory or stray =docs/*-spec.org= root files — and no =:LAST_SPEC_SORT:= marker in =.ai/notes.org=. Silent for projects with nothing to sort or an already-stamped marker (the marker permanently clears it).
+
+ #+begin_src bash
+ { [ -d docs/design ] || [ -n "$(find docs -maxdepth 1 -name '*-spec.org' -print -quit 2>/dev/null)" ]; } \
+ && ! grep -qs ':LAST_SPEC_SORT:' .ai/notes.org \
+ && echo "spec-sort: unsorted docs present" || true
+ #+end_src
+
+ The stray-root check uses =find= rather than a glob so the probe behaves identically under bash and zsh (=compgen= is bash-only, and zsh aborts on an unmatched glob).
+
+13. Host-identity probe (see the host-identity rule in =claude-rules/=). Read-only; flags fixed machine-identity claims in the project's tracked/synced docs — the "This machine is ratio" trap, false on every machine but the one that wrote it. Silent when nothing matches.
+
+ #+begin_src bash
+ grep -inE '\b(this|the current) (machine|host|box|laptop|workstation) is ' \
+ CLAUDE.md .ai/notes.org 2>/dev/null | head -3 || true
+ #+end_src
+
+ Fleet descriptions ("the fleet is ratio and velox") and runtime derivations ("run =uname -n= to find the hostname") don't match — only current-identity assertions do. Fixture-verified under bash and zsh.
+
Notes on the rsync commands:
- Trailing slashes on both source and destination matter — they tell rsync to sync /contents/ rather than nest a directory inside.
- =--delete= on the directory syncs lets retired template files actually disappear from each project on next startup.
@@ -199,6 +220,8 @@ This phase touches the user and runs sequentially:
- *Roam inbox nudge.* If the Phase A roam-inbox count is greater than zero, read =~/org/roam/inbox.org=, split total vs items related to this project (claimed by the =<project>:= prefix, plus any unprefixed item whose topic plainly concerns this project), and surface one line: "Roam inbox: =<N>= total, =<M>= appear related to this project — say 'inbox zero' to file them." Offer it as a priority option; never auto-file. If the count is zero or the file is absent, say nothing. See =inbox.org= roam mode.
- *KB consult nudge (read side).* If the Phase A KB-surface prep returned any =kb-relevant-titles=, surface one line listing them (capped 5): "KB lessons that may be relevant: =<title>=; =<title>=… — open the node before related work." The titles are declarative, so the list alone tells you whether to open one. Gated on the roam clone; silent when the clone is absent or nothing relevant surfaced. See the best-practices node and =knowledge-base.md=.
- *KB contribute nudge (write side).* Once per session, surface one line pointing at the best-practices node (the =kb-bestpractices= path from Phase A): "Learned something durable? See =<path>= for how to write a KB node — contributing cross-project facts is welcome (personal projects only; work/unknown projects never write per =knowledge-base.md=)." Light encouragement, never a gate. Gated on the roam clone; silent when absent.
+ - *Spec-sort nudge.* If the Phase A spec-sort probe printed =spec-sort: unsorted docs present=, surface one line: "this project's docs pile has never been spec-sorted — say 'run spec-sort' to sort it." If the probe was silent, say nothing. A project with nothing to sort never sees the line; a stamped =:LAST_SPEC_SORT:= marker permanently clears it. See the docs-lifecycle rule and the spec in =docs/specs/=.
+ - *Host-identity flag.* If the Phase A host-identity probe printed any match, surface it with the file:line and the fix: "this doc asserts a fixed machine identity — false on every other machine; replace with a runtime derivation (run =uname -n=), per the host-identity rule." The probe flags for judgment, never blocks. Silent when the probe is silent.
- *Language-bundle sync.* If the Phase A step-12 call (=sync-language-bundle.sh=) printed anything, surface it. =fixed= lines are informational — the drift was already repaired (note that =.claude/= is now dirty if the project commits it). A =drift= line on =settings.json= is surface-only and needs the printed =make install-<lang> PROJECT=.= to reconcile; flag it so the user can decide. If the call was silent, say nothing.
- *Newly-installed symlinks.* If the Phase A.0 =make install= step printed any =link= / =relink= / =WARN= line, surface it. A =link= line means a skill, rule, hook, or script added to rulesets is now linked into =~/.claude= for the first time on this machine. For a newly-linked *skill*, check the agent's available-skills list: if the harness already registered it mid-session, note it's available and move on; if it's absent, stop and tell Craig to restart the agent so it loads (whether a mid-session reload works is harness-version-dependent). For a newly-linked *hook*, note that the harness reads hooks at session start — it fires from the next session (or after Craig opens =/hooks= once); its settings.json wiring travels with the tracked file, so the link is usually the only missing piece. A =WARN ... not a symlink= line is a real collision at the target path — surface it; it needs a human. If the step printed only "nothing new to link", say nothing.
- *Template-sync churn (safety net).* Check whether Phase A's rsync left uncommitted churn in the synced =.ai/= paths — accumulated from a prior session that crashed before wrap-up, or freshly added this session when rulesets advanced. Without surfacing, it builds up silently until it blocks Phase A.0's auto-ff (git won't ff a dirty tree). Skip in the rulesets repo itself (there =.ai/= is a committed mirror, kept honest by the pre-commit hook). The check is sequential here, after the rsync has finished — not a Phase A step, to keep that batch race-free.
diff --git a/claude-templates/.ai/workflows/task-audit.org b/claude-templates/.ai/workflows/task-audit.org
index 94b99da..7d2b758 100644
--- a/claude-templates/.ai/workflows/task-audit.org
+++ b/claude-templates/.ai/workflows/task-audit.org
@@ -61,6 +61,8 @@ For each open task, read its body and cross-check its claims against the actual
- *Calendar* — did a scheduled event happen; is a SCHEDULED/DEADLINE date now past.
- *Meeting recordings* — if a task hinges on "did this conversation happen / what was said," check the recording queue (e.g. =~/sync/recordings/=) and transcribe via =process-meeting-transcript.org= if the answer lives in an un-transcribed recording. (This is exactly how a "did the interview happen?" task gets resolved instead of guessed.)
+*Spec lifecycle reconcile (docs-lifecycle convention).* If the project has a =docs/specs/=, run the =:SPEC_ID:= query as part of this phase: for each spec whose top-level status heading reads =DOING=, find the =todo.org= task whose =:SPEC_ID:= property matches the spec's =:ID:=. Flag the spec NEEDS-USER when that bound parent is =DONE=/=CANCELLED=, archived, or missing — the build finished (or evaporated) without the =IMPLEMENTED= flip, exactly the drift this check exists to catch. Check the parent's own keyword, not its children (completed children become dated entries and the final flip task is a child, so child-counting misleads).
+
Assign each task a bucket (CURRENT / STALE / NEEDS-USER) and, for STALE, the specific factual update.
*Scale tactic.* For a large open-task set, dispatch read-only investigation sub-agents over batches of tasks (parallel-safe per =subagents.md= — independent read-only domains). Each returns a per-task bucket + suggested update. *Never* let sub-agents write to =todo.org= concurrently — apply all edits serially in the main thread (concurrent writes to one file race and lose work).
@@ -79,7 +81,7 @@ For every STALE task, edit it in the main thread:
- *Ensure priority is set per the project scheme.* The top of the project's =todo.org= should carry the priority legend (=[#A]= through =[#D]=). Every task should carry an explicit priority cookie. If a cookie is missing, or no longer matches the reconciled facts, assign the right level per the legend. If the level is unambiguous from the body, do it autonomously; if it's a judgment call (especially the [#A] / [#B] line for important-but-not-urgent work), flag NEEDS-USER. Also enforce the [#A]-discipline rule from the legend — an [#A] task without a =SCHEDULED:= or =DEADLINE:= line is mis-graded and is either down-graded to [#B] (when reconciled facts say "important but not urgent") or surfaced as NEEDS-USER for the user to date.
- *Ensure a type tag is set.* Every task carries one type tag from the project's tag legend (typically =:feature:= / =:chore:= / =:spec:= / =:bug:=). If missing or wrong, assign or correct it from the body when the type is unambiguous. If two tags fit (a refactor that also fixes a bug; a spec that's also a chore), flag NEEDS-USER rather than picking one silently.
- *Enforce the project's declared tag vocabulary.* If the project's tag legend declares an *exhaustive* set of allowed tags, strip from each task any tag outside that set — the heading and parent section already carry topic/scope context, so ad-hoc tags only fragment the vocabulary and defeat tag-based filtering. Normalize near-duplicate spellings to the canonical tag (a plural to its singular, say). Where the legend does not declare the set closed, leave existing tags alone; this step applies only where the allowed set is exhaustive by design.
-- *Re-assess the =:quick:= and =:solo:= tags* — reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the definitions in the project's tag legend (and [[file:task-review.org][task-review.org]]) when the reconciled facts make the call clear. When they don't — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess.
+- *Re-assess the =:quick:= and =:solo:= tags (mandatory — an audit that skips this is incomplete).* Reconciliation can change a task's effort or autonomy: a resolved dependency may make a stuck task =:solo:=, a scope cut may make it =:quick:=, and new complexity surfaced by the sources can invalidate either. Add or remove the tags per the hard definitions in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"; task-review carries the same three-gate walk). Autonomous execution reads =:solo:= as its eligibility gate and trusts the tag, so a stale one is a run-time hazard, not cosmetic drift. When the call isn't clear — an effort estimate you can't pin down, a =:solo:= gate you can't confirm — it's a NEEDS-USER flag, not a guess.
- Bump =:LAST_REVIEWED:= on each edited task.
Follow =todo-format.md= for completion mechanics (depth-based DONE vs dated-rewrite) and the working-files / link-hygiene rules when moving artifacts.
@@ -99,6 +101,21 @@ Never merge or re-parent autonomously — which tasks belong together, and wheth
When no clear cluster exists, say so in one line and move on — most audits won't find one, and forcing a merge fragments worse than it consolidates.
+** Phase C.6 — Retire completed parents and promote stragglers (interactive)
+
+Phase C.5 consolidates related *open* tasks. This step retires parent tasks whose work is *finished*, so completed containers don't linger in Open Work as scaffolding.
+
+Run =todo-cleanup.el --convert-subtasks= first (it's part of the =clean-todo= / wrap-up cleanup, and =open-tasks.org= runs it too) so every completed sub-task is a dated event-log entry rather than a lingering =DONE= keyword. The closure logic below reads "open child" as a child heading still carrying a task keyword (=TODO=/=DOING=/=WAITING=/=VERIFY=/=NEXT=/=PROJECT=/=STALLED=/=DELEGATED=); a dated entry is correctly not open.
+
+Two shapes, both proposed to Craig (inline numbered options per =interaction.md=, no popup) before applying:
+
+- *Zero open children → close the parent.* A parent whose child *tasks* are all resolved (now dated) and that carries no open child task is finished: close it per =todo-format.md= (=**= parent → =DONE=/=CANCELLED= + =CLOSED:=), and it moves to Resolved on the next =--archive-done=. If the work resurfaces later, a fresh task is created then; a completed container shouldn't sit open as a placeholder.
+- *One or two open children → promote, then close.* When a parent has only one or two open children, pull them out and rewrite them as standalone =**= level-2 tasks — give each a priority per the project scheme, and make the heading stand alone without the parent's context — then close the now-childless parent and let it move to Resolved. The former children become first-class Open Work tasks; the retired parent stops being scaffolding for one or two stragglers.
+
+*The leaf-with-notes carve-out (important).* "Zero open children" is not the same as "done." A =**= leaf task whose only descendants are dated *notes* — a captured "Ideas", "Goals", or "Current State" entry, not a real completed sub-task — is unstarted work with a note attached, not a finished container. Do not close it. Tell the two apart by intent: a container reads as a grouping (a =PROJECT= keyword, an explicit "parent grouping ..." line, or several dated entries that were genuinely separate sub-tasks that shipped); a leaf-with-notes is a single feature/bug task whose title names unstarted work and whose lone dated child is a design note. When the call is ambiguous, flag it NEEDS-USER rather than closing.
+
+Never close or promote autonomously past the ambiguous line — surface the candidates with a recommendation and let Craig ratify, the same interactive stance as Phase C.5. Clear container completions (a =PROJECT= whose every child is dated) can be proposed as a batch; leaf-with-notes ambiguities are flagged individually. Verify open-vs-done counts against the actual headings (a real scan of the subtree), not a fragile regex that a shell's =\b= support can silently break — a miscount here closes live work.
+
** Phase D — Flag the judgment calls (interactive)
Present the NEEDS-USER bucket as a short, scannable list — one line per task, naming the decision or the fact required. Adjudicate with the user one item at a time (inline numbered options per =interaction.md=, no popup). Apply the user's calls as they come (which may itself produce more autonomous updates, or new tasks).
@@ -147,3 +164,7 @@ Two Phase C behaviors added, both surfaced by an Emacs-config =todo.org= audit:
- *Tag-vocabulary enforcement.* That project declares a closed tag set (=bug=, =feature=, =refactor=, =test=, =quick=, =solo=); the audit had to strip ~44 ad-hoc tags that had accumulated across the file. The prior workflow only checked that a type tag was *present* — it had no concept of an exhaustive allowed set. The new bullet enforces a declared closed vocabulary and leaves open-vocabulary projects untouched.
- *Code-complete-but-unverified closing.* Many tasks had shipped (tests green, live in the daemon) but stayed open awaiting a manual or visual verification, so they accumulated as half-open. Leaving them open is noise; auto-closing them would violate "never claim a fix verified before the user confirms." The fix routes the pending human check into the project's =Manual testing and validation= parent (dedup-checked) per =verification.md='s manual-verification hand-off, then closes the implementation task. The work is done and the check is tracked; a failed check promotes to a bug.
+
+** 2026-07-01 — Retire completed parents (Phase C.6)
+
+Added Phase C.6: retire a parent task once its child *tasks* are all done. Zero open children → close the parent; one or two open children → promote them to standalone level-2 tasks, then close. Surfaced by an Emacs-config =todo.org= audit where several PROJECT containers had all children complete. Depends on =todo-cleanup.el --convert-subtasks= running first so completed sub-tasks are dated (not lingering =DONE= keywords) and the open-child count is accurate. Carries a leaf-with-notes carve-out: a =**= leaf task whose only descendant is a dated design note ("Ideas"/"Goals") is unstarted work, not a finished container, and must not be closed — the ambiguous case is flagged NEEDS-USER. The step also warns against counting open-vs-done with a fragile regex (a =\b= that a given shell/awk silently drops miscounts and closes live work).
diff --git a/claude-templates/.ai/workflows/task-review.org b/claude-templates/.ai/workflows/task-review.org
index 69e172d..ba1571a 100644
--- a/claude-templates/.ai/workflows/task-review.org
+++ b/claude-templates/.ai/workflows/task-review.org
@@ -57,7 +57,9 @@ Keep is the common case — most tasks are still right and just need re-stamping
*** Tagging =:quick:= — small tasks
-While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment.
+The =:quick:= and =:solo:= assessments (this section and the next) are *mandatory* for every reviewed task except a Kill — a review that skips them is incomplete. The hard definitions live in [[file:../../claude-rules/todo-format.md][todo-format.md]] ("Hard definitions: :solo: and :quick:"); autonomous execution (work-the-backlog / the no-approvals speedrun) reads =:solo:= as its eligibility gate and trusts the author's tag, so the run-time gate is only as trustworthy as this pass.
+
+While reviewing each task, estimate its effort. If you judge it *30 minutes or less* and it doesn't already carry =:quick:=, add the tag to the heading line. If the heading and body don't tell you how long it'll take, *ask Craig* — don't guess. A wrong =:quick:= is worse than none: the tag exists so Craig can grab a genuinely small task in a spare moment, and a mislabeled one wastes that moment. =:quick:= is an effort hint only, never an eligibility gate — size does not decide what runs autonomously.
This is orthogonal to the action chosen — a task can be kept (or re-graded, or marked DOING) *and* tagged =:quick:= in the same pass. Skip the assessment on a Kill, since it's leaving the pool. Tags go on the heading line per [[file:../../claude-rules/todo-format.md][todo-format.md]], sharing one =:tag1:tag2:= cluster.
@@ -67,7 +69,7 @@ While reviewing each task, judge whether Claude could build *and* verify it with
1. *Buildable* — Claude has the capability and access to do the work.
2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all — when the success criterion is only judgeable by Craig's eyes or subjective taste.
-3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+3. *No deliberation* — no open design question and no "weigh these approaches" with real tradeoffs. At most one or two *quick, upfront-answerable* factual decisions are allowed — the speedrun preset batches those into its pre-flight Q&A, so they don't break the hands-off run. A genuine design or preference call disqualifies.
If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
@@ -96,6 +98,8 @@ The exact date string matters: =task-review-staleness.sh= and the wrap-up health
Follow the completion rules in [[file:../../claude-rules/todo-format.md][todo-format.md]]. A killed top-level =**= task stays task-shaped: change the keyword to =CANCELLED=, add a =CLOSED: [YYYY-MM-DD Day]= line under the heading (generate with =date "+%Y-%m-%d %a"=), and leave the priority and tags intact. It's then a candidate for =--archive-done= at the next cleanup. Don't stamp =:LAST_REVIEWED:= on a kill — it's leaving the review pool anyway.
+A killed *sub-task* (=***= or deeper, under a parent task) instead becomes a dated event-log entry per the depth rule — but you don't have to hand-format it here. =todo-cleanup.el --convert-subtasks= (run in the =clean-todo= and wrap-up cleanup passes) rewrites any level-3+ DONE/CANCELLED/FAILED heading into its dated form mechanically from the =CLOSED= cookie, so a keyword-plus-=CLOSED= close at depth gets normalized on the next cleanup rather than lingering. =lint-org.el= flags any that slip through (checker =subtask-done-not-dated=).
+
* Phase D: Close out
When the batch is done (or Craig calls it early):
diff --git a/claude-templates/.ai/workflows/work-the-backlog.org b/claude-templates/.ai/workflows/work-the-backlog.org
new file mode 100644
index 0000000..b0666e7
--- /dev/null
+++ b/claude-templates/.ai/workflows/work-the-backlog.org
@@ -0,0 +1,263 @@
+#+TITLE: Work the Backlog
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-07-02
+
+* Overview
+
+The single home for the autonomous task-execution loop: take a set of marked, solo-doable tasks from the project's =todo.org= and work them unattended, each held to the full quality bar, under a fixed safety contract. Spec: =rulesets/docs/specs/2026-06-16-autonomous-batch-execution-spec.org=.
+
+Two callers feed it, differing only in how they build the task set and which session mode they pass:
+
+- The *inbox auto-loop* (=inbox.org= auto mode) chains here after its routing completes, with a tag/priority query, file-only mode, cap 1.
+- The *no-approvals speedrun* preset feeds an explicit ordered list with autonomous-commit + always-push + paging-on, after a pre-flight Q&A that front-loads every decision.
+
+This workflow owns the execution logic — eligibility gate, defer checklist, quality bar, run cap. Callers own input assembly and mode selection. Capture-routing (inbox surfaces) stays entirely in =inbox.org=; this file never reads an inbox.
+
+* When to Use This Workflow
+
+Invoked by its two callers, or directly by phrase:
+
+- *Speedrun triggers:* "speedrun", "no approvals speedrun", "speedrun these: <task set>" — run the no-approvals speedrun preset (below). The word "speedrun" always routes here, even when the phrase also says "no approvals": plain =no-approvals.org= is the general session mode; the speedrun is this workflow's preset over an explicit task set.
+- *Loop caller:* =inbox.org= auto mode chains here after its routing (below). Not phrase-triggered.
+
+Manual fallback: "work the backlog" / "work the backlog with <task set>" — gather the three inputs below (ask for whichever are missing, defaulting to file-only mode; default cap is the list length for an explicit set, 1 for a query) and run the loop.
+
+* Inputs — the caller contract
+
+A caller hands this workflow three things:
+
+1. *A task set* — an ordered list of candidate task headings from the project's =todo.org=. Either an explicit ordered list (speedrun) or the result of a tag/priority query (the loop). The loop does not care how the set was assembled; it receives an ordered list of candidates.
+2. *A session mode* — two orthogonal flags:
+ - *Commit autonomy:* =file-only= (default) or =autonomous-commit=. See "Commit autonomy" below.
+ - *Paging:* on or off. End-of-set only.
+3. *A run cap* — the hard maximum number of tasks to complete this run.
+
+It returns a per-task outcome and a run summary.
+
+* Outcomes — the per-task vocabulary
+
+Every task in the set ends in exactly one of:
+
+- =implemented-committed= — implemented, committed (and pushed per the project's flow) under =autonomous-commit=.
+- =implemented-diff-surfaced= — implemented, diff surfaced, *not* committed (=file-only=).
+- =deferred-VERIFY= — a defer-checklist hit; a =VERIFY= filed naming what's missing or risky.
+- =dropped-by-craig= — removed from the run at the speedrun pre-flight Q&A ("skip this").
+- =skipped-ineligible= — failed the mechanical eligibility gate.
+- =failed= — implementation was attempted and abandoned: the tree is left working (never commit a broken state), the failure is surfaced in the run summary, and the run continues to the next task.
+
+The run summary lists each task with its outcome, plus the remaining set when the cap stopped the run.
+
+* The loop
+
+For the task set, in order, until the run cap is hit:
+
+1. *Eligibility gate* (below). Ineligible → record =skipped-ineligible=, next task.
+2. *Scope read* of the relevant code. Cheap; just enough to run the defer checklist.
+3. *Defer checklist* (below). Any hit → defer: file the =VERIFY= naming the gap and record =deferred-VERIFY= (or, under the speedrun preset, route a quick-question gap to the pre-flight Q&A), next task.
+4. *Implement* under the project's commit discipline: TDD red→green→refactor, then =/review-code --staged=, fix all Critical/Important findings, then close the task per =todo-format.md='s completion rules. Decompose into as many logical commits as the change needs — size is not capped. If implementation fails partway, leave the tree working, record =failed=, surface it, and continue to the next task.
+5. *Commit autonomy branch:*
+ - =file-only= → surface the diff, do *not* commit. Record =implemented-diff-surfaced=.
+ - =autonomous-commit= → =/voice personal= on the message, commit individually, push per the project's flow. Record =implemented-committed=.
+6. *Record metrics* for the task (the JSONL append — see Metrics below).
+7. Decrement the cap. At zero, stop.
+
+After the set: if the paging flag is set, fire the end-of-set page (below). Surface the run summary either way.
+
+* Eligibility gate — mechanical, no judgment
+
+A task is autonomous-safe when *both* hold. This layer is a lookup, not a judgment; all the judgment lives in the defer checklist.
+
+1. *Status is =TODO=* — never =VERIFY=, =DOING=, =DONE=, or =CANCELLED=. =VERIFY= marks "awaiting Craig's input"; auto-implementing one defeats the check it represents. The do-not-implement set is safe-by-omission: anything not plainly =TODO= (plus any project-declared "hold" marker) is out.
+2. *Tagged =:solo:=* — the autonomy tag, resolved against the project's priority/tag scheme header in =todo.org= (never hardcoded). =:solo:= carries the hard definition in =todo-format.md=: completable and verifiable without Craig beyond at most one or two quick decisions answerable up front, no design deliberation. A project whose scheme declares a different autonomous-safe tag set overrides the default.
+
+Priority and =:next:= drive *ordering* within the eligible set, not eligibility ([#A] before [#B] before [#C], then the author's ordering). =:quick:= is an effort hint for batching and duration estimates — never a gate.
+
+Task *size* is deliberately absent from this gate. A large but well-specified, decision-free task is in scope and gets decomposed into per-logical-commit chunks during implementation. Size never sends a task away; only *deliberation* or *risk* does (the checklist below).
+
+*No scheme header → don't run.* The gate reads =:solo:= semantics from the project's scheme header; a =todo.org= without one leaves the tag undefined (=todo-format.md= makes the header mandatory). Surface that the header is missing and stop rather than guessing eligibility.
+
+* The defer checklist — act vs file
+
+After the scope read, run each eligible candidate through the checklist. Each item is a concrete, answerable question, not an adjective. *Any* hit — or any "unsure" — defers the task. Only a task that clears every item is implemented.
+
+1. *Test-writability (the keystone).* Can I write the failing test from the task text — plus any decisions gathered up front — without inventing a requirement? *No / unsure* → underspecified. Under the speedrun preset, if the gap is one or two quick answerable questions, route it to the pre-flight Q&A; otherwise file a =VERIFY= noting what's missing. Under the unattended loop, file the =VERIFY= (no one to ask).
+2. *Data-loss / irreversible / external operation.* Does implementing it require any of: =rm= of non-scratch data, =git reset --hard= / force-push, =DROP= / =DELETE= / =TRUNCATE=, file truncate/overwrite of persisted content, a schema or data migration, any external or shared-state mutation, any credential touch? *Yes* → do NOT implement; file a =VERIFY= naming the risk. This is the hard safety gate; an upfront answer never overrides it without an explicit checkpoint.
+3. *Already-satisfied.* Does the scope read show the desired end-state already holds? *Yes* → file a =VERIFY= noting it and move on. Don't make a no-op change.
+4. *Design deliberation.* Does the task carry an unresolved design question, a "weigh these approaches" with real tradeoffs, or a TBD that isn't a quick factual answer? *Yes* → under the speedrun preset, if it collapses to one or two quick questions, route to the pre-flight Q&A; otherwise file and surface as a =/start-work= candidate. Under the loop, file. The discriminator is *quick-answerable question* vs *deliberation* — never task size.
+
+When genuinely unsure which side a task falls on, defer — a wrong auto-implement costs a revert *and* the next-session correction.
+
+** Filing the deferral =VERIFY=
+
+Every checklist hit files a =VERIFY= in the project's =todo.org=, per =todo-format.md='s VERIFY rules:
+
+- *Dedup first.* If a =VERIFY= sibling for this deferral already exists (a prior run filed it), don't file another — record the outcome as =deferred-VERIFY= with a "previously filed" note and move on. The deferred task keeps its =TODO= status and tags, so without this check every subsequent run would re-defer and re-file.
+- *Placement:* sibling of the deferred task (the deferred task is the trigger) — a =**= task gets its =VERIFY= at =**=, a =***= sub-task gets it at =***= under the same parent, never deeper.
+- *Heading:* carries the question or risk on its own ("VERIFY <topic> — migration touches persisted rows").
+- *Body:* which checklist item hit, what's missing or risky, and what answer or action would make the task runnable. For an already-satisfied hit, the evidence that the end-state already holds.
+
+** Routing a quick-question gap (speedrun only)
+
+Under the speedrun preset, a checklist-1 or checklist-4 hit that collapses to one or two quick answerable questions routes to the pre-flight Q&A instead of deferring (see the preset section below). The discriminator: a *quick question* is a factual or preference pick answerable in one line without weighing tradeoffs ("cap at 5 or 8?", "which config key name?"); *deliberation* is anything that needs tradeoffs weighed, options explored, or code read by Craig. A task needing three or more questions isn't quick-question-gapped — it's underspecified; file the =VERIFY=. Checklist item 2 (data-loss / irreversible) never routes to the Q&A: an upfront answer doesn't override the hard safety gate.
+
+The unattended loop has no one to ask — every hit defers there.
+
+* Per-task quality bar
+
+Autonomy changes who approves, not what quality means. Per task, non-negotiable:
+
+- *TDD* per =testing.md=: red first, green, refactor. The keystone checklist item already proved the failing test is writable.
+- *Verification* per =verification.md=: fresh evidence, full suite green before any commit.
+- *=/review-code --staged=* before every commit; Critical and Important findings block until fixed.
+- *=/voice personal=* on every commit message on the =autonomous-commit= path (or the patterns walked inline if the skill is unavailable), message printed inline so the log shows what landed.
+- *Task closure* per =todo-format.md=: depth-based completion (keyword + =CLOSED:= at level 2, dated rewrite at level 3+).
+- *One logical change per commit.* A large task becomes several commits, not one omnibus.
+
+* Commit autonomy
+
+=file-only= is the default: surface the diff, never commit. =autonomous-commit= is honored only when the project carries the commit-autonomy waiver, read fresh each run — never from memory of past runs or "this project usually allows it."
+
+The waiver lives in the project's =.ai/notes.org= *Workflow State* section as marker lines, the same shape as the workflow markers already there:
+
+#+begin_example
+:COMMIT_AUTONOMY: yes
+:LOOP_MAY_COMMIT: yes
+#+end_example
+
+- =:COMMIT_AUTONOMY: yes= — the project has the waiver. An =autonomous-commit= request (the speedrun preset, or a manual run asking for it) is honored.
+- =:LOOP_MAY_COMMIT: yes= — the *unattended loop caller* may also commit. It requires =:COMMIT_AUTONOMY:= alongside it; the split exists because "Craig-initiated speedrun may commit" and "the recurring loop may commit unattended" are different levels of trust. Without this flag the loop stays =file-only= even when the project holds the waiver.
+
+An absent marker means no. Anything other than a plain =yes= value also means no. The read is one grep of the Workflow State section — a lookup, not a judgment.
+
+*The degrade contract.* When a caller requests =autonomous-commit= and the required marker is missing, degrade to =file-only= and surface it in both the run intro and the run summary: "autonomous-commit requested, no :COMMIT_AUTONOMY: waiver in notes.org — running file-only." Never honor the request without the marker, and never drop to file-only silently — the first commits into a project that didn't opt in, the second hides why nothing got committed.
+
+* Bounding the run
+
+The cap is a hard per-run task ceiling passed by the caller — the kill switch a runaway can't exceed:
+
+- *Loop caller default: 1.* Implement the highest-priority eligible candidate, record, stop; the next tick continues.
+- *Speedrun: the length of the explicit list*, capped at a ceiling — the human bounded the set by naming it.
+
+Even the speedrun stops at the cap and surfaces (and, with paging on, pages) the remainder. The cap bounds task *count*, not cost; a token budget is logged as vNext.
+
+* Context hygiene — auto-flush between tasks
+
+Task boundaries are clean boundaries by construction: the previous task is closed and committed (or filed), nothing is half-edited. When the context window grows heavy mid-run, run the flush skill's *auto mode* between tasks: checkpoint the session anchor with the remaining task set, session mode, and cap in Next Steps (so the resumed context continues the run blind), arm the self-injection (=.ai/scripts/self-inject.sh= via =tmux run-shell -b=), and end the turn. The fresh context resumes from the anchor and works on. Unattended runs only — the keystroke-collision hazard and the full mechanism live in the flush skill.
+
+* End-of-set page
+
+With paging on, fire one page when the set is done or the cap is hit — end-of-set only, never per-task:
+
+#+begin_src sh
+notify info "Page" "<project>: <N> done, <M> remaining — <one-line summary>" --persist
+#+end_src
+
+=--persist= keeps it on screen until dismissed, and =info= is the page-me urgency convention (persistent but never crash-scary). The page fires when the set completes *or* the cap stops the run — either way exactly once. The message carries the project name, the completed count, and the remaining count (with skipped tasks noted in the run summary) so Craig can confirm ready and name the next project in one reply. There is no separate page-signal call — =notify= is the paging surface.
+
+* Metrics
+
+Each task outcome appends one JSON line to the project's =.ai/metrics/work-the-backlog.jsonl= — git-tracked, append-only, =jq=-queryable. Create the directory and file on the first append. Logging is a side effect only: a failed append surfaces a warning in the run summary but never blocks, reorders, or aborts execution.
+
+One record per task, written at the moment its outcome is decided:
+
+| Field | Meaning |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =ts= | ISO-8601 timestamp of the task outcome |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =run_id= | UUID shared by every record in one run (=uuidgen= at run start) |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =project= | project basename |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =caller= | =loop= / =speedrun= / =manual= |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =task= | the task heading (slug) |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =outcome= | =implemented-committed= / =implemented-diff= / =deferred-verify= / =skipped-ineligible= / |
+| | =dropped-by-craig= / =failed= |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =defer_reason= | =underspecified= / =data-loss= / =already-satisfied= / =needs-deliberation= — set on |
+| | =deferred-verify= records only |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =upfront_decision= | =true= when a pre-flight answer was recorded and used for this task |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =wall_clock_s= | seconds from task start to outcome |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =commit_sha= | committed tasks: the commit SHA (comma-separated when the task decomposed into several); empty |
+| | otherwise |
+|--------------------+-------------------------------------------------------------------------------------------------|
+| =review_findings= | count of =/review-code= Critical + Important findings on this task |
+|--------------------+-------------------------------------------------------------------------------------------------|
+
+The =outcome= slugs map one-to-one onto the outcome vocabulary above (=implemented-diff= is =implemented-diff-surfaced=; =deferred-verify= is =deferred-VERIFY=). Per-run rollups (attempted / completed / deferred / dropped, wall-clock total, findings per commit) are computed at synthesis, not stored per record. The =commit_sha= field is what the synthesis step's corrections signal keys on — whether a later commit reverted or hand-fixed an autonomous one — so never omit it on a committed task.
+
+* Caller: the inbox auto-loop
+
+=inbox.org= auto mode chains here as an explicit second step *after* its routing completes — never as a phase inside inbox processing. When a cycle files new items and Craig answers "run this batch next?" with yes, auto mode invokes this workflow with:
+
+- *Task set:* the eligibility query over the queued/filed items — status =TODO= + =:solo:= per the scheme header, priority-ordered.
+- *Session mode:* =file-only=, paging off. (A project carrying both =:COMMIT_AUTONOMY:= and =:LOOP_MAY_COMMIT:= markers opts the loop into commits — see Commit autonomy above.)
+- *Cap: 1.* The highest-priority eligible candidate runs, gets recorded, and the loop's next tick (or the next yes) continues from there.
+
+The loop has no human at kickoff of each task, so a needs-quick-decisions task defers with a =VERIFY= — the pre-flight Q&A is a speedrun capability, not a loop one. Startup and wrap-up never invoke this workflow.
+
+* Preset: the no-approvals speedrun
+
+The named preset is a label for one flag combination, not a second code path: *explicit ordered list + =autonomous-commit= + always-push + paging-on*, with every approval front-loaded into a single pre-flight step. "No approvals" means all input first, then hands-off — not no input ever. =autonomous-commit= still requires the =:COMMIT_AUTONOMY:= waiver (Commit autonomy above); without it the preset degrades to =file-only= and says so in the pre-flight intro.
+
+When Craig names a task set and says "speedrun":
+
+1. *Gather* the named task set.
+2. *Scope-read and classify* each task against the eligibility gate + defer checklist: *ready* (clears everything), *needs-quick-decisions* (one or two upfront-answerable questions — checklist item 1 or 4), or *drop* (data-loss/irreversible, or deliberation that isn't a quick question).
+3. *Order* the list — priority, then the author's ordering / =:next:=.
+4. *Intro the work* — present the ordered plan: what will run, what was dropped and why, and the batched questions for the needs-quick-decisions tasks.
+5. *Craig answers each question or says "skip this"* — a skip removes the task (recorded =dropped-by-craig=; the task itself stays =TODO=); an answer is recorded so implementation works from the decision, not a guess.
+6. *Run the finalized list autonomously* — no further approvals until done. Cap = the list length (the human bounded the set by naming it), still one commit per logical change, always-push per the project's flow, auto-flushing between tasks when the context grows heavy (see Context hygiene above).
+7. *End-of-set page* with completed + remaining + skipped.
+
+The batch-ask (step 4-5) is one message: each question names its task, puts the recommended answer at item 1 when there is one (per =interaction.md= — inline numbered, no popup), and offers "skip this" as the last option. Before the run starts, write each answer into its task's body in =todo.org= as a dated line — the implementation works from the recorded decision, and the record survives the session. The Q&A fires only under this preset; the loop caller never asks (its decision-needing tasks defer).
+
+*** Per-item disposition rule
+
+For every item the run picks up (this holds for any executing caller, including an auto-inbox-zero run given a standing yes):
+
+- *Feature-level task* → write a spec first (=spec-create=), don't implement directly. The spec is the run's deliverable for that item.
+- *Needs decisions you can't confidently guess* → file it as a =VERIFY= carrying the question (under this preset, one or two quick questions route to the pre-flight Q&A instead).
+- *Well-defined* → implement it, taking the time it needs.
+
+This extends the defer checklist: the checklist decides *act vs file*; this rule decides the *shape* of the act.
+
+* Synthesis: metrics → org-roam KB
+
+Trigger: "synthesize backlog metrics" (optionally a weekly scheduled run). This is the read side of the metrics log — Craig's ask was "gather data and create org-roam articles we can look at later," and this step is the second half. It is read-only over the logs plus exactly one KB write.
+
+1. *Gather the JSONL union.* Discover =.ai/metrics/work-the-backlog.jsonl= across the project roots (dirs carrying =.ai/protocols.org= under =~/code=, =~/projects=, =~/.emacs.d=). Classify each project per =knowledge-base.md= (work-root denylist, never inference) before reading it into the union.
+2. *Enforce personal-only.* A work-classified or unknown project's metrics never enter the KB write — they stay in that project's own log. Report the exclusion per the KB refusal contract: the classification, a one-line redacted summary, and where the data stayed.
+3. *Compute the rollups and trends.* Per run: attempted / completed / deferred (by reason) / dropped / failed, wall-clock total, commits landed, review findings per commit. Trends across runs: completion rate over time, defer-reason distribution, findings-per-commit trend.
+4. *Compute the corrections signal* — the key metric. For each =commit_sha= in the window, check that project's history for a later commit (within ~14 days) that reverts it or carries a fix touching the same files. A clean run is one whose autonomous commits survive untouched; a flagged run is what Craig reviews by hand. This is a cheap proxy, not proof — it flags candidates, it doesn't convict.
+5. *Write one KB node* at =~/org/roam/agents/YYYYMMDDHHMMSS-backlog-metrics-<window>.org= per =knowledge-base.md=: =:agent:metrics:= filetags, a concise title, the rollup table, the trend narrative, and =[[id:...]]= links to prior synthesis nodes so the series is traceable. Pull before writing, commit and push after — the normal KB session discipline.
+
+The KB node is the artifact Craig reads later: "are the runs completing more and getting corrected less?" should read off the trend table without touching raw logs. Synthesis never mutates the JSONL, todo.org, or any project tree.
+
+* Common Mistakes
+
+1. *Implementing a =VERIFY= or =DOING= task.* The gate is status =TODO= only — a =VERIFY= exists precisely because Craig's input is pending.
+2. *Treating =:quick:= as eligibility.* It's an effort hint. =:solo:= is the gate.
+3. *Deferring on size.* A large, well-specified, decision-free task runs — decomposed into logical commits. Size is not a checklist item.
+4. *Guessing past the keystone.* If the failing test isn't writable from the task text, the task isn't ready. Inventing the requirement is the failure the checklist exists to stop.
+5. *Rationalizing through the data-loss list.* "The migration is small" doesn't clear checklist item 2. Enumerated operations defer, full stop.
+6. *Committing in =file-only= mode.* The diff is the deliverable; the commit is Craig's.
+7. *One omnibus commit for the whole run.* Every logical change is its own reviewed commit.
+8. *Skipping =/review-code= or =/voice= because nobody's watching.* Autonomy removes interaction gates, never engineering-discipline gates (same contract as =no-approvals.org=).
+9. *Running past the cap.* The cap is the kill switch; hitting it means stop and surface, even mid-set.
+10. *Paging per-task.* One page, end of set.
+11. *Honoring =autonomous-commit= from memory.* The waiver is the marker line in =notes.org=, read fresh each run. "This project usually allows it" isn't a read.
+12. *Re-filing the same deferral =VERIFY= every run.* The deferred task stays =TODO=, so a run that skips the existing-sibling check spams =todo.org= with duplicates.
+13. *Routing a data-loss hit to the pre-flight Q&A.* Checklist item 2 is the hard gate — an upfront answer never clears it without an explicit checkpoint.
+
+* Living Document
+
+Refine as the dogfooding signal arrives — the metrics log and the corrections-in-next-session signal are the feedback loop. Fold recurring adjustments in rather than accumulating caller-side workarounds.
+
+* History
+
+Created 2026-07-02 as Phase 1 of the autonomous-batch execution spec, reconciling the inbox-zero "Phase E" proposal and the =.emacs.d= speedrun proposal into one execution loop. The auto-inbox-zero execute step in =inbox.org= reverted to routing-only in the same change so this file is the loop's only home. Phases 2-6 (same day) wired both callers, pinned the commit-autonomy waiver markers, fleshed the defer/Q&A/page mechanics, and added the metrics record + KB synthesis step.
diff --git a/claude-templates/.ai/workflows/wrap-it-up.org b/claude-templates/.ai/workflows/wrap-it-up.org
index 5d2cdd2..d0c4e75 100644
--- a/claude-templates/.ai/workflows/wrap-it-up.org
+++ b/claude-templates/.ai/workflows/wrap-it-up.org
@@ -137,6 +137,22 @@ Run the report-only variant first if you want to see what would change without w
emacs --batch -q -l .ai/scripts/todo-cleanup.el --check todo.org
#+end_src
+*** Convert done sub-tasks to dated entries
+
+#+begin_src bash
+[ -f todo.org ] && emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks todo.org
+#+end_src
+
+=--convert-subtasks= rewrites every heading at level 3 or deeper whose TODO state is DONE/CANCELLED/FAILED into a dated event-log entry (=<stars> YYYY-MM-DD Day @ HH:MM:SS -ZZZZ <text>=), dropping the keyword, priority cookie, and tags, and removing the now-redundant =CLOSED:= line. This enforces the =todo-format.md= depth rule that a completed *sub-task* (a heading under a parent task) becomes dated history, not a lingering DONE keyword — a shape an interactive org close (=org-log-done= → DONE + CLOSED) never applies and =--archive-done= (level-2 only) never reaches. The timestamp comes from each entry's own =CLOSED= cookie; a date-only close yields =00:00:00=. Heading text is kept verbatim. Idempotent (an already-dated heading has no keyword to match), and a done sub-task with no parseable =CLOSED= is flagged and left alone rather than stamped with a fabricated date.
+
+Run this *before* =--archive-done= so that when a completed level-2 parent is archived, its sub-tasks already carry their dated form. Any rewrites show up in the wrap-up commit's diff for review before push.
+
+Preview without writing:
+
+#+begin_src bash
+emacs --batch -q -l .ai/scripts/todo-cleanup.el --convert-subtasks --check todo.org
+#+end_src
+
*** Archive completed work
#+begin_src bash
@@ -244,6 +260,32 @@ The check exempts =lint-followups.org= explicitly because lint-org runs earlier
This integrates with =inbox.org= process mode, which stamps =:LAST_INBOX_PROCESS:= in =notes.org='s *Workflow State* section on completion. Wrap-up doesn't double-stamp. It only ensures the inbox carries nothing but the expected pipeline artifacts at session end.
+*** Cross-project router (optional — route filed keepers to their home projects)
+
+Runs directly after the inbox sanity check. The split between the two: the sanity check *gates* the wrap (a dirty inbox blocks until resolved); the router is *optional* (skipping it never blocks anything — the candidates just stay local until a future wrap). Spec: =docs/specs/wrapup-routing-spec.org= (D7/D8/D9).
+
+The candidate set is exactly the local tasks carrying a =:ROUTE_CANDIDATE:= property — keepers that inbox process mode filed this session whose inferred home is another project. Never scan the standing backlog.
+
+#+begin_src bash
+.ai/scripts/route-batch --list
+#+end_src
+
+*Empty set = zero interaction.* =--list= prints nothing when there are no candidates; continue the wrap silently — no prompt, no "0 items" line.
+
+When candidates exist, surface the batch as one line per task — the task heading, the destination project, the delivery mode (=inbox-send= file handoff), and the engine's confidence — then offer exactly two options: *go* (route the whole batch) or *skip* (leave everything local). Derive each confidence label by running the engine on the task's heading + body (=python3 .ai/scripts/route_recommend.py --item "..." --exclude "$(basename "$PWD")"=); label weak matches visibly ("weak — verify the destination") so a low-confidence route gets a human glance before the keystroke.
+
+On *go*:
+
+#+begin_src bash
+.ai/scripts/route-batch --go
+#+end_src
+
+Per candidate, the helper writes the task's subtree (children ride along; =:ROUTE_CANDIDATE:= stripped, headings promoted to top level) to a one-task handoff, delivers it via =inbox-send <destination> --file= (so the =from-<this-project>= provenance is stamped and the destination's inbox process mode dispositions it as a single item), and only after a successful send removes the subtree from the local =todo.org= — a single-file local edit the wrap is already committing. A failed send leaves that task in place and exits non-zero; report it and continue the wrap. Never write the destination's =todo.org= directly; its own inbox processing files the task per its conventions.
+
+On *skip*, leave every candidate in place, marker included — they resurface next wrap.
+
+Mis-routes are recoverable: the receiving project rejects via inbox process mode's reject-from-another-project flow, which returns the item to this project's inbox with the rationale. That reject path is why removing the local source on send is safe.
+
*** Review-habit health check (surface a slipped daily task-review)
The daily task-review habit walks the open top-level tasks on a rotating cycle, stamping =:LAST_REVIEWED:= as it goes (see =task-review.org=). This check is the watchdog for that habit. When tasks have gone too long unreviewed, the habit has slipped, and the wrap-up says so in one line — it does not re-list the tasks.
@@ -536,7 +578,7 @@ Before considering wrap-up complete:
- [ ] The Summary ends with the =KB: promoted N / consulted yes-no= line (promotion check ran)
- [ ] File renamed to =.ai/sessions/YYYY-MM-DD-HH-MM-description.org=
- [ ] =.ai/session-context.org= no longer exists
-- [ ] =todo-cleanup.el= ran — hygiene pass + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root)
+- [ ] =todo-cleanup.el= ran — hygiene pass + =--convert-subtasks= + =--archive-done= + =--sync-child-priority= (if =todo.org= exists at project root)
- [ ] =lint-org.el= ran on =todo.org= — mechanical fixes applied, judgments appended to follow-ups file (if =todo.org= exists)
- [ ] Any orphan-planning-line warnings reviewed (fix or accept)
- [ ] Inbox carries nothing but expected pipeline artifacts (=.gitkeep=, =lint-followups.org=, =PROCESSED-*= prefixes), OR each remaining handoff has an explicit deferral logged in the valediction