Diffstat (limited to 'todo.org')
-rw-r--r-- todo.org 162
1 files changed, 88 insertions, 74 deletions
diff --git a/todo.org b/todo.org
index 96423e7..0a050b1 100644
--- a/todo.org
+++ b/todo.org
@@ -161,18 +161,6 @@ The work project edited two synced scripts locally as a stopgap (2026-06-17) and
Note (2026-06-24): the Anki =#+TITLE= deck-name fix landed (commit 060a938) — =default_deck_name= is now =default_deck_name(input_path, org_text)= with a new docstring. The preserved 2026-06-17 =to-anki.py= predates that, so *don't* copy it wholesale (it would revert the title-fix). Re-derive the multi-tag changes against the current canonical =flashcard-to-anki.py= and keep the =#+TITLE= behavior.
-** DONE [#C] Guard against hardcoded host identity in synced files :feature:solo:
-CLOSED: [2026-07-02 Thu]
-:PROPERTIES:
-:CREATED: [2026-06-22 Mon]
-:LAST_REVIEWED: 2026-06-24
-:END:
-A =CLAUDE.md= / notes file that asserts mutable environment identity as a fixed fact ("This machine is ratio", a current OS, an IP, "the laptop") is false on every machine the synced/tracked file lands on but one. It bit a real archsetup session: a stale "this machine is ratio" line made the agent reason backwards all session while on velox. Proposal: a claude-rule — don't assert mutable host/env identity as a fixed fact in a tracked/synced project file; derive it at runtime and name the command (=uname -n= for host; the =hostname= binary is often absent). Optionally a codify- or startup-time lint flagging "this machine is <name>" / "the current host is" style claims. Proposal: [[file:docs/design/2026-06-21-host-identity-guard-proposal.org][proposal]]. From archsetup 2026-06-21.
-
-2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): rule + startup lint. A new claude-rules file plus a cheap grep probe in startup flagging host-identity claims in CLAUDE.md / notes.org fleet-wide.
-
-Resolution 2026-07-02: claude-rules/host-identity.md written (fixed-identity claims banned in tracked/synced docs, runtime derivation via uname -n, fleet-description carve-out, the archsetup worked failure) and linked machine-wide by make install. startup.org gained Phase A probe 13 (grep for "this machine/host/box is" claims in CLAUDE.md + notes.org, fixture-verified bash+zsh) and the Phase C host-identity flag line. Flags for judgment, never blocks.
-
** TODO [#C] coverage-summary.el documented as a local-only helper :chore:
:PROPERTIES:
:CREATED: [2026-06-22 Mon]
@@ -409,70 +397,9 @@ What we're verifying: the pilot's relink pass left todo.org and docs links worki
- After the Phase 3 pilot, open todo.org in Emacs and click three links that point at moved specs (including one from a dated log entry).
Expected: each opens the spec at its new docs/specs/ path.
-*** TODO [#D] Docs lifecycle vNext — org-agenda spec-status view :feature:
+*** TODO [#D] Docs lifecycle vNext — org-agenda spec-status view :feature:no-sync:
Once specs carry lifecycle TODO keywords under =docs/specs/=, add a custom org-agenda view that lists =DRAFT= / =READY= / =DOING= / terminal specs by status. Deferred from [[id:80b0787b-4a60-4c82-8a16-b383d3e3c8f2][the docs-lifecycle spec]]; not part of v1 because the grep board is sufficient until the status headings exist.
-** DOING [#C] No-approvals speedrun — cross-project autonomous-batch mode :feature:spec:
-:PROPERTIES:
-:CREATED: [2026-06-15 Mon]
-:LAST_REVIEWED: 2026-06-24
-:SPEC_ID: 90f623cd-fdbe-4f5c-b63d-b2f84d9151cf
-:END:
-A named mode for coding projects: Craig names an ordered task set and says "speedrun" / "no approvals speedrun"; the set is worked autonomously, each task held to the full quality bar (TDD red→green, =/review-code=, =/voice= on the commit) and committed + pushed as its own logical commit, with all needed quick decisions gathered in one pre-flight Q&A (answer or "skip this") and a VERIFY filed for anything underspecified or needing deliberation, plus an end-of-set page listing completed + remaining + skipped tasks. Task size is not a gate — large tasks decompose into per-commit chunks. Surfaced by .emacs.d from a 2026-06-15 theme-studio session where the shape worked. Source proposal: [[file:docs/design/2026-06-15-fix-speedrun-workflow-proposal.org]] (.emacs.d handoff 2026-06-15). Build via =spec-create= when worked; we handle the task in priority order.
-
-Skeptical-review read (open design questions to resolve in the spec, not settled here):
-- *Is it a new workflow or a documented preset?* The proposal frames it as no-approvals + always-push session modes plus an end page. Decide whether it needs its own workflow file or is mostly documentation of a preset over the two existing modes.
-- *Where/how the page fires* — every task vs end-of-set, and via what. The paging surface is in flux (=page-signal= removed 2026-06-12), so reconcile against =notify --persist= or whatever paging stands now.
-- *Auto-pull vs explicit list* — whether the set comes from an explicit ordered list or a tag/priority query.
-- *Guardrails* — must refuse to speedrun tasks needing design decisions or carrying data-loss risk without a checkpoint (the sender's biased-safe unused-tile flag is the worked example).
-
-*** 2026-07-01 Wed @ 22:10:35 -0400 Phase 0 landed — hard tag definitions + review/audit enforcement
-todo-format.md gained the "Hard definitions: :solo: and :quick:" subsection under the scheme header (fixed across projects: :solo: = buildable + agent-verifiable + no deliberation, with one-or-two upfront-answerable quick decisions allowed per the ratified spec; :quick: = ≤30-min effort hint, never a gate). task-review.org: the two tagging sections are now explicitly mandatory ("a review that skips them is incomplete") and gate 3 was realigned from "no upfront decision" to the spec's no-deliberation form — the stricter old wording predated the pre-flight-Q&A decision and would have wrongly excluded quick-question tasks. task-audit.org: the re-assess bullet is marked mandatory and points at the todo-format hard definitions as canonical. Phases 1-6 (work-the-backlog extraction, callers, commit gate, checklist/Q&A/page, metrics, synthesis) remain.
-
-*** 2026-06-16 Tue @ 00:53:36 -0500 Spec written; design questions answered
-Craig's "your call" (2026-06-16) answered in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][the autonomous-batch execution spec]], which reconciles this with Phase E into one feature:
-- *Most effective / workflow-vs-preset:* one dedicated =work-the-backlog.org= workflow holds the execution loop; "fix speedrun" is a thin named preset (no-approvals + always-push + end page) feeding it an explicit list, and the inbox-zero loop feeds it a tag query. Pros of the shared workflow: one execution loop to audit, inbox-zero's three callers stay clean, both input shapes reuse one guardrail set. Cons: one more workflow file and a caller-to-workflow indirection. The con list is shorter and lighter than the duplication cost of two separate features, which is why the shared workflow wins. The pros carry the more important entries (single audit surface, clean seam).
-- *Paging:* end-of-set only, via =notify ... --persist= (reconciled past the removed page-signal wrapper).
-- *Auto-pull vs explicit list:* both — explicit list for the preset, tag/priority query for the loop.
-- *Effectiveness measurement (the trial Craig asked for):* the spec designs a per-task JSONL metrics log (=.ai/metrics/work-the-backlog.jsonl=), a corrections-in-next-session signal, and a periodic synthesis step that writes =:agent:metrics:= org-roam articles for later review — the "gather data + create org-roam articles" loop.
-*** 2026-06-29 Mon @ 03:48:09 -0400 Ratified the autonomous-batch execution spec
-Craig ratified all eight decisions in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][2026-06-16-autonomous-batch-execution-spec.org]] (revised this session — size gate removed, crisp four-item defer checklist, =:solo:= / =:quick:= definitions + task-review/audit enforcement, speedrun pre-flight Q&A). Spec Status → ready; implementation-ready across Phase 0–6. Decisions grew from six to eight during the revision.
-
-*** 2026-07-02 Thu @ 00:44:59 -0400 spec-response decomposition — :SPEC_ID: bound, spec DOING
-Stamped the spec's UUID on this parent, broke Phases 1-6 into the build tasks below (plus the flip task and a live-trial validation child), and flipped the spec's status heading READY → DOING per the transition-ownership table.
-
-*** 2026-07-02 Thu @ 01:07:29 -0400 Phase 1 landed — execution loop extracted into work-the-backlog.org
-work-the-backlog.org written (canonical + mirror): caller contract (task set + session mode + cap), five-outcome vocabulary, the loop, mechanical eligibility gate (TODO + :solo: per scheme header, safe-by-omission, no-scheme-header → don't run), four-item defer checklist, per-task quality bar, cap/kill-switch semantics, page + metrics stubs pointing at Phases 4-5. inbox.org's auto-mode per-cycle item 3 reverted to routing-only (yes-path execution removed; mode intro + closing line updated to match). INDEX.org entry added. make test green, sync clean; nothing invokes the new workflow yet.
-
-*** 2026-07-02 Thu @ 01:13:33 -0400 Phase 2 landed — both callers wired
-inbox.org auto-mode item 3 regained its "run this batch next?" ask, now chaining into work-the-backlog as an explicit second step after routing (eligibility query + file-only + paging off + cap 1). work-the-backlog.org gained the two caller sections: the auto-loop contract and the no-approvals speedrun preset (seven-step pre-flight → autonomous-commit + always-push + paging-on over an explicit list; finer Q&A mechanics deferred to Phase 4). Speedrun trigger phrases live in the workflow + INDEX; "speedrun" always routes to the preset, with a disambiguation note in no-approvals.org and its INDEX entry. Each caller independently exercisable.
-
-*** 2026-07-02 Thu @ 01:18:07 -0400 Phase 3 landed — waiver-gated commit autonomy
-Pinned the waiver format per D5: two marker lines in .ai/notes.org Workflow State — :COMMIT_AUTONOMY: yes (has the waiver) and :LOOP_MAY_COMMIT: yes (the unattended loop may also commit; requires the first). Absent or non-yes reads as no; the read is a fresh grep each run, never memory. Degrade contract written into work-the-backlog.org (surface in run intro + summary, never honor without the marker, never degrade silently); caller sections + Common Mistakes updated. Stamped rulesets' own :COMMIT_AUTONOMY: yes; :LOOP_MAY_COMMIT: deliberately not granted — Craig's call. .emacs.d holds the waiver too but its notes.org is its own scope; told via inbox-send to stamp its marker.
-
-*** 2026-07-02 Thu @ 01:21:47 -0400 Phase 4 landed — checklist mechanics, pre-flight Q&A contract, page
-The four-item checklist (in since Phase 1) gained its mechanics: a VERIFY-filing subsection (dedup against an existing sibling first — the deferred task stays TODO, so without the check every run re-files; placement/heading/body per todo-format.md) and a quick-question routing subsection (discriminator: one-line factual/preference pick vs tradeoff-weighing; three-plus questions = underspecified = file; item 2 data-loss never routes to Q&A). Preset section gained the batch-ask contract (one message, recommendation-first numbered options per interaction.md, answers recorded as dated lines in the task bodies before the run). Page section finalized (fires once on set-done or cap-hit; notify --persist is the paging surface). Common Mistakes 12-13 added. Checklist only ever reduces what runs; pre-flight fires only under the preset.
-
-*** 2026-07-02 Thu @ 01:24:50 -0400 Phase 5 landed — per-task JSONL metrics log
-Metrics section written into work-the-backlog.org: one record per task at outcome time, appended to the project's .ai/metrics/work-the-backlog.jsonl (git-tracked, append-only, dir+file created on first append). Full field table per the spec (ts, run_id, project, caller, task, outcome, defer_reason, upfront_decision, wall_clock_s, commit_sha, review_findings), outcome slugs mapped to the prose vocabulary, commit_sha flagged as the corrections-signal key (comma-separated when a task decomposed into several commits). Added the sixth outcome the spec's readiness section demanded but the enum missed: failed (tree left working, surfaced, run continues) — wired into the Outcomes vocabulary and loop step 4. A failed append warns in the run summary but never blocks, reorders, or aborts execution.
-
-*** 2026-07-02 Thu @ 01:27:43 -0400 Phase 6 landed — synthesis step to org-roam
-Synthesis section written into work-the-backlog.org (trigger "synthesize backlog metrics", INDEX row added): discover the JSONL union across project roots, classify each project per knowledge-base.md's denylist before reading, exclude work/unknown projects with the refusal contract, compute per-run rollups + trends, compute the corrections signal (later revert/fix commit touching the same files within ~14 days — a flag for human review, not a conviction), write one :agent:metrics: KB node under ~/org/roam/agents/ with [[id:...]] links to prior synthesis nodes, pull-before/commit-push-after. Read-only over the logs plus the single KB write; never mutates JSONL, todo.org, or any tree.
-
-*** TODO [#C] Speedrun — live trial validation :test:
-What we're verifying: the whole loop under a real run. Craig names a small ordered set in a coding project and says "no approvals speedrun": pre-flight Q&A fires once up front, each task lands as its own reviewed commit, ineligible/underspecified tasks get VERIFYs instead of half-work, the end-of-set page arrives via notify --persist, and the metrics JSONL carries one record per task. Not :solo: — needs Craig's set and his read on the run.
-
-*** TODO [#C] Flip the autonomous-batch spec to IMPLEMENTED
-When the final phase completes and the live trial validates: flip docs/specs/2026-06-16-autonomous-batch-execution-spec.org DOING → IMPLEMENTED with a dated history line and the Metadata mirror, per the transition-ownership table.
-
-** DONE [#C] Template sync with gitignored-only local changes :feature:
-CLOSED: [2026-07-02 Thu]
-From Craig via the roam inbox (2026-07-02, routed by archsetup): downstream projects should still pull template updates when their local changes sit entirely in gitignored files or directories — an inbox drop or a file left to read doesn't affect the templates, yet it currently holds the sync back and projects fall behind. When worked: verify how the sync gate actually detects dirtiness today, then let gitignored-only changes pass it.
-
-2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): policy + audit. Scope read found startup's git gates already ignore untracked/ignored files; state the policy in startup.org and audit every dirty-check in the synced workflows to match (monitor-inbox's bare porcelain check is the known offender; tracked-modification blocking stays).
-
-Resolution 2026-07-02: template-freshness policy stated in startup.org Phase A.0 (dirty = tracked modifications only; untracked/gitignored never block pulls, ffs, or monitoring gates; the rsync WIP-guard named as the one deliberate exception — it holds back rulesets' own outbound WIP). Full audit of dirty-checks across synced workflows: startup's two git gates already complied; inbox.org monitor mode was the one offender — its precondition now uses --untracked-files=no with the explicit-staging rationale, and its close-out sweeps tracked changes only. triage-intake auto mode borrows monitor's gates, so it inherits the fix by reference.
-
** TODO [#C] Wrap-it-up summary mode — keep or cut :feature:
From Craig via the roam inbox (2026-07-02, routed by archsetup). Teardown-by-default already shipped (bare "wrap it up" closes the window; "with summary" keeps it). Craig's follow-on: "maybe we cut the summary altogether. help me think through when I'd want a summary and how I would recognize it before confirming and then having it close." Run that think-through with him (brainstorm-shaped, not solo), then adjust wrap-it-up.org's Step 6 + trigger phrases to the outcome.
@@ -1367,3 +1294,90 @@ Fresh pre-flight, all green: Stop hook block in =~/.claude/settings.json= points
*** 2026-07-01 Wed @ 21:59:43 -0400 Manual end-to-end validation passed — all five tests, Craig's live run
Craig ran the full checklist in a live Emacs/tmux ai-term setup: (1) bare "wrap it up" tore down after the valediction rendered, geometry restored, no lingering sentinel; (2) "with summary" / "and summarize" both wrapped without teardown, buffer stayed readable; (3) "wrap it up and shutdown" with another aiv-* session live refused the shutdown, named the other session, fell back to a normal wrap; (4) as the sole session, the 10→1 echo-area countdown rendered one-per-second, C-g cancelled cleanly, and a full run fired the (stubbed) shutdown command; (5) with the push made to fail, the wrap stopped at the failure and no sentinel was dropped. Works great — feature validated and live. Both sides complete: rulesets Stop hook + wrap-it-up Teardown mode, .emacs.d companion functions.
+** DONE [#C] Guard against hardcoded host identity in synced files :feature:solo:
+CLOSED: [2026-07-02 Thu]
+:PROPERTIES:
+:CREATED: [2026-06-22 Mon]
+:LAST_REVIEWED: 2026-06-24
+:END:
+A =CLAUDE.md= / notes file that asserts mutable environment identity as a fixed fact ("This machine is ratio", a current OS, an IP, "the laptop") is false on every machine the synced/tracked file lands on but one. It bit a real archsetup session: a stale "this machine is ratio" line made the agent reason backwards all session while on velox. Proposal: a claude-rule — don't assert mutable host/env identity as a fixed fact in a tracked/synced project file; derive it at runtime and name the command (=uname -n= for host; the =hostname= binary is often absent). Optionally a codify- or startup-time lint flagging "this machine is <name>" / "the current host is" style claims. Proposal: [[file:docs/design/2026-06-21-host-identity-guard-proposal.org][proposal]]. From archsetup 2026-06-21.
+
+2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): rule + startup lint. A new claude-rules file plus a cheap grep probe in startup flagging host-identity claims in CLAUDE.md / notes.org fleet-wide.
+
+Resolution 2026-07-02: claude-rules/host-identity.md written (fixed-identity claims banned in tracked/synced docs, runtime derivation via uname -n, fleet-description carve-out, the archsetup worked failure) and linked machine-wide by make install. startup.org gained Phase A probe 13 (grep for "this machine/host/box is" claims in CLAUDE.md + notes.org, fixture-verified bash+zsh) and the Phase C host-identity flag line. Flags for judgment, never blocks.
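A minimal sketch of what a probe like this might look like, assuming a simple case-insensitive pattern over the three phrasings named above. The regex, the function names, and the use of =os.uname()= (the Python counterpart of =uname -n=, Unix only) are illustrative assumptions; the actual probe in startup.org is a grep and is not shown here.

```python
import os
import re

# Hypothetical pattern: the real probe's exact regex is not shown in the source.
HOST_CLAIM = re.compile(r"\bthis (machine|host|box) is\b", re.IGNORECASE)


def host_identity_flags(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that assert a fixed host identity."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if HOST_CLAIM.search(line)
    ]


def current_host() -> str:
    """Derive the host name at runtime instead of reading it from a doc."""
    return os.uname().nodename
```

The flags are meant for a human or agent to judge, matching the "never blocks" contract: the function only reports lines and leaves the decision to the caller.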
+** DONE [#C] No-approvals speedrun — cross-project autonomous-batch mode :feature:spec:
+CLOSED: [2026-07-02 Thu]
+:PROPERTIES:
+:CREATED: [2026-06-15 Mon]
+:LAST_REVIEWED: 2026-06-24
+:SPEC_ID: 90f623cd-fdbe-4f5c-b63d-b2f84d9151cf
+:END:
+A named mode for coding projects: Craig names an ordered task set and says "speedrun" / "no approvals speedrun"; the set is worked autonomously, each task held to the full quality bar (TDD red→green, =/review-code=, =/voice= on the commit) and committed + pushed as its own logical commit, with all needed quick decisions gathered in one pre-flight Q&A (answer or "skip this") and a VERIFY filed for anything underspecified or needing deliberation, plus an end-of-set page listing completed + remaining + skipped tasks. Task size is not a gate — large tasks decompose into per-commit chunks. Surfaced by .emacs.d from a 2026-06-15 theme-studio session where the shape worked. Source proposal: [[file:docs/design/2026-06-15-fix-speedrun-workflow-proposal.org]] (.emacs.d handoff 2026-06-15). Build via =spec-create= when worked; we handle the task in priority order.
+
+Skeptical-review read (open design questions to resolve in the spec, not settled here):
+- *Is it a new workflow or a documented preset?* The proposal frames it as no-approvals + always-push session modes plus an end page. Decide whether it needs its own workflow file or is mostly documentation of a preset over the two existing modes.
+- *Where/how the page fires* — every task vs end-of-set, and via what. The paging surface is in flux (=page-signal= removed 2026-06-12), so reconcile against =notify --persist= or whatever paging stands now.
+- *Auto-pull vs explicit list* — whether the set comes from an explicit ordered list or a tag/priority query.
+- *Guardrails* — must refuse to speedrun tasks needing design decisions or carrying data-loss risk without a checkpoint (the sender's biased-safe unused-tile flag is the worked example).
+
+*** 2026-07-01 Wed @ 22:10:35 -0400 Phase 0 landed — hard tag definitions + review/audit enforcement
+todo-format.md gained the "Hard definitions: :solo: and :quick:" subsection under the scheme header (fixed across projects: :solo: = buildable + agent-verifiable + no deliberation, with one-or-two upfront-answerable quick decisions allowed per the ratified spec; :quick: = ≤30-min effort hint, never a gate). task-review.org: the two tagging sections are now explicitly mandatory ("a review that skips them is incomplete") and gate 3 was realigned from "no upfront decision" to the spec's no-deliberation form — the stricter old wording predated the pre-flight-Q&A decision and would have wrongly excluded quick-question tasks. task-audit.org: the re-assess bullet is marked mandatory and points at the todo-format hard definitions as canonical. Phases 1-6 (work-the-backlog extraction, callers, commit gate, checklist/Q&A/page, metrics, synthesis) remain.
+
+*** 2026-06-16 Tue @ 00:53:36 -0500 Spec written; design questions answered
+Craig's "your call" (2026-06-16) answered in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][the autonomous-batch execution spec]], which reconciles this with Phase E into one feature:
+- *Most effective / workflow-vs-preset:* one dedicated =work-the-backlog.org= workflow holds the execution loop; "fix speedrun" is a thin named preset (no-approvals + always-push + end page) feeding it an explicit list, and the inbox-zero loop feeds it a tag query. Pros of the shared workflow: one execution loop to audit, inbox-zero's three callers stay clean, both input shapes reuse one guardrail set. Cons: one more workflow file and a caller-to-workflow indirection. The con list is shorter and lighter than the duplication cost of two separate features, which is why the shared workflow wins. The pros carry the more important entries (single audit surface, clean seam).
+- *Paging:* end-of-set only, via =notify ... --persist= (reconciled past the removed page-signal wrapper).
+- *Auto-pull vs explicit list:* both — explicit list for the preset, tag/priority query for the loop.
+- *Effectiveness measurement (the trial Craig asked for):* the spec designs a per-task JSONL metrics log (=.ai/metrics/work-the-backlog.jsonl=), a corrections-in-next-session signal, and a periodic synthesis step that writes =:agent:metrics:= org-roam articles for later review — the "gather data + create org-roam articles" loop.
+*** 2026-06-29 Mon @ 03:48:09 -0400 Ratified the autonomous-batch execution spec
+Craig ratified all eight decisions in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][2026-06-16-autonomous-batch-execution-spec.org]] (revised this session — size gate removed, crisp four-item defer checklist, =:solo:= / =:quick:= definitions + task-review/audit enforcement, speedrun pre-flight Q&A). Spec Status → ready; implementation-ready across Phase 0–6. Decisions grew from six to eight during the revision.
+
+*** 2026-07-02 Thu @ 00:44:59 -0400 spec-response decomposition — :SPEC_ID: bound, spec DOING
+Stamped the spec's UUID on this parent, broke Phases 1-6 into the build tasks below (plus the flip task and a live-trial validation child), and flipped the spec's status heading READY → DOING per the transition-ownership table.
+
+*** 2026-07-02 Thu @ 01:07:29 -0400 Phase 1 landed — execution loop extracted into work-the-backlog.org
+work-the-backlog.org written (canonical + mirror): caller contract (task set + session mode + cap), five-outcome vocabulary, the loop, mechanical eligibility gate (TODO + :solo: per scheme header, safe-by-omission, no-scheme-header → don't run), four-item defer checklist, per-task quality bar, cap/kill-switch semantics, page + metrics stubs pointing at Phases 4-5. inbox.org's auto-mode per-cycle item 3 reverted to routing-only (yes-path execution removed; mode intro + closing line updated to match). INDEX.org entry added. make test green, sync clean; nothing invokes the new workflow yet.
+
+*** 2026-07-02 Thu @ 01:13:33 -0400 Phase 2 landed — both callers wired
+inbox.org auto-mode item 3 regained its "run this batch next?" ask, now chaining into work-the-backlog as an explicit second step after routing (eligibility query + file-only + paging off + cap 1). work-the-backlog.org gained the two caller sections: the auto-loop contract and the no-approvals speedrun preset (seven-step pre-flight → autonomous-commit + always-push + paging-on over an explicit list; finer Q&A mechanics deferred to Phase 4). Speedrun trigger phrases live in the workflow + INDEX; "speedrun" always routes to the preset, with a disambiguation note in no-approvals.org and its INDEX entry. Each caller independently exercisable.
+
+*** 2026-07-02 Thu @ 01:18:07 -0400 Phase 3 landed — waiver-gated commit autonomy
+Pinned the waiver format per D5: two marker lines in .ai/notes.org Workflow State — :COMMIT_AUTONOMY: yes (has the waiver) and :LOOP_MAY_COMMIT: yes (the unattended loop may also commit; requires the first). Absent or non-yes reads as no; the read is a fresh grep each run, never memory. Degrade contract written into work-the-backlog.org (surface in run intro + summary, never honor without the marker, never degrade silently); caller sections + Common Mistakes updated. Stamped rulesets' own :COMMIT_AUTONOMY: yes; :LOOP_MAY_COMMIT: deliberately not granted — Craig's call. .emacs.d holds the waiver too but its notes.org is its own scope; told via inbox-send to stamp its marker.
+
+*** 2026-07-02 Thu @ 01:21:47 -0400 Phase 4 landed — checklist mechanics, pre-flight Q&A contract, page
+The four-item checklist (in since Phase 1) gained its mechanics: a VERIFY-filing subsection (dedup against an existing sibling first — the deferred task stays TODO, so without the check every run re-files; placement/heading/body per todo-format.md) and a quick-question routing subsection (discriminator: one-line factual/preference pick vs tradeoff-weighing; three-plus questions = underspecified = file; item 2 data-loss never routes to Q&A). Preset section gained the batch-ask contract (one message, recommendation-first numbered options per interaction.md, answers recorded as dated lines in the task bodies before the run). Page section finalized (fires once on set-done or cap-hit; notify --persist is the paging surface). Common Mistakes 12-13 added. Checklist only ever reduces what runs; pre-flight fires only under the preset.
+
+*** 2026-07-02 Thu @ 01:24:50 -0400 Phase 5 landed — per-task JSONL metrics log
+Metrics section written into work-the-backlog.org: one record per task at outcome time, appended to the project's .ai/metrics/work-the-backlog.jsonl (git-tracked, append-only, dir+file created on first append). Full field table per the spec (ts, run_id, project, caller, task, outcome, defer_reason, upfront_decision, wall_clock_s, commit_sha, review_findings), outcome slugs mapped to the prose vocabulary, commit_sha flagged as the corrections-signal key (comma-separated when a task decomposed into several commits). Added the sixth outcome the spec's readiness section demanded but the enum missed: failed (tree left working, surfaced, run continues) — wired into the Outcomes vocabulary and loop step 4. A failed append warns in the run summary but never blocks, reorders, or aborts execution.
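A rough sketch of the append step described above, assuming one JSON object per line with the listed fields. The function name, the =None= default for missing fields, and the warning text are assumptions for illustration; only the path and the field names come from the entry.

```python
import json
from pathlib import Path

# Field names as listed in the entry above; value types are assumed.
FIELDS = (
    "ts", "run_id", "project", "caller", "task", "outcome",
    "defer_reason", "upfront_decision", "wall_clock_s",
    "commit_sha", "review_findings",
)


def append_record(project_root: Path, record: dict) -> bool:
    """Append one metrics row; return False (with a warning) if the write fails."""
    path = project_root / ".ai" / "metrics" / "work-the-backlog.jsonl"
    row = {key: record.get(key) for key in FIELDS}  # missing fields become null
    try:
        # Directory and file are created on the first append.
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(row) + "\n")
        return True
    except OSError as exc:
        # A failed append is surfaced but never aborts the run.
        print(f"warning: metrics append failed: {exc}")
        return False
```

Returning a boolean rather than raising keeps the caller's loop free to continue, which is one way to honor the "warns but never blocks" rule.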
+
+*** 2026-07-02 Thu @ 01:27:43 -0400 Phase 6 landed — synthesis step to org-roam
+Synthesis section written into work-the-backlog.org (trigger "synthesize backlog metrics", INDEX row added): discover the JSONL union across project roots, classify each project per knowledge-base.md's denylist before reading, exclude work/unknown projects with the refusal contract, compute per-run rollups + trends, compute the corrections signal (later revert/fix commit touching the same files within ~14 days — a flag for human review, not a conviction), write one :agent:metrics: KB node under ~/org/roam/agents/ with [[id:...]] links to prior synthesis nodes, pull-before/commit-push-after. Read-only over the logs plus the single KB write; never mutates JSONL, todo.org, or any tree.
+
+*** 2026-07-02 Thu @ 05:26:07 -0400 Live trial passed — first speedrun ran 3/3, every loop part exercised
+Craig named the ordered set (id-link conversion, host-identity guard, template-sync policy) and said it was the validation run. Pre-flight Q&A fired once (two questions, both answered, answers stamped as dated lines before the run); each task landed as its own reviewed commit under the waiver (78bbaae, b6a977c, ed75d3c); metrics JSONL carries one record per task (run c726f526); the end-of-set page arrived via notify --persist. Nothing needed a VERIFY this run (all three cleared the checklist). Craig's read: granted :LOOP_MAY_COMMIT: on the strength of the run.
+
+*** 2026-07-02 Thu @ 05:26:07 -0400 Flipped the spec DOING → IMPLEMENTED
+All six phases built and the live trial validated. Keyword, dated history line, and Metadata mirror all flipped per the transition-ownership table.
+** DONE [#B] inbox-send filename collision silently overwrote a message :bug:solo:
+CLOSED: [2026-07-02 Thu]
+:PROPERTIES:
+:CREATED: [2026-07-02 Thu]
+:END:
+From archsetup (2026-07-02 0543, found in the wild): two --text sends in the same minute whose text starts with the same phrase derive identical filenames, and the second silently overwrites the first — archsetup lost a message at 05:42 and had to resend. Severity data-loss x rare-edge = P2 = [#B].
+
+Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): uniquify() guard in inbox-send.py — an existing target gets a -2/-3/... stem suffix, both send_text and send_file paths, extension preserved. Four red-first tests reproduce the loss (module-level with a fixed timestamp so the same-minute collision is deterministic, plus a CLI loss-proof check); 30/30 green.
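For illustration, a guard of this shape could look like the sketch below. It assumes the suffix is appended to the file stem starting at 2 and that the first free name wins; the real =uniquify()= in inbox-send.py is not shown in this diff, so the signature and details here are guesses.

```python
from pathlib import Path


def uniquify(target: Path) -> Path:
    """Return target if it is free, else the first free 'stem-N.ext' with N >= 2."""
    if not target.exists():
        return target
    counter = 2
    while True:
        # with_name keeps the directory; the suffix is re-attached unchanged.
        candidate = target.with_name(f"{target.stem}-{counter}{target.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1
```

With =msg.org= already on disk, a second send would land at =msg-2.org= and a third at =msg-3.org=. A check-then-write guard like this still has a small race between two concurrent senders; opening the file in exclusive-create mode ("x") would close that gap if it ever matters.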
+** DONE [#C] page-me notify styling — all-red too alarming :bug:solo:
+CLOSED: [2026-07-02 Thu]
+:PROPERTIES:
+:CREATED: [2026-07-02 Thu]
+:END:
+From Craig via the roam inbox (2026-07-02, routed by archsetup): the page notify's all-red styling "makes me feel like somehow the system is about to crash" — should be a persistent info-level notification.
+
+Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): pages now use notify info --persist instead of notify alarm — page-me.org (all examples + prose, with Craig's verdict recorded) and work-the-backlog.org's end-of-set page. status-check's success/fail types untouched (job outcomes, not pages).
+** DONE [#C] Template sync with gitignored-only local changes :feature:
+CLOSED: [2026-07-02 Thu]
+From Craig via the roam inbox (2026-07-02, routed by archsetup): downstream projects should still pull template updates when their local changes sit entirely in gitignored files or directories — an inbox drop or a file left to read doesn't affect the templates, yet it currently holds the sync back and projects fall behind. When worked: verify how the sync gate actually detects dirtiness today, then let gitignored-only changes pass it.
+
+2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): policy + audit. Scope read found startup's git gates already ignore untracked/ignored files; state the policy in startup.org and audit every dirty-check in the synced workflows to match (monitor-inbox's bare porcelain check is the known offender; tracked-modification blocking stays).
+
+Resolution 2026-07-02: template-freshness policy stated in startup.org Phase A.0 (dirty = tracked modifications only; untracked/gitignored never block pulls, ffs, or monitoring gates; the rsync WIP-guard named as the one deliberate exception — it holds back rulesets' own outbound WIP). Full audit of dirty-checks across synced workflows: startup's two git gates already complied; inbox.org monitor mode was the one offender — its precondition now uses --untracked-files=no with the explicit-staging rationale, and its close-out sweeps tracked changes only. triage-intake auto mode borrows monitor's gates, so it inherits the fix by reference.
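The "dirty = tracked modifications only" policy can be sketched as follows. This assumes git's porcelain v1 output (two status characters, a space, then the path); the function names are hypothetical and the workflows themselves run the check as a shell command.

```python
import subprocess

# --untracked-files=no hides untracked files; ignored files are hidden by default.
STATUS_CMD = ["git", "status", "--porcelain", "--untracked-files=no"]


def tracked_modifications(porcelain: str) -> list[str]:
    """Extract changed tracked paths, skipping untracked (??) and ignored (!!) rows."""
    paths = []
    for line in porcelain.splitlines():
        if not line or line[:2] in ("??", "!!"):
            continue
        paths.append(line[3:])  # drop the two status characters and the space
    return paths


def is_dirty(repo: str) -> bool:
    """True only when the repository has modifications to tracked files."""
    result = subprocess.run(
        STATUS_CMD, cwd=repo, check=True, capture_output=True, text=True
    )
    return bool(tracked_modifications(result.stdout))
```

The filter in =tracked_modifications= is redundant when the flag is passed, but it keeps the policy correct if a caller ever feeds it output from a bare porcelain check.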