diff options
| author | Craig Jennings <c@cjennings.net> | 2026-07-02 09:35:18 -0400 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-07-02 09:35:18 -0400 |
| commit | 5dc7da34800292ff9e9872a07859f1e0bcffa7d9 (patch) | |
| tree | 6ede17429d8f688f19dddb70bdd096cf6a56be1a | |
| parent | e9235a10e2d0359ca89ed8affaa9e7f3b3a0c4e8 (diff) | |
| download | rulesets-5dc7da34800292ff9e9872a07859f1e0bcffa7d9.tar.gz rulesets-5dc7da34800292ff9e9872a07859f1e0bcffa7d9.zip | |
Host-identity guard, the no-approvals speedrun, the inbox-send collision fix, page styling, and template-sync policy all moved to Resolved. The vNext agenda-view child keeps its deliberate [#D] under a :no-sync: tag.
| -rw-r--r-- | .ai/sessions/2026-07-02-09-29-docs-lifecycle-speedrun-autonomous-loop.org | 110 | ||||
| -rw-r--r-- | inbox/lint-followups.org | 2 | ||||
| -rw-r--r-- | todo.org | 181 |
3 files changed, 200 insertions, 93 deletions
diff --git a/.ai/sessions/2026-07-02-09-29-docs-lifecycle-speedrun-autonomous-loop.org b/.ai/sessions/2026-07-02-09-29-docs-lifecycle-speedrun-autonomous-loop.org new file mode 100644 index 0000000..8fc23e9 --- /dev/null +++ b/.ai/sessions/2026-07-02-09-29-docs-lifecycle-speedrun-autonomous-loop.org @@ -0,0 +1,110 @@ +#+TITLE: Session Context — Docs-lifecycle build, wrap-up router, speedrun build +#+DATE: 2026-07-02 + +* Summary + +** Active Goal +(Session closed 2026-07-02 ~09:30 — the standing loop directive below ran through the night and morning and ended with this wrap; cron job 752624d1 deleted at wrap.) +Standing directive (Craig, 2026-07-02 ~01:30): run auto-inbox-zero every 30 minutes with a STANDING YES — execute on all items found each cycle. Per-item disposition: feature-level task → write a spec (spec-create); decisions I can't confidently guess → file a VERIFY; well-defined → implement with the full quality bar. Autonomous commit + push under rulesets' :COMMIT_AUTONOMY: waiver. Auto-flush (/flush auto, self-inject) at clean boundaries when context grows heavy. Earlier directive items all DONE tonight: wrap-up routing, speedrun Phases 1-6 (work-the-backlog.org), roam inbox zero, auto-flush implementation + speedrun incorporation + disposition rule. + +** Decisions +- Inbox: Craig approved all five dispositions (convert-subtasks bundle + planning-line fix, task-audit C.6, sweep security fix + public-reachability convention + broadcast, spec-review UI-traps promotion, KB orphan report → [#C] task). +- Wrap-teardown: Craig ran all five manual tests, all passed — feature closed DONE, armed as the default (a bare "wrap it up" tears the session down; "with summary" keeps the buffer). +- Speedrun Phase 0: hard :solo:/:quick: definitions are fixed cross-project in todo-format.md; review/audit tag assessment is mandatory; task-review gate 3 realigned to no-deliberation (1-2 upfront-answerable quick decisions allowed). +- Docs-lifecycle spec: ratified through two independent review rounds (Codex + fresh-context Claude agent; 14 findings, all fixed, verify passes held); Codex flipped READY; spec-response decomposition flipped DOING. Key forks: two-sequence keyword header; :SPEC_ID: parent-keyword binding; fail-safe --apply; file: links through the pilot with id: conversion gated on the .emacs.d id-index mechanism; evidence-based status confirmation. +- Lesson: never flip a lifecycle state in the same pass that authored the fixes — the reviewer owns the flip. + +** Data Collected / Findings +- Spec: docs/specs/2026-07-01-docs-lifecycle-spec.org, status DOING, :ID: 80b0787b-4a60-4c82-8a16-b383d3e3c8f2; build parent in todo.org carries :SPEC_ID: (task "Spec storage location + lifecycle-status convention", ~line 361). +- Pilot surface: docs/design has 41 files (3 spec-spine candidates: Decisions AND Implementation phases); 2 stray root specs (agent-knowledge-base-spec.org, inbox-workflow-consolidation-spec.org); docs/design/task-review.org is the note counter-case (Metadata only). +- Validations: KB check 1 verified (55 rg = 55 org-roam DB); check 4 velox half verified (probe node agents/20260701214910-kb-sync-validation-probe.org pushed f0252bb, 0 conflicts) — ratio half blocked (ssh times out; probe left in place; confirm command logged in todo.org). KB checks 2+3 need work/unknown-project sessions. Roam inbox holds 2 rulesets items (ai-term colors → .emacs.d territory; "wrap it up closes window" → already delivered by wrap-teardown). +- Bare [N/N] tokens in org prose get mangled by cookie updates — spell counts in words. + +** Files Modified +All committed and pushed through 9ad415d. Tonight: 80ca5d0 spec-sort helper + 33-test bats; f4b64d6 pilot (5 specs sorted to docs/specs/, board live, :LAST_SPEC_SORT: stamped); 21639cb startup nudge (find-based probe, compgen was bash-only) + .emacs.d convention-live/id-index note; 7c12007 wrap-up router (route-batch + 13-test bats, inbox.org :ROUTE_CANDIDATE: stamp, wrap-it-up router step, cross-project.md); 9ad415d speedrun decomposition + spec DOING flip. Earlier (2026-07-01): d0c92d0 docs-lifecycle Phase 1 and everything before it. + +** Next Steps +SESSION CLOSED — the auto-inbox-zero loop ended with the wrap (job deleted; a future session re-arms it only on a fresh directive from Craig). What carries forward: Craig's parked [#C] wrap-summary keep-or-cut think-through; ratio probe confirm (ssh was timing out — verify agents/20260701214910-kb-sync-validation-probe.org landed on ratio); KB refusal checks 2+3 (need work/unknown-project sessions); docs-lifecycle leftovers (Craig's manual tests: nudge visibility + Emacs id-link click-through; 4 anomaly renames; id-conversion gated on .emacs.d id-index). All build work from this session is DONE and pushed. + +Original loop contract (historical): continue the hourly auto-inbox-zero loop (cron job 752624d1, fires at :37, session-only). Each cycle: inbox-status + roam scan (capture-guard before roam writes); quiet → one acknowledgement line; finds → file, then execute ALL under Craig's standing yes with autonomous-commit + push (:COMMIT_AUTONOMY: + :LOOP_MAY_COMMIT: both stamped in notes.org Workflow State), disposition feature→spec / unguessable→VERIFY / well-defined→implement, full quality bar, metrics JSONL per task, session-log update per state-mutating cycle, /flush auto at heavy-context clean boundaries. Parked for Craig at his choosing: wrap-summary keep-or-cut think-through ([#C] in todo.org). Carryovers: ratio probe confirm (ssh), KB refusal checks (work/unknown sessions), docs-lifecycle leftovers (Craig's manual tests: nudge visibility + Emacs id-link click-through; 4 anomaly renames). Everything else from tonight is DONE and pushed (speedrun spec IMPLEMENTED after live trial; auto-flush + self-inject shipped; inbox-send collision fix; page info-styling; host-identity rule; template-sync policy; id-link conversion). + +KB: promoted 1 / consulted no +(Node: agents/20260702093025-reviewer-owns-the-lifecycle-flip.org — the don't-flip-your-own-fixes lesson from the docs-lifecycle spec review.) + +OLD (completed): build the speedrun / autonomous-batch phases per the spec docs/specs/2026-06-16-autonomous-batch-execution-spec.org (DOING, :ID: 90f623cd-fdbe-4f5c-b63d-b2f84d9151cf; build parent "No-approvals speedrun" in todo.org carries :SPEC_ID:, children Phases 1-6 + live-trial + flip). Read the spec's Design section (lines ~66-177: loop at both altitudes, eligibility gate, defer checklist, pre-flight Q&A, session modes/preset, run cap + kill switch, paging) before writing. Phase 1: write claude-templates/.ai/workflows/work-the-backlog.org (eligibility gate, defer checklist, per-task quality bar, run-cap; inputs task set + session mode + cap) AND revert inbox.org's "auto inbox zero" per-cycle item 3 yes-path to routing-only in the same commit (one home for execution). Phase 2: wire both callers (auto-inbox-zero yes-path → work-the-backlog tag-query/file-only/cap-1; speedrun preset → explicit list/autonomous-commit/always-push/paging after pre-flight Q&A). Phases 3-6 per the child task bodies. Canonical-side edits, sync-check --fix, make test, commit per phase. THEN: roam inbox zero (inbox.org roam mode; 2 known rulesets-related items: ai-term colors → .emacs.d territory, wrap-it-up-closes-window → already delivered) and react. Docs-lifecycle leftovers for Craig: flip decision pending his manual tests (nudge visibility + Emacs link click-through), 4 anomaly renames, id-conversion gated on .emacs.d. Carryovers: ratio probe confirm (ssh still timing out), KB refusal checks. + +* Session Log + +** 2026-07-02 Thu @ 09:30 -0400 — Session wrapped (teardown mode) +Morning loop cycles after the 07:51 auto-flush were all quiet (0 pending handoffs, roam inbox empty; last manual cycle ~09:15). Craig called the wrap. Cron job 752624d1 deleted; one KB node promoted (reviewer-owns-the-lifecycle-flip); todo cleanup + lint ran; wrap commit pushed. Teardown sentinel dropped per the validated default. + +** 2026-07-02 Thu @ 07:51:13 -0400 — flushed (auto-flush, self-injected) +Clean boundary: 07:50 loop cycle came back empty, tree clean, everything pushed through the metrics commit after a6b534f, suites green. Nothing in flight. First live use of the auto-flush mechanism shipped tonight (self-inject via tmux run-shell -b). Post-clear resume: read this Summary and continue the hourly loop per Next Steps — the cron job survives the clear (session-only, not conversation-only). Craig's standing yes remains in force (Active Goal). + +** 2026-07-02 Thu @ 06:00 -0400 — Loop cycle executed 3 finds (commits through a6b534f + metrics, pushed) +First executing loop cycle under the marker-granted autonomy. Found 3 archsetup handoffs + 2 rulesets roam entries (duplicates of the routed ones). Shipped: inbox-send collision fix (uniquify -2/-3 suffix, 4 red-first deterministic tests, 30/30 — a wild data-loss find, graded [#B] P2), page styling alarm → info --persist (page-me.org + work-the-backlog page; status-check untouched), dupre-blue ai-term refinement forwarded to .emacs.d (#67809c). Roam inbox: swept the 2 rulesets entries (archsetup's 3 left, roam-sync owns the git). Local inbox 0 pending; archsetup replied. Suites green, sync clean (one drift caught by the pre-commit check and re-synced). Loop continues hourly at :37 (job 752624d1). + +** 2026-07-02 Thu @ 05:30 -0400 — Live trial validated, spec IMPLEMENTED, loop now hourly (5eae9e0, pushed) +Craig's "1" granted :LOOP_MAY_COMMIT: (stamped in notes.org Workflow State) and validated the run. Closed the live-trial + flip children dated, parent "No-approvals speedrun" DONE + CLOSED, spec flipped DOING → IMPLEMENTED (board: 2 IMPLEMENTED / 2 READY / 2 DOING). Wrap-summary keep-or-cut think-through PARKED (stays filed [#C] in todo.org). Auto-inbox-zero rescheduled: old 30-min job deleted, new hourly job 752624d1 (at :37, off-minute per fleet guidance), same standing-yes contract, now with loop commits authorized by the marker rather than only the directive. + +** 2026-07-02 Thu @ 05:25 -0400 — First no-approvals speedrun complete: 3/3 (78bbaae, b6a977c, ed75d3c + metrics commit, all pushed) +Live-trial run c726f526 over Craig's ordered set. Pre-flight Q&A fired once (2 questions, Craig took both recommendations, answers stamped into task bodies as dated lines). Task 1 id-link conversion: 13 links → id: form, Review-findings heading got its own :ID: for the 2 search-target links, residue zero, all ids verified. Task 2 host-identity: claude-rules/host-identity.md (linked machine-wide, verified) + startup probe 13 (fixture-verified bash+zsh) + Phase C flag line. Task 3 template-sync: freshness policy in startup Phase A.0 (dirty = tracked-only; WIP-guard named as deliberate exception), monitor-inbox precondition fixed to --untracked-files=no + close-out symmetrized. Every task: /review-code, /voice, make test green, sync clean, one JSONL record (3 records in .ai/metrics/work-the-backlog.jsonl). End-of-set page fired via notify --persist. AWAITING: Craig's read on the run → then close the live-trial child dated + flip the autonomous-batch spec DOING → IMPLEMENTED. Then his two parked decisions: :LOOP_MAY_COMMIT: grant, wrap-summary keep-or-cut think-through. + +** 2026-07-02 Thu @ 01:40 -0400 — Inbox pass done; auto-flush shipped (d4f132b, 794b248, pushed); 30-min executing loop armed +Local inbox zero + roam inbox zero (roam was already empty — archsetup routed it). Processed: .emacs.d org-id delivery → id-conversion task ungated (:solo: now); archsetup's three roam items → template-sync-gitignored filed [#C], ai-term colors forwarded to .emacs.d, wrap-summary keep-or-cut filed [#C]; auto-flush bundle → self-inject.sh canonicalized + 6-test bats, flush skill auto mode (gate order preserved: verify anchor write BEFORE arming), work-the-backlog auto-flush section + preset step 6 + per-item disposition rule (feature→spec, unguessable→VERIFY, well-defined→implement). Design note preserved at docs/design/2026-07-02-auto-flush-mechanism-note.org. Replies sent to .emacs.d (x2) + archsetup (x2). All suites green, sync clean. NOW: /loop 30m auto-inbox-zero with Craig's standing yes (see Active Goal). + +** 2026-07-02 Thu @ 01:30 -0400 — Speedrun Phases 2-6 shipped (263138a, 8d790c0, 04561b2, eea93f1, 44c8cc2 — all pushed) +Phase 2: both callers wired (auto-mode chain ask scoped to the queued batch — a deliberate judgment, logged in the commit; speedrun preset section + trigger routing, "speedrun" always beats "no approvals" with disambiguation in no-approvals.org + INDEX). Phase 3: waiver pinned as :COMMIT_AUTONOMY:/:LOOP_MAY_COMMIT: markers in notes.org Workflow State; rulesets stamped :COMMIT_AUTONOMY: yes, :LOOP_MAY_COMMIT: deliberately left for Craig; .emacs.d told to stamp its own (inbox-send 0118). Phase 4: VERIFY-filing dedup (existing-sibling check), quick-question discriminator, batch-ask contract, page finalized. Phase 5: JSONL field table (+failed outcome, +manual caller, +comma-separated commit_sha — three traceable spec gaps closed). Phase 6: synthesis section + "synthesize backlog metrics" trigger. Every phase: /review-code, /voice, make test green, sync clean, todo.org child flipped dated. Parent stays DOING pending Craig's live trial + the flip task. NEXT: local inbox (2 handoffs: .emacs.d org-id delivery unblocks the docs-lifecycle id-conversion task; archsetup roam-routed item unread), then roam inbox zero + react. + +** 2026-07-02 Thu @ 01:15 -0400 — Speedrun Phase 1 shipped (d379a23, pushed) +work-the-backlog.org created (canonical + mirror + INDEX entry): caller contract, five-outcome vocabulary, mechanical eligibility gate (TODO + :solo:, no-scheme-header → don't run), four-item defer checklist, quality bar, cap semantics, Phase 3-5 stubs. inbox.org auto-mode item 3 reverted to routing-only. Review fixed a cap-default contradiction (explicit set defaults to list length, not 1) pre-commit. make test green twice, sync clean, todo.org Phase 1 child rewritten dated. NOTE: new inbox handoff from .emacs.d (2026-07-02-0056) — org-id resolution delivered, the gated id-conversion task is unblocked; process during the inbox pass after the speedrun build. Next: Phase 2 (wire the two callers). + +** 2026-07-02 Thu @ 00:47:00 -0400 — flushed +Clean boundary: wrap-up router shipped (7c12007 — route-batch helper + 13-test bats, inbox.org marker stamp, wrap-it-up router step, cross-project.md note; review fixed a reproduced nested-candidate data-loss bug pre-commit) and the speedrun spec decomposed + flipped DOING (9ad415d). Nothing half-edited; tree clean, pushed through 9ad415d, suite green. Post-clear resume goes straight to speedrun Phase 1 (contract pointers in Next Steps). Remaining under wrapup-routing: manual e2e (Craig's) + vNext transcript task. + +** 2026-07-02 Thu @ 00:25 -0400 — Phases 3 + 4 shipped: pilot ran, nudge live, .emacs.d notified +Craig confirmed all five pilot keywords as-is (option 1) plus the IMPLEMENTED reason for agent-knowledge-base-spec. Applied with --allow-dirty (only the untracked session anchor was dirty): 5 specs moved to docs/specs/, 12 todo.org links + the moved specs' outbound links rewritten, :LAST_SPEC_SORT: 2026-07-02 stamped, residue zero, board live (6 specs: 1 IMPLEMENTED, 3 READY, 2 DOING). Phase 4: spec-sort probe added to startup.org Phase A + Phase C nudge line; replaced the spec's compgen sketch with a find-based check (compgen is bash-only, zsh false-negatived on stray root specs) — fixture-verified both shells, four project shapes; fixed startup.org's stale path to the moved encourage-kb spec; sent .emacs.d the convention-live note + id-index ask (2026-07-02-0022 handoff). f4b64d6 + 21639cb pushed, make test green, sync clean. + +** 2026-07-02 Wed @ 00:15 -0400 — Phase 2 shipped: spec-sort + 33-test bats suite (80ca5d0, pushed) +TDD'd the retrofit helper (bats red-first, then the Python implementation). A fresh-context review agent found 4 real issues, all fixed pre-commit: acknowledged bare mentions weren't mapped through moves (a self-mention turned a successful apply into a false FAILURE with a destructive recovery recipe — regression-tested), real OSError mid-apply lost the applied-ops list (both failure paths now share ApplyFailure), "incomplete" status proposed terminal IMPLEMENTED (word-boundary matching now), and file-relative vs root-anchored link ambiguity now blocks validation as AMBIGUOUS. Real-data dry run matches predictions (5 candidates / 4 anomalies / 30 notes / 1 self bare mention / 10 report-only incl. the Codex-flagged startup.org case). make test green, sync clean. Next: Phase 3 pilot — candidate keywords + anomaly dispositions need Craig (see Next Steps). + +** 2026-07-01 Wed @ 23:41:36 -0400 — flushed +Clean boundary after docs-lifecycle Phase 1 (d0c92d0, pushed, tree clean, suite green). In flight: nothing half-edited. Post-clear resume goes straight to Phase 2 — the spec-sort build (contract pointers in Next Steps above). + +** 2026-07-01 21:05 EDT — Session resumed after interruption; inbox first, then validations/specs +The 2026-06-30 session died right after the validation pre-flight (nothing lost beyond the plan itself — no edits had landed). New session startup found this anchor plus 15 pending inbox handoffs. Craig's call: process the inbox first, then return to the validations + spec writing goal above. Also committed the one-line .claude/settings.json change from his /model command (c976f5b, "chore: set fable as project default model"). Green baseline confirmed before the commit: make test exit 0 — pytest 370+67+12 passed, all ERT suites 0 unexpected, 0 bats failures. + +Inbox inventory (15): the .emacs.d convert-subtasks bundle (10 files — todo-cleanup/lint-org + tests + 5 workflow/rule wirings), the .emacs.d task-audit Phase C.6 follow-on (2 files), the .emacs.d sweep-gitignore anchored-pattern security fix ask, the archsetup spec-review UI-traps promotion proposal, and a KB hygiene report (42 orphan agent nodes). Plan: skeptical review via diff-against-canonical for the bundles, then surface dispositions for approval. + +** 2026-07-01 ~21:45 EDT — Inbox items A, B, C shipped +Craig approved all five recommendations (A1 B1 C1 D1 E1). Shipped so far: +- A (19ba7cb): convert-subtasks bundle applied to canonicals + mirror — todo-cleanup --convert-subtasks, lint-org subtask-done-not-dated checker, wiring in wrap-it-up/clean-todo/open-tasks/task-review/todo-format.md. TDD'd the planning-line edge on top (CLOSED removal now preserves a DEADLINE/SCHEDULED sharing the line; red then green). Suites 45/45, 49/49, full make test green. +- B (356b905): task-audit Phase C.6 (retire completed parents / promote stragglers) applied as sent. +- C (909b21b + bac3fe4): sweep-gitignore-tooling.sh now recognizes anchored /.ai/ (mode detection + per-pattern presence + style-matched append) and WARNs on tracked tooling reachable via a non-cjennings.net remote; bare cjennings ssh-alias counts as private (false positive on rulesets itself caught by the real-data dry run, fixed with a 13th bats test). protocols.org gained the public-reachability convention. Real sweep run: 6 projects backfilled (archsetup, chime, emacs-wttrin in anchored style), archsetup's tracked CLAUDE.md flagged. Broadcast sent to 14 projects via inbox-send.py (the inbox-send wrapper isn't on PATH here — used the .py directly); archsetup got an extra tailored note about its tracked CLAUDE.md. +Still open in this pass: D (spec-review UI-traps), E (KB orphan task), replies to .emacs.d + archsetup, inbox file cleanup, LAST_INBOX_PROCESS stamp, push. + +** 2026-07-01 ~21:55 EDT — Inbox pass complete: D, E, replies, cleanup +D (9814b94): archsetup's six UI-traps checks promoted into spec-review.org Phase 4 as the conditional "Operational-panel UI traps" dimension; accept reply sent to archsetup. E: KB orphan review filed as [#C] :chore: in todo.org (orphan-ness isn't a defect; periodic prune/merge/link pass, regenerate the list before running); report deleted, no reply (script source). Full reply to .emacs.d sent covering all three of its handoffs, including the planning-line divergence from what it sent. All 15 inbox files deleted; inbox-status 0 pending; :LAST_INBOX_PROCESS: stamped 2026-07-01. Next: commit todo/notes, push all five commits, then return to the interrupted goal — manual validations (wrap-teardown task 42, Agent-KB refusal checks task 309) and the queued specs. + +** 2026-07-01 ~22:00 EDT — Validations: agent-runnable half done +Inbox pass pushed (7 commits, e36e932..6ec05bb). Resumed the validations goal. Done tonight: wrap-teardown plumbing re-verified fresh (Stop hook + symlink + no stale sentinel + companion fns (t t t); 3 live aiv-* sessions available for the gate test); Agent-KB check 1 verified (55 :agent: nodes in rg inventory AND 55 in the live org-roam DB — match); check 4 velox half verified (probe node agents/20260701214910-kb-sync-validation-probe.org committed + pushed by roam-sync in seconds, f0252bb, 0 conflicts). BLOCKED: ratio ssh times out (tailscale ping pongs via DERP, TCP:22 unreachable — likely suspended); probe left in place for later confirmation. Still needing Craig: wrap-teardown 5-test checklist (scratch session), KB checks 2+3 (work/unknown-project refusal sessions), work-machine-no-clone check. Evidence logged in todo.org under both tasks. Next: surface status + spec-writing choice to Craig. + +** 2026-07-01 ~22:00 EDT — Wrap-teardown feature validated and closed +Craig ran all five manual tests live — teardown-after-valediction, both summary qualifiers, the multi-session shutdown refusal, the cancellable countdown + stubbed shutdown, and the push-failure guard. All passed ("works great"). Rewrote the checklist sub-task to a dated entry and closed the parent DONE + CLOSED [2026-07-01 Wed] (archives on the next --archive-done). Sent .emacs.d a one-line FYI since its companion functions are half the feature. NOTE for this session's own wrap: teardown is now the validated default — a bare "wrap it up" here will tear this session down; use "with summary" to keep the buffer. Remaining validation carryover: KB refusal checks (work/unknown project sessions), ratio probe confirmation, work-machine-no-clone check. + +** 2026-07-01 ~22:15 EDT — Speedrun Phase 0 + docs-lifecycle spec drafted +Craig picked "2 then 1". Phase 0 shipped (2a45f07): hard :solo:/:quick: definitions in todo-format.md (fixed cross-project; :solo: = buildable + agent-verifiable + no deliberation with 1-2 upfront-answerable quick decisions allowed; :quick: = ≤30-min effort hint, never a gate), mandatory-assessment language in task-review + task-audit, and task-review gate 3 realigned from "no upfront decision" to the ratified no-deliberation form. Then the docs-lifecycle spec drafted at docs/specs/2026-07-01-docs-lifecycle-spec.org from the five 2026-06-28 decisions, dogfooding itself (first resident of docs/specs/, status heading DRAFT, :ID: link). Design call made in the draft: the authoritative keyword lives on a prepended top-level status heading (vocabulary DRAFT READY DOING | IMPLEMENTED SUPERSEDED CANCELLED) — additive, retrofittable, grep- and agenda-scannable. lint-org --check on the spec: 0 mechanical, 6 judgment items that are todo-shape checkers misreading spec conventions (spec DONE decisions carry no CLOSED; review-history dated headers are template shape) — noted, no action. Task flipped DOING with a dated entry. Awaiting Craig's spec review to flip DRAFT → READY. + +** 2026-07-01 ~22:30 EDT — Dual review of the docs-lifecycle spec; all nine findings fixed +Craig had two independent agents review the spec: Codex (4 blocking findings, recorded in the spec's Review findings section) and my dispatched fresh-context reviewer (9 findings: same top blocker as Codex plus 5 unique, incl. the unowned DOING→IMPLEMENTED flip). Both rated Not ready; both independently caught that my #+TODO replacement destroyed the decision-task keyword machinery — the spec's own cookie was hand-faked. Craig approved fixing all nine. Responder pass done: merged ledger [9/9] with per-finding responses, two-sequence keyword header (verified: org computes [5/5] and [9/9]), transition-ownership table + spec-response flip task + task-audit safety net, single classification predicate, -spec.org rename, full relink contract, marker/nudge contract, compatibility rule, org-id prerequisite, three-line transition. Lint judgments on the spec are known todo-shape false positives (spec DONE entries and dated history headers). Status stays DRAFT; Craig decides the READY flip (option: send the fixed spec back to the reviewer agent, a082bea09c72a4e15, for a verify pass). + +** 2026-07-01 ~22:55 EDT — Second review round: five more findings, fixed and verified +After the first nine fixes, my premature READY flip raced Codex's re-review — no data lost (commit 642be35 carries both my flip and Codex's demotion + five new blocking implementation-readiness findings; twelve seconds of READY). Craig approved a second responder pass, including the fork on the org-id finding (keep file: links through the pilot; id: conversion gated on a concrete .emacs.d id-index mechanism). Fixed all five (b163637): canonical-placement contract, :SPEC_ID: parent-keyword binding for task-audit (dissolves the flip-task chicken-and-egg, survives --convert-subtasks), fail-safe --apply (preflight/plan/recovery), staged id conversion, evidence-based status confirmation. Also de-cookified bracket [N/N] prose tokens org's cookie updater would mangle. My reviewer's second verify pass: ready, all held, nothing regressed, three minor nits — folded in (43cecd4): scoped id-link criterion, untracked-copy cleanup in recovery, two stale prose spots. Spec parked at DRAFT [14/14]; the authoritative READY flip is left to Codex's rerun or Craig. Lesson recorded: don't flip a lifecycle state the same pass that authored the fixes. + +** 2026-07-01 ~23:40 EDT — Spec READY (Codex flip), decomposed, Phase 1 built +Codex's rerun flipped the spec READY at 23:22 (all fourteen findings closed). Craig: run with it, then flush and do Phase 2. Committed the reviewer flip, then ran spec-response Phase 6 as the first live exercise of the convention: :SPEC_ID: stamped on the build parent in todo.org, six child tasks (Phases 1-4, the gated id-conversion pass, the flip-to-IMPLEMENTED task) plus a manual-testing child (nudge visibility, link click-through), spec flipped READY→DOING (328ca18). Phase 1 then built and committed: claude-rules/docs-lifecycle.md (new rule, linked machine-wide), spec-create location + template updates (two-sequence header, DRAFT status heading with :ID:), spec-review location expectation + compatibility rule + READY flip ownership (incl. the demote path), spec-response DOING flip + :SPEC_ID: + mandatory flip task, task-audit :SPEC_ID: reconcile query. Mirror synced, make test green. NEXT AFTER FLUSH: Phase 2 — build claude-templates/.ai/scripts/spec-sort + bats per the spec's retrofit contract (classify predicate, evidence panel, plan/validate/apply, preflight, recovery, relink, marker stamp); the spec section "The retrofit" and the Phase 2 task body carry the full contract. + +** 2026-06-30 ~14:25 EDT — Startup + validation plan (interrupted session's log) +Fresh session, clean startup (no crash anchor, clean tree, repos current). Craig: "do all the validations now, then write all the specs." Two validation items pending: wrap-teardown (task 42, 5-test checklist) and Agent-KB refusal checks (task 309, work/unknown project refusal — the cross-machine half is already confirmed on velox+ratio). + +Pre-flight on wrap-teardown plumbing: Stop hook wired in settings.json, hook script present+executable, all three companion functions (cj/ai-term-quit, -live-count, -shutdown-countdown) live in the daemon (t t t). Four live aiv-* sessions right now (aiv-_emacs_d, aiv-archsetup, aiv-rulesets [this], aiv-work). Read the hook + wrap-it-up Teardown mode + the three functions: cj/ai-term-quit is a safe idempotent no-op on a nonexistent project; shutdown-countdown aborts when >1 session live. So the gate + hook-wiring pieces are safely verifiable from this session without endangering it; only the buffer-teardown/geometry-restore and the countdown render/C-g need Craig's scratch session + eyes. diff --git a/inbox/lint-followups.org b/inbox/lint-followups.org new file mode 100644 index 0000000..a81b1ce --- /dev/null +++ b/inbox/lint-followups.org @@ -0,0 +1,2 @@ + +* 2026-07-02 Thu — Task-review health: 2 top-level [#A]/[#B]/[#C] tasks unreviewed for >30 days (daily review may have slipped) @@ -161,18 +161,6 @@ The work project edited two synced scripts locally as a stopgap (2026-06-17) and Note (2026-06-24): the Anki =#+TITLE= deck-name fix landed (commit 060a938) — =default_deck_name= is now =default_deck_name(input_path, org_text)= with a new docstring. The preserved 2026-06-17 =to-anki.py= predates that, so *don't* copy it wholesale (it would revert the title-fix). Re-derive the multi-tag changes against the current canonical =flashcard-to-anki.py= and keep the =#+TITLE= behavior. -** DONE [#C] Guard against hardcoded host identity in synced files :feature:solo: -CLOSED: [2026-07-02 Thu] -:PROPERTIES: -:CREATED: [2026-06-22 Mon] -:LAST_REVIEWED: 2026-06-24 -:END: -A =CLAUDE.md= / notes file that asserts mutable environment identity as a fixed fact ("This machine is ratio", a current OS, an IP, "the laptop") is false on every machine the synced/tracked file lands on but one. It bit a real archsetup session: a stale "this machine is ratio" line made the agent reason backwards all session while on velox. Proposal: a claude-rule — don't assert mutable host/env identity as a fixed fact in a tracked/synced project file; derive it at runtime and name the command (=uname -n= for host; the =hostname= binary is often absent). Optionally a codify- or startup-time lint flagging "this machine is <name>" / "the current host is" style claims. Proposal: [[file:docs/design/2026-06-21-host-identity-guard-proposal.org][proposal]]. From archsetup 2026-06-21. - -2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): rule + startup lint. A new claude-rules file plus a cheap grep probe in startup flagging host-identity claims in CLAUDE.md / notes.org fleet-wide. - -Resolution 2026-07-02: claude-rules/host-identity.md written (fixed-identity claims banned in tracked/synced docs, runtime derivation via uname -n, fleet-description carve-out, the archsetup worked failure) and linked machine-wide by make install. startup.org gained Phase A probe 13 (grep for "this machine/host/box is" claims in CLAUDE.md + notes.org, fixture-verified bash+zsh) and the Phase C host-identity flag line. Flags for judgment, never blocks. - ** TODO [#C] coverage-summary.el documented as a local-only helper :chore: :PROPERTIES: :CREATED: [2026-06-22 Mon] @@ -409,89 +397,9 @@ What we're verifying: the pilot's relink pass left todo.org and docs links worki - After the Phase 3 pilot, open todo.org in Emacs and click three links that point at moved specs (including one from a dated log entry). Expected: each opens the spec at its new docs/specs/ path. -*** TODO [#D] Docs lifecycle vNext — org-agenda spec-status view :feature: +*** TODO [#D] Docs lifecycle vNext — org-agenda spec-status view :feature:no-sync: Once specs carry lifecycle TODO keywords under =docs/specs/=, add a custom org-agenda view that lists =DRAFT= / =READY= / =DOING= / terminal specs by status. Deferred from [[id:80b0787b-4a60-4c82-8a16-b383d3e3c8f2][the docs-lifecycle spec]]; not part of v1 because the grep board is sufficient until the status headings exist. -** DONE [#C] No-approvals speedrun — cross-project autonomous-batch mode :feature:spec: -CLOSED: [2026-07-02 Thu] -:PROPERTIES: -:CREATED: [2026-06-15 Mon] -:LAST_REVIEWED: 2026-06-24 -:SPEC_ID: 90f623cd-fdbe-4f5c-b63d-b2f84d9151cf -:END: -A named mode for coding projects: Craig names an ordered task set and says "speedrun" / "no approvals speedrun"; the set is worked autonomously, each task held to the full quality bar (TDD red→green, =/review-code=, =/voice= on the commit) and committed + pushed as its own logical commit, with all needed quick decisions gathered in one pre-flight Q&A (answer or "skip this") and a VERIFY filed for anything underspecified or needing deliberation, plus an end-of-set page listing completed + remaining + skipped tasks. Task size is not a gate — large tasks decompose into per-commit chunks. Surfaced by .emacs.d from a 2026-06-15 theme-studio session where the shape worked. Source proposal: [[file:docs/design/2026-06-15-fix-speedrun-workflow-proposal.org]] (.emacs.d handoff 2026-06-15). Build via =spec-create= when worked; we handle the task in priority order. - -Skeptical-review read (open design questions to resolve in the spec, not settled here): -- *Is it a new workflow or a documented preset?* The proposal frames it as no-approvals + always-push session modes plus an end page. Decide whether it needs its own workflow file or is mostly documentation of a preset over the two existing modes. -- *Where/how the page fires* — every task vs end-of-set, and via what. The paging surface is in flux (=page-signal= removed 2026-06-12), so reconcile against =notify --persist= or whatever paging stands now. -- *Auto-pull vs explicit list* — whether the set comes from an explicit ordered list or a tag/priority query. -- *Guardrails* — must refuse to speedrun tasks needing design decisions or carrying data-loss risk without a checkpoint (the sender's biased-safe unused-tile flag is the worked example). - -*** 2026-07-01 Wed @ 22:10:35 -0400 Phase 0 landed — hard tag definitions + review/audit enforcement -todo-format.md gained the "Hard definitions: :solo: and :quick:" subsection under the scheme header (fixed across projects: :solo: = buildable + agent-verifiable + no deliberation, with one-or-two upfront-answerable quick decisions allowed per the ratified spec; :quick: = ≤30-min effort hint, never a gate). task-review.org: the two tagging sections are now explicitly mandatory ("a review that skips them is incomplete") and gate 3 was realigned from "no upfront decision" to the spec's no-deliberation form — the stricter old wording predated the pre-flight-Q&A decision and would have wrongly excluded quick-question tasks. task-audit.org: the re-assess bullet is marked mandatory and points at the todo-format hard definitions as canonical. Phases 1-6 (work-the-backlog extraction, callers, commit gate, checklist/Q&A/page, metrics, synthesis) remain. - -*** 2026-06-16 Tue @ 00:53:36 -0500 Spec written; design questions answered -Craig's "your call" (2026-06-16) answered in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][the autonomous-batch execution spec]], which reconciles this with Phase E into one feature: -- *Most effective / workflow-vs-preset:* one dedicated =work-the-backlog.org= workflow holds the execution loop; "fix speedrun" is a thin named preset (no-approvals + always-push + end page) feeding it an explicit list, and the inbox-zero loop feeds it a tag query. Pros of the shared workflow: one execution loop to audit, inbox-zero's three callers stay clean, both input shapes reuse one guardrail set. Cons: one more workflow file and a caller-to-workflow indirection. The con list is shorter and lighter than the duplication cost of two separate features, which is why the shared workflow wins. The pros carry the more important entries (single audit surface, clean seam). -- *Paging:* end-of-set only, via =notify ... --persist= (reconciled past the removed page-signal wrapper). -- *Auto-pull vs explicit list:* both — explicit list for the preset, tag/priority query for the loop. -- *Effectiveness measurement (the trial Craig asked for):* the spec designs a per-task JSONL metrics log (=.ai/metrics/work-the-backlog.jsonl=), a corrections-in-next-session signal, and a periodic synthesis step that writes =:agent:metrics:= org-roam articles for later review — the "gather data + create org-roam articles" loop. -*** 2026-06-29 Mon @ 03:48:09 -0400 Ratified the autonomous-batch execution spec -Craig ratified all eight decisions in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][2026-06-16-autonomous-batch-execution-spec.org]] (revised this session — size gate removed, crisp four-item defer checklist, =:solo:= / =:quick:= definitions + task-review/audit enforcement, speedrun pre-flight Q&A). Spec Status → ready; implementation-ready across Phase 0–6. Decisions grew from six to eight during the revision. - -*** 2026-07-02 Thu @ 00:44:59 -0400 spec-response decomposition — :SPEC_ID: bound, spec DOING -Stamped the spec's UUID on this parent, broke Phases 1-6 into the build tasks below (plus the flip task and a live-trial validation child), and flipped the spec's status heading READY → DOING per the transition-ownership table. - -*** 2026-07-02 Thu @ 01:07:29 -0400 Phase 1 landed — execution loop extracted into work-the-backlog.org -work-the-backlog.org written (canonical + mirror): caller contract (task set + session mode + cap), five-outcome vocabulary, the loop, mechanical eligibility gate (TODO + :solo: per scheme header, safe-by-omission, no-scheme-header → don't run), four-item defer checklist, per-task quality bar, cap/kill-switch semantics, page + metrics stubs pointing at Phases 4-5. inbox.org's auto-mode per-cycle item 3 reverted to routing-only (yes-path execution removed; mode intro + closing line updated to match). INDEX.org entry added. make test green, sync clean; nothing invokes the new workflow yet. - -*** 2026-07-02 Thu @ 01:13:33 -0400 Phase 2 landed — both callers wired -inbox.org auto-mode item 3 regained its "run this batch next?" ask, now chaining into work-the-backlog as an explicit second step after routing (eligibility query + file-only + paging off + cap 1). work-the-backlog.org gained the two caller sections: the auto-loop contract and the no-approvals speedrun preset (seven-step pre-flight → autonomous-commit + always-push + paging-on over an explicit list; finer Q&A mechanics deferred to Phase 4). Speedrun trigger phrases live in the workflow + INDEX; "speedrun" always routes to the preset, with a disambiguation note in no-approvals.org and its INDEX entry. Each caller independently exercisable. - -*** 2026-07-02 Thu @ 01:18:07 -0400 Phase 3 landed — waiver-gated commit autonomy -Pinned the waiver format per D5: two marker lines in .ai/notes.org Workflow State — :COMMIT_AUTONOMY: yes (has the waiver) and :LOOP_MAY_COMMIT: yes (the unattended loop may also commit; requires the first). Absent or non-yes reads as no; the read is a fresh grep each run, never memory. Degrade contract written into work-the-backlog.org (surface in run intro + summary, never honor without the marker, never degrade silently); caller sections + Common Mistakes updated. Stamped rulesets' own :COMMIT_AUTONOMY: yes; :LOOP_MAY_COMMIT: deliberately not granted — Craig's call. .emacs.d holds the waiver too but its notes.org is its own scope; told via inbox-send to stamp its marker. - -*** 2026-07-02 Thu @ 01:21:47 -0400 Phase 4 landed — checklist mechanics, pre-flight Q&A contract, page -The four-item checklist (in since Phase 1) gained its mechanics: a VERIFY-filing subsection (dedup against an existing sibling first — the deferred task stays TODO, so without the check every run re-files; placement/heading/body per todo-format.md) and a quick-question routing subsection (discriminator: one-line factual/preference pick vs tradeoff-weighing; three-plus questions = underspecified = file; item 2 data-loss never routes to Q&A). Preset section gained the batch-ask contract (one message, recommendation-first numbered options per interaction.md, answers recorded as dated lines in the task bodies before the run). Page section finalized (fires once on set-done or cap-hit; notify --persist is the paging surface). Common Mistakes 12-13 added. Checklist only ever reduces what runs; pre-flight fires only under the preset. - -*** 2026-07-02 Thu @ 01:24:50 -0400 Phase 5 landed — per-task JSONL metrics log -Metrics section written into work-the-backlog.org: one record per task at outcome time, appended to the project's .ai/metrics/work-the-backlog.jsonl (git-tracked, append-only, dir+file created on first append). Full field table per the spec (ts, run_id, project, caller, task, outcome, defer_reason, upfront_decision, wall_clock_s, commit_sha, review_findings), outcome slugs mapped to the prose vocabulary, commit_sha flagged as the corrections-signal key (comma-separated when a task decomposed into several commits). Added the sixth outcome the spec's readiness section demanded but the enum missed: failed (tree left working, surfaced, run continues) — wired into the Outcomes vocabulary and loop step 4. A failed append warns in the run summary but never blocks, reorders, or aborts execution. - -*** 2026-07-02 Thu @ 01:27:43 -0400 Phase 6 landed — synthesis step to org-roam -Synthesis section written into work-the-backlog.org (trigger "synthesize backlog metrics", INDEX row added): discover the JSONL union across project roots, classify each project per knowledge-base.md's denylist before reading, exclude work/unknown projects with the refusal contract, compute per-run rollups + trends, compute the corrections signal (later revert/fix commit touching the same files within ~14 days — a flag for human review, not a conviction), write one :agent:metrics: KB node under ~/org/roam/agents/ with [[id:...]] links to prior synthesis nodes, pull-before/commit-push-after. Read-only over the logs plus the single KB write; never mutates JSONL, todo.org, or any tree. - -*** 2026-07-02 Thu @ 05:26:07 -0400 Live trial passed — first speedrun ran 3/3, every loop part exercised -Craig named the ordered set (id-link conversion, host-identity guard, template-sync policy) and said it was the validation run. Pre-flight Q&A fired once (two questions, both answered, answers stamped as dated lines before the run); each task landed as its own reviewed commit under the waiver (78bbaae, b6a977c, ed75d3c); metrics JSONL carries one record per task (run c726f526); the end-of-set page arrived via notify --persist. Nothing needed a VERIFY this run (all three cleared the checklist). Craig's read: granted :LOOP_MAY_COMMIT: on the strength of the run. - -*** 2026-07-02 Thu @ 05:26:07 -0400 Flipped the spec DOING → IMPLEMENTED -All six phases built and the live trial validated. Keyword, dated history line, and Metadata mirror all flipped per the transition-ownership table. - -** DONE [#B] inbox-send filename collision silently overwrote a message :bug:solo: -CLOSED: [2026-07-02 Thu] -:PROPERTIES: -:CREATED: [2026-07-02 Thu] -:END: -From archsetup (2026-07-02 0543, found in the wild): two --text sends in the same minute whose text starts with the same phrase derive identical filenames, and the second silently overwrites the first — archsetup lost a message at 05:42 and had to resend. Severity data-loss x rare-edge = P2 = [#B]. - -Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): uniquify() guard in inbox-send.py — an existing target gets a -2/-3/... stem suffix, both send_text and send_file paths, extension preserved. Four red-first tests reproduce the loss (module-level with a fixed timestamp so the same-minute collision is deterministic, plus a CLI loss-proof check); 30/30 green. - -** DONE [#C] page-me notify styling — all-red too alarming :bug:solo: -CLOSED: [2026-07-02 Thu] -:PROPERTIES: -:CREATED: [2026-07-02 Thu] -:END: -From Craig via the roam inbox (2026-07-02, routed by archsetup): the page notify's all-red styling "makes me feel like somehow the system is about to crash" — should be a persistent info-level notification. - -Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): pages now use notify info --persist instead of notify alarm — page-me.org (all examples + prose, with Craig's verdict recorded) and work-the-backlog.org's end-of-set page. status-check's success/fail types untouched (job outcomes, not pages). - -** DONE [#C] Template sync with gitignored-only local changes :feature: -CLOSED: [2026-07-02 Thu] -From Craig via the roam inbox (2026-07-02, routed by archsetup): downstream projects should still pull template updates when their local changes sit entirely in gitignored files or directories — an inbox drop or a file left to read doesn't affect the templates, yet it currently holds the sync back and projects fall behind. When worked: verify how the sync gate actually detects dirtiness today, then let gitignored-only changes pass it. - -2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): policy + audit. Scope read found startup's git gates already ignore untracked/ignored files; state the policy in startup.org and audit every dirty-check in the synced workflows to match (monitor-inbox's bare porcelain check is the known offender; tracked-modification blocking stays). - -Resolution 2026-07-02: template-freshness policy stated in startup.org Phase A.0 (dirty = tracked modifications only; untracked/gitignored never block pulls, ffs, or monitoring gates; the rsync WIP-guard named as the one deliberate exception — it holds back rulesets' own outbound WIP). Full audit of dirty-checks across synced workflows: startup's two git gates already complied; inbox.org monitor mode was the one offender — its precondition now uses --untracked-files=no with the explicit-staging rationale, and its close-out sweeps tracked changes only. triage-intake auto mode borrows monitor's gates, so it inherits the fix by reference. - ** TODO [#C] Wrap-it-up summary mode — keep or cut :feature: From Craig via the roam inbox (2026-07-02, routed by archsetup). Teardown-by-default already shipped (bare "wrap it up" closes the window; "with summary" keeps it). Craig's follow-on: "maybe we cut the summary altogether. help me think through when I'd want a summary and how I would recognize it before confirming and then having it close." Run that think-through with him (brainstorm-shaped, not solo), then adjust wrap-it-up.org's Step 6 + trigger phrases to the outcome. @@ -1386,3 +1294,90 @@ Fresh pre-flight, all green: Stop hook block in =~/.claude/settings.json= points *** 2026-07-01 Wed @ 21:59:43 -0400 Manual end-to-end validation passed — all five tests, Craig's live run Craig ran the full checklist in a live Emacs/tmux ai-term setup: (1) bare "wrap it up" tore down after the valediction rendered, geometry restored, no lingering sentinel; (2) "with summary" / "and summarize" both wrapped without teardown, buffer stayed readable; (3) "wrap it up and shutdown" with another aiv-* session live refused the shutdown, named the other session, fell back to a normal wrap; (4) as the sole session, the 10→1 echo-area countdown rendered one-per-second, C-g cancelled cleanly, and a full run fired the (stubbed) shutdown command; (5) with the push made to fail, the wrap stopped at the failure and no sentinel was dropped. Works great — feature validated and live. Both sides complete: rulesets Stop hook + wrap-it-up Teardown mode, .emacs.d companion functions. +** DONE [#C] Guard against hardcoded host identity in synced files :feature:solo: +CLOSED: [2026-07-02 Thu] +:PROPERTIES: +:CREATED: [2026-06-22 Mon] +:LAST_REVIEWED: 2026-06-24 +:END: +A =CLAUDE.md= / notes file that asserts mutable environment identity as a fixed fact ("This machine is ratio", a current OS, an IP, "the laptop") is false on every machine the synced/tracked file lands on but one. It bit a real archsetup session: a stale "this machine is ratio" line made the agent reason backwards all session while on velox. Proposal: a claude-rule — don't assert mutable host/env identity as a fixed fact in a tracked/synced project file; derive it at runtime and name the command (=uname -n= for host; the =hostname= binary is often absent). Optionally a codify- or startup-time lint flagging "this machine is <name>" / "the current host is" style claims. Proposal: [[file:docs/design/2026-06-21-host-identity-guard-proposal.org][proposal]]. From archsetup 2026-06-21. + +2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): rule + startup lint. A new claude-rules file plus a cheap grep probe in startup flagging host-identity claims in CLAUDE.md / notes.org fleet-wide. + +Resolution 2026-07-02: claude-rules/host-identity.md written (fixed-identity claims banned in tracked/synced docs, runtime derivation via uname -n, fleet-description carve-out, the archsetup worked failure) and linked machine-wide by make install. startup.org gained Phase A probe 13 (grep for "this machine/host/box is" claims in CLAUDE.md + notes.org, fixture-verified bash+zsh) and the Phase C host-identity flag line. Flags for judgment, never blocks. +** DONE [#C] No-approvals speedrun — cross-project autonomous-batch mode :feature:spec: +CLOSED: [2026-07-02 Thu] +:PROPERTIES: +:CREATED: [2026-06-15 Mon] +:LAST_REVIEWED: 2026-06-24 +:SPEC_ID: 90f623cd-fdbe-4f5c-b63d-b2f84d9151cf +:END: +A named mode for coding projects: Craig names an ordered task set and says "speedrun" / "no approvals speedrun"; the set is worked autonomously, each task held to the full quality bar (TDD red→green, =/review-code=, =/voice= on the commit) and committed + pushed as its own logical commit, with all needed quick decisions gathered in one pre-flight Q&A (answer or "skip this") and a VERIFY filed for anything underspecified or needing deliberation, plus an end-of-set page listing completed + remaining + skipped tasks. Task size is not a gate — large tasks decompose into per-commit chunks. Surfaced by .emacs.d from a 2026-06-15 theme-studio session where the shape worked. Source proposal: [[file:docs/design/2026-06-15-fix-speedrun-workflow-proposal.org]] (.emacs.d handoff 2026-06-15). Build via =spec-create= when worked; we handle the task in priority order. + +Skeptical-review read (open design questions to resolve in the spec, not settled here): +- *Is it a new workflow or a documented preset?* The proposal frames it as no-approvals + always-push session modes plus an end page. Decide whether it needs its own workflow file or is mostly documentation of a preset over the two existing modes. +- *Where/how the page fires* — every task vs end-of-set, and via what. The paging surface is in flux (=page-signal= removed 2026-06-12), so reconcile against =notify --persist= or whatever paging stands now. +- *Auto-pull vs explicit list* — whether the set comes from an explicit ordered list or a tag/priority query. +- *Guardrails* — must refuse to speedrun tasks needing design decisions or carrying data-loss risk without a checkpoint (the sender's biased-safe unused-tile flag is the worked example). + +*** 2026-07-01 Wed @ 22:10:35 -0400 Phase 0 landed — hard tag definitions + review/audit enforcement +todo-format.md gained the "Hard definitions: :solo: and :quick:" subsection under the scheme header (fixed across projects: :solo: = buildable + agent-verifiable + no deliberation, with one-or-two upfront-answerable quick decisions allowed per the ratified spec; :quick: = ≤30-min effort hint, never a gate). task-review.org: the two tagging sections are now explicitly mandatory ("a review that skips them is incomplete") and gate 3 was realigned from "no upfront decision" to the spec's no-deliberation form — the stricter old wording predated the pre-flight-Q&A decision and would have wrongly excluded quick-question tasks. task-audit.org: the re-assess bullet is marked mandatory and points at the todo-format hard definitions as canonical. Phases 1-6 (work-the-backlog extraction, callers, commit gate, checklist/Q&A/page, metrics, synthesis) remain. + +*** 2026-06-16 Tue @ 00:53:36 -0500 Spec written; design questions answered +Craig's "your call" (2026-06-16) answered in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][the autonomous-batch execution spec]], which reconciles this with Phase E into one feature: +- *Most effective / workflow-vs-preset:* one dedicated =work-the-backlog.org= workflow holds the execution loop; "fix speedrun" is a thin named preset (no-approvals + always-push + end page) feeding it an explicit list, and the inbox-zero loop feeds it a tag query. Pros of the shared workflow: one execution loop to audit, inbox-zero's three callers stay clean, both input shapes reuse one guardrail set. Cons: one more workflow file and a caller-to-workflow indirection. The con list is shorter and lighter than the duplication cost of two separate features, which is why the shared workflow wins. The pros carry the more important entries (single audit surface, clean seam). +- *Paging:* end-of-set only, via =notify ... --persist= (reconciled past the removed page-signal wrapper). +- *Auto-pull vs explicit list:* both — explicit list for the preset, tag/priority query for the loop. +- *Effectiveness measurement (the trial Craig asked for):* the spec designs a per-task JSONL metrics log (=.ai/metrics/work-the-backlog.jsonl=), a corrections-in-next-session signal, and a periodic synthesis step that writes =:agent:metrics:= org-roam articles for later review — the "gather data + create org-roam articles" loop. +*** 2026-06-29 Mon @ 03:48:09 -0400 Ratified the autonomous-batch execution spec +Craig ratified all eight decisions in [[id:90f623cd-fdbe-4f5c-b63d-b2f84d9151cf][2026-06-16-autonomous-batch-execution-spec.org]] (revised this session — size gate removed, crisp four-item defer checklist, =:solo:= / =:quick:= definitions + task-review/audit enforcement, speedrun pre-flight Q&A). Spec Status → ready; implementation-ready across Phase 0–6. Decisions grew from six to eight during the revision. + +*** 2026-07-02 Thu @ 00:44:59 -0400 spec-response decomposition — :SPEC_ID: bound, spec DOING +Stamped the spec's UUID on this parent, broke Phases 1-6 into the build tasks below (plus the flip task and a live-trial validation child), and flipped the spec's status heading READY → DOING per the transition-ownership table. + +*** 2026-07-02 Thu @ 01:07:29 -0400 Phase 1 landed — execution loop extracted into work-the-backlog.org +work-the-backlog.org written (canonical + mirror): caller contract (task set + session mode + cap), five-outcome vocabulary, the loop, mechanical eligibility gate (TODO + :solo: per scheme header, safe-by-omission, no-scheme-header → don't run), four-item defer checklist, per-task quality bar, cap/kill-switch semantics, page + metrics stubs pointing at Phases 4-5. inbox.org's auto-mode per-cycle item 3 reverted to routing-only (yes-path execution removed; mode intro + closing line updated to match). INDEX.org entry added. make test green, sync clean; nothing invokes the new workflow yet. + +*** 2026-07-02 Thu @ 01:13:33 -0400 Phase 2 landed — both callers wired +inbox.org auto-mode item 3 regained its "run this batch next?" ask, now chaining into work-the-backlog as an explicit second step after routing (eligibility query + file-only + paging off + cap 1). work-the-backlog.org gained the two caller sections: the auto-loop contract and the no-approvals speedrun preset (seven-step pre-flight → autonomous-commit + always-push + paging-on over an explicit list; finer Q&A mechanics deferred to Phase 4). Speedrun trigger phrases live in the workflow + INDEX; "speedrun" always routes to the preset, with a disambiguation note in no-approvals.org and its INDEX entry. Each caller independently exercisable. + +*** 2026-07-02 Thu @ 01:18:07 -0400 Phase 3 landed — waiver-gated commit autonomy +Pinned the waiver format per D5: two marker lines in .ai/notes.org Workflow State — :COMMIT_AUTONOMY: yes (has the waiver) and :LOOP_MAY_COMMIT: yes (the unattended loop may also commit; requires the first). Absent or non-yes reads as no; the read is a fresh grep each run, never memory. Degrade contract written into work-the-backlog.org (surface in run intro + summary, never honor without the marker, never degrade silently); caller sections + Common Mistakes updated. Stamped rulesets' own :COMMIT_AUTONOMY: yes; :LOOP_MAY_COMMIT: deliberately not granted — Craig's call. .emacs.d holds the waiver too but its notes.org is its own scope; told via inbox-send to stamp its marker. + +*** 2026-07-02 Thu @ 01:21:47 -0400 Phase 4 landed — checklist mechanics, pre-flight Q&A contract, page +The four-item checklist (in since Phase 1) gained its mechanics: a VERIFY-filing subsection (dedup against an existing sibling first — the deferred task stays TODO, so without the check every run re-files; placement/heading/body per todo-format.md) and a quick-question routing subsection (discriminator: one-line factual/preference pick vs tradeoff-weighing; three-plus questions = underspecified = file; item 2 data-loss never routes to Q&A). Preset section gained the batch-ask contract (one message, recommendation-first numbered options per interaction.md, answers recorded as dated lines in the task bodies before the run). Page section finalized (fires once on set-done or cap-hit; notify --persist is the paging surface). Common Mistakes 12-13 added. Checklist only ever reduces what runs; pre-flight fires only under the preset. + +*** 2026-07-02 Thu @ 01:24:50 -0400 Phase 5 landed — per-task JSONL metrics log +Metrics section written into work-the-backlog.org: one record per task at outcome time, appended to the project's .ai/metrics/work-the-backlog.jsonl (git-tracked, append-only, dir+file created on first append). Full field table per the spec (ts, run_id, project, caller, task, outcome, defer_reason, upfront_decision, wall_clock_s, commit_sha, review_findings), outcome slugs mapped to the prose vocabulary, commit_sha flagged as the corrections-signal key (comma-separated when a task decomposed into several commits). Added the sixth outcome the spec's readiness section demanded but the enum missed: failed (tree left working, surfaced, run continues) — wired into the Outcomes vocabulary and loop step 4. A failed append warns in the run summary but never blocks, reorders, or aborts execution. + +*** 2026-07-02 Thu @ 01:27:43 -0400 Phase 6 landed — synthesis step to org-roam +Synthesis section written into work-the-backlog.org (trigger "synthesize backlog metrics", INDEX row added): discover the JSONL union across project roots, classify each project per knowledge-base.md's denylist before reading, exclude work/unknown projects with the refusal contract, compute per-run rollups + trends, compute the corrections signal (later revert/fix commit touching the same files within ~14 days — a flag for human review, not a conviction), write one :agent:metrics: KB node under ~/org/roam/agents/ with [[id:...]] links to prior synthesis nodes, pull-before/commit-push-after. Read-only over the logs plus the single KB write; never mutates JSONL, todo.org, or any tree. + +*** 2026-07-02 Thu @ 05:26:07 -0400 Live trial passed — first speedrun ran 3/3, every loop part exercised +Craig named the ordered set (id-link conversion, host-identity guard, template-sync policy) and said it was the validation run. Pre-flight Q&A fired once (two questions, both answered, answers stamped as dated lines before the run); each task landed as its own reviewed commit under the waiver (78bbaae, b6a977c, ed75d3c); metrics JSONL carries one record per task (run c726f526); the end-of-set page arrived via notify --persist. Nothing needed a VERIFY this run (all three cleared the checklist). Craig's read: granted :LOOP_MAY_COMMIT: on the strength of the run. + +*** 2026-07-02 Thu @ 05:26:07 -0400 Flipped the spec DOING → IMPLEMENTED +All six phases built and the live trial validated. Keyword, dated history line, and Metadata mirror all flipped per the transition-ownership table. +** DONE [#B] inbox-send filename collision silently overwrote a message :bug:solo: +CLOSED: [2026-07-02 Thu] +:PROPERTIES: +:CREATED: [2026-07-02 Thu] +:END: +From archsetup (2026-07-02 0543, found in the wild): two --text sends in the same minute whose text starts with the same phrase derive identical filenames, and the second silently overwrites the first — archsetup lost a message at 05:42 and had to resend. Severity data-loss x rare-edge = P2 = [#B]. + +Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): uniquify() guard in inbox-send.py — an existing target gets a -2/-3/... stem suffix, both send_text and send_file paths, extension preserved. Four red-first tests reproduce the loss (module-level with a fixed timestamp so the same-minute collision is deterministic, plus a CLI loss-proof check); 30/30 green. +** DONE [#C] page-me notify styling — all-red too alarming :bug:solo: +CLOSED: [2026-07-02 Thu] +:PROPERTIES: +:CREATED: [2026-07-02 Thu] +:END: +From Craig via the roam inbox (2026-07-02, routed by archsetup): the page notify's all-red styling "makes me feel like somehow the system is about to crash" — should be a persistent info-level notification. + +Resolution 2026-07-02 (auto-inbox-zero loop, standing yes): pages now use notify info --persist instead of notify alarm — page-me.org (all examples + prose, with Craig's verdict recorded) and work-the-backlog.org's end-of-set page. status-check's success/fail types untouched (job outcomes, not pages). +** DONE [#C] Template sync with gitignored-only local changes :feature: +CLOSED: [2026-07-02 Thu] +From Craig via the roam inbox (2026-07-02, routed by archsetup): downstream projects should still pull template updates when their local changes sit entirely in gitignored files or directories — an inbox drop or a file left to read doesn't affect the templates, yet it currently holds the sync back and projects fall behind. When worked: verify how the sync gate actually detects dirtiness today, then let gitignored-only changes pass it. + +2026-07-02 Thu @ 05:09:58 -0400 — Craig (speedrun pre-flight): policy + audit. Scope read found startup's git gates already ignore untracked/ignored files; state the policy in startup.org and audit every dirty-check in the synced workflows to match (monitor-inbox's bare porcelain check is the known offender; tracked-modification blocking stays). + +Resolution 2026-07-02: template-freshness policy stated in startup.org Phase A.0 (dirty = tracked modifications only; untracked/gitignored never block pulls, ffs, or monitoring gates; the rsync WIP-guard named as the one deliberate exception — it holds back rulesets' own outbound WIP). Full audit of dirty-checks across synced workflows: startup's two git gates already complied; inbox.org monitor mode was the one offender — its precondition now uses --untracked-files=no with the explicit-staging rationale, and its close-out sweeps tracked changes only. triage-intake auto mode borrows monitor's gates, so it inherits the fix by reference. |
