path: root/.ai/workflows/work-the-backlog.org
author Craig Jennings <c@cjennings.net> 2026-07-02 01:38:24 -0400
committer Craig Jennings <c@cjennings.net> 2026-07-02 01:38:24 -0400
commit d4f132b716a6cdbc3a6a521a21fd2811c9da3480 (patch)
tree 3cfa003d4924866c28792e569a7e6c32d9d2e578 /.ai/workflows/work-the-backlog.org
parent 44c8cc2f0657653b2173faaa3518fa74f931d468 (diff)
feat(flush): add auto mode with self-injected /clear for unattended runs

Long autonomous sessions bloat or hit auto-compaction because /clear is a prompt keystroke no tool call can execute. Auto mode closes that gap: after the write-verified checkpoint, the agent derives its own tmux pane, arms self-inject.sh through tmux run-shell -b, and ends the turn so /clear and a resume line land at an idle prompt.

The server-owned arm is load-bearing: a detached child of a tool call dies at the turn boundary. The pane must be derived before arming because ancestry detection can't work under the tmux server.

self-inject.sh joins the synced scripts with a six-test bats suite, tmux stubbed at the boundary. work-the-backlog now auto-flushes between tasks when context grows heavy, and its speedrun preset gained the per-item disposition rule: feature-level work gets a spec, unguessable decisions get a VERIFY, well-defined tasks get implemented.

The mechanism was proven live in another project's session and its design note is preserved under docs/design/.
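
For readers skimming the commit, the per-item disposition rule named in the message can be sketched as a small decision function. This is only an illustration, not code from the repository: the `Item` fields, the `Disposition` enum, and the `dispose` function are hypothetical names, and the precedence when an item is both feature-level and decision-blocked (spec wins here) is an assumption the workflow text does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    SPEC = "spec"            # feature-level work: write a spec, don't implement
    VERIFY = "verify"        # unguessable decision: file a VERIFY with the question
    IMPLEMENT = "implement"  # well-defined task: implement it


@dataclass
class Item:
    # Hypothetical fields; in the workflow these are judgments the agent makes,
    # not properties stored on the task.
    title: str
    feature_level: bool = False
    unguessable_decision: bool = False


def dispose(item: Item) -> Disposition:
    """Pick the shape of the act for one backlog item."""
    # Assumption: feature-level is checked first, mirroring the order in
    # which the rule lists its three cases.
    if item.feature_level:
        return Disposition.SPEC
    if item.unguessable_decision:
        return Disposition.VERIFY
    return Disposition.IMPLEMENT
```

Calling `dispose(Item("rename a flag"))` would fall through to `Disposition.IMPLEMENT`, while an item marked `unguessable_decision=True` would be filed as a VERIFY rather than implemented on a guess.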
Diffstat (limited to '.ai/workflows/work-the-backlog.org')
-rw-r--r-- .ai/workflows/work-the-backlog.org | 16
1 files changed, 15 insertions, 1 deletions
diff --git a/.ai/workflows/work-the-backlog.org b/.ai/workflows/work-the-backlog.org
index 284935b..642162d 100644
--- a/.ai/workflows/work-the-backlog.org
+++ b/.ai/workflows/work-the-backlog.org
@@ -140,6 +140,10 @@ The cap is a hard per-run task ceiling passed by the caller — the kill switch
Even the speedrun stops at the cap and surfaces (and, with paging on, pages) the remainder. The cap bounds task *count*, not cost; a token budget is logged as vNext.
+* Context hygiene — auto-flush between tasks
+
+Task boundaries are clean boundaries by construction: the previous task is closed and committed (or filed), nothing is half-edited. When the context window grows heavy mid-run, run the flush skill's *auto mode* between tasks: checkpoint the session anchor with the remaining task set, session mode, and cap in Next Steps (so the resumed context continues the run blind), arm the self-injection (=.ai/scripts/self-inject.sh= via =tmux run-shell -b=), and end the turn. The fresh context resumes from the anchor and works on. Unattended runs only — the keystroke-collision hazard and the full mechanism live in the flush skill.
+
* End-of-set page
With paging on, fire one page when the set is done or the cap is hit — end-of-set only, never per-task:
@@ -207,11 +211,21 @@ When Craig names a task set and says "speedrun":
3. *Order* the list — priority, then the author's ordering / =:next:=.
4. *Intro the work* — present the ordered plan: what will run, what was dropped and why, and the batched questions for the needs-quick-decisions tasks.
5. *Craig answers each question or says "skip this"* — a skip removes the task (recorded =dropped-by-craig=; the task itself stays =TODO=); an answer is recorded so implementation works from the decision, not a guess.
-6. *Run the finalized list autonomously* — no further approvals until done. Cap = the list length (the human bounded the set by naming it), still one commit per logical change, always-push per the project's flow.
+6. *Run the finalized list autonomously* — no further approvals until done. Cap = the list length (the human bounded the set by naming it), still one commit per logical change, always-push per the project's flow, auto-flushing between tasks when the context grows heavy (see Context hygiene above).
7. *End-of-set page* with completed + remaining + skipped.
The batch-ask (step 4-5) is one message: each question names its task, puts the recommended answer at item 1 when there is one (per =interaction.md= — inline numbered, no popup), and offers "skip this" as the last option. Before the run starts, write each answer into its task's body in =todo.org= as a dated line — the implementation works from the recorded decision, and the record survives the session. The Q&A fires only under this preset; the loop caller never asks (its decision-needing tasks defer).
+*** Per-item disposition rule
+
+For every item the run picks up (this holds for any executing caller, including an auto-inbox-zero run given a standing yes):
+
+- *Feature-level task* → write a spec first (=spec-create=), don't implement directly. The spec is the run's deliverable for that item.
+- *Needs decisions you can't confidently guess* → file it as a =VERIFY= carrying the question (under this preset, one or two quick questions route to the pre-flight Q&A instead).
+- *Well-defined* → implement it, taking the time it needs.
+
+This extends the defer checklist: the checklist decides *act vs file*; this rule decides the *shape* of the act.
+
* Synthesis: metrics → org-roam KB
Trigger: "synthesize backlog metrics" (optionally a weekly scheduled run). This is the read side of the metrics log — Craig's ask was "gather data and create org-roam articles we can look at later," and this step is the second half. It is read-only over the logs plus exactly one KB write.