docs(design): task-review spec + filed [#A] task

docs/design/task-review.org captures the brainstorm output for a daily 5-min keystroke-driven review habit that walks 7 oldest-unreviewed top-level [#A]/[#B]/[#C] tasks per session, rotating through the list over ~12 days. Replaces wrap-it-up.org's date-coverage scan once implemented; the watchdog flips from "do all priorities have dates?" to "is the review habit happening?" with a 30-day threshold. todo.org gets a [#A] entry at the top of Rulesets Open Work pointing at the spec, so the implementation work isn't lost. Six components in the spec's Next Steps: extract task-review-staleness.sh, replace the wrap-up section, author task-review.el in archsetup, author the workflow file plus INDEX entry, add the startup nudge, smoke test.
author: Craig Jennings <c@cjennings.net> 2026-05-16 11:59:40 -0500
committer: Craig Jennings <c@cjennings.net> 2026-05-16 11:59:40 -0500
commit: daf960fad5169f0d3fa52bae770e83ad73344f30 (patch)
tree: 92a0884f16c91299ce8ab63591d482049d04065c /docs/design/task-review.org
parent: 9c5aff1febe3eaf44dfe59bc3e3b70a35cde4c8a (diff)
download: rulesets-daf960fad5169f0d3fa52bae770e83ad73344f30.tar.gz
rulesets-daf960fad5169f0d3fa52bae770e83ad73344f30.zip
1 files changed, 335 insertions, 0 deletions
diff --git a/docs/design/task-review.org b/docs/design/task-review.org
new file mode 100644
index 0000000..79b7d4e
--- /dev/null
+++ b/docs/design/task-review.org
@@ -0,0 +1,335 @@
+#+TITLE: Design: Daily Task-Review Habit
+#+AUTHOR: Craig Jennings & Claude
+#+DATE: 2026-05-16
+#+OPTIONS: toc:nil num:nil
+
+* Metadata
+
+- Date: 2026-05-16
+- Status: Draft
+- Supersedes: =wrap-it-up.org= date-coverage scan (lines 177-213 at design time)
+
+* Problem
+
+The wrap-up's date-coverage scan flags every open =[#A]= / =[#B]=
+=todo.org= task that lacks a =DEADLINE:= or =SCHEDULED:= line. The
+underlying assumption — that high-priority work needs a date — is
+wrong. A task can be important without being time-bound (research,
+foundational work, watch-list items). The scan generates per-night
+cleanup work that the operator either dismisses repetitively or
+papers over with bogus dates, neither of which improves the list.
+
+The actual need is a recurring judgment habit: walk the task list,
+verify each entry is still important, well-written, and actionable,
+and act on what's drifted. Once the habit is in place, the date-
+coverage scan becomes redundant.
+
+Failure modes the habit is designed to catch (in order of severity):
+
+1. /Priority drift/ — =[#A]=s that were urgent six months ago and
+   aren't anymore; =[#C]=s that should be =[#A]=s now.
+2. /Stalled commitments/ — tasks that keep getting deferred without
+   ever being killed or started; they sit as low-grade guilt.
+3. /Prose decay/ — entries whose body has lost context or whose next
+   action isn't clear anymore.
+
+Success metric: /trust/. When the operator opens =todo.org=, the
+priorities and content feel accurate. Quantitative throughput targets
+were rejected as distorting the thing they measure.
+
+* Non-Goals
+
+- Replacing =org-agenda= for daily planning. The review is a /list
+  hygiene/ habit, not a planning ritual.
+- Reviewing every task in the list each session. Coverage is
+  rotational, ~12-day cycle.
+- Managing children of top-level tasks. Children inherit parent
+  priority; only top-level =**= headings are review units.
+- Multi-file review. One =todo.org= per project per session.
+- Replacing the lint-org pass. That stays as-is; only the date-
+  coverage subsection of =wrap-it-up.org= retires.
+- Cross-machine state sync beyond what =todo.org= already gets from
+  the project's normal git/file-sync path.
+
+* Approaches Considered
+
+Six approaches were generated (three conventional, three tail).
+
+** Recommended: Pure Emacs custom agenda + keystroke macros
+
+A custom =org-agenda= view surfaces the N oldest-unreviewed top-level
+=[#A]= / =[#B]= / =[#C]= entries from =todo.org=, sorted by a
+=:LAST_REVIEWED:= property ascending (NIL sorts first). A minor
+mode binds single keystrokes to actions (re-grade, kill, mark DOING,
+keep, skip, edit). Each action stamps =:LAST_REVIEWED:= and advances
+point. The review fires daily; cycle length ≈12 days at N=7.
+
+/Why this wins/: trust as the success metric demands the habit
+actually happens daily. Friction kills daily habits. Single
+keystrokes in an agenda Craig already opens minimize friction. The
+operator lives in Emacs already; routing through Claude every
+morning fights that. Re-grade and kill (the dominant actions) are
+keystroke-trivial. The 5-minute slot constraint rules out Claude
+session round-trips.
+
+/Trade-offs/: requires 1-2 evenings of elisp work upfront (vs. a
+Claude command that could ship today). Prose touch-ups are rare per
+the failure-mode analysis; when they happen, escalate manually via
+=e= (edit in source buffer) or invoke Claude separately.
+
+** Rejected: =/task-review= Claude command (daily-prep step)
+
+Conversational walk per task. Pro: immediately buildable, no elisp.
+Con: Claude session every morning eats most of the 5-min slot;
+5-10 conversational decisions before coffee is heavy friction.
+
+** Rejected: Hybrid Emacs + Claude on demand
+
+Emacs surfaces the batch; Claude handles non-obvious decisions on
+request. Pro: deep when needed. Con: two surfaces to maintain;
+"when do I escalate" adds cognitive load. May become a v2 path if
+the pure Emacs version surfaces a real need for Claude analysis.
+
+** Rejected: No state, random daily sample
+
+Probabilistic coverage via random selection. Pro: zero state. Con:
+no guarantee every task gets reviewed in a window; stale tasks can
+(probabilistically) hide for months. Harder to claim trust.
+
+** Rejected: Review at write-time
+
+Hook into task creation: prompt to re-grade 1-2 existing tasks
+when adding a new one. Pro: no separate habit. Con: doesn't fire
+on quiet weeks — exactly when rot accumulates most.
+
+** Rejected: Section-rotating review (one section per day)
+
+Each day review all tasks in one top-level org section. Pro: full
+context for one domain. Con: section sizes wildly uneven; doesn't
+surface oldest-untouched first; adds pressure to balance section
+counts.
+
+* Design
+
+** State Model
+
+A =:LAST_REVIEWED: YYYY-MM-DD= property on each top-level =**=
+heading, inside the standard =:PROPERTIES:= drawer. Drawer folds by
+default, so visual impact is one collapsed =:PROPERTIES:...:END:=
+line per task. NIL (never reviewed) sorts as oldest, so first-pass
+coverage is automatic without a manual seeding step.
+
+Mutation surface: one elisp call,
+=(org-entry-put (point) "LAST_REVIEWED" (format-time-string "%Y-%m-%d"))=.
+
+** Architecture
+
+Single elisp file: =task-review.el=. Lives at
+=archsetup/dotfiles/common/.emacs.d/lisp/task-review.el=. Loaded by
+=init.el= via =(require 'task-review)=. One key binding hooks the
+entry point.
+
+Exports:
+
+- =task-review-agenda= — interactive entry point. Builds the custom
+  agenda buffer.
+- =task-review-mode= — minor mode owning the keymap.
+- Action functions: one per keystroke, each ending in
+  =task-review--stamp-and-refresh=.
+
+Configuration variables (=defcustom=):
+
+- =task-review-batch-size= — default 7. Range 3-10 makes sense.
+- =task-review-todo-file= — default lookup:
+  =projectile-project-root= + =/todo.org= if present, else fallback.
+- =task-review-priorities= — default =("A" "B" "C")=.
+- =task-review-confirm-kill= — default =t=. Single-key =y=
+  confirmation on the =k= action.
+
+** Selection Algorithm
+
+=org-map-entries= over =task-review-todo-file=, keeping headings
+that match every predicate:
+
+- Heading depth = 2 (=**=; top-level under each section).
+- Keyword ∈ ={TODO, DOING, VERIFY}=. Excludes DONE and CANCELLED.
+- Priority cookie ∈ =task-review-priorities=.
+
+Sort by =LAST_REVIEWED= ascending. NIL sorts first. Ties broken by
+file position. Take first =task-review-batch-size= entries.
+
+If zero candidates: display "nothing to review" and return without
+opening a buffer. If fewer than batch size: surface what's available
+(final batch of a cycle).
+
+Cycle math at N=7 with ~80 open top-level tasks: ~12-day rotation,
+3-5 minute daily session.
+
+** Keymap
+
+Bindings in =task-review-mode=:
+
+| Key | Action | Side effects |
+|-----+--------+--------------|
+| =1= | Set priority to =[#A]= | Stamps, advances |
+| =2= | Set priority to =[#B]= | Stamps, advances |
+| =3= | Set priority to =[#C]= | Stamps, advances |
+| =r= | Keep as-is | Stamps, advances |
+| =k= | Kill → =CANCELLED= + =CLOSED:= line | y/n confirmation; no stamp (leaves pool); advances |
+| =d= | =TODO= → =DOING= | Stamps; advances; no-op message on =DOING= / =VERIFY= |
+| =e= | Open in source buffer for edit | Stamps on return |
+| =s= | Skip without stamping | Advances (task stays in pool) |
+| =q= | Quit | Saves =todo.org= silently; kills buffer |
+
+=r= is the highest-frequency keystroke by design — most reviewed
+tasks are still correct, just need re-stamping.
+
+** Workflow file
+
+Template-level workflow at
+=claude-templates/.ai/workflows/task-review.org= (canonical), rsync'd
+into every project's =.ai/workflows/task-review.org=. Registered in
+=INDEX.org= with three trigger aliases: =task review=, =review tasks=,
+=task-review=.
+
+Body shape:
+
+1. Precondition: confirm =todo.org= exists at the project root. If
+   absent, surface "this project has no =todo.org=" and exit.
+2. Open the review agenda:
+   =emacsclient -n --eval "(task-review-agenda)"=.
+3. Surface the keymap cheat sheet inline.
+4. Wait silently while the operator works in Emacs.
+5. Optional close-out: offer help on any task the operator marked
+   =e= for prose touch-up.
+
+** Wrap-up integration
+
+The existing date-coverage scan in
+=claude-templates/.ai/workflows/wrap-it-up.org= (lines 177-213 at
+design time) gets replaced with a /review-habit health check/:
+
+- Run =task-review-staleness.sh todo.org 30= (the extracted bash
+  script — see Testing section).
+- If the count is non-zero, append one line to =$followups=:
+  "/N top-level =[#A]= / =[#B]= / =[#C]= tasks unreviewed for >30
+  days. Daily review may have slipped./"
+- If zero, silent. No per-task list, no follow-ups dump.
+
+Threshold 30 days = ~2.5 cycle lengths of slack at N=7. One missed
+review week is fine; three weeks signals habit problem.
+
+** Startup reminder
+
+Project-specific. =.ai/project-workflows/startup-extras.org= gets a
+block calling =task-review-staleness.sh todo.org 7=. If the count
+is non-zero, surface one line in Phase C findings: "/N top-level
+tasks unreviewed for >7 days — say 'let's do a task review' to run
+a cycle/."
+
+Threshold 7 days = ~1 cycle of slack. Softer than the wrap-up alarm.
+
+** Error handling
+
+- =task-review-todo-file= missing → message "no todo.org found";
+  return without opening a buffer.
+- Malformed =LAST_REVIEWED= value → treated as NIL (sorts oldest).
+- Level-1 and level-3+ headings ignored by the selection query.
+- Headings without a priority cookie ignored.
+- =d= on =DOING= or =VERIFY= → one-line message ("already DOING" /
+  "VERIFY tasks transition via answer, not =d="); no state change.
+- =k= confirmation aborted → no state change.
+
+** Testing
+
+Three test surfaces.
+
+/1. ERT — =task-review.el= core/
+
+Test file: =archsetup/dotfiles/common/.emacs.d/lisp/tests/test-task-review.el=.
+~20 tests across Normal / Boundary / Error per =testing.md=.
+
+Design discipline: pure data layer
+(=task-review--candidates=, =task-review--sort-by-reviewed=,
+=task-review--stamp=, =task-review--kill-to-cancelled=,
+=task-review--transition-doing=, =task-review--regrade=) is
+testable directly. UI shell (=task-review-agenda=,
+=task-review-mode=, keymap) is composed of the data layer and
+verified manually.
+
+Each test uses =with-temp-buffer= + a synthetic =todo.org= fixture
+string. No global state.
+
+/2. Bats — =task-review-staleness.sh=/
+
+The staleness math is extracted to
+=claude-templates/.ai/scripts/task-review-staleness.sh= so both
+wrap-up and startup-extras call the same source. Tests at
+=claude-templates/.ai/scripts/tests/task-review-staleness.bats=
+(~5 cases): zero candidates, all stale, all fresh, mixed,
+threshold-boundary (exactly cutoff date).
+
+/3. Manual smoke checklist/
+
+1. Hit the binding → agenda opens with batch of 7.
+2. Press =1= → priority changes to =[#A]=, =LAST_REVIEWED= stamps,
+   cursor advances.
+3. Press =k= then =y= → task transitions to =CANCELLED=, =CLOSED:=
+   line added.
+4. Press =q= → buffer closes, =todo.org= saved.
+5. Reopen =task-review-agenda= → reviewed tasks no longer appear
+   in the top 7.
+
+** Observability
+
+Implicit. =LAST_REVIEWED= timestamps themselves are the log of
+when each task was last touched. =git blame= against =todo.org=
+gives a deeper trail when needed. No separate logging surface.
+
+* Open Questions
+
+These deferred decisions should resolve during or shortly after
+implementation:
+
+- [ ] Exact key binding for =task-review-agenda= entry point.
+  Suggested: =C-c r=. Likely candidate for arch-decide if a binding
+  collision surfaces.
+- [ ] Whether to promote the startup reminder to template-level
+  (currently project-specific by Chunk 6 decision). Revisit once a
+  second project adopts the elisp.
+- [ ] Whether =org-id-get-create= should be added to top-level tasks
+  during the first review cycle, hedging in case the state model
+  later migrates to a separate state file. Probably premature.
+- [ ] Behavior when =todo.org= is edited externally during a review
+  session. =task-review-agenda= reads the buffer once; concurrent
+  external edits could desync. Likely candidate: refresh on each
+  action.
+
+* Next Steps
+
+1. /Implementation/ — direct work (not =start-work=; the design is
+   the planning artifact). Order:
+
+   a. Extract staleness script (=claude-templates/.ai/scripts/task-review-staleness.sh=) with bats tests.
+   b. Edit =wrap-it-up.org= to call the staleness script with threshold 30.
+   c. Author =task-review.el= in =archsetup/dotfiles/common/.emacs.d/lisp/= with ERT tests.
+   d. Author =task-review.org= workflow in =claude-templates/.ai/workflows/= and add INDEX entry.
+   e. Add startup-extras block to =rulesets/.ai/project-workflows/startup-extras.org=.
+   f. Smoke test in =rulesets/= against live =todo.org=.
+
+2. /First cycle/ — run daily for ~2 weeks, gauge whether the habit
+   sticks. Adjust =task-review-batch-size= if 7/day feels wrong.
+
+3. /Retire/ — once the habit is established and trusted, the date-
+   coverage scan retirement is complete. The review-habit health
+   check stays as the watchdog.
+
+* References
+
+- Brainstorm session: =.ai/sessions/2026-05-16-*-task-review-design.org=
+  (filed at wrap-up of this session).
+- Replaced subsection: =claude-templates/.ai/workflows/wrap-it-up.org=
+  lines 177-213 (commit immediately preceding implementation).
+- Existing patterns followed:
+  =.ai/scripts/lint-org.el= (canonical-source rule, ERT layout),
+  =scripts/tests/audit.bats= (bats test patterns).
author	Craig Jennings <c@cjennings.net>	2026-05-16 11:59:40 -0500
committer	Craig Jennings <c@cjennings.net>	2026-05-16 11:59:40 -0500
commit	daf960fad5169f0d3fa52bae770e83ad73344f30 (patch)
tree	92a0884f16c91299ce8ab63591d482049d04065c /docs/design/task-review.org
parent	9c5aff1febe3eaf44dfe59bc3e3b70a35cde4c8a (diff)
download	rulesets-daf960fad5169f0d3fa52bae770e83ad73344f30.tar.gz rulesets-daf960fad5169f0d3fa52bae770e83ad73344f30.zip