docs(task-review): sharpen the :solo: tag definition

Craig clarified what :solo: means. The old third gate ("the outcome is verifiable locally, no ... confirmation that the result is right"), read literally, disqualified every task, since Craig spot-checks everything regardless of the tag. It conflated "Craig will also check" with "only Craig can check." The three gates are now: buildable, verifiable by Claude, and no upfront decision. The fix decouples Craig's routine spot-check from the determination: a task that Claude builds and verifies itself, leaving a manual-testing reminder for the residual human-in-the-loop confirmation, is solo. The disqualifier is having no verification path of Claude's own, that is, a result that only Craig's eyes can judge. task-audit.org Phase C already defers here for the definition, so this is the one edit site.
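
As an illustration only, the three-gate rule described above can be sketched as a small predicate. This is not part of the commit or of any tooling in the repository: the `TaskAssessment` record, its field names, and `is_solo` are all hypothetical names invented for this sketch. It aims to show that Craig's routine spot-check and a handed-off manual-testing reminder are deliberately not inputs to the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAssessment:
    """Hypothetical record of a reviewer's judgment on one task."""

    buildable: bool               # gate 1: Claude has the capability and access
    claude_can_verify: bool       # gate 2: a check exists that Claude can run itself
    needs_upfront_decision: bool  # gate 3 fails when Craig must decide something first
    # The two fields below are recorded but intentionally ignored by is_solo().
    craig_will_spot_check: bool = True
    leaves_manual_reminder: bool = False


def is_solo(task: TaskAssessment) -> bool:
    """Return True only when all three gates hold.

    Craig's spot-check and the manual-testing reminder never
    disqualify a task; only a missing gate does.
    """
    return (
        task.buildable
        and task.claude_can_verify
        and not task.needs_upfront_decision
    )


# Claude builds and verifies; Craig still spot-checks via a reminder.
with_reminder = TaskAssessment(True, True, False, leaves_manual_reminder=True)

# Success is only judgeable by Craig's eyes: no verification path for Claude.
eyes_only = TaskAssessment(True, False, False)

# A preference call is needed before work can begin.
needs_call = TaskAssessment(True, True, True)

print(is_solo(with_reminder), is_solo(eyes_only), is_solo(needs_call))
```

Under these assumptions the first task is solo and the other two are not. In practice the gates are judgment calls made while reading the task, and a shaky gate means leaving the tag off or asking Craig.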
author: Craig Jennings <c@cjennings.net> 2026-06-10 01:24:35 -0500
committer: Craig Jennings <c@cjennings.net> 2026-06-10 01:24:35 -0500
commit: 4a4a88cefda2e3689df594008e086382351be818 (patch)
tree: 4f958941c7596de274868796c86aeeb3a14953a2
parent: cc72aa635f733da36010567c8718b1ede7622c52 (diff)
2 files changed, 20 insertions, 4 deletions
diff --git a/.ai/workflows/task-review.org b/.ai/workflows/task-review.org
index 7cc1e29..69e172d 100644
--- a/.ai/workflows/task-review.org
+++ b/.ai/workflows/task-review.org
@@ -63,7 +63,15 @@ This is orthogonal to the action chosen — a task can be kept (or re-graded, or
 
 *** Tagging =:solo:= — tasks Claude can finish end-to-end
 
-While reviewing each task, judge whether Claude could finish it without Craig's input, and if so add =:solo:= to the heading line. Three gates, all of which must hold: the scope is well-defined and bounded, there's no design or preference call that needs Craig, and the outcome is verifiable locally — no waiting on hardware Craig owns, an external service, or Craig's own confirmation that the result is right. If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+While reviewing each task, judge whether Claude could build *and* verify it without Craig's help, and if so add =:solo:= to the heading line. Three gates, all of which must hold:
+
+1. *Buildable* — Claude has the capability and access to do the work.
+2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all — when the success criterion is only judgeable by Craig's eyes or subjective taste.
+3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+
+If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+
+The shape of a solo task: Claude builds it, verifies everything it can verify itself, and leaves a manual-testing reminder for the residual confirmation Craig does anyway. Craig running that reminder is assumed and does not flip the task to non-solo — solo is about whether Claude *can* verify, not whether Craig *also will*.
 
 =:solo:= is independent of both the action and =:quick:=. A task can be =:solo:= but slow (a bounded refactor that takes hours yet needs no input) or =:quick:= but not =:solo:= (a five-minute change that hinges on a preference call). Tag each axis on its own merits; both share the one =:tag1:tag2:= cluster. Skip the assessment on a Kill.
 
@@ -104,7 +112,7 @@ When the batch is done (or Craig calls it early):
 5. *Drifting the date format* — =:LAST_REVIEWED:= must be =YYYY-MM-DD=, or the staleness script won't read it.
 6. *Marking a kill DONE instead of CANCELLED* — DONE means finished, CANCELLED means abandoned. A task review kills tasks that shouldn't be done at all.
 7. *Guessing a =:quick:= estimate* — if the heading and body don't make the effort clear, ask Craig instead of tagging on a hunch. A mislabeled =:quick:= defeats the tag's purpose.
-8. *Over-tagging =:solo:=* — if you can't confirm all three gates (bounded scope, no preference call, locally verifiable), leave it off. A =:solo:= that actually needs Craig's input, his hardware, or his sign-off to verify defeats the tag's purpose.
+8. *Over-tagging =:solo:=* — if you can't confirm all three gates (buildable, verifiable by Claude, no upfront decision), leave it off. A =:solo:= whose only verification is Craig's eyes, or that needs his input or hardware, defeats the tag's purpose. But a task Claude builds and verifies itself, leaving Craig a manual-testing reminder for his routine spot-check, *is* solo — the reminder doesn't disqualify it.
 
 * Living Document
 
diff --git a/claude-templates/.ai/workflows/task-review.org b/claude-templates/.ai/workflows/task-review.org
index 7cc1e29..69e172d 100644
--- a/claude-templates/.ai/workflows/task-review.org
+++ b/claude-templates/.ai/workflows/task-review.org
@@ -63,7 +63,15 @@ This is orthogonal to the action chosen — a task can be kept (or re-graded, or
 
 *** Tagging =:solo:= — tasks Claude can finish end-to-end
 
-While reviewing each task, judge whether Claude could finish it without Craig's input, and if so add =:solo:= to the heading line. Three gates, all of which must hold: the scope is well-defined and bounded, there's no design or preference call that needs Craig, and the outcome is verifiable locally — no waiting on hardware Craig owns, an external service, or Craig's own confirmation that the result is right. If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+While reviewing each task, judge whether Claude could build *and* verify it without Craig's help, and if so add =:solo:= to the heading line. Three gates, all of which must hold:
+
+1. *Buildable* — Claude has the capability and access to do the work.
+2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all — when the success criterion is only judgeable by Craig's eyes or subjective taste.
+3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+
+If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+
+The shape of a solo task: Claude builds it, verifies everything it can verify itself, and leaves a manual-testing reminder for the residual confirmation Craig does anyway. Craig running that reminder is assumed and does not flip the task to non-solo — solo is about whether Claude *can* verify, not whether Craig *also will*.
 
 =:solo:= is independent of both the action and =:quick:=. A task can be =:solo:= but slow (a bounded refactor that takes hours yet needs no input) or =:quick:= but not =:solo:= (a five-minute change that hinges on a preference call). Tag each axis on its own merits; both share the one =:tag1:tag2:= cluster. Skip the assessment on a Kill.
 
@@ -104,7 +112,7 @@ When the batch is done (or Craig calls it early):
 5. *Drifting the date format* — =:LAST_REVIEWED:= must be =YYYY-MM-DD=, or the staleness script won't read it.
 6. *Marking a kill DONE instead of CANCELLED* — DONE means finished, CANCELLED means abandoned. A task review kills tasks that shouldn't be done at all.
 7. *Guessing a =:quick:= estimate* — if the heading and body don't make the effort clear, ask Craig instead of tagging on a hunch. A mislabeled =:quick:= defeats the tag's purpose.
-8. *Over-tagging =:solo:=* — if you can't confirm all three gates (bounded scope, no preference call, locally verifiable), leave it off. A =:solo:= that actually needs Craig's input, his hardware, or his sign-off to verify defeats the tag's purpose.
+8. *Over-tagging =:solo:=* — if you can't confirm all three gates (buildable, verifiable by Claude, no upfront decision), leave it off. A =:solo:= whose only verification is Craig's eyes, or that needs his input or hardware, defeats the tag's purpose. But a task Claude builds and verifies itself, leaving Craig a manual-testing reminder for his routine spot-check, *is* solo — the reminder doesn't disqualify it.
 
 * Living Document