2 files changed, 20 insertions, 4 deletions
diff --git a/.ai/workflows/task-review.org b/.ai/workflows/task-review.org
index 7cc1e29..69e172d 100644
--- a/.ai/workflows/task-review.org
+++ b/.ai/workflows/task-review.org
@@ -63,7 +63,15 @@ This is orthogonal to the action chosen — a task can be kept (or re-graded, or
 
 *** Tagging =:solo:= — tasks Claude can finish end-to-end
 
-While reviewing each task, judge whether Claude could finish it without Craig's input, and if so add =:solo:= to the heading line. Three gates, all of which must hold: the scope is well-defined and bounded, there's no design or preference call that needs Craig, and the outcome is verifiable locally — no waiting on hardware Craig owns, an external service, or Craig's own confirmation that the result is right. If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+While reviewing each task, judge whether Claude could build *and* verify it without Craig's help, and if so add =:solo:= to the heading line. Three gates, all of which must hold:
+
+1. *Buildable* — Claude has the capability and access to do the work.
+2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all, as when the success criterion can only be judged by Craig's eyes or subjective taste.
+3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+
+If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+
+The shape of a solo task: Claude builds it, verifies everything it can verify itself, and leaves a manual-testing reminder for the residual confirmation Craig does anyway. Craig running that reminder is assumed and does not flip the task to non-solo — solo is about whether Claude *can* verify, not whether Craig *also will*.
 
 =:solo:= is independent of both the action and =:quick:=. A task can be =:solo:= but slow (a bounded refactor that takes hours yet needs no input) or =:quick:= but not =:solo:= (a five-minute change that hinges on a preference call). Tag each axis on its own merits; both share the one =:tag1:tag2:= cluster. Skip the assessment on a Kill.
 
@@ -104,7 +112,7 @@ When the batch is done (or Craig calls it early):
 5. *Drifting the date format* — =:LAST_REVIEWED:= must be =YYYY-MM-DD=, or the staleness script won't read it.
 6. *Marking a kill DONE instead of CANCELLED* — DONE means finished, CANCELLED means abandoned. A task review kills tasks that shouldn't be done at all.
 7. *Guessing a =:quick:= estimate* — if the heading and body don't make the effort clear, ask Craig instead of tagging on a hunch. A mislabeled =:quick:= defeats the tag's purpose.
-8. *Over-tagging =:solo:=* — if you can't confirm all three gates (bounded scope, no preference call, locally verifiable), leave it off. A =:solo:= that actually needs Craig's input, his hardware, or his sign-off to verify defeats the tag's purpose.
+8. *Over-tagging =:solo:=* — if you can't confirm all three gates (buildable, verifiable by Claude, no upfront decision), leave it off. A =:solo:= whose only verification is Craig's eyes, or that needs his input or hardware, defeats the tag's purpose. But a task Claude builds and verifies itself, leaving Craig a manual-testing reminder for his routine spot-check, *is* solo — the reminder doesn't disqualify it.
 
 * Living Document
 
diff --git a/claude-templates/.ai/workflows/task-review.org b/claude-templates/.ai/workflows/task-review.org
index 7cc1e29..69e172d 100644
--- a/claude-templates/.ai/workflows/task-review.org
+++ b/claude-templates/.ai/workflows/task-review.org
@@ -63,7 +63,15 @@ This is orthogonal to the action chosen — a task can be kept (or re-graded, or
 
 *** Tagging =:solo:= — tasks Claude can finish end-to-end
 
-While reviewing each task, judge whether Claude could finish it without Craig's input, and if so add =:solo:= to the heading line. Three gates, all of which must hold: the scope is well-defined and bounded, there's no design or preference call that needs Craig, and the outcome is verifiable locally — no waiting on hardware Craig owns, an external service, or Craig's own confirmation that the result is right. If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+While reviewing each task, judge whether Claude could build *and* verify it without Craig's help, and if so add =:solo:= to the heading line. Three gates, all of which must hold:
+
+1. *Buildable* — Claude has the capability and access to do the work.
+2. *Verifiable by Claude* — an objective or local check exists that Claude can run itself. Craig's routine spot-checking does not count against this, and neither does handing off a residual human-in-the-loop confirmation as a structured manual-testing reminder (the =verification.md= "Handing Off Manual Verification" pattern). The disqualifier is having no verification path of Claude's own at all, as when the success criterion can only be judged by Craig's eyes or subjective taste.
+3. *No upfront decision* — no design or preference call Craig must make before Claude can begin.
+
+If any gate is shaky, leave the tag off. Like =:quick:=, a wrong =:solo:= is worse than none — it tells Craig he can hand the task off and walk away, so a mislabeled one wastes that trust. When the heading and body don't make all three gates clear, ask Craig instead of guessing.
+
+The shape of a solo task: Claude builds it, verifies everything it can verify itself, and leaves a manual-testing reminder for the residual confirmation Craig does anyway. Craig running that reminder is assumed and does not flip the task to non-solo — solo is about whether Claude *can* verify, not whether Craig *also will*.
 
 =:solo:= is independent of both the action and =:quick:=. A task can be =:solo:= but slow (a bounded refactor that takes hours yet needs no input) or =:quick:= but not =:solo:= (a five-minute change that hinges on a preference call). Tag each axis on its own merits; both share the one =:tag1:tag2:= cluster. Skip the assessment on a Kill.
 
@@ -104,7 +112,7 @@ When the batch is done (or Craig calls it early):
 5. *Drifting the date format* — =:LAST_REVIEWED:= must be =YYYY-MM-DD=, or the staleness script won't read it.
 6. *Marking a kill DONE instead of CANCELLED* — DONE means finished, CANCELLED means abandoned. A task review kills tasks that shouldn't be done at all.
 7. *Guessing a =:quick:= estimate* — if the heading and body don't make the effort clear, ask Craig instead of tagging on a hunch. A mislabeled =:quick:= defeats the tag's purpose.
-8. *Over-tagging =:solo:=* — if you can't confirm all three gates (bounded scope, no preference call, locally verifiable), leave it off. A =:solo:= that actually needs Craig's input, his hardware, or his sign-off to verify defeats the tag's purpose.
+8. *Over-tagging =:solo:=* — if you can't confirm all three gates (buildable, verifiable by Claude, no upfront decision), leave it off. A =:solo:= whose only verification is Craig's eyes, or that needs his input or hardware, defeats the tag's purpose. But a task Claude builds and verifies itself, leaving Craig a manual-testing reminder for his routine spot-check, *is* solo — the reminder doesn't disqualify it.
 
 * Living Document