docs: add a green-baseline-before-work gate to verification and start-work

A red test suite at the start of work poisons every later "did I break this?" check: you can't tell your own regressions from pre-existing noise, and the end-of-work green bar stops being readable. verification.md now asks for a clean suite run before work begins, not only before commit. start-work runs it as Pre-work step 0.3 against the reconciled base. I added two carve-outs the original proposal lacked: a project with no suite has nothing to baseline, and a suite that can't run is the existing "When You Cannot Verify" case rather than a blocker. The step lands after the reconcile, so the baseline reflects the base the work is cut from. The Phase 4 TDD red is called out as expected, distinct from a baseline failure.
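
The four baseline outcomes described above can be sketched as a small classifier. This is only an illustration, not part of the ruleset: the names `Baseline`, `classify_baseline`, and `next_step` are hypothetical, and treating "exit code 0" as green and "command could not be launched" as the cannot-run case is a simplifying assumption. A real suite may also exit non-zero for environmental reasons (no network, sandbox limit), which would need a human call.

```python
import subprocess
import sys
from enum import Enum
from typing import Optional, Sequence


class Baseline(Enum):
    """Hypothetical labels for the four baseline outcomes."""
    GREEN = "green"
    RED = "red"
    NO_SUITE = "no suite"
    CANNOT_RUN = "cannot run"


def classify_baseline(command: Optional[Sequence[str]]) -> Baseline:
    """Run the suite command once and classify the result.

    Assumption: `command` is None or empty when the project has no suite.
    """
    if not command:
        return Baseline.NO_SUITE
    try:
        result = subprocess.run(list(command), capture_output=True, text=True)
    except OSError:
        # The runner itself could not be launched (e.g. missing dependency).
        return Baseline.CANNOT_RUN
    return Baseline.GREEN if result.returncode == 0 else Baseline.RED


def next_step(outcome: Baseline) -> str:
    """Map each outcome to the action the commit message describes."""
    return {
        Baseline.GREEN: "proceed to 0.4",
        Baseline.RED: "fix first, or file a tracked task and carry it as the only tolerated failure",
        Baseline.NO_SUITE: "note it and proceed",
        Baseline.CANNOT_RUN: "record what could not run, name the risk, and proceed",
    }[outcome]


if __name__ == "__main__":
    # Stand-in for a real suite: a Python one-liner that exits cleanly.
    outcome = classify_baseline([sys.executable, "-c", "raise SystemExit(0)"])
    print(outcome.value, "->", next_step(outcome))
```

Only the red outcome can stop the work; the two carve-outs (no suite, suite cannot run) both proceed, with the gap recorded.
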
author: Craig Jennings <c@cjennings.net> 2026-06-30 13:27:24 -0400
committer: Craig Jennings <c@cjennings.net> 2026-06-30 13:27:24 -0400
commit: d0ab04751fe437b6c9509a2ff3217cda0f624edc (patch)
tree: 501250177ea7174f38795d693acdac01a8d3351d /.claude
parent: a266250c1e24c9b2f6f246206c1b726dbb84c4bf (diff)
download: rulesets-d0ab04751fe437b6c9509a2ff3217cda0f624edc.tar.gz rulesets-d0ab04751fe437b6c9509a2ff3217cda0f624edc.zip
1 files changed, 16 insertions, 5 deletions
diff --git a/.claude/commands/start-work.md b/.claude/commands/start-work.md
index d146622..85ed6b0 100644
--- a/.claude/commands/start-work.md
+++ b/.claude/commands/start-work.md
@@ -1,12 +1,12 @@
 ---
-description: Pick up a task (Linear ticket, GitHub issue, todo.org task, or a described scope) and take it through Pre-work, Claim, Justify, Approach, Implement, Verify, and Hand-off. Three user-approval gates separate the phases. Pre-work covers eligibility, a fetch-and-reconcile against the base branch, and a source-code check that the problem still exists in the tree. The Justify gate weighs benefits, costs, impact, urgency, effort, alternatives, and ticket quality. The Approach gate covers root cause, risk, refactor prerequisites, test strategy (unit, integration, e2e, pairwise, characterization), migration and backwards-compat, feature flags, commit decomposition, and branch name. Implementation uses TDD (red, green, edge cases); a refactor audit then walks every touched file against a language-agnostic checklist, fixing each finding here or filing it as a ticket, never dropping one. A verify phase exercises the feature end-to-end locally (Playwright against localhost for web, scripted manual test otherwise) before the final gate hands off to the Review-and-Publish flow in commits.md. Use when starting work on a specific task where both "should we" and "how exactly" are worth deliberating. Do NOT use for open-ended bug investigation without a clear target (use debug first), for architectural paradigm exploration (use arch-design), for architectural decision recording (use arch-decide), when the task is trivial and obvious (just do it), or when requirements are still being shaped (use brainstorm).
+description: Pick up a task (Linear ticket, GitHub issue, todo.org task, or a described scope) and take it through Pre-work, Claim, Justify, Approach, Implement, Verify, and Hand-off. Three user-approval gates separate the phases. Pre-work covers eligibility, a fetch-and-reconcile against the base branch, a green-baseline suite run, and a source-code check that the problem still exists in the tree. The Justify gate weighs benefits, costs, impact, urgency, effort, alternatives, and ticket quality. The Approach gate covers root cause, risk, refactor prerequisites, test strategy (unit, integration, e2e, pairwise, characterization), migration and backwards-compat, feature flags, commit decomposition, and branch name. Implementation uses TDD (red, green, edge cases); a refactor audit then walks every touched file against a language-agnostic checklist, fixing each finding here or filing it as a ticket, never dropping one. A verify phase exercises the feature end-to-end locally (Playwright against localhost for web, scripted manual test otherwise) before the final gate hands off to the Review-and-Publish flow in commits.md. Use when starting work on a specific task where both "should we" and "how exactly" are worth deliberating. Do NOT use for open-ended bug investigation without a clear target (use debug first), for architectural paradigm exploration (use arch-design), for architectural decision recording (use arch-decide), when the task is trivial and obvious (just do it), or when requirements are still being shaped (use brainstorm).
 ---
 
 # /start-work: pick up a task, justify it, plan it, build it
 
 Three review gates separate the phases. The user can redirect or kill the work at each one.
 
-0. **Pre-work.** Eligibility check, fetch-and-reconcile against the base branch, source-code check that the problem still exists.
+0. **Pre-work.** Eligibility check, fetch-and-reconcile against the base branch, green-baseline suite run, source-code check that the problem still exists.
 1. **Claim.** Mark in-progress, assign, label, verify project.
 2. **Justify (gate 1).** Benefits, costs, impact, urgency, effort, alternatives, ticket quality. Stop for approval.
 3. **Approach (gate 2).** Root cause, risk, tests, migration, flag, commit decomposition. Stop for approval.
@@ -53,7 +53,7 @@ If the reference is ambiguous, ask the user to clarify before proceeding.
 
 ## Phase 0: pre-work
 
-Three checks before claiming the task. All run before any state change — no assignee added, no label written, no status moved. If any of them disqualify the task, the rollback is free.
+Four checks before claiming the task. All run before any state change — no assignee added, no label written, no status moved. If any of them disqualify the task, the rollback is free.
 
 ### 0.1 Eligibility
 
@@ -84,7 +84,18 @@ The branch this task will be cut from must reflect the remote — otherwise the
 
 4. If the current branch is *not* the base branch (e.g. left over from a prior task), surface and ask whether to switch before continuing. Don't auto-switch — the user may want to finish or stash WIP first.
 
-### 0.3 Existence check (validate the problem is real)
+### 0.3 Green baseline (confirm the tree starts known-good)
+
+Run the project's test suite now, against the reconciled base, so the baseline you build on is actually green (see the Green Baseline section in `verification.md`). This runs after 0.2 — baselining a stale tree is pointless.
+
+- **Green** — proceed to 0.4.
+- **Red** — fix the failure first, or, when it's out of scope or needs a decision, file a tracked task with the diagnosis and carry its name forward as the only tolerated failure for this work. Surface the baseline result either way so "we started from green" is on the record.
+- **No suite** — nothing to baseline. Note it and proceed (your personal/doc projects hit this).
+- **Suite can't run** (no network, missing dep, sandbox limit) — that's the "When You Cannot Verify" case in `verification.md`, not a blocker. Record what you couldn't run, name the risk, and proceed.
+
+This baseline is the green starting point; the intentional red test you write in Phase 4 (TDD) is expected and distinct from a baseline failure.
+
+### 0.4 Existence check (validate the problem is real)
 
 The ticket may describe a problem the code no longer has — fixed independently of the ticket, made obsolete by another change, or never present in the first place. Read the source to confirm the problem exists in the tree as the ticket describes, before justifying or planning the fix.
 
@@ -337,7 +348,7 @@ Follow `commits.md` exactly. Summary of the flow:
 ## Anti-patterns
 
 - **Skipping the pre-flight reconcile.** Cutting a new branch from a stale base means the whole task happens on top of yesterday's main. Conflicts surface at PR time instead of at the start; rebases later are noisier than a fetch up front.
-- **Taking the ticket's word that the problem still exists.** Tickets age. Read the source. A `git log --grep` for a fix commit is a hint, not a check — fixes ship under all kinds of commit-message wording, and the buggy behavior may be gone for reasons that never landed in a commit titled "fix." Five minutes of source-read at Phase 0.3 saves an entire Justify-and-Approach cycle on a phantom problem.
+- **Taking the ticket's word that the problem still exists.** Tickets age. Read the source. A `git log --grep` for a fix commit is a hint, not a check — fixes ship under all kinds of commit-message wording, and the buggy behavior may be gone for reasons that never landed in a commit titled "fix." Five minutes of source-read at Phase 0.4 saves an entire Justify-and-Approach cycle on a phantom problem.
 - **Skipping the Justify gate.** "This is obviously worth doing" is exactly what the gate exists to verify. If the answer really is obvious, the gate takes thirty seconds.
 - **Skipping the Approach gate.** Implementation without a plan is how scope creep happens. It is also how the user loses the chance to redirect.
 - **Marking a personal todo task DOING before Phase 2 approval.** Personal claims carry no teammate signal, so they wait until the gate clears — a killed task then needs no rollback. Team-tracker claims (Linear, GitHub) are the exception: they happen in Phase 1 to flag intent, but only after the prior state is recorded so the gate can restore it cleanly.