aboutsummaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
Diffstat (limited to 'docs')
-rw-r--r--docs/design/2026-06-29-green-baseline-proposal.org72
1 files changed, 72 insertions, 0 deletions
diff --git a/docs/design/2026-06-29-green-baseline-proposal.org b/docs/design/2026-06-29-green-baseline-proposal.org
new file mode 100644
index 0000000..47de18d
--- /dev/null
+++ b/docs/design/2026-06-29-green-baseline-proposal.org
@@ -0,0 +1,72 @@
+#+TITLE: Proposal: ensure a green test run before starting work
+#+AUTHOR: Craig Jennings (via .emacs.d session)
+#+DATE: 2026-06-29
+
+* Why
+
+While running a multi-task refactor speedrun in =.emacs.d=, the very first full
+=make test= surfaced a failing test (=test-system-cmd-restart-emacs-no-service-aborts=)
+that turned out to be *pre-existing* -- it failed on clean HEAD, unrelated to
+the work. It had nothing to do with the task; it just happened to fail on this
+machine (a native-comp mock that bypasses =symbol-function= redefinition, real
+check passing because the box has =emacs.service=).
+
+Two costs landed because the red was already there when work began:
+
+1. Every later "did I break this?" suite run carried a known failure, so the
+ green bar became "only that one fails" instead of a clean pass I could read
+ at a glance. Easy to let a *new* regression hide behind the familiar red.
+2. The work assumed the tree was in a known-good state. It wasn't, and nothing
+ in the workflow forced that assumption to be checked first.
+
+The fix is cheap and general: run the suite *before* starting work, and clear
+(or explicitly triage) any failure before the work begins. A green start
+confirms we're in the known-good place we think we are, and any issue is fixed
+before it can be confused with our own changes.
+
+* Proposed change 1 -- claude-rules/verification.md
+
+Add a section (suggested placement: right after =## The Rule=, before
+=## What Fresh Means=). Proposed text, ready to paste:
+
+#+begin_example
+## Green Baseline Before Starting Work
+
+Run the test suite before you start work, not only before you finish. A clean run at the start confirms the tree is in the known-good state you assume it is, so the baseline you build on and measure your changes against is actually green.
+
+If the suite is red before you touch anything, fix or explicitly triage the failure first. A pre-existing failure left in place poisons every later "did I break this?" check: you can't separate your own regressions from the noise, and the end-of-work run stops being readable as pass/fail at a glance. Work that assumes a known-good base may also be built on a broken assumption you never saw.
+
+When a pre-existing failure genuinely can't be fixed before the work begins (out of scope, or it needs a decision), record it as a tracked task with the diagnosis and carry its name forward. The green bar for the rest of the work is then explicitly "only this known failure remains," not a silent tolerance for red.
+
+This is the start-of-work counterpart to the Before Committing gate below: one confirms the ground is solid before you build, the other confirms you didn't crack it.
+#+end_example
+
+* Proposed change 2 -- start-work skill, Pre-work phase
+
+The start-work skill already has a Pre-work phase (eligibility, fetch-and-reconcile
+against base, source-code check that the problem still exists). Add a green-baseline
+step to that phase:
+
+- Run the project's test suite before claiming the work.
+- If it's fully green, proceed.
+- If it's red, fix the failure first, or (when out of scope / needs a decision)
+ file a tracked task with the diagnosis and carry its name forward as the only
+ tolerated failure for this work.
+- Surface the baseline result so "we started from green" is on the record.
+
+This makes the verification.md principle operational at the exact moment it
+matters -- the start of a task -- the same way the Verify phase and the
+Review-and-Publish flow operationalize the end-of-work gates.
+
+* Note on why this came as a proposal, not a direct edit
+
+=.emacs.d='s =.claude/rules/*.md= are symlinks into =~/code/rulesets/claude-rules/=,
+so editing =verification.md= from the downstream session would modify the
+rulesets canonical directly. Per the cross-project rule, downstream sessions
+send rulesets the proposed change rather than editing its canonical in place.
+Hence this note instead of a local stopgap edit. The start-work skill isn't
+installed on this machine to edit anyway.
+
+The test that surfaced all this is already fixed in =.emacs.d= (commit on main:
+mock =executable-find= at the boundary instead of the helper). The durable
+process change is the part that belongs here.