From 234163dfed4aae1984889c03679eb62f7d8e42dc Mon Sep 17 00:00:00 2001 From: Craig Jennings Date: Sun, 24 May 2026 00:49:53 -0500 Subject: docs(verification): add a manual-verification handoff format MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When a verification gap needs the user's hands or eyes — interactive UI a script can't drive, a live service, visual rendering — describing the steps in prose isn't enough. I added a section to verification.md that says to write them as a "Manual testing and validation" task: one sub-header per test, each with a descriptive title, what we're verifying, the steps as a bullet list, and the expected result. If a test fails, the user writes the actual behavior, flips the header to a TODO, and promotes it, so a failed check becomes a tracked bug in one step. --- claude-rules/verification.md | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/claude-rules/verification.md b/claude-rules/verification.md index 673a54d..617c02c 100644 --- a/claude-rules/verification.md +++ b/claude-rules/verification.md @@ -44,6 +44,33 @@ When a check cannot run, report it in this order: Do not let an unverifiable check vanish into a confident summary. State it plainly and hand the gap to the user. +## Handing Off Manual Verification + +Some checks can only be run by the user: interactive UI a script can't drive, a live external service, visual rendering (colors, layout, faces), a real device, or anything where the verification *is* a human looking at the result. When the gap needs the user's hands or eyes — not just a command they could paste — don't bury the steps in prose. Write them as a structured, runnable checklist in the project's task file. + +Create (or append to) a single parent task named **"Manual testing and validation"** in the project's todo file (`todo.org`, or the project's equivalent). Under it, write **one org sub-header per test**: + +- **Title** — descriptive, naming the behavior under test (not "test 1"). +- **"What we're verifying:"** — one line on the intent, so a failed test explains itself. +- **Steps** — a bullet list, **one action per item**, in order. Concrete: the exact command to run, key to press, or file to open. +- **"Expected:"** — the observable result after the last step. One outcome, stated plainly, so a mismatch is unambiguous. + +Format: + +``` +** TODO Manual testing and validation +*** +What we're verifying: . +- +- +- +Expected: . +``` + +**Promote on failure.** If the actual behavior matches Expected, the test passed — the user marks or deletes it. If it differs, the user writes the actual behavior and any notes under that test, flips its header to a `TODO`, and promotes it to a top-level task. The structured shape is what makes that one-step: the failing case already carries the repro and the expected result, so it becomes a tracked bug without rewriting anything. + +Use this whenever the verification gap from "When You Cannot Verify" above is a human-in-the-loop check rather than a command the user can run blind. Write every such test you want run, not a representative sample — the user runs the checklist once and reports back. + ## Before Committing Before any commit: -- cgit v1.2.3