diff options
| -rw-r--r-- | claude-rules/verification.md | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/claude-rules/verification.md b/claude-rules/verification.md index 673a54d..617c02c 100644 --- a/claude-rules/verification.md +++ b/claude-rules/verification.md @@ -44,6 +44,33 @@ When a check cannot run, report it in this order: Do not let an unverifiable check vanish into a confident summary. State it plainly and hand the gap to the user. +## Handing Off Manual Verification + +Some checks can only be run by the user: interactive UI a script can't drive, a live external service, visual rendering (colors, layout, faces), a real device, or anything where the verification *is* a human looking at the result. When the gap needs the user's hands or eyes — not just a command they could paste — don't bury the steps in prose. Write them as a structured, runnable checklist in the project's task file. + +Create (or append to) a single parent task named **"Manual testing and validation"** in the project's todo file (`todo.org`, or the project's equivalent). Under it, write **one org sub-header per test**: + +- **Title** — descriptive, naming the behavior under test (not "test 1"). +- **"What we're verifying:"** — one line on the intent, so a failed test explains itself. +- **Steps** — a bullet list, **one action per item**, in order. Concrete: the exact command to run, key to press, or file to open. +- **"Expected:"** — the observable result after the last step. One outcome, stated plainly, so a mismatch is unambiguous. + +Format: + +``` +** TODO Manual testing and validation +*** <descriptive title> +What we're verifying: <one line>. +- <step one> +- <step two> +- <step three> +Expected: <the observable result after the last step>. +``` + +**Promote on failure.** If the actual behavior matches Expected, the test passed — the user marks or deletes it. If it differs, the user writes the actual behavior and any notes under that test, flips its header to a `TODO`, and promotes it to a top-level task. The structured shape is what makes that one-step: the failing case already carries the repro and the expected result, so it becomes a tracked bug without rewriting anything. + +Use this whenever the verification gap from "When You Cannot Verify" above is a human-in-the-loop check rather than a command the user can run blind. Write every such test you want run, not a representative sample — the user runs the checklist once and reports back. + ## Before Committing Before any commit: |
