1 file changed, 5 insertions, 3 deletions
diff --git a/.claude/commands/prompt-engineering.md b/.claude/commands/prompt-engineering.md
index 8e03367..9dff9a0 100644
--- a/.claude/commands/prompt-engineering.md
+++ b/.claude/commands/prompt-engineering.md
@@ -1,5 +1,5 @@
 ---
-description: Craft prompts (commands, hooks, skill descriptions, sub-agent instructions, system prompts, one-shot requests to other LLMs) that do what they're meant to and resist common failure modes. Covers four moves that determine whether a prompt holds up: classify the prompt type (discipline-enforcing / guidance / collaborative / reference) to pick the right tone and techniques; apply the persuasion framework appropriate to that type (seven principles from Meincke et al. 2025, including which to avoid — notably Liking, which breeds sycophancy); match task fragility to degrees of freedom (high/medium/low); and spend the context window like a shared resource. Also contains a brief reference for classical techniques (few-shot, chain-of-thought, system prompts, templates). Use both in design mode (asking for help writing a new prompt from scratch) and critique mode (paste a draft, get it rewritten to resist common failure modes). Do NOT use for prose editing unrelated to LLM prompts (use a writing skill), for implementing application code that uses an LLM (different scope), or for content moderation / prompt-injection defense (adjacent but separate domain).
+description: Craft prompts (commands, hooks, skill descriptions, sub-agent instructions, system prompts, one-shot requests to other LLMs) that do what they're meant to and resist common failure modes. Covers four moves that determine whether a prompt holds up: classify the prompt type (discipline-enforcing / guidance / collaborative / reference) to pick the right tone and techniques; apply the persuasion framework appropriate to that type (seven principles, including which to avoid — notably Liking, which breeds sycophancy); match task fragility to degrees of freedom (high/medium/low); and spend the context window like a shared resource. Also contains a brief reference for classical techniques (few-shot, chain-of-thought, system prompts, templates). Use both in design mode (asking for help writing a new prompt from scratch) and critique mode (paste a draft, get it rewritten to resist common failure modes). Do NOT use for prose editing unrelated to LLM prompts (use a writing skill), for implementing application code that uses an LLM (different scope), or for content moderation / prompt-injection defense (adjacent but separate domain).
 disable-model-invocation: true
 ---
 
@@ -76,7 +76,7 @@ LLMs are parahuman — they were trained on human text that's full of persuasion
 
 ### The Seven Principles
 
-Adapted from Meincke et al., *Persuasion and Compliance in Large Language Models* (2025, N≈28,000 conversations). Two-principle combinations shifted compliance rates from ~33% to ~72%.
+These are the seven Cialdini persuasion principles, framed here for prompt design. One caution before using them: Meincke et al., *Call Me A Jerk: Persuading AI to Comply with Objectionable Requests* (2025, N≈28,000 conversations) found that applying these same principles raised an LLM's compliance with *objectionable* requests from ~33% to ~72%. That is a prompt-safety finding, not evidence that persuasion framing makes engineering prompts better. The principles below are a vocabulary for matching tone to prompt type, not a lever to pull for higher compliance. Stronger framing hardens whatever behavior the prompt encodes, including the wrong one, which is why a principle mismatched to the prompt type (Liking on a collaborative prompt) backfires.
 
 - **Authority** — Non-negotiable framing. "YOU MUST", "Never", "Always", "No exceptions."
 - **Commitment** — Force explicit action. "Announce the rule you're applying before applying it." "Output your checklist with each item checked."
@@ -169,6 +169,7 @@ When handed an existing draft:
 4. **Check token economy.** Redundancy, restated instructions, unnecessary preamble.
 5. **Check for footguns** from the anti-patterns list.
 6. **Rewrite** — show before/after. Name the changes.
+7. **For fragile or reusable prompts, verify the rewrite — don't assert it.** A prompt that runs once for a throwaway task can ship on the rewrite alone. A prompt that gates discipline, gets reused, or runs in production cannot. Write 3-5 adversarial or edge-case inputs — the cases most likely to make the prompt fail: the exception the discipline rule must refuse, the ambiguous scope the trigger must catch, the input that tempts a sycophantic answer. Run both the old prompt and the new prompt against each, and record the observed behavioral delta. Without examples, "the rewrite is better" is an assertion, not a result.
 
 ## Ethics Test
 
@@ -202,6 +203,7 @@ Before declaring a prompt done:
 - [ ] Each sentence earns its tokens
 - [ ] Trigger phrases and (for reference prompts) explicit negative conditions are present
 - [ ] Sycophancy traps (for collaborative) and rationalization loopholes (for discipline) are closed
+- [ ] For fragile, reusable, or production prompts, 3-5 adversarial or edge-case inputs were run against both the old and new prompts, and the behavioral delta is recorded
 - [ ] The ethics test passes
 
 ## Output
@@ -223,6 +225,6 @@ No long explanations. The prompt itself should demonstrate the principles, not n
 
 ## References
 
-- Meincke, L., et al. (2025). *Persuasion and Compliance in Large Language Models.* N≈28,000 AI conversations; 7 persuasion principles; compliance shifts of ~2x with appropriate combinations.
+- Meincke, L., et al. (2025). *Call Me A Jerk: Persuading AI to Comply with Objectionable Requests.* SSRN abstract_id=5357179. N≈28,000 AI conversations; the seven Cialdini principles roughly doubled compliance with objectionable requests (~33% → ~72%). Read as a prompt-safety caution — persuasion framing can override an LLM's reluctance — not as a recipe for better engineering prompts.
 - Anthropic prompt engineering guidance — context window as shared resource; progressive disclosure; degrees-of-freedom framing.
 - Classical prompt engineering literature (few-shot, CoT, system prompts) — assumed background; not re-taught here.