Diffstat (limited to 'todo.org')
| -rw-r--r-- | todo.org | 54 |
1 files changed, 18 insertions, 36 deletions
@@ -758,16 +758,16 @@ Each item below is a one-line summary of a sub-TODO further down. Tick the box w
 - [X] [#B] =pairwise-tests=: clarify negative value syntax + generator availability

 **** V2MOM
-- [ ] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment)
-- [ ] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
-- [ ] [#B] =create-v2mom=: mitigation/owner fields for Obstacles
+- [X] [#A] =create-v2mom=: rename Metrics → Measures (Salesforce alignment)
+- [X] [#B] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
+- [X] [#B] =create-v2mom=: mitigation/owner fields for Obstacles

 **** Prompt engineering
-- [ ] [#A] =prompt-engineering=: correct/narrow Meincke citation
-- [ ] [#B] =prompt-engineering=: eval-harness requirement for production prompts
+- [X] [#A] =prompt-engineering=: correct/narrow Meincke citation
+- [X] [#B] =prompt-engineering=: eval-harness requirement for production prompts

 **** Codify
-- [ ] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md=
+- [X] [#B] =codify=: stale-entry review + privacy checks before writing project =CLAUDE.md=

 **** Code review
 - [X] [#A] =review-code=: resolve local-verification vs CI boundary
@@ -874,47 +874,29 @@ Added an "Escalating Beyond Pairwise (t-way)" subsection: start with pairwise ac

 Added a "~ prefix" explanation (PICT marker tagging a value as negative/invalid, not an arithmetic operator; PICT pairs negatives with valid values once and strips the marker before the SUT) and a stop-at-the-model rule: if neither the =pict= binary nor =pypict= is present, produce the model and stop rather than hand-writing a table and passing it off as PICT output.

-*** TODO [#A] =create-v2mom=: rename "Metrics" to Salesforce's "Measures" or explicitly justify the deviation
+*** 2026-05-22 Fri @ 14:43:17 -0500 Renamed Metrics → Measures throughout create-v2mom

-V2MOM's final M is officially "Measures." The skill uses "Metrics" throughout.
-Either rename the section and description to "Measures" or add a clear note
-that this fork intentionally says "Metrics" while preserving the V2MOM concept.
+Full rename across =.claude/commands/create-v2mom.md= (acronym expansions, Phase 7 heading, the "Measures must be measurable" principle, exit criteria, review questions, red flags, examples) to match Salesforce's official term. Kept the "vanity metrics" idiom intact — it's the anti-pattern term, not a section reference.

-*** TODO [#A] =create-v2mom=: prevent task migration from turning V2MOM into a backlog
+*** 2026-05-22 Fri @ 14:43:17 -0500 Split strategy from execution in create-v2mom task migration

-Salesforce presents V2MOM as a simple alignment framework. This skill's
-optional task-migration phase can make the V2MOM the entire todo system. Split
-strategy from execution: keep the V2MOM concise, and link to method-specific
-backlogs instead of embedding every task under the strategic document.
+Rewrote Phase 8 (and tightened Phase 5.5): tasks stay in the backlog grouped by method, and each method gains a one-line link to where its tasks live, instead of transplanting the task tree into the V2MOM. Strategy (V2MOM) and execution (backlog) are now explicitly separate sources of truth, keeping the V2MOM concise.

-*** TODO [#A] =create-v2mom=: add mitigation/owner fields for Obstacles
+*** 2026-05-22 Fri @ 14:43:17 -0500 Made create-v2mom obstacles operational (mitigation/owner/cadence)

-The current Obstacles phase captures barriers but not consistently how each
-will be overcome. Add "mitigation, owner, and review cadence" per obstacle so
-the section becomes operational instead of just candid.
+Phase 6 now captures, per obstacle: name, manifestation, stakes, mitigation, owner, and review cadence — with a worked example per domain (health/finance/software), a "good obstacle" characteristic, a Phase 9 review question, and a red flag for candid-but-not-operational obstacles. An obstacle without a countermove is now flagged as an observation, not a plan.

-*** TODO [#A] =prompt-engineering=: correct and narrow the Meincke citation
+*** 2026-05-22 Fri @ 14:43:17 -0500 Corrected and narrowed the Meincke citation in prompt-engineering

-The skill cites "Persuasion and Compliance in Large Language Models" but the
-paper found in research is "Call Me A Jerk: Persuading AI to Comply with
-Objectionable Requests." Revise the reference and avoid overgeneralizing the
-result: it shows persuasion can raise compliance with objectionable requests,
-which is a cautionary prompt-safety finding, not broad evidence that persuasion
-principles improve engineering prompt quality.
+Fixed the title to "Call Me A Jerk: Persuading AI to Comply with Objectionable Requests" (SSRN abstract_id=5357179) in all three spots (frontmatter, Seven Principles intro, References). Reframed the ~33%→72% result as what it is — a prompt-safety caution that persuasion raises compliance with objectionable requests — explicitly not evidence that persuasion framing improves engineering prompt quality. Kept the seven principles as a tone vocabulary.

-*** TODO [#A] =prompt-engineering=: add an evaluation harness requirement for production prompts
+*** 2026-05-22 Fri @ 14:43:17 -0500 Added an eval-harness requirement to prompt-engineering critique mode

-Prompt critique currently ends with a rewrite and checklist. Add a requirement
-for fragile or reusable prompts: create 3-5 adversarial/edge examples, run the
-old and new prompt against them, and record the observed behavioral delta.
-Without examples, prompt quality remains asserted rather than verified.
+Added critique step 7 + a checklist line: for fragile or reusable/production prompts, write 3-5 adversarial/edge inputs, run both the old and new prompt against each, and record the behavioral delta. A throwaway prompt can ship on the rewrite alone; a discipline/reused/production one can't. Without examples, "the rewrite is better" is an assertion, not a result.

-*** TODO [#A] =codify=: add stale-entry review and privacy checks before writing project =CLAUDE.md=
+*** 2026-05-22 Fri @ 14:43:17 -0500 Added mandatory stale-entry + privacy pre-write checks to codify

-The skill has good gates, but it should explicitly scan for stale entries,
-private context, and team-visible leakage before appending. Add "would this be
-safe if the project were public?" and "does this belong in private memory
-instead?" as mandatory checks, not just table background.
+Added a "Mandatory pre-write checks" block at the top of Phase 3 (Write) in =.claude/commands/codify.md=: a stale-entry scan (update/remove no-longer-true entries in place, don't append contradictions around them) and a privacy/leak check carrying both questions verbatim — "safe if the project were public?" and "belongs in private memory instead?" — routing private content to auto-memory. Gates, not background guidance.

 *** 2026-05-22 Fri @ 14:06:41 -0500 Scoped review-code's CI-trust rule to reviewing, not shipping
