#+TITLE: Voice Profile: canonical source-of-truth for the voice skill
#+DATE: 2026-05-29
#+SOURCE: rulesets session 2026-05-29

* How this combines with SKILL.md (pairing rule)

This file is the canonical source-of-truth for the voice skill's rationale, evidence, examples, and history. =voice/SKILL.md= holds the thin rule-set: one-line directives per pattern, mode applicability, and a pointer back here. Everything else (Problem, Basis, Before/After, Detection guidance, History) lives in the per-pattern sections below.

Pairing rule. Every change to =voice/SKILL.md= MUST land alongside the corresponding update in this file. The two are normatively paired. A SKILL.md edit without a profile update is incomplete. A profile update without a SKILL.md edit is fine (rationale or evidence can deepen without changing the rule).

Pattern numbering in both files matches: SKILL.md's =### N. <Name>= maps to this file's =** §N <Name>= section. Mode tags use the same vocabulary: =general=, =prose=, =personal=.

When the agent runs =/voice=, it reads SKILL.md for the rules and consults this file for the examples and basis it needs to apply each pattern correctly.
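
The numbering rule lends itself to a mechanical check. A minimal sketch, assuming SKILL.md headings look like =### N. <Name>= and profile headings are Org headlines of the form =§N <Name>= at any depth. The function name and sample strings are hypothetical, not part of the skill.

#+begin_src python
import re

# Heading shapes taken from the pairing rule; anything else is ignored.
SKILL_HEADING = re.compile(r"^### (\d+)\. (.+?)\s*$", re.MULTILINE)
PROFILE_HEADING = re.compile(r"^\*+ §(\d+) (.+?)\s*$", re.MULTILINE)


def pairing_mismatches(skill_text, profile_text):
    """Return pattern numbers whose names differ or that exist in only one file."""
    skill = {int(n): name for n, name in SKILL_HEADING.findall(skill_text)}
    profile = {int(n): name for n, name in PROFILE_HEADING.findall(profile_text)}
    return sorted(n for n in skill.keys() | profile.keys()
                  if skill.get(n) != profile.get(n))


# Hypothetical excerpts from the two files.
skill_md = "### 13. Em Dash Overuse\n### 17. Emojis\n"
profile_org = "** §13 Em Dash Overuse\n** §16 Title Case in Headings\n** §17 Emojis\n"
print(pairing_mismatches(skill_md, profile_org))  # [16]
#+end_src

An empty list means the two files agree on numbers and names. It says nothing about whether the rule text and the rationale still match, which stays a manual review.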

* Corpus

** Phase 1 (2026-05-29): git commit bodies

Git commit bodies authored by Craig Jennings across all repos under =~/code/= and =~/projects/=. After cleanup (subject lines, trailers, URL-only lines, and AI-attribution lines removed; blank runs collapsed):

- 5355 raw commits, 1895 with non-trivial bodies
- 128608 words, 912400 characters
- 33 repos contributing; top sources: archsetup (703), rulesets (621), work (565), archangel (455), home (395)

One register (deliberate technical prose). The view is useful but narrow on its own.
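
The cleanup pass can be sketched as a line filter. This is an illustration only. The trailer keys, the AI-attribution marker, and the function name are assumptions, since the actual scripts are not recorded here.

#+begin_src python
import re

# Hypothetical patterns; the real cleanup rules may differ.
TRAILER = re.compile(r"^(Signed-off-by|Co-authored-by|Reviewed-by):", re.IGNORECASE)
URL_ONLY = re.compile(r"^\s*https?://\S+\s*$")
AI_ATTRIBUTION = re.compile(r"generated (with|by)", re.IGNORECASE)


def clean_commit_body(message):
    """Drop the subject line, trailers, URL-only lines, and AI-attribution
    lines, then collapse runs of blank lines into one."""
    lines = message.splitlines()[1:]  # the first line is the subject
    kept = []
    for line in lines:
        if TRAILER.match(line) or URL_ONLY.match(line) or AI_ATTRIBUTION.search(line):
            continue
        if not line.strip():
            if kept and kept[-1] == "":
                continue  # collapse blank runs
            kept.append("")
        else:
            kept.append(line.rstrip())
    return "\n".join(kept).strip()


msg = ("Fix loader\n\nThe loader skipped empty files.\n\n\n"
       "https://example.com/issue\n\nSigned-off-by: A Dev <a@example.com>\n")
print(clean_commit_body(msg))  # The loader skipped empty files.
#+end_src

The attribution pattern here is deliberately loose and would also drop a legitimate line such as "file generated by make". A real pass would need a tighter marker.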

** Phase 2 (2026-05-29): email + GitHub PR bodies + PR review comments

Four sub-corpora added so the rules can be tested across registers.

- *Personal email* (gmail + cmail, sent-only, body ≥50 words after cleanup): 1139 messages, 283092 words.
- *Work email* (dmail, same filter): 22 messages, 3910 words. Small sample.
- *PR descriptions* (github.com, author cjennings, body ≥100 chars after cleanup): 9 PRs, 1613 words. Small sample.
- *PR review comments* (github.com, author cjennings, ≥20 words): 3 comments, 256 words. Tiny sample. Public GHE work isn't in this index.

Signatures, quoted replies, and forwarded blocks stripped before analysis. Stats streamed; no corpus files written to disk.

** Cross-register findings (the key result of Phase 2)

The most important Phase 2 result is that *register splits matter*. Phase 1's signal from commit prose does not generalize cleanly to conversational prose.

| Metric (per 1000 words)  | Commits | Personal email | Work email | PR bodies | PR comments |
|--------------------------+---------+----------------+------------+-----------+-------------|
| Em-dash                  |    3.49 |           0.28 |       2.05 |      0.62 |        0.00 |
| Semicolon                |    3.16 |           0.64 |       0.26 |      0.62 |        0.00 |
| Contractions             |    3.57 |          38.52 |      28.13 |     17.36 |       50.78 |
| Standalone "I"           |    3.85 |          36.91 |      23.79 |      8.68 |       42.97 |
| "we"                     |    0.22 |           8.18 |      14.83 |      1.24 |        0.00 |
| "I'm"                    |    0.07 |           6.04 |       3.58 |      1.24 |        7.81 |

Three observations:

1. Em-dashes and semicolons are concentrated in commit prose, not conversational prose. The personal-mode rules on those (§13 and §33) hold up under Phase 2, but the basis shifts: the rules mostly enforce what is already true for email and PR comments. Commit prose is the outlier register that needs the rule, not the universal pattern.
2. Contractions invert. Commits suppress contractions; personal email and PR-review prose use them heavily (about 39 and 51 per 1000), with work email at 28. The Phase 1 contraction rule (§34) is strongly confirmed in the registers where contractions are most expected.
3. The Phase 1 curiosity (I'm/I'll surprisingly rare relative to standalone "I") was a register effect, not a personal preference. In personal email, "I'm" runs 6.04 per 1000 vs standalone "I" at 36.91, a ratio close to natural English. Commit prose is the outlier where "I am" beats "I'm".

AI-writing tells stay near zero across all five corpora. "leverage" surfaces 18 times in personal email (0.064 per 1000), a small count but the largest watch-list hit outside commits. All other watch-words clock 0 to 4 per corpus.
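
The rates in the table are plain counts normalized to 1000 words. A minimal sketch that reproduces three of the figures from counts and corpus sizes stated in this file. The helper name is hypothetical.

#+begin_src python
def per_thousand(count, total_words):
    """Occurrences per 1000 words, rounded to two decimals as in the table."""
    return round(count / total_words * 1000, 2)


COMMIT_WORDS = 128608          # Phase 1 commit corpus
PERSONAL_EMAIL_WORDS = 283092  # Phase 2 personal email corpus

print(per_thousand(459, COMMIT_WORDS))           # contractions in commits: 3.57
print(per_thousand(495, COMMIT_WORDS))           # standalone "I" in commits: 3.85
print(per_thousand(1710, PERSONAL_EMAIL_WORDS))  # "I'm" in personal email: 6.04
#+end_src

Normalizing this way is what makes the small corpora comparable to the large ones, though a rate from 256 words of PR comments moves a lot with a single extra hit.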

* Findings against the 41 SKILL.md patterns

** Strongly confirmed by the corpus

*Pattern 17 (no emojis).* Zero emojis in corpus. Confirmed.

*Pattern 7 (AI vocabulary).* "delve" 0. "embark" 0. "navigate the" 0. "in the realm of" 0. "seamless" 0. "moreover" 0. "furthermore" 0. "in conclusion" 0. "additionally" 1. "robust" 1. "leverage" 1. Rule confirmed for 11 of 12 watch-words. (One exception below.)

*Pattern 22 (filler).* "moreover" / "furthermore" / "additionally" / "in conclusion": all zero or one occurrence. Filler-phrase avoidance confirmed.

*Pattern 32 (first-person rewrite).* Standalone "I" at 3.85 per 1000 words. Craig writes first-person heavily. This is real, not aspirational.

*Pattern 34 (contractions).* 459 contractions total (3.57 per 1000). Top hits: =doesn't= (92), =don't= (59), =isn't= (46), =it's= (43), =can't= (40), =that's= (34). Rule confirmed.

*Pattern 38 (terse cut).* 41.1% of paragraphs are single-sentence. Craig writes terse. Paragraph breaks land after one complete thought even when short. Confirmed indirectly via paragraph structure.

** Aspirational (corpus contradicts, but the rule is intentional self-discipline)

*Pattern 13 (em-dash zero-tolerance, personal mode).* Corpus rate: 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Craig USES em-dashes regularly in commit bodies. The rule overrides his habit. It doesn't reflect it. Suggested rewording: drop the "LLMs use em dashes more than humans" framing and keep the zero-tolerance directive, but the rationale becomes "Craig's published voice (commit messages going forward, PR bodies, emails) drops em-dashes by choice because it reads cleaner and avoids a common AI tell, regardless of his pre-rule habit." Honest about the source.

*Pattern 33 (semicolons → period/comma).* Corpus rate: 3.16 semicolons per 1000. Craig uses semicolons regularly. Same shape as #13: rule is self-discipline, not habit-reflection. Suggested rewording: acknowledge the rule overrides habit rather than implying it codifies one.

These two rules are still valuable. Em-dashes and semicolons both read cleaner when absent from short imperative-leaning prose. But SKILL.md should say "this is a rule I've decided to follow," not "this is how I already write."

** Worth challenging

*Pattern 7 watch-word "comprehensive".* 42 occurrences in corpus (~0.33 per 1000). All other AI-tell watch-words clock near zero. "comprehensive" appears to be genuine vocabulary for Craig in technical contexts ("comprehensive test coverage", "comprehensive audit"). Suggested change: pull "comprehensive" out of the watch-list, or carve out a "watch in clusters, not solo" note that flags only when "comprehensive" co-occurs with other AI-tell words.
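
The "watch in clusters, not solo" option could look like the sketch below. The companion list is drawn from watch-words named in this file, and the choice of a sentence as the co-occurrence window is an assumption, not settled rule text.

#+begin_src python
import re

# Hypothetical companion list; adjust to the actual watch-list.
COMPANIONS = {"delve", "leverage", "robust", "seamless", "moreover", "furthermore"}


def flag_comprehensive(text):
    """Return sentences where "comprehensive" co-occurs with another watch-word."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if "comprehensive" in words and words & COMPANIONS:
            flagged.append(sentence)
    return flagged


sample = ("Added comprehensive test coverage for the parser. "
          "Moreover, this comprehensive and robust suite is seamless.")
print(flag_comprehensive(sample))
#+end_src

Only the second sentence is returned. The first uses "comprehensive" alone, which this variant of the rule treats as ordinary vocabulary.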

** Worth adding (corpus surfaces traits the rules don't capture)

*Single-sentence paragraph cadence.* 41.1% of paragraphs are exactly one sentence. This is distinctive. Most prose-style guides advise multi-sentence paragraphs. Suggested addition (prose + personal): a positive pattern noting "a one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short." Anti-rule against "merge short paragraphs into multi-sentence ones."

*Parenthetical density.* 23.07 opening parens per 1000 words. Heavy parenthetical use covers asides, clarifications, and scope-narrowing. Currently no rule addresses this either way. Could add a positive pattern: "parentheses for asides are part of the voice. Don't strip them in a 'clean prose' pass."

*Question-mark rarity.* 0.33 per 1000. Craig's prose is declarative. He states things, rarely asks them. Worth noting as a register marker (when /voice personal output has questions, double-check whether they're contextual or AI rhetoric).
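
The three measurements in this subsection can be approximated in a few lines. This is a rough sketch. The sentence splitter is a naive punctuation heuristic (an assumption, not the method behind the figures above) and will miscount abbreviations.

#+begin_src python
import re


def split_sentences(paragraph):
    """Naive split on ., !, or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def voice_markers(text):
    """Paragraph cadence, parenthetical density, and question-mark rate.
    Expects non-empty text."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    words = len(text.split())
    single = sum(1 for p in paragraphs if len(split_sentences(p)) == 1)
    return {
        "single_sentence_paragraph_pct": round(100 * single / len(paragraphs), 1),
        "open_parens_per_1000": round(1000 * text.count("(") / words, 2),
        "question_marks_per_1000": round(1000 * text.count("?") / words, 2),
    }


sample = "All eight pass.\n\nThe loader (the old one) is gone. Tests cover the new path."
print(voice_markers(sample))
#+end_src

On the two-paragraph sample this reports 50.0% single-sentence paragraphs, 66.67 opening parens per 1000 words, and 0.0 question marks. Numbers from a sample this small only show the mechanics.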

** Out of corpus (commits don't test these, Phase 2 needed)

- *Pattern 13 in long-form prose.* Commit bodies are short. Email and PR bodies may show different em-dash rates.
- *Pattern 14 (boldface).* Org-mode bold uses =*word*=, not detectable by simple grep. Markdown bold rare in commits.
- *Pattern 16 (title case in headings).* Commits don't carry headings.
- *Pattern 19 (collaborative artifacts).* Not present in commit bodies.
- *Pattern 35 (sentence split on conjunctions).* Average sentence is 18.81 words, median 14, with 28% of sentences 21+ words. Long-sentence rate is moderate. Need to inspect actual sentences to know if they're conjunction-stitched. Defer.
- *Pattern 36 (felt-experience cut).* Commit bodies wouldn't carry felt-experience prose. Email + journal corpus needed.
- *Pattern 37 (sentence fragments).* 9.7% of sentences are 1-5 words. Some are legitimate ("All eight pass."), some may be fragments. Can't tell from word-count alone. Defer to a pass that does syntactic detection.
- *Pattern 39 (public-artifact scope).* The corpus IS the public artifacts. The check is circular. Defer.
- *Pattern 40 (praise vs correction asymmetry).* Not detectable in commit bodies. Email or PR-review corpus needed.
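
The sentence-length figures cited for patterns 35 and 37 come down to word counts per sentence. A minimal sketch under the same naive-splitter assumption. It buckets by length only, so it cannot separate a fragment from a terse complete sentence.

#+begin_src python
import re
from statistics import mean, median


def sentence_length_profile(text):
    """Word-count statistics per sentence, using a naive punctuation split."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    total = len(lengths)
    return {
        "mean": round(mean(lengths), 2),
        "median": median(lengths),
        "pct_21_plus": round(100 * sum(n >= 21 for n in lengths) / total, 1),
        "pct_1_to_5": round(100 * sum(n <= 5 for n in lengths) / total, 1),
    }


sample = "All eight pass. The loader now skips empty files and logs a warning. Done."
print(sentence_length_profile(sample))
#+end_src

Here "All eight pass." and "Done." both land in the 1-to-5 bucket, although only one of them has a subject and verb. That gap is why the syntactic pass is deferred, not replaced by this count.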

** Curiosities (resolved by Phase 2)

- *=I'm=* (9 occurrences) and *=I'll=* (2 occurrences) were surprisingly rare in Phase 1 relative to standalone =I= (495 occurrences). Phase 2 resolved this. Personal email shows I'm at 1710 occurrences (6.04 per 1000), I'll at 865 (3.06), I've at 458 (1.62), I'd at 384 (1.36). The Phase 1 rarity was a register effect, not a personal preference. Commit prose uniquely suppresses contractions; conversational prose runs them at a near-natural English rate.

* Suggested deltas

*All six deltas landed 2026-06-10* via the voice-skill revision from the work-project session: 1 and 2 are in the §13/§33 rule lines and entries, 3 is the §7 soft-flag, 4-6 became patterns §43-§45. The list is kept as the record of what was proposed on 2026-05-29.

Six concrete edits to =voice/SKILL.md=, all of which can land independently:

1. *#13 (Em-Dash).* Drop the "LLMs use em dashes more than humans" framing in the personal-mode section. Restate the zero-tolerance rule as self-discipline ("Craig's published voice drops em-dashes by choice"), not habit-reflection. Cite: corpus rate 3.49/1000, AI-comparable.

2. *#33 (Semicolons).* Same shape. Restate as self-discipline. Cite: corpus rate 3.16/1000.

3. *#7 (AI Vocabulary).* Remove "comprehensive" from the watch-list, OR add a note that "comprehensive" alone is acceptable; flag only when it co-occurs with =delve= / =leverage= / =robust= / =seamless= / =moreover= etc. Cite: 42 occurrences, all other watch-words at 0 or 1.

4. *NEW pattern (prose + personal): "Single-sentence paragraph cadence is a feature."* 41.1% of corpus paragraphs are exactly one sentence. A one-sentence paragraph is a finished thought, not a fragment. The voice pass should not merge short paragraphs into multi-sentence ones.

5. *NEW pattern (prose + personal): "Parentheses for asides are part of the voice."* 23 opening parens per 1000 words. Heavy parenthetical use is distinctive. Don't strip parenthetical asides in a "clean prose" pass.

6. *Register marker (advisory, not a rewrite rule): "Declarative is the default."* 0.33 question marks per 1000. /voice personal output that contains rhetorical questions should be checked. They're often AI rhetoric, not Craig's register.

* What Phase 2 would add

- Email corpus (gmail + cmail, sent-only, long-form): different register, especially long-form prose flow.
- PR bodies and review comments: longer prose, deliberate register, includes the praise/correction asymmetry test ground.
- Slack messages: casual register, contraction rate, sentence-fragment rate.
- Syntactic detection: distinguish fragments from terse complete sentences for pattern #37.
- Long-form documents (résumé, proposals if any): single register but high prose density.

* Per-pattern entries

All patterns are entered below per the pairing rule above, §1 through §45, matching SKILL.md's numbering.

** §1 Undue Emphasis on Significance, Legacy, and Broader Trends

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Strip statements that puff up importance by claiming an arbitrary aspect represents or contributes to a broader trend. Watch for phrases like "stands as", "serves as", "testament to", "vital role", "pivotal moment", "evolving landscape", "marks a shift", "reflects broader", "setting the stage for", "indelible mark", "deeply rooted". Replace with a concrete fact or cut the sentence entirely.

*** Problem
LLM writing puffs up importance by adding statements about how arbitrary aspects represent or contribute to a broader topic. Watch-list words: stands/serves as, is a testament/reminder, a vital/significant/crucial/pivotal/key role/moment, underscores/highlights its importance/significance, reflects broader, symbolizing its ongoing/enduring/lasting, contributing to the, setting the stage for, marking/shaping the, represents/marks a shift, key turning point, evolving landscape, focal point, indelible mark, deeply rooted.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The Statistical Institute of Catalonia was officially established in 1989, marking a pivotal moment in the evolution of regional statistics in Spain. This initiative was part of a broader movement across Spain to decentralize administrative functions and enhance regional governance.
#+end_example

*** After
#+begin_example
The Statistical Institute of Catalonia was established in 1989 to collect and publish regional statistics independently from Spain's national statistics office.
#+end_example

*** History
- Original SKILL.md entry: significance and broader-trend puffery, with watch-list phrases drawn from Wikipedia's AI-writing guide.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §2 Undue Emphasis on Notability and Media Coverage

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Cut notability claims that list sources without giving the substance. Replace "cited in The New York Times, BBC, Financial Times" with the actual argument made in one of them.

*** Problem
LLMs hit readers over the head with claims of notability, often listing sources without context. Watch-list words: independent coverage, local/regional/national media outlets, written by a leading expert, active social media presence.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Her views have been cited in The New York Times, BBC, Financial Times, and The Hindu. She maintains an active social media presence with over 500,000 followers.
#+end_example

*** After
#+begin_example
In a 2024 New York Times interview, she argued that AI regulation should focus on outcomes rather than methods.
#+end_example

*** History
- Original SKILL.md entry: notability inflation through bare source lists.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §3 Superficial Analyses with -ing Endings

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Cut tacked-on present-participle phrases (highlighting, underscoring, emphasizing, ensuring, reflecting, symbolizing, contributing to, cultivating, fostering, encompassing, showcasing) that add fake depth without new information.

*** Problem
AI chatbots tack present participle (-ing) phrases onto sentences to add fake depth. Watch-list words: highlighting, underscoring, emphasizing, ensuring, reflecting, symbolizing, contributing to, cultivating, fostering, encompassing, showcasing.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The temple's color palette of blue, green, and gold resonates with the region's natural beauty, symbolizing Texas bluebonnets, the Gulf of Mexico, and the diverse Texan landscapes, reflecting the community's deep connection to the land.
#+end_example

*** After
#+begin_example
The temple uses blue, green, and gold colors. The architect said these were chosen to reference local bluebonnets and the Gulf coast.
#+end_example

*** History
- Original SKILL.md entry: -ing phrase tacking for fake analytical depth.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §4 Promotional and Advertisement-like Language

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Remove travel-brochure adjectives (vibrant, breathtaking, nestled, stunning, renowned, must-visit, profound, figurative "rich") and replace promotional framing with concrete facts about the subject.

*** Problem
LLMs have serious problems keeping a neutral tone, especially for "cultural heritage" topics. Watch-list words: boasts a, vibrant, rich (figurative), profound, enhancing its, showcasing, exemplifies, commitment to, natural beauty, nestled, in the heart of, groundbreaking (figurative), renowned, breathtaking, must-visit, stunning.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Nestled within the breathtaking region of Gonder in Ethiopia, Alamata Raya Kobo stands as a vibrant town with a rich cultural heritage and stunning natural beauty.
#+end_example

*** After
#+begin_example
Alamata Raya Kobo is a town in the Gonder region of Ethiopia, known for its weekly market and 18th-century church.
#+end_example

*** History
- Original SKILL.md entry: travel-brochure adjective patterns.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §5 Vague Attributions and Weasel Words

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Replace vague attributions (experts say, observers have cited, industry reports, some critics argue, several sources) with a named source plus the specific claim made.

*** Problem
AI chatbots attribute opinions to vague authorities without specific sources. Watch-list words: Industry reports, Observers have cited, Experts argue, Some critics argue, several sources or publications (when few cited).

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Due to its unique characteristics, the Haolai River is of interest to researchers and conservationists. Experts believe it plays a crucial role in the regional ecosystem.
#+end_example

*** After
#+begin_example
The Haolai River supports several endemic fish species, according to a 2019 survey by the Chinese Academy of Sciences.
#+end_example

*** History
- Original SKILL.md entry: vague-authority attribution patterns.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §6 Outline-like "Challenges and Future Prospects" Sections

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Delete formulaic "Despite its prosperity, X faces challenges" wrap-ups and "Future Outlook" boilerplate. Replace with the actual events that happened or omit the section.

*** Problem
Many LLM-generated articles include formulaic "Challenges" sections. Watch-list words: "Despite its... faces several challenges...", "Despite these challenges", "Challenges and Legacy", "Future Outlook".

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Despite its industrial prosperity, Korattur faces challenges typical of urban areas, including traffic congestion and water scarcity. Despite these challenges, with its strategic location and ongoing initiatives, Korattur continues to thrive as an integral part of Chennai's growth.
#+end_example

*** After
#+begin_example
Traffic congestion increased after 2015 when three new IT parks opened. The municipal corporation began a stormwater drainage project in 2022 to address recurring floods.
#+end_example

*** History
- Original SKILL.md entry: outline-template "Challenges and Future Prospects" sections.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §7 Overused "AI Vocabulary" Words

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Flag and rewrite around the high-frequency AI vocabulary list. Watch-list words: Additionally, align with, comprehensive, crucial, delve, emphasizing, enduring, enhance, fostering, garner, highlight (verb), interplay, intricate or intricacies, key (adjective), landscape (abstract noun), pivotal, showcase, tapestry (abstract noun), testament, underscore (verb), valuable, vibrant. "comprehensive" is a soft flag because the corpus shows it as genuine Craig vocabulary he chooses to use sparingly. Suggest an alternative ("full", "complete", "thorough", or rewording to drop the adjective) and let Craig decide per instance.

*** Problem
These words appear far more frequently in post-2023 text. They often co-occur.

*** Basis
Corpus-measured across registers (2026-05-29). Phase 1 git commits: "comprehensive" 42 occurrences, every other watch-word 0 or 1. Phase 2 conversational and PR corpora: "comprehensive" 1 in personal email, 0 in work email, PR descriptions, and PR review comments. "leverage" 18 in personal email, 0 to 1 elsewhere. Every other watch-word stays at 0 to 4 across all five corpora.

Two takeaways. First, "comprehensive" is concentrated in commit prose (technical-doc register: "comprehensive test coverage", "comprehensive audit") and almost absent from conversational prose. Craig has chosen to keep it on the watch-list because he is consciously trying to use it sparingly. Second, "leverage" earns a soft watch in personal email even though the rest of the list stays clean. The two together suggest the rule should flag-and-suggest individual hits in technical prose without treating any single watch-word as automatic disqualification.

*** Before
#+begin_example
Additionally, a distinctive feature of Somali cuisine is the incorporation of camel meat. An enduring testament to Italian colonial influence is the widespread adoption of pasta in the local culinary landscape, showcasing how these dishes have integrated into the traditional diet.
#+end_example

*** After
#+begin_example
Somali cuisine also includes camel meat, which is considered a delicacy. Pasta dishes, introduced during Italian colonization, remain common, especially in the south.
#+end_example

*** History
- Original SKILL.md entry: high-frequency post-2023 AI vocabulary list.
- 2026-05-29 (commit =c3cf9a5=): note on "comprehensive" added with corpus measurement and soft-flag guidance.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §8 Avoidance of "is"/"are" (Copula Avoidance)

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Replace elaborate copula substitutes (serves as, stands as, marks, represents, boasts, features, offers) with plain "is" or "has".

*** Problem
LLMs substitute elaborate constructions for simple copulas. Watch-list words: serves as, stands as, marks, represents (a), boasts, features, offers (a).

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Gallery 825 serves as LAAA's exhibition space for contemporary art. The gallery features four separate spaces and boasts over 3,000 square feet.
#+end_example

*** After
#+begin_example
Gallery 825 is LAAA's exhibition space for contemporary art. The gallery has four rooms totaling 3,000 square feet.
#+end_example

*** History
- Original SKILL.md entry: copula avoidance patterns.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §9 Negative Parallelisms

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Rewrite "not only X but Y" and "it's not just about X, it's Y" constructions as a single direct claim.

*** Problem
Constructions like "Not only...but..." or "It's not just about..., it's..." are overused as a way to claim depth.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
It's not just about the beat riding under the vocals; it's part of the aggression and atmosphere. It's not merely a song, it's a statement.
#+end_example

*** After
#+begin_example
The heavy beat adds to the aggressive tone.
#+end_example

*** History
- Original SKILL.md entry: negative-parallelism stock phrasing.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §10 Rule of Three Overuse

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Break the reflexive three-item list pattern when the third item is filler. Collapse to one or two specific items.

*** Problem
LLMs force ideas into groups of three to appear comprehensive. The third item is usually filler chosen to fit the cadence, not because it adds substance.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The event features keynote sessions, panel discussions, and networking opportunities. Attendees can expect innovation, inspiration, and industry insights.
#+end_example

*** After
#+begin_example
The event includes talks and panels. There's also time for informal networking between sessions.
#+end_example

*** History
- Original SKILL.md entry: rule-of-three cadence overuse.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §11 Elegant Variation (Synonym Cycling)

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Stop cycling synonyms for the same referent across consecutive sentences. Repeat the noun, or merge the sentences.

*** Problem
LLM decoding commonly applies a repetition penalty, which can push the model toward excessive synonym substitution. The protagonist becomes the main character becomes the central figure becomes the hero, all referring to the same person.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The protagonist faces many challenges. The main character must overcome obstacles. The central figure eventually triumphs. The hero returns home.
#+end_example

*** After
#+begin_example
The protagonist faces many challenges but eventually triumphs and returns home.
#+end_example

*** History
- Original SKILL.md entry: elegant-variation synonym cycling driven by repetition penalty.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §12 False Ranges

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Rewrite "from X to Y" constructions where X and Y are not on the same scale. List the items plainly instead.

*** Problem
LLMs use "from X to Y" constructions where X and Y are not on a meaningful scale ("from the Big Bang to dark matter") to imply comprehensive sweep.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Our journey through the universe has taken us from the singularity of the Big Bang to the grand cosmic web, from the birth and death of stars to the enigmatic dance of dark matter.
#+end_example

*** After
#+begin_example
The book covers the Big Bang, star formation, and current theories about dark matter.
#+end_example

*** History
- Original SKILL.md entry: false-range "from X to Y" constructions.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §13 Em Dash Overuse

*** Modes
General mode: overuse-reduction.
Prose + personal modes: zero-tolerance.

*** Rule
Replace em-dashes (=—=) with a comma, period, colon, or parentheses, whichever fits. Zero-tolerance in prose and personal modes holds *everywhere in the text*, including inside example blocks, code-fence prose, and quoted material. An em-dash in a quoted line still gets replaced.

*** Problem
Craig's published voice drops em-dashes by choice. Short imperative-leaning prose reads cleaner without them, and their overuse is a common AI tell (LLMs use em dashes more than the median human writer, mimicking "punchy" sales writing). The rule is chosen self-discipline, not a reflection of his pre-rule habit. The corpus shows he used them regularly in commit bodies.

*** Basis
Phase 1 corpus (git commits, 128k words): 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Phase 2 corpus reveals a sharp register split: personal email 0.28 per 1000, work email 2.05, PR descriptions 0.62, PR review comments 0.00. Em-dashes are concentrated in commit prose, almost absent from email and PR review prose. The zero-tolerance rule in prose and personal modes mostly enforces what is already true for non-commit registers. The rule still earns its place because commit prose is the high-volume register where the AI-tell em-dash habit shows up. Self-discipline, not habit-reflection, for the commit register specifically.

*** Before
#+begin_example
The term is primarily promoted by Dutch institutions—not by the people themselves. You don't say "Netherlands, Europe" as an address—yet this mislabeling continues—even in official documents.
#+end_example

*** After
#+begin_example
The term is primarily promoted by Dutch institutions, not by the people themselves. You don't say "Netherlands, Europe" as an address, yet this mislabeling continues in official documents.
#+end_example
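
Detection is mechanical even though the replacement is a judgment call between comma, period, colon, and parentheses. A minimal sketch that reports where em-dashes occur so each can be rewritten by hand. The function name is hypothetical.

#+begin_src python
EM_DASH = "\u2014"  # written as an escape so this block itself stays dash-free


def find_em_dashes(text):
    """Return (line_number, column) pairs, both 1-based, for every em-dash."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, char in enumerate(line, start=1):
            if char == EM_DASH:
                hits.append((line_no, col))
    return hits


sample = "Promoted by institutions\u2014not by people.\nNo dash here."
print(find_em_dashes(sample))  # [(1, 25)]
#+end_src

The scan does not skip example blocks or quoted lines, which matches the zero-tolerance scope in prose and personal modes. A general-mode pass would need a threshold instead of a flat list.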

*** History
- Original SKILL.md entry: rule scoped to general overuse-reduction.
- 2026-05-26 (commit =4fac2a0=): prose mode added, rule strengthened to zero-tolerance in prose and personal.
- 2026-05-29 (commit =c3cf9a5=): Note on basis added with corpus measurement.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.
- 2026-06-10: the self-discipline reframing (a "Suggested deltas" item from 2026-05-29, never applied) moved from the findings section into the entry proper and into the SKILL.md rule line. Craig's call, from the work-project session.

** §14 Overuse of Boldface

*** Modes
General mode only. Prose and personal inherit it. Pattern §41 is the related Craig-voice rule covering emphasis-by-formatting in his authored prose.

*** Rule
Strip mechanical boldface used to call out terms, acronyms, or phrases in running prose. Bold survives only for structural emphasis the document genuinely needs.

*** Problem
AI chatbots emphasize phrases in boldface mechanically. Acronyms, names, and key terms get wrapped in bold even when the surrounding sentence already gives them stress.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
It blends **OKRs (Objectives and Key Results)**, **KPIs (Key Performance Indicators)**, and visual strategy tools such as the **Business Model Canvas (BMC)** and **Balanced Scorecard (BSC)**.
#+end_example

*** After
#+begin_example
It blends OKRs, KPIs, and visual strategy tools like the Business Model Canvas and Balanced Scorecard.
#+end_example

*** History
- Original SKILL.md entry: mechanical boldface around terms and acronyms.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §15 Inline-Header Vertical Lists

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Collapse bullet lists whose items start with a bold header plus colon into running prose, unless the list structure is genuinely the right shape.

*** Problem
AI outputs lists where items start with bolded headers followed by colons, often when a paragraph would carry the same content more naturally.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
- **User Experience:** The user experience has been significantly improved with a new interface.
- **Performance:** Performance has been enhanced through optimized algorithms.
- **Security:** Security has been strengthened with end-to-end encryption.
#+end_example

*** After
#+begin_example
The update improves the interface, speeds up load times through optimized algorithms, and adds end-to-end encryption.
#+end_example

*** History
- Original SKILL.md entry: inline-header vertical list pattern.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §16 Title Case in Headings

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Convert reflexively title-cased headings to sentence case. Sentence case is the default unless the project's house style is title case.

*** Problem
AI chatbots capitalize all main words in headings even when the surrounding document uses sentence case.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
## Strategic Negotiations And Global Partnerships
#+end_example

*** After
#+begin_example
## Strategic negotiations and global partnerships
#+end_example

*** History
- Original SKILL.md entry: reflexive title-case in headings.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §17 Emojis

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Remove decorative emojis from headings, bullets, and prose unless the document is a register where emoji is genuinely intended.

*** Problem
AI chatbots often decorate headings or bullet points with emojis to add visual structure that the prose itself does not need.

*** Basis
Corpus-measured: 2026-05-29 commit corpus shows zero emojis. The rule reflects established practice.

*** Before
#+begin_example
🚀 **Launch Phase:** The product launches in Q3
💡 **Key Insight:** Users prefer simplicity
✅ **Next Steps:** Schedule follow-up meeting
#+end_example

*** After
#+begin_example
The product launches in Q3. User research showed a preference for simplicity. Next step: schedule a follow-up meeting.
#+end_example

*** History
- Original SKILL.md entry: decorative emoji in headings and bullets.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §18 Curly Quotation Marks

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Convert curly quotation marks to straight ASCII quotes.
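
A minimal sketch of the conversion, assuming the four common curly characters (U+201C, U+201D, U+2018, U+2019) are the only ones in play. The helper name =straighten_quotes= is hypothetical and not part of the skill.

```python
# Map the four common curly quote characters to their ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark, also used as an apostrophe
})


def straighten_quotes(text):
    """Return text with curly quotes replaced by straight ASCII quotes."""
    return text.translate(CURLY_TO_STRAIGHT)


before = "He said \u201cthe project is on track\u201d but others disagreed."
print(straighten_quotes(before))
```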

*** Problem
ChatGPT uses curly quotes instead of straight quotes, which is a recognizable tell in technical and source-controlled writing.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
He said “the project is on track” but others disagreed.
#+end_example

*** After
#+begin_example
He said "the project is on track" but others disagreed.
#+end_example

*** History
- Original SKILL.md entry: curly-quote substitution.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §19 Collaborative Communication Artifacts

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Strip chatbot correspondence framing ("I hope this helps", "Let me know if...", "Here is an overview of...", "Certainly!", "Of course!", "Would you like...") that leaked into the body.

*** Problem
Text meant as chatbot correspondence gets pasted as content, carrying the assistant's framing into a document that should stand alone. Watch-list phrases: "I hope this helps", "Of course!", "Certainly!", "You're absolutely right!", "Would you like...", "let me know", "here is a...".

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Here is an overview of the French Revolution. I hope this helps! Let me know if you'd like me to expand on any section.
#+end_example

*** After
#+begin_example
The French Revolution began in 1789 when financial crisis and food shortages led to widespread unrest.
#+end_example

*** History
- Original SKILL.md entry: collaborative-communication artifacts pasted as content.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §20 Knowledge-Cutoff Disclaimers

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Remove training-cutoff hedges ("as of my last update", "while specific details are scarce", "based on available information") and either commit to a fact or omit the claim.

*** Problem
AI disclaimers about incomplete information get left in text, signaling the model's uncertainty rather than the author's. Watch-list phrases: "as of [date]", "up to my last training update", "while specific details are limited" (or "scarce"), "based on available information".

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
While specific details about the company's founding are not extensively documented in readily available sources, it appears to have been established sometime in the 1990s.
#+end_example

*** After
#+begin_example
The company was founded in 1994, according to its registration documents.
#+end_example

*** History
- Original SKILL.md entry: knowledge-cutoff hedging.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §21 Sycophantic/Servile Tone

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Cut servile opener phrases ("Great question!", "You're absolutely right", "That's an excellent point") and proceed straight to the substance.

*** Problem
Overly positive, people-pleasing language reads as performance rather than communication and signals AI assistant register.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
Great question! You're absolutely right that this is a complex topic. That's an excellent point about the economic factors.
#+end_example

*** After
#+begin_example
The economic factors you mentioned are relevant here.
#+end_example

*** History
- Original SKILL.md entry: sycophantic and servile opener patterns.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §22 Filler Phrases

*** Modes
General mode only. Prose and personal inherit it. Pattern §38 is the stricter cousin for Craig's authored prose.

*** Rule
Compress wordy filler to plain equivalents: "in order to" to "to", "due to the fact that" to "because", "at this point in time" to "now", "in the event that" to "if", "has the ability to" to "can", "it is important to note that" to nothing, "for the purpose of" to "to", "in spite of the fact that" to "although", "a great deal of" to "much", "at this juncture" to "now".

*** Problem
Wordy filler stretches a sentence without adding precision. Cutting it shortens the prose and sharpens the claim.

*** Basis
Corpus-measured: 2026-05-29 commit corpus shows "moreover", "furthermore", "additionally", "in conclusion" all at zero or one occurrence. Filler-phrase avoidance confirmed at the watch-list level.

*** Before
#+begin_example
In order to achieve this goal, we need to allocate resources due to the fact that the team has the ability to deliver. At this point in time, it is important to note that the data shows progress.
#+end_example

*** After
#+begin_example
To achieve this, we need to allocate resources because the team can deliver. The data shows progress.
#+end_example

*** Detection
The original SKILL.md entry uses a Before to After substitution table:
- "In order to achieve this goal" to "To achieve this"
- "Due to the fact that it was raining" to "Because it was raining"
- "At this point in time" to "Now"
- "In the event that you need help" to "If you need help"
- "The system has the ability to process" to "The system can process"
- "It is important to note that the data shows" to "The data shows"
- "For the purpose of" to "To"
- "In spite of the fact that" to "Although"
- "A great deal of" to "Much"
- "At this juncture" to "Now"
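
The substitution table above can be sketched as a lookup. This is an illustration under assumptions, not the skill's implementation: =FILLERS= and =compress_fillers= are hypothetical names, and the re-capitalization step is a crude heuristic that only looks at sentence-ending punctuation.

```python
import re

# Substitution table copied from the list above; an empty string means "delete".
FILLERS = {
    "in order to": "to",
    "due to the fact that": "because",
    "at this point in time": "now",
    "in the event that": "if",
    "has the ability to": "can",
    "it is important to note that": "",
    "for the purpose of": "to",
    "in spite of the fact that": "although",
    "a great deal of": "much",
    "at this juncture": "now",
}

# Longest phrases first so a longer filler is never shadowed by a shorter one.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in sorted(FILLERS, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)


def compress_fillers(text):
    def swap(match):
        phrase = match.group(0)
        plain = FILLERS[phrase.lower()]
        # Keep a leading capital when the filler opened a sentence.
        if plain and phrase[0].isupper():
            plain = plain[0].upper() + plain[1:]
        return plain

    out = _FILLER_RE.sub(swap, text)
    out = re.sub(r" {2,}", " ", out).strip()  # tidy gaps left by deletions
    # Re-capitalize a sentence whose opening filler was deleted outright.
    return re.sub(r"(^|[.!?]\s+)([a-z])", lambda m: m.group(1) + m.group(2).upper(), out)


print(compress_fillers("The system has the ability to process requests."))
```

Run on this section's Before example, the sketch keeps "this goal" and "Now," where the hand-edited After drops them. The table handles the mechanical swaps, and the final trim still takes judgment.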

*** History
- Original SKILL.md entry: wordy filler phrase substitutions.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §23 Excessive Hedging

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Strip stacked hedges ("could potentially possibly", "might have some effect") down to a single appropriate qualifier.

*** Problem
Over-qualifying statements weakens them without adding accuracy. One hedge does the job that three do.

*** Basis
Observation-derived (Strunk and White, Garner).

*** Before
#+begin_example
It could potentially possibly be argued that the policy might have some effect on outcomes.
#+end_example

*** After
#+begin_example
The policy may affect outcomes.
#+end_example

*** History
- Original SKILL.md entry: stacked hedge reduction.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §24 Generic Positive Conclusions

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Replace vague upbeat endings ("the future looks bright", "exciting times lie ahead", "a step in the right direction") with a concrete fact or cut the closer entirely.

*** Problem
Vague upbeat endings give the document the shape of a press release without making any specific claim.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The future looks bright for the company. Exciting times lie ahead as they continue their journey toward excellence. This represents a major step in the right direction.
#+end_example

*** After
#+begin_example
The company plans to open two more locations next year.
#+end_example

*** History
- Original SKILL.md entry: generic positive conclusion boilerplate.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §25 Hyphenated Word Pair Overuse

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Drop reflexive hyphens from common modifier pairs (third-party, cross-functional, client-facing, data-driven, decision-making, well-known, high-quality, real-time, long-term, end-to-end) where humans hyphenate inconsistently. Less common or genuinely technical compound modifiers can keep their hyphens.

*** Problem
AI hyphenates common word pairs with perfect consistency. Humans hyphenate these pairs inconsistently, often within the same document. The uniformity itself is the tell.

*** Basis
Observation-derived (Wikipedia "Signs of AI Writing").

*** Before
#+begin_example
The cross-functional team delivered a high-quality, data-driven report on our client-facing tools. Their decision-making process was well-known for being thorough and detail-oriented.
#+end_example

*** After
#+begin_example
The cross functional team delivered a high quality, data driven report on our client facing tools. Their decision making process was known for being thorough and detail oriented.
#+end_example

*** History
- Original SKILL.md entry: hyphenated common-modifier overuse.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §26 Long Word → Short Word

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Swap long Latinate words for their short Anglo-Saxon equivalents per the Plain English wordlist: "utilize" to "use", "commence" to "start" or "begin", "terminate" to "end", "facilitate" to "help", "demonstrate" to "show", "sufficient" to "enough", "prior to" to "before", "subsequent to" to "after", "approximately" to "about", "endeavor" to "try", "ascertain" to "find out", "assistance" to "help", "obtain" to "get", "modification" to "change", "implement" to "carry out", "optimal" to "best", "regarding" to "about", "methodology" to "method", "in the event of" to "if".

*** Problem
Long Latinate words signal effortful writing without adding precision. Anglo-Saxon roots are shorter and clearer.

*** Basis
Observation-derived (Strunk and White, Orwell, Plain English Campaign, Garner).

*** Before
#+begin_example
The system will utilize advanced algorithms to facilitate optimal performance. Prior to deployment, we must ascertain that the methodology is sufficient.
#+end_example

*** After
#+begin_example
The system uses algorithms to get the best performance. Before deployment, we must check that the method works.
#+end_example

*** History
- Original SKILL.md entry: Plain English wordlist substitutions.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §27 Active Over Passive Voice

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Rewrite passive constructions to active when the actor is recoverable from context. Flag rather than auto-rewrite when the actor genuinely does not matter, because passive is sometimes the right choice in technical contexts.

*** Problem
Passive voice hides who did what. Active voice is shorter and clearer in most cases. Skip when the actor genuinely does not matter (technical writing about an inanimate process: "the table was created in 2024" can stay passive).

*** Basis
Observation-derived (Strunk and White, Orwell).

*** Before
#+begin_example
The migration was run by the deployment script. The bug was introduced in commit abc123. The fix was applied by the team.
#+end_example

*** After
#+begin_example
The deployment script ran the migration. Commit abc123 introduced the bug. The team applied the fix.
#+end_example
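
The detection pattern for this rule, a form of "to be" followed by a past participle, can be approximated with a regular expression. This is a rough sketch under stated assumptions: the participle guess (a word ending in -ed or -en, plus a short list of irregulars) is hypothetical and will produce false positives such as "is often". It only flags candidates, in line with the rule's flag-rather-than-rewrite guidance.

```python
import re

BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
# Crude participle guess: a word ending in -ed or -en, plus a few irregulars.
PARTICIPLE = r"(?:\w+(?:ed|en)|run|made|built|sent|done|set|put)"
PASSIVE = re.compile(rf"\b{BE_FORMS}\s+(?:\w+ly\s+)?{PARTICIPLE}\b", re.IGNORECASE)


def flag_passives(text):
    """Return each 'to be' + participle span found in text."""
    return [m.group(0) for m in PASSIVE.finditer(text)]


before = (
    "The migration was run by the deployment script. "
    "The bug was introduced in commit abc123. "
    "The fix was applied by the team."
)
print(flag_passives(before))  # ['was run', 'was introduced', 'was applied']
```

Whether the actor is recoverable from context is not something the pattern can decide, so each flagged span still needs a human read.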

*** Detection
"to be" plus past-participle patterns where the actor is recoverable from context.

*** History
- Original SKILL.md entry: active-over-passive with suggestion-only treatment in v1.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §28 Comma Splices

*** Modes
General mode only. Prose and personal inherit it. The semicolon escape route is blocked in prose and personal modes by §33.

*** Rule
Split two independent clauses joined only by a comma into two sentences or join them with a conjunction. In general mode a semicolon is an acceptable repair. In prose and personal modes the semicolon is itself a target (§33), so prefer the period.

*** Problem
Comma splices read as run-ons. Split into two sentences, join with a conjunction, or use a semicolon (in prose and personal modes this becomes a period).

*** Basis
Observation-derived (Strunk and White).

*** Before
#+begin_example
The build failed, the test suite reported three errors.
#+end_example

*** After
#+begin_example
The build failed. The test suite reported three errors.
#+end_example

*** Detection
Two independent clauses joined only by a comma.

*** History
- Original SKILL.md entry: comma-splice repair.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §29 Cliché Flag

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Replace business and conversational clichés with the plain meaning, including in casual register where "it's fine, it's casual" is the tell. Watch-list phrases: at the end of the day, moving forward, going forward, at this juncture, circle back, low-hanging fruit, deep dive, leverage (as verb), synergy, take it offline, ducks in a row, boil the ocean, pivot (corporate sense), keep it loose, keep it casual, touch base, circle up, hit the ground running, move the needle, on the same page, no-brainer, win-win.

*** Problem
Clichés signal effortful prose without saying anything specific. Replace with the actual meaning. A casual, friendly, or conversational register is not a license to keep a cliché. Cut it there too. If you catch yourself justifying one as "it's fine, it's casual," that is the tell. Craig flagged this on 2026-05-22 when "keep it loose" slipped through as "acceptable casual." That is exactly the miss this note prevents.

*** Basis
Observation-derived (Orwell, Garner).

*** Before
#+begin_example
At the end of the day, we need to leverage our core competencies and circle back on the low-hanging fruit.
#+end_example

*** After
#+begin_example
We need to use what we already do well and start with the easiest improvements first.
#+end_example

*** History
- Original SKILL.md entry: business and conversational cliché list.
- 2026-05-22: Craig added "keep it loose" / "keep it casual" after a miss in earlier output.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §30 Jargon-Fragment → Complete Sentence

*** Modes
General mode only. Prose and personal inherit it. Pattern §37 is the stricter cousin for Craig's authored prose.

*** Rule
Rewrite telegraphic sentence fragments inside prose paragraphs as complete sentences with subject and verb. Headings and bullet items are exempt because fragments are valid there.

*** Problem
Telegraphic fragments in prose paragraphs read as bullet-style notes leaking into running text. They lose the connective tissue a complete sentence carries.

*** Basis
Observation-derived (Strunk and White).

*** Before
#+begin_example
The new function handles edge cases. Empty input throws. Whitespace gets trimmed. Returns null on no match.
#+end_example

*** After
#+begin_example
The new function handles edge cases. It throws on empty input, trims whitespace, and returns null when no match is found.
#+end_example

*** Detection
Sentence-like fragments inside prose paragraphs that read as bullet-list shorthand. Headings and bullet items are exempt because fragments are valid there.

*** History
- Original SKILL.md entry: jargon-fragment rewrite for prose paragraphs.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §31 Noun-ified Verbs

*** Modes
General mode only. Prose and personal inherit it.

*** Rule
Replace corporate-speak noun-ifications with the real noun: "the ask" to "the request", "a learn" to "the lesson", "the spend" to "the budget", "a build" to "the system" or "the prototype", "the reveal" to "the announcement", "the lift" to "the effort", "the get" to "the result". Philosophical nominalizations ("the becoming", "the unfolding") are not targets.

*** Problem
Corporate-speak nominalization reads as performance. The real nouns are shorter and clearer. Watch-list: the ask, a learn, the spend, a build, the reveal, a do, the lift, the get, the say.

*** Basis
Observation-derived (Garner; Craig's voice rules in claude-rules/commits.md).

*** Before
#+begin_example
The ask was for a quick build. After the reveal, we'll do a learn.
#+end_example

*** After
#+begin_example
The request was for a quick prototype. After the announcement, we'll review what worked.
#+end_example

*** History
- Original SKILL.md entry: corporate-speak nominalization.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §32 First-Person Voice Rewrite

*** Modes
Personal mode only. General and prose skip it because a research note, a document, or anyone else's text is legitimately third-person.

*** Rule
Rewrite impersonal third-person publish-artifact bodies into first person ("I added X", "I missed Y", "I kept Z because..."). The commit subject line stays imperative per Conventional Commits ("feat: add support for X"). The body shifts to first person. Skip the rewrite for mechanical changes (a chore version bump, a typo fix) where the subject alone carries the message.

*** Problem
Impersonal third-person ("Add support for X", "The change adds Y") reads as press-release voice in a commit body or PR description. First-person ("I added X", "I kept Y because...") sounds like one engineer talking to another.

*** Basis
Corpus-measured across registers (2026-05-29): standalone "I" runs 3.85 per 1000 words in git commits, 36.91 in personal email, 23.79 in work email, 8.68 in PR descriptions, 42.97 in PR review comments. First-person density is roughly 6x to 11x higher in email and PR review comments than in commits. Craig writes first-person heavily across the board, but commit prose under-uses "I" relative to natural English. The rule strengthens the under-using register without overreaching: it asks the publish-artifact body to write the way the email body already does.
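
The per-1000-words figures are simple rates. As a worked sketch, the commit figure can be reproduced from counts stated elsewhere in this file (495 standalone "I" in 128608 commit-body words). The helper names and the regex for "standalone" are assumptions, since the original measurement script is not shown here.

```python
import re


def per_thousand(count, total_words):
    """Occurrences per 1000 words, rounded to two decimals."""
    return round(count / total_words * 1000, 2)


def standalone_i_count(text):
    # Assumption: "standalone" excludes contractions such as "I'm" and "I'd".
    return len(re.findall(r"\bI\b(?!['\u2019])", text))


print(per_thousand(495, 128608))   # 3.85
print(round(36.91 / 3.85, 1))      # 9.6, personal email relative to commits
```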

*** Before
#+begin_example
Adds the new validation step before saving. The previous flow allowed empty values to leak into the database. This change blocks them at the API boundary.
#+end_example

*** After
#+begin_example
I added a validation step before saving. The previous flow let empty values leak into the database. I'm blocking them at the API boundary now.
#+end_example

*** Detection
Impersonal third-person construction in a publish-artifact body where first-person fits naturally.

*** History
- Original SKILL.md entry: first-person rewrite for publish artifacts.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §33 Semicolon → Period or Comma

*** Modes
Prose and personal modes. General mode keeps semicolons because academic and literary registers use them legitimately.

*** Rule
Replace semicolons with a period (split into two sentences) or a comma (when the clauses are tightly coupled) in Craig's authored prose: emails, documents, working notes, commit-message bodies, PR descriptions, PR review comments. A formal long-form document can keep the semicolon, but the default is to split. This is chosen self-discipline, not a reflection of habit.

*** Problem
Craig's published voice drops semicolons by choice. They make the writing feel unnecessarily literary, the period-split usually reads better, and dropping them removes one common AI tell. The rule overrides his pre-rule habit rather than codifying one: the corpus shows he used semicolons regularly in commit prose.

*** Basis
Corpus-measured across registers (2026-05-29): semicolons run 3.16 per 1000 words in git commits, 0.64 in personal email, 0.26 in work email, 0.62 in PR descriptions, 0.00 in PR review comments. This is the same register split as em-dashes (§13). Semicolons are concentrated in commit prose. Conversational prose almost never uses them. The rule mostly enforces what is already true for non-commit registers. It earns its place because commit prose is the register where Craig's habit and the AI-tell pattern overlap.

*** Before
#+begin_example
I added the validation; the previous flow allowed empty values to leak through.
#+end_example

*** After
#+begin_example
I added the validation. The previous flow allowed empty values to leak through.
#+end_example

*** Detection
Semicolons in prose Craig authors: emails, documents, working notes, commit-message bodies, PR descriptions, PR review comments.

*** History
- Original SKILL.md entry: semicolon to period or comma in Craig's authored prose.
- 2026-05-29 (commit =c3cf9a5=): basis note added with corpus measurement reframing the rule as self-discipline.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.
- 2026-06-10: the self-discipline reframing (a "Suggested deltas" item from 2026-05-29, never applied) moved from the findings section into the entry proper and into the SKILL.md rule line. Craig's call, from the work-project session.

** §34 Contractions

*** Modes
Prose and personal modes. General mode skips because academic, literary, and formal registers often prefer uncontracted forms.

*** Rule
Prefer contractions in Craig's prose (it's, that's, don't, we're, I'd, won't) unless a negation or emphasis genuinely needs the uncontracted weight.

*** Problem
Uncontracted English reads stiff in a short prose body unless a negation or emphasis needs the weight. Prefer contractions in Craig's prose: emails, documents, commit and PR bodies.

*** Basis
Corpus-measured across registers (2026-05-29). Contraction rate per 1000 words: git commits 3.57, personal email 38.52, work email 28.13, PR descriptions 17.36, PR review comments 50.78. Commit prose is the outlier register that suppresses contractions. Conversational and PR-review prose use them heavily, near the natural-English rate. The Phase 1 curiosity ("I'm" at 9 occurrences vs standalone "I" at 495 in commits) was a register effect, not a personal preference. Personal email runs "I'm" at 6.04 per 1000 words vs standalone "I" at 36.91, a ratio close to natural English. Top contractions in personal email: i'm 1710, it's 928, i'll 865, don't 632, you're 567, i've 458, that's 433, i'd 384, we're 307, didn't 299. The rule is confirmed across the board, with the strongest evidence from the conversational registers where contractions are most expected.

*** Before
#+begin_example
It is worth noting that the change does not break the existing flow. We are confident that this is the right approach.
#+end_example

*** After
#+begin_example
It's worth noting the change doesn't break the existing flow. We're confident this is the right approach.
#+end_example

*** Detection
Uncontracted forms in publish-artifact prose where the contraction reads more naturally. Note that §38 also catches "worth noting" as rhetorical padding. The example above shows the contraction pass in isolation. In practice both passes apply.
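
A minimal sketch of this detection, assuming a small lookup of uncontracted forms drawn from the rule's examples. The names =CONTRACTIONS= and =suggest_contractions= are hypothetical. The sketch suggests rather than rewrites, because the rule leaves room for a negation or emphasis that needs the uncontracted weight.

```python
import re

# Uncontracted form -> contraction. The list is illustrative, not exhaustive.
CONTRACTIONS = {
    "it is": "it's",
    "that is": "that's",
    "do not": "don't",
    "does not": "doesn't",
    "did not": "didn't",
    "we are": "we're",
    "i would": "I'd",
    "will not": "won't",
}

_UNCONTRACTED = re.compile(r"\b(?:" + "|".join(CONTRACTIONS) + r")\b", re.IGNORECASE)


def suggest_contractions(text):
    """Return (found, suggestion) pairs for each uncontracted form in text."""
    suggestions = []
    for match in _UNCONTRACTED.finditer(text):
        found = match.group(0)
        short = CONTRACTIONS[found.lower()]
        # Mirror a sentence-initial capital in the suggestion.
        if found[0].isupper():
            short = short[0].upper() + short[1:]
        suggestions.append((found, short))
    return suggestions


before = (
    "It is worth noting that the change does not break the existing flow. "
    "We are confident that this is the right approach."
)
print(suggest_contractions(before))
```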

*** History
- Original SKILL.md entry: contractions preferred in Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §35 Sentence Split on Conjunctions

*** Modes
Prose and personal modes. General mode skips because academic and literary registers use long compound sentences deliberately.

*** Rule
Split sentences that stack three or four clauses joined by "so", "and", "but" into two or three shorter sentences when the split does not lose meaning.

*** Problem
Long compound sentences read easier as two or three shorter ones in a prose or publish-artifact body. Skip in academic or literary prose where deliberate long sentences are the register.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/commits.md). Corpus context: average sentence is 18.81 words, median 14, with 28% of sentences at 21+ words. Long-sentence rate is moderate. Inspection of actual sentences for conjunction-stitching is deferred to Phase 2.

*** Before
#+begin_example
I added the validation step before saving so empty values get blocked at the API boundary, and I also added a regression test that exercises the empty-string case, but I did not change the upstream caller because that's a separate concern.
#+end_example

*** After
#+begin_example
I added the validation step before saving so empty values get blocked at the API boundary. I added a regression test that exercises the empty-string case. I didn't change the upstream caller because that's a separate concern.
#+end_example

*** Detection
Sentences that stack three or four clauses with commas and conjunctions ("so", "and", "but") where splitting on a conjunction would not lose meaning.
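
As a rough sketch of this detection, one can count the three named conjunctions per sentence and flag any sentence that carries two or more. The threshold, the sentence splitter, and the name =stacked_sentences= are assumptions. An "and" that joins two nouns counts too, so the output is a list of candidates, not verdicts.

```python
import re

CONJUNCTION = re.compile(r"\b(?:so|and|but)\b", re.IGNORECASE)
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def stacked_sentences(text, threshold=2):
    """Return sentences containing at least `threshold` of so/and/but."""
    sentences = [s for s in SENTENCE_BREAK.split(text.strip()) if s]
    return [s for s in sentences if len(CONJUNCTION.findall(s)) >= threshold]


before = (
    "I added the validation step before saving so empty values get blocked at the "
    "API boundary, and I also added a regression test that exercises the empty-string "
    "case, but I did not change the upstream caller because that's a separate concern."
)
after = (
    "I added the validation step before saving so empty values get blocked at the "
    "API boundary. I added a regression test that exercises the empty-string case. "
    "I didn't change the upstream caller because that's a separate concern."
)
print(len(stacked_sentences(before)))  # 1
print(len(stacked_sentences(after)))   # 0
```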

*** History
- Original SKILL.md entry: sentence split on conjunctions for Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §36 Felt-Experience Narration

*** Modes
Prose and personal modes. General mode skips because third-party prose is legitimately allowed to describe how something feels.

*** Rule
Cut phrases that tell the reader how the change will feel or how often the writer will use it ("I'll feel this every time I commit", "this will be a relief", "I'm excited about", "this is going to be huge"). State what changed and let the reader decide what to do with it.

*** Problem
Felt-experience phrases read as performance, not communication. They tell the reader how the writer wants them to receive the change rather than describing the change.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/commits.md). The commit-body corpus would not carry felt-experience prose. Measurement against the email and journal corpus is deferred to Phase 2.

*** Before
#+begin_example
I'm so excited about this — I'll feel the speedup every time I run the build. This is going to be a huge relief.
#+end_example

*** After
#+begin_example
The build now finishes in roughly half the time it used to take.
#+end_example

*** Detection
Phrases that tell the reader how the change will feel or how often the writer will use it.

*** History
- Original SKILL.md entry: felt-experience narration cut for Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §37 Sentence Fragments → Complete

*** Modes
Prose and personal modes. General mode keeps the softer §30, which exempts more, because the strong "every sentence" rule is Craig's voice and should not be imposed on third-party text.

*** Rule
Rewrite every sentence fragment inside a prose paragraph in Craig's authored text as a complete sentence with subject and verb. Bullets and headings can stay fragments. Exemption: verdict formulas in PR review summaries ("Approving.", "Requesting changes.", "Approved.") are house style and stay. Rewriting them would impose the rule where Craig's calibrated voice already decided otherwise.

*** Problem
Bullet shorthand leaking into running prose ("Two changes." "Fix incoming." "Body as decision log.") reads as bullet-list notes pasted into a paragraph. Every prose sentence needs a subject and a verb in prose and personal modes.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/commits.md). Corpus context: 9.7% of sentences are 1-5 words. Some are legitimate short complete claims ("All eight pass."), some may be fragments. Word count alone cannot distinguish the two. Syntactic detection deferred to Phase 2.
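
A short sketch of why word count is not enough. The filter below, with the hypothetical name =short_sentences= and a crude punctuation-based splitter, returns every sentence of five words or fewer. It returns the complete sentence "All eight pass." and the fragment "Three new patterns." alike.

```python
import re


def short_sentences(text, max_words=5):
    """Return sentences with at most max_words whitespace-separated words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [s for s in sentences if len(s.split()) <= max_words]


sample = "All eight pass. Three new patterns. I made a big change to the validator today."
print(short_sentences(sample))  # ['All eight pass.', 'Three new patterns.']
```

Telling the two apart needs a check for a subject and a finite verb, which this sketch does not attempt.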

*** Before
#+begin_example
Big change to the validator. Three new patterns. Test coverage up. Old behavior preserved.
#+end_example

*** After
#+begin_example
I made a big change to the validator. There are three new patterns and the test coverage is up. The old behavior is preserved.
#+end_example

*** Detection
Sentence fragments inside prose paragraphs in any text Craig authors: an email, a document, a working note, a commit or PR body. Bullets and headings remain fair game for fragments.

*** History
- Original SKILL.md entry: sentence-fragment rewrite for Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.
- 2026-06-10: verdict-formula exemption added. The skill survived in practice by being selectively ignored on "Approving." / "Requesting changes." / "Approved.", and selective ignoring is the same muscle that skips real patterns. Documenting the exception removes one standing occasion for judgment-override. Craig's call, from the work-project session.

** §38 Terse Cut — Omit Needless Words

*** Modes
Prose and personal modes. General mode keeps the softer §22 because academic registers retain "worth noting" and "it's important to understand" as legitimate transition markers.

*** Rule
Two cuts. First strip the soft rhetorical padding ("worth noting", "it's important to understand", "as you can see", "needless to say", "obviously", "of course", "in essence", "fundamentally"). Then run the general omit-needless-words sweep the padding list only samples: read each sentence and cut or collapse every word and clause that can go without losing meaning, not only the named phrases. Forcing test, per sentence: try to delete half of it and keep only what changes meaning.

*** Execution position (prose + personal)
§38 is not just one pattern in the walk. It is the mandatory *last* pass before any draft is presented. The SKILL.md Process makes it an explicit standalone final step, run after every other pattern. The reason is empirical: a draft that cleared the other 40 patterns still routinely runs a third too long, because ordinary verbosity matches no named trigger and the categorical detectors come back clean while the text is still bloated. Folded into the general walk, §38 gets glossed as a wordlist match. As a separate final step it gets the real per-sentence "delete half of it" sweep. A public draft shown without this pass is a defect in the same class as skipping the skill entirely.

*** Problem
Tier 1 omit-needless-words (§22) catches rigid offenders ("the fact that", "in order to"). The original §38 added a named padding list ("worth noting", "obviously"). But a draft can clear both and still run a third too long, because ordinary verbosity matches no named trigger: "that already merged via" for "landed on", "with it still in the PR, the same fix lands" for "keeping it re-lands the fix", restated subjects, throat-clearing lead-ins, clauses the reader already has. Those slip the categorical detectors silently. The walk comes back clean while the text is still bloated. So §38 is a real walk step, not a wordlist match: after the named padding, read each sentence and try to delete half of it. Academic registers keep the transition markers, so the aggressive cut stays prose and personal only.

*** Basis
Corpus-measured across registers (2026-05-29). Single-sentence-paragraph rate: git commits 41.1%, personal email 57.4%, work email 44.5%, PR descriptions 74.4%, PR review comments 50.0%. The terse-paragraph cadence is even more pronounced in conversational and PR-description prose than in commits. Craig writes tersely across registers, with the highest density in deliberate PR descriptions, where each paragraph carries one focused thought. The confirmation is indirect: it rests on paragraph structure across all five corpora.

*** Before
#+begin_example
It's worth noting that the change doesn't break the existing flow. Needless to say, the test suite is green. Obviously, this means we can ship.
#+end_example

*** After
#+begin_example
The change doesn't break the existing flow. The test suite is green. We can ship.
#+end_example

*** Before (generic verbosity, no named padding)
#+begin_example
This try/except is the same isolation change that already merged via #203. With it still in the PR, the same production fix lands under a second ticket, which is what the test: label means.
#+end_example

*** After
#+begin_example
This try/except already landed on development via #203. Keeping it re-lands a merged fix under a second ticket, like the test: label says.
#+end_example

This second pair carries no padding phrase from the named list. Every cut is ordinary verbosity: "is the same isolation change that already merged via" collapses to "already landed on", "With it still in the PR, the same production fix lands" to "Keeping it re-lands a merged fix", "which is what ... means" to "like ... says". A wordlist match finds nothing here; the per-sentence "delete half of it" test finds all of it.

*** Detection
Two passes. (1) Named padding phrases: "worth noting", "it's important to understand", "as you can see", "needless to say", "obviously", "of course", "in essence", "fundamentally". (2) Ordinary verbosity beyond the list: verbose verb phrases ("already merged via" → "landed on"), restated subjects, throat-clearing lead-ins, and any clause whose content the reader already has. The forcing test for pass 2 is per sentence: try to delete half of it and keep only what changes meaning.
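
Pass 1 is mechanical enough to sketch in code. A minimal Python sketch, assuming a case-insensitive whole-phrase match is good enough; =find_padding= and =PADDING_PHRASES= are hypothetical names, not part of the skill. Pass 2 has no wordlist, so it stays a manual per-sentence read.

```python
import re

# Named padding phrases copied from the Detection list above.
PADDING_PHRASES = [
    "worth noting",
    "it's important to understand",
    "as you can see",
    "needless to say",
    "obviously",
    "of course",
    "in essence",
    "fundamentally",
]

# One case-insensitive alternation with word boundaries, so a phrase
# never matches inside a longer word.
_PADDING_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PADDING_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_padding(text):
    """Return (phrase, offset) for each named padding phrase in text."""
    return [(m.group(0).lower(), m.start()) for m in _PADDING_RE.finditer(text)]
```

Run against the two Before examples in this section, a matcher like this should report three hits on the first and none on the second, which is the gap the forcing test covers.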

*** History
- Original SKILL.md entry: rhetorical-padding cut for Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.
- 2026-06-02: generalized from a named-padding-list detector to a real omit-needless-words walk step. A PR-review comment cleared the §22/§23/§26/§38-padding patterns yet still ran a third too long on ordinary verbosity; the wordlist matched none of it. Renamed "Rhetorical Padding" to "Omit Needless Words", added the per-sentence "delete half of it" forcing test and the generic-verbosity example pair above. Craig's call.
- 2026-06-05: elevated to the mandatory final pass in the SKILL.md Process (new step 7). The pattern existed and was being walked, but got glossed as one of 41; a commit message went out needing two manual Orwell-walk requests before it read terse. Made it an explicit standalone last step that runs before any draft is shown, so the terse cut happens before Craig sees the draft rather than after he asks for it. Added the "Execution position" subsection above. Craig's call.

** §39 Public-Artifact Scope Check

*** Modes
Personal mode only. General and prose skip because a private journal or a third-party document has no public-scope concern. Flag only; no auto-rewrite.

*** Rule
Flag (do not auto-rewrite) local absolute paths, private repo names, and personal-tooling references in public artifacts. Surface each match as a WARN line so the author resolves it manually. Output format:
#+begin_example
WARN: line 12: "/home/cjennings/code/rulesets" — local absolute path in commit body
WARN: line 18: "claude-rules/commits.md" — personal-tooling reference; state the underlying reason instead
#+end_example

*** Problem
Commit messages, PR descriptions, PR comments, and Linear ticket bodies are visible to teammates and anyone with read access. References to the writer's personal layout are noise to a reader who cannot reproduce it. Auto-masking risks silently editing meaningful content because a legitimate file path mention may be load-bearing, and only the author can tell.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/commits.md, Content scope section). Corpus is the public artifacts themselves, so confirmation is circular. Deferred to Phase 2.

*** Detection
Local absolute paths (=/home/<user>/...=, =/Users/<user>/...=), private repo names (any repo not in this project's known public set), personal-tooling references (humanizer, voice, commits.md, anything under =claude-rules/=, anything under =.ai/= or =.claude/=).
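
The path and tooling checks can be sketched as a line scanner that emits the WARN format from the Rule. A rough Python sketch under stated assumptions: =scope_warnings= and both regexes are hypothetical, the bare word "voice" is left out because it would match ordinary prose, and private repo names are skipped because the known public set is not defined here.

```python
import re

# Same separator character the WARN examples in the Rule use.
SEP = " \u2014 "

# /home/<user>/... and /Users/<user>/... (hypothetical pattern).
LOCAL_PATH_RE = re.compile(r"/(?:home|Users)/[\w.-]+(?:/[\w./-]*)?")

# Anything under claude-rules/, .ai/ or .claude/, plus two named tools.
# The lookbehind keeps ".ai/" from matching inside a hostname like "x.ai/".
TOOLING_RE = re.compile(
    r"(?<!\w)(?:(?:claude-rules|\.ai|\.claude)/[\w./-]*|commits\.md\b|humanizer\b)"
)


def scope_warnings(text, artifact="commit body"):
    """Return WARN lines for scope leaks. Flags only; never edits text."""
    warnings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in LOCAL_PATH_RE.finditer(line):
            path = m.group(0).rstrip(".,")  # drop sentence punctuation
            warnings.append(
                f'WARN: line {lineno}: "{path}"{SEP}local absolute path in {artifact}'
            )
        for m in TOOLING_RE.finditer(line):
            ref = m.group(0).rstrip(".,")
            warnings.append(
                f'WARN: line {lineno}: "{ref}"{SEP}'
                "personal-tooling reference; state the underlying reason instead"
            )
    return warnings
```

A path that sits under a tooling directory would be flagged twice, once per check. That errs toward over-flagging, which fits a flag-only rule where the author resolves each line.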

*** History
- Original SKILL.md entry: public-artifact scope flag for personal mode.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §40 Praise vs Correction Asymmetry

*** Modes
Personal mode only. General and prose skip because the rule assumes a PR review context.

*** Rule
Praise on a PR review is short and carries no justification (the author knows why their good change is good). Correction always explains the why, gently and briefly, the way a mentor would, never as a verdict from on high. Keep it brief either way.

On an approve summary: praise plus verdict, nothing else. Cut any clause that describes or justifies the change. "Clean fix on the stacking bug, the tri-state is the right level to solve it at, and the tests cover the edges. Approving." becomes "Clean fix on the stacking bug. Approving." If a clause references what the code does or why it works, delete it.

On a finding or change-request: always give the why, gently and briefly. Not "Move this to a helper." but "I'd pull this into one helper — three copies of the same rule means the next change has to touch all three, and missing one brings the bug back."

Verification narration is the same defect as justified praise. "I traced X and it's safe because..." pads the compliment with the reviewer's homework. Tracing the code is the reviewer's job, not content for the comment — if verification found a problem, the problem gets the words; if it found nothing, it gets zero words.

*** Problem
Praise and correction call for opposite treatment. The author already knows why their good change is good, so justifying praise reads as flattery. Correction is the reverse. Behavior only changes when the reason lands, so a finding, change-request, or inline coaching note must always explain the why. And the why is delivered gently, the way a kind coach or mentor would.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/commits.md, Voice and Focus section). PR-review corpus needed for empirical measurement. Deferred to Phase 2.

*** Before
#+begin_example
Nice clean migration, the provider mocks and the Normal/Boundary/Error cases are all covered which is exactly what I'd want here. Approving. Also rename `x`.
#+end_example

*** After
#+begin_example
Clean migration. Approving. One note inline: I'd rename `x` to `provider` — it reads as a generic placeholder and the next person won't know it's the resolved provider without tracing it.
#+end_example

*** Before (verification narration)
#+begin_example
All three fixes look right. I traced useMapActions and the unmount cleanup is safe because the hook returns a memoized object, and the provider wraps the whole app so neither call site lands on the no-op path.
#+end_example

*** After
#+begin_example
All three fixes are clean and well-aimed.
#+end_example

*** Detection
In a PR review summary or comment: a praise clause that explains why the good thing is good, a praise clause followed by the verification work that supports it, or a finding or change-request that states what to fix without saying why.

*** History
- Original SKILL.md entry: praise-versus-correction asymmetry for PR review.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.
- 2026-06-10: verification-narration variant added after the third recurrence — a review draft praised a fix and then narrated the verification supporting the praise (the #236 draft). Added to the SKILL.md rule line and the high-recurrence attestation set. Craig's call, from the work-project session.

** §41 No Emphasis Formatting

*** Modes
Prose and personal modes. General mode keeps the related but mechanical §14 (boldface strip). §41 carries Craig's own principle and covers italics and underscores too.

*** Rule
Remove emphasis markup (bold, italics, underscore-wrapped words) used to stress a phrase in Craig's prose, and rephrase so the stress lives in word choice and sentence shape. Structural markup stays: headings, defined terms on first use where the convention is house style, code spans for literal identifiers.

*** Problem
Craig makes his points with words, not formatting. Emphasis markup is a crutch. When a sentence leans on bold or italics to land, the wording is not doing the work. The fix is not to delete the markup and leave a flat sentence. It is to rephrase so the stress lives in the word choice and sentence shape. This is the same principle behind his terminal-rendering rule in chat, but here it is about the writing itself, not the display.

*** Basis
Observation-derived (Craig's voice rules in claude-rules/interaction.md, No Reverse-Video Highlighting rule). Org-mode bold uses =*word*= rather than Markdown =**word**= so corpus grep for Markdown emphasis is not directly applicable. Corpus measurement deferred to Phase 2.

*** Before
#+begin_example
This is **really** important: you must run the migration *before* deploying, or the app will crash.
#+end_example

*** After
#+begin_example
Run the migration before deploying. Skip that step and the app crashes on the first request.
#+end_example

*** Detection
Bold (=**...**=), italic (=*...*= or =_..._=), or underscore-wrapped words used to emphasize a phrase in Craig's prose.
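
The markup half of this check can be sketched as a regex pass over a Markdown-flavored draft. A minimal Python sketch, assuming code spans are backtick-delimited and should be ignored; =find_emphasis= and the patterns are hypothetical. It finds the markup only. Deciding whether the markup is structural, and rephrasing, stay with the author.

```python
import re

# Inline code spans are literal identifiers, not emphasis; drop them first.
CODE_SPAN_RE = re.compile(r"`[^`\n]*`")

EMPHASIS_RE = re.compile(
    r"\*\*(?=\S)(.+?)(?<=\S)\*\*"                          # **bold**
    r"|(?<![\w*])\*(?=[^\s*])(.+?)(?<=[^\s*])\*(?![\w*])"  # *italic*
    r"|(?<!\w)_(?=[^\s_])(.+?)(?<=[^\s_])_(?!\w)"          # _italic_
)


def find_emphasis(text):
    """Return the phrases wrapped in bold or italic markup."""
    prose = CODE_SPAN_RE.sub("", text)
    return [
        m.group(1) or m.group(2) or m.group(3)
        for m in EMPHASIS_RE.finditer(prose)
    ]
```

The word-character lookarounds keep snake_case identifiers and spaced arithmetic like =2 * 3= from registering as emphasis.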

*** History
- Original SKILL.md entry: no emphasis formatting for Craig's authored prose.
- 2026-05-29: migrated to this file as the canonical home per the pairing rule.

** §42 Finding Stems — One Claim Per Sentence

*** Modes
Personal mode only. General and prose skip because the rule assumes a PR review finding.

*** Rule
A PR review finding is built from clean stems, each a straightforward sentence carrying one claim: (1) where the bug is, (2) the way(s) to fix it, (3) why that's better. Cut context sentences that don't change what the author does next (ticket history, design archaeology). Rewrite the anti-pattern shapes: hedged gerund chains ("the real bug looks like the model emitting a partial set"), compressed trade-off clauses ("I'd rather X, or Y, than lose Z"), multi-claim sentences chained through so-clauses or "and", and fixes buried after a mid-sentence colon.

*** Problem
Craig named detangling overly complex or overly wordy Claude-drafted PR review text as THE key issue he fights in PR reviews, and the reason he gates every review draft. The tangles passed all 41 then-existing patterns — §38 shortens but doesn't untangle; a sentence can be terse and still carry three claims. §40 governs praise; this governs how finding text is constructed.

*** Basis
Observation-derived from PR #233 (2026-06-10): a review comment shipped with hedged gerund chains and compressed trade-off clauses that cleared the full walk. The three Before/After pairs below are Craig-approved rewrites from that PR.

*** Before (multi-claim opener + context sentence)
#+begin_example
POST fixes the wipe but it's additive: it no-ops on an empty list and never removes, so "cancel all partners" and any de-selection silently stop working. PUT came from SE-195 so the confirm could reconcile the full set. The real bug is upstream: on a new tasking the confirm emits only the new provider, not the full set. Fix that, or merge with the mission's current providers before the PUT. Either way removal keeps working.
#+end_example

*** After
#+begin_example
POST is additive: it no-ops on an empty list and never removes. That breaks "cancel all partners" and any de-selection. The real bug is upstream: on a new tasking the confirm emits only the new provider, not the full set. Fix that, or merge with the mission's current providers before the PUT, and removal keeps working.
#+end_example

The SE-195 context sentence is cut because it doesn't change the author's next action.

*** Before (hedged gerund chain + compressed trade-off — the calibration case Craig pulled up)
#+begin_example
The real bug looks like the model emitting a partial set on a new tasking. I'd rather fix what the confirm emits, or merge client-side before the PUT, than lose removal.
#+end_example

*** After (where / fix / payoff)
#+begin_example
The real bug is upstream: on a new tasking the confirm emits only the new provider, not the full set. Fix that, or merge with the mission's current providers before the PUT. Either way removal keeps working.
#+end_example

*** Before (claims joined with "and"; fix buried after a mid-sentence colon)
#+begin_example
The prefix check catches any message starting with "confirm ", and the options block exists so the LLM can resolve "number 2" style references. A typed "confirm number 2" loses the list it needs. The card click already sends a self-describing "confirm <id>": pass an explicit parameter through sendAgentMessage and strip only on that path.
#+end_example

*** After
#+begin_example
The prefix check strips the options block from any typed message starting with "confirm ", so "confirm number 2" loses the list the LLM needs to resolve it. Strip on the card-click path instead, with an explicit parameter passed through sendAgentMessage. The click already sends a self-describing "confirm <id>", so stripping is safe there.
#+end_example

*** Detection
In a PR review finding: a sentence carrying more than one claim (chained through so-clauses, "and", or a mid-sentence colon hiding the fix), a hedged gerund chain where a direct claim belongs, a compressed trade-off clause, or a context sentence that doesn't change the author's next action.

*** History
- 2026-06-10: created from the PR #233 calibration session. Proposed in the work project's stems handoff, landed via the consolidated voice-skill revision. Included in the high-recurrence attestation set from day one. Craig's call.

** §43 Single-Sentence Paragraph Cadence Is a Feature

*** Modes
Prose and personal modes. General mode skips because third-party registers legitimately prefer multi-sentence paragraphs.

*** Rule
A one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short. Never merge short paragraphs into multi-sentence ones in a "clean prose" pass.

*** Problem
Most prose-style guides advise multi-sentence paragraphs, so a generic cleanup pass merges Craig's short paragraphs and erases a distinctive feature of his voice. This is a protective pattern: it guards an existing trait rather than correcting a defect.

*** Basis
Corpus-measured (2026-05-29). Single-sentence-paragraph rate: git commits 41.1%, personal email 57.4%, work email 44.5%, PR descriptions 74.4%, PR review comments 50.0%. Between 41% and 74% of Craig's paragraphs are exactly one sentence, depending on register.
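
A rate like this can be computed with a short script. A rough Python sketch, not the tooling that produced the numbers above: it assumes paragraphs are separated by blank lines and counts sentences by terminal punctuation, so abbreviations such as "e.g." would inflate the count. All function names are hypothetical.

```python
import re


def split_paragraphs(text):
    """Split on blank lines and drop empty paragraphs."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def count_sentences(paragraph):
    """Naive count: runs of . ! ? followed by whitespace or end of text."""
    return len(re.findall(r"[.!?]+(?=\s|$)", paragraph)) or 1


def single_sentence_rate(text):
    """Percentage of paragraphs that are exactly one sentence."""
    paragraphs = split_paragraphs(text)
    if not paragraphs:
        return 0.0
    singles = sum(1 for p in paragraphs if count_sentences(p) == 1)
    return 100.0 * singles / len(paragraphs)
```

Running the same function on a draft before and after a cleanup pass gives a quick signal: a sharp drop in the rate suggests short paragraphs were merged.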

*** Before (a cleanup pass merging short paragraphs)
#+begin_example
The build now finishes in half the time, and the cache no longer invalidates on every run, which means the CI queue clears faster too.
#+end_example

*** After
#+begin_example
The build now finishes in half the time.

The cache no longer invalidates on every run, so the CI queue clears faster too.
#+end_example

*** Detection
An edit pass that merged short paragraphs, or a draft paragraph that stacks several thoughts from shifted angles and would read better broken apart.

*** History
- 2026-05-29: surfaced by the Phase 1-2 corpus measurement as a "worth adding" trait; filed as suggested delta 4.
- 2026-06-10: promoted from the suggested-deltas list into a numbered pattern. Craig's call, from the work-project session.

** §44 Parenthetical Asides Are Part of the Voice

*** Modes
Prose and personal modes. General mode skips because third-party text owns its own aside conventions.

*** Rule
Parentheses for asides, clarifications, and scope-narrowing are Craig's voice. Don't strip them in a cleanup pass. They're also the preferred landing spot for em-dash replacements under §13.

*** Problem
Generic style passes treat parentheticals as clutter and strip them. For Craig they carry asides, clarifications, and scope-narrowing, and removing them flattens the voice. Protective pattern, like §43.

*** Basis
Corpus-measured (2026-05-29): 23.07 opening parens per 1000 words across the commit corpus. Heavy parenthetical use is distinctive and consistent.
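
The per-1000-words figure is a simple density. A minimal Python sketch, assuming words are whitespace-separated tokens; the corpus tooling may count words differently, and =per_thousand_words= is a hypothetical name.

```python
def per_thousand_words(text, marker):
    """Occurrences of marker per 1000 whitespace-separated words."""
    words = len(text.split())
    if words == 0:
        return 0.0
    return 1000.0 * text.count(marker) / words
```

Comparing the density of ="("= in a draft before and after a cleanup pass shows whether asides were stripped. The same function with ="?"= gives the question-mark density used as the §45 basis.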

*** Before (a cleanup pass stripping the aside)
#+begin_example
The sync runs on every startup. It skips lockfiles. It also skips build output.
#+end_example

*** After
#+begin_example
The sync runs on every startup (skipping lockfiles and build output).
#+end_example

*** Detection
An edit pass that removed parenthetical asides present in the source text, or an em-dash replacement under §13 where parentheses fit better than a comma or period.

*** History
- 2026-05-29: surfaced by the Phase 1-2 corpus measurement as a "worth adding" trait; filed as suggested delta 5.
- 2026-06-10: promoted from the suggested-deltas list into a numbered pattern, with the §13 landing-spot note. Craig's call, from the work-project session.

** §45 Declarative Register Marker

*** Modes
Prose and personal modes, advisory. Flag only; no auto-rewrite. General mode skips.

*** Rule
Craig's prose is declarative. When a draft contains a rhetorical question, flag it for a second look — it's usually AI rhetoric, not his register. Genuine questions to the reader (a review asking the author's intent, an email asking for a decision) stay.

*** Problem
AI drafts reach for rhetorical questions ("So what does this mean for the build?") as a transition device. Craig states things; he rarely asks them. A rhetorical question in his voice is a tell, but a genuine question is legitimate content, so the pattern flags rather than rewrites.

*** Basis
Corpus-measured (2026-05-29): 0.33 question marks per 1000 words across the commit corpus. His prose register is declarative.

*** Before (rhetorical transition flagged)
#+begin_example
So what does this change for the deploy flow? The staging gate now runs before the canary, which means a bad build never reaches it.
#+end_example

*** After
#+begin_example
The staging gate now runs before the canary, so a bad build never reaches it.
#+end_example

*** Detection
A question mark in a draft in Craig's voice. Flag it for the author: genuine questions to the reader stay, and rhetorical ones are candidates for a declarative rewrite.
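
The flag step can be sketched as a scan that surfaces each question with its line number and leaves the text alone. A rough Python sketch; =flag_questions= is a hypothetical name, and the sentence split is naive (it works line by line, so a question wrapped across lines is reported from the line that holds the question mark). Telling a rhetorical question from a genuine one is left to the author.

```python
import re

# A sentence here is any run of text ending in terminal punctuation.
SENTENCE_RE = re.compile(r"[^.!?]*[.!?]+")


def flag_questions(text):
    """Return (line_number, sentence) for every sentence ending in '?'."""
    flags = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in SENTENCE_RE.finditer(line):
            sentence = m.group(0).strip()
            if sentence.endswith("?"):
                flags.append((lineno, sentence))
    return flags
```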

*** History
- 2026-05-29: surfaced by the Phase 1-2 corpus measurement as a register marker; filed as suggested delta 6.
- 2026-06-10: promoted from the suggested-deltas list into a numbered advisory pattern. Craig's call, from the work-project session.