-rw-r--r-- voice/SKILL.md 12
1 files changed, 9 insertions, 3 deletions
diff --git a/voice/SKILL.md b/voice/SKILL.md
index 4572197..2368687 100644
--- a/voice/SKILL.md
+++ b/voice/SKILL.md
@@ -146,7 +146,9 @@ Avoiding AI patterns is half the job. Sterile, voiceless writing is just as obvi
### 7. Overused "AI Vocabulary" Words
-**High-frequency AI words:** Additionally, align with, crucial, delve, emphasizing, enduring, enhance, fostering, garner, highlight (verb), interplay, intricate/intricacies, key (adjective), landscape (abstract noun), pivotal, showcase, tapestry (abstract noun), testament, underscore (verb), valuable, vibrant
+**High-frequency AI words:** Additionally, align with, comprehensive, crucial, delve, emphasizing, enduring, enhance, fostering, garner, highlight (verb), interplay, intricate/intricacies, key (adjective), landscape (abstract noun), pivotal, showcase, tapestry (abstract noun), testament, underscore (verb), valuable, vibrant
+
+**Note on "comprehensive" (2026-05-29):** A corpus pass found 42 occurrences in Craig's git commit bodies, while every other watch-word in this list registered zero or one occurrence. "comprehensive" is genuine Craig vocabulary in technical contexts ("comprehensive test coverage", "comprehensive audit"). Craig has chosen to keep it on the watch-list because he's consciously trying to use it sparingly. Flag it; suggest an alternative ("full", "complete", "thorough", or rewording to drop the adjective) and let Craig decide per instance.
**Problem:** These words appear far more frequently in post-2023 text. They often co-occur.
@@ -212,10 +214,12 @@ Avoiding AI patterns is half the job. Sterile, voiceless writing is just as obvi
### 13. Em Dash Overuse
-**Problem:** LLMs use em dashes (—) more than humans, mimicking "punchy" sales writing.
+**Problem:** LLMs use em dashes (—) more than the median human writer, mimicking "punchy" sales writing.
**Mode-dependent strength.** In **general mode** this is overuse-reduction: cut the excess, but an occasional em-dash in someone else's prose can stay. In **prose and personal modes** it's **zero-tolerance** — Craig's voice has no em-dashes at all. Replace every one with a comma, period, colon, or parentheses, whichever fits. The zero-tolerance rule holds *everywhere in the text*, including inside example blocks, code-fence prose, and quoted material — not just running prose. An em-dash in a quoted line still gets replaced.
+**Note on basis (2026-05-29).** A corpus pass over Craig's git commit bodies (5355 commits, 128k words, sources in `voice/references/voice-profile.org`) measured his em-dash rate at 3.49 per 1000 words, comparable to AI-generated prose. The zero-tolerance rule in prose and personal modes is self-discipline, not habit-reflection. Craig has decided his published voice drops em-dashes by choice because the result reads cleaner and avoids the most common AI tell, regardless of his pre-rule frequency.
+
**Before:**
> The term is primarily promoted by Dutch institutions—not by the people themselves. You don't say "Netherlands, Europe" as an address—yet this mislabeling continues—even in official documents.
@@ -460,7 +464,9 @@ The subject line of a commit stays imperative per Conventional Commits ("feat: a
**Detection:** Semicolons in prose Craig authors — emails, documents, working notes, commit-message bodies, PR descriptions, PR review comments.
-**Problem:** Craig's voice avoids semicolons. They make the writing feel unnecessarily literary. Replace with a period (split into two sentences) or a comma (when the clauses are tightly coupled). In a genuinely formal long-form document the semicolon can be defensible, so weigh the register — but the default for his prose is to split.
+**Problem:** Craig's voice avoids semicolons. They make the writing feel unnecessarily literary. Replace with a period (split into two sentences) or a comma (when the clauses are tightly coupled). In a genuinely formal long-form document the semicolon can be defensible, so weigh the register, but the default for his prose is to split.
+
+**Note on basis (2026-05-29).** A corpus pass over Craig's git commit bodies measured his semicolon rate at 3.16 per 1000 words, comparable to AI-generated prose. The rule is self-discipline, not habit-reflection. Craig has decided his published voice drops semicolons because the period-split usually reads better, not because semicolons are inherently an AI tell.
**Before:**
> I added the validation; the previous flow allowed empty values to leak through.
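
The "Note on basis" entries in this diff quote per-1000-word rates (3.49 for em-dashes, 3.16 for semicolons) and a raw count of 42 for "comprehensive". As a rough illustration of how such figures could be computed, here is a minimal sketch. It assumes whitespace-delimited word counting and plain substring matching. The method actually used for the corpus pass is not shown in this diff, and the function names `rate_per_1000_words` and `count_word` are hypothetical.

```python
import re

# Written as an escape so the source itself contains no literal em-dash.
EM_DASH = "\u2014"


def rate_per_1000_words(text: str, mark: str) -> float:
    """Occurrences of `mark` per 1000 whitespace-delimited words.

    Assumption: words are split on whitespace only, so two words joined
    by an unspaced em-dash count as a single word.
    """
    words = len(text.split())
    if words == 0:
        return 0.0
    return text.count(mark) * 1000 / words


def count_word(text: str, word: str) -> int:
    """Case-insensitive whole-word count, e.g. for a watch-list word."""
    pattern = rf"\b{re.escape(word)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


if __name__ == "__main__":
    sample = (
        "I added the validation; the previous flow allowed "
        "empty values to leak through."
    )
    print(round(rate_per_1000_words(sample, ";"), 2))
    print(rate_per_1000_words(sample, EM_DASH))
    print(count_word("Comprehensive test coverage and a comprehensive audit.",
                     "comprehensive"))
```

On a 13-word sample a single semicolon produces a very high rate, so figures like these are only meaningful over a large corpus. As a sanity check on the quoted numbers, 3.49 per 1000 words over roughly 128k words works out to about 450 em-dashes.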