author Craig Jennings <c@cjennings.net> 2026-05-29 18:03:44 -0500
committer Craig Jennings <c@cjennings.net> 2026-05-29 18:03:44 -0500
commit 1a795d6147ba15519e8cf934c1b764809f79567f
tree 9b0a587b109272594d87c0bb5bc73d617fa35382
parent 10d0bc1ac516b44625e44ee34a31b2207e5ce34e
docs(voice): scrub prose em-dashes from voice-profile.org
This hygiene sweep covers the profile's prose sections (Problem, Basis, History, Phase 1 findings, Phase 2 list) where em-dashes had carried over from the original SKILL.md text. 21 prose em-dashes were replaced with context-appropriate punctuation (periods, colons, parentheses, or rewords). Eight em-dashes are preserved as legitimate exceptions: the literal symbol reference in §13 Rule, §13 Before example (shows source text with em-dashes), §36 Before example (felt-experience tic), §38 heading "Terse Cut — Rhetorical Padding" (paired with SKILL.md heading verbatim), §39 Before example (WARN output format), §40 example dialogue (shows what a kind correction reads like). The profile now follows its own rule for prose voice. The known follow-up flagged in 10d0bc1's commit message is closed.
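A sweep like this can be checked mechanically. The following is a minimal sketch, not the method used for this commit (the message does not say how the 21 replacements were found). The function names `em_dash_lines` and `rate_per_1000_words` are hypothetical, and the word count is a plain whitespace split, which may differ from however the corpus figures in the profile were computed.

```python
# Hypothetical helpers for auditing em-dashes in a text file.
# The em-dash is written as an escape so this snippet contains none itself.
EM_DASH = "\u2014"


def em_dash_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line that contains an em-dash."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]


def rate_per_1000_words(text: str) -> float:
    """Em-dashes per 1000 whitespace-separated words (0.0 for empty text)."""
    words = len(text.split())
    if words == 0:
        return 0.0
    return text.count(EM_DASH) * 1000 / words


if __name__ == "__main__":
    sample = "Craig writes terse \u2014 paragraph breaks land.\nNo dash here.\n"
    for number, line in em_dash_lines(sample):
        print(f"{number}: {line}")
    print(f"{rate_per_1000_words(sample):.2f} per 1000 words")
```

Running `em_dash_lines` over the contents of `voice/references/voice-profile.org` after the commit would list the remaining lines, which can then be compared by hand against the exceptions named in the message.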
-rw-r--r-- voice/references/voice-profile.org 40
1 files changed, 20 insertions, 20 deletions
diff --git a/voice/references/voice-profile.org b/voice/references/voice-profile.org
index 867d8a1..d61abaa 100644
--- a/voice/references/voice-profile.org
+++ b/voice/references/voice-profile.org
@@ -1,4 +1,4 @@
-#+TITLE: Voice Profile — canonical source-of-truth for the voice skill
+#+TITLE: Voice Profile: canonical source-of-truth for the voice skill
#+DATE: 2026-05-29
#+SOURCE: rulesets session 2026-05-29
@@ -20,7 +20,7 @@ Git commit bodies authored by Craig Jennings across all repos under =~/code/= an
- 128608 words, 912400 characters
- 33 repos contributing; top sources: archsetup (703), rulesets (621), work (565), archangel (455), home (395)
-PRs deferred. Email + Slack deferred. This is one register (deliberate technical prose) — useful but narrow.
+PRs deferred. Email + Slack deferred. This is one register (deliberate technical prose). The view is useful but narrow.
* Findings against the 41 SKILL.md patterns
@@ -32,42 +32,42 @@ PRs deferred. Email + Slack deferred. This is one register (deliberate technical
*Pattern 22 (filler).* "moreover" / "furthermore" / "additionally" / "in conclusion": all zero or one occurrence. Filler-phrase avoidance confirmed.
-*Pattern 32 (first-person rewrite).* Standalone "I" at 3.85 per 1000 words. Craig writes first-person heavily — this is real, not aspirational.
+*Pattern 32 (first-person rewrite).* Standalone "I" at 3.85 per 1000 words. Craig writes first-person heavily. This is real, not aspirational.
*Pattern 34 (contractions).* 459 contractions total (3.57 per 1000). Top hits: =doesn't= (92), =don't= (59), =isn't= (46), =it's= (43), =can't= (40), =that's= (34). Rule confirmed.
-*Pattern 38 (terse cut).* 41.1% of paragraphs are single-sentence. Craig writes terse — paragraph breaks land after one complete thought even when short. Confirmed indirectly via paragraph structure.
+*Pattern 38 (terse cut).* 41.1% of paragraphs are single-sentence. Craig writes terse. Paragraph breaks land after one complete thought even when short. Confirmed indirectly via paragraph structure.
** Aspirational (corpus contradicts, but the rule is intentional self-discipline)
-*Pattern 13 (em-dash zero-tolerance, personal mode).* Corpus rate: 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Craig USES em-dashes regularly in commit bodies — the rule overrides his habit, it doesn't reflect it. Suggested rewording: drop the "LLMs use em dashes more than humans" framing; keep the zero-tolerance directive but rationale becomes "Craig's published voice — commit messages going forward, PR bodies, emails — drops em-dashes by choice because it reads cleaner and avoids a common AI tell, regardless of his pre-rule habit." Honest about the source.
+*Pattern 13 (em-dash zero-tolerance, personal mode).* Corpus rate: 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Craig USES em-dashes regularly in commit bodies. The rule overrides his habit, it doesn't reflect it. Suggested rewording: drop the "LLMs use em dashes more than humans" framing; keep the zero-tolerance directive but rationale becomes "Craig's published voice (commit messages going forward, PR bodies, emails) drops em-dashes by choice because it reads cleaner and avoids a common AI tell, regardless of his pre-rule habit." Honest about the source.
*Pattern 33 (semicolons → period/comma).* Corpus rate: 3.16 semicolons per 1000. Craig uses semicolons regularly. Same shape as #13: rule is self-discipline, not habit-reflection. Suggested rewording: acknowledge the rule overrides habit rather than implying it codifies one.
-These two rules are still valuable — em-dashes and semicolons both read cleaner when absent from short imperative-leaning prose. But the SKILL.md should say "this is a rule I've decided to follow," not "this is how I already write."
+These two rules are still valuable. Em-dashes and semicolons both read cleaner when absent from short imperative-leaning prose. But the SKILL.md should say "this is a rule I've decided to follow," not "this is how I already write."
** Worth challenging
-*Pattern 7 watch-word "comprehensive".* 42 occurrences in corpus (~0.33 per 1000). All other AI-tell watch-words clock near zero. "comprehensive" appears to be genuine vocabulary for Craig in technical contexts ("comprehensive test coverage", "comprehensive audit"). Suggested change: pull "comprehensive" out of the watch-list, or carve out a "watch in clusters, not solo" note — flag only when "comprehensive" co-occurs with other AI-tell words.
+*Pattern 7 watch-word "comprehensive".* 42 occurrences in corpus (~0.33 per 1000). All other AI-tell watch-words clock near zero. "comprehensive" appears to be genuine vocabulary for Craig in technical contexts ("comprehensive test coverage", "comprehensive audit"). Suggested change: pull "comprehensive" out of the watch-list, or carve out a "watch in clusters, not solo" note that flags only when "comprehensive" co-occurs with other AI-tell words.
** Worth adding (corpus surfaces traits the rules don't capture)
-*Single-sentence paragraph cadence.* 41.1% of paragraphs are exactly one sentence. This is distinctive — most prose-style guides advise multi-sentence paragraphs. Suggested addition (prose + personal): a positive pattern noting "a one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short." Anti-rule against "merge short paragraphs into multi-sentence ones."
+*Single-sentence paragraph cadence.* 41.1% of paragraphs are exactly one sentence. This is distinctive. Most prose-style guides advise multi-sentence paragraphs. Suggested addition (prose + personal): a positive pattern noting "a one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short." Anti-rule against "merge short paragraphs into multi-sentence ones."
-*Parenthetical density.* 23.07 opening parens per 1000 words. Heavy parenthetical use — asides, clarifications, scope-narrowing in parens. Currently no rule addresses this either way. Could add a positive pattern: "parentheses for asides are part of the voice. Don't strip them in a 'clean prose' pass."
+*Parenthetical density.* 23.07 opening parens per 1000 words. Heavy parenthetical use covers asides, clarifications, and scope-narrowing in parens. Currently no rule addresses this either way. Could add a positive pattern: "parentheses for asides are part of the voice. Don't strip them in a 'clean prose' pass."
-*Question-mark rarity.* 0.33 per 1000. Craig's prose is declarative — he states things, rarely asks them. Worth noting as a register marker (when /voice personal output has questions, double-check whether they're contextual or AI rhetoric).
+*Question-mark rarity.* 0.33 per 1000. Craig's prose is declarative. He states things, rarely asks them. Worth noting as a register marker (when /voice personal output has questions, double-check whether they're contextual or AI rhetoric).
-** Out of corpus (commits don't test these — Phase 2 needed)
+** Out of corpus (commits don't test these, Phase 2 needed)
- *Pattern 13 in long-form prose.* Commit bodies are short. Email and PR bodies may show different em-dash rates.
- *Pattern 14 (boldface).* Org-mode bold uses =*word*=, not detectable by simple grep. Markdown bold rare in commits.
- *Pattern 16 (title case in headings).* Commits don't carry headings.
- *Pattern 19 (collaborative artifacts).* Not present in commit bodies.
-- *Pattern 35 (sentence split on conjunctions).* Average sentence is 18.81 words, median 14, with 28% of sentences 21+ words — long-sentence rate is moderate. Need to inspect actual sentences to know if they're conjunction-stitched. Defer.
+- *Pattern 35 (sentence split on conjunctions).* Average sentence is 18.81 words, median 14, with 28% of sentences 21+ words. Long-sentence rate is moderate. Need to inspect actual sentences to know if they're conjunction-stitched. Defer.
- *Pattern 36 (felt-experience cut).* Commit bodies wouldn't carry felt-experience prose. Email + journal corpus needed.
- *Pattern 37 (sentence fragments).* 9.7% of sentences are 1-5 words. Some are legitimate ("All eight pass."), some may be fragments. Can't tell from word-count alone. Defer to a pass that does syntactic detection.
-- *Pattern 39 (public-artifact scope).* The corpus IS the public artifacts — circular. Defer.
+- *Pattern 39 (public-artifact scope).* The corpus IS the public artifacts. The check is circular. Defer.
- *Pattern 40 (praise vs correction asymmetry).* Not detectable in commit bodies. Email or PR-review corpus needed.
** Curiosities
@@ -88,15 +88,15 @@ Six concrete edits to =voice/SKILL.md=, all of which can land independently:
5. *NEW pattern (prose + personal): "Parentheses for asides are part of the voice."* 23 opening parens per 1000 words. Heavy parenthetical use is distinctive. Don't strip parenthetical asides in a "clean prose" pass.
-6. *Register marker (advisory, not a rewrite rule): "Declarative is the default."* 0.33 question marks per 1000. Voice personal output that contains rhetorical questions should be checked — they're often AI rhetoric, not Craig's register.
+6. *Register marker (advisory, not a rewrite rule): "Declarative is the default."* 0.33 question marks per 1000. Voice personal output that contains rhetorical questions should be checked. They're often AI rhetoric, not Craig's register.
* What Phase 2 would add
-- Email corpus (gmail + cmail, sent-only, long-form) — different register, especially long-form prose flow.
-- PR bodies and review comments — longer prose, deliberate register, includes the praise/correction asymmetry test ground.
-- Slack messages — casual register, contraction rate, sentence-fragment rate.
-- Syntactic detection — distinguish fragments from terse complete sentences for pattern #37.
-- Long-form documents (résumé, proposals if any) — single register but high prose density.
+- Email corpus (gmail + cmail, sent-only, long-form): different register, especially long-form prose flow.
+- PR bodies and review comments: longer prose, deliberate register, includes the praise/correction asymmetry test ground.
+- Slack messages: casual register, contraction rate, sentence-fragment rate.
+- Syntactic detection: distinguish fragments from terse complete sentences for pattern #37.
+- Long-form documents (résumé, proposals if any): single register but high prose density.
* Per-pattern entries
@@ -922,7 +922,7 @@ General mode only. Prose and personal inherit it.
Replace business and conversational clichés with the plain meaning, including in casual register where "it's fine, it's casual" is the tell. Watch-list phrases: at the end of the day, moving forward, going forward, at this juncture, circle back, low-hanging fruit, deep dive, leverage (as verb), synergy, take it offline, ducks in a row, boil the ocean, pivot (corporate sense), keep it loose, keep it casual, touch base, circle up, hit the ground running, move the needle, on the same page, no-brainer, win-win.
*** Problem
-Clichés signal effortful prose without saying anything specific. Replace with the actual meaning. A casual, friendly, or conversational register is not a license to keep a cliché. Cut it there too. If you catch yourself justifying one as "it's fine, it's casual," that is the tell. Craig flagged this on 2026-05-22 when "keep it loose" slipped through as "acceptable casual" — exactly the miss this note prevents.
+Clichés signal effortful prose without saying anything specific. Replace with the actual meaning. A casual, friendly, or conversational register is not a license to keep a cliché. Cut it there too. If you catch yourself justifying one as "it's fine, it's casual," that is the tell. Craig flagged this on 2026-05-22 when "keep it loose" slipped through as "acceptable casual." That is exactly the miss this note prevents.
*** Basis
Observation-derived (Orwell, Garner).