| author | Craig Jennings <c@cjennings.net> | 2026-05-29 18:03:44 -0500 |
|---|---|---|
| committer | Craig Jennings <c@cjennings.net> | 2026-05-29 18:03:44 -0500 |
| commit | 1a795d6147ba15519e8cf934c1b764809f79567f (patch) | |
| tree | 9b0a587b109272594d87c0bb5bc73d617fa35382 /voice/references | |
| parent | 10d0bc1ac516b44625e44ee34a31b2207e5ce34e (diff) | |
docs(voice): scrub prose em-dashes from voice-profile.org
This hygiene sweep covers the profile's prose sections (Problem,
Basis, History, Phase 1 findings, Phase 2 list) where em-dashes had
carried over from the original SKILL.md text. 21 prose em-dashes were
replaced with context-appropriate punctuation (periods, colons,
parentheses, or rewords).
Eight em-dashes are preserved as legitimate exceptions: the literal
symbol reference in §13 Rule, §13 Before example (shows source text
with em-dashes), §36 Before example (felt-experience tic), §38
heading "Terse Cut — Rhetorical Padding" (paired with SKILL.md
heading verbatim), §39 Before example (WARN output format), §40
example dialogue (shows what a kind correction reads like).
The profile now follows its own rule for prose voice. The known
follow-up flagged in 10d0bc1's commit message is closed.
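A sweep like this one can be spot-checked mechanically. The sketch below is a minimal illustration, not the tooling used for this commit: the function names and the sample text are hypothetical. It lists the lines that still contain an em-dash and computes a per-1000-words rate of the kind the profile reports.

```python
EM_DASH = "\u2014"  # written as an escape so the script itself stays em-dash free


def em_dash_lines(text):
    """Return (line_number, line) pairs for every line that contains an em-dash."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]


def rate_per_thousand(text):
    """Em-dashes per 1000 whitespace-separated words (0.0 for empty text)."""
    words = len(text.split())
    return 1000 * text.count(EM_DASH) / words if words else 0.0


# Hypothetical sample: one line before the scrub, one line after it.
sample = (
    "This is real \u2014 not aspirational.\n"
    "This is real, not aspirational.\n"
)

hits = em_dash_lines(sample)
for number, line in hits:
    print(f"{number}: {line}")
print(f"{rate_per_thousand(sample):.2f} em-dashes per 1000 words")
```

Any line the listing still reports after a sweep should match one of the documented exceptions. A real check would read voice-profile.org from disk and would need its own rule for which sections count as exceptions. That rule is left out here because the commit message describes the exceptions only in prose.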
Diffstat (limited to 'voice/references')
| -rw-r--r-- | voice/references/voice-profile.org | 40 |
1 files changed, 20 insertions, 20 deletions
diff --git a/voice/references/voice-profile.org b/voice/references/voice-profile.org
index 867d8a1..d61abaa 100644
--- a/voice/references/voice-profile.org
+++ b/voice/references/voice-profile.org
@@ -1,4 +1,4 @@
-#+TITLE: Voice Profile — canonical source-of-truth for the voice skill
+#+TITLE: Voice Profile: canonical source-of-truth for the voice skill
 #+DATE: 2026-05-29
 #+SOURCE: rulesets session 2026-05-29
 
@@ -20,7 +20,7 @@ Git commit bodies authored by Craig Jennings across all repos under =~/code/= an
 - 128608 words, 912400 characters
 - 33 repos contributing; top sources: archsetup (703), rulesets (621), work (565), archangel (455), home (395)
 
-PRs deferred. Email + Slack deferred. This is one register (deliberate technical prose) — useful but narrow.
+PRs deferred. Email + Slack deferred. This is one register (deliberate technical prose). The view is useful but narrow.
 
 * Findings against the 41 SKILL.md patterns
 
@@ -32,42 +32,42 @@ PRs deferred. Email + Slack deferred. This is one register (deliberate technical
 
 *Pattern 22 (filler).* "moreover" / "furthermore" / "additionally" / "in conclusion": all zero or one occurrence. Filler-phrase avoidance confirmed.
 
-*Pattern 32 (first-person rewrite).* Standalone "I" at 3.85 per 1000 words. Craig writes first-person heavily — this is real, not aspirational.
+*Pattern 32 (first-person rewrite).* Standalone "I" at 3.85 per 1000 words. Craig writes first-person heavily. This is real, not aspirational.
 
 *Pattern 34 (contractions).* 459 contractions total (3.57 per 1000). Top hits: =doesn't= (92), =don't= (59), =isn't= (46), =it's= (43), =can't= (40), =that's= (34). Rule confirmed.
 
-*Pattern 38 (terse cut).* 41.1% of paragraphs are single-sentence. Craig writes terse — paragraph breaks land after one complete thought even when short. Confirmed indirectly via paragraph structure.
+*Pattern 38 (terse cut).* 41.1% of paragraphs are single-sentence. Craig writes terse. Paragraph breaks land after one complete thought even when short. Confirmed indirectly via paragraph structure.
 
 ** Aspirational (corpus contradicts, but the rule is intentional self-discipline)
 
-*Pattern 13 (em-dash zero-tolerance, personal mode).* Corpus rate: 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Craig USES em-dashes regularly in commit bodies — the rule overrides his habit, it doesn't reflect it. Suggested rewording: drop the "LLMs use em dashes more than humans" framing; keep the zero-tolerance directive but rationale becomes "Craig's published voice — commit messages going forward, PR bodies, emails — drops em-dashes by choice because it reads cleaner and avoids a common AI tell, regardless of his pre-rule habit." Honest about the source.
+*Pattern 13 (em-dash zero-tolerance, personal mode).* Corpus rate: 3.49 em-dashes per 1000 words. Comparable to AI-generated prose. Craig USES em-dashes regularly in commit bodies. The rule overrides his habit, it doesn't reflect it. Suggested rewording: drop the "LLMs use em dashes more than humans" framing; keep the zero-tolerance directive but rationale becomes "Craig's published voice (commit messages going forward, PR bodies, emails) drops em-dashes by choice because it reads cleaner and avoids a common AI tell, regardless of his pre-rule habit." Honest about the source.
 
 *Pattern 33 (semicolons → period/comma).* Corpus rate: 3.16 semicolons per 1000. Craig uses semicolons regularly. Same shape as #13: rule is self-discipline, not habit-reflection. Suggested rewording: acknowledge the rule overrides habit rather than implying it codifies one.
 
-These two rules are still valuable — em-dashes and semicolons both read cleaner when absent from short imperative-leaning prose. But the SKILL.md should say "this is a rule I've decided to follow," not "this is how I already write."
+These two rules are still valuable. Em-dashes and semicolons both read cleaner when absent from short imperative-leaning prose. But the SKILL.md should say "this is a rule I've decided to follow," not "this is how I already write."
 
 ** Worth challenging
 
-*Pattern 7 watch-word "comprehensive".* 42 occurrences in corpus (~0.33 per 1000). All other AI-tell watch-words clock near zero. "comprehensive" appears to be genuine vocabulary for Craig in technical contexts ("comprehensive test coverage", "comprehensive audit"). Suggested change: pull "comprehensive" out of the watch-list, or carve out a "watch in clusters, not solo" note — flag only when "comprehensive" co-occurs with other AI-tell words.
+*Pattern 7 watch-word "comprehensive".* 42 occurrences in corpus (~0.33 per 1000). All other AI-tell watch-words clock near zero. "comprehensive" appears to be genuine vocabulary for Craig in technical contexts ("comprehensive test coverage", "comprehensive audit"). Suggested change: pull "comprehensive" out of the watch-list, or carve out a "watch in clusters, not solo" note that flags only when "comprehensive" co-occurs with other AI-tell words.
 
 ** Worth adding (corpus surfaces traits the rules don't capture)
 
-*Single-sentence paragraph cadence.* 41.1% of paragraphs are exactly one sentence. This is distinctive — most prose-style guides advise multi-sentence paragraphs. Suggested addition (prose + personal): a positive pattern noting "a one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short." Anti-rule against "merge short paragraphs into multi-sentence ones."
+*Single-sentence paragraph cadence.* 41.1% of paragraphs are exactly one sentence. This is distinctive. Most prose-style guides advise multi-sentence paragraphs. Suggested addition (prose + personal): a positive pattern noting "a one-sentence paragraph is a finished thought, not a fragment. Break paragraphs after one complete thought when the next thought shifts angle, even if both are short." Anti-rule against "merge short paragraphs into multi-sentence ones."
 
-*Parenthetical density.* 23.07 opening parens per 1000 words. Heavy parenthetical use — asides, clarifications, scope-narrowing in parens. Currently no rule addresses this either way. Could add a positive pattern: "parentheses for asides are part of the voice. Don't strip them in a 'clean prose' pass."
+*Parenthetical density.* 23.07 opening parens per 1000 words. Heavy parenthetical use covers asides, clarifications, and scope-narrowing in parens. Currently no rule addresses this either way. Could add a positive pattern: "parentheses for asides are part of the voice. Don't strip them in a 'clean prose' pass."
 
-*Question-mark rarity.* 0.33 per 1000. Craig's prose is declarative — he states things, rarely asks them. Worth noting as a register marker (when /voice personal output has questions, double-check whether they're contextual or AI rhetoric).
+*Question-mark rarity.* 0.33 per 1000. Craig's prose is declarative. He states things, rarely asks them. Worth noting as a register marker (when /voice personal output has questions, double-check whether they're contextual or AI rhetoric).
 
-** Out of corpus (commits don't test these — Phase 2 needed)
+** Out of corpus (commits don't test these, Phase 2 needed)
 
 - *Pattern 13 in long-form prose.* Commit bodies are short. Email and PR bodies may show different em-dash rates.
 - *Pattern 14 (boldface).* Org-mode bold uses =*word*=, not detectable by simple grep. Markdown bold rare in commits.
 - *Pattern 16 (title case in headings).* Commits don't carry headings.
 - *Pattern 19 (collaborative artifacts).* Not present in commit bodies.
-- *Pattern 35 (sentence split on conjunctions).* Average sentence is 18.81 words, median 14, with 28% of sentences 21+ words — long-sentence rate is moderate. Need to inspect actual sentences to know if they're conjunction-stitched. Defer.
+- *Pattern 35 (sentence split on conjunctions).* Average sentence is 18.81 words, median 14, with 28% of sentences 21+ words. Long-sentence rate is moderate. Need to inspect actual sentences to know if they're conjunction-stitched. Defer.
 - *Pattern 36 (felt-experience cut).* Commit bodies wouldn't carry felt-experience prose. Email + journal corpus needed.
 - *Pattern 37 (sentence fragments).* 9.7% of sentences are 1-5 words. Some are legitimate ("All eight pass."), some may be fragments. Can't tell from word-count alone. Defer to a pass that does syntactic detection.
-- *Pattern 39 (public-artifact scope).* The corpus IS the public artifacts — circular. Defer.
+- *Pattern 39 (public-artifact scope).* The corpus IS the public artifacts. The check is circular. Defer.
 - *Pattern 40 (praise vs correction asymmetry).* Not detectable in commit bodies. Email or PR-review corpus needed.
 
 ** Curiosities
@@ -88,15 +88,15 @@ Six concrete edits to =voice/SKILL.md=, all of which can land independently:
 
 5. *NEW pattern (prose + personal): "Parentheses for asides are part of the voice."* 23 opening parens per 1000 words. Heavy parenthetical use is distinctive. Don't strip parenthetical asides in a "clean prose" pass.
 
-6. *Register marker (advisory, not a rewrite rule): "Declarative is the default."* 0.33 question marks per 1000. Voice personal output that contains rhetorical questions should be checked — they're often AI rhetoric, not Craig's register.
+6. *Register marker (advisory, not a rewrite rule): "Declarative is the default."* 0.33 question marks per 1000. Voice personal output that contains rhetorical questions should be checked. They're often AI rhetoric, not Craig's register.
 
 * What Phase 2 would add
 
-- Email corpus (gmail + cmail, sent-only, long-form) — different register, especially long-form prose flow.
-- PR bodies and review comments — longer prose, deliberate register, includes the praise/correction asymmetry test ground.
-- Slack messages — casual register, contraction rate, sentence-fragment rate.
-- Syntactic detection — distinguish fragments from terse complete sentences for pattern #37.
-- Long-form documents (résumé, proposals if any) — single register but high prose density.
+- Email corpus (gmail + cmail, sent-only, long-form): different register, especially long-form prose flow.
+- PR bodies and review comments: longer prose, deliberate register, includes the praise/correction asymmetry test ground.
+- Slack messages: casual register, contraction rate, sentence-fragment rate.
+- Syntactic detection: distinguish fragments from terse complete sentences for pattern #37.
+- Long-form documents (résumé, proposals if any): single register but high prose density.
 
 * Per-pattern entries
 
@@ -922,7 +922,7 @@ General mode only. Prose and personal inherit it.
 Replace business and conversational clichés with the plain meaning, including in casual register where "it's fine, it's casual" is the tell. Watch-list phrases: at the end of the day, moving forward, going forward, at this juncture, circle back, low-hanging fruit, deep dive, leverage (as verb), synergy, take it offline, ducks in a row, boil the ocean, pivot (corporate sense), keep it loose, keep it casual, touch base, circle up, hit the ground running, move the needle, on the same page, no-brainer, win-win.
 
 *** Problem
-Clichés signal effortful prose without saying anything specific. Replace with the actual meaning. A casual, friendly, or conversational register is not a license to keep a cliché. Cut it there too. If you catch yourself justifying one as "it's fine, it's casual," that is the tell. Craig flagged this on 2026-05-22 when "keep it loose" slipped through as "acceptable casual" — exactly the miss this note prevents.
+Clichés signal effortful prose without saying anything specific. Replace with the actual meaning. A casual, friendly, or conversational register is not a license to keep a cliché. Cut it there too. If you catch yourself justifying one as "it's fine, it's casual," that is the tell. Craig flagged this on 2026-05-22 when "keep it loose" slipped through as "acceptable casual." That is exactly the miss this note prevents.
 
 *** Basis
 Observation-derived (Orwell, Garner).
