#+TITLE: Rulesets — Open Work
#+AUTHOR: Craig Jennings
#+DATE: 2026-04-19

Tracking TODOs for the rulesets repo that span more than one commit. Project-scoped (not the global =~/sync/org/roam/inbox.org= list).

* TODO [#A] Build =/update-skills= skill for keeping forks in sync with upstream

The rulesets repo has a growing set of forks (=arch-decide= from wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py= from anthropics/skills/webapp-testing). Over time, upstream releases fixes, new templates, or scope expansions that we'd want to pull in without losing our local modifications. A skill should handle this deliberately rather than by manual re-cloning.

** Design decisions (agreed)
- *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON):
  - =url= (GitHub URL)
  - =ref= (branch or tag)
  - =subpath= (path inside the upstream repo when it's a monorepo)
  - =last_synced_commit= (updated on successful sync)
- *Local modifications:* 3-way merge. Requires a pristine baseline snapshot of the upstream-at-time-of-fork. Store under =.skill-upstream/baseline/= or similar, committed to the rulesets repo so the merge base is reproducible.
- *Apply changes:* the skill edits files directly, with per-file confirmation.
- *Conflict policy:* per-hunk prompt inside the skill. When a 3-way merge produces a conflict, the skill walks each conflicting hunk and asks Craig: keep-local / take-upstream / both / skip. Editor-independent; works on machines where Emacs isn't available. Fallback when the baseline is missing or corrupt (can't run a 3-way merge): write =.local=, =.upstream=, and =.baseline= files side-by-side and surface them for manual review.
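A minimal sketch of the classification and merge core the helper script would need; the function names here are hypothetical, not an agreed interface. =git merge-file= exits with the number of conflicts (negative on error), so a zero status signals a clean merge:

#+begin_src python
import subprocess

def classify(baseline: str, local: str, upstream: str) -> str:
    """Bucket one file by comparing its content at the three points."""
    local_changed = local != baseline
    upstream_changed = upstream != baseline
    if local_changed and upstream_changed:
        return "both-changed"
    if upstream_changed:
        return "upstream-only"
    if local_changed:
        return "local-only"
    return "unchanged"

def merge_three_way(local_path: str, base_path: str, upstream_path: str):
    """Run git merge-file on three file paths; returns (merged_text, clean)."""
    result = subprocess.run(
        ["git", "merge-file", "--stdout", local_path, base_path, upstream_path],
        capture_output=True, text=True,
    )
    return result.stdout, result.returncode == 0
#+end_src

If the merge is clean, the result can be written directly; otherwise the conflict-marker output is what feeds the per-hunk prompt loop.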
** V1 Scope
- [ ] Skill at =~/code/rulesets/update-skills/=
- [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests
- [ ] Helper script (bash or python) to:
  - Clone each upstream at =ref= shallowly into =/tmp/=
  - Compare current skill state vs latest upstream vs stored baseline
  - Classify each file: =unchanged= / =upstream-only= / =local-only= / =both-changed=
  - For =both-changed=: run =git merge-file --stdout=; if clean, write the result directly; if conflicts, parse the conflict-marker output and feed each hunk into the per-hunk prompt loop
- [ ] Per-hunk prompt loop:
  - Show base / local / upstream side-by-side for each conflicting hunk
  - Ask: keep-local / take-upstream / both (concatenate) / skip (leave marker)
  - Assemble resolved hunks into the final file content
- [ ] Per-fork summary output with file-level classification table
- [ ] Per-file confirmation flow (yes / no / show-diff) BEFORE the per-hunk loop
- [ ] On successful sync: update =last_synced_commit= in the manifest
- [ ] =--dry-run= to preview without writing

** V2+ (deferred)
- [ ] Track upstream *releases* (tags), not just branches, so the skill can propose "upgrade from v1.2 to v1.3" with release notes pulled in
- [ ] Generate patch files as an alternative apply method (for users who prefer =git apply= / =patch= over in-place edits)
- [ ] Non-interactive mode (=--non-interactive= / CI): skip conflict resolution, emit side-by-side files for later manual review
- [ ] Auto-run on a schedule via a Claude Code background agent
- [ ] Summary of aggregate upstream activity across all forks (which forks have upstream changes waiting, which don't)
- [ ] Optional editor integration: on machines with Emacs, offer =M-x smerge-ediff= as an alternate path for users who prefer ediff over per-hunk prompts

** Initial forks to enumerate (for manifest bootstrap)
- [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT
- [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT
- [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0

** Open questions
- [ ] What happens when upstream *renames* a file we fork? The skill would see "file gone from upstream, still present locally" — drop, keep, or prompt?
- [ ] What happens when upstream splits into multiple forks (e.g., a plugin reshuffles its structure)? Probably out of scope for v1; manual migration.
- [ ] Rate-limit / offline mode: if GitHub is unreachable, should the skill fail or degrade gracefully? Likely degrade; print a warning per fork.

* TODO [#B] Build =/research-writer= — clean-room synthesis for research-backed long-form
SCHEDULED: <2026-05-15 Fri>

Gap in current rulesets: between =brainstorm= (idea refinement → design doc) and =arch-document= (arc42 technical docs), there's no skill for research-backed long-form prose — blog posts, essays, white papers, proposals with data backing, article-length content with citations. Craig writes documents across many contexts (defense-contractor work, personal, technical, proposals). The gap is real.

*Evaluated 2026-04-19:* ComposioHQ/awesome-claude-skills has a =content-research-writer= skill (540 lines, 14 KB) that attempts this.
*Not adopting:*
- Parent repo has no LICENSE file — reuse is legally ambiguous
- Bloated: 540 lines of prose-scaffolding with no tooling
- No citation-style enforcement (APA/Chicago/IEEE/MLA)
- No source-quality heuristics (primary vs secondary, peer review, recency)
- Fictional example citations in the skill itself (it models the very hallucination failure mode a citation-focused skill should prevent)
- No citation-verification step
- Overlaps with =humanizer= at the polish stage, with no guidance on how the two compose

*Patterns worth lifting clean-room (from their better parts):*
- Folder convention =~/writing//= with =outline.md=, =research.md=, versioned drafts, =sources/=
- Section-by-section feedback loop (outline validated → per-section research validated → per-section draft validated)
- Hook-alternatives pattern (generate three hook variants with rationale)

*Additions for the clean-room version (v1):*
- Citation-style selection (APA / Chicago / MLA / IEEE / custom) with style-specific examples and a pick-one step up front
- Source-quality heuristics: primary > secondary; peer-reviewed; recency thresholds by domain; publisher reputation; funding transparency
- Citation-verification discipline: fetch real sources, never fabricate, mark unverifiable claims with =[citation needed]= rather than inventing
- Composition hand-off to =/humanizer= at the polish stage
- Classification awareness: if the working directory or context signals defense / regulated territory, flag any sentence that might touch CUI or classified material before emission

*Target:* ~150-200 lines, clean-room per blanket policy.

*When to build:* wait for a real research-writing task to validate the design against actual document patterns. Building preemptively risks tuning for my guess at Craig's workflow rather than his real one.
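The source-quality heuristics listed above could start as something this small. A toy sketch: the weights, category names, and per-domain recency thresholds are all illustrative assumptions, not agreed values:

#+begin_src python
from datetime import date

# Illustrative per-domain recency thresholds, in years (assumption).
RECENCY_YEARS = {"software": 3, "medicine": 5, "history": 50}

def source_score(kind: str, peer_reviewed: bool, pub_year: int,
                 domain: str = "software", today: date | None = None) -> int:
    """Toy ranking: primary beats secondary; peer review and recency add points."""
    today = today or date.today()
    score = 2 if kind == "primary" else 1
    if peer_reviewed:
        score += 2
    if today.year - pub_year <= RECENCY_YEARS.get(domain, 5):
        score += 1
    return score
#+end_src

Even a crude score like this gives the skill a deterministic way to rank candidate sources before asking for a human call on borderline ones.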
Triggers that would prompt "let's build it now":
- Starting a white paper / proposal that needs citation discipline
- Writing a technical blog post with external references
- A pattern of hitting the same research-writing friction 3+ times

Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills =content-research-writer/SKILL.md=.

* TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need
SCHEDULED: <2026-05-15 Fri>

=Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python CLI + MCP server that ingests 18 source types (docs sites, PDFs, GitHub repos, YouTube videos, Confluence, Notion, OpenAPI specs, etc.) and exports to 20+ AI targets, including Claude skills. MIT licensed, 12.9k stars, active as of 2026-04-12.

*Evaluated 2026-04-19 — not adopted for rulesets.* Generates *reference-style* skills (encyclopedic dumps of scraped source material), not *operational* skills (opinionated how-we-do-things content). Doesn't fit the rulesets curation pattern.

*Next-trigger experiment (this TODO):* the next time a DeepSat task needs Claude briefed deeply on a specific library, API, or docs site — try:

#+begin_src bash
pip install skill-seekers
skill-seekers create --target claude
#+end_src

Measure output quality vs hand-curated briefing. If usable, consider installing as a persistent tool. If output is bloated / under-structured, discard and stick with hand briefing.

*Candidate first experiments (pick one from an actual need, don't invent):*
- A Django ORM reference skill scoped to the version DeepSat pins
- An OpenAPI-to-skill conversion for a partner-vendor API
- A React hooks reference skill for the frontend team's current patterns
- A specific AWS service's docs (e.g. GovCloud-flavored)

*Patterns worth borrowing into rulesets even without adopting the tool:*
- Enhancement-via-agent pipeline (scrape raw → LLM pass → structured SKILL.md). Applicable if we ever build internal-docs-to-skill tooling.
- Multi-target export abstraction (one knowledge extraction → many output formats). Clean design for any future multi-AI-tool workflow.

*Concerns to verify on actual use:*
- =LICENSE= has an unfilled =[Your Name/Username]= placeholder (MIT is unambiguous, but sloppy for a 12k-star project)
- Default branch is =development=, not =main= — pin with care
- Heavy commercialization signals (website at skillseekersweb.com, Trendshift promo, branded badges) — the license might shift later; watch
- Companion =skill-seekers-configs= community repo has only 8 stars despite the main repo's 12.9k — the ecosystem is thinner than the headline adoption suggests
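The multi-target export abstraction noted above is worth a concrete shape even if the tool is never adopted. A registry sketch, where the target names and the record shape are invented for illustration:

#+begin_src python
from typing import Callable

# One extraction, many targets: exporters register themselves by name.
EXPORTERS: dict[str, Callable[[dict], str]] = {}

def exporter(name: str):
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        EXPORTERS[name] = fn
        return fn
    return register

@exporter("claude-skill")
def to_claude_skill(doc: dict) -> str:
    return f"# {doc['title']}\n\n{doc['body']}"

@exporter("plain-notes")
def to_plain_notes(doc: dict) -> str:
    return f"{doc['title']}: {doc['body']}"

def export_all(doc: dict) -> dict[str, str]:
    """Render one extracted document into every registered target format."""
    return {name: render(doc) for name, render in EXPORTERS.items()}
#+end_src

Adding a new AI target is then one decorated function; the extraction side never changes.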