#+TITLE: Rulesets — Open Work
#+AUTHOR: Craig Jennings
#+DATE: 2026-04-19

Tracking TODOs for the rulesets repo that span more than one commit. Project-scoped (not the global =~/sync/org/roam/inbox.org= list).

* TODO [#A] Build =/update-skills= skill for keeping forks in sync with upstream

The rulesets repo has a growing set of forks (=arch-decide= from wshobson/agents, =playwright-js= from lackeyjb/playwright-skill, =playwright-py= from anthropics/skills/webapp-testing). Over time, upstream releases fixes, new templates, or scope expansions that we'd want to pull in without losing our local modifications. A skill should handle this deliberately rather than by manual re-cloning.

** Design decisions (agreed)
- *Upstream tracking:* per-fork manifest =.skill-upstream= (YAML or JSON):
  - =url= (GitHub URL)
  - =ref= (branch or tag)
  - =subpath= (path inside the upstream repo when it's a monorepo)
  - =last_synced_commit= (updated on successful sync)
- *Local modifications:* 3-way merge. Requires a pristine baseline snapshot of the upstream-at-time-of-fork. Store under =.skill-upstream/baseline/= or similar, committed to the rulesets repo so the merge base is reproducible.
- *Apply changes:* the skill edits files directly, with per-file confirmation.
- *Conflict policy:* per-hunk prompt inside the skill. When a 3-way merge produces a conflict, the skill walks each conflicting hunk and asks Craig: keep-local / take-upstream / both / skip. Editor-independent; works on machines where Emacs isn't available. Fallback when the baseline is missing or corrupt (can't run a 3-way merge): write =.local=, =.upstream=, and =.baseline= files side-by-side and surface them for manual review.
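A minimal sketch of the classification and merge core the helper script would need; the function names here are hypothetical, not an agreed interface. =git merge-file= exits with the number of conflicts (negative on error), so a zero status signals a clean merge:

#+begin_src python
import subprocess

def classify(baseline: str, local: str, upstream: str) -> str:
    """Bucket one file by comparing its content at the three points."""
    local_changed = local != baseline
    upstream_changed = upstream != baseline
    if local_changed and upstream_changed:
        return "both-changed"
    if upstream_changed:
        return "upstream-only"
    if local_changed:
        return "local-only"
    return "unchanged"

def merge_three_way(local_path: str, base_path: str, upstream_path: str):
    """Run git merge-file on three file paths; returns (merged_text, clean)."""
    result = subprocess.run(
        ["git", "merge-file", "--stdout", local_path, base_path, upstream_path],
        capture_output=True, text=True,
    )
    return result.stdout, result.returncode == 0
#+end_src

If the merge is clean, the result can be written directly; otherwise the conflict-marker output is what feeds the per-hunk prompt loop.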
** V1 Scope
- [ ] Skill at =~/code/rulesets/update-skills/=
- [ ] Discovery: scan sibling skill dirs for =.skill-upstream= manifests
- [ ] Helper script (bash or python) to:
  - Clone each upstream at =ref= shallowly into =/tmp/=
  - Compare current skill state vs latest upstream vs stored baseline
  - Classify each file: =unchanged= / =upstream-only= / =local-only= / =both-changed=
  - For =both-changed=: run =git merge-file --stdout=; if clean, write the result directly; if conflicts, parse the conflict-marker output and feed each hunk into the per-hunk prompt loop
- [ ] Per-hunk prompt loop:
  - Show base / local / upstream side-by-side for each conflicting hunk
  - Ask: keep-local / take-upstream / both (concatenate) / skip (leave marker)
  - Assemble resolved hunks into the final file content
- [ ] Per-fork summary output with file-level classification table
- [ ] Per-file confirmation flow (yes / no / show-diff) BEFORE the per-hunk loop
- [ ] On successful sync: update =last_synced_commit= in the manifest
- [ ] =--dry-run= to preview without writing

** V2+ (deferred)
- [ ] Track upstream *releases* (tags), not just branches, so the skill can propose "upgrade from v1.2 to v1.3" with release notes pulled in
- [ ] Generate patch files as an alternative apply method (for users who prefer =git apply= / =patch= over in-place edits)
- [ ] Non-interactive mode (=--non-interactive= / CI): skip conflict resolution, emit side-by-side files for later manual review
- [ ] Auto-run on a schedule via a Claude Code background agent
- [ ] Summary of aggregate upstream activity across all forks (which forks have upstream changes waiting, which don't)
- [ ] Optional editor integration: on machines with Emacs, offer =M-x smerge-ediff= as an alternate path for users who prefer ediff over per-hunk prompts

** Initial forks to enumerate (for manifest bootstrap)
- [ ] =arch-decide= → =wshobson/agents= :: =plugins/documentation-generation/skills/architecture-decision-records= :: MIT
- [ ] =playwright-js= → =lackeyjb/playwright-skill= :: =skills/playwright-skill= :: MIT
- [ ] =playwright-py= → =anthropics/skills= :: =skills/webapp-testing= :: Apache-2.0

** Open questions
- [ ] What happens when upstream *renames* a file we fork? The skill would see "file gone from upstream, still present locally" — drop, keep, or prompt?
- [ ] What happens when upstream splits into multiple forks (e.g., a plugin reshuffles its structure)? Probably out of scope for v1; manual migration.
- [ ] Rate-limit / offline mode: if GitHub is unreachable, should the skill fail or degrade gracefully? Likely degrade; print a warning per fork.

* TODO [#B] Build =/research-writer= — clean-room synthesis for research-backed long-form
SCHEDULED: <2026-05-15 Fri>

Gap in current rulesets: between =brainstorm= (idea refinement → design doc) and =arch-document= (arc42 technical docs), there's no skill for research-backed long-form prose — blog posts, essays, white papers, proposals with data backing, article-length content with citations. Craig writes documents across many contexts (defense-contractor work, personal, technical, proposals). The gap is real.

*Evaluated 2026-04-19:* ComposioHQ/awesome-claude-skills has a =content-research-writer= skill (540 lines, 14 KB) that attempts this.
*Not adopting:*
- Parent repo has no LICENSE file — reuse is legally ambiguous
- Bloated: 540 lines of prose-scaffolding with no tooling
- No citation-style enforcement (APA/Chicago/IEEE/MLA)
- No source-quality heuristics (primary vs secondary, peer review, recency)
- Fictional example citations in the skill itself (it models the very hallucination failure mode a citation-focused skill should prevent)
- No citation-verification step
- Overlaps with =humanizer= at the polish stage, with no guidance on how the two compose

*Patterns worth lifting clean-room (from their better parts):*
- Folder convention =~/writing//= with =outline.md=, =research.md=, versioned drafts, =sources/=
- Section-by-section feedback loop (outline validated → per-section research validated → per-section draft validated)
- Hook-alternatives pattern (generate three hook variants with rationale)

*Additions for the clean-room version (v1):*
- Citation-style selection (APA / Chicago / MLA / IEEE / custom) with style-specific examples and a pick-one step up front
- Source-quality heuristics: primary > secondary; peer-reviewed; recency thresholds by domain; publisher reputation; funding transparency
- Citation-verification discipline: fetch real sources, never fabricate, mark unverifiable claims with =[citation needed]= rather than inventing
- Composition hand-off to =/humanizer= at the polish stage
- Classification awareness: if the working directory or context signals defense / regulated territory, flag any sentence that might touch CUI or classified material before emission

*Target:* ~150-200 lines, clean-room per blanket policy.

*When to build:* wait for a real research-writing task to validate the design against actual document patterns. Building preemptively risks tuning for my guess at Craig's workflow rather than his real one.
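The source-quality heuristics listed above could start as something this small. A toy sketch: the weights, category names, and per-domain recency thresholds are all illustrative assumptions, not agreed values:

#+begin_src python
from datetime import date

# Illustrative per-domain recency thresholds, in years (assumption).
RECENCY_YEARS = {"software": 3, "medicine": 5, "history": 50}

def source_score(kind: str, peer_reviewed: bool, pub_year: int,
                 domain: str = "software", today: date | None = None) -> int:
    """Toy ranking: primary beats secondary; peer review and recency add points."""
    today = today or date.today()
    score = 2 if kind == "primary" else 1
    if peer_reviewed:
        score += 2
    if today.year - pub_year <= RECENCY_YEARS.get(domain, 5):
        score += 1
    return score
#+end_src

Even a crude score like this gives the skill a deterministic way to rank candidate sources before asking for a human call on borderline ones.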
Triggers that would prompt "let's build it now":
- Starting a white paper / proposal that needs citation discipline
- Writing a technical blog post with external references
- A pattern of hitting the same research-writing friction 3+ times

Upstream reference (do not vendor): ComposioHQ/awesome-claude-skills =content-research-writer/SKILL.md=.

* TODO [#C] Try Skill Seekers on a real DeepSat docs-briefing need
SCHEDULED: <2026-05-15 Fri>

=Skill Seekers= ([[https://github.com/yusufkaraaslan/Skill_Seekers]]) is a Python CLI + MCP server that ingests 18 source types (docs sites, PDFs, GitHub repos, YouTube videos, Confluence, Notion, OpenAPI specs, etc.) and exports to 20+ AI targets, including Claude skills. MIT licensed, 12.9k stars, active as of 2026-04-12.

*Evaluated 2026-04-19 — not adopted for rulesets.* Generates *reference-style* skills (encyclopedic dumps of scraped source material), not *operational* skills (opinionated how-we-do-things content). Doesn't fit the rulesets curation pattern.

*Next-trigger experiment (this TODO):* the next time a DeepSat task needs Claude briefed deeply on a specific library, API, or docs site — try:

#+begin_src bash
pip install skill-seekers
skill-seekers create --target claude
#+end_src

Measure output quality vs hand-curated briefing. If usable, consider installing as a persistent tool. If output is bloated / under-structured, discard and stick with hand briefing.

*Candidate first experiments (pick one from an actual need, don't invent):*
- A Django ORM reference skill scoped to the version DeepSat pins
- An OpenAPI-to-skill conversion for a partner-vendor API
- A React hooks reference skill for the frontend team's current patterns
- A specific AWS service's docs (e.g. GovCloud-flavored)

*Patterns worth borrowing into rulesets even without adopting the tool:*
- Enhancement-via-agent pipeline (scrape raw → LLM pass → structured SKILL.md). Applicable if we ever build internal-docs-to-skill tooling.
- Multi-target export abstraction (one knowledge extraction → many output formats). Clean design for any future multi-AI-tool workflow.

*Concerns to verify on actual use:*
- =LICENSE= has an unfilled =[Your Name/Username]= placeholder (MIT is unambiguous, but sloppy for a 12k-star project)
- Default branch is =development=, not =main= — pin with care
- Heavy commercialization signals (website at skillseekersweb.com, Trendshift promo, branded badges) — the license might shift later; watch
- Companion =skill-seekers-configs= community repo has only 8 stars despite the main repo's 12.9k — the ecosystem is thinner than the headline adoption suggests
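The multi-target export abstraction noted above is worth a concrete shape even if the tool is never adopted. A registry sketch, where the target names and the record shape are invented for illustration:

#+begin_src python
from typing import Callable

# One extraction, many targets: exporters register themselves by name.
EXPORTERS: dict[str, Callable[[dict], str]] = {}

def exporter(name: str):
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        EXPORTERS[name] = fn
        return fn
    return register

@exporter("claude-skill")
def to_claude_skill(doc: dict) -> str:
    return f"# {doc['title']}\n\n{doc['body']}"

@exporter("plain-notes")
def to_plain_notes(doc: dict) -> str:
    return f"{doc['title']}: {doc['body']}"

def export_all(doc: dict) -> dict[str, str]:
    """Render one extracted document into every registered target format."""
    return {name: render(doc) for name, render in EXPORTERS.items()}
#+end_src

Adding a new AI target is then one decorated function; the extraction side never changes.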