aboutsummaryrefslogtreecommitdiff
path: root/review-code
diff options
context:
space:
mode:
authorCraig Jennings <c@cjennings.net>2026-06-09 14:15:52 -0500
committerCraig Jennings <c@cjennings.net>2026-06-09 14:15:52 -0500
commit5bd8247acf4757e0868883ed65d395fec3fb7a0c (patch)
tree331eaf0f67837b32c1c220a2aa8c1b2ad5b003ff /review-code
parent75ab07df65351f9c96738cd7efe5edacc523ac80 (diff)
downloadrulesets-5bd8247acf4757e0868883ed65d395fec3fb7a0c.tar.gz
rulesets-5bd8247acf4757e0868883ed65d395fec3fb7a0c.zip
feat(review-code): gate deep-dive checks on a project review profile
The skill ran one flat criteria set on every change, so a concurrent, DB-backed service got the same review as a shell script, and the checks that matter for the service (concurrency, performance, auth, API compatibility) weren't in the set at all. Phase 3.5 splits the checks into a universal core that always runs plus nine modules that activate on the project's review profile. A project declares the profile with a "Review profile:" line in its CLAUDE.md. Absent that, the skill auto-detects from framework, ORM, concurrency, and manifest signals and recommends declaring one. The security and dependency modules defer their depth to /security-check.
Diffstat (limited to 'review-code')
-rw-r--r--review-code/SKILL.md78
1 files changed, 77 insertions, 1 deletions
diff --git a/review-code/SKILL.md b/review-code/SKILL.md
index cec6b54..ec08a9d 100644
--- a/review-code/SKILL.md
+++ b/review-code/SKILL.md
@@ -1,6 +1,6 @@
---
name: review-code
-description: Review code changes against engineering standards. Accepts a PR number, a SHA range (BASE..HEAD), the current branch's diff against main, staged changes, or a described scope ("the last 3 commits"). Audits CLAUDE.md adherence (reads root + per-directory CLAUDE.md), intent-vs-delivery (when given a plan/ADR/ticket), security, testing (TDD evidence + three-category coverage), conventions (conventional commits + no AI attribution), root-cause discipline, architecture (layering + scope), API contracts. Produces a structured report — Strengths, per-criterion PASS/WARN/FAIL, per-issue Critical/Important/Minor severity — ending with an explicit verdict (Approve / Request Changes / Needs Discussion) plus 1-2 sentence reasoning. Self-filters low-confidence findings; never flags pre-existing issues, lint/typecheck issues (CI handles those), or changes on unmodified lines. Use before merging a PR, before pushing a branch, or when reviewing a teammate's work. Do NOT use for proposing features (use brainstorm or arch-design), drafting implementation (use start-work or add-tests), standalone security audits (use security-check), or narrow style-only checks (a linter handles those).
+description: Review code changes against engineering standards. Accepts a PR number, a SHA range (BASE..HEAD), the current branch's diff against main, staged changes, or a described scope ("the last 3 commits"). Audits CLAUDE.md adherence (reads root + per-directory CLAUDE.md), intent-vs-delivery (when given a plan/ADR/ticket), security, testing (TDD evidence + three-category coverage), conventions (conventional commits + no AI attribution), root-cause discipline, architecture (layering + scope), API contracts, and conditional deep dives (security/OWASP, auth, concurrency, performance, API compatibility, dependencies) gated by a project's declared or auto-detected review profile. Produces a structured report — Strengths, per-criterion PASS/WARN/FAIL, per-issue Critical/Important/Minor severity — ending with an explicit verdict (Approve / Request Changes / Needs Discussion) plus 1-2 sentence reasoning. Self-filters low-confidence findings; never flags pre-existing issues, lint/typecheck issues (CI handles those), or changes on unmodified lines. Use before merging a PR, before pushing a branch, or when reviewing a teammate's work. Do NOT use for proposing features (use brainstorm or arch-design), drafting implementation (use start-work or add-tests), standalone security audits (use security-check), or narrow style-only checks (a linter handles those).
---
# Review Code
@@ -59,6 +59,7 @@ If skipping, report: "Skipped — <reason>" and stop.
- Root project `CLAUDE.md` if present
- Any `CLAUDE.md` in directories whose files the diff modified
- Their paths go into the audit; their content guides the CLAUDE.md adherence criterion
+ - While reading the root `CLAUDE.md`, capture any `Review profile:` line — it drives the conditional deep dives in Phase 3.5.
3. **If intent context was provided**, fetch and read it. Extract: stated goal, scope, non-goals, acceptance criteria, linked ADRs.
@@ -165,6 +166,78 @@ Skip if no intent context; note "not evaluated" in the report.
- Client-side types match server-side output
- Data flows through the API layer, not direct data access from handlers
+## Phase 3.5 — Conditional Deep Dives (project-profile gated)
+
+Phase 3's criteria are *universal* — they run on every review. The deep dives below are *conditional*: each activates only when the project fits its criterion. A CLI tool and a concurrent, DB-backed service share the same universal core but earn different deep dives. Running every module on every project buries real findings in irrelevant checklist noise; gating them keeps the review proportional to what the code actually does.
+
+### Resolving the project profile
+
+Decide which modules to run — declared overrides detected:
+
+1. **Declared profile (preferred).** Read the root `CLAUDE.md` (captured in Phase 1) for a `Review profile:` line — a comma-separated list of profile tags:
+
+ ```
+ Review profile: service, db-backed, concurrent, public-api
+ ```
+
+ A declared profile is authoritative: run exactly the modules its tags name, plus the universal core. Recognized tags map to the modules in the table below.
+
+2. **Auto-detect (fallback).** When no `Review profile:` line exists, infer the profile from signals in the repo and diff:
+ - web/RPC framework imports or route definitions (Flask, FastAPI, Express, gRPC, Rails, …) → `service`, `public-api`
+ - ORM/DB-driver imports, a `migrations/` dir, or raw SQL → `db-backed`
+ - threading / async / goroutine / lock primitives, shared mutable state → `concurrent`
+ - a touched dependency manifest or lockfile → `dependencies`
+ - code that stores or transmits PII, credentials, financial or health data → `sensitive-data`
+ - authn/authz present (login, sessions, permission/role checks) → `auth`
+ - none of the above (CLI, scripts, config, a single-purpose library) → universal-only
+
+ Auto-detect is a safety net, not the intended path. When it fires, name the inferred profile in the report and recommend the project add an explicit `Review profile:` line so future reviews don't re-derive it.
+
+### Tag → module map
+
+| Profile tag | Activated module(s) |
+|-------------|---------------------|
+| `service` | Input Validation & OWASP; Error Handling (deep) |
+| `public-api` | API Compatibility |
+| `auth` | Auth & Authorization |
+| `sensitive-data` | Sensitive-Data & Crypto |
+| `db-backed` | Performance: Database |
+| `perf-sensitive` | Performance: Code Efficiency & Scaling |
+| `concurrent` | Concurrency |
+| `dependencies` | Dependencies |
+
+Run a module's checks only when its tag is active. Report every activated module in the Phase 5 Per-Criterion Audit; mark unactivated modules `N/A (profile tag not set)` so the report shows what was and wasn't in scope — a silent omission reads as "checked and clean."
+
+### The modules
+
+**Input Validation & OWASP** (`service`) — the code ingests untrusted/external input.
+- Input validated and sanitized at the boundary; SQL parameterized (no string concatenation); output encoded against XSS; file uploads validated (type, size, content); path traversal prevented; command-injection and unsafe-deserialization vectors closed; HTTP security headers present.
+- This is the surface flag, not the full audit — hand the diff to `/security-check` for the deep OWASP Top 10 pass rather than duplicating it here.
+
+**Error Handling (deep)** (`service`) — a production/long-running service with availability needs. (Baseline "don't swallow exceptions, fail safe" stays universal in Phase 2.)
+- Exceptions caught at the right level; system denies on error (fail safe); transactions rolled back on failure; correlation IDs for cross-service tracing; log levels appropriate; critical failures trigger alerts.
+
+**API Compatibility** (`public-api`) — the change touches a published API surface (REST/gRPC endpoints, a library's exported symbols, a wire format other code consumes).
+- Classify each change non-breaking vs breaking. Non-breaking: added optional fields, new endpoints, required→optional, new enum values clients tolerate. Breaking: removed fields/endpoints, changed types, optional→required, changed semantics or status codes. For any breaking change: version bumped, migration path documented, deprecation warnings added, transition period planned.
+
+**Auth & Authorization** (`auth`) — login, sessions, permission/role checks, multi-tenant data.
+- Auth enforced at every access point; object-level and function-level authorization (one user can't reach another's records, privileged endpoints require a role); session management secure; CSRF protection on state-changing operations; rate limiting where appropriate.
+
+**Sensitive-Data & Crypto** (`sensitive-data`) — stores or transmits PII, credentials, financial or health data.
+- Sensitive data encrypted at rest and in transit; error messages don't leak system internals; logs exclude sensitive values; no weak or absent crypto (no hardcoded keys, no plaintext storage).
+
+**Performance: Database** (`db-backed`) — the change touches a data layer (ORM/query code, migrations, schema).
+- No N+1 queries; appropriate indexes exist; explicit columns (no `SELECT *`); pagination on large result sets; connection pooling used.
+
+**Performance: Code Efficiency & Scaling** (`perf-sensitive`) — a hot/latency-sensitive path, large-data processing, or a stated throughput/scale SLA.
+- No needless loops or iterations; right data structures (map vs list lookup); expensive work cached where it pays; streaming over load-all for large collections; resources closed (connections, file handles); works at 10x/100x the data; async where beneficial; no blocking calls in hot paths.
+
+**Concurrency** (`concurrent`) — threads, async/await, goroutines, shared mutable state, locks, or parallel workers.
+- Shared mutable state synchronized; no race conditions; lazy initialization thread-safe; getters don't return mutable internal objects; thread-safe collections where needed; consistent lock ordering (no deadlocks); loop variables captured into closures correctly (not by reference); timeouts on blocking operations; tests run with the race detector where the language has one.
+
+**Dependencies** (`dependencies`) — the change adds or bumps an external dependency (manifest/lockfile touched).
+- Actually necessary (not just convenience); from a trusted, actively-maintained source; no known vulnerabilities; license compatible; doesn't bloat the dependency tree; version pinned in the lockfile. Route the vulnerability scan through `/security-check` (it runs `pip-audit` / `npm audit` / osv).
+
## Phase 4 — Filter and Categorize
### Confidence Filter
@@ -316,6 +389,9 @@ Meta-level suggestions: process changes, follow-up tickets, architectural drift
| Root Cause & Thoroughness | PASS | Empty inventory, Unicode, large batches all covered |
| Architecture | PASS | Handler thin; logic in inventory service |
| API Contracts | PASS | Schema defined; TS types match in client/ |
+| Performance: Database | PASS | Export streams; no N+1 (profile: `db-backed`) |
+| API Compatibility | PASS | New endpoint, no breaking change (profile: `public-api`) |
+| Concurrency | N/A | profile tag `concurrent` not set |
## Issues