diff options
| -rw-r--r-- | review-code/SKILL.md | 78 |
1 files changed, 77 insertions, 1 deletions
diff --git a/review-code/SKILL.md b/review-code/SKILL.md index cec6b54..ec08a9d 100644 --- a/review-code/SKILL.md +++ b/review-code/SKILL.md @@ -1,6 +1,6 @@ --- name: review-code -description: Review code changes against engineering standards. Accepts a PR number, a SHA range (BASE..HEAD), the current branch's diff against main, staged changes, or a described scope ("the last 3 commits"). Audits CLAUDE.md adherence (reads root + per-directory CLAUDE.md), intent-vs-delivery (when given a plan/ADR/ticket), security, testing (TDD evidence + three-category coverage), conventions (conventional commits + no AI attribution), root-cause discipline, architecture (layering + scope), API contracts. Produces a structured report — Strengths, per-criterion PASS/WARN/FAIL, per-issue Critical/Important/Minor severity — ending with an explicit verdict (Approve / Request Changes / Needs Discussion) plus 1-2 sentence reasoning. Self-filters low-confidence findings; never flags pre-existing issues, lint/typecheck issues (CI handles those), or changes on unmodified lines. Use before merging a PR, before pushing a branch, or when reviewing a teammate's work. Do NOT use for proposing features (use brainstorm or arch-design), drafting implementation (use start-work or add-tests), standalone security audits (use security-check), or narrow style-only checks (a linter handles those). +description: Review code changes against engineering standards. Accepts a PR number, a SHA range (BASE..HEAD), the current branch's diff against main, staged changes, or a described scope ("the last 3 commits"). Audits CLAUDE.md adherence (reads root + per-directory CLAUDE.md), intent-vs-delivery (when given a plan/ADR/ticket), security, testing (TDD evidence + three-category coverage), conventions (conventional commits + no AI attribution), root-cause discipline, architecture (layering + scope), API contracts, and conditional deep dives (security/OWASP, auth, concurrency, performance, API compatibility, dependencies) gated by a project's declared or auto-detected review profile. Produces a structured report — Strengths, per-criterion PASS/WARN/FAIL, per-issue Critical/Important/Minor severity — ending with an explicit verdict (Approve / Request Changes / Needs Discussion) plus 1-2 sentence reasoning. Self-filters low-confidence findings; never flags pre-existing issues, lint/typecheck issues (CI handles those), or changes on unmodified lines. Use before merging a PR, before pushing a branch, or when reviewing a teammate's work. Do NOT use for proposing features (use brainstorm or arch-design), drafting implementation (use start-work or add-tests), standalone security audits (use security-check), or narrow style-only checks (a linter handles those). --- # Review Code @@ -59,6 +59,7 @@ If skipping, report: "Skipped — <reason>" and stop. - Root project `CLAUDE.md` if present - Any `CLAUDE.md` in directories whose files the diff modified - Their paths go into the audit; their content guides the CLAUDE.md adherence criterion + - While reading the root `CLAUDE.md`, capture any `Review profile:` line — it drives the conditional deep dives in Phase 3.5. 3. **If intent context was provided**, fetch and read it. Extract: stated goal, scope, non-goals, acceptance criteria, linked ADRs. @@ -165,6 +166,78 @@ Skip if no intent context; note "not evaluated" in the report. - Client-side types match server-side output - Data flows through the API layer, not direct data access from handlers +## Phase 3.5 — Conditional Deep Dives (project-profile gated) + +Phase 3's criteria are *universal* — they run on every review. The deep dives below are *conditional*: each activates only when the project fits its criterion. A CLI tool and a concurrent, DB-backed service share the same universal core but earn different deep dives. Running every module on every project buries real findings in irrelevant checklist noise; gating them keeps the review proportional to what the code actually does. + +### Resolving the project profile + +Decide which modules to run — declared overrides detected: + +1. **Declared profile (preferred).** Read the root `CLAUDE.md` (captured in Phase 1) for a `Review profile:` line — a comma-separated list of profile tags: + + ``` + Review profile: service, db-backed, concurrent, public-api + ``` + + A declared profile is authoritative: run exactly the modules its tags name, plus the universal core. Recognized tags map to the modules in the table below. + +2. **Auto-detect (fallback).** When no `Review profile:` line exists, infer the profile from signals in the repo and diff: + - web/RPC framework imports or route definitions (Flask, FastAPI, Express, gRPC, Rails, …) → `service`, `public-api` + - ORM/DB-driver imports, a `migrations/` dir, or raw SQL → `db-backed` + - threading / async / goroutine / lock primitives, shared mutable state → `concurrent` + - a touched dependency manifest or lockfile → `dependencies` + - code that stores or transmits PII, credentials, financial or health data → `sensitive-data` + - authn/authz present (login, sessions, permission/role checks) → `auth` + - none of the above (CLI, scripts, config, a single-purpose library) → universal-only + + Auto-detect is a safety net, not the intended path. When it fires, name the inferred profile in the report and recommend the project add an explicit `Review profile:` line so future reviews don't re-derive it. + +### Tag → module map + +| Profile tag | Activated module(s) | +|-------------|---------------------| +| `service` | Input Validation & OWASP; Error Handling (deep) | +| `public-api` | API Compatibility | +| `auth` | Auth & Authorization | +| `sensitive-data` | Sensitive-Data & Crypto | +| `db-backed` | Performance: Database | +| `perf-sensitive` | Performance: Code Efficiency & Scaling | +| `concurrent` | Concurrency | +| `dependencies` | Dependencies | + +Run a module's checks only when its tag is active. Report every activated module in the Phase 5 Per-Criterion Audit; mark unactivated modules `N/A (profile tag not set)` so the report shows what was and wasn't in scope — a silent omission reads as "checked and clean." + +### The modules + +**Input Validation & OWASP** (`service`) — the code ingests untrusted/external input. +- Input validated and sanitized at the boundary; SQL parameterized (no string concatenation); output encoded against XSS; file uploads validated (type, size, content); path traversal prevented; command-injection and unsafe-deserialization vectors closed; HTTP security headers present. +- This is the surface flag, not the full audit — hand the diff to `/security-check` for the deep OWASP Top 10 pass rather than duplicating it here. + +**Error Handling (deep)** (`service`) — a production/long-running service with availability needs. (Baseline "don't swallow exceptions, fail safe" stays universal in Phase 2.) +- Exceptions caught at the right level; system denies on error (fail safe); transactions rolled back on failure; correlation IDs for cross-service tracing; log levels appropriate; critical failures trigger alerts. + +**API Compatibility** (`public-api`) — the change touches a published API surface (REST/gRPC endpoints, a library's exported symbols, a wire format other code consumes). +- Classify each change non-breaking vs breaking. Non-breaking: added optional fields, new endpoints, required→optional, new enum values clients tolerate. Breaking: removed fields/endpoints, changed types, optional→required, changed semantics or status codes. For any breaking change: version bumped, migration path documented, deprecation warnings added, transition period planned. + +**Auth & Authorization** (`auth`) — login, sessions, permission/role checks, multi-tenant data. +- Auth enforced at every access point; object-level and function-level authorization (one user can't reach another's records, privileged endpoints require a role); session management secure; CSRF protection on state-changing operations; rate limiting where appropriate. + +**Sensitive-Data & Crypto** (`sensitive-data`) — stores or transmits PII, credentials, financial or health data. +- Sensitive data encrypted at rest and in transit; error messages don't leak system internals; logs exclude sensitive values; no weak or absent crypto (no hardcoded keys, no plaintext storage). + +**Performance: Database** (`db-backed`) — the change touches a data layer (ORM/query code, migrations, schema). +- No N+1 queries; appropriate indexes exist; explicit columns (no `SELECT *`); pagination on large result sets; connection pooling used. + +**Performance: Code Efficiency & Scaling** (`perf-sensitive`) — a hot/latency-sensitive path, large-data processing, or a stated throughput/scale SLA. +- No needless loops or iterations; right data structures (map vs list lookup); expensive work cached where it pays; streaming over load-all for large collections; resources closed (connections, file handles); works at 10x/100x the data; async where beneficial; no blocking calls in hot paths. + +**Concurrency** (`concurrent`) — threads, async/await, goroutines, shared mutable state, locks, or parallel workers. +- Shared mutable state synchronized; no race conditions; lazy initialization thread-safe; getters don't return mutable internal objects; thread-safe collections where needed; consistent lock ordering (no deadlocks); loop variables captured into closures correctly (not by reference); timeouts on blocking operations; tests run with the race detector where the language has one. + +**Dependencies** (`dependencies`) — the change adds or bumps an external dependency (manifest/lockfile touched). +- Actually necessary (not just convenience); from a trusted, actively-maintained source; no known vulnerabilities; license compatible; doesn't bloat the dependency tree; version pinned in the lockfile. Route the vulnerability scan through `/security-check` (it runs `pip-audit` / `npm audit` / osv). + ## Phase 4 — Filter and Categorize ### Confidence Filter @@ -316,6 +389,9 @@ Meta-level suggestions: process changes, follow-up tickets, architectural drift | Root Cause & Thoroughness | PASS | Empty inventory, Unicode, large batches all covered | | Architecture | PASS | Handler thin; logic in inventory service | | API Contracts | PASS | Schema defined; TS types match in client/ | +| Performance: Database | PASS | Export streams; no N+1 (profile: `db-backed`) | +| API Compatibility | PASS | New endpoint, no breaking change (profile: `public-api`) | +| Concurrency | N/A | profile tag `concurrent` not set | ## Issues |
