feat(hooks): scan file-backed messages and harden rm parsing

Two audit gaps in the confirmation hooks, plus the test harness they were missing.

The git-commit and gh-pr-create hooks scanned for AI attribution but only saw inline messages. A commit made with -F/--file or a PR made with --body-file slipped through, since the hook stored a placeholder instead of the file's text, and the publish flow uses -F constantly. A new read_referenced_file helper in _common.py reads the referenced local file (missing, oversized, or non-UTF-8 returns None, which means "couldn't inspect" and never "clean"), so attribution scanning now sees the real committed and posted text. An unreadable file falls through to the existing ask-anyway path.

destructive-bash-confirm.py parsed rm flags by splitting on whitespace, which mangled quoted paths and missed flag variants. detect_rm_rf now tokenizes with shlex, so quoted or spaced paths and combined, separate, or reordered flags all parse. It fails toward asking (a sentinel that still fires the modal) on unbalanced quotes, or when a forced recursive rm sits alongside a pipeline, compound command, substitution, or redirect, since target attribution isn't trustworthy there. The supported and unsupported shell constructs are documented in the docstrings.

These hooks had no tests and weren't in make test. Added a pytest harness under hooks/tests (an importlib-by-path loader, since the hook filenames are hyphenated) with 54 tests across the three hooks and the shared helper, and wired hooks/tests into make test. Full suite green.
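
For readers without _common.py open, the "None means couldn't inspect" contract described above might look roughly like the sketch below. This is an illustration only, not the committed helper: the 64 KiB cap, the max_bytes parameter, the body_for_scan wrapper, and the exact placeholder wording are all assumptions (the tests only require that the placeholder contain "could not inspect" and the path).

```python
from pathlib import Path
from typing import Optional

# Assumed cap for illustration; the commit does not state the real limit.
MAX_BYTES = 64 * 1024


def read_referenced_file(path: str, max_bytes: int = MAX_BYTES) -> Optional[str]:
    """Return the file's text, or None when it could not be inspected.

    None covers missing, oversized, and non-UTF-8 files. It never means
    "clean": callers must treat it as "unknown" and keep asking.
    """
    p = Path(path).expanduser()
    try:
        if not p.is_file() or p.stat().st_size > max_bytes:
            return None
        return p.read_bytes().decode("utf-8")
    except (OSError, UnicodeDecodeError):
        return None


def body_for_scan(path: str) -> str:
    """Hypothetical caller: real text if readable, else a visible placeholder."""
    text = read_referenced_file(path)
    if text is None:
        # Placeholder wording is assumed; it names the path so the
        # confirmation prompt can show what went uninspected.
        return f"(could not inspect {path})"
    return text
```

Returning None instead of an empty string is the point of the design: an empty string would scan as clean, while None forces the caller to choose the ask-anyway path explicitly.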
author: Craig Jennings <c@cjennings.net> 2026-05-22 15:38:58 -0500
committer: Craig Jennings <c@cjennings.net> 2026-05-22 15:38:58 -0500
commit: 53d33d9cdd5c6d7fd7c4dc7315b04d225add94d6 (patch)
tree: a2cfe5331eed9e22cc9880c3b75e246d74e2239e /hooks/tests/test_gh_pr_create_confirm.py
parent: efcc8e5ffdd09f538fd2d83824f4c632264ad96c (diff)
1 files changed, 70 insertions, 0 deletions
diff --git a/hooks/tests/test_gh_pr_create_confirm.py b/hooks/tests/test_gh_pr_create_confirm.py
new file mode 100644
index 0000000..19dde2e
--- /dev/null
+++ b/hooks/tests/test_gh_pr_create_confirm.py
@@ -0,0 +1,70 @@
+"""Tests for hooks/gh-pr-create-confirm.py — --body-file reads real content."""
+
+from conftest import load_hook
+
+hook = load_hook("gh-pr-create-confirm.py")
+
+
+# --- existing parsing still works (regression guard) -----------------------
+
+def test_parse_title_and_inline_body():
+    cmd = 'gh pr create --title "feat: thing" --body "does the thing"'
+    fields = hook.parse_pr_create(cmd)
+    assert fields["title"] == "feat: thing"
+    assert fields["body"] == "does the thing"
+
+
+def test_parse_reviewers():
+    cmd = 'gh pr create --title "x" --reviewer alice,bob'
+    fields = hook.parse_pr_create(cmd)
+    assert fields["reviewers"] == ["alice", "bob"]
+
+
+# --- new: --body-file reads the real content -------------------------------
+
+def test_body_file_reads_real_content(tmp_path):
+    f = tmp_path / "body.md"
+    f.write_text("## Problem\nthings broke\n\n## Fix\nfixed them\n")
+    fields = hook.parse_pr_create(f'gh pr create --title "x" --body-file {f}')
+    assert "things broke" in fields["body"]
+    assert "fixed them" in fields["body"]
+    # No longer the old placeholder.
+    assert not fields["body"].startswith("(body read from file")
+
+
+def test_body_file_attribution_is_caught(tmp_path):
+    f = tmp_path / "body.md"
+    f.write_text("## Summary\nshipped a feature \U0001F916 generated with Claude\n")
+    fields = hook.parse_pr_create(f'gh pr create --title "feat: x" --body-file {f}')
+    scan_text = "\n".join(
+        filter(None, [fields.get("title"), fields.get("body")])
+    )
+    hits = hook.scan_attribution(scan_text)
+    assert hits  # robot emoji + 'Generated with Claude' both leak
+
+
+def test_body_file_clean_content_no_hits(tmp_path):
+    f = tmp_path / "body.md"
+    f.write_text("## Summary\nfixed the off-by-one in the pager\n")
+    fields = hook.parse_pr_create(f'gh pr create --title "fix: pager" --body-file {f}')
+    scan_text = "\n".join(
+        filter(None, [fields.get("title"), fields.get("body")])
+    )
+    assert hook.scan_attribution(scan_text) == []
+
+
+# --- unreadable file keeps an informative could-not-inspect placeholder ----
+
+def test_body_file_missing_keeps_could_not_inspect_placeholder(tmp_path):
+    missing = tmp_path / "nope.md"
+    fields = hook.parse_pr_create(f'gh pr create --title "x" --body-file {missing}')
+    assert "could not inspect" in fields["body"]
+    assert str(missing) in fields["body"]
+
+
+def test_body_file_oversized_keeps_placeholder(tmp_path, monkeypatch):
+    f = tmp_path / "big.md"
+    f.write_text("x" * 5000)
+    monkeypatch.setattr(hook, "read_referenced_file", lambda p: None)
+    fields = hook.parse_pr_create(f'gh pr create --title "x" --body-file {f}')
+    assert "could not inspect" in fields["body"]
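
The tests above import load_hook from conftest, which is not part of this diff. A minimal sketch of an importlib-by-path loader of that kind follows, assuming the tests sit one directory below the hook scripts. The hooks_dir parameter and the hyphen-to-underscore module naming are illustrative assumptions, not the committed conftest.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Optional


def load_hook(filename: str, hooks_dir: Optional[Path] = None) -> ModuleType:
    """Import a hook script by file path.

    Hyphenated filenames such as gh-pr-create-confirm.py are not valid
    module names, so a plain import statement cannot reach them.
    """
    if hooks_dir is None:
        # Assumed layout: this file lives in hooks/tests/, hooks in hooks/.
        hooks_dir = Path(__file__).resolve().parent.parent
    path = Path(hooks_dir) / filename
    # Give the module an importable-looking name (assumption for tidiness).
    module_name = path.stem.replace("-", "_")
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load hook from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

Note that a hook doing a sibling import (for example from _common.py) would also need its directory on sys.path; this sketch leaves that out.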