#+TITLE: Design: Testinfra Post-Install Validation for archsetup #+AUTHOR: Craig Jennings #+DATE: 2026-06-25 #+STATUS: Accepted (2026-06-25) * Problem The VM integration harness (=scripts/testing/run-test.sh=) runs archsetup in a QEMU VM, then verifies the result two ways: 1. Parses archsetup's own install log for its Error Summary and the =ARCHSETUP_EXECUTION_COMPLETE= marker (did the script finish, did it log errors). 2. Runs =run_all_validations= from =scripts/testing/lib/validation.sh= — a hand-rolled, shell-based post-install assertion sweep of ~26 checks over SSH. The shell sweep works, but each check is 6-40 lines of =ssh_cmd= + =validation_pass/fail= + =attribute_issue= boilerplate, the pass/fail counters are hand-maintained globals, and the reporting is bespoke. Adding or reading a check is heavier than it should be, and growing the suite (archsetup configures far more than the 26 checks cover) compounds that weight. This doc proposes porting the post-install validation to Testinfra (Python + pytest) for more expressive checks and better reporting, then growing coverage. * Decision Port the post-install validation layer to Testinfra + pytest, reaching parity with the existing =validation.sh= sweep, then expand coverage. Recorded rationale: the up-front port cost (parity rewrite + a test-only dependency) is an accepted trade — the priority is a robust, well-reported, growing validation suite over feature speed. The framework swap alone buys ergonomics and reporting, not coverage, so it is paired with real new coverage (below). This replaces the shell sweep; it does not touch archsetup's own install-log parsing (that stays as a separate signal). The full coverage expansion (P4) lands in this task too, sequenced strictly after the parity cutover so the parity verification stays clean. * Current harness (what exists today) ** Flow (run-test.sh) 1. Revert VM to base snapshot, boot, wait for SSH. 2. =capture_pre_install_state=. 3. Bundle + copy archsetup + dotfiles into the VM, run archsetup in background, poll to completion. 4. =capture_post_install_state=. 5. =run_all_validations= (the shell sweep). 6. =analyze_log_diff= + =generate_issue_report= (issue attribution). 7. Explicit pass/fail exit code; cleanup. ** The shell sweep (validation.sh) ~26 checks under =run_all_validations=: user created / shell / groups, dotfiles, yay, pacman working, window manager, firewall, DNS, avahi, fail2ban, NetworkManager, emacs, git config, dev tools, zfs, boot config, autologin, gnome-keyring, terminus font, mkinitcpio hooks, initramfs consolefont, nvme module, archsetup log, state markers. ** Issue attribution =attribute_issue = sorts each failure into one of three arrays — =ARCHSETUP_ISSUES=, =BASE_INSTALL_ISSUES=, =UNKNOWN_ISSUES= — and =generate_issue_report= writes them out (base-install issues route to the archzfs inbox). This is domain logic Testinfra has no equivalent for; the port must preserve it. ** Connection =ssh_cmd= uses =sshpass -p "$ROOT_PASSWORD" ssh ... -p "$SSH_PORT" root@$VM_IP=, with =VM_IP=localhost=, =SSH_PORT=2222=, =ROOT_PASSWORD=archsetup=. * Design ** Where Testinfra fits Replace the =run_all_validations= call (step 5) with a pytest invocation against the running VM. Steps 1-4 and 6-7 are unchanged; =analyze_log_diff= stays. Testinfra connects over the same SSH the harness already exposes. ** Connection model Testinfra's paramiko/ssh backend targets the live VM via its host spec: #+begin_src sh pytest scripts/testing/tests/ \ --hosts="ssh://root@localhost:2222" \ --ssh-config= \ --json-report --json-report-file="$TEST_RESULTS_DIR/testinfra.json" #+end_src Password auth: generate a throwaway ssh-config (or reuse sshpass via a =--ssh-identity= once archsetup drops the key, but at validation time we only have the root password). Simplest: a tiny generated ssh config + sshpass wrapper, or switch the test VM to a known test key injected pre-run. Open question below. ** Test layout #+begin_example scripts/testing/tests/ conftest.py # host fixture, markers, attribution hook, report glue test_users.py # user created / shell / groups test_dotfiles.py # stow symlinks, readable by user test_packages.py # yay, pacman working, dev tools, key packages test_services.py # firewall, dns, avahi, fail2ban, networkmanager test_boot.py # zfs, mkinitcpio hooks, nvme, consolefont, terminus test_desktop.py # window manager, autologin, gnome-keyring test_archsetup.py # install log, state markers test_hardening.py # NEW: sshd drop-in, sysctl, /etc fstab perms, backups #+end_example ** Example tests (parity) #+begin_src python def test_ufw_enabled(host): assert host.service("ufw").is_enabled def test_user_cjennings_exists(host): u = host.user("cjennings") assert u.exists assert u.shell == "/usr/bin/zsh" def test_zshrc_stowed_and_readable(host): f = host.file("/home/cjennings/.zshrc") assert f.is_symlink assert ".dotfiles/" in f.linked_to assert f.exists # not broken assert host.run("sudo -u cjennings test -r %s" % f.path).rc == 0 def test_mkinitcpio_systemd_hook(host): # non-ZFS systems delegate fsck from udev to systemd conf = host.file("/etc/mkinitcpio.conf").content_string assert "systemd" in conf #+end_src Compare =test_ufw_enabled= (1 line) to the current =validate_firewall= (8 lines of ssh_cmd + branch + counters). ** Preserving issue attribution Map the three buckets to pytest markers and collect them in a =conftest.py= hook: #+begin_src python @pytest.mark.attribution("archsetup") # or "base_install" / "unknown" def test_ufw_enabled(host): ... #+end_src A =pytest_runtest_makereport= hook records each failure under its marker's bucket and writes the same three-way report =generate_issue_report= produces (base-install failures still route to the archzfs inbox). Default bucket = archsetup when unmarked. ** Tiered strategy Markers =@pytest.mark.smoke= (user, key packages, dotfiles present) and =@pytest.mark.integration= (services, configs, boot). =pytest -m smoke= for a fast gate, full run otherwise. Drop the task's original X11/startx end-to-end slice — the fleet is Wayland/Hyprland and headless GUI e2e is flaky and expensive; a Wayland-session smoke check can be reconsidered later as its own task. ** Reporting =pytest-json-report= (or junit-xml) → =$TEST_RESULTS_DIR/=, surfaced in the test report alongside the install-log analysis. pytest's own per-test pass/fail/skip output replaces the hand-maintained counters. * Coverage ** Parity (port all current checks) All ~26 =validation.sh= checks, grouped per the layout above. ** Expansion (new — the coverage win) archsetup configures much that isn't validated today. Candidates: - sshd hardening drop-in (=/etc/ssh/sshd_config.d/10-hardening.conf=, PermitRootLogin prohibit-password). - =backup_system_file= behavior — assert =.archsetup.bak= exists for files archsetup edited in place (fstab, mkinitcpio.conf, sudoers, …). - pacman.conf (ParallelDownloads, Color, multilib) and makepkg.conf (MAKEFLAGS, OPTIONS) settings actually applied. - systemd-resolved DNS-over-TLS drop-in; NetworkManager wifi-privacy. - fail2ban jail.local present; reflector config; sysctl printk; /etc/issue emptied; vconsole font; fstab /efi fmask/dmask perms. - sanoid / zfs-replicate units (ZFS hosts). * Dependencies Add =python-pytest=, =python-pytest-testinfra= (pulls paramiko), and a JSON reporter to =make deps= (test host only — not installed by archsetup itself). Note: the existing unit suites run under =python3 -m unittest=; the integration layer runs under pytest. Two runners, both Python; =make test-unit= unchanged, =make test= gains the pytest step. * Goss comparison (the task asked) - *Goss* — YAML-declarative health specs, a single Go binary executed *on the target*. Fast, no Python. But the spec must be pushed into the VM and run there, the assertions are less programmable, and it adds a Go binary to the flow. - *Testinfra* — Python, runs *on the host* over SSH (nothing installed in the VM), assertions are full Python with rich built-in modules (File/Package/Service/User/Command), integrates with pytest's tooling. Choose Testinfra: it runs from the host (the VM stays clean), it's far more programmable for the conditional checks archsetup needs (DESKTOP_ENV branches, ZFS-vs-not), and it aligns with the repo's existing Python test tooling. * Migration plan (phased, TDD where the helper logic is ours) - *P1 — Scaffold.* conftest.py (host fixture + connection), the attribution marker + report hook, and 3 parity checks (firewall, user, dotfiles). Wire a pytest step into run-test.sh behind a flag so the shell sweep still runs. - *P2 — Full parity.* Port all ~26 checks; diff a real VM run's results against the shell sweep to confirm no check was lost. - *P3 — Cut over.* Make pytest the primary sweep in run-test.sh; keep =analyze_log_diff= and the install-log signal. - *P4 — Expand.* Add the new coverage (hardening, backups, applied settings). - *P5 — Retire.* Remove =run_all_validations= from validation.sh (keep the capture/analyze helpers that pytest doesn't replace). * Acceptance criteria - =make test= runs archsetup in a VM, then a pytest sweep over SSH, and a real run reports parity with (or a superset of) the current shell checks. - Failures still sort into archsetup / base-install / unknown, with base-install issues routed to the archzfs inbox as today. - =make deps= installs the test dependencies; the VM has nothing extra installed. - A documented =pytest -m smoke= fast path exists. * Resolved decisions (2026-06-25) 1. *Auth at validation time — inject a throwaway test key.* Pre-run, generate an ephemeral keypair, push the pubkey into the VM's =/root/.ssh/authorized_keys= over the existing sshpass channel, and point Testinfra at the private key via a generated ssh-config. No password in the pytest invocation; paramiko key auth just works; the keypair is discarded after the run. (Chosen over wrapping sshpass around Testinfra, which is awkward since Testinfra spawns its own ssh connections.) 2. *Cut over — run both through parity, then switch.* Keep the shell sweep running alongside pytest through P2 so a real VM run can diff pytest's results against the shell sweep and prove no check was dropped. pytest becomes primary at P3; =run_all_validations= is deleted at P5 after the expanded suite proves out. 3. *Expansion scope — full, in this task, after cutover.* All of P4 lands here, sequenced strictly after the P3 parity cutover so the parity diff is clean before new checks are added.