<feed xmlns='http://www.w3.org/2005/Atom'>
<title>archangel/tests, branch main</title>
<subtitle>Arch Linux installer ISO — ZFS-on-root or BTRFS, doubles as rescue disk
</subtitle>
<id>https://git.cjennings.net/archangel/atom?h=main</id>
<link rel='self' href='https://git.cjennings.net/archangel/atom?h=main'/>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/'/>
<updated>2026-06-10T04:45:00+00:00</updated>
<entry>
<title>feat(install): install baked AUR packages and clean the target config</title>
<updated>2026-06-10T04:45:00+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-06-10T04:45:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=4e6f4cc66206f02e92d4a2ca2f414fad5a3439a1'/>
<id>urn:sha1:4e6f4cc66206f02e92d4a2ca2f414fad5a3439a1</id>
<content type='text'>
Wire the baked AUR repo into the installer. Before pacstrap, install_base checks whether the ISO shipped the repo and, if so, exposes [aur] in the live /etc/pacman.conf and reads the package names from the manifest, adding them to the pacstrap set so they install into the target offline. The stanza goes in the live config because pacstrap resolves repos from the live system, not $MNTPOINT. This mirrors the existing [archzfs] handling.

The live config already carries [aur] from the shipped ISO config, so the append is idempotent by design. A --skip-aur ISO ships no repo, and aur_repo_available gates the whole path, so the installer still works there.

configure_system strips any [aur] stanza from the target /etc/pacman.conf. pacstrap installs a stock target config with no [aur], so this is defensive, but it guarantees the installed system never references /usr/share/aur-packages, which exists only on the live ISO.
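
A minimal sketch of how a stanza strip like this could work, assuming a standard pacman.conf layout where each repo section starts with a bracketed header line. The function name matches the helper named in this commit, but the body is illustrative and is not the code in common.sh.

```shell
# Hypothetical sketch: print a pacman.conf with one repo stanza removed.
# Usage: strip_repo_stanza REPO CONF_FILE
strip_repo_stanza() {
    local repo="$1" conf="$2"
    awk -v header="[$repo]" '
        # Every section header turns skipping on or off.
        /^\[.*\]$/ { skip = ($0 == header) }
        # Print lines that are not inside the stanza being stripped.
        !skip
    ' "$conf"
}
```

The sketch prints to stdout, so a caller would write the result to a temporary file and move it over the original rather than redirecting onto the file being read.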

Four new common.sh helpers carry the logic: aur_repo_available, append_aur_repo (idempotent), aur_manifest_names (the manifest is the source of what to install, so the list never drifts), and strip_repo_stanza. All four are covered across Normal, Boundary, and Error cases.
</content>
</entry>
<entry>
<title>feat(build): inject the AUR repo into the profile and live ISO</title>
<updated>2026-06-10T04:41:41+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-06-10T04:41:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=39b4a8bc5cac3a2092122f6c4fbede9bf0139286'/>
<id>urn:sha1:39b4a8bc5cac3a2092122f6c4fbede9bf0139286</id>
<content type='text'>
Wire build-aur.sh into build.sh. After the pacoloco block, build the AUR repo and append a build-host [aur] stanza to profile/pacman.conf with an absolute file:// Server, so mkarchiso installs the baked packages into airootfs. The stanza lands after the pacoloco rewrite so its file:// path isn't redirected to localhost.

Add the audited official extra packages and the baked AUR names to packages.x86_64, both sourced from build-aur.sh so the list never drifts from the build array. Ship the repo into airootfs and write a complete live /etc/pacman.conf: the pristine releng config with [aur] appended, not an [aur]-only file, since this replaces the live system's stock config and an AUR-only one would strip the official repos. Copy the manifest beside the ISO in out/.

--skip-aur skips the build, the stanza, the AUR names, and the live config. The three injection points also guard on the repo dir existing, so the documented empty-set path can't point mkarchiso at a missing repo. Moved BUILD_LOG creation ahead of the AUR build so its output is captured too.

A unit test reproduces the live-config construction and asserts core, extra, the mirrorlist, and [aur] all survive. The end-to-end proof that mkarchiso installs from the build-host repo needs a real root build and is tracked as manual verification.
</content>
</entry>
<entry>
<title>feat(build): add AUR local-repo build helpers</title>
<updated>2026-06-10T04:37:07+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-06-10T04:37:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=5e43d8c4ad8685e88331ac78641ca84666cb9e7a'/>
<id>urn:sha1:5e43d8c4ad8685e88331ac78641ca84666cb9e7a</id>
<content type='text'>
Add build-aur.sh, sourced by build.sh, that builds the v1 genuine-AUR set into a local pacman repo and emits an auditable manifest. The pure helpers carry the testable surface: the package sets (one source of truth for the build array and the package-list append), the [aur] stanza renderer, the TSV manifest header/row, the package-file locator, the staged repo replacement, and the build-environment preflight.

makepkg refuses to run as root, so the orchestrator drops to $SUDO_USER for the clone and build. It stages on the same filesystem and swaps in with mv -T on full success, so a failure ships no repo and leaves no stale one. On any failure error() names the package, the phase, and the log path.

The orchestrator and manifest-append need root, network, and makepkg, so they stay out of bats and are covered by the build integration test and the manual checklist instead. Eighteen unit tests cover the pure helpers across Normal, Boundary, and Error.
</content>
</entry>
<entry>
<title>test: cover disk_in_use and network_available failure paths</title>
<updated>2026-05-23T09:08:53+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-23T09:08:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=98fc424f7edb26314ffe124d3bc24549146a06d5'/>
<id>urn:sha1:98fc424f7edb26314ffe124d3bc24549146a06d5</id>
<content type='text'>
These two boundary functions backed the pre-flight guards from #215 but had no unit coverage of their own. The VM harness exercised them instead. I added 7 bats tests that mock the system commands they query, so the real branching logic runs.

test_disk.bats covers disk_in_use across mountpoint, active swap, imported-zpool member, and idle — that's the gate that refuses to wipe an already-mounted disk. test_archangel.bats covers network_available for DNS failure, TCP-connect failure, and success, the check that fails the install before pacstrap. The /proc/mdstat-positive branch and the live probes stay in the VM harness, since neither drives cleanly without writing to /proc or hitting the network. Suite 238 to 245, lint clean.
</content>
</entry>
<entry>
<title>fix(build): clear stale archzfs from the pacoloco cache too</title>
<updated>2026-05-23T01:28:15+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-23T01:28:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=bed054f46e3b41aae0d599ed7fbc3e1e42d6ddd7'/>
<id>urn:sha1:bed054f46e3b41aae0d599ed7fbc3e1e42d6ddd7</id>
<content type='text'>
archzfs re-uploads its GitHub release assets under the same filename, so pacoloco keeps serving a zfs-dkms/zfs-utils it cached earlier while pacman fetches a fresh archzfs.db with a new checksum. The two mismatch and pacstrap aborts with "invalid or corrupted package." build.sh already drops the stale packages from the host pacman cache, but it never cleared the pacoloco layer, which the VM test installs route through too, so test-install.sh kept hitting the corruption (four times in one session).

build.sh runs as root, so it now clears /var/cache/pacoloco/pkgs/archzfs/zfs-* alongside the host cache, which makes the build-then-test flow self-healing. The pacoloco cache is root-owned and test-install.sh runs as the user, so it can't clear it unattended. Instead, test-install.sh now recognizes the corruption (is_archzfs_cache_corruption) and prints how to clear it, the way it already names the SSH_PORT override on a port collision. A retry alone won't help since it hits the same cached file, so this fails fast with the hint rather than retrying.
</content>
</entry>
<entry>
<title>fix(test): fail clearly when the VM forward port is taken</title>
<updated>2026-05-23T01:12:14+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-23T01:12:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=0f8bbc7c1e2c2f6fec0b17753ac0d9c4a3ad4317'/>
<id>urn:sha1:0f8bbc7c1e2c2f6fec0b17753ac0d9c4a3ad4317</id>
<content type='text'>
A test run launched qemu without first checking the SSH forward port, so a collision with another VM already holding it surfaced only as an opaque "Failed to start VM," with qemu unable to bind and no hint why. I added a port_in_use check in run_test before the launch: it errors with the port number and the SSH_PORT override to set, records the failure, and moves on.

The check lives in run_test, not start_vm, because start_vm runs in a command substitution (vm_pid=$(start_vm ...)) where this harness's non-exiting error() would be captured as the PID instead of failing the run. The pure half, port_listening_in, takes an `ss -tln` snapshot as a string so it's unit-testable.
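
A minimal sketch of a pure check in this style, assuming the usual `ss -tln` column layout (state first, local address and port fourth). The function name matches the helper named above, but the body is illustrative and is not the harness code.

```shell
# Hypothetical sketch: succeed if PORT appears as a listening local port
# in SNAPSHOT, the captured text of `ss -tln`.
# Usage: port_listening_in PORT SNAPSHOT
port_listening_in() {
    local port="$1" snapshot="$2"
    printf '%s\n' "$snapshot" | awk -v port="$port" '
        $1 == "LISTEN" {
            # Local Address:Port is column 4; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}
```

Because the snapshot arrives as an argument, a unit test can pass a canned string, and only the caller needs to run `ss -tln` for real.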
</content>
</entry>
<entry>
<title>test: make SSH_PORT overridable in test-install.sh</title>
<updated>2026-05-22T23:09:50+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-22T23:09:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=0a57d75d3947fddd6c6ab62924c52a456f4776b0'/>
<id>urn:sha1:0a57d75d3947fddd6c6ab62924c52a456f4776b0</id>
<content type='text'>
The port was hardcoded to 2222, so a test run collided with any other VM already forwarding that port. SSH_PORT now defaults to 2222, so existing invocations are unchanged, and an override such as SSH_PORT=2223 scripts/test-install.sh lets a run use a free port alongside another VM.
</content>
</entry>
<entry>
<title>feat(install): add pre-flight environment and disk-target validation</title>
<updated>2026-05-22T23:03:40+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-22T23:03:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=b6525a50fabf3aedf41eee70c164519b00d27704'/>
<id>urn:sha1:b6525a50fabf3aedf41eee70c164519b00d27704</id>
<content type='text'>
archangel went straight from filesystem selection into a destructive install behind only a root check and a ZFS module load. A missing tool, a BIOS boot, a too-small or in-use disk, or a dead network surfaced as a confusing abort partway through, sometimes after partitioning had already run.

Two gates now fail fast. validate_environment runs after filesystem selection, before any disk is touched: it confirms UEFI boot mode and that every required command is present, with the list coming from a new required_commands helper built like pacstrap_packages. validate_install_targets runs after disk selection, before the first wipe: it refuses a target that's mounted, holds active swap, or belongs to an imported pool or md array, rejects disks under 20 GB, and confirms a mirror is reachable via DNS plus a TCP probe (no ICMP, since some networks drop it).

I folded the install_failure_cleanup hardening into the same change. It now falls back to lazy unmounts, so a pacstrap-interrupted target with busy bind mounts still releases the pool and unmounts the EFI partition. Without that, the disk-in-use guard would block the very retry the cleanup exists to enable. "Re-run to retry" only holds if the disk is genuinely freed first.

The 20 GB floor is decimal on purpose. It reads as the natural minimum and clears a 20 GiB disk image with headroom instead of sitting on the boundary.
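
The arithmetic behind the floor, as a minimal sketch. MIN_DISK_BYTES and disk_large_enough are hypothetical names used for illustration, not identifiers from the installer.

```shell
# Hypothetical sketch: decimal 20 GB floor for an install target.
MIN_DISK_BYTES=$((20 * 1000 * 1000 * 1000))   # 20 GB = 20,000,000,000 bytes

# Usage: disk_large_enough SIZE_IN_BYTES
disk_large_enough() {
    local size_bytes="$1"
    [ "$size_bytes" -ge "$MIN_DISK_BYTES" ]
}

# A 20 GiB image is 20 * 1024^3 = 21,474,836,480 bytes,
# which clears the floor by roughly 1.47 GB.
```

One way to obtain the byte count is blockdev --getsize64 on the target device. That choice is an assumption here, not something this commit states.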
</content>
</entry>
<entry>
<title>feat(test): retry pacstrap through transient mirror flakes</title>
<updated>2026-05-20T14:58:01+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-20T14:58:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=4ef30e5c84ab22ba1724608009093d6725a1ceda'/>
<id>urn:sha1:4ef30e5c84ab22ba1724608009093d6725a1ceda</id>
<content type='text'>
test-install.sh aborts a whole 5-minute VM run when pacstrap hits a transient mirror blip, and the suite reports a failure indistinguishable from a real install regression. run_test now retries the install up to twice, but only when the in-VM log shows both pacstrap's "Failed to install packages to new root" marker and a download/network indicator. A deterministic failure like "target not found" carries the marker without a network indicator, so it still fails fast. archangel's failure trap exports the pool and unmounts on abort, so each retry re-partitions and re-pacstraps from a clean state.
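
A minimal sketch of the two-condition predicate. The marker string is the one quoted above. The network indicator strings are assumptions for illustration, since this message does not list the real ones, and the body is not the harness code.

```shell
# Hypothetical sketch: a failure counts as transient only when the log has
# BOTH the pacstrap marker and a network indicator.
# Usage: is_transient_install_failure LOG_TEXT
is_transient_install_failure() {
    local log="$1"
    local marker='Failed to install packages to new root'
    # Assumed indicator strings; the real list is not given in this message.
    local network='failed retrieving file|Could not resolve host|Connection timed out'

    # No marker means pacstrap did not fail this way at all.
    if ! printf '%s\n' "$log" | grep -qF "$marker"; then
        return 1
    fi
    # Marker present: transient only if a network indicator is also there.
    printf '%s\n' "$log" | grep -qE "$network"
}
```

A log with the marker and only "target not found" returns non-zero here, which is the fail-fast path described above.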

Wiring the predicate up needed a source-guard so bats can source the harness, which had none. With that in place I unit-covered the pure helpers — is_transient_install_failure, char_to_qemu_key, get_disk_count, get_disk_args — and lifted char_to_qemu_key out of monitor_sendkeys so the QEMU keymap is testable on its own.

The keymap test found a dead branch. The backslash case pattern was '\\', which inside single quotes is two literal backslashes, so it never matched a lone backslash; the pattern for that is '\'. A passphrase containing a backslash would have sent an invalid QEMU keyname instead of "backslash". No test passphrase uses one, so it never bit. I fixed the pattern.
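
A minimal sketch of a case pattern that does match a lone backslash. The function name matches the helper named above, but this body is reduced to a few illustrative cases and is not the harness keymap.

```shell
# Hypothetical sketch of a character-to-QEMU-keyname mapper, reduced to the
# cases needed to show the backslash pattern.
# Usage: char_to_qemu_key CHAR
char_to_qemu_key() {
    local char="$1"
    case "$char" in
        [a-z0-9]) printf '%s\n' "$char" ;;       # keyname equals the character
        '\')      printf '%s\n' "backslash" ;;   # one literal backslash
        *)        return 1 ;;                    # not covered by this sketch
    esac
}
```

Single quotes take every character literally, so '\' is one backslash, while '\\' would only match a two-backslash string.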
</content>
</entry>
<entry>
<title>refactor: extract validate_encryption_passphrase from gather_input</title>
<updated>2026-05-19T17:30:07+00:00</updated>
<author>
<name>Craig Jennings</name>
<email>c@cjennings.net</email>
</author>
<published>2026-05-19T17:30:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.cjennings.net/archangel/commit/?id=9405b1fc9984e43b0297d2bb89dea1666e1f4853'/>
<id>urn:sha1:9405b1fc9984e43b0297d2bb89dea1666e1f4853</id>
<content type='text'>
gather_input's unattended branch had two parallel if-blocks, one for ZFS and one for Btrfs, each doing the same encryption-passphrase empty check against a filesystem-specific variable (ZFS_PASSPHRASE or LUKS_PASSPHRASE). The two blocks shared the same condition and error template. Only the variable name differed.

I lifted the check into validate_encryption_passphrase in lib/config.sh next to validate_filesystem. The helper takes the variable name and uses indirect expansion (${!var_name}) so one function covers both filesystems. gather_input now dispatches via if/elif on FILESYSTEM and calls the helper with the right variable, collapsing 14 lines to 6.
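
A minimal sketch of the indirect-expansion approach, assuming bash. The function and variable names come from this message, but the body and the error wording are illustrative and are not the code in lib/config.sh.

```shell
# Hypothetical sketch (bash only): check that the passphrase variable named
# by the first argument is non-empty, unless encryption is disabled.
# Usage: validate_encryption_passphrase VAR_NAME
validate_encryption_passphrase() {
    local var_name="$1"
    # Nothing to check when encryption is off.
    if [ "${NO_ENCRYPT:-no}" = "yes" ]; then
        return 0
    fi
    # ${!var_name} expands to the value of the variable whose name is
    # stored in var_name, so one function serves both filesystems.
    if [ -z "${!var_name:-}" ]; then
        echo "Error: $var_name is required when encryption is enabled" >/dev/stderr
        return 1
    fi
}
```

A caller would pass the variable name, not its value, for example validate_encryption_passphrase ZFS_PASSPHRASE on the ZFS branch and validate_encryption_passphrase LUKS_PASSPHRASE on the Btrfs branch.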

The original tests in test_archangel.bats (gather_input errors when ZFS without ZFS_PASSPHRASE / when Btrfs without LUKS_PASSPHRASE / accepts ZFS with NO_ENCRYPT=yes) still pass, exercising the helper through the dispatch. Added 4 direct unit tests in test_config.bats covering the four cases: NO_ENCRYPT=yes passes regardless, NO_ENCRYPT=no with empty fails, NO_ENCRYPT=no with value passes, and the error message names the offending variable. Bats: 177 → 181.

No behavior change. The helper preserves the original error message format and exit conditions.
</content>
</entry>
</feed>
