path: root/docs/2026-01-22-mkinitcpio-config-boot-failure.org
author Craig Jennings <c@cjennings.net> 2026-01-22 14:27:49 -0600
committer Craig Jennings <c@cjennings.net> 2026-01-22 14:27:49 -0600
commit 0720a543d0eacf890ec99a6a5b337c85f896d647
tree b8a40b3f3a02e1631f1e92c2b2207a558445d560 /docs/2026-01-22-mkinitcpio-config-boot-failure.org
parent 50a5f78c5a7be0f5e3d630efb10cd23902549667
Fix ratio boot issues: firmware, mkinitcpio, and document ZFS rollback dangers
Root cause: Missing/outdated linux-firmware broke AMD Strix Halo GPU init.
Fixed by installing linux-firmware 20260110-1.

Changes:
- install-archzfs: Fix mkinitcpio config (remove archiso.conf, fix preset)
- todo.org: Add ZFS rollback + /boot mismatch issue, recommend ZFSBootMenu
- docs/2026-01-22-ratio-boot-fix-session.org: Full troubleshooting session
- docs/2026-01-22-mkinitcpio-config-boot-failure.org: Bug report
- assets/: Supporting documentation and video transcript

Key learnings:
- AMD Strix Halo requires linux-firmware 20260110+
- ZFS rollback with /boot on EFI partition can break boot
- zpool import -R can permanently change mountpoints
Diffstat (limited to 'docs/2026-01-22-mkinitcpio-config-boot-failure.org')
-rw-r--r-- docs/2026-01-22-mkinitcpio-config-boot-failure.org 159
1 files changed, 159 insertions, 0 deletions
diff --git a/docs/2026-01-22-mkinitcpio-config-boot-failure.org b/docs/2026-01-22-mkinitcpio-config-boot-failure.org
new file mode 100644
index 0000000..ba5bc72
--- /dev/null
+++ b/docs/2026-01-22-mkinitcpio-config-boot-failure.org
@@ -0,0 +1,159 @@
+#+TITLE: install-archzfs leaves broken mkinitcpio configuration
+#+DATE: 2026-01-22
+
+* Problem Summary
+
+After installing Arch Linux with ZFS via install-archzfs, the system has an incorrect mkinitcpio configuration that can cause boot failures. The problems are latent: the system may boot initially but will fail after any mkinitcpio regeneration (kernel update, manual rebuild, etc.).
+
+* Root Cause
+
+The install-archzfs script does not properly configure mkinitcpio for a ZFS boot environment. Three issues were identified:
+
+** Issue 1: Wrong HOOKS in mkinitcpio.conf
+
+The installed system had:
+#+begin_example
+HOOKS=(base systemd autodetect microcode modconf kms keyboard keymap sd-vconsole block filesystems fsck)
+#+end_example
+
+This is wrong for ZFS because:
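+- Uses the =systemd= init hook (and its companion =sd-vconsole=), but the =zfs= hook is busybox-based and incompatible with a systemd-based initramfs
+- Is missing the =zfs= hook entirely, so the initramfs cannot import the pool
+- Has the =fsck= hook, which is unnecessary/wrong for ZFS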
+
+Correct HOOKS for ZFS:
+#+begin_example
+HOOKS=(base udev autodetect microcode modconf kms keyboard keymap consolefont block zfs filesystems)
+#+end_example
+
+** Issue 2: Leftover archiso.conf drop-in
+
+The file =/etc/mkinitcpio.conf.d/archiso.conf= was left over from the live ISO:
+#+begin_example
+HOOKS=(base udev microcode modconf kms memdisk archiso archiso_loop_mnt archiso_pxe_common archiso_pxe_nbd archiso_pxe_http archiso_pxe_nfs block filesystems keyboard)
+COMPRESSION="xz"
+COMPRESSION_OPTIONS=(-9e)
+#+end_example
+
+This drop-in OVERRIDES the HOOKS setting in mkinitcpio.conf, so even if mkinitcpio.conf were correct, the archiso HOOKS (which also lack =zfs=) would be the ones actually used.
+
+** Issue 3: Wrong preset file
+
+The file =/etc/mkinitcpio.d/linux-lts.preset= contained archiso-specific configuration:
+#+begin_example
+# mkinitcpio preset file for the 'linux-lts' package on archiso
+
+PRESETS=('archiso')
+
+ALL_kver='/boot/vmlinuz-linux-lts'
+archiso_config='/etc/mkinitcpio.conf.d/archiso.conf'
+
+archiso_image="/boot/initramfs-linux-lts.img"
+#+end_example
+
+Should be:
+#+begin_example
+# mkinitcpio preset file for linux-lts
+
+PRESETS=(default fallback)
+
+ALL_kver="/boot/vmlinuz-linux-lts"
+
+default_image="/boot/initramfs-linux-lts.img"
+
+fallback_image="/boot/initramfs-linux-lts-fallback.img"
+fallback_options="-S autodetect"
+#+end_example
+
+* How This Manifests
+
+1. Fresh install appears to work (the initramfs built during install evidently includes ZFS support; how it got there was not determined)
+2. System boots fine initially
+3. Kernel update or manual =mkinitcpio -P= rebuilds initramfs
+4. New initramfs lacks ZFS support due to wrong config
+5. Next reboot fails with "cannot import pool" or "failed to mount /sysroot"
+
+* Fix Required in install-archzfs
+
+After the arch-chroot setup, the script needs to:
+
+1. *Set correct mkinitcpio.conf HOOKS*:
+ #+begin_src bash
+ sed -i 's/^HOOKS=.*/HOOKS=(base udev autodetect microcode modconf kms keyboard keymap consolefont block zfs filesystems)/' /mnt/etc/mkinitcpio.conf
+ #+end_src
+
+2. *Remove archiso drop-in*:
+ #+begin_src bash
+ rm -f /mnt/etc/mkinitcpio.conf.d/archiso.conf
+ #+end_src
+
+3. *Create proper preset file*:
+ #+begin_src bash
+ cat > /mnt/etc/mkinitcpio.d/linux-lts.preset << 'EOF'
+ # mkinitcpio preset file for linux-lts
+
+ PRESETS=(default fallback)
+
+ ALL_kver="/boot/vmlinuz-linux-lts"
+
+ default_image="/boot/initramfs-linux-lts.img"
+
+ fallback_image="/boot/initramfs-linux-lts-fallback.img"
+ fallback_options="-S autodetect"
+ EOF
+ #+end_src
+
+4. *Rebuild initramfs after fixing config*:
+ #+begin_src bash
+ arch-chroot /mnt mkinitcpio -P
+ #+end_src
+
+* Recovery Procedure (for affected systems)
+
+Boot from archzfs live ISO, then:
+
+#+begin_src bash
+# Import the pool under an altroot (-R) without auto-mounting (-N), then mount root at /mnt
+zpool import -f -N -R /mnt zroot
+zfs mount zroot/ROOT/default
+mount /dev/nvme0n1p1 /mnt/boot # adjust device as needed
+
+# Fix mkinitcpio.conf
+sed -i 's/^HOOKS=.*/HOOKS=(base udev autodetect microcode modconf kms keyboard keymap consolefont block zfs filesystems)/' /mnt/etc/mkinitcpio.conf
+
+# Remove archiso drop-in
+rm -f /mnt/etc/mkinitcpio.conf.d/archiso.conf
+
+# Fix preset (adjust for your kernel: linux, linux-lts, linux-zen, etc.)
+cat > /mnt/etc/mkinitcpio.d/linux-lts.preset << 'EOF'
+PRESETS=(default fallback)
+ALL_kver="/boot/vmlinuz-linux-lts"
+default_image="/boot/initramfs-linux-lts.img"
+fallback_image="/boot/initramfs-linux-lts-fallback.img"
+fallback_options="-S autodetect"
+EOF
+
+# Rebuild initramfs inside the installed system.
+# arch-chroot bind-mounts /dev, /sys, /proc and /run for the chroot
+# and removes those mounts again on exit.
+arch-chroot /mnt mkinitcpio -P
+
+# Unmount /boot and export the pool so the next boot can import it cleanly
+umount /mnt/boot
+zpool export zroot
+
+# Reboot
+reboot
+#+end_src
+
+* Machine Details (ratio)
+
+- Two NVMe drives in ZFS mirror (nvme0n1, nvme1n1)
+- Pool: zroot
+- Root dataset: zroot/ROOT/default
+- Kernel: linux-lts 6.12.66-1
+- Boot partition: /dev/nvme0n1p1 (FAT32, mounted at /boot)
+
+* Related Information
+
+The immediate trigger for discovering this was a system freeze during mkinitcpio regeneration. That freeze was caused by the AMD GPU VPE power gating bug (separate issue - see archsetup NOTES.org for details). However, the system's inability to boot afterward exposed these latent mkinitcpio configuration problems.
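
The problems described under Issue 1 and Issue 2 of the report can be checked mechanically before rebooting. The following is a minimal sketch, not part of install-archzfs or mkinitcpio: `effective_hooks` and `check_zfs_hooks` are hypothetical helper names, and the sketch assumes that mkinitcpio reads `/etc/mkinitcpio.conf` first and then each `/etc/mkinitcpio.conf.d/*.conf` drop-in in lexical order, which matches the override behavior the report describes.

```shell
#!/usr/bin/env bash
# Sketch only: both functions are hypothetical helpers for illustration.

# Print the HOOKS list that results from sourcing the main config and then
# every drop-in (assumed to mirror the order mkinitcpio uses).
# $1 is an optional root prefix such as /mnt; empty means the running system.
effective_hooks() {
    local root=${1:-}
    (
        # Subshell: variables set by the sourced files do not leak out.
        HOOKS=()
        source "$root/etc/mkinitcpio.conf"
        for f in "$root"/etc/mkinitcpio.conf.d/*.conf; do
            # An unmatched glob stays literal, so skip paths that do not exist.
            [[ -e $f ]] && source "$f"
        done
        printf '%s\n' "${HOOKS[*]}"
    )
}

# Given a space-separated hook list, print each problem named in the report.
# Returns 0 when no problem is found, 1 otherwise.
check_zfs_hooks() {
    local hooks=" $1 " rc=0
    [[ $hooks == *" zfs "* ]]     || { echo "missing zfs hook"; rc=1; }
    [[ $hooks == *" systemd "* ]] && { echo "systemd hook present"; rc=1; }
    [[ $hooks == *" fsck "* ]]    && { echo "fsck hook present"; rc=1; }
    [[ $hooks == *" archiso "* ]] && { echo "archiso hook present"; rc=1; }
    return "$rc"
}
```

With the installed system mounted at /mnt, `check_zfs_hooks "$(effective_hooks /mnt)"` would print one line per problem and return non-zero if the configuration still needs fixing. The check only inspects the configuration; it does not prove that an already built initramfs image contains the ZFS module.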